Message ID | E1ZiI7T-0002zo-Ez@rmk-PC.arm.linux.org.uk (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On 10/03/2015 10:24 AM, Russell King wrote: > The cpuidle ->enter method expects the return value to be the sleep > state we entered. Returning negative numbers or other codes is not > permissible since coupled CPU idle was merged. > At least some of the mvebu_v7_cpu_suspend() implementations return the > value from cpu_suspend(), which returns zero if the CPU vectors back > into the kernel via cpu_resume() (the success case), or the non-zero > return value of the suspend actor, or one (failure cases). > > We do not want to be returning the failure case value back to CPU idle > as that indicates that we successfully entered one of the deeper idle > states. Always return zero instead, indicating that we slept for the > shortest amount of time. > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> > --- > Please think about this - having only an Armada 388 platform, I can't > test this due to CPU idle being disabled there. However, the code paths > indicate that if we fail to sleep in armada_370_xp_pmsu_idle_enter() or > armada_38x_do_cpu_suspend() fail to sleep, we end up reporting that we > successfully slept in CPU idle state 1 in both cases. > > Rafael - the code in cpuidle.c prior to the addition of coupled CPU idle > seems to allow the return of negative error numbers from the ->enter() > method: > > if (entered_state >= 0) { > /* Update cpuidle counters */ > /* This can be moved to within driver enter routine > * but that results in multiple copies of same code. > */ > dev->states_usage[entered_state].time += dev->last_residency; > dev->states_usage[entered_state].usage++; > } else { > dev->last_residency = 0; > } > > Since coupled CPU idle, negative return codes will give a negative table > index inside cpuidle_state_is_coupled() - which is obviously buggy. Does > this need fixing? If so, this patch below is wrong, and we should be > returning a negative errno rather than zero on failure to enter a low > power state. It is not possible since the commit 0b89e9aa2. This change was to prevent to re-enable the local interrupt when we are in the process of coupling an idle state and let the coupled code to handle that. If the backend driver changed the state in our back (for example to a shallower state) to a non coupled one, the local interrupt will be re-enabled. If at the contrary, the backend driver choose a deeper idle state which is coupled, we won't re-enable the interrupt. I don't know any driver doing this but the code allows it (not sure it is a good thing - anyway). So the test: if (!cpuidle_state_is_coupled(dev, drv, entered_state)) should be: if (!cpuidle_state_is_coupled(dev, drv, *index*)) ... in order to have the decisions made when entering this function symmetric to the ones which will be taken when exiting the function because they will be based on the same index. It is the same behavior than the broadcast timer enter/exit in the same function. To be clean, the above should be fixed. But in any case, armada_370_xp_pmsu_idle_enter and armada_38x_do_cpu_suspend can return a negative value because they don't depend on the ARCH_NEEDS_CPU_IDLE_COUPLED, and cpuidle_state_is_coupled is an inline for 'return false'. -- Daniel
On Mon, Oct 05, 2015 at 01:45:21PM +0200, Daniel Lezcano wrote: > On 10/03/2015 10:24 AM, Russell King wrote: > >The cpuidle ->enter method expects the return value to be the sleep > >state we entered. Returning negative numbers or other codes is not > >permissible since coupled CPU idle was merged. > > > > >At least some of the mvebu_v7_cpu_suspend() implementations return the > >value from cpu_suspend(), which returns zero if the CPU vectors back > >into the kernel via cpu_resume() (the success case), or the non-zero > >return value of the suspend actor, or one (failure cases). > > > >We do not want to be returning the failure case value back to CPU idle > >as that indicates that we successfully entered one of the deeper idle > >states. Always return zero instead, indicating that we slept for the > >shortest amount of time. > > > >Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> > >--- > >Please think about this - having only an Armada 388 platform, I can't > >test this due to CPU idle being disabled there. However, the code paths > >indicate that if we fail to sleep in armada_370_xp_pmsu_idle_enter() or > >armada_38x_do_cpu_suspend() fail to sleep, we end up reporting that we > >successfully slept in CPU idle state 1 in both cases. > > > >Rafael - the code in cpuidle.c prior to the addition of coupled CPU idle > >seems to allow the return of negative error numbers from the ->enter() > >method: > > > > if (entered_state >= 0) { > > /* Update cpuidle counters */ > > /* This can be moved to within driver enter routine > > * but that results in multiple copies of same code. > > */ > > dev->states_usage[entered_state].time += dev->last_residency; > > dev->states_usage[entered_state].usage++; > > } else { > > dev->last_residency = 0; > > } > > > >Since coupled CPU idle, negative return codes will give a negative table > >index inside cpuidle_state_is_coupled() - which is obviously buggy. Does > >this need fixing? If so, this patch below is wrong, and we should be > >returning a negative errno rather than zero on failure to enter a low > >power state. > > It is not possible since the commit 0b89e9aa2. > > This change was to prevent to re-enable the local interrupt when we are in > the process of coupling an idle state and let the coupled code to handle > that. > > If the backend driver changed the state in our back (for example to a > shallower state) to a non coupled one, the local interrupt will be > re-enabled. If at the contrary, the backend driver choose a deeper idle > state which is coupled, we won't re-enable the interrupt. I don't know any > driver doing this but the code allows it (not sure it is a good thing - > anyway). > > So the test: > > if (!cpuidle_state_is_coupled(dev, drv, entered_state)) > > should be: > > if (!cpuidle_state_is_coupled(dev, drv, *index*)) > > ... in order to have the decisions made when entering this function > symmetric to the ones which will be taken when exiting the function because > they will be based on the same index. Is someone going to fix this seven month bug?
diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c index 01a856971f05..18ded9e7cb34 100644 --- a/drivers/cpuidle/cpuidle-mvebu-v7.c +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c @@ -39,8 +39,12 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, ret = mvebu_v7_cpu_suspend(deepidle); cpu_pm_exit(); + /* + * If we failed to enter the desired state, indicate that we + * slept lightly. + */ if (ret) - return ret; + return 0; return index; }
The cpuidle ->enter method expects the return value to be the sleep state we entered. Returning negative numbers or other codes is not permissible since coupled CPU idle was merged. At least some of the mvebu_v7_cpu_suspend() implementations return the value from cpu_suspend(), which returns zero if the CPU vectors back into the kernel via cpu_resume() (the success case), or the non-zero return value of the suspend actor, or one (failure cases). We do not want to be returning the failure case value back to CPU idle as that indicates that we successfully entered one of the deeper idle states. Always return zero instead, indicating that we slept for the shortest amount of time. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> --- Please think about this - having only an Armada 388 platform, I can't test this due to CPU idle being disabled there. However, the code paths indicate that if we fail to sleep in armada_370_xp_pmsu_idle_enter() or armada_38x_do_cpu_suspend() fail to sleep, we end up reporting that we successfully slept in CPU idle state 1 in both cases. Rafael - the code in cpuidle.c prior to the addition of coupled CPU idle seems to allow the return of negative error numbers from the ->enter() method: if (entered_state >= 0) { /* Update cpuidle counters */ /* This can be moved to within driver enter routine * but that results in multiple copies of same code. */ dev->states_usage[entered_state].time += dev->last_residency; dev->states_usage[entered_state].usage++; } else { dev->last_residency = 0; } Since coupled CPU idle, negative return codes will give a negative table index inside cpuidle_state_is_coupled() - which is obviously buggy. Does this need fixing? If so, this patch below is wrong, and we should be returning a negative errno rather than zero on failure to enter a low power state. drivers/cpuidle/cpuidle-mvebu-v7.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)