Message ID | 1424971248-29076-1-git-send-email-gregory.clement@free-electrons.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Rafael Wysocki |
Headers | show |
On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: > As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring > that cpu_pm_enter is not called twice on the same CPU before > cpu_pm_exit is called.". In the current code in case of failure when > calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never > called whereas cpu_pm_enter() was called just before. > > This patch moves the cpu_pm_exit() in order to balance the > cpu_pm_enter() calls. > > Reported-by: Fulvio Benini <fbf@libero.it> > Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Should that go to "stable" too? Which "stable" series it should go to if so? > --- > drivers/cpuidle/cpuidle-mvebu-v7.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c > index 38e68618513a..cefa07438ae1 100644 > --- a/drivers/cpuidle/cpuidle-mvebu-v7.c > +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c > @@ -37,11 +37,11 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, > deepidle = true; > > ret = mvebu_v7_cpu_suspend(deepidle); > + cpu_pm_exit(); > + > if (ret) > return ret; > > - cpu_pm_exit(); > - > return index; > } > >
Hi Rafael, On 26/02/2015 22:55, Rafael J. Wysocki wrote: > On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >> that cpu_pm_enter is not called twice on the same CPU before >> cpu_pm_exit is called.". In the current code in case of failure when >> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >> called whereas cpu_pm_enter() was called just before. >> >> This patch moves the cpu_pm_exit() in order to balance the >> cpu_pm_enter() calls. >> >> Reported-by: Fulvio Benini <fbf@libero.it> >> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> > > Should that go to "stable" too? Which "stable" series it should go to if so? Yes as it fixes a potential issue, you're right it should go to "stable". The bug was here since the introduction of the driver in 3.16. Thanks, Gregory > >> --- >> drivers/cpuidle/cpuidle-mvebu-v7.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c >> index 38e68618513a..cefa07438ae1 100644 >> --- a/drivers/cpuidle/cpuidle-mvebu-v7.c >> +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c >> @@ -37,11 +37,11 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, >> deepidle = true; >> >> ret = mvebu_v7_cpu_suspend(deepidle); >> + cpu_pm_exit(); >> + >> if (ret) >> return ret; >> >> - cpu_pm_exit(); >> - >> return index; >> } >> >> >
On 02/26/2015 10:55 PM, Rafael J. Wysocki wrote: > On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >> that cpu_pm_enter is not called twice on the same CPU before >> cpu_pm_exit is called.". In the current code in case of failure when >> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >> called whereas cpu_pm_enter() was called just before. >> >> This patch moves the cpu_pm_exit() in order to balance the >> cpu_pm_enter() calls. >> >> Reported-by: Fulvio Benini <fbf@libero.it> >> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> > > Should that go to "stable" too? Which "stable" series it should go to if so? Hi Rafael, do you mind if I take this fix in my tree ? There is the same issue for cpuidle-arm64.c I fixed.
On Friday, February 27, 2015 05:50:23 PM Daniel Lezcano wrote: > On 02/26/2015 10:55 PM, Rafael J. Wysocki wrote: > > On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: > >> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring > >> that cpu_pm_enter is not called twice on the same CPU before > >> cpu_pm_exit is called.". In the current code in case of failure when > >> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never > >> called whereas cpu_pm_enter() was called just before. > >> > >> This patch moves the cpu_pm_exit() in order to balance the > >> cpu_pm_enter() calls. > >> > >> Reported-by: Fulvio Benini <fbf@libero.it> > >> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> > > > > Should that go to "stable" too? Which "stable" series it should go to if so? > > Hi Rafael, > > do you mind if I take this fix in my tree ? Not at all, please go ahead. > There is the same issue for cpuidle-arm64.c I fixed. OK
On 02/27/2015 10:39 AM, Gregory CLEMENT wrote: > Hi Rafael, > > On 26/02/2015 22:55, Rafael J. Wysocki wrote: >> On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >>> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >>> that cpu_pm_enter is not called twice on the same CPU before >>> cpu_pm_exit is called.". In the current code in case of failure when >>> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >>> called whereas cpu_pm_enter() was called just before. >>> >>> This patch moves the cpu_pm_exit() in order to balance the >>> cpu_pm_enter() calls. >>> >>> Reported-by: Fulvio Benini <fbf@libero.it> >>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> >> >> Should that go to "stable" too? Which "stable" series it should go to if so? > > Yes as it fixes a potential issue, you're right it should go > to "stable". The bug was here since the introduction of the driver > in 3.16. Hi Gregory, actually the 'stable' rules state clearly: "- It must fix a real bug that bothers people (not a, "This could be a problem..." type thing)." You say "it fixes a potential issue", so no bug has been raise yet, right ? >>> --- >>> drivers/cpuidle/cpuidle-mvebu-v7.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c >>> index 38e68618513a..cefa07438ae1 100644 >>> --- a/drivers/cpuidle/cpuidle-mvebu-v7.c >>> +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c >>> @@ -37,11 +37,11 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, >>> deepidle = true; >>> >>> ret = mvebu_v7_cpu_suspend(deepidle); >>> + cpu_pm_exit(); >>> + >>> if (ret) >>> return ret; >>> >>> - cpu_pm_exit(); >>> - >>> return index; >>> } >>> >>> >> > >
Hi Rafael, On 03/03/2015 11:30, Daniel Lezcano wrote: > On 02/27/2015 10:39 AM, Gregory CLEMENT wrote: >> Hi Rafael, >> >> On 26/02/2015 22:55, Rafael J. Wysocki wrote: >>> On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >>>> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >>>> that cpu_pm_enter is not called twice on the same CPU before >>>> cpu_pm_exit is called.". In the current code in case of failure when >>>> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >>>> called whereas cpu_pm_enter() was called just before. >>>> >>>> This patch moves the cpu_pm_exit() in order to balance the >>>> cpu_pm_enter() calls. >>>> >>>> Reported-by: Fulvio Benini <fbf@libero.it> >>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> >>> >>> Should that go to "stable" too? Which "stable" series it should go to if so? >> >> Yes as it fixes a potential issue, you're right it should go >> to "stable". The bug was here since the introduction of the driver >> in 3.16. > > Hi Gregory, > > actually the 'stable' rules state clearly: > > "- It must fix a real bug that bothers people (not a, "This could be a > problem..." type thing)." > > You say "it fixes a potential issue", so no bug has been raise yet, right ? Indeed nobody claimed yet having a bug related to this issue. Gregory > >>>> --- >>>> drivers/cpuidle/cpuidle-mvebu-v7.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c >>>> index 38e68618513a..cefa07438ae1 100644 >>>> --- a/drivers/cpuidle/cpuidle-mvebu-v7.c >>>> +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c >>>> @@ -37,11 +37,11 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, >>>> deepidle = true; >>>> >>>> ret = mvebu_v7_cpu_suspend(deepidle); >>>> + cpu_pm_exit(); >>>> + >>>> if (ret) >>>> return ret; >>>> >>>> - cpu_pm_exit(); >>>> - >>>> return index; >>>> } >>>> >>>> >>> >> >> > >
Gregory CLEMENT wrote: > Hi Rafael, > > > > On 03/03/2015 11:30, Daniel Lezcano wrote: > >> On 02/27/2015 10:39 AM, Gregory CLEMENT wrote: >> >>> Hi Rafael, >>> >>> On 26/02/2015 22:55, Rafael J. Wysocki wrote: >>> >>>> On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >>>> >>>>> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >>>>> that cpu_pm_enter is not called twice on the same CPU before >>>>> cpu_pm_exit is called.". In the current code in case of failure when >>>>> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >>>>> called whereas cpu_pm_enter() was called just before. >>>>> >>>>> This patch moves the cpu_pm_exit() in order to balance the >>>>> cpu_pm_enter() calls. >>>>> >>>>> Reported-by: Fulvio Benini <fbf@libero.it> >>>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> >>>>> >>>> Should that go to "stable" too? Which "stable" series it should go to if so? >>>> >>> Yes as it fixes a potential issue, you're right it should go >>> to "stable". The bug was here since the introduction of the driver >>> in 3.16. >>> >> Hi Gregory, >> >> actually the 'stable' rules state clearly: >> >> "- It must fix a real bug that bothers people (not a, "This could be a >> problem..." type thing)." >> >> You say "it fixes a potential issue", so no bug has been raise yet, right ? >> > > Indeed nobody claimed yet having a bug related to this issue. > > Gregory > > I reported the issue, but i cannot say if it's a real bug. I had random kernel panics with a Netgear ReadyNAS RN102 (armada 370 cpu): http://www.readynas.com/forum/viewtopic.php?f=20&t=78697&sid=14747617286d55ac27296cdee7a3f420&start=210#p452214 All i can say is that the system use the "armadaxp_idle" driver and works fine when running "stress --cpu 8" in background. I asked Netgear to provide a firmware without the idle driver to confirm if it's the cause of the problem, but they did not answered. Bye, Fulvio -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/2015 11:52 AM, Fulvio wrote: > Gregory CLEMENT wrote: >> Hi Rafael, >> >> >> >> On 03/03/2015 11:30, Daniel Lezcano wrote: >>> On 02/27/2015 10:39 AM, Gregory CLEMENT wrote: >>>> Hi Rafael, >>>> >>>> On 26/02/2015 22:55, Rafael J. Wysocki wrote: >>>>> On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >>>>>> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >>>>>> that cpu_pm_enter is not called twice on the same CPU before >>>>>> cpu_pm_exit is called.". In the current code in case of failure when >>>>>> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >>>>>> called whereas cpu_pm_enter() was called just before. >>>>>> >>>>>> This patch moves the cpu_pm_exit() in order to balance the >>>>>> cpu_pm_enter() calls. >>>>>> >>>>>> Reported-by: Fulvio Benini <fbf@libero.it> >>>>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> >>>>> Should that go to "stable" too? Which "stable" series it should go >>>>> to if so? >>>> Yes as it fixes a potential issue, you're right it should go >>>> to "stable". The bug was here since the introduction of the driver >>>> in 3.16. >>> Hi Gregory, >>> >>> actually the 'stable' rules state clearly: >>> >>> "- It must fix a real bug that bothers people (not a, "This could be >>> a problem..." type thing)." >>> >>> You say "it fixes a potential issue", so no bug has been raise yet, >>> right ? >> >> Indeed nobody claimed yet having a bug related to this issue. >> >> Gregory >> > I reported the issue, but i cannot say if it's a real bug. > I had random kernel panics with a Netgear ReadyNAS RN102 (armada 370 cpu): > http://www.readynas.com/forum/viewtopic.php?f=20&t=78697&sid=14747617286d55ac27296cdee7a3f420&start=210#p452214 > > > All i can say is that the system use the "armadaxp_idle" driver and > works fine when running "stress --cpu 8" in background. Hi Fulvio, so IIUC, you suggest the stress test prevent the system to use cpuidle because of the busy cycles, right ? Is it possible to have the kernel panic stack ? Are you able to reproduce the kernel panic ? and test a new kernel ? Thanks -- Daniel > I asked Netgear to provide a firmware without the idle driver to confirm > if it's the cause of the problem, but they did not answered. > Bye, > Fulvio
Hi Fulvio, On 03/03/2015 11:52, Fulvio wrote: > Gregory CLEMENT wrote: >> Hi Rafael, >> >> >> >> On 03/03/2015 11:30, Daniel Lezcano wrote: >> >>> On 02/27/2015 10:39 AM, Gregory CLEMENT wrote: >>> >>>> Hi Rafael, >>>> >>>> On 26/02/2015 22:55, Rafael J. Wysocki wrote: >>>> >>>>> On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >>>>> >>>>>> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >>>>>> that cpu_pm_enter is not called twice on the same CPU before >>>>>> cpu_pm_exit is called.". In the current code in case of failure when >>>>>> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >>>>>> called whereas cpu_pm_enter() was called just before. >>>>>> >>>>>> This patch moves the cpu_pm_exit() in order to balance the >>>>>> cpu_pm_enter() calls. >>>>>> >>>>>> Reported-by: Fulvio Benini <fbf@libero.it> >>>>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> >>>>>> >>>>> Should that go to "stable" too? Which "stable" series it should go to if so? >>>>> >>>> Yes as it fixes a potential issue, you're right it should go >>>> to "stable". The bug was here since the introduction of the driver >>>> in 3.16. >>>> >>> Hi Gregory, >>> >>> actually the 'stable' rules state clearly: >>> >>> "- It must fix a real bug that bothers people (not a, "This could be a >>> problem..." type thing)." >>> >>> You say "it fixes a potential issue", so no bug has been raise yet, right ? >>> >> >> Indeed nobody claimed yet having a bug related to this issue. >> >> Gregory >> >> > I reported the issue, but i cannot say if it's a real bug. > I had random kernel panics with a Netgear ReadyNAS RN102 (armada 370 cpu): > http://www.readynas.com/forum/viewtopic.php?f=20&t=78697&sid=14747617286d55ac27296cdee7a3f420&start=210#p452214 I didn't know you experimented random kernel panics and that you thought it was related to the CPU Idle driver. > > All i can say is that the system use the "armadaxp_idle" driver and > works fine when running "stress --cpu 8" in background. > I asked Netgear to provide a firmware without the idle driver to confirm > if it's the cause of the problem, but they did not answered. I think that if you disable all the state using by doing an echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable where NUM is the nuymbert of the state. Then it should disable the cpuidle on the fly. Daniel, is it correct? Or if you can modify your bootargs, then just add cpuidle.off=1 to the kernel command line. Thanks, Gregory > Bye, > Fulvio >
On 03/03/2015 01:51 PM, Gregory CLEMENT wrote: > Hi Fulvio, > > On 03/03/2015 11:52, Fulvio wrote: >> Gregory CLEMENT wrote: >>> Hi Rafael, >>> >>> >>> >>> On 03/03/2015 11:30, Daniel Lezcano wrote: >>> >>>> On 02/27/2015 10:39 AM, Gregory CLEMENT wrote: >>>> >>>>> Hi Rafael, >>>>> >>>>> On 26/02/2015 22:55, Rafael J. Wysocki wrote: >>>>> >>>>>> On Thursday, February 26, 2015 06:20:48 PM Gregory CLEMENT wrote: >>>>>> >>>>>>> As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring >>>>>>> that cpu_pm_enter is not called twice on the same CPU before >>>>>>> cpu_pm_exit is called.". In the current code in case of failure when >>>>>>> calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never >>>>>>> called whereas cpu_pm_enter() was called just before. >>>>>>> >>>>>>> This patch moves the cpu_pm_exit() in order to balance the >>>>>>> cpu_pm_enter() calls. >>>>>>> >>>>>>> Reported-by: Fulvio Benini <fbf@libero.it> >>>>>>> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> >>>>>>> >>>>>> Should that go to "stable" too? Which "stable" series it should go to if so? >>>>>> >>>>> Yes as it fixes a potential issue, you're right it should go >>>>> to "stable". The bug was here since the introduction of the driver >>>>> in 3.16. >>>>> >>>> Hi Gregory, >>>> >>>> actually the 'stable' rules state clearly: >>>> >>>> "- It must fix a real bug that bothers people (not a, "This could be a >>>> problem..." type thing)." >>>> >>>> You say "it fixes a potential issue", so no bug has been raise yet, right ? >>>> >>> >>> Indeed nobody claimed yet having a bug related to this issue. >>> >>> Gregory >>> >>> >> I reported the issue, but i cannot say if it's a real bug. >> I had random kernel panics with a Netgear ReadyNAS RN102 (armada 370 cpu): >> http://www.readynas.com/forum/viewtopic.php?f=20&t=78697&sid=14747617286d55ac27296cdee7a3f420&start=210#p452214 > > I didn't know you experimented random kernel panics and that you thought > it was related to the CPU Idle driver. > >> >> All i can say is that the system use the "armadaxp_idle" driver and >> works fine when running "stress --cpu 8" in background. >> I asked Netgear to provide a firmware without the idle driver to confirm >> if it's the cause of the problem, but they did not answered. > > I think that if you disable all the state using by doing an > > echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable > > where NUM is the nuymbert of the state. Then it should disable the > cpuidle on the fly. > > Daniel, is it correct? Yes, it is correct. Make sure it is done on all cpus. > Or if you can modify your bootargs, then just add cpuidle.off=1 to the kernel > command line. > > Thanks, > > Gregory > >> Bye, >> Fulvio >> > >
> I didn't know you experimented random kernel panics and that you thought > it was related to the CPU Idle driver. > > >> All i can say is that the system use the "armadaxp_idle" driver and >> works fine when running "stress --cpu 8" in background. >> I asked Netgear to provide a firmware without the idle driver to confirm >> if it's the cause of the problem, but they did not answered. >> > > I think that if you disable all the state using by doing an > > echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable > > where NUM is the nuymbert of the state. Then it should disable the > cpuidle on the fly. > Thanks, i'll try that as soon as possible (i gave back the unit to my client) and report back. However, the description of the cpu_pm_enter function state: "Must be called on the affected CPU with interrupts disabled. Platform is responsible for ensuring that cpu_pm_enter is not called twice on the same CPU before cpu_pm_exit is called. Notified drivers can include VFP co-processor, interrupt controller and its PM extensions, local CPU timers context save/restore which shouldn't be interrupted. Hence it must be called with interrupts disabled." and the point is: it that an invariant? Do current code and future code safely assume that cpu_pm_enter is not called twice? For example if cpu_pm_enter do "context save" and cpu_pm_exit do "context restore", calling twice cpu_pm_enter will overwrite the previous saved context: is that safe in all circumstances? I assume the rule " It must fix a real bug that bothers people (not a, "This could be a problem..." type thing)." is to avoid committing useless changes that may introduce new bugs, but i do not think that apply to this case: a bug report from an unknown user (me) should change nothing. Bye, Fulvio -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/2015 03:58 PM, Fulvio wrote: > >> I didn't know you experimented random kernel panics and that you thought >> it was related to the CPU Idle driver. >> >>> All i can say is that the system use the "armadaxp_idle" driver and >>> works fine when running "stress --cpu 8" in background. >>> I asked Netgear to provide a firmware without the idle driver to >>> confirm if it's the cause of the problem, but they did not answered. >> >> I think that if you disable all the state using by doing an >> >> echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable >> >> where NUM is the nuymbert of the state. Then it should disable the >> cpuidle on the fly. > Thanks, i'll try that as soon as possible (i gave back the unit to my > client) and report back. > > However, the description of the cpu_pm_enter function state: > "Must be called on the affected CPU with interrupts disabled. Platform > is responsible for ensuring that cpu_pm_enter is not called twice on the > same CPU before cpu_pm_exit is called. Notified drivers can include VFP > co-processor, interrupt controller and its PM extensions, local CPU > timers context save/restore which shouldn't be interrupted. Hence it > must be called with interrupts disabled." > > and the point is: it that an invariant? Do current code and future code > safely assume that cpu_pm_enter is not called twice? The fix is correct. The cpu_pm_enter/exit symmetry must be kept because we don't know what the notifier clients are doing. The point is : can we send it to stable@ as a bug fix or not ? > For example if cpu_pm_enter do "context save" and cpu_pm_exit do > "context restore", calling twice cpu_pm_enter will overwrite the > previous saved context: is that safe in all circumstances? That is the drawback of the notifiers: the kernel provides a service and everyone plug something on it. The cpu_pm notifier are very low level functions, so the answer of your question is not obvious. I already checked all the cpuidle drivers if the potential bug you reported is there or not but apparently everything else is fine, cpu_pm_exit is always called after cpu_pm_enter. As you stated, the API description implies cpu_pm_exit must be called after cpu_pm_enter. So the fix is right. > I assume the rule " It must fix a real bug that bothers people (not a, > "This could be a problem..." type thing)." is to avoid committing > useless changes that may introduce new bugs, but i do not think that > apply to this case: a bug report from an unknown user (me) should change > nothing. It would be perfect if we can succeed to reproduce the bug you are facing and check the patch fixes it. In this case, it goes to stable@ automatically.
On 03/04/2015 03:53 PM, Rafael J. Wysocki wrote: > On Tuesday, March 03, 2015 04:20:26 PM Daniel Lezcano wrote: >> On 03/03/2015 03:58 PM, Fulvio wrote: [ ... ] >> The fix is correct. The cpu_pm_enter/exit symmetry must be kept because >> we don't know what the notifier clients are doing. >> >> The point is : can we send it to stable@ as a bug fix or not ? > > Yes, we can, unless there's a risk of breaking things that cannot be > ignored. Ok, I added the stable@ tag. Thanks -- Daniel
On Tuesday, March 03, 2015 04:20:26 PM Daniel Lezcano wrote: > On 03/03/2015 03:58 PM, Fulvio wrote: > > > >> I didn't know you experimented random kernel panics and that you thought > >> it was related to the CPU Idle driver. > >> > >>> All i can say is that the system use the "armadaxp_idle" driver and > >>> works fine when running "stress --cpu 8" in background. > >>> I asked Netgear to provide a firmware without the idle driver to > >>> confirm if it's the cause of the problem, but they did not answered. > >> > >> I think that if you disable all the state using by doing an > >> > >> echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable > >> > >> where NUM is the nuymbert of the state. Then it should disable the > >> cpuidle on the fly. > > Thanks, i'll try that as soon as possible (i gave back the unit to my > > client) and report back. > > > > However, the description of the cpu_pm_enter function state: > > "Must be called on the affected CPU with interrupts disabled. Platform > > is responsible for ensuring that cpu_pm_enter is not called twice on the > > same CPU before cpu_pm_exit is called. Notified drivers can include VFP > > co-processor, interrupt controller and its PM extensions, local CPU > > timers context save/restore which shouldn't be interrupted. Hence it > > must be called with interrupts disabled." > > > > and the point is: it that an invariant? Do current code and future code > > safely assume that cpu_pm_enter is not called twice? > > The fix is correct. The cpu_pm_enter/exit symmetry must be kept because > we don't know what the notifier clients are doing. > > The point is : can we send it to stable@ as a bug fix or not ? Yes, we can, unless there's a risk of breaking things that cannot be ignored.
diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c index 38e68618513a..cefa07438ae1 100644 --- a/drivers/cpuidle/cpuidle-mvebu-v7.c +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c @@ -37,11 +37,11 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, deepidle = true; ret = mvebu_v7_cpu_suspend(deepidle); + cpu_pm_exit(); + if (ret) return ret; - cpu_pm_exit(); - return index; }
As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring that cpu_pm_enter is not called twice on the same CPU before cpu_pm_exit is called.". In the current code in case of failure when calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never called whereas cpu_pm_enter() was called just before. This patch moves the cpu_pm_exit() in order to balance the cpu_pm_enter() calls. Reported-by: Fulvio Benini <fbf@libero.it> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> --- drivers/cpuidle/cpuidle-mvebu-v7.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)