Message ID | 20220505015814.3727692-8-rui.zhang@intel.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | PM: Solution for S0ix failure caused by PCH overheating | expand |
Hello, I assume I can ignore this patch as this seems to be only for testing I'm not even sure why this is needed as this completely breaks setting the alarm time. On 05/05/2022 09:58:14+0800, Zhang Rui wrote: > Automated suspend/resume testing uses the RTC for wakeup. > A short rtcwake period is desirable, so that more suspend/resume > cycles can be completed, while the machine is available for testing. > > But if too short a wake interval is specified, the event can occur, > while still suspending, and then no event wakes the suspended system > until the user notices that testing has stalled, and manually intervenes. > > Here we add a hook to the rtc-cmos driver to > a) remove the alarm timer in the beginning of suspend, if there is any > b) arm the wakeup in PM notifier callback, which is in the very late stage > before the system really suspends > The remaining part of suspend is usually measured under 10 ms, > and so arming the timer at this point could be as fast as the minimum > time of 1-second. > > But there is a 2nd race. The RTC has 1-second granularity, and unless > you are timing the timer with a higher resolution timer, > there is no telling if the current time + 1 will occur immediately, > or a full second in the future. And so 2-seconds is the safest minimum: > > Run 1,000 s2idle cycles with (max of) 2 second wakeup period: > > # echo 2 > /sys/module/rtc_cmos/parameters/rtc_wake_override_sec > # sleepgraph.py -m freeze -multi 1000 0 -skiphtml -gzip > > Clear the timer override, to not interfere with subsequent > real use of the machine's suspend/resume feature: > > # echo 0 > /sys/module/rtc_cmos/parameters/rtc_wake_override_sec > > Originally-by: Len Brown <len.brown@intel.com> > Signed-off-by: Zhang Rui <rui.zhang@intel.com> > Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> > --- > drivers/rtc/interface.c | 1 + > drivers/rtc/rtc-cmos.c | 45 +++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 46 insertions(+) > > diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c > index 9edd662c69ac..fb93aa2dc99c 100644 > --- a/drivers/rtc/interface.c > +++ b/drivers/rtc/interface.c > @@ -1020,6 +1020,7 @@ void rtc_timer_cancel(struct rtc_device *rtc, struct rtc_timer *timer) > rtc_timer_remove(rtc, timer); > mutex_unlock(&rtc->ops_lock); > } > +EXPORT_SYMBOL_GPL(rtc_timer_cancel); > > /** > * rtc_read_offset - Read the amount of rtc offset in parts per billion > diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c > index 7c006c2b125f..9590c40fa9d8 100644 > --- a/drivers/rtc/rtc-cmos.c > +++ b/drivers/rtc/rtc-cmos.c > @@ -32,6 +32,7 @@ > #include <linux/init.h> > #include <linux/interrupt.h> > #include <linux/spinlock.h> > +#include <linux/suspend.h> > #include <linux/platform_device.h> > #include <linux/log2.h> > #include <linux/pm.h> > @@ -70,6 +71,9 @@ static inline int cmos_use_acpi_alarm(void) > } > #endif > > +static int rtc_wake_override_sec; > +module_param(rtc_wake_override_sec, int, 0644); > + > struct cmos_rtc { > struct rtc_device *rtc; > struct device *dev; > @@ -89,6 +93,7 @@ struct cmos_rtc { > u8 century; > > struct rtc_wkalrm saved_wkalrm; > + struct notifier_block pm_nb; > }; > > /* both platform and pnp busses use negative numbers for invalid irqs */ > @@ -744,6 +749,42 @@ static irqreturn_t cmos_interrupt(int irq, void *p) > return IRQ_NONE; > } > > +static int cmos_pm_notify(struct notifier_block *nb, unsigned long mode, void *_unused) > +{ > + struct cmos_rtc *cmos = container_of(nb, struct cmos_rtc, pm_nb); > + struct rtc_device *rtc = cmos->rtc; > + unsigned long now; > + struct rtc_wkalrm alm; > + > + if (rtc_wake_override_sec == 0) > + return NOTIFY_OK; > + > + switch (mode) { > + case PM_SUSPEND_PREPARE: > + /* > + * Cancel the timer to make sure it won't fire > + * before rtc is rearmed later. > + */ > + rtc_timer_cancel(rtc, &rtc->aie_timer); > + break; > + case PM_SUSPEND_LATE: > + if (rtc_read_time(rtc, &alm.time)) > + return NOTIFY_BAD; > + > + now = rtc_tm_to_time64(&alm.time); > + memset(&alm, 0, sizeof(alm)); > + rtc_time64_to_tm(now + rtc_wake_override_sec, &alm.time); > + alm.enabled = true; > + if (rtc_set_alarm(rtc, &alm)) > + return NOTIFY_BAD; > + if (cmos->wake_on) > + cmos->wake_on(cmos->dev); > + break; > + } > + > + return NOTIFY_OK; > +} > + > #ifdef CONFIG_PNP > #define INITSECTION > > @@ -937,6 +978,9 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq) > nvmem_cfg.size, > use_hpet_alarm() ? ", hpet irqs" : ""); > > + cmos_rtc.pm_nb.notifier_call = cmos_pm_notify; > + register_pm_notifier(&cmos_rtc.pm_nb); > + > return 0; > > cleanup2: > @@ -965,6 +1009,7 @@ static void cmos_do_remove(struct device *dev) > struct cmos_rtc *cmos = dev_get_drvdata(dev); > struct resource *ports; > > + unregister_pm_notifier(&cmos_rtc.pm_nb); > cmos_do_shutdown(cmos->irq); > > if (is_valid_irq(cmos->irq)) { > -- > 2.17.1 >
Hi, Alexandre, Thanks for reviewing the patch. On Fri, 2022-05-06 at 23:46 +0200, Alexandre Belloni wrote: > Hello, > > I assume I can ignore this patch as this seems to be only for testing The main purpose of this patch is for automate testing. But this doesn't mean it cannot be part of upstream code, right? > I'm not even sure why this is needed as this completely breaks > setting > the alarm time. Or overrides the alarm time, :) People rely on the rtc alarm in the automated suspend stress test, which suspend and resume the system for over 1000 iterations. As I mentioned in the cover letter of this patch series, if the system suspend time varies from under 1 second to over 60 seconds, how to alarm the RTC before suspend? This feature is critical in this scenario. Plus, the current solution is transparent to people who don't known/use this "rtc_wake_override_sec" parameter. And for people who use this, they should know that the previous armed RTC alarm will be overrode whenever a system suspend is triggered. I can add a message when the parameter is set, if needed. thanks, rui > > On 05/05/2022 09:58:14+0800, Zhang Rui wrote: > > Automated suspend/resume testing uses the RTC for wakeup. > > A short rtcwake period is desirable, so that more suspend/resume > > cycles can be completed, while the machine is available for > > testing. > > > > But if too short a wake interval is specified, the event can occur, > > while still suspending, and then no event wakes the suspended > > system > > until the user notices that testing has stalled, and manually > > intervenes. > > > > Here we add a hook to the rtc-cmos driver to > > a) remove the alarm timer in the beginning of suspend, if there is > > any > > b) arm the wakeup in PM notifier callback, which is in the very > > late stage > > before the system really suspends > > The remaining part of suspend is usually measured under 10 ms, > > and so arming the timer at this point could be as fast as the > > minimum > > time of 1-second. > > > > But there is a 2nd race. The RTC has 1-second granularity, and > > unless > > you are timing the timer with a higher resolution timer, > > there is no telling if the current time + 1 will occur immediately, > > or a full second in the future. And so 2-seconds is the safest > > minimum: > > > > Run 1,000 s2idle cycles with (max of) 2 second wakeup period: > > > > # echo 2 > /sys/module/rtc_cmos/parameters/rtc_wake_override_sec > > # sleepgraph.py -m freeze -multi 1000 0 -skiphtml -gzip > > > > Clear the timer override, to not interfere with subsequent > > real use of the machine's suspend/resume feature: > > > > # echo 0 > /sys/module/rtc_cmos/parameters/rtc_wake_override_sec > > > > Originally-by: Len Brown <len.brown@intel.com> > > Signed-off-by: Zhang Rui <rui.zhang@intel.com> > > Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> > > --- > > drivers/rtc/interface.c | 1 + > > drivers/rtc/rtc-cmos.c | 45 > > +++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 46 insertions(+) > > > > diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c > > index 9edd662c69ac..fb93aa2dc99c 100644 > > --- a/drivers/rtc/interface.c > > +++ b/drivers/rtc/interface.c > > @@ -1020,6 +1020,7 @@ void rtc_timer_cancel(struct rtc_device *rtc, > > struct rtc_timer *timer) > > rtc_timer_remove(rtc, timer); > > mutex_unlock(&rtc->ops_lock); > > } > > +EXPORT_SYMBOL_GPL(rtc_timer_cancel); > > > > /** > > * rtc_read_offset - Read the amount of rtc offset in parts per > > billion > > diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c > > index 7c006c2b125f..9590c40fa9d8 100644 > > --- a/drivers/rtc/rtc-cmos.c > > +++ b/drivers/rtc/rtc-cmos.c > > @@ -32,6 +32,7 @@ > > #include <linux/init.h> > > #include <linux/interrupt.h> > > #include <linux/spinlock.h> > > +#include <linux/suspend.h> > > #include <linux/platform_device.h> > > #include <linux/log2.h> > > #include <linux/pm.h> > > @@ -70,6 +71,9 @@ static inline int cmos_use_acpi_alarm(void) > > } > > #endif > > > > +static int rtc_wake_override_sec; > > +module_param(rtc_wake_override_sec, int, 0644); > > + > > struct cmos_rtc { > > struct rtc_device *rtc; > > struct device *dev; > > @@ -89,6 +93,7 @@ struct cmos_rtc { > > u8 century; > > > > struct rtc_wkalrm saved_wkalrm; > > + struct notifier_block pm_nb; > > }; > > > > /* both platform and pnp busses use negative numbers for invalid > > irqs */ > > @@ -744,6 +749,42 @@ static irqreturn_t cmos_interrupt(int irq, > > void *p) > > return IRQ_NONE; > > } > > > > +static int cmos_pm_notify(struct notifier_block *nb, unsigned long > > mode, void *_unused) > > +{ > > + struct cmos_rtc *cmos = container_of(nb, struct cmos_rtc, > > pm_nb); > > + struct rtc_device *rtc = cmos->rtc; > > + unsigned long now; > > + struct rtc_wkalrm alm; > > + > > + if (rtc_wake_override_sec == 0) > > + return NOTIFY_OK; > > + > > + switch (mode) { > > + case PM_SUSPEND_PREPARE: > > + /* > > + * Cancel the timer to make sure it won't fire > > + * before rtc is rearmed later. > > + */ > > + rtc_timer_cancel(rtc, &rtc->aie_timer); > > + break; > > + case PM_SUSPEND_LATE: > > + if (rtc_read_time(rtc, &alm.time)) > > + return NOTIFY_BAD; > > + > > + now = rtc_tm_to_time64(&alm.time); > > + memset(&alm, 0, sizeof(alm)); > > + rtc_time64_to_tm(now + rtc_wake_override_sec, > > &alm.time); > > + alm.enabled = true; > > + if (rtc_set_alarm(rtc, &alm)) > > + return NOTIFY_BAD; > > + if (cmos->wake_on) > > + cmos->wake_on(cmos->dev); > > + break; > > + } > > + > > + return NOTIFY_OK; > > +} > > + > > #ifdef CONFIG_PNP > > #define INITSECTION > > > > @@ -937,6 +978,9 @@ cmos_do_probe(struct device *dev, struct > > resource *ports, int rtc_irq) > > nvmem_cfg.size, > > use_hpet_alarm() ? ", hpet irqs" : ""); > > > > + cmos_rtc.pm_nb.notifier_call = cmos_pm_notify; > > + register_pm_notifier(&cmos_rtc.pm_nb); > > + > > return 0; > > > > cleanup2: > > @@ -965,6 +1009,7 @@ static void cmos_do_remove(struct device *dev) > > struct cmos_rtc *cmos = dev_get_drvdata(dev); > > struct resource *ports; > > > > + unregister_pm_notifier(&cmos_rtc.pm_nb); > > cmos_do_shutdown(cmos->irq); > > > > if (is_valid_irq(cmos->irq)) { > > -- > > 2.17.1 > > > >
On 07/05/2022 10:00:40+0800, Zhang Rui wrote: > Hi, Alexandre, > > Thanks for reviewing the patch. > > On Fri, 2022-05-06 at 23:46 +0200, Alexandre Belloni wrote: > > Hello, > > > > I assume I can ignore this patch as this seems to be only for testing > > The main purpose of this patch is for automate testing. > But this doesn't mean it cannot be part of upstream code, right? > > > I'm not even sure why this is needed as this completely breaks > > setting > > the alarm time. > > Or overrides the alarm time, :) > > People rely on the rtc alarm in the automated suspend stress test, > which suspend and resume the system for over 1000 iterations. > As I mentioned in the cover letter of this patch series, if the system > suspend time varies from under 1 second to over 60 seconds, how to > alarm the RTC before suspend? > This feature is critical in this scenario. > > Plus, the current solution is transparent to people who don't known/use > this "rtc_wake_override_sec" parameter. And for people who use this, > they should know that the previous armed RTC alarm will be overrode > whenever a system suspend is triggered. I can add a message when the > parameter is set, if needed. > It is not transparent, if I read your patch properly, this breaks wakeup for everyone... > > On 05/05/2022 09:58:14+0800, Zhang Rui wrote: > > > +static int cmos_pm_notify(struct notifier_block *nb, unsigned long > > > mode, void *_unused) > > > +{ > > > + struct cmos_rtc *cmos = container_of(nb, struct cmos_rtc, > > > pm_nb); > > > + struct rtc_device *rtc = cmos->rtc; > > > + unsigned long now; > > > + struct rtc_wkalrm alm; > > > + > > > + if (rtc_wake_override_sec == 0) > > > + return NOTIFY_OK; > > > + > > > + switch (mode) { > > > + case PM_SUSPEND_PREPARE: > > > + /* > > > + * Cancel the timer to make sure it won't fire > > > + * before rtc is rearmed later. > > > + */ > > > + rtc_timer_cancel(rtc, &rtc->aie_timer); > > > + break; > > > + case PM_SUSPEND_LATE: > > > + if (rtc_read_time(rtc, &alm.time)) > > > + return NOTIFY_BAD; > > > + > > > + now = rtc_tm_to_time64(&alm.time); > > > + memset(&alm, 0, sizeof(alm)); > > > + rtc_time64_to_tm(now + rtc_wake_override_sec, > > > &alm.time); > > > + alm.enabled = true; > > > + if (rtc_set_alarm(rtc, &alm)) > > > + return NOTIFY_BAD; ... because if rtc_wake_override_sec is not set, this sets the alarm to now which is the current RTC time, ensuring the alarm will never trigger.
On Sat, 2022-05-07 at 09:31 +0200, Alexandre Belloni wrote: > On 07/05/2022 10:00:40+0800, Zhang Rui wrote: > > Hi, Alexandre, > > > > Thanks for reviewing the patch. > > > > On Fri, 2022-05-06 at 23:46 +0200, Alexandre Belloni wrote: > > > Hello, > > > > > > I assume I can ignore this patch as this seems to be only for > > > testing > > > > The main purpose of this patch is for automate testing. > > But this doesn't mean it cannot be part of upstream code, right? > > > > > I'm not even sure why this is needed as this completely breaks > > > setting > > > the alarm time. > > > > Or overrides the alarm time, :) > > > > People rely on the rtc alarm in the automated suspend stress test, > > which suspend and resume the system for over 1000 iterations. > > As I mentioned in the cover letter of this patch series, if the > > system > > suspend time varies from under 1 second to over 60 seconds, how to > > alarm the RTC before suspend? > > This feature is critical in this scenario. > > > > Plus, the current solution is transparent to people who don't > > known/use > > this "rtc_wake_override_sec" parameter. And for people who use > > this, > > they should know that the previous armed RTC alarm will be overrode > > whenever a system suspend is triggered. I can add a message when > > the > > parameter is set, if needed. > > > > It is not transparent, if I read your patch properly, this breaks > wakeup > for everyone... > > > > On 05/05/2022 09:58:14+0800, Zhang Rui wrote: > > > > +static int cmos_pm_notify(struct notifier_block *nb, unsigned > > > > long > > > > mode, void *_unused) > > > > +{ > > > > + struct cmos_rtc *cmos = container_of(nb, struct > > > > cmos_rtc, > > > > pm_nb); > > > > + struct rtc_device *rtc = cmos->rtc; > > > > + unsigned long now; > > > > + struct rtc_wkalrm alm; > > > > + > > > > + if (rtc_wake_override_sec == 0) > > > > + return NOTIFY_OK; > > > > + > > > > + switch (mode) { > > > > + case PM_SUSPEND_PREPARE: > > > > + /* > > > > + * Cancel the timer to make sure it won't fire > > > > + * before rtc is rearmed later. > > > > + */ > > > > + rtc_timer_cancel(rtc, &rtc->aie_timer); > > > > + break; > > > > + case PM_SUSPEND_LATE: > > > > + if (rtc_read_time(rtc, &alm.time)) > > > > + return NOTIFY_BAD; > > > > + > > > > + now = rtc_tm_to_time64(&alm.time); > > > > + memset(&alm, 0, sizeof(alm)); > > > > + rtc_time64_to_tm(now + rtc_wake_override_sec, > > > > &alm.time); > > > > + alm.enabled = true; > > > > + if (rtc_set_alarm(rtc, &alm)) > > > > + return NOTIFY_BAD; > > ... because if rtc_wake_override_sec is not set, this sets the alarm > to > now which is the current RTC time, ensuring the alarm will never > trigger. No. As the code below > > > > > > > > if (rtc_wake_override_sec == 0) > > > > + return NOTIFY_OK; We check for rtc_wake_override_sec at the beginning of the notifier callback. So it takes effect only if a) rtc_wake_override_sec is set, AND b) a suspend is triggered. thanks, rui
Hi, Alexandre, May I know if this addresses your concern or not? thanks, rui On Sat, 2022-05-07 at 15:41 +0800, Zhang Rui wrote: > On Sat, 2022-05-07 at 09:31 +0200, Alexandre Belloni wrote: > > On 07/05/2022 10:00:40+0800, Zhang Rui wrote: > > > Hi, Alexandre, > > > > > > Thanks for reviewing the patch. > > > > > > On Fri, 2022-05-06 at 23:46 +0200, Alexandre Belloni wrote: > > > > Hello, > > > > > > > > I assume I can ignore this patch as this seems to be only for > > > > testing > > > > > > The main purpose of this patch is for automate testing. > > > But this doesn't mean it cannot be part of upstream code, right? > > > > > > > I'm not even sure why this is needed as this completely breaks > > > > setting > > > > the alarm time. > > > > > > Or overrides the alarm time, :) > > > > > > People rely on the rtc alarm in the automated suspend stress > > > test, > > > which suspend and resume the system for over 1000 iterations. > > > As I mentioned in the cover letter of this patch series, if the > > > system > > > suspend time varies from under 1 second to over 60 seconds, how > > > to > > > alarm the RTC before suspend? > > > This feature is critical in this scenario. > > > > > > Plus, the current solution is transparent to people who don't > > > known/use > > > this "rtc_wake_override_sec" parameter. And for people who use > > > this, > > > they should know that the previous armed RTC alarm will be > > > overrode > > > whenever a system suspend is triggered. I can add a message when > > > the > > > parameter is set, if needed. > > > > > > > It is not transparent, if I read your patch properly, this breaks > > wakeup > > for everyone... > > > > > > On 05/05/2022 09:58:14+0800, Zhang Rui wrote: > > > > > +static int cmos_pm_notify(struct notifier_block *nb, > > > > > unsigned > > > > > long > > > > > mode, void *_unused) > > > > > +{ > > > > > + struct cmos_rtc *cmos = container_of(nb, struct > > > > > cmos_rtc, > > > > > pm_nb); > > > > > + struct rtc_device *rtc = cmos->rtc; > > > > > + unsigned long now; > > > > > + struct rtc_wkalrm alm; > > > > > + > > > > > + if (rtc_wake_override_sec == 0) > > > > > + return NOTIFY_OK; > > > > > + > > > > > + switch (mode) { > > > > > + case PM_SUSPEND_PREPARE: > > > > > + /* > > > > > + * Cancel the timer to make sure it won't fire > > > > > + * before rtc is rearmed later. > > > > > + */ > > > > > + rtc_timer_cancel(rtc, &rtc->aie_timer); > > > > > + break; > > > > > + case PM_SUSPEND_LATE: > > > > > + if (rtc_read_time(rtc, &alm.time)) > > > > > + return NOTIFY_BAD; > > > > > + > > > > > + now = rtc_tm_to_time64(&alm.time); > > > > > + memset(&alm, 0, sizeof(alm)); > > > > > + rtc_time64_to_tm(now + rtc_wake_override_sec, > > > > > &alm.time); > > > > > + alm.enabled = true; > > > > > + if (rtc_set_alarm(rtc, &alm)) > > > > > + return NOTIFY_BAD; > > > > ... because if rtc_wake_override_sec is not set, this sets the > > alarm > > to > > now which is the current RTC time, ensuring the alarm will never > > trigger. > > No. > As the code below > > > > > > > > > > if (rtc_wake_override_sec == 0) > > > > > + return NOTIFY_OK; > > We check for rtc_wake_override_sec at the beginning of the notifier > callback. So it takes effect only if a) rtc_wake_override_sec is set, > AND b) a suspend is triggered. > > thanks, > rui > >
On Thu, May 5, 2022 at 3:58 AM Zhang Rui <rui.zhang@intel.com> wrote: > > Automated suspend/resume testing uses the RTC for wakeup. > A short rtcwake period is desirable, so that more suspend/resume > cycles can be completed, while the machine is available for testing. > > But if too short a wake interval is specified, the event can occur, > while still suspending, and then no event wakes the suspended system > until the user notices that testing has stalled, and manually intervenes. If the wakeup event occurs while still suspending, it should abort the suspend in progress, shouldn't it? But the above implies that it doesn't do that. If this is fixed, wouldn't it address the issue at hand?
On Tue, 2022-05-17 at 17:14 +0200, Rafael J. Wysocki wrote: > On Thu, May 5, 2022 at 3:58 AM Zhang Rui <rui.zhang@intel.com> wrote: > > > > Automated suspend/resume testing uses the RTC for wakeup. > > A short rtcwake period is desirable, so that more suspend/resume > > cycles can be completed, while the machine is available for > > testing. > > > > But if too short a wake interval is specified, the event can occur, > > while still suspending, and then no event wakes the suspended > > system > > until the user notices that testing has stalled, and manually > > intervenes. > > If the wakeup event occurs while still suspending, it should abort > the > suspend in progress, shouldn't it? But the above implies that it > doesn't do that. > > If this is fixed, wouldn't it address the issue at hand? I think the rootcause of the original problem is that 1. on some systems, the ACPI RTC Fixed event is used during suspend only, and the ACPI Fixed event is enabled in the rtc-cmos driver .suspend() callback and 2. if the RTC Alarm already expires before .suspend() invoked, we will lose the ACPI RTC Fixed Event as well as the wakeup event, say 20 seconds delay in freeze processes. But, even if that problem is fixed, the suspend aborts and "fails" as expected, this is still a problem for the suspend-automation scenario, because the system actually can suspend successfully if we don't set the RTC alarm too aggressively. And in PCH overheating case, surely we will get false alarms, because we will never use a 60s+ rtc alarm for suspend-automation. thanks, rui
On Wed, May 18, 2022 at 4:45 PM Zhang Rui <rui.zhang@intel.com> wrote: > > On Tue, 2022-05-17 at 17:14 +0200, Rafael J. Wysocki wrote: > > On Thu, May 5, 2022 at 3:58 AM Zhang Rui <rui.zhang@intel.com> wrote: > > > > > > Automated suspend/resume testing uses the RTC for wakeup. > > > A short rtcwake period is desirable, so that more suspend/resume > > > cycles can be completed, while the machine is available for > > > testing. > > > > > > But if too short a wake interval is specified, the event can occur, > > > while still suspending, and then no event wakes the suspended > > > system > > > until the user notices that testing has stalled, and manually > > > intervenes. > > > > If the wakeup event occurs while still suspending, it should abort > > the > > suspend in progress, shouldn't it? But the above implies that it > > doesn't do that. > > > > If this is fixed, wouldn't it address the issue at hand? > > I think the rootcause of the original problem is that > 1. on some systems, the ACPI RTC Fixed event is used during suspend > only, and the ACPI Fixed event is enabled in the rtc-cmos driver > .suspend() callback > and > 2. if the RTC Alarm already expires before .suspend() invoked, we will > lose the ACPI RTC Fixed Event as well as the wakeup event, say 20 > seconds delay in freeze processes. Well, the RTC Fixed event can be armed in a PM/HIBERNATE notifier and if it fires before .suspend() runs, system wakeup can be triggered from there. > But, even if that problem is fixed, the suspend aborts and "fails" as > expected, this is still a problem for the suspend-automation scenario, > because the system actually can suspend successfully if we don't set > the RTC alarm too aggressively. And in PCH overheating case, surely we > will get false alarms, because we will never use a 60s+ rtc alarm for > suspend-automation. I'm not sure why this is a problem. It only means that occasionally the system will not reach the final "suspended" state, but that can happen regardless.
On Wed, 2022-05-18 at 17:02 +0200, Rafael J. Wysocki wrote: > On Wed, May 18, 2022 at 4:45 PM Zhang Rui <rui.zhang@intel.com> > wrote: > > > > On Tue, 2022-05-17 at 17:14 +0200, Rafael J. Wysocki wrote: > > > On Thu, May 5, 2022 at 3:58 AM Zhang Rui <rui.zhang@intel.com> > > > wrote: > > > > > > > > Automated suspend/resume testing uses the RTC for wakeup. > > > > A short rtcwake period is desirable, so that more > > > > suspend/resume > > > > cycles can be completed, while the machine is available for > > > > testing. > > > > > > > > But if too short a wake interval is specified, the event can > > > > occur, > > > > while still suspending, and then no event wakes the suspended > > > > system > > > > until the user notices that testing has stalled, and manually > > > > intervenes. > > > > > > If the wakeup event occurs while still suspending, it should > > > abort > > > the > > > suspend in progress, shouldn't it? But the above implies that it > > > doesn't do that. > > > > > > If this is fixed, wouldn't it address the issue at hand? > > > > I think the rootcause of the original problem is that > > 1. on some systems, the ACPI RTC Fixed event is used during suspend > > only, and the ACPI Fixed event is enabled in the rtc-cmos driver > > .suspend() callback > > and > > 2. if the RTC Alarm already expires before .suspend() invoked, we > > will > > lose the ACPI RTC Fixed Event as well as the wakeup event, say 20 > > seconds delay in freeze processes. > > Well, the RTC Fixed event can be armed in a PM/HIBERNATE notifier and > if it fires before .suspend() runs, system wakeup can be triggered > from there. Agreed. Len, Do you recall any other case that we may miss the RTC wakeup event? > > > But, even if that problem is fixed, the suspend aborts and "fails" > > as > > expected, this is still a problem for the suspend-automation > > scenario, > > because the system actually can suspend successfully if we don't > > set > > the RTC alarm too aggressively. And in PCH overheating case, surely > > we > > will get false alarms, because we will never use a 60s+ rtc alarm > > for > > suspend-automation. > > I'm not sure why this is a problem. > > It only means that occasionally the system will not reach the final > "suspended" state, but that can happen regardless. It is not a kernel problem. It is a problem for suspend-automation. Because suspend-automation is chasing for kernel suspend problems, and IMO, cases like suspend aborts because of long suspend delay from PCH thermal driver are not kernel problems. It would be nice to leverage a kernel I/F to get rid of such issues, But if the patch is rejected, I agree we can live without it. thanks, rui
First let's agree on why this should not be ignored. Our development team at Intel has lab with laptops, we run sleepgraph on every RC, and we publish the tool in public: https://www.intel.com/content/www/us/en/developer/topic-technology/open/pm-graph/overview.html But even if we were funded to do it (which we are not), we can't possibly test every kind of device. We need the community to help testing Linux (suspend/resume, specifically) on a broad range of devices, so together we can make it better for all. The community is made up mostly of users, rather than kernel hackers, and so this effectively means that distro binary kernels need to be able to support testing. Enabling that broad community of users/contributors is the goal. As Rui explained, this patch does nothing and breaks nothing if the new hook remains unused. If it is used, then overrides the wakeup duration for all subsequent system suspends, until it is cleared. If it does more than that, or does that in a clumsy way, then let's fix that. Today it gives us two new capabilities: 1. Prevents a lost wake event. Commonly we see this with kcompatd taking 20 seconds when we had previously armed the RTC for 15 seconds. The system will sleep forever, until the user intervenes -- which may be a very long time later. Rafael, If you have a better way to fix that, I'm all ears. Aborted suspend flows are ugly -- particularly when the user didn't want them, but they are much less ugly then losing a wake event, which can result in losing, say 10-hours of test time. 2. Allows more suspends/resume cycles per time. Say the early wake is fixed. Then we have to decide how long to sleep before being suspended. If we set it for 1 second, and suspend takes longer than 1 second, then all of our tests will fail with early wakeups and we have tested nothing. If we set it to 60 seconds, and suspend takes 1 second, then 59/60 seconds are spent sleeping, when they could be spent testing Linux. With this patch, we can set it to the minimum of 2 seconds right before we sleep, guaranteeing that we spend at least 1 second, and under 2 seconds sleeping, and the rest of the time testing -- which allows us to meet the goal. thanks, Len Brown, Intel
On Thu, May 19, 2022 at 4:33 AM Len Brown <lenb@kernel.org> wrote: > > First let's agree on why this should not be ignored. > > Our development team at Intel has lab with laptops, we run sleepgraph > on every RC, and we publish the tool in public: > https://www.intel.com/content/www/us/en/developer/topic-technology/open/pm-graph/overview.html > > But even if we were funded to do it (which we are not), we can't > possibly test every kind of device. > We need the community to help testing Linux (suspend/resume, > specifically) on a broad range of devices, so together we can make it > better for all. > > The community is made up mostly of users, rather than kernel hackers, > and so this effectively means that distro binary kernels need to be > able to support testing. > > Enabling that broad community of users/contributors is the goal. > > As Rui explained, this patch does nothing and breaks nothing if the > new hook remains unused. > If it is used, then overrides the wakeup duration for all subsequent > system suspends, until it is cleared. > If it does more than that, or does that in a clumsy way, then let's fix that. > > Today it gives us two new capabilities: > > 1. Prevents a lost wake event. Commonly we see this with kcompatd > taking 20 seconds when we had previously armed the RTC for 15 seconds. > The system will sleep forever, until the user intervenes -- which may > be a very long time later. > > Rafael, If you have a better way to fix that, I'm all ears. Aborted > suspend flows are ugly -- particularly when the user didn't want them, > but they are much less ugly then losing a wake event, which can result > in losing, say 10-hours of test time. The real problem here is the missed wakeup events and I've already said in this thread how this can be fixed and Rui appears to agree with me. So I'd say why don't we just go and fix it? And it is orthogonal to the first 3 patches in this series, because they move the PCH thermal delay to the noirq phase which is later than the arming of the RTC Fixed Event IIUC. > 2. Allows more suspends/resume cycles per time. Say the early wake is > fixed. Then we have to decide how long to sleep before being > suspended. If we set it for 1 second, and suspend takes longer than 1 > second, then all of our tests will fail with early wakeups and we have > tested nothing. We have tested "early" wakeups which is what the users would see on the system in question if they set the RTC wake time to 1 second before suspending. This may not be what we want to test, though, but that is a different problem, as discussed below. > If we set it to 60 seconds, and suspend takes 1 > second, then 59/60 seconds are spent sleeping, when they could be > spent testing Linux. With this patch, we can set it to the minimum of > 2 seconds right before we sleep, guaranteeing that we spend at least 1 > second, and under 2 seconds sleeping, and the rest of the time testing > -- which allows us to meet the goal. So the goal specifically is to test the last phase of system suspend and in particular whether or not the platform has reached the specific minimum-power state at the end of it. In order to do that, we basically want to ignore all of the wakeup events except for the special RTC wakeup 1 second after the platform has been asked to get into the minimum-power state, so what we are talking about here really is a special suspend test mode using the RTC as a wakeup vehicle.
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c index 9edd662c69ac..fb93aa2dc99c 100644 --- a/drivers/rtc/interface.c +++ b/drivers/rtc/interface.c @@ -1020,6 +1020,7 @@ void rtc_timer_cancel(struct rtc_device *rtc, struct rtc_timer *timer) rtc_timer_remove(rtc, timer); mutex_unlock(&rtc->ops_lock); } +EXPORT_SYMBOL_GPL(rtc_timer_cancel); /** * rtc_read_offset - Read the amount of rtc offset in parts per billion diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c index 7c006c2b125f..9590c40fa9d8 100644 --- a/drivers/rtc/rtc-cmos.c +++ b/drivers/rtc/rtc-cmos.c @@ -32,6 +32,7 @@ #include <linux/init.h> #include <linux/interrupt.h> #include <linux/spinlock.h> +#include <linux/suspend.h> #include <linux/platform_device.h> #include <linux/log2.h> #include <linux/pm.h> @@ -70,6 +71,9 @@ static inline int cmos_use_acpi_alarm(void) } #endif +static int rtc_wake_override_sec; +module_param(rtc_wake_override_sec, int, 0644); + struct cmos_rtc { struct rtc_device *rtc; struct device *dev; @@ -89,6 +93,7 @@ struct cmos_rtc { u8 century; struct rtc_wkalrm saved_wkalrm; + struct notifier_block pm_nb; }; /* both platform and pnp busses use negative numbers for invalid irqs */ @@ -744,6 +749,42 @@ static irqreturn_t cmos_interrupt(int irq, void *p) return IRQ_NONE; } +static int cmos_pm_notify(struct notifier_block *nb, unsigned long mode, void *_unused) +{ + struct cmos_rtc *cmos = container_of(nb, struct cmos_rtc, pm_nb); + struct rtc_device *rtc = cmos->rtc; + unsigned long now; + struct rtc_wkalrm alm; + + if (rtc_wake_override_sec == 0) + return NOTIFY_OK; + + switch (mode) { + case PM_SUSPEND_PREPARE: + /* + * Cancel the timer to make sure it won't fire + * before rtc is rearmed later. + */ + rtc_timer_cancel(rtc, &rtc->aie_timer); + break; + case PM_SUSPEND_LATE: + if (rtc_read_time(rtc, &alm.time)) + return NOTIFY_BAD; + + now = rtc_tm_to_time64(&alm.time); + memset(&alm, 0, sizeof(alm)); + rtc_time64_to_tm(now + rtc_wake_override_sec, &alm.time); + alm.enabled = true; + if (rtc_set_alarm(rtc, &alm)) + return NOTIFY_BAD; + if (cmos->wake_on) + cmos->wake_on(cmos->dev); + break; + } + + return NOTIFY_OK; +} + #ifdef CONFIG_PNP #define INITSECTION @@ -937,6 +978,9 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq) nvmem_cfg.size, use_hpet_alarm() ? ", hpet irqs" : ""); + cmos_rtc.pm_nb.notifier_call = cmos_pm_notify; + register_pm_notifier(&cmos_rtc.pm_nb); + return 0; cleanup2: @@ -965,6 +1009,7 @@ static void cmos_do_remove(struct device *dev) struct cmos_rtc *cmos = dev_get_drvdata(dev); struct resource *ports; + unregister_pm_notifier(&cmos_rtc.pm_nb); cmos_do_shutdown(cmos->irq); if (is_valid_irq(cmos->irq)) {