Message ID | 20210615123904.2568052-2-grzegorz.jaszczyk@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | introduce watchdog_dev_suspend/resume | expand |
On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote: > > On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote: > > The watchdog drivers often disable wdog clock during suspend and then > > enable it again during resume. Nevertheless the ping worker is still > > running and can issue low-level ping while the wdog clock is disabled > > causing the system hang. To prevent such condition introduce > > watchdog_dev_suspend/resume which can be used by any wdog driver and > > actually cancel ping worker during suspend and restore it back, if > > needed, during resume. > > > > I'll have to look into this further, but I don't think this is the correct > solution. Most likely the watchdog core needs to have its own independent > suspend/resule functions and suspend the high resolution timer on > suspend and restore it on resume. This may require an additional flag > to be set by drivers to indicate that the timer should be stopped on > suspend. That makes sense - thank you for your suggestion. I think I could register a pm notifier in the watchdog core when the new e.g. WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and actually call watchdog_dev_suspend/resume from the notifier callback. Please let me know if you see any other issue with this solution, if not I will post v2. Thank you in advance, Grzegorz > > > Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org> > > --- > > drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++ > > include/linux/watchdog.h | 2 ++ > > 2 files changed, 51 insertions(+) > > > > diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c > > index 2946f3a63110..3feca1567281 100644 > > --- a/drivers/watchdog/watchdog_dev.c > > +++ b/drivers/watchdog/watchdog_dev.c > > @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void) > > kthread_destroy_worker(watchdog_kworker); > > } > > > > +int watchdog_dev_suspend(struct watchdog_device *wdd) > > +{ > > + struct watchdog_core_data *wd_data = wdd->wd_data; > > + int ret; > > + > > + if (!wdd->wd_data) > > + return -ENODEV; > > + > > + /* ping for the last time before suspend */ > > + mutex_lock(&wd_data->lock); > > + if (watchdog_worker_should_ping(wd_data)) > > + ret = __watchdog_ping(wd_data->wdd); > > + mutex_unlock(&wd_data->lock); > > + > > + if (ret) > > + return ret; > > + > > + /* > > + * make sure that watchdog worker will not kick in when the wdog is > > + * suspended > > + */ > > + hrtimer_cancel(&wd_data->timer); > > + kthread_cancel_work_sync(&wd_data->work); > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(watchdog_dev_suspend); > > + > > +int watchdog_dev_resume(struct watchdog_device *wdd) > > +{ > > + struct watchdog_core_data *wd_data = wdd->wd_data; > > + int ret; > > + > > + if (!wdd->wd_data) > > + return -ENODEV; > > + > > + /* > > + * __watchdog_ping will also retrigger hrtimer and therefore restore the > > + * ping worker if needed. > > + */ > > + mutex_lock(&wd_data->lock); > > + if (watchdog_worker_should_ping(wd_data)) > > + ret = __watchdog_ping(wd_data->wdd); > > + mutex_unlock(&wd_data->lock); > > + > > + return ret; > > +} > > +EXPORT_SYMBOL_GPL(watchdog_dev_resume); > > + > > module_param(handle_boot_enabled, bool, 0444); > > MODULE_PARM_DESC(handle_boot_enabled, > > "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default=" > > diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h > > index 9b19e6bb68b5..febfde3b1ff6 100644 > > --- a/include/linux/watchdog.h > > +++ b/include/linux/watchdog.h > > @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd, > > unsigned int timeout_parm, struct device *dev); > > extern int watchdog_register_device(struct watchdog_device *); > > extern void watchdog_unregister_device(struct watchdog_device *); > > +int watchdog_dev_suspend(struct watchdog_device *wdd); > > +int watchdog_dev_resume(struct watchdog_device *wdd); > > > > int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int); > > > > -- > > 2.29.0 > >
On Wed, Jun 16, 2021 at 03:59:23PM +0200, Grzegorz Jaszczyk wrote: > On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote: > > > > On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote: > > > The watchdog drivers often disable wdog clock during suspend and then > > > enable it again during resume. Nevertheless the ping worker is still > > > running and can issue low-level ping while the wdog clock is disabled > > > causing the system hang. To prevent such condition introduce > > > watchdog_dev_suspend/resume which can be used by any wdog driver and > > > actually cancel ping worker during suspend and restore it back, if > > > needed, during resume. > > > > > > > I'll have to look into this further, but I don't think this is the correct > > solution. Most likely the watchdog core needs to have its own independent > > suspend/resule functions and suspend the high resolution timer on > > suspend and restore it on resume. This may require an additional flag > > to be set by drivers to indicate that the timer should be stopped on > > suspend. > > That makes sense - thank you for your suggestion. I think I could > register a pm notifier in the watchdog core when the new e.g. > WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and > actually call watchdog_dev_suspend/resume from the notifier callback. > Please let me know if you see any other issue with this solution, if > not I will post v2. > Go for it. Thanks, Guenter
Hi Grzegorz, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on hwmon/hwmon-next] [also build test WARNING on soc/for-next linus/master v5.13-rc6 next-20210616] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447 base: https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next config: arm64-randconfig-r022-20210615 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 64720f57bea6a6bf033feef4a5751ab9c0c3b401) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # https://github.com/0day-ci/linux/commit/71dadcad8da1862c1205b19ddf4279d494e57545 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447 git checkout 71dadcad8da1862c1205b19ddf4279d494e57545 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): >> drivers/watchdog/watchdog_dev.c:1232:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:28: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var' #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/watchdog/watchdog_dev.c:1236:6: note: uninitialized use occurs here if (ret) ^~~ include/linux/compiler.h:56:47: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^~~~ include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var' #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond)) ^~~~ drivers/watchdog/watchdog_dev.c:1232:2: note: remove the 'if' if its condition is always true if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:23: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^ drivers/watchdog/watchdog_dev.c:1225:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 drivers/watchdog/watchdog_dev.c:1263:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:28: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var' #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/watchdog/watchdog_dev.c:1267:9: note: uninitialized use occurs here return ret; ^~~ drivers/watchdog/watchdog_dev.c:1263:2: note: remove the 'if' if its condition is always true if (watchdog_worker_should_ping(wd_data)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:56:23: note: expanded from macro 'if' #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) ) ^ drivers/watchdog/watchdog_dev.c:1253:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 2 warnings generated. vim +1232 drivers/watchdog/watchdog_dev.c 1221 1222 int watchdog_dev_suspend(struct watchdog_device *wdd) 1223 { 1224 struct watchdog_core_data *wd_data = wdd->wd_data; 1225 int ret; 1226 1227 if (!wdd->wd_data) 1228 return -ENODEV; 1229 1230 /* ping for the last time before suspend */ 1231 mutex_lock(&wd_data->lock); > 1232 if (watchdog_worker_should_ping(wd_data)) 1233 ret = __watchdog_ping(wd_data->wdd); 1234 mutex_unlock(&wd_data->lock); 1235 1236 if (ret) 1237 return ret; 1238 1239 /* 1240 * make sure that watchdog worker will not kick in when the wdog is 1241 * suspended 1242 */ 1243 hrtimer_cancel(&wd_data->timer); 1244 kthread_cancel_work_sync(&wd_data->work); 1245 1246 return 0; 1247 } 1248 EXPORT_SYMBOL_GPL(watchdog_dev_suspend); 1249 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 2946f3a63110..3feca1567281 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void) kthread_destroy_worker(watchdog_kworker); } +int watchdog_dev_suspend(struct watchdog_device *wdd) +{ + struct watchdog_core_data *wd_data = wdd->wd_data; + int ret; + + if (!wdd->wd_data) + return -ENODEV; + + /* ping for the last time before suspend */ + mutex_lock(&wd_data->lock); + if (watchdog_worker_should_ping(wd_data)) + ret = __watchdog_ping(wd_data->wdd); + mutex_unlock(&wd_data->lock); + + if (ret) + return ret; + + /* + * make sure that watchdog worker will not kick in when the wdog is + * suspended + */ + hrtimer_cancel(&wd_data->timer); + kthread_cancel_work_sync(&wd_data->work); + + return 0; +} +EXPORT_SYMBOL_GPL(watchdog_dev_suspend); + +int watchdog_dev_resume(struct watchdog_device *wdd) +{ + struct watchdog_core_data *wd_data = wdd->wd_data; + int ret; + + if (!wdd->wd_data) + return -ENODEV; + + /* + * __watchdog_ping will also retrigger hrtimer and therefore restore the + * ping worker if needed. + */ + mutex_lock(&wd_data->lock); + if (watchdog_worker_should_ping(wd_data)) + ret = __watchdog_ping(wd_data->wdd); + mutex_unlock(&wd_data->lock); + + return ret; +} +EXPORT_SYMBOL_GPL(watchdog_dev_resume); + module_param(handle_boot_enabled, bool, 0444); MODULE_PARM_DESC(handle_boot_enabled, "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default=" diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h index 9b19e6bb68b5..febfde3b1ff6 100644 --- a/include/linux/watchdog.h +++ b/include/linux/watchdog.h @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd, unsigned int timeout_parm, struct device *dev); extern int watchdog_register_device(struct watchdog_device *); extern void watchdog_unregister_device(struct watchdog_device *); +int watchdog_dev_suspend(struct watchdog_device *wdd); +int watchdog_dev_resume(struct watchdog_device *wdd); int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);
The watchdog drivers often disable wdog clock during suspend and then enable it again during resume. Nevertheless the ping worker is still running and can issue low-level ping while the wdog clock is disabled causing the system hang. To prevent such condition introduce watchdog_dev_suspend/resume which can be used by any wdog driver and actually cancel ping worker during suspend and restore it back, if needed, during resume. Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org> --- drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++ include/linux/watchdog.h | 2 ++ 2 files changed, 51 insertions(+)