Message ID | 20200214162209.129107-1-dima@arista.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | [PATCHv2] watchdog: Add stop_on_reboot parameter to control reboot policy |
On Fri, Feb 14, 2020 at 04:22:09PM +0000, Dmitry Safonov wrote:
> Many watchdog drivers use watchdog_stop_on_reboot() helper in order
> to stop the watchdog on system reboot. Unfortunately, this logic is
> coded in driver's probe function and doesn't allows user to decide what
> to do during shutdown/reboot.
>
> On the other side, Xen and Qemu watchdog drivers (xen_wdt and i6300esb)
> may be configured to either send NMI or turn off/reboot VM as
> the watchdog action. As the kernel may stuck at any state, sending NMIs
> can't reliably reboot the VM.
>
> At Arista, we benefited from the following set-up: the emulated watchdogs
> trigger VM reset and softdog is set to catch less severe conditions to
> generate vmcore. Just before reboot watchdog's timeout is increased
> to some good-enough value (3 mins). That keeps watchdog always running
> and guarantees that VM doesn't stuck.
>
> Provide new stop_on_reboot module parameter to let user control
> watchdog's reboot policy.
>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
> Cc: linux-watchdog@vger.kernel.org
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
> Changes v1 => v2: Add module parameter instead of ioctl()
>
>  drivers/watchdog/watchdog_core.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index 861daf4f37b2..5ead96199a0b 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -39,6 +39,10 @@
>
>  static DEFINE_IDA(watchdog_ida);
>
> +static int stop_on_reboot = -1;
> +module_param(stop_on_reboot, int, 0644);
> +MODULE_PARM_DESC(stop_on_reboot, "Stop watchdogs on reboot (0=keep watching, 1=stop)");
> +

My major concern is that this is writeable at runtime.
Changing the value won't change the behavior of already loaded
drivers. Unloading and reloading the driver will change its behavior
after the value was changed. This would be confusing, and it is hard
to imagine for anyone to expect such a behavior. Does this have to be
writeable ?

Guenter

>  /*
>   * Deferred Registration infrastructure.
>   *
> @@ -254,6 +258,14 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>  		}
>  	}
>
> +	/* Module parameter to force watchdog policy on reboot. */
> +	if (stop_on_reboot != -1) {
> +		if (stop_on_reboot)
> +			set_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
> +		else
> +			clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
> +	}
> +
>  	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
>  		wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
>
Hi Guenter,

On 2/22/20 4:06 PM, Guenter Roeck wrote:
> On Fri, Feb 14, 2020 at 04:22:09PM +0000, Dmitry Safonov wrote:
[..]
>> +static int stop_on_reboot = -1;
>> +module_param(stop_on_reboot, int, 0644);
>> +MODULE_PARM_DESC(stop_on_reboot, "Stop watchdogs on reboot (0=keep watching, 1=stop)");
>> +
>
> My major concern is that this is writeable at runtime.
> Changing the value won't change the behavior of already loaded
> drivers. Unloading and reloading the driver will change its behavior
> after the value was changed. This would be confusing, and it is hard
> to imagine for anyone to expect such a behavior. Does this have to be
> writeable ?

No, it wasn't. I've messed it up by thinking about fours in 0644, but
for some reason failed to recognize that it allows root writes.

I'll follow up with v3, sorry for simple-minded typo.

Thanks,
Dmitry
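
The permission mix-up discussed in this exchange is easy to check in userspace. The sketch below is only an illustration, not part of the patch: `mode_is_writable()` is a hypothetical helper name, and it uses the POSIX `S_IW*` bits from `<sys/stat.h>` to show why a mode of 0644 leaves the owner (root, for a module parameter in sysfs) able to write, while 0444 does not.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: report whether an octal permission mode grants
 * write access to anyone (owner, group, or others).
 */
bool mode_is_writable(unsigned int mode)
{
	return (mode & (S_IWUSR | S_IWGRP | S_IWOTH)) != 0;
}

/* Hypothetical helper: report whether the owner write bit is set. */
bool mode_owner_can_write(unsigned int mode)
{
	return (mode & S_IWUSR) != 0;
}
```

With these helpers, `mode_owner_can_write(0644)` is true because the owner digit 6 is read plus write, whereas `mode_is_writable(0444)` is false because every digit is read-only. A read-only parameter would presumably use a mode such as 0444, but the value chosen for v3 is not stated in this thread.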
diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index 861daf4f37b2..5ead96199a0b 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -39,6 +39,10 @@

 static DEFINE_IDA(watchdog_ida);

+static int stop_on_reboot = -1;
+module_param(stop_on_reboot, int, 0644);
+MODULE_PARM_DESC(stop_on_reboot, "Stop watchdogs on reboot (0=keep watching, 1=stop)");
+
 /*
  * Deferred Registration infrastructure.
  *
@@ -254,6 +258,14 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
 		}
 	}

+	/* Module parameter to force watchdog policy on reboot. */
+	if (stop_on_reboot != -1) {
+		if (stop_on_reboot)
+			set_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
+		else
+			clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
+	}
+
 	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
 		wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
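
The tri-state logic in the second hunk (-1 leaves the driver's own choice alone, 0 forces the bit clear, any other value forces it set) can be modelled in plain userspace C. This is a minimal sketch, not kernel code: `apply_reboot_policy()` and `STOP_ON_REBOOT_BIT` are hypothetical names, the bit index is an assumption chosen only for illustration, and ordinary bitwise operators stand in for the kernel's atomic `set_bit()`/`clear_bit()`.

```c
#include <stdbool.h>

/* Assumed bit index, for illustration only; not taken from the patch. */
#define STOP_ON_REBOOT_BIT 2

/*
 * Hypothetical userspace model of the registration-time override:
 *   -1          -> keep whatever the driver set in its probe function
 *    0          -> force "keep watching" (clear the bit)
 *    otherwise  -> force "stop on reboot" (set the bit)
 * Returns the resulting status word.
 */
unsigned long apply_reboot_policy(unsigned long status, int stop_on_reboot)
{
	const unsigned long mask = 1UL << STOP_ON_REBOOT_BIT;

	if (stop_on_reboot == -1)
		return status;
	if (stop_on_reboot)
		return status | mask;
	return status & ~mask;
}

/* Hypothetical helper mirroring the test_bit() check that follows. */
bool stops_on_reboot(unsigned long status)
{
	return (status & (1UL << STOP_ON_REBOOT_BIT)) != 0;
}
```

Because the override is applied once, when a device is registered, changing the parameter later would not affect a device whose status word has already been computed. That is the behavior the review comment objects to for a runtime-writable parameter.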
Many watchdog drivers use watchdog_stop_on_reboot() helper in order
to stop the watchdog on system reboot. Unfortunately, this logic is
coded in driver's probe function and doesn't allows user to decide what
to do during shutdown/reboot.

On the other side, Xen and Qemu watchdog drivers (xen_wdt and i6300esb)
may be configured to either send NMI or turn off/reboot VM as
the watchdog action. As the kernel may stuck at any state, sending NMIs
can't reliably reboot the VM.

At Arista, we benefited from the following set-up: the emulated watchdogs
trigger VM reset and softdog is set to catch less severe conditions to
generate vmcore. Just before reboot watchdog's timeout is increased
to some good-enough value (3 mins). That keeps watchdog always running
and guarantees that VM doesn't stuck.

Provide new stop_on_reboot module parameter to let user control
watchdog's reboot policy.

Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
Changes v1 => v2: Add module parameter instead of ioctl()

 drivers/watchdog/watchdog_core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)