Message ID | 85ce6068-c98e-dbe0-a4b6-5c877b460f64@kernel.org (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | was [Re: [PATCH v3] watchdog: Add hrtimer-based pretimeout feature] | expand |
On 9/4/21, 1:19AM, Jiri Slaby wrote: > On 04. 09. 21, 10:16, Jiri Slaby wrote: > > On 02. 09. 21, 16:05, Guenter Roeck wrote: > >> On 9/1/21 11:55 PM, Jiri Slaby wrote: > >>> On 03. 02. 21, 21:11, Curtis Klein wrote: > >>>> This adds the option to use a hrtimer to generate a watchdog pretimeout > >>>> event for hardware watchdogs that do not natively support watchdog > >>>> pretimeouts. > >>>> > >>>> With this enabled, all watchdogs will appear to have pretimeout support > >>>> in userspace. If no pretimeout value is set, there will be no change in > >>>> the watchdog's behavior. > >>> > >>> Hi, > >>> > >>> on my Dell Latitude 7280, CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT=y causes > >>> all reboot, kexec, suspend to panic. Disabling that option makes it > >>> all work again. Provided it happens very late in the process, I don't > >>> know how to grab some logs... > >>> > >>> Any ideas? > >>> > >> > >> AFAICS the timer does not stop on reboot. I think we'll need to > >> augment the code > >> to do that. > > > > No, it is stopped via device unregister -> watchdog_dev_unregister -> > > watchdog_cdev_unregister -> watchdog_hrtimer_pretimeout_stop. > > > > But look: > > watchdog_cdev_unregister > > -> wdd->wd_data = NULL; > > -> watchdog_hrtimer_pretimeout_stop > > -> hrtimer_cancel(&wdd->wd_data->pretimeout_timer); > > > > The diff below obviously fixes the issue, > > Which is exactly a -next commit: > commit c7b178dae139f8857edc50888cfbf251cd974a38 > Author: Curtis Klein <curtis.klein@hpe.com> > Date: Tue Jun 22 23:26:23 2021 -0700 > > watchdog: Fix NULL pointer dereference when releasing cdev > > > --- a/drivers/watchdog/watchdog_dev.c > > +++ b/drivers/watchdog/watchdog_dev.c > > @@ -1096,6 +1096,8 @@ static void watchdog_cdev_unregister(struct > > watchdog_device *wdd) > > watchdog_stop(wdd); > > } > > > > + watchdog_hrtimer_pretimeout_stop(wdd); > > + > > mutex_lock(&wd_data->lock); > > wd_data->wdd = NULL; > > wdd->wd_data = NULL; > > @@ -1103,7 +1105,6 @@ static void watchdog_cdev_unregister(struct > > watchdog_device *wdd) > > > > hrtimer_cancel(&wd_data->timer); > > kthread_cancel_work_sync(&wd_data->work); > > - watchdog_hrtimer_pretimeout_stop(wdd); > > > > put_device(&wd_data->dev); > > } > > > > thanks, > > > -- > js > suse labs Does it still make sense to stop the timer on reboot or suspend? I haven't had any problems with rebooting but I haven't been able to test suspending. -Curtis
On 9/5/21 3:22 PM, Klein, Curtis wrote: > On 9/4/21, 1:19AM, Jiri Slaby wrote: >> On 04. 09. 21, 10:16, Jiri Slaby wrote: >>> On 02. 09. 21, 16:05, Guenter Roeck wrote: >>>> On 9/1/21 11:55 PM, Jiri Slaby wrote: >>>>> On 03. 02. 21, 21:11, Curtis Klein wrote: >>>>>> This adds the option to use a hrtimer to generate a watchdog pretimeout >>>>>> event for hardware watchdogs that do not natively support watchdog >>>>>> pretimeouts. >>>>>> >>>>>> With this enabled, all watchdogs will appear to have pretimeout support >>>>>> in userspace. If no pretimeout value is set, there will be no change in >>>>>> the watchdog's behavior. >>>>> >>>>> Hi, >>>>> >>>>> on my Dell Latitude 7280, CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT=y causes >>>>> all reboot, kexec, suspend to panic. Disabling that option makes it >>>>> all work again. Provided it happens very late in the process, I don't >>>>> know how to grab some logs... >>>>> >>>>> Any ideas? >>>>> >>>> >>>> AFAICS the timer does not stop on reboot. I think we'll need to >>>> augment the code >>>> to do that. >>> >>> No, it is stopped via device unregister -> watchdog_dev_unregister -> >>> watchdog_cdev_unregister -> watchdog_hrtimer_pretimeout_stop. >>> >>> But look: >>> watchdog_cdev_unregister >>> -> wdd->wd_data = NULL; >>> -> watchdog_hrtimer_pretimeout_stop >>> -> hrtimer_cancel(&wdd->wd_data->pretimeout_timer); >>> >>> The diff below obviously fixes the issue, >> >> Which is exactly a -next commit: >> commit c7b178dae139f8857edc50888cfbf251cd974a38 >> Author: Curtis Klein <curtis.klein@hpe.com> >> Date: Tue Jun 22 23:26:23 2021 -0700 >> >> watchdog: Fix NULL pointer dereference when releasing cdev >> >>> --- a/drivers/watchdog/watchdog_dev.c >>> +++ b/drivers/watchdog/watchdog_dev.c >>> @@ -1096,6 +1096,8 @@ static void watchdog_cdev_unregister(struct >>> watchdog_device *wdd) >>> watchdog_stop(wdd); >>> } >>> >>> + watchdog_hrtimer_pretimeout_stop(wdd); >>> + >>> mutex_lock(&wd_data->lock); >>> wd_data->wdd = NULL; >>> wdd->wd_data = NULL; >>> @@ -1103,7 +1105,6 @@ static void watchdog_cdev_unregister(struct >>> watchdog_device *wdd) >>> >>> hrtimer_cancel(&wd_data->timer); >>> kthread_cancel_work_sync(&wd_data->work); >>> - watchdog_hrtimer_pretimeout_stop(wdd); >>> >>> put_device(&wd_data->dev); >>> } >>> >>> thanks, >> >> >> -- >> js >> suse labs > > Does it still make sense to stop the timer on reboot or suspend? > > I haven't had any problems with rebooting but I haven't been able to test > suspending. > Only if it is really a problem. Guenter
--- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -1096,6 +1096,8 @@ static void watchdog_cdev_unregister(struct watchdog_device *wdd) watchdog_stop(wdd); } + watchdog_hrtimer_pretimeout_stop(wdd); + mutex_lock(&wd_data->lock); wd_data->wdd = NULL;