
thermal/thresholds: Fix thermal lock annotation issue

Message ID 20241024102303.1086147-1-daniel.lezcano@linaro.org (mailing list archive)
State In Next
Delegated to: Rafael Wysocki

Commit Message

Daniel Lezcano Oct. 24, 2024, 10:23 a.m. UTC
When a thermal zone is unregistered (for example, when a thermal sensor
module is unloaded), the thresholds are flushed without the thermal zone
lock held. That triggers a WARN from lockdep_assert_held() when lockdep
validation is enabled in the kernel config.

This has been reported by syzbot.

As the thermal zone is in the process of being destroyed, there is no
need to notify userspace that the thresholds were purged: userspace
will receive a thermal zone deletion notification, which implies the
deletion of all the associated resources such as the trip points and
the user thresholds.

Split thermal_thresholds_flush() into a lockless helper that sends no
notification, and a wrapper that keeps the lock annotation, calls the
helper, and then sends the thresholds flush notification.

Please note this scenario is unlikely to happen, as sensor drivers are
usually built in so that the thermal framework can kick in at boot
time if needed.

Link: https://lore.kernel.org/all/67124175.050a0220.10f4f4.0012.GAE@google.com
Reported-by: syzbot+f24dd060c1911fe54c85@syzkaller.appspotmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_thresholds.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
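
The split described in the commit message can be sketched in userspace C.
This is a minimal single-threaded sketch under stated assumptions, not the
kernel code: struct zone, struct threshold, the lock_held flag (a crude
stand-in for lockdep state) and every function name below are hypothetical
and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct threshold {
	int temperature;
	struct threshold *next;
};

struct zone {
	bool lock_held;               /* assumption: models "lock is held" */
	struct threshold *thresholds; /* singly linked list of user thresholds */
	int notifications;            /* counts flush notifications sent */
};

static void zone_lock(struct zone *z)   { z->lock_held = true; }
static void zone_unlock(struct zone *z) { z->lock_held = false; }

/* Add a threshold at the head of the list; returns 0 on success. */
static int thresholds_add(struct zone *z, int temperature)
{
	struct threshold *t = malloc(sizeof(*t));

	if (!t)
		return -1;
	t->temperature = temperature;
	t->next = z->thresholds;
	z->thresholds = t;
	return 0;
}

/* Lockless helper: frees every entry, checks no lock, notifies nobody. */
static void thresholds_flush_unlocked(struct zone *z)
{
	struct threshold *entry = z->thresholds;

	while (entry) {
		struct threshold *next = entry->next; /* save before free */

		free(entry);
		entry = next;
	}
	z->thresholds = NULL;
}

/* Regular path: caller must hold the lock; a notification is sent. */
static void thresholds_flush(struct zone *z)
{
	assert(z->lock_held); /* plays the role of the lock annotation */
	thresholds_flush_unlocked(z);
	z->notifications++;
}

/* Teardown path: the zone is going away, so no lock check, no notification. */
static void thresholds_exit(struct zone *z)
{
	thresholds_flush_unlocked(z);
}
```

In this sketch, calling thresholds_flush() without zone_lock() trips the
assertion, which mirrors the reported WARN. thresholds_exit() avoids it by
calling the helper directly, and it leaves the notification counter
untouched.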

Comments

Daniel Lezcano Oct. 24, 2024, 10:36 a.m. UTC | #1
Hi,

Please note this fix was written on top of the thermal thresholds
series, so I don't know whether it conflicts if it is applied before
that series.

Thanks

   -- D.

Rafael J. Wysocki Oct. 24, 2024, 1:17 p.m. UTC | #2
On Thu, Oct 24, 2024 at 12:36 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
>
> Hi,
>
> Please note this fix was written on top of the thermal thresholds
> series, so I don't know whether it conflicts if it is applied before
> that series.

No worries.

Applied (on top of the thresholds series).


Patch

diff --git a/drivers/thermal/thermal_thresholds.c b/drivers/thermal/thermal_thresholds.c
index ea4aa5a2e86c..2888eabd3efe 100644
--- a/drivers/thermal/thermal_thresholds.c
+++ b/drivers/thermal/thermal_thresholds.c
@@ -20,17 +20,22 @@  int thermal_thresholds_init(struct thermal_zone_device *tz)
 	return 0;
 }
 
-void thermal_thresholds_flush(struct thermal_zone_device *tz)
+static void __thermal_thresholds_flush(struct thermal_zone_device *tz)
 {
 	struct list_head *thresholds = &tz->user_thresholds;
 	struct user_threshold *entry, *tmp;
 
-	lockdep_assert_held(&tz->lock);
-
 	list_for_each_entry_safe(entry, tmp, thresholds, list_node) {
 		list_del(&entry->list_node);
 		kfree(entry);
 	}
+}
+
+void thermal_thresholds_flush(struct thermal_zone_device *tz)
+{
+	lockdep_assert_held(&tz->lock);
+
+	__thermal_thresholds_flush(tz);
 
 	thermal_notify_threshold_flush(tz);
 
@@ -39,7 +44,7 @@  void thermal_thresholds_flush(struct thermal_zone_device *tz)
 
 void thermal_thresholds_exit(struct thermal_zone_device *tz)
 {
-	thermal_thresholds_flush(tz);
+	__thermal_thresholds_flush(tz);
 }
 
 static int __thermal_thresholds_cmp(void *data,