Message ID | 20200330140859.12535-1-tiwai@suse.de (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Zhang Rui |
Headers | show |
Series | thermal: Add a sanity check for invalid state at stats update | expand |
On Mon, Mar 30, 2020 at 7:39 PM Takashi Iwai <tiwai@suse.de> wrote: > > The thermal sysfs handler keeps the statistics table with the fixed > size that was determined from the initial max_states() call, and the > table entry is updated at each sysfs cur_state write call. And, when > the driver's set_cur_state() ops accepts the value given from > user-space, the thermal sysfs core blindly applies it to the > statistics table entry, which may overflow and cause an Oops. > Although it's rather a bug in the driver's ops implementations, we > shouldn't crash but rather give a proper warning instead. > > This patch adds a sanity check for avoiding such an OOB access and > warns with a stack trace to show the suspicious device in question. Hi Takashi, Instead of this warning, I think we should reject such input when writing to cur_state. See attached patch. If you think this OK, I'll submit it. Regards, Amit > Signed-off-by: Takashi Iwai <tiwai@suse.de> > --- > > We've hit some crash by stress tests, and this patch at least works > around the crash itself. While the actual bug fix of the buggy driver > is still being investigated, I submit the hardening in the core side > at first. > > drivers/thermal/thermal_sysfs.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c > index aa99edb4dff7..a23c4e701d63 100644 > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > @@ -772,6 +772,11 @@ void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev, > > spin_lock(&stats->lock); > > + if (dev_WARN_ONCE(&cdev->device, new_state >= stats->max_states, > + "new state %ld exceeds max_state %ld", > + new_state, stats->max_states)) > + goto unlock; > + > if (stats->state == new_state) > goto unlock; > > -- > 2.16.4 >
On Tue, 07 Apr 2020 15:30:51 +0200, Amit Kucheria wrote: > > On Mon, Mar 30, 2020 at 7:39 PM Takashi Iwai <tiwai@suse.de> wrote: > > > > The thermal sysfs handler keeps the statistics table with the fixed > > size that was determined from the initial max_states() call, and the > > table entry is updated at each sysfs cur_state write call. And, when > > the driver's set_cur_state() ops accepts the value given from > > user-space, the thermal sysfs core blindly applies it to the > > statistics table entry, which may overflow and cause an Oops. > > Although it's rather a bug in the driver's ops implementations, we > > shouldn't crash but rather give a proper warning instead. > > > > This patch adds a sanity check for avoiding such an OOB access and > > warns with a stack trace to show the suspicious device in question. > > Hi Takashi, > > Instead of this warning, I think we should reject such input when > writing to cur_state. > > See attached patch. If you think this OK, I'll submit it. Actually the input value itself is correct, the problem is rather about the max_states that may vary depending on other driver. So IMO, we don't want to refuse the input completely. Please see the thread: https://lore.kernel.org/linux-acpi/s5h5zeiwd01.wl-tiwai@suse.de/ thanks, Takashi > > Regards, > Amit > > > Signed-off-by: Takashi Iwai <tiwai@suse.de> > > --- > > > > We've hit some crash by stress tests, and this patch at least works > > around the crash itself. While the actual bug fix of the buggy driver > > is still being investigated, I submit the hardening in the core side > > at first. > > > > drivers/thermal/thermal_sysfs.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c > > index aa99edb4dff7..a23c4e701d63 100644 > > --- a/drivers/thermal/thermal_sysfs.c > > +++ b/drivers/thermal/thermal_sysfs.c > > @@ -772,6 +772,11 @@ void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev, > > > > spin_lock(&stats->lock); > > > > + if (dev_WARN_ONCE(&cdev->device, new_state >= stats->max_states, > > + "new state %ld exceeds max_state %ld", > > + new_state, stats->max_states)) > > + goto unlock; > > + > > if (stats->state == new_state) > > goto unlock; > > > > -- > > 2.16.4 > > > From 54266260d483ab4476510dd4461a1cafc611e17d Mon Sep 17 00:00:00 2001 > Message-Id: <54266260d483ab4476510dd4461a1cafc611e17d.1586266224.git.amit.kucheria@linaro.org> > From: Amit Kucheria <amit.kucheria@linaro.org> > Date: Tue, 7 Apr 2020 18:48:14 +0530 > Subject: [PATCH] thermal: Reject invalid cur_state input from userspace > > We don't check if the cur_state value input in sysfs is greater than the > maximum cooling state that the cooling device supports. This can cause > access to unallocated memory in case THERMAL_STATISTICS in enabled and > could also crash cooling devices that don't check for an invalid state in > their set_cur_state() callback. > > Return an error if the state being requested in greater than the maximum > cooling state the device supports. > > Reported-by: Takashi Iwai <tiwai@suse.de> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > drivers/thermal/thermal_sysfs.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c > index 7e1d11bdd258..8033e5a9386a 100644 > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > @@ -703,7 +703,7 @@ cur_state_store(struct device *dev, struct device_attribute *attr, > const char *buf, size_t count) > { > struct thermal_cooling_device *cdev = to_cooling_device(dev); > - unsigned long state; > + unsigned long state, max_state; > int result; > > if (sscanf(buf, "%ld\n", &state) != 1) > @@ -712,6 +712,13 @@ cur_state_store(struct device *dev, struct device_attribute *attr, > if ((long)state < 0) > return -EINVAL; > > + result = cdev->ops->get_max_state(cdev, &max_state); > + if (result) > + return result; > + > + if (state >= max_state) > + return -EINVAL; > + > mutex_lock(&cdev->lock); > > result = cdev->ops->set_cur_state(cdev, state); > -- > 2.20.1 >
Right, I have a V2 patch series, which will be post soon. Thanks, rui -----Original Message----- From: linux-pm-owner@vger.kernel.org <linux-pm-owner@vger.kernel.org> On Behalf Of Takashi Iwai Sent: Tuesday, April 07, 2020 10:13 PM To: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Takashi Iwai <tiwai@suse.de>; Zhang, Rui <rui.zhang@intel.com>; Daniel Lezcano <daniel.lezcano@linaro.org>; Linux PM list <linux-pm@vger.kernel.org>; LKML <linux-kernel@vger.kernel.org> Subject: Re: [PATCH] thermal: Add a sanity check for invalid state at stats update Importance: High On Tue, 07 Apr 2020 15:30:51 +0200, Amit Kucheria wrote: > > On Mon, Mar 30, 2020 at 7:39 PM Takashi Iwai <tiwai@suse.de> wrote: > > > > The thermal sysfs handler keeps the statistics table with the fixed > > size that was determined from the initial max_states() call, and the > > table entry is updated at each sysfs cur_state write call. And, > > when the driver's set_cur_state() ops accepts the value given from > > user-space, the thermal sysfs core blindly applies it to the > > statistics table entry, which may overflow and cause an Oops. > > Although it's rather a bug in the driver's ops implementations, we > > shouldn't crash but rather give a proper warning instead. > > > > This patch adds a sanity check for avoiding such an OOB access and > > warns with a stack trace to show the suspicious device in question. > > Hi Takashi, > > Instead of this warning, I think we should reject such input when > writing to cur_state. > > See attached patch. If you think this OK, I'll submit it. Actually the input value itself is correct, the problem is rather about the max_states that may vary depending on other driver. So IMO, we don't want to refuse the input completely. Please see the thread: https://lore.kernel.org/linux-acpi/s5h5zeiwd01.wl-tiwai@suse.de/ thanks, Takashi > > Regards, > Amit > > > Signed-off-by: Takashi Iwai <tiwai@suse.de> > > --- > > > > We've hit some crash by stress tests, and this patch at least works > > around the crash itself. While the actual bug fix of the buggy > > driver is still being investigated, I submit the hardening in the > > core side at first. > > > > drivers/thermal/thermal_sysfs.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/drivers/thermal/thermal_sysfs.c > > b/drivers/thermal/thermal_sysfs.c index aa99edb4dff7..a23c4e701d63 > > 100644 > > --- a/drivers/thermal/thermal_sysfs.c > > +++ b/drivers/thermal/thermal_sysfs.c > > @@ -772,6 +772,11 @@ void thermal_cooling_device_stats_update(struct > > thermal_cooling_device *cdev, > > > > spin_lock(&stats->lock); > > > > + if (dev_WARN_ONCE(&cdev->device, new_state >= stats->max_states, > > + "new state %ld exceeds max_state %ld", > > + new_state, stats->max_states)) > > + goto unlock; > > + > > if (stats->state == new_state) > > goto unlock; > > > > -- > > 2.16.4 > > > From 54266260d483ab4476510dd4461a1cafc611e17d Mon Sep 17 00:00:00 2001 > Message-Id: > <54266260d483ab4476510dd4461a1cafc611e17d.1586266224.git.amit.kucheria > @linaro.org> > From: Amit Kucheria <amit.kucheria@linaro.org> > Date: Tue, 7 Apr 2020 18:48:14 +0530 > Subject: [PATCH] thermal: Reject invalid cur_state input from > userspace > > We don't check if the cur_state value input in sysfs is greater than > the maximum cooling state that the cooling device supports. This can > cause access to unallocated memory in case THERMAL_STATISTICS in > enabled and could also crash cooling devices that don't check for an > invalid state in their set_cur_state() callback. > > Return an error if the state being requested in greater than the > maximum cooling state the device supports. > > Reported-by: Takashi Iwai <tiwai@suse.de> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > drivers/thermal/thermal_sysfs.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/thermal/thermal_sysfs.c > b/drivers/thermal/thermal_sysfs.c index 7e1d11bdd258..8033e5a9386a > 100644 > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > @@ -703,7 +703,7 @@ cur_state_store(struct device *dev, struct device_attribute *attr, > const char *buf, size_t count) > { > struct thermal_cooling_device *cdev = to_cooling_device(dev); > - unsigned long state; > + unsigned long state, max_state; > int result; > > if (sscanf(buf, "%ld\n", &state) != 1) @@ -712,6 +712,13 @@ > cur_state_store(struct device *dev, struct device_attribute *attr, > if ((long)state < 0) > return -EINVAL; > > + result = cdev->ops->get_max_state(cdev, &max_state); > + if (result) > + return result; > + > + if (state >= max_state) > + return -EINVAL; > + > mutex_lock(&cdev->lock); > > result = cdev->ops->set_cur_state(cdev, state); > -- > 2.20.1 >
Too late for me to post it out by this midnight, will post it out tomorrow. Thanks, rui -----Original Message----- From: linux-pm-owner@vger.kernel.org <linux-pm-owner@vger.kernel.org> On Behalf Of Zhang, Rui Sent: Tuesday, April 07, 2020 10:17 PM To: Takashi Iwai <tiwai@suse.de>; Amit Kucheria <amit.kucheria@verdurent.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>; Linux PM list <linux-pm@vger.kernel.org>; LKML <linux-kernel@vger.kernel.org> Subject: RE: [PATCH] thermal: Add a sanity check for invalid state at stats update Right, I have a V2 patch series, which will be post soon. Thanks, rui -----Original Message----- From: linux-pm-owner@vger.kernel.org <linux-pm-owner@vger.kernel.org> On Behalf Of Takashi Iwai Sent: Tuesday, April 07, 2020 10:13 PM To: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Takashi Iwai <tiwai@suse.de>; Zhang, Rui <rui.zhang@intel.com>; Daniel Lezcano <daniel.lezcano@linaro.org>; Linux PM list <linux-pm@vger.kernel.org>; LKML <linux-kernel@vger.kernel.org> Subject: Re: [PATCH] thermal: Add a sanity check for invalid state at stats update Importance: High On Tue, 07 Apr 2020 15:30:51 +0200, Amit Kucheria wrote: > > On Mon, Mar 30, 2020 at 7:39 PM Takashi Iwai <tiwai@suse.de> wrote: > > > > The thermal sysfs handler keeps the statistics table with the fixed > > size that was determined from the initial max_states() call, and the > > table entry is updated at each sysfs cur_state write call. And, > > when the driver's set_cur_state() ops accepts the value given from > > user-space, the thermal sysfs core blindly applies it to the > > statistics table entry, which may overflow and cause an Oops. > > Although it's rather a bug in the driver's ops implementations, we > > shouldn't crash but rather give a proper warning instead. > > > > This patch adds a sanity check for avoiding such an OOB access and > > warns with a stack trace to show the suspicious device in question. > > Hi Takashi, > > Instead of this warning, I think we should reject such input when > writing to cur_state. > > See attached patch. If you think this OK, I'll submit it. Actually the input value itself is correct, the problem is rather about the max_states that may vary depending on other driver. So IMO, we don't want to refuse the input completely. Please see the thread: https://lore.kernel.org/linux-acpi/s5h5zeiwd01.wl-tiwai@suse.de/ thanks, Takashi > > Regards, > Amit > > > Signed-off-by: Takashi Iwai <tiwai@suse.de> > > --- > > > > We've hit some crash by stress tests, and this patch at least works > > around the crash itself. While the actual bug fix of the buggy > > driver is still being investigated, I submit the hardening in the > > core side at first. > > > > drivers/thermal/thermal_sysfs.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/drivers/thermal/thermal_sysfs.c > > b/drivers/thermal/thermal_sysfs.c index aa99edb4dff7..a23c4e701d63 > > 100644 > > --- a/drivers/thermal/thermal_sysfs.c > > +++ b/drivers/thermal/thermal_sysfs.c > > @@ -772,6 +772,11 @@ void thermal_cooling_device_stats_update(struct > > thermal_cooling_device *cdev, > > > > spin_lock(&stats->lock); > > > > + if (dev_WARN_ONCE(&cdev->device, new_state >= stats->max_states, > > + "new state %ld exceeds max_state %ld", > > + new_state, stats->max_states)) > > + goto unlock; > > + > > if (stats->state == new_state) > > goto unlock; > > > > -- > > 2.16.4 > > > From 54266260d483ab4476510dd4461a1cafc611e17d Mon Sep 17 00:00:00 2001 > Message-Id: > <54266260d483ab4476510dd4461a1cafc611e17d.1586266224.git.amit.kucheria > @linaro.org> > From: Amit Kucheria <amit.kucheria@linaro.org> > Date: Tue, 7 Apr 2020 18:48:14 +0530 > Subject: [PATCH] thermal: Reject invalid cur_state input from > userspace > > We don't check if the cur_state value input in sysfs is greater than > the maximum cooling state that the cooling device supports. This can > cause access to unallocated memory in case THERMAL_STATISTICS in > enabled and could also crash cooling devices that don't check for an > invalid state in their set_cur_state() callback. > > Return an error if the state being requested in greater than the > maximum cooling state the device supports. > > Reported-by: Takashi Iwai <tiwai@suse.de> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > drivers/thermal/thermal_sysfs.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/thermal/thermal_sysfs.c > b/drivers/thermal/thermal_sysfs.c index 7e1d11bdd258..8033e5a9386a > 100644 > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > @@ -703,7 +703,7 @@ cur_state_store(struct device *dev, struct device_attribute *attr, > const char *buf, size_t count) > { > struct thermal_cooling_device *cdev = to_cooling_device(dev); > - unsigned long state; > + unsigned long state, max_state; > int result; > > if (sscanf(buf, "%ld\n", &state) != 1) @@ -712,6 +712,13 @@ > cur_state_store(struct device *dev, struct device_attribute *attr, > if ((long)state < 0) > return -EINVAL; > > + result = cdev->ops->get_max_state(cdev, &max_state); > + if (result) > + return result; > + > + if (state >= max_state) > + return -EINVAL; > + > mutex_lock(&cdev->lock); > > result = cdev->ops->set_cur_state(cdev, state); > -- > 2.20.1 >
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c index aa99edb4dff7..a23c4e701d63 100644 --- a/drivers/thermal/thermal_sysfs.c +++ b/drivers/thermal/thermal_sysfs.c @@ -772,6 +772,11 @@ void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev, spin_lock(&stats->lock); + if (dev_WARN_ONCE(&cdev->device, new_state >= stats->max_states, + "new state %ld exceeds max_state %ld", + new_state, stats->max_states)) + goto unlock; + if (stats->state == new_state) goto unlock;
The thermal sysfs handler keeps the statistics table with the fixed size that was determined from the initial max_states() call, and the table entry is updated at each sysfs cur_state write call. And, when the driver's set_cur_state() ops accepts the value given from user-space, the thermal sysfs core blindly applies it to the statistics table entry, which may overflow and cause an Oops. Although it's rather a bug in the driver's ops implementations, we shouldn't crash but rather give a proper warning instead. This patch adds a sanity check for avoiding such an OOB access and warns with a stack trace to show the suspicious device in question. Signed-off-by: Takashi Iwai <tiwai@suse.de> --- We've hit some crash by stress tests, and this patch at least works around the crash itself. While the actual bug fix of the buggy driver is still being investigated, I submit the hardening in the core side at first. drivers/thermal/thermal_sysfs.c | 5 +++++ 1 file changed, 5 insertions(+)