Message ID | 20180226144118.24693-2-enric.balletbo@collabora.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Zhang Rui |
Headers | show |
On Mon, Feb 26, 2018 at 4:41 PM, Enric Balletbo i Serra <enric.balletbo@collabora.com> wrote: > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > if (!strncmp(buf, "enabled", sizeof("enabled") - 1)) > - result = tz->ops->set_mode(tz, THERMAL_DEVICE_ENABLED); > + mode = THERMAL_DEVICE_ENABLED; > else if (!strncmp(buf, "disabled", sizeof("disabled") - 1)) > - result = tz->ops->set_mode(tz, THERMAL_DEVICE_DISABLED); > + mode = THERMAL_DEVICE_DISABLED; > else > - result = -EINVAL; > + return -EINVAL; sysfs_match_string()?
Hi, I stumbled across this patch since I'm currently poking around with early thermal bringup on a platform and this patch has been integrated in our development tree. I'm seeing some unexpected behaviors, which could entirely due to wrong expectation from my side. I only have some basic working knowledge of the thermal framework, just want to double check and perhaps learn a thing or two. On Mon, Feb 26, 2018 at 03:41:18PM +0100, Enric Balletbo i Serra wrote: > From: Zhang Rui <rui.zhang@intel.com> > > Thermal "mode" sysfs attribute is introduced to enable/disable a thermal > zone, and .get_mode()/.set_mode() callback is introduced for platform > thermal driver to enable/disable the hardware thermal control logic. And > thermal core takes no action upon thermal zone enable/disable. > > Actually, this is not quite right because thermal core still pokes those > disabled thermal zones occasionally, e.g. upon system resume. > > To fix this, a new flag 'mode' is introduced in struct thermal_zone_device > to represent the thermal zone mode, and several decisions have been made > based on this flag, including > 1. check the thermal zone mode right after it's registered. > 2. skip updating thermal zone if the zone is disabled > 3. stop the polling timer when the thermal zone is disabled > > Note: basically, this patch doesn't affect the existing always-enabled > thermal zones much, with just one exception - > thermal zone .get_mode() must be well prepared to reflect the real thermal > zone status upon the thermal zone registration. From my perspective this looks like a pretty significant change. For the platform I'm working on I added a thermal zone to the device tree, with the expectation that it would be enabled. Judging from the code without this patch this expectation seems to be naive, since of-thermal.c sets tz->mode to THERMAL_DEVICE_DISABLED, so apparently either userspace or some driver should call _set_mode(tz, THERMAL_DEVICE_ENABLED). However even without this the thermal zone appears to be active (I didn't really test end-to-end yet, but at least thermal_zone_device_update() is called and calls handle_thermal_trip()). Not sure why this is the case with THERMAL_DEVICE_DISABLED, but before I learned about the existence of the flag my expectation was that the zone would be enabled. With this patch thermal_zone_device_update() is skipped if the zone hasn't been explictly enabled, which may be consistent with the state of 'tz->mode', but effectively changes the previous/current behavior. Not sure if I'm just dumbly overlooking something obvious or if there is an actual problem with of_thermal (and maybe others). Any insights? Thanks Matthias
On Thu, Jun 28, 2018 at 05:33:02PM -0700, Matthias Kaehlcke wrote: > Hi, > > I stumbled across this patch since I'm currently poking around with > early thermal bringup on a platform and this patch has been integrated > in our development tree. > > I'm seeing some unexpected behaviors, which could entirely due to > wrong expectation from my side. I only have some basic working > knowledge of the thermal framework, just want to double check and > perhaps learn a thing or two. > > On Mon, Feb 26, 2018 at 03:41:18PM +0100, Enric Balletbo i Serra wrote: > > From: Zhang Rui <rui.zhang@intel.com> > > > > Thermal "mode" sysfs attribute is introduced to enable/disable a thermal > > zone, and .get_mode()/.set_mode() callback is introduced for platform > > thermal driver to enable/disable the hardware thermal control logic. And > > thermal core takes no action upon thermal zone enable/disable. > > > > Actually, this is not quite right because thermal core still pokes those > > disabled thermal zones occasionally, e.g. upon system resume. > > > > To fix this, a new flag 'mode' is introduced in struct thermal_zone_device > > to represent the thermal zone mode, and several decisions have been made > > based on this flag, including > > 1. check the thermal zone mode right after it's registered. > > 2. skip updating thermal zone if the zone is disabled > > 3. stop the polling timer when the thermal zone is disabled > > > > Note: basically, this patch doesn't affect the existing always-enabled > > thermal zones much, with just one exception - > > thermal zone .get_mode() must be well prepared to reflect the real thermal > > zone status upon the thermal zone registration. > > From my perspective this looks like a pretty significant change. For > the platform I'm working on I added a thermal zone to the device tree, > with the expectation that it would be enabled. Judging from the code > without this patch this expectation seems to be naive, since > of-thermal.c sets tz->mode to THERMAL_DEVICE_DISABLED, so apparently > either userspace or some driver should call _set_mode(tz, > THERMAL_DEVICE_ENABLED). However even without this the thermal zone > appears to be active (I didn't really test end-to-end yet, but at > least thermal_zone_device_update() is called and calls > handle_thermal_trip()). Not sure why this is the case with > THERMAL_DEVICE_DISABLED, but before I learned about the existence of > the flag my expectation was that the zone would be enabled. > > With this patch thermal_zone_device_update() is skipped if the zone > hasn't been explictly enabled, which may be consistent with the state > of 'tz->mode', but effectively changes the previous/current behavior. > > Not sure if I'm just dumbly overlooking something obvious or if there > is an actual problem with of_thermal (and maybe others). The problem is that there are now two 'mode' fields, tzd->mode and the other typically tzd->devdata->mode, and tzd->mode is never set to enabled. > thermal zone .get_mode() must be well prepared to reflect the real thermal > zone status upon the thermal zone registration. For of_thermal tzd->mode is initialized with the result of .get_mode() when the zone is registered. At this time no sensor has been added to the zone, hence the zone is disabled. When a sensor is added later tzd->devdata->mode is set to enabled, however tzd->mode remains disabled: tzd->mode = DISABLED tzd->devdata->mode = DISABLED of_parse_thermal_zones thermal_zone_device_register tzd->mode = tzd->get_mode() // => DISABLED <sensor>_probe thermal_zone_of_sensor_register tzd->set_mode(THERMAL_DEVICE_ENABLED) tzd->devdata->mode = ENABLED One way to fix this for of_thermal could be to setting tzd->mode in .set_mode() in addition to setting tzd->devdata->mode. However this feels like a workaround/hack. Personally I find it confusing to have two mode fields for a thermal zone device. Maybe tzd->mode should replace tzd->devdata->mode?
Hi Matthias, Sorry for late reply, my memory is bad so I need to look at this again. The patch was send some time ago and there are pending changes to do but then I switched. I'll take a look, but did you saw why this patch was not merged [1]? Maybe that could answer some of your questions. Best regards, Enric [1] https://lkml.org/lkml/2018/2/27/910 On 03/07/18 19:13, Matthias Kaehlcke wrote: > On Thu, Jun 28, 2018 at 05:33:02PM -0700, Matthias Kaehlcke wrote: >> Hi, >> >> I stumbled across this patch since I'm currently poking around with >> early thermal bringup on a platform and this patch has been integrated >> in our development tree. >> >> I'm seeing some unexpected behaviors, which could entirely due to >> wrong expectation from my side. I only have some basic working >> knowledge of the thermal framework, just want to double check and >> perhaps learn a thing or two. >> >> On Mon, Feb 26, 2018 at 03:41:18PM +0100, Enric Balletbo i Serra wrote: >>> From: Zhang Rui <rui.zhang@intel.com> >>> >>> Thermal "mode" sysfs attribute is introduced to enable/disable a thermal >>> zone, and .get_mode()/.set_mode() callback is introduced for platform >>> thermal driver to enable/disable the hardware thermal control logic. And >>> thermal core takes no action upon thermal zone enable/disable. >>> >>> Actually, this is not quite right because thermal core still pokes those >>> disabled thermal zones occasionally, e.g. upon system resume. >>> >>> To fix this, a new flag 'mode' is introduced in struct thermal_zone_device >>> to represent the thermal zone mode, and several decisions have been made >>> based on this flag, including >>> 1. check the thermal zone mode right after it's registered. >>> 2. skip updating thermal zone if the zone is disabled >>> 3. stop the polling timer when the thermal zone is disabled >>> >>> Note: basically, this patch doesn't affect the existing always-enabled >>> thermal zones much, with just one exception - >>> thermal zone .get_mode() must be well prepared to reflect the real thermal >>> zone status upon the thermal zone registration. >> >> From my perspective this looks like a pretty significant change. For >> the platform I'm working on I added a thermal zone to the device tree, >> with the expectation that it would be enabled. Judging from the code >> without this patch this expectation seems to be naive, since >> of-thermal.c sets tz->mode to THERMAL_DEVICE_DISABLED, so apparently >> either userspace or some driver should call _set_mode(tz, >> THERMAL_DEVICE_ENABLED). However even without this the thermal zone >> appears to be active (I didn't really test end-to-end yet, but at >> least thermal_zone_device_update() is called and calls >> handle_thermal_trip()). Not sure why this is the case with >> THERMAL_DEVICE_DISABLED, but before I learned about the existence of >> the flag my expectation was that the zone would be enabled. >> >> With this patch thermal_zone_device_update() is skipped if the zone >> hasn't been explictly enabled, which may be consistent with the state >> of 'tz->mode', but effectively changes the previous/current behavior. >> >> Not sure if I'm just dumbly overlooking something obvious or if there >> is an actual problem with of_thermal (and maybe others). > > The problem is that there are now two 'mode' fields, tzd->mode and the > other typically tzd->devdata->mode, and tzd->mode is never set to enabled. > >> thermal zone .get_mode() must be well prepared to reflect the real thermal >> zone status upon the thermal zone registration. > > For of_thermal tzd->mode is initialized with the result of .get_mode() > when the zone is registered. At this time no sensor has been added > to the zone, hence the zone is disabled. When a sensor is added later > tzd->devdata->mode is set to enabled, however tzd->mode remains disabled: > > tzd->mode = DISABLED > tzd->devdata->mode = DISABLED > > of_parse_thermal_zones > thermal_zone_device_register > tzd->mode = tzd->get_mode() // => DISABLED > > <sensor>_probe > thermal_zone_of_sensor_register > tzd->set_mode(THERMAL_DEVICE_ENABLED) > tzd->devdata->mode = ENABLED > > One way to fix this for of_thermal could be to setting tzd->mode in > .set_mode() in addition to setting tzd->devdata->mode. However this > feels like a workaround/hack. Personally I find it confusing to have > two mode fields for a thermal zone device. Maybe tzd->mode should > replace tzd->devdata->mode? >
Hi Enric, On Wed, Jul 04, 2018 at 12:36:39PM +0200, Enric Balletbo i Serra wrote: > Hi Matthias, > > Sorry for late reply, my memory is bad so I need to look at this again. The > patch was send some time ago and there are pending changes to do but then I > switched. I'll take a look, but did you saw why this patch was not merged [1]? > Maybe that could answer some of your questions. > > Best regards, > Enric > > [1] https://lkml.org/lkml/2018/2/27/910 I missed this, thanks for the pointer! > On 03/07/18 19:13, Matthias Kaehlcke wrote: > > On Thu, Jun 28, 2018 at 05:33:02PM -0700, Matthias Kaehlcke wrote: > >> Hi, > >> > >> I stumbled across this patch since I'm currently poking around with > >> early thermal bringup on a platform and this patch has been integrated > >> in our development tree. > >> > >> I'm seeing some unexpected behaviors, which could entirely due to > >> wrong expectation from my side. I only have some basic working > >> knowledge of the thermal framework, just want to double check and > >> perhaps learn a thing or two. > >> > >> On Mon, Feb 26, 2018 at 03:41:18PM +0100, Enric Balletbo i Serra wrote: > >>> From: Zhang Rui <rui.zhang@intel.com> > >>> > >>> Thermal "mode" sysfs attribute is introduced to enable/disable a thermal > >>> zone, and .get_mode()/.set_mode() callback is introduced for platform > >>> thermal driver to enable/disable the hardware thermal control logic. And > >>> thermal core takes no action upon thermal zone enable/disable. > >>> > >>> Actually, this is not quite right because thermal core still pokes those > >>> disabled thermal zones occasionally, e.g. upon system resume. > >>> > >>> To fix this, a new flag 'mode' is introduced in struct thermal_zone_device > >>> to represent the thermal zone mode, and several decisions have been made > >>> based on this flag, including > >>> 1. check the thermal zone mode right after it's registered. > >>> 2. skip updating thermal zone if the zone is disabled > >>> 3. stop the polling timer when the thermal zone is disabled > >>> > >>> Note: basically, this patch doesn't affect the existing always-enabled > >>> thermal zones much, with just one exception - > >>> thermal zone .get_mode() must be well prepared to reflect the real thermal > >>> zone status upon the thermal zone registration. > >> > >> From my perspective this looks like a pretty significant change. For > >> the platform I'm working on I added a thermal zone to the device tree, > >> with the expectation that it would be enabled. Judging from the code > >> without this patch this expectation seems to be naive, since > >> of-thermal.c sets tz->mode to THERMAL_DEVICE_DISABLED, so apparently > >> either userspace or some driver should call _set_mode(tz, > >> THERMAL_DEVICE_ENABLED). However even without this the thermal zone > >> appears to be active (I didn't really test end-to-end yet, but at > >> least thermal_zone_device_update() is called and calls > >> handle_thermal_trip()). Not sure why this is the case with > >> THERMAL_DEVICE_DISABLED, but before I learned about the existence of > >> the flag my expectation was that the zone would be enabled. > >> > >> With this patch thermal_zone_device_update() is skipped if the zone > >> hasn't been explictly enabled, which may be consistent with the state > >> of 'tz->mode', but effectively changes the previous/current behavior. > >> > >> Not sure if I'm just dumbly overlooking something obvious or if there > >> is an actual problem with of_thermal (and maybe others). > > > > The problem is that there are now two 'mode' fields, tzd->mode and the > > other typically tzd->devdata->mode, and tzd->mode is never set to enabled. > > > >> thermal zone .get_mode() must be well prepared to reflect the real thermal > >> zone status upon the thermal zone registration. > > > > For of_thermal tzd->mode is initialized with the result of .get_mode() > > when the zone is registered. At this time no sensor has been added > > to the zone, hence the zone is disabled. When a sensor is added later > > tzd->devdata->mode is set to enabled, however tzd->mode remains disabled: > > > > tzd->mode = DISABLED > > tzd->devdata->mode = DISABLED > > > > of_parse_thermal_zones > > thermal_zone_device_register > > tzd->mode = tzd->get_mode() // => DISABLED > > > > <sensor>_probe > > thermal_zone_of_sensor_register > > tzd->set_mode(THERMAL_DEVICE_ENABLED) > > tzd->devdata->mode = ENABLED > > > > One way to fix this for of_thermal could be to setting tzd->mode in > > .set_mode() in addition to setting tzd->devdata->mode. However this > > feels like a workaround/hack. Personally I find it confusing to have > > two mode fields for a thermal zone device. Maybe tzd->mode should > > replace tzd->devdata->mode? > >
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 2b1b0ba393a4..8716ba5b2761 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -306,9 +306,9 @@ static void monitor_thermal_zone(struct thermal_zone_device *tz) { mutex_lock(&tz->lock); - if (tz->passive) + if (tz->mode == THERMAL_DEVICE_ENABLED && tz->passive) thermal_zone_device_set_polling(tz, tz->passive_delay); - else if (tz->polling_delay) + else if (tz->mode == THERMAL_DEVICE_ENABLED && tz->polling_delay) thermal_zone_device_set_polling(tz, tz->polling_delay); else thermal_zone_device_set_polling(tz, 0); @@ -464,11 +464,35 @@ static void thermal_zone_device_reset(struct thermal_zone_device *tz) pos->initialized = false; } +int thermal_zone_set_mode(struct thermal_zone_device *tz, + enum thermal_device_mode mode) +{ + int result; + + if (!tz->ops->set_mode) + return -EPERM; + + result = tz->ops->set_mode(tz, mode); + if (result) + return result; + + if (tz->mode != mode) { + tz->mode = mode; + monitor_thermal_zone(tz); + } + return 0; +} +EXPORT_SYMBOL_GPL(thermal_zone_set_mode); + void thermal_zone_device_update(struct thermal_zone_device *tz, enum thermal_notify_event event) { int count; + /* Do nothing if the thermal zone is disabled */ + if (tz->mode == THERMAL_DEVICE_DISABLED) + return; + if (atomic_read(&in_suspend)) return; @@ -1278,6 +1302,15 @@ thermal_zone_device_register(const char *type, int trips, int mask, INIT_DELAYED_WORK(&tz->poll_queue, thermal_zone_device_check); thermal_zone_device_reset(tz); + + if (tz->ops->get_mode) { + enum thermal_device_mode mode; + + result = tz->ops->get_mode(tz, &mode); + tz->mode = result ? THERMAL_DEVICE_ENABLED : mode; + } else + tz->mode = THERMAL_DEVICE_ENABLED; + /* Update the new thermal zone and mark it as already updated. */ if (atomic_cmpxchg(&tz->need_update, 1, 0)) thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED); diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c index ba81c9080f6e..2746540289c4 100644 --- a/drivers/thermal/thermal_sysfs.c +++ b/drivers/thermal/thermal_sysfs.c @@ -51,18 +51,9 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr, char *buf) { struct thermal_zone_device *tz = to_thermal_zone(dev); - enum thermal_device_mode mode; - int result; - - if (!tz->ops->get_mode) - return -EPERM; - result = tz->ops->get_mode(tz, &mode); - if (result) - return result; - - return sprintf(buf, "%s\n", mode == THERMAL_DEVICE_ENABLED ? "enabled" - : "disabled"); + return sprintf(buf, "%s\n", tz->mode == THERMAL_DEVICE_ENABLED ? + "enabled" : "disabled"); } static ssize_t @@ -70,18 +61,17 @@ mode_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct thermal_zone_device *tz = to_thermal_zone(dev); + enum thermal_device_mode mode; int result; - if (!tz->ops->set_mode) - return -EPERM; - if (!strncmp(buf, "enabled", sizeof("enabled") - 1)) - result = tz->ops->set_mode(tz, THERMAL_DEVICE_ENABLED); + mode = THERMAL_DEVICE_ENABLED; else if (!strncmp(buf, "disabled", sizeof("disabled") - 1)) - result = tz->ops->set_mode(tz, THERMAL_DEVICE_DISABLED); + mode = THERMAL_DEVICE_DISABLED; else - result = -EINVAL; + return -EINVAL; + result = thermal_zone_set_mode(tz, mode); if (result) return result; diff --git a/include/linux/thermal.h b/include/linux/thermal.h index 8c5302374eaa..ff60977c91da 100644 --- a/include/linux/thermal.h +++ b/include/linux/thermal.h @@ -211,6 +211,7 @@ struct thermal_zone_device { struct thermal_attr *trip_type_attrs; struct thermal_attr *trip_hyst_attrs; void *devdata; + enum thermal_device_mode mode; int trips; unsigned long trips_disabled; /* bitmap for disabled trips */ int passive_delay; @@ -466,6 +467,8 @@ struct thermal_zone_device *thermal_zone_get_zone_by_name(const char *name); int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp); int thermal_zone_get_slope(struct thermal_zone_device *tz); int thermal_zone_get_offset(struct thermal_zone_device *tz); +int thermal_zone_set_mode(struct thermal_zone_device *tz, + enum thermal_device_mode mode); int get_tz_trend(struct thermal_zone_device *, int); struct thermal_instance *get_thermal_instance(struct thermal_zone_device *,