Message ID | 3b80853abb45a9e067cf7a16754b07bb67712457.1520274879.git.amit.kucheria@linaro.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Eduardo Valentin |
Headers | show |
On Mon, Mar 5, 2018 at 12:36 PM, Amit Kucheria <amit.kucheria@linaro.org> wrote: > From: Ram Chandrasekar <rkumbako@codeaurora.org> > > There is currently no way for the governor to be selected for each thermal > zone in devicetree. This results in the default governor being used for all > thermal zones even though no such restriction exists in the core code. > > Add support for specifying the thermal governor to be used for a thermal > zone in the devicetree. The devicetree config should specify the governor > name as a string that matches any available governors. If not specified, we > maintain the current behaviour of using the default governor. > > Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > Documentation/devicetree/bindings/thermal/thermal.txt | 8 ++++++++ > drivers/thermal/of-thermal.c | 6 ++++++ > 2 files changed, 14 insertions(+) > > diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt > index 1719d47..fced9d3 100644 > --- a/Documentation/devicetree/bindings/thermal/thermal.txt > +++ b/Documentation/devicetree/bindings/thermal/thermal.txt > @@ -168,6 +168,14 @@ Optional property: > by means of sensor ID. Additional coefficients are > interpreted as constant offset. > > +- thermal-governor: Thermal governor to be used for this thermal zone. > + Expected values are: > + "step_wise": Use step wise governor. > + "fair_share": Use fair share governor. > + "user_space": Use user space governor. > + "power_allocator": Use power allocator governor. This looks pretty Linux specific. Not that we can't have Linux specific properties, but we try to avoid them. What determines the selection? I'd imagine only certain governors make sense for certain devices. We should perhaps describe those characteristics which can then infer the best governor. Not really sure though... Rob
On 05/03/2018 19:36, Amit Kucheria wrote: > From: Ram Chandrasekar <rkumbako@codeaurora.org> > > There is currently no way for the governor to be selected for each thermal > zone in devicetree. This results in the default governor being used for all > thermal zones even though no such restriction exists in the core code. > > Add support for specifying the thermal governor to be used for a thermal > zone in the devicetree. The devicetree config should specify the governor > name as a string that matches any available governors. If not specified, we > maintain the current behaviour of using the default governor. > > Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Why not create a kernel parameter (eg. thermal.governor=) ? So everyone can gain benefit of this feature. And in order to specify that from the DT, add the 'chosen' node and bootargs with the desired kernel parameter?
On Tue, Mar 6, 2018 at 1:38 AM, Rob Herring <robh+dt@kernel.org> wrote: > On Mon, Mar 5, 2018 at 12:36 PM, Amit Kucheria <amit.kucheria@linaro.org> wrote: >> From: Ram Chandrasekar <rkumbako@codeaurora.org> >> >> There is currently no way for the governor to be selected for each thermal >> zone in devicetree. This results in the default governor being used for all >> thermal zones even though no such restriction exists in the core code. >> >> Add support for specifying the thermal governor to be used for a thermal >> zone in the devicetree. The devicetree config should specify the governor >> name as a string that matches any available governors. If not specified, we >> maintain the current behaviour of using the default governor. >> >> Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org> >> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> >> --- >> Documentation/devicetree/bindings/thermal/thermal.txt | 8 ++++++++ >> drivers/thermal/of-thermal.c | 6 ++++++ >> 2 files changed, 14 insertions(+) >> >> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt >> index 1719d47..fced9d3 100644 >> --- a/Documentation/devicetree/bindings/thermal/thermal.txt >> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt >> @@ -168,6 +168,14 @@ Optional property: >> by means of sensor ID. Additional coefficients are >> interpreted as constant offset. >> >> +- thermal-governor: Thermal governor to be used for this thermal zone. >> + Expected values are: >> + "step_wise": Use step wise governor. >> + "fair_share": Use fair share governor. >> + "user_space": Use user space governor. >> + "power_allocator": Use power allocator governor. > > This looks pretty Linux specific. Not that we can't have Linux > specific properties, but we try to avoid them. > > What determines the selection? I'd imagine only certain governors make > sense for certain devices. We should perhaps describe those > characteristics which can then infer the best governor. Not really > sure though... I'm not sure if it would be easy to assign preferred governors to device classes. It is dependent on what devices are present on the system, what throttling knobs they expose and how the system designer decided to integrate it all. e.g. A GPU driver might be controlled in kernel or userspace depending on whether it exposes a devfreq knob or some more esoteric statistics to userspace. Bang Bang governor seems to be designed for Fans with a simple ON/OFF iterface. Userspace governor is designed to move thermal policy to userspace (e.g. through thermald). So backlight brightness, battery charging, GPU scaling, even cpu frequency scaling can be offloaded to userspace. On embedded platforms, modem control typically happens in userspace Power allocator governor is designed for a closed-loop system to keep the total TDP of the platform under control while allowing various devices (cpu, gpu, modem, etc.) to dynamically increase or decrease their individual budget depending on the usecase. Regards, Amit
On Tue, Mar 6, 2018 at 2:41 AM, Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > On 05/03/2018 19:36, Amit Kucheria wrote: >> From: Ram Chandrasekar <rkumbako@codeaurora.org> >> >> There is currently no way for the governor to be selected for each thermal >> zone in devicetree. This results in the default governor being used for all >> thermal zones even though no such restriction exists in the core code. >> >> Add support for specifying the thermal governor to be used for a thermal >> zone in the devicetree. The devicetree config should specify the governor >> name as a string that matches any available governors. If not specified, we >> maintain the current behaviour of using the default governor. >> >> Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org> >> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > Why not create a kernel parameter (eg. thermal.governor=) ? So everyone > can gain benefit of this feature. And in order to specify that from the > DT, add the 'chosen' node and bootargs with the desired kernel parameter? > This is supposed to be a per-thermal zone property. So specifying it on the command-line, while possible, might be a little cumbersome. I'm not even sure if kernel parameters can have a variable number of arguments. IOW, thermal.tz0.governor=userspace, thermal.tz1.governor=step_wise, thermal.tz2.governor=userspace, ..... I'm already seeing SoCs defining 8 or more thermal zones.
On 05/03/18 18:36, Amit Kucheria wrote: > From: Ram Chandrasekar <rkumbako@codeaurora.org> > > There is currently no way for the governor to be selected for each thermal > zone in devicetree. How is that any different from cpufreq, cpuidle or devfreq subsystems ? > This results in the default governor being used for all > thermal zones even though no such restriction exists in the core code. > What restrictions ? > Add support for specifying the thermal governor to be used for a thermal > zone in the devicetree. Then what prevents us from doing the same for other subsystems with so called /inefficient default/ governors ? > The devicetree config should specify the governor > name as a string that matches any available governors. If not specified, we > maintain the current behaviour of using the default governor. If one is specified, can the user change it from sysfs ? If yes, then why do we need this binding at all ? If no, we are basically restricting, then what would happen if the DT was shipped with wrong governor specified because the firmware author didn't know about any other governor ? IMO, having this in DT is bad idea as it may cause more issues than gain. We already have sysfs support like other PM subsystems and userspace can deal with it.
On 06/03/2018 06:48, Amit Kucheria wrote: > On Tue, Mar 6, 2018 at 2:41 AM, Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: >> On 05/03/2018 19:36, Amit Kucheria wrote: >>> From: Ram Chandrasekar <rkumbako@codeaurora.org> >>> >>> There is currently no way for the governor to be selected for each thermal >>> zone in devicetree. This results in the default governor being used for all >>> thermal zones even though no such restriction exists in the core code. >>> >>> Add support for specifying the thermal governor to be used for a thermal >>> zone in the devicetree. The devicetree config should specify the governor >>> name as a string that matches any available governors. If not specified, we >>> maintain the current behaviour of using the default governor. >>> >>> Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org> >>> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> >> >> Why not create a kernel parameter (eg. thermal.governor=) ? So everyone >> can gain benefit of this feature. And in order to specify that from the >> DT, add the 'chosen' node and bootargs with the desired kernel parameter? >> > > This is supposed to be a per-thermal zone property. So specifying it > on the command-line, while possible, might be a little cumbersome. I'm > not even sure if kernel parameters can have a variable number of > arguments. IOW, thermal.tz0.governor=userspace, > thermal.tz1.governor=step_wise, thermal.tz2.governor=userspace, ..... > > I'm already seeing SoCs defining 8 or more thermal zones. Ah yes indeed.
On Tue, Mar 6, 2018 at 6:02 PM, Sudeep Holla <sudeep.holla@arm.com> wrote: > > > On 05/03/18 18:36, Amit Kucheria wrote: >> From: Ram Chandrasekar <rkumbako@codeaurora.org> >> >> There is currently no way for the governor to be selected for each thermal >> zone in devicetree. > > How is that any different from cpufreq, cpuidle or devfreq subsystems ? Cpufreq/cpuidle are designed to control a single parameter while thermal framework is trying to mitigate heat from several disparate sources that are throttled in different ways. Besides, cpufreq/cpuidle have somewhat mature governors. Cpuidle has only one governor (for tickless) - menu governor, cpufreq has ondemand in mainline, replaced by interactive in android and hopefully soon both will be replaced by schedutil. Badly configured cpufreq/cpuidle/devfreq only leads to wasted power, while badly configured thermal zone leads to the loss of operation e.g. reboots, too hot to touch, etc. >> This results in the default governor being used for all >> thermal zones even though no such restriction exists in the core code. >> > > What restrictions ? I said "no such restriction exists". IOW, the core code does allow the governor to be changed. >> Add support for specifying the thermal governor to be used for a thermal >> zone in the devicetree. > > Then what prevents us from doing the same for other subsystems with so > called /inefficient default/ governors ? Nothing? :-) It depends on how closely we want the board DT to represent a production-ready setup. If a platform maintainer wants to ship a well-tested configuration in DT, is that really a problem? If however, you are requiring them to ship magic runes for userspace daemons (e.g. per-board thermald config) to tweak the various governors in sysfs, it takes a bit more work to get to a production-ready setup. We should of course continue to improve our govenors to arrive at the one true governor, but that'll take some time and might never even happen in the case of thermal IMHO. >> The devicetree config should specify the governor >> name as a string that matches any available governors. If not specified, we >> maintain the current behaviour of using the default governor. > > If one is specified, can the user change it from sysfs ? If > yes, then why do we need this binding at all ? If no, we are basically > restricting, then what would happen if the DT was shipped with wrong > governor specified because the firmware author didn't know about any > other governor ? The governor can be changed from sysfs today. So it is all override-able. I guess your main worry is an unmodifiable DT with this option being shipped in firmware, right? That is a fair point, though that could be worked around from sysfs. > IMO, having this in DT is bad idea as it may cause more issues than > gain. We already have sysfs support like other PM subsystems and > userspace can deal with it. As mentioned above, I think thermal is a bit special because bad configuration may lead to loss of operation. But iff we're strictly enforcing the rule about no policy in board DTs, then I concede. Personally, I'd like mainline board DTs to have enough policy to provide a stable system so I don't have to futz around with userspace to set it all up correctly (in 3 different distros). Regards, Amit
On Wed, Mar 7, 2018 at 4:29 PM, Amit Kucheria <amit.kucheria@linaro.org> wrote: > Cpufreq/cpuidle are designed to control a single parameter while > thermal framework is trying to mitigate heat from several disparate > sources that are throttled in different ways. Besides, cpufreq/cpuidle > have somewhat mature governors. Cpuidle has only one governor (for > tickless) - menu governor, cpufreq has ondemand in mainline, replaced > by interactive in android and hopefully soon both will be replaced by Interactive and schedfreq are already removed from Android 4.4 and 4.9. It used schedutil now. > schedutil. > > Badly configured cpufreq/cpuidle/devfreq only leads to wasted power, > while badly configured thermal zone leads to the loss of operation > e.g. reboots, too hot to touch, etc. I don't think such heat-ups will happen right during boot, where some init.rc should come up and change the governor. Over that if we are worried about production images only, then what prevents us to select the right default governor in the defconfig ? We shouldn't be worried about multi-platform kernels for production images. -- viresh
On Thu, Mar 8, 2018 at 10:19 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote: > On Wed, Mar 7, 2018 at 4:29 PM, Amit Kucheria <amit.kucheria@linaro.org> wrote: >> Cpufreq/cpuidle are designed to control a single parameter while >> thermal framework is trying to mitigate heat from several disparate >> sources that are throttled in different ways. Besides, cpufreq/cpuidle >> have somewhat mature governors. Cpuidle has only one governor (for >> tickless) - menu governor, cpufreq has ondemand in mainline, replaced >> by interactive in android and hopefully soon both will be replaced by > > Interactive and schedfreq are already removed from Android 4.4 and 4.9. > It used schedutil now. Good to hear. >> schedutil. >> >> Badly configured cpufreq/cpuidle/devfreq only leads to wasted power, >> while badly configured thermal zone leads to the loss of operation >> e.g. reboots, too hot to touch, etc. > > I don't think such heat-ups will happen right during boot, where some > init.rc should > come up and change the governor. Interestingly enough, there are some patches that make the thermal framework start earlier to deal with boot-time thermal issues. I expect to post them soon. :-) These are required, for example, if the device goes into a reboot loop - the device might not even make it to the point in userspace where the governors are setup. In such a loop, the temperature constantly keeps increasing. > Over that if we are worried about production images only, then what prevents us > to select the right default governor in the defconfig ? We shouldn't > be worried about > multi-platform kernels for production images. I was refering to the 'make defconfig' out-of-box experience that makes the majority of developer boards/devices out there stable to work on. I can't do that today. Regards, Amit
On 08-03-18, 11:05, Amit Kucheria wrote: > Interestingly enough, there are some patches that make the thermal > framework start earlier to deal with boot-time thermal issues. I > expect to post them soon. :-) These are required, for example, if the > device goes into a reboot loop - the device might not even make it to > the point in userspace where the governors are setup. In such a loop, > the temperature constantly keeps increasing. > > I was refering to the 'make defconfig' out-of-box experience that > makes the majority of developer boards/devices out there stable to > work on. I can't do that today. Fair enough, but I am still not sure if we should put something that defines our thermal policy into DT which should describe hardware.
On 07/03/18 10:59, Amit Kucheria wrote: > On Tue, Mar 6, 2018 at 6:02 PM, Sudeep Holla <sudeep.holla@arm.com> wrote: >> >> >> On 05/03/18 18:36, Amit Kucheria wrote: >>> From: Ram Chandrasekar <rkumbako@codeaurora.org> >>> >>> There is currently no way for the governor to be selected for each thermal >>> zone in devicetree. >> >> How is that any different from cpufreq, cpuidle or devfreq subsystems ? > > Cpufreq/cpuidle are designed to control a single parameter while > thermal framework is trying to mitigate heat from several disparate > sources that are throttled in different ways. Besides, cpufreq/cpuidle > have somewhat mature governors. Cpuidle has only one governor (for > tickless) - menu governor, cpufreq has ondemand in mainline, replaced > by interactive in android and hopefully soon both will be replaced by > schedutil. > While I agree on the facts, I don't think adding this information to DT because thermal governors are not so mature compared to cpufreq/idle is acceptable. It's give more reason not to add as new governors may get added as things mature. > Badly configured cpufreq/cpuidle/devfreq only leads to wasted power, > while badly configured thermal zone leads to the loss of operation > e.g. reboots, too hot to touch, etc. > As Viresh pointed out, if the default hardware or firmware configuration causes that then it's no good platform to use. If adding more functionality in Linux(like cpufreq and others) causes this then Linux needs to deal with it. If default thermal can't help in preventing such damage then it shouldn't be default. >>> This results in the default governor being used for all >>> thermal zones even though no such restriction exists in the core code. >>> >> >> What restrictions ? > > I said "no such restriction exists". IOW, the core code does allow the > governor to be changed. > Ah ok. >>> Add support for specifying the thermal governor to be used for a thermal >>> zone in the devicetree. >> >> Then what prevents us from doing the same for other subsystems with so >> called /inefficient default/ governors ? > > Nothing? :-) > Cool problem solved, just kidding. > It depends on how closely we want the board DT to represent a > production-ready setup. If a platform maintainer wants to ship a > well-tested configuration in DT, is that really a problem? > You mean configuration + policy ? Then yes policy in DT is no-no. > If however, you are requiring them to ship magic runes for userspace > daemons (e.g. per-board thermald config) to tweak the various > governors in sysfs, it takes a bit more work to get to a > production-ready setup. > Fair enough. > We should of course continue to improve our govenors to arrive at the > one true governor, but that'll take some time and might never even > happen in the case of thermal IMHO. > I am not sure about "one true governor" as different users have different needs. >>> The devicetree config should specify the governor >>> name as a string that matches any available governors. If not specified, we >>> maintain the current behaviour of using the default governor. >> >> If one is specified, can the user change it from sysfs ? If >> yes, then why do we need this binding at all ? If no, we are basically >> restricting, then what would happen if the DT was shipped with wrong >> governor specified because the firmware author didn't know about any >> other governor ? > > The governor can be changed from sysfs today. So it is all override-able. > > I guess your main worry is an unmodifiable DT with this option being > shipped in firmware, right? That is a fair point, though that could be > worked around from sysfs. > Yes, and Viresh's point triggered more question to me. If DT has some governor like interactive which will not in mainline or not configure in the default build what will be done. If in such case default governor is good enough to boot without burning or any other issues, why not always? >> IMO, having this in DT is bad idea as it may cause more issues than >> gain. We already have sysfs support like other PM subsystems and >> userspace can deal with it. > > As mentioned above, I think thermal is a bit special because bad > configuration may lead to loss of operation. But iff we're strictly > enforcing the rule about no policy in board DTs, then I concede. > As I said IMO I see more issues with that, so I don't like it. > Personally, I'd like mainline board DTs to have enough policy to > provide a stable system so I don't have to futz around with userspace > to set it all up correctly (in 3 different distros). > Sysfs is more flexible than DT and requires some fuzzing to get the best out of the platform if you need.
diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt index 1719d47..fced9d3 100644 --- a/Documentation/devicetree/bindings/thermal/thermal.txt +++ b/Documentation/devicetree/bindings/thermal/thermal.txt @@ -168,6 +168,14 @@ Optional property: by means of sensor ID. Additional coefficients are interpreted as constant offset. +- thermal-governor: Thermal governor to be used for this thermal zone. + Expected values are: + "step_wise": Use step wise governor. + "fair_share": Use fair share governor. + "user_space": Use user space governor. + "power_allocator": Use power allocator governor. + Type: string + - sustainable-power: An estimate of the sustainable power (in mW) that the Type: unsigned thermal zone can dissipate at the desired Size: one cell control temperature. For reference, the diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c index e09f035..a884b01 100644 --- a/drivers/thermal/of-thermal.c +++ b/drivers/thermal/of-thermal.c @@ -974,6 +974,7 @@ int __init of_parse_thermal_zones(void) struct thermal_zone_params *tzp; int i, mask = 0; u32 prop; + const char *governor_name; tz = thermal_of_build_thermal_zone(child); if (IS_ERR(tz)) { @@ -996,6 +997,11 @@ int __init of_parse_thermal_zones(void) /* No hwmon because there might be hwmon drivers registering */ tzp->no_hwmon = true; + if (!of_property_read_string(child, "thermal-governor", + &governor_name)) + strlcpy(tzp->governor_name, governor_name, + THERMAL_NAME_LENGTH); + if (!of_property_read_u32(child, "sustainable-power", &prop)) tzp->sustainable_power = prop;