[0/2] thermal: Add support of multiple sensors

Message ID 20220218084604.1669091-1-abailon@baylibre.com (mailing list archive)

Message

Alexandre Bailon Feb. 18, 2022, 8:46 a.m. UTC
Following this comment [1], this updates thermal_of to support multiple
sensors.

This has some limitations:
- A sensor must have its own thermal zone, even if it is also registered
  inside a thermal zone supporting multiple sensors.
- Some callbacks (such as of_thermal_set_trips) have been updated to support
  multiple sensors but I don't know if this really makes sense.
- of_thermal_get_trend has not been updated to support multiple sensors.
  It would probably make sense to support it but I am not sure how to do it,
  especially for the average.
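
One conceivable way to get a trend for an aggregated zone, sketched here
purely as an illustration and not taken from the patches, is to ignore the
per-sensor trends and compare the current aggregated temperature with the
previous aggregated one. The enum and function names below are hypothetical;
in the kernel the result would map onto the existing THERMAL_TREND_* values.

```c
/* Hypothetical stand-in for the kernel's trend values. */
enum agg_trend {
	AGG_TREND_STABLE,
	AGG_TREND_RAISING,
	AGG_TREND_DROPPING,
};

/*
 * Derive a trend from two successive aggregated readings (for example two
 * successive averages), both in millidegrees Celsius.
 */
static enum agg_trend aggregated_trend(int last_temp, int temp)
{
	if (temp > last_temp)
		return AGG_TREND_RAISING;
	if (temp < last_temp)
		return AGG_TREND_DROPPING;
	return AGG_TREND_STABLE;
}
```

This only requires the zone to remember its last aggregated value, so it
would work the same way for max, min and average.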

[1]: https://patchwork.kernel.org/comment/24723927/

Alexandre Bailon (2):
  dt-bindings: thermal: Update the bindings to support multiple sensor
  Thermal: Add support of multi sensor

 .../bindings/thermal/thermal-zones.yaml       |  20 +-
 drivers/thermal/thermal_of.c                  | 491 +++++++++++++++---
 2 files changed, 449 insertions(+), 62 deletions(-)

Comments

Eduardo Valentin Feb. 25, 2022, 11:52 p.m. UTC | #1
Hello Alexandre,

Great to see this taking shape now!

Overall the idea is sane and aligned with what I had in mind back at the 2019 Linux Plumbers Conference: one thermal zone should have multiple sensor inputs.
https://lpc.events/event/4/page/34-accepted-microconferences#PMSummary

In fact, that is aligned to what I originally wrote in the thermal device tree bindings:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/thermal/thermal-zones.yaml#n79

The only major concern with your series is the use of of-thermal to achieve multiple sensors per thermal zone.
While that solves the problem, it has the following limitations:
(1) It is limited to devices described in device tree; everybody else is left out.
(2) It keeps extending the code duplication in of-thermal.

My suggestion here is to make the thermal core aware of the multiple sensors per thermal zone.

That has the advantages of:
(a) cleaning up the sensor handling within of-thermal
(b) expanding multi-sensor zones to all types of thermal drivers
(c) standardizing the way multiple sensors are handled.

My original thoughts on achieving this included:
(i) Move the sensor handling part to a specific C file within the thermal core. That would include helper functions to execute the aggregations: max, min, avg, geo avg, etc.
(ii) A way to tell which sensors are being aggregated via sysfs; probably a simple symlink to the original device would suffice.
(iii) A way to change the aggregation via sysfs. Just as you proposed a way to specify the aggregation via device tree, we should have a way to specify the aggregation at runtime.
(iv) Once (i)-(iii) are done, you basically clean up of-thermal to use the new C API, and of-thermal simply uses that API to register the sensors. I'd expect all the callbacks related to sensor ops to disappear from of-thermal.
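
As a rough illustration of the helper functions in (i), here is a minimal,
self-contained sketch in plain C. It is not code from the series or from the
thermal core: the enum, the function name and the return convention are all
hypothetical, and the geometric average is left out because it would need
more care than integer arithmetic allows.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical aggregation modes; these names do not exist in the kernel. */
enum sensor_agg {
	SENSOR_AGG_MAX,
	SENSOR_AGG_MIN,
	SENSOR_AGG_AVG,
};

/*
 * Combine n sensor readings (millidegrees Celsius) into a single zone
 * temperature. Returns 0 on success and -1 on invalid arguments.
 */
static int sensor_aggregate(enum sensor_agg mode, const int *temps,
			    size_t n, int *out)
{
	long long sum = 0;	/* wide accumulator so the sum cannot overflow int */
	int max = INT_MIN;
	int min = INT_MAX;
	size_t i;

	if (!temps || !out || n == 0)
		return -1;

	for (i = 0; i < n; i++) {
		if (temps[i] > max)
			max = temps[i];
		if (temps[i] < min)
			min = temps[i];
		sum += temps[i];
	}

	switch (mode) {
	case SENSOR_AGG_MAX:
		*out = max;
		break;
	case SENSOR_AGG_MIN:
		*out = min;
		break;
	case SENSOR_AGG_AVG:
		*out = (int)(sum / (long long)n);
		break;
	default:
		return -1;
	}

	return 0;
}
```

With readings of 45000, 52000 and 47000, this yields 52000 for max, 45000
for min and 48000 for the average. A runtime switch as in (iii) would then
only need to change the mode value passed in.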

Kevin Hilman March 23, 2022, 9:33 p.m. UTC | #2
Hi Eduardo, Daniel,

This cleanup all sounds like the right direction to be headed, but since
this has been planned since 2019 and nothing has happened, what is the
level of urgency for this of-thermal -> thermal core cleanup/rework?

In $SUBJECT series, we have a fully functional series that solves an
existing problem and takes a big step in the right long-term direction.
While it indeed has the limitations you mention, I don't think that
should block the merging of this series.  More importantly, there are
existing drivers[1] as well as forthcoming ones from MTK that depend on
this series. Those are blocked if you require the of-thermal -> core
move first.

As a maintainer also, I fully understand that maintainer bandwidth is
limited, and it's always nice to have contributors do core framework
development when possible, but IMO, in this case I don't think it should
be a prerequisite since a follow-up series to do the core work would not
affect any functionality or bindings etc.  I don't see any reason not
to do this incrementally.

So I would kindly request (read: beg, plead & grovel) that you seriously
consider merging this series as a first phase and the of-thermal -> core
change be done as a second phase.  Yes, I fully understand that punting
this to a second phase means it might not get done soon.  But it's been
waiting for years already, so it seems the urgency is low.  Meanwhile,
there are OF users that are ready to use this feature today.

Thanks for considering,

Kevin

[1] https://lore.kernel.org/linux-mediatek/20210617114707.10618-1-ben.tseng@mediatek.com/
AngeloGioacchino Del Regno April 5, 2022, 12:14 p.m. UTC | #3
Hello Eduardo, Kevin,

I would like to add that this series benefits not only MediaTek platforms,
and not only Chromebooks.
On some Qualcomm SoCs (from SDM845 onwards, if I'm not wrong!) there is a
downstream "qti virtual sensor" driver that addresses this kind of situation:
on these platforms, averaging, min and max (and some interpolation too, but that's
another story, I guess) are computed and used as an advanced way to ensure both
that performance stays high and that the device is safe to operate.
This is done by evaluating the CPU, GPU, Hexagon DSPs, modem, wifi
and (modem,wifi)PA IPs and deciding on a thermal throttling strategy.

You understand that, while this is not "excessively" important for a Chromebook,
which is a laptop, it may even become a safety concern in devices of other form
factors, like smartphones, where the thermal headroom is very tight (hence
requiring fine-grained thermal management).

Even though, on MediaTek, I guess that the primary use case is Chromebooks and this
kind of mechanism is required primarily for the LVTS sensors that are used for SVS
calculations (read: better power efficiency), the Linux community is huge - and,
with this kept in mind, there will probably be someone who would like to upstream
their MTK smartphone for one reason or another (I think! This happened with Qualcomm
so I guess that it's going to happen with "any other thing")... and that adds up
to this problem being a safety concern to fix.

Of course, I agree with you, Eduardo, about the needed cleanup but, for all of
the aforementioned reasons (mine and Kevin's), I would, like him, also beg, plead
and grovel that you consider merging this series as a first phase, and accept the
cleanup and use-case expansion as a second phase.

P.S.: I'm adding Marijn and Konrad to the loop, as people interested in the
       Qualcomm side of things, and mainly in upstreaming smartphones.

Kind regards,
Angelo
Daniel Lezcano April 5, 2022, 4:23 p.m. UTC | #4
Hi Angelo,


On 05/04/2022 14:14, AngeloGioacchino Del Regno wrote:

[ ... ]


I'll take care of the cleanups and then respin Alex's series on top of 
those.

Thanks

  -- Daniel

