diff mbox series

arm64: dts: qcom: sc7180: Add 'sustainable_power' for CPU thermal zones

Message ID 20200813113030.1.I89c33c4119eaffb986b1e8c1bc6f0e30267089cd@changeid (mailing list archive)
State Accepted
Commit 5a4d9f3e18f5e2281f50e7f644e1f9a1589d900b
Headers show
Series arm64: dts: qcom: sc7180: Add 'sustainable_power' for CPU thermal zones | expand

Commit Message

Matthias Kaehlcke Aug. 13, 2020, 6:30 p.m. UTC
The 'sustainable_power' attribute provides an estimate of the sustained
power that can be dissipated at the desired control temperature. One
could argue that this value is not necessarily the same for all devices
with the same SoC, which may have different form factors or thermal
designs. However there are reasons to specify a (default) value at SoC
level for SC7180: most importantly, if no value is specified at all the
power_allocator thermal governor (aka 'IPA') estimates a value, using the
minimum power of all cooling devices of the zone, which can result in
overly aggressive thermal throttling. For most devices an approximate
conservative value should be more useful than the minimum guesstimate
of power_allocator. Devices that need a different value can overwrite
it in their <device>.dts. Also the thermal zones for SC7180 have a high
level of granularity (essentially one for each function block), which
makes it more likely that the default value just works for many devices.

The values correspond to 1901 MHz for the big cores, and 1804 MHz for
the small cores. The values were determined by limiting the CPU
frequencies to different max values and launching a bunch of processes
that cause high CPU load ('while true; do true; done &' is simple and
does a good job). A frequency is deemed sustainable if the CPU
temperatures don't rise (consistently) above the second trip point
('control temperature', 95 degC in this case). Once the highest
sustainable frequency is found, the sustainable power can be calculated
by multiplying the energy consumption per core at this frequency (which
can be found in /sys/kernel/debug/energy_model/) with the number of
cores that are specified as cooling devices.

The sustainable frequencies were determined at room temperature
on a device without heat sink or other passive cooling elements.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
If maintainers think 'sustainable_power' should be specified at
device level (with which I conceptually agree) I'm fine with
doing that, just seemed it could be useful to have a reasonable
'default' at SoC level in this case.

 arch/arm64/boot/dts/qcom/sc7180.dtsi | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Matthias Kaehlcke Sept. 1, 2020, 5:07 p.m. UTC | #1
On Thu, Aug 13, 2020 at 11:30:33AM -0700, Matthias Kaehlcke wrote:
> The 'sustainable_power' attribute provides an estimate of the sustained
> power that can be dissipated at the desired control temperature. One
> could argue that this value is not necessarily the same for all devices
> with the same SoC, which may have different form factors or thermal
> designs. However there are reasons to specify a (default) value at SoC
> level for SC7180: most importantly, if no value is specified at all the
> power_allocator thermal governor (aka 'IPA') estimates a value, using the
> minimum power of all cooling devices of the zone, which can result in
> overly aggressive thermal throttling. For most devices an approximate
> conservative value should be more useful than the minimum guesstimate
> of power_allocator. Devices that need a different value can overwrite
> it in their <device>.dts. Also the thermal zones for SC7180 have a high
> level of granularity (essentially one for each function block), which
> makes it more likely that the default value just works for many devices.
> 
> The values correspond to 1901 MHz for the big cores, and 1804 MHz for
> the small cores. The values were determined by limiting the CPU
> frequencies to different max values and launching a bunch of processes
> that cause high CPU load ('while true; do true; done &' is simple and
> does a good job). A frequency is deemed sustainable if the CPU
> temperatures don't rise (consistently) above the second trip point
> ('control temperature', 95 degC in this case). Once the highest
> sustainable frequency is found, the sustainable power can be calculated
> by multiplying the energy consumption per core at this frequency (which
> can be found in /sys/kernel/debug/energy_model/) with the number of
> cores that are specified as cooling devices.
> 
> The sustainable frequencies were determined at room temperature
> on a device without heat sink or other passive cooling elements.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
> If maintainers think 'sustainable_power' should be specified at
> device level (with which I conceptually agree) I'm fine with
> doing that, just seemed it could be useful to have a reasonable
> 'default' at SoC level in this case.

Any comments on this?
Doug Anderson Sept. 1, 2020, 8:19 p.m. UTC | #2
Hi,

On Tue, Sep 1, 2020 at 10:07 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> On Thu, Aug 13, 2020 at 11:30:33AM -0700, Matthias Kaehlcke wrote:
> > The 'sustainable_power' attribute provides an estimate of the sustained
> > power that can be dissipated at the desired control temperature. One
> > could argue that this value is not necessarily the same for all devices
> > with the same SoC, which may have different form factors or thermal
> > designs. However there are reasons to specify a (default) value at SoC
> > level for SC7180: most importantly, if no value is specified at all the
> > power_allocator thermal governor (aka 'IPA') estimates a value, using the
> > minimum power of all cooling devices of the zone, which can result in
> > overly aggressive thermal throttling. For most devices an approximate
> > conservative value should be more useful than the minimum guesstimate
> > of power_allocator. Devices that need a different value can overwrite
> > it in their <device>.dts. Also the thermal zones for SC7180 have a high
> > level of granularity (essentially one for each function block), which
> > makes it more likely that the default value just works for many devices.
> >
> > The values correspond to 1901 MHz for the big cores, and 1804 MHz for
> > the small cores. The values were determined by limiting the CPU
> > frequencies to different max values and launching a bunch of processes
> > that cause high CPU load ('while true; do true; done &' is simple and
> > does a good job). A frequency is deemed sustainable if the CPU
> > temperatures don't rise (consistently) above the second trip point
> > ('control temperature', 95 degC in this case). Once the highest
> > sustainable frequency is found, the sustainable power can be calculated
> > by multiplying the energy consumption per core at this frequency (which
> > can be found in /sys/kernel/debug/energy_model/) with the number of
> > cores that are specified as cooling devices.
> >
> > The sustainable frequencies were determined at room temperature
> > on a device without heat sink or other passive cooling elements.

I'm curious: was this a bare board, or a device in a case?  Hrm, I'm
not sure which one would be worse at heat dissipation, but I would
imagine that being inside a plastic case might be worse?


> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > ---
> > If maintainers think 'sustainable_power' should be specified at
> > device level (with which I conceptually agree) I'm fine with
> > doing that, just seemed it could be useful to have a reasonable
> > 'default' at SoC level in this case.
>
> Any comments on this?

I'm not massively familiar with this area of the code, but I guess I
shouldn't let that stop me from having an opinion!  :-P

* I would agree that it seems highly unlikely that someone would put
one of these chips in a device that could only dissipate the heat from
the lowest OPP, so having some higher estimate definitely makes sense.

* In terms of the numbers here, I believe that you're claiming that we
can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
of how much power we could dissipate in previous laptops I worked on
is a little fuzzy, but that doesn't seem insane for a passively-cooled
laptop.  However, I think someone could conceivably put this chip in a
smaller form factor.  In such a case, it seems like we'd want these
things to sum up to ~2000 (if it would ever make sense for someone to
put this chip in a phone) or ~4000 (if it would ever make sense for
someone to put this chip in a small tablet).  It seems possible that,
to achieve this, we might have to tweak the
"dynamic-power-coefficient".  I don't know how much thought was put
into those numbers, but the fact that the little cores have a super
round 100 for their dynamic-power-coefficient makes me feel like they
might have been more schwags than anything.  Rajendra maybe knows?

* I'm curious about the fact that there are two numbers here: one for
littles and one for bigs.  If I had to guess I'd say that since all
the cores are in one package so the contributions kinda need to be
thought of together, right?  If we're sitting there thermally
throttled then we'd want to pick the best perf-per-watt for the
overall package.  This is why your patch says we can sustain the
little cores at max and the big cores get whatever is left over,
right?

* Should we be leaving some room in here for the GPU?  ...or I guess
once we list it as a cooling device we'll have to decrease the amount
the CPUs can use?


So I guess the tl; dr is:

a) We should check "dynamic-power-coefficient" and possibly adjust.

b) I don't think the "conservative" by-default numbers should add up
to 7 Watts.  I could be convinced that this chip is not intended for
phones and thus we could have it add up to 4 Watts, but 7 Watts seems
too much.


-Doug
Matthias Kaehlcke Sept. 1, 2020, 9:33 p.m. UTC | #3
Hi Doug,

On Tue, Sep 01, 2020 at 01:19:10PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Sep 1, 2020 at 10:07 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > On Thu, Aug 13, 2020 at 11:30:33AM -0700, Matthias Kaehlcke wrote:
> > > The 'sustainable_power' attribute provides an estimate of the sustained
> > > power that can be dissipated at the desired control temperature. One
> > > could argue that this value is not necessarily the same for all devices
> > > with the same SoC, which may have different form factors or thermal
> > > designs. However there are reasons to specify a (default) value at SoC
> > > level for SC7180: most importantly, if no value is specified at all the
> > > power_allocator thermal governor (aka 'IPA') estimates a value, using the
> > > minimum power of all cooling devices of the zone, which can result in
> > > overly aggressive thermal throttling. For most devices an approximate
> > > conservative value should be more useful than the minimum guesstimate
> > > of power_allocator. Devices that need a different value can overwrite
> > > it in their <device>.dts. Also the thermal zones for SC7180 have a high
> > > level of granularity (essentially one for each function block), which
> > > makes it more likely that the default value just works for many devices.
> > >
> > > The values correspond to 1901 MHz for the big cores, and 1804 MHz for
> > > the small cores. The values were determined by limiting the CPU
> > > frequencies to different max values and launching a bunch of processes
> > > that cause high CPU load ('while true; do true; done &' is simple and
> > > does a good job). A frequency is deemed sustainable if the CPU
> > > temperatures don't rise (consistently) above the second trip point
> > > ('control temperature', 95 degC in this case). Once the highest
> > > sustainable frequency is found, the sustainable power can be calculated
> > > by multiplying the energy consumption per core at this frequency (which
> > > can be found in /sys/kernel/debug/energy_model/) with the number of
> > > cores that are specified as cooling devices.
> > >
> > > The sustainable frequencies were determined at room temperature
> > > on a device without heat sink or other passive cooling elements.
> 
> I'm curious: was this a bare board, or a device in a case?  Hrm, I'm
> not sure which one would be worse at heat dissipation, but I would
> imagine that being inside a plastic case might be worse?

This was with a device in a plastic case.

> > > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > > ---
> > > If maintainers think 'sustainable_power' should be specified at
> > > device level (with which I conceptually agree) I'm fine with
> > > doing that, just seemed it could be useful to have a reasonable
> > > 'default' at SoC level in this case.
> >
> > Any comments on this?
> 
> I'm not massively familiar with this area of the code, but I guess I
> shouldn't let that stop me from having an opinion!  :-P
> 
> * I would agree that it seems highly unlikely that someone would put
> one of these chips in a device that could only dissipate the heat from
> the lowest OPP, so having some higher estimate definitely makes sense.
> 
> * In terms of the numbers here, I believe that you're claiming that we
> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.

No, I'm claiming it's 768 mW + 1202 mW = ~2 W.

SC7180 has a 6 thermal zones for the 6 little cores and 4 zones for the
2 big cores. Each of these thermal zones uses either all little or all big
cores as cooling devices, hence the power sustainable power of the
individual zones doesn't add up. 768 mW corresponds to 6x 128 mW (aka all
little cores at 1.8 GHz), and 1202 mW to 2x 601 mW (both big cores at 1.9 GHz).

> My memory
> of how much power we could dissipate in previous laptops I worked on
> is a little fuzzy, but that doesn't seem insane for a passively-cooled
> laptop.  However, I think someone could conceivably put this chip in a
> smaller form factor.  In such a case, it seems like we'd want these
> things to sum up to ~2000 (if it would ever make sense for someone to
> put this chip in a phone) or ~4000 (if it would ever make sense for
> someone to put this chip in a small tablet).

See above, the sustainable power with this patch only adds up to ~2000.
It is possible though that it would be lower in a smaller form factor
device.

I'd be ok with posting something lower for SC7180 (it would be a guess
though) and use the specific numbers in the device specific DT.

> It seems possible that,
> to achieve this, we might have to tweak the
> "dynamic-power-coefficient".  I don't know how much thought was put
> into those numbers, but the fact that the little cores have a super
> round 100 for their dynamic-power-coefficient makes me feel like they
> might have been more schwags than anything.  Rajendra maybe knows?

Yeah, it's possible that that was just an approximation

> * I'm curious about the fact that there are two numbers here: one for
> littles and one for bigs.  If I had to guess I'd say that since all
> the cores are in one package so the contributions kinda need to be
> thought of together, right?  If we're sitting there thermally
> throttled then we'd want to pick the best perf-per-watt for the
> overall package.  This is why your patch says we can sustain the
> little cores at max and the big cores get whatever is left over,
> right?

It's derived from how Qualcomm specified the thermal zones and cooling
devices. Any ("cpu") zone is either cooled by (all) big cores or by (all)
little cores, but not a mix of them. In my tests I also saw that the big
cores seemed to have little impact on the little ones. The little cores
are at max because even running at max frequency the temperature in the
'little zones' wouldn't come close to the trip point.

> * Should we be leaving some room in here for the GPU?  ...or I guess
> once we list it as a cooling device we'll have to decrease the amount
> the CPUs can use?

I don't know for sure, but judging from the CPU zones I wouldn't be
surprised if the GPU was managed exclusively in the dedicated GPU
thermal zones (I guess that's what 'gpuss0-thermal' and 'gpuss1-thermal'
are). If that's not the case the values in the CPU zones can be
adjusted when specific data is available.

> So I guess the tl; dr is:
> 
> a) We should check "dynamic-power-coefficient" and possibly adjust.

ok, lets see if Rajendra can check if there is room for tweaking.

> b) I don't think the "conservative" by-default numbers should add up
> to 7 Watts.  I could be convinced that this chip is not intended for
> phones and thus we could have it add up to 4 Watts, but 7 Watts seems
> too much.

I suppose this is mostly addressed by my explications above, unless we
think that 2 Watts in CPU power might still be too aggressive as a
default.

Thanks

m.
Doug Anderson Sept. 1, 2020, 10:45 p.m. UTC | #4
Hi,

On Tue, Sep 1, 2020 at 2:33 PM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> Hi Doug,
>
> On Tue, Sep 01, 2020 at 01:19:10PM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Sep 1, 2020 at 10:07 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> > >
> > > On Thu, Aug 13, 2020 at 11:30:33AM -0700, Matthias Kaehlcke wrote:
> > > > The 'sustainable_power' attribute provides an estimate of the sustained
> > > > power that can be dissipated at the desired control temperature. One
> > > > could argue that this value is not necessarily the same for all devices
> > > > with the same SoC, which may have different form factors or thermal
> > > > designs. However there are reasons to specify a (default) value at SoC
> > > > level for SC7180: most importantly, if no value is specified at all the
> > > > power_allocator thermal governor (aka 'IPA') estimates a value, using the
> > > > minimum power of all cooling devices of the zone, which can result in
> > > > overly aggressive thermal throttling. For most devices an approximate
> > > > conservative value should be more useful than the minimum guesstimate
> > > > of power_allocator. Devices that need a different value can overwrite
> > > > it in their <device>.dts. Also the thermal zones for SC7180 have a high
> > > > level of granularity (essentially one for each function block), which
> > > > makes it more likely that the default value just works for many devices.
> > > >
> > > > The values correspond to 1901 MHz for the big cores, and 1804 MHz for
> > > > the small cores. The values were determined by limiting the CPU
> > > > frequencies to different max values and launching a bunch of processes
> > > > that cause high CPU load ('while true; do true; done &' is simple and
> > > > does a good job). A frequency is deemed sustainable if the CPU
> > > > temperatures don't rise (consistently) above the second trip point
> > > > ('control temperature', 95 degC in this case). Once the highest
> > > > sustainable frequency is found, the sustainable power can be calculated
> > > > by multiplying the energy consumption per core at this frequency (which
> > > > can be found in /sys/kernel/debug/energy_model/) with the number of
> > > > cores that are specified as cooling devices.
> > > >
> > > > The sustainable frequencies were determined at room temperature
> > > > on a device without heat sink or other passive cooling elements.
> >
> > I'm curious: was this a bare board, or a device in a case?  Hrm, I'm
> > not sure which one would be worse at heat dissipation, but I would
> > imagine that being inside a plastic case might be worse?
>
> This was with a device in a plastic case.
>
> > > > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > > > ---
> > > > If maintainers think 'sustainable_power' should be specified at
> > > > device level (with which I conceptually agree) I'm fine with
> > > > doing that, just seemed it could be useful to have a reasonable
> > > > 'default' at SoC level in this case.
> > >
> > > Any comments on this?
> >
> > I'm not massively familiar with this area of the code, but I guess I
> > shouldn't let that stop me from having an opinion!  :-P
> >
> > * I would agree that it seems highly unlikely that someone would put
> > one of these chips in a device that could only dissipate the heat from
> > the lowest OPP, so having some higher estimate definitely makes sense.
> >
> > * In terms of the numbers here, I believe that you're claiming that we
> > can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.
>
> No, I'm claiming it's 768 mW + 1202 mW = ~2 W.
>
> SC7180 has a 6 thermal zones for the 6 little cores and 4 zones for the
> 2 big cores. Each of these thermal zones uses either all little or all big
> cores as cooling devices, hence the power sustainable power of the
> individual zones doesn't add up. 768 mW corresponds to 6x 128 mW (aka all
> little cores at 1.8 GHz), and 1202 mW to 2x 601 mW (both big cores at 1.9 GHz).

Ah!  Thanks for explaining.


> > My memory
> > of how much power we could dissipate in previous laptops I worked on
> > is a little fuzzy, but that doesn't seem insane for a passively-cooled
> > laptop.  However, I think someone could conceivably put this chip in a
> > smaller form factor.  In such a case, it seems like we'd want these
> > things to sum up to ~2000 (if it would ever make sense for someone to
> > put this chip in a phone) or ~4000 (if it would ever make sense for
> > someone to put this chip in a small tablet).
>
> See above, the sustainable power with this patch only adds up to ~2000.
> It is possible though that it would be lower in a smaller form factor
> device.
>
> I'd be ok with posting something lower for SC7180 (it would be a guess
> though) and use the specific numbers in the device specific DT.

Given the advice in the bindings it seems like 2W should be fine.


> > It seems possible that,
> > to achieve this, we might have to tweak the
> > "dynamic-power-coefficient".  I don't know how much thought was put
> > into those numbers, but the fact that the little cores have a super
> > round 100 for their dynamic-power-coefficient makes me feel like they
> > might have been more schwags than anything.  Rajendra maybe knows?
>
> Yeah, it's possible that that was just an approximation
>
> > * I'm curious about the fact that there are two numbers here: one for
> > littles and one for bigs.  If I had to guess I'd say that since all
> > the cores are in one package so the contributions kinda need to be
> > thought of together, right?  If we're sitting there thermally
> > throttled then we'd want to pick the best perf-per-watt for the
> > overall package.  This is why your patch says we can sustain the
> > little cores at max and the big cores get whatever is left over,
> > right?
>
> It's derived from how Qualcomm specified the thermal zones and cooling
> devices. Any ("cpu") zone is either cooled by (all) big cores or by (all)
> little cores, but not a mix of them. In my tests I also saw that the big
> cores seemed to have little impact on the little ones. The little cores
> are at max because even running at max frequency the temperature in the
> 'little zones' wouldn't come close to the trip point.

OK, crazy.  I suppose that this makes sense,especially without a
heatsink and over a short burst of time.  I'd imagine that with a
heatsink things might look different, but trying to model everything
is impossible and seems like what't there works OK until someone can
say why it doesn't.  :-)


> > * Should we be leaving some room in here for the GPU?  ...or I guess
> > once we list it as a cooling device we'll have to decrease the amount
> > the CPUs can use?
>
> I don't know for sure, but judging from the CPU zones I wouldn't be
> surprised if the GPU was managed exclusively in the dedicated GPU
> thermal zones (I guess that's what 'gpuss0-thermal' and 'gpuss1-thermal'
> are). If that's not the case the values in the CPU zones can be
> adjusted when specific data is available.

Sounds good.


> > So I guess the tl; dr is:
> >
> > a) We should check "dynamic-power-coefficient" and possibly adjust.
>
> ok, lets see if Rajendra can check if there is room for tweaking.
>
> > b) I don't think the "conservative" by-default numbers should add up
> > to 7 Watts.  I could be convinced that this chip is not intended for
> > phones and thus we could have it add up to 4 Watts, but 7 Watts seems
> > too much.
>
> I suppose this is mostly addressed by my explications above, unless we
> think that 2 Watts in CPU power might still be too aggressive as a
> default.

With all your explanations, I'm happy to add:

Reviewed-by: Douglas Anderson <dianders@chromium.org>

-Doug
Matthias Kaehlcke Sept. 1, 2020, 11:09 p.m. UTC | #5
On Tue, Sep 01, 2020 at 03:45:52PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Sep 1, 2020 at 2:33 PM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > Hi Doug,
> >
> > On Tue, Sep 01, 2020 at 01:19:10PM -0700, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Tue, Sep 1, 2020 at 10:07 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> > > >
> > > > On Thu, Aug 13, 2020 at 11:30:33AM -0700, Matthias Kaehlcke wrote:
> > > > > The 'sustainable_power' attribute provides an estimate of the sustained
> > > > > power that can be dissipated at the desired control temperature. One
> > > > > could argue that this value is not necessarily the same for all devices
> > > > > with the same SoC, which may have different form factors or thermal
> > > > > designs. However there are reasons to specify a (default) value at SoC
> > > > > level for SC7180: most importantly, if no value is specified at all the
> > > > > power_allocator thermal governor (aka 'IPA') estimates a value, using the
> > > > > minimum power of all cooling devices of the zone, which can result in
> > > > > overly aggressive thermal throttling. For most devices an approximate
> > > > > conservative value should be more useful than the minimum guesstimate
> > > > > of power_allocator. Devices that need a different value can overwrite
> > > > > it in their <device>.dts. Also the thermal zones for SC7180 have a high
> > > > > level of granularity (essentially one for each function block), which
> > > > > makes it more likely that the default value just works for many devices.
> > > > >
> > > > > The values correspond to 1901 MHz for the big cores, and 1804 MHz for
> > > > > the small cores. The values were determined by limiting the CPU
> > > > > frequencies to different max values and launching a bunch of processes
> > > > > that cause high CPU load ('while true; do true; done &' is simple and
> > > > > does a good job). A frequency is deemed sustainable if the CPU
> > > > > temperatures don't rise (consistently) above the second trip point
> > > > > ('control temperature', 95 degC in this case). Once the highest
> > > > > sustainable frequency is found, the sustainable power can be calculated
> > > > > by multiplying the energy consumption per core at this frequency (which
> > > > > can be found in /sys/kernel/debug/energy_model/) with the number of
> > > > > cores that are specified as cooling devices.
> > > > >
> > > > > The sustainable frequencies were determined at room temperature
> > > > > on a device without heat sink or other passive cooling elements.
> > >
> > > I'm curious: was this a bare board, or a device in a case?  Hrm, I'm
> > > not sure which one would be worse at heat dissipation, but I would
> > > imagine that being inside a plastic case might be worse?
> >
> > This was with a device in a plastic case.
> >
> > > > > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > > > > ---
> > > > > If maintainers think 'sustainable_power' should be specified at
> > > > > device level (with which I conceptually agree) I'm fine with
> > > > > doing that, just seemed it could be useful to have a reasonable
> > > > > 'default' at SoC level in this case.
> > > >
> > > > Any comments on this?
> > >
> > > I'm not massively familiar with this area of the code, but I guess I
> > > shouldn't let that stop me from having an opinion!  :-P
> > >
> > > * I would agree that it seems highly unlikely that someone would put
> > > one of these chips in a device that could only dissipate the heat from
> > > the lowest OPP, so having some higher estimate definitely makes sense.
> > >
> > > * In terms of the numbers here, I believe that you're claiming that we
> > > can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.
> >
> > No, I'm claiming it's 768 mW + 1202 mW = ~2 W.
> >
> > SC7180 has a 6 thermal zones for the 6 little cores and 4 zones for the
> > 2 big cores. Each of these thermal zones uses either all little or all big
> > cores as cooling devices, hence the power sustainable power of the
> > individual zones doesn't add up. 768 mW corresponds to 6x 128 mW (aka all
> > little cores at 1.8 GHz), and 1202 mW to 2x 601 mW (both big cores at 1.9 GHz).
> 
> Ah!  Thanks for explaining.
> 
> 
> > > My memory
> > > of how much power we could dissipate in previous laptops I worked on
> > > is a little fuzzy, but that doesn't seem insane for a passively-cooled
> > > laptop.  However, I think someone could conceivably put this chip in a
> > > smaller form factor.  In such a case, it seems like we'd want these
> > > things to sum up to ~2000 (if it would ever make sense for someone to
> > > put this chip in a phone) or ~4000 (if it would ever make sense for
> > > someone to put this chip in a small tablet).
> >
> > See above, the sustainable power with this patch only adds up to ~2000.
> > It is possible though that it would be lower in a smaller form factor
> > device.
> >
> > I'd be ok with posting something lower for SC7180 (it would be a guess
> > though) and use the specific numbers in the device specific DT.
> 
> Given the advice in the bindings it seems like 2W should be fine.
> 
> 
> > > It seems possible that,
> > > to achieve this, we might have to tweak the
> > > "dynamic-power-coefficient".  I don't know how much thought was put
> > > into those numbers, but the fact that the little cores have a super
> > > round 100 for their dynamic-power-coefficient makes me feel like they
> > > might have been more schwags than anything.  Rajendra maybe knows?
> >
> > Yeah, it's possible that that was just an approximation
> >
> > > * I'm curious about the fact that there are two numbers here: one for
> > > littles and one for bigs.  If I had to guess I'd say that since all
> > > the cores are in one package so the contributions kinda need to be
> > > thought of together, right?  If we're sitting there thermally
> > > throttled then we'd want to pick the best perf-per-watt for the
> > > overall package.  This is why your patch says we can sustain the
> > > little cores at max and the big cores get whatever is left over,
> > > right?
> >
> > It's derived from how Qualcomm specified the thermal zones and cooling
> > devices. Any ("cpu") zone is either cooled by (all) big cores or by (all)
> > little cores, but not a mix of them. In my tests I also saw that the big
> > cores seemed to have little impact on the little ones. The little cores
> > are at max because even running at max frequency the temperature in the
> > 'little zones' wouldn't come close to the trip point.
> 
> OK, crazy.  I suppose that this makes sense,especially without a
> heatsink and over a short burst of time.  I'd imagine that with a
> heatsink things might look different, but trying to model everything
> is impossible and seems like what't there works OK until someone can
> say why it doesn't.  :-)

Exactly, they will likely look different, but since we want conservative
numbers for the generic case it's ok/good to use numbers from a
configuration without heatsink.

And 1.9 GHz doesn't even seem horribly slow for a device with a heatsink.
When we have such a device we can evaluate whether there are significant
gains from tweaking the values for the big cores in the device specific DT.

> > > * Should we be leaving some room in here for the GPU?  ...or I guess
> > > once we list it as a cooling device we'll have to decrease the amount
> > > the CPUs can use?
> >
> > I don't know for sure, but judging from the CPU zones I wouldn't be
> > surprised if the GPU was managed exclusively in the dedicated GPU
> > thermal zones (I guess that's what 'gpuss0-thermal' and 'gpuss1-thermal'
> > are). If that's not the case the values in the CPU zones can be
> > adjusted when specific data is available.
> 
> Sounds good.
> 
> 
> > > So I guess the tl; dr is:
> > >
> > > a) We should check "dynamic-power-coefficient" and possibly adjust.
> >
> > ok, lets see if Rajendra can check if there is room for tweaking.
> >
> > > b) I don't think the "conservative" by-default numbers should add up
> > > to 7 Watts.  I could be convinced that this chip is not intended for
> > > phones and thus we could have it add up to 4 Watts, but 7 Watts seems
> > > too much.
> >
> > I suppose this is mostly addressed by my explications above, unless we
> > think that 2 Watts in CPU power might still be too aggressive as a
> > default.
> 
> With all your explanations, I'm happy to add:
> 
> Reviewed-by: Douglas Anderson <dianders@chromium.org>

Thanks!
Rajendra Nayak Sept. 2, 2020, 5:36 a.m. UTC | #6
> * In terms of the numbers here, I believe that you're claiming that we
> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
> of how much power we could dissipate in previous laptops I worked on
> is a little fuzzy, but that doesn't seem insane for a passively-cooled
> laptop.  However, I think someone could conceivably put this chip in a
> smaller form factor.  In such a case, it seems like we'd want these
> things to sum up to ~2000 (if it would ever make sense for someone to
> put this chip in a phone) or ~4000 (if it would ever make sense for
> someone to put this chip in a small tablet).  It seems possible that,
> to achieve this, we might have to tweak the
> "dynamic-power-coefficient".

DPC values are calculated (at a SoC) by actually measuring max power at various
frequency/voltage combinations by running things like dhrystone.
How would the max power a SoC can generate depend on form factors?
How much it can dissipate sure is, but then I am not super familiar how
thermal frameworks end up using DPC for calculating power dissipated,
I am guessing they don't.
  
> I don't know how much thought was put
> into those numbers, but the fact that the little cores have a super
> round 100 for their dynamic-power-coefficient makes me feel like they
> might have been more schwags than anything.  Rajendra maybe knows?

FWIK, the values are always scaled and normalized to 100 for silver and
then used to derive the relative DPC number for gold. If you see the DPC
for silver cores even on sdm845 is a 100.
Again these are not estimations but based on actual power measurements.
Rajendra Nayak Sept. 2, 2020, 5:41 a.m. UTC | #7
>> I'm not massively familiar with this area of the code, but I guess I
>> shouldn't let that stop me from having an opinion!  :-P
>>
>> * I would agree that it seems highly unlikely that someone would put
>> one of these chips in a device that could only dissipate the heat from
>> the lowest OPP, so having some higher estimate definitely makes sense.
>>
>> * In terms of the numbers here, I believe that you're claiming that we
>> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.
> 
> No, I'm claiming it's 768 mW + 1202 mW = ~2 W.
> 
> SC7180 has a 6 thermal zones for the 6 little cores and 4 zones for the
> 2 big cores. Each of these thermal zones uses either all little or all big
> cores as cooling devices, hence the power sustainable power of the
> individual zones doesn't add up. 768 mW corresponds to 6x 128 mW (aka all
> little cores at 1.8 GHz), and 1202 mW to 2x 601 mW (both big cores at 1.9 GHz).
> 
>> My memory
>> of how much power we could dissipate in previous laptops I worked on
>> is a little fuzzy, but that doesn't seem insane for a passively-cooled
>> laptop.  However, I think someone could conceivably put this chip in a
>> smaller form factor.  In such a case, it seems like we'd want these
>> things to sum up to ~2000 (if it would ever make sense for someone to
>> put this chip in a phone) or ~4000 (if it would ever make sense for
>> someone to put this chip in a small tablet).
> 
> See above, the sustainable power with this patch only adds up to ~2000.
> It is possible though that it would be lower in a smaller form factor
> device.
> 
> I'd be ok with posting something lower for SC7180 (it would be a guess
> though) and use the specific numbers in the device specific DT.
> 
>> It seems possible that,
>> to achieve this, we might have to tweak the
>> "dynamic-power-coefficient".  I don't know how much thought was put
>> into those numbers, but the fact that the little cores have a super
>> round 100 for their dynamic-power-coefficient makes me feel like they
>> might have been more schwags than anything.  Rajendra maybe knows?
> 
> Yeah, it's possible that that was just an approximation

No, these are based on actual power measurements.

> 
>> * I'm curious about the fact that there are two numbers here: one for
>> littles and one for bigs.  If I had to guess I'd say that since all
>> the cores are in one package so the contributions kinda need to be
>> thought of together, right?  If we're sitting there thermally
>> throttled then we'd want to pick the best perf-per-watt for the
>> overall package.  This is why your patch says we can sustain the
>> little cores at max and the big cores get whatever is left over,
>> right?
> 
> It's derived from how Qualcomm specified the thermal zones and cooling
> devices. Any ("cpu") zone is either cooled by (all) big cores or by (all)
> little cores, but not a mix of them. In my tests I also saw that the big
> cores seemed to have little impact on the little ones. The little cores
> are at max because even running at max frequency the temperature in the
> 'little zones' wouldn't come close to the trip point.
> 
>> * Should we be leaving some room in here for the GPU?  ...or I guess
>> once we list it as a cooling device we'll have to decrease the amount
>> the CPUs can use?
> 
> I don't know for sure, but judging from the CPU zones I wouldn't be
> surprised if the GPU was managed exclusively in the dedicated GPU
> thermal zones (I guess that's what 'gpuss0-thermal' and 'gpuss1-thermal'
> are). If that's not the case the values in the CPU zones can be
> adjusted when specific data is available.
> 
>> So I guess the tl; dr is:
>>
>> a) We should check "dynamic-power-coefficient" and possibly adjust.
> 
> ok, lets see if Rajendra can check if there is room for tweaking.

I suggest we don't :)
Doug Anderson Sept. 2, 2020, 3:32 p.m. UTC | #8
Hi,

On Tue, Sep 1, 2020 at 10:36 PM Rajendra Nayak <rnayak@codeaurora.org> wrote:
>
>
> > * In terms of the numbers here, I believe that you're claiming that we
> > can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
> > of how much power we could dissipate in previous laptops I worked on
> > is a little fuzzy, but that doesn't seem insane for a passively-cooled
> > laptop.  However, I think someone could conceivably put this chip in a
> > smaller form factor.  In such a case, it seems like we'd want these
> > things to sum up to ~2000 (if it would ever make sense for someone to
> > put this chip in a phone) or ~4000 (if it would ever make sense for
> > someone to put this chip in a small tablet).  It seems possible that,
> > to achieve this, we might have to tweak the
> > "dynamic-power-coefficient".
>
> DPC values are calculated (at a SoC) by actually measuring max power at various
> frequency/voltage combinations by running things like dhrystone.
> How would the max power a SoC can generate depend on form factors?
> How much it can dissipate sure is, but then I am not super familiar how
> thermal frameworks end up using DPC for calculating power dissipated,
> I am guessing they don't.
>
> > I don't know how much thought was put
> > into those numbers, but the fact that the little cores have a super
> > round 100 for their dynamic-power-coefficient makes me feel like they
> > might have been more schwags than anything.  Rajendra maybe knows?
>
> FWIK, the values are always scaled and normalized to 100 for silver and
> then used to derive the relative DPC number for gold. If you see the DPC
> for silver cores even on sdm845 is a 100.
> Again these are not estimations but based on actual power measurements.

The scaling to 100 doesn't seem to match how the thermal framework is
using them.  Take a look at of_cpufreq_cooling_register().  It takes
the "dynamic-power-coefficient" and passes it as "capacitance" into
__cpufreq_cooling_register().  That's eventually used to compute
power, which is documented in the code to be in mW.

power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
do_div(power, 1000000000);

/* power is stored in mW */
freq_table[i].power = power;

That's used together with "sustainable-power", which is the attribute
that Matthias is trying to set.  That value is documented to be in mW
as well.

...so if the silver cores are always scaled to 100 regardless of how
much power they actually draw then it'll be impossible to actually
think about "sustainable-power" as a mW value.  Presumably we either
need to accept that fact (and ideally document it) or we need to
change the values for silver / gold cores (we could still keep the
relative values the same and just scale them).

-Doug
Rajendra Nayak Sept. 3, 2020, 4:44 a.m. UTC | #9
On 9/2/2020 9:02 PM, Doug Anderson wrote:
> Hi,
> 
> On Tue, Sep 1, 2020 at 10:36 PM Rajendra Nayak <rnayak@codeaurora.org> wrote:
>>
>>
>>> * In terms of the numbers here, I believe that you're claiming that we
>>> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
>>> of how much power we could dissipate in previous laptops I worked on
>>> is a little fuzzy, but that doesn't seem insane for a passively-cooled
>>> laptop.  However, I think someone could conceivably put this chip in a
>>> smaller form factor.  In such a case, it seems like we'd want these
>>> things to sum up to ~2000 (if it would ever make sense for someone to
>>> put this chip in a phone) or ~4000 (if it would ever make sense for
>>> someone to put this chip in a small tablet).  It seems possible that,
>>> to achieve this, we might have to tweak the
>>> "dynamic-power-coefficient".
>>
>> DPC values are calculated (at a SoC) by actually measuring max power at various
>> frequency/voltage combinations by running things like dhrystone.
>> How would the max power a SoC can generate depend on form factors?
>> How much it can dissipate sure is, but then I am not super familiar how
>> thermal frameworks end up using DPC for calculating power dissipated,
>> I am guessing they don't.
>>
>>> I don't know how much thought was put
>>> into those numbers, but the fact that the little cores have a super
>>> round 100 for their dynamic-power-coefficient makes me feel like they
>>> might have been more schwags than anything.  Rajendra maybe knows?
>>
>> FWIK, the values are always scaled and normalized to 100 for silver and
>> then used to derive the relative DPC number for gold. If you see the DPC
>> for silver cores even on sdm845 is a 100.
>> Again these are not estimations but based on actual power measurements.
> 
> The scaling to 100 doesn't seem to match how the thermal framework is
> using them.  Take a look at of_cpufreq_cooling_register().  It takes
> the "dynamic-power-coefficient" and passes it as "capacitance" into
> __cpufreq_cooling_register().  That's eventually used to compute
> power, which is documented in the code to be in mW.
> 
> power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> do_div(power, 1000000000);
> 
> /* power is stored in mW */
> freq_table[i].power = power;
> 
> That's used together with "sustainable-power", which is the attribute
> that Matthias is trying to set.  That value is documented to be in mW
> as well.
> 
> ...so if the silver cores are always scaled to 100 regardless of how
> much power they actually draw then it'll be impossible to actually
> think about "sustainable-power" as a mW value.  Presumably we either
> need to accept that fact (and ideally document it) or we need to
> change the values for silver / gold cores (we could still keep the
> relative values the same and just scale them).

That sounds reasonable (still keep the relative values and scale them)
I'll get back on what those scaled numbers would look like, and try to
get some sense of why this scaling to 100 was done (like you said
I don't see any documentation on this), but I see atleast a few other non-qcom
SoCs doing this too in mainline (like rockchip/rk3399)
Rajendra Nayak Sept. 3, 2020, 5:30 a.m. UTC | #10
On 9/3/2020 10:14 AM, Rajendra Nayak wrote:
> 
> On 9/2/2020 9:02 PM, Doug Anderson wrote:
>> Hi,
>>
>> On Tue, Sep 1, 2020 at 10:36 PM Rajendra Nayak <rnayak@codeaurora.org> wrote:
>>>
>>>
>>>> * In terms of the numbers here, I believe that you're claiming that we
>>>> can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
>>>> of how much power we could dissipate in previous laptops I worked on
>>>> is a little fuzzy, but that doesn't seem insane for a passively-cooled
>>>> laptop.  However, I think someone could conceivably put this chip in a
>>>> smaller form factor.  In such a case, it seems like we'd want these
>>>> things to sum up to ~2000 (if it would ever make sense for someone to
>>>> put this chip in a phone) or ~4000 (if it would ever make sense for
>>>> someone to put this chip in a small tablet).  It seems possible that,
>>>> to achieve this, we might have to tweak the
>>>> "dynamic-power-coefficient".
>>>
>>> DPC values are calculated (at a SoC) by actually measuring max power at various
>>> frequency/voltage combinations by running things like dhrystone.
>>> How would the max power a SoC can generate depend on form factors?
>>> How much it can dissipate sure is, but then I am not super familiar how
>>> thermal frameworks end up using DPC for calculating power dissipated,
>>> I am guessing they don't.
>>>
>>>> I don't know how much thought was put
>>>> into those numbers, but the fact that the little cores have a super
>>>> round 100 for their dynamic-power-coefficient makes me feel like they
>>>> might have been more schwags than anything.  Rajendra maybe knows?
>>>
>>> FWIK, the values are always scaled and normalized to 100 for silver and
>>> then used to derive the relative DPC number for gold. If you see the DPC
>>> for silver cores even on sdm845 is a 100.
>>> Again these are not estimations but based on actual power measurements.
>>
>> The scaling to 100 doesn't seem to match how the thermal framework is
>> using them.  Take a look at of_cpufreq_cooling_register().  It takes
>> the "dynamic-power-coefficient" and passes it as "capacitance" into
>> __cpufreq_cooling_register().  That's eventually used to compute
>> power, which is documented in the code to be in mW.
>>
>> power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
>> do_div(power, 1000000000);
>>
>> /* power is stored in mW */
>> freq_table[i].power = power;
>>
>> That's used together with "sustainable-power", which is the attribute
>> that Matthias is trying to set.  That value is documented to be in mW
>> as well.
>>
>> ...so if the silver cores are always scaled to 100 regardless of how
>> much power they actually draw then it'll be impossible to actually
>> think about "sustainable-power" as a mW value.  Presumably we either
>> need to accept that fact (and ideally document it) or we need to
>> change the values for silver / gold cores (we could still keep the
>> relative values the same and just scale them).
> 
> That sounds reasonable (still keep the relative values and scale them)
> I'll get back on what those scaled numbers would look like, and try to
> get some sense of why this scaling to 100 was done (like you said
> I don't see any documentation on this), but I see atleast a few other non-qcom
> SoCs doing this too in mainline (like rockchip/rk3399)

On second thoughts, why wouldn't a relative 'sustainable-power' value work?
On every device, one would need to do the exercise that Matthias did to come
up with the OPP at which we can sustain max CPU/GPU loads anyway.
I mean even if we do change the DPC values to match actual power, Matthias would
still observe that we can sustain at the very same OPP and not any different.
Its just that the mW values that are passed to kernel are relative and not
absolute. My worry is that perhaps no SoC vendor wants to put these absolute numbers
out.
Matthias Kaehlcke Sept. 3, 2020, 12:17 p.m. UTC | #11
Hi Rajendra,

On Thu, Sep 03, 2020 at 11:00:52AM +0530, Rajendra Nayak wrote:
> 
> On 9/3/2020 10:14 AM, Rajendra Nayak wrote:
> > 
> > On 9/2/2020 9:02 PM, Doug Anderson wrote:
> > > Hi,
> > > 
> > > On Tue, Sep 1, 2020 at 10:36 PM Rajendra Nayak <rnayak@codeaurora.org> wrote:
> > > > 
> > > > 
> > > > > * In terms of the numbers here, I believe that you're claiming that we
> > > > > can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
> > > > > of how much power we could dissipate in previous laptops I worked on
> > > > > is a little fuzzy, but that doesn't seem insane for a passively-cooled
> > > > > laptop.  However, I think someone could conceivably put this chip in a
> > > > > smaller form factor.  In such a case, it seems like we'd want these
> > > > > things to sum up to ~2000 (if it would ever make sense for someone to
> > > > > put this chip in a phone) or ~4000 (if it would ever make sense for
> > > > > someone to put this chip in a small tablet).  It seems possible that,
> > > > > to achieve this, we might have to tweak the
> > > > > "dynamic-power-coefficient".
> > > > 
> > > > DPC values are calculated (at a SoC) by actually measuring max power at various
> > > > frequency/voltage combinations by running things like dhrystone.
> > > > How would the max power a SoC can generate depend on form factors?
> > > > How much it can dissipate sure is, but then I am not super familiar how
> > > > thermal frameworks end up using DPC for calculating power dissipated,
> > > > I am guessing they don't.
> > > > 
> > > > > I don't know how much thought was put
> > > > > into those numbers, but the fact that the little cores have a super
> > > > > round 100 for their dynamic-power-coefficient makes me feel like they
> > > > > might have been more schwags than anything.  Rajendra maybe knows?
> > > > 
> > > > FWIK, the values are always scaled and normalized to 100 for silver and
> > > > then used to derive the relative DPC number for gold. If you see the DPC
> > > > for silver cores even on sdm845 is a 100.
> > > > Again these are not estimations but based on actual power measurements.
> > > 
> > > The scaling to 100 doesn't seem to match how the thermal framework is
> > > using them.  Take a look at of_cpufreq_cooling_register().  It takes
> > > the "dynamic-power-coefficient" and passes it as "capacitance" into
> > > __cpufreq_cooling_register().  That's eventually used to compute
> > > power, which is documented in the code to be in mW.
> > > 
> > > power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> > > do_div(power, 1000000000);
> > > 
> > > /* power is stored in mW */
> > > freq_table[i].power = power;
> > > 
> > > That's used together with "sustainable-power", which is the attribute
> > > that Matthias is trying to set.  That value is documented to be in mW
> > > as well.
> > > 
> > > ...so if the silver cores are always scaled to 100 regardless of how
> > > much power they actually draw then it'll be impossible to actually
> > > think about "sustainable-power" as a mW value.  Presumably we either
> > > need to accept that fact (and ideally document it) or we need to
> > > change the values for silver / gold cores (we could still keep the
> > > relative values the same and just scale them).
> > 
> > That sounds reasonable (still keep the relative values and scale them)
> > I'll get back on what those scaled numbers would look like, and try to
> > get some sense of why this scaling to 100 was done (like you said
> > I don't see any documentation on this), but I see atleast a few other non-qcom
> > SoCs doing this too in mainline (like rockchip/rk3399)
> 
> On second thoughts, why wouldn't a relative 'sustainable-power' value work?
> On every device, one would need to do the exercise that Matthias did to come
> up with the OPP at which we can sustain max CPU/GPU loads anyway.

You assume that a thermal zone only has cooling devices of a the same type (or
with the same fake unit for power consumption). This falls apart when multiple
types are used, which is common.

Also sustainable power is only a derived value, the lying already starts in
the energy model, which is used by EAS, so a fake unit could cause further
problems.

> I mean even if we do change the DPC values to match actual power, Matthias would
> still observe that we can sustain at the very same OPP and not any different.
> Its just that the mW values that are passed to kernel are relative and not
> absolute. My worry is that perhaps no SoC vendor wants to put these absolute numbers
> out.

This is pretty much 'security' by obscurity. It would be relatively easy to
measure actual power consumption at different CPU speeds and derive the DPC
values from that.
Doug Anderson Sept. 3, 2020, 3:12 p.m. UTC | #12
Hi,

On Thu, Sep 3, 2020 at 5:17 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> Hi Rajendra,
>
> On Thu, Sep 03, 2020 at 11:00:52AM +0530, Rajendra Nayak wrote:
> >
> > On 9/3/2020 10:14 AM, Rajendra Nayak wrote:
> > >
> > > On 9/2/2020 9:02 PM, Doug Anderson wrote:
> > > > Hi,
> > > >
> > > > On Tue, Sep 1, 2020 at 10:36 PM Rajendra Nayak <rnayak@codeaurora.org> wrote:
> > > > >
> > > > >
> > > > > > * In terms of the numbers here, I believe that you're claiming that we
> > > > > > can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
> > > > > > of how much power we could dissipate in previous laptops I worked on
> > > > > > is a little fuzzy, but that doesn't seem insane for a passively-cooled
> > > > > > laptop.  However, I think someone could conceivably put this chip in a
> > > > > > smaller form factor.  In such a case, it seems like we'd want these
> > > > > > things to sum up to ~2000 (if it would ever make sense for someone to
> > > > > > put this chip in a phone) or ~4000 (if it would ever make sense for
> > > > > > someone to put this chip in a small tablet).  It seems possible that,
> > > > > > to achieve this, we might have to tweak the
> > > > > > "dynamic-power-coefficient".
> > > > >
> > > > > DPC values are calculated (at a SoC) by actually measuring max power at various
> > > > > frequency/voltage combinations by running things like dhrystone.
> > > > > How would the max power a SoC can generate depend on form factors?
> > > > > How much it can dissipate sure is, but then I am not super familiar how
> > > > > thermal frameworks end up using DPC for calculating power dissipated,
> > > > > I am guessing they don't.
> > > > >
> > > > > > I don't know how much thought was put
> > > > > > into those numbers, but the fact that the little cores have a super
> > > > > > round 100 for their dynamic-power-coefficient makes me feel like they
> > > > > > might have been more schwags than anything.  Rajendra maybe knows?
> > > > >
> > > > > FWIK, the values are always scaled and normalized to 100 for silver and
> > > > > then used to derive the relative DPC number for gold. If you see the DPC
> > > > > for silver cores even on sdm845 is a 100.
> > > > > Again these are not estimations but based on actual power measurements.
> > > >
> > > > The scaling to 100 doesn't seem to match how the thermal framework is
> > > > using them.  Take a look at of_cpufreq_cooling_register().  It takes
> > > > the "dynamic-power-coefficient" and passes it as "capacitance" into
> > > > __cpufreq_cooling_register().  That's eventually used to compute
> > > > power, which is documented in the code to be in mW.
> > > >
> > > > power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> > > > do_div(power, 1000000000);
> > > >
> > > > /* power is stored in mW */
> > > > freq_table[i].power = power;
> > > >
> > > > That's used together with "sustainable-power", which is the attribute
> > > > that Matthias is trying to set.  That value is documented to be in mW
> > > > as well.
> > > >
> > > > ...so if the silver cores are always scaled to 100 regardless of how
> > > > much power they actually draw then it'll be impossible to actually
> > > > think about "sustainable-power" as a mW value.  Presumably we either
> > > > need to accept that fact (and ideally document it) or we need to
> > > > change the values for silver / gold cores (we could still keep the
> > > > relative values the same and just scale them).
> > >
> > > That sounds reasonable (still keep the relative values and scale them)
> > > I'll get back on what those scaled numbers would look like, and try to
> > > get some sense of why this scaling to 100 was done (like you said
> > > I don't see any documentation on this), but I see atleast a few other non-qcom
> > > SoCs doing this too in mainline (like rockchip/rk3399)

I don't think I was too closely involved in these numbers on rk3399,
but as far as I can tell the 100 number came from:

https://crrev.com/c/364003

...interestingly enough the number _wasn't_ scaled to 100 (but was a
number close to 100) and then was changed to scale to 100.  That makes
it seem like 100, though awfully round, was at least based loosely on
fact for rk3399.

In any case, the devicetree bindings make it pretty clear that this
value should be based in reality and not some bogus number.


> > On second thoughts, why wouldn't a relative 'sustainable-power' value work?
> > On every device, one would need to do the exercise that Matthias did to come
> > up with the OPP at which we can sustain max CPU/GPU loads anyway.
>
> You assume that a thermal zone only has cooling devices of a the same type (or
> with the same fake unit for power consumption). This falls apart when multiple
> types are used, which is common.
>
> Also sustainable power is only a derived value, the lying already starts in
> the energy model, which is used by EAS, so a fake unit could cause further
> problems.
>
> > I mean even if we do change the DPC values to match actual power, Matthias would
> > still observe that we can sustain at the very same OPP and not any different.
> > Its just that the mW values that are passed to kernel are relative and not
> > absolute. My worry is that perhaps no SoC vendor wants to put these absolute numbers
> > out.
>
> This is pretty much 'security' by obscurity. It would be relatively easy to
> measure actual power consumption at different CPU speeds and derive the DPC
> values from that.

Right, I was going to say that.  Specifically:

* Anyone that actually gets one of these chips can just measure it
pretty trivially.  Run the core at a certain speed and measure with
the smart battery.  Run at a different speed and measure again.

* Presumably the power consumption of different types of cores in
Qualcomm SoCs of the same generation is roughly equivalent.  So I
could go and grab a Pixel 4a and put AOSP on it and measure the power
consumption and presumably get pretty close numbers for big and little
power coefficients.  I don't know for sure if Pixel 4a's SoC is
officially the same generation but I'd bet it's close.

* Presumably someone would be able to get a pretty good guess by
figuring out the form factor and working backwards.  It sounds as if
thermal dissipation (in terms of Watts) for various form factor
devices is somewhat standard.  Maybe this is more so for phones /
tablets than laptops which might have bigger heat pipes or active
cooling, but still.  Someone could do the math pretty easily.

I guess if you're really worried about protecting this then you can
delay posting it for brand new chipsets using a new type of technology
until product is almost ready to ship, but for sc7180 it doesn't feel
like this is something worth fighting about.

-Doug
diff mbox series

Patch

diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index d46b3833e52f..23f84639d6b9 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -3320,6 +3320,7 @@  cpu0-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 1>;
+			sustainable-power = <768>;
 
 			trips {
 				cpu0_alert0: trip-point0 {
@@ -3368,6 +3369,7 @@  cpu1-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 2>;
+			sustainable-power = <768>;
 
 			trips {
 				cpu1_alert0: trip-point0 {
@@ -3416,6 +3418,7 @@  cpu2-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 3>;
+			sustainable-power = <768>;
 
 			trips {
 				cpu2_alert0: trip-point0 {
@@ -3464,6 +3467,7 @@  cpu3-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 4>;
+			sustainable-power = <768>;
 
 			trips {
 				cpu3_alert0: trip-point0 {
@@ -3512,6 +3516,7 @@  cpu4-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 5>;
+			sustainable-power = <768>;
 
 			trips {
 				cpu4_alert0: trip-point0 {
@@ -3560,6 +3565,7 @@  cpu5-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 6>;
+			sustainable-power = <768>;
 
 			trips {
 				cpu5_alert0: trip-point0 {
@@ -3608,6 +3614,7 @@  cpu6-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 9>;
+			sustainable-power = <1202>;
 
 			trips {
 				cpu6_alert0: trip-point0 {
@@ -3648,6 +3655,7 @@  cpu7-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 10>;
+			sustainable-power = <1202>;
 
 			trips {
 				cpu7_alert0: trip-point0 {
@@ -3688,6 +3696,7 @@  cpu8-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 11>;
+			sustainable-power = <1202>;
 
 			trips {
 				cpu8_alert0: trip-point0 {
@@ -3728,6 +3737,7 @@  cpu9-thermal {
 			polling-delay = <0>;
 
 			thermal-sensors = <&tsens0 12>;
+			sustainable-power = <1202>;
 
 			trips {
 				cpu9_alert0: trip-point0 {