[v2] arm64: dts: rockchip: Add GPU OPP voltage ranges to RK356x SoC dtsi

Message ID bdb60f1f793166cd65f58ab7aea025347076019c.1719679068.git.dsimic@manjaro.org (mailing list archive)
State New, archived

Commit Message

Dragan Simic June 29, 2024, 4:39 p.m. UTC
Add support for voltage ranges to the GPU OPPs defined in the SoC dtsi for
RK356x.  These voltage ranges are useful for RK356x-based boards that are
designed to use the same power supply for the GPU and NPU portions of the
SoC, which is described further in the following documents:

  - Rockchip RK3566 Hardware Design Guide, version 1.1.0, page 37
  - Rockchip RK3568 Hardware Design Guide, version 1.2, page 78

The values for the exact GPU OPP voltages and the lower limits for the GPU
OPP voltage ranges differ from the values found in the vendor kernel source
(cf. downstream commit f8b9431ee38e ("arm64: dts: rockchip: rk3568: support
adjust opp-table by otp")). [1][2]  However, our values have served us well
so far, so let's keep them for now, until we actually start supporting the
CPU and GPU binning, together with the related voltage adjustments.

[1] https://github.com/rockchip-linux/kernel/commit/f8b9431ee38ed561650be7092ab93f564598daa9
[2] https://raw.githubusercontent.com/rockchip-linux/kernel/f8b9431ee38ed561650be7092ab93f564598daa9/arch/arm64/boot/dts/rockchip/rk3568.dtsi

Suggested-by: Diederik de Haas <didi.debian@cknow.org>
Helped-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---

Notes:
    Changes in v2:
      - Dropped support for optional GPU OPP voltage ranges, which may
        actually hide some misconfiguration issues in board dts files, [3]
        but that will be covered by other debugging patches [4]
    
    Link to v1: https://lore.kernel.org/linux-rockchip/446399362bd2dbeeaecd8351f68811165429749a.1719637113.git.dsimic@manjaro.org/T/#u
    
    [3] https://lore.kernel.org/linux-rockchip/f10d5a3c425c2c4312512c20bd35073c@manjaro.org/
    [4] https://lore.kernel.org/linux-rockchip/36170f8485293b336106e92346478daa@manjaro.org/

 arch/arm64/boot/dts/rockchip/rk356x.dtsi | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Diederik de Haas June 29, 2024, 10:01 p.m. UTC | #1
On Saturday, 29 June 2024 18:39:02 CEST Dragan Simic wrote:
> Add support for voltage ranges to the GPU OPPs defined in the SoC dtsi for
> RK356x.  These voltage ranges are useful for RK356x-based boards that are
> designed to use the same power supply for the GPU and NPU portions of the
> SoC, which is described further in the following documents:
> 
>   - Rockchip RK3566 Hardware Design Guide, version 1.1.0, page 37
>   - Rockchip RK3568 Hardware Design Guide, version 1.2, page 78

That was interesting to read, thanks.
Now I understand the difference between rk809(-5) and rk817(-5).

But AFAIUI the above description described why there were separate tables for 
rk809 and rk817 in v1. But that was dropped in v2. So it seems to me the 
(commit) message should be updated accordingly?

I also expected that (for v1) there would be a similar construct as was 
recently added for rk3588. But I should interpret Heiko's comments as that 
strategy should not be applied to rk356x?

> The values for the exact GPU OPP voltages and the lower limits for the GPU
> OPP voltage ranges differ from the values found in the vendor kernel source
> (cf. downstream commit f8b9431ee38e ("arm64: dts: rockchip: rk3568: support
> adjust opp-table by otp")). [1][2]  

Why? In their latest update Rockchip changed it to the values as specified in 
the links. My assumption is that based on extensive testing they did and/or 
the feedback they got from the client/customers, they felt the need to change 
it to the values they did.

I think we should follow their values unless we have an explicit and very good 
reason to deviate from that.

> However, our values have served us well so far, so let's keep them for now,

And I don't think that qualifies as a (very) good reason.
I think it's reasonable to assume that far more (stress) testing has been done 
with the downstream code than has happened with the upstream code.
Hopefully that'll change in the future, but I don't think we're there yet.

When we/upstream adds npu support, I think we should also follow downstream's 
OPP values, unless we have a very good reason to deviate from that.

> until we actually start supporting the CPU and GPU binning, together with
> the related voltage adjustments.

I may not fully understand what you mean by that, but I think it's (again) 
reasonable to assume that Rockchip has far more insight into this than we do.

Cheers,
  Diederik

> [1] https://github.com/rockchip-linux/kernel/commit/f8b9431ee38ed561650be7092ab93f564598daa9
> [2] https://raw.githubusercontent.com/rockchip-linux/kernel/f8b9431ee38ed561650be7092ab93f564598daa9/arch/arm64/boot/dts/rockchip/rk3568.dtsi
Heiko Stuebner June 30, 2024, 9:07 a.m. UTC | #2
Am Sonntag, 30. Juni 2024, 00:01:41 CEST schrieb Diederik de Haas:
> On Saturday, 29 June 2024 18:39:02 CEST Dragan Simic wrote:
> > Add support for voltage ranges to the GPU OPPs defined in the SoC dtsi for
> > RK356x.  These voltage ranges are useful for RK356x-based boards that are
> > designed to use the same power supply for the GPU and NPU portions of the
> > SoC, which is described further in the following documents:
> > 
> >   - Rockchip RK3566 Hardware Design Guide, version 1.1.0, page 37
> >   - Rockchip RK3568 Hardware Design Guide, version 1.2, page 78
> 
> That was interesting to read, thanks.
> Now I understand the difference between rk809(-5) and rk817(-5).
> 
> But AFAIUI the above description described why there were separate tables for 
> rk809 and rk817 in v1. But that was dropped in v2. So it seems to me the 
> (commit) message should be updated accordingly?
> 
> I also expected that (for v1) there would be a similar construct as was 
> recently added for rk3588. But I should interpret Heiko's comments as that 
> strategy should not be applied to rk356x?

The issue I had was more about the #ifdef'ery and then having a board define
a constant to enable one or the other.

As far as I understood the description, the OPP itself is the same in
terms of frequency and voltage, just the regulator can't fully realize
that target voltage, so the solution is to allow a voltage range, to
also support the less-exact regulator.

On the rk3588, on the other hand, the soc variants have different OPP
tables themselves, because the soc itself only supports different
frequencies+voltages. So the solution here is the split of the OPPs so
that we don't mess around with /delete-node/ edits of one OPP table.

So TL;DR: separate OPP tables are the way to go if the user needs different
freq+voltage values, and voltage ranges allow boards to use less-adapted
regulators.
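
To picture why a range helps when one rail feeds two consumers, here is a
toy Python sketch (not kernel code). It assumes that a voltage has to be
found which every consumer of the shared supply accepts. The helper name
shared_supply_window and the 900 mV NPU request are made up for
illustration; only the GPU values come from the patch.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # (min_uV, max_uV) a consumer is willing to accept


def shared_supply_window(gpu: Range, npu: Range) -> Optional[Range]:
    """Return the voltage window acceptable to both consumers, or None."""
    low = max(gpu[0], npu[0])
    high = min(gpu[1], npu[1])
    return (low, high) if low <= high else None


# GPU OPP at 200 MHz before the patch (exact voltage) and after it (range).
exact_gpu = (825_000, 825_000)
ranged_gpu = (825_000, 1_000_000)

# Hypothetical NPU request, assumed only for this example.
npu_request = (900_000, 900_000)

print(shared_supply_window(exact_gpu, npu_request))   # no common voltage
print(shared_supply_window(ranged_gpu, npu_request))  # a common window exists
```

With the exact value the two requests cannot be reconciled, while the
range leaves room for the rail to sit at the higher voltage.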


> > The values for the exact GPU OPP voltages and the lower limits for the GPU
> > OPP voltage ranges differ from the values found in the vendor kernel source
> > (cf. downstream commit f8b9431ee38e ("arm64: dts: rockchip: rk3568: support
> > adjust opp-table by otp")). [1][2]  
> 
> Why? In their latest update Rockchip changed it to the values as specified in 
> the links. My assumption is that based on extensive testing they did and/or 
> the feedback they got from the client/customers, they felt the need to change 
> it to the values they did.
> 
> I think we should follow their values unless we have an explicit and very good 
> reason to deviate from that.

Correct.
Values from some "random" Radxa kernel would also not be my choice.

In the mainline kernel we always want the safe choice, which for me
is Rockchip's. If people want to experiment with other values on their own
boards to sort of overclock their chips, that's their prerogative.


Heiko


> > However, our values have served us well so far, so let's keep them for now,
> 
> And I don't think that qualifies as a (very) good reason.
> I think it's reasonable to assume that far more (stress) testing has been done 
> with the downstream code than has happened with the upstream code.
> Hopefully that'll change in the future, but I don't think we're there yet.
> 
> When we/upstream adds npu support, I think we should also follow downstream's 
> OPP values, unless we have a very good reason to deviate from that.
> 
> > until we actually start supporting the CPU and GPU binning, together with
> > the related voltage adjustments.
> 
> I may not fully understand what you mean by that, but I think it's (again) 
> reasonable to assume that Rockchip has far more insight into this than we do.
> 
> Cheers,
>   Diederik
> 
> > [1] https://github.com/rockchip-linux/kernel/commit/f8b9431ee38ed561650be7092ab93f564598daa9
> > [2] https://raw.githubusercontent.com/rockchip-linux/kernel/f8b9431ee38ed561650be7092ab93f564598daa9/arch/arm64/boot/dts/rockchip/rk3568.dtsi
>
Diederik de Haas June 30, 2024, 11:53 a.m. UTC | #3
On Sunday, 30 June 2024 11:07:47 CEST Heiko Stübner wrote:
> Am Sonntag, 30. Juni 2024, 00:01:41 CEST schrieb Diederik de Haas:
> > On Saturday, 29 June 2024 18:39:02 CEST Dragan Simic wrote:
> > > Add support for voltage ranges to the GPU OPPs defined in the SoC
> > > dtsi for RK356x.  These voltage ranges are useful for RK356x-based
> > > boards that are designed to use the same power supply for the GPU
> > > and NPU portions of the SoC, which is described further in the
> > > following documents:
> > >   - Rockchip RK3566 Hardware Design Guide, version 1.1.0, page 37
> > >   - Rockchip RK3568 Hardware Design Guide, version 1.2, page 78
> > 
> > That was interesting to read, thanks.
> > Now I understand the difference between rk809(-5) and rk817(-5).
> > 
> > But AFAIUI the above description described why there were separate tables
> > for rk809 and rk817 in v1. But that was dropped in v2. So it seems to me
> > the (commit) message should be updated accordingly?
> > 
> > I also expected that (for v1) there would be a similar construct as was
> > recently added for rk3588. But I should interpret Heiko's comments as that
> > strategy should not be applied to rk356x?
> 
> The issue I had was more about the #ifdef'ery and then having a board define
> a constant to enable one or the other.

Yeah, I had some thoughts about that too, but by the time I was ready to 
respond to that, there was v2, so that became irrelevant.

> As far as I understood the description, the OPP itself is the same in
> terms of frequency and voltage, just the regulator can't fully realize
> that target voltage, so the solution is to allow a voltage range, to
> also support the less-exact regulator.
> 
> On the rk3588, on the other hand, the soc variants have different OPP
> tables themselves, because the soc itself only supports different
> frequencies+voltages. So the solution here is the split of the OPPs so
> that we don't mess around with /delete-node/ edits of one OPP table.
> 
> So TL;DR: separate OPP tables are the way to go if the user needs different
> freq+voltage values, and voltage ranges allow boards to use less-adapted
> regulators.

Thanks for the explanation.

One of the things I researched was whether there was a different OPP table
in Rockchip's rk3566.dtsi (and then the assumption that RK817 = RK3566 and
RK809 = RK3568, which would be flawed/incorrect). But there wasn't.

Cheers,
  Diederik
Dragan Simic June 30, 2024, 12:04 p.m. UTC | #4
Hello Diederik,

On 2024-06-30 00:01, Diederik de Haas wrote:
> On Saturday, 29 June 2024 18:39:02 CEST Dragan Simic wrote:
>> Add support for voltage ranges to the GPU OPPs defined in the SoC dtsi for
>> RK356x.  These voltage ranges are useful for RK356x-based boards that are
>> designed to use the same power supply for the GPU and NPU portions of the
>> SoC, which is described further in the following documents:
>> 
>>   - Rockchip RK3566 Hardware Design Guide, version 1.1.0, page 37
>>   - Rockchip RK3568 Hardware Design Guide, version 1.2, page 78
> 
> That was interesting to read, thanks.
> Now I understand the difference between rk809(-5) and rk817(-5).

I'm glad it was useful. :)

> But AFAIUI the above description described why there were separate tables for
> rk809 and rk817 in v1. But that was dropped in v2. So it seems to me the
> (commit) message should be updated accordingly?

I also thought about removing that description in the v2, but it actually
doesn't hurt to provide an example of what the GPU OPP voltage ranges are
useful for.

> I also expected that (for v1) there would be a similar construct as was
> recently added for rk3588. But I should interpret Heiko's comments as that
> strategy should not be applied to rk356x?

The trouble with applying the same strategy, which was the initial plan
for the v1, is that the need for voltage ranges depends on one of the board
features, i.e. the GPU and NPU voltage regulators.  As such, it still has
to affect the RK356x SoC dtsi, which may warrant separate
rk356x-gpu-range.dtsi, for example, but the troubles would arise later if
we had another similar dtsi variant, because we'd then have to split the
SoC dtsi into four variants, which would hardly be warranted or sustainable.

That's why the v1 went with a macro instead.  However, there are already
numerous unresolved examples of what that macro tries to solve in the RK3399
SoC dtsi files, so the conclusion was that we need a more systemic solution,
which will be the upcoming debugging facilities in the OPP handling.  Those
facilities will allow us to detect possible issues with the misconfigured
DT voltages on all SoCs and boards, which the v1 macro would have solved in
another way, but only for the RK356x.
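
As a rough idea of the kind of check such a debugging facility might
perform, here is a small Python sketch. It is only a guess at the concept
and not the actual upcoming patches: the function find_unreachable_opps
and the 900..1000 mV board regulator limits are hypothetical, while the
OPP voltages are the RK356x GPU values before and after this patch.

```python
from typing import Dict, List, Tuple

Opp = Tuple[int, int, int]  # (target_uV, min_uV, max_uV)


def find_unreachable_opps(opps: Dict[int, Opp],
                          reg_min_uv: int, reg_max_uv: int) -> List[int]:
    """List the OPP frequencies whose voltage range misses the regulator limits."""
    unreachable = []
    for freq_hz, (_target, low, high) in sorted(opps.items()):
        # No overlap between [low, high] and [reg_min_uv, reg_max_uv].
        if high < reg_min_uv or low > reg_max_uv:
            unreachable.append(freq_hz)
    return unreachable


# GPU OPPs with exact voltages, as they were before the patch.
exact_opps = {
    200_000_000: (825_000, 825_000, 825_000),
    300_000_000: (825_000, 825_000, 825_000),
    400_000_000: (825_000, 825_000, 825_000),
    600_000_000: (825_000, 825_000, 825_000),
    700_000_000: (900_000, 900_000, 900_000),
    800_000_000: (1_000_000, 1_000_000, 1_000_000),
}

# The same OPPs with the upper limit raised to 1.0 V, as in the patch.
ranged_opps = {freq: (t, lo, 1_000_000) for freq, (t, lo, _hi) in exact_opps.items()}

# Hypothetical board regulator limited to 0.9 V .. 1.0 V.
print(find_unreachable_opps(exact_opps, 900_000, 1_000_000))
print(find_unreachable_opps(ranged_opps, 900_000, 1_000_000))
```

On such a made-up board the four lowest OPPs would be flagged with the
exact voltages, and none would be flagged with the ranges.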

>> The values for the exact GPU OPP voltages and the lower limits for the GPU
>> OPP voltage ranges differ from the values found in the vendor kernel source
>> (cf. downstream commit f8b9431ee38e ("arm64: dts: rockchip: rk3568: support
>> adjust opp-table by otp")). [1][2]
> 
> Why? In their latest update Rockchip changed it to the values as specified in
> the links. My assumption is that based on extensive testing they did and/or
> the feedback they got from the client/customers, they felt the need to change
> it to the values they did.
> 
> I think we should follow their values unless we have an explicit and very good
> reason to deviate from that.

There's a rather good reason, which was provided in the patch description
right below, but I can see you've already disagreed with it. :)

>> However, our values have served us well so far, so let's keep them for now,
> 
> And I don't think that qualifies as a (very) good reason.
> I think it's reasonable to assume that far more (stress) testing has been done
> with the downstream code than has happened with the upstream code.
> Hopefully that'll change in the future, but I don't think we're there yet.

The key in the patch description is "for now". :)  I'd much rather leave
the exact voltages unchanged for now, and get that covered a bit later, either
in a separate follow-up patch (or in the v3 that would be a two-patch series,
as the patch 2/2), which would be good for possibly doing any regression
tracking later, or do it later as part of supporting the CPU and GPU binning.

> When we/upstream adds npu support, I think we should also follow downstream's
> OPP values, unless we have a very good reason to deviate from that.

That would make sense, especially because we haven't had the NPU supported
before in the mainline.

>> until we actually start supporting the CPU and GPU binning, together with
>> the related voltage adjustments.
> 
> I may not fully understand what you mean by that, but I think it's (again)
> reasonable to assume that Rockchip has far more insight into this than we do.

Basically, I meant that (my) plan is to work on supporting the CPU and GPU
binning, at which point the voltages would also be adjusted according to
the downstream.

>> [1] https://github.com/rockchip-linux/kernel/commit/f8b9431ee38ed561650be7092ab93f564598daa9
>> [2] https://raw.githubusercontent.com/rockchip-linux/kernel/f8b9431ee38ed561650be7092ab93f564598daa9/arch/arm64/boot/dts/rockchip/rk3568.dtsi
Diederik de Haas June 30, 2024, 3:43 p.m. UTC | #5
Hi Dragan,

On Sunday, 30 June 2024 14:04:50 CEST Dragan Simic wrote:
> > I also expected that (for v1) there would be a similar construct as was
> > recently added for rk3588. But I should interpret Heiko's comments as
> > that strategy should not be applied to rk356x?
> 
> The trouble with applying the same strategy, ...

One of the reasons I like/hoped for it is that I'm a 'sucker' for consistency.

> ... the need for voltage ranges depends on one of the board features,
> i.e. the GPU and NPU voltage regulators.  As such, it still has to
> affect the RK356x SoC dtsi, which may warrant separate
> rk356x-gpu-range.dtsi, for example, but the troubles would arise ...

... but it's probably better if I (generally) abstain from taking part
in the discussion about the correct/desired implementation as I don't
understand the material in enough detail to meaningfully contribute.

> That's why the v1 went with a macro instead.

... which didn't seem to help with my consistency wish ;-)
(AFAIC there's no need to discuss this further (publicly))

> > When we/upstream adds npu support, I think we should also follow
> > downstream's OPP values, unless we have a very good reason to
> > deviate from that.
> 
> That would make sense, especially because we haven't had the NPU
> supported before in the mainline.

I first wondered why you hadn't *updated* the npu OPP values ... 
to later find out they haven't been specified at all in 'upstream'.

Cheers,
  Diederik
Dragan Simic June 30, 2024, 3:51 p.m. UTC | #6
Hello Diederik,

On 2024-06-30 17:43, Diederik de Haas wrote:
> On Sunday, 30 June 2024 14:04:50 CEST Dragan Simic wrote:
>> > I also expected that (for v1) there would be a similar construct as was
>> > recently added for rk3588. But I should interpret Heiko's comments as
>> > that strategy should not be applied to rk356x?
>> 
>> The trouble with applying the same strategy, ...
> 
> One of the reasons I like/hoped for it is that I'm a 'sucker' for
> consistency.

I also like consistency, but doing it that way simply wasn't feasible
in this case.  Maybe I'll rework the RK3399 SoC dtsi files a bit, so
we'd end up with more overall consistency. :)

>> ... the need for voltage ranges depends on one of the board features,
>> i.e. the GPU and NPU voltage regulators.  As such, it still has to
>> affect the RK356x SoC dtsi, which may warrant separate
>> rk356x-gpu-range.dtsi, for example, but the troubles would arise ...
> 
> ... but it's probably better if I (generally) abstain from taking part
> in the discussion about the correct/desired implementation as I don't
> understand the material in enough detail to meaningfully contribute.

I find your responses useful, so as far as I'm concerned, you're more
than welcome to take part in the discussions.

Patch

diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
index d8543b5557ee..ec772bce359a 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
@@ -195,32 +195,32 @@  gpu_opp_table: opp-table-1 {
 
 		opp-200000000 {
 			opp-hz = /bits/ 64 <200000000>;
-			opp-microvolt = <825000>;
+			opp-microvolt = <825000 825000 1000000>;
 		};
 
 		opp-300000000 {
 			opp-hz = /bits/ 64 <300000000>;
-			opp-microvolt = <825000>;
+			opp-microvolt = <825000 825000 1000000>;
 		};
 
 		opp-400000000 {
 			opp-hz = /bits/ 64 <400000000>;
-			opp-microvolt = <825000>;
+			opp-microvolt = <825000 825000 1000000>;
 		};
 
 		opp-600000000 {
 			opp-hz = /bits/ 64 <600000000>;
-			opp-microvolt = <825000>;
+			opp-microvolt = <825000 825000 1000000>;
 		};
 
 		opp-700000000 {
 			opp-hz = /bits/ 64 <700000000>;
-			opp-microvolt = <900000>;
+			opp-microvolt = <900000 900000 1000000>;
 		};
 
 		opp-800000000 {
 			opp-hz = /bits/ 64 <800000000>;
-			opp-microvolt = <1000000>;
+			opp-microvolt = <1000000 1000000 1000000>;
 		};
 	};
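
The triplets in the hunk above use the <target min max> order of the
opp-microvolt property. As a simplified illustration of how such a
triplet could be resolved against a regulator with discrete output
steps, here is a Python sketch. It is a model written for this note, not
kernel code: pick_voltage and the 50 mV step list are hypothetical.

```python
from typing import Iterable, Optional


def pick_voltage(target_uv: int, min_uv: int, max_uv: int,
                 supported_uv: Iterable[int]) -> Optional[int]:
    """Pick the lowest supported level in [target, max], else in [min, max]."""
    levels = sorted(supported_uv)
    for floor_uv in (target_uv, min_uv):
        for level in levels:
            if floor_uv <= level <= max_uv:
                return level
    return None  # nothing the regulator offers fits the OPP


# Hypothetical regulator that can only output 850, 900, 950 or 1000 mV.
steps = range(850_000, 1_000_001, 50_000)

# 200 MHz OPP after the patch: <825000 825000 1000000>.
print(pick_voltage(825_000, 825_000, 1_000_000, steps))

# The same OPP before the patch, modeled as an exact 825 mV request.
print(pick_voltage(825_000, 825_000, 825_000, steps))
```

With the range, this imaginary regulator can settle on its 850 mV step;
with the exact value there is no usable level at all.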