diff mbox series

[v2,1/1] arm64: dts: mediatek: mt8186: Increase CCI frequency

Message ID 20230914121035.17320-2-chun-jen.tseng@mediatek.com (mailing list archive)
State New, archived
Series arm64: dts: mediatek: mt8186: Increase CCI frequency

Commit Message

Mark Tseng Sept. 14, 2023, 12:10 p.m. UTC
The original CCI OPP table's lowest frequency 500 MHz is too low and causes
system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz and adjust
the OPPs accordingly.

Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table")

Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com>
---
 arch/arm64/boot/dts/mediatek/mt8186.dtsi | 90 ++++++++++++------------
 1 file changed, 45 insertions(+), 45 deletions(-)

Comments

AngeloGioacchino Del Regno Nov. 29, 2023, 1:22 p.m. UTC | #1
On 14/09/23 14:10, Mark Tseng wrote:
> The original CCI OPP table's lowest frequency 500 MHz is too low and causes
> system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz and adjust
> the OPPs accordingly.
> 
> Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table")
> 
> Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com>

You ignored my comment [1] on the v1 of this patch.

Besides, I think that you should at least keep the 500MHz frequency for a
sleep-only/idle OPP to save power.
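
To illustrate the point above, here is a minimal sketch of how the raised table
could still carry the old lowest entry. The 500 MHz / 600000 uV values are taken
from the original cci_opp_0 entry; the label cci_opp_idle is hypothetical, and
marking the entry with the generic opp-suspend property assumes that the CCI
devfreq driver honours it, which has not been verified here:

```dts
/*
 * Sketch only, not part of the submitted patch.
 * cci_opp_idle is a hypothetical label; the frequency and voltage are the
 * values of the original cci_opp_0 entry.
 */
cci_opp_idle: opp-500000000 {
	opp-hz = /bits/ 64 <500000000>;
	opp-microvolt = <600000>;
	/* Generic OPP v2 property; driver support is an assumption. */
	opp-suspend;
};

/* First entry of the raised range, as proposed in this patch. */
cci_opp_0: opp-1050000000 {
	opp-hz = /bits/ 64 <1050000000>;
	opp-microvolt = <843750>;
};
```

Any entry in the table can also be picked by the governor at runtime, so on its
own this sketch would not prevent the stall described in the commit message;
restricting the entry to idle would need additional driver-side handling.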

It would also be helpful to understand why you chose this new frequency range,
so if you can, please put some numbers in the commit description, showing the
stall in terms of requested BW vs actual BW (as I'd imagine that a 2x increase
in CCI frequency means that we need *twice* the bandwidth compared to what we
have for the workloads that are stalling the system).

[1]: https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/

Regards,
Angelo
Mark Tseng Jan. 10, 2024, 5:44 a.m. UTC | #2
On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno wrote:
> On 14/09/23 14:10, Mark Tseng wrote:
> > The original CCI OPP table's lowest frequency 500 MHz is too low and causes
> > system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz and adjust
> > the OPPs accordingly.
> > 
> > Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table")
> > 
> > Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com>
> 
> You ignored my comment [1] on the v1 of this patch.
> 
> Besides, I think that you should at least keep the 500MHz frequency for a
> sleep-only/idle OPP to save power.
> 
> It would also be helpful to understand why you chose this new frequency range,
> so if you can, please put some numbers in the commit description, showing the
> stall in terms of requested BW vs actual BW (as I'd imagine that a 2x increase
> in CCI frequency means that we need *twice* the bandwidth compared to what we
> have for the workloads that are stalling the system).
> 
Hi AngeloGioacchino Del Regno,

Thanks for the reminder about this issue. After adjusting the CCI OPP table, we
also ran a power test benchmark, and the result is PASS.

The original CCI table has a stall issue. When the big CPU frequency is set to
2.05 GHz while the CCI frequency stays at 500 MHz, running CTS MediaTest stalls
the system, which then triggers a watchdog reset of the SoC.

The CPU and CCI frequencies are not set by the same driver, so there is a
timing issue between them, and the CPU stall is a side effect of it.

BRs,

Mark Tseng

> [1]: https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/
> 
> Regards,
> Angelo
AngeloGioacchino Del Regno Jan. 10, 2024, 11:10 a.m. UTC | #3
On 10/01/24 06:44, Chun-Jen Tseng (曾俊仁) wrote:
> On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno wrote:
>> On 14/09/23 14:10, Mark Tseng wrote:
>>> The original CCI OPP table's lowest frequency 500 MHz is too low and causes
>>> system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz and adjust
>>> the OPPs accordingly.
>>>
>>> Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table")
>>>
>>> Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com>
>>
>> You ignored my comment [1] on the v1 of this patch.
>>
>> Besides, I think that you should at least keep the 500MHz frequency for a
>> sleep-only/idle OPP to save power.
>>
>> It would also be helpful to understand why you chose this new frequency range,
>> so if you can, please put some numbers in the commit description, showing the
>> stall in terms of requested BW vs actual BW (as I'd imagine that a 2x increase
>> in CCI frequency means that we need *twice* the bandwidth compared to what we
>> have for the workloads that are stalling the system).
>>
>>
> Hi AngeloGioacchino Del Regno,
> 
> Thanks for the reminder about this issue. After adjusting the CCI OPP table, we
> also ran a power test benchmark, and the result is PASS.
> 

Sorry but `PASS` is not a number; I actually wanted a before and after power
consumption measurement in microwatts.

> The original CCI table has a stall issue. When the big CPU frequency is set to
> 2.05 GHz while the CCI frequency stays at 500 MHz, running CTS MediaTest stalls
> the system, which then triggers a watchdog reset of the SoC.
> 
> The CPU and CCI frequencies are not set by the same driver, so there is a
> timing issue between them, and the CPU stall is a side effect of it.
> 

Are you trying to fix a frequency setting delay/desync by raising the
frequency of the CCI?
That's not the right way of doing it.

Asserting that we have a timing issue because the two frequency settings
are not done by the same driver is borderline wrong - but anyway - if there
is a frequency setting timing issue because of the interaction between the
two drivers (cpufreq/ccifreq), the right way of eliminating the stall is to
actually solve the root cause of that.

I'm insisting on this because if there's a "timing issue" this means that
even though the "base" CCI frequency is higher, during a scaling up operation
depending on how much the CCI gets flooded, you might *either*:
  - Have this same stall issue again, and/or
  - Have performance issues/drops while waiting for the CCI to scale up.

Even though you may not (or may...) get a stall issue again with this change,
you will surely get (very short) temporary performance drops during scaling up.

....and this is why your CCI frequency increase solution does *not* resolve
this issue, but only partially mitigates it.

That should get solved, not partially mitigated.

Besides that, can you please tell me how to replicate the stall issue, so that
I can better understand what's going on here?

Regards,
Angelo

> BRs,
> 
> Mark Tseng
> 
>> [1]: https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/
>>
>> Regards,
>> Angelo
Mark Tseng Feb. 26, 2024, 9:12 a.m. UTC | #4
On Wed, 2024-01-10 at 12:10 +0100, AngeloGioacchino Del Regno wrote:
> On 10/01/24 06:44, Chun-Jen Tseng (曾俊仁) wrote:
> > On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno wrote:
> > > On 14/09/23 14:10, Mark Tseng wrote:
> > > > The original CCI OPP table's lowest frequency 500 MHz is too low and
> > > > causes system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz
> > > > and adjust the OPPs accordingly.
> > > > 
> > > > Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table")
> > > > 
> > > > Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com>
> > > 
> > > You ignored my comment [1] on the v1 of this patch.
> > > 
> > > Besides, I think that you should at least keep the 500MHz frequency for a
> > > sleep-only/idle OPP to save power.
> > > 
> > > It would also be helpful to understand why you chose this new frequency
> > > range, so if you can, please put some numbers in the commit description,
> > > showing the stall in terms of requested BW vs actual BW (as I'd imagine
> > > that a 2x increase in CCI frequency means that we need *twice* the
> > > bandwidth compared to what we have for the workloads that are stalling
> > > the system).
> > > 
> > 
> > Hi AngeloGioacchino Del Regno,
> > 
> > Thanks for the reminder about this issue. After adjusting the CCI OPP
> > table, we also ran a power test benchmark, and the result is PASS.
> > 
> 
> Sorry but `PASS` is not a number; I actually wanted a before and after power
> consumption measurement in microwatts.
> 
> > The original CCI table has a stall issue. When the big CPU frequency is set
> > to 2.05 GHz while the CCI frequency stays at 500 MHz, running CTS MediaTest
> > stalls the system, which then triggers a watchdog reset of the SoC.
> > 
> > The CPU and CCI frequencies are not set by the same driver, so there is a
> > timing issue between them, and the CPU stall is a side effect of it.
> > 
> 
> Are you trying to fix a frequency setting delay/desync by raising the
> frequency of the CCI?
> That's not the right way of doing it.
> 
> Asserting that we have a timing issue because the two frequency settings
> are not done by the same driver is borderline wrong - but anyway - if there
> is a frequency setting timing issue because of the interaction between the
> two drivers (cpufreq/ccifreq), the right way of eliminating the stall is to
> actually solve the root cause of that.
> 
> I'm insisting on this because if there's a "timing issue" this means that
> even though the "base" CCI frequency is higher, during a scaling up operation
> depending on how much the CCI gets flooded, you might *either*:
>   - Have this same stall issue again, and/or
>   - Have performance issues/drops while waiting for the CCI to scale up.
> 
> Even though you may not (or may...) get a stall issue again with this change,
> you will surely get (very short) temporary performance drops during scaling up.
> 
> ....and this is why your CCI frequency increase solution does *not* resolve
> this issue, but only partially mitigates it.
> 
> That should get solved, not partially mitigated.
> 
> Besides that, can you please tell me how to replicate the stall issue, so that
> I can better understand what's going on here?
> 
> Regards,
> Angelo
> 

Hi Angelo,

Thanks for your review and suggestions.

This issue happens in the CTS MediaTest and WebGL tests.

I found that when the CCI runs below 1.05 GHz, the system stalls at random, so
as a workaround I modified the devfreq driver to limit the lowest CCI level to
1.05 GHz. This appears to solve the issue.

BRs,

Mark Tseng

> > BRs,
> > 
> > Mark Tseng
> > 
> > > [1]: https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/
> > > 
> > > Regards,
> > > Angelo

Patch

diff --git a/arch/arm64/boot/dts/mediatek/mt8186.dtsi b/arch/arm64/boot/dts/mediatek/mt8186.dtsi
index f04ae70c470a..b98832d032eb 100644
--- a/arch/arm64/boot/dts/mediatek/mt8186.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8186.dtsi
@@ -39,79 +39,79 @@ 
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		cci_opp_0: opp-500000000 {
-			opp-hz = /bits/ 64 <500000000>;
-			opp-microvolt = <600000>;
+		cci_opp_0: opp-1050000000 {
+			opp-hz = /bits/ 64 <1050000000>;
+			opp-microvolt = <843750>;
 		};
 
-		cci_opp_1: opp-560000000 {
-			opp-hz = /bits/ 64 <560000000>;
-			opp-microvolt = <675000>;
+		cci_opp_1: opp-1073000000 {
+			opp-hz = /bits/ 64 <1073000000>;
+			opp-microvolt = <850000>;
 		};
 
-		cci_opp_2: opp-612000000 {
-			opp-hz = /bits/ 64 <612000000>;
-			opp-microvolt = <693750>;
+		cci_opp_2: opp-1096000000 {
+			opp-hz = /bits/ 64 <1096000000>;
+			opp-microvolt = <856250>;
 		};
 
-		cci_opp_3: opp-682000000 {
-			opp-hz = /bits/ 64 <682000000>;
-			opp-microvolt = <718750>;
+		cci_opp_3: opp-1120000000 {
+			opp-hz = /bits/ 64 <1120000000>;
+			opp-microvolt = <862500>;
 		};
 
-		cci_opp_4: opp-752000000 {
-			opp-hz = /bits/ 64 <752000000>;
-			opp-microvolt = <743750>;
+		cci_opp_4: opp-1143000000 {
+			opp-hz = /bits/ 64 <1143000000>;
+			opp-microvolt = <881250>;
 		};
 
-		cci_opp_5: opp-822000000 {
-			opp-hz = /bits/ 64 <822000000>;
-			opp-microvolt = <768750>;
+		cci_opp_5: opp-1166000000 {
+			opp-hz = /bits/ 64 <1166000000>;
+			opp-microvolt = <893750>;
 		};
 
-		cci_opp_6: opp-875000000 {
-			opp-hz = /bits/ 64 <875000000>;
-			opp-microvolt = <781250>;
+		cci_opp_6: opp-1190000000 {
+			opp-hz = /bits/ 64 <1190000000>;
+			opp-microvolt = <906250>;
 		};
 
-		cci_opp_7: opp-927000000 {
-			opp-hz = /bits/ 64 <927000000>;
-			opp-microvolt = <800000>;
+		cci_opp_7: opp-1213000000 {
+			opp-hz = /bits/ 64 <1213000000>;
+			opp-microvolt = <918750>;
 		};
 
-		cci_opp_8: opp-980000000 {
-			opp-hz = /bits/ 64 <980000000>;
-			opp-microvolt = <818750>;
+		cci_opp_8: opp-1236000000 {
+			opp-hz = /bits/ 64 <1236000000>;
+			opp-microvolt = <937500>;
 		};
 
-		cci_opp_9: opp-1050000000 {
-			opp-hz = /bits/ 64 <1050000000>;
-			opp-microvolt = <843750>;
+		cci_opp_9: opp-1260000000 {
+			opp-hz = /bits/ 64 <1260000000>;
+			opp-microvolt = <950000>;
 		};
 
-		cci_opp_10: opp-1120000000 {
-			opp-hz = /bits/ 64 <1120000000>;
-			opp-microvolt = <862500>;
+		cci_opp_10: opp-1283000000 {
+			opp-hz = /bits/ 64 <1283000000>;
+			opp-microvolt = <962500>;
 		};
 
-		cci_opp_11: opp-1155000000 {
-			opp-hz = /bits/ 64 <1155000000>;
-			opp-microvolt = <887500>;
+		cci_opp_11: opp-1306000000 {
+			opp-hz = /bits/ 64 <1306000000>;
+			opp-microvolt = <975000>;
 		};
 
-		cci_opp_12: opp-1190000000 {
-			opp-hz = /bits/ 64 <1190000000>;
-			opp-microvolt = <906250>;
+		cci_opp_12: opp-1330000000 {
+			opp-hz = /bits/ 64 <1330000000>;
+			opp-microvolt = <993750>;
 		};
 
-		cci_opp_13: opp-1260000000 {
-			opp-hz = /bits/ 64 <1260000000>;
-			opp-microvolt = <950000>;
+		cci_opp_13: opp-1353000000 {
+			opp-hz = /bits/ 64 <1353000000>;
+			opp-microvolt = <1006250>;
 		};
 
-		cci_opp_14: opp-1330000000 {
-			opp-hz = /bits/ 64 <1330000000>;
-			opp-microvolt = <993750>;
+		cci_opp_14: opp-1376000000 {
+			opp-hz = /bits/ 64 <1376000000>;
+			opp-microvolt = <1018750>;
 		};
 
 		cci_opp_15: opp-1400000000 {