Message ID | 20230914121035.17320-2-chun-jen.tseng@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: dts: mediatek: mt8186: Increase CCI frequency | expand |
Il 14/09/23 14:10, Mark Tseng ha scritto: > The original CCI OPP table's lowest frequency 500 MHz is too low and causes > system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz and adjust > the OPPs accordingly. > > Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table") > > Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com> You ignored my comment [1] on the v1 of this patch. Besides, I think that you should at least keep the 500MHz frequency for a sleep-only/idle OPP to save power. It would also be helpful to understand why you chose this new frequency range, so if you can, please put some numbers in the commit description, showing the stall in terms of requested BW vs actual BW (as I'd imagine that a 2x increase in CCI frequency means that we need *twice* the bandwidth compared to what we have for the workloads that are stalling the system). [1]: https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/ Regards, Angelo
On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno wrote: > Il 14/09/23 14:10, Mark Tseng ha scritto: > > The original CCI OPP table's lowest frequency 500 MHz is too low > > and causes > > system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz > > and adjust > > the OPPs accordingly. > > > > Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node > > and CCI OPP table") > > > > Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com> > > You ignored my comment [1] on the v1 of this patch. > > Besides, I think that you should at least keep the 500MHz frequency > for a > sleep-only/idle OPP to save power. > > It would also be helpful to understand why you chose this new > frequency range, > so if you can, please put some numbers in the commit description, > showing the > stall in terms of requested BW vs actual BW (as I'd imagine that a 2x > increase > in CCI frequency means that we need *twice* the bandwidth compared to > what we > have for the workloads that are stalling the system). > Hi AngeloGioacchino Del Regno, Thanks your reminder this issue. After ajdustment CCI OPP, we also do power test benchmark and the result is PASS. The original CCI table has stall issue. When the Big CPU frequency set on 2.05G and CCI frequency keep on 500MHz then run CTS MediaTest will system stall then trigger watchdog reset SoC. The CPU and CCI frequency setting are not in the same driver. So it will have timing issue cause CPU stall side effect. BRs, Mark Tseng > [1]: > https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/ > > Regards, > Angelo
Il 10/01/24 06:44, Chun-Jen Tseng (曾俊仁) ha scritto: > On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno wrote: >> Il 14/09/23 14:10, Mark Tseng ha scritto: >>> The original CCI OPP table's lowest frequency 500 MHz is too low >>> and causes >>> system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz >>> and adjust >>> the OPPs accordingly. >>> >>> Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node >>> and CCI OPP table") >>> >>> Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com> >> >> You ignored my comment [1] on the v1 of this patch. >> >> Besides, I think that you should at least keep the 500MHz frequency >> for a >> sleep-only/idle OPP to save power. >> >> It would also be helpful to understand why you chose this new >> frequency range, >> so if you can, please put some numbers in the commit description, >> showing the >> stall in terms of requested BW vs actual BW (as I'd imagine that a 2x >> increase >> in CCI frequency means that we need *twice* the bandwidth compared to >> what we >> have for the workloads that are stalling the system). >> > Hi AngeloGioacchino Del Regno, > > Thanks your reminder this issue. After ajdustment CCI OPP, we also do > power test benchmark and the result is PASS. > Sorry but `PASS` is not a number; I actually wanted a before and after power consumption measurement in microwatts. > The original CCI table has stall issue. When the Big CPU frequency set > on 2.05G and CCI frequency keep on 500MHz then run CTS MediaTest will > system stall then trigger watchdog reset SoC. > > The CPU and CCI frequency setting are not in the same driver. So it > will have timing issue cause CPU stall side effect. > Are you trying to fix a frequency setting delay/desync with raising the frequency of the CCI? That's not the right way of doing it. Asserting that we have a timing issue because the two frequency settings are not done by the same driver is borderline wrong - but anyway - if there is a frequency setting timing issue because of the interaction between the two drivers (cpufreq/ccifreq), the right way of eliminating the stall is to actually solve the root cause of that. I'm insisting on this because if there's a "timing issue" this means that even though the "base" CCI frequency is higher, during a scaling up operation depending on how much the CCI gets flooded, you might *either*: - Have this same stall issue again, and/or - Have performance issues/drops while waiting for the CCI to scale up. Even though you may not (or may...) get a stall issue again with this change, you will surely get (very short) temporary performance drops during scaling up. ....and this is why your CCI frequency increase solution does *not* resolve this issue, but only partially mitigates it. That should get solved, not partially mitigated. Besides that, can you please tell me how to replicate the stall issue, making me able to better understand what's going on here? Regards, Angelo > BRs, > > Mark Tseng > >> [1]: >> https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/ >> >> Regards, >> Angelo
On Wed, 2024-01-10 at 12:10 +0100, AngeloGioacchino Del Regno wrote: > Il 10/01/24 06:44, Chun-Jen Tseng (曾俊仁) ha scritto: > > On Wed, 2023-11-29 at 14:22 +0100, AngeloGioacchino Del Regno > > wrote: > > > Il 14/09/23 14:10, Mark Tseng ha scritto: > > > > The original CCI OPP table's lowest frequency 500 MHz is too > > > > low > > > > and causes > > > > system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 > > > > GHz > > > > and adjust > > > > the OPPs accordingly. > > > > > > > > Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI > > > > node > > > > and CCI OPP table") > > > > > > > > Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com> > > > > > > You ignored my comment [1] on the v1 of this patch. > > > > > > Besides, I think that you should at least keep the 500MHz > > > frequency > > > for a > > > sleep-only/idle OPP to save power. > > > > > > It would also be helpful to understand why you chose this new > > > frequency range, > > > so if you can, please put some numbers in the commit description, > > > showing the > > > stall in terms of requested BW vs actual BW (as I'd imagine that > > > a 2x > > > increase > > > in CCI frequency means that we need *twice* the bandwidth > > > compared to > > > what we > > > have for the workloads that are stalling the system). > > > > > > > Hi AngeloGioacchino Del Regno, > > > > Thanks your reminder this issue. After ajdustment CCI OPP, we also > > do > > power test benchmark and the result is PASS. > > > > Sorry but `PASS` is not a number; I actually wanted a before and > after power > consumption measurement in microwatts. > > > The original CCI table has stall issue. When the Big CPU frequency > > set > > on 2.05G and CCI frequency keep on 500MHz then run CTS MediaTest > > will > > system stall then trigger watchdog reset SoC. > > > > The CPU and CCI frequency setting are not in the same driver. So it > > will have timing issue cause CPU stall side effect. > > > > Are you trying to fix a frequency setting delay/desync with raising > the > frequency of the CCI? > That's not the right way of doing it. > > Asserting that we have a timing issue because the two frequency > settings > are not done by the same driver is borderline wrong - but anyway - if > there > is a frequency setting timing issue because of the interaction > between the > two drivers (cpufreq/ccifreq), the right way of eliminating the stall > is to > actually solve the root cause of that. > > I'm insisting on this because if there's a "timing issue" this means > that > even though the "base" CCI frequency is higher, during a scaling up > operation > depending on how much the CCI gets flooded, you might *either*: > - Have this same stall issue again, and/or > - Have performance issues/drops while waiting for the CCI to scale > up. > > Even though you may not (or may...) get a stall issue again with this > change, > you will surely get (very short) temporary performance drops during > scaling up. > > ....and this is why your CCI frequency increase solution does *not* > resolve > this issue, but only partially mitigates it. > > That should get solved, not partially mitigated. > > Besides that, can you please tell me how to replicate the stall > issue, making > me able to better understand what's going on here? > > Regards, > Angelo > Hi Angelo, Thanks your review and suggestion. This issue happen on CTS mediaTest and WebGL test. I found that if CCI under 1.05 GHz, it will happend randomly system stall so I do workaround modification on devfreq driver to limit CCI level (1.05 Ghz). It look solve this issue. BRs, Mark Tseng > > BRs, > > > > Mark Tseng > > > > > [1]: > > > https://lore.kernel.org/all/799325f5-29b5-f0c0-16ea-d47c06830ed3@collabora.com/ > > > > > > Regards, > > > Angelo > >
diff --git a/arch/arm64/boot/dts/mediatek/mt8186.dtsi b/arch/arm64/boot/dts/mediatek/mt8186.dtsi index f04ae70c470a..b98832d032eb 100644 --- a/arch/arm64/boot/dts/mediatek/mt8186.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt8186.dtsi @@ -39,79 +39,79 @@ compatible = "operating-points-v2"; opp-shared; - cci_opp_0: opp-500000000 { - opp-hz = /bits/ 64 <500000000>; - opp-microvolt = <600000>; + cci_opp_0: opp-1050000000 { + opp-hz = /bits/ 64 <1050000000>; + opp-microvolt = <843750>; }; - cci_opp_1: opp-560000000 { - opp-hz = /bits/ 64 <560000000>; - opp-microvolt = <675000>; + cci_opp_1: opp-1073000000 { + opp-hz = /bits/ 64 <1073000000>; + opp-microvolt = <850000>; }; - cci_opp_2: opp-612000000 { - opp-hz = /bits/ 64 <612000000>; - opp-microvolt = <693750>; + cci_opp_2: opp-1096000000 { + opp-hz = /bits/ 64 <1096000000>; + opp-microvolt = <856250>; }; - cci_opp_3: opp-682000000 { - opp-hz = /bits/ 64 <682000000>; - opp-microvolt = <718750>; + cci_opp_3: opp-1120000000 { + opp-hz = /bits/ 64 <1120000000>; + opp-microvolt = <862500>; }; - cci_opp_4: opp-752000000 { - opp-hz = /bits/ 64 <752000000>; - opp-microvolt = <743750>; + cci_opp_4: opp-1143000000 { + opp-hz = /bits/ 64 <1143000000>; + opp-microvolt = <881250>; }; - cci_opp_5: opp-822000000 { - opp-hz = /bits/ 64 <822000000>; - opp-microvolt = <768750>; + cci_opp_5: opp-1166000000 { + opp-hz = /bits/ 64 <1166000000>; + opp-microvolt = <893750>; }; - cci_opp_6: opp-875000000 { - opp-hz = /bits/ 64 <875000000>; - opp-microvolt = <781250>; + cci_opp_6: opp-1190000000 { + opp-hz = /bits/ 64 <1190000000>; + opp-microvolt = <906250>; }; - cci_opp_7: opp-927000000 { - opp-hz = /bits/ 64 <927000000>; - opp-microvolt = <800000>; + cci_opp_7: opp-1213000000 { + opp-hz = /bits/ 64 <1213000000>; + opp-microvolt = <918750>; }; - cci_opp_8: opp-980000000 { - opp-hz = /bits/ 64 <980000000>; - opp-microvolt = <818750>; + cci_opp_8: opp-1236000000 { + opp-hz = /bits/ 64 <1236000000>; + opp-microvolt = <937500>; }; - cci_opp_9: opp-1050000000 { - opp-hz = /bits/ 64 <1050000000>; - opp-microvolt = <843750>; + cci_opp_9: opp-1260000000 { + opp-hz = /bits/ 64 <1260000000>; + opp-microvolt = <950000>; }; - cci_opp_10: opp-1120000000 { - opp-hz = /bits/ 64 <1120000000>; - opp-microvolt = <862500>; + cci_opp_10: opp-1283000000 { + opp-hz = /bits/ 64 <1283000000>; + opp-microvolt = <962500>; }; - cci_opp_11: opp-1155000000 { - opp-hz = /bits/ 64 <1155000000>; - opp-microvolt = <887500>; + cci_opp_11: opp-1306000000 { + opp-hz = /bits/ 64 <1306000000>; + opp-microvolt = <975000>; }; - cci_opp_12: opp-1190000000 { - opp-hz = /bits/ 64 <1190000000>; - opp-microvolt = <906250>; + cci_opp_12: opp-1330000000 { + opp-hz = /bits/ 64 <1330000000>; + opp-microvolt = <993750>; }; - cci_opp_13: opp-1260000000 { - opp-hz = /bits/ 64 <1260000000>; - opp-microvolt = <950000>; + cci_opp_13: opp-1353000000 { + opp-hz = /bits/ 64 <1353000000>; + opp-microvolt = <1006250>; }; - cci_opp_14: opp-1330000000 { - opp-hz = /bits/ 64 <1330000000>; - opp-microvolt = <993750>; + cci_opp_14: opp-1376000000 { + opp-hz = /bits/ 64 <1376000000>; + opp-microvolt = <1018750>; }; cci_opp_15: opp-1400000000 {
The original CCI OPP table's lowest frequency 500 MHz is too low and causes system stalls. Increase the frequency range to 1.05 GHz ~ 1.4 GHz and adjust the OPPs accordingly. Fixes: 32dfbc03fc26 ("arm64: dts: mediatek: mt8186: Add CCI node and CCI OPP table") Signed-off-by: Mark Tseng <chun-jen.tseng@mediatek.com> --- arch/arm64/boot/dts/mediatek/mt8186.dtsi | 90 ++++++++++++------------ 1 file changed, 45 insertions(+), 45 deletions(-)