Message ID | 20240522081508.1488592-1-quic_kshivnan@quicinc.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | soc: qcom: icc-bwmon: Update zone1_thres_count to 3 |
On 22/05/2024 10:15, Shivnandan Kumar wrote:
> Update zone1_thres_count to 3 from 16 so that
> driver can reduce bus vote in 3 sample windows instead
> of waiting for 16 windows. This is in line with downstream
> implementation.
>

This might make bwmon quite jittery. I don't think downstream is the
source of truth here. Please provide some measurements *on mainline tree*.

Best regards,
Krzysztof
On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
> On 22/05/2024 10:15, Shivnandan Kumar wrote:
>> Update zone1_thres_count to 3 from 16 so that
>> driver can reduce bus vote in 3 sample windows instead
>> of waiting for 16 windows. This is in line with downstream
>> implementation.
>>
>
> This might make bwmon quite jittery. I don't think downstream is the
> source of truth here. Please provide some measurements *on mainline tree*.
>

Hi Krzysztof,

The 16-window (64 ms) waiting time is too long to reduce the bus vote.
At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at
60FPS in 64ms. Hence, delay of 64ms in decision making will lead to
higher power regression. I’ve tested this change for 4K video playback
on mainline tree, and there’s a significant power-saving.
I propose to make it a tunable, so that user space can tune it
based on runtime depending on fps.

USECASE              zone1_thres_count=16   zone1_thres_count=3
4K video playback    236.15 mA              203.15 mA

Thanks,
Shivnandan

> Best regards,
> Krzysztof
>
On 22/05/2024 11:05, Shivnandan Kumar wrote:
>
>
> On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
>> On 22/05/2024 10:15, Shivnandan Kumar wrote:
>>> Update zone1_thres_count to 3 from 16 so that
>>> driver can reduce bus vote in 3 sample windows instead
>>> of waiting for 16 windows. This is in line with downstream
>>> implementation.
>>>
>>
>> This might make bwmon quite jittery. I don't think downstream is the
>> source of truth here. Please provide some measurements *on mainline tree*.
>>
>
> Hi Krzysztof,
>
> The 16-window (64 ms) waiting time is too long to reduce the bus vote.
> At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at
> 60FPS in 64ms. Hence, delay of 64ms in decision making will lead to
> higher power regression. I’ve tested this change for 4K video playback
> on mainline tree, and there’s a significant power-saving.

Please include it, with measurement below, in the commit msg.

> I propose to make it a tunable, so that user space can tune it
> based on runtime depending on fps.
>
> USECASE              zone1_thres_count=16   zone1_thres_count=3
> 4K video playback    236.15 mA              203.15 mA
>
> Thanks,
> Shivnandan
>
>> Best regards,
>> Krzysztof
>>

Best regards,
Krzysztof
On Wed, May 22, 2024 at 02:35:21PM +0530, Shivnandan Kumar wrote:
>
>
> On 5/22/2024 1:58 PM, Krzysztof Kozlowski wrote:
> > On 22/05/2024 10:15, Shivnandan Kumar wrote:
> > > Update zone1_thres_count to 3 from 16 so that
> > > driver can reduce bus vote in 3 sample windows instead
> > > of waiting for 16 windows. This is in line with downstream
> > > implementation.
> > >
> >
> > This might make bwmon quite jittery. I don't think downstream is the
> > source of truth here. Please provide some measurements *on mainline tree*.
> >
>
> Hi Krzysztof,
>
> The 16-window (64 ms) waiting time is too long to reduce the bus vote.
> At higher FPS, there will be multiple frames in 64ms e.g. 4 frames at 60FPS
> in 64ms. Hence, delay of 64ms in decision making will lead to higher power
> regression. I’ve tested this change for 4K video playback on mainline tree,
> and there’s a significant power-saving.
>

As requested by Krzysztof already, update your commit message to capture
such motivation.

Please read and follow this:
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

> I propose to make it a tunable, so that user space can tune it
> based on runtime depending on fps.
>

I presume that in e.g. Android there could be some sort of power HAL
that tweaks this value dynamically?

In a general purpose system, how do we make sure that this value stays
relevant for multiple types of use cases?

> USECASE              zone1_thres_count=16   zone1_thres_count=3
> 4K video playback    236.15 mA              203.15 mA

4k video playback is a fairly specific (and generally unusual) use case.
Is there any impact (negative or positive) for other use cases/workloads?

Regards,
Bjorn

>
> Thanks,
> Shivnandan
>
> > Best regards,
> > Krzysztof
> >
diff --git a/drivers/soc/qcom/icc-bwmon.c b/drivers/soc/qcom/icc-bwmon.c
index 656706259353..f1065427bb80 100644
--- a/drivers/soc/qcom/icc-bwmon.c
+++ b/drivers/soc/qcom/icc-bwmon.c
@@ -815,7 +815,7 @@ static const struct icc_bwmon_data msm8998_bwmon_data = {
 static const struct icc_bwmon_data sdm845_cpu_bwmon_data = {
 	.sample_ms = 4,
 	.count_unit_kb = 64,
-	.zone1_thres_count = 16,
+	.zone1_thres_count = 3,
 	.zone3_thres_count = 1,
 	.quirks = BWMON_HAS_GLOBAL_IRQ,
 	.regmap_fields = sdm845_cpu_bwmon_reg_fields,
@@ -834,7 +834,7 @@ static const struct icc_bwmon_data sdm845_llcc_bwmon_data = {
 static const struct icc_bwmon_data sc7280_llcc_bwmon_data = {
 	.sample_ms = 4,
 	.count_unit_kb = 64,
-	.zone1_thres_count = 16,
+	.zone1_thres_count = 3,
 	.zone3_thres_count = 1,
 	.quirks = BWMON_NEEDS_FORCE_CLEAR,
 	.regmap_fields = sdm845_llcc_bwmon_reg_fields,
Update zone1_thres_count to 3 from 16 so that
driver can reduce bus vote in 3 sample windows instead
of waiting for 16 windows. This is in line with downstream
implementation.

Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
---
 drivers/soc/qcom/icc-bwmon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
2.25.1