Message ID | 20210528032054.7572-1-yu.c.chen@intel.com (mailing list archive) |
---|---|
State | Mainlined, archived |
Headers | show |
Series | [v2] intel_idle: Adjust the SKX C6 latency and residency if PC6 is disabled | expand |
On Fri, 2021-05-28 at 11:20 +0800, Chen Yu wrote: > Currently cpuidle assumes worst-case C-state parameters, and so C6 > is described with PC6 parameters, which is worst case for requesting > CC6. When PC6 is enabled, this is appropriate. But if PC6 is disabled > in BIOS, the exit latency and target_residency should be adjusted > accordingly. > > Exit latency: > Previously the C6 exit latency was measured when woken up from CC6/PC6. > With PC6 disabled, the C6 exit latency should be CC6/PC0. > > Target residency: > With PC6 disabled, idle duration within [CC6, PC6) would make the > idle governor choose C1E over C6. This would cause low energy-efficiency. > We should lower the bar to request C6 when PC6 is disabled. > > To fill this gap, check if PC6 is disabled in the BIOS in the > MSR_PKG_CST_CONFIG_CONTROL(0xe2). If so, use CC6/PC0 parameters as the > new exit latency. Meanwhile, update target_residency to 3 times of the new > exit latency. This is consistent with how intel_idle driver uses _CST to > calculate the target_residency. The consequence is that, the OS would > be more offen to choose C6 over C1E when PC6 is disabled. This is reasonable > because if the user is using C6, it implies that the user cares about energy, > so choosing C6 more frequently is in accordance with user requirement. > > The new exit latency of CC6/PC0 92us was from wult[1] result on SKX, which was > measured via NIC wakeup from 99.99th latency. Besides SKX, the CLX and CPX > both have the same CPU model number. And since they have similar CC6 exit latency > to SKX, 96us and 89us respectively, reuse the value of SKX. > > There is concern that if we should introduce a more generic solution > rather than optimizing on each platforms. However consider the > code complexity and different PC6 bit interpretation on different > platforms, tune the code per platform seems to be an acceptable trade-off. > > [1] https://intel.github.io/wult/ > > Suggested-by: Len Brown <len.brown@intel.com> > Signed-off-by: Chen Yu <yu.c.chen@intel.com> > --- > v2: Simplify the commit log to not mention C3/PC3. (Artem) > Confirm the exit latency on CLX and CPX.(Artem) Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
On Fri, May 28, 2021 at 5:16 AM Chen Yu <yu.c.chen@intel.com> wrote: > > Currently cpuidle assumes worst-case C-state parameters, and so C6 > is described with PC6 parameters, which is worst case for requesting > CC6. When PC6 is enabled, this is appropriate. But if PC6 is disabled > in BIOS, the exit latency and target_residency should be adjusted > accordingly. > > Exit latency: > Previously the C6 exit latency was measured when woken up from CC6/PC6. > With PC6 disabled, the C6 exit latency should be CC6/PC0. > > Target residency: > With PC6 disabled, idle duration within [CC6, PC6) would make the > idle governor choose C1E over C6. This would cause low energy-efficiency. > We should lower the bar to request C6 when PC6 is disabled. > > To fill this gap, check if PC6 is disabled in the BIOS in the > MSR_PKG_CST_CONFIG_CONTROL(0xe2). If so, use CC6/PC0 parameters as the > new exit latency. Meanwhile, update target_residency to 3 times of the new > exit latency. This is consistent with how intel_idle driver uses _CST to > calculate the target_residency. The consequence is that, the OS would > be more offen to choose C6 over C1E when PC6 is disabled. This is reasonable > because if the user is using C6, it implies that the user cares about energy, > so choosing C6 more frequently is in accordance with user requirement. > > The new exit latency of CC6/PC0 92us was from wult[1] result on SKX, which was > measured via NIC wakeup from 99.99th latency. Besides SKX, the CLX and CPX > both have the same CPU model number. And since they have similar CC6 exit latency > to SKX, 96us and 89us respectively, reuse the value of SKX. > > There is concern that if we should introduce a more generic solution > rather than optimizing on each platforms. However consider the > code complexity and different PC6 bit interpretation on different > platforms, tune the code per platform seems to be an acceptable trade-off. > > [1] https://intel.github.io/wult/ > > Suggested-by: Len Brown <len.brown@intel.com> > Signed-off-by: Chen Yu <yu.c.chen@intel.com> > --- > v2: Simplify the commit log to not mention C3/PC3. (Artem) > Confirm the exit latency on CLX and CPX.(Artem) > --- > drivers/idle/intel_idle.c | 33 +++++++++++++++++++++++++++++++++ > 1 file changed, 33 insertions(+) > > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c > index ec1b9d306ba6..e6c543b5ee1d 100644 > --- a/drivers/idle/intel_idle.c > +++ b/drivers/idle/intel_idle.c > @@ -1484,6 +1484,36 @@ static void __init sklh_idle_state_table_update(void) > skl_cstates[6].flags |= CPUIDLE_FLAG_UNUSABLE; /* C9-SKL */ > } > > +/** > + * skx_idle_state_table_update - Adjust the Sky Lake/Cascade Lake > + * idle states table. > + */ > +static void __init skx_idle_state_table_update(void) > +{ > + unsigned long long msr; > + > + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr); > + > + /* > + * 000b: C0/C1 (no package C-state support) > + * 001b: C2 > + * 010b: C6 (non-retention) > + * 011b: C6 (retention) > + * 111b: No Package C state limits. > + */ > + if ((msr & 0x7) < 2) { > + /* > + * Uses the CC6 + PC0 latency and 3 times of > + * latency for target_residency if the PC6 > + * is disabled in BIOS. This is consistent > + * with how intel_idle driver uses _CST > + * to set the target_residency. > + */ > + skx_cstates[2].exit_latency = 92; > + skx_cstates[2].target_residency = 276; > + } > +} > + > static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) > { > unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1; > @@ -1515,6 +1545,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) > case INTEL_FAM6_SKYLAKE: > sklh_idle_state_table_update(); > break; > + case INTEL_FAM6_SKYLAKE_X: > + skx_idle_state_table_update(); > + break; > } > > for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) { > -- Applied as 5.14 material with some edits in the subject and changelog. Thanks!
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index ec1b9d306ba6..e6c543b5ee1d 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -1484,6 +1484,36 @@ static void __init sklh_idle_state_table_update(void) skl_cstates[6].flags |= CPUIDLE_FLAG_UNUSABLE; /* C9-SKL */ } +/** + * skx_idle_state_table_update - Adjust the Sky Lake/Cascade Lake + * idle states table. + */ +static void __init skx_idle_state_table_update(void) +{ + unsigned long long msr; + + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr); + + /* + * 000b: C0/C1 (no package C-state support) + * 001b: C2 + * 010b: C6 (non-retention) + * 011b: C6 (retention) + * 111b: No Package C state limits. + */ + if ((msr & 0x7) < 2) { + /* + * Uses the CC6 + PC0 latency and 3 times of + * latency for target_residency if the PC6 + * is disabled in BIOS. This is consistent + * with how intel_idle driver uses _CST + * to set the target_residency. + */ + skx_cstates[2].exit_latency = 92; + skx_cstates[2].target_residency = 276; + } +} + static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) { unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1; @@ -1515,6 +1545,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) case INTEL_FAM6_SKYLAKE: sklh_idle_state_table_update(); break; + case INTEL_FAM6_SKYLAKE_X: + skx_idle_state_table_update(); + break; } for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
Currently cpuidle assumes worst-case C-state parameters, and so C6 is described with PC6 parameters, which is worst case for requesting CC6. When PC6 is enabled, this is appropriate. But if PC6 is disabled in BIOS, the exit latency and target_residency should be adjusted accordingly. Exit latency: Previously the C6 exit latency was measured when woken up from CC6/PC6. With PC6 disabled, the C6 exit latency should be CC6/PC0. Target residency: With PC6 disabled, idle duration within [CC6, PC6) would make the idle governor choose C1E over C6. This would cause low energy-efficiency. We should lower the bar to request C6 when PC6 is disabled. To fill this gap, check if PC6 is disabled in the BIOS in the MSR_PKG_CST_CONFIG_CONTROL(0xe2). If so, use CC6/PC0 parameters as the new exit latency. Meanwhile, update target_residency to 3 times of the new exit latency. This is consistent with how intel_idle driver uses _CST to calculate the target_residency. The consequence is that, the OS would be more offen to choose C6 over C1E when PC6 is disabled. This is reasonable because if the user is using C6, it implies that the user cares about energy, so choosing C6 more frequently is in accordance with user requirement. The new exit latency of CC6/PC0 92us was from wult[1] result on SKX, which was measured via NIC wakeup from 99.99th latency. Besides SKX, the CLX and CPX both have the same CPU model number. And since they have similar CC6 exit latency to SKX, 96us and 89us respectively, reuse the value of SKX. There is concern that if we should introduce a more generic solution rather than optimizing on each platforms. However consider the code complexity and different PC6 bit interpretation on different platforms, tune the code per platform seems to be an acceptable trade-off. [1] https://intel.github.io/wult/ Suggested-by: Len Brown <len.brown@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> --- v2: Simplify the commit log to not mention C3/PC3. (Artem) Confirm the exit latency on CLX and CPX.(Artem) --- drivers/idle/intel_idle.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)