Message ID | 20200701090751.7543-5-ionela.voinescu@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | cpufreq: improve frequency invariance support | expand |
On 01-07-20, 10:07, Ionela Voinescu wrote: > In the majority of cases, the index argument to cpufreq's target_index() > is meant to identify the frequency that is requested from the hardware, > according to the frequency table: policy->freq_table[index].frequency. > > After successfully requesting it from the hardware, this value, together > with the maximum hardware frequency (policy->cpuinfo.max_freq) are used > as arguments to arch_set_freq_scale(), in order to set the task scheduler > frequency scale factor. This is a normalized indication of a CPU's > current performance. > > But for the vexpress-spc-cpufreq driver, when big.LITTLE switching [1] > is enabled, there are three issues with using the above information for > setting the FI scale factor: > > - cur_freq: policy->freq_table[index].frequency is not the frequency > requested from the hardware. ve_spc_cpufreq_set_rate() will convert > from this virtual frequency to an actual frequency, which is then > requested from the hardware. For the A7 cluster, the virtual frequency > is half the actual frequency. The use of the virtual policy->freq_table > frequency results in an incorrect FI scale factor. > > - max_freq: policy->cpuinfo.max_freq does not correctly identify the > maximum frequency of the physical cluster. This value identifies the > maximum frequency achievable by the big-LITTLE pair, that is the > maximum frequency of the big CPU. But when the LITTLE CPU in the group > is used, the hardware maximum frquency passed to arch_set_freq_scale() > is incorrect. > > - missing a scale factor update: when switching clusters, the driver > recalculates the frequency of the old clock domain based on the > requests of the remaining CPUs in the domain and asks for a clock > change. But this does not result in an update in the scale factor. > > Therefore, introduce a local function bLs_set_sched_freq_scale() that > helps call arch_set_freq_scale() with correct information for the > is_bL_switching_enabled() case, while maintaining the old, more > efficient, call site of arch_set_freq_scale() for when cluster > switching is disabled. > > Also, because of these requirements in computing the scale factor, this > driver is the only one that maintains custom support for FI, which is > marked by the presence of the CPUFREQ_CUSTOM_SET_FREQ_SCALE flag. > > [1] https://lwn.net/Articles/481055/ > > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> > Cc: Viresh Kumar <viresh.kumar@linaro.org> > Cc: Sudeep Holla <sudeep.holla@arm.com> > Cc: Rafael J. Wysocki <rjw@rjwysocki.net> > Cc: Liviu Dudau <liviu.dudau@arm.com> > --- > drivers/cpufreq/vexpress-spc-cpufreq.c | 23 ++++++++++++++++++++++- > 1 file changed, 22 insertions(+), 1 deletion(-) Is there anyone who cares for this driver and EAS ? I will just skip doing the FIE thing here and mark it skipped.
On Wednesday 01 Jul 2020 at 16:16:19 (+0530), Viresh Kumar wrote: > On 01-07-20, 10:07, Ionela Voinescu wrote: > > In the majority of cases, the index argument to cpufreq's target_index() > > is meant to identify the frequency that is requested from the hardware, > > according to the frequency table: policy->freq_table[index].frequency. > > > > After successfully requesting it from the hardware, this value, together > > with the maximum hardware frequency (policy->cpuinfo.max_freq) are used > > as arguments to arch_set_freq_scale(), in order to set the task scheduler > > frequency scale factor. This is a normalized indication of a CPU's > > current performance. > > > > But for the vexpress-spc-cpufreq driver, when big.LITTLE switching [1] > > is enabled, there are three issues with using the above information for > > setting the FI scale factor: > > > > - cur_freq: policy->freq_table[index].frequency is not the frequency > > requested from the hardware. ve_spc_cpufreq_set_rate() will convert > > from this virtual frequency to an actual frequency, which is then > > requested from the hardware. For the A7 cluster, the virtual frequency > > is half the actual frequency. The use of the virtual policy->freq_table > > frequency results in an incorrect FI scale factor. > > > > - max_freq: policy->cpuinfo.max_freq does not correctly identify the > > maximum frequency of the physical cluster. This value identifies the > > maximum frequency achievable by the big-LITTLE pair, that is the > > maximum frequency of the big CPU. But when the LITTLE CPU in the group > > is used, the hardware maximum frquency passed to arch_set_freq_scale() > > is incorrect. > > > > - missing a scale factor update: when switching clusters, the driver > > recalculates the frequency of the old clock domain based on the > > requests of the remaining CPUs in the domain and asks for a clock > > change. But this does not result in an update in the scale factor. > > > > Therefore, introduce a local function bLs_set_sched_freq_scale() that > > helps call arch_set_freq_scale() with correct information for the > > is_bL_switching_enabled() case, while maintaining the old, more > > efficient, call site of arch_set_freq_scale() for when cluster > > switching is disabled. > > > > Also, because of these requirements in computing the scale factor, this > > driver is the only one that maintains custom support for FI, which is > > marked by the presence of the CPUFREQ_CUSTOM_SET_FREQ_SCALE flag. > > > > [1] https://lwn.net/Articles/481055/ > > > > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> > > Cc: Viresh Kumar <viresh.kumar@linaro.org> > > Cc: Sudeep Holla <sudeep.holla@arm.com> > > Cc: Rafael J. Wysocki <rjw@rjwysocki.net> > > Cc: Liviu Dudau <liviu.dudau@arm.com> > > --- > > drivers/cpufreq/vexpress-spc-cpufreq.c | 23 ++++++++++++++++++++++- > > 1 file changed, 22 insertions(+), 1 deletion(-) > > Is there anyone who cares for this driver and EAS ? I will just skip doing the > FIE thing here and mark it skipped. > That is a good question. The vexpress driver is still used for TC2, but I don't know of any users of this bL switcher functionality that's part of the driver. I think there were a few people wondering recently if it's still used [1]. If we disconsider the bL switcher functionality, then the vexpress driver itself gets in line with the other drivers supported by this series. Therefore there would not be a flag set needed here. Also, to maintain current functionality, we would not need to introduce a flag at all. But, the frequency invariance fix is also useful for schedutil to better select a frequency based on invariant utilization. So if we independently decide having a flag like the one introduced at 1/8 is useful, I would recommend to consider this patch, as it does fix some current functionality in the kernel (whether we can determine if it's used much or not). Therefore, IMO, if it's not used any more it should be removed, but if it's kept it should be fixed. [1] https://lore.kernel.org/linux-arm-kernel/20200624195811.435857-8-maz@kernel.org/ Thanks, Ionela. > -- > viresh
On 01-07-20, 15:07, Ionela Voinescu wrote: > On Wednesday 01 Jul 2020 at 16:16:19 (+0530), Viresh Kumar wrote: > > Is there anyone who cares for this driver and EAS ? I will just skip doing the > > FIE thing here and mark it skipped. > > That is a good question. The vexpress driver is still used for TC2, but > I don't know of any users of this bL switcher functionality that's part > of the driver. I think there were a few people wondering recently if > it's still used [1]. Even if it is used by some, there is no need, I believe, to enable freq-invariance for it, which wasn't enabled until now. And considering that we are going to enable the flag only for the interested parties now, as from the discussion on 1/8, this shouldn't be required.
Hi, On Thursday 02 Jul 2020 at 08:35:51 (+0530), Viresh Kumar wrote: > On 01-07-20, 15:07, Ionela Voinescu wrote: > > On Wednesday 01 Jul 2020 at 16:16:19 (+0530), Viresh Kumar wrote: > > > Is there anyone who cares for this driver and EAS ? I will just skip doing the > > > FIE thing here and mark it skipped. > > > > That is a good question. The vexpress driver is still used for TC2, but > > I don't know of any users of this bL switcher functionality that's part > > of the driver. I think there were a few people wondering recently if > > it's still used [1]. > > Even if it is used by some, there is no need, I believe, to enable > freq-invariance for it, which wasn't enabled until now. > It was enabled until now, but it was partially broken. If you look over the driver you'll see arch_set_freq_scale() being called for both is_bL_switching_enabled() and for when it's not [1]. For !is_bL_switching_enabled() this is fine. But for is_bL_switching_enabled(), it is broken as described in 4/8. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/cpufreq/vexpress-spc-cpufreq.c?h=v5.8-rc3#n203 > And considering that we are going to enable the flag only for the > interested parties now, as from the discussion on 1/8, this shouldn't > be required. > If we just don't want frequency invariance for is_bL_switching_enabled(), I can just guard the setting of the flag suggested by Rafael at 1/8 by !CONFIG_BL_SWITCHER. I'll proceed to do that and remove the fix at 4/8. Many thanks! Ionela. > -- > viresh
On 02-07-20, 12:41, Ionela Voinescu wrote: > It was enabled until now, but it was partially broken. If you look over > the driver you'll see arch_set_freq_scale() being called for both > is_bL_switching_enabled() and for when it's not [1]. I missed that completely, it was indeed added here: commit 518accf20629 ("cpufreq: arm_big_little: invoke frequency-invariance setter function") and so this patch or a version of it is required here. > If we just don't want frequency invariance for > is_bL_switching_enabled(), I can just guard the setting of the flag > suggested by Rafael at 1/8 by !CONFIG_BL_SWITCHER. > > I'll proceed to do that and remove the fix at 4/8. I think it would be better to do that and avoid any complicate code unnecessarily here.
diff --git a/drivers/cpufreq/vexpress-spc-cpufreq.c b/drivers/cpufreq/vexpress-spc-cpufreq.c index e0a1a3367ec5..f2caf67d4050 100644 --- a/drivers/cpufreq/vexpress-spc-cpufreq.c +++ b/drivers/cpufreq/vexpress-spc-cpufreq.c @@ -55,6 +55,8 @@ static atomic_t cluster_usage[MAX_CLUSTERS + 1]; static unsigned int clk_big_min; /* (Big) clock frequencies */ static unsigned int clk_little_max; /* Maximum clock frequency (Little) */ +static inline u32 get_table_max(struct cpufreq_frequency_table *table); + static DEFINE_PER_CPU(unsigned int, physical_cluster); static DEFINE_PER_CPU(unsigned int, cpu_last_req_freq); @@ -87,6 +89,18 @@ static unsigned int find_cluster_maxfreq(int cluster) return max_freq; } +static void bLs_set_sched_freq_scale(int cluster, unsigned long cur_freq) +{ + unsigned long max_freq = get_table_max(freq_table[cluster]); + int j; + + for_each_online_cpu(j) { + if (cluster == per_cpu(physical_cluster, j)) + arch_set_freq_scale(get_cpu_mask(j), cur_freq, + max_freq); + } +} + static unsigned int clk_get_cpu_rate(unsigned int cpu) { u32 cur_cluster = per_cpu(physical_cluster, cpu); @@ -154,6 +168,9 @@ ve_spc_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) mutex_unlock(&cluster_lock[new_cluster]); + if (bLs) + bLs_set_sched_freq_scale(new_cluster, new_rate); + /* Recalc freq for old cluster when switching clusters */ if (old_cluster != new_cluster) { /* Switch cluster */ @@ -170,7 +187,11 @@ ve_spc_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) pr_err("%s: clk_set_rate failed: %d, old cluster: %d\n", __func__, ret, old_cluster); } + mutex_unlock(&cluster_lock[old_cluster]); + + if (new_rate) + bLs_set_sched_freq_scale(old_cluster, new_rate); } return 0; @@ -200,7 +221,7 @@ static int ve_spc_cpufreq_set_target(struct cpufreq_policy *policy, ret = ve_spc_cpufreq_set_rate(cpu, actual_cluster, new_cluster, freqs_new); - if (!ret) { + if (!is_bL_switching_enabled() && !ret) { arch_set_freq_scale(policy->related_cpus, freqs_new, policy->cpuinfo.max_freq); }
In the majority of cases, the index argument to cpufreq's target_index() is meant to identify the frequency that is requested from the hardware, according to the frequency table: policy->freq_table[index].frequency. After successfully requesting it from the hardware, this value, together with the maximum hardware frequency (policy->cpuinfo.max_freq) are used as arguments to arch_set_freq_scale(), in order to set the task scheduler frequency scale factor. This is a normalized indication of a CPU's current performance. But for the vexpress-spc-cpufreq driver, when big.LITTLE switching [1] is enabled, there are three issues with using the above information for setting the FI scale factor: - cur_freq: policy->freq_table[index].frequency is not the frequency requested from the hardware. ve_spc_cpufreq_set_rate() will convert from this virtual frequency to an actual frequency, which is then requested from the hardware. For the A7 cluster, the virtual frequency is half the actual frequency. The use of the virtual policy->freq_table frequency results in an incorrect FI scale factor. - max_freq: policy->cpuinfo.max_freq does not correctly identify the maximum frequency of the physical cluster. This value identifies the maximum frequency achievable by the big-LITTLE pair, that is the maximum frequency of the big CPU. But when the LITTLE CPU in the group is used, the hardware maximum frquency passed to arch_set_freq_scale() is incorrect. - missing a scale factor update: when switching clusters, the driver recalculates the frequency of the old clock domain based on the requests of the remaining CPUs in the domain and asks for a clock change. But this does not result in an update in the scale factor. Therefore, introduce a local function bLs_set_sched_freq_scale() that helps call arch_set_freq_scale() with correct information for the is_bL_switching_enabled() case, while maintaining the old, more efficient, call site of arch_set_freq_scale() for when cluster switching is disabled. Also, because of these requirements in computing the scale factor, this driver is the only one that maintains custom support for FI, which is marked by the presence of the CPUFREQ_CUSTOM_SET_FREQ_SCALE flag. [1] https://lwn.net/Articles/481055/ Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Liviu Dudau <liviu.dudau@arm.com> --- drivers/cpufreq/vexpress-spc-cpufreq.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)