Message ID | 5594c7d6756a47b473ceb6f48cc217458db32ab0.1607584435.git.viresh.kumar@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: topology: Cleanup init_amu_fie() a bit | expand |
On 10-12-20, 12:48, Viresh Kumar wrote: > Every time I have stumbled upon this routine, I get confused with the > way 'have_policy' is used and I have to dig in to understand why is it > so. > > Here is an attempt to make it easier to understand, and hopefully it is > an improvement. This is based on the logic that amu_fie_cpus will be > empty if cpufreq policy wasn't available for any CPU. > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> > --- > > Ionela, I think it would be even better to do this over this patch > > - /* > - * If none of the CPUs have cpufreq support, we only enable > - * the use of the AMU feature for FIE if all CPUs support AMU. > - * Otherwise, enable_policy_freq_counters has already enabled > - * policy cpus. > - */ > - if (cpumask_empty(amu_fie_cpus) && > - cpumask_equal(valid_cpus, cpu_present_mask)) > + /* Overwrite amu_fie_cpus if all CPUs support AMU */ > + if (cpumask_equal(valid_cpus, cpu_present_mask)) > cpumask_copy(amu_fie_cpus, cpu_present_mask); > > This will also take care of the case where the cpufreq policy isn't > there for a small group of CPUs, which do have AMUs enabled for them. > (This doesn't normally happen though). And on similar lines, this change as well as amu_fie_cpus must be set to all the CPUs and this check (and parameter to the routine) aren't required.. bool arch_freq_counters_available(const struct cpumask *cpus) { - return amu_freq_invariant() && - cpumask_subset(cpus, amu_fie_cpus); + return amu_freq_invariant(); }
Hi, On Thursday 10 Dec 2020 at 12:48:20 (+0530), Viresh Kumar wrote: > Every time I have stumbled upon this routine, I get confused with the > way 'have_policy' is used and I have to dig in to understand why is it > so. > I'm first of all replying to discuss the need of this policy analysis that enable_policy_freq_counters() does which results in the setting of have_policy. Basically, that's functions purpose is only to make sure that invariance at the level of the policy is consistent: either all CPUs in a policy support counters and counters will be used for the scale factor, or either none or only some support counters and therefore the default cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs in a policy. If we find that cpufreq policies are not present at all, we only enable counter use if all CPUs support them. This being said, there is a very important part of your patches in this set: + /* Disallow partial use of counters for frequency invariance */ + if (!cpumask_equal(amu_fie_cpus, cpu_present_mask)) + goto free_valid_mask; If this is agreed upon, there is a lot more that can be removed from the initialisation: enable_policy_freq_counters() can entirely go away plus all the checks around it. I completely agree that all of this will be more clear if we decided to "Disallow partial use of counters for frequency invariance", but I think we might have to add it back in again when systems with partial support for counters show up. I don't agree to not support these systems from the get go as it's likely that the first big.LITTLE systems will only have partial support for AMUs, but it's only an assumption at this point. I'm happy to hear the opinion of other arm64 folk about this. Many thanks for looking over the code, Ionela.
My intent was to keep the logical working of the code as is (in all the patches I have sent today), let me see if I broke that assumption somewhere unintentionally. On 10-12-20, 10:38, Ionela Voinescu wrote: > I'm first of all replying to discuss the need of this policy analysis > that enable_policy_freq_counters() does which results in the setting of > have_policy. > > Basically, that's functions purpose is only to make sure that invariance > at the level of the policy is consistent: either all CPUs in a policy > support counters and counters will be used for the scale factor, or > either none or only some support counters and therefore the default > cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs > in a policy. > > If we find that cpufreq policies are not present at all, we only enable > counter use if all CPUs support them. Right, and the same must be true after this patch. > This being said, there is a very important part of your patches in this > set: > > + /* Disallow partial use of counters for frequency invariance */ > + if (!cpumask_equal(amu_fie_cpus, cpu_present_mask)) > + goto free_valid_mask; The current code already has this check and so this isn't something new. > If this is agreed upon, there is a lot more that can be removed from the > initialisation: enable_policy_freq_counters() can entirely go away plus > all the checks around it. > > I completely agree that all of this will be more clear if we decided to > "Disallow partial use of counters for frequency invariance", but I think > we might have to add it back in again when systems with partial support > for counters show up. > > I don't agree to not support these systems from the get go as it's > likely that the first big.LITTLE systems will only have partial support > for AMUs, but it's only an assumption at this point. Here is how things move AFAICT: - We set valid_cpus 1-by-1 with all CPUs that have counters available. - Once all CPUs of a policy are part of valid_cpus, we update amu_fie_cpus with that policies related_cpus. - Once we are done with all CPUs, we check if cpufreq policy was there for any of the CPUs, if not, we update amu_fie_cpus if all present CPUs are part of valid_cpus. - At this point we call static_branch_enable() if amu_fie_cpus is not empty (i.e. even if it is partially set). - But right after that we call static_branch_disable() if we aren't invariant (call to topology_scale_freq_invariant()), and this will happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So partial amu support is already disallowed, without cpufreq. Where am I wrong ? (And I know there is a high possibility of that happening here :) )..
On Thursday 10 Dec 2020 at 16:25:06 (+0530), Viresh Kumar wrote: > My intent was to keep the logical working of the code as is (in all > the patches I have sent today), let me see if I broke that assumption > somewhere unintentionally. > Okay, I replied separately in regards to the piece of code that confused things. > On 10-12-20, 10:38, Ionela Voinescu wrote: > > I'm first of all replying to discuss the need of this policy analysis > > that enable_policy_freq_counters() does which results in the setting of > > have_policy. > > > > Basically, that's functions purpose is only to make sure that invariance > > at the level of the policy is consistent: either all CPUs in a policy > > support counters and counters will be used for the scale factor, or > > either none or only some support counters and therefore the default > > cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs > > in a policy. > > > > If we find that cpufreq policies are not present at all, we only enable > > counter use if all CPUs support them. > > Right, and the same must be true after this patch. > > > This being said, there is a very important part of your patches in this > > set: > > > > + /* Disallow partial use of counters for frequency invariance */ > > + if (!cpumask_equal(amu_fie_cpus, cpu_present_mask)) > > + goto free_valid_mask; > > The current code already has this check and so this isn't something > new. > Replied separately for the patch. > > If this is agreed upon, there is a lot more that can be removed from the > > initialisation: enable_policy_freq_counters() can entirely go away plus > > all the checks around it. > > > > I completely agree that all of this will be more clear if we decided to > > "Disallow partial use of counters for frequency invariance", but I think > > we might have to add it back in again when systems with partial support > > for counters show up. > > > > I don't agree to not support these systems from the get go as it's > > likely that the first big.LITTLE systems will only have partial support > > for AMUs, but it's only an assumption at this point. > > Here is how things move AFAICT: > > - We set valid_cpus 1-by-1 with all CPUs that have counters available. > Yes. > - Once all CPUs of a policy are part of valid_cpus, we update > amu_fie_cpus with that policies related_cpus. > Yes. > - Once we are done with all CPUs, we check if cpufreq policy was there > for any of the CPUs, if not, we update amu_fie_cpus if all present > CPUs are part of valid_cpus. > Yes. > - At this point we call static_branch_enable() if amu_fie_cpus is not > empty (i.e. even if it is partially set). > Yes. > - But right after that we call static_branch_disable() if we aren't > invariant (call to topology_scale_freq_invariant()), and this will > happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So > partial amu support is already disallowed, without cpufreq. > This is the point that needs clarification: topology_scale_freq_invariant()) = cpufreq_supports_freq_invariance() || arch_freq_counters_available(cpu_online_mask); This checks if the full system is invariant. The possible scenarios are: - All online CPUs support AMUs - arch_freq_counters_available() will return true -> topology_scale_freq_invariant() will return true. - None of the CPUs support AMUs, or part of the CPUs support AMUs - the system is invariant only if cpufreq is invariant (dependent on whether the driver implements the proper callbacks that results in calling arch_set_freq_scale() in cpufreq core. - Either cpufreq does not support invariance or we don't have AMU support on all CPUs -> the system is not invariant so we disable the AMU static key that guards the calls to topology_scale_freq_tick() - we would not want to set a scale factor for only a part of the CPUs. So whether were are or are not invariant does not depend only on the AMU presence, but also on the cpufreq support for invariance. We have to disable invariance altogether (including the AMU guarding static key) if the system is not invariant (it no all CPUs have means to provide the scale). Let me know if it makes sense. > Where am I wrong ? (And I know there is a high possibility of that > happening here :) ).. > No worries. I myself sometimes wonder if I missed something. I will be happy in a few years when we can remove this and use counters for all CPUs. Thanks, Ionela. > -- > viresh
On 10-12-20, 11:29, Ionela Voinescu wrote: > On Thursday 10 Dec 2020 at 16:25:06 (+0530), Viresh Kumar wrote: > > - But right after that we call static_branch_disable() if we aren't > > invariant (call to topology_scale_freq_invariant()), and this will > > happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So > > partial amu support is already disallowed, without cpufreq. > > > > This is the point that needs clarification: > > topology_scale_freq_invariant()) = cpufreq_supports_freq_invariance() || > arch_freq_counters_available(cpu_online_mask); > > This checks if the full system is invariant. > > The possible scenarios are: > > - All online CPUs support AMUs - arch_freq_counters_available() will > return true -> topology_scale_freq_invariant() will return true. > > - None of the CPUs support AMUs, or part of the CPUs support AMUs - the > system is invariant only if cpufreq is invariant (dependent on > whether the driver implements the proper callbacks that results in > calling arch_set_freq_scale() in cpufreq core. > > - Either cpufreq does not support invariance or we don't have AMU > support on all CPUs -> the system is not invariant so we disable > the AMU static key that guards the calls to > topology_scale_freq_tick() - we would not want to set a scale factor > for only a part of the CPUs. > > So whether were are or are not invariant does not depend only on the AMU > presence, but also on the cpufreq support for invariance. We have to > disable invariance altogether (including the AMU guarding static key) > if the system is not invariant (it no all CPUs have means to provide the > scale). Okay, I think I mis-assumed that amu_fie_cpus will get set by enable_policy_freq_counters() even for CPUs where AMU support isn't there, it won't be though. Having said that, this patch, along with the minor suggestion in the commit log, still stands fine, right ? The other patch which I sent is probably incorrect due to the above assumption I had.
Hey, On Thursday 10 Dec 2020 at 12:48:20 (+0530), Viresh Kumar wrote: > Every time I have stumbled upon this routine, I get confused with the > way 'have_policy' is used and I have to dig in to understand why is it > so. > > Here is an attempt to make it easier to understand, and hopefully it is > an improvement. This is based on the logic that amu_fie_cpus will be > empty if cpufreq policy wasn't available for any CPU. > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> > --- > > Ionela, I think it would be even better to do this over this patch > > - /* > - * If none of the CPUs have cpufreq support, we only enable > - * the use of the AMU feature for FIE if all CPUs support AMU. > - * Otherwise, enable_policy_freq_counters has already enabled > - * policy cpus. > - */ > - if (cpumask_empty(amu_fie_cpus) && > - cpumask_equal(valid_cpus, cpu_present_mask)) > + /* Overwrite amu_fie_cpus if all CPUs support AMU */ > + if (cpumask_equal(valid_cpus, cpu_present_mask)) > cpumask_copy(amu_fie_cpus, cpu_present_mask); > Yes, I was just about to suggest this, reading the patch below. > This will also take care of the case where the cpufreq policy isn't > there for a small group of CPUs, which do have AMUs enabled for them. > (This doesn't normally happen though). > > --- > arch/arm64/kernel/topology.c | 16 +++++++--------- > 1 file changed, 7 insertions(+), 9 deletions(-) > > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c > index f6faa697e83e..7f7d8de325b6 100644 > --- a/arch/arm64/kernel/topology.c > +++ b/arch/arm64/kernel/topology.c > @@ -199,14 +199,14 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) > return 0; > } > > -static inline bool > +static inline void > enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus) > { > struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); > > if (!policy) { > pr_debug("CPU%d: No cpufreq policy found.\n", cpu); > - return false; > + return; > } > > if (cpumask_subset(policy->related_cpus, valid_cpus)) > @@ -214,8 +214,6 @@ enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus) > amu_fie_cpus); > > cpufreq_cpu_put(policy); > - > - return true; > } > > static DEFINE_STATIC_KEY_FALSE(amu_fie_key); > @@ -225,7 +223,6 @@ static int __init init_amu_fie(void) > { > bool invariance_status = topology_scale_freq_invariant(); > cpumask_var_t valid_cpus; > - bool have_policy = false; > int ret = 0; > int cpu; > > @@ -245,17 +242,18 @@ static int __init init_amu_fie(void) > continue; > > cpumask_set_cpu(cpu, valid_cpus); > - have_policy |= enable_policy_freq_counters(cpu, valid_cpus); > + enable_policy_freq_counters(cpu, valid_cpus); > } > > /* > - * If we are not restricted by cpufreq policies, we only enable > + * If none of the CPUs have cpufreq support, we only enable > * the use of the AMU feature for FIE if all CPUs support AMU. > * Otherwise, enable_policy_freq_counters has already enabled > * policy cpus. > */ > - if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask)) > - cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus); > + if (cpumask_empty(amu_fie_cpus) && > + cpumask_equal(valid_cpus, cpu_present_mask)) > + cpumask_copy(amu_fie_cpus, cpu_present_mask); > Yes, if you really don't like the have_policy variable, I would go for your suggestion in the commit message for this condition and the removal of the comment. In the form of the comment here it creates more confusion, but your suggestion in the commit message hides all involvement of policies in enable_policy_freq_counters(). Thanks, Ionela. > if (!cpumask_empty(amu_fie_cpus)) { > pr_info("CPUs[%*pbl]: counters will be used for FIE.", > -- > 2.25.0.rc1.19.g042ed3e048af >
On Thursday 10 Dec 2020 at 18:04:39 (+0530), Viresh Kumar wrote: > On 10-12-20, 11:29, Ionela Voinescu wrote: > > On Thursday 10 Dec 2020 at 16:25:06 (+0530), Viresh Kumar wrote: > > > - But right after that we call static_branch_disable() if we aren't > > > invariant (call to topology_scale_freq_invariant()), and this will > > > happen if amu_fie_cpus doesn't have all the CPUs there. Isn't it? So > > > partial amu support is already disallowed, without cpufreq. > > > > > > > This is the point that needs clarification: > > > > topology_scale_freq_invariant()) = cpufreq_supports_freq_invariance() || > > arch_freq_counters_available(cpu_online_mask); > > > > This checks if the full system is invariant. > > > > The possible scenarios are: > > > > - All online CPUs support AMUs - arch_freq_counters_available() will > > return true -> topology_scale_freq_invariant() will return true. > > > > - None of the CPUs support AMUs, or part of the CPUs support AMUs - the > > system is invariant only if cpufreq is invariant (dependent on > > whether the driver implements the proper callbacks that results in > > calling arch_set_freq_scale() in cpufreq core. > > > > - Either cpufreq does not support invariance or we don't have AMU > > support on all CPUs -> the system is not invariant so we disable > > the AMU static key that guards the calls to > > topology_scale_freq_tick() - we would not want to set a scale factor > > for only a part of the CPUs. > > > > So whether were are or are not invariant does not depend only on the AMU > > presence, but also on the cpufreq support for invariance. We have to > > disable invariance altogether (including the AMU guarding static key) > > if the system is not invariant (it no all CPUs have means to provide the > > scale). > > Okay, I think I mis-assumed that amu_fie_cpus will get set by > enable_policy_freq_counters() even for CPUs where AMU support isn't > there, it won't be though. > > Having said that, this patch, along with the minor suggestion in the > commit log, still stands fine, right ? The other patch which I sent is > probably incorrect due to the above assumption I had. > Yes, this one is good, although I would vote for the commit message implementation. I'll wait for v2 for reviewed-by, in case you want to push something for the second patch in the same series. Ionela. > -- > viresh
On 10-12-20, 10:38, Ionela Voinescu wrote: > Basically, that's functions purpose is only to make sure that invariance > at the level of the policy is consistent: either all CPUs in a policy > support counters and counters will be used for the scale factor, or > either none or only some support counters and therefore the default > cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs > in a policy. Why is it important to have this thing at policy level ? If we are okay with only one policy having AMUs and not the other one, then what about only some CPUs of both the policies having it. Does it break anything ?
Hi, Sorry, I missed this. On Friday 11 Dec 2020 at 16:35:55 (+0530), Viresh Kumar wrote: > On 10-12-20, 10:38, Ionela Voinescu wrote: > > Basically, that's functions purpose is only to make sure that invariance > > at the level of the policy is consistent: either all CPUs in a policy > > support counters and counters will be used for the scale factor, or > > either none or only some support counters and therefore the default > > cpufreq implementation will be used (arch_set_freq_scale()) for all CPUs > > in a policy. > > Why is it important to have this thing at policy level ? If we are > okay with only one policy having AMUs and not the other one, then what > about only some CPUs of both the policies having it. Does it break > anything ? > First of all, in order for a single CPU in a policy to use AMUs for FIE, when the others do not support AMUs, we'd have to modify arch_set_freq_scale() which otherwise would uniformly set its own computed scale factor for all related CPUs. Beyond this, it's very likely that CPUs in the same policy have the same uarch and therefore either all or none in the policy would support AMUs. The check here is just to make sure of that and then ensure that arch_set_freq_scale() and arch_scale_freq_tick() do not clash with each other. Beyond this, it's difficult to say how important it is to have the same scale for all CPUs in a policy. AMUs would give you a scale based on an average frequency between ticks, and more importantly it would be based on the actual core frequency, not on what software requested. Cpufreq would give you the scale obtained using the frequency that the OS requested. Therefore, the two scale values could end up being quite different, and therefore can result in quite different utilization values for CPUs doing the same work. An alternative would be to apply the AMU computed scale to all CPUs although not all CPUs in a policy might support AMUs. But given that this scenario is unlikely, the added hassle in arch_scale_freq_tick() which would involve getting information on related CPUs from policies was not worth it, in my opionion. Thanks, Ionela. > -- > viresh
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index f6faa697e83e..7f7d8de325b6 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -199,14 +199,14 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) return 0; } -static inline bool +static inline void enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus) { struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); if (!policy) { pr_debug("CPU%d: No cpufreq policy found.\n", cpu); - return false; + return; } if (cpumask_subset(policy->related_cpus, valid_cpus)) @@ -214,8 +214,6 @@ enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus) amu_fie_cpus); cpufreq_cpu_put(policy); - - return true; } static DEFINE_STATIC_KEY_FALSE(amu_fie_key); @@ -225,7 +223,6 @@ static int __init init_amu_fie(void) { bool invariance_status = topology_scale_freq_invariant(); cpumask_var_t valid_cpus; - bool have_policy = false; int ret = 0; int cpu; @@ -245,17 +242,18 @@ static int __init init_amu_fie(void) continue; cpumask_set_cpu(cpu, valid_cpus); - have_policy |= enable_policy_freq_counters(cpu, valid_cpus); + enable_policy_freq_counters(cpu, valid_cpus); } /* - * If we are not restricted by cpufreq policies, we only enable + * If none of the CPUs have cpufreq support, we only enable * the use of the AMU feature for FIE if all CPUs support AMU. * Otherwise, enable_policy_freq_counters has already enabled * policy cpus. */ - if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask)) - cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus); + if (cpumask_empty(amu_fie_cpus) && + cpumask_equal(valid_cpus, cpu_present_mask)) + cpumask_copy(amu_fie_cpus, cpu_present_mask); if (!cpumask_empty(amu_fie_cpus)) { pr_info("CPUs[%*pbl]: counters will be used for FIE.",
Every time I have stumbled upon this routine, I get confused with the way 'have_policy' is used and I have to dig in to understand why is it so. Here is an attempt to make it easier to understand, and hopefully it is an improvement. This is based on the logic that amu_fie_cpus will be empty if cpufreq policy wasn't available for any CPU. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> --- Ionela, I think it would be even better to do this over this patch - /* - * If none of the CPUs have cpufreq support, we only enable - * the use of the AMU feature for FIE if all CPUs support AMU. - * Otherwise, enable_policy_freq_counters has already enabled - * policy cpus. - */ - if (cpumask_empty(amu_fie_cpus) && - cpumask_equal(valid_cpus, cpu_present_mask)) + /* Overwrite amu_fie_cpus if all CPUs support AMU */ + if (cpumask_equal(valid_cpus, cpu_present_mask)) cpumask_copy(amu_fie_cpus, cpu_present_mask); This will also take care of the case where the cpufreq policy isn't there for a small group of CPUs, which do have AMUs enabled for them. (This doesn't normally happen though). --- arch/arm64/kernel/topology.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)