[0/1] cpufreq: Move arch_freq_get_on_cpu to cpuinfo_cur_freq

Message ID 20240603081331.3829278-1-beata.michalska@arm.com (mailing list archive)

Beata Michalska June 3, 2024, 8:13 a.m. UTC
Hi All,

The CPUFreq core exposes a number of attributes via sysfs, with cpuinfo_cur_freq
and scaling_cur_freq being two of them. The first one, cpuinfo_cur_freq, is
supposed to provide the current frequency of a given CPU, with the expectation
that it is the frequency the hardware actually runs at. The second one,
scaling_cur_freq, is likewise supposed to show the current frequency, though
this time it might not reflect the frequency as seen by the hardware, as it
will, in most cases, provide the last frequency requested.
For some architectures though, scaling_cur_freq might be given higher
precision, still with no guarantee of matching the exact CPU frequency.
This adds a bit of confusion, as it deviates from the attribute's
original meaning: last requested frequency.
Currently, the higher precision of the CPU frequency provided via
scaling_cur_freq for x86 is achieved by utilizing the APERF/MPERF registers,
based on whose values an average frequency (over a 20ms time window) is
determined. With the added support for AMU-based FIE for ARM, similar
functionality can likewise be exposed through arch_freq_get_on_cpu ([1]).
So far so good.
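
To make the counter-based averaging concrete, here is a minimal sketch of the
arithmetic, assuming two counter deltas sampled over the same window: one that
ticks at the actual core clock (APERF-like) and one that ticks at a fixed
reference clock (MPERF-like). The helper name avg_freq_khz and its parameters
are hypothetical and are not taken from the kernel sources.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not a kernel API).
 *
 * ref_khz:     frequency of the fixed reference clock, in kHz
 * delta_core:  increments of the actual-clock counter over the window
 * delta_ref:   increments of the reference-clock counter over the window
 *
 * Returns the average frequency over the window in kHz, or 0 when the
 * reference counter did not advance (no usable sample).
 */
static uint64_t avg_freq_khz(uint64_t ref_khz, uint64_t delta_core,
                             uint64_t delta_ref)
{
	if (delta_ref == 0)
		return 0;

	/* Multiply before dividing to keep precision; assumes no overflow. */
	return ref_khz * delta_core / delta_ref;
}
```

With a 2 GHz reference clock and a core counter that advanced three quarters
as far as the reference counter, the helper reports an average of 1.5 GHz.
The result is an average over the sampling window, not an instantaneous
reading.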
Things get slightly complicated/confusing though when attempting to advertise
that via the related CPUFreq sysfs attributes. Relying on hardware counters to
determine the CPU frequency is, in fact, hardware feedback and as such,
intuitively, should be wired to cpuinfo_cur_freq, despite the fact that
both implementations (of arch_freq_get_on_cpu) provide an average rather than
an instantaneous view of the current frequency. For that to happen though,
arch_freq_get_on_cpu (and hw feedback for that matter) should be removed from
scaling_cur_freq. Having both rely on arch_freq_get_on_cpu is a no-go, as
there should be a way to distinguish current vs expected frequency.
Now, as straightforward as that may seem, the change might cause some
disruption for user-space tools that might be relying on scaling_cur_freq
to provide the extra precision that APERF/MPERF bring. For those,
scaling_cur_freq may retain its use of arch_freq_get_on_cpu in cases where
cpuinfo_cur_freq is not exposed.
The change seems unavoidable though, one way or another.
Any ideas on how to handle it better are more than welcome.
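
For illustration, the split between the two attributes described above could
be modelled roughly as follows. This is a user-space sketch under stated
assumptions, not the code from the patch: struct freq_model, its fields and
both show_* helpers are hypothetical, and the fallback of cpuinfo_cur_freq to
a driver-provided value when no counter feedback is available is an assumption
made only to keep the model complete.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's state; not struct cpufreq_policy. */
struct freq_model {
	unsigned int last_requested_khz; /* last frequency requested */
	unsigned int hw_feedback_khz;    /* counter-based average, 0 if none */
	unsigned int driver_get_khz;     /* driver read-back, 0 if none */
	bool cpuinfo_exposed;            /* is cpuinfo_cur_freq exposed? */
};

/* cpuinfo_cur_freq: prefer hardware feedback (assumed driver fallback). */
static unsigned int show_cpuinfo_cur_freq(const struct freq_model *m)
{
	if (m->hw_feedback_khz)
		return m->hw_feedback_khz;
	return m->driver_get_khz;
}

/*
 * scaling_cur_freq: last requested frequency, except that hardware
 * feedback is still used when cpuinfo_cur_freq is not exposed.
 */
static unsigned int show_scaling_cur_freq(const struct freq_model *m)
{
	if (!m->cpuinfo_exposed && m->hw_feedback_khz)
		return m->hw_feedback_khz;
	return m->last_requested_khz;
}
```

In this model, a tool that reads both attributes can tell the expected
frequency from the observed one whenever cpuinfo_cur_freq is exposed, and
still gets the counter-based value from scaling_cur_freq when it is not.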

Related discussions:
[1] https://lore.kernel.org/all/20240417093848.1555462-1-beata.michalska@arm.com/
    (updated version on the way)
[2] https://lore.kernel.org/all/ZIwx+4zVzgKGLcS3@e120325.cambridge.arm.com/
[3] https://lore.kernel.org/all/20240520091853.tbxzoolqh5tgzg7t@vireshk-i7/

Beata Michalska (1):
  cpufreq: Rewire arch specific feedback for cpuinfo/scaling_cur_freq

 drivers/cpufreq/cpufreq.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)