Message ID | c222deda79ad334ff4edcbd49ddda248685c4ee1.1572395990.git.leonard.crestez@nxp.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
Series | cpufreq: Add user_min/max_freq | expand |
On Wed, Oct 30, 2019 at 02:41:49AM +0200, Leonard Crestez wrote: > Current values in scaling_min_freq and scaling_max freq can change on > the fly due to event such as thermal monitoring. This behavior is > confusing for userspace and because once an userspace limit is written > to scaling_min/max_freq it is not possible to read it back. Yes, this is indeed confusing. > Introduce two new user_min/max_freq files which only contain the limits > imposed by userspace, without any aggregation. > > Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> > > --- > This was motivated by these discussions: > > * https://patchwork.kernel.org/patch/11078475/#22805379 > * https://patchwork.kernel.org/patch/11171817/#22917099 > > Those threads are about devfreq but same issue applies to cpufreq as > well. Let me know if this solution seems reasonable? > > An alternative would be to make scaling_min/max_freq always read back > the configured value and introduce new effective_min/max_freq files for > the aggregate values. That might break existing users (though I'm not > familiar with any). It seems there isn't really a perfect solution :( This change creates a set of new, consistent attributes, but since we can't make the current min/max attributes read-only userspace will keep using them forever. It's somewhat doubtful that userspace can do anything useful with the current min/max values, since they might change just after being read. Anything besides monitoring the limits (approximately) would be inherently broken. Having min/max return the configured value would be the expected behavior (IMO), but obviously I also don't know for sure if there are userspace components relying on the current behavior.
On Wednesday, October 30, 2019 1:41:49 AM CET Leonard Crestez wrote: > Current values in scaling_min_freq and scaling_max freq can change on > the fly due to event such as thermal monitoring. Which is intentional. > This behavior is confusing for userspace and because once an userspace > limit is written to scaling_min/max_freq it is not possible to read it back. That can be argued both ways. It is also useful to know the effective constraints and arguably the ability to read back the values that you have written is mostly needed for debugging the code. Also arguably, if there are multiple sources of frequency limits in user space, there needs to be a user space arbiter deciding on which value to use and in that case it needs to store the last value chosen by it anyway. > Introduce two new user_min/max_freq files which only contain the limits > imposed by userspace, without any aggregation. I'm not sure how useful that is except for the debugging use case to be honest. Do you have any specific use cases beyond debugging in mind? > Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> > --- > This was motivated by these discussions: > > * https://patchwork.kernel.org/patch/11078475/#22805379 > * https://patchwork.kernel.org/patch/11171817/#22917099 > > Those threads are about devfreq but same issue applies to cpufreq as > well. Let me know if this solution seems reasonable? > > An alternative would be to make scaling_min/max_freq always read back > the configured value and introduce new effective_min/max_freq files for > the aggregate values. That might break existing users (though I'm not > familiar with any). > > Documentation/admin-guide/pm/cpufreq.rst | 27 ++++++++++++++++++------ > drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++ > include/linux/pm_qos.h | 4 ++++ > 3 files changed, 44 insertions(+), 6 deletions(-) > > diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst > index 0c74a7784964..734c01c1040e 100644 > --- a/Documentation/admin-guide/pm/cpufreq.rst > +++ b/Documentation/admin-guide/pm/cpufreq.rst > @@ -309,21 +309,36 @@ are the following: > > ``scaling_max_freq`` > Maximum frequency the CPUs belonging to this policy are allowed to be > running at (in kHz). > > - This attribute is read-write and writing a string representing an > - integer to it will cause a new limit to be set (it must not be lower > - than the value of the ``scaling_min_freq`` attribute). > + This attribute is read-write: writing an integer will set a new limit > + (just like ``user_max_freq``) while reading will show the current > + limit (potentially affected by other system contraints such as thermal > + throttling). > > ``scaling_min_freq`` > Minimum frequency the CPUs belonging to this policy are allowed to be > running at (in kHz). > > - This attribute is read-write and writing a string representing a > - non-negative integer to it will cause a new limit to be set (it must not > - be higher than the value of the ``scaling_max_freq`` attribute). > + This attribute is read-write: writing an integer will set a new limit > + (just like ``user_min_freq``) while reading will show the current > + limit (potentially affected by other system contraints). > + > +``user_max_freq`` > + Userspace contraint for the maximum frequency the CPUs belonging to > + this policy are allowed to be running at (in kHz). > + > + This attribute is read-write: writing an integer will set a new limit > + and reading will show the last limit set by userspace. Making these read-write is not useful IMO. Make them read-only. > + > +``user_min_freq`` > + Userspace contraint for minimum frequency the CPUs belonging to this > + policy are allowed to be running at (in kHz). > + > + This attribute is read-write: writing an integer will set a new limit > + and reading will show the last limit set by userspace. > > ``scaling_setspeed`` > This attribute is functional only if the `userspace`_ scaling governor > is attached to the given policy. > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > index 48a224a6b178..caefed0dac43 100644 > --- a/drivers/cpufreq/cpufreq.c > +++ b/drivers/cpufreq/cpufreq.c > @@ -722,13 +722,28 @@ static ssize_t store_##file_name \ > \ > ret = freq_qos_update_request(policy->object##_freq_req, val);\ > return ret >= 0 ? count : ret; \ > } > > +store_one(user_min_freq, min); > +store_one(user_max_freq, max); > store_one(scaling_min_freq, min); > store_one(scaling_max_freq, max); I don't agree with duplicating functionality like this. > > +#undef show_one > + > +#define show_one(file_name, object) \ > +static ssize_t show_##file_name \ > +(struct cpufreq_policy *policy, char *buf) \ > +{ \ > + s32 val = freq_qos_get_request_value(policy->object##_freq_req);\ > + return sprintf(buf, "%d\n", val); \ > +} > + > +show_one(user_min_freq, min); > +show_one(user_max_freq, max); > + > /** > * show_cpuinfo_cur_freq - current CPU frequency as detected by hardware > */ > static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy, > char *buf) > @@ -906,10 +921,12 @@ cpufreq_freq_attr_ro(related_cpus); > cpufreq_freq_attr_ro(affected_cpus); > cpufreq_freq_attr_rw(scaling_min_freq); > cpufreq_freq_attr_rw(scaling_max_freq); > cpufreq_freq_attr_rw(scaling_governor); > cpufreq_freq_attr_rw(scaling_setspeed); > +cpufreq_freq_attr_rw(user_min_freq); > +cpufreq_freq_attr_rw(user_max_freq); > > static struct attribute *default_attrs[] = { > &cpuinfo_min_freq.attr, > &cpuinfo_max_freq.attr, > &cpuinfo_transition_latency.attr, > @@ -919,10 +936,12 @@ static struct attribute *default_attrs[] = { > &related_cpus.attr, > &scaling_governor.attr, > &scaling_driver.attr, > &scaling_available_governors.attr, > &scaling_setspeed.attr, > + &user_min_freq.attr, > + &user_max_freq.attr, > NULL > }; > > #define to_policy(k) container_of(k, struct cpufreq_policy, kobj) > #define to_attr(a) container_of(a, struct freq_attr, attr) > diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h > index e97c2e376889..90b147b7d7a3 100644 > --- a/include/linux/pm_qos.h > +++ b/include/linux/pm_qos.h > @@ -310,7 +310,11 @@ int freq_qos_add_notifier(struct freq_constraints *qos, > enum freq_qos_req_type type, > struct notifier_block *notifier); > int freq_qos_remove_notifier(struct freq_constraints *qos, > enum freq_qos_req_type type, > struct notifier_block *notifier); > +static inline s32 freq_qos_get_request_value(struct freq_qos_request *req) > +{ > + return req->pnode.prio; > +} > > #endif >
On 31.10.2019 12:24, Rafael J. Wysocki wrote: > On Wednesday, October 30, 2019 1:41:49 AM CET Leonard Crestez wrote: >> Current values in scaling_min_freq and scaling_max freq can change on >> the fly due to event such as thermal monitoring. > > Which is intentional. > >> This behavior is confusing for userspace and because once an userspace >> limit is written to scaling_min/max_freq it is not possible to read it back. > > That can be argued both ways. > > It is also useful to know the effective constraints and arguably the ability > to read back the values that you have written is mostly needed for debugging > the code. > > Also arguably, if there are multiple sources of frequency limits in user space, > there needs to be a user space arbiter deciding on which value to use and in > that case it needs to store the last value chosen by it anyway. If an userspace tool needs to temporarily adjust min/max_freq it has no way of reliably restoring the old value. >> Introduce two new user_min/max_freq files which only contain the limits >> imposed by userspace, without any aggregation. > > I'm not sure how useful that is except for the debugging use case to be honest. > > Do you have any specific use cases beyond debugging in mind? No. I guess it would be useful for userspace cpufreq daemons but I'm not familiar with any. Maybe somebody else could chime in? Honestly it's not very useful for debugging: what you would want is a debugfs files which can enumerate all QoS requests in the system. >> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> >> --- >> This was motivated by these discussions: >> >> * https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fpatch%2F11078475%2F%2322805379&data=02%7C01%7Cleonard.crestez%40nxp.com%7C9935807314f14a73eb9d08d75dec8695%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637081142759735597&sdata=G0JDLEytUKMCDN2x7ccXmRHBFktPOJBbPsY52A0ppxY%3D&reserved=0 >> * https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.kernel.org%2Fpatch%2F11171817%2F%2322917099&data=02%7C01%7Cleonard.crestez%40nxp.com%7C9935807314f14a73eb9d08d75dec8695%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637081142759735597&sdata=FLP0%2FoMMubhPMyDt2xTL%2F3xlLfR%2FK9CWIcDA6T14MFw%3D&reserved=0 >> >> Those threads are about devfreq but same issue applies to cpufreq as >> well. Let me know if this solution seems reasonable? >> >> An alternative would be to make scaling_min/max_freq always read back >> the configured value and introduce new effective_min/max_freq files for >> the aggregate values. That might break existing users (though I'm not >> familiar with any). >> >> Documentation/admin-guide/pm/cpufreq.rst | 27 ++++++++++++++++++------ >> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++ >> include/linux/pm_qos.h | 4 ++++ >> 3 files changed, 44 insertions(+), 6 deletions(-) >> >> diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst >> index 0c74a7784964..734c01c1040e 100644 >> --- a/Documentation/admin-guide/pm/cpufreq.rst >> +++ b/Documentation/admin-guide/pm/cpufreq.rst >> @@ -309,21 +309,36 @@ are the following: >> >> ``scaling_max_freq`` >> Maximum frequency the CPUs belonging to this policy are allowed to be >> running at (in kHz). >> >> - This attribute is read-write and writing a string representing an >> - integer to it will cause a new limit to be set (it must not be lower >> - than the value of the ``scaling_min_freq`` attribute). >> + This attribute is read-write: writing an integer will set a new limit >> + (just like ``user_max_freq``) while reading will show the current >> + limit (potentially affected by other system contraints such as thermal >> + throttling). >> >> ``scaling_min_freq`` >> Minimum frequency the CPUs belonging to this policy are allowed to be >> running at (in kHz). >> >> - This attribute is read-write and writing a string representing a >> - non-negative integer to it will cause a new limit to be set (it must not >> - be higher than the value of the ``scaling_max_freq`` attribute). >> + This attribute is read-write: writing an integer will set a new limit >> + (just like ``user_min_freq``) while reading will show the current >> + limit (potentially affected by other system contraints). >> + >> +``user_max_freq`` >> + Userspace contraint for the maximum frequency the CPUs belonging to >> + this policy are allowed to be running at (in kHz). >> + >> + This attribute is read-write: writing an integer will set a new limit >> + and reading will show the last limit set by userspace. > > Making these read-write is not useful IMO. Make them read-only. > >> + >> +``user_min_freq`` >> + Userspace contraint for minimum frequency the CPUs belonging to this >> + policy are allowed to be running at (in kHz). >> + >> + This attribute is read-write: writing an integer will set a new limit >> + and reading will show the last limit set by userspace. >> >> ``scaling_setspeed`` >> This attribute is functional only if the `userspace`_ scaling governor >> is attached to the given policy. >> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c >> index 48a224a6b178..caefed0dac43 100644 >> --- a/drivers/cpufreq/cpufreq.c >> +++ b/drivers/cpufreq/cpufreq.c >> @@ -722,13 +722,28 @@ static ssize_t store_##file_name \ >> \ >> ret = freq_qos_update_request(policy->object##_freq_req, val);\ >> return ret >= 0 ? count : ret; \ >> } >> >> +store_one(user_min_freq, min); >> +store_one(user_max_freq, max); >> store_one(scaling_min_freq, min); >> store_one(scaling_max_freq, max); > > I don't agree with duplicating functionality like this. OK. My logic was that a tool which doesn't want to deal with the "effective" limits would switch to always reading/writing user_min/max_freq if they're present. -- Regards, Leonard
On Thu, Oct 31, 2019 at 2:01 PM Leonard Crestez <leonard.crestez@nxp.com> wrote: > > On 31.10.2019 12:24, Rafael J. Wysocki wrote: > > On Wednesday, October 30, 2019 1:41:49 AM CET Leonard Crestez wrote: > >> Current values in scaling_min_freq and scaling_max freq can change on > >> the fly due to event such as thermal monitoring. > > > > Which is intentional. > > > >> This behavior is confusing for userspace and because once an userspace > >> limit is written to scaling_min/max_freq it is not possible to read it back. > > > > That can be argued both ways. > > > > It is also useful to know the effective constraints and arguably the ability > > to read back the values that you have written is mostly needed for debugging > > the code. > > > > Also arguably, if there are multiple sources of frequency limits in user space, > > there needs to be a user space arbiter deciding on which value to use and in > > that case it needs to store the last value chosen by it anyway. > > If an userspace tool needs to temporarily adjust min/max_freq it has no > way of reliably restoring the old value. And the new attributes don't really help here AFAICS, because if the old value was written by a user space task different from the one updating it, that task may try to update it again in parallel with the current writer, and so the current writer actually doesn't know whether or not the value it has read is the most recent one (and even so, whether or not writing it back is desirable anyway).
On Thu, Oct 31, 2019 at 11:24:31AM +0100, Rafael J. Wysocki wrote: > On Wednesday, October 30, 2019 1:41:49 AM CET Leonard Crestez wrote: > > Current values in scaling_min_freq and scaling_max freq can change on > > the fly due to event such as thermal monitoring. > > Which is intentional. > > > This behavior is confusing for userspace and because once an userspace > > limit is written to scaling_min/max_freq it is not possible to read it back. > > That can be argued both ways. > > It is also useful to know the effective constraints and arguably the ability > to read back the values that you have written is mostly needed for debugging > the code. Agreed that reading the values back is probably mostly useful for debugging. Reading the effective constraints is a debugging use case as well, userspace can't make any decisions based on values which might be in constant flux. IMO the current interface is completely counterintuitive, I really hope we wouldn't implement it the same way if given a chance to do it again. If there is use for reading the effective constraints it should be exposed as a separate read-only attribute. Keep it simple (when possible).
On Thu, Oct 31, 2019 at 11:24 PM Matthias Kaehlcke <mka@chromium.org> wrote: > > On Thu, Oct 31, 2019 at 11:24:31AM +0100, Rafael J. Wysocki wrote: > > On Wednesday, October 30, 2019 1:41:49 AM CET Leonard Crestez wrote: > > > Current values in scaling_min_freq and scaling_max freq can change on > > > the fly due to event such as thermal monitoring. > > > > Which is intentional. > > > > > This behavior is confusing for userspace and because once an userspace > > > limit is written to scaling_min/max_freq it is not possible to read it back. > > > > That can be argued both ways. > > > > It is also useful to know the effective constraints and arguably the ability > > to read back the values that you have written is mostly needed for debugging > > the code. > > Agreed that reading the values back is probably mostly useful for debugging. > > Reading the effective constraints is a debugging use case as well, userspace > can't make any decisions based on values which might be in constant flux. It sometimes helps to diagnose issues on production systems, though. If you see those numbers change even without writing to the attributes, for example, that indicates the existence of a source of frequency constraints in the system (e.g. platform firmware) that might have gone unnoticed otherwise. > IMO the current interface is completely counterintuitive, I really hope we > wouldn't implement it the same way if given a chance to do it again. If there > is use for reading the effective constraints it should be exposed as a separate > read-only attribute. Keep it simple (when possible). I agree that such a design would have been better, but the existing one cannot be changed this way now. And I had not invented it. :-)
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst index 0c74a7784964..734c01c1040e 100644 --- a/Documentation/admin-guide/pm/cpufreq.rst +++ b/Documentation/admin-guide/pm/cpufreq.rst @@ -309,21 +309,36 @@ are the following: ``scaling_max_freq`` Maximum frequency the CPUs belonging to this policy are allowed to be running at (in kHz). - This attribute is read-write and writing a string representing an - integer to it will cause a new limit to be set (it must not be lower - than the value of the ``scaling_min_freq`` attribute). + This attribute is read-write: writing an integer will set a new limit + (just like ``user_max_freq``) while reading will show the current + limit (potentially affected by other system contraints such as thermal + throttling). ``scaling_min_freq`` Minimum frequency the CPUs belonging to this policy are allowed to be running at (in kHz). - This attribute is read-write and writing a string representing a - non-negative integer to it will cause a new limit to be set (it must not - be higher than the value of the ``scaling_max_freq`` attribute). + This attribute is read-write: writing an integer will set a new limit + (just like ``user_min_freq``) while reading will show the current + limit (potentially affected by other system contraints). + +``user_max_freq`` + Userspace contraint for the maximum frequency the CPUs belonging to + this policy are allowed to be running at (in kHz). + + This attribute is read-write: writing an integer will set a new limit + and reading will show the last limit set by userspace. + +``user_min_freq`` + Userspace contraint for minimum frequency the CPUs belonging to this + policy are allowed to be running at (in kHz). + + This attribute is read-write: writing an integer will set a new limit + and reading will show the last limit set by userspace. ``scaling_setspeed`` This attribute is functional only if the `userspace`_ scaling governor is attached to the given policy. diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 48a224a6b178..caefed0dac43 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -722,13 +722,28 @@ static ssize_t store_##file_name \ \ ret = freq_qos_update_request(policy->object##_freq_req, val);\ return ret >= 0 ? count : ret; \ } +store_one(user_min_freq, min); +store_one(user_max_freq, max); store_one(scaling_min_freq, min); store_one(scaling_max_freq, max); +#undef show_one + +#define show_one(file_name, object) \ +static ssize_t show_##file_name \ +(struct cpufreq_policy *policy, char *buf) \ +{ \ + s32 val = freq_qos_get_request_value(policy->object##_freq_req);\ + return sprintf(buf, "%d\n", val); \ +} + +show_one(user_min_freq, min); +show_one(user_max_freq, max); + /** * show_cpuinfo_cur_freq - current CPU frequency as detected by hardware */ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy, char *buf) @@ -906,10 +921,12 @@ cpufreq_freq_attr_ro(related_cpus); cpufreq_freq_attr_ro(affected_cpus); cpufreq_freq_attr_rw(scaling_min_freq); cpufreq_freq_attr_rw(scaling_max_freq); cpufreq_freq_attr_rw(scaling_governor); cpufreq_freq_attr_rw(scaling_setspeed); +cpufreq_freq_attr_rw(user_min_freq); +cpufreq_freq_attr_rw(user_max_freq); static struct attribute *default_attrs[] = { &cpuinfo_min_freq.attr, &cpuinfo_max_freq.attr, &cpuinfo_transition_latency.attr, @@ -919,10 +936,12 @@ static struct attribute *default_attrs[] = { &related_cpus.attr, &scaling_governor.attr, &scaling_driver.attr, &scaling_available_governors.attr, &scaling_setspeed.attr, + &user_min_freq.attr, + &user_max_freq.attr, NULL }; #define to_policy(k) container_of(k, struct cpufreq_policy, kobj) #define to_attr(a) container_of(a, struct freq_attr, attr) diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index e97c2e376889..90b147b7d7a3 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -310,7 +310,11 @@ int freq_qos_add_notifier(struct freq_constraints *qos, enum freq_qos_req_type type, struct notifier_block *notifier); int freq_qos_remove_notifier(struct freq_constraints *qos, enum freq_qos_req_type type, struct notifier_block *notifier); +static inline s32 freq_qos_get_request_value(struct freq_qos_request *req) +{ + return req->pnode.prio; +} #endif
Current values in scaling_min_freq and scaling_max freq can change on the fly due to event such as thermal monitoring. This behavior is confusing for userspace and because once an userspace limit is written to scaling_min/max_freq it is not possible to read it back. Introduce two new user_min/max_freq files which only contain the limits imposed by userspace, without any aggregation. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> --- This was motivated by these discussions: * https://patchwork.kernel.org/patch/11078475/#22805379 * https://patchwork.kernel.org/patch/11171817/#22917099 Those threads are about devfreq but same issue applies to cpufreq as well. Let me know if this solution seems reasonable? An alternative would be to make scaling_min/max_freq always read back the configured value and introduce new effective_min/max_freq files for the aggregate values. That might break existing users (though I'm not familiar with any). Documentation/admin-guide/pm/cpufreq.rst | 27 ++++++++++++++++++------ drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++ include/linux/pm_qos.h | 4 ++++ 3 files changed, 44 insertions(+), 6 deletions(-)