Message ID | 20190128165522.31749-2-quentin.perret@arm.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | Register an Energy Model for Arm reference platforms | expand |
rHi Quentin, On Mon, Jan 28, 2019 at 04:55:16PM +0000, Quentin Perret wrote: > The Energy Model (EM) framework provides an API to let drivers register > the active power of CPUs. The drivers are expected to provide a callback > method which estimates the power consumed by a CPU at each available > performance levels. How exactly this should be implemented, however, > depends on the platform. > > On some systems, PM_OPP knows the voltage and frequency at which CPUs > can run. When coupled with the CPU 'capacitance' (as provided by the > 'dynamic-power-coefficient' devicetree binding), it is possible to > estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with > C its capacitance and V and f respectively the voltage and frequency of > the OPP. The Intelligent Power Allocator (IPA) thermal governor already > implements that estimation method, in the thermal framework. > > However, this power estimation method can be applied to any platform > where all the parameters are known (C, V and f), and not only those > suffering thermal issues. As such, the code implementing this feature > can be re-used to also populate the EM framework now used by EAS. > > As a first step, introduce in PM_OPP a helper function which CPUFreq > drivers can use to register into the EM framework. This duplicates the > power estimation done in IPA until it can be migrated to using the EM > framework. This will be done later, once the EM framework has support > for at least all platforms currently supported by IPA. > > Signed-off-by: Quentin Perret <quentin.perret@arm.com> > --- > drivers/opp/of.c | 60 ++++++++++++++++++++++++++++++++++++++++++ > include/linux/pm_opp.h | 5 ++++ > 2 files changed, 65 insertions(+) > > diff --git a/drivers/opp/of.c b/drivers/opp/of.c > index 06f0f632ec47..7572a2eb2fd4 100644 > --- a/drivers/opp/of.c > +++ b/drivers/opp/of.c > @@ -1047,3 +1047,63 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp) > return of_node_get(opp->np); > } > EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node); > + > +/** > + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU > + * @mW: pointer to the power estimate in milli-watts > + * @KHz: pointer to the OPP's frequency, in kilo-hertz nit: should be kHz > + * @cpu: CPU for which power needs to be estimated > + * > + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil), > + * and updates @KHz and @mW accordingly. > + * > + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance > + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f > + * respectively the voltage and frequency of the OPP. > + * > + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power > + * calculation failed because of missing parameters, 0 otherwise. > + */ > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) I think it is more common to put the input parameters first, then the output ones, i.e. cpu, kHz, mW. > +{ > + unsigned long mV, Hz, MHz; > + struct device *cpu_dev; > + struct dev_pm_opp *opp; > + struct device_node *np; > + u32 cap; > + u64 tmp; > + int ret; > + > + cpu_dev = get_cpu_device(cpu); > + if (!cpu_dev) > + return -ENODEV; > + > + np = of_node_get(cpu_dev->of_node); > + if (!np) > + return -EINVAL; > + > + ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap); > + of_node_put(np); > + if (ret) > + return -EINVAL; > + > + Hz = *KHz * 1000; > + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz); > + if (IS_ERR(opp)) > + return -EINVAL; > + > + mV = dev_pm_opp_get_voltage(opp) / 1000; > + dev_pm_opp_put(opp); > + if (!mV) > + return -EINVAL; > + > + MHz = Hz / 1000000; > + tmp = (u64)cap * mV * mV * MHz; > + do_div(tmp, 1000000000); > + > + *mW = (unsigned long)tmp; > + *KHz = Hz / 1000; > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power); > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h > index 0a2a88e5a383..fedde14f5187 100644 > --- a/include/linux/pm_opp.h > +++ b/include/linux/pm_opp.h > @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma > struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev); > struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp); > int of_get_required_opp_performance_state(struct device_node *np, int index); > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu); > #else > static inline int dev_pm_opp_of_add_table(struct device *dev) > { > @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np, > { > return -ENOTSUPP; > } > +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > +{ > + return -ENOTSUPP; > +} > #endif > > #endif /* __LINUX_OPP_H__ */ Besides the nits above: Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Tested-by: Matthias Kaehlcke <mka@chromium.org>
On 28-01-19, 16:55, Quentin Perret wrote: > diff --git a/drivers/opp/of.c b/drivers/opp/of.c > index 06f0f632ec47..7572a2eb2fd4 100644 > --- a/drivers/opp/of.c > +++ b/drivers/opp/of.c > @@ -1047,3 +1047,63 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp) > return of_node_get(opp->np); > } > EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node); > + > +/** > + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU > + * @mW: pointer to the power estimate in milli-watts > + * @KHz: pointer to the OPP's frequency, in kilo-hertz > + * @cpu: CPU for which power needs to be estimated > + * > + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil), > + * and updates @KHz and @mW accordingly. > + * > + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance > + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f > + * respectively the voltage and frequency of the OPP. > + * > + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power > + * calculation failed because of missing parameters, 0 otherwise. > + */ > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) I would suggest to change the return type to unsigned long and return 0 for errors and positive values for mW. > +{ > + unsigned long mV, Hz, MHz; MHz is used only once, I think you can get rid of it just open code the division by 1000000. Maybe remove Hz as well and directly use kHz. > + struct device *cpu_dev; > + struct dev_pm_opp *opp; > + struct device_node *np; > + u32 cap; > + u64 tmp; Name this mW with the above changes. > + int ret; > + > + cpu_dev = get_cpu_device(cpu); > + if (!cpu_dev) > + return -ENODEV; > + > + np = of_node_get(cpu_dev->of_node); > + if (!np) > + return -EINVAL; > + > + ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap); > + of_node_put(np); > + if (ret) > + return -EINVAL; > + > + Hz = *KHz * 1000; > + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz); > + if (IS_ERR(opp)) > + return -EINVAL; > + > + mV = dev_pm_opp_get_voltage(opp) / 1000; > + dev_pm_opp_put(opp); > + if (!mV) > + return -EINVAL; > + > + MHz = Hz / 1000000; > + tmp = (u64)cap * mV * mV * MHz; > + do_div(tmp, 1000000000); > + > + *mW = (unsigned long)tmp; > + *KHz = Hz / 1000; > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power); > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h > index 0a2a88e5a383..fedde14f5187 100644 > --- a/include/linux/pm_opp.h > +++ b/include/linux/pm_opp.h > @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma > struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev); > struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp); > int of_get_required_opp_performance_state(struct device_node *np, int index); > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu); > #else > static inline int dev_pm_opp_of_add_table(struct device *dev) > { > @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np, > { > return -ENOTSUPP; > } Add a blank line here. We missed adding the same before of_get_required_opp_performance_state() earlier, maybe add the new routine before of_get_required_opp_performance_state() and add blank line before and after it :) > +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > +{ > + return -ENOTSUPP; > +} > #endif > > #endif /* __LINUX_OPP_H__ */ > -- > 2.20.1
Hi Matthias, On Monday 28 Jan 2019 at 11:02:51 (-0800), Matthias Kaehlcke wrote: > > +/** > > + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU > > + * @mW: pointer to the power estimate in milli-watts > > + * @KHz: pointer to the OPP's frequency, in kilo-hertz > > nit: should be kHz Right, I can change that :-) > > > + * @cpu: CPU for which power needs to be estimated > > + * > > + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil), > > + * and updates @KHz and @mW accordingly. > > + * > > + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance > > + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f > > + * respectively the voltage and frequency of the OPP. > > + * > > + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power > > + * calculation failed because of missing parameters, 0 otherwise. > > + */ > > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > > I think it is more common to put the input parameters first, then the > output ones, i.e. cpu, kHz, mW. Hmm, the issue is this must match the expectations of the EM framework: https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L62 So, I don't really have a choice. Or we can change the core code if this _really_ is a problem -- we don't have users yet so now is the best time to do so I guess ... > > > +{ > > + unsigned long mV, Hz, MHz; > > + struct device *cpu_dev; > > + struct dev_pm_opp *opp; > > + struct device_node *np; > > + u32 cap; > > + u64 tmp; > > + int ret; > > + > > + cpu_dev = get_cpu_device(cpu); > > + if (!cpu_dev) > > + return -ENODEV; > > + > > + np = of_node_get(cpu_dev->of_node); > > + if (!np) > > + return -EINVAL; > > + > > + ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap); > > + of_node_put(np); > > + if (ret) > > + return -EINVAL; > > + > > + Hz = *KHz * 1000; > > + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz); > > + if (IS_ERR(opp)) > > + return -EINVAL; > > + > > + mV = dev_pm_opp_get_voltage(opp) / 1000; > > + dev_pm_opp_put(opp); > > + if (!mV) > > + return -EINVAL; > > + > > + MHz = Hz / 1000000; > > + tmp = (u64)cap * mV * mV * MHz; > > + do_div(tmp, 1000000000); > > + > > + *mW = (unsigned long)tmp; > > + *KHz = Hz / 1000; > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power); > > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h > > index 0a2a88e5a383..fedde14f5187 100644 > > --- a/include/linux/pm_opp.h > > +++ b/include/linux/pm_opp.h > > @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma > > struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev); > > struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp); > > int of_get_required_opp_performance_state(struct device_node *np, int index); > > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu); > > #else > > static inline int dev_pm_opp_of_add_table(struct device *dev) > > { > > @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np, > > { > > return -ENOTSUPP; > > } > > +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > > +{ > > + return -ENOTSUPP; > > +} > > #endif > > > > #endif /* __LINUX_OPP_H__ */ > > Besides the nits above: > > Reviewed-by: Matthias Kaehlcke <mka@chromium.org> > Tested-by: Matthias Kaehlcke <mka@chromium.org> Thank you ! Quentin
Hi Viresh, On Tuesday 29 Jan 2019 at 10:40:30 (+0530), Viresh Kumar wrote: > > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > > I would suggest to change the return type to unsigned long and return 0 for > errors and positive values for mW. Well, I can't really do that -- this must match the API of PM_EM: https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L62 I'll make sure to make that explicit in the commit message for v2. > > > +{ > > + unsigned long mV, Hz, MHz; > > MHz is used only once, I think you can get rid of it just open code the division > by 1000000. Maybe remove Hz as well and directly use kHz. I would expect/hope the compiler will optimize things like that for me but yes, I can try to reduce a bit the temp variables if you prefer. > > > + struct device *cpu_dev; > > + struct dev_pm_opp *opp; > > + struct device_node *np; > > + u32 cap; > > + u64 tmp; > > Name this mW with the above changes. So that doesn't really work given what I mentioned above. > > + int ret; > > + > > + cpu_dev = get_cpu_device(cpu); > > + if (!cpu_dev) > > + return -ENODEV; > > + > > + np = of_node_get(cpu_dev->of_node); > > + if (!np) > > + return -EINVAL; > > + > > + ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap); > > + of_node_put(np); > > + if (ret) > > + return -EINVAL; > > + > > + Hz = *KHz * 1000; > > + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz); > > + if (IS_ERR(opp)) > > + return -EINVAL; > > + > > + mV = dev_pm_opp_get_voltage(opp) / 1000; > > + dev_pm_opp_put(opp); > > + if (!mV) > > + return -EINVAL; > > + > > + MHz = Hz / 1000000; > > + tmp = (u64)cap * mV * mV * MHz; > > + do_div(tmp, 1000000000); > > + > > + *mW = (unsigned long)tmp; > > + *KHz = Hz / 1000; > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power); > > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h > > index 0a2a88e5a383..fedde14f5187 100644 > > --- a/include/linux/pm_opp.h > > +++ b/include/linux/pm_opp.h > > @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma > > struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev); > > struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp); > > int of_get_required_opp_performance_state(struct device_node *np, int index); > > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu); > > #else > > static inline int dev_pm_opp_of_add_table(struct device *dev) > > { > > @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np, > > { > > return -ENOTSUPP; > > } > > Add a blank line here. We missed adding the same before > of_get_required_opp_performance_state() earlier, maybe add the new routine > before of_get_required_opp_performance_state() and add blank line before and > after it :) Sounds good ! > > > +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > > +{ > > + return -ENOTSUPP; > > +} > > #endif > > > > #endif /* __LINUX_OPP_H__ */ > > -- > > 2.20.1 > > -- > viresh Thanks, Quentin
Hi Quentin, On Tue, Jan 29, 2019 at 09:03:34AM +0000, Quentin Perret wrote: > Hi Matthias, > > On Monday 28 Jan 2019 at 11:02:51 (-0800), Matthias Kaehlcke wrote: > > > +/** > > > + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU > > > + * @mW: pointer to the power estimate in milli-watts > > > + * @KHz: pointer to the OPP's frequency, in kilo-hertz > > > > nit: should be kHz > > Right, I can change that :-) > > > > > > + * @cpu: CPU for which power needs to be estimated > > > + * > > > + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil), > > > + * and updates @KHz and @mW accordingly. > > > + * > > > + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance > > > + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f > > > + * respectively the voltage and frequency of the OPP. > > > + * > > > + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power > > > + * calculation failed because of missing parameters, 0 otherwise. > > > + */ > > > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) > > > > I think it is more common to put the input parameters first, then the > > output ones, i.e. cpu, kHz, mW. > > Hmm, the issue is this must match the expectations of the EM framework: > > https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L62 > > So, I don't really have a choice. Or we can change the core code if this > _really_ is a problem -- we don't have users yet so now is the best time > to do so I guess ... I see. I think it would be preferable to follow the common scheme, but it's also not a major issue, up to you whether you want to change the definition of the callback. It's certainly true that now would be the best time for a change. Cheers Matthias
diff --git a/drivers/opp/of.c b/drivers/opp/of.c index 06f0f632ec47..7572a2eb2fd4 100644 --- a/drivers/opp/of.c +++ b/drivers/opp/of.c @@ -1047,3 +1047,63 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp) return of_node_get(opp->np); } EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node); + +/** + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU + * @mW: pointer to the power estimate in milli-watts + * @KHz: pointer to the OPP's frequency, in kilo-hertz + * @cpu: CPU for which power needs to be estimated + * + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil), + * and updates @KHz and @mW accordingly. + * + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f + * respectively the voltage and frequency of the OPP. + * + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power + * calculation failed because of missing parameters, 0 otherwise. + */ +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) +{ + unsigned long mV, Hz, MHz; + struct device *cpu_dev; + struct dev_pm_opp *opp; + struct device_node *np; + u32 cap; + u64 tmp; + int ret; + + cpu_dev = get_cpu_device(cpu); + if (!cpu_dev) + return -ENODEV; + + np = of_node_get(cpu_dev->of_node); + if (!np) + return -EINVAL; + + ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap); + of_node_put(np); + if (ret) + return -EINVAL; + + Hz = *KHz * 1000; + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz); + if (IS_ERR(opp)) + return -EINVAL; + + mV = dev_pm_opp_get_voltage(opp) / 1000; + dev_pm_opp_put(opp); + if (!mV) + return -EINVAL; + + MHz = Hz / 1000000; + tmp = (u64)cap * mV * mV * MHz; + do_div(tmp, 1000000000); + + *mW = (unsigned long)tmp; + *KHz = Hz / 1000; + + return 0; +} +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power); diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h index 0a2a88e5a383..fedde14f5187 100644 --- a/include/linux/pm_opp.h +++ b/include/linux/pm_opp.h @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev); struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp); int of_get_required_opp_performance_state(struct device_node *np, int index); +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu); #else static inline int dev_pm_opp_of_add_table(struct device *dev) { @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np, { return -ENOTSUPP; } +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu) +{ + return -ENOTSUPP; +} #endif #endif /* __LINUX_OPP_H__ */
The Energy Model (EM) framework provides an API to let drivers register the active power of CPUs. The drivers are expected to provide a callback method which estimates the power consumed by a CPU at each available performance levels. How exactly this should be implemented, however, depends on the platform. On some systems, PM_OPP knows the voltage and frequency at which CPUs can run. When coupled with the CPU 'capacitance' (as provided by the 'dynamic-power-coefficient' devicetree binding), it is possible to estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with C its capacitance and V and f respectively the voltage and frequency of the OPP. The Intelligent Power Allocator (IPA) thermal governor already implements that estimation method, in the thermal framework. However, this power estimation method can be applied to any platform where all the parameters are known (C, V and f), and not only those suffering thermal issues. As such, the code implementing this feature can be re-used to also populate the EM framework now used by EAS. As a first step, introduce in PM_OPP a helper function which CPUFreq drivers can use to register into the EM framework. This duplicates the power estimation done in IPA until it can be migrated to using the EM framework. This will be done later, once the EM framework has support for at least all platforms currently supported by IPA. Signed-off-by: Quentin Perret <quentin.perret@arm.com> --- drivers/opp/of.c | 60 ++++++++++++++++++++++++++++++++++++++++++ include/linux/pm_opp.h | 5 ++++ 2 files changed, 65 insertions(+)