Message ID | 20191010113937.15962-13-ulf.hansson@linaro.org (mailing list archive) |
---|---|
State | Superseded |
Series | cpuidle: psci: Support hierarchical CPU arrangement |
On Thu, Oct 10, 2019 at 01:39:36PM +0200, Ulf Hansson wrote:
> In case we have succeeded to attach a CPU to its PM domain, let's deploy
> runtime PM support for the corresponding attached device, to allow the CPU
> to be powered-managed accordingly.
>
> To set the triggering point for when runtime PM reference counting should
> be done, let's store the index of deepest idle state for the CPU in the per
> CPU struct. Then use this index to compare the selected idle state index
> when entering idle, as to understand whether runtime PM reference counting
> is needed or not.
>
> Note that, from the hierarchical point view, there may be good reasons to
> do runtime PM reference counting even on shallower idle states, but at this
> point this isn't supported, mainly due to limitations set by the generic PM
> domain.
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 1510422c7a53..0919b40c1a85 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -16,6 +16,7 @@
>  #include <linux/of.h>
>  #include <linux/of_device.h>
>  #include <linux/psci.h>
> +#include <linux/pm_runtime.h>
>  #include <linux/slab.h>
>
>  #include <asm/cpuidle.h>
> @@ -25,6 +26,7 @@
>
>  struct psci_cpuidle_data {
>  	u32 *psci_states;
> +	u32 rpm_state_id;
>  	struct device *dev;
>  };
>
> @@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
>  				struct cpuidle_driver *drv, int idx)
>  {
>  	int ret;
> -	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
> -	u32 state = psci_get_domain_state();
> +	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
> +	u32 *states = data->psci_states;
> +	struct device *pd_dev = data->dev;
> +	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
> +	u32 state;

Wonder if we can have separate psci_enter_idle_state for OSI mode so
that all these runtime extra check can be reduced ? It will also make
sure there's no additional latency for PC mode because of OSI.

--
Regards,
Sudeep

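The suggestion above, one idle entry function per PSCI mode, can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `idle_enter_fn`, `enter_idle_pc`, `enter_idle_osi`, `select_idle_enter` and `domain_refs` are hypothetical names, not kernel symbols, and the counter merely stands in for the runtime PM usage count that the patch manipulates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef int (*idle_enter_fn)(int idx);

/* Stand-in for the runtime PM usage count of the CPU's PM domain device. */
int domain_refs = 1;

/* Platform-coordinated (PC) path: no runtime PM bookkeeping at all. */
int enter_idle_pc(int idx)
{
	return idx;
}

/* OS-initiated (OSI) path: drop the reference, idle, take it back. */
int enter_idle_osi(int idx)
{
	assert(domain_refs > 0);
	domain_refs--;	/* models pm_runtime_put_sync_suspend() */
	/* The firmware call that actually idles the CPU would go here. */
	domain_refs++;	/* models pm_runtime_get_sync() */
	return idx;
}

/* Chosen once at init time, so the idle path carries no per-entry mode check. */
idle_enter_fn select_idle_enter(bool osi_mode)
{
	return osi_mode ? enter_idle_osi : enter_idle_pc;
}
```

In this model the PC-mode function never touches the counter, which is the latency argument made in the review comment; whether the real driver should select the function at init time or elsewhere is left open here.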
On Thu, 24 Oct 2019 at 18:33, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Oct 10, 2019 at 01:39:36PM +0200, Ulf Hansson wrote:
> > In case we have succeeded to attach a CPU to its PM domain, let's deploy
> > runtime PM support for the corresponding attached device, to allow the CPU
> > to be powered-managed accordingly.
> >
> > To set the triggering point for when runtime PM reference counting should
> > be done, let's store the index of deepest idle state for the CPU in the per
> > CPU struct. Then use this index to compare the selected idle state index
> > when entering idle, as to understand whether runtime PM reference counting
> > is needed or not.
> >
> > Note that, from the hierarchical point view, there may be good reasons to
> > do runtime PM reference counting even on shallower idle states, but at this
> > point this isn't supported, mainly due to limitations set by the generic PM
> > domain.
> >
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >  drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
> >  1 file changed, 19 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > index 1510422c7a53..0919b40c1a85 100644
> > --- a/drivers/cpuidle/cpuidle-psci.c
> > +++ b/drivers/cpuidle/cpuidle-psci.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/of.h>
> >  #include <linux/of_device.h>
> >  #include <linux/psci.h>
> > +#include <linux/pm_runtime.h>
> >  #include <linux/slab.h>
> >
> >  #include <asm/cpuidle.h>
> > @@ -25,6 +26,7 @@
> >
> >  struct psci_cpuidle_data {
> >  	u32 *psci_states;
> > +	u32 rpm_state_id;
> >  	struct device *dev;
> >  };
> >
> > @@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
> >  				struct cpuidle_driver *drv, int idx)
> >  {
> >  	int ret;
> > -	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
> > -	u32 state = psci_get_domain_state();
> > +	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
> > +	u32 *states = data->psci_states;
> > +	struct device *pd_dev = data->dev;
> > +	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
> > +	u32 state;
>
> Wonder if we can have separate psci_enter_idle_state for OSI mode so
> that all these runtime extra check can be reduced ? It will also make
> sure there's no additional latency for PC mode because of OSI.

Good idea, that's the plan. See previous answer.

Perhaps if I add a patch on top, implementing your suggestion, would
you be happy with that?

Kind regards
Uffe

On Thu, Oct 24, 2019 at 07:00:38PM +0200, Ulf Hansson wrote:
> On Thu, 24 Oct 2019 at 18:33, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Thu, Oct 10, 2019 at 01:39:36PM +0200, Ulf Hansson wrote:
> > > In case we have succeeded to attach a CPU to its PM domain, let's deploy
> > > runtime PM support for the corresponding attached device, to allow the CPU
> > > to be powered-managed accordingly.
> > >
> > > To set the triggering point for when runtime PM reference counting should
> > > be done, let's store the index of deepest idle state for the CPU in the per
> > > CPU struct. Then use this index to compare the selected idle state index
> > > when entering idle, as to understand whether runtime PM reference counting
> > > is needed or not.
> > >
> > > Note that, from the hierarchical point view, there may be good reasons to
> > > do runtime PM reference counting even on shallower idle states, but at this
> > > point this isn't supported, mainly due to limitations set by the generic PM
> > > domain.
> > >
> > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > ---
> > >  drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
> > >  1 file changed, 19 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > > index 1510422c7a53..0919b40c1a85 100644
> > > --- a/drivers/cpuidle/cpuidle-psci.c
> > > +++ b/drivers/cpuidle/cpuidle-psci.c
> > > @@ -16,6 +16,7 @@
> > >  #include <linux/of.h>
> > >  #include <linux/of_device.h>
> > >  #include <linux/psci.h>
> > > +#include <linux/pm_runtime.h>
> > >  #include <linux/slab.h>
> > >
> > >  #include <asm/cpuidle.h>
> > > @@ -25,6 +26,7 @@
> > >
> > >  struct psci_cpuidle_data {
> > >  	u32 *psci_states;
> > > +	u32 rpm_state_id;
> > >  	struct device *dev;
> > >  };
> > >
> > > @@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
> > >  				struct cpuidle_driver *drv, int idx)
> > >  {
> > >  	int ret;
> > > -	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
> > > -	u32 state = psci_get_domain_state();
> > > +	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
> > > +	u32 *states = data->psci_states;
> > > +	struct device *pd_dev = data->dev;
> > > +	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
> > > +	u32 state;
> >
> > Wonder if we can have separate psci_enter_idle_state for OSI mode so
> > that all these runtime extra check can be reduced ? It will also make
> > sure there's no additional latency for PC mode because of OSI.
>
> Good idea, that's the plan. See previous answer.
>
> Perhaps if I add a patch on top, implementing your suggestion, would
> you be happy with that?

Separating idle entry functions was the main idea behind moving PSCI
CPUidle into drivers/cpuidle, I don't think that's complicated to
change and given that the series is not queued yet we can make these
changes in the series itself rather than on top.

Thanks,
Lorenzo

On Fri, 25 Oct 2019 at 10:29, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
>
> On Thu, Oct 24, 2019 at 07:00:38PM +0200, Ulf Hansson wrote:
> > On Thu, 24 Oct 2019 at 18:33, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Thu, Oct 10, 2019 at 01:39:36PM +0200, Ulf Hansson wrote:
> > > > In case we have succeeded to attach a CPU to its PM domain, let's deploy
> > > > runtime PM support for the corresponding attached device, to allow the CPU
> > > > to be powered-managed accordingly.
> > > >
> > > > To set the triggering point for when runtime PM reference counting should
> > > > be done, let's store the index of deepest idle state for the CPU in the per
> > > > CPU struct. Then use this index to compare the selected idle state index
> > > > when entering idle, as to understand whether runtime PM reference counting
> > > > is needed or not.
> > > >
> > > > Note that, from the hierarchical point view, there may be good reasons to
> > > > do runtime PM reference counting even on shallower idle states, but at this
> > > > point this isn't supported, mainly due to limitations set by the generic PM
> > > > domain.
> > > >
> > > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > > ---
> > > >  drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
> > > >  1 file changed, 19 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > > > index 1510422c7a53..0919b40c1a85 100644
> > > > --- a/drivers/cpuidle/cpuidle-psci.c
> > > > +++ b/drivers/cpuidle/cpuidle-psci.c
> > > > @@ -16,6 +16,7 @@
> > > >  #include <linux/of.h>
> > > >  #include <linux/of_device.h>
> > > >  #include <linux/psci.h>
> > > > +#include <linux/pm_runtime.h>
> > > >  #include <linux/slab.h>
> > > >
> > > >  #include <asm/cpuidle.h>
> > > > @@ -25,6 +26,7 @@
> > > >
> > > >  struct psci_cpuidle_data {
> > > >  	u32 *psci_states;
> > > > +	u32 rpm_state_id;
> > > >  	struct device *dev;
> > > >  };
> > > >
> > > > @@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
> > > >  				struct cpuidle_driver *drv, int idx)
> > > >  {
> > > >  	int ret;
> > > > -	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
> > > > -	u32 state = psci_get_domain_state();
> > > > +	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
> > > > +	u32 *states = data->psci_states;
> > > > +	struct device *pd_dev = data->dev;
> > > > +	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
> > > > +	u32 state;
> > >
> > > Wonder if we can have separate psci_enter_idle_state for OSI mode so
> > > that all these runtime extra check can be reduced ? It will also make
> > > sure there's no additional latency for PC mode because of OSI.
> >
> > Good idea, that's the plan. See previous answer.
> >
> > Perhaps if I add a patch on top, implementing your suggestion, would
> > you be happy with that?
>
> Separating idle entry functions was the main idea behind moving PSCI
> CPUidle into drivers/cpuidle, I don't think that's complicated to
> change and given that the series is not queued yet we can make these
> changes in the series itself rather than on top.

Okay, no problem.

I fold in a patch (or amend existing ones, if that is better) into the
series that addresses this.

Thanks for reviewing and enjoy your weekend!

Kind regards
Uffe

On Thu, Oct 24, 2019 at 07:00:38PM +0200, Ulf Hansson wrote:
> On Thu, 24 Oct 2019 at 18:33, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Thu, Oct 10, 2019 at 01:39:36PM +0200, Ulf Hansson wrote:
> > > In case we have succeeded to attach a CPU to its PM domain, let's deploy
> > > runtime PM support for the corresponding attached device, to allow the CPU
> > > to be powered-managed accordingly.
> > >
> > > To set the triggering point for when runtime PM reference counting should
> > > be done, let's store the index of deepest idle state for the CPU in the per
> > > CPU struct. Then use this index to compare the selected idle state index
> > > when entering idle, as to understand whether runtime PM reference counting
> > > is needed or not.
> > >
> > > Note that, from the hierarchical point view, there may be good reasons to
> > > do runtime PM reference counting even on shallower idle states, but at this
> > > point this isn't supported, mainly due to limitations set by the generic PM
> > > domain.
> > >
> > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > ---
> > >  drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
> > >  1 file changed, 19 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > > index 1510422c7a53..0919b40c1a85 100644
> > > --- a/drivers/cpuidle/cpuidle-psci.c
> > > +++ b/drivers/cpuidle/cpuidle-psci.c
> > > @@ -16,6 +16,7 @@
> > >  #include <linux/of.h>
> > >  #include <linux/of_device.h>
> > >  #include <linux/psci.h>
> > > +#include <linux/pm_runtime.h>
> > >  #include <linux/slab.h>
> > >
> > >  #include <asm/cpuidle.h>
> > > @@ -25,6 +26,7 @@
> > >
> > >  struct psci_cpuidle_data {
> > >  	u32 *psci_states;
> > > +	u32 rpm_state_id;
> > >  	struct device *dev;
> > >  };
> > >
> > > @@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
> > >  				struct cpuidle_driver *drv, int idx)
> > >  {
> > >  	int ret;
> > > -	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
> > > -	u32 state = psci_get_domain_state();
> > > +	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
> > > +	u32 *states = data->psci_states;
> > > +	struct device *pd_dev = data->dev;
> > > +	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
> > > +	u32 state;
> >
> > Wonder if we can have separate psci_enter_idle_state for OSI mode so
> > that all these runtime extra check can be reduced ? It will also make
> > sure there's no additional latency for PC mode because of OSI.
>
> Good idea, that's the plan. See previous answer.
>
> Perhaps if I add a patch on top, implementing your suggestion, would
> you be happy with that?

No, I prefer to amend this itself to keep it easy to be able to bisect.

--
Regards,
Sudeep

On Sun, 27 Oct 2019 at 03:34, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Oct 24, 2019 at 07:00:38PM +0200, Ulf Hansson wrote:
> > On Thu, 24 Oct 2019 at 18:33, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Thu, Oct 10, 2019 at 01:39:36PM +0200, Ulf Hansson wrote:
> > > > In case we have succeeded to attach a CPU to its PM domain, let's deploy
> > > > runtime PM support for the corresponding attached device, to allow the CPU
> > > > to be powered-managed accordingly.
> > > >
> > > > To set the triggering point for when runtime PM reference counting should
> > > > be done, let's store the index of deepest idle state for the CPU in the per
> > > > CPU struct. Then use this index to compare the selected idle state index
> > > > when entering idle, as to understand whether runtime PM reference counting
> > > > is needed or not.
> > > >
> > > > Note that, from the hierarchical point view, there may be good reasons to
> > > > do runtime PM reference counting even on shallower idle states, but at this
> > > > point this isn't supported, mainly due to limitations set by the generic PM
> > > > domain.
> > > >
> > > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > > ---
> > > >  drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
> > > >  1 file changed, 19 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > > > index 1510422c7a53..0919b40c1a85 100644
> > > > --- a/drivers/cpuidle/cpuidle-psci.c
> > > > +++ b/drivers/cpuidle/cpuidle-psci.c
> > > > @@ -16,6 +16,7 @@
> > > >  #include <linux/of.h>
> > > >  #include <linux/of_device.h>
> > > >  #include <linux/psci.h>
> > > > +#include <linux/pm_runtime.h>
> > > >  #include <linux/slab.h>
> > > >
> > > >  #include <asm/cpuidle.h>
> > > > @@ -25,6 +26,7 @@
> > > >
> > > >  struct psci_cpuidle_data {
> > > >  	u32 *psci_states;
> > > > +	u32 rpm_state_id;
> > > >  	struct device *dev;
> > > >  };
> > > >
> > > > @@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
> > > >  				struct cpuidle_driver *drv, int idx)
> > > >  {
> > > >  	int ret;
> > > > -	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
> > > > -	u32 state = psci_get_domain_state();
> > > > +	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
> > > > +	u32 *states = data->psci_states;
> > > > +	struct device *pd_dev = data->dev;
> > > > +	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
> > > > +	u32 state;
> > >
> > > Wonder if we can have separate psci_enter_idle_state for OSI mode so
> > > that all these runtime extra check can be reduced ? It will also make
> > > sure there's no additional latency for PC mode because of OSI.
> >
> > Good idea, that's the plan. See previous answer.
> >
> > Perhaps if I add a patch on top, implementing your suggestion, would
> > you be happy with that?
>
> No, I prefer to amend this itself to keep it easy to be able to bisect.

Alright!

So I explored this a little bit more - and it actually forced me to
re-order some of the patches in the series, but it seems to have turned
out well.

In the new approach I have taken, I haven't replaced the actual
callback for the idle state, but instead make an early decision in
psci_enter_idle_state(), based on one single read of a per CPU
variable/struct. This tell what path to go.

I am running some final test, but should be able to post a new version
tomorrow. Although, if you still don't think the new approach is good
enough, we can always invent a callback, that we can assign when the
CPU is attached to PM domain.

In any case, you will have to tell what you think, after I posted the
new version, just wanted to give you a heads up.

Kind regards
Uffe

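The early-decision approach described in this reply (keep a single idle callback, branch once on per-CPU data) might look roughly like the userspace model below. It is a sketch, not the code that was later posted: `cpu_idle_data`, `cpu_data`, `enter_flat`, `enter_with_domain`, `enter_idle` and the demo values are all hypothetical, and the function returns the selected state parameter purely to keep the example easy to check.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical per-CPU record, loosely modelled on struct psci_cpuidle_data. */
struct cpu_idle_data {
	const unsigned int *states;	/* firmware parameter per idle state */
	int *pd_refs;			/* NULL when no PM domain is attached */
};

static const unsigned int demo_states[] = { 0x10, 0x20 };
int demo_pd_refs = 1;

struct cpu_idle_data cpu_data[NR_CPUS] = {
	{ demo_states, NULL },		/* CPU 0: no PM domain attached */
	{ demo_states, &demo_pd_refs },	/* CPU 1: attached to a PM domain */
};

/* Path without any runtime PM bookkeeping; idx 0 carries no parameter. */
static unsigned int enter_flat(const struct cpu_idle_data *d, int idx)
{
	return idx ? d->states[idx - 1] : 0;
}

/* Path with reference counting around the idle entry. */
static unsigned int enter_with_domain(const struct cpu_idle_data *d, int idx)
{
	unsigned int state;

	(*d->pd_refs)--;
	state = enter_flat(d, idx);
	(*d->pd_refs)++;
	return state;
}

unsigned int enter_idle(int cpu, int idx)
{
	const struct cpu_idle_data *d;

	assert(cpu >= 0 && cpu < NR_CPUS);
	d = &cpu_data[cpu];

	/* A single read of the per-CPU record picks the path. */
	if (!d->pd_refs)
		return enter_flat(d, idx);
	return enter_with_domain(d, idx);
}
```

Compared with assigning a different callback per mode, this keeps one entry point and pays for one pointer test per idle entry; the model does not restrict the counting to the deepest state, which the original patch does.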
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 1510422c7a53..0919b40c1a85 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -16,6 +16,7 @@
 #include <linux/of.h>
 #include <linux/of_device.h>
 #include <linux/psci.h>
+#include <linux/pm_runtime.h>
 #include <linux/slab.h>
 
 #include <asm/cpuidle.h>
@@ -25,6 +26,7 @@
 
 struct psci_cpuidle_data {
 	u32 *psci_states;
+	u32 rpm_state_id;
 	struct device *dev;
 };
 
@@ -50,14 +52,28 @@ static int psci_enter_idle_state(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int idx)
 {
 	int ret;
-	u32 *states = __this_cpu_read(psci_cpuidle_data.psci_states);
-	u32 state = psci_get_domain_state();
+	struct psci_cpuidle_data *data = this_cpu_ptr(&psci_cpuidle_data);
+	u32 *states = data->psci_states;
+	struct device *pd_dev = data->dev;
+	bool runtime_pm = (pd_dev && data->rpm_state_id == idx);
+	u32 state;
 
+	/*
+	 * Do runtime PM if we are using the hierarchical CPU toplogy, but only
+	 * when cpuidle have selected the deepest idle state for the CPU.
+	 */
+	if (runtime_pm)
+		pm_runtime_put_sync_suspend(pd_dev);
+
+	state = psci_get_domain_state();
 	if (!state && idx)
 		state = states[idx - 1];
 
 	ret = __psci_enter_idle_state(idx, state);
 
+	if (runtime_pm)
+		pm_runtime_get_sync(pd_dev);
+
 	/* Clear the domain state to start fresh when back from idle. */
 	psci_set_domain_state(0);
 	return ret;
@@ -142,6 +158,7 @@ static int __init psci_dt_cpu_init_idle(struct device_node *cpu_node,
 	}
 
 	data->dev = dev;
+	data->rpm_state_id = state_nodes;
 
 	/* Idle states parsed correctly, store them in the per-cpu struct. */
 	data->psci_states = psci_states;
In case we have succeeded to attach a CPU to its PM domain, let's deploy
runtime PM support for the corresponding attached device, to allow the CPU
to be powered-managed accordingly.

To set the triggering point for when runtime PM reference counting should
be done, let's store the index of deepest idle state for the CPU in the per
CPU struct. Then use this index to compare the selected idle state index
when entering idle, as to understand whether runtime PM reference counting
is needed or not.

Note that, from the hierarchical point view, there may be good reasons to
do runtime PM reference counting even on shallower idle states, but at this
point this isn't supported, mainly due to limitations set by the generic PM
domain.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/cpuidle/cpuidle-psci.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)
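The effect of the reference counting described in the commit message can be pictured with a toy userspace model of a PM domain shared by two CPUs. Everything here is an assumption made for illustration: `NR_CPUS`, `DEEPEST_IDX`, `usage_count` and the three functions are hypothetical and only mimic the idea that the domain may power off once every CPU has dropped its reference by selecting its deepest idle state.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS     2
#define DEEPEST_IDX 2	/* hypothetical index of the deepest idle state */

/* Models the usage count of a PM domain shared by NR_CPUS CPUs. */
static int usage_count = NR_CPUS;

/* The domain may only power off once no CPU holds a reference. */
bool domain_may_power_off(void)
{
	return usage_count == 0;
}

/* Returns the usage count after a CPU enters idle state idx. */
int cpu_idle_enter(int idx)
{
	/* Shallower states leave the count alone, as in the patch. */
	if (idx == DEEPEST_IDX) {
		assert(usage_count > 0);
		usage_count--;
	}
	return usage_count;
}

/* Returns the usage count after a CPU leaves idle state idx. */
int cpu_idle_exit(int idx)
{
	if (idx == DEEPEST_IDX)
		usage_count++;
	return usage_count;
}
```

In this model a CPU that picks a shallower state keeps the domain powered, which matches the limitation the commit message notes for shallower idle states.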