diff mbox

[9/9] ARM: smp: Add runtime PM support for CPU hotplug

Message ID 1438731339-58317-10-git-send-email-lina.iyer@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Lina Iyer Aug. 4, 2015, 11:35 p.m. UTC
Enable runtime PM for CPU devices. Do a runtime get of the CPU device
when the CPU is hotplugged in and a runtime put of the CPU device when
the CPU is hotplugged off. When all the CPUs in a domain are hotplugged
off, the domain may also be powered off and cluster_pm_enter/exit()
notifications are be sent out.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
---
 arch/arm/kernel/smp.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Comments

Kevin Hilman Aug. 12, 2015, 8:28 p.m. UTC | #1
Lina Iyer <lina.iyer@linaro.org> writes:

> Enable runtime PM for CPU devices. Do a runtime get of the CPU device
> when the CPU is hotplugged in and a runtime put of the CPU device when
> the CPU is hotplugged off. When all the CPUs in a domain are hotplugged
> off, the domain may also be powered off and cluster_pm_enter/exit()
> notifications are be sent out.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Signed-off-by: Lina Iyer <lina.iyer@linaro.org>

How does the runtiem PM usage with hotplug work with the runtime PM
usage in idle?

IIUC, when a CPU is hotplugged in, it will always have a usecount of at
least 1, right?  and if it's not idle, it will have done a _get() so it
will have a usecount of at least 2.   So I'm not quite seeing how the
usecount will ever go to zero in idle and allow the domain to power_off.

Kevin
Lina Iyer Aug. 12, 2015, 8:43 p.m. UTC | #2
On Wed, Aug 12 2015 at 14:28 -0600, Kevin Hilman wrote:
>Lina Iyer <lina.iyer@linaro.org> writes:
>
>> Enable runtime PM for CPU devices. Do a runtime get of the CPU device
>> when the CPU is hotplugged in and a runtime put of the CPU device when
>> the CPU is hotplugged off. When all the CPUs in a domain are hotplugged
>> off, the domain may also be powered off and cluster_pm_enter/exit()
>> notifications are be sent out.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>> Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
>
>How does the runtiem PM usage with hotplug work with the runtime PM
>usage in idle?
>
>IIUC, when a CPU is hotplugged in, it will always have a usecount of at
>least 1, right?  and if it's not idle, it will have done a _get() so it
>will have a usecount of at least 2.   So I'm not quite seeing how the
>usecount will ever go to zero in idle and allow the domain to power_off.
>
When the CPU is online and running, it would have a ref count of 1. Then
cpuidle would do a _put and the ref count would go to 0 and when coming
out of idle, cpuidle would get a _get and the ref count will be back at
1. Ref count is not incremented when cpuidle is initialized on the CPU.
So whenever the CPU is running, it would have a ref count of 1.

As of today, atleast my understanding is hotplug does not go back into
cpuidle, so the ref count would go to 0 when the core is hotplugged off.
This may change, if the CPU hotplug is routed to cpuidle.

Am I wrong in my understanding?

Thanks,
Lina
Stephen Boyd Aug. 12, 2015, 11:47 p.m. UTC | #3
On 08/04, Lina Iyer wrote:
> @@ -137,7 +138,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
>  	}
>  
> -

Remove noise please.

>  	memset(&secondary_data, 0, sizeof(secondary_data));
>  	return ret;
>  }
> @@ -205,6 +205,9 @@ int __cpu_disable(void)
>  	unsigned int cpu = smp_processor_id();
>  	int ret;
>  
> +	/* We dont need the CPU device anymore. */
> +	pm_runtime_put_sync(get_cpu_device(cpu));

This is all very generic. Any reason it can't be done at a higher
level for all architectures? It certainly seems like
cpu_startup_entry() could be modifed to do the
pm_runtime_get_sync().

> +
>  	ret = platform_cpu_disable(cpu);
>  	if (ret)
>  		return ret;
> @@ -272,6 +275,13 @@ void __ref cpu_die(void)
>  {
>  	unsigned int cpu = smp_processor_id();
>  
> +	/*
> +	 * We dont need the CPU device anymore.
> +	 * Lets do this before IRQs are disabled to allow
> +	 * runtime PM to suspend the domain as well.
> +	 */
> +	pm_runtime_put_sync(get_cpu_device(cpu));

The two put calls is confusing. __cpu_disable() is called on the
CPU that's dying, and cpu_die() is called on the CPU that's doing
the takedown. That would be two decrements but only one increment
in secondary_start_kernel()? How is this properly balanced?

> +
>  	idle_task_exit();
>  
>  	local_irq_disable();
> @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void)
>  	local_irq_enable();
>  	local_fiq_enable();
>  
> +	/* We are running, enable runtime PM for the CPU. */
> +	cpu_dev = get_cpu_device(cpu);
> +	if (cpu_dev)
> +		pm_runtime_get_sync(cpu_dev);

Also, where would the dev->power.irq_safe flag be set if we
aren't using the genpd DT stuff? It looks like we're going to
start causing warnings on devices that don't have the DT magic.
Please add a hotplug test with some device that isn't using this
genpd code to catch problems. Also please turn on lockdep and RCU
lockdep, touching the idle code like this
Lina Iyer Aug. 13, 2015, 4 p.m. UTC | #4
On Wed, Aug 12 2015 at 17:47 -0600, Stephen Boyd wrote:
>On 08/04, Lina Iyer wrote:
>> @@ -137,7 +138,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>>  		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
>>  	}
>>
>> -
>
>Remove noise please.
>
OK

>>  	memset(&secondary_data, 0, sizeof(secondary_data));
>>  	return ret;
>>  }
>> @@ -205,6 +205,9 @@ int __cpu_disable(void)
>>  	unsigned int cpu = smp_processor_id();
>>  	int ret;
>>
>> +	/* We dont need the CPU device anymore. */
>> +	pm_runtime_put_sync(get_cpu_device(cpu));
>
>This is all very generic. Any reason it can't be done at a higher
>level for all architectures? It certainly seems like
>cpu_startup_entry() could be modifed to do the
>pm_runtime_get_sync().
>
I am suspecting, when the concept of CPU PM domains are finalized, they
would probably move out of the ARM domain and into generic.  Will keep
that in mind.

>> +
>>  	ret = platform_cpu_disable(cpu);
>>  	if (ret)
>>  		return ret;
>> @@ -272,6 +275,13 @@ void __ref cpu_die(void)
>>  {
>>  	unsigned int cpu = smp_processor_id();
>>
>> +	/*
>> +	 * We dont need the CPU device anymore.
>> +	 * Lets do this before IRQs are disabled to allow
>> +	 * runtime PM to suspend the domain as well.
>> +	 */
>> +	pm_runtime_put_sync(get_cpu_device(cpu));
>
>The two put calls is confusing. __cpu_disable() is called on the
>CPU that's dying, and cpu_die() is called on the CPU that's doing
>the takedown.
>
Is that right? Looking at the code and the comments, I can only imagine
that they must be called on the CPU going down. If thats not the case,
then I need to fix this.

>That would be two decrements but only one increment
>in secondary_start_kernel()? How is this properly balanced?
>
I dont see __cpu_disable() ending up at cpu_die(). These seem two
different exit points. I will check again.

>> +
>>  	idle_task_exit();
>>
>>  	local_irq_disable();
>> @@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void)
>>  	local_irq_enable();
>>  	local_fiq_enable();
>>
>> +	/* We are running, enable runtime PM for the CPU. */
>> +	cpu_dev = get_cpu_device(cpu);
>> +	if (cpu_dev)
>> +		pm_runtime_get_sync(cpu_dev);
>
>Also, where would the dev->power.irq_safe flag be set if we
>aren't using the genpd DT stuff? It looks like we're going to
>start causing warnings on devices that don't have the DT magic.
>
Not necessarily. I have added _get and _put at points, when the
interrupts are still enabled. So there should not be a need for the CPU
devices to be IRQ safe. They will operate as regular devices. If they
are attached to a non-IRQ safe domain, they would effect power savings
on the domain.

>Please add a hotplug test with some device that isn't using this
>genpd code to catch problems. Also please turn on lockdep and RCU
>lockdep, touching the idle code like this
>
Good idea. Will add and test.

Thanks Stephen for the review.

Thanks,
Lina
Stephen Boyd Aug. 13, 2015, 7:18 p.m. UTC | #5
On 08/13, Lina Iyer wrote:
> On Wed, Aug 12 2015 at 17:47 -0600, Stephen Boyd wrote:
> >On 08/04, Lina Iyer wrote:
> 
> >>+
> >> 	ret = platform_cpu_disable(cpu);
> >> 	if (ret)
> >> 		return ret;
> >>@@ -272,6 +275,13 @@ void __ref cpu_die(void)
> >> {
> >> 	unsigned int cpu = smp_processor_id();
> >>
> >>+	/*
> >>+	 * We dont need the CPU device anymore.
> >>+	 * Lets do this before IRQs are disabled to allow
> >>+	 * runtime PM to suspend the domain as well.
> >>+	 */
> >>+	pm_runtime_put_sync(get_cpu_device(cpu));
> >
> >The two put calls is confusing. __cpu_disable() is called on the
> >CPU that's dying, and cpu_die() is called on the CPU that's doing
> >the takedown.
> >
> Is that right? Looking at the code and the comments, I can only imagine
> that they must be called on the CPU going down. If thats not the case,
> then I need to fix this.
> 
> >That would be two decrements but only one increment
> >in secondary_start_kernel()? How is this properly balanced?
> >
> I dont see __cpu_disable() ending up at cpu_die(). These seem two
> different exit points. I will check again.

Yeah I suspect we want a single call in __cpu_disable() so that
it runs on the CPU that's being hotplugged out.

> 
> >>+
> >> 	idle_task_exit();
> >>
> >> 	local_irq_disable();
> >>@@ -401,6 +412,11 @@ asmlinkage void secondary_start_kernel(void)
> >> 	local_irq_enable();
> >> 	local_fiq_enable();
> >>
> >>+	/* We are running, enable runtime PM for the CPU. */
> >>+	cpu_dev = get_cpu_device(cpu);
> >>+	if (cpu_dev)
> >>+		pm_runtime_get_sync(cpu_dev);
> >
> >Also, where would the dev->power.irq_safe flag be set if we
> >aren't using the genpd DT stuff? It looks like we're going to
> >start causing warnings on devices that don't have the DT magic.
> >
> Not necessarily. I have added _get and _put at points, when the
> interrupts are still enabled. So there should not be a need for the CPU
> devices to be IRQ safe. They will operate as regular devices. If they
> are attached to a non-IRQ safe domain, they would effect power savings
> on the domain.

What about preemption? Preemption is disabled in __cpu_disable()
and secondary_start_kernel() where this patch is calling
pm_runtime_{get,put}_sync(). That should still trigger a warning
with the might_sleep() inside the runtime PM functions if we
haven't set the irq_safe flag.
Kevin Hilman Aug. 14, 2015, 6:59 p.m. UTC | #6
Lina Iyer <lina.iyer@linaro.org> writes:

> On Wed, Aug 12 2015 at 14:28 -0600, Kevin Hilman wrote:
>>Lina Iyer <lina.iyer@linaro.org> writes:
>>
>>> Enable runtime PM for CPU devices. Do a runtime get of the CPU device
>>> when the CPU is hotplugged in and a runtime put of the CPU device when
>>> the CPU is hotplugged off. When all the CPUs in a domain are hotplugged
>>> off, the domain may also be powered off and cluster_pm_enter/exit()
>>> notifications are be sent out.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>>> Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
>>
>>How does the runtiem PM usage with hotplug work with the runtime PM
>>usage in idle?
>>
>>IIUC, when a CPU is hotplugged in, it will always have a usecount of at
>>least 1, right?  and if it's not idle, it will have done a _get() so it
>>will have a usecount of at least 2.   So I'm not quite seeing how the
>>usecount will ever go to zero in idle and allow the domain to power_off.
>>
> When the CPU is online and running, it would have a ref count of 1. Then
> cpuidle would do a _put and the ref count would go to 0 and when coming
> out of idle, cpuidle would get a _get and the ref count will be back at
> 1. Ref count is not incremented when cpuidle is initialized on the CPU.
> So whenever the CPU is running, it would have a ref count of 1.

Ah, OK, that was the part I was missing.

Thanks,

Kevin
diff mbox

Patch

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 0496b48..1d614b8 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -27,6 +27,7 @@ 
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/pm_runtime.h>
 
 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -137,7 +138,6 @@  int __cpu_up(unsigned int cpu, struct task_struct *idle)
 		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 	}
 
-
 	memset(&secondary_data, 0, sizeof(secondary_data));
 	return ret;
 }
@@ -205,6 +205,9 @@  int __cpu_disable(void)
 	unsigned int cpu = smp_processor_id();
 	int ret;
 
+	/* We dont need the CPU device anymore. */
+	pm_runtime_put_sync(get_cpu_device(cpu));
+
 	ret = platform_cpu_disable(cpu);
 	if (ret)
 		return ret;
@@ -272,6 +275,13 @@  void __ref cpu_die(void)
 {
 	unsigned int cpu = smp_processor_id();
 
+	/*
+	 * We dont need the CPU device anymore.
+	 * Lets do this before IRQs are disabled to allow
+	 * runtime PM to suspend the domain as well.
+	 */
+	pm_runtime_put_sync(get_cpu_device(cpu));
+
 	idle_task_exit();
 
 	local_irq_disable();
@@ -352,6 +362,7 @@  asmlinkage void secondary_start_kernel(void)
 {
 	struct mm_struct *mm = &init_mm;
 	unsigned int cpu;
+	struct device *cpu_dev;
 
 	/*
 	 * The identity mapping is uncached (strongly ordered), so
@@ -401,6 +412,11 @@  asmlinkage void secondary_start_kernel(void)
 	local_irq_enable();
 	local_fiq_enable();
 
+	/* We are running, enable runtime PM for the CPU. */
+	cpu_dev = get_cpu_device(cpu);
+	if (cpu_dev)
+		pm_runtime_get_sync(cpu_dev);
+
 	/*
 	 * OK, it's off to the idle thread for us
 	 */