diff mbox

[1/8] arm64: Use cpu_ops for smp_stop

Message ID dfed1c27146457caec8f3c228a499f5e44678fd2.1399594544.git.geoff@infradead.org (mailing list archive)
State New, archived
Headers show

Commit Message

Geoff Levand May 9, 2014, 12:48 a.m. UTC
The current implementation of ipi_cpu_stop() is just a tight infinite loop
around cpu_relax().  Add a check for a valid cpu_die method of the appropriate
cpu_operations structure, and if a valid method is found, transfer control to
that method.

The core kexec code calls the arch specific machine_shutdown() routine to
shutdown any SMP secondary CPUs.  The current implementation of the arm64
machine_shutdown() uses smp_send_stop(), which ultimately runs ipi_cpu_stop()
on the secondary CPUs.  The infinite loop implementation of the current
ipi_cpu_stop() does not have any mechanism to get the CPU into a state
compatable with a kexec re-boot.

Signed-off-by: Geoff Levand <geoff@infradead.org>
---
 arch/arm64/kernel/smp.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Mark Rutland May 9, 2014, 8:44 a.m. UTC | #1
On Fri, May 09, 2014 at 01:48:17AM +0100, Geoff Levand wrote:
> The current implementation of ipi_cpu_stop() is just a tight infinite loop
> around cpu_relax().  Add a check for a valid cpu_die method of the appropriate
> cpu_operations structure, and if a valid method is found, transfer control to
> that method.
> 
> The core kexec code calls the arch specific machine_shutdown() routine to
> shutdown any SMP secondary CPUs.  The current implementation of the arm64
> machine_shutdown() uses smp_send_stop(), which ultimately runs ipi_cpu_stop()
> on the secondary CPUs.  The infinite loop implementation of the current
> ipi_cpu_stop() does not have any mechanism to get the CPU into a state
> compatable with a kexec re-boot.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/smp.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index f0a141d..020bbd5 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -508,6 +508,14 @@ static void ipi_cpu_stop(unsigned int cpu)
>  
>  	local_irq_disable();
>  
> +	/* If we have the cup_ops use them. */
> +
> +	if (cpu_ops[cpu]->cpu_disable && cpu_ops[cpu]->cpu_die
> +		&& !cpu_ops[cpu]->cpu_disable(cpu))
> +		cpu_ops[cpu]->cpu_die(cpu);

For PSCI 0.2 support, we're going to need a cpu_kill callback which we
can't call from the dying CPU. Specifically, we'll need to poll
CPU_AFFINITY_INFO to ensure that secondaries have _actually_ left the
kernel and aren't going to be adversely affected by the kernel text
getting clobbered.

As we're going to wire that up to the cpu hotplug infrastructure it
would be nice to perform the hotplug for kexec by reusing the generic
hotplug infrastructure rather than calling portions of the arm64
implementation directly.

> +
> +	/* Spin here if the cup_ops fail. */
> +
>  	while (1)
>  		cpu_relax();

This seems very dodgy to me. If a CPU doesn't actually die it's going to
be spinning in some memory that we may later clobber. At that point the
CPU will do arbitrarily bad things when it begins executing whatever its
currently executing instructions (or vectors) were replaced by, and you
will waste hours trying to figure out what went wrong (See 8121cf312a19
"ARM: 7766/1: versatile: don't mark pen as __INIT" for a similar mess).

If we fail to hotplug a CPU we at minimum need some acknowledgement that
we failed. I would rather we failed to kexec entirely in that case.

Cheers,
Mark.
Geoff Levand May 13, 2014, 10:27 p.m. UTC | #2
Hi Mark,

On Fri, 2014-05-09 at 09:44 +0100, Mark Rutland wrote:
> On Fri, May 09, 2014 at 01:48:17AM +0100, Geoff Levand wrote:
> > +	/* If we have the cup_ops use them. */
> > +
> > +	if (cpu_ops[cpu]->cpu_disable && cpu_ops[cpu]->cpu_die
> > +		&& !cpu_ops[cpu]->cpu_disable(cpu))
> > +		cpu_ops[cpu]->cpu_die(cpu);
> 
> For PSCI 0.2 support, we're going to need a cpu_kill callback which we
> can't call from the dying CPU. Specifically, we'll need to poll
> CPU_AFFINITY_INFO to ensure that secondaries have _actually_ left the
> kernel and aren't going to be adversely affected by the kernel text
> getting clobbered.
> 
> As we're going to wire that up to the cpu hotplug infrastructure it
> would be nice to perform the hotplug for kexec by reusing the generic
> hotplug infrastructure rather than calling portions of the arm64
> implementation directly.

OK, is there somewhere I can see that new code, and when do you expect
it to be merged?

> > +
> > +	/* Spin here if the cup_ops fail. */
> > +
> >  	while (1)
> >  		cpu_relax();
> 
> This seems very dodgy to me. If a CPU doesn't actually die it's going to
> be spinning in some memory that we may later clobber. At that point the
> CPU will do arbitrarily bad things when it begins executing whatever its
> currently executing instructions (or vectors) were replaced by, and you
> will waste hours trying to figure out what went wrong (See 8121cf312a19
> "ARM: 7766/1: versatile: don't mark pen as __INIT" for a similar mess).
> 
> If we fail to hotplug a CPU we at minimum need some acknowledgement that
> we failed. I would rather we failed to kexec entirely in that case.

This loop is for the non-hotplug power-off shutdown.  This whole smp_stop
support needs to be reconsidered for a hotplug spin-table re-work.

-Geoff
diff mbox

Patch

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index f0a141d..020bbd5 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -508,6 +508,14 @@  static void ipi_cpu_stop(unsigned int cpu)
 
 	local_irq_disable();
 
+	/* If we have the cup_ops use them. */
+
+	if (cpu_ops[cpu]->cpu_disable && cpu_ops[cpu]->cpu_die
+		&& !cpu_ops[cpu]->cpu_disable(cpu))
+		cpu_ops[cpu]->cpu_die(cpu);
+
+	/* Spin here if the cup_ops fail. */
+
 	while (1)
 		cpu_relax();
 }