Message ID | 1355960681-32015-1-git-send-email-swarren@wwwdotorg.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Stephen Warren <swarren@wwwdotorg.org> writes: > From: Stephen Warren <swarren@nvidia.com> > > Both the regular reboot and shutdown paths do this. It seems reasonable > for kexec to work the same way. > > On the Tegra ARM SoC at least, this change allows kexec to work with SMP > enabled. ARM's machine_shutdown() simply puts all CPUs into a loop. If > the code of that loop is over-written, the CPUs may crash, and cause the > kexec'd kernel not to be able to initialize them. In practice, this > causes the kexec'd kernel to hang and/or crash. The intended way to > solve this is for ARM machines to provide a cpu_kill SMP operation to > e.g. power down the CPUs, or place them in reset. However, at least on > Tegra, the implementation of that function would simply be duplicating > the hotplug code that already exists, so it seems simpler to just call > disable_nonboot_cpus() for the kexec path, just like reboot/shutdown. I would like to ack this but I think this is one of those differences for which there is a reason. On x86 this happens in machine_shutdown. Before this can be merged someone needs to look into the history and see why we don't share this code path. My hunch is because it was duplication of effort and the stop machine code has historically been fragile. Eric > Signed-off-by: Stephen Warren <swarren@nvidia.com> > --- > kernel/kexec.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/kexec.c b/kernel/kexec.c > index 5e4bd78..6fe74d3 100644 > --- a/kernel/kexec.c > +++ b/kernel/kexec.c > @@ -1570,6 +1570,7 @@ int kernel_kexec(void) > #endif > { > kernel_restart_prepare(NULL); > + disable_nonboot_cpus(); > printk(KERN_EMERG "Starting new kernel\n"); > machine_shutdown(); > }
Hi Stephen, On Wed, Dec 19, 2012 at 11:44:41PM +0000, Stephen Warren wrote: > From: Stephen Warren <swarren@nvidia.com> > > Both the regular reboot and shutdown paths do this. It seems reasonable > for kexec to work the same way. > > On the Tegra ARM SoC at least, this change allows kexec to work with SMP > enabled. ARM's machine_shutdown() simply puts all CPUs into a loop. If > the code of that loop is over-written, the CPUs may crash, and cause the > kexec'd kernel not to be able to initialize them. In practice, this > causes the kexec'd kernel to hang and/or crash. The intended way to > solve this is for ARM machines to provide a cpu_kill SMP operation to > e.g. power down the CPUs, or place them in reset. However, at least on > Tegra, the implementation of that function would simply be duplicating > the hotplug code that already exists, so it seems simpler to just call > disable_nonboot_cpus() for the kexec path, just like reboot/shutdown. If you do manage to get this merged, please can you follow up with a patch to remove the smp_kill_cpus bits from arch/arm/kernel/smp.c please? It only exists as a hook to do exactly this and currently nobody is using it afaict. Cheers, Will
On 12/20/2012 03:49 AM, Will Deacon wrote: > Hi Stephen, > > On Wed, Dec 19, 2012 at 11:44:41PM +0000, Stephen Warren wrote: >> From: Stephen Warren <swarren@nvidia.com> >> >> Both the regular reboot and shutdown paths do this. It seems reasonable >> for kexec to work the same way. >> >> On the Tegra ARM SoC at least, this change allows kexec to work with SMP >> enabled. ARM's machine_shutdown() simply puts all CPUs into a loop. If >> the code of that loop is over-written, the CPUs may crash, and cause the >> kexec'd kernel not to be able to initialize them. In practice, this >> causes the kexec'd kernel to hang and/or crash. The intended way to >> solve this is for ARM machines to provide a cpu_kill SMP operation to >> e.g. power down the CPUs, or place them in reset. However, at least on >> Tegra, the implementation of that function would simply be duplicating >> the hotplug code that already exists, so it seems simpler to just call >> disable_nonboot_cpus() for the kexec path, just like reboot/shutdown. > > If you do manage to get this merged, please can you follow up with a patch > to remove the smp_kill_cpus bits from arch/arm/kernel/smp.c please? It only > exists as a hook to do exactly this and currently nobody is using it afaict. I originally implemented this in arch/arm/kernel/process.c:machine_shutdown(), which currently is: void machine_shutdown(void) { #ifdef CONFIG_SMP smp_send_stop(); #endif } and I changed it to something like: void machine_shutdown(void) { #ifdef CONFIG_HOTPLUG_CPU disable_nonboot_cpus(); #elifdef CONFIG_SMP smp_send_stop(); #endif } ... but then figured that moving it up into the core kexec code would be better, so that everything always worked the same way. Anyway, the change above addresses Eric's concern about isolating the change to ARM. Does that seem like a reasonable thing for the ARM code to do?
On Thu, Dec 20, 2012 at 05:21:56PM +0000, Stephen Warren wrote: > On 12/20/2012 03:49 AM, Will Deacon wrote: > > If you do manage to get this merged, please can you follow up with a patch > > to remove the smp_kill_cpus bits from arch/arm/kernel/smp.c please? It only > > exists as a hook to do exactly this and currently nobody is using it afaict. > > I originally implemented this in > arch/arm/kernel/process.c:machine_shutdown(), which currently is: > > void machine_shutdown(void) > { > #ifdef CONFIG_SMP > smp_send_stop(); > #endif > } > > and I changed it to something like: > > void machine_shutdown(void) > { > #ifdef CONFIG_HOTPLUG_CPU > disable_nonboot_cpus(); > #elifdef CONFIG_SMP > smp_send_stop(); > #endif > } > > ... but then figured that moving it up into the core kexec code would be > better, so that everything always worked the same way. Hmmm, isn't this racy: requiring the secondaries to hit idle and notice they're offline and call cpu_die before the primary has replace the kernel image? > Anyway, the change above addresses Eric's concern about isolating the > change to ARM. Does that seem like a reasonable thing for the ARM code > to do? I think you're better off using what we currently have and hanging your code off platform_cpu_kill. Will
On 12/20/2012 10:36 AM, Will Deacon wrote: > On Thu, Dec 20, 2012 at 05:21:56PM +0000, Stephen Warren wrote: >> On 12/20/2012 03:49 AM, Will Deacon wrote: >>> If you do manage to get this merged, please can you follow up with a patch >>> to remove the smp_kill_cpus bits from arch/arm/kernel/smp.c please? It only >>> exists as a hook to do exactly this and currently nobody is using it afaict. >> >> I originally implemented this in >> arch/arm/kernel/process.c:machine_shutdown(), which currently is: >> >> void machine_shutdown(void) >> { >> #ifdef CONFIG_SMP >> smp_send_stop(); >> #endif >> } >> >> and I changed it to something like: >> >> void machine_shutdown(void) >> { >> #ifdef CONFIG_HOTPLUG_CPU >> disable_nonboot_cpus(); >> #elifdef CONFIG_SMP >> smp_send_stop(); >> #endif >> } >> >> ... but then figured that moving it up into the core kexec code would be >> better, so that everything always worked the same way. > > Hmmm, isn't this racy: requiring the secondaries to hit idle and notice > they're offline and call cpu_die before the primary has replace the kernel > image? Isn't disable_nonboot_cpus() synchronous? If not, I imagine my original patch wasn't any better in this respect, except that the hotunplug happened earlier, and hence reduced the likelihood of actually seeing any such issues. >> Anyway, the change above addresses Eric's concern about isolating the >> change to ARM. Does that seem like a reasonable thing for the ARM code >> to do? > > I think you're better off using what we currently have and hanging your code > off platform_cpu_kill. OK, I'll look into that. Joseph Lo just posted patches to implement cpu_kill() on Tegra, which was needed to fix some issues in our hotplug code anyway. Perhaps that will remove the need for any other changes...
On 12/20/2012 10:59 AM, Stephen Warren wrote: > On 12/20/2012 10:36 AM, Will Deacon wrote: >> On Thu, Dec 20, 2012 at 05:21:56PM +0000, Stephen Warren wrote: >>> On 12/20/2012 03:49 AM, Will Deacon wrote: >>>> If you do manage to get this merged, please can you follow up with a patch >>>> to remove the smp_kill_cpus bits from arch/arm/kernel/smp.c please? It only >>>> exists as a hook to do exactly this and currently nobody is using it afaict. >>> >>> I originally implemented this in >>> arch/arm/kernel/process.c:machine_shutdown(), which currently is: >>> >>> void machine_shutdown(void) >>> { >>> #ifdef CONFIG_SMP >>> smp_send_stop(); >>> #endif >>> } >>> >>> and I changed it to something like: >>> >>> void machine_shutdown(void) >>> { >>> #ifdef CONFIG_HOTPLUG_CPU >>> disable_nonboot_cpus(); >>> #elifdef CONFIG_SMP >>> smp_send_stop(); >>> #endif >>> } >>> >>> ... but then figured that moving it up into the core kexec code would be >>> better, so that everything always worked the same way. >> >> Hmmm, isn't this racy: requiring the secondaries to hit idle and notice >> they're offline and call cpu_die before the primary has replace the kernel >> image? > > Isn't disable_nonboot_cpus() synchronous? If not, I imagine my original > patch wasn't any better in this respect, except that the hotunplug > happened earlier, and hence reduced the likelihood of actually seeing > any such issues. > >>> Anyway, the change above addresses Eric's concern about isolating the >>> change to ARM. Does that seem like a reasonable thing for the ARM code >>> to do? >> >> I think you're better off using what we currently have and hanging your code >> off platform_cpu_kill. > > OK, I'll look into that. Joseph Lo just posted patches to implement > cpu_kill() on Tegra, which was needed to fix some issues in our hotplug > code anyway. Perhaps that will remove the need for any other changes... Will, I just remembered one other advantage of disable_nonboot_cpus(); it always makes the kexec happen on the boot CPU. Without this, I believe it's random whether CPU0 or CPU1 performs the kexec. I suspect it's most likely to work if we can always kexec on the boot CPU rather than a random CPU? Joseph, Peter, As you know, I've been looking into kexec[2] on Tegra. Here's a summary of a few tests I did: linux-next plus nothing in particular, SMP enabled: * Hangs/crashes during kexec linux-next + SMP disabled: kexec works linux-next + hotunplug CPUs other than CPU0 before kexec: kexec works Will Deacon suggested this was due to a missing implementation of struct smp_operations .cpu_kill on Tegra, which means that when CPUn are present, they're simply spinning executing code, and kexec will eventually overwrite that code, causing all manner of problems. So, since I noticed that Joseph posted an implementation of .cpu_kill for Tegra, I tried that out[1]. It certainly doesn't solve the problem, and in fact actually causes the kernel performing the kexec (rather than the kexec'd kernel) to hang: > [ 26.291903] Starting new kernel > [ 47.309571] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=2102 jiffies, g=211, c=210, q=37) > [ 47.323410] Task dump for CPU 1: > [ 47.329763] dd R running 0 401 1 0x00000000 > [ 47.339343] [<c0521690>] (__schedule+0x360/0x600) from [<c002d9b0>] (do_syslog+0x2b4/0x478) > [ 47.350952] [<c002d9b0>] (do_syslog+0x2b4/0x478) from [<ed86bb08>] (0xed86bb08) Manually hotunplugging the CPUs first still works OK with those patches applied though. I /think/ kexec calls .cpu_kill() without having caused the CPU itself to call .cpu_die() first? Did you allow for that possibility inside the implementation you posted? [1] http://patchwork.ozlabs.org/patch/207601/ http://patchwork.ozlabs.org/patch/207602/ [2] http://en.wikipedia.org/wiki/Kexec
On Thu, Dec 20, 2012 at 08:15:30PM +0000, Stephen Warren wrote: > On 12/20/2012 10:59 AM, Stephen Warren wrote: > > On 12/20/2012 10:36 AM, Will Deacon wrote: > >> I think you're better off using what we currently have and hanging your code > >> off platform_cpu_kill. > > > > OK, I'll look into that. Joseph Lo just posted patches to implement > > cpu_kill() on Tegra, which was needed to fix some issues in our hotplug > > code anyway. Perhaps that will remove the need for any other changes... > > Will, Hi Stephen, > I just remembered one other advantage of disable_nonboot_cpus(); it > always makes the kexec happen on the boot CPU. Without this, I believe > it's random whether CPU0 or CPU1 performs the kexec. I suspect it's most > likely to work if we can always kexec on the boot CPU rather than a > random CPU? It will be the same CPU as the one which made the kexec system call. If you need to invoke this from a specific CPU, you currently have to do this in userspace using something like the `taskset' utility. Linux can actually boot on any physical CPU for ARM now (thanks to the logical map) but most platforms have restrictions related to secure firmware. Will
diff --git a/kernel/kexec.c b/kernel/kexec.c index 5e4bd78..6fe74d3 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -1570,6 +1570,7 @@ int kernel_kexec(void) #endif { kernel_restart_prepare(NULL); + disable_nonboot_cpus(); printk(KERN_EMERG "Starting new kernel\n"); machine_shutdown(); }