Message ID | 20220719195325.402745-2-gpiccoli@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | The panic notifiers refactor strikes back - fixes/clean-ups | expand |
On 19/07/2022 16:53, Guilherme G. Piccoli wrote: > Currently the regular CPU shutdown path for ARM disables IRQs/FIQs > in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which > is responsible for that. IRQs are architecturally masked when we > take an interrupt, but FIQs are high priority than IRQs, hence they > aren't masked. With that said, it makes sense to disable FIQs here, > but there's no need for (re-)disabling IRQs. > > More than that: there is an alternative path for disabling CPUs, > in the form of function crash_smp_send_stop(), which is used for > kexec/panic path. This function relies on a SMP call that also > triggers a busy-wait loop [at machine_crash_nonpanic_core()], but > without disabling FIQs. This might lead to odd scenarios, like > early interrupts in the boot of kexec'd kernel or even interrupts > in secondary "disabled" CPUs while the main one still works in the > panic path and assumes all secondary CPUs are (really!) off. > > So, let's disable FIQs in both paths and *not* disable IRQs a second > time, since they are already masked in both paths by the architecture. > This way, we keep both CPU quiesce paths consistent and safe. > > Cc: Marc Zyngier <maz@kernel.org> > Cc: Michael Kelley <mikelley@microsoft.com> > Cc: Russell King <linux@armlinux.org.uk> > Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> > > --- > > V2: > - Small wording improvement (thanks Michael Kelley); > - Only disable FIQs, since IRQs are masked by architecture > definition when we take an interrupt. Thanks a lot Russell > and Marc for the discussion [0]. > > Should we add a Fixes tag here? If so, maybe the proper target is: > b23065313297 ("ARM: 6522/1: kexec: Add call to non-crashing cores through IPI") > > [0] https://lore.kernel.org/lkml/Ymxcaqy6DwhoQrZT@shell.armlinux.org.uk/ > > arch/arm/kernel/machine_kexec.c | 2 ++ > arch/arm/kernel/smp.c | 5 ++--- > 2 files changed, 4 insertions(+), 3 deletions(-) > [...] Hi Mark / Russell, do you think this one is good enough or is there room for improvement? Appreciate the reviews! Cheers, Guilherme
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c index f567032a09c0..0b482bcb97f7 100644 --- a/arch/arm/kernel/machine_kexec.c +++ b/arch/arm/kernel/machine_kexec.c @@ -77,6 +77,8 @@ void machine_crash_nonpanic_core(void *unused) { struct pt_regs regs; + local_fiq_disable(); + crash_setup_regs(®s, get_irq_regs()); printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has crashed\n", smp_processor_id()); diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 73fc645fc4c7..0c24ec1fd88e 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -600,6 +600,8 @@ static DEFINE_RAW_SPINLOCK(stop_lock); */ static void ipi_cpu_stop(unsigned int cpu) { + local_fiq_disable(); + if (system_state <= SYSTEM_RUNNING) { raw_spin_lock(&stop_lock); pr_crit("CPU%u: stopping\n", cpu); @@ -609,9 +611,6 @@ static void ipi_cpu_stop(unsigned int cpu) set_cpu_online(cpu, false); - local_fiq_disable(); - local_irq_disable(); - while (1) { cpu_relax(); wfe();
Currently the regular CPU shutdown path for ARM disables IRQs/FIQs in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which is responsible for that. IRQs are architecturally masked when we take an interrupt, but FIQs are high priority than IRQs, hence they aren't masked. With that said, it makes sense to disable FIQs here, but there's no need for (re-)disabling IRQs. More than that: there is an alternative path for disabling CPUs, in the form of function crash_smp_send_stop(), which is used for kexec/panic path. This function relies on a SMP call that also triggers a busy-wait loop [at machine_crash_nonpanic_core()], but without disabling FIQs. This might lead to odd scenarios, like early interrupts in the boot of kexec'd kernel or even interrupts in secondary "disabled" CPUs while the main one still works in the panic path and assumes all secondary CPUs are (really!) off. So, let's disable FIQs in both paths and *not* disable IRQs a second time, since they are already masked in both paths by the architecture. This way, we keep both CPU quiesce paths consistent and safe. Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> --- V2: - Small wording improvement (thanks Michael Kelley); - Only disable FIQs, since IRQs are masked by architecture definition when we take an interrupt. Thanks a lot Russell and Marc for the discussion [0]. Should we add a Fixes tag here? If so, maybe the proper target is: b23065313297 ("ARM: 6522/1: kexec: Add call to non-crashing cores through IPI") [0] https://lore.kernel.org/lkml/Ymxcaqy6DwhoQrZT@shell.armlinux.org.uk/ arch/arm/kernel/machine_kexec.c | 2 ++ arch/arm/kernel/smp.c | 5 ++--- 2 files changed, 4 insertions(+), 3 deletions(-)