diff mbox series

[v2,01/13] ARM: Disable FIQs (but not IRQs) on CPUs shutdown paths

Message ID 20220719195325.402745-2-gpiccoli@igalia.com (mailing list archive)
State New, archived
Headers show
Series The panic notifiers refactor strikes back - fixes/clean-ups | expand

Commit Message

Guilherme G. Piccoli July 19, 2022, 7:53 p.m. UTC
Currently the regular CPU shutdown path for ARM disables IRQs/FIQs
in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which
is responsible for that. IRQs are architecturally masked when we
take an interrupt, but FIQs are high priority than IRQs, hence they
aren't masked. With that said, it makes sense to disable FIQs here,
but there's no need for (re-)disabling IRQs.

More than that: there is an alternative path for disabling CPUs,
in the form of function crash_smp_send_stop(), which is used for
kexec/panic path. This function relies on a SMP call that also
triggers a busy-wait loop [at machine_crash_nonpanic_core()], but
without disabling FIQs. This might lead to odd scenarios, like
early interrupts in the boot of kexec'd kernel or even interrupts
in secondary "disabled" CPUs while the main one still works in the
panic path and assumes all secondary CPUs are (really!) off.

So, let's disable FIQs in both paths and *not* disable IRQs a second
time, since they are already masked in both paths by the architecture.
This way, we keep both CPU quiesce paths consistent and safe.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>

---

V2:
- Small wording improvement (thanks Michael Kelley);
- Only disable FIQs, since IRQs are masked by architecture
definition when we take an interrupt. Thanks a lot Russell
and Marc for the discussion [0].

Should we add a Fixes tag here? If so, maybe the proper target is:
b23065313297 ("ARM: 6522/1: kexec: Add call to non-crashing cores through IPI")

[0] https://lore.kernel.org/lkml/Ymxcaqy6DwhoQrZT@shell.armlinux.org.uk/

 arch/arm/kernel/machine_kexec.c | 2 ++
 arch/arm/kernel/smp.c           | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

Comments

Guilherme G. Piccoli Aug. 7, 2022, 3:35 p.m. UTC | #1
On 19/07/2022 16:53, Guilherme G. Piccoli wrote:
> Currently the regular CPU shutdown path for ARM disables IRQs/FIQs
> in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which
> is responsible for that. IRQs are architecturally masked when we
> take an interrupt, but FIQs are high priority than IRQs, hence they
> aren't masked. With that said, it makes sense to disable FIQs here,
> but there's no need for (re-)disabling IRQs.
> 
> More than that: there is an alternative path for disabling CPUs,
> in the form of function crash_smp_send_stop(), which is used for
> kexec/panic path. This function relies on a SMP call that also
> triggers a busy-wait loop [at machine_crash_nonpanic_core()], but
> without disabling FIQs. This might lead to odd scenarios, like
> early interrupts in the boot of kexec'd kernel or even interrupts
> in secondary "disabled" CPUs while the main one still works in the
> panic path and assumes all secondary CPUs are (really!) off.
> 
> So, let's disable FIQs in both paths and *not* disable IRQs a second
> time, since they are already masked in both paths by the architecture.
> This way, we keep both CPU quiesce paths consistent and safe.
> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Michael Kelley <mikelley@microsoft.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> 
> ---
> 
> V2:
> - Small wording improvement (thanks Michael Kelley);
> - Only disable FIQs, since IRQs are masked by architecture
> definition when we take an interrupt. Thanks a lot Russell
> and Marc for the discussion [0].
> 
> Should we add a Fixes tag here? If so, maybe the proper target is:
> b23065313297 ("ARM: 6522/1: kexec: Add call to non-crashing cores through IPI")
> 
> [0] https://lore.kernel.org/lkml/Ymxcaqy6DwhoQrZT@shell.armlinux.org.uk/
> 
>  arch/arm/kernel/machine_kexec.c | 2 ++
>  arch/arm/kernel/smp.c           | 5 ++---
>  2 files changed, 4 insertions(+), 3 deletions(-)
> [...]

Hi Mark / Russell, do you think this one is good enough or is there room
for improvement?

Appreciate the reviews!
Cheers,


Guilherme
diff mbox series

Patch

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index f567032a09c0..0b482bcb97f7 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -77,6 +77,8 @@  void machine_crash_nonpanic_core(void *unused)
 {
 	struct pt_regs regs;
 
+	local_fiq_disable();
+
 	crash_setup_regs(&regs, get_irq_regs());
 	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has crashed\n",
 	       smp_processor_id());
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 73fc645fc4c7..0c24ec1fd88e 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -600,6 +600,8 @@  static DEFINE_RAW_SPINLOCK(stop_lock);
  */
 static void ipi_cpu_stop(unsigned int cpu)
 {
+	local_fiq_disable();
+
 	if (system_state <= SYSTEM_RUNNING) {
 		raw_spin_lock(&stop_lock);
 		pr_crit("CPU%u: stopping\n", cpu);
@@ -609,9 +611,6 @@  static void ipi_cpu_stop(unsigned int cpu)
 
 	set_cpu_online(cpu, false);
 
-	local_fiq_disable();
-	local_irq_disable();
-
 	while (1) {
 		cpu_relax();
 		wfe();