diff mbox series

[22/37] arm64: smp: Switch to hotplug core state synchronization

Message ID 20230414232310.569498144@linutronix.de (mailing list archive)
State New, archived
Headers show
Series cpu/hotplug, x86: Reworked parallel CPU bringup | expand

Commit Message

Thomas Gleixner April 14, 2023, 11:44 p.m. UTC
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
---
 arch/arm64/Kconfig           |    1 +
 arch/arm64/include/asm/smp.h |    2 +-
 arch/arm64/kernel/smp.c      |   14 +++++---------
 3 files changed, 7 insertions(+), 10 deletions(-)

Comments

Mark Rutland April 17, 2023, 3:50 p.m. UTC | #1
On Sat, Apr 15, 2023 at 01:44:49AM +0200, Thomas Gleixner wrote:
> Switch to the CPU hotplug core state tracking and synchronization
> mechanim. No functional change intended.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org

I gave this a spin on arm64 (in a 64-vCPU VM on an M1 host), and it seems to
work fine with a bunch of vCPUs being hotplugged off and on again randomly.

FWIW:

Tested-by: Mark Rutland <mark.rutland@arm.com>

I also hacked the code to have the dying CPU spin forever before the call to
cpuhp_ap_report_dead(). In that case I see a warning, and that we don't call
arch_cpuhp_cleanup_dead_cpu(), and that the CPU is marked as offline (per
/sys/devices/system/cpu/$N/online).

As a tangent/aside, we might need to improve that for confidential compute
architectures, and we might want to generically track cpus which might still be
using kernel text/data. On arm64 we ensure that via our cpu_kill() callback
(which'll use PSCI CPU_AFFINITY_INFO), but I'm not sure if TDX and/or SEV-SNP
have a similar mechanism.

Otherwise, a malicious hypervisor can pause a vCPU just before it leaves the
kernel (e.g. immediately after the arch_cpuhp_cleanup_dead_cpu() call), wait
for a kexec (or resuse of stack memroy), and unpause the vCPU to cause things
to blow up.

Thanks,
Mark.

> ---
>  arch/arm64/Kconfig           |    1 +
>  arch/arm64/include/asm/smp.h |    2 +-
>  arch/arm64/kernel/smp.c      |   14 +++++---------
>  3 files changed, 7 insertions(+), 10 deletions(-)
> 
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -216,6 +216,7 @@ config ARM64
>  	select HAVE_KPROBES
>  	select HAVE_KRETPROBES
>  	select HAVE_GENERIC_VDSO
> +	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
>  	select IRQ_DOMAIN
>  	select IRQ_FORCED_THREADING
>  	select KASAN_VMALLOC if KASAN
> --- a/arch/arm64/include/asm/smp.h
> +++ b/arch/arm64/include/asm/smp.h
> @@ -99,7 +99,7 @@ static inline void arch_send_wakeup_ipi_
>  
>  extern int __cpu_disable(void);
>  
> -extern void __cpu_die(unsigned int cpu);
> +static inline void __cpu_die(unsigned int cpu) { }
>  extern void cpu_die(void);
>  extern void cpu_die_early(void);
>  
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -333,17 +333,13 @@ static int op_cpu_kill(unsigned int cpu)
>  }
>  
>  /*
> - * called on the thread which is asking for a CPU to be shutdown -
> - * waits until shutdown has completed, or it is timed out.
> + * Called on the thread which is asking for a CPU to be shutdown after the
> + * shutdown completed.
>   */
> -void __cpu_die(unsigned int cpu)
> +void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
>  {
>  	int err;
>  
> -	if (!cpu_wait_death(cpu, 5)) {
> -		pr_crit("CPU%u: cpu didn't die\n", cpu);
> -		return;
> -	}
>  	pr_debug("CPU%u: shutdown\n", cpu);
>  
>  	/*
> @@ -370,8 +366,8 @@ void cpu_die(void)
>  
>  	local_daif_mask();
>  
> -	/* Tell __cpu_die() that this CPU is now safe to dispose of */
> -	(void)cpu_report_death();
> +	/* Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose of */
> +	cpuhp_ap_report_dead();
>  
>  	/*
>  	 * Actually shutdown the CPU. This must never fail. The specific hotplug
>
Thomas Gleixner April 25, 2023, 7:51 p.m. UTC | #2
On Mon, Apr 17 2023 at 16:50, Mark Rutland wrote:
> On Sat, Apr 15, 2023 at 01:44:49AM +0200, Thomas Gleixner wrote:
> I gave this a spin on arm64 (in a 64-vCPU VM on an M1 host), and it seems to
> work fine with a bunch of vCPUs being hotplugged off and on again randomly.
>
> FWIW:
>
> Tested-by: Mark Rutland <mark.rutland@arm.com>
>
> I also hacked the code to have the dying CPU spin forever before the call to
> cpuhp_ap_report_dead(). In that case I see a warning, and that we don't call
> arch_cpuhp_cleanup_dead_cpu(), and that the CPU is marked as offline (per
> /sys/devices/system/cpu/$N/online).

Nice!

> As a tangent/aside, we might need to improve that for confidential compute
> architectures, and we might want to generically track cpus which might still be
> using kernel text/data. On arm64 we ensure that via our cpu_kill() callback
> (which'll use PSCI CPU_AFFINITY_INFO), but I'm not sure if TDX and/or SEV-SNP
> have a similar mechanism.
>
> Otherwise, a malicious hypervisor can pause a vCPU just before it leaves the
> kernel (e.g. immediately after the arch_cpuhp_cleanup_dead_cpu() call), wait
> for a kexec (or resuse of stack memroy), and unpause the vCPU to cause things
> to blow up.

There are a gazillion ways for a malicious hypervisor to blow up a
'squint enough to be confident' guest.

The real question is whether it can utilize such a blow up to extract
confidential information from the guest.

If not then it's just yet another way of DoS which is an "acceptable"
attack as it only affects availability but not confidentiality.

Thanks,

        tglx
Mark Rutland April 26, 2023, 7:59 a.m. UTC | #3
On Tue, Apr 25, 2023 at 09:51:12PM +0200, Thomas Gleixner wrote:
> On Mon, Apr 17 2023 at 16:50, Mark Rutland wrote:
> > As a tangent/aside, we might need to improve that for confidential compute
> > architectures, and we might want to generically track cpus which might still be
> > using kernel text/data. On arm64 we ensure that via our cpu_kill() callback
> > (which'll use PSCI CPU_AFFINITY_INFO), but I'm not sure if TDX and/or SEV-SNP
> > have a similar mechanism.
> >
> > Otherwise, a malicious hypervisor can pause a vCPU just before it leaves the
> > kernel (e.g. immediately after the arch_cpuhp_cleanup_dead_cpu() call), wait
> > for a kexec (or resuse of stack memroy), and unpause the vCPU to cause things
> > to blow up.
> 
> There are a gazillion ways for a malicious hypervisor to blow up a
> 'squint enough to be confident' guest.
> 
> The real question is whether it can utilize such a blow up to extract
> confidential information from the guest.
>
> If not then it's just yet another way of DoS which is an "acceptable"
> attack as it only affects availability but not confidentiality.

Sure.

My thinking is that this is an attack against the *integrity* of the guest
(since the vCPU that gets unpasued may write to memory), and so it's
potentially more than just a DoS.

I only mention this because I'd like to account for that on arm64, and if other
architectures also wanted to handle that it might make sense to have some
common infrastructure to track whether CPUs are potentially still within the
kernel.

Thanks,
Mark.
Thomas Gleixner April 26, 2023, 8:15 a.m. UTC | #4
On Wed, Apr 26 2023 at 08:59, Mark Rutland wrote:
> On Tue, Apr 25, 2023 at 09:51:12PM +0200, Thomas Gleixner wrote:
>> If not then it's just yet another way of DoS which is an "acceptable"
>> attack as it only affects availability but not confidentiality.
>
> Sure.
>
> My thinking is that this is an attack against the *integrity* of the guest
> (since the vCPU that gets unpasued may write to memory), and so it's
> potentially more than just a DoS.
>
> I only mention this because I'd like to account for that on arm64, and if other
> architectures also wanted to handle that it might make sense to have some
> common infrastructure to track whether CPUs are potentially still within the
> kernel.

Fair enough.
diff mbox series

Patch

--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -216,6 +216,7 @@  config ARM64
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
+	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC if KASAN
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -99,7 +99,7 @@  static inline void arch_send_wakeup_ipi_
 
 extern int __cpu_disable(void);
 
-extern void __cpu_die(unsigned int cpu);
+static inline void __cpu_die(unsigned int cpu) { }
 extern void cpu_die(void);
 extern void cpu_die_early(void);
 
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -333,17 +333,13 @@  static int op_cpu_kill(unsigned int cpu)
 }
 
 /*
- * called on the thread which is asking for a CPU to be shutdown -
- * waits until shutdown has completed, or it is timed out.
+ * Called on the thread which is asking for a CPU to be shutdown after the
+ * shutdown completed.
  */
-void __cpu_die(unsigned int cpu)
+void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
 {
 	int err;
 
-	if (!cpu_wait_death(cpu, 5)) {
-		pr_crit("CPU%u: cpu didn't die\n", cpu);
-		return;
-	}
 	pr_debug("CPU%u: shutdown\n", cpu);
 
 	/*
@@ -370,8 +366,8 @@  void cpu_die(void)
 
 	local_daif_mask();
 
-	/* Tell __cpu_die() that this CPU is now safe to dispose of */
-	(void)cpu_report_death();
+	/* Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose of */
+	cpuhp_ap_report_dead();
 
 	/*
 	 * Actually shutdown the CPU. This must never fail. The specific hotplug