Message ID | 20220209074109.453116-5-chao.gao@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Improve KVM's interaction with CPU hotplug |
+Marc

On Wed, Feb 09, 2022, Chao Gao wrote:
> The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
> hotplug callback to ONLINE section so that it can abort onlining a CPU in
> certain cases to avoid potentially breaking VMs running on existing CPUs.
> For example, when kvm fails to enable hardware virtualization on the
> hotplugged CPU.
>
> Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
> when offlining a CPU, all user tasks and non-pinned kernel tasks have left
> the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
> CPU offline callback to disable hardware virtualization at that point.
> Likewise, KVM's online callback can enable hardware virtualization before
> any vCPU task gets a chance to run on hotplugged CPUs.
>
> KVM's CPU hotplug callbacks are renamed as well.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  include/linux/cpuhotplug.h |  2 +-
>  virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
>  2 files changed, 23 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 773c83730906..14d354c8ce35 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -182,7 +182,6 @@ enum cpuhp_state {
>  	CPUHP_AP_CSKY_TIMER_STARTING,
>  	CPUHP_AP_TI_GP_TIMER_STARTING,
>  	CPUHP_AP_HYPERV_TIMER_STARTING,
> -	CPUHP_AP_KVM_STARTING,
>  	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
>  	CPUHP_AP_KVM_ARM_VGIC_STARTING,
>  	CPUHP_AP_KVM_ARM_TIMER_STARTING,

This probably needs an ack from Marc. IIUC, it changes the ordering
between generic KVM enabling hardware and KVM ARM doing its vGIC and
timer stuff.

> @@ -200,6 +199,7 @@ enum cpuhp_state {
>
>  	/* Online section invoked on the hotplugged CPU from the hotplug thread */
>  	CPUHP_AP_ONLINE_IDLE,
> +	CPUHP_AP_KVM_ONLINE,
>  	CPUHP_AP_SCHED_WAIT_EMPTY,
>  	CPUHP_AP_SMPBOOT_THREADS,
>  	CPUHP_AP_X86_VDSO_VMA_ONLINE,
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 23481fd746aa..f60724736cb1 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -4853,13 +4853,27 @@ static void hardware_enable_nolock(void *caller_name)
>  	}
>  }
>
> -static int kvm_starting_cpu(unsigned int cpu)
> +static int kvm_online_cpu(unsigned int cpu)
>  {
> +	int ret = 0;
> +
>  	raw_spin_lock(&kvm_count_lock);
> -	if (kvm_usage_count)
> +	/*
> +	 * Abort the CPU online process if hardware virtualization cannot
> +	 * be enabled. Otherwise running VMs would encounter unrecoverable
> +	 * errors when scheduled to this CPU.
> +	 */
> +	if (kvm_usage_count) {
> +		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
> +
>  		hardware_enable_nolock((void *)__func__);
> +		if (atomic_read(&hardware_enable_failed)) {
> +			atomic_set(&hardware_enable_failed, 0);
> +			ret = -EIO;
> +		}
> +	}
>  	raw_spin_unlock(&kvm_count_lock);
> -	return 0;
> +	return ret;
>  }
>
>  static void hardware_disable_nolock(void *junk)
> @@ -4872,7 +4886,7 @@ static void hardware_disable_nolock(void *junk)
>  	kvm_arch_hardware_disable();
>  }
>
> -static int kvm_dying_cpu(unsigned int cpu)
> +static int kvm_offline_cpu(unsigned int cpu)
>  {
>  	raw_spin_lock(&kvm_count_lock);
>  	if (kvm_usage_count)
> @@ -5641,8 +5655,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  		goto out_free_2;
>  	}
>
> -	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
> -				      kvm_starting_cpu, kvm_dying_cpu);
> +	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
> +				      kvm_online_cpu, kvm_offline_cpu);
>  	if (r)
>  		goto out_free_2;
>  	register_reboot_notifier(&kvm_reboot_notifier);
> @@ -5705,7 +5719,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  	kmem_cache_destroy(kvm_vcpu_cache);
>  out_free_3:
>  	unregister_reboot_notifier(&kvm_reboot_notifier);
> -	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
> +	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
>  out_free_2:
>  	kvm_arch_hardware_unsetup();
>  out_free_1:
> @@ -5731,7 +5745,7 @@ void kvm_exit(void)
>  	kvm_async_pf_deinit();
>  	unregister_syscore_ops(&kvm_syscore_ops);
>  	unregister_reboot_notifier(&kvm_reboot_notifier);
> -	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
> +	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
>  	on_each_cpu(hardware_disable_nolock, NULL, 1);
>  	kvm_arch_hardware_unsetup();
>  	kvm_arch_exit();
> --
> 2.25.1
>
On Wed, 09 Feb 2022 19:59:45 +0000,
Sean Christopherson <seanjc@google.com> wrote:
>
> +Marc

Thanks for the heads up.

>
> On Wed, Feb 09, 2022, Chao Gao wrote:
> > The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
> > hotplug callback to ONLINE section so that it can abort onlining a CPU in
> > certain cases to avoid potentially breaking VMs running on existing CPUs.
> > For example, when kvm fails to enable hardware virtualization on the
> > hotplugged CPU.
> >
> > Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
> > when offlining a CPU, all user tasks and non-pinned kernel tasks have left
> > the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
> > CPU offline callback to disable hardware virtualization at that point.
> > Likewise, KVM's online callback can enable hardware virtualization before
> > any vCPU task gets a chance to run on hotplugged CPUs.
> >
> > KVM's CPU hotplug callbacks are renamed as well.
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Chao Gao <chao.gao@intel.com>
> > ---
> >  include/linux/cpuhotplug.h |  2 +-
> >  virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
> >  2 files changed, 23 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> > index 773c83730906..14d354c8ce35 100644
> > --- a/include/linux/cpuhotplug.h
> > +++ b/include/linux/cpuhotplug.h
> > @@ -182,7 +182,6 @@ enum cpuhp_state {
> >  	CPUHP_AP_CSKY_TIMER_STARTING,
> >  	CPUHP_AP_TI_GP_TIMER_STARTING,
> >  	CPUHP_AP_HYPERV_TIMER_STARTING,
> > -	CPUHP_AP_KVM_STARTING,
> >  	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
> >  	CPUHP_AP_KVM_ARM_VGIC_STARTING,
> >  	CPUHP_AP_KVM_ARM_TIMER_STARTING,
>
> This probably needs an ack from Marc. IIUC, it changes the ordering
> between generic KVM enabling hardware and KVM ARM doing its vGIC and
> timer stuff.

Indeed, that's not great. Especially the part that enables interrupts
before things are up and running on the CPU.

But TBH, this area really deserves a good scrubbing, and I don't see
why we need to keep these individual CPUHP notifiers.

I wrote the patch below, threw it at a test box, and nothing caught
fire as I was fiddling with CPUs going up and down. It is thus
obviously perfect.

Feel free to take it as part of your series.

Thanks,

	M.

From 57d80dbe5a10bc3b5bce748f637dea420ef960a1 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Thu, 10 Feb 2022 13:50:52 +0000
Subject: [PATCH] KVM: arm64: Simplify the CPUHP logic

For a number of historical reasons, the KVM/arm64 hotplug setup is pretty
complicated, and we have two extra CPUHP notifiers for vGIC and timers.

It looks pretty pointless, and gets in the way of further changes.
So let's just expose some helpers that can be called from the core
CPUHP callback, and get rid of everything else.

This gives us the opportunity to drop a useless notifier entry,
as well as tidy up the timer enable/disable, which was a bit odd.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/arch_timer.c     | 27 ++++++++++-----------------
 arch/arm64/kvm/arm.c            |  4 ++++
 arch/arm64/kvm/vgic/vgic-init.c | 19 ++-----------------
 include/kvm/arm_arch_timer.h    |  4 ++++
 include/kvm/arm_vgic.h          |  4 ++++
 include/linux/cpuhotplug.h      |  3 ---
 6 files changed, 24 insertions(+), 37 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 6e542e2eae32..f9d14c6dc0b4 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -796,10 +796,18 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	ptimer->host_timer_irq_flags = host_ptimer_irq_flags;
 }
 
-static void kvm_timer_init_interrupt(void *info)
+void kvm_timer_cpu_up(void)
 {
 	enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
-	enable_percpu_irq(host_ptimer_irq, host_ptimer_irq_flags);
+	if (host_ptimer_irq)
+		enable_percpu_irq(host_ptimer_irq, host_ptimer_irq_flags);
+}
+
+void kvm_timer_cpu_down(void)
+{
+	disable_percpu_irq(host_vtimer_irq);
+	if (host_ptimer_irq)
+		disable_percpu_irq(host_ptimer_irq);
 }
 
 int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
@@ -961,18 +969,6 @@ void kvm_arm_timer_write_sysreg(struct kvm_vcpu *vcpu,
 	preempt_enable();
 }
 
-static int kvm_timer_starting_cpu(unsigned int cpu)
-{
-	kvm_timer_init_interrupt(NULL);
-	return 0;
-}
-
-static int kvm_timer_dying_cpu(unsigned int cpu)
-{
-	disable_percpu_irq(host_vtimer_irq);
-	return 0;
-}
-
 static int timer_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu)
 {
 	if (vcpu)
@@ -1170,9 +1166,6 @@ int kvm_timer_hyp_init(bool has_gic)
 		goto out_free_irq;
 	}
 
-	cpuhp_setup_state(CPUHP_AP_KVM_ARM_TIMER_STARTING,
-			  "kvm/arm/timer:starting", kvm_timer_starting_cpu,
-			  kvm_timer_dying_cpu);
 	return 0;
 out_free_irq:
 	free_percpu_irq(host_vtimer_irq, kvm_get_running_vcpus());
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fefd5774ab55..6c9cb3fdd3af 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1600,6 +1600,8 @@ static void _kvm_arch_hardware_enable(void *discard)
 {
 	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
 		cpu_hyp_reinit();
+		kvm_vgic_cpu_up();
+		kvm_timer_cpu_up();
 		__this_cpu_write(kvm_arm_hardware_enabled, 1);
 	}
 }
@@ -1613,6 +1615,8 @@ int kvm_arch_hardware_enable(void)
 static void _kvm_arch_hardware_disable(void *discard)
 {
 	if (__this_cpu_read(kvm_arm_hardware_enabled)) {
+		kvm_timer_cpu_down();
+		kvm_vgic_cpu_down();
 		cpu_hyp_reset();
 		__this_cpu_write(kvm_arm_hardware_enabled, 0);
 	}
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index fc00304fe7d8..60038a8516de 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -460,17 +460,15 @@ int kvm_vgic_map_resources(struct kvm *kvm)
 
 /* GENERIC PROBE */
 
-static int vgic_init_cpu_starting(unsigned int cpu)
+void kvm_vgic_cpu_up(void)
 {
 	enable_percpu_irq(kvm_vgic_global_state.maint_irq, 0);
-	return 0;
 }
 
 
-static int vgic_init_cpu_dying(unsigned int cpu)
+void kvm_vgic_cpu_down(void)
 {
 	disable_percpu_irq(kvm_vgic_global_state.maint_irq);
-	return 0;
 }
 
 static irqreturn_t vgic_maintenance_handler(int irq, void *data)
@@ -579,19 +577,6 @@ int kvm_vgic_hyp_init(void)
 		return ret;
 	}
 
-	ret = cpuhp_setup_state(CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
-				"kvm/arm/vgic:starting",
-				vgic_init_cpu_starting, vgic_init_cpu_dying);
-	if (ret) {
-		kvm_err("Cannot register vgic CPU notifier\n");
-		goto out_free_irq;
-	}
-
 	kvm_info("vgic interrupt IRQ%d\n", kvm_vgic_global_state.maint_irq);
 	return 0;
-
-out_free_irq:
-	free_percpu_irq(kvm_vgic_global_state.maint_irq,
-			kvm_get_running_vcpus());
-	return ret;
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 51c19381108c..16a2f65fcfb4 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -106,4 +106,8 @@ void kvm_arm_timer_write_sysreg(struct kvm_vcpu *vcpu,
 u32 timer_get_ctl(struct arch_timer_context *ctxt);
 u64 timer_get_cval(struct arch_timer_context *ctxt);
 
+/* CPU HP callbacks */
+void kvm_timer_cpu_up(void);
+void kvm_timer_cpu_down(void);
+
 #endif
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index bb30a6803d9f..a2a0cca05a73 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -427,4 +427,8 @@ int vgic_v4_load(struct kvm_vcpu *vcpu);
 void vgic_v4_commit(struct kvm_vcpu *vcpu);
 int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
 
+/* CPU HP callbacks */
+void kvm_vgic_cpu_up(void);
+void kvm_vgic_cpu_down(void);
+
 #endif /* __KVM_ARM_VGIC_H */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 411a428ace4d..4345b8eafc03 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -183,9 +183,6 @@ enum cpuhp_state {
 	CPUHP_AP_TI_GP_TIMER_STARTING,
 	CPUHP_AP_HYPERV_TIMER_STARTING,
 	CPUHP_AP_KVM_STARTING,
-	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
-	CPUHP_AP_KVM_ARM_VGIC_STARTING,
-	CPUHP_AP_KVM_ARM_TIMER_STARTING,
 	/* Must be the last timer callback */
 	CPUHP_AP_DUMMY_TIMER_STARTING,
 	CPUHP_AP_ARM_XEN_STARTING,
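
For readers skimming the arm64 patch above: what the arm.c hunks change is the ordering. The per-CPU vGIC and timer interrupts are enabled only after the hyp re-init, and torn down in the reverse order before the hyp reset. Below is a user-space sketch of that ordering, not kernel code. Every `demo_*` name is hypothetical and only logs the call order; none of them exist in the kernel.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical call log; the real helpers touch hardware instead. */
char demo_log[64];
static int demo_hw_enabled;	/* stands in for the per-CPU kvm_arm_hardware_enabled flag */

static void demo_note(const char *s)
{
	/* Refuse to overflow the fixed-size log buffer. */
	assert(strlen(demo_log) + strlen(s) < sizeof(demo_log));
	strcat(demo_log, s);
}

/* Stand-ins for cpu_hyp_reinit(), kvm_vgic_cpu_up(), kvm_timer_cpu_up(), ... */
static void demo_hyp_reinit(void)     { demo_note("hyp+ "); }
static void demo_hyp_reset(void)      { demo_note("hyp- "); }
static void demo_vgic_cpu_up(void)    { demo_note("vgic+ "); }
static void demo_vgic_cpu_down(void)  { demo_note("vgic- "); }
static void demo_timer_cpu_up(void)   { demo_note("timer+ "); }
static void demo_timer_cpu_down(void) { demo_note("timer- "); }

/* Mirrors the shape of _kvm_arch_hardware_enable() after the patch. */
void demo_hardware_enable(void)
{
	if (!demo_hw_enabled) {
		demo_hyp_reinit();
		demo_vgic_cpu_up();
		demo_timer_cpu_up();
		demo_hw_enabled = 1;
	}
}

/* Mirrors _kvm_arch_hardware_disable(): strict reverse order. */
void demo_hardware_disable(void)
{
	if (demo_hw_enabled) {
		demo_timer_cpu_down();
		demo_vgic_cpu_down();
		demo_hyp_reset();
		demo_hw_enabled = 0;
	}
}
```

Calling `demo_hardware_enable()` twice and then `demo_hardware_disable()` twice leaves exactly one up sequence and one down sequence in `demo_log`, because the flag makes both paths idempotent.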
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 773c83730906..14d354c8ce35 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -182,7 +182,6 @@ enum cpuhp_state {
 	CPUHP_AP_CSKY_TIMER_STARTING,
 	CPUHP_AP_TI_GP_TIMER_STARTING,
 	CPUHP_AP_HYPERV_TIMER_STARTING,
-	CPUHP_AP_KVM_STARTING,
 	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
 	CPUHP_AP_KVM_ARM_VGIC_STARTING,
 	CPUHP_AP_KVM_ARM_TIMER_STARTING,
@@ -200,6 +199,7 @@ enum cpuhp_state {
 
 	/* Online section invoked on the hotplugged CPU from the hotplug thread */
 	CPUHP_AP_ONLINE_IDLE,
+	CPUHP_AP_KVM_ONLINE,
 	CPUHP_AP_SCHED_WAIT_EMPTY,
 	CPUHP_AP_SMPBOOT_THREADS,
 	CPUHP_AP_X86_VDSO_VMA_ONLINE,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 23481fd746aa..f60724736cb1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4853,13 +4853,27 @@ static void hardware_enable_nolock(void *caller_name)
 	}
 }
 
-static int kvm_starting_cpu(unsigned int cpu)
+static int kvm_online_cpu(unsigned int cpu)
 {
+	int ret = 0;
+
 	raw_spin_lock(&kvm_count_lock);
-	if (kvm_usage_count)
+	/*
+	 * Abort the CPU online process if hardware virtualization cannot
+	 * be enabled. Otherwise running VMs would encounter unrecoverable
+	 * errors when scheduled to this CPU.
+	 */
+	if (kvm_usage_count) {
+		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
+
 		hardware_enable_nolock((void *)__func__);
+		if (atomic_read(&hardware_enable_failed)) {
+			atomic_set(&hardware_enable_failed, 0);
+			ret = -EIO;
+		}
+	}
 	raw_spin_unlock(&kvm_count_lock);
-	return 0;
+	return ret;
 }
 
 static void hardware_disable_nolock(void *junk)
@@ -4872,7 +4886,7 @@ static void hardware_disable_nolock(void *junk)
 	kvm_arch_hardware_disable();
 }
 
-static int kvm_dying_cpu(unsigned int cpu)
+static int kvm_offline_cpu(unsigned int cpu)
 {
 	raw_spin_lock(&kvm_count_lock);
 	if (kvm_usage_count)
@@ -5641,8 +5655,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 		goto out_free_2;
 	}
 
-	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
-				      kvm_starting_cpu, kvm_dying_cpu);
+	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
+				      kvm_online_cpu, kvm_offline_cpu);
 	if (r)
 		goto out_free_2;
 	register_reboot_notifier(&kvm_reboot_notifier);
@@ -5705,7 +5719,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 out_free_2:
 	kvm_arch_hardware_unsetup();
 out_free_1:
@@ -5731,7 +5745,7 @@ void kvm_exit(void)
 	kvm_async_pf_deinit();
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
hotplug callback to ONLINE section so that it can abort onlining a CPU in
certain cases to avoid potentially breaking VMs running on existing CPUs.
For example, when kvm fails to enable hardware virtualization on the
hotplugged CPU.

Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
when offlining a CPU, all user tasks and non-pinned kernel tasks have left
the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
CPU offline callback to disable hardware virtualization at that point.
Likewise, KVM's online callback can enable hardware virtualization before
any vCPU task gets a chance to run on hotplugged CPUs.

KVM's CPU hotplug callbacks are renamed as well.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 include/linux/cpuhotplug.h |  2 +-
 virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
 2 files changed, 23 insertions(+), 9 deletions(-)
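
The commit message relies on one property of the ONLINE section: a startup callback may return an error, and the hotplug core then unwinds the states it has already brought up. Below is a user-space sketch of that behaviour under simplifying assumptions (one CPU, a linear state table, no locking). All `demo_*` names are hypothetical; this is not the kernel's cpuhp implementation.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, heavily simplified stand-ins for a few cpuhp states. */
enum demo_state {
	DEMO_OFFLINE,
	DEMO_ONLINE_IDLE,
	DEMO_KVM_ONLINE,	/* sits before DEMO_SCHED_WAIT_EMPTY, as in the patch */
	DEMO_SCHED_WAIT_EMPTY,
	DEMO_NR_STATES,
};

struct demo_step {
	int (*startup)(unsigned int cpu);
	int (*teardown)(unsigned int cpu);
};

int demo_usage_count = 1;	/* pretend at least one VM exists */
int demo_hw_ok = 1;		/* set to 0 to simulate a failed hardware enable */
int demo_virt_enabled;
int demo_idle_ready;

static int demo_idle_up(unsigned int cpu)   { (void)cpu; demo_idle_ready = 1; return 0; }
static int demo_idle_down(unsigned int cpu) { (void)cpu; demo_idle_ready = 0; return 0; }

/* Shaped like kvm_online_cpu(): fail with -EIO if enabling does not work. */
static int demo_kvm_online(unsigned int cpu)
{
	(void)cpu;
	if (!demo_usage_count)
		return 0;
	if (!demo_hw_ok)
		return -EIO;
	demo_virt_enabled = 1;
	return 0;
}

static int demo_kvm_offline(unsigned int cpu)
{
	(void)cpu;
	if (demo_usage_count)
		demo_virt_enabled = 0;
	return 0;
}

static const struct demo_step demo_steps[DEMO_NR_STATES] = {
	[DEMO_ONLINE_IDLE] = { demo_idle_up, demo_idle_down },
	[DEMO_KVM_ONLINE]  = { demo_kvm_online, demo_kvm_offline },
};

/* Walk the states upwards; on failure, tear down what already came up. */
int demo_cpu_up(unsigned int cpu)
{
	int st, ret = 0;

	for (st = DEMO_OFFLINE + 1; st < DEMO_NR_STATES; st++) {
		if (!demo_steps[st].startup)
			continue;
		ret = demo_steps[st].startup(cpu);
		if (ret)
			break;
	}
	if (!ret)
		return 0;

	assert(st > DEMO_OFFLINE && st < DEMO_NR_STATES);
	/* The failing state itself is skipped; undo starts one state below it. */
	for (st--; st > DEMO_OFFLINE; st--)
		if (demo_steps[st].teardown)
			demo_steps[st].teardown(cpu);
	return ret;
}
```

With `demo_hw_ok` set to 0, `demo_cpu_up()` returns `-EIO` and leaves both `demo_virt_enabled` and `demo_idle_ready` at 0, which is the "abort onlining" outcome the commit message describes. A STARTING-section callback has no such way out, since its return value cannot fail the operation.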