| Message ID | 1245065405-5714-1-git-send-email-agraf@suse.de (mailing list archive) |
|---|---|
| State | New, archived |
On Mon, Jun 15, 2009 at 01:30:05PM +0200, Alexander Graf wrote:
> X86 CPUs need to have some magic happening to enable the virtualization
> extensions on them. This magic can result in unpleasant results for
> users, like blocking other VMMs from working (vmx) or using invalid TLB
> entries (svm).
>
> Currently KVM activates virtualization when the respective kernel module
> is loaded. This blocks us from autoloading KVM modules without breaking
> other VMMs.

That will only become interesting if we ever have such a thing in
mainline. So NACK, lots of complication for no good reason.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 15.06.2009, at 14:17, Christoph Hellwig wrote:
> On Mon, Jun 15, 2009 at 01:30:05PM +0200, Alexander Graf wrote:
>> X86 CPUs need to have some magic happening to enable the virtualization
>> extensions on them. This magic can result in unpleasant results for
>> users, like blocking other VMMs from working (vmx) or using invalid TLB
>> entries (svm).
>>
>> Currently KVM activates virtualization when the respective kernel module
>> is loaded. This blocks us from autoloading KVM modules without breaking
>> other VMMs.
>
> That will only become interesting if we ever have such a thing in
> mainline. So NACK, lots of complication for no good reason.

I don't want to fight political battles here. Seriously - we're out of
kindergarten. There are users out there who want to have VBox/VMware and
kvm installed in parallel and can't have both kernel modules loaded at
the same time.

We're only hurting _our_ users, not the others, if we keep people from
having kvm*.ko loaded.

Sigh.

Alex
On Mon, Jun 15, 2009 at 02:25:01PM +0200, Alexander Graf wrote:
> I don't want to fight political battles here.
So stop that crap.
On 06/15/2009 02:30 PM, Alexander Graf wrote:
> X86 CPUs need to have some magic happening to enable the virtualization
> extensions on them. This magic can result in unpleasant results for
> users, like blocking other VMMs from working (vmx) or using invalid TLB
> entries (svm).
>
> Currently KVM activates virtualization when the respective kernel module
> is loaded. This blocks us from autoloading KVM modules without breaking
> other VMMs.
>
> To circumvent this problem at least a bit, this patch introduces on
> demand activation of virtualization. This means that, instead,
> virtualization is enabled on creation of the first virtual machine
> and disabled on destruction of the last one.
>
> So using this, KVM can be easily autoloaded, while keeping other
> hypervisors usable.
>
> +static int hardware_enable_all(void)
> +{
> +	int r = 0;
> +
> +	spin_lock(&kvm_lock);
> +
> +	kvm_usage_count++;
> +	if (kvm_usage_count == 1) {
> +		atomic_set(&hardware_enable_failed, 1);
> +		on_each_cpu(hardware_enable, NULL, 1);
> +
> +		if (!atomic_dec_and_test(&hardware_enable_failed))
> +			r = -EBUSY;
> +	}

That's a little obfuscated. I suggest atomic_set(..., 0) and
atomic_read(...).

>  static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
>  			   void *v)
>  {
>  	int cpu = (long)v;
>
> +	if (!kvm_usage_count)
> +		return NOTIFY_OK;
> +
>  	val &= ~CPU_TASKS_FROZEN;
>  	switch (val) {
>  	case CPU_DYING:
> @@ -2513,13 +2571,15 @@ static void kvm_exit_debug(void)
>
>  static int kvm_suspend(struct sys_device *dev, pm_message_t state)
>  {
> -	hardware_disable(NULL);
> +	if (kvm_usage_count)
> +		hardware_disable(NULL);
>  	return 0;
>  }
>
>  static int kvm_resume(struct sys_device *dev)
>  {
> -	hardware_enable(NULL);
> +	if (kvm_usage_count)
> +		hardware_enable(NULL);
>  	return 0;
>  }

Please tell me you tested suspend/resume with/without VMs and cpu
hotunplug/hotplug.
On 06/15/2009 03:17 PM, Christoph Hellwig wrote:
> On Mon, Jun 15, 2009 at 01:30:05PM +0200, Alexander Graf wrote:
>> X86 CPUs need to have some magic happening to enable the virtualization
>> extensions on them. This magic can result in unpleasant results for
>> users, like blocking other VMMs from working (vmx) or using invalid TLB
>> entries (svm).
>>
>> Currently KVM activates virtualization when the respective kernel module
>> is loaded. This blocks us from autoloading KVM modules without breaking
>> other VMMs.
>
> That will only become interesting if we ever have such a thing in
> mainline. So NACK, lots of complication for no good reason.

If it were truly lots of complication, I might agree. But it isn't, and
we keep getting reports from users about it.
Avi Kivity wrote:
> On 06/15/2009 02:30 PM, Alexander Graf wrote:
>> X86 CPUs need to have some magic happening to enable the virtualization
>> extensions on them. This magic can result in unpleasant results for
>> users, like blocking other VMMs from working (vmx) or using invalid TLB
>> entries (svm).
>>
>> Currently KVM activates virtualization when the respective kernel module
>> is loaded. This blocks us from autoloading KVM modules without breaking
>> other VMMs.
>>
>> To circumvent this problem at least a bit, this patch introduces on
>> demand activation of virtualization. This means that, instead,
>> virtualization is enabled on creation of the first virtual machine
>> and disabled on destruction of the last one.
>>
>> So using this, KVM can be easily autoloaded, while keeping other
>> hypervisors usable.
>>
>> +static int hardware_enable_all(void)
>> +{
>> +	int r = 0;
>> +
>> +	spin_lock(&kvm_lock);
>> +
>> +	kvm_usage_count++;
>> +	if (kvm_usage_count == 1) {
>> +		atomic_set(&hardware_enable_failed, 1);
>> +		on_each_cpu(hardware_enable, NULL, 1);
>> +
>> +		if (!atomic_dec_and_test(&hardware_enable_failed))
>> +			r = -EBUSY;
>> +	}
>
> That's a little obfuscated. I suggest atomic_set(..., 0) and
> atomic_read(...).

Ah, I was more searching for an atomic_test :-).

>>  static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
>>  			   void *v)
>>  {
>>  	int cpu = (long)v;
>>
>> +	if (!kvm_usage_count)
>> +		return NOTIFY_OK;
>> +
>>  	val &= ~CPU_TASKS_FROZEN;
>>  	switch (val) {
>>  	case CPU_DYING:
>> @@ -2513,13 +2571,15 @@ static void kvm_exit_debug(void)
>>
>>  static int kvm_suspend(struct sys_device *dev, pm_message_t state)
>>  {
>> -	hardware_disable(NULL);
>> +	if (kvm_usage_count)
>> +		hardware_disable(NULL);
>>  	return 0;
>>  }
>>
>>  static int kvm_resume(struct sys_device *dev)
>>  {
>> -	hardware_enable(NULL);
>> +	if (kvm_usage_count)
>> +		hardware_enable(NULL);
>>  	return 0;
>>  }
>
> Please tell me you tested suspend/resume with/without VMs and cpu
> hotunplug/hotplug.

I tested cpu hotplugging. On the last round I tested suspend/resume, but
this time I couldn't because my machine can't do suspend :-(.
So I'll try hard and find a machine I can test it on for the next round.

Alex
On 06/16/2009 05:08 PM, Alexander Graf wrote:
>> Please tell me you tested suspend/resume with/without VMs and cpu
>> hotunplug/hotplug.
>
> I tested cpu hotplugging. On the last round I tested suspend/resume, but
> this time I couldn't because my machine can't do suspend :-(.
> So I'll try hard and find a machine I can test it on for the next round.

I can test suspend/resume for you if you don't have a friendly machine.
I have a personal interest in keeping it working :)
On 16.06.2009, at 17:13, Avi Kivity wrote:
> On 06/16/2009 05:08 PM, Alexander Graf wrote:
>>> Please tell me you tested suspend/resume with/without VMs and cpu
>>> hotunplug/hotplug.
>>
>> I tested cpu hotplugging. On the last round I tested suspend/resume, but
>> this time I couldn't because my machine can't do suspend :-(.
>> So I'll try hard and find a machine I can test it on for the next round.
>
> I can test suspend/resume for you if you don't have a friendly
> machine. I have a personal interest in keeping it working :)

Thinking about it again - there's only the atomic dec_and_test vs. read
thing and the suspend test missing.

Is the atomic operation as is really that confusing? If not, we can keep
the patch as is and you simply try s2ram on your notebook :-). I'm
pretty sure it works - it used to.

Alex
On 06/18/2009 12:56 AM, Alexander Graf wrote:
>> I can test suspend/resume for you if you don't have a friendly
>> machine. I have a personal interest in keeping it working :)
>
> Thinking about it again - there's only the atomic dec_and_test vs.
> read thing and the suspend test missing.
>
> Is the atomic operation as is really that confusing?

Yes. It says, "something tricky is going on, see if you can find it".

> If not, we can keep the patch as is and you simply try s2ram on your
> notebook :-). I'm pretty sure it works - it used to.

It looks like it will work, but these things are tricky. I'll test an
updated patch. Please also test reboot on Intel with a VM spinning and
with no VMs loaded - Intel reboots are tricky too.
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 906d597..3141a92 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -124,7 +124,7 @@ long ia64_pal_vp_create(u64 *vpd, u64 *host_iva, u64 *opt_handler)
 
 static DEFINE_SPINLOCK(vp_lock);
 
-void kvm_arch_hardware_enable(void *garbage)
+int kvm_arch_hardware_enable(void *garbage)
 {
 	long status;
 	long tmp_base;
@@ -137,7 +137,7 @@ void kvm_arch_hardware_enable(void *garbage)
 	slot = ia64_itr_entry(0x3, KVM_VMM_BASE, pte, KVM_VMM_SHIFT);
 	local_irq_restore(saved_psr);
 	if (slot < 0)
-		return;
+		return -EINVAL;
 
 	spin_lock(&vp_lock);
 	status = ia64_pal_vp_init_env(kvm_vsa_base ?
@@ -145,7 +145,7 @@ void kvm_arch_hardware_enable(void *garbage)
 			__pa(kvm_vm_buffer), KVM_VM_BUFFER_BASE, &tmp_base);
 	if (status != 0) {
 		printk(KERN_WARNING"kvm: Failed to Enable VT Support!!!!\n");
-		return ;
+		return -EINVAL;
 	}
 
 	if (!kvm_vsa_base) {
@@ -154,6 +154,8 @@ void kvm_arch_hardware_enable(void *garbage)
 	}
 	spin_unlock(&vp_lock);
 	ia64_ptr_entry(0x3, slot);
+
+	return 0;
 }
 
 void kvm_arch_hardware_disable(void *garbage)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 9057335..6558ab7 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -80,7 +80,8 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
 	return r;
 }
 
-void kvm_arch_hardware_enable(void *garbage)
+int kvm_arch_hardware_enable(void *garbage)
 {
+	return 0;
 }
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index cbfe91e..a14e676 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -70,7 +70,8 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 
 /* Section: not file related */
-void kvm_arch_hardware_enable(void *garbage)
+int kvm_arch_hardware_enable(void *garbage)
 {
 	/* every s390 is virtualization enabled ;-) */
+	return 0;
 }
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4627627..72d5075 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -463,7 +463,7 @@ struct descriptor_table {
 struct kvm_x86_ops {
 	int (*cpu_has_kvm_support)(void);	/* __init */
 	int (*disabled_by_bios)(void);		/* __init */
-	void (*hardware_enable)(void *dummy);	/* __init */
+	int (*hardware_enable)(void *dummy);
 	void (*hardware_disable)(void *dummy);
 	void (*check_processor_compatibility)(void *rtn);
 	int (*hardware_setup)(void);		/* __init */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 04ee964..47a8b94 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -245,7 +245,7 @@ static void svm_hardware_disable(void *garbage)
 	cpu_svm_disable();
 }
 
-static void svm_hardware_enable(void *garbage)
+static int svm_hardware_enable(void *garbage)
 {
 
 	struct svm_cpu_data *svm_data;
@@ -254,16 +254,20 @@ static void svm_hardware_enable(void *garbage)
 	struct desc_struct *gdt;
 	int me = raw_smp_processor_id();
 
+	rdmsrl(MSR_EFER, efer);
+	if (efer & EFER_SVME)
+		return -EBUSY;
+
 	if (!has_svm()) {
 		printk(KERN_ERR "svm_cpu_init: err EOPNOTSUPP on %d\n", me);
-		return;
+		return -EINVAL;
 	}
 	svm_data = per_cpu(svm_data, me);
 
 	if (!svm_data) {
 		printk(KERN_ERR "svm_cpu_init: svm_data is NULL on %d\n", me);
-		return;
+		return -EINVAL;
 	}
 
 	svm_data->asid_generation = 1;
@@ -274,11 +278,12 @@ static void svm_hardware_enable(void *garbage)
 	gdt = (struct desc_struct *)gdt_descr.address;
 	svm_data->tss_desc = (struct kvm_ldttss_desc *)(gdt + GDT_ENTRY_TSS);
 
-	rdmsrl(MSR_EFER, efer);
 	wrmsrl(MSR_EFER, efer | EFER_SVME);
 
 	wrmsrl(MSR_VM_HSAVE_PA, page_to_pfn(svm_data->save_area) << PAGE_SHIFT);
+
+	return 0;
 }
 
 static void svm_cpu_uninit(int cpu)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d3919ac..3df3b0a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1068,12 +1068,15 @@ static __init int vmx_disabled_by_bios(void)
 	/* locked but not enabled */
 }
 
-static void hardware_enable(void *garbage)
+static int hardware_enable(void *garbage)
 {
 	int cpu = raw_smp_processor_id();
 	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
 	u64 old;
 
+	if (read_cr4() & X86_CR4_VMXE)
+		return -EBUSY;
+
 	INIT_LIST_HEAD(&per_cpu(vcpus_on_cpu, cpu));
 	rdmsrl(MSR_IA32_FEATURE_CONTROL, old);
 	if ((old & (FEATURE_CONTROL_LOCKED |
@@ -1088,6 +1091,8 @@ static void hardware_enable(void *garbage)
 	asm volatile (ASM_VMX_VMXON_RAX
 		      : : "a"(&phys_addr), "m"(phys_addr)
 		      : "memory", "cc");
+
+	return 0;
 }
 
 static void vmclear_local_vcpus(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 03431b2..bfef950 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4222,9 +4222,9 @@ int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu)
 	return kvm_x86_ops->vcpu_reset(vcpu);
 }
 
-void kvm_arch_hardware_enable(void *garbage)
+int kvm_arch_hardware_enable(void *garbage)
 {
-	kvm_x86_ops->hardware_enable(garbage);
+	return kvm_x86_ops->hardware_enable(garbage);
 }
 
 void kvm_arch_hardware_disable(void *garbage)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 11eb702..7678995 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -292,7 +292,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu);
 
-void kvm_arch_hardware_enable(void *garbage);
+int kvm_arch_hardware_enable(void *garbage);
 void kvm_arch_hardware_disable(void *garbage);
 int kvm_arch_hardware_setup(void);
 void kvm_arch_hardware_unsetup(void);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 92ef725..ead53e4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -65,6 +65,8 @@ DEFINE_SPINLOCK(kvm_lock);
 LIST_HEAD(vm_list);
 
 static cpumask_var_t cpus_hardware_enabled;
+static int kvm_usage_count = 0;
+static atomic_t hardware_enable_failed;
 
 struct kmem_cache *kvm_vcpu_cache;
 EXPORT_SYMBOL_GPL(kvm_vcpu_cache);
@@ -75,6 +77,8 @@ struct dentry *kvm_debugfs_dir;
 
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
+static int hardware_enable_all(void);
+static void hardware_disable_all(void);
 
 static bool kvm_rebooting;
 
@@ -931,6 +935,7 @@ static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
 
 static struct kvm *kvm_create_vm(void)
 {
+	int r = 0;
 	struct kvm *kvm = kvm_arch_create_vm();
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 	struct page *page;
@@ -938,6 +943,11 @@ static struct kvm *kvm_create_vm(void)
 
 	if (IS_ERR(kvm))
 		goto out;
+
+	r = hardware_enable_all();
+	if (r)
+		goto out_err;
+
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
 	INIT_LIST_HEAD(&kvm->irq_routing);
 	INIT_HLIST_HEAD(&kvm->mask_notifier_list);
@@ -946,8 +956,8 @@ static struct kvm *kvm_create_vm(void)
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 	if (!page) {
-		kfree(kvm);
-		return ERR_PTR(-ENOMEM);
+		r = -ENOMEM;
+		goto out_err;
 	}
 	kvm->coalesced_mmio_ring =
 			(struct kvm_coalesced_mmio_ring *)page_address(page);
@@ -955,15 +965,13 @@ static struct kvm *kvm_create_vm(void)
 
 #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
 	{
-		int err;
 		kvm->mmu_notifier.ops = &kvm_mmu_notifier_ops;
-		err = mmu_notifier_register(&kvm->mmu_notifier, current->mm);
-		if (err) {
+		r = mmu_notifier_register(&kvm->mmu_notifier, current->mm);
+		if (r) {
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 			put_page(page);
 #endif
-			kfree(kvm);
-			return ERR_PTR(err);
+			goto out_err;
 		}
 	}
 #endif
@@ -984,6 +992,11 @@ static struct kvm *kvm_create_vm(void)
 #endif
 out:
 	return kvm;
+
+out_err:
+	hardware_disable_all();
+	kfree(kvm);
+	return ERR_PTR(r);
 }
 
 /*
@@ -1036,6 +1049,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	kvm_arch_flush_shadow(kvm);
 #endif
 	kvm_arch_destroy_vm(kvm);
+	hardware_disable_all();
 	mmdrop(mm);
 }
 
@@ -2332,11 +2346,41 @@ static struct miscdevice kvm_dev = {
 static void hardware_enable(void *junk)
 {
 	int cpu = raw_smp_processor_id();
+	int r;
 
 	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
 		return;
+
 	cpumask_set_cpu(cpu, cpus_hardware_enabled);
-	kvm_arch_hardware_enable(NULL);
+
+	r = kvm_arch_hardware_enable(NULL);
+
+	if (r) {
+		cpumask_clear_cpu(cpu, cpus_hardware_enabled);
+		atomic_inc(&hardware_enable_failed);
+		printk(KERN_INFO "kvm: enabling virtualization on "
+				 "CPU%d failed\n", cpu);
+	}
+}
+
+static int hardware_enable_all(void)
+{
+	int r = 0;
+
+	spin_lock(&kvm_lock);
+
+	kvm_usage_count++;
+	if (kvm_usage_count == 1) {
+		atomic_set(&hardware_enable_failed, 1);
+		on_each_cpu(hardware_enable, NULL, 1);
+
+		if (!atomic_dec_and_test(&hardware_enable_failed))
+			r = -EBUSY;
+	}
+
+	spin_unlock(&kvm_lock);
+
+	return r;
 }
 
 static void hardware_disable(void *junk)
@@ -2349,11 +2393,25 @@ static void hardware_disable(void *junk)
 	kvm_arch_hardware_disable(NULL);
 }
 
+static void hardware_disable_all(void)
+{
+	BUG_ON(!kvm_usage_count);
+
+	spin_lock(&kvm_lock);
+	kvm_usage_count--;
+	if (!kvm_usage_count)
+		on_each_cpu(hardware_disable, NULL, 1);
+	spin_unlock(&kvm_lock);
+}
+
 static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 			   void *v)
 {
 	int cpu = (long)v;
 
+	if (!kvm_usage_count)
+		return NOTIFY_OK;
+
 	val &= ~CPU_TASKS_FROZEN;
 	switch (val) {
 	case CPU_DYING:
@@ -2513,13 +2571,15 @@ static void kvm_exit_debug(void)
 
 static int kvm_suspend(struct sys_device *dev, pm_message_t state)
 {
-	hardware_disable(NULL);
+	if (kvm_usage_count)
+		hardware_disable(NULL);
 	return 0;
 }
 
 static int kvm_resume(struct sys_device *dev)
 {
-	hardware_enable(NULL);
+	if (kvm_usage_count)
+		hardware_enable(NULL);
 	return 0;
 }
 
@@ -2596,7 +2656,6 @@ int kvm_init(void *opaque, unsigned int vcpu_size,
 		goto out_free_1;
 	}
 
-	on_each_cpu(hardware_enable, NULL, 1);
 	r = register_cpu_notifier(&kvm_cpu_notifier);
 	if (r)
 		goto out_free_2;
@@ -2644,7 +2703,6 @@ out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	unregister_cpu_notifier(&kvm_cpu_notifier);
 out_free_2:
-	on_each_cpu(hardware_disable, NULL, 1);
out_free_1:
 	kvm_arch_hardware_unsetup();
 out_free_0a:
X86 CPUs need to have some magic happening to enable the virtualization
extensions on them. This magic can result in unpleasant results for
users, like blocking other VMMs from working (vmx) or using invalid TLB
entries (svm).

Currently KVM activates virtualization when the respective kernel module
is loaded. This blocks us from autoloading KVM modules without breaking
other VMMs.

To circumvent this problem at least a bit, this patch introduces on
demand activation of virtualization. This means that, instead,
virtualization is enabled on creation of the first virtual machine
and disabled on destruction of the last one.

So using this, KVM can be easily autoloaded, while keeping other
hypervisors usable.

---
v2 uses kvm_lock and traces failures atomically

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/ia64/kvm/kvm-ia64.c        |    8 ++-
 arch/powerpc/kvm/powerpc.c      |    2 +-
 arch/s390/kvm/kvm-s390.c        |    2 +-
 arch/x86/include/asm/kvm_host.h |    2 +-
 arch/x86/kvm/svm.c              |   13 ++++--
 arch/x86/kvm/vmx.c              |    7 +++-
 arch/x86/kvm/x86.c              |    4 +-
 include/linux/kvm_host.h        |    2 +-
 virt/kvm/kvm_main.c             |   82 +++++++++++++++++++++++++++++++++------
 9 files changed, 96 insertions(+), 26 deletions(-)