Message ID | 1511714482-3273-2-git-send-email-sironi@amazon.de (mailing list archive) |
---|---|
State | New, archived |
2017-11-27 0:41 GMT+08:00 Filippo Sironi <sironi@amazon.de>:
> ... that the guest should see.
> Guest operating systems may check the microcode version to decide whether
> to disable certain features that are known to be buggy up to certain
> microcode versions. Address the issue by making the microcode version
> that the guest should see settable.
> The rationale for having userspace specifying the microcode version, rather
> than having the kernel picking it, is to ensure consistency for live-migrated
> instances; we don't want them to see a microcode version increase without a
> reset.

Is there a scenario which needs to refresh the microcode in the guest
instead of on the host?

Regards,
Wanpeng Li

>
> Signed-off-by: Filippo Sironi <sironi@amazon.de>
> ---
>  arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
>  include/uapi/linux/kvm.h |  3 +++
>  2 files changed, 26 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 925c3e29cad3..741588f27ebc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	} u;
>
>  	switch (ioctl) {
> +	case KVM_GET_MICROCODE_VERSION: {
> +		r = -EFAULT;
> +		if (copy_to_user(argp,
> +				 &kvm->arch.microcode_version,
> +				 sizeof(kvm->arch.microcode_version)))
> +			goto out;
> +		break;
> +	}
> +	case KVM_SET_MICROCODE_VERSION: {
> +		u32 microcode_version;
> +
> +		r = -EFAULT;
> +		if (copy_from_user(&microcode_version,
> +				   argp,
> +				   sizeof(microcode_version)))
> +			goto out;
> +		r = -EINVAL;
> +		if (!microcode_version)
> +			goto out;
> +		kvm->arch.microcode_version = microcode_version;
> +		r = 0;
> +		break;
> +	}
>  	case KVM_SET_TSS_ADDR:
>  		r = kvm_vm_ioctl_set_tss_addr(kvm, arg);
>  		break;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 282d7613fce8..e11887758e29 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1192,6 +1192,9 @@ struct kvm_s390_ucas_mapping {
>  #define KVM_S390_UCAS_UNMAP _IOW(KVMIO, 0x51, struct kvm_s390_ucas_mapping)
>  #define KVM_S390_VCPU_FAULT _IOW(KVMIO, 0x52, unsigned long)
>
> +#define KVM_GET_MICROCODE_VERSION _IOR(KVMIO, 0x5e, __u32)
> +#define KVM_SET_MICROCODE_VERSION _IOW(KVMIO, 0x5f, __u32)
> +
>  /* Device model IOC */
>  #define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60)
>  #define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level)
> --
> 2.7.4
>
On 26/11/2017 17:41, Filippo Sironi wrote:
> ... that the guest should see.
> Guest operating systems may check the microcode version to decide whether
> to disable certain features that are known to be buggy up to certain
> microcode versions. Address the issue by making the microcode version
> that the guest should see settable.

What's the advantage of specifying the microcode version, rather than
relying on userspace to drop the CPUID bit for the buggy feature?

                   old guest(*)    new guest
hide in CPUID      good            good
use ucode rev      BAD             good

(*) old guest = doesn't know that the feature is buggy until a given
    ucode revision

Thanks,

Paolo

> The rationale for having userspace specifying the microcode version, rather
> than having the kernel picking it, is to ensure consistency for live-migrated
> instances; we don't want them to see a microcode version increase without a
> reset.
On 26/11/2017 17:41, Filippo Sironi wrote:
> ... that the guest should see.
> Guest operating systems may check the microcode version to decide whether
> to disable certain features that are known to be buggy up to certain
> microcode versions. Address the issue by making the microcode version
> that the guest should see settable.
> The rationale for having userspace specifying the microcode version, rather
> than having the kernel picking it, is to ensure consistency for live-migrated
> instances; we don't want them to see a microcode version increase without a
> reset.
>
> Signed-off-by: Filippo Sironi <sironi@amazon.de>
> ---
>  arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
>  include/uapi/linux/kvm.h |  3 +++
>  2 files changed, 26 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 925c3e29cad3..741588f27ebc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  	} u;
>
>  	switch (ioctl) {
> +	case KVM_GET_MICROCODE_VERSION: {
> +		r = -EFAULT;
> +		if (copy_to_user(argp,
> +				 &kvm->arch.microcode_version,
> +				 sizeof(kvm->arch.microcode_version)))
> +			goto out;
> +		break;
> +	}
> +	case KVM_SET_MICROCODE_VERSION: {
> +		u32 microcode_version;
> +
> +		r = -EFAULT;
> +		if (copy_from_user(&microcode_version,
> +				   argp,
> +				   sizeof(microcode_version)))
> +			goto out;
> +		r = -EINVAL;
> +		if (!microcode_version)
> +			goto out;
> +		kvm->arch.microcode_version = microcode_version;
> +		r = 0;
> +		break;
> +	}

Also, there's no need to define new ioctls, instead you can just place
it in the vcpu and use KVM_GET_MSR/KVM_SET_MSR. I'd agree that's
slightly less polished, but it matches what we do already for e.g.
nested VMX model specific registers. And it spares you from writing the
documentation that you didn't include in this patch. :)

Paolo
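To make the MSR route suggested above more concrete, here is a minimal sketch of how a 32-bit microcode revision could be packed into an MSR value. It assumes the Intel layout of IA32_BIOS_SIGN_ID (MSR 0x8b, named MSR_IA32_UCODE_REV in Linux), where the revision sits in bits 63:32. `struct msr_entry` and both helper functions are hypothetical stand-ins and not part of the KVM API; real userspace would fill a `struct kvm_msr_entry` and hand it to the vCPU fd through KVM_SET_MSRS (the thread's KVM_GET_MSR/KVM_SET_MSR is shorthand for KVM_GET_MSRS/KVM_SET_MSRS). Whether a given kernel accepts a host write to this MSR is also an assumption here, since no v2 had been posted at this point in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* MSR index of IA32_BIOS_SIGN_ID; Linux calls it MSR_IA32_UCODE_REV. */
#define MSR_IA32_UCODE_REV 0x0000008bu

/*
 * Hypothetical stand-in for the index/data pair that struct kvm_msr_entry
 * carries. Defined locally so the sketch builds without KVM headers.
 */
struct msr_entry {
	uint32_t index;
	uint64_t data;
};

/* Pack a microcode revision into the MSR value (Intel: bits 63:32). */
struct msr_entry make_ucode_rev_entry(uint32_t rev)
{
	struct msr_entry e;

	/* The posted patch rejects a revision of 0, so mirror that here. */
	assert(rev != 0);

	e.index = MSR_IA32_UCODE_REV;
	e.data = (uint64_t)rev << 32;
	return e;
}

/* Extract the revision again, e.g. after reading the MSR back. */
uint32_t ucode_rev_from_msr(uint64_t data)
{
	return (uint32_t)(data >> 32);
}
```

With this approach the value is per vCPU, so userspace would repeat the write for every vCPU of the VM, which is the cost Filippo mentions later in the thread.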
On Mon, Nov 27, 2017 at 3:58 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 26/11/2017 17:41, Filippo Sironi wrote:
>> ... that the guest should see.
>> Guest operating systems may check the microcode version to decide whether
>> to disable certain features that are known to be buggy up to certain
>> microcode versions. Address the issue by making the microcode version
>> that the guest should see settable.
>> The rationale for having userspace specifying the microcode version, rather
>> than having the kernel picking it, is to ensure consistency for live-migrated
>> instances; we don't want them to see a microcode version increase without a
>> reset.
>>
>> Signed-off-by: Filippo Sironi <sironi@amazon.de>
>> ---
>>  arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
>>  include/uapi/linux/kvm.h |  3 +++
>>  2 files changed, 26 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 925c3e29cad3..741588f27ebc 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>  	} u;
>>
>>  	switch (ioctl) {
>> +	case KVM_GET_MICROCODE_VERSION: {
>> +		r = -EFAULT;
>> +		if (copy_to_user(argp,
>> +				 &kvm->arch.microcode_version,
>> +				 sizeof(kvm->arch.microcode_version)))
>> +			goto out;
>> +		break;
>> +	}
>> +	case KVM_SET_MICROCODE_VERSION: {
>> +		u32 microcode_version;
>> +
>> +		r = -EFAULT;
>> +		if (copy_from_user(&microcode_version,
>> +				   argp,
>> +				   sizeof(microcode_version)))
>> +			goto out;
>> +		r = -EINVAL;
>> +		if (!microcode_version)
>> +			goto out;
>> +		kvm->arch.microcode_version = microcode_version;
>> +		r = 0;
>> +		break;
>> +	}
>
> Also, there's no need to define new ioctls, instead you can just place
> it in the vcpu and use KVM_GET_MSR/KVM_SET_MSR. I'd agree that's
> slightly less polished, but it matches what we do already for e.g.
> nested VMX model specific registers. And it spares you from writing the
> documentation that you didn't include in this patch. :)
>
> Paolo

This feels like a good time to mention Peter Hornyack's old MSR KVM_EXIT
patches. With something like them, there would be no need to push this
into the kernel at all.
> On 26. Nov 2017, at 17:02, Wanpeng Li <kernellwp@gmail.com> wrote:
>
> 2017-11-27 0:41 GMT+08:00 Filippo Sironi <sironi@amazon.de>:
>> ... that the guest should see.
>> Guest operating systems may check the microcode version to decide whether
>> to disable certain features that are known to be buggy up to certain
>> microcode versions. Address the issue by making the microcode version
>> that the guest should see settable.
>> The rationale for having userspace specifying the microcode version, rather
>> than having the kernel picking it, is to ensure consistency for live-migrated
>> instances; we don't want them to see a microcode version increase without a
>> reset.
>
> Is there a scenario which needs to refresh the microcode in the guest
> instead of on the host?
>
> Regards,
> Wanpeng Li

Not that I can think of.

Today, we're picking up the host microcode version when launching an
instance and making sure that the same version is exposed for the
lifetime of the instance (i.e., across migrations that don't result in
a reset).

Filippo

>>
>> Signed-off-by: Filippo Sironi <sironi@amazon.de>
>> ---
>>  arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
>>  include/uapi/linux/kvm.h |  3 +++
>>  2 files changed, 26 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 925c3e29cad3..741588f27ebc 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>  	} u;
>>
>>  	switch (ioctl) {
>> +	case KVM_GET_MICROCODE_VERSION: {
>> +		r = -EFAULT;
>> +		if (copy_to_user(argp,
>> +				 &kvm->arch.microcode_version,
>> +				 sizeof(kvm->arch.microcode_version)))
>> +			goto out;
>> +		break;
>> +	}
>> +	case KVM_SET_MICROCODE_VERSION: {
>> +		u32 microcode_version;
>> +
>> +		r = -EFAULT;
>> +		if (copy_from_user(&microcode_version,
>> +				   argp,
>> +				   sizeof(microcode_version)))
>> +			goto out;
>> +		r = -EINVAL;
>> +		if (!microcode_version)
>> +			goto out;
>> +		kvm->arch.microcode_version = microcode_version;
>> +		r = 0;
>> +		break;
>> +	}
>>  	case KVM_SET_TSS_ADDR:
>>  		r = kvm_vm_ioctl_set_tss_addr(kvm, arg);
>>  		break;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 282d7613fce8..e11887758e29 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1192,6 +1192,9 @@ struct kvm_s390_ucas_mapping {
>>  #define KVM_S390_UCAS_UNMAP _IOW(KVMIO, 0x51, struct kvm_s390_ucas_mapping)
>>  #define KVM_S390_VCPU_FAULT _IOW(KVMIO, 0x52, unsigned long)
>>
>> +#define KVM_GET_MICROCODE_VERSION _IOR(KVMIO, 0x5e, __u32)
>> +#define KVM_SET_MICROCODE_VERSION _IOW(KVMIO, 0x5f, __u32)
>> +
>>  /* Device model IOC */
>>  #define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60)
>>  #define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level)
>> --
>> 2.7.4
>>
>

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Managing directors: Dr. Ralf Herbrich, Christian Schlaeger
VAT ID: DE289237879
Registered at the Charlottenburg District Court (Amtsgericht Charlottenburg), HRB 149173 B
> On 27. Nov 2017, at 02:40, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 26/11/2017 17:41, Filippo Sironi wrote:
>> ... that the guest should see.
>> Guest operating systems may check the microcode version to decide whether
>> to disable certain features that are known to be buggy up to certain
>> microcode versions. Address the issue by making the microcode version
>> that the guest should see settable.
>
> What's the advantage of specifying the microcode version, rather than
> relying on userspace to drop the CPUID bit for the buggy feature?
>
>                    old guest(*)    new guest
> hide in CPUID      good            good
> use ucode rev      BAD             good
>
> (*) old guest = doesn't know that the feature is buggy until a given
>     ucode revision
>
> Thanks,
>
> Paolo

On C5 and M5 instances, we're basically exposing the host CPUID with few
exceptions.

Linux (among others) has checks to make sure that certain features
aren't enabled on a certain family/model/stepping if the microcode
version isn't greater than or equal to a known good version.
apic_check_deadline_errata() in arch/x86/kernel/apic/apic.c is the most
recent example in Linux that I know of (by now you've updated it ;) but
when we got the original bug report that triggered this patch, this
wasn't the case).

By exposing the real microcode version, we're preventing buggy guests
that don't check that they are running virtualized (i.e., they should
trust the hypervisor) from disabling features that are effectively not
buggy.

Filippo

>> The rationale for having userspace specifying the microcode version, rather
>> than having the kernel picking it, is to ensure consistency for live-migrated
>> instances; we don't want them to see a microcode version increase without a
>> reset.
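The guest-side check described in this reply can be sketched roughly as follows. This is only an illustration of the pattern (match family/model/stepping, then compare the microcode revision against a known good minimum), loosely modeled on what the message says about apic_check_deadline_errata(). The table entries, `struct ucode_fix`, and `feature_usable()` are hypothetical, and the revision numbers are made up rather than taken from any real errata list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical errata table entry; the values below are NOT real CPU data. */
struct ucode_fix {
	uint8_t family;
	uint8_t model;
	uint8_t stepping;
	uint32_t min_good_rev;	/* first revision in which the feature works */
};

static const struct ucode_fix fixes[] = {
	{ 6, 0x55, 4, 0x02000014 },	/* illustrative values only */
	{ 6, 0x4f, 1, 0x0b000020 },
};

/*
 * Decide whether a guest would keep a feature enabled.
 * hypervisor: true if the guest sees the CPUID hypervisor bit and
 * therefore trusts the feature bits it was given. Passing false models
 * a guest that never performs that check.
 */
bool feature_usable(uint8_t family, uint8_t model, uint8_t stepping,
		    uint32_t ucode_rev, bool hypervisor)
{
	size_t i;

	if (hypervisor)
		return true;

	for (i = 0; i < sizeof(fixes) / sizeof(fixes[0]); i++) {
		const struct ucode_fix *f = &fixes[i];

		if (f->family == family && f->model == model &&
		    f->stepping == stepping)
			return ucode_rev >= f->min_good_rev;
	}

	/* Not an affected part: nothing to disable. */
	return true;
}
```

A guest that skips the hypervisor check behaves like a call with `hypervisor == false`: if it is shown a stale or zero revision it disables the feature, whereas being shown the host's real revision lets the comparison pass.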
> On 27. Nov 2017, at 03:58, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 26/11/2017 17:41, Filippo Sironi wrote:
>> ... that the guest should see.
>> Guest operating systems may check the microcode version to decide whether
>> to disable certain features that are known to be buggy up to certain
>> microcode versions. Address the issue by making the microcode version
>> that the guest should see settable.
>> The rationale for having userspace specifying the microcode version, rather
>> than having the kernel picking it, is to ensure consistency for live-migrated
>> instances; we don't want them to see a microcode version increase without a
>> reset.
>>
>> Signed-off-by: Filippo Sironi <sironi@amazon.de>
>> ---
>>  arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
>>  include/uapi/linux/kvm.h |  3 +++
>>  2 files changed, 26 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 925c3e29cad3..741588f27ebc 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>  	} u;
>>
>>  	switch (ioctl) {
>> +	case KVM_GET_MICROCODE_VERSION: {
>> +		r = -EFAULT;
>> +		if (copy_to_user(argp,
>> +				 &kvm->arch.microcode_version,
>> +				 sizeof(kvm->arch.microcode_version)))
>> +			goto out;
>> +		break;
>> +	}
>> +	case KVM_SET_MICROCODE_VERSION: {
>> +		u32 microcode_version;
>> +
>> +		r = -EFAULT;
>> +		if (copy_from_user(&microcode_version,
>> +				   argp,
>> +				   sizeof(microcode_version)))
>> +			goto out;
>> +		r = -EINVAL;
>> +		if (!microcode_version)
>> +			goto out;
>> +		kvm->arch.microcode_version = microcode_version;
>> +		r = 0;
>> +		break;
>> +	}
>
> Also, there's no need to define new ioctls, instead you can just place
> it in the vcpu and use KVM_GET_MSR/KVM_SET_MSR. I'd agree that's
> slightly less polished, but it matches what we do already for e.g.
> nested VMX model specific registers. And it spares you from writing the
> documentation that you didn't include in this patch. :)
>
> Paolo

I wanted to do the work once rather than doing it per vCPU but using
KVM_{GET|SET}_MSR and extending the list of MSRs that userspace can
save/restore is certainly doable.

I'll look into that and post a v2.

Filippo
> On 27. Nov 2017, at 14:09, Steve Rutherford <srutherford@google.com> wrote:
>
> On Mon, Nov 27, 2017 at 3:58 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 26/11/2017 17:41, Filippo Sironi wrote:
>>> ... that the guest should see.
>>> Guest operating systems may check the microcode version to decide whether
>>> to disable certain features that are known to be buggy up to certain
>>> microcode versions. Address the issue by making the microcode version
>>> that the guest should see settable.
>>> The rationale for having userspace specifying the microcode version, rather
>>> than having the kernel picking it, is to ensure consistency for live-migrated
>>> instances; we don't want them to see a microcode version increase without a
>>> reset.
>>>
>>> Signed-off-by: Filippo Sironi <sironi@amazon.de>
>>> ---
>>>  arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
>>>  include/uapi/linux/kvm.h |  3 +++
>>>  2 files changed, 26 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 925c3e29cad3..741588f27ebc 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>>  	} u;
>>>
>>>  	switch (ioctl) {
>>> +	case KVM_GET_MICROCODE_VERSION: {
>>> +		r = -EFAULT;
>>> +		if (copy_to_user(argp,
>>> +				 &kvm->arch.microcode_version,
>>> +				 sizeof(kvm->arch.microcode_version)))
>>> +			goto out;
>>> +		break;
>>> +	}
>>> +	case KVM_SET_MICROCODE_VERSION: {
>>> +		u32 microcode_version;
>>> +
>>> +		r = -EFAULT;
>>> +		if (copy_from_user(&microcode_version,
>>> +				   argp,
>>> +				   sizeof(microcode_version)))
>>> +			goto out;
>>> +		r = -EINVAL;
>>> +		if (!microcode_version)
>>> +			goto out;
>>> +		kvm->arch.microcode_version = microcode_version;
>>> +		r = 0;
>>> +		break;
>>> +	}
>>
>> Also, there's no need to define new ioctls, instead you can just place
>> it in the vcpu and use KVM_GET_MSR/KVM_SET_MSR. I'd agree that's
>> slightly less polished, but it matches what we do already for e.g.
>> nested VMX model specific registers. And it spares you from writing the
>> documentation that you didn't include in this patch. :)
>>
>> Paolo
>
> This feels like a good time to mention Peter Hornyack's old MSR KVM_EXIT
> patches. With something like them, there would be no need to push this
> into the kernel at all.

That's one of the solutions we discussed internally (at Amazon) but we
haven't pursued it yet given the need to release a quick fix for
customers.

I was thinking about implementing a mechanism to selectively go back to
userspace to emulate MSRs; something that's not limited to MSRs that KVM
doesn't handle but that instead could even override KVM's handling.

Filippo
On 09/12/2017 08:42, Sironi, Filippo wrote:
> I wanted to do the work once rather than doing it per vCPU but using
> KVM_{GET|SET}_MSR and extending the list of MSRs that userspace can
> save/restore is certainly doable.
>
> I'll look into that and post a v2.

Thanks!

Paolo
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 925c3e29cad3..741588f27ebc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4033,6 +4033,29 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	} u;
 
 	switch (ioctl) {
+	case KVM_GET_MICROCODE_VERSION: {
+		r = -EFAULT;
+		if (copy_to_user(argp,
+				 &kvm->arch.microcode_version,
+				 sizeof(kvm->arch.microcode_version)))
+			goto out;
+		break;
+	}
+	case KVM_SET_MICROCODE_VERSION: {
+		u32 microcode_version;
+
+		r = -EFAULT;
+		if (copy_from_user(&microcode_version,
+				   argp,
+				   sizeof(microcode_version)))
+			goto out;
+		r = -EINVAL;
+		if (!microcode_version)
+			goto out;
+		kvm->arch.microcode_version = microcode_version;
+		r = 0;
+		break;
+	}
 	case KVM_SET_TSS_ADDR:
 		r = kvm_vm_ioctl_set_tss_addr(kvm, arg);
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 282d7613fce8..e11887758e29 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1192,6 +1192,9 @@ struct kvm_s390_ucas_mapping {
 #define KVM_S390_UCAS_UNMAP _IOW(KVMIO, 0x51, struct kvm_s390_ucas_mapping)
 #define KVM_S390_VCPU_FAULT _IOW(KVMIO, 0x52, unsigned long)
 
+#define KVM_GET_MICROCODE_VERSION _IOR(KVMIO, 0x5e, __u32)
+#define KVM_SET_MICROCODE_VERSION _IOW(KVMIO, 0x5f, __u32)
+
 /* Device model IOC */
 #define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60)
 #define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level)
... that the guest should see.

Guest operating systems may check the microcode version to decide whether
to disable certain features that are known to be buggy up to certain
microcode versions. Address the issue by making the microcode version
that the guest should see settable.

The rationale for having userspace specifying the microcode version, rather
than having the kernel picking it, is to ensure consistency for live-migrated
instances; we don't want them to see a microcode version increase without a
reset.

Signed-off-by: Filippo Sironi <sironi@amazon.de>
---
 arch/x86/kvm/x86.c       | 23 +++++++++++++++++++++++
 include/uapi/linux/kvm.h |  3 +++
 2 files changed, 26 insertions(+)
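For illustration, userspace could drive the two VM ioctls proposed in this patch roughly as sketched below. The ioctl numbers are copied from the posted diff and exist only in this v1 proposal, not in mainline headers, so they are defined locally. `uint32_t` stands in for `__u32` (same size, so the encoded ioctl number is the same), and `vm_set_microcode_version()` / `vm_get_microcode_version()` are hypothetical wrapper names. `vm_fd` is assumed to be a VM file descriptor obtained from KVM_CREATE_VM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

#ifndef KVMIO
#define KVMIO 0xAE	/* KVM's ioctl type, as in <linux/kvm.h> */
#endif

/* Proposed by this patch only; not available in released kernels. */
#define KVM_GET_MICROCODE_VERSION _IOR(KVMIO, 0x5e, uint32_t)
#define KVM_SET_MICROCODE_VERSION _IOW(KVMIO, 0x5f, uint32_t)

/* Set the revision the guest should see. Returns 0 or a negative errno. */
int vm_set_microcode_version(int vm_fd, uint32_t rev)
{
	/* The kernel side rejects 0 with -EINVAL; fail early in userspace. */
	if (rev == 0)
		return -EINVAL;

	return ioctl(vm_fd, KVM_SET_MICROCODE_VERSION, &rev) < 0 ? -errno : 0;
}

/* Read back the revision stored for this VM. Returns 0 or a negative errno. */
int vm_get_microcode_version(int vm_fd, uint32_t *rev)
{
	assert(rev != NULL);

	return ioctl(vm_fd, KVM_GET_MICROCODE_VERSION, rev) < 0 ? -errno : 0;
}
```

A VMM following the commit message would call the set wrapper once at instance launch with the host's revision and replay the same value on the migration target. As posted, the KVM_GET_MICROCODE_VERSION case appears to leave `r` at -EFAULT after a successful copy_to_user(), so the get wrapper would likely report an error until that path sets `r = 0`.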