Message ID | 20241128013424.4096668-6-seanjc@google.com |
---|---|
State | New |
Series | KVM: x86: CPUID overhaul, fixes, and caching |
On Wed, Nov 27, 2024 at 05:33:32PM -0800, Sean Christopherson wrote:
>Drop x86.c's local pre-computed cr4_reserved bits and instead fold KVM's
>reserved bits into the guest's reserved bits. This fixes a bug where VMX's
>set_cr4_guest_host_mask() fails to account for KVM-reserved bits when
>deciding which bits can be passed through to the guest. In most cases,
>letting the guest directly write reserved CR4 bits is ok, i.e. attempting
>to set the bit(s) will still #GP, but not if a feature is available in
>hardware but explicitly disabled by the host, e.g. if FSGSBASE support is
>disabled via "nofsgsbase".
>
>Note, the extra overhead of computing host reserved bits every time
>userspace sets guest CPUID is negligible. The feature bits that are
>queried are packed nicely into a handful of words, and so checking and
>setting each reserved bit costs in the neighborhood of ~5 cycles, i.e. the
>total cost will be in the noise even if the number of checked CR4 bits
>doubles over the next few years. In other words, x86 will run out of CR4
>bits long before the overhead becomes problematic.
>
>Note #2, __cr4_reserved_bits() starts from CR4_RESERVED_BITS, which is
>why the existing __kvm_cpu_cap_has() processing doesn't explicitly OR in
>CR4_RESERVED_BITS (and why the new code doesn't do so either).
>
>Fixes: 2ed41aa631fc ("KVM: VMX: Intercept guest reserved CR4 bits to inject #GP fault")
>Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Chao Gao <chao.gao@intel.com>
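To make the macro trick in the quoted changelog easier to follow, here is a minimal, self-contained C sketch of the same pattern. It is not kernel code: the feature names, CR4 bit values, and the host_has()/guest_has() helpers are invented for illustration, and the statement expression relies on the same GNU C extension the kernel uses. The point is only how a one-line adapter macro lets a single-argument host capability check be fed to a reserved-bits builder that expects a two-argument (checker, context) pair, with the host and guest results then OR'd together.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-ins; names and bit values are made up for this sketch. */
enum { FEAT_FSGSBASE, FEAT_PKE, NR_FEATURES };
#define CR4_FSGSBASE (1ull << 16)
#define CR4_PKE      (1ull << 22)

/* Host capabilities (FSGSBASE disabled, as with "nofsgsbase") vs. guest CPUID. */
static const int host_cap[NR_FEATURES] = { 0, 1 };
static int guest_cpuid[NR_FEATURES];

static int host_has(int feat)              { return host_cap[feat]; }
static int guest_has(void *vcpu, int feat) { (void)vcpu; return guest_cpuid[feat]; }

/*
 * Stand-in for __cr4_reserved_bits(): a bit is reserved unless the supplied
 * two-argument checker says the corresponding feature exists.  Uses a GNU C
 * statement expression, like the kernel macro it imitates.
 */
#define cr4_reserved_bits(__has, __c) ({			\
	uint64_t __rsvd = 0;					\
	if (!__has(__c, FEAT_FSGSBASE))				\
		__rsvd |= CR4_FSGSBASE;				\
	if (!__has(__c, FEAT_PKE))				\
		__rsvd |= CR4_PKE;				\
	__rsvd;							\
})

int main(void)
{
	void *vcpu = NULL;		/* placeholder for a vCPU pointer */

	/* Userspace advertises both features in guest CPUID. */
	guest_cpuid[FEAT_FSGSBASE] = 1;
	guest_cpuid[FEAT_PKE] = 1;

	/* Adapter: drop the unused first argument so the host check fits. */
#define host_has_adapter(UNUSED_, f) host_has(f)
	uint64_t rsvd = cr4_reserved_bits(host_has_adapter, UNUSED_) |
			cr4_reserved_bits(guest_has, vcpu);
#undef host_has_adapter

	/* FSGSBASE stays reserved because the host itself disabled it. */
	printf("reserved CR4 bits: %#llx\n", (unsigned long long)rsvd);
	return 0;
}
```

Built with gcc, the printed mask still contains the FSGSBASE bit even though guest CPUID advertises the feature, which is the behavior the patch wants for a host booted with "nofsgsbase".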
```diff
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 1944f9415672..27919c8f438b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -400,8 +400,11 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
 
 	kvm_pmu_refresh(vcpu);
-	vcpu->arch.cr4_guest_rsvd_bits =
-	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
+
+#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
+	vcpu->arch.cr4_guest_rsvd_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_) |
+					 __cr4_reserved_bits(guest_cpuid_has, vcpu);
+#undef __kvm_cpu_cap_has
 
 	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
 						    vcpu->arch.cpuid_nent));
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca9b0a00cbcc..5288d53fef5c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -119,8 +119,6 @@ u64 __read_mostly efer_reserved_bits = ~((u64)(EFER_SCE | EFER_LME | EFER_LMA));
 static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
 #endif
 
-static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS;
-
 #define KVM_EXIT_HYPERCALL_VALID_MASK	(1 << KVM_HC_MAP_GPA_RANGE)
 
 #define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE
@@ -1285,9 +1283,6 @@ EXPORT_SYMBOL_GPL(kvm_emulate_xsetbv);
 
 bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
-	if (cr4 & cr4_reserved_bits)
-		return false;
-
 	if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
 		return false;
 
@@ -9773,10 +9768,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
 		kvm_caps.supported_xss = 0;
 
-#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
-	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
-#undef __kvm_cpu_cap_has
-
 	if (kvm_caps.has_tsc_control) {
 		/*
 		 * Make sure the user can only configure tsc_khz values that
```
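The diff only changes where the reserved-bit mask is computed; the consumer that motivated the fix is the CR4 guest/host mask set up on the VMX side. The toy below is a rough, hedged analogue of that decision (the constants and the cr4_intercept_mask() helper are invented, and the real computation in vmx.c involves more conditions): it shows why a reserved bit that is missing from cr4_guest_rsvd_bits ends up guest-owned and therefore writable without a #GP.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative CR4 bits; values are made up for this sketch. */
#define CR4_FSGSBASE (1ull << 16)
#define CR4_PKE      (1ull << 22)

/* Bits the hypervisor could in principle let the guest own without exiting. */
#define POSSIBLE_CR4_GUEST_BITS (CR4_FSGSBASE | CR4_PKE)

/*
 * Rough analogue of the set_cr4_guest_host_mask() decision described in the
 * changelog: a bit may be passed through only if it is not reserved for the
 * guest; everything else must trap so the hypervisor can inject #GP.
 */
static uint64_t cr4_intercept_mask(uint64_t guest_rsvd_bits)
{
	uint64_t guest_owned = POSSIBLE_CR4_GUEST_BITS & ~guest_rsvd_bits;

	return ~guest_owned;	/* set bits cause a VM-exit on guest writes */
}

int main(void)
{
	/* Before the fix: only guest-CPUID-derived reserved bits are known. */
	uint64_t before = cr4_intercept_mask(CR4_PKE);
	/* After the fix: host-reserved bits ("nofsgsbase") are folded in too. */
	uint64_t after = cr4_intercept_mask(CR4_PKE | CR4_FSGSBASE);

	printf("FSGSBASE write trapped?  before: %s  after: %s\n",
	       (before & CR4_FSGSBASE) ? "yes" : "no",
	       (after & CR4_FSGSBASE) ? "yes" : "no");
	return 0;
}
```

With only the guest-CPUID-derived mask, a write to the FSGSBASE bit is not trapped; once the KVM-reserved bits are folded in, the write exits and can #GP as intended.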