diff mbox series

[v2,01/49] KVM: x86: Do all post-set CPUID processing during vCPU creation

Message ID 20240517173926.965351-2-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: CPUID overhaul, fixes, and caching | expand

Commit Message

Sean Christopherson May 17, 2024, 5:38 p.m. UTC
During vCPU creation, process KVM's default, empty CPUID as if userspace
set an empty CPUID to ensure consistent and correct behavior with respect
to guest CPUID.  E.g. if userspace never sets guest CPUID, KVM will never
configure cr4_guest_rsvd_bits, and thus create divergent, incorrect, guest-
visible behavior due to letting the guest set any KVM-supported CR4 bits
despite the features not being allowed per guest CPUID.

Note!  This changes KVM's ABI, as lack of full CPUID processing allowed
userspace to stuff garbage vCPU state, e.g. userspace could set CR4 to a
guest-unsupported value via KVM_SET_SREGS.  But it's extremely unlikely
that this is a breaking change, as KVM already has many flows that require
userspace to set guest CPUID before loading vCPU state.  E.g. multiple MSR
flows consult guest CPUID on host writes, and KVM_SET_SREGS itself already
relies on guest CPUID being up-to-date, as KVM's validity check on CR3
consumes CPUID.0x7.1 (for LAM) and CPUID.0x80000008 (for MAXPHYADDR).

Furthermore, the plan is to commit to enforcing guest CPUID for userspace
writes to MSRs, at which point bypassing sregs CPUID checks is even more
nonsensical.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 arch/x86/kvm/cpuid.h | 1 +
 arch/x86/kvm/x86.c   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

Comments

Maxim Levitsky July 5, 2024, 12:48 a.m. UTC | #1
On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> During vCPU creation, process KVM's default, empty CPUID as if userspace
> set an empty CPUID to ensure consistent and correct behavior with respect
> to guest CPUID.  E.g. if userspace never sets guest CPUID, KVM will never
> configure cr4_guest_rsvd_bits, and thus create divergent, incorrect, guest-
> visible behavior due to letting the guest set any KVM-supported CR4 bits
> despite the features not being allowed per guest CPUID.
> 
> Note!  This changes KVM's ABI, as lack of full CPUID processing allowed
> userspace to stuff garbage vCPU state, e.g. userspace could set CR4 to a
> guest-unsupported value via KVM_SET_SREGS.  But it's extremely unlikely
> that this is a breaking change, as KVM already has many flows that require
> userspace to set guest CPUID before loading vCPU state.  E.g. multiple MSR
> flows consult guest CPUID on host writes, and KVM_SET_SREGS itself already
> relies on guest CPUID being up-to-date, as KVM's validity check on CR3
> consumes CPUID.0x7.1 (for LAM) and CPUID.0x80000008 (for MAXPHYADDR).
> 
> Furthermore, the plan is to commit to enforcing guest CPUID for userspace
> writes to MSRs, at which point bypassing sregs CPUID checks is even more
> nonsensical.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  arch/x86/kvm/cpuid.h | 1 +
>  arch/x86/kvm/x86.c   | 1 +
>  3 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index f2f2be5d1141..2b19ff991ceb 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -335,7 +335,7 @@ static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
>  #endif
>  }
>  
> -static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> +void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_lapic *apic = vcpu->arch.apic;
>  	struct kvm_cpuid_entry2 *best;
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index 23dbb9eb277c..0a8b561b5434 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -11,6 +11,7 @@
>  extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
>  void kvm_set_cpu_caps(void);
>  
> +void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
>  void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu);
>  void kvm_update_pv_runtime(struct kvm_vcpu *vcpu);
>  struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d750546ec934..7adcf56bd45d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12234,6 +12234,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  	kvm_xen_init_vcpu(vcpu);
>  	kvm_vcpu_mtrr_init(vcpu);
>  	vcpu_load(vcpu);
> +	kvm_vcpu_after_set_cpuid(vcpu);

This makes me a bit nervous. At this point the vcpu->arch.cpuid_entries is NULL,
but so is vcpu->arch.cpuid_nent so it sort of works but is one mistake away from crash.

Maybe we should add some protection to this, e.g empty zero cpuid or something like that.

Best regards,
	Maxim Levitsky


>  	kvm_set_tsc_khz(vcpu, vcpu->kvm->arch.default_tsc_khz);
>  	kvm_vcpu_reset(vcpu, false);
>  	kvm_init_mmu(vcpu);
Maxim Levitsky July 24, 2024, 5:24 p.m. UTC | #2
On Mon, 2024-07-08 at 11:46 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> > > index 23dbb9eb277c..0a8b561b5434 100644
> > > --- a/arch/x86/kvm/cpuid.h
> > > +++ b/arch/x86/kvm/cpuid.h
> > > @@ -11,6 +11,7 @@
> > >  extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
> > >  void kvm_set_cpu_caps(void);
> > >  
> > > +void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
> > >  void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu);
> > >  void kvm_update_pv_runtime(struct kvm_vcpu *vcpu);
> > >  struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index d750546ec934..7adcf56bd45d 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -12234,6 +12234,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > >  	kvm_xen_init_vcpu(vcpu);
> > >  	kvm_vcpu_mtrr_init(vcpu);
> > >  	vcpu_load(vcpu);
> > > +	kvm_vcpu_after_set_cpuid(vcpu);
> > 
> > This makes me a bit nervous. At this point the vcpu->arch.cpuid_entries is
> > NULL, but so is vcpu->arch.cpuid_nent so it sort of works but is one mistake
> > away from crash.
> > 
> > Maybe we should add some protection to this, e.g empty zero cpuid or
> > something like that.
> 
> Hmm, a crash is actually a good thing.  In the post-KVM_SET_CPUID2 case, if KVM
> accessed vcpu->arch.cpuid_entries without properly consulting cpuid_nent, the
> resulting failure would be a out-of-bounds read.  Similarly, a zeroed CPUID array
> would effectiely mask any bugs.
> 
> Given that KVM heavily relies on "vcpu" to be zero-allocated, and that changing
> cpuid_nent during kvm_arch_vcpu_create() would be an extremely egregious bug,
> a crash due to a NULL-pointer dereference should never escape developer testing,
> let alone full release testing.
> 
> KVM does the "empty" array thing for IRQ routing (though in that case the array
> and the nr_entries are in a single struct), and IMO it's been a huge net negative
> because it's led to increased complexity just so that arch code can omit a NULL
> check.
> 

Makes sense, let it be.

Best regards,
	Maxim Levitsky
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f2f2be5d1141..2b19ff991ceb 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -335,7 +335,7 @@  static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
 #endif
 }
 
-static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
+void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	struct kvm_cpuid_entry2 *best;
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 23dbb9eb277c..0a8b561b5434 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -11,6 +11,7 @@ 
 extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 void kvm_set_cpu_caps(void);
 
+void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
 void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu);
 void kvm_update_pv_runtime(struct kvm_vcpu *vcpu);
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d750546ec934..7adcf56bd45d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12234,6 +12234,7 @@  int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	kvm_xen_init_vcpu(vcpu);
 	kvm_vcpu_mtrr_init(vcpu);
 	vcpu_load(vcpu);
+	kvm_vcpu_after_set_cpuid(vcpu);
 	kvm_set_tsc_khz(vcpu, vcpu->kvm->arch.default_tsc_khz);
 	kvm_vcpu_reset(vcpu, false);
 	kvm_init_mmu(vcpu);