diff mbox series

[v2,35/49] KVM: x86: Add a macro to handle features that are fully VMM controlled

Message ID 20240517173926.965351-36-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: CPUID overhaul, fixes, and caching | expand

Commit Message

Sean Christopherson May 17, 2024, 5:39 p.m. UTC
Add a macro to track CPUID features for which KVM fully defers to
userspace, i.e. that KVM honors if they are enumerated to the guest, even
if KVM itself doesn't advertise them to usersepace.

Somewhat unfortunately, this behavior only applies to MWAIT (largely
because of KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS), and it's not all that
likely future features will be handled in a similar way.  I.e. very
arguably, potentially tracking every feature in kvm_vmm_cpu_caps is a
waste of memory.

However, adding one-off handling for individual features is quite painful,
especially when considering future hardening.  It's very doable to verify,
at compile time, that every CPUID-based feature that KVM queries when
emulating guest behavior is actually known to KVM, e.g. to prevent KVM
bugs where KVM emulates some feature but fails to advertise support to
userspace.  In other words, any features that are special cased, i.e. not
handled generically in the CPUID framework, would also need to be special
cased for any hardening efforts that build on said framework.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

Comments

Maxim Levitsky July 5, 2024, 2:08 a.m. UTC | #1
On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> Add a macro to track CPUID features for which KVM fully defers to
> userspace, i.e. that KVM honors if they are enumerated to the guest, even
> if KVM itself doesn't advertise them to usersepace.
> 
> Somewhat unfortunately, this behavior only applies to MWAIT (largely
> because of KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS), and it's not all that
> likely future features will be handled in a similar way.  I.e. very
> arguably, potentially tracking every feature in kvm_vmm_cpu_caps is a
> waste of memory.
> 
> However, adding one-off handling for individual features is quite painful,
> especially when considering future hardening.  It's very doable to verify,
> at compile time, that every CPUID-based feature that KVM queries when
> emulating guest behavior is actually known to KVM, e.g. to prevent KVM
> bugs where KVM emulates some feature but fails to advertise support to
> userspace.  In other words, any features that are special cased, i.e. not
> handled generically in the CPUID framework, would also need to be special
> cased for any hardening efforts that build on said framework.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index de898d571faa..16bb873188d6 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -36,6 +36,8 @@
>  u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
>  EXPORT_SYMBOL_GPL(kvm_cpu_caps);
>  
> +static u32 kvm_vmm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
> +
>  u32 xstate_required_size(u64 xstate_bv, bool compacted)
>  {
>  	int feature_bit = 0;
> @@ -115,6 +117,21 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
>  	feature_bit(name);							\
>  })
>  
> +/*
> + * VMM Features - For features that KVM "supports" in some capacity, i.e. that
> + * KVM may query, but that are never advertised to userspace.  E.g. KVM allows
> + * userspace to enumerate MONITOR+MWAIT support to the guest, but the MWAIT
> + * feature flag is never advertised to userspace because MONITOR+MWAIT aren't
> + * virtualized by hardware, can't be faithfully emulated in software (KVM
> + * emulates them as NOPs), and allowing the guest to execute them natively
> + * requires enabling a per-VM capability.
> + */
> +#define VMM_F(name)								\
> +({										\
> +	kvm_vmm_cpu_caps[__feature_leaf(X86_FEATURE_##name)] |= F(name);	\
> +	0;									\
> +})
> +
>  /*
>   * Magic value used by KVM when querying userspace-provided CPUID entries and
>   * doesn't care about the CPIUD index because the index of the function in
> @@ -674,7 +691,7 @@ void kvm_set_cpu_caps(void)
>  		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
>  		 * advertised to guests via CPUID!
>  		 */
> -		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
> +		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64 */ | VMM_F(MWAIT) |
>  		0 /* DS-CPL, VMX, SMX, EST */ |
>  		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
>  		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |

Hi,

Not sure that this is worth it. Especially, IMHO the definition of
'KVM honors if they are enumerated to the guest, even if KVM itself doesn't advertise them to usersepace',
is very problematic - AFAIK KVM allows/honours userspace to set anything in the guest visible CPUID, I myself
caused a guest crash once on purpose by forcing it to see AVX3 which is not supported on my CPU.

I think that you mean features that KVM also uses for itself, e.g disables certain VMEXITS, etc,
but this is very hard to understand.

IMHO it is better to handle this case by case basis, it is less confusing.

So far, MWAIT is the only such feature, what do you think is the probability
of Intel/AMD adding more such features?

Best regards,	
	Maxim Levitsky
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index de898d571faa..16bb873188d6 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -36,6 +36,8 @@ 
 u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_cpu_caps);
 
+static u32 kvm_vmm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
+
 u32 xstate_required_size(u64 xstate_bv, bool compacted)
 {
 	int feature_bit = 0;
@@ -115,6 +117,21 @@  u32 xstate_required_size(u64 xstate_bv, bool compacted)
 	feature_bit(name);							\
 })
 
+/*
+ * VMM Features - For features that KVM "supports" in some capacity, i.e. that
+ * KVM may query, but that are never advertised to userspace.  E.g. KVM allows
+ * userspace to enumerate MONITOR+MWAIT support to the guest, but the MWAIT
+ * feature flag is never advertised to userspace because MONITOR+MWAIT aren't
+ * virtualized by hardware, can't be faithfully emulated in software (KVM
+ * emulates them as NOPs), and allowing the guest to execute them natively
+ * requires enabling a per-VM capability.
+ */
+#define VMM_F(name)								\
+({										\
+	kvm_vmm_cpu_caps[__feature_leaf(X86_FEATURE_##name)] |= F(name);	\
+	0;									\
+})
+
 /*
  * Magic value used by KVM when querying userspace-provided CPUID entries and
  * doesn't care about the CPIUD index because the index of the function in
@@ -674,7 +691,7 @@  void kvm_set_cpu_caps(void)
 		 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*
 		 * advertised to guests via CPUID!
 		 */
-		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
+		F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64 */ | VMM_F(MWAIT) |
 		0 /* DS-CPL, VMX, SMX, EST */ |
 		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
 		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |