diff mbox series

[v2,19/49] KVM: x86: Add a macro to init CPUID features that ignore host kernel support

Message ID 20240517173926.965351-20-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: CPUID overhaul, fixes, and caching | expand

Commit Message

Sean Christopherson May 17, 2024, 5:38 p.m. UTC
Add a macro for use in kvm_set_cpu_caps() to automagically initialize
features that KVM wants to support based solely on the CPU's capabilities,
e.g. KVM advertises LA57 support if it's available in hardware, even if
the host kernel isn't utilizing 57-bit virtual addresses.

Take advantage of the fact that kvm_cpu_cap_mask() adjusts kvm_cpu_caps
based on raw CPUID, i.e. will clear features bits that aren't supported in
hardware, and simply force-set the capability before applying the mask.

Abusing kvm_cpu_cap_set() is a borderline evil shenanigan, but doing so
avoid extra CPUID lookups, and a future commit will harden the entire
family of *F() macros to assert (at compile time) that every feature being
allowed is part of the capability word being processed, i.e. using a macro
will bring more advantages in the future.

Avoiding CPUID also fixes a largely benign bug where KVM could incorrectly
report LA57 support on Intel CPUs whose max supported CPUID is less than 7,
i.e. if the max supported leaf (<7) happened to have bit 16 set.  In
practice, barring a funky virtual machine setup, the bug is benign as all
known CPUs that support VMX also support leaf 7.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

Comments

Maxim Levitsky July 5, 2024, 1:21 a.m. UTC | #1
On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> Add a macro for use in kvm_set_cpu_caps() to automagically initialize
> features that KVM wants to support based solely on the CPU's capabilities,
> e.g. KVM advertises LA57 support if it's available in hardware, even if
> the host kernel isn't utilizing 57-bit virtual addresses.
> 
> Take advantage of the fact that kvm_cpu_cap_mask() adjusts kvm_cpu_caps
> based on raw CPUID, i.e. will clear features bits that aren't supported in
> hardware, and simply force-set the capability before applying the mask.
> 
> Abusing kvm_cpu_cap_set() is a borderline evil shenanigan, but doing so
> avoid extra CPUID lookups, and a future commit will harden the entire
> family of *F() macros to assert (at compile time) that every feature being
> allowed is part of the capability word being processed, i.e. using a macro
> will bring more advantages in the future.

Could you explain what do you mean by "extra CPUID lookups"?


> 
> Avoiding CPUID also fixes a largely benign bug where KVM could incorrectly
> report LA57 support on Intel CPUs whose max supported CPUID is less than 7,
> i.e. if the max supported leaf (<7) happened to have bit 16 set.  In
> practice, barring a funky virtual machine setup, the bug is benign as all
> known CPUs that support VMX also support leaf 7.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 77625a5477b1..a802c09b50ab 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,18 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
>  	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
>  })
>  
> +/*
> + * Raw Feature - For features that KVM supports based purely on raw host CPUID,
> + * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> + * Simply force set the feature in KVM's capabilities, raw CPUID support will
> + * be factored in by kvm_cpu_cap_mask().
> + */
> +#define RAW_F(name)						\
> +({								\
> +	kvm_cpu_cap_set(X86_FEATURE_##name);			\
> +	F(name);						\
> +})
> +
>  /*
>   * Magic value used by KVM when querying userspace-provided CPUID entries and
>   * doesn't care about the CPIUD index because the index of the function in
> @@ -682,15 +694,12 @@ void kvm_set_cpu_caps(void)
>  		F(AVX512VL));
>  
>  	kvm_cpu_cap_mask(CPUID_7_ECX,
> -		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> +		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
>  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
>  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
>  		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
>  		F(SGX_LC) | F(BUS_LOCK_DETECT)
>  	);
> -	/* Set LA57 based on hardware capability. */
> -	if (cpuid_ecx(7) & F(LA57))
> -		kvm_cpu_cap_set(X86_FEATURE_LA57);
>  
>  	/*
>  	 * PKU not yet implemented for shadow paging and requires OSPKE

Putting a function call into a macro which evaluates into a bitmask is somewhat misleading,
but let it be...

IMHO in long term, it might be better to rip the whole huge 'or'ed mess, and replace
it with a list of statements, along with comments for all unusual cases.


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
Sean Christopherson July 8, 2024, 8:53 p.m. UTC | #2
On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > Add a macro for use in kvm_set_cpu_caps() to automagically initialize
> > features that KVM wants to support based solely on the CPU's capabilities,
> > e.g. KVM advertises LA57 support if it's available in hardware, even if
> > the host kernel isn't utilizing 57-bit virtual addresses.
> > 
> > Take advantage of the fact that kvm_cpu_cap_mask() adjusts kvm_cpu_caps
> > based on raw CPUID, i.e. will clear features bits that aren't supported in
> > hardware, and simply force-set the capability before applying the mask.
> > 
> > Abusing kvm_cpu_cap_set() is a borderline evil shenanigan, but doing so
> > avoid extra CPUID lookups, and a future commit will harden the entire
> > family of *F() macros to assert (at compile time) that every feature being
> > allowed is part of the capability word being processed, i.e. using a macro
> > will bring more advantages in the future.
> 
> Could you explain what do you mean by "extra CPUID lookups"?

cpuid_ecx(7) incurs a CPUID to read the raw info, on top of the CPUID that is
executed by kvm_cpu_cap_init() (kvm_cpu_cap_mask() as of this patch).  Obviously
not a big deal, but it's an extra VM-Exit when running as a VM.

> > +/*
> > + * Raw Feature - For features that KVM supports based purely on raw host CPUID,
> > + * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> > + * Simply force set the feature in KVM's capabilities, raw CPUID support will
> > + * be factored in by kvm_cpu_cap_mask().
> > + */
> > +#define RAW_F(name)						\
> > +({								\
> > +	kvm_cpu_cap_set(X86_FEATURE_##name);			\
> > +	F(name);						\
> > +})
> > +
> >  /*
> >   * Magic value used by KVM when querying userspace-provided CPUID entries and
> >   * doesn't care about the CPIUD index because the index of the function in
> > @@ -682,15 +694,12 @@ void kvm_set_cpu_caps(void)
> >  		F(AVX512VL));
> >  
> >  	kvm_cpu_cap_mask(CPUID_7_ECX,
> > -		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > +		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> >  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> >  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> >  		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> >  		F(SGX_LC) | F(BUS_LOCK_DETECT)
> >  	);
> > -	/* Set LA57 based on hardware capability. */
> > -	if (cpuid_ecx(7) & F(LA57))
> > -		kvm_cpu_cap_set(X86_FEATURE_LA57);
> >  
> >  	/*
> >  	 * PKU not yet implemented for shadow paging and requires OSPKE
> 
> Putting a function call into a macro which evaluates into a bitmask is
> somewhat misleading, but let it be...
> 
> IMHO in long term, it might be better to rip the whole huge 'or'ed mess, and replace
> it with a list of statements, along with comments for all unusual cases.

As in something like this?

	kvm_cpu_cap_init(AVX512VBMI);
	kvm_cpu_cap_init_raw(LA57);
	kvm_cpu_cap_init(PKU);
	...
	kvm_cpu_cap_init(BUS_LOCK_DETECT);

	kvm_cpu_cap_init_aliased(CPUID_8000_0001_EDX, FPU);

	...

	kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX1);
	kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX2);
	kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX_EDECCSSA);

The tricky parts are incorporating raw CPUID into the masking and handling features
that KVM _doesn't_ support.  For raw CPUID, we could simply do CPUID every time, or
pre-fill an array to avoid hundreds of CPUIDs that are largely redudant.

But I don't see a way to mask off unsupported features without losing the
compile-time protections that the current code provides.  And even if we took a
big hammer approach, e.g. finalized masking for all words at the very end, we'd
still need to carry state across each statement, i.e. we'd still need the bitwise-OR
and mask  behavior, it would just be buried in helpers/macros.

I suspect the generated code will be larger, but I doubt that will actually be
problematic.  The written code will also be more verbose (roughly 4x since we
tend to squeeze 4 features per line), and it will be harder to ensure initialization
of features in a given word are all co-located.

I definitely don't hate the idea, but I don't think it will be a clear "win" either.
Unless someone feels strongly about pursuing this approach, I'll add to the "things
to explore later" list.
Sean Christopherson July 8, 2024, 10:36 p.m. UTC | #3
On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > +/*
> > + * Raw Feature - For features that KVM supports based purely on raw host CPUID,
> > + * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> > + * Simply force set the feature in KVM's capabilities, raw CPUID support will
> > + * be factored in by kvm_cpu_cap_mask().
> > + */
> > +#define RAW_F(name)						\
> > +({								\
> > +	kvm_cpu_cap_set(X86_FEATURE_##name);			\
> > +	F(name);						\
> > +})
> > +
> >  /*
> >   * Magic value used by KVM when querying userspace-provided CPUID entries and
> >   * doesn't care about the CPIUD index because the index of the function in
> > @@ -682,15 +694,12 @@ void kvm_set_cpu_caps(void)
> >  		F(AVX512VL));
> >  
> >  	kvm_cpu_cap_mask(CPUID_7_ECX,
> > -		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > +		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> >  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> >  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> >  		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> >  		F(SGX_LC) | F(BUS_LOCK_DETECT)
> >  	);
> > -	/* Set LA57 based on hardware capability. */
> > -	if (cpuid_ecx(7) & F(LA57))
> > -		kvm_cpu_cap_set(X86_FEATURE_LA57);
> >  
> >  	/*
> >  	 * PKU not yet implemented for shadow paging and requires OSPKE
> 
> Putting a function call into a macro which evaluates into a bitmask is somewhat misleading,
> but let it be...

And weird.  Rather than abuse kvm_cpu_cap_set(), what about adding another variable
scoped to kvm_cpu_cap_init()?

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0e64a6332052..b8bc8713a0ec 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -87,12 +87,10 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
 /*
  * Raw Feature - For features that KVM supports based purely on raw host CPUID,
  * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
- * Simply force set the feature in KVM's capabilities, raw CPUID support will
- * be factored in by __kvm_cpu_cap_mask().
  */
 #define RAW_F(name)                                            \
 ({                                                             \
-       kvm_cpu_cap_set(X86_FEATURE_##name);                    \
+       kvm_cpu_cap_passthrough |= F(name);                     \
        F(name);                                                \
 })
 
@@ -737,6 +735,7 @@ do {                                                                        \
 do {                                                                   \
        const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);    \
        const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;   \
+       u32 kvm_cpu_cap_passthrough = 0;                                \
        u32 kvm_cpu_cap_emulated = 0;                                   \
        u32 kvm_cpu_cap_synthesized = 0;                                \
                                                                        \
@@ -745,6 +744,7 @@ do {                                                                        \
        else                                                            \
                kvm_cpu_caps[leaf] = (mask);                            \
                                                                        \
+       kvm_cpu_caps[leaf] |= kvm_cpu_cap_passthrough;                  \
        kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |                   \
                               kvm_cpu_cap_synthesized);                \
        kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;                     \
Maxim Levitsky July 24, 2024, 5:39 p.m. UTC | #4
On Mon, 2024-07-08 at 13:53 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > > Add a macro for use in kvm_set_cpu_caps() to automagically initialize
> > > features that KVM wants to support based solely on the CPU's capabilities,
> > > e.g. KVM advertises LA57 support if it's available in hardware, even if
> > > the host kernel isn't utilizing 57-bit virtual addresses.
> > > 
> > > Take advantage of the fact that kvm_cpu_cap_mask() adjusts kvm_cpu_caps
> > > based on raw CPUID, i.e. will clear features bits that aren't supported in
> > > hardware, and simply force-set the capability before applying the mask.
> > > 
> > > Abusing kvm_cpu_cap_set() is a borderline evil shenanigan, but doing so
> > > avoid extra CPUID lookups, and a future commit will harden the entire
> > > family of *F() macros to assert (at compile time) that every feature being
> > > allowed is part of the capability word being processed, i.e. using a macro
> > > will bring more advantages in the future.
> > 
> > Could you explain what do you mean by "extra CPUID lookups"?
> 
> cpuid_ecx(7) incurs a CPUID to read the raw info, on top of the CPUID that is
> executed by kvm_cpu_cap_init() (kvm_cpu_cap_mask() as of this patch).  Obviously
> not a big deal, but it's an extra VM-Exit when running as a VM.
> 
> > > +/*
> > > + * Raw Feature - For features that KVM supports based purely on raw host CPUID,
> > > + * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> > > + * Simply force set the feature in KVM's capabilities, raw CPUID support will
> > > + * be factored in by kvm_cpu_cap_mask().
> > > + */
> > > +#define RAW_F(name)						\
> > > +({								\
> > > +	kvm_cpu_cap_set(X86_FEATURE_##name);			\
> > > +	F(name);						\
> > > +})
> > > +
> > >  /*
> > >   * Magic value used by KVM when querying userspace-provided CPUID entries and
> > >   * doesn't care about the CPIUD index because the index of the function in
> > > @@ -682,15 +694,12 @@ void kvm_set_cpu_caps(void)
> > >  		F(AVX512VL));
> > >  
> > >  	kvm_cpu_cap_mask(CPUID_7_ECX,
> > > -		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > > +		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > >  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> > >  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> > >  		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> > >  		F(SGX_LC) | F(BUS_LOCK_DETECT)
> > >  	);
> > > -	/* Set LA57 based on hardware capability. */
> > > -	if (cpuid_ecx(7) & F(LA57))
> > > -		kvm_cpu_cap_set(X86_FEATURE_LA57);
> > >  
> > >  	/*
> > >  	 * PKU not yet implemented for shadow paging and requires OSPKE
> > 
> > Putting a function call into a macro which evaluates into a bitmask is
> > somewhat misleading, but let it be...
> > 
> > IMHO in long term, it might be better to rip the whole huge 'or'ed mess, and replace
> > it with a list of statements, along with comments for all unusual cases.
> 
> As in something like this?
> 
> 	kvm_cpu_cap_init(AVX512VBMI);
> 	kvm_cpu_cap_init_raw(LA57);
> 	kvm_cpu_cap_init(PKU);
> 	...
> 	kvm_cpu_cap_init(BUS_LOCK_DETECT);
> 
> 	kvm_cpu_cap_init_aliased(CPUID_8000_0001_EDX, FPU);
> 
> 	...
> 
> 	kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX1);
> 	kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX2);
> 	kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX_EDECCSSA);
> 
> The tricky parts are incorporating raw CPUID into the masking and handling features
> that KVM _doesn't_ support.  For raw CPUID, we could simply do CPUID every time, or
> pre-fill an array to avoid hundreds of CPUIDs that are largely redudant.

In terms of performance, again this code is run once per kvm module load, so even
if it does something truly gross performance wise, it's not a problem, even if run
nested.

> 
> But I don't see a way to mask off unsupported features without losing the
> compile-time protections that the current code provides.  And even if we took a
> big hammer approach, e.g. finalized masking for all words at the very end, we'd
> still need to carry state across each statement, i.e. we'd still need the bitwise-OR
> and mask  behavior, it would just be buried in helpers/macros.

Can you elaborate on this?

For example let's say this:


	kvm_cpu_cap_init(CPUID_7_0_EBX,
		F(FSGSBASE) | EMUL_F(TSC_ADJUST) | F(SGX) | F(BMI1) | F(HLE) |
		F(AVX2) | F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) |
		F(INVPCID) | F(RTM) | F(ZERO_FCS_FDS) | 0 /*MPX*/ |
		F(AVX512F) | F(AVX512DQ) | F(RDSEED) | F(ADX) | F(SMAP) |
		F(AVX512IFMA) | F(CLFLUSHOPT) | F(CLWB) | 0 /*INTEL_PT*/ |
		F(AVX512PF) | F(AVX512ER) | F(AVX512CD) | F(SHA_NI) |
		F(AVX512BW) | F(AVX512VL));



This will be replaced with:

kvm_cpu_cap_clear_all(CPUID_7_0_EBX);

kvm_cpu_cap_init(FSGSBASE);
kvm_cpu_cap_init(TSC_ADJUST, CAP_EMULATED);
..
kvm_cpu_cap_init(AVX512VL);

Then each 'kvm_cpu_cap_init' will opt-in to set a bit if supported in host cpuid, or
always opt-in for emulated features, etc....

Host CPUID can indeed be cached, if extra host cpuid queries cause too slow (e.g 1 second) delay
when nested.



> 
> I suspect the generated code will be larger, but I doubt that will actually be
> problematic.  

Yes, 100% agree.


> The written code will also be more verbose (roughly 4x since we
> tend to squeeze 4 features per line)

It will be about as long as the list of macros in the cpufeatures.h, where
all features are nicely ordered by cpuid leaves.

In this case I consider verbose long code to be an improvement.

IMHO the OR'ed mask of macros is just too terse, hard to parse.
It was borderline OK, before this patch series because it only
contained features, but now it also contains various modifiers,
IMHO it's just hard to notice that EMUL_F at that corner...


> , and it will be harder to ensure initialization
> of features in a given word are all co-located.

Actually co-location won't be needed.

We can first copy the caps from boot_cpu_data,
then zero all the leaves that we initialize ourselves.

After that we can initialize opt-in features in any order - it will still be sorted
by CPUID leaves but even if the order is broken (e.g due to cherry-pick or something),
it won't cause any issues.


> 
> I definitely don't hate the idea, but I don't think it will be a clear "win" either.
> Unless someone feels strongly about pursuing this approach, I'll add to the "things
> to explore later" list.
> 

Please do consider this, I am almost sure that whoever will need to read this code later (could be you...),
will thank you.


Best regards,
	Maxim Levitsky
Maxim Levitsky July 24, 2024, 5:40 p.m. UTC | #5
On Mon, 2024-07-08 at 15:36 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > > +/*
> > > + * Raw Feature - For features that KVM supports based purely on raw host CPUID,
> > > + * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> > > + * Simply force set the feature in KVM's capabilities, raw CPUID support will
> > > + * be factored in by kvm_cpu_cap_mask().
> > > + */
> > > +#define RAW_F(name)						\
> > > +({								\
> > > +	kvm_cpu_cap_set(X86_FEATURE_##name);			\
> > > +	F(name);						\
> > > +})
> > > +
> > >  /*
> > >   * Magic value used by KVM when querying userspace-provided CPUID entries and
> > >   * doesn't care about the CPIUD index because the index of the function in
> > > @@ -682,15 +694,12 @@ void kvm_set_cpu_caps(void)
> > >  		F(AVX512VL));
> > >  
> > >  	kvm_cpu_cap_mask(CPUID_7_ECX,
> > > -		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > > +		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > >  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> > >  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> > >  		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> > >  		F(SGX_LC) | F(BUS_LOCK_DETECT)
> > >  	);
> > > -	/* Set LA57 based on hardware capability. */
> > > -	if (cpuid_ecx(7) & F(LA57))
> > > -		kvm_cpu_cap_set(X86_FEATURE_LA57);
> > >  
> > >  	/*
> > >  	 * PKU not yet implemented for shadow paging and requires OSPKE
> > 
> > Putting a function call into a macro which evaluates into a bitmask is somewhat misleading,
> > but let it be...
> 
> And weird.  Rather than abuse kvm_cpu_cap_set(), what about adding another variable
> scoped to kvm_cpu_cap_init()?
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0e64a6332052..b8bc8713a0ec 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -87,12 +87,10 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
>  /*
>   * Raw Feature - For features that KVM supports based purely on raw host CPUID,
>   * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> - * Simply force set the feature in KVM's capabilities, raw CPUID support will
> - * be factored in by __kvm_cpu_cap_mask().
>   */
>  #define RAW_F(name)                                            \
>  ({                                                             \
> -       kvm_cpu_cap_set(X86_FEATURE_##name);                    \
> +       kvm_cpu_cap_passthrough |= F(name);                     \
>         F(name);                                                \
>  })
>  
> @@ -737,6 +735,7 @@ do {                                                                        \
>  do {                                                                   \
>         const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);    \
>         const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;   \
> +       u32 kvm_cpu_cap_passthrough = 0;                                \
>         u32 kvm_cpu_cap_emulated = 0;                                   \
>         u32 kvm_cpu_cap_synthesized = 0;                                \
>                                                                         \
> @@ -745,6 +744,7 @@ do {                                                                        \
>         else                                                            \
>                 kvm_cpu_caps[leaf] = (mask);                            \
>                                                                         \
> +       kvm_cpu_caps[leaf] |= kvm_cpu_cap_passthrough;                  \
>         kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |                   \
>                                kvm_cpu_cap_synthesized);                \
>         kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;                     \
> 

I agree, this is better.

Best regards,
	Maxim Levitsky
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 77625a5477b1..a802c09b50ab 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -70,6 +70,18 @@  u32 xstate_required_size(u64 xstate_bv, bool compacted)
 	(boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0);	\
 })
 
+/*
+ * Raw Feature - For features that KVM supports based purely on raw host CPUID,
+ * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
+ * Simply force set the feature in KVM's capabilities, raw CPUID support will
+ * be factored in by kvm_cpu_cap_mask().
+ */
+#define RAW_F(name)						\
+({								\
+	kvm_cpu_cap_set(X86_FEATURE_##name);			\
+	F(name);						\
+})
+
 /*
  * Magic value used by KVM when querying userspace-provided CPUID entries and
  * doesn't care about the CPIUD index because the index of the function in
@@ -682,15 +694,12 @@  void kvm_set_cpu_caps(void)
 		F(AVX512VL));
 
 	kvm_cpu_cap_mask(CPUID_7_ECX,
-		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
+		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
 		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
 		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
 		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
 		F(SGX_LC) | F(BUS_LOCK_DETECT)
 	);
-	/* Set LA57 based on hardware capability. */
-	if (cpuid_ecx(7) & F(LA57))
-		kvm_cpu_cap_set(X86_FEATURE_LA57);
 
 	/*
 	 * PKU not yet implemented for shadow paging and requires OSPKE