diff mbox series

[v2,26/49] KVM: x86: Add a macro to init CPUID features that KVM emulates in software

Message ID 20240517173926.965351-27-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: CPUID overhaul, fixes, and caching | expand

Commit Message

Sean Christopherson May 17, 2024, 5:39 p.m. UTC
Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
OR-in features that KVM emulates in software, i.e. that don't depend on
the feature being available in hardware.  The contained scope
of kvm_cpu_cap_init() allows using a local variable to track the set of
emulated leaves, which in addition to avoiding confusing and/or
unnecessary variables, helps prevent misuse of EMUL_F().

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

Comments

Maxim Levitsky July 5, 2024, 1:59 a.m. UTC | #1
On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
> OR-in features that KVM emulates in software, i.e. that don't depend on
> the feature being available in hardware.  The contained scope
> of kvm_cpu_cap_init() allows using a local variable to track the set of
> emulated leaves, which in addition to avoiding confusing and/or
> unnecessary variables, helps prevent misuse of EMUL_F().
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
>  1 file changed, 21 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 1064e4d68718..33e3e77de1b7 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -94,6 +94,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
>  	F(name);						\
>  })
>  
> +/*
> + * Emulated Feature - For features that KVM emulates in software irrespective
> + * of host CPU/kernel support.
> + */
> +#define EMUL_F(name)						\
> +({								\
> +	kvm_cpu_cap_emulated |= F(name);			\
> +	F(name);						\
> +})

To me it feels more and more that this patch series doesn't go into the right direction.

How about we just abandon the whole concept of masks and instead just have a list of statements

Pretty much the opposite of the patch series I confess:


#define CAP_PASSTHOUGH		0x01
#define CAP_EMULATED		0x02
#define CAP_AMD_ALIASED		0x04 // for AMD aliased features
#define CAP_SCATTERED		0x08
#define CAP_X86_64		0x10 // supported only on 64 bit hypervisors
...


/* CPUID_1_ECX*/

				/* TMA is not passed though because: xyz*/
kvm_cpu_cap_init(TMA,           0);

kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
				/* CNXT_ID is not passed though because: xyz*/
kvm_cpu_cap_init(CNXT_ID,       0);
kvm_cpu_cap_init(RESERVED,      0);
kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
...
				/* KVM always emulates TSC_ADJUST */
kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);

...

/* CPUID_D_1_EAX*/
				/* XFD is disabled on 32 bit systems because: xyz*/
kvm_cpu_cap_init(XFD, 		CAP_PASSTHOUGH | CAP_X86_64)


'kvm_cpu_cap_init' can be a macro if needed to have the compile checks.

There are several advantages to this:

- more readability, plus if needed each statement can be amended with a comment.
- No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit, which is confusing.
  In fact no need to even have them at all.
- No need to verify that bitmask belongs to a feature word.
- Merge friendly - each capability has its own line.

Disadvantages:

- Longer list - IMHO not a problem, since it is very easy to read / search
  and can have as much comments as needed.
  For example this is how the kernel lists the CPUID features and this list IMHO
  is very manageable.

- Slower - kvm_set_cpu_caps is called exactly once per KVM module load, thus
  performance is the last thing I would care about in this function.


Another note about this patch: It is somewhat confusing that EMUL_F just forces a feature in kvm caps,
regardless of CPU support, because KVM also has KVM_GET_EMULATED_CPUID and it has a different meaning.

Users can easily confuse the EMUL_F for something that sets a feature bit in the KVM_GET_EMULATED_CPUID.


Best regards,
	Maxim Levitsky



> +
>  /*
>   * Aliased Features - For features in 0x8000_0001.EDX that are duplicates of
>   * identical 0x1.EDX features, and thus are aliased from 0x1 to 0x8000_0001.
> @@ -649,6 +659,7 @@ do {									\
>  do {									\
>  	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
>  	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
> +	u32 kvm_cpu_cap_emulated = 0;					\
>  									\
>  	if (leaf < NCAPINTS)						\
>  		kvm_cpu_caps[leaf] &= (mask);				\
> @@ -656,6 +667,7 @@ do {									\
>  		kvm_cpu_caps[leaf] = (mask);				\
>  									\
>  	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);			\
> +	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
>  } while (0)
>  
>  /*
> @@ -684,12 +696,10 @@ void kvm_set_cpu_caps(void)
>  		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
>  		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
>  		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
> -		F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
> +		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
>  		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
>  		F(F16C) | F(RDRAND)
>  	);
> -	/* KVM emulates x2apic in software irrespective of host support. */
> -	kvm_cpu_cap_set(X86_FEATURE_X2APIC);
>  
>  	kvm_cpu_cap_init(CPUID_1_EDX,
>  		F(FPU) | F(VME) | F(DE) | F(PSE) |
> @@ -703,13 +713,13 @@ void kvm_set_cpu_caps(void)
>  	);
>  
>  	kvm_cpu_cap_init(CPUID_7_0_EBX,
> -		F(FSGSBASE) | F(SGX) | F(BMI1) | F(HLE) | F(AVX2) |
> -		F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) | F(INVPCID) |
> -		F(RTM) | F(ZERO_FCS_FDS) | 0 /*MPX*/ | F(AVX512F) |
> -		F(AVX512DQ) | F(RDSEED) | F(ADX) | F(SMAP) | F(AVX512IFMA) |
> -		F(CLFLUSHOPT) | F(CLWB) | 0 /*INTEL_PT*/ | F(AVX512PF) |
> -		F(AVX512ER) | F(AVX512CD) | F(SHA_NI) | F(AVX512BW) |
> -		F(AVX512VL));
> +		F(FSGSBASE) | EMUL_F(TSC_ADJUST) | F(SGX) | F(BMI1) | F(HLE) |
> +		F(AVX2) | F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) |
> +		F(INVPCID) | F(RTM) | F(ZERO_FCS_FDS) | 0 /*MPX*/ |
> +		F(AVX512F) | F(AVX512DQ) | F(RDSEED) | F(ADX) | F(SMAP) |
> +		F(AVX512IFMA) | F(CLFLUSHOPT) | F(CLWB) | 0 /*INTEL_PT*/ |
> +		F(AVX512PF) | F(AVX512ER) | F(AVX512CD) | F(SHA_NI) |
> +		F(AVX512BW) | F(AVX512VL));
>  
>  	kvm_cpu_cap_init(CPUID_7_ECX,
>  		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> @@ -728,16 +738,12 @@ void kvm_set_cpu_caps(void)
>  
>  	kvm_cpu_cap_init(CPUID_7_EDX,
>  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
> -		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
> +		F(SPEC_CTRL_SSBD) | EMUL_F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
>  		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
>  		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) |
>  		F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D)
>  	);
>  
> -	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
> -	kvm_cpu_cap_set(X86_FEATURE_TSC_ADJUST);
> -	kvm_cpu_cap_set(X86_FEATURE_ARCH_CAPABILITIES);
> -
>  	if (boot_cpu_has(X86_FEATURE_IBPB) && boot_cpu_has(X86_FEATURE_IBRS))
>  		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL);
>  	if (boot_cpu_has(X86_FEATURE_STIBP))
Sean Christopherson July 8, 2024, 10:30 p.m. UTC | #2
On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
> > OR-in features that KVM emulates in software, i.e. that don't depend on
> > the feature being available in hardware.  The contained scope
> > of kvm_cpu_cap_init() allows using a local variable to track the set of
> > emulated leaves, which in addition to avoiding confusing and/or
> > unnecessary variables, helps prevent misuse of EMUL_F().
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
> >  1 file changed, 21 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 1064e4d68718..33e3e77de1b7 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -94,6 +94,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
> >  	F(name);						\
> >  })
> >  
> > +/*
> > + * Emulated Feature - For features that KVM emulates in software irrespective
> > + * of host CPU/kernel support.
> > + */
> > +#define EMUL_F(name)						\
> > +({								\
> > +	kvm_cpu_cap_emulated |= F(name);			\
> > +	F(name);						\
> > +})
> 
> To me it feels more and more that this patch series doesn't go into the right
> direction.
> 
> How about we just abandon the whole concept of masks and instead just have a
> list of statements
> 
> Pretty much the opposite of the patch series I confess:

FWIW, I think it's actually largely the same code under the hood.  The code for
each concept/flavor ends up being very similar, it's mostly just handling the
bitwise-OR in the callers vs. in the helpers.

> #define CAP_PASSTHOUGH		0x01
> #define CAP_EMULATED		0x02
> #define CAP_AMD_ALIASED		0x04 // for AMD aliased features
> #define CAP_SCATTERED		0x08
> #define CAP_X86_64		0x10 // supported only on 64 bit hypervisors
> ...
> 
> 
> /* CPUID_1_ECX*/
> 
> 				/* TMA is not passed though because: xyz*/
> kvm_cpu_cap_init(TMA,           0);
> 
> kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
> 				/* CNXT_ID is not passed though because: xyz*/
> kvm_cpu_cap_init(CNXT_ID,       0);
> kvm_cpu_cap_init(RESERVED,      0);
> kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
> ...
> 				/* KVM always emulates TSC_ADJUST */
> kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);
> 
> ...
> 
> /* CPUID_D_1_EAX*/
> 				/* XFD is disabled on 32 bit systems because: xyz*/
> kvm_cpu_cap_init(XFD, 		CAP_PASSTHOUGH | CAP_X86_64)
> 
> 
> 'kvm_cpu_cap_init' can be a macro if needed to have the compile checks.
> 
> There are several advantages to this:
> 
> - more readability, plus if needed each statement can be amended with a comment.
> - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
>   which is confusing.
>   In fact no need to even have them at all.
> - No need to verify that bitmask belongs to a feature word.

Yes, but the downside is that there is no enforcement of features in a word being
bundled together.

> - Merge friendly - each capability has its own line.

That's almost entirely convention though.  Other than inertia, nothing is stopping
us from doing:

	kvm_cpu_cap_init(CPUID_12_EAX,
		SF(SGX1) |
		SF(SGX2) |
		SF(SGX_EDECCSSA)
	);

I don't see a clean way of avoiding the addition of " |" on the last existing
line, but in practice I highly doubt that will ever be a source of meaningful pain.

Same goes for the point about adding comments.  We could do that with either
approach, we just don't do so today.

> Disadvantages:
> 
> - Longer list - IMHO not a problem, since it is very easy to read / search
>   and can have as much comments as needed.
>   For example this is how the kernel lists the CPUID features and this list IMHO
>   is very manageable.

There's one big difference: KVM would need to have a line for every feature that
KVM _doesn't_ support.  For densely populated words, that's not a huge issue,
but it's problematic for sparsely populated words, e.g. CPUID_12_EAX would have
29 reserved/unsupport entries, which IMO ends up being a big net negative for
code readability and ongoing maintenance.

We could avoid that cost (and the danger of a missed bit) by collecting the set
of features that have been initialized for each word, and then masking off the
uninitialized/unsupported at the end.  But then we're back to the bitwise-OR and
mask logic.

And while I agree that having the F*() macros set state _and_ evaulate to a bit
is imperfect, it does have its advantages.  E.g. to avoid evaluating to a value,
we could have F() modify a local variable that is scoped to kvm_cpu_cap_init(),
a las kvm_cpu_cap_emulated.  But then we'd need explicit code and/or comments
to call out that VMM_F() and the like intentionally don't set kvm_cpu_cap_supported,
whereas evualating to a value is a relatively self-documenting "0;".

> - Slower - kvm_set_cpu_caps is called exactly once per KVM module load, thus
>   performance is the last thing I would care about in this function.
> 
> Another note about this patch: It is somewhat confusing that EMUL_F just
> forces a feature in kvm caps, regardless of CPU support, because KVM also has
> KVM_GET_EMULATED_CPUID and it has a different meaning.

Yeah, but IMO that's a problem with KVM_GET_EMULATED_CPUID being poorly defined.

> Users can easily confuse the EMUL_F for something that sets a feature bit in
> the KVM_GET_EMULATED_CPUID.

I'll see if I can find a good spot for a comment to try and convenient
Maxim Levitsky July 24, 2024, 5:58 p.m. UTC | #3
On Mon, 2024-07-08 at 15:30 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > > Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
> > > OR-in features that KVM emulates in software, i.e. that don't depend on
> > > the feature being available in hardware.  The contained scope
> > > of kvm_cpu_cap_init() allows using a local variable to track the set of
> > > emulated leaves, which in addition to avoiding confusing and/or
> > > unnecessary variables, helps prevent misuse of EMUL_F().
> > > 
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
> > >  1 file changed, 21 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index 1064e4d68718..33e3e77de1b7 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -94,6 +94,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
> > >  	F(name);						\
> > >  })
> > >  
> > > +/*
> > > + * Emulated Feature - For features that KVM emulates in software irrespective
> > > + * of host CPU/kernel support.
> > > + */
> > > +#define EMUL_F(name)						\
> > > +({								\
> > > +	kvm_cpu_cap_emulated |= F(name);			\
> > > +	F(name);						\
> > > +})
> > 
> > To me it feels more and more that this patch series doesn't go into the right
> > direction.
> > 
> > How about we just abandon the whole concept of masks and instead just have a
> > list of statements
> > 
> > Pretty much the opposite of the patch series I confess:
> 
> FWIW, I think it's actually largely the same code under the hood.  The code for
> each concept/flavor ends up being very similar, it's mostly just handling the
> bitwise-OR in the callers vs. in the helpers.
> 
> > #define CAP_PASSTHOUGH		0x01
> > #define CAP_EMULATED		0x02
> > #define CAP_AMD_ALIASED		0x04 // for AMD aliased features
> > #define CAP_SCATTERED		0x08
> > #define CAP_X86_64		0x10 // supported only on 64 bit hypervisors
> > ...
> > 
> > 
> > /* CPUID_1_ECX*/
> > 
> > 				/* TMA is not passed though because: xyz*/
> > kvm_cpu_cap_init(TMA,           0);
> > 
> > kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
> > 				/* CNXT_ID is not passed though because: xyz*/
> > kvm_cpu_cap_init(CNXT_ID,       0);
> > kvm_cpu_cap_init(RESERVED,      0);
> > kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
> > ...
> > 				/* KVM always emulates TSC_ADJUST */
> > kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);
> > 
> > ...
> > 
> > /* CPUID_D_1_EAX*/
> > 				/* XFD is disabled on 32 bit systems because: xyz*/
> > kvm_cpu_cap_init(XFD, 		CAP_PASSTHOUGH | CAP_X86_64)
> > 
> > 
> > 'kvm_cpu_cap_init' can be a macro if needed to have the compile checks.
> > 
> > There are several advantages to this:
> > 
> > - more readability, plus if needed each statement can be amended with a comment.
> > - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
> >   which is confusing.
> >   In fact no need to even have them at all.
> > - No need to verify that bitmask belongs to a feature word.
> 
> Yes, but the downside is that there is no enforcement of features in a word being
> bundled together.

As I explained earlier, this is not an issue in principle, even if the caps are not
grouped together, the code will still work just fine.


kvm_cpu_cap_init_begin(CPUID_1_ECX);
                                /* TMA is not passed though because: xyz*/
kvm_cpu_cap_init(TMA,           0);
kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
                                /* CNXT_ID is not passed though because: xyz*/
kvm_cpu_cap_init(CNXT_ID,       0);
kvm_cpu_cap_init(RESERVED,      0);
kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
...
                                /* KVM always emulates TSC_ADJUST */
kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);

kvm_cpu_cap_init_end(CPUID_1_ECX);

...

...

And kvm_cpu_cap_init_begin, can set some cap_in_progress variable.



> 
> > - Merge friendly - each capability has its own line.
> 
> That's almost entirely convention though.  Other than inertia, nothing is stopping
> us from doing:
> 
> 	kvm_cpu_cap_init(CPUID_12_EAX,
> 		SF(SGX1) |
> 		SF(SGX2) |
> 		SF(SGX_EDECCSSA)

That trivial change is already an improvement, although it still leaves the problem
of thinking that this is one bit 'or', which was reasonable before this patch series,
because it was indeed one big 'or' but now there is lots of things going on behind
the scenes and that violates the principle of the least surprise.

My suggestion fixes this, because when the user sees a series of function calls,
and nobody will assume anything about these functions calls in contrast with series
of 'ors'. It's just how I look at it.

> 	);
> 
> I don't see a clean way of avoiding the addition of " |" on the last existing
> line, but in practice I highly doubt that will ever be a source of meaningful pain.
> 
> Same goes for the point about adding comments.  We could do that with either
> approach, we just don't do so today.

Yes, from the syntax POV there is indeed no problem, and I do agree that putting
each feature on its own line, together with comments for the features that need it
is a win-win improvement over what we have after this patch series.

> 
> > Disadvantages:
> > 
> > - Longer list - IMHO not a problem, since it is very easy to read / search
> >   and can have as much comments as needed.
> >   For example this is how the kernel lists the CPUID features and this list IMHO
> >   is very manageable.
> 
> There's one big difference: KVM would need to have a line for every feature that
> KVM _doesn't_ support.

Could you elaborate on why?
If we zero the whole leaf and then set specific bits there, one bit per kvm_cpu_cap_init.



>   For densely populated words, that's not a huge issue,
> but it's problematic for sparsely populated words, e.g. CPUID_12_EAX would have
> 29 reserved/unsupport entries, which IMO ends up being a big net negative for
> code readability and ongoing maintenance.
> 
> We could avoid that cost (and the danger of a missed bit) by collecting the set
> of features that have been initialized for each word, and then masking off the
> uninitialized/unsupported at the end.  But then we're back to the bitwise-OR and
> mask logic.
> 
> And while I agree that having the F*() macros set state _and_ evaulate to a bit
> is imperfect, it does have its advantages.  E.g. to avoid evaluating to a value,
> we could have F() modify a local variable that is scoped to kvm_cpu_cap_init(),
> a las kvm_cpu_cap_emulated.  But then we'd need explicit code and/or comments
> to call out that VMM_F() and the like intentionally don't set kvm_cpu_cap_supported,
> whereas evualating to a value is a relatively self-documenting "0;".
> 
> > - Slower - kvm_set_cpu_caps is called exactly once per KVM module load, thus
> >   performance is the last thing I would care about in this function.
> > 
> > Another note about this patch: It is somewhat confusing that EMUL_F just
> > forces a feature in kvm caps, regardless of CPU support, because KVM also has
> > KVM_GET_EMULATED_CPUID and it has a different meaning.
> 
> Yeah, but IMO that's a problem with KVM_GET_EMULATED_CPUID being poorly defined.
> 
> > Users can easily confuse the EMUL_F for something that sets a feature bit in
> > the KVM_GET_EMULATED_CPUID.
> 
> I'll see if I can find a good spot for a comment to try and convenient


Best regards,
	Maxim Levitsky
>
Sean Christopherson July 27, 2024, 12:06 a.m. UTC | #4
On Wed, Jul 24, 2024, Maxim Levitsky wrote:
> On Mon, 2024-07-08 at 15:30 -0700, Sean Christopherson wrote:
> > On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > > There are several advantages to this:
> > > 
> > > - more readability, plus if needed each statement can be amended with a comment.
> > > - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
> > >   which is confusing.
> > >   In fact no need to even have them at all.
> > > - No need to verify that bitmask belongs to a feature word.
> > 
> > Yes, but the downside is that there is no enforcement of features in a word being
> > bundled together.
> 
> As I explained earlier, this is not an issue in principle, even if the caps are not
> grouped together, the code will still work just fine.

I agree that functionally it'll all be fine, but I also want the code to bunch
things together for readers.  We can force that with functions, though it means
passing in more state to kvm_cpu_cap_init_{begin,end}().

> kvm_cpu_cap_init_begin(CPUID_1_ECX);
>                                 /* TMA is not passed though because: xyz*/
> kvm_cpu_cap_init(TMA,           0);
> kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
>                                 /* CNXT_ID is not passed though because: xyz*/
> kvm_cpu_cap_init(CNXT_ID,       0);
> kvm_cpu_cap_init(RESERVED,      0);
> kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
> ...
>                                 /* KVM always emulates TSC_ADJUST */
> kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);
> 
> kvm_cpu_cap_init_end(CPUID_1_ECX);
> 
> ...
> 
> ...
> 
> And kvm_cpu_cap_init_begin, can set some cap_in_progress variable.

Ya, but then compile-time asserts become run-time asserts.

> > > - Merge friendly - each capability has its own line.
> > 
> > That's almost entirely convention though.  Other than inertia, nothing is stopping
> > us from doing:
> > 
> > 	kvm_cpu_cap_init(CPUID_12_EAX,
> > 		SF(SGX1) |
> > 		SF(SGX2) |
> > 		SF(SGX_EDECCSSA)
> 
> That trivial change is already an improvement, although it still leaves the problem
> of thinking that this is one bit 'or', which was reasonable before this patch series,
> because it was indeed one big 'or' but now there is lots of things going on behind
> the scenes and that violates the principle of the least surprise.
> 
> My suggestion fixes this, because when the user sees a series of function calls,
> and nobody will assume anything about these functions calls in contrast with series
> of 'ors'. It's just how I look at it.

If it's the macro styling that's misleading, we could do what we did for the
static_call() wrappers and make them look like functions.  E.g.

	kvm_cpu_cap_init(CPUID_12_EAX,
		scattered_f(SGX1) |
		scattered_f(SGX2) |
		scattered_f(SGX_EDECCSSA)
	);

though that probably doesn't help much and is misleading in its own right.  Does
it help if the names are more verbose? 
 
> > 	);
> > 
> > I don't see a clean way of avoiding the addition of " |" on the last existing
> > line, but in practice I highly doubt that will ever be a source of meaningful pain.
> > 
> > Same goes for the point about adding comments.  We could do that with either
> > approach, we just don't do so today.
> 
> Yes, from the syntax POV there is indeed no problem, and I do agree that putting
> each feature on its own line, together with comments for the features that need it
> is a win-win improvement over what we have after this patch series.
> 
> > 
> > > Disadvantages:
> > > 
> > > - Longer list - IMHO not a problem, since it is very easy to read / search
> > >   and can have as much comments as needed.
> > >   For example this is how the kernel lists the CPUID features and this list IMHO
> > >   is very manageable.
> > 
> > There's one big difference: KVM would need to have a line for every feature that
> > KVM _doesn't_ support.
> 
> Could you elaborate on why?
> If we zero the whole leaf and then set specific bits there, one bit per kvm_cpu_cap_init.

Ah, if we move the the handling of boot_cpu_data[*] into the helpers, then yes,
there's no need to explicitly initialize features that aren't supported by KVM.

That said, I still don't like using functions instead of macros, mainly because
a number of compile-assertions become run-time assertions.  To provide equivalent
functionality, we also would need to pass in extra state to begin/end() (as
mentioned earlier).  Getting compile-time assertions on usage, e.g. via
guest_cpu_cap_has(), would also be trickier, though still doable, I think.
Lastly, it adds an extra step (calling _end()) to each flow, i.e. adds one more
thing for developers to mess up.  But that's a very minor concern and definitely
not a sticking point.

I agree that the macro shenanigans are aggressively clever, but for me, the
benefits of compile-time asserts make it worth dealing with the cleverness.

[*] https://lore.kernel.org/all/ZqKlDC11gItH1uj9@google.com
Maxim Levitsky Aug. 5, 2024, 11:16 a.m. UTC | #5
У пт, 2024-07-26 у 17:06 -0700, Sean Christopherson пише:
> > On Wed, Jul 24, 2024, Maxim Levitsky wrote:
> > > > On Mon, 2024-07-08 at 15:30 -0700, Sean Christopherson wrote:
> > > > > > On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > > > > > > > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > > > > > > > There are several advantages to this:
> > > > > > > > 
> > > > > > > > - more readability, plus if needed each statement can be amended with a comment.
> > > > > > > > - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
> > > > > > > >   which is confusing.
> > > > > > > >   In fact no need to even have them at all.
> > > > > > > > - No need to verify that bitmask belongs to a feature word.
> > > > > > 
> > > > > > Yes, but the downside is that there is no enforcement of features in a word being
> > > > > > bundled together.
> > > > 
> > > > As I explained earlier, this is not an issue in principle, even if the caps are not
> > > > grouped together, the code will still work just fine.
> > 
> > I agree that functionally it'll all be fine, but I also want the code to bunch
> > things together for readers.  We can force that with functions, though it means
> > passing in more state to kvm_cpu_cap_init_{begin,end}().
> > 
> > > > kvm_cpu_cap_init_begin(CPUID_1_ECX);
> > > >                                 /* TMA is not passed though because: xyz*/
> > > > kvm_cpu_cap_init(TMA,           0);
> > > > kvm_cpu_cap_init(SSSE3,         CAP_PASSTHOUGH);
> > > >                                 /* CNXT_ID is not passed though because: xyz*/
> > > > kvm_cpu_cap_init(CNXT_ID,       0);
> > > > kvm_cpu_cap_init(RESERVED,      0);
> > > > kvm_cpu_cap_init(FMA,           CAP_PASSTHOUGH);
> > > > ...
> > > >                                 /* KVM always emulates TSC_ADJUST */
> > > > kvm_cpu_cap_init(TSC_ADJUST,    CAP_EMULATED | CAP_SCATTERED);
> > > > 
> > > > kvm_cpu_cap_init_end(CPUID_1_ECX);
> > > > 
> > > > ...
> > > > 
> > > > ...
> > > > 
> > > > And kvm_cpu_cap_init_begin, can set some cap_in_progress variable.
> > 
> > Ya, but then compile-time asserts become run-time asserts.

Not really, it all can be done with macros, in exactly the same way IMHO,
we do have BUILD_BUG_ON after all.

I am not against using macros, I am only against collecting a bitmask
while applying various side effects, and then passing the bitmask to
the kvm_cpu_cap_init.

> > 
> > > > > > > > - Merge friendly - each capability has its own line.
> > > > > > 
> > > > > > That's almost entirely convention though.  Other than inertia, nothing is stopping
> > > > > > us from doing:
> > > > > > 
> > > > > >         kvm_cpu_cap_init(CPUID_12_EAX,
> > > > > >                 SF(SGX1) |
> > > > > >                 SF(SGX2) |
> > > > > >                 SF(SGX_EDECCSSA)
> > > > 
> > > > That trivial change is already an improvement, although it still leaves the problem
> > > > of thinking that this is one bit 'or', which was reasonable before this patch series,
> > > > because it was indeed one big 'or' but now there is lots of things going on behind
> > > > the scenes and that violates the principle of the least surprise.
> > > > 
> > > > My suggestion fixes this, because when the user sees a series of function calls,
> > > > and nobody will assume anything about these functions calls in contrast with series
> > > > of 'ors'. It's just how I look at it.
> > 
> > If it's the macro styling that's misleading, we could do what we did for the
> > static_call() wrappers and make them look like functions.  E.g.
> > 
> >         kvm_cpu_cap_init(CPUID_12_EAX,
> >                 scattered_f(SGX1) |
> >                 scattered_f(SGX2) |
> >                 scattered_f(SGX_EDECCSSA)
> >         );



> > 
> > though that probably doesn't help much and is misleading in its own right.  Does
> > it help if the names are more verbose? 

Verbose names are a good thing, I already mentioned this.

> >  
> > > > > >         );
> > > > > > 
> > > > > > I don't see a clean way of avoiding the addition of " |" on the last existing
> > > > > > line, but in practice I highly doubt that will ever be a source of meaningful pain.
> > > > > > 
> > > > > > Same goes for the point about adding comments.  We could do that with either
> > > > > > approach, we just don't do so today.
> > > > 
> > > > Yes, from the syntax POV there is indeed no problem, and I do agree that putting
> > > > each feature on its own line, together with comments for the features that need it
> > > > is a win-win improvement over what we have after this patch series.
> > > > 
> > > > > > 
> > > > > > > > Disadvantages:
> > > > > > > > 
> > > > > > > > - Longer list - IMHO not a problem, since it is very easy to read / search
> > > > > > > >   and can have as much comments as needed.
> > > > > > > >   For example this is how the kernel lists the CPUID features and this list IMHO
> > > > > > > >   is very manageable.
> > > > > > 
> > > > > > There's one big difference: KVM would need to have a line for every feature that
> > > > > > KVM _doesn't_ support.
> > > > 
> > > > Could you elaborate on why?
> > > > If we zero the whole leaf and then set specific bits there, one bit per kvm_cpu_cap_init.
> > 
> > Ah, if we move the the handling of boot_cpu_data[*] into the helpers, then yes,
> > there's no need to explicitly initialize features that aren't supported by KVM.
> > 
> > That said, I still don't like using functions instead of macros, mainly because
> > a number of compile-assertions become run-time assertions.

I'm almost sure that we can do everything with compile time asserts with series of functions.




> >   To provide equivalent
> > functionality, we also would need to pass in extra state to begin/end() (as
> > mentioned earlier).

Besides the number of leaf currently initialized, I don't see which other extra state we need.

In fact I can prove that this is possible:

Roughly like this:

#define kvm_cpu_cap_init_begin(leaf)							\
do {											\
 const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf; 				\
 u32 kvm_cpu_cap_emulated = 0; 								\
 u32 kvm_cpu_cap_synthesized = 0; 							\
	u32 kvm_cpu_cap_regular = 0;


#define feature_scattered(name) 							\
 BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES); 					\
 KVM_VALIDATE_CPU_CAP_USAGE(name); 							\
											\
	if (boot_cpu_has(X86_FEATURE_##name) 						\
		kvm_cpu_cap_regular |= feature_bit(name);


#define kvm_cpu_cap_init_end() 								\
	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);			\
											\
	if (kvm_cpu_cap_init_in_progress < NCAPINTS) 					\
 		kvm_cpu_caps[kvm_cpu_cap_init_in_progress] &= kvm_cpu_cap_regular; 	\
 	else 										\
 		kvm_cpu_caps[kvm_cpu_cap_init_in_progress] = kvm_cpu_cap_regular; 	\
 											\
 	kvm_cpu_caps[kvm_cpu_cap_init_in_progress] &= (raw_cpuid_get(cpuid) | 		\
 	kvm_cpu_cap_synthesized); 							\
 	kvm_cpu_caps[kvm_cpu_cap_init_in_progress] |= kvm_cpu_cap_emulated; 		\
} while(0);


And now we have:

kvm_cpu_cap_init_begin(CPUID_12_EAX);
 feature_scattered(SGX1);
 feature_scattered(SGX2);
 feature_scattered(SGX_EDECCSSA);
kvm_cpu_cap_init_end();

In my book this looks much less misleading than the current version - I didn't put
much effort into naming variables though, the kvm_cpu_cap_regular name can be better IMHO.


Best regards,
	Maxim Levitsky


> >   Getting compile-time assertions on usage, e.g. via
> > guest_cpu_cap_has(), would also be trickier, though still doable, I think.
> > Lastly, it adds an extra step (calling _end()) to each flow, i.e. adds one more
> > thing for developers to mess up.  But that's a very minor concern and definitely
> > not a sticking point.
> > 
> > I agree that the macro shenanigans are aggressively clever, but for me, the
> > benefits of compile-time asserts make it worth dealing with the cleverness.
> > 
> > [*] https://lore.kernel.org/all/ZqKlDC11gItH1uj9@google.com
> >
Sean Christopherson Aug. 5, 2024, 7:59 p.m. UTC | #6
On Mon, Aug 05, 2024, mlevitsk@redhat.com wrote:
> У пт, 2024-07-26 у 17:06 -0700, Sean Christopherson пише:
> > > > > And kvm_cpu_cap_init_begin, can set some cap_in_progress variable.
> > > 
> > > Ya, but then compile-time asserts become run-time asserts.
> 
> Not really, it all can be done with macros, in exactly the same way IMHO,
> we do have BUILD_BUG_ON after all.
> 
> I am not against using macros, I am only against collecting a bitmask
> while applying various side effects, and then passing the bitmask to
> the kvm_cpu_cap_init.

Gah, I wasn't grokking that, obviously.  Sorry for not catching on earlier.

> > > To provide equivalent functionality, we also would need to pass in extra
> > > state to begin/end() (as mentioned earlier).
> 
> Besides the number of leaf currently initialized, I don't see which other
> extra state we need.
> 
> In fact I can prove that this is possible:
> 
> Roughly like this:
> 
> #define kvm_cpu_cap_init_begin(leaf)							\
> do {											\
>  const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf; 				\
>  u32 kvm_cpu_cap_emulated = 0; 								\
>  u32 kvm_cpu_cap_synthesized = 0; 							\
> 	u32 kvm_cpu_cap_regular = 0;

Maybe "virtualized" instead of "regular"?

> #define feature_scattered(name) 							\
>  BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES); 					\
>  KVM_VALIDATE_CPU_CAP_USAGE(name); 							\
> 											\
> 	if (boot_cpu_has(X86_FEATURE_##name) 						\
> 		kvm_cpu_cap_regular |= feature_bit(name);
> 
> 
> #define kvm_cpu_cap_init_end() 								\
> 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);			\
> 											\
> 	if (kvm_cpu_cap_init_in_progress < NCAPINTS) 					\
>  		kvm_cpu_caps[kvm_cpu_cap_init_in_progress] &= kvm_cpu_cap_regular; 	\
>  	else 										\
>  		kvm_cpu_caps[kvm_cpu_cap_init_in_progress] = kvm_cpu_cap_regular; 	\
>  											\
>  	kvm_cpu_caps[kvm_cpu_cap_init_in_progress] &= (raw_cpuid_get(cpuid) | 		\
>  	kvm_cpu_cap_synthesized); 							\
>  	kvm_cpu_caps[kvm_cpu_cap_init_in_progress] |= kvm_cpu_cap_emulated; 		\
> } while(0);
> 
> 
> And now we have:
> 
> kvm_cpu_cap_init_begin(CPUID_12_EAX);
>  feature_scattered(SGX1);
>  feature_scattered(SGX2);
>  feature_scattered(SGX_EDECCSSA);
> kvm_cpu_cap_init_end();

I don't love the syntax (mainly the need for a begin()+end()), but I'm a-ok
getting rid of the @mask param/input.

What about making kvm_cpu_cap_init() a variadic macro, with the relevant features
"unpacked" in the context of the macro?  That would avoid the need for a trailing
macro, and would provide a clear indication of when/where the set of features is
"initialized".

The biggest downside I see is that the last entry can't have a trailing comma,
i.e. adding a new feature would require updating the previous feature too.

#define kvm_cpu_cap_init(leaf, init_features...)			\
do {									\
	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
	u32 kvm_cpu_cap_virtualized= 0;					\
	u32 kvm_cpu_cap_emulated = 0;					\
	u32 kvm_cpu_cap_synthesized = 0;				\
									\
	init_features;							\
									\
	kvm_cpu_caps[leaf] = kvm_cpu_cap_virtualized;			\
	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
			       kvm_cpu_cap_synthesized);		\
	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
} while (0)

	kvm_cpu_cap_init(CPUID_1_ECX,
		VIRTUALIZED_F(XMM3),
		VIRTUALIZED_F(PCLMULQDQ),
		VIRTUALIZED_F(SSSE3),
		VIRTUALIZED_F(FMA),
		VIRTUALIZED_F(CX16),
		VIRTUALIZED_F(PDCM),
		VIRTUALIZED_F(PCID),
		VIRTUALIZED_F(XMM4_1),
		VIRTUALIZED_F(XMM4_2),
		EMULATED_F(X2APIC),
		VIRTUALIZED_F(MOVBE),
		VIRTUALIZED_F(POPCNT),
		EMULATED_F(TSC_DEADLINE_TIMER),
		VIRTUALIZED_F(AES),
		VIRTUALIZED_F(XSAVE),
		// DYNAMIC_F(OSXSAVE),
		VIRTUALIZED_F(AVX),
		VIRTUALIZED_F(F16C),
		VIRTUALIZED_F(RDRAND),
		EMULATED_F(HYPERVISOR)
	);


Alternatively, we could force a trailing comma by omitting the semicolon after
init_features, but that looks weird for the the macro itself, and arguably a bit
weird for the users too.
Maxim Levitsky Sept. 10, 2024, 8:41 p.m. UTC | #7
On Mon, 2024-08-05 at 12:59 -0700, Sean Christopherson wrote:
> On Mon, Aug 05, 2024, mlevitsk@redhat.com wrote:
> > У пт, 2024-07-26 у 17:06 -0700, Sean Christopherson пише:
> > > > > > And kvm_cpu_cap_init_begin, can set some cap_in_progress variable.
> > > > 
> > > > Ya, but then compile-time asserts become run-time asserts.
> > 
> > Not really, it all can be done with macros, in exactly the same way IMHO,
> > we do have BUILD_BUG_ON after all.
> > 
> > I am not against using macros, I am only against collecting a bitmask
> > while applying various side effects, and then passing the bitmask to
> > the kvm_cpu_cap_init.
> 
> Gah, I wasn't grokking that, obviously.  Sorry for not catching on earlier.
> 
> > > > To provide equivalent functionality, we also would need to pass in extra
> > > > state to begin/end() (as mentioned earlier).
> > 
> > Besides the number of leaf currently initialized, I don't see which other
> > extra state we need.
> > 
> > In fact I can prove that this is possible:
> > 
> > Roughly like this:
> > 
> > #define kvm_cpu_cap_init_begin(leaf)							\
> > do {											\
> >  const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf; 				\
> >  u32 kvm_cpu_cap_emulated = 0; 								\
> >  u32 kvm_cpu_cap_synthesized = 0; 							\
> > 	u32 kvm_cpu_cap_regular = 0;
> 
> Maybe "virtualized" instead of "regular"?
> 
> > #define feature_scattered(name) 							\
> >  BUILD_BUG_ON(X86_FEATURE_##name >= MAX_CPU_FEATURES); 					\
> >  KVM_VALIDATE_CPU_CAP_USAGE(name); 							\
> > 											\
> > 	if (boot_cpu_has(X86_FEATURE_##name) 						\
> > 		kvm_cpu_cap_regular |= feature_bit(name);
> > 
> > 
> > #define kvm_cpu_cap_init_end() 								\
> > 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);			\
> > 											\
> > 	if (kvm_cpu_cap_init_in_progress < NCAPINTS) 					\
> >  		kvm_cpu_caps[kvm_cpu_cap_init_in_progress] &= kvm_cpu_cap_regular; 	\
> >  	else 										\
> >  		kvm_cpu_caps[kvm_cpu_cap_init_in_progress] = kvm_cpu_cap_regular; 	\
> >  											\
> >  	kvm_cpu_caps[kvm_cpu_cap_init_in_progress] &= (raw_cpuid_get(cpuid) | 		\
> >  	kvm_cpu_cap_synthesized); 							\
> >  	kvm_cpu_caps[kvm_cpu_cap_init_in_progress] |= kvm_cpu_cap_emulated; 		\
> > } while(0);
> > 
> > 
> > And now we have:
> > 
> > kvm_cpu_cap_init_begin(CPUID_12_EAX);
> >  feature_scattered(SGX1);
> >  feature_scattered(SGX2);
> >  feature_scattered(SGX_EDECCSSA);
> > kvm_cpu_cap_init_end();
> 
> I don't love the syntax (mainly the need for a begin()+end()), but I'm a-ok
> getting rid of the @mask param/input.
> 
> What about making kvm_cpu_cap_init() a variadic macro, with the relevant features
> "unpacked" in the context of the macro?  That would avoid the need for a trailing
> macro, and would provide a clear indication of when/where the set of features is
> "initialized".
> 
> The biggest downside I see is that the last entry can't have a trailing comma,
> i.e. adding a new feature would require updating the previous feature too.
> 
> #define kvm_cpu_cap_init(leaf, init_features...)			\
> do {									\
> 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
> 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
> 	u32 kvm_cpu_cap_virtualized= 0;					\
> 	u32 kvm_cpu_cap_emulated = 0;					\
> 	u32 kvm_cpu_cap_synthesized = 0;				\
> 									\
> 	init_features;							\
> 									\
> 	kvm_cpu_caps[leaf] = kvm_cpu_cap_virtualized;			\
> 	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
> 			       kvm_cpu_cap_synthesized);		\
> 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
> } while (0)
> 
> 	kvm_cpu_cap_init(CPUID_1_ECX,
> 		VIRTUALIZED_F(XMM3),
> 		VIRTUALIZED_F(PCLMULQDQ),
> 		VIRTUALIZED_F(SSSE3),
> 		VIRTUALIZED_F(FMA),
> 		VIRTUALIZED_F(CX16),
> 		VIRTUALIZED_F(PDCM),
> 		VIRTUALIZED_F(PCID),
> 		VIRTUALIZED_F(XMM4_1),
> 		VIRTUALIZED_F(XMM4_2),
> 		EMULATED_F(X2APIC),
> 		VIRTUALIZED_F(MOVBE),
> 		VIRTUALIZED_F(POPCNT),
> 		EMULATED_F(TSC_DEADLINE_TIMER),
> 		VIRTUALIZED_F(AES),
> 		VIRTUALIZED_F(XSAVE),
> 		// DYNAMIC_F(OSXSAVE),
> 		VIRTUALIZED_F(AVX),
> 		VIRTUALIZED_F(F16C),
> 		VIRTUALIZED_F(RDRAND),
> 		EMULATED_F(HYPERVISOR)
> 	);

Hi,

This is no doubt better than using '|'.

I still strongly prefer my version, because I don't really like the fact that _F 
macros have side effects, and yet passed as parameters to the kvm_cpu_cap_init function/macro.

Basically an unwritten rule, which I consider very important and because of which
I raised my concerns over this patch series is that if a function has side effects,
it should not be used as a parameter to another function, instead, it should be 
called explicitly on its own.

If you strongly prefer the variadic macro over my begin/end API, I can live with
that though, it is still better than '|'ing a mask with functions that have side
effects.

Best regards,
	Maxim Levitsky


> 
> 
> Alternatively, we could force a trailing comma by omitting the semicolon after
> init_features, but that looks weird for the the macro itself, and arguably a bit
> weird for the users too.
>
Sean Christopherson Sept. 11, 2024, 4:03 p.m. UTC | #8
On Tue, Sep 10, 2024, Maxim Levitsky wrote:
> On Mon, 2024-08-05 at 12:59 -0700, Sean Christopherson wrote:
> > > And now we have:
> > > 
> > > kvm_cpu_cap_init_begin(CPUID_12_EAX);
> > >  feature_scattered(SGX1);
> > >  feature_scattered(SGX2);
> > >  feature_scattered(SGX_EDECCSSA);
> > > kvm_cpu_cap_init_end();
> > 
> > I don't love the syntax (mainly the need for a begin()+end()), but I'm a-ok
> > getting rid of the @mask param/input.
> > 
> > What about making kvm_cpu_cap_init() a variadic macro, with the relevant features
> > "unpacked" in the context of the macro?  That would avoid the need for a trailing
> > macro, and would provide a clear indication of when/where the set of features is
> > "initialized".
> > 
> > The biggest downside I see is that the last entry can't have a trailing comma,
> > i.e. adding a new feature would require updating the previous feature too.
> > 
> > #define kvm_cpu_cap_init(leaf, init_features...)			\
> > do {									\
> > 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
> > 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
> > 	u32 kvm_cpu_cap_virtualized= 0;					\
> > 	u32 kvm_cpu_cap_emulated = 0;					\
> > 	u32 kvm_cpu_cap_synthesized = 0;				\
> > 									\
> > 	init_features;							\
> > 									\
> > 	kvm_cpu_caps[leaf] = kvm_cpu_cap_virtualized;			\
> > 	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
> > 			       kvm_cpu_cap_synthesized);		\
> > 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
> > } while (0)
> > 
> > 	kvm_cpu_cap_init(CPUID_1_ECX,
> > 		VIRTUALIZED_F(XMM3),
> > 		VIRTUALIZED_F(PCLMULQDQ),
> > 		VIRTUALIZED_F(SSSE3),
> > 		VIRTUALIZED_F(FMA),
> > 		VIRTUALIZED_F(CX16),
> > 		VIRTUALIZED_F(PDCM),
> > 		VIRTUALIZED_F(PCID),
> > 		VIRTUALIZED_F(XMM4_1),
> > 		VIRTUALIZED_F(XMM4_2),
> > 		EMULATED_F(X2APIC),
> > 		VIRTUALIZED_F(MOVBE),
> > 		VIRTUALIZED_F(POPCNT),
> > 		EMULATED_F(TSC_DEADLINE_TIMER),
> > 		VIRTUALIZED_F(AES),
> > 		VIRTUALIZED_F(XSAVE),
> > 		// DYNAMIC_F(OSXSAVE),
> > 		VIRTUALIZED_F(AVX),
> > 		VIRTUALIZED_F(F16C),
> > 		VIRTUALIZED_F(RDRAND),
> > 		EMULATED_F(HYPERVISOR)
> > 	);
> 
> Hi,
> 
> This is no doubt better than using '|'.
> 
> I still strongly prefer my version, because I don't really like the fact that
> _F macros have side effects, and yet passed as parameters to the
> kvm_cpu_cap_init function/macro.
> 
> Basically an unwritten rule, which I consider very important and because of which
> I raised my concerns over this patch series is that if a function has side effects,
> it should not be used as a parameter to another function, instead, it should be 
> called explicitly on its own.

Splitting hairs to some degree, but the above suggestion is distinctly different
than passing the _result_ of a function call as a parameter to another function.
The actual "call" happens within the body of kvm_cpu_cap_init().  

This is effectively the same as passing a function pointer to a helper, and that
function pointer implementation having side effects, which is quite common in the
kernel and KVM, e.g. msr_access_t, rmap_handler_t, tdp_handler_t, gfn_handler_t,
on_lock_fn_t, etc.

I 100% agree that it's unusual and subtle to essentially have a variable number
of function pointers, but I don't see it as being an inherently bad pattern,
especially since it is practically impossible to misuse _because_ the macro
unpacks the "calls" at compile time.

IMO, the part that is most gross is the macros operating on local variables, but
that behavior exists in all ideas we've discussed, probably because I'm pretty
sure it's unavoidable unless we do something even worse (way, waaaaay worse).

E.g. we could add 32 versions of kvm_cpu_cap_init() that invoke pairs of parameters
and pass in the variables

  fn1(f1, virtualized, emulated, synthesized)
  fn2(f2, virtualized, emulated, synthesized)
  fn3(f3, virtualized, emulated, synthesized)
  ...
  fnN(fN, virtualized, emulated, synthesized)

and

  kvm_cpu_cap_init19(CPUID_1_ECX,
	F, XMM3,
	F, PCLMULQDQ,
	F, SSE3,
	...
	EMULATED_F, HYPERVISOR
  );

But that's beyond horrific :-)

> If you strongly prefer the variadic macro over my begin/end API, I can live with
> that though, it is still better than '|'ing a mask with functions that have side
> effects.
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 1064e4d68718..33e3e77de1b7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -94,6 +94,16 @@  u32 xstate_required_size(u64 xstate_bv, bool compacted)
 	F(name);						\
 })
 
+/*
+ * Emulated Feature - For features that KVM emulates in software irrespective
+ * of host CPU/kernel support.
+ */
+#define EMUL_F(name)						\
+({								\
+	kvm_cpu_cap_emulated |= F(name);			\
+	F(name);						\
+})
+
 /*
  * Aliased Features - For features in 0x8000_0001.EDX that are duplicates of
  * identical 0x1.EDX features, and thus are aliased from 0x1 to 0x8000_0001.
@@ -649,6 +659,7 @@  do {									\
 do {									\
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
+	u32 kvm_cpu_cap_emulated = 0;					\
 									\
 	if (leaf < NCAPINTS)						\
 		kvm_cpu_caps[leaf] &= (mask);				\
@@ -656,6 +667,7 @@  do {									\
 		kvm_cpu_caps[leaf] = (mask);				\
 									\
 	kvm_cpu_caps[leaf] &= raw_cpuid_get(cpuid);			\
+	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
 } while (0)
 
 /*
@@ -684,12 +696,10 @@  void kvm_set_cpu_caps(void)
 		0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
 		F(FMA) | F(CX16) | 0 /* xTPR Update */ | F(PDCM) |
 		F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
-		F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
+		F(XMM4_2) | EMUL_F(X2APIC) | F(MOVBE) | F(POPCNT) |
 		0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
 		F(F16C) | F(RDRAND)
 	);
-	/* KVM emulates x2apic in software irrespective of host support. */
-	kvm_cpu_cap_set(X86_FEATURE_X2APIC);
 
 	kvm_cpu_cap_init(CPUID_1_EDX,
 		F(FPU) | F(VME) | F(DE) | F(PSE) |
@@ -703,13 +713,13 @@  void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_init(CPUID_7_0_EBX,
-		F(FSGSBASE) | F(SGX) | F(BMI1) | F(HLE) | F(AVX2) |
-		F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) | F(INVPCID) |
-		F(RTM) | F(ZERO_FCS_FDS) | 0 /*MPX*/ | F(AVX512F) |
-		F(AVX512DQ) | F(RDSEED) | F(ADX) | F(SMAP) | F(AVX512IFMA) |
-		F(CLFLUSHOPT) | F(CLWB) | 0 /*INTEL_PT*/ | F(AVX512PF) |
-		F(AVX512ER) | F(AVX512CD) | F(SHA_NI) | F(AVX512BW) |
-		F(AVX512VL));
+		F(FSGSBASE) | EMUL_F(TSC_ADJUST) | F(SGX) | F(BMI1) | F(HLE) |
+		F(AVX2) | F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) |
+		F(INVPCID) | F(RTM) | F(ZERO_FCS_FDS) | 0 /*MPX*/ |
+		F(AVX512F) | F(AVX512DQ) | F(RDSEED) | F(ADX) | F(SMAP) |
+		F(AVX512IFMA) | F(CLFLUSHOPT) | F(CLWB) | 0 /*INTEL_PT*/ |
+		F(AVX512PF) | F(AVX512ER) | F(AVX512CD) | F(SHA_NI) |
+		F(AVX512BW) | F(AVX512VL));
 
 	kvm_cpu_cap_init(CPUID_7_ECX,
 		F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
@@ -728,16 +738,12 @@  void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_init(CPUID_7_EDX,
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
-		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
+		F(SPEC_CTRL_SSBD) | EMUL_F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
 		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
 		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) |
 		F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D)
 	);
 
-	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
-	kvm_cpu_cap_set(X86_FEATURE_TSC_ADJUST);
-	kvm_cpu_cap_set(X86_FEATURE_ARCH_CAPABILITIES);
-
 	if (boot_cpu_has(X86_FEATURE_IBPB) && boot_cpu_has(X86_FEATURE_IBRS))
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL);
 	if (boot_cpu_has(X86_FEATURE_STIBP))