
KVM: x86: Advertise AMX-COMPLEX CPUID to userspace

Message ID 20230802022954.193843-1-tao1.su@linux.intel.com (mailing list archive)
State New, archived
Series KVM: x86: Advertise AMX-COMPLEX CPUID to userspace

Commit Message

Tao Su Aug. 2, 2023, 2:29 a.m. UTC
The latest Intel platform, GraniteRapids-D, introduces AMX-COMPLEX, which adds
two instructions to perform matrix multiplication of two tiles containing
complex elements and accumulate the results into a packed single precision
tile.

AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8].

Since there are no new VMX controls or additional host enabling required
for guests to use this feature, advertise the CPUID to userspace.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
 arch/x86/kvm/cpuid.c         | 3 ++-
 arch/x86/kvm/reverse_cpuid.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)


base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4

Comments

Xiaoyao Li Aug. 2, 2023, 7:40 a.m. UTC | #1
On 8/2/2023 10:29 AM, Tao Su wrote:
> Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
> two instructions to perform matrix multiplication of two tiles containing
> complex elements and accumulate the results into a packed single precision
> tile.
> 
> AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]
> 
> Since there are no new VMX controls or additional host enabling required
> for guests to use this feature, advertise the CPUID to userspace.
> 
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
>   arch/x86/kvm/cpuid.c         | 3 ++-
>   arch/x86/kvm/reverse_cpuid.h | 1 +
>   2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 7f4d13383cf2..883ec8d5a77f 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -647,7 +647,8 @@ void kvm_set_cpu_caps(void)
>   	);
>   
>   	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
> -		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
> +		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI) |
> +		F(AMX_COMPLEX)
>   	);
>   
>   	kvm_cpu_cap_mask(CPUID_D_1_EAX,
> diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
> index 56cbdb24400a..b81650678375 100644
> --- a/arch/x86/kvm/reverse_cpuid.h
> +++ b/arch/x86/kvm/reverse_cpuid.h
> @@ -43,6 +43,7 @@ enum kvm_only_cpuid_leafs {
>   /* Intel-defined sub-features, CPUID level 0x00000007:1 (EDX) */
>   #define X86_FEATURE_AVX_VNNI_INT8       KVM_X86_FEATURE(CPUID_7_1_EDX, 4)
>   #define X86_FEATURE_AVX_NE_CONVERT      KVM_X86_FEATURE(CPUID_7_1_EDX, 5)
> +#define X86_FEATURE_AMX_COMPLEX         KVM_X86_FEATURE(CPUID_7_1_EDX, 8)
>   #define X86_FEATURE_PREFETCHITI         KVM_X86_FEATURE(CPUID_7_1_EDX, 14)
>   
>   /* CPUID level 0x80000007 (EDX). */
> 
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4

Sean Christopherson Aug. 2, 2023, 11:36 p.m. UTC | #2
On Wed, Aug 02, 2023, Tao Su wrote:
> Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
> two instructions to perform matrix multiplication of two tiles containing
> complex elements and accumulate the results into a packed single precision
> tile.
> 
> AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]
> 
> Since there are no new VMX controls or additional host enabling required
> for guests to use this feature, advertise the CPUID to userspace.

Nit, I would rather justify this (last paragraph) with something like:

  Advertise AMX_COMPLEX if it's supported in hardware.  There are no VMX
  controls for the feature, i.e. the instructions can't be intercepted, and
  KVM advertises base AMX in CPUID if AMX is supported in hardware, even if
  KVM doesn't advertise AMX as being supported in XCR0, e.g. because the
  process didn't opt-in to allocating tile data.

If the above is accurate and there are no objections, I'll fixup the changelog
when applying.

Side topic, this does make me wonder if advertising AMX when XTILE_DATA isn't
permitted is a bad idea.  But no one has complained, and chasing down all the
dependent AMX features would get annoying, so I'm inclined to keep the status quo.
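
For illustration, "advertise if it's supported in hardware" can be modeled as
masking the host's CPUID.(EAX=7,ECX=1):EDX value with the set of bits KVM is
willing to expose. The sketch below is a simplified user-space model, not the
kernel's implementation: the names FEAT(), kvm_7_1_edx_mask and
advertised_7_1_edx() are hypothetical, and only the bit positions are taken
from the reverse_cpuid.h hunk in this patch.

```c
#include <stdint.h>

/* Bit positions in CPUID.(EAX=7,ECX=1):EDX, as listed in this patch. */
#define BIT_AVX_VNNI_INT8   4
#define BIT_AVX_NE_CONVERT  5
#define BIT_AMX_COMPLEX     8
#define BIT_PREFETCHITI     14

/* Hypothetical stand-in for a feature-bit macro; not the kernel's F(). */
#define FEAT(bit) (1u << (bit))

/* Assumed model: bits exposed for this leaf once the patch is applied. */
static const uint32_t kvm_7_1_edx_mask =
	FEAT(BIT_AVX_VNNI_INT8) | FEAT(BIT_AVX_NE_CONVERT) |
	FEAT(BIT_PREFETCHITI) | FEAT(BIT_AMX_COMPLEX);

/* A bit is advertised only if the host reports it and the mask allows it. */
static uint32_t advertised_7_1_edx(uint32_t host_edx)
{
	return host_edx & kvm_7_1_edx_mask;
}
```

Under this model, a host that sets EDX bit 8 gets AMX_COMPLEX advertised
regardless of what is reported for XCR0, which is the behavior the proposed
changelog describes.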

Xiaoyao Li Aug. 3, 2023, 3:12 a.m. UTC | #3
On 8/3/2023 7:36 AM, Sean Christopherson wrote:
> On Wed, Aug 02, 2023, Tao Su wrote:
>> Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
>> two instructions to perform matrix multiplication of two tiles containing
>> complex elements and accumulate the results into a packed single precision
>> tile.
>>
>> AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]
>>
>> Since there are no new VMX controls or additional host enabling required
>> for guests to use this feature, advertise the CPUID to userspace.
> 
> Nit, I would rather justify this (last paragraph) with something like:
> 
>    Advertise AMX_COMPLEX if it's supported in hardware.  There are no VMX
>    controls for the feature, i.e. the instructions can't be intercepted, and
>    KVM advertises base AMX in CPUID if AMX is supported in hardware, even if
>    KVM doesn't advertise AMX as being supported in XCR0, e.g. because the
>    process didn't opt-in to allocating tile data.

It looks good to me.

> If the above is accurate and there are no objections, I'll fixup the changelog
> when applying.
> 
> Side topic, this does make me wonder if advertising AMX when XTILE_DATA isn't
> permitted is a bad idea.  But no one has complained, and chasing down all the
> dependent AMX features would get annoying, so I'm inclined to keep the status quo.

Tao Su Aug. 3, 2023, 3:26 a.m. UTC | #4
On Wed, Aug 02, 2023 at 04:36:10PM -0700, Sean Christopherson wrote:
> On Wed, Aug 02, 2023, Tao Su wrote:
> > Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
> > two instructions to perform matrix multiplication of two tiles containing
> > complex elements and accumulate the results into a packed single precision
> > tile.
> > 
> > AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]
> > 
> > Since there are no new VMX controls or additional host enabling required
> > for guests to use this feature, advertise the CPUID to userspace.
> 
> Nit, I would rather justify this (last paragraph) with something like:
> 
>   Advertise AMX_COMPLEX if it's supported in hardware.  There are no VMX
>   controls for the feature, i.e. the instructions can't be intercepted, and
>   KVM advertises base AMX in CPUID if AMX is supported in hardware, even if
>   KVM doesn't advertise AMX as being supported in XCR0, e.g. because the
>   process didn't opt-in to allocating tile data.
> 
> If the above is accurate and there are no objections, I'll fixup the changelog
> when applying.

Totally agree.

> 
> Side topic, this does make me wonder if advertising AMX when XTILE_DATA isn't
> permitted is a bad idea.  But no one has complained, and chasing down all the
> dependent AMX features would get annoying, so I'm inclined to keep the status quo.

According to the description of AMX exceptions, the instructions do not check CPUID; they
raise #UD if XCR0[18:17] != 0b11. Since user applications should check both XCR0 and CPUID
before using the related AMX instructions, I don't think keeping the status quo has any bad
effects.

Thanks,
Tao
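
The "check both XCR0 and CPUID" rule described above might be sketched as
follows. This is a minimal illustration, not code from the patch: the helper
amx_complex_usable() and the two macros are hypothetical, and the caller is
assumed to have already read the raw values with CPUID and XGETBV. A real
application would likely also check the base AMX feature bits, which are not
covered here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in this thread. */
#define CPUID_7_1_EDX_AMX_COMPLEX  (1u << 8)    /* CPUID.(EAX=7,ECX=1):EDX[8] */
#define XCR0_XTILE_MASK            (3ull << 17) /* XCR0[18:17] */

/*
 * Hypothetical helper (not a kernel or libc API): decide whether AMX-COMPLEX
 * instructions may be issued, given raw register values supplied by the
 * caller.
 */
static bool amx_complex_usable(uint32_t cpuid_7_1_edx, uint64_t xcr0)
{
	/* The feature must be enumerated in CPUID... */
	bool enumerated = (cpuid_7_1_edx & CPUID_7_1_EDX_AMX_COMPLEX) != 0;
	/* ...and both tile state bits must be set in XCR0, else #UD. */
	bool tiles_enabled = (xcr0 & XCR0_XTILE_MASK) == XCR0_XTILE_MASK;

	return enumerated && tiles_enabled;
}
```

With this check, a guest that sees the CPUID bit but has XCR0[18:17] != 0b11
never reaches the instructions, so the advertised bit alone does no harm.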

Paolo Bonzini Aug. 3, 2023, 9:04 p.m. UTC | #5
On 8/3/23 01:36, Sean Christopherson wrote:
> Side topic, this does make me wonder if advertising AMX when XTILE_DATA isn't
> permitted is a bad idea.  But no one has complained, and chasing down all the
> dependent AMX features would get annoying, so I'm inclined to keep the status quo.

I think it should not be an issue because you have to check XCR0 anyway 
before using AMX.  OS kernels know what's in XCR0 but still they need to 
check the save state leaves in CPUID[0xD].  In neither case will the 
instruction leaves in CPUID[7] be the only input to the test (if they 
are used at all).

Paolo

Sean Christopherson Aug. 4, 2023, 12:40 a.m. UTC | #6
On Wed, 02 Aug 2023 10:29:54 +0800, Tao Su wrote:
> Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
> two instructions to perform matrix multiplication of two tiles containing
> complex elements and accumulate the results into a packed single precision
> tile.
> 
> AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]
> 
> [...]

Applied to kvm-x86 misc, thanks!

[1/1] KVM: x86: Advertise AMX-COMPLEX CPUID to userspace
      https://github.com/kvm-x86/linux/commit/99b668545356

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes

Tao Su Aug. 4, 2023, 6:58 a.m. UTC | #7
On Thu, Aug 03, 2023 at 05:40:28PM -0700, Sean Christopherson wrote:
> On Wed, 02 Aug 2023 10:29:54 +0800, Tao Su wrote:
> > Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
> > two instructions to perform matrix multiplication of two tiles containing
> > complex elements and accumulate the results into a packed single precision
> > tile.
> > 
> > AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]
> > 
> > [...]
> 
> Applied to kvm-x86 misc, thanks!

Sean, thanks!

> 
> [1/1] KVM: x86: Advertise AMX-COMPLEX CPUID to userspace
>       https://github.com/kvm-x86/linux/commit/99b668545356
> 
> --
> https://github.com/kvm-x86/linux/tree/next
> https://github.com/kvm-x86/linux/tree/fixes

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7f4d13383cf2..883ec8d5a77f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -647,7 +647,8 @@  void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI)
+		F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI) |
+		F(AMX_COMPLEX)
 	);
 
 	kvm_cpu_cap_mask(CPUID_D_1_EAX,
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index 56cbdb24400a..b81650678375 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -43,6 +43,7 @@  enum kvm_only_cpuid_leafs {
 /* Intel-defined sub-features, CPUID level 0x00000007:1 (EDX) */
 #define X86_FEATURE_AVX_VNNI_INT8       KVM_X86_FEATURE(CPUID_7_1_EDX, 4)
 #define X86_FEATURE_AVX_NE_CONVERT      KVM_X86_FEATURE(CPUID_7_1_EDX, 5)
+#define X86_FEATURE_AMX_COMPLEX         KVM_X86_FEATURE(CPUID_7_1_EDX, 8)
 #define X86_FEATURE_PREFETCHITI         KVM_X86_FEATURE(CPUID_7_1_EDX, 14)
 
 /* CPUID level 0x80000007 (EDX). */