[v3,00/16] x86: support AVX10

Message ID 516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com (mailing list archive)

Message

Jan Beulich Dec. 11, 2024, 10:09 a.m. UTC
AVX10.1 is just a re-branding of certain AVX512 (sub)features, i.e.
adds no new instructions. Therefore it's mostly relaxation that needs
doing, plus dealing with the 256-bit-only case that AVX512 itself
does not allow for. Luckily an unnecessary restriction on the mask
register insns was taken out again, simplifying the actual emulator
adjustments quite a bit.
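
To make the 256-bit-only case concrete, here is a minimal sketch of how
the AVX10 leaf could be decoded. It is not code from this series. It
assumes the layout of CPUID.(EAX=0x24,ECX=0):EBX as described in the
AVX10 spec at the time of writing (version in bits 7:0, 128/256/512-bit
support in bits 16/17/18). All identifiers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout of CPUID.(EAX=0x24,ECX=0):EBX. The macro, struct and
 * function names are made up for this illustration.
 */
#define AVX10_VERSION_MASK 0x000000ffu  /* bits 7:0: AVX10 version */
#define AVX10_VL128        (1u << 16)   /* 128-bit vectors supported */
#define AVX10_VL256        (1u << 17)   /* 256-bit vectors supported */
#define AVX10_VL512        (1u << 18)   /* 512-bit vectors supported */

struct avx10_info {
    unsigned int version;       /* 0 means AVX10 is not enumerated */
    unsigned int max_vlen_bits; /* 128, 256 or 512; 0 if none is set */
};

/* Decode a raw EBX value from the AVX10 leaf. */
struct avx10_info avx10_decode(uint32_t ebx)
{
    struct avx10_info info = { .version = ebx & AVX10_VERSION_MASK };

    if (ebx & AVX10_VL512)
        info.max_vlen_bits = 512;
    else if (ebx & AVX10_VL256)
        info.max_vlen_bits = 256;
    else if (ebx & AVX10_VL128)
        info.max_vlen_bits = 128;

    return info;
}

/* Would an insn using vectors of vlen_bits be permitted? */
bool avx10_vlen_ok(const struct avx10_info *info, unsigned int vlen_bits)
{
    assert(vlen_bits == 128 || vlen_bits == 256 || vlen_bits == 512);

    return info->version != 0 && vlen_bits <= info->max_vlen_bits;
}
```

With an EBX value of 0x00030001 (version 1, 128- and 256-bit only),
avx10_vlen_ok() accepts 256-bit operations and rejects 512-bit ones,
which is the configuration plain AVX512 cannot express.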

AVX10.2 is adding quite a few new insns, support for which (new in v3)
is roughly added chapter-wise as the spec has them (perhaps not in the
order of the chapters there).

While it probably can be rebased ahead, the series in this form
depends on the previously submitted "[PATCH v5 0/3] x86/CPUID: leaf
pruning". It also is assumed to go on top of "[PATCH v7 0/7] x86emul:
misc additions", albeit at most contextual dependencies ought to exist
there.

I've tried to be very careful in rebasing ahead of other emulator
patches I've been carrying, but almost all testing I've done is with
all of those collectively in place.

01: x86/CPUID: enable AVX10 leaf
02: x86emul: support AVX10.1
03: x86emul/test: use simd_check_avx512*() in main()
04: x86emul/test: drop cpu_has_avx512vl
05: x86emul: AVX10.1 testing
06: x86emul/test: engage AVX512VL via command line option
07: x86emul: support AVX10.2 256-bit embedded rounding / SAE
08: x86emul: support AVX10.2 scalar compare insns
09: x86emul: support AVX10.2 partial copy insns
10: x86emul: support AVX10.2 media insns
11: x86emul: support AVX10.2 minmax insns
12: x86emul: support AVX10.2 media insns
13: x86emul: support AVX10.2 saturating convert insns
14: x86emul: support other AVX10.2 convert insns
15: x86emul: support SIMD MOVRS
16: x86emul: support AVX10.2 forms of SM4 insns

Jan

Comments

Jan Beulich Dec. 11, 2024, 10:29 a.m. UTC | #1
On 11.12.2024 11:09, Jan Beulich wrote:
> AVX10.1 is just a re-branding of certain AVX512 (sub)features, i.e.
> adds no new instructions. Therefore it's mostly relaxation that needs
> doing, plus dealing with the 256-bit-only case that AVX512 itself
> does not allow for. Luckily an unnecessary restriction on the mask
> register insns was taken out again, simplifying the actual emulator
> adjustments quite a bit.
> 
> AVX10.2 is adding quite a few new insns, support for which (new in v3)
> is roughly added chapter-wise as the spec has them (perhaps not in the
> order of the chapters there).
> 
> While it probably can be rebased ahead, the series in this form
> depends on the previously submitted "[PATCH v5 0/3] x86/CPUID: leaf
> pruning". It also is assumed to go on top of "[PATCH v7 0/7] x86emul:
> misc additions", albeit at most contextual dependencies ought to exist
> there.
> 
> I've tried to be very careful in rebasing ahead of other emulator
> patches I've been carrying, but almost all testing I've done is with
> all of those collectively in place.
> 
> 01: x86/CPUID: enable AVX10 leaf
> 02: x86emul: support AVX10.1
> 03: x86emul/test: use simd_check_avx512*() in main()
> 04: x86emul/test: drop cpu_has_avx512vl
> 05: x86emul: AVX10.1 testing
> 06: x86emul/test: engage AVX512VL via command line option
> 07: x86emul: support AVX10.2 256-bit embedded rounding / SAE
> 08: x86emul: support AVX10.2 scalar compare insns
> 09: x86emul: support AVX10.2 partial copy insns
> 10: x86emul: support AVX10.2 media insns
> 11: x86emul: support AVX10.2 minmax insns
> 12: x86emul: support AVX10.2 media insns
> 13: x86emul: support AVX10.2 saturating convert insns
> 14: x86emul: support other AVX10.2 convert insns
> 15: x86emul: support SIMD MOVRS
> 16: x86emul: support AVX10.2 forms of SM4 insns

I should probably have mentioned here two further opens:
1) Testing. So far I haven't been able to think of a good approach to test
   some (most?) of the new insns, beyond the EVEX Disp8 and predicates
   testing that's being taken care of in individual patches.
2) Supposedly there is a way to constrain guests to 256-bit vector size via
   a VMCS setting. The spec has no details beyond mentioning this fact.
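
For readers unfamiliar with the EVEX Disp8 aspect named in 1): EVEX
encodings scale an 8-bit displacement by a factor N that depends on the
insn's tuple type, vector length, element size and broadcast setting.
The sketch below is only an illustration of that rule for two tuple
types, not the test harness's code. All identifiers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only two of the tuple types defined for EVEX are modelled here. */
enum evex_tuple {
    EVEX_TUPLE_FULL,   /* full-vector memory operand, may broadcast */
    EVEX_TUPLE_SCALAR, /* single-element memory operand */
};

/* Return the Disp8 scaling factor N, in bytes. */
unsigned int evex_disp8_scale(enum evex_tuple tuple,
                              unsigned int vlen_bytes,
                              unsigned int elem_bytes,
                              bool broadcast)
{
    switch (tuple) {
    case EVEX_TUPLE_FULL:
        /* A broadcast reads one element, otherwise the whole vector. */
        return broadcast ? elem_bytes : vlen_bytes;
    case EVEX_TUPLE_SCALAR:
        return elem_bytes;
    }

    return 1; /* unknown tuple type: no scaling */
}

/* Effective displacement: the sign-extended disp8 multiplied by N. */
int32_t evex_disp8_offset(int8_t disp8, unsigned int scale)
{
    return (int32_t)disp8 * (int32_t)scale;
}
```

A test for a new insn would need to confirm that the emulator applies
the N matching that insn's tuple type, e.g. 64 for a non-broadcast
512-bit full-vector operand.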

Jan