Message ID | 20211214022825.563892248@linutronix.de (mailing list archive) |
---|---|
Series | x86/fpu: Preparatory changes for guest AMX support |
> From: Thomas Gleixner <tglx@linutronix.de>
> Sent: Tuesday, December 14, 2021 10:50 AM
>
> Folks,
>
> this is a follow up to the initial sketch of patches which got picked up by
> Jing and have been posted in combination with the KVM parts:
>
>   https://lore.kernel.org/r/20211208000359.2853257-1-yang.zhong@intel.com
>
> This update is only touching the x86/fpu code and not changing anything on
> the KVM side.
>
> BIG FAT WARNING: This is compile tested only!
>
> In course of the discussion of the above patchset it turned out that there
> are a few conceptual issues vs. hardware and software state and also
> vs. guest restore.

Overall this is definitely a good move and also helps simplify the KVM side logic.
Hi Thomas,

On 12/14/2021 10:50 AM, Thomas Gleixner wrote:
> Folks,
>
> this is a follow up to the initial sketch of patches which got picked up by
> Jing and have been posted in combination with the KVM parts:
>
>   https://lore.kernel.org/r/20211208000359.2853257-1-yang.zhong@intel.com
>
> This update is only touching the x86/fpu code and not changing anything on
> the KVM side.
>
> BIG FAT WARNING: This is compile tested only!
>
> In course of the discussion of the above patchset it turned out that there
> are a few conceptual issues vs. hardware and software state and also
> vs. guest restore.
>
> This series addresses this with the following changes vs. the original
> approach:
>
>   1) fpstate reallocation is now independent of fpu_swap_kvm_fpstate()
>
>      It is triggered directly via XSETBV and XFD MSR write emulation which
>      are used both for runtime and restore purposes.
>
>      For this it provides two wrappers around a common update function, one
>      for XCR0 and one for XFD.
>
>      Both check the validity of the arguments and the correct sizing of the
>      guest FPU fpstate. If the size is not sufficient, fpstate is
>      reallocated.
>
>      The functions can fail.
>
>   2) XFD synchronization
>
>      KVM must neither touch the XFD MSR nor the fpstate->xfd software state
>      in order to guarantee state consistency.
>
>      In the MSR write emulation case the XFD specific update handler has to
>      be invoked. See #1
>
>      If MSR write emulation is disabled because the buffer size is
>      sufficient for all use cases, i.e.:
>
>          guest_fpu::xfeatures == guest_fpu::perm
>

The buffer size can be sufficient once one of the features is requested, since
the kernel FPU code reallocates to the full (permitted) size. And I think we
don't want to disable interception until all the features are detected, e.g.
one by one.

Thus it can be guest_fpu::xfeatures != guest_fpu::perm.

Thanks,
Jing
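To make the point about "sufficient size" concrete, here is a minimal userspace sketch in C of the scheme described above: two wrappers around one common update function, with reallocation to the full permitted size on the first dynamic feature. Everything in it is an assumption for illustration: the `model_*` names, the second dynamic feature `FEAT2`, and the made-up buffer sizes are not kernel code, and the allocation itself is reduced to size bookkeeping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BASE_FEATURES 0x7ULL          /* x87 | SSE | AVX */
#define FEAT1         (1ULL << 18)    /* first dynamic (XFD-controlled) feature */
#define FEAT2         (1ULL << 19)    /* hypothetical second dynamic feature */
#define DYNAMIC_MASK  (FEAT1 | FEAT2)

/* Hypothetical model of the guest FPU container; not the kernel struct. */
struct model_guest_fpu {
    uint64_t perm;       /* features the guest is permitted to use */
    uint64_t xfeatures;  /* features the guest has enabled so far */
    uint64_t xcr0;       /* last accepted guest XCR0 */
    uint64_t xfd;        /* last accepted guest XFD */
    size_t   size;       /* modeled fpstate buffer size */
    unsigned reallocs;   /* how often the buffer had to grow */
};

/* Made-up sizing: 4 KiB base plus 8 KiB per dynamic feature. */
static size_t model_size(uint64_t mask)
{
    size_t size = 4096;

    if (mask & FEAT1)
        size += 8192;
    if (mask & FEAT2)
        size += 8192;
    return size;
}

static void model_init(struct model_guest_fpu *g, uint64_t perm)
{
    g->perm = perm;
    g->xfeatures = BASE_FEATURES;
    g->xcr0 = BASE_FEATURES;
    g->xfd = perm & DYNAMIC_MASK;          /* all dynamic features armed */
    g->size = model_size(BASE_FEATURES);
    g->reallocs = 0;
}

/* Common update function: validate, then grow the buffer if needed. */
static int model_update(struct model_guest_fpu *g, uint64_t xcr0, uint64_t xfd)
{
    uint64_t needed;

    if (xcr0 & ~g->perm)
        return -EPERM;                     /* the update can fail */

    /* Dynamic features enabled in XCR0 and no longer armed in XFD. */
    needed = xcr0 & DYNAMIC_MASK & ~xfd;
    if (needed & ~g->xfeatures) {
        size_t want = model_size(g->perm); /* grow to the full permitted size */

        if (g->size < want) {
            g->size = want;
            g->reallocs++;
        }
        g->xfeatures |= needed;
    }
    g->xcr0 = xcr0;
    g->xfd = xfd;
    return 0;
}

/* The two wrappers: one for XSETBV, one for the XFD MSR write. */
static int model_update_xcr0(struct model_guest_fpu *g, uint64_t xcr0)
{
    return model_update(g, xcr0, g->xfd);
}

static int model_update_xfd(struct model_guest_fpu *g, uint64_t xfd)
{
    return model_update(g, g->xcr0, xfd);
}

static void model_demo(void)
{
    struct model_guest_fpu g;
    int rc;

    model_init(&g, BASE_FEATURES | FEAT1 | FEAT2);
    rc = model_update_xcr0(&g, g.perm);
    assert(rc == 0);
    rc = model_update_xfd(&g, FEAT2);      /* guest disarms FEAT1 only */
    assert(rc == 0);
    assert(g.size == model_size(g.perm));  /* buffer is already full size */
    assert(g.xfeatures != g.perm);         /* yet not all features were seen */
}
```

In this model the buffer is large enough after the first XFD write even though `xfeatures` and `perm` still differ, which is the situation Jing describes.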
> From: Liu, Jing2 <jing2.liu@linux.intel.com>
> Sent: Tuesday, December 14, 2021 2:52 PM
>
> On 12/14/2021 10:50 AM, Thomas Gleixner wrote:
> >      If MSR write emulation is disabled because the buffer size is
> >      sufficient for all use cases, i.e.:
> >
> >          guest_fpu::xfeatures == guest_fpu::perm
> >
> The buffer size can be sufficient once one of the features is requested, since
> the kernel FPU code reallocates to the full (permitted) size. And I think we
> don't want to disable interception until all the features are detected, e.g.
> one by one.
>
> Thus it can be guest_fpu::xfeatures != guest_fpu::perm.
>

There are two options to handle multiple xfd features.

a) a conservative approach as Thomas suggested, i.e. don't disable emulation
   until all the features in guest_fpu::perm are requested by the guest. This
   definitely has poor performance if the guest only wants to use a subset of
   perm features. But from a functional p.o.v. it just works. Given we only
   have one xfeature today, let's just use this simple check which has ZERO
   negative impact.

b) an optimized approach by dynamically enabling/disabling emulation, e.g.
   we can disable emulation after the 1st xfd feature is enabled and then
   reenable it in the #NM vmexit handler when XFD_ERR includes a bit which is
   not in guest_fpu::xfeatures, sort of like:

	--xfd trapped, perm has two xfd features--
	(G) access xfd_feature1;
	(H) trap #NM (XFD_ERR = xfd_feature1) and inject #NM;
	(G) WRMSR(IA32_XFD, (-1ULL) & ~xfd_feature1);
	(H) reallocate fpstate and disable write emulation for XFD;

	--xfd passed through--
	(G) do something...
	(G) access xfd_feature2;
	(H) trap #NM (XFD_ERR = xfd_feature2), enable emulation, inject #NM;

	--xfd trapped--
	(G) WRMSR(IA32_XFD, (-1ULL) & ~(xfd_feature1 | xfd_feature2));
	(H) reallocate fpstate and disable write emulation for XFD;

	--xfd passed through--
	(G) do something...

Thanks
Kevin
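The two options can be compared with a small state-machine sketch in C. This is only a model under stated assumptions: `struct xfd_model`, the policy enum and the handler names are hypothetical, #NM injection and the actual fpstate reallocation are not modeled, and a passed-through WRMSR is simply invisible to the host. The only quantity tracked is how many XFD writes trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XFD_FEATURE1 (1ULL << 18)
#define XFD_FEATURE2 (1ULL << 19)   /* hypothetical second xfd feature */

enum xfd_policy { XFD_CONSERVATIVE, XFD_DYNAMIC };   /* options a) and b) */

struct xfd_model {
    enum xfd_policy policy;
    uint64_t perm;        /* xfd features the guest is permitted to use */
    uint64_t xfeatures;   /* xfd features already backed by the buffer */
    bool     intercept;   /* true: WRMSR(IA32_XFD) traps to the host */
    unsigned traps;       /* number of trapped XFD writes */
};

static struct xfd_model xfd_model_init(enum xfd_policy policy, uint64_t perm)
{
    struct xfd_model m = {
        .policy = policy, .perm = perm,
        .xfeatures = 0, .intercept = true, .traps = 0,
    };
    return m;
}

/* (H) #NM vmexit: option b) re-enables emulation for a not yet seen feature. */
static void host_nm_exit(struct xfd_model *m, uint64_t xfd_err)
{
    if (m->policy == XFD_DYNAMIC && (xfd_err & ~m->xfeatures))
        m->intercept = true;
}

/* (G) WRMSR(IA32_XFD, val): only visible to the host while intercepted. */
static void guest_wrmsr_xfd(struct xfd_model *m, uint64_t val)
{
    if (!m->intercept)
        return;                            /* passed through */

    m->traps++;
    m->xfeatures |= m->perm & ~val;        /* stands in for the reallocation */

    if (m->policy == XFD_DYNAMIC)
        m->intercept = false;
    else
        m->intercept = (m->xfeatures != m->perm);
}

static void xfd_demo(void)
{
    struct xfd_model b = xfd_model_init(XFD_DYNAMIC, XFD_FEATURE1 | XFD_FEATURE2);

    host_nm_exit(&b, XFD_FEATURE1);
    guest_wrmsr_xfd(&b, ~XFD_FEATURE1);
    assert(!b.intercept);                  /* --xfd passed through-- */
    host_nm_exit(&b, XFD_FEATURE2);
    assert(b.intercept);                   /* --xfd trapped-- */
    guest_wrmsr_xfd(&b, ~(XFD_FEATURE1 | XFD_FEATURE2));
    assert(!b.intercept && b.traps == 2);
}
```

Under the conservative policy, every XFD write between the first and the last feature request traps; under the dynamic policy, only the writes that follow an #NM for a new feature do.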
On 12/14/21 03:50, Thomas Gleixner wrote:
> The only remaining issue is the KVM XSTATE save/restore size checking which
> probably requires some FPU core assistance. But that requires some more
> thoughts vs. the IOCTL interface extension and once that is settled it
> needs to be solved in one go. But that's an orthogonal issue to the above.

That's not a big deal because KVM uses the uncompacted format. So
KVM_CHECK_EXTENSION and KVM_GET_XSAVE can just use CPUID to retrieve the
size and uncompacted offset of the largest bit that is set in
kvm_supported_xcr0, while KVM_SET_XSAVE can do the same with the largest
bit that is set in the xstate_bv.

Paolo

> The series is also available from git:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/people/tglx/devel.git x86/fpu-kvm
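The size computation Paolo describes can be sketched as follows. CPUID leaf 0xD, sub-leaf i reports the size of state component i in EAX and its offset in the uncompacted XSAVE layout in EBX. To keep the sketch runnable anywhere, `model_cpuid_0xd()` is a hypothetical stand-in that returns a small table of illustrative values instead of executing CPUID; real code would query the CPU. The sketch also assumes, as the suggestion does, that the highest set bit is the component that ends last in the buffer.

```c
#include <assert.h>
#include <stdint.h>

#define XSAVE_LEGACY_AND_HEADER 576u   /* 512-byte legacy area + 64-byte header */

struct xcomp {
    uint32_t size;     /* CPUID.(EAX=0xD,ECX=i):EAX */
    uint32_t offset;   /* CPUID.(EAX=0xD,ECX=i):EBX, uncompacted offset */
};

/* Hypothetical stand-in for CPUID leaf 0xD; the values are illustrative. */
static struct xcomp model_cpuid_0xd(unsigned int i)
{
    struct xcomp c = { 0, 0 };

    switch (i) {
    case 2:  c.size = 256;  c.offset = 576;  break;  /* AVX */
    case 5:  c.size = 64;   c.offset = 1088; break;  /* opmask */
    case 6:  c.size = 512;  c.offset = 1152; break;  /* ZMM_Hi256 */
    case 7:  c.size = 1024; c.offset = 1664; break;  /* Hi16_ZMM */
    case 18: c.size = 8192; c.offset = 2816; break;  /* AMX TILEDATA */
    }
    return c;
}

/* Buffer size needed for @mask in uncompacted format, 0 if unknown. */
static uint32_t uncompacted_size(uint64_t mask)
{
    unsigned int top = 0;
    struct xcomp c;

    /* Find the largest set bit above the legacy x87/SSE bits. */
    for (unsigned int i = 2; i < 64; i++)
        if (mask & (1ULL << i))
            top = i;

    if (top < 2)
        return XSAVE_LEGACY_AND_HEADER;

    c = model_cpuid_0xd(top);
    return c.size ? c.offset + c.size : 0;  /* 0: component unknown to the model */
}

static void size_demo(void)
{
    assert(uncompacted_size(0x3) == 576);            /* x87 | SSE only */
    assert(uncompacted_size(0x7) == 832);            /* plus AVX */
    assert(uncompacted_size(0x7 | 1ULL << 18) == 11008);
}
```

For KVM_GET_XSAVE the mask would be kvm_supported_xcr0, and for KVM_SET_XSAVE the xstate_bv field of the incoming buffer.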
On Tue, Dec 14 2021 at 11:42, Paolo Bonzini wrote:
> On 12/14/21 03:50, Thomas Gleixner wrote:
>> The only remaining issue is the KVM XSTATE save/restore size checking which
>> probably requires some FPU core assistance. But that requires some more
>> thoughts vs. the IOCTL interface extension and once that is settled it
>> needs to be solved in one go. But that's an orthogonal issue to the above.
>
> That's not a big deal because KVM uses the uncompacted format. So
> KVM_CHECK_EXTENSION and KVM_GET_XSAVE can just use CPUID to retrieve the
> size and uncompacted offset of the largest bit that is set in
> kvm_supported_xcr0, while KVM_SET_XSAVE can do the same with the largest
> bit that is set in the xstate_bv.

For simplicity you can just get that information from guest_fpu. See below.

Thanks,

        tglx
---
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -518,6 +518,11 @@ struct fpu_guest {
 	u64				perm;
 
 	/*
+	 * @uabi_size:			Size required for save/restore
+	 */
+	unsigned int			uabi_size;
+
+	/*
 	 * @fpstate:			Pointer to the allocated guest fpstate
 	 */
 	struct fpstate			*fpstate;
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -240,6 +240,7 @@ bool fpu_alloc_guest_fpstate(struct fpu_
 	gfpu->fpstate	= fpstate;
 	gfpu->xfeatures	= fpu_user_cfg.default_features;
 	gfpu->perm	= fpu_user_cfg.default_features;
+	gfpu->uabi_size	= fpu_user_cfg.default_size;
 	fpu_init_guest_permissions(gfpu);
 
 	return true;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1545,6 +1545,7 @@ static int fpstate_realloc(u64 xfeatures
 		newfps->is_confidential = curfps->is_confidential;
 		newfps->in_use = curfps->in_use;
 		guest_fpu->xfeatures |= xfeatures;
+		guest_fpu->uabi_size = usize;
 	}
 
 	fpregs_lock();