Message ID | 20220816175936.23238-1-dgilbert@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86: Always enable legacy fp/sse |
On Tue, Aug 16, 2022, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> A live migration under qemu is currently failing when the source
> host is ~Nehalem era (pre-xsave) and the destination is much newer,
> (configured with a guest CPU type of Nehalem).
> QEMU always calls kvm_put_xsave, even on this combination because
> KVM_CAP_CHECK_EXTENSION_VM always returns true for KVM_CAP_XSAVE.
>
> When QEMU calls kvm_put_xsave it's rejected by
>    fpu_copy_uabi_to_guest_fpstate->
>    copy_uabi_to_xstate->
>    validate_user_xstate_header
>
> when the validate checks the loaded xfeatures against
> user_xfeatures, which it finds to be 0.
>
> I think our initialisation of user_xfeatures is being
> too strict here, and we should always allow the base FP/SSE.
>
> Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0")
> bz: https://bugzilla.redhat.com/show_bug.cgi?id=2079311
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  arch/x86/kvm/cpuid.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index de6d44e07e34..3b2319cecfd1 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -298,7 +298,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	guest_supported_xcr0 =
>  		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
>
> -	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0;
> +	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0 |
> +		XFEATURE_MASK_FPSSE;

I don't think this is correct.  This will allow the guest to set the SSE bit
even when XSAVE isn't supported due to kvm_guest_supported_xcr0() returning
user_xfeatures.

  static inline u64 kvm_guest_supported_xcr0(struct kvm_vcpu *vcpu)
  {
	return vcpu->arch.guest_fpu.fpstate->user_xfeatures;
  }

I believe the right place to fix this is in validate_user_xstate_header().  It's
reachable if and only if XSAVE is supported in the host, and when XSAVE is _not_
supported, the kernel unconditionally allows FP+SSE.  So it follows that the kernel
should also allow FP+SSE when using XSAVE.  That would also align the logic
with fpu_copy_guest_fpstate_to_uabi(), which forces the FPSSE flags.  Ditto for
the non-KVM save_xstate_epilog().

Aha!  And fpu__init_system_xstate() ensures the host supports FP+SSE when XSAVE
is enabled (knew there had to be a sanity check somewhere).

---
 arch/x86/kernel/fpu/xstate.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8340156bfd2..83b9a9653d47 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -399,8 +399,13 @@ int xfeature_size(int xfeature_nr)
 static int validate_user_xstate_header(const struct xstate_header *hdr,
 				       struct fpstate *fpstate)
 {
-	/* No unknown or supervisor features may be set */
-	if (hdr->xfeatures & ~fpstate->user_xfeatures)
+	/*
+	 * No unknown or supervisor features may be set.  Userspace is always
+	 * allowed to restore FP+SSE state (XSAVE/XRSTOR are used by the kernel
+	 * if and only if FP+SSE are supported in xstate).
+	 */
+	if (hdr->xfeatures & ~fpstate->user_xfeatures &
+	    ~(XFEATURE_MASK_FP | XFEATURE_MASK_SSE))
 		return -EINVAL;

 	/* Userspace must use the uncompacted format */

base-commit: de3d415edca23831c5d1f24f10c74a715af7efdb
--
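
To make the effect of the proposed check easier to see outside the kernel, here is a minimal userspace sketch. It is an illustration only, not kernel code: the `toy_` function names are hypothetical, and the mask values assume the standard XCR0 bit layout (x87 is bit 0, SSE is bit 1, AVX/YMM is bit 2). It models the failing case from the commit message, where user_xfeatures is 0 and the restored header carries FP+SSE.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed XCR0 bit layout; these mirror the kernel's names for readability. */
#define XFEATURE_MASK_FP	(1ULL << 0)
#define XFEATURE_MASK_SSE	(1ULL << 1)
#define XFEATURE_MASK_YMM	(1ULL << 2)

/* Hypothetical model of the old check: any bit outside user_xfeatures fails. */
static int toy_validate_old(uint64_t hdr_xfeatures, uint64_t user_xfeatures)
{
	if (hdr_xfeatures & ~user_xfeatures)
		return -EINVAL;
	return 0;
}

/* Hypothetical model of the proposed check: FP and SSE are always tolerated. */
static int toy_validate_new(uint64_t hdr_xfeatures, uint64_t user_xfeatures)
{
	if (hdr_xfeatures & ~user_xfeatures &
	    ~(XFEATURE_MASK_FP | XFEATURE_MASK_SSE))
		return -EINVAL;
	return 0;
}

/* Walks through the migration case described in the commit message. */
static void toy_example(void)
{
	uint64_t user_xfeatures = 0;	/* guest CPUID exposes no XSAVE */
	uint64_t hdr = XFEATURE_MASK_FP | XFEATURE_MASK_SSE;

	/* Old check rejects the legacy state, new check accepts it. */
	assert(toy_validate_old(hdr, user_xfeatures) == -EINVAL);
	assert(toy_validate_new(hdr, user_xfeatures) == 0);

	/* Anything beyond FP+SSE is still rejected by the new check. */
	assert(toy_validate_new(hdr | XFEATURE_MASK_YMM, user_xfeatures) == -EINVAL);
}
```

Calling toy_example() from a main() exercises all three cases. The sketch only covers the feature-mask test; the other header checks in validate_user_xstate_header() are left out.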
On Tue, 2022-08-16 at 21:37 +0000, Sean Christopherson wrote:
> On Tue, Aug 16, 2022, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > A live migration under qemu is currently failing when the source
> > host is ~Nehalem era (pre-xsave) and the destination is much newer,
> > (configured with a guest CPU type of Nehalem).
> > QEMU always calls kvm_put_xsave, even on this combination because
> > KVM_CAP_CHECK_EXTENSION_VM always returns true for KVM_CAP_XSAVE.
> >
> > When QEMU calls kvm_put_xsave it's rejected by
> >    fpu_copy_uabi_to_guest_fpstate->
> >    copy_uabi_to_xstate->
> >    validate_user_xstate_header
> >
> > when the validate checks the loaded xfeatures against
> > user_xfeatures, which it finds to be 0.
> >
> > I think our initialisation of user_xfeatures is being
> > too strict here, and we should always allow the base FP/SSE.
> >
> > Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0")

Thanks for fixing this, Dave!

> > bz: https://bugzilla.redhat.com/show_bug.cgi?id=2079311
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index de6d44e07e34..3b2319cecfd1 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -298,7 +298,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	guest_supported_xcr0 =
> >  		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
> >
> > -	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0;
> > +	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0 |
> > +		XFEATURE_MASK_FPSSE;
>
> I don't think this is correct.  This will allow the guest to set the SSE bit
> even when XSAVE isn't supported due to kvm_guest_supported_xcr0() returning
> user_xfeatures.
>
>   static inline u64 kvm_guest_supported_xcr0(struct kvm_vcpu *vcpu)
>   {
> 	return vcpu->arch.guest_fpu.fpstate->user_xfeatures;
>   }
>
> I believe the right place to fix this is in validate_user_xstate_header().  It's
> reachable if and only if XSAVE is supported in the host, and when XSAVE is _not_
> supported, the kernel unconditionally allows FP+SSE.  So it follows that the kernel
> should also allow FP+SSE when using XSAVE.  That would also align the logic
> with fpu_copy_guest_fpstate_to_uabi(), which forces the FPSSE flags.  Ditto for
> the non-KVM save_xstate_epilog().
>
> Aha!  And fpu__init_system_xstate() ensures the host supports FP+SSE when XSAVE
> is enabled (knew there had to be a sanity check somewhere).

Thanks for the feedback, Sean!

I have next to no experience with this code, and I hope you can help me with a
question I have, based on Dave's commit message:

> > QEMU always calls kvm_put_xsave, even on this combination because
> > KVM_CAP_CHECK_EXTENSION_VM always returns true for KVM_CAP_XSAVE.

Is there any particular reason why it always returns true for KVM_CAP_XSAVE,
even when the CPU does not support it?

IIUC, if it returned false for this capability, kvm_put_xsave() would never be
called, and the bug would not be triggered.

Thanks in advance,
Leo

>
> ---
>  arch/x86/kernel/fpu/xstate.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c8340156bfd2..83b9a9653d47 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -399,8 +399,13 @@ int xfeature_size(int xfeature_nr)
>  static int validate_user_xstate_header(const struct xstate_header *hdr,
>  				       struct fpstate *fpstate)
>  {
> -	/* No unknown or supervisor features may be set */
> -	if (hdr->xfeatures & ~fpstate->user_xfeatures)
> +	/*
> +	 * No unknown or supervisor features may be set.  Userspace is always
> +	 * allowed to restore FP+SSE state (XSAVE/XRSTOR are used by the kernel
> +	 * if and only if FP+SSE are supported in xstate).
> +	 */
> +	if (hdr->xfeatures & ~fpstate->user_xfeatures &
> +	    ~(XFEATURE_MASK_FP | XFEATURE_MASK_SSE))
>  		return -EINVAL;
>
>  	/* Userspace must use the uncompacted format */
>
> base-commit: de3d415edca23831c5d1f24f10c74a715af7efdb
> --
>
On 8/17/22 05:29, Leonardo Brás wrote:
>>> QEMU always calls kvm_put_xsave, even on this combination because
>>> KVM_CAP_CHECK_EXTENSION_VM always returns true for KVM_CAP_XSAVE.
> Any particular reason why it always returns true for KVM_CAP_XSAVE, even when
> the CPU does not support it?
>
> IIUC, if it returns false to this capability, kvm_put_xsave() should never be
> called, and thus it can avoid bug reproduction.

Because it allows userspace to have a single path for saving/restoring
FPU state.  See for example the "migration" code in
tools/testing/selftests/kvm/lib/x86_64/processor.c (the vcpu_save_state
and vcpu_load_state functions).

In fact, the QEMU code that uses KVM_GET_FPU/KVM_SET_FPU in x86 is
obsolete, because it's not been used since Linux 2.6.36.

Paolo
* Sean Christopherson (seanjc@google.com) wrote:
> On Tue, Aug 16, 2022, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > A live migration under qemu is currently failing when the source
> > host is ~Nehalem era (pre-xsave) and the destination is much newer,
> > (configured with a guest CPU type of Nehalem).
> > QEMU always calls kvm_put_xsave, even on this combination because
> > KVM_CAP_CHECK_EXTENSION_VM always returns true for KVM_CAP_XSAVE.
> >
> > When QEMU calls kvm_put_xsave it's rejected by
> >    fpu_copy_uabi_to_guest_fpstate->
> >    copy_uabi_to_xstate->
> >    validate_user_xstate_header
> >
> > when the validate checks the loaded xfeatures against
> > user_xfeatures, which it finds to be 0.
> >
> > I think our initialisation of user_xfeatures is being
> > too strict here, and we should always allow the base FP/SSE.
> >
> > Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0")
> > bz: https://bugzilla.redhat.com/show_bug.cgi?id=2079311
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index de6d44e07e34..3b2319cecfd1 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -298,7 +298,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	guest_supported_xcr0 =
> >  		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
> >
> > -	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0;
> > +	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0 |
> > +		XFEATURE_MASK_FPSSE;

Hi Sean,
  Thanks for the reply,

> I don't think this is correct.  This will allow the guest to set the SSE bit
> even when XSAVE isn't supported due to kvm_guest_supported_xcr0() returning
> user_xfeatures.
>
>   static inline u64 kvm_guest_supported_xcr0(struct kvm_vcpu *vcpu)
>   {
> 	return vcpu->arch.guest_fpu.fpstate->user_xfeatures;
>   }
>
> I believe the right place to fix this is in validate_user_xstate_header().  It's
> reachable if and only if XSAVE is supported in the host, and when XSAVE is _not_
> supported, the kernel unconditionally allows FP+SSE.  So it follows that the kernel
> should also allow FP+SSE when using XSAVE.  That would also align the logic
> with fpu_copy_guest_fpstate_to_uabi(), which forces the FPSSE flags.  Ditto for
> the non-KVM save_xstate_epilog().

OK, yes, I'd followed the check that failed down to this test; although
by itself this test worked until Leo's patch came along later, so I
wasn't sure where to fix it.

> Aha!  And fpu__init_system_xstate() ensures the host supports FP+SSE when XSAVE
> is enabled (knew there had to be a sanity check somewhere).
>
> ---
>  arch/x86/kernel/fpu/xstate.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c8340156bfd2..83b9a9653d47 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -399,8 +399,13 @@ int xfeature_size(int xfeature_nr)
>  static int validate_user_xstate_header(const struct xstate_header *hdr,
>  				       struct fpstate *fpstate)
>  {
> -	/* No unknown or supervisor features may be set */
> -	if (hdr->xfeatures & ~fpstate->user_xfeatures)
> +	/*
> +	 * No unknown or supervisor features may be set.  Userspace is always
> +	 * allowed to restore FP+SSE state (XSAVE/XRSTOR are used by the kernel
> +	 * if and only if FP+SSE are supported in xstate).
> +	 */
> +	if (hdr->xfeatures & ~fpstate->user_xfeatures &
> +	    ~(XFEATURE_MASK_FP | XFEATURE_MASK_SSE))
>  		return -EINVAL;
>
>  	/* Userspace must use the uncompacted format */

That passes the small smoke test for me; will you repost that then?

Thanks,

Dave

> base-commit: de3d415edca23831c5d1f24f10c74a715af7efdb
> --
>
On Wed, Aug 17, 2022, Dr. David Alan Gilbert wrote:
> That passes the small smoke test for me; will you repost that then?
Yep, will do.
* Sean Christopherson (seanjc@google.com) wrote:
> On Wed, Aug 17, 2022, Dr. David Alan Gilbert wrote:
> > That passes the small smoke test for me; will you repost that then?
>
> Yep, will do.

Thanks.

Dave
On Wed, Aug 17, 2022, Dr. David Alan Gilbert wrote:
> * Sean Christopherson (seanjc@google.com) wrote:
> > On Tue, Aug 16, 2022, Dr. David Alan Gilbert (git) wrote:
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index de6d44e07e34..3b2319cecfd1 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -298,7 +298,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >  	guest_supported_xcr0 =
> > >  		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);
> > >
> > > -	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0;
> > > +	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0 |
> > > +		XFEATURE_MASK_FPSSE;
>
> Hi Sean,
>   Thanks for the reply,
>
> > I don't think this is correct.  This will allow the guest to set the SSE bit
> > even when XSAVE isn't supported due to kvm_guest_supported_xcr0() returning
> > user_xfeatures.
> >
> >   static inline u64 kvm_guest_supported_xcr0(struct kvm_vcpu *vcpu)
> >   {
> > 	return vcpu->arch.guest_fpu.fpstate->user_xfeatures;
> >   }
> >
> > I believe the right place to fix this is in validate_user_xstate_header().  It's
> > reachable if and only if XSAVE is supported in the host, and when XSAVE is _not_
> > supported, the kernel unconditionally allows FP+SSE.  So it follows that the kernel
> > should also allow FP+SSE when using XSAVE.  That would also align the logic
> > with fpu_copy_guest_fpstate_to_uabi(), which forces the FPSSE flags.  Ditto for
> > the non-KVM save_xstate_epilog().
>
> OK, yes, I'd followed the check that failed down to this test; although
> by itself this test worked until Leo's patch came along later, so I
> wasn't sure where to fix it.
>
> > Aha!  And fpu__init_system_xstate() ensures the host supports FP+SSE when XSAVE
> > is enabled (knew there had to be a sanity check somewhere).
> >
> > ---
> >  arch/x86/kernel/fpu/xstate.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > index c8340156bfd2..83b9a9653d47 100644
> > --- a/arch/x86/kernel/fpu/xstate.c
> > +++ b/arch/x86/kernel/fpu/xstate.c
> > @@ -399,8 +399,13 @@ int xfeature_size(int xfeature_nr)
> >  static int validate_user_xstate_header(const struct xstate_header *hdr,
> >  				       struct fpstate *fpstate)
> >  {
> > -	/* No unknown or supervisor features may be set */
> > -	if (hdr->xfeatures & ~fpstate->user_xfeatures)
> > +	/*
> > +	 * No unknown or supervisor features may be set.  Userspace is always
> > +	 * allowed to restore FP+SSE state (XSAVE/XRSTOR are used by the kernel
> > +	 * if and only if FP+SSE are supported in xstate).
> > +	 */
> > +	if (hdr->xfeatures & ~fpstate->user_xfeatures &
> > +	    ~(XFEATURE_MASK_FP | XFEATURE_MASK_SSE))
> >  		return -EINVAL;
> >
> >  	/* Userspace must use the uncompacted format */
>
> That passes the small smoke test for me; will you repost that then?

*sigh*

The bug is more subtle than just failing to restore.  Saving can also "fail".
If XSAVE is hidden from the guest on an XSAVE-capable host, __copy_xstate_to_uabi_buf()
will happily reinitialize FP+SSE state and thus corrupt guest FPU state on migration.

And not that it matters now, but before realizing that KVM_GET_XSAVE is also broken,
I decided I like Dave's patch better because KVM really should separate what userspace
can save/restore from what the guest can access.

Amusingly, there's actually another bug lurking with respect to usurping user_xfeatures
to represent guest_supported_xcr0.  The latter is zero-initialized, whereas
user_xfeatures is set to the "default" features on initialization, i.e. migrating a
VM without ever doing KVM_SET_CPUID2 would do odd things.

Sending a v2 shortly to reinstate guest_supported_xcr0 before landing Dave's patch.
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index de6d44e07e34..3b2319cecfd1 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -298,7 +298,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	guest_supported_xcr0 =
 		cpuid_get_supported_xcr0(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent);

-	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0;
+	vcpu->arch.guest_fpu.fpstate->user_xfeatures = guest_supported_xcr0 |
+		XFEATURE_MASK_FPSSE;

 	kvm_update_pv_runtime(vcpu);
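
The thread ends with the idea that what userspace may save/restore should be tracked separately from what the guest may enable in XCR0. The sketch below illustrates that separation in plain C. It is not the v2 patch: `struct toy_vcpu` and every `toy_` helper are hypothetical simplifications, and real XCR0 validation has further rules that are ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed XCR0 bit layout: x87 is bit 0, SSE is bit 1. */
#define XFEATURE_MASK_FP	(1ULL << 0)
#define XFEATURE_MASK_SSE	(1ULL << 1)
#define XFEATURE_MASK_FPSSE	(XFEATURE_MASK_FP | XFEATURE_MASK_SSE)

/* Hypothetical, heavily simplified stand-in for the per-vCPU state. */
struct toy_vcpu {
	uint64_t guest_supported_xcr0;	/* bits the guest may set in XCR0 */
	uint64_t user_xfeatures;	/* bits userspace may save/restore */
};

/* Models the CPUID update with the two masks kept apart. */
static void toy_after_set_cpuid(struct toy_vcpu *v, uint64_t cpuid_xcr0)
{
	v->guest_supported_xcr0 = cpuid_xcr0;
	/* Legacy FP+SSE state can always be migrated, as in the patch above. */
	v->user_xfeatures = cpuid_xcr0 | XFEATURE_MASK_FPSSE;
}

/* Guest-side check: consults only the guest-visible mask. */
static bool toy_guest_xcr0_allowed(const struct toy_vcpu *v, uint64_t xcr0)
{
	return (xcr0 & ~v->guest_supported_xcr0) == 0;
}

/* Userspace-side check: consults only the save/restore mask. */
static bool toy_user_restore_allowed(const struct toy_vcpu *v, uint64_t hdr)
{
	return (hdr & ~v->user_xfeatures) == 0;
}
```

With a guest CPUID that exposes no XSAVE (cpuid_xcr0 of 0), toy_guest_xcr0_allowed() refuses the SSE bit while toy_user_restore_allowed() still accepts an FP+SSE header, which is the distinction Sean's objection and follow-up both turn on.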