From patchwork Fri Dec 17 15:29:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685091 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56E58C433F5 for ; Fri, 17 Dec 2021 15:30:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238202AbhLQPaF (ORCPT ); Fri, 17 Dec 2021 10:30:05 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238196AbhLQPaE (ORCPT ); Fri, 17 Dec 2021 10:30:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755004; x=1671291004; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eOjDT1d53ECluHIyuHq9Pg4thb5nT1Vxb+5HpXMsiCI=; b=I055sbMyPQVJBU23OKGSlrmj7rOUM1Gj3U0BZOe01L/WEIswnhG6wSR+ jAKIaY8+4W0r3Wjkdoci9s7GlFS2qMb6F/YGWpK1Mt2iz5i00rRNEImKq yTIKora17Ie8KICyY9Hv11OBWlUGSxh29vnMy5vzSJFCGQLa0Rg9IcZt6 V0bqcm9FnMx6/0bH7lYTsxs4u+QYAux7vRftxOLwvyYVtow1A2lfl3LSI XtUC+03nsm3ANdoEzl9smJ99C4gbOexfrmH+wvg2UyN3avuXolL4tIEOn aE0OuIzEfvnLK+zqnqWyqSZhk1JYmuPHY02m4TA1MbA7FA/DYOwI0I/Yc w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723426" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723426" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:03 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588388" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:03 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 01/23] x86/fpu: Extend fpu_xstate_prctl() with guest permissions Date: Fri, 17 Dec 2021 07:29:41 -0800 Message-Id: <20211217153003.1719189-2-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Thomas Gleixner KVM requires a clear separation of host user space and guest permissions for dynamic XSTATE components. Add a guest permissions member to struct fpu and a separate set of prctl() arguments: ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM. The semantics are equivalent to the host user space permission control except for the following constraints: 1) Permissions have to be requested before the first vCPU is created 2) Permissions are frozen when the first vCPU is created to ensure consistency. Any attempt to expand permissions via the prctl() after that point is rejected. Signed-off-by: Thomas Gleixner Signed-off-by: Jing Liu --- (To Thomas) xstate_get_guest_group_perm() is again moved from xstate.h to api.h. Per Paolo's suggestion we also need to check permission at KVM_GET_SUPPORTED_CPUID, which is a /dev/kvm generic ioctl. arch/x86/include/asm/fpu/api.h | 2 ++ arch/x86/include/asm/fpu/types.h | 9 ++++++ arch/x86/include/uapi/asm/prctl.h | 26 ++++++++-------- arch/x86/kernel/fpu/core.c | 3 ++ arch/x86/kernel/fpu/xstate.c | 51 +++++++++++++++++++++++-------- arch/x86/kernel/fpu/xstate.h | 13 ++++++-- arch/x86/kernel/process.c | 2 ++ 7 files changed, 79 insertions(+), 27 deletions(-) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index c2767a6a387e..d8c222290e68 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -132,6 +132,8 @@ static inline void fpstate_free(struct fpu *fpu) { } /* fpstate-related functions which are exported to KVM */ extern void fpstate_clear_xstate_component(struct fpstate *fps, unsigned int xfeature); +extern inline u64 xstate_get_guest_group_perm(void); + /* KVM specific functions */ extern bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu); extern void fpu_free_guest_fpstate(struct fpu_guest *gfpu); diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index 3c06c82ab355..6ddf80637697 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -387,6 +387,8 @@ struct fpstate { /* @regs is dynamically sized! Don't add anything after @regs! */ } __aligned(64); +#define FPU_GUEST_PERM_LOCKED BIT_ULL(63) + struct fpu_state_perm { /* * @__state_perm: @@ -476,6 +478,13 @@ struct fpu { */ struct fpu_state_perm perm; + /* + * @guest_perm: + * + * Permission related information for guest pseudo FPUs + */ + struct fpu_state_perm guest_perm; + /* * @__fpstate: * diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 754a07856817..500b96e71f18 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -2,20 +2,22 @@ #ifndef _ASM_X86_PRCTL_H #define _ASM_X86_PRCTL_H -#define ARCH_SET_GS 0x1001 -#define ARCH_SET_FS 0x1002 -#define ARCH_GET_FS 0x1003 -#define ARCH_GET_GS 0x1004 +#define ARCH_SET_GS 0x1001 +#define ARCH_SET_FS 0x1002 +#define ARCH_GET_FS 0x1003 +#define ARCH_GET_GS 0x1004 -#define ARCH_GET_CPUID 0x1011 -#define ARCH_SET_CPUID 0x1012 +#define ARCH_GET_CPUID 0x1011 +#define ARCH_SET_CPUID 0x1012 -#define ARCH_GET_XCOMP_SUPP 0x1021 -#define ARCH_GET_XCOMP_PERM 0x1022 -#define ARCH_REQ_XCOMP_PERM 0x1023 +#define ARCH_GET_XCOMP_SUPP 0x1021 +#define ARCH_GET_XCOMP_PERM 0x1022 +#define ARCH_REQ_XCOMP_PERM 0x1023 +#define ARCH_GET_XCOMP_GUEST_PERM 0x1024 +#define ARCH_REQ_XCOMP_GUEST_PERM 0x1025 -#define ARCH_MAP_VDSO_X32 0x2001 -#define ARCH_MAP_VDSO_32 0x2002 -#define ARCH_MAP_VDSO_64 0x2003 +#define ARCH_MAP_VDSO_X32 0x2001 +#define ARCH_MAP_VDSO_32 0x2002 +#define ARCH_MAP_VDSO_64 0x2003 #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 8ea306b1bf8e..ab19b3d8b2f7 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -450,6 +450,8 @@ void fpstate_reset(struct fpu *fpu) fpu->perm.__state_perm = fpu_kernel_cfg.default_features; fpu->perm.__state_size = fpu_kernel_cfg.default_size; fpu->perm.__user_state_size = fpu_user_cfg.default_size; + /* Same defaults for guests */ + fpu->guest_perm = fpu->perm; } static inline void fpu_inherit_perms(struct fpu *dst_fpu) @@ -460,6 +462,7 @@ static inline void fpu_inherit_perms(struct fpu *dst_fpu) spin_lock_irq(¤t->sighand->siglock); /* Fork also inherits the permissions of the parent */ dst_fpu->perm = src_fpu->perm; + dst_fpu->guest_perm = src_fpu->guest_perm; spin_unlock_irq(¤t->sighand->siglock); } } diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index d28829403ed0..84386603482f 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1595,7 +1595,7 @@ static int validate_sigaltstack(unsigned int usize) return 0; } -static int __xstate_request_perm(u64 permitted, u64 requested) +static int __xstate_request_perm(u64 permitted, u64 requested, bool guest) { /* * This deliberately does not exclude !XSAVES as we still might @@ -1605,6 +1605,7 @@ static int __xstate_request_perm(u64 permitted, u64 requested) */ bool compacted = cpu_feature_enabled(X86_FEATURE_XSAVES); struct fpu *fpu = ¤t->group_leader->thread.fpu; + struct fpu_state_perm *perm; unsigned int ksize, usize; u64 mask; int ret; @@ -1621,15 +1622,18 @@ static int __xstate_request_perm(u64 permitted, u64 requested) mask &= XFEATURE_MASK_USER_SUPPORTED; usize = xstate_calculate_size(mask, false); - ret = validate_sigaltstack(usize); - if (ret) - return ret; + if (!guest) { + ret = validate_sigaltstack(usize); + if (ret) + return ret; + } + perm = guest ? &fpu->guest_perm : &fpu->perm; /* Pairs with the READ_ONCE() in xstate_get_group_perm() */ - WRITE_ONCE(fpu->perm.__state_perm, requested); + WRITE_ONCE(perm->__state_perm, requested); /* Protected by sighand lock */ - fpu->perm.__state_size = ksize; - fpu->perm.__user_state_size = usize; + perm->__state_size = ksize; + perm->__user_state_size = usize; return ret; } @@ -1640,7 +1644,7 @@ static const u64 xstate_prctl_req[XFEATURE_MAX] = { [XFEATURE_XTILE_DATA] = XFEATURE_MASK_XTILE_DATA, }; -static int xstate_request_perm(unsigned long idx) +static int xstate_request_perm(unsigned long idx, bool guest) { u64 permitted, requested; int ret; @@ -1661,14 +1665,19 @@ static int xstate_request_perm(unsigned long idx) return -EOPNOTSUPP; /* Lockless quick check */ - permitted = xstate_get_host_group_perm(); + permitted = xstate_get_group_perm(guest); if ((permitted & requested) == requested) return 0; /* Protect against concurrent modifications */ spin_lock_irq(¤t->sighand->siglock); - permitted = xstate_get_host_group_perm(); - ret = __xstate_request_perm(permitted, requested); + permitted = xstate_get_group_perm(guest); + + /* First vCPU allocation locks the permissions. */ + if (guest && (permitted & FPU_GUEST_PERM_LOCKED)) + ret = -EBUSY; + else + ret = __xstate_request_perm(permitted, requested, guest); spin_unlock_irq(¤t->sighand->siglock); return ret; } @@ -1713,12 +1722,18 @@ int xfd_enable_feature(u64 xfd_err) return 0; } #else /* CONFIG_X86_64 */ -static inline int xstate_request_perm(unsigned long idx) +static inline int xstate_request_perm(unsigned long idx, bool guest) { return -EPERM; } #endif /* !CONFIG_X86_64 */ +inline u64 xstate_get_guest_group_perm(void) +{ + return xstate_get_group_perm(true); +} +EXPORT_SYMBOL_GPL(xstate_get_guest_group_perm); + /** * fpu_xstate_prctl - xstate permission operations * @tsk: Redundant pointer to current @@ -1742,6 +1757,7 @@ long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2) u64 __user *uptr = (u64 __user *)arg2; u64 permitted, supported; unsigned long idx = arg2; + bool guest = false; if (tsk != current) return -EPERM; @@ -1760,11 +1776,20 @@ long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2) permitted &= XFEATURE_MASK_USER_SUPPORTED; return put_user(permitted, uptr); + case ARCH_GET_XCOMP_GUEST_PERM: + permitted = xstate_get_guest_group_perm(); + permitted &= XFEATURE_MASK_USER_SUPPORTED; + return put_user(permitted, uptr); + + case ARCH_REQ_XCOMP_GUEST_PERM: + guest = true; + fallthrough; + case ARCH_REQ_XCOMP_PERM: if (!IS_ENABLED(CONFIG_X86_64)) return -EOPNOTSUPP; - return xstate_request_perm(idx); + return xstate_request_perm(idx, guest); default: return -EINVAL; diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 86ea7c0fa2f6..98a472775c97 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -20,10 +20,19 @@ static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask) xsave->header.xcomp_bv = mask | XCOMP_BV_COMPACTED_FORMAT; } -static inline u64 xstate_get_host_group_perm(void) +static inline u64 xstate_get_group_perm(bool guest) { + struct fpu *fpu = ¤t->group_leader->thread.fpu; + struct fpu_state_perm *perm; + /* Pairs with WRITE_ONCE() in xstate_request_perm() */ - return READ_ONCE(current->group_leader->thread.fpu.perm.__state_perm); + perm = guest ? &fpu->guest_perm : &fpu->perm; + return READ_ONCE(perm->__state_perm); +} + +static inline u64 xstate_get_host_group_perm(void) +{ + return xstate_get_group_perm(false); } enum xstate_copy_mode { diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 04143a653a8a..d7bc23589062 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -993,6 +993,8 @@ long do_arch_prctl_common(struct task_struct *task, int option, case ARCH_GET_XCOMP_SUPP: case ARCH_GET_XCOMP_PERM: case ARCH_REQ_XCOMP_PERM: + case ARCH_GET_XCOMP_GUEST_PERM: + case ARCH_REQ_XCOMP_GUEST_PERM: return fpu_xstate_prctl(task, option, arg2); } From patchwork Fri Dec 17 15:29:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685095 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EE97C433F5 for ; Fri, 17 Dec 2021 15:30:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238249AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238205AbhLQPaF (ORCPT ); Fri, 17 Dec 2021 10:30:05 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755005; x=1671291005; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3YkNKzn2lfXyfS619hT0EqppM2U7SVTXNcylXZzTQRs=; b=hy8VOLYMmIuUp58TE1guRdjxeWWP2P1xg1R7xkAcG/dMSOCZkt63k6cV X1MD2DQdcARGQmzppwLCRkRmezYzgndjexgQzuqIvtmfibDC4HPnwzS75 lomoukXAvxfdadhZ0puoHet9BmRDMp6qpss99vGbHQSs6qscTvHskUM0O 0kTHFosfYwGsL07dwFhcVSu0JsELD7ALe89hxZL+YWW+1emLRgoXoT579 mHQmWwYG/FI/mNo4rYFr53bRcQNPo9dFZ++mVcJ2hZkUV4VAabHaRqZT8 02OrPOgtWtBHvH5EvgYzGzw1WGgXDaeYRZrT58brCLBdqNwd7YlFbTxec g==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723429" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723429" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588392" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:03 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 02/23] x86/fpu: Prepare guest FPU for dynamically enabled FPU features Date: Fri, 17 Dec 2021 07:29:42 -0800 Message-Id: <20211217153003.1719189-3-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Thomas Gleixner To support dynamically enabled FPU features for guests prepare the guest pseudo FPU container to keep track of the currently enabled xfeatures and the guest permissions. Signed-off-by: Thomas Gleixner Signed-off-by: Jing Liu --- arch/x86/include/asm/fpu/types.h | 13 +++++++++++++ arch/x86/kernel/fpu/core.c | 26 +++++++++++++++++++++++++- 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index 6ddf80637697..c752d0aa23a4 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -504,6 +504,19 @@ struct fpu { * Guest pseudo FPU container */ struct fpu_guest { + /* + * @xfeatures: xfeature bitmap of features which are + * currently enabled for the guest vCPU. + */ + u64 xfeatures; + + /* + * @perm: xfeature bitmap of features which are + * permitted to be enabled for the guest + * vCPU. + */ + u64 perm; + /* * @fpstate: Pointer to the allocated guest fpstate */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index ab19b3d8b2f7..eddeeb4ed2f5 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -201,6 +201,26 @@ void fpu_reset_from_exception_fixup(void) #if IS_ENABLED(CONFIG_KVM) static void __fpstate_reset(struct fpstate *fpstate); +static void fpu_init_guest_permissions(struct fpu_guest *gfpu) +{ + struct fpu_state_perm *fpuperm; + u64 perm; + + if (!IS_ENABLED(CONFIG_X86_64)) + return; + + spin_lock_irq(¤t->sighand->siglock); + fpuperm = ¤t->group_leader->thread.fpu.guest_perm; + perm = fpuperm->__state_perm; + + /* First fpstate allocation locks down permissions. */ + WRITE_ONCE(fpuperm->__state_perm, perm | FPU_GUEST_PERM_LOCKED); + + spin_unlock_irq(¤t->sighand->siglock); + + gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED; +} + bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) { struct fpstate *fpstate; @@ -216,7 +236,11 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) fpstate->is_valloc = true; fpstate->is_guest = true; - gfpu->fpstate = fpstate; + gfpu->fpstate = fpstate; + gfpu->xfeatures = fpu_user_cfg.default_features; + gfpu->perm = fpu_user_cfg.default_features; + fpu_init_guest_permissions(gfpu); + return true; } EXPORT_SYMBOL_GPL(fpu_alloc_guest_fpstate); From patchwork Fri Dec 17 15:29:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685119 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BC8DC433F5 for ; Fri, 17 Dec 2021 15:30:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238223AbhLQPaG (ORCPT ); Fri, 17 Dec 2021 10:30:06 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238198AbhLQPaE (ORCPT ); Fri, 17 Dec 2021 10:30:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755004; x=1671291004; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/l/lzu8ovCU50omqsIU1bkVibRLaW4dJhOXZM0Wu658=; b=Z9fzAUNgVF8iWv9NQe28VKtzsADmE8y4RcOsKnHCSl+uJ/eIh4h7rRgn c420iY9VUJOG+dg7siBXcKfbAEf+PVb0r/Ashgyx4ltR8kgGCccoW/v1w NfKlmxig3stoqiEHd3/q2FWUA9o9PmA+D4XLoaahh+0pHx6rNcrqborgQ FH5w6q9DwMgcQQI87QMWNxYgTLjD7EzDk5Lz72n+xi54eZ+ex5K/LFJ5v e804NVK6XzXRQA0QUrYyUcE2eeW2JrS7AzwitBbNYx9aPfi38KDx704Mg czxdUt+Sn57v+RSfbeNIUmQRwZGDV/m46d0sde2s6+Dx1/LbkuPkRP7pJ g==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723431" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723431" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588394" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:03 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 03/23] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Date: Fri, 17 Dec 2021 07:29:43 -0800 Message-Id: <20211217153003.1719189-4-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org CPUID.0xD.1.EBX enumerates the size of the XSAVE area (in compacted format) required by XSAVES. If CPUID.0xD.i.ECX[1] is set for a state component (i), this state component should be located on the next 64-bytes boundary following the preceding state component in the compacted layout. Fix xstate_required_size() to follow the alignment rule. AMX is the first state component with 64-bytes alignment to catch this bug. Signed-off-by: Yang Zhong Signed-off-by: Jing Liu --- arch/x86/kvm/cpuid.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 07e9215e911d..148003e26cbb 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -42,7 +42,8 @@ static u32 xstate_required_size(u64 xstate_bv, bool compacted) if (xstate_bv & 0x1) { u32 eax, ebx, ecx, edx, offset; cpuid_count(0xD, feature_bit, &eax, &ebx, &ecx, &edx); - offset = compacted ? ret : ebx; + /* ECX[1]: 64B alignment in compacted form */ + offset = compacted ? ((ecx & 0x2) ? ALIGN(ret, 64) : ret) : ebx; ret = max(ret, offset + eax); } From patchwork Fri Dec 17 15:29:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685093 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84B1DC433FE for ; Fri, 17 Dec 2021 15:30:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238238AbhLQPaH (ORCPT ); Fri, 17 Dec 2021 10:30:07 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238206AbhLQPaF (ORCPT ); Fri, 17 Dec 2021 10:30:05 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755005; x=1671291005; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5YKjhTcj/ohmi9omLNbhnFUXVRZVAw/Npu93/co/edI=; b=hBMxTzlaV5cC3MDogQpD+3ycxudZOyKzsQm5fnFbDJ0YtYSYCASzS/6O c3FpBVeHGXRjegV5+AL0rtHblr/m6dyQ/9Tlxiqqejx+CLdfeHfpSyWGb RDnWiCHdK6U2qeBdoZApPexMN8fuSxxhoph8YLCKbW8CVgDd2MBi3O5A+ 36eW6JllBN0xGnsIFz4NPXWG73cXg/mi5eJR2CgwBYl6/tZ5kx3oNy1NM ZqFAx30YclRB/0PX9g698hZQZ/qEAMk7eEaE3AOyJq4I/q9T3imAlFxLd igqDNuppaVnov6WdmvODh2oVctLwS7RAO8HdjaBSjXcCrPbVaXC5Hvebv Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723434" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723434" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588400" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:04 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 04/23] kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID Date: Fri, 17 Dec 2021 07:29:44 -0800 Message-Id: <20211217153003.1719189-5-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM_GET_SUPPORTED_CPUID should not include any dynamic xstates in CPUID[0xD] if they have not been requested with prctl. Otherwise a process which directly passes KVM_GET_SUPPORTED_CPUID to KVM_SET_CPUID2 would now fail even if it doesn't intend to use a dynamically enabled feature. Userspace must know that prctl is required and allocate >4K xstate buffer before setting any dynamic bit. Suggested-by: Paolo Bonzini Signed-off-by: Jing Liu --- Documentation/virt/kvm/api.rst | 4 ++++ arch/x86/kvm/cpuid.c | 9 ++++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index aeeb071c7688..eb5671ca2dba 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1684,6 +1684,10 @@ userspace capabilities, and with user requirements (for example, the user may wish to constrain cpuid to emulate older hardware, or for feature consistency across a cluster). +Permissions must be set via prctl() for dynamically-enabled XSAVE +features before calling this ioctl. Otherwise those feature bits are +excluded. + Note that certain capabilities, such as KVM_CAP_X86_DISABLE_EXITS, may expose cpuid features (e.g. MONITOR) which are not supported by kvm in its default configuration. If userspace enables such capabilities, it diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 148003e26cbb..4855344091b8 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -812,11 +812,13 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) goto out; } break; - case 0xd: - entry->eax &= supported_xcr0; + case 0xd: { + u64 guest_perm = xstate_get_guest_group_perm(); + + entry->eax &= supported_xcr0 & guest_perm; entry->ebx = xstate_required_size(supported_xcr0, false); entry->ecx = entry->ebx; - entry->edx &= supported_xcr0 >> 32; + entry->edx &= (supported_xcr0 & guest_perm) >> 32; if (!supported_xcr0) break; @@ -863,6 +865,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) entry->edx = 0; } break; + } case 0x12: /* Intel SGX */ if (!kvm_cpu_cap_has(X86_FEATURE_SGX)) { From patchwork Fri Dec 17 15:29:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685135 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23788C433F5 for ; Fri, 17 Dec 2021 15:31:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238701AbhLQPbH (ORCPT ); Fri, 17 Dec 2021 10:31:07 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238192AbhLQPaF (ORCPT ); Fri, 17 Dec 2021 10:30:05 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755005; x=1671291005; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Cq5pl+LFSmydyprw9+uNd5O4QogykkkHwYzeeS4aZ1E=; b=jNVSj84q8mX12SKcv+7MXFPV5r5owUMOwAnqlq81/LwLAAM1jD8Rk5ev +4BimgvJBop9hKAVHpaL4koAInJ0Ylr8ERiOJrsWhh+47RbnmNj2inm+w 6aAKX4G1A0hMewxe9lOL8BwCvmAS/hGbA+MSfZRTImrauBurm7kKSLD/0 6N3+zjZmVR/XKAiy135cU6ZBX3E/UEASDErryPBIuw+X0OSsEIus/ViLW 5tuXPlWY/KQ+3KvfblMyOT+0SyKYHPfDmvJFt/ZHCxF6GdcaGVL7rqn9M cn1NmqK19NZFrBkdHU4XJbaytgWg/JmTKybAbbzb8tetr41NPiH0ewgJF w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723437" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723437" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588404" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:04 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 05/23] kvm: x86: Check permitted dynamic xfeatures at KVM_SET_CPUID2 Date: Fri, 17 Dec 2021 07:29:45 -0800 Message-Id: <20211217153003.1719189-6-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Guest xstate permissions should be set by userspace VMM before vcpu creation. Extend KVM_SET_CPUID2 to verify that every feature reported in CPUID[0xD] has proper permission set. Signed-off-by: Jing Liu --- arch/x86/kvm/cpuid.c | 36 ++++++++++++++++++++++++------------ 1 file changed, 24 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 4855344091b8..a068373a7fbd 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -81,7 +81,9 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find( return NULL; } -static int kvm_check_cpuid(struct kvm_cpuid_entry2 *entries, int nent) +static int kvm_check_cpuid(struct kvm_vcpu *vcpu, + struct kvm_cpuid_entry2 *entries, + int nent) { struct kvm_cpuid_entry2 *best; @@ -97,6 +99,16 @@ static int kvm_check_cpuid(struct kvm_cpuid_entry2 *entries, int nent) return -EINVAL; } + /* Check guest permissions for dynamically-enabled xfeatures */ + best = cpuid_entry2_find(entries, nent, 0xd, 0); + if (best) { + u64 xfeatures; + + xfeatures = best->eax | ((u64)best->edx << 32); + if (xfeatures & ~vcpu->arch.guest_fpu.perm) + return -ENXIO; + } + return 0; } @@ -277,21 +289,21 @@ u64 kvm_vcpu_reserved_gpa_bits_raw(struct kvm_vcpu *vcpu) static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2, int nent) { - int r; + int r; - r = kvm_check_cpuid(e2, nent); - if (r) - return r; + r = kvm_check_cpuid(vcpu, e2, nent); + if (r) + return r; - kvfree(vcpu->arch.cpuid_entries); - vcpu->arch.cpuid_entries = e2; - vcpu->arch.cpuid_nent = nent; + kvfree(vcpu->arch.cpuid_entries); + vcpu->arch.cpuid_entries = e2; + vcpu->arch.cpuid_nent = nent; - kvm_update_kvm_cpuid_base(vcpu); - kvm_update_cpuid_runtime(vcpu); - kvm_vcpu_after_set_cpuid(vcpu); + kvm_update_kvm_cpuid_base(vcpu); + kvm_update_cpuid_runtime(vcpu); + kvm_vcpu_after_set_cpuid(vcpu); - return 0; + return 0; } /* when an old userspace process fills a new kernel module */ From patchwork Fri Dec 17 15:29:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40641C433EF for ; Fri, 17 Dec 2021 15:30:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238268AbhLQPaK (ORCPT ); Fri, 17 Dec 2021 10:30:10 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238213AbhLQPaF (ORCPT ); Fri, 17 Dec 2021 10:30:05 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755005; x=1671291005; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7dXZPVEDoLzyZy2Pl21Dl3pKMjJYI7sq2Em4vBDIQI4=; b=ljOjIuMH7GSEiBc9Y9xaW8M4nRZNDzAq1SQQSzh262ZBoI39Z49xbIkD kfVIxBinc7N/btRS2jUChqPBrPqtNnSoxbiFiR9PxCfwIqcNfkL01Knd3 ixedGFt9/h+hDZpsLJkej98IIKJ/yBeFCX+5FzdjgJ093aFN/tO3wecQE pGlUfilrOmj/JEcmw+xYvMeraVDEXbIG1YfzFAK/7gjUV54foI4xGBn28 cL8n3g3/tT4UlNhaX5XESUiZZpSVQv1o+N12f7/A/9zmBsh9+HpLZTy2P YoWf1TGxvw4is1BwXhSyXrRnnwvIXmlzvLaOhjW8jZBkVndb9p1/w6n4o A==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723439" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723439" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588407" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:04 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 06/23] x86/fpu: Make XFD initialization in __fpstate_reset() a function argument Date: Fri, 17 Dec 2021 07:29:46 -0800 Message-Id: <20211217153003.1719189-7-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org vCPU threads are different from native tasks regarding to the initial XFD value. While all native tasks follow a fixed value (init_fpstate::xfd) established by the FPU core at boot, vCPU threads need to obey the reset value (i.e. ZERO) defined by the specification, to meet the expectation of the guest. Let the caller supply an argument and adjust the host and guest related invocations accordingly. Signed-off-by: Yang Zhong Signed-off-by: Thomas Gleixner Signed-off-by: Jing Liu --- arch/x86/kernel/fpu/core.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index eddeeb4ed2f5..a78bc547fc03 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -199,7 +199,7 @@ void fpu_reset_from_exception_fixup(void) } #if IS_ENABLED(CONFIG_KVM) -static void __fpstate_reset(struct fpstate *fpstate); +static void __fpstate_reset(struct fpstate *fpstate, u64 xfd); static void fpu_init_guest_permissions(struct fpu_guest *gfpu) { @@ -231,7 +231,8 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) if (!fpstate) return false; - __fpstate_reset(fpstate); + /* Leave xfd to 0 (the reset value defined by spec) */ + __fpstate_reset(fpstate, 0); fpstate_init_user(fpstate); fpstate->is_valloc = true; fpstate->is_guest = true; @@ -454,21 +455,21 @@ void fpstate_init_user(struct fpstate *fpstate) fpstate_init_fstate(fpstate); } -static void __fpstate_reset(struct fpstate *fpstate) +static void __fpstate_reset(struct fpstate *fpstate, u64 xfd) { /* Initialize sizes and feature masks */ fpstate->size = fpu_kernel_cfg.default_size; fpstate->user_size = fpu_user_cfg.default_size; fpstate->xfeatures = fpu_kernel_cfg.default_features; fpstate->user_xfeatures = fpu_user_cfg.default_features; - fpstate->xfd = init_fpstate.xfd; + fpstate->xfd = xfd; } void fpstate_reset(struct fpu *fpu) { /* Set the fpstate pointer to the default fpstate */ fpu->fpstate = &fpu->__fpstate; - __fpstate_reset(fpu->fpstate); + __fpstate_reset(fpu->fpstate, init_fpstate.xfd); /* Initialize the permission related info in fpu */ fpu->perm.__state_perm = fpu_kernel_cfg.default_features; From patchwork Fri Dec 17 15:29:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2EF29C4332F for ; Fri, 17 Dec 2021 15:30:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238276AbhLQPaL (ORCPT ); Fri, 17 Dec 2021 10:30:11 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238212AbhLQPaG (ORCPT ); Fri, 17 Dec 2021 10:30:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755006; x=1671291006; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zp4aaec9LK9Wtx++K81iowo8S7Uo1gqNfud0OEHDekM=; b=AsGp91K/IVsHaccjQhXHgJObsLjPEspCYhvpWeGr3eeqqVZVd+R94vIE RuIskQXvGW2UrC8O4ITkc7pIWCy/OBp/XjoCKL7z0ILNvubsdCH+/iUZb XFY7j1q+pE2TZ//BfRom440BMbMLYO331yRSIzy4QwMe40p/SMMfWVEDt 277+eymIUyVixKS61BC4oT6CwsD5A2D2cEUdLa3WLewxPJMNmRmomX4hP 7sfYyYfK5tjo9QOLDLat7XcLWQRPklTWzED7a8q5b7BKOZQgoHSv0OM13 X2WWE1phpF821/M6e+8jac7WgeD6mfaZ5JYYE3j5Hbd4PPv4k+DpDZfhW A==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723442" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723442" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588411" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:04 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 07/23] x86/fpu: Add guest support to xfd_enable_feature() Date: Fri, 17 Dec 2021 07:29:47 -0800 Message-Id: <20211217153003.1719189-8-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Thomas Gleixner Guest support for dynamically enabled FPU features requires a few modifications to the enablement function which is currently invoked from the #NM handler: 1) Use guest permissions and sizes for the update 2) Update fpu_guest state accordingly 3) Take into account that the enabling can be triggered either from a running guest via XSETBV and MSR_IA32_XFD write emulation or from a guest restore. In the latter case the guests fpstate is not the current tasks active fpstate. Split the function and implement the guest mechanics throughout the callchain. [ Jing: replace xchg with direct assignment in fpstate_realloc() ] Signed-off-by: Thomas Gleixner Signed-off-by: Jing Liu --- arch/x86/kernel/fpu/xstate.c | 93 +++++++++++++++++++++--------------- arch/x86/kernel/fpu/xstate.h | 2 + 2 files changed, 56 insertions(+), 39 deletions(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 84386603482f..b0925051b0b6 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1499,29 +1499,6 @@ void fpstate_free(struct fpu *fpu) vfree(fpu->fpstate); } -/** - * fpu_install_fpstate - Update the active fpstate in the FPU - * - * @fpu: A struct fpu * pointer - * @newfps: A struct fpstate * pointer - * - * Returns: A null pointer if the last active fpstate is the embedded - * one or the new fpstate is already installed; - * otherwise, a pointer to the old fpstate which has to - * be freed by the caller. - */ -static struct fpstate *fpu_install_fpstate(struct fpu *fpu, - struct fpstate *newfps) -{ - struct fpstate *oldfps = fpu->fpstate; - - if (fpu->fpstate == newfps) - return NULL; - - fpu->fpstate = newfps; - return oldfps != &fpu->__fpstate ? oldfps : NULL; -} - /** * fpstate_realloc - Reallocate struct fpstate for the requested new features * @@ -1529,6 +1506,7 @@ static struct fpstate *fpu_install_fpstate(struct fpu *fpu, * of that task * @ksize: The required size for the kernel buffer * @usize: The required size for user space buffers + * @guest_fpu: Pointer to a guest FPU container. NULL for host allocations * * Note vs. vmalloc(): If the task with a vzalloc()-allocated buffer * terminates quickly, vfree()-induced IPIs may be a concern, but tasks @@ -1537,13 +1515,13 @@ static struct fpstate *fpu_install_fpstate(struct fpu *fpu, * Returns: 0 on success, -ENOMEM on allocation error. */ static int fpstate_realloc(u64 xfeatures, unsigned int ksize, - unsigned int usize) + unsigned int usize, struct fpu_guest *guest_fpu) { struct fpu *fpu = ¤t->thread.fpu; struct fpstate *curfps, *newfps = NULL; unsigned int fpsize; + bool in_use; - curfps = fpu->fpstate; fpsize = ksize + ALIGN(offsetof(struct fpstate, regs), 64); newfps = vzalloc(fpsize); @@ -1553,28 +1531,55 @@ static int fpstate_realloc(u64 xfeatures, unsigned int ksize, newfps->user_size = usize; newfps->is_valloc = true; + /* + * When a guest FPU is supplied, use @guest_fpu->fpstate + * as reference independent whether it is in use or not. + */ + curfps = guest_fpu ? guest_fpu->fpstate : fpu->fpstate; + + /* Determine whether @curfps is the active fpstate */ + in_use = fpu->fpstate == curfps; + + if (guest_fpu) { + newfps->is_guest = true; + newfps->is_confidential = curfps->is_confidential; + newfps->in_use = curfps->in_use; + guest_fpu->xfeatures |= xfeatures; + } + fpregs_lock(); /* - * Ensure that the current state is in the registers before - * swapping fpstate as that might invalidate it due to layout - * changes. + * If @curfps is in use, ensure that the current state is in the + * registers before swapping fpstate as that might invalidate it + * due to layout changes. */ - if (test_thread_flag(TIF_NEED_FPU_LOAD)) + if (in_use && test_thread_flag(TIF_NEED_FPU_LOAD)) fpregs_restore_userregs(); newfps->xfeatures = curfps->xfeatures | xfeatures; newfps->user_xfeatures = curfps->user_xfeatures | xfeatures; newfps->xfd = curfps->xfd & ~xfeatures; - curfps = fpu_install_fpstate(fpu, newfps); - /* Do the final updates within the locked region */ xstate_init_xcomp_bv(&newfps->regs.xsave, newfps->xfeatures); - xfd_update_state(newfps); + if (guest_fpu) { + guest_fpu->fpstate = newfps; + /* If curfps is active, update the FPU fpstate pointer */ + if (in_use) + fpu->fpstate = newfps; + } else { + fpu->fpstate = newfps; + } + + if (in_use) + xfd_update_state(fpu->fpstate); fpregs_unlock(); - vfree(curfps); + /* Only free valloc'ed state */ + if (curfps && curfps->is_valloc) + vfree(curfps); + return 0; } @@ -1682,14 +1687,16 @@ static int xstate_request_perm(unsigned long idx, bool guest) return ret; } -int xfd_enable_feature(u64 xfd_err) +int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu) { u64 xfd_event = xfd_err & XFEATURE_MASK_USER_DYNAMIC; + struct fpu_state_perm *perm; unsigned int ksize, usize; struct fpu *fpu; if (!xfd_event) { - pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err); + if (!guest_fpu) + pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err); return 0; } @@ -1697,14 +1704,16 @@ int xfd_enable_feature(u64 xfd_err) spin_lock_irq(¤t->sighand->siglock); /* If not permitted let it die */ - if ((xstate_get_host_group_perm() & xfd_event) != xfd_event) { + if ((xstate_get_group_perm(!!guest_fpu) & xfd_event) != xfd_event) { spin_unlock_irq(¤t->sighand->siglock); return -EPERM; } fpu = ¤t->group_leader->thread.fpu; - ksize = fpu->perm.__state_size; - usize = fpu->perm.__user_state_size; + perm = guest_fpu ? &fpu->guest_perm : &fpu->perm; + ksize = perm->__state_size; + usize = perm->__user_state_size; + /* * The feature is permitted. State size is sufficient. Dropping * the lock is safe here even if more features are added from @@ -1717,10 +1726,16 @@ int xfd_enable_feature(u64 xfd_err) * Try to allocate a new fpstate. If that fails there is no way * out. */ - if (fpstate_realloc(xfd_event, ksize, usize)) + if (fpstate_realloc(xfd_event, ksize, usize, guest_fpu)) return -EFAULT; return 0; } + +int xfd_enable_feature(u64 xfd_err) +{ + return __xfd_enable_feature(xfd_err, NULL); +} + #else /* CONFIG_X86_64 */ static inline int xstate_request_perm(unsigned long idx, bool guest) { diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 98a472775c97..3e63f723525b 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -55,6 +55,8 @@ extern void fpu__init_system_xstate(unsigned int legacy_size); extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr); +extern int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu); + static inline u64 xfeatures_mask_supervisor(void) { return fpu_kernel_cfg.max_features & XFEATURE_MASK_SUPERVISOR_SUPPORTED; From patchwork Fri Dec 17 15:29:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685137 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C792DC433FE for ; Fri, 17 Dec 2021 15:31:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238713AbhLQPbI (ORCPT ); Fri, 17 Dec 2021 10:31:08 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238219AbhLQPaG (ORCPT ); Fri, 17 Dec 2021 10:30:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755006; x=1671291006; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4zf2XhN+pJn1URvKvn8kauagNWE4KhwPv1N796bD7S0=; b=JNWavPRqJUGcczsm+7zUrOHG94BDvrF1TAzu1RIIQBC0D+ne0vhfQrgA PHF4rjCSq/8+UBbMLxqy83OcN+q6QcBK0ilACs1g2/5le4Blidy7JNAC+ 9z65vP5ppkTtQznV03njtyc7LrUYTd8RvEBG76Qh75adl3HXn+NxOf2RC W6wPjsCcUTWbu5pF+GuRmyPRJg+v4N5xw8JskY+JygRwGT3iklqj6H2eo gDXReYXipGq22vIfJFbg+QRbxzBoRfIVK10Ni4js0aQFeCgyaCJZEvlc7 1YHNriMFWNsvH6L7W6YZKVwVvEgqamOxyrfjPHV5blc8JVgNpODRfB+Nm g==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723445" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723445" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588414" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:04 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 08/23] x86/fpu: Provide fpu_update_guest_perm_features() for guest Date: Fri, 17 Dec 2021 07:29:48 -0800 Message-Id: <20211217153003.1719189-9-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Kevin Tian KVM can require fpstate expansion in two approaches: 1) Dynamic expansion when intercepting guest updates to XCR0 and XFD MSR; 2) Static expansion e.g. at KVM_SET_CPUID2; The first option doesn't waste memory for legacy guest if it doesn't support XFD. However doing so introduces more complexity and also imposes an order requirement in the restoring path, i.e. XCR0/XFD must be restored before XSTATE. Given that the agreement is to do the static approach. This is considered a better tradeoff though it does waste 8K memory for legacy guest if its CPUID includes XFD features. Provide a wrapper to allow expanding the fpstate buffer to what guest perm allows. Once completed, both emulation and restore path don't need to worry about the buffer size. Signed-off-by: Kevin Tian Signed-off-by: Jing Liu Reviewed-by: Thomas Gleixner --- arch/x86/include/asm/fpu/api.h | 1 + arch/x86/kernel/fpu/core.c | 27 +++++++++++++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index d8c222290e68..8e934e571273 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -138,6 +138,7 @@ extern inline u64 xstate_get_guest_group_perm(void); extern bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu); extern void fpu_free_guest_fpstate(struct fpu_guest *gfpu); extern int fpu_swap_kvm_fpstate(struct fpu_guest *gfpu, bool enter_guest); +extern int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu); extern void fpu_copy_guest_fpstate_to_uabi(struct fpu_guest *gfpu, void *buf, unsigned int size, u32 pkru); extern int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf, u64 xcr0, u32 *vpkru); diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index a78bc547fc03..2560a95980aa 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -261,6 +261,33 @@ void fpu_free_guest_fpstate(struct fpu_guest *gfpu) } EXPORT_SYMBOL_GPL(fpu_free_guest_fpstate); +/* + * fpu_update_guest_perm_features - Enable xfeatures according to guest perm + * @guest_fpu: Pointer to the guest FPU container + * + * Enable all dynamic xfeatures according to guest perm. Invoked if the + * caller wants to conservatively expand fpstate buffer instead of waiting + * until XCR0 or XFD MSR is written. + * + * Return: 0 on success, error code otherwise + */ +int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu) +{ + u64 expand; + + lockdep_assert_preemption_enabled(); + + if (!IS_ENABLED(CONFIG_X86_64)) + return 0; + + expand = guest_fpu->perm & ~guest_fpu->xfeatures; + if (!expand) + return 0; + + return __xfd_enable_feature(expand, guest_fpu); +} +EXPORT_SYMBOL_GPL(fpu_update_guest_perm_features); + int fpu_swap_kvm_fpstate(struct fpu_guest *guest_fpu, bool enter_guest) { struct fpstate *guest_fps = guest_fpu->fpstate; From patchwork Fri Dec 17 15:29:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685101 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 567ABC433EF for ; Fri, 17 Dec 2021 15:30:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238291AbhLQPaM (ORCPT ); Fri, 17 Dec 2021 10:30:12 -0500 Received: from mga03.intel.com ([134.134.136.65]:10852 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238222AbhLQPaG (ORCPT ); Fri, 17 Dec 2021 10:30:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755006; x=1671291006; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Oudfdu6myNVjvoNfDhIoilt+uJCzFGHVxT+rvDPYbcw=; b=j1iwBUDiQL47pNPKRKxemujDYKX8ixqVtUDU/muQG1MQOwMANr/JzB25 jRf4GSlYXJapWKWqGdeBL/4ECdZy/wM6jQIzwuYNMxudU+I1pQRQtgPgB LxdJawtyl7EklzexfnzpkDxZLBm6tFNQItu6g4KVYphlhUGxEKsWR1ln2 b86/AIL9vk8lrgS7AFY4wFOWwnHFnqURaa9tZCUNnQ7h3EoSBbU69Xy6/ Vz7Dxer6if224LFqLbe7y9VWvsBeqtEJag0Bk7fhApmupc/q7Nrzuig3v 97yOlOfs7e4Rxl9C3S97JBY35N1N7W0obpIP/i1rliyXqdtLdyDLABbOg g==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723447" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723447" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588422" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:04 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 09/23] kvm: x86: Enable dynamic XSAVE features at KVM_SET_CPUID2 Date: Fri, 17 Dec 2021 07:29:49 -0800 Message-Id: <20211217153003.1719189-10-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Statically enable all xfeatures allowed by guest perm in KVM_SET_CPUID2, with fpstate buffer sized accordingly. This avoids run-time expansion in the emulation and restore path of XCR0 and XFD MSR [1]. Change kvm_vcpu_after_set_cpuid() to return error given fpstate reallocation may fail. [1] https://lore.kernel.org/all/20211214024948.048572883@linutronix.de/ Suggested-by: Thomas Gleixner Suggested-by: Paolo Bonzini Signed-off-by: Jing Liu --- arch/x86/kvm/cpuid.c | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index a068373a7fbd..eb5a5070accb 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -204,10 +204,12 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime); -static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) +static int kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) { struct kvm_lapic *apic = vcpu->arch.apic; struct kvm_cpuid_entry2 *best; + u64 xfeatures; + int r; best = kvm_find_cpuid_entry(vcpu, 1, 0); if (best && apic) { @@ -222,9 +224,17 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) best = kvm_find_cpuid_entry(vcpu, 0xD, 0); if (!best) vcpu->arch.guest_supported_xcr0 = 0; - else - vcpu->arch.guest_supported_xcr0 = - (best->eax | ((u64)best->edx << 32)) & supported_xcr0; + else { + xfeatures = best->eax | ((u64)best->edx << 32); + + vcpu->arch.guest_supported_xcr0 = xfeatures & supported_xcr0; + + if (xfeatures != vcpu->arch.guest_fpu.xfeatures) { + r = fpu_update_guest_perm_features(&vcpu->arch.guest_fpu); + if (r) + return r; + } + } /* * Bits 127:0 of the allowed SECS.ATTRIBUTES (CPUID.0x12.0x1) enumerate @@ -260,6 +270,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) * adjustments to the reserved GPA bits. */ kvm_mmu_after_set_cpuid(vcpu); + + return 0; } int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu) @@ -301,9 +313,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2, kvm_update_kvm_cpuid_base(vcpu); kvm_update_cpuid_runtime(vcpu); - kvm_vcpu_after_set_cpuid(vcpu); - - return 0; + return kvm_vcpu_after_set_cpuid(vcpu); } /* when an old userspace process fills a new kernel module */ From patchwork Fri Dec 17 15:29:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685103 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA449C433F5 for ; Fri, 17 Dec 2021 15:30:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238305AbhLQPaM (ORCPT ); Fri, 17 Dec 2021 10:30:12 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238226AbhLQPaH (ORCPT ); Fri, 17 Dec 2021 10:30:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755007; x=1671291007; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=k8gT9oB/5YPUMem/eqQOfbKkI8IytJEMqRnBB5pWAb0=; b=kTyPyaCea+JVUjiwq+IWLSBESXvDm0SSHiLesF2Ohk2GkdLYh+ddt0No U2AMQQeDoKy44/cpjFtKabpEm7FUfattn/N/KjND+je4SCT9bBFMzEnUs qf2DqvnvWKrhNmhBoFoHpnLSV7G+RvOCozkWtb62Xiz8YogJMcRJwA0y9 Jh82NYQU6xgZMnWPvzBZudvMAoX6y3vKjtMDoGudSasqSOPOiVk3JazUS BLcvf5uN8cHsDGOiYGodpjwVYIB3Dr4Jaso0NURBrGG/PkqTUFYv2lW6p xcu8jOstr5mHAlG3Jn2UZKysZF6wfYebdp+8zWw1OjXLQJGd/mgKDhEwQ Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723449" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723449" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588428" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 10/23] x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation Date: Fri, 17 Dec 2021 07:29:50 -0800 Message-Id: <20211217153003.1719189-11-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Kevin Tian Guest XFD can be updated either in the emulation path or in the restore path. Provide a wrapper to update guest_fpu::fpstate::xfd. If the guest fpstate is currently in-use, also update the per-cpu xfd cache and the actual MSR. Signed-off-by: Kevin Tian Signed-off-by: Jing Liu --- arch/x86/include/asm/fpu/api.h | 6 ++++++ arch/x86/kernel/fpu/core.c | 12 ++++++++++++ 2 files changed, 18 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 8e934e571273..100b535cd3de 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -140,6 +140,12 @@ extern void fpu_free_guest_fpstate(struct fpu_guest *gfpu); extern int fpu_swap_kvm_fpstate(struct fpu_guest *gfpu, bool enter_guest); extern int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu); +#ifdef CONFIG_X86_64 +extern void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd); +#else +static inline void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd) { } +#endif + extern void fpu_copy_guest_fpstate_to_uabi(struct fpu_guest *gfpu, void *buf, unsigned int size, u32 pkru); extern int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf, u64 xcr0, u32 *vpkru); diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 2560a95980aa..3daea097c618 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -288,6 +288,18 @@ int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu) } EXPORT_SYMBOL_GPL(fpu_update_guest_perm_features); +#ifdef CONFIG_X86_64 +void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd) +{ + fpregs_lock(); + guest_fpu->fpstate->xfd = xfd; + if (guest_fpu->fpstate->in_use) + xfd_update_state(guest_fpu->fpstate); + fpregs_unlock(); +} +EXPORT_SYMBOL_GPL(fpu_update_guest_xfd); +#endif /* CONFIG_X86_64 */ + int fpu_swap_kvm_fpstate(struct fpu_guest *guest_fpu, bool enter_guest) { struct fpstate *guest_fps = guest_fpu->fpstate; From patchwork Fri Dec 17 15:29:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685117 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D880BC433EF for ; Fri, 17 Dec 2021 15:30:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238271AbhLQPaN (ORCPT ); Fri, 17 Dec 2021 10:30:13 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238234AbhLQPaH (ORCPT ); Fri, 17 Dec 2021 10:30:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755007; x=1671291007; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Lc+hkJ6Zcnttittpo56Xg8lXuQMR5lxzOCxmw+1Adsc=; b=EMQu7+wC9Z6cMw1oDTgIGfL3cIC4f00e/CKKy53xkWkjJ/scu1/OC5vU EAERjPSV1gLojSW2btugqljLN5UClq2Tw/EYlWz8LTj9hBVW19hlk8voI 4SGi55UmAv4Or4WhQ+cfN9L75RBP87AMb+A9iNfp/9I73q7wN9xdaUPud by5IJwcrt6K3bvNW6m5eDtc0OsQSK8yjv0NuYLdXa3mbR0KUmd3u8aRdx ql5w/zdDIcd9Vn9HkoihnzCAvAKnBsu0gZgBKwF1rzsSMbWl/k4fbmxzD qNOMMiY0Qn0pCvVMOybcQcYLDaqM5whkZqRNAftJMRNGX+sHwpcXWTZ89 w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723451" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723451" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588431" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 11/23] kvm: x86: Add emulation for IA32_XFD Date: Fri, 17 Dec 2021 07:29:51 -0800 Message-Id: <20211217153003.1719189-12-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Intel's eXtended Feature Disable (XFD) feature allows the software to dynamically adjust fpstate buffer size for XSAVE features which have large state. Because fpstate has been expanded for all possible dynamic xstates at KVM_SET_CPUID2, emulation of the IA32_XFD MSR is straightforward. For write just call fpu_update_guest_xfd() to update the guest fpu container once all the sanity checks are passed. For read then return the cached value in the container. Signed-off-by: Zeng Guang Signed-off-by: Wei Wang Signed-off-by: Jing Liu --- arch/x86/kvm/x86.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e0aa4dd53c7f..a274146ef439 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1358,6 +1358,7 @@ static const u32 msrs_to_save_all[] = { MSR_F15H_PERF_CTL3, MSR_F15H_PERF_CTL4, MSR_F15H_PERF_CTL5, MSR_F15H_PERF_CTR0, MSR_F15H_PERF_CTR1, MSR_F15H_PERF_CTR2, MSR_F15H_PERF_CTR3, MSR_F15H_PERF_CTR4, MSR_F15H_PERF_CTR5, + MSR_IA32_XFD, }; static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)]; @@ -3668,6 +3669,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; vcpu->arch.msr_misc_features_enables = data; break; +#ifdef CONFIG_X86_64 + case MSR_IA32_XFD: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_XFD)) + return 1; + + if (data & ~(XFEATURE_MASK_USER_DYNAMIC & + vcpu->arch.guest_supported_xcr0)) + return 1; + + fpu_update_guest_xfd(&vcpu->arch.guest_fpu, data); + break; +#endif default: if (kvm_pmu_is_valid_msr(vcpu, msr)) return kvm_pmu_set_msr(vcpu, msr_info); @@ -3988,6 +4002,15 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_K7_HWCR: msr_info->data = vcpu->arch.msr_hwcr; break; +#ifdef CONFIG_X86_64 + case MSR_IA32_XFD: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_XFD)) + return 1; + + msr_info->data = vcpu->arch.guest_fpu.fpstate->xfd; + break; +#endif default: if (kvm_pmu_is_valid_msr(vcpu, msr_info->index)) return kvm_pmu_get_msr(vcpu, msr_info); @@ -6421,6 +6444,10 @@ static void kvm_init_msr_list(void) min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp)) continue; break; + case MSR_IA32_XFD: + if (!kvm_cpu_cap_has(X86_FEATURE_XFD)) + continue; + break; default: break; } From patchwork Fri Dec 17 15:29:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685133 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CABB2C433EF for ; Fri, 17 Dec 2021 15:31:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238672AbhLQPbF (ORCPT ); Fri, 17 Dec 2021 10:31:05 -0500 Received: from mga03.intel.com ([134.134.136.65]:10852 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238236AbhLQPaH (ORCPT ); Fri, 17 Dec 2021 10:30:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755007; x=1671291007; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PXq6/YEFmL7fuafodWXOL+G8jefqhvh1JXMT3eAUNEI=; b=Unh9mg7UQwcm4rChbIy/dYdjANXCjOB8X3bBRTezdhZsHVE7iAQjc6Vy Bps/zE9EyD73NTKkOLeQqCl/wkJ7pyyG8JlYj63KqT7bgVQ51/qRU1oG3 jYJrhUTkW3xVDCVVYmha4TrZPm+ErlborwGKl7B3kIzlDdBdqtezx7xEX ff8wwqycf+RYFNtrQAsDqqpq8/rx9EXs/LUSUEdM6DzYCcPmL7qZn6rh5 YQKyE+oigSIauMRvf8vdt8gpuelqIlnwTergA5Di8VfIrRF9OXigeqhsE dq1DiICD/kAtYBRvQOa0Cqu2+LwX9FMBD/F+44q2QugYoMHeFiDqRF8QM w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723453" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723453" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588435" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 12/23] x86/fpu: Prepare xfd_err in struct fpu_guest Date: Fri, 17 Dec 2021 07:29:52 -0800 Message-Id: <20211217153003.1719189-13-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When XFD causes an instruction to generate #NM, IA32_XFD_ERR contains information about which disabled state components are being accessed. The #NM handler is expected to check this information and then enable the state components by clearing IA32_XFD for the faulting task (if having permission). If the XFD_ERR value generated in guest is consumed/clobbered by the host before the guest itself doing so. This may lead to non-XFD-related #NM treated as XFD #NM in host (due to non-zero value in XFD_ERR), or XFD-related #NM treated as non-XFD #NM in guest (XFD_ERR cleared by the host #NM handler). Introduce a new field in fpu_guest to save the guest xfd_err value. KVM is expected to save guest xfd_err before preemption is enabled and restore it right before entering the guest (with preemption disabled). Signed-off-by: Kevin Tian Signed-off-by: Jing Liu --- arch/x86/include/asm/fpu/types.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index c752d0aa23a4..3795d0573773 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -517,6 +517,11 @@ struct fpu_guest { */ u64 perm; + /* + * @xfd_err: Save the guest value. + */ + u64 xfd_err; + /* * @fpstate: Pointer to the allocated guest fpstate */ From patchwork Fri Dec 17 15:29:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685129 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D409C433EF for ; Fri, 17 Dec 2021 15:31:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238596AbhLQPa6 (ORCPT ); Fri, 17 Dec 2021 10:30:58 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238210AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UAli8J8lgVIdRC1P6KuS5fK9r8AqDifhErcOIH4VyWU=; b=PQRpxPJ5b2IKl791mTuEcNKPsJpKVrfM3D3aD+v6bEP1g46TRZJqOUMS 3/VDxE2csLBLb3khL/lwF88bZe9srs+Cl1BuRMyK9f1e2+et8dS/Ok5ss Uoxx/Vq+KcVW68c/zSA6diz5tSat5HpkeJbOXDnzuCDRgL0pthpVRXQfd Eszdhp0edDpV2lWmhPZkM7h+zvJ4fLJzYmHo6v1XDgr29pF4TINXSSla6 bQKdpFTZfBH0NAHlgZH4fR/D7clUrhYT9ay6BeLMOhVLPvG0OANAKTcQ2 CLsgljs/XCB5R0uSle7W3nB0l0B1Ur/zTfGg/6OutJ8PsjVYFvo4eF02g A==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723455" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723455" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588438" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 13/23] kvm: x86: Intercept #NM for saving IA32_XFD_ERR Date: Fri, 17 Dec 2021 07:29:53 -0800 Message-Id: <20211217153003.1719189-14-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Guest IA32_XFD_ERR is generally modified in two places: - Set by CPU when #NM is triggered; - Cleared by guest in its #NM handler; Intercept #NM for the first case, if guest CPUID includes any dynamic xfeature. #NM is rare if the guest doesn't use dynamic features. Otherwise, there is at most one exception per guest task given a dynamic feature. Save the current XFD_ERR value to the guest_fpu container in the #NM VM-exit handler. This must be done with interrupt/preemption disabled, otherwise the unsaved MSR value may be clobbered by host operations. Inject a virtual #NM to the guest after saving the MSR value. Restore the host value (always ZERO outside of the host #NM handler) before enabling preemption. Restore the guest value from the guest_fpu container right before entering the guest (with preemption disabled). Suggested-by: Thomas Gleixner Signed-off-by: Jing Liu --- TODO: Investigate delaying #NM interception until guest sets a dynamic feature in XCR0. arch/x86/kvm/vmx/vmcs.h | 5 +++++ arch/x86/kvm/vmx/vmx.c | 15 ++++++++++++++- arch/x86/kvm/x86.c | 6 ++++++ 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h index 6e5de2e2b0da..c57798b56f95 100644 --- a/arch/x86/kvm/vmx/vmcs.h +++ b/arch/x86/kvm/vmx/vmcs.h @@ -129,6 +129,11 @@ static inline bool is_machine_check(u32 intr_info) return is_exception_n(intr_info, MC_VECTOR); } +static inline bool is_nm(u32 intr_info) +{ + return is_exception_n(intr_info, NM_VECTOR); +} + /* Undocumented: icebp/int1 */ static inline bool is_icebp(u32 intr_info) { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9453743ce0c4..483075045253 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -36,6 +36,7 @@ #include #include #include +#include #include #include #include @@ -763,6 +764,9 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, match); } + if (vcpu->arch.guest_supported_xcr0 & XFEATURE_MASK_USER_DYNAMIC) + eb |= (1u << NM_VECTOR); + vmcs_write32(EXCEPTION_BITMAP, eb); } @@ -4750,7 +4754,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) vect_info = vmx->idt_vectoring_info; intr_info = vmx_get_intr_info(vcpu); - if (is_machine_check(intr_info) || is_nmi(intr_info)) + if (is_machine_check(intr_info) || is_nmi(intr_info) || is_nm(intr_info)) return 1; /* handled by handle_exception_nmi_irqoff() */ if (is_invalid_opcode(intr_info)) @@ -6338,6 +6342,12 @@ static void handle_interrupt_nmi_irqoff(struct kvm_vcpu *vcpu, kvm_after_interrupt(vcpu); } +static void handle_exception_nm(struct kvm_vcpu *vcpu) +{ + rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); + kvm_queue_exception(vcpu, NM_VECTOR); +} + static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx) { const unsigned long nmi_entry = (unsigned long)asm_exc_nmi_noist; @@ -6346,6 +6356,9 @@ static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx) /* if exit due to PF check for async PF */ if (is_page_fault(intr_info)) vmx->vcpu.arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags(); + /* if exit due to NM, handle before preemptions are enabled */ + else if (is_nm(intr_info)) + handle_exception_nm(&vmx->vcpu); /* Handle machine checks before interrupts are enabled */ else if (is_machine_check(intr_info)) kvm_machine_check(); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a274146ef439..e528085030b3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9894,6 +9894,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (test_thread_flag(TIF_NEED_FPU_LOAD)) switch_fpu_return(); + if (vcpu->arch.guest_fpu.xfd_err) + wrmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); + if (unlikely(vcpu->arch.switch_db_regs)) { set_debugreg(0, 7); set_debugreg(vcpu->arch.eff_db[0], 0); @@ -9957,6 +9960,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) static_call(kvm_x86_handle_exit_irqoff)(vcpu); + if (vcpu->arch.guest_fpu.xfd_err) + wrmsrl(MSR_IA32_XFD_ERR, 0); + /* * Consume any pending interrupts, including the possible source of * VM-Exit on SVM and any ticks that occur between VM-Exit and now. From patchwork Fri Dec 17 15:29:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685131 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D010C433F5 for ; Fri, 17 Dec 2021 15:31:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238657AbhLQPbD (ORCPT ); Fri, 17 Dec 2021 10:31:03 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238239AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2nS5cBvFGjvxlgG+SqL2JDRrtHBJKVl2zUPt7R6ZOgY=; b=VM25fRrKDPy3fX8CWec1u1UIlv6MYVCchtjvzpU65gCk7JF7uHASe1bM qTfldoIYpCtpQ222K9VOtNBgcuVQl1lue/E48tloTjHbLwYdvx3oN2mkL POiFic6c3CGpOv2MzNNSQZfCEuxaZ7HV/M0Z0nU5fuMhWYFYhm5EuhGPG jvuKznIZDEolecOw9DC5R8yzw603hgwl7zFyOC8rh2rh9yohoeEyxOvoG YXCfx3Ioly69Yg7AURMaxsRjJAO5QFWW1f5pRez8J+Ma9kQh+fL2PenxE vrbokC4wCMVD8fWq1apkZxMMPh+VgQRxsNAq+VyLrrXVsM2lw2S/powdh w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723458" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723458" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:05 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588443" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 14/23] kvm: x86: Emulate IA32_XFD_ERR for guest Date: Fri, 17 Dec 2021 07:29:54 -0800 Message-Id: <20211217153003.1719189-15-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Emulate read/write to IA32_XFD_ERR MSR. Only the saved value in the guest_fpu container is touched in the emulation handler. Actual MSR update is handled right before entering the guest (with preemption disabled) Signed-off-by: Zeng Guang Signed-off-by: Wei Wang Signed-off-by: Jing Liu --- arch/x86/kvm/x86.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e528085030b3..15b093c58b3d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1358,7 +1358,7 @@ static const u32 msrs_to_save_all[] = { MSR_F15H_PERF_CTL3, MSR_F15H_PERF_CTL4, MSR_F15H_PERF_CTL5, MSR_F15H_PERF_CTR0, MSR_F15H_PERF_CTR1, MSR_F15H_PERF_CTR2, MSR_F15H_PERF_CTR3, MSR_F15H_PERF_CTR4, MSR_F15H_PERF_CTR5, - MSR_IA32_XFD, + MSR_IA32_XFD, MSR_IA32_XFD_ERR, }; static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)]; @@ -3681,6 +3681,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) fpu_update_guest_xfd(&vcpu->arch.guest_fpu, data); break; + case MSR_IA32_XFD_ERR: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_XFD)) + return 1; + + if (data & ~(XFEATURE_MASK_USER_DYNAMIC & + vcpu->arch.guest_supported_xcr0)) + return 1; + + vcpu->arch.guest_fpu.xfd_err = data; + break; #endif default: if (kvm_pmu_is_valid_msr(vcpu, msr)) @@ -4010,6 +4021,13 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = vcpu->arch.guest_fpu.fpstate->xfd; break; + case MSR_IA32_XFD_ERR: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_XFD)) + return 1; + + msr_info->data = vcpu->arch.guest_fpu.xfd_err; + break; #endif default: if (kvm_pmu_is_valid_msr(vcpu, msr_info->index)) @@ -6445,6 +6463,7 @@ static void kvm_init_msr_list(void) continue; break; case MSR_IA32_XFD: + case MSR_IA32_XFD_ERR: if (!kvm_cpu_cap_has(X86_FEATURE_XFD)) continue; break; From patchwork Fri Dec 17 15:29:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685115 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99405C433F5 for ; Fri, 17 Dec 2021 15:30:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238322AbhLQPaO (ORCPT ); Fri, 17 Dec 2021 10:30:14 -0500 Received: from mga03.intel.com ([134.134.136.65]:10852 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238196AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YW4AxwndnvUQjvMrZxczR1u/34DqMkxUNjj5YnnNfFo=; b=SpobS4rjzrvoeH08FbfXnL4Y8OGrHttJdAv7lyH8ZcoyOCL6+78PY+m2 iVmUvR8SphGQeIhc09RA7CVfcqXblcNrfFtoBQQvI3pBwEDqXL6oE75sl gL1EMQBVLnjopTuuAZLZs38y2e4rOuVC0kVH8YWqWIwi+kjTpS+2HDm7E mye+yttn2AxTsPhAmA0NnOBi8JrZ7EV6Keg9tQ0XkXpSkZa18sHG2Kq+5 xKfZ0mCSHFoLEMNU0xnX8oPJfx+MQNfNGL20emkEfkGlh5LwH6Gq0QWGW Ip0nwbfuhN8/tHktCzxJl1om9sSCz0preYAsmAwMV90sf9zkNWHHn1e1/ w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723460" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723460" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588447" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 15/23] kvm: x86: Add XCR0 support for Intel AMX Date: Fri, 17 Dec 2021 07:29:55 -0800 Message-Id: <20211217153003.1719189-16-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Two XCR0 bits are defined for AMX to support XSAVE mechanism. Bit 17 is for tilecfg and bit 18 is for tiledata. The value of XCR0[17:18] is always either 00b or 11b. Also, SDM recommends that only 64-bit operating systems enable Intel AMX by setting XCR0[18:17]. If a 32-bit guest tries to set dynamic bits, it fails to pass vcpu->arch.guest_supported_xcr0 check and gets a #GP. Signed-off-by: Yang Zhong Signed-off-by: Jing Liu --- arch/x86/kvm/x86.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 15b093c58b3d..f8bacf18e6ed 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -210,7 +210,7 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; #define KVM_SUPPORTED_XCR0 (XFEATURE_MASK_FP | XFEATURE_MASK_SSE \ | XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \ | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ - | XFEATURE_MASK_PKRU) + | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -989,6 +989,12 @@ static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr) if ((xcr0 & XFEATURE_MASK_AVX512) != XFEATURE_MASK_AVX512) return 1; } + +#ifdef CONFIG_X86_64 + if ((xcr0 & XFEATURE_MASK_XTILE) && + ((xcr0 & XFEATURE_MASK_XTILE) != XFEATURE_MASK_XTILE)) + return 1; +#endif vcpu->arch.xcr0 = xcr0; if ((xcr0 ^ old_xcr0) & XFEATURE_MASK_EXTEND) From patchwork Fri Dec 17 15:29:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685127 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE1EEC433FE for ; Fri, 17 Dec 2021 15:30:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238568AbhLQPay (ORCPT ); Fri, 17 Dec 2021 10:30:54 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238216AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XZHG4A5w+AIKSL6AXtrm1ex1QPx8dDb0x+PzomwZpmE=; b=PWeZQPPD2MwKf2fywP0rDX1X79szQcy00B72W5w5HlCaEeoCDRyuB7vF ne6uK2ixhqI+jEzKLuNYSGlXQR4gjV/3JSaHWIGywdvFJR2x3pjIqrjow SLW+qDCoCYGj64l11Kn/OW36d53EdS9I/PodC4dOdr36yGECRcsksusCK fGnbZEUn0fXUoBFKwEnZYLzkbm2syn6t2iJOgM2ZTJAYA3sZGQ3ZSXeo5 Z0Ox7ZcNQq6ua0Pe0adCiQpitGuq7uuInfZC72GR9C1sy06HaHzIaBZDa /uuIFvcpinHDRQvIiuNV8dGhhXa5yDavxd4AYCaaMPuJe5tPe8lKvXsRP w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723462" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723462" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588455" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:05 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 16/23] kvm: x86: Add CPUID support for Intel AMX Date: Fri, 17 Dec 2021 07:29:56 -0800 Message-Id: <20211217153003.1719189-17-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Extend CPUID emulation to support XFD, AMX_TILE, AMX_INT8 and AMX_BF16. Adding those bits into kvm_cpu_caps finally activates all previous logics in this series. Signed-off-by: Yang Zhong Signed-off-by: Jing Liu --- arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/kvm/cpuid.c | 25 +++++++++++++++++++++++-- 2 files changed, 25 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index d5b5f2ab87a0..da872b6f8d8b 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -299,7 +299,9 @@ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */ +#define X86_FEATURE_AMX_BF16 (18*32+22) /* AMX bf16 Support */ #define X86_FEATURE_AMX_TILE (18*32+24) /* AMX tile Support */ +#define X86_FEATURE_AMX_INT8 (18*32+25) /* AMX int8 Support */ /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ #define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */ diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index eb5a5070accb..1d694ead374e 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -515,7 +515,8 @@ void kvm_set_cpu_caps(void) F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) | F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | - F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) + F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) | + F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) ); /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */ @@ -534,7 +535,7 @@ void kvm_set_cpu_caps(void) ); kvm_cpu_cap_mask(CPUID_D_1_EAX, - F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | F(XSAVES) + F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | F(XSAVES) | F(XFD) ); kvm_cpu_cap_init_scattered(CPUID_12_EAX, @@ -660,6 +661,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array, case 0x14: case 0x17: case 0x18: + case 0x1d: + case 0x1e: case 0x1f: case 0x8000001d: entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX; @@ -932,6 +935,24 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) goto out; } break; + /* Intel AMX TILE */ + case 0x1d: + if (!kvm_cpu_cap_has(X86_FEATURE_AMX_TILE)) { + entry->eax = entry->ebx = entry->ecx = entry->edx = 0; + break; + } + + for (i = 1, max_idx = entry->eax; i <= max_idx; ++i) { + if (!do_host_cpuid(array, function, i)) + goto out; + } + break; + case 0x1e: /* TMUL information */ + if (!kvm_cpu_cap_has(X86_FEATURE_AMX_TILE)) { + entry->eax = entry->ebx = entry->ecx = entry->edx = 0; + break; + } + break; case KVM_CPUID_SIGNATURE: { const u32 *sigptr = (const u32 *)KVM_SIGNATURE; entry->eax = KVM_CPUID_FEATURES; From patchwork Fri Dec 17 15:29:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685113 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A743C43217 for ; Fri, 17 Dec 2021 15:30:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238329AbhLQPaP (ORCPT ); Fri, 17 Dec 2021 10:30:15 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238221AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zAEHvRs3p4TZZV08wBtJSf3RNrbNkqEnul3jvvQ3jcg=; b=nQQu9xp0NsaDjPNRjgg8p+Ia+RaqF14R5NU7DECAVA22jKaMsbpBPY4Q JLF3mcU1xdFne2SwQLq5scKNurmJAD3708JOOBEnHhGaPjVF1ajfV48/n dkulbfzi3LgLcQRQlshhjc3qYLzU5+tOXhIrl3fF1Pxi5k5oRnDSlWZL9 GgpmdAQgAp11GwtO3xEJHg244cSOXyD7Dqq/7reCICr3E1wGjeYUZLdHj ghwDLbYxvHhG7pClJMCHuy/GLTaGiFV7cGD5Exe0SMMXl918D0eOBZhuQ KydsiFIaarLo66icJhwoGI05+Vhj2nlRgo8l3aCVhGFBN8K4Qn8k9grVq A==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723464" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723464" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588461" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 17/23] x86/fpu: add uabi_size to guest_fpu Date: Fri, 17 Dec 2021 07:29:57 -0800 Message-Id: <20211217153003.1719189-18-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Thomas Gleixner Userspace needs to inquire KVM about the buffer size to work with the new KVM_SET_XSAVE and KVM_GET_XSAVE2. Add the size info to guest_fpu for KVM to access. Signed-off-by: Thomas Gleixner Signed-off-by: Wei Wang Signed-off-by: Jing Liu --- arch/x86/include/asm/fpu/types.h | 5 +++++ arch/x86/kernel/fpu/core.c | 1 + arch/x86/kernel/fpu/xstate.c | 1 + 3 files changed, 7 insertions(+) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index 3795d0573773..eb7cd1139d97 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -522,6 +522,11 @@ struct fpu_guest { */ u64 xfd_err; + /* + * @uabi_size: Size required for save/restore + */ + unsigned int uabi_size; + /* * @fpstate: Pointer to the allocated guest fpstate */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 3daea097c618..89d679cc819b 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -240,6 +240,7 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) gfpu->fpstate = fpstate; gfpu->xfeatures = fpu_user_cfg.default_features; gfpu->perm = fpu_user_cfg.default_features; + gfpu->uabi_size = fpu_user_cfg.default_size; fpu_init_guest_permissions(gfpu); return true; diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index b0925051b0b6..4cb1f69c50c6 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1545,6 +1545,7 @@ static int fpstate_realloc(u64 xfeatures, unsigned int ksize, newfps->is_confidential = curfps->is_confidential; newfps->in_use = curfps->in_use; guest_fpu->xfeatures |= xfeatures; + guest_fpu->uabi_size = usize; } fpregs_lock(); From patchwork Fri Dec 17 15:29:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685107 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D94DDC433EF for ; Fri, 17 Dec 2021 15:30:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238360AbhLQPaR (ORCPT ); Fri, 17 Dec 2021 10:30:17 -0500 Received: from mga03.intel.com ([134.134.136.65]:10852 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238243AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Z0qw6k02GCrK+dqLpnYIbrq3jMZsIMBndSKn0eXyQqs=; b=benw4jgxs3MT72ba+lBigx3D3bmaajHObV2wR0McO+ADNVI3ZErnDNZx P2fqpxODpJrKCSP7ILY4BRVRinDMHaLxe1s2i5JYyMYnMtLN5AC7knr6i 6PZGc6i4Ae3YMhWOi5j5tzTUtrntA5JCbJuXe2RkF3PvoeXF+6S+IG0ea 5tpLb/NoMZN38IfhBmZly8Zkput1GX3jsCMp5ds839jvwTIT9/QB2k2Dz B813TiGn3JcTRPqquOXEA/EY5O2noWAun+st5qMGw+5JbOlEcqZ+oc9zq +YaLxgeuY87D404Tj6aGQ+j3ixBVWsri+12Jr0D0iTXXdvnmi34r3i0E/ g==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723466" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723466" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588464" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 18/23] kvm: x86: Get/set expanded xstate buffer Date: Fri, 17 Dec 2021 07:29:58 -0800 Message-Id: <20211217153003.1719189-19-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Guang Zeng When AMX is enabled it requires a larger xstate buffer than the legacy hardcoded 4KB one. Exising kvm ioctls (KVM_[G|S]ET_XSAVE under KVM_CAP_XSAVE) are not suitable for this purpose. A new capability (KVM_CAP_XSAVE2) is introduced to mark an extended kvm_xsave format for supporting >4KB fpstate. The expanded fpstate size is returned to userspace when KVM_CAP_XSAVE2 is queried. Introduce KVM_GET_XSAVE2 under this capability with the new format for copying kernel fpstate to userspace. Reuse KVM_SET_XSAVE for both old/new formats by reimplementing it to do properly-sized memdup_user() based on guest fpstate. Signed-off-by: Guang Zeng Signed-off-by: Wei Wang Signed-off-by: Jing Liu --- arch/x86/include/uapi/asm/kvm.h | 12 ++++++++++- arch/x86/kvm/x86.c | 37 ++++++++++++++++++++++++++++++++- include/uapi/linux/kvm.h | 4 ++++ 3 files changed, 51 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 5a776a08f78c..240e17829e89 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -373,9 +373,19 @@ struct kvm_debugregs { __u64 reserved[9]; }; -/* for KVM_CAP_XSAVE */ +/* for KVM_CAP_XSAVE and KVM_CAP_XSAVE2 */ struct kvm_xsave { + /* + * KVM_GET_XSAVE only uses the first 4096 bytes. + * + * KVM_GET_XSAVE2 must have the size match what is returned by + * KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). + * + * KVM_SET_XSAVE uses the extra field if guest_fpu::fpstate::size + * exceeds 4096 bytes. + */ __u32 region[1024]; + __u32 extra[0]; }; #define KVM_MAX_XCRS 16 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f8bacf18e6ed..796a9f2d1f23 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4296,6 +4296,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) else r = 0; break; + case KVM_CAP_XSAVE2: + r = kvm->vcpus[0]->arch.guest_fpu.uabi_size; + break; default: break; } @@ -4899,6 +4902,16 @@ static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, vcpu->arch.pkru); } +static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu, + u8 *state, unsigned int size) +{ + if (fpstate_is_confidential(&vcpu->arch.guest_fpu)) + return; + + fpu_copy_guest_fpstate_to_uabi(&vcpu->arch.guest_fpu, + state, size, vcpu->arch.pkru); +} + static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu, struct kvm_xsave *guest_xsave) { @@ -5366,7 +5379,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp, break; } case KVM_SET_XSAVE: { - u.xsave = memdup_user(argp, sizeof(*u.xsave)); + int size = vcpu->arch.guest_fpu.uabi_size; + + u.xsave = memdup_user(argp, size); if (IS_ERR(u.xsave)) { r = PTR_ERR(u.xsave); goto out_nofree; @@ -5375,6 +5390,26 @@ long kvm_arch_vcpu_ioctl(struct file *filp, r = kvm_vcpu_ioctl_x86_set_xsave(vcpu, u.xsave); break; } + + case KVM_GET_XSAVE2: { + int size = vcpu->arch.guest_fpu.uabi_size; + + u.xsave = kzalloc(size, GFP_KERNEL_ACCOUNT); + if (!u.xsave) { + r = -ENOMEM; + break; + } + + kvm_vcpu_ioctl_x86_get_xsave2(vcpu, u.buffer, size); + + if (copy_to_user(argp, u.xsave, size)) { + r = -EFAULT; + break; + } + r = 0; + break; + } + case KVM_GET_XCRS: { u.xcrs = kzalloc(sizeof(struct kvm_xcrs), GFP_KERNEL_ACCOUNT); r = -ENOMEM; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 1daa45268de2..9d1c01669560 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1131,6 +1131,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204 #define KVM_CAP_ARM_MTE 205 #define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206 +#define KVM_CAP_XSAVE2 207 #ifdef KVM_CAP_IRQ_ROUTING @@ -1610,6 +1611,9 @@ struct kvm_enc_region { #define KVM_S390_NORMAL_RESET _IO(KVMIO, 0xc3) #define KVM_S390_CLEAR_RESET _IO(KVMIO, 0xc4) +/* Available with KVM_CAP_XSAVE2 */ +#define KVM_GET_XSAVE2 _IOR(KVMIO, 0xcf, struct kvm_xsave) + struct kvm_s390_pv_sec_parm { __u64 origin; __u64 length; From patchwork Fri Dec 17 15:29:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685111 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EEFBC433FE for ; Fri, 17 Dec 2021 15:30:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238338AbhLQPaQ (ORCPT ); Fri, 17 Dec 2021 10:30:16 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238248AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xPVjMhggzr8nWrVWkFh3gagm3PUAtIAJrHO2aKIdNkc=; b=LmyceBkhd5XmPCnzNvDnqvHnf/sCacxjsudjP0OlecHNcU8H3Li9SohE OT4lHeVSabmxU1OBjPlkbVmLY6ZNJfcUyTCeqqeFxUNXLNvWD8O2C/RWu h4YQQyMrId1IM/Zvt6SxYhLhbwXWVQCXc+/OqdN5iT63zEe4vUVhBUbNR gkIKqEjixSpN2JDHvYedhMiSpWvQnMLtHg4CuVJBCsPyOlpCQQ2BGB4nB cxt0xtYqRb2ylXaPnaVE/ICk3kexfAL3+U7TQEQ49Hzs9mGS9mHrk5NS/ pxOOoqMt3xq6cZbxCujRoUr0hxjgeixiLi+/q6HuD/bjHsW9W+vzgkFpw A==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723470" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723470" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588468" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 19/23] kvm: selftests: Add support for KVM_CAP_XSAVE2 Date: Fri, 17 Dec 2021 07:29:59 -0800 Message-Id: <20211217153003.1719189-20-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wei Wang When KVM_CAP_XSAVE2 is supported, userspace is expected to allocate buffer for KVM_GET_XSAVE2 and KVM_SET_XSAVE using the size returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). Signed-off-by: Wei Wang Signed-off-by: Guang Zeng Signed-off-by: Jing Liu --- tools/arch/x86/include/uapi/asm/kvm.h | 12 ++++- tools/include/uapi/linux/kvm.h | 3 ++ .../testing/selftests/kvm/include/kvm_util.h | 2 + .../selftests/kvm/include/x86_64/processor.h | 12 +++++ tools/testing/selftests/kvm/lib/kvm_util.c | 31 +++++++++++++ .../selftests/kvm/lib/x86_64/processor.c | 45 ++++++++++++++++++- .../testing/selftests/kvm/x86_64/evmcs_test.c | 2 +- tools/testing/selftests/kvm/x86_64/smm_test.c | 2 +- .../testing/selftests/kvm/x86_64/state_test.c | 2 +- .../kvm/x86_64/vmx_preemption_timer_test.c | 2 +- 10 files changed, 106 insertions(+), 7 deletions(-) diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 5a776a08f78c..240e17829e89 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -373,9 +373,19 @@ struct kvm_debugregs { __u64 reserved[9]; }; -/* for KVM_CAP_XSAVE */ +/* for KVM_CAP_XSAVE and KVM_CAP_XSAVE2 */ struct kvm_xsave { + /* + * KVM_GET_XSAVE only uses the first 4096 bytes. + * + * KVM_GET_XSAVE2 must have the size match what is returned by + * KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). + * + * KVM_SET_XSAVE uses the extra field if guest_fpu::fpstate::size + * exceeds 4096 bytes. + */ __u32 region[1024]; + __u32 extra[0]; }; #define KVM_MAX_XCRS 16 diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h index 1daa45268de2..f066637ee206 100644 --- a/tools/include/uapi/linux/kvm.h +++ b/tools/include/uapi/linux/kvm.h @@ -1131,6 +1131,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204 #define KVM_CAP_ARM_MTE 205 #define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206 +#define KVM_CAP_XSAVE2 207 #ifdef KVM_CAP_IRQ_ROUTING @@ -1551,6 +1552,8 @@ struct kvm_s390_ucas_mapping { /* Available with KVM_CAP_XSAVE */ #define KVM_GET_XSAVE _IOR(KVMIO, 0xa4, struct kvm_xsave) #define KVM_SET_XSAVE _IOW(KVMIO, 0xa5, struct kvm_xsave) +/* Available with KVM_CAP_XSAVE2 */ +#define KVM_GET_XSAVE2 _IOR(KVMIO, 0xcf, struct kvm_xsave) /* Available with KVM_CAP_XCRS */ #define KVM_GET_XCRS _IOR(KVMIO, 0xa6, struct kvm_xcrs) #define KVM_SET_XCRS _IOW(KVMIO, 0xa7, struct kvm_xcrs) diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 6a1a37f30494..5d6cbafcb412 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -85,6 +85,7 @@ extern const struct vm_guest_mode_params vm_guest_mode_params[]; int open_path_or_exit(const char *path, int flags); int open_kvm_dev_path_or_exit(void); int kvm_check_cap(long cap); +int vm_check_cap(struct kvm_vm *vm, long cap); int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap); int vcpu_enable_cap(struct kvm_vm *vm, uint32_t vcpu_id, struct kvm_enable_cap *cap); @@ -316,6 +317,7 @@ struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus, * guest_code - The vCPU's entry point */ void vm_vcpu_add_default(struct kvm_vm *vm, uint32_t vcpuid, void *guest_code); +void vm_xsave_req_perm(void); bool vm_is_unrestricted_guest(struct kvm_vm *vm); diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index 05e65ca1c30c..0546173ab628 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -10,8 +10,10 @@ #include #include +#include #include +#include #include "../kvm_util.h" @@ -352,6 +354,7 @@ struct kvm_x86_state; struct kvm_x86_state *vcpu_save_state(struct kvm_vm *vm, uint32_t vcpuid); void vcpu_load_state(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_x86_state *state); +void kvm_x86_state_cleanup(struct kvm_x86_state *state); struct kvm_msr_list *kvm_get_msr_index_list(void); uint64_t kvm_get_feature_msr(uint64_t msr_index); @@ -443,4 +446,13 @@ void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, /* VMX_EPT_VPID_CAP bits */ #define VMX_EPT_VPID_CAP_AD_BITS (1ULL << 21) +#define XSTATE_LEGACY_SIZE 4096 + +#define XSTATE_XTILE_CFG_BIT 17 +#define XSTATE_XTILE_DATA_BIT 18 + +#define XSTATE_XTILE_CFG_MASK (1ULL << XSTATE_XTILE_CFG_BIT) +#define XSTATE_XTILE_DATA_MASK (1ULL << XSTATE_XTILE_DATA_BIT) +#define XFEATURE_XTILE_MASK (XSTATE_XTILE_CFG_MASK | \ + XSTATE_XTILE_DATA_MASK) #endif /* SELFTEST_KVM_PROCESSOR_H */ diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 8f2e0bb1ef96..dc4ae30b7caf 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -85,6 +85,33 @@ int kvm_check_cap(long cap) return ret; } +/* VM Check Capability + * + * Input Args: + * vm - Virtual Machine + * cap - Capability + * + * Output Args: None + * + * Return: + * On success, the Value corresponding to the capability (KVM_CAP_*) + * specified by the value of cap. On failure a TEST_ASSERT failure + * is produced. + * + * Looks up and returns the value corresponding to the capability + * (KVM_CAP_*) given by cap. + */ +int vm_check_cap(struct kvm_vm *vm, long cap) +{ + int ret; + + ret = ioctl(vm->fd, KVM_CHECK_EXTENSION, cap); + TEST_ASSERT(ret >= 0, "KVM_CHECK_EXTENSION VM IOCTL failed,\n" + " rc: %i errno: %i", ret, errno); + + return ret; +} + /* VM Enable Capability * * Input Args: @@ -370,6 +397,10 @@ struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus, #ifdef __x86_64__ vm_create_irqchip(vm); #endif + /* + * Permission needs to be requested before KVM_SET_CPUID2. + */ + vm_xsave_req_perm(); for (i = 0; i < nr_vcpus; ++i) { uint32_t vcpuid = vcpuids ? vcpuids[i] : i; diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 82c39db91369..00324d73c687 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -650,6 +650,25 @@ static void vcpu_setup(struct kvm_vm *vm, int vcpuid) vcpu_sregs_set(vm, vcpuid, &sregs); } +void vm_xsave_req_perm(void) +{ + unsigned long bitmask; + long rc = syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM, + XSTATE_XTILE_DATA_BIT); + /* + * The older kernel version(<5.15) can't support + * ARCH_REQ_XCOMP_GUEST_PERM and directly return. + */ + if (rc) + return; + + rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_GUEST_PERM, &bitmask); + TEST_ASSERT(rc == 0, "prctl(ARCH_GET_XCOMP_GUEST_PERM) error: %ld", rc); + TEST_ASSERT(bitmask & XFEATURE_XTILE_MASK, + "prctl(ARCH_REQ_XCOMP_GUEST_PERM) failure bitmask=0x%lx", + bitmask); +} + void vm_vcpu_add_default(struct kvm_vm *vm, uint32_t vcpuid, void *guest_code) { struct kvm_mp_state mp_state; @@ -1018,10 +1037,10 @@ void vcpu_dump(FILE *stream, struct kvm_vm *vm, uint32_t vcpuid, uint8_t indent) } struct kvm_x86_state { + struct kvm_xsave *xsave; struct kvm_vcpu_events events; struct kvm_mp_state mp_state; struct kvm_regs regs; - struct kvm_xsave xsave; struct kvm_xcrs xcrs; struct kvm_sregs sregs; struct kvm_debugregs debugregs; @@ -1069,6 +1088,22 @@ struct kvm_msr_list *kvm_get_msr_index_list(void) return list; } +static int vcpu_save_xsave_state(struct kvm_vm *vm, struct vcpu *vcpu, + struct kvm_x86_state *state) +{ + int size; + + size = vm_check_cap(vm, KVM_CAP_XSAVE2); + if (!size) + size = XSTATE_LEGACY_SIZE; + + state->xsave = malloc(size); + if (size == XSTATE_LEGACY_SIZE) + return ioctl(vcpu->fd, KVM_GET_XSAVE, state->xsave); + else + return ioctl(vcpu->fd, KVM_GET_XSAVE2, state->xsave); +} + struct kvm_x86_state *vcpu_save_state(struct kvm_vm *vm, uint32_t vcpuid) { struct vcpu *vcpu = vcpu_find(vm, vcpuid); @@ -1112,7 +1147,7 @@ struct kvm_x86_state *vcpu_save_state(struct kvm_vm *vm, uint32_t vcpuid) TEST_ASSERT(r == 0, "Unexpected result from KVM_GET_REGS, r: %i", r); - r = ioctl(vcpu->fd, KVM_GET_XSAVE, &state->xsave); + r = vcpu_save_xsave_state(vm, vcpu, state); TEST_ASSERT(r == 0, "Unexpected result from KVM_GET_XSAVE, r: %i", r); @@ -1198,6 +1233,12 @@ void vcpu_load_state(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_x86_state *s } } +void kvm_x86_state_cleanup(struct kvm_x86_state *state) +{ + free(state->xsave); + free(state); +} + bool is_intel_cpu(void) { int eax, ebx, ecx, edx; diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c index 2b46dcca86a8..4c7841dfd481 100644 --- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c +++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c @@ -129,7 +129,7 @@ static void save_restore_vm(struct kvm_vm *vm) vcpu_set_hv_cpuid(vm, VCPU_ID); vcpu_enable_evmcs(vm, VCPU_ID); vcpu_load_state(vm, VCPU_ID, state); - free(state); + kvm_x86_state_cleanup(state); memset(®s2, 0, sizeof(regs2)); vcpu_regs_get(vm, VCPU_ID, ®s2); diff --git a/tools/testing/selftests/kvm/x86_64/smm_test.c b/tools/testing/selftests/kvm/x86_64/smm_test.c index d0fe2fdce58c..2da8eb8e2d96 100644 --- a/tools/testing/selftests/kvm/x86_64/smm_test.c +++ b/tools/testing/selftests/kvm/x86_64/smm_test.c @@ -212,7 +212,7 @@ int main(int argc, char *argv[]) vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid()); vcpu_load_state(vm, VCPU_ID, state); run = vcpu_state(vm, VCPU_ID); - free(state); + kvm_x86_state_cleanup(state); } done: diff --git a/tools/testing/selftests/kvm/x86_64/state_test.c b/tools/testing/selftests/kvm/x86_64/state_test.c index 32854c1462ad..2e0a92da8ff5 100644 --- a/tools/testing/selftests/kvm/x86_64/state_test.c +++ b/tools/testing/selftests/kvm/x86_64/state_test.c @@ -218,7 +218,7 @@ int main(int argc, char *argv[]) vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid()); vcpu_load_state(vm, VCPU_ID, state); run = vcpu_state(vm, VCPU_ID); - free(state); + kvm_x86_state_cleanup(state); memset(®s2, 0, sizeof(regs2)); vcpu_regs_get(vm, VCPU_ID, ®s2); diff --git a/tools/testing/selftests/kvm/x86_64/vmx_preemption_timer_test.c b/tools/testing/selftests/kvm/x86_64/vmx_preemption_timer_test.c index a07480aed397..ff92e25b6f1e 100644 --- a/tools/testing/selftests/kvm/x86_64/vmx_preemption_timer_test.c +++ b/tools/testing/selftests/kvm/x86_64/vmx_preemption_timer_test.c @@ -244,7 +244,7 @@ int main(int argc, char *argv[]) vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid()); vcpu_load_state(vm, VCPU_ID, state); run = vcpu_state(vm, VCPU_ID); - free(state); + kvm_x86_state_cleanup(state); memset(®s2, 0, sizeof(regs2)); vcpu_regs_get(vm, VCPU_ID, ®s2); From patchwork Fri Dec 17 15:30:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685105 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E29D9C433F5 for ; Fri, 17 Dec 2021 15:30:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238347AbhLQPaQ (ORCPT ); Fri, 17 Dec 2021 10:30:16 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238247AbhLQPaI (ORCPT ); Fri, 17 Dec 2021 10:30:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755008; x=1671291008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+lLK4UqLTW8dKnWDYMWJ8zGKlRHYspG68U7ZipxVqg4=; b=XzH/fg/Ze3ElHRJ4XqcP1xwtCyuQvg/afNSSFf8wunnO1hTTlYCXxlmA GZBjlrtzVOgA7BDyLboncL3vpcz71c3XhyCRmEZvI1mrIvjwKvsTMMfvN X5hZcVTxXBxSdoW3n/8H8FVIOxS8zyNomTRyTPXNs5rFs5r2DbeGsErk1 uf/oFeBm5dsNBmQp9a/FphLW1J869Um76GzI/D7uvVUA9Js4z3fUCLJZE h4AbWxoaAYxv7nVpk0c2IB/ROeYVNbDgl5Fwu+A2yNGhUc/QkrLaIx52m K0ixyf89dzrmxL3v6TOSQVkJEcWqKhdniy3ftHL0q2hfsH4fSpxDT2leI Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723474" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723474" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588472" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 20/23] docs: kvm: Add KVM_GET_XSAVE2 Date: Fri, 17 Dec 2021 07:30:00 -0800 Message-Id: <20211217153003.1719189-21-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wei Wang Update the api doc with the new KVM_GET_XSAVE2 ioctl, which is used when KVM_CAP_XSAVE2 is negotiated with the userspace. KVM_SET_XSAVE ioctl is re-used when KVM_CAP_XSAVE2 is used. The kvm_xsave struct is updated to support data size larger that the legacy hardcoded 4KB. Signed-off-by: Wei Wang Signed-off-by: Zeng Guang Signed-off-by: Jing Liu --- Documentation/virt/kvm/api.rst | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index eb5671ca2dba..0f4ed2d4aea6 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1566,15 +1566,18 @@ otherwise it will return EBUSY error. struct kvm_xsave { __u32 region[1024]; + __u32 extra[0]; }; This ioctl would copy current vcpu's xsave struct to the userspace. +Application should use KVM_GET_XSAVE2 if xsave states are larger than +4KB. 4.43 KVM_SET_XSAVE ------------------ -:Capability: KVM_CAP_XSAVE +:Capability: KVM_CAP_XSAVE and KVM_CAP_XSAVE2 :Architectures: x86 :Type: vcpu ioctl :Parameters: struct kvm_xsave (in) @@ -1585,9 +1588,12 @@ This ioctl would copy current vcpu's xsave struct to the userspace. struct kvm_xsave { __u32 region[1024]; + __u32 extra[0]; }; This ioctl would copy userspace's xsave struct to the kernel. +Application can use this ioctl for xstate buffer in any size +returned from KVM_CHECK_EXTENSION(KVM_CAP_XSAV2). 4.44 KVM_GET_XCRS @@ -5507,6 +5513,27 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header. The Stats Data block contains an array of 64-bit values in the same order as the descriptors in Descriptors block. +4.42 KVM_GET_XSAVE2 +------------------ + +:Capability: KVM_CAP_XSAVE2 +:Architectures: x86 +:Type: vcpu ioctl +:Parameters: struct kvm_xsave (out) +:Returns: 0 on success, -1 on error + + +:: + + struct kvm_xsave { + __u32 region[1024]; + __u32 extra[0]; + }; + +This ioctl would copy current vcpu's xsave struct to the userspace. +Application can use this ioctl for xstate buffer in any size +returned from KVM_CHECK_EXTENSION(KVM_CAP_XSAV2). + 5. The kvm_run structure ======================== From patchwork Fri Dec 17 15:30:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685121 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 416D3C433EF for ; Fri, 17 Dec 2021 15:30:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238443AbhLQPap (ORCPT ); Fri, 17 Dec 2021 10:30:45 -0500 Received: from mga03.intel.com ([134.134.136.65]:10852 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238253AbhLQPaJ (ORCPT ); Fri, 17 Dec 2021 10:30:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755009; x=1671291009; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ta2Eqb4PNg89gDFRCNYN06qBp8rwWdB7g3PLouU0Wug=; b=IM1QmYl2HaqkEN0KsErwTqhexgI+LcufLGpT5C75uIckqm3DdxAiIVQ3 eaYV0WJouCCyYk1fFYg6IHMktiC0IJ06DXkoj+Cy8ToH9YFKSFeNBBbtT y6+2+8o4acFxo0ctkGHTUNzWKRq8PmfhH3Gf3E3Hwk2HeH0rXv4M41YNj QMDN+1xtvpYX8eKDSiW1P86iZmc9kdhodV2cQYI9f2afCXXZJmTEkajAF BGMQ/a1y67QqpAnpmX/QP4Pn5JdC+EmYGglp7SKNnkgGAyx8vsYoInklS AYobg6SZsroaWSlZp/TMySYdyi63lVuNWUFsAhy+5APUe7kA1VibeFbtf Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723476" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723476" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588475" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 21/23] x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() Date: Fri, 17 Dec 2021 07:30:01 -0800 Message-Id: <20211217153003.1719189-22-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Thomas Gleixner KVM can disable the write emulation for the XFD MSR when the vCPU's fpstate is already correctly sized to reduce the overhead. When write emulation is disabled the XFD MSR state after a VMEXIT is unknown and therefore not in sync with the software states in fpstate and the per CPU XFD cache. Provide fpu_sync_guest_vmexit_xfd_state() which has to be invoked after a VMEXIT before enabling interrupts when write emulation is disabled for the XFD MSR. It could be invoked unconditionally even when write emulation is enabled for the price of a pointless MSR read. Signed-off-by: Thomas Gleixner Signed-off-by: Jing Liu --- arch/x86/include/asm/fpu/api.h | 2 ++ arch/x86/kernel/fpu/core.c | 24 ++++++++++++++++++++++++ 2 files changed, 26 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 100b535cd3de..dc51809f0e4a 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -142,8 +142,10 @@ extern int fpu_update_guest_perm_features(struct fpu_guest *guest_fpu); #ifdef CONFIG_X86_64 extern void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd); +extern void fpu_sync_guest_vmexit_xfd_state(void); #else static inline void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd) { } +static inline void fpu_sync_guest_vmexit_xfd_state(void) { } #endif extern void fpu_copy_guest_fpstate_to_uabi(struct fpu_guest *gfpu, void *buf, unsigned int size, u32 pkru); diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 89d679cc819b..42a195a6b2ce 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -299,6 +299,30 @@ void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd) fpregs_unlock(); } EXPORT_SYMBOL_GPL(fpu_update_guest_xfd); + +/** + * fpu_sync_guest_vmexit_xfd_state - Synchronize XFD MSR and software state + * + * Must be invoked from KVM after a VMEXIT before enabling interrupts when + * XFD write emulation is disabled. This is required because the guest can + * freely modify XFD and the state at VMEXIT is not guaranteed to be the + * same as the state on VMENTER. So software state has to be udpated before + * any operation which depends on it can take place. + * + * Note: It can be invoked unconditionally even when write emulation is + * enabled for the price of a then pointless MSR read. + */ +void fpu_sync_guest_vmexit_xfd_state(void) +{ + struct fpstate *fps = current->thread.fpu.fpstate; + + lockdep_assert_irqs_disabled(); + if (fpu_state_size_dynamic()) { + rdmsrl(MSR_IA32_XFD, fps->xfd); + __this_cpu_write(xfd_state, fps->xfd); + } +} +EXPORT_SYMBOL_GPL(fpu_sync_guest_vmexit_xfd_state); #endif /* CONFIG_X86_64 */ int fpu_swap_kvm_fpstate(struct fpu_guest *guest_fpu, bool enter_guest) From patchwork Fri Dec 17 15:30:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685125 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BAA7C433FE for ; Fri, 17 Dec 2021 15:30:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238535AbhLQPat (ORCPT ); Fri, 17 Dec 2021 10:30:49 -0500 Received: from mga03.intel.com ([134.134.136.65]:10848 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238255AbhLQPaJ (ORCPT ); Fri, 17 Dec 2021 10:30:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755009; x=1671291009; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U3csfgTLbEXdbK8cw/YgoQNoInbEKp5oorFH3mqVag0=; b=oGwZ21HUIzb6QiNHFVrFGfLUtMmpaZMtTgTTIyG1at0HfVWpNS7bclNN VZm04JyAES6hGci8JeHnH+b/SvQ4xH/SDwTYfZXvUcbCDKVeGZZqU2sBq DR7ZZHQjBml/+C6Dqg+T/LuF8snznbt71tGQn2mMYhgIJykC/0AjE13F4 S4hh21zKi3k9/3V47YO5Os+Nx3V0iYfRBXxV99mEs238FVtX0QpBINVES fTgKTRdrA6k4jWqOQaiSkfGkvwyoDHnJWrd6miWwugLV5GLndd1YiWx7A z3Zsr4p7DQ+ct4LG8yTYD3pPDL+YknrZAZuWWqvZzlsIcI+VB48vxt6SZ w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723478" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723478" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588478" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 22/23] kvm: x86: Disable interception for IA32_XFD on demand Date: Fri, 17 Dec 2021 07:30:02 -0800 Message-Id: <20211217153003.1719189-23-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Always intercepting IA32_XFD causes non-negligible overhead when this register is updated frequently in the guest. Disable r/w emulation after intercepting the first WRMSR(IA32_XFD) with a non-zero value. Disable WRMSR emulation implies that IA32_XFD becomes out-of-sync with the software states in fpstate and the per-cpu xfd cache. Call fpu_sync_guest_vmexit_xfd_state() to bring them back in-sync, before preemption is enabled. p.s. We have confirmed that SDM is being revised to say that when setting IA32_XFD[18] the AMX register state is not guaranteed to be preserved. This clarification avoids adding mess for a creative guest which sets IA32_XFD[18]=1 before saving active AMX state to its own storage. Signed-off-by: Kevin Tian Signed-off-by: Yang Zhong Signed-off-by: Jing Liu --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/vmx/vmx.c | 10 ++++++++++ arch/x86/kvm/vmx/vmx.h | 2 +- arch/x86/kvm/x86.c | 6 ++++++ 5 files changed, 20 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index cefe1d81e2e8..60c27f9990e9 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -30,6 +30,7 @@ KVM_X86_OP(update_exception_bitmap) KVM_X86_OP(get_msr) KVM_X86_OP(set_msr) KVM_X86_OP(get_segment_base) +KVM_X86_OP_NULL(set_xfd_passthrough) KVM_X86_OP(get_segment) KVM_X86_OP(get_cpl) KVM_X86_OP(set_segment) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 860ed500580c..b513c12fa39e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -640,6 +640,7 @@ struct kvm_vcpu_arch { u64 smi_count; bool tpr_access_reporting; bool xsaves_enabled; + bool xfd_out_of_sync; u64 ia32_xss; u64 microcode_version; u64 arch_capabilities; @@ -1329,6 +1330,7 @@ struct kvm_x86_ops { void (*update_exception_bitmap)(struct kvm_vcpu *vcpu); int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr); + void (*set_xfd_passthrough)(struct kvm_vcpu *vcpu); u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg); void (*get_segment)(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 483075045253..97a823a3f23f 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -162,6 +162,7 @@ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = { MSR_FS_BASE, MSR_GS_BASE, MSR_KERNEL_GS_BASE, + MSR_IA32_XFD, #endif MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, @@ -1929,6 +1930,14 @@ static u64 vcpu_supported_debugctl(struct kvm_vcpu *vcpu) return debugctl; } +#ifdef CONFIG_X86_64 +static void vmx_set_xfd_passthrough(struct kvm_vcpu *vcpu) +{ + vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD, MSR_TYPE_RW); + vcpu->arch.xfd_out_of_sync = true; +} +#endif + /* * Writes msr value into the appropriate "register". * Returns 0 on success, non-0 otherwise. @@ -7664,6 +7673,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { #ifdef CONFIG_X86_64 .set_hv_timer = vmx_set_hv_timer, .cancel_hv_timer = vmx_cancel_hv_timer, + .set_xfd_passthrough = vmx_set_xfd_passthrough, #endif .setup_mce = vmx_setup_mce, diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 4df2ac24ffc1..bf9d3051cd6c 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -340,7 +340,7 @@ struct vcpu_vmx { struct lbr_desc lbr_desc; /* Save desired MSR intercept (read: pass-through) state */ -#define MAX_POSSIBLE_PASSTHROUGH_MSRS 13 +#define MAX_POSSIBLE_PASSTHROUGH_MSRS 14 struct { DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS); DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 796a9f2d1f23..13ba1e066b55 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3686,6 +3686,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; fpu_update_guest_xfd(&vcpu->arch.guest_fpu, data); + + if (data && kvm_x86_ops.set_xfd_passthrough) + static_call(kvm_x86_set_xfd_passthrough)(vcpu); break; case MSR_IA32_XFD_ERR: if (!msr_info->host_initiated && @@ -10023,6 +10026,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (vcpu->arch.guest_fpu.xfd_err) wrmsrl(MSR_IA32_XFD_ERR, 0); + if (vcpu->arch.xfd_out_of_sync) + fpu_sync_guest_vmexit_xfd_state(); + /* * Consume any pending interrupts, including the possible source of * VM-Exit on SVM and any ticks that occur between VM-Exit and now. From patchwork Fri Dec 17 15:30:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 12685109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98B49C4332F for ; Fri, 17 Dec 2021 15:30:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238366AbhLQPaS (ORCPT ); Fri, 17 Dec 2021 10:30:18 -0500 Received: from mga03.intel.com ([134.134.136.65]:10842 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238256AbhLQPaJ (ORCPT ); Fri, 17 Dec 2021 10:30:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1639755009; x=1671291009; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=k1pytctA/1NzjqzonA+Gg9kC3oNIv2NNsblbfpdxUOI=; b=AvU13g49JH5AG2DPYTOBoswLhjT0wufpsqikZQKMsYA9PHLPw4Cr/iK9 9C1k4XIiPMLmohLCdVZkt3tTsFFklah2zmwFn9QPLBPSXCN97RviK9YGg SHWIth4EyLXDI0dMhvwV0mMUZEZAVCAfnVTrTXwzKdlsn4Wv9m0XLuafx D2f8uQBryuGu14h8vMs1FVzs3EoJtwvRoZxj6mhyRtFi75/kYAj/emwwO zMiYal62JobVX8nuXiSzRGkxUJ7eP8tiRrTyRCBZd3Hv1BhK/URfWmanv Khzk245rrbCq4E1dokFjTzb41ZDNebl4Gl9wfJH3BpIgf+p/0C2mwe3FR w==; X-IronPort-AV: E=McAfee;i="6200,9189,10200"; a="239723481" X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="239723481" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Dec 2021 07:30:07 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,213,1635231600"; d="scan'208";a="615588481" Received: from 984fee00a228.jf.intel.com ([10.165.56.59]) by orsmga004.jf.intel.com with ESMTP; 17 Dec 2021 07:30:06 -0800 From: Jing Liu To: x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, pbonzini@redhat.com Cc: seanjc@google.com, jun.nakajima@intel.com, kevin.tian@intel.com, jing2.liu@linux.intel.com, jing2.liu@intel.com, guang.zeng@intel.com, wei.w.wang@intel.com, yang.zhong@intel.com Subject: [PATCH v2 23/23] kvm: x86: Disable RDMSR interception of IA32_XFD_ERR Date: Fri, 17 Dec 2021 07:30:03 -0800 Message-Id: <20211217153003.1719189-24-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20211217153003.1719189-1-jing2.liu@intel.com> References: <20211217153003.1719189-1-jing2.liu@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Also disable read emulation of IA32_XFD_ERR MSR at the same point where r/w emulation of IA32_XFD MSR is disabled. This saves one unnecessary VM-exit in guest #NM handler, given that the MSR is already restored with the guest value before the guest is resumed. Signed-off-by: Jing Liu --- arch/x86/kvm/vmx/vmx.c | 2 ++ arch/x86/kvm/vmx/vmx.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 97a823a3f23f..b66a005f076b 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -163,6 +163,7 @@ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = { MSR_GS_BASE, MSR_KERNEL_GS_BASE, MSR_IA32_XFD, + MSR_IA32_XFD_ERR, #endif MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, @@ -1934,6 +1935,7 @@ static u64 vcpu_supported_debugctl(struct kvm_vcpu *vcpu) static void vmx_set_xfd_passthrough(struct kvm_vcpu *vcpu) { vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD, MSR_TYPE_RW); + vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD_ERR, MSR_TYPE_R); vcpu->arch.xfd_out_of_sync = true; } #endif diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index bf9d3051cd6c..0a00242a91e7 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -340,7 +340,7 @@ struct vcpu_vmx { struct lbr_desc lbr_desc; /* Save desired MSR intercept (read: pass-through) state */ -#define MAX_POSSIBLE_PASSTHROUGH_MSRS 14 +#define MAX_POSSIBLE_PASSTHROUGH_MSRS 15 struct { DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS); DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);