From patchwork Fri Jun  3 02:50:15 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luwei Kang
X-Patchwork-Id: 9151571
From: "Kang, Luwei"
To: Andrew Cooper, Xen-devel
Cc: "Han, Huaitong", Wei Liu, "Wang, Yong Y", Jan Beulich
Date: Fri, 3 Jun 2016 02:50:15 +0000
Message-ID: <82D7661F83C1A047AF7DC287873BF1E136ABB56C@SHSMSX101.ccr.corp.intel.com>
References: <1464886081-2011-1-git-send-email-andrew.cooper3@citrix.com>
In-Reply-To: <1464886081-2011-1-git-send-email-andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH for-4.7] x86/cpuid: Calculate a guests xfeature_mask from its featureset

I have finished testing this patch and it works well. Thanks, Andrew and all.

-----Original Message-----
From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
Sent: Friday, June 3, 2016 12:48 AM
To: Xen-devel
Cc: Andrew Cooper; Jan Beulich; Wei Liu; Kang, Luwei; Han, Huaitong
Subject: [PATCH for-4.7] x86/cpuid: Calculate a guests xfeature_mask from its featureset

libxc currently performs the xstate calculation for guests, and provides the
information to Xen to be used when satisfying CPUID traps.  (There is further
work planned to improve this arrangement, but the worst a buggy toolstack can
do is make junk appear in the cpuid leaves for the guest.)

dom0 however has no policy constructed for it, and certain fields filter
straight through from hardware.

Linux queries CPUID.0xD[0].{EAX/EDX} alone to choose a setting for %xcr0,
which is a valid action to take.  However, features such as MPX and PKRU are
not supported for PV guests.  As a result, Linux, using leaked hardware
information, fails to set %xcr0 on newer Skylake hardware with PKRU support,
and crashes.

As an interim solution, dynamically calculate the correct xfeature_mask and
xstate_size to report to the guest for CPUID.0xD[0] queries.  This ensures
that domains don't see leaked hardware values, even when no cpuid policy is
provided.

Similarly, CPUID.0xD[1].{ECX/EDX} represents the applicable settings for
MSR_XSS.  Xen doesn't support any XSS states in guests, so unconditionally
clear them for HVM guests.

Reported-by: Luwei Kang
Signed-off-by: Andrew Cooper
---
CC: Jan Beulich
CC: Wei Liu
CC: Luwei Kang
CC: Huaitong Han
---
 xen/arch/x86/hvm/hvm.c       | 53 ++++++++++++++++++++++++++++++++++++++++++--
 xen/arch/x86/traps.c         | 50 ++++++++++++++++++++++++++++++++---------
 xen/arch/x86/xstate.c        |  2 +-
 xen/include/asm-x86/xstate.h | 32 +++++++++++++++++---------
 4 files changed, 114 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bb98051..72bbed5 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3362,7 +3362,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,

     switch ( input )
     {
-        unsigned int _ecx, _edx;
+        unsigned int _ebx, _ecx, _edx;

     case 0x1:
         /* Fix up VLAPIC details. */
@@ -3443,6 +3443,51 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         switch ( count )
         {
         case 0:
+        {
+            uint64_t xfeature_mask = XSTATE_FP_SSE;
+            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_AVX) )
+            {
+                xfeature_mask |= XSTATE_YMM;
+                xstate_size = MAX(xstate_size,
+                                  xstate_offsets[_XSTATE_YMM] +
+                                  xstate_sizes[_XSTATE_YMM]);
+            }
+
+            _ecx = 0;
+            hvm_cpuid(7, NULL, &_ebx, &_ecx, NULL);
+
+            if ( _ebx & cpufeat_mask(X86_FEATURE_MPX) )
+            {
+                xfeature_mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
+                xstate_size = MAX(xstate_size,
+                                  xstate_offsets[_XSTATE_BNDCSR] +
+                                  xstate_sizes[_XSTATE_BNDCSR]);
+            }
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_PKU) )
+            {
+                xfeature_mask |= XSTATE_PKRU;
+                xstate_size = MAX(xstate_size,
+                                  xstate_offsets[_XSTATE_PKRU] +
+                                  xstate_sizes[_XSTATE_PKRU]);
+            }
+
+            hvm_cpuid(0x80000001, NULL, NULL, &_ecx, NULL);
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_LWP) )
+            {
+                xfeature_mask |= XSTATE_LWP;
+                xstate_size = MAX(xstate_size,
+                                  xstate_offsets[_XSTATE_LWP] +
+                                  xstate_sizes[_XSTATE_LWP]);
+            }
+
+            *eax = (uint32_t)xfeature_mask;
+            *edx = (uint32_t)(xfeature_mask >> 32);
+            *ecx = xstate_size;
+
             /*
              * Always read CPUID[0xD,0].EBX from hardware, rather than domain
              * policy.  It varies with enabled xstate, and the correct xcr0 is
@@ -3450,6 +3495,8 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
              */
             cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
             break;
+        }
+
         case 1:
             *eax &= hvm_featureset[FEATURESET_Da1];

@@ -3463,7 +3510,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                 cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
             }
             else
-                *ebx = *ecx = *edx = 0;
+                *ebx = 0;
+
+            *ecx = *edx = 0;
             break;
         }
         break;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 5d7232d..a2688c3 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -928,7 +928,7 @@ void pv_cpuid(struct cpu_user_regs *regs)

     switch ( leaf )
     {
-        uint32_t tmp;
+        uint32_t tmp, _ecx;

     case 0x00000001:
         c &= pv_featureset[FEATURESET_1c];
@@ -1087,19 +1087,48 @@ void pv_cpuid(struct cpu_user_regs *regs)
         break;

     case XSTATE_CPUID:
-        if ( !((!is_control_domain(currd) && !is_hardware_domain(currd)
-                ? ({
-                    uint32_t ecx;
-
-                    domain_cpuid(currd, 1, 0, &tmp, &tmp, &ecx, &tmp);
-                    ecx & pv_featureset[FEATURESET_1c];
-                  })
-                : cpuid_ecx(1)) & cpufeat_mask(X86_FEATURE_XSAVE)) ||
-             subleaf >= 63 )
+
+        if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
+            domain_cpuid(currd, 1, 0, &tmp, &tmp, &_ecx, &tmp);
+        else
+            _ecx = cpuid_ecx(1);
+        _ecx &= pv_featureset[FEATURESET_1c];
+
+        if ( !(_ecx & cpufeat_mask(X86_FEATURE_XSAVE)) || subleaf >= 63 )
             goto unsupported;
         switch ( subleaf )
         {
         case 0:
+        {
+            uint64_t xfeature_mask = XSTATE_FP_SSE;
+            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_AVX) )
+            {
+                xfeature_mask |= XSTATE_YMM;
+                xstate_size = MAX(xstate_size,
+                                  xstate_offsets[_XSTATE_YMM] +
+                                  xstate_sizes[_XSTATE_YMM]);
+            }
+
+            if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
+                domain_cpuid(currd, 0x80000001, 0, &tmp, &tmp, &_ecx, &tmp);
+            else
+                _ecx = cpuid_ecx(0x80000001);
+            _ecx &= pv_featureset[FEATURESET_e1c];
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_LWP) )
+            {
+                xfeature_mask |= XSTATE_LWP;
+                xstate_size = MAX(xstate_size,
+                                  xstate_offsets[_XSTATE_LWP] +
+                                  xstate_sizes[_XSTATE_LWP]);
+            }
+
+            a = (uint32_t)xfeature_mask;
+            d = (uint32_t)(xfeature_mask >> 32);
+            c = xstate_size;
+
             /*
              * Always read CPUID.0xD[ECX=0].EBX from hardware, rather than
              * domain policy.  It varies with enabled xstate, and the correct
@@ -1108,6 +1137,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
             if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
                 cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
             break;
+        }

         case 1:
             a &= pv_featureset[FEATURESET_Da1];
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index a0cfcc2..1fd1ce8 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -24,7 +24,7 @@ static u32 __read_mostly xsave_cntxt_size;

 /* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
 u64 __read_mostly xfeature_mask;
-static unsigned int *__read_mostly xstate_offsets;
+unsigned int *__read_mostly xstate_offsets;
 unsigned int *__read_mostly xstate_sizes;
 u64 __read_mostly xstate_align;
 static unsigned int __read_mostly xstate_features;
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 4535354..51a9ed4 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -26,16 +26,27 @@
 #define XSAVE_HDR_OFFSET          FXSAVE_SIZE
 #define XSTATE_AREA_MIN_SIZE      (FXSAVE_SIZE + XSAVE_HDR_SIZE)

-#define XSTATE_FP      (1ULL << 0)
-#define XSTATE_SSE     (1ULL << 1)
-#define XSTATE_YMM     (1ULL << 2)
-#define XSTATE_BNDREGS (1ULL << 3)
-#define XSTATE_BNDCSR  (1ULL << 4)
-#define XSTATE_OPMASK  (1ULL << 5)
-#define XSTATE_ZMM     (1ULL << 6)
-#define XSTATE_HI_ZMM  (1ULL << 7)
-#define XSTATE_PKRU    (1ULL << 9)
-#define XSTATE_LWP     (1ULL << 62) /* AMD lightweight profiling */
+#define _XSTATE_FP                0
+#define XSTATE_FP                 (1ULL << _XSTATE_FP)
+#define _XSTATE_SSE               1
+#define XSTATE_SSE                (1ULL << _XSTATE_SSE)
+#define _XSTATE_YMM               2
+#define XSTATE_YMM                (1ULL << _XSTATE_YMM)
+#define _XSTATE_BNDREGS           3
+#define XSTATE_BNDREGS            (1ULL << _XSTATE_BNDREGS)
+#define _XSTATE_BNDCSR            4
+#define XSTATE_BNDCSR             (1ULL << _XSTATE_BNDCSR)
+#define _XSTATE_OPMASK            5
+#define XSTATE_OPMASK             (1ULL << _XSTATE_OPMASK)
+#define _XSTATE_ZMM               6
+#define XSTATE_ZMM                (1ULL << _XSTATE_ZMM)
+#define _XSTATE_HI_ZMM            7
+#define XSTATE_HI_ZMM             (1ULL << _XSTATE_HI_ZMM)
+#define _XSTATE_PKRU              9
+#define XSTATE_PKRU               (1ULL << _XSTATE_PKRU)
+#define _XSTATE_LWP               62
+#define XSTATE_LWP                (1ULL << _XSTATE_LWP)
+
 #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
 #define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
                         XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
@@ -51,6 +62,7 @@

 extern u64 xfeature_mask;
 extern u64 xstate_align;
+extern unsigned int *xstate_offsets;
 extern unsigned int *xstate_sizes;

 /* extended state save area */
--
2.1.4
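
For readers who want to experiment with the calculation outside of Xen, the following is a minimal standalone sketch of the idea used in the hvm_cpuid() and pv_cpuid() hunks: start from the FP/SSE baseline, OR in one xstate bit per feature the guest is allowed to see, and grow the reported size to the end of the furthest enabled component. It is not Xen code. The names `struct guest_features`, `struct xstate_leaf`, `XSTATE_BIT()`, `grow()`, `calc_xstate_leaf()` and the `demo_offsets`/`demo_sizes` tables are hypothetical, and the table values are illustrative only; in Xen, xstate_offsets[] and xstate_sizes[] are filled from hardware at boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit indices, following the _XSTATE_xxx naming introduced by the patch. */
#define _XSTATE_FP       0
#define _XSTATE_SSE      1
#define _XSTATE_YMM      2
#define _XSTATE_BNDREGS  3
#define _XSTATE_BNDCSR   4
#define _XSTATE_PKRU     9
#define _XSTATE_LWP      62

/* Hypothetical helper: turn a bit index into a mask. */
#define XSTATE_BIT(idx)  (1ULL << (idx))
#define XSTATE_FP_SSE    (XSTATE_BIT(_XSTATE_FP) | XSTATE_BIT(_XSTATE_SSE))

/* FXSAVE area (512 bytes) plus XSAVE header (64 bytes). */
#define XSTATE_AREA_MIN_SIZE (512u + 64u)

/* Hypothetical stand-in for the guest's filtered CPUID feature bits. */
struct guest_features { bool avx, mpx, pku, lwp; };

/* The three registers of CPUID.0xD[0] that the patch synthesises. */
struct xstate_leaf { uint32_t eax, ecx, edx; };

/* Illustrative component layout only; not read from real hardware. */
static const unsigned int demo_offsets[64] = {
    [_XSTATE_YMM] = 576, [_XSTATE_BNDREGS] = 960, [_XSTATE_BNDCSR] = 1024,
    [_XSTATE_PKRU] = 2688, [_XSTATE_LWP] = 832,
};
static const unsigned int demo_sizes[64] = {
    [_XSTATE_YMM] = 256, [_XSTATE_BNDREGS] = 64, [_XSTATE_BNDCSR] = 64,
    [_XSTATE_PKRU] = 8, [_XSTATE_LWP] = 128,
};

/* Return the larger of 'size' and the end offset of component 'idx'. */
static uint32_t grow(uint32_t size, unsigned int idx)
{
    assert(idx < 64);
    uint32_t end = demo_offsets[idx] + demo_sizes[idx];
    return end > size ? end : size;
}

/* Build the mask and size from the features the guest may use. */
static struct xstate_leaf calc_xstate_leaf(struct guest_features f)
{
    uint64_t mask = XSTATE_FP_SSE;
    uint32_t size = XSTATE_AREA_MIN_SIZE;

    if (f.avx) {
        mask |= XSTATE_BIT(_XSTATE_YMM);
        size = grow(size, _XSTATE_YMM);
    }
    if (f.mpx) {
        /* BNDCSR is the higher of the two MPX components. */
        mask |= XSTATE_BIT(_XSTATE_BNDREGS) | XSTATE_BIT(_XSTATE_BNDCSR);
        size = grow(size, _XSTATE_BNDCSR);
    }
    if (f.pku) {
        mask |= XSTATE_BIT(_XSTATE_PKRU);
        size = grow(size, _XSTATE_PKRU);
    }
    if (f.lwp) {
        mask |= XSTATE_BIT(_XSTATE_LWP);
        size = grow(size, _XSTATE_LWP);
    }

    return (struct xstate_leaf){
        .eax = (uint32_t)mask,          /* low 32 bits of the mask */
        .ecx = size,                    /* maximum save area size */
        .edx = (uint32_t)(mask >> 32),  /* high 32 bits of the mask */
    };
}
```

Because the result depends only on the feature bits passed in, a domain with no cpuid policy (such as dom0) gets the same answer as one with a policy, as long as the caller filters the feature bits first. That is the property the patch relies on to stop hardware values leaking through.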