From patchwork Thu Mar 24 08:29:32 2016
X-Patchwork-Submitter: Shuai Ruan
X-Patchwork-Id: 8658981
From: Shuai Ruan
To: xen-devel@lists.xen.org
Cc: andrew.cooper3@citrix.com, keir@xen.org, jbeulich@suse.com
Date: Thu, 24 Mar 2016 16:29:32 +0800
Message-Id: <1458808173-23279-2-git-send-email-shuai.ruan@linux.intel.com>
In-Reply-To: <1458808173-23279-1-git-send-email-shuai.ruan@linux.intel.com>
References: <1458808173-23279-1-git-send-email-shuai.ruan@linux.intel.com>
Subject: [Xen-devel] [PATCH V6 1/2] x86/xsaves: calculate the comp_offsets base on xcomp_bv

The previous patch calculated comp_offsets using all available features.
This is wrong. This patch fixes the bug by calculating comp_offsets based
on the xcomp_bv of the current guest. The comp_offsets calculation should
also take alignment into consideration.

Signed-off-by: Shuai Ruan
Reported-by: Jan Beulich
---
V5: Address comments from Jan:
1. use xcomp_bv to calculate comp_offsets
2. local variables/function parameters without the xstate_ prefix
3. fix a bug related to test_bit().

V4: Address comments from Jan:
1. use xstate_comp_offsets as an on-stack array.

V3: Address comments from Jan:
1. fix the problem of xstate_comp_offsets being used as a static array.
2. change xstate_align from an array to a u64 used as a bitmap.
3. change the calculation of xstate_comp_offsets into three steps:
   1) whether the component is set in the xsave area
   2) whether the component needs alignment
   3) add xstate_size[i-1]

V2: Address comments from Jan:
1. code style fix.
2. setup_xstate_comp takes xcomp_bv as a parameter.

 xen/arch/x86/xstate.c        | 58 ++++++++++++++++++++++++++++----------------
 xen/include/asm-x86/xstate.h |  2 ++
 2 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index f649405..a7b6a04 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -26,8 +26,8 @@ u64 __read_mostly xfeature_mask;
 
 static unsigned int *__read_mostly xstate_offsets;
 unsigned int *__read_mostly xstate_sizes;
+static u64 __read_mostly xstate_align;
 static unsigned int __read_mostly xstate_features;
-static unsigned int __read_mostly xstate_comp_offsets[sizeof(xfeature_mask)*8];
 
 static uint32_t __read_mostly mxcsr_mask = 0x0000ffbf;
 
@@ -94,7 +94,7 @@ static bool_t xsave_area_compressed(const struct xsave_struct *xsave_area)
 
 static int setup_xstate_features(bool_t bsp)
 {
-    unsigned int leaf, tmp, eax, ebx;
+    unsigned int leaf, eax, ebx, ecx, edx;
 
     if ( bsp )
     {
@@ -111,57 +111,71 @@ static int setup_xstate_features(bool_t bsp)
     for ( leaf = 2; leaf < xstate_features; leaf++ )
     {
         if ( bsp )
+        {
             cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
-                        &xstate_offsets[leaf], &tmp, &tmp);
+                        &xstate_offsets[leaf], &ecx, &edx);
+            if ( ecx & XSTATE_ALIGN64 )
+                __set_bit(leaf, &xstate_align);
+        }
         else
         {
             cpuid_count(XSTATE_CPUID, leaf, &eax,
-                        &ebx, &tmp, &tmp);
+                        &ebx, &ecx, &edx);
             BUG_ON(eax != xstate_sizes[leaf]);
             BUG_ON(ebx != xstate_offsets[leaf]);
+            BUG_ON(!(ecx & XSTATE_ALIGN64) != !test_bit(leaf, &xstate_align));
         }
     }
 
     return 0;
 }
 
-static void __init setup_xstate_comp(void)
+static void setup_xstate_comp(uint16_t *comp_offsets,
+                              const uint64_t xcomp_bv)
 {
     unsigned int i;
+    unsigned int offset;
 
     /*
      * The FP xstates and SSE xstates are legacy states. They are always
      * in the fixed offsets in the xsave area in either compacted form
      * or standard form.
      */
-    xstate_comp_offsets[0] = 0;
-    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
+    comp_offsets[0] = 0;
+    comp_offsets[1] = XSAVE_SSE_OFFSET;
 
-    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+    comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 
-    for ( i = 3; i < xstate_features; i++ )
+    offset = comp_offsets[2];
+    for ( i = 2; i < xstate_features; i++ )
     {
-        xstate_comp_offsets[i] = xstate_comp_offsets[i - 1] +
-                                 (((1ul << i) & xfeature_mask)
-                                  ? xstate_sizes[i - 1] : 0);
-        ASSERT(xstate_comp_offsets[i] + xstate_sizes[i] <= xsave_cntxt_size);
+        if ( (1ul << i) & xcomp_bv )
+        {
+            if ( test_bit(i, &xstate_align) )
+                offset = ROUNDUP(offset, 64);
+            comp_offsets[i] = offset;
+            offset += xstate_sizes[i];
+        }
     }
+    ASSERT(offset <= xsave_cntxt_size);
 }
 
 static void *get_xsave_addr(struct xsave_struct *xsave,
-                            unsigned int xfeature_idx)
+                            const uint16_t *comp_offsets,
+                            unsigned int xfeature_idx)
 {
     if ( !((1ul << xfeature_idx) & xsave->xsave_hdr.xstate_bv) )
         return NULL;
 
-    return (void *)xsave + (xsave_area_compressed(xsave)
-                            ? xstate_comp_offsets
-                            : xstate_offsets)[xfeature_idx];
+    return (void *)xsave + (xsave_area_compressed(xsave) ?
+                            comp_offsets[xfeature_idx] :
+                            xstate_offsets[xfeature_idx]);
 }
 
 void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
 {
     struct xsave_struct *xsave = v->arch.xsave_area;
+    uint16_t comp_offsets[sizeof(xfeature_mask)*8];
     u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
     u64 valid;
 
@@ -172,6 +186,8 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
     }
 
     ASSERT(xsave_area_compressed(xsave));
+    setup_xstate_comp(comp_offsets, xsave->xsave_hdr.xcomp_bv);
+
     /*
      * Copy legacy XSAVE area and XSAVE hdr area.
      */
@@ -188,7 +204,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
     {
         u64 feature = valid & -valid;
         unsigned int index = fls(feature) - 1;
-        const void *src = get_xsave_addr(xsave, index);
+        const void *src = get_xsave_addr(xsave, comp_offsets, index);
 
         if ( src )
         {
@@ -203,6 +219,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
 void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
 {
     struct xsave_struct *xsave = v->arch.xsave_area;
+    uint16_t comp_offsets[sizeof(xfeature_mask)*8];
     u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
     u64 valid;
 
@@ -222,6 +239,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     /* Set XSTATE_BV and XCOMP_BV. */
     xsave->xsave_hdr.xstate_bv = xstate_bv;
     xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
+    setup_xstate_comp(comp_offsets, xsave->xsave_hdr.xcomp_bv);
 
     /*
      * Copy each region from the non-compacted offset to the
@@ -232,7 +250,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     {
         u64 feature = valid & -valid;
         unsigned int index = fls(feature) - 1;
-        void *dest = get_xsave_addr(xsave, index);
+        void *dest = get_xsave_addr(xsave, comp_offsets, index);
 
         if ( dest )
         {
@@ -564,8 +582,6 @@ void xstate_init(struct cpuinfo_x86 *c)
 
     if ( setup_xstate_features(bsp) && bsp )
         BUG();
-    if ( bsp && (cpu_has_xsaves || cpu_has_xsavec) )
-        setup_xstate_comp();
 }
 
 static bool_t valid_xcr0(u64 xcr0)
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index c28cea5..a488688 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -46,6 +46,8 @@
 #define XSTATE_LAZY (XSTATE_ALL & ~XSTATE_NONLAZY)
 #define XSTATE_COMPACTION_ENABLED (1ULL << 63)
 
+#define XSTATE_ALIGN64 (1U << 1)
+
 extern u64 xfeature_mask;
 extern unsigned int *xstate_sizes;
 
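
For readers who want to try the offset calculation outside the hypervisor, here is a minimal standalone sketch of the loop in setup_xstate_comp() above. It is an illustration under stated assumptions, not the Xen code: calc_comp_offsets() and roundup_pow2() are hypothetical names, the per-component sizes and the alignment bitmap are passed in as parameters instead of being read from CPUID, and the value 160 for XSAVE_SSE_OFFSET is assumed here to match the Xen header.

```c
#include <assert.h>
#include <stdint.h>

#define FXSAVE_SIZE       512u  /* legacy FP/SSE region */
#define XSAVE_HDR_SIZE     64u  /* XSAVE header following the legacy region */
#define XSAVE_SSE_OFFSET  160u  /* assumed: XMM registers inside the legacy region */
#define MAX_COMPONENTS     63u  /* bit 63 of xcomp_bv is the compaction flag */

/* Round x up to the next multiple of a; a must be a power of two. */
static unsigned int roundup_pow2(unsigned int x, unsigned int a)
{
    return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical helper: fill comp_offsets[] for every component enabled in
 * xcomp_bv and return the end of the last component. Components whose bit
 * is clear in xcomp_bv take no space and their entry is left untouched.
 * comp_offsets[] and sizes[] must each hold at least nr_features entries,
 * and nr_features must be at least 2.
 */
static unsigned int calc_comp_offsets(uint16_t *comp_offsets,
                                      const unsigned int *sizes,
                                      uint64_t align_mask,
                                      unsigned int nr_features,
                                      uint64_t xcomp_bv)
{
    unsigned int i, offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;

    assert(nr_features >= 2 && nr_features <= MAX_COMPONENTS);

    /* Legacy components sit at fixed offsets in both formats. */
    comp_offsets[0] = 0;
    comp_offsets[1] = XSAVE_SSE_OFFSET;

    for ( i = 2; i < nr_features; i++ )
    {
        if ( !((UINT64_C(1) << i) & xcomp_bv) )
            continue;
        /* Components flagged in align_mask start on a 64-byte boundary. */
        if ( (UINT64_C(1) << i) & align_mask )
            offset = roundup_pow2(offset, 64);
        comp_offsets[i] = (uint16_t)offset;
        offset += sizes[i];
    }

    return offset;
}
```

As a usage example with made-up sizes: if component 2 is 200 bytes, component 4 is 100 bytes and is flagged as 64-byte aligned, and xcomp_bv enables components 0, 1, 2 and 4, then component 2 lands at offset 576 and component 4 at 832 (776 rounded up to the next 64-byte boundary), for a total of 932 bytes. The disabled component 3 contributes nothing, which is the behaviour the commit message asks for.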