From patchwork Wed Feb 3 12:39:53 2016
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 8202101
Message-Id: <56B2032902000078000CE03C@prv-mh.provo.novell.com>
Date: Wed, 03 Feb 2016 05:39:53 -0700
From: "Jan Beulich"
To: "xen-devel"
Cc: Andrew Cooper, Keir Fraser, Shuai Ruan
Subject: [Xen-devel] [PATCH v2 2/2] x86/xstate: also use alternative asm on xsave side

From: Shuai Ruan

This patch uses alternative asm on the xsave side. As XSAVES uses the
modified optimization like XSAVEOPT, XSAVES may also not write the FPU
portion of the save image. So XSAVES needs the same extra tweaks.

Signed-off-by: Shuai Ruan

Fix XSAVES opcode. Extend the other respective XSAVEOPT conditional to
cover XSAVES as well. Re-wrap comment being adjusted.
Signed-off-by: Jan Beulich

--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -250,27 +250,29 @@ void xsave(struct vcpu *v, uint64_t mask
     uint32_t hmask = mask >> 32;
     uint32_t lmask = mask;
     int word_size = mask & XSTATE_FP ? (cpu_has_fpu_sel ? 8 : 0) : -1;
+#define XSAVE(pfx) \
+    alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", \
+                     ".byte " pfx "0x0f,0xae,0x37\n", \
+                     X86_FEATURE_XSAVEOPT, \
+                     ".byte " pfx "0x0f,0xc7,0x27\n", \
+                     X86_FEATURE_XSAVEC, \
+                     ".byte " pfx "0x0f,0xc7,0x2f\n", \
+                     X86_FEATURE_XSAVES, \
+                     "=m" (*ptr), \
+                     "a" (lmask), "d" (hmask), "D" (ptr))

     if ( word_size <= 0 || !is_pv_32bit_vcpu(v) )
     {
         typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
         typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;

-        if ( cpu_has_xsaves )
-            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsavec )
-            asm volatile ( ".byte 0x48,0x0f,0xc7,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsaveopt )
+        if ( cpu_has_xsaveopt || cpu_has_xsaves )
         {
             /*
-             * xsaveopt may not write the FPU portion even when the respective
-             * mask bit is set. For the check further down to work we hence
-             * need to put the save image back into the state that it was in
-             * right after the previous xsaveopt.
+             * XSAVEOPT/XSAVES may not write the FPU portion even when the
+             * respective mask bit is set. For the check further down to work
+             * we hence need to put the save image back into the state that
+             * it was in right after the previous XSAVEOPT.
              */
             if ( word_size > 0 &&
                  (ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] == 4 ||
@@ -279,14 +281,9 @@ void xsave(struct vcpu *v, uint64_t mask
                 ptr->fpu_sse.fip.sel = 0;
                 ptr->fpu_sse.fdp.sel = 0;
             }
-            asm volatile ( ".byte 0x48,0x0f,0xae,0x37"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
         }
-        else
-            asm volatile ( ".byte 0x48,0x0f,0xae,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
+
+        XSAVE("0x48,");

         if ( !(mask & ptr->xsave_hdr.xstate_bv & XSTATE_FP) ||
              /*
@@ -296,7 +293,7 @@ void xsave(struct vcpu *v, uint64_t mask
               (!(ptr->fpu_sse.fsw & 0x0080) &&
                boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
         {
-            if ( cpu_has_xsaveopt && word_size > 0 )
+            if ( (cpu_has_xsaveopt || cpu_has_xsaves) && word_size > 0 )
             {
                 ptr->fpu_sse.fip.sel = fcs;
                 ptr->fpu_sse.fdp.sel = fds;
@@ -317,24 +314,10 @@ void xsave(struct vcpu *v, uint64_t mask
     }
     else
     {
-        if ( cpu_has_xsaves )
-            asm volatile ( ".byte 0x0f,0xc7,0x2f"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsavec )
-            asm volatile ( ".byte 0x0f,0xc7,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else if ( cpu_has_xsaveopt )
-            asm volatile ( ".byte 0x0f,0xae,0x37"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
-        else
-            asm volatile ( ".byte 0x0f,0xae,0x27"
-                           : "=m" (*ptr)
-                           : "a" (lmask), "d" (hmask), "D" (ptr) );
+        XSAVE("");
         word_size = 4;
     }
+#undef XSAVE

     if ( word_size >= 0 )
         ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size;
 }

Reviewed-by: Andrew Cooper
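For readers less familiar with Xen's alternatives framework: the XSAVE() macro above emits a plain XSAVE instruction and lets alternative_io_3 patch the site at boot, replacing it with XSAVEOPT, XSAVEC, or XSAVES according to CPU features. Since the replacements are applied in the order listed, the last alternative whose feature is present wins, preserving the XSAVES > XSAVEC > XSAVEOPT priority of the removed if/else chains. A minimal C sketch of that selection logic (hypothetical names, not Xen code — the real mechanism rewrites instruction bytes once at boot rather than branching at runtime):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for Xen's boot-time CPU feature flags. */
static bool cpu_has_xsaveopt;
static bool cpu_has_xsavec;
static bool cpu_has_xsaves;

enum xsave_insn { INSN_XSAVE, INSN_XSAVEOPT, INSN_XSAVEC, INSN_XSAVES };

/*
 * Which instruction ends up patched into the XSAVE() site.  Each
 * matching alternative overwrites the previous choice, so the last
 * one listed with its feature present takes effect.
 */
static enum xsave_insn patched_insn(void)
{
    enum xsave_insn insn = INSN_XSAVE;  /* original instruction */

    if ( cpu_has_xsaveopt )
        insn = INSN_XSAVEOPT;           /* first alternative */
    if ( cpu_has_xsavec )
        insn = INSN_XSAVEC;             /* second alternative */
    if ( cpu_has_xsaves )
        insn = INSN_XSAVES;             /* third alternative */

    return insn;
}
```

The boot-time patching is what makes this cheaper than the replaced code: the per-call feature tests disappear entirely from the hot path.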