From patchwork Fri Apr 21 13:45:55 2023
From: Yang Weijiang
Subject: [PATCH v2 01/21] x86/shstk: Add Kconfig option for shadow stack
Date: Fri, 21 Apr 2023 09:45:55 -0400
Message-Id: <20230421134615.62539-2-weijiang.yang@intel.com>

From: Rick Edgecombe

Shadow stack provides protection for applications against function return
address corruption. It is active when the processor supports it, the kernel
has CONFIG_X86_USER_SHADOW_STACK enabled, and the application is built for
the feature. This is only implemented for the 64-bit kernel. When it is
enabled, legacy non-shadow-stack applications continue to work, but without
protection.

Since there is another feature that utilizes CET (kernel IBT) that will
share implementation with shadow stacks, create CONFIG_X86_CET to signify
that at least one CET feature is configured.
Co-developed-by: Yu-cheng Yu
Signed-off-by: Yu-cheng Yu
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
Reviewed-by: Kees Cook
Acked-by: Mike Rapoport (IBM)
Tested-by: Pengfei Xu
Tested-by: John Allen
Tested-by: Kees Cook
Link: https://lore.kernel.org/all/20230319001535.23210-3-rick.p.edgecombe%40intel.com
---
 arch/x86/Kconfig           | 24 ++++++++++++++++++++++++
 arch/x86/Kconfig.assembler |  5 +++++
 2 files changed, 29 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a825bf031f49..f03791b73f9f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1851,6 +1851,11 @@ config CC_HAS_IBT
 		  (CC_IS_CLANG && CLANG_VERSION >= 140000)) && \
 		  $(as-instr,endbr64)
 
+config X86_CET
+	def_bool n
+	help
+	  CET features configured (Shadow stack or IBT)
+
 config X86_KERNEL_IBT
 	prompt "Indirect Branch Tracking"
 	def_bool y
@@ -1858,6 +1863,7 @@ config X86_KERNEL_IBT
 	# https://github.com/llvm/llvm-project/commit/9d7001eba9c4cb311e03cd8cdc231f9e579f2d0f
 	depends on !LD_IS_LLD || LLD_VERSION >= 140000
 	select OBJTOOL
+	select X86_CET
 	help
 	  Build the kernel with support for Indirect Branch Tracking, a
 	  hardware support course-grain forward-edge Control Flow Integrity
@@ -1952,6 +1958,24 @@ config X86_SGX
 
 	  If unsure, say N.
 
+config X86_USER_SHADOW_STACK
+	bool "X86 userspace shadow stack"
+	depends on AS_WRUSS
+	depends on X86_64
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select X86_CET
+	help
+	  Shadow stack protection is a hardware feature that detects function
+	  return address corruption. This helps mitigate ROP attacks.
+	  Applications must be enabled to use it, and old userspace does not
+	  get protection "for free".
+
+	  CPUs supporting shadow stacks were first released in 2020.
+
+	  See Documentation/x86/shstk.rst for more information.
+
+	  If unsure, say N.
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
diff --git a/arch/x86/Kconfig.assembler b/arch/x86/Kconfig.assembler
index b88f784cb02e..8ad41da301e5 100644
--- a/arch/x86/Kconfig.assembler
+++ b/arch/x86/Kconfig.assembler
@@ -24,3 +24,8 @@ config AS_GFNI
 	def_bool $(as-instr,vgf2p8mulb %xmm0$(comma)%xmm1$(comma)%xmm2)
 	help
 	  Supported by binutils >= 2.30 and LLVM integrated assembler
+
+config AS_WRUSS
+	def_bool $(as-instr,wrussq %rax$(comma)(%rbx))
+	help
+	  Supported by binutils >= 2.31 and LLVM integrated assembler
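As an illustration of how the new options fit together, a hypothetical
.config fragment for a shadow stack build could look like the following
(AS_WRUSS is derived from the assembler test above and X86_CET is selected,
so neither is set by hand):

    CONFIG_X86_64=y
    CONFIG_X86_USER_SHADOW_STACK=y
    # CONFIG_X86_CET is selected by the option above
    # CONFIG_AS_WRUSS comes from the as-instr test (binutils >= 2.31)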
From patchwork Fri Apr 21 13:45:56 2023
From: Yang Weijiang
Subject: [PATCH v2 02/21] x86/cpufeatures: Add CPU feature flags for shadow stacks
Date: Fri, 21 Apr 2023 09:45:56 -0400
Message-Id: <20230421134615.62539-3-weijiang.yang@intel.com>

From: Rick Edgecombe

The Control-Flow Enforcement Technology contains two related features,
one of which is Shadow Stacks. Future patches will utilize this feature
for shadow stack support in KVM, so add CPU feature flags for Shadow
Stacks (CPUID.(EAX=7,ECX=0):ECX[bit 7]).

To protect shadow stack state from malicious modification, the registers
are only accessible in supervisor mode. This implementation
context-switches the registers with XSAVES. Make X86_FEATURE_SHSTK depend
on XSAVES.

The shadow stack feature, enumerated by the CPUID bit described above,
encompasses both supervisor and userspace support for shadow stack. In
near future patches, only userspace shadow stack will be enabled. In
expectation of future supervisor shadow stack support, create a software
CPU capability to enumerate kernel utilization of userspace shadow stack
support. This user shadow stack bit should depend on the HW "shstk"
capability and that logic will be implemented in future patches.
Co-developed-by: Yu-cheng Yu
Signed-off-by: Yu-cheng Yu
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
Reviewed-by: Kees Cook
Acked-by: Mike Rapoport (IBM)
Tested-by: Pengfei Xu
Tested-by: John Allen
Tested-by: Kees Cook
Link: https://lore.kernel.org/all/20230319001535.23210-4-rick.p.edgecombe%40intel.com
---
 arch/x86/include/asm/cpufeatures.h       | 2 ++
 arch/x86/include/asm/disabled-features.h | 8 +++++++-
 arch/x86/kernel/cpu/cpuid-deps.c         | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 97327a1e3aff..3993ea7c6312 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -308,6 +308,7 @@
 #define X86_FEATURE_MSR_TSX_CTRL	(11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */
 #define X86_FEATURE_SMBA		(11*32+21) /* "" Slow Memory Bandwidth Allocation */
 #define X86_FEATURE_BMEC		(11*32+22) /* "" Bandwidth Monitoring Event Configuration */
+#define X86_FEATURE_USER_SHSTK		(11*32+23) /* Shadow stack support for user mode applications */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
@@ -379,6 +380,7 @@
 #define X86_FEATURE_OSPKE		(16*32+ 4) /* OS Protection Keys Enable */
 #define X86_FEATURE_WAITPKG		(16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */
 #define X86_FEATURE_AVX512_VBMI2	(16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
+#define X86_FEATURE_SHSTK		(16*32+ 7) /* "" Shadow stack */
 #define X86_FEATURE_GFNI		(16*32+ 8) /* Galois Field New Instructions */
 #define X86_FEATURE_VAES		(16*32+ 9) /* Vector AES */
 #define X86_FEATURE_VPCLMULQDQ		(16*32+10) /* Carry-Less Multiplication Double Quadword */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 5dfa4fb76f4b..505f78ddca82 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -99,6 +99,12 @@
 # define DISABLE_TDX_GUEST	(1 << (X86_FEATURE_TDX_GUEST & 31))
 #endif
 
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+#define DISABLE_USER_SHSTK	0
+#else
+#define DISABLE_USER_SHSTK	(1 << (X86_FEATURE_USER_SHSTK & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -114,7 +120,7 @@
 #define DISABLED_MASK9	(DISABLE_SGX)
 #define DISABLED_MASK10	0
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
-			 DISABLE_CALL_DEPTH_TRACKING)
+			 DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK)
 #define DISABLED_MASK12	0
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index f6748c8bd647..e462c1d3800a 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -81,6 +81,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_XFD,		X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_XFD,		X86_FEATURE_XGETBV1   },
 	{ X86_FEATURE_AMX_TILE,		X86_FEATURE_XFD       },
+	{ X86_FEATURE_SHSTK,		X86_FEATURE_XSAVES    },
 	{}
 };
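For reference, the hardware enumeration named above can be probed with a few
lines of C; a minimal sketch using GCC/Clang's <cpuid.h> helper:

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

            /* Leaf 7, sub-leaf 0: structured extended feature flags */
            if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                    return 1;

            /* CPUID.(EAX=7,ECX=0):ECX[bit 7] enumerates shadow stack (SHSTK) */
            printf("shstk: %s\n", (ecx & (1u << 7)) ? "yes" : "no");
            return 0;
    }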
From patchwork Fri Apr 21 13:45:57 2023
From: Yang Weijiang
Subject: [PATCH v2 03/21] x86/cpufeatures: Enable CET CR4 bit for shadow stack
Date: Fri, 21 Apr 2023 09:45:57 -0400
Message-Id: <20230421134615.62539-4-weijiang.yang@intel.com>

From: Rick Edgecombe

Setting CR4.CET is a prerequisite for utilizing any CET features, most of
which also require setting MSRs.

Kernel IBT already enables the CET CR4 bit when it detects IBT HW support
and is configured with kernel IBT. However, future patches that enable
userspace shadow stack support will need the bit set as well. So change
the logic to enable it in either case.

Clear MSR_IA32_U_CET in cet_disable() so that it can't live to see
userspace in a new kexec-ed kernel that has CR4.CET set from kernel IBT.
Co-developed-by: Yu-cheng Yu
Signed-off-by: Yu-cheng Yu
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
Reviewed-by: Kees Cook
Acked-by: Mike Rapoport (IBM)
Tested-by: Pengfei Xu
Tested-by: John Allen
Tested-by: Kees Cook
Link: https://lore.kernel.org/all/20230319001535.23210-5-rick.p.edgecombe%40intel.com
---
 arch/x86/kernel/cpu/common.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8cd4126d8253..cc686e5039be 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -600,27 +600,43 @@ __noendbr void ibt_restore(u64 save)
 
 static __always_inline void setup_cet(struct cpuinfo_x86 *c)
 {
-	u64 msr = CET_ENDBR_EN;
+	bool user_shstk, kernel_ibt;
 
-	if (!HAS_KERNEL_IBT ||
-	    !cpu_feature_enabled(X86_FEATURE_IBT))
+	if (!IS_ENABLED(CONFIG_X86_CET))
 		return;
 
-	wrmsrl(MSR_IA32_S_CET, msr);
+	kernel_ibt = HAS_KERNEL_IBT && cpu_feature_enabled(X86_FEATURE_IBT);
+	user_shstk = cpu_feature_enabled(X86_FEATURE_SHSTK) &&
+		     IS_ENABLED(CONFIG_X86_USER_SHADOW_STACK);
+
+	if (!kernel_ibt && !user_shstk)
+		return;
+
+	if (user_shstk)
+		set_cpu_cap(c, X86_FEATURE_USER_SHSTK);
+
+	if (kernel_ibt)
+		wrmsrl(MSR_IA32_S_CET, CET_ENDBR_EN);
+	else
+		wrmsrl(MSR_IA32_S_CET, 0);
+
 	cr4_set_bits(X86_CR4_CET);
 
-	if (!ibt_selftest()) {
+	if (kernel_ibt && !ibt_selftest()) {
 		pr_err("IBT selftest: Failed!\n");
 		wrmsrl(MSR_IA32_S_CET, 0);
 		setup_clear_cpu_cap(X86_FEATURE_IBT);
-		return;
 	}
 }
 
 __noendbr void cet_disable(void)
 {
-	if (cpu_feature_enabled(X86_FEATURE_IBT))
-		wrmsrl(MSR_IA32_S_CET, 0);
+	if (!(cpu_feature_enabled(X86_FEATURE_IBT) ||
+	      cpu_feature_enabled(X86_FEATURE_SHSTK)))
+		return;
+
+	wrmsrl(MSR_IA32_S_CET, 0);
+	wrmsrl(MSR_IA32_U_CET, 0);
 }
 
 /*
@@ -1482,6 +1498,9 @@ static void __init cpu_parse_early_param(void)
 	if (cmdline_find_option_bool(boot_command_line, "noxsaves"))
 		setup_clear_cpu_cap(X86_FEATURE_XSAVES);
 
+	if (cmdline_find_option_bool(boot_command_line, "nousershstk"))
+		setup_clear_cpu_cap(X86_FEATURE_USER_SHSTK);
+
 	arglen = cmdline_find_option(boot_command_line, "clearcpuid", arg, sizeof(arg));
 	if (arglen <= 0)
 		return;
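Whether the kernel actually enabled the feature is then visible as the
software capability set in setup_cet() above; a small userspace sketch
(assuming the flag shows up in /proc/cpuinfo as "user_shstk", following the
cpufeatures entry added in the previous patch):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            char line[4096];
            FILE *f = fopen("/proc/cpuinfo", "r");

            if (!f)
                    return 1;
            /* Look at the first "flags" line; all CPUs report the same set */
            while (fgets(line, sizeof(line), f)) {
                    if (!strncmp(line, "flags", 5)) {
                            puts(strstr(line, " user_shstk") ? "user_shstk: yes"
                                                             : "user_shstk: no");
                            break;
                    }
            }
            fclose(f);
            return 0;
    }

The boot-time opt-out added above is the "nousershstk" kernel command line
parameter, which clears X86_FEATURE_USER_SHSTK early in boot.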
From patchwork Fri Apr 21 13:45:58 2023
From: Yang Weijiang
Subject: [PATCH v2 04/21] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states
Date: Fri, 21 Apr 2023 09:45:58 -0400
Message-Id: <20230421134615.62539-5-weijiang.yang@intel.com>

From: Rick Edgecombe

Shadow stack register state can be managed with XSAVE. The registers can
logically be separated into two groups:

 * Registers controlling user-mode operation
 * Registers controlling kernel-mode operation

The architecture has two new XSAVE state components: one for each of those
groups of registers. This lets an OS manage them separately if it chooses.
Future patches for host userspace and KVM guests will only utilize the
user-mode registers, so only configure XSAVE to save user-mode registers.
This state will add 16 bytes to the xsave buffer size.

Future patches will use the user-mode XSAVE area to save guest user-mode
CET state. However, the VMCS includes new fields for guest CET supervisor
states. KVM can use these to save and restore guest supervisor state, so
host supervisor XSAVE support is not required.

Adding this exacerbates the already unwieldy if statement in
check_xstate_against_struct() that handles warning about un-implemented
xfeatures. So refactor these checks into a switch statement and have
XCHECK_SZ() evaluate to true when it actually checks the xfeature, so
callers can simply return its result. Some lines end up exceeding 80
chars, but this was better on balance than the other options explored.

While configuring user-mode XSAVE, clarify that kernel-mode registers are
not managed by XSAVE by defining the xfeature in
XFEATURE_MASK_SUPERVISOR_UNSUPPORTED, like is done for XFEATURE_MASK_PT.
This serves more of a documentation-as-code purpose, and functionally only
enables a few safety checks.
Both XSAVE state components are supervisor states, even the state
controlling user-mode operation. This is a departure from earlier features
like protection keys where the PKRU state is a normal user (non-supervisor)
state. Having the user state be supervisor-managed ensures there is no
direct, unprivileged access to it, making it harder for an attacker to
subvert CET.

To facilitate this privileged access, define the two user-mode CET MSRs,
and the bits defined in those MSRs relevant to future shadow stack
enablement patches.

Co-developed-by: Yu-cheng Yu
Signed-off-by: Yu-cheng Yu
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
Reviewed-by: Kees Cook
Acked-by: Mike Rapoport (IBM)
Tested-by: Pengfei Xu
Tested-by: John Allen
Tested-by: Kees Cook
Link: https://lore.kernel.org/all/20230319001535.23210-6-rick.p.edgecombe%40intel.com
---
 arch/x86/include/asm/fpu/types.h  | 16 +++++-
 arch/x86/include/asm/fpu/xstate.h |  6 ++-
 arch/x86/kernel/fpu/xstate.c      | 90 +++++++++++++++----------------
 3 files changed, 61 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 7f6d858ff47a..eb810074f1e7 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -115,8 +115,8 @@ enum xfeature {
 	XFEATURE_PT_UNIMPLEMENTED_SO_FAR,
 	XFEATURE_PKRU,
 	XFEATURE_PASID,
-	XFEATURE_RSRVD_COMP_11,
-	XFEATURE_RSRVD_COMP_12,
+	XFEATURE_CET_USER,
+	XFEATURE_CET_KERNEL_UNUSED,
 	XFEATURE_RSRVD_COMP_13,
 	XFEATURE_RSRVD_COMP_14,
 	XFEATURE_LBR,
@@ -138,6 +138,8 @@ enum xfeature {
 #define XFEATURE_MASK_PT		(1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR)
 #define XFEATURE_MASK_PKRU		(1 << XFEATURE_PKRU)
 #define XFEATURE_MASK_PASID		(1 << XFEATURE_PASID)
+#define XFEATURE_MASK_CET_USER		(1 << XFEATURE_CET_USER)
+#define XFEATURE_MASK_CET_KERNEL	(1 << XFEATURE_CET_KERNEL_UNUSED)
 #define XFEATURE_MASK_LBR		(1 << XFEATURE_LBR)
 #define XFEATURE_MASK_XTILE_CFG		(1 << XFEATURE_XTILE_CFG)
 #define XFEATURE_MASK_XTILE_DATA	(1 << XFEATURE_XTILE_DATA)
@@ -252,6 +254,16 @@ struct pkru_state {
 	u32				pad;
 } __packed;
 
+/*
+ * State component 11 is Control-flow Enforcement user states
+ */
+struct cet_user_state {
+	/* user control-flow settings */
+	u64 user_cet;
+	/* user shadow stack pointer */
+	u64 user_ssp;
+};
+
 /*
  * State component 15: Architectural LBR configuration state.
  * The size of Arch LBR state depends on the number of LBRs (lbr_depth).
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index cd3dd170e23a..d4427b88ee12 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -50,7 +50,8 @@
 #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
 
 /* All currently supported supervisor features */
-#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
+#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
+					    XFEATURE_MASK_CET_USER)
 
 /*
  * A supervisor state component may not always contain valuable information,
@@ -77,7 +78,8 @@
  * Unsupported supervisor features. When a supervisor feature in this mask is
  * supported in the future, move it to the supported supervisor feature mask.
  */
-#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT)
+#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \
+					      XFEATURE_MASK_CET_KERNEL)
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 714166cc25f2..13a80521dd51 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -39,26 +39,26 @@
  */
 static const char *xfeature_names[] =
 {
-	"x87 floating point registers"	,
-	"SSE registers"			,
-	"AVX registers"			,
-	"MPX bounds registers"		,
-	"MPX CSR"			,
-	"AVX-512 opmask"		,
-	"AVX-512 Hi256"			,
-	"AVX-512 ZMM_Hi256"		,
-	"Processor Trace (unused)"	,
+	"x87 floating point registers",
+	"SSE registers",
+	"AVX registers",
+	"MPX bounds registers",
+	"MPX CSR",
+	"AVX-512 opmask",
+	"AVX-512 Hi256",
+	"AVX-512 ZMM_Hi256",
+	"Processor Trace (unused)",
 	"Protection Keys User registers",
 	"PASID state",
-	"unknown xstate feature"	,
-	"unknown xstate feature"	,
-	"unknown xstate feature"	,
-	"unknown xstate feature"	,
-	"unknown xstate feature"	,
-	"unknown xstate feature"	,
-	"AMX Tile config"		,
-	"AMX Tile data"			,
-	"unknown xstate feature"	,
+	"Control-flow User registers",
+	"Control-flow Kernel registers (unused)",
+	"unknown xstate feature",
+	"unknown xstate feature",
+	"unknown xstate feature",
+	"unknown xstate feature",
+	"AMX Tile config",
+	"AMX Tile data",
+	"unknown xstate feature",
 };
 
 static unsigned short xsave_cpuid_features[] __initdata = {
@@ -73,6 +73,7 @@ static unsigned short xsave_cpuid_features[] __initdata = {
 	[XFEATURE_PT_UNIMPLEMENTED_SO_FAR]	= X86_FEATURE_INTEL_PT,
 	[XFEATURE_PKRU]				= X86_FEATURE_PKU,
 	[XFEATURE_PASID]			= X86_FEATURE_ENQCMD,
+	[XFEATURE_CET_USER]			= X86_FEATURE_SHSTK,
 	[XFEATURE_XTILE_CFG]			= X86_FEATURE_AMX_TILE,
 	[XFEATURE_XTILE_DATA]			= X86_FEATURE_AMX_TILE,
 };
@@ -276,6 +277,7 @@ static void __init print_xstate_features(void)
 	print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
 	print_xstate_feature(XFEATURE_MASK_PKRU);
 	print_xstate_feature(XFEATURE_MASK_PASID);
+	print_xstate_feature(XFEATURE_MASK_CET_USER);
 	print_xstate_feature(XFEATURE_MASK_XTILE_CFG);
 	print_xstate_feature(XFEATURE_MASK_XTILE_DATA);
 }
@@ -344,6 +346,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate)
 	 XFEATURE_MASK_BNDREGS |		\
 	 XFEATURE_MASK_BNDCSR |			\
 	 XFEATURE_MASK_PASID |			\
+	 XFEATURE_MASK_CET_USER |		\
 	 XFEATURE_MASK_XTILE)
 
 /*
@@ -446,14 +449,15 @@ static void __init __xstate_dump_leaves(void)
 	}						\
 } while (0)
 
-#define XCHECK_SZ(sz, nr, nr_macro, __struct) do {			\
-	if ((nr == nr_macro) &&						\
-	    WARN_ONCE(sz != sizeof(__struct),				\
-		"%s: struct is %zu bytes, cpu state %d bytes\n",	\
-		__stringify(nr_macro), sizeof(__struct), sz)) {		\
+#define XCHECK_SZ(sz, nr, __struct) ({					\
+	if (WARN_ONCE(sz != sizeof(__struct),				\
+	    "[%s]: struct is %zu bytes, cpu state %d bytes\n",		\
+	    xfeature_names[nr], sizeof(__struct), sz)) {		\
 		__xstate_dump_leaves();					\
 	}								\
-} while (0)
+	true;								\
+})
 
 /**
  * check_xtile_data_against_struct - Check tile data state size.
@@ -527,36 +531,28 @@ static bool __init check_xstate_against_struct(int nr)
 	 * Ask the CPU for the size of the state.
 	 */
 	int sz = xfeature_size(nr);
+
 	/*
 	 * Match each CPU state with the corresponding software
 	 * structure.
 	 */
-	XCHECK_SZ(sz, nr, XFEATURE_YMM,       struct ymmh_struct);
-	XCHECK_SZ(sz, nr, XFEATURE_BNDREGS,   struct mpx_bndreg_state);
-	XCHECK_SZ(sz, nr, XFEATURE_BNDCSR,    struct mpx_bndcsr_state);
-	XCHECK_SZ(sz, nr, XFEATURE_OPMASK,    struct avx_512_opmask_state);
-	XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state);
-	XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM,  struct avx_512_hi16_state);
-	XCHECK_SZ(sz, nr, XFEATURE_PKRU,      struct pkru_state);
-	XCHECK_SZ(sz, nr, XFEATURE_PASID,     struct ia32_pasid_state);
-	XCHECK_SZ(sz, nr, XFEATURE_XTILE_CFG, struct xtile_cfg);
-
-	/* The tile data size varies between implementations. */
-	if (nr == XFEATURE_XTILE_DATA)
-		check_xtile_data_against_struct(sz);
-
-	/*
-	 * Make *SURE* to add any feature numbers in below if
-	 * there are "holes" in the xsave state component
-	 * numbers.
-	 */
-	if ((nr < XFEATURE_YMM) ||
-	    (nr >= XFEATURE_MAX) ||
-	    (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) ||
-	    ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <= XFEATURE_RSRVD_COMP_16))) {
+	switch (nr) {
+	case XFEATURE_YMM:	  return XCHECK_SZ(sz, nr, struct ymmh_struct);
+	case XFEATURE_BNDREGS:	  return XCHECK_SZ(sz, nr, struct mpx_bndreg_state);
+	case XFEATURE_BNDCSR:	  return XCHECK_SZ(sz, nr, struct mpx_bndcsr_state);
+	case XFEATURE_OPMASK:	  return XCHECK_SZ(sz, nr, struct avx_512_opmask_state);
+	case XFEATURE_ZMM_Hi256:  return XCHECK_SZ(sz, nr, struct avx_512_zmm_uppers_state);
+	case XFEATURE_Hi16_ZMM:	  return XCHECK_SZ(sz, nr, struct avx_512_hi16_state);
+	case XFEATURE_PKRU:	  return XCHECK_SZ(sz, nr, struct pkru_state);
+	case XFEATURE_PASID:	  return XCHECK_SZ(sz, nr, struct ia32_pasid_state);
+	case XFEATURE_XTILE_CFG:  return XCHECK_SZ(sz, nr, struct xtile_cfg);
+	case XFEATURE_CET_USER:	  return XCHECK_SZ(sz, nr, struct cet_user_state);
+	case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true;
+	default:
 		XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr);
 		return false;
 	}
+
 	return true;
 }
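For illustration, the new component's enumeration can be inspected from
userspace via CPUID leaf 0xD; a minimal sketch (assuming GCC/Clang's
<cpuid.h>; sub-leaf 11 corresponds to XFEATURE_CET_USER above, and EAX
should report the 16 bytes of struct cet_user_state):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

            /* Leaf 0xD, sub-leaf 11: attributes of XSAVE component 11 */
            if (!__get_cpuid_count(0xD, 11, &eax, &ebx, &ecx, &edx))
                    return 1;

            /* EAX = component size; ECX bit 0 = managed via IA32_XSS */
            printf("CET_USER: %u bytes, supervisor: %s\n",
                   eax, (ecx & 1) ? "yes" : "no");
            return 0;
    }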
From patchwork Fri Apr 21 13:45:59 2023
From: Yang Weijiang
Subject: [PATCH v2 05/21] x86/fpu: Add helper for modifying xstate
Date: Fri, 21 Apr 2023 09:45:59 -0400
Message-Id: <20230421134615.62539-6-weijiang.yang@intel.com>

From: Rick Edgecombe

Just like user xfeatures, supervisor xfeatures can be active in the
registers or present in the task FPU buffer. If the registers are active,
the registers can be modified directly. If the registers are not active,
the modification must be performed on the task FPU buffer.

When the state is not active, the kernel could perform modifications
directly to the buffer. But in order for it to do that, it needs to know
where in the buffer the specific state it wants to modify is located.
Doing this is not robust against optimizations that compact the FPU
buffer, as each access would require computing where in the buffer it is.

The easiest way to modify supervisor xfeature data is to force restore the
registers and write directly to the MSRs. Often this is just fine anyway
as the registers need to be restored before returning to userspace. Do
this for now, leaving buffer writing optimizations for the future.

Add a new function fpregs_lock_and_load() that can simultaneously call
fpregs_lock() and do this restore. Also perform some extra sanity checks
in this function since this will be used in non-FPU-focused code.

Suggested-by: Thomas Gleixner
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
Reviewed-by: Kees Cook
Acked-by: Mike Rapoport (IBM)
Tested-by: Pengfei Xu
Tested-by: John Allen
Tested-by: Kees Cook
Link: https://lore.kernel.org/all/20230319001535.23210-7-rick.p.edgecombe%40intel.com
---
 arch/x86/include/asm/fpu/api.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c     | 18 ++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 503a577814b2..aadc6893dcaa 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -82,6 +82,15 @@ static inline void fpregs_unlock(void)
 	preempt_enable();
 }
 
+/*
+ * FPU state gets lazily restored before returning to userspace. So when in the
+ * kernel, the valid FPU state may be kept in the buffer. This function will force
+ * restore all the fpu state to the registers early if needed, and lock them from
+ * being automatically saved/restored. Then FPU state can be modified safely in the
+ * registers, before unlocking with fpregs_unlock().
+ */
+void fpregs_lock_and_load(void);
+
 #ifdef CONFIG_X86_DEBUG_FPU
 extern void fpregs_assert_state_consistent(void);
 #else
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index caf33486dc5e..f851558b673f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -753,6 +753,24 @@ void switch_fpu_return(void)
 }
 EXPORT_SYMBOL_GPL(switch_fpu_return);
 
+void fpregs_lock_and_load(void)
+{
+	/*
+	 * fpregs_lock() only disables preemption (mostly). So modifying state
+	 * in an interrupt could screw up some in progress fpregs operation.
+	 * Warn about it.
+	 */
+	WARN_ON_ONCE(!irq_fpu_usable());
+	WARN_ON_ONCE(current->flags & PF_KTHREAD);
+
+	fpregs_lock();
+
+	fpregs_assert_state_consistent();
+
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		fpregs_restore_userregs();
+}
+
 #ifdef CONFIG_X86_DEBUG_FPU
 /*
  * If current FPU state according to its tracking (loaded FPU context on this
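The intended calling pattern, per the comment in the api.h hunk, brackets
MSR writes with the new helper and fpregs_unlock(); a hedged sketch of a
future caller (MSR_IA32_PL3_SSP comes from later patches in this series,
and the function name here is made up for illustration):

    /*
     * Illustrative sketch only: update the current task's user shadow
     * stack pointer by writing the live, locked register state.
     */
    static void set_current_user_ssp(u64 new_ssp)
    {
            fpregs_lock_and_load();                 /* force-restore xstate, lock fpregs */
            wrmsrl(MSR_IA32_PL3_SSP, new_ssp);      /* modify the live register */
            fpregs_unlock();                        /* re-enable lazy save/restore */
    }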
From patchwork Fri Apr 21 13:46:00 2023
From: Yang Weijiang
Subject: [PATCH v2 06/21] KVM:x86: Report XSS as to-be-saved if there are supported features
Date: Fri, 21 Apr 2023 09:46:00 -0400
Message-Id: <20230421134615.62539-7-weijiang.yang@intel.com>

From: Sean Christopherson

Add MSR_IA32_XSS to the list of MSRs reported to userspace if
supported_xss is non-zero, i.e. KVM supports at least one XSS based
feature.

Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7f78fe79b32..33a780fe820b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1454,6 +1454,7 @@ static const u32 msrs_to_save_base[] = {
 	MSR_IA32_UMWAIT_CONTROL,
 
 	MSR_IA32_XFD, MSR_IA32_XFD_ERR,
+	MSR_IA32_XSS,
 };
 
 static const u32 msrs_to_save_pmu[] = {
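Userspace can observe the effect through KVM_GET_MSR_INDEX_LIST; a rough
sketch (error handling mostly elided; IA32_XSS is MSR 0xda0 per the SDM):

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    #define MSR_IA32_XSS 0xda0

    int main(void)
    {
            struct kvm_msr_list probe = { .nmsrs = 0 };
            struct kvm_msr_list *list;
            int kvm = open("/dev/kvm", O_RDWR);
            unsigned int i;

            if (kvm < 0)
                    return 1;

            /* First call fails with E2BIG but fills in the required count */
            ioctl(kvm, KVM_GET_MSR_INDEX_LIST, &probe);

            list = calloc(1, sizeof(*list) + probe.nmsrs * sizeof(__u32));
            list->nmsrs = probe.nmsrs;
            ioctl(kvm, KVM_GET_MSR_INDEX_LIST, list);

            for (i = 0; i < list->nmsrs; i++)
                    if (list->indices[i] == MSR_IA32_XSS)
                            puts("MSR_IA32_XSS is in the to-be-saved list");
            return 0;
    }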
From patchwork Fri Apr 21 13:46:01 2023
From: Yang Weijiang
Subject: [PATCH v2 07/21] KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS
Date: Fri, 21 Apr 2023 09:46:01 -0400
Message-Id: <20230421134615.62539-8-weijiang.yang@intel.com>

Update CPUID(EAX=0DH,ECX=1) when the guest's XSS is modified.
CPUID(EAX=0DH,ECX=1).EBX reports the current required storage size for all
features enabled via XCR0 | XSS, so that the guest can allocate a
correctly sized xsave buffer.

Note, KVM does not yet support any XSS based features, i.e. supported_xss
is guaranteed to be zero at this time.

Co-developed-by: Zhang Yi Z
Signed-off-by: Zhang Yi Z
Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/cpuid.c | 11 ++++++++---
 arch/x86/kvm/x86.c   |  6 ++++--
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 123bf8b97a4b..dd6d5150d86a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -276,9 +276,14 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
 
 	best = cpuid_entry2_find(entries, nent, 0xD, 1);
-	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
-		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+	if (best) {
+		if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
+		    cpuid_entry_has(best, X86_FEATURE_XSAVEC)) {
+			u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
+
+			best->ebx = xstate_required_size(xstate, true);
+		}
+	}
 
 	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
 	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 33a780fe820b..ab3360a10933 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3776,8 +3776,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 */
 		if (data & ~kvm_caps.supported_xss)
 			return 1;
-		vcpu->arch.ia32_xss = data;
-		kvm_update_cpuid_runtime(vcpu);
+		if (vcpu->arch.ia32_xss != data) {
+			vcpu->arch.ia32_xss = data;
+			kvm_update_cpuid_runtime(vcpu);
+		}
 		break;
 	case MSR_SMI_COUNT:
 		if (!msr_info->host_initiated)
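From inside a guest, the recalculated value is visible directly; a minimal
sketch reading CPUID(EAX=0DH,ECX=1).EBX with GCC/Clang's <cpuid.h> helper:

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

            /* CPUID(0xD, 1).EBX: bytes needed for all XCR0 | IA32_XSS states */
            if (!__get_cpuid_count(0xD, 1, &eax, &ebx, &ecx, &edx))
                    return 1;

            printf("required xsave buffer size: %u bytes\n", ebx);
            return 0;
    }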
From patchwork Fri Apr 21 13:46:02 2023
From: Yang Weijiang
Subject: [PATCH v2 08/21] KVM:x86: Init kvm_caps.supported_xss with supported feature bits
Date: Fri, 21 Apr 2023 09:46:02 -0400
Message-Id: <20230421134615.62539-9-weijiang.yang@intel.com>

Initialize kvm_caps.supported_xss with the host XSS MSR value ANDed with
KVM_SUPPORTED_XSS. KVM_SUPPORTED_XSS holds all potentially supportable
feature bits; the result represents all feature bits KVM actually
supports, and is used when swapping guest and host FPU contents.

Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/vmx/vmx.c | 1 -
 arch/x86/kvm/x86.c     | 6 +++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..c872a5aafa50 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7806,7 +7806,6 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_UMIP);
 
 	/* CPUID 0xD.1 */
-	kvm_caps.supported_xss = 0;
 	if (!cpu_has_vmx_xsaves())
 		kvm_cpu_cap_clear(X86_FEATURE_XSAVES);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab3360a10933..d2975ca96ac5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -223,6 +223,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)
 
+#define KVM_SUPPORTED_XSS	0
+
 u64 __read_mostly host_efer;
 EXPORT_SYMBOL_GPL(host_efer);
 
@@ -9472,8 +9474,10 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 
 	rdmsrl_safe(MSR_EFER, &host_efer);
 
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
+	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		rdmsrl(MSR_IA32_XSS, host_xss);
+		kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS;
+	}
 
 	kvm_init_pmu_capability(ops->pmu_ops);
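Nothing sets a bit in KVM_SUPPORTED_XSS yet, so kvm_caps.supported_xss
remains zero for now. As a sketch of where the series is headed
(hypothetical here, not part of this patch), advertising CET user state
would simply add its xfeature bit to the mask:

    /* Hypothetical follow-up change, not in this patch: */
    #define KVM_SUPPORTED_XSS	XFEATURE_MASK_CET_USER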
From patchwork Fri Apr 21 13:46:03 2023
From: Yang Weijiang
Subject: [PATCH v2 09/21] KVM:x86: Load guest FPU state when accessing xsaves-managed MSRs
Date: Fri, 21 Apr 2023 09:46:03 -0400
Message-Id: <20230421134615.62539-10-weijiang.yang@intel.com>

From: Sean Christopherson

Load the guest's FPU state if userspace is accessing MSRs whose values are
managed by XSAVES so that the MSR helpers, e.g. kvm_{get,set}_xsave_msr(),
can simply do {RD,WR}MSR to access the guest's value. If new feature MSRs
supported in XSS are passed through to the guest they are saved and
restored by XSAVES/XRSTORS, i.e. in the guest's FPU state.

Because __msr_io() is also used for the KVM_GET_MSRS device ioctl(),
explicitly check that @vcpu is non-null before attempting to load guest
state. The XSS supporting MSRs cannot be retrieved via the device ioctl()
without loading guest FPU state (which doesn't exist).

Note that guest_cpuid_has() is not queried as host userspace is allowed to
access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.
Signed-off-by: Sean Christopherson
Co-developed-by: Yang Weijiang
Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/x86.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d2975ca96ac5..7788646bbf1f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -130,6 +130,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 
 static DEFINE_MUTEX(vendor_module_lock);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 
 #define KVM_X86_OP(func)					     \
@@ -4336,6 +4339,21 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static const u32 xsave_msrs[] = {
+	MSR_IA32_U_CET, MSR_IA32_PL3_SSP,
+};
+
+static bool is_xsaves_msr(u32 index)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(xsave_msrs); i++) {
+		if (index == xsave_msrs[i])
+			return true;
+	}
+	return false;
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -4346,11 +4364,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
 	int i;
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
+		    is_xsaves_msr(entries[i].index)) {
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
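With this in place, host userspace can reach such MSRs through the ordinary
vCPU MSR interface; a rough sketch (vCPU fd setup and error handling
elided; 0x6a7 is MSR_IA32_PL3_SSP per the SDM):

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    #define MSR_IA32_PL3_SSP 0x6a7

    /* Read the guest's user-mode shadow stack pointer for a given vCPU fd */
    static __u64 get_guest_pl3_ssp(int vcpu_fd)
    {
            struct {
                    struct kvm_msrs hdr;
                    struct kvm_msr_entry entry;
            } msrs;

            memset(&msrs, 0, sizeof(msrs));
            msrs.hdr.nmsrs = 1;
            msrs.entry.index = MSR_IA32_PL3_SSP;

            /* KVM_GET_MSRS returns the number of MSRs successfully read */
            ioctl(vcpu_fd, KVM_GET_MSRS, &msrs);
            return msrs.entry.data;
    }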
From patchwork Fri Apr 21 13:46:04 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com
Subject: [PATCH v2 10/21] KVM:x86: Add #CP support in guest exception classification
Date: Fri, 21 Apr 2023 09:46:04 -0400
Message-Id: <20230421134615.62539-11-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Add handling for Control Protection (#CP) exceptions (vector 21). The new vector was introduced for Intel's Control-Flow Enforcement Technology (CET) to report CET-related violations. See Intel's SDM for details.

Signed-off-by: Yang Weijiang
---
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/x86.c              | 10 +++++++---
 arch/x86/kvm/x86.h              | 13 ++++++++++---
 4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 7f467fe05d42..1c002abe2be8 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -33,6 +33,7 @@
 #define MC_VECTOR 18
 #define XM_VECTOR 19
 #define VE_VECTOR 20
+#define CP_VECTOR 21

 /* Select x86 specific features in <linux/kvm.h> */
 #define __KVM_HAVE_PIT
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 96ede74a6067..7bc62cd72748 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2850,7 +2850,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
     /* VM-entry interruption-info field: deliver error code */
     should_have_error_code =
         intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
-        x86_exception_has_error_code(vector);
+        x86_exception_has_error_code(vcpu, vector);
     if (CC(has_error_code != should_have_error_code))
         return -EINVAL;

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7788646bbf1f..a768cbf3fbb7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -520,11 +520,15 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault);
 #define EXCPT_CONTRIBUTORY  1
 #define EXCPT_PF            2

-static int exception_class(int vector)
+static int exception_class(struct kvm_vcpu *vcpu, int vector)
 {
     switch (vector) {
     case PF_VECTOR:
         return EXCPT_PF;
+    case CP_VECTOR:
+        if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET)
+            return EXCPT_BENIGN;
+        return EXCPT_CONTRIBUTORY;
     case DE_VECTOR:
     case TS_VECTOR:
     case NP_VECTOR:
@@ -707,8 +711,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
         kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
         return;
     }
-    class1 = exception_class(prev_nr);
-    class2 = exception_class(nr);
+    class1 = exception_class(vcpu, prev_nr);
+    class2 = exception_class(vcpu, nr);
     if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
         || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
         /*
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c544602d07a3..2ba7c7fc4846 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -171,13 +171,20 @@ static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu)
     return vcpu->arch.guest_state_protected || is_64_bit_mode(vcpu);
 }

-static inline bool x86_exception_has_error_code(unsigned int vector)
+static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu,
+                                                unsigned int vector)
 {
     static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
             BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
-            BIT(PF_VECTOR) | BIT(AC_VECTOR);
+            BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);

-    return (1U << vector) & exception_has_error_code;
+    if (!((1U << vector) & exception_has_error_code))
+        return false;
+
+    if (vector == CP_VECTOR)
+        return !(vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET);
+
+    return true;
 }

 static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
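To make the classification concrete: with CET exposed to the guest, #CP behaves like the other contributory faults for double-fault purposes; with CET hidden, it stays benign and never escalates. A hedged sketch of the merging rule the exception_class() change feeds into (it simply mirrors the condition in kvm_multiple_exception() above, not new logic):

/* Illustrative only: how benign/contributory/PF classes combine. */
static bool escalates_to_double_fault(int class1, int class2)
{
	return (class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) ||
	       (class1 == EXCPT_PF && class2 != EXCPT_BENIGN);
}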
From patchwork Fri Apr 21 13:46:05 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Zhang Yi Z
Subject: [PATCH v2 11/21] KVM:VMX: Introduce CET VMCS fields and control bits
Date: Fri, 21 Apr 2023 09:46:05 -0400
Message-Id: <20230421134615.62539-12-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

CET (Control-flow Enforcement Technology) is a CPU feature used to prevent Return/Jump-Oriented Programming (ROP/JOP) attacks. CET introduces a new exception type, Control Protection (#CP), and two sub-features (SHSTK, IBT) to defend against ROP/JOP style control-flow subversion attacks.

Shadow Stack (SHSTK):
  A shadow stack is a second stack used exclusively for control transfer operations. The shadow stack is separate from the data/normal stack and can be enabled individually in user and kernel mode. When shadow stacks are enabled, CALL pushes the return address on both the data and shadow stack. RET pops the return address from both stacks and compares them. If the return addresses from the two stacks do not match, the processor signals a #CP.

Indirect Branch Tracking (IBT):
  IBT adds a new instruction, ENDBRANCH, that is used to mark valid target addresses of indirect branches (CALL, JMP, ENCLU[EEXIT], etc.). If an indirect branch is executed and the next instruction is _not_ an ENDBRANCH, the processor signals a #CP.

Several new CET MSRs are defined to support CET:
  MSR_IA32_{U,S}_CET: Control the CET settings for user mode and kernel mode respectively.
  MSR_IA32_PL{0,1,2,3}_SSP: Store shadow stack pointers for CPL-0/1/2/3 protection respectively.
  MSR_IA32_INT_SSP_TAB: Stores the base address of the shadow stack pointer table.

Two XSAVES state bits are introduced for CET:
  IA32_XSS:[bit 11]: Controls saving/restoring user mode CET states.
  IA32_XSS:[bit 12]: Controls saving/restoring kernel mode CET states.

Six VMCS fields are introduced for CET:
  {HOST,GUEST}_S_CET: Store CET settings for kernel mode.
  {HOST,GUEST}_SSP: Store the shadow stack pointer of the currently active task/thread.
  {HOST,GUEST}_INTR_SSP_TABLE: Store the base address of the shadow stack pointer table.
If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from the following VMCS fields at VM-exit:
  HOST_S_CET
  HOST_SSP
  HOST_INTR_SSP_TABLE

If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from the following VMCS fields at VM-entry:
  GUEST_S_CET
  GUEST_SSP
  GUEST_INTR_SSP_TABLE

Co-developed-by: Zhang Yi Z
Signed-off-by: Zhang Yi Z
Signed-off-by: Yang Weijiang
---
 arch/x86/include/asm/vmx.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 498dc600bd5c..fe2aff27df8c 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -102,6 +102,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 #define VM_EXIT_PT_CONCEAL_PIP                  0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL             0x02000000
+#define VM_EXIT_LOAD_CET_STATE                  0x10000000

 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR       0x00036dff

@@ -115,6 +116,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 #define VM_ENTRY_PT_CONCEAL_PIP                 0x00020000
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL             0x00040000
+#define VM_ENTRY_LOAD_CET_STATE                 0x00100000

 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR      0x000011ff

@@ -343,6 +345,9 @@ enum vmcs_field {
     GUEST_PENDING_DBG_EXCEPTIONS    = 0x00006822,
     GUEST_SYSENTER_ESP              = 0x00006824,
     GUEST_SYSENTER_EIP              = 0x00006826,
+    GUEST_S_CET                     = 0x00006828,
+    GUEST_SSP                       = 0x0000682a,
+    GUEST_INTR_SSP_TABLE            = 0x0000682c,
     HOST_CR0                        = 0x00006c00,
     HOST_CR3                        = 0x00006c02,
     HOST_CR4                        = 0x00006c04,
@@ -355,6 +360,9 @@ enum vmcs_field {
     HOST_IA32_SYSENTER_EIP          = 0x00006c12,
     HOST_RSP                        = 0x00006c14,
     HOST_RIP                        = 0x00006c16,
+    HOST_S_CET                      = 0x00006c18,
+    HOST_SSP                        = 0x00006c1a,
+    HOST_INTR_SSP_TABLE             = 0x00006c1c
 };

 /*
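As a reference for the MSR contents being virtualized here, a sketch of the IA32_{U,S}_CET bit layout per the SDM. The names follow the kernel's msr-index.h style but should be treated as illustrative, not as definitions from this patch set:

#define CET_SHSTK_EN		BIT_ULL(0)	/* enable shadow stacks */
#define CET_WRSS_EN		BIT_ULL(1)	/* enable WRSS{D,Q} shadow stack writes */
#define CET_ENDBR_EN		BIT_ULL(2)	/* enable indirect branch tracking */
#define CET_LEG_IW_EN		BIT_ULL(3)	/* legacy code page bitmap enable */
#define CET_NO_TRACK_EN		BIT_ULL(4)	/* honor the no-track prefix */
#define CET_SUPPRESS_DISABLE	BIT_ULL(5)
/* bits 9:6 are reserved -- this is what the GENMASK(9, 6) check in a
 * later patch of this series guards against */
#define CET_SUPPRESS		BIT_ULL(10)
#define CET_WAIT_ENDBR		BIT_ULL(11)
/* bits 63:12 hold the legacy code page bitmap base address */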
From patchwork Fri Apr 21 13:46:06 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Sean Christopherson
Subject: [PATCH v2 12/21] KVM:x86: Add fault checks for guest CR4.CET setting
Date: Fri, 21 Apr 2023 09:46:06 -0400
Message-Id: <20230421134615.62539-13-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Check potential faults for CR4.CET setting per the Intel SDM. CR4.CET is the master control bit for the CET features (SHSTK and IBT). In addition to the basic support checks, CET can be enabled if and only if CR0.WP==1, i.e. setting CR4.CET=1 faults if CR0.WP==0, and clearing CR0.WP faults if CR4.CET==1.

Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/x86.c | 6 ++++++
 arch/x86/kvm/x86.h | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a768cbf3fbb7..7cd7f6755acd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -995,6 +995,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
         (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE)))
         return 1;

+    if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
+        return 1;
+
     static_call(kvm_x86_set_cr0)(vcpu, cr0);

     kvm_post_set_cr0(vcpu, old_cr0, cr0);
@@ -1210,6 +1213,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
         return 1;
     }

+    if ((cr4 & X86_CR4_CET) && !(kvm_read_cr0(vcpu) & X86_CR0_WP))
+        return 1;
+
     static_call(kvm_x86_set_cr4)(vcpu, cr4);

     kvm_post_set_cr4(vcpu, old_cr4, cr4);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 2ba7c7fc4846..daadd5330dae 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -536,6 +536,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
         __reserved_bits |= X86_CR4_VMXE;        \
     if (!__cpu_has(__c, X86_FEATURE_PCID))          \
         __reserved_bits |= X86_CR4_PCIDE;       \
+    if (!__cpu_has(__c, X86_FEATURE_SHSTK) &&       \
+        !__cpu_has(__c, X86_FEATURE_IBT))           \
+        __reserved_bits |= X86_CR4_CET;         \
     __reserved_bits;                                \
 })
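From the guest's point of view, the dependency enforced above means the enable order matters. A minimal guest-kernel-style sketch (illustrative; CR0.WP is bit 16 and CR4.CET is bit 23 per the SDM):

unsigned long cr0 = read_cr0();
unsigned long cr4 = __read_cr4();

write_cr0(cr0 | X86_CR0_WP);	/* WP must be set first ...        */
__write_cr4(cr4 | X86_CR4_CET);	/* ... or this write takes a #GP   */

/* Conversely, clearing CR0.WP while CR4.CET=1 also faults. */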
From patchwork Fri Apr 21 13:46:07 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Sean Christopherson
Subject: [PATCH v2 13/21] KVM:VMX: Emulate reads and writes to CET MSRs
Date: Fri, 21 Apr 2023 09:46:07 -0400
Message-Id: <20230421134615.62539-14-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Add support for emulating read and write accesses to CET MSRs. CET MSRs are universally "special" as they are either context switched via dedicated VMCS fields or via XSAVES, i.e. no additional in-memory tracking is needed, but emulated reads/writes are more expensive.
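The XSAVES-managed state referred to here is the CET_USER component (IA32_XSS bit 11), which the SDM defines as two 64-bit fields; the kernel mirrors it as struct cet_user_state (shown for orientation, slightly simplified):

/* XSAVES CET_USER component layout (xfeature bit 11), per the SDM: */
struct cet_user_state {
	u64 user_cet;	/* backs MSR_IA32_U_CET */
	u64 user_ssp;	/* backs MSR_IA32_PL3_SSP */
};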
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/kernel/fpu/core.c |  1 +
 arch/x86/kvm/vmx/vmx.c     | 42 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.h         | 30 +++++++++++++++++++++++++++
 3 files changed, 73 insertions(+)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index f851558b673f..b4e28487882c 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -770,6 +770,7 @@ void fpregs_lock_and_load(void)
     if (test_thread_flag(TIF_NEED_FPU_LOAD))
         fpregs_restore_userregs();
 }
+EXPORT_SYMBOL_GPL(fpregs_lock_and_load);

 #ifdef CONFIG_X86_DEBUG_FPU
 /*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c872a5aafa50..ae816c1c7367 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1955,6 +1955,26 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
     }
 }

+static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
+                                  struct msr_data *msr)
+{
+    if (!kvm_cet_user_supported())
+        return false;
+
+    if (msr->host_initiated)
+        return true;
+
+    if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+        !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+        return false;
+
+    if (msr->index == MSR_IA32_PL3_SSP &&
+        !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+        return false;
+
+    return true;
+}
+
 /*
  * Reads an msr value (of 'msr_info->index') into 'msr_info->data'.
  * Returns 0 on success, non-0 otherwise.
@@ -2093,6 +2113,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         else
             msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
         break;
+    case MSR_IA32_U_CET:
+    case MSR_IA32_PL3_SSP:
+        if (!cet_is_msr_accessible(vcpu, msr_info))
+            return 1;
+        kvm_get_xsave_msr(msr_info);
+        break;
     case MSR_IA32_DEBUGCTLMSR:
         msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
         break;
@@ -2405,6 +2431,22 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         else
             vmx->pt_desc.guest.addr_a[index / 2] = data;
         break;
+    case MSR_IA32_U_CET:
+        if (!cet_is_msr_accessible(vcpu, msr_info))
+            return 1;
+        if ((data & GENMASK(9, 6)) ||
+            is_noncanonical_address(data, vcpu))
+            return 1;
+        kvm_set_xsave_msr(msr_info);
+        break;
+    case MSR_IA32_PL3_SSP:
+        if (!cet_is_msr_accessible(vcpu, msr_info))
+            return 1;
+        if ((data & GENMASK(2, 0)) ||
+            is_noncanonical_address(data, vcpu))
+            return 1;
+        kvm_set_xsave_msr(msr_info);
+        break;
     case MSR_IA32_PERF_CAPABILITIES:
         if (data && !vcpu_to_pmu(vcpu)->version)
             return 1;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index daadd5330dae..52cd02a6bfec 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -2,6 +2,7 @@
 #ifndef ARCH_X86_KVM_X86_H
 #define ARCH_X86_KVM_X86_H

+#include <asm/fpu/api.h>
 #include <linux/kvm_host.h>
 #include <asm/mce.h>
 #include <asm/pvclock.h>
@@ -370,6 +371,16 @@ static inline bool kvm_mpx_supported(void)
         == (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
 }

+/*
+ * Guest CET user-mode state relies on host XSAVES/XRSTORS for save/restore
+ * when the vCPU enters/exits userspace. If the host doesn't support the CET
+ * user bit in the XSS MSR, treat CET user mode as unsupported by KVM.
+ */
+static inline bool kvm_cet_user_supported(void)
+{
+    return !!(kvm_caps.supported_xss & XFEATURE_MASK_CET_USER);
+}
+
 extern unsigned int min_timer_period_us;

 extern bool enable_vmware_backdoor;
@@ -550,4 +561,23 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
              unsigned int port, void *data, unsigned int count,
              int in);

+/*
+ * The guest FPU state was already loaded in __msr_io() after checking the
+ * MSR index.
+ * But the vCPU may have been preempted since then, so disable preemption,
+ * then check and reload the guest FPU state before reading/writing
+ * XSAVES-managed MSRs.
+ */
+static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
+{
+    fpregs_lock_and_load();
+    rdmsrl(msr_info->index, msr_info->data);
+    fpregs_unlock();
+}
+
+static inline void kvm_set_xsave_msr(struct msr_data *msr_info)
+{
+    fpregs_lock_and_load();
+    wrmsrl(msr_info->index, msr_info->data);
+    fpregs_unlock();
+}
+
 #endif

From patchwork Fri Apr 21 13:46:08 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Sean Christopherson
Subject: [PATCH v2 14/21] KVM:VMX: Add a synthetic MSR to allow userspace VMM to access GUEST_SSP
Date: Fri, 21 Apr 2023 09:46:08 -0400
Message-Id: <20230421134615.62539-15-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>
Introduce a host-only synthetic MSR, MSR_KVM_GUEST_SSP, so that the VMM can read/write the guest's SSP, e.g. to migrate CET state. Use a synthetic MSR, e.g. as opposed to a VCPU_REG_, as GUEST_SSP is subject to the same consistency checks as the PL*_SSP MSRs, i.e. can share code.

Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kvm/vmx/vmx.c               | 15 ++++++++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 6e64b27b2c1e..7af465e4e0bd 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -58,6 +58,7 @@
 #define MSR_KVM_ASYNC_PF_INT        0x4b564d06
 #define MSR_KVM_ASYNC_PF_ACK        0x4b564d07
 #define MSR_KVM_MIGRATION_CONTROL   0x4b564d08
+#define MSR_KVM_GUEST_SSP           0x4b564d09

 struct kvm_steal_time {
     __u64 steal;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ae816c1c7367..42211ae40650 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1968,7 +1968,8 @@ static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
         !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
         return false;

-    if (msr->index == MSR_IA32_PL3_SSP &&
+    if ((msr->index == MSR_IA32_PL3_SSP ||
+         msr->index == MSR_KVM_GUEST_SSP) &&
         !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
         return false;

@@ -2115,9 +2116,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         break;
     case MSR_IA32_U_CET:
     case MSR_IA32_PL3_SSP:
+    case MSR_KVM_GUEST_SSP:
         if (!cet_is_msr_accessible(vcpu, msr_info))
             return 1;
-        kvm_get_xsave_msr(msr_info);
+        if (msr_info->index == MSR_KVM_GUEST_SSP)
+            msr_info->data = vmcs_readl(GUEST_SSP);
+        else
+            kvm_get_xsave_msr(msr_info);
         break;
     case MSR_IA32_DEBUGCTLMSR:
         msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
@@ -2440,12 +2445,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         kvm_set_xsave_msr(msr_info);
         break;
     case MSR_IA32_PL3_SSP:
+    case MSR_KVM_GUEST_SSP:
         if (!cet_is_msr_accessible(vcpu, msr_info))
             return 1;
         if ((data & GENMASK(2, 0)) ||
             is_noncanonical_address(data, vcpu))
             return 1;
-        kvm_set_xsave_msr(msr_info);
+        if (msr_index == MSR_KVM_GUEST_SSP)
+            vmcs_writel(GUEST_SSP, data);
+        else
+            kvm_set_xsave_msr(msr_info);
         break;
     case MSR_IA32_PERF_CAPABILITIES:
         if (data && !vcpu_to_pmu(vcpu)->version)
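With the synthetic MSR in place, migrating the SSP reduces to the standard MSR save/restore flow. A hedged sketch, reusing the KVM_GET_MSRS pattern from the earlier example (vcpu_get_msr()/vcpu_set_msr() are hypothetical wrappers around KVM_GET_MSRS/KVM_SET_MSRS, not KVM API):

#define MSR_KVM_GUEST_SSP	0x4b564d09

/* Source side: snapshot the guest shadow-stack pointer ... */
u64 ssp = vcpu_get_msr(src_vcpu_fd, MSR_KVM_GUEST_SSP);

/* ... destination side: write it back before resuming the guest. */
vcpu_set_msr(dst_vcpu_fd, MSR_KVM_GUEST_SSP, ssp);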
From patchwork Fri Apr 21 13:46:09 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Sean Christopherson
Subject: [PATCH v2 15/21] KVM:x86: Report CET MSRs as to-be-saved if CET is supported
Date: Fri, 21 Apr 2023 09:46:09 -0400
Message-Id: <20230421134615.62539-16-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Report all CET MSRs, including the synthetic GUEST_SSP MSR, as to-be-saved, e.g. for migration, if CET is supported by KVM.

Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7cd7f6755acd..95dba3c3df5f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1470,6 +1470,7 @@ static const u32 msrs_to_save_base[] = {
     MSR_IA32_XFD, MSR_IA32_XFD_ERR,
     MSR_IA32_XSS,
+    MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP,
 };

 static const u32 msrs_to_save_pmu[] = {
From patchwork Fri Apr 21 13:46:10 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com
Subject: [PATCH v2 16/21] KVM:x86: Save/Restore GUEST_SSP to/from SMM state save area
Date: Fri, 21 Apr 2023 09:46:10 -0400
Message-Id: <20230421134615.62539-17-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Save GUEST_SSP to the SMM state save area when the guest enters SMM on an SMI, and restore it to the VMCS field when the guest exits SMM.

Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/smm.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c
index b42111a24cc2..c54d3eb2b7e4 100644
--- a/arch/x86/kvm/smm.c
+++ b/arch/x86/kvm/smm.c
@@ -275,6 +275,16 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu,
     enter_smm_save_seg_64(vcpu, &smram->gs, VCPU_SREG_GS);

     smram->int_shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu);
+
+    if (kvm_cet_user_supported()) {
+        struct msr_data msr;
+
+        msr.index = MSR_KVM_GUEST_SSP;
+        msr.host_initiated = true;
+        /* GUEST_SSP is stored in the VMCS at VM-exit. */
+        static_call(kvm_x86_get_msr)(vcpu, &msr);
+        smram->ssp = msr.data;
+    }
 }
 #endif

@@ -565,6 +575,16 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt,
     static_call(kvm_x86_set_interrupt_shadow)(vcpu, 0);
     ctxt->interruptibility = (u8)smstate->int_shadow;

+    if (kvm_cet_user_supported()) {
+        struct msr_data msr;
+
+        msr.index = MSR_KVM_GUEST_SSP;
+        msr.host_initiated = true;
+        msr.data = smstate->ssp;
+        /* Mimic a host_initiated access to bypass the SSP access check. */
+        static_call(kvm_x86_set_msr)(vcpu, &msr);
+    }
+
     return X86EMUL_CONTINUE;
 }
 #endif

From patchwork Fri Apr 21 13:46:11 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Zhang Yi Z, Sean Christopherson
Subject: [PATCH v2 17/21] KVM:VMX: Pass through user CET MSRs to the guest
Date: Fri, 21 Apr 2023 09:46:11 -0400
Message-Id: <20230421134615.62539-18-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Pass through the CET user-mode MSRs when the associated CET component is enabled to improve guest performance. All CET MSRs are context switched, either via dedicated VMCS fields or via XSAVES.
Co-developed-by: Zhang Yi Z
Signed-off-by: Zhang Yi Z
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42211ae40650..1ec7835c3060 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -709,6 +709,9 @@ static bool is_valid_passthrough_msr(u32 msr)
     case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8:
         /* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */
         return true;
+    case MSR_IA32_U_CET:
+    case MSR_IA32_PL3_SSP:
+        return true;
     }

     r = possible_passthrough_msr_slot(msr) != -ENOENT;
@@ -7726,6 +7729,23 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
         vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
 }

+static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_state)
+{
+    return (kvm_caps.supported_xss & xss_state) &&
+           (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+            guest_cpuid_has(vcpu, X86_FEATURE_IBT));
+}
+
+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+    bool incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
+
+    vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, incpt);
+
+    incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+    vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
+}
+
 static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
     struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7793,6 +7813,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)

     /* Refresh #PF interception to account for MAXPHYADDR changes. */
     vmx_update_exception_bitmap(vcpu);
+
+    if (kvm_cet_user_supported())
+        vmx_update_intercept_for_cet_msr(vcpu);
 }

 static u64 vmx_get_perf_capabilities(void)
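The payoff of dropping the intercepts: once the bitmap bits are clear, a guest-kernel access like the following is satisfied entirely in hardware, with no VM-exit (illustrative snippet; CET_SHSTK_EN as in the earlier bit-layout sketch):

u64 u_cet;

rdmsrl(MSR_IA32_U_CET, u_cet);			/* no VM-exit once passed through */
wrmsrl(MSR_IA32_U_CET, u_cet | CET_SHSTK_EN);	/* ditto for the write side */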
From patchwork Fri Apr 21 13:46:12 2023
From: Yang Weijiang
To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Sean Christopherson
Subject: [PATCH v2 18/21] KVM:x86: Enable CET virtualization for VMX and advertise to userspace
Date: Fri, 21 Apr 2023 09:46:12 -0400
Message-Id: <20230421134615.62539-19-weijiang.yang@intel.com>
In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com>

Set the feature bits so that CET capabilities can be seen in the guest via CPUID enumeration. Add CR4.CET bit support in order to allow the guest to set the CET master control bit (CR4.CET). Disable the KVM CET feature if unrestricted_guest is unsupported or disabled, as KVM does not support emulating CET. Don't expose the CET feature if its dependent bits are cleared in the host XSS MSR or if XSAVES isn't supported.

Updating the CET features in common x86 is a little ugly, but there is no clean solution without risking breakage of SVM if SVM hardware ever gains support for CET, e.g. moving everything to common x86 would prematurely expose CET on SVM. The alternative is to put all the logic in VMX, but that means rereading host_xss in VMX and duplicating the XSAVES check across VMX and SVM.

Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/cpuid.c            | 12 ++++++++++--
 arch/x86/kvm/vmx/capabilities.h |  4 ++++
 arch/x86/kvm/vmx/vmx.c          | 19 +++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h          |  6 ++++--
 arch/x86/kvm/x86.c              | 21 ++++++++++++++++++++-
 6 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2865c3cb3501..58e20d5895d1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
               | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
               | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
               | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-              | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+              | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+              | X86_CR4_CET))

 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index dd6d5150d86a..033a2f1a5c3f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -634,7 +634,7 @@ void kvm_set_cpu_caps(void)
         F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
         F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
         F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
-        F(SGX_LC) | F(BUS_LOCK_DETECT)
+        F(SGX_LC) | F(BUS_LOCK_DETECT) | F(SHSTK)
     );
     /* Set LA57 based on hardware capability. */
     if (cpuid_ecx(7) & F(LA57))
@@ -652,7 +652,8 @@ void kvm_set_cpu_caps(void)
         F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
         F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
         F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) |
-        F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D)
+        F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) |
+        F(IBT)
     );

     /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
@@ -665,6 +666,13 @@ void kvm_set_cpu_caps(void)
         kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP);
     if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
         kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
+    /*
+     * The feature bit in boot_cpu_data.x86_capability could have been
+     * cleared due to the ibt=off cmdline option, so add it back if the
+     * CPU supports IBT.
+     */
+    if (cpuid_edx(7) & F(IBT))
+        kvm_cpu_cap_set(X86_FEATURE_IBT);

     kvm_cpu_cap_mask(CPUID_7_1_EAX,
         F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 45162c1bcd8f..85cffeae7f10 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -106,6 +106,10 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
     return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 }

+static inline bool cpu_has_load_cet_ctrl(void)
+{
+    return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE);
+}
 static inline bool cpu_has_vmx_mpx(void)
 {
     return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1ec7835c3060..dec7a8b81388 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2631,6 +2631,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
         { VM_ENTRY_LOAD_IA32_EFER,      VM_EXIT_LOAD_IA32_EFER },
         { VM_ENTRY_LOAD_BNDCFGS,        VM_EXIT_CLEAR_BNDCFGS },
         { VM_ENTRY_LOAD_IA32_RTIT_CTL,  VM_EXIT_CLEAR_IA32_RTIT_CTL },
+        { VM_ENTRY_LOAD_CET_STATE,      VM_EXIT_LOAD_CET_STATE },
     };

     memset(vmcs_conf, 0, sizeof(*vmcs_conf));
@@ -6340,6 +6341,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
     if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0)
         vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest);

+    if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) {
+        pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET));
+        pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP));
+        pr_err("INTR SSP TABLE = 0x%016lx\n",
+               vmcs_readl(GUEST_INTR_SSP_TABLE));
+    }
     pr_err("*** Host State ***\n");
     pr_err("RIP = 0x%016lx  RSP = 0x%016lx\n",
            vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP));
@@ -6417,6 +6424,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
     if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID)
         pr_err("Virtual processor ID = 0x%04x\n",
                vmcs_read16(VIRTUAL_PROCESSOR_ID));
+    if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) {
+        pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET));
+        pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP));
+        pr_err("INTR SSP TABLE = 0x%016lx\n",
+               vmcs_readl(HOST_INTR_SSP_TABLE));
+    }
 }

 /*
@@ -7891,6 +7904,12 @@ static __init void vmx_set_cpu_caps(void)

     if (cpu_has_vmx_waitpkg())
         kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
+
+    if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest) {
+        kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+        kvm_cpu_cap_clear(X86_FEATURE_IBT);
+        kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
+    }
 }

 static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9e66531861cf..5e3ba69006f9 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -493,7 +493,8 @@ static inline u8 vmx_get_rvi(void)
      VM_ENTRY_LOAD_IA32_EFER |                                  \
      VM_ENTRY_LOAD_BNDCFGS |                                    \
      VM_ENTRY_PT_CONCEAL_PIP |                                  \
-     VM_ENTRY_LOAD_IA32_RTIT_CTL)
+     VM_ENTRY_LOAD_IA32_RTIT_CTL |                              \
+     VM_ENTRY_LOAD_CET_STATE)

 #define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS                     \
     (VM_EXIT_SAVE_DEBUG_CONTROLS |                              \
@@ -515,7 +516,8 @@ static inline u8 vmx_get_rvi(void)
      VM_EXIT_LOAD_IA32_EFER |                                   \
      VM_EXIT_CLEAR_BNDCFGS |                                    \
      VM_EXIT_PT_CONCEAL_PIP |                                   \
-     VM_EXIT_CLEAR_IA32_RTIT_CTL)
+     VM_EXIT_CLEAR_IA32_RTIT_CTL |                              \
+     VM_EXIT_LOAD_CET_STATE)

 #define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL              \
     (PIN_BASED_EXT_INTR_MASK |                                  \
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 95dba3c3df5f..ba82b102600d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -226,7 +226,7 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
                 | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
                 | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)

-#define KVM_SUPPORTED_XSS     0
+#define KVM_SUPPORTED_XSS     (XFEATURE_MASK_CET_USER)

 u64 __read_mostly host_efer;
 EXPORT_SYMBOL_GPL(host_efer);
@@ -9525,6 +9525,25 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)

     kvm_ops_update(ops);

+    /*
+     * Check that the CET user bit is still set in kvm_caps.supported_xss;
+     * if not, clear the related capability bits, since the user-mode
+     * features depend on XSAVES support.
+     */
+    if (!kvm_cet_user_supported()) {
+        kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+        kvm_cpu_cap_clear(X86_FEATURE_IBT);
+    }
+
+    /*
+     * If neither SHSTK nor IBT is available in KVM, clear the CET user
+     * bit in kvm_caps.supported_xss so that kvm_cet_user_supported()
+     * returns false.
+     */
+    if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+        !kvm_cpu_cap_has(X86_FEATURE_IBT))
+        kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
+
     for_each_online_cpu(cpu) {
         smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &r, 1);
         if (r < 0)
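After this patch the bits surface to the guest through plain CPUID; a small hedged check that a guest (or a KVM selftest) could run, with leaf/bit positions per the SDM (SHSTK = CPUID.(7,0):ECX[7], IBT = CPUID.(7,0):EDX[20]):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 1;

	printf("SHSTK: %s\n", (ecx & (1u << 7))  ? "yes" : "no");
	printf("IBT:   %s\n", (edx & (1u << 20)) ? "yes" : "no");
	return 0;
}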
d="scan'208";a="344787067" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Apr 2023 09:50:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10687"; a="722817422" X-IronPort-AV: E=Sophos;i="5.99,214,1677571200"; d="scan'208";a="722817422" Received: from embargo.jf.intel.com ([10.165.9.183]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Apr 2023 09:50:44 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com, Sean Christopherson Subject: [PATCH v2 19/21] KVM:nVMX: Enable user CET support for nested VMX Date: Fri, 21 Apr 2023 09:46:13 -0400 Message-Id: <20230421134615.62539-20-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com> References: <20230421134615.62539-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add all CET fields to vmcs12 as L1 KVM touches them when CET is enabled for L2. Pass through CET MSRs to L2 when L1 can support and enumerate the VMCS control bits together with CR4 bit as supported. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/nested.c | 12 ++++++++++-- arch/x86/kvm/vmx/vmcs12.c | 6 ++++++ arch/x86/kvm/vmx/vmcs12.h | 14 +++++++++++++- arch/x86/kvm/vmx/vmx.c | 2 ++ 4 files changed, 31 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 7bc62cd72748..522ac27d2534 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -660,6 +660,13 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_FLUSH_CMD, MSR_TYPE_W); + /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. 
+    nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+                                     MSR_IA32_U_CET, MSR_TYPE_RW);
+
+    nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+                                     MSR_IA32_PL3_SSP, MSR_TYPE_RW);
+
     kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false);

     vmx->nested.force_msr_bitmap_recalc = false;
@@ -6785,7 +6792,7 @@ static void nested_vmx_setup_exit_ctls(struct vmcs_config *vmcs_conf,
         VM_EXIT_HOST_ADDR_SPACE_SIZE |
 #endif
         VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
-        VM_EXIT_CLEAR_BNDCFGS;
+        VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_CET_STATE;
     msrs->exit_ctls_high |=
         VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
         VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
@@ -6807,7 +6814,8 @@ static void nested_vmx_setup_entry_ctls(struct vmcs_config *vmcs_conf,
 #ifdef CONFIG_X86_64
         VM_ENTRY_IA32E_MODE |
 #endif
-        VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
+        VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
+        VM_ENTRY_LOAD_CET_STATE;
     msrs->entry_ctls_high |=
         (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER |
          VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index 106a72c923ca..4233b5ca9461 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] = {
     FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
     FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp),
     FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip),
+    FIELD(GUEST_S_CET, guest_s_cet),
+    FIELD(GUEST_SSP, guest_ssp),
+    FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl),
     FIELD(HOST_CR0, host_cr0),
     FIELD(HOST_CR3, host_cr3),
     FIELD(HOST_CR4, host_cr4),
@@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] = {
     FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
     FIELD(HOST_RSP, host_rsp),
     FIELD(HOST_RIP, host_rip),
+    FIELD(HOST_S_CET, host_s_cet),
+    FIELD(HOST_SSP, host_ssp),
+    FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl),
 };
 const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets);
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 01936013428b..3884489e7f7e 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -117,7 +117,13 @@ struct __packed vmcs12 {
     natural_width host_ia32_sysenter_eip;
     natural_width host_rsp;
     natural_width host_rip;
-    natural_width paddingl[8]; /* room for future expansion */
+    natural_width host_s_cet;
+    natural_width host_ssp;
+    natural_width host_ssp_tbl;
+    natural_width guest_s_cet;
+    natural_width guest_ssp;
+    natural_width guest_ssp_tbl;
+    natural_width paddingl[2]; /* room for future expansion */
     u32 pin_based_vm_exec_control;
     u32 cpu_based_vm_exec_control;
     u32 exception_bitmap;
@@ -292,6 +298,12 @@ static inline void vmx_check_vmcs12_offsets(void)
     CHECK_OFFSET(host_ia32_sysenter_eip, 656);
     CHECK_OFFSET(host_rsp, 664);
     CHECK_OFFSET(host_rip, 672);
+    CHECK_OFFSET(host_s_cet, 680);
+    CHECK_OFFSET(host_ssp, 688);
+    CHECK_OFFSET(host_ssp_tbl, 696);
+    CHECK_OFFSET(guest_s_cet, 704);
+    CHECK_OFFSET(guest_ssp, 712);
+    CHECK_OFFSET(guest_ssp_tbl, 720);
     CHECK_OFFSET(pin_based_vm_exec_control, 744);
     CHECK_OFFSET(cpu_based_vm_exec_control, 748);
     CHECK_OFFSET(exception_bitmap, 752);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dec7a8b81388..db4aacbcba7f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7669,6 +7669,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
     cr4_fixed1_update(X86_CR4_PKE,        ecx, feature_bit(PKU));
feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); + cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT)); #undef cr4_fixed1_update } From patchwork Fri Apr 21 13:46:14 2023 X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13220383 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com Subject: [PATCH v2 20/21] KVM:x86: Enable supervisor IBT support for guest Date: Fri, 21 Apr 2023 09:46:14 -0400 Message-Id: <20230421134615.62539-21-weijiang.yang@intel.com> In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com> References: <20230421134615.62539-1-weijiang.yang@intel.com> Add support for MSR_IA32_S_CET and GUEST_S_CET access. The mainline Linux kernel now supports supervisor IBT for kernel code. To make supervisor IBT work in a guest (or nested guest), pass MSR_IA32_S_CET through to the guest (or nested guest) when both the host kernel and KVM have IBT enabled. Note, supervisor IBT works independently of host XSAVES support, because the guest's MSR_IA32_S_CET is stored to and loaded from the VMCS GUEST_S_CET field.
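For clarity, the MSR-read routing this patch introduces can be condensed as below. This is an illustrative sketch only, not code from the patch: cet_msr_read_route() is a made-up name, and the actual change extends the existing switch in vmx_get_msr() as shown in the diff that follows.

/*
 * Illustrative sketch: how CET MSR reads are routed after this patch.
 * MSR_KVM_GUEST_SSP and MSR_IA32_S_CET live in the VMCS and are read
 * with vmcs_readl(); MSR_IA32_U_CET and MSR_IA32_PL3_SSP are
 * XSAVE-managed and come from the guest FPU state.
 */
static int cet_msr_read_route(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
	if (!cet_is_msr_accessible(vcpu, msr_info))
		return 1;	/* access rejected, MSR not exposed to this guest */

	switch (msr_info->index) {
	case MSR_KVM_GUEST_SSP:
		msr_info->data = vmcs_readl(GUEST_SSP);
		break;
	case MSR_IA32_S_CET:
		/* Saved/loaded by VM-exit/VM-entry, independent of XSAVES. */
		msr_info->data = vmcs_readl(GUEST_S_CET);
		break;
	default:	/* MSR_IA32_U_CET, MSR_IA32_PL3_SSP */
		kvm_get_xsave_msr(msr_info);
		break;
	}
	return 0;
}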
Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/nested.c | 3 +++ arch/x86/kvm/vmx/vmx.c | 21 ++++++++++++++++++--- arch/x86/kvm/x86.c | 1 + 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 522ac27d2534..bf690827bfee 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -664,6 +664,9 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_U_CET, MSR_TYPE_RW); + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_S_CET, MSR_TYPE_RW); + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_PL3_SSP, MSR_TYPE_RW); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index db4aacbcba7f..6eab3e452bbb 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -711,6 +711,7 @@ static bool is_valid_passthrough_msr(u32 msr) return true; case MSR_IA32_U_CET: case MSR_IA32_PL3_SSP: + case MSR_IA32_S_CET: return true; } @@ -1961,7 +1962,8 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr) static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) { - if (!kvm_cet_user_supported()) + if (!kvm_cet_user_supported() && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) return false; if (msr->host_initiated) @@ -1971,6 +1973,9 @@ static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu, !guest_cpuid_has(vcpu, X86_FEATURE_IBT)) return false; + if (msr->index == MSR_IA32_S_CET) + return guest_cpuid_has(vcpu, X86_FEATURE_IBT); + if ((msr->index == MSR_IA32_PL3_SSP || msr->index == MSR_KVM_GUEST_SSP) && !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) @@ -2120,10 +2125,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_IA32_U_CET: case MSR_IA32_PL3_SSP: case MSR_KVM_GUEST_SSP: + case MSR_IA32_S_CET: if (!cet_is_msr_accessible(vcpu, msr_info)) return 1; if (msr_info->index == MSR_KVM_GUEST_SSP) msr_info->data = vmcs_readl(GUEST_SSP); + else if (msr_info->index == MSR_IA32_S_CET) + msr_info->data = vmcs_readl(GUEST_S_CET); else kvm_get_xsave_msr(msr_info); break; @@ -2440,12 +2448,16 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vmx->pt_desc.guest.addr_a[index / 2] = data; break; case MSR_IA32_U_CET: + case MSR_IA32_S_CET: if (!cet_is_msr_accessible(vcpu, msr_info)) return 1; if ((data & GENMASK(9, 6)) || is_noncanonical_address(data, vcpu)) return 1; - kvm_set_xsave_msr(msr_info); + if (msr_index == MSR_IA32_S_CET) + vmcs_writel(GUEST_S_CET, data); + else + kvm_set_xsave_msr(msr_info); break; case MSR_IA32_PL3_SSP: case MSR_KVM_GUEST_SSP: @@ -7759,6 +7771,9 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt); + + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_IBT); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt); } static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) @@ -7829,7 +7844,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. 
*/ vmx_update_exception_bitmap(vcpu); - if (kvm_cet_user_supported()) + if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT)) vmx_update_intercept_for_cet_msr(vcpu); } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ba82b102600d..51fccbd2d3e7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1471,6 +1471,7 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP, + MSR_IA32_S_CET, }; static const u32 msrs_to_save_pmu[] = { From patchwork Fri Apr 21 13:46:15 2023 X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13220386 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, john.allen@amd.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com, weijiang.yang@intel.com Subject: [PATCH v2 21/21] KVM:x86: Support CET supervisor shadow stack MSR access Date: Fri, 21 Apr 2023 09:46:15 -0400 Message-Id: <20230421134615.62539-22-weijiang.yang@intel.com> In-Reply-To: <20230421134615.62539-1-weijiang.yang@intel.com> References: <20230421134615.62539-1-weijiang.yang@intel.com> Add MSR access interfaces for the supervisor shadow stack MSRs, i.e., MSR_IA32_PL{0,1,2}_SSP and MSR_IA32_INT_SSP_TAB, and pass them through to {L1,L2} guests when
{L0,L1} KVM supports supervisor shadow stack. Note, supervisor shadow stack is currently not supported on Intel platforms, i.e., VMX always clears CPUID(EAX=07H,ECX=1).EDX[bit 18]. The main purpose of this patch is to make it easier for AMD folks to enable supervisor shadow stack on their platforms. Signed-off-by: Yang Weijiang --- arch/x86/kvm/cpuid.h | 6 +++++ arch/x86/kvm/vmx/nested.c | 12 +++++++++ arch/x86/kvm/vmx/vmx.c | 51 ++++++++++++++++++++++++++++++++++----- 3 files changed, 63 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index b1658c0de847..019a16b25b88 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -232,4 +232,10 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu, return vcpu->arch.pv_cpuid.features & (1u << kvm_feature); } +static __always_inline bool kvm_cet_kernel_shstk_supported(void) +{ + return !IS_ENABLED(CONFIG_KVM_INTEL) && + kvm_cpu_cap_has(X86_FEATURE_SHSTK); +} + #endif diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index bf690827bfee..aaaae92dc9f6 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -670,6 +670,18 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_PL3_SSP, MSR_TYPE_RW); + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL0_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL1_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL2_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false); vmx->nested.force_msr_bitmap_recalc = false; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6eab3e452bbb..074b618f1a07 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -713,6 +713,9 @@ static bool is_valid_passthrough_msr(u32 msr) case MSR_IA32_PL3_SSP: case MSR_IA32_S_CET: return true; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP: + case MSR_IA32_INT_SSP_TAB: + return true; } r = possible_passthrough_msr_slot(msr) != -ENOENT; @@ -1962,8 +1965,11 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr) static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) { + u64 mask; + if (!kvm_cet_user_supported() && - !kvm_cpu_cap_has(X86_FEATURE_IBT)) + !(kvm_cpu_cap_has(X86_FEATURE_IBT) || + kvm_cpu_cap_has(X86_FEATURE_SHSTK))) return false; if (msr->host_initiated) @@ -1973,15 +1979,27 @@ static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu, !guest_cpuid_has(vcpu, X86_FEATURE_IBT)) return false; + if (msr->index == MSR_IA32_U_CET) + return true; + if (msr->index == MSR_IA32_S_CET) - return guest_cpuid_has(vcpu, X86_FEATURE_IBT); + return guest_cpuid_has(vcpu, X86_FEATURE_IBT) || + kvm_cet_kernel_shstk_supported(); - if ((msr->index == MSR_IA32_PL3_SSP || - msr->index == MSR_KVM_GUEST_SSP) && + if (msr->index == MSR_KVM_GUEST_SSP) + return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + + if (msr->index == MSR_IA32_INT_SSP_TAB) + return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) && + kvm_cet_kernel_shstk_supported(); + + if (msr->index == MSR_IA32_PL3_SSP && !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) return false; - return true; + mask = (msr->index == MSR_IA32_PL3_SSP) ?
XFEATURE_MASK_CET_USER : + XFEATURE_MASK_CET_KERNEL; + return !!(kvm_caps.supported_xss & mask); } /* @@ -2135,6 +2153,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else kvm_get_xsave_msr(msr_info); break; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP: + case MSR_IA32_INT_SSP_TAB: + if (!cet_is_msr_accessible(vcpu, msr_info)) + return 1; + kvm_get_xsave_msr(msr_info); + break; case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); break; @@ -2471,6 +2495,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else kvm_set_xsave_msr(msr_info); break; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP: + case MSR_IA32_INT_SSP_TAB: + if (!cet_is_msr_accessible(vcpu, msr_info)) + return 1; + kvm_set_xsave_msr(msr_info); + break; case MSR_IA32_PERF_CAPABILITIES: if (data && !vcpu_to_pmu(vcpu)->version) return 1; @@ -7774,6 +7804,14 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_IBT); vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt); + + incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL); + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, incpt); } static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) @@ -7844,7 +7882,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); - if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT)) + if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT) || + kvm_cpu_cap_has(X86_FEATURE_SHSTK)) vmx_update_intercept_for_cet_msr(vcpu); }
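For reference, the per-MSR visibility that falls out of cet_is_msr_accessible() after this patch can be condensed as below. This is an illustrative summary only (cet_msr_visible_to_guest() is a made-up name); the authoritative logic is the function above, which additionally allows host-initiated accesses early and first gates on the host/KVM CET capabilities.

/*
 * Illustrative sketch: per-MSR guest visibility after this patch,
 * omitting the msr->host_initiated and global capability checks that
 * the real cet_is_msr_accessible() performs first.
 */
static bool cet_msr_visible_to_guest(struct kvm_vcpu *vcpu, u32 index)
{
	switch (index) {
	case MSR_IA32_U_CET:
		return true;
	case MSR_IA32_S_CET:
		return guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
		       kvm_cet_kernel_shstk_supported();
	case MSR_KVM_GUEST_SSP:
		return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
	case MSR_IA32_INT_SSP_TAB:
		return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
		       kvm_cet_kernel_shstk_supported();
	case MSR_IA32_PL3_SSP:
		return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
		       !!(kvm_caps.supported_xss & XFEATURE_MASK_CET_USER);
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP:
		return !!(kvm_caps.supported_xss & XFEATURE_MASK_CET_KERNEL);
	default:
		return false;
	}
}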