From patchwork Thu May 11 04:08:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6E88BC77B7F for ; Thu, 11 May 2023 07:13:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237287AbjEKHN4 (ORCPT ); Thu, 11 May 2023 03:13:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47556 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237416AbjEKHNp (ORCPT ); Thu, 11 May 2023 03:13:45 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61A1B2D64; Thu, 11 May 2023 00:13:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789221; x=1715325221; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9rCJFBWLMwkBw9HBuP/Kc52dFcfvwQjY//PWWGqH99k=; b=nL3kYCtJZU7xl2UfJDbNCXQyKxi60qUuPPgHeWTKO1UXrCABTzfutHKQ zBGttV75gMcSD1lxifNTUqn+T2+XaGuZSOiFygEoNiRGfEsmu57W7L2Ai DDURLnkMRnPy69e3ItXg16GXz9PUHWFYR/hL6tcQ1VK6c3UNv2Jc4VwB4 GqMQ5AQtMm5FQ0xcuFrLOstUXeLTRKH9/qyusDz9CCYSL2pnyRAPeG0Vl dx8h/H25kZ+KUizdUdCEcWnGJtme5fFPkU5TL5FrsBHOOj01M06umudIQ dgIMvDkZycHyourlDQBS6DNnT5TiueS8uNbFDSoplHMnYKHlF5XB6Ip+M g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896592" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896592" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512341" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512341" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:22 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Yu-cheng Yu , Borislav Petkov , Kees Cook , Pengfei Xu Subject: [PATCH v3 01/21] x86/shstk: Add Kconfig option for shadow stack Date: Thu, 11 May 2023 00:08:37 -0400 Message-Id: <20230511040857.6094-2-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Rick Edgecombe Shadow stack provides protection for applications against function return address corruption. It is active when the processor supports it, the kernel has CONFIG_X86_SHADOW_STACK enabled, and the application is built for the feature. This is only implemented for the 64-bit kernel. When it is enabled, legacy non-shadow stack applications continue to work, but without protection. Since there is another feature that utilizes CET (Kernel IBT) that will share implementation with shadow stacks, create CONFIG_CET to signify that at least one CET feature is configured. Co-developed-by: Yu-cheng Yu Signed-off-by: Yu-cheng Yu Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230319001535.23210-3-rick.p.edgecombe%40intel.com --- arch/x86/Kconfig | 24 ++++++++++++++++++++++++ arch/x86/Kconfig.assembler | 5 +++++ 2 files changed, 29 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index a825bf031f49..f03791b73f9f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1851,6 +1851,11 @@ config CC_HAS_IBT (CC_IS_CLANG && CLANG_VERSION >= 140000)) && \ $(as-instr,endbr64) +config X86_CET + def_bool n + help + CET features configured (Shadow stack or IBT) + config X86_KERNEL_IBT prompt "Indirect Branch Tracking" def_bool y @@ -1858,6 +1863,7 @@ config X86_KERNEL_IBT # https://github.com/llvm/llvm-project/commit/9d7001eba9c4cb311e03cd8cdc231f9e579f2d0f depends on !LD_IS_LLD || LLD_VERSION >= 140000 select OBJTOOL + select X86_CET help Build the kernel with support for Indirect Branch Tracking, a hardware support course-grain forward-edge Control Flow Integrity @@ -1952,6 +1958,24 @@ config X86_SGX If unsure, say N. +config X86_USER_SHADOW_STACK + bool "X86 userspace shadow stack" + depends on AS_WRUSS + depends on X86_64 + select ARCH_USES_HIGH_VMA_FLAGS + select X86_CET + help + Shadow stack protection is a hardware feature that detects function + return address corruption. This helps mitigate ROP attacks. + Applications must be enabled to use it, and old userspace does not + get protection "for free". + + CPUs supporting shadow stacks were first released in 2020. + + See Documentation/x86/shstk.rst for more information. + + If unsure, say N. + config EFI bool "EFI runtime service support" depends on ACPI diff --git a/arch/x86/Kconfig.assembler b/arch/x86/Kconfig.assembler index b88f784cb02e..8ad41da301e5 100644 --- a/arch/x86/Kconfig.assembler +++ b/arch/x86/Kconfig.assembler @@ -24,3 +24,8 @@ config AS_GFNI def_bool $(as-instr,vgf2p8mulb %xmm0$(comma)%xmm1$(comma)%xmm2) help Supported by binutils >= 2.30 and LLVM integrated assembler + +config AS_WRUSS + def_bool $(as-instr,wrussq %rax$(comma)(%rbx)) + help + Supported by binutils >= 2.31 and LLVM integrated assembler From patchwork Thu May 11 04:08:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237546 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 450A6C7EE2A for ; Thu, 11 May 2023 07:13:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237436AbjEKHNq (ORCPT ); Thu, 11 May 2023 03:13:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237287AbjEKHNk (ORCPT ); Thu, 11 May 2023 03:13:40 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4A9F861B4; Thu, 11 May 2023 00:13:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789213; x=1715325213; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kQ6bOp51j+wxWxjJPy3B5NdRPxWxbmqROqWc/HTwHTA=; b=UrHJeIEoJJWzXBLYO5ED+WnxoFODQYsbDvc9KW8u4xXF208pC+Z0P/Q1 mc40EssYvdvTQV3MDZC1hxjKf0QdSJkjIfWPSbhxPkB/uFwC6aYwfWpHK ekEiBwfrir+SDLRz9LMr/1EumRDeDrnVdN6YbXAYGAlTgbgdYEdPBTZRg kKovoGvyNxJmBb35p1KTrELfRv94Mbip5F8To1fSTNKIrZMJ2hBsA0E5e +2N+5OdoF6t/uz6kOFBHCM2ii6SBRLLdHmCrainWb9nsyHMbIXJ2kkoJZ Cw7orLgoGvGUNli7smLssc2wVbSGex1PV4sQFm9wi90D+nLTZ0Pdcgoaj g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896571" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896571" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512343" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512343" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:22 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Yu-cheng Yu , Borislav Petkov , Kees Cook , Pengfei Xu Subject: [PATCH v3 02/21] x86/cpufeatures: Add CPU feature flags for shadow stacks Date: Thu, 11 May 2023 00:08:38 -0400 Message-Id: <20230511040857.6094-3-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Rick Edgecombe The Control-Flow Enforcement Technology contains two related features, one of which is Shadow Stacks. Future patches will utilize this feature for shadow stack support in KVM, so add a CPU feature flags for Shadow Stacks (CPUID.(EAX=7,ECX=0):ECX[bit 7]). To protect shadow stack state from malicious modification, the registers are only accessible in supervisor mode. This implementation context-switches the registers with XSAVES. Make X86_FEATURE_SHSTK depend on XSAVES. The shadow stack feature, enumerated by the CPUID bit described above, encompasses both supervisor and userspace support for shadow stack. In near future patches, only userspace shadow stack will be enabled. In expectation of future supervisor shadow stack support, create a software CPU capability to enumerate kernel utilization of userspace shadow stack support. This user shadow stack bit should depend on the HW "shstk" capability and that logic will be implemented in future patches. Co-developed-by: Yu-cheng Yu Signed-off-by: Yu-cheng Yu Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230319001535.23210-4-rick.p.edgecombe%40intel.com --- arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/include/asm/disabled-features.h | 8 +++++++- arch/x86/kernel/cpu/cpuid-deps.c | 1 + 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 97327a1e3aff..3993ea7c6312 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -308,6 +308,7 @@ #define X86_FEATURE_MSR_TSX_CTRL (11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */ #define X86_FEATURE_SMBA (11*32+21) /* "" Slow Memory Bandwidth Allocation */ #define X86_FEATURE_BMEC (11*32+22) /* "" Bandwidth Monitoring Event Configuration */ +#define X86_FEATURE_USER_SHSTK (11*32+23) /* Shadow stack support for user mode applications */ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ @@ -379,6 +380,7 @@ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ #define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */ #define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */ +#define X86_FEATURE_SHSTK (16*32+ 7) /* "" Shadow stack */ #define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */ #define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */ #define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 5dfa4fb76f4b..505f78ddca82 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -99,6 +99,12 @@ # define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31)) #endif +#ifdef CONFIG_X86_USER_SHADOW_STACK +#define DISABLE_USER_SHSTK 0 +#else +#define DISABLE_USER_SHSTK (1 << (X86_FEATURE_USER_SHSTK & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -114,7 +120,7 @@ #define DISABLED_MASK9 (DISABLE_SGX) #define DISABLED_MASK10 0 #define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \ - DISABLE_CALL_DEPTH_TRACKING) + DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK) #define DISABLED_MASK12 0 #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index f6748c8bd647..e462c1d3800a 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -81,6 +81,7 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_XFD, X86_FEATURE_XSAVES }, { X86_FEATURE_XFD, X86_FEATURE_XGETBV1 }, { X86_FEATURE_AMX_TILE, X86_FEATURE_XFD }, + { X86_FEATURE_SHSTK, X86_FEATURE_XSAVES }, {} }; From patchwork Thu May 11 04:08:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237569 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43CBBC77B7F for ; Thu, 11 May 2023 07:16:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237425AbjEKHQY (ORCPT ); Thu, 11 May 2023 03:16:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237678AbjEKHPu (ORCPT ); Thu, 11 May 2023 03:15:50 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5BB88A53; Thu, 11 May 2023 00:14:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789291; x=1715325291; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=R1cGvPwI+TvLSQM2okFg1qOQiQqzPO7kVX+ps6fOpj0=; b=OrHdLJ17WdbjwzheHW0iQ9kLBL68Y3nhsMogeTutdGDxCchl8V9HS9Dr MnXh9tWBmYadCMAYC5xSsg69YQvYAjCJKGLNfpRHPUi05HvMUPPd/onGX 4ebW71g9eNfl7lLdegmO+nHMgwLdveov7ARzYMHfRD1hO4Ben+EE3v8AW j26K/uwZ2uzkON/ASKKAMKMumdHsYyJlB9/hgIIz9X6plJnWQ4RRNM0tw f57B4Pbn1DWKxtnYsqKV+nj4/N5ONMuVHM44opsdeI2DnIDjLFdBjuhSP 5ALov2Tmb7nBPH7PfLu7YGPmzsR+g4dhD6LYlrao7QYq5arT+yZXUT53q w==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896657" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896657" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512349" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512349" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:22 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Yu-cheng Yu , Borislav Petkov , Kees Cook , Pengfei Xu Subject: [PATCH v3 03/21] x86/cpufeatures: Enable CET CR4 bit for shadow stack Date: Thu, 11 May 2023 00:08:39 -0400 Message-Id: <20230511040857.6094-4-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Rick Edgecombe Setting CR4.CET is a prerequisite for utilizing any CET features, most of which also require setting MSRs. Kernel IBT already enables the CET CR4 bit when it detects IBT HW support and is configured with kernel IBT. However, future patches that enable userspace shadow stack support will need the bit set as well. So change the logic to enable it in either case. Clear MSR_IA32_U_CET in cet_disable() so that it can't live to see userspace in a new kexec-ed kernel that has CR4.CET set from kernel IBT. Co-developed-by: Yu-cheng Yu Signed-off-by: Yu-cheng Yu Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230319001535.23210-5-rick.p.edgecombe%40intel.com --- arch/x86/kernel/cpu/common.c | 35 +++++++++++++++++++++++++++-------- 1 file changed, 27 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 8cd4126d8253..cc686e5039be 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -600,27 +600,43 @@ __noendbr void ibt_restore(u64 save) static __always_inline void setup_cet(struct cpuinfo_x86 *c) { - u64 msr = CET_ENDBR_EN; + bool user_shstk, kernel_ibt; - if (!HAS_KERNEL_IBT || - !cpu_feature_enabled(X86_FEATURE_IBT)) + if (!IS_ENABLED(CONFIG_X86_CET)) return; - wrmsrl(MSR_IA32_S_CET, msr); + kernel_ibt = HAS_KERNEL_IBT && cpu_feature_enabled(X86_FEATURE_IBT); + user_shstk = cpu_feature_enabled(X86_FEATURE_SHSTK) && + IS_ENABLED(CONFIG_X86_USER_SHADOW_STACK); + + if (!kernel_ibt && !user_shstk) + return; + + if (user_shstk) + set_cpu_cap(c, X86_FEATURE_USER_SHSTK); + + if (kernel_ibt) + wrmsrl(MSR_IA32_S_CET, CET_ENDBR_EN); + else + wrmsrl(MSR_IA32_S_CET, 0); + cr4_set_bits(X86_CR4_CET); - if (!ibt_selftest()) { + if (kernel_ibt && !ibt_selftest()) { pr_err("IBT selftest: Failed!\n"); wrmsrl(MSR_IA32_S_CET, 0); setup_clear_cpu_cap(X86_FEATURE_IBT); - return; } } __noendbr void cet_disable(void) { - if (cpu_feature_enabled(X86_FEATURE_IBT)) - wrmsrl(MSR_IA32_S_CET, 0); + if (!(cpu_feature_enabled(X86_FEATURE_IBT) || + cpu_feature_enabled(X86_FEATURE_SHSTK))) + return; + + wrmsrl(MSR_IA32_S_CET, 0); + wrmsrl(MSR_IA32_U_CET, 0); } /* @@ -1482,6 +1498,9 @@ static void __init cpu_parse_early_param(void) if (cmdline_find_option_bool(boot_command_line, "noxsaves")) setup_clear_cpu_cap(X86_FEATURE_XSAVES); + if (cmdline_find_option_bool(boot_command_line, "nousershstk")) + setup_clear_cpu_cap(X86_FEATURE_USER_SHSTK); + arglen = cmdline_find_option(boot_command_line, "clearcpuid", arg, sizeof(arg)); if (arglen <= 0) return; From patchwork Thu May 11 04:08:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237558 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D368C7EE24 for ; Thu, 11 May 2023 07:14:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237592AbjEKHO3 (ORCPT ); Thu, 11 May 2023 03:14:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237509AbjEKHOG (ORCPT ); Thu, 11 May 2023 03:14:06 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3060E83FF; Thu, 11 May 2023 00:13:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789232; x=1715325232; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GTZEH+R4nyAfVJ0BU4FvcCzk7JvYYtp630pyrX+sQxs=; b=MP3dkOBl8KAVaw0d9rUoe8FzCGjzJ1RjCmYPRqBjGTRbgcO0fNDN5ila rTKgENBouEyeLOv4VTP/jUubfjsNM85L36WHK0qXZma2nEWNtWhS4tKD0 iwJ2UbbVPOgrxywgfdVdRFpOQzNet02yXCo7f2qrjRlLdZXWx5Qi2osea uLW8G+AOcyBBflPJ8n8tr5Dlb8TCTOi6hLpBndyCSIhEa7a3+u0ogxjH9 Hnb+c8aIAvYihmDwnbm/xhdRMJEh6SWPf89qa3+/Du/S29rVW2txWGLFL GgCU1c3aGZjLegWQ1L1t3hpmtr/0/JbbWGrT6NWh7YoE0xKCgElBvqCy5 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896611" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896611" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512351" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512351" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:22 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Yu-cheng Yu , Borislav Petkov , Kees Cook , Pengfei Xu Subject: [PATCH v3 04/21] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states Date: Thu, 11 May 2023 00:08:40 -0400 Message-Id: <20230511040857.6094-5-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Rick Edgecombe Shadow stack register state can be managed with XSAVE. The registers can logically be separated into two groups: * Registers controlling user-mode operation * Registers controlling kernel-mode operation The architecture has two new XSAVE state components: one for each group of those groups of registers. This lets an OS manage them separately if it chooses. Future patches for host userspace and KVM guests will only utilize the user-mode registers, so only configure XSAVE to save user-mode registers. This state will add 16 bytes to the xsave buffer size. Future patches will use the user-mode XSAVE area to save guest user-mode CET state. However, VMCS includes new fields for guest CET supervisor states. KVM can use these to save and restore guest supervisor state, so host supervisor XSAVE support is not required. Adding this exacerbates the already unwieldy if statement in check_xstate_against_struct() that handles warning about un-implemented xfeatures. So refactor these check's by having XCHECK_SZ() set a bool when it actually check's the xfeature. This ends up exceeding 80 chars, but was better on balance than other options explored. Pass the bool as pointer to make it clear that XCHECK_SZ() can change the variable. While configuring user-mode XSAVE, clarify kernel-mode registers are not managed by XSAVE by defining the xfeature in XFEATURE_MASK_SUPERVISOR_UNSUPPORTED, like is done for XFEATURE_MASK_PT. This serves more of a documentation as code purpose, and functionally, only enables a few safety checks. Both XSAVE state components are supervisor states, even the state controlling user-mode operation. This is a departure from earlier features like protection keys where the PKRU state is a normal user (non-supervisor) state. Having the user state be supervisor-managed ensures there is no direct, unprivileged access to it, making it harder for an attacker to subvert CET. To facilitate this privileged access, define the two user-mode CET MSRs, and the bits defined in those MSRs relevant to future shadow stack enablement patches. Co-developed-by: Yu-cheng Yu Signed-off-by: Yu-cheng Yu Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230319001535.23210-6-rick.p.edgecombe%40intel.com --- arch/x86/include/asm/fpu/types.h | 16 +++++- arch/x86/include/asm/fpu/xstate.h | 6 ++- arch/x86/kernel/fpu/xstate.c | 90 +++++++++++++++---------------- 3 files changed, 61 insertions(+), 51 deletions(-) diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h index 7f6d858ff47a..eb810074f1e7 100644 --- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -115,8 +115,8 @@ enum xfeature { XFEATURE_PT_UNIMPLEMENTED_SO_FAR, XFEATURE_PKRU, XFEATURE_PASID, - XFEATURE_RSRVD_COMP_11, - XFEATURE_RSRVD_COMP_12, + XFEATURE_CET_USER, + XFEATURE_CET_KERNEL_UNUSED, XFEATURE_RSRVD_COMP_13, XFEATURE_RSRVD_COMP_14, XFEATURE_LBR, @@ -138,6 +138,8 @@ enum xfeature { #define XFEATURE_MASK_PT (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR) #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU) #define XFEATURE_MASK_PASID (1 << XFEATURE_PASID) +#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER) +#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL_UNUSED) #define XFEATURE_MASK_LBR (1 << XFEATURE_LBR) #define XFEATURE_MASK_XTILE_CFG (1 << XFEATURE_XTILE_CFG) #define XFEATURE_MASK_XTILE_DATA (1 << XFEATURE_XTILE_DATA) @@ -252,6 +254,16 @@ struct pkru_state { u32 pad; } __packed; +/* + * State component 11 is Control-flow Enforcement user states + */ +struct cet_user_state { + /* user control-flow settings */ + u64 user_cet; + /* user shadow stack pointer */ + u64 user_ssp; +}; + /* * State component 15: Architectural LBR configuration state. * The size of Arch LBR state depends on the number of LBRs (lbr_depth). diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h index cd3dd170e23a..d4427b88ee12 100644 --- a/arch/x86/include/asm/fpu/xstate.h +++ b/arch/x86/include/asm/fpu/xstate.h @@ -50,7 +50,8 @@ #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA /* All currently supported supervisor features */ -#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID) +#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \ + XFEATURE_MASK_CET_USER) /* * A supervisor state component may not always contain valuable information, @@ -77,7 +78,8 @@ * Unsupported supervisor features. When a supervisor feature in this mask is * supported in the future, move it to the supported supervisor feature mask. */ -#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT) +#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \ + XFEATURE_MASK_CET_KERNEL) /* All supervisor states including supported and unsupported states. */ #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 714166cc25f2..13a80521dd51 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -39,26 +39,26 @@ */ static const char *xfeature_names[] = { - "x87 floating point registers" , - "SSE registers" , - "AVX registers" , - "MPX bounds registers" , - "MPX CSR" , - "AVX-512 opmask" , - "AVX-512 Hi256" , - "AVX-512 ZMM_Hi256" , - "Processor Trace (unused)" , + "x87 floating point registers", + "SSE registers", + "AVX registers", + "MPX bounds registers", + "MPX CSR", + "AVX-512 opmask", + "AVX-512 Hi256", + "AVX-512 ZMM_Hi256", + "Processor Trace (unused)", "Protection Keys User registers", "PASID state", - "unknown xstate feature" , - "unknown xstate feature" , - "unknown xstate feature" , - "unknown xstate feature" , - "unknown xstate feature" , - "unknown xstate feature" , - "AMX Tile config" , - "AMX Tile data" , - "unknown xstate feature" , + "Control-flow User registers", + "Control-flow Kernel registers (unused)", + "unknown xstate feature", + "unknown xstate feature", + "unknown xstate feature", + "unknown xstate feature", + "AMX Tile config", + "AMX Tile data", + "unknown xstate feature", }; static unsigned short xsave_cpuid_features[] __initdata = { @@ -73,6 +73,7 @@ static unsigned short xsave_cpuid_features[] __initdata = { [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT, [XFEATURE_PKRU] = X86_FEATURE_PKU, [XFEATURE_PASID] = X86_FEATURE_ENQCMD, + [XFEATURE_CET_USER] = X86_FEATURE_SHSTK, [XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE, [XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE, }; @@ -276,6 +277,7 @@ static void __init print_xstate_features(void) print_xstate_feature(XFEATURE_MASK_Hi16_ZMM); print_xstate_feature(XFEATURE_MASK_PKRU); print_xstate_feature(XFEATURE_MASK_PASID); + print_xstate_feature(XFEATURE_MASK_CET_USER); print_xstate_feature(XFEATURE_MASK_XTILE_CFG); print_xstate_feature(XFEATURE_MASK_XTILE_DATA); } @@ -344,6 +346,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate) XFEATURE_MASK_BNDREGS | \ XFEATURE_MASK_BNDCSR | \ XFEATURE_MASK_PASID | \ + XFEATURE_MASK_CET_USER | \ XFEATURE_MASK_XTILE) /* @@ -446,14 +449,15 @@ static void __init __xstate_dump_leaves(void) } \ } while (0) -#define XCHECK_SZ(sz, nr, nr_macro, __struct) do { \ - if ((nr == nr_macro) && \ - WARN_ONCE(sz != sizeof(__struct), \ - "%s: struct is %zu bytes, cpu state %d bytes\n", \ - __stringify(nr_macro), sizeof(__struct), sz)) { \ +#define XCHECK_SZ(sz, nr, __struct) ({ \ + if (WARN_ONCE(sz != sizeof(__struct), \ + "[%s]: struct is %zu bytes, cpu state %d bytes\n", \ + xfeature_names[nr], sizeof(__struct), sz)) { \ __xstate_dump_leaves(); \ } \ -} while (0) + true; \ +}) + /** * check_xtile_data_against_struct - Check tile data state size. @@ -527,36 +531,28 @@ static bool __init check_xstate_against_struct(int nr) * Ask the CPU for the size of the state. */ int sz = xfeature_size(nr); + /* * Match each CPU state with the corresponding software * structure. */ - XCHECK_SZ(sz, nr, XFEATURE_YMM, struct ymmh_struct); - XCHECK_SZ(sz, nr, XFEATURE_BNDREGS, struct mpx_bndreg_state); - XCHECK_SZ(sz, nr, XFEATURE_BNDCSR, struct mpx_bndcsr_state); - XCHECK_SZ(sz, nr, XFEATURE_OPMASK, struct avx_512_opmask_state); - XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state); - XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM, struct avx_512_hi16_state); - XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state); - XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state); - XCHECK_SZ(sz, nr, XFEATURE_XTILE_CFG, struct xtile_cfg); - - /* The tile data size varies between implementations. */ - if (nr == XFEATURE_XTILE_DATA) - check_xtile_data_against_struct(sz); - - /* - * Make *SURE* to add any feature numbers in below if - * there are "holes" in the xsave state component - * numbers. - */ - if ((nr < XFEATURE_YMM) || - (nr >= XFEATURE_MAX) || - (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) || - ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <= XFEATURE_RSRVD_COMP_16))) { + switch (nr) { + case XFEATURE_YMM: return XCHECK_SZ(sz, nr, struct ymmh_struct); + case XFEATURE_BNDREGS: return XCHECK_SZ(sz, nr, struct mpx_bndreg_state); + case XFEATURE_BNDCSR: return XCHECK_SZ(sz, nr, struct mpx_bndcsr_state); + case XFEATURE_OPMASK: return XCHECK_SZ(sz, nr, struct avx_512_opmask_state); + case XFEATURE_ZMM_Hi256: return XCHECK_SZ(sz, nr, struct avx_512_zmm_uppers_state); + case XFEATURE_Hi16_ZMM: return XCHECK_SZ(sz, nr, struct avx_512_hi16_state); + case XFEATURE_PKRU: return XCHECK_SZ(sz, nr, struct pkru_state); + case XFEATURE_PASID: return XCHECK_SZ(sz, nr, struct ia32_pasid_state); + case XFEATURE_XTILE_CFG: return XCHECK_SZ(sz, nr, struct xtile_cfg); + case XFEATURE_CET_USER: return XCHECK_SZ(sz, nr, struct cet_user_state); + case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true; + default: XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr); return false; } + return true; } From patchwork Thu May 11 04:08:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237571 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4521C77B7F for ; Thu, 11 May 2023 07:16:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237624AbjEKHQv (ORCPT ); Thu, 11 May 2023 03:16:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49014 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237627AbjEKHQJ (ORCPT ); Thu, 11 May 2023 03:16:09 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 748CD9EED; Thu, 11 May 2023 00:15:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789304; x=1715325304; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Hl4XchB3DZgjf8rPLE6ukS73VjGwRhFWU+vSkAdj3h4=; b=SywSI2wxMVop8yRdbBrlXlfNLRkVNLAlBse+IsiDwbld/KBWnhVO6LMB 99ZYqclvGt9tLF2JUxaFnSCtiwWKSj4VUTMsvmysXq11FVENIkNNxbnfX f2tbenZ1MG+/NqpLVd0oyw2/lUAiDeMXYP4Nl15ng5b0KT2rvw4vpMD32 S9sIKlfifapi1Kc587LMpaH+XXPbWSTO61SzIoEEj9YWI3fMrfao9pJW7 CUygQIXvrPMhNO1xV6dX5e0jlLxhc1QXSBbkPSWfUGOd5DzzN92SNflEF od1FvgzFWCpcYjcgBvCF8vEvenyI45x6KYamiZHExk2J+akzqktz+gVRI Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896678" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896678" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512358" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512358" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Thomas Gleixner , Borislav Petkov , Kees Cook , Pengfei Xu Subject: [PATCH v3 05/21] x86/fpu: Add helper for modifying xstate Date: Thu, 11 May 2023 00:08:41 -0400 Message-Id: <20230511040857.6094-6-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Rick Edgecombe Just like user xfeatures, supervisor xfeatures can be active in the registers or present in the task FPU buffer. If the registers are active, the registers can be modified directly. If the registers are not active, the modification must be performed on the task FPU buffer. When the state is not active, the kernel could perform modifications directly to the buffer. But in order for it to do that, it needs to know where in the buffer the specific state it wants to modify is located. Doing this is not robust against optimizations that compact the FPU buffer, as each access would require computing where in the buffer it is. The easiest way to modify supervisor xfeature data is to force restore the registers and write directly to the MSRs. Often times this is just fine anyway as the registers need to be restored before returning to userspace. Do this for now, leaving buffer writing optimizations for the future. Add a new function fpregs_lock_and_load() that can simultaneously call fpregs_lock() and do this restore. Also perform some extra sanity checks in this function since this will be used in non-fpu focused code. Suggested-by: Thomas Gleixner Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230319001535.23210-7-rick.p.edgecombe%40intel.com --- arch/x86/include/asm/fpu/api.h | 9 +++++++++ arch/x86/kernel/fpu/core.c | 18 ++++++++++++++++++ 2 files changed, 27 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 503a577814b2..aadc6893dcaa 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -82,6 +82,15 @@ static inline void fpregs_unlock(void) preempt_enable(); } +/* + * FPU state gets lazily restored before returning to userspace. So when in the + * kernel, the valid FPU state may be kept in the buffer. This function will force + * restore all the fpu state to the registers early if needed, and lock them from + * being automatically saved/restored. Then FPU state can be modified safely in the + * registers, before unlocking with fpregs_unlock(). + */ +void fpregs_lock_and_load(void); + #ifdef CONFIG_X86_DEBUG_FPU extern void fpregs_assert_state_consistent(void); #else diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index caf33486dc5e..f851558b673f 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -753,6 +753,24 @@ void switch_fpu_return(void) } EXPORT_SYMBOL_GPL(switch_fpu_return); +void fpregs_lock_and_load(void) +{ + /* + * fpregs_lock() only disables preemption (mostly). So modifying state + * in an interrupt could screw up some in progress fpregs operation. + * Warn about it. + */ + WARN_ON_ONCE(!irq_fpu_usable()); + WARN_ON_ONCE(current->flags & PF_KTHREAD); + + fpregs_lock(); + + fpregs_assert_state_consistent(); + + if (test_thread_flag(TIF_NEED_FPU_LOAD)) + fpregs_restore_userregs(); +} + #ifdef CONFIG_X86_DEBUG_FPU /* * If current FPU state according to its tracking (loaded FPU context on this From patchwork Thu May 11 04:08:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F27CC77B7F for ; Thu, 11 May 2023 07:14:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237489AbjEKHOa (ORCPT ); Thu, 11 May 2023 03:14:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237458AbjEKHOP (ORCPT ); Thu, 11 May 2023 03:14:15 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6DD9786A0; Thu, 11 May 2023 00:13:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789234; x=1715325234; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zOQMasU7O3ApDraDZEqE3fWIuTTvJXbEyrdSDU+BbJk=; b=TrSPysq+plbHO0ctkh/DjDCKMamBQRloP1c4aHJ+Z/qHhWOAXIPaPjOo 0NIQZgQAzDDVDyHDfzIYP+Y1INE9g09uIgI9ep0RnQ00yTw4yICM1+lzn ItHOeYLirClPedGbNZlmkRnCvZhtNEnxk9gaLxwFrGUZunxU0E5a4Z2oH XSc/eguApjXjCcV4SycgJMGHX9egq9y44N9gp/995MU/+8+y83sgHLuFN VcCJKlTmZ3r2m3bnLQq3otDYUJYuwUmYG7uFsWcmA8g5lNGJxRgklld1o bCKsn7K65U9W1g93R8AgIoofKfFj8vzC3zp0NuuyrdN7VOL7ZQEc0kWNb Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896630" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896630" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512360" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512360" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 06/21] KVM:x86: Report XSS as to-be-saved if there are supported features Date: Thu, 11 May 2023 00:08:42 -0400 Message-Id: <20230511040857.6094-7-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Add MSR_IA32_XSS to the list of MSRs reported to userspace if supported_xss is non-zero, i.e. KVM supports at least one XSS based feature. Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e7f78fe79b32..33a780fe820b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1454,6 +1454,7 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_UMWAIT_CONTROL, MSR_IA32_XFD, MSR_IA32_XFD_ERR, + MSR_IA32_XSS, }; static const u32 msrs_to_save_pmu[] = { From patchwork Thu May 11 04:08:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92350C77B7F for ; Thu, 11 May 2023 07:13:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237424AbjEKHNp (ORCPT ); Thu, 11 May 2023 03:13:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237174AbjEKHNk (ORCPT ); Thu, 11 May 2023 03:13:40 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3DC0D2D6A; Thu, 11 May 2023 00:13:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789212; x=1715325212; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7J0BAjKpZeee0YTws2bhg1lRumDrGh+SHYT9zVcLiQI=; b=ZXxOZbgg6h0AcatxizMAgzefIy8sFAe70lc+skijsFs03JCXvTFYHpJk aNSk99IPDb6H5rC7DW/X+zjairDBnxqmi+l0a9M4GROUw/gSXxqx5HjhV uHSxBhnXZW5Wblo+pL5j8K8sGFBHmt2V1yz0mWSgJ9XGLfkT5fQoRlpHa VV4OohyJkxImzYEy89uTt6w+Dt34gZ6P1JsY8+C3U7uRBq9e9fWMJVbIB 1gBm04D+2uZkKNVm1v8Rp2wFiIWgDHn0kuOCqFF0l9Ee2Rgdvo0ZbKR1K Zr0cM9bldTFMgATKvvLRSNj+o5OojdWGiBVHLiwsy7ORArutN/jFMlpxr w==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896562" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896562" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512362" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512362" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Zhang Yi Z Subject: [PATCH v3 07/21] KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS Date: Thu, 11 May 2023 00:08:43 -0400 Message-Id: <20230511040857.6094-8-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Update CPUID(EAX=0DH,ECX=1) when the guest's XSS is modified. CPUID(EAX=0DH,ECX=1).EBX reports current required storage size for all features enabled via XCR0 | XSS so that guest can allocate correct xsave buffer. Note, KVM does not yet support any XSS based features, i.e. supported_xss is guaranteed to be zero at this time. Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/kvm/cpuid.c | 7 +++++-- arch/x86/kvm/x86.c | 6 ++++-- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 123bf8b97a4b..cbb1b8a65502 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -277,8 +277,11 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e best = cpuid_entry2_find(entries, nent, 0xD, 1); if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) || - cpuid_entry_has(best, X86_FEATURE_XSAVEC))) - best->ebx = xstate_required_size(vcpu->arch.xcr0, true); + cpuid_entry_has(best, X86_FEATURE_XSAVEC))) { + u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss; + + best->ebx = xstate_required_size(xstate, true); + } best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent); if (kvm_hlt_in_guest(vcpu->kvm) && best && diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 33a780fe820b..ab3360a10933 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3776,8 +3776,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) */ if (data & ~kvm_caps.supported_xss) return 1; - vcpu->arch.ia32_xss = data; - kvm_update_cpuid_runtime(vcpu); + if (vcpu->arch.ia32_xss != data) { + vcpu->arch.ia32_xss = data; + kvm_update_cpuid_runtime(vcpu); + } break; case MSR_SMI_COUNT: if (!msr_info->host_initiated) From patchwork Thu May 11 04:08:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237547 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41EC0C77B7C for ; Thu, 11 May 2023 07:13:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237174AbjEKHNs (ORCPT ); Thu, 11 May 2023 03:13:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47490 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237362AbjEKHNn (ORCPT ); Thu, 11 May 2023 03:13:43 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90A421AE; Thu, 11 May 2023 00:13:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789218; x=1715325218; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FUetmM3QAjP8/HE1ypNjPP89xC2eIzQHpTYHL+d52N8=; b=WC5dvaxY8+N82py+3WjCqNQTMqcHJdyasEF0gjjufZQf8QG88CvlVZ/l SbR8QPfyUAq8rAbeizaTHWxeg5c1qt0eUmuLc9quNVym/dFF3k+SpsNa6 0t6ildFDPkjQN0YU464a/KTKKjV1aQc3CHDaFcE8YnT3/lQddJwmrC8W6 wG8GfYPAWSqVrzf//xN0DFrefUSO/5Th04iVDjLo1LlebHCm9S/YNVgw7 uBstNHn/OCuWwzPbhjHwI08KkHjuAIZUbfQEhI9CESnrOBb2ZAdXrAlJH skcP4R+lSV1t4xClGa1CpzL3Lfwlbp170bc8ifRNVPkTcfF8VUUKVssjm g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896583" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896583" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512365" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512365" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:23 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v3 08/21] KVM:x86: Init kvm_caps.supported_xss with supported feature bits Date: Thu, 11 May 2023 00:08:44 -0400 Message-Id: <20230511040857.6094-9-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Initialize kvm_caps.supported_xss with host XSS msr value AND XSS mask. KVM_SUPPORTED_XSS holds all potential supported feature bits, the result represents all KVM supported feature bits which is used for swapping guest and host FPU contents. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 1 - arch/x86/kvm/x86.c | 6 +++++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 44fb619803b8..c872a5aafa50 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7806,7 +7806,6 @@ static __init void vmx_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_UMIP); /* CPUID 0xD.1 */ - kvm_caps.supported_xss = 0; if (!cpu_has_vmx_xsaves()) kvm_cpu_cap_clear(X86_FEATURE_XSAVES); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ab3360a10933..d2975ca96ac5 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -223,6 +223,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) +#define KVM_SUPPORTED_XSS 0 + u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -9472,8 +9474,10 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) rdmsrl_safe(MSR_EFER, &host_efer); - if (boot_cpu_has(X86_FEATURE_XSAVES)) + if (boot_cpu_has(X86_FEATURE_XSAVES)) { rdmsrl(MSR_IA32_XSS, host_xss); + kvm_caps.supported_xss = host_xss & KVM_SUPPORTED_XSS; + } kvm_init_pmu_capability(ops->pmu_ops); From patchwork Thu May 11 04:08:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237557 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DC2EC77B7C for ; Thu, 11 May 2023 07:14:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237567AbjEKHO0 (ORCPT ); Thu, 11 May 2023 03:14:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237513AbjEKHOG (ORCPT ); Thu, 11 May 2023 03:14:06 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B20EC8694; Thu, 11 May 2023 00:13:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789232; x=1715325232; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TvBtgUKn7ic3XTFVWWRTcuCfR4IG367PYPSHuHHSapk=; b=IC+ruKptPSahtdzxj6jjBO+aLtoeIDGo7ms5fotWJHOQC+/v4fhw3bOf UM8SyJPAlQkUkrFJgq5wGNMyx+nUnuVbFsiLK+KoPUUEL9amdD7QevuXO c8ztgwMePHuAckZgGwJdkfazZeRs178rCgvETMvaFiA3TdIfspXdKLUT3 RUS/Bsy5+67VD3QvmcBvNsDX0bT5iLU+QEK/fA9mhxM08zbTPv2VQHJAn BQHLsMPHwyr4pHdMCN6Jc0N0NzagE2wwWgmHuHAfZIhEXdW0xbbMPxih6 pqO1xxQ+jO5tL2He2Sf8am9xPM+SVxsOBlRCrhVyxACNtdl4v9oEwUfE3 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896640" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896640" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512367" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512367" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:24 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 09/21] KVM:x86: Load guest FPU state when accessing xsaves-managed MSRs Date: Thu, 11 May 2023 00:08:45 -0400 Message-Id: <20230511040857.6094-10-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Load the guest's FPU state if userspace is accessing MSRs whose values are managed by XSAVES. Two MSR access helpers, i.e., kvm_{get,set}_xsave_msr(), are introduced by a later patch to facilitate access to this kind of MSRs. If new feature MSRs supported in XSS are passed through to the guest they are saved and restored by {XSAVES|XRSTORS} to/from guest's FPU state at vm-entry/exit. Because the modified code is also used for the KVM_GET_MSRS device ioctl(), explicitly check @vcpu is non-null before attempting to load guest state. The XSS supporting MSRs cannot be retrieved via the device ioctl() without loading guest FPU state (which doesn't exist). Note that guest_cpuid_has() is not queried as host userspace is allowed to access MSRs that have not been exposed to the guest, e.g. it might do KVM_SET_MSRS prior to KVM_SET_CPUID2. Signed-off-by: Sean Christopherson Co-developed-by: Yang Weijiang Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index d2975ca96ac5..7788646bbf1f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -130,6 +130,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); static DEFINE_MUTEX(vendor_module_lock); +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu); +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu); + struct kvm_x86_ops kvm_x86_ops __read_mostly; #define KVM_X86_OP(func) \ @@ -4336,6 +4339,21 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } EXPORT_SYMBOL_GPL(kvm_get_msr_common); +static const u32 xsave_msrs[] = { + MSR_IA32_U_CET, MSR_IA32_PL3_SSP, +}; + +static bool is_xsaves_msr(u32 index) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(xsave_msrs); i++) { + if (index == xsave_msrs[i]) + return true; + } + return false; +} + /* * Read or write a bunch of msrs. All parameters are kernel addresses. * @@ -4346,11 +4364,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs, int (*do_msr)(struct kvm_vcpu *vcpu, unsigned index, u64 *data)) { + bool fpu_loaded = false; int i; - for (i = 0; i < msrs->nmsrs; ++i) + for (i = 0; i < msrs->nmsrs; ++i) { + if (vcpu && !fpu_loaded && kvm_caps.supported_xss && + is_xsaves_msr(entries[i].index)) { + kvm_load_guest_fpu(vcpu); + fpu_loaded = true; + } if (do_msr(vcpu, entries[i].index, &entries[i].data)) break; + } + if (fpu_loaded) + kvm_put_guest_fpu(vcpu); return i; } From patchwork Thu May 11 04:08:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237550 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5773C77B7F for ; Thu, 11 May 2023 07:14:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237535AbjEKHOW (ORCPT ); Thu, 11 May 2023 03:14:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237503AbjEKHOF (ORCPT ); Thu, 11 May 2023 03:14:05 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3032E83FE; Thu, 11 May 2023 00:13:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789232; x=1715325232; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0Gy8eU5+LniX8iosDzLMeEn4dijCELzPhyiVmQLWZWc=; b=a+2mNCrj7hitKNIqaLIou45VVrQOI/BfyYfdRyjrERAY6ArXtD2FJwdY 63YnGL9qaWb+HTba2CWI3dtLRAIF2FEVonsix9TRHelxlwd/76hDAQgk/ +OUKYzblycxPIQy4qpb0fsnfFG0Jf/lffNcqtgDqMGSqV+1LwT8YXUIuX 7wPZdlQ8jeB8cCh/V6IYnnvfdM+VID/QQQoB/srIV2vjBKCuUEHQ/o+LL kfSJu54/VhMW5qiK9tlMVzqjk09AfwglXNEj0gG9ROAIl9ACtQHUgrJNX DjUdxg1wRsZQKC5ntvn68vyiJ6NIPgxDT14mD9DIpK1aSY6UFTTExQDhq g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896624" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896624" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512370" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512370" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:24 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v3 10/21] KVM:x86: Add #CP support in guest exception classification Date: Thu, 11 May 2023 00:08:46 -0400 Message-Id: <20230511040857.6094-11-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add handling for Control Protection (#CP) exceptions(vector 21). The new vector is introduced for Intel's Control-Flow Enforcement Technology (CET) relevant violation cases. Although #CP belongs contributory exception class, but the actual effect is conditional on CET being exposed to guest. If CET is not available to guest, #CP falls back to non-contributory and doesn't have an error code. The rational is used to fix one unit test failure encountered in L1. Although the issue now is fixed in unit test case, keep the handling is reasonable. cr4_guest_rsvd_bits is used to avoid guest_cpuid_has() lookups. Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/vmx/nested.c | 2 +- arch/x86/kvm/x86.c | 10 +++++++--- arch/x86/kvm/x86.h | 13 ++++++++++--- 4 files changed, 19 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 7f467fe05d42..1c002abe2be8 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -33,6 +33,7 @@ #define MC_VECTOR 18 #define XM_VECTOR 19 #define VE_VECTOR 20 +#define CP_VECTOR 21 /* Select x86 specific features in */ #define __KVM_HAVE_PIT diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 96ede74a6067..7bc62cd72748 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2850,7 +2850,7 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu, /* VM-entry interruption-info field: deliver error code */ should_have_error_code = intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode && - x86_exception_has_error_code(vector); + x86_exception_has_error_code(vcpu, vector); if (CC(has_error_code != should_have_error_code)) return -EINVAL; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7788646bbf1f..a768cbf3fbb7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -520,11 +520,15 @@ EXPORT_SYMBOL_GPL(kvm_spurious_fault); #define EXCPT_CONTRIBUTORY 1 #define EXCPT_PF 2 -static int exception_class(int vector) +static int exception_class(struct kvm_vcpu *vcpu, int vector) { switch (vector) { case PF_VECTOR: return EXCPT_PF; + case CP_VECTOR: + if (vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET) + return EXCPT_BENIGN; + return EXCPT_CONTRIBUTORY; case DE_VECTOR: case TS_VECTOR: case NP_VECTOR: @@ -707,8 +711,8 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); return; } - class1 = exception_class(prev_nr); - class2 = exception_class(nr); + class1 = exception_class(vcpu, prev_nr); + class2 = exception_class(vcpu, nr); if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY) || (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) { /* diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index c544602d07a3..2ba7c7fc4846 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -171,13 +171,20 @@ static inline bool is_64_bit_hypercall(struct kvm_vcpu *vcpu) return vcpu->arch.guest_state_protected || is_64_bit_mode(vcpu); } -static inline bool x86_exception_has_error_code(unsigned int vector) +static inline bool x86_exception_has_error_code(struct kvm_vcpu *vcpu, + unsigned int vector) { static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) | BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) | - BIT(PF_VECTOR) | BIT(AC_VECTOR); + BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR); - return (1U << vector) & exception_has_error_code; + if (!((1U << vector) & exception_has_error_code)) + return false; + + if (vector == CP_VECTOR) + return !(vcpu->arch.cr4_guest_rsvd_bits & X86_CR4_CET); + + return true; } static inline bool mmu_is_nested(struct kvm_vcpu *vcpu) From patchwork Thu May 11 04:08:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237548 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97357C77B7F for ; Thu, 11 May 2023 07:13:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237420AbjEKHNx (ORCPT ); Thu, 11 May 2023 03:13:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237408AbjEKHNo (ORCPT ); Thu, 11 May 2023 03:13:44 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1AEA230D8; Thu, 11 May 2023 00:13:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789220; x=1715325220; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=J0XgQwTy0iS8eIRI2gcCgub1aePDBSrzX/sB3imPY/s=; b=jJI7y/4zxSBcZirCTURMzP/YPYKtwlmp0i8hMricOOdo0s8C9q3bV6mP uKLt9Qx/77T9mZZSkzXejJGkHVwDsGyk2EM0QbsW/Wq++KD2n1UGF+zGh bSCy2W8RO4Um9oU+UNHRYhucHCglaCNKiJKPs/m7KheQZzUc5Dq3Y3MEy tw7oT9Gj6kVLud3uw3g+HcepaPe+aj6JbTvpkqHHZpKwwfTztL7tiq2vI II6ogs8hOKLtdDyAyr84euFQ5YZsSYzBHcOHXj9A8M7ah+x16AEzHVwgR oyaT4kFJRa+IzBZ2UQ2dPnzZPgocy1Yimr9x4SGx/JvNux+u+W7yeXZQT A==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896605" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896605" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:32 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512372" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512372" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:24 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Zhang Yi Z Subject: [PATCH v3 11/21] KVM:VMX: Introduce CET VMCS fields and control bits Date: Thu, 11 May 2023 00:08:47 -0400 Message-Id: <20230511040857.6094-12-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Control-flow Enforcement Technology(CET) is a CPU feature used to prevent Return/Jump-Oriented Programming (ROP/JOP) attacks. CET introduces a new exception type, Control Protection (#CP), and two sub-features(SHSTK,IBT) to defend against ROP/JOP style control-flow subversion attacks. Shadow Stack (SHSTK): A shadow stack is a second stack used exclusively for control transfer operations. The shadow stack is separate from the data/normal stack and can be enabled individually in user and kernel mode. When shadow stack is enabled, CALL pushes the return address on both the data and shadow stack. RET pops the return address from both stacks and compares them. If the return addresses from the two stacks do not match, the processor generates a #CP. Indirect Branch Tracking (IBT): IBT adds a new instrution, ENDBRANCH, that is used to mark valid target addresses of indirect branches(CALL, JMP, ENCLU[EEXIT], etc...). If an indirect branch is executed and the next instruction is _not_ an ENDBRANCH, the processor generates a #CP. Several new CET MSRs are defined to support CET: MSR_IA32_{U,S}_CET: Controls the CET settings for user mode and kernel mode respectively. MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3 protection respectively. MSR_IA32_INT_SSP_TAB: Linear address of shadow stack pointer table,the entry is indexed by IST of interrupt gate desc. Two XSAVES state bits are introduced for CET: IA32_XSS:[bit 11]: Control saving/restoring user mode CET states IA32_XSS:[bit 12]: Control saving/restoring kernel mode CET states. Six VMCS fields are introduced for CET: {HOST,GUEST}_S_CET: Stores CET settings for kernel mode. {HOST,GUEST}_SSP: Stores shadow stack pointer of current active task/thread. {HOST,GUEST}_INTR_SSP_TABLE: Stores current active MSR_IA32_INT_SSP_TAB. If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from the following VMCS fields at VM-Exit: HOST_S_CET HOST_SSP HOST_INTR_SSP_TABLE If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from the following VMCS fields at VM-Entry: GUEST_S_CET GUEST_SSP GUEST_INTR_SSP_TABLE Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang --- arch/x86/include/asm/vmx.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 498dc600bd5c..fe2aff27df8c 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -102,6 +102,7 @@ #define VM_EXIT_CLEAR_BNDCFGS 0x00800000 #define VM_EXIT_PT_CONCEAL_PIP 0x01000000 #define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000 +#define VM_EXIT_LOAD_CET_STATE 0x10000000 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff @@ -115,6 +116,7 @@ #define VM_ENTRY_LOAD_BNDCFGS 0x00010000 #define VM_ENTRY_PT_CONCEAL_PIP 0x00020000 #define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000 +#define VM_ENTRY_LOAD_CET_STATE 0x00100000 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff @@ -343,6 +345,9 @@ enum vmcs_field { GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822, GUEST_SYSENTER_ESP = 0x00006824, GUEST_SYSENTER_EIP = 0x00006826, + GUEST_S_CET = 0x00006828, + GUEST_SSP = 0x0000682a, + GUEST_INTR_SSP_TABLE = 0x0000682c, HOST_CR0 = 0x00006c00, HOST_CR3 = 0x00006c02, HOST_CR4 = 0x00006c04, @@ -355,6 +360,9 @@ enum vmcs_field { HOST_IA32_SYSENTER_EIP = 0x00006c12, HOST_RSP = 0x00006c14, HOST_RIP = 0x00006c16, + HOST_S_CET = 0x00006c18, + HOST_SSP = 0x00006c1a, + HOST_INTR_SSP_TABLE = 0x00006c1c }; /* From patchwork Thu May 11 04:08:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7529DC7EE24 for ; Thu, 11 May 2023 07:15:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237503AbjEKHPF (ORCPT ); Thu, 11 May 2023 03:15:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237521AbjEKHOV (ORCPT ); Thu, 11 May 2023 03:14:21 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7FF2386B5; Thu, 11 May 2023 00:13:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789235; x=1715325235; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mKznbbQmhJ6bKxVw03U0lSS5TRRLL1ugJjkxLjC4Lyc=; b=MDbHMgZPZKNp+vuVOAXsnE0/fOVXfBy0rUG+bmkTc9oiC+I9iSO5cZaA G0ntLFx1NA80dxDbBWkb0oONK/lGgciVtdC1dvv9AqGYySqeAhatUxoAX sHnUQbWePsasbkr+zi+7p5Vp/Kjnd4jak0XbTrVLgg3sq2E/sbPRs214s A0HS5SejnT8Mgh+X8RTy+P/ExHY20YOBTRgpI4FNrGpPGhaynJmSBjPHj 4lA+zocicAjlZfN+SwGwO7jcPsiv5FyshLvGRaTCoBtMvVcNPIZjdzjxs F5Ot0UmWbeUX8MNrVJCBQRms+DBr1ifMHDSEbfG1EvY0YlZZg9hakXefP g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896662" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896662" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512374" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512374" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:25 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 12/21] KVM:x86: Add fault checks for guest CR4.CET setting Date: Thu, 11 May 2023 00:08:48 -0400 Message-Id: <20230511040857.6094-13-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Check potential faults for CR4.CET setting per Intel SDM. CR4.CET is the master control bit for CET features (SHSTK and IBT). In addition to basic support checks, CET can be enabled if and only if CR0.WP==1, i.e. setting CR4.CET=1 faults if CR0.WP==0 and setting CR0.WP=0 fails if CR4.CET==1. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a768cbf3fbb7..b6eec9143129 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -995,6 +995,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE))) return 1; + if (!(cr0 & X86_CR0_WP) && kvm_is_cr4_bit_set(vcpu, X86_CR4_CET)) + return 1; + static_call(kvm_x86_set_cr0)(vcpu, cr0); kvm_post_set_cr0(vcpu, old_cr0, cr0); @@ -1210,6 +1213,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) return 1; } + if ((cr4 & X86_CR4_CET) && !kvm_is_cr0_bit_set(vcpu, X86_CR0_WP)) + return 1; + static_call(kvm_x86_set_cr4)(vcpu, cr4); kvm_post_set_cr4(vcpu, old_cr4, cr4); From patchwork Thu May 11 04:08:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237560 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B542BC77B7F for ; Thu, 11 May 2023 07:15:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229812AbjEKHPC (ORCPT ); Thu, 11 May 2023 03:15:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47730 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237524AbjEKHOV (ORCPT ); Thu, 11 May 2023 03:14:21 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7FFF686B6; Thu, 11 May 2023 00:13:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789235; x=1715325235; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ypN7RyU6EyWyfmcv9Dn9NuMR85kCawRsAVFQVfmjfeI=; b=QJP1hQ7YEDDM3+jea63L+5poI9k2DjAZ+3dtd29P/VO/j8j0prGOJhfT fq2Z2Lb3TGGS43Fk+qhMT23Hv5JSS51t+dkE6BPvI1XQzLUiTZydwFDb2 Hz+aGZg617W/eB7gY5LZBL9jpcnzxTHqsv7b5ZKOilRA93HI7oRGovYEo UImwOz4jlNr9LGa50qAFHvJX8wcQtqQ4vAyv+U7JPALHMs88v9cixSwNQ E5yG+GdzXkuNUaJvDAgobm6NmG+F70x1FaobDEKEdBFkLhhv5MqZdp6vC aF7Gb7v2xOcfv00VbNXL67ivCo7Gzv9OdRmViZRznclV+izsqn08Af1HI A==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896658" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896658" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512379" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512379" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:25 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 13/21] KVM:VMX: Emulate reads and writes to CET MSRs Date: Thu, 11 May 2023 00:08:49 -0400 Message-Id: <20230511040857.6094-14-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add support for emulating read and write accesses to CET MSRs. CET MSRs are universally "special" as they are either context switched via dedicated VMCS fields or via XSAVES, i.e. no additional in-memory tracking is needed, but emulated reads/writes are more expensive. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kernel/fpu/core.c | 1 + arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++++ arch/x86/kvm/x86.c | 20 ++++++++++++++++++++ arch/x86/kvm/x86.h | 31 +++++++++++++++++++++++++++++++ 4 files changed, 70 insertions(+) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index f851558b673f..b4e28487882c 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -770,6 +770,7 @@ void fpregs_lock_and_load(void) if (test_thread_flag(TIF_NEED_FPU_LOAD)) fpregs_restore_userregs(); } +EXPORT_SYMBOL_GPL(fpregs_lock_and_load); #ifdef CONFIG_X86_DEBUG_FPU /* diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index c872a5aafa50..0ccaa467d7d3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2093,6 +2093,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else msr_info->data = vmx->pt_desc.guest.addr_a[index / 2]; break; + case MSR_IA32_U_CET: + case MSR_IA32_PL3_SSP: + if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) + return 1; + kvm_get_xsave_msr(msr_info); + break; case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); break; @@ -2405,6 +2411,18 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] = data; break; + case MSR_IA32_U_CET: + case MSR_IA32_PL3_SSP: + if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) + return 1; + if (is_noncanonical_address(data, vcpu)) + return 1; + if (msr_index == MSR_IA32_U_CET && (data & GENMASK(9, 6))) + return 1; + if (msr_index == MSR_IA32_PL3_SSP && (data & GENMASK(2, 0))) + return 1; + kvm_set_xsave_msr(msr_info); + break; case MSR_IA32_PERF_CAPABILITIES: if (data && !vcpu_to_pmu(vcpu)->version) return 1; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b6eec9143129..2e3a39c9297c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -13630,6 +13630,26 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, } EXPORT_SYMBOL_GPL(kvm_sev_es_string_io); +bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) +{ + if (!kvm_cet_user_supported()) + return false; + + if (msr->host_initiated) + return true; + + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) && + !guest_cpuid_has(vcpu, X86_FEATURE_IBT)) + return false; + + if (msr->index == MSR_IA32_PL3_SSP && + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) + return false; + + return true; +} +EXPORT_SYMBOL_GPL(kvm_cet_is_msr_accessible); + EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 2ba7c7fc4846..93afa7631735 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -2,6 +2,7 @@ #ifndef ARCH_X86_KVM_X86_H #define ARCH_X86_KVM_X86_H +#include #include #include #include @@ -370,6 +371,16 @@ static inline bool kvm_mpx_supported(void) == (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR); } +/* + * Guest CET user mode states depend on host XSAVES/XRSTORS to save/restore + * when vCPU enter/exit user space. If host doesn't support CET user bit in + * XSS msr, then treat this case as KVM doesn't support CET user mode. + */ +static inline bool kvm_cet_user_supported(void) +{ + return !!(kvm_caps.supported_xss & XFEATURE_MASK_CET_USER); +} + extern unsigned int min_timer_period_us; extern bool enable_vmware_backdoor; @@ -546,5 +557,25 @@ int kvm_sev_es_mmio_read(struct kvm_vcpu *vcpu, gpa_t src, unsigned int bytes, int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, unsigned int port, void *data, unsigned int count, int in); +bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr); + +/* + * We've already loaded guest MSRs in __msr_io() after check the MSR index. + * In case vcpu has been preempted, we need to disable preemption, check + * and reload the guest fpu states before read/write xsaves-managed MSRs. + */ +static inline void kvm_get_xsave_msr(struct msr_data *msr_info) +{ + fpregs_lock_and_load(); + rdmsrl(msr_info->index, msr_info->data); + fpregs_unlock(); +} + +static inline void kvm_set_xsave_msr(struct msr_data *msr_info) +{ + fpregs_lock_and_load(); + wrmsrl(msr_info->index, msr_info->data); + fpregs_unlock(); +} #endif From patchwork Thu May 11 04:08:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237564 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30606C77B7F for ; Thu, 11 May 2023 07:15:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237574AbjEKHPN (ORCPT ); Thu, 11 May 2023 03:15:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237217AbjEKHO2 (ORCPT ); Thu, 11 May 2023 03:14:28 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 540B17DAA; Thu, 11 May 2023 00:14:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789243; x=1715325243; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=W68GLsI7BGlcsv3VlSMXq+A0E1axs4XFaQILR6l9S10=; b=ZM6fhvhmtw6UJuoSdp6I6vmsZcNRSNRlJ1MiI/XpDINjzPtZ/vfJNnsW b1FvlloJi5dv1BckllXkCIu63ucY+vihdhcQpKQRZIInWhQKK9g0BikiA Pq6ZNEnQbuDuf7x52PM1tz7W8rx4J+BrPWOXicxDwmzxUIY3Oug1Bte2E af1REp8DCrzqjPJqjKFv1J3EwyYAuTYvRaKTdv3QGgMx3uQGOwAM3qiYq KwyKOjLKXQum3zfmF8ZokhBicMTREORNZn1UdSHkrHsMkCFkw9/e5lGQv EyhRbmJ7dz/LRJJjV5RLlkzHj1/CH6vcp2ggD34W+iiXXPUqKBSYsprXH w==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896682" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896682" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512381" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512381" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:25 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 14/21] KVM:VMX: Add a synthetic MSR to allow userspace to access GUEST_SSP Date: Thu, 11 May 2023 00:08:50 -0400 Message-Id: <20230511040857.6094-15-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Introduce a host-only synthetic MSR, MSR_KVM_GUEST_SSP, so that the VMM can read/write the guest's SSP, e.g. to migrate CET state. Use a synthetic MSR, e.g. as opposed to a VCPU_REG_, as GUEST_SSP is subject to the same consistency checks as the PL*_SSP MSRs, i.e. can share code. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kvm/vmx/vmx.c | 15 ++++++++++++--- arch/x86/kvm/x86.c | 4 ++++ 3 files changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 6e64b27b2c1e..7af465e4e0bd 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -58,6 +58,7 @@ #define MSR_KVM_ASYNC_PF_INT 0x4b564d06 #define MSR_KVM_ASYNC_PF_ACK 0x4b564d07 #define MSR_KVM_MIGRATION_CONTROL 0x4b564d08 +#define MSR_KVM_GUEST_SSP 0x4b564d09 struct kvm_steal_time { __u64 steal; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0ccaa467d7d3..72149156bbd3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2095,9 +2095,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) break; case MSR_IA32_U_CET: case MSR_IA32_PL3_SSP: + case MSR_KVM_GUEST_SSP: if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) return 1; - kvm_get_xsave_msr(msr_info); + if (msr_info->index == MSR_KVM_GUEST_SSP) + msr_info->data = vmcs_readl(GUEST_SSP); + else + kvm_get_xsave_msr(msr_info); break; case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); @@ -2413,15 +2417,20 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) break; case MSR_IA32_U_CET: case MSR_IA32_PL3_SSP: + case MSR_KVM_GUEST_SSP: if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) return 1; if (is_noncanonical_address(data, vcpu)) return 1; if (msr_index == MSR_IA32_U_CET && (data & GENMASK(9, 6))) return 1; - if (msr_index == MSR_IA32_PL3_SSP && (data & GENMASK(2, 0))) + if ((msr_index == MSR_IA32_PL3_SSP || + msr_index == MSR_KVM_GUEST_SSP) && (data & GENMASK(2, 0))) return 1; - kvm_set_xsave_msr(msr_info); + if (msr_index == MSR_KVM_GUEST_SSP) + vmcs_writel(GUEST_SSP, data); + else + kvm_set_xsave_msr(msr_info); break; case MSR_IA32_PERF_CAPABILITIES: if (data && !vcpu_to_pmu(vcpu)->version) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2e3a39c9297c..baac6acebd40 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -13642,6 +13642,10 @@ bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) !guest_cpuid_has(vcpu, X86_FEATURE_IBT)) return false; + /* The synthetic MSR is for userspace access only. */ + if (msr->index == MSR_KVM_GUEST_SSP) + return false; + if (msr->index == MSR_IA32_PL3_SSP && !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) return false; From patchwork Thu May 11 04:08:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237566 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F7B6C77B7F for ; Thu, 11 May 2023 07:15:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237665AbjEKHPt (ORCPT ); Thu, 11 May 2023 03:15:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237743AbjEKHO6 (ORCPT ); Thu, 11 May 2023 03:14:58 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B767D9EC1; Thu, 11 May 2023 00:14:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789259; x=1715325259; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=nBe0qXcnNHhJ84SptlgfdsQFDbSH+mW0zYFkQvIGEPk=; b=RL4Y7Q5Iyjj3qoeG4mZIC7mHX20Za+cwcTXeUOKkvrXZG0FEhakLLbre 9j+9fs2In9r41vI8iDsweyFkFdkZfpdjGK3ebuS95aYpEApRneq3/H6cG G3L5a1fRLBSopzgbwseKISbGmsdkXNU9u3NUIbKjAxAmEpyQk4pG0tbrN sH7G0nJOYi/2Gj3O+AgFVo4B4hMwTdPZPu0gkwles9cATh8yCEE2UG3BO 0HwCJWdtJsLKc4gqczZqKSDAx44qYhAdQ+LfHh5Wn3wNtu1VrJbZi2EWB We71088wqLt4TDZFAz+UWKkr3ql+etzPh1PEIr6vMHTFI68Vr46wY1sgm Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896715" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896715" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512384" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512384" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:25 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 15/21] KVM:x86: Report CET MSRs as to-be-saved if CET is supported Date: Thu, 11 May 2023 00:08:51 -0400 Message-Id: <20230511040857.6094-16-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Report CET user mode MSRs, including the synthetic GUEST_SSP MSR, as to-be-saved MSRs. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/x86.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index baac6acebd40..50026557fb2a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1470,6 +1470,7 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS, + MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP, }; static const u32 msrs_to_save_pmu[] = { From patchwork Thu May 11 04:08:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237562 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA491C77B7F for ; Thu, 11 May 2023 07:15:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237550AbjEKHPH (ORCPT ); Thu, 11 May 2023 03:15:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237549AbjEKHOX (ORCPT ); Thu, 11 May 2023 03:14:23 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2F528A42; Thu, 11 May 2023 00:13:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789238; x=1715325238; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=i7lV+AJDDwMSt8GlHXcIg9vDWIXFzEYq7jFzVP/darg=; b=YVdRmoPa9C8Z4O1+YxTv1ARmzlXUMpORhv6JHuzHcugiOLk5IkYfswK/ Rl3kXxhtKAqD99zRcVx6LgmQ2Z0HGmChzYhSkvXGKqgt79XH6hC/Rl4lR 42Mt8obw8YJ8gl402bc8Is4balVptHxd0Qjix0IpA2SuKDkW52zdPtTLj wepUtueVGH17A+eWM3EsxoKWJuxS72PmNAcC55OyLknRtIQDkZ3fa/spx VQN6I8wYTx/9M6XNBU5oVatKReephfhLDEOx2r+RSIKR4kAgFO6nryKs9 Sx4F8E1jKC5d8G1t/CZB71Nq+NhBVWTnQOo/Gv145ndAN4Rvq+hoD5X0X Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896680" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896680" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512388" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512388" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:26 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v3 16/21] KVM:x86: Save/Restore GUEST_SSP to/from SMM state save area Date: Thu, 11 May 2023 00:08:52 -0400 Message-Id: <20230511040857.6094-17-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Save GUEST_SSP to SMM state save area when guest exits to SMM due to SMI and restore it VMCS field when guest exits SMM. Signed-off-by: Yang Weijiang --- arch/x86/kvm/smm.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c index b42111a24cc2..c54d3eb2b7e4 100644 --- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -275,6 +275,16 @@ static void enter_smm_save_state_64(struct kvm_vcpu *vcpu, enter_smm_save_seg_64(vcpu, &smram->gs, VCPU_SREG_GS); smram->int_shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu); + + if (kvm_cet_user_supported()) { + struct msr_data msr; + + msr.index = MSR_KVM_GUEST_SSP; + msr.host_initiated = true; + /* GUEST_SSP is stored in VMCS at vm-exit. */ + static_call(kvm_x86_get_msr)(vcpu, &msr); + smram->ssp = msr.data; + } } #endif @@ -565,6 +575,16 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt, static_call(kvm_x86_set_interrupt_shadow)(vcpu, 0); ctxt->interruptibility = (u8)smstate->int_shadow; + if (kvm_cet_user_supported()) { + struct msr_data msr; + + msr.index = MSR_KVM_GUEST_SSP; + msr.host_initiated = true; + msr.data = smstate->ssp; + /* Mimic host_initiated access to bypass ssp access check. */ + static_call(kvm_x86_set_msr)(vcpu, &msr); + } + return X86EMUL_CONTINUE; } #endif From patchwork Thu May 11 04:08:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237563 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2744CC77B7C for ; Thu, 11 May 2023 07:15:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237555AbjEKHPK (ORCPT ); Thu, 11 May 2023 03:15:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237453AbjEKHO2 (ORCPT ); Thu, 11 May 2023 03:14:28 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E3BA9EDB; Thu, 11 May 2023 00:14:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789243; x=1715325243; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=AX0H9iVRW3xvfWuSDoCTwhSLi7i+WGN6LUh2TW6rnio=; b=mfxn3JVQx0Bv9aZzAFR3vaxMQzBUiEMhZqDFbFcmuhqEZle8zujLApmp QZTsGVH2nCDLel5Fk5bEWiNVP5O3gDooyM7sMcx6Pb58f34RSzZeqVybg nx9Oah6B0HLjI2qZLzEX8oZPw2q+hRhZgfL4QdEm5FyS6pOR8xf0divgv LYGdM9Xn1u0cqBwgYHguM7Ci9TV7h2PdgQFe+ifJGSDkiqh7oeVdrizAW ZVA2gA8O4ViHRO7EemDLt3tUIeS266ZRHqPjZFegcpnr6VpBkft/UBDKN xRva1jwCyFOukMn9/HI0t2TgCQnFJDsV6mj6PcQ2RVVwJt1iOxTXxlJhV g==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896699" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896699" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512389" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512389" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:26 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Zhang Yi Z , Sean Christopherson Subject: [PATCH v3 17/21] KVM:VMX: Pass through user CET MSRs to the guest Date: Thu, 11 May 2023 00:08:53 -0400 Message-Id: <20230511040857.6094-18-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Pass through CET user mode MSRs when the associated CET component is enabled to improve guest performance. All CET MSRs are context switched, either via dedicated VMCS fields or XSAVES. Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 72149156bbd3..c254c23f89f3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -709,6 +709,9 @@ static bool is_valid_passthrough_msr(u32 msr) case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8: /* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */ return true; + case MSR_IA32_U_CET: + case MSR_IA32_PL3_SSP: + return true; } r = possible_passthrough_msr_slot(msr) != -ENOENT; @@ -7702,6 +7705,23 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4)); } +static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_state) +{ + return (kvm_caps.supported_xss & xss_state) && + (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) || + guest_cpuid_has(vcpu, X86_FEATURE_IBT)); +} + +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) +{ + bool incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, incpt); + + incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt); +} + static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7769,6 +7789,9 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); + + if (kvm_cet_user_supported()) + vmx_update_intercept_for_cet_msr(vcpu); } static u64 vmx_get_perf_capabilities(void) From patchwork Thu May 11 04:08:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237570 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD30CC7EE24 for ; Thu, 11 May 2023 07:16:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237707AbjEKHQ0 (ORCPT ); Thu, 11 May 2023 03:16:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48496 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237445AbjEKHPv (ORCPT ); Thu, 11 May 2023 03:15:51 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5A8D8690; Thu, 11 May 2023 00:14:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789291; x=1715325291; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=db0GTQOgk+DeoF0qvInPqP9Z+HtdacSarUcPqrbhQ4k=; b=cTEuEjyWe2F6ueurRHOxIRBvALKhnluS9Xdqy2G4sbmWDE1QSTc2B3C+ k6sdYBbGXqdvRbQpeKzUCxfJ7NA7w9n63tbv9P4AQ6X9nwFCzqhJI4/T7 fQscrBGkgiKZmwfCgPhD3muzQcGT+speHFyBSIifPQlvGKaOCN8WFNM/e dPDONUlOzQFY44yH97LoDHR6Wwo9h+Fo+wFocQs6ABNUHtTCAOW7rl2Tl iWc6Ttgt5lMFSBJI++EeNxHvDNUbEAsqrl8WHbUaMHSpNOjJGNaj5sl7g Hc7jQJiPLpn2fXVtDgzej8Suj21RokcVnw5EBALDNLRRYAi3iStpWs4Ss A==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896712" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896712" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512391" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512391" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:26 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 18/21] KVM:x86: Enable CET virtualization for VMX and advertise to userspace Date: Thu, 11 May 2023 00:08:54 -0400 Message-Id: <20230511040857.6094-19-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Set the feature bits so that CET capabilities can be seen in guest via CPUID enumeration. Add CR4.CET bit support in order to allow guest set CET master control bit(CR4.CET). Disable KVM CET feature if unrestricted_guest is unsupported/disabled as KVM does not support emulating CET. Don't expose CET feature if dependent CET bits are cleared in host XSS, or if XSAVES isn't supported. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/cpuid.c | 12 ++++++++++-- arch/x86/kvm/vmx/capabilities.h | 4 ++++ arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 6 ++++-- arch/x86/kvm/x86.c | 21 ++++++++++++++++++++- arch/x86/kvm/x86.h | 3 +++ 7 files changed, 62 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2865c3cb3501..58e20d5895d1 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -125,7 +125,8 @@ | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \ | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \ | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \ - | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP)) + | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \ + | X86_CR4_CET)) #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index cbb1b8a65502..fefe8833f892 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -632,7 +632,7 @@ void kvm_set_cpu_caps(void) F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) | F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) | F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ | - F(SGX_LC) | F(BUS_LOCK_DETECT) + F(SGX_LC) | F(BUS_LOCK_DETECT) | F(SHSTK) ); /* Set LA57 based on hardware capability. */ if (cpuid_ecx(7) & F(LA57)) @@ -650,7 +650,8 @@ void kvm_set_cpu_caps(void) F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) | F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) | - F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) + F(AMX_TILE) | F(AMX_INT8) | F(AMX_BF16) | F(FLUSH_L1D) | + F(IBT) ); /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */ @@ -663,6 +664,13 @@ void kvm_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP); if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD); + /* + * The feature bit in boot_cpu_data.x86_capability could have been + * cleared due to ibt=off cmdline option, then add it back if CPU + * supports IBT. + */ + if (cpuid_edx(7) & F(IBT)) + kvm_cpu_cap_set(X86_FEATURE_IBT); kvm_cpu_cap_mask(CPUID_7_1_EAX, F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 45162c1bcd8f..85cffeae7f10 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -106,6 +106,10 @@ static inline bool cpu_has_load_perf_global_ctrl(void) return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; } +static inline bool cpu_has_load_cet_ctrl(void) +{ + return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE); +} static inline bool cpu_has_vmx_mpx(void) { return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index c254c23f89f3..cb5908433c09 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2607,6 +2607,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf, { VM_ENTRY_LOAD_IA32_EFER, VM_EXIT_LOAD_IA32_EFER }, { VM_ENTRY_LOAD_BNDCFGS, VM_EXIT_CLEAR_BNDCFGS }, { VM_ENTRY_LOAD_IA32_RTIT_CTL, VM_EXIT_CLEAR_IA32_RTIT_CTL }, + { VM_ENTRY_LOAD_CET_STATE, VM_EXIT_LOAD_CET_STATE }, }; memset(vmcs_conf, 0, sizeof(*vmcs_conf)); @@ -6316,6 +6317,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0) vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest); + if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP)); + pr_err("INTR SSP TABLE = 0x%016lx\n", + vmcs_readl(GUEST_INTR_SSP_TABLE)); + } pr_err("*** Host State ***\n"); pr_err("RIP = 0x%016lx RSP = 0x%016lx\n", vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP)); @@ -6393,6 +6400,12 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID) pr_err("Virtual processor ID = 0x%04x\n", vmcs_read16(VIRTUAL_PROCESSOR_ID)); + if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) { + pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET)); + pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP)); + pr_err("INTR SSP TABLE = 0x%016lx\n", + vmcs_readl(HOST_INTR_SSP_TABLE)); + } } /* @@ -7867,6 +7880,12 @@ static __init void vmx_set_cpu_caps(void) if (cpu_has_vmx_waitpkg()) kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG); + + if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER; + } } static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 9e66531861cf..5e3ba69006f9 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -493,7 +493,8 @@ static inline u8 vmx_get_rvi(void) VM_ENTRY_LOAD_IA32_EFER | \ VM_ENTRY_LOAD_BNDCFGS | \ VM_ENTRY_PT_CONCEAL_PIP | \ - VM_ENTRY_LOAD_IA32_RTIT_CTL) + VM_ENTRY_LOAD_IA32_RTIT_CTL | \ + VM_ENTRY_LOAD_CET_STATE) #define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS \ (VM_EXIT_SAVE_DEBUG_CONTROLS | \ @@ -515,7 +516,8 @@ static inline u8 vmx_get_rvi(void) VM_EXIT_LOAD_IA32_EFER | \ VM_EXIT_CLEAR_BNDCFGS | \ VM_EXIT_PT_CONCEAL_PIP | \ - VM_EXIT_CLEAR_IA32_RTIT_CTL) + VM_EXIT_CLEAR_IA32_RTIT_CTL | \ + VM_EXIT_LOAD_CET_STATE) #define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL \ (PIN_BASED_EXT_INTR_MASK | \ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 50026557fb2a..858cb68e781a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -226,7 +226,7 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) -#define KVM_SUPPORTED_XSS 0 +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER) u64 __read_mostly host_efer; EXPORT_SYMBOL_GPL(host_efer); @@ -9525,6 +9525,25 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) kvm_ops_update(ops); + /* + * Check CET user bit is still set in kvm_caps.supported_xss, + * if not, clear the cap bits as the user parts depends on + * XSAVES support. + */ + if (!kvm_cet_user_supported()) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } + + /* + * If SHSTK and IBT are available in KVM, clear CET user bit in + * kvm_caps.supported_xss so that kvm_cet_user_supported() returns + * false when called. + */ + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER; + for_each_online_cpu(cpu) { smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &r, 1); if (r < 0) diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 93afa7631735..09a8c8316914 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -547,6 +547,9 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type); __reserved_bits |= X86_CR4_VMXE; \ if (!__cpu_has(__c, X86_FEATURE_PCID)) \ __reserved_bits |= X86_CR4_PCIDE; \ + if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \ + !__cpu_has(__c, X86_FEATURE_IBT)) \ + __reserved_bits |= X86_CR4_CET; \ __reserved_bits; \ }) From patchwork Thu May 11 04:08:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237565 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31B76C7EE24 for ; Thu, 11 May 2023 07:15:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237453AbjEKHPR (ORCPT ); Thu, 11 May 2023 03:15:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237643AbjEKHOq (ORCPT ); Thu, 11 May 2023 03:14:46 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05DC28689; Thu, 11 May 2023 00:14:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789246; x=1715325246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=uNm/HYTnpMFseQVesIJGpIfdIi8BBpG9Gf0NUkKeLyc=; b=C/4QE0erb21Qr7UhRHs3qUEz7vLAcLzmjF3LQAN4J+ALaa54WdnMs5sA TUKM+akecYOuWFEcRIyhixeHI5WgKmnV7FXYEXYK+ES2P/zBXtZSx2hGN ZvROBRHiqoSVkBWzSVskiT2TEA/X6QsUx80Tt0g0vpZio1U4uopIVm6r7 FDMHI3WvKGtSdXxpCpFqlemr/bU/G1PgmwHEVrVDCB1SFhZZyT4fYhA8p Pl7sm2tQQkbWFQON1TxKVA5pGS2yKSYSg2jZFipKxo1ZGMvZZxkQ2pmHu LuZA7A2ERUn+ljPyrhQ/V9OkkwmRdNmoHMBxN8QsRsjSeSD/MXeumug1w A==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896698" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896698" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512393" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512393" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:26 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, Sean Christopherson Subject: [PATCH v3 19/21] KVM:nVMX: Enable user CET support for nested VMX Date: Thu, 11 May 2023 00:08:55 -0400 Message-Id: <20230511040857.6094-20-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add CET fields to vmcs12 and pass through CET user mode MSRs to L2 if L1 can support. Enable nested VMCS control bits and CR4 CET bit support. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/nested.c | 12 ++++++++++-- arch/x86/kvm/vmx/vmcs12.c | 6 ++++++ arch/x86/kvm/vmx/vmcs12.h | 14 +++++++++++++- arch/x86/kvm/vmx/vmx.c | 2 ++ 4 files changed, 31 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 7bc62cd72748..522ac27d2534 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -660,6 +660,13 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_FLUSH_CMD, MSR_TYPE_W); + /* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */ + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_U_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL3_SSP, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false); vmx->nested.force_msr_bitmap_recalc = false; @@ -6785,7 +6792,7 @@ static void nested_vmx_setup_exit_ctls(struct vmcs_config *vmcs_conf, VM_EXIT_HOST_ADDR_SPACE_SIZE | #endif VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT | - VM_EXIT_CLEAR_BNDCFGS; + VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_CET_STATE; msrs->exit_ctls_high |= VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR | VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER | @@ -6807,7 +6814,8 @@ static void nested_vmx_setup_entry_ctls(struct vmcs_config *vmcs_conf, #ifdef CONFIG_X86_64 VM_ENTRY_IA32E_MODE | #endif - VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS; + VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS | + VM_ENTRY_LOAD_CET_STATE; msrs->entry_ctls_high |= (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL); diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c index 106a72c923ca..4233b5ca9461 100644 --- a/arch/x86/kvm/vmx/vmcs12.c +++ b/arch/x86/kvm/vmx/vmcs12.c @@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] = { FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp), FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip), + FIELD(GUEST_S_CET, guest_s_cet), + FIELD(GUEST_SSP, guest_ssp), + FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl), FIELD(HOST_CR0, host_cr0), FIELD(HOST_CR3, host_cr3), FIELD(HOST_CR4, host_cr4), @@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] = { FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), FIELD(HOST_RSP, host_rsp), FIELD(HOST_RIP, host_rip), + FIELD(HOST_S_CET, host_s_cet), + FIELD(HOST_SSP, host_ssp), + FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl), }; const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets); diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h index 01936013428b..3884489e7f7e 100644 --- a/arch/x86/kvm/vmx/vmcs12.h +++ b/arch/x86/kvm/vmx/vmcs12.h @@ -117,7 +117,13 @@ struct __packed vmcs12 { natural_width host_ia32_sysenter_eip; natural_width host_rsp; natural_width host_rip; - natural_width paddingl[8]; /* room for future expansion */ + natural_width host_s_cet; + natural_width host_ssp; + natural_width host_ssp_tbl; + natural_width guest_s_cet; + natural_width guest_ssp; + natural_width guest_ssp_tbl; + natural_width paddingl[2]; /* room for future expansion */ u32 pin_based_vm_exec_control; u32 cpu_based_vm_exec_control; u32 exception_bitmap; @@ -292,6 +298,12 @@ static inline void vmx_check_vmcs12_offsets(void) CHECK_OFFSET(host_ia32_sysenter_eip, 656); CHECK_OFFSET(host_rsp, 664); CHECK_OFFSET(host_rip, 672); + CHECK_OFFSET(host_s_cet, 680); + CHECK_OFFSET(host_ssp, 688); + CHECK_OFFSET(host_ssp_tbl, 696); + CHECK_OFFSET(guest_s_cet, 704); + CHECK_OFFSET(guest_ssp, 712); + CHECK_OFFSET(guest_ssp_tbl, 720); CHECK_OFFSET(pin_based_vm_exec_control, 744); CHECK_OFFSET(cpu_based_vm_exec_control, 748); CHECK_OFFSET(exception_bitmap, 752); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index cb5908433c09..a2494156902d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7645,6 +7645,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu) cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU)); cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); + cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT)); #undef cr4_fixed1_update } From patchwork Thu May 11 04:08:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237568 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30F64C77B7F for ; Thu, 11 May 2023 07:16:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237286AbjEKHQJ (ORCPT ); Thu, 11 May 2023 03:16:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48726 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237625AbjEKHPg (ORCPT ); Thu, 11 May 2023 03:15:36 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7FD5BA5CC; Thu, 11 May 2023 00:14:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789275; x=1715325275; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mK53r6YiSNjTM8Iq2DJUiKNRLNWiYAEdm4DYvtOj8+c=; b=T09nLY+VJqyiIv8slJtOCwmUC50N4tR7TibCnRSkTBVCHp/6iyX6D5ds SDDpH/2lXobUTEArgxLzZofCWztkOuynVMjHA9y8W5Wqsj4MJ69JGIBGx Eh2uQTJnf7vKsQKCuLBnujG7Xqi1mHeC3KzQ1XlYV5v1MhbWmd9A5xv9e 9PZsmBJNWLzuD4xu/LaiSGyqc/S4X+6kG6USk06G3g6e6rnZoKpyqfblZ eq/ArAuOkBb+98xFB4GiG4tUkapBjGecw5FRgIkrUF3cUp5BcJgiWio0N qJtXkTyf4f9c/ZrSiTL5jttM9xUFvpCkN9PfH685kk1hItkftR0gI9N3K w==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896713" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896713" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512395" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512395" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:27 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v3 20/21] KVM:x86: Enable kernel IBT support for guest Date: Thu, 11 May 2023 00:08:56 -0400 Message-Id: <20230511040857.6094-21-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Enable MSR_IA32_S_CET access for guest kernel IBT. Mainline Linux kernel now supports supervisor IBT for kernel code, to make s-IBT work in guest(nested guest), pass through MSR_IA32_S_CET to guest(nested guest) if host kernel and KVM enabled IBT. Note, s-IBT can work independent to host xsaves support because guest MSR_IA32_S_CET is {stored|loaded} from VMCS GUEST_S_CET field. Signed-off-by: Yang Weijiang --- arch/x86/kvm/vmx/nested.c | 3 +++ arch/x86/kvm/vmx/vmx.c | 39 ++++++++++++++++++++++++++++++++++----- arch/x86/kvm/x86.c | 7 ++++++- 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 522ac27d2534..bf690827bfee 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -664,6 +664,9 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_U_CET, MSR_TYPE_RW); + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_S_CET, MSR_TYPE_RW); + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_PL3_SSP, MSR_TYPE_RW); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index a2494156902d..1d0151f9e575 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -711,6 +711,7 @@ static bool is_valid_passthrough_msr(u32 msr) return true; case MSR_IA32_U_CET: case MSR_IA32_PL3_SSP: + case MSR_IA32_S_CET: return true; } @@ -2097,14 +2098,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = vmx->pt_desc.guest.addr_a[index / 2]; break; case MSR_IA32_U_CET: + case MSR_IA32_S_CET: case MSR_IA32_PL3_SSP: case MSR_KVM_GUEST_SSP: if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) return 1; - if (msr_info->index == MSR_KVM_GUEST_SSP) + if (msr_info->index == MSR_KVM_GUEST_SSP) { msr_info->data = vmcs_readl(GUEST_SSP); - else + } else if (msr_info->index == MSR_IA32_S_CET) { + msr_info->data = vmcs_readl(GUEST_S_CET); + } else { kvm_get_xsave_msr(msr_info); + } break; case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); @@ -2419,6 +2424,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vmx->pt_desc.guest.addr_a[index / 2] = data; break; case MSR_IA32_U_CET: + case MSR_IA32_S_CET: case MSR_IA32_PL3_SSP: case MSR_KVM_GUEST_SSP: if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) @@ -2430,10 +2436,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if ((msr_index == MSR_IA32_PL3_SSP || msr_index == MSR_KVM_GUEST_SSP) && (data & GENMASK(2, 0))) return 1; - if (msr_index == MSR_KVM_GUEST_SSP) + if (msr_index == MSR_KVM_GUEST_SSP) { vmcs_writel(GUEST_SSP, data); - else + } else if (msr_index == MSR_IA32_S_CET) { + vmcs_writel(GUEST_S_CET, data); + } else { kvm_set_xsave_msr(msr_info); + } break; case MSR_IA32_PERF_CAPABILITIES: if (data && !vcpu_to_pmu(vcpu)->version) @@ -7322,6 +7331,19 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) kvm_wait_lapic_expire(vcpu); + /* + * Save host MSR_IA32_S_CET so that it can be reloaded at vm_exit. + * No need to save the other two vmcs fields as supervisor SHSTK + * are not enabled on Intel platform now. + */ + if (IS_ENABLED(CONFIG_X86_KERNEL_IBT) && + (vm_exit_controls_get(vmx) & VM_EXIT_LOAD_CET_STATE)) { + u64 msr; + + rdmsrl(MSR_IA32_S_CET, msr); + vmcs_writel(HOST_S_CET, msr); + } + /* The actual VMENTER/EXIT is in the .noinstr.text section. */ vmx_vcpu_enter_exit(vcpu, __vmx_vcpu_run_flags(vmx)); @@ -7735,6 +7757,13 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK); vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt); + + /* + * If IBT is available to guest, then passthrough S_CET MSR too since + * kernel IBT is already in mainline kernel tree. + */ + incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt); } static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) @@ -7805,7 +7834,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); - if (kvm_cet_user_supported()) + if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT)) vmx_update_intercept_for_cet_msr(vcpu); } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 858cb68e781a..b450361b94ef 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1471,6 +1471,7 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP, + MSR_IA32_S_CET, }; static const u32 msrs_to_save_pmu[] = { @@ -13652,7 +13653,8 @@ EXPORT_SYMBOL_GPL(kvm_sev_es_string_io); bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) { - if (!kvm_cet_user_supported()) + if (!kvm_cet_user_supported() && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) return false; if (msr->host_initiated) @@ -13666,6 +13668,9 @@ bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) if (msr->index == MSR_KVM_GUEST_SSP) return false; + if (msr->index == MSR_IA32_S_CET) + return guest_cpuid_has(vcpu, X86_FEATURE_IBT); + if (msr->index == MSR_IA32_PL3_SSP && !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) return false; From patchwork Thu May 11 04:08:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Weijiang" X-Patchwork-Id: 13237567 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0B40C77B7C for ; Thu, 11 May 2023 07:15:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237680AbjEKHPu (ORCPT ); Thu, 11 May 2023 03:15:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237747AbjEKHO6 (ORCPT ); Thu, 11 May 2023 03:14:58 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B76F8A26F; Thu, 11 May 2023 00:14:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1683789259; x=1715325259; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7DRMQNweZsiYd8Xw21ykHz/E/apQV0kcIAhwFqTmNrE=; b=UhjyjWKx6oB3S4tj48IHjA1Nxces3KdKjnT3IeSTVS2NpD1WOAw/t2lo 442OkGFTM0XzYPoh07iSwusOgf5769o6A0/VtrZc/J3eJJR83WPxbHlEn ELfB9QysaAlHT+oJMWVFVecdAmXcEPpu4XU5Kbe8pwBV+JVcht3Rj/TOM JoDYqGU4yLxLzmZkxAQXUvvk6qLD3ZUffq1VIpk51E+XD3XWs7X1+GHRu r/5WhXuMSqmfJuf0KaaR2dk7b0SgJ6ie7R0pa3yPJjQWlblPrMwwqC02H bXoJb6LiH+ALwSt2WKyBtuja+9sKVE3Nw82XBqbwHuXbJYMVO+WR2t4r6 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="334896700" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="334896700" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10706"; a="1029512398" X-IronPort-AV: E=Sophos;i="5.99,266,1677571200"; d="scan'208";a="1029512398" Received: from embargo.jf.intel.com ([10.165.9.183]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 May 2023 00:13:28 -0700 From: Yang Weijiang To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, rppt@kernel.org, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com Subject: [PATCH v3 21/21] KVM:x86: Support CET supervisor shadow stack MSR access Date: Thu, 11 May 2023 00:08:57 -0400 Message-Id: <20230511040857.6094-22-weijiang.yang@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230511040857.6094-1-weijiang.yang@intel.com> References: <20230511040857.6094-1-weijiang.yang@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add MSR access interfaces for supervisor shadow stack, i.e., MSR_IA32_PL{0,1,2} and MSR_IA32_INT_SSP_TAB, meanwhile pass through them to {L1,L2} guests when {L0,L1} KVM supports supervisor shadow stack. Note, currently supervisor shadow stack is not supported on Intel platforms, i.e., VMX always clears CPUID(EAX=07H,ECX=1):EDX[bit 18]. The main purpose of this patch is to facilitate AMD folks to enable supervisor shadow stack for their platforms. Signed-off-by: Yang Weijiang --- arch/x86/kvm/cpuid.h | 6 ++++++ arch/x86/kvm/vmx/nested.c | 12 ++++++++++++ arch/x86/kvm/vmx/vmx.c | 25 ++++++++++++++++++++++++- arch/x86/kvm/x86.c | 21 ++++++++++++++++++--- 4 files changed, 60 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index b1658c0de847..019a16b25b88 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -232,4 +232,10 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu, return vcpu->arch.pv_cpuid.features & (1u << kvm_feature); } +static __always_inline bool kvm_cet_kernel_shstk_supported(void) +{ + return !IS_ENABLED(CONFIG_KVM_INTEL) && + kvm_cpu_cap_has(X86_FEATURE_SHSTK); +} + #endif diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index bf690827bfee..aaaae92dc9f6 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -670,6 +670,18 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_PL3_SSP, MSR_TYPE_RW); + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL0_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL1_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL2_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &vmx->nested.msr_bitmap_map, false); vmx->nested.force_msr_bitmap_recalc = false; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 1d0151f9e575..d70f2e94b187 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -713,6 +713,9 @@ static bool is_valid_passthrough_msr(u32 msr) case MSR_IA32_PL3_SSP: case MSR_IA32_S_CET: return true; + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP: + case MSR_IA32_INT_SSP_TAB: + return true; } r = possible_passthrough_msr_slot(msr) != -ENOENT; @@ -2101,12 +2104,16 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_IA32_S_CET: case MSR_IA32_PL3_SSP: case MSR_KVM_GUEST_SSP: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP: + case MSR_IA32_INT_SSP_TAB: if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) return 1; if (msr_info->index == MSR_KVM_GUEST_SSP) { msr_info->data = vmcs_readl(GUEST_SSP); } else if (msr_info->index == MSR_IA32_S_CET) { msr_info->data = vmcs_readl(GUEST_S_CET); + } else if (msr_info->index == MSR_IA32_INT_SSP_TAB) { + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE); } else { kvm_get_xsave_msr(msr_info); } @@ -2427,6 +2434,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_IA32_S_CET: case MSR_IA32_PL3_SSP: case MSR_KVM_GUEST_SSP: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL2_SSP: + case MSR_IA32_INT_SSP_TAB: if (!kvm_cet_is_msr_accessible(vcpu, msr_info)) return 1; if (is_noncanonical_address(data, vcpu)) @@ -2440,6 +2449,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vmcs_writel(GUEST_SSP, data); } else if (msr_index == MSR_IA32_S_CET) { vmcs_writel(GUEST_S_CET, data); + } else if (msr_index == MSR_IA32_INT_SSP_TAB) { + vmcs_writel(GUEST_INTR_SSP_TABLE, data); } else { kvm_set_xsave_msr(msr_info); } @@ -7764,6 +7775,17 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu) */ incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT); vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt); + + /* + * Supervisor shadow stack is not supported in VMX now, intercept all + * related MSRs. + */ + incpt = !kvm_cet_kernel_shstk_supported(); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, incpt); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, incpt); } static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) @@ -7834,7 +7856,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); - if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT)) + if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT) || + kvm_cpu_cap_has(X86_FEATURE_SHSTK)) vmx_update_intercept_for_cet_msr(vcpu); } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b450361b94ef..a9ab01293420 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1472,6 +1472,8 @@ static const u32 msrs_to_save_base[] = { MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP, MSR_IA32_S_CET, + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, + MSR_IA32_INT_SSP_TAB, }; static const u32 msrs_to_save_pmu[] = { @@ -13653,8 +13655,11 @@ EXPORT_SYMBOL_GPL(kvm_sev_es_string_io); bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) { + u64 mask; + if (!kvm_cet_user_supported() && - !kvm_cpu_cap_has(X86_FEATURE_IBT)) + !(kvm_cpu_cap_has(X86_FEATURE_IBT) || + kvm_cpu_cap_has(X86_FEATURE_SHSTK))) return false; if (msr->host_initiated) @@ -13668,14 +13673,24 @@ bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr) if (msr->index == MSR_KVM_GUEST_SSP) return false; + if (msr->index == MSR_IA32_U_CET) + return true; + if (msr->index == MSR_IA32_S_CET) - return guest_cpuid_has(vcpu, X86_FEATURE_IBT); + return guest_cpuid_has(vcpu, X86_FEATURE_IBT) || + kvm_cet_kernel_shstk_supported(); + + if (msr->index == MSR_IA32_INT_SSP_TAB) + return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) && + kvm_cet_kernel_shstk_supported(); if (msr->index == MSR_IA32_PL3_SSP && !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK)) return false; - return true; + mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER : + XFEATURE_MASK_CET_KERNEL; + return !!(kvm_caps.supported_xss & mask); } EXPORT_SYMBOL_GPL(kvm_cet_is_msr_accessible);