Message ID | 20240626073719.5246-1-amit@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] KVM: SVM: let alternatives handle the cases when RSB filling is required |
On Wed, Jun 26, 2024, Amit Shah wrote:
> ---
>  arch/x86/kvm/svm/vmenter.S | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> index a0c8eb37d3e1..2ed80aea3bb1 100644
> --- a/arch/x86/kvm/svm/vmenter.S
> +++ b/arch/x86/kvm/svm/vmenter.S
> @@ -209,10 +209,8 @@ SYM_FUNC_START(__svm_vcpu_run)
>  7:	vmload %_ASM_AX
>  8:
>
> -#ifdef CONFIG_MITIGATION_RETPOLINE
>  	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
> -	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
> -#endif
> +	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT

Out of an abundance of paranoia, shouldn't this be?

	FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
			   X86_FEATURE_RSB_VMEXIT_LITE

Hmm, but it looks like that would incorrectly trigger the "lite" flavor for
families 0xf - 0x12.  I assume those old CPUs aren't affected by whatever on
earth EIBRS_PBRSB is.

	/* AMD Family 0xf - 0x12 */
	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),

	/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),

>
>  	/* Clobbers RAX, RCX, RDX. */
>  	RESTORE_HOST_SPEC_CTRL
> @@ -348,10 +346,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
>
>  2:	cli
>
> -#ifdef CONFIG_MITIGATION_RETPOLINE
>  	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
> -	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
> -#endif
> +	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>
>  	/* Clobbers RAX, RCX, RDX, consumes RDI (@svm) and RSI (@spec_ctrl_intercepted). */
>  	RESTORE_HOST_SPEC_CTRL
> --
> 2.45.2
>
On Fri, Jun 28, 2024 at 9:09 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Jun 26, 2024, Amit Shah wrote:
> > ---
> >  arch/x86/kvm/svm/vmenter.S | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> > index a0c8eb37d3e1..2ed80aea3bb1 100644
> > --- a/arch/x86/kvm/svm/vmenter.S
> > +++ b/arch/x86/kvm/svm/vmenter.S
> > @@ -209,10 -208,8 @@ SYM_FUNC_START(__svm_vcpu_run)
> >  7:	vmload %_ASM_AX
> >  8:
> >
> > -#ifdef CONFIG_MITIGATION_RETPOLINE
> >  	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
> > -	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
> > -#endif
> > +	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
>
> Out of an abundance of paranoia, shouldn't this be?
>
> 	FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> 			   X86_FEATURE_RSB_VMEXIT_LITE
>
> Hmm, but it looks like that would incorrectly trigger the "lite" flavor for
> families 0xf - 0x12.  I assume those old CPUs aren't affected by whatever on
> earth EIBRS_PBRSB is.
>
> 	/* AMD Family 0xf - 0x12 */
> 	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> 	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> 	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> 	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
>
> 	/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
> 	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
> 	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),

Your assumption is correct. As for why the cpu_vuln_whitelist[] doesn't say so
explicitly, you need to read between the lines...

> 	/*
> 	 * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
> 	 * flag and protect from vendor-specific bugs via the whitelist.
> 	 *
> 	 * Don't use AutoIBRS when SNP is enabled because it degrades host
> 	 * userspace indirect branch performance.
> 	 */
> 	if ((x86_arch_cap_msr & ARCH_CAP_IBRS_ALL) ||
> 	    (cpu_has(c, X86_FEATURE_AUTOIBRS) &&
> 	     !cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
> 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
> 		if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
> 		    !(x86_arch_cap_msr & ARCH_CAP_PBRSB_NO))
> 			setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
> 	}

Families 0FH through 12H don't have EIBRS or AutoIBRS, so there's no
cpu_vuln_whitelist[] lookup. Hence, no need to set the NO_EIBRS_PBRSB bit,
even if it is accurate.

> >
> >  	/* Clobbers RAX, RCX, RDX. */
> >  	RESTORE_HOST_SPEC_CTRL
> > @@ -348,10 -346,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
> >
> >  2:	cli
> >
> > -#ifdef CONFIG_MITIGATION_RETPOLINE
> >  	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
> > -	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
> > -#endif
> > +	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
> >
> >  	/* Clobbers RAX, RCX, RDX, consumes RDI (@svm) and RSI (@spec_ctrl_intercepted). */
> >  	RESTORE_HOST_SPEC_CTRL
> > --
> > 2.45.2
> >
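Jim's "read between the lines" argument compresses to a single predicate: the
whitelist is consulted only from inside the EIBRS/AutoIBRS branch, so a
missing NO_EIBRS_PBRSB bit on a family that has neither feature can never set
the bug. A minimal stand-alone C sketch of that flow (the helper names and
the family cutoff are illustrative stand-ins for this sketch, not kernel
APIs, and the ARCH_CAP_PBRSB_NO check is elided):

#include <stdbool.h>
#include <stdio.h>

/*
 * Stand-ins: AMD families 0xf - 0x12 predate both EIBRS and AutoIBRS,
 * and only the FAMILY_ANY AMD/Hygon whitelist entries carry
 * NO_EIBRS_PBRSB.
 */
static bool has_eibrs_or_autoibrs(unsigned int family)
{
	return family > 0x12;
}

static bool whitelist_has_no_eibrs_pbrsb(unsigned int family)
{
	return family > 0x12;
}

/*
 * Mirrors the shape of the quoted init code: the whitelist lookup
 * happens only inside the EIBRS/AutoIBRS branch, so families without
 * either feature can never have X86_BUG_EIBRS_PBRSB forced on.
 */
static bool bug_eibrs_pbrsb(unsigned int family)
{
	if (!has_eibrs_or_autoibrs(family))
		return false;	/* whitelist never consulted */
	return !whitelist_has_no_eibrs_pbrsb(family);
}

int main(void)
{
	for (unsigned int fam = 0x0f; fam <= 0x19; fam++)
		printf("family %#x: X86_BUG_EIBRS_PBRSB = %d\n",
		       fam, bug_eibrs_pbrsb(fam));
	return 0;
}

For every AMD family the predicate comes out false, which is exactly why the
absent NO_EIBRS_PBRSB bit on families 0xf - 0x12 is harmless.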
On Fri, Jun 28, 2024 at 09:09:15AM -0700, Sean Christopherson wrote:
> Hmm, but it looks like that would incorrectly trigger the "lite" flavor for
> families 0xf - 0x12.  I assume those old CPUs aren't affected by whatever on
> earth EIBRS_PBRSB is.

https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/post-barrier-return-stack-buffer-predictions.html

We could add NO_EIBRS_PBRSB to those old families too.

Amit, feel free to send a separate patch.

Thx.
On Fri, 2024-06-28 at 11:48 -0700, Jim Mattson wrote:
> On Fri, Jun 28, 2024 at 9:09 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Jun 26, 2024, Amit Shah wrote:
> > > ---
> > >  arch/x86/kvm/svm/vmenter.S | 8 ++------
> > >  1 file changed, 2 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> > > index a0c8eb37d3e1..2ed80aea3bb1 100644
> > > --- a/arch/x86/kvm/svm/vmenter.S
> > > +++ b/arch/x86/kvm/svm/vmenter.S
> > > @@ -209,10 -208,8 @@ SYM_FUNC_START(__svm_vcpu_run)
> > >  7:	vmload %_ASM_AX
> > >  8:
> > >
> > > -#ifdef CONFIG_MITIGATION_RETPOLINE
> > >  	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
> > > -	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
> > > -#endif
> > > +	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
> >
> > Out of an abundance of paranoia, shouldn't this be?
> >
> > 	FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> > 			   X86_FEATURE_RSB_VMEXIT_LITE
> >
> > Hmm, but it looks like that would incorrectly trigger the "lite" flavor
> > for families 0xf - 0x12.  I assume those old CPUs aren't affected by
> > whatever on earth EIBRS_PBRSB is.
> >
> > 	/* AMD Family 0xf - 0x12 */
> > 	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > 	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > 	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > 	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> >
> > 	/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
> > 	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
> > 	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
>
> Your assumption is correct. As for why the cpu_vuln_whitelist[] doesn't say
> so explicitly, you need to read between the lines...
>
> > 	/*
> > 	 * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
> > 	 * flag and protect from vendor-specific bugs via the whitelist.
> > 	 *
> > 	 * Don't use AutoIBRS when SNP is enabled because it degrades host
> > 	 * userspace indirect branch performance.
> > 	 */
> > 	if ((x86_arch_cap_msr & ARCH_CAP_IBRS_ALL) ||
> > 	    (cpu_has(c, X86_FEATURE_AUTOIBRS) &&
> > 	     !cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
> > 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
> > 		if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
> > 		    !(x86_arch_cap_msr & ARCH_CAP_PBRSB_NO))
> > 			setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
> > 	}
>
> Families 0FH through 12H don't have EIBRS or AutoIBRS, so there's no
> cpu_vuln_whitelist[] lookup. Hence, no need to set the NO_EIBRS_PBRSB bit,
> even if it is accurate.

The commit that adds the RSB_VMEXIT_LITE feature flag does describe the bug
in a good amount of detail:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b1299322016731d56807aa49254a5ea3080b6b3

I've not seen any indication this is required for AMD CPUs.

David, do you agree we don't need this?

		Amit
> -----Original Message-----
> From: Amit Shah <amit@kernel.org>
> Sent: Monday, July 1, 2024 7:52 AM
> To: Jim Mattson <jmattson@google.com>; Sean Christopherson <seanjc@google.com>
> Cc: pbonzini@redhat.com; x86@kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com; bp@alien8.de; dave.hansen@linux.intel.com; hpa@zytor.com; Phillips, Kim <kim.phillips@amd.com>; Kaplan, David <David.Kaplan@amd.com>
> Subject: Re: [PATCH v2] KVM: SVM: let alternatives handle the cases when RSB filling is required
>
> On Fri, 2024-06-28 at 11:48 -0700, Jim Mattson wrote:
> > On Fri, Jun 28, 2024 at 9:09 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Jun 26, 2024, Amit Shah wrote:
> > > > ---
> > > >  arch/x86/kvm/svm/vmenter.S | 8 ++------
> > > >  1 file changed, 2 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> > > > index a0c8eb37d3e1..2ed80aea3bb1 100644
> > > > --- a/arch/x86/kvm/svm/vmenter.S
> > > > +++ b/arch/x86/kvm/svm/vmenter.S
> > > > @@ -209,10 -208,8 @@ SYM_FUNC_START(__svm_vcpu_run)
> > > >  7:	vmload %_ASM_AX
> > > >  8:
> > > >
> > > > -#ifdef CONFIG_MITIGATION_RETPOLINE
> > > >  	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
> > > > -	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
> > > > -#endif
> > > > +	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
> > >
> > > Out of an abundance of paranoia, shouldn't this be?
> > >
> > > 	FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
> > > 			   X86_FEATURE_RSB_VMEXIT_LITE
> > >
> > > Hmm, but it looks like that would incorrectly trigger the "lite" flavor
> > > for families 0xf - 0x12.  I assume those old CPUs aren't affected by
> > > whatever on earth EIBRS_PBRSB is.
> > >
> > > 	/* AMD Family 0xf - 0x12 */
> > > 	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > > 	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > > 	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > > 	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_BHI),
> > >
> > > 	/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
> > > 	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
> > > 	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB | NO_BHI),
> >
> > Your assumption is correct. As for why the cpu_vuln_whitelist[] doesn't
> > say so explicitly, you need to read between the lines...
> >
> > > 	/*
> > > 	 * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
> > > 	 * flag and protect from vendor-specific bugs via the whitelist.
> > > 	 *
> > > 	 * Don't use AutoIBRS when SNP is enabled because it degrades host
> > > 	 * userspace indirect branch performance.
> > > 	 */
> > > 	if ((x86_arch_cap_msr & ARCH_CAP_IBRS_ALL) ||
> > > 	    (cpu_has(c, X86_FEATURE_AUTOIBRS) &&
> > > 	     !cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
> > > 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
> > > 		if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
> > > 		    !(x86_arch_cap_msr & ARCH_CAP_PBRSB_NO))
> > > 			setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
> > > 	}
> >
> > Families 0FH through 12H don't have EIBRS or AutoIBRS, so there's no
> > cpu_vuln_whitelist[] lookup. Hence, no need to set the NO_EIBRS_PBRSB
> > bit, even if it is accurate.
>
> The commit that adds the RSB_VMEXIT_LITE feature flag does describe the
> bug in a good amount of detail:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b1299322016731d56807aa49254a5ea3080b6b3
>
> I've not seen any indication this is required for AMD CPUs.
>
> David, do you agree we don't need this?
>

It's not required, as AMD CPUs don't have the PBRSB issue with AutoIBRS.
Although I think Sean was talking about being extra paranoid

--David Kaplan
On (Mon) 08 Jul 2024 [11:59:45], Sean Christopherson wrote:
> On Mon, Jul 01, 2024, David Kaplan wrote:
> > > > > 	/*
> > > > > 	 * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
> > > > > 	 * flag and protect from vendor-specific bugs via the whitelist.
> > > > > 	 *
> > > > > 	 * Don't use AutoIBRS when SNP is enabled because it degrades host
> > > > > 	 * userspace indirect branch performance.
> > > > > 	 */
> > > > > 	if ((x86_arch_cap_msr & ARCH_CAP_IBRS_ALL) ||
> > > > > 	    (cpu_has(c, X86_FEATURE_AUTOIBRS) &&
> > > > > 	     !cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
> > > > > 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
> > > > > 		if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
> > > > > 		    !(x86_arch_cap_msr & ARCH_CAP_PBRSB_NO))
> > > > > 			setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
> > > > > 	}
> > > >
> > > > Families 0FH through 12H don't have EIBRS or AutoIBRS, so there's no
> > > > cpu_vuln_whitelist[] lookup. Hence, no need to set the NO_EIBRS_PBRSB
> > > > bit, even if it is accurate.
> > >
> > > The commit that adds the RSB_VMEXIT_LITE feature flag does describe the
> > > bug in a good amount of detail:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b1299322016731d56807aa49254a5ea3080b6b3
> > >
> > > I've not seen any indication this is required for AMD CPUs.
> > >
> > > David, do you agree we don't need this?
> > >
> >
> > It's not required, as AMD CPUs don't have the PBRSB issue with AutoIBRS.
> > Although I think Sean was talking about being extra paranoid
>
> Ya.  I'm asking if there's a reason not to tack on X86_FEATURE_RSB_VMEXIT_LITE,
> beyond it effectively being dead code.  There's no runtime cost, and so assuming
> it doesn't get spuriously enabled, I don't see a downside.

Ah - I get it now. You want to add this code for parity with VMX's vmenter.S
so that a future bug like this doesn't happen.

I disagree, though. It's not really dead code - it does get patched at
runtime. If a future AMD CPU has a bug that Intel doesn't, we'll have to
introduce a new ALTERNATIVE just for that condition - leading to more
complexity than is actually required. Also, reviewers of the code will get
confused, wondering why this code for AMD exists when the CPU vuln does not.

I get that we want to write defensive code, but this was a very special
condition that is unlikely to happen in this part of the code, and it was
missed by the devs and the reviewers anyway. The good thing here is that
missing this only leads to suboptimal code, not a security bug.

So given all this, I vote for simplicity of code, rather than tacking on
something extra.

Sound OK?

		Amit
--
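For context on "patched at runtime": the alternatives mechanism rewrites each
mitigation site once at boot, based on CPU feature bits. The sketch below is
a deliberately simplified user-space model of that idea - it is not the
kernel's apply_alternatives(), and the feature numbers, sizes, and byte
values are made up for illustration. Its point is to show why an un-taken
alternative degenerates to NOP padding, with no runtime check left behind:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SITE_LEN		8	/* bytes reserved at each patch site */
#define FEAT_RSB_VMEXIT		0	/* illustrative feature numbers */
#define FEAT_RSB_VMEXIT_LITE	1

static bool cpu_feature[2];		/* pretend feature bits */

struct alt_entry {
	unsigned char *site;			/* code to patch */
	unsigned char replacement[SITE_LEN];	/* mitigation bytes */
	int feature;				/* gating feature bit */
};

/* One-time boot pass: copy in the mitigation only if the feature is set. */
static void apply_alternatives(struct alt_entry *alts, int n)
{
	for (int i = 0; i < n; i++) {
		if (cpu_feature[alts[i].feature])
			memcpy(alts[i].site, alts[i].replacement, SITE_LEN);
		/* else: the site keeps its NOP padding - the "dead"
		 * alternative costs nothing at runtime. */
	}
}

int main(void)
{
	unsigned char site[SITE_LEN];
	memset(site, 0x90, SITE_LEN);	/* 0x90 = NOP padding */

	struct alt_entry lite_alt = {
		.site = site,
		.replacement = { 0xe8, 0, 0, 0, 0, 0x0f, 0xae, 0xe8 },
		.feature = FEAT_RSB_VMEXIT_LITE,
	};

	/* On an AMD host, RSB_VMEXIT_LITE is never set... */
	cpu_feature[FEAT_RSB_VMEXIT_LITE] = false;
	apply_alternatives(&lite_alt, 1);

	/* ...so the site remains pure NOPs: dead bytes, zero runtime cost. */
	printf("first byte at the LITE site: %#x\n", site[0]);
	return 0;
}

This is the crux of the disagreement: Sean reads the un-taken alternative as
free NOPs, Amit reads it as extra patching machinery and reviewer confusion.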
On Mon, Jul 15, 2024, Amit Shah wrote:
> On (Mon) 08 Jul 2024 [11:59:45], Sean Christopherson wrote:
> > On Mon, Jul 01, 2024, David Kaplan wrote:
> > > > > > 	/*
> > > > > > 	 * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
> > > > > > 	 * flag and protect from vendor-specific bugs via the whitelist.
> > > > > > 	 *
> > > > > > 	 * Don't use AutoIBRS when SNP is enabled because it degrades host
> > > > > > 	 * userspace indirect branch performance.
> > > > > > 	 */
> > > > > > 	if ((x86_arch_cap_msr & ARCH_CAP_IBRS_ALL) ||
> > > > > > 	    (cpu_has(c, X86_FEATURE_AUTOIBRS) &&
> > > > > > 	     !cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
> > > > > > 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
> > > > > > 		if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
> > > > > > 		    !(x86_arch_cap_msr & ARCH_CAP_PBRSB_NO))
> > > > > > 			setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
> > > > > > 	}
> > > > >
> > > > > Families 0FH through 12H don't have EIBRS or AutoIBRS, so there's no
> > > > > cpu_vuln_whitelist[] lookup. Hence, no need to set the NO_EIBRS_PBRSB
> > > > > bit, even if it is accurate.
> > > >
> > > > The commit that adds the RSB_VMEXIT_LITE feature flag does describe the
> > > > bug in a good amount of detail:
> > > >
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b1299322016731d56807aa49254a5ea3080b6b3
> > > >
> > > > I've not seen any indication this is required for AMD CPUs.
> > > >
> > > > David, do you agree we don't need this?
> > > >
> > >
> > > It's not required, as AMD CPUs don't have the PBRSB issue with AutoIBRS.
> > > Although I think Sean was talking about being extra paranoid
> >
> > Ya.  I'm asking if there's a reason not to tack on X86_FEATURE_RSB_VMEXIT_LITE,
> > beyond it effectively being dead code.  There's no runtime cost, and so assuming
> > it doesn't get spuriously enabled, I don't see a downside.
>
> Ah - I get it now. You want to add this code for parity with VMX's
> vmenter.S so that a future bug like this doesn't happen.
>
> I disagree, though. It's not really dead code - it does get patched at
> runtime.

Eh, we're splitting hairs over what's dead code where.

> If a future AMD CPU has a bug that Intel doesn't, we'll have to introduce a
> new ALTERNATIVE just for that condition - leading to more complexity than
> is actually required.

If and only if the bug was mitigated by FILL_RETURN_BUFFER.  And if we needed
to extend FILL_RETURN_BUFFER, then we'd need a new alternative regardless of
whether or not KVM SVM honored RSB_VMEXIT_LITE.

If the hypothetical AMD bug is fixed by a single stuffed return, then the
kernel would simply force set RSB_VMEXIT_LITE as appropriate.  If the bug
requires a mitigation somewhere between RSB_CLEAR_LOOPS and 1, we'd need to
add more code somewhere.

> Also, reviewers of the code will get confused, wondering why this code for
> AMD exists when the CPU vuln does not.
>
> I get that we want to write defensive code, but this was a very special
> condition that is unlikely to happen in this part of the code, and it was
> missed by the devs and the reviewers anyway.

Defensive code is only part of it, and a minor part at that.  The main "issue"
is having divergent VM-Enter/VM-Exit code for Intel vs. AMD.  To those of us
that care primarily about virtualization and are only passingly familiar with
the myriad speculation bugs and mitigations, omitting RSB_VMEXIT_LITE _looks_
wrong.

To know that the omission is correct, one has to suss out that it's (supposed
to be) impossible for RSB_VMEXIT_LITE to be set on AMD.  And as a KVM person,
that's a detail I don't want to care about.

FWIW, I feel the same way about all the other post-VM-Exit mitigations, they
just don't stand out in the same way because the entire mitigation sequence is
absent on one vendor or the other, i.e. they don't look wrong at first glance.
But if KVM could have a mostly unified VM-Enter => VM-Exit assembly code, I
would happily eat a dead NOP/JMP or three.  Now that I look at it, that
actually seems very doable...

> The good thing here is that missing this only leads to suboptimal code, not
> a security bug.

I don't think we can guarantee that.  Obviously this is all speculative
(lolz), but AFAICT, X86_FEATURE_RSB_VMEXIT_LITE doesn't imply
X86_FEATURE_RSB_VMEXIT.

> So given all this, I vote for simplicity of code, rather than tacking on
> something extra.
>
> Sound OK?
>
> 		Amit
> --
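The "doesn't imply" concern is easiest to see if the two flags are treated as
independent inputs. A hedged sketch (this models the behavior of
FILL_RETURN_BUFFER's two-feature form rather than copying it; the flag
combinations are set by hand here, whereas the real ones are chosen by the
mitigation selection code in bugs.c):

#include <stdbool.h>
#include <stdio.h>

/*
 * Model the VM-exit RSB decision as a function of two independent
 * feature bits. The key state is (lite && !full): e.g. an eIBRS part
 * that needs only one stuffed return. Code that keys only on the
 * "full" bit would emit nothing at all for such a CPU.
 */
enum rsb_action { RSB_NONE, RSB_ONE_RET, RSB_FULL_LOOP };

static enum rsb_action vmexit_rsb(bool full, bool lite)
{
	if (full)
		return RSB_FULL_LOOP;	/* RSB_CLEAR_LOOPS iterations */
	if (lite)
		return RSB_ONE_RET;	/* single stuffed return */
	return RSB_NONE;
}

int main(void)
{
	static const char *name[] = { "none", "one-ret", "full-loop" };

	for (int full = 0; full <= 1; full++)
		for (int lite = 0; lite <= 1; lite++)
			printf("full=%d lite=%d -> %s\n",
			       full, lite, name[vmexit_rsb(full, lite)]);
	return 0;
}

The full=0, lite=1 row is exactly the case Sean is worried about: if SVM's
assembly tested only the full flag, a hypothetical future CPU in that state
would get no RSB stuffing at all.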
On Tue, 2024-07-16 at 12:10 -0700, Sean Christopherson wrote:
> On Mon, Jul 15, 2024, Amit Shah wrote:
> > On (Mon) 08 Jul 2024 [11:59:45], Sean Christopherson wrote:
> > > On Mon, Jul 01, 2024, David Kaplan wrote:

(snipped to what is now emerging as the core of the discussion)

> > Also, reviewers of the code will get confused, wondering why this code
> > for AMD exists when the CPU vuln does not.
> >
> > I get that we want to write defensive code, but this was a very special
> > condition that is unlikely to happen in this part of the code, and it
> > was missed by the devs and the reviewers anyway.
>
> Defensive code is only part of it, and a minor part at that.  The main
> "issue" is having divergent VM-Enter/VM-Exit code for Intel vs. AMD.  To
> those of us that care primarily about virtualization and are only passingly
> familiar with the myriad speculation bugs and mitigations, omitting
> RSB_VMEXIT_LITE _looks_ wrong.
>
> To know that the omission is correct, one has to suss out that it's
> (supposed to be) impossible for RSB_VMEXIT_LITE to be set on AMD.  And as
> a KVM person, that's a detail I don't want to care about.

OK - I get that. Cognitive overload is a real thing, and the less of it, the
better.

Since this isn't a discussion about any AMD bug or implementation detail, but
rather about uniformity in KVM code across CPU implementations from different
vendors, I'd prefer someone else code up the patch that adds that uniformity.
I don't have an objection to it.

I can, of course, offer a comment in this hunk that says AMD does not have
the bug that necessitates VMEXIT_LITE, and that should help in the meantime.

You've not queued this patch yet, right? Do you think it's better I do a v3
with this comment update?

> FWIW, I feel the same way about all the other post-VM-Exit mitigations,
> they just don't stand out in the same way because the entire mitigation
> sequence is absent on one vendor or the other, i.e. they don't look wrong
> at first glance.  But if KVM could have a mostly unified VM-Enter =>
> VM-Exit assembly code, I would happily eat a dead NOP/JMP or three.  Now
> that I look at it, that actually seems very doable...

Sure. I think part of the fallacy there is treating VMX and SVM as similar
(while not treating the Arm side as similar). They are different
implementations with several overlapping details - but it's perilous to think
everything maps the same across vendors.

		Amit
On Mon, Jul 22, 2024, Amit Shah wrote:
> On Tue, 2024-07-16 at 12:10 -0700, Sean Christopherson wrote:
> > FWIW, I feel the same way about all the other post-VM-Exit mitigations,
> > they just don't stand out in the same way because the entire mitigation
> > sequence is absent on one vendor or the other, i.e. they don't look wrong
> > at first glance.  But if KVM could have a mostly unified VM-Enter =>
> > VM-Exit assembly code, I would happily eat a dead NOP/JMP or three.  Now
> > that I look at it, that actually seems very doable...
>
> Sure. I think part of the fallacy there is treating VMX and SVM as similar
> (while not treating the Arm side as similar).

Bringing ARM into the picture is little more than whataboutism.  KVM x86 and
KVM arm64 _can't_ share assembly.  They _can't_ share things like MSR
interception tracking because MSRs are 100% an x86-only concept.  The fact
that sharing code across x86 and ARM is challenging doesn't have any bearing
on whether or not VMX and SVM can/should share code.

> They are different implementations with several overlapping details - but
> it's perilous to think everything maps the same across vendors.

I never said everything maps the same.  The point I am trying to make is that
there is significant value _for KVM_ in having common code between
architectures, and between vendors within an architecture.  I can provide
numerous examples where something was implemented/fixed in vendor/arch code,
and then later it was discovered that the feature/fix was also wanted/needed
in other vendor/arch code.
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index a0c8eb37d3e1..2ed80aea3bb1 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -209,10 +209,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 7:	vmload %_ASM_AX
 8:

-#ifdef CONFIG_MITIGATION_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
-	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
-#endif
+	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT

 	/* Clobbers RAX, RCX, RDX. */
 	RESTORE_HOST_SPEC_CTRL
@@ -348,10 +346,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)

 2:	cli

-#ifdef CONFIG_MITIGATION_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
-	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
-#endif
+	FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT

 	/* Clobbers RAX, RCX, RDX, consumes RDI (@svm) and RSI (@spec_ctrl_intercepted). */
 	RESTORE_HOST_SPEC_CTRL
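For readers new to the mitigation itself: the RSB is a small microarchitectural
ring of predicted return targets, and "stuffing" it after VM-exit overwrites
anything the guest may have trained into it before the host executes its first
RET. A toy C model of the concept - the depth of 32 matches the kernel's
RSB_CLEAR_LOOPS constant, but the ring, the addresses, and everything else
here are illustrative, not the real FILL_RETURN_BUFFER expansion:

#include <stdio.h>

#define RSB_DEPTH	32	/* typical RSB size; assumption for this model */
#define RSB_CLEAR_LOOPS	32	/* mirrors the kernel constant of this name */

/* Toy model: the RSB as a ring of predicted return targets. */
static unsigned long rsb[RSB_DEPTH];
static unsigned int top;

static void rsb_push(unsigned long target)	/* what a CALL does */
{
	top = (top + 1) % RSB_DEPTH;
	rsb[top] = target;
}

/*
 * FILL_RETURN_BUFFER's job, conceptually: execute RSB_CLEAR_LOOPS dummy
 * CALLs so that every entry the guest may have planted is overwritten
 * with a benign, host-controlled target.
 */
static void stuff_rsb(void)
{
	for (int i = 0; i < RSB_CLEAR_LOOPS; i++)
		rsb_push(0xdead0000ul + i);	/* benign trap target */
}

int main(void)
{
	for (int i = 0; i < RSB_DEPTH; i++)
		rsb_push(0x41414141ul);		/* "guest-poisoned" entries */

	stuff_rsb();

	for (int i = 0; i < RSB_DEPTH; i++)
		if (rsb[i] == 0x41414141ul)
			puts("poisoned entry survived!");	/* never prints */
	puts("all RSB entries overwritten");
	return 0;
}

The patch's point is that the stuffing itself stays exactly as it was; only
the condition for patching it in moves from a compile-time #ifdef to the
X86_FEATURE_RSB_VMEXIT alternative, so kernels built without retpolines still
get the VM-exit RSB fill when the CPU needs it.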