
x86/bugs: Adjust SRSO mitigation to new features

Message ID 20241104101543.31885-1-bp@kernel.org (mailing list archive)
State New
Series x86/bugs: Adjust SRSO mitigation to new features

Commit Message

Borislav Petkov Nov. 4, 2024, 10:15 a.m. UTC
From: "Borislav Petkov (AMD)" <bp@alien8.de>

If the machine has:

  CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1, it
  indicates the CPU is not subject to the SRSO vulnerability across
  user/kernel boundaries.

have it fall back to IBPB on VMEXIT only, in case it is going to run
VMs:

  Speculative Return Stack Overflow: CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT
  Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only

Then, upon KVM module load and in case the machine has

  CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it indicates
  that software may use MSR BP_CFG[BpSpecReduce] to mitigate SRSO.

enable this BpSpecReduce bit to mitigate SRSO across guest/host
boundaries.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
 arch/x86/include/asm/cpufeatures.h |  2 ++
 arch/x86/include/asm/msr-index.h   |  1 +
 arch/x86/kernel/cpu/bugs.c         | 16 +++++++++++++++-
 arch/x86/kernel/cpu/common.c       |  1 +
 arch/x86/kvm/cpuid.c               |  1 +
 arch/x86/kvm/svm/svm.c             |  6 ++++++
 arch/x86/lib/msr.c                 |  2 ++
 7 files changed, 28 insertions(+), 1 deletion(-)
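
For readers following along, a quick way to check the two CPUID bits named in
the commit message on a given box -- an illustrative user-space probe using
GCC/clang's <cpuid.h> helper, not part of the patch:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when leaf 0x80000021 is unsupported. */
	if (!__get_cpuid(0x80000021, &eax, &ebx, &ecx, &edx))
		return 1;

	printf("SRSO_USER_KERNEL_NO (EAX[30]): %u\n", (eax >> 30) & 1);
	printf("SRSO_MSR_FIX        (EAX[31]): %u\n", (eax >> 31) & 1);
	return 0;
}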

Comments

Sean Christopherson Nov. 5, 2024, 12:57 a.m. UTC | #1
scripts/get_maintainer.pl :-)

On Mon, Nov 04, 2024, Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> 
> If the machine has:
> 
>   CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1, it
>   indicates the CPU is not subject to the SRSO vulnerability across
>   user/kernel boundaries.
> 
> have it fall back to IBPB on VMEXIT only, in case it is going to run
> VMs:
> 
>   Speculative Return Stack Overflow: CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT
>   Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only
> 
> Then, upon KVM module load

It's not strictly KVM module load, it's when KVM enables virtualization.  E.g.
if userspace clears enable_virt_at_load, the MSR will be toggled every time the
number of VMs goes from 0=>1 and 1=>0.

But why do this in KVM?  E.g. why not set-and-forget in init_amd_zen4()?
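
A minimal sketch of the refcounting being described here, loosely modeled on
virt/kvm/kvm_main.c -- names and structure are illustrative, not verbatim:

static DEFINE_MUTEX(virt_lock);		/* illustrative lock name */
static int kvm_usage_count;

static int kvm_enable_virtualization(void)
{
	mutex_lock(&virt_lock);

	/* 1=>2, 2=>3, ...: virtualization already enabled, only bump the count. */
	if (!kvm_usage_count++) {
		/*
		 * 0=>1: enable virtualization on every CPU; on AMD this reaches
		 * svm_enable_virtualization_cpu(), where this patch sets
		 * BpSpecReduce.  The 1=>0 teardown mirrors it via
		 * svm_disable_virtualization_cpu().
		 */
		on_each_cpu(enable_virtualization_cpu, NULL, 1);
	}

	mutex_unlock(&virt_lock);
	return 0;
}

With enable_virt_at_load=1 (the default) the 0=>1 transition happens once at
module load; with it cleared, on every first-VM creation and last-VM teardown.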

> and in case the machine has
> 
>   CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it indicates
>   that software may use MSR BP_CFG[BpSpecReduce] to mitigate SRSO.
> 
> enable this BpSpecReduce bit to mitigate SRSO across guest/host
> boundaries.

Shouldn't these be two separate patches?  AFAICT, while the two are related, there
are no strict dependencies between SRSO_USER_KERNEL_NO and SRSO_MSR_FIX.

> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> ---

...

> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 41786b834b16..d54cd67c8c50 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -799,6 +799,7 @@ void kvm_set_cpu_caps(void)
>  
>  	kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
>  	kvm_cpu_cap_check_and_set(X86_FEATURE_IBPB_BRTYPE);
> +	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_USER_KERNEL_NO);

If the expectation is that X86_FEATURE_SRSO_USER_KERNEL_NO will only ever come
from hardware, i.e. won't be force-set by the kernel, then I would prefer to set
the bit in the "standard" way

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 41786b834b16..eb65336c2168 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -794,7 +794,7 @@ void kvm_set_cpu_caps(void)
        kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
                F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ |
                F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
-               F(WRMSR_XX_BASE_NS)
+               F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO)
        );
 
        kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);

The kvm_cpu_cap_check_and_set() trickery is necessary only for features that are
force-set by the kernel, in order to avoid kvm_cpu_cap_mask()'s masking of the
features by actual CPUID.  I'm trying to clean things up to make that more obvious;
hopefully that'll land in 6.14[*].
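
The distinction, condensed into a sketch (based on arch/x86/kvm/cpuid.c at the
time; simplified, not verbatim):

static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
{
	/* Restrict to the feature set KVM is willing to advertise... */
	kvm_cpu_caps[leaf] &= mask;
	/* ...then AND with raw host CPUID, which strips any bit the kernel
	 * force-set in boot_cpu_data but hardware does not enumerate. */
	__kvm_cpu_cap_mask(leaf);
}

static __always_inline void kvm_cpu_cap_check_and_set(unsigned int x86_feature)
{
	/* Consults boot_cpu_has() directly, so kernel force-set bits survive. */
	if (boot_cpu_has(x86_feature))
		kvm_cpu_cap_set(x86_feature);
}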

And advertising X86_FEATURE_SRSO_USER_KERNEL_NO should also be a separate patch,
no?  I.e. 

 1. Use SRSO_USER_KERNEL_NO in the host
 2. Update KVM to advertise SRSO_USER_KERNEL_NO to userspace, i.e. let userspace
    know that it can be enumerated to the guest.
 3. Add support for SRSO_MSR_FIX.

[*] https://lore.kernel.org/all/20240517173926.965351-49-seanjc@google.com

>  	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
>  
>  	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9df3e1e5ae81..03f29912a638 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
>  	kvm_cpu_svm_disable();
>  
>  	amd_pmu_disable_virt();
> +
> +	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> +		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);

I don't like assuming the state of hardware.  E.g. if MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT
was already set, then KVM shouldn't clear it.  KVM's usual method of restoring
host MSRs is to snapshot the MSR into "struct kvm_host_values" on module load,
and then restore from there as needed.  But that assumes all CPUs have the same
value, which might not be the case here?
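
The pattern in sketch form ("struct kvm_host_values" and "kvm_host" exist in
arch/x86/kvm/x86.c; the fields are abridged and the bp_cfg one is hypothetical):

struct kvm_host_values {
	u64 efer;
	u64 xcr0;
	u64 xss;
	u64 arch_capabilities;
	u64 bp_cfg;	/* hypothetical snapshot of MSR_ZEN4_BP_CFG */
};

/* On module load: snapshot the host value once... */
rdmsrl(MSR_ZEN4_BP_CFG, kvm_host.bp_cfg);

/* ...and on virt disable: restore the snapshot instead of blindly clearing. */
wrmsrl(MSR_ZEN4_BP_CFG, kvm_host.bp_cfg);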

All that said, I'd still prefer that MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT is set
during boot, unless there's a good reason not to do so.
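
The set-and-forget variant Sean suggests would amount to something like the
following sketch (init_amd_zen4() exists in arch/x86/kernel/cpu/amd.c; the
hunk is illustrative, not a tested patch):

static void init_amd_zen4(struct cpuinfo_x86 *c)
{
	/* ... existing Zen4 setup ... */

	/* Set BpSpecReduce once per CPU at boot, whether or not VMs ever run. */
	if (cpu_has(c, X86_FEATURE_SRSO_MSR_FIX))
		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}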
Borislav Petkov Nov. 5, 2024, 12:34 p.m. UTC | #2
On Mon, Nov 04, 2024 at 04:57:20PM -0800, Sean Christopherson wrote:
> scripts/get_maintainer.pl :-)

That's what I used but I pruned the list.

Why, did I miss anyone?
 
> It's not strictly KVM module load, it's when KVM enables virtualization.

Yeah, the KVM CPU hotplug callback.

> E.g. if userspace clears enable_virt_at_load,

/me reads the documentation on that...

Interesting :-)

Put all the work possible into module load so that VM startup overhead is minimal.

> the MSR will be toggled every time the number of VMs goes from 0=>1 and
> 1=>0.

I guess that's fine. 

> But why do this in KVM?  E.g. why not set-and-forget in init_amd_zen4()?

Because there's no need to impose an unnecessary - albeit small - perf impact
on users who don't do virt.

I'm currently gravitating towards the MSR toggling thing, i.e., only when the
VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
yes, the toggling thing sounds better. I.e., set it only when really needed.

> Shouldn't these be two separate patches?  AFAICT, while the two are related,
> there are no strict dependencies between SRSO_USER_KERNEL_NO and
> SRSO_MSR_FIX.

Meh, I can split them if you really want me to.

> If the expectation is that X86_FEATURE_SRSO_USER_KERNEL_NO will only ever come
> from hardware, i.e. won't be force-set by the kernel, then I would prefer to set
> the bit in the "standard" way
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 41786b834b16..eb65336c2168 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -794,7 +794,7 @@ void kvm_set_cpu_caps(void)
>         kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
>                 F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ |
>                 F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
> -               F(WRMSR_XX_BASE_NS)
> +               F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO)

Ok, sure, ofc.

>         );
>  
>         kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
> 
> The kvm_cpu_cap_check_and_set() trickery is necessary only for features that are
> force-set by the kernel, in order to avoid kvm_cpu_cap_mask()'s masking of the
> features by actual CPUID.  I'm trying to clean things up to make that more obvious;
> hopefully that'll land in 6.14[*].

Oh please. It took me a while to figure out what each *cap* function is for, so
yeah, a cleanup would be nice there.

> And advertising X86_FEATURE_SRSO_USER_KERNEL_NO should also be a separate patch,
> no?  I.e. 
> 
>  1. Use SRSO_USER_KERNEL_NO in the host
>  2. Update KVM to advertise SRSO_USER_KERNEL_NO to userspace, i.e. let userspace
>     know that it can be enumerated to the guest.
>  3. Add support for SRSO_MSR_FIX.

Sure, I can split. I'm lazy and all but ok... :-P

> [*] https://lore.kernel.org/all/20240517173926.965351-49-seanjc@google.com

Cool.

> 
> >  	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
> >  
> >  	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 9df3e1e5ae81..03f29912a638 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
> >  	kvm_cpu_svm_disable();
> >  
> >  	amd_pmu_disable_virt();
> > +
> > +	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> > +		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> 
> I don't like assuming the state of hardware.  E.g. if MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT
> was already set, then KVM shouldn't clear it.

Right, I don't see that happening tho. If we have to sync the toggling of this
bit between different places, we'll have to do some dance but so far its only
user is KVM.

> KVM's usual method of restoring host MSRs is to snapshot the MSR into
> "struct kvm_host_values" on module load, and then restore from there as
> needed.  But that assumes all CPUs have the same value, which might not be
> the case here?

Yes, the default value is 0 out of reset and it should be set on each logical
CPU whenever we run VMs on it. I'd love to make it part of the VMRUN microcode
but... :-)

> All that said, I'd still prefer that MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT is set
> during boot, unless there's a good reason not to do so.

Yeah, unnecessary penalty on machines not running virt.

Thx.
Sean Christopherson Nov. 5, 2024, 6:10 p.m. UTC | #3
On Tue, Nov 05, 2024, Borislav Petkov wrote:
> On Mon, Nov 04, 2024 at 04:57:20PM -0800, Sean Christopherson wrote:
> > scripts/get_maintainer.pl :-)
> 
> That's what I used but I pruned the list.
> 
> Why, did I miss anyone?

All of the actual maintainers.  AFAIK, Paolo doesn't subscribe to kvm@.

> > But why do this in KVM?  E.g. why not set-and-forget in init_amd_zen4()?
> 
> Because there's no need to impose an unnecessary - albeit small - perf impact
> on users who don't do virt.
> 
> I'm currently gravitating towards the MSR toggling thing, i.e., only when the
> VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
> yes, the toggling thing sounds better. I.e., set it only when really needed.
> 
> > Shouldn't these be two separate patches?  AFAICT, while the two are related,
> > there are no strict dependencies between SRSO_USER_KERNEL_NO and
> > SRSO_MSR_FIX.
> 
> Meh, I can split them if you really want me to.

I do.

> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index 9df3e1e5ae81..03f29912a638 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
> > >  	kvm_cpu_svm_disable();
> > >  
> > >  	amd_pmu_disable_virt();
> > > +
> > > +	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> > > +		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> > 
> > I don't like assuming the state of hardware.  E.g. if MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT
> > was already set, then KVM shouldn't clear it.
> 
> Right, I don't see that happening tho. If we have to sync the toggling of this
> bit between different places, we'll have to do some dance but so far its only
> user is KVM.
> 
> > KVM's usual method of restoring host MSRs is to snapshot the MSR into
> > "struct kvm_host_values" on module load, and then restore from there as
> > needed.  But that assumes all CPUs have the same value, which might not be
> > the case here?
> 
> Yes, the default value is 0 out of reset and it should be set on each logical
> CPU whenever we run VMs on it. I'd love to make it part of the VMRUN microcode
> but... :-)
> 
> > All that said, I'd still prefer that MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT is set
> > during boot, unless there's a good reason not to do so.
> 
> Yeah, unnecessary penalty on machines not running virt.

What does the bit actually do?  I can't find any useful documentation, and the
changelog is equally useless.
Borislav Petkov Nov. 5, 2024, 6:56 p.m. UTC | #4
On Tue, Nov 05, 2024 at 10:10:20AM -0800, Sean Christopherson wrote:
> All of the actual maintainers.

Which maintainers do you mean? tip ones? If so, they're all shorted to
x86@kernel.org.

> AFAIK, Paolo doesn't subscribe to kvm@.

Oh boy, srsly?! I thought I'd reach the proper crowd with
kvm@vger.kernel.org...

> > Meh, I can split them if you really want me to.
> 
> I do.

Sure, next revision.

> What does the bit actually do?  I can't find any useful documentation, and the
> changelog is equally useless.


"Processors which set SRSO_MSR_FIX=1 support an MSR bit which mitigates SRSO
across guest/host boundaries. Software may enable this by setting bit
4 (BpSpecReduce) of MSR C001_102E. This bit can be set once during boot and
should be set identically across all processors in the system."

From: https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf

I think that's the only public info we have on that bit.

Thx.
Sean Christopherson Nov. 5, 2024, 7:18 p.m. UTC | #5
On Tue, Nov 05, 2024, Borislav Petkov wrote:
> On Tue, Nov 05, 2024 at 10:10:20AM -0800, Sean Christopherson wrote:
> > All of the actual maintainers.
> 
> Which maintainers do you mean? tip ones? If so, they're all shorted to
> x86@kernel.org.
> 
> > AFAIK, Paolo doesn't subscribe to kvm@.
> 
> Oh boy, srsly?! I thought I'd reach the proper crowd with
> kvm@vger.kernel.org...

It gets there, usually (as evidenced by my response).  But even for me, there's
a non-zero chance I'll miss something that's only Cc'd to kvm@, largely because
kvm@ is used by all things virt, i.e. it's a bit noisy:

$ git grep kvm@ MAINTAINERS | wc -l
29

> > What does the bit actually do?  I can't find any useful documentation, and the
> > changelog is equally useless.
 
> 
> "Processors which set SRSO_MSR_FIX=1 support an MSR bit which mitigates SRSO
> across guest/host boundaries. Software may enable this by setting bit
> 4 (BpSpecReduce) of MSR C001_102E. This bit can be set once during boot and
> should be set identically across all processors in the system."
> 
> From: https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
> 
> I think that's the only public info we have on that bit.

Heh, I found that.  Not very helpful.

If you can't document the specifics, can you at least describe the performance
implications?  It's practically impossible to give meaningful feedback without
having any idea what the magic bit does.
Borislav Petkov Nov. 5, 2024, 7:24 p.m. UTC | #6
On Tue, Nov 05, 2024 at 11:18:16AM -0800, Sean Christopherson wrote:
> It gets there, usually (as evidenced by my response).  But even for me, there's
> a non-zero chance I'll miss something that's only Cc'd to kvm@, largely because
> kvm@ is used by all things virt, i.e. it's a bit noisy:
> 
> $ git grep kvm@ MAINTAINERS | wc -l
> 29

Hm, ok, so what do you guys prefer to be CCed on? Everyone from
get_maintainer.pl's output? commit signers, authors, everyone? Or?

> Heh, I found that.  Not very helpful.
> 
> If you can't document the specifics, can you at least describe the performance
> implications?  It's practically impossible to give meaningful feedback without
> having any idea what the magic bit does.

Lemme see what I can get clearance on...

Stay tuned.
Sean Christopherson Nov. 6, 2024, 3:20 p.m. UTC | #7
On Tue, Nov 05, 2024, Borislav Petkov wrote:
> On Tue, Nov 05, 2024 at 11:18:16AM -0800, Sean Christopherson wrote:
> > It gets there, usually (as evidenced by my response).  But even for me, there's
> > a non-zero chance I'll miss something that's only Cc'd to kvm@, largely because
> > kvm@ is used by all things virt, i.e. it's a bit noisy:
> > 
> > $ git grep kvm@ MAINTAINERS | wc -l
> > 29
> 
> Hm, ok, so what do you guys prefer to be CCed on? Everyone from
> get_maintainer.pl's output? commit signers, authors, everyone? Or?

I prefer to be To:/Cc:'d on any patches that touch files that are covered by
relevant MAINTAINERS entries.  IMO, pulling names/emails from git is useless noise
the vast majority of the time.
Borislav Petkov Nov. 6, 2024, 3:29 p.m. UTC | #8
On Wed, Nov 06, 2024 at 07:20:34AM -0800, Sean Christopherson wrote:
> I prefer to be To:/Cc:'d on any patches that touch files that are covered by
> relevant MAINTAINERS entries.  IMO, pulling names/emails from git is useless noise
> the vast majority of the time.

Huh, that's what I did!

Please run this patch through get_maintainer.pl and tell me who else I should
have CCed.
Sean Christopherson Nov. 6, 2024, 3:35 p.m. UTC | #9
On Wed, Nov 06, 2024, Borislav Petkov wrote:
> On Wed, Nov 06, 2024 at 07:20:34AM -0800, Sean Christopherson wrote:
> > I prefer to be To:/Cc:'d on any patches that touch files that are covered by
> > relevant MAINTAINERS entries.  IMO, pulling names/emails from git is useless noise
> > the vast majority of the time.
> 
> Huh, that's what I did!

You didn't though.  The original mail Cc'd kvm@, but neither Paolo nor I.

> Please run this patch through get_maintainer.pl and tell me who else I should
> have CCed.

  $ ./scripts/get_maintainer.pl --nogit --nogit-fallback --norolestats --nofixes -- <patch>
  Thomas Gleixner <tglx@linutronix.de>
  Ingo Molnar <mingo@redhat.com>
  Borislav Petkov <bp@alien8.de>
  Dave Hansen <dave.hansen@linux.intel.com>
  x86@kernel.org
  "H. Peter Anvin" <hpa@zytor.com>
  Peter Zijlstra <peterz@infradead.org>
  Josh Poimboeuf <jpoimboe@kernel.org>
  Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
  Sean Christopherson <seanjc@google.com>
  Paolo Bonzini <pbonzini@redhat.com>
  linux-kernel@vger.kernel.org
  kvm@vger.kernel.org

Versus the actual To + Cc:

  X86 ML <x86@kernel.org>
  Josh Poimboeuf <jpoimboe@redhat.com>,
  Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
  kvm@vger.kernel.org,
  LKML <linux-kernel@vger.kernel.org>,
  "Borislav Petkov (AMD)" <bp@alien8.de>
Borislav Petkov Nov. 6, 2024, 4:13 p.m. UTC | #10
On Wed, Nov 06, 2024 at 07:35:15AM -0800, Sean Christopherson wrote:
> You didn't though.  The original mail Cc'd kvm@, but neither Paolo nor I.

I think we established that tho - I didn't know that kvm@ doesn't CC you guys.
Probably should document that somewhere.

Uff, lemme try again. As already explained:

>   $ ./scripts/get_maintainer.pl --nogit --nogit-fallback --norolestats --nofixes -- <patch>
>   Thomas Gleixner <tglx@linutronix.de>
>   Ingo Molnar <mingo@redhat.com>
>   Borislav Petkov <bp@alien8.de>
>   Dave Hansen <dave.hansen@linux.intel.com>
>   "H. Peter Anvin" <hpa@zytor.com>
>   Peter Zijlstra <peterz@infradead.org>

those above are behind the mail alias x86@kernel.org so no need to CC each and
every one of them.

>   Josh Poimboeuf <jpoimboe@kernel.org>
>   Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

Those folks are CCed as per MAINTAINERS as they wanted to look at bugs.c
tragedies.

>   Sean Christopherson <seanjc@google.com>
>   Paolo Bonzini <pbonzini@redhat.com>
>   linux-kernel@vger.kernel.org

LKML is CCed.

>   kvm@vger.kernel.org

I hope that clarifies the situation.

Thx.
Sean Christopherson Nov. 6, 2024, 4:17 p.m. UTC | #11
On Wed, Nov 06, 2024, Borislav Petkov wrote:
> On Wed, Nov 06, 2024 at 07:35:15AM -0800, Sean Christopherson wrote:
> > You didn't though.  The original mail Cc'd kvm@, but neither Paolo nor I.
> 
> I think we established that tho - I didn't know that kvm@ doesn't CC you guys.

I do subscribe to kvm@, but it's a mailing list, not an alias like x86@.  AFAIK,
x86@ is unique in that regard.  In other words, I don't see a need to document
the kvm@ behavior, because that's the behavior for every L: entry in MAINTAINERS
except the few "L:	x86@kernel.org" cases.
Borislav Petkov Nov. 6, 2024, 4:25 p.m. UTC | #12
On Wed, Nov 06, 2024 at 08:17:38AM -0800, Sean Christopherson wrote:
> I do subscribe to kvm@, but it's a mailing list, not an alias like x86@.  AFAIK,
> x86@ is unique in that regard.  In other words, I don't see a need to document
> the kvm@ behavior, because that's the behavior for every L: entry in MAINTAINERS
> except the few "L:	x86@kernel.org" cases.

I have had maintainers in the past not like to be CCed directly as long as the
corresponding mailing list is CCed. You guys want to be CCed. I will try to
remember that. We have different trees, with different requirements, wishes,
idiosyncrasies and so on. And you know that very well.

So document it or not - your call.
Sean Christopherson Nov. 6, 2024, 4:27 p.m. UTC | #13
On Wed, Nov 06, 2024, Borislav Petkov wrote:
> On Wed, Nov 06, 2024 at 08:17:38AM -0800, Sean Christopherson wrote:
> > I do subscribe to kvm@, but it's a mailing list, not an alias like x86@.  AFAIK,
> > x86@ is unique in that regard.  In other words, I don't see a need to document
> > the kvm@ behavior, because that's the behavior for every L: entry in MAINTAINERS
> > except the few "L:	x86@kernel.org" cases.
> 
> I have had maintainers in the past not like to be CCed directly as long as the
> corresponding mailing list is CCed.

LOL, doesn't that kind of defeat the purpose of MAINTAINERS?
Borislav Petkov Nov. 6, 2024, 4:39 p.m. UTC | #14
On Wed, Nov 06, 2024 at 08:27:47AM -0800, Sean Christopherson wrote:
> LOL, doesn't that kind of defeat the purpose of MAINTAINERS?

You're new here, right? :-P :-P

And you sound like everyone is supposed to unanimously adhere to the rules
we've all agreed upon. Oh well, I will make sure to point you to such
"exceptions" I come across in the future.

And no, I'm better off simply doing those small "exceptions" instead of
dropping into the fruitless abyss of endless discussions which will cause me
nothing but white hair so...

:-)

Patch

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 924f530129d7..9d71f06e09a4 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -462,6 +462,8 @@ 
 #define X86_FEATURE_SBPB		(20*32+27) /* Selective Branch Prediction Barrier */
 #define X86_FEATURE_IBPB_BRTYPE		(20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
 #define X86_FEATURE_SRSO_NO		(20*32+29) /* CPU is not affected by SRSO */
+#define X86_FEATURE_SRSO_USER_KERNEL_NO	(20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_MSR_FIX	(20*32+31) /* MSR BP_CFG[BpSpecReduce] can be used to mitigate SRSO */
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..1372a569fb58 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -717,6 +717,7 @@ 
 
 /* Zen4 */
 #define MSR_ZEN4_BP_CFG                 0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
 #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
 
 /* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 83b34a522dd7..5dffd1e679da 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2536,6 +2536,7 @@  enum srso_mitigation {
 	SRSO_MITIGATION_SAFE_RET,
 	SRSO_MITIGATION_IBPB,
 	SRSO_MITIGATION_IBPB_ON_VMEXIT,
+	SRSO_MITIGATION_BP_SPEC_REDUCE,
 };
 
 enum srso_mitigation_cmd {
@@ -2553,7 +2554,8 @@  static const char * const srso_strings[] = {
 	[SRSO_MITIGATION_MICROCODE]		= "Vulnerable: Microcode, no safe RET",
 	[SRSO_MITIGATION_SAFE_RET]		= "Mitigation: Safe RET",
 	[SRSO_MITIGATION_IBPB]			= "Mitigation: IBPB",
-	[SRSO_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT only"
+	[SRSO_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT only",
+	[SRSO_MITIGATION_BP_SPEC_REDUCE]	= "Mitigation: Reduced Speculation"
 };
 
 static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2628,6 +2630,11 @@  static void __init srso_select_mitigation(void)
 		break;
 
 	case SRSO_CMD_SAFE_RET:
+		if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO)) {
+			pr_notice("CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT\n");
+			goto ibpb_on_vmexit;
+		}
+
 		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
 			/*
 			 * Enable the return thunk for generated code
@@ -2671,7 +2678,14 @@  static void __init srso_select_mitigation(void)
 		}
 		break;
 
+ibpb_on_vmexit:
 	case SRSO_CMD_IBPB_ON_VMEXIT:
+		if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
+			pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+			srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+			break;
+		}
+
 		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
 			if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
 				setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8f41ab219cf1..ca3b588b51aa 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1273,6 +1273,7 @@  static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
 	VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
 	VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
 	VULNBL_AMD(0x19, SRSO),
+	VULNBL_AMD(0x1a, SRSO),
 	{}
 };
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 41786b834b16..d54cd67c8c50 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -799,6 +799,7 @@  void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
 	kvm_cpu_cap_check_and_set(X86_FEATURE_IBPB_BRTYPE);
+	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_USER_KERNEL_NO);
 	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9df3e1e5ae81..03f29912a638 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -608,6 +608,9 @@  static void svm_disable_virtualization_cpu(void)
 	kvm_cpu_svm_disable();
 
 	amd_pmu_disable_virt();
+
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
 }
 
 static int svm_enable_virtualization_cpu(void)
@@ -685,6 +688,9 @@  static int svm_enable_virtualization_cpu(void)
 		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
 	return 0;
 }
 
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad5b148..5a18ecc04a6c 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@  int msr_set_bit(u32 msr, u8 bit)
 {
 	return __flip_bit(msr, bit, true);
 }
+EXPORT_SYMBOL_GPL(msr_set_bit);
 
 /**
  * msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@  int msr_clear_bit(u32 msr, u8 bit)
 {
 	return __flip_bit(msr, bit, false);
 }
+EXPORT_SYMBOL_GPL(msr_clear_bit);
 
 #ifdef CONFIG_TRACEPOINTS
 void do_trace_write_msr(unsigned int msr, u64 val, int failed)
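
For context on the two new exports, an illustrative caller -- the helper name
srso_bp_spec_reduce() is ours, and the return convention follows the kernel-doc
in arch/x86/lib/msr.c (negative on error, 0 if the bit was already in the
requested state, positive if the MSR was actually written):

static void srso_bp_spec_reduce(bool enable)
{
	int ret;

	if (!cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
		return;

	if (enable)
		ret = msr_set_bit(MSR_ZEN4_BP_CFG,
				  MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
	else
		ret = msr_clear_bit(MSR_ZEN4_BP_CFG,
				    MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);

	if (ret < 0)
		pr_warn("could not toggle BpSpecReduce: %d\n", ret);
}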