diff mbox series

[3/3] KVM SVM: Add Bus Lock Detect support

Message ID 20240429060643.211-4-ravi.bangoria@amd.com (mailing list archive)
State New, archived
Series x86/cpu: Add Bus Lock Detect support for AMD

Commit Message

Ravi Bangoria April 29, 2024, 6:06 a.m. UTC
An upcoming AMD uarch will support Bus Lock Detect. Add support for it
in KVM. Bus Lock Detect is enabled through MSR_IA32_DEBUGCTLMSR, and
MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
enabled. Add this dependency in KVM.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
 arch/x86/kvm/svm/nested.c |  3 ++-
 arch/x86/kvm/svm/svm.c    | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)

Comments

Sean Christopherson June 4, 2024, 12:45 a.m. UTC | #1
On Mon, Apr 29, 2024, Ravi Bangoria wrote:
> An upcoming AMD uarch will support Bus Lock Detect. Add support for it
> in KVM. Bus Lock Detect is enabled through MSR_IA32_DEBUGCTLMSR, and
> MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
> enabled. Add this dependency in KVM.

This is woefully incomplete, e.g. db_interception() needs to be updated to decipher
whether the #DB is the responsibility of the host or of the guest.

Honestly, I don't see any point in virtualizing this in KVM.  As Jim alluded to,
what's far, far more interesting for KVM is "Bus Lock Threshold".  Virtualizing
this for the guest would have been nice to have during the initial split-lock #AC
support, but now I'm skeptical the complexity is worth the payoff.

I suppose we could allow it if #DB isn't intercepted, at which point the enabling
required is minimal?

> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
>  arch/x86/kvm/svm/nested.c |  3 ++-
>  arch/x86/kvm/svm/svm.c    | 16 +++++++++++++++-
>  2 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 55b9a6d96bcf..6e93c2d9e7df 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -586,7 +586,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
>  	/* These bits will be set properly on the first execution when new_vmc12 is true */
>  	if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) {
>  		vmcb02->save.dr7 = svm->nested.save.dr7 | DR7_FIXED_1;
> -		svm->vcpu.arch.dr6  = svm->nested.save.dr6 | DR6_ACTIVE_LOW;
> +		/* DR6_RTM is not supported on AMD as of now. */
> +		svm->vcpu.arch.dr6  = svm->nested.save.dr6 | DR6_FIXED_1 | DR6_RTM;
>  		vmcb_mark_dirty(vmcb02, VMCB_DR);
>  	}
>  
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index d1a9f9951635..60f3af9bdacb 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1038,7 +1038,8 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
> -	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
> +	u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
> +	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
>  			    (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
>  			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
>  
> @@ -3119,6 +3120,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		if (data & DEBUGCTL_RESERVED_BITS)
>  			return 1;
>  
> +		if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
> +			return 1;
> +
>  		svm_get_lbr_vmcb(svm)->save.dbgctl = data;
>  		svm_update_lbrv(vcpu);
>  		break;
> @@ -5157,6 +5162,15 @@ static __init void svm_set_cpu_caps(void)
>  
>  	/* CPUID 0x8000001F (SME/SEV features) */
>  	sev_set_cpu_caps();
> +
> +	/*
> +	 * LBR Virtualization must be enabled to support BusLockTrap inside the
> +	 * guest, since BusLockTrap is enabled through MSR_IA32_DEBUGCTLMSR and
> +	 * MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
> +	 * enabled.
> +	 */
> +	if (!lbrv)
> +		kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
>  }
>  
>  static __init int svm_hardware_setup(void)
> -- 
> 2.44.0
>
Ravi Bangoria June 5, 2024, 11:38 a.m. UTC | #2
Hi Sean,

On 6/4/2024 6:15 AM, Sean Christopherson wrote:
> On Mon, Apr 29, 2024, Ravi Bangoria wrote:
>> An upcoming AMD uarch will support Bus Lock Detect. Add support for it
>> in KVM. Bus Lock Detect is enabled through MSR_IA32_DEBUGCTLMSR, and
>> MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
>> enabled. Add this dependency in KVM.
> 
> This is woefully incomplete, e.g. db_interception() needs to be updated to decipher
> whether the #DB is the responsibility of the host or of the guest.

Can you please elaborate? Are you referring to the vcpu->guest_debug thingy?

> Honestly, I don't see any point in virtualizing this in KVM.  As Jim alluded to,
> what's far, far more interesting for KVM is "Bus Lock Threshold".  Virtualizing
> this for the guest would have been nice to have during the initial split-lock #AC
> support, but now I'm skeptical the complexity is worth the payoff.

This has a valid use case of penalizing offending processes. I'm not sure
how much it's really used in production, though.

> I suppose we could allow it if #DB isn't intercepted, at which point the enabling
> required is minimal?

The feature uses the DEBUG_CTL MSR, #DB, and the DR6 register. Do you mean
expose it when all three are accelerated, or just #DB?

Thanks for the feedback,
Ravi
Sean Christopherson June 5, 2024, 3:08 p.m. UTC | #3
On Wed, Jun 05, 2024, Ravi Bangoria wrote:
> Hi Sean,
> 
> On 6/4/2024 6:15 AM, Sean Christopherson wrote:
> > On Mon, Apr 29, 2024, Ravi Bangoria wrote:
> >> An upcoming AMD uarch will support Bus Lock Detect. Add support for it
> >> in KVM. Bus Lock Detect is enabled through MSR_IA32_DEBUGCTLMSR, and
> >> MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
> >> enabled. Add this dependency in KVM.
> > 
> > This is woefully incomplete, e.g. db_interception() needs to be updated to decipher
> > whether the #DB is the responsibility of the host or of the guest.
> 
> Can you please elaborate? Are you referring to the vcpu->guest_debug thingy?

Yes.  More broadly, all of db_interception().

> > Honestly, I don't see any point in virtualizing this in KVM.  As Jim alluded to,
> > what's far, far more interesting for KVM is "Bus Lock Threshold".  Virtualizing
> > this for the guest would have been nice to have during the initial split-lock #AC
> > support, but now I'm skeptical the complexity is worth the payoff.
> 
> This has a valid use case of penalizing offending processes. I'm not sure
> how much it's really used in production, though.

Yeah, but split-lock #AC and #DB have existed on Intel for years, and no one has
put in the effort to land KVM support, despite the series getting as far as v9[*].
Some of the problems on Intel were due to the awful FMS-based feature detection,
but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
happen if both the host and guest want bus-lock #DBs.

Anyways, my point is that, except for SEV-ES+ where there's no good reason NOT to
virtualize Bus Lock Detect, I'm not convinced that it's worth virtualizing bus-lock
#DBs.

[*] https://lore.kernel.org/all/20200509110542.8159-1-xiaoyao.li@intel.com

> > I suppose we could allow it if #DB isn't intercepted, at which point the enabling
> > required is minimal?
> 
> The feature uses the DEBUG_CTL MSR, #DB, and the DR6 register. Do you mean
> expose it when all three are accelerated, or just #DB?

I mean that if KVM isn't intercepting #DB, then there's no extra complexity needed
to sort out whether the #DB "belongs" to the host or the guest.  See commit
90cbf6d914ad ("KVM: SEV-ES: Eliminate #DB intercept when DebugSwap enabled").
Ravi Bangoria June 5, 2024, 4:14 p.m. UTC | #4
On 6/5/2024 8:38 PM, Sean Christopherson wrote:
> On Wed, Jun 05, 2024, Ravi Bangoria wrote:
>> Hi Sean,
>>
>> On 6/4/2024 6:15 AM, Sean Christopherson wrote:
>>> On Mon, Apr 29, 2024, Ravi Bangoria wrote:
>>>> An upcoming AMD uarch will support Bus Lock Detect. Add support for it
>>>> in KVM. Bus Lock Detect is enabled through MSR_IA32_DEBUGCTLMSR, and
>>>> MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
>>>> enabled. Add this dependency in KVM.
>>>
>>> This is woefully incomplete, e.g. db_interception() needs to be updated to decipher
>>> whether the #DB is the responsibility of the host or of the guest.
>>
>> Can you please elaborate? Are you referring to the vcpu->guest_debug thingy?
> 
> Yes.  More broadly, all of db_interception().
> 
>>> Honestly, I don't see any point in virtualizing this in KVM.  As Jim alluded to,
>>> what's far, far more interesting for KVM is "Bus Lock Threshold".  Virtualizing
>>> this for the guest would have been nice to have during the initial split-lock #AC
>>> support, but now I'm skeptical the complexity is worth the payoff.
>>
>> This has a valid use case of penalizing offending processes. I'm not sure
>> how much it's really used in production, though.
> 
> Yeah, but split-lock #AC and #DB have existed on Intel for years, and no one has
> put in the effort to land KVM support, despite the series getting as far as v9[*].

Split-Lock Detect through #AC and Bus Lock Detect through #DB are independent
features. AMD supports only Bus Lock Detect with #DB. I'm not sure about
Split-Lock Detect, but Intel does support Bus Lock Detect in the guest. These
are the relevant commits:

  https://git.kernel.org/torvalds/c/9a3ecd5e2aa10
  https://git.kernel.org/torvalds/c/e8ea85fb280ec
  https://git.kernel.org/torvalds/c/76ea438b4afcd

> Some of the problems on Intel were due to the awful FMS-based feature detection,
> but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
> happen if both the host and guest want bus-lock #DBs.

I have to check the vcpu->guest_debug part, but keeping that aside, the host
and guest can use Bus Lock Detect in parallel: the DEBUG_CTL MSR and DR6
register are saved/restored in the VMCB, hardware causes a VMEXIT_EXCEPTION_1
for a guest #DB (when intercepted), and hardware raises #DB on the host when
it's for the host.

Please correct me if I misunderstood your comment.

> Anyways, my point is that, except for SEV-ES+ where there's no good reason NOT to
> virtualize Bus Lock Detect, I'm not convinced that it's worth virtualizing bus-lock
> #DBs.
> 
> [*] https://lore.kernel.org/all/20200509110542.8159-1-xiaoyao.li@intel.com
> 
>>> I suppose we could allow it if #DB isn't intercepted, at which point the enabling
>>> required is minimal?
>>
>> The feature uses the DEBUG_CTL MSR, #DB, and the DR6 register. Do you mean
>> expose it when all three are accelerated, or just #DB?
> 
> I mean that if KVM isn't intercepting #DB, then there's no extra complexity needed
> to sort out whether the #DB "belongs" to the host or the guest.  See commit
> 90cbf6d914ad ("KVM: SEV-ES: Eliminate #DB intercept when DebugSwap enabled").
Sean Christopherson June 12, 2024, 1:42 a.m. UTC | #5
On Wed, Jun 05, 2024, Ravi Bangoria wrote:
> On 6/5/2024 8:38 PM, Sean Christopherson wrote:
> > Some of the problems on Intel were due to the awful FMS-based feature detection,
> > but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
> > happen if both the host and guest want bus-lock #DBs.
> 
> I have to check the vcpu->guest_debug part, but keeping that aside, the host
> and guest can use Bus Lock Detect in parallel: the DEBUG_CTL MSR and DR6
> register are saved/restored in the VMCB, hardware causes a VMEXIT_EXCEPTION_1
> for a guest #DB (when intercepted), and hardware raises #DB on the host when
> it's for the host.

I'm talking about the case where the host wants to do something in response to
bus locks that occurred in the guest.  E.g. if the host is taking punitive action,
say by stalling the vCPU, then the guest kernel could bypass that behavior by
enabling bus lock detect itself.

Maybe it's a moot point in practice, since it sounds like Bus Lock Threshold will
be available at the same time.

Ugh, and if we wanted to let the host handle guest-induced #DBs, we'd need code
to keep Bus Lock Detect enabled in the guest since it resides in DEBUG_CTL.  Bah.

So I guess if the vcpu->guest_debug part is fairly straightforward, it probably
makes sense to virtualize Bus Lock Detect because the only reason not to virtualize it
would actually require more work/code in KVM.

I'd still love to see Bus Lock Threshold support sooner than later though :-)
Ravi Bangoria July 10, 2024, 2:25 a.m. UTC | #6
Sean,

Apologies for the delay. I was waiting for Bus Lock Threshold patches to be
posted upstream:

  https://lore.kernel.org/r/20240709175145.9986-1-manali.shukla@amd.com

On 12-Jun-24 7:12 AM, Sean Christopherson wrote:
> On Wed, Jun 05, 2024, Ravi Bangoria wrote:
>> On 6/5/2024 8:38 PM, Sean Christopherson wrote:
>>> Some of the problems on Intel were due to the awful FMS-based feature detection,
>>> but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
>>> happen if both the host and guest want bus-lock #DBs.
>>
>> I have to check the vcpu->guest_debug part, but keeping that aside, the host
>> and guest can use Bus Lock Detect in parallel: the DEBUG_CTL MSR and DR6
>> register are saved/restored in the VMCB, hardware causes a VMEXIT_EXCEPTION_1
>> for a guest #DB (when intercepted), and hardware raises #DB on the host when
>> it's for the host.
> 
> I'm talking about the case where the host wants to do something in response to
> bus locks that occurred in the guest.  E.g. if the host is taking punitive action,
> say by stalling the vCPU, then the guest kernel could bypass that behavior by
> enabling bus lock detect itself.
> 
> Maybe it's a moot point in practice, since it sounds like Bus Lock Threshold will
> be available at the same time.
> 
> Ugh, and if we wanted to let the host handle guest-induced #DBs, we'd need code
> to keep Bus Lock Detect enabled in the guest since it resides in DEBUG_CTL.  Bah.
> 
> So I guess if the vcpu->guest_debug part is fairly straightforward, it probably
> makes sense to virtualize Bus Lock Detect because the only reason not to virtualize it
> would actually require more work/code in KVM.

KVM forwards #DB to Qemu when vcpu->guest_debug is set, and it's Qemu's
responsibility to re-inject the exception when Bus Lock Trap is enabled
inside the guest. I realized that this is broken, so I've prepared a
Qemu patch, embedded at the end.

> I'd still love to see Bus Lock Threshold support sooner than later though :-)

With Bus Lock Threshold enabled, I assume the changes introduced by this
patch plus Qemu fix are sufficient to support Bus Lock Trap inside the
guest?

Thanks,
Ravi

---><---
From 0b01859f99c4c5e18323e3f4ac19d1f3e5ad4aee Mon Sep 17 00:00:00 2001
From: Ravi Bangoria <ravi.bangoria@amd.com>
Date: Thu, 4 Jul 2024 06:48:24 +0000
Subject: [PATCH] target/i386: Add Bus Lock Detect support

An upcoming AMD uarch will support Bus Lock Detect (called Bus Lock Trap
in AMD docs). Bus Lock Detect is enumerated by CPUID Fn0000_0007_ECX_x0
bit [24 / BUSLOCKTRAP]. It can be enabled through MSR_IA32_DEBUGCTLMSR.
When enabled, hardware clears DR6[11] and raises a #DB exception on the
occurrence of a bus lock if CPL > 0. More detail about the feature can be
found in the AMD APM [1].

Qemu supports remote debugging through host gdb (the "gdbstub" facility),
where some of the remote debugging features, like instruction and data
breakpoints, rely on the same hardware infrastructure (#DB, DR6, etc.)
that Bus Lock Detect also uses. Instead of handling it internally, KVM
forwards the #DB to Qemu when remote debugging is on and #DB is being
intercepted. It's Qemu's responsibility to re-inject the exception into
the guest when some of the exception source bits (in DR6) are not handled
by Qemu's remote debug handler. Bus Lock Detect is one such case.

[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
     2023, Vol 2, 13.1.3.6 Bus Lock Trap
     https://bugzilla.kernel.org/attachment.cgi?id=304653

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
 target/i386/cpu.h     | 1 +
 target/i386/kvm/kvm.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c64ef0c1a2..89bcff2fa3 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -271,6 +271,7 @@ typedef enum X86Seg {
                 | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK \
                 | CR4_LAM_SUP_MASK))
 
+#define DR6_BLD         (1 << 11)
 #define DR6_BD          (1 << 13)
 #define DR6_BS          (1 << 14)
 #define DR6_BT          (1 << 15)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6c864e4611..d128d4e5ca 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5141,14 +5141,14 @@ static int kvm_handle_debug(X86CPU *cpu,
     } else if (kvm_find_sw_breakpoint(cs, arch_info->pc)) {
         ret = EXCP_DEBUG;
     }
-    if (ret == 0) {
+    if (ret == 0 || !(arch_info->dr6 & DR6_BLD)) {
         cpu_synchronize_state(cs);
         assert(env->exception_nr == -1);
 
         /* pass to guest */
         kvm_queue_exception(env, arch_info->exception,
                             arch_info->exception == EXCP01_DB,
-                            arch_info->dr6);
+                            ret == 0 ? arch_info->dr6 ^ DR6_BLD : DR6_BLD);
         env->has_error_code = 0;
     }
Jim Mattson July 10, 2024, 4:15 a.m. UTC | #7
On Tue, Jul 9, 2024 at 7:25 PM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>
> Sean,
>
> Apologies for the delay. I was waiting for Bus Lock Threshold patches to be
> posted upstream:
>
>   https://lore.kernel.org/r/20240709175145.9986-1-manali.shukla@amd.com
>
> On 12-Jun-24 7:12 AM, Sean Christopherson wrote:
> > On Wed, Jun 05, 2024, Ravi Bangoria wrote:
> >> On 6/5/2024 8:38 PM, Sean Christopherson wrote:
> >>> Some of the problems on Intel were due to the awful FMS-based feature detection,
> >>> but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
> >>> happen if both the host and guest want bus-lock #DBs.
> >>
> >> I have to check the vcpu->guest_debug part, but keeping that aside, the host
> >> and guest can use Bus Lock Detect in parallel: the DEBUG_CTL MSR and DR6
> >> register are saved/restored in the VMCB, hardware causes a VMEXIT_EXCEPTION_1
> >> for a guest #DB (when intercepted), and hardware raises #DB on the host when
> >> it's for the host.
> >
> > I'm talking about the case where the host wants to do something in response to
> > bus locks that occurred in the guest.  E.g. if the host is taking punitive action,
> > say by stalling the vCPU, then the guest kernel could bypass that behavior by
> > enabling bus lock detect itself.
> >
> > Maybe it's a moot point in practice, since it sounds like Bus Lock Threshold will
> > be available at the same time.
> >
> > Ugh, and if we wanted to let the host handle guest-induced #DBs, we'd need code
> > to keep Bus Lock Detect enabled in the guest since it resides in DEBUG_CTL.  Bah.
> >
> > So I guess if the vcpu->guest_debug part is fairly straightforward, it probably
> > makes sense to virtualize Bus Lock Detect because the only reason not to virtualize it
> > would actually require more work/code in KVM.
>
> KVM forwards #DB to Qemu when vcpu->guest_debug is set, and it's Qemu's
> responsibility to re-inject the exception when Bus Lock Trap is enabled
> inside the guest. I realized that this is broken, so I've prepared a
> Qemu patch, embedded at the end.
>
> > I'd still love to see Bus Lock Threshold support sooner than later though :-)
>
> With Bus Lock Threshold enabled, I assume the changes introduced by this
> patch plus Qemu fix are sufficient to support Bus Lock Trap inside the
> guest?

In any case, it seems that commit 76ea438b4afc ("KVM: X86: Expose bus
lock debug exception to guest") prematurely advertised the presence of
X86_FEATURE_BUS_LOCK to userspace on non-Intel platforms. We should
probably either accept these changes or fix up that commit. Either
way, something should be done for all active branches back to v5.15.

> Thanks,
> Ravi
>
> ---><---
> From 0b01859f99c4c5e18323e3f4ac19d1f3e5ad4aee Mon Sep 17 00:00:00 2001
> From: Ravi Bangoria <ravi.bangoria@amd.com>
> Date: Thu, 4 Jul 2024 06:48:24 +0000
> Subject: [PATCH] target/i386: Add Bus Lock Detect support
>
> An upcoming AMD uarch will support Bus Lock Detect (called Bus Lock Trap
> in AMD docs). Bus Lock Detect is enumerated by CPUID Fn0000_0007_ECX_x0
> bit [24 / BUSLOCKTRAP]. It can be enabled through MSR_IA32_DEBUGCTLMSR.
> When enabled, hardware clears DR6[11] and raises a #DB exception on the
> occurrence of a bus lock if CPL > 0. More detail about the feature can be
> found in the AMD APM [1].
>
> Qemu supports remote debugging through host gdb (the "gdbstub" facility),
> where some of the remote debugging features, like instruction and data
> breakpoints, rely on the same hardware infrastructure (#DB, DR6, etc.)
> that Bus Lock Detect also uses. Instead of handling it internally, KVM
> forwards the #DB to Qemu when remote debugging is on and #DB is being
> intercepted. It's Qemu's responsibility to re-inject the exception into
> the guest when some of the exception source bits (in DR6) are not handled
> by Qemu's remote debug handler. Bus Lock Detect is one such case.
>
> [1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
>      2023, Vol 2, 13.1.3.6 Bus Lock Trap
>      https://bugzilla.kernel.org/attachment.cgi?id=304653
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
>  target/i386/cpu.h     | 1 +
>  target/i386/kvm/kvm.c | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index c64ef0c1a2..89bcff2fa3 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -271,6 +271,7 @@ typedef enum X86Seg {
>                  | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK \
>                  | CR4_LAM_SUP_MASK))
>
> +#define DR6_BLD         (1 << 11)
>  #define DR6_BD          (1 << 13)
>  #define DR6_BS          (1 << 14)
>  #define DR6_BT          (1 << 15)
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 6c864e4611..d128d4e5ca 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -5141,14 +5141,14 @@ static int kvm_handle_debug(X86CPU *cpu,
>      } else if (kvm_find_sw_breakpoint(cs, arch_info->pc)) {
>          ret = EXCP_DEBUG;
>      }
> -    if (ret == 0) {
> +    if (ret == 0 || !(arch_info->dr6 & DR6_BLD)) {
>          cpu_synchronize_state(cs);
>          assert(env->exception_nr == -1);
>
>          /* pass to guest */
>          kvm_queue_exception(env, arch_info->exception,
>                              arch_info->exception == EXCP01_DB,
> -                            arch_info->dr6);
> +                            ret == 0 ? arch_info->dr6 ^ DR6_BLD : DR6_BLD);
>          env->has_error_code = 0;
>      }
>
> --
> 2.34.1
>
Sean Christopherson July 10, 2024, 1:52 p.m. UTC | #8
On Tue, Jul 09, 2024, Jim Mattson wrote:
> On Tue, Jul 9, 2024 at 7:25 PM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
> >
> > Sean,
> >
> > Apologies for the delay. I was waiting for Bus Lock Threshold patches to be
> > posted upstream:
> >
> >   https://lore.kernel.org/r/20240709175145.9986-1-manali.shukla@amd.com
> >
> > On 12-Jun-24 7:12 AM, Sean Christopherson wrote:
> > > On Wed, Jun 05, 2024, Ravi Bangoria wrote:
> > >> On 6/5/2024 8:38 PM, Sean Christopherson wrote:
> > >>> Some of the problems on Intel were due to the awful FMS-based feature detection,
> > >>> but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
> > >>> happen if both the host and guest want bus-lock #DBs.
> > >>
> > >> I have to check the vcpu->guest_debug part, but keeping that aside, the host
> > >> and guest can use Bus Lock Detect in parallel: the DEBUG_CTL MSR and DR6
> > >> register are saved/restored in the VMCB, hardware causes a VMEXIT_EXCEPTION_1
> > >> for a guest #DB (when intercepted), and hardware raises #DB on the host when
> > >> it's for the host.
> > >
> > > I'm talking about the case where the host wants to do something in response to
> > > bus locks that occurred in the guest.  E.g. if the host is taking punitive action,
> > > say by stalling the vCPU, then the guest kernel could bypass that behavior by
> > > enabling bus lock detect itself.
> > >
> > > Maybe it's a moot point in practice, since it sounds like Bus Lock Threshold will
> > > be available at the same time.
> > >
> > > Ugh, and if we wanted to let the host handle guest-induced #DBs, we'd need code
> > > to keep Bus Lock Detect enabled in the guest since it resides in DEBUG_CTL.  Bah.
> > >
> > > So I guess if the vcpu->guest_debug part is fairly straightforward, it probably
> > > makes sense to virtualize Bus Lock Detect because the only reason not to virtualize it
> > > would actually require more work/code in KVM.
> >
> > KVM forwards #DB to Qemu when vcpu->guest_debug is set, and it's Qemu's
> > responsibility to re-inject the exception when Bus Lock Trap is enabled
> > inside the guest. I realized that this is broken, so I've prepared a
> > Qemu patch, embedded at the end.
> >
> > > I'd still love to see Bus Lock Threshold support sooner than later though :-)
> >
> > With Bus Lock Threshold enabled, I assume the changes introduced by this
> > patch plus Qemu fix are sufficient to support Bus Lock Trap inside the
> > guest?
> 
> In any case, it seems that commit 76ea438b4afc ("KVM: X86: Expose bus
> lock debug exception to guest") prematurely advertised the presence of
> X86_FEATURE_BUS_LOCK to userspace on non-Intel platforms. We should
> probably either accept these changes or fix up that commit. Either
> way, something should be done for all active branches back to v5.15.

Drat.  Yeah, we need a patch to clear BUS_LOCK_DETECT in svm_set_cpu_caps(), marked
for stable@.  Then this series can remove that clearing.

At least I caught it for CET[*]!  It'd be nice to not have to rely on humans to
detect potential issues like this, but I can't think of a way to programmatically
handle this situation without incurring an annoying amount of overhead and/or
duplicate code between VMX and SVM.

[*] https://lore.kernel.org/all/ZjLRnisdUgeYgg8i@google.com
Ravi Bangoria July 11, 2024, 8:51 a.m. UTC | #9
On 10-Jul-24 7:22 PM, Sean Christopherson wrote:
> On Tue, Jul 09, 2024, Jim Mattson wrote:
>> On Tue, Jul 9, 2024 at 7:25 PM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>>>
>>> Sean,
>>>
>>> Apologies for the delay. I was waiting for Bus Lock Threshold patches to be
>>> posted upstream:
>>>
>>>   https://lore.kernel.org/r/20240709175145.9986-1-manali.shukla@amd.com
>>>
>>> On 12-Jun-24 7:12 AM, Sean Christopherson wrote:
>>>> On Wed, Jun 05, 2024, Ravi Bangoria wrote:
>>>>> On 6/5/2024 8:38 PM, Sean Christopherson wrote:
>>>>>> Some of the problems on Intel were due to the awful FMS-based feature detection,
>>>>>> but those weren't the only hiccups.  E.g. IIRC, we never sorted out what should
>>>>>> happen if both the host and guest want bus-lock #DBs.
>>>>>
>>>>> I have to check the vcpu->guest_debug part, but keeping that aside, the host
>>>>> and guest can use Bus Lock Detect in parallel: the DEBUG_CTL MSR and DR6
>>>>> register are saved/restored in the VMCB, hardware causes a VMEXIT_EXCEPTION_1
>>>>> for a guest #DB (when intercepted), and hardware raises #DB on the host when
>>>>> it's for the host.
>>>>
>>>> I'm talking about the case where the host wants to do something in response to
>>>> bus locks that occurred in the guest.  E.g. if the host is taking punitive action,
>>>> say by stalling the vCPU, then the guest kernel could bypass that behavior by
>>>> enabling bus lock detect itself.
>>>>
>>>> Maybe it's a moot point in practice, since it sounds like Bus Lock Threshold will
>>>> be available at the same time.
>>>>
>>>> Ugh, and if we wanted to let the host handle guest-induced #DBs, we'd need code
>>>> to keep Bus Lock Detect enabled in the guest since it resides in DEBUG_CTL.  Bah.
>>>>
>>>> So I guess if the vcpu->guest_debug part is fairly straightforward, it probably
>>>> makes sense to virtualize Bus Lock Detect because the only reason not to virtualize it
>>>> would actually require more work/code in KVM.
>>>
>>> KVM forwards #DB to Qemu when vcpu->guest_debug is set, and it's Qemu's
>>> responsibility to re-inject the exception when Bus Lock Trap is enabled
>>> inside the guest. I realized that this is broken, so I've prepared a
>>> Qemu patch, embedded at the end.
>>>
>>>> I'd still love to see Bus Lock Threshold support sooner than later though :-)
>>>
>>> With Bus Lock Threshold enabled, I assume the changes introduced by this
>>> patch plus Qemu fix are sufficient to support Bus Lock Trap inside the
>>> guest?
>>
>> In any case, it seems that commit 76ea438b4afc ("KVM: X86: Expose bus
>> lock debug exception to guest") prematurely advertised the presence of
>> X86_FEATURE_BUS_LOCK to userspace on non-Intel platforms. We should
>> probably either accept these changes or fix up that commit. Either
>> way, something should be done for all active branches back to v5.15.
> 
> Drat.  Yeah, we need a patch to clear BUS_LOCK_DETECT in svm_set_cpu_caps(), marked
> for stable@.  Then this series can remove that clearing.
> 
> At least I caught it for CET[*]!  It'd be nice to not have to rely on humans to
> detect potential issues like this, but I can't think of a way to programmatically
> handle this situation without incurring an annoying amount of overhead and/or
> duplicate code between VMX and SVM.
> 
> [*] https://lore.kernel.org/all/ZjLRnisdUgeYgg8i@google.com

Sure, I'll add a patch and respin the series.

Thanks,
Ravi

Patch

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 55b9a6d96bcf..6e93c2d9e7df 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -586,7 +586,8 @@  static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 	/* These bits will be set properly on the first execution when new_vmc12 is true */
 	if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) {
 		vmcb02->save.dr7 = svm->nested.save.dr7 | DR7_FIXED_1;
-		svm->vcpu.arch.dr6  = svm->nested.save.dr6 | DR6_ACTIVE_LOW;
+		/* DR6_RTM is not supported on AMD as of now. */
+		svm->vcpu.arch.dr6  = svm->nested.save.dr6 | DR6_FIXED_1 | DR6_RTM;
 		vmcb_mark_dirty(vmcb02, VMCB_DR);
 	}
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d1a9f9951635..60f3af9bdacb 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1038,7 +1038,8 @@  void svm_update_lbrv(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
-	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
+	u64 dbgctl_buslock_lbr = DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR;
+	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & dbgctl_buslock_lbr) ||
 			    (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
 
@@ -3119,6 +3120,10 @@  static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		if (data & DEBUGCTL_RESERVED_BITS)
 			return 1;
 
+		if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
+			return 1;
+
 		svm_get_lbr_vmcb(svm)->save.dbgctl = data;
 		svm_update_lbrv(vcpu);
 		break;
@@ -5157,6 +5162,15 @@  static __init void svm_set_cpu_caps(void)
 
 	/* CPUID 0x8000001F (SME/SEV features) */
 	sev_set_cpu_caps();
+
+	/*
+	 * LBR Virtualization must be enabled to support BusLockTrap inside the
+	 * guest, since BusLockTrap is enabled through MSR_IA32_DEBUGCTLMSR and
+	 * MSR_IA32_DEBUGCTLMSR is virtualized only if LBR Virtualization is
+	 * enabled.
+	 */
+	if (!lbrv)
+		kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
 }
 
 static __init int svm_hardware_setup(void)
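Taken together, the gates the patch adds in svm_set_msr() and svm_update_lbrv() can be sketched in isolation. This is a minimal model, not kernel code: the helper names are invented for illustration, and the bit positions are assumed to match Linux's msr-index.h definitions (LBR in bit 0, Bus Lock Detect in bit 2 of DEBUGCTL):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed DEBUGCTL bit layout, per Linux's msr-index.h. */
#define DEBUGCTLMSR_LBR             (1ull << 0)
#define DEBUGCTLMSR_BUS_LOCK_DETECT (1ull << 2)

/*
 * Model of the svm_set_msr() gate: a write that sets the Bus Lock Detect
 * bit is rejected ("return 1", i.e. KVM injects #GP) unless guest CPUID
 * advertises X86_FEATURE_BUS_LOCK_DETECT.
 */
static int debugctl_write_allowed(uint64_t data, bool guest_has_bld)
{
    if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) && !guest_has_bld)
        return 1;
    return 0;
}

/*
 * Model of the widened enable_lbrv condition in svm_update_lbrv(): LBR
 * Virtualization must be turned on when either LBR or Bus Lock Detect is
 * requested, because DEBUGCTL is only virtualized together with LBRV.
 */
static bool needs_lbrv(uint64_t dbgctl)
{
    return (dbgctl & (DEBUGCTLMSR_BUS_LOCK_DETECT | DEBUGCTLMSR_LBR)) != 0;
}
```

This mirrors why svm_set_cpu_caps() clears X86_FEATURE_BUS_LOCK_DETECT when the lbrv module support is absent: without LBRV there is no virtualized DEBUGCTL to carry the enable bit.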