
[v4] i386: Add ratelimit for bus locks acquired in guest

Message ID 20210521043820.29678-1-chenyi.qiang@intel.com (mailing list archive)
State New, archived
Series: [v4] i386: Add ratelimit for bus locks acquired in guest

Commit Message

Chenyi Qiang May 21, 2021, 4:38 a.m. UTC
A bus lock is acquired through either split locked access to writeback
(WB) memory or any locked access to non-WB memory. It is typically >1000
cycles slower than an atomic operation within a cache and can also
disrupt performance on other cores.

Virtual machines can exploit bus locks to degrade the performance of
the whole system. To address this kind of performance DoS attack coming
from the VMs, bus lock VM exit was introduced in KVM, and it can report
the bus locks detected in the guest. If enabled, KVM exits to userspace
to let the user enforce throttling policies once bus locks are acquired
in VMs.

The availability of bus lock VM exit can be detected through the
KVM_CAP_X86_BUS_LOCK_EXIT capability. The returned bitmap contains the
potential policies supported by KVM. The KVM_BUS_LOCK_DETECTION_EXIT
field in the bitmap is the only strategy supported at present; it
indicates that KVM will exit to userspace to handle the bus locks.
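
For reference, a minimal sketch of that handshake against the raw KVM
ioctl interface (the patch below goes through QEMU's
kvm_check_extension()/kvm_vm_enable_cap() wrappers instead; error
handling is trimmed):

    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Query the policy bitmap, then enable the exit-to-userspace policy
     * on a KVM VM file descriptor. */
    static int enable_bus_lock_exit(int vm_fd)
    {
        struct kvm_enable_cap cap = {
            .cap = KVM_CAP_X86_BUS_LOCK_EXIT,
            .args = { KVM_BUS_LOCK_DETECTION_EXIT },
        };
        int bitmap = ioctl(vm_fd, KVM_CHECK_EXTENSION,
                           KVM_CAP_X86_BUS_LOCK_EXIT);

        if (bitmap < 0 || !(bitmap & KVM_BUS_LOCK_DETECTION_EXIT)) {
            return -ENOTSUP;    /* kernel has no bus lock VM exit */
        }
        return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }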

This patch adds a ratelimit on the bus locks acquired in guest as a
mitigation policy.

Introduce a new field "bus_lock_ratelimit" to record the rate limit on
bus locks in the target VM. The user can specify it through the
"bus-lock-ratelimit" machine property. In the current implementation,
the default value is 0 per second, which means no restriction on the
bus locks.
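
For example, a guest could be capped at 1000 bus locks per second like
this (illustrative invocation; the q35 machine type and the value 1000
are arbitrary choices, not defaults from this patch):

    qemu-system-x86_64 -accel kvm -machine q35,bus-lock-ratelimit=1000 ...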

As for the ratelimit on detected bus locks, simply set the ratelimit
interval to 1s and restrict the quota of bus lock occurrences to the
value of "bus_lock_ratelimit". A potential alternative is to expose the
time slice as a property, which would give the user more precise
control.
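
As a worked example (the numbers are illustrative, not from the patch):
with bus-lock-ratelimit=100, the quota is 100 bus locks per 1s slice.
Roughly:

    within the quota -> ratelimit_calculate_delay() returns 0; no sleep
    beyond the quota -> a non-zero delay_ns is returned and the vCPU
                        thread g_usleep()s in kvm_arch_post_run(), so
                        the long-run rate stays near 100 bus locks/s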

The details of bus lock VM exit can be found in the spec:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>

---
Changes from v3:
  - change bus_lock_ratelimit_ctrl to a static variable to avoid calling
    qdev_get_machine(). (Eduardo)
  - ratelimit is thread safe as of commit 4951967d84a0; remove the
    ratelimit mutex from the previous patch. (Eduardo)
  - v3: https://lore.kernel.org/qemu-devel/20210430103305.28849-1-chenyi.qiang@intel.com/

Changes from v2:
  - do some rename work (bus-lock-ratelimit and BUS_LOCK_TIME_SLICE).
    (Eduardo)
  - change to register a class property in x86_machine_class_init()
    and write the getter/setter for the bus_lock_ratelimit property.
    (Eduardo)
  - add a lock around accesses to the Ratelimit instance to avoid a
    vcpu thread race condition. (Eduardo)
  - v2: https://lore.kernel.org/qemu-devel/20210420093736.17613-1-chenyi.qiang@intel.com/

Changes from RFC v1:
  - Remove the rip info output, as the rip can't reflect the bus lock
    position correctly. (Xiaoyao)
  - RFC v1: https://lore.kernel.org/qemu-devel/20210317084709.15605-1-chenyi.qiang@intel.com/
---
 hw/i386/x86.c         | 24 ++++++++++++++++++++++++
 include/hw/i386/x86.h |  8 ++++++++
 target/i386/kvm/kvm.c | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 73 insertions(+)

Comments

Eduardo Habkost May 27, 2021, 9:19 p.m. UTC | #1
On Fri, May 21, 2021 at 12:38:20PM +0800, Chenyi Qiang wrote:
[...]
> @@ -4222,6 +4247,15 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
>      }
>  }
>  
> +static void kvm_rate_limit_on_bus_lock(void)
> +{
> +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
> +
> +    if (delay_ns) {
> +        g_usleep(delay_ns / SCALE_US);
> +    }
> +}
> +
>  MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>  {
>      X86CPU *x86_cpu = X86_CPU(cpu);
> @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>      } else {
>          env->eflags &= ~IF_MASK;
>      }
> +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {

Does the KVM API guarantee that KVM_RUN_X86_BUS_LOCK will never
be set if KVM_BUS_LOCK_DETECTION_EXIT isn't enabled?  (Otherwise
we risk crashing in ratelimit_calculate_delay() above if rate
limiting is disabled).

If that's guaranteed, the patch looks good to me now.

> +        kvm_rate_limit_on_bus_lock();
> +    }
>  
>      /* We need to protect the apic state against concurrent accesses from
>       * different threads in case the userspace irqchip is used. */
> @@ -4595,6 +4632,10 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>          ioapic_eoi_broadcast(run->eoi.vector);
>          ret = 0;
>          break;
> +    case KVM_EXIT_X86_BUS_LOCK:
> +        /* already handled in kvm_arch_post_run */
> +        ret = 0;
> +        break;
>      default:
>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>          ret = -1;
> -- 
> 2.17.1
>
Chenyi Qiang May 31, 2021, 5:14 a.m. UTC | #2
On 5/28/2021 5:19 AM, Eduardo Habkost wrote:
> On Fri, May 21, 2021 at 12:38:20PM +0800, Chenyi Qiang wrote:
> [...]
>> @@ -4222,6 +4247,15 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
>>       }
>>   }
>>   
>> +static void kvm_rate_limit_on_bus_lock(void)
>> +{
>> +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
>> +
>> +    if (delay_ns) {
>> +        g_usleep(delay_ns / SCALE_US);
>> +    }
>> +}
>> +
>>   MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>>   {
>>       X86CPU *x86_cpu = X86_CPU(cpu);
>> @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>>       } else {
>>           env->eflags &= ~IF_MASK;
>>       }
>> +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
> 
> Does the KVM API guarantee that KVM_RUN_X86_BUS_LOCK will never
> be set if KVM_BUS_LOCK_DETECTION_EXIT isn't enabled?  (Otherwise
> we risk crashing in ratelimit_calculate_delay() above if rate
> limiting is disabled).
> 

Yes. The KVM_RUN_X86_BUS_LOCK flag is only set when a bus lock VM exit
happens. Bus lock VM exit is disabled by default and can only be
enabled through the KVM_BUS_LOCK_DETECTION_EXIT capability.

> If that's guaranteed, the patch looks good to me now.
> 
>> +        kvm_rate_limit_on_bus_lock();
>> +    }
>>   
>>       /* We need to protect the apic state against concurrent accesses from
>>        * different threads in case the userspace irqchip is used. */
>> @@ -4595,6 +4632,10 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>           ioapic_eoi_broadcast(run->eoi.vector);
>>           ret = 0;
>>           break;
>> +    case KVM_EXIT_X86_BUS_LOCK:
>> +        /* already handled in kvm_arch_post_run */
>> +        ret = 0;
>> +        break;
>>       default:
>>           fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>           ret = -1;
>> -- 
>> 2.17.1
>>
>
Eduardo Habkost June 1, 2021, 6:18 p.m. UTC | #3
On Mon, May 31, 2021 at 01:14:54PM +0800, Chenyi Qiang wrote:
> 
> 
> On 5/28/2021 5:19 AM, Eduardo Habkost wrote:
> > On Fri, May 21, 2021 at 12:38:20PM +0800, Chenyi Qiang wrote:
> > [...]
> > > @@ -4222,6 +4247,15 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
> > >       }
> > >   }
> > > +static void kvm_rate_limit_on_bus_lock(void)
> > > +{
> > > +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
> > > +
> > > +    if (delay_ns) {
> > > +        g_usleep(delay_ns / SCALE_US);
> > > +    }
> > > +}
> > > +
> > >   MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
> > >   {
> > >       X86CPU *x86_cpu = X86_CPU(cpu);
> > > @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
> > >       } else {
> > >           env->eflags &= ~IF_MASK;
> > >       }
> > > +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
> > 
> > Does the KVM API guarantee that KVM_RUN_X86_BUS_LOCK will never
> > be set if KVM_BUS_LOCK_DETECTION_EXIT isn't enabled?  (Otherwise
> > we risk crashing in ratelimit_calculate_delay() above if rate
> > limiting is disabled).
> > 
> 
> Yes. The KVM_RUN_X86_BUS_LOCK flag is only set when a bus lock VM exit
> happens. Bus lock VM exit is disabled by default and can only be
> enabled through the KVM_BUS_LOCK_DETECTION_EXIT capability.

I'm queueing on x86-next, thanks!
Eduardo Habkost June 1, 2021, 8:10 p.m. UTC | #4
On Tue, Jun 01, 2021 at 02:18:37PM -0400, Eduardo Habkost wrote:
> On Mon, May 31, 2021 at 01:14:54PM +0800, Chenyi Qiang wrote:
> > 
> > 
> > On 5/28/2021 5:19 AM, Eduardo Habkost wrote:
> > > On Fri, May 21, 2021 at 12:38:20PM +0800, Chenyi Qiang wrote:
> > > [...]
> > > > @@ -4222,6 +4247,15 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
> > > >       }
> > > >   }
> > > > +static void kvm_rate_limit_on_bus_lock(void)
> > > > +{
> > > > +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
> > > > +
> > > > +    if (delay_ns) {
> > > > +        g_usleep(delay_ns / SCALE_US);
> > > > +    }
> > > > +}
> > > > +
> > > >   MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
> > > >   {
> > > >       X86CPU *x86_cpu = X86_CPU(cpu);
> > > > @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
> > > >       } else {
> > > >           env->eflags &= ~IF_MASK;
> > > >       }
> > > > +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
> > > 
> > > Does the KVM API guarantee that KVM_RUN_X86_BUS_LOCK will never
> > > be set if KVM_BUS_LOCK_DETECTION_EXIT isn't enabled?  (Otherwise
> > > we risk crashing in ratelimit_calculate_delay() above if rate
> > > limiting is disabled).
> > > 
> > 
> > Yes. The KVM_RUN_X86_BUS_LOCK flag is only set when a bus lock VM exit
> > happens. Bus lock VM exit is disabled by default and can only be
> > enabled through the KVM_BUS_LOCK_DETECTION_EXIT capability.
> 
> I'm queueing on x86-next, thanks!

This breaks the build.  Is there a linux-headers update patch I've missed?

../target/i386/kvm/kvm.c: In function 'kvm_arch_init':
../target/i386/kvm/kvm.c:2322:42: error: 'KVM_CAP_X86_BUS_LOCK_EXIT' undeclared (first use in this function); did you mean 'KVM_CAP_X86_DISABLE_EXITS'?
             ret = kvm_check_extension(s, KVM_CAP_X86_BUS_LOCK_EXIT);
                                          ^~~~~~~~~~~~~~~~~~~~~~~~~
                                          KVM_CAP_X86_DISABLE_EXITS
Chenyi Qiang June 2, 2021, 1:26 a.m. UTC | #5
On 6/2/2021 4:10 AM, Eduardo Habkost wrote:
> On Tue, Jun 01, 2021 at 02:18:37PM -0400, Eduardo Habkost wrote:
>> On Mon, May 31, 2021 at 01:14:54PM +0800, Chenyi Qiang wrote:
>>>
>>>
>>> On 5/28/2021 5:19 AM, Eduardo Habkost wrote:
>>>> On Fri, May 21, 2021 at 12:38:20PM +0800, Chenyi Qiang wrote:
>>>> [...]
>>>>> @@ -4222,6 +4247,15 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
>>>>>        }
>>>>>    }
>>>>> +static void kvm_rate_limit_on_bus_lock(void)
>>>>> +{
>>>>> +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
>>>>> +
>>>>> +    if (delay_ns) {
>>>>> +        g_usleep(delay_ns / SCALE_US);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>>    MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>>>>>    {
>>>>>        X86CPU *x86_cpu = X86_CPU(cpu);
>>>>> @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>>>>>        } else {
>>>>>            env->eflags &= ~IF_MASK;
>>>>>        }
>>>>> +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
>>>>
>>>> Does the KVM API guarantee that KVM_RUN_X86_BUS_LOCK will never
>>>> be set if KVM_BUS_LOCK_DETECTION_EXIT isn't enabled?  (Otherwise
>>>> we risk crashing in ratelimit_calculate_delay() above if rate
>>>> limiting is disabled).
>>>>
>>>
>>> Yes. The KVM_RUN_X86_BUS_LOCK flag is only set when a bus lock VM exit
>>> happens. Bus lock VM exit is disabled by default and can only be
>>> enabled through the KVM_BUS_LOCK_DETECTION_EXIT capability.
>>
>> I'm queueing on x86-next, thanks!
> 
> This breaks the build.  Is there a linux-headers update patch I've missed?
> 

Thanks for the queue and sorry for forgetting to submit the 
linux-headers update patch.

> ../target/i386/kvm/kvm.c: In function 'kvm_arch_init':
> ../target/i386/kvm/kvm.c:2322:42: error: 'KVM_CAP_X86_BUS_LOCK_EXIT' undeclared (first use in this function); did you mean 'KVM_CAP_X86_DISABLE_EXITS'?
>               ret = kvm_check_extension(s, KVM_CAP_X86_BUS_LOCK_EXIT);
>                                            ^~~~~~~~~~~~~~~~~~~~~~~~~
>                                            KVM_CAP_X86_DISABLE_EXITS
>
Dr. David Alan Gilbert July 27, 2021, 8:28 a.m. UTC | #6
* Chenyi Qiang (chenyi.qiang@intel.com) wrote:
> A bus lock is acquired through either split locked access to writeback
> (WB) memory or any locked access to non-WB memory. It is typically >1000
> cycles slower than an atomic operation within a cache and can also
> disrupt performance on other cores.
> 
> Virtual machines can exploit bus locks to degrade the performance of
> the whole system. To address this kind of performance DoS attack coming
> from the VMs, bus lock VM exit was introduced in KVM, and it can report
> the bus locks detected in the guest. If enabled, KVM exits to userspace
> to let the user enforce throttling policies once bus locks are acquired
> in VMs.
> 
> The availability of bus lock VM exit can be detected through the
> KVM_CAP_X86_BUS_LOCK_EXIT capability. The returned bitmap contains the
> potential policies supported by KVM. The KVM_BUS_LOCK_DETECTION_EXIT
> field in the bitmap is the only strategy supported at present; it
> indicates that KVM will exit to userspace to handle the bus locks.
> 
> This patch adds a ratelimit on the bus locks acquired in guest as a
> mitigation policy.
> 
> Introduce a new field "bus_lock_ratelimit" to record the rate limit on
> bus locks in the target VM. The user can specify it through the
> "bus-lock-ratelimit" machine property. In the current implementation,
> the default value is 0 per second, which means no restriction on the
> bus locks.
> 
> As for the ratelimit on detected bus locks, simply set the ratelimit
> interval to 1s and restrict the quota of bus lock occurrences to the
> value of "bus_lock_ratelimit". A potential alternative is to expose the
> time slice as a property, which would give the user more precise
> control.
> 
> The details of bus lock VM exit can be found in the spec:
> https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
> 
> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>

Hi Chenyi,

  I noticed in this patch:


> +static void kvm_rate_limit_on_bus_lock(void)
> +{
> +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
> +
> +    if (delay_ns) {
> +        g_usleep(delay_ns / SCALE_US);
> +    }
> +}

and wondered if this would block cpu kicks, and what would happen if
delay_ns got quite big - Eduardo thinks it might get up to 1s.

Also, it feels similar to what migration does during 'auto converge';
see softmmu/cpu-throttle.c - instead of doing your own g_usleep
you could call cpu_throttle_set with a given throttle rate.

Dave

> +
>  MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>  {
>      X86CPU *x86_cpu = X86_CPU(cpu);
> @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>      } else {
>          env->eflags &= ~IF_MASK;
>      }
> +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
> +        kvm_rate_limit_on_bus_lock();
> +    }
>  
>      /* We need to protect the apic state against concurrent accesses from
>       * different threads in case the userspace irqchip is used. */
> @@ -4595,6 +4632,10 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>          ioapic_eoi_broadcast(run->eoi.vector);
>          ret = 0;
>          break;
> +    case KVM_EXIT_X86_BUS_LOCK:
> +        /* already handled in kvm_arch_post_run */
> +        ret = 0;
> +        break;
>      default:
>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>          ret = -1;
> -- 
> 2.17.1
> 
>
Chenyi Qiang July 28, 2021, 5:40 a.m. UTC | #7
On 7/27/2021 4:28 PM, Dr. David Alan Gilbert wrote:
> * Chenyi Qiang (chenyi.qiang@intel.com) wrote:
>> A bus lock is acquired through either split locked access to writeback
>> (WB) memory or any locked access to non-WB memory. It is typically >1000
>> cycles slower than an atomic operation within a cache and can also
>> disrupt performance on other cores.
>>
>> Virtual machines can exploit bus locks to degrade the performance of
>> the whole system. To address this kind of performance DoS attack coming
>> from the VMs, bus lock VM exit was introduced in KVM, and it can report
>> the bus locks detected in the guest. If enabled, KVM exits to userspace
>> to let the user enforce throttling policies once bus locks are acquired
>> in VMs.
>>
>> The availability of bus lock VM exit can be detected through the
>> KVM_CAP_X86_BUS_LOCK_EXIT capability. The returned bitmap contains the
>> potential policies supported by KVM. The KVM_BUS_LOCK_DETECTION_EXIT
>> field in the bitmap is the only strategy supported at present; it
>> indicates that KVM will exit to userspace to handle the bus locks.
>>
>> This patch adds a ratelimit on the bus locks acquired in guest as a
>> mitigation policy.
>>
>> Introduce a new field "bus_lock_ratelimit" to record the rate limit on
>> bus locks in the target VM. The user can specify it through the
>> "bus-lock-ratelimit" machine property. In the current implementation,
>> the default value is 0 per second, which means no restriction on the
>> bus locks.
>>
>> As for the ratelimit on detected bus locks, simply set the ratelimit
>> interval to 1s and restrict the quota of bus lock occurrences to the
>> value of "bus_lock_ratelimit". A potential alternative is to expose the
>> time slice as a property, which would give the user more precise
>> control.
>>
>> The details of bus lock VM exit can be found in the spec:
>> https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
> 
> Hi Chenyi,
> 
>    I noticed in this patch:
> 
> 
>> +static void kvm_rate_limit_on_bus_lock(void)
>> +{
>> +    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
>> +
>> +    if (delay_ns) {
>> +        g_usleep(delay_ns / SCALE_US);
>> +    }
>> +}
> 
> and wondered if this would block cpu kicks, and what would happen if
> delay_ns got quite big - Eduardo thinks it might get up to 1s.
> 

I did a rough test: force delay_ns to 1s and see how long it takes to
sleep 20s in the guest. For a 1-vcpu VM, the elapsed time comes out at
20.4~20.6s, so I assume applications in the guest may lose some
precision. Changing to a more refined time slice control is one
solution. (But given that such ratelimiting only happens for a
malicious guest, it may be acceptable to lose some accuracy.)

> Also, it feels similar to what migration does during 'auto converge';
> see softmmu/cpu-throttle.c - instead of doing your own g_usleep
> you could call cpu_throttle_set with a given throttle rate.
> 

Yes, I looked at the cpu-throttle code; cpu_throttle_set() works
similarly, but it needs some refactoring. Migration uses the static
throttle_percentage to control the global throttling, so if bus lock
throttling calls cpu_throttle_set(), it needs to be distinguished from
migration.
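
For the record, a rough sketch of what that alternative could look like
(the percentage mapping and the direct reuse of cpu_throttle_set() are
assumptions, and the clash with migration's use of the throttle noted
above is not solved here):

    #include "qemu/osdep.h"
    #include "sysemu/cpu-throttle.h"

    /* Hypothetical replacement for kvm_rate_limit_on_bus_lock(): turn
     * the ratelimit overshoot into a vCPU throttle percentage instead
     * of sleeping synchronously in the vCPU thread. */
    static void kvm_throttle_on_bus_lock(void)
    {
        uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
        /* assumed policy: fraction of the 1s slice we would have slept */
        int pct = MIN(99, (int)(delay_ns * 100 / BUS_LOCK_SLICE_TIME));

        if (pct > 0) {
            cpu_throttle_set(pct);      /* throttles vCPUs asynchronously */
        } else if (cpu_throttle_active()) {
            cpu_throttle_stop();        /* guest back under the limit */
        }
    }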

> Dave
> 
>> +
>>   MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>>   {
>>       X86CPU *x86_cpu = X86_CPU(cpu);
>> @@ -4237,6 +4271,9 @@ MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
>>       } else {
>>           env->eflags &= ~IF_MASK;
>>       }
>> +    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
>> +        kvm_rate_limit_on_bus_lock();
>> +    }
>>   
>>       /* We need to protect the apic state against concurrent accesses from
>>        * different threads in case the userspace irqchip is used. */
>> @@ -4595,6 +4632,10 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>           ioapic_eoi_broadcast(run->eoi.vector);
>>           ret = 0;
>>           break;
>> +    case KVM_EXIT_X86_BUS_LOCK:
>> +        /* already handled in kvm_arch_post_run */
>> +        ret = 0;
>> +        break;
>>       default:
>>           fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>           ret = -1;
>> -- 
>> 2.17.1
>>
>>

Patch

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index ed796fe6ba..d30cf27e29 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1246,6 +1246,23 @@  static void x86_machine_set_oem_table_id(Object *obj, const char *value,
     strncpy(x86ms->oem_table_id, value, 8);
 }
 
+static void x86_machine_get_bus_lock_ratelimit(Object *obj, Visitor *v,
+                                const char *name, void *opaque, Error **errp)
+{
+    X86MachineState *x86ms = X86_MACHINE(obj);
+    uint64_t bus_lock_ratelimit = x86ms->bus_lock_ratelimit;
+
+    visit_type_uint64(v, name, &bus_lock_ratelimit, errp);
+}
+
+static void x86_machine_set_bus_lock_ratelimit(Object *obj, Visitor *v,
+                               const char *name, void *opaque, Error **errp)
+{
+    X86MachineState *x86ms = X86_MACHINE(obj);
+
+    visit_type_uint64(v, name, &x86ms->bus_lock_ratelimit, errp);
+}
+
 static void x86_machine_initfn(Object *obj)
 {
     X86MachineState *x86ms = X86_MACHINE(obj);
@@ -1256,6 +1273,7 @@  static void x86_machine_initfn(Object *obj)
     x86ms->pci_irq_mask = ACPI_BUILD_PCI_IRQS;
     x86ms->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
     x86ms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
+    x86ms->bus_lock_ratelimit = 0;
 }
 
 static void x86_machine_class_init(ObjectClass *oc, void *data)
@@ -1299,6 +1317,12 @@  static void x86_machine_class_init(ObjectClass *oc, void *data)
                                           "Override the default value of field OEM Table ID "
                                           "in ACPI table header."
                                           "The string may be up to 8 bytes in size");
+
+    object_class_property_add(oc, X86_MACHINE_BUS_LOCK_RATELIMIT, "uint64_t",
+                                x86_machine_get_bus_lock_ratelimit,
+                                x86_machine_set_bus_lock_ratelimit, NULL, NULL);
+    object_class_property_set_description(oc, X86_MACHINE_BUS_LOCK_RATELIMIT,
+            "Set the ratelimit for the bus locks acquired in VMs");
 }
 
 static const TypeInfo x86_machine_info = {
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index c09b648dff..25a1f16f01 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -74,12 +74,20 @@  struct X86MachineState {
      * will be translated to MSI messages in the address space.
      */
     AddressSpace *ioapic_as;
+
+    /*
+     * Ratelimit enforced on detected bus locks in guest.
+     * The default value of the bus_lock_ratelimit is 0 per second,
+     * which means no limitation on the guest's bus locks.
+     */
+    uint64_t bus_lock_ratelimit;
 };
 
 #define X86_MACHINE_SMM              "smm"
 #define X86_MACHINE_ACPI             "acpi"
 #define X86_MACHINE_OEM_ID           "x-oem-id"
 #define X86_MACHINE_OEM_TABLE_ID     "x-oem-table-id"
+#define X86_MACHINE_BUS_LOCK_RATELIMIT  "bus-lock-ratelimit"
 
 #define TYPE_X86_MACHINE   MACHINE_TYPE_NAME("x86")
 OBJECT_DECLARE_TYPE(X86MachineState, X86MachineClass, X86_MACHINE)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d972eb4705..af328068b3 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -131,6 +131,9 @@  static bool has_msr_mcg_ext_ctl;
 static struct kvm_cpuid2 *cpuid_cache;
 static struct kvm_msr_list *kvm_feature_msrs;
 
+#define BUS_LOCK_SLICE_TIME 1000000000ULL /* ns */
+static RateLimit bus_lock_ratelimit_ctrl;
+
 int kvm_has_pit_state2(void)
 {
     return has_pit_state2;
@@ -2268,6 +2271,28 @@  int kvm_arch_init(MachineState *ms, KVMState *s)
         }
     }
 
+    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
+        X86MachineState *x86ms = X86_MACHINE(ms);
+
+        if (x86ms->bus_lock_ratelimit > 0) {
+            ret = kvm_check_extension(s, KVM_CAP_X86_BUS_LOCK_EXIT);
+            if (!(ret & KVM_BUS_LOCK_DETECTION_EXIT)) {
+                error_report("kvm: bus lock detection unsupported");
+                return -ENOTSUP;
+            }
+            ret = kvm_vm_enable_cap(s, KVM_CAP_X86_BUS_LOCK_EXIT, 0,
+                                    KVM_BUS_LOCK_DETECTION_EXIT);
+            if (ret < 0) {
+                error_report("kvm: Failed to enable bus lock detection cap: %s",
+                             strerror(-ret));
+                return ret;
+            }
+            ratelimit_init(&bus_lock_ratelimit_ctrl);
+            ratelimit_set_speed(&bus_lock_ratelimit_ctrl,
+                                x86ms->bus_lock_ratelimit, BUS_LOCK_SLICE_TIME);
+        }
+    }
+
     return 0;
 }
 
@@ -4222,6 +4247,15 @@  void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run)
     }
 }
 
+static void kvm_rate_limit_on_bus_lock(void)
+{
+    uint64_t delay_ns = ratelimit_calculate_delay(&bus_lock_ratelimit_ctrl, 1);
+
+    if (delay_ns) {
+        g_usleep(delay_ns / SCALE_US);
+    }
+}
+
 MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
@@ -4237,6 +4271,9 @@  MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run)
     } else {
         env->eflags &= ~IF_MASK;
     }
+    if (run->flags & KVM_RUN_X86_BUS_LOCK) {
+        kvm_rate_limit_on_bus_lock();
+    }
 
     /* We need to protect the apic state against concurrent accesses from
      * different threads in case the userspace irqchip is used. */
@@ -4595,6 +4632,10 @@  int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ioapic_eoi_broadcast(run->eoi.vector);
         ret = 0;
         break;
+    case KVM_EXIT_X86_BUS_LOCK:
+        /* already handled in kvm_arch_post_run */
+        ret = 0;
+        break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;