Message ID: 20240608000639.3295768-2-seanjc@google.com (mailing list archive)
State: New, archived
Series: KVM: Register cpuhp/syscore callbacks when enabling virt
On Fri, 2024-06-07 at 17:06 -0700, Sean Christopherson wrote:
> Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
> on x86 due to a chain of locks and SRCU synchronizations.  Translating the
> below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
> CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
> fairness of r/w semaphores).
>
>     CPU0                     CPU1                     CPU2
> 1   lock(&kvm->slots_lock);
> 2                                                    lock(&vcpu->mutex);
> 3                                                    lock(&kvm->srcu);
> 4                           lock(cpu_hotplug_lock);
> 5                           lock(kvm_lock);
> 6                           lock(&kvm->slots_lock);
> 7                                                    lock(cpu_hotplug_lock);
> 8   sync(&kvm->srcu);
>
> [...]
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Kai Huang <kai.huang@intel.com>

Nitpickings below:

> ---
>  Documentation/virt/kvm/locking.rst | 19 ++++++++++++------
>  virt/kvm/kvm_main.c                | 31 +++++++++++++++---------------
>  2 files changed, 29 insertions(+), 21 deletions(-)
>
> diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> index 02880d5552d5..5e102fe5b396 100644
> --- a/Documentation/virt/kvm/locking.rst
> +++ b/Documentation/virt/kvm/locking.rst
> @@ -227,7 +227,13 @@ time it will be set using the Dirty tracking mechanism described above.
>  :Type:     mutex
>  :Arch:     any
>  :Protects: - vm_list
> -           - kvm_usage_count
> +
> +``kvm_usage_count``
> +^^^^^^^^^^^^^^^^^^^

kvm_usage_lock

> +
> +:Type:     mutex
> +:Arch:     any
> +:Protects: - kvm_usage_count
>             - hardware virtualization enable/disable
>  :Comment:  KVM also disables CPU hotplug via cpus_read_lock() during
>             enable/disable.

I think this sentence should be improved to at least mention "Exists
because using kvm_lock leads to deadlock", just like the comment for
vendor_module_lock below.

> @@ -290,11 +296,12 @@ time it will be set using the Dirty tracking mechanism described above.
>      wakeup.
>
>  ``vendor_module_lock``
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +^^^^^^^^^^^^^^^^^^^^^^
>  :Type:     mutex
>  :Arch:     x86
>  :Protects: loading a vendor module (kvm_amd or kvm_intel)
> -:Comment:  Exists because using kvm_lock leads to deadlock.  cpu_hotplug_lock is
> -    taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and
> -    many operations need to take cpu_hotplug_lock when loading a vendor module,
> -    e.g. updating static calls.
> +:Comment:  Exists because using kvm_lock leads to deadlock.  kvm_lock is taken
> +    in notifiers, e.g. __kvmclock_cpufreq_notifier(), that may be invoked while
> +    cpu_hotplug_lock is held, e.g. from cpufreq_boost_trigger_state(), and many
> +    operations need to take cpu_hotplug_lock when loading a vendor module, e.g.
> +    updating static calls.
On 6/8/24 02:06, Sean Christopherson wrote:
> Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
> on x86 due to a chain of locks and SRCU synchronizations.  Translating the
> below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
> CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
> fairness of r/w semaphores).
>
>     CPU0                     CPU1                     CPU2
> 1   lock(&kvm->slots_lock);
> 2                                                    lock(&vcpu->mutex);
> 3                                                    lock(&kvm->srcu);
> 4                           lock(cpu_hotplug_lock);
> 5                           lock(kvm_lock);
> 6                           lock(&kvm->slots_lock);
> 7                                                    lock(cpu_hotplug_lock);
> 8   sync(&kvm->srcu);
>
> Note, there are likely more potential deadlocks in KVM x86, e.g. the same
> pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
> __kvmclock_cpufreq_notifier()

Offhand I couldn't see any places where {,__}cpufreq_driver_target() is
called within cpus_read_lock().  I didn't look too closely though.

> +``kvm_usage_count``
> +^^^^^^^^^^^^^^^^^^^

``kvm_usage_lock``

Paolo

> +
> +:Type:     mutex
> +:Arch:     any
> +:Protects: - kvm_usage_count
>             - hardware virtualization enable/disable
>  :Comment:  KVM also disables CPU hotplug via cpus_read_lock() during
>             enable/disable.
> @@ -290,11 +296,12 @@ time it will be set using the Dirty tracking mechanism described above.
>      wakeup.
>
>  ``vendor_module_lock``
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +^^^^^^^^^^^^^^^^^^^^^^
>  :Type:     mutex
>  :Arch:     x86
>  :Protects: loading a vendor module (kvm_amd or kvm_intel)
> -:Comment:  Exists because using kvm_lock leads to deadlock.  cpu_hotplug_lock is
> -    taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and
> -    many operations need to take cpu_hotplug_lock when loading a vendor module,
> -    e.g. updating static calls.
> +:Comment:  Exists because using kvm_lock leads to deadlock.  kvm_lock is taken
> +    in notifiers, e.g. __kvmclock_cpufreq_notifier(), that may be invoked while
> +    cpu_hotplug_lock is held, e.g. from cpufreq_boost_trigger_state(), and many
> +    operations need to take cpu_hotplug_lock when loading a vendor module, e.g.
> +    updating static calls.
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 4965196cad58..d9b0579d3eea 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5499,6 +5499,7 @@ __visible bool kvm_rebooting;
>  EXPORT_SYMBOL_GPL(kvm_rebooting);
>
>  static DEFINE_PER_CPU(bool, hardware_enabled);
> +static DEFINE_MUTEX(kvm_usage_lock);
>  static int kvm_usage_count;
>
>  static int __hardware_enable_nolock(void)
> @@ -5531,10 +5532,10 @@ static int kvm_online_cpu(unsigned int cpu)
>  	 * be enabled.  Otherwise running VMs would encounter unrecoverable
>  	 * errors when scheduled to this CPU.
>  	 */
> -	mutex_lock(&kvm_lock);
> +	mutex_lock(&kvm_usage_lock);
>  	if (kvm_usage_count)
>  		ret = __hardware_enable_nolock();
> -	mutex_unlock(&kvm_lock);
> +	mutex_unlock(&kvm_usage_lock);
>  	return ret;
>  }
>
> @@ -5554,10 +5555,10 @@ static void hardware_disable_nolock(void *junk)
>
>  static int kvm_offline_cpu(unsigned int cpu)
>  {
> -	mutex_lock(&kvm_lock);
> +	mutex_lock(&kvm_usage_lock);
>  	if (kvm_usage_count)
>  		hardware_disable_nolock(NULL);
> -	mutex_unlock(&kvm_lock);
> +	mutex_unlock(&kvm_usage_lock);
>  	return 0;
>  }
>
> @@ -5573,9 +5574,9 @@ static void hardware_disable_all_nolock(void)
>
>  static void hardware_disable_all(void)
>  {
>  	cpus_read_lock();
> -	mutex_lock(&kvm_lock);
> +	mutex_lock(&kvm_usage_lock);
>  	hardware_disable_all_nolock();
> -	mutex_unlock(&kvm_lock);
> +	mutex_unlock(&kvm_usage_lock);
>  	cpus_read_unlock();
>  }
>
> @@ -5606,7 +5607,7 @@ static int hardware_enable_all(void)
>  	 * enable hardware multiple times.
>  	 */
>  	cpus_read_lock();
> -	mutex_lock(&kvm_lock);
> +	mutex_lock(&kvm_usage_lock);
>
>  	r = 0;
>
> @@ -5620,7 +5621,7 @@ static int hardware_enable_all(void)
>  		}
>  	}
>
> -	mutex_unlock(&kvm_lock);
> +	mutex_unlock(&kvm_usage_lock);
>  	cpus_read_unlock();
>
>  	return r;
> @@ -5648,13 +5649,13 @@ static int kvm_suspend(void)
>  {
>  	/*
>  	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
> -	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
> -	 * is stable.  Assert that kvm_lock is not held to ensure the system
> -	 * isn't suspended while KVM is enabling hardware.  Hardware enabling
> -	 * can be preempted, but the task cannot be frozen until it has dropped
> -	 * all locks (userspace tasks are frozen via a fake signal).
> +	 * callbacks, i.e. no need to acquire kvm_usage_lock to ensure the usage
> +	 * count is stable.  Assert that kvm_usage_lock is not held to ensure
> +	 * the system isn't suspended while KVM is enabling hardware.  Hardware
> +	 * enabling can be preempted, but the task cannot be frozen until it has
> +	 * dropped all locks (userspace tasks are frozen via a fake signal).
>  	 */
> -	lockdep_assert_not_held(&kvm_lock);
> +	lockdep_assert_not_held(&kvm_usage_lock);
>  	lockdep_assert_irqs_disabled();
>
>  	if (kvm_usage_count)
> @@ -5664,7 +5665,7 @@ static int kvm_suspend(void)
>
>  static void kvm_resume(void)
>  {
> -	lockdep_assert_not_held(&kvm_lock);
> +	lockdep_assert_not_held(&kvm_usage_lock);
>  	lockdep_assert_irqs_disabled();
>
>  	if (kvm_usage_count)
On Wed, Aug 14, 2024, Paolo Bonzini wrote:
> On 6/8/24 02:06, Sean Christopherson wrote:
> > Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
> > on x86 due to a chain of locks and SRCU synchronizations.  Translating the
> > below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
> > CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
> > fairness of r/w semaphores).
> >
> >     CPU0                     CPU1                     CPU2
> > 1   lock(&kvm->slots_lock);
> > 2                                                    lock(&vcpu->mutex);
> > 3                                                    lock(&kvm->srcu);
> > 4                           lock(cpu_hotplug_lock);
> > 5                           lock(kvm_lock);
> > 6                           lock(&kvm->slots_lock);
> > 7                                                    lock(cpu_hotplug_lock);
> > 8   sync(&kvm->srcu);
> >
> > Note, there are likely more potential deadlocks in KVM x86, e.g. the same
> > pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
> > __kvmclock_cpufreq_notifier()
>
> Offhand I couldn't see any places where {,__}cpufreq_driver_target() is
> called within cpus_read_lock().  I didn't look too closely though.

Aha!  I think I finally found it, and it's rather obvious now that I've
found it.  I looked quite deeply on multiple occasions in the past and
never found such a case, but I could've sworn someone (Kai?) reported a
lockdep splat related to the cpufreq stuff when I did the big generic
hardware enabling a while back.  Of course, I couldn't find that either :-)

Anyways...

  cpuhp_cpufreq_online()
  |
  -> cpufreq_online()
     |
     -> cpufreq_gov_performance_limits()
        |
        -> __cpufreq_driver_target()
           |
           -> __target_index()

> > +``kvm_usage_count``
> > +^^^^^^^^^^^^^^^^^^^
>
> ``kvm_usage_lock``

Good job me.
On Thu, Aug 15, 2024 at 4:40 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Aug 14, 2024, Paolo Bonzini wrote:
> > On 6/8/24 02:06, Sean Christopherson wrote:
> > > Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
> > > on x86 due to a chain of locks and SRCU synchronizations.  Translating the
> > > below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
> > > CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
> > > fairness of r/w semaphores).
> > >
> > >     CPU0                     CPU1                     CPU2
> > > 1   lock(&kvm->slots_lock);
> > > 2                                                    lock(&vcpu->mutex);
> > > 3                                                    lock(&kvm->srcu);
> > > 4                           lock(cpu_hotplug_lock);
> > > 5                           lock(kvm_lock);
> > > 6                           lock(&kvm->slots_lock);
> > > 7                                                    lock(cpu_hotplug_lock);
> > > 8   sync(&kvm->srcu);
> > >
> > > Note, there are likely more potential deadlocks in KVM x86, e.g. the same
> > > pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
> > > __kvmclock_cpufreq_notifier()
> >
> > Offhand I couldn't see any places where {,__}cpufreq_driver_target() is
> > called within cpus_read_lock().  I didn't look too closely though.
>
> Anyways...
>
>   cpuhp_cpufreq_online()
>   |
>   -> cpufreq_online()
>      |
>      -> cpufreq_gov_performance_limits()
>         |
>         -> __cpufreq_driver_target()
>            |
>            -> __target_index()

Ah, I only looked in generic code.

Can you add a comment to the commit message suggesting switching the
vm_list to RCU?  All the occurrences of list_for_each_entry(..., &vm_list, ...)
seem amenable to that, and it should be easy enough to stick all or part of
kvm_destroy_vm() behind call_rcu().

Thanks,

Paolo
On Thu, Aug 15, 2024, Paolo Bonzini wrote:
> On Thu, Aug 15, 2024 at 4:40 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Aug 14, 2024, Paolo Bonzini wrote:
> > > On 6/8/24 02:06, Sean Christopherson wrote:
> > > > Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
> > > > on x86 due to a chain of locks and SRCU synchronizations.  Translating the
> > > > below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
> > > > CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
> > > > fairness of r/w semaphores).
> > > >
> > > >     CPU0                     CPU1                     CPU2
> > > > 1   lock(&kvm->slots_lock);
> > > > 2                                                    lock(&vcpu->mutex);
> > > > 3                                                    lock(&kvm->srcu);
> > > > 4                           lock(cpu_hotplug_lock);
> > > > 5                           lock(kvm_lock);
> > > > 6                           lock(&kvm->slots_lock);
> > > > 7                                                    lock(cpu_hotplug_lock);
> > > > 8   sync(&kvm->srcu);
> > > >
> > > > Note, there are likely more potential deadlocks in KVM x86, e.g. the same
> > > > pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
> > > > __kvmclock_cpufreq_notifier()
> > >
> > > Offhand I couldn't see any places where {,__}cpufreq_driver_target() is
> > > called within cpus_read_lock().  I didn't look too closely though.
> >
> > Anyways...
> >
> >   cpuhp_cpufreq_online()
> >   |
> >   -> cpufreq_online()
> >      |
> >      -> cpufreq_gov_performance_limits()
> >         |
> >         -> __cpufreq_driver_target()
> >            |
> >            -> __target_index()
>
> Ah, I only looked in generic code.
>
> Can you add a comment to the commit message suggesting switching the vm_list
> to RCU?  All the occurrences of list_for_each_entry(..., &vm_list, ...) seem
> amenable to that, and it should be easy enough to stick all or part of
> kvm_destroy_vm() behind call_rcu().

+1 to the idea of making vm_list RCU-protected, though I think we'd want to
use SRCU, e.g. set_nx_huge_pages() currently takes each VM's slots_lock while
purging possible NX hugepages.

And I think kvm_destroy_vm() can simply do a synchronize_srcu() after removing
the VM from the list.  Trying to put kvm_destroy_vm() into an RCU callback
would probably be a bit of a disaster, e.g. kvm-intel.ko in particular
currently does some rather nasty things while destroying a VM.
On Sat, Aug 31, 2024 at 1:45 AM Sean Christopherson <seanjc@google.com> wrote:
> > Can you add a comment to the commit message suggesting switching the vm_list
> > to RCU?  All the occurrences of list_for_each_entry(..., &vm_list, ...) seem
> > amenable to that, and it should be easy enough to stick all or part of
> > kvm_destroy_vm() behind call_rcu().
>
> +1 to the idea of making vm_list RCU-protected, though I think we'd want to
> use SRCU, e.g. set_nx_huge_pages() currently takes each VM's slots_lock while
> purging possible NX hugepages.

Ah, for that I was thinking of wrapping everything with
kvm_get_kvm_safe()/rcu_read_unlock() and kvm_put_kvm()/rcu_read_lock().
Avoiding zero refcounts is safer, and generally these visits are not hot code.

> And I think kvm_destroy_vm() can simply do a synchronize_srcu() after removing
> the VM from the list.  Trying to put kvm_destroy_vm() into an RCU callback
> would probably be a bit of a disaster, e.g. kvm-intel.ko in particular
> currently does some rather nasty things while destroying a VM.

If all iteration is guarded by kvm_get_kvm_safe(), probably you can defer only
the reclaiming part (i.e. the part after kvm_destroy_devices()), which is a
lot easier to audit.

Anyhow, I took a look at the v2 and it looks good.

Paolo
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 02880d5552d5..5e102fe5b396 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -227,7 +227,13 @@ time it will be set using the Dirty tracking mechanism described above.
 :Type:     mutex
 :Arch:     any
 :Protects: - vm_list
-           - kvm_usage_count
+
+``kvm_usage_count``
+^^^^^^^^^^^^^^^^^^^
+
+:Type:     mutex
+:Arch:     any
+:Protects: - kvm_usage_count
            - hardware virtualization enable/disable
 :Comment:  KVM also disables CPU hotplug via cpus_read_lock() during
            enable/disable.
@@ -290,11 +296,12 @@ time it will be set using the Dirty tracking mechanism described above.
     wakeup.
 
 ``vendor_module_lock``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^
 :Type:     mutex
 :Arch:     x86
 :Protects: loading a vendor module (kvm_amd or kvm_intel)
-:Comment:  Exists because using kvm_lock leads to deadlock.  cpu_hotplug_lock is
-    taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and
-    many operations need to take cpu_hotplug_lock when loading a vendor module,
-    e.g. updating static calls.
+:Comment:  Exists because using kvm_lock leads to deadlock.  kvm_lock is taken
+    in notifiers, e.g. __kvmclock_cpufreq_notifier(), that may be invoked while
+    cpu_hotplug_lock is held, e.g. from cpufreq_boost_trigger_state(), and many
+    operations need to take cpu_hotplug_lock when loading a vendor module, e.g.
+    updating static calls.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4965196cad58..d9b0579d3eea 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5499,6 +5499,7 @@ __visible bool kvm_rebooting;
 EXPORT_SYMBOL_GPL(kvm_rebooting);
 
 static DEFINE_PER_CPU(bool, hardware_enabled);
+static DEFINE_MUTEX(kvm_usage_lock);
 static int kvm_usage_count;
 
 static int __hardware_enable_nolock(void)
@@ -5531,10 +5532,10 @@ static int kvm_online_cpu(unsigned int cpu)
 	 * be enabled.  Otherwise running VMs would encounter unrecoverable
 	 * errors when scheduled to this CPU.
 	 */
-	mutex_lock(&kvm_lock);
+	mutex_lock(&kvm_usage_lock);
 	if (kvm_usage_count)
 		ret = __hardware_enable_nolock();
-	mutex_unlock(&kvm_lock);
+	mutex_unlock(&kvm_usage_lock);
 	return ret;
 }
 
@@ -5554,10 +5555,10 @@ static void hardware_disable_nolock(void *junk)
 
 static int kvm_offline_cpu(unsigned int cpu)
 {
-	mutex_lock(&kvm_lock);
+	mutex_lock(&kvm_usage_lock);
 	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
-	mutex_unlock(&kvm_lock);
+	mutex_unlock(&kvm_usage_lock);
 	return 0;
 }
 
@@ -5573,9 +5574,9 @@ static void hardware_disable_all_nolock(void)
 
 static void hardware_disable_all(void)
 {
 	cpus_read_lock();
-	mutex_lock(&kvm_lock);
+	mutex_lock(&kvm_usage_lock);
 	hardware_disable_all_nolock();
-	mutex_unlock(&kvm_lock);
+	mutex_unlock(&kvm_usage_lock);
 	cpus_read_unlock();
 }
 
@@ -5606,7 +5607,7 @@ static int hardware_enable_all(void)
 	 * enable hardware multiple times.
 	 */
 	cpus_read_lock();
-	mutex_lock(&kvm_lock);
+	mutex_lock(&kvm_usage_lock);
 
 	r = 0;
 
@@ -5620,7 +5621,7 @@ static int hardware_enable_all(void)
 		}
 	}
 
-	mutex_unlock(&kvm_lock);
+	mutex_unlock(&kvm_usage_lock);
 	cpus_read_unlock();
 
 	return r;
@@ -5648,13 +5649,13 @@ static int kvm_suspend(void)
 {
 	/*
 	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
-	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
-	 * is stable.  Assert that kvm_lock is not held to ensure the system
-	 * isn't suspended while KVM is enabling hardware.  Hardware enabling
-	 * can be preempted, but the task cannot be frozen until it has dropped
-	 * all locks (userspace tasks are frozen via a fake signal).
+	 * callbacks, i.e. no need to acquire kvm_usage_lock to ensure the usage
+	 * count is stable.  Assert that kvm_usage_lock is not held to ensure
+	 * the system isn't suspended while KVM is enabling hardware.  Hardware
+	 * enabling can be preempted, but the task cannot be frozen until it has
+	 * dropped all locks (userspace tasks are frozen via a fake signal).
 	 */
-	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_not_held(&kvm_usage_lock);
 	lockdep_assert_irqs_disabled();
 
 	if (kvm_usage_count)
@@ -5664,7 +5665,7 @@ static int kvm_suspend(void)
 
 static void kvm_resume(void)
 {
-	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_not_held(&kvm_usage_lock);
 	lockdep_assert_irqs_disabled();
 
 	if (kvm_usage_count)
Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
on x86 due to a chain of locks and SRCU synchronizations.  Translating the
below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
fairness of r/w semaphores).

    CPU0                     CPU1                     CPU2
1   lock(&kvm->slots_lock);
2                                                    lock(&vcpu->mutex);
3                                                    lock(&kvm->srcu);
4                           lock(cpu_hotplug_lock);
5                           lock(kvm_lock);
6                           lock(&kvm->slots_lock);
7                                                    lock(cpu_hotplug_lock);
8   sync(&kvm->srcu);

Note, there are likely more potential deadlocks in KVM x86, e.g. the same
pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with
__kvmclock_cpufreq_notifier(), but actually triggering such deadlocks is
beyond rare due to the combination of dependencies and timings involved.
E.g. the cpufreq notifier is only used on older CPUs without a constant
TSC, mucking with the NX hugepage mitigation while VMs are running is very
uncommon, and doing so while also onlining/offlining a CPU (necessary to
generate contention on cpu_hotplug_lock) would be even more unusual.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S         O
  ------------------------------------------------------
  tee/35048 is trying to acquire lock:
  ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm]

  but task is already holding lock:
  ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (kvm_lock){+.+.}-{3:3}:
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         kvm_dev_ioctl+0x4fb/0xe50 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #2 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2e/0xb0
         static_key_slow_inc+0x16/0x30
         kvm_lapic_set_base+0x6a/0x1c0 [kvm]
         kvm_set_apic_base+0x8f/0xe0 [kvm]
         kvm_set_msr_common+0x9ae/0xf80 [kvm]
         vmx_set_msr+0xa54/0xbe0 [kvm_intel]
         __kvm_set_msr+0xb6/0x1a0 [kvm]
         kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm]
         kvm_vcpu_ioctl+0x485/0x5b0 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #1 (&kvm->srcu){.+.+}-{0:0}:
         __synchronize_srcu+0x44/0x1a0
         synchronize_srcu_expedited+0x21/0x30
         kvm_swap_active_memslots+0x110/0x1c0 [kvm]
         kvm_set_memslot+0x360/0x620 [kvm]
         __kvm_set_memory_region+0x27b/0x300 [kvm]
         kvm_vm_ioctl_set_memory_region+0x43/0x60 [kvm]
         kvm_vm_ioctl+0x295/0x650 [kvm]
         __se_sys_ioctl+0x7b/0xd0
         __x64_sys_ioctl+0x21/0x30
         x64_sys_call+0x15d0/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

  -> #0 (&kvm->slots_lock){+.+.}-{3:3}:
         __lock_acquire+0x15ef/0x2e30
         lock_acquire+0xe0/0x260
         __mutex_lock+0x6a/0xb40
         mutex_lock_nested+0x1f/0x30
         set_nx_huge_pages+0x179/0x1e0 [kvm]
         param_attr_store+0x93/0x100
         module_attr_store+0x22/0x40
         sysfs_kf_write+0x81/0xb0
         kernfs_fop_write_iter+0x133/0x1d0
         vfs_write+0x28d/0x380
         ksys_write+0x70/0xe0
         __x64_sys_write+0x1f/0x30
         x64_sys_call+0x281b/0x2e60
         do_syscall_64+0x83/0x160
         entry_SYSCALL_64_after_hwframe+0x76/0x7e

Cc: Chao Gao <chao.gao@intel.com>
Fixes: 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/locking.rst | 19 ++++++++++++------
 virt/kvm/kvm_main.c                | 31 +++++++++++++++---------------
 2 files changed, 29 insertions(+), 21 deletions(-)