[2/2] KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is enabled

Message ID 20221027200316.2221027-3-dmatlack@google.com (mailing list archive)
State New, archived
Series KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is enabled

Commit Message

David Matlack Oct. 27, 2022, 8:03 p.m. UTC
Do not recover NX Huge Pages if dirty logging is enabled on any memslot.
Zapping a region that is being dirty tracked is a waste of CPU cycles
(both by the recovery worker and by subsequent vCPU faults) since the
memory will just be faulted back in at the same 4KiB granularity.

Use kvm->nr_memslots_dirty_logging as a cheap way to check if NX Huge
Pages are being dirty tracked. This has the additional benefit of
ensuring that the NX recovery worker uses little-to-no CPU during the
precopy phase of a live migration.
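
For reference, the counter itself is maintained by common KVM code
(patch 1/2 of this series). The following is a paraphrased sketch of
that bookkeeping, not the exact upstream hunk; old_flags and new_flags
stand in for the memslot's flags before and after the update:

	/*
	 * Sketch: bump the counter whenever a memslot update toggles
	 * KVM_MEM_LOG_DIRTY_PAGES. Updates happen under slots_lock,
	 * which is why lockless readers such as the recovery worker
	 * use READ_ONCE() on the counter.
	 */
	if ((old_flags ^ new_flags) & KVM_MEM_LOG_DIRTY_PAGES) {
		if (new_flags & KVM_MEM_LOG_DIRTY_PAGES)
			kvm->nr_memslots_dirty_logging++;
		else
			kvm->nr_memslots_dirty_logging--;
	}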

Note, kvm->nr_memslots_dirty_logging can result in false positives and
false negatives, e.g. if dirty logging is only enabled on a subset of
memslots or the recovery worker races with a memslot update. However,
there are no correctness issues either way, and eventually NX Huge Pages
will be recovered once dirty logging is disabled on all memslots.

An alternative approach would be to look up the memslot of each NX Huge
Page and check if it is being dirty tracked. However, this would
increase the CPU usage of the recovery worker and the MMU lock hold time
in write mode, especially in VMs with a large number of memslots.
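
For contrast, a minimal sketch of that rejected per-page lookup. The
helper name nx_huge_page_dirty_tracked() is made up for illustration;
gfn_to_memslot() and kvm_slot_dirty_track_enabled() are existing KVM
helpers:

	/* Hypothetical per-page check; NOT what this patch does. */
	static bool nx_huge_page_dirty_tracked(struct kvm *kvm,
					       struct kvm_mmu_page *sp)
	{
		struct kvm_memory_slot *slot;

		/* One memslot lookup per candidate page, under mmu_lock. */
		slot = gfn_to_memslot(kvm, sp->gfn);
		return slot && kvm_slot_dirty_track_enabled(slot);
	}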

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6f81539061d6..b499d3757173 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6806,6 +6806,14 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 	bool flush = false;
 	ulong to_zap;
 
+	/*
+	 * Do not attempt to recover NX Huge Pages while dirty logging is
+	 * enabled since any subsequent accesses by vCPUs will just fault the
+	 * memory back in at the same 4KiB granularity.
+	 */
+	if (READ_ONCE(kvm->nr_memslots_dirty_logging))
+		return;
+
 	rcu_idx = srcu_read_lock(&kvm->srcu);
 	write_lock(&kvm->mmu_lock);
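
For context, dirty logging is toggled per memslot from userspace via
KVM_SET_USER_MEMORY_REGION. A minimal sketch, where vm_fd, gpa, hva,
and size are placeholders:

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	/*
	 * Setting KVM_MEM_LOG_DIRTY_PAGES on any memslot makes
	 * kvm->nr_memslots_dirty_logging non-zero, pausing NX Huge Page
	 * recovery until the flag is cleared on all memslots.
	 */
	struct kvm_userspace_memory_region region = {
		.slot = 0,
		.flags = KVM_MEM_LOG_DIRTY_PAGES,
		.guest_phys_addr = gpa,
		.memory_size = size,
		.userspace_addr = (__u64)hva,
	};

	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);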