
KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch

Message ID: 20181116103036.GA19018@blackberry

Commit Message

Paul Mackerras Nov. 16, 2018, 10:30 a.m. UTC
Testing has revealed an occasional crash which appears to be caused
by a race between kvmppc_switch_mmu_to_hpt() and kvm_unmap_hva_range_hv().
The symptom is a NULL pointer dereference in __find_linux_pte(), called
from kvm_unmap_radix() with kvm->arch.pgtable == NULL.

Looking at kvmppc_switch_mmu_to_hpt(), it does indeed clear
kvm->arch.pgtable (via kvmppc_free_radix()) before setting
kvm->arch.radix to 0, and there is nothing to prevent
kvm_unmap_hva_range_hv() or the other MMU callback functions from
being called concurrently with kvmppc_switch_mmu_to_hpt() or
kvmppc_switch_mmu_to_radix().
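
For illustration, here is a minimal userspace model of the race.  All
names here (fake_kvm, switch_to_hpt, unmap_hva_range) are hypothetical
stand-ins for the kernel structures and functions, and as with any
race a given run may or may not actually crash:

/*
 * Sketch of the pre-patch ordering: the "switch" thread frees the
 * page table before clearing the radix flag, so the "notifier"
 * thread can observe radix == 1 while pgtable is already NULL,
 * mirroring the __find_linux_pte() oops described above.
 *
 * Build with: cc -pthread race_model.c
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_kvm {
	int *pgtable;		/* stands in for kvm->arch.pgtable */
	int radix;		/* stands in for kvm->arch.radix */
};

static void *switch_to_hpt(void *arg)
{
	struct fake_kvm *kvm = arg;

	free(kvm->pgtable);	/* pre-patch: free first... */
	kvm->pgtable = NULL;	/* ...window opens here... */
	kvm->radix = 0;		/* ...and only now does the flag change */
	return NULL;
}

static void *unmap_hva_range(void *arg)
{
	struct fake_kvm *kvm = arg;

	if (kvm->radix)		/* may still read 1 during the window */
		printf("pte: %d\n", *kvm->pgtable);	/* NULL deref */
	return NULL;
}

int main(void)
{
	struct fake_kvm kvm = { calloc(1, sizeof(int)), 1 };
	pthread_t a, b;

	pthread_create(&a, NULL, switch_to_hpt, &kvm);
	pthread_create(&b, NULL, unmap_hva_range, &kvm);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}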

This patch therefore takes kvm->mmu_lock (via spin_lock/spin_unlock)
around the assignments to kvm->arch.radix, and makes sure that the
partition-scoped radix tree or HPT is only freed after kvm->arch.radix
has been changed.
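
In terms of the model above, the fix amounts to flipping the flag
under a lock before freeing, with readers checking the flag under the
same lock.  A sketch, with a pthread mutex standing in for
kvm->mmu_lock (note the sketch also clears the pointer inside the
locked region for clarity; the real patch keeps only the
kvm->arch.radix assignment under the lock and relies on the free
happening afterwards):

static pthread_mutex_t mmu_lock = PTHREAD_MUTEX_INITIALIZER;

static void *switch_to_hpt_fixed(void *arg)
{
	struct fake_kvm *kvm = arg;
	int *old;

	pthread_mutex_lock(&mmu_lock);
	kvm->radix = 0;		/* flag changes under the lock... */
	old = kvm->pgtable;
	kvm->pgtable = NULL;
	pthread_mutex_unlock(&mmu_lock);

	free(old);		/* ...and the free happens only afterwards */
	return NULL;
}

static void *unmap_hva_range_fixed(void *arg)
{
	struct fake_kvm *kvm = arg;

	pthread_mutex_lock(&mmu_lock);
	if (kvm->radix)		/* pgtable stays valid while we hold the lock */
		printf("pte: %d\n", *kvm->pgtable);
	pthread_mutex_unlock(&mmu_lock);
	return NULL;
}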

The patch also takes kvm->mmu_lock in kvmppc_rmap_reset() to make sure
that the clearing of each rmap array (one per memslot) doesn't happen
concurrently with its use in kvm_unmap_hva_range_hv() or the other
MMU callbacks.
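
The rmap reset uses the same lock but takes and releases it once per
memslot, so each memset excludes the callbacks without holding the
lock across the whole loop.  Continuing the userspace model (the SRCU
read lock and memslot iteration of the real function are elided, and
rmap_reset_model is again a hypothetical name):

#include <string.h>	/* for memset() */

static void rmap_reset_model(unsigned long **rmaps, size_t nslots,
			     size_t npages)
{
	size_t i;

	for (i = 0; i < nslots; i++) {
		pthread_mutex_lock(&mmu_lock);	/* exclude callbacks... */
		memset(rmaps[i], 0, npages * sizeof(unsigned long));
		pthread_mutex_unlock(&mmu_lock);	/* ...per slot */
	}
}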

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c |  3 +++
 arch/powerpc/kvm/book3s_hv.c        | 17 +++++++++++------
 2 files changed, 14 insertions(+), 6 deletions(-)

Comments

Paul Mackerras Dec. 18, 2018, 12:59 a.m. UTC | #1
On Fri, Nov 16, 2018 at 09:30:36PM +1100, Paul Mackerras wrote:
> Testing has revealed an occasional crash which appears to be caused
> by a race between kvmppc_switch_mmu_to_hpt() and kvm_unmap_hva_range_hv().
> [...]
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

Applied to my kvm-ppc-next branch.

Patch

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index c615617..a18afda 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -743,12 +743,15 @@ void kvmppc_rmap_reset(struct kvm *kvm)
 	srcu_idx = srcu_read_lock(&kvm->srcu);
 	slots = kvm_memslots(kvm);
 	kvm_for_each_memslot(memslot, slots) {
+		/* Mutual exclusion with kvm_unmap_hva_range etc. */
+		spin_lock(&kvm->mmu_lock);
 		/*
 		 * This assumes it is acceptable to lose reference and
 		 * change bits across a reset.
 		 */
 		memset(memslot->arch.rmap, 0,
 		       memslot->npages * sizeof(*memslot->arch.rmap));
+		spin_unlock(&kvm->mmu_lock);
 	}
 	srcu_read_unlock(&kvm->srcu, srcu_idx);
 }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a56f841..ab43306 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4532,12 +4532,15 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm)
 {
 	if (nesting_enabled(kvm))
 		kvmhv_release_all_nested(kvm);
+	kvmppc_rmap_reset(kvm);
+	kvm->arch.process_table = 0;
+	/* Mutual exclusion with kvm_unmap_hva_range etc. */
+	spin_lock(&kvm->mmu_lock);
+	kvm->arch.radix = 0;
+	spin_unlock(&kvm->mmu_lock);
 	kvmppc_free_radix(kvm);
 	kvmppc_update_lpcr(kvm, LPCR_VPM1,
 			   LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR);
-	kvmppc_rmap_reset(kvm);
-	kvm->arch.radix = 0;
-	kvm->arch.process_table = 0;
 	return 0;
 }
 
@@ -4549,12 +4552,14 @@ int kvmppc_switch_mmu_to_radix(struct kvm *kvm)
 	err = kvmppc_init_vm_radix(kvm);
 	if (err)
 		return err;
-
+	kvmppc_rmap_reset(kvm);
+	/* Mutual exclusion with kvm_unmap_hva_range etc. */
+	spin_lock(&kvm->mmu_lock);
+	kvm->arch.radix = 1;
+	spin_unlock(&kvm->mmu_lock);
 	kvmppc_free_hpt(&kvm->arch.hpt);
 	kvmppc_update_lpcr(kvm, LPCR_UPRT | LPCR_GTSE | LPCR_HR,
 			   LPCR_VPM1 | LPCR_UPRT | LPCR_GTSE | LPCR_HR);
-	kvmppc_rmap_reset(kvm);
-	kvm->arch.radix = 1;
 	return 0;
 }