
[v3,15/15] KVM: MMU: replace kvm_zap_all with kvm_mmu_invalid_all_pages

Message ID 1366093973-2617-16-git-send-email-xiaoguangrong@linux.vnet.ibm.com (mailing list archive)
State New, archived

Commit Message

Xiao Guangrong April 16, 2013, 6:32 a.m. UTC
Use kvm_mmu_invalid_all_pages in kvm_arch_flush_shadow_all and
rename kvm_zap_all to kvm_free_all, which is used to free all
memory used by the kvm mmu when the vm is being destroyed. At that
point no vcpu exists and the mmu notifier has been unregistered, so
we can free the shadow pages outside of mmu-lock.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/include/asm/kvm_host.h |    2 +-
 arch/x86/kvm/mmu.c              |   15 ++-------------
 arch/x86/kvm/x86.c              |    9 ++++-----
 3 files changed, 7 insertions(+), 19 deletions(-)

Comments

Marcelo Tosatti April 18, 2013, 12:08 a.m. UTC | #1
On Tue, Apr 16, 2013 at 02:32:53PM +0800, Xiao Guangrong wrote:
> Use kvm_mmu_invalid_all_pages in kvm_arch_flush_shadow_all and
> rename kvm_zap_all to kvm_free_all, which is used to free all
> memory used by the kvm mmu when the vm is being destroyed. At that
> point no vcpu exists and the mmu notifier has been unregistered, so
> we can free the shadow pages outside of mmu-lock.

Since there is no contention for mmu-lock its also not a problem to 
grab the lock right?

Automated verification of locking/srcu might complain.


Xiao Guangrong April 18, 2013, 4:03 a.m. UTC | #2
On 04/18/2013 08:08 AM, Marcelo Tosatti wrote:
> On Tue, Apr 16, 2013 at 02:32:53PM +0800, Xiao Guangrong wrote:
>> Use kvm_mmu_invalid_all_pages in kvm_arch_flush_shadow_all and
>> rename kvm_zap_all to kvm_free_all, which is used to free all
>> memory used by the kvm mmu when the vm is being destroyed. At that
>> point no vcpu exists and the mmu notifier has been unregistered, so
>> we can free the shadow pages outside of mmu-lock.
> 
> Since there is no contention for mmu-lock its also not a problem to 
> grab the lock right?

There is still contention: other mmu-notifier callbacks can run while we
handle ->release(). On the other hand, a spinlock is not preemptible.
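For reference, the path I mean is the ->release() callback, which tears down
the shadow pages while notifier methods invoked outside the mm context may
still be running; roughly (paraphrased from virt/kvm/kvm_main.c of this era,
details may differ slightly):

static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
				     struct mm_struct *mm)
{
	struct kvm *kvm = mmu_notifier_to_kvm(mn);
	int idx;

	idx = srcu_read_lock(&kvm->srcu);
	/* tears down shadow pages; other notifier callbacks can still run */
	kvm_arch_flush_shadow_all(kvm);
	srcu_read_unlock(&kvm->srcu, idx);
}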

> 
> Automated verification of locking/srcu might complain.

We hold slots_lock to free the shadow pages outside of mmu-lock; that should
avoid the complaint, no?



Marcelo Tosatti April 20, 2013, 5:18 p.m. UTC | #3
On Thu, Apr 18, 2013 at 12:03:45PM +0800, Xiao Guangrong wrote:
> On 04/18/2013 08:08 AM, Marcelo Tosatti wrote:
> > On Tue, Apr 16, 2013 at 02:32:53PM +0800, Xiao Guangrong wrote:
>>> Use kvm_mmu_invalid_all_pages in kvm_arch_flush_shadow_all and
>>> rename kvm_zap_all to kvm_free_all, which is used to free all
>>> memory used by the kvm mmu when the vm is being destroyed. At that
>>> point no vcpu exists and the mmu notifier has been unregistered, so
>>> we can free the shadow pages outside of mmu-lock.
> > 
> > Since there is no contention for mmu-lock its also not a problem to 
> > grab the lock right?
> 
> There is still contention: other mmu-notifier callbacks can run while we
> handle ->release(). On the other hand, a spinlock is not preemptible.

Don't think so:

        kvm_coalesced_mmio_free(kvm);
#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
        mmu_notifier_unregister(&kvm->mmu_notifier, kvm->mm);
#else
        kvm_arch_flush_shadow_all(kvm);
#endif
        kvm_arch_destroy_vm(kvm);

> > Automated verification of locking/srcu might complain.
> 
> We hold slots_lock to free the shadow pages outside of mmu-lock; that should
> avoid the complaint, no?

Not if it realizes srcu is required to access the data structures.

Xiao Guangrong April 21, 2013, 6:59 a.m. UTC | #4
On 04/21/2013 01:18 AM, Marcelo Tosatti wrote:
> On Thu, Apr 18, 2013 at 12:03:45PM +0800, Xiao Guangrong wrote:
>> On 04/18/2013 08:08 AM, Marcelo Tosatti wrote:
>>> On Tue, Apr 16, 2013 at 02:32:53PM +0800, Xiao Guangrong wrote:
>>>> Use kvm_mmu_invalid_all_pages in kvm_arch_flush_shadow_all and
>>>> rename kvm_zap_all to kvm_free_all, which is used to free all
>>>> memory used by the kvm mmu when the vm is being destroyed. At that
>>>> point no vcpu exists and the mmu notifier has been unregistered, so
>>>> we can free the shadow pages outside of mmu-lock.
>>>
>>> Since there is no contention for mmu-lock its also not a problem to 
>>> grab the lock right?
>>
>> There is still contention: other mmu-notifier callbacks can run while we
>> handle ->release(). On the other hand, a spinlock is not preemptible.
> 
> Don't think so:

Hi Marcelo,

The comment for ->release() says:

/*
	 * Called either by mmu_notifier_unregister or when the mm is
	 * being destroyed by exit_mmap, always before all pages are
	 * freed. This can run concurrently with other mmu notifier
	 * methods (the ones invoked outside the mm context)
> 
>         kvm_coalesced_mmio_free(kvm);
> #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
>         mmu_notifier_unregister(&kvm->mmu_notifier, kvm->mm);
> #else
>         kvm_arch_flush_shadow_all(kvm);
> #endif
>         kvm_arch_destroy_vm(kvm);

The contention does not exist in the code you listed above. It happens when
the vm exits abnormally (for example, when the VM is killed). Please refer to
commit 3ad3d90 (mm: mmu_notifier: fix freed page still mapped in secondary MMU).
The current mmu-notifier code is wrong and I have posted a patch to fix it,
which can be found at:
http://marc.info/?l=kvm&m=136609583232031&w=2

Maybe I misunderstood your meaning. This patch uses kvm_mmu_invalid_all_pages
in ->release and renames kvm_zap_all to kvm_free_all. Do you mean we can still
use mmu-lock in kvm_free_all()? If so, I do not have a strong opinion on this
point and will keep kvm_free_all under the protection of mmu-lock.
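If we kept the lock, kvm_free_all would look roughly like this (just a sketch
of that alternative, not what the posted patch does):

void kvm_mmu_free_all(struct kvm *kvm)
{
	struct kvm_mmu_page *sp, *node;
	LIST_HEAD(invalid_list);

	/* alternative: keep mmu_lock even though no vcpu can run here */
	spin_lock(&kvm->mmu_lock);
restart:
	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
			goto restart;

	kvm_mmu_commit_zap_page(kvm, &invalid_list);
	spin_unlock(&kvm->mmu_lock);
}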

> 
>>> Automated verification of locking/srcu might complain.
>>
>> We hold slots_lock to free the shadow pages outside of mmu-lock; that should
>> avoid the complaint, no?
> 
> Not if it realizes srcu is required to access the data structures.

It seems that kvm->srcu is only used to protect kvm->memslots, as seen in kvm_memslots:

static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm)
{
	return rcu_dereference_check(kvm->memslots,
			srcu_read_lock_held(&kvm->srcu)
			|| lockdep_is_held(&kvm->slots_lock));
}

kvm->memslots can be safely accessed while holding kvm->srcu _or_ slots_lock.
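In other words, either of the two patterns below satisfies that check (an
illustrative sketch only; walk_slots_example is a made-up name, not a function
in the tree):

static void walk_slots_example(struct kvm *kvm)
{
	struct kvm_memslots *slots;
	int idx;

	/* reader side: inside an srcu read-side critical section */
	idx = srcu_read_lock(&kvm->srcu);
	slots = kvm_memslots(kvm);
	/* ... walk slots ... */
	srcu_read_unlock(&kvm->srcu, idx);

	/*
	 * or: from a path that already holds slots_lock, as
	 * kvm_arch_flush_shadow_all does after this patch
	 */
	mutex_lock(&kvm->slots_lock);
	slots = kvm_memslots(kvm);
	/* ... walk slots ... */
	mutex_unlock(&kvm->slots_lock);
}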

Thanks!


Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6f8ee18..a336055 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -771,7 +771,7 @@  void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
 void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask);
-void kvm_mmu_zap_all(struct kvm *kvm);
+void kvm_mmu_free_all(struct kvm *kvm);
 void kvm_arch_init_generation(struct kvm *kvm);
 void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 12129b7..10c43ea 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4639,28 +4639,17 @@  void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 }
 
-void kvm_mmu_zap_all(struct kvm *kvm)
+void kvm_mmu_free_all(struct kvm *kvm)
 {
 	struct kvm_mmu_page *sp, *node;
 	LIST_HEAD(invalid_list);
 
-	might_sleep();
-
-	spin_lock(&kvm->mmu_lock);
 restart:
-	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
+	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
 		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
 			goto restart;
 
-		if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
-			kvm_mmu_commit_zap_page(kvm, &invalid_list);
-			cond_resched_lock(&kvm->mmu_lock);
-			goto restart;
-		}
-	}
-
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
-	spin_unlock(&kvm->mmu_lock);
 }
 
 static void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d3dd0d5..4bb88f5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6840,6 +6840,7 @@  void kvm_arch_sync_events(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	kvm_mmu_free_all(kvm);
 	kvm_iommu_unmap_guest(kvm);
 	kfree(kvm->arch.vpic);
 	kfree(kvm->arch.vioapic);
@@ -7056,11 +7057,9 @@  void kvm_arch_commit_memory_region(struct kvm *kvm,
 
 void kvm_arch_flush_shadow_all(struct kvm *kvm)
 {
-	int idx;
-
-	idx = srcu_read_lock(&kvm->srcu);
-	kvm_mmu_zap_all(kvm);
-	srcu_read_unlock(&kvm->srcu, idx);
+	mutex_lock(&kvm->slots_lock);
+	kvm_mmu_invalid_memslot_pages(kvm, INVALID_ALL_SLOTS);
+	mutex_unlock(&kvm->slots_lock);
 }
 
 void kvm_arch_flush_shadow_memslot(struct kvm *kvm,