
[v3,07/15] KVM: MMU: introduce invalid rmap handlers

Message ID 1366093973-2617-8-git-send-email-xiaoguangrong@linux.vnet.ibm.com (mailing list archive)
State New, archived

Commit Message

Xiao Guangrong April 16, 2013, 6:32 a.m. UTC
An invalid rmap is the rmap of an invalid memslot that is being
deleted; in particular, all rmaps can be treated as invalid when
kvm is being destroyed, since every memslot will be deleted soon.
The MMU should remove all sptes from these rmaps before the
invalid memslot is fully deleted.

The reason we handle invalid rmaps separately is that we want to
unmap them outside of mmu-lock, to achieve scalable performance
on guests with intensive memory use and many vcpus.

This patch makes every operation on an invalid rmap clear the
spte and reset the rmap's entry. A later patch will introduce
the path that unmaps invalid rmaps outside of mmu-lock.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c |   80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 80 insertions(+), 0 deletions(-)
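
The core of the approach is the per-memslot ops table: each slot carries a
pointer to a set of rmap handlers, and invalidating the slot amounts to
swapping that pointer to handlers that only drop sptes. A minimal user-space
C sketch of that pattern follows; the names are illustrative stand-ins, and
only slot->arch.ops and the idea of the handler tables correspond to the
real code in this patch.

	/* Illustrative stand-ins for the kernel's rmap_operations tables. */
	#include <stdio.h>

	struct rmap_ops {
		void (*rmap_remove)(const char *spte);
	};

	static void normal_rmap_remove(const char *spte)
	{
		printf("normal: unlink %s from the rmap chain\n", spte);
	}

	static void invalid_rmap_remove_sketch(const char *spte)
	{
		printf("invalid: clear %s in place, keep the chain memory\n", spte);
	}

	static const struct rmap_ops normal_ops = {
		.rmap_remove = normal_rmap_remove,
	};

	static const struct rmap_ops invalid_ops = {
		.rmap_remove = invalid_rmap_remove_sketch,
	};

	struct memslot {
		const struct rmap_ops *ops;	/* analogous to slot->arch.ops */
	};

	int main(void)
	{
		struct memslot slot = { .ops = &normal_ops };

		slot.ops->rmap_remove("spte A");

		/* Slot is being deleted: one pointer store switches every handler. */
		slot.ops = &invalid_ops;
		slot.ops->rmap_remove("spte B");

		return 0;
	}

The point of the indirection is that rmap callers do not need to check
whether a slot is invalid at every call site; the behaviour changes with a
single pointer assignment when the slot is invalidated.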

Comments

Marcelo Tosatti April 17, 2013, 11:38 p.m. UTC | #1
On Tue, Apr 16, 2013 at 02:32:45PM +0800, Xiao Guangrong wrote:
> An invalid rmap is the rmap of an invalid memslot that is being
> deleted; in particular, all rmaps can be treated as invalid when
> kvm is being destroyed, since every memslot will be deleted soon.
> The MMU should remove all sptes from these rmaps before the
> invalid memslot is fully deleted.
> 
> The reason we handle invalid rmaps separately is that we want to
> unmap them outside of mmu-lock, to achieve scalable performance
> on guests with intensive memory use and many vcpus.

Better try to make the code simpler, and introduce complexity only 
if necessary.

The idea to zap the roots is very elegant and apparently effective. What
are its problems?

Xiao Guangrong April 18, 2013, 3:15 a.m. UTC | #2
On 04/18/2013 07:38 AM, Marcelo Tosatti wrote:
> On Tue, Apr 16, 2013 at 02:32:45PM +0800, Xiao Guangrong wrote:
>> An invalid rmap is the rmap of an invalid memslot that is being
>> deleted; in particular, all rmaps can be treated as invalid when
>> kvm is being destroyed, since every memslot will be deleted soon.
>> The MMU should remove all sptes from these rmaps before the
>> invalid memslot is fully deleted.
>>
>> The reason we handle invalid rmaps separately is that we want to
>> unmap them outside of mmu-lock, to achieve scalable performance
>> on guests with intensive memory use and many vcpus.
> 
> Better try to make the code simpler, and introduce complexity only 
> if necessary.

Marcelo,

This code is necessary to implement "unmap invalid rmap out of mmu-lock";
the reason we need it is that ...

> 
> The idea to zap the roots is very elegant and apparently effective. What
> are its problems?

I mentioned it in 00/15:

* Challenges
Page invalidation is requested when a memslot is moved or deleted, and
also when kvm is being destroyed; these paths call zap_all_pages to delete
all shadow pages (sp) using their rmaps and lpage-info. After zap_all_pages
returns, the rmaps and lpage-info will be freed. So we need a fast way to
delete shadow pages from the rmaps and lpage-info.
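
To make that concrete, the pattern behind pte_list_clear_concurrently can be
sketched in a few lines of user-space C: cleared entries are overwritten with
a sentinel that walkers skip, so the rmap memory itself can stay allocated
until the memslot is freed. The names and the plain store below are
illustrative assumptions; the real code uses PTE_LIST_SPTE_SKIP and operates
on the kernel's pte-list structures.

	#include <stdio.h>
	#include <stdint.h>

	/* Analogous to PTE_LIST_SPTE_SKIP: a marker for a cleared slot. */
	#define SKIP ((uint64_t *)(uintptr_t)1)

	static uint64_t sptes[3] = { 0x101, 0x202, 0x303 };
	static uint64_t *rmap[3] = { &sptes[0], &sptes[1], &sptes[2] };

	static void clear_concurrently(uint64_t **slot)
	{
		/* The kernel would use an atomic exchange; a store suffices here. */
		*slot = SKIP;
	}

	int main(void)
	{
		clear_concurrently(&rmap[1]);

		/* A walker tolerates cleared slots; nothing is unlinked or freed. */
		for (int i = 0; i < 3; i++) {
			if (rmap[i] == SKIP)
				continue;
			printf("spte %d = 0x%llx\n", i, (unsigned long long)*rmap[i]);
		}

		return 0;
	}

Because nothing is unlinked or freed during the clear, a walker racing with
it only ever sees either a valid pointer or the sentinel, which is what lets
the rmap structures outlive zapping until the memslot itself goes away.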



Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 850eab5..2a7a5d0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1606,6 +1606,86 @@  void init_memslot_rmap_ops(struct kvm_memory_slot *slot)
 	slot->arch.ops = &normal_rmap_ops;
 }
 
+static int invalid_rmap_add(struct kvm_vcpu *vcpu, u64 *spte,
+			    unsigned long *pte_list)
+{
+	WARN_ON(1);
+	return 0;
+}
+
+static void invalid_rmap_remove(u64 *spte, unsigned long *rmapp)
+{
+	pte_list_clear_concurrently(spte, rmapp);
+}
+
+static bool invalid_rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
+				       bool pt_protect)
+{
+	WARN_ON(1);
+	return false;
+}
+
+static int __kvm_unmap_invalid_rmapp(unsigned long *rmapp)
+{
+	u64 *sptep;
+	struct rmap_iterator iter;
+
+	for (sptep = rmap_get_first(*rmapp, &iter); sptep;
+	      sptep = rmap_get_next(&iter)) {
+		if (sptep == PTE_LIST_SPTE_SKIP)
+			continue;
+
+		/* Do not call .rmap_remove(). */
+		if (mmu_spte_clear_track_bits(sptep))
+			pte_list_clear_concurrently(sptep, rmapp);
+	}
+
+	return 0;
+}
+
+static int kvm_unmap_invalid_rmapp(struct kvm *kvm, unsigned long *rmapp)
+{
+	return __kvm_unmap_invalid_rmapp(rmapp);
+}
+
+static int invalid_rmap_set_pte(struct kvm *kvm, unsigned long *rmapp,
+				pte_t *ptep)
+{
+	return kvm_unmap_invalid_rmapp(kvm, rmapp);
+}
+
+/*
+ * An invalid rmap is the rmap of an invalid memslot that is being
+ * deleted; in particular, all rmaps can be treated as invalid when
+ * kvm is being destroyed, since every memslot will be deleted soon.
+ * The MMU should remove all sptes from these rmaps before the
+ * invalid memslot is fully deleted.
+ *
+ * VCPUs cannot do address translation on invalid memslots, which
+ * means no sptes can be added to their rmaps and no shadow page
+ * can be created in their memory regions, so rmap_add and
+ * rmap_write_protect on an invalid memslot should never happen.
+ * Any sptes on invalid rmaps are stale and cannot be reused, so
+ * we drop all sptes on any other operation.  Thus all handlers
+ * on invalid rmaps do the same thing - remove and zap sptes on
+ * the rmap.
+ *
+ * KVM uses pte_list_clear_concurrently to clear sptes on invalid
+ * rmaps; it resets the rmap's entries but keeps its memory.  The
+ * rmap is fully destroyed when the invalid memslot is freed.
+ */
+static struct rmap_operations invalid_rmap_ops = {
+	.rmap_add = invalid_rmap_add,
+	.rmap_remove = invalid_rmap_remove,
+
+	.rmap_write_protect = invalid_rmap_write_protect,
+
+	.rmap_set_pte = invalid_rmap_set_pte,
+	.rmap_age = kvm_unmap_invalid_rmapp,
+	.rmap_test_age = kvm_unmap_invalid_rmapp,
+	.rmap_unmap = kvm_unmap_invalid_rmapp
+};
+
 #ifdef MMU_DEBUG
 static int is_empty_shadow_page(u64 *spt)
 {