Message ID | 20181212041617.GC22265@blackberry (mailing list archive)
---|---
State | New, archived
Series | KVM: PPC: Book3S HV: Improve live migration of radix guests
On Wed, 2018-12-12 at 15:16 +1100, Paul Mackerras wrote:
> For radix guests, this makes KVM map guest memory as individual pages
> when dirty page logging is enabled for the memslot corresponding to the
> guest real address. Having a separate partition-scoped PTE for each
> system page mapped to the guest means that we have a separate dirty
> bit for each page, thus making the reported dirty bitmap more accurate.
> Without this, if part of guest memory is backed by transparent huge
> pages, the dirty status is reported at a 2MB granularity rather than
> a 64kB (or 4kB) granularity for that part, causing userspace to have
> to transmit more data when migrating the guest.

Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

>
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
> ---
>  arch/powerpc/kvm/book3s_64_mmu_radix.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index d68162e..87ad35e 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -683,6 +683,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
>  	pte_t pte, *ptep;
>  	unsigned int shift, level;
>  	int ret;
> +	bool large_enable;
>
>  	/* used to check for invalidations in progress */
>  	mmu_seq = kvm->mmu_notifier_seq;
> @@ -732,12 +733,15 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
>  	pte = *ptep;
>  	local_irq_enable();
>
> +	/* If we're logging dirty pages, always map single pages */
> +	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
> +
>  	/* Get pte level from shift/size */
> -	if (shift == PUD_SHIFT &&
> +	if (large_enable && shift == PUD_SHIFT &&
>  	    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
>  	    (hva & (PUD_SIZE - PAGE_SIZE))) {
>  		level = 2;
> -	} else if (shift == PMD_SHIFT &&
> +	} else if (large_enable && shift == PMD_SHIFT &&
>  	    (gpa & (PMD_SIZE - PAGE_SIZE)) ==
>  	    (hva & (PMD_SIZE - PAGE_SIZE))) {
>  		level = 1;
On Wed, Dec 12, 2018 at 03:16:17PM +1100, Paul Mackerras wrote:
> For radix guests, this makes KVM map guest memory as individual pages
> when dirty page logging is enabled for the memslot corresponding to the
> guest real address. Having a separate partition-scoped PTE for each
> system page mapped to the guest means that we have a separate dirty
> bit for each page, thus making the reported dirty bitmap more accurate.
> Without this, if part of guest memory is backed by transparent huge
> pages, the dirty status is reported at a 2MB granularity rather than
> a 64kB (or 4kB) granularity for that part, causing userspace to have
> to transmit more data when migrating the guest.
>
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  arch/powerpc/kvm/book3s_64_mmu_radix.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index d68162e..87ad35e 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -683,6 +683,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
>  	pte_t pte, *ptep;
>  	unsigned int shift, level;
>  	int ret;
> +	bool large_enable;
>
>  	/* used to check for invalidations in progress */
>  	mmu_seq = kvm->mmu_notifier_seq;
> @@ -732,12 +733,15 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
>  	pte = *ptep;
>  	local_irq_enable();
>
> +	/* If we're logging dirty pages, always map single pages */
> +	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
> +
>  	/* Get pte level from shift/size */
> -	if (shift == PUD_SHIFT &&
> +	if (large_enable && shift == PUD_SHIFT &&
>  	    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
>  	    (hva & (PUD_SIZE - PAGE_SIZE))) {
>  		level = 2;
> -	} else if (shift == PMD_SHIFT &&
> +	} else if (large_enable && shift == PMD_SHIFT &&
>  	    (gpa & (PMD_SIZE - PAGE_SIZE)) ==
>  	    (hva & (PMD_SIZE - PAGE_SIZE))) {
>  		level = 1;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index d68162e..87ad35e 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -683,6 +683,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	pte_t pte, *ptep;
 	unsigned int shift, level;
 	int ret;
+	bool large_enable;
 
 	/* used to check for invalidations in progress */
 	mmu_seq = kvm->mmu_notifier_seq;
@@ -732,12 +733,15 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	pte = *ptep;
 	local_irq_enable();
 
+	/* If we're logging dirty pages, always map single pages */
+	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
+
 	/* Get pte level from shift/size */
-	if (shift == PUD_SHIFT &&
+	if (large_enable && shift == PUD_SHIFT &&
 	    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
 	    (hva & (PUD_SIZE - PAGE_SIZE))) {
 		level = 2;
-	} else if (shift == PMD_SHIFT &&
+	} else if (large_enable && shift == PMD_SHIFT &&
 	    (gpa & (PMD_SIZE - PAGE_SIZE)) ==
 	    (hva & (PMD_SIZE - PAGE_SIZE))) {
 		level = 1;
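The functional change is small: the new large_enable flag, derived from the memslot's KVM_MEM_LOG_DIRTY_PAGES flag, vetoes the 1GB (PUD) and 2MB (PMD) mapping levels so that every fault is mapped at base page size while dirty logging is active. Below is a minimal, out-of-tree sketch of that decision, assuming the radix shift values (PMD maps 2MB, PUD maps 1GB with a 64kB base page); choose_level() and the *_RADIX constants are illustrative names, not kernel APIs, and the gpa/hva alignment checks from the real code are omitted.

#include <stdbool.h>
#include <stdio.h>

#define PMD_SHIFT_RADIX  21   /* 2MB huge page (PMD level) */
#define PUD_SHIFT_RADIX  30   /* 1GB huge page (PUD level) */

/* Return the mapping level: 2 = 1GB, 1 = 2MB, 0 = base page. */
static int choose_level(unsigned int host_shift, bool dirty_logging)
{
        /* Mirrors the patch: large mappings are allowed only when the
         * memslot is not being dirty-logged. */
        bool large_enable = !dirty_logging;

        if (large_enable && host_shift == PUD_SHIFT_RADIX)
                return 2;
        if (large_enable && host_shift == PMD_SHIFT_RADIX)
                return 1;
        return 0;
}

int main(void)
{
        /* A 2MB-backed fault maps large normally, but drops to base
         * pages once dirty page logging is turned on. */
        printf("2MB backing, no logging: level %d\n",
               choose_level(PMD_SHIFT_RADIX, false));   /* prints 1 */
        printf("2MB backing, logging:    level %d\n",
               choose_level(PMD_SHIFT_RADIX, true));    /* prints 0 */
        return 0;
}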
For radix guests, this makes KVM map guest memory as individual pages
when dirty page logging is enabled for the memslot corresponding to the
guest real address. Having a separate partition-scoped PTE for each
system page mapped to the guest means that we have a separate dirty
bit for each page, thus making the reported dirty bitmap more accurate.
Without this, if part of guest memory is backed by transparent huge
pages, the dirty status is reported at a 2MB granularity rather than
a 64kB (or 4kB) granularity for that part, causing userspace to have
to transmit more data when migrating the guest.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
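To make the granularity argument in the commit message concrete (this worked example is not part of the patch): with a 2MB huge-page mapping and a 64kB base page size, one dirty bit covers 32 base pages, so a single dirty 64kB page forces userspace to resend the whole 2MB region. A self-contained sketch of that arithmetic, assuming those sizes:

#include <stdio.h>

int main(void)
{
        const unsigned long huge_sz = 2UL << 20;   /* 2MB THP-backed mapping */
        const unsigned long page_sz = 64UL << 10;  /* 64kB base page */
        const unsigned long pages_per_huge = huge_sz / page_sz;  /* 32 */

        /* One dirty 64kB page, reported at 2MB granularity, means the
         * whole huge page must be retransmitted during migration. */
        printf("pages resent per dirty page: %lu (%lux overhead)\n",
               pages_per_huge, pages_per_huge);
        return 0;
}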