Message ID | 20221027200316.2221027-1-dmatlack@google.com |
---|---|
Series | KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is enabled |
On 10/27/22 22:03, David Matlack wrote:
> This series turns off the NX Huge Page recovery worker when any memslot
> has dirty logging enabled. This avoids theoretical performance problems
> and reduces the CPU usage of NX Huge Pages when a VM is in the pre-copy
> phase of a Live Migration.
>
> Tested manually and ran all selftests.
>
> David Matlack (2):
>   KVM: Keep track of the number of memslots with dirty logging enabled
>   KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is
>     enabled
>
>  arch/x86/kvm/mmu/mmu.c   |  8 ++++++++
>  include/linux/kvm_host.h |  2 ++
>  virt/kvm/kvm_main.c      | 10 ++++++++++
>  3 files changed, 20 insertions(+)
>
>
> base-commit: e18d6152ff0f41b7f01f9817372022df04e0d354

This can be a bit problematic because for example you could have dirty
logging enabled only for a framebuffer or similar.  In this case the
memory being logged will not be the same as the one that is NX-split.

Perhaps we can take advantage of eager page splitting, that is you can
add a bool to kvm_mmu_page that is set by shadow_mmu_get_sp_for_split
and tdp_mmu_alloc_sp_for_split (or a similar place)?

Paolo
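For readers following along, the change Paolo is floating would look roughly like the sketch below. The field name and the exact hook points are assumptions for illustration, not actual KVM code:

```c
/* Illustrative sketch only -- the field name and placement are assumptions. */

struct kvm_mmu_page {
	/* ... existing fields elided ... */

	/*
	 * Assumed new field: set when the shadow page is allocated by the
	 * eager page splitting paths (shadow_mmu_get_sp_for_split() or
	 * tdp_mmu_alloc_sp_for_split()), so later consumers can tell that
	 * the split was done for dirty logging rather than for NX huge
	 * pages.
	 */
	bool split_for_dirty_log;
};

/* In the (assumed) eager-split allocation path: */
sp->split_for_dirty_log = true;

/* A consumer such as the NX recovery worker could then skip such pages: */
if (sp->split_for_dirty_log)
	continue;
```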
On Fri, Oct 28, 2022 at 3:58 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 10/27/22 22:03, David Matlack wrote:
> > This series turns off the NX Huge Page recovery worker when any memslot
> > has dirty logging enabled. This avoids theoretical performance problems
> > and reduces the CPU usage of NX Huge Pages when a VM is in the pre-copy
> > phase of a Live Migration.
> >
> > Tested manually and ran all selftests.
> >
> > David Matlack (2):
> >   KVM: Keep track of the number of memslots with dirty logging enabled
> >   KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is
> >     enabled
> >
> >  arch/x86/kvm/mmu/mmu.c   |  8 ++++++++
> >  include/linux/kvm_host.h |  2 ++
> >  virt/kvm/kvm_main.c      | 10 ++++++++++
> >  3 files changed, 20 insertions(+)
> >
> >
> > base-commit: e18d6152ff0f41b7f01f9817372022df04e0d354
>
> This can be a bit problematic because for example you could have dirty
> logging enabled only for a framebuffer or similar.  In this case the
> memory being logged will not be the same as the one that is NX-split.

Ah, thanks for pointing that out. It's good to know there are other
use-cases for dirty logging outside of live migration. At Google we
tend to hyper focus on LM.

>
> Perhaps we can take advantage of eager page splitting, that is you can
> add a bool to kvm_mmu_page that is set by shadow_mmu_get_sp_for_split
> and tdp_mmu_alloc_sp_for_split (or a similar place)?

I don't think that would help since eagerly-split pages aren't on the
list of possible NX Huge Pages. i.e. the recovery worker won't try to
zap them. The main thing I want to avoid is the recovery worker zapping
ranges that were actually split for NX Huge Pages while they are being
dirty tracked.

I'll experiment with a more accurate solution. i.e. have the recovery
worker lookup the memslot for each SP and check if it has dirty
logging enabled. Maybe the increase in CPU usage won't be as bad as I
thought.

>
> Paolo
>
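A minimal sketch of the per-SP check David describes, as it might sit inside the recovery worker's loop; gfn_to_memslot() and kvm_slot_dirty_track_enabled() are existing KVM helpers, but how they get wired into the worker here is an assumption:

```c
/*
 * Illustrative sketch -- the surrounding loop and locking are assumed.
 * The memslot lookup needs kvm->srcu, which the existing recovery
 * worker appears to hold already.
 */
struct kvm_memory_slot *slot = gfn_to_memslot(kvm, sp->gfn);

/* Skip SPs whose gfn is in a dirty-logged memslot for now. */
if (slot && kvm_slot_dirty_track_enabled(slot))
	continue;
```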
On Fri, Oct 28, 2022, David Matlack wrote:
> I'll experiment with a more accurate solution. i.e. have the recovery
> worker lookup the memslot for each SP and check if it has dirty
> logging enabled. Maybe the increase in CPU usage won't be as bad as I
> thought.

If you end up grabbing the memslot, use kvm_mmu_max_mapping_level() instead of
checking only dirty logging. That way KVM will avoid zapping shadow pages that
could have been NX huge pages when they were created, but can no longer be NX huge
pages due to something other than dirty logging, e.g. because the gfn is being
shadowed for nested TDP.
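Roughly, Sean's suggestion amounts to asking whether a huge page is even possible at the SP's gfn before zapping it. The call below is only an approximation of that idea; kvm_mmu_max_mapping_level()'s signature has shifted across kernel versions, so don't read it as a drop-in snippet:

```c
/*
 * Approximation only -- treat the signature and the surrounding code as
 * assumptions, not as the actual recovery-worker change.
 */
slot = gfn_to_memslot(kvm, sp->gfn);
if (slot &&
    kvm_mmu_max_mapping_level(kvm, slot, sp->gfn, PG_LEVEL_NUM) == PG_LEVEL_4K)
	continue;	/* a huge page is impossible here; zapping gains nothing */
```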
On Fri, Oct 28, 2022 at 2:07 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Oct 28, 2022, David Matlack wrote:
> > I'll experiment with a more accurate solution. i.e. have the recovery
> > worker lookup the memslot for each SP and check if it has dirty
> > logging enabled. Maybe the increase in CPU usage won't be as bad as I
> > thought.
>
> If you end up grabbing the memslot, use kvm_mmu_max_mapping_level() instead of
> checking only dirty logging. That way KVM will avoid zapping shadow pages that
> could have been NX huge pages when they were created, but can no longer be NX huge
> pages due to something other than dirty logging, e.g. because the gfn is being
> shadowed for nested TDP.

kvm_mmu_max_mapping_level() doesn't check if dirty logging is enabled
and does the unnecessary work of checking the host mapping level (which
requires knowing the pfn). I could refactor kvm_mmu_hugepage_adjust()
and kvm_mmu_max_mapping_level(), though, to achieve what you suggest.
Specifically, when recovering NX Huge Pages, check if dirty logging is
enabled and if a huge page is disallowed (lpage_info_slot()), and share
that code with the fault handler.
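One way to picture the refactor David is proposing is a small helper shared by the fault path and the recovery worker, along the lines below. The helper name is hypothetical; lpage_info_slot() and kvm_slot_dirty_track_enabled() are real helpers, but combining them like this is only a sketch of the idea:

```c
/*
 * Hypothetical shared helper -- not actual KVM code.  Both the fault
 * path (kvm_mmu_hugepage_adjust()) and the NX recovery worker could ask
 * the same question: is a huge page at this gfn currently disallowed by
 * memslot state?
 */
static bool gfn_huge_page_disallowed(const struct kvm_memory_slot *slot,
				     gfn_t gfn, int level)
{
	/* Dirty logging forces 4K mappings, so a huge page is disallowed. */
	if (kvm_slot_dirty_track_enabled(slot))
		return true;

	/*
	 * Covers the other reasons Sean mentioned, e.g. the gfn being
	 * shadowed for nested TDP, which bump disallow_lpage.
	 */
	return !!lpage_info_slot(gfn, slot, level)->disallow_lpage;
}
```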
On Fri, Oct 28, 2022, David Matlack wrote:
> On Fri, Oct 28, 2022 at 2:07 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Oct 28, 2022, David Matlack wrote:
> > > I'll experiment with a more accurate solution. i.e. have the recovery
> > > worker lookup the memslot for each SP and check if it has dirty
> > > logging enabled. Maybe the increase in CPU usage won't be as bad as I
> > > thought.
> >
> > If you end up grabbing the memslot, use kvm_mmu_max_mapping_level() instead of
> > checking only dirty logging. That way KVM will avoid zapping shadow pages that
> > could have been NX huge pages when they were created, but can no longer be NX huge
> > pages due to something other than dirty logging, e.g. because the gfn is being
> > shadowed for nested TDP.
>
> kvm_mmu_max_mapping_level() doesn't check if dirty logging is enabled

Gah, I forgot that kvm_mmu_hugepage_adjust() does a one-off check for
dirty logging instead of the info being fed into slot->arch.lpage_info.
Never mind.