Message ID | 20221027200316.2221027-1-dmatlack@google.com |
---|---|
Series | KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is enabled |
On 10/27/22 22:03, David Matlack wrote:
> This series turns off the NX Huge Page recovery worker when any memslot
> has dirty logging enabled. This avoids theoretical performance problems
> and reduces the CPU usage of NX Huge Pages when a VM is in the pre-copy
> phase of a Live Migration.
>
> Tested manually and ran all selftests.
>
> David Matlack (2):
>   KVM: Keep track of the number of memslots with dirty logging enabled
>   KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is
>     enabled
>
>  arch/x86/kvm/mmu/mmu.c   |  8 ++++++++
>  include/linux/kvm_host.h |  2 ++
>  virt/kvm/kvm_main.c      | 10 ++++++++++
>  3 files changed, 20 insertions(+)
>
>
> base-commit: e18d6152ff0f41b7f01f9817372022df04e0d354

This can be a bit problematic because for example you could have dirty
logging enabled only for a framebuffer or similar.  In this case the
memory being logged will not be the same as the one that is NX-split.

Perhaps we can take advantage of eager page splitting, that is you can
add a bool to kvm_mmu_page that is set by shadow_mmu_get_sp_for_split
and tdp_mmu_alloc_sp_for_split (or a similar place)?

Paolo
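For readers following along, the change Paolo is floating would look roughly like the sketch below. The field name and the exact hook points are assumptions for illustration, not actual KVM code:

```c
/* Illustrative sketch only -- the field name and placement are assumptions. */

struct kvm_mmu_page {
	/* ... existing fields elided ... */

	/*
	 * Assumed new field: set when the shadow page is allocated by the
	 * eager page splitting paths (shadow_mmu_get_sp_for_split() or
	 * tdp_mmu_alloc_sp_for_split()), so later consumers can tell that
	 * the split was done for dirty logging rather than for NX huge
	 * pages.
	 */
	bool split_for_dirty_log;
};

/* In the (assumed) eager-split allocation path: */
sp->split_for_dirty_log = true;

/* A consumer such as the NX recovery worker could then skip such pages: */
if (sp->split_for_dirty_log)
	continue;
```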
On Fri, Oct 28, 2022 at 3:58 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 10/27/22 22:03, David Matlack wrote:
> > This series turns off the NX Huge Page recovery worker when any memslot
> > has dirty logging enabled. This avoids theoretical performance problems
> > and reduces the CPU usage of NX Huge Pages when a VM is in the pre-copy
> > phase of a Live Migration.
> >
> > Tested manually and ran all selftests.
> >
> > David Matlack (2):
> >   KVM: Keep track of the number of memslots with dirty logging enabled
> >   KVM: x86/mmu: Do not recover NX Huge Pages when dirty logging is
> >     enabled
> >
> >  arch/x86/kvm/mmu/mmu.c   |  8 ++++++++
> >  include/linux/kvm_host.h |  2 ++
> >  virt/kvm/kvm_main.c      | 10 ++++++++++
> >  3 files changed, 20 insertions(+)
> >
> >
> > base-commit: e18d6152ff0f41b7f01f9817372022df04e0d354
>
> This can be a bit problematic because for example you could have dirty
> logging enabled only for a framebuffer or similar.  In this case the
> memory being logged will not be the same as the one that is NX-split.

Ah, thanks for pointing that out. It's good to know there are other
use-cases for dirty logging outside of live migration. At Google we
tend to hyper focus on LM.

>
> Perhaps we can take advantage of eager page splitting, that is you can
> add a bool to kvm_mmu_page that is set by shadow_mmu_get_sp_for_split
> and tdp_mmu_alloc_sp_for_split (or a similar place)?

I don't think that would help since eagerly-split pages aren't on the
list of possible NX Huge Pages. i.e. the recovery worker won't try to
zap them. The main thing I want to avoid is the recovery worker zapping
ranges that were actually split for NX Huge Pages while they are being
dirty tracked.

I'll experiment with a more accurate solution. i.e. have the recovery
worker lookup the memslot for each SP and check if it has dirty
logging enabled. Maybe the increase in CPU usage won't be as bad as I
thought.

>
> Paolo
>
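A minimal sketch of the per-SP check David describes, as it might sit inside the recovery worker's loop; gfn_to_memslot() and kvm_slot_dirty_track_enabled() are existing KVM helpers, but how they get wired into the worker here is an assumption:

```c
/*
 * Illustrative sketch -- the surrounding loop and locking are assumed.
 * The memslot lookup needs kvm->srcu, which the existing recovery
 * worker appears to hold already.
 */
struct kvm_memory_slot *slot = gfn_to_memslot(kvm, sp->gfn);

/* Skip SPs whose gfn is in a dirty-logged memslot for now. */
if (slot && kvm_slot_dirty_track_enabled(slot))
	continue;
```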
On Fri, Oct 28, 2022, David Matlack wrote:
> I'll experiment with a more accurate solution. i.e. have the recovery
> worker lookup the memslot for each SP and check if it has dirty
> logging enabled. Maybe the increase in CPU usage won't be as bad as I
> thought.

If you end up grabbing the memslot, use kvm_mmu_max_mapping_level() instead of
checking only dirty logging. That way KVM will avoid zapping shadow pages that
could have been NX huge pages when they were created, but can no longer be NX huge
pages due to something other than dirty logging, e.g. because the gfn is being
shadowed for nested TDP.
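Roughly, Sean's suggestion amounts to asking whether a huge page is even possible at the SP's gfn before zapping it. The call below is only an approximation of that idea; kvm_mmu_max_mapping_level()'s signature has shifted across kernel versions, so don't read it as a drop-in snippet:

```c
/*
 * Approximation only -- treat the signature and the surrounding code as
 * assumptions, not as the actual recovery-worker change.
 */
slot = gfn_to_memslot(kvm, sp->gfn);
if (slot &&
    kvm_mmu_max_mapping_level(kvm, slot, sp->gfn, PG_LEVEL_NUM) == PG_LEVEL_4K)
	continue;	/* a huge page is impossible here; zapping gains nothing */
```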
On Fri, Oct 28, 2022 at 2:07 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Oct 28, 2022, David Matlack wrote:
> > I'll experiment with a more accurate solution. i.e. have the recovery
> > worker lookup the memslot for each SP and check if it has dirty
> > logging enabled. Maybe the increase in CPU usage won't be as bad as I
> > thought.
>
> If you end up grabbing the memslot, use kvm_mmu_max_mapping_level() instead of
> checking only dirty logging. That way KVM will avoid zapping shadow pages that
> could have been NX huge pages when they were created, but can no longer be NX huge
> pages due to something other than dirty logging, e.g. because the gfn is being
> shadowed for nested TDP.

kvm_mmu_max_mapping_level() doesn't check if dirty logging is enabled
and does the unnecessary work of checking the host mapping level (which
requires knowing the pfn). I could refactor kvm_mmu_hugepage_adjust()
and kvm_mmu_max_mapping_level(), though, to achieve what you suggest.
Specifically, when recovering NX Huge Pages, check if dirty logging is
enabled and if a huge page is disallowed (lpage_info_slot()), and share
that code with the fault handler.
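One way to picture the refactor David is proposing is a small helper shared by the fault path and the recovery worker, along the lines below. The helper name is hypothetical; lpage_info_slot() and kvm_slot_dirty_track_enabled() are real helpers, but combining them like this is only a sketch of the idea:

```c
/*
 * Hypothetical shared helper -- not actual KVM code.  Both the fault
 * path (kvm_mmu_hugepage_adjust()) and the NX recovery worker could ask
 * the same question: is a huge page at this gfn currently disallowed by
 * memslot state?
 */
static bool gfn_huge_page_disallowed(const struct kvm_memory_slot *slot,
				     gfn_t gfn, int level)
{
	/* Dirty logging forces 4K mappings, so a huge page is disallowed. */
	if (kvm_slot_dirty_track_enabled(slot))
		return true;

	/*
	 * Covers the other reasons Sean mentioned, e.g. the gfn being
	 * shadowed for nested TDP, which bump disallow_lpage.
	 */
	return !!lpage_info_slot(gfn, slot, level)->disallow_lpage;
}
```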
On Fri, Oct 28, 2022, David Matlack wrote:
> On Fri, Oct 28, 2022 at 2:07 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Oct 28, 2022, David Matlack wrote:
> > > I'll experiment with a more accurate solution. i.e. have the recovery
> > > worker lookup the memslot for each SP and check if it has dirty
> > > logging enabled. Maybe the increase in CPU usage won't be as bad as I
> > > thought.
> >
> > If you end up grabbing the memslot, use kvm_mmu_max_mapping_level() instead of
> > checking only dirty logging. That way KVM will avoid zapping shadow pages that
> > could have been NX huge pages when they were created, but can no longer be NX huge
> > pages due to something other than dirty logging, e.g. because the gfn is being
> > shadowed for nested TDP.
>
> kvm_mmu_max_mapping_level() doesn't check if dirty logging is enabled

Gah, I forgot that kvm_mmu_hugepage_adjust() does a one-off check for
dirty logging instead of the info being fed into slot->arch.lpage_info.
Never mind.