Message ID | 20240730053215.33768-1-flyingpeng@tencent.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86/mmu: Conditionally call kvm_zap_obsolete_pages |
On 7/30/24 07:32, flyingpenghao@gmail.com wrote:
>
> When tdp_mmu is enabled, invalid root calls kvm_tdp_mmu_zap_invalidated_roots
> to implement it, and kvm_zap_obsolete_pages is not used.
>
> Signed-off-by: Peng Hao <flyingpeng@tencent.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 901be9e420a4..e91586c2ef87 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
>  	 */
>  	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
>
> -	kvm_zap_obsolete_pages(kvm);
> +	if (!tdp_mmu_enabled)
> +		kvm_zap_obsolete_pages(kvm);

Can't you have obsolete pages from the shadow MMU that's used for nested
(nGPA->HPA) virtualization?

Paolo
On Tue, Jul 30, 2024, Paolo Bonzini wrote:
> On 7/30/24 07:32, flyingpenghao@gmail.com wrote:
> >
> > When tdp_mmu is enabled, invalid root calls kvm_tdp_mmu_zap_invalidated_roots
> > to implement it, and kvm_zap_obsolete_pages is not used.
> >
> > Signed-off-by: Peng Hao <flyingpeng@tencent.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 901be9e420a4..e91586c2ef87 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
> >  	 */
> >  	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
> >
> > -	kvm_zap_obsolete_pages(kvm);
> > +	if (!tdp_mmu_enabled)
> > +		kvm_zap_obsolete_pages(kvm);
>
> Can't you have obsolete pages from the shadow MMU that's used for nested
> (nGPA->HPA) virtualization?

Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
don't see any point in doing so, as the existing code should be blazing fast
relative to the total cost of the zap.
On Wed, Jul 31, 2024 at 4:27 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Jul 30, 2024, Paolo Bonzini wrote:
> > On 7/30/24 07:32, flyingpenghao@gmail.com wrote:
> > >
> > > When tdp_mmu is enabled, invalid root calls kvm_tdp_mmu_zap_invalidated_roots
> > > to implement it, and kvm_zap_obsolete_pages is not used.
> > >
> > > Signed-off-by: Peng Hao <flyingpeng@tencent.com>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 901be9e420a4..e91586c2ef87 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
> > >  	 */
> > >  	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
> > >
> > > -	kvm_zap_obsolete_pages(kvm);
> > > +	if (!tdp_mmu_enabled)
> > > +		kvm_zap_obsolete_pages(kvm);
> >
> > Can't you have obsolete pages from the shadow MMU that's used for nested
> > (nGPA->HPA) virtualization?
>
> Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
> pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
> don't see any point in doing so, as the existing code should be blazing fast
> relative to the total cost of the zap.

Here can be optimized by judging whether active_mmu_pages is empty,
just like kvm_zap_obsolete_pages.
Regardless of L0 kvm or L1 kvm, when tdp_mmu is enabled, the
active_mmu_pages list will not be used.
When ept=0, the probability that active_mmu_pages is empty is also
high, not every time kvm_zap_obsolete_pages is called.

Thanks.
On 7/31/24 11:09, Hao Peng wrote:
>> Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
>> pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
>> don't see any point in doing so, as the existing code should be blazing fast
>> relative to the total cost of the zap.
> Here can be optimized by judging whether active_mmu_pages is empty,
> just like kvm_zap_obsolete_pages.
> Regardless of L0 kvm or L1 kvm, when tdp_mmu is enabled, the
> active_mmu_pages list will not be used.
> When ept=0, the probability that active_mmu_pages is empty is also
> high, not every time
> kvm_zap_obsolete_pages is called.

So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
before the loop of kvm_zap_obsolete_pages(), similar to what is done in
kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
benefit, though.

Paolo
On Wed, Jul 31, 2024 at 6:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 7/31/24 11:09, Hao Peng wrote:
> >> Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
> >> pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
> >> don't see any point in doing so, as the existing code should be blazing fast
> >> relative to the total cost of the zap.
> > Here can be optimized by judging whether active_mmu_pages is empty,
> > just like kvm_zap_obsolete_pages.
> > Regardless of L0 kvm or L1 kvm, when tdp_mmu is enabled, the
> > active_mmu_pages list will not be used.
> > When ept=0, the probability that active_mmu_pages is empty is also
> > high, not every time
> > kvm_zap_obsolete_pages is called.
>
> So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> benefit, though.
>
> Paolo
>

I did some tests, when ept=0, kvm_zap_obsolete_pages was called 42
times, and only 17 times
active_mmu_page list was not empty. When tdp_mmu was enabled,
active_mmu_page list
was always empty.
On Wed, Jul 31, 2024 at 1:19 PM Hao Peng <flyingpenghao@gmail.com> wrote:
> > So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> > before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> > kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> > benefit, though.
>
> I did some tests, when ept=0, kvm_zap_obsolete_pages was called 42
> times, and only 17 times
> active_mmu_page list was not empty. When tdp_mmu was enabled,
> active_mmu_page list
> was always empty.

Did you also test with nested virtual machines running?

In any case, we're talking of a difference of about 100 instructions
at most, so it's irrelevant.

Paolo
On Wed, Jul 31, 2024, Paolo Bonzini wrote:
> On Wed, Jul 31, 2024 at 1:19 PM Hao Peng <flyingpenghao@gmail.com> wrote:
> > > So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> > > before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> > > kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> > > benefit, though.
> >
> > I did some tests, when ept=0, kvm_zap_obsolete_pages was called 42
> > times, and only 17 times
> > active_mmu_page list was not empty. When tdp_mmu was enabled,
> > active_mmu_page list
> > was always empty.
>
> Did you also test with nested virtual machines running?
>
> In any case, we're talking of a difference of about 100 instructions
> at most, so it's irrelevant.

It's not even remotely close to 100 instructions.  It's not even 10 instructions.
It's 3 instructions, and maybe two uops?

Modern compilers are smart enough to optimize usage of kvm_mmu_commit_zap_page()
so that the caller inlines the list_empty(invalid_list) check, but the guts of
the zap code are non-inlined.

So, as is, the generated code is:

   0x00000000000599a7 <+55>:    mov    0x8d40(%r12),%rbp
   0x00000000000599af <+63>:    cmp    %rbp,%r15
   0x00000000000599b2 <+66>:    mov    0x8(%rbp),%rbx
   0x00000000000599b6 <+70>:    je     0x599d6 <kvm_zap_obsolete_pages+102>

   0x00000000000599d6 <+102>:   mov    0x8d48(%r12),%rax
   0x00000000000599de <+110>:   cmp    %r14,%rax
   0x00000000000599e1 <+113>:   je     0x59a5f <kvm_zap_obsolete_pages+239>

   0x0000000000059a5f <+239>:   mov    0x8(%rsp),%rax
   0x0000000000059a64 <+244>:   sub    %gs:0x28,%rax
   0x0000000000059a6d <+253>:   jne    0x59a86 <kvm_zap_obsolete_pages+278>
   0x0000000000059a6f <+255>:   add    $0x10,%rsp
   0x0000000000059a73 <+259>:   pop    %rbx
   0x0000000000059a74 <+260>:   pop    %rbp
   0x0000000000059a75 <+261>:   pop    %r12
   0x0000000000059a77 <+263>:   pop    %r13
   0x0000000000059a79 <+265>:   pop    %r14
   0x0000000000059a7b <+267>:   pop    %r15
   0x0000000000059a7d <+269>:   ret

and adding an extra list_empty(kvm->arch.active_mmu_pages) generates:

   0x000000000005999a <+42>:    mov    0x8d38(%rdi),%rax
   0x00000000000599a1 <+49>:    cmp    %rax,%r15
   0x00000000000599a4 <+52>:    je     0x59a6f <kvm_zap_obsolete_pages+255>

   0x0000000000059a6f <+255>:   mov    0x8(%rsp),%rax
   0x0000000000059a74 <+260>:   sub    %gs:0x28,%rax
   0x0000000000059a7d <+269>:   jne    0x59a96 <kvm_zap_obsolete_pages+294>
   0x0000000000059a7f <+271>:   add    $0x10,%rsp
   0x0000000000059a83 <+275>:   pop    %rbx
   0x0000000000059a84 <+276>:   pop    %rbp
   0x0000000000059a85 <+277>:   pop    %r12
   0x0000000000059a87 <+279>:   pop    %r13
   0x0000000000059a89 <+281>:   pop    %r14
   0x0000000000059a8b <+283>:   pop    %r15
   0x0000000000059a8d <+285>:   ret

i.e. it elides the list_empty(invalid_list) check, that's it.
On Wed, Jul 31, 2024 at 5:21 PM Sean Christopherson <seanjc@google.com> wrote:
> It's not even remotely close to 100 instructions.  It's not even 10 instructions.
> It's 3 instructions, and maybe two uops?

Well yeah, I meant 100 instructions over the whole execution of the VM...

Paolo

> Modern compilers are smart enough to optimize usage of kvm_mmu_commit_zap_page()
> so that the caller inlines the list_empty(invalid_list) check, but the guts of
> the zap code are non-inlined.
>
> So, as is, the generated code is:
>
>    0x00000000000599a7 <+55>:    mov    0x8d40(%r12),%rbp
>    0x00000000000599af <+63>:    cmp    %rbp,%r15
>    0x00000000000599b2 <+66>:    mov    0x8(%rbp),%rbx
>    0x00000000000599b6 <+70>:    je     0x599d6 <kvm_zap_obsolete_pages+102>
>
>    0x00000000000599d6 <+102>:   mov    0x8d48(%r12),%rax
>    0x00000000000599de <+110>:   cmp    %r14,%rax
>    0x00000000000599e1 <+113>:   je     0x59a5f <kvm_zap_obsolete_pages+239>
>
>    0x0000000000059a5f <+239>:   mov    0x8(%rsp),%rax
>    0x0000000000059a64 <+244>:   sub    %gs:0x28,%rax
>    0x0000000000059a6d <+253>:   jne    0x59a86 <kvm_zap_obsolete_pages+278>
>    0x0000000000059a6f <+255>:   add    $0x10,%rsp
>    0x0000000000059a73 <+259>:   pop    %rbx
>    0x0000000000059a74 <+260>:   pop    %rbp
>    0x0000000000059a75 <+261>:   pop    %r12
>    0x0000000000059a77 <+263>:   pop    %r13
>    0x0000000000059a79 <+265>:   pop    %r14
>    0x0000000000059a7b <+267>:   pop    %r15
>    0x0000000000059a7d <+269>:   ret
>
> and adding an extra list_empty(kvm->arch.active_mmu_pages) generates:
>
>    0x000000000005999a <+42>:    mov    0x8d38(%rdi),%rax
>    0x00000000000599a1 <+49>:    cmp    %rax,%r15
>    0x00000000000599a4 <+52>:    je     0x59a6f <kvm_zap_obsolete_pages+255>
>
>    0x0000000000059a6f <+255>:   mov    0x8(%rsp),%rax
>    0x0000000000059a74 <+260>:   sub    %gs:0x28,%rax
>    0x0000000000059a7d <+269>:   jne    0x59a96 <kvm_zap_obsolete_pages+294>
>    0x0000000000059a7f <+271>:   add    $0x10,%rsp
>    0x0000000000059a83 <+275>:   pop    %rbx
>    0x0000000000059a84 <+276>:   pop    %rbp
>    0x0000000000059a85 <+277>:   pop    %r12
>    0x0000000000059a87 <+279>:   pop    %r13
>    0x0000000000059a89 <+281>:   pop    %r14
>    0x0000000000059a8b <+283>:   pop    %r15
>    0x0000000000059a8d <+285>:   ret
>
> i.e. it elides the list_empty(invalid_list) check, that's it.
>
On Wed, Jul 31, 2024 at 11:00 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On Wed, Jul 31, 2024 at 1:19 PM Hao Peng <flyingpenghao@gmail.com> wrote:
> > > So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> > > before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> > > kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> > > benefit, though.
> >
> > I did some tests, when ept=0, kvm_zap_obsolete_pages was called 42
> > times, and only 17 times
> > active_mmu_page list was not empty. When tdp_mmu was enabled,
> > active_mmu_page list
> > was always empty.
>
> Did you also test with nested virtual machines running?
>

Yes, with similar results.

> In any case, we're talking of a difference of about 100 instructions
> at most, so it's irrelevant.
>
> Paolo
>
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 901be9e420a4..e91586c2ef87 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	 */
 	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
 
-	kvm_zap_obsolete_pages(kvm);
+	if (!tdp_mmu_enabled)
+		kvm_zap_obsolete_pages(kvm);
 
 	write_unlock(&kvm->mmu_lock);