
KVM: x86/mmu: Conditionally call kvm_zap_obsolete_pages

Message ID 20240730053215.33768-1-flyingpeng@tencent.com (mailing list archive)
State New, archived
Series KVM: x86/mmu: Conditionally call kvm_zap_obsolete_pages

Commit Message

Hao Peng July 30, 2024, 5:32 a.m. UTC
From: Peng Hao <flyingpeng@tencent.com>

When the TDP MMU is enabled, invalidated roots are zapped by
kvm_tdp_mmu_zap_invalidated_roots(), so kvm_zap_obsolete_pages() is not needed.

Signed-off-by: Peng Hao <flyingpeng@tencent.com>
---
 arch/x86/kvm/mmu/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
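The intent of the patch can be sketched in a stand-alone userspace model. The
struct and function names below are illustrative stand-ins, not the actual KVM
code; only the shape of the proposed guard is modeled:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the kernel's struct list_head / list_empty(). */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *l) { l->next = l->prev = l; }
static bool list_empty(const struct list_head *l) { return l->next == l; }

/* Stand-in for struct kvm, carrying only what this sketch needs. */
struct kvm_model {
    bool tdp_mmu_enabled;           /* mirrors the tdp_mmu_enabled knob */
    struct list_head active_mmu_pages;
    int zap_calls;                  /* counts zap_obsolete_pages() calls */
};

/* Stand-in for kvm_zap_obsolete_pages(): walks active_mmu_pages. */
static void zap_obsolete_pages(struct kvm_model *kvm)
{
    kvm->zap_calls++;
    /* ... walk and zap kvm->active_mmu_pages ... */
}

/* Models kvm_mmu_zap_all_fast() with the proposed guard applied. */
static void zap_all_fast(struct kvm_model *kvm)
{
    /* ... request KVM_REQ_MMU_FREE_OBSOLETE_ROOTS on all vCPUs ... */
    if (!kvm->tdp_mmu_enabled)
        zap_obsolete_pages(kvm);
}
```

Note that the guard as written skips the shadow-MMU zap unconditionally when the
TDP MMU is on, which is exactly the behavior the review below questions: the
shadow MMU can still hold pages for nested (nGPA->HPA) virtualization.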

Comments

Paolo Bonzini July 30, 2024, 9:16 a.m. UTC | #1
On 7/30/24 07:32, flyingpenghao@gmail.com wrote:
> 
> When tdp_mmu is enabled, invalid root calls kvm_tdp_mmu_zap_invalidated_roots
> to implement it, and kvm_zap_obsolete_pages is not used.
> 
> Signed-off-by: Peng Hao<flyingpeng@tencent.com>
> ---
>   arch/x86/kvm/mmu/mmu.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 901be9e420a4..e91586c2ef87 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
>   	 */
>   	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
>   
> -	kvm_zap_obsolete_pages(kvm);
> +	if (!tdp_mmu_enabled)
> +		kvm_zap_obsolete_pages(kvm);
>   

Can't you have obsolete pages from the shadow MMU that's used for 
nested (nGPA->HPA) virtualization?

Paolo
Sean Christopherson July 30, 2024, 8:27 p.m. UTC | #2
On Tue, Jul 30, 2024, Paolo Bonzini wrote:
> On 7/30/24 07:32, flyingpenghao@gmail.com wrote:
> > 
> > When tdp_mmu is enabled, invalid root calls kvm_tdp_mmu_zap_invalidated_roots
> > to implement it, and kvm_zap_obsolete_pages is not used.
> > 
> > Signed-off-by: Peng Hao<flyingpeng@tencent.com>
> > ---
> >   arch/x86/kvm/mmu/mmu.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 901be9e420a4..e91586c2ef87 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
> >   	 */
> >   	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
> > -	kvm_zap_obsolete_pages(kvm);
> > +	if (!tdp_mmu_enabled)
> > +		kvm_zap_obsolete_pages(kvm);
> 
> Can't you have obsolete pages from the shadow MMU that's used for nested
> (nGPA->HPA) virtualization?

Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
don't see any point in doing so, as the existing code should be blazing fast
relative to the total cost of the zap.
Hao Peng July 31, 2024, 9:09 a.m. UTC | #3
On Wed, Jul 31, 2024 at 4:27 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Jul 30, 2024, Paolo Bonzini wrote:
> > On 7/30/24 07:32, flyingpenghao@gmail.com wrote:
> > >
> > > When tdp_mmu is enabled, invalid root calls kvm_tdp_mmu_zap_invalidated_roots
> > > to implement it, and kvm_zap_obsolete_pages is not used.
> > >
> > > Signed-off-by: Peng Hao<flyingpeng@tencent.com>
> > > ---
> > >   arch/x86/kvm/mmu/mmu.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 901be9e420a4..e91586c2ef87 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -6447,7 +6447,8 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
> > >      */
> > >     kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
> > > -   kvm_zap_obsolete_pages(kvm);
> > > +   if (!tdp_mmu_enabled)
> > > +           kvm_zap_obsolete_pages(kvm);
> >
> > Can't you have obsolete pages from the shadow MMU that's used for nested
> > (nGPA->HPA) virtualization?
>
> Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
> pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
> don't see any point in doing so, as the existing code should be blazing fast
> relative to the total cost of the zap.
This could be optimized by checking whether active_mmu_pages is empty at the
start of kvm_zap_obsolete_pages(). Whether in L0 or L1 KVM, the
active_mmu_pages list is not used when the TDP MMU is enabled. Even with
ept=0, active_mmu_pages is often empty, so not every call to
kvm_zap_obsolete_pages() actually has work to do.

Thanks.
Paolo Bonzini July 31, 2024, 10 a.m. UTC | #4
On 7/31/24 11:09, Hao Peng wrote:
>> Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
>> pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
>> don't see any point in doing so, as the existing code should be blazing fast
>> relative to the total cost of the zap.
> Here can be optimized by judging whether active_mmu_pages is empty,
> just like kvm_zap_obsolete_pages.
> Regardless of L0 kvm or L1 kvm, when tdp_mmu is enabled, the
> active_mmu_pages list will not be used.
> When ept=0 , the probability that active_mmu_pages is empty is also
> high, not every time
> kvm_zap_obsolete_pages is called.

So if anything you could check list_empty(&kvm->arch.active_mmu_pages) 
before the loop of kvm_zap_obsolete_pages(), similar to what is done in 
kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical 
benefit, though.

Paolo
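Paolo's alternative, an emptiness check before the restart loop rather than a
tdp_mmu_enabled guard in the caller, can be sketched in the same userspace
model (again, the names and the loop body are illustrative stand-ins, not the
real kvm_zap_obsolete_pages(), which restarts its walk and batches
invalidations):

```c
#include <assert.h>
#include <stdbool.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *l) { l->next = l->prev = l; }
static bool list_empty(const struct list_head *l) { return l->next == l; }
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

struct kvm_model {
    struct list_head active_mmu_pages;
    int pages_walked;   /* counts loop iterations, for illustration */
};

/*
 * Models kvm_zap_obsolete_pages() with the early-out described above:
 * bail before entering the walk when there is nothing to zap, similar
 * to the list_empty() check in kvm_mmu_zap_oldest_mmu_pages().
 */
static void zap_obsolete_pages(struct kvm_model *kvm)
{
    if (list_empty(&kvm->active_mmu_pages))
        return;   /* nothing on the list: skip the walk entirely */

    for (struct list_head *p = kvm->active_mmu_pages.next;
         p != &kvm->active_mmu_pages; p = p->next)
        kvm->pages_walked++;   /* ... zap each obsolete page ... */
}
```

This keeps the nested-virtualization case correct: if the shadow MMU has put
pages on active_mmu_pages, they are still walked and zapped; only the
provably-empty case returns early.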
Hao Peng July 31, 2024, 11:19 a.m. UTC | #5
On Wed, Jul 31, 2024 at 6:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 7/31/24 11:09, Hao Peng wrote:
> >> Yep.  And kvm_zap_obsolete_pages() is a relatively cheap nop if there are no
> >> pages on active_mmu_pages.  E.g. we could check kvm_memslots_have_rmaps(), but I
> >> don't see any point in doing so, as the existing code should be blazing fast
> >> relative to the total cost of the zap.
> > Here can be optimized by judging whether active_mmu_pages is empty,
> > just like kvm_zap_obsolete_pages.
> > Regardless of L0 kvm or L1 kvm, when tdp_mmu is enabled, the
> > active_mmu_pages list will not be used.
> > When ept=0 , the probability that active_mmu_pages is empty is also
> > high, not every time
> > kvm_zap_obsolete_pages is called.
>
> So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> benefit, though.
>
> Paolo
>
I did some tests: with ept=0, kvm_zap_obsolete_pages() was called 42 times,
and the active_mmu_pages list was non-empty in only 17 of those calls. With
the TDP MMU enabled, active_mmu_pages was always empty.
Paolo Bonzini July 31, 2024, 3 p.m. UTC | #6
On Wed, Jul 31, 2024 at 1:19 PM Hao Peng <flyingpenghao@gmail.com> wrote:
> > So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> > before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> > kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> > benefit, though.
>
> I did some tests, when ept=0,  kvm_zap_obsolete_pages was called 42
> times, and only 17 times
> active_mmu_page list was not empty. When tdp_mmu was enabled,
> active_mmu_page list
> was always empty.

Did you also test with nested virtual machines running?

In any case, we're talking of a difference of about 100 instructions
at most, so it's irrelevant.

Paolo
Sean Christopherson July 31, 2024, 3:20 p.m. UTC | #7
On Wed, Jul 31, 2024, Paolo Bonzini wrote:
> On Wed, Jul 31, 2024 at 1:19 PM Hao Peng <flyingpenghao@gmail.com> wrote:
> > > So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> > > before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> > > kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> > > benefit, though.
> >
> > I did some tests, when ept=0,  kvm_zap_obsolete_pages was called 42
> > times, and only 17 times
> > active_mmu_page list was not empty. When tdp_mmu was enabled,
> > active_mmu_page list
> > was always empty.
> 
> Did you also test with nested virtual machines running?
> 
> In any case, we're talking of a difference of about 100 instructions
> at most, so it's irrelevant.

It's not even remotely close to 100 instructions.  It's not even 10 instructions.
It's 3 instructions, and maybe two uops?

Modern compilers are smart enough to optimize usage of kvm_mmu_commit_zap_page()
so that the caller inlines the list_empty(invalid_list) check, but the guts of
the zap code are non-inlined.

So, as is, the generated code is:

   0x00000000000599a7 <+55>:	mov    0x8d40(%r12),%rbp
   0x00000000000599af <+63>:	cmp    %rbp,%r15
   0x00000000000599b2 <+66>:	mov    0x8(%rbp),%rbx
   0x00000000000599b6 <+70>:	je     0x599d6 <kvm_zap_obsolete_pages+102>

   0x00000000000599d6 <+102>:	mov    0x8d48(%r12),%rax
   0x00000000000599de <+110>:	cmp    %r14,%rax
   0x00000000000599e1 <+113>:	je     0x59a5f <kvm_zap_obsolete_pages+239>

   0x0000000000059a5f <+239>:	mov    0x8(%rsp),%rax
   0x0000000000059a64 <+244>:	sub    %gs:0x28,%rax
   0x0000000000059a6d <+253>:	jne    0x59a86 <kvm_zap_obsolete_pages+278>
   0x0000000000059a6f <+255>:	add    $0x10,%rsp
   0x0000000000059a73 <+259>:	pop    %rbx
   0x0000000000059a74 <+260>:	pop    %rbp
   0x0000000000059a75 <+261>:	pop    %r12
   0x0000000000059a77 <+263>:	pop    %r13
   0x0000000000059a79 <+265>:	pop    %r14
   0x0000000000059a7b <+267>:	pop    %r15
   0x0000000000059a7d <+269>:	ret

and adding an extra list_empty(kvm->arch.active_mmu_pages) generates:

   0x000000000005999a <+42>:	mov    0x8d38(%rdi),%rax
   0x00000000000599a1 <+49>:	cmp    %rax,%r15
   0x00000000000599a4 <+52>:	je     0x59a6f <kvm_zap_obsolete_pages+255>

   0x0000000000059a6f <+255>:	mov    0x8(%rsp),%rax
   0x0000000000059a74 <+260>:	sub    %gs:0x28,%rax
   0x0000000000059a7d <+269>:	jne    0x59a96 <kvm_zap_obsolete_pages+294>
   0x0000000000059a7f <+271>:	add    $0x10,%rsp
   0x0000000000059a83 <+275>:	pop    %rbx
   0x0000000000059a84 <+276>:	pop    %rbp
   0x0000000000059a85 <+277>:	pop    %r12
   0x0000000000059a87 <+279>:	pop    %r13
   0x0000000000059a89 <+281>:	pop    %r14
   0x0000000000059a8b <+283>:	pop    %r15
   0x0000000000059a8d <+285>:	ret

i.e. it elides the list_empty(invalid_list) check, that's it.
Paolo Bonzini July 31, 2024, 4:07 p.m. UTC | #8
On Wed, Jul 31, 2024 at 5:21 PM Sean Christopherson <seanjc@google.com> wrote:
> It's not even remotely close to 100 instructions.  It's not even 10 instructions.
> It's 3 instructions, and maybe two uops?

Well yeah, I meant 100 instructions over the whole execution
of the VM...

Paolo

> Modern compilers are smart enough to optimize usage of kvm_mmu_commit_zap_page()
> so that the caller inlines the list_empty(invalid_list) check, but the guts of
> the zap code are non-inlined.
>
> So, as is, the generated code is:
>
>    0x00000000000599a7 <+55>:    mov    0x8d40(%r12),%rbp
>    0x00000000000599af <+63>:    cmp    %rbp,%r15
>    0x00000000000599b2 <+66>:    mov    0x8(%rbp),%rbx
>    0x00000000000599b6 <+70>:    je     0x599d6 <kvm_zap_obsolete_pages+102>
>
>    0x00000000000599d6 <+102>:   mov    0x8d48(%r12),%rax
>    0x00000000000599de <+110>:   cmp    %r14,%rax
>    0x00000000000599e1 <+113>:   je     0x59a5f <kvm_zap_obsolete_pages+239>
>
>    0x0000000000059a5f <+239>:   mov    0x8(%rsp),%rax
>    0x0000000000059a64 <+244>:   sub    %gs:0x28,%rax
>    0x0000000000059a6d <+253>:   jne    0x59a86 <kvm_zap_obsolete_pages+278>
>    0x0000000000059a6f <+255>:   add    $0x10,%rsp
>    0x0000000000059a73 <+259>:   pop    %rbx
>    0x0000000000059a74 <+260>:   pop    %rbp
>    0x0000000000059a75 <+261>:   pop    %r12
>    0x0000000000059a77 <+263>:   pop    %r13
>    0x0000000000059a79 <+265>:   pop    %r14
>    0x0000000000059a7b <+267>:   pop    %r15
>    0x0000000000059a7d <+269>:   ret
>
> and adding an extra list_empty(kvm->arch.active_mmu_pages) generates:
>
>    0x000000000005999a <+42>:    mov    0x8d38(%rdi),%rax
>    0x00000000000599a1 <+49>:    cmp    %rax,%r15
>    0x00000000000599a4 <+52>:    je     0x59a6f <kvm_zap_obsolete_pages+255>
>
>    0x0000000000059a6f <+255>:   mov    0x8(%rsp),%rax
>    0x0000000000059a74 <+260>:   sub    %gs:0x28,%rax
>    0x0000000000059a7d <+269>:   jne    0x59a96 <kvm_zap_obsolete_pages+294>
>    0x0000000000059a7f <+271>:   add    $0x10,%rsp
>    0x0000000000059a83 <+275>:   pop    %rbx
>    0x0000000000059a84 <+276>:   pop    %rbp
>    0x0000000000059a85 <+277>:   pop    %r12
>    0x0000000000059a87 <+279>:   pop    %r13
>    0x0000000000059a89 <+281>:   pop    %r14
>    0x0000000000059a8b <+283>:   pop    %r15
>    0x0000000000059a8d <+285>:   ret
>
> i.e. it elides the list_empty(invalid_list) check, that's it.
>
Hao Peng Aug. 1, 2024, 4:33 a.m. UTC | #9
On Wed, Jul 31, 2024 at 11:00 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On Wed, Jul 31, 2024 at 1:19 PM Hao Peng <flyingpenghao@gmail.com> wrote:
> > > So if anything you could check list_empty(&kvm->arch.active_mmu_pages)
> > > before the loop of kvm_zap_obsolete_pages(), similar to what is done in
> > > kvm_mmu_zap_oldest_mmu_pages().  I doubt it can have any practical
> > > benefit, though.
> >
> > I did some tests, when ept=0,  kvm_zap_obsolete_pages was called 42
> > times, and only 17 times
> > active_mmu_page list was not empty. When tdp_mmu was enabled,
> > active_mmu_page list
> > was always empty.
>
> Did you also test with nested virtual machines running?
>
Yes, tests with nested virtual machines running had similar results.

> In any case, we're talking of a difference of about 100 instructions
> at most, so it's irrelevant.
>
> Paolo
>

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 901be9e420a4..e91586c2ef87 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6447,7 +6447,8 @@  static void kvm_mmu_zap_all_fast(struct kvm *kvm)
 	 */
 	kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_FREE_OBSOLETE_ROOTS);
 
-	kvm_zap_obsolete_pages(kvm);
+	if (!tdp_mmu_enabled)
+		kvm_zap_obsolete_pages(kvm);
 
 	write_unlock(&kvm->mmu_lock);