KVM: x86: count number of zapped pages for tdp_mmu

Message ID: 20240103123959.46994-1-liangchen.linux@gmail.com (mailing list archive)
State: New, archived
Series: KVM: x86: count number of zapped pages for tdp_mmu

Commit Message

Liang Chen Jan. 3, 2024, 12:39 p.m. UTC
Count the number of pages zapped by the TDP MMU in the VM stats.

Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Sean Christopherson Jan. 3, 2024, 3:25 p.m. UTC | #1
+David

On Wed, Jan 03, 2024, Liang Chen wrote:
> Count the number of pages zapped by the TDP MMU in the VM stats.

Why?  I don't necessarily disagree with the change, but it's also not obvious
that this information is all that useful for the TDP MMU, e.g. the pf_fixed/taken
stats largely capture the same information.

> Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 6cd4dd631a2f..7f8bc1329aac 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -325,6 +325,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>  	int i;
>  
>  	trace_kvm_mmu_prepare_zap_page(sp);
> +	++kvm->stat.mmu_shadow_zapped;

This isn't thread safe.  The TDP MMU can zap PTEs with mmu_lock held for read,
i.e. this needs to be an atomic access.  And tdp_mmu_unlink_sp() or even
tdp_unaccount_mmu_page() seems like a better fit for the stats update.

>  	tdp_mmu_unlink_sp(kvm, sp, shared);
>  
> -- 
> 2.40.1
>
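
For reference, here is a rough, untested sketch of the suggested fix: move
the accounting into tdp_unaccount_mmu_page() and make the update atomic,
borrowing the atomic64_t cast idiom that kvm_update_page_stats() already
uses for kvm->stat.pages[]. The surrounding function body is assumed from
the same era of the tree, so treat the hunk as illustrative only:

 static void tdp_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
+	/*
+	 * This path can run with mmu_lock held for read, so the stat
+	 * update must be atomic.
+	 */
+	atomic64_inc((atomic64_t *)&kvm->stat.mmu_shadow_zapped);
 	kvm_account_pgtable_pages((void *)sp->spt, -1);
 	atomic64_dec(&kvm->arch.tdp_mmu_pages);
 }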
Liang Chen Jan. 4, 2024, 4:14 a.m. UTC | #2
On Wed, Jan 3, 2024 at 11:25 PM Sean Christopherson <seanjc@google.com> wrote:
>
> +David
>
> On Wed, Jan 03, 2024, Liang Chen wrote:
> > Count the number of pages zapped by the TDP MMU in the VM stats.
>
> Why?  I don't necessarily disagree with the change, but it's also not obvious
> that this information is all that useful for the TDP MMU, e.g. the pf_fixed/taken
> stats largely capture the same information.
>

We are attempting to make zapping specific to a particular memory
slot, something like below.

void kvm_tdp_zap_pages_in_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
{
        struct kvm_mmu_page *root;
        bool shared = false;
        struct tdp_iter iter;

        gfn_t end = slot->base_gfn + slot->npages;
        gfn_t start = slot->base_gfn;

        write_lock(&kvm->mmu_lock);
        rcu_read_lock();

        for_each_tdp_mmu_root_yield_safe(kvm, root, false) {

                for_each_tdp_pte_min_level(iter, root, root->role.level,
                                           start, end) {
                        if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false))
                                continue;

                        if (!is_shadow_present_pte(iter.old_spte))
                                continue;

                        tdp_mmu_set_spte(kvm, &iter, 0);
                }
        }

        kvm_flush_remote_tlbs(kvm);

        rcu_read_unlock();
        write_unlock(&kvm->mmu_lock);
}

I noticed this was previously done for the legacy MMU, but it
encountered some subtle issues with VFIO. I'm not sure whether the
issue is still present with the TDP MMU, so we are doing more tests
and analyses before submitting a patch. In the meantime, this stat
gives me a convenient way to observe the number of pages being zapped.

> > Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
> > ---
> >  arch/x86/kvm/mmu/tdp_mmu.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 6cd4dd631a2f..7f8bc1329aac 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -325,6 +325,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
> >       int i;
> >
> >       trace_kvm_mmu_prepare_zap_page(sp);
> > +     ++kvm->stat.mmu_shadow_zapped;
>
> This isn't thread safe.  The TDP MMU can zap PTEs with mmu_lock held for read,
> i.e. this needs to be an atomic access.  And tdp_mmu_unlink_sp() or even
> tdp_unaccount_mmu_page() seems like a better fit for the stats update.
>

Yeah, that's an issue. Perhaps it isn't worth the effort to convert
mmu_shadow_zapped to an atomic type or add other synchronization.
I will just add a debugfs attribute to keep track of the number instead.

> >       tdp_mmu_unlink_sp(kvm, sp, shared);
> >
> > --
> > 2.40.1
> >
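
For what it's worth, a minimal sketch of the debugfs route mentioned
above. The counter name, its placement in struct kvm_arch, and the init
hook are all hypothetical, not part of the posted patch:

#include <linux/debugfs.h>
#include <linux/kvm_host.h>

/*
 * Hypothetical field, e.g. added to struct kvm_arch:
 *	atomic64_t tdp_pages_zapped;
 * Bumped with atomic64_inc() on the zap path (safe under read-mode
 * mmu_lock), then exposed read-only via the per-VM debugfs directory.
 */
static int tdp_pages_zapped_get(void *data, u64 *val)
{
	struct kvm *kvm = data;

	*val = atomic64_read(&kvm->arch.tdp_pages_zapped);
	return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(tdp_pages_zapped_fops, tdp_pages_zapped_get,
			 NULL, "%llu\n");

/* Call once kvm->debugfs_dentry exists for the VM. */
static void tdp_mmu_zap_stat_debugfs_init(struct kvm *kvm)
{
	debugfs_create_file("tdp_pages_zapped", 0444, kvm->debugfs_dentry,
			    kvm, &tdp_pages_zapped_fops);
}

The counter would then show up under the per-VM debugfs directory,
e.g. /sys/kernel/debug/kvm/<pid>-<fd>/tdp_pages_zapped.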
David Matlack Jan. 4, 2024, 6:29 p.m. UTC | #3
On Wed, Jan 3, 2024 at 8:14 PM Liang Chen <liangchen.linux@gmail.com> wrote:
>
> On Wed, Jan 3, 2024 at 11:25 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > +David
> >
> > On Wed, Jan 03, 2024, Liang Chen wrote:
> > > Count the number of pages zapped by the TDP MMU in the VM stats.
> >
> > Why?  I don't necessarily disagree with the change, but it's also not obvious
> > that this information is all that useful for the TDP MMU, e.g. the pf_fixed/taken
> > stats largely capture the same information.
> >
>
> We are attempting to make zapping specific to a particular memory
> slot, something like below.
>
> void kvm_tdp_zap_pages_in_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
> {
>         struct kvm_mmu_page *root;
>         bool shared = false;
>         struct tdp_iter iter;
>
>         gfn_t end = slot->base_gfn + slot->npages;
>         gfn_t start = slot->base_gfn;
>
>         write_lock(&kvm->mmu_lock);
>         rcu_read_lock();
>
>         for_each_tdp_mmu_root_yield_safe(kvm, root, false) {
>
>                 for_each_tdp_pte_min_level(iter, root, root->role.level,
>                                            start, end) {
>                         if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false))
>                                 continue;
>
>                         if (!is_shadow_present_pte(iter.old_spte))
>                                 continue;
>
>                         tdp_mmu_set_spte(kvm, &iter, 0);
>                 }
>         }
>
>         kvm_flush_remote_tlbs(kvm);
>
>         rcu_read_unlock();
>         write_unlock(&kvm->mmu_lock);
> }
>
> I noticed this was previously done for the legacy MMU, but it
> encountered some subtle issues with VFIO. I'm not sure whether the
> issue is still present with the TDP MMU, so we are doing more tests
> and analyses before submitting a patch. In the meantime, this stat
> gives me a convenient way to observe the number of pages being zapped.

Note you could also use the existing tracepoint to observe the number
of pages being zapped in a given test run. e.g.

  perf stat -e kvmmmu:kvm_mmu_prepare_zap_page -- <cmd>
Liang Chen Jan. 5, 2024, 1:47 a.m. UTC | #4
On Fri, Jan 5, 2024 at 2:29 AM David Matlack <dmatlack@google.com> wrote:
>
> On Wed, Jan 3, 2024 at 8:14 PM Liang Chen <liangchen.linux@gmail.com> wrote:
> >
> > On Wed, Jan 3, 2024 at 11:25 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > +David
> > >
> > > On Wed, Jan 03, 2024, Liang Chen wrote:
> > > > Count the number of pages zapped by the TDP MMU in the VM stats.
> > >
> > > Why?  I don't necessarily disagree with the change, but it's also not obvious
> > > that this information is all that useful for the TDP MMU, e.g. the pf_fixed/taken
> > > stats largely capture the same information.
> > >
> >
> > We are attempting to make zapping specific to a particular memory
> > slot, something like below.
> >
> > void kvm_tdp_zap_pages_in_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
> > {
> >         struct kvm_mmu_page *root;
> >         bool shared = false;
> >         struct tdp_iter iter;
> >
> >         gfn_t end = slot->base_gfn + slot->npages;
> >         gfn_t start = slot->base_gfn;
> >
> >         write_lock(&kvm->mmu_lock);
> >         rcu_read_lock();
> >
> >         for_each_tdp_mmu_root_yield_safe(kvm, root, false) {
> >
> >                 for_each_tdp_pte_min_level(iter, root, root->role.level,
> >                                            start, end) {
> >                         if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false))
> >                                 continue;
> >
> >                         if (!is_shadow_present_pte(iter.old_spte))
> >                                 continue;
> >
> >                         tdp_mmu_set_spte(kvm, &iter, 0);
> >                 }
> >         }
> >
> >         kvm_flush_remote_tlbs(kvm);
> >
> >         rcu_read_unlock();
> >         write_unlock(&kvm->mmu_lock);
> > }
> >
> > I noticed this was previously done for the legacy MMU, but it
> > encountered some subtle issues with VFIO. I'm not sure whether the
> > issue is still present with the TDP MMU, so we are doing more tests
> > and analyses before submitting a patch. In the meantime, this stat
> > gives me a convenient way to observe the number of pages being zapped.
>
> Note you could also use the existing tracepoint to observe the number
> of pages being zapped in a given test run. e.g.
>
>   perf stat -e kvmmmu:kvm_mmu_prepare_zap_page -- <cmd>

Sure. Thank you!

Patch

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 6cd4dd631a2f..7f8bc1329aac 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -325,6 +325,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
 	int i;
 
 	trace_kvm_mmu_prepare_zap_page(sp);
+	++kvm->stat.mmu_shadow_zapped;
 
 	tdp_mmu_unlink_sp(kvm, sp, shared);