diff mbox series

[v1,12/13] KVM: x86/mmu: Add tracepoint for splitting huge pages

Message ID 20211213225918.672507-13-dmatlack@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86/mmu: Eager Page Splitting for the TDP MMU | expand

Commit Message

David Matlack Dec. 13, 2021, 10:59 p.m. UTC
Add a tracepoint that records whenever KVM eagerly splits a huge page.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmutrace.h | 20 ++++++++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.c  |  2 ++
 2 files changed, 22 insertions(+)

Comments

Peter Xu Jan. 5, 2022, 8:38 a.m. UTC | #1
On Mon, Dec 13, 2021 at 10:59:17PM +0000, David Matlack wrote:
> Add a tracepoint that records whenever KVM eagerly splits a huge page.
> 
> Signed-off-by: David Matlack <dmatlack@google.com>

Reviewed-by: Peter Xu <peterx@redhat.com>
Sean Christopherson Jan. 6, 2022, 11:14 p.m. UTC | #2
On Mon, Dec 13, 2021, David Matlack wrote:
> Add a tracepoint that records whenever KVM eagerly splits a huge page.
> 
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
>  arch/x86/kvm/mmu/mmutrace.h | 20 ++++++++++++++++++++
>  arch/x86/kvm/mmu/tdp_mmu.c  |  2 ++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/mmutrace.h b/arch/x86/kvm/mmu/mmutrace.h
> index de5e8e4e1aa7..4feabf773387 100644
> --- a/arch/x86/kvm/mmu/mmutrace.h
> +++ b/arch/x86/kvm/mmu/mmutrace.h
> @@ -416,6 +416,26 @@ TRACE_EVENT(
>  	)
>  );
>  
> +TRACE_EVENT(
> +	kvm_mmu_split_huge_page,
> +	TP_PROTO(u64 gfn, u64 spte, int level),
> +	TP_ARGS(gfn, spte, level),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, gfn)
> +		__field(u64, spte)
> +		__field(int, level)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->gfn = gfn;
> +		__entry->spte = spte;
> +		__entry->level = level;
> +	),
> +
> +	TP_printk("gfn %llx spte %llx level %d", __entry->gfn, __entry->spte, __entry->level)
> +);
> +
>  #endif /* _TRACE_KVMMMU_H */
>  
>  #undef TRACE_INCLUDE_PATH
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index be5eb74ac053..e6910b9b5c12 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1325,6 +1325,8 @@ tdp_mmu_split_huge_page_atomic(struct kvm *kvm, struct tdp_iter *iter, struct kv
>  	u64 child_spte;
>  	int i;
>  
> +	trace_kvm_mmu_split_huge_page(iter->gfn, huge_spte, level);

This should either be called iff splitting is successful, or it should record
whether or not the split was successful.  The latter is probably useful info,
and easy to do, e.g. assuming this is changed to return an int like the lower
helpers:


	ret = tdp_mmu_install_sp_atomic(kvm, iter, sp, false);

	/*
	 * tdp_mmu_install_sp_atomic will handle subtracting the split huge
	 * page from stats, but we have to manually update the new present child
	 * pages on success.
	 */
	if (!ret)
		kvm_update_page_stats(kvm, level - 1, PT64_ENT_PER_PAGE);

	trace_kvm_mmu_split_huge_page(iter->gfn, huge_spte, level, ret);

	return ret;

and then the tracpoint can do 'ret ? "failed" : "succeeded"' or something.

> +
>  	init_child_tdp_mmu_page(sp, iter);
>  
>  	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> -- 
> 2.34.1.173.g76aa8bc2d0-goog
>
David Matlack Jan. 7, 2022, 12:54 a.m. UTC | #3
On Thu, Jan 6, 2022 at 3:14 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Dec 13, 2021, David Matlack wrote:
> > Add a tracepoint that records whenever KVM eagerly splits a huge page.
> >
> > Signed-off-by: David Matlack <dmatlack@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmutrace.h | 20 ++++++++++++++++++++
> >  arch/x86/kvm/mmu/tdp_mmu.c  |  2 ++
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/arch/x86/kvm/mmu/mmutrace.h b/arch/x86/kvm/mmu/mmutrace.h
> > index de5e8e4e1aa7..4feabf773387 100644
> > --- a/arch/x86/kvm/mmu/mmutrace.h
> > +++ b/arch/x86/kvm/mmu/mmutrace.h
> > @@ -416,6 +416,26 @@ TRACE_EVENT(
> >       )
> >  );
> >
> > +TRACE_EVENT(
> > +     kvm_mmu_split_huge_page,
> > +     TP_PROTO(u64 gfn, u64 spte, int level),
> > +     TP_ARGS(gfn, spte, level),
> > +
> > +     TP_STRUCT__entry(
> > +             __field(u64, gfn)
> > +             __field(u64, spte)
> > +             __field(int, level)
> > +     ),
> > +
> > +     TP_fast_assign(
> > +             __entry->gfn = gfn;
> > +             __entry->spte = spte;
> > +             __entry->level = level;
> > +     ),
> > +
> > +     TP_printk("gfn %llx spte %llx level %d", __entry->gfn, __entry->spte, __entry->level)
> > +);
> > +
> >  #endif /* _TRACE_KVMMMU_H */
> >
> >  #undef TRACE_INCLUDE_PATH
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index be5eb74ac053..e6910b9b5c12 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -1325,6 +1325,8 @@ tdp_mmu_split_huge_page_atomic(struct kvm *kvm, struct tdp_iter *iter, struct kv
> >       u64 child_spte;
> >       int i;
> >
> > +     trace_kvm_mmu_split_huge_page(iter->gfn, huge_spte, level);
>
> This should either be called iff splitting is successful, or it should record
> whether or not the split was successful.

Blegh. My intention was to do the former but it's obviously wrong if
the cmpxchg fails.

> The latter is probably useful info,
> and easy to do, e.g. assuming this is changed to return an int like the lower
> helpers:
>
>
>         ret = tdp_mmu_install_sp_atomic(kvm, iter, sp, false);
>
>         /*
>          * tdp_mmu_install_sp_atomic will handle subtracting the split huge
>          * page from stats, but we have to manually update the new present child
>          * pages on success.
>          */
>         if (!ret)
>                 kvm_update_page_stats(kvm, level - 1, PT64_ENT_PER_PAGE);
>
>         trace_kvm_mmu_split_huge_page(iter->gfn, huge_spte, level, ret);
>
>         return ret;
>
> and then the tracpoint can do 'ret ? "failed" : "succeeded"' or something.

If we do this we should capture all the reasons why splitting might
fail. cmpxchg races are one, and the other is failing to allocate the
sp memory. I'll take a look at doing this in the next version. It
doesn't look too difficult.

>
> > +
> >       init_child_tdp_mmu_page(sp, iter);
> >
> >       for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> > --
> > 2.34.1.173.g76aa8bc2d0-goog
> >
diff mbox series

Patch

diff --git a/arch/x86/kvm/mmu/mmutrace.h b/arch/x86/kvm/mmu/mmutrace.h
index de5e8e4e1aa7..4feabf773387 100644
--- a/arch/x86/kvm/mmu/mmutrace.h
+++ b/arch/x86/kvm/mmu/mmutrace.h
@@ -416,6 +416,26 @@  TRACE_EVENT(
 	)
 );
 
+TRACE_EVENT(
+	kvm_mmu_split_huge_page,
+	TP_PROTO(u64 gfn, u64 spte, int level),
+	TP_ARGS(gfn, spte, level),
+
+	TP_STRUCT__entry(
+		__field(u64, gfn)
+		__field(u64, spte)
+		__field(int, level)
+	),
+
+	TP_fast_assign(
+		__entry->gfn = gfn;
+		__entry->spte = spte;
+		__entry->level = level;
+	),
+
+	TP_printk("gfn %llx spte %llx level %d", __entry->gfn, __entry->spte, __entry->level)
+);
+
 #endif /* _TRACE_KVMMMU_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index be5eb74ac053..e6910b9b5c12 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1325,6 +1325,8 @@  tdp_mmu_split_huge_page_atomic(struct kvm *kvm, struct tdp_iter *iter, struct kv
 	u64 child_spte;
 	int i;
 
+	trace_kvm_mmu_split_huge_page(iter->gfn, huge_spte, level);
+
 	init_child_tdp_mmu_page(sp, iter);
 
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {