Message ID | 1641531575-28524-1-git-send-email-anshuman.khandual@arm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/migration: Add trace events for THP migrations | expand |
On Fri, 7 Jan 2022 10:29:35 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote: > This adds two trace events for PMD based THP migration without split. These > events closely follow the implementation details like setting and removing > of PMD migration entries, which are essential operations for THP migration. > > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Zi Yan <ziy@nvidia.com> > Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > Cc: John Hubbard <jhubbard@nvidia.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: linux-mm@kvack.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > This applies on v5.16-rc8 > > Changes in V1: > > - Dropped mm, pmdp, page arguments from trace > - Updated trace argument names and output format > > Changes in RFC: > > https://lore.kernel.org/all/1640328398-20698-1-git-send-email-anshuman.khandual@arm.com/ > > include/trace/events/thp.h | 37 +++++++++++++++++++++++++++++++++++++ > mm/huge_memory.c | 5 +++++ > 2 files changed, 42 insertions(+) > > diff --git a/include/trace/events/thp.h b/include/trace/events/thp.h > index d7fbbe551841..193a555aa2ea 100644 > --- a/include/trace/events/thp.h > +++ b/include/trace/events/thp.h > @@ -83,6 +83,43 @@ TRACE_EVENT(hugepage_splitting, > __entry->addr, __entry->pte) > ); > > +TRACE_EVENT(set_migration_pmd, > + > + TP_PROTO(unsigned long addr, unsigned long pmd), > + > + TP_ARGS(addr, pmd), > + > + TP_STRUCT__entry( > + __field(unsigned long, addr) > + __field(unsigned long, pmd) > + ), > + > + TP_fast_assign( > + __entry->addr = addr; > + __entry->pmd = pmd; > + ), > + > + TP_printk("create pmd migration entry addr=%lx, pmd=%lx", __entry->addr, __entry->pmd) > +); > + > +TRACE_EVENT(remove_migration_pmd, > + > + TP_PROTO(unsigned long addr, unsigned long pmd), > + > + TP_ARGS(addr, pmd), > + > + TP_STRUCT__entry( > + __field(unsigned long, addr) > + __field(unsigned long, pmd) > + ), > + > + TP_fast_assign( > + __entry->addr = addr; > + __entry->pmd = pmd; > + ), > + > + TP_printk("remove pmd migration entry addr=%lx, val=%lx", __entry->addr, __entry->pmd) The two above are pretty much identical, except the first one has "pmd=%lx" for the pmd, and the second has "val=%lx" for the pmd. I'd suggest they both be the same, and then you could save memory by combining the two into DECLARE_EVENT_CLASS() / DEFINE_EVENT() macros: DECLARE_EVENT_CLASS(migration_pmd, TP_PROTO(unsigned long addr, unsigned long pmd), TP_ARGS(addr, pmd), TP_STRUCT__entry( __field(unsigned long, addr) __field(unsigned long, pmd) ), TP_fast_assign( __entry->addr = addr; __entry->pmd = pmd; ), TP_printk("create pmd migration entry addr=%lx, pmd=%lx", __entry->addr, __entry->pmd) ); DEFINE_EVENT(migration_pmd, set_migration_pmd, TP_PROTO(unsigned long addr, unsigned long pmd), TP_ARGS(addr, pmd) ); DEFINE_EVENT(migration_pmd, remove_migration_pmd, TP_PROTO(unsigned long addr, unsigned long pmd), TP_ARGS(addr, pmd) ); And then you have the same thing, but it combines the code which saves both data and text. -- Steve > +); > #endif /* _TRACE_THP_H */ > > /* This part must be outside protection */ > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index e5483347291c..d0adc019afe0 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -39,6 +39,9 @@ > #include <asm/pgalloc.h> > #include "internal.h" > > +#define CREATE_TRACE_POINTS > +#include <trace/events/thp.h> > + > /* > * By default, transparent hugepage support is disabled in order to avoid > * risking an increased memory footprint for applications that are not > @@ -3173,6 +3176,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, > set_pmd_at(mm, address, pvmw->pmd, pmdswp); > page_remove_rmap(page, true); > put_page(page); > + trace_set_migration_pmd(address, pmd_val(pmdswp)); > } > > void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) > @@ -3206,5 +3210,6 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) > if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new)) > mlock_vma_page(new); > update_mmu_cache_pmd(vma, address, pvmw->pmd); > + trace_remove_migration_pmd(address, pmd_val(pmde)); > } > #endif
Hi Anshuman, On Fri, Jan 07, 2022 at 10:29:35AM +0530, Anshuman Khandual wrote: > This adds two trace events for PMD based THP migration without split. These > events closely follow the implementation details like setting and removing > of PMD migration entries, which are essential operations for THP migration. I often want to check which individual pages are migrated to which places (or not migrated) for testing, so these new tracepoints could help me. Maybe these can be much greater if they can handle other types of page migration for raw pages and hugetlb pages. Is it hard to cover all such page migration events? Thanks, Naoya Horiguchi
On 1/10/22 10:33 PM, Steven Rostedt wrote: > On Fri, 7 Jan 2022 10:29:35 +0530 > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > >> This adds two trace events for PMD based THP migration without split. These >> events closely follow the implementation details like setting and removing >> of PMD migration entries, which are essential operations for THP migration. >> >> Cc: Steven Rostedt <rostedt@goodmis.org> >> Cc: Ingo Molnar <mingo@redhat.com> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Zi Yan <ziy@nvidia.com> >> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> >> Cc: John Hubbard <jhubbard@nvidia.com> >> Cc: Matthew Wilcox <willy@infradead.org> >> Cc: linux-mm@kvack.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> This applies on v5.16-rc8 >> >> Changes in V1: >> >> - Dropped mm, pmdp, page arguments from trace >> - Updated trace argument names and output format >> >> Changes in RFC: >> >> https://lore.kernel.org/all/1640328398-20698-1-git-send-email-anshuman.khandual@arm.com/ >> >> include/trace/events/thp.h | 37 +++++++++++++++++++++++++++++++++++++ >> mm/huge_memory.c | 5 +++++ >> 2 files changed, 42 insertions(+) >> >> diff --git a/include/trace/events/thp.h b/include/trace/events/thp.h >> index d7fbbe551841..193a555aa2ea 100644 >> --- a/include/trace/events/thp.h >> +++ b/include/trace/events/thp.h >> @@ -83,6 +83,43 @@ TRACE_EVENT(hugepage_splitting, >> __entry->addr, __entry->pte) >> ); >> >> +TRACE_EVENT(set_migration_pmd, >> + >> + TP_PROTO(unsigned long addr, unsigned long pmd), >> + >> + TP_ARGS(addr, pmd), >> + >> + TP_STRUCT__entry( >> + __field(unsigned long, addr) >> + __field(unsigned long, pmd) >> + ), >> + >> + TP_fast_assign( >> + __entry->addr = addr; >> + __entry->pmd = pmd; >> + ), >> + >> + TP_printk("create pmd migration entry addr=%lx, pmd=%lx", __entry->addr, __entry->pmd) >> +); >> + >> +TRACE_EVENT(remove_migration_pmd, >> + >> + TP_PROTO(unsigned long addr, unsigned long pmd), >> + >> + TP_ARGS(addr, pmd), >> + >> + TP_STRUCT__entry( >> + __field(unsigned long, addr) >> + __field(unsigned long, pmd) >> + ), >> + >> + TP_fast_assign( >> + __entry->addr = addr; >> + __entry->pmd = pmd; >> + ), >> + >> + TP_printk("remove pmd migration entry addr=%lx, val=%lx", __entry->addr, __entry->pmd) > > The two above are pretty much identical, except the first one has "pmd=%lx" > for the pmd, and the second has "val=%lx" for the pmd. I'd suggest they > both be the same, and then you could save memory by combining the two into Right, both can be the same 'pmd=%lx'. > DECLARE_EVENT_CLASS() / DEFINE_EVENT() macros: > > DECLARE_EVENT_CLASS(migration_pmd, > > TP_PROTO(unsigned long addr, unsigned long pmd), > > TP_ARGS(addr, pmd), > > TP_STRUCT__entry( > __field(unsigned long, addr) > __field(unsigned long, pmd) > ), > > TP_fast_assign( > __entry->addr = addr; > __entry->pmd = pmd; > ), > > TP_printk("create pmd migration entry addr=%lx, pmd=%lx", __entry->addr, __entry->pmd) > ); > > DEFINE_EVENT(migration_pmd, set_migration_pmd, > TP_PROTO(unsigned long addr, unsigned long pmd), > TP_ARGS(addr, pmd) > ); > > DEFINE_EVENT(migration_pmd, remove_migration_pmd, > TP_PROTO(unsigned long addr, unsigned long pmd), > TP_ARGS(addr, pmd) > ); > > And then you have the same thing, but it combines the code which saves both > data and text. But both need to print different messages. Hence I am wondering whether a (const char *) based message can be passed down here as well. create pmd migration entry addr=%lx, pmd=%lx remove pmd migration entry addr=%lx, pmd=%lx > > -- Steve > >> +); >> #endif /* _TRACE_THP_H */ >> >> /* This part must be outside protection */ >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index e5483347291c..d0adc019afe0 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -39,6 +39,9 @@ >> #include <asm/pgalloc.h> >> #include "internal.h" >> >> +#define CREATE_TRACE_POINTS >> +#include <trace/events/thp.h> >> + >> /* >> * By default, transparent hugepage support is disabled in order to avoid >> * risking an increased memory footprint for applications that are not >> @@ -3173,6 +3176,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, >> set_pmd_at(mm, address, pvmw->pmd, pmdswp); >> page_remove_rmap(page, true); >> put_page(page); >> + trace_set_migration_pmd(address, pmd_val(pmdswp)); >> } >> >> void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) >> @@ -3206,5 +3210,6 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) >> if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new)) >> mlock_vma_page(new); >> update_mmu_cache_pmd(vma, address, pvmw->pmd); >> + trace_remove_migration_pmd(address, pmd_val(pmde)); >> } >> #endif >
On 1/11/22 7:28 AM, Naoya Horiguchi wrote: > Hi Anshuman, > > On Fri, Jan 07, 2022 at 10:29:35AM +0530, Anshuman Khandual wrote: >> This adds two trace events for PMD based THP migration without split. These >> events closely follow the implementation details like setting and removing >> of PMD migration entries, which are essential operations for THP migration. > > I often want to check which individual pages are migrated to which places > (or not migrated) for testing, so these new tracepoints could help me. > Maybe these can be much greater if they can handle other types of page > migration for raw pages and hugetlb pages. Is it hard to cover all such > page migration events? Are you suggesting to cover all migration entry transitions for normal and HugeTLB pages as well ? migrate_pages() unmap_and_move_huge_page() try_to_migrate() make_writable_migration_entry() <--- make_readable_migration_entry() <--- remove_migration_ptes() <--- unmap_and_move() __unmap_and_move() try_to_migrate() make_writable_migration_entry() <--- make_readable_migration_entry() <--- remove_migration_ptes() <--- > > Thanks, > Naoya Horiguchi >
On Tue, Jan 11, 2022 at 10:31:21AM +0530, Anshuman Khandual wrote: > > > On 1/11/22 7:28 AM, Naoya Horiguchi wrote: > > Hi Anshuman, > > > > On Fri, Jan 07, 2022 at 10:29:35AM +0530, Anshuman Khandual wrote: > >> This adds two trace events for PMD based THP migration without split. These > >> events closely follow the implementation details like setting and removing > >> of PMD migration entries, which are essential operations for THP migration. > > > > I often want to check which individual pages are migrated to which places > > (or not migrated) for testing, so these new tracepoints could help me. > > Maybe these can be much greater if they can handle other types of page > > migration for raw pages and hugetlb pages. Is it hard to cover all such > > page migration events? > > Are you suggesting to cover all migration entry transitions for normal > and HugeTLB pages as well ? Yes if you like the idea. I think that some events listed below can be grouped into one tracepoint event with showing args like pgsize or read/write flags (or implementation detail is up to you). > > migrate_pages() > unmap_and_move_huge_page() > try_to_migrate() > make_writable_migration_entry() <--- > make_readable_migration_entry() <--- > remove_migration_ptes() <--- > unmap_and_move() > __unmap_and_move() > try_to_migrate() > make_writable_migration_entry() <--- > make_readable_migration_entry() <--- > remove_migration_ptes() <--- Thanks, Naoya Horiguchi
On 1/11/22 12:27 PM, Naoya Horiguchi wrote: > On Tue, Jan 11, 2022 at 10:31:21AM +0530, Anshuman Khandual wrote: >> >> >> On 1/11/22 7:28 AM, Naoya Horiguchi wrote: >>> Hi Anshuman, >>> >>> On Fri, Jan 07, 2022 at 10:29:35AM +0530, Anshuman Khandual wrote: >>>> This adds two trace events for PMD based THP migration without split. These >>>> events closely follow the implementation details like setting and removing >>>> of PMD migration entries, which are essential operations for THP migration. >>> >>> I often want to check which individual pages are migrated to which places >>> (or not migrated) for testing, so these new tracepoints could help me. >>> Maybe these can be much greater if they can handle other types of page >>> migration for raw pages and hugetlb pages. Is it hard to cover all such >>> page migration events? >> >> Are you suggesting to cover all migration entry transitions for normal >> and HugeTLB pages as well ? > > Yes if you like the idea. I think that some events listed below can be grouped > into one tracepoint event with showing args like pgsize or read/write flags > (or implementation detail is up to you). In its simplest form, something like this will work ? Although the THP migration patch still remains (almost) unchanged. include/trace/events/migrate.h | 38 ++++++++++++++++++++++++++++++++++ mm/migrate.c | 10 +++++++-- mm/rmap.c | 3 +++ 3 files changed, 49 insertions(+), 2 deletions(-) diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h index 779f3fad9ecd..b66652fcc8af 100644 --- a/include/trace/events/migrate.h +++ b/include/trace/events/migrate.h @@ -105,6 +105,44 @@ TRACE_EVENT(mm_migrate_pages_start, __print_symbolic(__entry->reason, MIGRATE_REASON)) ); +TRACE_EVENT(set_migration_pte, + + TP_PROTO(unsigned long addr, unsigned long pte), + + TP_ARGS(addr, pte), + + TP_STRUCT__entry( + __field(unsigned long, addr) + __field(unsigned long, pte) + ), + + TP_fast_assign( + __entry->addr = addr; + __entry->pte = pte; + ), + + TP_printk("create pte migration entry addr=%lx, pte=%lx", __entry->addr, __entry->pte) +); + +TRACE_EVENT(remove_migration_ptes, + + TP_PROTO(struct page *old_page, struct page *new_page), + + TP_ARGS(old_page, new_page), + + TP_STRUCT__entry( + __field(struct page *, old_page) + __field(struct page *, new_page) + ), + + TP_fast_assign( + __entry->old_page = old_page; + __entry->new_page = new_page; + ), + + TP_printk("remove pte migration entry old_page=%lx new_page=%lx", __entry->old_page, __entry->new_page); +); + #endif /* _TRACE_MIGRATE_H */ /* This part must be outside protection */ diff --git a/mm/migrate.c b/mm/migrate.c index 18ce840914f0..271b1d565642 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1053,9 +1053,12 @@ static int __unmap_and_move(struct page *page, struct page *newpage, if (!page_mapped(page)) rc = move_to_new_page(newpage, page, mode); - if (page_was_mapped) + if (page_was_mapped) { remove_migration_ptes(page, rc == MIGRATEPAGE_SUCCESS ? newpage : page, false); + trace_remove_migration_ptes(page, + rc == MIGRATEPAGE_SUCCESS ? newpage : page); + } out_unlock_both: unlock_page(newpage); @@ -1275,9 +1278,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, if (!page_mapped(hpage)) rc = move_to_new_page(new_hpage, hpage, mode); - if (page_was_mapped) + if (page_was_mapped) { remove_migration_ptes(hpage, rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false); + trace_remove_migration_ptes(hpage, + rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage); + } unlock_put_anon: unlock_page(new_hpage); diff --git a/mm/rmap.c b/mm/rmap.c index 6a1e8c7f6213..cd373e378694 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -77,6 +77,7 @@ #include <asm/tlbflush.h> #include <trace/events/tlb.h> +#include <trace/events/migrate.h> #include "internal.h" @@ -1861,6 +1862,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma, if (pte_swp_uffd_wp(pteval)) swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte); + trace_set_migration_pte(pvmw.address, pte_val(swp_pte)); /* * No need to invalidate here it will synchronize on * against the special swap migration pte. @@ -1929,6 +1931,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma, if (pte_uffd_wp(pteval)) swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); + trace_set_migration_pte(address, pte_val(swp_pte)); /* * No need to invalidate here it will synchronize on * against the special swap migration pte.
On Fri, Jan 21, 2022 at 12:08:14PM +0530, Anshuman Khandual wrote: > > > On 1/11/22 12:27 PM, Naoya Horiguchi wrote: > > On Tue, Jan 11, 2022 at 10:31:21AM +0530, Anshuman Khandual wrote: > >> > >> > >> On 1/11/22 7:28 AM, Naoya Horiguchi wrote: > >>> Hi Anshuman, > >>> > >>> On Fri, Jan 07, 2022 at 10:29:35AM +0530, Anshuman Khandual wrote: > >>>> This adds two trace events for PMD based THP migration without split. These > >>>> events closely follow the implementation details like setting and removing > >>>> of PMD migration entries, which are essential operations for THP migration. > >>> > >>> I often want to check which individual pages are migrated to which places > >>> (or not migrated) for testing, so these new tracepoints could help me. > >>> Maybe these can be much greater if they can handle other types of page > >>> migration for raw pages and hugetlb pages. Is it hard to cover all such > >>> page migration events? > >> > >> Are you suggesting to cover all migration entry transitions for normal > >> and HugeTLB pages as well ? > > > > Yes if you like the idea. I think that some events listed below can be grouped > > into one tracepoint event with showing args like pgsize or read/write flags > > (or implementation detail is up to you). > > In its simplest form, something like this will work ? Although the THP migration > patch still remains (almost) unchanged. > > include/trace/events/migrate.h | 38 ++++++++++++++++++++++++++++++++++ > mm/migrate.c | 10 +++++++-- > mm/rmap.c | 3 +++ > 3 files changed, 49 insertions(+), 2 deletions(-) > > diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h > index 779f3fad9ecd..b66652fcc8af 100644 > --- a/include/trace/events/migrate.h > +++ b/include/trace/events/migrate.h > @@ -105,6 +105,44 @@ TRACE_EVENT(mm_migrate_pages_start, > __print_symbolic(__entry->reason, MIGRATE_REASON)) > ); > > +TRACE_EVENT(set_migration_pte, > + > + TP_PROTO(unsigned long addr, unsigned long pte), > + > + TP_ARGS(addr, pte), > + > + TP_STRUCT__entry( > + __field(unsigned long, addr) > + __field(unsigned long, pte) > + ), > + > + TP_fast_assign( > + __entry->addr = addr; > + __entry->pte = pte; > + ), > + > + TP_printk("create pte migration entry addr=%lx, pte=%lx", __entry->addr, __entry->pte) > +); > + > +TRACE_EVENT(remove_migration_ptes, > + > + TP_PROTO(struct page *old_page, struct page *new_page), > + > + TP_ARGS(old_page, new_page), > + > + TP_STRUCT__entry( > + __field(struct page *, old_page) > + __field(struct page *, new_page) > + ), > + > + TP_fast_assign( > + __entry->old_page = old_page; > + __entry->new_page = new_page; > + ), > + > + TP_printk("remove pte migration entry old_page=%lx new_page=%lx", __entry->old_page, __entry->new_page); Thank you for the work, Anshuman. I have a few comments: - the format string in TP_printk has prefix like "remove pte migration entry", but it seems to me redundant because each trace event name is also printed out like below: bash-2611 [001] 272.952419: mm_migrate_pages_start: mode=MIGRATE_SYNC reason=mempolicy_mbind bash-2611 [001] 272.952439: set_migration_pte: create pte migration entry addr=700000000000, pte=dfffffffdf478602 bash-2611 [001] 272.952466: remove_migration_ptes: remove pte migration entry old_pfn=0x105c3c new_pfn=0x15e8d8 Maybe trace events for THP migration can be more compact too. - TP_ARGS for remove_migration_ptes() are inconsistent with those of set_migration_pte and {set,remove}_migration_pmd, so how about putting trace_remove_migration_pte() within remove_migration_ptes() instead of after remove_migration_ptes(). - I think page order is also important to tell hugetlb migration events from normal page migration events. I borrowed your diff and made testing patch like below. Feel free to use/update it. Thanks, Naoya Horiguchi --- diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h index 779f3fad9ecd..f025e80bd5b9 100644 --- a/include/trace/events/migrate.h +++ b/include/trace/events/migrate.h @@ -105,6 +105,48 @@ TRACE_EVENT(mm_migrate_pages_start, __print_symbolic(__entry->reason, MIGRATE_REASON)) ); +TRACE_EVENT(set_migration_pte, + + TP_PROTO(unsigned long addr, unsigned long pte, int order), + + TP_ARGS(addr, pte, order), + + TP_STRUCT__entry( + __field(unsigned long, addr) + __field(unsigned long, pte) + __field(int, order) + ), + + TP_fast_assign( + __entry->addr = addr; + __entry->pte = pte; + __entry->order = order; + ), + + TP_printk("addr=%lx, pte=%lx, order=%d", __entry->addr, __entry->pte, __entry->order) +); + +TRACE_EVENT(remove_migration_pte, + + TP_PROTO(unsigned long addr, unsigned long pte, int order), + + TP_ARGS(addr, pte, order), + + TP_STRUCT__entry( + __field(unsigned long, addr) + __field(unsigned long, pte) + __field(int, order) + ), + + TP_fast_assign( + __entry->addr = addr; + __entry->pte = pte; + __entry->order = order; + ), + + TP_printk("addr=%lx, pte=%lx, order=%d", __entry->addr, __entry->pte, __entry->order) +); + #endif /* _TRACE_MIGRATE_H */ /* This part must be outside protection */ diff --git a/mm/migrate.c b/mm/migrate.c index c7da064b4781..0f9fb8813301 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -257,6 +257,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma, if (PageTransHuge(page) && PageMlocked(page)) clear_page_mlock(page); + trace_remove_migration_pte(pvmw.address, pte_val(pte), compound_order(new)); + /* No need to invalidate - it was non-present before */ update_mmu_cache(vma, pvmw.address, pvmw.pte); } diff --git a/mm/rmap.c b/mm/rmap.c index b0fd9dc19eba..e089cbb9ef97 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -77,6 +77,7 @@ #include <asm/tlbflush.h> #include <trace/events/tlb.h> +#include <trace/events/migrate.h> #include "internal.h" @@ -1872,6 +1873,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma, if (pte_swp_uffd_wp(pteval)) swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte); + trace_set_migration_pte(pvmw.address, pte_val(swp_pte), compound_order(page)); /* * No need to invalidate here it will synchronize on * against the special swap migration pte. @@ -1940,6 +1942,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma, if (pte_uffd_wp(pteval)) swp_pte = pte_swp_mkuffd_wp(swp_pte); set_pte_at(mm, address, pvmw.pte, swp_pte); + trace_set_migration_pte(address, pte_val(swp_pte), compound_order(page)); /* * No need to invalidate here it will synchronize on * against the special swap migration pte.
diff --git a/include/trace/events/thp.h b/include/trace/events/thp.h index d7fbbe551841..193a555aa2ea 100644 --- a/include/trace/events/thp.h +++ b/include/trace/events/thp.h @@ -83,6 +83,43 @@ TRACE_EVENT(hugepage_splitting, __entry->addr, __entry->pte) ); +TRACE_EVENT(set_migration_pmd, + + TP_PROTO(unsigned long addr, unsigned long pmd), + + TP_ARGS(addr, pmd), + + TP_STRUCT__entry( + __field(unsigned long, addr) + __field(unsigned long, pmd) + ), + + TP_fast_assign( + __entry->addr = addr; + __entry->pmd = pmd; + ), + + TP_printk("create pmd migration entry addr=%lx, pmd=%lx", __entry->addr, __entry->pmd) +); + +TRACE_EVENT(remove_migration_pmd, + + TP_PROTO(unsigned long addr, unsigned long pmd), + + TP_ARGS(addr, pmd), + + TP_STRUCT__entry( + __field(unsigned long, addr) + __field(unsigned long, pmd) + ), + + TP_fast_assign( + __entry->addr = addr; + __entry->pmd = pmd; + ), + + TP_printk("remove pmd migration entry addr=%lx, val=%lx", __entry->addr, __entry->pmd) +); #endif /* _TRACE_THP_H */ /* This part must be outside protection */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e5483347291c..d0adc019afe0 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -39,6 +39,9 @@ #include <asm/pgalloc.h> #include "internal.h" +#define CREATE_TRACE_POINTS +#include <trace/events/thp.h> + /* * By default, transparent hugepage support is disabled in order to avoid * risking an increased memory footprint for applications that are not @@ -3173,6 +3176,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, set_pmd_at(mm, address, pvmw->pmd, pmdswp); page_remove_rmap(page, true); put_page(page); + trace_set_migration_pmd(address, pmd_val(pmdswp)); } void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) @@ -3206,5 +3210,6 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new)) mlock_vma_page(new); update_mmu_cache_pmd(vma, address, pvmw->pmd); + trace_remove_migration_pmd(address, pmd_val(pmde)); } #endif
This adds two trace events for PMD based THP migration without split. These events closely follow the implementation details like setting and removing of PMD migration entries, which are essential operations for THP migration. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Zi Yan <ziy@nvidia.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> --- This applies on v5.16-rc8 Changes in V1: - Dropped mm, pmdp, page arguments from trace - Updated trace argument names and output format Changes in RFC: https://lore.kernel.org/all/1640328398-20698-1-git-send-email-anshuman.khandual@arm.com/ include/trace/events/thp.h | 37 +++++++++++++++++++++++++++++++++++++ mm/huge_memory.c | 5 +++++ 2 files changed, 42 insertions(+)