| Message ID | 20220512025307.57924-1-vvghjk1234@gmail.com (mailing list archive) |
|---|---|
| State | New |
| Series | [v2] mm/page_alloc: Fix tracepoint mm_page_alloc_zone_locked() |
On Thu, May 12, 2022 at 11:53:07AM +0900, Wonhyuk Yang wrote:
> Currently, the tracepoint mm_page_alloc_zone_locked() doesn't show
> correct information.
>
> First, when alloc_flags has ALLOC_HARDER/ALLOC_CMA, the page can be
> allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, the
> tracepoint uses the requested migration type, not MIGRATE_HIGHATOMIC
> or MIGRATE_CMA.
>
> Second, since commit 44042b4498728 ("mm/page_alloc: allow high-order
> pages to be stored on the per-cpu lists"), the percpu lists can store
> high-order pages, but the tracepoint decides whether an allocation is
> a refill of the percpu list by comparing the requested order with 0.
>
> To fix these problems, make mm_page_alloc_zone_locked() be called only
> from __rmqueue_smallest() with the correct migration type. With a new
> argument, percpu_refill, it can show roughly whether the allocation is
> a percpu-list refill.
>
> Cc: Baik Song An <bsahn@etri.re.kr>
> Cc: Hong Yeon Kim <kimhy@etri.re.kr>
> Cc: Taeung Song <taeung@reallinux.co.kr>
> Cc: linuxgeek@linuxgeek.io
> Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>

Acked-by: Mel Gorman <mgorman@suse.de>
Sorry for the late reply, but I finally have time to catch up on my inbox.

On Thu, 12 May 2022 11:53:07 +0900 Wonhyuk Yang <vvghjk1234@gmail.com> wrote:

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2476,6 +2476,9 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
>  		del_page_from_free_list(page, zone, current_order);
>  		expand(zone, page, order, current_order, migratetype);
>  		set_pcppage_migratetype(page, migratetype);
> +		trace_mm_page_alloc_zone_locked(page, order, migratetype,
> +				pcp_allowed_order(order) &&
> +				migratetype < MIGRATE_PCPTYPES);

It would make more sense to put this logic into the TP_fast_assign() if
possible. This code is added at the location of execution, and even
though it may not run while tracing is disabled, it does affect the
icache.

Could you pass in the order (you already have migratetype) and then in
the trace event have:

	TP_fast_assign(
		__entry->pfn = page ? page_to_pfn(page) : -1UL;
		__entry->order = order;
		__entry->migratetype = migratetype;
+		__entry->percpu_refill = pcp_allowed_order(order) &&
+					 migratetype < MIGRATE_PCPTYPES;
	),

??

-- Steve

> 		return page;
> 	}
>
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index ddc8c944f417..f89fb3afcd46 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -229,20 +229,23 @@ TRACE_EVENT(mm_page_alloc,

 DECLARE_EVENT_CLASS(mm_page,

-	TP_PROTO(struct page *page, unsigned int order, int migratetype),
+	TP_PROTO(struct page *page, unsigned int order, int migratetype,
+		 int percpu_refill),

-	TP_ARGS(page, order, migratetype),
+	TP_ARGS(page, order, migratetype, percpu_refill),

 	TP_STRUCT__entry(
 		__field(	unsigned long,	pfn		)
 		__field(	unsigned int,	order		)
 		__field(	int,		migratetype	)
+		__field(	int,		percpu_refill	)
 	),

 	TP_fast_assign(
 		__entry->pfn = page ? page_to_pfn(page) : -1UL;
 		__entry->order = order;
 		__entry->migratetype = migratetype;
+		__entry->percpu_refill = percpu_refill;
 	),

 	TP_printk("page=%p pfn=0x%lx order=%u migratetype=%d percpu_refill=%d",
@@ -250,14 +253,15 @@ DECLARE_EVENT_CLASS(mm_page,
 		__entry->pfn != -1UL ? __entry->pfn : 0,
 		__entry->order,
 		__entry->migratetype,
-		__entry->order == 0)
+		__entry->percpu_refill)
 );

 DEFINE_EVENT(mm_page, mm_page_alloc_zone_locked,

-	TP_PROTO(struct page *page, unsigned int order, int migratetype),
+	TP_PROTO(struct page *page, unsigned int order, int migratetype,
+		 int percpu_refill),

-	TP_ARGS(page, order, migratetype)
+	TP_ARGS(page, order, migratetype, percpu_refill)
 );

 TRACE_EVENT(mm_page_pcpu_drain,

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e42038382c1..e906ac274586 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2476,6 +2476,9 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
 		del_page_from_free_list(page, zone, current_order);
 		expand(zone, page, order, current_order, migratetype);
 		set_pcppage_migratetype(page, migratetype);
+		trace_mm_page_alloc_zone_locked(page, order, migratetype,
+				pcp_allowed_order(order) &&
+				migratetype < MIGRATE_PCPTYPES);
 		return page;
 	}

@@ -2999,7 +3002,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 		    zone_page_state(zone, NR_FREE_PAGES) / 2) {
 			page = __rmqueue_cma_fallback(zone, order);
 			if (page)
-				goto out;
+				return page;
 		}
 	}
 retry:
@@ -3012,9 +3015,6 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 					alloc_flags))
 			goto retry;
 	}
-out:
-	if (page)
-		trace_mm_page_alloc_zone_locked(page, order, migratetype);
 	return page;
 }

@@ -3733,11 +3733,8 @@ struct page *rmqueue(struct zone *preferred_zone,
 		 * reserved for high-order atomic allocation, so order-0
		 * request should skip it.
		 */
-		if (order > 0 && alloc_flags & ALLOC_HARDER) {
+		if (order > 0 && alloc_flags & ALLOC_HARDER)
			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
-			if (page)
-				trace_mm_page_alloc_zone_locked(page, order, migratetype);
-		}
		if (!page) {
			page = __rmqueue(zone, order, migratetype, alloc_flags);
			if (!page)
Currently, the tracepoint mm_page_alloc_zone_locked() doesn't show correct information.

First, when alloc_flags has ALLOC_HARDER/ALLOC_CMA, the page can be allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, the tracepoint uses the requested migration type, not MIGRATE_HIGHATOMIC or MIGRATE_CMA.

Second, since commit 44042b4498728 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists"), the percpu lists can store high-order pages, but the tracepoint decides whether an allocation is a refill of the percpu list by comparing the requested order with 0.

To fix these problems, make mm_page_alloc_zone_locked() be called only from __rmqueue_smallest() with the correct migration type. With a new argument, percpu_refill, it can show roughly whether the allocation is a percpu-list refill.

Cc: Baik Song An <bsahn@etri.re.kr>
Cc: Hong Yeon Kim <kimhy@etri.re.kr>
Cc: Taeung Song <taeung@reallinux.co.kr>
Cc: linuxgeek@linuxgeek.io
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
---
v1 -> v2: Simplify determining percpu-refill, suggested by Mel Gorman

 include/trace/events/kmem.h | 14 +++++++++-----
 mm/page_alloc.c             | 13 +++++--------
 2 files changed, 14 insertions(+), 13 deletions(-)