Message ID | 20210310180104.517886-1-minchan@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | [v3] mm: page_alloc: dump migrate-failed pages |
On Wed, 10 Mar 2021 10:01:04 -0800 Minchan Kim <minchan@kernel.org> wrote:

> Currently, debugging CMA allocation failures is quite limited.
> The most commong source of these failures seems to be page
> migration which doesn't provide any useful information on the
> reason of the failure by itself. alloc_contig_range can report
> those failures as it holds a list of migrate-failed pages.
>
> page refcount, mapcount with page flags on dump_page are
> helpful information to deduce the culprit. Furthermore,
> dump_page_owner was super helpful to find long term pinner
> who initiated the page allocation.
>
> The reason it approach with dynamic debug is the debug message
> could emit lots of noises as alloc_contig_range calls more
> frequently since it's a best effort allocator.
>
> There are two ifdefery conditions to support common dyndbg options:
>
> - CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE
>   It aims for supporting the feature with only specific file
>   with adding ccflags.
>
> - CONFIG_DYNAMIC_DEBUG
>   It aims for supporting the feature with system wide globally.
>
> A simple example to enable the feature:
>
> Admin could enable the dump like this(by default, disabled)
>
> 	echo "func dump_migrate_failure_pages +p" > control
>
> Admin could disable it.
>
> 	echo "func dump_migrate_failure_pages =_" > control

I think the changelog is out of sync. Did you mean
"alloc_contig_dump_pages" here?

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8453,6 +8453,27 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
>  				pageblock_nr_pages));
>  }
>
> +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> +	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> +static void alloc_contig_dump_pages(struct list_head *page_list)
> +{
> +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> +			"migrate failure");
> +
> +	if (DYNAMIC_DEBUG_BRANCH(descriptor)) {
> +		struct page *page;
> +
> +		WARN(1, "failed callstack");
> +		list_for_each_entry(page, page_list, lru)
> +			dump_page(page, "migration failure");
> +	}
> +}

I doubt if everyone is familiar with dynamic debug. It might be kind
to add a little comment over this, telling people how to turn it on and
off.
On Wed, Mar 10, 2021 at 12:56:09PM -0800, Andrew Morton wrote:
> On Wed, 10 Mar 2021 10:01:04 -0800 Minchan Kim <minchan@kernel.org> wrote:
>
> > Currently, debugging CMA allocation failures is quite limited.
> > The most commong source of these failures seems to be page
> > migration which doesn't provide any useful information on the
> > reason of the failure by itself. alloc_contig_range can report
> > those failures as it holds a list of migrate-failed pages.
> >
> > page refcount, mapcount with page flags on dump_page are
> > helpful information to deduce the culprit. Furthermore,
> > dump_page_owner was super helpful to find long term pinner
> > who initiated the page allocation.
> >
> > The reason it approach with dynamic debug is the debug message
> > could emit lots of noises as alloc_contig_range calls more
> > frequently since it's a best effort allocator.
> >
> > There are two ifdefery conditions to support common dyndbg options:
> >
> > - CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE
> >   It aims for supporting the feature with only specific file
> >   with adding ccflags.
> >
> > - CONFIG_DYNAMIC_DEBUG
> >   It aims for supporting the feature with system wide globally.
> >
> > A simple example to enable the feature:
> >
> > Admin could enable the dump like this(by default, disabled)
> >
> > 	echo "func dump_migrate_failure_pages +p" > control
> >
> > Admin could disable it.
> >
> > 	echo "func dump_migrate_failure_pages =_" > control
>
> I think the changelog is out of sync. Did you mean
> "alloc_contig_dump_pages" here?

Oops. Now fixed.

>
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8453,6 +8453,27 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
> >  				pageblock_nr_pages));
> >  }
> >
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> > +static void alloc_contig_dump_pages(struct list_head *page_list)
> > +{
> > +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +			"migrate failure");
> > +
> > +	if (DYNAMIC_DEBUG_BRANCH(descriptor)) {
> > +		struct page *page;
> > +
> > +		WARN(1, "failed callstack");
> > +		list_for_each_entry(page, page_list, lru)
> > +			dump_page(page, "migration failure");
> > +	}
> > +}
>
> I doubt if everyone is familiar with dynamic debug. It might be kind
> to add a little comment over this, telling people how to turn it on and
> off.

Something like this?

#if defined(CONFIG_DYNAMIC_DEBUG) || \
	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
/*
 * dyndbg_dir="/sys/kernel/debug/dynamic_debug"
 * To enable:
 *	echo "func dump_migrate_failure_pages +p" > $dyndbg_dir/control
 * To disable:
 *	echo "func dump_migrate_failure_pages =_" > $dyndbg_dir/control
 */
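For readers who want to try the on/off procedure discussed above, here is a minimal shell sketch. It is not part of the patch. It assumes debugfs is mounted at /sys/kernel/debug, that the commands run as root, and that the query uses the function name that actually appears in the diff (alloc_contig_dump_pages), as Andrew pointed out. The helper names dyndbg_set and dyndbg_show and the DYNDBG_CONTROL variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the patch.
# DYNDBG_CONTROL can be overridden; the default assumes debugfs is
# mounted at /sys/kernel/debug.
DYNDBG_CONTROL="${DYNDBG_CONTROL:-/sys/kernel/debug/dynamic_debug/control}"

# dyndbg_set FUNCTION FLAGS
# Writes a "func <name> <flags>" query to the control file.
# FLAGS is a dyndbg flag change such as +p (enable) or =_ (clear all flags).
dyndbg_set() {
    printf 'func %s %s\n' "$1" "$2" > "$DYNDBG_CONTROL"
}

# dyndbg_show FUNCTION
# Lists the control-file entries that mention the given function name.
dyndbg_show() {
    grep -- "$1" "$DYNDBG_CONTROL"
}
```

With these helpers, `dyndbg_set alloc_contig_dump_pages +p` would enable the dump and `dyndbg_set alloc_contig_dump_pages =_` would disable it again; `dyndbg_show alloc_contig_dump_pages` shows the current state of the call site.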
On 10.03.21 19:01, Minchan Kim wrote:
> Currently, debugging CMA allocation failures is quite limited.
> The most commong source of these failures seems to be page
> migration which doesn't provide any useful information on the
> reason of the failure by itself. alloc_contig_range can report
> those failures as it holds a list of migrate-failed pages.
>
> page refcount, mapcount with page flags on dump_page are
> helpful information to deduce the culprit. Furthermore,
> dump_page_owner was super helpful to find long term pinner
> who initiated the page allocation.
>
> The reason it approach with dynamic debug is the debug message
> could emit lots of noises as alloc_contig_range calls more
> frequently since it's a best effort allocator.
>
> There are two ifdefery conditions to support common dyndbg options:
>
> - CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE
>   It aims for supporting the feature with only specific file
>   with adding ccflags.
>
> - CONFIG_DYNAMIC_DEBUG
>   It aims for supporting the feature with system wide globally.
>
> A simple example to enable the feature:
>
> Admin could enable the dump like this(by default, disabled)
>
> 	echo "func dump_migrate_failure_pages +p" > control
>
> Admin could disable it.
>
> 	echo "func dump_migrate_failure_pages =_" > control
>
> Detail goes Documentation/admin-guide/dynamic-debug-howto.rst
>
> A concern is utility functions in dump_page uses inconsistent
> loglevels.
>
> __dump_page: KERN_WARNING
> __dump_page_owner: KERN_ALERT
> stack_trace_print: KERN_DEFAULT
>
> There are bunch of places to use the inconsistent loglevel
> utility functions(e.g., just grep dump_page/strace_trace_print).
> It's unfortunate but here we are. It could be addressed
> different patchset.
>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> * from v2 - https://lore.kernel.org/linux-mm/20210308202047.1903802-1-minchan@kernel.org/
>   * remove ratelimit - mhocko
>
> * from v1 - https://lore.kernel.org/linux-mm/20210217163603.429062-1-minchan@kernel.org/
>   * use dynamic debugging with system wide instead of per-call site - mhocko
>
>  mm/page_alloc.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3e4b29ee2b1e..f42f3e208794 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8453,6 +8453,27 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
>  				pageblock_nr_pages));
>  }
>
> +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> +	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> +static void alloc_contig_dump_pages(struct list_head *page_list)
> +{
> +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> +			"migrate failure");
> +
> +	if (DYNAMIC_DEBUG_BRANCH(descriptor)) {
> +		struct page *page;
> +
> +		WARN(1, "failed callstack");

Not sure if we want that here. You can deterministically crash a kernel
with panic_on_warn set - which is what some distros already do.

Some other way to dump the current call stack would be handy ...

> +		list_for_each_entry(page, page_list, lru)
> +			dump_page(page, "migration failure");
> +	}
> +}
> +#else
> +static inline void alloc_contig_dump_pages(struct list_head *page_list)
> +{
> +}
> +#endif
> +
>  /* [start, end) must belong to a single zone. */
>  static int __alloc_contig_migrate_range(struct compact_control *cc,
>  					unsigned long start, unsigned long end)
> @@ -8496,6 +8517,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
>  	}
>  	if (ret < 0) {
> +		alloc_contig_dump_pages(&cc->migratepages);
>  		putback_movable_pages(&cc->migratepages);
>  		return ret;
>  	}
>

For me, this is good enough for now.
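The panic_on_warn concern can be checked from userspace before enabling the dump on a kernel that carries this v3 version of the patch. The following is a minimal sketch, assuming the standard sysctl file /proc/sys/kernel/panic_on_warn is present; the helper name panic_on_warn_enabled and the PANIC_ON_WARN_FILE override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch.
# PANIC_ON_WARN_FILE can be overridden; the default is the sysctl file.
PANIC_ON_WARN_FILE="${PANIC_ON_WARN_FILE:-/proc/sys/kernel/panic_on_warn}"

# Returns 0 if panic_on_warn is set to 1, and 1 otherwise
# (including when the file cannot be read).
panic_on_warn_enabled() {
    value=$(cat "$PANIC_ON_WARN_FILE" 2>/dev/null) || return 1
    [ "$value" = "1" ]
}
```

A caller could refuse to turn on the dump when `panic_on_warn_enabled` succeeds, since with WARN() in the dump path every reported migration failure would then panic the machine.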
On Thu, Mar 11, 2021 at 05:40:35PM +0100, David Hildenbrand wrote:
> On 10.03.21 19:01, Minchan Kim wrote:
> > Currently, debugging CMA allocation failures is quite limited.
> > The most commong source of these failures seems to be page
> > migration which doesn't provide any useful information on the
> > reason of the failure by itself. alloc_contig_range can report
> > those failures as it holds a list of migrate-failed pages.
> >
> > page refcount, mapcount with page flags on dump_page are
> > helpful information to deduce the culprit. Furthermore,
> > dump_page_owner was super helpful to find long term pinner
> > who initiated the page allocation.
> >
> > The reason it approach with dynamic debug is the debug message
> > could emit lots of noises as alloc_contig_range calls more
> > frequently since it's a best effort allocator.
> >
> > There are two ifdefery conditions to support common dyndbg options:
> >
> > - CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE
> >   It aims for supporting the feature with only specific file
> >   with adding ccflags.
> >
> > - CONFIG_DYNAMIC_DEBUG
> >   It aims for supporting the feature with system wide globally.
> >
> > A simple example to enable the feature:
> >
> > Admin could enable the dump like this(by default, disabled)
> >
> > 	echo "func dump_migrate_failure_pages +p" > control
> >
> > Admin could disable it.
> >
> > 	echo "func dump_migrate_failure_pages =_" > control
> >
> > Detail goes Documentation/admin-guide/dynamic-debug-howto.rst
> >
> > A concern is utility functions in dump_page uses inconsistent
> > loglevels.
> >
> > __dump_page: KERN_WARNING
> > __dump_page_owner: KERN_ALERT
> > stack_trace_print: KERN_DEFAULT
> >
> > There are bunch of places to use the inconsistent loglevel
> > utility functions(e.g., just grep dump_page/strace_trace_print).
> > It's unfortunate but here we are. It could be addressed
> > different patchset.
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> > * from v2 - https://lore.kernel.org/linux-mm/20210308202047.1903802-1-minchan@kernel.org/
> >   * remove ratelimit - mhocko
> >
> > * from v1 - https://lore.kernel.org/linux-mm/20210217163603.429062-1-minchan@kernel.org/
> >   * use dynamic debugging with system wide instead of per-call site - mhocko
> >
> >  mm/page_alloc.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 3e4b29ee2b1e..f42f3e208794 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8453,6 +8453,27 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
> >  				pageblock_nr_pages));
> >  }
> >
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
> > +static void alloc_contig_dump_pages(struct list_head *page_list)
> > +{
> > +	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +			"migrate failure");
> > +
> > +	if (DYNAMIC_DEBUG_BRANCH(descriptor)) {
> > +		struct page *page;
> > +
> > +		WARN(1, "failed callstack");
>
> Not sure if we want that here. You can deterministically crash a kernel with
> panic_on_warn set - which is what some distros already do.

Good point.

>
> Some other way to dump the current call stack would be handy ...

Yub, let's go with dump_stack.

>
> > +		list_for_each_entry(page, page_list, lru)
> > +			dump_page(page, "migration failure");
> > +	}
> > +}
> > +#else
> > +static inline void alloc_contig_dump_pages(struct list_head *page_list)
> > +{
> > +}
> > +#endif
> > +
> >  /* [start, end) must belong to a single zone. */
> >  static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  					unsigned long start, unsigned long end)
> > @@ -8496,6 +8517,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
> >  	}
> >  	if (ret < 0) {
> > +		alloc_contig_dump_pages(&cc->migratepages);
> >  		putback_movable_pages(&cc->migratepages);
> >  		return ret;
> >  	}
> >
>
> For me, this is good enough for now.

Thanks. Let me send next revision with addressing your point and Andrew's one.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3e4b29ee2b1e..f42f3e208794 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8453,6 +8453,27 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
 				pageblock_nr_pages));
 }
 
+#if defined(CONFIG_DYNAMIC_DEBUG) || \
+	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
+static void alloc_contig_dump_pages(struct list_head *page_list)
+{
+	DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
+			"migrate failure");
+
+	if (DYNAMIC_DEBUG_BRANCH(descriptor)) {
+		struct page *page;
+
+		WARN(1, "failed callstack");
+		list_for_each_entry(page, page_list, lru)
+			dump_page(page, "migration failure");
+	}
+}
+#else
+static inline void alloc_contig_dump_pages(struct list_head *page_list)
+{
+}
+#endif
+
 /* [start, end) must belong to a single zone. */
 static int __alloc_contig_migrate_range(struct compact_control *cc,
 					unsigned long start, unsigned long end)
@@ -8496,6 +8517,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
 	}
 	if (ret < 0) {
+		alloc_contig_dump_pages(&cc->migratepages);
 		putback_movable_pages(&cc->migratepages);
 		return ret;
 	}
Currently, debugging CMA allocation failures is quite limited.
The most common source of these failures seems to be page
migration, which by itself doesn't provide any useful information
on the reason for the failure. alloc_contig_range can report
those failures as it holds a list of migrate-failed pages.

The page refcount, mapcount and page flags printed by dump_page
are helpful information to deduce the culprit. Furthermore,
dump_page_owner was super helpful for finding the long-term pinner
that initiated the page allocation.

The reason for using dynamic debug is that the debug message
could emit a lot of noise, as alloc_contig_range is called
frequently since it's a best-effort allocator.

There are two ifdefery conditions to support common dyndbg options:

- CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE
  Supports the feature for a specific file only, by adding ccflags.

- CONFIG_DYNAMIC_DEBUG
  Supports the feature system-wide.

A simple example to enable the feature:

Admin could enable the dump like this (disabled by default):

	echo "func dump_migrate_failure_pages +p" > control

Admin could disable it:

	echo "func dump_migrate_failure_pages =_" > control

Details are in Documentation/admin-guide/dynamic-debug-howto.rst

One concern is that the utility functions used by dump_page use
inconsistent loglevels.

__dump_page: KERN_WARNING
__dump_page_owner: KERN_ALERT
stack_trace_print: KERN_DEFAULT

There are a bunch of places that use these inconsistent-loglevel
utility functions (e.g., just grep dump_page/stack_trace_print).
It's unfortunate but here we are. It could be addressed in a
different patchset.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
* from v2 - https://lore.kernel.org/linux-mm/20210308202047.1903802-1-minchan@kernel.org/
  * remove ratelimit - mhocko

* from v1 - https://lore.kernel.org/linux-mm/20210217163603.429062-1-minchan@kernel.org/
  * use dynamic debugging with system wide instead of per-call site - mhocko

 mm/page_alloc.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)