Message ID | 20210528175403.4506-1-shy828301@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC] mm: dump_page: print total mapcount for compound page |
On Fri, May 28, 2021 at 10:54:03AM -0700, Yang Shi wrote:
> So I prepared this patch to show a possible approach to get some
> feedback. The same thing could be decoded by the reader of page dump
> as well by using the same formula used by this patch. However it sounds
> more convenient to have kernel do the math.

You haven't taken enough things into consideration ...

> +	bool is_slab = PageSlab(head);

We should probably have a separate dump_slab_page(). Almost nothing
in __dump_page() is really useful for slab pages (eg, mapping, index,
mapcount, compound_mapcount, compound_pincount, aops), and the flags
(such as are used) have different meanings.

> +		nr = compound_nr(head);
> +		if (is_slab)
> +			total_mapcount = 0;
> +		else if (PageHuge(head))
> +			total_mapcount = comp_mapcnt;
> +		else {
> +			if (mapping) {
> +				if (!PageAnon(head))
> +					nr = nr * (comp_mapcnt + 1) - comp_mapcnt;
> +			} else
> +				nr = 0;
> +			total_mapcount = refcount - pincount - nr;

I see what you're trying to do here, but there are so many other things
which take a refcount on a page. The LRU, the page cache, private fs
data, random temporary "gets" (eg, buffered reads, buffered writes,
get_user_pages(), readahead, truncate, migration). I think this is
likely to be so inaccurate as to be confusing.

I had to think hard about it though. I like what you're trying to do,
I just don't think it works ;-(

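The objection above can be illustrated with a minimal userspace sketch in C. It is not kernel code: `struct ref_holders` and all three helper names are hypothetical and exist only for this illustration, and the holder categories are simply the ones named in the review. The sketch models an estimator that subtracts only the references it knows about and shows that every other reference is misreported as a mapping.

```c
#include <assert.h>

/*
 * Hypothetical breakdown of who holds references on one page.
 * This is an illustrative model, not a kernel data structure.
 */
struct ref_holders {
	int mappings;	/* references held by page table mappings */
	int page_cache;	/* references held by the page cache */
	int lru;	/* references held on behalf of the LRU */
	int fs_private;	/* references held for private fs data */
	int transient;	/* temporary gets: GUP, readahead, migration, ... */
};

/* The only number a dump routine can observe: the sum over all holders. */
static int observed_refcount(const struct ref_holders *h)
{
	assert(h->mappings >= 0 && h->page_cache >= 0 && h->lru >= 0 &&
	       h->fs_private >= 0 && h->transient >= 0);
	return h->mappings + h->page_cache + h->lru +
	       h->fs_private + h->transient;
}

/*
 * Estimate the number of mappings by subtracting only the references
 * the estimator knows how to account for (here: the page cache).
 */
static int naive_mapcount_estimate(int refcount, int known_refs)
{
	return refcount - known_refs;
}

/* Error of the estimate: each unaccounted reference counts as a mapping. */
static int estimate_error(const struct ref_holders *h)
{
	int est = naive_mapcount_estimate(observed_refcount(h), h->page_cache);

	return est - h->mappings;
}
```

In this model the error is exactly `lru + fs_private + transient`, so the estimate drifts by one for every holder the estimator does not know about.
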
On 5/28/21 10:54 AM, Yang Shi wrote:
> The total mapcount is a useful information for debugging, but we can't
> call total_mapcount() directly since it calls some assertions which may
> be triggered as commit 6dc5ea16c86f ("mm,
> dump_page: do not crash with bad compound_mapcount()") met.
>
> We could implement yet another implementation for dump_page() but
> it has the limitation when individual mapcount of subpages is corrupted.
>
> Actually the total mapcount could be decoded from refcount, pincount and
> compound mapcount although it may be not very precise due to some
> transient references.

If the mapcount calculation were in a separate routine, *and* if something
else in addition to dump_page() used it, then I'd be interested in
calling it from dump_page().

But, just adding a calculation glob like this is not a good idea. If
the reader really needs the calculation, then that person can, as you
say, work it out from the other information.

Debug and dump routines are actually supposed to remain fairly simple,
so that they themselves do not end up with bugs, or stale assumptions
(which this calculation is very much susceptible to). This goes in the
wrong direction.

So best to just not do this, IMHO.

thanks,

On Fri, May 28, 2021 at 11:22 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, May 28, 2021 at 10:54:03AM -0700, Yang Shi wrote:
> > So I prepared this patch to show a possible approach to get some
> > feedback. The same thing could be decoded by the reader of page dump
> > as well by using the same formula used by this patch. However it sounds
> > more convenient to have kernel do the math.
>
> You haven't taken enough things into consideration ...
>
> > +	bool is_slab = PageSlab(head);
>
> We should probably have a separate dump_slab_page(). Almost nothing
> in __dump_page() is really useful for slab pages (eg, mapping, index,
> mapcount, compound_mapcount, compound_pincount, aops), and the flags
> (such as are used) have different meanings.

Yes, slab page dump is missed for a long time.

>
> > +		nr = compound_nr(head);
> > +		if (is_slab)
> > +			total_mapcount = 0;
> > +		else if (PageHuge(head))
> > +			total_mapcount = comp_mapcnt;
> > +		else {
> > +			if (mapping) {
> > +				if (!PageAnon(head))
> > +					nr = nr * (comp_mapcnt + 1) - comp_mapcnt;
> > +			} else
> > +				nr = 0;
> > +			total_mapcount = refcount - pincount - nr;
>
> I see what you're trying to do here, but there are so many other things
> which take a refcount on a page. The LRU, the page cache, private fs
> data, random temporary "gets" (eg, buffered reads, buffered writes,
> get_user_pages(), readahead, truncate, migration). I think this is
> likely to be so inaccurate as to be confusing.

Yes, it is inaccurate in some cases. There is not a simple way to rule
out those random transient references. The page cache has been taken
into account by this patch, but I overlooked private page and LRU
cache cases, they seem simple to filter out by page flags.

>
> I had to think hard about it though. I like what you're trying to do,
> I just don't think it works ;-(

The random transient references are annoying and could make the number
be far away from accuracy. That would be too confusing to be worth
printing the number. But I'm not sure how often or bad it could be.

On Fri, May 28, 2021 at 11:26 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 5/28/21 10:54 AM, Yang Shi wrote:
> > The total mapcount is a useful information for debugging, but we can't
> > call total_mapcount() directly since it calls some assertions which may
> > be triggered as commit 6dc5ea16c86f ("mm,
> > dump_page: do not crash with bad compound_mapcount()") met.
> >
> > We could implement yet another implementation for dump_page() but
> > it has the limitation when individual mapcount of subpages is corrupted.
> >
> > Actually the total mapcount could be decoded from refcount, pincount and
> > compound mapcount although it may be not very precise due to some
> > transient references.
>
> If the mapcount calculation were in a separate routine, *and* if something
> else in addition to dump_page() used it, then I'd be interested in
> calling it from dump_page().

There is. The total_mapcount() is used by mm code. But as I mentioned
in the commit log and that discussion email, it is not safe to call it
directly in dump_page() path.

>
> But, just adding a calculation glob like this is not a good idea. If
> the reader really needs the calculation, then that person can, as you
> say, work it out from the other information.
>
> Debug and dump routines are actually supposed to remain fairly simple,
> so that they themselves do not end up with bugs, or stale assumptions
> (which this calculation is very much susceptible to). This goes in the
> wrong direction.
>
> So best to just not do this, IMHO.
>
> thanks,
> --
> John Hubbard
> NVIDIA
>
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> > I think we are on the same page that the total mapcount is useful
> > information and it would be ideal to print this information when dumpping
> > page if possible. But how to implement it safely seems controversial.
> > Some ideas and potential problems have been discussed by
> > https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2105261733110.16920@eggly.anvils/.
> >
> > So I prepared this patch to show a possible approach to get some
> > feedback. The same thing could be decoded by the reader of page dump
> > as well by using the same formula used by this patch. However it sounds
> > more convenient to have kernel do the math.
> >
> >  mm/debug.c | 35 +++++++++++++++++++++++++++++------
> >  1 file changed, 29 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/debug.c b/mm/debug.c
> > index e73fe0a8ec3d..129efcfcaf79 100644
> > --- a/mm/debug.c
> > +++ b/mm/debug.c
> > @@ -54,8 +54,13 @@ static void __dump_page(struct page *page)
> >  	 * inaccuracy here due to racing.
> >  	 */
> >  	bool page_cma = is_migrate_cma_page(page);
> > -	int mapcount;
> > +	int mapcount, total_mapcount;
> > +	int nr;
> > +	int refcount;
> > +	int pincount = 0;
> > +	int comp_mapcnt;
> >  	char *type = "";
> > +	bool is_slab = PageSlab(head);
> >
> >  	if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
> >  		/*
> > @@ -82,22 +87,40 @@ static void __dump_page(struct page *page)
> >  	 * page->_mapcount space in struct page is used by sl[aou]b pages to
> >  	 * encode own info.
> >  	 */
> > -	mapcount = PageSlab(head) ? 0 : page_mapcount(page);
> > +	mapcount = is_slab ? 0 : page_mapcount(page);
> > +
> > +	refcount = page_ref_count(head);
> >
> >  	pr_warn("page:%p refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
> > -			page, page_ref_count(head), mapcount, mapping,
> > +			page, refcount, mapcount, mapping,
> >  			page_to_pgoff(page), page_to_pfn(page));
> >  	if (compound) {
> > +		comp_mapcnt = head_compound_mapcount(head);
> >  		if (hpage_pincount_available(page)) {
> > +			pincount = head_compound_pincount(head);
> >  			pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
> >  					head, compound_order(head),
> > -					head_compound_mapcount(head),
> > -					head_compound_pincount(head));
> > +					comp_mapcnt, pincount);
> >  		} else {
> >  			pr_warn("head:%p order:%u compound_mapcount:%d\n",
> >  					head, compound_order(head),
> > -					head_compound_mapcount(head));
> > +					comp_mapcnt);
> > +		}
> > +
> > +		nr = compound_nr(head);
> > +		if (is_slab)
> > +			total_mapcount = 0;
> > +		else if (PageHuge(head))
> > +			total_mapcount = comp_mapcnt;
> > +		else {
> > +			if (mapping) {
> > +				if (!PageAnon(head))
> > +					nr = nr * (comp_mapcnt + 1) - comp_mapcnt;
> > +			} else
> > +				nr = 0;
> > +			total_mapcount = refcount - pincount - nr;
> >  		}
> > +		pr_warn("total_mapcount(estimated):%d\n", total_mapcount);
> >  	}
> >
> >  #ifdef CONFIG_MEMCG
> >
>

On Fri, 28 May 2021, Yang Shi wrote:

> The total mapcount is a useful information for debugging, but we can't
> call total_mapcount() directly since it calls some assertions which may
> be triggered as commit 6dc5ea16c86f ("mm,
> dump_page: do not crash with bad compound_mapcount()") met.
>
> We could implement yet another implementation for dump_page() but
> it has the limitation when individual mapcount of subpages is corrupted.
>
> Actually the total mapcount could be decoded from refcount, pincount and
> compound mapcount although it may be not very precise due to some
> transient references.
>
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
> I think we are on the same page that the total mapcount is useful

Well, it may be useful (and used to be shown) in the case we've been
thinking of; but there the critical fact, page_mapped(), is evident
from the fact that your VM_WARN_ON_ONCE_PAGE(page_mapped) is shown at
all: being a number, total_mapcount() tells a little more, but not a lot.

> information and it would be ideal to print this information when dumpping

Yes, I admit I did say "ideal": but not at this cost. I'm sorry for
pointing you down (something like) this path. If total_mapcount()
itself had been assuredly safe, it would have been nice to add in;
but not this substitute.

> page if possible. But how to implement it safely seems controversial.
> Some ideas and potential problems have been discussed by
> https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2105261733110.16920@eggly.anvils/.
>
> So I prepared this patch to show a possible approach to get some
> feedback. The same thing could be decoded by the reader of page dump
> as well by using the same formula used by this patch. However it sounds
> more convenient to have kernel do the math.
>
>  mm/debug.c | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)

Adding that code to come up with a deceptive approximation to a number
which most sites won't care about: speaking for me, I'll say no.

Hugh

On 5/28/21 12:03 PM, Yang Shi wrote:
> On Fri, May 28, 2021 at 11:26 AM John Hubbard <jhubbard@nvidia.com> wrote:
>>
>> On 5/28/21 10:54 AM, Yang Shi wrote:
>>> The total mapcount is a useful information for debugging, but we can't
>>> call total_mapcount() directly since it calls some assertions which may
>>> be triggered as commit 6dc5ea16c86f ("mm,
>>> dump_page: do not crash with bad compound_mapcount()") met.
>>>
>>> We could implement yet another implementation for dump_page() but
>>> it has the limitation when individual mapcount of subpages is corrupted.
>>>
>>> Actually the total mapcount could be decoded from refcount, pincount and
>>> compound mapcount although it may be not very precise due to some
>>> transient references.
>>
>> If the mapcount calculation were in a separate routine, *and* if something
>> else in addition to dump_page() used it, then I'd be interested in
>> calling it from dump_page().
>
> There is. The total_mapcount() is used by mm code. But as I mentioned
> in the commit log and that discussion email, it is not safe to call it
> directly in dump_page() path.
>

Right! I apologize for missing the point that it is a separate function.
But unfortunately, the conclusion here is still the same, if you are unable
to actually call that function.

Basically, if you can make simple calls to retrieve and print out
information, then that's a good situation for diagnostics routines such as
dump_page(). But if you have to open-code calculations and those start to
get complex, then you have probably gone off in the wrong direction.

Keep in mind that the diag routines themselves have to be correct, and
they usually don't have the same level of testing that other routines do.

thanks,

On Fri, May 28, 2021 at 12:47:31PM -0700, Hugh Dickins wrote:
> > page if possible. But how to implement it safely seems controversial.
> > Some ideas and potential problems have been discussed by
> > https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2105261733110.16920@eggly.anvils/.
> >
> > So I prepared this patch to show a possible approach to get some
> > feedback. The same thing could be decoded by the reader of page dump
> > as well by using the same formula used by this patch. However it sounds
> > more convenient to have kernel do the math.
> >
> >  mm/debug.c | 35 +++++++++++++++++++++++++++++------
> >  1 file changed, 29 insertions(+), 6 deletions(-)
>
> Adding that code to come up with a deceptive approximation to a number
> which most sites won't care about: speaking for me, I'll say no.

I agree. The approximation may bring more confusion than help debugging.

diff --git a/mm/debug.c b/mm/debug.c
index e73fe0a8ec3d..129efcfcaf79 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -54,8 +54,13 @@ static void __dump_page(struct page *page)
 	 * inaccuracy here due to racing.
 	 */
 	bool page_cma = is_migrate_cma_page(page);
-	int mapcount;
+	int mapcount, total_mapcount;
+	int nr;
+	int refcount;
+	int pincount = 0;
+	int comp_mapcnt;
 	char *type = "";
+	bool is_slab = PageSlab(head);
 
 	if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) {
 		/*
@@ -82,22 +87,40 @@ static void __dump_page(struct page *page)
 	 * page->_mapcount space in struct page is used by sl[aou]b pages to
 	 * encode own info.
 	 */
-	mapcount = PageSlab(head) ? 0 : page_mapcount(page);
+	mapcount = is_slab ? 0 : page_mapcount(page);
+
+	refcount = page_ref_count(head);
 
 	pr_warn("page:%p refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
-			page, page_ref_count(head), mapcount, mapping,
+			page, refcount, mapcount, mapping,
 			page_to_pgoff(page), page_to_pfn(page));
 	if (compound) {
+		comp_mapcnt = head_compound_mapcount(head);
 		if (hpage_pincount_available(page)) {
+			pincount = head_compound_pincount(head);
 			pr_warn("head:%p order:%u compound_mapcount:%d compound_pincount:%d\n",
 					head, compound_order(head),
-					head_compound_mapcount(head),
-					head_compound_pincount(head));
+					comp_mapcnt, pincount);
 		} else {
 			pr_warn("head:%p order:%u compound_mapcount:%d\n",
 					head, compound_order(head),
-					head_compound_mapcount(head));
+					comp_mapcnt);
+		}
+
+		nr = compound_nr(head);
+		if (is_slab)
+			total_mapcount = 0;
+		else if (PageHuge(head))
+			total_mapcount = comp_mapcnt;
+		else {
+			if (mapping) {
+				if (!PageAnon(head))
+					nr = nr * (comp_mapcnt + 1) - comp_mapcnt;
+			} else
+				nr = 0;
+			total_mapcount = refcount - pincount - nr;
 		}
+		pr_warn("total_mapcount(estimated):%d\n", total_mapcount);
 	}
 
 #ifdef CONFIG_MEMCG

The total mapcount is a useful information for debugging, but we can't
call total_mapcount() directly since it calls some assertions which may
be triggered as commit 6dc5ea16c86f ("mm,
dump_page: do not crash with bad compound_mapcount()") met.

We could implement yet another implementation for dump_page() but
it has the limitation when individual mapcount of subpages is corrupted.

Actually the total mapcount could be decoded from refcount, pincount and
compound mapcount although it may be not very precise due to some
transient references.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
I think we are on the same page that the total mapcount is useful
information and it would be ideal to print this information when dumpping
page if possible. But how to implement it safely seems controversial.
Some ideas and potential problems have been discussed by
https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2105261733110.16920@eggly.anvils/.

So I prepared this patch to show a possible approach to get some
feedback. The same thing could be decoded by the reader of page dump
as well by using the same formula used by this patch. However it sounds
more convenient to have kernel do the math.

 mm/debug.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)
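
For a reader who wants to decode the number by hand, as the message suggests, the arithmetic of this RFC can be written as a standalone userspace sketch in C. This is only a restatement of the formula in the diff, under the same assumptions the patch makes. `struct page_counts` and `estimate_total_mapcount()` are hypothetical names introduced here, and each field stands in for the kernel accessor named in its comment. The reviewers in this thread consider the result too imprecise to print, so treat the output as a rough estimate at best.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical snapshot of the values __dump_page() reads in the patch.
 * Illustrative only; this is not a kernel type.
 */
struct page_counts {
	int refcount;		/* page_ref_count(head) */
	int pincount;		/* head_compound_pincount(head), 0 if unavailable */
	int compound_mapcount;	/* head_compound_mapcount(head) */
	int nr_pages;		/* compound_nr(head) */
	bool is_slab;		/* PageSlab(head) */
	bool is_hugetlb;	/* PageHuge(head) */
	bool has_mapping;	/* mapping != NULL */
	bool is_anon;		/* PageAnon(head) */
};

/* Same branches and arithmetic as the compound-page block in the diff. */
static int estimate_total_mapcount(const struct page_counts *c)
{
	int nr = c->nr_pages;

	assert(c->nr_pages > 0);

	if (c->is_slab)
		return 0;
	if (c->is_hugetlb)
		return c->compound_mapcount;

	if (c->has_mapping) {
		/* Only the non-anonymous case rescales nr in the patch. */
		if (!c->is_anon)
			nr = nr * (c->compound_mapcount + 1) -
			     c->compound_mapcount;
	} else {
		nr = 0;
	}

	return c->refcount - c->pincount - nr;
}
```

As a purely arithmetic illustration with arbitrary inputs: for a non-anonymous compound page with a mapping, 512 subpages and compound_mapcount 1, the formula subtracts 512 * 2 - 1 = 1023 (plus the pincount) from the refcount and reports the remainder as the estimated total mapcount.
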