diff mbox series

[v2,1/8] mm: dump_page: print head page's refcount, for compound pages

Message ID 20200129032417.3085670-2-jhubbard@nvidia.com (mailing list archive)
State New
Headers show
Series mm/gup: track FOLL_PIN pages (follow on from v12) | expand

Commit Message

John Hubbard Jan. 29, 2020, 3:24 a.m. UTC
When debugging a problem that involves compound pages, it is extremely
helpful if dump_page() reports not only the page->_refcount, but also
the refcount of the head page of the compound page. That's because the
head page collects refcounts for the entire compound page.

Therefore, enhance dump_page() so as to print out the refcount of the
head page of a compound page.

This approach (printing information about a struct page that is not the
struct page that was passed into dump_page()) has a precedent:
compound_mapcount is already being printed.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 mm/debug.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Kirill A. Shutemov Jan. 29, 2020, 11:25 a.m. UTC | #1
On Tue, Jan 28, 2020 at 07:24:10PM -0800, John Hubbard wrote:
> When debugging a problem that involves compound pages, it is extremely
> helpful if dump_page() reports not only the page->_refcount, but also
> the refcount of the head page of the compound page. That's because the
> head page collects refcounts for the entire compound page.
> 
> Therefore, enhance dump_page() so as to print out the refcount of the
> head page of a compound page.
> 
> This approach (printing information about a struct page that is not the
> struct page that was passed into dump_page()) has a precedent:
> compound_mapcount is already being printed.

refcount on a tail must always be 0. I think we should only print it when
it is non-zero, emphasizing this fact with a standalone message.
John Hubbard Jan. 29, 2020, 10:26 p.m. UTC | #2
On 1/29/20 3:25 AM, Kirill A. Shutemov wrote:
> On Tue, Jan 28, 2020 at 07:24:10PM -0800, John Hubbard wrote:
>> When debugging a problem that involves compound pages, it is extremely
>> helpful if dump_page() reports not only the page->_refcount, but also
>> the refcount of the head page of the compound page. That's because the
>> head page collects refcounts for the entire compound page.
>>
>> Therefore, enhance dump_page() so as to print out the refcount of the
>> head page of a compound page.
>>
>> This approach (printing information about a struct page that is not the
>> struct page that was passed into dump_page()) has a precedent:
>> compound_mapcount is already being printed.
> 
> refcount on a tail must always be 0. I think we should only print it when
> it is non-zero, emphasizing this fact with a standalone message.
> 

Hi Kirill,

Yes, good point, that sounds like just the right balance. And it avoids adding 
a new item to print in the common case, which is very nice. Here's what I've
changed it to for the next version (I'll also rewrite the commit description 
accordingly):


diff --git a/mm/debug.c b/mm/debug.c
index a90da5337c14..3a45e2b77de0 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -75,12 +75,17 @@ void __dump_page(struct page *page, const char *reason)
 	 */
 	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
 
-	if (PageCompound(page))
-		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
-			"index:%#lx compound_mapcount: %d\n",
-			page, page_ref_count(page), mapcount,
+	if (PageCompound(page)) {
+		pr_warn("page:%px compound refcount:%d mapcount:%d mapping:%px "
+			"index:%#lx compound_mapcount:%d\n",
+			page, page_ref_count(compound_head(page)), mapcount,
 			page->mapping, page_to_pgoff(page),
 			compound_mapcount(page));
+
+		if (page != compound_head(page) && page_ref_count(page) != 0)
+			pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on "
+				"this tail page\n", page, page_ref_count(page));
+	}
 	else
 		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx\n",
 			page, page_ref_count(page), mapcount,


thanks,
Matthew Wilcox Jan. 29, 2020, 10:59 p.m. UTC | #3
On Wed, Jan 29, 2020 at 02:26:06PM -0800, John Hubbard wrote:
> On 1/29/20 3:25 AM, Kirill A. Shutemov wrote:
> > On Tue, Jan 28, 2020 at 07:24:10PM -0800, John Hubbard wrote:
> >> When debugging a problem that involves compound pages, it is extremely
> >> helpful if dump_page() reports not only the page->_refcount, but also
> >> the refcount of the head page of the compound page. That's because the
> >> head page collects refcounts for the entire compound page.
> >>
> >> Therefore, enhance dump_page() so as to print out the refcount of the
> >> head page of a compound page.
> >>
> >> This approach (printing information about a struct page that is not the
> >> struct page that was passed into dump_page()) has a precedent:
> >> compound_mapcount is already being printed.
> > 
> > refcount on a tail must always be 0. I think we should only print it when
> > it is non-zero, emphasizing this fact with a standalone message.
> > 
> 
> Hi Kirill,
> 
> Yes, good point, that sounds like just the right balance. And it avoids adding 
> a new item to print in the common case, which is very nice. Here's what I've
> changed it to for the next version (I'll also rewrite the commit description 
> accordingly):
> 
> 
> diff --git a/mm/debug.c b/mm/debug.c
> index a90da5337c14..3a45e2b77de0 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -75,12 +75,17 @@ void __dump_page(struct page *page, const char *reason)
>  	 */
>  	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
>  
> -	if (PageCompound(page))
> -		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
> -			"index:%#lx compound_mapcount: %d\n",
> -			page, page_ref_count(page), mapcount,
> +	if (PageCompound(page)) {
> +		pr_warn("page:%px compound refcount:%d mapcount:%d mapping:%px "
> +			"index:%#lx compound_mapcount:%d\n",
> +			page, page_ref_count(compound_head(page)), mapcount,
>  			page->mapping, page_to_pgoff(page),
>  			compound_mapcount(page));
> +
> +		if (page != compound_head(page) && page_ref_count(page) != 0)
> +			pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on "
> +				"this tail page\n", page, page_ref_count(page));
> +	}
>  	else
>  		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx\n",
>  			page, page_ref_count(page), mapcount,

I have a hunk in my current tree which looks like this:

@@ -77,6 +77,11 @@ void __dump_page(struct page *page, const char *reason)
                pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx
\n",
                        page, page_ref_count(page), mapcount,
                        page->mapping, page_to_pgoff(page));
+       if (PageTail(page)) {
+               struct page *head = compound_head(page);
+               pr_warn("head:%px mapping:%px index:%#lx\n",
+                       head, head->mapping, page_to_pgoff(head));
+       }
        if (PageKsm(page))
                pr_warn("ksm flags: %#lx(%pGp)\n", page->flags, &page->flags);
        else if (PageAnon(page))

I wonder if we can combine these two patches in some more useful way?

I also think we probably want a sanity check that 'head' and 'page'
are within a sane range of each other (ie head < page and head +
MAX_ORDER_NR_PAGES > page) to protect against a struct page that contains
complete garbage.
John Hubbard Jan. 30, 2020, 6:23 a.m. UTC | #4
On 1/29/20 2:59 PM, Matthew Wilcox wrote:
...
> I have a hunk in my current tree which looks like this:
> 
> @@ -77,6 +77,11 @@ void __dump_page(struct page *page, const char *reason)
>                  pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx
> \n",
>                          page, page_ref_count(page), mapcount,
>                          page->mapping, page_to_pgoff(page));
> +       if (PageTail(page)) {
> +               struct page *head = compound_head(page);
> +               pr_warn("head:%px mapping:%px index:%#lx\n",
> +                       head, head->mapping, page_to_pgoff(head));
> +       }
>          if (PageKsm(page))
>                  pr_warn("ksm flags: %#lx(%pGp)\n", page->flags, &page->flags);
>          else if (PageAnon(page))
> 
> I wonder if we can combine these two patches in some more useful way?
> 
> I also think we probably want a sanity check that 'head' and 'page'
> are within a sane range of each other (ie head < page and head +
> MAX_ORDER_NR_PAGES > page) to protect against a struct page that contains
> complete garbage.
> 

OK, here's a go at combining those. I like the observation, implicit in your
diffs, that PageTail rather than PageCompound is the key differentiator in
deciding what to print. How's this look:

diff --git a/mm/debug.c b/mm/debug.c
index a90da5337c14..944652843e7b 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -75,12 +75,31 @@ void __dump_page(struct page *page, const char *reason)
  	 */
  	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
  
-	if (PageCompound(page))
-		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
-			"index:%#lx compound_mapcount: %d\n",
-			page, page_ref_count(page), mapcount,
-			page->mapping, page_to_pgoff(page),
-			compound_mapcount(page));
+	if (PageTail(page)) {
+		struct page *head = compound_head(page);
+
+		if ((page < head) || (page >= head + MAX_ORDER_NR_PAGES)) {
+			/*
+			 * Page is hopelessly corrupted, so limit any reporting
+			 * to information about the page itself. Do not attempt
+			 * to look at the head page.
+			 */
+			pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
+				"index:%#lx (corrupted tail page case)\n",
+				page, page_ref_count(page), mapcount,
+				page->mapping, page_to_pgoff(page));
+		} else {
+			pr_warn("page:%px compound refcount:%d mapcount:%d "
+				"mapping:%px index:%#lx compound_mapcount:%d\n",
+				page, page_ref_count(head),
+				mapcount, head->mapping, page_to_pgoff(head),
+				compound_mapcount(page));
+
+			if (page_ref_count(page) != 0)
+				pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on "
+					"this tail page\n", page, page_ref_count(page));
+		}
+	}
  	else
  		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx\n",
  			page, page_ref_count(page), mapcount,

?

Here's sample output for a normal page, a tail page, and a tail page with a bad
(non-zero) refcount:

============
Normal page:
============
[   38.572084] page:ffffea0011465880 refcount:2 mapcount:1 mapping:ffff888454d99001 index:0xb2
[   38.579256] anon flags: 0x17ffe0000080036(referenced|uptodate|lru|active|swapbacked)
[   38.585799] raw: 017ffe0000080036 ffffea0011460fc8 ffffea0011466d08 ffff888454d99001
[   38.592350] raw: 00000000000000b2 0000000000000000 0000000200000000 0000000000000000
[   38.598885] page dumped because: test dump page


==========
Tail page:
==========
[   38.436384] page:ffffea0010aa0280 compound refcount:503 mapcount:1 mapping:ffff888455fb3399 index:0xa8 compound_mapcount:1
[   38.446350] anon flags: 0x17ffe0000000000()
[   38.449661] raw: 017ffe0000000000 ffffea0010aa0001 ffffea0010aa0288 dead000000000400
[   38.456228] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[   38.462794] page dumped because: test dump page

============================
Tail page with bad refcount:
============================
[   38.466088] page:ffffea0010aa0b40 compound refcount:468 mapcount:1 mapping:ffff888455fb3399 index:0xa8 compound_mapcount:1
[   38.475967] page:ffffea0010aa0b40 PROBLEM: non-zero refcount (==2) on this tail page
[   38.482490] anon flags: 0x17ffe0000000000()
[   38.485432] raw: 017ffe0000000000 ffffea0010aa0001 ffffea0010aa0b48 dead000000000400
[   38.491996] raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
[   38.498532] page dumped because: test bad tail page refcount



thanks,
John Hubbard Jan. 30, 2020, 6:30 a.m. UTC | #5
On 1/29/20 10:23 PM, John Hubbard wrote:
> On 1/29/20 2:59 PM, Matthew Wilcox wrote:
> ...
>> I have a hunk in my current tree which looks like this:
>>
>> @@ -77,6 +77,11 @@ void __dump_page(struct page *page, const char *reason)
>>                  pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx
>> \n",
>>                          page, page_ref_count(page), mapcount,
>>                          page->mapping, page_to_pgoff(page));
>> +       if (PageTail(page)) {
>> +               struct page *head = compound_head(page);
>> +               pr_warn("head:%px mapping:%px index:%#lx\n",
>> +                       head, head->mapping, page_to_pgoff(head));
>> +       }
>>          if (PageKsm(page))
>>                  pr_warn("ksm flags: %#lx(%pGp)\n", page->flags, &page->flags);
>>          else if (PageAnon(page))
>>
>> I wonder if we can combine these two patches in some more useful way?
>>
>> I also think we probably want a sanity check that 'head' and 'page'
>> are within a sane range of each other (ie head < page and head +
>> MAX_ORDER_NR_PAGES > page) to protect against a struct page that contains
>> complete garbage.
>>
> 
> OK, here's a go at combining those. I like the observation, implicit in your
> diffs, that PageTail rather than PageCompound is the key differentiator in
> deciding what to print. How's this look:
> 
> diff --git a/mm/debug.c b/mm/debug.c
> index a90da5337c14..944652843e7b 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -75,12 +75,31 @@ void __dump_page(struct page *page, const char *reason)
>        */
>       mapcount = PageSlab(page) ? 0 : page_mapcount(page);
> 
> -    if (PageCompound(page))
> -        pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
> -            "index:%#lx compound_mapcount: %d\n",
> -            page, page_ref_count(page), mapcount,
> -            page->mapping, page_to_pgoff(page),
> -            compound_mapcount(page));
> +    if (PageTail(page)) {
> +        struct page *head = compound_head(page);
> +
> +        if ((page < head) || (page >= head + MAX_ORDER_NR_PAGES)) {
> +            /*
> +             * Page is hopelessly corrupted, so limit any reporting
> +             * to information about the page itself. Do not attempt
> +             * to look at the head page.
> +             */
> +            pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
> +                "index:%#lx (corrupted tail page case)\n",
> +                page, page_ref_count(page), mapcount,
> +                page->mapping, page_to_pgoff(page));
> +        } else {
> +            pr_warn("page:%px compound refcount:%d mapcount:%d "
> +                "mapping:%px index:%#lx compound_mapcount:%d\n",
> +                page, page_ref_count(head),
> +                mapcount, head->mapping, page_to_pgoff(head),
> +                compound_mapcount(page));
> +
> +            if (page_ref_count(page) != 0)
> +                pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on "
> +                    "this tail page\n", page, page_ref_count(page));

...ahem, I sorta botched the above statement, because that should
be outside (just below) the "else" statement--it can be done whether or
not the page fails the safety/bounds check. :)

thanks,
diff mbox series

Patch

diff --git a/mm/debug.c b/mm/debug.c
index a90da5337c14..4cc6cad8385d 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -76,9 +76,11 @@  void __dump_page(struct page *page, const char *reason)
 	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
 
 	if (PageCompound(page))
-		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
-			"index:%#lx compound_mapcount: %d\n",
-			page, page_ref_count(page), mapcount,
+		pr_warn("page:%px refcount:%d head refcount:%d "
+			"mapcount:%d mapping:%px index:%#lx "
+			"compound_mapcount:%d\n",
+			page, page_ref_count(page),
+			page_ref_count(compound_head(page)), mapcount,
 			page->mapping, page_to_pgoff(page),
 			compound_mapcount(page));
 	else