mm: vmscan: correct nr_reclaimed for THP

Message ID 1557447392-61607-1-git-send-email-yang.shi@linux.alibaba.com (mailing list archive)
State New, archived
Series mm: vmscan: correct nr_reclaimed for THP

Commit Message

Yang Shi May 10, 2019, 12:16 a.m. UTC
Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
swapped out"), a THP can be swapped out as a whole.  But nr_reclaimed
still only gets incremented by one even though a whole THP (512 pages)
gets swapped out.

This makes little sense for memory reclaim.  For example, direct reclaim
may only need to reclaim SWAP_CLUSTER_MAX pages, and reclaiming one THP
could fulfill that.  But if nr_reclaimed is not increased correctly,
direct reclaim may just waste time reclaiming more pages,
SWAP_CLUSTER_MAX * 512 pages in the worst case.

This change may result in more reclaimed pages than scanned pages as
shown by /proc/vmstat, since scanning one head page would reclaim 512
base pages.

Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
I'm not quite sure whether this was the intended behavior or just an
omission. I tried to dig into the review history, but didn't find any
clue. I may have missed some discussion.

 mm/vmscan.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
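The worst case called out above can be sketched in plain userspace C. Note the concrete values (SWAP_CLUSTER_MAX = 32, 512 base pages per 2 MiB THP on 4 KiB pages) are the common x86_64 configuration and are assumptions for illustration, not taken from the patch itself:

```c
#include <assert.h>

/* Common x86_64 values at the time of this patch (assumptions for
 * illustration): 4 KiB base pages, 2 MiB THP. */
#define SWAP_CLUSTER_MAX 32UL
#define HPAGE_PMD_NR     512UL  /* 2 MiB THP / 4 KiB base page */

/* Base pages freed in the worst case when direct reclaim, needing
 * only SWAP_CLUSTER_MAX pages, miscounts each reclaimed THP as a
 * single page: it keeps going until it has seen SWAP_CLUSTER_MAX
 * "reclaimed" events, each of which actually freed HPAGE_PMD_NR
 * base pages. */
static unsigned long worst_case_overreclaim(void)
{
	return SWAP_CLUSTER_MAX * HPAGE_PMD_NR;
}
```

With 4 KiB base pages that is 16384 base pages, roughly 64 MiB reclaimed where only 128 KiB was asked for.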

Comments

Shakeel Butt May 10, 2019, 12:39 a.m. UTC | #1
From: Yang Shi <yang.shi@linux.alibaba.com>
Date: Thu, May 9, 2019 at 5:16 PM
To: <ying.huang@intel.com>, <hannes@cmpxchg.org>, <mhocko@suse.com>,
<mgorman@techsingularity.net>, <kirill.shutemov@linux.intel.com>,
<hughd@google.com>, <akpm@linux-foundation.org>
Cc: <yang.shi@linux.alibaba.com>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>

> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
> swapped out"), THP can be swapped out in a whole.  But, nr_reclaimed
> still gets inc'ed by one even though a whole THP (512 pages) gets
> swapped out.
>
> This doesn't make too much sense to memory reclaim.  For example, direct
> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP
> could fulfill it.  But, if nr_reclaimed is not increased correctly,
> direct reclaim may just waste time to reclaim more pages,
> SWAP_CLUSTER_MAX * 512 pages in worst case.
>
> This change may result in more reclaimed pages than scanned pages showed
> by /proc/vmstat since scanning one head page would reclaim 512 base pages.
>
> Cc: "Huang, Ying" <ying.huang@intel.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Hugh Dickins <hughd@google.com>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>

Nice find.

Reviewed-by: Shakeel Butt <shakeelb@google.com>

> ---
> I'm not quite sure if it was the intended behavior or just omission. I tried
> to dig into the review history, but didn't find any clue. I may miss some
> discussion.
>
>  mm/vmscan.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fd9de50..7e026ec 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1446,7 +1446,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>
>                 unlock_page(page);
>  free_it:
> -               nr_reclaimed++;
> +               /*
> +                * THP may get swapped out in a whole, need account
> +                * all base pages.
> +                */
> +               nr_reclaimed += (1 << compound_order(page));
>
>                 /*
>                  * Is there need to periodically free_page_list? It would
> --
> 1.8.3.1
>
Huang, Ying May 10, 2019, 2:12 a.m. UTC | #2
Yang Shi <yang.shi@linux.alibaba.com> writes:

> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
> swapped out"), THP can be swapped out in a whole.  But, nr_reclaimed
> still gets inc'ed by one even though a whole THP (512 pages) gets
> swapped out.
>
> This doesn't make too much sense to memory reclaim.  For example, direct
> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP
> could fulfill it.  But, if nr_reclaimed is not increased correctly,
> direct reclaim may just waste time to reclaim more pages,
> SWAP_CLUSTER_MAX * 512 pages in worst case.
>
> This change may result in more reclaimed pages than scanned pages showed
> by /proc/vmstat since scanning one head page would reclaim 512 base pages.
>
> Cc: "Huang, Ying" <ying.huang@intel.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Hugh Dickins <hughd@google.com>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
> I'm not quite sure if it was the intended behavior or just omission. I tried
> to dig into the review history, but didn't find any clue. I may miss some
> discussion.
>
>  mm/vmscan.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fd9de50..7e026ec 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1446,7 +1446,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  
>  		unlock_page(page);
>  free_it:
> -		nr_reclaimed++;
> +		/* 
> +		 * THP may get swapped out in a whole, need account
> +		 * all base pages.
> +		 */
> +		nr_reclaimed += (1 << compound_order(page));
>  
>  		/*
>  		 * Is there need to periodically free_page_list? It would

Good catch!  Thanks!

How about changing this to


        nr_reclaimed += hpage_nr_pages(page);

Best Regards,
Huang, Ying
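For reference, the helper Huang suggests can be modeled in userspace roughly as below; the page_model struct and the HPAGE_PMD_NR value are simplified assumptions for illustration, not the kernel's actual definitions (the real helper checks the compound page flags):

```c
#include <assert.h>
#include <stdbool.h>

#define HPAGE_PMD_NR 512  /* assuming 2 MiB THP on 4 KiB base pages */

/* Simplified stand-in for struct page; the kernel tests the
 * transparent-huge page flag instead of a plain boolean. */
struct page_model {
	bool trans_huge;
};

/* Models hpage_nr_pages(): HPAGE_PMD_NR for a THP head page, 1 for a
 * normal page.  With CONFIG_TRANSPARENT_HUGEPAGE disabled the kernel
 * version compiles down to a constant 1, which is why the increment
 * below reduces to nr_reclaimed += 1 in that configuration. */
static int hpage_nr_pages_model(const struct page_model *page)
{
	return page->trans_huge ? HPAGE_PMD_NR : 1;
}
```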
Yang Shi May 10, 2019, 2:25 a.m. UTC | #3
On 5/9/19 7:12 PM, Huang, Ying wrote:
> Yang Shi <yang.shi@linux.alibaba.com> writes:
>
>> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
>> swapped out"), THP can be swapped out in a whole.  But, nr_reclaimed
>> still gets inc'ed by one even though a whole THP (512 pages) gets
>> swapped out.
>>
>> This doesn't make too much sense to memory reclaim.  For example, direct
>> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP
>> could fulfill it.  But, if nr_reclaimed is not increased correctly,
>> direct reclaim may just waste time to reclaim more pages,
>> SWAP_CLUSTER_MAX * 512 pages in worst case.
>>
>> This change may result in more reclaimed pages than scanned pages showed
>> by /proc/vmstat since scanning one head page would reclaim 512 base pages.
>>
>> Cc: "Huang, Ying" <ying.huang@intel.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>> I'm not quite sure if it was the intended behavior or just omission. I tried
>> to dig into the review history, but didn't find any clue. I may miss some
>> discussion.
>>
>>   mm/vmscan.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index fd9de50..7e026ec 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1446,7 +1446,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>   
>>   		unlock_page(page);
>>   free_it:
>> -		nr_reclaimed++;
>> +		/*
>> +		 * THP may get swapped out in a whole, need account
>> +		 * all base pages.
>> +		 */
>> +		nr_reclaimed += (1 << compound_order(page));
>>   
>>   		/*
>>   		 * Is there need to periodically free_page_list? It would
> Good catch!  Thanks!
>
> How about to change this to
>
>
>          nr_reclaimed += hpage_nr_pages(page);

Either is fine by me. Is this faster than "1 << compound_order(page)"?

>
> Best Regards,
> Huang, Ying
Huang, Ying May 10, 2019, 3:03 a.m. UTC | #4
Yang Shi <yang.shi@linux.alibaba.com> writes:

> On 5/9/19 7:12 PM, Huang, Ying wrote:
>> Yang Shi <yang.shi@linux.alibaba.com> writes:
>>
>>> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
>>> swapped out"), THP can be swapped out in a whole.  But, nr_reclaimed
>>> still gets inc'ed by one even though a whole THP (512 pages) gets
>>> swapped out.
>>>
>>> This doesn't make too much sense to memory reclaim.  For example, direct
>>> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP
>>> could fulfill it.  But, if nr_reclaimed is not increased correctly,
>>> direct reclaim may just waste time to reclaim more pages,
>>> SWAP_CLUSTER_MAX * 512 pages in worst case.
>>>
>>> This change may result in more reclaimed pages than scanned pages showed
>>> by /proc/vmstat since scanning one head page would reclaim 512 base pages.
>>>
>>> Cc: "Huang, Ying" <ying.huang@intel.com>
>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Mel Gorman <mgorman@techsingularity.net>
>>> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
>>> Cc: Hugh Dickins <hughd@google.com>
>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>> ---
>>> I'm not quite sure if it was the intended behavior or just omission. I tried
>>> to dig into the review history, but didn't find any clue. I may miss some
>>> discussion.
>>>
>>>   mm/vmscan.c | 6 +++++-
>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index fd9de50..7e026ec 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1446,7 +1446,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>     		unlock_page(page);
>>>   free_it:
>>> -		nr_reclaimed++;
>>> +		/*
>>> +		 * THP may get swapped out in a whole, need account
>>> +		 * all base pages.
>>> +		 */
>>> +		nr_reclaimed += (1 << compound_order(page));
>>>     		/*
>>>   		 * Is there need to periodically free_page_list? It would
>> Good catch!  Thanks!
>>
>> How about to change this to
>>
>>
>>          nr_reclaimed += hpage_nr_pages(page);
>
> Either is fine to me. Is this faster than "1 << compound_order(page)"?

I think the readability is a little better.  And this will become

        nr_reclaimed += 1

if CONFIG_TRANSPARENT_HUGEPAGE is disabled.

Best Regards,
Huang, Ying

>>
>> Best Regards,
>> Huang, Ying
William Kucharski May 10, 2019, 4:33 a.m. UTC | #5
> On May 9, 2019, at 9:03 PM, Huang, Ying <ying.huang@intel.com> wrote:
> 
> Yang Shi <yang.shi@linux.alibaba.com> writes:
> 
>> On 5/9/19 7:12 PM, Huang, Ying wrote:
>>> 
>>> How about to change this to
>>> 
>>> 
>>>         nr_reclaimed += hpage_nr_pages(page);
>> 
>> Either is fine to me. Is this faster than "1 << compound_order(page)"?
> 
> I think the readability is a little better.  And this will become
> 
>        nr_reclaimed += 1
> 
> if CONFIG_TRANSPARENT_HUGEPAGE is disabled.

I find this more legible and self-documenting, and it avoids the bit shift
operation completely on the majority of systems where THP is not configured.
Yang Shi May 10, 2019, 3:41 p.m. UTC | #6
On 5/9/19 8:03 PM, Huang, Ying wrote:
> Yang Shi <yang.shi@linux.alibaba.com> writes:
>
>> On 5/9/19 7:12 PM, Huang, Ying wrote:
>>> Yang Shi <yang.shi@linux.alibaba.com> writes:
>>>
>>>> Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
>>>> swapped out"), THP can be swapped out in a whole.  But, nr_reclaimed
>>>> still gets inc'ed by one even though a whole THP (512 pages) gets
>>>> swapped out.
>>>>
>>>> This doesn't make too much sense to memory reclaim.  For example, direct
>>>> reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP
>>>> could fulfill it.  But, if nr_reclaimed is not increased correctly,
>>>> direct reclaim may just waste time to reclaim more pages,
>>>> SWAP_CLUSTER_MAX * 512 pages in worst case.
>>>>
>>>> This change may result in more reclaimed pages than scanned pages showed
>>>> by /proc/vmstat since scanning one head page would reclaim 512 base pages.
>>>>
>>>> Cc: "Huang, Ying" <ying.huang@intel.com>
>>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>> Cc: Mel Gorman <mgorman@techsingularity.net>
>>>> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
>>>> Cc: Hugh Dickins <hughd@google.com>
>>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>>> ---
>>>> I'm not quite sure if it was the intended behavior or just omission. I tried
>>>> to dig into the review history, but didn't find any clue. I may miss some
>>>> discussion.
>>>>
>>>>    mm/vmscan.c | 6 +++++-
>>>>    1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index fd9de50..7e026ec 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1446,7 +1446,11 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>>      		unlock_page(page);
>>>>    free_it:
>>>> -		nr_reclaimed++;
>>>> +		/*
>>>> +		 * THP may get swapped out in a whole, need account
>>>> +		 * all base pages.
>>>> +		 */
>>>> +		nr_reclaimed += (1 << compound_order(page));
>>>>      		/*
>>>>    		 * Is there need to periodically free_page_list? It would
>>> Good catch!  Thanks!
>>>
>>> How about to change this to
>>>
>>>
>>>           nr_reclaimed += hpage_nr_pages(page);
>> Either is fine to me. Is this faster than "1 << compound_order(page)"?
> I think the readability is a little better.  And this will become
>
>          nr_reclaimed += 1
>
> if CONFIG_TRANSPARENT_HUGEPAGE is disabled.

Good point. Will update in v2 soon.

>
> Best Regards,
> Huang, Ying
>
>>> Best Regards,
>>> Huang, Ying
Yang Shi May 10, 2019, 3:48 p.m. UTC | #7
On 5/9/19 9:33 PM, William Kucharski wrote:
>
>> On May 9, 2019, at 9:03 PM, Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Yang Shi <yang.shi@linux.alibaba.com> writes:
>>
>>> On 5/9/19 7:12 PM, Huang, Ying wrote:
>>>> How about to change this to
>>>>
>>>>
>>>>          nr_reclaimed += hpage_nr_pages(page);
>>> Either is fine to me. Is this faster than "1 << compound_order(page)"?
>> I think the readability is a little better.  And this will become
>>
>>         nr_reclaimed += 1
>>
>> if CONFIG_TRANSPARENT_HUGEPAGE is disabled.
> I find this more legible and self documenting, and it avoids the bit shift
> operation completely on the majority of systems where THP is not configured.

Yes, I do agree. Thanks for the suggestion.
>
Matthew Wilcox (Oracle) May 10, 2019, 4:36 p.m. UTC | #8
On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
> > +		nr_reclaimed += (1 << compound_order(page));
> 
> How about to change this to
> 
> 
>         nr_reclaimed += hpage_nr_pages(page);

Please don't.  That embeds the knowledge that we can only swap out either 
normal pages or THP sized pages.  I'm trying to make the VM capable of 
supporting arbitrary-order pages, and this would be just one more place
to fix.

I'm sympathetic to the "self documenting" argument.  My current tree has
a patch in it:

    mm: Introduce compound_nr
    
    Replace 1 << compound_order(page) with compound_nr(page).  Minor
    improvements in readability.

It goes along with this patch:

    mm: Introduce page_size()

    It's unnecessarily hard to find out the size of a potentially huge page.
    Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).

Better suggestions on naming gratefully received.  I'm more happy with 
page_size() than I am with compound_nr().  page_nr() gives the wrong
impression; page_count() isn't great either.
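A minimal userspace sketch of the two helpers described in this message (they later landed upstream; PAGE_SHIFT = 12 and the page_model struct here are assumptions for illustration, not the kernel's real types):

```c
#include <assert.h>

#define PAGE_SHIFT 12  /* assuming 4 KiB base pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-in for struct page, carrying only the compound
 * order (0 for a normal page, 9 for a 2 MiB THP on x86_64). */
struct page_model {
	unsigned int order;  /* what compound_order(page) would return */
};

/* Models compound_nr(): number of base pages in a possibly-compound
 * page, i.e. 1 << compound_order(page). */
static unsigned long compound_nr_model(const struct page_model *page)
{
	return 1UL << page->order;
}

/* Models page_size(): size of the page in bytes,
 * i.e. PAGE_SIZE << compound_order(page). */
static unsigned long page_size_model(const struct page_model *page)
{
	return PAGE_SIZE << page->order;
}
```

Unlike hpage_nr_pages(), both helpers work for any compound order, which is what makes them suitable for arbitrary-order pages.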
Yang Shi May 10, 2019, 4:50 p.m. UTC | #9
On 5/10/19 9:36 AM, Matthew Wilcox wrote:
> On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
>>> +		nr_reclaimed += (1 << compound_order(page));
>> How about to change this to
>>
>>
>>          nr_reclaimed += hpage_nr_pages(page);
> Please don't.  That embeds the knowledge that we can only swap out either
> normal pages or THP sized pages.  I'm trying to make the VM capable of
> supporting arbitrary-order pages, and this would be just one more place
> to fix.
>
> I'm sympathetic to the "self documenting" argument.  My current tree has
> a patch in it:
>
>      mm: Introduce compound_nr
>      
>      Replace 1 << compound_order(page) with compound_nr(page).  Minor
>      improvements in readability.
>
> It goes along with this patch:
>
>      mm: Introduce page_size()
>
>      It's unnecessarily hard to find out the size of a potentially huge page.
>      Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).

So you prefer keeping "1 << compound_order" as v1 did? Then you will
convert all "1 << compound_order" to compound_nr?

>
> Better suggestions on naming gratefully received.  I'm more happy with
> page_size() than I am with compound_nr().  page_nr() gives the wrong
> impression; page_count() isn't great either.
Matthew Wilcox (Oracle) May 10, 2019, 4:52 p.m. UTC | #10
On Fri, May 10, 2019 at 09:50:04AM -0700, Yang Shi wrote:
> On 5/10/19 9:36 AM, Matthew Wilcox wrote:
> > On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
> > > > +		nr_reclaimed += (1 << compound_order(page));
> > > How about to change this to
> > > 
> > >          nr_reclaimed += hpage_nr_pages(page);
> > Please don't.  That embeds the knowledge that we can only swap out either
> > normal pages or THP sized pages.  I'm trying to make the VM capable of
> > supporting arbitrary-order pages, and this would be just one more place
> > to fix.
> > 
> > I'm sympathetic to the "self documenting" argument.  My current tree has
> > a patch in it:
> > 
> >      mm: Introduce compound_nr
> >      Replace 1 << compound_order(page) with compound_nr(page).  Minor
> >      improvements in readability.
> > 
> > It goes along with this patch:
> > 
> >      mm: Introduce page_size()
> > 
> >      It's unnecessarily hard to find out the size of a potentially huge page.
> >      Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).
> 
> So you prefer keeping using  "1 << compound_order" as v1 did? Then you will
> convert all "1 << compound_order" to compound_nr?

Yes.  Please, let's merge v1 and ignore v2.
Yang Shi May 10, 2019, 4:54 p.m. UTC | #11
On 5/10/19 9:52 AM, Matthew Wilcox wrote:
> On Fri, May 10, 2019 at 09:50:04AM -0700, Yang Shi wrote:
>> On 5/10/19 9:36 AM, Matthew Wilcox wrote:
>>> On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
>>>>> +		nr_reclaimed += (1 << compound_order(page));
>>>> How about to change this to
>>>>
>>>>           nr_reclaimed += hpage_nr_pages(page);
>>> Please don't.  That embeds the knowledge that we can only swap out either
>>> normal pages or THP sized pages.  I'm trying to make the VM capable of
>>> supporting arbitrary-order pages, and this would be just one more place
>>> to fix.
>>>
>>> I'm sympathetic to the "self documenting" argument.  My current tree has
>>> a patch in it:
>>>
>>>       mm: Introduce compound_nr
>>>       Replace 1 << compound_order(page) with compound_nr(page).  Minor
>>>       improvements in readability.
>>>
>>> It goes along with this patch:
>>>
>>>       mm: Introduce page_size()
>>>
>>>       It's unnecessarily hard to find out the size of a potentially huge page.
>>>       Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).
>> So you prefer keeping using  "1 << compound_order" as v1 did? Then you will
>> convert all "1 << compound_order" to compound_nr?
> Yes.  Please, let's merge v1 and ignore v2.

Fine by me. I think Andrew will take care of it. Andrew?
Ira Weiny May 10, 2019, 10:54 p.m. UTC | #12
On Fri, May 10, 2019 at 09:36:12AM -0700, Matthew Wilcox wrote:
> On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
> > > +		nr_reclaimed += (1 << compound_order(page));
> > 
> > How about to change this to
> > 
> > 
> >         nr_reclaimed += hpage_nr_pages(page);
> 
> Please don't.  That embeds the knowledge that we can only swap out either 
> normal pages or THP sized pages.  I'm trying to make the VM capable of 
> supporting arbitrary-order pages, and this would be just one more place
> to fix.
> 
> I'm sympathetic to the "self documenting" argument.  My current tree has
> a patch in it:
> 
>     mm: Introduce compound_nr
>     
>     Replace 1 << compound_order(page) with compound_nr(page).  Minor
>     improvements in readability.
> 
> It goes along with this patch:
> 
>     mm: Introduce page_size()
> 
>     It's unnecessarily hard to find out the size of a potentially huge page.
>     Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).
> 
> Better suggestions on naming gratefully received.  I'm more happy with 
> page_size() than I am with compound_nr().  page_nr() gives the wrong
> impression; page_count() isn't great either.

Stupid question: what does 'nr' stand for?

Ira
Matthew Wilcox (Oracle) May 10, 2019, 11:06 p.m. UTC | #13
On Fri, May 10, 2019 at 03:54:56PM -0700, Ira Weiny wrote:
> On Fri, May 10, 2019 at 09:36:12AM -0700, Matthew Wilcox wrote:
> > On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
> > > > +		nr_reclaimed += (1 << compound_order(page));
> > > 
> > > How about to change this to
> > > 
> > > 
> > >         nr_reclaimed += hpage_nr_pages(page);
> > 
> > Please don't.  That embeds the knowledge that we can only swap out either 
> > normal pages or THP sized pages.  I'm trying to make the VM capable of 
> > supporting arbitrary-order pages, and this would be just one more place
> > to fix.
> > 
> > I'm sympathetic to the "self documenting" argument.  My current tree has
> > a patch in it:
> > 
> >     mm: Introduce compound_nr
> >     
> >     Replace 1 << compound_order(page) with compound_nr(page).  Minor
> >     improvements in readability.
> > 
> > It goes along with this patch:
> > 
> >     mm: Introduce page_size()
> > 
> >     It's unnecessarily hard to find out the size of a potentially huge page.
> >     Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).
> > 
> > Better suggestions on naming gratefully received.  I'm more happy with 
> > page_size() than I am with compound_nr().  page_nr() gives the wrong
> > impression; page_count() isn't great either.
> 
> Stupid question : what does 'nr' stand for?

NumbeR.  It's relatively common argot in the Linux kernel (as you can
see from the earlier example ...

> > >         nr_reclaimed += hpage_nr_pages(page);

willy@bobo:~/kernel/xarray-2$ git grep -w nr mm |wc -l
388
willy@bobo:~/kernel/xarray-2$ git grep -w nr fs |wc -l
1067
Yafang Shao May 11, 2019, 4:09 a.m. UTC | #14
On Sat, May 11, 2019 at 12:36 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, May 10, 2019 at 10:12:40AM +0800, Huang, Ying wrote:
> > > +           nr_reclaimed += (1 << compound_order(page));
> >
> > How about to change this to
> >
> >
> >         nr_reclaimed += hpage_nr_pages(page);
>
> Please don't.  That embeds the knowledge that we can only swap out either
> normal pages or THP sized pages.

Agreed.
compound_order() is more general than hpage_nr_pages().
It seems to me that hpage_nr_pages() is a little abused in lots of places.

Thanks
Yafang
William Kucharski May 11, 2019, 10:33 p.m. UTC | #15
> On May 10, 2019, at 10:36 AM, Matthew Wilcox <willy@infradead.org> wrote:
> 
> Please don't.  That embeds the knowledge that we can only swap out either 
> normal pages or THP sized pages.  I'm trying to make the VM capable of 
> supporting arbitrary-order pages, and this would be just one more place
> to fix.
> 
> I'm sympathetic to the "self documenting" argument.  My current tree has
> a patch in it:
> 
>    mm: Introduce compound_nr
> 
>    Replace 1 << compound_order(page) with compound_nr(page).  Minor
>    improvements in readability.
> 
> It goes along with this patch:
> 
>    mm: Introduce page_size()
> 
>    It's unnecessarily hard to find out the size of a potentially huge page.
>    Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).
> 
> Better suggestions on naming gratefully received.  I'm more happy with 
> page_size() than I am with compound_nr().  page_nr() gives the wrong
> impression; page_count() isn't great either.

I like page_size() as well. At least to me, page_nr() or page_count() would
imply a basis of PAGE_SIZE, or that you would need to do something like:

    page_size = page_nr() << PAGE_SHIFT;

to get the size in bytes; page_size() is more straightforward in that respect.

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index fd9de50..7e026ec 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1446,7 +1446,11 @@  static unsigned long shrink_page_list(struct list_head *page_list,
 
 		unlock_page(page);
 free_it:
-		nr_reclaimed++;
+		/* 
+		 * THP may get swapped out in a whole, need account
+		 * all base pages.
+		 */
+		nr_reclaimed += (1 << compound_order(page));
 
 		/*
 		 * Is there need to periodically free_page_list? It would