
[2/2] mm/hugetlb: pass correct order_per_bit to cma_declare_contiguous_nid

Message ID 20240404162515.527802-2-fvdl@google.com (mailing list archive)
State New
Series [1/2] mm/cma: drop incorrect alignment check in cma_init_reserved_mem

Commit Message

Frank van der Linden April 4, 2024, 4:25 p.m. UTC
The hugetlb_cma code passes 0 in the order_per_bit argument to
cma_declare_contiguous_nid (the alignment, computed using the
page order, is correctly passed in).

This causes a bit in the cma allocation bitmap to always represent
a 4k page, making the bitmaps potentially very large, and slower.

So, correctly pass in the order instead.

Signed-off-by: Frank van der Linden <fvdl@google.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
---
 mm/hugetlb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
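
For context on why order_per_bit matters here: CMA sizes its allocation
bitmap as one bit per (PAGE_SIZE << order_per_bit) chunk of the area. A
minimal sketch of the relevant mm/cma.c logic, paraphrased from memory
(exact names and details may differ between kernel versions, so check
your tree):

static unsigned long cma_bitmap_maxno(struct cma *cma)
{
	/* One bit covers 2^order_per_bit base pages of the area. */
	return cma->count >> cma->order_per_bit;
}

static void __init cma_activate_area(struct cma *cma)
{
	/* With order_per_bit == 0, this is one bit per 4k page. */
	cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
	/* ... error handling and page activation omitted ... */
}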

Comments

Roman Gushchin April 4, 2024, 6:56 p.m. UTC | #1
On Thu, Apr 04, 2024 at 04:25:15PM +0000, Frank van der Linden wrote:
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
> 
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
> 
> So, correctly pass in the order instead.
> 
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")

Hi Frank,

there is a comment just above your changes which explains why order_per_bit is 0.
Is this not true anymore? If so, please, fix the comment too. Please, clarify.

Thanks!
Frank van der Linden April 4, 2024, 7:40 p.m. UTC | #2
On Thu, Apr 4, 2024 at 11:56 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> On Thu, Apr 04, 2024 at 04:25:15PM +0000, Frank van der Linden wrote:
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
> >
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>
> Hi Frank,
>
> there is a comment just above your changes which explains why order_per_bit is 0.
> Is this not true anymore? If so, please, fix the comment too. Please, clarify.
>
> Thanks!

Hi Roman,

I'm assuming you're referring to this comment:

/*
 * Note that 'order per bit' is based on smallest size that
 * may be returned to CMA allocator in the case of
 * huge page demotion.
 */

That comment was added in a01f43901cfb9 ("hugetlb: be sure to free
demoted CMA pages to CMA").

It talks about HUGETLB_PAGE_ORDER being the minimum order being given
back to the CMA allocator (after hugetlb demotion), therefore
order_per_bit must be HUGETLB_PAGE_ORDER. See the commit message for
that commit:

"Therefore, at region setup time we use HUGETLB_PAGE_ORDER as the
smallest possible huge page size that can be given back to CMA."

But the commit, while correctly changing the alignment, left the
order_per_bit argument at 0,  even though it clearly intended to set
it at HUGETLB_PAGE_ORDER. The confusion may have been that
cma_declare_contiguous_nid has 9 arguments, several of which can be
left at 0 meaning 'use default', so it's easy to misread.

In other words, the comment was correct, but the code was not. After
this patch, comment and code match.
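
For reference, the prototype in question (as declared in
include/linux/cma.h around the time of this series; quoted from memory,
so double-check the exact signature against your tree):

int __init cma_declare_contiguous_nid(phys_addr_t base,
			phys_addr_t size, phys_addr_t limit,
			phys_addr_t alignment, unsigned int order_per_bit,
			bool fixed, const char *name, struct cma **res_cma,
			int nid);

The alignment and order_per_bit parameters sit right next to each
other, which makes the mixup described above easy to see.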
David Hildenbrand April 4, 2024, 8:13 p.m. UTC | #3
On 04.04.24 18:25, Frank van der Linden wrote:
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
> 
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
> 
> So, correctly pass in the order instead.
> 
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")

It might be suboptimal, but do we call it a "BUG" that needs "fixing"? I
know, controversial :)

> ---
>   mm/hugetlb.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 23ef240ba48a..6dc62d8b2a3a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
>   		 * huge page demotion.
>   		 */
>   		res = cma_declare_contiguous_nid(0, size, 0,
> -						PAGE_SIZE << HUGETLB_PAGE_ORDER,
> -						 0, false, name,
> -						 &hugetlb_cma[nid], nid);
> +					PAGE_SIZE << HUGETLB_PAGE_ORDER,
> +					HUGETLB_PAGE_ORDER, false, name,
> +					&hugetlb_cma[nid], nid);
>   		if (res) {
>   			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
>   				res, nid);

... I'm afraid this is not completely correct.

For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.

... but we do support smaller hugetlb sizes than that (cont-pte hugetlb 
size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
Andrew Morton April 4, 2024, 8:17 p.m. UTC | #4
On Thu,  4 Apr 2024 16:25:15 +0000 Frank van der Linden <fvdl@google.com> wrote:

> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
> 
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
> 
> So, correctly pass in the order instead.

Ditto.  Should we backport this?   Can we somewhat quantify "potentially very",
and understand under what circumstances this might occur?
Roman Gushchin April 4, 2024, 8:45 p.m. UTC | #5
On Thu, Apr 04, 2024 at 12:40:58PM -0700, Frank van der Linden wrote:
> On Thu, Apr 4, 2024 at 11:56 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >
> > On Thu, Apr 04, 2024 at 04:25:15PM +0000, Frank van der Linden wrote:
> > > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > > cma_declare_contiguous_nid (the alignment, computed using the
> > > page order, is correctly passed in).
> > >
> > > This causes a bit in the cma allocation bitmap to always represent
> > > a 4k page, making the bitmaps potentially very large, and slower.
> > >
> > > So, correctly pass in the order instead.
> > >
> > > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> >
> > Hi Frank,
> >
> > there is a comment just above your changes which explains why order_per_bit is 0.
> > Is this not true anymore? If so, please, fix the comment too. Please, clarify.
> >
> > Thanks!
> 
> Hi Roman,
> 
> I'm assuming you're referring to this comment:
> 
> /*
>  * Note that 'order per bit' is based on smallest size that
>  * may be returned to CMA allocator in the case of
>  * huge page demotion.
>  */
> 
> That comment was added in a01f43901cfb9 ("hugetlb: be sure to free
> demoted CMA pages to CMA").
> 
> It talks about HUGETLB_PAGE_ORDER being the minimum order being given
> back to the CMA allocator (after hugetlb demotion), therefore
> order_per_bit must be HUGETLB_PAGE_ORDER. See the commit message for
> that commit:
> 
> "Therefore, at region setup time we use HUGETLB_PAGE_ORDER as the
> smallest possible huge page size that can be given back to CMA."
> 
> But the commit, while correctly changing the alignment, left the
> order_per_bit argument at 0,  even though it clearly intended to set
> it at HUGETLB_PAGE_ORDER. The confusion may have been that
> cma_declare_contiguous_nid has 9 arguments, several of which can be
> left at 0 meaning 'use default', so it's easy to misread.
> 
> In other words, the comment was correct, but the code was not. After
> this patch, comment and code match.

Indeed, the mentioned commit, which added a comment that was not aligned
with the code, was confusing. It all makes sense now, thank you for
the explanation!

Please, feel free to add
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
for your patch.

Thanks!
Roman Gushchin April 4, 2024, 8:52 p.m. UTC | #6
On Thu, Apr 04, 2024 at 10:13:21PM +0200, David Hildenbrand wrote:
> On 04.04.24 18:25, Frank van der Linden wrote:
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> > 
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> > 
> > So, correctly pass in the order instead.
> > 
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> 
> It might be suboptimal, but do we call it a "BUG" that needs "fixing"? I
> know, controversial :)

We probably should not rush with a stable backporting, especially given your
next comment on page sizes on arm.
Frank van der Linden April 4, 2024, 9:44 p.m. UTC | #7
On Thu, Apr 4, 2024 at 1:13 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 04.04.24 18:25, Frank van der Linden wrote:
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
> >
> > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>
> It might be suboptimal, but do we call it a "BUG" that needs "fixing"? I
> know, controversial :)
>
> > ---
> >   mm/hugetlb.c | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 23ef240ba48a..6dc62d8b2a3a 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> >                * huge page demotion.
> >                */
> >               res = cma_declare_contiguous_nid(0, size, 0,
> > -                                             PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > -                                              0, false, name,
> > -                                              &hugetlb_cma[nid], nid);
> > +                                     PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > +                                     HUGETLB_PAGE_ORDER, false, name,
> > +                                     &hugetlb_cma[nid], nid);
> >               if (res) {
> >                       pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> >                               res, nid);
>
> ... I'm afraid this is not completely correct.
>
> For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
>
> ... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
> size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)

Right,  smaller hugetlb page sizes exist. But, the value here is not
intended to represent the minimum hugetlb page size - it's the minimum
hugetlb page size that we can demote a CMA-allocated hugetlb page to.
See:

a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to CMA")

So, this just restricts demotion of the gigantic, CMA-allocated pages
to HUGETLB_PAGE_ORDER-sized chunks.

- Frank
Frank van der Linden April 4, 2024, 9:58 p.m. UTC | #8
On Thu, Apr 4, 2024 at 1:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu,  4 Apr 2024 16:25:15 +0000 Frank van der Linden <fvdl@google.com> wrote:
>
> > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > cma_declare_contiguous_nid (the alignment, computed using the
> > page order, is correctly passed in).
> >
> > This causes a bit in the cma allocation bitmap to always represent
> > a 4k page, making the bitmaps potentially very large, and slower.
> >
> > So, correctly pass in the order instead.
>
> Ditto.  Should we backport this?   Can we somewhat quantify "potentially very",
> and understand under what circumstances this might occur?

It would create bitmaps that would be pretty big. E.g. for a 4k page
size on x86, hugetlb_cma=64G would mean a bitmap size of (64G / 4k) /
8 == 2M. With HUGETLB_PAGE_ORDER as order_per_bit, as intended, this
would be (64G / 2M) / 8 == 4k. So, that's quite a difference :)

Also, this restricted the hugetlb_cma area to ((PAGE_SIZE <<
MAX_PAGE_ORDER) * 8) * PAGE_SIZE (e.g. 128G on x86), since
bitmap_alloc uses normal page allocation, and is thus restricted by
MAX_PAGE_ORDER. Specifying anything above that would fail the CMA
initialization.

- Frank
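
To make the arithmetic above easy to reproduce, here is a tiny
standalone sketch (illustrative userspace C only, not kernel code;
HUGETLB_PAGE_ORDER is assumed to be 9, i.e. 2M huge pages with 4k base
pages on x86):

#include <stdio.h>

int main(void)
{
	unsigned long long area = 64ULL << 30;	/* hugetlb_cma=64G */
	unsigned long long page = 4096;		/* PAGE_SIZE */
	unsigned int order = 9;			/* HUGETLB_PAGE_ORDER */

	/* order_per_bit == 0: one bit per 4k page -> 2M of bitmap */
	printf("order_per_bit=0: %llu KiB\n", area / page / 8 / 1024);

	/* order_per_bit == 9: one bit per 2M huge page -> 4k of bitmap */
	printf("order_per_bit=%u: %llu KiB\n", order,
	       area / (page << order) / 8 / 1024);

	return 0;
}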
Frank van der Linden April 4, 2024, 10:02 p.m. UTC | #9
Rushing is never good, of course, but see my reply to David - while
smaller hugetlb page sizes than HUGETLB_PAGE_ORDER exist, that's not
the issue in that particular code path.

The only restriction for backports is, I think, that the two patches
need to go together.

I have backported them to 6.6 (which was just a clean apply), and
5.10, which doesn't have hugetlb page demotion, so it actually can
pass the full 1G as order_per_bit. That works fine if you also apply
the CMA align check fix, but would fail otherwise.

- Frank

On Thu, Apr 4, 2024 at 1:52 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> On Thu, Apr 04, 2024 at 10:13:21PM +0200, David Hildenbrand wrote:
> > On 04.04.24 18:25, Frank van der Linden wrote:
> > > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > > cma_declare_contiguous_nid (the alignment, computed using the
> > > page order, is correctly passed in).
> > >
> > > This causes a bit in the cma allocation bitmap to always represent
> > > a 4k page, making the bitmaps potentially very large, and slower.
> > >
> > > So, correctly pass in the order instead.
> > >
> > > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> >
> > It might be suboptimal, but do we call it a "BUG" that needs "fixing"? I
> > know, controversial :)
>
> We probably should not rush with a stable backporting, especially given your
> next comment on page sizes on arm.
Andrew Morton April 4, 2024, 10:20 p.m. UTC | #10
On Thu, 4 Apr 2024 15:02:34 -0700 Frank van der Linden <fvdl@google.com> wrote:

> Rushing is never good, of course, but see my reply to David - while
> smaller hugetlb page sizes than HUGETLB_PAGE_ORDER exist, that's not
> the issue in that particular code path.
> 
> The only restriction for backports is, I think, that the two patches
> need to go together.
> 
> I have backported them to 6.6 (which was just a clean apply), and
> 5.10, which doesn't have hugetlb page demotion, so it actually can
> pass the full 1G as order_per_bit. That works fine if you also apply
> the CMA align check fix, but would fail otherwise.

OK, thanks.  I added cc:stable to both patches and added this:

: It would create bitmaps that would be pretty big.  E.g.  for a 4k page
: size on x86, hugetlb_cma=64G would mean a bitmap size of (64G / 4k) / 8
: == 2M.  With HUGETLB_PAGE_ORDER as order_per_bit, as intended, this
: would be (64G / 2M) / 8 == 4k.  So, that's quite a difference.
: 
: Also, this restricted the hugetlb_cma area to ((PAGE_SIZE <<
: MAX_PAGE_ORDER) * 8) * PAGE_SIZE (e.g.  128G on x86), since
: bitmap_alloc uses normal page allocation, and is thus restricted by
: MAX_PAGE_ORDER.  Specifying anything above that would fail the CMA
: initialization.

to the [2/2] changelog.


For extra test & review I'll leave them in mm-[un]stable so they go
into mainline for 6.10-rc1 which will then trigger the backporting
process.  This can of course all be altered...
Frank van der Linden April 4, 2024, 10:22 p.m. UTC | #11
On Thu, Apr 4, 2024 at 2:44 PM Frank van der Linden <fvdl@google.com> wrote:
>
> On Thu, Apr 4, 2024 at 1:13 PM David Hildenbrand <david@redhat.com> wrote:
> >
> > On 04.04.24 18:25, Frank van der Linden wrote:
> > > The hugetlb_cma code passes 0 in the order_per_bit argument to
> > > cma_declare_contiguous_nid (the alignment, computed using the
> > > page order, is correctly passed in).
> > >
> > > This causes a bit in the cma allocation bitmap to always represent
> > > a 4k page, making the bitmaps potentially very large, and slower.
> > >
> > > So, correctly pass in the order instead.
> > >
> > > Signed-off-by: Frank van der Linden <fvdl@google.com>
> > > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> >
> > It might be suboptimal, but do we call it a "BUG" that needs "fixing"? I
> > know, controversial :)
> >
> > > ---
> > >   mm/hugetlb.c | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 23ef240ba48a..6dc62d8b2a3a 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> > >                * huge page demotion.
> > >                */
> > >               res = cma_declare_contiguous_nid(0, size, 0,
> > > -                                             PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > > -                                              0, false, name,
> > > -                                              &hugetlb_cma[nid], nid);
> > > +                                     PAGE_SIZE << HUGETLB_PAGE_ORDER,
> > > +                                     HUGETLB_PAGE_ORDER, false, name,
> > > +                                     &hugetlb_cma[nid], nid);
> > >               if (res) {
> > >                       pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> > >                               res, nid);
> >
> > ... I'm afraid this is not completely correct.
> >
> > For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
> >
> > ... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
> > size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
>
> Right,  smaller hugetlb page sizes exist. But, the value here is not
> intended to represent the minimum hugetlb page size - it's the minimum
> hugetlb page size that we can demote a CMA-allocated hugetlb page to.
> See:
>
> a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to CMA")
>
> So, this just restricts demotion of the gigantic, CMA-allocated pages
> to HUGETLB_PAGE_ORDER-sized chunks.
>
> - Frank

Just to clarify what I'm saying here: the restriction on the size you
can demote a CMA-allocated gigantic page to was already there; my
patch doesn't change anything in that regard.
Muchun Song April 7, 2024, 8:02 a.m. UTC | #12
> On Apr 5, 2024, at 00:25, Frank van der Linden <fvdl@google.com> wrote:
> 
> The hugetlb_cma code passes 0 in the order_per_bit argument to
> cma_declare_contiguous_nid (the alignment, computed using the
> page order, is correctly passed in).
> 
> This causes a bit in the cma allocation bitmap to always represent
> a 4k page, making the bitmaps potentially very large, and slower.
> 
> So, correctly pass in the order instead.
> 
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> ---
> mm/hugetlb.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 23ef240ba48a..6dc62d8b2a3a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
> 		* huge page demotion.
> 		*/
> 		res = cma_declare_contiguous_nid(0, size, 0,
> - 						PAGE_SIZE << HUGETLB_PAGE_ORDER,
> - 						0, false, name,
> - 						&hugetlb_cma[nid], nid);
> +						PAGE_SIZE << HUGETLB_PAGE_ORDER,
> +						HUGETLB_PAGE_ORDER, false, name,

IIUC, we could take the optimization further and change order_per_bit to
'MAX_PAGE_ORDER + 1', since only gigantic hugetlb pages can be allocated
from the CMA pool, meaning any gigantic page is at least as large as two
to the power of 'MAX_PAGE_ORDER + 1' base pages.

Thanks.

> +						&hugetlb_cma[nid], nid);
> 		if (res) {
> 			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
> 				res, nid);
> -- 
> 2.44.0.478.gd926399ef9-goog
>
David Hildenbrand April 8, 2024, 8:15 a.m. UTC | #13
On 04.04.24 23:44, Frank van der Linden wrote:
> On Thu, Apr 4, 2024 at 1:13 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 04.04.24 18:25, Frank van der Linden wrote:
>>> The hugetlb_cma code passes 0 in the order_per_bit argument to
>>> cma_declare_contiguous_nid (the alignment, computed using the
>>> page order, is correctly passed in).
>>>
>>> This causes a bit in the cma allocation bitmap to always represent
>>> a 4k page, making the bitmaps potentially very large, and slower.
>>>
>>> So, correctly pass in the order instead.
>>>
>>> Signed-off-by: Frank van der Linden <fvdl@google.com>
>>> Cc: Roman Gushchin <roman.gushchin@linux.dev>
>>> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>>
>> It might be suboptimal, but do we call it a "BUG" that needs "fixing"? I
>> know, controversial :)
>>
>>> ---
>>>    mm/hugetlb.c | 6 +++---
>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 23ef240ba48a..6dc62d8b2a3a 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -7873,9 +7873,9 @@ void __init hugetlb_cma_reserve(int order)
>>>                 * huge page demotion.
>>>                 */
>>>                res = cma_declare_contiguous_nid(0, size, 0,
>>> -                                             PAGE_SIZE << HUGETLB_PAGE_ORDER,
>>> -                                              0, false, name,
>>> -                                              &hugetlb_cma[nid], nid);
>>> +                                     PAGE_SIZE << HUGETLB_PAGE_ORDER,
>>> +                                     HUGETLB_PAGE_ORDER, false, name,
>>> +                                     &hugetlb_cma[nid], nid);
>>>                if (res) {
>>>                        pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
>>>                                res, nid);
>>
>> ... I'm afraid this is not completely correct.
>>
>> For example, on arm64, HUGETLB_PAGE_ORDER is essentially PMD_ORDER.
>>
>> ... but we do support smaller hugetlb sizes than that (cont-pte hugetlb
>> size is 64 KiB, not 2 MiB -- PMD -- on a 4k kernel)
> 
> Right,  smaller hugetlb page sizes exist. But, the value here is not
> intended to represent the minimum hugetlb page size - it's the minimum
> hugetlb page size that we can demote a CMA-allocated hugetlb page to.
> See:
> 
> a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to CMA")
> 
> So, this just restricts demotion of the gigantic, CMA-allocated pages
> to HUGETLB_PAGE_ORDER-sized chunks.

Right, now my memory comes back. In v1 of that patch set, Mike did 
support demoting to any smaller hugetlb order.

And because that smallest hugetlb order is not known when we call 
cma_declare_contiguous_nid(), he used to pass "0" for the order.

In v2, he simplified that, though, and limited it to HUGETLB_PAGE_ORDER.
He didn't update the order here, though.

Acked-by: David Hildenbrand <david@redhat.com>


Note that I don't think any of these patches are CC-stable material.

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 23ef240ba48a..6dc62d8b2a3a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7873,9 +7873,9 @@  void __init hugetlb_cma_reserve(int order)
 		 * huge page demotion.
 		 */
 		res = cma_declare_contiguous_nid(0, size, 0,
-						PAGE_SIZE << HUGETLB_PAGE_ORDER,
-						 0, false, name,
-						 &hugetlb_cma[nid], nid);
+					PAGE_SIZE << HUGETLB_PAGE_ORDER,
+					HUGETLB_PAGE_ORDER, false, name,
+					&hugetlb_cma[nid], nid);
 		if (res) {
 			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
 				res, nid);