diff mbox series

mm: page_alloc: use the correct THP order for THP PCP

Message ID a25c9e14cd03907d5978b60546a69e6aa3fc2a7d.1712151833.git.baolin.wang@linux.alibaba.com (mailing list archive)
State New
Headers show
Series mm: page_alloc: use the correct THP order for THP PCP | expand

Commit Message

Baolin Wang April 3, 2024, 1:47 p.m. UTC
Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
on the per-cpu lists") extends the PCP allocator to store THP pages, and
it determines whether to cache THP pags in PCP by comparing with pageblock_order.
But the pageblock_order is not always equal to THP order, it might also
be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.

Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
THP for PCP can fix this issue

Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/page_alloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Vlastimil Babka April 4, 2024, 10:03 a.m. UTC | #1
On 4/3/24 3:47 PM, Baolin Wang wrote:
> Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
> on the per-cpu lists") extends the PCP allocator to store THP pages, and
> it determines whether to cache THP pags in PCP by comparing with pageblock_order.
> But the pageblock_order is not always equal to THP order, it might also
> be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.
> 
> Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
> THP for PCP can fix this issue
> 
> Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

IIUC this happens with CONFIG_HUGETLB_PAGE disabled because HUGETLBFS is
disabled? But THPs are still enabled? I think there might be more of THP
working suboptimally in that case with pageblock_order being larger
(MAX_PAGE_ORDER).

In other words, should be rather make pageblock_order itself defined as

 min_t(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)

in case with !CONFIG_HUGETLB_PAGE but THP enabled.

> ---
>  mm/page_alloc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1beb56f75319..915f4ef070da 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -506,7 +506,7 @@ static inline unsigned int order_to_pindex(int migratetype, int order)
>  {
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	if (order > PAGE_ALLOC_COSTLY_ORDER) {
> -		VM_BUG_ON(order != pageblock_order);
> +		VM_BUG_ON(order != HPAGE_PMD_ORDER);
>  		return NR_LOWORDER_PCP_LISTS;
>  	}
>  #else
> @@ -522,7 +522,7 @@ static inline int pindex_to_order(unsigned int pindex)
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	if (pindex == NR_LOWORDER_PCP_LISTS)
> -		order = pageblock_order;
> +		order = HPAGE_PMD_ORDER;
>  #else
>  	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>  #endif
> @@ -535,7 +535,7 @@ static inline bool pcp_allowed_order(unsigned int order)
>  	if (order <= PAGE_ALLOC_COSTLY_ORDER)
>  		return true;
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -	if (order == pageblock_order)
> +	if (order == HPAGE_PMD_ORDER)
>  		return true;
>  #endif
>  	return false;
Baolin Wang April 4, 2024, 12:19 p.m. UTC | #2
On 2024/4/4 18:03, Vlastimil Babka wrote:
> On 4/3/24 3:47 PM, Baolin Wang wrote:
>> Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
>> on the per-cpu lists") extends the PCP allocator to store THP pages, and
>> it determines whether to cache THP pags in PCP by comparing with pageblock_order.
>> But the pageblock_order is not always equal to THP order, it might also
>> be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.
>>
>> Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
>> THP for PCP can fix this issue
>>
>> Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> 
> IIUC this happens with CONFIG_HUGETLB_PAGE disabled because HUGETLBFS is
> disabled? But THPs are still enabled? I think there might be more of THP

Right, and seems the Powerpc arch will set pageblock_order via 
set_pageblock_order() when the huge page sizes are variable (not sure if 
this is always equal to THP order).

Moreover, it still does not make sense to use pageblock_order to 
indicate a THP page, especially when we already have HPAGE_PMD_ORDER to 
represent THP.

> working suboptimally in that case with pageblock_order being larger
> (MAX_PAGE_ORDER).
> 
> In other words, should be rather make pageblock_order itself defined as
> 
>   min_t(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)
> 
> in case with !CONFIG_HUGETLB_PAGE but THP enabled.

Yes, this makes sense to me (I wonder why this wasn't done before?). I 
can create a seperate patch to do this, what do you think? Thanks.
Vlastimil Babka April 4, 2024, 2 p.m. UTC | #3
On 4/3/24 3:47 PM, Baolin Wang wrote:
> Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
> on the per-cpu lists") extends the PCP allocator to store THP pages, and
> it determines whether to cache THP pags in PCP by comparing with pageblock_order.
> But the pageblock_order is not always equal to THP order, it might also
> be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.
> 
> Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
> THP for PCP can fix this issue
> 
> Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1beb56f75319..915f4ef070da 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -506,7 +506,7 @@ static inline unsigned int order_to_pindex(int migratetype, int order)
>  {
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	if (order > PAGE_ALLOC_COSTLY_ORDER) {
> -		VM_BUG_ON(order != pageblock_order);
> +		VM_BUG_ON(order != HPAGE_PMD_ORDER);
>  		return NR_LOWORDER_PCP_LISTS;
>  	}
>  #else
> @@ -522,7 +522,7 @@ static inline int pindex_to_order(unsigned int pindex)
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	if (pindex == NR_LOWORDER_PCP_LISTS)
> -		order = pageblock_order;
> +		order = HPAGE_PMD_ORDER;
>  #else
>  	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>  #endif
> @@ -535,7 +535,7 @@ static inline bool pcp_allowed_order(unsigned int order)
>  	if (order <= PAGE_ALLOC_COSTLY_ORDER)
>  		return true;
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -	if (order == pageblock_order)
> +	if (order == HPAGE_PMD_ORDER)
>  		return true;
>  #endif
>  	return false;
Vlastimil Babka April 4, 2024, 2:01 p.m. UTC | #4
On 4/4/24 2:19 PM, Baolin Wang wrote:
> 
> 
> On 2024/4/4 18:03, Vlastimil Babka wrote:
>> On 4/3/24 3:47 PM, Baolin Wang wrote:
>>> Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
>>> on the per-cpu lists") extends the PCP allocator to store THP pages, and
>>> it determines whether to cache THP pags in PCP by comparing with pageblock_order.
>>> But the pageblock_order is not always equal to THP order, it might also
>>> be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.
>>>
>>> Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
>>> THP for PCP can fix this issue
>>>
>>> Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> 
>> IIUC this happens with CONFIG_HUGETLB_PAGE disabled because HUGETLBFS is
>> disabled? But THPs are still enabled? I think there might be more of THP
> 
> Right, and seems the Powerpc arch will set pageblock_order via 
> set_pageblock_order() when the huge page sizes are variable (not sure if 
> this is always equal to THP order).
> 
> Moreover, it still does not make sense to use pageblock_order to 
> indicate a THP page, especially when we already have HPAGE_PMD_ORDER to 
> represent THP.
> 
>> working suboptimally in that case with pageblock_order being larger
>> (MAX_PAGE_ORDER).
>> 
>> In other words, should be rather make pageblock_order itself defined as
>> 
>>   min_t(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)
>> 
>> in case with !CONFIG_HUGETLB_PAGE but THP enabled.
> 
> Yes, this makes sense to me (I wonder why this wasn't done before?). I 
> can create a seperate patch to do this, what do you think? Thanks.

It probably wasn't anticipated that hugetlbfs would be disabled and THP
enabled. If you can create such patch, great!
Barry Song April 5, 2024, 4:53 a.m. UTC | #5
On Thu, Apr 4, 2024 at 2:47 AM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:
>
> Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
> on the per-cpu lists") extends the PCP allocator to store THP pages, and
> it determines whether to cache THP pags in PCP by comparing with pageblock_order.
> But the pageblock_order is not always equal to THP order, it might also
> be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.
>
> Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
> THP for PCP can fix this issue
>
> Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

LGTM,
Reviewed-by: Barry Song <baohua@kernel.org>

In the context of using mTHP, perhaps there arises a need for PCP
allocation for frequently
requested mTHP orders. These orders typically exceed PAGE_ALLOC_COSTLY_ORDER
but are smaller than HPAGE_PMD_ORDER.

> ---
>  mm/page_alloc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1beb56f75319..915f4ef070da 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -506,7 +506,7 @@ static inline unsigned int order_to_pindex(int migratetype, int order)
>  {
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>         if (order > PAGE_ALLOC_COSTLY_ORDER) {
> -               VM_BUG_ON(order != pageblock_order);
> +               VM_BUG_ON(order != HPAGE_PMD_ORDER);
>                 return NR_LOWORDER_PCP_LISTS;
>         }
>  #else
> @@ -522,7 +522,7 @@ static inline int pindex_to_order(unsigned int pindex)
>
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>         if (pindex == NR_LOWORDER_PCP_LISTS)
> -               order = pageblock_order;
> +               order = HPAGE_PMD_ORDER;
>  #else
>         VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
>  #endif
> @@ -535,7 +535,7 @@ static inline bool pcp_allowed_order(unsigned int order)
>         if (order <= PAGE_ALLOC_COSTLY_ORDER)
>                 return true;
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -       if (order == pageblock_order)
> +       if (order == HPAGE_PMD_ORDER)
>                 return true;
>  #endif
>         return false;
> --
> 2.39.3
>

Thanks
Barry
Baolin Wang April 5, 2024, 12:16 p.m. UTC | #6
On 2024/4/5 12:53, Barry Song wrote:
> On Thu, Apr 4, 2024 at 2:47 AM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
>>
>> Commit 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
>> on the per-cpu lists") extends the PCP allocator to store THP pages, and
>> it determines whether to cache THP pags in PCP by comparing with pageblock_order.
>> But the pageblock_order is not always equal to THP order, it might also
>> be MAX_PAGE_ORDER, which could prevent PCP from caching THP pages.
>>
>> Therefore, using HPAGE_PMD_ORDER instead to determine the need for caching
>> THP for PCP can fix this issue
>>
>> Fixes: 44042b449872 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> 
> LGTM,
> Reviewed-by: Barry Song <baohua@kernel.org>
> 
> In the context of using mTHP, perhaps there arises a need for PCP
> allocation for frequently
> requested mTHP orders. These orders typically exceed PAGE_ALLOC_COSTLY_ORDER
> but are smaller than HPAGE_PMD_ORDER.

Yes, I have also considered this, but haven't found some time to do more 
investigation and run some benchmarks.:) May be we can create a new 
thread to talk about this first.
diff mbox series

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1beb56f75319..915f4ef070da 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -506,7 +506,7 @@  static inline unsigned int order_to_pindex(int migratetype, int order)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	if (order > PAGE_ALLOC_COSTLY_ORDER) {
-		VM_BUG_ON(order != pageblock_order);
+		VM_BUG_ON(order != HPAGE_PMD_ORDER);
 		return NR_LOWORDER_PCP_LISTS;
 	}
 #else
@@ -522,7 +522,7 @@  static inline int pindex_to_order(unsigned int pindex)
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	if (pindex == NR_LOWORDER_PCP_LISTS)
-		order = pageblock_order;
+		order = HPAGE_PMD_ORDER;
 #else
 	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
 #endif
@@ -535,7 +535,7 @@  static inline bool pcp_allowed_order(unsigned int order)
 	if (order <= PAGE_ALLOC_COSTLY_ORDER)
 		return true;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (order == pageblock_order)
+	if (order == HPAGE_PMD_ORDER)
 		return true;
 #endif
 	return false;