
[V3] mm/hugetlb: try preferred node first when alloc gigantic page from cma

Message ID 20200902025016.697260-1-lixinhai.lxh@gmail.com (mailing list archive)
State New, archived
Series [V3] mm/hugetlb: try preferred node first when alloc gigantic page from cma

Commit Message

Li Xinhai Sept. 2, 2020, 2:50 a.m. UTC
Since commit cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic
hugepages using cma"), a gigantic page may be allocated from a node other
than the preferred node, even though pages are available on that node.
The reason is that the nid parameter is ignored in alloc_gigantic_page().

In addition, __GFP_THISNODE must be honored when the user requires
allocation only from the preferred node.

After this patch, the preferred node is tried before the other allowed
nodes, and no other node is tried when __GFP_THISNODE is specified. If
the user does not specify a preferred node, the current node is used,
which keeps the behavior of allocating gigantic and non-gigantic hugetlb
pages consistent.

Fixes: cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
---
v2->v3:
Consider the current node as the preferred node if nid is NUMA_NO_NODE;
thanks Mike.

v1->v2:
Following review by Mike and Michal, check __GFP_THISNODE to avoid
allocating from other nodes.

 mm/hugetlb.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)
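
For illustration, here is a standalone userspace sketch (not kernel code)
of the allocation order this patch establishes; pool_alloc() and the
arrays below are hypothetical stand-ins for cma_alloc() and the
hugetlb_cma[] pools:

/*
 * Try the preferred node's pool first, and fall back to the other
 * allowed nodes only when the THISNODE flag is not set.
 */
#include <stdbool.h>
#include <stdio.h>

#define MAX_NODES	4
#define FAKE_THISNODE	0x1u	/* stand-in for __GFP_THISNODE */

/* Pages remaining in each node's fake CMA pool. */
static int pool_pages[MAX_NODES] = { 0, 8, 8, 0 };

static bool pool_alloc(int node)
{
	if (pool_pages[node] == 0)
		return false;
	pool_pages[node]--;
	return true;
}

/* Returns the node the page came from, or -1 on failure. */
static int alloc_gigantic(unsigned int gfp_mask, int nid,
			  const bool allowed[MAX_NODES])
{
	int node;

	/* Preferred node first, mirroring the new hugetlb_cma[nid] branch. */
	if (allowed[nid] && pool_alloc(nid))
		return nid;

	/* No fallback when the caller demanded the preferred node only. */
	if (gfp_mask & FAKE_THISNODE)
		return -1;

	for (node = 0; node < MAX_NODES; node++) {
		if (node == nid || !allowed[node])
			continue;
		if (pool_alloc(node))
			return node;
	}
	return -1;
}

int main(void)
{
	const bool allowed[MAX_NODES] = { true, true, true, true };

	/* Node 0 is empty, so this falls back to node 1. */
	printf("node %d\n", alloc_gigantic(0, 0, allowed));
	/* With the THISNODE flag, the empty preferred node means failure. */
	printf("node %d\n", alloc_gigantic(FAKE_THISNODE, 0, allowed));
	return 0;
}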

Comments

Michal Hocko Sept. 2, 2020, 6:14 a.m. UTC | #1
On Wed 02-09-20 10:50:16, Li Xinhai wrote:
> Since commit cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic
> hugepages using cma"), a gigantic page may be allocated from a node other
> than the preferred node, even though pages are available on that node.
> The reason is that the nid parameter is ignored in alloc_gigantic_page().
> 
> In addition, __GFP_THISNODE must be honored when the user requires
> allocation only from the preferred node.
> 
> After this patch, the preferred node is tried before the other allowed
> nodes, and no other node is tried when __GFP_THISNODE is specified. If
> the user does not specify a preferred node, the current node is used,
> which keeps the behavior of allocating gigantic and non-gigantic hugetlb
> pages consistent.

Technically speaking this is still not in full sync with the allocator
semantics. E.g. the CMA allocator should try nodes in node-distance
order. Possible, but I am not sure how much we really care for those who
preallocate CMA pools. Also, __GFP_NOWAIT should skip CMA as that path
requires a mutex - or, even better, make cma_alloc gfp aware. Likely a
few more things.

If we care, that is material for a patch or two of its own; it would
also be worth making the CMA part a function of its own to remove the
ugly ifdef.
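
For reference, a rough, untested sketch of what distance-ordered fallback
could look like; cma_alloc_nearest() is an invented name, while
node_distance(), for_each_node_mask() and the cma_alloc() signature are
the real kernel interfaces:

/*
 * Hypothetical follow-up, not part of this patch: walk the fallback
 * nodes in increasing node_distance() from the preferred node instead
 * of plain numeric order. Untested illustration only.
 */
static struct page *cma_alloc_nearest(struct hstate *h, int nid,
				      nodemask_t *nodemask)
{
	unsigned long nr_pages = 1UL << huge_page_order(h);
	nodemask_t tried = NODE_MASK_NONE;
	struct page *page;
	int node, best, dist, best_dist;

	for (;;) {
		best = NUMA_NO_NODE;
		best_dist = INT_MAX;

		/* Pick the closest node with a CMA pool we have not tried. */
		for_each_node_mask(node, *nodemask) {
			if (node_isset(node, tried) || !hugetlb_cma[node])
				continue;
			dist = node_distance(nid, node);
			if (dist < best_dist) {
				best_dist = dist;
				best = node;
			}
		}
		if (best == NUMA_NO_NODE)
			return NULL;

		node_set(best, tried);
		page = cma_alloc(hugetlb_cma[best], nr_pages,
				 huge_page_order(h), true);
		if (page)
			return page;
	}
}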
 
> Fixes: cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
> v2->v3:
> Consider the current node as the preferred node if nid is NUMA_NO_NODE;
> thanks Mike.
> 
> v1->v2:
> Following review by Mike and Michal, check __GFP_THISNODE to avoid
> allocating from other nodes.
> 
>  mm/hugetlb.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a301c2d672bf..5957dc80ebb1 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1250,21 +1250,32 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
>  		int nid, nodemask_t *nodemask)
>  {
>  	unsigned long nr_pages = 1UL << huge_page_order(h);
> +	if (nid == NUMA_NO_NODE)
> +		nid = numa_mem_id();
>  
>  #ifdef CONFIG_CMA
>  	{
>  		struct page *page;
>  		int node;
>  
> -		for_each_node_mask(node, *nodemask) {
> -			if (!hugetlb_cma[node])
> -				continue;
> -
> -			page = cma_alloc(hugetlb_cma[node], nr_pages,
> -					 huge_page_order(h), true);
> +		if (hugetlb_cma[nid]) {
> +			page = cma_alloc(hugetlb_cma[nid], nr_pages,
> +					huge_page_order(h), true);
>  			if (page)
>  				return page;
>  		}
> +
> +		if (!(gfp_mask & __GFP_THISNODE)) {
> +			for_each_node_mask(node, *nodemask) {
> +				if (node == nid || !hugetlb_cma[node])
> +					continue;
> +
> +				page = cma_alloc(hugetlb_cma[node], nr_pages,
> +						huge_page_order(h), true);
> +				if (page)
> +					return page;
> +			}
> +		}
>  	}
>  #endif
>  
> -- 
> 2.18.4
Mike Kravetz Sept. 2, 2020, 6:07 p.m. UTC | #2
On 9/1/20 11:14 PM, Michal Hocko wrote:
> On Wed 02-09-20 10:50:16, Li Xinhai wrote:
>> Since commit cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic
>> hugepages using cma"), a gigantic page may be allocated from a node other
>> than the preferred node, even though pages are available on that node.
>> The reason is that the nid parameter is ignored in alloc_gigantic_page().
>>
>> In addition, __GFP_THISNODE must be honored when the user requires
>> allocation only from the preferred node.
>>
>> After this patch, the preferred node is tried before the other allowed
>> nodes, and no other node is tried when __GFP_THISNODE is specified. If
>> the user does not specify a preferred node, the current node is used,
>> which keeps the behavior of allocating gigantic and non-gigantic hugetlb
>> pages consistent.
> 
> Technically speaking this is still not in full sync with the allocator
> semantics. E.g. the CMA allocator should try nodes in node-distance
> order. Possible, but I am not sure how much we really care for those who
> preallocate CMA pools. Also, __GFP_NOWAIT should skip CMA as that path
> requires a mutex - or, even better, make cma_alloc gfp aware. Likely a
> few more things.
> 
> If we care, that is material for a patch or two of its own; it would
> also be worth making the CMA part a function of its own to remove the
> ugly ifdef.

Agreed.  There is plenty of room for improvement, which can be done in
subsequent changes.

>> Fixes: cf11e85fc08cc6a4 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
>> Cc: Roman Gushchin <guro@fb.com>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a301c2d672bf..5957dc80ebb1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1250,21 +1250,32 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 		int nid, nodemask_t *nodemask)
 {
 	unsigned long nr_pages = 1UL << huge_page_order(h);
+	if (nid == NUMA_NO_NODE)
+		nid = numa_mem_id();
 
 #ifdef CONFIG_CMA
 	{
 		struct page *page;
 		int node;
 
-		for_each_node_mask(node, *nodemask) {
-			if (!hugetlb_cma[node])
-				continue;
-
-			page = cma_alloc(hugetlb_cma[node], nr_pages,
-					 huge_page_order(h), true);
+		if (hugetlb_cma[nid]) {
+			page = cma_alloc(hugetlb_cma[nid], nr_pages,
+					huge_page_order(h), true);
 			if (page)
 				return page;
 		}
+
+		if (!(gfp_mask & __GFP_THISNODE)) {
+			for_each_node_mask(node, *nodemask) {
+				if (node == nid || !hugetlb_cma[node])
+					continue;
+
+				page = cma_alloc(hugetlb_cma[node], nr_pages,
+						huge_page_order(h), true);
+				if (page)
+					return page;
+			}
+		}
 	}
 #endif