Message ID | 20211021080744.874701-3-chenwandun@huawei.com (mailing list archive)
---|---
State | New
Series | fix numa spreading for large hash tables
On Thu, 21 Oct 2021 16:07:44 +0800 Chen Wandun <chenwandun@huawei.com> wrote:

> It

What is "it"?

> will cause significant performance regressions in some situations
> as Andrew mentioned in [1]. The main situation is vmalloc, vmalloc
> will allocate pages with NUMA_NO_NODE by default, that will result
> in alloc page one by one;
>
> In order to solve this, __alloc_pages_bulk and mempolicy should be
> considered at the same time.
>
> 1) If node is specified in memory allocation request, it will alloc
> all pages by __alloc_pages_bulk.
>
> 2) If interleaving allocate memory, it will cauculate how many pages
> should be allocated in each node, and use __alloc_pages_bulk to alloc
> pages in each node.

This v3 patch didn't incorporate my two fixes, below.  It is usual to
incorporate such fixes prior to resending.  I have retained those two
fixes, now against v3.


From: Andrew Morton <akpm@linux-foundation.org>
Subject: mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation-fix

make two functions static

Cc: Chen Wandun <chenwandun@huawei.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mempolicy.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/mempolicy.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation-fix
+++ a/mm/mempolicy.c
@@ -2196,7 +2196,7 @@ struct page *alloc_pages(gfp_t gfp, unsi
 }
 EXPORT_SYMBOL(alloc_pages);
 
-unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
 		struct mempolicy *pol, unsigned long nr_pages,
 		struct page **page_array)
 {
@@ -2231,7 +2231,7 @@ unsigned long alloc_pages_bulk_array_int
 	return total_allocated;
 }
 
-unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
 		struct mempolicy *pol, unsigned long nr_pages,
 		struct page **page_array)
 {
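The split described in point 2) of the quoted changelog can be illustrated with a small stand-alone program. This is only a sketch of the arithmetic, not kernel code: the node count and request size are made-up values, and "nodes" stands in for nodes_weight(pol->nodes) in the patch.

#include <stdio.h>

/*
 * Illustration only: nr_pages is divided evenly over the nodes of the
 * interleave policy, and the remainder ("delta") is handed out one extra
 * page at a time to the first nodes, as the patch does.
 */
int main(void)
{
	unsigned long nr_pages = 10;	/* pages requested by the caller (made up) */
	unsigned long nodes = 4;	/* nodes in the interleave policy (made up) */
	unsigned long per_node = nr_pages / nodes;		/* 2 */
	unsigned long delta = nr_pages - nodes * per_node;	/* 2 left over */
	unsigned long i;

	for (i = 0; i < nodes; i++)
		printf("node %lu gets %lu pages\n",
		       i, per_node + (i < delta ? 1 : 0));

	return 0;
}

With 10 pages over 4 nodes, the first two nodes receive 3 pages and the remaining two receive 2, which is how the delta counter in alloc_pages_bulk_array_interleave() distributes the remainder.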
On 2021/10/22 11:26, Andrew Morton wrote:
> On Thu, 21 Oct 2021 16:07:44 +0800 Chen Wandun <chenwandun@huawei.com> wrote:
>
>> It
>
> What is "it"?

it == [PATCH] mm/vmalloc: fix numa spreading for large hash tables;

>
>> will cause significant performance regressions in some situations
>> as Andrew mentioned in [1]. The main situation is vmalloc, vmalloc
>> will allocate pages with NUMA_NO_NODE by default, that will result
>> in alloc page one by one;
>>
>> In order to solve this, __alloc_pages_bulk and mempolicy should be
>> considered at the same time.
>>
>> 1) If node is specified in memory allocation request, it will alloc
>> all pages by __alloc_pages_bulk.
>>
>> 2) If interleaving allocate memory, it will cauculate how many pages
>> should be allocated in each node, and use __alloc_pages_bulk to alloc
>> pages in each node.
>
> This v3 patch didn't incorporate my two fixes, below.  It is usual to
> incorporate such fixes prior to resending.  I have retained those two
> fixes, now against v3.
>
>
> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation-fix
>
> make two functions static
>
> Cc: Chen Wandun <chenwandun@huawei.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Hanjun Guo <guohanjun@huawei.com>
> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Shakeel Butt <shakeelb@google.com>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  mm/mempolicy.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> --- a/mm/mempolicy.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation-fix
> +++ a/mm/mempolicy.c
> @@ -2196,7 +2196,7 @@ struct page *alloc_pages(gfp_t gfp, unsi
>  }
>  EXPORT_SYMBOL(alloc_pages);
>
> -unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
> +static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
>  		struct mempolicy *pol, unsigned long nr_pages,
>  		struct page **page_array)
>  {
> @@ -2231,7 +2231,7 @@ unsigned long alloc_pages_bulk_array_int
>  	return total_allocated;
>  }
>
> -unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
> +static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
>  		struct mempolicy *pol, unsigned long nr_pages,
>  		struct page **page_array)
>  {
> _
>
>
>
>
> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation-fix-2
>
> fix CONFIG_NUMA=n build. alloc_pages_bulk_array_mempolicy() was undefined
>
> Cc: Chen Wandun <chenwandun@huawei.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Hanjun Guo <guohanjun@huawei.com>
> Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  mm/vmalloc.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/mm/vmalloc.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation-fix-2
> +++ a/mm/vmalloc.c
> @@ -2860,7 +2860,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
>  	 * otherwise memory may be allocated in only one node,
>  	 * but mempolcy want to alloc memory by interleaving.
>  	 */
> -	if (nid == NUMA_NO_NODE)
> +	if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
>  		nr = alloc_pages_bulk_array_mempolicy(gfp,
>  				nr_pages_request,
>  				pages + nr_allocated);
> _
>
> .
>
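The IS_ENABLED(CONFIG_NUMA) guard in fix-2 works because the condition folds to a compile-time constant on CONFIG_NUMA=n builds: the branch that calls alloc_pages_bulk_array_mempolicy() is discarded, so its missing definition never becomes an undefined reference at link time. Below is a stand-alone sketch of that pattern with invented names (CONFIG_FEATURE and feature_helper() are not kernel symbols); it assumes an optimizing build, which is what the kernel's dead-code-elimination idiom relies on.

#include <stdio.h>

/*
 * Illustration of the "declared always, defined only when the option is on"
 * pattern.  With the macro fixed to 0, the call below is statically dead
 * and an optimizing compiler drops it, so no reference to the (absent)
 * definition of feature_helper() reaches the linker.
 */
#define CONFIG_FEATURE 0		/* pretend the option is disabled */

unsigned long feature_helper(void);	/* declared here, defined nowhere in this file */

int main(void)
{
	unsigned long n = 0;

	if (CONFIG_FEATURE)		/* plays the role of IS_ENABLED(CONFIG_NUMA) */
		n = feature_helper();	/* eliminated as dead code */

	printf("%lu\n", n);
	return 0;
}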
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 55b2ec1f965a..cd98c858fc74 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -535,6 +535,10 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 				struct list_head *page_list,
 				struct page **page_array);
 
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+				unsigned long nr_pages,
+				struct page **page_array);
+
 /* Bulk allocate order-0 pages */
 static inline unsigned long
 alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1592b081c58e..56bb1fe4d179 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2202,6 +2202,88 @@ struct page *alloc_pages(gfp_t gfp, unsigned order)
 }
 EXPORT_SYMBOL(alloc_pages);
 
+unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	int nodes;
+	unsigned long nr_pages_per_node;
+	int delta;
+	int i;
+	unsigned long nr_allocated;
+	unsigned long total_allocated = 0;
+
+	nodes = nodes_weight(pol->nodes);
+	nr_pages_per_node = nr_pages / nodes;
+	delta = nr_pages - nodes * nr_pages_per_node;
+
+	for (i = 0; i < nodes; i++) {
+		if (delta) {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node + 1, NULL,
+					page_array);
+			delta--;
+		} else {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node, NULL, page_array);
+		}
+
+		page_array += nr_allocated;
+		total_allocated += nr_allocated;
+	}
+
+	return total_allocated;
+}
+
+unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	gfp_t preferred_gfp;
+	unsigned long nr_allocated = 0;
+
+	preferred_gfp = gfp | __GFP_NOWARN;
+	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
+
+	nr_allocated = __alloc_pages_bulk(preferred_gfp, nid, &pol->nodes,
+					  nr_pages, NULL, page_array);
+
+	if (nr_allocated < nr_pages)
+		nr_allocated += __alloc_pages_bulk(gfp, numa_node_id(), NULL,
+				nr_pages - nr_allocated, NULL,
+				page_array + nr_allocated);
+	return nr_allocated;
+}
+
+/* alloc pages bulk and mempolicy should be considered at the
+ * same time in some situation such as vmalloc.
+ *
+ * It can accelerate memory allocation especially interleaving
+ * allocate memory.
+ */
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+		unsigned long nr_pages, struct page **page_array)
+{
+	struct mempolicy *pol = &default_policy;
+
+	if (!in_interrupt() && !(gfp & __GFP_THISNODE))
+		pol = get_task_policy(current);
+
+	if (pol->mode == MPOL_INTERLEAVE)
+		return alloc_pages_bulk_array_interleave(gfp, pol,
+							 nr_pages, page_array);
+
+	if (pol->mode == MPOL_PREFERRED_MANY)
+		return alloc_pages_bulk_array_preferred_many(gfp,
+				numa_node_id(), pol, nr_pages, page_array);
+
+	return __alloc_pages_bulk(gfp, policy_node(gfp, pol, numa_node_id()),
+				  policy_nodemask(gfp, pol), nr_pages, NULL,
+				  page_array);
+}
+
 int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
 {
 	struct mempolicy *pol = mpol_dup(vma_policy(src));
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e8a807c78110..c3ab25d408dd 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2825,7 +2825,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	 * to fails, fallback to a single page allocator that is
 	 * more permissive.
 	 */
-	if (!order && nid != NUMA_NO_NODE) {
+	if (!order) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
 
@@ -2837,8 +2837,20 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			 */
 			nr_pages_request = min(100U, nr_pages - nr_allocated);
 
-			nr = alloc_pages_bulk_array_node(gfp, nid,
-				nr_pages_request, pages + nr_allocated);
+			/* memory allocation should consider mempolicy, we cant
+			 * wrongly use nearest node when nid == NUMA_NO_NODE,
+			 * otherwise memory may be allocated in only one node,
+			 * but mempolcy want to alloc memory by interleaving.
+			 */
+			if (nid == NUMA_NO_NODE)
+				nr = alloc_pages_bulk_array_mempolicy(gfp,
+							nr_pages_request,
+							pages + nr_allocated);
+
+			else
+				nr = alloc_pages_bulk_array_node(gfp, nid,
+						nr_pages_request,
+						pages + nr_allocated);
 
 			nr_allocated += nr;
 			cond_resched();
@@ -2850,7 +2862,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
-	} else if (order)
+	} else
 		/*
 		 * Compound pages required for remap_vmalloc_page if
 		 * high-order pages.
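As a usage illustration, a caller with no node preference could fill a page array through the new helper in bounded chunks, much like the vm_area_alloc_pages() loop above. This is a hedged kernel-style sketch, not part of the patch; fill_page_array() is an invented name and the 100-page cap simply mirrors the patch.

/*
 * Hypothetical caller sketch (kernel-style, not from the patch): fill a page
 * array through the mempolicy-aware bulk helper, the way vm_area_alloc_pages()
 * does for nid == NUMA_NO_NODE, and report how many pages were obtained so the
 * caller can fall back to single-page allocation for the rest.
 */
static unsigned long fill_page_array(gfp_t gfp, struct page **pages,
				     unsigned long nr_pages)
{
	unsigned long nr_allocated = 0;

	while (nr_allocated < nr_pages) {
		/* bounded request, mirroring the min(100U, ...) in the patch */
		unsigned long nr_request = min(100UL, nr_pages - nr_allocated);
		unsigned long nr;

		nr = alloc_pages_bulk_array_mempolicy(gfp, nr_request,
						      pages + nr_allocated);
		nr_allocated += nr;
		cond_resched();

		/* the bulk path ran dry; stop and let the caller fall back */
		if (nr != nr_request)
			break;
	}

	return nr_allocated;
}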