Message ID: 20230905103508.2996474-1-wangkefeng.wang@huawei.com
State: New
Series: [resend] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()
> On Sep 5, 2023, at 18:35, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> to use it to accelerate page allocation.
>
> Simple test on arm64's qemu with 1G Hugetlb, 870,842ns vs 3,845,252ns,
> even if there is a certain fluctuation, it is still a nice improvement.
>
> Tested-by: Yuan Can <yuancan@huawei.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.
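(For reference, the 4095/7 figures follow from the vmemmap geometry, assuming 4 KiB base
pages and a 64-byte struct page: a 1 GiB hugepage covers 262144 base pages, whose struct
pages fill 16 MiB, i.e. 4096 vmemmap pages, one of which is kept by the vmemmap
optimization, leaving 4095 to be reallocated here; a 2 MiB hugepage covers 512 base pages,
i.e. 8 vmemmap pages, of which 7 need to be reallocated.)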
On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> to use it to accelerate page allocation.

Argh, no, please don't do this.

Iterating a linked list is _expensive_. It is about 10x quicker to
iterate an array than a linked list. Adding the list_head option
to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.

These pages are going into an array anyway. Don't put them on a list
first.
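The ratio quoted above comes from cache behaviour rather than anything specific to
struct page: an array walk is a stream of independent, prefetchable loads, while a list
walk is a chain of dependent pointer loads. A crude userspace sketch of the comparison
(illustrative only, not kernel code; the exact ratio depends on how scattered the nodes
are and on the CPU):

/*
 * Walk the same N values once through a plain array and once through a
 * singly linked list of individually allocated nodes.
 * Build with e.g.: gcc -O2 -o walk walk.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

struct node {
        struct node *next;
        long val;
};

#define N (1 << 22)

static long long ns_since(const struct timespec *t0)
{
        struct timespec t1;

        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0->tv_sec) * 1000000000LL + (t1.tv_nsec - t0->tv_nsec);
}

int main(void)
{
        long *array = malloc(N * sizeof(*array));
        struct node *head = NULL;
        struct timespec t0;
        long long sum = 0;

        /* Store the same values twice: contiguously and as list nodes. */
        for (long i = 0; i < N; i++) {
                struct node *n = malloc(sizeof(*n));

                array[i] = i;
                n->val = i;
                n->next = head;
                head = n;
        }

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < N; i++)
                sum += array[i];
        printf("array walk: %lld ns\n", ns_since(&t0));

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (struct node *n = head; n; n = n->next)
                sum += n->val;
        printf("list walk:  %lld ns\n", ns_since(&t0));

        /* Print the sum so the loops cannot be optimised away. */
        printf("checksum: %lld\n", sum);
        return 0;
}

Once the working set no longer fits in cache the array walk is usually several times
faster, and shuffling the node allocations widens the gap further.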
On 2023/9/6 10:47, Matthew Wilcox wrote:
> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>> to use it to accelerate page allocation.
>
> Argh, no, please don't do this.
>
> Iterating a linked list is _expensive_. It is about 10x quicker to
> iterate an array than a linked list. Adding the list_head option
> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>
> These pages are going into an array anyway. Don't put them on a list
> first.

struct vmemmap_remap_walk - walk vmemmap page table

 * @vmemmap_pages: the list head of the vmemmap pages that can be freed
 *                 or is mapped from.

At present, the struct vmemmap_remap_walk use a list for vmemmap page table
walk, so do you mean we need change vmemmap_pages from a list to a array
firstly and then use array bulk api, even kill list bulk api ?
On Wed, Sep 06, 2023 at 11:13:27AM +0800, Kefeng Wang wrote:
> On 2023/9/6 10:47, Matthew Wilcox wrote:
> > On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
> > > It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> > > alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> > > alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> > > to use it to accelerate page allocation.
> >
> > Argh, no, please don't do this.
> >
> > Iterating a linked list is _expensive_. It is about 10x quicker to
> > iterate an array than a linked list. Adding the list_head option
> > to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
> >
> > These pages are going into an array anyway. Don't put them on a list
> > first.
>
> struct vmemmap_remap_walk - walk vmemmap page table
>
> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
> * or is mapped from.
>
> At present, the struct vmemmap_remap_walk use a list for vmemmap page table
> walk, so do you mean we need change vmemmap_pages from a list to a array
> firstly and then use array bulk api, even kill list bulk api ?

That would be better, yes.
> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>
>
> On 2023/9/6 10:47, Matthew Wilcox wrote:
>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>> to use it to accelerate page allocation.
>> Argh, no, please don't do this.
>> Iterating a linked list is _expensive_. It is about 10x quicker to
>> iterate an array than a linked list. Adding the list_head option
>> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>> These pages are going into an array anyway. Don't put them on a list
>> first.
>
> struct vmemmap_remap_walk - walk vmemmap page table
>
> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
> * or is mapped from.
>
> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?

It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
On 2023/9/6 11:25, Muchun Song wrote:
>
>
>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>>
>>
>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>> to use it to accelerate page allocation.
>>> Argh, no, please don't do this.
>>> Iterating a linked list is _expensive_. It is about 10x quicker to
>>> iterate an array than a linked list. Adding the list_head option
>>> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>>> These pages are going into an array anyway. Don't put them on a list
>>> first.
>>
>> struct vmemmap_remap_walk - walk vmemmap page table
>>
>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>> * or is mapped from.
>>
>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>
> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?

We could use alloc_pages_bulk_array_node() here without introduce a new
alloc_pages_bulk_list_node(), only focus on accelerate page allocation
for now.
On 2023/9/6 11:14, Matthew Wilcox wrote:
> On Wed, Sep 06, 2023 at 11:13:27AM +0800, Kefeng Wang wrote:
>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>> to use it to accelerate page allocation.
>>>
>>> Argh, no, please don't do this.
>>>
>>> Iterating a linked list is _expensive_. It is about 10x quicker to
>>> iterate an array than a linked list. Adding the list_head option
>>> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>>>
>>> These pages are going into an array anyway. Don't put them on a list
>>> first.
>>
>> struct vmemmap_remap_walk - walk vmemmap page table
>>
>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>> * or is mapped from.
>>
>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table
>> walk, so do you mean we need change vmemmap_pages from a list to a array
>> firstly and then use array bulk api, even kill list bulk api ?
>
> That would be better, yes.

Maybe not quick to convert vmemmap_remap_walk to use array, will use page
array alloc bulk api firstly and won't introduce a new
alloc_pages_bulk_list_node(), thanks.
> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>
>
> On 2023/9/6 11:25, Muchun Song wrote:
>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>
>>>
>>>
>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>>> to use it to accelerate page allocation.
>>>> Argh, no, please don't do this.
>>>> Iterating a linked list is _expensive_. It is about 10x quicker to
>>>> iterate an array than a linked list. Adding the list_head option
>>>> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>>>> These pages are going into an array anyway. Don't put them on a list
>>>> first.
>>>
>>> struct vmemmap_remap_walk - walk vmemmap page table
>>>
>>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>>> * or is mapped from.
>>>
>>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
>> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
>
>
> We could use alloc_pages_bulk_array_node() here without introduce a new
> alloc_pages_bulk_list_node(), only focus on accelerate page allocation
> for now.
>

No. Using alloc_pages_bulk_array_node() will add more complexity (you need to allocate
an array fist) for hugetlb_vmemap and this path that you optimized is only a control
path and this optimization is at the millisecond level. So I don't think it is a great
value to do this.

Thanks.
On 2023/9/6 22:32, Muchun Song wrote:
>
>
>> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>>
>>
>> On 2023/9/6 11:25, Muchun Song wrote:
>>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>>>> to use it to accelerate page allocation.
>>>>> Argh, no, please don't do this.
>>>>> Iterating a linked list is _expensive_. It is about 10x quicker to
>>>>> iterate an array than a linked list. Adding the list_head option
>>>>> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>>>>> These pages are going into an array anyway. Don't put them on a list
>>>>> first.
>>>>
>>>> struct vmemmap_remap_walk - walk vmemmap page table
>>>>
>>>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>>>> * or is mapped from.
>>>>
>>>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>>> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
>>> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
>>
>>
>> We could use alloc_pages_bulk_array_node() here without introduce a new
>> alloc_pages_bulk_list_node(), only focus on accelerate page allocation
>> for now.
>>
> No. Using alloc_pages_bulk_array_node() will add more complexity (you need to allocate
> an array fist) for hugetlb_vmemap and this path that you optimized is only a control
> path and this optimization is at the millisecond level. So I don't think it is a great
> value to do this.
>

I tried it, yes, a little complex,

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 4b9734777f69..5f502e18f950 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -377,26 +377,53 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
     return ret;
 }
 
+static int vmemmap_bulk_alloc_pages(gfp_t gfp, int nid, unsigned int nr_pages,
+                                    struct page **pages)
+{
+    unsigned int last, allocated = 0;
+
+    do {
+        last = allocated;
+
+        allocated = alloc_pages_bulk_array_node(gfp, nid, nr_pages, pages);
+        if (allocated == last)
+            goto err;
+
+    } while (allocated < nr_pages)
+
+    return 0;
+err:
+    for (allocated = 0; allocated < nr_pages; allocated++) {
+        if (pages[allocated])
+            __free_page(pages[allocated]);
+    }
+
+    return -ENOMEM;
+}
+
 static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
                                    struct list_head *list)
 {
     gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
     unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
     int nid = page_to_nid((struct page *)start);
-    struct page *page, *next;
+    struct page **pages;
+    int ret = -ENOMEM;
+
+    pages = kzalloc(array_size(nr_pages, sizeof(struct page *)), gfp_mask);
+    if (!pages)
+        return ret;
+
+    ret = vmemmap_bulk_alloc_pages(gfp_mask, nid, nr_pages, pages);
+    if (ret)
+        goto out;
 
     while (nr_pages--) {
-        page = alloc_pages_node(nid, gfp_mask, 0);
-        if (!page)
-            goto out;
-        list_add_tail(&page->lru, list);
+        list_add_tail(&pages[nr_pages]->lru, list);
     }
-
-    return 0;
 out:
-    list_for_each_entry_safe(page, next, list, lru)
-        __free_page(page);
-    return -ENOMEM;
+    kfree(pages);
+    return ret;
 }

or just use __alloc_pages_bulk in it, but as Matthew said, we should
avoid list usage, list api need to be cleanup and no one should use it,
or no change, since it is not a hot path :)
> Thanks.
>
> On Sep 6, 2023, at 22:58, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>
>
> On 2023/9/6 22:32, Muchun Song wrote:
>>> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>
>>>
>>>
>>> On 2023/9/6 11:25, Muchun Song wrote:
>>>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>>>>> to use it to accelerate page allocation.
>>>>>> Argh, no, please don't do this.
>>>>>> Iterating a linked list is _expensive_. It is about 10x quicker to
>>>>>> iterate an array than a linked list. Adding the list_head option
>>>>>> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>>>>>> These pages are going into an array anyway. Don't put them on a list
>>>>>> first.
>>>>>
>>>>> struct vmemmap_remap_walk - walk vmemmap page table
>>>>>
>>>>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>>>>> * or is mapped from.
>>>>>
>>>>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>>>> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
>>>> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
>>>
>>>
>>> We could use alloc_pages_bulk_array_node() here without introduce a new
>>> alloc_pages_bulk_list_node(), only focus on accelerate page allocation
>>> for now.
>>>
>> No. Using alloc_pages_bulk_array_node() will add more complexity (you need to allocate
>> an array fist) for hugetlb_vmemap and this path that you optimized is only a control
>> path and this optimization is at the millisecond level. So I don't think it is a great
>> value to do this.
>
> I tried it, yes, a little complex,
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 4b9734777f69..5f502e18f950 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -377,26 +377,53 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>      return ret;
>  }
> 
> +static int vmemmap_bulk_alloc_pages(gfp_t gfp, int nid, unsigned int nr_pages,
> +                                    struct page **pages)
> +{
> +    unsigned int last, allocated = 0;
> +
> +    do {
> +        last = allocated;
> +
> +        allocated = alloc_pages_bulk_array_node(gfp, nid, nr_pages, pages);
> +        if (allocated == last)
> +            goto err;
> +
> +    } while (allocated < nr_pages)
> +
> +    return 0;
> +err:
> +    for (allocated = 0; allocated < nr_pages; allocated++) {
> +        if (pages[allocated])
> +            __free_page(pages[allocated]);
> +    }
> +
> +    return -ENOMEM;
> +}
> +
>  static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>                                     struct list_head *list)
>  {
>      gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>      unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>      int nid = page_to_nid((struct page *)start);
> -    struct page *page, *next;
> +    struct page **pages;
> +    int ret = -ENOMEM;
> +
> +    pages = kzalloc(array_size(nr_pages, sizeof(struct page *)), gfp_mask);
> +    if (!pages)
> +        return ret;
> +
> +    ret = vmemmap_bulk_alloc_pages(gfp_mask, nid, nr_pages, pages);
> +    if (ret)
> +        goto out;
> 
>      while (nr_pages--) {
> -        page = alloc_pages_node(nid, gfp_mask, 0);
> -        if (!page)
> -            goto out;
> -        list_add_tail(&page->lru, list);
> +        list_add_tail(&pages[nr_pages]->lru, list);
>      }
> -
> -    return 0;
>  out:
> -    list_for_each_entry_safe(page, next, list, lru)
> -        __free_page(page);
> -    return -ENOMEM;
> +    kfree(pages);
> +    return ret;
>  }
> 
> or just use __alloc_pages_bulk in it, but as Matthew said, we should
> avoid list usage, list api need to be cleanup and no one should use it,
> or no change, since it is not a hot path :)
> Thanks.

Let's keep it no change.

Thanks.
On Wed, Sep 06, 2023 at 03:47:42AM +0100, Matthew Wilcox wrote:
> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
> > It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> > alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> > alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> > to use it to accelerate page allocation.
>
> Argh, no, please don't do this.
>
> Iterating a linked list is _expensive_. It is about 10x quicker to
> iterate an array than a linked list. Adding the list_head option
> to __alloc_pages_bulk() was a colossal mistake. Don't perpetuate it.
>
> These pages are going into an array anyway. Don't put them on a list
> first.
>

I don't have access to my test infrastructure at the moment so this isn't
even properly build tested but I think it's time for something like;

--8<--
mm: page_alloc: Remove list interface for bulk page allocation

Commit 387ba26fb1cb ("mm/page_alloc: add a bulk page allocator") introduced
an API for bulk allocating pages stored on a list. Later an array interface
was introduced as it was faster for the initial use cases. For two years,
the array option continues to be superior for each use case and any attempt
to us the list interface ultimately used arrays. Remove the list-based API
for bulk page allocation and rename the API.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/gfp.h   | 17 +++++------------
 mm/mempolicy.c        | 21 ++++++++++-----------
 mm/page_alloc.c       | 24 ++++++------------------
 mm/vmalloc.c          |  4 ++--
 net/core/page_pool.c  |  4 ++--
 net/sunrpc/svc.c      |  2 +-
 net/sunrpc/svc_xprt.c |  2 +-
 7 files changed, 27 insertions(+), 47 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 665f06675c83..d81eb255718a 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -181,33 +181,26 @@ struct folio *__folio_alloc(gfp_t gfp, unsigned int order, int preferred_nid,
 
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
                                 nodemask_t *nodemask, int nr_pages,
-                                struct list_head *page_list,
                                 struct page **page_array);
 
-unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+unsigned long alloc_pages_bulk_mempolicy(gfp_t gfp,
                                 unsigned long nr_pages,
                                 struct page **page_array);
 
 /* Bulk allocate order-0 pages */
 static inline unsigned long
-alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
+alloc_pages_bulk(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
 {
-    return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
+    return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, page_array);
 }
 
 static inline unsigned long
-alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
-{
-    return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, NULL, page_array);
-}
-
-static inline unsigned long
-alloc_pages_bulk_array_node(gfp_t gfp, int nid, unsigned long nr_pages, struct page **page_array)
+alloc_pages_bulk_node(gfp_t gfp, int nid, unsigned long nr_pages, struct page **page_array)
 {
     if (nid == NUMA_NO_NODE)
         nid = numa_mem_id();
 
-    return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, NULL, page_array);
+    return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, page_array);
 }
 
 static inline void warn_if_node_offline(int this_node, gfp_t gfp_mask)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 29ebf1e7898c..2a763df3e1ee 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2315,7 +2315,7 @@ struct folio *folio_alloc(gfp_t gfp, unsigned order)
 }
 EXPORT_SYMBOL(folio_alloc);
 
-static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+static unsigned long alloc_pages_bulk_interleave(gfp_t gfp,
         struct mempolicy *pol, unsigned long nr_pages,
         struct page **page_array)
 {
@@ -2334,13 +2334,12 @@ static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
 
         if (delta) {
             nr_allocated = __alloc_pages_bulk(gfp,
                     interleave_nodes(pol), NULL,
-                    nr_pages_per_node + 1, NULL,
-                    page_array);
+                    nr_pages_per_node + 1, page_array);
             delta--;
         } else {
             nr_allocated = __alloc_pages_bulk(gfp,
                     interleave_nodes(pol), NULL,
-                    nr_pages_per_node, NULL, page_array);
+                    nr_pages_per_node, page_array);
         }
 
         page_array += nr_allocated;
@@ -2350,7 +2349,7 @@ static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
     return total_allocated;
 }
 
-static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+static unsigned long alloc_pages_bulk_preferred_many(gfp_t gfp, int nid,
         struct mempolicy *pol, unsigned long nr_pages,
         struct page **page_array)
 {
@@ -2361,11 +2360,11 @@ static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
 
     preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
     nr_allocated  = __alloc_pages_bulk(preferred_gfp, nid, &pol->nodes,
-                       nr_pages, NULL, page_array);
+                       nr_pages, page_array);
 
     if (nr_allocated < nr_pages)
         nr_allocated += __alloc_pages_bulk(gfp, numa_node_id(), NULL,
-                nr_pages - nr_allocated, NULL,
+                nr_pages - nr_allocated,
                 page_array + nr_allocated);
     return nr_allocated;
 }
@@ -2376,7 +2375,7 @@ static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
  * It can accelerate memory allocation especially interleaving
  * allocate memory.
  */
-unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+unsigned long alloc_pages_bulk_mempolicy(gfp_t gfp,
         unsigned long nr_pages, struct page **page_array)
 {
     struct mempolicy *pol = &default_policy;
@@ -2385,15 +2384,15 @@ unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
         pol = get_task_policy(current);
 
     if (pol->mode == MPOL_INTERLEAVE)
-        return alloc_pages_bulk_array_interleave(gfp, pol,
+        return alloc_pages_bulk_interleave(gfp, pol,
                              nr_pages, page_array);
 
     if (pol->mode == MPOL_PREFERRED_MANY)
-        return alloc_pages_bulk_array_preferred_many(gfp,
+        return alloc_pages_bulk_preferred_many(gfp,
                 numa_node_id(), pol, nr_pages, page_array);
 
     return __alloc_pages_bulk(gfp, policy_node(gfp, pol, numa_node_id()),
-                  policy_nodemask(gfp, pol), nr_pages, NULL,
+                  policy_nodemask(gfp, pol), nr_pages,
                   page_array);
 }
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 85741403948f..8f035304fbf2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4221,23 +4221,20 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
  * @preferred_nid: The preferred NUMA node ID to allocate from
  * @nodemask: Set of nodes to allocate from, may be NULL
  * @nr_pages: The number of pages desired on the list or array
- * @page_list: Optional list to store the allocated pages
- * @page_array: Optional array to store the pages
+ * @page_array: Array array to store the pages
  *
  * This is a batched version of the page allocator that attempts to
- * allocate nr_pages quickly. Pages are added to page_list if page_list
- * is not NULL, otherwise it is assumed that the page_array is valid.
+ * allocate nr_pages quickly with pointers stored in page_array.
 *
 * For lists, nr_pages is the number of pages that should be allocated.
 *
 * For arrays, only NULL elements are populated with pages and nr_pages
 * is the maximum number of pages that will be stored in the array.
 *
- * Returns the number of pages on the list or array.
+ * Returns the number of pages in the array.
 */
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
             nodemask_t *nodemask, int nr_pages,
-            struct list_head *page_list,
             struct page **page_array)
 {
     struct page *page;
@@ -4351,11 +4348,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
             nr_account++;
 
             prep_new_page(page, 0, gfp, 0);
-            if (page_list)
-                list_add(&page->lru, page_list);
-            else
-                page_array[nr_populated] = page;
-            nr_populated++;
+            page_array[nr_populated++] = page;
         }
 
         pcp_spin_unlock(pcp);
@@ -4372,13 +4365,8 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 failed:
     page = __alloc_pages(gfp, 0, preferred_nid, nodemask);
-    if (page) {
-        if (page_list)
-            list_add(&page->lru, page_list);
-        else
-            page_array[nr_populated] = page;
-        nr_populated++;
-    }
+    if (page)
+        page_array[nr_populated++] = page;
 
     goto out;
 }
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a3fedb3ee0db..46d49e04039e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3025,12 +3025,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
              * but mempolicy wants to alloc memory by interleaving.
              */
            if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
-               nr = alloc_pages_bulk_array_mempolicy(bulk_gfp,
+               nr = alloc_pages_bulk_mempolicy(bulk_gfp,
                         nr_pages_request,
                         pages + nr_allocated);
 
            else
-               nr = alloc_pages_bulk_array_node(bulk_gfp, nid,
+               nr = alloc_pages_bulk_node(bulk_gfp, nid,
                         nr_pages_request,
                         pages + nr_allocated);
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 77cb75e63aca..9cff3e4350ee 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -426,10 +426,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
     if (unlikely(pool->alloc.count > 0))
         return pool->alloc.cache[--pool->alloc.count];
 
-    /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
+    /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk */
     memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
 
-    nr_pages = alloc_pages_bulk_array_node(gfp, pool->p.nid, bulk,
+    nr_pages = alloc_pages_bulk_node(gfp, pool->p.nid, bulk,
                        pool->alloc.cache);
     if (unlikely(!nr_pages))
         return NULL;
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 812fda9d45dd..afce749a0871 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -613,7 +613,7 @@ svc_init_buffer(struct svc_rqst *rqstp, unsigned int size, int node)
     if (pages > RPCSVC_MAXPAGES)
         pages = RPCSVC_MAXPAGES;
 
-    ret = alloc_pages_bulk_array_node(GFP_KERNEL, node, pages,
+    ret = alloc_pages_bulk_node(GFP_KERNEL, node, pages,
                       rqstp->rq_pages);
     return ret == pages;
 }
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 4cfe9640df48..92c6eae6610c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -666,7 +666,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
     }
 
     for (filled = 0; filled < pages; filled = ret) {
-        ret = alloc_pages_bulk_array_node(GFP_KERNEL,
+        ret = alloc_pages_bulk_node(GFP_KERNEL,
                           rqstp->rq_pool->sp_id,
                           pages, rqstp->rq_pages);
         if (ret > filled)
Hi Mel,

kernel test robot noticed the following build errors:

[auto build test ERROR on trondmy-nfs/linux-next]
[also build test ERROR on linus/master v6.6-rc7]
[cannot apply to akpm-mm/mm-everything next-20231025]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mel-Gorman/Re-PATCH-resend-mm-hugetlb_vmemmap-use-bulk-allocator-in-alloc_vmemmap_page_list/20231025-173425
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
patch link:    https://lore.kernel.org/r/20231025093254.xvomlctwhcuerzky%40techsingularity.net
patch subject: Re: [PATCH resend] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()
config: loongarch-randconfig-002-20231025 (https://download.01.org/0day-ci/archive/20231025/202310252149.qzjGG49d-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231025/202310252149.qzjGG49d-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310252149.qzjGG49d-lkp@intel.com/

All errors (new ones prefixed by >>):

   fs/xfs/xfs_buf.c: In function 'xfs_buf_alloc_pages':
>> fs/xfs/xfs_buf.c:391:26: error: implicit declaration of function 'alloc_pages_bulk_array'; did you mean 'alloc_pages_bulk_node'? [-Werror=implicit-function-declaration]
     391 |                 filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
         |                          ^~~~~~~~~~~~~~~~~~~~~~
         |                          alloc_pages_bulk_node
   cc1: some warnings being treated as errors
--
   fs/btrfs/extent_io.c: In function 'btrfs_alloc_page_array':
>> fs/btrfs/extent_io.c:688:29: error: implicit declaration of function 'alloc_pages_bulk_array'; did you mean 'alloc_pages_bulk_node'? [-Werror=implicit-function-declaration]
     688 |                 allocated = alloc_pages_bulk_array(GFP_NOFS, nr_pages, page_array);
         |                             ^~~~~~~~~~~~~~~~~~~~~~
         |                             alloc_pages_bulk_node
   cc1: some warnings being treated as errors

vim +391 fs/xfs/xfs_buf.c

0e6e847ffe3743 fs/xfs/linux-2.6/xfs_buf.c Dave Chinner      2011-03-26  353
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  354  static int
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  355  xfs_buf_alloc_pages(
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  356      struct xfs_buf *bp,
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  357      xfs_buf_flags_t flags)
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  358  {
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  359      gfp_t gfp_mask = __GFP_NOWARN;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  360      long filled = 0;
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  361
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  362      if (flags & XBF_READ_AHEAD)
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  363          gfp_mask |= __GFP_NORETRY;
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  364      else
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  365          gfp_mask |= GFP_NOFS;
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  366
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  367      /* Make sure that we have a page list */
934d1076bb2c5b fs/xfs/xfs_buf.c           Christoph Hellwig 2021-06-07  368      bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE);
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  369      if (bp->b_page_count <= XB_PAGES) {
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  370          bp->b_pages = bp->b_page_array;
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  371      } else {
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  372          bp->b_pages = kzalloc(sizeof(struct page *) * bp->b_page_count,
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  373                  gfp_mask);
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  374          if (!bp->b_pages)
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  375              return -ENOMEM;
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  376      }
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  377      bp->b_flags |= _XBF_PAGES;
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  378
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  379      /* Assure zeroed buffer for non-read cases. */
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  380      if (!(flags & XBF_READ))
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  381          gfp_mask |= __GFP_ZERO;
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  382
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  383      /*
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  384       * Bulk filling of pages can take multiple calls. Not filling the entire
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  385       * array is not an allocation failure, so don't back off if we get at
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  386       * least one extra page.
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  387       */
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  388      for (;;) {
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  389          long last = filled;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  390
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01 @391          filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  392                          bp->b_pages);
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  393          if (filled == bp->b_page_count) {
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  394              XFS_STATS_INC(bp->b_mount, xb_page_found);
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  395              break;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  396          }
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  397
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  398          if (filled != last)
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  399              continue;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  400
ce8e922c0e79c8 fs/xfs/linux-2.6/xfs_buf.c Nathan Scott      2006-01-11  401          if (flags & XBF_READ_AHEAD) {
e7d236a6fe5102 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  402              xfs_buf_free_pages(bp);
e7d236a6fe5102 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  403              return -ENOMEM;
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  404          }
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  405
dbd329f1e44ed4 fs/xfs/xfs_buf.c           Christoph Hellwig 2019-06-28  406          XFS_STATS_INC(bp->b_mount, xb_page_retries);
4034247a0d6ab2 fs/xfs/xfs_buf.c           NeilBrown         2022-01-14  407          memalloc_retry_wait(gfp_mask);
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  408      }
0e6e847ffe3743 fs/xfs/linux-2.6/xfs_buf.c Dave Chinner      2011-03-26  409      return 0;
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  410  }
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  411
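A side note on the two reports: the compiler's fix-it ("did you mean
'alloc_pages_bulk_node'?") is based purely on name similarity, and that variant needs an
extra nid argument. Under the renaming in the patch above, the drop-in replacement for
these callers would be the array-taking alloc_pages_bulk(), which keeps the same argument
list. A hypothetical fix-up for the xfs call site, sketched here rather than taken from
any posted patch, would look like:

-		filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
-						bp->b_pages);
+		filled = alloc_pages_bulk(gfp_mask, bp->b_page_count,
+					  bp->b_pages);

with the same one-line rename needed in fs/btrfs/extent_io.c and any other remaining
alloc_pages_bulk_array() user.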
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 665f06675c83..d6e82f15b61f 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -195,6 +195,15 @@ alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
     return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
 }
 
+static inline unsigned long
+alloc_pages_bulk_list_node(gfp_t gfp, int nid, unsigned long nr_pages, struct list_head *list)
+{
+    if (nid == NUMA_NO_NODE)
+        nid = numa_mem_id();
+
+    return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, list, NULL);
+}
+
 static inline unsigned long
 alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
 {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 4b9734777f69..786e581703c7 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -384,7 +384,13 @@ static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
     unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
     int nid = page_to_nid((struct page *)start);
     struct page *page, *next;
+    unsigned long nr_allocated;
 
+    nr_allocated = alloc_pages_bulk_list_node(gfp_mask, nid, nr_pages, list);
+    if (!nr_allocated)
+        return -ENOMEM;
+
+    nr_pages -= nr_allocated;
     while (nr_pages--) {
         page = alloc_pages_node(nid, gfp_mask, 0);
         if (!page)