[v5,2/3] mm/compaction: add support for >0 order folio memory compaction.

Message ID 20240214220420.1229173-3-zi.yan@sent.com (mailing list archive)
State New
Series [v5,1/3] mm/compaction: enable compacting >0 order folios.

Commit Message

Zi Yan Feb. 14, 2024, 10:04 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

Before the last commit, memory compaction only migrated order-0 folios and
skipped >0 order folios.  The last commit splits all >0 order folios during
compaction.  This commit migrates >0 order folios during compaction by
keeping isolated free pages at their original size, without splitting them
into order-0 pages, and using them directly during the migration process.

What is different from the prior implementation:
1. All isolated free pages are kept in an NR_PAGE_ORDERS array of page
   lists, where each page list stores free pages of the same order.
2. Free pages are neither post_alloc_hook() processed nor buddy pages,
   although their orders are stored in the first page's private field,
   as for buddy pages.
3. During migration, at new page allocation time (i.e., in
   compaction_alloc()), free pages are processed by post_alloc_hook().
   When migration fails and a new page is returned (i.e., in
   compaction_free()), free pages are restored by reversing the
   post_alloc_hook() operations using the newly added
   free_pages_prepare_fpi_none().

Step 3 is done for a later optimization, so that splitting and/or merging
free pages during compaction becomes easier.

Note: without splitting free pages, compaction can end prematurely because
migration returns -ENOMEM even if there are free pages.  This happens
when no order-0 free page exists and compaction_alloc() returns NULL.

Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Adam Manzanares <a.manzanares@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yin Fengwei <fengwei.yin@intel.com>
---
 mm/compaction.c | 143 +++++++++++++++++++++++++++---------------------
 mm/internal.h   |   4 +-
 mm/page_alloc.c |   6 ++
 3 files changed, 91 insertions(+), 62 deletions(-)
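
The per-order bookkeeping described in the commit message can be modelled in
userspace. The following is only a sketch under stated assumptions, not the
kernel code: NR_ORDERS, struct free_block, struct cc_model, model_isolate()
and model_alloc() are hypothetical names, and a singly linked list stands in
for the kernel's list_head. It shows why an allocation can return NULL while
free pages of another order are still held.

```c
#include <assert.h>
#include <stddef.h>

#define NR_ORDERS 4 /* hypothetical; stands in for NR_PAGE_ORDERS */

/* Hypothetical stand-in for an isolated free page block. */
struct free_block {
	struct free_block *next;
	unsigned int order;
};

/* Hypothetical stand-in for the relevant compact_control fields. */
struct cc_model {
	struct free_block *freepages[NR_ORDERS]; /* one list per order */
	unsigned int nr_freepages;               /* counted in base pages */
};

/* Keep an isolated free block at its original order, without splitting. */
static void model_isolate(struct cc_model *cc, struct free_block *b,
			  unsigned int order)
{
	assert(order < NR_ORDERS);
	b->order = order;
	b->next = cc->freepages[order];
	cc->freepages[order] = b;
	cc->nr_freepages += 1u << order;
}

/* Take a block of exactly the requested order, or NULL if none is held. */
static struct free_block *model_alloc(struct cc_model *cc, unsigned int order)
{
	struct free_block *b;

	assert(order < NR_ORDERS);
	b = cc->freepages[order];
	if (!b)
		return NULL; /* no fallback: larger blocks are not split here */
	cc->freepages[order] = b->next;
	cc->nr_freepages -= 1u << order;
	return b;
}
```

With a single order-2 block isolated, model_alloc(&cc, 0) returns NULL even
though nr_freepages is 4, which mirrors the premature -ENOMEM case in the
note above.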

Comments

Vlastimil Babka Feb. 15, 2024, 4:07 p.m. UTC | #1
On 2/14/24 23:04, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> Before last commit, memory compaction only migrates order-0 folios and
> skips >0 order folios.  Last commit splits all >0 order folios during
> compaction.  This commit migrates >0 order folios during compaction by
> keeping isolated free pages at their original size without splitting them
> into order-0 pages and using them directly during migration process.
> 
> What is different from the prior implementation:
> 1. All isolated free pages are kept in a NR_PAGE_ORDERS array of page
>    lists, where each page list stores free pages in the same order.
> 2. All free pages are not post_alloc_hook() processed nor buddy pages,
>    although their orders are stored in first page's private like buddy
>    pages.
> 3. During migration, in new page allocation time (i.e., in
>    compaction_alloc()), free pages are then processed by post_alloc_hook().
>    When migration fails and a new page is returned (i.e., in
>    compaction_free()), free pages are restored by reversing the
>    post_alloc_hook() operations using newly added
>    free_pages_prepare_fpi_none().
> 
> Step 3 is done for a latter optimization that splitting and/or merging
> free pages during compaction becomes easier.
> 
> Note: without splitting free pages, compaction can end prematurely due to
> migration will return -ENOMEM even if there is free pages.  This happens
> when no order-0 free page exist and compaction_alloc() return NULL.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Tested-by: Yu Zhao <yuzhao@google.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Noticed a possible simplification:

> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -447,6 +447,8 @@ extern void prep_compound_page(struct page *page, unsigned int order);
>  
>  extern void post_alloc_hook(struct page *page, unsigned int order,
>  					gfp_t gfp_flags);
> +extern bool free_pages_prepare_fpi_none(struct page *page, unsigned int order);
> +
>  extern int user_min_free_kbytes;
>  
>  extern void free_unref_page(struct page *page, unsigned int order);
> @@ -481,7 +483,7 @@ int split_free_page(struct page *free_page,
>   * completes when free_pfn <= migrate_pfn
>   */
>  struct compact_control {
> -	struct list_head freepages;	/* List of free pages to migrate to */
> +	struct list_head freepages[NR_PAGE_ORDERS];	/* List of free pages to migrate to */
>  	struct list_head migratepages;	/* List of pages being migrated */
>  	unsigned int nr_freepages;	/* Number of isolated free pages */
>  	unsigned int nr_migratepages;	/* Number of pages to migrate */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7ae4b74c9e5c..e6e2ac722a82 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1179,6 +1179,12 @@ static __always_inline bool free_pages_prepare(struct page *page,
>  	return true;
>  }
>  
> +__always_inline bool free_pages_prepare_fpi_none(struct page *page,
> +			unsigned int order)
> +{
> +	return free_pages_prepare(page, order, FPI_NONE);

It seems free_pages_prepare() currently only passes fpi_flags to
should_skip_kasan_poison(), which ignores them. You could remove the
parameter from both, then declare and use free_pages_prepare(page, order)
directly.

> +}
> +
>  /*
>   * Frees a number of pages from the PCP lists
>   * Assumes all pages on list are in same zone.
Zi Yan Feb. 15, 2024, 4:13 p.m. UTC | #2
On 15 Feb 2024, at 11:07, Vlastimil Babka wrote:

> On 2/14/24 23:04, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> Before last commit, memory compaction only migrates order-0 folios and
>> skips >0 order folios.  Last commit splits all >0 order folios during
>> compaction.  This commit migrates >0 order folios during compaction by
>> keeping isolated free pages at their original size without splitting them
>> into order-0 pages and using them directly during migration process.
>>
>> What is different from the prior implementation:
>> 1. All isolated free pages are kept in a NR_PAGE_ORDERS array of page
>>    lists, where each page list stores free pages in the same order.
>> 2. All free pages are not post_alloc_hook() processed nor buddy pages,
>>    although their orders are stored in first page's private like buddy
>>    pages.
>> 3. During migration, in new page allocation time (i.e., in
>>    compaction_alloc()), free pages are then processed by post_alloc_hook().
>>    When migration fails and a new page is returned (i.e., in
>>    compaction_free()), free pages are restored by reversing the
>>    post_alloc_hook() operations using newly added
>>    free_pages_prepare_fpi_none().
>>
>> Step 3 is done for a latter optimization that splitting and/or merging
>> free pages during compaction becomes easier.
>>
>> Note: without splitting free pages, compaction can end prematurely due to
>> migration will return -ENOMEM even if there is free pages.  This happens
>> when no order-0 free page exist and compaction_alloc() return NULL.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Tested-by: Yu Zhao <yuzhao@google.com>
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thanks.

>
> Noticed a possible simplification:
>
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -447,6 +447,8 @@ extern void prep_compound_page(struct page *page, unsigned int order);
>>
>>  extern void post_alloc_hook(struct page *page, unsigned int order,
>>  					gfp_t gfp_flags);
>> +extern bool free_pages_prepare_fpi_none(struct page *page, unsigned int order);
>> +
>>  extern int user_min_free_kbytes;
>>
>>  extern void free_unref_page(struct page *page, unsigned int order);
>> @@ -481,7 +483,7 @@ int split_free_page(struct page *free_page,
>>   * completes when free_pfn <= migrate_pfn
>>   */
>>  struct compact_control {
>> -	struct list_head freepages;	/* List of free pages to migrate to */
>> +	struct list_head freepages[NR_PAGE_ORDERS];	/* List of free pages to migrate to */
>>  	struct list_head migratepages;	/* List of pages being migrated */
>>  	unsigned int nr_freepages;	/* Number of isolated free pages */
>>  	unsigned int nr_migratepages;	/* Number of pages to migrate */
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 7ae4b74c9e5c..e6e2ac722a82 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1179,6 +1179,12 @@ static __always_inline bool free_pages_prepare(struct page *page,
>>  	return true;
>>  }
>>
>> +__always_inline bool free_pages_prepare_fpi_none(struct page *page,
>> +			unsigned int order)
>> +{
>> +	return free_pages_prepare(page, order, FPI_NONE);
>
> Seems like free_pages_prepare() currently only passes  fpi_flags to
> should_skip_kasan_poison() and that ignores them. You could remove the
> parameter from both and declare and use free_pages_prepare(page, order)
> directly.

Got it. I first thought of sending a cleanup patch after this series, but to
avoid unnecessary code churn it is better to put the cleanup patch before this
series and use free_pages_prepare() directly. Will do it in v6.

>> +}
>> +
>>  /*
>>   * Frees a number of pages from the PCP lists
>>   * Assumes all pages on list are in same zone.


--
Best Regards,
Yan, Zi
Vlastimil Babka Feb. 15, 2024, 4:57 p.m. UTC | #3
On 2/14/24 23:04, Zi Yan wrote:
> @@ -1849,10 +1857,22 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>  static void compaction_free(struct folio *dst, unsigned long data)
>  {
>  	struct compact_control *cc = (struct compact_control *)data;
> +	int order = folio_order(dst);
> +	struct page *page = &dst->page;
> +
> +	if (folio_put_testzero(dst)) {
> +		free_pages_prepare_fpi_none(page, order);
> +
> +		INIT_LIST_HEAD(&dst->lru);

(Is this even needed? I think the first parameter of list_add() is never
expected to be in any particular state.)

>  
> -	list_add(&dst->lru, &cc->freepages);
> -	cc->nr_freepages++;
> -	cc->nr_migratepages += 1 << folio_order(dst);
> +		list_add(&dst->lru, &cc->freepages[order]);
> +		cc->nr_freepages += 1 << order;
> +		cc->nr_migratepages += 1 << order;

Hm, actually this increment of nr_migratepages should happen even if we lost
the free page.

> +	}
> +	/*
> +	 * someone else has referenced the page, we cannot take it back to our
> +	 * free list.
> +	 */
>  }
Zi Yan Feb. 15, 2024, 5:32 p.m. UTC | #4
On 15 Feb 2024, at 11:57, Vlastimil Babka wrote:

> On 2/14/24 23:04, Zi Yan wrote:
>> @@ -1849,10 +1857,22 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>>  static void compaction_free(struct folio *dst, unsigned long data)
>>  {
>>  	struct compact_control *cc = (struct compact_control *)data;
>> +	int order = folio_order(dst);
>> +	struct page *page = &dst->page;
>> +
>> +	if (folio_put_testzero(dst)) {
>> +		free_pages_prepare_fpi_none(page, order);
>> +
>> +		INIT_LIST_HEAD(&dst->lru);
>
> (is this even needed? I think the state of first parameter of list_add() is
> never expected to be in particular state?)

There is a __list_add_valid() performing list corruption checks.
>
>>
>> -	list_add(&dst->lru, &cc->freepages);
>> -	cc->nr_freepages++;
>> -	cc->nr_migratepages += 1 << folio_order(dst);
>> +		list_add(&dst->lru, &cc->freepages[order]);
>> +		cc->nr_freepages += 1 << order;
>> +		cc->nr_migratepages += 1 << order;
>
> Hm actually this increment of nr_migratepages should happen even if we lost
> the free page.

Right, because a call to compaction_free() indicates the page was not
migrated, so nr_migratepages should be increased regardless.

Will fix it. Thanks.
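
For illustration, the accounting agreed on above might look roughly like the
userspace model below. It is a sketch under assumptions, not the v6 code
(which is not part of this thread): struct dst_model, struct cc_counts and
model_compaction_free() are hypothetical names, and a plain integer replaces
the folio reference count.

```c
#include <assert.h>

/* Hypothetical stand-in for the destination folio. */
struct dst_model {
	int refcount;
	unsigned int order;
};

/* Hypothetical stand-in for the two compact_control counters. */
struct cc_counts {
	unsigned int nr_freepages;
	unsigned int nr_migratepages;
};

static void model_compaction_free(struct cc_counts *cc, struct dst_model *dst)
{
	unsigned int nr = 1u << dst->order;

	assert(dst->refcount > 0);
	if (--dst->refcount == 0) {
		/* Last reference dropped: the block returns to the free list. */
		cc->nr_freepages += nr;
	}
	/*
	 * The source was not migrated in either case, so it counts as
	 * pending again even when the free block itself was lost.
	 */
	cc->nr_migratepages += nr;
}
```

The only change relative to the hunk quoted above is that the
nr_migratepages update sits outside the reference-count branch.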

>> +	}
>> +	/*
>> +	 * someone else has referenced the page, we cannot take it back to our
>> +	 * free list.
>> +	 */
>>  }


--
Best Regards,
Yan, Zi
Vlastimil Babka Feb. 15, 2024, 8:02 p.m. UTC | #5
On 2/15/24 18:32, Zi Yan wrote:
> On 15 Feb 2024, at 11:57, Vlastimil Babka wrote:
> 
>> On 2/14/24 23:04, Zi Yan wrote:
>>> @@ -1849,10 +1857,22 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>>>  static void compaction_free(struct folio *dst, unsigned long data)
>>>  {
>>>  	struct compact_control *cc = (struct compact_control *)data;
>>> +	int order = folio_order(dst);
>>> +	struct page *page = &dst->page;
>>> +
>>> +	if (folio_put_testzero(dst)) {
>>> +		free_pages_prepare_fpi_none(page, order);
>>> +
>>> +		INIT_LIST_HEAD(&dst->lru);
>>
>> (is this even needed? I think the state of first parameter of list_add() is
>> never expected to be in particular state?)
> 
> There is a __list_add_valid() performing list corruption checks.

Yes, but dst->lru becomes "new" in list_add() and __list_add_valid(), and
those never check the contents of new, i.e. new->next or new->prev. We could
have done list_del(&dst->lru), which puts poison values there, and a
subsequent list_add() would still be fine. So dst->lru does not need the
init; it's just confusing. Init is for the list's list_head, not for a list
entry.
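
The point can be checked with a small userspace model of a circular doubly
linked list. This is a sketch, not the kernel's list.h: init_list_head() and
list_add_model() are hypothetical names, and the single assert only loosely
imitates the kind of check __list_add_valid() performs on the neighbouring
nodes.

```c
#include <assert.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Initialise a list head so that it points at itself (empty list). */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. entry's old next/prev are never read. */
static void list_add_model(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	assert(next->prev == head); /* sanity check on the list, not on entry */
	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}
```

An entry whose next and prev hold garbage can be passed to list_add_model()
and ends up correctly linked, because both fields are overwritten before
anything reads them.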

>>>
>>> -	list_add(&dst->lru, &cc->freepages);
>>> -	cc->nr_freepages++;
>>> -	cc->nr_migratepages += 1 << folio_order(dst);
>>> +		list_add(&dst->lru, &cc->freepages[order]);
>>> +		cc->nr_freepages += 1 << order;
>>> +		cc->nr_migratepages += 1 << order;
>>
>> Hm actually this increment of nr_migratepages should happen even if we lost
>> the free page.
> 
> Because compaction_free() indicates the page is not migrated and nr_migratepages
> should be increased regardless.

Yes.

> Will fix it. Thanks.
> 
>>> +	}
>>> +	/*
>>> +	 * someone else has referenced the page, we cannot take it back to our
>>> +	 * free list.
>>> +	 */
>>>  }
> 
> 
> --
> Best Regards,
> Yan, Zi
Zi Yan Feb. 15, 2024, 8:04 p.m. UTC | #6
On 15 Feb 2024, at 15:02, Vlastimil Babka wrote:

> On 2/15/24 18:32, Zi Yan wrote:
>> On 15 Feb 2024, at 11:57, Vlastimil Babka wrote:
>>
>>> On 2/14/24 23:04, Zi Yan wrote:
>>>> @@ -1849,10 +1857,22 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>>>>  static void compaction_free(struct folio *dst, unsigned long data)
>>>>  {
>>>>  	struct compact_control *cc = (struct compact_control *)data;
>>>> +	int order = folio_order(dst);
>>>> +	struct page *page = &dst->page;
>>>> +
>>>> +	if (folio_put_testzero(dst)) {
>>>> +		free_pages_prepare_fpi_none(page, order);
>>>> +
>>>> +		INIT_LIST_HEAD(&dst->lru);
>>>
>>> (is this even needed? I think the state of first parameter of list_add() is
>>> never expected to be in particular state?)
>>
>> There is a __list_add_valid() performing list corruption checks.
>
> Yes, but dst->lru becomes "new" in list_add() and __list_add_valid() and
> those never check the contents of new, i.e. new->next or new->prev. We could
> have done list_del(&dst->lru) which puts poison values there and then a
> list_add() is fine. So dst->lru does not need the init, it's just confusing.
> Init is for the list's list_head, not for the list entry.

Got it. Will remove it.

>>>>
>>>> -	list_add(&dst->lru, &cc->freepages);
>>>> -	cc->nr_freepages++;
>>>> -	cc->nr_migratepages += 1 << folio_order(dst);
>>>> +		list_add(&dst->lru, &cc->freepages[order]);
>>>> +		cc->nr_freepages += 1 << order;
>>>> +		cc->nr_migratepages += 1 << order;
>>>
>>> Hm actually this increment of nr_migratepages should happen even if we lost
>>> the free page.
>>
>> Because compaction_free() indicates the page is not migrated and nr_migratepages
>> should be increased regardless.
>
> Yes.
>
>> Will fix it. Thanks.
>>
>>>> +	}
>>>> +	/*
>>>> +	 * someone else has referenced the page, we cannot take it back to our
>>>> +	 * free list.
>>>> +	 */
>>>>  }
>>
>>
>> --
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi

Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index aa6aad805c4d..d0a05a621b67 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -66,45 +66,56 @@  static inline void count_compact_events(enum vm_event_item item, long delta)
 #define COMPACTION_HPAGE_ORDER	(PMD_SHIFT - PAGE_SHIFT)
 #endif
 
-static unsigned long release_freepages(struct list_head *freelist)
+static void split_map_pages(struct list_head *freepages)
 {
+	unsigned int i, order;
 	struct page *page, *next;
-	unsigned long high_pfn = 0;
+	LIST_HEAD(tmp_list);
 
-	list_for_each_entry_safe(page, next, freelist, lru) {
-		unsigned long pfn = page_to_pfn(page);
-		list_del(&page->lru);
-		__free_page(page);
-		if (pfn > high_pfn)
-			high_pfn = pfn;
-	}
+	for (order = 0; order < NR_PAGE_ORDERS; order++) {
+		list_for_each_entry_safe(page, next, &freepages[order], lru) {
+			unsigned int nr_pages;
 
-	return high_pfn;
+			list_del(&page->lru);
+
+			nr_pages = 1 << order;
+
+			post_alloc_hook(page, order, __GFP_MOVABLE);
+			if (order)
+				split_page(page, order);
+
+			for (i = 0; i < nr_pages; i++) {
+				list_add(&page->lru, &tmp_list);
+				page++;
+			}
+		}
+		list_splice_init(&tmp_list, &freepages[0]);
+	}
 }
 
-static void split_map_pages(struct list_head *list)
+static unsigned long release_free_list(struct list_head *freepages)
 {
-	unsigned int i, order, nr_pages;
-	struct page *page, *next;
-	LIST_HEAD(tmp_list);
-
-	list_for_each_entry_safe(page, next, list, lru) {
-		list_del(&page->lru);
+	int order;
+	unsigned long high_pfn = 0;
 
-		order = page_private(page);
-		nr_pages = 1 << order;
+	for (order = 0; order < NR_PAGE_ORDERS; order++) {
+		struct page *page, *next;
 
-		post_alloc_hook(page, order, __GFP_MOVABLE);
-		if (order)
-			split_page(page, order);
+		list_for_each_entry_safe(page, next, &freepages[order], lru) {
+			unsigned long pfn = page_to_pfn(page);
 
-		for (i = 0; i < nr_pages; i++) {
-			list_add(&page->lru, &tmp_list);
-			page++;
+			list_del(&page->lru);
+			/*
+			 * Convert free pages into post allocation pages, so
+			 * that we can free them via __free_page.
+			 */
+			post_alloc_hook(page, order, __GFP_MOVABLE);
+			__free_pages(page, order);
+			if (pfn > high_pfn)
+				high_pfn = pfn;
 		}
 	}
-
-	list_splice(&tmp_list, list);
+	return high_pfn;
 }
 
 #ifdef CONFIG_COMPACTION
@@ -657,7 +668,7 @@  static unsigned long isolate_freepages_block(struct compact_control *cc,
 		nr_scanned += isolated - 1;
 		total_isolated += isolated;
 		cc->nr_freepages += isolated;
-		list_add_tail(&page->lru, freelist);
+		list_add_tail(&page->lru, &freelist[order]);
 
 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
 			blockpfn += isolated;
@@ -722,7 +733,11 @@  isolate_freepages_range(struct compact_control *cc,
 			unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long isolated, pfn, block_start_pfn, block_end_pfn;
-	LIST_HEAD(freelist);
+	int order;
+	struct list_head tmp_freepages[NR_PAGE_ORDERS];
+
+	for (order = 0; order < NR_PAGE_ORDERS; order++)
+		INIT_LIST_HEAD(&tmp_freepages[order]);
 
 	pfn = start_pfn;
 	block_start_pfn = pageblock_start_pfn(pfn);
@@ -753,7 +768,7 @@  isolate_freepages_range(struct compact_control *cc,
 			break;
 
 		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, &freelist, 0, true);
+					block_end_pfn, tmp_freepages, 0, true);
 
 		/*
 		 * In strict mode, isolate_freepages_block() returns 0 if
@@ -770,15 +785,15 @@  isolate_freepages_range(struct compact_control *cc,
 		 */
 	}
 
-	/* __isolate_free_page() does not map the pages */
-	split_map_pages(&freelist);
-
 	if (pfn < end_pfn) {
 		/* Loop terminated early, cleanup. */
-		release_freepages(&freelist);
+		release_free_list(tmp_freepages);
 		return 0;
 	}
 
+	/* __isolate_free_page() does not map the pages */
+	split_map_pages(tmp_freepages);
+
 	/* We don't use freelists for anything. */
 	return pfn;
 }
@@ -1494,7 +1509,7 @@  fast_isolate_around(struct compact_control *cc, unsigned long pfn)
 	if (!page)
 		return;
 
-	isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false);
+	isolate_freepages_block(cc, &start_pfn, end_pfn, cc->freepages, 1, false);
 
 	/* Skip this pageblock in the future as it's full or nearly full */
 	if (start_pfn == end_pfn && !cc->no_set_skip_hint)
@@ -1623,7 +1638,7 @@  static void fast_isolate_freepages(struct compact_control *cc)
 				nr_scanned += nr_isolated - 1;
 				total_isolated += nr_isolated;
 				cc->nr_freepages += nr_isolated;
-				list_add_tail(&page->lru, &cc->freepages);
+				list_add_tail(&page->lru, &cc->freepages[order]);
 				count_compact_events(COMPACTISOLATED, nr_isolated);
 			} else {
 				/* If isolation fails, abort the search */
@@ -1700,13 +1715,12 @@  static void isolate_freepages(struct compact_control *cc)
 	unsigned long isolate_start_pfn; /* exact pfn we start at */
 	unsigned long block_end_pfn;	/* end of current pageblock */
 	unsigned long low_pfn;	     /* lowest pfn scanner is able to scan */
-	struct list_head *freelist = &cc->freepages;
 	unsigned int stride;
 
 	/* Try a small search of the free lists for a candidate */
 	fast_isolate_freepages(cc);
 	if (cc->nr_freepages)
-		goto splitmap;
+		return;
 
 	/*
 	 * Initialise the free scanner. The starting point is where we last
@@ -1766,7 +1780,7 @@  static void isolate_freepages(struct compact_control *cc)
 
 		/* Found a block suitable for isolating free pages from. */
 		nr_isolated = isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, freelist, stride, false);
+					block_end_pfn, cc->freepages, stride, false);
 
 		/* Update the skip hint if the full pageblock was scanned */
 		if (isolate_start_pfn == block_end_pfn)
@@ -1807,10 +1821,6 @@  static void isolate_freepages(struct compact_control *cc)
 	 * and the loop terminated due to isolate_start_pfn < low_pfn
 	 */
 	cc->free_pfn = isolate_start_pfn;
-
-splitmap:
-	/* __isolate_free_page() does not map the pages */
-	split_map_pages(freelist);
 }
 
 /*
@@ -1821,24 +1831,22 @@  static struct folio *compaction_alloc(struct folio *src, unsigned long data)
 {
 	struct compact_control *cc = (struct compact_control *)data;
 	struct folio *dst;
+	int order = folio_order(src);
 
-	/* this makes migrate_pages() split the source page and retry */
-	if (folio_test_large(src) > 0)
-		return NULL;
-
-	if (list_empty(&cc->freepages)) {
+	if (list_empty(&cc->freepages[order])) {
 		isolate_freepages(cc);
-
-		if (list_empty(&cc->freepages))
+		if (list_empty(&cc->freepages[order]))
 			return NULL;
 	}
 
-	dst = list_entry(cc->freepages.next, struct folio, lru);
+	dst = list_first_entry(&cc->freepages[order], struct folio, lru);
 	list_del(&dst->lru);
-	cc->nr_freepages--;
-	cc->nr_migratepages -= 1 << folio_order(src);
-
-	return dst;
+	post_alloc_hook(&dst->page, order, __GFP_MOVABLE);
+	if (order)
+		prep_compound_page(&dst->page, order);
+	cc->nr_freepages -= 1 << order;
+	cc->nr_migratepages -= 1 << order;
+	return page_rmappable_folio(&dst->page);
 }
 
 /*
@@ -1849,10 +1857,22 @@  static struct folio *compaction_alloc(struct folio *src, unsigned long data)
 static void compaction_free(struct folio *dst, unsigned long data)
 {
 	struct compact_control *cc = (struct compact_control *)data;
+	int order = folio_order(dst);
+	struct page *page = &dst->page;
+
+	if (folio_put_testzero(dst)) {
+		free_pages_prepare_fpi_none(page, order);
+
+		INIT_LIST_HEAD(&dst->lru);
 
-	list_add(&dst->lru, &cc->freepages);
-	cc->nr_freepages++;
-	cc->nr_migratepages += 1 << folio_order(dst);
+		list_add(&dst->lru, &cc->freepages[order]);
+		cc->nr_freepages += 1 << order;
+		cc->nr_migratepages += 1 << order;
+	}
+	/*
+	 * someone else has referenced the page, we cannot take it back to our
+	 * free list.
+	 */
 }
 
 /* possible outcome of isolate_migratepages */
@@ -2476,6 +2496,7 @@  compact_zone(struct compact_control *cc, struct capture_control *capc)
 	const bool sync = cc->mode != MIGRATE_ASYNC;
 	bool update_cached;
 	unsigned int nr_succeeded = 0;
+	int order;
 
 	/*
 	 * These counters track activities during zone compaction.  Initialize
@@ -2485,7 +2506,8 @@  compact_zone(struct compact_control *cc, struct capture_control *capc)
 	cc->total_free_scanned = 0;
 	cc->nr_migratepages = 0;
 	cc->nr_freepages = 0;
-	INIT_LIST_HEAD(&cc->freepages);
+	for (order = 0; order < NR_PAGE_ORDERS; order++)
+		INIT_LIST_HEAD(&cc->freepages[order]);
 	INIT_LIST_HEAD(&cc->migratepages);
 
 	cc->migratetype = gfp_migratetype(cc->gfp_mask);
@@ -2671,7 +2693,7 @@  compact_zone(struct compact_control *cc, struct capture_control *capc)
 	 * so we don't leave any returned pages behind in the next attempt.
 	 */
 	if (cc->nr_freepages > 0) {
-		unsigned long free_pfn = release_freepages(&cc->freepages);
+		unsigned long free_pfn = release_free_list(cc->freepages);
 
 		cc->nr_freepages = 0;
 		VM_BUG_ON(free_pfn == 0);
@@ -2690,7 +2712,6 @@  compact_zone(struct compact_control *cc, struct capture_control *capc)
 
 	trace_mm_compaction_end(cc, start_pfn, end_pfn, sync, ret);
 
-	VM_BUG_ON(!list_empty(&cc->freepages));
 	VM_BUG_ON(!list_empty(&cc->migratepages));
 
 	return ret;
diff --git a/mm/internal.h b/mm/internal.h
index 1e29c5821a1d..9925291e7704 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -447,6 +447,8 @@  extern void prep_compound_page(struct page *page, unsigned int order);
 
 extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
+extern bool free_pages_prepare_fpi_none(struct page *page, unsigned int order);
+
 extern int user_min_free_kbytes;
 
 extern void free_unref_page(struct page *page, unsigned int order);
@@ -481,7 +483,7 @@  int split_free_page(struct page *free_page,
  * completes when free_pfn <= migrate_pfn
  */
 struct compact_control {
-	struct list_head freepages;	/* List of free pages to migrate to */
+	struct list_head freepages[NR_PAGE_ORDERS];	/* List of free pages to migrate to */
 	struct list_head migratepages;	/* List of pages being migrated */
 	unsigned int nr_freepages;	/* Number of isolated free pages */
 	unsigned int nr_migratepages;	/* Number of pages to migrate */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7ae4b74c9e5c..e6e2ac722a82 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1179,6 +1179,12 @@  static __always_inline bool free_pages_prepare(struct page *page,
 	return true;
 }
 
+__always_inline bool free_pages_prepare_fpi_none(struct page *page,
+			unsigned int order)
+{
+	return free_pages_prepare(page, order, FPI_NONE);
+}
+
 /*
  * Frees a number of pages from the PCP lists
  * Assumes all pages on list are in same zone.