[v2,1/2] mm/compaction: count pages and stop correctly during page isolation.

Message ID 20201030155716.3614401-1-zi.yan@sent.com (mailing list archive)
State New, archived

Commit Message

Zi Yan Oct. 30, 2020, 3:57 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

In isolate_migratepages_block, when cc->alloc_contig is true, we are
able to isolate compound pages. However, nr_migratepages and nr_isolated
did not count compound pages correctly, causing us to isolate more pages
than we thought. Use thp_nr_pages to count pages. Otherwise, we might be
trapped in the too_many_isolated while loop, since the actual number of
isolated pages can go up to COMPACT_CLUSTER_MAX*512=16384, where
COMPACT_CLUSTER_MAX is 32, because we stop isolation only after
cc->nr_migratepages reaches COMPACT_CLUSTER_MAX.
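
The 16384 figure above can be checked with a small user-space model. This
is only a sketch, not kernel code: isolated_base_pages_old() is a
hypothetical helper, and the 512 factor assumes 4KB base pages with 2MB
THPs. It mimics the pre-patch accounting, where each isolated entry bumps
the counter by one no matter how many base pages it covers.

```c
/* Value quoted in the commit message. */
#define COMPACT_CLUSTER_MAX 32UL

/*
 * Hypothetical model of the pre-patch accounting: the counter grows by
 * one per isolated entry, and isolation stops once the counter reaches
 * COMPACT_CLUSTER_MAX. Returns how many base pages were really isolated
 * when every entry spans pages_per_entry base pages.
 */
unsigned long isolated_base_pages_old(unsigned long pages_per_entry)
{
	unsigned long nr_migratepages = 0;	/* what the counter reports */
	unsigned long base_pages = 0;		/* what was actually isolated */

	while (nr_migratepages != COMPACT_CLUSTER_MAX) {
		nr_migratepages++;		/* pre-patch: always +1 */
		base_pages += pages_per_entry;
	}
	return base_pages;
}
```

Calling isolated_base_pages_old(512) models a batch made up entirely of
2MB THPs, and isolated_base_pages_old(1) models a batch of base pages
only.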

In addition, after we fix the issue above, cc->nr_migratepages could
never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated,
thus page isolation could not stop as we intended. Change the isolation
stop condition to >=.
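
Why == stops working once the counter moves in steps larger than one can
be sketched in user space as follows. The helper entries_until_stop() and
its parameters are hypothetical and only model the counter arithmetic;
they are not part of mm/compaction.c.

```c
#include <stdbool.h>

/* Value quoted in the commit message. */
#define COMPACT_CLUSTER_MAX 32UL

/*
 * Hypothetical model of the post-patch accounting: each isolated entry
 * adds pages_per_entry to the counter. Returns the 1-based index of the
 * entry at which the stop check fires, or 0 if it never fires within
 * max_entries entries.
 */
unsigned long entries_until_stop(unsigned long pages_per_entry,
				 unsigned long max_entries, bool use_ge)
{
	unsigned long nr_migratepages = 0;
	unsigned long i;

	for (i = 1; i <= max_entries; i++) {
		nr_migratepages += pages_per_entry;
		if (use_ge ? nr_migratepages >= COMPACT_CLUSTER_MAX
			   : nr_migratepages == COMPACT_CLUSTER_MAX)
			return i;
	}
	return 0;	/* the check was stepped over and never fired */
}
```

With pages_per_entry set to 512, the == variant never fires because the
counter jumps from 0 straight past 32, while the >= variant fires on the
first entry. With pages_per_entry set to 1, both variants fire at the
32nd entry.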

The issue can be triggered as follows:
In a system with 16GB memory and an 8GB CMA region reserved by
hugetlb_cma, if we first allocate 10GB THPs and mlock them
(so some THPs are allocated in the CMA region and mlocked), reserving
6 1GB hugetlb pages via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck
(looping in the too_many_isolated function) until we kill either task.
With the patch applied, the OOM killer will kill the application with
10GB THPs and let the hugetlb page reservation finish.

Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
---
 mm/compaction.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Matthew Wilcox Oct. 30, 2020, 6:12 p.m. UTC | #1
On Fri, Oct 30, 2020 at 11:57:15AM -0400, Zi Yan wrote:
> In isolate_migratepages_block, when cc->alloc_contig is true, we are
> able to isolate compound pages, nr_migratepages and nr_isolated did not
> count compound pages correctly, causing us to isolate more pages than we
> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
               ^^^^^^^^^^^^
Maybe replace that sentence with "Count compound pages as the number of
base pages they contain"?

Zi Yan Oct. 30, 2020, 6:15 p.m. UTC | #2
On 30 Oct 2020, at 14:12, Matthew Wilcox wrote:

> On Fri, Oct 30, 2020 at 11:57:15AM -0400, Zi Yan wrote:
>> In isolate_migratepages_block, when cc->alloc_contig is true, we are
>> able to isolate compound pages, nr_migratepages and nr_isolated did not
>> count compound pages correctly, causing us to isolate more pages than we
>> thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped
>                ^^^^^^^^^^^^
> Maybe replace that sentence with "Count compound pages as the number of
> base pages they contain"?

Sure. And compound_nr is used instead of thp_nr_pages in fact.

OK. V3 is coming.

--
Best Regards,
Yan Zi

Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..3e834ac402f1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1012,8 +1012,8 @@  isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_success:
 		list_add(&page->lru, &cc->migratepages);
-		cc->nr_migratepages++;
-		nr_isolated++;
+		cc->nr_migratepages += compound_nr(page);
+		nr_isolated += compound_nr(page);
 
 		/*
 		 * Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@  isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * or a lock is contended. For contention, isolate quickly to
 		 * potentially remove one source of contention.
 		 */
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX &&
 		    !cc->rescan && !cc->contended) {
 			++low_pfn;
 			break;
@@ -1132,7 +1132,7 @@  isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 		if (!pfn)
 			break;
 
-		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
+		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
 			break;
 	}