From patchwork Tue Apr 13 10:47:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200057
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 1/7] mm,page_alloc: Bail out earlier on -ENOMEM in alloc_contig_migrate_range
Date: Tue, 13 Apr 2021 12:47:41 +0200
Message-Id: <20210413104747.12177-2-osalvador@suse.de>
X-Mailer: git-send-email 2.13.7
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

Currently, __alloc_contig_migrate_range can generate -EINTR, -ENOMEM or
-EBUSY, and report them down the chain.
The problem is that when migrate_pages() reports -ENOMEM, we keep going
till we exhaust all the try-attempts (5 at the moment) instead of bailing
out.

migrate_pages() bails out right away on -ENOMEM because it is considered a
fatal error. Do the same here instead of continuing to retry.
Note that this is not fixing a real issue, just a cosmetic change.
Although we can save some cycles by backing off earlier.

Signed-off-by: Oscar Salvador
Acked-by: Vlastimil Babka
Reviewed-by: David Hildenbrand
Acked-by: Michal Hocko
Acked-by: Mike Kravetz
---
 mm/page_alloc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1c67c99603a3..689454692de1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8697,7 +8697,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 			}
 			tries = 0;
 		} else if (++tries == 5) {
-			ret = ret < 0 ? ret : -EBUSY;
+			ret = -EBUSY;
 			break;
 		}
 
@@ -8707,6 +8707,13 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 		ret = migrate_pages(&cc->migratepages, alloc_migration_target,
 				NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE);
+
+		/*
+		 * On -ENOMEM, migrate_pages() bails out right away. It is pointless
+		 * to retry again over this error, so do the same here.
+		 */
+		if (ret == -ENOMEM)
+			break;
 	}
 
 	lru_cache_enable();

From patchwork Tue Apr 13 10:47:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200061
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 2/7] mm,compaction: Let isolate_migratepages_{range,block} return error codes
Date: Tue, 13 Apr 2021 12:47:42 +0200
Message-Id: <20210413104747.12177-3-osalvador@suse.de>
X-Mailer: git-send-email 2.13.7
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

Currently, isolate_migratepages_{range,block} and their callers use
a pfn == 0 vs pfn != 0 scheme to let the caller know whether there was
any error during isolation.

This does not work as soon as we need to start reporting different error
codes and make sure we pass them down the chain, so they are properly
interpreted by functions such as alloc_contig_range.

Let us rework isolate_migratepages_{range,block} so we can report error
codes.
Since isolate_migratepages_block will stop returning the next pfn to be
scanned, we reuse the cc->migrate_pfn field to keep track of that.

Signed-off-by: Oscar Salvador
Acked-by: Vlastimil Babka
Acked-by: Mike Kravetz
---
 mm/compaction.c | 54 +++++++++++++++++++++++++++---------------------------
 mm/internal.h   | 10 ++++++++--
 mm/page_alloc.c |  7 +++----
 3 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 8c5028bfbd56..eeba4668c22c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -787,15 +787,15 @@ static bool too_many_isolated(pg_data_t *pgdat)
  *
  * Isolate all pages that can be migrated from the range specified by
  * [low_pfn, end_pfn). The range is expected to be within same pageblock.
- * Returns zero if there is a fatal signal pending, otherwise PFN of the
- * first page that was not scanned (which may be both less, equal to or more
- * than end_pfn).
+ * Returns errno, like -EAGAIN or -EINTR in case e.g signal pending or congestion,
+ * or 0.
+ * cc->migrate_pfn will contain the next pfn to scan (which may be both less,
+ * equal to or more that end_pfn).
  *
  * The pages are isolated on cc->migratepages list (not required to be empty),
- * and cc->nr_migratepages is updated accordingly. The cc->migrate_pfn field
- * is neither read nor updated.
+ * and cc->nr_migratepages is updated accordingly.
  */
-static unsigned long
+static int
 isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			unsigned long end_pfn, isolate_mode_t isolate_mode)
 {
@@ -809,6 +809,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	bool skip_on_failure = false;
 	unsigned long next_skip_pfn = 0;
 	bool skip_updated = false;
+	int ret = 0;
+
+	cc->migrate_pfn = low_pfn;
 
 	/*
 	 * Ensure that there are not too many pages isolated from the LRU
@@ -818,16 +821,16 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	while (unlikely(too_many_isolated(pgdat))) {
 		/* stop isolation if there are still pages not migrated */
 		if (cc->nr_migratepages)
-			return 0;
+			return -EAGAIN;
 
 		/* async migration should just abort */
 		if (cc->mode == MIGRATE_ASYNC)
-			return 0;
+			return -EAGAIN;
 
 		congestion_wait(BLK_RW_ASYNC, HZ/10);
 
 		if (fatal_signal_pending(current))
-			return 0;
+			return -EINTR;
 	}
 
 	cond_resched();
@@ -875,8 +878,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 			if (fatal_signal_pending(current)) {
 				cc->contended = true;
+				ret = -EINTR;
 
-				low_pfn = 0;
 				goto fatal_pending;
 			}
 
@@ -1130,7 +1133,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	if (nr_isolated)
 		count_compact_events(COMPACTISOLATED, nr_isolated);
 
-	return low_pfn;
+	cc->migrate_pfn = low_pfn;
+
+	return ret;
 }
 
 /**
@@ -1139,15 +1144,15 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
  * @start_pfn: The first PFN to start isolating.
  * @end_pfn:   The one-past-last PFN.
  *
- * Returns zero if isolation fails fatally due to e.g. pending signal.
- * Otherwise, function returns one-past-the-last PFN of isolated page
- * (which may be greater than end_pfn if end fell in a middle of a THP page).
+ * Returns errno, like -EAGAIN or -EINTR in case e.g signal pending or congestion,
+ * or 0.
  */
-unsigned long
+int
 isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 							unsigned long end_pfn)
 {
 	unsigned long pfn, block_start_pfn, block_end_pfn;
+	int ret = 0;
 
 	/* Scan block by block. First and last block may be incomplete */
 	pfn = start_pfn;
@@ -1166,17 +1171,17 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 					block_end_pfn, cc->zone))
 			continue;
 
-		pfn = isolate_migratepages_block(cc, pfn, block_end_pfn,
-							ISOLATE_UNEVICTABLE);
+		ret = isolate_migratepages_block(cc, pfn, block_end_pfn,
+						 ISOLATE_UNEVICTABLE);
 
-		if (!pfn)
+		if (ret)
 			break;
 
 		if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX)
 			break;
 	}
 
-	return pfn;
+	return ret;
 }
 
 #endif /* CONFIG_COMPACTION || CONFIG_CMA */
@@ -1847,7 +1852,7 @@ static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
 	 */
 	for (; block_end_pfn <= cc->free_pfn;
 			fast_find_block = false,
-			low_pfn = block_end_pfn,
+			cc->migrate_pfn = low_pfn = block_end_pfn,
 			block_start_pfn = block_end_pfn,
 			block_end_pfn += pageblock_nr_pages) {
 
@@ -1889,10 +1894,8 @@ static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
 		}
 
 		/* Perform the isolation */
-		low_pfn = isolate_migratepages_block(cc, low_pfn,
-						block_end_pfn, isolate_mode);
-
-		if (!low_pfn)
+		if (isolate_migratepages_block(cc, low_pfn, block_end_pfn,
+						isolate_mode))
 			return ISOLATE_ABORT;
 
 		/*
@@ -1903,9 +1906,6 @@ static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
 			break;
 	}
 
-	/* Record where migration scanner will be restarted. */
-	cc->migrate_pfn = low_pfn;
-
 	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
 }
 
diff --git a/mm/internal.h b/mm/internal.h
index f469f69309de..46eb82eaa195 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -244,7 +244,13 @@ struct compact_control {
 	unsigned int nr_freepages;	/* Number of isolated free pages */
 	unsigned int nr_migratepages;	/* Number of pages to migrate */
 	unsigned long free_pfn;		/* isolate_freepages search base */
-	unsigned long migrate_pfn;	/* isolate_migratepages search base */
+	/*
+	 * Acts as an in/out parameter to page isolation for migration.
+	 * isolate_migratepages uses it as a search base.
+	 * isolate_migratepages_block will update the value to the next pfn
+	 * after the last isolated one.
+	 */
+	unsigned long migrate_pfn;
 	unsigned long fast_start_pfn;	/* a pfn to start linear scan from */
 	struct zone *zone;
 	unsigned long total_migrate_scanned;
@@ -280,7 +286,7 @@ struct capture_control {
 unsigned long
 isolate_freepages_range(struct compact_control *cc,
 			unsigned long start_pfn, unsigned long end_pfn);
-unsigned long
+int
 isolate_migratepages_range(struct compact_control *cc,
 			   unsigned long low_pfn, unsigned long end_pfn);
 int find_suitable_fallback(struct free_area *area, unsigned int order,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 689454692de1..b5a94de3cdde 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8690,11 +8690,10 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 		if (list_empty(&cc->migratepages)) {
 			cc->nr_migratepages = 0;
-			pfn = isolate_migratepages_range(cc, pfn, end);
-			if (!pfn) {
-				ret = -EINTR;
+			ret = isolate_migratepages_range(cc, pfn, end);
+			if (ret && ret != -EAGAIN)
 				break;
-			}
+			pfn = cc->migrate_pfn;
 			tries = 0;
 		} else if (++tries == 5) {
 			ret = -EBUSY;

From patchwork Tue Apr 13 10:47:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200063
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 3/7] mm,hugetlb: Clear HPageFreed outside of the lock
Date: Tue, 13 Apr 2021 12:47:43 +0200
Message-Id: <20210413104747.12177-4-osalvador@suse.de>
X-Mailer: git-send-email 2.13.7
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

Currently, the clearing of the flag is done under the lock, but this
is unnecessary as we just allocated the page and we did not give it
away yet, so no one should be messing with it.

Also, this helps make it clear that here the lock is only protecting the
counter.

Signed-off-by: Oscar Salvador
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 54d81d5947ed..e40d5fe5c63c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1490,10 +1490,10 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	hugetlb_set_page_subpool(page, NULL);
 	set_hugetlb_cgroup(page, NULL);
 	set_hugetlb_cgroup_rsvd(page, NULL);
+	ClearHPageFreed(page);
 	spin_lock_irq(&hugetlb_lock);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
-	ClearHPageFreed(page);
 	spin_unlock_irq(&hugetlb_lock);
 }
 

From patchwork Tue Apr 13 10:47:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200065
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 4/7] mm,hugetlb: Split prep_new_huge_page functionality
Date: Tue, 13 Apr 2021 12:47:44 +0200
Message-Id: <20210413104747.12177-5-osalvador@suse.de>
X-Mailer: git-send-email 2.13.7
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

Currently, prep_new_huge_page() performs two functions.
It sets the right state for a new hugetlb page, and increases the hstate's
counters to account for the new page.

Let us split its functionality into two separate functions, decoupling
the handling of the counters from initializing a hugepage.
The outcome is having __prep_new_huge_page(), which only
initializes the page, and __prep_account_new_huge_page(), which adds
the new page to the hstate's counters.

This allows us to set up a hugetlb page without having to worry
about the counters or locking. It will prove useful in the next patch.
prep_new_huge_page() still calls both functions.

Signed-off-by: Oscar Salvador
Acked-by: Michal Hocko
Reviewed-by: Mike Kravetz
---
 mm/hugetlb.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e40d5fe5c63c..0607b2b71ac6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1483,7 +1483,16 @@ void free_huge_page(struct page *page)
 	}
 }
 
-static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
+/*
+ * Must be called with the hugetlb lock held
+ */
+static void __prep_account_new_huge_page(struct hstate *h, int nid)
+{
+	h->nr_huge_pages++;
+	h->nr_huge_pages_node[nid]++;
+}
+
+static void __prep_new_huge_page(struct page *page)
 {
 	INIT_LIST_HEAD(&page->lru);
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
@@ -1491,9 +1500,13 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	set_hugetlb_cgroup(page, NULL);
 	set_hugetlb_cgroup_rsvd(page, NULL);
 	ClearHPageFreed(page);
+}
+
+static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
+{
+	__prep_new_huge_page(page);
 	spin_lock_irq(&hugetlb_lock);
-	h->nr_huge_pages++;
-	h->nr_huge_pages_node[nid]++;
+	__prep_account_new_huge_page(h, nid);
 	spin_unlock_irq(&hugetlb_lock);
 }
 

From patchwork Tue Apr 13 10:47:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200067
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 5/7] mm: Make alloc_contig_range handle free hugetlb pages
Date: Tue, 13 Apr 2021 12:47:45 +0200
Message-Id: <20210413104747.12177-6-osalvador@suse.de>
X-Mailer: git-send-email 2.13.7
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

alloc_contig_range will fail if it ever sees a HugeTLB page within the
range we are trying to allocate, even when that page is free and can be
easily reallocated.
This has proved to be problematic for some users of alloc_contig_range,
e.g. CMA and virtio-mem, where those would fail the call even when those
pages lie in ZONE_MOVABLE and are free.

We can do better by trying to replace such a page.

Free hugepages are tricky to handle: so that no userspace application
notices any disruption, we need to replace the current free hugepage with
a new one.

In order to do that, a new function called alloc_and_dissolve_huge_page
is introduced.
This function will first try to get a new fresh hugepage, and if it
succeeds, it will replace the old one in the free hugepage pool.

The free page replacement is done under hugetlb_lock, so no external
users of hugetlb will notice the change.
To allocate the new huge page, we use alloc_buddy_huge_page(), so we
do not have to deal with any counters, and prep_new_huge_page() is not
called. This is valuable because in case we need to free the new page,
we only need to call __free_pages().

Once we know that the page to be replaced is a genuine 0-refcounted
huge page, we remove the old page from the freelist by remove_hugetlb_page().
Then, we can call __prep_new_huge_page() and __prep_account_new_huge_page()
for the new huge page to properly initialize it and increment the
hstate->nr_huge_pages counter (previously decremented by
remove_hugetlb_page()).
Once done, the page is enqueued by enqueue_huge_page() and it is ready
to be used.

There is one tricky case when page's refcount is 0 because it is in the
process of being released.
A missing PageHugeFreed bit will tell us that freeing is in flight so
we retry after dropping the hugetlb_lock. The race window should be
small and the next retry should make forward progress.

E.g.:

CPU0                            CPU1
free_huge_page()                isolate_or_dissolve_huge_page
                                  PageHuge() == T
                                  alloc_and_dissolve_huge_page
                                    alloc_buddy_huge_page()
                                    spin_lock_irq(hugetlb_lock)
                                    // PageHuge() && !PageHugeFreed &&
                                    // !PageCount()
                                    spin_unlock_irq(hugetlb_lock)
  spin_lock_irq(hugetlb_lock)
  1) update_and_free_page
       PageHuge() == F
       __free_pages()
  2) enqueue_huge_page
       SetPageHugeFreed()
  spin_unlock(&hugetlb_lock)
                                  spin_lock_irq(hugetlb_lock)
                                   1) PageHuge() == F (freed by case#1 from CPU0)
                                   2) PageHuge() == T
                                       PageHugeFreed() == T
                                       - proceed with replacing the page

In the case above we retry as the race window is quite small and we have
high chances to succeed next time.

With regard to the allocation, we restrict it to the node the page belongs
to with __GFP_THISNODE, meaning we do not fall back to other nodes' zones.

Note that gigantic hugetlb pages are fenced off since there is a cyclic
dependency between them and alloc_contig_range.

Signed-off-by: Oscar Salvador
Acked-by: Michal Hocko
---
 include/linux/hugetlb.h |   6 +++
 mm/compaction.c         |  37 ++++++++++++++--
 mm/hugetlb.c            | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 155 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 09f1fd12a6fa..b2d2118bfd1a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -595,6 +595,7 @@ struct huge_bootmem_page {
 	struct hstate *hstate;
 };
 
+int isolate_or_dissolve_huge_page(struct page *page);
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
@@ -877,6 +878,11 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
+static inline int isolate_or_dissolve_huge_page(struct page *page)
+{
+	return -ENOMEM;
+}
+
 static inline struct page *alloc_huge_page(struct vm_area_struct *vma,
 					   unsigned long addr,
 					   int avoid_reserve)
diff --git a/mm/compaction.c b/mm/compaction.c
index eeba4668c22c..89426b6d1ea3 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -788,7 +788,7 @@ static bool too_many_isolated(pg_data_t *pgdat)
  * Isolate all pages that can be migrated from the range specified by
  * [low_pfn, end_pfn). The range is expected to be within same pageblock.
  * Returns errno, like -EAGAIN or -EINTR in case e.g signal pending or congestion,
- * or 0.
+ * -ENOMEM in case we could not allocate a page, or 0.
  * cc->migrate_pfn will contain the next pfn to scan (which may be both less,
  * equal to or more that end_pfn).
* @@ -809,6 +809,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, bool skip_on_failure = false; unsigned long next_skip_pfn = 0; bool skip_updated = false; + bool fatal_error = false; int ret = 0; cc->migrate_pfn = low_pfn; @@ -907,6 +908,33 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, valid_page = page; } + if (PageHuge(page) && cc->alloc_contig) { + ret = isolate_or_dissolve_huge_page(page); + + /* + * Fail isolation in case isolate_or_dissolve_huge_page + * reports an error. In case of -ENOMEM, abort right away. + */ + if (ret < 0) { + /* + * Do not report -EBUSY down the chain. + */ + if (ret == -ENOMEM) + fatal_error = true; + else + ret = 0; + low_pfn += (1UL << compound_order(page)) - 1; + goto isolate_fail; + } + + /* + * Ok, the hugepage was dissolved. Now these pages are + * Buddy and cannot be re-allocated because they are + * isolated. Fall-through as the check below handles + * Buddy pages. + */ + } + /* * Skip if free. We read page order here without zone lock * which is generally unsafe, but the race window is small and @@ -1066,7 +1094,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, put_page(page); isolate_fail: - if (!skip_on_failure) + if (!skip_on_failure && !fatal_error) continue; /* @@ -1092,6 +1120,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, */ next_skip_pfn += 1UL << cc->order; } + + if (fatal_error) + break; } /* @@ -1145,7 +1176,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, * @end_pfn: The one-past-last PFN. * * Returns errno, like -EAGAIN or -EINTR in case e.g signal pending or congestion, - * or 0. + * -ENOMEM in case we could not allocate a page, or 0. 
 */ int isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 0607b2b71ac6..4a664d6e82c1 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2266,6 +2266,121 @@ static void restore_reserve_on_error(struct hstate *h, } } +/* + * alloc_and_dissolve_huge_page - Allocate a new page and dissolve the old one + * @h: struct hstate old page belongs to + * @old_page: Old page to dissolve + * Returns 0 on success, otherwise negated error. + */ + +static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page) +{ + gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE; + int nid = page_to_nid(old_page); + struct page *new_page; + int ret = 0; + + /* + * Before dissolving the page, we need to allocate a new one for the + * pool to remain stable. Using alloc_buddy_huge_page() means we do not + * have to deal with prep_new_huge_page() or with any counters. This + * simplifies things and lets us do the whole replacement under the + * lock. + */ + new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL); + if (!new_page) + return -ENOMEM; + +retry: + spin_lock_irq(&hugetlb_lock); + if (!PageHuge(old_page)) { + /* + * Freed from under us. Drop new_page too. + */ + goto free_new; + } else if (page_count(old_page)) { + /* + * Someone has grabbed the page, fail for now. + */ + ret = -EBUSY; + goto free_new; + } else if (!HPageFreed(old_page)) { + /* + * Page's refcount is 0 but it has not been enqueued in the + * freelist yet. Race window is small, so we can succeed here if + * we retry. + */ + spin_unlock_irq(&hugetlb_lock); + cond_resched(); + goto retry; + } else { + /* + * Ok, old_page is still a genuine free hugepage. Remove it from + * the freelist and decrease the counters. These will be + * incremented again when calling __prep_account_new_huge_page() + * and enqueue_huge_page() for new_page. The counters will remain + * stable since this happens under the lock.
+ */ + remove_hugetlb_page(h, old_page, false); + + /* + * Call __prep_new_huge_page() to construct the hugetlb page, and + * then enqueue it to place it on the freelist. After this, + * counters are back on track. Free hugepages have a refcount of 0, + * so we need to decrease new_page's count as well. + */ + __prep_new_huge_page(new_page); + __prep_account_new_huge_page(h, nid); + page_ref_dec(new_page); + enqueue_huge_page(h, new_page); + + /* + * Pages have been replaced, we can safely free the old one. + */ + spin_unlock_irq(&hugetlb_lock); + update_and_free_page(h, old_page); + } + + return ret; + +free_new: + spin_unlock_irq(&hugetlb_lock); + __free_pages(new_page, huge_page_order(h)); + + return ret; +} + +int isolate_or_dissolve_huge_page(struct page *page) +{ + struct hstate *h; + struct page *head; + + /* + * The page might have been dissolved from under our feet, so make sure + * to carefully check the state under the lock. + * Return success when racing as if we dissolved the page ourselves. + */ + spin_lock_irq(&hugetlb_lock); + if (PageHuge(page)) { + head = compound_head(page); + h = page_hstate(head); + } else { + spin_unlock_irq(&hugetlb_lock); + return 0; + } + spin_unlock_irq(&hugetlb_lock); + + /* + * Fence off gigantic pages as there is a cyclic dependency between + * alloc_contig_range and them. Return -ENOMEM as this has the effect + * of bailing out right away without further retrying.
+ */ + if (hstate_is_gigantic(h)) + return -ENOMEM; + + return alloc_and_dissolve_huge_page(h, head); +} + struct page *alloc_huge_page(struct vm_area_struct *vma, unsigned long addr, int avoid_reserve) {

From patchwork Tue Apr 13 10:47:46 2021
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200069
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 6/7] mm: Make alloc_contig_range handle in-use hugetlb pages
Date: Tue, 13 Apr 2021 12:47:46 +0200
Message-Id: <20210413104747.12177-7-osalvador@suse.de>
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

alloc_contig_range() will fail if it finds a HugeTLB page within the range, without a chance to handle them. Since HugeTLB pages can be migrated as any LRU or Movable page, it does not make sense to bail out without trying. Enable the interface to recognize in-use HugeTLB pages so we can migrate them, and have much better chances to succeed the call.
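To make the resulting control flow easier to follow, here is a small userspace model of how the hugepage path might dispatch once in-use pages are handled. It is a sketch under assumptions, not the kernel implementation: struct mock_hpage, its isolatable field, the mock_* helpers and the outcome enum are all hypothetical. Only the order of the checks (gigantic, in use, free) and the way the caller treats -ENOMEM, -EBUSY and success are taken from the diff in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hugetlb head page. */
struct mock_hpage {
	bool huge;            /* models PageHuge() */
	bool gigantic;        /* models hstate_is_gigantic() */
	int refcount;         /* models page_count() */
	bool isolatable;      /* assumption: would isolate_huge_page() succeed? */
	bool on_migrate_list; /* set once the page sits on cc->migratepages */
};

/* Models isolate_huge_page(): true means the page was put on the list. */
bool mock_isolate(struct mock_hpage *p)
{
	if (!p->isolatable)
		return false;
	p->on_migrate_list = true;
	return true;
}

/* Models a successful alloc_and_dissolve_huge_page() on a free page. */
int mock_dissolve(struct mock_hpage *p)
{
	p->huge = false; /* the range now consists of buddy pages */
	return 0;
}

/* Mirrors the dispatch order of isolate_or_dissolve_huge_page(). */
int mock_isolate_or_dissolve(struct mock_hpage *p)
{
	int ret = -EBUSY;

	if (!p->huge)
		return 0;       /* dissolved by someone else: treat as success */
	if (p->gigantic)
		return -ENOMEM; /* fenced off: abort without retrying */

	if (p->refcount && mock_isolate(p))
		ret = 0;
	else if (!p->refcount)
		ret = mock_dissolve(p);

	return ret;
}

/* Hypothetical summary of what the compaction caller does with the result. */
enum outcome { OUT_ABORT, OUT_SKIP, OUT_MIGRATE, OUT_BUDDY };

enum outcome compaction_outcome(struct mock_hpage *p)
{
	int ret = mock_isolate_or_dissolve(p);

	if (ret == -ENOMEM)
		return OUT_ABORT;   /* fatal error, stop scanning */
	if (ret < 0)
		return OUT_SKIP;    /* -EBUSY is not reported down the chain */
	/* Still a hugepage: it was isolated. Otherwise it was dissolved. */
	return p->huge ? OUT_MIGRATE : OUT_BUDDY;
}
```

The second PageHuge() test in the caller is what tells an isolated in-use page (already on the migrate list, so it must not be added again) apart from a dissolved free page that falls through to the buddy handling.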
Signed-off-by: Oscar Salvador Reviewed-by: Mike Kravetz Acked-by: Michal Hocko --- include/linux/hugetlb.h | 5 +++-- mm/compaction.c | 12 +++++++++++- mm/hugetlb.c | 22 +++++++++++++++++----- mm/vmscan.c | 5 +++-- 4 files changed, 34 insertions(+), 10 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index b2d2118bfd1a..b92f25ccef58 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -595,7 +595,7 @@ struct huge_bootmem_page { struct hstate *hstate; }; -int isolate_or_dissolve_huge_page(struct page *page); +int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list); struct page *alloc_huge_page(struct vm_area_struct *vma, unsigned long addr, int avoid_reserve); struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid, @@ -878,7 +878,8 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, #else /* CONFIG_HUGETLB_PAGE */ struct hstate {}; -static inline int isolate_or_dissolve_huge_page(struct page *page) +static inline int isolate_or_dissolve_huge_page(struct page *page, + struct list_head *list) { return -ENOMEM; } diff --git a/mm/compaction.c b/mm/compaction.c index 89426b6d1ea3..bb8ff3543972 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -909,7 +909,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, } if (PageHuge(page) && cc->alloc_contig) { - ret = isolate_or_dissolve_huge_page(page); + ret = isolate_or_dissolve_huge_page(page, &cc->migratepages); /* * Fail isolation in case isolate_or_dissolve_huge_page @@ -927,6 +927,15 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, goto isolate_fail; } + if (PageHuge(page)) { + /* + * Hugepage was successfully isolated and placed + * on the cc->migratepages list. + */ + low_pfn += compound_nr(page) - 1; + goto isolate_success_no_list; + } + /* * Ok, the hugepage was dissolved. 
Now these pages are * Buddy and cannot be re-allocated because they are @@ -1068,6 +1077,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, isolate_success: list_add(&page->lru, &cc->migratepages); +isolate_success_no_list: cc->nr_migratepages += compound_nr(page); nr_isolated += compound_nr(page); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 4a664d6e82c1..24a453ff47f2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2270,10 +2270,12 @@ static void restore_reserve_on_error(struct hstate *h, * alloc_and_dissolve_huge_page - Allocate a new page and dissolve the old one * @h: struct hstate old page belongs to * @old_page: Old page to dissolve + * @list: List to isolate the page in case we need to * Returns 0 on success, otherwise negated error. */ -static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page) +static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page, + struct list_head *list) { gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE; int nid = page_to_nid(old_page); @@ -2300,9 +2302,13 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page) goto free_new; } else if (page_count(old_page)) { /* - * Someone has grabbed the page, fail for now. + * Someone has grabbed the page, try to isolate it here. + * Fail with -EBUSY if not possible. 
 */ - ret = -EBUSY; + spin_unlock_irq(&hugetlb_lock); + if (!isolate_huge_page(old_page, list)) + ret = -EBUSY; + spin_lock_irq(&hugetlb_lock); goto free_new; } else if (!HPageFreed(old_page)) { /* @@ -2350,10 +2356,11 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page) return ret; } -int isolate_or_dissolve_huge_page(struct page *page) +int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list) { struct hstate *h; struct page *head; + int ret = -EBUSY; /* * The page might have been dissolved from under our feet, so make sure @@ -2378,7 +2385,12 @@ int isolate_or_dissolve_huge_page(struct page *page) if (hstate_is_gigantic(h)) return -ENOMEM; - return alloc_and_dissolve_huge_page(h, head); + if (page_count(head) && isolate_huge_page(head, list)) + ret = 0; + else if (!page_count(head)) + ret = alloc_and_dissolve_huge_page(h, head, list); + + return ret; } struct page *alloc_huge_page(struct vm_area_struct *vma, diff --git a/mm/vmscan.c b/mm/vmscan.c index bb8321026c0c..5199b9696bab 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1703,8 +1703,9 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone, LIST_HEAD(clean_pages); list_for_each_entry_safe(page, next, page_list, lru) { - if (page_is_file_lru(page) && !PageDirty(page) && - !__PageMovable(page) && !PageUnevictable(page)) { + if (!PageHuge(page) && page_is_file_lru(page) && + !PageDirty(page) && !__PageMovable(page) && + !PageUnevictable(page)) { ClearPageActive(page); list_move(&page->lru, &clean_pages); }

From patchwork Tue Apr 13 10:47:47 2021
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12200071
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, Vlastimil Babka, David Hildenbrand, Michal Hocko, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v7 7/7] mm,page_alloc: Drop unnecessary checks from pfn_range_valid_contig
Date: Tue, 13 Apr 2021 12:47:47 +0200
Message-Id: <20210413104747.12177-8-osalvador@suse.de>
In-Reply-To: <20210413104747.12177-1-osalvador@suse.de>
References: <20210413104747.12177-1-osalvador@suse.de>

pfn_range_valid_contig() bails out when it finds an in-use page or a hugetlb page, among other things. We can drop the in-use page check since __alloc_contig_pages can migrate away those pages, and the hugetlb page check can go too since isolate_migratepages_range is now capable of dealing with hugetlb pages. Either way, those checks are racy so let the end function handle it when the time comes.

Signed-off-by: Oscar Salvador Suggested-by: David Hildenbrand Reviewed-by: David Hildenbrand Acked-by: Mike Kravetz --- mm/page_alloc.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index b5a94de3cdde..c5338e912ace 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -8901,12 +8901,6 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn, if (PageReserved(page)) return false; - - if (page_count(page) > 0) - return false; - - if (PageHuge(page)) - return false; } return true; }
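
The effect of dropping the two checks can be illustrated with a small userspace comparison. This is a sketch only: struct mock_pfn_page and range_valid() are hypothetical, and the model covers just the three per-page checks visible in the hunk above (PageReserved, page_count, PageHuge), not the rest of pfn_range_valid_contig().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state covering only the checks shown in the hunk. */
struct mock_pfn_page {
	bool reserved; /* models PageReserved() */
	int refcount;  /* models page_count() */
	bool huge;     /* models PageHuge() */
};

/*
 * old_checks == true models the behavior before this patch,
 * old_checks == false models the behavior after it.
 */
bool range_valid(const struct mock_pfn_page *pages, size_t n, bool old_checks)
{
	for (size_t i = 0; i < n; i++) {
		if (pages[i].reserved)
			return false;
		/* The two checks removed by the patch. */
		if (old_checks && (pages[i].refcount > 0 || pages[i].huge))
			return false;
	}
	return true;
}
```

In this model a range holding an in-use page or a hugetlb page is rejected with old_checks set and accepted without it, leaving the decision to the later isolation and migration steps. A reserved page rejects the range either way.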