From patchwork Wed Feb 17 10:08:15 2021
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12091265
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, David Hildenbrand, Muchun Song, Michal Hocko,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages
Date: Wed, 17 Feb 2021 11:08:15 +0100
Message-Id: <20210217100816.28860-2-osalvador@suse.de>
In-Reply-To: <20210217100816.28860-1-osalvador@suse.de>
References: <20210217100816.28860-1-osalvador@suse.de>

Free hugetlb pages are tricky to handle: so that no userspace
application notices any disruption, we need to replace the current
free hugepage with a new one.

In order to do that, a new function called alloc_and_dissolve_huge_page
is introduced. This function will first try to get a fresh hugetlb
page, and if it succeeds, it will dissolve the old one.

With regard to the allocation, since we do not know whether the old
page was allocated on a specific node on request, the node the old page
belongs to will be tried first, and then we will fall back to all nodes
containing memory (N_MEMORY).

Note that gigantic hugetlb pages are fenced off since there is a cyclic
dependency between them and alloc_contig_range.
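
The replace-then-dissolve ordering described above can be modelled in plain userspace C. This is only an illustrative sketch, not kernel code: `toy_pool`, `toy_alloc_fresh`, `toy_dissolve` and `toy_replace_free_page` are hypothetical names invented for this note, and the boolean knobs merely simulate an allocation failure or a page that can no longer be dissolved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct toy_pool {
	int nr_pages;     /* pages currently counted in the pool */
	int nr_surplus;   /* pages accounted as surplus */
	bool alloc_fails; /* simulate a failed fresh allocation */
	bool old_busy;    /* simulate the old page being taken meanwhile */
	bool new_busy;    /* simulate the new page being taken meanwhile */
};

/* Add one fresh page to the pool unless allocation is set to fail. */
static bool toy_alloc_fresh(struct toy_pool *p)
{
	if (p->alloc_fails)
		return false;
	p->nr_pages++;
	return true;
}

/* Remove one page from the pool unless that page is busy. */
static bool toy_dissolve(struct toy_pool *p, bool busy)
{
	assert(p->nr_pages > 0);
	if (busy)
		return false;
	p->nr_pages--;
	return true;
}

/* Returns true only when the old page was dissolved. */
static bool toy_replace_free_page(struct toy_pool *p)
{
	/* Allocate first, so the pool never shrinks below its size. */
	if (!toy_alloc_fresh(p))
		return false;

	if (toy_dissolve(p, p->old_busy))
		return true; /* one page in, one page out */

	/*
	 * The old page could not be dissolved: try to give the new one
	 * back, and count it as surplus if that fails as well.
	 */
	if (!toy_dissolve(p, p->new_busy))
		p->nr_surplus++;
	return false;
}
```

In this model, a successful call on a pool of four pages leaves the pool at four pages, which is the "pool remains stable" property the description relies on. When neither page can be dissolved, the pool temporarily holds five pages, one of them counted as surplus.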

Signed-off-by: Oscar Salvador
---
 include/linux/hugetlb.h |  6 ++++
 mm/compaction.c         | 12 +++++++
 mm/hugetlb.c            | 91 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 109 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b5807f23caf8..72352d718829 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -505,6 +505,7 @@ struct huge_bootmem_page {
 	struct hstate *hstate;
 };
 
+bool isolate_or_dissolve_huge_page(struct page *page);
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
@@ -775,6 +776,11 @@ void set_page_huge_active(struct page *page);
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
+static inline bool isolate_or_dissolve_huge_page(struct page *page)
+{
+	return false;
+}
+
 static inline struct page *alloc_huge_page(struct vm_area_struct *vma,
 					   unsigned long addr,
 					   int avoid_reserve)
diff --git a/mm/compaction.c b/mm/compaction.c
index 190ccdaa6c19..d52506ed9db7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -905,6 +905,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			valid_page = page;
 		}
 
+		if (PageHuge(page) && cc->alloc_contig) {
+			if (!isolate_or_dissolve_huge_page(page))
+				goto isolate_fail;
+
+			/*
+			 * Ok, the hugepage was dissolved. Now these pages are
+			 * Buddy and cannot be re-allocated because they are
+			 * isolated. Fall-through as the check below handles
+			 * Buddy pages.
+			 */
+		}
+
 		/*
 		 * Skip if free. We read page order here without zone lock
 		 * which is generally unsafe, but the race window is small and
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4bdb58ab14cb..b78926bca60a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2294,6 +2294,97 @@ static void restore_reserve_on_error(struct hstate *h,
 	}
 }
 
+static bool alloc_and_dissolve_huge_page(struct hstate *h, struct page *page)
+{
+	gfp_t gfp_mask = htlb_alloc_mask(h);
+	nodemask_t *nmask = &node_states[N_MEMORY];
+	struct page *new_page;
+	bool ret = false;
+	int nid;
+
+	spin_lock(&hugetlb_lock);
+	/*
+	 * Check one more time to make race-window smaller.
+	 */
+	if (!PageHuge(page)) {
+		/*
+		 * Dissolved from under our feet.
+		 */
+		spin_unlock(&hugetlb_lock);
+		return true;
+	}
+
+	nid = page_to_nid(page);
+	spin_unlock(&hugetlb_lock);
+
+	/*
+	 * Before dissolving the page, we need to allocate a new one,
+	 * so the pool remains stable.
+	 */
+	new_page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
+	if (new_page) {
+		/*
+		 * Ok, we got a new free hugepage to replace this one. Try to
+		 * dissolve the old page.
+		 */
+		if (!dissolve_free_huge_page(page)) {
+			ret = true;
+		} else if (dissolve_free_huge_page(new_page)) {
+			/*
+			 * Seems the old page could not be dissolved, so try to
+			 * dissolve the freshly allocated page. If that fails
+			 * too, let us count the new page as a surplus. Doing so
+			 * allows the pool to be re-balanced when pages are freed
+			 * instead of enqueued again.
+			 */
+			spin_lock(&hugetlb_lock);
+			h->surplus_huge_pages++;
+			h->surplus_huge_pages_node[nid]++;
+			spin_unlock(&hugetlb_lock);
+		}
+		/*
+		 * Free it into the hugepage allocator
+		 */
+		put_page(new_page);
+	}
+
+	return ret;
+}
+
+bool isolate_or_dissolve_huge_page(struct page *page)
+{
+	struct hstate *h = NULL;
+	struct page *head;
+	bool ret = false;
+
+	spin_lock(&hugetlb_lock);
+	if (PageHuge(page)) {
+		head = compound_head(page);
+		h = page_hstate(head);
+	}
+	spin_unlock(&hugetlb_lock);
+
+	if (!h)
+		/*
+		 * The page might have been dissolved from under our feet.
+		 * If that is the case, return success as if we dissolved it
+		 * ourselves.
+		 */
+		return true;
+
+	if (hstate_is_gigantic(h))
+		/*
+		 * Fence off gigantic pages as there is a cyclic dependency
+		 * between alloc_contig_range and them.
+		 */
+		return ret;
+
+	if(!page_count(head) && alloc_and_dissolve_huge_page(h, head))
+		ret = true;
+
+	return ret;
+}
+
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				    unsigned long addr, int avoid_reserve)
 {

From patchwork Wed Feb 17 10:08:16 2021
X-Patchwork-Submitter: Oscar Salvador
X-Patchwork-Id: 12091267
From: Oscar Salvador
To: Andrew Morton
Cc: Mike Kravetz, David Hildenbrand, Muchun Song, Michal Hocko,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH 2/2] mm: Make alloc_contig_range handle in-use hugetlb pages
Date: Wed, 17 Feb 2021 11:08:16 +0100
Message-Id: <20210217100816.28860-3-osalvador@suse.de>
In-Reply-To: <20210217100816.28860-1-osalvador@suse.de>
References: <20210217100816.28860-1-osalvador@suse.de>

In-use hugetlb pages can be migrated like any other page (LRU and
Movable), so let alloc_contig_range handle them.

All we need is to successfully isolate such a page.

Signed-off-by: Oscar Salvador
---
 include/linux/hugetlb.h |  5 +++--
 mm/compaction.c         | 11 ++++++++++-
 mm/hugetlb.c            |  6 ++++--
 mm/vmscan.c             |  5 +++--
 4 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 72352d718829..8c17d0dbc87c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -505,7 +505,7 @@ struct huge_bootmem_page {
 	struct hstate *hstate;
 };
 
-bool isolate_or_dissolve_huge_page(struct page *page);
+bool isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
@@ -776,7 +776,8 @@ void set_page_huge_active(struct page *page);
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
-static inline bool isolate_or_dissolve_huge_page(struct page *page)
+static inline bool isolate_or_dissolve_huge_page(struct page *page,
+						 struct list_head *list)
 {
 	return false;
 }
diff --git a/mm/compaction.c b/mm/compaction.c
index d52506ed9db7..55a41a9228a9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -906,9 +906,17 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		}
 
 		if (PageHuge(page) && cc->alloc_contig) {
-			if (!isolate_or_dissolve_huge_page(page))
+			if (!isolate_or_dissolve_huge_page(page, &cc->migratepages))
 				goto isolate_fail;
 
+			if (PageHuge(page)) {
+				/*
+				 * Hugepage was successfully isolated.
+				 */
+				low_pfn += compound_nr(page) - 1;
+				goto isolate_success_no_list;
+			}
+
 			/*
 			 * Ok, the hugepage was dissolved. Now these pages are
 			 * Buddy and cannot be re-allocated because they are
@@ -1053,6 +1061,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_success:
 		list_add(&page->lru, &cc->migratepages);
+isolate_success_no_list:
 		cc->nr_migratepages += compound_nr(page);
 		nr_isolated += compound_nr(page);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b78926bca60a..9fa678d13c68 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2351,7 +2351,7 @@ static bool alloc_and_dissolve_huge_page(struct hstate *h, struct page *page)
 	return ret;
 }
 
-bool isolate_or_dissolve_huge_page(struct page *page)
+bool isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 {
 	struct hstate *h = NULL;
 	struct page *head;
@@ -2379,7 +2379,9 @@ bool isolate_or_dissolve_huge_page(struct page *page)
 		 */
 		return ret;
 
-	if(!page_count(head) && alloc_and_dissolve_huge_page(h, head))
+	if (page_count(head) && isolate_huge_page(head, list))
+		ret = true;
+	else if(!page_count(head) && alloc_and_dissolve_huge_page(h, head))
 		ret = true;
 
 	return ret;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b1b574ad199d..0803adca4469 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1506,8 +1506,9 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 	LIST_HEAD(clean_pages);
 
 	list_for_each_entry_safe(page, next, page_list, lru) {
-		if (page_is_file_lru(page) && !PageDirty(page) &&
-		    !__PageMovable(page) && !PageUnevictable(page)) {
+		if (!PageHuge(page) && page_is_file_lru(page) &&
+		    !PageDirty(page) && !__PageMovable(page) &&
+		    !PageUnevictable(page)) {
 			ClearPageActive(page);
 			list_move(&page->lru, &clean_pages);
 		}
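
As a reading aid for the series as a whole, the decision that isolate_or_dissolve_huge_page() makes once both patches are applied can be summarised with a small userspace model. This is a sketch under stated assumptions, not kernel code: `toy_hpage`, `toy_action` and `toy_classify` are hypothetical names, and the model ignores locking and the window in which the reference count can change between the two page_count() checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of a page's state; not a kernel structure. */
struct toy_hpage {
	bool is_huge;     /* still a hugetlb page? */
	bool is_gigantic; /* belongs to a gigantic hstate? */
	int refcount;     /* 0 models a free page sitting in the pool */
};

/* Hypothetical labels for the four outcomes. */
enum toy_action {
	TOY_ALREADY_GONE, /* dissolved under our feet: report success */
	TOY_REFUSE,       /* gigantic page: report failure */
	TOY_ISOLATE,      /* in use: isolate it for migration (patch 2) */
	TOY_REPLACE,      /* free: allocate a replacement, then dissolve (patch 1) */
};

/* Pick the path the series would take for the given snapshot. */
static enum toy_action toy_classify(const struct toy_hpage *pg)
{
	assert(pg->refcount >= 0);

	if (!pg->is_huge)
		return TOY_ALREADY_GONE;
	if (pg->is_gigantic)
		return TOY_REFUSE;
	return pg->refcount ? TOY_ISOLATE : TOY_REPLACE;
}
```

The gigantic check comes before the reference count check in both the model and the patch, so a gigantic page is refused whether it is free or in use.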