From patchwork Fri Feb 14 06:32:09 2025
X-Patchwork-Submitter: Ge Yang
X-Patchwork-Id: 13974533
From: yangge1116@126.com
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org,
 21cnbao@gmail.com, david@redhat.com, baolin.wang@linux.alibaba.com,
 muchun.song@linux.dev, osalvador@suse.de, liuzixing@hygon.cn, Ge Yang
Subject: [PATCH] mm/hugetlb: wait for hugepage folios to be freed
Date: Fri, 14 Feb 2025 14:32:09 +0800
Message-Id: <1739514729-21265-1-git-send-email-yangge1116@126.com>

From: Ge Yang

Since the introduction of commit b65d4adbc0f0 ("mm: hugetlb: defer
freeing of HugeTLB pages"), which supports deferring the freeing of
HugeTLB pages, the allocation of contiguous memory through cma_alloc()
may fail intermittently.

In the CMA allocation process, if the CMA area is found to be occupied
by in-use hugepage folios, these folios need to be migrated to another
location. When there are no available hugepage folios in the free
HugeTLB pool during the migration of in-use HugeTLB pages, new folios
are allocated from the buddy system. A temporary state is set on the
newly allocated folios. Upon completion of the hugepage folio migration,
the temporary state is transferred from the new folios to the old
folios. Normally, when old folios with the temporary state are freed,
they are released directly back to the buddy system. However, due to the
deferred freeing of HugeTLB pages, the old folios may not yet be back in
the buddy system when the PageBuddy() check runs, so the check fails,
ultimately leading to the failure of cma_alloc().

Here is a simplified call trace illustrating the process:

cma_alloc()
    ->__alloc_contig_migrate_range() // Migrate in-use hugepage
        ->unmap_and_move_huge_page()
            ->folio_putback_hugetlb() // Free old folios
    ->test_pages_isolated()
        ->__test_page_isolated_in_pageblock()
            ->PageBuddy(page) // Check if the page is in buddy

To resolve this issue, add a function named
wait_for_hugepage_folios_freed(), which flushes the deferred-free work
so that hugepage folios are released back to the buddy system once their
migration is completed. Invoking wait_for_hugepage_folios_freed() after
the migration ensures that test_pages_isolated() no longer fails because
of a migrated folio that is still waiting to be freed.
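
The ordering problem described above can be sketched with a toy
user-space model. This is an illustration only, not kernel code: every
toy_* name is hypothetical, and the model simply assumes that a freed
folio sits on a pending list until something drains that list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these names are kernel APIs. */
enum toy_state { TOY_IN_USE, TOY_FREE_PENDING, TOY_IN_BUDDY };

struct toy_folio {
	enum toy_state state;
	struct toy_folio *next;	/* link in the pending (deferred-free) list */
};

/* Stands in for the list of folios whose freeing has been deferred. */
static struct toy_folio *toy_pending;

/* Models a deferred free: the folio is queued, not yet back in "buddy". */
static void toy_putback(struct toy_folio *f)
{
	f->state = TOY_FREE_PENDING;
	f->next = toy_pending;
	toy_pending = f;
}

/* Models draining the pending list, i.e. what a flush would wait for. */
static void toy_flush(void)
{
	while (toy_pending) {
		struct toy_folio *f = toy_pending;

		toy_pending = f->next;
		f->next = NULL;
		f->state = TOY_IN_BUDDY;
	}
}

/* Models a PageBuddy()-style check made by the isolation test. */
static bool toy_is_in_buddy(const struct toy_folio *f)
{
	return f->state == TOY_IN_BUDDY;
}
```

In this model, calling toy_is_in_buddy() right after toy_putback()
returns false, and it returns true only once toy_flush() has run. That
is the ordering the patch enforces by flushing before the isolation
check.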

Fixes: b65d4adbc0f0 ("mm: hugetlb: defer freeing of HugeTLB pages")
Signed-off-by: Ge Yang
---
 include/linux/hugetlb.h |  5 +++++
 mm/hugetlb.c            |  7 +++++++
 mm/migrate.c            | 16 ++++++++++++++--
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 6c6546b..c39e0d5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -697,6 +697,7 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_hugepage_folios_freed(struct hstate *h);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				unsigned long addr, bool cow_from_owner);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1092,6 +1093,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
 	return 0;
 }
 
+static inline void wait_for_hugepage_folios_freed(struct hstate *h)
+{
+}
+
 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 					   unsigned long addr,
 					   bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 30bc34d..64cae39 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2955,6 +2955,13 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
 	return ret;
 }
 
+void wait_for_hugepage_folios_freed(struct hstate *h)
+{
+	WARN_ON(!h);
+
+	flush_free_hpage_work(h);
+}
+
 typedef enum {
 	/*
 	 * For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/migrate.c b/mm/migrate.c
index fb19a18..5dd1851 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1448,6 +1448,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,
 	int page_was_mapped = 0;
 	struct anon_vma *anon_vma = NULL;
 	struct address_space *mapping = NULL;
+	unsigned long size;
 
 	if (folio_ref_count(src) == 1) {
 		/* page was freed from under us. So we are done. */
@@ -1533,9 +1534,20 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,
 out_unlock:
 	folio_unlock(src);
 out:
-	if (rc == MIGRATEPAGE_SUCCESS)
+	if (rc == MIGRATEPAGE_SUCCESS) {
+		size = folio_size(src);
 		folio_putback_hugetlb(src);
-	else if (rc != -EAGAIN)
+
+		/*
+		 * Due to the deferred freeing of HugeTLB folios, the hugepage 'src' may
+		 * not immediately release to the buddy system. This can lead to failure
+		 * in allocating memory through the cma_alloc() function. To ensure that
+		 * the hugepage folios are properly released back to the buddy system,
+		 * we invoke the wait_for_hugepage_folios_freed() function to wait for
+		 * the release to complete.
+		 */
+		wait_for_hugepage_folios_freed(size_to_hstate(size));
+	} else if (rc != -EAGAIN)
 		list_move_tail(&src->lru, ret);
 
 	/*