From patchwork Thu May 26 23:15:31 2022
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 12862892
From: Zi Yan
To: Andrew Morton , David Hildenbrand , linux-mm@kvack.org
Cc:
linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, Qian Cai , Vlastimil Babka , Mel Gorman , Eric Ren , Mike Rapoport , Oscar Salvador , Christophe Leroy , Zi Yan , Doug Berger
Subject: [PATCH 2/2] mm: split free page with proper free memory accounting and without race
Date: Thu, 26 May 2022 19:15:31 -0400
Message-Id: <20220526231531.2404977-2-zi.yan@sent.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220526231531.2404977-1-zi.yan@sent.com>
References: <20220526231531.2404977-1-zi.yan@sent.com>
Reply-To: Zi Yan

From: Zi Yan

In isolate_single_pageblock(), free pages are checked without holding the
zone lock, but they can go away in split_free_page() by the time the zone
lock is taken. Check the free page and its order again in
split_free_page() once the zone lock is held, and have the caller recheck
the pfn if the free page changed in the meantime.

In addition, split_free_page() deleted the free page from the free list
without updating the free page accounting. Add the missing free page
accounting code.

Also fix the type of the order parameter in split_free_page().
Link: https://lore.kernel.org/lkml/20220525103621.987185e2ca0079f7b97b856d@linux-foundation.org/
Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Reported-by: Doug Berger
Link: https://lore.kernel.org/linux-mm/c3932a6f-77fe-29f7-0c29-fe6b1c67ab7b@gmail.com/
Signed-off-by: Zi Yan
---
 mm/internal.h       |  4 ++--
 mm/page_alloc.c     | 24 ++++++++++++++++++++----
 mm/page_isolation.c | 10 +++++++---
 3 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 20e0a990da40..7cf12a15475b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -374,8 +374,8 @@ extern void *memmap_alloc(phys_addr_t size, phys_addr_t align,
 			  phys_addr_t min_addr,
 			  int nid, bool exact_nid);
 
-void split_free_page(struct page *free_page,
-		int order, unsigned long split_pfn_offset);
+int split_free_page(struct page *free_page,
+		unsigned int order, unsigned long split_pfn_offset);
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 355bd017b185..2717d6dede99 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1112,30 +1112,44 @@ static inline void __free_one_page(struct page *page,
  * @order:		the order of the page
  * @split_pfn_offset:	split offset within the page
  *
+ * Return -ENOENT if the free page is changed, otherwise 0
+ *
  * It is used when the free page crosses two pageblocks with different migratetypes
  * at split_pfn_offset within the page. The split free page will be put into
  * separate migratetype lists afterwards. Otherwise, the function achieves
  * nothing.
 */
-void split_free_page(struct page *free_page,
-		int order, unsigned long split_pfn_offset)
+int split_free_page(struct page *free_page,
+		unsigned int order, unsigned long split_pfn_offset)
 {
 	struct zone *zone = page_zone(free_page);
 	unsigned long free_page_pfn = page_to_pfn(free_page);
 	unsigned long pfn;
 	unsigned long flags;
 	int free_page_order;
+	int mt;
+	int ret = 0;
 
 	if (split_pfn_offset == 0)
-		return;
+		return ret;
 
 	spin_lock_irqsave(&zone->lock, flags);
+
+	if (!PageBuddy(free_page) || buddy_order(free_page) != order) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	mt = get_pageblock_migratetype(free_page);
+	if (likely(!is_migrate_isolate(mt)))
+		__mod_zone_freepage_state(zone, -(1UL << order), mt);
+
 	del_page_from_free_list(free_page, zone, order);
 	for (pfn = free_page_pfn;
 	     pfn < free_page_pfn + (1UL << order);) {
 		int mt = get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
 
-		free_page_order = min_t(int,
+		free_page_order = min_t(unsigned int,
 					pfn ? __ffs(pfn) : order,
 					__fls(split_pfn_offset));
 		__free_one_page(pfn_to_page(pfn), pfn, zone, free_page_order,
@@ -1146,7 +1160,9 @@ void split_free_page(struct page *free_page,
 		if (split_pfn_offset == 0)
 			split_pfn_offset = (1UL << order) - (pfn - free_page_pfn);
 	}
+out:
 	spin_unlock_irqrestore(&zone->lock, flags);
+	return ret;
 }
 
 /*
  * A bad page could be due to a number of fields.
  * Instead of multiple branches,
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index fbd820b21292..6021f8444b5a 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -371,9 +371,13 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 		if (PageBuddy(page)) {
 			int order = buddy_order(page);
 
-			if (pfn + (1UL << order) > boundary_pfn)
-				split_free_page(page, order, boundary_pfn - pfn);
-			pfn += (1UL << order);
+			if (pfn + (1UL << order) > boundary_pfn) {
+				/* free page changed before split, check it again */
+				if (split_free_page(page, order, boundary_pfn - pfn))
+					continue;
+			}
+
+			pfn += 1UL << order;
 			continue;
 		}
 		/*