From patchwork Tue Dec 10 10:29:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13901213
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Zi Yan,
 Vlastimil Babka, Yu Zhao
Subject: [PATCH v2 1/2] mm/page_alloc: conditionally split > pageblock_order
 pages in free_one_page() and move_freepages_block_isolate()
Date: Tue, 10 Dec 2024 11:29:52 +0100
Message-ID: <20241210102953.218122-2-david@redhat.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20241210102953.218122-1-david@redhat.com>
References: <20241210102953.218122-1-david@redhat.com>

Let's special-case for the common scenarios that:

(a) We are freeing pages <= pageblock_order

(b) We are freeing a page <= MAX_PAGE_ORDER and all pageblocks match
    (especially, no mixture of isolated and non-isolated pageblocks)

When we encounter a > MAX_PAGE_ORDER page,
it can only come from alloc_contig_range(), and we can process
MAX_PAGE_ORDER chunks.

When we encounter a >pageblock_order <= MAX_PAGE_ORDER page, check whether
all pageblocks match, and if so (common case), don't split them up just
for the buddy to merge them back.

This makes sure that when we free MAX_PAGE_ORDER chunks to the buddy, for
example during system startups, memory onlining, or when isolating
consecutive pageblocks via alloc_contig_range()/memory offlining, that we
don't unnecessarily split up what we'll immediately merge again, because
the migratetypes match.

Rename split_large_buddy() to __free_one_page_maybe_split(), to make it
clearer what's happening, and handle in it only natural buddy orders, not
the alloc_contig_range(__GFP_COMP) special case: handle that in
free_one_page() only.

In the future, we might want to assume that all pageblocks are equal if
zone->nr_isolate_pageblock == 0; however, that will require some
zone->nr_isolate_pageblock accounting changes, such that we are
guaranteed to see zone->nr_isolate_pageblock != 0 when there is an
isolated pageblock.

Reviewed-by: Zi Yan
Acked-by: Yu Zhao
Acked-by: Vlastimil Babka
---
 mm/page_alloc.c | 73 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 59 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a52c6022c65cb..444e4bcb9c7c6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1225,27 +1225,53 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-/* Split a multi-block free page into its individual pageblocks. */
-static void split_large_buddy(struct zone *zone, struct page *page,
-			      unsigned long pfn, int order, fpi_t fpi)
+static bool pfnblock_migratetype_equal(unsigned long pfn,
+		unsigned long end_pfn, int mt)
 {
-	unsigned long end = pfn + (1 << order);
+	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn | end_pfn, pageblock_nr_pages));
 
+	while (pfn != end_pfn) {
+		struct page *page = pfn_to_page(pfn);
+
+		if (unlikely(mt != get_pfnblock_migratetype(page, pfn)))
+			return false;
+		pfn += pageblock_nr_pages;
+	}
+	return true;
+}
+
+static void __free_one_page_maybe_split(struct zone *zone, struct page *page,
+		unsigned long pfn, int order, fpi_t fpi_flags)
+{
+	const unsigned long end_pfn = pfn + (1 << order);
+	int mt = get_pfnblock_migratetype(page, pfn);
+
+	VM_WARN_ON_ONCE(order > MAX_PAGE_ORDER);
 	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn, 1 << order));
 	/* Caller removed page from freelist, buddy info cleared! */
 	VM_WARN_ON_ONCE(PageBuddy(page));
 
-	if (order > pageblock_order)
-		order = pageblock_order;
+	/*
+	 * With CONFIG_MEMORY_ISOLATION, we might be freeing MAX_ORDER_NR_PAGES
+	 * pages that cover pageblocks with different migratetypes; for example
+	 * only some migratetypes might be MIGRATE_ISOLATE. In that (unlikely)
+	 * case, fallback to freeing individual pageblocks so they get put
+	 * onto the right lists.
+	 */
+	if (!IS_ENABLED(CONFIG_MEMORY_ISOLATION) ||
+	    likely(order <= pageblock_order) ||
+	    pfnblock_migratetype_equal(pfn + pageblock_nr_pages, end_pfn, mt)) {
+		__free_one_page(page, pfn, zone, order, mt, fpi_flags);
+		return;
+	}
 
 	do {
-		int mt = get_pfnblock_migratetype(page, pfn);
-
-		__free_one_page(page, pfn, zone, order, mt, fpi);
-		pfn += 1 << order;
-		if (pfn == end)
+		__free_one_page(page, pfn, zone, pageblock_order, mt, fpi_flags);
+		pfn += pageblock_nr_pages;
+		if (pfn == end_pfn)
 			break;
 		page = pfn_to_page(pfn);
+		mt = get_pfnblock_migratetype(page, pfn);
 	} while (1);
 }
 
@@ -1256,7 +1282,26 @@ static void free_one_page(struct zone *zone, struct page *page,
 	unsigned long flags;
 
 	spin_lock_irqsave(&zone->lock, flags);
-	split_large_buddy(zone, page, pfn, order, fpi_flags);
+	if (likely(order <= MAX_PAGE_ORDER)) {
+		__free_one_page_maybe_split(zone, page, pfn, order, fpi_flags);
+	} else if (IS_ENABLED(CONFIG_CONTIG_ALLOC)) {
+		const unsigned long end_pfn = pfn + (1 << order);
+
+		/*
+		 * The only way we can end up with order > MAX_PAGE_ORDER is
+		 * through alloc_contig_range(__GFP_COMP).
+		 */
+		do {
+			__free_one_page_maybe_split(zone, page, pfn,
+						    MAX_PAGE_ORDER, fpi_flags);
+			pfn += MAX_ORDER_NR_PAGES;
+			if (pfn == end_pfn)
+				break;
+			page = pfn_to_page(pfn);
+		} while (1);
+	} else {
+		WARN_ON_ONCE(1);
+	}
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	__count_vm_events(PGFREE, 1 << order);
@@ -1792,7 +1837,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 		del_page_from_free_list(buddy, zone, order,
 					get_pfnblock_migratetype(buddy, pfn));
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
+		__free_one_page_maybe_split(zone, buddy, pfn, order, FPI_NONE);
 		return true;
 	}
 
@@ -1803,7 +1848,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 		del_page_from_free_list(page, zone, order,
 					get_pfnblock_migratetype(page, pfn));
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, page, pfn, order, FPI_NONE);
+		__free_one_page_maybe_split(zone, page, pfn, order, FPI_NONE);
 		return true;
 	}
 move: