From patchwork Wed Aug 14 03:54:49 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13762817
Date: Tue, 13 Aug 2024 21:54:49 -0600
In-Reply-To: <20240814035451.773331-1-yuzhao@google.com>
References: <20240814035451.773331-1-yuzhao@google.com>
Message-ID: <20240814035451.773331-2-yuzhao@google.com>
Subject: [PATCH mm-unstable v2 1/3] mm/contig_alloc: support __GFP_COMP
From: Yu Zhao
To: Andrew Morton, Muchun Song
Cc: "Matthew Wilcox (Oracle)", Zi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Yu Zhao
X-Mailer: git-send-email 2.46.0.76.ge559c4bf1a-goog

Support __GFP_COMP in alloc_contig_range().
When the flag is set, upon success the function returns a large folio
prepared by prep_new_page(), rather than a range of order-0 pages
prepared by split_free_pages() (which is renamed from
split_map_pages()).

alloc_contig_range() can be used to allocate folios larger than
MAX_PAGE_ORDER, e.g., gigantic hugeTLB folios. So on the free path,
free_one_page() needs to handle that by split_large_buddy().

Signed-off-by: Yu Zhao
---
 include/linux/gfp.h |  23 +++++++++
 mm/compaction.c     |  41 ++--------------
 mm/page_alloc.c     | 111 +++++++++++++++++++++++++++++++-------------
 3 files changed, 108 insertions(+), 67 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f53f76e0b17e..59266df56aeb 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -446,4 +446,27 @@ extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_
 #endif
 void free_contig_range(unsigned long pfn, unsigned long nr_pages);
 
+#ifdef CONFIG_CONTIG_ALLOC
+static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
+						int nid, nodemask_t *node)
+{
+	struct page *page;
+
+	if (WARN_ON(!order || !(gfp & __GFP_COMP)))
+		return NULL;
+
+	page = alloc_contig_pages_noprof(1 << order, gfp, nid, node);
+
+	return page ? page_folio(page) : NULL;
+}
+#else
+static inline struct folio *folio_alloc_gigantic_noprof(int order, gfp_t gfp,
+						int nid, nodemask_t *node)
+{
+	return NULL;
+}
+#endif
+/* This should be paired with folio_put() rather than free_contig_range(). */
+#define folio_alloc_gigantic(...)	alloc_hooks(folio_alloc_gigantic_noprof(__VA_ARGS__))
+
 #endif /* __LINUX_GFP_H */
diff --git a/mm/compaction.c b/mm/compaction.c
index eb95e9b435d0..d1041fbce679 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -86,33 +86,6 @@ static struct page *mark_allocated_noprof(struct page *page, unsigned int order,
 }
 #define mark_allocated(...)	alloc_hooks(mark_allocated_noprof(__VA_ARGS__))
 
-static void split_map_pages(struct list_head *freepages)
-{
-	unsigned int i, order;
-	struct page *page, *next;
-	LIST_HEAD(tmp_list);
-
-	for (order = 0; order < NR_PAGE_ORDERS; order++) {
-		list_for_each_entry_safe(page, next, &freepages[order], lru) {
-			unsigned int nr_pages;
-
-			list_del(&page->lru);
-
-			nr_pages = 1 << order;
-
-			mark_allocated(page, order, __GFP_MOVABLE);
-			if (order)
-				split_page(page, order);
-
-			for (i = 0; i < nr_pages; i++) {
-				list_add(&page->lru, &tmp_list);
-				page++;
-			}
-		}
-		list_splice_init(&tmp_list, &freepages[0]);
-	}
-}
-
 static unsigned long release_free_list(struct list_head *freepages)
 {
 	int order;
@@ -742,11 +715,11 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
  *
  * Non-free pages, invalid PFNs, or zone boundaries within the
  * [start_pfn, end_pfn) range are considered errors, cause function to
- * undo its actions and return zero.
+ * undo its actions and return zero. cc->freepages[] are empty.
  *
  * Otherwise, function returns one-past-the-last PFN of isolated page
  * (which may be greater then end_pfn if end fell in a middle of
- * a free page).
+ * a free page). cc->freepages[] contain free pages isolated.
  */
 unsigned long
 isolate_freepages_range(struct compact_control *cc,
@@ -754,10 +727,9 @@ isolate_freepages_range(struct compact_control *cc,
 {
 	unsigned long isolated, pfn, block_start_pfn, block_end_pfn;
 	int order;
-	struct list_head tmp_freepages[NR_PAGE_ORDERS];
 
 	for (order = 0; order < NR_PAGE_ORDERS; order++)
-		INIT_LIST_HEAD(&tmp_freepages[order]);
+		INIT_LIST_HEAD(&cc->freepages[order]);
 
 	pfn = start_pfn;
 	block_start_pfn = pageblock_start_pfn(pfn);
@@ -788,7 +760,7 @@ isolate_freepages_range(struct compact_control *cc,
 			break;
 
 		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, tmp_freepages, 0, true);
+					block_end_pfn, cc->freepages, 0, true);
 
 		/*
 		 * In strict mode, isolate_freepages_block() returns 0 if
@@ -807,13 +779,10 @@ isolate_freepages_range(struct compact_control *cc,
 
 	if (pfn < end_pfn) {
 		/* Loop terminated early, cleanup. */
-		release_free_list(tmp_freepages);
+		release_free_list(cc->freepages);
 		return 0;
 	}
 
-	/* __isolate_free_page() does not map the pages */
-	split_map_pages(tmp_freepages);
-
 	/* We don't use freelists for anything. */
 	return pfn;
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5841bbea482a..0a43e4ea29e4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1197,16 +1197,36 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
+/* Split a multi-block free page into its individual pageblocks. */
+static void split_large_buddy(struct zone *zone, struct page *page,
+			      unsigned long pfn, int order, fpi_t fpi)
+{
+	unsigned long end = pfn + (1 << order);
+
+	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn, 1 << order));
+	/* Caller removed page from freelist, buddy info cleared! */
+	VM_WARN_ON_ONCE(PageBuddy(page));
+
+	if (order > pageblock_order)
+		order = pageblock_order;
+
+	while (pfn != end) {
+		int mt = get_pfnblock_migratetype(page, pfn);
+
+		__free_one_page(page, pfn, zone, order, mt, fpi);
+		pfn += 1 << order;
+		page = pfn_to_page(pfn);
+	}
+}
+
 static void free_one_page(struct zone *zone, struct page *page,
 			  unsigned long pfn, unsigned int order,
 			  fpi_t fpi_flags)
 {
 	unsigned long flags;
-	int migratetype;
 
 	spin_lock_irqsave(&zone->lock, flags);
-	migratetype = get_pfnblock_migratetype(page, pfn);
-	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
+	split_large_buddy(zone, page, pfn, order, fpi_flags);
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
@@ -1698,27 +1718,6 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
 	return start_pfn;
 }
 
-/* Split a multi-block free page into its individual pageblocks */
-static void split_large_buddy(struct zone *zone, struct page *page,
-			      unsigned long pfn, int order)
-{
-	unsigned long end_pfn = pfn + (1 << order);
-
-	VM_WARN_ON_ONCE(order <= pageblock_order);
-	VM_WARN_ON_ONCE(pfn & (pageblock_nr_pages - 1));
-
-	/* Caller removed page from freelist, buddy info cleared! */
-	VM_WARN_ON_ONCE(PageBuddy(page));
-
-	while (pfn != end_pfn) {
-		int mt = get_pfnblock_migratetype(page, pfn);
-
-		__free_one_page(page, pfn, zone, pageblock_order, mt, FPI_NONE);
-		pfn += pageblock_nr_pages;
-		page = pfn_to_page(pfn);
-	}
-}
-
 /**
  * move_freepages_block_isolate - move free pages in block for page isolation
  * @zone: the zone
@@ -1759,7 +1758,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 		del_page_from_free_list(buddy, zone, order,
 					get_pfnblock_migratetype(buddy, pfn));
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, buddy, pfn, order);
+		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
 		return true;
 	}
 
@@ -1770,7 +1769,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 		del_page_from_free_list(page, zone, order,
 					get_pfnblock_migratetype(page, pfn));
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, page, pfn, order);
+		split_large_buddy(zone, page, pfn, order, FPI_NONE);
 		return true;
 	}
 move:
@@ -6440,6 +6439,31 @@ int __alloc_contig_migrate_range(struct compact_control *cc,
 	return (ret < 0) ? ret : 0;
 }
 
+static void split_free_pages(struct list_head *list)
+{
+	int order;
+
+	for (order = 0; order < NR_PAGE_ORDERS; order++) {
+		struct page *page, *next;
+		int nr_pages = 1 << order;
+
+		list_for_each_entry_safe(page, next, &list[order], lru) {
+			int i;
+
+			post_alloc_hook(page, order, __GFP_MOVABLE);
+			if (!order)
+				continue;
+
+			split_page(page, order);
+
+			/* Add all subpages to the order-0 head, in sequence. */
+			list_del(&page->lru);
+			for (i = 0; i < nr_pages; i++)
+				list_add_tail(&page[i].lru, &list[0]);
+		}
+	}
+}
+
 /**
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
@@ -6552,12 +6576,25 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 		goto done;
 	}
 
-	/* Free head and tail (if any) */
-	if (start != outer_start)
-		free_contig_range(outer_start, start - outer_start);
-	if (end != outer_end)
-		free_contig_range(end, outer_end - end);
+	if (!(gfp_mask & __GFP_COMP)) {
+		split_free_pages(cc.freepages);
+
+		/* Free head and tail (if any) */
+		if (start != outer_start)
+			free_contig_range(outer_start, start - outer_start);
+		if (end != outer_end)
+			free_contig_range(end, outer_end - end);
+	} else if (start == outer_start && end == outer_end &&
+		   is_power_of_2(end - start)) {
+		struct page *head = pfn_to_page(start);
+		int order = ilog2(end - start);
+
+		check_new_pages(head, order);
+		prep_new_page(head, order, gfp_mask, 0);
+	} else {
+		ret = -EINVAL;
+		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
+		     start, end, outer_start, outer_end);
+	}
 done:
 	undo_isolate_page_range(start, end, migratetype);
 	return ret;
@@ -6666,6 +6703,18 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
 void free_contig_range(unsigned long pfn, unsigned long nr_pages)
 {
 	unsigned long count = 0;
+	struct folio *folio = pfn_folio(pfn);
+
+	if (folio_test_large(folio)) {
+		int expected = folio_nr_pages(folio);
+
+		if (nr_pages == expected)
+			folio_put(folio);
+		else
+			WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
+			     pfn, nr_pages, expected);
+		return;
+	}
 
 	for (; nr_pages--; pfn++) {
 		struct page *page = pfn_to_page(pfn);