From patchwork Sun Aug 11 21:21:27 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13759888
Date: Sun, 11 Aug 2024 15:21:27 -0600
In-Reply-To: <20240811212129.3074314-1-yuzhao@google.com>
References: <20240811212129.3074314-1-yuzhao@google.com>
Message-ID: <20240811212129.3074314-2-yuzhao@google.com>
X-Mailer: git-send-email 2.46.0.76.ge559c4bf1a-goog
Subject: [PATCH mm-unstable v1 1/3] mm/contig_alloc: support __GFP_COMP
From: Yu Zhao
To: Andrew Morton, Muchun Song
Cc: "Matthew Wilcox (Oracle)", Zi Yan, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Yu Zhao

Support __GFP_COMP in alloc_contig_range().
When the flag is set, upon success the function returns a large folio
prepared by prep_new_page(), rather than a range of order-0 pages
prepared by split_free_pages() (which is renamed from split_map_pages()).

alloc_contig_range() can return folios larger than MAX_PAGE_ORDER, e.g.,
gigantic hugeTLB folios. As a result, on the free path, free_one_page()
needs to handle this case by calling split_large_buddy(), in addition to
free_contig_range() properly handling large folios by folio_put().

Signed-off-by: Yu Zhao
---
 mm/compaction.c |  48 +++------------------
 mm/internal.h   |   9 ++++
 mm/page_alloc.c | 111 ++++++++++++++++++++++++++++++++++--------------
 3 files changed, 94 insertions(+), 74 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index eb95e9b435d0..1ebfef98e1d0 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -79,40 +79,6 @@ static inline bool is_via_compact_memory(int order) { return false; }
 #define COMPACTION_HPAGE_ORDER	(PMD_SHIFT - PAGE_SHIFT)
 #endif
 
-static struct page *mark_allocated_noprof(struct page *page, unsigned int order, gfp_t gfp_flags)
-{
-	post_alloc_hook(page, order, __GFP_MOVABLE);
-	return page;
-}
-#define mark_allocated(...)	alloc_hooks(mark_allocated_noprof(__VA_ARGS__))
-
-static void split_map_pages(struct list_head *freepages)
-{
-	unsigned int i, order;
-	struct page *page, *next;
-	LIST_HEAD(tmp_list);
-
-	for (order = 0; order < NR_PAGE_ORDERS; order++) {
-		list_for_each_entry_safe(page, next, &freepages[order], lru) {
-			unsigned int nr_pages;
-
-			list_del(&page->lru);
-
-			nr_pages = 1 << order;
-
-			mark_allocated(page, order, __GFP_MOVABLE);
-			if (order)
-				split_page(page, order);
-
-			for (i = 0; i < nr_pages; i++) {
-				list_add(&page->lru, &tmp_list);
-				page++;
-			}
-		}
-		list_splice_init(&tmp_list, &freepages[0]);
-	}
-}
-
 static unsigned long release_free_list(struct list_head *freepages)
 {
 	int order;
@@ -742,11 +708,11 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
  *
  * Non-free pages, invalid PFNs, or zone boundaries within the
  * [start_pfn, end_pfn) range are considered errors, cause function to
- * undo its actions and return zero.
+ * undo its actions and return zero. cc->freepages[] are empty.
  *
  * Otherwise, function returns one-past-the-last PFN of isolated page
  * (which may be greater then end_pfn if end fell in a middle of
- * a free page).
+ * a free page). cc->freepages[] contain free pages isolated.
  */
 unsigned long
 isolate_freepages_range(struct compact_control *cc,
@@ -754,10 +720,9 @@ isolate_freepages_range(struct compact_control *cc,
 {
 	unsigned long isolated, pfn, block_start_pfn, block_end_pfn;
 	int order;
-	struct list_head tmp_freepages[NR_PAGE_ORDERS];
 
 	for (order = 0; order < NR_PAGE_ORDERS; order++)
-		INIT_LIST_HEAD(&tmp_freepages[order]);
+		INIT_LIST_HEAD(&cc->freepages[order]);
 
 	pfn = start_pfn;
 	block_start_pfn = pageblock_start_pfn(pfn);
@@ -788,7 +753,7 @@ isolate_freepages_range(struct compact_control *cc,
 			break;
 
 		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, tmp_freepages, 0, true);
+					block_end_pfn, cc->freepages, 0, true);
 
 		/*
 		 * In strict mode, isolate_freepages_block() returns 0 if
@@ -807,13 +772,10 @@ isolate_freepages_range(struct compact_control *cc,
 
 	if (pfn < end_pfn) {
 		/* Loop terminated early, cleanup. */
-		release_free_list(tmp_freepages);
+		release_free_list(cc->freepages);
 		return 0;
 	}
 
-	/* __isolate_free_page() does not map the pages */
-	split_map_pages(tmp_freepages);
-
 	/* We don't use freelists for anything. */
 	return pfn;
 }
diff --git a/mm/internal.h b/mm/internal.h
index acda347620c6..03e795ce755f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -679,6 +679,15 @@ extern void prep_compound_page(struct page *page, unsigned int order);
 
 extern void post_alloc_hook(struct page *page, unsigned int order,
 					gfp_t gfp_flags);
+
+static inline struct page *post_alloc_hook_noprof(struct page *page, unsigned int order,
+						  gfp_t gfp_flags)
+{
+	post_alloc_hook(page, order, __GFP_MOVABLE);
+	return page;
+}
+#define mark_allocated(...)	alloc_hooks(post_alloc_hook_noprof(__VA_ARGS__))
+
 extern bool free_pages_prepare(struct page *page, unsigned int order);
 
 extern int user_min_free_kbytes;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 84a7154fde93..6c801404a108 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1196,16 +1196,36 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
+/* Split a multi-block free page into its individual pageblocks */
+static void split_large_buddy(struct zone *zone, struct page *page,
+			      unsigned long pfn, int order, fpi_t fpi)
+{
+	unsigned long end = pfn + (1 << order);
+
+	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn, 1 << order));
+	/* Caller removed page from freelist, buddy info cleared! */
+	VM_WARN_ON_ONCE(PageBuddy(page));
+
+	if (order > pageblock_order)
+		order = pageblock_order;
+
+	while (pfn != end) {
+		int mt = get_pfnblock_migratetype(page, pfn);
+
+		__free_one_page(page, pfn, zone, order, mt, fpi);
+		pfn += 1 << order;
+		page = pfn_to_page(pfn);
+	}
+}
+
 static void free_one_page(struct zone *zone, struct page *page,
 			  unsigned long pfn, unsigned int order,
 			  fpi_t fpi_flags)
 {
 	unsigned long flags;
-	int migratetype;
 
 	spin_lock_irqsave(&zone->lock, flags);
-	migratetype = get_pfnblock_migratetype(page, pfn);
-	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
+	split_large_buddy(zone, page, pfn, order, fpi_flags);
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
@@ -1697,27 +1717,6 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
 	return start_pfn;
 }
 
-/* Split a multi-block free page into its individual pageblocks */
-static void split_large_buddy(struct zone *zone, struct page *page,
-			      unsigned long pfn, int order)
-{
-	unsigned long end_pfn = pfn + (1 << order);
-
-	VM_WARN_ON_ONCE(order <= pageblock_order);
-	VM_WARN_ON_ONCE(pfn & (pageblock_nr_pages - 1));
-
-	/* Caller removed page from freelist, buddy info cleared! */
-	VM_WARN_ON_ONCE(PageBuddy(page));
-
-	while (pfn != end_pfn) {
-		int mt = get_pfnblock_migratetype(page, pfn);
-
-		__free_one_page(page, pfn, zone, pageblock_order, mt, FPI_NONE);
-		pfn += pageblock_nr_pages;
-		page = pfn_to_page(pfn);
-	}
-}
-
 /**
  * move_freepages_block_isolate - move free pages in block for page isolation
  * @zone: the zone
@@ -1758,7 +1757,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 		del_page_from_free_list(buddy, zone, order,
 					get_pfnblock_migratetype(buddy, pfn));
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, buddy, pfn, order);
+		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
 		return true;
 	}
 
@@ -1769,7 +1768,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 		del_page_from_free_list(page, zone, order,
 					get_pfnblock_migratetype(page, pfn));
 		set_pageblock_migratetype(page, migratetype);
-		split_large_buddy(zone, page, pfn, order);
+		split_large_buddy(zone, page, pfn, order, FPI_NONE);
 		return true;
 	}
 move:
@@ -6482,6 +6481,31 @@ int __alloc_contig_migrate_range(struct compact_control *cc,
 	return (ret < 0) ? ret : 0;
 }
 
+static void split_free_pages(struct list_head *list)
+{
+	int order;
+
+	for (order = 0; order < NR_PAGE_ORDERS; order++) {
+		struct page *page, *next;
+		int nr_pages = 1 << order;
+
+		list_for_each_entry_safe(page, next, &list[order], lru) {
+			int i;
+
+			mark_allocated(page, order, __GFP_MOVABLE);
+			if (!order)
+				continue;
+
+			split_page(page, order);
+
+			/* add all subpages to the order-0 head, in sequence */
+			list_del(&page->lru);
+			for (i = 0; i < nr_pages; i++)
+				list_add_tail(&page[i].lru, &list[0]);
+		}
+	}
+}
+
 /**
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
@@ -6594,12 +6618,25 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 		goto done;
 	}
 
-	/* Free head and tail (if any) */
-	if (start != outer_start)
-		free_contig_range(outer_start, start - outer_start);
-	if (end != outer_end)
-		free_contig_range(end, outer_end - end);
+	if (!(gfp_mask & __GFP_COMP)) {
+		split_free_pages(cc.freepages);
+
+		/* Free head and tail (if any) */
+		if (start != outer_start)
+			free_contig_range(outer_start, start - outer_start);
+		if (end != outer_end)
+			free_contig_range(end, outer_end - end);
+	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
+		struct page *head = pfn_to_page(start);
+		int order = ilog2(end - start);
+
+		check_new_pages(head, order);
+		prep_new_page(head, order, gfp_mask, 0);
+	} else {
+		ret = -EINVAL;
+		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
+		     start, end, outer_start, outer_end);
+	}
 done:
 	undo_isolate_page_range(start, end, migratetype);
 	return ret;
@@ -6708,6 +6745,18 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
 void free_contig_range(unsigned long pfn, unsigned long nr_pages)
 {
 	unsigned long count = 0;
+	struct folio *folio = pfn_folio(pfn);
+
+	if (folio_test_large(folio)) {
+		int expected = folio_nr_pages(folio);
+
+		if (nr_pages == expected)
+			folio_put(folio);
+		else
+			WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
+			     pfn, nr_pages, expected);
+		return;
+	}
 
 	for (; nr_pages--; pfn++) {
 		struct page *page = pfn_to_page(pfn);