From patchwork Tue Dec 10 02:39:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13900695
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org,
 peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de,
 rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org,
 shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org,
 tglx@linutronix.de, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v2 2/6] mm, bpf: Introduce free_pages_nolock()
Date: Mon, 9 Dec 2024 18:39:32 -0800
Message-Id: <20241210023936.46871-3-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20241210023936.46871-1-alexei.starovoitov@gmail.com>
References: <20241210023936.46871-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce free_pages_nolock() that can free a page without taking locks.
It relies on trylock only and can be called from any context.

Signed-off-by: Alexei Starovoitov
---
 include/linux/gfp.h      |  1 +
 include/linux/mm_types.h |  4 +++
 include/linux/mmzone.h   |  3 ++
 mm/page_alloc.c          | 72 +++++++++++++++++++++++++++++++++++-----
 4 files changed, 72 insertions(+), 8 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f68daa9c997b..dcae733ed006 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -394,6 +394,7 @@ __meminit void *alloc_pages_exact_nid_noprof(int nid, size_t size, gfp_t gfp_mas
 		__get_free_pages((gfp_mask) | GFP_DMA, (order))
 
 extern void __free_pages(struct page *page, unsigned int order);
+extern void free_pages_nolock(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
 
 #define __free_page(page) __free_pages((page), 0)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7361a8f3ab68..52547b3e5fd8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -99,6 +99,10 @@ struct page {
 				/* Or, free page */
 				struct list_head buddy_list;
 				struct list_head pcp_list;
+				struct {
+					struct llist_node pcp_llist;
+					unsigned int order;
+				};
 			};
 			/* See page-flags.h for PAGE_MAPPING_FLAGS */
 			struct address_space *mapping;
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b36124145a16..1a854e0a9e3b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -953,6 +953,9 @@ struct zone {
 	/* Primarily protects free_area */
 	spinlock_t		lock;
 
+	/* Pages to be freed when next trylock succeeds */
+	struct llist_head	trylock_free_pages;
+
 	/* Write-intensive fields used by compaction and vmstats. */
 	CACHELINE_PADDING(_pad2_);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d511e68903c6..a969a62ec0c3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -88,6 +88,9 @@ typedef int __bitwise fpi_t;
  */
 #define FPI_TO_TAIL		((__force fpi_t)BIT(1))
 
+/* Free the page without taking locks. Rely on trylock only. */
+#define FPI_TRYLOCK		((__force fpi_t)BIT(2))
+
 /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
 static DEFINE_MUTEX(pcp_batch_high_lock);
 #define MIN_PERCPU_PAGELIST_HIGH_FRACTION (8)
@@ -1251,9 +1254,33 @@ static void free_one_page(struct zone *zone, struct page *page,
 			  unsigned long pfn, unsigned int order,
 			  fpi_t fpi_flags)
 {
+	struct llist_head *llhead;
 	unsigned long flags;
 
-	spin_lock_irqsave(&zone->lock, flags);
+	if (!spin_trylock_irqsave(&zone->lock, flags)) {
+		if (unlikely(fpi_flags & FPI_TRYLOCK)) {
+			/* Remember the order */
+			page->order = order;
+			/* Add the page to the free list */
+			llist_add(&page->pcp_llist, &zone->trylock_free_pages);
+			return;
+		}
+		spin_lock_irqsave(&zone->lock, flags);
+	}
+
+	/* The lock succeeded. Process deferred pages. */
+	llhead = &zone->trylock_free_pages;
+	if (unlikely(!llist_empty(llhead))) {
+		struct llist_node *llnode;
+		struct page *p, *tmp;
+
+		llnode = llist_del_all(llhead);
+		llist_for_each_entry_safe(p, tmp, llnode, pcp_llist) {
+			unsigned int p_order = p->order;
+			split_large_buddy(zone, p, page_to_pfn(p), p_order, fpi_flags);
+			__count_vm_events(PGFREE, 1 << p_order);
+		}
+	}
 	split_large_buddy(zone, page, pfn, order, fpi_flags);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
@@ -2596,7 +2623,7 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
 
 static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 				   struct page *page, int migratetype,
-				   unsigned int order)
+				   unsigned int order, fpi_t fpi_flags)
 {
 	int high, batch;
 	int pindex;
@@ -2631,6 +2658,14 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	}
 	if (pcp->free_count < (batch << CONFIG_PCP_BATCH_SCALE_MAX))
 		pcp->free_count += (1 << order);
+
+	if (unlikely(fpi_flags & FPI_TRYLOCK)) {
+		/*
+		 * Do not attempt to take a zone lock. Let pcp->count get
+		 * over high mark temporarily.
+		 */
+		return;
+	}
 	high = nr_pcp_high(pcp, zone, batch, free_high);
 	if (pcp->count >= high) {
 		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
@@ -2645,7 +2680,8 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 /*
  * Free a pcp page
  */
-void free_unref_page(struct page *page, unsigned int order)
+static void __free_unref_page(struct page *page, unsigned int order,
+			      fpi_t fpi_flags)
 {
 	unsigned long __maybe_unused UP_flags;
 	struct per_cpu_pages *pcp;
@@ -2654,7 +2690,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	int migratetype;
 
 	if (!pcp_allowed_order(order)) {
-		__free_pages_ok(page, order, FPI_NONE);
+		__free_pages_ok(page, order, fpi_flags);
 		return;
 	}
 
@@ -2671,7 +2707,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	migratetype = get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(page_zone(page), page, pfn, order, FPI_NONE);
+			free_one_page(page_zone(page), page, pfn, order, fpi_flags);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
@@ -2681,14 +2717,19 @@ void free_unref_page(struct page *page, unsigned int order)
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 	if (pcp) {
-		free_unref_page_commit(zone, pcp, page, migratetype, order);
+		free_unref_page_commit(zone, pcp, page, migratetype, order, fpi_flags);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, FPI_NONE);
+		free_one_page(zone, page, pfn, order, fpi_flags);
 	}
 	pcp_trylock_finish(UP_flags);
 }
 
+void free_unref_page(struct page *page, unsigned int order)
+{
+	__free_unref_page(page, order, FPI_NONE);
+}
+
 /*
  * Free a batch of folios
  */
@@ -2777,7 +2818,7 @@ void free_unref_folios(struct folio_batch *folios)
 
 		trace_mm_page_free_batched(&folio->page);
 		free_unref_page_commit(zone, pcp, &folio->page, migratetype,
-				order);
+				order, FPI_NONE);
 	}
 
 	if (pcp) {
@@ -4855,6 +4896,21 @@ void __free_pages(struct page *page, unsigned int order)
 }
 EXPORT_SYMBOL(__free_pages);
 
+/* Can be called while holding raw_spin_lock or from IRQ. RCU must be watching. */
+void free_pages_nolock(struct page *page, unsigned int order)
+{
+	int head = PageHead(page);
+	struct alloc_tag *tag = pgalloc_tag_get(page);
+
+	if (put_page_testzero(page)) {
+		__free_unref_page(page, order, FPI_TRYLOCK);
+	} else if (!head) {
+		pgalloc_tag_sub_pages(tag, (1 << order) - 1);
+		while (order-- > 0)
+			__free_unref_page(page + (1 << order), order, FPI_TRYLOCK);
+	}
+}
+
 void free_pages(unsigned long addr, unsigned int order)
 {
 	if (addr != 0) {