From patchwork Wed Feb 12 17:47:00 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13972259
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v7 1/6] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
Date: Wed, 12 Feb 2025 09:47:00 -0800
Message-Id: <20250212174705.44492-2-alexei.starovoitov@gmail.com>
In-Reply-To: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
References: <20250212174705.44492-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Tracing BPF programs execute from tracepoints and kprobes where the
running context is unknown, but they need to request additional memory.
The prior workarounds were using pre-allocated memory and BPF-specific
freelists to satisfy such allocation requests.

Instead, introduce a gfpflags_allow_spinning() condition that signals to
the allocator that the running context is unknown. Then rely on the
percpu free list of pages to allocate a page.
try_alloc_pages() -> get_page_from_freelist() -> rmqueue() ->
rmqueue_pcplist() will spin_trylock to grab the page from the percpu
free list. If it fails (due to re-entrancy or the list being empty) then
rmqueue_bulk()/rmqueue_buddy() will attempt to spin_trylock zone->lock
and grab the page from there.

spin_trylock() is not safe in PREEMPT_RT when in NMI or in hard IRQ.
Bail out early in such cases.

The support for gfpflags_allow_spinning() mode for free_page and memcg
comes in the next patches.

This is a first step towards supporting BPF requirements in SLUB
and getting rid of bpf_mem_alloc.
That goal was discussed at LSFMM: https://lwn.net/Articles/974138/

Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Acked-by: Sebastian Andrzej Siewior
Reviewed-by: Shakeel Butt
Signed-off-by: Alexei Starovoitov
---
 include/linux/gfp.h |  22 ++++++++++
 lib/stackdepot.c    |   5 ++-
 mm/internal.h       |   1 +
 mm/page_alloc.c     | 104 ++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b0fe9f62d15b..82bfb65b8d15 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -39,6 +39,25 @@ static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
 	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
 }

+static inline bool gfpflags_allow_spinning(const gfp_t gfp_flags)
+{
+	/*
+	 * !__GFP_DIRECT_RECLAIM -> direct reclaim is not allowed.
+	 * !__GFP_KSWAPD_RECLAIM -> it's not safe to wake up kswapd.
+	 * All GFP_* flags including GFP_NOWAIT use one or both flags.
+	 * try_alloc_pages() is the only API that doesn't specify either flag.
+	 *
+	 * This is stronger than GFP_NOWAIT or GFP_ATOMIC because
+	 * those are guaranteed to never block on a sleeping lock.
+	 * Here we are enforcing that the allocation doesn't ever spin
+	 * on any locks (i.e. only trylocks). There is no high level
+	 * GFP_$FOO flag for this use in try_alloc_pages() as the
+	 * regular page allocator doesn't fully support this
+	 * allocation mode.
+	 */
+	return !!(gfp_flags & __GFP_RECLAIM);
+}
+
 #ifdef CONFIG_HIGHMEM
 #define OPT_ZONE_HIGHMEM ZONE_HIGHMEM
 #else
@@ -347,6 +366,9 @@ static inline struct page *alloc_page_vma_noprof(gfp_t gfp,
 }
 #define alloc_page_vma(...)	alloc_hooks(alloc_page_vma_noprof(__VA_ARGS__))

+struct page *try_alloc_pages_noprof(int nid, unsigned int order);
+#define try_alloc_pages(...)	alloc_hooks(try_alloc_pages_noprof(__VA_ARGS__))
+
 extern unsigned long get_free_pages_noprof(gfp_t gfp_mask, unsigned int order);
 #define __get_free_pages(...)	alloc_hooks(get_free_pages_noprof(__VA_ARGS__))

diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 245d5b416699..377194969e61 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -591,7 +591,8 @@ depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
 	depot_stack_handle_t handle = 0;
 	struct page *page = NULL;
 	void *prealloc = NULL;
-	bool can_alloc = depot_flags & STACK_DEPOT_FLAG_CAN_ALLOC;
+	bool allow_spin = gfpflags_allow_spinning(alloc_flags);
+	bool can_alloc = (depot_flags & STACK_DEPOT_FLAG_CAN_ALLOC) && allow_spin;
 	unsigned long flags;
 	u32 hash;
@@ -630,7 +631,7 @@ depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
 		prealloc = page_address(page);
 	}

-	if (in_nmi()) {
+	if (in_nmi() || !allow_spin) {
 		/* We can never allocate in NMI context. */
 		WARN_ON_ONCE(can_alloc);
 		/* Best effort; bail if we fail to take the lock. */

diff --git a/mm/internal.h b/mm/internal.h
index 9826f7dce607..6c3c664aa346 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1174,6 +1174,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 #define ALLOC_NOFRAGMENT	  0x0
 #endif
 #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
+#define ALLOC_TRYLOCK		0x400 /* Only use spin_trylock in allocation path */
 #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */

 /* Flags that allow allocations below the min watermark. */

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 01eab25edf89..a82bc67abbdb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2306,7 +2306,11 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 	unsigned long flags;
 	int i;

-	spin_lock_irqsave(&zone->lock, flags);
+	if (!spin_trylock_irqsave(&zone->lock, flags)) {
+		if (unlikely(alloc_flags & ALLOC_TRYLOCK))
+			return 0;
+		spin_lock_irqsave(&zone->lock, flags);
+	}
 	for (i = 0; i < count; ++i) {
 		struct page *page = __rmqueue(zone, order, migratetype,
 								alloc_flags);
@@ -2906,7 +2910,11 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,

 	do {
 		page = NULL;
-		spin_lock_irqsave(&zone->lock, flags);
+		if (!spin_trylock_irqsave(&zone->lock, flags)) {
+			if (unlikely(alloc_flags & ALLOC_TRYLOCK))
+				return NULL;
+			spin_lock_irqsave(&zone->lock, flags);
+		}
 		if (alloc_flags & ALLOC_HIGHATOMIC)
 			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
 		if (!page) {
@@ -4511,7 +4519,12 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,

 	might_alloc(gfp_mask);

-	if (should_fail_alloc_page(gfp_mask, order))
+	/*
+	 * Don't invoke should_fail logic, since it may call
+	 * get_random_u32() and printk() which need to spin_lock.
+	 */
+	if (!(*alloc_flags & ALLOC_TRYLOCK) &&
+	    should_fail_alloc_page(gfp_mask, order))
 		return false;

 	*alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, *alloc_flags);
@@ -7028,3 +7041,88 @@ static bool __free_unaccepted(struct page *page)
 }

 #endif /* CONFIG_UNACCEPTED_MEMORY */
+
+/**
+ * try_alloc_pages_noprof - opportunistic reentrant allocation from any context
+ * @nid: node to allocate from
+ * @order: allocation order size
+ *
+ * Allocates pages of a given order from the given node. This is safe to
+ * call from any context (from atomic, NMI, and also reentrant
+ * allocator -> tracepoint -> try_alloc_pages_noprof).
+ * Allocation is best effort and expected to fail easily, so nobody should
+ * rely on the success. Failures are not reported via warn_alloc().
+ * See always-fail conditions below.
+ *
+ * Return: allocated page or NULL on failure.
+ */
+struct page *try_alloc_pages_noprof(int nid, unsigned int order)
+{
+	/*
+	 * Do not specify __GFP_DIRECT_RECLAIM, since direct reclaim is not
+	 * allowed.
+	 * Do not specify __GFP_KSWAPD_RECLAIM either, since wake up of kswapd
+	 * is not safe in arbitrary context.
+	 *
+	 * These two are the conditions for gfpflags_allow_spinning() being true.
+	 *
+	 * Specify __GFP_NOWARN since failing try_alloc_pages() is not a reason
+	 * to warn. Also warn would trigger printk() which is unsafe from
+	 * various contexts. We cannot use printk_deferred_enter() to mitigate,
+	 * since the running context is unknown.
+	 *
+	 * Specify __GFP_ZERO to make sure that call to kmsan_alloc_page() below
+	 * is safe in any context. Also zeroing the page is mandatory for
+	 * BPF use cases.
+	 *
+	 * Though __GFP_NOMEMALLOC is not checked in the code path below,
+	 * specify it here to highlight that try_alloc_pages()
+	 * doesn't want to deplete reserves.
+	 */
+	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC;
+	unsigned int alloc_flags = ALLOC_TRYLOCK;
+	struct alloc_context ac = { };
+	struct page *page;
+
+	/*
+	 * In PREEMPT_RT spin_trylock() will call raw_spin_lock() which is
+	 * unsafe in NMI. If spin_trylock() is called from hard IRQ the current
+	 * task may be waiting for one rt_spin_lock, but rt_spin_trylock() will
+	 * mark the task as the owner of another rt_spin_lock which will
+	 * confuse PI logic, so return immediately if called from hard IRQ or
+	 * NMI.
+	 *
+	 * Note, irqs_disabled() case is ok. This function can be called
+	 * from raw_spin_lock_irqsave region.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
+		return NULL;
+	if (!pcp_allowed_order(order))
+		return NULL;
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
+	if (has_unaccepted_memory())
+		return NULL;
+#endif
+	/* Bailout, since _deferred_grow_zone() needs to take a lock */
+	if (deferred_pages_enabled())
+		return NULL;
+
+	if (nid == NUMA_NO_NODE)
+		nid = numa_node_id();
+
+	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
+			    &alloc_gfp, &alloc_flags);
+
+	/*
+	 * Best effort allocation from percpu free list.
+	 * If it's empty attempt to spin_trylock zone->lock.
+	 */
+	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
+
+	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
+
+	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
+	kmsan_alloc_page(page, order, alloc_gfp);
+	return page;
+}
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v7 2/6] mm, bpf: Introduce free_pages_nolock()
Date: Wed, 12 Feb 2025 09:47:01 -0800
Message-Id: <20250212174705.44492-3-alexei.starovoitov@gmail.com>
In-Reply-To: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
References: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
From: Alexei Starovoitov

Introduce free_pages_nolock() that can free pages without taking locks.
It relies on trylock and can be called from any context.
Since spin_trylock() cannot be used in PREEMPT_RT from hard IRQ or NMI it uses lockless link list to stash the pages which will be freed by subsequent free_pages() from good context. Do not use llist unconditionally. BPF maps continuously allocate/free, so we cannot unconditionally delay the freeing to llist. When the memory becomes free make it available to the kernel and BPF users right away if possible, and fallback to llist as the last resort. Acked-by: Vlastimil Babka Acked-by: Sebastian Andrzej Siewior Reviewed-by: Shakeel Butt Signed-off-by: Alexei Starovoitov --- include/linux/gfp.h | 1 + include/linux/mm_types.h | 4 ++ include/linux/mmzone.h | 3 ++ lib/stackdepot.c | 5 ++- mm/page_alloc.c | 90 +++++++++++++++++++++++++++++++++++----- mm/page_owner.c | 8 +++- 6 files changed, 98 insertions(+), 13 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 82bfb65b8d15..a8233d09acfa 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -391,6 +391,7 @@ __meminit void *alloc_pages_exact_nid_noprof(int nid, size_t size, gfp_t gfp_mas __get_free_pages((gfp_mask) | GFP_DMA, (order)) extern void __free_pages(struct page *page, unsigned int order); +extern void free_pages_nolock(struct page *page, unsigned int order); extern void free_pages(unsigned long addr, unsigned int order); #define __free_page(page) __free_pages((page), 0) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 825c04b56403..583bf59e2627 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -99,6 +99,10 @@ struct page { /* Or, free page */ struct list_head buddy_list; struct list_head pcp_list; + struct { + struct llist_node pcp_llist; + unsigned int order; + }; }; /* See page-flags.h for PAGE_MAPPING_FLAGS */ struct address_space *mapping; diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index b36124145a16..1a854e0a9e3b 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -953,6 +953,9 @@ struct zone { /* 
Primarily protects free_area */ spinlock_t lock; + /* Pages to be freed when next trylock succeeds */ + struct llist_head trylock_free_pages; + /* Write-intensive fields used by compaction and vmstats. */ CACHELINE_PADDING(_pad2_); diff --git a/lib/stackdepot.c b/lib/stackdepot.c index 377194969e61..73d7b50924ef 100644 --- a/lib/stackdepot.c +++ b/lib/stackdepot.c @@ -672,7 +672,10 @@ depot_stack_handle_t stack_depot_save_flags(unsigned long *entries, exit: if (prealloc) { /* Stack depot didn't use this memory, free it. */ - free_pages((unsigned long)prealloc, DEPOT_POOL_ORDER); + if (!allow_spin) + free_pages_nolock(virt_to_page(prealloc), DEPOT_POOL_ORDER); + else + free_pages((unsigned long)prealloc, DEPOT_POOL_ORDER); } if (found) handle = found->handle.handle; diff --git a/mm/page_alloc.c b/mm/page_alloc.c index a82bc67abbdb..fa750c46e0fc 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -88,6 +88,9 @@ typedef int __bitwise fpi_t; */ #define FPI_TO_TAIL ((__force fpi_t)BIT(1)) +/* Free the page without taking locks. Rely on trylock only. 
*/ +#define FPI_TRYLOCK ((__force fpi_t)BIT(2)) + /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */ static DEFINE_MUTEX(pcp_batch_high_lock); #define MIN_PERCPU_PAGELIST_HIGH_FRACTION (8) @@ -1249,13 +1252,44 @@ static void split_large_buddy(struct zone *zone, struct page *page, } while (1); } +static void add_page_to_zone_llist(struct zone *zone, struct page *page, + unsigned int order) +{ + /* Remember the order */ + page->order = order; + /* Add the page to the free list */ + llist_add(&page->pcp_llist, &zone->trylock_free_pages); +} + static void free_one_page(struct zone *zone, struct page *page, unsigned long pfn, unsigned int order, fpi_t fpi_flags) { + struct llist_head *llhead; unsigned long flags; - spin_lock_irqsave(&zone->lock, flags); + if (!spin_trylock_irqsave(&zone->lock, flags)) { + if (unlikely(fpi_flags & FPI_TRYLOCK)) { + add_page_to_zone_llist(zone, page, order); + return; + } + spin_lock_irqsave(&zone->lock, flags); + } + + /* The lock succeeded. Process deferred pages. 
*/ + llhead = &zone->trylock_free_pages; + if (unlikely(!llist_empty(llhead) && !(fpi_flags & FPI_TRYLOCK))) { + struct llist_node *llnode; + struct page *p, *tmp; + + llnode = llist_del_all(llhead); + llist_for_each_entry_safe(p, tmp, llnode, pcp_llist) { + unsigned int p_order = p->order; + + split_large_buddy(zone, p, page_to_pfn(p), p_order, fpi_flags); + __count_vm_events(PGFREE, 1 << p_order); + } + } split_large_buddy(zone, page, pfn, order, fpi_flags); spin_unlock_irqrestore(&zone->lock, flags); @@ -2598,7 +2632,7 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone, static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp, struct page *page, int migratetype, - unsigned int order) + unsigned int order, fpi_t fpi_flags) { int high, batch; int pindex; @@ -2633,6 +2667,14 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp, } if (pcp->free_count < (batch << CONFIG_PCP_BATCH_SCALE_MAX)) pcp->free_count += (1 << order); + + if (unlikely(fpi_flags & FPI_TRYLOCK)) { + /* + * Do not attempt to take a zone lock. Let pcp->count get + * over high mark temporarily. 
+ */ + return; + } high = nr_pcp_high(pcp, zone, batch, free_high); if (pcp->count >= high) { free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high), @@ -2647,7 +2689,8 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp, /* * Free a pcp page */ -void free_unref_page(struct page *page, unsigned int order) +static void __free_unref_page(struct page *page, unsigned int order, + fpi_t fpi_flags) { unsigned long __maybe_unused UP_flags; struct per_cpu_pages *pcp; @@ -2656,7 +2699,7 @@ void free_unref_page(struct page *page, unsigned int order) int migratetype; if (!pcp_allowed_order(order)) { - __free_pages_ok(page, order, FPI_NONE); + __free_pages_ok(page, order, fpi_flags); return; } @@ -2673,24 +2716,34 @@ void free_unref_page(struct page *page, unsigned int order) migratetype = get_pfnblock_migratetype(page, pfn); if (unlikely(migratetype >= MIGRATE_PCPTYPES)) { if (unlikely(is_migrate_isolate(migratetype))) { - free_one_page(page_zone(page), page, pfn, order, FPI_NONE); + free_one_page(page_zone(page), page, pfn, order, fpi_flags); return; } migratetype = MIGRATE_MOVABLE; } zone = page_zone(page); + if (unlikely((fpi_flags & FPI_TRYLOCK) && IS_ENABLED(CONFIG_PREEMPT_RT) + && (in_nmi() || in_hardirq()))) { + add_page_to_zone_llist(zone, page, order); + return; + } pcp_trylock_prepare(UP_flags); pcp = pcp_spin_trylock(zone->per_cpu_pageset); if (pcp) { - free_unref_page_commit(zone, pcp, page, migratetype, order); + free_unref_page_commit(zone, pcp, page, migratetype, order, fpi_flags); pcp_spin_unlock(pcp); } else { - free_one_page(zone, page, pfn, order, FPI_NONE); + free_one_page(zone, page, pfn, order, fpi_flags); } pcp_trylock_finish(UP_flags); } +void free_unref_page(struct page *page, unsigned int order) +{ + __free_unref_page(page, order, FPI_NONE); +} + /* * Free a batch of folios */ @@ -2779,7 +2832,7 @@ void free_unref_folios(struct folio_batch *folios) trace_mm_page_free_batched(&folio->page); 
 		free_unref_page_commit(zone, pcp, &folio->page, migratetype,
-				       order);
+				       order, FPI_NONE);
 	}
 
 	if (pcp) {
@@ -4843,22 +4896,37 @@ EXPORT_SYMBOL(get_zeroed_page_noprof);
 * Context: May be called in interrupt context or while holding a normal
 * spinlock, but not in NMI context or while holding a raw spinlock.
 */
-void __free_pages(struct page *page, unsigned int order)
+static void ___free_pages(struct page *page, unsigned int order,
+			  fpi_t fpi_flags)
 {
 	/* get PageHead before we drop reference */
 	int head = PageHead(page);
 	struct alloc_tag *tag = pgalloc_tag_get(page);
 
 	if (put_page_testzero(page))
-		free_unref_page(page, order);
+		__free_unref_page(page, order, fpi_flags);
 	else if (!head) {
 		pgalloc_tag_sub_pages(tag, (1 << order) - 1);
 		while (order-- > 0)
-			free_unref_page(page + (1 << order), order);
+			__free_unref_page(page + (1 << order), order,
+					  fpi_flags);
 	}
 }
+
+void __free_pages(struct page *page, unsigned int order)
+{
+	___free_pages(page, order, FPI_NONE);
+}
 EXPORT_SYMBOL(__free_pages);
 
+/*
+ * Can be called while holding raw_spin_lock or from IRQ and NMI for any
+ * page type (not only those that came from try_alloc_pages)
+ */
+void free_pages_nolock(struct page *page, unsigned int order)
+{
+	___free_pages(page, order, FPI_TRYLOCK);
+}
+
 void free_pages(unsigned long addr, unsigned int order)
 {
 	if (addr != 0) {

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 2d6360eaccbb..90e31d0e3ed7 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -294,7 +294,13 @@ void __reset_page_owner(struct page *page, unsigned short order)
 	page_owner = get_page_owner(page_ext);
 	alloc_handle = page_owner->handle;
 
-	handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
+	/*
+	 * Do not specify GFP_NOWAIT to make gfpflags_allow_spinning() == false
+	 * to prevent issues in stack_depot_save().
+	 * This is similar to try_alloc_pages() gfp flags, but only used
+	 * to signal stack_depot to avoid spin_locks.
+	 */
+	handle = save_stack(__GFP_NOWARN);
 	__update_page_owner_free_handle(page_ext, handle, order, current->pid,
 					current->tgid, free_ts_nsec);
 	page_ext_put(page_ext);

From patchwork Wed Feb 12 17:47:02 2025
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v7 3/6] locking/local_lock: Introduce localtry_lock_t
Date: Wed, 12 Feb 2025 09:47:02 -0800
Message-Id: <20250212174705.44492-4-alexei.starovoitov@gmail.com>
In-Reply-To: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
References: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
From: Sebastian Andrzej Siewior

In !PREEMPT_RT local_lock_irqsave() disables interrupts to protect the
critical section, but it doesn't prevent NMI, so fully reentrant code
cannot use local_lock_irqsave() for exclusive access.

Introduce localtry_lock_t and localtry_trylock_irqsave() that disables
interrupts and sets acquired=1, so localtry_trylock_irqsave() from an
NMI attempting to acquire the same lock will return false.

In PREEMPT_RT local_lock_irqsave() maps to a preemptible spin_lock().
Map localtry_trylock_irqsave() to a preemptible spin_trylock().
When in hard IRQ or NMI return false right away, since spin_trylock()
is not safe due to PI issues.

Note there is no need to use local_inc for the acquired variable, since
it's a percpu variable with strict nesting scopes.
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Alexei Starovoitov
---
 include/linux/local_lock.h          |  59 +++++++++++++
 include/linux/local_lock_internal.h | 123 ++++++++++++++++++++++++++++
 2 files changed, 182 insertions(+)

diff --git a/include/linux/local_lock.h b/include/linux/local_lock.h
index 091dc0b6bdfb..05c254a5d7d3 100644
--- a/include/linux/local_lock.h
+++ b/include/linux/local_lock.h
@@ -51,6 +51,65 @@
 #define local_unlock_irqrestore(lock, flags)			\
	__local_unlock_irqrestore(lock, flags)
 
+/**
+ * localtry_lock_init - Runtime initialize a lock instance
+ */
+#define localtry_lock_init(lock)	__localtry_lock_init(lock)
+
+/**
+ * localtry_lock - Acquire a per CPU local lock
+ * @lock:	The lock variable
+ */
+#define localtry_lock(lock)		__localtry_lock(lock)
+
+/**
+ * localtry_lock_irq - Acquire a per CPU local lock and disable interrupts
+ * @lock:	The lock variable
+ */
+#define localtry_lock_irq(lock)		__localtry_lock_irq(lock)
+
+/**
+ * localtry_lock_irqsave - Acquire a per CPU local lock, save and disable
+ *			   interrupts
+ * @lock:	The lock variable
+ * @flags:	Storage for interrupt flags
+ */
+#define localtry_lock_irqsave(lock, flags)			\
+	__localtry_lock_irqsave(lock, flags)
+
+/**
+ * localtry_trylock_irqsave - Try to acquire a per CPU local lock, save and
+ *			      disable interrupts if acquired
+ * @lock:	The lock variable
+ * @flags:	Storage for interrupt flags
+ *
+ * The function can be used in any context such as NMI or HARDIRQ. Due to
+ * locking constraints it will _always_ fail to acquire the lock in NMI or
+ * HARDIRQ context on PREEMPT_RT.
+ */
+#define localtry_trylock_irqsave(lock, flags)			\
+	__localtry_trylock_irqsave(lock, flags)
+
+/**
+ * localtry_unlock - Release a per CPU local lock
+ * @lock:	The lock variable
+ */
+#define localtry_unlock(lock)		__localtry_unlock(lock)
+
+/**
+ * localtry_unlock_irq - Release a per CPU local lock and enable interrupts
+ * @lock:	The lock variable
+ */
+#define localtry_unlock_irq(lock)	__localtry_unlock_irq(lock)
+
+/**
+ * localtry_unlock_irqrestore - Release a per CPU local lock and restore
+ *				interrupt flags
+ * @lock:	The lock variable
+ * @flags:	Interrupt flags to restore
+ */
+#define localtry_unlock_irqrestore(lock, flags)			\
+	__localtry_unlock_irqrestore(lock, flags)
+
 DEFINE_GUARD(local_lock, local_lock_t __percpu*,
	     local_lock(_T),
	     local_unlock(_T))

diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index 8dd71fbbb6d2..c1369b300777 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -15,6 +15,11 @@ typedef struct {
 #endif
 } local_lock_t;
 
+typedef struct {
+	local_lock_t	llock;
+	unsigned int	acquired;
+} localtry_lock_t;
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define LOCAL_LOCK_DEBUG_INIT(lockname)		\
	.dep_map = {					\
@@ -31,6 +36,13 @@ static inline void local_lock_acquire(local_lock_t *l)
	l->owner = current;
 }
 
+static inline void local_trylock_acquire(local_lock_t *l)
+{
+	lock_map_acquire_try(&l->dep_map);
+	DEBUG_LOCKS_WARN_ON(l->owner);
+	l->owner = current;
+}
+
 static inline void local_lock_release(local_lock_t *l)
 {
	DEBUG_LOCKS_WARN_ON(l->owner != current);
@@ -45,11 +57,13 @@ static inline void local_lock_debug_init(local_lock_t *l)
 #else /* CONFIG_DEBUG_LOCK_ALLOC */
 # define LOCAL_LOCK_DEBUG_INIT(lockname)
 static inline void local_lock_acquire(local_lock_t *l) { }
+static inline void local_trylock_acquire(local_lock_t *l) { }
 static inline void local_lock_release(local_lock_t *l) { }
 static inline void local_lock_debug_init(local_lock_t *l) { }
 #endif /* !CONFIG_DEBUG_LOCK_ALLOC */
 
 #define INIT_LOCAL_LOCK(lockname)	{ LOCAL_LOCK_DEBUG_INIT(lockname) }
+#define INIT_LOCALTRY_LOCK(lockname)	{ .llock = { LOCAL_LOCK_DEBUG_INIT(lockname.llock) }}
 
 #define __local_lock_init(lock)					\
 do {								\
@@ -118,6 +132,86 @@ do {								\
 #define __local_unlock_nested_bh(lock)				\
	local_lock_release(this_cpu_ptr(lock))
 
+/* localtry_lock_t variants */
+
+#define __localtry_lock_init(lock)				\
+do {								\
+	__local_lock_init(&(lock)->llock);			\
+	WRITE_ONCE((lock)->acquired, 0);			\
+} while (0)
+
+#define __localtry_lock(lock)					\
+	do {							\
+		localtry_lock_t *lt;				\
+		preempt_disable();				\
+		lt = this_cpu_ptr(lock);			\
+		local_lock_acquire(&lt->llock);			\
+		WRITE_ONCE(lt->acquired, 1);			\
+	} while (0)
+
+#define __localtry_lock_irq(lock)				\
+	do {							\
+		localtry_lock_t *lt;				\
+		local_irq_disable();				\
+		lt = this_cpu_ptr(lock);			\
+		local_lock_acquire(&lt->llock);			\
+		WRITE_ONCE(lt->acquired, 1);			\
+	} while (0)
+
+#define __localtry_lock_irqsave(lock, flags)			\
+	do {							\
+		localtry_lock_t *lt;				\
+		local_irq_save(flags);				\
+		lt = this_cpu_ptr(lock);			\
+		local_lock_acquire(&lt->llock);			\
+		WRITE_ONCE(lt->acquired, 1);			\
+	} while (0)
+
+#define __localtry_trylock_irqsave(lock, flags)			\
+	({							\
+		localtry_lock_t *lt;				\
+		bool _ret;					\
+								\
+		local_irq_save(flags);				\
+		lt = this_cpu_ptr(lock);			\
+		if (!READ_ONCE(lt->acquired)) {			\
+			WRITE_ONCE(lt->acquired, 1);		\
+			local_trylock_acquire(&lt->llock);	\
+			_ret = true;				\
+		} else {					\
+			_ret = false;				\
+			local_irq_restore(flags);		\
+		}						\
+		_ret;						\
+	})
+
+#define __localtry_unlock(lock)					\
+	do {							\
+		localtry_lock_t *lt;				\
+		lt = this_cpu_ptr(lock);			\
+		WRITE_ONCE(lt->acquired, 0);			\
+		local_lock_release(&lt->llock);			\
+		preempt_enable();				\
+	} while (0)
+
+#define __localtry_unlock_irq(lock)				\
+	do {							\
+		localtry_lock_t *lt;				\
+		lt = this_cpu_ptr(lock);			\
+		WRITE_ONCE(lt->acquired, 0);			\
+		local_lock_release(&lt->llock);			\
+		local_irq_enable();				\
+	} while (0)
+
+#define __localtry_unlock_irqrestore(lock, flags)		\
+	do {							\
+		localtry_lock_t *lt;				\
+		lt = this_cpu_ptr(lock);			\
+		WRITE_ONCE(lt->acquired, 0);			\
+		local_lock_release(&lt->llock);			\
+		local_irq_restore(flags);			\
+	} while (0)
+
 #else /* !CONFIG_PREEMPT_RT */
 
 /*
@@ -125,8 +219,10 @@ do {								\
 * critical section while staying preemptible.
 */
 typedef spinlock_t local_lock_t;
+typedef spinlock_t localtry_lock_t;
 
 #define INIT_LOCAL_LOCK(lockname)	__LOCAL_SPIN_LOCK_UNLOCKED((lockname))
+#define INIT_LOCALTRY_LOCK(lockname)	INIT_LOCAL_LOCK(lockname)
 
 #define __local_lock_init(l)					\
 do {								\
@@ -169,4 +265,31 @@ do {								\
	spin_unlock(this_cpu_ptr((lock)));			\
 } while (0)
 
+/* localtry_lock_t variants */
+
+#define __localtry_lock_init(lock)			__local_lock_init(lock)
+#define __localtry_lock(lock)				__local_lock(lock)
+#define __localtry_lock_irq(lock)			__local_lock(lock)
+#define __localtry_lock_irqsave(lock, flags)		__local_lock_irqsave(lock, flags)
+#define __localtry_unlock(lock)				__local_unlock(lock)
+#define __localtry_unlock_irq(lock)			__local_unlock(lock)
+#define __localtry_unlock_irqrestore(lock, flags)	__local_unlock_irqrestore(lock, flags)
+
+#define __localtry_trylock_irqsave(lock, flags)			\
+	({							\
+		int __locked;					\
+								\
+		typecheck(unsigned long, flags);		\
+		flags = 0;					\
+		if (in_nmi() | in_hardirq()) {			\
+			__locked = 0;				\
+		} else {					\
+			migrate_disable();			\
+			__locked = spin_trylock(this_cpu_ptr((lock)));	\
+			if (!__locked)				\
+				migrate_enable();		\
+		}						\
+		__locked;					\
+	})
+
 #endif /* CONFIG_PREEMPT_RT */

From patchwork Wed Feb 12 17:47:03 2025
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v7 4/6] memcg: Use trylock to access memcg stock_lock.
Date: Wed, 12 Feb 2025 09:47:03 -0800
Message-Id: <20250212174705.44492-5-alexei.starovoitov@gmail.com>
In-Reply-To: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
References: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
From: Alexei Starovoitov

Teach memcg to operate under trylock conditions when spinning locks
cannot be used.

localtry_trylock might fail and this would lead to charge cache bypass
if the calling context doesn't allow spinning (gfpflags_allow_spinning).
In those cases charge the memcg counter directly and fail early if that
is not possible. This might cause a premature charge failure, but it
allows opportunistic charging that is safe from the try_alloc_pages path.
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Acked-by: Shakeel Butt
Signed-off-by: Alexei Starovoitov
---
 mm/memcontrol.c | 52 ++++++++++++++++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..f3af615f727c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1722,7 +1722,7 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 }
 
 struct memcg_stock_pcp {
-	local_lock_t stock_lock;
+	localtry_lock_t stock_lock;
	struct mem_cgroup *cached; /* this never be root cgroup */
	unsigned int nr_pages;
@@ -1737,7 +1737,7 @@ struct memcg_stock_pcp {
 #define FLUSHING_CACHED_CHARGE	0
 };
 static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
-	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
+	.stock_lock = INIT_LOCALTRY_LOCK(stock_lock),
 };
 static DEFINE_MUTEX(percpu_charge_mutex);
@@ -1756,7 +1756,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 *
 * returns true if successful, false otherwise.
 */
-static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
+static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
+			  gfp_t gfp_mask)
 {
	struct memcg_stock_pcp *stock;
	unsigned int stock_pages;
@@ -1766,7 +1767,11 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
	if (nr_pages > MEMCG_CHARGE_BATCH)
		return ret;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		if (!gfpflags_allow_spinning(gfp_mask))
+			return ret;
+		localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
+	}
 
	stock = this_cpu_ptr(&memcg_stock);
	stock_pages = READ_ONCE(stock->nr_pages);
@@ -1775,7 +1780,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
		ret = true;
	}
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
	return ret;
 }
@@ -1814,14 +1819,14 @@ static void drain_local_stock(struct work_struct *dummy)
	 * drain_stock races is that we always operate on local CPU stock
	 * here with IRQ disabled
	 */
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
 
	stock = this_cpu_ptr(&memcg_stock);
	old = drain_obj_stock(stock);
	drain_stock(stock);
	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	obj_cgroup_put(old);
 }
@@ -1851,9 +1856,20 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
	unsigned long flags;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		/*
+		 * In case of unlikely failure to lock percpu stock_lock
+		 * uncharge memcg directly.
+		 */
+		if (mem_cgroup_is_root(memcg))
+			return;
+		page_counter_uncharge(&memcg->memory, nr_pages);
+		if (do_memsw_account())
+			page_counter_uncharge(&memcg->memsw, nr_pages);
+		return;
+	}
	__refill_stock(memcg, nr_pages);
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 }
 
 /*
@@ -2196,9 +2212,13 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
	unsigned long pflags;
 
 retry:
-	if (consume_stock(memcg, nr_pages))
+	if (consume_stock(memcg, nr_pages, gfp_mask))
		return 0;
 
+	if (!gfpflags_allow_spinning(gfp_mask))
+		/* Avoid the refill and flush of the older stock */
+		batch = nr_pages;
+
	if (!do_memsw_account() ||
	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
		if (page_counter_try_charge(&memcg->memory, batch, &counter))
@@ -2709,7 +2729,7 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
	unsigned long flags;
	int *bytes;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
	stock = this_cpu_ptr(&memcg_stock);
 
	/*
@@ -2762,7 +2782,7 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
	if (nr)
		__mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	obj_cgroup_put(old);
 }
@@ -2772,7 +2792,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
	unsigned long flags;
	bool ret = false;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
 
	stock = this_cpu_ptr(&memcg_stock);
	if (objcg == READ_ONCE(stock->cached_objcg) && stock->nr_bytes >= nr_bytes) {
@@ -2780,7 +2800,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
		ret = true;
	}
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
	return ret;
 }
@@ -2872,7 +2892,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
	unsigned long flags;
	unsigned int nr_pages = 0;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
 
	stock = this_cpu_ptr(&memcg_stock);
	if (READ_ONCE(stock->cached_objcg) != objcg) { /* reset if necessary */
@@ -2890,7 +2910,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
		stock->nr_bytes &= (PAGE_SIZE - 1);
	}
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	obj_cgroup_put(old);
 
	if (nr_pages)

From patchwork Wed Feb 12 17:47:04 2025
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v7 5/6] mm, bpf: Use memcg in try_alloc_pages().
Date: Wed, 12 Feb 2025 09:47:04 -0800
Message-Id: <20250212174705.44492-6-alexei.starovoitov@gmail.com>
In-Reply-To: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
References: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
From: Alexei Starovoitov

Unconditionally use __GFP_ACCOUNT in try_alloc_pages(). The caller is
responsible for setting up the memcg correctly. All BPF memory accounting
is memcg based.
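The charge-or-free behavior this patch adds can be sketched as a plain userspace C model. The flag values, the `struct page` stand-in, and the `charge_page()` helper below are illustrative placeholders, not the kernel's real definitions; the point is only the shape of the logic — if the memcg charge fails after a successful allocation, the page is released and NULL is returned to the caller:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative gfp-style flag bits (not the kernel's real values). */
#define GFP_NOWARN     (1u << 0)
#define GFP_ZERO       (1u << 1)
#define GFP_NOMEMALLOC (1u << 2)
#define GFP_ACCOUNT    (1u << 3)

struct page { int order; };

/* Stand-in for __memcg_kmem_charge_page(): charging is attempted only
 * when GFP_ACCOUNT is set, and succeeds only if the cgroup has room. */
static int charge_page(struct page *pg, unsigned int gfp, bool cgroup_has_room)
{
	(void)pg;
	if (!(gfp & GFP_ACCOUNT))
		return 0;                      /* nothing to charge */
	return cgroup_has_room ? 0 : -1;
}

/* Model of the new tail of try_alloc_pages_noprof(): a failed memcg
 * charge frees the freshly allocated page and returns NULL. */
static struct page *try_alloc_page_model(struct page *pg, bool cgroup_has_room)
{
	unsigned int alloc_gfp = GFP_NOWARN | GFP_ZERO | GFP_NOMEMALLOC
				 | GFP_ACCOUNT;

	if (pg && charge_page(pg, alloc_gfp, cgroup_has_room) != 0)
		pg = NULL;     /* kernel calls free_pages_nolock() here */
	return pg;
}
```

The sketch mirrors why __GFP_ACCOUNT can be set unconditionally: the charge is simply part of the trylock allocation path, and failure is handled in place rather than propagated as a partial success.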
Acked-by: Vlastimil Babka
Acked-by: Shakeel Butt
Signed-off-by: Alexei Starovoitov
---
 mm/page_alloc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fa750c46e0fc..931cedcda788 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7146,7 +7146,8 @@ struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 	 * specify it here to highlight that try_alloc_pages()
 	 * doesn't want to deplete reserves.
 	 */
-	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC;
+	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_NOMEMALLOC
+			| __GFP_ACCOUNT;
 	unsigned int alloc_flags = ALLOC_TRYLOCK;
 	struct alloc_context ac = { };
 	struct page *page;
@@ -7190,6 +7191,11 @@ struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 
 	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
 
+	if (memcg_kmem_online() && page &&
+	    unlikely(__memcg_kmem_charge_page(page, alloc_gfp, order) != 0)) {
+		free_pages_nolock(page, order);
+		page = NULL;
+	}
 	trace_mm_page_alloc(page, order, alloc_gfp, ac.migratetype);
 	kmsan_alloc_page(page, order, alloc_gfp);
 	return page;

From patchwork Wed Feb 12 17:47:05 2025
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13972264
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v7 6/6] bpf: Use try_alloc_pages() to allocate pages for bpf needs.
Date: Wed, 12 Feb 2025 09:47:05 -0800
Message-Id: <20250212174705.44492-7-alexei.starovoitov@gmail.com>
In-Reply-To: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
References: <20250212174705.44492-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
From: Alexei Starovoitov

Use try_alloc_pages() and free_pages_nolock() for BPF needs when the
context doesn't allow using normal alloc_pages(). This is a prerequisite
for further work.

Signed-off-by: Alexei Starovoitov
---
 include/linux/bpf.h  |  2 +-
 kernel/bpf/arena.c   |  5 ++---
 kernel/bpf/syscall.c | 23 ++++++++++++++++++++---
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f3f50e29d639..e1838a341817 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2348,7 +2348,7 @@ int generic_map_delete_batch(struct bpf_map *map,
 struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
 struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
 
-int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
+int bpf_map_alloc_pages(const struct bpf_map *map, int nid,
			unsigned long nr_pages, struct page **page_array);
 
 #ifdef CONFIG_MEMCG
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c
index 4b22a651b5d5..642399a5fd9f 100644
--- a/kernel/bpf/arena.c
+++ b/kernel/bpf/arena.c
@@ -287,7 +287,7 @@ static vm_fault_t arena_vm_fault(struct vm_fault *vmf)
 		return VM_FAULT_SIGSEGV;
 
 	/* Account into memcg of the process that created bpf_arena */
-	ret = bpf_map_alloc_pages(map, GFP_KERNEL | __GFP_ZERO, NUMA_NO_NODE, 1, &page);
+	ret = bpf_map_alloc_pages(map, NUMA_NO_NODE, 1, &page);
 	if (ret) {
 		range_tree_set(&arena->rt, vmf->pgoff, 1);
 		return VM_FAULT_SIGSEGV;
@@ -465,8 +465,7 @@ static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt
 	if (ret)
 		goto out_free_pages;
 
-	ret = bpf_map_alloc_pages(&arena->map, GFP_KERNEL | __GFP_ZERO,
-				  node_id, page_cnt, pages);
+	ret = bpf_map_alloc_pages(&arena->map, node_id, page_cnt, pages);
 	if (ret)
 		goto out;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0daf098e3207..55588dbd2fce 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -569,7 +569,24 @@ static void bpf_map_release_memcg(struct bpf_map *map)
 }
 #endif
 
-int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
+static bool can_alloc_pages(void)
+{
+	return preempt_count() == 0 && !irqs_disabled() &&
+		!IS_ENABLED(CONFIG_PREEMPT_RT);
+}
+
+static struct page *__bpf_alloc_page(int nid)
+{
+	if (!can_alloc_pages())
+		return try_alloc_pages(nid, 0);
+
+	return alloc_pages_node(nid,
+				GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT
+				| __GFP_NOWARN,
+				0);
+}
+
+int bpf_map_alloc_pages(const struct bpf_map *map, int nid,
			unsigned long nr_pages, struct page **pages)
 {
 	unsigned long i, j;
@@ -582,14 +599,14 @@ int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
 	old_memcg = set_active_memcg(memcg);
 #endif
 	for (i = 0; i < nr_pages; i++) {
-		pg = alloc_pages_node(nid, gfp | __GFP_ACCOUNT, 0);
+		pg = __bpf_alloc_page(nid);
 
 		if (pg) {
 			pages[i] = pg;
 			continue;
 		}
 		for (j = 0; j < i; j++)
-			__free_page(pages[j]);
+			free_pages_nolock(pages[j], 0);
 		ret = -ENOMEM;
 		break;
 	}
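The allocator choice made by __bpf_alloc_page() in the diff above can be sketched as a small userspace C model. The parameters and enum below are illustrative stand-ins for kernel state (preempt_count(), irqs_disabled(), CONFIG_PREEMPT_RT), not real kernel APIs; the logic mirrors the patch: only a fully preemptible, IRQs-enabled, non-RT context may take the normal GFP_KERNEL path, everything else falls back to the trylock-based try_alloc_pages():

```c
#include <stdbool.h>

/* Which allocation path __bpf_alloc_page() would take (illustrative). */
enum alloc_path {
	PATH_NORMAL,   /* alloc_pages_node(GFP_KERNEL | __GFP_ZERO | ...) */
	PATH_TRYLOCK   /* try_alloc_pages(nid, 0) */
};

/* Model of can_alloc_pages(): sleeping/spinning allocation is safe only
 * when preemption is enabled, IRQs are on, and the kernel is not RT. */
static bool can_alloc_pages_model(int preempt_count, bool irqs_disabled,
				  bool preempt_rt)
{
	return preempt_count == 0 && !irqs_disabled && !preempt_rt;
}

/* Model of the dispatch inside __bpf_alloc_page(). */
static enum alloc_path pick_alloc_path(int preempt_count, bool irqs_disabled,
				       bool preempt_rt)
{
	if (!can_alloc_pages_model(preempt_count, irqs_disabled, preempt_rt))
		return PATH_TRYLOCK;
	return PATH_NORMAL;
}
```

Note the conservative direction of the check: on PREEMPT_RT the trylock path is chosen unconditionally, since preempt_count() alone cannot prove the context is safe for a regular allocation there.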