From patchwork Wed Dec 18 03:07:14 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13912967
X-Patchwork-Delegate: bpf@iogearbox.net
From: alexei.starovoitov@gmail.com
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v3 1/6] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
Date: Tue, 17 Dec 2024 19:07:14 -0800
Message-ID: <20241218030720.1602449-2-alexei.starovoitov@gmail.com>
In-Reply-To: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Tracing BPF programs execute from tracepoints and kprobes where the running
context is unknown, yet they need to request additional memory. Prior
workarounds used pre-allocated memory and BPF-specific freelists to satisfy
such allocation requests. Instead, introduce an internal __GFP_TRYLOCK flag
that makes the page allocator accessible from any context. It relies on the
per-CPU free list of pages from which rmqueue_pcplist() should be able to pop
a page. If that fails (due to IRQ re-entrancy or the list being empty), then
try_alloc_pages() attempts to spin_trylock zone->lock and refill the per-CPU
freelist as normal. A BPF program may execute with IRQs disabled, and
zone->lock is a sleeping lock in RT, so trylock is the only option.

In theory we could introduce a per-CPU re-entrance counter and increment it
every time spin_lock_irqsave(&zone->lock, flags) is used, but we cannot rely
on it. Even if this CPU is not in the page_alloc path, spin_lock_irqsave() is
not safe, since the BPF prog might be called from a tracepoint where
preemption is disabled. So trylock only.

Note that free_pages() and memcg are not taught about __GFP_TRYLOCK yet; that
support comes in the following patches.

This is a first step towards supporting BPF requirements in SLUB and getting
rid of bpf_mem_alloc. That goal was discussed at LSFMM:
https://lwn.net/Articles/974138/
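As an illustration of the intended calling convention (a sketch, not part of
the patch; grab_scratch_page() is a made-up helper): try_alloc_pages() may
fail whenever a trylock fails, so every caller needs a fallback path:

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/numa.h>

/* Hypothetical caller running in unknown (e.g. tracing) context. */
static void *grab_scratch_page(void)
{
	/* Order-0, any node; returns NULL instead of spinning or sleeping. */
	struct page *page = try_alloc_pages(NUMA_NO_NODE, 0);

	if (!page)
		return NULL;	/* failure is expected here and must be handled */
	return page_address(page);
}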
Signed-off-by: Alexei Starovoitov
---
 include/linux/gfp.h       |  3 ++
 include/linux/gfp_types.h |  1 +
 mm/internal.h             |  2 ++
 mm/page_alloc.c           | 69 ++++++++++++++++++++++++++++++++++++---
 4 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b0fe9f62d15b..65b8df1db26a 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -347,6 +347,9 @@ static inline struct page *alloc_page_vma_noprof(gfp_t gfp,
 }
 #define alloc_page_vma(...)			alloc_hooks(alloc_page_vma_noprof(__VA_ARGS__))
 
+struct page *try_alloc_pages_noprof(int nid, unsigned int order);
+#define try_alloc_pages(...)			alloc_hooks(try_alloc_pages_noprof(__VA_ARGS__))
+
 extern unsigned long get_free_pages_noprof(gfp_t gfp_mask, unsigned int order);
 #define __get_free_pages(...)			alloc_hooks(get_free_pages_noprof(__VA_ARGS__))
 
diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
index 65db9349f905..65b148ec86eb 100644
--- a/include/linux/gfp_types.h
+++ b/include/linux/gfp_types.h
@@ -48,6 +48,7 @@ enum {
 	___GFP_THISNODE_BIT,
 	___GFP_ACCOUNT_BIT,
 	___GFP_ZEROTAGS_BIT,
+	___GFP_TRYLOCK_BIT,
 #ifdef CONFIG_KASAN_HW_TAGS
 	___GFP_SKIP_ZERO_BIT,
 	___GFP_SKIP_KASAN_BIT,
diff --git a/mm/internal.h b/mm/internal.h
index cb8d8e8e3ffa..122fce7e1a9e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1175,6 +1175,8 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 #endif
 #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
 #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
+#define __GFP_TRYLOCK		((__force gfp_t)BIT(___GFP_TRYLOCK_BIT))
+#define ALLOC_TRYLOCK		0x1000000 /* Only use spin_trylock in allocation path */
 
 /* Flags that allow allocations below the min watermark. */
 #define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1cb4b8c8886d..d23545057b6e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2304,7 +2304,11 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 	unsigned long flags;
 	int i;
 
-	spin_lock_irqsave(&zone->lock, flags);
+	if (!spin_trylock_irqsave(&zone->lock, flags)) {
+		if (unlikely(alloc_flags & ALLOC_TRYLOCK))
+			return 0;
+		spin_lock_irqsave(&zone->lock, flags);
+	}
 	for (i = 0; i < count; ++i) {
 		struct page *page = __rmqueue(zone, order, migratetype,
 								alloc_flags);
@@ -2904,7 +2908,11 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 
 	do {
 		page = NULL;
-		spin_lock_irqsave(&zone->lock, flags);
+		if (!spin_trylock_irqsave(&zone->lock, flags)) {
+			if (unlikely(alloc_flags & ALLOC_TRYLOCK))
+				return NULL;
+			spin_lock_irqsave(&zone->lock, flags);
+		}
 		if (alloc_flags & ALLOC_HIGHATOMIC)
 			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
 		if (!page) {
@@ -4001,6 +4009,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 	 */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_MIN_RESERVE);
 	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
+	BUILD_BUG_ON(__GFP_TRYLOCK != (__force gfp_t) ALLOC_TRYLOCK);
 
 	/*
 	 * The caller may dip into page reserves a bit more if the caller
@@ -4009,7 +4018,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 	 * set both ALLOC_NON_BLOCK and ALLOC_MIN_RESERVE(__GFP_HIGH).
 	 */
 	alloc_flags |= (__force int)
-		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
+		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM | __GFP_TRYLOCK));
 
 	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
 		/*
@@ -4509,7 +4518,8 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 
 	might_alloc(gfp_mask);
 
-	if (should_fail_alloc_page(gfp_mask, order))
+	if (!(*alloc_flags & ALLOC_TRYLOCK) &&
+	    should_fail_alloc_page(gfp_mask, order))
 		return false;
 
 	*alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, *alloc_flags);
@@ -7023,3 +7033,54 @@ static bool __free_unaccepted(struct page *page)
 }
 
 #endif /* CONFIG_UNACCEPTED_MEMORY */
+
+struct page *try_alloc_pages_noprof(int nid, unsigned int order)
+{
+	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO |
+			  __GFP_NOMEMALLOC | __GFP_TRYLOCK;
+	unsigned int alloc_flags = ALLOC_TRYLOCK;
+	struct alloc_context ac = { };
+	struct page *page;
+
+	/*
+	 * In RT spin_trylock() may call raw_spin_lock() which is unsafe in NMI.
+	 * If spin_trylock() is called from hard IRQ the current task may be
+	 * waiting for one rt_spin_lock, but rt_spin_trylock() will mark the
+	 * task as the owner of another rt_spin_lock which will confuse PI
+	 * logic, so return immediately if called from hard IRQ or NMI.
+	 *
+	 * Note, irqs_disabled() case is ok. This function can be called
+	 * from raw_spin_lock_irqsave region.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
+		return NULL;
+	if (!pcp_allowed_order(order))
+		return NULL;
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	if (has_unaccepted_memory() && !list_empty(&zone->unaccepted_pages))
+		return NULL;
+#endif
+
+	if (nid == NUMA_NO_NODE)
+		nid = numa_node_id();
+
+	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
+			    &alloc_gfp, &alloc_flags);
+
+	/*
+	 * Best effort allocation from percpu free list.
+	 * If it's empty attempt to spin_trylock zone->lock.
+	 * Do not specify __GFP_KSWAPD_RECLAIM to avoid wakeup_kswapd
+	 * that may need to grab a lock.
+	 * Do not specify __GFP_ACCOUNT to avoid local_lock.
+	 * Do not warn either.
+	 */
+	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
+
+	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
+
+	trace_mm_page_alloc(page, order, alloc_gfp & ~__GFP_TRYLOCK, ac.migratetype);
+	kmsan_alloc_page(page, order, alloc_gfp);
+	return page;
+}

From patchwork Wed Dec 18 03:07:15 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13912968
X-Patchwork-Delegate: bpf@iogearbox.net
From: alexei.starovoitov@gmail.com
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v3 2/6] mm, bpf: Introduce free_pages_nolock()
Date: Tue, 17 Dec 2024 19:07:15 -0800
Message-ID: <20241218030720.1602449-3-alexei.starovoitov@gmail.com>
In-Reply-To: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce free_pages_nolock(), which can free pages without taking locks. It
relies on trylock only and can be called from any context. Since
spin_trylock() cannot be used in RT from hard IRQ or NMI, in those cases the
pages are stashed on a lockless linked list and freed by a subsequent
free_pages() that runs from a safe context.
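The stash-and-drain idea, reduced to a standalone sketch (simplified names,
not the kernel's actual code): a trylock failure pushes the object onto a
lock-free llist, and the next context that does take the lock drains the
backlog:

#include <linux/llist.h>
#include <linux/spinlock.h>

struct deferred { struct llist_node node; };

static LLIST_HEAD(stash);
static DEFINE_SPINLOCK(demo_lock);

static void do_free(struct deferred *d) { /* real freeing, lock held */ }

static void free_or_defer(struct deferred *d, bool trylock_only)
{
	struct llist_node *batch;
	struct deferred *p, *tmp;
	unsigned long flags;

	if (!spin_trylock_irqsave(&demo_lock, flags)) {
		if (trylock_only) {
			/* Cannot spin here; leave it for the next locker. */
			llist_add(&d->node, &stash);
			return;
		}
		spin_lock_irqsave(&demo_lock, flags);
	}
	/* Lock held: drain whatever earlier trylock failures stashed. */
	batch = llist_del_all(&stash);
	llist_for_each_entry_safe(p, tmp, batch, node)
		do_free(p);
	do_free(d);
	spin_unlock_irqrestore(&demo_lock, flags);
}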
Signed-off-by: Alexei Starovoitov
---
 include/linux/gfp.h      |  1 +
 include/linux/mm_types.h |  4 ++
 include/linux/mmzone.h   |  3 ++
 mm/page_alloc.c          | 79 ++++++++++++++++++++++++++++++++++++----
 4 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 65b8df1db26a..ff9060af6295 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -372,6 +372,7 @@ __meminit void *alloc_pages_exact_nid_noprof(int nid, size_t size, gfp_t gfp_mask);
 	__get_free_pages((gfp_mask) | GFP_DMA, (order))
 
 extern void __free_pages(struct page *page, unsigned int order);
+extern void free_pages_nolock(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
 
 #define __free_page(page) __free_pages((page), 0)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7361a8f3ab68..52547b3e5fd8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -99,6 +99,10 @@ struct page {
 				/* Or, free page */
 				struct list_head buddy_list;
 				struct list_head pcp_list;
+				struct {
+					struct llist_node pcp_llist;
+					unsigned int order;
+				};
 			};
 			/* See page-flags.h for PAGE_MAPPING_FLAGS */
 			struct address_space *mapping;
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b36124145a16..1a854e0a9e3b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -953,6 +953,9 @@ struct zone {
 	/* Primarily protects free_area */
 	spinlock_t		lock;
 
+	/* Pages to be freed when next trylock succeeds */
+	struct llist_head	trylock_free_pages;
+
 	/* Write-intensive fields used by compaction and vmstats. */
 	CACHELINE_PADDING(_pad2_);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d23545057b6e..10918bfc6734 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -88,6 +88,9 @@ typedef int __bitwise fpi_t;
  */
 #define FPI_TO_TAIL		((__force fpi_t)BIT(1))
 
+/* Free the page without taking locks. Rely on trylock only. */
+#define FPI_TRYLOCK		((__force fpi_t)BIT(2))
+
 /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
 static DEFINE_MUTEX(pcp_batch_high_lock);
 #define MIN_PERCPU_PAGELIST_HIGH_FRACTION (8)
@@ -1247,13 +1250,44 @@ static void split_large_buddy(struct zone *zone, struct page *page,
 	}
 }
 
+static void add_page_to_zone_llist(struct zone *zone, struct page *page,
+				   unsigned int order)
+{
+	/* Remember the order */
+	page->order = order;
+	/* Add the page to the free list */
+	llist_add(&page->pcp_llist, &zone->trylock_free_pages);
+}
+
 static void free_one_page(struct zone *zone, struct page *page,
 			  unsigned long pfn, unsigned int order,
 			  fpi_t fpi_flags)
 {
+	struct llist_head *llhead;
 	unsigned long flags;
 
-	spin_lock_irqsave(&zone->lock, flags);
+	if (!spin_trylock_irqsave(&zone->lock, flags)) {
+		if (unlikely(fpi_flags & FPI_TRYLOCK)) {
+			add_page_to_zone_llist(zone, page, order);
+			return;
+		}
+		spin_lock_irqsave(&zone->lock, flags);
+	}
+
+	/* The lock succeeded. Process deferred pages. */
+	llhead = &zone->trylock_free_pages;
+	if (unlikely(!llist_empty(llhead) && !(fpi_flags & FPI_TRYLOCK))) {
+		struct llist_node *llnode;
+		struct page *p, *tmp;
+
+		llnode = llist_del_all(llhead);
+		llist_for_each_entry_safe(p, tmp, llnode, pcp_llist) {
+			unsigned int p_order = p->order;
+
+			split_large_buddy(zone, p, page_to_pfn(p), p_order, fpi_flags);
+			__count_vm_events(PGFREE, 1 << p_order);
+		}
+	}
 	split_large_buddy(zone, page, pfn, order, fpi_flags);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
@@ -2596,7 +2630,7 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
 
 static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 				   struct page *page, int migratetype,
-				   unsigned int order)
+				   unsigned int order, fpi_t fpi_flags)
 {
 	int high, batch;
 	int pindex;
@@ -2631,6 +2665,14 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	}
 	if (pcp->free_count < (batch << CONFIG_PCP_BATCH_SCALE_MAX))
 		pcp->free_count += (1 << order);
+
+	if (unlikely(fpi_flags & FPI_TRYLOCK)) {
+		/*
+		 * Do not attempt to take a zone lock. Let pcp->count get
+		 * over high mark temporarily.
+		 */
+		return;
+	}
 	high = nr_pcp_high(pcp, zone, batch, free_high);
 	if (pcp->count >= high) {
 		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
@@ -2645,7 +2687,8 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 /*
  * Free a pcp page
  */
-void free_unref_page(struct page *page, unsigned int order)
+static void __free_unref_page(struct page *page, unsigned int order,
+			      fpi_t fpi_flags)
 {
 	unsigned long __maybe_unused UP_flags;
 	struct per_cpu_pages *pcp;
@@ -2654,7 +2697,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	int migratetype;
 
 	if (!pcp_allowed_order(order)) {
-		__free_pages_ok(page, order, FPI_NONE);
+		__free_pages_ok(page, order, fpi_flags);
 		return;
 	}
 
@@ -2671,24 +2714,33 @@ void free_unref_page(struct page *page, unsigned int order)
 	migratetype = get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(page_zone(page), page, pfn, order, FPI_NONE);
+			free_one_page(page_zone(page), page, pfn, order, fpi_flags);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
 	}
 
 	zone = page_zone(page);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq())) {
+		add_page_to_zone_llist(zone, page, order);
+		return;
+	}
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 	if (pcp) {
-		free_unref_page_commit(zone, pcp, page, migratetype, order);
+		free_unref_page_commit(zone, pcp, page, migratetype, order, fpi_flags);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, FPI_NONE);
+		free_one_page(zone, page, pfn, order, fpi_flags);
 	}
 	pcp_trylock_finish(UP_flags);
 }
 
+void free_unref_page(struct page *page, unsigned int order)
+{
+	__free_unref_page(page, order, FPI_NONE);
+}
+
 /*
  * Free a batch of folios
  */
@@ -2777,7 +2829,7 @@ void free_unref_folios(struct folio_batch *folios)
 
 		trace_mm_page_free_batched(&folio->page);
 		free_unref_page_commit(zone, pcp, &folio->page, migratetype,
-				       order);
+				       order, FPI_NONE);
 	}
 
 	if (pcp) {
@@ -4854,6 +4906,17 @@ void __free_pages(struct page *page, unsigned int order)
 }
 EXPORT_SYMBOL(__free_pages);
 
+/*
+ * Can be called while holding raw_spin_lock or from IRQ and NMI,
+ * but only for pages that came from try_alloc_pages():
+ * order <= 3, !folio, etc
+ */
+void free_pages_nolock(struct page *page, unsigned int order)
+{
+	if (put_page_testzero(page))
+		__free_unref_page(page, order, FPI_TRYLOCK);
+}
+
 void free_pages(unsigned long addr, unsigned int order)
 {
 	if (addr != 0) {

From patchwork Wed Dec 18 03:07:16 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13912969
X-Patchwork-Delegate: bpf@iogearbox.net
From: alexei.starovoitov@gmail.com
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v3 3/6] locking/local_lock: Introduce local_trylock_irqsave()
Date: Tue, 17 Dec 2024 19:07:16 -0800
Message-ID: <20241218030720.1602449-4-alexei.starovoitov@gmail.com>
In-Reply-To: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

This is inspired by 'struct local_tryirq_lock' in:
https://lore.kernel.org/all/20241112-slub-percpu-caches-v1-5-ddc0bdc27e05@suse.cz/

Similar to local_lock_irqsave(), introduce local_trylock_irqsave(). In
PREEMPT_RT it uses spin_trylock() when not in hard IRQ and not in NMI, and
fails immediately otherwise. In !RT it uses a simple 'active' flag that
prevents IRQs or NMIs from re-entering the locked region.

Note there is no need to use local_inc for the active flag. If an IRQ handler
grabs the same local_lock after READ_ONCE(lock->active) has already
completed, it has to unlock it before returning; the same holds for an NMI
handler. So there is a strict nesting of scopes. It's a per-CPU lock, so
multiple CPUs do not access it in parallel.
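A hedged usage sketch of the new primitive (demo_lock and its caller are
invented for illustration; patch 4 applies the same pattern to memcg's
stock_lock): try the lock first, and fall back to the blocking variant only
when the context allows it:

#include <linux/local_lock.h>

static DEFINE_PER_CPU(local_lock_t, demo_lock) = INIT_LOCAL_LOCK(demo_lock);

static bool touch_percpu_state(bool trylock_only)
{
	unsigned long flags;

	if (!local_trylock_irqsave(&demo_lock, flags)) {
		if (trylock_only)
			return false;	/* any-context caller: fail fast */
		local_lock_irqsave(&demo_lock, flags);
	}
	/* ... manipulate per-CPU data under the lock ... */
	local_unlock_irqrestore(&demo_lock, flags);
	return true;
}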
Signed-off-by: Alexei Starovoitov
---
 include/linux/local_lock.h          |  9 ++++
 include/linux/local_lock_internal.h | 76 ++++++++++++++++++++++++++---
 2 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/include/linux/local_lock.h b/include/linux/local_lock.h
index 091dc0b6bdfb..84ee560c4f51 100644
--- a/include/linux/local_lock.h
+++ b/include/linux/local_lock.h
@@ -30,6 +30,15 @@
 #define local_lock_irqsave(lock, flags)				\
 	__local_lock_irqsave(lock, flags)
 
+/**
+ * local_trylock_irqsave - Try to acquire a per CPU local lock, save and disable
+ *			   interrupts. Always fails in RT when in_hardirq or NMI.
+ * @lock:	The lock variable
+ * @flags:	Storage for interrupt flags
+ */
+#define local_trylock_irqsave(lock, flags)			\
+	__local_trylock_irqsave(lock, flags)
+
 /**
  * local_unlock - Release a per CPU local lock
  * @lock:	The lock variable
diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index 8dd71fbbb6d2..93672127c73d 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -9,6 +9,7 @@
 #ifndef CONFIG_PREEMPT_RT
 
 typedef struct {
+	int active;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 	struct task_struct	*owner;
@@ -22,7 +23,7 @@ typedef struct {
 		.wait_type_inner = LD_WAIT_CONFIG,	\
 		.lock_type = LD_LOCK_PERCPU,		\
 	},						\
-	.owner = NULL,
+	.owner = NULL, .active = 0
 
 static inline void local_lock_acquire(local_lock_t *l)
 {
@@ -31,6 +32,13 @@ static inline void local_lock_acquire(local_lock_t *l)
 	l->owner = current;
 }
 
+static inline void local_trylock_acquire(local_lock_t *l)
+{
+	lock_map_acquire_try(&l->dep_map);
+	DEBUG_LOCKS_WARN_ON(l->owner);
+	l->owner = current;
+}
+
 static inline void local_lock_release(local_lock_t *l)
 {
 	DEBUG_LOCKS_WARN_ON(l->owner != current);
@@ -45,6 +53,7 @@ static inline void local_lock_debug_init(local_lock_t *l)
 #else /* CONFIG_DEBUG_LOCK_ALLOC */
 # define LOCAL_LOCK_DEBUG_INIT(lockname)
 static inline void local_lock_acquire(local_lock_t *l) { }
+static inline void local_trylock_acquire(local_lock_t *l) { }
 static inline void local_lock_release(local_lock_t *l) { }
 static inline void local_lock_debug_init(local_lock_t *l) { }
 #endif /* !CONFIG_DEBUG_LOCK_ALLOC */
@@ -60,6 +69,7 @@ do {								\
 			      0, LD_WAIT_CONFIG, LD_WAIT_INV,	\
 			      LD_LOCK_PERCPU);			\
 	local_lock_debug_init(lock);				\
+	(lock)->active = 0;					\
 } while (0)
 
 #define __spinlock_nested_bh_init(lock)				\
@@ -75,37 +85,73 @@ do {								\
 
 #define __local_lock(lock)					\
 	do {							\
+		local_lock_t *l;				\
 		preempt_disable();				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		l = this_cpu_ptr(lock);				\
+		lockdep_assert(l->active == 0);			\
+		WRITE_ONCE(l->active, 1);			\
+		local_lock_acquire(l);				\
 	} while (0)
 
 #define __local_lock_irq(lock)					\
 	do {							\
+		local_lock_t *l;				\
 		local_irq_disable();				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		l = this_cpu_ptr(lock);				\
+		lockdep_assert(l->active == 0);			\
+		WRITE_ONCE(l->active, 1);			\
+		local_lock_acquire(l);				\
 	} while (0)
 
 #define __local_lock_irqsave(lock, flags)			\
 	do {							\
+		local_lock_t *l;				\
 		local_irq_save(flags);				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		l = this_cpu_ptr(lock);				\
+		lockdep_assert(l->active == 0);			\
+		WRITE_ONCE(l->active, 1);			\
+		local_lock_acquire(l);				\
 	} while (0)
 
+#define __local_trylock_irqsave(lock, flags)			\
+	({							\
+		local_lock_t *l;				\
+		local_irq_save(flags);				\
+		l = this_cpu_ptr(lock);				\
+		if (READ_ONCE(l->active) == 1) {		\
+			local_irq_restore(flags);		\
+			l = NULL;				\
+		} else {					\
+			WRITE_ONCE(l->active, 1);		\
+			local_trylock_acquire(l);		\
+		}						\
+		!!l;						\
+	})
+
 #define __local_unlock(lock)					\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		local_lock_t *l = this_cpu_ptr(lock);		\
+		lockdep_assert(l->active == 1);			\
+		WRITE_ONCE(l->active, 0);			\
+		local_lock_release(l);				\
 		preempt_enable();				\
 	} while (0)
 
 #define __local_unlock_irq(lock)				\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		local_lock_t *l = this_cpu_ptr(lock);		\
+		lockdep_assert(l->active == 1);			\
+		WRITE_ONCE(l->active, 0);			\
+		local_lock_release(l);				\
 		local_irq_enable();				\
 	} while (0)
 
 #define __local_unlock_irqrestore(lock, flags)			\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		local_lock_t *l = this_cpu_ptr(lock);		\
+		lockdep_assert(l->active == 1);			\
+		WRITE_ONCE(l->active, 0);			\
+		local_lock_release(l);				\
 		local_irq_restore(flags);			\
 	} while (0)
 
@@ -148,6 +194,22 @@ typedef spinlock_t local_lock_t;
 		__local_lock(lock);				\
 	} while (0)
 
+#define __local_trylock_irqsave(lock, flags)			\
+	({							\
+		__label__ out;					\
+		int ret = 0;					\
+		typecheck(unsigned long, flags);		\
+		flags = 0;					\
+		if (in_nmi() || in_hardirq())			\
+			goto out;				\
+		migrate_disable();				\
+		ret = spin_trylock(this_cpu_ptr((lock)));	\
+		if (!ret)					\
+			migrate_enable();			\
+	out:							\
+		ret;						\
+	})
+
 #define __local_unlock(__lock)					\
 	do {							\
 		spin_unlock(this_cpu_ptr((__lock)));		\

From patchwork Wed Dec 18 03:07:17 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13912970
X-Patchwork-Delegate: bpf@iogearbox.net
From: alexei.starovoitov@gmail.com
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v3 4/6] memcg: Use trylock to access memcg stock_lock.
Date: Tue, 17 Dec 2024 19:07:17 -0800
Message-ID: <20241218030720.1602449-5-alexei.starovoitov@gmail.com>
In-Reply-To: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Teach memcg to operate under trylock conditions when spinning locks cannot be
used. The end result is that __memcg_kmem_charge_page() and
__memcg_kmem_uncharge_page() are safe to use from any context, in both RT and
!RT. In !RT an NMI handler may fail to trylock stock_lock; in RT, hard IRQ
and NMI handlers do not attempt the trylock at all.
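Schematically, the rule this patch encodes (demo_consume() is a stand-in, not
a memcg function; in the real patch the logic lives in consume_stock()): only
__GFP_TRYLOCK callers may observe a trylock failure; everyone else still
acquires stock_lock unconditionally:

static bool demo_consume(gfp_t gfp_mask)
{
	unsigned long flags;

	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
		if (gfp_mask & __GFP_TRYLOCK)
			return false;	/* caller falls back to page_counter */
		local_lock_irqsave(&memcg_stock.stock_lock, flags);
	}
	/* ... consume nr_pages from the per-CPU stock ... */
	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	return true;
}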
Signed-off-by: Alexei Starovoitov
---
 mm/memcontrol.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..f168d223375f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1756,7 +1756,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
  *
  * returns true if successful, false otherwise.
  */
-static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
+static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
+			  gfp_t gfp_mask)
 {
 	struct memcg_stock_pcp *stock;
 	unsigned int stock_pages;
@@ -1766,7 +1767,11 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 	if (nr_pages > MEMCG_CHARGE_BATCH)
 		return ret;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		if (gfp_mask & __GFP_TRYLOCK)
+			return ret;
+		local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	}
 
 	stock = this_cpu_ptr(&memcg_stock);
 	stock_pages = READ_ONCE(stock->nr_pages);
@@ -1851,7 +1856,14 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	unsigned long flags;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		/*
+		 * In case of unlikely failure to lock percpu stock_lock
+		 * uncharge memcg directly.
+		 */
+		mem_cgroup_cancel_charge(memcg, nr_pages);
+		return;
+	}
 	__refill_stock(memcg, nr_pages);
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 }
@@ -2196,7 +2208,7 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	unsigned long pflags;
 
 retry:
-	if (consume_stock(memcg, nr_pages))
+	if (consume_stock(memcg, nr_pages, gfp_mask))
 		return 0;
 
 	if (!do_memsw_account() ||

From patchwork Wed Dec 18 03:07:18 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13912971
X-Patchwork-Delegate: bpf@iogearbox.net
From: alexei.starovoitov@gmail.com
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v3 5/6] mm, bpf: Use memcg in try_alloc_pages().
Date: Tue, 17 Dec 2024 19:07:18 -0800
Message-ID: <20241218030720.1602449-6-alexei.starovoitov@gmail.com>
In-Reply-To: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Unconditionally use __GFP_ACCOUNT in try_alloc_pages(). The caller is
responsible for setting up the memcg correctly, since all BPF memory
accounting is memcg based.
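For context, a sketch of how a caller is expected to supply the memcg (patch
6 does exactly this inside bpf_map_alloc_pages(); alloc_charged_page() here
is a made-up wrapper): the charge lands on whatever memcg is active at
allocation time:

#include <linux/sched/mm.h>

static struct page *alloc_charged_page(struct mem_cgroup *memcg, int nid)
{
	struct mem_cgroup *old = set_active_memcg(memcg);
	struct page *page = try_alloc_pages(nid, 0);

	set_active_memcg(old);
	return page;	/* NULL if the allocation or the memcg charge failed */
}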
Signed-off-by: Alexei Starovoitov
---
 mm/page_alloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 10918bfc6734..5d0e56fbb65b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7100,7 +7100,7 @@ static bool __free_unaccepted(struct page *page)
 struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 {
 	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO |
-			  __GFP_NOMEMALLOC | __GFP_TRYLOCK;
+			  __GFP_NOMEMALLOC | __GFP_TRYLOCK | __GFP_ACCOUNT;
 	unsigned int alloc_flags = ALLOC_TRYLOCK;
 	struct alloc_context ac = { };
 	struct page *page;
@@ -7136,13 +7136,17 @@ struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 	 * If it's empty attempt to spin_trylock zone->lock.
 	 * Do not specify __GFP_KSWAPD_RECLAIM to avoid wakeup_kswapd
 	 * that may need to grab a lock.
-	 * Do not specify __GFP_ACCOUNT to avoid local_lock.
 	 * Do not warn either.
 	 */
 	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
 
 	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
 
+	if (memcg_kmem_online() && page &&
+	    unlikely(__memcg_kmem_charge_page(page, alloc_gfp, order) != 0)) {
+		free_pages_nolock(page, order);
+		page = NULL;
+	}
 	trace_mm_page_alloc(page, order, alloc_gfp & ~__GFP_TRYLOCK, ac.migratetype);
 	kmsan_alloc_page(page, order, alloc_gfp);
 	return page;

From patchwork Wed Dec 18 03:07:19 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13912972
X-Patchwork-Delegate: bpf@iogearbox.net
From: alexei.starovoitov@gmail.com
To: bpf@vger.kernel.org
Cc: andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, bigeasy@linutronix.de, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, jannh@google.com, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next v3 6/6] bpf: Use try_alloc_pages() to allocate pages for bpf needs.
Date: Tue, 17 Dec 2024 19:07:19 -0800
Message-ID: <20241218030720.1602449-7-alexei.starovoitov@gmail.com>
In-Reply-To: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Use try_alloc_pages() and free_pages_nolock() in bpf_map_alloc_pages().

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/syscall.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4e88797fdbeb..45099d24909c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -582,14 +582,14 @@ int bpf_map_alloc_pages(const struct bpf_map *map, gfp_t gfp, int nid,
 		old_memcg = set_active_memcg(memcg);
 #endif
 	for (i = 0; i < nr_pages; i++) {
-		pg = alloc_pages_node(nid, gfp | __GFP_ACCOUNT, 0);
+		pg = try_alloc_pages(nid, 0);
 
 		if (pg) {
 			pages[i] = pg;
 			continue;
 		}
 		for (j = 0; j < i; j++)
-			__free_page(pages[j]);
+			free_pages_nolock(pages[j], 0);
 		ret = -ENOMEM;
 		break;
 	}