From patchwork Wed Jun 28 01:56:22 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295157
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 01/13] bpf: Rename a few bpf_mem_alloc fields.
Date: Tue, 27 Jun 2023 18:56:22 -0700
Message-Id: <20230628015634.33193-2-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Rename:
- struct rcu_head rcu;
- struct llist_head free_by_rcu;
- struct llist_head waiting_for_gp;
- atomic_t call_rcu_in_progress;
+ struct llist_head free_by_rcu_ttrace;
+ struct llist_head waiting_for_gp_ttrace;
+ struct rcu_head rcu_ttrace;
+ atomic_t call_rcu_ttrace_in_progress;
...
- static void do_call_rcu(struct bpf_mem_cache *c)
+ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)

to better indicate intended use. The 'tasks trace' is shortened to
'ttrace' to reduce verbosity.

No functional changes.

Later patches will add free_by_rcu/waiting_for_gp fields to be used
with normal RCU.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 57 ++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 28 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 0668bcd7c926..cc5b8adb4c83 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -99,10 +99,11 @@ struct bpf_mem_cache {
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
 
-	struct rcu_head rcu;
-	struct llist_head free_by_rcu;
-	struct llist_head waiting_for_gp;
-	atomic_t call_rcu_in_progress;
+	/* list of objects to be freed after RCU tasks trace GP */
+	struct llist_head free_by_rcu_ttrace;
+	struct llist_head waiting_for_gp_ttrace;
+	struct rcu_head rcu_ttrace;
+	atomic_t call_rcu_ttrace_in_progress;
 };
 
 struct bpf_mem_caches {
@@ -165,18 +166,18 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	old_memcg = set_active_memcg(memcg);
 	for (i = 0; i < cnt; i++) {
 		/*
-		 * free_by_rcu is only manipulated by irq work refill_work().
+		 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
 		 * IRQ works on the same CPU are called sequentially, so it is
 		 * safe to use __llist_del_first() here. If alloc_bulk() is
 		 * invoked by the initial prefill, there will be no running
 		 * refill_work(), so __llist_del_first() is fine as well.
 		 *
-		 * In most cases, objects on free_by_rcu are from the same CPU.
+		 * In most cases, objects on free_by_rcu_ttrace are from the same CPU.
 		 * If some objects come from other CPUs, it doesn't incur any
 		 * harm because NUMA_NO_NODE means the preference for current
 		 * numa node and it is not a guarantee.
 		 */
-		obj = __llist_del_first(&c->free_by_rcu);
+		obj = __llist_del_first(&c->free_by_rcu_ttrace);
 		if (!obj) {
 			/* Allocate, but don't deplete atomic reserves that typical
 			 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
@@ -232,10 +233,10 @@ static void free_all(struct llist_node *llnode, bool percpu)
 
 static void __free_rcu(struct rcu_head *head)
 {
-	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu_ttrace);
 
-	free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
-	atomic_set(&c->call_rcu_in_progress, 0);
+	free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+	atomic_set(&c->call_rcu_ttrace_in_progress, 0);
 }
 
 static void __free_rcu_tasks_trace(struct rcu_head *head)
@@ -254,32 +255,32 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
 	struct llist_node *llnode = obj;
 
 	/* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
-	 * Nothing races to add to free_by_rcu list.
+	 * Nothing races to add to free_by_rcu_ttrace list.
 	 */
-	__llist_add(llnode, &c->free_by_rcu);
+	__llist_add(llnode, &c->free_by_rcu_ttrace);
 }
 
-static void do_call_rcu(struct bpf_mem_cache *c)
+static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 {
 	struct llist_node *llnode, *t;
 
-	if (atomic_xchg(&c->call_rcu_in_progress, 1))
+	if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1))
 		return;
 
-	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
-	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
-		/* There is no concurrent __llist_add(waiting_for_gp) access.
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
+	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu_ttrace))
+		/* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
 		 * It doesn't race with llist_del_all either.
-		 * But there could be two concurrent llist_del_all(waiting_for_gp):
+		 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
 		 * from __free_rcu() and from drain_mem_cache().
 		 */
-		__llist_add(llnode, &c->waiting_for_gp);
+		__llist_add(llnode, &c->waiting_for_gp_ttrace);
 
 	/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
 	 * If RCU Tasks Trace grace period implies RCU grace period, free
 	 * these elements directly, else use call_rcu() to wait for normal
 	 * progs to finish and finally do free_one() on each element.
 	 */
-	call_rcu_tasks_trace(&c->rcu, __free_rcu_tasks_trace);
+	call_rcu_tasks_trace(&c->rcu_ttrace, __free_rcu_tasks_trace);
 }
 
 static void free_bulk(struct bpf_mem_cache *c)
@@ -307,7 +308,7 @@ static void free_bulk(struct bpf_mem_cache *c)
 	/* and drain free_llist_extra */
 	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
 		enque_to_free(c, llnode);
-	do_call_rcu(c);
+	do_call_rcu_ttrace(c);
 }
 
 static void bpf_mem_refill(struct irq_work *work)
@@ -441,13 +442,13 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 
 	/* No progs are using this bpf_mem_cache, but htab_map_free() called
 	 * bpf_mem_cache_free() for all remaining elements and they can be in
-	 * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
+	 * free_by_rcu_ttrace or in waiting_for_gp_ttrace lists, so drain those lists now.
 	 *
-	 * Except for waiting_for_gp list, there are no concurrent operations
+	 * Except for waiting_for_gp_ttrace list, there are no concurrent operations
 	 * on these lists, so it is safe to use __llist_del_all().
 	 */
-	free_all(__llist_del_all(&c->free_by_rcu), percpu);
-	free_all(llist_del_all(&c->waiting_for_gp), percpu);
+	free_all(__llist_del_all(&c->free_by_rcu_ttrace), percpu);
+	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
 }
@@ -462,7 +463,7 @@ static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
 
 static void free_mem_alloc(struct bpf_mem_alloc *ma)
 {
-	/* waiting_for_gp lists was drained, but __free_rcu might
+	/* waiting_for_gp_ttrace lists was drained, but __free_rcu might
 	 * still execute. Wait for it now before we freeing percpu caches.
 	 *
 	 * rcu_barrier_tasks_trace() doesn't imply synchronize_rcu_tasks_trace(),
@@ -535,7 +536,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 		 */
 		irq_work_sync(&c->refill_work);
 		drain_mem_cache(c);
-		rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
+		rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
 	}
 	/* objcg is the same across cpus */
 	if (c->objcg)
@@ -550,7 +551,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 			c = &cc->cache[i];
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
-			rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
+			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
 		}
 	}
 	if (c->objcg)
From patchwork Wed Jun 28 01:56:23 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295158
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 02/13] bpf: Simplify code of destroy_mem_alloc() with kmemdup().
Date: Tue, 27 Jun 2023 18:56:23 -0700
Message-Id: <20230628015634.33193-3-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Use kmemdup() to simplify the code.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index cc5b8adb4c83..b0011217be6c 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -499,7 +499,7 @@ static void destroy_mem_alloc(struct bpf_mem_alloc *ma, int rcu_in_progress)
 		return;
 	}
 
-	copy = kmalloc(sizeof(*ma), GFP_KERNEL);
+	copy = kmemdup(ma, sizeof(*ma), GFP_KERNEL);
 	if (!copy) {
 		/* Slow path with inline barrier-s */
 		free_mem_alloc(ma);
@@ -507,10 +507,7 @@ static void destroy_mem_alloc(struct bpf_mem_alloc *ma, int rcu_in_progress)
 	}
 
 	/* Defer barriers into worker to let the rest of map memory to be freed */
-	copy->cache = ma->cache;
-	ma->cache = NULL;
-	copy->caches = ma->caches;
-	ma->caches = NULL;
+	memset(ma, 0, sizeof(*ma));
 	INIT_WORK(&copy->work, free_mem_alloc_deferred);
 	queue_work(system_unbound_wq, &copy->work);
 }
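The kmemdup() conversion above is an instance of a general idiom: replace
kmalloc() followed by field-by-field copying with a single call that clones
the whole object. A minimal sketch of the pattern, using a made-up struct foo
rather than bpf_mem_alloc (illustrative only, not part of the patch):

	#include <linux/slab.h>
	#include <linux/string.h>

	struct foo { void *cache; void *caches; };

	/* Before: allocate, then copy each field by hand. */
	static struct foo *clone_foo_manual(const struct foo *src)
	{
		struct foo *copy = kmalloc(sizeof(*copy), GFP_KERNEL);

		if (!copy)
			return NULL;
		copy->cache = src->cache;
		copy->caches = src->caches;
		return copy;
	}

	/* After: kmemdup() allocates and copies the object in one step. */
	static struct foo *clone_foo(const struct foo *src)
	{
		return kmemdup(src, sizeof(*src), GFP_KERNEL);
	}

The patch pairs this with memset(ma, 0, sizeof(*ma)) to transfer ownership:
the copy keeps the live pointers and the original is wiped wholesale, instead
of NULLing fields one at a time.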
From patchwork Wed Jun 28 01:56:24 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295159
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 03/13] bpf: Let free_all() return the number of freed elements.
Date: Tue, 27 Jun 2023 18:56:24 -0700
Message-Id: <20230628015634.33193-4-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Let the free_all() helper return the number of freed elements. It's not
used in this patch, but it helps in debug/development of bpf_mem_alloc.

For example, this diff for __free_rcu():
-	free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+	printk("cpu %d freed %d objs after tasks trace\n", raw_smp_processor_id(),
+	       free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size));

would show how busy RCU tasks trace is.

In an artificial benchmark where one cpu is allocating and a different
cpu is freeing, RCU tasks trace cannot keep up, and the list of objects
keeps growing from thousands to millions, eventually OOMing.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index b0011217be6c..693651d2648b 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -223,12 +223,16 @@ static void free_one(void *obj, bool percpu)
 	kfree(obj);
 }
 
-static void free_all(struct llist_node *llnode, bool percpu)
+static int free_all(struct llist_node *llnode, bool percpu)
 {
 	struct llist_node *pos, *t;
+	int cnt = 0;
 
-	llist_for_each_safe(pos, t, llnode)
+	llist_for_each_safe(pos, t, llnode) {
 		free_one(pos, percpu);
+		cnt++;
+	}
+	return cnt;
 }
 
 static void __free_rcu(struct rcu_head *head)
From patchwork Wed Jun 28 01:56:25 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295160
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 04/13] bpf: Refactor alloc_bulk().
Date: Tue, 27 Jun 2023 18:56:25 -0700
Message-Id: <20230628015634.33193-5-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Factor out the inner body of alloc_bulk() into a separate helper.
No functional changes.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 46 ++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 693651d2648b..9693b1f8cbda 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -154,11 +154,35 @@ static struct mem_cgroup *get_memcg(const struct bpf_mem_cache *c)
 #endif
 }
 
+static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
+{
+	unsigned long flags;
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		/* In RT irq_work runs in per-cpu kthread, so disable
+		 * interrupts to avoid preemption and interrupts and
+		 * reduce the chance of bpf prog executing on this cpu
+		 * when active counter is busy.
+		 */
+		local_irq_save(flags);
+	/* alloc_bulk runs from irq_work which will not preempt a bpf
+	 * program that does unit_alloc/unit_free since IRQs are
+	 * disabled there. There is no race to increment 'active'
+	 * counter. It protects free_llist from corruption in case NMI
+	 * bpf prog preempted this loop.
+	 */
+	WARN_ON_ONCE(local_inc_return(&c->active) != 1);
+	__llist_add(obj, &c->free_llist);
+	c->free_cnt++;
+	local_dec(&c->active);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_restore(flags);
+}
+
 /* Mostly runs from irq_work except __init phase. */
 static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 {
 	struct mem_cgroup *memcg = NULL, *old_memcg;
-	unsigned long flags;
 	void *obj;
 	int i;
 
@@ -188,25 +212,7 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 			if (!obj)
 				break;
 		}
-		if (IS_ENABLED(CONFIG_PREEMPT_RT))
-			/* In RT irq_work runs in per-cpu kthread, so disable
-			 * interrupts to avoid preemption and interrupts and
-			 * reduce the chance of bpf prog executing on this cpu
-			 * when active counter is busy.
-			 */
-			local_irq_save(flags);
-		/* alloc_bulk runs from irq_work which will not preempt a bpf
-		 * program that does unit_alloc/unit_free since IRQs are
-		 * disabled there. There is no race to increment 'active'
-		 * counter. It protects free_llist from corruption in case NMI
-		 * bpf prog preempted this loop.
-		 */
-		WARN_ON_ONCE(local_inc_return(&c->active) != 1);
-		__llist_add(obj, &c->free_llist);
-		c->free_cnt++;
-		local_dec(&c->active);
-		if (IS_ENABLED(CONFIG_PREEMPT_RT))
-			local_irq_restore(flags);
+		add_obj_to_free_list(c, obj);
 	}
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
From patchwork Wed Jun 28 01:56:26 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295161
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 05/13] bpf: Factor out inc/dec of active flag into helpers.
Date: Tue, 27 Jun 2023 18:56:26 -0700
Message-Id: <20230628015634.33193-6-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Factor out local_inc/dec_return(&c->active) into helpers.
No functional changes.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 9693b1f8cbda..052fc801fb9f 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -154,17 +154,15 @@ static struct mem_cgroup *get_memcg(const struct bpf_mem_cache *c)
 #endif
 }
 
-static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
+static void inc_active(struct bpf_mem_cache *c, unsigned long *flags)
 {
-	unsigned long flags;
-
 	if (IS_ENABLED(CONFIG_PREEMPT_RT))
 		/* In RT irq_work runs in per-cpu kthread, so disable
 		 * interrupts to avoid preemption and interrupts and
 		 * reduce the chance of bpf prog executing on this cpu
 		 * when active counter is busy.
 		 */
-		local_irq_save(flags);
+		local_irq_save(*flags);
 	/* alloc_bulk runs from irq_work which will not preempt a bpf
 	 * program that does unit_alloc/unit_free since IRQs are
 	 * disabled there. There is no race to increment 'active'
@@ -172,13 +170,25 @@ static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
 	 * bpf prog preempted this loop.
 	 */
 	WARN_ON_ONCE(local_inc_return(&c->active) != 1);
-	__llist_add(obj, &c->free_llist);
-	c->free_cnt++;
+}
+
+static void dec_active(struct bpf_mem_cache *c, unsigned long flags)
+{
 	local_dec(&c->active);
 	if (IS_ENABLED(CONFIG_PREEMPT_RT))
 		local_irq_restore(flags);
 }
 
+static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
+{
+	unsigned long flags;
+
+	inc_active(c, &flags);
+	__llist_add(obj, &c->free_llist);
+	c->free_cnt++;
+	dec_active(c, flags);
+}
+
 /* Mostly runs from irq_work except __init phase. */
 static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 {
@@ -300,17 +310,13 @@ static void free_bulk(struct bpf_mem_cache *c)
 	int cnt;
 
 	do {
-		if (IS_ENABLED(CONFIG_PREEMPT_RT))
-			local_irq_save(flags);
-		WARN_ON_ONCE(local_inc_return(&c->active) != 1);
+		inc_active(c, &flags);
 		llnode = __llist_del_first(&c->free_llist);
 		if (llnode)
 			cnt = --c->free_cnt;
 		else
 			cnt = 0;
-		local_dec(&c->active);
-		if (IS_ENABLED(CONFIG_PREEMPT_RT))
-			local_irq_restore(flags);
+		dec_active(c, flags);
 		if (llnode)
 			enque_to_free(c, llnode);
 	} while (cnt > (c->high_watermark + c->low_watermark) / 2);
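With the helpers factored out, any future critical section over the per-cpu
freelist can be written as an inc_active()/dec_active() pair. A minimal
sketch of a hypothetical additional caller (pop_obj_from_free_list() is a
made-up name; the real call sites are the two shown in the diff):

	static struct llist_node *pop_obj_from_free_list(struct bpf_mem_cache *c)
	{
		struct llist_node *llnode;
		unsigned long flags;

		inc_active(c, &flags);	/* bumps c->active; disables irqs on PREEMPT_RT */
		llnode = __llist_del_first(&c->free_llist);
		if (llnode)
			c->free_cnt--;
		dec_active(c, flags);	/* drops c->active; restores irqs if disabled */
		return llnode;
	}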
From patchwork Wed Jun 28 01:56:27 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295162
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 06/13] bpf: Further refactor alloc_bulk().
Date: Tue, 27 Jun 2023 18:56:27 -0700
Message-Id: <20230628015634.33193-7-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

In certain scenarios alloc_bulk() might be taking free objects mainly
from the free_by_rcu_ttrace list. In such a case get_memcg() and
set_active_memcg() are redundant, but they show up in the perf profile.
Split the loop and only set memcg when allocating from slab.
No performance difference in this patch alone, but it helps in
combination with further patches.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 052fc801fb9f..0ee566a7719a 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -196,8 +196,6 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	void *obj;
 	int i;
 
-	memcg = get_memcg(c);
-	old_memcg = set_active_memcg(memcg);
 	for (i = 0; i < cnt; i++) {
 		/*
 		 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
@@ -212,16 +210,24 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 		 * numa node and it is not a guarantee.
 		 */
 		obj = __llist_del_first(&c->free_by_rcu_ttrace);
-		if (!obj) {
-			/* Allocate, but don't deplete atomic reserves that typical
-			 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
-			 * will allocate from the current numa node which is what we
-			 * want here.
-			 */
-			obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
-			if (!obj)
-				break;
-		}
+		if (!obj)
+			break;
+		add_obj_to_free_list(c, obj);
+	}
+	if (i >= cnt)
+		return;
+
+	memcg = get_memcg(c);
+	old_memcg = set_active_memcg(memcg);
+	for (; i < cnt; i++) {
+		/* Allocate, but don't deplete atomic reserves that typical
+		 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
+		 * will allocate from the current numa node which is what we
+		 * want here.
+		 */
+		obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
+		if (!obj)
+			break;
 		add_obj_to_free_list(c, obj);
 	}
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
From patchwork Wed Jun 28 01:56:28 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295163
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 07/13] bpf: Change bpf_mem_cache draining process.
Date: Tue, 27 Jun 2023 18:56:28 -0700
Message-Id: <20230628015634.33193-8-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

The next patch will introduce cross-cpu llist access, and the existing
irq_work_sync() + drain_mem_cache() + rcu_barrier_tasks_trace() mechanism
will not be enough, since irq_work_sync() + drain_mem_cache() on cpu A
won't guarantee that the llists on cpu A are empty. free_bulk() on cpu B
might add objects back to the llist of cpu A.

Add a 'bool draining' flag. The modified sequence looks like:

for_each_cpu:
  WRITE_ONCE(c->draining, true); // do_call_rcu_ttrace() won't be doing call_rcu() any more
  irq_work_sync();               // wait for irq_work callback (free_bulk) to finish
  drain_mem_cache();             // free all objects
rcu_barrier_tasks_trace();       // wait for RCU callbacks to execute

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 0ee566a7719a..2615f296f052 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -98,6 +98,7 @@ struct bpf_mem_cache {
 	int free_cnt;
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
+	bool draining;
 
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
@@ -301,6 +302,12 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 		 * from __free_rcu() and from drain_mem_cache().
 		 */
 		__llist_add(llnode, &c->waiting_for_gp_ttrace);
+
+	if (unlikely(READ_ONCE(c->draining))) {
+		__free_rcu(&c->rcu_ttrace);
+		return;
+	}
+
 	/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
 	 * If RCU Tasks Trace grace period implies RCU grace period, free
 	 * these elements directly, else use call_rcu() to wait for normal
@@ -544,15 +551,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 	rcu_in_progress = 0;
 	for_each_possible_cpu(cpu) {
 		c = per_cpu_ptr(ma->cache, cpu);
-		/*
-		 * refill_work may be unfinished for PREEMPT_RT kernel
-		 * in which irq work is invoked in a per-CPU RT thread.
-		 * It is also possible for kernel with
-		 * arch_irq_work_has_interrupt() being false and irq
-		 * work is invoked in timer interrupt. So waiting for
-		 * the completion of irq work to ease the handling of
-		 * concurrency.
-		 */
+		WRITE_ONCE(c->draining, true);
 		irq_work_sync(&c->refill_work);
 		drain_mem_cache(c);
 		rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
@@ -568,6 +567,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 		cc = per_cpu_ptr(ma->caches, cpu);
 		for (i = 0; i < NUM_CACHES; i++) {
 			c = &cc->cache[i];
+			WRITE_ONCE(c->draining, true);
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
 			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
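Putting the commit message's sequence into code form, the destroy path now
looks roughly like the following sketch (condensed from the diff; the real
bpf_mem_alloc_destroy() also walks the NUM_CACHES variant and handles objcg
teardown):

	for_each_possible_cpu(cpu) {
		c = per_cpu_ptr(ma->cache, cpu);
		/* After this store, do_call_rcu_ttrace() frees synchronously
		 * via __free_rcu() instead of calling call_rcu_tasks_trace().
		 */
		WRITE_ONCE(c->draining, true);
		irq_work_sync(&c->refill_work);	/* wait for free_bulk() to finish */
		drain_mem_cache(c);		/* free all queued objects */
		rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
	}
	/* rcu_in_progress != 0 means callbacks are still in flight, so the
	 * caller must still wait for (or defer past) the RCU tasks trace
	 * grace period before freeing the percpu caches themselves.
	 */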
From patchwork Wed Jun 28 01:56:29 2023
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13295164
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
	houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 08/13] bpf: Add a hint to allocated objects.
Date: Tue, 27 Jun 2023 18:56:29 -0700
Message-Id: <20230628015634.33193-9-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

To address the OOM issue when one cpu is allocating and another cpu is
freeing, add a target bpf_mem_cache hint to allocated objects, and when
the local cpu free_llist overflows, free to that bpf_mem_cache. The hint
addresses the OOM while maintaining the same performance for the common
case when alloc/free are done on the same cpu.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 46 ++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 2615f296f052..93242c4b85e0 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -99,6 +99,7 @@ struct bpf_mem_cache {
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
 	bool draining;
+	struct bpf_mem_cache *tgt;
 
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
@@ -199,18 +200,11 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 
 	for (i = 0; i < cnt; i++) {
 		/*
-		 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
-		 * IRQ works on the same CPU are called sequentially, so it is
-		 * safe to use __llist_del_first() here. If alloc_bulk() is
-		 * invoked by the initial prefill, there will be no running
-		 * refill_work(), so __llist_del_first() is fine as well.
-		 *
-		 * In most cases, objects on free_by_rcu_ttrace are from the same CPU.
-		 * If some objects come from other CPUs, it doesn't incur any
-		 * harm because NUMA_NO_NODE means the preference for current
-		 * numa node and it is not a guarantee.
+		 * For every 'c' llist_del_first(&c->free_by_rcu_ttrace); is
+		 * done only by one CPU == current CPU. Other CPUs might
+		 * llist_add() and llist_del_all() in parallel.
 		 */
-		obj = __llist_del_first(&c->free_by_rcu_ttrace);
+		obj = llist_del_first(&c->free_by_rcu_ttrace);
 		if (!obj)
 			break;
 		add_obj_to_free_list(c, obj);
@@ -284,7 +278,7 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
 	/* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
 	 * Nothing races to add to free_by_rcu_ttrace list.
 	 */
-	__llist_add(llnode, &c->free_by_rcu_ttrace);
+	llist_add(llnode, &c->free_by_rcu_ttrace);
 }
 
 static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
@@ -295,7 +289,7 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 		return;
 
 	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
-	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu_ttrace))
+	llist_for_each_safe(llnode, t, llist_del_all(&c->free_by_rcu_ttrace))
 		/* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
 		 * It doesn't race with llist_del_all either.
 		 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
@@ -312,16 +306,22 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 	 * If RCU Tasks Trace grace period implies RCU grace period, free
 	 * these elements directly, else use call_rcu() to wait for normal
 	 * progs to finish and finally do free_one() on each element.
+	 *
+	 * call_rcu_tasks_trace() enqueues to a global queue, so it's ok
+	 * that current cpu bpf_mem_cache != target bpf_mem_cache.
 	 */
 	call_rcu_tasks_trace(&c->rcu_ttrace, __free_rcu_tasks_trace);
 }
 
 static void free_bulk(struct bpf_mem_cache *c)
 {
+	struct bpf_mem_cache *tgt = c->tgt;
 	struct llist_node *llnode, *t;
 	unsigned long flags;
 	int cnt;
 
+	WARN_ON_ONCE(tgt->unit_size != c->unit_size);
+
 	do {
 		inc_active(c, &flags);
 		llnode = __llist_del_first(&c->free_llist);
@@ -331,13 +331,13 @@ static void free_bulk(struct bpf_mem_cache *c)
 			cnt = 0;
 		dec_active(c, flags);
 		if (llnode)
-			enque_to_free(c, llnode);
+			enque_to_free(tgt, llnode);
 	} while (cnt > (c->high_watermark + c->low_watermark) / 2);
 
 	/* and drain free_llist_extra */
 	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
-		enque_to_free(c, llnode);
-	do_call_rcu_ttrace(c);
+		enque_to_free(tgt, llnode);
+	do_call_rcu_ttrace(tgt);
 }
 
 static void bpf_mem_refill(struct irq_work *work)
@@ -436,6 +436,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
 			c->unit_size = unit_size;
 			c->objcg = objcg;
 			c->percpu_size = percpu_size;
+			c->tgt = c;
 			prefill_mem_cache(c, cpu);
 		}
 		ma->cache = pc;
@@ -458,6 +459,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
 			c = &cc->cache[i];
 			c->unit_size = sizes[i];
 			c->objcg = objcg;
+			c->tgt = c;
 			prefill_mem_cache(c, cpu);
 		}
 	}
@@ -476,7 +478,7 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 	 * Except for waiting_for_gp_ttrace list, there are no concurrent operations
 	 * on these lists, so it is safe to use __llist_del_all().
 	 */
-	free_all(__llist_del_all(&c->free_by_rcu_ttrace), percpu);
+	free_all(llist_del_all(&c->free_by_rcu_ttrace), percpu);
 	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
@@ -601,8 +603,10 @@ static void notrace *unit_alloc(struct bpf_mem_cache *c)
 	local_irq_save(flags);
 	if (local_inc_return(&c->active) == 1) {
 		llnode = __llist_del_first(&c->free_llist);
-		if (llnode)
+		if (llnode) {
 			cnt = --c->free_cnt;
+			*(struct bpf_mem_cache **)llnode = c;
+		}
 	}
 	local_dec(&c->active);
 	local_irq_restore(flags);
@@ -626,6 +630,12 @@ static void notrace unit_free(struct bpf_mem_cache *c, void *ptr)
 
 	BUILD_BUG_ON(LLIST_NODE_SZ > 8);
 
+	/*
+	 * Remember bpf_mem_cache that allocated this object.
+	 * The hint is not accurate.
+	 */
+	c->tgt = *(struct bpf_mem_cache **)llnode;
+
 	local_irq_save(flags);
 	if (local_inc_return(&c->active) == 1) {
 		__llist_add(llnode, &c->free_llist);
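The hint works because a bpf_mem_alloc object always carries an 8-byte hidden
header in front of the pointer returned to the user: while the object is free,
the header holds the llist_node linkage; while it is allocated, those bytes
are dead space that unit_alloc() can stamp with the owning cache (hence the
existing BUILD_BUG_ON(LLIST_NODE_SZ > 8)). A condensed sketch of the two
halves, with sketch_alloc()/sketch_free() as made-up names for the
unit_alloc()/unit_free() logic in the diff:

	static void *sketch_alloc(struct bpf_mem_cache *c)
	{
		struct llist_node *llnode = __llist_del_first(&c->free_llist);

		if (!llnode)
			return NULL;
		/* Stamp the hidden header with the owner before handing it out. */
		*(struct bpf_mem_cache **)llnode = c;
		return (void *)llnode + LLIST_NODE_SZ;	/* user data starts past the header */
	}

	static void sketch_free(struct bpf_mem_cache *c, void *ptr)
	{
		struct llist_node *llnode = ptr - LLIST_NODE_SZ;

		/* Read the stamp back. It may name another CPU's cache, or be
		 * stale, which is why it is only a hint: free_bulk() uses
		 * c->tgt to route memory back toward its home cache.
		 */
		c->tgt = *(struct bpf_mem_cache **)llnode;
		__llist_add(llnode, &c->free_llist);
	}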
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 93242c4b85e0..40524d9454c7 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -212,6 +212,15 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	if (i >= cnt)
 		return;
 
+	for (; i < cnt; i++) {
+		obj = llist_del_first(&c->waiting_for_gp_ttrace);
+		if (!obj)
+			break;
+		add_obj_to_free_list(c, obj);
+	}
+	if (i >= cnt)
+		return;
+
 	memcg = get_memcg(c);
 	old_memcg = set_active_memcg(memcg);
 	for (; i < cnt; i++) {
@@ -290,12 +299,7 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
 
 	llist_for_each_safe(llnode, t, llist_del_all(&c->free_by_rcu_ttrace))
-		/* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
-		 * It doesn't race with llist_del_all either.
-		 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
-		 * from __free_rcu() and from drain_mem_cache().
-		 */
-		__llist_add(llnode, &c->waiting_for_gp_ttrace);
+		llist_add(llnode, &c->waiting_for_gp_ttrace);
 
 	if (unlikely(READ_ONCE(c->draining))) {
 		__free_rcu(&c->rcu_ttrace);
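
With this change alloc_bulk() can pull objects off waiting_for_gp_ttrace while
__free_rcu() and drain_mem_cache() may be running llist_del_all() on the same
list, so the non-atomic __llist_add()/__llist_del_first() pair no longer
suffices; both ends must use the cmpxchg-based llist_add()/llist_del_first().
Below is a minimal userspace sketch of that lock-free discipline, assuming C11
atomics; it deliberately ignores the ABA hazard, which is why the kernel's
llist documentation still requires llist_del_first() callers to be serialized
against each other:

#include <stdatomic.h>
#include <stdio.h>

struct node {
	struct node *next;
	int val;
};

static _Atomic(struct node *) head;

/* analogous to llist_add(): safe against concurrent pushers and poppers */
static void push(struct node *n)
{
	struct node *first = atomic_load(&head);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&head, &first, n));
}

/* analogous to llist_del_first(), minus the ABA protection caveats */
static struct node *pop_first(void)
{
	struct node *first = atomic_load(&head);

	while (first && !atomic_compare_exchange_weak(&head, &first, first->next))
		;
	return first;
}

int main(void)
{
	struct node a = { .val = 1 }, b = { .val = 2 };

	push(&a);
	push(&b);
	printf("%d\n", pop_first()->val);	/* prints 2: LIFO order */
	return 0;
}

The __llist_* variants skip the cmpxchg loop and are only correct while all
access to the list is serialized, e.g. by irq work pinned to one CPU, which is
exactly the assumption the removed comment used to state.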
From patchwork Wed Jun 28 01:56:31 2023
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 10/13] rcu: Export rcu_request_urgent_qs_task()
Date: Tue, 27 Jun 2023 18:56:31 -0700
Message-Id: <20230628015634.33193-11-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>
References: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: "Paul E. McKenney"

If a CPU is executing a long series of non-sleeping system calls, RCU grace
periods can be delayed by on the order of a couple hundred milliseconds.
This is normally not a problem, but if each system call does a call_rcu(),
those callbacks can stack up. RCU will eventually notice this callback storm,
but using rcu_request_urgent_qs_task() allows the code invoking call_rcu()
to give RCU a heads up.

This function is not for general use, at least not yet.

Reported-by: Alexei Starovoitov
Signed-off-by: Paul E. McKenney
Signed-off-by: Alexei Starovoitov
---
 include/linux/rcutiny.h | 2 ++
 include/linux/rcutree.h | 1 +
 kernel/rcu/rcu.h        | 2 --
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7f17acf29dda..7b949292908a 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -138,6 +138,8 @@ static inline int rcu_needs_cpu(void)
 	return 0;
 }
 
+static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
+
 /*
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 56bccb5a8fde..126f6b418f6a 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -21,6 +21,7 @@ void rcu_softirq_qs(void);
 void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(void);
 void rcu_cpu_stall_reset(void);
+void rcu_request_urgent_qs_task(struct task_struct *t);
 
 /*
  * Note a virtualization-based context switch. This is simply a
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 4a1b9622598b..6f5fb3f7ebf3 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -493,7 +493,6 @@ static inline void rcu_expedite_gp(void) { }
 static inline void rcu_unexpedite_gp(void) { }
 static inline void rcu_async_hurry(void) { }
 static inline void rcu_async_relax(void) { }
-static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 bool rcu_gp_is_normal(void);     /* Internal RCU use. */
 bool rcu_gp_is_expedited(void);  /* Internal RCU use. */
@@ -508,7 +507,6 @@ void show_rcu_tasks_gp_kthreads(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 static inline void show_rcu_tasks_gp_kthreads(void) {}
 #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
-void rcu_request_urgent_qs_task(struct task_struct *t);
 #endif /* #else #ifdef CONFIG_TINY_RCU */
 
 #define RCU_SCHEDULER_INACTIVE	0
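
A later patch in this series (check_free_by_rcu() in patch 12) is the intended
caller: when a call_rcu() is already in flight and objects keep piling up, it
invokes rcu_request_urgent_qs_task(current) rather than queueing more
callbacks. A hedged kernel-style sketch of that call pattern, with a
hypothetical my_obj that is not part of the series:

#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/slab.h>

struct my_obj {
	struct rcu_head rcu;
	/* payload */
};

static void my_obj_free_cb(struct rcu_head *head)
{
	kfree(container_of(head, struct my_obj, rcu));
}

static void my_obj_free(struct my_obj *obj, bool storm_expected)
{
	call_rcu(&obj->rcu, my_obj_free_cb);

	/* When freeing in bulk, hint RCU to end the current grace period
	 * soon instead of letting callbacks pile up toward rcutree.qhimark.
	 */
	if (storm_expected)
		rcu_request_urgent_qs_task(current);
}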
From patchwork Wed Jun 28 01:56:32 2023
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 11/13] selftests/bpf: Improve test coverage of bpf_mem_alloc.
Date: Tue, 27 Jun 2023 18:56:32 -0700
Message-Id: <20230628015634.33193-12-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>
References: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

bpf_obj_new() calls bpf_mem_alloc(), but allocating and freeing only 8
elements does not trigger the watermark conditions in bpf_mem_alloc.
Increase to 200 elements to make sure alloc_bulk()/free_bulk() are exercised.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 tools/testing/selftests/bpf/progs/linked_list.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/progs/linked_list.c b/tools/testing/selftests/bpf/progs/linked_list.c
index 57440a554304..84d1777a9e6c 100644
--- a/tools/testing/selftests/bpf/progs/linked_list.c
+++ b/tools/testing/selftests/bpf/progs/linked_list.c
@@ -96,7 +96,7 @@ static __always_inline
 int list_push_pop_multiple(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map)
 {
 	struct bpf_list_node *n;
-	struct foo *f[8], *pf;
+	struct foo *f[200], *pf;
 	int i;
 
 	/* Loop following this check adds nodes 2-at-a-time in order to
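
The watermarks in question are per size class, so a fixed count as small as 8
can stay inside the band indefinitely and neither bulk path ever runs. The
trigger logic, condensed as a sketch from bpf_mem_refill() in
kernel/bpf/memalloc.c (the same function the next patch extends):

	cnt = c->free_cnt;
	if (cnt < c->low_watermark)
		/* irq work refills the per-cpu cache in bulk */
		alloc_bulk(c, c->batch, NUMA_NO_NODE);
	else if (cnt > c->high_watermark)
		/* and drains it in bulk on the way back down */
		free_bulk(c);

With 200 elements the free_llist count crosses both watermarks, so
alloc_bulk() and free_bulk() are actually exercised.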
From patchwork Wed Jun 28 01:56:33 2023
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 12/13] bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().
Date: Tue, 27 Jun 2023 18:56:33 -0700
Message-Id: <20230628015634.33193-13-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>
References: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce bpf_mem_[cache_]free_rcu() similar to kfree_rcu().
Unlike bpf_mem_[cache_]free(), which links objects for immediate reuse into
a per-cpu free list, the _rcu() flavor waits for an RCU grace period and then
moves objects onto the free_by_rcu_ttrace list, where they wait for an RCU
tasks trace grace period before being freed into slab.

The life cycle of objects:

alloc: dequeue free_llist
free: enqueue free_llist
free_rcu: enqueue free_by_rcu -> waiting_for_gp

free_llist above high watermark -> free_by_rcu_ttrace
after RCU GP: waiting_for_gp -> free_by_rcu_ttrace
free_by_rcu_ttrace -> waiting_for_gp_ttrace -> slab

Signed-off-by: Alexei Starovoitov
---
 include/linux/bpf_mem_alloc.h |   2 +
 kernel/bpf/memalloc.c         | 129 +++++++++++++++++++++++++++++++++-
 2 files changed, 128 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf_mem_alloc.h b/include/linux/bpf_mem_alloc.h
index 3929be5743f4..d644bbb298af 100644
--- a/include/linux/bpf_mem_alloc.h
+++ b/include/linux/bpf_mem_alloc.h
@@ -27,10 +27,12 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma);
 /* kmalloc/kfree equivalent: */
 void *bpf_mem_alloc(struct bpf_mem_alloc *ma, size_t size);
 void bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr);
+void bpf_mem_free_rcu(struct bpf_mem_alloc *ma, void *ptr);
 
 /* kmem_cache_alloc/free equivalent: */
 void *bpf_mem_cache_alloc(struct bpf_mem_alloc *ma);
 void bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr);
+void bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr);
 void bpf_mem_cache_raw_free(void *ptr);
 void *bpf_mem_cache_alloc_flags(struct bpf_mem_alloc *ma, gfp_t flags);
 
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 40524d9454c7..3081d06a434c 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -101,6 +101,15 @@ struct bpf_mem_cache {
 	bool draining;
 	struct bpf_mem_cache *tgt;
 
+	/* list of objects to be freed after RCU GP */
+	struct llist_head free_by_rcu;
+	struct llist_node *free_by_rcu_tail;
+	struct llist_head waiting_for_gp;
+	struct llist_node *waiting_for_gp_tail;
+	struct rcu_head rcu;
+	atomic_t call_rcu_in_progress;
+	struct llist_head free_llist_extra_rcu;
+
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
 	struct llist_head waiting_for_gp_ttrace;
@@ -344,6 +353,69 @@ static void free_bulk(struct bpf_mem_cache *c)
 	do_call_rcu_ttrace(tgt);
 }
 
+static void __free_by_rcu(struct rcu_head *head)
+{
+	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+	struct bpf_mem_cache *tgt = c->tgt;
+	struct llist_node *llnode;
+
+	llnode = llist_del_all(&c->waiting_for_gp);
+	if (!llnode)
+		goto out;
+
+	llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace);
+
+	/* Objects went through regular RCU GP. Send them to RCU tasks trace */
+	do_call_rcu_ttrace(tgt);
+out:
+	atomic_set(&c->call_rcu_in_progress, 0);
+}
+
+static void check_free_by_rcu(struct bpf_mem_cache *c)
+{
+	struct llist_node *llnode, *t;
+	unsigned long flags;
+
+	/* drain free_llist_extra_rcu */
+	if (unlikely(!llist_empty(&c->free_llist_extra_rcu))) {
+		inc_active(c, &flags);
+		llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra_rcu))
+			if (__llist_add(llnode, &c->free_by_rcu))
+				c->free_by_rcu_tail = llnode;
+		dec_active(c, flags);
+	}
+
+	if (llist_empty(&c->free_by_rcu))
+		return;
+
+	if (atomic_xchg(&c->call_rcu_in_progress, 1)) {
+		/*
+		 * Instead of kmalloc-ing new rcu_head and triggering 10k
+		 * call_rcu() to hit rcutree.qhimark and force RCU to notice
+		 * the overload just ask RCU to hurry up. There could be many
+		 * objects in free_by_rcu list.
+		 * This hint reduces memory consumption for an artificial
+		 * benchmark from 2 Gbyte to 150 Mbyte.
+		 */
+		rcu_request_urgent_qs_task(current);
+		return;
+	}
+
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
+
+	inc_active(c, &flags);
+	WRITE_ONCE(c->waiting_for_gp.first, __llist_del_all(&c->free_by_rcu));
+	c->waiting_for_gp_tail = c->free_by_rcu_tail;
+	dec_active(c, flags);
+
+	if (unlikely(READ_ONCE(c->draining))) {
+		free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
+		atomic_set(&c->call_rcu_in_progress, 0);
+	} else {
+		call_rcu_hurry(&c->rcu, __free_by_rcu);
+	}
+}
+
 static void bpf_mem_refill(struct irq_work *work)
 {
 	struct bpf_mem_cache *c = container_of(work, struct bpf_mem_cache, refill_work);
@@ -358,6 +430,8 @@ static void bpf_mem_refill(struct irq_work *work)
 		alloc_bulk(c, c->batch, NUMA_NO_NODE);
 	else if (cnt > c->high_watermark)
 		free_bulk(c);
+
+	check_free_by_rcu(c);
 }
 
 static void notrace irq_work_raise(struct bpf_mem_cache *c)
@@ -486,6 +560,9 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
+	free_all(__llist_del_all(&c->free_by_rcu), percpu);
+	free_all(__llist_del_all(&c->free_llist_extra_rcu), percpu);
+	free_all(llist_del_all(&c->waiting_for_gp), percpu);
 }
 
 static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
@@ -498,8 +575,8 @@ static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
 
 static void free_mem_alloc(struct bpf_mem_alloc *ma)
 {
-	/* waiting_for_gp_ttrace lists was drained, but __free_rcu might
-	 * still execute. Wait for it now before we freeing percpu caches.
+	/* waiting_for_gp[_ttrace] lists were drained, but RCU callbacks
+	 * might still execute. Wait for them.
 	 *
 	 * rcu_barrier_tasks_trace() doesn't imply synchronize_rcu_tasks_trace(),
 	 * but rcu_barrier_tasks_trace() and rcu_barrier() below are only used
@@ -508,7 +585,8 @@ static void free_mem_alloc(struct bpf_mem_alloc *ma)
 	 * rcu_trace_implies_rcu_gp(), it will be OK to skip rcu_barrier() by
 	 * using rcu_trace_implies_rcu_gp() as well.
 	 */
-	rcu_barrier_tasks_trace();
+	rcu_barrier(); /* wait for __free_by_rcu */
+	rcu_barrier_tasks_trace(); /* wait for __free_rcu */
 	if (!rcu_trace_implies_rcu_gp())
 		rcu_barrier();
 	free_mem_alloc_no_barrier(ma);
@@ -561,6 +639,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
 			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
+			rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
 		}
 		/* objcg is the same across cpus */
 		if (c->objcg)
@@ -577,6 +656,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 				irq_work_sync(&c->refill_work);
 				drain_mem_cache(c);
 				rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
+				rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
 			}
 		}
 		if (c->objcg)
@@ -661,6 +741,27 @@ static void notrace unit_free(struct bpf_mem_cache *c, void *ptr)
 	irq_work_raise(c);
 }
 
+static void notrace unit_free_rcu(struct bpf_mem_cache *c, void *ptr)
+{
+	struct llist_node *llnode = ptr - LLIST_NODE_SZ;
+	unsigned long flags;
+
+	c->tgt = *(struct bpf_mem_cache **)llnode;
+
+	local_irq_save(flags);
+	if (local_inc_return(&c->active) == 1) {
+		if (__llist_add(llnode, &c->free_by_rcu))
+			c->free_by_rcu_tail = llnode;
+	} else {
+		llist_add(llnode, &c->free_llist_extra_rcu);
+	}
+	local_dec(&c->active);
+	local_irq_restore(flags);
+
+	if (!atomic_read(&c->call_rcu_in_progress))
+		irq_work_raise(c);
+}
+
 /* Called from BPF program or from sys_bpf syscall.
  * In both cases migration is disabled.
  */
@@ -694,6 +795,20 @@ void notrace bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr)
 	unit_free(this_cpu_ptr(ma->caches)->cache + idx, ptr);
 }
 
+void notrace bpf_mem_free_rcu(struct bpf_mem_alloc *ma, void *ptr)
+{
+	int idx;
+
+	if (!ptr)
+		return;
+
+	idx = bpf_mem_cache_idx(ksize(ptr - LLIST_NODE_SZ));
+	if (idx < 0)
+		return;
+
+	unit_free_rcu(this_cpu_ptr(ma->caches)->cache + idx, ptr);
+}
+
 void notrace *bpf_mem_cache_alloc(struct bpf_mem_alloc *ma)
 {
 	void *ret;
@@ -710,6 +825,14 @@ void notrace bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr)
 	unit_free(this_cpu_ptr(ma->cache), ptr);
 }
 
+void notrace bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr)
+{
+	if (!ptr)
+		return;
+
+	unit_free_rcu(this_cpu_ptr(ma->cache), ptr);
+}
+
 /* Directly does a kfree() without putting 'ptr' back to the free_llist
  * for reuse and without waiting for a rcu_tasks_trace gp.
  * The caller must first go through the rcu_tasks_trace gp for 'ptr'
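
For callers the contract is the point: after bpf_mem_[cache_]free_rcu() the
object may still be read by RCU-protected readers until a regular RCU grace
period elapses, and the memory only returns to slab after the subsequent RCU
tasks trace grace period. A hedged usage sketch; struct and function names
here are hypothetical, not from the series:

#include <linux/bpf_mem_alloc.h>
#include <linux/types.h>

struct my_elem {
	struct my_elem *next;
	u64 key;
};

static struct bpf_mem_alloc my_ma;	/* assumed bpf_mem_alloc_init()'ed elsewhere */

static void my_elem_delete(struct my_elem *e)
{
	/* Readers under rcu_read_lock() (or a BPF program in an RCU
	 * tasks trace section) may still dereference 'e', so its slot
	 * must not be handed out again immediately:
	 */
	bpf_mem_free_rcu(&my_ma, e);

	/* bpf_mem_free(&my_ma, e) would allow immediate reuse, which is
	 * only safe when no reader can still hold a pointer to 'e'.
	 */
}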
From patchwork Wed Jun 28 01:56:34 2023
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com,
    houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v3 bpf-next 13/13] bpf: Convert bpf_cpumask to bpf_mem_cache_free_rcu.
Date: Tue, 27 Jun 2023 18:56:34 -0700
Message-Id: <20230628015634.33193-14-alexei.starovoitov@gmail.com>
In-Reply-To: <20230628015634.33193-1-alexei.starovoitov@gmail.com>
References: <20230628015634.33193-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Convert bpf_cpumask to bpf_mem_cache_free_rcu.
Note that the migrate_disable() in bpf_cpumask_release() is still necessary,
since bpf_cpumask_release() is a dtor. bpf_obj_free_fields() can be converted
to do migrate_disable() there in a follow-up.

Signed-off-by: Alexei Starovoitov
Acked-by: David Vernet
---
 kernel/bpf/cpumask.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)
diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
index 938a60ff4295..6983af8e093c 100644
--- a/kernel/bpf/cpumask.c
+++ b/kernel/bpf/cpumask.c
@@ -9,7 +9,6 @@
 /**
  * struct bpf_cpumask - refcounted BPF cpumask wrapper structure
  * @cpumask: The actual cpumask embedded in the struct.
- * @rcu: The RCU head used to free the cpumask with RCU safety.
  * @usage: Object reference counter. When the refcount goes to 0, the
  *	   memory is released back to the BPF allocator, which provides
  *	   RCU safety.
@@ -25,7 +24,6 @@
  */
 struct bpf_cpumask {
 	cpumask_t cpumask;
-	struct rcu_head rcu;
 	refcount_t usage;
 };
 
@@ -82,16 +80,6 @@ __bpf_kfunc struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
 	return cpumask;
 }
 
-static void cpumask_free_cb(struct rcu_head *head)
-{
-	struct bpf_cpumask *cpumask;
-
-	cpumask = container_of(head, struct bpf_cpumask, rcu);
-	migrate_disable();
-	bpf_mem_cache_free(&bpf_cpumask_ma, cpumask);
-	migrate_enable();
-}
-
 /**
  * bpf_cpumask_release() - Release a previously acquired BPF cpumask.
  * @cpumask: The cpumask being released.
@@ -102,8 +90,12 @@ static void cpumask_free_cb(struct rcu_head *head)
  */
 __bpf_kfunc void bpf_cpumask_release(struct bpf_cpumask *cpumask)
 {
-	if (refcount_dec_and_test(&cpumask->usage))
-		call_rcu(&cpumask->rcu, cpumask_free_cb);
+	if (!refcount_dec_and_test(&cpumask->usage))
+		return;
+
+	migrate_disable();
+	bpf_mem_cache_free_rcu(&bpf_cpumask_ma, cpumask);
+	migrate_enable();
 }
 
 /**
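
The net effect of this conversion is a template for other dtors: drop the
per-object rcu_head and callback, and let the allocator provide the RCU
deferral. A hedged sketch of that pattern for a hypothetical refcounted
object foo that is not part of the series:

#include <linux/bpf_mem_alloc.h>
#include <linux/preempt.h>	/* migrate_disable()/migrate_enable() */
#include <linux/refcount.h>

struct foo {
	refcount_t usage;
	/* no struct rcu_head needed any more */
};

static struct bpf_mem_alloc foo_ma;	/* assumed bpf_mem_alloc_init()'ed elsewhere */

static void foo_release(struct foo *f)
{
	if (!refcount_dec_and_test(&f->usage))
		return;

	/* bpf_mem_cache_free_rcu() defers reuse past an RCU grace period,
	 * replacing the call_rcu() + callback pair this patch removes.
	 */
	migrate_disable();
	bpf_mem_cache_free_rcu(&foo_ma, f);
	migrate_enable();
}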