From patchwork Thu Jul 6 03:34:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13303128 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7776EB64DD for ; Thu, 6 Jul 2023 03:34:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232832AbjGFDe6 (ORCPT ); Wed, 5 Jul 2023 23:34:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58922 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232426AbjGFDe5 (ORCPT ); Wed, 5 Jul 2023 23:34:57 -0400 Received: from mail-oi1-x229.google.com (mail-oi1-x229.google.com [IPv6:2607:f8b0:4864:20::229]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D6B31BC8; Wed, 5 Jul 2023 20:34:56 -0700 (PDT) Received: by mail-oi1-x229.google.com with SMTP id 5614622812f47-392116ae103so331073b6e.0; Wed, 05 Jul 2023 20:34:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614495; x=1691206495; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hje/5uEwvmDjaVx273wRGE5gz7pxgdInq/+uIjr8Qv0=; b=ryyu70lnoK5mYFDNZw2EWe8ZIoTdVdhoDqJLay/AY6Yp1/IaxHKKmtmRqwTjHUUSML SsVVzfkYMFFQljA6eLtq9QWRK8vQtDa6VEH2cHf4+kIJynrq+LRILSpMsxN5O92rot4p FNlUiPIKHGePXzYv8Ua0QKOj+VbnMBdjmpTvL//4gOCcJs9Aj+GeCkXYD2OsAOSfzjXN e39itH3e3dqluT5lCKCui5sYe/9RktFjojrSWrLPby2G9n+9hmIFJ4n88L8qBhzg2M2x Jr9AsgBGmtEQ6yIxZWm1jFhQSql/HUPglxTa4EaQimr9IdEjw+Ib7QnNY8r6UQch8Vsr EqTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614495; x=1691206495; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hje/5uEwvmDjaVx273wRGE5gz7pxgdInq/+uIjr8Qv0=; b=JDt6fUSgNeUzFpkjRnzRBACJFvEHTRySlZUMdEElS9ecG0F2ZInxuDH1Syd3dHGfHB Nc1WPLckeqpsOrSL17NEIFk+Lc4+VBFPcUZ+s9HtXuIEnIeMUtQ/81FtJ6izxpccqsyF HLtfmCgTeuzP4uuYC/Fi2zCgSTor4C2sYlPowtvEkY/6gwyMRQOKWy5B/sI04C0hf24l jo+UjHD0JZTlt+R1ghsaORFBwQTNtML21kILQIqDiqyBejzvnDr7Y5F0hN1YAiCiGdPp p6RtTRCMA4A+9seXhXYx17eIdubalKHZmomqN+T+5owlE03HXywIBf8lDtWALgNBIZXR 9CJg== X-Gm-Message-State: ABy/qLZoPmHFivSrzERSkmCT5cedLHTnEB35HPQM+q+ay1uFXm0qFWkr 1f9Ckklfo6Rz1hZG+51XeWw= X-Google-Smtp-Source: APBJJlHlwYQwHnza+F4L21TGixzAGXQSLKFppF6WlNVcHYMwg7p/EtMZHOc7LA6ghI/RWonAYhRqZg== X-Received: by 2002:a05:6808:1993:b0:3a1:d504:f245 with SMTP id bj19-20020a056808199300b003a1d504f245mr556761oib.37.1688614495351; Wed, 05 Jul 2023 20:34:55 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id y17-20020a170902b49100b001b8622c1ad2sm219596plr.130.2023.07.05.20.34.53 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:34:54 -0700 (PDT) From: Alexei Starovoitov To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v4 bpf-next 01/14] bpf: Rename few bpf_mem_alloc fields. 
Date: Wed, 5 Jul 2023 20:34:34 -0700
Message-Id: <20230706033447.54696-2-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Rename:
-	struct rcu_head rcu;
-	struct llist_head free_by_rcu;
-	struct llist_head waiting_for_gp;
-	atomic_t call_rcu_in_progress;
+	struct llist_head free_by_rcu_ttrace;
+	struct llist_head waiting_for_gp_ttrace;
+	struct rcu_head rcu_ttrace;
+	atomic_t call_rcu_ttrace_in_progress;
...
-	static void do_call_rcu(struct bpf_mem_cache *c)
+	static void do_call_rcu_ttrace(struct bpf_mem_cache *c)

to better indicate intended use.

The 'tasks trace' is shortened to 'ttrace' to reduce verbosity.
No functional changes.

Later patches will add free_by_rcu/waiting_for_gp fields to be used with normal RCU.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 57 ++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 28 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 0668bcd7c926..cc5b8adb4c83 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -99,10 +99,11 @@ struct bpf_mem_cache {
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
 
-	struct rcu_head rcu;
-	struct llist_head free_by_rcu;
-	struct llist_head waiting_for_gp;
-	atomic_t call_rcu_in_progress;
+	/* list of objects to be freed after RCU tasks trace GP */
+	struct llist_head free_by_rcu_ttrace;
+	struct llist_head waiting_for_gp_ttrace;
+	struct rcu_head rcu_ttrace;
+	atomic_t call_rcu_ttrace_in_progress;
 };
 
 struct bpf_mem_caches {
@@ -165,18 +166,18 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	old_memcg = set_active_memcg(memcg);
 	for (i = 0; i < cnt; i++) {
 		/*
-		 * free_by_rcu is only manipulated by irq work refill_work().
+		 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
 		 * IRQ works on the same CPU are called sequentially, so it is
 		 * safe to use __llist_del_first() here. If alloc_bulk() is
 		 * invoked by the initial prefill, there will be no running
 		 * refill_work(), so __llist_del_first() is fine as well.
 		 *
-		 * In most cases, objects on free_by_rcu are from the same CPU.
+		 * In most cases, objects on free_by_rcu_ttrace are from the same CPU.
 		 * If some objects come from other CPUs, it doesn't incur any
 		 * harm because NUMA_NO_NODE means the preference for current
 		 * numa node and it is not a guarantee.
 		 */
-		obj = __llist_del_first(&c->free_by_rcu);
+		obj = __llist_del_first(&c->free_by_rcu_ttrace);
 		if (!obj) {
 			/* Allocate, but don't deplete atomic reserves that typical
 			 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
@@ -232,10 +233,10 @@ static void free_all(struct llist_node *llnode, bool percpu)
 
 static void __free_rcu(struct rcu_head *head)
 {
-	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu_ttrace);
 
-	free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
-	atomic_set(&c->call_rcu_in_progress, 0);
+	free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+	atomic_set(&c->call_rcu_ttrace_in_progress, 0);
 }
 
 static void __free_rcu_tasks_trace(struct rcu_head *head)
@@ -254,32 +255,32 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
 	struct llist_node *llnode = obj;
 
 	/* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
-	 * Nothing races to add to free_by_rcu list.
+	 * Nothing races to add to free_by_rcu_ttrace list.
 	 */
-	__llist_add(llnode, &c->free_by_rcu);
+	__llist_add(llnode, &c->free_by_rcu_ttrace);
 }
 
-static void do_call_rcu(struct bpf_mem_cache *c)
+static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 {
 	struct llist_node *llnode, *t;
 
-	if (atomic_xchg(&c->call_rcu_in_progress, 1))
+	if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1))
 		return;
 
-	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
-	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu))
-		/* There is no concurrent __llist_add(waiting_for_gp) access.
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
+	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu_ttrace))
+		/* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
 		 * It doesn't race with llist_del_all either.
-		 * But there could be two concurrent llist_del_all(waiting_for_gp):
+		 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
 		 * from __free_rcu() and from drain_mem_cache().
 		 */
-		__llist_add(llnode, &c->waiting_for_gp);
+		__llist_add(llnode, &c->waiting_for_gp_ttrace);
 	/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
 	 * If RCU Tasks Trace grace period implies RCU grace period, free
 	 * these elements directly, else use call_rcu() to wait for normal
 	 * progs to finish and finally do free_one() on each element.
 	 */
-	call_rcu_tasks_trace(&c->rcu, __free_rcu_tasks_trace);
+	call_rcu_tasks_trace(&c->rcu_ttrace, __free_rcu_tasks_trace);
 }
 
 static void free_bulk(struct bpf_mem_cache *c)
@@ -307,7 +308,7 @@ static void free_bulk(struct bpf_mem_cache *c)
 	/* and drain free_llist_extra */
 	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
 		enque_to_free(c, llnode);
-	do_call_rcu(c);
+	do_call_rcu_ttrace(c);
 }
 
 static void bpf_mem_refill(struct irq_work *work)
@@ -441,13 +442,13 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 
 	/* No progs are using this bpf_mem_cache, but htab_map_free() called
 	 * bpf_mem_cache_free() for all remaining elements and they can be in
-	 * free_by_rcu or in waiting_for_gp lists, so drain those lists now.
+	 * free_by_rcu_ttrace or in waiting_for_gp_ttrace lists, so drain those lists now.
 	 *
-	 * Except for waiting_for_gp list, there are no concurrent operations
+	 * Except for waiting_for_gp_ttrace list, there are no concurrent operations
 	 * on these lists, so it is safe to use __llist_del_all().
 	 */
-	free_all(__llist_del_all(&c->free_by_rcu), percpu);
-	free_all(llist_del_all(&c->waiting_for_gp), percpu);
+	free_all(__llist_del_all(&c->free_by_rcu_ttrace), percpu);
+	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
 }
@@ -462,7 +463,7 @@ static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
 
 static void free_mem_alloc(struct bpf_mem_alloc *ma)
 {
-	/* waiting_for_gp lists was drained, but __free_rcu might
+	/* waiting_for_gp_ttrace lists was drained, but __free_rcu might
 	 * still execute. Wait for it now before we freeing percpu caches.
 	 *
 	 * rcu_barrier_tasks_trace() doesn't imply synchronize_rcu_tasks_trace(),
@@ -535,7 +536,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 			 */
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
-			rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
+			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
 		}
 		/* objcg is the same across cpus */
 		if (c->objcg)
@@ -550,7 +551,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 				c = &cc->cache[i];
 				irq_work_sync(&c->refill_work);
 				drain_mem_cache(c);
-				rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
+				rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
 			}
 		}
 		if (c->objcg)

From patchwork Thu Jul 6 03:34:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303129
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4314DEB64DD for ; Thu, 6 Jul 2023 03:35:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232690AbjGFDfC (ORCPT ); Wed, 5 Jul 2023 23:35:02 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58948 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232426AbjGFDfA (ORCPT ); Wed, 5 Jul 2023 23:35:00 -0400
Received: from mail-ot1-x332.google.com (mail-ot1-x332.google.com [IPv6:2607:f8b0:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F261F198E; Wed, 5 Jul 2023 20:34:59 -0700 (PDT)
Received: by mail-ot1-x332.google.com with SMTP id 46e09a7af769-6b708b97418so233031a34.3; Wed, 05 Jul 2023 20:34:59 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614499; x=1691206499; h=content-transfer-encoding:mime-version:references:in-reply-to
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=nhOq2nzPha5tdwWCWLTN4pqz4R1IbYNeZFnIcmPu3ks=; b=Uj7/BR1Q5MVH1+yL4UGWosf1Y3f0xKVrm1NT5GRNciKarBbr8jDnYRP4/s/FlyMZkl swR73/VNgq4QCXdqjThlOZjRNFQaLkrZQyirSAoxC44D9YVnYTjDJ1D4euhWHLkPpMIj PrAcvdt7weLWbOBtmtevPW8op/IKgFRmuvGj8UPQI+SQAsUU2jcm9sWvasqCKfQ54nhv slNEFSJxRli28N5Cgx9rokQcsPLogF2JQJ3jjhSeNSUr1hKu7aWl29eXgE5b5+FbHp4p GsoXR+GoOCnb3hkom8pBZxuUPpsYEGKatEQkB/HuHMa1q6jIWrA4EvLtl5U4mrdmg6Qj 5hBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614499; x=1691206499; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nhOq2nzPha5tdwWCWLTN4pqz4R1IbYNeZFnIcmPu3ks=; b=JR1iRdnAGMJvQPhmi4Ge0VbIOObgFvKBz74FifsacsiOCxT1bRee7+pL9C31PYJ312 y3XzieSsNSSWjNEgzAWMt9u7bmdru5ggtiluFm109VeCfepCd1OenIFfHQF3ZEVnyR21 +/64iCeR86ZwuqjNKRFutp1qgBFTOTjOw1rbZqRvkpR3CJBgpui1ivmAXT2JF27/6gl+ vsNMqCFt9M/J9vXZOh24pX0PsDoXETePati8Din4TE1gNYfAaP0vsDDfqpXZnjPMLeVq gMWDez5c48tssDc48JiP7dnttxa9WDwa9XeQih7QUL8X7x2Zq1zpwySAsCWY/vnTrHBm /9Vg== X-Gm-Message-State: ABy/qLYaLck/2oXISXtehZACB1vS6JybIXQaLL6Okjy8QvXjdM3UgmBn LGsPdTP53cFHnI3pmjeLAwo= X-Google-Smtp-Source: APBJJlF5gF6mjB+gQ4w/G/XlbPW1bbTtwwRoVCSiuv8ke/BdS5mINXdqW7Fv5FndoeaPv8nvws629A== X-Received: by 2002:a05:6870:1f8f:b0:1b0:1e3f:1369 with SMTP id go15-20020a0568701f8f00b001b01e3f1369mr1399435oac.57.1688614499216; Wed, 05 Jul 2023 20:34:59 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id v24-20020a631518000000b0053051d50a48sm268825pgl.79.2023.07.05.20.34.57 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:34:58 -0700 (PDT) From: Alexei Starovoitov To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org Cc: tj@kernel.org, rcu@vger.kernel.org, 
 netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 02/14] bpf: Simplify code of destroy_mem_alloc() with kmemdup().
Date: Wed, 5 Jul 2023 20:34:35 -0700
Message-Id: <20230706033447.54696-3-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Use kmemdup() to simplify the code.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index cc5b8adb4c83..b0011217be6c 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -499,7 +499,7 @@ static void destroy_mem_alloc(struct bpf_mem_alloc *ma, int rcu_in_progress)
 		return;
 	}
 
-	copy = kmalloc(sizeof(*ma), GFP_KERNEL);
+	copy = kmemdup(ma, sizeof(*ma), GFP_KERNEL);
 	if (!copy) {
 		/* Slow path with inline barrier-s */
 		free_mem_alloc(ma);
@@ -507,10 +507,7 @@ static void destroy_mem_alloc(struct bpf_mem_alloc *ma, int rcu_in_progress)
 	}
 
 	/* Defer barriers into worker to let the rest of map memory to be freed */
-	copy->cache = ma->cache;
-	ma->cache = NULL;
-	copy->caches = ma->caches;
-	ma->caches = NULL;
+	memset(ma, 0, sizeof(*ma));
 	INIT_WORK(&copy->work, free_mem_alloc_deferred);
 	queue_work(system_unbound_wq, &copy->work);
 }

From patchwork Thu Jul 6 03:34:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303130
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 443A3EB64DA for ; Thu, 6 Jul 2023
03:35:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232801AbjGFDfG (ORCPT ); Wed, 5 Jul 2023 23:35:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232700AbjGFDfE (ORCPT ); Wed, 5 Jul 2023 23:35:04 -0400 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 01E221BCA; Wed, 5 Jul 2023 20:35:03 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1b80b343178so625325ad.0; Wed, 05 Jul 2023 20:35:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614503; x=1691206503; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=JwcmYs5zpVjZuQmezI+xbvaGS8SBxqFvKsz2XGqKKoY=; b=Xh6FFBG8SnsEBV9Bo1mDrlVJg8Yr6pVxMawnKWc4C2kF2KzaYlls5r4US2QZrD9t7l YQrjR4rqzuqPpvYmDHND2WOsG4W0F6SP+sOxbsME9tkIzoi7QpldMUI7Sf58G8JEZ5kk tXALQUfg+4uFGiyTkEs8I7LBBstu8LaH98dacjgLn/vVknKF4ziIFZ/BiclGuelbRmLs wakfAci2kO8vkWEKpmWjeS9JsL1gGjdIwBkg6bikYWhtgpU1AaHe9RHcBFu87tgFThLE LanhBDOGZI+4HZMo1M7svexLo/uMAFODtPeN9//EglfGc9yDUPR+jRFbcU8VtOEha6Ra NEUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614503; x=1691206503; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JwcmYs5zpVjZuQmezI+xbvaGS8SBxqFvKsz2XGqKKoY=; b=Ivrqpu5lwrwBwu7A8PEP7h2ImMJg6Gc6zI391LdwCLSxGOiUJ9zOYwPlSjGawqLOia zotOkXdyXRw45i2eGuPbzFNdm6tYhSmvOJCwBcMQFPYUuOdybr35mMc8X89lxTV5z9Og 3yZ60h1H9nPQNrbZZnIJXK25QRp9jOMcSqyJlOQ3iA0j3xQNjoMUnJ5CKY1x5M4UfHwl uTPf6yXMxCRkIAFUwRByP5Z5qo5yFCr36G7qWTXby729Z4eAQ6dgyCY3UTspk9M8IOch 
 rMGVp5xX3oSfGY98hT83+L2aadGuj2C/pYz1pah3FC8CszlGeWxFa7+x1mfe+xvDk4/j 1WYA==
X-Gm-Message-State: ABy/qLa+WfHY69mEPXfqzerGYfrxu2NS2MIZn2xsRzN1cf6f5ZF1ou3h h0oO6QSM81uqvjqprLJKrVaVWrqZUOQ=
X-Google-Smtp-Source: APBJJlG0lZp8369hXp7zzA3vLEN8NovtvLqT6YlpqDUdUjdLAhuxgGIqC9NS5DfeMKZpml7yiyujUA==
X-Received: by 2002:a05:6a20:4429:b0:121:7454:be2a with SMTP id ce41-20020a056a20442900b001217454be2amr641953pzb.45.1688614503296; Wed, 05 Jul 2023 20:35:03 -0700 (PDT)
Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id b5-20020a170902d50500b001aae625e422sm227108plg.37.2023.07.05.20.35.01 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:02 -0700 (PDT)
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 03/14] bpf: Let free_all() return the number of freed elements.
Date: Wed, 5 Jul 2023 20:34:36 -0700
Message-Id: <20230706033447.54696-4-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Let free_all() helper return the number of freed elements.
It's not used in this patch, but helps in debug/development of bpf_mem_alloc.

For example this diff for __free_rcu():
-	free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
+	printk("cpu %d freed %d objs after tasks trace\n", raw_smp_processor_id(),
+		free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size));

would show how busy RCU tasks trace is.
In an artificial benchmark where one cpu is allocating and a different cpu
is freeing, the RCU tasks trace won't be able to keep up and the list of
objects would keep growing from thousands to millions, eventually OOMing.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index b0011217be6c..693651d2648b 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -223,12 +223,16 @@ static void free_one(void *obj, bool percpu)
 	kfree(obj);
 }
 
-static void free_all(struct llist_node *llnode, bool percpu)
+static int free_all(struct llist_node *llnode, bool percpu)
 {
 	struct llist_node *pos, *t;
+	int cnt = 0;
 
-	llist_for_each_safe(pos, t, llnode)
+	llist_for_each_safe(pos, t, llnode) {
 		free_one(pos, percpu);
+		cnt++;
+	}
+	return cnt;
 }
 
 static void __free_rcu(struct rcu_head *head)

From patchwork Thu Jul 6 03:34:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303131
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6880EB64DA for ; Thu, 6 Jul 2023 03:35:10 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232426AbjGFDfJ (ORCPT ); Wed, 5 Jul 2023 23:35:09 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59000 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232730AbjGFDfI (ORCPT ); Wed, 5 Jul 2023 23:35:08 -0400
Received: from mail-pl1-x62d.google.com (mail-pl1-x62d.google.com [IPv6:2607:f8b0:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B40A21BC8; Wed, 5 Jul 2023 20:35:07 -0700 (PDT)
Received: by mail-pl1-x62d.google.com
with SMTP id d9443c01a7336-1b8baa836a5so1004355ad.1; Wed, 05 Jul 2023 20:35:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614507; x=1691206507; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=mKB1+Lkjs9jIhLzRZegSs16t8bNoVyzYLNE9YXmjTOA=; b=OQuy/g4+r38t4lYd6j3gDF0yZYv4Vo3p0IcIfcobeQ6Teudq12lnIeNA0mes9Vk3bU /VR1M5rbRjKP5xT8thjbKOAhQcA6SOY+bYH7c3r2bNvP0lw0DPhCR02yh1inWgRFHZfJ 50iAqWIdmNvxVplcCKd00Rf41IWXKUpj5pzbMqf4HDwzm6uEBnvLq5/iW4zcT6Ziftve fwq+EkVqY1NAMnOweTR7boh9KNg0eYRDa/jffTMhaQbzZEtpchStNDyN1/XaQDtYHQCc YWLQ/ic++zCNrtHv7RXxDAoybp6wt+IjX9+wK98KrDMA49Qla76d48OltRj/Yg1+6WMy iTBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614507; x=1691206507; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mKB1+Lkjs9jIhLzRZegSs16t8bNoVyzYLNE9YXmjTOA=; b=WT2kCaGcaFF/+lCX+VfqRMuRJBZ3cH9SUmXXQ4IH3aZK553D3A42r07+WdnnMpoT6S NhTccv7B8e2YcYJ70Acbfi8o8Ew9hKIBC5ktMcnYc63E6VW10Ap0wXkVzmftJ6TlRB9Q rRJ4RxQqImgq4jxKsEDYcCrryIF7CuN+lBhTOJbFFS81r/WnhPRgXgjfof9tamCDdnuL dqE+T07M+C/piG1i/6iUMuOO3XgXk/IMN5ZRDr9iQjW1dx7+n3EUHfpedGuhQCtmdVue GujahiDpWrLCNMMEGfNYQRlpftrCCSOGE6S4bjVKnGvUK1DvlJW4X2HtpywO5v3B7JzQ X1RQ== X-Gm-Message-State: ABy/qLaCS2LYDAvpQpkyjAXLYZGeHYZDPj6snaNAAgksPP+exITOV1yz dHqMV1Y3NO6DFiNukEuaYic= X-Google-Smtp-Source: APBJJlHS52lOtYNtz50v7vUTQnMjH6ZlUq1u7ICZEl7VfsLo1CfHmWTaiIyqq2ZIJFfbm71ViNs92A== X-Received: by 2002:a17:902:7889:b0:1b5:25f8:2160 with SMTP id q9-20020a170902788900b001b525f82160mr730246pll.30.1688614507128; Wed, 05 Jul 2023 20:35:07 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id g5-20020a1709026b4500b001b027221393sm229702plt.43.2023.07.05.20.35.05 (version=TLS1_3 
 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:06 -0700 (PDT)
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 04/14] bpf: Refactor alloc_bulk().
Date: Wed, 5 Jul 2023 20:34:37 -0700
Message-Id: <20230706033447.54696-5-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Factor out inner body of alloc_bulk into separate helper.
No functional changes.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 46 ++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 693651d2648b..9693b1f8cbda 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -154,11 +154,35 @@ static struct mem_cgroup *get_memcg(const struct bpf_mem_cache *c)
 #endif
 }
 
+static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
+{
+	unsigned long flags;
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		/* In RT irq_work runs in per-cpu kthread, so disable
+		 * interrupts to avoid preemption and interrupts and
+		 * reduce the chance of bpf prog executing on this cpu
+		 * when active counter is busy.
+		 */
+		local_irq_save(flags);
+	/* alloc_bulk runs from irq_work which will not preempt a bpf
+	 * program that does unit_alloc/unit_free since IRQs are
+	 * disabled there. There is no race to increment 'active'
+	 * counter. It protects free_llist from corruption in case NMI
+	 * bpf prog preempted this loop.
+	 */
+	WARN_ON_ONCE(local_inc_return(&c->active) != 1);
+	__llist_add(obj, &c->free_llist);
+	c->free_cnt++;
+	local_dec(&c->active);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_restore(flags);
+}
+
 /* Mostly runs from irq_work except __init phase. */
 static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 {
 	struct mem_cgroup *memcg = NULL, *old_memcg;
-	unsigned long flags;
 	void *obj;
 	int i;
 
@@ -188,25 +212,7 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 			if (!obj)
 				break;
 		}
-		if (IS_ENABLED(CONFIG_PREEMPT_RT))
-			/* In RT irq_work runs in per-cpu kthread, so disable
-			 * interrupts to avoid preemption and interrupts and
-			 * reduce the chance of bpf prog executing on this cpu
-			 * when active counter is busy.
-			 */
-			local_irq_save(flags);
-		/* alloc_bulk runs from irq_work which will not preempt a bpf
-		 * program that does unit_alloc/unit_free since IRQs are
-		 * disabled there. There is no race to increment 'active'
-		 * counter. It protects free_llist from corruption in case NMI
-		 * bpf prog preempted this loop.
-		 */
-		WARN_ON_ONCE(local_inc_return(&c->active) != 1);
-		__llist_add(obj, &c->free_llist);
-		c->free_cnt++;
-		local_dec(&c->active);
-		if (IS_ENABLED(CONFIG_PREEMPT_RT))
-			local_irq_restore(flags);
+		add_obj_to_free_list(c, obj);
 	}
 	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);

From patchwork Thu Jul 6 03:34:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303132
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D46B7EB64DD for ; Thu, 6 Jul 2023 03:35:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232877AbjGFDfQ (ORCPT ); Wed, 5 Jul 2023 23:35:16 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232700AbjGFDfM (ORCPT ); Wed, 5 Jul 2023 23:35:12 -0400
Received: from mail-pj1-x102f.google.com (mail-pj1-x102f.google.com [IPv6:2607:f8b0:4864:20::102f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2B50198E; Wed, 5 Jul 2023 20:35:11 -0700 (PDT)
Received: by mail-pj1-x102f.google.com with SMTP id 98e67ed59e1d1-262e839647eso259038a91.2; Wed, 05 Jul 2023 20:35:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614511; x=1691206511; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=1eqzE1hZrK6k4U+CY/v2fs3fUbAlzEXpPd6LEk9iT+8=; b=Fjc4FuLawdQgw4MSBgBNtgYZTKWgkAZQX9nQY4uCj75tcocKgv+Kr6GePj0/bojvnb JR5IoJm8ntgXg0qxi6lUfHl9Djs0VvI1R9MUkv7KEiaMZdLs4cymWkTUxH1g+Dq9WZY3 h3WEF+sll7IB4c1rrGjPpksN+mo4VubKiIzRfHlQdiW/t8HQt73NP77O3NaPC3Ee91KH
nDwR8PS6vv/lj/72VzBvneQxWbi4qX77Q5BU/o3oSDmsxagz229KFk8uZyQQPJSl6h3P qr/5JfupWr/15OkU0Put5BlntpXvF5FKYfhW3OAzw6HiAlXwazcFu7P69WQvGVMPa+D9 uaoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614511; x=1691206511; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=1eqzE1hZrK6k4U+CY/v2fs3fUbAlzEXpPd6LEk9iT+8=; b=AwMPUjmyOxTz7oa5KQFcuFRbpAZ/y5CXgls2BIipYvJkPFtIryiCCPPbGQaBM84biY HvNT/AiUMTjLWQNhIrqcqRoRgDbDrB5tYlosGz/TUJKonfYebUKgEA4H4TBZ3ejCuYDY lYEYIKL/+/wYDNYt/JoRoLmG3TLoUTreaNB1Yc3f5Wjo/ngLfVj7J07RrtJcNQAlmHhL CuY+nAAo/LGdpncJm5pYsV9evWzneLXbSuXuhhlCF0Jyscx4lZwQVukio5A2McMVsnnW sLj0Kv+CHdXZc4SB3rdT4y8+R4xbm6DXUkvFFQrczhofeYHQU2Wi40DzsfSvDsoMZxCg JPnQ== X-Gm-Message-State: ABy/qLbLIWASKYh+WUgLkD7csrb6RRBCHibPzIgXNveXAA4R3jBQJCK5 rvJzzHyjh9E3h/QRBflV/W0= X-Google-Smtp-Source: APBJJlFlbwQf/XdUcogJy0XVZtkIdvY4CPKvhoHNuhiTyGAX4nLEZFsJ26althycxW1m56oiXg9qcg== X-Received: by 2002:a17:90a:f28a:b0:263:9816:fe0f with SMTP id fs10-20020a17090af28a00b002639816fe0fmr632692pjb.15.1688614511024; Wed, 05 Jul 2023 20:35:11 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id l20-20020a17090aec1400b00256bbfbabcfsm2070422pjy.48.2023.07.05.20.35.09 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:10 -0700 (PDT) From: Alexei Starovoitov To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com Subject: [PATCH v4 bpf-next 05/14] bpf: Factor out inc/dec of active flag into helpers. 
Date: Wed, 5 Jul 2023 20:34:38 -0700
Message-Id: <20230706033447.54696-6-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Factor out local_inc/dec_return(&c->active) into helpers.
No functional changes.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 9693b1f8cbda..052fc801fb9f 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -154,17 +154,15 @@ static struct mem_cgroup *get_memcg(const struct bpf_mem_cache *c)
 #endif
 }
 
-static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
+static void inc_active(struct bpf_mem_cache *c, unsigned long *flags)
 {
-	unsigned long flags;
-
 	if (IS_ENABLED(CONFIG_PREEMPT_RT))
 		/* In RT irq_work runs in per-cpu kthread, so disable
 		 * interrupts to avoid preemption and interrupts and
 		 * reduce the chance of bpf prog executing on this cpu
 		 * when active counter is busy.
 		 */
-		local_irq_save(flags);
+		local_irq_save(*flags);
 	/* alloc_bulk runs from irq_work which will not preempt a bpf
 	 * program that does unit_alloc/unit_free since IRQs are
 	 * disabled there. There is no race to increment 'active'
@@ -172,13 +170,25 @@ static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
 	 * bpf prog preempted this loop.
*/ WARN_ON_ONCE(local_inc_return(&c->active) != 1); - __llist_add(obj, &c->free_llist); - c->free_cnt++; +} + +static void dec_active(struct bpf_mem_cache *c, unsigned long flags) +{ local_dec(&c->active); if (IS_ENABLED(CONFIG_PREEMPT_RT)) local_irq_restore(flags); } +static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj) +{ + unsigned long flags; + + inc_active(c, &flags); + __llist_add(obj, &c->free_llist); + c->free_cnt++; + dec_active(c, flags); +} + /* Mostly runs from irq_work except __init phase. */ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node) { @@ -300,17 +310,13 @@ static void free_bulk(struct bpf_mem_cache *c) int cnt; do { - if (IS_ENABLED(CONFIG_PREEMPT_RT)) - local_irq_save(flags); - WARN_ON_ONCE(local_inc_return(&c->active) != 1); + inc_active(c, &flags); llnode = __llist_del_first(&c->free_llist); if (llnode) cnt = --c->free_cnt; else cnt = 0; - local_dec(&c->active); - if (IS_ENABLED(CONFIG_PREEMPT_RT)) - local_irq_restore(flags); + dec_active(c, flags); if (llnode) enque_to_free(c, llnode); } while (cnt > (c->high_watermark + c->low_watermark) / 2); From patchwork Thu Jul 6 03:34:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13303133 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2C61EB64DA for ; Thu, 6 Jul 2023 03:35:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232730AbjGFDfU (ORCPT ); Wed, 5 Jul 2023 23:35:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233054AbjGFDfS (ORCPT ); Wed, 5 Jul 2023 23:35:18 -0400 Received: from mail-pf1-x42f.google.com 
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 06/14] bpf: Further refactor alloc_bulk().
Date: Wed, 5 Jul 2023 20:34:39 -0700
Message-Id: <20230706033447.54696-7-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

In certain scenarios alloc_bulk() might be taking free objects mainly from
free_by_rcu_ttrace list. In such case get_memcg() and set_active_memcg() are
redundant, but they show up in perf profile. Split the loop and only set memcg
when allocating from slab. No performance difference in this patch alone, but
it helps in combination with further patches.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 052fc801fb9f..0ee566a7719a 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -196,8 +196,6 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	void *obj;
 	int i;
 
-	memcg = get_memcg(c);
-	old_memcg = set_active_memcg(memcg);
 	for (i = 0; i < cnt; i++) {
 		/*
 		 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
@@ -212,16 +210,24 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 		 * numa node and it is not a guarantee.
 		 */
 		obj = __llist_del_first(&c->free_by_rcu_ttrace);
-		if (!obj) {
-			/* Allocate, but don't deplete atomic reserves that typical
-			 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
-			 * will allocate from the current numa node which is what we
-			 * want here.
-			 */
-			obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
-			if (!obj)
-				break;
-		}
+		if (!obj)
+			break;
+		add_obj_to_free_list(c, obj);
+	}
+	if (i >= cnt)
+		return;
+
+	memcg = get_memcg(c);
+	old_memcg = set_active_memcg(memcg);
+	for (; i < cnt; i++) {
+		/* Allocate, but don't deplete atomic reserves that typical
+		 * GFP_ATOMIC would do. irq_work runs on this cpu and kmalloc
+		 * will allocate from the current numa node which is what we
+		 * want here.
+		 */
+		obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
+		if (!obj)
+			break;
 		add_obj_to_free_list(c, obj);
 	}
 	set_active_memcg(old_memcg);

From patchwork Thu Jul 6 03:34:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303134
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 07/14] bpf: Change bpf_mem_cache draining process.
Date: Wed, 5 Jul 2023 20:34:40 -0700
Message-Id: <20230706033447.54696-8-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

The next patch will introduce cross-cpu llist access and existing
irq_work_sync() + drain_mem_cache() + rcu_barrier_tasks_trace() mechanism will
not be enough, since irq_work_sync() + drain_mem_cache() on cpu A won't
guarantee that llist on cpu A are empty. The free_bulk() on cpu B might add
objects back to llist of cpu A. Add 'bool draining' flag.
The modified sequence looks like:
for_each_cpu:
  WRITE_ONCE(c->draining, true); // do_call_rcu_ttrace() won't be doing call_rcu() any more
  irq_work_sync(); // wait for irq_work callback (free_bulk) to finish
  drain_mem_cache(); // free all objects
rcu_barrier_tasks_trace(); // wait for RCU callbacks to execute

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 0ee566a7719a..2615f296f052 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -98,6 +98,7 @@ struct bpf_mem_cache {
 	int free_cnt;
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
+	bool draining;
 
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
@@ -301,6 +302,12 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 		 * from __free_rcu() and from drain_mem_cache().
 		 */
 		__llist_add(llnode, &c->waiting_for_gp_ttrace);
+
+	if (unlikely(READ_ONCE(c->draining))) {
+		__free_rcu(&c->rcu_ttrace);
+		return;
+	}
+
 	/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
 	 * If RCU Tasks Trace grace period implies RCU grace period, free
 	 * these elements directly, else use call_rcu() to wait for normal
@@ -544,15 +551,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 		rcu_in_progress = 0;
 		for_each_possible_cpu(cpu) {
 			c = per_cpu_ptr(ma->cache, cpu);
-			/*
-			 * refill_work may be unfinished for PREEMPT_RT kernel
-			 * in which irq work is invoked in a per-CPU RT thread.
-			 * It is also possible for kernel with
-			 * arch_irq_work_has_interrupt() being false and irq
-			 * work is invoked in timer interrupt. So waiting for
-			 * the completion of irq work to ease the handling of
-			 * concurrency.
-			 */
+			WRITE_ONCE(c->draining, true);
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
 			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
@@ -568,6 +567,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 			cc = per_cpu_ptr(ma->caches, cpu);
 			for (i = 0; i < NUM_CACHES; i++) {
 				c = &cc->cache[i];
+				WRITE_ONCE(c->draining, true);
 				irq_work_sync(&c->refill_work);
 				drain_mem_cache(c);
 				rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);

From patchwork Thu Jul 6 03:34:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303135
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 08/14] bpf: Add a hint to allocated objects.
Date: Wed, 5 Jul 2023 20:34:41 -0700
Message-Id: <20230706033447.54696-9-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

To address OOM issue when one cpu is allocating and another cpu is freeing add
a target bpf_mem_cache hint to allocated objects and when local cpu free_llist
overflows free to that bpf_mem_cache. The hint addresses the OOM while
maintaining the same performance for common case when alloc/free are done on the
same cpu.

Note that do_call_rcu_ttrace() now has to check 'draining' flag in one more case,
since do_call_rcu_ttrace() is called not only for current cpu.

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 kernel/bpf/memalloc.c | 50 +++++++++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 2615f296f052..9986c6b7df4d 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -99,6 +99,7 @@ struct bpf_mem_cache {
 	int low_watermark, high_watermark, batch;
 	int percpu_size;
 	bool draining;
+	struct bpf_mem_cache *tgt;
 
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
@@ -199,18 +200,11 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 
 	for (i = 0; i < cnt; i++) {
 		/*
-		 * free_by_rcu_ttrace is only manipulated by irq work refill_work().
-		 * IRQ works on the same CPU are called sequentially, so it is
-		 * safe to use __llist_del_first() here. If alloc_bulk() is
-		 * invoked by the initial prefill, there will be no running
-		 * refill_work(), so __llist_del_first() is fine as well.
-		 *
-		 * In most cases, objects on free_by_rcu_ttrace are from the same CPU.
-		 * If some objects come from other CPUs, it doesn't incur any
-		 * harm because NUMA_NO_NODE means the preference for current
-		 * numa node and it is not a guarantee.
+		 * For every 'c' llist_del_first(&c->free_by_rcu_ttrace); is
+		 * done only by one CPU == current CPU. Other CPUs might
+		 * llist_add() and llist_del_all() in parallel.
 		 */
-		obj = __llist_del_first(&c->free_by_rcu_ttrace);
+		obj = llist_del_first(&c->free_by_rcu_ttrace);
 		if (!obj)
 			break;
 		add_obj_to_free_list(c, obj);
@@ -284,18 +278,23 @@ static void enque_to_free(struct bpf_mem_cache *c, void *obj)
 	/* bpf_mem_cache is a per-cpu object. Freeing happens in irq_work.
 	 * Nothing races to add to free_by_rcu_ttrace list.
 	 */
-	__llist_add(llnode, &c->free_by_rcu_ttrace);
+	llist_add(llnode, &c->free_by_rcu_ttrace);
 }
 
 static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 {
 	struct llist_node *llnode, *t;
 
-	if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1))
+	if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1)) {
+		if (unlikely(READ_ONCE(c->draining))) {
+			llnode = llist_del_all(&c->free_by_rcu_ttrace);
+			free_all(llnode, !!c->percpu_size);
+		}
 		return;
+	}
 
 	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
-	llist_for_each_safe(llnode, t, __llist_del_all(&c->free_by_rcu_ttrace))
+	llist_for_each_safe(llnode, t, llist_del_all(&c->free_by_rcu_ttrace))
 		/* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
 		 * It doesn't race with llist_del_all either.
 		 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
@@ -318,10 +317,13 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 
 static void free_bulk(struct bpf_mem_cache *c)
 {
+	struct bpf_mem_cache *tgt = c->tgt;
 	struct llist_node *llnode, *t;
 	unsigned long flags;
 	int cnt;
 
+	WARN_ON_ONCE(tgt->unit_size != c->unit_size);
+
 	do {
 		inc_active(c, &flags);
 		llnode = __llist_del_first(&c->free_llist);
@@ -331,13 +333,13 @@ static void free_bulk(struct bpf_mem_cache *c)
 			cnt = 0;
 		dec_active(c, flags);
 		if (llnode)
-			enque_to_free(c, llnode);
+			enque_to_free(tgt, llnode);
 	} while (cnt > (c->high_watermark + c->low_watermark) / 2);
 
 	/* and drain free_llist_extra */
 	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra))
-		enque_to_free(c, llnode);
-	do_call_rcu_ttrace(c);
+		enque_to_free(tgt, llnode);
+	do_call_rcu_ttrace(tgt);
 }
 
 static void bpf_mem_refill(struct irq_work *work)
@@ -436,6 +438,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
 			c->unit_size = unit_size;
 			c->objcg = objcg;
 			c->percpu_size = percpu_size;
+			c->tgt = c;
 			prefill_mem_cache(c, cpu);
 		}
 		ma->cache = pc;
@@ -458,6 +461,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
 			c = &cc->cache[i];
 			c->unit_size = sizes[i];
 			c->objcg = objcg;
+			c->tgt = c;
 			prefill_mem_cache(c, cpu);
 		}
 	}
@@ -476,7 +480,7 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 	 * Except for waiting_for_gp_ttrace list, there are no concurrent operations
 	 * on these lists, so it is safe to use __llist_del_all().
 	 */
-	free_all(__llist_del_all(&c->free_by_rcu_ttrace), percpu);
+	free_all(llist_del_all(&c->free_by_rcu_ttrace), percpu);
 	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
@@ -601,8 +605,10 @@ static void notrace *unit_alloc(struct bpf_mem_cache *c)
 	local_irq_save(flags);
 	if (local_inc_return(&c->active) == 1) {
 		llnode = __llist_del_first(&c->free_llist);
-		if (llnode)
+		if (llnode) {
 			cnt = --c->free_cnt;
+			*(struct bpf_mem_cache **)llnode = c;
+		}
 	}
 	local_dec(&c->active);
 	local_irq_restore(flags);
@@ -626,6 +632,12 @@ static void notrace unit_free(struct bpf_mem_cache *c, void *ptr)
 
 	BUILD_BUG_ON(LLIST_NODE_SZ > 8);
 
+	/*
+	 * Remember bpf_mem_cache that allocated this object.
+	 * The hint is not accurate.
+	 */
+	c->tgt = *(struct bpf_mem_cache **)llnode;
+
 	local_irq_save(flags);
 	if (local_inc_return(&c->active) == 1) {
 		__llist_add(llnode, &c->free_llist);

From patchwork Thu Jul 6 03:34:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303136
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 09/14] bpf: Allow reuse from waiting_for_gp_ttrace list.
Date: Wed, 5 Jul 2023 20:34:42 -0700
Message-Id: <20230706033447.54696-10-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

alloc_bulk() can reuse elements from free_by_rcu_ttrace.
Let it reuse from waiting_for_gp_ttrace as well to avoid unnecessary kmalloc().

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 9986c6b7df4d..e5a87f6cf2cc 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -212,6 +212,15 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
 	if (i >= cnt)
 		return;
 
+	for (; i < cnt; i++) {
+		obj = llist_del_first(&c->waiting_for_gp_ttrace);
+		if (!obj)
+			break;
+		add_obj_to_free_list(c, obj);
+	}
+	if (i >= cnt)
+		return;
+
 	memcg = get_memcg(c);
 	old_memcg = set_active_memcg(memcg);
 	for (; i < cnt; i++) {
@@ -295,12 +304,7 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
 
 	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
 	llist_for_each_safe(llnode, t, llist_del_all(&c->free_by_rcu_ttrace))
-		/* There is no concurrent __llist_add(waiting_for_gp_ttrace) access.
-		 * It doesn't race with llist_del_all either.
-		 * But there could be two concurrent llist_del_all(waiting_for_gp_ttrace):
-		 * from __free_rcu() and from drain_mem_cache().
-		 */
-		__llist_add(llnode, &c->waiting_for_gp_ttrace);
+		llist_add(llnode, &c->waiting_for_gp_ttrace);
 
 	if (unlikely(READ_ONCE(c->draining))) {
 		__free_rcu(&c->rcu_ttrace);

From patchwork Thu Jul 6 03:34:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303137
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 10/14] rcu: Export rcu_request_urgent_qs_task()
Date: Wed, 5 Jul 2023 20:34:43 -0700
Message-Id: <20230706033447.54696-11-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: "Paul E. McKenney"

If a CPU is executing a long series of non-sleeping system calls,
RCU grace periods can be delayed for on the order of a couple hundred
milliseconds. This is normally not a problem, but if each system call
does a call_rcu(), those callbacks can stack up. RCU will eventually
notice this callback storm, but use of rcu_request_urgent_qs_task()
allows the code invoking call_rcu() to give RCU a heads up.

This function is not for general use, not yet, anyway.

Reported-by: Alexei Starovoitov
Signed-off-by: Paul E. McKenney
Signed-off-by: Alexei Starovoitov
---
 include/linux/rcutiny.h | 2 ++
 include/linux/rcutree.h | 1 +
 kernel/rcu/rcu.h        | 2 --
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 7f17acf29dda..7b949292908a 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -138,6 +138,8 @@ static inline int rcu_needs_cpu(void)
 	return 0;
 }
 
+static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
+
 /*
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 56bccb5a8fde..126f6b418f6a 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -21,6 +21,7 @@ void rcu_softirq_qs(void);
 void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(void);
 void rcu_cpu_stall_reset(void);
+void rcu_request_urgent_qs_task(struct task_struct *t);
 
 /*
  * Note a virtualization-based context switch. This is simply a
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 98c1544cf572..f95cfb5bf2ee 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -493,7 +493,6 @@ static inline void rcu_expedite_gp(void) { }
 static inline void rcu_unexpedite_gp(void) { }
 static inline void rcu_async_hurry(void) { }
 static inline void rcu_async_relax(void) { }
-static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 bool rcu_gp_is_normal(void); /* Internal RCU use. */
 bool rcu_gp_is_expedited(void); /* Internal RCU use. */
@@ -508,7 +507,6 @@ void show_rcu_tasks_gp_kthreads(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 static inline void show_rcu_tasks_gp_kthreads(void) {}
 #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
-void rcu_request_urgent_qs_task(struct task_struct *t);
 #endif /* #else #ifdef CONFIG_TINY_RCU */
 
 #define RCU_SCHEDULER_INACTIVE 0

From patchwork Thu Jul 6 03:34:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303138
d2e1a72fcca58-666e97fcc60so237572b3a.3; Wed, 05 Jul 2023 20:35:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614535; x=1691206535; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=oYt+LNzB/i0hKKdEYq5jVUn1IOf1250hxziDpCkVzXs=; b=KmhDZ5nonfra5L1SO5UQa1PTBxdDqvFvzc2PeDr78AP0CCvyXaSPuHJCsi2RqoLlDG P85IbaLhS+zuIZyGLSKTv7SFiN0DRJ6gH8qY7loR/tEBUaB1LWmseyaUrZZz26zwpZNs BQx3WeURSBS8R6o9KVwFTerN3DxX94brRgmdI9jjOzu3CsgIZpfdimpcNsUTQg7K7LUb yKoy+iyTASGMuy6Fx54aIumfzxoh1zllA/SYkDMl+1aZTnUtzmVhT+MrVjvN03nDAUPh Kp3YhOTv5rqFt9UMbF+4YGs50Nl0dCFNVyQq+KmhP32Vv/uyBvI4xN8OB6M9RfE3++Pc 9c7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614535; x=1691206535; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oYt+LNzB/i0hKKdEYq5jVUn1IOf1250hxziDpCkVzXs=; b=BeXfden6st8LkJ3qKIXy7Ghwq6lK5gsoS97bTQBLSKfwtefD/LQr7lEM57qUeWljli ohLCijWtCw3iMJyDyNLrvxQuhHfZ/vdX8b43EUc4+++q+5Ny5bx8agdfSXDy1oaVb/eL 7+fnNHvclIgfHRAIpOOPF0rmWa2ZUTszSOKFehCjp4AiBm0rwcRov1ORj2mej688+cBF RmC636sBo6SQ5HRWoQhEzdzfRK571vtqaJlSEt0qzcx+fcG/GrCvuJKeI2PgmBuF3bA8 2LmIPJREmNCpxCs6sEZHsN6dYDPfdRTfTnEUkPxoN+yPA6gIf8O7XzCwKf8USHdT/oI3 JJ1w== X-Gm-Message-State: ABy/qLalYm50PwXBXM0jTdOB+TcE6B96vkqfXtSzz2U88te9qaA8m0Mt e9pE2eCPcmy9k4BQ+CzC1rzIxl5hDa4= X-Google-Smtp-Source: APBJJlGPxYImLwrJfTPWKzIl2uJaj+5H/chqedmrZYcFrBslXfYGj/+773Z7IXOQ2BCZLOWAGyMSSA== X-Received: by 2002:a05:6a00:1991:b0:666:a25b:3788 with SMTP id d17-20020a056a00199100b00666a25b3788mr492266pfl.34.1688614534738; Wed, 05 Jul 2023 20:35:34 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id r23-20020a62e417000000b006661562429fsm248363pfh.97.2023.07.05.20.35.32 (version=TLS1_3 
cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:34 -0700 (PDT)
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 11/14] selftests/bpf: Improve test coverage of bpf_mem_alloc.
Date: Wed, 5 Jul 2023 20:34:44 -0700
Message-Id: <20230706033447.54696-12-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

bpf_obj_new() calls bpf_mem_alloc(), but allocating and freeing only
8 elements does not trigger the watermark conditions in bpf_mem_alloc.
Increase to 200 elements to make sure alloc_bulk/free_bulk is exercised.

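To illustrate why 8 elements is too few, here is a minimal user-space sketch of a watermark-driven free list. It is not the kernel code: the names toy_cache, toy_alloc(), toy_free() and toy_bulk_ops() are hypothetical, and the watermark values (low 32, high 96, batch 48, prefilled to 64) are assumptions chosen only for illustration. It counts how often a bulk refill or bulk flush would run when n objects are allocated and then freed.

```c
#include <assert.h>

/* Toy model of a per-CPU free list with watermarks (all values assumed). */
struct toy_cache {
	int free_cnt;       /* objects currently on the free list */
	int low_watermark;  /* refill when free_cnt drops below this */
	int high_watermark; /* flush when free_cnt rises above this */
	int batch;          /* objects moved per refill or flush */
	int refills;        /* times the alloc_bulk-like path ran */
	int flushes;        /* times the free_bulk-like path ran */
};

static void toy_init(struct toy_cache *c)
{
	c->low_watermark = 32;
	c->high_watermark = 96;
	c->batch = 48;
	c->free_cnt = 64;
	c->refills = 0;
	c->flushes = 0;
}

/* Take one object; refill in bulk if the list runs low. */
static void toy_alloc(struct toy_cache *c)
{
	c->free_cnt--;
	if (c->free_cnt < c->low_watermark) {
		c->free_cnt += c->batch;
		c->refills++;
	}
}

/* Return one object; flush in bulk if the list grows too long. */
static void toy_free(struct toy_cache *c)
{
	c->free_cnt++;
	if (c->free_cnt > c->high_watermark) {
		c->free_cnt -= c->batch;
		c->flushes++;
	}
}

/* Allocate n objects, free all n, and return the number of bulk operations. */
int toy_bulk_ops(int n)
{
	struct toy_cache c;
	int i;

	assert(n >= 0);
	toy_init(&c);
	for (i = 0; i < n; i++)
		toy_alloc(&c);
	for (i = 0; i < n; i++)
		toy_free(&c);
	return c.refills + c.flushes;
}
```

Under these assumed values, toy_bulk_ops(8) never crosses either watermark, while toy_bulk_ops(200) crosses both, which is the effect the larger array in the test is meant to achieve.
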
Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 tools/testing/selftests/bpf/progs/linked_list.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/progs/linked_list.c b/tools/testing/selftests/bpf/progs/linked_list.c
index 57440a554304..84d1777a9e6c 100644
--- a/tools/testing/selftests/bpf/progs/linked_list.c
+++ b/tools/testing/selftests/bpf/progs/linked_list.c
@@ -96,7 +96,7 @@ static __always_inline
 int list_push_pop_multiple(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map)
 {
 	struct bpf_list_node *n;
-	struct foo *f[8], *pf;
+	struct foo *f[200], *pf;
 	int i;
 
 	/* Loop following this check adds nodes 2-at-a-time in order to

From patchwork Thu Jul 6 03:34:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303139
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B574DEB64DD for ; Thu, 6 Jul 2023 03:35:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229524AbjGFDfr (ORCPT ); Wed, 5 Jul 2023 23:35:47 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233166AbjGFDfp (ORCPT ); Wed, 5 Jul 2023 23:35:45 -0400
Received: from mail-pg1-x52f.google.com (mail-pg1-x52f.google.com [IPv6:2607:f8b0:4864:20::52f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FB061BCA; Wed, 5 Jul 2023 20:35:39 -0700 (PDT)
Received: by mail-pg1-x52f.google.com with SMTP id 41be03b00d2f7-54290603887so147747a12.1; Wed, 05 Jul 2023 20:35:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614539; x=1691206539;
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3XvDiBoDjbQQa6lc4LFoLb3kfWjOt1I1H5xTwSvGnvw=; b=oqDB6ELAu/wo33otYPW4ZtFnT5JJc0S3Er4TRpkQKC9PJk1ZT4RINOQOR7Nk61LJ+7 +1LU4mK0ysbgSt2/BzamQwASsdpQx+zI91nTiDGvtc+xNeJZ923lrlLuFqdOXjeIUspZ 1qA06kw8i+b2Bxamv0KvphIXxIv7Wz2hhdl3HWkydNbpySlonPX6vVyfNCgKbdjvZ+ec wx9Ms6V1AIOtDbb+6rYbWpELBIwiYqC6o4n+pP7hC36SwKy+HVTKnYx6zoeUSypXUDnA 442e9w6EknRRA5RyBPB2IyYIAL3V/Z6e+Uwvwv7ZXhxmu74OaFcZ+ijmVz8NScWw+hOA dneQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614539; x=1691206539; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3XvDiBoDjbQQa6lc4LFoLb3kfWjOt1I1H5xTwSvGnvw=; b=NyC6Zf2mwt6ByXXh0qkjiXjUeP7qSxWXSsFapWStatoHM7vUsN6q5OrXfqse6GLdTM lQv2UN8xykwKpwaWcOa5+vyBLM/VtGmrfA79plc4ztRz0xoWicz2Zkmn4XYTj0OYTyw8 7PrqZhL5iBAvfQ+1P+PkgvgHC16q7LGVnvgsvCiCpO5efdw6B90az72gqv7kbhyEYDLf 8S51TzaAFEe/1/AZTBVh8XCx2NQ1E6EustZYw7MGcTmuBI3j76oUYxj637Phk+tvoWkN 0jYibutobsbu2F9eg9noGlg4toREslUdth4KXP7A4oJSMi+LUdKm4lRnKA7/BS2FxtHK npEQ== X-Gm-Message-State: ABy/qLYAElFswB/+Z+4kgxUNAavwe727TJnhUlbE8siO6d1+J3JpuYLa D/8on3lxb2c2p7/vPL2kyAk= X-Google-Smtp-Source: APBJJlGRJysi9/fWfcAitLsK2vG63SDtfH5jwWELDSJo08dyUU9JjfS5EXqQ6EUOOLZYyjfFuD+fSA== X-Received: by 2002:a05:6a00:1952:b0:668:73f5:dce0 with SMTP id s18-20020a056a00195200b0066873f5dce0mr559728pfk.29.1688614538620; Wed, 05 Jul 2023 20:35:38 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id x15-20020a62fb0f000000b00682936d04ccsm229209pfm.180.2023.07.05.20.35.37 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:38 -0700 (PDT) From: Alexei Starovoitov To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, 
paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 12/14] bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().
Date: Wed, 5 Jul 2023 20:34:45 -0700
Message-Id: <20230706033447.54696-13-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Introduce bpf_mem_[cache_]free_rcu() similar to kfree_rcu().
Unlike bpf_mem_[cache_]free(), which links objects into the per-cpu free
list for immediate reuse, the _rcu() flavor waits for an RCU grace period
and then moves objects into the free_by_rcu_ttrace list, where they wait
for an RCU tasks trace grace period before being freed into slab.

The life cycle of objects:
alloc: dequeue free_llist
free: enqueue free_llist
free_rcu: enqueue free_by_rcu -> waiting_for_gp
free_llist above high watermark -> free_by_rcu_ttrace
after RCU GP waiting_for_gp -> free_by_rcu_ttrace
free_by_rcu_ttrace -> waiting_for_gp_ttrace -> slab

Signed-off-by: Alexei Starovoitov
Acked-by: Hou Tao
---
 include/linux/bpf_mem_alloc.h |   2 +
 kernel/bpf/memalloc.c         | 129 +++++++++++++++++++++++++++++++++-
 2 files changed, 128 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf_mem_alloc.h b/include/linux/bpf_mem_alloc.h
index 3929be5743f4..d644bbb298af 100644
--- a/include/linux/bpf_mem_alloc.h
+++ b/include/linux/bpf_mem_alloc.h
@@ -27,10 +27,12 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma);
 /* kmalloc/kfree equivalent: */
 void *bpf_mem_alloc(struct bpf_mem_alloc *ma, size_t size);
 void bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr);
+void bpf_mem_free_rcu(struct bpf_mem_alloc *ma, void *ptr);
 
 /* kmem_cache_alloc/free equivalent: */
 void *bpf_mem_cache_alloc(struct bpf_mem_alloc *ma);
 void bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr);
+void bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr);
 void bpf_mem_cache_raw_free(void *ptr);
 void *bpf_mem_cache_alloc_flags(struct bpf_mem_alloc *ma, gfp_t flags);
 
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index e5a87f6cf2cc..17ef2e9b278a 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -101,6 +101,15 @@ struct bpf_mem_cache {
 	bool draining;
 	struct bpf_mem_cache *tgt;
 
+	/* list of objects to be freed after RCU GP */
+	struct llist_head free_by_rcu;
+	struct llist_node *free_by_rcu_tail;
+	struct llist_head waiting_for_gp;
+	struct llist_node *waiting_for_gp_tail;
+	struct rcu_head rcu;
+	atomic_t call_rcu_in_progress;
+	struct llist_head free_llist_extra_rcu;
+
 	/* list of objects to be freed after RCU tasks trace GP */
 	struct llist_head free_by_rcu_ttrace;
 	struct llist_head waiting_for_gp_ttrace;
@@ -346,6 +355,69 @@ static void free_bulk(struct bpf_mem_cache *c)
 	do_call_rcu_ttrace(tgt);
 }
 
+static void __free_by_rcu(struct rcu_head *head)
+{
+	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+	struct bpf_mem_cache *tgt = c->tgt;
+	struct llist_node *llnode;
+
+	llnode = llist_del_all(&c->waiting_for_gp);
+	if (!llnode)
+		goto out;
+
+	llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace);
+
+	/* Objects went through regular RCU GP. Send them to RCU tasks trace */
+	do_call_rcu_ttrace(tgt);
+out:
+	atomic_set(&c->call_rcu_in_progress, 0);
+}
+
+static void check_free_by_rcu(struct bpf_mem_cache *c)
+{
+	struct llist_node *llnode, *t;
+	unsigned long flags;
+
+	/* drain free_llist_extra_rcu */
+	if (unlikely(!llist_empty(&c->free_llist_extra_rcu))) {
+		inc_active(c, &flags);
+		llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra_rcu))
+			if (__llist_add(llnode, &c->free_by_rcu))
+				c->free_by_rcu_tail = llnode;
+		dec_active(c, flags);
+	}
+
+	if (llist_empty(&c->free_by_rcu))
+		return;
+
+	if (atomic_xchg(&c->call_rcu_in_progress, 1)) {
+		/*
+		 * Instead of kmalloc-ing new rcu_head and triggering 10k
+		 * call_rcu() to hit rcutree.qhimark and force RCU to notice
+		 * the overload just ask RCU to hurry up. There could be many
+		 * objects in free_by_rcu list.
+		 * This hint reduces memory consumption for an artificial
+		 * benchmark from 2 Gbyte to 150 Mbyte.
+		 */
+		rcu_request_urgent_qs_task(current);
+		return;
+	}
+
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
+
+	inc_active(c, &flags);
+	WRITE_ONCE(c->waiting_for_gp.first, __llist_del_all(&c->free_by_rcu));
+	c->waiting_for_gp_tail = c->free_by_rcu_tail;
+	dec_active(c, flags);
+
+	if (unlikely(READ_ONCE(c->draining))) {
+		free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
+		atomic_set(&c->call_rcu_in_progress, 0);
+	} else {
+		call_rcu_hurry(&c->rcu, __free_by_rcu);
+	}
+}
+
 static void bpf_mem_refill(struct irq_work *work)
 {
 	struct bpf_mem_cache *c = container_of(work, struct bpf_mem_cache, refill_work);
@@ -360,6 +432,8 @@ static void bpf_mem_refill(struct irq_work *work)
 		alloc_bulk(c, c->batch, NUMA_NO_NODE);
 	else if (cnt > c->high_watermark)
 		free_bulk(c);
+
+	check_free_by_rcu(c);
 }
 
 static void notrace irq_work_raise(struct bpf_mem_cache *c)
@@ -488,6 +562,9 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 	free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu);
 	free_all(__llist_del_all(&c->free_llist), percpu);
 	free_all(__llist_del_all(&c->free_llist_extra), percpu);
+	free_all(__llist_del_all(&c->free_by_rcu), percpu);
+	free_all(__llist_del_all(&c->free_llist_extra_rcu), percpu);
+	free_all(llist_del_all(&c->waiting_for_gp), percpu);
 }
 
 static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
@@ -500,8 +577,8 @@ static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
 
 static void free_mem_alloc(struct bpf_mem_alloc *ma)
 {
-	/* waiting_for_gp_ttrace lists was drained, but __free_rcu might
-	 * still execute. Wait for it now before we freeing percpu caches.
+	/* waiting_for_gp[_ttrace] lists were drained, but RCU callbacks
+	 * might still execute. Wait for them.
 	 *
 	 * rcu_barrier_tasks_trace() doesn't imply synchronize_rcu_tasks_trace(),
 	 * but rcu_barrier_tasks_trace() and rcu_barrier() below are only used
@@ -510,7 +587,8 @@ static void free_mem_alloc(struct bpf_mem_alloc *ma)
 	 * rcu_trace_implies_rcu_gp(), it will be OK to skip rcu_barrier() by
 	 * using rcu_trace_implies_rcu_gp() as well.
 	 */
-	rcu_barrier_tasks_trace();
+	rcu_barrier(); /* wait for __free_by_rcu */
+	rcu_barrier_tasks_trace(); /* wait for __free_rcu */
 	if (!rcu_trace_implies_rcu_gp())
 		rcu_barrier();
 	free_mem_alloc_no_barrier(ma);
@@ -563,6 +641,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 			irq_work_sync(&c->refill_work);
 			drain_mem_cache(c);
 			rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
+			rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
 		}
 		/* objcg is the same across cpus */
 		if (c->objcg)
@@ -579,6 +658,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
 				irq_work_sync(&c->refill_work);
 				drain_mem_cache(c);
 				rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
+				rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
 			}
 		}
 		if (c->objcg)
@@ -663,6 +743,27 @@ static void notrace unit_free(struct bpf_mem_cache *c, void *ptr)
 		irq_work_raise(c);
 }
 
+static void notrace unit_free_rcu(struct bpf_mem_cache *c, void *ptr)
+{
+	struct llist_node *llnode = ptr - LLIST_NODE_SZ;
+	unsigned long flags;
+
+	c->tgt = *(struct bpf_mem_cache **)llnode;
+
+	local_irq_save(flags);
+	if (local_inc_return(&c->active) == 1) {
+		if (__llist_add(llnode, &c->free_by_rcu))
+			c->free_by_rcu_tail = llnode;
+	} else {
+		llist_add(llnode, &c->free_llist_extra_rcu);
+	}
+	local_dec(&c->active);
+	local_irq_restore(flags);
+
+	if (!atomic_read(&c->call_rcu_in_progress))
+		irq_work_raise(c);
+}
+
 /* Called from BPF program or from sys_bpf syscall.
  * In both cases migration is disabled.
  */
@@ -696,6 +797,20 @@ void notrace bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr)
 	unit_free(this_cpu_ptr(ma->caches)->cache + idx, ptr);
 }
 
+void notrace bpf_mem_free_rcu(struct bpf_mem_alloc *ma, void *ptr)
+{
+	int idx;
+
+	if (!ptr)
+		return;
+
+	idx = bpf_mem_cache_idx(ksize(ptr - LLIST_NODE_SZ));
+	if (idx < 0)
+		return;
+
+	unit_free_rcu(this_cpu_ptr(ma->caches)->cache + idx, ptr);
+}
+
 void notrace *bpf_mem_cache_alloc(struct bpf_mem_alloc *ma)
 {
 	void *ret;
@@ -712,6 +827,14 @@ void notrace bpf_mem_cache_free(struct bpf_mem_alloc *ma, void *ptr)
 	unit_free(this_cpu_ptr(ma->cache), ptr);
 }
 
+void notrace bpf_mem_cache_free_rcu(struct bpf_mem_alloc *ma, void *ptr)
+{
+	if (!ptr)
+		return;
+
+	unit_free_rcu(this_cpu_ptr(ma->cache), ptr);
+}
+
 /* Directly does a kfree() without putting 'ptr' back to the free_llist
  * for reuse and without waiting for a rcu_tasks_trace gp.
  * The caller must first go through the rcu_tasks_trace gp for 'ptr'

From patchwork Thu Jul 6 03:34:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303140
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E10CEB64DA for ; Thu, 6 Jul 2023 03:35:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232185AbjGFDfv (ORCPT ); Wed, 5 Jul 2023 23:35:51 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59396 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230383AbjGFDfu (ORCPT ); Wed, 5 Jul 2023 23:35:50 -0400
Received: from mail-oa1-x32.google.com (mail-oa1-x32.google.com [IPv6:2001:4860:4864:20::32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 332621BEC; Wed, 5 Jul 2023 20:35:43 -0700 (PDT)
Received: by
mail-oa1-x32.google.com with SMTP id 586e51a60fabf-1b078b34df5so363765fac.2; Wed, 05 Jul 2023 20:35:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614542; x=1691206542; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=smMeUc5DmNi2lwy4EGV+uTKZDtjMFbrXIGNYOVUZ5Gs=; b=eJ1M9zYGUcdMJB7lXouxeTk9Ls641bP0IouDn0A9CBSTo3oG6S/NqyLNH9bDknlopZ 8+J18JxP/uf+sgh+iyHrUTHkp94UHxbVeeseO4uWUJ5ayHbRT6UrJ8j/OvU5GCHdCa/w +twF9SE57vlPfK7mxv9lDM4BYGVE1Szmo2STs6Bst4dzKP84udssMAWR9E1P5rACeaZb KQKm5I81qqWRtunT991UrGXhCewI0defdMKRSSjrjFq6lqmD5fNsUQtaEZWxLTpkVbLo CvwxN76Tn4uP1gt8K2LO0u6JEkypDwmpNc7DQaSR/Yt90RysZdk96fO2QJTe0CbBfzVt txhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614542; x=1691206542; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=smMeUc5DmNi2lwy4EGV+uTKZDtjMFbrXIGNYOVUZ5Gs=; b=SPe/RWyxmn7Rr5Ix4NnMwLgIuV2tfAyP9mqNR1SYZn2HN+x1FGWKe/lrEUPKApGhzV Zi5PyP8Oqlv9iU3Mlj4V6Qpy26T/KEwmlR7qc5eBSlIx1JEKzyEN4odfZ5eY3G41FryI YfShxCQ+VudqS6heMZJ9yaBJfXiKt+JdoNsnU6oPvlr+T+bS4kJJaQ6QlhdDKSzwu750 nBrP5z1rOE5swUzHWnqSC6BUbGCmHIB6EDR0v1fd/G1EyZLWAA9TNG0zq21UqTSQUaIs DYqelmhy7YqB5zqb8OELboPdMNpuHGkgnCbVp6q3RaV+RR/Ir0jfz7iZw7Wek6Kt3q3m agtg== X-Gm-Message-State: ABy/qLavJTOWGQKnAGmvts+38lOtEdl0FaNbE9PUhhWHtE5L0spzepfE 5ksnTaxTmV6/xoLOfo+34C4= X-Google-Smtp-Source: APBJJlFimwDBMQBG/8L9mUiaQHodZaxNwiOKJG+xdULukv56qVDE2yKPt7+nFdfs5Gosd4e3Pn6qnw== X-Received: by 2002:a05:6870:a2c7:b0:1b0:151c:9b19 with SMTP id w7-20020a056870a2c700b001b0151c9b19mr1039388oak.18.1688614542468; Wed, 05 Jul 2023 20:35:42 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id u2-20020a170902e80200b001ab39cd875csm220092plg.133.2023.07.05.20.35.40 
(version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:42 -0700 (PDT)
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 13/14] bpf: Convert bpf_cpumask to bpf_mem_cache_free_rcu.
Date: Wed, 5 Jul 2023 20:34:46 -0700
Message-Id: <20230706033447.54696-14-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Alexei Starovoitov

Convert bpf_cpumask to bpf_mem_cache_free_rcu.
Note that migrate_disable() in bpf_cpumask_release() is still necessary, since
bpf_cpumask_release() is a dtor. bpf_obj_free_fields() can be converted to do
migrate_disable() there in a follow up.

Signed-off-by: Alexei Starovoitov
Acked-by: David Vernet
---
 kernel/bpf/cpumask.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
index 938a60ff4295..6983af8e093c 100644
--- a/kernel/bpf/cpumask.c
+++ b/kernel/bpf/cpumask.c
@@ -9,7 +9,6 @@
 /**
  * struct bpf_cpumask - refcounted BPF cpumask wrapper structure
  * @cpumask:	The actual cpumask embedded in the struct.
- * @rcu:	The RCU head used to free the cpumask with RCU safety.
  * @usage:	Object reference counter. When the refcount goes to 0, the
  *		memory is released back to the BPF allocator, which provides
  *		RCU safety.
@@ -25,7 +24,6 @@
  */
 struct bpf_cpumask {
 	cpumask_t cpumask;
-	struct rcu_head rcu;
 	refcount_t usage;
 };
 
@@ -82,16 +80,6 @@ __bpf_kfunc struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
 	return cpumask;
 }
 
-static void cpumask_free_cb(struct rcu_head *head)
-{
-	struct bpf_cpumask *cpumask;
-
-	cpumask = container_of(head, struct bpf_cpumask, rcu);
-	migrate_disable();
-	bpf_mem_cache_free(&bpf_cpumask_ma, cpumask);
-	migrate_enable();
-}
-
 /**
  * bpf_cpumask_release() - Release a previously acquired BPF cpumask.
  * @cpumask: The cpumask being released.
@@ -102,8 +90,12 @@ static void cpumask_free_cb(struct rcu_head *head)
  */
 __bpf_kfunc void bpf_cpumask_release(struct bpf_cpumask *cpumask)
 {
-	if (refcount_dec_and_test(&cpumask->usage))
-		call_rcu(&cpumask->rcu, cpumask_free_cb);
+	if (!refcount_dec_and_test(&cpumask->usage))
+		return;
+
+	migrate_disable();
+	bpf_mem_cache_free_rcu(&bpf_cpumask_ma, cpumask);
+	migrate_enable();
 }
 
 /**

From patchwork Thu Jul 6 03:34:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13303141
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E725AEB64DA for ; Thu, 6 Jul 2023 03:35:56 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232538AbjGFDf4 (ORCPT ); Wed, 5 Jul 2023 23:35:56 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59442 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233166AbjGFDfz (ORCPT ); Wed, 5 Jul 2023 23:35:55 -0400
Received: from mail-oi1-x230.google.com (mail-oi1-x230.google.com [IPv6:2607:f8b0:4864:20::230]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0564C1BD2; Wed, 5 Jul 2023 20:35:47 -0700 (PDT)
Received:
by mail-oi1-x230.google.com with SMTP id 5614622812f47-3a1ebb85f99so343464b6e.2; Wed, 05 Jul 2023 20:35:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688614546; x=1691206546; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=f0bRuM+0zPcN3qYoxYOiMZ4uRyYlqx6ScAuNZFVccm0=; b=gKl3d8oMUw6/NKNF76YIYaD9bBO8RSyO5PM6Cvw46qzjnMBiNhuaxGxL6CCPqSm8Ph Kdv0J1iQVb5rDbvNokRqHU9Qtu8gXUySAtwJHXkVT/uSDuhVpLbtSgzGG85LGl3nkL71 e2N+yJ/d5U3v+P9yvpzefePVJmbPTcUcxH6pg9vUg4nFn7svcM0jpy9iV8bcHzMmY24Q 56A5nLRsjyG15Dbt1UUaU8fk5H2XgFVHQvOrC4zkNtnRaStQijhbH16Wmi1tL3vi3iUF k6i7GHsoUdMTFZ27mRt8ZszzzAH6du7QNvccq+3pcNQz3z+5bzO4yIZKXUM2zvqwVRFZ 9rig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688614546; x=1691206546; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=f0bRuM+0zPcN3qYoxYOiMZ4uRyYlqx6ScAuNZFVccm0=; b=ALmhEcmnOm52jxyW+Vh2iNBjOU+dZWLlb0Fvx0v4nqFYv5sy+JJf6o4z0gjy/t0Sc0 ALB4mN3nkt7ahhgk5eh7rrlfWLDeIdz4eJECP6zWUMTEMGdqEMGz/JV/DoJqo6+fdxmK Wdfp6heiNnICDI9sZAaU8wvtiKGXL/8O475MRnSIu7CjRZR5xZzlUXT7w4bzrDodsVF8 sdUDQTy9qlVssGDhHuqceVf3rX52sfSEo1HbFHc4QcrYqQ0sTh8XfvS1mD1wOT++UWnS CBGkleV+fWXIAUJFMxXj9Qgbx9dJDRBMtLjAnyzKk1Y0Q8tRQ2LKwztLrMryLPEhXLjv hEIQ== X-Gm-Message-State: ABy/qLZXQXesEtbH50f0CVe8JF2qycBnz/j521y2dbdAL8vLppuWhRk8 RIJS2RCTaJGoxZGVeLFx0CUWhn3KKZo= X-Google-Smtp-Source: APBJJlH13nj7856GlvWlvgyF7+YXlmnZ9rkeUgo+KEcrnC/DvOFqbkJ3BjFYflyTflMwfjfxDC6gCQ== X-Received: by 2002:a05:6808:1596:b0:3a1:ecdf:5f74 with SMTP id t22-20020a056808159600b003a1ecdf5f74mr646205oiw.43.1688614546277; Wed, 05 Jul 2023 20:35:46 -0700 (PDT) Received: from localhost.localdomain ([2620:10d:c090:400::5:f715]) by smtp.gmail.com with ESMTPSA id 
z18-20020aa791d2000000b006829b28b393sm228322pfa.199.2023.07.05.20.35.44 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 05 Jul 2023 20:35:45 -0700 (PDT)
From: Alexei Starovoitov
To: daniel@iogearbox.net, andrii@kernel.org, void@manifault.com, houtao@huaweicloud.com, paulmck@kernel.org
Cc: tj@kernel.org, rcu@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 bpf-next 14/14] bpf: Add object leak check.
Date: Wed, 5 Jul 2023 20:34:47 -0700
Message-Id: <20230706033447.54696-15-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
References: <20230706033447.54696-1-alexei.starovoitov@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: rcu@vger.kernel.org

From: Hou Tao

The object leak check is cheap. Do it unconditionally to spot difficult races
in bpf_mem_alloc.

Signed-off-by: Hou Tao
Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/memalloc.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 17ef2e9b278a..51d6389e5152 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -567,8 +567,43 @@ static void drain_mem_cache(struct bpf_mem_cache *c)
 	free_all(llist_del_all(&c->waiting_for_gp), percpu);
 }
 
+static void check_mem_cache(struct bpf_mem_cache *c)
+{
+	WARN_ON_ONCE(!llist_empty(&c->free_by_rcu_ttrace));
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp_ttrace));
+	WARN_ON_ONCE(!llist_empty(&c->free_llist));
+	WARN_ON_ONCE(!llist_empty(&c->free_llist_extra));
+	WARN_ON_ONCE(!llist_empty(&c->free_by_rcu));
+	WARN_ON_ONCE(!llist_empty(&c->free_llist_extra_rcu));
+	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
+}
+
+static void check_leaked_objs(struct bpf_mem_alloc *ma)
+{
+	struct bpf_mem_caches *cc;
+	struct bpf_mem_cache *c;
+	int cpu, i;
+
+	if (ma->cache) {
+		for_each_possible_cpu(cpu) {
+			c = per_cpu_ptr(ma->cache, cpu);
+			check_mem_cache(c);
+		}
+	}
+	if (ma->caches) {
+		for_each_possible_cpu(cpu) {
+			cc = per_cpu_ptr(ma->caches, cpu);
+			for (i = 0; i < NUM_CACHES; i++) {
+				c = &cc->cache[i];
+				check_mem_cache(c);
+			}
+		}
+	}
+}
+
 static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
 {
+	check_leaked_objs(ma);
 	free_percpu(ma->cache);
 	free_percpu(ma->caches);
 	ma->cache = NULL;
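
The idea behind the leak check can be tried outside the kernel with a small user-space sketch. Everything here is hypothetical and only illustrates the pattern: toy_list stands in for the kernel's llist, toy_cache holds three lists instead of the seven checked in the patch, and toy_check_leaked() returns a count where the patch calls WARN_ON_ONCE(). The skip argument of toy_drain() simulates a drain path that forgets one list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy singly linked list; a stand-in for the kernel's llist (assumption). */
struct toy_node { struct toy_node *next; };
struct toy_list { struct toy_node *first; };

enum { TOY_FREE, TOY_FREE_BY_RCU, TOY_WAITING_FOR_GP, TOY_NUM_LISTS };

struct toy_cache {
	struct toy_list lists[TOY_NUM_LISTS];
};

/* Push one node onto the front of a list. */
void toy_push(struct toy_list *l, struct toy_node *n)
{
	n->next = l->first;
	l->first = n;
}

/* Detach every node from the list and return the old head. */
static struct toy_node *toy_del_all(struct toy_list *l)
{
	struct toy_node *first = l->first;

	l->first = NULL;
	return first;
}

/* Drain all lists except 'skip'; pass -1 to drain everything. */
void toy_drain(struct toy_cache *c, int skip)
{
	int i;

	for (i = 0; i < TOY_NUM_LISTS; i++)
		if (i != skip)
			(void)toy_del_all(&c->lists[i]);
}

/* Return how many lists still hold objects after draining. */
int toy_check_leaked(const struct toy_cache *c)
{
	int i, leaked = 0;

	assert(c != NULL);
	for (i = 0; i < TOY_NUM_LISTS; i++)
		if (c->lists[i].first != NULL)
			leaked++;
	return leaked;
}
```

The check is a handful of pointer comparisons per cache, which is consistent with the commit message's point that it is cheap enough to run unconditionally.
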