From patchwork Thu Oct 5 18:42:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Victor Nogueira
X-Patchwork-Id: 13410624
X-Patchwork-Delegate: kuba@kernel.org
From: Victor Nogueira
To: jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us,
 davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
 kuba@kernel.org
Cc: mleitner@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com,
 pctammela@mojatatu.com, netdev@vger.kernel.org, kernel@mojatatu.com
Subject: [PATCH net-next v4 1/3] net/sched: Introduce tc block netdev tracking infra
Date: Thu, 5 Oct 2023 15:42:26 -0300
Message-ID: <20231005184228.467845-2-victor@mojatatu.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231005184228.467845-1-victor@mojatatu.com>
References: <20231005184228.467845-1-victor@mojatatu.com>
X-Mailing-List: netdev@vger.kernel.org

A tc block is a collection of netdevs/ports that allows qdiscs to share
filter block instances (as opposed to the traditional per-port tc
filters).

Example:
$ tc qdisc add dev ens7 ingress block 22
$ tc qdisc add dev ens8 ingress block 22

Now we can add a filter using the block index:
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action drop

Up to this point, the block is unaware of its ports. This patch fixes
that and makes the tc block ports available to the datapath.

Suggested-by: Jiri Pirko
Co-developed-by: Jamal Hadi Salim
Signed-off-by: Jamal Hadi Salim
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Victor Nogueira
---
 include/net/sch_generic.h |  4 +++
 net/sched/cls_api.c       |  2 ++
 net/sched/sch_api.c       | 58 +++++++++++++++++++++++++++++++++++++++
 net/sched/sch_generic.c   | 34 +++++++++++++++++++++--
 4 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index c7318c73cfd6..a01979b0a2a1 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 struct Qdisc_ops;
 struct qdisc_walker;
@@ -126,6 +127,8 @@ struct Qdisc {
 
 	struct rcu_head		rcu;
 	netdevice_tracker	dev_tracker;
+	netdevice_tracker	in_block_tracker;
+	netdevice_tracker	eg_block_tracker;
 	/* private data */
 	long privdata[] ____cacheline_aligned;
 };
@@ -458,6 +461,7 @@ struct tcf_chain {
 };
 
 struct tcf_block {
+	struct xarray ports; /* datapath accessible */
 	/* Lock protects tcf_block and lifetime-management data of chains
 	 * attached to the block (refcnt, action_refcnt, explicitly_created).
 	 */
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a193cc7b3241..06b55344a948 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -531,6 +531,7 @@ static void tcf_block_destroy(struct tcf_block *block)
 {
 	mutex_destroy(&block->lock);
 	mutex_destroy(&block->proto_destroy_lock);
+	xa_destroy(&block->ports);
 	kfree_rcu(block, rcu);
 }
 
@@ -1003,6 +1004,7 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	refcount_set(&block->refcnt, 1);
 	block->net = net;
 	block->index = block_index;
+	xa_init(&block->ports);
 
 	/* Don't store q pointer for blocks which are shared */
 	if (!tcf_block_shared(block))
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index e9eaf637220e..66543e4d6cdc 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1180,6 +1180,60 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 	return 0;
 }
 
+static int qdisc_block_add_dev(struct Qdisc *sch, struct net_device *dev,
+			       struct nlattr **tca,
+			       struct netlink_ext_ack *extack)
+{
+	const struct Qdisc_class_ops *cl_ops = sch->ops->cl_ops;
+	struct tcf_block *in_block = NULL;
+	struct tcf_block *eg_block = NULL;
+	unsigned long cl = 0;
+	int err;
+
+	if (tca[TCA_INGRESS_BLOCK]) {
+		/* works for both ingress and clsact */
+		cl = TC_H_MIN_INGRESS;
+		in_block = cl_ops->tcf_block(sch, cl, NULL);
+		if (!in_block) {
+			NL_SET_ERR_MSG(extack, "Shared ingress block missing");
+			return -EINVAL;
+		}
+
+		err = xa_insert(&in_block->ports, dev->ifindex, dev, GFP_KERNEL);
+		if (err) {
+			NL_SET_ERR_MSG(extack, "ingress block dev insert failed");
+			return err;
+		}
+
+		netdev_hold(dev, &sch->in_block_tracker, GFP_KERNEL);
+	}
+
+	if (tca[TCA_EGRESS_BLOCK]) {
+		cl = TC_H_MIN_EGRESS;
+		eg_block = cl_ops->tcf_block(sch, cl, NULL);
+		if (!eg_block) {
+			NL_SET_ERR_MSG(extack, "Shared egress block missing");
+			err = -EINVAL;
+			goto err_out;
+		}
+
+		err = xa_insert(&eg_block->ports, dev->ifindex, dev, GFP_KERNEL);
+		if (err) {
+			NL_SET_ERR_MSG(extack, "Egress block dev insert failed");
+			goto err_out;
+		}
+		netdev_hold(dev, &sch->eg_block_tracker, GFP_KERNEL);
+	}
+
+	return 0;
+err_out:
+	if (in_block) {
+		xa_erase(&in_block->ports, dev->ifindex);
+		netdev_put(dev, &sch->in_block_tracker);
+	}
+	return err;
+}
+
 static int qdisc_block_indexes_set(struct Qdisc *sch, struct nlattr **tca,
 				   struct netlink_ext_ack *extack)
 {
@@ -1350,6 +1404,10 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 	qdisc_hash_add(sch, false);
 	trace_qdisc_create(ops, dev, parent);
 
+	err = qdisc_block_add_dev(sch, dev, tca, extack);
+	if (err)
+		goto err_out4;
+
 	return sch;
 
 err_out4:
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 4195a4bc26ca..b0c28b2ee713 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1049,7 +1049,12 @@ static void qdisc_free_cb(struct rcu_head *head)
 
 static void __qdisc_destroy(struct Qdisc *qdisc)
 {
-	const struct Qdisc_ops *ops = qdisc->ops;
+	struct net_device *dev = qdisc_dev(qdisc);
+	const struct Qdisc_ops *ops = qdisc->ops;
+	const struct Qdisc_class_ops *cops;
+	struct tcf_block *block;
+	unsigned long cl;
+	u32 block_index;
 
 #ifdef CONFIG_NET_SCHED
 	qdisc_hash_del(qdisc);
@@ -1060,11 +1065,36 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
 
 	qdisc_reset(qdisc);
 
+	cops = ops->cl_ops;
+	if (ops->ingress_block_get) {
+		block_index = ops->ingress_block_get(qdisc);
+		if (block_index) {
+			cl = TC_H_MIN_INGRESS;
+			block = cops->tcf_block(qdisc, cl, NULL);
+			if (block) {
+				if (xa_erase(&block->ports, dev->ifindex))
+					netdev_put(dev, &qdisc->in_block_tracker);
+			}
+		}
+	}
+
+	if (ops->egress_block_get) {
+		block_index = ops->egress_block_get(qdisc);
+		if (block_index) {
+			cl = TC_H_MIN_EGRESS;
+			block = cops->tcf_block(qdisc, cl, NULL);
+			if (block) {
+				if (xa_erase(&block->ports, dev->ifindex))
+					netdev_put(dev, &qdisc->eg_block_tracker);
+			}
+		}
+	}
+
 	if (ops->destroy)
 		ops->destroy(qdisc);
 
 	module_put(ops->owner);
-	netdev_put(qdisc_dev(qdisc), &qdisc->dev_tracker);
+	netdev_put(dev, &qdisc->dev_tracker);
 
 	trace_qdisc_destroy(qdisc);
 

From patchwork Thu Oct 5 18:42:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Victor Nogueira
X-Patchwork-Id: 13410625
X-Patchwork-Delegate: kuba@kernel.org
From: Victor Nogueira
To: jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us,
 davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
 kuba@kernel.org
Cc: mleitner@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com,
 pctammela@mojatatu.com, netdev@vger.kernel.org, kernel@mojatatu.com
Subject: [PATCH net-next v4 2/3] net/sched: cls_api: Expose tc block to the datapath
Date: Thu, 5 Oct 2023 15:42:27 -0300
Message-ID: <20231005184228.467845-3-victor@mojatatu.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231005184228.467845-1-victor@mojatatu.com>
References: <20231005184228.467845-1-victor@mojatatu.com>
X-Mailing-List: netdev@vger.kernel.org

The datapath can now find the block of the port on which the packet
arrived. It can then use the block for various activities.

The next patch shows a simple action that multicasts to all ports in the
block except the port on which the packet arrived.

Co-developed-by: Jamal Hadi Salim
Signed-off-by: Jamal Hadi Salim
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Victor Nogueira
---
 include/net/sch_generic.h |  4 ++++
 net/sched/cls_api.c       | 10 +++++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a01979b0a2a1..03ab3730ba09 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -440,6 +440,8 @@ struct qdisc_skb_cb {
 	};
 #define QDISC_CB_PRIV_LEN 20
 	unsigned char		data[QDISC_CB_PRIV_LEN];
+	/* This should allow eBPF to continue to align */
+	u32			block_index;
 };
 
 typedef void tcf_chain_head_change_t(struct tcf_proto *tp_head, void *priv);
@@ -488,6 +490,8 @@ struct tcf_block {
 	struct mutex proto_destroy_lock; /* Lock for proto_destroy hashtable. */
 };
 
+struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index);
+
 static inline bool lockdep_tcf_chain_is_locked(struct tcf_chain *chain)
 {
 	return lockdep_is_held(&chain->filter_chain_lock);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 06b55344a948..c102fe26ac5e 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1012,12 +1012,13 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	return block;
 }
 
-static struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
+struct tcf_block *tcf_block_lookup(struct net *net, u32 block_index)
 {
 	struct tcf_net *tn = net_generic(net, tcf_net_id);
 
 	return idr_find(&tn->idr, block_index);
 }
+EXPORT_SYMBOL(tcf_block_lookup);
 
 static struct tcf_block *tcf_block_refcnt_get(struct net *net, u32 block_index)
 {
@@ -1738,9 +1739,13 @@ int tcf_classify(struct sk_buff *skb,
 		 const struct tcf_proto *tp,
 		 struct tcf_result *res, bool compat_mode)
 {
+	struct qdisc_skb_cb *qdisc_cb = qdisc_skb_cb(skb);
+
 #if !IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
 	u32 last_executed_chain = 0;
 
+	qdisc_cb->block_index = block ? block->index : 0;
+
 	return __tcf_classify(skb, tp, tp, res, compat_mode, NULL, 0,
 			      &last_executed_chain);
 #else
@@ -1752,6 +1757,7 @@ int tcf_classify(struct sk_buff *skb,
 	int ret;
 
 	if (block) {
+		qdisc_cb->block_index = block->index;
 		ext = skb_ext_find(skb, TC_SKB_EXT);
 
 		if (ext && (ext->chain || ext->act_miss)) {
@@ -1779,6 +1785,8 @@ int tcf_classify(struct sk_buff *skb,
 			tp = rcu_dereference_bh(fchain->filter_chain);
 			last_executed_chain = fchain->index;
 		}
+	} else {
+		qdisc_cb->block_index = 0;
 	}
 
 	ret = __tcf_classify(skb, tp, orig_tp, res, compat_mode, n, act_index,

From patchwork Thu Oct 5 18:42:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Victor Nogueira
X-Patchwork-Id: 13410626
X-Patchwork-Delegate: kuba@kernel.org
From: Victor Nogueira
To: jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us,
 davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
 kuba@kernel.org
Cc: mleitner@redhat.com, vladbu@nvidia.com, simon.horman@corigine.com,
 pctammela@mojatatu.com, netdev@vger.kernel.org, kernel@mojatatu.com
Subject: [PATCH net-next v4 3/3] net/sched: act_blockcast: Introduce blockcast tc action
Date: Thu, 5 Oct 2023 15:42:28 -0300
Message-ID: <20231005184228.467845-4-victor@mojatatu.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231005184228.467845-1-victor@mojatatu.com>
References: <20231005184228.467845-1-victor@mojatatu.com>
X-Mailing-List: netdev@vger.kernel.org

This action takes advantage of the tc block port set being available in
the datapath and broadcasts a packet to all ports in that set, with the
exception of the port on which it arrived.

Example usage:
$ tc qdisc add dev ens7 ingress block 22
$ tc qdisc add dev ens8 ingress block 22

Now we can add a filter using the block index:
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action blockcast

Co-developed-by: Jamal Hadi Salim
Signed-off-by: Jamal Hadi Salim
Co-developed-by: Pedro Tammela
Signed-off-by: Pedro Tammela
Signed-off-by: Victor Nogueira
---
 include/net/tc_wrapper.h     |   5 +
 include/uapi/linux/pkt_cls.h |   1 +
 net/sched/Kconfig            |  13 ++
 net/sched/Makefile           |   1 +
 net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
 5 files changed, 317 insertions(+)
 create mode 100644 net/sched/act_blockcast.c

diff --git a/include/net/tc_wrapper.h b/include/net/tc_wrapper.h
index a6d481b5bcbc..8ef848968be7 100644
--- a/include/net/tc_wrapper.h
+++ b/include/net/tc_wrapper.h
@@ -28,6 +28,7 @@ TC_INDIRECT_ACTION_DECLARE(tcf_csum_act);
 TC_INDIRECT_ACTION_DECLARE(tcf_ct_act);
 TC_INDIRECT_ACTION_DECLARE(tcf_ctinfo_act);
 TC_INDIRECT_ACTION_DECLARE(tcf_gact_act);
+TC_INDIRECT_ACTION_DECLARE(tcf_blockcast_run);
 TC_INDIRECT_ACTION_DECLARE(tcf_gate_act);
 TC_INDIRECT_ACTION_DECLARE(tcf_ife_act);
 TC_INDIRECT_ACTION_DECLARE(tcf_ipt_act);
@@ -57,6 +58,10 @@ static inline int tc_act(struct sk_buff *skb, const struct tc_action *a,
 	if (a->ops->act == tcf_mirred_act)
 		return tcf_mirred_act(skb, a, res);
 #endif
+#if IS_BUILTIN(CONFIG_NET_ACT_BLOCKCAST)
+	if (a->ops->act == tcf_blockcast_run)
+		return tcf_blockcast_run(skb, a, res);
+#endif
 #if IS_BUILTIN(CONFIG_NET_ACT_PEDIT)
 	if (a->ops->act == tcf_pedit_act)
 		return tcf_pedit_act(skb, a, res);
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index c7082cc60d21..e12fc51c1be1 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -139,6 +139,7 @@ enum tca_id {
 	TCA_ID_MPLS,
 	TCA_ID_CT,
 	TCA_ID_GATE,
+	TCA_ID_BLOCKCAST,
 	/* other actions go here */
 	__TCA_ID_MAX = 255
 };
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 470c70deffe2..abf26f0c921f 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -780,6 +780,19 @@ config NET_ACT_SIMP
 	  To compile this code as a module, choose M here: the
 	  module will be called act_simple.
 
+config NET_ACT_BLOCKCAST
+	tristate "TC block Multicast"
+	depends on NET_CLS_ACT
+	help
+	  Say Y here to add an action that will multicast an skb to the egress
+	  of all netdevs that belong to a tc block, except for the netdev on
+	  which the skb arrived.
+
+	  If unsure, say N.
+
+	  To compile this code as a module, choose M here: the
+	  module will be called act_blockcast.
+
 config NET_ACT_SKBEDIT
 	tristate "SKB Editing"
 	depends on NET_CLS_ACT
diff --git a/net/sched/Makefile b/net/sched/Makefile
index b5fd49641d91..2cdcf30645eb 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_NET_ACT_IPT)	+= act_ipt.o
 obj-$(CONFIG_NET_ACT_NAT)	+= act_nat.o
 obj-$(CONFIG_NET_ACT_PEDIT)	+= act_pedit.o
 obj-$(CONFIG_NET_ACT_SIMP)	+= act_simple.o
+obj-$(CONFIG_NET_ACT_BLOCKCAST)	+= act_blockcast.o
 obj-$(CONFIG_NET_ACT_SKBEDIT)	+= act_skbedit.o
 obj-$(CONFIG_NET_ACT_CSUM)	+= act_csum.o
 obj-$(CONFIG_NET_ACT_MPLS)	+= act_mpls.o
diff --git a/net/sched/act_blockcast.c b/net/sched/act_blockcast.c
new file mode 100644
index 000000000000..047023cba749
--- /dev/null
+++ b/net/sched/act_blockcast.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * net/sched/act_blockcast.c	Block Cast action
+ * Copyright (c) 2023, Mojatatu Networks
+ * Authors:	Jamal Hadi Salim
+ *		Victor Nogueira
+ *		Pedro Tammela
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+static struct tc_action_ops act_blockcast_ops;
+
+struct tcf_blockcast_act {
+	struct tc_action	common;
+};
+
+#define to_blockcast_act(a) ((struct tcf_blockcast_act *)a)
+
+#define CAST_RECURSION_LIMIT 4
+
+static DEFINE_PER_CPU(unsigned int, redirect_rec_level);
+
+static int cast_one(struct sk_buff *skb, const u32 ifindex)
+{
+	struct sk_buff *skb2 = skb;
+	int retval = TC_ACT_PIPE;
+	struct net_device *dev;
+	unsigned int rec_level;
+	bool expects_nh;
+	int mac_len;
+	bool at_nh;
+	int err;
+
+	rec_level = __this_cpu_inc_return(redirect_rec_level);
+	if (unlikely(rec_level > CAST_RECURSION_LIMIT)) {
+		net_warn_ratelimited("blockcast: exceeded redirect recursion limit on dev %s\n",
+				     netdev_name(skb->dev));
+		goto out_shot;
+	}
+
+	dev = dev_get_by_index_rcu(dev_net(skb->dev), ifindex);
+	if (unlikely(!dev))
+		goto out_shot;
+
+	if (unlikely(!(dev->flags & IFF_UP) || !netif_carrier_ok(dev)))
+		goto out_shot;
+
+	skb2 = skb_clone(skb, GFP_ATOMIC);
+	if (!skb2)
+		goto out_shot;
+
+	nf_reset_ct(skb2);
+
+	expects_nh = !dev_is_mac_header_xmit(dev);
+	at_nh = skb->data == skb_network_header(skb);
+	if (at_nh != expects_nh) {
+		mac_len = skb_at_tc_ingress(skb) ?
+				  skb->mac_len :
+				  skb_network_header(skb) - skb_mac_header(skb);
+
+		if (expects_nh) {
+			/* target device/action expect data at nh */
+			skb_pull_rcsum(skb2, mac_len);
+		} else {
+			/* target device/action expect data at mac */
+			skb_push_rcsum(skb2, mac_len);
+		}
+	}
+
+	skb2->skb_iif = skb->dev->ifindex;
+	skb2->dev = dev;
+
+	err = dev_queue_xmit(skb2);
+	if (err)
+		goto out_shot;
+
+	goto rec_level_dec;
+
+out_shot:
+	retval = TC_ACT_SHOT;
+
+rec_level_dec:
+	__this_cpu_dec(redirect_rec_level);
+
+	return retval;
+}
+
+TC_INDIRECT_SCOPE int tcf_blockcast_run(struct sk_buff *skb,
+					const struct tc_action *a,
+					struct tcf_result *res)
+{
+	u32 block_index = qdisc_skb_cb(skb)->block_index;
+	struct tcf_blockcast_act *p = to_blockcast_act(a);
+	int action = READ_ONCE(p->tcf_action);
+	struct net *net = dev_net(skb->dev);
+	struct tcf_block *block;
+	struct net_device *dev;
+	u32 exception_ifindex;
+	unsigned long index;
+
+	block = tcf_block_lookup(net, block_index);
+	exception_ifindex = skb->dev->ifindex;
+
+	tcf_action_update_bstats(&p->common, skb);
+	tcf_lastuse_update(&p->tcf_tm);
+
+	if (!block || xa_empty(&block->ports))
+		goto act_done;
+
+	/* we are already under rcu protection, so iterating block is safe */
+	xa_for_each(&block->ports, index, dev) {
+		int err;
+
+		if (index == exception_ifindex)
+			continue;
+
+		err = cast_one(skb, dev->ifindex);
+		if (err != TC_ACT_PIPE)
+			tcf_action_inc_overlimit_qstats(&p->common);
+	}
+
+act_done:
+	if (action == TC_ACT_SHOT)
+		tcf_action_inc_drop_qstats(&p->common);
+	return action;
+}
+
+static const struct nla_policy blockcast_policy[TCA_DEF_MAX + 1] = {
+	[TCA_DEF_PARMS]	= { .len = sizeof(struct tc_defact) },
+};
+
+static int tcf_blockcast_init(struct net *net, struct nlattr *nla,
+			      struct nlattr *est, struct tc_action **a,
+			      struct tcf_proto *tp, u32 flags,
+			      struct netlink_ext_ack *extack)
+{
+	struct tc_action_net *tn = net_generic(net, act_blockcast_ops.net_id);
+	struct tcf_blockcast_act *p = to_blockcast_act(a);
+	bool bind = flags & TCA_ACT_FLAGS_BIND;
+	struct nlattr *tb[TCA_DEF_MAX + 1];
+	struct tcf_chain *goto_ch = NULL;
+	struct tc_defact *parm;
+	bool exists = false;
+	int ret = 0, err;
+	u32 index;
+
+	if (!nla)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, TCA_DEF_MAX, nla,
+			       blockcast_policy, extack);
+	if (err < 0)
+		return err;
+
+	if (!tb[TCA_DEF_PARMS])
+		return -EINVAL;
+
+	parm = nla_data(tb[TCA_DEF_PARMS]);
+	index = parm->index;
+
+	err = tcf_idr_check_alloc(tn, &index, a, bind);
+	if (err < 0)
+		return err;
+
+	exists = err;
+	if (exists && bind)
+		return 0;
+
+	if (!exists) {
+		ret = tcf_idr_create_from_flags(tn, index, est, a,
+						&act_blockcast_ops, bind, flags);
+		if (ret) {
+			tcf_idr_cleanup(tn, index);
+			return ret;
+		}
+
+		ret = ACT_P_CREATED;
+	} else {
+		if (!(flags & TCA_ACT_FLAGS_REPLACE)) {
+			err = -EEXIST;
+			goto release_idr;
+		}
+	}
+
+	err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
+	if (err < 0)
+		goto release_idr;
+
+	if (exists) {
+		spin_lock_bh(&p->tcf_lock);
+		goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
+		spin_unlock_bh(&p->tcf_lock);
+	} else {
+		goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
+	}
+
+	if (goto_ch)
+		tcf_chain_put_by_act(goto_ch);
+
+	return ret;
+release_idr:
+	tcf_idr_release(*a, bind);
+	return err;
+}
+
+static int tcf_blockcast_dump(struct sk_buff *skb, struct tc_action *a,
+			      int bind, int ref)
+{
+	unsigned char *b = skb_tail_pointer(skb);
+	struct tcf_blockcast_act *p = to_blockcast_act(a);
+	struct tc_defact opt = {
+		.index   = p->tcf_index,
+		.refcnt  = refcount_read(&p->tcf_refcnt) - ref,
+		.bindcnt = atomic_read(&p->tcf_bindcnt) - bind,
+	};
+	struct tcf_t t;
+
+	spin_lock_bh(&p->tcf_lock);
+	opt.action = p->tcf_action;
+	if (nla_put(skb, TCA_DEF_PARMS, sizeof(opt), &opt))
+		goto nla_put_failure;
+
+	tcf_tm_dump(&t, &p->tcf_tm);
+	if (nla_put_64bit(skb, TCA_DEF_TM, sizeof(t), &t, TCA_DEF_PAD))
+		goto nla_put_failure;
+	spin_unlock_bh(&p->tcf_lock);
+
+	return skb->len;
+
+nla_put_failure:
+	spin_unlock_bh(&p->tcf_lock);
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+static struct tc_action_ops act_blockcast_ops = {
+	.kind	=	"blockcast",
+	.id	=	TCA_ID_BLOCKCAST,
+	.owner	=	THIS_MODULE,
+	.act	=	tcf_blockcast_run,
+	.dump	=	tcf_blockcast_dump,
+	.init	=	tcf_blockcast_init,
+	.size	=	sizeof(struct tcf_blockcast_act),
+};
+
+static __net_init int blockcast_init_net(struct net *net)
+{
+	struct tc_action_net *tn = net_generic(net, act_blockcast_ops.net_id);
+
+	return tc_action_net_init(net, tn, &act_blockcast_ops);
+}
+
+static void __net_exit blockcast_exit_net(struct list_head *net_list)
+{
+	tc_action_net_exit(net_list, act_blockcast_ops.net_id);
+}
+
+static struct pernet_operations blockcast_net_ops = {
+	.init = blockcast_init_net,
+	.exit_batch = blockcast_exit_net,
+	.id   = &act_blockcast_ops.net_id,
+	.size = sizeof(struct tc_action_net),
+};
+
+MODULE_AUTHOR("Mojatatu Networks, Inc");
+MODULE_DESCRIPTION("Action to broadcast to devices on a block");
+MODULE_LICENSE("GPL");
+
+static int __init blockcast_init_module(void)
+{
+	int ret = tcf_register_action(&act_blockcast_ops, &blockcast_net_ops);
+
+	if (!ret)
+		pr_info("blockcast TC action Loaded\n");
+	return ret;
+}
+
+static void __exit blockcast_cleanup_module(void)
+{
+	tcf_unregister_action(&act_blockcast_ops, &blockcast_net_ops);
+}
+
+module_init(blockcast_init_module);
+module_exit(blockcast_cleanup_module);