From patchwork Mon May 22 23:53:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peilin Ye X-Patchwork-Id: 13251238 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BDAA2174CD for ; Mon, 22 May 2023 23:53:50 +0000 (UTC) Received: from mail-oo1-xc36.google.com (mail-oo1-xc36.google.com [IPv6:2607:f8b0:4864:20::c36]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 816062136; Mon, 22 May 2023 16:53:49 -0700 (PDT) Received: by mail-oo1-xc36.google.com with SMTP id 006d021491bc7-55517975c5fso1530038eaf.2; Mon, 22 May 2023 16:53:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684799629; x=1687391629; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cka+L/EhW69q5eBXnddBNePu0LAvLr19Skc97ejFT5k=; b=hs6qEjLryiJCm6ldYmWxw2AlTIkJ2sUZbIaFF/7Q92s6BjF4GhM9/kciURhAEPPZRV Kn74+HhFeUEQ/h7Y5VbSjw8apuf4BgysdzxJo70TQ+cNMggpIJGOG7Lve6d+lOxEmdOS Jk1BAjiixJ4tcl82DQ9fsE9nm2E1BwzScLj5s3AS/BT3RMaNxmeY8TnkVSaeL0tstJZn 1lr1A+w2EM5nfq3w7is1/TNaf1zp25/RaObGRPtdR/0KGXrsYMrdN891ccGOktQCPbmc JsdR+uqtrp5F5GQKDdsRIBKIQXZ498HV0iUAC94sE3vGmN9YFMOUrCwDHLOvHV7/fDRF toHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684799629; x=1687391629; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cka+L/EhW69q5eBXnddBNePu0LAvLr19Skc97ejFT5k=; b=e/0AmTBn1COboxlzQ6ATk35djKODnN5R0jKM95A4MztzgUDTY3AAjkBNss1PbXk1qM S4+XWrGy0nt81aV4H9nVOOS5eexKUBZ7wuS7RHu5xJ0Fx34715CkrebNG6JJdC9tlVVm 
j7O7CfhZLI4Ik9audqz3bU2MUJJ42beEaHZWU+7o6qZSA63DTMtvcJ7x3iiiHWm49zhu YXhNsBPYd9XW9648wjO7KFcLU3fguFgC8LpkkKmli2U9yurG7viT6dSrpN6PM1cevC7d VtMv1USqGGhoEVpIAl6zGqC7GxMUdMUY+AackoxMnbnVCNC1oUQl/XMKj1nEiQVbCkGf Np7g==
X-Gm-Message-State: AC+VfDxmMTUyauWlQb86buP/AhNl79FPi34rqHRSrK1F4kX0JUP6NTfy l//XBFQMicvMLKPBjk/F1X3qN6kO8A==
X-Google-Smtp-Source: ACHHUZ4DzvUKvWyGcLzLNdX+tTZe+1UHe1hdq209PHohRpA+pS3ny1318P9CX9FMvQsaKZ6rt/gnoQ==
X-Received: by 2002:a4a:9c50:0:b0:54f:b59c:256a with SMTP id c16-20020a4a9c50000000b0054fb59c256amr6010050ook.4.1684799628640; Mon, 22 May 2023 16:53:48 -0700 (PDT)
Received: from C02FL77VMD6R.bytedance.net ([208.184.112.130]) by smtp.gmail.com with ESMTPSA id x131-20020a4a4189000000b0055210b1a91csm3201666ooa.3.2023.05.22.16.53.47 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 May 2023 16:53:48 -0700 (PDT)
From: Peilin Ye
X-Google-Original-From: Peilin Ye
To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jamal Hadi Salim , Cong Wang , Jiri Pirko
Cc: Peilin Ye , Daniel Borkmann , John Fastabend , Vlad Buslov , Pedro Tammela , Hillf Danton , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Cong Wang , Peilin Ye
Subject: [PATCH v2 net 1/6] net/sched: sch_ingress: Only create under TC_H_INGRESS
Date: Mon, 22 May 2023 16:53:33 -0700
Message-Id:
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

ingress Qdiscs are only supposed to be created under TC_H_INGRESS.
Similar to mq_init(), return -EOPNOTSUPP if 'parent' is not
TC_H_INGRESS.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Reviewed-by: Jamal Hadi Salim
Acked-by: Jamal Hadi Salim
Signed-off-by: Peilin Ye
---
 net/sched/sch_ingress.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 84838128b9c5..3d71f7a3b4ad 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -80,6 +80,9 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt,
 	struct net_device *dev = qdisc_dev(sch);
 	int err;
 
+	if (sch->parent != TC_H_INGRESS)
+		return -EOPNOTSUPP;
+
 	net_inc_ingress_queue();
 
 	mini_qdisc_pair_init(&q->miniqp, sch, &dev->miniq_ingress);

From patchwork Mon May 22 23:54:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peilin Ye
X-Patchwork-Id: 13251239
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A39461773E for ; Mon, 22 May 2023 23:54:15 +0000 (UTC)
Received: from mail-ot1-x336.google.com (mail-ot1-x336.google.com [IPv6:2607:f8b0:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 608252135; Mon, 22 May 2023 16:54:14 -0700 (PDT)
Received: by mail-ot1-x336.google.com with SMTP id 46e09a7af769-6af7593ed5fso821351a34.0; Mon, 22 May 2023 16:54:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684799653; x=1687391653; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Eg7+MTtOoGginTkEuFUGcJMan5HKcjTNLO6/cJFDlM4=;
b=hE+VlVPBniLddnIkLUExM3o1tiW7XrYfN2tezmYlTt87txsC7f9BRTYDdQiyAJhHxm BF17RE/+lSVmvCEg213hhwkePkJvdCWY3waFnayDGJm6vsvmRKSj0CBUvGvsjgkM2wbS Hlj722/CmP6utyeRySZK1l2oajoWjggphwHPgWzFyXFbcM3Q3qoTvB6OUN+gthu5deEk UUgobDZMccRc2G6f7Ln+TdsH/kbq1hgFAS3rlli4tM0IKzo8ZtFIDroz8xItAlaFqoL+ P1Xs+Fg9vieaMtT1mgng2x7eqbaU7/t0NLWQeeq3u7+lXGukqrhDEifjKRMKrqMu82xg Msxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684799653; x=1687391653; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Eg7+MTtOoGginTkEuFUGcJMan5HKcjTNLO6/cJFDlM4=; b=kp2r4EefMsUnN2F16V85m0NMrKueuQblNHqcJbcH9QRU80j11fniPwepF+fZf06han hfEsOlPoVa8YxJrTqlMXxksgG3XtQNQZ5Rl9RS9RMKTZxRftCEyKEnd5yYEvfztGg2DY Y9wluuh3Lw+cganHnLcj20WJMmFHaIXYlyxUD037YFbphmrXo5Fms5GGsCa/cmCEaHNA zXzjxPKzmaDYYZW0EwrgaDD/hmKo/IcxIHu/SvKf4+/mt+VgVJvbemx9e+V1r7Mizmtb rAd/12HksXaA/IU+Lg5/GES1HvferpKYmoR7aR9Un6yMEliN524YGGNDj1UIDDwNjkJv 9PGQ== X-Gm-Message-State: AC+VfDxwaBnhxheApI+GdPUN+3AFxVlUx1mWa7WYDBiawAlKW0/swaoM hu7MvtJkQCNFQko1c69uAA== X-Google-Smtp-Source: ACHHUZ4uUCdiFPY96wSRlMhGzY9cpAXsWfixLgugBHd+AcjTGacgnN0bQigh7sZ0ruSZ2zNRZxCkiw== X-Received: by 2002:a9d:6e19:0:b0:6af:7856:5d55 with SMTP id e25-20020a9d6e19000000b006af78565d55mr2673148otr.22.1684799653550; Mon, 22 May 2023 16:54:13 -0700 (PDT) Received: from C02FL77VMD6R.bytedance.net ([208.184.112.130]) by smtp.gmail.com with ESMTPSA id u21-20020a056830119500b006af800065f2sm1318813otq.59.2023.05.22.16.54.12 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 May 2023 16:54:13 -0700 (PDT) From: Peilin Ye X-Google-Original-From: Peilin Ye To: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jamal Hadi Salim , Cong Wang , Jiri Pirko
Cc: Peilin Ye , Daniel Borkmann , John Fastabend , Vlad Buslov , Pedro Tammela , Hillf Danton , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Cong Wang , Peilin Ye
Subject: [PATCH v2 net 2/6] net/sched: sch_clsact: Only create under TC_H_CLSACT
Date: Mon, 22 May 2023 16:54:02 -0700
Message-Id:
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

clsact Qdiscs are only supposed to be created under TC_H_CLSACT (which
equals TC_H_INGRESS).  Return -EOPNOTSUPP if 'parent' is not
TC_H_CLSACT.
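
For illustration only (not part of the patch), the relationship between
TC_H_CLSACT, TC_H_INGRESS and the ffff:fff1 notation can be sketched in
user-space Python.  The constant values are assumed to mirror
include/uapi/linux/pkt_sched.h; tc_h_make(), format_handle() and
clsact_init_check() are hypothetical helpers that only model the check
added to clsact_init().

```python
import errno

# Assumed to mirror include/uapi/linux/pkt_sched.h.
TC_H_MAJ_MASK = 0xFFFF0000
TC_H_MIN_MASK = 0x0000FFFF
TC_H_ROOT = 0xFFFFFFFF
TC_H_INGRESS = 0xFFFFFFF1
TC_H_CLSACT = TC_H_INGRESS


def tc_h_make(major, minor):
    """Combine a major and a minor number into one 32-bit handle."""
    return (major & TC_H_MAJ_MASK) | (minor & TC_H_MIN_MASK)


def format_handle(handle):
    """Render a handle roughly the way tc does, e.g. 'ffff:fff1' or 'ffff:'."""
    major = (handle & TC_H_MAJ_MASK) >> 16
    minor = handle & TC_H_MIN_MASK
    return "%x:%x" % (major, minor) if minor else "%x:" % major


def clsact_init_check(parent):
    """Hypothetical model of the new check: only TC_H_CLSACT is accepted."""
    return 0 if parent == TC_H_CLSACT else -errno.EOPNOTSUPP


print(format_handle(TC_H_CLSACT))                 # parent ID of a clsact Qdisc
print(format_handle(tc_h_make(TC_H_CLSACT, 0)))   # its own handle
print(clsact_init_check(TC_H_ROOT))               # rejected parent
```

The first two lines printed correspond to the "parent ffff:fff1" and
"ffff:" fields that "tc qdisc show" reports for a clsact Qdisc.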

Fixes: 1f211a1b929c ("net, sched: add clsact qdisc")
Reviewed-by: Jamal Hadi Salim
Acked-by: Jamal Hadi Salim
Signed-off-by: Peilin Ye
---
 net/sched/sch_ingress.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 3d71f7a3b4ad..13218a1fe4a5 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -222,6 +222,9 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
 	struct net_device *dev = qdisc_dev(sch);
 	int err;
 
+	if (sch->parent != TC_H_CLSACT)
+		return -EOPNOTSUPP;
+
 	net_inc_ingress_queue();
 	net_inc_egress_queue();

From patchwork Mon May 22 23:54:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peilin Ye
X-Patchwork-Id: 13251240
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 69C0A17739 for ; Mon, 22 May 2023 23:54:40 +0000 (UTC)
Received: from mail-oa1-x34.google.com (mail-oa1-x34.google.com [IPv6:2001:4860:4864:20::34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 043BD213A; Mon, 22 May 2023 16:54:39 -0700 (PDT)
Received: by mail-oa1-x34.google.com with SMTP id 586e51a60fabf-19c8dd7c258so1579282fac.1; Mon, 22 May 2023 16:54:38 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684799678; x=1687391678; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=IYDaMH2sBj0TTvJJ8lNQXKiQGR7svLdvqXQQFVpqjrk=; b=ZQtEtEbF+b1g4Hw7nC0Ii2VHFnrw+1rG8EjKgrtVPZEeEVQXPJjvYBFV3T+SnGiITV rQBN8CAob7xOG/o5BfUPuAJFztbHkhBWsplildS0v4v6mTRJYyOTYdbSOQKqMXy89HMQ uAPAdkdS3xMkWZ5yRk+KnyrMjV+U8FsdQ+VYOl0yZNf33Ren6fiRTYV+LM2uSLpYf4z4
V/0NGlmF2h1zn/0c+dSW9pX6EpBsVDQdPKhP1aq1WJynr+6F16injL7EjFfQXObx6gCW fFOjxGQUWAV3mM0gWx4lunknLHNTrzOx498PetTk+/0VVG+mdsze6lk388CLHTaWNZXP aCPg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684799678; x=1687391678; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IYDaMH2sBj0TTvJJ8lNQXKiQGR7svLdvqXQQFVpqjrk=; b=AQvEe+/YNPwj2+EQHMFgOnkoNIJCYRD9P+42rU2Vo137aHPoLzIhVXuHUNlBwsDHG7 2YH3xJq36YcQvexi6xNr4lss2Ymvok3w/RyvEOhMtovz4bk8bT67LG54rRlXVPqN3e5X 9jPcOIgGFnbcfxOvq2lI6ijm5+ZI4fB1gPW4O3p+6/W7DY90PThOOS2kXf4n9JfLLYd5 4VfeDgRAbWldOtKeimnoHFDz5uh0D1tv/3lcraE4OygDaw2h01axMNBzFRZIFHM4ap3K yb/4Qa5r0qotEmKkQLnnJkFl4Y9Oyeob+dbzd6XIN86wYeFmwn3xooZkwOeRD4kPzTkO hZCg== X-Gm-Message-State: AC+VfDwoAWINLd/m/wkde5x9a6QXG6Z9RAlr3MnEuQBRbkzqcL4zQV32 iD04zimCtqYzs3jV7XVrJg== X-Google-Smtp-Source: ACHHUZ5pnoZtuCuzZD95IQRxq+3LkdbdnBsqXvHBFGaXwa/8oP7GDL0zttnT1lTpD543cHk4ZG8IaA== X-Received: by 2002:a05:6870:5a93:b0:19d:767:9da2 with SMTP id dt19-20020a0568705a9300b0019d07679da2mr2910634oab.3.1684799678293; Mon, 22 May 2023 16:54:38 -0700 (PDT) Received: from C02FL77VMD6R.bytedance.net ([208.184.112.130]) by smtp.gmail.com with ESMTPSA id j21-20020a9d7695000000b006ac98aae2d3sm2928959otl.40.2023.05.22.16.54.36 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 May 2023 16:54:38 -0700 (PDT) From: Peilin Ye X-Google-Original-From: Peilin Ye To: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jamal Hadi Salim , Cong Wang , Jiri Pirko
Cc: Peilin Ye , Daniel Borkmann , John Fastabend , Vlad Buslov , Pedro Tammela , Hillf Danton , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Cong Wang , Peilin Ye
Subject: [PATCH v2 net 3/6] net/sched: Reserve TC_H_INGRESS (TC_H_CLSACT) for ingress (clsact) Qdiscs
Date: Mon, 22 May 2023 16:54:22 -0700
Message-Id:
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

Currently it is possible to add e.g. an HTB Qdisc under ffff:fff1
(TC_H_INGRESS, TC_H_CLSACT):

  $ ip link add name ifb0 type ifb
  $ tc qdisc add dev ifb0 parent ffff:fff1 htb
  $ tc qdisc add dev ifb0 clsact
  Error: Exclusivity flag on, cannot modify.
  $ drgn
  ...
  >>> ifb0 = netdev_get_by_name(prog, "ifb0")
  >>> qdisc = ifb0.ingress_queue.qdisc_sleeping
  >>> print(qdisc.ops.id.string_().decode())
  htb
  >>> qdisc.flags.value_() # TCQ_F_INGRESS
  2

Only allow ingress and clsact Qdiscs under ffff:fff1.  Return -EINVAL
for everything else.  Make TCQ_F_INGRESS a static flag of ingress and
clsact Qdiscs.
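
For illustration only (not part of the patch), the new rule can be
modelled in user-space Python.  TCQ_F_INGRESS is 2 as shown by the drgn
session above; the TCQ_F_CPUSTATS value and the empty static flags for
HTB are assumptions made for this sketch, and create_under() is a
hypothetical stand-in for the check in qdisc_create().

```python
import errno

TC_H_INGRESS = 0xFFFFFFF1   # ffff:fff1, same value as TC_H_CLSACT
TCQ_F_INGRESS = 0x2         # value printed by the drgn session
TCQ_F_CPUSTATS = 0x20       # assumed from include/net/sch_generic.h

# Static flags per Qdisc kind after this patch (HTB assumed to have none).
STATIC_FLAGS = {
    "ingress": TCQ_F_INGRESS | TCQ_F_CPUSTATS,
    "clsact": TCQ_F_INGRESS | TCQ_F_CPUSTATS,
    "htb": 0,
}


def create_under(parent, kind):
    """Return 0 if 'kind' may be created under 'parent', else -EINVAL."""
    flags = STATIC_FLAGS[kind]
    if parent == TC_H_INGRESS and not (flags & TCQ_F_INGRESS):
        # "Specified parent ID is reserved for ingress and clsact Qdiscs"
        return -errno.EINVAL
    return 0


for kind in ("ingress", "clsact", "htb"):
    print(kind, create_under(TC_H_INGRESS, kind))
```

Because the flag now comes from the Qdisc ops instead of being set by
qdisc_create(), an HTB Qdisc can no longer end up carrying
TCQ_F_INGRESS.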

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b929c ("net, sched: add clsact qdisc")
Reviewed-by: Jamal Hadi Salim
Acked-by: Jamal Hadi Salim
Signed-off-by: Peilin Ye
---
 net/sched/sch_api.c     | 7 ++++++-
 net/sched/sch_ingress.c | 4 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index fdb8f429333d..383195955b7d 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1252,7 +1252,12 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 	sch->parent = parent;
 
 	if (handle == TC_H_INGRESS) {
-		sch->flags |= TCQ_F_INGRESS;
+		if (!(sch->flags & TCQ_F_INGRESS)) {
+			NL_SET_ERR_MSG(extack,
+				       "Specified parent ID is reserved for ingress and clsact Qdiscs");
+			err = -EINVAL;
+			goto err_out3;
+		}
 		handle = TC_H_MAKE(TC_H_INGRESS, 0);
 	} else {
 		if (handle == 0) {
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index 13218a1fe4a5..caea51e0d4e9 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -137,7 +137,7 @@ static struct Qdisc_ops ingress_qdisc_ops __read_mostly = {
 	.cl_ops			=	&ingress_class_ops,
 	.id			=	"ingress",
 	.priv_size		=	sizeof(struct ingress_sched_data),
-	.static_flags		=	TCQ_F_CPUSTATS,
+	.static_flags		=	TCQ_F_INGRESS | TCQ_F_CPUSTATS,
 	.init			=	ingress_init,
 	.destroy		=	ingress_destroy,
 	.dump			=	ingress_dump,
@@ -275,7 +275,7 @@ static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.cl_ops			=	&clsact_class_ops,
 	.id			=	"clsact",
 	.priv_size		=	sizeof(struct clsact_sched_data),
-	.static_flags		=	TCQ_F_CPUSTATS,
+	.static_flags		=	TCQ_F_INGRESS | TCQ_F_CPUSTATS,
 	.init			=	clsact_init,
 	.destroy		=	clsact_destroy,
 	.dump			=	ingress_dump,

From patchwork Mon May 22 23:55:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peilin Ye
X-Patchwork-Id: 13251241
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7545B17739 for ; Mon, 22 May 2023 23:55:07 +0000 (UTC) Received: from mail-ot1-x32d.google.com (mail-ot1-x32d.google.com [IPv6:2607:f8b0:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B2071213E; Mon, 22 May 2023 16:55:05 -0700 (PDT) Received: by mail-ot1-x32d.google.com with SMTP id 46e09a7af769-6af70ff2761so845475a34.0; Mon, 22 May 2023 16:55:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684799705; x=1687391705; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6w0tHXoycuCHPxvFjx9JgsU6uyQcbmVkIq1ehIMdFnA=; b=hVGimPahIRoBsZXN1TYJLbnb315fAbDFUO2XjU0SvRW3Zxg6QfJGL7M4AtaAyOh5Er bsLoVpz63swq20sEYL9Brng9pCOZ4qjgdvYL6REWjYXfSnMkq/aqqkwcfjXeCuNuXRJj CUUzfdt+uHNMC/EdMqtoTVUaTAyeIOuStZJHFRRnarJeI/hnkKQofvzxSY/7kE9HWboN 6REcgvkRUG3cUmppdyCXo21TVPBMctXHsIwAE9P4GkUzXBlIkhqZetzWEHM42XgNL0Xy S+ip+clF8B40Vu3PLHNBJkF8fjmXMhTYhMn0f/PLkb4W4rRTPwZsleRK7rUd0kMaNpvf 2xEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684799705; x=1687391705; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6w0tHXoycuCHPxvFjx9JgsU6uyQcbmVkIq1ehIMdFnA=; b=AzR2eyArrtH/R3mMNLmgalTjZXuM6vx+fhq6rxM1K4LzYAKfu4x0Auc+ey3ApUlH55 tImsSKrqFEB5whgYg1mr8Ir76++hIsSvcHL4PIsYs/DSd3RxEFtmA+h8mybN9FVCqRZm y0QUamWDX2kx0SGkZoKwBSlvm7s9KCu8OWxOczfsKJmOa+Z2MsVXgE9SRwd7aeumgEi0 jVBxO6gw5ddGA5H+W5vWQOd8IJj7/mq3q1rX0ZDV+rB4JONJkp1GpgVZui13wry14H9k ByQRwHhtyJsHEbGcm1mlH168fDBgkzgCXd4Vm24UN/kOW8Cvm0WP/uCOXSdC70CE84/B MFqg== X-Gm-Message-State: AC+VfDx9qhtzpcvu4eqBVTgyKnxppyjPnroOF/rIddB2zxWpw/pNrD9n 03wPB+PZP7vZVZ6VxgtnSw== X-Google-Smtp-Source: 
ACHHUZ6o82VRDExjimiyZ59YRxP3ymwWmtjdBWhmbriezvtuzJUbc2Sq2fbWfvubyCnFt0Mkmw3JlQ==
X-Received: by 2002:a05:6830:61a:b0:6a5:f792:dbe5 with SMTP id w26-20020a056830061a00b006a5f792dbe5mr6132994oti.22.1684799704958; Mon, 22 May 2023 16:55:04 -0700 (PDT)
Received: from C02FL77VMD6R.bytedance.net ([208.184.112.130]) by smtp.gmail.com with ESMTPSA id f15-20020a9d5f0f000000b006a662e9f074sm2870814oti.58.2023.05.22.16.55.03 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 May 2023 16:55:04 -0700 (PDT)
From: Peilin Ye
X-Google-Original-From: Peilin Ye
To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jamal Hadi Salim , Cong Wang , Jiri Pirko
Cc: Peilin Ye , Daniel Borkmann , John Fastabend , Vlad Buslov , Pedro Tammela , Hillf Danton , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Cong Wang , Peilin Ye
Subject: [PATCH v2 net 4/6] net/sched: Prohibit regrafting ingress or clsact Qdiscs
Date: Mon, 22 May 2023 16:55:00 -0700
Message-Id:
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

Currently, after creating an ingress (or clsact) Qdisc and grafting it
under TC_H_INGRESS (TC_H_CLSACT), it is possible to graft it again under
e.g. a TBF Qdisc:

  $ ip link add ifb0 type ifb
  $ tc qdisc add dev ifb0 handle 1: root tbf rate 20kbit buffer 1600 limit 3000
  $ tc qdisc add dev ifb0 clsact
  $ tc qdisc link dev ifb0 handle ffff: parent 1:1
  $ tc qdisc show dev ifb0
  qdisc tbf 1: root refcnt 2 rate 20Kbit burst 1600b lat 560.0ms
  qdisc clsact ffff: parent ffff:fff1 refcnt 2
                                      ^^^^^^^^

clsact's refcount has increased: it is now grafted under both
TC_H_CLSACT and 1:1.

ingress and clsact Qdiscs should only be used under TC_H_INGRESS
(TC_H_CLSACT).  Prohibit regrafting them.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: 1f211a1b929c ("net, sched: add clsact qdisc")
Reviewed-by: Jamal Hadi Salim
Acked-by: Jamal Hadi Salim
Signed-off-by: Peilin Ye
---
 net/sched/sch_api.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 383195955b7d..49b9c1bbfdd9 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1596,6 +1596,11 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
 					NL_SET_ERR_MSG(extack, "Invalid qdisc name");
 					return -EINVAL;
 				}
+				if (q->flags & TCQ_F_INGRESS) {
+					NL_SET_ERR_MSG(extack,
+						       "Cannot regraft ingress or clsact Qdiscs");
+					return -EINVAL;
+				}
 				if (q == p ||
 				    (p && check_loop(q, p, 0))) {
 					NL_SET_ERR_MSG(extack, "Qdisc parent/child loop detected");

From patchwork Mon May 22 23:55:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peilin Ye
X-Patchwork-Id: 13251242
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9475DEEA8 for ; Mon, 22 May 2023 23:55:30 +0000 (UTC)
Received: from mail-ot1-x332.google.com (mail-ot1-x332.google.com [IPv6:2607:f8b0:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 200EE2681; Mon, 22
May 2023 16:55:29 -0700 (PDT) Received: by mail-ot1-x332.google.com with SMTP id 46e09a7af769-6af8aac7a2eso429808a34.3; Mon, 22 May 2023 16:55:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684799728; x=1687391728; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=sQLIv+3hRLznwnCm51oPj668Ns602ev+EO/wUc0CPh0=; b=n2xXPRGtvFpsGn52kRAG2uUGq4CN71U4+H75dmjZzCC3htNX+NrfJTZRi4yS2+hIIP 2HfJLbkolRJxawQkDwl4FptyOQv4iVzHKlu6uSMavJ0fbsH28EbR6KdGwRa/j42hT/vo /dWuiRCueLVD/s7DhY3X1KIFyIHIbbqUOWiuK3V7mPwfvOnwcEQ2SefZunO68pI6mbmT L0595pR826hRAXr7ofHCYd0WjT5LwbzvvGIFWgVQ/ah3aCBkbWXwuXbAbR4mcLchUYiF SB+dyayit4P5VhAJi8j0EsBgDTsZa0CP5BZSZ4uE7q4ax0sr+HPfiagMuNUYEWYsD2P3 SUKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684799728; x=1687391728; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sQLIv+3hRLznwnCm51oPj668Ns602ev+EO/wUc0CPh0=; b=MUq9wH5K6o2s1O44nDwrP7iX53Li8RtaT7ZtXIqnvBh9H9xpn/0Kjct8yRvTaGl6K3 LX6bqkuo0nQQ1XaurxZnCWgV8R5vil/28/DE6TIMmpPqPE2Xly0vPO3tXAWIBIvtprUK i+YAnwNXv2xCuuWcFDwpphifR3Sk6plcgZwtPfbAIuuEE5fqmdkxGeP//NSAAJZP10Lc G8DyQfLBtTzzjrc5xm159vUDuVCLrBv7HFRv/6orBdn1bwDqRCULZEQ9rGLNh1EJfc5O AvMU7jtQcwZ5EwkQN1JQ4BcvGKvjpqvmPp1XGgoCRWlcLX9rvhyCNAOkALWPFwr7XZQV cT3g== X-Gm-Message-State: AC+VfDzYbZ3jUoZEXNC05F9o0y+e9XUAfXynmKzRBJOZIAIvIbk5izQ2 hq/37MAhVQ9jprn9EasVLQ== X-Google-Smtp-Source: ACHHUZ4qEETLMn8eJd8NRsyDnoow7HpW0Yg5sjbrxZG6nm/1/c55PU22tWYVoiF+SVIYDQC8nFj0iQ== X-Received: by 2002:a9d:67d4:0:b0:6ab:360b:840 with SMTP id c20-20020a9d67d4000000b006ab360b0840mr6709235otn.35.1684799728387; Mon, 22 May 2023 16:55:28 -0700 (PDT) Received: from C02FL77VMD6R.bytedance.net ([208.184.112.130]) by smtp.gmail.com with ESMTPSA id 
d21-20020a056830139500b006ab32e46485sm2848224otq.79.2023.05.22.16.55.27 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 May 2023 16:55:28 -0700 (PDT)
From: Peilin Ye
X-Google-Original-From: Peilin Ye
To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jamal Hadi Salim , Cong Wang , Jiri Pirko
Cc: Peilin Ye , Daniel Borkmann , John Fastabend , Vlad Buslov , Pedro Tammela , Hillf Danton , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Cong Wang , Peilin Ye
Subject: [PATCH v2 net 5/6] net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs
Date: Mon, 22 May 2023 16:55:15 -0700
Message-Id: <9a7a44e532874f68c03a0028ad0a7a6b16620121.1684796705.git.peilin.ye@bytedance.com>
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

Grafting ingress and clsact Qdiscs does not need a for-loop in
qdisc_graft().  Refactor it.  No functional changes intended.
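
For illustration only (not part of the patch), the "no functional
changes" claim can be checked against a small user-space Python model
that records the sequence of operations each version of the loop would
perform.  graft_before() and graft_after() are hypothetical names, and
the tuples are just labels for dev_graft_qdisc(), qdisc_refcount_inc()
and qdisc_put().

```python
def graft_before(num_tx_queues, ingress, has_new):
    """Model of the original loop, where num_q is forced to 1 for ingress."""
    ops = []
    num_q = 1 if ingress else num_tx_queues
    for i in range(num_q):
        queue = "ingress" if ingress else "tx%d" % i
        ops.append(("graft", queue))
        if has_new and i > 0:
            ops.append(("refcount_inc", "new"))
        if not ingress:
            ops.append(("put", "old"))
    return ops


def graft_after(num_tx_queues, ingress, has_new):
    """Model of the refactored code: a loop for TX queues, one call for ingress."""
    ops = []
    if not ingress:
        for i in range(num_tx_queues):
            ops.append(("graft", "tx%d" % i))
            if has_new and i > 0:
                ops.append(("refcount_inc", "new"))
            ops.append(("put", "old"))
    else:
        ops.append(("graft", "ingress"))
    return ops


# Compare both versions over a few queue counts and flag combinations.
same = all(
    graft_before(n, ingress, has_new) == graft_after(n, ingress, has_new)
    for n in (1, 2, 8)
    for ingress in (False, True)
    for has_new in (False, True)
)
print("equivalent:", same)
```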

Reviewed-by: Jamal Hadi Salim
Acked-by: Jamal Hadi Salim
Tested-by: Pedro Tammela
Signed-off-by: Peilin Ye
---
 net/sched/sch_api.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 49b9c1bbfdd9..f72a581666a2 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1073,12 +1073,12 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 
 	if (parent == NULL) {
 		unsigned int i, num_q, ingress;
+		struct netdev_queue *dev_queue;
 
 		ingress = 0;
 		num_q = dev->num_tx_queues;
 		if ((q && q->flags & TCQ_F_INGRESS) ||
 		    (new && new->flags & TCQ_F_INGRESS)) {
-			num_q = 1;
 			ingress = 1;
 			if (!dev_ingress_queue(dev)) {
 				NL_SET_ERR_MSG(extack, "Device does not have an ingress queue");
@@ -1094,18 +1094,18 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 		if (new && new->ops->attach && !ingress)
 			goto skip;
 
-		for (i = 0; i < num_q; i++) {
-			struct netdev_queue *dev_queue = dev_ingress_queue(dev);
-
-			if (!ingress)
+		if (!ingress) {
+			for (i = 0; i < num_q; i++) {
 				dev_queue = netdev_get_tx_queue(dev, i);
+				old = dev_graft_qdisc(dev_queue, new);
 
-			old = dev_graft_qdisc(dev_queue, new);
-			if (new && i > 0)
-				qdisc_refcount_inc(new);
-
-			if (!ingress)
+				if (new && i > 0)
+					qdisc_refcount_inc(new);
 				qdisc_put(old);
+			}
+		} else {
+			dev_queue = dev_ingress_queue(dev);
+			old = dev_graft_qdisc(dev_queue, new);
 		}
 
 skip:

From patchwork Mon May 22 23:55:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peilin Ye
X-Patchwork-Id: 13251243
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5DEFE17748 for ; Mon, 22 May 2023 23:55:53 +0000 (UTC)
Received: from mail-ot1-x332.google.com
(mail-ot1-x332.google.com [IPv6:2607:f8b0:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54EF2213F; Mon, 22 May 2023 16:55:51 -0700 (PDT) Received: by mail-ot1-x332.google.com with SMTP id 46e09a7af769-6af7daff37eso1141127a34.0; Mon, 22 May 2023 16:55:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684799750; x=1687391750; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tItS/JirFunMX5RfI2+aJ2PuNtEICNcjEQi3pmV0waQ=; b=Oxk2wkmGKzN+KZViHTjAnIHcCWo+gN4WIpgSyeC38zjz0/hy481mG3wiKfWHFCi5xO jT8cRnHQ3D3ueSB0lDmMHwzLh3KvDfuwKsW0XIpyFf/xyfvVx5huI+WnPlyt+LJJOjis eUedKUWiJFf+2iKdFk/elq1GmfJPnZmXM80SwCwD7wQwpnwg3a3uDlP9iMVegvFYx+Y2 lgXAKuOW8opnf6DLXDlNtrrHKgM3A1Zkf3eo2BjeTdZZbaezCMRCeLGj5C1zZXFMMLYL wKAqOZC330NVZoYCNaL7hzH/dBQWeJVXelx9UNWtp+r9MKl8lZcY9W8eQgtVsKw0JImk eCqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684799750; x=1687391750; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tItS/JirFunMX5RfI2+aJ2PuNtEICNcjEQi3pmV0waQ=; b=j5H1XNGO8ygVIUlhkdVWHf1ny78B5VZmQ6ghEKZDjM1XxVgnnE6bAjn+10CojJB3IU ygqDgAI0m93xvGLkUwdH4KVsFGD+mnSQkrK4STbsrsZYwoPADiTjG31rXy6VASHX9LPc wq9KoaejKLgFzus2uHgGWw8XQhTw0ZCcf51bQCbkzSw3oFmxb6TEH4hXdwzuIK4PG432 //xNkiyECy5MmmcCYgg4lQgtHCipB83bYWVuSNSJj4H2EIVRStTkOvodrPNB/43GUfWM ZCmQMO4vzLbaF7eoqOPb5xm0HrN6WjX8m9kGe4JxrF/tp9Zi/bPJO/PZMaEf6UOB0FAA Eeww== X-Gm-Message-State: AC+VfDw35I5GZtVckiQIKR+aexKrQpQkZE9PUnsioabG+RP+cBEohS8l 7YQO1wksPqj1hq/YIfOOdA== X-Google-Smtp-Source: ACHHUZ7J2d9jQgy3oq7FjYSJFKKOLBhyVFtopnugZ+aCOiVhrEIYqEqEii3AeTLVJme3IWQVqMoV4w== X-Received: by 2002:a9d:7ad7:0:b0:6ac:8844:3605 with SMTP id m23-20020a9d7ad7000000b006ac88443605mr6297410otn.30.1684799750582; Mon, 22 May 2023 16:55:50 -0700 (PDT) 
Received: from C02FL77VMD6R.bytedance.net ([208.184.112.130]) by smtp.gmail.com with ESMTPSA id d5-20020a05683018e500b006a65be836acsm2889645otf.16.2023.05.22.16.55.49 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 May 2023 16:55:50 -0700 (PDT)
From: Peilin Ye
X-Google-Original-From: Peilin Ye
To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jamal Hadi Salim , Cong Wang , Jiri Pirko
Cc: Peilin Ye , Daniel Borkmann , John Fastabend , Vlad Buslov , Pedro Tammela , Hillf Danton , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Cong Wang , Peilin Ye
Subject: [PATCH v2 net 6/6] net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting
Date: Mon, 22 May 2023 16:55:36 -0700
Message-Id: <8e3383d0bacd084f0e33d9158d24bd411f1bf6ba.1684796705.git.peilin.ye@bytedance.com>
X-Mailer: git-send-email 2.30.1 (Apple Git-130)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
X-Patchwork-Delegate: kuba@kernel.org

mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized
in ingress_init() to point to net_device::miniq_ingress.  ingress Qdiscs
access this per-net_device pointer in mini_qdisc_pair_swap().  Similar
for clsact Qdiscs and miniq_egress.

Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER
requests (thanks Hillf Danton for the hint), when replacing ingress or
clsact Qdiscs, for example, the old Qdisc ("@old") could access the same
miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"),
causing race conditions [1] including a use-after-free bug in
mini_qdisc_pair_swap() reported by syzbot:

 BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901
 ...
 Call Trace:
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
  print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319
  print_report mm/kasan/report.c:430 [inline]
  kasan_report+0x11c/0x130 mm/kasan/report.c:536
  mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
  tcf_chain_head_change_item net/sched/cls_api.c:495 [inline]
  tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509
  tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline]
  tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline]
  tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266
 ...

@old and @new should not affect each other.  In other words, @old should
never modify miniq_{in,e}gress after @new, and @new should not update
@old's RCU state.

Fixing without changing sch_api.c turned out to be difficult (please
refer to Closes: for discussions).  Instead, make sure @new's first call
always happens after @old's last call, in qdisc_destroy(), has finished:

In qdisc_graft(), return -EAGAIN and tell the caller to replay
(suggested by Vlad Buslov) if @old has any ongoing RTNL-unlocked filter
requests, and call qdisc_destroy() for @old before grafting @new.

Introduce qdisc_refcount_dec_if_one() as the counterpart of
qdisc_refcount_inc_nz() used for RTNL-unlocked filter requests.
Introduce a non-static version of qdisc_destroy() that does a
TCQ_F_BUILTIN check, just like qdisc_put() etc.
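
For illustration only (not part of the patch), the intended behaviour of
qdisc_refcount_dec_if_one() and of the replay can be modelled in
user-space Python.  The Qdisc class and graft_ingress() are hypothetical
and single-threaded, whereas the kernel helper is atomic; the
TCQ_F_BUILTIN value is an assumption for this sketch.

```python
import errno

TCQ_F_BUILTIN = 0x1  # assumed value of the kernel flag


class Qdisc:
    """Minimal stand-in for struct Qdisc: a reference count and flags."""

    def __init__(self, refcnt=1, flags=0):
        self.refcnt = refcnt
        self.flags = flags


def qdisc_refcount_dec_if_one(qdisc):
    """Drop the last reference only if nobody else holds one."""
    if qdisc.flags & TCQ_F_BUILTIN:
        return True
    if qdisc.refcnt == 1:
        qdisc.refcnt = 0
        return True
    return False


def graft_ingress(old):
    """Model of the ingress branch: refuse while @old is still in use."""
    if not qdisc_refcount_dec_if_one(old):
        return -errno.EAGAIN  # the caller is expected to replay the request
    # Here the kernel would destroy @old and only then graft @new.
    return 0


old = Qdisc(refcnt=2)            # one extra reference held by a filter request
first_try = graft_ingress(old)   # refused, request must be replayed
old.refcnt -= 1                  # the filter request finishes
second_try = graft_ingress(old)  # now succeeds
print(first_try, second_try, old.refcnt)
```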

Depends on patch "net/sched: Refactor qdisc_graft() for ingress and clsact
Qdiscs".

[1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under
TC_H_ROOT (no longer possible after patch "net/sched: sch_ingress: Only
create under TC_H_INGRESS") on eth0 that has 8 transmission queues:

  Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2), then
  adds a flower filter X to A.

  Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and
  b2) to replace A, then adds a flower filter Y to B.

 Thread 1                          A's refcnt   Thread 2
  RTM_NEWQDISC (A, RTNL-locked)
   qdisc_create(A)                      1
   qdisc_graft(A)                       9

  RTM_NEWTFILTER (X, RTNL-unlocked)
   __tcf_qdisc_find(A)                 10
   tcf_chain0_head_change(A)
   mini_qdisc_pair_swap(A) (1st)
            |
            |                                    RTM_NEWQDISC (B, RTNL-locked)
         RCU sync                       2         qdisc_graft(B)
            |                           1         notify_and_destroy(A)
            |
   tcf_block_release(A)                 0        RTM_NEWTFILTER (Y, RTNL-unlocked)
   qdisc_destroy(A)                               tcf_chain0_head_change(B)
   tcf_chain0_head_change_cb_del(A)               mini_qdisc_pair_swap(B) (2nd)
   mini_qdisc_pair_swap(A) (3rd)                           |
           ...                                            ...

Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to its
mini Qdisc, b1. Then, A calls mini_qdisc_pair_swap() again during
ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress packets
on eth0 will not find filter Y in sch_handle_ingress().

This is only one of the possible consequences of concurrently accessing
miniq_{in,e}gress pointers. The point is clear though: again, A should
never modify those per-net_device pointers after B, and B should not
update A's RCU state.
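
The lost-filter outcome in [1] can be reproduced in a small sequential
model. This is a sketch under loud assumptions, not kernel code: fake_dev,
fake_pair, fake_pair_swap(), racy_order() and fixed_order() are
hypothetical names, each Qdisc is reduced to a single mini Qdisc, and RCU
and real concurrency are ignored. Only the order of the three swaps on the
shared per-device pointer is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: a mini Qdisc is just a name here. */
struct mini_qdisc {
	const char *name;
};

/* Stand-in for net_device: one pointer shared by every ingress Qdisc. */
struct fake_dev {
	struct mini_qdisc *miniq_ingress;
};

/* Stand-in for mini_Qdisc_pair: p_miniq points at the per-device pointer. */
struct fake_pair {
	struct mini_qdisc *miniq;
	struct mini_qdisc **p_miniq;
};

static void fake_pair_init(struct fake_pair *p, struct mini_qdisc *m,
			   struct fake_dev *dev)
{
	p->miniq = m;
	p->p_miniq = &dev->miniq_ingress;
}

/* Publish the mini Qdisc when the chain has filters, clear it otherwise. */
static void fake_pair_swap(struct fake_pair *p, int has_filters)
{
	*p->p_miniq = has_filters ? p->miniq : NULL;
}

/* Order from the timeline in [1]: B's swap runs before A's last swap. */
static struct mini_qdisc *racy_order(struct fake_dev *dev,
				     struct fake_pair *a, struct fake_pair *b)
{
	fake_pair_swap(a, 1);	/* 1st: filter X added to A */
	fake_pair_swap(b, 1);	/* 2nd: filter Y added to B */
	fake_pair_swap(a, 0);	/* 3rd: A destroyed late, clears the pointer */
	return dev->miniq_ingress;
}

/* Order this patch aims for: A's last swap finishes before B's first. */
static struct mini_qdisc *fixed_order(struct fake_dev *dev,
				      struct fake_pair *a, struct fake_pair *b)
{
	fake_pair_swap(a, 1);	/* filter X added to A */
	fake_pair_swap(a, 0);	/* A destroyed before B is grafted */
	fake_pair_swap(b, 1);	/* filter Y added to B */
	return dev->miniq_ingress;
}
```

In this model racy_order() leaves the shared pointer at NULL, so B's
filters would be invisible to the datapath, while fixed_order() leaves it
pointing at B's mini Qdisc.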

Fixes: 7a096d579e8e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops")
Fixes: 87f373921c4e ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Cc: Hillf Danton
Cc: Vlad Buslov
Signed-off-by: Peilin Ye
---
changes in v2:
  - replay the request if the current Qdisc has any ongoing RTNL-unlocked
    filter requests (Vlad)
  - minor changes in code comments and commit log

 include/net/sch_generic.h |  8 ++++++++
 net/sched/sch_api.c       | 32 ++++++++++++++++++++++++++------
 net/sched/sch_generic.c   | 14 +++++++++++---
 3 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index fab5ba3e61b7..3e9cc43cbc90 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -137,6 +137,13 @@ static inline void qdisc_refcount_inc(struct Qdisc *qdisc)
 	refcount_inc(&qdisc->refcnt);
 }
 
+static inline bool qdisc_refcount_dec_if_one(struct Qdisc *qdisc)
+{
+	if (qdisc->flags & TCQ_F_BUILTIN)
+		return true;
+	return refcount_dec_if_one(&qdisc->refcnt);
+}
+
 /* Intended to be used by unlocked users, when concurrent qdisc release is
  * possible.
  */
@@ -652,6 +659,7 @@ void dev_deactivate_many(struct list_head *head);
 struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 			      struct Qdisc *qdisc);
 void qdisc_reset(struct Qdisc *qdisc);
+void qdisc_destroy(struct Qdisc *qdisc);
 void qdisc_put(struct Qdisc *qdisc);
 void qdisc_put_unlocked(struct Qdisc *qdisc);
 void qdisc_tree_reduce_backlog(struct Qdisc *qdisc, int n, int len);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f72a581666a2..b3bafa6c1b44 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1080,10 +1080,18 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 		if ((q && q->flags & TCQ_F_INGRESS) ||
 		    (new && new->flags & TCQ_F_INGRESS)) {
 			ingress = 1;
-			if (!dev_ingress_queue(dev)) {
+			dev_queue = dev_ingress_queue(dev);
+			if (!dev_queue) {
 				NL_SET_ERR_MSG(extack, "Device does not have an ingress queue");
 				return -ENOENT;
 			}
+
+			/* Replay if the current ingress (or clsact) Qdisc has ongoing
+			 * RTNL-unlocked filter request(s). This is the counterpart of that
+			 * qdisc_refcount_inc_nz() call in __tcf_qdisc_find().
+			 */
+			if (!qdisc_refcount_dec_if_one(dev_queue->qdisc_sleeping))
+				return -EAGAIN;
 		}
 
 		if (dev->flags & IFF_UP)
@@ -1104,8 +1112,16 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 				qdisc_put(old);
 			}
 		} else {
-			dev_queue = dev_ingress_queue(dev);
-			old = dev_graft_qdisc(dev_queue, new);
+			old = dev_graft_qdisc(dev_queue, NULL);
+
+			/* {ingress,clsact}_destroy() @old before grafting @new to avoid
+			 * unprotected concurrent accesses to net_device::miniq_{in,e}gress
+			 * pointer(s) in mini_qdisc_pair_swap().
+			 */
+			qdisc_notify(net, skb, n, classid, old, new, extack);
+			qdisc_destroy(old);
+
+			dev_graft_qdisc(dev_queue, new);
 		}
 
 skip:
@@ -1119,8 +1135,6 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 
 			if (new && new->ops->attach)
 				new->ops->attach(new);
-		} else {
-			notify_and_destroy(net, skb, n, classid, old, new, extack);
 		}
 
 		if (dev->flags & IFF_UP)
@@ -1458,6 +1472,7 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
 	struct Qdisc *p = NULL;
 	int err;
 
+replay:
 	err = nlmsg_parse_deprecated(n, sizeof(*tcm), tca, TCA_MAX,
 				     rtm_tca_policy, extack);
 	if (err < 0)
@@ -1515,8 +1530,11 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
 			return -ENOENT;
 		}
 		err = qdisc_graft(dev, p, skb, n, clid, NULL, q, extack);
-		if (err != 0)
+		if (err != 0) {
+			if (err == -EAGAIN)
+				goto replay;
 			return err;
+		}
 	} else {
 		qdisc_notify(net, skb, n, clid, NULL, q, NULL);
 	}
@@ -1704,6 +1722,8 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
 	if (err) {
 		if (q)
 			qdisc_put(q);
+		if (err == -EAGAIN)
+			goto replay;
 		return err;
 	}
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 37e41f972f69..e14ed47f961c 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1046,7 +1046,7 @@ static void qdisc_free_cb(struct rcu_head *head)
 	qdisc_free(q);
 }
 
-static void qdisc_destroy(struct Qdisc *qdisc)
+static void __qdisc_destroy(struct Qdisc *qdisc)
 {
 	const struct Qdisc_ops *ops = qdisc->ops;
 
@@ -1070,6 +1070,14 @@ static void qdisc_destroy(struct Qdisc *qdisc)
 	call_rcu(&qdisc->rcu, qdisc_free_cb);
 }
 
+void qdisc_destroy(struct Qdisc *qdisc)
+{
+	if (qdisc->flags & TCQ_F_BUILTIN)
+		return;
+
+	__qdisc_destroy(qdisc);
+}
+
 void qdisc_put(struct Qdisc *qdisc)
 {
 	if (!qdisc)
@@ -1079,7 +1087,7 @@ void qdisc_put(struct Qdisc *qdisc)
 	    !refcount_dec_and_test(&qdisc->refcnt))
 		return;
 
-	qdisc_destroy(qdisc);
+	__qdisc_destroy(qdisc);
 }
 EXPORT_SYMBOL(qdisc_put);
 
@@ -1094,7 +1102,7 @@ void qdisc_put_unlocked(struct Qdisc *qdisc)
 	    !refcount_dec_and_rtnl_lock(&qdisc->refcnt))
 		return;
 
-	qdisc_destroy(qdisc);
+	__qdisc_destroy(qdisc);
 	rtnl_unlock();
 }
 EXPORT_SYMBOL(qdisc_put_unlocked);