From patchwork Tue Nov 22 01:58:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pengcheng Yang
X-Patchwork-Id: 13051803
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 5A854C4332F
	for ; Tue, 22 Nov 2022 01:58:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S231490AbiKVB6y (ORCPT );
	Mon, 21 Nov 2022 20:58:54 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58292 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S231905AbiKVB6u (ORCPT );
	Mon, 21 Nov 2022 20:58:50 -0500
Received: from azure-sdnproxy.icoremail.net (azure-sdnproxy.icoremail.net
	[52.175.55.52])
	by lindbergh.monkeyblade.net (Postfix) with SMTP id 90606C5635
	for ; Mon, 21 Nov 2022 17:58:48 -0800 (PST)
Received: from 102.wangsu.com (unknown [59.61.78.232])
	by app2 (Coremail) with SMTP id SyJltADnLkvOLHxjiGsAAA--.531S3;
	Tue, 22 Nov 2022 09:58:46 +0800 (CST)
From: Pengcheng Yang
To: bpf@vger.kernel.org, netdev@vger.kernel.org,
	John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer
Cc: Pengcheng Yang
Subject: [PATCH RESEND bpf 1/4] bpf, sockmap: Fix repeated calls to
 sock_put() when msg has more_data
Date: Tue, 22 Nov 2022 09:58:26 +0800
Message-Id: <1669082309-2546-2-git-send-email-yangpc@wangsu.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
References: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
X-CM-TRANSID: SyJltADnLkvOLHxjiGsAAA--.531S3
X-Coremail-Antispam: 1UD129KBjvJXoW7Ar4DXFWkJr4rAryruryrCrg_yoW8tFy3pF
	W5Gw1akr43JrW7Cw4rtFWvvF18u3yrGFn0krZaqr1fAFZ3JFyUJF1jgryFka4FgrWxCw13
	Zryqgr1UA3ZrZ3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2
	9KBjDU0xBIdaVrnUUvcSsGvfC2KfnxnUUI43ZEXa7xR_UUUUUUUUU==
X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

In tcp_bpf_send_verdict() redirection, the eval variable is assigned
__SK_REDIRECT after the apply_bytes data is sent. If msg has more_data,
eval is not reset on the next pass, so sock_put() is called multiple
times.

This causes:

 IPv4: Attempt to release TCP socket in state 1 00000000b4c925d7
 ------------[ cut here ]------------
 refcount_t: addition on 0; use-after-free.
 WARNING: CPU: 5 PID: 4482 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0x110
 Modules linked in:
 CPU: 5 PID: 4482 Comm: sockhash_bypass Kdump: loaded Not tainted 6.0.0 #1
 Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
 Call Trace:
  __tcp_transmit_skb+0xa1b/0xb90
  ? __alloc_skb+0x8c/0x1a0
  ? __kmalloc_node_track_caller+0x184/0x320
  tcp_write_xmit+0x22a/0x1110
  __tcp_push_pending_frames+0x32/0xf0
  do_tcp_sendpages+0x62d/0x640
  tcp_bpf_push+0xae/0x2c0
  tcp_bpf_sendmsg_redir+0x260/0x410
  ? preempt_count_add+0x70/0xa0
  tcp_bpf_send_verdict+0x386/0x4b0
  tcp_bpf_sendmsg+0x21b/0x3b0
  sock_sendmsg+0x58/0x70
  __sys_sendto+0xfa/0x170
  ? xfd_validate_state+0x1d/0x80
  ? switch_fpu_return+0x59/0xe0
  __x64_sys_sendto+0x24/0x30
  do_syscall_64+0x37/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fix this by resetting the eval variable to __SK_NONE every time
more_data starts.

Fixes: cd9733f5d75c ("tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function")
Signed-off-by: Pengcheng Yang
Acked-by: John Fastabend
---
 net/ipv4/tcp_bpf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index f8b12b9..ef5de4f 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -279,7 +279,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	bool cork = false, enospc = sk_msg_full(msg);
 	struct sock *sk_redir;
 	u32 tosend, origsize, sent, delta = 0;
-	u32 eval = __SK_NONE;
+	u32 eval;
 	int ret;
 
 more_data:
@@ -310,6 +310,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	tosend = msg->sg.size;
 	if (psock->apply_bytes && psock->apply_bytes < tosend)
 		tosend = psock->apply_bytes;
+	eval = __SK_NONE;
 
 	switch (psock->eval) {
 	case __SK_PASS:

From patchwork Tue Nov 22 01:58:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pengcheng Yang
X-Patchwork-Id: 13051802
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 2297CC433FE
	for ; Tue, 22 Nov 2022 01:59:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232069AbiKVB73 (ORCPT );
	Mon, 21 Nov 2022 20:59:29 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58630 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S231895AbiKVB71 (ORCPT );
	Mon, 21 Nov 2022 20:59:27 -0500
Received: from azure-sdnproxy.icoremail.net (azure-sdnproxy.icoremail.net
	[20.232.28.96])
	by lindbergh.monkeyblade.net (Postfix) with SMTP id 9D28CC5635
	for ; Mon, 21 Nov 2022 17:59:26 -0800 (PST)
Received: from 102.wangsu.com (unknown [59.61.78.232])
	by app2 (Coremail) with SMTP id SyJltADnLkvOLHxjiGsAAA--.531S4;
	Tue, 22 Nov 2022 09:58:52 +0800 (CST)
From: Pengcheng Yang
To: bpf@vger.kernel.org, netdev@vger.kernel.org,
	John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer
Cc: Pengcheng Yang
Subject: [PATCH RESEND bpf 2/4] bpf, sockmap: Fix missing BPF_F_INGRESS flag
 when using apply_bytes
Date: Tue, 22 Nov 2022 09:58:27 +0800
Message-Id: <1669082309-2546-3-git-send-email-yangpc@wangsu.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
References: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
X-CM-TRANSID: SyJltADnLkvOLHxjiGsAAA--.531S4
X-Coremail-Antispam: 1UD129KBjvJXoW7KFy7uF1xJr47Ar4kKF4xXrb_yoW8KryUpF
	sYya1fCFW7CrWjgw1ftFWvqF43uw1rKFyjkr1a9w1ft397Kr40qFn5GFy3ZF1Fyrs7Ca1S
	qF4UWrW5GF17Zw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2
	9KBjDU0xBIdaVrnUUvcSsGvfC2KfnxnUUI43ZEXa7xR_UUUUUUUUU==
X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

When redirecting, we use sk_msg_to_ingress() to get the BPF_F_INGRESS
flag from msg->flags. If apply_bytes is used and it is larger than the
current data being processed, sk_psock_msg_verdict() will not be called
when sendmsg() is called again. At that point msg->flags is 0, and the
BPF_F_INGRESS flag is lost.

So we need to save the BPF_F_INGRESS flag in sk_psock and assign it to
msg->flags before redirection.

Fixes: 8934ce2fd081 ("bpf: sockmap redirect ingress support")
Signed-off-by: Pengcheng Yang
---
 include/linux/skmsg.h | 1 +
 net/core/skmsg.c      | 1 +
 net/ipv4/tcp_bpf.c    | 1 +
 net/tls/tls_sw.c      | 1 +
 4 files changed, 4 insertions(+)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 48f4b64..e1d463f 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -82,6 +82,7 @@ struct sk_psock {
 	u32				apply_bytes;
 	u32				cork_bytes;
 	u32				eval;
+	u32				flags;
 	struct sk_msg			*cork;
 	struct sk_psock_progs		progs;
 #if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 188f855..ab2f8f3 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -888,6 +888,7 @@ int sk_psock_msg_verdict(struct sock *sk, struct sk_psock *psock,
 		if (psock->sk_redir)
 			sock_put(psock->sk_redir);
 		psock->sk_redir = msg->sk_redir;
+		psock->flags = msg->flags;
 		if (!psock->sk_redir) {
 			ret = __SK_DROP;
 			goto out;
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ef5de4f..1390d72 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -323,6 +323,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 		break;
 	case __SK_REDIRECT:
 		sk_redir = psock->sk_redir;
+		msg->flags = psock->flags;
 		sk_msg_apply_bytes(psock, tosend);
 		if (!psock->apply_bytes) {
 			/* Clean up before releasing the sock lock. */
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index fe27241..49e424d 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -838,6 +838,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 		break;
 	case __SK_REDIRECT:
 		sk_redir = psock->sk_redir;
+		msg->flags = psock->flags;
 		memcpy(&msg_redir, msg, sizeof(*msg));
 		if (msg->apply_bytes < send)
 			msg->apply_bytes = 0;

From patchwork Tue Nov 22 01:58:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pengcheng Yang
X-Patchwork-Id: 13051805
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 2FA1CC433FE
	for ; Tue, 22 Nov 2022 01:59:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S229730AbiKVB7S (ORCPT );
	Mon, 21 Nov 2022 20:59:18 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58556 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S231905AbiKVB7Q (ORCPT );
	Mon, 21 Nov 2022 20:59:16 -0500
Received: from zg8tmja2lje4os4yms4ymjma.icoremail.net
	(zg8tmja2lje4os4yms4ymjma.icoremail.net [206.189.21.223])
	by lindbergh.monkeyblade.net (Postfix) with SMTP id 4F447C5B4E
	for ; Mon, 21 Nov 2022 17:59:15 -0800 (PST)
Received: from 102.wangsu.com (unknown [59.61.78.232])
	by app2 (Coremail) with SMTP id SyJltADnLkvOLHxjiGsAAA--.531S5;
	Tue, 22 Nov 2022 09:58:56 +0800 (CST)
From: Pengcheng Yang
To: bpf@vger.kernel.org, netdev@vger.kernel.org,
	John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer
Cc: Pengcheng Yang
Subject: [PATCH RESEND bpf 3/4] bpf, sockmap: Fix data loss caused by using
 apply_bytes on ingress redirect
Date: Tue, 22 Nov 2022 09:58:28 +0800
Message-Id: <1669082309-2546-4-git-send-email-yangpc@wangsu.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
References: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
X-CM-TRANSID: SyJltADnLkvOLHxjiGsAAA--.531S5
X-Coremail-Antispam: 1UD129KBjvdXoWrKr4ftw4rGrW8Xr45CryDZFb_yoWfWwbE9r
	W0yF9xGry8WF1IkF4Du3y5tF92grs2vFn3Kr1xJFW7t348AFyUArs5XFn3ZFWkWFW2yFyq
	g34kXr4UZa4aqjkaLaAFLSUrUUUUUb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT
	9fnUUIcSsGvfJ3UbIYCTnIWIevJa73UjIFyTuYvj4RJUUUUUUUU
X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

When apply_bytes is used on ingress redirect and is less than the
length of the msg data, some data may be skipped and lost in
bpf_tcp_ingress().

If there is still data in the scatterlist that has not been consumed,
we cannot move the msg iter.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Pengcheng Yang
---
 net/ipv4/tcp_bpf.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 1390d72..3cc0346 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -45,8 +45,11 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 		tmp->sg.end = i;
 		if (apply) {
 			apply_bytes -= size;
-			if (!apply_bytes)
+			if (!apply_bytes) {
+				if (sge->length)
+					sk_msg_iter_var_prev(i);
 				break;
+			}
 		}
 	} while (i != msg->sg.end);
 

From patchwork Tue Nov 22 01:58:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pengcheng Yang
X-Patchwork-Id: 13051804
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 2A8A2C4332F
	for ; Tue, 22 Nov 2022 01:59:08 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S231906AbiKVB7H (ORCPT );
	Mon, 21 Nov 2022 20:59:07 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58458 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S229730AbiKVB7G (ORCPT );
	Mon, 21 Nov 2022 20:59:06 -0500
Received: from azure-sdnproxy.icoremail.net (azure-sdnproxy.icoremail.net
	[52.175.55.52])
	by lindbergh.monkeyblade.net (Postfix) with SMTP id 50470C5635
	for ; Mon, 21 Nov 2022 17:59:03 -0800 (PST)
Received: from 102.wangsu.com (unknown [59.61.78.232])
	by app2 (Coremail) with SMTP id SyJltADnLkvOLHxjiGsAAA--.531S6;
	Tue, 22 Nov 2022 09:58:58 +0800 (CST)
From: Pengcheng Yang
To: bpf@vger.kernel.org, netdev@vger.kernel.org,
	John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer
Cc: Pengcheng Yang
Subject: [PATCH RESEND bpf 4/4] selftests/bpf: Add ingress tests for txmsg
 with apply_bytes
Date: Tue, 22 Nov 2022 09:58:29 +0800
Message-Id: <1669082309-2546-5-git-send-email-yangpc@wangsu.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
References: <1669082309-2546-1-git-send-email-yangpc@wangsu.com>
X-CM-TRANSID: SyJltADnLkvOLHxjiGsAAA--.531S6
X-Coremail-Antispam: 1UD129KBjvJXoW7ArWkWr45KFyrCw4UZF4DCFg_yoW8Gr1xp3
	ZxJ39xKF95J3y7XF43JFy3tF4F9rW0qrWjyF4xAr1qvw43AFyxtrWrtFWYqF98JrWFq3Wa
	vayUGF4Uuw15Ja7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2
	9KBjDU0xBIdaVrnUUvcSsGvfC2KfnxnUUI43ZEXa7xR_UUUUUUUUU==
X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/
Precedence: bulk
List-ID:
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-Delegate: bpf@iogearbox.net

Currently, the ingress redirect is not covered in "txmsg test apply".

Signed-off-by: Pengcheng Yang
Acked-by: John Fastabend
---
 tools/testing/selftests/bpf/test_sockmap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 0fbaccd..9bc0cb4 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -1649,24 +1649,42 @@ static void test_txmsg_apply(int cgrp, struct sockmap_options *opt)
 {
 	txmsg_pass = 1;
 	txmsg_redir = 0;
+	txmsg_ingress = 0;
 	txmsg_apply = 1;
 	txmsg_cork = 0;
 	test_send_one(opt, cgrp);
 
 	txmsg_pass = 0;
 	txmsg_redir = 1;
+	txmsg_ingress = 0;
+	txmsg_apply = 1;
+	txmsg_cork = 0;
+	test_send_one(opt, cgrp);
+
+	txmsg_pass = 0;
+	txmsg_redir = 1;
+	txmsg_ingress = 1;
 	txmsg_apply = 1;
 	txmsg_cork = 0;
 	test_send_one(opt, cgrp);
 
 	txmsg_pass = 1;
 	txmsg_redir = 0;
+	txmsg_ingress = 0;
+	txmsg_apply = 1024;
+	txmsg_cork = 0;
+	test_send_large(opt, cgrp);
+
+	txmsg_pass = 0;
+	txmsg_redir = 1;
+	txmsg_ingress = 0;
 	txmsg_apply = 1024;
 	txmsg_cork = 0;
 	test_send_large(opt, cgrp);
 
 	txmsg_pass = 0;
 	txmsg_redir = 1;
+	txmsg_ingress = 1;
 	txmsg_apply = 1024;
 	txmsg_cork = 0;
 	test_send_large(opt, cgrp);