From patchwork Tue Nov 22 01:58:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pengcheng Yang X-Patchwork-Id: 13051803 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A854C4332F for ; Tue, 22 Nov 2022 01:58:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231490AbiKVB6y (ORCPT ); Mon, 21 Nov 2022 20:58:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58292 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231905AbiKVB6u (ORCPT ); Mon, 21 Nov 2022 20:58:50 -0500 Received: from azure-sdnproxy.icoremail.net (azure-sdnproxy.icoremail.net [52.175.55.52]) by lindbergh.monkeyblade.net (Postfix) with SMTP id 90606C5635 for ; Mon, 21 Nov 2022 17:58:48 -0800 (PST) Received: from 102.wangsu.com (unknown [59.61.78.232]) by app2 (Coremail) with SMTP id SyJltADnLkvOLHxjiGsAAA--.531S3; Tue, 22 Nov 2022 09:58:46 +0800 (CST) From: Pengcheng Yang To: bpf@vger.kernel.org, netdev@vger.kernel.org, John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Cc: Pengcheng Yang Subject: [PATCH RESEND bpf 1/4] bpf, sockmap: Fix repeated calls to sock_put() when msg has more_data Date: Tue, 22 Nov 2022 09:58:26 +0800 Message-Id: <1669082309-2546-2-git-send-email-yangpc@wangsu.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1669082309-2546-1-git-send-email-yangpc@wangsu.com> References: <1669082309-2546-1-git-send-email-yangpc@wangsu.com> X-CM-TRANSID: SyJltADnLkvOLHxjiGsAAA--.531S3 X-Coremail-Antispam: 1UD129KBjvJXoW7Ar4DXFWkJr4rAryruryrCrg_yoW8tFy3pF W5Gw1akr43JrW7Cw4rtFWvvF18u3yrGFn0krZaqr1fAFZ3JFyUJF1jgryFka4FgrWxCw13 Zryqgr1UA3ZrZ3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnUUvcSsGvfC2KfnxnUUI43ZEXa7xR_UUUUUUUUU== X-CM-SenderInfo: p1dqw1nf6zt0xjvxhudrp/ Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net In tcp_bpf_send_verdict() redirection, the eval variable is assigned to __SK_REDIRECT after the apply_bytes data is sent, if msg has more_data, sock_put() will be called multiple times. We should reset the eval variable to __SK_NONE every time more_data starts. This causes: IPv4: Attempt to release TCP socket in state 1 00000000b4c925d7 ------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 5 PID: 4482 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0x110 Modules linked in: CPU: 5 PID: 4482 Comm: sockhash_bypass Kdump: loaded Not tainted 6.0.0 #1 Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 Call Trace: __tcp_transmit_skb+0xa1b/0xb90 ? __alloc_skb+0x8c/0x1a0 ? __kmalloc_node_track_caller+0x184/0x320 tcp_write_xmit+0x22a/0x1110 __tcp_push_pending_frames+0x32/0xf0 do_tcp_sendpages+0x62d/0x640 tcp_bpf_push+0xae/0x2c0 tcp_bpf_sendmsg_redir+0x260/0x410 ? preempt_count_add+0x70/0xa0 tcp_bpf_send_verdict+0x386/0x4b0 tcp_bpf_sendmsg+0x21b/0x3b0 sock_sendmsg+0x58/0x70 __sys_sendto+0xfa/0x170 ? xfd_validate_state+0x1d/0x80 ? switch_fpu_return+0x59/0xe0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x37/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: cd9733f5d75c ("tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function") Signed-off-by: Pengcheng Yang Acked-by: John Fastabend --- net/ipv4/tcp_bpf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index f8b12b9..ef5de4f 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -279,7 +279,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock, bool cork = false, enospc = sk_msg_full(msg); struct sock *sk_redir; u32 tosend, origsize, sent, delta = 0; - u32 eval = __SK_NONE; + u32 eval; int ret; more_data: @@ -310,6 +310,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock, tosend = msg->sg.size; if (psock->apply_bytes && psock->apply_bytes < tosend) tosend = psock->apply_bytes; + eval = __SK_NONE; switch (psock->eval) { case __SK_PASS: