From patchwork Fri Jun 28 05:47:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Geliang Tang
X-Patchwork-Id: 13715481
X-Patchwork-Delegate: kuba@kernel.org
From: Geliang Tang
To: John Fastabend, Jakub Sitnicki, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann
Cc: Geliang Tang, David Ahern, Eduard Zingerman, Mykola Lysenko,
 Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, Shuah Khan, Mykyta Yatsenko, Miao Xu, Yuran Pereira,
 Huacai Chen, Tiezhu Yang, netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kselftest@vger.kernel.org
Subject: [PATCH net v3 1/2] skmsg: prevent empty ingress skb from enqueuing
Date: Fri, 28 Jun 2024 13:47:47 +0800
Message-ID: <5b6a55017ab616131f7de1268b60cb34e99941a1.1719553101.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Geliang Tang

Running the BPF selftests (./test_progs -t sockmap_basic) on a LoongArch
platform triggers a kernel panic:

'''
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ... ...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
 .text @ 0x9000000003710000
 .data @ 0x9000000004d70000
 .bss @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''

This crash happens every time the sockmap_skb_verdict_shutdown subtest in
sockmap_basic is run.

The crash occurs because a NULL pointer is passed to page_address() in
sk_msg_recvmsg(). Due to architectural differences, page_address(0) does
not trigger a panic on x86 but does on LoongArch, so the bug stayed hidden
on x86 and is now exposed on LoongArch.

The root cause is that an empty skb (skb->len == 0) is put on the queue.
In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero and no
page is attached to the sge (see sg_set_page()), but the empty sge is
still queued onto the ingress_msg list. Later, sk_msg_recvmsg() uses this
empty sge, and sg_page(sge) returns a NULL page. That NULL page is passed
to copy_page_to_iter(), which hands it to kmap_local_page() and
page_address(), and the kernel panics.

To fix this, prevent empty skbs from being put on the queue: in
sk_psock_verdict_recv(), drop the skb if skb->len is zero.

Fixes: ef5659280eb1 ("bpf, sockmap: Allow skipping sk_skb parser program")
Signed-off-by: Geliang Tang
Reviewed-by: D. Wythe
---
 net/core/skmsg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..44952cdd1425 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1184,7 +1184,7 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
 	rcu_read_lock();
 	psock = sk_psock(sk);
-	if (unlikely(!psock)) {
+	if (unlikely(!psock || !len)) {
 		len = 0;
 		tcp_eat_skb(sk, skb);
 		sock_drop(sk, skb);

From patchwork Fri Jun 28 05:47:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Geliang Tang
X-Patchwork-Id: 13715482
X-Patchwork-Delegate: kuba@kernel.org
From: Geliang Tang
To: John Fastabend, Jakub Sitnicki, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann
Cc: Geliang Tang, David Ahern, Eduard Zingerman, Mykola Lysenko,
 Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, Shuah Khan, Mykyta Yatsenko, Miao Xu, Yuran Pereira,
 Huacai Chen, Tiezhu Yang, netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kselftest@vger.kernel.org
Subject: [PATCH net v3 2/2] skmsg: bugfix for sk_msg sge iteration
Date: Fri, 28 Jun 2024 13:47:48 +0800
Message-ID: <56d8ec28df901432e7bde4953795166ce2edd472.1719553101.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

From: Geliang Tang

Running the BPF selftest (./test_sockmap) on a LoongArch platform triggers
a kernel panic every time:

'''
Oops[#1]:
CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G OE 6.10.0-rc2+ #32
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420
ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043, task=...)
Stack : ... ... ...
Call Trace:
[<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220
[<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420
[<90000000041cafc8>] __sock_sendmsg+0x68/0xe0
[<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360
[<90000000041cea18>] ___sys_sendmsg+0xb8/0x120
[<90000000041cf1f8>] __sys_sendmsg+0x98/0x100
[<90000000045b76ec>] do_syscall+0x8c/0xc0
[<90000000030e1da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
'''

The crash occurs because a NULL pointer is passed to page_address() in
sk_msg_memcopy_from_iter(). Due to architectural differences,
page_address(0) does not trigger a panic on x86 but does on LoongArch, so
the bug stayed hidden on x86 and is now exposed on LoongArch.

The bug is a logic error. In sk_msg_memcopy_from_iter(), an invalid "sge"
may be used:

	if (msg->sg.copybreak >= sge->length) {
		msg->sg.copybreak = 0;
		sk_msg_iter_var_next(i);
		if (i == msg->sg.end)
			break;
		sge = sk_msg_elem(msg, i);
	}

If i is 2 and msg->sg.end is also 2 on entry to this if block,
sk_msg_iter_var_next() increments i to 3, which no longer equals
msg->sg.end. The break is not taken, and the next sge, obtained by
sk_msg_elem(msg, 3), is invalid.

The correct approach is to check (i == msg->sg.end) first and to call
sk_msg_iter_var_next() only if they are not equal.
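
As a rough illustration of the ordering problem described above, the
following standalone C sketch models only the index arithmetic. The toy_*
helpers and TOY_RING_SLOTS are hypothetical names made up for this sketch
and are not part of net/core/skmsg.c; toy_next() is assumed to behave like
sk_msg_iter_var_next(), that is, increment with wrap-around.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring size for this sketch; not the kernel's value. */
#define TOY_RING_SLOTS 8

/* Toy stand-in for sk_msg_iter_var_next(): advance with wrap-around. */
static unsigned int toy_next(unsigned int i)
{
	i++;
	return i == TOY_RING_SLOTS ? 0 : i;
}

/*
 * Old order: advance first, then compare against end.
 * Returns true if the loop would stop, false if it would continue and
 * read the element at *i.
 */
static bool toy_advance_then_check(unsigned int *i, unsigned int end)
{
	assert(*i < TOY_RING_SLOTS && end < TOY_RING_SLOTS);
	*i = toy_next(*i);
	return *i == end;
}

/* New order: compare against end first, advance only if not at end. */
static bool toy_check_then_advance(unsigned int *i, unsigned int end)
{
	assert(*i < TOY_RING_SLOTS && end < TOY_RING_SLOTS);
	if (*i == end)
		return true;
	*i = toy_next(*i);
	return false;
}
```

With i == 2 and end == 2, toy_advance_then_check() moves i to 3 and
reports that the loop continues, while toy_check_then_advance() stops with
i still at 2.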

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Geliang Tang
Reviewed-by: D. Wythe
---
 net/core/skmsg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 44952cdd1425..1906d0d0eeac 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -378,9 +378,9 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
 		/* This is possible if a trim operation shrunk the buffer */
 		if (msg->sg.copybreak >= sge->length) {
 			msg->sg.copybreak = 0;
-			sk_msg_iter_var_next(i);
 			if (i == msg->sg.end)
 				break;
+			sk_msg_iter_var_next(i);
 			sge = sk_msg_elem(msg, i);
 		}