kernel bug found and suggestions for fixing it

Message ID: tencent_CE572E29B79ABD1AB33F1980363ADE182606@qq.com
State: New
Series: kernel bug found and suggestions for fixing it

Commit Message

ffhgfv March 4, 2025, 7:31 a.m. UTC
Hello, I found a bug titled "KASAN: null-ptr-deref Read in smc_tcp_syn_recv_sock" with a modified syzkaller in the latest upstream kernel, related to the bcachefs file system.
If you fix this issue, please add the following tag to the commit: Reported-by: Jianzhou Zhao <xnxc22xnxc22@qq.com>, xingwei lee <xrivendell7@gmail.com>, Zhizhuo Tang <strforexctzzchange@foxmail.com>

------------[ cut here ]------------
TITLE: KASAN: null-ptr-deref Read in smc_tcp_syn_recv_sock
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:68 [inline]
BUG: KASAN: null-ptr-deref in atomic_read include/linux/atomic/atomic-instrumented.h:32 [inline]
BUG: KASAN: null-ptr-deref in smc_tcp_syn_recv_sock+0xa7/0x4c0 net/smc/af_smc.c:131
Read of size 4 at addr 0000000000000a04 by task syz.7.21/12319

CPU: 1 UID: 0 PID: 12319 Comm: syz.7.21 Not tainted 6.14.0-rc5-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
 kasan_report+0xbd/0xf0 mm/kasan/report.c:634
 check_region_inline mm/kasan/generic.c:183 [inline]
 kasan_check_range+0xf4/0x1a0 mm/kasan/generic.c:189
 instrument_atomic_read include/linux/instrumented.h:68 [inline]
 atomic_read include/linux/atomic/atomic-instrumented.h:32 [inline]
 smc_tcp_syn_recv_sock+0xa7/0x4c0 net/smc/af_smc.c:131
 tcp_check_req+0x5e4/0x1a90 net/ipv4/tcp_minisocks.c:861
 tcp_v4_rcv+0x1753/0x44e0 net/ipv4/tcp_ipv4.c:2274
 ip_protocol_deliver_rcu+0xba/0x4c0 net/ipv4/ip_input.c:205
 ip_local_deliver_finish+0x320/0x570 net/ipv4/ip_input.c:233
 NF_HOOK include/linux/netfilter.h:314 [inline]
 NF_HOOK include/linux/netfilter.h:308 [inline]
 ip_local_deliver+0x19a/0x200 net/ipv4/ip_input.c:254
 dst_input include/net/dst.h:469 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
 NF_HOOK include/linux/netfilter.h:314 [inline]
 NF_HOOK include/linux/netfilter.h:308 [inline]
 ip_rcv+0x2be/0x5d0 net/ipv4/ip_input.c:567
 __netif_receive_skb_one_core+0x19b/0x1f0 net/core/dev.c:5893
 __netif_receive_skb+0x1d/0x170 net/core/dev.c:6006
 process_backlog+0x319/0x1460 net/core/dev.c:6354
 __napi_poll.constprop.0+0xb6/0x540 net/core/dev.c:7188
 napi_poll net/core/dev.c:7257 [inline]
 net_rx_action+0x9d2/0xe30 net/core/dev.c:7379
 handle_softirqs+0x1d1/0x870 kernel/softirq.c:561
 do_softirq kernel/softirq.c:462 [inline]
 do_softirq+0xac/0xe0 kernel/softirq.c:449
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x100/0x120 kernel/softirq.c:389
 local_bh_enable include/linux/bottom_half.h:33 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
 __dev_queue_xmit+0x1b7a/0x4120 net/core/dev.c:4676
 dev_queue_xmit include/linux/netdevice.h:3313 [inline]
 neigh_hh_output include/net/neighbour.h:523 [inline]
 neigh_output include/net/neighbour.h:537 [inline]
 ip_finish_output2+0xc1c/0x1f10 net/ipv4/ip_output.c:236
 __ip_finish_output net/ipv4/ip_output.c:314 [inline]
 __ip_finish_output+0x442/0x940 net/ipv4/ip_output.c:296
 ip_finish_output+0x35/0x380 net/ipv4/ip_output.c:324
 NF_HOOK_COND include/linux/netfilter.h:303 [inline]
 ip_output+0x146/0x2b0 net/ipv4/ip_output.c:434
 dst_output include/net/dst.h:459 [inline]
 ip_local_out net/ipv4/ip_output.c:130 [inline]
 __ip_queue_xmit+0x19ee/0x21f0 net/ipv4/ip_output.c:528
 __tcp_transmit_skb+0x2a55/0x3e70 net/ipv4/tcp_output.c:1471
 __tcp_send_ack.part.0+0x39c/0x720 net/ipv4/tcp_output.c:4275
 __tcp_send_ack net/ipv4/tcp_output.c:4281 [inline]
 tcp_send_ack+0x81/0xa0 net/ipv4/tcp_output.c:4281
 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6600 [inline]
 tcp_rcv_state_process+0x40e2/0x4c80 net/ipv4/tcp_input.c:6794
 tcp_v4_do_rcv+0x1a8/0xa70 net/ipv4/tcp_ipv4.c:1941
 sk_backlog_rcv include/net/sock.h:1122 [inline]
 __release_sock+0x31d/0x400 net/core/sock.c:3123
 release_sock+0x5a/0x220 net/core/sock.c:3677
 tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1359
 inet_sendmsg+0xb9/0x150 net/ipv4/af_inet.c:851
 smc_sendmsg+0x22a/0x530 net/smc/af_smc.c:2796
 sock_sendmsg_nosec net/socket.c:718 [inline]
 __sock_sendmsg net/socket.c:733 [inline]
 ____sys_sendmsg+0xab8/0xc70 net/socket.c:2573
 ___sys_sendmsg+0x11d/0x1c0 net/socket.c:2627
 __sys_sendmsg+0x151/0x200 net/socket.c:2659
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcb/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f36ef9a962d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f36f08c5f98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f36efbc5f80 RCX: 00007f36ef9a962d
RDX: 0000000024040049 RSI: 0000000020000200 RDI: 0000000000000004
RBP: 00007f36efa4e373 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f36efbc5f80 R15: 00007f36f08a6000
 </TASK>
==================================================================
I use the same kernel as syzbot instance upstream: 7eb172143d5508b4da468ed59ee857c6e5e01da6
kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=da4b04ae798b7ef6
compiler: gcc version 11.4.0
===============================================================================
Unfortunately, the modified syzkaller did not generate a working reproducer.
Below is my analysis of the bug and my suggestions for a fix, in the hope that they help:
## Root cause analysis
1. **Null pointer access**: the 'smc_tcp_syn_recv_sock' function does not check whether the 'sk' pointer is valid before directly accessing its member 'sk_wmem_alloc'.
2. **Missing initialization**: the 'sk' pointer may not be initialized correctly, or a function may return early on an error path (such as a memory allocation failure or an SMC negotiation failure), leaving a null pointer for subsequent operations.
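
The faulting address can be cross-checked against the trace. A KASAN null-ptr-deref at a small address is usually the offset of the accessed field from a NULL base pointer. The trace attributes the 4-byte read to an inlined atomic_read in smc_tcp_syn_recv_sock, and the only atomic_read in the patch context is atomic_read(&smc->queued_smc_hs). Assuming that is the faulting access, 0xa04 would be the offset of that field in this build, which is consistent with the 'smc' NULL check in the proposed patch. The following is a minimal userspace sketch of that arithmetic. struct demo_smc_sock, its padding, and field_addr() are hypothetical stand-ins chosen only to reproduce the offset, not the real kernel layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure being dereferenced. The padding
 * is chosen only so that the field lands at 0xa04, the address in the
 * KASAN report; it does not describe the real kernel layout. */
struct demo_smc_sock {
	char pad[0xa04];
	int queued_smc_hs;	/* modelled as a plain 4-byte int */
};

/* Compile-time check that the model reproduces the reported offset. */
static_assert(offsetof(struct demo_smc_sock, queued_smc_hs) == 0xa04,
	      "model field must sit at the reported fault address");

/* Address that a read of the field would touch for a given base pointer
 * value. With base == 0 (NULL) the result is just the field offset. */
static uintptr_t field_addr(uintptr_t base)
{
	return base + offsetof(struct demo_smc_sock, queued_smc_hs);
}
```

Calling field_addr(0) yields the field offset itself, which is why a NULL base pointer shows up in the report as a small non-zero address rather than as address 0.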

### Repair suggestions
1. **Null pointer check**: add 'if (!sk) return NULL;' before accessing members of 'sk'.
2. **Error path handling**: ensure that resources are cleaned up promptly when 'inet_csk_clone' fails, to avoid passing invalid pointers.
Patch example:

=========================================================================
I hope it helps.
Best regards
Jianzhou Zhao
xingwei lee
Zhizhuo Tang

Comments

D. Wythe March 5, 2025, 12:06 p.m. UTC | #1
On Tue, Mar 04, 2025 at 02:31:37AM -0500, ffhgfv wrote:
> Hello, I found a bug titled "KASAN: null-ptr-deref Read in smc_tcp_syn_recv_sock" with a modified syzkaller in the latest upstream kernel, related to the bcachefs file system.
> If you fix this issue, please add the following tag to the commit: Reported-by: Jianzhou Zhao <xnxc22xnxc22@qq.com>, xingwei lee <xrivendell7@gmail.com>, Zhizhuo Tang <strforexctzzchange@foxmail.com>
> 
> ------------[ cut here ]------------
> TITLE: KASAN: null-ptr-deref Read in smc_tcp_syn_recv_sock
> ==================================================================
> BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:68 [inline]
> BUG: KASAN: null-ptr-deref in atomic_read include/linux/atomic/atomic-instrumented.h:32 [inline]
> BUG: KASAN: null-ptr-deref in smc_tcp_syn_recv_sock+0xa7/0x4c0 net/smc/af_smc.c:131
> Read of size 4 at addr 0000000000000a04 by task syz.7.21/12319
> 
> CPU: 1 UID: 0 PID: 12319 Comm: syz.7.21 Not tainted 6.14.0-rc5-dirty #2
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -128,6 +128,8 @@
>  	struct sock *child;
>  
>  	smc = smc_clcsock_user_data(sk);
> +	if (!smc)
> +		goto drop;  // Ensure that the smc pointer is valid before accessing its members

Hi ffhgfv,

Thanks for your report and solution.

The bigger issue here is that smc_clcsock_user_data currently requires
lock protection, which means we need to acquire the sk_callback_lock here.
But the sk in this context is const, which violates the expected interface.

In fact, we have been planning to replace sk_callback_lock with RCU, which should
provide a better solution to this issue. However, there is still a
significant backlog of tasks related to SMC, and we haven't had the
bandwidth to address this yet. 
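
As a rough illustration of the direction described above, the sketch below models a lockless read of the user-data pointer through a const socket, followed by a NULL check that takes the drop path. It is a userspace model using C11 atomics, not kernel code: struct demo_sock, struct demo_smc, and the demo_* helpers are hypothetical names. Real RCU additionally defers freeing the object until all readers have finished, which this model does not cover.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct demo_smc {
	int queued_hs;		/* handshakes currently queued */
};

struct demo_sock {
	_Atomic(struct demo_smc *) user_data;	/* may be cleared concurrently */
	int ack_backlog;
	int max_ack_backlog;
};

/* Reader side: take a single snapshot of the pointer without a lock.
 * The socket is only read, so the parameter can stay const. */
static struct demo_smc *demo_user_data(const struct demo_sock *sk)
{
	return atomic_load_explicit(&sk->user_data, memory_order_acquire);
}

/* Writer side: publish the pointer, or clear it by passing NULL. */
static void demo_set_user_data(struct demo_sock *sk, struct demo_smc *smc)
{
	atomic_store_explicit(&sk->user_data, smc, memory_order_release);
}

/* Returns 1 to accept the connection request, 0 to drop it. */
static int demo_syn_recv(const struct demo_sock *sk)
{
	struct demo_smc *smc = demo_user_data(sk);

	if (!smc)
		return 0;	/* user data already cleared: drop */
	if (sk->ack_backlog + smc->queued_hs > sk->max_ack_backlog)
		return 0;	/* backlog full: drop */
	return 1;
}
```

The point of the model is that the reader works from one snapshot of the pointer and checks it before use, so a concurrent clear results in a dropped request rather than a NULL dereference.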

Anyway, we sincerely appreciate your report, and we will fix
this issue in the future.

Best wishes,
D. Wythe

>  
>  	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
>  	    sk->sk_max_ack_backlog)
> 
> =========================================================================
> I hope it helps.
> Best regards
> Jianzhou Zhao
> xingwei lee
> Zhizhuo Tang

Patch

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -128,6 +128,8 @@ 
 	struct sock *child;
 
 	smc = smc_clcsock_user_data(sk);
+	if (!smc)
+		goto drop;  // Ensure that the smc pointer is valid before accessing its members
 
 	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
 	    sk->sk_max_ack_backlog)