Message ID | 20240501125448.896529-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 94062790aedb505bdda209b10bea47b294d6394f |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets | expand |
On Wed, May 1, 2024 at 8:54 AM Eric Dumazet <edumazet@google.com> wrote: > > TCP_SYN_RECV state is really special, it is only used by > cross-syn connections, mostly used by fuzzers. > > In the following crash [1], syzbot managed to trigger a divide > by zero in tcp_rcv_space_adjust() > > A socket makes the following state transitions, > without ever calling tcp_init_transfer(), > meaning tcp_init_buffer_space() is also not called. > > TCP_CLOSE > connect() > TCP_SYN_SENT > TCP_SYN_RECV > shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN) > TCP_FIN_WAIT1 > > To fix this issue, change tcp_shutdown() to not > perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition, > which makes no sense anyway. > > When tcp_rcv_state_process() later changes socket state > from TCP_SYN_RECV to TCP_ESTABLISH, then look at > sk->sk_shutdown to finally enter TCP_FIN_WAIT1 state, > and send a FIN packet from a sane socket state. > > This means tcp_send_fin() can now be called from BH > context, and must use GFP_ATOMIC allocations. > > [1] > divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI > CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dccd2f8 #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 > RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767 > Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48 > RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246 > RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7 > R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30 > R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da > FS: 00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0 > Call Trace: > <TASK> > tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513 > tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578 > inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680 > sock_recvmsg_nosec net/socket.c:1046 [inline] > sock_recvmsg+0x109/0x280 net/socket.c:1068 > ____sys_recvmsg+0x1db/0x470 net/socket.c:2803 > ___sys_recvmsg net/socket.c:2845 [inline] > do_recvmmsg+0x474/0xae0 net/socket.c:2939 > __sys_recvmmsg net/socket.c:3018 [inline] > __do_sys_recvmmsg net/socket.c:3041 [inline] > __se_sys_recvmmsg net/socket.c:3034 [inline] > __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x77/0x7f > RIP: 0033:0x7faeb6363db9 > Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9 > RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005 > RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c > R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 > R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001 > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Reported-by: syzbot <syzkaller@googlegroups.com> > Signed-off-by: Eric Dumazet <edumazet@google.com> > --- Very nice find and fix! Thanks, Eric! Acked-by: Neal Cardwell <ncardwell@google.com> neal
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Wed, 1 May 2024 12:54:48 +0000 you wrote: > TCP_SYN_RECV state is really special, it is only used by > cross-syn connections, mostly used by fuzzers. > > In the following crash [1], syzbot managed to trigger a divide > by zero in tcp_rcv_space_adjust() > > A socket makes the following state transitions, > without ever calling tcp_init_transfer(), > meaning tcp_init_buffer_space() is also not called. > > [...] Here is the summary with links: - [net] tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets https://git.kernel.org/netdev/net/c/94062790aedb You are awesome, thank you!
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e767721b3a588b5d56567ae7badf5dffcd35a76a..66d77faca64f6db95e04f4c0e7dd3892628ae3f7 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2710,7 +2710,7 @@ void tcp_shutdown(struct sock *sk, int how) /* If we've already sent a FIN, or it's a closed state, skip this. */ if ((1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_SYN_SENT | - TCPF_SYN_RECV | TCPF_CLOSE_WAIT)) { + TCPF_CLOSE_WAIT)) { /* Clear out any half completed packets. FIN if needed. */ if (tcp_close_state(sk)) tcp_send_fin(sk); @@ -2819,7 +2819,7 @@ void __tcp_close(struct sock *sk, long timeout) * machine. State transitions: * * TCP_ESTABLISHED -> TCP_FIN_WAIT1 - * TCP_SYN_RECV -> TCP_FIN_WAIT1 (forget it, it's impossible) + * TCP_SYN_RECV -> TCP_FIN_WAIT1 (it is difficult) * TCP_CLOSE_WAIT -> TCP_LAST_ACK * * are legal only when FIN has been sent (i.e. in window), diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 5d874817a78db31a4a807ab80e9158300329423d..a140d9f7a0a36e6a0b90c97a44a1e54e7639c71f 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6761,6 +6761,8 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) tcp_initialize_rcv_mss(sk); tcp_fast_path_on(tp); + if (sk->sk_shutdown & SEND_SHUTDOWN) + tcp_shutdown(sk, SEND_SHUTDOWN); break; case TCP_FIN_WAIT1: { diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index e3167ad965676facaacd8f82848c52cf966f97c3..02caeb7bcf6342713019d31891998fdbe426b573 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3563,7 +3563,9 @@ void tcp_send_fin(struct sock *sk) return; } } else { - skb = alloc_skb_fclone(MAX_TCP_HEADER, sk->sk_allocation); + skb = alloc_skb_fclone(MAX_TCP_HEADER, + sk_gfp_mask(sk, GFP_ATOMIC | + __GFP_NOWARN)); if (unlikely(!skb)) return;
TCP_SYN_RECV state is really special, it is only used by cross-syn connections, mostly used by fuzzers. In the following crash [1], syzbot managed to trigger a divide by zero in tcp_rcv_space_adjust() A socket makes the following state transitions, without ever calling tcp_init_transfer(), meaning tcp_init_buffer_space() is also not called. TCP_CLOSE connect() TCP_SYN_SENT TCP_SYN_RECV shutdown() -> tcp_shutdown(sk, SEND_SHUTDOWN) TCP_FIN_WAIT1 To fix this issue, change tcp_shutdown() to not perform a TCP_SYN_RECV -> TCP_FIN_WAIT1 transition, which makes no sense anyway. When tcp_rcv_state_process() later changes socket state from TCP_SYN_RECV to TCP_ESTABLISH, then look at sk->sk_shutdown to finally enter TCP_FIN_WAIT1 state, and send a FIN packet from a sane socket state. This means tcp_send_fin() can now be called from BH context, and must use GFP_ATOMIC allocations. [1] divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 1 PID: 5084 Comm: syz-executor358 Not tainted 6.9.0-rc6-syzkaller-00022-g98369dccd2f8 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:tcp_rcv_space_adjust+0x2df/0x890 net/ipv4/tcp_input.c:767 Code: e3 04 4c 01 eb 48 8b 44 24 38 0f b6 04 10 84 c0 49 89 d5 0f 85 a5 03 00 00 41 8b 8e c8 09 00 00 89 e8 29 c8 48 0f af c3 31 d2 <48> f7 f1 48 8d 1c 43 49 8d 96 76 08 00 00 48 89 d0 48 c1 e8 03 48 RSP: 0018:ffffc900031ef3f0 EFLAGS: 00010246 RAX: 0c677a10441f8f42 RBX: 000000004fb95e7e RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000027d4b11f R08: ffffffff89e535a4 R09: 1ffffffff25e6ab7 R10: dffffc0000000000 R11: ffffffff8135e920 R12: ffff88802a9f8d30 R13: dffffc0000000000 R14: ffff88802a9f8d00 R15: 1ffff1100553f2da FS: 00005555775c0380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1155bf2304 CR3: 000000002b9f2000 CR4: 0000000000350ef0 Call Trace: <TASK> tcp_recvmsg_locked+0x106d/0x25a0 net/ipv4/tcp.c:2513 tcp_recvmsg+0x25d/0x920 net/ipv4/tcp.c:2578 inet6_recvmsg+0x16a/0x730 net/ipv6/af_inet6.c:680 sock_recvmsg_nosec net/socket.c:1046 [inline] sock_recvmsg+0x109/0x280 net/socket.c:1068 ____sys_recvmsg+0x1db/0x470 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x474/0xae0 net/socket.c:2939 __sys_recvmmsg net/socket.c:3018 [inline] __do_sys_recvmmsg net/socket.c:3041 [inline] __se_sys_recvmmsg net/socket.c:3034 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3034 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7faeb6363db9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffcc1997168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faeb6363db9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000001c R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> --- net/ipv4/tcp.c | 4 ++-- net/ipv4/tcp_input.c | 2 ++ net/ipv4/tcp_output.c | 4 +++- 3 files changed, 7 insertions(+), 3 deletions(-)