Message ID | 8db98a8fbf2ac673b355651852093579a913f3f1.1716199422.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [net] tcp: ensure sk_showdown is 0 for listening sockets |
On Mon, May 20, 2024 at 12:04:47PM +0200, Paolo Abeni wrote:
> Christoph reported the following splat:
>
> WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0
> Modules linked in:
> CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759
> Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80
> RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293
> RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64
> R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000
> R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800
> FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786
> do_accept+0x435/0x620 net/socket.c:1929
> __sys_accept4_file net/socket.c:1969 [inline]
> __sys_accept4+0x9b/0x110 net/socket.c:1999
> __do_sys_accept net/socket.c:2016 [inline]
> __se_sys_accept net/socket.c:2013 [inline]
> __x64_sys_accept+0x7d/0x90 net/socket.c:2013
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x4315f9
> Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b
> RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
> RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300
> R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000
> R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055
> </TASK>
>
> Listener sockets are supposed to have a zero sk_shutdown, as the
> accepted children will inherit such field.
>
> Invoking shutdown() before entering the listener status allows
> violating the above constraint.
>
> After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for
> TCP_SYN_RECV sockets"), the above causes the child to reach the accept
> syscall in FIN_WAIT1 status.
>
> Address the issue explicitly by clearing sk_shutdown at listen time.
>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
> Fixes: 1da177e4c3fu ("Linux-2.6.12-rc2")

nit: 1da177e4c3f

> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

...

On Mon, May 20, 2024 at 12:05 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Christoph reported the following splat:
>
> WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0
> ...
>
> Listener sockets are supposed to have a zero sk_shutdown, as the
> accepted children will inherit such field.
>
> Invoking shutdown() before entering the listener status allows
> violating the above constraint.
>
> After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for
> TCP_SYN_RECV sockets"), the above causes the child to reach the accept
> syscall in FIN_WAIT1 status.
>
> Address the issue explicitly by clearing sk_shutdown at listen time.
>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
> Fixes: 1da177e4c3fu ("Linux-2.6.12-rc2")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> Note: the issue above reports an MPTCP reproducer, but I can reproduce
> the issue even using plain TCP sockets only.
> ---
> net/ipv4/inet_connection_sock.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 3b38610958ee..dab723fea0cc 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -1269,6 +1269,8 @@ int inet_csk_listen_start(struct sock *sk)
>
>  	reqsk_queue_alloc(&icsk->icsk_accept_queue);
>
> +	/* closed sockets can have non zero sk_shutdown */
> +	WRITE_ONCE(sk->sk_shutdown, 0);

Hi Paolo.

I am unsure about your patch, I had an internal syzbot report about
this before going OOO for a few days, and my first reaction was to
change the WARN in inet_accept().

Perhaps some applications are relying on calling shutdown() before listen()...

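For readers who want to try the sequence discussed here (shutdown() before listen(), then a client connection and accept()), it can be sketched in userspace C with plain TCP sockets. This is a minimal sketch, not the reproducer from the report: the helper names, the IPv4 loopback address and the kernel-chosen port are assumptions made for illustration.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket, shut down its send side
 * *before* listen(), then bind and listen on loopback.
 * Returns the listening fd (port stored in *port) or -1 on error. */
int make_shutdown_listener(unsigned short *port)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd;

	assert(port != NULL);

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* This call may fail on an unconnected socket (typically ENOTCONN);
	 * per the patch comment, the closed socket can still end up with a
	 * non-zero sk_shutdown, so the return value is ignored here. */
	(void)shutdown(fd, SHUT_WR);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* assumption: let the kernel pick a free port */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 0) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*port = ntohs(addr.sin_port);
	return fd;
}

/* Hypothetical helper: connect a client to the listener and accept() the
 * child. The accept() path is where the WARNING in the report fires.
 * Returns 0 if accept() succeeded, -1 otherwise. */
int connect_and_accept(int lfd, unsigned short port)
{
	struct sockaddr_in addr;
	int cfd, afd;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);

	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(cfd);
		return -1;
	}

	afd = accept(lfd, NULL, NULL);
	if (afd >= 0)
		close(afd);
	close(cfd);
	return afd >= 0 ? 0 : -1;
}
```

Calling make_shutdown_listener() and then connect_and_accept() with the returned fd and port walks the same path as the report. On an affected kernel the WARNING would be expected in the kernel log, so this is best run on a test machine only.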
On Mon, May 20, 2024 at 3:46 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, May 20, 2024 at 12:05 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > ...
> > +	/* closed sockets can have non zero sk_shutdown */
> > +	WRITE_ONCE(sk->sk_shutdown, 0);
>
> Hi Paolo.
>
> I am unsure about your patch, I had an internal syzbot report about
> this before going OOO for a few days,
> and my first reaction was to change the WARN in inet_accept().
>
> Perhaps some applications are relying on calling shutdown() before listen()...

BTW the syzbot repro was

r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
sendto$inet6(0xffffffffffffffff, 0x0, 0x0, 0x20000004, 0x0, 0x0)
shutdown(r0, 0x1)
bind$inet6(r0, &(0x7f0000000040)={0xa, 0x4e22, 0x0, @empty}, 0x1c)
listen(r0, 0x0)
r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
connect$inet(r1, &(0x7f0000000000)={0x2, 0x4e22, @local}, 0x10)
accept(r0, 0x0, 0x0)

Hi,

On Mon, 2024-05-20 at 16:07 +0200, Eric Dumazet wrote:
> On Mon, May 20, 2024 at 3:46 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, May 20, 2024 at 12:05 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > ...
> > > +	/* closed sockets can have non zero sk_shutdown */
> > > +	WRITE_ONCE(sk->sk_shutdown, 0);
> >
> > Hi Paolo.
> >
> > I am unsure about your patch, I had an internal syzbot report about
> > this before going OOO for a few days,
> > and my first reaction was to change the WARN in inet_accept().
> >
> > Perhaps some applications are relying on calling shutdown() before listen()...

Uhmm, right, I did not consider that a non-zero sk_shutdown would have
affected recvmsg() and sendmsg() even prior to 94062790aedb ("tcp:
defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets").

> BTW the syzbot repro was
>
> r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
> sendto$inet6(0xffffffffffffffff, 0x0, 0x0, 0x20000004, 0x0, 0x0)
> shutdown(r0, 0x1)
> bind$inet6(r0, &(0x7f0000000040)={0xa, 0x4e22, 0x0, @empty}, 0x1c)
> listen(r0, 0x0)
> r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
> connect$inet(r1, &(0x7f0000000000)={0x2, 0x4e22, @local}, 0x10)
> accept(r0, 0x0, 0x0)

The above is very similar to what Christoph reported. It should splat
even when replacing 0x106 with 0 (mptcp -> tcp).

I'm fine with relaxing the check in __inet_accept(). Do you prefer to
send the patch yourself, or should I send a v2? The condition should be

WARN_ON(!((1 << newsk->sk_state) &
	  (TCPF_ESTABLISHED | TCPF_SYN_RECV |
	   TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 |
	   TCPF_CLOSING | TCPF_CLOSE_WAIT |
	   TCPF_CLOSE)));

I guess.

Thanks!

Paolo

On Mon, May 20, 2024 at 4:46 PM Paolo Abeni <pabeni@redhat.com> wrote:
> ...
> I'm fine with relaxing the check in __inet_accept(). Do you prefer to
> send the patch yourself, or should I send a v2? The condition should be
>
> WARN_ON(!((1 << newsk->sk_state) &
>	  (TCPF_ESTABLISHED | TCPF_SYN_RECV |
>	   TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 |
>	   TCPF_CLOSING | TCPF_CLOSE_WAIT |
>	   TCPF_CLOSE)));
>

Please send a v2.

I am not sure why we need a WARN_ON() to begin with, the socket is
still private.

Even the lock_sock(sk2)/release_sock(sk2) pair in inet_accept() seems overkill.

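The relaxed condition quoted above is a bitmask test on the child socket's state, and it can be tried in userspace. The sketch below copies the TCP state numbers from the kernel's include/net/tcp_states.h. The TCPF() macro and accept_state_ok() are hypothetical helpers, and the "strict" mask is my recollection of the check before the change, so treat it as an assumption and compare it with net/ipv4/af_inet.c in your tree.

```c
#include <assert.h>
#include <stdbool.h>

/* TCP state numbers, mirroring include/net/tcp_states.h so that the
 * sketch builds outside the kernel. */
enum {
	TCP_ESTABLISHED = 1,
	TCP_SYN_SENT,
	TCP_SYN_RECV,
	TCP_FIN_WAIT1,
	TCP_FIN_WAIT2,
	TCP_TIME_WAIT,
	TCP_CLOSE,
	TCP_CLOSE_WAIT,
	TCP_LAST_ACK,
	TCP_LISTEN,
	TCP_CLOSING,
};

/* Hypothetical shorthand; the kernel spells these TCPF_ESTABLISHED etc. */
#define TCPF(state) (1u << (state))

/* Relaxed mask, as proposed in the thread. */
#define ACCEPT_MASK_RELAXED \
	(TCPF(TCP_ESTABLISHED) | TCPF(TCP_SYN_RECV) | \
	 TCPF(TCP_FIN_WAIT1) | TCPF(TCP_FIN_WAIT2) | \
	 TCPF(TCP_CLOSING) | TCPF(TCP_CLOSE_WAIT) | \
	 TCPF(TCP_CLOSE))

/* ASSUMPTION: the mask checked before the change, written from memory. */
#define ACCEPT_MASK_STRICT \
	(TCPF(TCP_ESTABLISHED) | TCPF(TCP_SYN_RECV) | \
	 TCPF(TCP_CLOSE_WAIT) | TCPF(TCP_CLOSE))

/* True when `state` is one of the states allowed by `mask`, i.e. when the
 * WARN_ON() in the quoted condition would stay silent. */
bool accept_state_ok(int state, unsigned int mask)
{
	assert(state > 0 && state < 32); /* keep the shift well defined */
	return (TCPF(state) & mask) != 0;
}
```

With these definitions, accept_state_ok(TCP_FIN_WAIT1, ACCEPT_MASK_STRICT) is false, which matches the child reaching accept() in FIN_WAIT1 and triggering the splat, while the relaxed mask accepts that state.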
On Mon, 2024-05-20 at 16:53 +0200, Eric Dumazet wrote: > On Mon, May 20, 2024 at 4:46 PM Paolo Abeni <pabeni@redhat.com> wrote: > > > > Hi, > > > > On Mon, 2024-05-20 at 16:07 +0200, Eric Dumazet wrote: > > > On Mon, May 20, 2024 at 3:46 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > > > On Mon, May 20, 2024 at 12:05 PM Paolo Abeni <pabeni@redhat.com> wrote: > > > > > > > > > > Christoph reported the following splat: > > > > > > > > > > WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0 > > > > > Modules linked in: > > > > > CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56 > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 > > > > > RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759 > > > > > Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80 > > > > > RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293 > > > > > RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000 > > > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > > > > > RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64 > > > > > R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000 > > > > > R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800 > > > > > FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000 > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > > CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0 > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > > PKRU: 55555554 > > > > > Call Trace: > > > > > <TASK> > > > > > inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786 > > > > > do_accept+0x435/0x620 net/socket.c:1929 > > 
> > > __sys_accept4_file net/socket.c:1969 [inline] > > > > > __sys_accept4+0x9b/0x110 net/socket.c:1999 > > > > > __do_sys_accept net/socket.c:2016 [inline] > > > > > __se_sys_accept net/socket.c:2013 [inline] > > > > > __x64_sys_accept+0x7d/0x90 net/socket.c:2013 > > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > > > > do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83 > > > > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > > > RIP: 0033:0x4315f9 > > > > > Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00 > > > > > RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b > > > > > RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9 > > > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004 > > > > > RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300 > > > > > R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000 > > > > > R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055 > > > > > </TASK> > > > > > > > > > > Listener sockets are supposed to have a zero sk_shutdown, as the > > > > > accepted children will inherit such field. > > > > > > > > > > Invoking shutdown() before entering the listener status allows > > > > > violating the above constraint. > > > > > > > > > > After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for > > > > > TCP_SYN_RECV sockets"), the above causes the child to reach the accept > > > > > syscall in FIN_WAIT1 status. > > > > > > > > > > Address the issue explicitly by clearing sk_shutdown at listen time. 
> > > > >
> > > > > Reported-by: Christoph Paasch <cpaasch@apple.com>
> > > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
> > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > > > ---
> > > > > Note: the issue above reports an MPTCP reproducer, but I can reproduce
> > > > > the issue even using plain TCP sockets only.
> > > > > ---
> > > > > net/ipv4/inet_connection_sock.c | 2 ++
> > > > > 1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > > > > index 3b38610958ee..dab723fea0cc 100644
> > > > > --- a/net/ipv4/inet_connection_sock.c
> > > > > +++ b/net/ipv4/inet_connection_sock.c
> > > > > @@ -1269,6 +1269,8 @@ int inet_csk_listen_start(struct sock *sk)
> > > > >
> > > > > 	reqsk_queue_alloc(&icsk->icsk_accept_queue);
> > > > >
> > > > > +	/* closed sockets can have non zero sk_shutdown */
> > > > > +	WRITE_ONCE(sk->sk_shutdown, 0);
> > > >
> > > > Hi Paolo.
> > > >
> > > > I am unsure about your patch, I had an internal syzbot report about
> > > > this before going OOO for a few days,
> > > > and my first reaction was to change the WARN in inet_accept().
> > > >
> > > > Perhaps some applications are relying on calling shutdown() before listen()...
> >
> > Uhmm, right, I did not consider that a non zero sk_shutdown would have
> > affected recvmsg() and sendmsg() even prior to 94062790aedb ("tcp:
> > defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets").
> >
> > > BTW the syzbot repro was
> > >
> > > r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
> > > sendto$inet6(0xffffffffffffffff, 0x0, 0x0, 0x20000004, 0x0, 0x0)
> > > shutdown(r0, 0x1)
> > > bind$inet6(r0, &(0x7f0000000040)={0xa, 0x4e22, 0x0, @empty}, 0x1c)
> > > listen(r0, 0x0)
> > > r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
> > > connect$inet(r1, &(0x7f0000000000)={0x2, 0x4e22, @local}, 0x10)
> > > accept(r0, 0x0, 0x0)
> >
> > The above is very similar to what Christoph reported. It should splat
> > even replacing 0x106 with 0 (mptcp -> tcp).
> >
> > I'm fine with relaxing the check in __inet_accept(). Do you prefer to
> > send the patch yourself, or me to send a v2? The condition should be
> >
> > WARN_ON(!((1 << newsk->sk_state) &
> > 	  (TCPF_ESTABLISHED | TCPF_SYN_RECV |
> > 	   TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 |
> > 	   TCPF_CLOSING | TCPF_CLOSE_WAIT |
> > 	   TCPF_CLOSE)));
> >
>
> Please send a v2.
>
> I am not sure why we need a WARN_ON() to begin with, the socket is
> still private.

Digging into the history, the warn was introduced in 2.3.15 - it was a
BUG_TRAP() back then.

The relevant chunk replaced explicit handling for each expected state
with more generic code handling all of them the same way. I guess the
assertion is a leftover safeguard.

I would not drop it on net, perhaps later on net-next?

> Even the lock_sock(sk2)/release_sock(sk2) pair in inet_accept() seems overkill.

Something for net-next, I guess?

Thanks!

Paolo
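
Outside syzkaller, the call sequence in the repro can be approximated with plain TCP sockets, in line with the remark that it should still splat with 0x106 replaced by 0. The following is only a minimal userspace sketch, not code from this thread: the helper name shutdown_listen_accept() is hypothetical, it uses IPv4 loopback and an ephemeral port instead of the fixed port in the repro, and it assumes that shutdown() on a never-connected socket may return an error (typically ENOTCONN) while still leaving the shutdown flag set on the socket, which is the precondition the patch description relies on.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): shutdown() a fresh TCP socket,
 * turn it into a listener, connect to it over loopback and accept the child.
 * Returns the accepted fd (>= 0) on success, -1 on failure.
 */
int shutdown_listen_accept(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int lfd, cfd, afd = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0) {
		close(lfd);
		return -1;
	}

	/*
	 * Assumption: this call is expected to fail on an unconnected
	 * socket, so the return value is deliberately ignored. What
	 * matters is the side effect on the socket's shutdown state.
	 */
	(void)shutdown(lfd, SHUT_WR);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (listen(lfd, 1) < 0)
		goto out;
	/* Read back the port the kernel chose so the client can use it. */
	if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
		goto out;
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	/* On an affected kernel the WARN would be emitted from here. */
	afd = accept(lfd, NULL, NULL);
out:
	close(cfd);
	close(lfd);
	return afd;
}
```

The helper only exercises the syscall sequence; whether a warning is logged depends on the running kernel, so the place to look is dmesg rather than the return value. The caller owns the returned fd and should close() it.
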
On Mon, May 20, 2024 at 5:13 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2024-05-20 at 16:53 +0200, Eric Dumazet wrote:
> > On Mon, May 20, 2024 at 4:46 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, 2024-05-20 at 16:07 +0200, Eric Dumazet wrote:
> > > > On Mon, May 20, 2024 at 3:46 PM Eric Dumazet <edumazet@google.com> wrote:
> > > > >
> > > > > On Mon, May 20, 2024 at 12:05 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > > >
> > > > > > Christoph reported the following splat:
> > > > > >
> > > > > > WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0
> > > > > > Modules linked in:
> > > > > > CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56
> > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> > > > > > RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759
> > > > > > Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80
> > > > > > RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293
> > > > > > RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000
> > > > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > > > > > RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64
> > > > > > R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000
> > > > > > R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800
> > > > > > FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
> > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0
> > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > > PKRU: 55555554
> > > > > > Call Trace:
> > > > > > <TASK>
> > > > > > inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786
> > > > > > do_accept+0x435/0x620 net/socket.c:1929
> > > > > > __sys_accept4_file net/socket.c:1969 [inline]
> > > > > > __sys_accept4+0x9b/0x110 net/socket.c:1999
> > > > > > __do_sys_accept net/socket.c:2016 [inline]
> > > > > > __se_sys_accept net/socket.c:2013 [inline]
> > > > > > __x64_sys_accept+0x7d/0x90 net/socket.c:2013
> > > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > > > > do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83
> > > > > > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > > > > RIP: 0033:0x4315f9
> > > > > > Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00
> > > > > > RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b
> > > > > > RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9
> > > > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
> > > > > > RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300
> > > > > > R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000
> > > > > > R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055
> > > > > > </TASK>
> > > > > >
> > > > > > Listener sockets are supposed to have a zero sk_shutdown, as the
> > > > > > accepted children will inherit such field.
> > > > > >
> > > > > > Invoking shutdown() before entering the listener status allows
> > > > > > violating the above constraint.
> > > > > >
> > > > > > After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for
> > > > > > TCP_SYN_RECV sockets"), the above causes the child to reach the accept
> > > > > > syscall in FIN_WAIT1 status.
> > > > > >
> > > > > > Address the issue explicitly by clearing sk_shutdown at listen time.
> > > > > >
> > > > > > Reported-by: Christoph Paasch <cpaasch@apple.com>
> > > > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
> > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > > > > ---
> > > > > > Note: the issue above reports an MPTCP reproducer, but I can reproduce
> > > > > > the issue even using plain TCP sockets only.
> > > > > > ---
> > > > > > net/ipv4/inet_connection_sock.c | 2 ++
> > > > > > 1 file changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > > > > > index 3b38610958ee..dab723fea0cc 100644
> > > > > > --- a/net/ipv4/inet_connection_sock.c
> > > > > > +++ b/net/ipv4/inet_connection_sock.c
> > > > > > @@ -1269,6 +1269,8 @@ int inet_csk_listen_start(struct sock *sk)
> > > > > >
> > > > > > 	reqsk_queue_alloc(&icsk->icsk_accept_queue);
> > > > > >
> > > > > > +	/* closed sockets can have non zero sk_shutdown */
> > > > > > +	WRITE_ONCE(sk->sk_shutdown, 0);
> > > > >
> > > > > Hi Paolo.
> > > > >
> > > > > I am unsure about your patch, I had an internal syzbot report about
> > > > > this before going OOO for a few days,
> > > > > and my first reaction was to change the WARN in inet_accept().
> > > > >
> > > > > Perhaps some applications are relying on calling shutdown() before listen()...
> > >
> > > Uhmm, right, I did not consider that a non zero sk_shutdown would have
> > > affected recvmsg() and sendmsg() even prior to 94062790aedb ("tcp:
> > > defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets").
> > >
> > > > BTW the syzbot repro was
> > > >
> > > > r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
> > > > sendto$inet6(0xffffffffffffffff, 0x0, 0x0, 0x20000004, 0x0, 0x0)
> > > > shutdown(r0, 0x1)
> > > > bind$inet6(r0, &(0x7f0000000040)={0xa, 0x4e22, 0x0, @empty}, 0x1c)
> > > > listen(r0, 0x0)
> > > > r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
> > > > connect$inet(r1, &(0x7f0000000000)={0x2, 0x4e22, @local}, 0x10)
> > > > accept(r0, 0x0, 0x0)
> > >
> > > The above is very similar to what Christoph reported. It should splat
> > > even replacing 0x106 with 0 (mptcp -> tcp).
> > >
> > > I'm fine with relaxing the check in __inet_accept(). Do you prefer to
> > > send the patch yourself, or me to send a v2? The condition should be
> > >
> > > WARN_ON(!((1 << newsk->sk_state) &
> > > 	  (TCPF_ESTABLISHED | TCPF_SYN_RECV |
> > > 	   TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 |
> > > 	   TCPF_CLOSING | TCPF_CLOSE_WAIT |
> > > 	   TCPF_CLOSE)));
> > >
> >
> > Please send a v2.
> >
> > I am not sure why we need a WARN_ON() to begin with, the socket is
> > still private.
>
> Digging into the history, the warn was introduced in 2.3.15 - it was a
> BUG_TRAP() back then.
>
> The relevant chunk replaced explicit handling for each expected state
> with more generic code handling all of them the same way. I guess the
> assertion is a leftover safeguard.
>
> I would not drop it on net, perhaps later on net-next?

Sure, let's wait for the next syzbot report if any.

>
> > Even the lock_sock(sk2)/release_sock(sk2) pair in inet_accept() seems overkill.
>
> Something for net-next, I guess?

Sure, this is orthogonal.

>
> Thanks!
>
> Paolo
>
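
The relaxed condition discussed above is a bitmask membership test: each TCP state number selects one bit, and the socket state is acceptable if its bit is present in the mask. The standalone sketch below illustrates that test in userspace. It is not kernel code: the state numbering is reproduced from memory of include/net/tcp_states.h and should be checked against the tree in use, and the TCPF() macro, accept_ok_mask and accept_state_ok() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * TCP state numbers, assumed to match the kernel's
 * include/net/tcp_states.h (verify against your tree).
 */
enum {
	TCP_ESTABLISHED = 1,
	TCP_SYN_SENT,
	TCP_SYN_RECV,
	TCP_FIN_WAIT1,
	TCP_FIN_WAIT2,
	TCP_TIME_WAIT,
	TCP_CLOSE,
	TCP_CLOSE_WAIT,
	TCP_LAST_ACK,
	TCP_LISTEN,
	TCP_CLOSING,
	TCP_NEW_SYN_RECV,
	TCP_MAX_STATES
};

/* Hypothetical shorthand for the kernel's TCPF_* flag constants. */
#define TCPF(state) (1U << (state))

/* The set of states listed in the condition proposed in the thread. */
static const unsigned int accept_ok_mask =
	TCPF(TCP_ESTABLISHED) | TCPF(TCP_SYN_RECV) |
	TCPF(TCP_FIN_WAIT1) | TCPF(TCP_FIN_WAIT2) |
	TCPF(TCP_CLOSING) | TCPF(TCP_CLOSE_WAIT) |
	TCPF(TCP_CLOSE);

/* Returns true if the proposed check would NOT warn for this state. */
bool accept_state_ok(int state)
{
	/* Keep the shift within the range of defined states. */
	assert(state > 0 && state < TCP_MAX_STATES);
	return (TCPF(state) & accept_ok_mask) != 0;
}
```

With this mask, accept_state_ok(TCP_FIN_WAIT1) is true, which is the case that triggered the splat, while a state such as TCP_LISTEN is still rejected.
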
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 3b38610958ee..dab723fea0cc 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1269,6 +1269,8 @@ int inet_csk_listen_start(struct sock *sk)
 
 	reqsk_queue_alloc(&icsk->icsk_accept_queue);
 
+	/* closed sockets can have non zero sk_shutdown */
+	WRITE_ONCE(sk->sk_shutdown, 0);
 	sk->sk_ack_backlog = 0;
 	inet_csk_delack_init(sk);
Christoph reported the following splat:

WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0
Modules linked in:
CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759
Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80
RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293
RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64
R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000
R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800
FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786
do_accept+0x435/0x620 net/socket.c:1929
__sys_accept4_file net/socket.c:1969 [inline]
__sys_accept4+0x9b/0x110 net/socket.c:1999
__do_sys_accept net/socket.c:2016 [inline]
__se_sys_accept net/socket.c:2013 [inline]
__x64_sys_accept+0x7d/0x90 net/socket.c:2013
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x4315f9
Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b
RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300
R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055
</TASK>

Listener sockets are supposed to have a zero sk_shutdown, as the
accepted children will inherit such field.

Invoking shutdown() before entering the listener status allows
violating the above constraint.

After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for
TCP_SYN_RECV sockets"), the above causes the child to reach the accept
syscall in FIN_WAIT1 status.

Address the issue explicitly by clearing sk_shutdown at listen time.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note: the issue above reports an MPTCP reproducer, but I can reproduce
the issue even using plain TCP sockets only.
---
 net/ipv4/inet_connection_sock.c | 2 ++
 1 file changed, 2 insertions(+)