Message ID | 20211208051633.49122-1-kuniyu@amazon.co.jp (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: Remove sock_owned_by_user() test in tcp_child_process(). | expand |
On Tue, Dec 7, 2021 at 9:16 PM Kuniyuki Iwashima <kuniyu@amazon.co.jp> wrote: > > While creating a child socket, before v2.3.41, we used to call > bh_lock_sock() later than now; it was called just before > tcp_rcv_state_process(). The full socket was put into an accept queue > and exposed to other CPUs before bh_lock_sock() so that process context > might have acquired the lock by then. Thus, we had to check if any > process context was accessing the socket before tcp_rcv_state_process(). > > We can see this code in tcp_v4_do_rcv() of v2.3.14. [0] > > if (sk->state == TCP_LISTEN) { > struct sock *nsk; > > nsk = tcp_v4_hnd_req(sk, skb); > ... > if (nsk != sk) { > bh_lock_sock(nsk); > if (nsk->lock.users != 0) { > ... > sk_add_backlog(nsk, skb); > bh_unlock_sock(nsk); > return 0; > } > ... > } > } > > if (tcp_rcv_state_process(sk, skb, skb->h.th, skb->len)) > goto reset; > > However, in 2.3.15, this lock.users test was replaced with BUG_TRAP() by > mistake. [1] > > if (nsk != sk) { > ... > BUG_TRAP(nsk->lock.users == 0); > ... > ret = tcp_rcv_state_process(nsk, skb, skb->h.th, skb->len); > ... > bh_unlock_sock(nsk); > ... > return 0; > } > > Fortunately, the test was back in 2.3.41. [2] Then, related code was > packed into tcp_child_process() with comments, which remains until now. > > What is interesting in v2.3.41 is that the bh_lock_sock() was moved to > tcp_create_openreq_child() and placed just after sock_lock_init(). > Thus, the lock is never acquired until tcp_rcv_state_process() by process > contexts. The bh_lock_sock() is now in sk_clone_lock() and the rule does > not change. > > As of now, alas, it is not possible to reach the commented path by the > change. Let's remove the remnant of the old days. > > [0]: https://cdn.kernel.org/pub/linux/kernel/v2.3/linux-2.3.14.tar.gz > [1]: https://cdn.kernel.org/pub/linux/kernel/v2.3/patch-2.3.15.gz > [2]: https://cdn.kernel.org/pub/linux/kernel/v2.3/patch-2.3.41.gz > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") I do not think this patch qualifies as a stable candidate. At best this is a cleanup. At worst this could add a bug. I would advise adding a WARN_ON_ONCE() there for at least one release so that syzbot can validate for you if this is dead code or not. TCP_SYN_RECV is not TCP_NEW_SYN_RECV Thanks. > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> > --- > net/ipv4/tcp_minisocks.c | 18 ++++++------------ > 1 file changed, 6 insertions(+), 12 deletions(-) > > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > index 7c2d3ac2363a..b4a1f8728093 100644 > --- a/net/ipv4/tcp_minisocks.c > +++ b/net/ipv4/tcp_minisocks.c > @@ -833,18 +833,12 @@ int tcp_child_process(struct sock *parent, struct sock *child, > sk_mark_napi_id_set(child, skb); > > tcp_segs_in(tcp_sk(child), skb); > - if (!sock_owned_by_user(child)) { > - ret = tcp_rcv_state_process(child, skb); > - /* Wakeup parent, send SIGIO */ > - if (state == TCP_SYN_RECV && child->sk_state != state) > - parent->sk_data_ready(parent); > - } else { > - /* Alas, it is possible again, because we do lookup > - * in main socket hash table and lock on listening > - * socket does not protect us more. > - */ > - __sk_add_backlog(child, skb); > - } > + > + ret = tcp_rcv_state_process(child, skb); > + > + /* Wakeup parent, send SIGIO */ > + if (state == TCP_SYN_RECV && child->sk_state != state) > + parent->sk_data_ready(parent); > > bh_unlock_sock(child); > sock_put(child); > -- > 2.30.2 >
From: Eric Dumazet <edumazet@google.com> Date: Wed, 8 Dec 2021 10:30:49 -0800 > On Tue, Dec 7, 2021 at 9:16 PM Kuniyuki Iwashima <kuniyu@amazon.co.jp> wrote: >> >> While creating a child socket, before v2.3.41, we used to call >> bh_lock_sock() later than now; it was called just before >> tcp_rcv_state_process(). The full socket was put into an accept queue >> and exposed to other CPUs before bh_lock_sock() so that process context >> might have acquired the lock by then. Thus, we had to check if any >> process context was accessing the socket before tcp_rcv_state_process(). >> >> We can see this code in tcp_v4_do_rcv() of v2.3.14. [0] >> >> if (sk->state == TCP_LISTEN) { >> struct sock *nsk; >> >> nsk = tcp_v4_hnd_req(sk, skb); >> ... >> if (nsk != sk) { >> bh_lock_sock(nsk); >> if (nsk->lock.users != 0) { >> ... >> sk_add_backlog(nsk, skb); >> bh_unlock_sock(nsk); >> return 0; >> } >> ... >> } >> } >> >> if (tcp_rcv_state_process(sk, skb, skb->h.th, skb->len)) >> goto reset; >> >> However, in 2.3.15, this lock.users test was replaced with BUG_TRAP() by >> mistake. [1] >> >> if (nsk != sk) { >> ... >> BUG_TRAP(nsk->lock.users == 0); >> ... >> ret = tcp_rcv_state_process(nsk, skb, skb->h.th, skb->len); >> ... >> bh_unlock_sock(nsk); >> ... >> return 0; >> } >> >> Fortunately, the test was back in 2.3.41. [2] Then, related code was >> packed into tcp_child_process() with comments, which remains until now. >> >> What is interesting in v2.3.41 is that the bh_lock_sock() was moved to >> tcp_create_openreq_child() and placed just after sock_lock_init(). >> Thus, the lock is never acquired until tcp_rcv_state_process() by process >> contexts. The bh_lock_sock() is now in sk_clone_lock() and the rule does >> not change. >> >> As of now, alas, it is not possible to reach the commented path by the >> change. Let's remove the remnant of the old days. >> >> [0]: https://cdn.kernel.org/pub/linux/kernel/v2.3/linux-2.3.14.tar.gz >> [1]: https://cdn.kernel.org/pub/linux/kernel/v2.3/patch-2.3.15.gz >> [2]: https://cdn.kernel.org/pub/linux/kernel/v2.3/patch-2.3.41.gz >> >> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > I do not think this patch qualifies as a stable candidate. > > At best this is a cleanup. > > At worst this could add a bug. > > I would advise adding a WARN_ON_ONCE() there for at least one release > so that syzbot can validate for you if this is dead code or not. Thanks for review. I will add a WARN_ON_ONCE() and respin for net-next. > > TCP_SYN_RECV is not TCP_NEW_SYN_RECV Right, TCP_SYN_RECV is not the case. "While creating a child socket," was a bit misleading. I will clarify that is for TCP_NEW_SYN_RECV case and SYN cookie case. > > Thanks. > >> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> >> --- >> net/ipv4/tcp_minisocks.c | 18 ++++++------------ >> 1 file changed, 6 insertions(+), 12 deletions(-) >> >> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c >> index 7c2d3ac2363a..b4a1f8728093 100644 >> --- a/net/ipv4/tcp_minisocks.c >> +++ b/net/ipv4/tcp_minisocks.c >> @@ -833,18 +833,12 @@ int tcp_child_process(struct sock *parent, struct sock *child, >> sk_mark_napi_id_set(child, skb); >> >> tcp_segs_in(tcp_sk(child), skb); >> - if (!sock_owned_by_user(child)) { >> - ret = tcp_rcv_state_process(child, skb); >> - /* Wakeup parent, send SIGIO */ >> - if (state == TCP_SYN_RECV && child->sk_state != state) >> - parent->sk_data_ready(parent); >> - } else { >> - /* Alas, it is possible again, because we do lookup >> - * in main socket hash table and lock on listening >> - * socket does not protect us more. >> - */ >> - __sk_add_backlog(child, skb); >> - } >> + >> + ret = tcp_rcv_state_process(child, skb); >> + >> + /* Wakeup parent, send SIGIO */ >> + if (state == TCP_SYN_RECV && child->sk_state != state) >> + parent->sk_data_ready(parent); >> >> bh_unlock_sock(child); >> sock_put(child); >> -- >> 2.30.2 >>
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 7c2d3ac2363a..b4a1f8728093 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -833,18 +833,12 @@ int tcp_child_process(struct sock *parent, struct sock *child, sk_mark_napi_id_set(child, skb); tcp_segs_in(tcp_sk(child), skb); - if (!sock_owned_by_user(child)) { - ret = tcp_rcv_state_process(child, skb); - /* Wakeup parent, send SIGIO */ - if (state == TCP_SYN_RECV && child->sk_state != state) - parent->sk_data_ready(parent); - } else { - /* Alas, it is possible again, because we do lookup - * in main socket hash table and lock on listening - * socket does not protect us more. - */ - __sk_add_backlog(child, skb); - } + + ret = tcp_rcv_state_process(child, skb); + + /* Wakeup parent, send SIGIO */ + if (state == TCP_SYN_RECV && child->sk_state != state) + parent->sk_data_ready(parent); bh_unlock_sock(child); sock_put(child);
While creating a child socket, before v2.3.41, we used to call bh_lock_sock() later than now; it was called just before tcp_rcv_state_process(). The full socket was put into an accept queue and exposed to other CPUs before bh_lock_sock() so that process context might have acquired the lock by then. Thus, we had to check if any process context was accessing the socket before tcp_rcv_state_process(). We can see this code in tcp_v4_do_rcv() of v2.3.14. [0] if (sk->state == TCP_LISTEN) { struct sock *nsk; nsk = tcp_v4_hnd_req(sk, skb); ... if (nsk != sk) { bh_lock_sock(nsk); if (nsk->lock.users != 0) { ... sk_add_backlog(nsk, skb); bh_unlock_sock(nsk); return 0; } ... } } if (tcp_rcv_state_process(sk, skb, skb->h.th, skb->len)) goto reset; However, in 2.3.15, this lock.users test was replaced with BUG_TRAP() by mistake. [1] if (nsk != sk) { ... BUG_TRAP(nsk->lock.users == 0); ... ret = tcp_rcv_state_process(nsk, skb, skb->h.th, skb->len); ... bh_unlock_sock(nsk); ... return 0; } Fortunately, the test was back in 2.3.41. [2] Then, related code was packed into tcp_child_process() with comments, which remains until now. What is interesting in v2.3.41 is that the bh_lock_sock() was moved to tcp_create_openreq_child() and placed just after sock_lock_init(). Thus, the lock is never acquired until tcp_rcv_state_process() by process contexts. The bh_lock_sock() is now in sk_clone_lock() and the rule does not change. As of now, alas, it is not possible to reach the commented path by the change. Let's remove the remnant of the old days. [0]: https://cdn.kernel.org/pub/linux/kernel/v2.3/linux-2.3.14.tar.gz [1]: https://cdn.kernel.org/pub/linux/kernel/v2.3/patch-2.3.15.gz [2]: https://cdn.kernel.org/pub/linux/kernel/v2.3/patch-2.3.41.gz Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> --- net/ipv4/tcp_minisocks.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-)