Message ID | 20220430011523.3004693-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 783d108dd71d97e4cac5fe8ce70ca43ed7dc7bb7 |
Delegated to: | Netdev Maintainers |
Series | [net-next] tcp: drop skb dst in tcp_rcv_established() |

On Fri, Apr 29, 2022 at 9:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> In commit f84af32cbca7 ("net: ip_queue_rcv_skb() helper")
> I dropped the skb dst in tcp_data_queue().
>
> This only dealt with so-called TCP input slow path.
>
> When fast path is taken, tcp_rcv_established() calls
> tcp_queue_rcv() while skb still has a dst.
>
> This was mostly fine, because most dsts at this point
> are not refcounted (thanks to early demux)
>
> However, TCP packets sent over loopback have refcounted dst.
>
> Then commit 68822bdf76f1 ("net: generalize skb freeing
> deferral to per-cpu lists") came and had the effect
> of delaying skb freeing for an arbitrary time.
>
> If during this time the involved netns is dismantled, cleanup_net()
> frees the struct net with embedded net->ipv6.ip6_dst_ops.
>
> Then when eventually dst_destroy_rcu() is called,
> if (dst->ops->destroy) ... triggers an use-after-free.
>
> It is not clear if ip6_route_net_exit() lacks a rcu_barrier()
> as syzbot reported similar issues before the blamed commit.
>
> ( https://groups.google.com/g/syzkaller-bugs/c/CofzW4eeA9A/m/009WjumTAAAJ )
>
> Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/ipv4/tcp_input.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index cc3de8dc57970c97316ad1591cac0ca5f1a24c47..97cfcd85f84e6f873c3e60c388e6c27628451a7d 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5928,6 +5928,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
>  			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
>
>  			/* Bulk data transfer: receiver */
> +			skb_dst_drop(skb);
>  			__skb_pull(skb, tcp_header_len);
>  			eaten = tcp_queue_rcv(sk, skb, &fragstolen);
>
> --

Nice catch. Thanks, Eric!

Acked-by: Neal Cardwell <ncardwell@google.com>

neal

On Fri, Apr 29, 2022 at 10:20 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Fri, Apr 29, 2022 at 9:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > [...]
>
> Nice catch. Thanks, Eric!
>
> Acked-by: Neal Cardwell <ncardwell@google.com>
>
> neal

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Thank you for the fix!

Hello:

This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Fri, 29 Apr 2022 18:15:23 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> In commit f84af32cbca7 ("net: ip_queue_rcv_skb() helper")
> I dropped the skb dst in tcp_data_queue().
>
> This only dealt with so-called TCP input slow path.
>
> [...]

Here is the summary with links:
  - [net-next] tcp: drop skb dst in tcp_rcv_established()
    https://git.kernel.org/netdev/net-next/c/783d108dd71d

You are awesome, thank you!

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index cc3de8dc57970c97316ad1591cac0ca5f1a24c47..97cfcd85f84e6f873c3e60c388e6c27628451a7d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5928,6 +5928,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
 
 			/* Bulk data transfer: receiver */
+			skb_dst_drop(skb);
 			__skb_pull(skb, tcp_header_len);
 			eaten = tcp_queue_rcv(sk, skb, &fragstolen);
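
The lifetime problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the dst ops are modelled as a counter embedded in the namespace object, RCU and per-cpu deferral are reduced to plain ordering of calls, and the namespace is marked dead rather than actually freed so that the program stays well defined. It also assumes the routing side holds one reference of its own, which it releases during namespace teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real definitions. */
struct model_dst_ops {
	int destroy_calls;		/* stands in for calling ops->destroy() */
};

struct model_netns {
	bool alive;			/* false once the netns has been torn down */
	struct model_dst_ops dst_ops;	/* embedded, like net->ipv6.ip6_dst_ops */
};

struct model_dst {
	int refcnt;
	struct model_netns *net;	/* namespace that owns the embedded ops */
};

struct model_skb {
	struct model_dst *dst;		/* refcounted reference, or NULL */
};

/*
 * Drop one reference. Returns true if the final release would have had to
 * read dst->ops after the owning netns was gone (the use-after-free case).
 */
static bool model_dst_release(struct model_dst *dst)
{
	assert(dst->refcnt > 0);
	if (--dst->refcnt > 0)
		return false;
	if (!dst->net->alive)
		return true;		/* ops memory no longer valid */
	dst->net->dst_ops.destroy_calls++;
	return false;
}

/* Model of dropping the skb's dst reference, if it still holds one. */
static bool model_skb_dst_drop(struct model_skb *skb)
{
	bool uaf = false;

	if (skb->dst) {
		uaf = model_dst_release(skb->dst);
		skb->dst = NULL;
	}
	return uaf;
}

/*
 * Replays the sequence from the commit message for a loopback packet:
 * fast-path queueing, netns teardown, then the delayed skb free.
 * Returns true if the delayed free would touch the dead netns.
 */
bool model_fast_path_hits_uaf(bool drop_dst_before_queue)
{
	struct model_netns net = { .alive = true };
	struct model_dst dst = { .refcnt = 1, .net = &net }; /* routing ref (assumed) */
	struct model_skb skb = { .dst = &dst };
	bool uaf = false;

	dst.refcnt++;			/* the skb's own refcounted dst */

	if (drop_dst_before_queue)
		uaf |= model_skb_dst_drop(&skb);	/* the one-line fix */

	/* The skb is now queued; its free is deferred for an arbitrary time. */

	uaf |= model_dst_release(&dst);	/* teardown releases the routing ref */
	net.alive = false;		/* struct net, and its ops, are gone */

	uaf |= model_skb_dst_drop(&skb);	/* deferred skb free finally runs */
	return uaf;
}
```

In this model, `model_fast_path_hits_uaf(false)` reports the problem because the queued skb holds the last reference until after the namespace is gone, while `model_fast_path_hits_uaf(true)` does not, because the reference is released before the skb is queued.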