Message ID | 20220201184640.756716-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | b67985be400969578d4d4b17299714c0e5d2c07b |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() | expand |
On Tue, Feb 1, 2022 at 1:46 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > From: Eric Dumazet <edumazet@google.com> > > tcp_shift_skb_data() might collapse three packets into a larger one. > > P_A, P_B, P_C -> P_ABC > > Historically, it used a single tcp_skb_can_collapse_to(P_A) call, > because it was enough. > > In commit 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions"), > this call was replaced by a call to tcp_skb_can_collapse(P_A, P_B) > > But the now needed test over P_C has been missed. > > This probably broke MPTCP. > > Then later, commit 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") > added an extra condition to tcp_skb_can_collapse(), but the missing call > from tcp_shift_skb_data() is also breaking TCP zerocopy, because P_A and P_C > might have different skb_zcopy_pure() status. > > Fixes: 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions") > Fixes: 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Paolo Abeni <pabeni@redhat.com> > Cc: Mat Martineau <mathew.j.martineau@linux.intel.com> > Cc: Talal Ahmad <talalahmad@google.com> > Cc: Arjun Roy <arjunroy@google.com> > Cc: Soheil Hassas Yeganeh <soheil@google.com> > Cc: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> I wish there were some packetdrill tests for MPTCP. Thank you for the fix! > --- > net/ipv4/tcp_input.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index dc49a3d551eb919baf5ad812ef21698c5c7b9679..bfe4112e000c09ba9d7d8b64392f52337b9053e9 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -1660,6 +1660,8 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb, > (mss != tcp_skb_seglen(skb))) > goto out; > > + if (!tcp_skb_can_collapse(prev, skb)) > + goto out; > len = skb->len; > pcount = tcp_skb_pcount(skb); > if (tcp_skb_shift(prev, skb, pcount, len)) > -- > 2.35.0.rc2.247.g8bbb082509-goog >
On Tue, 1 Feb 2022, Soheil Hassas Yeganeh wrote: > On Tue, Feb 1, 2022 at 1:46 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >> >> From: Eric Dumazet <edumazet@google.com> >> >> tcp_shift_skb_data() might collapse three packets into a larger one. >> >> P_A, P_B, P_C -> P_ABC >> >> Historically, it used a single tcp_skb_can_collapse_to(P_A) call, >> because it was enough. >> >> In commit 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions"), >> this call was replaced by a call to tcp_skb_can_collapse(P_A, P_B) >> >> But the now needed test over P_C has been missed. >> >> This probably broke MPTCP. >> >> Then later, commit 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") >> added an extra condition to tcp_skb_can_collapse(), but the missing call >> from tcp_shift_skb_data() is also breaking TCP zerocopy, because P_A and P_C >> might have different skb_zcopy_pure() status. >> >> Fixes: 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions") >> Fixes: 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") >> Signed-off-by: Eric Dumazet <edumazet@google.com> >> Cc: Paolo Abeni <pabeni@redhat.com> >> Cc: Mat Martineau <mathew.j.martineau@linux.intel.com> >> Cc: Talal Ahmad <talalahmad@google.com> >> Cc: Arjun Roy <arjunroy@google.com> >> Cc: Soheil Hassas Yeganeh <soheil@google.com> >> Cc: Willem de Bruijn <willemb@google.com> > > Acked-by: Soheil Hassas Yeganeh <soheil@google.com> > > I wish there were some packetdrill tests for MPTCP. Thank you for the fix! > Soheil - I have good news, there are packetdrill tests for MPTCP: https://github.com/multipath-tcp/packetdrill This is still in a fork. I think Davide has talked to Neal about upstreaming the MPTCP changes before but there may be some code that needs refactoring before that could happen. >> --- >> net/ipv4/tcp_input.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c >> index dc49a3d551eb919baf5ad812ef21698c5c7b9679..bfe4112e000c09ba9d7d8b64392f52337b9053e9 100644 >> --- a/net/ipv4/tcp_input.c >> +++ b/net/ipv4/tcp_input.c >> @@ -1660,6 +1660,8 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb, >> (mss != tcp_skb_seglen(skb))) >> goto out; >> >> + if (!tcp_skb_can_collapse(prev, skb)) >> + goto out; >> len = skb->len; >> pcount = tcp_skb_pcount(skb); >> if (tcp_skb_shift(prev, skb, pcount, len)) >> -- >> 2.35.0.rc2.247.g8bbb082509-goog >> > -- Mat Martineau Intel
On Tue, 2022-02-01 at 12:01 -0800, Mat Martineau wrote: > On Tue, 1 Feb 2022, Soheil Hassas Yeganeh wrote: > > > On Tue, Feb 1, 2022 at 1:46 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > > > From: Eric Dumazet <edumazet@google.com> > > > > > > tcp_shift_skb_data() might collapse three packets into a larger one. > > > > > > P_A, P_B, P_C -> P_ABC > > > > > > Historically, it used a single tcp_skb_can_collapse_to(P_A) call, > > > because it was enough. > > > > > > In commit 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions"), > > > this call was replaced by a call to tcp_skb_can_collapse(P_A, P_B) > > > > > > But the now needed test over P_C has been missed. > > > > > > This probably broke MPTCP. Indeed it looks like it could cause MPTCP data stream corruption, in case of multiple substreams, if we hit this code-path. Thanks for catching and fixing it! > > > Then later, commit 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") > > > added an extra condition to tcp_skb_can_collapse(), but the missing call > > > from tcp_shift_skb_data() is also breaking TCP zerocopy, because P_A and P_C > > > might have different skb_zcopy_pure() status. > > > > > > Fixes: 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions") > > > Fixes: 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > > Cc: Paolo Abeni <pabeni@redhat.com> > > > Cc: Mat Martineau <mathew.j.martineau@linux.intel.com> > > > Cc: Talal Ahmad <talalahmad@google.com> > > > Cc: Arjun Roy <arjunroy@google.com> > > > Cc: Soheil Hassas Yeganeh <soheil@google.com> > > > Cc: Willem de Bruijn <willemb@google.com> > > > > Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> > > > > I wish there were some packetdrill tests for MPTCP. Thank you for the fix! Do you have by chance a drill for the zero-copy case? it may help creating the MPTCP one, too. Thanks! Paolo
Hello: This patch was applied to netdev/net.git (master) by Jakub Kicinski <kuba@kernel.org>: On Tue, 1 Feb 2022 10:46:40 -0800 you wrote: > From: Eric Dumazet <edumazet@google.com> > > tcp_shift_skb_data() might collapse three packets into a larger one. > > P_A, P_B, P_C -> P_ABC > > Historically, it used a single tcp_skb_can_collapse_to(P_A) call, > because it was enough. > > [...] Here is the summary with links: - [net] tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() https://git.kernel.org/netdev/net/c/b67985be4009 You are awesome, thank you!
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index dc49a3d551eb919baf5ad812ef21698c5c7b9679..bfe4112e000c09ba9d7d8b64392f52337b9053e9 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -1660,6 +1660,8 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb, (mss != tcp_skb_seglen(skb))) goto out; + if (!tcp_skb_can_collapse(prev, skb)) + goto out; len = skb->len; pcount = tcp_skb_pcount(skb); if (tcp_skb_shift(prev, skb, pcount, len))