Message ID | 20211111235215.2605384-1-arjunroy.kdev@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 70701b83e208767f2720d8cd3e6a62cddafb3a30 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] tcp: Fix uninitialized access in skb frags array for Rx 0cp. | expand |
On Thu, Nov 11, 2021 at 3:52 PM Arjun Roy <arjunroy.kdev@gmail.com> wrote: > > From: Arjun Roy <arjunroy@google.com> > > TCP Receive zerocopy iterates through the SKB queue via > tcp_recv_skb(), acquiring a pointer to an SKB and an offset within > that SKB to read from. From there, it iterates the SKB frags array to > determine which offset to start remapping pages from. > > However, this is built on the assumption that the offset read so far > within the SKB is smaller than the SKB length. If this assumption is > violated, we can attempt to read an invalid frags array element, which > would cause a fault. > > tcp_recv_skb() can cause such an SKB to be returned when the TCP FIN > flag is set. Therefore, we must guard against this occurrence inside > skb_advance_frag(). > > One way that we can reproduce this error follows: > 1) In a receiver program, call getsockopt(TCP_ZEROCOPY_RECEIVE) with: > char some_array[32 * 1024]; > struct tcp_zerocopy_receive zc = { > .copybuf_address = (__u64) &some_array[0], > .copybuf_len = 32 * 1024, > }; > > 2) In a sender program, after a TCP handshake, send the following > sequence of packets: > i) Seq = [X, X+4000] > ii) Seq = [X+4000, X+5000] > iii) Seq = [X+4000, X+5000], Flags = FIN | URG, urgptr=1000 > > (This can happen without URG, if we have a signal pending, but URG is > a convenient way to reproduce the behaviour). > > In this case, the following event sequence will occur on the receiver: > > tcp_zerocopy_receive(): > -> receive_fallback_to_copy() // copybuf_len >= inq > -> tcp_recvmsg_locked() // reads 5000 bytes, then breaks due to URG > -> tcp_recv_skb() // yields skb with skb->len == offset > -> tcp_zerocopy_set_hint_for_skb() > -> skb_advance_to_frag() // will returns a frags ptr. >= nr_frags > -> find_next_mappable_frag() // will dereference this bad frags ptr. > > With this patch, skb_advance_to_frag() will no longer return an > invalid frags pointer, and will return NULL instead, fixing the issue. > > Signed-off-by: Arjun Roy <arjunroy@google.com> > Signed-off-by: Eric Dumazet <edumazet@google.com> > Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") > > --- > net/ipv4/tcp.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index bc7f419184aa..ef896847f190 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -1741,6 +1741,9 @@ static skb_frag_t *skb_advance_to_frag(struct sk_buff *skb, u32 offset_skb, > { > skb_frag_t *frag; > > + if (unlikely(offset_skb >= skb->len)) > + return NULL; > + > offset_skb -= skb_headlen(skb); > if ((int)offset_skb < 0 || skb_has_frag_list(skb)) > return NULL; > -- > 2.34.0.rc1.387.gb447b232ab-goog > Interestingly, netdevbpf list claims a netdev/build_32bit failure here: https://patchwork.kernel.org/project/netdevbpf/patch/20211111235215.2605384-1-arjunroy.kdev@gmail.com/ But the v1 patch seemed to be fine (that one had a wrong "Fixes" tag, it's the only thing that changed in v2). Also, "make ARCH=i386" is working fine for me, and the significant amount of error output (https://patchwork.hopto.org/static/nipa/578999/12615889/build_32bit/) does not actually have any errors inside net/ipv4/tcp.c . I assume, then, this must be a tooling false positive, and I do not have to send a v3 (which would have no changes)? Thanks, -Arjun
On Thu, 11 Nov 2021 18:32:23 -0800 Arjun Roy wrote: > On Thu, Nov 11, 2021 at 3:52 PM Arjun Roy <arjunroy.kdev@gmail.com> wrote: > > > > From: Arjun Roy <arjunroy@google.com> > > > > TCP Receive zerocopy iterates through the SKB queue via > > tcp_recv_skb(), acquiring a pointer to an SKB and an offset within > > that SKB to read from. From there, it iterates the SKB frags array to > > determine which offset to start remapping pages from. > > > > However, this is built on the assumption that the offset read so far > > within the SKB is smaller than the SKB length. If this assumption is > > violated, we can attempt to read an invalid frags array element, which > > would cause a fault. > > > > tcp_recv_skb() can cause such an SKB to be returned when the TCP FIN > > flag is set. Therefore, we must guard against this occurrence inside > > skb_advance_frag(). > > > > One way that we can reproduce this error follows: > > 1) In a receiver program, call getsockopt(TCP_ZEROCOPY_RECEIVE) with: > > char some_array[32 * 1024]; > > struct tcp_zerocopy_receive zc = { > > .copybuf_address = (__u64) &some_array[0], > > .copybuf_len = 32 * 1024, > > }; > > > > 2) In a sender program, after a TCP handshake, send the following > > sequence of packets: > > i) Seq = [X, X+4000] > > ii) Seq = [X+4000, X+5000] > > iii) Seq = [X+4000, X+5000], Flags = FIN | URG, urgptr=1000 > > > > (This can happen without URG, if we have a signal pending, but URG is > > a convenient way to reproduce the behaviour). > > > > In this case, the following event sequence will occur on the receiver: > > > > tcp_zerocopy_receive(): > > -> receive_fallback_to_copy() // copybuf_len >= inq > > -> tcp_recvmsg_locked() // reads 5000 bytes, then breaks due to URG > > -> tcp_recv_skb() // yields skb with skb->len == offset > > -> tcp_zerocopy_set_hint_for_skb() > > -> skb_advance_to_frag() // will returns a frags ptr. >= nr_frags > > -> find_next_mappable_frag() // will dereference this bad frags ptr. > > > > With this patch, skb_advance_to_frag() will no longer return an > > invalid frags pointer, and will return NULL instead, fixing the issue. > > > > Signed-off-by: Arjun Roy <arjunroy@google.com> > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") > > > > --- > > net/ipv4/tcp.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > > index bc7f419184aa..ef896847f190 100644 > > --- a/net/ipv4/tcp.c > > +++ b/net/ipv4/tcp.c > > @@ -1741,6 +1741,9 @@ static skb_frag_t *skb_advance_to_frag(struct sk_buff *skb, u32 offset_skb, > > { > > skb_frag_t *frag; > > > > + if (unlikely(offset_skb >= skb->len)) > > + return NULL; > > + > > offset_skb -= skb_headlen(skb); > > if ((int)offset_skb < 0 || skb_has_frag_list(skb)) > > return NULL; > > -- > > 2.34.0.rc1.387.gb447b232ab-goog > > > > Interestingly, netdevbpf list claims a netdev/build_32bit failure here: > https://patchwork.kernel.org/project/netdevbpf/patch/20211111235215.2605384-1-arjunroy.kdev@gmail.com/ > > But the v1 patch seemed to be fine (that one had a wrong "Fixes" tag, > it's the only thing that changed in v2). Also, "make ARCH=i386" is > working fine for me, and the significant amount of error output > (https://patchwork.hopto.org/static/nipa/578999/12615889/build_32bit/) > does not actually have any errors inside net/ipv4/tcp.c . I assume, > then, this must be a tooling false positive, and I do not have to send > a v3 (which would have no changes)? Yes, see: https://lore.kernel.org/all/20211111174654.3d1f83e3@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/
Hello: This patch was applied to netdev/net.git (master) by Jakub Kicinski <kuba@kernel.org>: On Thu, 11 Nov 2021 15:52:15 -0800 you wrote: > From: Arjun Roy <arjunroy@google.com> > > TCP Receive zerocopy iterates through the SKB queue via > tcp_recv_skb(), acquiring a pointer to an SKB and an offset within > that SKB to read from. From there, it iterates the SKB frags array to > determine which offset to start remapping pages from. > > [...] Here is the summary with links: - [net,v2] tcp: Fix uninitialized access in skb frags array for Rx 0cp. https://git.kernel.org/netdev/net/c/70701b83e208 You are awesome, thank you!
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index bc7f419184aa..ef896847f190 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1741,6 +1741,9 @@ static skb_frag_t *skb_advance_to_frag(struct sk_buff *skb, u32 offset_skb, { skb_frag_t *frag; + if (unlikely(offset_skb >= skb->len)) + return NULL; + offset_skb -= skb_headlen(skb); if ((int)offset_skb < 0 || skb_has_frag_list(skb)) return NULL;