Message ID | 20230925202448.100920-2-john.fastabend@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf, sockmap complete fixes for avail bytes | expand |
Hi John,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf/master]
url: https://github.com/intel-lab-lkp/linux/commits/John-Fastabend/bpf-tcp_read_skb-needs-to-pop-skb-regardless-of-seq/20230926-042625
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
patch link: https://lore.kernel.org/r/20230925202448.100920-2-john.fastabend%40gmail.com
patch subject: [PATCH bpf v2 1/3] bpf: tcp_read_skb needs to pop skb regardless of seq
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20230926/202309260854.w4YOXCoL-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230926/202309260854.w4YOXCoL-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309260854.w4YOXCoL-lkp@intel.com/
All warnings (new ones prefixed by >>):
net/ipv4/tcp.c: In function 'tcp_read_skb':
>> net/ipv4/tcp.c:1624:26: warning: unused variable 'tp' [-Wunused-variable]
1624 | struct tcp_sock *tp = tcp_sk(sk);
| ^~
vim +/tp +1624 net/ipv4/tcp.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 1621
965b57b469a589 Cong Wang 2022-06-15 1622 int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
04919bed948dc2 Cong Wang 2022-06-15 1623 {
04919bed948dc2 Cong Wang 2022-06-15 @1624 struct tcp_sock *tp = tcp_sk(sk);
04919bed948dc2 Cong Wang 2022-06-15 1625 struct sk_buff *skb;
04919bed948dc2 Cong Wang 2022-06-15 1626 int copied = 0;
04919bed948dc2 Cong Wang 2022-06-15 1627
04919bed948dc2 Cong Wang 2022-06-15 1628 if (sk->sk_state == TCP_LISTEN)
04919bed948dc2 Cong Wang 2022-06-15 1629 return -ENOTCONN;
04919bed948dc2 Cong Wang 2022-06-15 1630
44bb37a8112f62 John Fastabend 2023-09-25 1631 while ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) {
db4192a754ebd5 Cong Wang 2022-09-12 1632 u8 tcp_flags;
db4192a754ebd5 Cong Wang 2022-09-12 1633 int used;
04919bed948dc2 Cong Wang 2022-06-15 1634
04919bed948dc2 Cong Wang 2022-06-15 1635 __skb_unlink(skb, &sk->sk_receive_queue);
96628951869c0d Peilin Ye 2022-09-08 1636 WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk));
db4192a754ebd5 Cong Wang 2022-09-12 1637 tcp_flags = TCP_SKB_CB(skb)->tcp_flags;
db4192a754ebd5 Cong Wang 2022-09-12 1638 used = recv_actor(sk, skb);
db4192a754ebd5 Cong Wang 2022-09-12 1639 if (used < 0) {
db4192a754ebd5 Cong Wang 2022-09-12 1640 if (!copied)
db4192a754ebd5 Cong Wang 2022-09-12 1641 copied = used;
db4192a754ebd5 Cong Wang 2022-09-12 1642 break;
db4192a754ebd5 Cong Wang 2022-09-12 1643 }
db4192a754ebd5 Cong Wang 2022-09-12 1644 copied += used;
db4192a754ebd5 Cong Wang 2022-09-12 1645
44bb37a8112f62 John Fastabend 2023-09-25 1646 if (tcp_flags & TCPHDR_FIN)
db4192a754ebd5 Cong Wang 2022-09-12 1647 break;
db4192a754ebd5 Cong Wang 2022-09-12 1648 }
04919bed948dc2 Cong Wang 2022-06-15 1649 return copied;
04919bed948dc2 Cong Wang 2022-06-15 1650 }
04919bed948dc2 Cong Wang 2022-06-15 1651 EXPORT_SYMBOL(tcp_read_skb);
04919bed948dc2 Cong Wang 2022-06-15 1652
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0c3040a63ebd..235d77af3177 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1622,15 +1622,13 @@ EXPORT_SYMBOL(tcp_read_sock); int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) { struct tcp_sock *tp = tcp_sk(sk); - u32 seq = tp->copied_seq; struct sk_buff *skb; int copied = 0; - u32 offset; if (sk->sk_state == TCP_LISTEN) return -ENOTCONN; - while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) { + while ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) { u8 tcp_flags; int used; @@ -1643,13 +1641,10 @@ int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) copied = used; break; } - seq += used; copied += used; - if (tcp_flags & TCPHDR_FIN) { - ++seq; + if (tcp_flags & TCPHDR_FIN) break; - } } return copied; }
Before fix e5c6de5fa0258 tcp_read_skb() would increment the tp->copied-seq value. This (as described in the commit) would cause an error for apps because once that is incremented the application might believe there is no data to be read. Then some apps would stall or abort believing no data is available. However, the fix is incomplete because it introduces another issue in the skb dequeue. The loop does tcp_recv_skb() in a while loop to consume as many skbs as possible. The problem is the call is, tcp_recv_skb(sk, seq, &offset) Where 'seq' is u32 seq = tp->copied_seq; Now we can hit a case where we've yet incremented copied_seq from BPF side, but then tcp_recv_skb() fails this test, if (offset < skb->len || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)) so that instead of returning the skb we call tcp_eat_recv_skb() which frees the skb. This is because the routine believes the SKB has been collapsed per comment, /* This looks weird, but this can happen if TCP collapsing * splitted a fat GRO packet, while we released socket lock * in skb_splice_bits() */ This can't happen here we've unlinked the full SKB and orphaned it. Anyways it would confuse any BPF programs if the data were suddenly moved underneath it. To fix this situation do simpler operation and just skb_peek() the data of the queue followed by the unlink. It shouldn't need to check this condition and tcp_read_skb() reads entire skbs so there is no need to handle the 'offset!=0' case as we would see in tcp_read_sock(). Fixes: e5c6de5fa0258 ("bpf, sockmap: Incorrectly handling copied_seq") Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") Signed-off-by: John Fastabend <john.fastabend@gmail.com> --- net/ipv4/tcp.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)