Message ID | 20240614130615.396837-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 9e046bb111f13461d3f9331e24e974324245140e |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack() | expand |
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Fri, 14 Jun 2024 13:06:15 +0000 you wrote: > Some applications were reporting ETIMEDOUT errors on apparently > good looking flows, according to packet dumps. > > We were able to root cause the issue to an accidental setting > of tp->retrans_stamp in the following scenario: > > - client sends TFO SYN with data. > - server has TFO disabled, ACKs only SYN but not payload. > - client receives SYNACK covering only SYN. > - tcp_ack() eats SYN and sets tp->retrans_stamp to 0. > - tcp_rcv_fastopen_synack() calls tcp_xmit_retransmit_queue() > to retransmit TFO payload w/o SYN, sets tp->retrans_stamp to "now", > but we are not in any loss recovery state. > - TFO payload is ACKed. > - we are not in any loss recovery state, and don't see any dupacks, > so we don't get to any code path that clears tp->retrans_stamp. > - tp->retrans_stamp stays non-zero for the lifetime of the connection. > - after first RTO, tcp_clamp_rto_to_user_timeout() clamps second RTO > to 1 jiffy due to bogus tp->retrans_stamp. > - on clamped RTO with non-zero icsk_retransmits, retransmits_timed_out() > sets start_ts from tp->retrans_stamp from TFO payload retransmit > hours/days ago, and computes bogus long elapsed time for loss recovery, > and suffers ETIMEDOUT early. > > [...] Here is the summary with links: - [net] tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack() https://git.kernel.org/netdev/net/c/9e046bb111f1 You are awesome, thank you!
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 9c04a9c8be9dfaa0ec2437b3748284e57588b216..01d208e0eef31fd87c7faaf5a3d10b8f52e99ee0 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6296,6 +6296,7 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack, skb_rbtree_walk_from(data) tcp_mark_skb_lost(sk, data); tcp_xmit_retransmit_queue(sk); + tp->retrans_stamp = 0; NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENACTIVEFAIL); return true;