diff mbox series

[net] tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack()

Message ID 20240614130615.396837-1-edumazet@google.com (mailing list archive)
State Accepted
Commit 9e046bb111f13461d3f9331e24e974324245140e
Delegated to: Netdev Maintainers
Headers show
Series [net] tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 862 this patch: 862
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 863 this patch: 863
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 866 this patch: 866
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-06-16--18-00 (tests: 659)

Commit Message

Eric Dumazet June 14, 2024, 1:06 p.m. UTC
Some applications were reporting ETIMEDOUT errors on apparently
good looking flows, according to packet dumps.

We were able to root cause the issue to an accidental setting
of tp->retrans_stamp in the following scenario:

- client sends TFO SYN with data.
- server has TFO disabled, ACKs only SYN but not payload.
- client receives SYNACK covering only SYN.
- tcp_ack() eats SYN and sets tp->retrans_stamp to 0.
- tcp_rcv_fastopen_synack() calls tcp_xmit_retransmit_queue()
  to retransmit TFO payload w/o SYN, sets tp->retrans_stamp to "now",
  but we are not in any loss recovery state.
- TFO payload is ACKed.
- we are not in any loss recovery state, and don't see any dupacks,
  so we don't get to any code path that clears tp->retrans_stamp.
- tp->retrans_stamp stays non-zero for the lifetime of the connection.
- after first RTO, tcp_clamp_rto_to_user_timeout() clamps second RTO
  to 1 jiffy due to bogus tp->retrans_stamp.
- on clamped RTO with non-zero icsk_retransmits, retransmits_timed_out()
  sets start_ts from tp->retrans_stamp from TFO payload retransmit
  hours/days ago, and computes bogus long elapsed time for loss recovery,
  and suffers ETIMEDOUT early.

Fixes: a7abf3cd76e1 ("tcp: consider using standard rtx logic in tcp_rcv_fastopen_synack()")
CC: stable@vger.kernel.org
Co-developed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Co-developed-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 1 +
 1 file changed, 1 insertion(+)

Comments

patchwork-bot+netdevbpf@kernel.org June 18, 2024, 1:10 a.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 14 Jun 2024 13:06:15 +0000 you wrote:
> Some applications were reporting ETIMEDOUT errors on apparently
> good looking flows, according to packet dumps.
> 
> We were able to root cause the issue to an accidental setting
> of tp->retrans_stamp in the following scenario:
> 
> - client sends TFO SYN with data.
> - server has TFO disabled, ACKs only SYN but not payload.
> - client receives SYNACK covering only SYN.
> - tcp_ack() eats SYN and sets tp->retrans_stamp to 0.
> - tcp_rcv_fastopen_synack() calls tcp_xmit_retransmit_queue()
>   to retransmit TFO payload w/o SYN, sets tp->retrans_stamp to "now",
>   but we are not in any loss recovery state.
> - TFO payload is ACKed.
> - we are not in any loss recovery state, and don't see any dupacks,
>   so we don't get to any code path that clears tp->retrans_stamp.
> - tp->retrans_stamp stays non-zero for the lifetime of the connection.
> - after first RTO, tcp_clamp_rto_to_user_timeout() clamps second RTO
>   to 1 jiffy due to bogus tp->retrans_stamp.
> - on clamped RTO with non-zero icsk_retransmits, retransmits_timed_out()
>   sets start_ts from tp->retrans_stamp from TFO payload retransmit
>   hours/days ago, and computes bogus long elapsed time for loss recovery,
>   and suffers ETIMEDOUT early.
> 
> [...]

Here is the summary with links:
  - [net] tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack()
    https://git.kernel.org/netdev/net/c/9e046bb111f1

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9c04a9c8be9dfaa0ec2437b3748284e57588b216..01d208e0eef31fd87c7faaf5a3d10b8f52e99ee0 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6296,6 +6296,7 @@  static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack,
 		skb_rbtree_walk_from(data)
 			 tcp_mark_skb_lost(sk, data);
 		tcp_xmit_retransmit_queue(sk);
+		tp->retrans_stamp = 0;
 		NET_INC_STATS(sock_net(sk),
 				LINUX_MIB_TCPFASTOPENACTIVEFAIL);
 		return true;