diff mbox series

[net-next,2/2] tcp: no longer abort SYN_SENT when receiving some ICMP (II)

Message ID 20240417165756.2531620-3-edumazet@google.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series tcp: tcp_v4_err() changes | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 947 this patch: 947
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 938 this patch: 938
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 958 this patch: 958
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 58 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2024-04-18--12-00 (tests: 962)

Commit Message

Eric Dumazet April 17, 2024, 4:57 p.m. UTC
Notes:

 - A prior version of this patch in commit
   0a8de364ff7a ("tcp: no longer abort SYN_SENT when
   receiving some ICMP") had to be reverted.

 - We found the root cause, and fixed it in prior patch
   in the series.

 - Many thanks to Dragos Tatulea !

Currently, non fatal ICMP messages received on behalf
of SYN_SENT sockets do call tcp_ld_RTO_revert()
to implement RFC 6069, but immediately call tcp_done(),
thus aborting the connect() attempt.

This violates RFC 1122 following requirement:

4.2.3.9  ICMP Messages
...
          o    Destination Unreachable -- codes 0, 1, 5

                 Since these Unreachable messages indicate soft error
                 conditions, TCP MUST NOT abort the connection, and it
                 SHOULD make the information available to the
                 application.

This patch makes sure non 'fatal' ICMP[v6] messages do not
abort the connection attempt.

It enables RFC 6069 for SYN_SENT sockets as a result.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Tested-by: Dragos Tatulea <dtatulea@nvidia.com>
---
 net/ipv4/tcp_ipv4.c | 6 ++++++
 net/ipv6/tcp_ipv6.c | 9 ++++++---
 2 files changed, 12 insertions(+), 3 deletions(-)

Comments

Jason Xing April 18, 2024, 3:24 a.m. UTC | #1
On Thu, Apr 18, 2024 at 12:59 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Notes:
>
>  - A prior version of this patch in commit
>    0a8de364ff7a ("tcp: no longer abort SYN_SENT when
>    receiving some ICMP") had to be reverted.
>
>  - We found the root cause, and fixed it in prior patch
>    in the series.
>
>  - Many thanks to Dragos Tatulea !
>
> Currently, non fatal ICMP messages received on behalf
> of SYN_SENT sockets do call tcp_ld_RTO_revert()
> to implement RFC 6069, but immediately call tcp_done(),
> thus aborting the connect() attempt.
>
> This violates RFC 1122 following requirement:
>
> 4.2.3.9  ICMP Messages
> ...
>           o    Destination Unreachable -- codes 0, 1, 5
>
>                  Since these Unreachable messages indicate soft error
>                  conditions, TCP MUST NOT abort the connection, and it
>                  SHOULD make the information available to the
>                  application.
>
> This patch makes sure non 'fatal' ICMP[v6] messages do not
> abort the connection attempt.
>
> It enables RFC 6069 for SYN_SENT sockets as a result.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Tested-by: Dragos Tatulea <dtatulea@nvidia.com>

Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>

Finally!!
diff mbox series

Patch

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index a717db99972d977a64178d7ed1109325d64a6d51..4b50d09f84b9ae7fec98a71bedf39594ab85e5ea 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -482,6 +482,7 @@  int tcp_v4_err(struct sk_buff *skb, u32 info)
 	const int code = icmp_hdr(skb)->code;
 	struct sock *sk;
 	struct request_sock *fastopen;
+	bool harderr = false;
 	u32 seq, snd_una;
 	int err;
 	struct net *net = dev_net(skb->dev);
@@ -555,6 +556,7 @@  int tcp_v4_err(struct sk_buff *skb, u32 info)
 		goto out;
 	case ICMP_PARAMETERPROB:
 		err = EPROTO;
+		harderr = true;
 		break;
 	case ICMP_DEST_UNREACH:
 		if (code > NR_ICMP_UNREACH)
@@ -579,6 +581,7 @@  int tcp_v4_err(struct sk_buff *skb, u32 info)
 		}
 
 		err = icmp_err_convert[code].errno;
+		harderr = icmp_err_convert[code].fatal;
 		/* check if this ICMP message allows revert of backoff.
 		 * (see RFC 6069)
 		 */
@@ -605,6 +608,9 @@  int tcp_v4_err(struct sk_buff *skb, u32 info)
 		if (inet_test_bit(RECVERR, sk))
 			ip_icmp_error(sk, skb, err, th->dest, info, (u8 *)th);
 
+		if (!harderr)
+			break;
+
 		if (!sock_owned_by_user(sk)) {
 			WRITE_ONCE(sk->sk_err, err);
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index bb7c3caf4f8536dabdcb3dbe7c90aff9c8985c90..f31527f0a13dee9488ce72834d89524e83f61e5f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -382,7 +382,7 @@  static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	struct tcp_sock *tp;
 	__u32 seq, snd_una;
 	struct sock *sk;
-	bool fatal;
+	bool harderr;
 	int err;
 
 	sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
@@ -403,9 +403,9 @@  static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		return 0;
 	}
 	seq = ntohl(th->seq);
-	fatal = icmpv6_err_convert(type, code, &err);
+	harderr = icmpv6_err_convert(type, code, &err);
 	if (sk->sk_state == TCP_NEW_SYN_RECV) {
-		tcp_req_err(sk, seq, fatal);
+		tcp_req_err(sk, seq, harderr);
 		return 0;
 	}
 
@@ -490,6 +490,9 @@  static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 		ipv6_icmp_error(sk, skb, err, th->dest, ntohl(info), (u8 *)th);
 
+		if (!harderr)
+			break;
+
 		if (!sock_owned_by_user(sk)) {
 			WRITE_ONCE(sk->sk_err, err);
 			sk_error_report(sk);		/* Wake people up to see the error (see connect in sock.c) */