diff mbox series

[net-next,v2,4/6] tcp: rstreason: introduce SK_RST_REASON_TCP_STATE for active reset

Message ID 20240731120955.23542-5-kerneljasonxing@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series tcp: completely support active reset | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 59 this patch: 59
netdev/build_tools success Errors and warnings before: 10 this patch: 10
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 84 this patch: 84
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 2229 this patch: 2229
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 42 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 4 this patch: 4
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-08-01--06-00 (tests: 707)

Commit Message

Jason Xing July 31, 2024, 12:09 p.m. UTC
From: Jason Xing <kernelxing@tencent.com>

Introducing a new type TCP_STATE to handle some reset conditions
appearing in RFC 793 due to its socket state. Actually, we can look
into RFC 9293 which has no discrepancy about this part.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
V2
Link: https://lore.kernel.org/all/20240730200633.93761-1-kuniyu@amazon.com/
1. use RFC 9293 instead of RFC 793 which is too old (Kuniyuki)
---
 include/net/rstreason.h | 6 ++++++
 net/ipv4/tcp.c          | 4 ++--
 net/ipv4/tcp_timer.c    | 2 +-
 3 files changed, 9 insertions(+), 3 deletions(-)

Comments

Eric Dumazet Aug. 1, 2024, 6:56 a.m. UTC | #1
On Wed, Jul 31, 2024 at 2:10 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> Introducing a new type TCP_STATE to handle some reset conditions
> appearing in RFC 793 due to its socket state. Actually, we can look
> into RFC 9293 which has no discrepancy about this part.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> V2
> Link: https://lore.kernel.org/all/20240730200633.93761-1-kuniyu@amazon.com/
> 1. use RFC 9293 instead of RFC 793 which is too old (Kuniyuki)
> ---
>  include/net/rstreason.h | 6 ++++++
>  net/ipv4/tcp.c          | 4 ++--
>  net/ipv4/tcp_timer.c    | 2 +-
>  3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/rstreason.h b/include/net/rstreason.h
> index eef658da8952..bbf20d0bbde7 100644
> --- a/include/net/rstreason.h
> +++ b/include/net/rstreason.h
> @@ -20,6 +20,7 @@
>         FN(TCP_ABORT_ON_CLOSE)          \
>         FN(TCP_ABORT_ON_LINGER)         \
>         FN(TCP_ABORT_ON_MEMORY)         \
> +       FN(TCP_STATE)                   \
>         FN(MPTCP_RST_EUNSPEC)           \
>         FN(MPTCP_RST_EMPTCP)            \
>         FN(MPTCP_RST_ERESOURCE)         \
> @@ -102,6 +103,11 @@ enum sk_rst_reason {
>          * corresponding to LINUX_MIB_TCPABORTONMEMORY
>          */
>         SK_RST_REASON_TCP_ABORT_ON_MEMORY,
> +       /**
> +        * @SK_RST_REASON_TCP_STATE: abort on tcp state
> +        * Please see RFC 9293 for all possible reset conditions
> +        */
> +       SK_RST_REASON_TCP_STATE,
>
>         /* Copy from include/uapi/linux/mptcp.h.
>          * These reset fields will not be changed since they adhere to
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index fd928c447ce8..64a49cb714e1 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3031,7 +3031,7 @@ int tcp_disconnect(struct sock *sk, int flags)
>                 /* The last check adjusts for discrepancy of Linux wrt. RFC
>                  * states
>                  */
> -               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_NOT_SPECIFIED);
> +               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_TCP_STATE);

I disagree with this. tcp_disconnect() is initiated by the user.

You are conflating two possible conditions :

1) tcp_need_reset(old_state)
2) (tp->snd_nxt != tp->write_seq && (1 << old_state) & (TCPF_CLOSING |
TCPF_LAST_ACK)))

>                 WRITE_ONCE(sk->sk_err, ECONNRESET);
>         } else if (old_state == TCP_SYN_SENT)
>                 WRITE_ONCE(sk->sk_err, ECONNRESET);
> @@ -4649,7 +4649,7 @@ int tcp_abort(struct sock *sk, int err)
>         if (!sock_flag(sk, SOCK_DEAD)) {
>                 if (tcp_need_reset(sk->sk_state))
>                         tcp_send_active_reset(sk, GFP_ATOMIC,
> -                                             SK_RST_REASON_NOT_SPECIFIED);
> +                                             SK_RST_REASON_TCP_STATE);
>                 tcp_done_with_error(sk, err);
>         }
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 0fba4a4fb988..3910f6d8614e 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -779,7 +779,7 @@ static void tcp_keepalive_timer (struct timer_list *t)
>                                 goto out;
>                         }
>                 }
> -               tcp_send_active_reset(sk, GFP_ATOMIC, SK_RST_REASON_NOT_SPECIFIED);
> +               tcp_send_active_reset(sk, GFP_ATOMIC, SK_RST_REASON_TCP_STATE);
>                 goto death;
>         }
>
> --
> 2.37.3
>
Jason Xing Aug. 1, 2024, 9:51 a.m. UTC | #2
Hello Eric,

On Thu, Aug 1, 2024 at 2:56 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Jul 31, 2024 at 2:10 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Introducing a new type TCP_STATE to handle some reset conditions
> > appearing in RFC 793 due to its socket state. Actually, we can look
> > into RFC 9293 which has no discrepancy about this part.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > V2
> > Link: https://lore.kernel.org/all/20240730200633.93761-1-kuniyu@amazon.com/
> > 1. use RFC 9293 instead of RFC 793 which is too old (Kuniyuki)
> > ---
> >  include/net/rstreason.h | 6 ++++++
> >  net/ipv4/tcp.c          | 4 ++--
> >  net/ipv4/tcp_timer.c    | 2 +-
> >  3 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/net/rstreason.h b/include/net/rstreason.h
> > index eef658da8952..bbf20d0bbde7 100644
> > --- a/include/net/rstreason.h
> > +++ b/include/net/rstreason.h
> > @@ -20,6 +20,7 @@
> >         FN(TCP_ABORT_ON_CLOSE)          \
> >         FN(TCP_ABORT_ON_LINGER)         \
> >         FN(TCP_ABORT_ON_MEMORY)         \
> > +       FN(TCP_STATE)                   \
> >         FN(MPTCP_RST_EUNSPEC)           \
> >         FN(MPTCP_RST_EMPTCP)            \
> >         FN(MPTCP_RST_ERESOURCE)         \
> > @@ -102,6 +103,11 @@ enum sk_rst_reason {
> >          * corresponding to LINUX_MIB_TCPABORTONMEMORY
> >          */
> >         SK_RST_REASON_TCP_ABORT_ON_MEMORY,
> > +       /**
> > +        * @SK_RST_REASON_TCP_STATE: abort on tcp state
> > +        * Please see RFC 9293 for all possible reset conditions
> > +        */
> > +       SK_RST_REASON_TCP_STATE,
> >
> >         /* Copy from include/uapi/linux/mptcp.h.
> >          * These reset fields will not be changed since they adhere to
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index fd928c447ce8..64a49cb714e1 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -3031,7 +3031,7 @@ int tcp_disconnect(struct sock *sk, int flags)
> >                 /* The last check adjusts for discrepancy of Linux wrt. RFC
> >                  * states
> >                  */
> > -               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_NOT_SPECIFIED);
> > +               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_TCP_STATE);
>
> I disagree with this. tcp_disconnect() is initiated by the user.
>
> You are conflating two possible conditions :
>
> 1) tcp_need_reset(old_state)

For this one, I can keep the TCP_STATE reason, right?

> 2) (tp->snd_nxt != tp->write_seq && (1 << old_state) & (TCPF_CLOSING |
> TCPF_LAST_ACK)))
>

For this one, I wonder if I need to separate this condition with
'tcp_need_reset()' and put it into another 'else-if' branch?
I decided to name it as 'CLOSE_WITH_DATA' because it can reflect that
the write queue of the socket is not empty (at this time the user may
think he has more data to send) but it stays in the active close
state.
How about it?

Thanks,
Jason
Eric Dumazet Aug. 1, 2024, 10:06 a.m. UTC | #3
On Thu, Aug 1, 2024 at 11:51 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> Hello Eric,
>
> On Thu, Aug 1, 2024 at 2:56 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Wed, Jul 31, 2024 at 2:10 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > Introducing a new type TCP_STATE to handle some reset conditions
> > > appearing in RFC 793 due to its socket state. Actually, we can look
> > > into RFC 9293 which has no discrepancy about this part.
> > >
> > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > ---
> > > V2
> > > Link: https://lore.kernel.org/all/20240730200633.93761-1-kuniyu@amazon.com/
> > > 1. use RFC 9293 instead of RFC 793 which is too old (Kuniyuki)
> > > ---
> > >  include/net/rstreason.h | 6 ++++++
> > >  net/ipv4/tcp.c          | 4 ++--
> > >  net/ipv4/tcp_timer.c    | 2 +-
> > >  3 files changed, 9 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/include/net/rstreason.h b/include/net/rstreason.h
> > > index eef658da8952..bbf20d0bbde7 100644
> > > --- a/include/net/rstreason.h
> > > +++ b/include/net/rstreason.h
> > > @@ -20,6 +20,7 @@
> > >         FN(TCP_ABORT_ON_CLOSE)          \
> > >         FN(TCP_ABORT_ON_LINGER)         \
> > >         FN(TCP_ABORT_ON_MEMORY)         \
> > > +       FN(TCP_STATE)                   \
> > >         FN(MPTCP_RST_EUNSPEC)           \
> > >         FN(MPTCP_RST_EMPTCP)            \
> > >         FN(MPTCP_RST_ERESOURCE)         \
> > > @@ -102,6 +103,11 @@ enum sk_rst_reason {
> > >          * corresponding to LINUX_MIB_TCPABORTONMEMORY
> > >          */
> > >         SK_RST_REASON_TCP_ABORT_ON_MEMORY,
> > > +       /**
> > > +        * @SK_RST_REASON_TCP_STATE: abort on tcp state
> > > +        * Please see RFC 9293 for all possible reset conditions
> > > +        */
> > > +       SK_RST_REASON_TCP_STATE,
> > >
> > >         /* Copy from include/uapi/linux/mptcp.h.
> > >          * These reset fields will not be changed since they adhere to
> > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > index fd928c447ce8..64a49cb714e1 100644
> > > --- a/net/ipv4/tcp.c
> > > +++ b/net/ipv4/tcp.c
> > > @@ -3031,7 +3031,7 @@ int tcp_disconnect(struct sock *sk, int flags)
> > >                 /* The last check adjusts for discrepancy of Linux wrt. RFC
> > >                  * states
> > >                  */
> > > -               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_NOT_SPECIFIED);
> > > +               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_TCP_STATE);
> >
> > I disagree with this. tcp_disconnect() is initiated by the user.
> >
> > You are conflating two possible conditions :
> >
> > 1) tcp_need_reset(old_state)
>
> For this one, I can keep the TCP_STATE reason, right?

What does it mean ?

>
> > 2) (tp->snd_nxt != tp->write_seq && (1 << old_state) & (TCPF_CLOSING |
> > TCPF_LAST_ACK)))
> >
>
> For this one, I wonder if I need to separate this condition with
> 'tcp_need_reset()' and put it into another 'else-if' branch?
> I decided to name it as 'CLOSE_WITH_DATA' because it can reflect that
> the write queue of the socket is not empty (at this time the user may
> think he has more data to send) but it stays in the active close
> state.
> How about it?

This is not CLOSE_WITH_DATA, but a disconnect() operation, initiated
by user space.
If we add RST reasons, can we please be careful about the chosen names ?

man connect

<quote>
       Some  protocol  sockets  (e.g., TCP sockets as well as datagram
sockets in the UNIX and Internet domains) may dissolve the association
by connecting to an address with the sa_family member of sockaddr set
to AF_UNSPEC; thereafter, the socket can be connected to another ad‐
       dress.  (AF_UNSPEC is supported since Linux 2.2.)
</quote>

Very different from close()...
Jason Xing Aug. 1, 2024, 10:24 a.m. UTC | #4
Hello Eric,

On Thu, Aug 1, 2024 at 6:06 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Aug 1, 2024 at 11:51 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > Hello Eric,
> >
> > On Thu, Aug 1, 2024 at 2:56 PM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Wed, Jul 31, 2024 at 2:10 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > >
> > > > From: Jason Xing <kernelxing@tencent.com>
> > > >
> > > > Introducing a new type TCP_STATE to handle some reset conditions
> > > > appearing in RFC 793 due to its socket state. Actually, we can look
> > > > into RFC 9293 which has no discrepancy about this part.
> > > >
> > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > ---
> > > > V2
> > > > Link: https://lore.kernel.org/all/20240730200633.93761-1-kuniyu@amazon.com/
> > > > 1. use RFC 9293 instead of RFC 793 which is too old (Kuniyuki)
> > > > ---
> > > >  include/net/rstreason.h | 6 ++++++
> > > >  net/ipv4/tcp.c          | 4 ++--
> > > >  net/ipv4/tcp_timer.c    | 2 +-
> > > >  3 files changed, 9 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/include/net/rstreason.h b/include/net/rstreason.h
> > > > index eef658da8952..bbf20d0bbde7 100644
> > > > --- a/include/net/rstreason.h
> > > > +++ b/include/net/rstreason.h
> > > > @@ -20,6 +20,7 @@
> > > >         FN(TCP_ABORT_ON_CLOSE)          \
> > > >         FN(TCP_ABORT_ON_LINGER)         \
> > > >         FN(TCP_ABORT_ON_MEMORY)         \
> > > > +       FN(TCP_STATE)                   \
> > > >         FN(MPTCP_RST_EUNSPEC)           \
> > > >         FN(MPTCP_RST_EMPTCP)            \
> > > >         FN(MPTCP_RST_ERESOURCE)         \
> > > > @@ -102,6 +103,11 @@ enum sk_rst_reason {
> > > >          * corresponding to LINUX_MIB_TCPABORTONMEMORY
> > > >          */
> > > >         SK_RST_REASON_TCP_ABORT_ON_MEMORY,
> > > > +       /**
> > > > +        * @SK_RST_REASON_TCP_STATE: abort on tcp state
> > > > +        * Please see RFC 9293 for all possible reset conditions
> > > > +        */
> > > > +       SK_RST_REASON_TCP_STATE,
> > > >
> > > >         /* Copy from include/uapi/linux/mptcp.h.
> > > >          * These reset fields will not be changed since they adhere to
> > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > > index fd928c447ce8..64a49cb714e1 100644
> > > > --- a/net/ipv4/tcp.c
> > > > +++ b/net/ipv4/tcp.c
> > > > @@ -3031,7 +3031,7 @@ int tcp_disconnect(struct sock *sk, int flags)
> > > >                 /* The last check adjusts for discrepancy of Linux wrt. RFC
> > > >                  * states
> > > >                  */
> > > > -               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_NOT_SPECIFIED);
> > > > +               tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_TCP_STATE);
> > >
> > > I disagree with this. tcp_disconnect() is initiated by the user.
> > >
> > > You are conflating two possible conditions :
> > >
> > > 1) tcp_need_reset(old_state)
> >
> > For this one, I can keep the TCP_STATE reason, right?
>
> What does it mean ?

I mean I wonder if I can use the TCP_STATE reason in tcp_abort() and
tcp_disconnect() when tcp_need_reset() returns true?

>
> >
> > > 2) (tp->snd_nxt != tp->write_seq && (1 << old_state) & (TCPF_CLOSING |
> > > TCPF_LAST_ACK)))
> > >
> >
> > For this one, I wonder if I need to separate this condition with
> > 'tcp_need_reset()' and put it into another 'else-if' branch?
> > I decided to name it as 'CLOSE_WITH_DATA' because it can reflect that
> > the write queue of the socket is not empty (at this time the user may
> > think he has more data to send) but it stays in the active close
> > state.
> > How about it?
>
> This is not CLOSE_WITH_DATA, but a disconnect() operation, initiated
> by user space.
> If we add RST reasons, can we please be careful about the chosen names ?

Yes, I know, but like old days, I'm struggling with the English name. Sorry.

>
> man connect
>
> <quote>
>        Some  protocol  sockets  (e.g., TCP sockets as well as datagram
> sockets in the UNIX and Internet domains) may dissolve the association
> by connecting to an address with the sa_family member of sockaddr set
> to AF_UNSPEC; thereafter, the socket can be connected to another ad‐
>        dress.  (AF_UNSPEC is supported since Linux 2.2.)
> </quote>
>
> Very different from close()...

Oh, I see. What I was talking about 'CLOSE' is the socket state, but
you are right: the name will finally be displayed to users, which must
clearly reflect the real meaning of the underlying behavior.

I will use "TCP_DISCONNECT_WITH_DATA" instead under this condition.
And then, I will put it into a new patch since it's a different reason
name.

Thanks for your help!

Jason
diff mbox series

Patch

diff --git a/include/net/rstreason.h b/include/net/rstreason.h
index eef658da8952..bbf20d0bbde7 100644
--- a/include/net/rstreason.h
+++ b/include/net/rstreason.h
@@ -20,6 +20,7 @@ 
 	FN(TCP_ABORT_ON_CLOSE)		\
 	FN(TCP_ABORT_ON_LINGER)		\
 	FN(TCP_ABORT_ON_MEMORY)		\
+	FN(TCP_STATE)			\
 	FN(MPTCP_RST_EUNSPEC)		\
 	FN(MPTCP_RST_EMPTCP)		\
 	FN(MPTCP_RST_ERESOURCE)		\
@@ -102,6 +103,11 @@  enum sk_rst_reason {
 	 * corresponding to LINUX_MIB_TCPABORTONMEMORY
 	 */
 	SK_RST_REASON_TCP_ABORT_ON_MEMORY,
+	/**
+	 * @SK_RST_REASON_TCP_STATE: abort on tcp state
+	 * Please see RFC 9293 for all possible reset conditions
+	 */
+	SK_RST_REASON_TCP_STATE,
 
 	/* Copy from include/uapi/linux/mptcp.h.
 	 * These reset fields will not be changed since they adhere to
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fd928c447ce8..64a49cb714e1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3031,7 +3031,7 @@  int tcp_disconnect(struct sock *sk, int flags)
 		/* The last check adjusts for discrepancy of Linux wrt. RFC
 		 * states
 		 */
-		tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_NOT_SPECIFIED);
+		tcp_send_active_reset(sk, gfp_any(), SK_RST_REASON_TCP_STATE);
 		WRITE_ONCE(sk->sk_err, ECONNRESET);
 	} else if (old_state == TCP_SYN_SENT)
 		WRITE_ONCE(sk->sk_err, ECONNRESET);
@@ -4649,7 +4649,7 @@  int tcp_abort(struct sock *sk, int err)
 	if (!sock_flag(sk, SOCK_DEAD)) {
 		if (tcp_need_reset(sk->sk_state))
 			tcp_send_active_reset(sk, GFP_ATOMIC,
-					      SK_RST_REASON_NOT_SPECIFIED);
+					      SK_RST_REASON_TCP_STATE);
 		tcp_done_with_error(sk, err);
 	}
 
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 0fba4a4fb988..3910f6d8614e 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -779,7 +779,7 @@  static void tcp_keepalive_timer (struct timer_list *t)
 				goto out;
 			}
 		}
-		tcp_send_active_reset(sk, GFP_ATOMIC, SK_RST_REASON_NOT_SPECIFIED);
+		tcp_send_active_reset(sk, GFP_ATOMIC, SK_RST_REASON_TCP_STATE);
 		goto death;
 	}