diff mbox series

[net-next,v6,1/2] net-timestamp: introduce SOF_TIMESTAMPING_OPT_RX_FILTER flag

Message ID 20240909015612.3856-2-kerneljasonxing@gmail.com (mailing list archive)
State Accepted
Commit be8e9eb3750639aa5cffb3f764ca080caed41bd0
Delegated to: Netdev Maintainers
Series net-timestamp: introduce a flag to filter out rx software and hardware report

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 31 this patch: 31
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 84 this patch: 84
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 3240 this patch: 3240
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 86 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 27 this patch: 27
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-09-09--12-00 (tests: 722)

Commit Message

Jason Xing Sept. 9, 2024, 1:56 a.m. UTC
From: Jason Xing <kernelxing@tencent.com>

Introduce a new flag, SOF_TIMESTAMPING_OPT_RX_FILTER, in the receive
path. Users can set it together with SOF_TIMESTAMPING_SOFTWARE to filter
out rx software timestamp reports, especially after another process has
turned on netstamp_needed_key, which timestamps every incoming skb.

Previously, we found that if one application starts first and turns on
netstamp_needed_key, another application that passes only
SOF_TIMESTAMPING_SOFTWARE also gets rx timestamps. The new flag handles
this case without breaking existing users.

Quoting Willem to explain why we need the flag:
"why a process would want to request software timestamp reporting, but
not receive software timestamp generation. The only use I see is when
the application does request
SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."

Similarly, the new flag can be used for the hardware case: when it is
set together with SOF_TIMESTAMPING_RAW_HARDWARE, hardware receive
timestamps are no longer reported.

A note on the errqueue handling in this patch: the egress path needs
care, or else reporting the tx timestamp will fail. Both the egress and
ingress paths eventually call sock_recv_timestamp(), so we have to
distinguish them. Whether the skb comes from the error queue is a good
indicator of the flow direction.

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
v6
Link: https://lore.kernel.org/all/20240906095640.77533-1-kerneljasonxing@gmail.com/
1. add the description in doc provided by Willem
2. align the if statements (Willem)

v5
Link: https://lore.kernel.org/all/20240905071738.3725-1-kerneljasonxing@gmail.com/
1. squash the hardware case patch into this one (Willem)
2. update corresponding commit message and doc (Willem)
3. remove the limitation in sock_set_timestamping() and restore the
simplification branches. (Willem)

v4
Link: https://lore.kernel.org/all/20240830153751.86895-2-kerneljasonxing@gmail.com/
1. revise the commit message and doc (Willem)
2. simplify the test statement (Jakub)
3. add Willem's reviewed-by tag (Willem)

v3
1. Willem suggested this alternative way to solve the issue, so I
added his Suggested-by tag here. Thanks!
---
 Documentation/networking/timestamping.rst | 17 +++++++++++++++++
 include/uapi/linux/net_tstamp.h           |  3 ++-
 net/ethtool/common.c                      |  1 +
 net/ipv4/tcp.c                            |  9 +++++++--
 net/socket.c                              | 10 ++++++++--
 5 files changed, 35 insertions(+), 5 deletions(-)
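
As an illustration of the use case in the commit message (a socket that wants transmit software timestamps only), here is a minimal userspace sketch. The helper names and the EX_ constants are hypothetical and not part of the patch. The bit values are copied from include/uapi/linux/net_tstamp.h so that the sketch also builds against headers that predate the new flag. On a kernel without this patch the setsockopt() call is expected to fail, most likely with EINVAL.

```c
#include <stdint.h>
#include <sys/socket.h>

/* Assumption: local copies of the uapi bit values, named with an EX_
 * prefix so they do not clash with <linux/net_tstamp.h>. */
#define EX_SOF_TX_SOFTWARE   (1U << 1)   /* SOF_TIMESTAMPING_TX_SOFTWARE */
#define EX_SOF_SOFTWARE      (1U << 4)   /* SOF_TIMESTAMPING_SOFTWARE */
#define EX_SOF_OPT_RX_FILTER (1U << 17)  /* SOF_TIMESTAMPING_OPT_RX_FILTER */

/* Flags for a socket that wants tx software timestamps reported, and
 * wants rx timestamps generated on behalf of other sockets filtered out. */
uint32_t ex_tx_only_sw_flags(void)
{
	return EX_SOF_SOFTWARE | EX_SOF_TX_SOFTWARE | EX_SOF_OPT_RX_FILTER;
}

/* Apply the flags to a socket. Returns 0 on success, or -1 with errno
 * set by setsockopt() on failure. */
int ex_enable_tx_only_sw_timestamps(int fd)
{
	uint32_t flags = ex_tx_only_sw_flags();

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}
```

A caller would invoke ex_enable_tx_only_sw_timestamps(fd) once after creating the socket, then read tx timestamps from the error queue as before. Without EX_SOF_OPT_RX_FILTER in the mask, the same socket may also see rx software timestamps whenever some other process has enabled netstamp_needed_key.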

Comments

Willem de Bruijn Sept. 9, 2024, 2:54 a.m. UTC | #1
Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> introduce a new flag SOF_TIMESTAMPING_OPT_RX_FILTER in the receive
> path. User can set it with SOF_TIMESTAMPING_SOFTWARE to filter
> out rx software timestamp report, especially after a process turns on
> netstamp_needed_key which can time stamp every incoming skb.
> 
> Previously, we found out if an application starts first which turns on
> netstamp_needed_key, then another one only passing SOF_TIMESTAMPING_SOFTWARE
> could also get rx timestamp. Now we handle this case by introducing this
> new flag without breaking users.
> 
> Quoting Willem to explain why we need the flag:
> "why a process would want to request software timestamp reporting, but
> not receive software timestamp generation. The only use I see is when
> the application does request
> SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."
> 
> Similarly, this new flag could also be used for hardware case where we
> can set it with SOF_TIMESTAMPING_RAW_HARDWARE, then we won't receive
> hardware receive timestamp.
> 
> Another thing about errqueue in this patch I have a few words to say:
> In this case, we need to handle the egress path carefully, or else
> reporting the tx timestamp will fail. Egress path and ingress path will
> finally call sock_recv_timestamp(). We have to distinguish them.
> Errqueue is a good indicator to reflect the flow direction.
> 
> Suggested-by: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

I really only suggested making this a new flag, not the main idea of
filtering.

> ---
> v6
> Link: https://lore.kernel.org/all/20240906095640.77533-1-kerneljasonxing@gmail.com/
> 1. add the description in doc provided by Willem
> 2. align the if statements (Willem)
> 
> v5
> Link: https://lore.kernel.org/all/20240905071738.3725-1-kerneljasonxing@gmail.com/
> 1. squash the hardware case patch into this one (Willem)
> 2. update corresponding commit message and doc (Willem)
> 3. remove the limitation in sock_set_timestamping() and restore the
> simplification branches. (Willem)
> 
> v4
> Link: https://lore.kernel.org/all/20240830153751.86895-2-kerneljasonxing@gmail.com/
> 1. revise the commit message and doc (Willem)
> 2. simplify the test statement (Jakub)
> 3. add Willem's reviewed-by tag (Willem)
> 
> v3
> 1. Willem suggested this alternative way to solve the issue, so I
> added his Suggested-by tag here. Thanks!
> ---
>  Documentation/networking/timestamping.rst | 17 +++++++++++++++++
>  include/uapi/linux/net_tstamp.h           |  3 ++-
>  net/ethtool/common.c                      |  1 +
>  net/ipv4/tcp.c                            |  9 +++++++--
>  net/socket.c                              | 10 ++++++++--
>  5 files changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> index 9c7773271393..8199e6917671 100644
> --- a/Documentation/networking/timestamping.rst
> +++ b/Documentation/networking/timestamping.rst
> @@ -267,6 +267,23 @@ SOF_TIMESTAMPING_OPT_TX_SWHW:
>    two separate messages will be looped to the socket's error queue,
>    each containing just one timestamp.
>  
> +SOF_TIMESTAMPING_OPT_RX_FILTER:
> +  Filter out spurious receive timestamps: report a receive timestamp
> +  only if the matching timestamp generation flag is enabled.
> +
> +  Receive timestamps are generated early in the ingress path, before a
> +  packet's destination socket is known. If any socket enables receive
> +  timestamps, packets for all socket will receive timestamped packets.

nit: s/packets for all socket/all sockets/

My error in my suggestion.

Not important enough to respin.

> +  Including those that request timestamp reporting with
> +  SOF_TIMESTAMPING_SOFTWARE and/or SOF_TIMESTAMPING_RAW_HARDWARE, but
> +  do not request receive timestamp generation. This can happen when
> +  requesting transmit timestamps only.

Jason Xing Sept. 9, 2024, 3:12 a.m. UTC | #2
On Mon, Sep 9, 2024 at 10:54 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > introduce a new flag SOF_TIMESTAMPING_OPT_RX_FILTER in the receive
> > path. User can set it with SOF_TIMESTAMPING_SOFTWARE to filter
> > out rx software timestamp report, especially after a process turns on
> > netstamp_needed_key which can time stamp every incoming skb.
> >
> > Previously, we found out if an application starts first which turns on
> > netstamp_needed_key, then another one only passing SOF_TIMESTAMPING_SOFTWARE
> > could also get rx timestamp. Now we handle this case by introducing this
> > new flag without breaking users.
> >
> > Quoting Willem to explain why we need the flag:
> > "why a process would want to request software timestamp reporting, but
> > not receive software timestamp generation. The only use I see is when
> > the application does request
> > SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."
> >
> > Similarly, this new flag could also be used for hardware case where we
> > can set it with SOF_TIMESTAMPING_RAW_HARDWARE, then we won't receive
> > hardware receive timestamp.
> >
> > Another thing about errqueue in this patch I have a few words to say:
> > In this case, we need to handle the egress path carefully, or else
> > reporting the tx timestamp will fail. Egress path and ingress path will
> > finally call sock_recv_timestamp(). We have to distinguish them.
> > Errqueue is a good indicator to reflect the flow direction.
> >
> > Suggested-by: Willem de Bruijn <willemb@google.com>
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
>
> I really only suggested making this a new flag, not the main idea of
> filtering.

You provided a good alternative solution, so I was trying to give
enough credit for your help and guidance :)

> > +SOF_TIMESTAMPING_OPT_RX_FILTER:
> > +  Filter out spurious receive timestamps: report a receive timestamp
> > +  only if the matching timestamp generation flag is enabled.
> > +
> > +  Receive timestamps are generated early in the ingress path, before a
> > +  packet's destination socket is known. If any socket enables receive
> > +  timestamps, packets for all socket will receive timestamped packets.
>
> nit: s/packets for all socket/all sockets/
>
> My error in my suggestion.
>
> Not important enough to respin.

Got it.

Thanks,
Jason
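
For readers following the thread, the flag test that this patch adds to __sock_recv_timestamp() can be modeled in userspace as two small predicates. This is a sketch of the flag logic only, under the assumption that the timestamp itself is present and non-zero; the function names and EX_ constants are hypothetical, and the is_err_queue parameter stands in for skb_is_err_queue(skb). The tcp_recv_timestamp() hunk applies the same test without the error-queue term.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: local copies of the uapi bit values from
 * include/uapi/linux/net_tstamp.h, renamed with an EX_ prefix. */
#define EX_SOF_RX_HARDWARE   (1U << 2)   /* SOF_TIMESTAMPING_RX_HARDWARE */
#define EX_SOF_RX_SOFTWARE   (1U << 3)   /* SOF_TIMESTAMPING_RX_SOFTWARE */
#define EX_SOF_SOFTWARE      (1U << 4)   /* SOF_TIMESTAMPING_SOFTWARE */
#define EX_SOF_RAW_HARDWARE  (1U << 6)   /* SOF_TIMESTAMPING_RAW_HARDWARE */
#define EX_SOF_OPT_RX_FILTER (1U << 17)  /* SOF_TIMESTAMPING_OPT_RX_FILTER */

/* Shared shape of both checks: the reporting flag must be set, and then
 * one of the following must hold: the matching rx generation flag is set,
 * the skb comes from the error queue (tx timestamp), or the socket did
 * not opt in to filtering (legacy behavior). */
static bool ex_report(uint32_t tsflags, uint32_t report_flag,
		      uint32_t rx_gen_flag, bool is_err_queue)
{
	if (!(tsflags & report_flag))
		return false;
	return (tsflags & rx_gen_flag) || is_err_queue ||
	       !(tsflags & EX_SOF_OPT_RX_FILTER);
}

/* Would a software timestamp be reported for these socket flags? */
bool ex_report_sw_tstamp(uint32_t tsflags, bool is_err_queue)
{
	return ex_report(tsflags, EX_SOF_SOFTWARE, EX_SOF_RX_SOFTWARE,
			 is_err_queue);
}

/* Would a hardware timestamp be reported for these socket flags? */
bool ex_report_hw_tstamp(uint32_t tsflags, bool is_err_queue)
{
	return ex_report(tsflags, EX_SOF_RAW_HARDWARE, EX_SOF_RX_HARDWARE,
			 is_err_queue);
}
```

With only the reporting flag set, both predicates return true, which is the behavior existing users rely on. Adding the filter flag turns off reporting for ingress packets unless the matching rx generation flag is also set, while error-queue skbs still pass, so tx timestamps keep working.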

Patch

diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 9c7773271393..8199e6917671 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -267,6 +267,23 @@  SOF_TIMESTAMPING_OPT_TX_SWHW:
   two separate messages will be looped to the socket's error queue,
   each containing just one timestamp.
 
+SOF_TIMESTAMPING_OPT_RX_FILTER:
+  Filter out spurious receive timestamps: report a receive timestamp
+  only if the matching timestamp generation flag is enabled.
+
+  Receive timestamps are generated early in the ingress path, before a
+  packet's destination socket is known. If any socket enables receive
+  timestamps, packets for all socket will receive timestamped packets.
+  Including those that request timestamp reporting with
+  SOF_TIMESTAMPING_SOFTWARE and/or SOF_TIMESTAMPING_RAW_HARDWARE, but
+  do not request receive timestamp generation. This can happen when
+  requesting transmit timestamps only.
+
+  Receiving spurious timestamps is generally benign. A process can
+  ignore the unexpected non-zero value. But it makes behavior subtly
+  dependent on other sockets. This flag isolates the socket for more
+  deterministic behavior.
+
 New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
 disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
 regardless of the setting of sysctl net.core.tstamp_allow_data.
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index a2c66b3d7f0f..858339d1c1c4 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -32,8 +32,9 @@  enum {
 	SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
 	SOF_TIMESTAMPING_BIND_PHC = (1 << 15),
 	SOF_TIMESTAMPING_OPT_ID_TCP = (1 << 16),
+	SOF_TIMESTAMPING_OPT_RX_FILTER = (1 << 17),
 
-	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_ID_TCP,
+	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_RX_FILTER,
 	SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
 				 SOF_TIMESTAMPING_LAST
 };
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 781834ef57c3..6c245e59bbc1 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -427,6 +427,7 @@  const char sof_timestamping_names[][ETH_GSTRING_LEN] = {
 	[const_ilog2(SOF_TIMESTAMPING_OPT_TX_SWHW)]  = "option-tx-swhw",
 	[const_ilog2(SOF_TIMESTAMPING_BIND_PHC)]     = "bind-phc",
 	[const_ilog2(SOF_TIMESTAMPING_OPT_ID_TCP)]   = "option-id-tcp",
+	[const_ilog2(SOF_TIMESTAMPING_OPT_RX_FILTER)] = "option-rx-filter",
 };
 static_assert(ARRAY_SIZE(sof_timestamping_names) == __SOF_TIMESTAMPING_CNT);
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8a5680b4e786..e359a9161445 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2235,6 +2235,7 @@  void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk,
 			struct scm_timestamping_internal *tss)
 {
 	int new_tstamp = sock_flag(sk, SOCK_TSTAMP_NEW);
+	u32 tsflags = READ_ONCE(sk->sk_tsflags);
 	bool has_timestamping = false;
 
 	if (tss->ts[0].tv_sec || tss->ts[0].tv_nsec) {
@@ -2274,14 +2275,18 @@  void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk,
 			}
 		}
 
-		if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_SOFTWARE)
+		if (tsflags & SOF_TIMESTAMPING_SOFTWARE &&
+		    (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE ||
+		     !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER)))
 			has_timestamping = true;
 		else
 			tss->ts[0] = (struct timespec64) {0};
 	}
 
 	if (tss->ts[2].tv_sec || tss->ts[2].tv_nsec) {
-		if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_RAW_HARDWARE)
+		if (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE &&
+		    (tsflags & SOF_TIMESTAMPING_RX_HARDWARE ||
+		     !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER)))
 			has_timestamping = true;
 		else
 			tss->ts[2] = (struct timespec64) {0};
diff --git a/net/socket.c b/net/socket.c
index fcbdd5bc47ac..1e5c463fb4c5 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -946,11 +946,17 @@  void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
 
 	memset(&tss, 0, sizeof(tss));
 	tsflags = READ_ONCE(sk->sk_tsflags);
-	if ((tsflags & SOF_TIMESTAMPING_SOFTWARE) &&
+	if ((tsflags & SOF_TIMESTAMPING_SOFTWARE &&
+	     (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE ||
+	      skb_is_err_queue(skb) ||
+	      !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) &&
 	    ktime_to_timespec64_cond(skb->tstamp, tss.ts + 0))
 		empty = 0;
 	if (shhwtstamps &&
-	    (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
+	    (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE &&
+	     (tsflags & SOF_TIMESTAMPING_RX_HARDWARE ||
+	      skb_is_err_queue(skb) ||
+	      !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) &&
 	    !skb_is_swtx_tstamp(skb, false_tstamp)) {
 		if_index = 0;
 		if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP_NETDEV)