diff mbox series

[net-next] selftests: txtimestamp: ignore the old skb from ERRQUEUE

Message ID 20250303080404.70042-1-kerneljasonxing@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] selftests: txtimestamp: ignore the old skb from ERRQUEUE | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success Errors and warnings before: 26 (+0) this patch: 26 (+0)
netdev/cc_maintainers warning 3 maintainers not CCed: shuah@kernel.org linux-kselftest@vger.kernel.org kernelxing@tencent.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning + fprintf(stderr, "ERROR: num_pkts received %d, expected %d\n", CHECK: Alignment should match open parenthesis
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2025-03-03--12-00 (tests: 893)

Commit Message

Jason Xing March 3, 2025, 8:04 a.m. UTC
When I was playing with txtimestamp.c to see how kernel behaves,
I saw the following error outputs if I adjusted the loopback mtu to
1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000':

test SND
    USR: 1740877371 s 488712 us (seq=0, len=0)
    SND: 1740877371 s 489519 us (seq=29999, len=1106)  (USR +806 us)
    USR: 1740877371 s 581251 us (seq=0, len=0)
    SND: 1740877371 s 581970 us (seq=59999, len=8346)  (USR +719 us)
    USR: 1740877371 s 673855 us (seq=0, len=0)
    SND: 1740877371 s 674651 us (seq=89999, len=30000)  (USR +795 us)
    USR: 1740877371 s 765715 us (seq=0, len=0)
ERROR: key 89999, expected 119999
ERROR: -12665 us expected between 0 and 100000
    SND: 1740877371 s 753050 us (seq=89999, len=1106)  (USR +-12665 us)
    SND: 1740877371 s 800783 us (seq=119999, len=30000)  (USR +35068 us)
    USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us

Actually, the kernel behaved correctly after I did the analysis. The
second skb carrying 1106 payload was generated due to tail loss probe,
leading to the wrong estimation of tskey in this C program.

This patch does:
- Neglect the old tskey skb received from ERRQUEUE.
- Add a new test to count how many valid skbs received to compare with
cfg_num_pkts.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
---
Willem, I'm not sure if it's worth reviewing this patch to handle this
testcase. After all it's only a selftest. Well, adding a new test might
be helpful. Please feel free to drop/ignore it if you don't like it.
---
 tools/testing/selftests/net/txtimestamp.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

Comments

Willem de Bruijn March 3, 2025, 2:07 p.m. UTC | #1
Jason Xing wrote:
> When I was playing with txtimestamp.c to see how kernel behaves,
> I saw the following error outputs if I adjusted the loopback mtu to
> 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000':
> 
> test SND
>     USR: 1740877371 s 488712 us (seq=0, len=0)
>     SND: 1740877371 s 489519 us (seq=29999, len=1106)  (USR +806 us)
>     USR: 1740877371 s 581251 us (seq=0, len=0)
>     SND: 1740877371 s 581970 us (seq=59999, len=8346)  (USR +719 us)
>     USR: 1740877371 s 673855 us (seq=0, len=0)
>     SND: 1740877371 s 674651 us (seq=89999, len=30000)  (USR +795 us)
>     USR: 1740877371 s 765715 us (seq=0, len=0)
> ERROR: key 89999, expected 119999
> ERROR: -12665 us expected between 0 and 100000
>     SND: 1740877371 s 753050 us (seq=89999, len=1106)  (USR +-12665 us)
>     SND: 1740877371 s 800783 us (seq=119999, len=30000)  (USR +35068 us)
>     USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us
> 
> Actually, the kernel behaved correctly after I did the analysis. The
> second skb carrying 1106 payload was generated due to tail loss probe,
> leading to the wrong estimation of tskey in this C program.
> 
> This patch does:
> - Neglect the old tskey skb received from ERRQUEUE.
> - Add a new test to count how many valid skbs received to compare with
> cfg_num_pkts.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>
Jakub Kicinski March 3, 2025, 3:41 p.m. UTC | #2
On Mon,  3 Mar 2025 16:04:04 +0800 Jason Xing wrote:
> When I was playing with txtimestamp.c to see how kernel behaves,
> I saw the following error outputs if I adjusted the loopback mtu to
> 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000':
> 
> test SND
>     USR: 1740877371 s 488712 us (seq=0, len=0)
>     SND: 1740877371 s 489519 us (seq=29999, len=1106)  (USR +806 us)
>     USR: 1740877371 s 581251 us (seq=0, len=0)
>     SND: 1740877371 s 581970 us (seq=59999, len=8346)  (USR +719 us)
>     USR: 1740877371 s 673855 us (seq=0, len=0)
>     SND: 1740877371 s 674651 us (seq=89999, len=30000)  (USR +795 us)
>     USR: 1740877371 s 765715 us (seq=0, len=0)
> ERROR: key 89999, expected 119999
> ERROR: -12665 us expected between 0 and 100000
>     SND: 1740877371 s 753050 us (seq=89999, len=1106)  (USR +-12665 us)
>     SND: 1740877371 s 800783 us (seq=119999, len=30000)  (USR +35068 us)
>     USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us
> 
> Actually, the kernel behaved correctly after I did the analysis. The
> second skb carrying 1106 payload was generated due to tail loss probe,
> leading to the wrong estimation of tskey in this C program.
> 
> This patch does:
> - Neglect the old tskey skb received from ERRQUEUE.
> - Add a new test to count how many valid skbs received to compare with
> cfg_num_pkts.

This appears to break some UDP test cases when running in the CI:

https://netdev-3.bots.linux.dev/vmksft-net/results/16721/41-txtimestamp-sh/stdout
Jason Xing March 3, 2025, 4:31 p.m. UTC | #3
On Mon, Mar 3, 2025 at 11:41 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon,  3 Mar 2025 16:04:04 +0800 Jason Xing wrote:
> > When I was playing with txtimestamp.c to see how kernel behaves,
> > I saw the following error outputs if I adjusted the loopback mtu to
> > 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000':
> >
> > test SND
> >     USR: 1740877371 s 488712 us (seq=0, len=0)
> >     SND: 1740877371 s 489519 us (seq=29999, len=1106)  (USR +806 us)
> >     USR: 1740877371 s 581251 us (seq=0, len=0)
> >     SND: 1740877371 s 581970 us (seq=59999, len=8346)  (USR +719 us)
> >     USR: 1740877371 s 673855 us (seq=0, len=0)
> >     SND: 1740877371 s 674651 us (seq=89999, len=30000)  (USR +795 us)
> >     USR: 1740877371 s 765715 us (seq=0, len=0)
> > ERROR: key 89999, expected 119999
> > ERROR: -12665 us expected between 0 and 100000
> >     SND: 1740877371 s 753050 us (seq=89999, len=1106)  (USR +-12665 us)
> >     SND: 1740877371 s 800783 us (seq=119999, len=30000)  (USR +35068 us)
> >     USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us
> >
> > Actually, the kernel behaved correctly after I did the analysis. The
> > second skb carrying 1106 payload was generated due to tail loss probe,
> > leading to the wrong estimation of tskey in this C program.
> >
> > This patch does:
> > - Neglect the old tskey skb received from ERRQUEUE.
> > - Add a new test to count how many valid skbs received to compare with
> > cfg_num_pkts.
>
> This appears to break some UDP test cases when running in the CI:
>
> https://netdev-3.bots.linux.dev/vmksft-net/results/16721/41-txtimestamp-sh/stdout

Thanks for catching this. I did break this testcase: run_test_v4v6
${args} -u -o 42.

Will take a deep look into it tomorrow morning.

Thanks,
Jason
Jason Xing March 3, 2025, 4:49 p.m. UTC | #4
On Tue, Mar 4, 2025 at 12:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Mon, Mar 3, 2025 at 11:41 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Mon,  3 Mar 2025 16:04:04 +0800 Jason Xing wrote:
> > > When I was playing with txtimestamp.c to see how kernel behaves,
> > > I saw the following error outputs if I adjusted the loopback mtu to
> > > 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000':
> > >
> > > test SND
> > >     USR: 1740877371 s 488712 us (seq=0, len=0)
> > >     SND: 1740877371 s 489519 us (seq=29999, len=1106)  (USR +806 us)
> > >     USR: 1740877371 s 581251 us (seq=0, len=0)
> > >     SND: 1740877371 s 581970 us (seq=59999, len=8346)  (USR +719 us)
> > >     USR: 1740877371 s 673855 us (seq=0, len=0)
> > >     SND: 1740877371 s 674651 us (seq=89999, len=30000)  (USR +795 us)
> > >     USR: 1740877371 s 765715 us (seq=0, len=0)
> > > ERROR: key 89999, expected 119999
> > > ERROR: -12665 us expected between 0 and 100000
> > >     SND: 1740877371 s 753050 us (seq=89999, len=1106)  (USR +-12665 us)
> > >     SND: 1740877371 s 800783 us (seq=119999, len=30000)  (USR +35068 us)
> > >     USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us
> > >
> > > Actually, the kernel behaved correctly after I did the analysis. The
> > > second skb carrying 1106 payload was generated due to tail loss probe,
> > > leading to the wrong estimation of tskey in this C program.
> > >
> > > This patch does:
> > > - Neglect the old tskey skb received from ERRQUEUE.
> > > - Add a new test to count how many valid skbs received to compare with
> > > cfg_num_pkts.
> >
> > This appears to break some UDP test cases when running in the CI:
> >
> > https://netdev-3.bots.linux.dev/vmksft-net/results/16721/41-txtimestamp-sh/stdout
>
> Thanks for catching this. I did break this testcase: run_test_v4v6
> ${args} -u -o 42.
>

To handle this particular case, I broke more than that, so I'll drop
this patch. Sorry for the inconvenience :( I must be out of mind.
diff mbox series

Patch

diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c
index dae91eb97d69..c63af798a521 100644
--- a/tools/testing/selftests/net/txtimestamp.c
+++ b/tools/testing/selftests/net/txtimestamp.c
@@ -86,6 +86,7 @@  static struct timespec ts_usr;
 
 static int saved_tskey = -1;
 static int saved_tskey_type = -1;
+static int saved_num_pkts;
 
 struct timing_event {
 	int64_t min;
@@ -131,17 +132,20 @@  static void add_timing_event(struct timing_event *te,
 	te->total += ts_delta;
 }
 
-static void validate_key(int tskey, int tstype)
+static bool validate_key(int tskey, int tstype)
 {
 	int stepsize;
 
+	if (tskey <= saved_tskey)
+		return false;
+
 	/* compare key for each subsequent request
 	 * must only test for one type, the first one requested
 	 */
 	if (saved_tskey == -1 || cfg_use_cmsg_opt_id)
 		saved_tskey_type = tstype;
 	else if (saved_tskey_type != tstype)
-		return;
+		return true;
 
 	stepsize = cfg_proto == SOCK_STREAM ? cfg_payload_len : 1;
 	stepsize = cfg_use_cmsg_opt_id ? 0 : stepsize;
@@ -152,6 +156,7 @@  static void validate_key(int tskey, int tstype)
 	}
 
 	saved_tskey = tskey;
+	return true;
 }
 
 static void validate_timestamp(struct timespec *cur, int min_delay)
@@ -219,7 +224,8 @@  static void print_timestamp(struct scm_timestamping *tss, int tstype,
 {
 	const char *tsname;
 
-	validate_key(tskey, tstype);
+	if (!validate_key(tskey, tstype))
+		return;
 
 	switch (tstype) {
 	case SCM_TSTAMP_SCHED:
@@ -242,6 +248,7 @@  static void print_timestamp(struct scm_timestamping *tss, int tstype,
 		tstype);
 	}
 	__print_timestamp(tsname, &tss->ts[0], tskey, payload_len);
+	saved_num_pkts++;
 }
 
 static void print_timing_event(char *name, struct timing_event *te)
@@ -582,6 +589,7 @@  static void do_test(int family, unsigned int report_opt)
 		       (char *) &sock_opt, sizeof(sock_opt)))
 		error(1, 0, "setsockopt timestamping");
 
+	saved_num_pkts = 0;
 	for (i = 0; i < cfg_num_pkts; i++) {
 		memset(&msg, 0, sizeof(msg));
 		memset(buf, 'a' + i, total_len);
@@ -673,6 +681,12 @@  static void do_test(int family, unsigned int report_opt)
 		while (!recv_errmsg(fd)) {}
 	}
 
+	if (cfg_num_pkts != saved_num_pkts) {
+		fprintf(stderr, "ERROR: num_pkts received %d, expected %d\n",
+				saved_num_pkts, cfg_num_pkts);
+		test_failed = true;
+	}
+
 	print_timing_event("USR-ENQ", &usr_enq);
 	print_timing_event("USR-SND", &usr_snd);
 	print_timing_event("USR-ACK", &usr_ack);