Message ID | 20250303080404.70042-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] selftests: txtimestamp: ignore the old skb from ERRQUEUE | expand |
Jason Xing wrote: > When I was playing with txtimestamp.c to see how kernel behaves, > I saw the following error outputs if I adjusted the loopback mtu to > 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000': > > test SND > USR: 1740877371 s 488712 us (seq=0, len=0) > SND: 1740877371 s 489519 us (seq=29999, len=1106) (USR +806 us) > USR: 1740877371 s 581251 us (seq=0, len=0) > SND: 1740877371 s 581970 us (seq=59999, len=8346) (USR +719 us) > USR: 1740877371 s 673855 us (seq=0, len=0) > SND: 1740877371 s 674651 us (seq=89999, len=30000) (USR +795 us) > USR: 1740877371 s 765715 us (seq=0, len=0) > ERROR: key 89999, expected 119999 > ERROR: -12665 us expected between 0 and 100000 > SND: 1740877371 s 753050 us (seq=89999, len=1106) (USR +-12665 us) > SND: 1740877371 s 800783 us (seq=119999, len=30000) (USR +35068 us) > USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us > > Actually, the kernel behaved correctly after I did the analysis. The > second skb carrying 1106 payload was generated due to tail loss probe, > leading to the wrong estimation of tskey in this C program. > > This patch does: > - Neglect the old tskey skb received from ERRQUEUE. > - Add a new test to count how many valid skbs received to compare with > cfg_num_pkts. > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com>
On Mon, 3 Mar 2025 16:04:04 +0800 Jason Xing wrote: > When I was playing with txtimestamp.c to see how kernel behaves, > I saw the following error outputs if I adjusted the loopback mtu to > 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000': > > test SND > USR: 1740877371 s 488712 us (seq=0, len=0) > SND: 1740877371 s 489519 us (seq=29999, len=1106) (USR +806 us) > USR: 1740877371 s 581251 us (seq=0, len=0) > SND: 1740877371 s 581970 us (seq=59999, len=8346) (USR +719 us) > USR: 1740877371 s 673855 us (seq=0, len=0) > SND: 1740877371 s 674651 us (seq=89999, len=30000) (USR +795 us) > USR: 1740877371 s 765715 us (seq=0, len=0) > ERROR: key 89999, expected 119999 > ERROR: -12665 us expected between 0 and 100000 > SND: 1740877371 s 753050 us (seq=89999, len=1106) (USR +-12665 us) > SND: 1740877371 s 800783 us (seq=119999, len=30000) (USR +35068 us) > USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us > > Actually, the kernel behaved correctly after I did the analysis. The > second skb carrying 1106 payload was generated due to tail loss probe, > leading to the wrong estimation of tskey in this C program. > > This patch does: > - Neglect the old tskey skb received from ERRQUEUE. > - Add a new test to count how many valid skbs received to compare with > cfg_num_pkts. This appears to break some UDP test cases when running in the CI: https://netdev-3.bots.linux.dev/vmksft-net/results/16721/41-txtimestamp-sh/stdout
On Mon, Mar 3, 2025 at 11:41 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Mon, 3 Mar 2025 16:04:04 +0800 Jason Xing wrote: > > When I was playing with txtimestamp.c to see how kernel behaves, > > I saw the following error outputs if I adjusted the loopback mtu to > > 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000': > > > > test SND > > USR: 1740877371 s 488712 us (seq=0, len=0) > > SND: 1740877371 s 489519 us (seq=29999, len=1106) (USR +806 us) > > USR: 1740877371 s 581251 us (seq=0, len=0) > > SND: 1740877371 s 581970 us (seq=59999, len=8346) (USR +719 us) > > USR: 1740877371 s 673855 us (seq=0, len=0) > > SND: 1740877371 s 674651 us (seq=89999, len=30000) (USR +795 us) > > USR: 1740877371 s 765715 us (seq=0, len=0) > > ERROR: key 89999, expected 119999 > > ERROR: -12665 us expected between 0 and 100000 > > SND: 1740877371 s 753050 us (seq=89999, len=1106) (USR +-12665 us) > > SND: 1740877371 s 800783 us (seq=119999, len=30000) (USR +35068 us) > > USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us > > > > Actually, the kernel behaved correctly after I did the analysis. The > > second skb carrying 1106 payload was generated due to tail loss probe, > > leading to the wrong estimation of tskey in this C program. > > > > This patch does: > > - Neglect the old tskey skb received from ERRQUEUE. > > - Add a new test to count how many valid skbs received to compare with > > cfg_num_pkts. > > This appears to break some UDP test cases when running in the CI: > > https://netdev-3.bots.linux.dev/vmksft-net/results/16721/41-txtimestamp-sh/stdout Thanks for catching this. I did break this testcase: run_test_v4v6 ${args} -u -o 42. Will take a deep look into it tomorrow morning. Thanks, Jason
On Tue, Mar 4, 2025 at 12:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Mon, Mar 3, 2025 at 11:41 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > On Mon, 3 Mar 2025 16:04:04 +0800 Jason Xing wrote: > > > When I was playing with txtimestamp.c to see how kernel behaves, > > > I saw the following error outputs if I adjusted the loopback mtu to > > > 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000': > > > > > > test SND > > > USR: 1740877371 s 488712 us (seq=0, len=0) > > > SND: 1740877371 s 489519 us (seq=29999, len=1106) (USR +806 us) > > > USR: 1740877371 s 581251 us (seq=0, len=0) > > > SND: 1740877371 s 581970 us (seq=59999, len=8346) (USR +719 us) > > > USR: 1740877371 s 673855 us (seq=0, len=0) > > > SND: 1740877371 s 674651 us (seq=89999, len=30000) (USR +795 us) > > > USR: 1740877371 s 765715 us (seq=0, len=0) > > > ERROR: key 89999, expected 119999 > > > ERROR: -12665 us expected between 0 and 100000 > > > SND: 1740877371 s 753050 us (seq=89999, len=1106) (USR +-12665 us) > > > SND: 1740877371 s 800783 us (seq=119999, len=30000) (USR +35068 us) > > > USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us > > > > > > Actually, the kernel behaved correctly after I did the analysis. The > > > second skb carrying 1106 payload was generated due to tail loss probe, > > > leading to the wrong estimation of tskey in this C program. > > > > > > This patch does: > > > - Neglect the old tskey skb received from ERRQUEUE. > > > - Add a new test to count how many valid skbs received to compare with > > > cfg_num_pkts. > > > > This appears to break some UDP test cases when running in the CI: > > > > https://netdev-3.bots.linux.dev/vmksft-net/results/16721/41-txtimestamp-sh/stdout > > Thanks for catching this. I did break this testcase: run_test_v4v6 > ${args} -u -o 42. > To handle this particular case, I broke more than that, so I'll drop this patch. Sorry for the inconvenience :( I must be out of mind.
diff --git a/tools/testing/selftests/net/txtimestamp.c b/tools/testing/selftests/net/txtimestamp.c index dae91eb97d69..c63af798a521 100644 --- a/tools/testing/selftests/net/txtimestamp.c +++ b/tools/testing/selftests/net/txtimestamp.c @@ -86,6 +86,7 @@ static struct timespec ts_usr; static int saved_tskey = -1; static int saved_tskey_type = -1; +static int saved_num_pkts; struct timing_event { int64_t min; @@ -131,17 +132,20 @@ static void add_timing_event(struct timing_event *te, te->total += ts_delta; } -static void validate_key(int tskey, int tstype) +static bool validate_key(int tskey, int tstype) { int stepsize; + if (tskey <= saved_tskey) + return false; + /* compare key for each subsequent request * must only test for one type, the first one requested */ if (saved_tskey == -1 || cfg_use_cmsg_opt_id) saved_tskey_type = tstype; else if (saved_tskey_type != tstype) - return; + return true; stepsize = cfg_proto == SOCK_STREAM ? cfg_payload_len : 1; stepsize = cfg_use_cmsg_opt_id ? 0 : stepsize; @@ -152,6 +156,7 @@ static void validate_key(int tskey, int tstype) } saved_tskey = tskey; + return true; } static void validate_timestamp(struct timespec *cur, int min_delay) @@ -219,7 +224,8 @@ static void print_timestamp(struct scm_timestamping *tss, int tstype, { const char *tsname; - validate_key(tskey, tstype); + if (!validate_key(tskey, tstype)) + return; switch (tstype) { case SCM_TSTAMP_SCHED: @@ -242,6 +248,7 @@ static void print_timestamp(struct scm_timestamping *tss, int tstype, tstype); } __print_timestamp(tsname, &tss->ts[0], tskey, payload_len); + saved_num_pkts++; } static void print_timing_event(char *name, struct timing_event *te) @@ -582,6 +589,7 @@ static void do_test(int family, unsigned int report_opt) (char *) &sock_opt, sizeof(sock_opt))) error(1, 0, "setsockopt timestamping"); + saved_num_pkts = 0; for (i = 0; i < cfg_num_pkts; i++) { memset(&msg, 0, sizeof(msg)); memset(buf, 'a' + i, total_len); @@ -673,6 +681,12 @@ static void do_test(int family, unsigned int report_opt) while (!recv_errmsg(fd)) {} } + if (cfg_num_pkts != saved_num_pkts) { + fprintf(stderr, "ERROR: num_pkts received %d, expected %d\n", + saved_num_pkts, cfg_num_pkts); + test_failed = true; + } + print_timing_event("USR-ENQ", &usr_enq); print_timing_event("USR-SND", &usr_snd); print_timing_event("USR-ACK", &usr_ack);
When I was playing with txtimestamp.c to see how kernel behaves, I saw the following error outputs if I adjusted the loopback mtu to 1500 and then ran './txtimestamp -4 -L 127.0.0.1 -l 30000 -t 100000': test SND USR: 1740877371 s 488712 us (seq=0, len=0) SND: 1740877371 s 489519 us (seq=29999, len=1106) (USR +806 us) USR: 1740877371 s 581251 us (seq=0, len=0) SND: 1740877371 s 581970 us (seq=59999, len=8346) (USR +719 us) USR: 1740877371 s 673855 us (seq=0, len=0) SND: 1740877371 s 674651 us (seq=89999, len=30000) (USR +795 us) USR: 1740877371 s 765715 us (seq=0, len=0) ERROR: key 89999, expected 119999 ERROR: -12665 us expected between 0 and 100000 SND: 1740877371 s 753050 us (seq=89999, len=1106) (USR +-12665 us) SND: 1740877371 s 800783 us (seq=119999, len=30000) (USR +35068 us) USR-SND: count=5, avg=4945 us, min=-12665 us, max=35068 us Actually, the kernel behaved correctly after I did the analysis. The second skb carrying 1106 payload was generated due to tail loss probe, leading to the wrong estimation of tskey in this C program. This patch does: - Neglect the old tskey skb received from ERRQUEUE. - Add a new test to count how many valid skbs received to compare with cfg_num_pkts. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> --- Willem, I'm not sure if it's worth reviewing this patch to handle this testcase. After all it's only a selftest. Well, adding a new test might be helpful. Please feel free to drop/ignore it if you don't like it. --- tools/testing/selftests/net/txtimestamp.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)