Message ID | 20240131053212.2247527-1-chantr4@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 2ef61296d2844c6a4211e07ab70ef2fb412b2c30 |
Delegated to: | BPF |
Series | [bpf-next] selftests/bpf: disable IPv6 for lwt_redirect test |
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Tue, 30 Jan 2024 21:32:12 -0800 you wrote:
> After a recent change in the vmtest runner, this test started failing
> sporadically.
>
> Investigation showed that this test was subject to race condition which
> got exacerbated after the vm runner change. The symptoms being that the
> logic that waited for an ICMPv4 packet is naive and will break if 5 or
> more non-ICMPv4 packets make it to tap0.
> When ICMPv6 is enabled, the kernel will generate traffic such as ICMPv6
> router solicitation...
> On a system with good performance, the expected ICMPv4 packet would very
> likely make it to the network interface promptly, but on a system with
> poor performance, those "guarantees" do not hold true anymore.
>
> [...]

Here is the summary with links:
  - [bpf-next] selftests/bpf: disable IPv6 for lwt_redirect test
    https://git.kernel.org/bpf/bpf-next/c/2ef61296d284

You are awesome, thank you!
On 31/01/2024 05:32, Manu Bretelle wrote:
> After a recent change in the vmtest runner, this test started failing
> sporadically.
>
> Investigation showed that this test was subject to race condition which
> got exacerbated after the vm runner change. The symptoms being that the
> logic that waited for an ICMPv4 packet is naive and will break if 5 or
> more non-ICMPv4 packets make it to tap0.
> When ICMPv6 is enabled, the kernel will generate traffic such as ICMPv6
> router solicitation...
> On a system with good performance, the expected ICMPv4 packet would very
> likely make it to the network interface promptly, but on a system with
> poor performance, those "guarantees" do not hold true anymore.
>
> Given that the test is IPv4 only, this change disable IPv6 in the test
> netns by setting `net.ipv6.conf.all.disable_ipv6` to 1.
> This essentially leaves "ping" as the sole generator of traffic in the
> network namespace.
> If this test was to be made IPv6 compatible, the logic in
> `wait_for_packet` would need to be modified.
>

Great to fix test flakiness like this; I was curious if you tried
modifying things from the bpf side; would something like this in
progs/test_lwt_redirect.c help?
(haven't been able to test because I can't reproduce the failure):

static int get_redirect_target(struct __sk_buff *skb)
{
	struct iphdr *iph = NULL;
	void *start = (void *)(long)skb->data;
	void *end = (void *)(long)skb->data_end;

+	if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6))
+		return -1;

I _think_ that would skip redirection and might solve the problem
from the bpf side. Might be worth testing, but not a big deal..

> In more details...
>
> [...]
>
> Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
> Signed-off-by: Manu Bretelle <chantr4@gmail.com>

Reviewed-by: Alan Maguire <alan.maguire@oracle.com>

> ---
>  tools/testing/selftests/bpf/prog_tests/lwt_redirect.c | 1 +
>  1 file changed, 1 insertion(+)
>
> [...]
On Wed, Jan 31, 2024 at 05:49:42PM +0000, Alan Maguire wrote:
> On 31/01/2024 05:32, Manu Bretelle wrote:
> > [...]
> > If this test was to be made IPv6 compatible, the logic in
> > `wait_for_packet` would need to be modified.
> >
>
> Great to fix test flakiness like this; I was curious if you tried
> modifying things from the bpf side; would something like this in
> progs/test_lwt_redirect.c help?
> (haven't been able to test because I can't reproduce the failure):
>
> static int get_redirect_target(struct __sk_buff *skb)
> {
> 	struct iphdr *iph = NULL;
> 	void *start = (void *)(long)skb->data;
> 	void *end = (void *)(long)skb->data_end;
>
> +	if (skb->protocol == __bpf_constant_htons(ETH_P_IPV6))
> +		return -1;
>
> I _think_ that would skip redirection and might solve the problem
> from the bpf side. Might be worth testing, but not a big deal..
>

No, I did not try this, and it should indeed work. The repro env is quite
a heavy lift and I do not have easy access to a full toolchain to
recompile programs and test this cross-platform within it.

The sysctl call was a straightforward code change with a high chance of
success and minimal back and forth with BPF CI to get s390x artifacts
(which is where the flakiness manifested), and given that the test runs
within its own ephemeral network namespace, I felt no guilt in using that
sledgehammer :)

> > In more details...
> >
> > [...]
> >
> > Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
> > Signed-off-by: Manu Bretelle <chantr4@gmail.com>
>
> Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
>
> > ---
> > [...]
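For readers unfamiliar with what `sysctl -w net.ipv6.conf.all.disable_ipv6=1` does under the hood: a dotted sysctl key corresponds to a file under `/proc/sys`, and writing to that file from inside the test's network namespace changes the setting for that namespace. The sketch below is not what the patch does (the patch shells out to `sysctl` through the `SYS()` macro). The helper names `sysctl_key_to_path` and `write_sysctl` are hypothetical, and the dot-to-slash conversion assumes no path component (such as an interface name) contains a dot itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: convert a dotted sysctl key such as
 * "net.ipv6.conf.all.disable_ipv6" into its /proc/sys path.
 * Returns 0 on success, -1 if `out` is too small.
 * Assumption: no component of the key contains a literal dot.
 */
int sysctl_key_to_path(const char *key, char *out, size_t out_len)
{
	int n;
	char *p;

	assert(key != NULL && out != NULL);
	n = snprintf(out, out_len, "/proc/sys/%s", key);
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	/* Only rewrite the key part, not the "/proc/sys/" prefix. */
	for (p = out + strlen("/proc/sys/"); *p; p++)
		if (*p == '.')
			*p = '/';
	return 0;
}

/*
 * Hypothetical helper: write `value` to the sysctl named by `key`.
 * Returns 0 on success, -1 on failure (for example, insufficient
 * privileges or a key that does not exist in this namespace).
 */
int write_sysctl(const char *key, const char *value)
{
	char path[256];
	FILE *f;
	int ok;

	if (sysctl_key_to_path(key, path, sizeof(path)) < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A call such as `write_sysctl("net.ipv6.conf.all.disable_ipv6", "1")` would need to run with sufficient privileges and after the process has entered the test netns, which is the same requirement the `sysctl -w` invocation has.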
Hi Manu,

On Tue, Jan 30, 2024 at 11:32 PM Manu Bretelle <chantr4@gmail.com> wrote:
>
> After a recent change in the vmtest runner, this test started failing
> sporadically.
>
> [...]
>
> Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
> Signed-off-by: Manu Bretelle <chantr4@gmail.com>
> ---
>  tools/testing/selftests/bpf/prog_tests/lwt_redirect.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c b/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
> index beeb3ac1c361..b5b9e74b1044 100644
> --- a/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
> +++ b/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
> @@ -203,6 +203,7 @@ static int setup_redirect_target(const char *target_dev, bool need_mac)
> 	if (!ASSERT_GE(target_index, 0, "if_nametoindex"))
> 		goto fail;
>
> +	SYS(fail, "sysctl -w net.ipv6.conf.all.disable_ipv6=1");

Thanks for digging into this! I was totally unprepared for that many
router solicitations when I wrote the wait logic. For now, disabling v6
is totally fine to unblock similar scenarios. But on second thought, it is
probably still worthwhile to incorporate v6 later, since lwt hooks mess
with both v4/v6 routing. So I'll try to fix up the wait logic later
this week. An exact packet filter is probably best suited to make
icmpv6/arp/nd happy.

best
Yan

> 	SYS(fail, "ip link add link_err type dummy");
> 	SYS(fail, "ip link set lo up");
> 	SYS(fail, "ip addr add dev lo " LOCAL_SRC "/32");
> --
> 2.39.3
>
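To make the "exact packet filter" idea above concrete, here is a rough sketch of what such a check could look like. It is not the filter the selftest uses or the one Yan later wrote. The function name `is_icmpv4_echo_request`, the constant names, and the `want_ip_len` parameter are all hypothetical, and the sketch assumes the buffer read from the tap fd starts directly with a 14-byte Ethernet header. It accepts a frame only if it is IPv4, carries ICMP, is an echo request, and has the expected IP total length, so ICMPv6, ARP, and neighbor discovery traffic are all rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant names; values are the standard protocol numbers. */
#define SKETCH_ETH_HLEN        14      /* dst MAC + src MAC + EtherType */
#define SKETCH_ETHERTYPE_IPV4  0x0800
#define SKETCH_IPPROTO_ICMP    1
#define SKETCH_ICMP_ECHO       8       /* ICMPv4 echo request type */

/*
 * Returns 1 only if `frame` is an Ethernet frame carrying an IPv4 ICMP
 * echo request whose IP total length equals `want_ip_len`; 0 otherwise.
 * Assumption: the buffer starts with the Ethernet header.
 */
int is_icmpv4_echo_request(const uint8_t *frame, size_t len,
			   uint16_t want_ip_len)
{
	const uint8_t *ip;
	size_t ihl;
	uint16_t ethertype, tot_len;

	assert(frame != NULL);
	/* Need at least an Ethernet header plus a minimal IPv4 header. */
	if (len < SKETCH_ETH_HLEN + 20)
		return 0;
	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ethertype != SKETCH_ETHERTYPE_IPV4)
		return 0;

	ip = frame + SKETCH_ETH_HLEN;
	if ((ip[0] >> 4) != 4)
		return 0;
	/* IHL is in 32-bit words; the ICMP type byte follows the header. */
	ihl = (size_t)(ip[0] & 0x0f) * 4;
	if (ihl < 20 || len < SKETCH_ETH_HLEN + ihl + 1)
		return 0;
	tot_len = (uint16_t)((ip[2] << 8) | ip[3]);
	if (tot_len != want_ip_len)
		return 0;
	if (ip[9] != SKETCH_IPPROTO_ICMP)
		return 0;
	return ip[ihl] == SKETCH_ICMP_ECHO;
}
```

A filter this strict only helps if non-matching packets stop consuming the retry budget of the wait loop; otherwise five stray frames still exhaust it before the echo request is seen.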
On Wed, Jan 31, 2024 at 10:44:07PM -0600, Yan Zhai wrote:
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c b/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
> > index beeb3ac1c361..b5b9e74b1044 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
> > @@ -203,6 +203,7 @@ static int setup_redirect_target(const char *target_dev, bool need_mac)
> > 	if (!ASSERT_GE(target_index, 0, "if_nametoindex"))
> > 		goto fail;
> >
> > +	SYS(fail, "sysctl -w net.ipv6.conf.all.disable_ipv6=1");
>
> Thanks for digging into this! I was totally unprepared for that many
> router solicitations when I wrote the wait logic. For now disable v6
> is totally good to unblock similar scenarios. But think it twice it is
> probably still worthwhile to incorporate v6 later since lwt hooks mess
> with both v4/v6 routing. So I'll try to fix up the wait logic later
> this week. An exact packet filter is probably best suited to make
> icmpv6/arp/nd happy.
>

Greatly appreciated, Yan. I definitely agree that it is desirable to have
v6 support too.

In case that helps... [0] should help you repro locally on your favorite
architecture.

[0] https://chantra.github.io/bpfcitools/bpf-local-development.html

Manu

> best
> Yan
>
> > 	SYS(fail, "ip link add link_err type dummy");
> > 	SYS(fail, "ip link set lo up");
> > 	SYS(fail, "ip addr add dev lo " LOCAL_SRC "/32");
> > --
> > 2.39.3
> >
diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c b/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
index beeb3ac1c361..b5b9e74b1044 100644
--- a/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c
@@ -203,6 +203,7 @@ static int setup_redirect_target(const char *target_dev, bool need_mac)
 	if (!ASSERT_GE(target_index, 0, "if_nametoindex"))
 		goto fail;
 
+	SYS(fail, "sysctl -w net.ipv6.conf.all.disable_ipv6=1");
 	SYS(fail, "ip link add link_err type dummy");
 	SYS(fail, "ip link set lo up");
 	SYS(fail, "ip addr add dev lo " LOCAL_SRC "/32");
After a recent change in the vmtest runner, this test started failing
sporadically.

Investigation showed that this test was subject to a race condition which
got exacerbated after the vm runner change. The symptom is that the
logic that waits for an ICMPv4 packet is naive and will break if 5 or
more non-ICMPv4 packets make it to tap0.
When ICMPv6 is enabled, the kernel will generate traffic such as ICMPv6
router solicitation...
On a system with good performance, the expected ICMPv4 packet would very
likely make it to the network interface promptly, but on a system with
poor performance, those "guarantees" do not hold true anymore.

Given that the test is IPv4 only, this change disables IPv6 in the test
netns by setting `net.ipv6.conf.all.disable_ipv6` to 1.
This essentially leaves "ping" as the sole generator of traffic in the
network namespace.
If this test were to be made IPv6 compatible, the logic in
`wait_for_packet` would need to be modified.

In more detail...

At a high level, the test does:
- create a new namespace
- in `setup_redirect_target` set up lo, tap0, and link_err interfaces as
  well as add 2 routes that attach ingress/egress sections of
  `test_lwt_redirect.bpf.o` to the xmit path.
- in `send_and_capture_test_packets` send an ICMP packet and read off
  the tap interface (using `wait_for_packet`) to check that an ICMP packet
  with the right size is read.

`wait_for_packet` will try to read `max_retry` (5) times from the tap0
fd looking for an ICMPv4 packet matching some criteria.

The problem is that when we set up the `tap0` interface, because IPv6 is
enabled by default, traffic such as router solicitation is sent through
tap0, as in:

  # tcpdump -r /tmp/lwt_redirect.pc
  reading from file /tmp/lwt_redirect.pcap, link-type EN10MB (Ethernet)
  04:46:23.578352 IP6 :: > ff02::1:ffc0:4427: ICMP6, neighbor solicitation, who has fe80::fcba:dff:fec0:4427, length 32
  04:46:23.659522 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
  04:46:24.389169 IP 10.0.0.1 > 20.0.0.9: ICMP echo request, id 122, seq 1, length 108
  04:46:24.618599 IP6 fe80::fcba:dff:fec0:4427 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
  04:46:24.619985 IP6 fe80::fcba:dff:fec0:4427 > ff02::2: ICMP6, router solicitation, length 16
  04:46:24.767326 IP6 fe80::fcba:dff:fec0:4427 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
  04:46:28.936402 IP6 fe80::fcba:dff:fec0:4427 > ff02::2: ICMP6, router solicitation, length 16

If `wait_for_packet` sees 5 non-ICMPv4 packets, it will return 0, which is what we see in:

  2024-01-31T03:51:25.0336992Z test_lwt_redirect_run:PASS:netns_create 0 nsec
  2024-01-31T03:51:25.0341309Z open_netns:PASS:malloc token 0 nsec
  2024-01-31T03:51:25.0344844Z open_netns:PASS:open /proc/self/ns/net 0 nsec
  2024-01-31T03:51:25.0350071Z open_netns:PASS:open netns fd 0 nsec
  2024-01-31T03:51:25.0353516Z open_netns:PASS:setns 0 nsec
  2024-01-31T03:51:25.0356560Z test_lwt_redirect_run:PASS:setns 0 nsec
  2024-01-31T03:51:25.0360140Z open_tuntap:PASS:open(/dev/net/tun) 0 nsec
  2024-01-31T03:51:25.0363822Z open_tuntap:PASS:ioctl(TUNSETIFF) 0 nsec
  2024-01-31T03:51:25.0367402Z open_tuntap:PASS:fcntl(O_NONBLOCK) 0 nsec
  2024-01-31T03:51:25.0371167Z setup_redirect_target:PASS:open_tuntap 0 nsec
  2024-01-31T03:51:25.0375180Z setup_redirect_target:PASS:if_nametoindex 0 nsec
  2024-01-31T03:51:25.0379929Z setup_redirect_target:PASS:ip link add link_err type dummy 0 nsec
  2024-01-31T03:51:25.0384874Z setup_redirect_target:PASS:ip link set lo up 0 nsec
  2024-01-31T03:51:25.0389678Z setup_redirect_target:PASS:ip addr add dev lo 10.0.0.1/32 0 nsec
  2024-01-31T03:51:25.0394814Z setup_redirect_target:PASS:ip link set link_err up 0 nsec
  2024-01-31T03:51:25.0399874Z setup_redirect_target:PASS:ip link set tap0 up 0 nsec
  2024-01-31T03:51:25.0407731Z setup_redirect_target:PASS:ip route add 10.0.0.0/24 dev link_err encap bpf xmit obj test_lwt_redirect.bpf.o sec redir_ingress 0 nsec
  2024-01-31T03:51:25.0419105Z setup_redirect_target:PASS:ip route add 20.0.0.0/24 dev link_err encap bpf xmit obj test_lwt_redirect.bpf.o sec redir_egress 0 nsec
  2024-01-31T03:51:25.0427209Z test_lwt_redirect_normal:PASS:setup_redirect_target 0 nsec
  2024-01-31T03:51:25.0431424Z ping_dev:PASS:if_nametoindex 0 nsec
  2024-01-31T03:51:25.0437222Z send_and_capture_test_packets:FAIL:wait_for_epacket unexpected wait_for_epacket: actual 0 != expected 1
  2024-01-31T03:51:25.0448298Z (/tmp/work/bpf/bpf/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c:175: errno: Success) test_lwt_redirect_normal egress test fails
  2024-01-31T03:51:25.0457124Z close_netns:PASS:setns 0 nsec

When running in a VM with potential resource constraints, the odds that
`ping` is not scheduled very soon after bringing `tap0` up increase,
and with them the chances of our ICMP packet being pushed to position 6+
in the network trace.

To confirm this indeed solves the issue, I ran the test 100 times in a
row with:

  errors=0
  successes=0
  for i in `seq 1 100`
  do
    ./test_progs -t lwt_redirect/lwt_redirect_normal
    if [ $? -eq 0 ]; then
      successes=$((successes+1))
    else
      errors=$((errors+1))
    fi
  done
  echo "successes: $successes/errors: $errors"

While this test would fail at least a couple of times every 10 runs, here
it ran 100 times with no error.

Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
---
 tools/testing/selftests/bpf/prog_tests/lwt_redirect.c | 1 +
 1 file changed, 1 insertion(+)
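
The failure mode described in the commit message can be illustrated with a small model. This is a simplified sketch, not the selftest's `wait_for_packet`: the names `wait_for_match`, `enum pkt_kind`, and the array-of-packets input are hypothetical stand-ins for reads from the tap0 fd. The one property it shares with the description is that every packet read consumes one of the `max_retry` attempts, whether or not it is the ICMPv4 packet being waited for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a packet read from the tap device. */
enum pkt_kind { PKT_OTHER = 0, PKT_ICMPV4 = 1 };

/*
 * Model of a bounded wait loop: each packet read uses up one retry,
 * matching or not. Returns 1 if an ICMPv4 packet is seen within
 * `max_retry` reads, 0 if the budget (or the trace) runs out first.
 */
int wait_for_match(const enum pkt_kind *pkts, size_t n, int max_retry)
{
	size_t i = 0;

	assert(pkts != NULL || n == 0);
	while (max_retry-- > 0 && i < n) {
		if (pkts[i++] == PKT_ICMPV4)
			return 1;
	}
	return 0;
}
```

With a trace ordered like the tcpdump output above (two ICMPv6 packets, then the echo request), the model finds the packet on the third read. If five unrelated packets arrive first, the echo request sits at position 6 and a budget of 5 is exhausted before it is read, which corresponds to the `actual 0 != expected 1` failure in the log.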