Message ID | 4c3b0d3f32d3b18949d75b18e5e1d9f13a24f025.1710877680.git.yan@cloudflare.com (mailing list archive) |
---|---|
State | Accepted |
Commit | d6dbbb11247c71203785a2c9da474c36f4b19eae |
Series | Report RCU QS for busy network kthreads |
On Tue, Mar 19, 2024 at 01:44:37PM -0700, Yan Zhai wrote:
> NAPI threads can keep polling packets under load. Currently it is only
> calling cond_resched() before repolling, but it is not sufficient to
> clear out the holdout of RCU tasks, which prevent BPF tracing programs
> from detaching for long period. This can be reproduced easily with
> following set up:
>
>   ip netns add test1
>   ip netns add test2
>
>   ip -n test1 link add veth1 type veth peer name veth2 netns test2
>
>   ip -n test1 link set veth1 up
>   ip -n test1 link set lo up
>   ip -n test2 link set veth2 up
>   ip -n test2 link set lo up
>
>   ip -n test1 addr add 192.168.1.2/31 dev veth1
>   ip -n test1 addr add 1.1.1.1/32 dev lo
>   ip -n test2 addr add 192.168.1.3/31 dev veth2
>   ip -n test2 addr add 2.2.2.2/31 dev lo
>
>   ip -n test1 route add default via 192.168.1.3
>   ip -n test2 route add default via 192.168.1.2
>
>   for i in `seq 10 210`; do
>     for j in `seq 10 210`; do
>       ip netns exec test2 iptables -I INPUT -s 3.3.$i.$j -p udp --dport 5201
>     done
>   done
>
>   ip netns exec test2 ethtool -K veth2 gro on
>   ip netns exec test2 bash -c 'echo 1 > /sys/class/net/veth2/threaded'
>   ip netns exec test1 ethtool -K veth1 tso off
>
> Then run an iperf3 client/server and a bpftrace script can trigger it:
>
>   ip netns exec test2 iperf3 -s -B 2.2.2.2 >/dev/null&
>   ip netns exec test1 iperf3 -c 2.2.2.2 -B 1.1.1.1 -u -l 1500 -b 3g -t 100 >/dev/null&
>   bpftrace -e 'kfunc:__napi_poll{@=count();} interval:s:1{exit();}'
>
> Report RCU quiescent states periodically will resolve the issue.
>
> Fixes: 29863d41bb6e ("net: implement threaded-able napi poll loop support")
> Reviewed-by: Jesper Dangaard Brouer <hawk@kernel.org>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> ---
> v2->v3: abstracted the work into a RCU helper
> v1->v2: moved rcu_softirq_qs out from bh critical section, and only
> raise it after a second of repolling. Added some brief perf test result.
>
> v2: https://lore.kernel.org/bpf/ZeFPz4D121TgvCje@debian.debian/
> v1: https://lore.kernel.org/lkml/Zd4DXTyCf17lcTfq@debian.debian/#t
> ---
>  net/core/dev.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 303a6ff46e4e..9a67003e49db 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6743,6 +6743,8 @@ static int napi_threaded_poll(void *data)
>  	void *have;
>
>  	while (!napi_thread_wait(napi)) {
> +		unsigned long last_qs = jiffies;
> +
>  		for (;;) {
>  			bool repoll = false;
>
> @@ -6767,6 +6769,7 @@ static int napi_threaded_poll(void *data)
>  			if (!repoll)
>  				break;
>
> +			rcu_softirq_qs_periodic(last_qs);
>  			cond_resched();
>  		}
>  	}
> --
> 2.30.2
>
diff --git a/net/core/dev.c b/net/core/dev.c
index 303a6ff46e4e..9a67003e49db 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6743,6 +6743,8 @@ static int napi_threaded_poll(void *data)
 	void *have;
 
 	while (!napi_thread_wait(napi)) {
+		unsigned long last_qs = jiffies;
+
 		for (;;) {
 			bool repoll = false;
 
@@ -6767,6 +6769,7 @@ static int napi_threaded_poll(void *data)
 			if (!repoll)
 				break;
 
+			rcu_softirq_qs_periodic(last_qs);
 			cond_resched();
 		}
 	}
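
For readers unfamiliar with the pattern, here is a rough userspace sketch of the idea behind the change: remember when the busy loop last reported a quiescent state and report again only once a threshold has elapsed. This is not the kernel helper. fake_jiffies, ticks_after(), report_qs(), qs_periodic() and simulate_busy_poll() are hypothetical names made up for illustration, and the period is an arbitrary parameter rather than the value the kernel uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
static unsigned long fake_jiffies;	/* simulated tick counter */
static unsigned long qs_reports;	/* number of quiescent states reported */

/* Wraparound-safe "a is later than b", the same idea as time_after(). */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Placeholder for the real work of reporting a quiescent state. */
static void report_qs(void)
{
	qs_reports++;
}

/*
 * Report a quiescent state only if more than 'period' ticks have passed
 * since *last_qs, then restart the interval from the current tick.
 */
static void qs_periodic(unsigned long *last_qs, unsigned long period)
{
	if (ticks_after(fake_jiffies, *last_qs + period)) {
		report_qs();
		*last_qs = fake_jiffies;
	}
}

/*
 * Simulate a poll loop that never goes idle. Each round is assumed to
 * take exactly one tick (an assumption of this sketch). Returns how many
 * quiescent states were reported over 'iterations' rounds.
 */
unsigned long simulate_busy_poll(unsigned long iterations, unsigned long period)
{
	unsigned long last_qs = fake_jiffies;

	assert(period > 0);
	qs_reports = 0;
	for (unsigned long i = 0; i < iterations; i++) {
		fake_jiffies++;			/* one poll round elapses */
		qs_periodic(&last_qs, period);	/* rate-limited report */
	}
	return qs_reports;
}
```

With a period of 100 ticks, 1000 busy rounds produce only a handful of reports in this sketch, and a loop that finishes within one period produces none. That is the property the patch relies on: the cost is paid only by loops that keep repolling for a long time.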