diff mbox series

[net,v3,2/3] net: sched: add check tc_skip_classify in sch egress

Message ID 20211129045503.20217-2-xiangxia.m.yue@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series [net,v3,1/3] net: core: set skb useful vars in __bpf_tx_skb | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter warning Series does not have a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 18 this patch: 18
netdev/cc_maintainers warning 1 maintainers not CCed: bpf@vger.kernel.org
netdev/build_clang success Errors and warnings before: 20 this patch: 20
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 9 this patch: 9
netdev/checkpatch warning WARNING: Duplicate signature
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf fail VM_Test
bpf/vmtest-bpf-PR fail PR summary
bpf/vmtest-bpf-next success VM_Test

Commit Message

Tonghao Zhang Nov. 29, 2021, 4:55 a.m. UTC
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

Try to resolve the issues as below:
* We look up and then check tc_skip_classify flag in net
  sched layer, even though skb don't want to be classified.
  That case may consume a lot of cpu cycles.

  Install the rules as below:
  $ for id in $(seq 1 100); do
  $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
  $ done

  netperf:
  $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
  $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32

  Before: 10662.33 tps, 108.95 Mbit/s
  After:  12434.48 tps, 145.89 Mbit/s
  For TCP_RR, there are 16.6% improvement, TCP_STREAM 33.9%.

* bpf_redirect may be invoked in egress path. if we don't
  check the flags and then return immediately, the packets
  will loopback.

  $ tc filter add dev eth0 egress bpf direct-action obj \
	  test_tc_redirect_ifb.o sec redirect_ifb

Cc: Willem de Bruijn <willemb@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Antoine Tenart <atenart@kernel.org>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Wei Wang <weiwan@google.com>
Cc: "Björn Töpel" <bjorn@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2: https://patchwork.kernel.org/project/netdevbpf/patch/20211103143208.41282-1-xiangxia.m.yue@gmail.com/
Willem de Bruijn and Daniel Borkmann, comment this patch, but I think we should fix this,
bpf_redirect may also loopback the packets. I hope there are more comments?
---
 net/core/dev.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Eric Dumazet Nov. 29, 2021, 5:44 p.m. UTC | #1
On Sun, Nov 28, 2021 at 8:55 PM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> Try to resolve the issues as below:
> * We look up and then check tc_skip_classify flag in net
>   sched layer, even though skb don't want to be classified.
>   That case may consume a lot of cpu cycles.
>
>   Install the rules as below:
>   $ for id in $(seq 1 100); do
>   $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
>   $ done
>
>   netperf:
>   $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
>   $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
>
>   Before: 10662.33 tps, 108.95 Mbit/s
>   After:  12434.48 tps, 145.89 Mbit/s
>   For TCP_RR, there are 16.6% improvement, TCP_STREAM 33.9%.

These numbers mean nothing, really.

I think you should put 10,000 filters instead of 100 so that the
numbers look even better ?

As a matter of fact, you add yet another check in fast  path.

For some reason I have not received the cover letter and patch 1/3.
Tonghao Zhang Nov. 30, 2021, 1:24 a.m. UTC | #2
On Tue, Nov 30, 2021 at 1:44 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Sun, Nov 28, 2021 at 8:55 PM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > Try to resolve the issues as below:
> > * We look up and then check tc_skip_classify flag in net
> >   sched layer, even though skb don't want to be classified.
> >   That case may consume a lot of cpu cycles.
> >
> >   Install the rules as below:
> >   $ for id in $(seq 1 100); do
> >   $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> >   $ done
> >
> >   netperf:
> >   $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> >   $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> >
> >   Before: 10662.33 tps, 108.95 Mbit/s
> >   After:  12434.48 tps, 145.89 Mbit/s
> >   For TCP_RR, there are 16.6% improvement, TCP_STREAM 33.9%.
>
> These numbers mean nothing, really.
>
> I think you should put 10,000 filters instead of 100 so that the
> numbers look even better ?
This 100 filters with different prio, I will install 10,000 filters
and test again. Thanks.
> As a matter of fact, you add yet another check in fast  path.
>
> For some reason I have not received the cover letter and patch 1/3.
1/3 patch, https://patchwork.kernel.org/project/netdevbpf/patch/20211129045503.20217-1-xiangxia.m.yue@gmail.com/
Tonghao Zhang Dec. 1, 2021, 10:58 a.m. UTC | #3
On Tue, Nov 30, 2021 at 9:24 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Nov 30, 2021 at 1:44 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Sun, Nov 28, 2021 at 8:55 PM <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > Try to resolve the issues as below:
> > > * We look up and then check tc_skip_classify flag in net
> > >   sched layer, even though skb don't want to be classified.
> > >   That case may consume a lot of cpu cycles.
> > >
> > >   Install the rules as below:
> > >   $ for id in $(seq 1 100); do
> > >   $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> > >   $ done
> > >
> > >   netperf:
> > >   $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> > >   $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> > >
> > >   Before: 10662.33 tps, 108.95 Mbit/s
> > >   After:  12434.48 tps, 145.89 Mbit/s
> > >   For TCP_RR, there are 16.6% improvement, TCP_STREAM 33.9%.
> >
> > These numbers mean nothing, really.
> >
> > I think you should put 10,000 filters instead of 100 so that the
> > numbers look even better ?
> This 100 filters with different prio, I will install 10,000 filters
> and test again. Thanks.
Hi Eric
I install 10,000 filters with different prio: for example
tc filter add dev enp5s0f0 egress protocol ip prio 10000 flower
skip_hw src_ip 4.4.39.16 action mirred egress redirect dev ifb0

Test test commands:
taskset -c 1 netperf -t TCP_RR -L 4.4.39.16 -H 4.4.200.200 -- -r 32,32
taskset -c 1 netperf -t TCP_STREAM -L 4.4.39.16 -H 4.4.200.200 -- -m 32

Without patch:
152.04 tps
0.58 10^6bits/sec
With patch:
303.07 tps
1.51 10^6bits/sec

> > As a matter of fact, you add yet another check in fast  path.
> >
> > For some reason I have not received the cover letter and patch 1/3.
> 1/3 patch, https://patchwork.kernel.org/project/netdevbpf/patch/20211129045503.20217-1-xiangxia.m.yue@gmail.com/
>
>
> --
> Best regards, Tonghao
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 823917de0d2b..4ceb927b1577 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3823,6 +3823,9 @@  sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
 	if (!miniq)
 		return skb;
 
+	if (skb_skip_tc_classify(skb))
+		return skb;
+
 	/* qdisc_skb_cb(skb)->pkt_len was already set by the caller. */
 	qdisc_skb_cb(skb)->mru = 0;
 	qdisc_skb_cb(skb)->post_ct = false;