Message ID | 20240305144404.569632-4-ast@fiberby.net (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | make skip_sw actually skip software | expand |
Hi Asbjørn, kernel test robot noticed the following build errors: [auto build test ERROR on net-next/main] url: https://github.com/intel-lab-lkp/linux/commits/Asbj-rn-Sloth-T-nnesen/net-sched-cls_api-add-skip_sw-counter/20240305-225707 base: net-next/main patch link: https://lore.kernel.org/r/20240305144404.569632-4-ast%40fiberby.net patch subject: [PATCH net-next v2 3/3] net: sched: make skip_sw actually skip software config: riscv-defconfig (https://download.01.org/0day-ci/archive/20240306/202403061346.ILWGhalr-lkp@intel.com/config) compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 325f51237252e6dab8e4e1ea1fa7acbb4faee1cd) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240306/202403061346.ILWGhalr-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202403061346.ILWGhalr-lkp@intel.com/ All errors (new ones prefixed by >>): In file included from net/sched/cls_api.c:18: In file included from include/linux/skbuff.h:17: In file included from include/linux/bvec.h:10: In file included from include/linux/highmem.h:8: In file included from include/linux/cacheflush.h:5: In file included from arch/riscv/include/asm/cacheflush.h:9: In file included from include/linux/mm.h:2188: include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ In file included from net/sched/cls_api.c:27: In file included from include/net/sock.h:46: In file included from include/linux/netdevice.h:45: In file included from include/uapi/linux/neighbour.h:6: In file included from include/linux/netlink.h:9: In file included from include/net/scm.h:9: In file included from include/linux/security.h:35: include/linux/bpf.h:736:48: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 736 | ARG_PTR_TO_MAP_VALUE_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_MAP_VALUE, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~ include/linux/bpf.h:737:43: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 737 | ARG_PTR_TO_MEM_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_MEM, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ include/linux/bpf.h:738:43: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 738 | ARG_PTR_TO_CTX_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_CTX, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ include/linux/bpf.h:739:45: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 739 | ARG_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_SOCKET, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~ include/linux/bpf.h:740:44: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 740 | ARG_PTR_TO_STACK_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_STACK, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~ include/linux/bpf.h:741:45: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 741 | ARG_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_BTF_ID, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~ include/linux/bpf.h:745:38: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 745 | ARG_PTR_TO_UNINIT_MEM = MEM_UNINIT | ARG_PTR_TO_MEM, | ~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ include/linux/bpf.h:747:45: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_arg_type') [-Wenum-enum-conversion] 747 | ARG_PTR_TO_FIXED_SIZE_MEM = MEM_FIXED_SIZE | ARG_PTR_TO_MEM, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ include/linux/bpf.h:770:48: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 770 | RET_PTR_TO_MAP_VALUE_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_MAP_VALUE, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~ include/linux/bpf.h:771:45: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 771 | RET_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~ include/linux/bpf.h:772:47: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 772 | RET_PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~ include/linux/bpf.h:773:50: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 773 | RET_PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~ include/linux/bpf.h:775:49: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 775 | RET_PTR_TO_DYNPTR_MEM_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_MEM, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ include/linux/bpf.h:776:45: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 776 | RET_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~ include/linux/bpf.h:777:43: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_return_type') [-Wenum-enum-conversion] 777 | RET_PTR_TO_BTF_ID_TRUSTED = PTR_TRUSTED | RET_PTR_TO_BTF_ID, | ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~ include/linux/bpf.h:888:44: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_reg_type') [-Wenum-enum-conversion] 888 | PTR_TO_MAP_VALUE_OR_NULL = PTR_MAYBE_NULL | PTR_TO_MAP_VALUE, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~ include/linux/bpf.h:889:42: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_reg_type') [-Wenum-enum-conversion] 889 | PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | PTR_TO_SOCKET, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~ include/linux/bpf.h:890:46: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_reg_type') [-Wenum-enum-conversion] 890 | PTR_TO_SOCK_COMMON_OR_NULL = PTR_MAYBE_NULL | PTR_TO_SOCK_COMMON, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~ include/linux/bpf.h:891:44: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_reg_type') [-Wenum-enum-conversion] 891 | PTR_TO_TCP_SOCK_OR_NULL = PTR_MAYBE_NULL | PTR_TO_TCP_SOCK, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~ include/linux/bpf.h:892:42: warning: bitwise operation between different enumeration types ('enum bpf_type_flag' and 'enum bpf_reg_type') [-Wenum-enum-conversion] 892 | PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | PTR_TO_BTF_ID, | ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~ >> net/sched/cls_api.c:421:23: error: use of undeclared identifier 'tcf_bypass_check_needed_key' 421 | static_branch_inc(&tcf_bypass_check_needed_key); | ^ net/sched/cls_api.c:423:23: error: use of undeclared identifier 'tcf_bypass_check_needed_key' 423 | static_branch_dec(&tcf_bypass_check_needed_key); | ^ 21 warnings and 2 errors generated. vim +/tcf_bypass_check_needed_key +421 net/sched/cls_api.c 412 413 static inline void tcf_maintain_bypass(struct tcf_block *block) 414 { 415 int filtercnt = atomic_read(&block->filtercnt); 416 int skipswcnt = atomic_read(&block->skipswcnt); 417 bool bypass_wanted = filtercnt > 0 && filtercnt == skipswcnt; 418 419 if (bypass_wanted != block->bypass_wanted) { 420 if (bypass_wanted) > 421 static_branch_inc(&tcf_bypass_check_needed_key); 422 else 423 static_branch_dec(&tcf_bypass_check_needed_key); 424 block->bypass_wanted = bypass_wanted; 425 } 426 } 427
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index a4ee43f493bb..41297bd38dff 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -74,6 +74,15 @@ static inline bool tcf_block_non_null_shared(struct tcf_block *block) return block && block->index; } +#ifdef CONFIG_NET_CLS_ACT +DECLARE_STATIC_KEY_FALSE(tcf_bypass_check_needed_key); + +static inline bool tcf_block_bypass_sw(struct tcf_block *block) +{ + return block && block->bypass_wanted; +} +#endif + static inline struct Qdisc *tcf_block_q(struct tcf_block *block) { WARN_ON(tcf_block_shared(block)); diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 7af0621db226..60b0fdf2b1ad 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -477,6 +477,7 @@ struct tcf_block { struct flow_block flow_block; struct list_head owner_list; bool keep_dst; + bool bypass_wanted; atomic_t filtercnt; /* Number of filters */ atomic_t skipswcnt; /* Number of skip_sw filters */ atomic_t offloadcnt; /* Number of oddloaded filters */ diff --git a/net/core/dev.c b/net/core/dev.c index fe054cbd41e9..b7c583f98e82 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2057,6 +2057,11 @@ void net_dec_egress_queue(void) EXPORT_SYMBOL_GPL(net_dec_egress_queue); #endif +#ifdef CONFIG_NET_CLS_ACT +DEFINE_STATIC_KEY_FALSE(tcf_bypass_check_needed_key); +EXPORT_SYMBOL(tcf_bypass_check_needed_key); +#endif + DEFINE_STATIC_KEY_FALSE(netstamp_needed_key); EXPORT_SYMBOL(netstamp_needed_key); #ifdef CONFIG_JUMP_LABEL @@ -3911,6 +3916,11 @@ static int tc_run(struct tcx_entry *entry, struct sk_buff *skb, if (!miniq) return ret; + if (static_branch_unlikely(&tcf_bypass_check_needed_key)) { + if (tcf_block_bypass_sw(miniq->block)) + return ret; + } + tc_skb_cb(skb)->mru = 0; tc_skb_cb(skb)->post_ct = false; tcf_set_drop_reason(skb, *drop_reason); diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 304a46ab0e0b..acef09e21969 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -410,6 +410,21 @@ static void tcf_proto_get(struct tcf_proto *tp) refcount_inc(&tp->refcnt); } +static inline void tcf_maintain_bypass(struct tcf_block *block) +{ + int filtercnt = atomic_read(&block->filtercnt); + int skipswcnt = atomic_read(&block->skipswcnt); + bool bypass_wanted = filtercnt > 0 && filtercnt == skipswcnt; + + if (bypass_wanted != block->bypass_wanted) { + if (bypass_wanted) + static_branch_inc(&tcf_bypass_check_needed_key); + else + static_branch_dec(&tcf_bypass_check_needed_key); + block->bypass_wanted = bypass_wanted; + } +} + static void tcf_block_filter_cnt_update(struct tcf_block *block, bool *counted, bool add) { lockdep_assert_not_held(&block->cb_lock); @@ -424,6 +439,7 @@ static void tcf_block_filter_cnt_update(struct tcf_block *block, bool *counted, *counted = false; } } + tcf_maintain_bypass(block); up_write(&block->cb_lock); }
TC filters come in 3 variants: - no flag (try to process in hardware, but fallback to software)) - skip_hw (do not process filter by hardware) - skip_sw (do not process filter by software) However skip_sw is implemented so that the skip_sw flag can first be checked, after it has been matched. IMHO it's common when using skip_sw, to use it on all rules. So if all filters in a block is skip_sw filters, then we can bail early, we can thus avoid having to match the filters, just to check for the skip_sw flag. This patch adds a bypass, for when only TC skip_sw rules are used. The bypass is guarded by a static key, to avoid harming other workloads. There are 3 ways that a packet from a skip_sw ruleset, can end up in the kernel path. Although the send packets to a non-existent chain way is only improved a few percents, then I believe it's worth optimizing the trap and fall-though use-cases. +----------------------------+--------+--------+--------+ | Test description | Pre- | Post- | Rel. | | | kpps | kpps | chg. | +----------------------------+--------+--------+--------+ | basic forwarding + notrack | 3589.3 | 3587.9 | 1.00x | | switch to eswitch mode | 3081.8 | 3094.7 | 1.00x | | add ingress qdisc | 3042.9 | 3063.6 | 1.01x | | tc forward in hw / skip_sw |37024.7 |37028.4 | 1.00x | | tc forward in sw / skip_hw | 3245.0 | 3245.3 | 1.00x | +----------------------------+--------+--------+--------+ | tests with only skip_sw rules below: | +----------------------------+--------+--------+--------+ | 1 non-matching rule | 2694.7 | 3058.7 | 1.14x | | 1 n-m rule, match trap | 2611.2 | 3323.1 | 1.27x | | 1 n-m rule, goto non-chain | 2886.8 | 2945.9 | 1.02x | | 5 non-matching rules | 1958.2 | 3061.3 | 1.56x | | 5 n-m rules, match trap | 1911.9 | 3327.0 | 1.74x | | 5 n-m rules, goto non-chain| 2883.1 | 2947.5 | 1.02x | | 10 non-matching rules | 1466.3 | 3062.8 | 2.09x | | 10 n-m rules, match trap | 1444.3 | 3317.9 | 2.30x | | 10 n-m rules,goto non-chain| 2883.1 | 2939.5 | 1.02x | | 25 non-matching rules | 838.5 | 3058.9 | 3.65x | | 25 n-m rules, match trap | 824.5 | 3323.0 | 4.03x | | 25 n-m rules,goto non-chain| 2875.8 | 2944.7 | 1.02x | | 50 non-matching rules | 488.1 | 3054.7 | 6.26x | | 50 n-m rules, match trap | 484.9 | 3318.5 | 6.84x | | 50 n-m rules,goto non-chain| 2884.1 | 2939.7 | 1.02x | +----------------------------+--------+--------+--------+ perf top (25 n-m skip_sw rules - pre patch): 20.39% [kernel] [k] __skb_flow_dissect 16.43% [kernel] [k] rhashtable_jhash2 10.58% [kernel] [k] fl_classify 10.23% [kernel] [k] fl_mask_lookup 4.79% [kernel] [k] memset_orig 2.58% [kernel] [k] tcf_classify 1.47% [kernel] [k] __x86_indirect_thunk_rax 1.42% [kernel] [k] __dev_queue_xmit 1.36% [kernel] [k] nft_do_chain 1.21% [kernel] [k] __rcu_read_lock perf top (25 n-m skip_sw rules - post patch): 5.12% [kernel] [k] __dev_queue_xmit 4.77% [kernel] [k] nft_do_chain 3.65% [kernel] [k] dev_gro_receive 3.41% [kernel] [k] check_preemption_disabled 3.14% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_nonlinear 2.88% [kernel] [k] __netif_receive_skb_core.constprop.0 2.49% [kernel] [k] mlx5e_xmit 2.15% [kernel] [k] ip_forward 1.95% [kernel] [k] mlx5e_tc_restore_tunnel 1.92% [kernel] [k] vlan_gro_receive Test setup: DUT: Intel Xeon D-1518 (2.20GHz) w/ Nvidia/Mellanox ConnectX-6 Dx 2x100G Data rate measured on switch (Extreme X690), and DUT connected as a router on a stick, with pktgen and pktsink as VLANs. Pktgen-dpdk was in range 36.6-37.7 Mpps across all tests. Full test data at https://files.fiberby.net/ast/2024/tc_skip_sw/v2_tests/ Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> --- include/net/pkt_cls.h | 9 +++++++++ include/net/sch_generic.h | 1 + net/core/dev.c | 10 ++++++++++ net/sched/cls_api.c | 16 ++++++++++++++++ 4 files changed, 36 insertions(+)