[RFC,bpf-next] bpf: Fork state at bpf_map_lookup_elem

Message ID 20241206033342.82058-1-alexei.starovoitov@gmail.com (mailing list archive)
State RFC
Delegated to: BPF

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 fail Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 fail Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 fail Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 fail Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-30 fail Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37 fail Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 fail Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 fail Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 fail Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 8 maintainers not CCed: haoluo@google.com jolsa@kernel.org yonghong.song@linux.dev kpsingh@kernel.org song@kernel.org sdf@fomichev.me martin.lau@linux.dev john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 11 this patch: 11
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-7 fail Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 fail Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc

Commit Message

Alexei Starovoitov Dec. 6, 2024, 3:33 a.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

Here is a work-in-progress diff that passes the tests (except for
error message mismatches).

Instead of returning map_value_or_null from bpf_map_lookup_elem(),
the patch forks the state: it returns map_value in the fallthrough
path and const zero on the second pass.
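
The forking idea can be sketched with a toy model. This is not kernel code:
the toy_* types and functions are hypothetical stand-ins for
bpf_verifier_state, push_stack() and the R0 marking helpers, and the in_loop
flag stands in for the loop check the patch applies before forking.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified register types (not the kernel's bpf_reg_type). */
enum toy_reg_type {
	TOY_SCALAR_ZERO,	/* R0 is known to be constant zero (NULL) */
	TOY_MAP_VALUE,		/* R0 is a non-NULL pointer to a map value */
	TOY_MAP_VALUE_OR_NULL,	/* R0 needs a NULL check before use */
};

struct toy_state {
	int insn_idx;		/* next instruction to verify */
	enum toy_reg_type r0;	/* tracked type of R0 */
};

#define TOY_STACK_MAX 8

/* Pending states that still have to be explored. */
struct toy_env {
	struct toy_state stack[TOY_STACK_MAX];
	size_t depth;
};

/* Queue a copy of cur that resumes at next_idx; NULL when the stack is full. */
static struct toy_state *toy_push_stack(struct toy_env *env,
					const struct toy_state *cur,
					int next_idx)
{
	struct toy_state *st;

	if (env->depth >= TOY_STACK_MAX)
		return NULL;
	st = &env->stack[env->depth++];
	*st = *cur;
	st->insn_idx = next_idx;
	return st;
}

/*
 * Model the return of a map lookup at cur->insn_idx.
 * Outside a loop: fork, so the current path sees a non-NULL map value
 * and the queued path sees constant zero.
 * Inside a loop (in_loop): keep the single "or NULL" state.
 */
static int toy_lookup_return(struct toy_env *env, struct toy_state *cur,
			     bool in_loop)
{
	struct toy_state *other;

	if (in_loop) {
		cur->r0 = TOY_MAP_VALUE_OR_NULL;
		cur->insn_idx++;
		return 0;
	}

	other = toy_push_stack(env, cur, cur->insn_idx + 1);
	if (!other)
		return -1;
	other->r0 = TOY_SCALAR_ZERO;	/* second pass: lookup returned NULL */

	cur->r0 = TOY_MAP_VALUE;	/* fallthrough: lookup succeeded */
	cur->insn_idx++;
	return 0;
}
```

Calling toy_lookup_return() with in_loop == false leaves the current state
with TOY_MAP_VALUE and queues one extra state with TOY_SCALAR_ZERO, so each
path carries an exact R0 type past the lookup.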

Below are the verifier performance results.
A larger negative % means fewer instructions processed, which is better.
In some cases the wins are big.
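
As a reading aid for the (DIFF) columns, here is a minimal sketch of the
arithmetic. It assumes the percentage is taken relative to column A, which
matches the rows in the tables; insns_pct() and format_diff() are
illustrative names, not veristat functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Relative change in processed instructions from run A to run B, in percent. */
static double insns_pct(long insns_a, long insns_b)
{
	return 100.0 * (double)(insns_b - insns_a) / (double)insns_a;
}

/* Format the change in the same "delta (pct%)" shape as the (DIFF) column. */
static int format_diff(char *buf, size_t len, long insns_a, long insns_b)
{
	return snprintf(buf, len, "%+ld (%+.2f%%)",
			insns_b - insns_a, insns_pct(insns_a, insns_b));
}
```

For the first balancer_ingress selftest row, format_diff(buf, sizeof(buf),
4489, 3257) produces "-1232 (-27.44%)".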

The only substantial loss is in 'tw_twfw_*'.
In those tests the bounded loop logic kicks in, so the extra fork of
states inside the loop makes the verifier do more work.
The situation is similar for the checkpoint_states_deletion() test
in progs/iters.c.
Hence the patch uses the old map_value_or_null approach
when get_loop_entry(env->cur_state) returns non-NULL.
That addresses the problem with checkpoint_states_deletion(),
but not with tw_twfw_*.

I'm not convinced we need to land this patch, but
the wins in the balancer_ingress tests are appealing.

// progs from selftests
./veristat -C -e prog,insns,verdict -f 'insns_pct>5' before after
Program                             Insns (A)  Insns (B)  Insns       (DIFF)  Verdict (A)  Verdict (B)  Verdict (DIFF)
----------------------------------  ---------  ---------  ------------------  -----------  -----------  --------------
iter_err_too_permissive2                   39         61       +22 (+56.41%)  failure      failure      MATCH
iter_err_too_permissive3                   31         54       +23 (+74.19%)  failure      failure      MATCH
iter_tricky_but_fine                       56         50        -6 (-10.71%)  success      success      MATCH
raw_tracepoint__sched_process_exit       3138       3315       +177 (+5.64%)  success      success      MATCH
kprobe__vfs_link                        10272      11000       +728 (+7.09%)  success      success      MATCH
kprobe__vfs_symlink                      5781       6311       +530 (+9.17%)  success      success      MATCH
kprobe_ret__do_filp_open                 5891       6421       +530 (+9.00%)  success      success      MATCH
on_event                               116096     877289  +761193 (+655.66%)  failure      success      MISMATCH
   // mainly due to BPF_COMPLEXITY_LIMIT_JMP_SEQ increase
on_event                                 4595       6332     +1737 (+37.80%)  success      success      MATCH
on_event                                 7187       6801       -386 (-5.37%)  success      success      MATCH
balancer_ingress                         4489       3257     -1232 (-27.44%)  success      success      MATCH
balancer_ingress                         4865       3168     -1697 (-34.88%)  success      success      MATCH
balancer_ingress                         1508       1060      -448 (-29.71%)  success      success      MATCH
balancer_ingress_v4                      3666       2819      -847 (-23.10%)  success      success      MATCH
balancer_ingress_v6                      3453       2523      -930 (-26.93%)  success      success      MATCH
syncookie_tc                             5549       5884       +335 (+6.04%)  success      success      MATCH

// production progs
./veristat -C -e prog,insns -f 'insns_pct>5' before after
Program                                   Insns (A)  Insns (B)  Insns      (DIFF)
----------------------------------------  ---------  ---------  -----------------
on_switch                                      3789       5585    +1796 (+47.40%)
balancer_ingress                               8389       6820    -1569 (-18.70%)
balancer_ingress                              12477      10735    -1742 (-13.96%)
balancer_ingress                              12989      11658    -1331 (-10.25%)
balancer_ingress                              12989      11658    -1331 (-10.25%)
balancer_ingress                              12477      10735    -1742 (-13.96%)
balancer_ingress                              16400      15415      -985 (-6.01%)
balancer_ingress                              17893      16775     -1118 (-6.25%)
balancer_ingress                              17311      16305     -1006 (-5.81%)
balancer_ingress                              18042      17137      -905 (-5.02%)
balancer_ingress                               9253       7728    -1525 (-16.48%)
balancer_ingress                               9865       8143    -1722 (-17.46%)
balancer_ingress                               8870       7182    -1688 (-19.03%)
balancer_ingress                             321972     164530  -157442 (-48.90%)
balancer_ingress                             322701     165237  -157464 (-48.80%)
balancer_ingress                             344833     176948  -167885 (-48.69%)
balancer_ingress                             344833     176948  -167885 (-48.69%)
balancer_ingress                             322701     165237  -157464 (-48.80%)
balancer_ingress                             343872     176031  -167841 (-48.81%)
balancer_ingress                             343665     175732  -167933 (-48.87%)
prog_block_rq_complete_raw                      803        884      +81 (+10.09%)
sm_tc_writer                                    200        214       +14 (+7.00%)
tc_scope_lookup                                 214        240      +26 (+12.15%)
ned_hwtstamp                                    133        162      +29 (+21.80%)
ned_skop_timestamp                              528        574       +46 (+8.71%)
ned_skop_pacing                                 113        124       +11 (+9.73%)
ned_scope_resolver                              262        307      +45 (+17.18%)
ned_skop_selcca                                 223        282      +59 (+26.46%)
ned_tcpopt_sr                                   660        721       +61 (+9.24%)
ned_skop_timeout                                218        244      +26 (+11.93%)
nat64                                          1337       1463      +126 (+9.42%)
dctcp_update_alpha                              113        123       +10 (+8.85%)
dctcp_update_alpha                              113        123       +10 (+8.85%)
ned_ts_func                                     592        655      +63 (+10.64%)
filtering                                       362        459      +97 (+26.80%)
mitigate_rwnd                                   314        441     +127 (+40.45%)
privacy_setoskopt                               100        106        +6 (+6.00%)
sslwall_sockops                                 511        451      -60 (-11.74%)
on_event                                        260        275       +15 (+5.77%)
on_event                                        260        275       +15 (+5.77%)
read_async_py_stack                           24723      22404     -2319 (-9.38%)
on_event                                        260        275       +15 (+5.77%)
read_async_py_stack                           24723      22404     -2319 (-9.38%)
read_async_py_stack                           24723      22404     -2319 (-9.38%)
read_async_py_stack                           24723      22404     -2319 (-9.38%)
bash_reader                                   19475      21980    +2505 (+12.86%)
syar_cgroup_mkdir                             10276      11532    +1256 (+12.22%)
accept_protect                                 9776      11037    +1261 (+12.90%)
syar_pci_enable_device                          156        164        +8 (+5.13%)
python3_detect                                11545      12447      +902 (+7.81%)
bpf_prog_detect                                 217        241      +24 (+11.06%)
syar_task_kill                                10223      11522    +1299 (+12.71%)
syar_task_enter_process_vm_writev             19531      20775     +1244 (+6.37%)
milli_sampler                                   497        554      +57 (+11.47%)
cubictcp_cong_avoid                           57380      61292     +3912 (+6.82%)
tcp_reno_cong_avoid                           57380      61292     +3912 (+6.82%)
tracepoint__tcp__tcp_destroy_sock                43         46        +3 (+6.98%)
tracepoint__tcp__tcp_receive_reset              156        199      +43 (+27.56%)
tracepoint__tcp__tcp_retransmit_skb            3471       2781     -690 (-19.88%)
tracepoint__tcp__tcp_retransmit_synack         3164       2293     -871 (-27.53%)
bbr_set_state                                 12594       5207    -7387 (-58.65%)
cubictcp_state                                12594       5207    -7387 (-58.65%)
kprobe__bbr_set_state                          8207       3940    -4267 (-51.99%)
kprobe__bictcp_state                           8207       3940    -4267 (-51.99%)
tcp_receive_reset                               206        227      +21 (+10.19%)
tcp_retransmit_skb                             7709       5557    -2152 (-27.92%)
tcp_retransmit_synack                          4706       3295    -1411 (-29.98%)
tw_netbw_cg_eg                                  196        215       +19 (+9.69%)
tw_egress                                      1190       1447     +257 (+21.60%)
tw_ingress                                     1180       1437     +257 (+21.78%)
ned_cgrp_dctcp                                  285        328      +43 (+15.09%)
tw_ipt_connect                                  165        177       +12 (+7.27%)
tw_ipt_ingress                                  101        112      +11 (+10.89%)
tw_ipt_listen                                   157        173      +16 (+10.19%)
tw_ns_phy2veth                                 2516       2288      -228 (-9.06%)
tw_tproxy_router                               1852       2110     +258 (+13.93%)
ttls_tc_egress                                  519        572      +53 (+10.21%)
ttls_tc_ingress                                7651       8137      +486 (+6.35%)
ttls_nat_ingress                                356        383       +27 (+7.58%)
tw_twfw_egress                               205149     239977   +34828 (+16.98%)
tw_twfw_ingress                              205153     239987   +34834 (+16.98%)
tw_twfw_tc_eg                                205147     239983   +34836 (+16.98%)
tw_twfw_tc_in                                205151     239987   +34836 (+16.98%)
tw_twfw_egress                                 5964       5530      -434 (-7.28%)
tw_twfw_ingress                                6110       5558      -552 (-9.03%)
tw_twfw_tc_eg                                  6109       5424     -685 (-11.21%)
tw_twfw_tc_in                                  6108       5558      -550 (-9.00%)
twfw_connect4                                 32715      17994   -14721 (-45.00%)
twfw_sendmsg4                                 32715      17994   -14721 (-45.00%)

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/verifier.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 31e0d33498ac..73b5cc767d25 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -186,7 +186,7 @@  struct bpf_verifier_stack_elem {
 	u32 log_pos;
 };
 
-#define BPF_COMPLEXITY_LIMIT_JMP_SEQ	8192
+#define BPF_COMPLEXITY_LIMIT_JMP_SEQ	(8192 * 4)
 #define BPF_COMPLEXITY_LIMIT_STATES	64
 
 #define BPF_MAP_KEY_POISON	(1ULL << 63)
@@ -11206,6 +11206,16 @@  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		regs[BPF_REG_0].map_ptr = meta.map_ptr;
 		regs[BPF_REG_0].map_uid = meta.map_uid;
 		regs[BPF_REG_0].type = PTR_TO_MAP_VALUE | ret_flag;
+		if (ret_flag == PTR_MAYBE_NULL && !get_loop_entry(env->cur_state)) {
+			struct bpf_verifier_state *st;
+			struct bpf_reg_state *other_regs;
+
+			st = push_stack(env, insn_idx + 1, insn_idx, false);
+			other_regs = st->frame[st->curframe]->regs;
+			__mark_reg_const_zero(env, &other_regs[BPF_REG_0]);
+
+			mark_ptr_not_null_reg(&regs[BPF_REG_0]);
+		}
 		if (!type_may_be_null(ret_type) &&
 		    btf_record_has_field(meta.map_ptr->record, BPF_SPIN_LOCK)) {
 			regs[BPF_REG_0].id = ++env->id_gen;