| Message ID | 20230606222411.1820404-4-eddyz87@gmail.com (mailing list archive) |
|---|---|
| State | Superseded |
| Delegated to | BPF |
| Series | verify scalar ids mapping in regsafe() |
On Tue, Jun 6, 2023 at 3:24 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Make sure that the following unsafe example is rejected by verifier:
>
> 1: r9 = ... some pointer with range X ...
> 2: r6 = ... unbound scalar ID=a ...
> 3: r7 = ... unbound scalar ID=b ...
> 4: if (r6 > r7) goto +1
> 5: r6 = r7
> 6: if (r6 > X) goto ...
> --- checkpoint ---
> 7: r9 += r7
> 8: *(u64 *)r9 = Y
>
> This example is unsafe because not all execution paths verify r7 range.
> Because of the jump at (4) the verifier would arrive at (6) in two states:
> I.  r6{.id=b}, r7{.id=b} via path 1-6;
> II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
>
> Currently regsafe() does not call check_ids() for scalar registers,
> thus from POV of regsafe() states (I) and (II) are identical. If the
> path 1-6 is taken by verifier first, and checkpoint is created at (6)
> the path [1-4, 6] would be considered safe.
>
> This commit updates regsafe() to call check_ids() for precise scalar
> registers.
>
[...]
>
> Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---

So I checked it also on our internal BPF object files, and it looks
mostly good.
Here are the only regressions:

Program                                   States (A)  States (B)  States (DIFF)
----------------------------------------  ----------  ----------  ---------------
balancer_ingress                               29219       34531  +5312 (+18.18%)
syar_bind6_protect6                             3257        3599   +342 (+10.50%)
syar_bind4_protect4                             2590        2931   +341 (+13.17%)
on_alloc                                         415         526   +111 (+26.75%)
on_free                                          406         517   +111 (+27.34%)
pycallcount                                      395         506   +111 (+28.10%)
resume_context                                   405         516   +111 (+27.41%)
on_py_event                                      395         506   +111 (+28.10%)
on_event                                         284         394   +110 (+38.73%)
handle_cuda_event                                268         378   +110 (+41.04%)
handle_cuda_launch                               276         386   +110 (+39.86%)
handle_cuda_malloc_ret                           272         382   +110 (+40.44%)
handle_cuda_memcpy                               270         380   +110 (+40.74%)
handle_cuda_memcpy_async                         270         380   +110 (+40.74%)
handle_pytorch_allocate_ret                      271         381   +110 (+40.59%)
handle_pytorch_malloc_ret                        272         382   +110 (+40.44%)
on_event                                         284         394   +110 (+38.73%)
on_event                                         284         394   +110 (+38.73%)
syar_task_enter_execve                           309         329    +20 (+6.47%)
kprobe__security_inode_link                      968         986    +18 (+1.86%)
kprobe__security_inode_symlink                   838         854    +16 (+1.91%)
tw_twfw_egress                                   249         251     +2 (+0.80%)
tw_twfw_ingress                                  250         252     +2 (+0.80%)
tw_twfw_tc_eg                                    248         250     +2 (+0.81%)
tw_twfw_tc_in                                    250         252     +2 (+0.80%)
raw_tracepoint__sched_process_exec               136         139     +3 (+2.21%)
kprobe_ret__do_filp_open                         869         871     +2 (+0.23%)
read_erlang_stack                                572         573     +1 (+0.17%)

They are mostly on small-ish programs. The only mild concern from my
side is balancer_ingress, which is one of the Katran BPF programs. It
adds +18% of states (which translates to about 70K more instructions
verified, up from 350K). I think we can live with this, but it would be
nice to check why it's happening.

I suspect that dropping SCALAR IDs as we discussed (after fixing
register fill/spill ID generation) might completely mitigate that.

Overall, LGTM:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  kernel/bpf/verifier.c                         | 34 ++++++++++++++++---
>  .../bpf/progs/verifier_search_pruning.c       |  3 +-
>  2 files changed, 32 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2aa60b73f1b5..175ca22b868e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -12933,12 +12933,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
>  	if (BPF_SRC(insn->code) == BPF_X) {
>  		struct bpf_reg_state *src_reg = regs + insn->src_reg;
>  		struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
> +		bool need_id = (src_reg->type == SCALAR_VALUE && !src_reg->id &&
> +				!tnum_is_const(src_reg->var_off));

nit: unnecessary outer ()

>  		if (BPF_CLASS(insn->code) == BPF_ALU64) {
>  			/* case: R1 = R2
>  			 * copy register state to dest reg
>  			 */
> -			if (src_reg->type == SCALAR_VALUE && !src_reg->id)
> +			if (need_id)
>  				/* Assign src and dst registers the same ID
>  				 * that will be used by find_equal_scalars()
>  				 * to propagate min/max range.

[...]
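A side note on the need_id condition under review: "constant scalar value"
is meant in the tnum sense here. A tnum tracks a (value, mask) pair where
mask bits are unknown, and tnum_is_const() from include/linux/tnum.h is
just a mask check (shown for context; this helper is pre-existing, not
part of this series):

/* include/linux/tnum.h */
struct tnum {
	u64 value;
	u64 mask;
};

/* A scalar is a compile-time-known constant exactly when no bits are unknown */
static inline bool tnum_is_const(struct tnum a)
{
	return !a.mask;
}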
On Wed, 2023-06-07 at 14:40 -0700, Andrii Nakryiko wrote:
> On Tue, Jun 6, 2023 at 3:24 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > Make sure that the following unsafe example is rejected by verifier:
[...]
> So I checked it also on our internal BPF object files, and it looks
> mostly good. Here are the only regressions:
[...]
> They are mostly on small-ish programs. The only mild concern from my
> side is balancer_ingress, which is one of the Katran BPF programs. It
> adds +18% of states (which translates to about 70K more instructions
> verified, up from 350K). I think we can live with this, but it would be
> nice to check why it's happening.

Thank you for reviewing this series.

I looked at the logs that you've shared; the difference is indeed
caused by some scalar registers having a unique ID in the cached state
and no ID in the current state, or vice versa. The !rold->id trick
that we discussed for V2 helps :)

What do you think about an alternative way to exclude unique scalars,
as in the patch below? (on top of this patch-set):

--- 8< -------------------------

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 235d7eded565..ece9722dff3b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -15149,6 +15149,13 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
 	return false;
 }
 
+static bool check_scalar_ids(u32 old_id, u32 cur_id, struct bpf_verifier_env *env)
+{
+	old_id = old_id ? old_id : env->id_gen++;
+	cur_id = cur_id ? cur_id : env->id_gen++;
+	return check_ids(old_id, cur_id, env->idmap_scratch);
+}
+
 static void clean_func_state(struct bpf_verifier_env *env,
 			     struct bpf_func_state *st)
 {
@@ -15325,7 +15332,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 	 */
 	return range_within(rold, rcur) &&
 	       tnum_in(rold->var_off, rcur->var_off) &&
-	       check_ids(rold->id, rcur->id, idmap);
+	       check_scalar_ids(rold->id, rcur->id, env);
 	case PTR_TO_MAP_KEY:
 	case PTR_TO_MAP_VALUE:
 	case PTR_TO_MEM:

------------------------- >8 ---

For me this patch removes all veristat differences compared to
master. If doing it for real, I'd like to reset env->id_gen at exit
from states_equal() to the value it had at entry (to avoid allocating
too many ids).

> I suspect that dropping SCALAR IDs as we discussed (after fixing
> register fill/spill ID generation) might completely mitigate that.
>
> Overall, LGTM:
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>

[...]
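For context: check_scalar_ids() above delegates to the pre-existing
check_ids(), which maintains a one-way mapping from 'old' IDs to 'cur'
IDs in the idmap scratch area. Based on the modified copy posted later
in this thread (which adds one extra check on top of it), its core loop
is roughly the following; any handling of zero IDs before the loop is
omitted here:

static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
{
	unsigned int i;

	for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
		if (!idmap[i].old) {
			/* Reached an empty slot; haven't seen this id before */
			idmap[i].old = old_id;
			idmap[i].cur = cur_id;
			return true;
		}
		/* a given 'old' ID must always map to the same 'cur' ID ... */
		if (idmap[i].old == old_id)
			return idmap[i].cur == cur_id;
		/* ... but nothing forbids two different 'old' IDs from
		 * mapping to the same 'cur' ID -- the asymmetry that the
		 * discussion below revolves around
		 */
	}
	/* We ran out of idmap slots, which should be impossible */
	WARN_ON_ONCE(1);
	return false;
}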
On Wed, Jun 7, 2023 at 6:26 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2023-06-07 at 14:40 -0700, Andrii Nakryiko wrote:
> > On Tue, Jun 6, 2023 at 3:24 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > Make sure that the following unsafe example is rejected by verifier:
[...]
> Thank you for reviewing this series.
>
> I looked at the logs that you've shared; the difference is indeed
> caused by some scalar registers having a unique ID in the cached state
> and no ID in the current state, or vice versa. The !rold->id trick
> that we discussed for V2 helps :)
>
> What do you think about an alternative way to exclude unique scalars,
> as in the patch below? (on top of this patch-set):
>
> +static bool check_scalar_ids(u32 old_id, u32 cur_id, struct bpf_verifier_env *env)
> +{
> +	old_id = old_id ? old_id : env->id_gen++;
> +	cur_id = cur_id ? cur_id : env->id_gen++;
> +	return check_ids(old_id, cur_id, env->idmap_scratch);
> +}
[...]
> For me this patch removes all veristat differences compared to
> master. If doing it for real, I'd like to reset env->id_gen at exit
> from states_equal() to the value it had at entry (to avoid allocating
> too many ids).

Hm.. It's clever and pretty minimal, I like it. We are basically
allocating a virtual ID for a SCALAR that doesn't have an id, just to
make sure we get a conflict if a SCALAR with an ID cannot be mapped
into two different SCALARs, right?

The only question would be whether it's safe to do that for the case
when old_reg->id != 0 and cur_reg->id == 0. E.g., if in the old
(verified) state we have r6.id = r7.id = 123, and in the new state we
have r6.id = 0 and r7.id = 0, then your implementation above will say
that the states are equivalent. But are they, given that there is a
link between r6 and r7 that might be required for correctness, which
we don't have in the current state?

So with this we are getting to my original concerns with your
!rold->id approach, which tries to ignore the necessity of a link
between registers.

What if we do this only for old registers? Then, (in old state) r6.id
= 0, r7.id = 0, (in new state) r6.id = r7.id = 123. This will be
rejected, because first we'll map 123 to newly allocated X for r6.id,
and then when we try to match r7.id=123 to another allocated ID X+1
we'll get a conflict, right?

[...]
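The env->id_gen reset Eduard mentions could be as simple as a
save/restore around the comparison. A hypothetical sketch, not code
from the series; the wrapper name and placement are illustrative:

/* Temporary IDs minted by check_scalar_ids() are only meaningful
 * within a single old/cur state comparison, so roll the counter
 * back once states_equal() returns.
 */
static bool states_equal_tmp_ids(struct bpf_verifier_env *env,
				 struct bpf_verifier_state *old,
				 struct bpf_verifier_state *cur)
{
	u32 saved_id_gen = env->id_gen;
	bool ret;

	ret = states_equal(env, old, cur);
	env->id_gen = saved_id_gen;
	return ret;
}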
On Thu, 2023-06-08 at 10:21 -0700, Andrii Nakryiko wrote:
> On Wed, Jun 7, 2023 at 6:26 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Wed, 2023-06-07 at 14:40 -0700, Andrii Nakryiko wrote:
[...]
> Hm.. It's clever and pretty minimal, I like it. We are basically
> allocating a virtual ID for a SCALAR that doesn't have an id, just to
> make sure we get a conflict if a SCALAR with an ID cannot be mapped
> into two different SCALARs, right?
>
> The only question would be whether it's safe to do that for the case
> when old_reg->id != 0 and cur_reg->id == 0. E.g., if in the old
> (verified) state we have r6.id = r7.id = 123, and in the new state we
> have r6.id = 0 and r7.id = 0, then your implementation above will say
> that the states are equivalent. But are they, given that there is a
> link between r6 and r7 that might be required for correctness, which
> we don't have in the current state?

You mean the other way around, rold.id == 0, rcur.id != 0, right?
(below 0/2 means: original value 0, replaced by new id 2).

(1)        old   cur
    r6.id  0/2   1
    r7.id  0/3   1     check_ids returns true

(2)        old   cur
    r6.id  1     0/2
    r7.id  1     0/3   check_ids returns false

Also, (1) is no different from (3):

(3)        old   cur
    r6.id  1     3
    r7.id  2     3     check_ids returns true

The current check:

	if (!rold->precise)
		return true;
	return range_within(rold, rcur) &&
	       tnum_in(rold->var_off, rcur->var_off) &&
	       check_ids(rold->id, rcur->id, idmap);

will reject (1) and (2), but will accept (3).

The new check:

	if (!rold->precise)
		return true;
	return range_within(rold, rcur) &&
	       tnum_in(rold->var_off, rcur->var_off) &&
	       check_scalar_ids(rold->id, rcur->id, idmap);

will reject (2), but will accept (1) and (3).

And a modification of check_scalar_ids() to generate IDs only for rold
or only for rcur would not reject (3) either.

(3) would be rejected only if the current check is used together with
elimination of unique scalar IDs from old states.

My previous experiments show that eliminating unique IDs from old
states while not eliminating unique IDs from cur states is *very* bad
for performance.

> So with this we are getting to my original concerns with your
> !rold->id approach, which tries to ignore the necessity of a link
> between registers.
>
> What if we do this only for old registers? Then, (in old state) r6.id
> = 0, r7.id = 0, (in new state) r6.id = r7.id = 123. This will be
> rejected, because first we'll map 123 to newly allocated X for r6.id,
> and then when we try to match r7.id=123 to another allocated ID X+1
> we'll get a conflict, right?

[...]
On Thu, 2023-06-08 at 22:05 +0300, Eduard Zingerman wrote:
[...]
> > Hm.. It's clever and pretty minimal, I like it. We are basically
> > allocating a virtual ID for a SCALAR that doesn't have an id, just to
> > make sure we get a conflict if a SCALAR with an ID cannot be mapped
> > into two different SCALARs, right?
> >
> > The only question would be whether it's safe to do that for the case
> > when old_reg->id != 0 and cur_reg->id == 0. [...]
>
> You mean the other way around, rold.id == 0, rcur.id != 0, right?
[...]

Ok, here is what I think is the final version:
a. for each old or cur ID generate a temporary unique ID;
b. for scalars, use a modified check_ids() that forbids mapping the
   same 'cur' ID to several different 'old' IDs.

(a) allows to reject situations like:

(1)        old   cur      (2)        old   cur
    r6.id  0     1            r6.id  1     0
    r7.id  0     1            r7.id  1     0

(b) allows to reject situations like:

(3)        old   cur
    r6.id  1     3
    r7.id  2     3

And whether some scalar ID is unique or not does not matter for the
algorithm.

Tests are passing, the katran example is fine (350K insns, 29K states),
minor veristat regression:

File       Program                        States (DIFF)
---------  ------------------------------ -------------
bpf_xdp.o  tail_nodeport_nat_ingress_ipv4   +3 (+0.80%)
bpf_xdp.o  tail_rev_nodeport_lb4            +2 (+0.50%)

--- 8< -----------------------------

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 235d7eded565..5794dc7830db 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -15149,6 +15149,31 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
 	return false;
 }
 
+static bool check_scalar_ids(struct bpf_verifier_env *env, u32 old_id, u32 cur_id,
+			     struct bpf_id_pair *idmap)
+{
+	unsigned int i;
+
+	old_id = old_id ? old_id : env->id_gen++;
+	cur_id = cur_id ? cur_id : env->id_gen++;
+
+	for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
+		if (!idmap[i].old) {
+			/* Reached an empty slot; haven't seen this id before */
+			idmap[i].old = old_id;
+			idmap[i].cur = cur_id;
+			return true;
+		}
+		if (idmap[i].old == old_id)
+			return idmap[i].cur == cur_id;
+		if (idmap[i].cur == cur_id)
+			return false;
+	}
+	/* We ran out of idmap slots, which should be impossible */
+	WARN_ON_ONCE(1);
+	return false;
+}
+
 static void clean_func_state(struct bpf_verifier_env *env,
 			     struct bpf_func_state *st)
 {
@@ -15325,7 +15350,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 	 */
 	return range_within(rold, rcur) &&
 	       tnum_in(rold->var_off, rcur->var_off) &&
-	       check_ids(rold->id, rcur->id, idmap);
+	       check_scalar_ids(env, rold->id, rcur->id, idmap);
 	case PTR_TO_MAP_KEY:
 	case PTR_TO_MAP_VALUE:
 	case PTR_TO_MEM:

----------------------------- >8 ---
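To make the accept/reject behavior of this final version concrete, here
is a small userspace harness (an illustration only: the struct and map
size mirror the kernel ones loosely, and the id_gen seed is arbitrary)
that runs cases (1)-(3) from this thread, plus the both-sides-unbound
case, through the check_scalar_ids() logic above:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BPF_ID_MAP_SIZE 64

struct bpf_id_pair {
	uint32_t old;
	uint32_t cur;
};

static uint32_t id_gen = 1000; /* stand-in for env->id_gen */

/* Mirrors the check_scalar_ids() from the message above */
static bool check_scalar_ids(uint32_t old_id, uint32_t cur_id,
			     struct bpf_id_pair *idmap)
{
	unsigned int i;

	/* (a) unbound scalars get temporary unique IDs */
	old_id = old_id ? old_id : id_gen++;
	cur_id = cur_id ? cur_id : id_gen++;

	for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
		if (!idmap[i].old) {
			/* empty slot: record a fresh old->cur mapping */
			idmap[i].old = old_id;
			idmap[i].cur = cur_id;
			return true;
		}
		if (idmap[i].old == old_id)
			return idmap[i].cur == cur_id;
		if (idmap[i].cur == cur_id) /* (b) forbid many-old-to-one-cur */
			return false;
	}
	return false; /* out of idmap slots */
}

static void run_case(const char *name, uint32_t r6_old, uint32_t r6_cur,
		     uint32_t r7_old, uint32_t r7_cur)
{
	struct bpf_id_pair idmap[BPF_ID_MAP_SIZE];
	bool safe;

	/* one fresh idmap per old/cur state comparison */
	memset(idmap, 0, sizeof(idmap));
	safe = check_scalar_ids(r6_old, r6_cur, idmap) &&
	       check_scalar_ids(r7_old, r7_cur, idmap);
	printf("%-24s -> %s\n", name, safe ? "equivalent" : "rejected");
}

int main(void)
{
	run_case("(1) old=0,0 cur=1,1", 0, 1, 0, 1); /* rejected */
	run_case("(2) old=1,1 cur=0,0", 1, 0, 1, 0); /* rejected */
	run_case("(3) old=1,2 cur=3,3", 1, 3, 2, 3); /* rejected */
	run_case("    old=0,0 cur=3,4", 0, 3, 0, 4); /* equivalent */
	return 0;
}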
On Thu, Jun 8, 2023 at 1:58 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-06-08 at 22:05 +0300, Eduard Zingerman wrote:
> [...]
> > You mean the other way around, rold.id == 0, rcur.id != 0, right?
> > (below 0/2 means: original value 0, replaced by new id 2).

no, I actually meant what I wrote, but I didn't realize that
check_ids() is kind of broken... Because it shouldn't allow the same
ID from cur state to be mapped to two different IDs in old state,
should it?

> > (1)        old   cur
> >     r6.id  0/2   1
> >     r7.id  0/3   1     check_ids returns true

I think this should be rejected.

> > (2)        old   cur
> >     r6.id  1     0/2
> >     r7.id  1     0/3   check_ids returns false

And this should be rejected.

> > Also, (1) is no different from (3):
> >
> > (3)        old   cur
> >     r6.id  1     3
> >     r7.id  2     3     check_ids returns true

And this definitely should be rejected.

The only situation that might not be rejected would be:

           old   cur
    r6.id  0/1   3
    r7.id  0/2   4

And perhaps this one is ok as well?

           old   cur
    r6.id  3     0/1
    r7.id  4     0/2

And my assumption was that that's what you are trying to do. But
weirdly check_ids() is enforcing only that old ID has to have a unique
mapping, which seems like a bug.

[...]

> Ok, here is what I think is the final version:
> a. for each old or cur ID generate a temporary unique ID;
> b. for scalars, use a modified check_ids() that forbids mapping the
>    same 'cur' ID to several different 'old' IDs.
[...]
> +	for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
> +		if (!idmap[i].old) {
> +			/* Reached an empty slot; haven't seen this id before */
> +			idmap[i].old = old_id;
> +			idmap[i].cur = cur_id;
> +			return true;
> +		}
> +		if (idmap[i].old == old_id)
> +			return idmap[i].cur == cur_id;
> +		if (idmap[i].cur == cur_id)
> +			return false;

I think this should just be added to existing check_ids(); I think
it's a bug that we don't check this condition today in check_ids().

But I'd say let's land the fixes you have right now, and then work on
fixing and optimizing scalar ID checks separately. We are doing too
many things at the same time :(

> +	}
> +	/* We ran out of idmap slots, which should be impossible */
> +	WARN_ON_ONCE(1);
> +	return false;
> +}

[...]
On Thu, 2023-06-08 at 15:37 -0700, Andrii Nakryiko wrote:
> On Thu, Jun 8, 2023 at 1:58 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
[...]
> no, I actually meant what I wrote, but I didn't realize that
> check_ids() is kind of broken... Because it shouldn't allow the same
> ID from cur state to be mapped to two different IDs in old state,
> should it?

IDs are used for several things, and it looks like the answer might
vary. For example, I looked at mark_ptr_or_null_regs():
- it is called when a conditional of the form (ptr == NULL) is checked;
- it marks every register holding a pointer with the same ID as ptr as
  null/non-null;
- when a register is marked non-null, its ID is removed (not for locks,
  but ignore that for now).

Assume r6 and r7 are both PTR_MAYBE_NULL and ID assignments look as
follows:

           old   cur
    r6.id  1     3
    r7.id  2     3

'old' is safe, which means the following:
- either r6 was not accessed, or the access was guarded by (r6 == NULL);
- either r7 was not accessed, or the access was guarded by (r7 == NULL).

In both cases it should be ok if r6 and r7 are in fact the same
pointer. It would be checked to be not NULL two times, but that's fine.

So, I'd say that 'cur' is a special case of 'old' and check_ids() is
correct for it. But this is the same argument I used for scalars, and
you were not convinced :) We need to examine each use case carefully.

> > > (1)        old   cur
> > >     r6.id  0/2   1
> > >     r7.id  0/3   1     check_ids returns true
>
> I think this should be rejected.

That's what we agreed upon when we decided not to do !rold->id, so yes.

> > > (2)        old   cur
> > >     r6.id  1     0/2
> > >     r7.id  1     0/3   check_ids returns false
>
> And this should be rejected.

For sure.

> > > Also, (1) is no different from (3):
> > >
> > > (3)        old   cur
> > >     r6.id  1     3
> > >     r7.id  2     3     check_ids returns true
>
> And this definitely should be rejected.

Same as (1).

> The only situation that might not be rejected would be:
>
>            old   cur
>     r6.id  0/1   3
>     r7.id  0/2   4
>
> And perhaps this one is ok as well?
>
>            old   cur
>     r6.id  3     0/1
>     r7.id  4     0/2

I think these two should be accepted.

[...]

> I think this should just be added to existing check_ids(); I think
> it's a bug that we don't check this condition today in check_ids().
>
> But I'd say let's land the fixes you have right now, and then work on
> fixing and optimizing scalar ID checks separately. We are doing too
> many things at the same time :(

Ok, will post V4 with these changes and examine other cases later.
Thanks again for the discussion.

[...]
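For orientation, the mark_ptr_or_null_regs() behavior Eduard walks
through above has roughly this shape at the base of this series (a
sketch for context, not part of the patch; see kernel/bpf/verifier.c
for the authoritative version):

static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
				  bool is_null)
{
	struct bpf_func_state *state = vstate->frame[vstate->curframe];
	struct bpf_reg_state *regs = state->regs, *reg = &regs[regno];
	u32 ref_obj_id = reg->ref_obj_id;
	u32 id = reg->id;

	if (ref_obj_id && ref_obj_id == id && is_null)
		/* regs[regno] is in the " == NULL" branch.
		 * No one could have freed the reference state before
		 * doing the NULL check.
		 */
		WARN_ON_ONCE(release_reference_state(state, id));

	/* Propagate the null/non-null fact to every register sharing this
	 * ID; mark_ptr_or_null_reg() also clears reg->id on the non-null
	 * branch (except for lock-protected pointers, as noted above).
	 */
	bpf_for_each_reg_in_vstate(vstate, state, reg, ({
		mark_ptr_or_null_reg(state, reg, id, is_null);
	}));
}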
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2aa60b73f1b5..175ca22b868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12933,12 +12933,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	if (BPF_SRC(insn->code) == BPF_X) {
 		struct bpf_reg_state *src_reg = regs + insn->src_reg;
 		struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
+		bool need_id = (src_reg->type == SCALAR_VALUE && !src_reg->id &&
+				!tnum_is_const(src_reg->var_off));
 
 		if (BPF_CLASS(insn->code) == BPF_ALU64) {
 			/* case: R1 = R2
 			 * copy register state to dest reg
 			 */
-			if (src_reg->type == SCALAR_VALUE && !src_reg->id)
+			if (need_id)
 				/* Assign src and dst registers the same ID
 				 * that will be used by find_equal_scalars()
 				 * to propagate min/max range.
@@ -12957,7 +12959,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 			} else if (src_reg->type == SCALAR_VALUE) {
 				bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
 
-				if (is_src_reg_u32 && !src_reg->id)
+				if (is_src_reg_u32 && need_id)
 					src_reg->id = ++env->id_gen;
 				copy_register_state(dst_reg, src_reg);
 				/* Make sure ID is cleared if src_reg is not in u32 range otherwise
@@ -15289,9 +15291,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 			return false;
 		if (!rold->precise)
 			return true;
-		/* new val must satisfy old val knowledge */
+		/* Why check_ids() for scalar registers?
+		 *
+		 * Consider the following BPF code:
+		 *   1: r6 = ... unbound scalar, ID=a ...
+		 *   2: r7 = ... unbound scalar, ID=b ...
+		 *   3: if (r6 > r7) goto +1
+		 *   4: r6 = r7
+		 *   5: if (r6 > X) goto ...
+		 *   6: ... memory operation using r7 ...
+		 *
+		 * First verification path is [1-6]:
+		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
+		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
+		 *   r7 <= X, because r6 and r7 share same id.
+		 * Next verification path is [1-4, 6].
+		 *
+		 * Instruction (6) would be reached in two states:
+		 *   I.  r6{.id=b}, r7{.id=b} via path 1-6;
+		 *   II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
+		 *
+		 * Use check_ids() to distinguish these states.
+		 * ---
+		 * Also verify that new value satisfies old value range knowledge.
+		 */
 		return range_within(rold, rcur) &&
-		       tnum_in(rold->var_off, rcur->var_off);
+		       tnum_in(rold->var_off, rcur->var_off) &&
+		       check_ids(rold->id, rcur->id, idmap);
 	case PTR_TO_MAP_KEY:
 	case PTR_TO_MAP_VALUE:
 	case PTR_TO_MEM:
diff --git a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
index 5a14498d352f..bb3cd14bb3a1 100644
--- a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
+++ b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
@@ -271,7 +271,7 @@ l2_%=:	r0 = 0;						\
 
 SEC("socket")
 __description("allocated_stack")
-__success __msg("processed 15 insns")
+__success __msg("processed 16 insns")
 __success_unpriv __msg_unpriv("")
 __log_level(1) __retval(0)
 __naked void allocated_stack(void)
 {
@@ -279,6 +279,7 @@ __naked void allocated_stack(void)
 	r6 = r1;					\
 	call %[bpf_get_prandom_u32];			\
 	r7 = r0;					\
+	call %[bpf_get_prandom_u32];			\
 	if r0 == 0 goto l0_%=;				\
 	r0 = 0;						\
 	*(u64*)(r10 - 8) = r6;				\
Make sure that the following unsafe example is rejected by verifier:

1: r9 = ... some pointer with range X ...
2: r6 = ... unbound scalar ID=a ...
3: r7 = ... unbound scalar ID=b ...
4: if (r6 > r7) goto +1
5: r6 = r7
6: if (r6 > X) goto ...
--- checkpoint ---
7: r9 += r7
8: *(u64 *)r9 = Y

This example is unsafe because not all execution paths verify r7 range.
Because of the jump at (4) the verifier would arrive at (6) in two states:
I.  r6{.id=b}, r7{.id=b} via path 1-6;
II. r6{.id=a}, r7{.id=b} via path 1-4, 6.

Currently regsafe() does not call check_ids() for scalar registers,
thus from POV of regsafe() states (I) and (II) are identical. If the
path 1-6 is taken by verifier first, and checkpoint is created at (6)
the path [1-4, 6] would be considered safe.

This commit updates regsafe() to call check_ids() for precise scalar
registers.

To minimize the impact on verification performance, avoid generating
bpf_reg_state::id for constant scalar values when processing BPF_MOV
in check_alu_op(). Scalar IDs are utilized by find_equal_scalars() to
propagate information about value ranges for registers that hold the
same value. However, there is no need to propagate range information
for constants.

Still, there is some performance impact because of this change.
Using veristat to compare number of processed states for selftests
object files listed in tools/testing/selftests/bpf/veristat.cfg and
Cilium object files from [1] gives the following statistics:

$ ./veristat -e file,prog,states -f "states_pct>10" \
    -C master-baseline.log current.log
File         Program                         States (DIFF)
-----------  ------------------------------  --------------
bpf_xdp.o    tail_handle_nat_fwd_ipv6        +155 (+23.92%)
bpf_xdp.o    tail_nodeport_nat_ingress_ipv4  +102 (+27.20%)
bpf_xdp.o    tail_rev_nodeport_lb4            +83 (+20.85%)
loop6.bpf.o  trace_virtqueue_add_sgs          +25 (+11.06%)

Also test case verifier_search_pruning/allocated_stack has to be
updated to avoid conflicts in register ID assignments between cached
and new states.

[1] git@github.com:anakryiko/cilium.git

Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c                         | 34 ++++++++++++++++---
 .../bpf/progs/verifier_search_pruning.c       |  3 +-
 2 files changed, 32 insertions(+), 5 deletions(-)
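For context on the commit message above: the range propagation that
scalar IDs exist for is performed by find_equal_scalars(). At the base
of this series it looks roughly as follows (shown for orientation, not
part of the patch); it copies a newly learned range to every scalar
register and stack slot sharing the same ID:

static void find_equal_scalars(struct bpf_verifier_state *vstate,
			       struct bpf_reg_state *known_reg)
{
	struct bpf_func_state *state;
	struct bpf_reg_state *reg;

	bpf_for_each_reg_in_vstate(vstate, state, reg, ({
		if (reg->type == SCALAR_VALUE && reg->id == known_reg->id)
			copy_register_state(reg, known_reg);
	}));
}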