Message ID | 20230101083403.332783-2-memxor@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | Dynptr fixes | expand |
On Sun, 2023-01-01 at 14:03 +0530, Kumar Kartikeya Dwivedi wrote: > The root of the problem is missing liveness marking for STACK_DYNPTR > slots. This leads to all kinds of problems inside stacksafe. > > The verifier by default inside stacksafe ignores spilled_ptr in stack > slots which do not have REG_LIVE_READ marks. Since this is being checked > in the 'old' explored state, it must have already done clean_live_states > for this old bpf_func_state. Hence, it won't be receiving any more > liveness marks from to be explored insns (it has received REG_LIVE_DONE > marking from liveness point of view). > > What this means is that verifier considers that it's safe to not compare > the stack slot if was never read by children states. While liveness > marks are usually propagated correctly following the parentage chain for > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > case for STACK_DYNPTR. > > clean_live_states hence simply rewrites these stack slots to the type > STACK_INVALID since it sees no REG_LIVE_READ marks. > > The end result is that we will never see STACK_DYNPTR slots in explored > state. Even if verifier was conservatively matching !REG_LIVE_READ > slots, very next check continuing the stacksafe loop on seeing > STACK_INVALID would again prevent further checks. > > Now as long as verifier stores an explored state which we can compare to > when reaching a pruning point, we can abuse this bug to make verifier > prune search for obviously unsafe paths using STACK_DYNPTR slots > thinking they are never used hence safe. > > Doing this in unprivileged mode is a bit challenging. add_new_state is > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > or when jmps_processed difference is >= 2 and insn_processed difference > is >= 8. So coming up with the unprivileged case requires a little more > work, but it is still totally possible. The test case being discussed > below triggers the heuristic even in unprivileged mode. > > However, it no longer works since commit > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > Let's try to study the test step by step. > > Consider the following program (C style BPF ASM): > > 0 r0 = 0; > 1 r6 = &ringbuf_map; > 3 r1 = r6; > 4 r2 = 8; > 5 r3 = 0; > 6 r4 = r10; > 7 r4 -= -16; > 8 call bpf_ringbuf_reserve_dynptr; > 9 if r0 == 0 goto pc+1; > 10 goto pc+1; > 11 *(r10 - 16) = 0xeB9F; > 12 r1 = r10; > 13 r1 -= -16; > 14 r2 = 0; > 15 call bpf_ringbuf_discard_dynptr; > 16 r0 = 0; > 17 exit; > > We know that insn 12 will be a pruning point, hence if we force > add_new_state for it, it will first verify the following path as > safe in straight line exploration: > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > Then, when we arrive at insn 12 from the following path: > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > We will find a state that has been verified as safe already at insn 12. > Since register state is same at this point, regsafe will pass. Next, in > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > Next, refsafe is also true as reference state is unchanged in both > states. > > The states are considered equivalent and search is pruned. > > Hence, we are able to construct a dynptr with arbitrary contents and use > the dynptr API to operate on this arbitrary pointer and arbitrary size + > offset. > > To fix this, first define a mark_dynptr_read function that propagates > liveness marks whenever a valid initialized dynptr is accessed by dynptr > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > screening off mark_reg_read and not propagating marks upwards from that > point. > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > none, so clean_live_states either sets both slots to STACK_INVALID or > none of them. This is the invariant the checks inside stacksafe rely on. > > Next, do a complete comparison of both stack slots whenever they have > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > also whether both form the same first_slot. Only then is the later path > safe. > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > --- > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 73 insertions(+) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 4a25375ebb0d..f7248235e119 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > } > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > + > return 0; > } > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > + > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > + * > + * While we don't allow reading STACK_INVALID, it is still possible to > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > + * helpers or insns can do partial read of that part without failing, > + * but check_stack_range_initialized, check_stack_read_var_off, and > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > + * the slot conservatively. Hence we need to screen off those liveness > + * marking walks. > + * > + * This was not a problem before because STACK_INVALID is only set by > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > + * during verifier state exploration. Hence, for this case parentage > + * chain will still be live, while earlier reg->parent was NULL, so we > + * need REG_LIVE_WRITTEN to screen off read marker propagation. > + */ > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > + This is purely to assist with verifier state pruning and does not affect correctness, right? Commenting the lines does not seem to fail any tests, maybe add one matching some "77 safe: ..." jump in the log? > return 0; > } > > @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, > return 0; > } > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > +{ > + struct bpf_func_state *state = func(env, reg); > + int spi, ret; > + > + /* For CONST_PTR_TO_DYNPTR, it must have already been done by > + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in > + * check_kfunc_call. > + */ > + if (reg->type == CONST_PTR_TO_DYNPTR) > + return 0; > + spi = get_spi(reg->off); > + /* Caller ensures dynptr is valid and initialized, which means spi is in > + * bounds and spi is the first dynptr slot. Simply mark stack slot as > + * read. > + */ > + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, > + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); > + if (ret) > + return ret; > + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, > + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); > +} > + > /* This function is supposed to be used by the following 32-bit optimization > * code only. It returns TRUE if the source or destination register operates > * on 64-bit, otherwise return FALSE. > @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) > { > struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; > + int err; > > /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an > * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): > @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > err_extra, regno); > return -EINVAL; > } > + > + err = mark_dynptr_read(env, reg); > + if (err) > + return err; > } > return 0; > } > @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > * return false to continue verification of this path > */ > return false; > + /* Both are same slot_type, but STACK_DYNPTR requires more > + * checks before it can considered safe. > + */ > + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { > + /* If both are STACK_DYNPTR, type must be same */ > + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) > + return false; > + /* Both should also have first slot at same spi */ > + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) > + return false; > + /* ids should be same */ > + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) > + return false; > + if (old->stack[spi].spilled_ptr.ref_obj_id && > + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, > + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) > + return false; > + WARN_ON_ONCE(i % BPF_REG_SIZE); > + i += BPF_REG_SIZE - 1; > + continue; > + } Nitpick: maybe move the checks above inside regsafe() as all conditions operate on old/cur->stack[spi].spilled_ptr ? Acked-by: Eduard Zingerman <eddyz@gmail.com> > if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) > continue; > if (!is_spilled_reg(&old->stack[spi]))
On Sun, Jan 1, 2023 at 12:34 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > The root of the problem is missing liveness marking for STACK_DYNPTR > slots. This leads to all kinds of problems inside stacksafe. > > The verifier by default inside stacksafe ignores spilled_ptr in stack > slots which do not have REG_LIVE_READ marks. Since this is being checked > in the 'old' explored state, it must have already done clean_live_states > for this old bpf_func_state. Hence, it won't be receiving any more > liveness marks from to be explored insns (it has received REG_LIVE_DONE > marking from liveness point of view). > > What this means is that verifier considers that it's safe to not compare > the stack slot if was never read by children states. While liveness > marks are usually propagated correctly following the parentage chain for > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > case for STACK_DYNPTR. > > clean_live_states hence simply rewrites these stack slots to the type > STACK_INVALID since it sees no REG_LIVE_READ marks. > > The end result is that we will never see STACK_DYNPTR slots in explored > state. Even if verifier was conservatively matching !REG_LIVE_READ > slots, very next check continuing the stacksafe loop on seeing > STACK_INVALID would again prevent further checks. > > Now as long as verifier stores an explored state which we can compare to > when reaching a pruning point, we can abuse this bug to make verifier > prune search for obviously unsafe paths using STACK_DYNPTR slots > thinking they are never used hence safe. > > Doing this in unprivileged mode is a bit challenging. add_new_state is > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > or when jmps_processed difference is >= 2 and insn_processed difference > is >= 8. So coming up with the unprivileged case requires a little more > work, but it is still totally possible. The test case being discussed > below triggers the heuristic even in unprivileged mode. > > However, it no longer works since commit > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > Let's try to study the test step by step. > > Consider the following program (C style BPF ASM): > > 0 r0 = 0; > 1 r6 = &ringbuf_map; > 3 r1 = r6; > 4 r2 = 8; > 5 r3 = 0; > 6 r4 = r10; > 7 r4 -= -16; > 8 call bpf_ringbuf_reserve_dynptr; > 9 if r0 == 0 goto pc+1; > 10 goto pc+1; > 11 *(r10 - 16) = 0xeB9F; > 12 r1 = r10; > 13 r1 -= -16; > 14 r2 = 0; > 15 call bpf_ringbuf_discard_dynptr; > 16 r0 = 0; > 17 exit; > > We know that insn 12 will be a pruning point, hence if we force > add_new_state for it, it will first verify the following path as > safe in straight line exploration: > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > Then, when we arrive at insn 12 from the following path: > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > We will find a state that has been verified as safe already at insn 12. > Since register state is same at this point, regsafe will pass. Next, in > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > Next, refsafe is also true as reference state is unchanged in both > states. > > The states are considered equivalent and search is pruned. > > Hence, we are able to construct a dynptr with arbitrary contents and use > the dynptr API to operate on this arbitrary pointer and arbitrary size + > offset. > > To fix this, first define a mark_dynptr_read function that propagates > liveness marks whenever a valid initialized dynptr is accessed by dynptr > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > screening off mark_reg_read and not propagating marks upwards from that > point. > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > none, so clean_live_states either sets both slots to STACK_INVALID or > none of them. This is the invariant the checks inside stacksafe rely on. > > Next, do a complete comparison of both stack slots whenever they have > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > also whether both form the same first_slot. Only then is the later path > safe. > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > --- > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 73 insertions(+) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 4a25375ebb0d..f7248235e119 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > } > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > + > return 0; > } > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > + > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > + * > + * While we don't allow reading STACK_INVALID, it is still possible to > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > + * helpers or insns can do partial read of that part without failing, > + * but check_stack_range_initialized, check_stack_read_var_off, and > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > + * the slot conservatively. Hence we need to screen off those liveness > + * marking walks. > + * > + * This was not a problem before because STACK_INVALID is only set by > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > + * during verifier state exploration. Hence, for this case parentage > + * chain will still be live, while earlier reg->parent was NULL, so we > + * need REG_LIVE_WRITTEN to screen off read marker propagation. > + */ > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > + > return 0; > } > > @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, > return 0; > } > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > +{ > + struct bpf_func_state *state = func(env, reg); > + int spi, ret; > + > + /* For CONST_PTR_TO_DYNPTR, it must have already been done by > + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in > + * check_kfunc_call. > + */ > + if (reg->type == CONST_PTR_TO_DYNPTR) > + return 0; > + spi = get_spi(reg->off); > + /* Caller ensures dynptr is valid and initialized, which means spi is in > + * bounds and spi is the first dynptr slot. Simply mark stack slot as > + * read. > + */ > + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, > + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); > + if (ret) > + return ret; > + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, > + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); > +} > + > /* This function is supposed to be used by the following 32-bit optimization > * code only. It returns TRUE if the source or destination register operates > * on 64-bit, otherwise return FALSE. > @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) > { > struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; > + int err; > > /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an > * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): > @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > err_extra, regno); > return -EINVAL; > } > + > + err = mark_dynptr_read(env, reg); > + if (err) > + return err; > } > return 0; > } > @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > * return false to continue verification of this path > */ > return false; > + /* Both are same slot_type, but STACK_DYNPTR requires more > + * checks before it can considered safe. > + */ > + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { how about moving this check right after `if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1)` ? Then we can actually generalize this to a switch to handle STACK_SPILL and STACK_DYNPTR separately. I'm adding STACK_ITER in my upcoming patch set, so this will have all the things ready for that? switch (old->stack[spi].slot_type[BPF_REG_SIZE - 1]) { case STACK_SPILL: if (!regsafe(...)) return false; break; case STACK_DYNPTR: ... break; /* and then eventually */ case STACK_ITER: ... WDYT? > + /* If both are STACK_DYNPTR, type must be same */ > + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) struct bpf_reg_state *old_reg, *cur_reg; old_reg = &old->stack[spi].spilled_ptr; cur_reg = &cur->stack[spi].spilled_ptr; and then use old_reg and cur_reg in one simple if here's how I have it locally: case STACK_DYNPTR: old_reg = &old->stack[spi].spilled_ptr; cur_reg = &cur->stack[spi].spilled_ptr; if (old_reg->dynptr.type != cur_reg->dynptr.type || old_reg->dynptr.first_slot != cur_reg->dynptr.first_slot || !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) return false; break; seems a bit cleaner? I'm also thinking of getting rid of first_slot field and instead have a rule that first slot has proper type set, but the next one has BPF_DYNPTR_TYPE_INVALID as type. This should simplify things a bit, I think. At least it seems that way for STACK_ITER state I'm adding. But that's a separate refactoring, probably. > + return false; > + /* Both should also have first slot at same spi */ > + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) > + return false; > + /* ids should be same */ > + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) > + return false; > + if (old->stack[spi].spilled_ptr.ref_obj_id && > + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, > + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) my previous change to tech check_ids to enforce that either id have to be zeroes or non-zeroes at the same time already landed, so you don't need to check `old->stack[spi].spilled_ptr.ref_obj_id`. Even more, it seems wrong to do this check like this, because if cur has ref_obj_id set we'll ignore it, right? > + return false; > + WARN_ON_ONCE(i % BPF_REG_SIZE); > + i += BPF_REG_SIZE - 1; > + continue; > + } > if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) > continue; > if (!is_spilled_reg(&old->stack[spi])) > -- > 2.39.0 >
On Sun, Jan 1, 2023 at 12:34 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > The root of the problem is missing liveness marking for STACK_DYNPTR > slots. This leads to all kinds of problems inside stacksafe. > > The verifier by default inside stacksafe ignores spilled_ptr in stack > slots which do not have REG_LIVE_READ marks. Since this is being checked > in the 'old' explored state, it must have already done clean_live_states > for this old bpf_func_state. Hence, it won't be receiving any more > liveness marks from to be explored insns (it has received REG_LIVE_DONE > marking from liveness point of view). > > What this means is that verifier considers that it's safe to not compare > the stack slot if was never read by children states. While liveness > marks are usually propagated correctly following the parentage chain for > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > case for STACK_DYNPTR. > > clean_live_states hence simply rewrites these stack slots to the type > STACK_INVALID since it sees no REG_LIVE_READ marks. > > The end result is that we will never see STACK_DYNPTR slots in explored > state. Even if verifier was conservatively matching !REG_LIVE_READ > slots, very next check continuing the stacksafe loop on seeing > STACK_INVALID would again prevent further checks. > > Now as long as verifier stores an explored state which we can compare to > when reaching a pruning point, we can abuse this bug to make verifier > prune search for obviously unsafe paths using STACK_DYNPTR slots > thinking they are never used hence safe. > > Doing this in unprivileged mode is a bit challenging. add_new_state is > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > or when jmps_processed difference is >= 2 and insn_processed difference > is >= 8. So coming up with the unprivileged case requires a little more > work, but it is still totally possible. The test case being discussed > below triggers the heuristic even in unprivileged mode. > > However, it no longer works since commit > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > Let's try to study the test step by step. > > Consider the following program (C style BPF ASM): > > 0 r0 = 0; > 1 r6 = &ringbuf_map; > 3 r1 = r6; > 4 r2 = 8; > 5 r3 = 0; > 6 r4 = r10; > 7 r4 -= -16; > 8 call bpf_ringbuf_reserve_dynptr; > 9 if r0 == 0 goto pc+1; > 10 goto pc+1; > 11 *(r10 - 16) = 0xeB9F; > 12 r1 = r10; > 13 r1 -= -16; > 14 r2 = 0; > 15 call bpf_ringbuf_discard_dynptr; > 16 r0 = 0; > 17 exit; > > We know that insn 12 will be a pruning point, hence if we force > add_new_state for it, it will first verify the following path as > safe in straight line exploration: > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > Then, when we arrive at insn 12 from the following path: > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > We will find a state that has been verified as safe already at insn 12. > Since register state is same at this point, regsafe will pass. Next, in > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > Next, refsafe is also true as reference state is unchanged in both > states. > > The states are considered equivalent and search is pruned. > > Hence, we are able to construct a dynptr with arbitrary contents and use > the dynptr API to operate on this arbitrary pointer and arbitrary size + > offset. > > To fix this, first define a mark_dynptr_read function that propagates > liveness marks whenever a valid initialized dynptr is accessed by dynptr > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > screening off mark_reg_read and not propagating marks upwards from that > point. > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > none, so clean_live_states either sets both slots to STACK_INVALID or > none of them. This is the invariant the checks inside stacksafe rely on. > > Next, do a complete comparison of both stack slots whenever they have > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > also whether both form the same first_slot. Only then is the later path > safe. > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > --- > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 73 insertions(+) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 4a25375ebb0d..f7248235e119 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > } > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > + > return 0; > } > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > + > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > + * > + * While we don't allow reading STACK_INVALID, it is still possible to > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > + * helpers or insns can do partial read of that part without failing, > + * but check_stack_range_initialized, check_stack_read_var_off, and > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > + * the slot conservatively. Hence we need to screen off those liveness > + * marking walks. > + * > + * This was not a problem before because STACK_INVALID is only set by > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > + * during verifier state exploration. Hence, for this case parentage Where does it get set randomly during verifier state exploration for this case? > + * chain will still be live, while earlier reg->parent was NULL, so we What does "live" in "parentage chain will still be live" here mean? what does "earlier" in "earlier reg->parent" refer to here, and why was the earlier reg->parent NULL? > + * need REG_LIVE_WRITTEN to screen off read marker propagation. > + */ > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > + > return 0; > } > > @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, > return 0; > } > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > +{ > + struct bpf_func_state *state = func(env, reg); > + int spi, ret; > + > + /* For CONST_PTR_TO_DYNPTR, it must have already been done by > + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in > + * check_kfunc_call. > + */ > + if (reg->type == CONST_PTR_TO_DYNPTR) > + return 0; > + spi = get_spi(reg->off); > + /* Caller ensures dynptr is valid and initialized, which means spi is in > + * bounds and spi is the first dynptr slot. Simply mark stack slot as > + * read. > + */ > + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, > + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); > + if (ret) > + return ret; > + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, > + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); > +} > + > /* This function is supposed to be used by the following 32-bit optimization > * code only. It returns TRUE if the source or destination register operates > * on 64-bit, otherwise return FALSE. > @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) > { > struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; > + int err; > > /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an > * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): > @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > err_extra, regno); > return -EINVAL; > } > + > + err = mark_dynptr_read(env, reg); > + if (err) > + return err; > } > return 0; > } > @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > * return false to continue verification of this path > */ > return false; > + /* Both are same slot_type, but STACK_DYNPTR requires more > + * checks before it can considered safe. > + */ > + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { > + /* If both are STACK_DYNPTR, type must be same */ > + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) > + return false; > + /* Both should also have first slot at same spi */ > + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) > + return false; > + /* ids should be same */ > + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) > + return false; > + if (old->stack[spi].spilled_ptr.ref_obj_id && > + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, > + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) > + return false; > + WARN_ON_ONCE(i % BPF_REG_SIZE); > + i += BPF_REG_SIZE - 1; > + continue; > + } > if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) > continue; > if (!is_spilled_reg(&old->stack[spi])) > -- > 2.39.0 >
On Tue, Jan 03, 2023 at 12:58:40AM IST, Eduard Zingerman wrote: > On Sun, 2023-01-01 at 14:03 +0530, Kumar Kartikeya Dwivedi wrote: > > The root of the problem is missing liveness marking for STACK_DYNPTR > > slots. This leads to all kinds of problems inside stacksafe. > > > > The verifier by default inside stacksafe ignores spilled_ptr in stack > > slots which do not have REG_LIVE_READ marks. Since this is being checked > > in the 'old' explored state, it must have already done clean_live_states > > for this old bpf_func_state. Hence, it won't be receiving any more > > liveness marks from to be explored insns (it has received REG_LIVE_DONE > > marking from liveness point of view). > > > > What this means is that verifier considers that it's safe to not compare > > the stack slot if was never read by children states. While liveness > > marks are usually propagated correctly following the parentage chain for > > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > > case for STACK_DYNPTR. > > > > clean_live_states hence simply rewrites these stack slots to the type > > STACK_INVALID since it sees no REG_LIVE_READ marks. > > > > The end result is that we will never see STACK_DYNPTR slots in explored > > state. Even if verifier was conservatively matching !REG_LIVE_READ > > slots, very next check continuing the stacksafe loop on seeing > > STACK_INVALID would again prevent further checks. > > > > Now as long as verifier stores an explored state which we can compare to > > when reaching a pruning point, we can abuse this bug to make verifier > > prune search for obviously unsafe paths using STACK_DYNPTR slots > > thinking they are never used hence safe. > > > > Doing this in unprivileged mode is a bit challenging. add_new_state is > > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > > or when jmps_processed difference is >= 2 and insn_processed difference > > is >= 8. So coming up with the unprivileged case requires a little more > > work, but it is still totally possible. The test case being discussed > > below triggers the heuristic even in unprivileged mode. > > > > However, it no longer works since commit > > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > > > Let's try to study the test step by step. > > > > Consider the following program (C style BPF ASM): > > > > 0 r0 = 0; > > 1 r6 = &ringbuf_map; > > 3 r1 = r6; > > 4 r2 = 8; > > 5 r3 = 0; > > 6 r4 = r10; > > 7 r4 -= -16; > > 8 call bpf_ringbuf_reserve_dynptr; > > 9 if r0 == 0 goto pc+1; > > 10 goto pc+1; > > 11 *(r10 - 16) = 0xeB9F; > > 12 r1 = r10; > > 13 r1 -= -16; > > 14 r2 = 0; > > 15 call bpf_ringbuf_discard_dynptr; > > 16 r0 = 0; > > 17 exit; > > > > We know that insn 12 will be a pruning point, hence if we force > > add_new_state for it, it will first verify the following path as > > safe in straight line exploration: > > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > > > Then, when we arrive at insn 12 from the following path: > > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > > > We will find a state that has been verified as safe already at insn 12. > > Since register state is same at this point, regsafe will pass. Next, in > > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > > Next, refsafe is also true as reference state is unchanged in both > > states. > > > > The states are considered equivalent and search is pruned. > > > > Hence, we are able to construct a dynptr with arbitrary contents and use > > the dynptr API to operate on this arbitrary pointer and arbitrary size + > > offset. > > > > To fix this, first define a mark_dynptr_read function that propagates > > liveness marks whenever a valid initialized dynptr is accessed by dynptr > > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > > screening off mark_reg_read and not propagating marks upwards from that > > point. > > > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > > none, so clean_live_states either sets both slots to STACK_INVALID or > > none of them. This is the invariant the checks inside stacksafe rely on. > > > > Next, do a complete comparison of both stack slots whenever they have > > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > > also whether both form the same first_slot. Only then is the later path > > safe. > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > --- > > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 73 insertions(+) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 4a25375ebb0d..f7248235e119 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > > } > > > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + > > return 0; > > } > > > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > > + > > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > > + * > > + * While we don't allow reading STACK_INVALID, it is still possible to > > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > > + * helpers or insns can do partial read of that part without failing, > > + * but check_stack_range_initialized, check_stack_read_var_off, and > > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > > + * the slot conservatively. Hence we need to screen off those liveness > > + * marking walks. > > + * > > + * This was not a problem before because STACK_INVALID is only set by > > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > > + * during verifier state exploration. Hence, for this case parentage > > + * chain will still be live, while earlier reg->parent was NULL, so we > > + * need REG_LIVE_WRITTEN to screen off read marker propagation. > > + */ > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + > > This is purely to assist with verifier state pruning and does not > affect correctness, right? Yes, it should not affect correctness (to the best of my knowledge). > Commenting the lines does not seem to fail any tests, maybe add one > matching some "77 safe: ..." jump in the log? > I will, thanks. > > return 0; > > } > > > > @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, > > return 0; > > } > > > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > +{ > > + struct bpf_func_state *state = func(env, reg); > > + int spi, ret; > > + > > + /* For CONST_PTR_TO_DYNPTR, it must have already been done by > > + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in > > + * check_kfunc_call. > > + */ > > + if (reg->type == CONST_PTR_TO_DYNPTR) > > + return 0; > > + spi = get_spi(reg->off); > > + /* Caller ensures dynptr is valid and initialized, which means spi is in > > + * bounds and spi is the first dynptr slot. Simply mark stack slot as > > + * read. > > + */ > > + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, > > + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); > > + if (ret) > > + return ret; > > + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, > > + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); > > +} > > + > > /* This function is supposed to be used by the following 32-bit optimization > > * code only. It returns TRUE if the source or destination register operates > > * on 64-bit, otherwise return FALSE. > > @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > > enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) > > { > > struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; > > + int err; > > > > /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an > > * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): > > @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > > err_extra, regno); > > return -EINVAL; > > } > > + > > + err = mark_dynptr_read(env, reg); > > + if (err) > > + return err; > > } > > return 0; > > } > > @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > > * return false to continue verification of this path > > */ > > return false; > > + /* Both are same slot_type, but STACK_DYNPTR requires more > > + * checks before it can considered safe. > > + */ > > + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { > > + /* If both are STACK_DYNPTR, type must be same */ > > + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) > > + return false; > > + /* Both should also have first slot at same spi */ > > + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) > > + return false; > > + /* ids should be same */ > > + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) > > + return false; > > + if (old->stack[spi].spilled_ptr.ref_obj_id && > > + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, > > + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) > > + return false; > > + WARN_ON_ONCE(i % BPF_REG_SIZE); > > + i += BPF_REG_SIZE - 1; > > + continue; > > + } > > Nitpick: maybe move the checks above inside regsafe() as all > conditions operate on old/cur->stack[spi].spilled_ptr ? Good suggestion, but may need to tweak the condition that falls through to regsafe for is_spilled_reg, and include STACK_DYNPTR there. I'll check Andrii's comments as well and see how the end result looks. > > Acked-by: Eduard Zingerman <eddyz@gmail.com> > > > if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) > > continue; > > if (!is_spilled_reg(&old->stack[spi])) >
On Thu, Jan 05, 2023 at 03:54:03AM IST, Andrii Nakryiko wrote: > On Sun, Jan 1, 2023 at 12:34 AM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > The root of the problem is missing liveness marking for STACK_DYNPTR > > slots. This leads to all kinds of problems inside stacksafe. > > > > The verifier by default inside stacksafe ignores spilled_ptr in stack > > slots which do not have REG_LIVE_READ marks. Since this is being checked > > in the 'old' explored state, it must have already done clean_live_states > > for this old bpf_func_state. Hence, it won't be receiving any more > > liveness marks from to be explored insns (it has received REG_LIVE_DONE > > marking from liveness point of view). > > > > What this means is that verifier considers that it's safe to not compare > > the stack slot if was never read by children states. While liveness > > marks are usually propagated correctly following the parentage chain for > > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > > case for STACK_DYNPTR. > > > > clean_live_states hence simply rewrites these stack slots to the type > > STACK_INVALID since it sees no REG_LIVE_READ marks. > > > > The end result is that we will never see STACK_DYNPTR slots in explored > > state. Even if verifier was conservatively matching !REG_LIVE_READ > > slots, very next check continuing the stacksafe loop on seeing > > STACK_INVALID would again prevent further checks. > > > > Now as long as verifier stores an explored state which we can compare to > > when reaching a pruning point, we can abuse this bug to make verifier > > prune search for obviously unsafe paths using STACK_DYNPTR slots > > thinking they are never used hence safe. > > > > Doing this in unprivileged mode is a bit challenging. add_new_state is > > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > > or when jmps_processed difference is >= 2 and insn_processed difference > > is >= 8. So coming up with the unprivileged case requires a little more > > work, but it is still totally possible. The test case being discussed > > below triggers the heuristic even in unprivileged mode. > > > > However, it no longer works since commit > > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > > > Let's try to study the test step by step. > > > > Consider the following program (C style BPF ASM): > > > > 0 r0 = 0; > > 1 r6 = &ringbuf_map; > > 3 r1 = r6; > > 4 r2 = 8; > > 5 r3 = 0; > > 6 r4 = r10; > > 7 r4 -= -16; > > 8 call bpf_ringbuf_reserve_dynptr; > > 9 if r0 == 0 goto pc+1; > > 10 goto pc+1; > > 11 *(r10 - 16) = 0xeB9F; > > 12 r1 = r10; > > 13 r1 -= -16; > > 14 r2 = 0; > > 15 call bpf_ringbuf_discard_dynptr; > > 16 r0 = 0; > > 17 exit; > > > > We know that insn 12 will be a pruning point, hence if we force > > add_new_state for it, it will first verify the following path as > > safe in straight line exploration: > > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > > > Then, when we arrive at insn 12 from the following path: > > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > > > We will find a state that has been verified as safe already at insn 12. > > Since register state is same at this point, regsafe will pass. Next, in > > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > > Next, refsafe is also true as reference state is unchanged in both > > states. > > > > The states are considered equivalent and search is pruned. > > > > Hence, we are able to construct a dynptr with arbitrary contents and use > > the dynptr API to operate on this arbitrary pointer and arbitrary size + > > offset. > > > > To fix this, first define a mark_dynptr_read function that propagates > > liveness marks whenever a valid initialized dynptr is accessed by dynptr > > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > > screening off mark_reg_read and not propagating marks upwards from that > > point. > > > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > > none, so clean_live_states either sets both slots to STACK_INVALID or > > none of them. This is the invariant the checks inside stacksafe rely on. > > > > Next, do a complete comparison of both stack slots whenever they have > > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > > also whether both form the same first_slot. Only then is the later path > > safe. > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > --- > > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 73 insertions(+) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 4a25375ebb0d..f7248235e119 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > > } > > > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + > > return 0; > > } > > > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > > + > > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > > + * > > + * While we don't allow reading STACK_INVALID, it is still possible to > > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > > + * helpers or insns can do partial read of that part without failing, > > + * but check_stack_range_initialized, check_stack_read_var_off, and > > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > > + * the slot conservatively. Hence we need to screen off those liveness > > + * marking walks. > > + * > > + * This was not a problem before because STACK_INVALID is only set by > > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > > + * during verifier state exploration. Hence, for this case parentage > > + * chain will still be live, while earlier reg->parent was NULL, so we > > + * need REG_LIVE_WRITTEN to screen off read marker propagation. > > + */ > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + > > return 0; > > } > > > > @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, > > return 0; > > } > > > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > +{ > > + struct bpf_func_state *state = func(env, reg); > > + int spi, ret; > > + > > + /* For CONST_PTR_TO_DYNPTR, it must have already been done by > > + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in > > + * check_kfunc_call. > > + */ > > + if (reg->type == CONST_PTR_TO_DYNPTR) > > + return 0; > > + spi = get_spi(reg->off); > > + /* Caller ensures dynptr is valid and initialized, which means spi is in > > + * bounds and spi is the first dynptr slot. Simply mark stack slot as > > + * read. > > + */ > > + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, > > + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); > > + if (ret) > > + return ret; > > + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, > > + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); > > +} > > + > > /* This function is supposed to be used by the following 32-bit optimization > > * code only. It returns TRUE if the source or destination register operates > > * on 64-bit, otherwise return FALSE. > > @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > > enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) > > { > > struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; > > + int err; > > > > /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an > > * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): > > @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > > err_extra, regno); > > return -EINVAL; > > } > > + > > + err = mark_dynptr_read(env, reg); > > + if (err) > > + return err; > > } > > return 0; > > } > > @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > > * return false to continue verification of this path > > */ > > return false; > > + /* Both are same slot_type, but STACK_DYNPTR requires more > > + * checks before it can considered safe. > > + */ > > + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { > > how about moving this check right after `if (i % BPF_REG_SIZE != > BPF_REG_SIZE - 1)` ? Then we can actually generalize this to a switch > to handle STACK_SPILL and STACK_DYNPTR separately. I'm adding > STACK_ITER in my upcoming patch set, so this will have all the things > ready for that? > > switch (old->stack[spi].slot_type[BPF_REG_SIZE - 1]) { > case STACK_SPILL: > if (!regsafe(...)) > return false; > break; > case STACK_DYNPTR: > ... > break; > /* and then eventually */ > case STACK_ITER: > ... > > WDYT? > I can do this, it certainly makes sense with your upcoming changes, and it does look cleaner. > > + /* If both are STACK_DYNPTR, type must be same */ > > + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) > > struct bpf_reg_state *old_reg, *cur_reg; > > old_reg = &old->stack[spi].spilled_ptr; > cur_reg = &cur->stack[spi].spilled_ptr; > > and then use old_reg and cur_reg in one simple if > > here's how I have it locally: > > case STACK_DYNPTR: > old_reg = &old->stack[spi].spilled_ptr; > cur_reg = &cur->stack[spi].spilled_ptr; > if (old_reg->dynptr.type != cur_reg->dynptr.type || > old_reg->dynptr.first_slot != > cur_reg->dynptr.first_slot || > !check_ids(old_reg->ref_obj_id, > cur_reg->ref_obj_id, idmap)) > return false; > break; > > seems a bit cleaner? > Yep. > I'm also thinking of getting rid of first_slot field and instead have > a rule that first slot has proper type set, but the next one has > BPF_DYNPTR_TYPE_INVALID as type. This should simplify things a bit, I > think. At least it seems that way for STACK_ITER state I'm adding. But > that's a separate refactoring, probably. > Yeah, I'd rather not mix that into this set. Let me know if you think that's better and I can follow up after the next iteration with that change. > > + return false; > > + /* Both should also have first slot at same spi */ > > + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) > > + return false; > > + /* ids should be same */ > > + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) > > + return false; > > + if (old->stack[spi].spilled_ptr.ref_obj_id && > > + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, > > + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) > > my previous change to tech check_ids to enforce that either id have to > be zeroes or non-zeroes at the same time already landed, so you don't > need to check `old->stack[spi].spilled_ptr.ref_obj_id`. Even more, it > seems wrong to do this check like this, because if cur has ref_obj_id > set we'll ignore it, right? The check before that ensures either both are set or both are unset. If there is a mismatch we return false. I see that check_ids does it now, so yes it wouldn't be needed anymore. I am not sure about the last part, I don't think it will be ignored?
On Fri, Jan 06, 2023 at 05:48:06AM IST, Joanne Koong wrote: > On Sun, Jan 1, 2023 at 12:34 AM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > The root of the problem is missing liveness marking for STACK_DYNPTR > > slots. This leads to all kinds of problems inside stacksafe. > > > > The verifier by default inside stacksafe ignores spilled_ptr in stack > > slots which do not have REG_LIVE_READ marks. Since this is being checked > > in the 'old' explored state, it must have already done clean_live_states > > for this old bpf_func_state. Hence, it won't be receiving any more > > liveness marks from to be explored insns (it has received REG_LIVE_DONE > > marking from liveness point of view). > > > > What this means is that verifier considers that it's safe to not compare > > the stack slot if was never read by children states. While liveness > > marks are usually propagated correctly following the parentage chain for > > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > > case for STACK_DYNPTR. > > > > clean_live_states hence simply rewrites these stack slots to the type > > STACK_INVALID since it sees no REG_LIVE_READ marks. > > > > The end result is that we will never see STACK_DYNPTR slots in explored > > state. Even if verifier was conservatively matching !REG_LIVE_READ > > slots, very next check continuing the stacksafe loop on seeing > > STACK_INVALID would again prevent further checks. > > > > Now as long as verifier stores an explored state which we can compare to > > when reaching a pruning point, we can abuse this bug to make verifier > > prune search for obviously unsafe paths using STACK_DYNPTR slots > > thinking they are never used hence safe. > > > > Doing this in unprivileged mode is a bit challenging. add_new_state is > > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > > or when jmps_processed difference is >= 2 and insn_processed difference > > is >= 8. So coming up with the unprivileged case requires a little more > > work, but it is still totally possible. The test case being discussed > > below triggers the heuristic even in unprivileged mode. > > > > However, it no longer works since commit > > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > > > Let's try to study the test step by step. > > > > Consider the following program (C style BPF ASM): > > > > 0 r0 = 0; > > 1 r6 = &ringbuf_map; > > 3 r1 = r6; > > 4 r2 = 8; > > 5 r3 = 0; > > 6 r4 = r10; > > 7 r4 -= -16; > > 8 call bpf_ringbuf_reserve_dynptr; > > 9 if r0 == 0 goto pc+1; > > 10 goto pc+1; > > 11 *(r10 - 16) = 0xeB9F; > > 12 r1 = r10; > > 13 r1 -= -16; > > 14 r2 = 0; > > 15 call bpf_ringbuf_discard_dynptr; > > 16 r0 = 0; > > 17 exit; > > > > We know that insn 12 will be a pruning point, hence if we force > > add_new_state for it, it will first verify the following path as > > safe in straight line exploration: > > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > > > Then, when we arrive at insn 12 from the following path: > > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > > > We will find a state that has been verified as safe already at insn 12. > > Since register state is same at this point, regsafe will pass. Next, in > > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > > Next, refsafe is also true as reference state is unchanged in both > > states. > > > > The states are considered equivalent and search is pruned. > > > > Hence, we are able to construct a dynptr with arbitrary contents and use > > the dynptr API to operate on this arbitrary pointer and arbitrary size + > > offset. > > > > To fix this, first define a mark_dynptr_read function that propagates > > liveness marks whenever a valid initialized dynptr is accessed by dynptr > > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > > screening off mark_reg_read and not propagating marks upwards from that > > point. > > > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > > none, so clean_live_states either sets both slots to STACK_INVALID or > > none of them. This is the invariant the checks inside stacksafe rely on. > > > > Next, do a complete comparison of both stack slots whenever they have > > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > > also whether both form the same first_slot. Only then is the later path > > safe. > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > --- > > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 73 insertions(+) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 4a25375ebb0d..f7248235e119 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > > } > > > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > + > > return 0; > > } > > > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > > + > > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > > + * > > + * While we don't allow reading STACK_INVALID, it is still possible to > > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > > + * helpers or insns can do partial read of that part without failing, > > + * but check_stack_range_initialized, check_stack_read_var_off, and > > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > > + * the slot conservatively. Hence we need to screen off those liveness > > + * marking walks. > > + * > > + * This was not a problem before because STACK_INVALID is only set by > > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > > + * during verifier state exploration. Hence, for this case parentage > > Where does it get set randomly during verifier state exploration for this case? > When unmarking dynptr slots, we set STACK_INVALID. There are no other instances where a slot is marked STACK_INVALID while verifier is doing symbolic execution. It is either set by default or after checkpointed state won't be receiving liveness marks anymore (in clean_live_states). > > + * chain will still be live, while earlier reg->parent was NULL, so we > > What does "live" in "parentage chain will still be live" here mean? It just means it will probably have a non-NULL reg->parent (which mark_reg_read will follow), by default when STACK_INVALID slot is written to the spilled_ptr will have reg->parent as NULL. > what does "earlier" in "earlier reg->parent" refer to here, and why > was the earlier reg->parent NULL? > Earlier refers to the default case when STACK_INVALID was not being set while unmarking dynptr slots. So any mark_reg_read on spilled_ptr didn't propagate any read marks. Now it can happen, so to make state pruning less conservative we need to be more careful in places where REG_LIVE_WRITTEN needs to be set to stop those register parent chain walks in mark_reg_read. > > [...]
On Mon, Jan 9, 2023 at 3:05 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Thu, Jan 05, 2023 at 03:54:03AM IST, Andrii Nakryiko wrote: > > On Sun, Jan 1, 2023 at 12:34 AM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > > > > The root of the problem is missing liveness marking for STACK_DYNPTR > > > slots. This leads to all kinds of problems inside stacksafe. > > > > > > The verifier by default inside stacksafe ignores spilled_ptr in stack > > > slots which do not have REG_LIVE_READ marks. Since this is being checked > > > in the 'old' explored state, it must have already done clean_live_states > > > for this old bpf_func_state. Hence, it won't be receiving any more > > > liveness marks from to be explored insns (it has received REG_LIVE_DONE > > > marking from liveness point of view). > > > > > > What this means is that verifier considers that it's safe to not compare > > > the stack slot if was never read by children states. While liveness > > > marks are usually propagated correctly following the parentage chain for > > > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the > > > case for STACK_DYNPTR. > > > > > > clean_live_states hence simply rewrites these stack slots to the type > > > STACK_INVALID since it sees no REG_LIVE_READ marks. > > > > > > The end result is that we will never see STACK_DYNPTR slots in explored > > > state. Even if verifier was conservatively matching !REG_LIVE_READ > > > slots, very next check continuing the stacksafe loop on seeing > > > STACK_INVALID would again prevent further checks. > > > > > > Now as long as verifier stores an explored state which we can compare to > > > when reaching a pruning point, we can abuse this bug to make verifier > > > prune search for obviously unsafe paths using STACK_DYNPTR slots > > > thinking they are never used hence safe. > > > > > > Doing this in unprivileged mode is a bit challenging. add_new_state is > > > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) > > > or when jmps_processed difference is >= 2 and insn_processed difference > > > is >= 8. So coming up with the unprivileged case requires a little more > > > work, but it is still totally possible. The test case being discussed > > > below triggers the heuristic even in unprivileged mode. > > > > > > However, it no longer works since commit > > > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). > > > > > > Let's try to study the test step by step. > > > > > > Consider the following program (C style BPF ASM): > > > > > > 0 r0 = 0; > > > 1 r6 = &ringbuf_map; > > > 3 r1 = r6; > > > 4 r2 = 8; > > > 5 r3 = 0; > > > 6 r4 = r10; > > > 7 r4 -= -16; > > > 8 call bpf_ringbuf_reserve_dynptr; > > > 9 if r0 == 0 goto pc+1; > > > 10 goto pc+1; > > > 11 *(r10 - 16) = 0xeB9F; > > > 12 r1 = r10; > > > 13 r1 -= -16; > > > 14 r2 = 0; > > > 15 call bpf_ringbuf_discard_dynptr; > > > 16 r0 = 0; > > > 17 exit; > > > > > > We know that insn 12 will be a pruning point, hence if we force > > > add_new_state for it, it will first verify the following path as > > > safe in straight line exploration: > > > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 > > > > > > Then, when we arrive at insn 12 from the following path: > > > 0 1 3 4 5 6 7 8 9 -> 11 (12) > > > > > > We will find a state that has been verified as safe already at insn 12. > > > Since register state is same at this point, regsafe will pass. Next, in > > > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped > > > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. > > > Next, refsafe is also true as reference state is unchanged in both > > > states. > > > > > > The states are considered equivalent and search is pruned. > > > > > > Hence, we are able to construct a dynptr with arbitrary contents and use > > > the dynptr API to operate on this arbitrary pointer and arbitrary size + > > > offset. > > > > > > To fix this, first define a mark_dynptr_read function that propagates > > > liveness marks whenever a valid initialized dynptr is accessed by dynptr > > > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an > > > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows > > > screening off mark_reg_read and not propagating marks upwards from that > > > point. > > > > > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or > > > none, so clean_live_states either sets both slots to STACK_INVALID or > > > none of them. This is the invariant the checks inside stacksafe rely on. > > > > > > Next, do a complete comparison of both stack slots whenever they have > > > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and > > > also whether both form the same first_slot. Only then is the later path > > > safe. > > > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > --- > > > kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 73 insertions(+) > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > index 4a25375ebb0d..f7248235e119 100644 > > > --- a/kernel/bpf/verifier.c > > > +++ b/kernel/bpf/verifier.c > > > @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > > > state->stack[spi - 1].spilled_ptr.ref_obj_id = id; > > > } > > > > > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > > + > > > return 0; > > > } > > > > > > @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > > > > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > > > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > > > + > > > + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? > > > + * > > > + * While we don't allow reading STACK_INVALID, it is still possible to > > > + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, > > > + * helpers or insns can do partial read of that part without failing, > > > + * but check_stack_range_initialized, check_stack_read_var_off, and > > > + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of > > > + * the slot conservatively. Hence we need to screen off those liveness > > > + * marking walks. > > > + * > > > + * This was not a problem before because STACK_INVALID is only set by > > > + * default, or in clean_live_states after REG_LIVE_DONE, not randomly > > > + * during verifier state exploration. Hence, for this case parentage > > > + * chain will still be live, while earlier reg->parent was NULL, so we > > > + * need REG_LIVE_WRITTEN to screen off read marker propagation. > > > + */ > > > + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > > + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > > + > > > return 0; > > > } > > > > > > @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, > > > return 0; > > > } > > > > > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > > +{ > > > + struct bpf_func_state *state = func(env, reg); > > > + int spi, ret; > > > + > > > + /* For CONST_PTR_TO_DYNPTR, it must have already been done by > > > + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in > > > + * check_kfunc_call. > > > + */ > > > + if (reg->type == CONST_PTR_TO_DYNPTR) > > > + return 0; > > > + spi = get_spi(reg->off); > > > + /* Caller ensures dynptr is valid and initialized, which means spi is in > > > + * bounds and spi is the first dynptr slot. Simply mark stack slot as > > > + * read. > > > + */ > > > + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, > > > + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); > > > + if (ret) > > > + return ret; > > > + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, > > > + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); > > > +} > > > + > > > /* This function is supposed to be used by the following 32-bit optimization > > > * code only. It returns TRUE if the source or destination register operates > > > * on 64-bit, otherwise return FALSE. > > > @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > > > enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) > > > { > > > struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; > > > + int err; > > > > > > /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an > > > * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): > > > @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, > > > err_extra, regno); > > > return -EINVAL; > > > } > > > + > > > + err = mark_dynptr_read(env, reg); > > > + if (err) > > > + return err; > > > } > > > return 0; > > > } > > > @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > > > * return false to continue verification of this path > > > */ > > > return false; > > > + /* Both are same slot_type, but STACK_DYNPTR requires more > > > + * checks before it can considered safe. > > > + */ > > > + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { > > > > how about moving this check right after `if (i % BPF_REG_SIZE != > > BPF_REG_SIZE - 1)` ? Then we can actually generalize this to a switch > > to handle STACK_SPILL and STACK_DYNPTR separately. I'm adding > > STACK_ITER in my upcoming patch set, so this will have all the things > > ready for that? > > > > switch (old->stack[spi].slot_type[BPF_REG_SIZE - 1]) { > > case STACK_SPILL: > > if (!regsafe(...)) > > return false; > > break; > > case STACK_DYNPTR: > > ... > > break; > > /* and then eventually */ > > case STACK_ITER: > > ... > > > > WDYT? > > > > I can do this, it certainly makes sense with your upcoming changes, and it does > look cleaner. > > > > + /* If both are STACK_DYNPTR, type must be same */ > > > + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) > > > > struct bpf_reg_state *old_reg, *cur_reg; > > > > old_reg = &old->stack[spi].spilled_ptr; > > cur_reg = &cur->stack[spi].spilled_ptr; > > > > and then use old_reg and cur_reg in one simple if > > > > here's how I have it locally: > > > > case STACK_DYNPTR: > > old_reg = &old->stack[spi].spilled_ptr; > > cur_reg = &cur->stack[spi].spilled_ptr; > > if (old_reg->dynptr.type != cur_reg->dynptr.type || > > old_reg->dynptr.first_slot != > > cur_reg->dynptr.first_slot || > > !check_ids(old_reg->ref_obj_id, > > cur_reg->ref_obj_id, idmap)) > > return false; > > break; > > > > seems a bit cleaner? > > > > Yep. > > > I'm also thinking of getting rid of first_slot field and instead have > > a rule that first slot has proper type set, but the next one has > > BPF_DYNPTR_TYPE_INVALID as type. This should simplify things a bit, I > > think. At least it seems that way for STACK_ITER state I'm adding. But > > that's a separate refactoring, probably. > > > > Yeah, I'd rather not mix that into this set. Let me know if you think that's > better and I can follow up after the next iteration with that change. So far it looks better for me, but we can always fix it later. > > > > + return false; > > > + /* Both should also have first slot at same spi */ > > > + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) > > > + return false; > > > + /* ids should be same */ > > > + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) > > > + return false; > > > + if (old->stack[spi].spilled_ptr.ref_obj_id && > > > + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, > > > + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) > > > > my previous change to tech check_ids to enforce that either id have to > > be zeroes or non-zeroes at the same time already landed, so you don't > > need to check `old->stack[spi].spilled_ptr.ref_obj_id`. Even more, it > > seems wrong to do this check like this, because if cur has ref_obj_id > > set we'll ignore it, right? > > The check before that ensures either both are set or both are unset. If there is > a mismatch we return false. I see that check_ids does it now, so yes it wouldn't > be needed anymore. I am not sure about the last part, I don't think it will be > ignored? Right, I forgot about that `if with !!old.ref_obj_id != !!cur.ref_obj_id`.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 4a25375ebb0d..f7248235e119 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -781,6 +781,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ state->stack[spi - 1].spilled_ptr.ref_obj_id = id; } + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; + return 0; } @@ -805,6 +808,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); + + /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot? + * + * While we don't allow reading STACK_INVALID, it is still possible to + * do <8 byte writes marking some but not all slots as STACK_MISC. Then, + * helpers or insns can do partial read of that part without failing, + * but check_stack_range_initialized, check_stack_read_var_off, and + * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of + * the slot conservatively. Hence we need to screen off those liveness + * marking walks. + * + * This was not a problem before because STACK_INVALID is only set by + * default, or in clean_live_states after REG_LIVE_DONE, not randomly + * during verifier state exploration. Hence, for this case parentage + * chain will still be live, while earlier reg->parent was NULL, so we + * need REG_LIVE_WRITTEN to screen off read marker propagation. + */ + state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; + state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; + return 0; } @@ -2388,6 +2411,30 @@ static int mark_reg_read(struct bpf_verifier_env *env, return 0; } +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg) +{ + struct bpf_func_state *state = func(env, reg); + int spi, ret; + + /* For CONST_PTR_TO_DYNPTR, it must have already been done by + * check_reg_arg in check_helper_call and mark_btf_func_reg_size in + * check_kfunc_call. + */ + if (reg->type == CONST_PTR_TO_DYNPTR) + return 0; + spi = get_spi(reg->off); + /* Caller ensures dynptr is valid and initialized, which means spi is in + * bounds and spi is the first dynptr slot. Simply mark stack slot as + * read. + */ + ret = mark_reg_read(env, &state->stack[spi].spilled_ptr, + state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64); + if (ret) + return ret; + return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr, + state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64); +} + /* This function is supposed to be used by the following 32-bit optimization * code only. It returns TRUE if the source or destination register operates * on 64-bit, otherwise return FALSE. @@ -5928,6 +5975,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta) { struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; + int err; /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to an * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*): @@ -6008,6 +6056,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno, err_extra, regno); return -EINVAL; } + + err = mark_dynptr_read(env, reg); + if (err) + return err; } return 0; } @@ -13204,6 +13256,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, * return false to continue verification of this path */ return false; + /* Both are same slot_type, but STACK_DYNPTR requires more + * checks before it can considered safe. + */ + if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) { + /* If both are STACK_DYNPTR, type must be same */ + if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type) + return false; + /* Both should also have first slot at same spi */ + if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot) + return false; + /* ids should be same */ + if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id) + return false; + if (old->stack[spi].spilled_ptr.ref_obj_id && + !check_ids(old->stack[spi].spilled_ptr.ref_obj_id, + cur->stack[spi].spilled_ptr.ref_obj_id, idmap)) + return false; + WARN_ON_ONCE(i % BPF_REG_SIZE); + i += BPF_REG_SIZE - 1; + continue; + } if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) continue; if (!is_spilled_reg(&old->stack[spi]))
The root of the problem is missing liveness marking for STACK_DYNPTR slots. This leads to all kinds of problems inside stacksafe. The verifier by default inside stacksafe ignores spilled_ptr in stack slots which do not have REG_LIVE_READ marks. Since this is being checked in the 'old' explored state, it must have already done clean_live_states for this old bpf_func_state. Hence, it won't be receiving any more liveness marks from to be explored insns (it has received REG_LIVE_DONE marking from liveness point of view). What this means is that verifier considers that it's safe to not compare the stack slot if was never read by children states. While liveness marks are usually propagated correctly following the parentage chain for spilled registers (SCALAR_VALUE and PTR_* types), the same is not the case for STACK_DYNPTR. clean_live_states hence simply rewrites these stack slots to the type STACK_INVALID since it sees no REG_LIVE_READ marks. The end result is that we will never see STACK_DYNPTR slots in explored state. Even if verifier was conservatively matching !REG_LIVE_READ slots, very next check continuing the stacksafe loop on seeing STACK_INVALID would again prevent further checks. Now as long as verifier stores an explored state which we can compare to when reaching a pruning point, we can abuse this bug to make verifier prune search for obviously unsafe paths using STACK_DYNPTR slots thinking they are never used hence safe. Doing this in unprivileged mode is a bit challenging. add_new_state is only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges) or when jmps_processed difference is >= 2 and insn_processed difference is >= 8. So coming up with the unprivileged case requires a little more work, but it is still totally possible. The test case being discussed below triggers the heuristic even in unprivileged mode. However, it no longer works since commit 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF"). Let's try to study the test step by step. Consider the following program (C style BPF ASM): 0 r0 = 0; 1 r6 = &ringbuf_map; 3 r1 = r6; 4 r2 = 8; 5 r3 = 0; 6 r4 = r10; 7 r4 -= -16; 8 call bpf_ringbuf_reserve_dynptr; 9 if r0 == 0 goto pc+1; 10 goto pc+1; 11 *(r10 - 16) = 0xeB9F; 12 r1 = r10; 13 r1 -= -16; 14 r2 = 0; 15 call bpf_ringbuf_discard_dynptr; 16 r0 = 0; 17 exit; We know that insn 12 will be a pruning point, hence if we force add_new_state for it, it will first verify the following path as safe in straight line exploration: 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17 Then, when we arrive at insn 12 from the following path: 0 1 3 4 5 6 7 8 9 -> 11 (12) We will find a state that has been verified as safe already at insn 12. Since register state is same at this point, regsafe will pass. Next, in stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true. Next, refsafe is also true as reference state is unchanged in both states. The states are considered equivalent and search is pruned. Hence, we are able to construct a dynptr with arbitrary contents and use the dynptr API to operate on this arbitrary pointer and arbitrary size + offset. To fix this, first define a mark_dynptr_read function that propagates liveness marks whenever a valid initialized dynptr is accessed by dynptr helpers. REG_LIVE_WRITTEN is marked whenever we initialize an uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows screening off mark_reg_read and not propagating marks upwards from that point. This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or none, so clean_live_states either sets both slots to STACK_INVALID or none of them. This is the invariant the checks inside stacksafe rely on. Next, do a complete comparison of both stack slots whenever they have STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and also whether both form the same first_slot. Only then is the later path safe. Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> --- kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+)