Message ID: 20230409033431.3992432-5-joannelkoong@gmail.com (mailing list archive)
State: Changes Requested
Delegated to: BPF
Series: Dynptr convenience helpers
On Sat, Apr 8, 2023 at 8:34 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > The cloned dynptr will point to the same data as its parent dynptr, > with the same type, offset, size and read-only properties. > > Any writes to a dynptr will be reflected across all instances > (by 'instance', this means any dynptrs that point to the same > underlying data). > > Please note that data slice and dynptr invalidations will affect all > instances as well. For example, if bpf_dynptr_write() is called on an > skb-type dynptr, all data slices of dynptr instances to that skb > will be invalidated as well (eg data slices of any clones, parents, > grandparents, ...). Another example is if a ringbuf dynptr is submitted, > any instance of that dynptr will be invalidated. > > Changing the view of the dynptr (eg advancing the offset or > trimming the size) will only affect that dynptr and not affect any > other instances. > > One example use case where cloning may be helpful is for hashing or > iterating through dynptr data. Cloning will allow the user to maintain > the original view of the dynptr for future use, while also allowing > views to smaller subsets of the data after the offset is advanced or the > size is trimmed. 
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > --- > kernel/bpf/helpers.c | 14 +++++ > kernel/bpf/verifier.c | 125 +++++++++++++++++++++++++++++++++++++----- > 2 files changed, 126 insertions(+), 13 deletions(-) > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index bac4c6fe49f0..108f3bcfa6da 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr) > return ptr->offset; > } > > +__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr, > + struct bpf_dynptr_kern *clone__uninit) I think most of uses for bpf_dynptr_clone() would be to get a partial view (like you mentioned above, to, e.g., do a hashing of a part of the memory range). So it feels it would be best UX if clone would allow you to define a new range in one go. So e.g., to create a "sub-dynptr" for range of bytes [10, 30), it should be enough to: struct bpf_dynptr orig_ptr, new_ptr; bpf_dynptr_clone(&orig_ptr, 10, 30, &new_ptr); Instead of: bpf_dynptr_clone(&orig_ptr, &new_ptr); bpf_dynptr_advance(&new_ptr, 10); bpf_dynptr_trim(&new_ptr, bpf_dynptr_get_size(&orig_ptr) - 30); This, btw, shows the awkwardness of the bpf_dynptr_trim() approach. If someone really wanted an exact full-sized copy, it's trivial: bpf_dynptr_clone(&orig_ptr, 0, bpf_dynptr_get_size(&orig_ptr), &new_ptr); BTW, let's rename bpf_dynptr_get_size -> bpf_dynptr_size()? That "get_" is a sore in the eye, IMO. > +{ > + if (!ptr->data) { > + bpf_dynptr_set_null(clone__uninit); > + return -EINVAL; > + } > + > + memcpy(clone__uninit, ptr, sizeof(*clone__uninit)); why memcpy instead of `*clone__uninit = *ptr`? 
> + > + return 0; > +} > + > __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj) > { > return obj; > @@ -2429,6 +2442,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null) > BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) > BTF_ID_FLAGS(func, bpf_dynptr_get_size) > BTF_ID_FLAGS(func, bpf_dynptr_get_offset) > +BTF_ID_FLAGS(func, bpf_dynptr_clone) > BTF_SET8_END(common_btf_ids) > > static const struct btf_kfunc_id_set common_kfunc_set = { > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 3660b573048a..804cb50050f9 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -302,6 +302,7 @@ struct bpf_kfunc_call_arg_meta { > struct { > enum bpf_dynptr_type type; > u32 id; > + u32 ref_obj_id; > } initialized_dynptr; > struct { > u8 spi; > @@ -963,24 +964,15 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > return 0; > } > > -static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > +static void invalidate_dynptr(struct bpf_verifier_env *env, struct bpf_func_state *state, int spi) > { > - struct bpf_func_state *state = func(env, reg); > - int spi, i; > - > - spi = dynptr_get_spi(env, reg); > - if (spi < 0) > - return spi; > + int i; > > for (i = 0; i < BPF_REG_SIZE; i++) { > state->stack[spi].slot_type[i] = STACK_INVALID; > state->stack[spi - 1].slot_type[i] = STACK_INVALID; > } > > - /* Invalidate any slices associated with this dynptr */ > - if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) > - WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id)); > - > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > > @@ -1007,6 +999,51 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > */ > state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > +} > + > +static int 
unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > +{ > + struct bpf_func_state *state = func(env, reg); > + int spi; > + > + spi = dynptr_get_spi(env, reg); > + if (spi < 0) > + return spi; > + > + if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) { > + int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id; > + int i; > + > + /* If the dynptr has a ref_obj_id, then we need to invaldiate typo: invalidate > + * two things: > + * > + * 1) Any dynptrs with a matching ref_obj_id (clones) > + * 2) Any slices associated with the ref_obj_id I think this is where this dynptr_id confusion comes from. The rule should be "any slices derived from this dynptr". But as mentioned on another thread, it's a separate topic which we can address later. > + */ > + > + /* Invalidate any slices associated with this dynptr */ > + WARN_ON_ONCE(release_reference(env, ref_obj_id)); > + > + /* Invalidate any dynptr clones */ > + for (i = 1; i < state->allocated_stack / BPF_REG_SIZE; i++) { > + if (state->stack[i].spilled_ptr.ref_obj_id == ref_obj_id) { nit: invert if condition and continue to reduce nestedness of the rest the loop body > + /* it should always be the case that if the ref obj id > + * matches then the stack slot also belongs to a > + * dynptr > + */ > + if (state->stack[i].slot_type[0] != STACK_DYNPTR) { > + verbose(env, "verifier internal error: misconfigured ref_obj_id\n"); > + return -EFAULT; > + } > + if (state->stack[i].spilled_ptr.dynptr.first_slot) > + invalidate_dynptr(env, state, i); > + } > + } > + > + return 0; > + } > + > + invalidate_dynptr(env, state, spi); > > return 0; > } > @@ -6967,6 +7004,50 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx, > return 0; > } > > +static int handle_dynptr_clone(struct bpf_verifier_env *env, enum bpf_arg_type arg_type, > + int regno, int insn_idx, struct bpf_kfunc_call_arg_meta *meta) > +{ > + struct bpf_reg_state *regs = cur_regs(env), *reg 
= &regs[regno]; > + struct bpf_reg_state *first_reg_state, *second_reg_state; > + struct bpf_func_state *state = func(env, reg); > + enum bpf_dynptr_type dynptr_type = meta->initialized_dynptr.type; > + int err, spi, ref_obj_id; > + > + if (!dynptr_type) { > + verbose(env, "verifier internal error: no dynptr type for bpf_dynptr_clone\n"); > + return -EFAULT; > + } > + arg_type |= get_dynptr_type_flag(dynptr_type); what about MEM_RDONLY and MEM_UNINIT flags, do we need to derive and add them? > + > + err = process_dynptr_func(env, regno, insn_idx, arg_type); > + if (err < 0) > + return err; > + > + spi = dynptr_get_spi(env, reg); > + if (spi < 0) > + return spi; > + > + first_reg_state = &state->stack[spi].spilled_ptr; > + second_reg_state = &state->stack[spi - 1].spilled_ptr; > + ref_obj_id = first_reg_state->ref_obj_id; > + > + /* reassign the clone the same dynptr id as the original */ > + __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id); > + __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id); I'm not sure why clone should have the same dynptr_id? Isn't it a new instance of a dynptr? I get preserving ref_obj_id (if refcounted), but why reusing dynptr_id?.. > + > + if (meta->initialized_dynptr.ref_obj_id) { > + /* release the new ref obj id assigned during process_dynptr_func */ > + err = release_reference_state(cur_func(env), ref_obj_id); > + if (err) > + return err; ugh... this is not good to create reference just to release. If we need to reuse logic, let's reuse the logic without parts that shouldn't happen. Please check if we can split process_dynptr_func() appropriately to allow this. > + /* reassign the clone the same ref obj id as the original */ > + first_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id; > + second_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id; > + } > + > + return 0; > +} > + [...]
On Wed, Apr 12, 2023 at 3:12 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Sat, Apr 8, 2023 at 8:34 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > > > The cloned dynptr will point to the same data as its parent dynptr, > > with the same type, offset, size and read-only properties. > > > > Any writes to a dynptr will be reflected across all instances > > (by 'instance', this means any dynptrs that point to the same > > underlying data). > > > > Please note that data slice and dynptr invalidations will affect all > > instances as well. For example, if bpf_dynptr_write() is called on an > > skb-type dynptr, all data slices of dynptr instances to that skb > > will be invalidated as well (eg data slices of any clones, parents, > > grandparents, ...). Another example is if a ringbuf dynptr is submitted, > > any instance of that dynptr will be invalidated. > > > > Changing the view of the dynptr (eg advancing the offset or > > trimming the size) will only affect that dynptr and not affect any > > other instances. > > > > One example use case where cloning may be helpful is for hashing or > > iterating through dynptr data. Cloning will allow the user to maintain > > the original view of the dynptr for future use, while also allowing > > views to smaller subsets of the data after the offset is advanced or the > > size is trimmed. 
> > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > > --- > > kernel/bpf/helpers.c | 14 +++++ > > kernel/bpf/verifier.c | 125 +++++++++++++++++++++++++++++++++++++----- > > 2 files changed, 126 insertions(+), 13 deletions(-) > > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > > index bac4c6fe49f0..108f3bcfa6da 100644 > > --- a/kernel/bpf/helpers.c > > +++ b/kernel/bpf/helpers.c > > @@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr) > > return ptr->offset; > > } > > > > +__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr, > > + struct bpf_dynptr_kern *clone__uninit) > > I think most of uses for bpf_dynptr_clone() would be to get a partial > view (like you mentioned above, to, e.g., do a hashing of a part of > the memory range). So it feels it would be best UX if clone would > allow you to define a new range in one go. So e.g., to create a > "sub-dynptr" for range of bytes [10, 30), it should be enough to: > > struct bpf_dynptr orig_ptr, new_ptr; > > bpf_dynptr_clone(&orig_ptr, 10, 30, &new_ptr); > > Instead of: > > bpf_dynptr_clone(&orig_ptr, &new_ptr); > bpf_dynptr_advance(&new_ptr, 10); > bpf_dynptr_trim(&new_ptr, bpf_dynptr_get_size(&orig_ptr) - 30); > I don't think we can do this because we can't have bpf_dynptr_clone() fail (which might happen if we take in a range, if the range is invalid). This is because in the case where the clone is of a reference-counted dynptr (eg like a ringbuf-type dynptr), the clone even if it's invalid must be treated by the verifier as a legit dynptr (since the verifier can't know ahead of time whether the clone call will succeed or not) which means if the invalid clone dynptr is then passed into a reference-releasing function, the verifier will release the reference but the actual reference won't be released at runtime (since the clone dynptr is invalid), which would leak the reference. 
An example is something like: // invalid range is passed, error is returned and new_ptr is invalid bpf_dynptr_clone(&ringbuf_ptr, 9999999, 9999999, &new_ptr); // this releases the reference and invalidates both new_ptr and ringbuf_ptr bpf_ringbuf_discard_dynptr(&new_ptr, 0); At runtime, bpf_ringbuf_discard_dynptr() will be a no-op since new_ptr is invalid - the ringbuf record still needs to be submitted/discarded, but the verifier will think this already happened > > This, btw, shows the awkwardness of the bpf_dynptr_trim() approach. > > If someone really wanted an exact full-sized copy, it's trivial: > > bpf_dynptr_clone(&orig_ptr, 0, bpf_dynptr_get_size(&orig_ptr), &new_ptr); > > > BTW, let's rename bpf_dynptr_get_size -> bpf_dynptr_size()? That > "get_" is a sore in the eye, IMO. Will do! > > > > +{ > > + if (!ptr->data) { > > + bpf_dynptr_set_null(clone__uninit); > > + return -EINVAL; > > + } > > + > > + memcpy(clone__uninit, ptr, sizeof(*clone__uninit)); > > why memcpy instead of `*clone__uninit = *ptr`? 
> No good reason :) I'll change this for v2 > > + > > + return 0; > > +} > > + > > __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj) > > { > > return obj; > > @@ -2429,6 +2442,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null) > > BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) > > BTF_ID_FLAGS(func, bpf_dynptr_get_size) > > BTF_ID_FLAGS(func, bpf_dynptr_get_offset) > > +BTF_ID_FLAGS(func, bpf_dynptr_clone) > > BTF_SET8_END(common_btf_ids) > > > > static const struct btf_kfunc_id_set common_kfunc_set = { > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 3660b573048a..804cb50050f9 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -302,6 +302,7 @@ struct bpf_kfunc_call_arg_meta { > > struct { > > enum bpf_dynptr_type type; > > u32 id; > > + u32 ref_obj_id; > > } initialized_dynptr; > > struct { > > u8 spi; > > @@ -963,24 +964,15 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_ > > return 0; > > } > > > > -static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > +static void invalidate_dynptr(struct bpf_verifier_env *env, struct bpf_func_state *state, int spi) > > { > > - struct bpf_func_state *state = func(env, reg); > > - int spi, i; > > - > > - spi = dynptr_get_spi(env, reg); > > - if (spi < 0) > > - return spi; > > + int i; > > > > for (i = 0; i < BPF_REG_SIZE; i++) { > > state->stack[spi].slot_type[i] = STACK_INVALID; > > state->stack[spi - 1].slot_type[i] = STACK_INVALID; > > } > > > > - /* Invalidate any slices associated with this dynptr */ > > - if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) > > - WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id)); > > - > > __mark_reg_not_init(env, &state->stack[spi].spilled_ptr); > > __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr); > > > > @@ -1007,6 +999,51 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re > > */ > 
> state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN; > > state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN; > > +} > > + > > +static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg) > > +{ > > + struct bpf_func_state *state = func(env, reg); > > + int spi; > > + > > + spi = dynptr_get_spi(env, reg); > > + if (spi < 0) > > + return spi; > > + > > + if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) { > > + int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id; > > + int i; > > + > > + /* If the dynptr has a ref_obj_id, then we need to invaldiate > > typo: invalidate > > > + * two things: > > + * > > + * 1) Any dynptrs with a matching ref_obj_id (clones) > > + * 2) Any slices associated with the ref_obj_id > > I think this is where this dynptr_id confusion comes from. The rule > should be "any slices derived from this dynptr". But as mentioned on > another thread, it's a separate topic which we can address later. > If there's a parent and a clone and slices derived from the parent and slices derived from the clone, if the clone is invalidated then we need to invalidate slices associated with the parent as well. So shouldn't it be "any slices associated with the ref obj id" not "any slices derived from this dynptr"? 
(also just a note, parent/clone slices will share the same ref obj id and the same dynptr id, so checking against either does the same thing) > > + */ > > + > > + /* Invalidate any slices associated with this dynptr */ > > + WARN_ON_ONCE(release_reference(env, ref_obj_id)); > > + > > + /* Invalidate any dynptr clones */ > > + for (i = 1; i < state->allocated_stack / BPF_REG_SIZE; i++) { > > + if (state->stack[i].spilled_ptr.ref_obj_id == ref_obj_id) { > > nit: invert if condition and continue to reduce nestedness of the rest the loop body Ooh good idea > > > + /* it should always be the case that if the ref obj id > > + * matches then the stack slot also belongs to a > > + * dynptr > > + */ > > + if (state->stack[i].slot_type[0] != STACK_DYNPTR) { > > + verbose(env, "verifier internal error: misconfigured ref_obj_id\n"); > > + return -EFAULT; > > + } > > + if (state->stack[i].spilled_ptr.dynptr.first_slot) > > + invalidate_dynptr(env, state, i); > > + } > > + } > > + > > + return 0; > > + } > > + > > + invalidate_dynptr(env, state, spi); > > > > return 0; > > } > > @@ -6967,6 +7004,50 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx, > > return 0; > > } > > > > +static int handle_dynptr_clone(struct bpf_verifier_env *env, enum bpf_arg_type arg_type, > > + int regno, int insn_idx, struct bpf_kfunc_call_arg_meta *meta) > > +{ > > + struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; > > + struct bpf_reg_state *first_reg_state, *second_reg_state; > > + struct bpf_func_state *state = func(env, reg); > > + enum bpf_dynptr_type dynptr_type = meta->initialized_dynptr.type; > > + int err, spi, ref_obj_id; > > + > > + if (!dynptr_type) { > > + verbose(env, "verifier internal error: no dynptr type for bpf_dynptr_clone\n"); > > + return -EFAULT; > > + } > > + arg_type |= get_dynptr_type_flag(dynptr_type); > > > what about MEM_RDONLY and MEM_UNINIT flags, do we need to derive and add them?
I don't think we need MEM_UNINIT because we can't clone an uninitialized dynptr. But yes, definitely MEM_RDONLY. I missed that, I will add it in in v2 > > > + > > + err = process_dynptr_func(env, regno, insn_idx, arg_type); > > + if (err < 0) > > + return err; > > + > > + spi = dynptr_get_spi(env, reg); > > + if (spi < 0) > > + return spi; > > + > > + first_reg_state = &state->stack[spi].spilled_ptr; > > + second_reg_state = &state->stack[spi - 1].spilled_ptr; > > + ref_obj_id = first_reg_state->ref_obj_id; > > + > > + /* reassign the clone the same dynptr id as the original */ > > + __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id); > > + __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id); > > I'm not sure why clone should have the same dynptr_id? Isn't it a new > instance of a dynptr? I get preserving ref_obj_id (if refcounted), but > why reusing dynptr_id?.. > I think we need to also copy over the dynptr id because in the case of a non-reference counted dynptr, if the parent (or clone) is invalidated (eg overwriting bytes of the dynptr on the stack), we must also invalidate the slices of the clone (or parent) > > > + > > + if (meta->initialized_dynptr.ref_obj_id) { > > + /* release the new ref obj id assigned during process_dynptr_func */ > > + err = release_reference_state(cur_func(env), ref_obj_id); > > + if (err) > > + return err; > > ugh... this is not good to create reference just to release. If we > need to reuse logic, let's reuse the logic without parts that > shouldn't happen. Please check if we can split process_dynptr_func() > appropriately to allow this. I'll change this for v2. 
I think the simplest approach would be having mark_stack_slots_dynptr() take in a boolean or something that'll indicate whether it should acquire a new reference state or not > > > + /* reassign the clone the same ref obj id as the original */ > > + first_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id; > > + second_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id; > > + } > > + > > + return 0; > > +} > > + > > [...]
Hi Joanne, kernel test robot noticed the following build warnings: [auto build test WARNING on bpf-next/master] url: https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/bpf-Add-bpf_dynptr_trim-and-bpf_dynptr_advance/20230409-113652 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master patch link: https://lore.kernel.org/r/20230409033431.3992432-5-joannelkoong%40gmail.com patch subject: [PATCH v1 bpf-next 4/5] bpf: Add bpf_dynptr_clone config: m68k-randconfig-s032-20230416 (https://download.01.org/0day-ci/archive/20230418/202304180233.Hk6WZE5M-lkp@intel.com/config) compiler: m68k-linux-gcc (GCC) 12.1.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.4-39-gce1a6720-dirty # https://github.com/intel-lab-lkp/linux/commit/d7830addcc26375f56b68655ddbfb44116b3e7f6 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Joanne-Koong/bpf-Add-bpf_dynptr_trim-and-bpf_dynptr_advance/20230409-113652 git checkout d7830addcc26375f56b68655ddbfb44116b3e7f6 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=m68k olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=m68k SHELL=/bin/bash kernel/bpf/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202304180233.Hk6WZE5M-lkp@intel.com/ sparse warnings: (new ones prefixed by >>) >> kernel/bpf/verifier.c:7020:41: sparse: sparse: mixing different enum types: >> kernel/bpf/verifier.c:7020:41: sparse: unsigned int enum bpf_arg_type >> kernel/bpf/verifier.c:7020:41: sparse: unsigned int enum 
bpf_type_flag kernel/bpf/verifier.c:18042:38: sparse: sparse: subtraction of functions? Share your drugs kernel/bpf/verifier.c: note: in included file (through include/linux/bpf.h, include/linux/bpf-cgroup.h): include/linux/bpfptr.h:65:40: sparse: sparse: cast to non-scalar include/linux/bpfptr.h:65:40: sparse: sparse: cast from non-scalar include/linux/bpfptr.h:65:40: sparse: sparse: cast to non-scalar include/linux/bpfptr.h:65:40: sparse: sparse: cast from non-scalar include/linux/bpfptr.h:65:40: sparse: sparse: cast to non-scalar include/linux/bpfptr.h:65:40: sparse: sparse: cast from non-scalar vim +7020 kernel/bpf/verifier.c 7006 7007 static int handle_dynptr_clone(struct bpf_verifier_env *env, enum bpf_arg_type arg_type, 7008 int regno, int insn_idx, struct bpf_kfunc_call_arg_meta *meta) 7009 { 7010 struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 7011 struct bpf_reg_state *first_reg_state, *second_reg_state; 7012 struct bpf_func_state *state = func(env, reg); 7013 enum bpf_dynptr_type dynptr_type = meta->initialized_dynptr.type; 7014 int err, spi, ref_obj_id; 7015 7016 if (!dynptr_type) { 7017 verbose(env, "verifier internal error: no dynptr type for bpf_dynptr_clone\n"); 7018 return -EFAULT; 7019 } > 7020 arg_type |= get_dynptr_type_flag(dynptr_type); 7021 7022 err = process_dynptr_func(env, regno, insn_idx, arg_type); 7023 if (err < 0) 7024 return err; 7025 7026 spi = dynptr_get_spi(env, reg); 7027 if (spi < 0) 7028 return spi; 7029 7030 first_reg_state = &state->stack[spi].spilled_ptr; 7031 second_reg_state = &state->stack[spi - 1].spilled_ptr; 7032 ref_obj_id = first_reg_state->ref_obj_id; 7033 7034 /* reassign the clone the same dynptr id as the original */ 7035 __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id); 7036 __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id); 7037 7038 if (meta->initialized_dynptr.ref_obj_id) { 7039 /* release the new ref obj id assigned during 
process_dynptr_func */ 7040 err = release_reference_state(cur_func(env), ref_obj_id); 7041 if (err) 7042 return err; 7043 /* reassign the clone the same ref obj id as the original */ 7044 first_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id; 7045 second_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id; 7046 } 7047 7048 return 0; 7049 } 7050
On Thu, Apr 13, 2023 at 11:03 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > On Wed, Apr 12, 2023 at 3:12 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Sat, Apr 8, 2023 at 8:34 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > The cloned dynptr will point to the same data as its parent dynptr, > > > with the same type, offset, size and read-only properties. > > > > > > Any writes to a dynptr will be reflected across all instances > > > (by 'instance', this means any dynptrs that point to the same > > > underlying data). > > > > > > Please note that data slice and dynptr invalidations will affect all > > > instances as well. For example, if bpf_dynptr_write() is called on an > > > skb-type dynptr, all data slices of dynptr instances to that skb > > > will be invalidated as well (eg data slices of any clones, parents, > > > grandparents, ...). Another example is if a ringbuf dynptr is submitted, > > > any instance of that dynptr will be invalidated. > > > > > > Changing the view of the dynptr (eg advancing the offset or > > > trimming the size) will only affect that dynptr and not affect any > > > other instances. > > > > > > One example use case where cloning may be helpful is for hashing or > > > iterating through dynptr data. Cloning will allow the user to maintain > > > the original view of the dynptr for future use, while also allowing > > > views to smaller subsets of the data after the offset is advanced or the > > > size is trimmed. 
> > > > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > > > --- > > > kernel/bpf/helpers.c | 14 +++++ > > > kernel/bpf/verifier.c | 125 +++++++++++++++++++++++++++++++++++++----- > > > 2 files changed, 126 insertions(+), 13 deletions(-) > > > > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > > > index bac4c6fe49f0..108f3bcfa6da 100644 > > > --- a/kernel/bpf/helpers.c > > > +++ b/kernel/bpf/helpers.c > > > @@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr) > > > return ptr->offset; > > > } > > > > > > +__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr, > > > + struct bpf_dynptr_kern *clone__uninit) > > > > I think most of uses for bpf_dynptr_clone() would be to get a partial > > view (like you mentioned above, to, e.g., do a hashing of a part of > > the memory range). So it feels it would be best UX if clone would > > allow you to define a new range in one go. So e.g., to create a > > "sub-dynptr" for range of bytes [10, 30), it should be enough to: > > > > struct bpf_dynptr orig_ptr, new_ptr; > > > > bpf_dynptr_clone(&orig_ptr, 10, 30, &new_ptr); > > > > Instead of: > > > > bpf_dynptr_clone(&orig_ptr, &new_ptr); > > bpf_dynptr_advance(&new_ptr, 10); > > bpf_dynptr_trim(&new_ptr, bpf_dynptr_get_size(&orig_ptr) - 30); > > > > I don't think we can do this because we can't have bpf_dynptr_clone() > fail (which might happen if we take in a range, if the range is > invalid). 
This is because in the case where the clone is of a > reference-counted dynptr (eg like a ringbuf-type dynptr), the clone > even if it's invalid must be treated by the verifier as a legit dynptr > (since the verifier can't know ahead of time whether the clone call > will succeed or not) which means if the invalid clone dynptr is then > passed into a reference-releasing function, the verifier will release > the reference but the actual reference won't be released at runtime > (since the clone dynptr is invalid), which would leak the reference. > An example is something like: > > // invalid range is passed, error is returned and new_ptr is invalid > bpf_dynptr_clone(&ringbuf_ptr, 9999999, 9999999, &new_ptr); > // this releases the reference and invalidates both new_ptr and ringbuf_ptr > bpf_ringbuf_discard_dynptr(&new_ptr, 0); > > At runtime, bpf_ringbuf_discard_dynptr() will be a no-op since new_ptr > is invalid - the ringbuf record still needs to be submitted/discarded, > but the verifier will think this already happened Ah, tricky, good point. But ok, I guess with bpf_dynptr_adjust() proposal in another email this would be ok: bpf_dynptr_clone(..); /* always succeeds */ bpf_dynptr_adjust(&new_ptr, 10, 30); /* could fail to adjust, but dynptr is left valid */ Right? > > > > > This, btw, shows the awkwardness of the bpf_dynptr_trim() approach. > > > > If someone really wanted an exact full-sized copy, it's trivial: > > > > bpf_dynptr_clone(&orig_ptr, 0, bpf_dynptr_get_size(&orig_ptr), &new_ptr); > > > > > > BTW, let's rename bpf_dynptr_get_size -> bpf_dynptr_size()? That > > "get_" is a sore in the eye, IMO. > > Will do! > > > > > > > +{ > > > + if (!ptr->data) { > > > + bpf_dynptr_set_null(clone__uninit); > > > + return -EINVAL; > > > + } > > > + > > > + memcpy(clone__uninit, ptr, sizeof(*clone__uninit)); > > > > why memcpy instead of `*clone__uninit = *ptr`? 
> No good reason :) I'll change this for v2
>
> > > +
> > > +        return 0;
> > > +}
> > > +
> > >  __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
> > >  {
> > >          return obj;
> > > @@ -2429,6 +2442,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
> > >  BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
> > >  BTF_ID_FLAGS(func, bpf_dynptr_get_size)
> > >  BTF_ID_FLAGS(func, bpf_dynptr_get_offset)
> > > +BTF_ID_FLAGS(func, bpf_dynptr_clone)
> > >  BTF_SET8_END(common_btf_ids)
> > >
> > >  static const struct btf_kfunc_id_set common_kfunc_set = {
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 3660b573048a..804cb50050f9 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -302,6 +302,7 @@ struct bpf_kfunc_call_arg_meta {
> > >          struct {
> > >                  enum bpf_dynptr_type type;
> > >                  u32 id;
> > > +                u32 ref_obj_id;
> > >          } initialized_dynptr;
> > >          struct {
> > >                  u8 spi;
> > > @@ -963,24 +964,15 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> > >          return 0;
> > >  }
> > >
> > > -static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > +static void invalidate_dynptr(struct bpf_verifier_env *env, struct bpf_func_state *state, int spi)
> > >  {
> > > -        struct bpf_func_state *state = func(env, reg);
> > > -        int spi, i;
> > > -
> > > -        spi = dynptr_get_spi(env, reg);
> > > -        if (spi < 0)
> > > -                return spi;
> > > +        int i;
> > >
> > >          for (i = 0; i < BPF_REG_SIZE; i++) {
> > >                  state->stack[spi].slot_type[i] = STACK_INVALID;
> > >                  state->stack[spi - 1].slot_type[i] = STACK_INVALID;
> > >          }
> > >
> > > -        /* Invalidate any slices associated with this dynptr */
> > > -        if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
> > > -                WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
> > > -
> > >          __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> > >          __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > >
> > > @@ -1007,6 +999,51 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> > >           */
> > >          state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > >          state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +}
> > > +
> > > +static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > +{
> > > +        struct bpf_func_state *state = func(env, reg);
> > > +        int spi;
> > > +
> > > +        spi = dynptr_get_spi(env, reg);
> > > +        if (spi < 0)
> > > +                return spi;
> > > +
> > > +        if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> > > +                int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id;
> > > +                int i;
> > > +
> > > +                /* If the dynptr has a ref_obj_id, then we need to invaldiate
> >
> > typo: invalidate
> >
> > > +                 * two things:
> > > +                 *
> > > +                 * 1) Any dynptrs with a matching ref_obj_id (clones)
> > > +                 * 2) Any slices associated with the ref_obj_id
> >
> > I think this is where this dynptr_id confusion comes from. The rule
> > should be "any slices derived from this dynptr". But as mentioned on
> > another thread, it's a separate topic which we can address later.
> >
> If there's a parent and a clone and slices derived from the parent and
> slices derived from the clone, if the clone is invalidated then we
> need to invalidate slices associated with the parent as well. So
> shouldn't it be "any slices associated with the ref obj id" not "any
> slices derived from this dynptr"? (also just a note, parent/clone
> slices will share the same ref obj id and the same dynptr id, so
> checking against either does the same thing)

So, we have a ringbuf dynptr with ref_obj_id=1, id=2, ok? We clone it,
clone gets ref_obj_id=1, id=3. If either original dynptr or clone
dynptr is released due to bpf_ringbuf_discard_dynptr(), we invalidate
all the dynptrs with ref_obj_id=1. During invalidation of each dynptr,
we invalidate all the slices with ref_obj_id==dynptr's id. So we'll
invalidate slices derived from dynptr with id=2 (original dynptr), and
then all the slices derived from dynptr with id=3?

> > > +                 */
> > > +
> > > +                /* Invalidate any slices associated with this dynptr */
> > > +                WARN_ON_ONCE(release_reference(env, ref_obj_id));
> > > +
> > > +                /* Invalidate any dynptr clones */
> > > +                for (i = 1; i < state->allocated_stack / BPF_REG_SIZE; i++) {
> > > +                        if (state->stack[i].spilled_ptr.ref_obj_id == ref_obj_id) {
> >
> > nit: invert if condition and continue to reduce nestedness of the rest
> > the loop body
>
> Ooh good idea
>
> > > +                                /* it should always be the case that if the ref obj id
> > > +                                 * matches then the stack slot also belongs to a
> > > +                                 * dynptr
> > > +                                 */
> > > +                                if (state->stack[i].slot_type[0] != STACK_DYNPTR) {
> > > +                                        verbose(env, "verifier internal error: misconfigured ref_obj_id\n");
> > > +                                        return -EFAULT;
> > > +                                }
> > > +                                if (state->stack[i].spilled_ptr.dynptr.first_slot)
> > > +                                        invalidate_dynptr(env, state, i);
> > > +                        }
> > > +                }
> > > +
> > > +                return 0;
> > > +        }
> > > +
> > > +        invalidate_dynptr(env, state, spi);
> > >
> > >          return 0;
> > >  }
> > > @@ -6967,6 +7004,50 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
> > >          return 0;
> > >  }
> > >
> > > +static int handle_dynptr_clone(struct bpf_verifier_env *env, enum bpf_arg_type arg_type,
> > > +                               int regno, int insn_idx, struct bpf_kfunc_call_arg_meta *meta)
> > > +{
> > > +        struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> > > +        struct bpf_reg_state *first_reg_state, *second_reg_state;
> > > +        struct bpf_func_state *state = func(env, reg);
> > > +        enum bpf_dynptr_type dynptr_type = meta->initialized_dynptr.type;
> > > +        int err, spi, ref_obj_id;
> > > +
> > > +        if (!dynptr_type) {
> > > +                verbose(env, "verifier internal error: no dynptr type for bpf_dynptr_clone\n");
> > > +                return -EFAULT;
> > > +        }
> > > +        arg_type |= get_dynptr_type_flag(dynptr_type);
> >
> > what about MEM_RDONLY and MEM_UNINIT flags, do we need to derive and add them?
>
> I don't think we need MEM_UNINIT because we can't clone an
> uninitialized dynptr. But yes, definitely MEM_RDONLY. I missed that, I
> will add it in in v2
>
> > > +
> > > +        err = process_dynptr_func(env, regno, insn_idx, arg_type);
> > > +        if (err < 0)
> > > +                return err;
> > > +
> > > +        spi = dynptr_get_spi(env, reg);
> > > +        if (spi < 0)
> > > +                return spi;
> > > +
> > > +        first_reg_state = &state->stack[spi].spilled_ptr;
> > > +        second_reg_state = &state->stack[spi - 1].spilled_ptr;
> > > +        ref_obj_id = first_reg_state->ref_obj_id;
> > > +
> > > +        /* reassign the clone the same dynptr id as the original */
> > > +        __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id);
> > > +        __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id);
> >
> > I'm not sure why clone should have the same dynptr_id? Isn't it a new
> > instance of a dynptr? I get preserving ref_obj_id (if refcounted), but
> > why reusing dynptr_id?..
> >
> I think we need to also copy over the dynptr id because in the case of
> a non-reference counted dynptr, if the parent (or clone) is
> invalidated (eg overwriting bytes of the dynptr on the stack), we must
> also invalidate the slices of the clone (or parent)

yep, right now we'll have to do that because we have dynptr_id. But if
we get rid of it and stick to ref_obj_id and id, then clone would need
to get a new id, but keep ref_obj_id, right?

> > > +
> > > +        if (meta->initialized_dynptr.ref_obj_id) {
> > > +                /* release the new ref obj id assigned during process_dynptr_func */
> > > +                err = release_reference_state(cur_func(env), ref_obj_id);
> > > +                if (err)
> > > +                        return err;
> >
> > ugh... this is not good to create reference just to release. If we
> > need to reuse logic, let's reuse the logic without parts that
> > shouldn't happen. Please check if we can split process_dynptr_func()
> > appropriately to allow this.
>
> I'll change this for v2. I think the simplest approach would be having
> mark_stack_slots_dynptr() take in a boolean or something that'll
> indicate whether it should acquire a new reference state or not
>
> > > +                /* reassign the clone the same ref obj id as the original */
> > > +                first_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id;
> > > +                second_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id;
> > > +        }
> > > +
> > > +        return 0;
> > > +}
> > > +
>
> [...]
On Mon, Apr 17, 2023 at 4:46 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Apr 13, 2023 at 11:03 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Wed, Apr 12, 2023 at 3:12 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Sat, Apr 8, 2023 at 8:34 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > >
> > > > The cloned dynptr will point to the same data as its parent dynptr,
> > > > with the same type, offset, size and read-only properties.
> > > >
> > > > Any writes to a dynptr will be reflected across all instances
> > > > (by 'instance', this means any dynptrs that point to the same
> > > > underlying data).
> > > >
> > > > Please note that data slice and dynptr invalidations will affect all
> > > > instances as well. For example, if bpf_dynptr_write() is called on an
> > > > skb-type dynptr, all data slices of dynptr instances to that skb
> > > > will be invalidated as well (eg data slices of any clones, parents,
> > > > grandparents, ...). Another example is if a ringbuf dynptr is submitted,
> > > > any instance of that dynptr will be invalidated.
> > > >
> > > > Changing the view of the dynptr (eg advancing the offset or
> > > > trimming the size) will only affect that dynptr and not affect any
> > > > other instances.
> > > >
> > > > One example use case where cloning may be helpful is for hashing or
> > > > iterating through dynptr data. Cloning will allow the user to maintain
> > > > the original view of the dynptr for future use, while also allowing
> > > > views to smaller subsets of the data after the offset is advanced or the
> > > > size is trimmed.
> > > >
> > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > > ---
> > > >  kernel/bpf/helpers.c  |  14 +++++
> > > >  kernel/bpf/verifier.c | 125 +++++++++++++++++++++++++++++++++++++-----
> > > >  2 files changed, 126 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > > index bac4c6fe49f0..108f3bcfa6da 100644
> > > > --- a/kernel/bpf/helpers.c
> > > > +++ b/kernel/bpf/helpers.c
> > > > @@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr)
> > > >          return ptr->offset;
> > > >  }
> > > >
> > > > +__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr,
> > > > +                                 struct bpf_dynptr_kern *clone__uninit)
> > >
> > > I think most of uses for bpf_dynptr_clone() would be to get a partial
> > > view (like you mentioned above, to, e.g., do a hashing of a part of
> > > the memory range). So it feels it would be best UX if clone would
> > > allow you to define a new range in one go. So e.g., to create a
> > > "sub-dynptr" for range of bytes [10, 30), it should be enough to:
> > >
> > > struct bpf_dynptr orig_ptr, new_ptr;
> > >
> > > bpf_dynptr_clone(&orig_ptr, 10, 30, &new_ptr);
> > >
> > > Instead of:
> > >
> > > bpf_dynptr_clone(&orig_ptr, &new_ptr);
> > > bpf_dynptr_advance(&new_ptr, 10);
> > > bpf_dynptr_trim(&new_ptr, bpf_dynptr_get_size(&orig_ptr) - 30);
> > >
> >
> > I don't think we can do this because we can't have bpf_dynptr_clone()
> > fail (which might happen if we take in a range, if the range is
> > invalid). This is because in the case where the clone is of a
> > reference-counted dynptr (eg like a ringbuf-type dynptr), the clone
> > even if it's invalid must be treated by the verifier as a legit dynptr
> > (since the verifier can't know ahead of time whether the clone call
> > will succeed or not) which means if the invalid clone dynptr is then
> > passed into a reference-releasing function, the verifier will release
> > the reference but the actual reference won't be released at runtime
> > (since the clone dynptr is invalid), which would leak the reference.
> > An example is something like:
> >
> > // invalid range is passed, error is returned and new_ptr is invalid
> > bpf_dynptr_clone(&ringbuf_ptr, 9999999, 9999999, &new_ptr);
> > // this releases the reference and invalidates both new_ptr and ringbuf_ptr
> > bpf_ringbuf_discard_dynptr(&new_ptr, 0);
> >
> > At runtime, bpf_ringbuf_discard_dynptr() will be a no-op since new_ptr
> > is invalid - the ringbuf record still needs to be submitted/discarded,
> > but the verifier will think this already happened
>
> Ah, tricky, good point. But ok, I guess with bpf_dynptr_adjust()
> proposal in another email this would be ok:
>
> bpf_dynptr_clone(..); /* always succeeds */
> bpf_dynptr_adjust(&new_ptr, 10, 30); /* could fail to adjust, but
> dynptr is left valid */
>
> Right?

Yes, this would be okay because if bpf_dynptr_adjust fails, the clone
is valid (it was unaffected) so submitting/discarding it later on
would correctly release the reference

> > >
> > > This, btw, shows the awkwardness of the bpf_dynptr_trim() approach.
> > >
> > > If someone really wanted an exact full-sized copy, it's trivial:
> > >
> > > bpf_dynptr_clone(&orig_ptr, 0, bpf_dynptr_get_size(&orig_ptr), &new_ptr);
> > >
> > [...]
> > > > +static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > +{
> > > > +        struct bpf_func_state *state = func(env, reg);
> > > > +        int spi;
> > > > +
> > > > +        spi = dynptr_get_spi(env, reg);
> > > > +        if (spi < 0)
> > > > +                return spi;
> > > > +
> > > > +        if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> > > > +                int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id;
> > > > +                int i;
> > > > +
> > > > +                /* If the dynptr has a ref_obj_id, then we need to invaldiate
> > >
> > > typo: invalidate
> > >
> > > > +                 * two things:
> > > > +                 *
> > > > +                 * 1) Any dynptrs with a matching ref_obj_id (clones)
> > > > +                 * 2) Any slices associated with the ref_obj_id
> > >
> > > I think this is where this dynptr_id confusion comes from. The rule
> > > should be "any slices derived from this dynptr". But as mentioned on
> > > another thread, it's a separate topic which we can address later.
> > >
> > If there's a parent and a clone and slices derived from the parent and
> > slices derived from the clone, if the clone is invalidated then we
> > need to invalidate slices associated with the parent as well. So
> > shouldn't it be "any slices associated with the ref obj id" not "any
> > slices derived from this dynptr"? (also just a note, parent/clone
> > slices will share the same ref obj id and the same dynptr id, so
> > checking against either does the same thing)
>
> So, we have a ringbuf dynptr with ref_obj_id=1, id=2, ok? We clone it,
> clone gets ref_obj_id=1, id=3. If either original dynptr or clone
> dynptr is released due to bpf_ringbuf_discard_dynptr(), we invalidate
> all the dynptrs with ref_obj_id=1. During invalidation of each dynptr,
> we invalidate all the slices with ref_obj_id==dynptr's id. So we'll
> invalidate slices derived from dynptr with id=2 (original dynptr), and
> then all the slices derived from dynptr with id=3?
>
When we create a slice for a dynptr (through bpf_dynptr_data()), the
slice's reg->ref_obj_id is set to the dynptr's ref_obj_id (not
dynptr's id). During invalidation of the dynptr, we invalidate all the
slices with that ref obj id, which means we invalidate all slices for
any parents/clones.

[...]

> > > > +
> > > > +        err = process_dynptr_func(env, regno, insn_idx, arg_type);
> > > > +        if (err < 0)
> > > > +                return err;
> > > > +
> > > > +        spi = dynptr_get_spi(env, reg);
> > > > +        if (spi < 0)
> > > > +                return spi;
> > > > +
> > > > +        first_reg_state = &state->stack[spi].spilled_ptr;
> > > > +        second_reg_state = &state->stack[spi - 1].spilled_ptr;
> > > > +        ref_obj_id = first_reg_state->ref_obj_id;
> > > > +
> > > > +        /* reassign the clone the same dynptr id as the original */
> > > > +        __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id);
> > > > +        __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id);
> > >
> > > I'm not sure why clone should have the same dynptr_id? Isn't it a new
> > > instance of a dynptr? I get preserving ref_obj_id (if refcounted), but
> > > why reusing dynptr_id?..
> > >
> > I think we need to also copy over the dynptr id because in the case of
> > a non-reference counted dynptr, if the parent (or clone) is
> > invalidated (eg overwriting bytes of the dynptr on the stack), we must
> > also invalidate the slices of the clone (or parent)
>
> yep, right now we'll have to do that because we have dynptr_id. But if
> we get rid of it and stick to ref_obj_id and id, then clone would need
> to get a new id, but keep ref_obj_id, right?
>
Thinking about this some more, I think you're right that we shouldn't
reuse the dynptr id. in fact, i think reusing it would lead to
incorrect behavior - in the example of a non-reference counted dynptr,
if the parent dynptr is overwritten on the stack (and thus
invalidated), that shouldn't invalidate the slices of the clone at
all. I'll change this in the next version

> > [...]
On Tue, Apr 18, 2023 at 11:56 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Apr 17, 2023 at 4:46 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Apr 13, 2023 at 11:03 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > On Wed, Apr 12, 2023 at 3:12 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Sat, Apr 8, 2023 at 8:34 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > > >
> > > > > The cloned dynptr will point to the same data as its parent dynptr,
> > > > > with the same type, offset, size and read-only properties.
> > > > >
> > > > > Any writes to a dynptr will be reflected across all instances
> > > > > (by 'instance', this means any dynptrs that point to the same
> > > > > underlying data).
> > > > >
> > > > > Please note that data slice and dynptr invalidations will affect all
> > > > > instances as well. For example, if bpf_dynptr_write() is called on an
> > > > > skb-type dynptr, all data slices of dynptr instances to that skb
> > > > > will be invalidated as well (eg data slices of any clones, parents,
> > > > > grandparents, ...). Another example is if a ringbuf dynptr is submitted,
> > > > > any instance of that dynptr will be invalidated.
> > > > >
> > > > > Changing the view of the dynptr (eg advancing the offset or
> > > > > trimming the size) will only affect that dynptr and not affect any
> > > > > other instances.
> > > > >
> > > > > One example use case where cloning may be helpful is for hashing or
> > > > > iterating through dynptr data. Cloning will allow the user to maintain
> > > > > the original view of the dynptr for future use, while also allowing
> > > > > views to smaller subsets of the data after the offset is advanced or the
> > > > > size is trimmed.
> > > > >
> > > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > > > ---
> > > > >  kernel/bpf/helpers.c  |  14 +++++
> > > > >  kernel/bpf/verifier.c | 125 +++++++++++++++++++++++++++++++++++++-----
> > > > >  2 files changed, 126 insertions(+), 13 deletions(-)
> > > > >
> > > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > > > index bac4c6fe49f0..108f3bcfa6da 100644
> > > > > --- a/kernel/bpf/helpers.c
> > > > > +++ b/kernel/bpf/helpers.c
> > > > > @@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr)
> > > > >          return ptr->offset;
> > > > >  }
> > > > >
> > > > > +__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr,
> > > > > +                                 struct bpf_dynptr_kern *clone__uninit)
> > > >
> > > > I think most of uses for bpf_dynptr_clone() would be to get a partial
> > > > view (like you mentioned above, to, e.g., do a hashing of a part of
> > > > the memory range). So it feels it would be best UX if clone would
> > > > allow you to define a new range in one go. So e.g., to create a
> > > > "sub-dynptr" for range of bytes [10, 30), it should be enough to:
> > > >
> > > > struct bpf_dynptr orig_ptr, new_ptr;
> > > >
> > > > bpf_dynptr_clone(&orig_ptr, 10, 30, &new_ptr);
> > > >
> > > > Instead of:
> > > >
> > > > bpf_dynptr_clone(&orig_ptr, &new_ptr);
> > > > bpf_dynptr_advance(&new_ptr, 10);
> > > > bpf_dynptr_trim(&new_ptr, bpf_dynptr_get_size(&orig_ptr) - 30);
> > > >
> > >
> > > I don't think we can do this because we can't have bpf_dynptr_clone()
> > > fail (which might happen if we take in a range, if the range is
> > > invalid). This is because in the case where the clone is of a
> > > reference-counted dynptr (eg like a ringbuf-type dynptr), the clone
> > > even if it's invalid must be treated by the verifier as a legit dynptr
> > > (since the verifier can't know ahead of time whether the clone call
> > > will succeed or not) which means if the invalid clone dynptr is then
> > > passed into a reference-releasing function, the verifier will release
> > > the reference but the actual reference won't be released at runtime
> > > (since the clone dynptr is invalid), which would leak the reference.
> > > An example is something like:
> > >
> > > // invalid range is passed, error is returned and new_ptr is invalid
> > > bpf_dynptr_clone(&ringbuf_ptr, 9999999, 9999999, &new_ptr);
> > > // this releases the reference and invalidates both new_ptr and ringbuf_ptr
> > > bpf_ringbuf_discard_dynptr(&new_ptr, 0);
> > >
> > > At runtime, bpf_ringbuf_discard_dynptr() will be a no-op since new_ptr
> > > is invalid - the ringbuf record still needs to be submitted/discarded,
> > > but the verifier will think this already happened
> >
> > Ah, tricky, good point. But ok, I guess with bpf_dynptr_adjust()
> > proposal in another email this would be ok:
> >
> > bpf_dynptr_clone(..); /* always succeeds */
> > bpf_dynptr_adjust(&new_ptr, 10, 30); /* could fail to adjust, but
> > dynptr is left valid */
> >
> > Right?
>
> Yes, this would be okay because if bpf_dynptr_adjust fails, the clone
> is valid (it was unaffected) so submitting/discarding it later on
> would correctly release the reference

ok, cool, let's do that and keep things simple

> > > >
> > > > This, btw, shows the awkwardness of the bpf_dynptr_trim() approach.
> > > >
> > > > If someone really wanted an exact full-sized copy, it's trivial:
> > > >
> > > > bpf_dynptr_clone(&orig_ptr, 0, bpf_dynptr_get_size(&orig_ptr), &new_ptr);
> > > >
> > > [...]
> > > > > +static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > > +{
> > > > > +        struct bpf_func_state *state = func(env, reg);
> > > > > +        int spi;
> > > > > +
> > > > > +        spi = dynptr_get_spi(env, reg);
> > > > > +        if (spi < 0)
> > > > > +                return spi;
> > > > > +
> > > > > +        if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> > > > > +                int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id;
> > > > > +                int i;
> > > > > +
> > > > > +                /* If the dynptr has a ref_obj_id, then we need to invaldiate
> > > >
> > > > typo: invalidate
> > > >
> > > > > +                 * two things:
> > > > > +                 *
> > > > > +                 * 1) Any dynptrs with a matching ref_obj_id (clones)
> > > > > +                 * 2) Any slices associated with the ref_obj_id
> > > >
> > > > I think this is where this dynptr_id confusion comes from. The rule
> > > > should be "any slices derived from this dynptr". But as mentioned on
> > > > another thread, it's a separate topic which we can address later.
> > > >
> > > If there's a parent and a clone and slices derived from the parent and
> > > slices derived from the clone, if the clone is invalidated then we
> > > need to invalidate slices associated with the parent as well. So
> > > shouldn't it be "any slices associated with the ref obj id" not "any
> > > slices derived from this dynptr"? (also just a note, parent/clone
> > > slices will share the same ref obj id and the same dynptr id, so
> > > checking against either does the same thing)
> >
> > So, we have a ringbuf dynptr with ref_obj_id=1, id=2, ok? We clone it,
> > clone gets ref_obj_id=1, id=3. If either original dynptr or clone
> > dynptr is released due to bpf_ringbuf_discard_dynptr(), we invalidate
> > all the dynptrs with ref_obj_id=1. During invalidation of each dynptr,
> > we invalidate all the slices with ref_obj_id==dynptr's id. So we'll
> > invalidate slices derived from dynptr with id=2 (original dynptr), and
> > then all the slices derived from dynptr with id=3?
> >
> When we create a slice for a dynptr (through bpf_dynptr_data()), the
> slice's reg->ref_obj_id is set to the dynptr's ref_obj_id (not
> dynptr's id). During invalidation of the dynptr, we invalidate all the
> slices with that ref obj id, which means we invalidate all slices for
> any parents/clones.
>
> [...]

Yep, because of how we define that ref_obj_id should be in refs array.
What I'm saying is that we should probably change that to be more
general "ID of an object which lifetime we are associated with". That
would need some verifier internals adjustments. So let's wrap up this
particular ref_obj_id revamp discussion for now, it's a separate topic
that will be just distracting us.

> > > > > +
> > > > > +        err = process_dynptr_func(env, regno, insn_idx, arg_type);
> > > > > +        if (err < 0)
> > > > > +                return err;
> > > > > +
> > > > > +        spi = dynptr_get_spi(env, reg);
> > > > > +        if (spi < 0)
> > > > > +                return spi;
> > > > > +
> > > > > +        first_reg_state = &state->stack[spi].spilled_ptr;
> > > > > +        second_reg_state = &state->stack[spi - 1].spilled_ptr;
> > > > > +        ref_obj_id = first_reg_state->ref_obj_id;
> > > > > +
> > > > > +        /* reassign the clone the same dynptr id as the original */
> > > > > +        __mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id);
> > > > > +        __mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id);
> > > >
> > > > I'm not sure why clone should have the same dynptr_id? Isn't it a new
> > > > instance of a dynptr? I get preserving ref_obj_id (if refcounted), but
> > > > why reusing dynptr_id?..
> > > >
> > > I think we need to also copy over the dynptr id because in the case of
> > > a non-reference counted dynptr, if the parent (or clone) is
> > > invalidated (eg overwriting bytes of the dynptr on the stack), we must
> > > also invalidate the slices of the clone (or parent)
> >
> > yep, right now we'll have to do that because we have dynptr_id. But if
> > we get rid of it and stick to ref_obj_id and id, then clone would need
> > to get a new id, but keep ref_obj_id, right?
> >
> Thinking about this some more, I think you're right that we shouldn't
> reuse the dynptr id. in fact, i think reusing it would lead to
> incorrect behavior - in the example of a non-reference counted dynptr,
> if the parent dynptr is overwritten on the stack (and thus
> invalidated), that shouldn't invalidate the slices of the clone at
> all. I'll change this in the next version

ok

> >
> > [...]
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index bac4c6fe49f0..108f3bcfa6da 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2351,6 +2351,19 @@ __bpf_kfunc __u32 bpf_dynptr_get_offset(const struct bpf_dynptr_kern *ptr)
 	return ptr->offset;
 }
 
+__bpf_kfunc int bpf_dynptr_clone(struct bpf_dynptr_kern *ptr,
+				 struct bpf_dynptr_kern *clone__uninit)
+{
+	if (!ptr->data) {
+		bpf_dynptr_set_null(clone__uninit);
+		return -EINVAL;
+	}
+
+	memcpy(clone__uninit, ptr, sizeof(*clone__uninit));
+
+	return 0;
+}
+
 __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
 {
 	return obj;
@@ -2429,6 +2442,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
 BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
 BTF_ID_FLAGS(func, bpf_dynptr_get_size)
 BTF_ID_FLAGS(func, bpf_dynptr_get_offset)
+BTF_ID_FLAGS(func, bpf_dynptr_clone)
 BTF_SET8_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3660b573048a..804cb50050f9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -302,6 +302,7 @@ struct bpf_kfunc_call_arg_meta {
 	struct {
 		enum bpf_dynptr_type type;
 		u32 id;
+		u32 ref_obj_id;
 	} initialized_dynptr;
 	struct {
 		u8 spi;
@@ -963,24 +964,15 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 	return 0;
 }
 
-static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+static void invalidate_dynptr(struct bpf_verifier_env *env, struct bpf_func_state *state, int spi)
 {
-	struct bpf_func_state *state = func(env, reg);
-	int spi, i;
-
-	spi = dynptr_get_spi(env, reg);
-	if (spi < 0)
-		return spi;
+	int i;
 
 	for (i = 0; i < BPF_REG_SIZE; i++) {
 		state->stack[spi].slot_type[i] = STACK_INVALID;
 		state->stack[spi - 1].slot_type[i] = STACK_INVALID;
 	}
 
-	/* Invalidate any slices associated with this dynptr */
-	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
-		WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
-
 	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
 	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
 
@@ -1007,6 +999,51 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 	 */
 	state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
 	state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
+}
+
+static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	struct bpf_func_state *state = func(env, reg);
+	int spi;
+
+	spi = dynptr_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
+
+	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
+		int ref_obj_id = state->stack[spi].spilled_ptr.ref_obj_id;
+		int i;
+
+		/* If the dynptr has a ref_obj_id, then we need to invaldiate
+		 * two things:
+		 *
+		 * 1) Any dynptrs with a matching ref_obj_id (clones)
+		 * 2) Any slices associated with the ref_obj_id
+		 */
+
+		/* Invalidate any slices associated with this dynptr */
+		WARN_ON_ONCE(release_reference(env, ref_obj_id));
+
+		/* Invalidate any dynptr clones */
+		for (i = 1; i < state->allocated_stack / BPF_REG_SIZE; i++) {
+			if (state->stack[i].spilled_ptr.ref_obj_id == ref_obj_id) {
+				/* it should always be the case that if the ref obj id
+				 * matches then the stack slot also belongs to a
+				 * dynptr
+				 */
+				if (state->stack[i].slot_type[0] != STACK_DYNPTR) {
+					verbose(env, "verifier internal error: misconfigured ref_obj_id\n");
+					return -EFAULT;
+				}
+				if (state->stack[i].spilled_ptr.dynptr.first_slot)
+					invalidate_dynptr(env, state, i);
+			}
+		}
+
+		return 0;
+	}
+
+	invalidate_dynptr(env, state, spi);
 
 	return 0;
 }
@@ -6967,6 +7004,50 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
 	return 0;
 }
 
+static int handle_dynptr_clone(struct bpf_verifier_env *env, enum bpf_arg_type arg_type,
+			       int regno, int insn_idx, struct bpf_kfunc_call_arg_meta *meta)
+{
+	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
+	struct bpf_reg_state *first_reg_state, *second_reg_state;
+	struct bpf_func_state *state = func(env, reg);
+	enum bpf_dynptr_type dynptr_type = meta->initialized_dynptr.type;
+	int err, spi, ref_obj_id;
+
+	if (!dynptr_type) {
+		verbose(env, "verifier internal error: no dynptr type for bpf_dynptr_clone\n");
+		return -EFAULT;
+	}
+	arg_type |= get_dynptr_type_flag(dynptr_type);
+
+	err = process_dynptr_func(env, regno, insn_idx, arg_type);
+	if (err < 0)
+		return err;
+
+	spi = dynptr_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
+
+	first_reg_state = &state->stack[spi].spilled_ptr;
+	second_reg_state = &state->stack[spi - 1].spilled_ptr;
+	ref_obj_id = first_reg_state->ref_obj_id;
+
+	/* reassign the clone the same dynptr id as the original */
+	__mark_dynptr_reg(first_reg_state, dynptr_type, true, meta->initialized_dynptr.id);
+	__mark_dynptr_reg(second_reg_state, dynptr_type, false, meta->initialized_dynptr.id);
+
+	if (meta->initialized_dynptr.ref_obj_id) {
+		/* release the new ref obj id assigned during process_dynptr_func */
+		err = release_reference_state(cur_func(env), ref_obj_id);
+		if (err)
+			return err;
+		/* reassign the clone the same ref obj id as the original */
+		first_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id;
+		second_reg_state->ref_obj_id = meta->initialized_dynptr.ref_obj_id;
+	}
+
+	return 0;
+}
+
 static bool arg_type_is_mem_size(enum bpf_arg_type type)
 {
 	return type == ARG_CONST_SIZE ||
@@ -9615,6 +9696,7 @@ enum special_kfunc_type {
 	KF_bpf_dynptr_from_xdp,
 	KF_bpf_dynptr_slice,
 	KF_bpf_dynptr_slice_rdwr,
+	KF_bpf_dynptr_clone,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -9633,6 +9715,7 @@ BTF_ID(func, bpf_dynptr_from_skb)
 BTF_ID(func, bpf_dynptr_from_xdp)
 BTF_ID(func, bpf_dynptr_slice)
 BTF_ID(func, bpf_dynptr_slice_rdwr)
+BTF_ID(func, bpf_dynptr_clone)
 BTF_SET_END(special_kfunc_set)
 
 BTF_ID_LIST(special_kfunc_list)
@@ -9653,6 +9736,7 @@ BTF_ID(func, bpf_dynptr_from_skb)
 BTF_ID(func, bpf_dynptr_from_xdp)
 BTF_ID(func, bpf_dynptr_slice)
 BTF_ID(func, bpf_dynptr_slice_rdwr)
+BTF_ID(func, bpf_dynptr_clone)
 
 static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta)
 {
@@ -10414,10 +10498,24 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 			if (is_kfunc_arg_uninit(btf, &args[i]))
 				dynptr_arg_type |= MEM_UNINIT;
 
-			if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_from_skb])
+			if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_from_skb]) {
 				dynptr_arg_type |= DYNPTR_TYPE_SKB;
-			else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_from_xdp])
+			} else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_from_xdp]) {
 				dynptr_arg_type |= DYNPTR_TYPE_XDP;
+			} else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_clone] &&
+				   (dynptr_arg_type & MEM_UNINIT)) {
+				/* bpf_dynptr_clone is special.
+				 *
+				 * we need to assign the clone the same dynptr type and
+				 * the clone needs to have the same id and ref_obj_id as
+				 * the original dynptr
+				 */
+				ret = handle_dynptr_clone(env, dynptr_arg_type, regno, insn_idx, meta);
+				if (ret < 0)
+					return ret;
+
+				break;
+			}
 
 			ret = process_dynptr_func(env, regno, insn_idx, dynptr_arg_type);
 			if (ret < 0)
@@ -10432,6 +10530,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 			}
 			meta->initialized_dynptr.id = id;
 			meta->initialized_dynptr.type = dynptr_get_type(env, reg);
+			meta->initialized_dynptr.ref_obj_id = dynptr_ref_obj_id(env, reg);
 		}
 		break;