Message ID | 20220908000254.3079129-8-joannelkoong@gmail.com
---|---
State | Changes Requested
Delegated to | BPF
Series | Dynptr convenience helpers
Context | Check | Description
---|---|---
netdev/tree_selection | success | Clearly marked for bpf-next, async
netdev/apply | fail | Patch does not apply to bpf-next
bpf/vmtest-bpf-next-PR | fail | merge-conflict
On Thu, 8 Sept 2022 at 02:16, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> Add a new helper function, bpf_dynptr_iterator:
>
>   long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
>                            void *callback_ctx, u64 flags)
>
> where callback_fn is defined as:
>
>   long (*callback_fn)(struct bpf_dynptr *ptr, void *ctx)
>
> and callback_fn returns the number of bytes to advance the
> dynptr by (or an error code in the case of error). The iteration
> will stop if the callback_fn returns 0 or an error or tries to
> advance by more bytes than available.
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---

This is buggy as is.

A user can simply reinitialize the dynptr from inside the callback, and then the helper will never stop advancing it, so an infinite loop can be constructed. The stack frame of the caller is accessible through callback_ctx.

For example (modifying your selftest):

diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index 22164ad6df9d..a9e78316c508 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -540,6 +540,19 @@ struct iter_ctx {

 static int iter_callback(struct bpf_dynptr *ptr, struct iter_ctx *ctx)
 {
+	struct map_value *map_val;
+	int key = 0;
+
+	map_val = bpf_map_lookup_elem(&array_map2, &key);
+	if (!map_val) {
+		return 0;
+	}
+
+	*(void **)ptr = 0;
+	*((void **)ptr + 1) = 0;
+	bpf_dynptr_from_mem(map_val, 2, 0, ptr);
+	return 1;
+
 	if (ctx->trigger_err_erange)
 		return bpf_dynptr_get_size(ptr) + 1;

... which leads to a lockup.

It doesn't have to be bpf_ringbuf_reserve_dynptr; it can just be bpf_dynptr_from_mem, which also gets around the reference state restrictions inside callbacks.

You cannot prevent overwriting dynptr stack slots in general: some of them don't have to be released, and forbidding overwrites would be prohibitive for stack reuse.

So it seems like you need to temporarily mark both of the dynptr's stack slots as immutable in the caller's stack state while the callback is being explored. Setting some flag in r1 for the callback is not enough (the callback can reload a PTR_TO_STACK pointing at the caller's dynptr from callback_ctx); it needs to be set in spilled_ptr.

Then operations that modify the dynptr's view (e.g. trim, advance, init functions) refuse a dynptr with that type flag set, while operations that only work through the view (e.g. read/write/dynptr_data) simply fold the flag.

It is the difference between struct bpf_dynptr * and const struct bpf_dynptr *; we need to give the callback access to the latter, i.e. it should still allow accessing the dynptr's view, but not modifying it.

And at the risk of sounding like a broken record (this is the same suggestion Martin made in the skb/xdp v6 set): the view's mutability should ideally also be part of the verifier's state. That doesn't preclude runtime tracking later, but there seems to be no strong motivation for it right now.

>  include/uapi/linux/bpf.h       | 20 ++++++++++++++
>  kernel/bpf/helpers.c           | 48 +++++++++++++++++++++++++++++++---
>  kernel/bpf/verifier.c          | 27 +++++++++++++++++++
>  tools/include/uapi/linux/bpf.h | 20 ++++++++++++++
>  4 files changed, 111 insertions(+), 4 deletions(-)

[...]
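To make the const-view distinction above concrete, here is a rough sketch of the two callback contracts from the BPF program side. This is illustrative only: the read-only variant and the exact set of helpers it would admit are assumptions about the proposed design, not anything this series defines.

/* Mutable view: the callback may advance/trim/reinitialize the dynptr,
 * which is exactly what enables the lockup above. */
static long cb_mutable(struct bpf_dynptr *ptr, void *ctx)
{
	bpf_dynptr_advance(ptr, 1);	/* changes the caller's view */
	return 1;
}

/* Read-only view (hypothetical): helpers that only go through the view
 * stay legal; helpers that change the view would be rejected at load
 * time by the verifier. */
static long cb_const(const struct bpf_dynptr *ptr, void *ctx)
{
	__u8 buf[4];

	/* ok: reads through the view without modifying it */
	bpf_dynptr_read(buf, sizeof(buf), (struct bpf_dynptr *)ptr, 0, 0);
	/* bpf_dynptr_advance((struct bpf_dynptr *)ptr, 1);  <- rejected */
	return sizeof(buf);
}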
On Wed, Sep 7, 2022 at 5:07 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> Add a new helper function, bpf_dynptr_iterator:
>
>   long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
>                            void *callback_ctx, u64 flags)
>
[...]
>
> +	while (ptr->size > 0) {
> +		nr_bytes = callback((u64)(long)ptr, (u64)(long)callback_ctx, 0, 0, 0);
> +		if (nr_bytes <= 0)
> +			return nr_bytes;

callback is defined as returning long, but here you are silently truncating it to int. Let's either stick to longs or to ints.

> +
> +		err = __bpf_dynptr_advance(ptr, nr_bytes);

As Kumar pointed out, you can't modify ptr in place; you have to create a local copy and bpf_dynptr_clone() it (so that for MALLOC you'll bump the refcnt), then advance it and pass it to the callback. David has such a local-dynptr use case in the bpf_user_ringbuf_drain() helper.

> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}
> +

[...]
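A minimal sketch of how both points might land in the helper body: keeping the callback's long return width, and iterating over a local clone so the caller's dynptr is never touched. __bpf_dynptr_clone() and __bpf_dynptr_put() are assumed internal helpers here (the series only defines the UAPI-facing bpf_dynptr_clone()), so treat this as an outline rather than the actual fix:

BPF_CALL_4(bpf_dynptr_iterator, struct bpf_dynptr_kern *, ptr, void *, callback_fn,
	   void *, callback_ctx, u64, flags)
{
	bpf_callback_t callback = (bpf_callback_t)callback_fn;
	struct bpf_dynptr_kern it;
	long nr_bytes, err = 0;	/* long throughout, no silent truncation */

	if (!ptr->data || flags)
		return -EINVAL;

	/* assumed internal helper: copies the view and, for MALLOC
	 * dynptrs, bumps the refcount, mirroring bpf_dynptr_clone() */
	__bpf_dynptr_clone(ptr, &it);

	while (it.size > 0) {
		nr_bytes = callback((u64)(long)&it, (u64)(long)callback_ctx, 0, 0, 0);
		if (nr_bytes <= 0) {
			err = nr_bytes;
			break;
		}
		err = __bpf_dynptr_advance(&it, nr_bytes);
		if (err)
			break;
	}

	/* assumed internal helper: drops the reference taken by the clone */
	__bpf_dynptr_put(&it);
	return err;
}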
On Sun, Sep 18, 2022 at 5:08 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Thu, 8 Sept 2022 at 02:16, Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > Add a new helper function, bpf_dynptr_iterator:
> >
[...]
>
> This is buggy as is.
>
> A user can just reinitialize the dynptr from inside the callback, and
> then you will never stop advancing it inside your helper, therefore an
> infinite loop can be constructed. The stack frame of the caller is
> accessible using callback_ctx.
>
[...]
>
> You cannot prevent overwriting dynptr stack slots in general. Some of
> them don't have to be released. It would be prohibitive for stack
> reuse.
>
> So it seems like you need to temporarily mark both the slots as
> immutable for the caller stack state during exploration of the
> callback. Setting some flag in r1 for callback is not enough (as it
> can reload PTR_TO_STACK of caller stack frame pointing at dynptr from
> callback_ctx). It needs to be set in spilled_ptr.

This sounds overcomplicated. We need a local copy of the dynptr for the duration of the iteration and should work with that; basically an internal bpf_dynptr_clone(). See my other reply in this thread.

> Then certain operations modifying the view of the dynptr do not accept
> dynptr with that type flag set (e.g. trim, advance, init functions).
> While for others which only operate on the underlying view, you fold
> the flag (e.g. read/write/dynptr_data).
>
> It is the difference between struct bpf_dynptr *, vs const struct
> bpf_dynptr *, we need to give the callback access to the latter.
> I.e. it should still allow accessing the dynptr's view, but not modifying it.
>
> And at the risk of sounding like a broken record (and same suggestion
> as Martin in skb/xdp v6 set), the view's mutability should ideally
> also be part of the verifier's state. That doesn't preclude runtime
> tracking later, but there seems to be no strong motivation for that
> right now.

The unexpected-NULL argument from Martin, for bpf_dynptr_data() vs bpf_dynptr_data_rdonly(), is pretty convincing, I agree. So I guess I don't mind tracking it statically at this point.

> >  include/uapi/linux/bpf.h       | 20 ++++++++++++++
> >  kernel/bpf/helpers.c           | 48 +++++++++++++++++++++++++++++++---
> >  kernel/bpf/verifier.c          | 27 +++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h | 20 ++++++++++++++
> >  4 files changed, 111 insertions(+), 4 deletions(-)

Please trim irrelevant parts.

[...]
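To spell out the NULL ambiguity Martin raised: with runtime-only tracking of read-only dynptrs, a NULL from bpf_dynptr_data() becomes ambiguous to the program. An illustrative BPF-side sketch; bpf_dynptr_data_rdonly() is only the proposed name from that discussion, not a helper in this series:

/* Runtime-only tracking: bpf_dynptr_data() must fail on a read-only
 * dynptr, so its NULL return conflates two different conditions. */
static int parse_runtime_tracked(struct bpf_dynptr *ptr)
{
	__u8 *p = bpf_dynptr_data(ptr, 0, 8);

	if (!p)		/* out of bounds? or just a read-only view (e.g. a rdonly skb)? */
		return -1;
	return p[0];
}

/* Statically tracked mutability: the verifier already knows the view is
 * read-only, the program calls the read-only accessor, and NULL can
 * only mean the range check failed. */
static int parse_statically_tracked(struct bpf_dynptr *ptr)
{
	const __u8 *p = bpf_dynptr_data_rdonly(ptr, 0, 8);	/* proposed name */

	if (!p)		/* unambiguous: invalid range */
		return -1;
	return p[0];
}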
On Thu, 29 Sept 2022 at 01:04, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Sep 7, 2022 at 5:07 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > Add a new helper function, bpf_dynptr_iterator:
> >
[...]
>
> callback is defined as returning long but here you are silently
> truncating it to int. Let's either stick to longs or to ints.
>
> > +
> > +		err = __bpf_dynptr_advance(ptr, nr_bytes);
>
> as Kumar pointed out, you can't modify ptr in place, you have to
> create a local copy and bpf_dynptr_clone() it (so that for MALLOC
> you'll bump refcnt). Then advance and pass it to callback. David has
> such local dynptr use case in bpf_user_ringbuf_drain() helper.

My solution was a bit overcomplicated because just creating a local copy doesn't fix this completely. There is still the hole of writing through arg #0 (which then no longer reflects runtime state, since writes at runtime go to the clone while the verifier thinks you touched the stack slots), and it still allows constructing the infinite loop (because, well, you can overwrite the dynptr through it). The 'side channel' of writing to the dynptr slots through callback_ctx is still there as well.

Maybe the infinite loop _can_ be avoided if you clone inside each iteration; that part isn't very clear.

My reasoning was that when you iterate, the container is always immutable (to prevent iterator invalidation) while the mutability of the elements remains unaffected. Setting the flag in spilled_ptr covers all bases (the PTR_TO_STACK arg #0, access to it through callback_ctx, etc.).

But I totally agree with you that we should be working on a local copy inside the helper and leave the original dynptr untouched. I think the first arg should then not be PTR_TO_STACK but some other pointer type. Maybe it should be its own register type, and spilled_ptr should hold the same register, which lets the dynptr state we track be copied into the callback's arg #0 easily.

As for writes to the original dynptr, that is already broken right now; we can't track the state of the stack across iterations correctly anyway, so I think it has to be left as is until your N-callbacks rework happens.

wdyt?
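The callback_ctx 'side channel' described above is easy to picture. A hypothetical program shape (buf stands in for any verifier-visible memory):

char buf[16];

struct cb_ctx {
	struct bpf_dynptr *dp;	/* points back at the caller's stack dynptr */
};

static long iter_cb(struct bpf_dynptr *ptr, struct cb_ctx *ctx)
{
	/* even if arg #0 were made immutable, the caller's dynptr slots
	 * stay reachable through ctx->dp, so the iterated dynptr can be
	 * reset from here on every iteration unless spilled_ptr itself
	 * carries the immutability flag */
	bpf_dynptr_from_mem(buf, sizeof(buf), 0, ctx->dp);
	return 1;	/* keep iterating: the view never shrinks */
}

static int prog(void)
{
	struct bpf_dynptr dp;
	struct cb_ctx ctx = { .dp = &dp };

	bpf_dynptr_from_mem(buf, sizeof(buf), 0, &dp);
	return bpf_dynptr_iterator(&dp, iter_cb, &ctx, 0);
}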
On Wed, Sep 28, 2022 at 5:32 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Thu, 29 Sept 2022 at 01:04, Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
[...]
>
> My solution was a bit overcomplicated because just creating a local
> copy doesn't fix this completely, there's still the hole of writing
> through arg#0 (which now does not reflect runtime state, as writes at
> runtime go to clone while verifier thinks you touched stack slots),
> and still allows constructing the infinite loop (because well, you can
> overwrite dynptr through it). The 'side channel' of writing to dynptr
> slots through callback_ctx is still there as well.
>
[...]
>
> But I totally agree with you that we should be working on a local copy
> inside the helper and leave the original dynptr untouched.
> I think then the first arg should not be PTR_TO_STACK but some other
> pointer type. Maybe it should be its own register type, and
> spilled_ptr should reflect the same register, which allows the dynptr
> state that we track to be copied into the callback arg#0 easily.

Right, something like what David Vernet did with his bpf_user_ringbuf_drain() helper, which passes a kernel-only (not PTR_TO_STACK) dynptr into the callback. If that implementation has the same hole (being able to be modified), we should fix it the same way in both cases, by not allowing this for PTR_TO_DYNPTR (or whatever the new reg type is called).

> For the case of writes to the original dynptr that is already broken
> right now, we can't track the state of the stack across iterations
> correctly anyway so I think that has to be left as is until your N
> callbacks rework happens.

Right, that's a separate issue.

> wdyt?
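For reference, the shape of the precedent being pointed to. These are the signatures from the bpf_user_ringbuf_drain() set as I understand them; the key property is that the dynptr handed to the callback is constructed by the kernel, not by the program on its own stack:

/* helper, from the BPF_MAP_TYPE_USER_RINGBUF series: drains samples
 * from the user ring buffer, invoking the callback once per sample */
long bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn,
			    void *ctx, u64 flags);

/* callback: dynptr is kernel-constructed (a non-stack dynptr register
 * type), which is what should make it impossible for the program to
 * overwrite or reinitialize it */
long (*callback_fn)(struct bpf_dynptr *dynptr, void *ctx);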
On Thu, 29 Sept 2022 at 02:43, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Sep 28, 2022 at 5:32 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
[...]
>
> Right, something like what David Vernet did with his
> bpf_user_ringbuf_drain() helper that passes kernel-only (not
> PTR_TO_STACK) dynptr into callback. If that implementation has the
> same hole (being able to be modified), we should fix it the same way
> in both cases, by not allowing this for PTR_TO_DYNPTR (or whatever the
> new reg type is called).

Sorry for the late response; somehow I missed that series entirely.

Yes, we should reuse that register type, and it does seem like it needs to check for MEM_UNINIT to prevent reinitialization. I'm rolling that fix into my dynptr series that I'm sending next week, since it would otherwise lead to lots of conflicts. Then this set can use the same approach.
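Roughly, the check being described might look like this in the verifier's argument-checking path. This is a sketch, not the actual diff; PTR_TO_DYNPTR stands for whatever the kernel-only dynptr register type ends up being called:

/* in the dynptr argument handling: a kernel-provided dynptr is always
 * already initialized, so any helper that takes
 * ARG_PTR_TO_DYNPTR | MEM_UNINIT (i.e. wants to initialize the dynptr,
 * like bpf_dynptr_from_mem) must be rejected for it */
if (base_type(reg->type) == PTR_TO_DYNPTR && (arg_type & MEM_UNINIT)) {
	verbose(env, "cannot reinitialize a dynptr provided by the kernel\n");
	return -EINVAL;
}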
On Sun, Oct 2, 2022 at 9:46 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Thu, 29 Sept 2022 at 02:43, Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
[...]
>
> Sorry for the late response. Somehow I missed that series entirely.
>
> Yes, we should reuse that register type, and it does seem like it
> needs to check for MEM_UNINIT to prevent reinitialization. I'm rolling
> that fix into my dynptr series that I'm sending next week, since it
> would lead to lots of conflicts otherwise. Then this set can use the
> same approach.

Sounds good, thanks!
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 16973fa4d073..ff78a94c262a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5531,6 +5531,25 @@ union bpf_attr {
  *		losing access to the original view of the dynptr.
  *	Return
  *		0 on success, -EINVAL if the dynptr to clone is invalid.
+ *
+ * long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn, void *callback_ctx, u64 flags)
+ *	Description
+ *		Iterate through the dynptr data, calling **callback_fn** on each
+ *		iteration with **callback_ctx** as the context parameter.
+ *		The **callback_fn** should be a static function and
+ *		the **callback_ctx** should be a pointer to the stack.
+ *		Currently **flags** is unused and must be 0.
+ *
+ *		long (\*callback_fn)(struct bpf_dynptr \*ptr, void \*ctx);
+ *
+ *		where **callback_fn** returns the number of bytes to advance
+ *		the dynptr by or an error. The iteration will stop if **callback_fn**
+ *		returns 0 or an error or tries to advance by more bytes than the
+ *		size of the dynptr.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid or **flags** is not 0,
+ *		-ERANGE if attempting to iterate more bytes than available, or other
+ *		negative error if **callback_fn** returns an error.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5752,6 +5771,7 @@ union bpf_attr {
 	FN(dynptr_get_size),		\
 	FN(dynptr_get_offset),		\
 	FN(dynptr_clone),		\
+	FN(dynptr_iterator),		\
 	/* */

 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 667f1e213a61..519b3da06d49 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1653,13 +1653,11 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
 	.arg3_type	= ARG_CONST_ALLOC_SIZE_OR_ZERO,
 };

-BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
+/* *ptr* should always be a valid dynptr */
+static int __bpf_dynptr_advance(struct bpf_dynptr_kern *ptr, u32 len)
 {
 	u32 size;

-	if (!ptr->data)
-		return -EINVAL;
-
 	size = __bpf_dynptr_get_size(ptr);

 	if (len > size)
@@ -1672,6 +1670,14 @@ BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
 	return 0;
 }

+BPF_CALL_2(bpf_dynptr_advance, struct bpf_dynptr_kern *, ptr, u32, len)
+{
+	if (!ptr->data)
+		return -EINVAL;
+
+	return __bpf_dynptr_advance(ptr, len);
+}
+
 static const struct bpf_func_proto bpf_dynptr_advance_proto = {
 	.func		= bpf_dynptr_advance,
 	.gpl_only	= false,
@@ -1783,6 +1789,38 @@ static const struct bpf_func_proto bpf_dynptr_clone_proto = {
 	.arg2_type	= ARG_PTR_TO_DYNPTR | MEM_UNINIT,
 };

+BPF_CALL_4(bpf_dynptr_iterator, struct bpf_dynptr_kern *, ptr, void *, callback_fn,
+	   void *, callback_ctx, u64, flags)
+{
+	bpf_callback_t callback = (bpf_callback_t)callback_fn;
+	int nr_bytes, err;
+
+	if (!ptr->data || flags)
+		return -EINVAL;
+
+	while (ptr->size > 0) {
+		nr_bytes = callback((u64)(long)ptr, (u64)(long)callback_ctx, 0, 0, 0);
+		if (nr_bytes <= 0)
+			return nr_bytes;
+
+		err = __bpf_dynptr_advance(ptr, nr_bytes);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_dynptr_iterator_proto = {
+	.func		= bpf_dynptr_iterator,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg2_type	= ARG_PTR_TO_FUNC,
+	.arg3_type	= ARG_PTR_TO_STACK_OR_NULL,
+	.arg4_type	= ARG_ANYTHING,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1869,6 +1907,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_dynptr_get_offset_proto;
 	case BPF_FUNC_dynptr_clone:
 		return &bpf_dynptr_clone_proto;
+	case BPF_FUNC_dynptr_iterator:
+		return &bpf_dynptr_iterator_proto;
 	default:
 		break;
 	}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2eb2a4410344..76108cd4ed85 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6901,6 +6901,29 @@ static int set_map_elem_callback_state(struct bpf_verifier_env *env,
 	return 0;
 }

+static int set_dynptr_iterator_callback_state(struct bpf_verifier_env *env,
+					      struct bpf_func_state *caller,
+					      struct bpf_func_state *callee,
+					      int insn_idx)
+{
+	/* bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
+	 *		       void *callback_ctx, u64 flags);
+	 *
+	 * callback_fn(struct bpf_dynptr *ptr, void *callback_ctx);
+	 */
+	callee->regs[BPF_REG_1] = caller->regs[BPF_REG_1];
+	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
+	callee->callback_ret_range = tnum_range(0, U64_MAX);
+
+	/* unused */
+	__mark_reg_not_init(env, &callee->regs[BPF_REG_3]);
+	__mark_reg_not_init(env, &callee->regs[BPF_REG_4]);
+	__mark_reg_not_init(env, &callee->regs[BPF_REG_5]);
+
+	callee->in_callback_fn = true;
+	return 0;
+}
+
 static int set_loop_callback_state(struct bpf_verifier_env *env,
 				   struct bpf_func_state *caller,
 				   struct bpf_func_state *callee,
@@ -7472,6 +7495,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn

 		break;
 	}
+	case BPF_FUNC_dynptr_iterator:
+		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
+					set_dynptr_iterator_callback_state);
+		break;
 	}

 	if (err)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 16973fa4d073..ff78a94c262a 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5531,6 +5531,25 @@ union bpf_attr {
  *		losing access to the original view of the dynptr.
  *	Return
  *		0 on success, -EINVAL if the dynptr to clone is invalid.
+ *
+ * long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn, void *callback_ctx, u64 flags)
+ *	Description
+ *		Iterate through the dynptr data, calling **callback_fn** on each
+ *		iteration with **callback_ctx** as the context parameter.
+ *		The **callback_fn** should be a static function and
+ *		the **callback_ctx** should be a pointer to the stack.
+ *		Currently **flags** is unused and must be 0.
+ *
+ *		long (\*callback_fn)(struct bpf_dynptr \*ptr, void \*ctx);
+ *
+ *		where **callback_fn** returns the number of bytes to advance
+ *		the dynptr by or an error. The iteration will stop if **callback_fn**
+ *		returns 0 or an error or tries to advance by more bytes than the
+ *		size of the dynptr.
+ *	Return
+ *		0 on success, -EINVAL if the dynptr is invalid or **flags** is not 0,
+ *		-ERANGE if attempting to iterate more bytes than available, or other
+ *		negative error if **callback_fn** returns an error.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5752,6 +5771,7 @@ union bpf_attr {
 	FN(dynptr_get_size),		\
 	FN(dynptr_get_offset),		\
 	FN(dynptr_clone),		\
+	FN(dynptr_iterator),		\
 	/* */

 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
Add a new helper function, bpf_dynptr_iterator:

  long bpf_dynptr_iterator(struct bpf_dynptr *ptr, void *callback_fn,
                           void *callback_ctx, u64 flags)

where callback_fn is defined as:

  long (*callback_fn)(struct bpf_dynptr *ptr, void *ctx)

and callback_fn returns the number of bytes to advance the
dynptr by (or an error code in the case of error). The iteration
will stop if the callback_fn returns 0 or an error or tries to
advance by more bytes than available.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 include/uapi/linux/bpf.h       | 20 ++++++++++++++
 kernel/bpf/helpers.c           | 48 +++++++++++++++++++++++++++++++---
 kernel/bpf/verifier.c          | 27 +++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 20 ++++++++++++++
 4 files changed, 111 insertions(+), 4 deletions(-)
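For a sense of the intended usage, a small BPF-side sketch of driving the iterator. The program shape, parse_chunk, and the 64-byte stride are illustrative only; assume the usual selftest scaffolding (vmlinux.h / bpf_helpers.h) plus this series' helper definitions:

char buf[256];

struct parse_ctx {
	__u32 chunks;
};

/* consume the dynptr in 64-byte strides until it is exhausted */
static long parse_chunk(struct bpf_dynptr *ptr, struct parse_ctx *ctx)
{
	__u8 chunk[64];

	if (bpf_dynptr_read(chunk, sizeof(chunk), ptr, 0, 0))
		return 0;	/* less than one full chunk left: stop iterating */

	ctx->chunks++;
	return sizeof(chunk);	/* advance the view by one chunk */
}

SEC("tp/syscalls/sys_enter_nanosleep")
int iterate_chunks(void *a)
{
	struct parse_ctx ctx = { .chunks = 0 };
	struct bpf_dynptr dp;

	bpf_dynptr_from_mem(buf, sizeof(buf), 0, &dp);
	bpf_dynptr_iterator(&dp, parse_chunk, &ctx, 0);
	return 0;
}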