Message ID | 20220822235649.2218031-2-joannelkoong@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | Add skb + xdp dynptrs | expand |
Hi Joanne, Thank you for the patch! Yet something to improve: [auto build test ERROR on bpf-next/master] url: https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/Add-skb-xdp-dynptrs/20220823-080022 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master config: arm-buildonly-randconfig-r002-20220823 (https://download.01.org/0day-ci/archive/20220824/202208240728.59W00MTW-lkp@intel.com/config) compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/a2c8a74d8f0b7fd0b0008dc9bc5ccf9887317f36 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Joanne-Koong/Add-skb-xdp-dynptrs/20220823-080022 git checkout a2c8a74d8f0b7fd0b0008dc9bc5ccf9887317f36 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): arm-linux-gnueabi-ld: kernel/bpf/helpers.o: in function `bpf_dynptr_read': >> helpers.c:(.text+0xb1c): undefined reference to `__bpf_skb_load_bytes' arm-linux-gnueabi-ld: kernel/bpf/helpers.o: in function `bpf_dynptr_write': >> helpers.c:(.text+0xc30): undefined reference to `__bpf_skb_store_bytes'
Hi Joanne, Thank you for the patch! Yet something to improve: [auto build test ERROR on bpf-next/master] url: https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/Add-skb-xdp-dynptrs/20220823-080022 base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master config: csky-randconfig-r022-20220823 (https://download.01.org/0day-ci/archive/20220824/202208240751.BRPS1SoF-lkp@intel.com/config) compiler: csky-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/a2c8a74d8f0b7fd0b0008dc9bc5ccf9887317f36 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Joanne-Koong/Add-skb-xdp-dynptrs/20220823-080022 git checkout a2c8a74d8f0b7fd0b0008dc9bc5ccf9887317f36 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): csky-linux-ld: kernel/bpf/helpers.o: in function `____bpf_dynptr_read': >> kernel/bpf/helpers.c:1543: undefined reference to `__bpf_skb_load_bytes' csky-linux-ld: kernel/bpf/helpers.o: in function `bpf_dynptr_read': kernel/bpf/helpers.c:1522: undefined reference to `__bpf_skb_load_bytes' csky-linux-ld: kernel/bpf/helpers.o: in function `____bpf_dynptr_write': >> kernel/bpf/helpers.c:1584: undefined reference to `__bpf_skb_store_bytes' csky-linux-ld: kernel/bpf/helpers.o: in function `bpf_dynptr_write': kernel/bpf/helpers.c:1561: undefined reference to `__bpf_skb_store_bytes' vim +1543 kernel/bpf/helpers.c 1521 1522 BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src, 1523 u32, offset, u64, flags) 1524 { 1525 enum bpf_dynptr_type type; 1526 int err; 1527 1528 if (!src->data || flags) 1529 return -EINVAL; 1530 1531 err = bpf_dynptr_check_off_len(src, offset, len); 1532 if (err) 1533 return err; 1534 1535 type = bpf_dynptr_get_type(src); 1536 1537 switch (type) { 1538 case BPF_DYNPTR_TYPE_LOCAL: 1539 case BPF_DYNPTR_TYPE_RINGBUF: 1540 memcpy(dst, src->data + src->offset + offset, len); 1541 return 0; 1542 case BPF_DYNPTR_TYPE_SKB: > 1543 return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len); 1544 default: 1545 WARN(true, "bpf_dynptr_read: unknown dynptr type %d\n", type); 1546 return -EFAULT; 1547 } 1548 } 1549 1550 static const struct bpf_func_proto bpf_dynptr_read_proto = { 1551 .func = bpf_dynptr_read, 1552 .gpl_only = false, 1553 .ret_type = RET_INTEGER, 1554 .arg1_type = ARG_PTR_TO_UNINIT_MEM, 1555 .arg2_type = ARG_CONST_SIZE_OR_ZERO, 1556 .arg3_type = ARG_PTR_TO_DYNPTR, 1557 .arg4_type = ARG_ANYTHING, 1558 .arg5_type = ARG_ANYTHING, 1559 }; 1560 1561 BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src, 1562 u32, len, u64, flags) 1563 { 1564 enum bpf_dynptr_type type; 1565 int err; 1566 1567 if (!dst->data || bpf_dynptr_is_rdonly(dst)) 1568 return -EINVAL; 1569 1570 err = bpf_dynptr_check_off_len(dst, offset, len); 1571 if (err) 1572 return err; 1573 1574 type = bpf_dynptr_get_type(dst); 1575 1576 switch (type) { 1577 case BPF_DYNPTR_TYPE_LOCAL: 1578 case BPF_DYNPTR_TYPE_RINGBUF: 1579 if (flags) 1580 return -EINVAL; 1581 memcpy(dst->data + dst->offset + offset, src, len); 1582 return 0; 1583 case BPF_DYNPTR_TYPE_SKB: > 1584 return __bpf_skb_store_bytes(dst->data, dst->offset + offset, src, len, 1585 flags); 1586 default: 1587 WARN(true, "bpf_dynptr_write: unknown dynptr type %d\n", type); 1588 return -EFAULT; 1589 } 1590 } 1591
On Mon, Aug 22, 2022 at 4:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > Add skb dynptrs, which are dynptrs whose underlying pointer points > to a skb. The dynptr acts on skb data. skb dynptrs have two main > benefits. One is that they allow operations on sizes that are not > statically known at compile-time (eg variable-sized accesses). > Another is that parsing the packet data through dynptrs (instead of > through direct access of skb->data and skb->data_end) can be more > ergonomic and less brittle (eg does not need manual if checking for > being within bounds of data_end). > > For bpf prog types that don't support writes on skb data, the dynptr is > read-only. For reads and writes through the bpf_dynptr_read() and > bpf_dynptr_write() interfaces, this supports reading and writing into > data in the non-linear paged buffers. For data slices (through the > bpf_dynptr_data() interface), if the data is in a paged buffer, the user > must first call bpf_skb_pull_data() to pull the data into the linear > portion. The returned data slice from a call to bpf_dynptr_data() is of > reg type PTR_TO_PACKET | PTR_MAYBE_NULL. > > Any bpf_dynptr_write() automatically invalidates any prior data slices > to the skb dynptr. This is because a bpf_dynptr_write() may be writing > to data in a paged buffer, so it will need to pull the buffer first into > the head. The reason it needs to be pulled instead of writing directly to > the paged buffers is because they may be cloned (only the head of the skb > is by default uncloned). As such, any bpf_dynptr_write() will > automatically have its prior data slices invalidated, even if the write > is to data in the skb head (the verifier has no way of differentiating > whether the write is to the head or paged buffers during program load > time). Please note as well that any other helper calls that change the > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data > slices of the skb dynptr as well. Whenever such a helper call is made, > the verifier marks any PTR_TO_PACKET reg type (which includes skb dynptr > slices since they are PTR_TO_PACKETs) as unknown. The stack trace for > this is check_helper_call() -> clear_all_pkt_pointers() -> > __clear_all_pkt_pointers() -> mark_reg_unknown() > > For examples of how skb dynptrs can be used, please see the attached > selftests. > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > --- > include/linux/bpf.h | 83 +++++++++++++++++----------- > include/linux/filter.h | 4 ++ > include/uapi/linux/bpf.h | 40 ++++++++++++-- > kernel/bpf/helpers.c | 81 +++++++++++++++++++++++++--- > kernel/bpf/verifier.c | 99 ++++++++++++++++++++++++++++------ > net/core/filter.c | 53 ++++++++++++++++-- > tools/include/uapi/linux/bpf.h | 40 ++++++++++++-- > 7 files changed, 335 insertions(+), 65 deletions(-) > [...] > @@ -1521,9 +1532,19 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src > if (err) > return err; > > - memcpy(dst, src->data + src->offset + offset, len); > + type = bpf_dynptr_get_type(src); > > - return 0; > + switch (type) { > + case BPF_DYNPTR_TYPE_LOCAL: > + case BPF_DYNPTR_TYPE_RINGBUF: > + memcpy(dst, src->data + src->offset + offset, len); > + return 0; > + case BPF_DYNPTR_TYPE_SKB: > + return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len); > + default: > + WARN(true, "bpf_dynptr_read: unknown dynptr type %d\n", type); nit: probably better to WARN_ONCE? > + return -EFAULT; > + } > } > > static const struct bpf_func_proto bpf_dynptr_read_proto = { > @@ -1540,18 +1561,32 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = { > BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src, > u32, len, u64, flags) > { > + enum bpf_dynptr_type type; > int err; > > - if (!dst->data || flags || bpf_dynptr_is_rdonly(dst)) > + if (!dst->data || bpf_dynptr_is_rdonly(dst)) > return -EINVAL; > > err = bpf_dynptr_check_off_len(dst, offset, len); > if (err) > return err; > > - memcpy(dst->data + dst->offset + offset, src, len); > + type = bpf_dynptr_get_type(dst); > > - return 0; > + switch (type) { > + case BPF_DYNPTR_TYPE_LOCAL: > + case BPF_DYNPTR_TYPE_RINGBUF: > + if (flags) > + return -EINVAL; > + memcpy(dst->data + dst->offset + offset, src, len); > + return 0; > + case BPF_DYNPTR_TYPE_SKB: > + return __bpf_skb_store_bytes(dst->data, dst->offset + offset, src, len, > + flags); > + default: > + WARN(true, "bpf_dynptr_write: unknown dynptr type %d\n", type); ditto about WARN_ONCE > + return -EFAULT; > + } > } > > static const struct bpf_func_proto bpf_dynptr_write_proto = { [...] > +static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env, > + struct bpf_reg_state *reg, > + int *ref_obj_id) > { > struct bpf_func_state *state = func(env, reg); > int spi = get_spi(reg->off); > > - return state->stack[spi].spilled_ptr.id; > + if (ref_obj_id) > + *ref_obj_id = state->stack[spi].spilled_ptr.id; > + > + return state->stack[spi].spilled_ptr.dynptr.type; > } > > static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > @@ -6056,7 +6075,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > case DYNPTR_TYPE_RINGBUF: > err_extra = "ringbuf "; > break; > - default: > + case DYNPTR_TYPE_SKB: > + err_extra = "skb "; > break; > } > > @@ -7149,6 +7169,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > { > enum bpf_prog_type prog_type = resolve_prog_type(env->prog); > const struct bpf_func_proto *fn = NULL; > + enum bpf_dynptr_type dynptr_type; compiler technically can complain about use of uninitialized dynptr_type, maybe initialize it to BPF_DYNPTR_TYPE_INVALID ? > enum bpf_return_type ret_type; > enum bpf_type_flag ret_flag; > struct bpf_reg_state *regs; > @@ -7320,24 +7341,43 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > } > } > break; > - case BPF_FUNC_dynptr_data: > - for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) { > - if (arg_type_is_dynptr(fn->arg_type[i])) { > - if (meta.ref_obj_id) { > - verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); > - return -EFAULT; > - } > - /* Find the id of the dynptr we're tracking the reference of */ > - meta.ref_obj_id = stack_slot_get_id(env, ®s[BPF_REG_1 + i]); > - break; > - } > + case BPF_FUNC_dynptr_write: > + { > + struct bpf_reg_state *reg; > + > + reg = get_dynptr_arg_reg(fn, regs); > + if (!reg) { > + verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); s/bpf_dynptr_data/bpf_dynptr_write/ > + return -EFAULT; > } > - if (i == MAX_BPF_FUNC_REG_ARGS) { > + > + /* bpf_dynptr_write() for skb-type dynptrs may pull the skb, so we must > + * invalidate all data slices associated with it > + */ > + if (stack_slot_get_dynptr_info(env, reg, NULL) == BPF_DYNPTR_TYPE_SKB) > + changes_data = true; > + > + break; > + } > + case BPF_FUNC_dynptr_data: > + { > + struct bpf_reg_state *reg; > + > + reg = get_dynptr_arg_reg(fn, regs); > + if (!reg) { > verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); > return -EFAULT; > } > + > + if (meta.ref_obj_id) { > + verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); > + return -EFAULT; > + } > + > + dynptr_type = stack_slot_get_dynptr_info(env, reg, &meta.ref_obj_id); > break; > } > + } > > if (err) > return err; > @@ -7397,8 +7437,15 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > break; > case RET_PTR_TO_ALLOC_MEM: > mark_reg_known_zero(env, regs, BPF_REG_0); > - regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; > - regs[BPF_REG_0].mem_size = meta.mem_size; > + > + if (func_id == BPF_FUNC_dynptr_data && > + dynptr_type == BPF_DYNPTR_TYPE_SKB) { > + regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag; > + regs[BPF_REG_0].range = meta.mem_size; > + } else { > + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; > + regs[BPF_REG_0].mem_size = meta.mem_size; > + } > break; > case RET_PTR_TO_MEM_OR_BTF_ID: > { > @@ -14141,6 +14188,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > goto patch_call_imm; > } > > + if (insn->imm == BPF_FUNC_dynptr_from_skb) { > + bool is_rdonly = !may_access_direct_pkt_data(env, NULL, BPF_WRITE); > + > + insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, is_rdonly); > + insn_buf[1] = *insn; > + cnt = 2; This might have been discussed before, sorry if I missed it. But why this special rewrite to provide read-only flag vs having two implementations of bpf_dynptr_from_skb() depending on program types? If program type allows read+write access, return bpf_dynptr_from_skb_rdwr(), for those that have read-only access - bpf_dynptr_from_skb_rdonly(), and for others return NULL proto (disable helper)? You can then use it very similarly for bpf_dynptr_from_skb_meta(). > + > + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); > + if (!new_prog) > + return -ENOMEM; > + > + delta += cnt - 1; > + env->prog = new_prog; > + prog = new_prog; > + insn = new_prog->insnsi + i + delta; > + goto patch_call_imm; > + } > + [...] > BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk) > { > return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL; > @@ -7726,6 +7763,8 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) > return &bpf_get_socket_uid_proto; > case BPF_FUNC_perf_event_output: > return &bpf_skb_event_output_proto; > + case BPF_FUNC_dynptr_from_skb: > + return &bpf_dynptr_from_skb_proto; > default: > return bpf_sk_base_func_proto(func_id); > } > @@ -7909,6 +7948,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) > return &bpf_tcp_raw_check_syncookie_ipv6_proto; > #endif > #endif > + case BPF_FUNC_dynptr_from_skb: > + return &bpf_dynptr_from_skb_proto; > default: > return bpf_sk_base_func_proto(func_id); > } > @@ -8104,6 +8145,8 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) > case BPF_FUNC_skc_lookup_tcp: > return &bpf_skc_lookup_tcp_proto; > #endif > + case BPF_FUNC_dynptr_from_skb: > + return &bpf_dynptr_from_skb_proto; > default: > return bpf_sk_base_func_proto(func_id); > } > @@ -8142,6 +8185,8 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) > return &bpf_get_smp_processor_id_proto; > case BPF_FUNC_skb_under_cgroup: > return &bpf_skb_under_cgroup_proto; > + case BPF_FUNC_dynptr_from_skb: > + return &bpf_dynptr_from_skb_proto; so like here you'd return read-only prototype for dynptr_from_skb (seems like LWT programs have only read-only access, according to may_access_direct_pkt_data), but in cases where may_access_direct_pkt_data() allows read-write access we'd choose rdwr variant of dynptr_from_skb? > default: > return bpf_sk_base_func_proto(func_id); > } [...]
On Wed, 24 Aug 2022 at 20:42, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Mon, Aug 22, 2022 at 4:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > > > Add skb dynptrs, which are dynptrs whose underlying pointer points > > to a skb. The dynptr acts on skb data. skb dynptrs have two main > > benefits. One is that they allow operations on sizes that are not > > statically known at compile-time (eg variable-sized accesses). > > Another is that parsing the packet data through dynptrs (instead of > > through direct access of skb->data and skb->data_end) can be more > > ergonomic and less brittle (eg does not need manual if checking for > > being within bounds of data_end). > > > > For bpf prog types that don't support writes on skb data, the dynptr is > > read-only. For reads and writes through the bpf_dynptr_read() and > > bpf_dynptr_write() interfaces, this supports reading and writing into > > data in the non-linear paged buffers. For data slices (through the > > bpf_dynptr_data() interface), if the data is in a paged buffer, the user > > must first call bpf_skb_pull_data() to pull the data into the linear > > portion. The returned data slice from a call to bpf_dynptr_data() is of > > reg type PTR_TO_PACKET | PTR_MAYBE_NULL. > > > > Any bpf_dynptr_write() automatically invalidates any prior data slices > > to the skb dynptr. This is because a bpf_dynptr_write() may be writing > > to data in a paged buffer, so it will need to pull the buffer first into > > the head. The reason it needs to be pulled instead of writing directly to > > the paged buffers is because they may be cloned (only the head of the skb > > is by default uncloned). As such, any bpf_dynptr_write() will > > automatically have its prior data slices invalidated, even if the write > > is to data in the skb head (the verifier has no way of differentiating > > whether the write is to the head or paged buffers during program load > > time). Please note as well that any other helper calls that change the > > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data > > slices of the skb dynptr as well. Whenever such a helper call is made, > > the verifier marks any PTR_TO_PACKET reg type (which includes skb dynptr > > slices since they are PTR_TO_PACKETs) as unknown. The stack trace for > > this is check_helper_call() -> clear_all_pkt_pointers() -> > > __clear_all_pkt_pointers() -> mark_reg_unknown() > > > > For examples of how skb dynptrs can be used, please see the attached > > selftests. > > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > > --- > > include/linux/bpf.h | 83 +++++++++++++++++----------- > > include/linux/filter.h | 4 ++ > > include/uapi/linux/bpf.h | 40 ++++++++++++-- > > kernel/bpf/helpers.c | 81 +++++++++++++++++++++++++--- > > kernel/bpf/verifier.c | 99 ++++++++++++++++++++++++++++------ > > net/core/filter.c | 53 ++++++++++++++++-- > > tools/include/uapi/linux/bpf.h | 40 ++++++++++++-- > > 7 files changed, 335 insertions(+), 65 deletions(-) > > > > [...] > > > @@ -1521,9 +1532,19 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src > > if (err) > > return err; > > > > - memcpy(dst, src->data + src->offset + offset, len); > > + type = bpf_dynptr_get_type(src); > > > > - return 0; > > + switch (type) { > > + case BPF_DYNPTR_TYPE_LOCAL: > > + case BPF_DYNPTR_TYPE_RINGBUF: > > + memcpy(dst, src->data + src->offset + offset, len); > > + return 0; > > + case BPF_DYNPTR_TYPE_SKB: > > + return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len); > > + default: > > + WARN(true, "bpf_dynptr_read: unknown dynptr type %d\n", type); > > nit: probably better to WARN_ONCE? > > > + return -EFAULT; > > + } > > } > > > > static const struct bpf_func_proto bpf_dynptr_read_proto = { > > @@ -1540,18 +1561,32 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = { > > BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src, > > u32, len, u64, flags) > > { > > + enum bpf_dynptr_type type; > > int err; > > > > - if (!dst->data || flags || bpf_dynptr_is_rdonly(dst)) > > + if (!dst->data || bpf_dynptr_is_rdonly(dst)) > > return -EINVAL; > > > > err = bpf_dynptr_check_off_len(dst, offset, len); > > if (err) > > return err; > > > > - memcpy(dst->data + dst->offset + offset, src, len); > > + type = bpf_dynptr_get_type(dst); > > > > - return 0; > > + switch (type) { > > + case BPF_DYNPTR_TYPE_LOCAL: > > + case BPF_DYNPTR_TYPE_RINGBUF: > > + if (flags) > > + return -EINVAL; > > + memcpy(dst->data + dst->offset + offset, src, len); > > + return 0; > > + case BPF_DYNPTR_TYPE_SKB: > > + return __bpf_skb_store_bytes(dst->data, dst->offset + offset, src, len, > > + flags); > > + default: > > + WARN(true, "bpf_dynptr_write: unknown dynptr type %d\n", type); > > ditto about WARN_ONCE > > > + return -EFAULT; > > + } > > } > > > > static const struct bpf_func_proto bpf_dynptr_write_proto = { > > [...] > > > +static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env, > > + struct bpf_reg_state *reg, > > + int *ref_obj_id) > > { > > struct bpf_func_state *state = func(env, reg); > > int spi = get_spi(reg->off); > > > > - return state->stack[spi].spilled_ptr.id; > > + if (ref_obj_id) > > + *ref_obj_id = state->stack[spi].spilled_ptr.id; > > + > > + return state->stack[spi].spilled_ptr.dynptr.type; > > } > > > > static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > > @@ -6056,7 +6075,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > > case DYNPTR_TYPE_RINGBUF: > > err_extra = "ringbuf "; > > break; > > - default: > > + case DYNPTR_TYPE_SKB: > > + err_extra = "skb "; > > break; > > } > > > > @@ -7149,6 +7169,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > > { > > enum bpf_prog_type prog_type = resolve_prog_type(env->prog); > > const struct bpf_func_proto *fn = NULL; > > + enum bpf_dynptr_type dynptr_type; > > compiler technically can complain about use of uninitialized > dynptr_type, maybe initialize it to BPF_DYNPTR_TYPE_INVALID ? > > > enum bpf_return_type ret_type; > > enum bpf_type_flag ret_flag; > > struct bpf_reg_state *regs; > > @@ -7320,24 +7341,43 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > > } > > } > > break; > > - case BPF_FUNC_dynptr_data: > > - for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) { > > - if (arg_type_is_dynptr(fn->arg_type[i])) { > > - if (meta.ref_obj_id) { > > - verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); > > - return -EFAULT; > > - } > > - /* Find the id of the dynptr we're tracking the reference of */ > > - meta.ref_obj_id = stack_slot_get_id(env, ®s[BPF_REG_1 + i]); > > - break; > > - } > > + case BPF_FUNC_dynptr_write: > > + { > > + struct bpf_reg_state *reg; > > + > > + reg = get_dynptr_arg_reg(fn, regs); > > + if (!reg) { > > + verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); > > s/bpf_dynptr_data/bpf_dynptr_write/ > > > + return -EFAULT; > > } > > - if (i == MAX_BPF_FUNC_REG_ARGS) { > > + > > + /* bpf_dynptr_write() for skb-type dynptrs may pull the skb, so we must > > + * invalidate all data slices associated with it > > + */ > > + if (stack_slot_get_dynptr_info(env, reg, NULL) == BPF_DYNPTR_TYPE_SKB) > > + changes_data = true; > > + > > + break; > > + } > > + case BPF_FUNC_dynptr_data: > > + { > > + struct bpf_reg_state *reg; > > + > > + reg = get_dynptr_arg_reg(fn, regs); > > + if (!reg) { > > verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); > > return -EFAULT; > > } > > + > > + if (meta.ref_obj_id) { > > + verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); > > + return -EFAULT; > > + } > > + > > + dynptr_type = stack_slot_get_dynptr_info(env, reg, &meta.ref_obj_id); > > break; > > } > > + } > > > > if (err) > > return err; > > @@ -7397,8 +7437,15 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > > break; > > case RET_PTR_TO_ALLOC_MEM: > > mark_reg_known_zero(env, regs, BPF_REG_0); > > - regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; > > - regs[BPF_REG_0].mem_size = meta.mem_size; > > + > > + if (func_id == BPF_FUNC_dynptr_data && > > + dynptr_type == BPF_DYNPTR_TYPE_SKB) { > > + regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag; > > + regs[BPF_REG_0].range = meta.mem_size; > > + } else { > > + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; > > + regs[BPF_REG_0].mem_size = meta.mem_size; > > + } > > break; > > case RET_PTR_TO_MEM_OR_BTF_ID: > > { > > @@ -14141,6 +14188,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > goto patch_call_imm; > > } > > > > + if (insn->imm == BPF_FUNC_dynptr_from_skb) { > > + bool is_rdonly = !may_access_direct_pkt_data(env, NULL, BPF_WRITE); > > + > > + insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, is_rdonly); > > + insn_buf[1] = *insn; > > + cnt = 2; > > This might have been discussed before, sorry if I missed it. But why > this special rewrite to provide read-only flag vs having two > implementations of bpf_dynptr_from_skb() depending on program types? > If program type allows read+write access, return > bpf_dynptr_from_skb_rdwr(), for those that have read-only access - > bpf_dynptr_from_skb_rdonly(), and for others return NULL proto > (disable helper)? > > You can then use it very similarly for bpf_dynptr_from_skb_meta(). > Related question, it seems we know statically if dynptr is read only or not, so why even do all this hidden parameter passing and instead just reject writes directly? You only need to be able to set MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject dynptr_write when dynptr type is xdp/skb (and ctx is only one). That seems simpler than checking it at runtime. Verifier already handles MEM_RDONLY generically, you only need to add the guard for check_packet_acces (and check_helper_mem_access for meta->raw_mode under pkt case), and rejecting dynptr_write seems like a if condition.
On Wed, Aug 24, 2022 at 4:26 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Wed, 24 Aug 2022 at 20:42, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > On Mon, Aug 22, 2022 at 4:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > Add skb dynptrs, which are dynptrs whose underlying pointer points > > > to a skb. The dynptr acts on skb data. skb dynptrs have two main > > > benefits. One is that they allow operations on sizes that are not > > > statically known at compile-time (eg variable-sized accesses). > > > Another is that parsing the packet data through dynptrs (instead of > > > through direct access of skb->data and skb->data_end) can be more > > > ergonomic and less brittle (eg does not need manual if checking for > > > being within bounds of data_end). > > > > > > For bpf prog types that don't support writes on skb data, the dynptr is > > > read-only. For reads and writes through the bpf_dynptr_read() and > > > bpf_dynptr_write() interfaces, this supports reading and writing into > > > data in the non-linear paged buffers. For data slices (through the > > > bpf_dynptr_data() interface), if the data is in a paged buffer, the user > > > must first call bpf_skb_pull_data() to pull the data into the linear > > > portion. The returned data slice from a call to bpf_dynptr_data() is of > > > reg type PTR_TO_PACKET | PTR_MAYBE_NULL. > > > > > > Any bpf_dynptr_write() automatically invalidates any prior data slices > > > to the skb dynptr. This is because a bpf_dynptr_write() may be writing > > > to data in a paged buffer, so it will need to pull the buffer first into > > > the head. The reason it needs to be pulled instead of writing directly to > > > the paged buffers is because they may be cloned (only the head of the skb > > > is by default uncloned). As such, any bpf_dynptr_write() will > > > automatically have its prior data slices invalidated, even if the write > > > is to data in the skb head (the verifier has no way of differentiating > > > whether the write is to the head or paged buffers during program load > > > time). Please note as well that any other helper calls that change the > > > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data > > > slices of the skb dynptr as well. Whenever such a helper call is made, > > > the verifier marks any PTR_TO_PACKET reg type (which includes skb dynptr > > > slices since they are PTR_TO_PACKETs) as unknown. The stack trace for > > > this is check_helper_call() -> clear_all_pkt_pointers() -> > > > __clear_all_pkt_pointers() -> mark_reg_unknown() > > > > > > For examples of how skb dynptrs can be used, please see the attached > > > selftests. > > > > > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > > > --- > > > include/linux/bpf.h | 83 +++++++++++++++++----------- > > > include/linux/filter.h | 4 ++ > > > include/uapi/linux/bpf.h | 40 ++++++++++++-- > > > kernel/bpf/helpers.c | 81 +++++++++++++++++++++++++--- > > > kernel/bpf/verifier.c | 99 ++++++++++++++++++++++++++++------ > > > net/core/filter.c | 53 ++++++++++++++++-- > > > tools/include/uapi/linux/bpf.h | 40 ++++++++++++-- > > > 7 files changed, 335 insertions(+), 65 deletions(-) > > > > > > > [...] > > > > > @@ -1521,9 +1532,19 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src > > > if (err) > > > return err; > > > > > > - memcpy(dst, src->data + src->offset + offset, len); > > > + type = bpf_dynptr_get_type(src); > > > > > > - return 0; > > > + switch (type) { > > > + case BPF_DYNPTR_TYPE_LOCAL: > > > + case BPF_DYNPTR_TYPE_RINGBUF: > > > + memcpy(dst, src->data + src->offset + offset, len); > > > + return 0; > > > + case BPF_DYNPTR_TYPE_SKB: > > > + return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len); > > > + default: > > > + WARN(true, "bpf_dynptr_read: unknown dynptr type %d\n", type); > > > > nit: probably better to WARN_ONCE? > > > > > + return -EFAULT; > > > + } > > > } > > > > > > static const struct bpf_func_proto bpf_dynptr_read_proto = { > > > @@ -1540,18 +1561,32 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = { > > > BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src, > > > u32, len, u64, flags) > > > { > > > + enum bpf_dynptr_type type; > > > int err; > > > > > > - if (!dst->data || flags || bpf_dynptr_is_rdonly(dst)) > > > + if (!dst->data || bpf_dynptr_is_rdonly(dst)) > > > return -EINVAL; > > > > > > err = bpf_dynptr_check_off_len(dst, offset, len); > > > if (err) > > > return err; > > > > > > - memcpy(dst->data + dst->offset + offset, src, len); > > > + type = bpf_dynptr_get_type(dst); > > > > > > - return 0; > > > + switch (type) { > > > + case BPF_DYNPTR_TYPE_LOCAL: > > > + case BPF_DYNPTR_TYPE_RINGBUF: > > > + if (flags) > > > + return -EINVAL; > > > + memcpy(dst->data + dst->offset + offset, src, len); > > > + return 0; > > > + case BPF_DYNPTR_TYPE_SKB: > > > + return __bpf_skb_store_bytes(dst->data, dst->offset + offset, src, len, > > > + flags); > > > + default: > > > + WARN(true, "bpf_dynptr_write: unknown dynptr type %d\n", type); > > > > ditto about WARN_ONCE > > > > > + return -EFAULT; > > > + } > > > } > > > > > > static const struct bpf_func_proto bpf_dynptr_write_proto = { > > > > [...] > > > > > +static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env, > > > + struct bpf_reg_state *reg, > > > + int *ref_obj_id) > > > { > > > struct bpf_func_state *state = func(env, reg); > > > int spi = get_spi(reg->off); > > > > > > - return state->stack[spi].spilled_ptr.id; > > > + if (ref_obj_id) > > > + *ref_obj_id = state->stack[spi].spilled_ptr.id; > > > + > > > + return state->stack[spi].spilled_ptr.dynptr.type; > > > } > > > > > > static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > > > @@ -6056,7 +6075,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, > > > case DYNPTR_TYPE_RINGBUF: > > > err_extra = "ringbuf "; > > > break; > > > - default: > > > + case DYNPTR_TYPE_SKB: > > > + err_extra = "skb "; > > > break; > > > } > > > > > > @@ -7149,6 +7169,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > > > { > > > enum bpf_prog_type prog_type = resolve_prog_type(env->prog); > > > const struct bpf_func_proto *fn = NULL; > > > + enum bpf_dynptr_type dynptr_type; > > > > compiler technically can complain about use of uninitialized > > dynptr_type, maybe initialize it to BPF_DYNPTR_TYPE_INVALID ? > > > > > enum bpf_return_type ret_type; > > > enum bpf_type_flag ret_flag; > > > struct bpf_reg_state *regs; > > > @@ -7320,24 +7341,43 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > > > } > > > } > > > break; > > > - case BPF_FUNC_dynptr_data: > > > - for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) { > > > - if (arg_type_is_dynptr(fn->arg_type[i])) { > > > - if (meta.ref_obj_id) { > > > - verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); > > > - return -EFAULT; > > > - } > > > - /* Find the id of the dynptr we're tracking the reference of */ > > > - meta.ref_obj_id = stack_slot_get_id(env, ®s[BPF_REG_1 + i]); > > > - break; > > > - } > > > + case BPF_FUNC_dynptr_write: > > > + { > > > + struct bpf_reg_state *reg; > > > + > > > + reg = get_dynptr_arg_reg(fn, regs); > > > + if (!reg) { > > > + verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); > > > > s/bpf_dynptr_data/bpf_dynptr_write/ > > > > > + return -EFAULT; > > > } > > > - if (i == MAX_BPF_FUNC_REG_ARGS) { > > > + > > > + /* bpf_dynptr_write() for skb-type dynptrs may pull the skb, so we must > > > + * invalidate all data slices associated with it > > > + */ > > > + if (stack_slot_get_dynptr_info(env, reg, NULL) == BPF_DYNPTR_TYPE_SKB) > > > + changes_data = true; > > > + > > > + break; > > > + } > > > + case BPF_FUNC_dynptr_data: > > > + { > > > + struct bpf_reg_state *reg; > > > + > > > + reg = get_dynptr_arg_reg(fn, regs); > > > + if (!reg) { > > > verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); > > > return -EFAULT; > > > } > > > + > > > + if (meta.ref_obj_id) { > > > + verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); > > > + return -EFAULT; > > > + } > > > + > > > + dynptr_type = stack_slot_get_dynptr_info(env, reg, &meta.ref_obj_id); > > > break; > > > } > > > + } > > > > > > if (err) > > > return err; > > > @@ -7397,8 +7437,15 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > > > break; > > > case RET_PTR_TO_ALLOC_MEM: > > > mark_reg_known_zero(env, regs, BPF_REG_0); > > > - regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; > > > - regs[BPF_REG_0].mem_size = meta.mem_size; > > > + > > > + if (func_id == BPF_FUNC_dynptr_data && > > > + dynptr_type == BPF_DYNPTR_TYPE_SKB) { > > > + regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag; > > > + regs[BPF_REG_0].range = meta.mem_size; > > > + } else { > > > + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; > > > + regs[BPF_REG_0].mem_size = meta.mem_size; > > > + } > > > break; > > > case RET_PTR_TO_MEM_OR_BTF_ID: > > > { > > > @@ -14141,6 +14188,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > > goto patch_call_imm; > > > } > > > > > > + if (insn->imm == BPF_FUNC_dynptr_from_skb) { > > > + bool is_rdonly = !may_access_direct_pkt_data(env, NULL, BPF_WRITE); > > > + > > > + insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, is_rdonly); > > > + insn_buf[1] = *insn; > > > + cnt = 2; > > > > This might have been discussed before, sorry if I missed it. But why > > this special rewrite to provide read-only flag vs having two > > implementations of bpf_dynptr_from_skb() depending on program types? > > If program type allows read+write access, return > > bpf_dynptr_from_skb_rdwr(), for those that have read-only access - > > bpf_dynptr_from_skb_rdonly(), and for others return NULL proto > > (disable helper)? Ooh I love this idea, thanks Andrii! I'll add this in v5. > > > > You can then use it very similarly for bpf_dynptr_from_skb_meta(). > > > > Related question, it seems we know statically if dynptr is read only > or not, so why even do all this hidden parameter passing and instead > just reject writes directly? You only need to be able to set > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > seems simpler than checking it at runtime. Verifier already handles > MEM_RDONLY generically, you only need to add the guard for > check_packet_acces (and check_helper_mem_access for meta->raw_mode > under pkt case), and rejecting dynptr_write seems like a if condition. There will be other helper functions that do writes (eg memcpy to dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing dynptrs, ...) so it's more scalable if we reject these at runtime rather than enforce these at the verifier level. I also think it's cleaner to keep the verifier logic as simple as possible and do the checking in the helper. There's a prior discussion about this in v1 [0] as well. [0] https://lore.kernel.org/bpf/20220726184706.954822-1-joannelkoong@gmail.com/T/#mf3fe3965bc1852b07b8f2d306d09818b35acf3c1
On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > [...] > > > > Related question, it seems we know statically if dynptr is read only > > or not, so why even do all this hidden parameter passing and instead > > just reject writes directly? You only need to be able to set > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > seems simpler than checking it at runtime. Verifier already handles > > MEM_RDONLY generically, you only need to add the guard for > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > under pkt case), and rejecting dynptr_write seems like a if condition. > > There will be other helper functions that do writes (eg memcpy to > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > dynptrs, ...) so it's more scalable if we reject these at runtime > rather than enforce these at the verifier level. I also think it's > cleaner to keep the verifier logic as simple as possible and do the > checking in the helper. I won't be pushing this further, since you know what you plan to add in the future better, but I still disagree. I'm guessing there might be dynptrs where this read only property is set dynamically at runtime, which is why you want to go this route? I.e. you might not know statically whether dynptr is read only or not? My main confusion is the inconsistency here. Right now the patch implicitly relies on may_access_direct_pkt_data to protect slices returned from dynptr_data, instead of setting MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not needed. So indirectly, you are relying on knowing statically whether the dynptr is read only or not. But then you also set this bit at runtime. So you reject some cases at load time, and the rest of them only at runtime. Direct writes to dynptr slice fails load, writes through helper does not (only fails at runtime). Also, dynptr_data needs to know whether dynptr is read only statically, to protect writes to its returned pointer, unless you decide to introduce another helper for the dynamic rdonly bit case (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data works for some rdonly dynptrs (known to be rdonly statically, like this skb one), but not for others. I also don't agree about the complexity or scalability part, all the infra and precedence is already there. We already have similar checks for meta->raw_mode where we reject writes to read only pointers in check_helper_mem_access.
On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > [...] > > > > > > Related question, it seems we know statically if dynptr is read only > > > or not, so why even do all this hidden parameter passing and instead > > > just reject writes directly? You only need to be able to set > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > seems simpler than checking it at runtime. Verifier already handles > > > MEM_RDONLY generically, you only need to add the guard for > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > There will be other helper functions that do writes (eg memcpy to > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > dynptrs, ...) so it's more scalable if we reject these at runtime > > rather than enforce these at the verifier level. I also think it's > > cleaner to keep the verifier logic as simple as possible and do the > > checking in the helper. > > I won't be pushing this further, since you know what you plan to add > in the future better, but I still disagree. > > I'm guessing there might be dynptrs where this read only property is > set dynamically at runtime, which is why you want to go this route? > I.e. you might not know statically whether dynptr is read only or not? > > My main confusion is the inconsistency here. > > Right now the patch implicitly relies on may_access_direct_pkt_data to > protect slices returned from dynptr_data, instead of setting > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > needed. So indirectly, you are relying on knowing statically whether > the dynptr is read only or not. But then you also set this bit at > runtime. > > So you reject some cases at load time, and the rest of them only at > runtime. Direct writes to dynptr slice fails load, writes through > helper does not (only fails at runtime). > > Also, dynptr_data needs to know whether dynptr is read only > statically, to protect writes to its returned pointer, unless you > decide to introduce another helper for the dynamic rdonly bit case > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > works for some rdonly dynptrs (known to be rdonly statically, like > this skb one), but not for others. > > I also don't agree about the complexity or scalability part, all the > infra and precedence is already there. We already have similar checks > for meta->raw_mode where we reject writes to read only pointers in > check_helper_mem_access. My point about scalability is that if we reject bpf_dynptr_write() at load time, then we must reject any future dynptr helper that does any writing at load time as well, to be consistent. I don't feel strongly about whether we reject at load time or run time. Rejecting at load time instead of runtime doesn't seem that useful to me, but there's a good chance I'm wrong here since Martin stated that he prefers rejecting at load time as well. As for the added complexity part, what I mean is that we'll need to keep track of some more stuff to support this, such as whether the dynptr is read only and which helper functions need to check whether the dynptr is read only or not. On the whole, I think given that both Martin and you have specified your preferences for this, we should reject at load time instead of runtime. I'll make this change for v5 :)
On Fri, 26 Aug 2022 at 20:44, Joanne Koong <joannelkoong@gmail.com> wrote: > > On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > > [...] > > > > > > > > Related question, it seems we know statically if dynptr is read only > > > > or not, so why even do all this hidden parameter passing and instead > > > > just reject writes directly? You only need to be able to set > > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > > seems simpler than checking it at runtime. Verifier already handles > > > > MEM_RDONLY generically, you only need to add the guard for > > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > > > There will be other helper functions that do writes (eg memcpy to > > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > > dynptrs, ...) so it's more scalable if we reject these at runtime > > > rather than enforce these at the verifier level. I also think it's > > > cleaner to keep the verifier logic as simple as possible and do the > > > checking in the helper. > > > > I won't be pushing this further, since you know what you plan to add > > in the future better, but I still disagree. > > > > I'm guessing there might be dynptrs where this read only property is > > set dynamically at runtime, which is why you want to go this route? > > I.e. you might not know statically whether dynptr is read only or not? > > > > My main confusion is the inconsistency here. > > > > Right now the patch implicitly relies on may_access_direct_pkt_data to > > protect slices returned from dynptr_data, instead of setting > > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > > needed. So indirectly, you are relying on knowing statically whether > > the dynptr is read only or not. But then you also set this bit at > > runtime. > > > > So you reject some cases at load time, and the rest of them only at > > runtime. Direct writes to dynptr slice fails load, writes through > > helper does not (only fails at runtime). > > > > Also, dynptr_data needs to know whether dynptr is read only > > statically, to protect writes to its returned pointer, unless you > > decide to introduce another helper for the dynamic rdonly bit case > > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > > works for some rdonly dynptrs (known to be rdonly statically, like > > this skb one), but not for others. > > > > I also don't agree about the complexity or scalability part, all the > > infra and precedence is already there. We already have similar checks > > for meta->raw_mode where we reject writes to read only pointers in > > check_helper_mem_access. > > My point about scalability is that if we reject bpf_dynptr_write() at > load time, then we must reject any future dynptr helper that does any > writing at load time as well, to be consistent. > > I don't feel strongly about whether we reject at load time or run > time. Rejecting at load time instead of runtime doesn't seem that > useful to me, but there's a good chance I'm wrong here since Martin > stated that he prefers rejecting at load time as well. > > As for the added complexity part, what I mean is that we'll need to > keep track of some more stuff to support this, such as whether the > dynptr is read only and which helper functions need to check whether > the dynptr is read only or not. What I'm trying to understand is how dynptr_data is supposed to work if this dynptr read only bit is only known at runtime. Or will it be always known statically so that it can set returned pointer as read only? Because then it doesn't seem it is required or useful to track the readonly bit at runtime. It is fine if _everything_ checks it at runtime, but that doesn't seem possible, hence the question. We would need a new slice helper that only returns read-only slices, because dynptr_data can return rw slices currently and it is already UAPI so changing that is not possible anymore.
On Fri, Aug 26, 2022 at 11:52 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Fri, 26 Aug 2022 at 20:44, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > > > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > [...] > > > > > > > > > > Related question, it seems we know statically if dynptr is read only > > > > > or not, so why even do all this hidden parameter passing and instead > > > > > just reject writes directly? You only need to be able to set > > > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > > > seems simpler than checking it at runtime. Verifier already handles > > > > > MEM_RDONLY generically, you only need to add the guard for > > > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > > > > > There will be other helper functions that do writes (eg memcpy to > > > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > > > dynptrs, ...) so it's more scalable if we reject these at runtime > > > > rather than enforce these at the verifier level. I also think it's > > > > cleaner to keep the verifier logic as simple as possible and do the > > > > checking in the helper. > > > > > > I won't be pushing this further, since you know what you plan to add > > > in the future better, but I still disagree. > > > > > > I'm guessing there might be dynptrs where this read only property is > > > set dynamically at runtime, which is why you want to go this route? > > > I.e. you might not know statically whether dynptr is read only or not? > > > > > > My main confusion is the inconsistency here. > > > > > > Right now the patch implicitly relies on may_access_direct_pkt_data to > > > protect slices returned from dynptr_data, instead of setting > > > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > > > needed. So indirectly, you are relying on knowing statically whether > > > the dynptr is read only or not. But then you also set this bit at > > > runtime. > > > > > > So you reject some cases at load time, and the rest of them only at > > > runtime. Direct writes to dynptr slice fails load, writes through > > > helper does not (only fails at runtime). > > > > > > Also, dynptr_data needs to know whether dynptr is read only > > > statically, to protect writes to its returned pointer, unless you > > > decide to introduce another helper for the dynamic rdonly bit case > > > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > > > works for some rdonly dynptrs (known to be rdonly statically, like > > > this skb one), but not for others. > > > > > > I also don't agree about the complexity or scalability part, all the > > > infra and precedence is already there. We already have similar checks > > > for meta->raw_mode where we reject writes to read only pointers in > > > check_helper_mem_access. > > > > My point about scalability is that if we reject bpf_dynptr_write() at > > load time, then we must reject any future dynptr helper that does any > > writing at load time as well, to be consistent. > > > > I don't feel strongly about whether we reject at load time or run > > time. Rejecting at load time instead of runtime doesn't seem that > > useful to me, but there's a good chance I'm wrong here since Martin > > stated that he prefers rejecting at load time as well. > > > > As for the added complexity part, what I mean is that we'll need to > > keep track of some more stuff to support this, such as whether the > > dynptr is read only and which helper functions need to check whether > > the dynptr is read only or not. > > What I'm trying to understand is how dynptr_data is supposed to work > if this dynptr read only bit is only known at runtime. Or will it be > always known statically so that it can set returned pointer as read > only? Because then it doesn't seem it is required or useful to track > the readonly bit at runtime. I think it'll always be known statically whether the dynptr is read-only or not. If we make all writable dynptr helper functions reject read-only dynptrs at load time instead of run time, then yes we can remove the read-only bit in the bpf_dynptr_kern struct. There's also the question of whether this constraint (eg all read-only writes are rejected at load time) is too rigid - for example, what if in the future we want to add a helper function where if a certain condition is met, then we write some number of bytes, else we read some number of bytes? This would be not possible to add then, since we'll only know at runtime whether the condition is met. I personally lean towards rejecting helper function writes at runtime, but if you think it's a non-trivial benefit to reject at load time instead, I'm fine going with that. > > It is fine if _everything_ checks it at runtime, but that doesn't seem > possible, hence the question. We would need a new slice helper that > only returns read-only slices, because dynptr_data can return rw > slices currently and it is already UAPI so changing that is not > possible anymore. I don't agree that if bpf_dynptr_write() is checked at runtime, then bpf_dynptr_data must also be checked at runtime to be consistent. I think it's fine if writes through helper functions are rejected at runtime, and writes through direct access are rejected at load time. That doesn't seem inconsistent to me.
On Fri, 26 Aug 2022 at 21:49, Joanne Koong <joannelkoong@gmail.com> wrote: > > On Fri, Aug 26, 2022 at 11:52 AM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > On Fri, 26 Aug 2022 at 20:44, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi > > > <memxor@gmail.com> wrote: > > > > > > > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > [...] > > > > > > > > > > > > Related question, it seems we know statically if dynptr is read only > > > > > > or not, so why even do all this hidden parameter passing and instead > > > > > > just reject writes directly? You only need to be able to set > > > > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > > > > seems simpler than checking it at runtime. Verifier already handles > > > > > > MEM_RDONLY generically, you only need to add the guard for > > > > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > > > > > > > There will be other helper functions that do writes (eg memcpy to > > > > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > > > > dynptrs, ...) so it's more scalable if we reject these at runtime > > > > > rather than enforce these at the verifier level. I also think it's > > > > > cleaner to keep the verifier logic as simple as possible and do the > > > > > checking in the helper. > > > > > > > > I won't be pushing this further, since you know what you plan to add > > > > in the future better, but I still disagree. > > > > > > > > I'm guessing there might be dynptrs where this read only property is > > > > set dynamically at runtime, which is why you want to go this route? > > > > I.e. you might not know statically whether dynptr is read only or not? > > > > > > > > My main confusion is the inconsistency here. > > > > > > > > Right now the patch implicitly relies on may_access_direct_pkt_data to > > > > protect slices returned from dynptr_data, instead of setting > > > > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > > > > needed. So indirectly, you are relying on knowing statically whether > > > > the dynptr is read only or not. But then you also set this bit at > > > > runtime. > > > > > > > > So you reject some cases at load time, and the rest of them only at > > > > runtime. Direct writes to dynptr slice fails load, writes through > > > > helper does not (only fails at runtime). > > > > > > > > Also, dynptr_data needs to know whether dynptr is read only > > > > statically, to protect writes to its returned pointer, unless you > > > > decide to introduce another helper for the dynamic rdonly bit case > > > > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > > > > works for some rdonly dynptrs (known to be rdonly statically, like > > > > this skb one), but not for others. > > > > > > > > I also don't agree about the complexity or scalability part, all the > > > > infra and precedence is already there. We already have similar checks > > > > for meta->raw_mode where we reject writes to read only pointers in > > > > check_helper_mem_access. > > > > > > My point about scalability is that if we reject bpf_dynptr_write() at > > > load time, then we must reject any future dynptr helper that does any > > > writing at load time as well, to be consistent. > > > > > > I don't feel strongly about whether we reject at load time or run > > > time. Rejecting at load time instead of runtime doesn't seem that > > > useful to me, but there's a good chance I'm wrong here since Martin > > > stated that he prefers rejecting at load time as well. > > > > > > As for the added complexity part, what I mean is that we'll need to > > > keep track of some more stuff to support this, such as whether the > > > dynptr is read only and which helper functions need to check whether > > > the dynptr is read only or not. > > > > What I'm trying to understand is how dynptr_data is supposed to work > > if this dynptr read only bit is only known at runtime. Or will it be > > always known statically so that it can set returned pointer as read > > only? Because then it doesn't seem it is required or useful to track > > the readonly bit at runtime. > > I think it'll always be known statically whether the dynptr is > read-only or not. If we make all writable dynptr helper functions > reject read-only dynptrs at load time instead of run time, then yes we > can remove the read-only bit in the bpf_dynptr_kern struct. > > There's also the question of whether this constraint (eg all read-only > writes are rejected at load time) is too rigid - for example, what if > in the future we want to add a helper function where if a certain > condition is met, then we write some number of bytes, else we read > some number of bytes? This would be not possible to add then, since > we'll only know at runtime whether the condition is met. > > I personally lean towards rejecting helper function writes at runtime, > but if you think it's a non-trivial benefit to reject at load time > instead, I'm fine going with that. > My personal opinion is this: When I am working with a statically known read only dynptr, it is like declaring a variable const. Every function expecting it to be non-const should fail compilation, and trying to mutate the variables through writes should also fail compilation. For BPF compilation is analogous to program load. It might be that said variable is not const, then those operations may fail at runtime due to some other reason. Being dynamically read-only is then a runtime failure condition, which will cause failure at runtime. Both are distinct cases in my mind, and it is fine to fail at runtime when we don't know. In general, you save a lot of time of the user in my opinion (esp. people new to things) if you reject known incorrectness as early as possible. E.g. load a dynptr from a map, where the field accepts storing both read only and non-read only ones. Then it is expected that writes may fail at runtime. That also allows you to switch read-only ness at runtime back to rw. But if the field only expects rdonly dynptr, verifier knows that the type is const statically, so it triggers failures for writes at load time instead. Taking this a step further, you may even store rw dynptr to a map field expecting rdonly dynptr. That's like returning a const pointer from a function for a rw memory, where you want to limit access of the user, even better if you do it statically. Then functions trying to write to dynptr loaded from said map field will fail load itself, while others having access to rw dynptr can do it just fine. When the verifier does not know, it does not know. There will be such cases when you make const-ness a runtime property. > > > > It is fine if _everything_ checks it at runtime, but that doesn't seem > > possible, hence the question. We would need a new slice helper that > > only returns read-only slices, because dynptr_data can return rw > > slices currently and it is already UAPI so changing that is not > > possible anymore. > > I don't agree that if bpf_dynptr_write() is checked at runtime, then > bpf_dynptr_data must also be checked at runtime to be consistent. I > think it's fine if writes through helper functions are rejected at > runtime, and writes through direct access are rejected at load time. > That doesn't seem inconsistent to me. My point was more that dynptr_data cannot propagate runtime read-only-ness to its returned pointer. The verifier has to know statically, at which point I don't see why we can't just reject other cases at load anyway. When we have dynptrs which have const-ness as a runtime property, it is ok to support that by failing at runtime (but then you'll have a hard time deciding how you want dynptr_data to work, most likely you'll need another helper which returns only a rdonly slice, when it fails, we call dynptr_data for rw slice). But as I said before, I don't know how dynptr is going to evolve in the future, so you'll have a better idea, and I'll leave it up to you decide how you want to design its API. Enough words exchanged about this :).
On Fri, Aug 26, 2022 at 1:54 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Fri, 26 Aug 2022 at 21:49, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > On Fri, Aug 26, 2022 at 11:52 AM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > > > > On Fri, 26 Aug 2022 at 20:44, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > > > On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi > > > > <memxor@gmail.com> wrote: > > > > > > > > > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > [...] > > > > > > > > > > > > > > Related question, it seems we know statically if dynptr is read only > > > > > > > or not, so why even do all this hidden parameter passing and instead > > > > > > > just reject writes directly? You only need to be able to set > > > > > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > > > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > > > > > seems simpler than checking it at runtime. Verifier already handles > > > > > > > MEM_RDONLY generically, you only need to add the guard for > > > > > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > > > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > > > > > > > > > There will be other helper functions that do writes (eg memcpy to > > > > > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > > > > > dynptrs, ...) so it's more scalable if we reject these at runtime > > > > > > rather than enforce these at the verifier level. I also think it's > > > > > > cleaner to keep the verifier logic as simple as possible and do the > > > > > > checking in the helper. > > > > > > > > > > I won't be pushing this further, since you know what you plan to add > > > > > in the future better, but I still disagree. > > > > > > > > > > I'm guessing there might be dynptrs where this read only property is > > > > > set dynamically at runtime, which is why you want to go this route? > > > > > I.e. you might not know statically whether dynptr is read only or not? > > > > > > > > > > My main confusion is the inconsistency here. > > > > > > > > > > Right now the patch implicitly relies on may_access_direct_pkt_data to > > > > > protect slices returned from dynptr_data, instead of setting > > > > > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > > > > > needed. So indirectly, you are relying on knowing statically whether > > > > > the dynptr is read only or not. But then you also set this bit at > > > > > runtime. > > > > > > > > > > So you reject some cases at load time, and the rest of them only at > > > > > runtime. Direct writes to dynptr slice fails load, writes through > > > > > helper does not (only fails at runtime). > > > > > > > > > > Also, dynptr_data needs to know whether dynptr is read only > > > > > statically, to protect writes to its returned pointer, unless you > > > > > decide to introduce another helper for the dynamic rdonly bit case > > > > > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > > > > > works for some rdonly dynptrs (known to be rdonly statically, like > > > > > this skb one), but not for others. > > > > > > > > > > I also don't agree about the complexity or scalability part, all the > > > > > infra and precedence is already there. We already have similar checks > > > > > for meta->raw_mode where we reject writes to read only pointers in > > > > > check_helper_mem_access. > > > > > > > > My point about scalability is that if we reject bpf_dynptr_write() at > > > > load time, then we must reject any future dynptr helper that does any > > > > writing at load time as well, to be consistent. > > > > > > > > I don't feel strongly about whether we reject at load time or run > > > > time. Rejecting at load time instead of runtime doesn't seem that > > > > useful to me, but there's a good chance I'm wrong here since Martin > > > > stated that he prefers rejecting at load time as well. > > > > > > > > As for the added complexity part, what I mean is that we'll need to > > > > keep track of some more stuff to support this, such as whether the > > > > dynptr is read only and which helper functions need to check whether > > > > the dynptr is read only or not. > > > > > > What I'm trying to understand is how dynptr_data is supposed to work > > > if this dynptr read only bit is only known at runtime. Or will it be > > > always known statically so that it can set returned pointer as read > > > only? Because then it doesn't seem it is required or useful to track > > > the readonly bit at runtime. > > > > I think it'll always be known statically whether the dynptr is > > read-only or not. If we make all writable dynptr helper functions > > reject read-only dynptrs at load time instead of run time, then yes we > > can remove the read-only bit in the bpf_dynptr_kern struct. > > > > There's also the question of whether this constraint (eg all read-only > > writes are rejected at load time) is too rigid - for example, what if > > in the future we want to add a helper function where if a certain > > condition is met, then we write some number of bytes, else we read > > some number of bytes? This would be not possible to add then, since > > we'll only know at runtime whether the condition is met. > > > > I personally lean towards rejecting helper function writes at runtime, > > but if you think it's a non-trivial benefit to reject at load time > > instead, I'm fine going with that. > > > > My personal opinion is this: > > When I am working with a statically known read only dynptr, it is like > declaring a variable const. Every function expecting it to be > non-const should fail compilation, and trying to mutate the variables > through writes should also fail compilation. For BPF compilation is > analogous to program load. > > It might be that said variable is not const, then those operations may > fail at runtime due to some other reason. Being dynamically read-only > is then a runtime failure condition, which will cause failure at > runtime. Both are distinct cases in my mind, and it is fine to fail at > runtime when we don't know. In general, you save a lot of time of the > user in my opinion (esp. people new to things) if you reject known > incorrectness as early as possible. > > E.g. load a dynptr from a map, where the field accepts storing both > read only and non-read only ones. Then it is expected that writes may > fail at runtime. That also allows you to switch read-only ness at > runtime back to rw. But if the field only expects rdonly dynptr, > verifier knows that the type is const statically, so it triggers > failures for writes at load time instead. > > Taking this a step further, you may even store rw dynptr to a map > field expecting rdonly dynptr. That's like returning a const pointer > from a function for a rw memory, where you want to limit access of the > user, even better if you do it statically. Then functions trying to > write to dynptr loaded from said map field will fail load itself, > while others having access to rw dynptr can do it just fine. > > When the verifier does not know, it does not know. There will be such > cases when you make const-ness a runtime property. > > > > > > > It is fine if _everything_ checks it at runtime, but that doesn't seem > > > possible, hence the question. We would need a new slice helper that > > > only returns read-only slices, because dynptr_data can return rw > > > slices currently and it is already UAPI so changing that is not > > > possible anymore. > > > > I don't agree that if bpf_dynptr_write() is checked at runtime, then > > bpf_dynptr_data must also be checked at runtime to be consistent. I > > think it's fine if writes through helper functions are rejected at > > runtime, and writes through direct access are rejected at load time. > > That doesn't seem inconsistent to me. > > My point was more that dynptr_data cannot propagate runtime > read-only-ness to its returned pointer. The verifier has to know > statically, at which point I don't see why we can't just reject other > cases at load anyway. I think the right answer here is to not make bpf_dynptr_data() return direct pointer of changing read-only-ness. Maybe the right answer here is another helper, bpf_dynptr_data_rdonly(), that will return NULL for non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed read-only? By saying that read-only-ness of dynptr should be statically known and rejecting some dynptr functions at load time places us at the mercy of verifier's complete knowledge of application logic, which is exactly against the spirit of dynptr. It's only slightly tangential, but I still dread my experience proving to BPF verifier that some value is strictly greater than zero for BPF helper that expected ARG_CONST_SIZE (not ARG_CONST_SIZE_OR_ZERO). There were also cases were absolutely correct program had to be mangled just to prove to BPF verifier that it indeed can return just 0 or 1, etc. This is not to bash BPF verifier, but just to point out that sometimes unnecessary strictness adds nothing but unnecessary pain to user. So, let's not reject anything at load, we can check all that at runtime and return NULL. But bpf_dynptr_data_rdonly() seems useful for cases where we know we are not going to write > > When we have dynptrs which have const-ness as a runtime property, it > is ok to support that by failing at runtime (but then you'll have a > hard time deciding how you want dynptr_data to work, most likely > you'll need another helper which returns only a rdonly slice, when it > fails, we call dynptr_data for rw slice). > > But as I said before, I don't know how dynptr is going to evolve in > the future, so you'll have a better idea, and I'll leave it up to you > decide how you want to design its API. Enough words exchanged about > this :). directionally, dynptr is about offloading decisions to runtime, so I think avoiding unnecessary restrictions at verification time is the right trade off
On Sat, 27 Aug 2022 at 07:37, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Fri, Aug 26, 2022 at 1:54 PM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > On Fri, 26 Aug 2022 at 21:49, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > On Fri, Aug 26, 2022 at 11:52 AM Kumar Kartikeya Dwivedi > > > <memxor@gmail.com> wrote: > > > > > > > > On Fri, 26 Aug 2022 at 20:44, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > > > > > On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi > > > > > <memxor@gmail.com> wrote: > > > > > > > > > > > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > > [...] > > > > > > > > > > > > > > > > Related question, it seems we know statically if dynptr is read only > > > > > > > > or not, so why even do all this hidden parameter passing and instead > > > > > > > > just reject writes directly? You only need to be able to set > > > > > > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > > > > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > > > > > > seems simpler than checking it at runtime. Verifier already handles > > > > > > > > MEM_RDONLY generically, you only need to add the guard for > > > > > > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > > > > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > > > > > > > > > > > There will be other helper functions that do writes (eg memcpy to > > > > > > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > > > > > > dynptrs, ...) so it's more scalable if we reject these at runtime > > > > > > > rather than enforce these at the verifier level. I also think it's > > > > > > > cleaner to keep the verifier logic as simple as possible and do the > > > > > > > checking in the helper. > > > > > > > > > > > > I won't be pushing this further, since you know what you plan to add > > > > > > in the future better, but I still disagree. > > > > > > > > > > > > I'm guessing there might be dynptrs where this read only property is > > > > > > set dynamically at runtime, which is why you want to go this route? > > > > > > I.e. you might not know statically whether dynptr is read only or not? > > > > > > > > > > > > My main confusion is the inconsistency here. > > > > > > > > > > > > Right now the patch implicitly relies on may_access_direct_pkt_data to > > > > > > protect slices returned from dynptr_data, instead of setting > > > > > > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > > > > > > needed. So indirectly, you are relying on knowing statically whether > > > > > > the dynptr is read only or not. But then you also set this bit at > > > > > > runtime. > > > > > > > > > > > > So you reject some cases at load time, and the rest of them only at > > > > > > runtime. Direct writes to dynptr slice fails load, writes through > > > > > > helper does not (only fails at runtime). > > > > > > > > > > > > Also, dynptr_data needs to know whether dynptr is read only > > > > > > statically, to protect writes to its returned pointer, unless you > > > > > > decide to introduce another helper for the dynamic rdonly bit case > > > > > > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > > > > > > works for some rdonly dynptrs (known to be rdonly statically, like > > > > > > this skb one), but not for others. > > > > > > > > > > > > I also don't agree about the complexity or scalability part, all the > > > > > > infra and precedence is already there. We already have similar checks > > > > > > for meta->raw_mode where we reject writes to read only pointers in > > > > > > check_helper_mem_access. > > > > > > > > > > My point about scalability is that if we reject bpf_dynptr_write() at > > > > > load time, then we must reject any future dynptr helper that does any > > > > > writing at load time as well, to be consistent. > > > > > > > > > > I don't feel strongly about whether we reject at load time or run > > > > > time. Rejecting at load time instead of runtime doesn't seem that > > > > > useful to me, but there's a good chance I'm wrong here since Martin > > > > > stated that he prefers rejecting at load time as well. > > > > > > > > > > As for the added complexity part, what I mean is that we'll need to > > > > > keep track of some more stuff to support this, such as whether the > > > > > dynptr is read only and which helper functions need to check whether > > > > > the dynptr is read only or not. > > > > > > > > What I'm trying to understand is how dynptr_data is supposed to work > > > > if this dynptr read only bit is only known at runtime. Or will it be > > > > always known statically so that it can set returned pointer as read > > > > only? Because then it doesn't seem it is required or useful to track > > > > the readonly bit at runtime. > > > > > > I think it'll always be known statically whether the dynptr is > > > read-only or not. If we make all writable dynptr helper functions > > > reject read-only dynptrs at load time instead of run time, then yes we > > > can remove the read-only bit in the bpf_dynptr_kern struct. > > > > > > There's also the question of whether this constraint (eg all read-only > > > writes are rejected at load time) is too rigid - for example, what if > > > in the future we want to add a helper function where if a certain > > > condition is met, then we write some number of bytes, else we read > > > some number of bytes? This would be not possible to add then, since > > > we'll only know at runtime whether the condition is met. > > > > > > I personally lean towards rejecting helper function writes at runtime, > > > but if you think it's a non-trivial benefit to reject at load time > > > instead, I'm fine going with that. > > > > > > > My personal opinion is this: > > > > When I am working with a statically known read only dynptr, it is like > > declaring a variable const. Every function expecting it to be > > non-const should fail compilation, and trying to mutate the variables > > through writes should also fail compilation. For BPF compilation is > > analogous to program load. > > > > It might be that said variable is not const, then those operations may > > fail at runtime due to some other reason. Being dynamically read-only > > is then a runtime failure condition, which will cause failure at > > runtime. Both are distinct cases in my mind, and it is fine to fail at > > runtime when we don't know. In general, you save a lot of time of the > > user in my opinion (esp. people new to things) if you reject known > > incorrectness as early as possible. > > > > E.g. load a dynptr from a map, where the field accepts storing both > > read only and non-read only ones. Then it is expected that writes may > > fail at runtime. That also allows you to switch read-only ness at > > runtime back to rw. But if the field only expects rdonly dynptr, > > verifier knows that the type is const statically, so it triggers > > failures for writes at load time instead. > > > > Taking this a step further, you may even store rw dynptr to a map > > field expecting rdonly dynptr. That's like returning a const pointer > > from a function for a rw memory, where you want to limit access of the > > user, even better if you do it statically. Then functions trying to > > write to dynptr loaded from said map field will fail load itself, > > while others having access to rw dynptr can do it just fine. > > > > When the verifier does not know, it does not know. There will be such > > cases when you make const-ness a runtime property. > > > > > > > > > > It is fine if _everything_ checks it at runtime, but that doesn't seem > > > > possible, hence the question. We would need a new slice helper that > > > > only returns read-only slices, because dynptr_data can return rw > > > > slices currently and it is already UAPI so changing that is not > > > > possible anymore. > > > > > > I don't agree that if bpf_dynptr_write() is checked at runtime, then > > > bpf_dynptr_data must also be checked at runtime to be consistent. I > > > think it's fine if writes through helper functions are rejected at > > > runtime, and writes through direct access are rejected at load time. > > > That doesn't seem inconsistent to me. > > > > My point was more that dynptr_data cannot propagate runtime > > read-only-ness to its returned pointer. The verifier has to know > > statically, at which point I don't see why we can't just reject other > > cases at load anyway. > > I think the right answer here is to not make bpf_dynptr_data() return > direct pointer of changing read-only-ness. Maybe the right answer here > is another helper, bpf_dynptr_data_rdonly(), that will return NULL for > non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed > read-only? Shouldn't it be the other way around? bpf_dynptr_data_rdonly() should work for both ro and rw dynptrs, and bpf_dynptr_data() only for rw dynptr? And yes, you're kind of painting yourself in a corner if you allow dynptr_data to work with statically ro dynptrs now, ascertaining the ro bit for the returned slice, and then later you plan to add dynptrs whose ro vs rw is not known to the verifier statically. Then it works well for statically known ones (returning both ro and rw slices), but has to return NULL at runtime for statically unknown read only ones, and always rw slice in verifier state for them. > > By saying that read-only-ness of dynptr should be statically known and > rejecting some dynptr functions at load time places us at the mercy of > verifier's complete knowledge of application logic, which is exactly > against the spirit of dynptr. > Right, that might be too restrictive if we require them to be statically read only. But it's not about forcing it to be statically ro, it is more about rejecting load when we know the program is incorrect (e.g. the types are incorrect when passed to helpers), otherwise we throw the error at runtime anyway, which seems to be the convention afaicu. But maybe I missed the memo and we gradually want to move away from such strict static checks. I view the situation here similar to if we were rejecting direct writes to PTR_TO_MEM | MEM_RDONLY at load time, but offloading as runtime check in the helper writing to it as rw memory arg. It's as if we pretend it's part of the 'type' of the register when doing direct writes, but then ignore it while matching it against the said helper's argument type. > It's only slightly tangential, but I still dread my experience proving > to BPF verifier that some value is strictly greater than zero for BPF > helper that expected ARG_CONST_SIZE (not ARG_CONST_SIZE_OR_ZERO). > There were also cases were absolutely correct program had to be > mangled just to prove to BPF verifier that it indeed can return just 0 > or 1, etc. This is not to bash BPF verifier, but just to point out > that sometimes unnecessary strictness adds nothing but unnecessary > pain to user. So, let's not reject anything at load, we can check all > that at runtime and return NULL. > > But bpf_dynptr_data_rdonly() seems useful for cases where we know we > are not going to write > > > > > When we have dynptrs which have const-ness as a runtime property, it > > is ok to support that by failing at runtime (but then you'll have a > > hard time deciding how you want dynptr_data to work, most likely > > you'll need another helper which returns only a rdonly slice, when it > > fails, we call dynptr_data for rw slice). > > > > But as I said before, I don't know how dynptr is going to evolve in > > the future, so you'll have a better idea, and I'll leave it up to you > > decide how you want to design its API. Enough words exchanged about > > this :). > > directionally, dynptr is about offloading decisions to runtime, so I > think avoiding unnecessary restrictions at verification time is the > right trade off Fair enough, I guess I've made my point. Let's go with what you both feel would be best.
On Sat, Aug 27, 2022 at 12:12 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Sat, 27 Aug 2022 at 07:37, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > On Fri, Aug 26, 2022 at 1:54 PM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > > > > On Fri, 26 Aug 2022 at 21:49, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > > > On Fri, Aug 26, 2022 at 11:52 AM Kumar Kartikeya Dwivedi > > > > <memxor@gmail.com> wrote: > > > > > > > > > > On Fri, 26 Aug 2022 at 20:44, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > > > > > > > On Thu, Aug 25, 2022 at 5:19 PM Kumar Kartikeya Dwivedi > > > > > > <memxor@gmail.com> wrote: > > > > > > > > > > > > > > On Thu, 25 Aug 2022 at 23:02, Joanne Koong <joannelkoong@gmail.com> wrote: > > > > > > > > [...] > > > > > > > > > > > > > > > > > > Related question, it seems we know statically if dynptr is read only > > > > > > > > > or not, so why even do all this hidden parameter passing and instead > > > > > > > > > just reject writes directly? You only need to be able to set > > > > > > > > > MEM_RDONLY on dynptr_data returned PTR_TO_PACKETs, and reject > > > > > > > > > dynptr_write when dynptr type is xdp/skb (and ctx is only one). That > > > > > > > > > seems simpler than checking it at runtime. Verifier already handles > > > > > > > > > MEM_RDONLY generically, you only need to add the guard for > > > > > > > > > check_packet_acces (and check_helper_mem_access for meta->raw_mode > > > > > > > > > under pkt case), and rejecting dynptr_write seems like a if condition. > > > > > > > > > > > > > > > > There will be other helper functions that do writes (eg memcpy to > > > > > > > > dynptrs, strncpy to dynptrs, probe read user to dynptrs, hashing > > > > > > > > dynptrs, ...) so it's more scalable if we reject these at runtime > > > > > > > > rather than enforce these at the verifier level. I also think it's > > > > > > > > cleaner to keep the verifier logic as simple as possible and do the > > > > > > > > checking in the helper. > > > > > > > > > > > > > > I won't be pushing this further, since you know what you plan to add > > > > > > > in the future better, but I still disagree. > > > > > > > > > > > > > > I'm guessing there might be dynptrs where this read only property is > > > > > > > set dynamically at runtime, which is why you want to go this route? > > > > > > > I.e. you might not know statically whether dynptr is read only or not? > > > > > > > > > > > > > > My main confusion is the inconsistency here. > > > > > > > > > > > > > > Right now the patch implicitly relies on may_access_direct_pkt_data to > > > > > > > protect slices returned from dynptr_data, instead of setting > > > > > > > MEM_RDONLY on the returned PTR_TO_PACKET. Which is fine, it's not > > > > > > > needed. So indirectly, you are relying on knowing statically whether > > > > > > > the dynptr is read only or not. But then you also set this bit at > > > > > > > runtime. > > > > > > > > > > > > > > So you reject some cases at load time, and the rest of them only at > > > > > > > runtime. Direct writes to dynptr slice fails load, writes through > > > > > > > helper does not (only fails at runtime). > > > > > > > > > > > > > > Also, dynptr_data needs to know whether dynptr is read only > > > > > > > statically, to protect writes to its returned pointer, unless you > > > > > > > decide to introduce another helper for the dynamic rdonly bit case > > > > > > > (like dynptr_data_rdonly). Then you have a mismatch, where dynptr_data > > > > > > > works for some rdonly dynptrs (known to be rdonly statically, like > > > > > > > this skb one), but not for others. > > > > > > > > > > > > > > I also don't agree about the complexity or scalability part, all the > > > > > > > infra and precedence is already there. We already have similar checks > > > > > > > for meta->raw_mode where we reject writes to read only pointers in > > > > > > > check_helper_mem_access. > > > > > > > > > > > > My point about scalability is that if we reject bpf_dynptr_write() at > > > > > > load time, then we must reject any future dynptr helper that does any > > > > > > writing at load time as well, to be consistent. > > > > > > > > > > > > I don't feel strongly about whether we reject at load time or run > > > > > > time. Rejecting at load time instead of runtime doesn't seem that > > > > > > useful to me, but there's a good chance I'm wrong here since Martin > > > > > > stated that he prefers rejecting at load time as well. > > > > > > > > > > > > As for the added complexity part, what I mean is that we'll need to > > > > > > keep track of some more stuff to support this, such as whether the > > > > > > dynptr is read only and which helper functions need to check whether > > > > > > the dynptr is read only or not. > > > > > > > > > > What I'm trying to understand is how dynptr_data is supposed to work > > > > > if this dynptr read only bit is only known at runtime. Or will it be > > > > > always known statically so that it can set returned pointer as read > > > > > only? Because then it doesn't seem it is required or useful to track > > > > > the readonly bit at runtime. > > > > > > > > I think it'll always be known statically whether the dynptr is > > > > read-only or not. If we make all writable dynptr helper functions > > > > reject read-only dynptrs at load time instead of run time, then yes we > > > > can remove the read-only bit in the bpf_dynptr_kern struct. > > > > > > > > There's also the question of whether this constraint (eg all read-only > > > > writes are rejected at load time) is too rigid - for example, what if > > > > in the future we want to add a helper function where if a certain > > > > condition is met, then we write some number of bytes, else we read > > > > some number of bytes? This would be not possible to add then, since > > > > we'll only know at runtime whether the condition is met. > > > > > > > > I personally lean towards rejecting helper function writes at runtime, > > > > but if you think it's a non-trivial benefit to reject at load time > > > > instead, I'm fine going with that. > > > > > > > > > > My personal opinion is this: > > > > > > When I am working with a statically known read only dynptr, it is like > > > declaring a variable const. Every function expecting it to be > > > non-const should fail compilation, and trying to mutate the variables > > > through writes should also fail compilation. For BPF compilation is > > > analogous to program load. > > > > > > It might be that said variable is not const, then those operations may > > > fail at runtime due to some other reason. Being dynamically read-only > > > is then a runtime failure condition, which will cause failure at > > > runtime. Both are distinct cases in my mind, and it is fine to fail at > > > runtime when we don't know. In general, you save a lot of time of the > > > user in my opinion (esp. people new to things) if you reject known > > > incorrectness as early as possible. > > > > > > E.g. load a dynptr from a map, where the field accepts storing both > > > read only and non-read only ones. Then it is expected that writes may > > > fail at runtime. That also allows you to switch read-only ness at > > > runtime back to rw. But if the field only expects rdonly dynptr, > > > verifier knows that the type is const statically, so it triggers > > > failures for writes at load time instead. > > > > > > Taking this a step further, you may even store rw dynptr to a map > > > field expecting rdonly dynptr. That's like returning a const pointer > > > from a function for a rw memory, where you want to limit access of the > > > user, even better if you do it statically. Then functions trying to > > > write to dynptr loaded from said map field will fail load itself, > > > while others having access to rw dynptr can do it just fine. > > > > > > When the verifier does not know, it does not know. There will be such > > > cases when you make const-ness a runtime property. > > > > > > > > > > > > > It is fine if _everything_ checks it at runtime, but that doesn't seem > > > > > possible, hence the question. We would need a new slice helper that > > > > > only returns read-only slices, because dynptr_data can return rw > > > > > slices currently and it is already UAPI so changing that is not > > > > > possible anymore. > > > > > > > > I don't agree that if bpf_dynptr_write() is checked at runtime, then > > > > bpf_dynptr_data must also be checked at runtime to be consistent. I > > > > think it's fine if writes through helper functions are rejected at > > > > runtime, and writes through direct access are rejected at load time. > > > > That doesn't seem inconsistent to me. > > > > > > My point was more that dynptr_data cannot propagate runtime > > > read-only-ness to its returned pointer. The verifier has to know > > > statically, at which point I don't see why we can't just reject other > > > cases at load anyway. > > > > I think the right answer here is to not make bpf_dynptr_data() return > > direct pointer of changing read-only-ness. Maybe the right answer here > > is another helper, bpf_dynptr_data_rdonly(), that will return NULL for > > non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed > > read-only? > > Shouldn't it be the other way around? bpf_dynptr_data_rdonly() should > work for both ro and rw dynptrs, and bpf_dynptr_data() only for rw > dynptr? Right, that's what I proposed: "bpf_dynptr_data_rdonly(), that will return NULL for non-read-only dynptr" so if you pass read-write dynptr, it will return NULL (because it's unsafe to take writable direct pointer). bpf_dynptr_data_rdonly() should still work fine with both rdonly and read-write dynptr. bpf_dynptr_data() only works (in the sense returns non-NULL) for read-write dynptr only. > > And yes, you're kind of painting yourself in a corner if you allow > dynptr_data to work with statically ro dynptrs now, ascertaining the > ro bit for the returned slice, and then later you plan to add dynptrs > whose ro vs rw is not known to the verifier statically. Then it works > well for statically known ones (returning both ro and rw slices), but > has to return NULL at runtime for statically unknown read only ones, > and always rw slice in verifier state for them. Right, will be both inconsistent and puzzling. > > > > > By saying that read-only-ness of dynptr should be statically known and > > rejecting some dynptr functions at load time places us at the mercy of > > verifier's complete knowledge of application logic, which is exactly > > against the spirit of dynptr. > > > > Right, that might be too restrictive if we require them to be > statically read only. > > But it's not about forcing it to be statically ro, it is more about > rejecting load when we know the program is incorrect (e.g. the types > are incorrect when passed to helpers), otherwise we throw the error at > runtime anyway, which seems to be the convention afaicu. But maybe I > missed the memo and we gradually want to move away from such strict > static checks. > > I view the situation here similar to if we were rejecting direct > writes to PTR_TO_MEM | MEM_RDONLY at load time, but offloading as > runtime check in the helper writing to it as rw memory arg. It's as if > we pretend it's part of the 'type' of the register when doing direct > writes, but then ignore it while matching it against the said helper's > argument type. I disagree, it's not the same. bpf_dynptr_data/bpf_dynptr_data_rdonly turns completely dynamic dynptr into static slice of memory. Only after that point it makes sense for BPF verifier to reject something. Until then it's not incorrect. BPF program will always have to deal with some dynptr operations potentially failing. dynptr can always be NULL internally, you can still call bpf_dynptr_xxx() operations on it, they will just do nothing and return error. That doesn't make BPF program incorrect. As I said, dynptr is about shifting static verification to runtime checks for stuff that BPF verifier can't or shouldn't verify statically. Like dynptr's dynamic size, for example. We've used a special data/data_end pointer types to try to solve this for skb, but it is quite painful in practice and relies on compiler generating "good" sequence of instructions understandable to BPF verifier. dynptr takes a different approach, shifts validation to runtime and "interfaces" with static verification through bpf_dynptr_data() and similar APIs. > > > It's only slightly tangential, but I still dread my experience proving > > to BPF verifier that some value is strictly greater than zero for BPF > > helper that expected ARG_CONST_SIZE (not ARG_CONST_SIZE_OR_ZERO). > > There were also cases were absolutely correct program had to be > > mangled just to prove to BPF verifier that it indeed can return just 0 > > or 1, etc. This is not to bash BPF verifier, but just to point out > > that sometimes unnecessary strictness adds nothing but unnecessary > > pain to user. So, let's not reject anything at load, we can check all > > that at runtime and return NULL. > > > > But bpf_dynptr_data_rdonly() seems useful for cases where we know we > > are not going to write > > > > > > > > When we have dynptrs which have const-ness as a runtime property, it > > > is ok to support that by failing at runtime (but then you'll have a > > > hard time deciding how you want dynptr_data to work, most likely > > > you'll need another helper which returns only a rdonly slice, when it > > > fails, we call dynptr_data for rw slice). > > > > > > But as I said before, I don't know how dynptr is going to evolve in > > > the future, so you'll have a better idea, and I'll leave it up to you > > > decide how you want to design its API. Enough words exchanged about > > > this :). > > > > directionally, dynptr is about offloading decisions to runtime, so I > > think avoiding unnecessary restrictions at verification time is the > > right trade off > > Fair enough, I guess I've made my point. Let's go with what you both > feel would be best. Sounds good. I tried to point out the "philosophy" behind dynptr. It doesn't preclude making the BPF verifier smarter and track more things. But it's a deliberate design of dynptr to shift more checks to runtime because in a lot of cases it makes more sense than "fighting BPF verifier". This "fighting verifier" phrase is like a meme now. I've heard this exact phrase from multiple unrelated engineers. BPF verifier should be helpful, we should engineers having to "fight" it, so overly strict checks from verifier should be avoided if they don't compromise correctness, IMO.
On Sat, 27 Aug 2022 at 19:22, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Sat, Aug 27, 2022 at 12:12 AM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > [...] > > > > > > I think the right answer here is to not make bpf_dynptr_data() return > > > direct pointer of changing read-only-ness. Maybe the right answer here > > > is another helper, bpf_dynptr_data_rdonly(), that will return NULL for > > > non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed > > > read-only? > > > > Shouldn't it be the other way around? bpf_dynptr_data_rdonly() should > > work for both ro and rw dynptrs, and bpf_dynptr_data() only for rw > > dynptr? > > Right, that's what I proposed: > > "bpf_dynptr_data_rdonly(), that will return NULL for non-read-only dynptr" > > so if you pass read-write dynptr, it will return NULL (because it's > unsafe to take writable direct pointer). > > bpf_dynptr_data_rdonly() should still work fine with both rdonly and > read-write dynptr. > bpf_dynptr_data() only works (in the sense returns non-NULL) for > read-write dynptr only. > > > > > > And yes, you're kind of painting yourself in a corner if you allow > > dynptr_data to work with statically ro dynptrs now, ascertaining the > > ro bit for the returned slice, and then later you plan to add dynptrs > > whose ro vs rw is not known to the verifier statically. Then it works > > well for statically known ones (returning both ro and rw slices), but > > has to return NULL at runtime for statically unknown read only ones, > > and always rw slice in verifier state for them. > > Right, will be both inconsistent and puzzling. > > > > > > > > > By saying that read-only-ness of dynptr should be statically known and > > > rejecting some dynptr functions at load time places us at the mercy of > > > verifier's complete knowledge of application logic, which is exactly > > > against the spirit of dynptr. > > > > > > > Right, that might be too restrictive if we require them to be > > statically read only. > > > > But it's not about forcing it to be statically ro, it is more about > > rejecting load when we know the program is incorrect (e.g. the types > > are incorrect when passed to helpers), otherwise we throw the error at > > runtime anyway, which seems to be the convention afaicu. But maybe I > > missed the memo and we gradually want to move away from such strict > > static checks. > > > > I view the situation here similar to if we were rejecting direct > > writes to PTR_TO_MEM | MEM_RDONLY at load time, but offloading as > > runtime check in the helper writing to it as rw memory arg. It's as if > > we pretend it's part of the 'type' of the register when doing direct > > writes, but then ignore it while matching it against the said helper's > > argument type. > > I disagree, it's not the same. bpf_dynptr_data/bpf_dynptr_data_rdonly > turns completely dynamic dynptr into static slice of memory. Only > after that point it makes sense for BPF verifier to reject something. > Until then it's not incorrect. BPF program will always have to deal > with some dynptr operations potentially failing. dynptr can always be > NULL internally, you can still call bpf_dynptr_xxx() operations on it, > they will just do nothing and return error. That doesn't make BPF > program incorrect. Let me just explain one last time why I'm unable to swallow this 'completely dynamic dynptr' explanation, because it is not treated as completely dynamic by all dynptr helpers. No pushback, but it would be great if you could further help me wrap my head around this, so that we're in sync for future discussions. So you say you may not know the type of dynptr (read-only, rw, local, ringbuf, etc.). Hence you want to treat every dynptr as 'dynamic dynptr' you know nothing about even when you do know some information about it statically (e.g. if it's on stack). You don't want to reject things early at load even if you have all the info to do so. You want operations on it to fail at runtime instead. If you cannot track ro vs rw in the future statically, you won't be be able to track local or ringbuf or skb either (since both are part of the type of the dynptr). If you can, you can just as well encode const-ness as part of the type where you declare it (e.g. in a map field where the value is assigned dynamically, where you say dynptr_ringbuf etc.). Type comprises local vs skb vs ringbuf | ro vs rw for me. But maybe I could be wrong. So following this line of reasoning, will you be relaxing the argument type of helpers like bpf_ringbuf_submit_dynptr? Right now they take 'dynamic dynptr' as arg, but also expect DYNPTR_TYPE_RINGBUF, so you reject load when it's not a ringbuf dynptr. Will it then fallback to checking the type at runtime when the type will not be known? But then it will also permit passing local or skb dynptr in the future which it rejects right now. I'm just hoping you are able to see why it's looking a bit inconsistent to me. If DYNPTR_TYPE_RINGBUF has to be checked, it felt to me like DYNPTR_RDONLY is as much part of that kind of type safety wrt helpers. It would be set on the dynptr when skb passed to dynptr_from_skb is rdonly in some program types, along with DYNPTR_TYPE_SKB, and expect to be matched when invoking helpers on such dynptrs. It is fine to push checks to runtime, especially when you won't know the type, because verifier cannot reasonably track the dynptr type then. But right now there's still some level of state you maintain in the verifier about dynptrs (like it's type), and it seems to me like some helpers are using that state to reject things at load time, while some other helper will ignore it and fallback to runtime checks. I hope this is a clear enough description to atleast justify why I'm (still) a bit confused.
On Sat, 27 Aug 2022 at 20:32, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Sat, 27 Aug 2022 at 19:22, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > On Sat, Aug 27, 2022 at 12:12 AM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > [...] > > > > > > > > I think the right answer here is to not make bpf_dynptr_data() return > > > > direct pointer of changing read-only-ness. Maybe the right answer here > > > > is another helper, bpf_dynptr_data_rdonly(), that will return NULL for > > > > non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed > > > > read-only? > > > > > > Shouldn't it be the other way around? bpf_dynptr_data_rdonly() should > > > work for both ro and rw dynptrs, and bpf_dynptr_data() only for rw > > > dynptr? > > > > Right, that's what I proposed: > > > > "bpf_dynptr_data_rdonly(), that will return NULL for non-read-only dynptr" > > > > so if you pass read-write dynptr, it will return NULL (because it's > > unsafe to take writable direct pointer). > > > > bpf_dynptr_data_rdonly() should still work fine with both rdonly and > > read-write dynptr. > > bpf_dynptr_data() only works (in the sense returns non-NULL) for > > read-write dynptr only. > > > > > > > > > > And yes, you're kind of painting yourself in a corner if you allow > > > dynptr_data to work with statically ro dynptrs now, ascertaining the > > > ro bit for the returned slice, and then later you plan to add dynptrs > > > whose ro vs rw is not known to the verifier statically. Then it works > > > well for statically known ones (returning both ro and rw slices), but > > > has to return NULL at runtime for statically unknown read only ones, > > > and always rw slice in verifier state for them. > > > > Right, will be both inconsistent and puzzling. > > > > > > > > > > > > > By saying that read-only-ness of dynptr should be statically known and > > > > rejecting some dynptr functions at load time places us at the mercy of > > > > verifier's complete knowledge of application logic, which is exactly > > > > against the spirit of dynptr. > > > > > > > > > > Right, that might be too restrictive if we require them to be > > > statically read only. > > > > > > But it's not about forcing it to be statically ro, it is more about > > > rejecting load when we know the program is incorrect (e.g. the types > > > are incorrect when passed to helpers), otherwise we throw the error at > > > runtime anyway, which seems to be the convention afaicu. But maybe I > > > missed the memo and we gradually want to move away from such strict > > > static checks. > > > > > > I view the situation here similar to if we were rejecting direct > > > writes to PTR_TO_MEM | MEM_RDONLY at load time, but offloading as > > > runtime check in the helper writing to it as rw memory arg. It's as if > > > we pretend it's part of the 'type' of the register when doing direct > > > writes, but then ignore it while matching it against the said helper's > > > argument type. > > > > I disagree, it's not the same. bpf_dynptr_data/bpf_dynptr_data_rdonly > > turns completely dynamic dynptr into static slice of memory. Only > > after that point it makes sense for BPF verifier to reject something. > > Until then it's not incorrect. BPF program will always have to deal > > with some dynptr operations potentially failing. dynptr can always be > > NULL internally, you can still call bpf_dynptr_xxx() operations on it, > > they will just do nothing and return error. That doesn't make BPF > > program incorrect. > > Let me just explain one last time why I'm unable to swallow this > 'completely dynamic dynptr' explanation, because it is not treated as > completely dynamic by all dynptr helpers. > > No pushback, but it would be great if you could further help me wrap > my head around this, so that we're in sync for future discussions. > > So you say you may not know the type of dynptr (read-only, rw, local, > ringbuf, etc.). Hence you want to treat every dynptr as 'dynamic > dynptr' you know nothing about even when you do know some information > about it statically (e.g. if it's on stack). You don't want to reject > things early at load even if you have all the info to do so. You want > operations on it to fail at runtime instead. > > If you cannot track ro vs rw in the future statically, you won't be be > able to track local or ringbuf or skb either (since both are part of > the type of the dynptr). If you can, you can just as well encode > const-ness as part of the type where you declare it (e.g. in a map > field where the value is assigned dynamically, where you say > dynptr_ringbuf etc.). Type comprises local vs skb vs ringbuf | ro vs > rw for me. But maybe I could be wrong. > > So following this line of reasoning, will you be relaxing the argument > type of helpers like bpf_ringbuf_submit_dynptr? Right now they take > 'dynamic dynptr' as arg, but also expect DYNPTR_TYPE_RINGBUF, so you > reject load when it's not a ringbuf dynptr. Will it then fallback to > checking the type at runtime when the type will not be known? But then > it will also permit passing local or skb dynptr in the future which it > rejects right now. > > I'm just hoping you are able to see why it's looking a bit > inconsistent to me. If DYNPTR_TYPE_RINGBUF has to be checked, it felt > to me like DYNPTR_RDONLY is as much part of that kind of type safety > wrt helpers. It would be set on the dynptr when skb passed to > dynptr_from_skb is rdonly in some program types, along with > DYNPTR_TYPE_SKB, and expect to be matched when invoking helpers on > such dynptrs. > > It is fine to push checks to runtime, especially when you won't know > the type, because verifier cannot reasonably track the dynptr type > then. > To further expand on this point, in my mind when you have actual 'completely dynamic dynptrs', you will most likely track them as DYNPTR_TYPE_UNKNOWN in the verifier. Then you would capture free bits to encode the types at runtime, and start checking that in helpers. All helpers will start taking such DYNPTR_TYPE_UNKNOWN, or you can have a casting helper like we already have for normal pointers. No need for extra verifier complexity to teach it to recognize type when it is unknown. It will then be checked dynamically at runtime for dynptr operations. Being strictly type safe by default for helper arguments is not going to lead you to 'fighting the verifier'. Very much the opposite, the verifier is helping you catch errors that will rather occur at runtime anyway. > But right now there's still some level of state you maintain in the > verifier about dynptrs (like it's type), and it seems to me like some > helpers are using that state to reject things at load time, while some > other helper will ignore it and fallback to runtime checks. > > I hope this is a clear enough description to atleast justify why I'm > (still) a bit confused.
On Sat, Aug 27, 2022 at 11:33 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Sat, 27 Aug 2022 at 19:22, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > On Sat, Aug 27, 2022 at 12:12 AM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > [...] > > > > > > > > I think the right answer here is to not make bpf_dynptr_data() return > > > > direct pointer of changing read-only-ness. Maybe the right answer here > > > > is another helper, bpf_dynptr_data_rdonly(), that will return NULL for > > > > non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed > > > > read-only? > > > > > > Shouldn't it be the other way around? bpf_dynptr_data_rdonly() should > > > work for both ro and rw dynptrs, and bpf_dynptr_data() only for rw > > > dynptr? > > > > Right, that's what I proposed: > > > > "bpf_dynptr_data_rdonly(), that will return NULL for non-read-only dynptr" > > > > so if you pass read-write dynptr, it will return NULL (because it's > > unsafe to take writable direct pointer). > > > > bpf_dynptr_data_rdonly() should still work fine with both rdonly and > > read-write dynptr. > > bpf_dynptr_data() only works (in the sense returns non-NULL) for > > read-write dynptr only. > > > > > > > > > > And yes, you're kind of painting yourself in a corner if you allow > > > dynptr_data to work with statically ro dynptrs now, ascertaining the > > > ro bit for the returned slice, and then later you plan to add dynptrs > > > whose ro vs rw is not known to the verifier statically. Then it works > > > well for statically known ones (returning both ro and rw slices), but > > > has to return NULL at runtime for statically unknown read only ones, > > > and always rw slice in verifier state for them. > > > > Right, will be both inconsistent and puzzling. > > > > > > > > > > > > > By saying that read-only-ness of dynptr should be statically known and > > > > rejecting some dynptr functions at load time places us at the mercy of > > > > verifier's complete knowledge of application logic, which is exactly > > > > against the spirit of dynptr. > > > > > > > > > > Right, that might be too restrictive if we require them to be > > > statically read only. > > > > > > But it's not about forcing it to be statically ro, it is more about > > > rejecting load when we know the program is incorrect (e.g. the types > > > are incorrect when passed to helpers), otherwise we throw the error at > > > runtime anyway, which seems to be the convention afaicu. But maybe I > > > missed the memo and we gradually want to move away from such strict > > > static checks. > > > > > > I view the situation here similar to if we were rejecting direct > > > writes to PTR_TO_MEM | MEM_RDONLY at load time, but offloading as > > > runtime check in the helper writing to it as rw memory arg. It's as if > > > we pretend it's part of the 'type' of the register when doing direct > > > writes, but then ignore it while matching it against the said helper's > > > argument type. > > > > I disagree, it's not the same. bpf_dynptr_data/bpf_dynptr_data_rdonly > > turns completely dynamic dynptr into static slice of memory. Only > > after that point it makes sense for BPF verifier to reject something. > > Until then it's not incorrect. BPF program will always have to deal > > with some dynptr operations potentially failing. dynptr can always be > > NULL internally, you can still call bpf_dynptr_xxx() operations on it, > > they will just do nothing and return error. That doesn't make BPF > > program incorrect. > > Let me just explain one last time why I'm unable to swallow this > 'completely dynamic dynptr' explanation, because it is not treated as > completely dynamic by all dynptr helpers. > > No pushback, but it would be great if you could further help me wrap > my head around this, so that we're in sync for future discussions. > > So you say you may not know the type of dynptr (read-only, rw, local, > ringbuf, etc.). Hence you want to treat every dynptr as 'dynamic No, that's not what I said. I didn't talk about dynptr type (ringbuf vs skb vs local). Type we have to track statically precisely to limit what kind of helpers can be used on dynptrs of specific type. But note that there are helpers that don't care about dynptr and, in theory, should work with any dynptr. Because dynptr is an abstraction of "logically continuous range of memory". So in some cases even types don't matter. But regardless, I just don't think about read-only bit as part of dynptr types. I don't see why it has to be statically known and even why it has to stay as read-only or read-write for the entire duration of dynptr lifetime. Why can't we allow flipping dynptr from read-write to read-only before passing it to some custom BPF subprog, potentially provided from some BPF library which you don't control; just to make sure that it cannot really modify underlying memory (even if it is fundamentally modifiable, like with ringbuf)? So the point I'm trying to communicate is that things that don't **need** to be knowable statically to BPF verifier - **should not** be knowable and should be checked at runtime only. With any dynptr helper that is interfacing its runtime nature into static thing that verifier enforces (which is what bpf_dynptr_data/bpf_dynptr_data_rdonly are), it will always be "fallible" and users will have to expect that they might return NULL or do nothing, or whatever way to deal with failure case. And I believe dynptr read-onliness is not something that BPF verifier should be tracking statically. PTR_TO_MEM -- yes, dynptr -- no. That's it. Dynptr type/kind -- yes, track statically, it's way more important to determine what you can at all do with dynptr. > dynptr' you know nothing about even when you do know some information > about it statically (e.g. if it's on stack). You don't want to reject > things early at load even if you have all the info to do so. You want > operations on it to fail at runtime instead. > > If you cannot track ro vs rw in the future statically, you won't be be > able to track local or ringbuf or skb either (since both are part of > the type of the dynptr). If you can, you can just as well encode > const-ness as part of the type where you declare it (e.g. in a map > field where the value is assigned dynamically, where you say > dynptr_ringbuf etc.). Type comprises local vs skb vs ringbuf | ro vs > rw for me. But maybe I could be wrong. I don't consider read-only to be a part of type. It's more like offset and size to me, rather than type. And even if we can track constness (e.g., we can teach BPF verifier to recognize that hypothetical bpf_dynptr_set_read_only() makes dynptr into dynptr_rdonly, but why?) > > So following this line of reasoning, will you be relaxing the argument > type of helpers like bpf_ringbuf_submit_dynptr? Right now they take > 'dynamic dynptr' as arg, but also expect DYNPTR_TYPE_RINGBUF, so you > reject load when it's not a ringbuf dynptr. Will it then fallback to > checking the type at runtime when the type will not be known? But then > it will also permit passing local or skb dynptr in the future which it > rejects right now. No, which is exactly why we track dynptr type statically. Because some operations don't make sense and they have additional semantics (like with ringbuf submit). > > I'm just hoping you are able to see why it's looking a bit > inconsistent to me. If DYNPTR_TYPE_RINGBUF has to be checked, it felt > to me like DYNPTR_RDONLY is as much part of that kind of type safety > wrt helpers. It would be set on the dynptr when skb passed to > dynptr_from_skb is rdonly in some program types, along with > DYNPTR_TYPE_SKB, and expect to be matched when invoking helpers on > such dynptrs. I understand your view point. But you are taking "completely dynamic" nature of dynptr too far (not tracking dynptr type) in one case, or making it too restrictive (tracking read-only state) in another. We can argue what substitutes "inconsistent", but I'm just leaning towards slightly different tradeoffs. I don't see a huge value in preventing bpf_dynptr_write() to be called on read-only dynptr **statically**. I've ran into various cases where I'd rather BPF verifier not pretend to be all-knowing and too helpful, because it sometimes turns into a game of proving to BPF verifier that "I know what I'm doing, and it's correct, so please just let me do it". > > It is fine to push checks to runtime, especially when you won't know > the type, because verifier cannot reasonably track the dynptr type > then. > > But right now there's still some level of state you maintain in the > verifier about dynptrs (like it's type), and it seems to me like some > helpers are using that state to reject things at load time, while some > other helper will ignore it and fallback to runtime checks. Exactly. There are helpers that can fail no matter what, and read-onliness is just another condition they will have to check and reject. And some helpers really don't and shouldn't care about the nature of dynptr (like bpf_dynptr_get_size(), for example). > > I hope this is a clear enough description to atleast justify why I'm > (still) a bit confused. I don't think you are confused, you just prefer a different tradeoff. Hopefully I gave a few more arguments in favor of less static treatment of dynptr where it doesn't hurt correctness. P.S. I'm going away on vacation, so unlikely to be able to continue this discussion in a timely manner going forward.
On Sun, 28 Aug 2022 at 01:03, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Sat, Aug 27, 2022 at 11:33 AM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > On Sat, 27 Aug 2022 at 19:22, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > > > On Sat, Aug 27, 2022 at 12:12 AM Kumar Kartikeya Dwivedi > > > <memxor@gmail.com> wrote: > > > > [...] > > > > > > > > > > I think the right answer here is to not make bpf_dynptr_data() return > > > > > direct pointer of changing read-only-ness. Maybe the right answer here > > > > > is another helper, bpf_dynptr_data_rdonly(), that will return NULL for > > > > > non-read-only dynptr and PTR_TO_MEM | MEM_RDONLY if dynptr is indeed > > > > > read-only? > > > > > > > > Shouldn't it be the other way around? bpf_dynptr_data_rdonly() should > > > > work for both ro and rw dynptrs, and bpf_dynptr_data() only for rw > > > > dynptr? > > > > > > Right, that's what I proposed: > > > > > > "bpf_dynptr_data_rdonly(), that will return NULL for non-read-only dynptr" > > > > > > so if you pass read-write dynptr, it will return NULL (because it's > > > unsafe to take writable direct pointer). > > > > > > bpf_dynptr_data_rdonly() should still work fine with both rdonly and > > > read-write dynptr. > > > bpf_dynptr_data() only works (in the sense returns non-NULL) for > > > read-write dynptr only. > > > > > > > > > > > > > > And yes, you're kind of painting yourself in a corner if you allow > > > > dynptr_data to work with statically ro dynptrs now, ascertaining the > > > > ro bit for the returned slice, and then later you plan to add dynptrs > > > > whose ro vs rw is not known to the verifier statically. Then it works > > > > well for statically known ones (returning both ro and rw slices), but > > > > has to return NULL at runtime for statically unknown read only ones, > > > > and always rw slice in verifier state for them. > > > > > > Right, will be both inconsistent and puzzling. > > > > > > > > > > > > > > > > > By saying that read-only-ness of dynptr should be statically known and > > > > > rejecting some dynptr functions at load time places us at the mercy of > > > > > verifier's complete knowledge of application logic, which is exactly > > > > > against the spirit of dynptr. > > > > > > > > > > > > > Right, that might be too restrictive if we require them to be > > > > statically read only. > > > > > > > > But it's not about forcing it to be statically ro, it is more about > > > > rejecting load when we know the program is incorrect (e.g. the types > > > > are incorrect when passed to helpers), otherwise we throw the error at > > > > runtime anyway, which seems to be the convention afaicu. But maybe I > > > > missed the memo and we gradually want to move away from such strict > > > > static checks. > > > > > > > > I view the situation here similar to if we were rejecting direct > > > > writes to PTR_TO_MEM | MEM_RDONLY at load time, but offloading as > > > > runtime check in the helper writing to it as rw memory arg. It's as if > > > > we pretend it's part of the 'type' of the register when doing direct > > > > writes, but then ignore it while matching it against the said helper's > > > > argument type. > > > > > > I disagree, it's not the same. bpf_dynptr_data/bpf_dynptr_data_rdonly > > > turns completely dynamic dynptr into static slice of memory. Only > > > after that point it makes sense for BPF verifier to reject something. > > > Until then it's not incorrect. BPF program will always have to deal > > > with some dynptr operations potentially failing. dynptr can always be > > > NULL internally, you can still call bpf_dynptr_xxx() operations on it, > > > they will just do nothing and return error. That doesn't make BPF > > > program incorrect. > > > > Let me just explain one last time why I'm unable to swallow this > > 'completely dynamic dynptr' explanation, because it is not treated as > > completely dynamic by all dynptr helpers. > > > > No pushback, but it would be great if you could further help me wrap > > my head around this, so that we're in sync for future discussions. > > > > So you say you may not know the type of dynptr (read-only, rw, local, > > ringbuf, etc.). Hence you want to treat every dynptr as 'dynamic > > No, that's not what I said. I didn't talk about dynptr type (ringbuf > vs skb vs local). Type we have to track statically precisely to limit > what kind of helpers can be used on dynptrs of specific type. > > But note that there are helpers that don't care about dynptr and, in > theory, should work with any dynptr. Because dynptr is an abstraction > of "logically continuous range of memory". So in some cases even types > don't matter. > > But regardless, I just don't think about read-only bit as part of > dynptr types. I don't see why it has to be statically known and even > why it has to stay as read-only or read-write for the entire duration > of dynptr lifetime. Why can't we allow flipping dynptr from read-write > to read-only before passing it to some custom BPF subprog, potentially > provided from some BPF library which you don't control; just to make > sure that it cannot really modify underlying memory (even if it is > fundamentally modifiable, like with ringbuf)? > > So the point I'm trying to communicate is that things that don't > **need** to be knowable statically to BPF verifier - **should not** be > knowable and should be checked at runtime only. With any dynptr helper > that is interfacing its runtime nature into static thing that verifier > enforces (which is what bpf_dynptr_data/bpf_dynptr_data_rdonly are), > it will always be "fallible" and users will have to expect that they > might return NULL or do nothing, or whatever way to deal with failure > case. > > > And I believe dynptr read-onliness is not something that BPF verifier > should be tracking statically. PTR_TO_MEM -- yes, dynptr -- no. That's > it. Dynptr type/kind -- yes, track statically, it's way more important > to determine what you can at all do with dynptr. > Understood, it's more clear now how you're thinking about this :), especially that you don't consider read-only bit to be a property of dynptr type. I think that was the main point I was not following. And yes, I guess it's about different tradeoffs. Taking your BPF library example, my idea would be that we will anyway be verifying the global or static subprog in the library by the BTF of its function prototype. The verifier has to verify safety of the subprog by setting up argument types if it is global. There, I find it more appropriate for global ones to declare const bpf_dynptr * as the argument type if the intent is to not write to the dynptr. Then user can safely pass both rw and ro dynptrs to it, without having to flip it each time at runtime just to ensure safety. If it is static it will also be able to catch incorrect writes for callers. But indeed there may be cases where you want to control this at runtime, IMPO that should be orthogonal to type level const-ness. Those will be rw at the type level but may change const-ness at runtime, then operations start failing at runtime. Anyway, I guess we are done here. Again, thanks for sticking along and taking the time to describe it in detail :). > > [...]
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 39bd36359c1e..30615d1a0c13 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -407,11 +407,14 @@ enum bpf_type_flag { /* Size is known at compile time. */ MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), + /* DYNPTR points to sk_buff */ + DYNPTR_TYPE_SKB = BIT(11 + BPF_BASE_TYPE_BITS), + __BPF_TYPE_FLAG_MAX, __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1, }; -#define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF) +#define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF | DYNPTR_TYPE_SKB) /* Max number of base types. */ #define BPF_BASE_TYPE_LIMIT (1UL << BPF_BASE_TYPE_BITS) @@ -903,6 +906,36 @@ static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func( return bpf_func(ctx, insnsi); } +/* the implementation of the opaque uapi struct bpf_dynptr */ +struct bpf_dynptr_kern { + void *data; + /* Size represents the number of usable bytes of dynptr data. + * If for example the offset is at 4 for a local dynptr whose data is + * of type u64, the number of usable bytes is 4. + * + * The upper 8 bits are reserved. It is as follows: + * Bits 0 - 23 = size + * Bits 24 - 30 = dynptr type + * Bit 31 = whether dynptr is read-only + */ + u32 size; + u32 offset; +} __aligned(8); + +enum bpf_dynptr_type { + BPF_DYNPTR_TYPE_INVALID, + /* Points to memory that is local to the bpf program */ + BPF_DYNPTR_TYPE_LOCAL, + /* Underlying data is a ringbuf record */ + BPF_DYNPTR_TYPE_RINGBUF, + /* Underlying data is a sk_buff */ + BPF_DYNPTR_TYPE_SKB, + /* Underlying data is a xdp_buff */ + BPF_DYNPTR_TYPE_XDP, +}; + +int bpf_dynptr_check_size(u32 size); + #ifdef CONFIG_BPF_JIT int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr); int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr); @@ -1975,6 +2008,12 @@ static inline bool has_current_bpf_ctx(void) { return !!current->bpf_ctx; } + +void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data, + enum bpf_dynptr_type type, u32 offset, u32 size); +void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr); +void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr); + #else /* !CONFIG_BPF_SYSCALL */ static inline struct bpf_prog *bpf_prog_get(u32 ufd) { @@ -2188,6 +2227,19 @@ static inline bool has_current_bpf_ctx(void) { return false; } + +static inline void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data, + enum bpf_dynptr_type type, u32 offset, u32 size) +{ +} + +static inline void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr) +{ +} + +static inline void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr) +{ +} #endif /* CONFIG_BPF_SYSCALL */ void __bpf_free_used_btfs(struct bpf_prog_aux *aux, @@ -2548,35 +2600,6 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args, u32 **bin_buf, u32 num_args); void bpf_bprintf_cleanup(void); -/* the implementation of the opaque uapi struct bpf_dynptr */ -struct bpf_dynptr_kern { - void *data; - /* Size represents the number of usable bytes of dynptr data. - * If for example the offset is at 4 for a local dynptr whose data is - * of type u64, the number of usable bytes is 4. - * - * The upper 8 bits are reserved. It is as follows: - * Bits 0 - 23 = size - * Bits 24 - 30 = dynptr type - * Bit 31 = whether dynptr is read-only - */ - u32 size; - u32 offset; -} __aligned(8); - -enum bpf_dynptr_type { - BPF_DYNPTR_TYPE_INVALID, - /* Points to memory that is local to the bpf program */ - BPF_DYNPTR_TYPE_LOCAL, - /* Underlying data is a ringbuf record */ - BPF_DYNPTR_TYPE_RINGBUF, -}; - -void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data, - enum bpf_dynptr_type type, u32 offset, u32 size); -void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr); -int bpf_dynptr_check_size(u32 size); - #ifdef CONFIG_BPF_LSM void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype); void bpf_cgroup_atype_put(int cgroup_atype); diff --git a/include/linux/filter.h b/include/linux/filter.h index a5f21dc3c432..649063d9cbfd 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1532,4 +1532,8 @@ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind return XDP_REDIRECT; } +int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to, u32 len); +int __bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from, + u32 len, u64 flags); + #endif /* __LINUX_FILTER_H__ */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 934a2a8beb87..320e6b95d95c 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -5253,11 +5253,22 @@ union bpf_attr { * Description * Write *len* bytes from *src* into *dst*, starting from *offset* * into *dst*. - * *flags* is currently unused. + * + * *flags* must be 0 except for skb-type dynptrs. + * + * For skb-type dynptrs: + * * All data slices of the dynptr are automatically + * invalidated after **bpf_dynptr_write**\ (). If you wish to + * avoid this, please perform the write using direct data slices + * instead. + * + * * For *flags*, please see the flags accepted by + * **bpf_skb_store_bytes**\ (). * Return * 0 on success, -E2BIG if *offset* + *len* exceeds the length * of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst* - * is a read-only dynptr or if *flags* is not 0. + * is a read-only dynptr or if *flags* is not correct. For skb-type dynptrs, + * other errors correspond to errors returned by **bpf_skb_store_bytes**\ (). * * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len) * Description @@ -5265,10 +5276,20 @@ union bpf_attr { * * *len* must be a statically known value. The returned data slice * is invalidated whenever the dynptr is invalidated. + * + * For skb-type dynptrs: + * * If *offset* + *len* extends into the skb's paged buffers, + * the user should manually pull the skb with **bpf_skb_pull_data**\ () + * and try again. + * + * * The data slice is automatically invalidated anytime + * **bpf_dynptr_write**\ () or a helper call that changes + * the underlying packet buffer (eg **bpf_skb_pull_data**\ ()) + * is called. * Return * Pointer to the underlying dynptr data, NULL if the dynptr is * read-only, if the dynptr is invalid, or if the offset and length - * is out of bounds. + * is out of bounds or in a paged buffer for skb-type dynptrs. * * s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr *th, u32 th_len) * Description @@ -5355,6 +5376,18 @@ union bpf_attr { * Return * Current *ktime*. * + * long bpf_dynptr_from_skb(struct sk_buff *skb, u64 flags, struct bpf_dynptr *ptr) + * Description + * Get a dynptr to the data in *skb*. *skb* must be the BPF program + * context. Depending on program type, the dynptr may be read-only. + * + * Calls that change the *skb*'s underlying packet buffer + * (eg **bpf_skb_pull_data**\ ()) do not invalidate the dynptr, but + * they do invalidate any data slices associated with the dynptr. + * + * *flags* is currently unused, it must be 0 for now. + * Return + * 0 on success or -EINVAL if flags is not 0. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5566,6 +5599,7 @@ union bpf_attr { FN(tcp_raw_check_syncookie_ipv4), \ FN(tcp_raw_check_syncookie_ipv6), \ FN(ktime_get_tai_ns), \ + FN(dynptr_from_skb), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 3c1b9bbcf971..471a01a9b6ae 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1437,11 +1437,21 @@ static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr) return ptr->size & DYNPTR_RDONLY_BIT; } +void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr) +{ + ptr->size |= DYNPTR_RDONLY_BIT; +} + static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_type type) { ptr->size |= type << DYNPTR_TYPE_SHIFT; } +static enum bpf_dynptr_type bpf_dynptr_get_type(const struct bpf_dynptr_kern *ptr) +{ + return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT; +} + static u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr) { return ptr->size & DYNPTR_SIZE_MASK; @@ -1512,6 +1522,7 @@ static const struct bpf_func_proto bpf_dynptr_from_mem_proto = { BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src, u32, offset, u64, flags) { + enum bpf_dynptr_type type; int err; if (!src->data || flags) @@ -1521,9 +1532,19 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src if (err) return err; - memcpy(dst, src->data + src->offset + offset, len); + type = bpf_dynptr_get_type(src); - return 0; + switch (type) { + case BPF_DYNPTR_TYPE_LOCAL: + case BPF_DYNPTR_TYPE_RINGBUF: + memcpy(dst, src->data + src->offset + offset, len); + return 0; + case BPF_DYNPTR_TYPE_SKB: + return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len); + default: + WARN(true, "bpf_dynptr_read: unknown dynptr type %d\n", type); + return -EFAULT; + } } static const struct bpf_func_proto bpf_dynptr_read_proto = { @@ -1540,18 +1561,32 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = { BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src, u32, len, u64, flags) { + enum bpf_dynptr_type type; int err; - if (!dst->data || flags || bpf_dynptr_is_rdonly(dst)) + if (!dst->data || bpf_dynptr_is_rdonly(dst)) return -EINVAL; err = bpf_dynptr_check_off_len(dst, offset, len); if (err) return err; - memcpy(dst->data + dst->offset + offset, src, len); + type = bpf_dynptr_get_type(dst); - return 0; + switch (type) { + case BPF_DYNPTR_TYPE_LOCAL: + case BPF_DYNPTR_TYPE_RINGBUF: + if (flags) + return -EINVAL; + memcpy(dst->data + dst->offset + offset, src, len); + return 0; + case BPF_DYNPTR_TYPE_SKB: + return __bpf_skb_store_bytes(dst->data, dst->offset + offset, src, len, + flags); + default: + WARN(true, "bpf_dynptr_write: unknown dynptr type %d\n", type); + return -EFAULT; + } } static const struct bpf_func_proto bpf_dynptr_write_proto = { @@ -1567,6 +1602,9 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = { BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len) { + enum bpf_dynptr_type type; + bool is_rdonly; + void *data; int err; if (!ptr->data) @@ -1576,10 +1614,37 @@ BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len if (err) return 0; - if (bpf_dynptr_is_rdonly(ptr)) - return 0; + type = bpf_dynptr_get_type(ptr); + + /* Only skb dynptrs can get read-only data slices, because the + * verifier enforces PTR_TO_PACKET accesses + */ + is_rdonly = bpf_dynptr_is_rdonly(ptr); + + switch (type) { + case BPF_DYNPTR_TYPE_LOCAL: + case BPF_DYNPTR_TYPE_RINGBUF: + if (is_rdonly) + return 0; + + data = ptr->data; + break; + case BPF_DYNPTR_TYPE_SKB: + { + struct sk_buff *skb = ptr->data; - return (unsigned long)(ptr->data + ptr->offset + offset); + /* if the data is paged, the caller needs to pull it first */ + if (ptr->offset + offset + len > skb->len - skb->data_len) + return 0; + + data = skb->data; + break; + } + default: + WARN(true, "bpf_dynptr_data: unknown dynptr type %d\n", type); + return 0; + } + return (unsigned long)(data + ptr->offset + offset); } static const struct bpf_func_proto bpf_dynptr_data_proto = { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 2c1f8069f7b7..1ea295f47525 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -684,6 +684,8 @@ static enum bpf_dynptr_type arg_to_dynptr_type(enum bpf_arg_type arg_type) return BPF_DYNPTR_TYPE_LOCAL; case DYNPTR_TYPE_RINGBUF: return BPF_DYNPTR_TYPE_RINGBUF; + case DYNPTR_TYPE_SKB: + return BPF_DYNPTR_TYPE_SKB; default: return BPF_DYNPTR_TYPE_INVALID; } @@ -5826,12 +5828,29 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env, return __check_ptr_off_reg(env, reg, regno, fixed_off_ok); } -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg) +static struct bpf_reg_state *get_dynptr_arg_reg(const struct bpf_func_proto *fn, + struct bpf_reg_state *regs) +{ + int i; + + for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) + if (arg_type_is_dynptr(fn->arg_type[i])) + return ®s[BPF_REG_1 + i]; + + return NULL; +} + +static enum bpf_dynptr_type stack_slot_get_dynptr_info(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, + int *ref_obj_id) { struct bpf_func_state *state = func(env, reg); int spi = get_spi(reg->off); - return state->stack[spi].spilled_ptr.id; + if (ref_obj_id) + *ref_obj_id = state->stack[spi].spilled_ptr.id; + + return state->stack[spi].spilled_ptr.dynptr.type; } static int check_func_arg(struct bpf_verifier_env *env, u32 arg, @@ -6056,7 +6075,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, case DYNPTR_TYPE_RINGBUF: err_extra = "ringbuf "; break; - default: + case DYNPTR_TYPE_SKB: + err_extra = "skb "; break; } @@ -7149,6 +7169,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn { enum bpf_prog_type prog_type = resolve_prog_type(env->prog); const struct bpf_func_proto *fn = NULL; + enum bpf_dynptr_type dynptr_type; enum bpf_return_type ret_type; enum bpf_type_flag ret_flag; struct bpf_reg_state *regs; @@ -7320,24 +7341,43 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn } } break; - case BPF_FUNC_dynptr_data: - for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) { - if (arg_type_is_dynptr(fn->arg_type[i])) { - if (meta.ref_obj_id) { - verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); - return -EFAULT; - } - /* Find the id of the dynptr we're tracking the reference of */ - meta.ref_obj_id = stack_slot_get_id(env, ®s[BPF_REG_1 + i]); - break; - } + case BPF_FUNC_dynptr_write: + { + struct bpf_reg_state *reg; + + reg = get_dynptr_arg_reg(fn, regs); + if (!reg) { + verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); + return -EFAULT; } - if (i == MAX_BPF_FUNC_REG_ARGS) { + + /* bpf_dynptr_write() for skb-type dynptrs may pull the skb, so we must + * invalidate all data slices associated with it + */ + if (stack_slot_get_dynptr_info(env, reg, NULL) == BPF_DYNPTR_TYPE_SKB) + changes_data = true; + + break; + } + case BPF_FUNC_dynptr_data: + { + struct bpf_reg_state *reg; + + reg = get_dynptr_arg_reg(fn, regs); + if (!reg) { verbose(env, "verifier internal error: no dynptr in bpf_dynptr_data()\n"); return -EFAULT; } + + if (meta.ref_obj_id) { + verbose(env, "verifier internal error: meta.ref_obj_id already set\n"); + return -EFAULT; + } + + dynptr_type = stack_slot_get_dynptr_info(env, reg, &meta.ref_obj_id); break; } + } if (err) return err; @@ -7397,8 +7437,15 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn break; case RET_PTR_TO_ALLOC_MEM: mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; - regs[BPF_REG_0].mem_size = meta.mem_size; + + if (func_id == BPF_FUNC_dynptr_data && + dynptr_type == BPF_DYNPTR_TYPE_SKB) { + regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag; + regs[BPF_REG_0].range = meta.mem_size; + } else { + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag; + regs[BPF_REG_0].mem_size = meta.mem_size; + } break; case RET_PTR_TO_MEM_OR_BTF_ID: { @@ -14141,6 +14188,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env) goto patch_call_imm; } + if (insn->imm == BPF_FUNC_dynptr_from_skb) { + bool is_rdonly = !may_access_direct_pkt_data(env, NULL, BPF_WRITE); + + insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, is_rdonly); + insn_buf[1] = *insn; + cnt = 2; + + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = new_prog; + prog = new_prog; + insn = new_prog->insnsi + i + delta; + goto patch_call_imm; + } + /* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup * and other inlining handlers are currently limited to 64 bit * only. diff --git a/net/core/filter.c b/net/core/filter.c index 1acfaffeaf32..5b204b42fb3e 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -1681,8 +1681,8 @@ static inline void bpf_pull_mac_rcsum(struct sk_buff *skb) skb_postpull_rcsum(skb, skb_mac_header(skb), skb->mac_len); } -BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset, - const void *, from, u32, len, u64, flags) +int __bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from, + u32 len, u64 flags) { void *ptr; @@ -1707,6 +1707,12 @@ BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset, return 0; } +BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset, + const void *, from, u32, len, u64, flags) +{ + return __bpf_skb_store_bytes(skb, offset, from, len, flags); +} + static const struct bpf_func_proto bpf_skb_store_bytes_proto = { .func = bpf_skb_store_bytes, .gpl_only = false, @@ -1718,8 +1724,7 @@ static const struct bpf_func_proto bpf_skb_store_bytes_proto = { .arg5_type = ARG_ANYTHING, }; -BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset, - void *, to, u32, len) +int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to, u32 len) { void *ptr; @@ -1738,6 +1743,12 @@ BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset, return -EFAULT; } +BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset, + void *, to, u32, len) +{ + return __bpf_skb_load_bytes(skb, offset, to, len); +} + static const struct bpf_func_proto bpf_skb_load_bytes_proto = { .func = bpf_skb_load_bytes, .gpl_only = false, @@ -1849,6 +1860,32 @@ static const struct bpf_func_proto bpf_skb_pull_data_proto = { .arg2_type = ARG_ANYTHING, }; +/* is_rdonly is set by the verifier */ +BPF_CALL_4(bpf_dynptr_from_skb, struct sk_buff *, skb, u64, flags, + struct bpf_dynptr_kern *, ptr, u32, is_rdonly) +{ + if (flags) { + bpf_dynptr_set_null(ptr); + return -EINVAL; + } + + bpf_dynptr_init(ptr, skb, BPF_DYNPTR_TYPE_SKB, 0, skb->len); + + if (is_rdonly) + bpf_dynptr_set_rdonly(ptr); + + return 0; +} + +static const struct bpf_func_proto bpf_dynptr_from_skb_proto = { + .func = bpf_dynptr_from_skb, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_SKB | MEM_UNINIT, +}; + BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk) { return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL; @@ -7726,6 +7763,8 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_get_socket_uid_proto; case BPF_FUNC_perf_event_output: return &bpf_skb_event_output_proto; + case BPF_FUNC_dynptr_from_skb: + return &bpf_dynptr_from_skb_proto; default: return bpf_sk_base_func_proto(func_id); } @@ -7909,6 +7948,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_tcp_raw_check_syncookie_ipv6_proto; #endif #endif + case BPF_FUNC_dynptr_from_skb: + return &bpf_dynptr_from_skb_proto; default: return bpf_sk_base_func_proto(func_id); } @@ -8104,6 +8145,8 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_skc_lookup_tcp: return &bpf_skc_lookup_tcp_proto; #endif + case BPF_FUNC_dynptr_from_skb: + return &bpf_dynptr_from_skb_proto; default: return bpf_sk_base_func_proto(func_id); } @@ -8142,6 +8185,8 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_get_smp_processor_id_proto; case BPF_FUNC_skb_under_cgroup: return &bpf_skb_under_cgroup_proto; + case BPF_FUNC_dynptr_from_skb: + return &bpf_dynptr_from_skb_proto; default: return bpf_sk_base_func_proto(func_id); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 1d6085e15fc8..3f1800a2b77c 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5253,11 +5253,22 @@ union bpf_attr { * Description * Write *len* bytes from *src* into *dst*, starting from *offset* * into *dst*. - * *flags* is currently unused. + * + * *flags* must be 0 except for skb-type dynptrs. + * + * For skb-type dynptrs: + * * All data slices of the dynptr are automatically + * invalidated after **bpf_dynptr_write**\ (). If you wish to + * avoid this, please perform the write using direct data slices + * instead. + * + * * For *flags*, please see the flags accepted by + * **bpf_skb_store_bytes**\ (). * Return * 0 on success, -E2BIG if *offset* + *len* exceeds the length * of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst* - * is a read-only dynptr or if *flags* is not 0. + * is a read-only dynptr or if *flags* is not correct. For skb-type dynptrs, + * other errors correspond to errors returned by **bpf_skb_store_bytes**\ (). * * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len) * Description @@ -5265,10 +5276,20 @@ union bpf_attr { * * *len* must be a statically known value. The returned data slice * is invalidated whenever the dynptr is invalidated. + * + * For skb-type dynptrs: + * * If *offset* + *len* extends into the skb's paged buffers, + * the user should manually pull the skb with **bpf_skb_pull_data**\ () + * and try again. + * + * * The data slice is automatically invalidated anytime + * **bpf_dynptr_write**\ () or a helper call that changes + * the underlying packet buffer (eg **bpf_skb_pull_data**\ ()) + * is called. * Return * Pointer to the underlying dynptr data, NULL if the dynptr is * read-only, if the dynptr is invalid, or if the offset and length - * is out of bounds. + * is out of bounds or in a paged buffer for skb-type dynptrs. * * s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr *th, u32 th_len) * Description @@ -5355,6 +5376,18 @@ union bpf_attr { * Return * Current *ktime*. * + * long bpf_dynptr_from_skb(struct sk_buff *skb, u64 flags, struct bpf_dynptr *ptr) + * Description + * Get a dynptr to the data in *skb*. *skb* must be the BPF program + * context. Depending on program type, the dynptr may be read-only. + * + * Calls that change the *skb*'s underlying packet buffer + * (eg **bpf_skb_pull_data**\ ()) do not invalidate the dynptr, but + * they do invalidate any data slices associated with the dynptr. + * + * *flags* is currently unused, it must be 0 for now. + * Return + * 0 on success or -EINVAL if flags is not 0. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5566,6 +5599,7 @@ union bpf_attr { FN(tcp_raw_check_syncookie_ipv4), \ FN(tcp_raw_check_syncookie_ipv6), \ FN(ktime_get_tai_ns), \ + FN(dynptr_from_skb), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper
Add skb dynptrs, which are dynptrs whose underlying pointer points to a skb. The dynptr acts on skb data. skb dynptrs have two main benefits. One is that they allow operations on sizes that are not statically known at compile-time (eg variable-sized accesses). Another is that parsing the packet data through dynptrs (instead of through direct access of skb->data and skb->data_end) can be more ergonomic and less brittle (eg does not need manual if checking for being within bounds of data_end). For bpf prog types that don't support writes on skb data, the dynptr is read-only. For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write() interfaces, this supports reading and writing into data in the non-linear paged buffers. For data slices (through the bpf_dynptr_data() interface), if the data is in a paged buffer, the user must first call bpf_skb_pull_data() to pull the data into the linear portion. The returned data slice from a call to bpf_dynptr_data() is of reg type PTR_TO_PACKET | PTR_MAYBE_NULL. Any bpf_dynptr_write() automatically invalidates any prior data slices to the skb dynptr. This is because a bpf_dynptr_write() may be writing to data in a paged buffer, so it will need to pull the buffer first into the head. The reason it needs to be pulled instead of writing directly to the paged buffers is because they may be cloned (only the head of the skb is by default uncloned). As such, any bpf_dynptr_write() will automatically have its prior data slices invalidated, even if the write is to data in the skb head (the verifier has no way of differentiating whether the write is to the head or paged buffers during program load time). Please note as well that any other helper calls that change the underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data slices of the skb dynptr as well. Whenever such a helper call is made, the verifier marks any PTR_TO_PACKET reg type (which includes skb dynptr slices since they are PTR_TO_PACKETs) as unknown. The stack trace for this is check_helper_call() -> clear_all_pkt_pointers() -> __clear_all_pkt_pointers() -> mark_reg_unknown() For examples of how skb dynptrs can be used, please see the attached selftests. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> --- include/linux/bpf.h | 83 +++++++++++++++++----------- include/linux/filter.h | 4 ++ include/uapi/linux/bpf.h | 40 ++++++++++++-- kernel/bpf/helpers.c | 81 +++++++++++++++++++++++++--- kernel/bpf/verifier.c | 99 ++++++++++++++++++++++++++++------ net/core/filter.c | 53 ++++++++++++++++-- tools/include/uapi/linux/bpf.h | 40 ++++++++++++-- 7 files changed, 335 insertions(+), 65 deletions(-)