Message ID | 20221107230950.7117-7-memxor@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | Local kptrs, BPF linked lists | expand |
On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > reg->type to avoid having to check btf_is_kernel when trying to match > argument types in helpers. > > Refactor btf_struct_access callback to just take bpf_reg_state instead > of btf and btf_type paramters. Note that the call site in > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > dummy reg on stack. Since only the type, btf, and btf_id of the register > matter for the checks, it can be done so without complicating the usual > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > verbatim. > > Whenever walking such types, any pointers being walked will always yield > a SCALAR instead of pointer. In the future we might permit kptr inside > local kptr (either kernel or local), and it would be permitted only in > that case. > > For now, these local kptrs will always be referenced in verifier > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > to such objects, as long fields that are special are not touched > (support for which will be added in subsequent patches). Note that once > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > write to it. > > No PROBE_MEM handling is therefore done for loads into this type unless > PTR_UNTRUSTED is part of the register type, since they can never be in > an undefined state, and their lifetime will always be valid. > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > --- > include/linux/bpf.h | 28 ++++++++++++++++-------- > include/linux/filter.h | 8 +++---- > kernel/bpf/btf.c | 16 ++++++++++---- > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > net/core/filter.c | 34 ++++++++++++----------------- > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > 8 files changed, 99 insertions(+), 68 deletions(-) > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > index afc1c51b59ff..75dbd2ecf80a 100644 > --- a/include/linux/bpf.h > +++ b/include/linux/bpf.h > @@ -524,6 +524,11 @@ enum bpf_type_flag { > /* Size is known at compile time. */ > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > + */ > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > + you fixed one naming confusion with RINGBUF and basically are creating a new one, where "ALLOC" means "local kptr"... If we are stuck with "local kptr" (which I find very confusing as well, but that's beside the point), why not stick to the whole "local" terminology here? MEM_LOCAL? > __BPF_TYPE_FLAG_MAX, > __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1, > }; > @@ -771,6 +776,7 @@ struct bpf_prog_ops { > union bpf_attr __user *uattr); > }; > [...] > -int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, > - const struct btf_type *t, int off, int size, > - enum bpf_access_type atype __maybe_unused, > +int btf_struct_access(struct bpf_verifier_log *log, > + const struct bpf_reg_state *reg, > + int off, int size, enum bpf_access_type atype __maybe_unused, > u32 *next_btf_id, enum bpf_type_flag *flag) > { > + const struct btf *btf = reg->btf; > enum bpf_type_flag tmp_flag = 0; > + const struct btf_type *t; > + u32 id = reg->btf_id; > int err; > - u32 id; > > + t = btf_type_by_id(btf, id); > do { > err = btf_struct_walk(log, btf, t, off, size, &id, &tmp_flag); > > switch (err) { > case WALK_PTR: > + /* For local types, the destination register cannot > + * become a pointer again. > + */ > + if (type_is_local_kptr(reg->type)) > + return SCALAR_VALUE; passing the entire bpf_reg_state just to differentiate between local vs kernel pointer seems like a huge overkill. bpf_reg_state is quite a complicated and extensive amount of state, and it seems cleaner to just pass it as a flag whether to allow pointer chasing or not. At least then we know we only care about that specific aspect, not about dozens of other possible fields of bpf_reg_state. > /* If we found the pointer or scalar on t+off, > * we're done. > */ [...]
On Wed, Nov 09, 2022 at 04:59:41AM IST, Andrii Nakryiko wrote: > On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > > reg->type to avoid having to check btf_is_kernel when trying to match > > argument types in helpers. > > > > Refactor btf_struct_access callback to just take bpf_reg_state instead > > of btf and btf_type paramters. Note that the call site in > > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > > dummy reg on stack. Since only the type, btf, and btf_id of the register > > matter for the checks, it can be done so without complicating the usual > > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > > verbatim. > > > > Whenever walking such types, any pointers being walked will always yield > > a SCALAR instead of pointer. In the future we might permit kptr inside > > local kptr (either kernel or local), and it would be permitted only in > > that case. > > > > For now, these local kptrs will always be referenced in verifier > > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > > to such objects, as long fields that are special are not touched > > (support for which will be added in subsequent patches). Note that once > > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > > write to it. > > > > No PROBE_MEM handling is therefore done for loads into this type unless > > PTR_UNTRUSTED is part of the register type, since they can never be in > > an undefined state, and their lifetime will always be valid. > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > --- > > include/linux/bpf.h | 28 ++++++++++++++++-------- > > include/linux/filter.h | 8 +++---- > > kernel/bpf/btf.c | 16 ++++++++++---- > > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > > net/core/filter.c | 34 ++++++++++++----------------- > > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > > 8 files changed, 99 insertions(+), 68 deletions(-) > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > index afc1c51b59ff..75dbd2ecf80a 100644 > > --- a/include/linux/bpf.h > > +++ b/include/linux/bpf.h > > @@ -524,6 +524,11 @@ enum bpf_type_flag { > > /* Size is known at compile time. */ > > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > > + */ > > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > > + > > you fixed one naming confusion with RINGBUF and basically are creating > a new one, where "ALLOC" means "local kptr"... If we are stuck with > "local kptr" (which I find very confusing as well, but that's beside > the point), why not stick to the whole "local" terminology here? > MEM_LOCAL? > See the discussion about this in v4: https://lore.kernel.org/bpf/20221104075113.5ighwdvero4mugu7@apollo It was MEM_TYPE_LOCAL before. Also, better naming suggestions are always welcome, I asked the same in that message as well. > > __BPF_TYPE_FLAG_MAX, > > __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1, > > }; > > @@ -771,6 +776,7 @@ struct bpf_prog_ops { > > union bpf_attr __user *uattr); > > }; > > > > [...] > > > -int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, > > - const struct btf_type *t, int off, int size, > > - enum bpf_access_type atype __maybe_unused, > > +int btf_struct_access(struct bpf_verifier_log *log, > > + const struct bpf_reg_state *reg, > > + int off, int size, enum bpf_access_type atype __maybe_unused, > > u32 *next_btf_id, enum bpf_type_flag *flag) > > { > > + const struct btf *btf = reg->btf; > > enum bpf_type_flag tmp_flag = 0; > > + const struct btf_type *t; > > + u32 id = reg->btf_id; > > int err; > > - u32 id; > > > > + t = btf_type_by_id(btf, id); > > do { > > err = btf_struct_walk(log, btf, t, off, size, &id, &tmp_flag); > > > > switch (err) { > > case WALK_PTR: > > + /* For local types, the destination register cannot > > + * become a pointer again. > > + */ > > + if (type_is_local_kptr(reg->type)) > > + return SCALAR_VALUE; > > passing the entire bpf_reg_state just to differentiate between local > vs kernel pointer seems like a huge overkill. bpf_reg_state is quite a > complicated and extensive amount of state, and it seems cleaner to > just pass it as a flag whether to allow pointer chasing or not. At > least then we know we only care about that specific aspect, not about > dozens of other possible fields of bpf_reg_state. > I agree that the separation is usually better, especially because this is also a callback. I don't feel too strong about this though, we certainly do pass the whole reg to functions which only work on a specific type of pointer. Though the concern in this case is justified as it's not only an internal function but also a callback. It was just a bool in the RFC. But in https://lore.kernel.org/bpf/20220907233023.x3uclwlnjuhftvtb@macbook-pro-4.dhcp.thefacebook.com Alexei suggested passing reg instead. From the link: > imo it's cleaner to pass 'reg' instead of 'reg->btf', > so we don't have to pass another boolean. > And check type_is_local(reg) inside btf_struct_access().
On Tue, Nov 8, 2022 at 4:00 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Wed, Nov 09, 2022 at 04:59:41AM IST, Andrii Nakryiko wrote: > > On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > > > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > > > reg->type to avoid having to check btf_is_kernel when trying to match > > > argument types in helpers. > > > > > > Refactor btf_struct_access callback to just take bpf_reg_state instead > > > of btf and btf_type paramters. Note that the call site in > > > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > > > dummy reg on stack. Since only the type, btf, and btf_id of the register > > > matter for the checks, it can be done so without complicating the usual > > > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > > > verbatim. > > > > > > Whenever walking such types, any pointers being walked will always yield > > > a SCALAR instead of pointer. In the future we might permit kptr inside > > > local kptr (either kernel or local), and it would be permitted only in > > > that case. > > > > > > For now, these local kptrs will always be referenced in verifier > > > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > > > to such objects, as long fields that are special are not touched > > > (support for which will be added in subsequent patches). Note that once > > > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > > > write to it. > > > > > > No PROBE_MEM handling is therefore done for loads into this type unless > > > PTR_UNTRUSTED is part of the register type, since they can never be in > > > an undefined state, and their lifetime will always be valid. > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > --- > > > include/linux/bpf.h | 28 ++++++++++++++++-------- > > > include/linux/filter.h | 8 +++---- > > > kernel/bpf/btf.c | 16 ++++++++++---- > > > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > > > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > > > net/core/filter.c | 34 ++++++++++++----------------- > > > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > > > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > > > 8 files changed, 99 insertions(+), 68 deletions(-) > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > index afc1c51b59ff..75dbd2ecf80a 100644 > > > --- a/include/linux/bpf.h > > > +++ b/include/linux/bpf.h > > > @@ -524,6 +524,11 @@ enum bpf_type_flag { > > > /* Size is known at compile time. */ > > > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > > > > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > > > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > > > + */ > > > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > > > + > > > > you fixed one naming confusion with RINGBUF and basically are creating > > a new one, where "ALLOC" means "local kptr"... If we are stuck with > > "local kptr" (which I find very confusing as well, but that's beside > > the point), why not stick to the whole "local" terminology here? > > MEM_LOCAL? > > > > See the discussion about this in v4: > https://lore.kernel.org/bpf/20221104075113.5ighwdvero4mugu7@apollo > > It was MEM_TYPE_LOCAL before. Also, better naming suggestions are always > welcome, I asked the same in that message as well. Sorry, I haven't followed <v5. Don't have perfect name, but logically this is BPF program memory. So "prog_kptr" would be something to convert this, but I'm not super happy with such a name. "user_kptr" would be too confusing, drawing incorrect "kernel space vs user space" comparison, while both are kernel memory. It's BPF-side kptr, so "bpf_kptr", but also not great. So that's why didn't suggest anything specific, but at least as far as MEM_xxx flag goes, MEM_LOCAL_KPTR is better than MEM_ALLOC, IMO. It's at least consistent with the official name of the concept it represents. > > > > __BPF_TYPE_FLAG_MAX, > > > __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1, > > > }; > > > @@ -771,6 +776,7 @@ struct bpf_prog_ops { > > > union bpf_attr __user *uattr); > > > }; > > > > > > > [...] > > > > > -int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, > > > - const struct btf_type *t, int off, int size, > > > - enum bpf_access_type atype __maybe_unused, > > > +int btf_struct_access(struct bpf_verifier_log *log, > > > + const struct bpf_reg_state *reg, > > > + int off, int size, enum bpf_access_type atype __maybe_unused, > > > u32 *next_btf_id, enum bpf_type_flag *flag) > > > { > > > + const struct btf *btf = reg->btf; > > > enum bpf_type_flag tmp_flag = 0; > > > + const struct btf_type *t; > > > + u32 id = reg->btf_id; > > > int err; > > > - u32 id; > > > > > > + t = btf_type_by_id(btf, id); > > > do { > > > err = btf_struct_walk(log, btf, t, off, size, &id, &tmp_flag); > > > > > > switch (err) { > > > case WALK_PTR: > > > + /* For local types, the destination register cannot > > > + * become a pointer again. > > > + */ > > > + if (type_is_local_kptr(reg->type)) > > > + return SCALAR_VALUE; > > > > passing the entire bpf_reg_state just to differentiate between local > > vs kernel pointer seems like a huge overkill. bpf_reg_state is quite a > > complicated and extensive amount of state, and it seems cleaner to > > just pass it as a flag whether to allow pointer chasing or not. At > > least then we know we only care about that specific aspect, not about > > dozens of other possible fields of bpf_reg_state. > > > > I agree that the separation is usually better, especially because this is also a > callback. I don't feel too strong about this though, we certainly do pass the > whole reg to functions which only work on a specific type of pointer. Though the Yeah, and then it takes a lot of grepping and jumping around the code to verify that only one simple field out of the entire bpf_reg_state is actually used. I'd say this is actually bad that we do pass it around so willy-nilly. Verifier code is already a hot complex mess, let's not actively make it harder than necessary. > concern in this case is justified as it's not only an internal function but also > a callback. > > It was just a bool in the RFC. > But in https://lore.kernel.org/bpf/20220907233023.x3uclwlnjuhftvtb@macbook-pro-4.dhcp.thefacebook.com > Alexei suggested passing reg instead. > From the link: > > imo it's cleaner to pass 'reg' instead of 'reg->btf', > > so we don't have to pass another boolean. > > And check type_is_local(reg) inside btf_struct_access(). I sympathize with "too many input args" (especially if it's a bunch of bools) argument, but see above, I find it increasingly harder to know what parts of complex internal register state is used by helper functions and which are not. And the fact that we have to construct a fake register state in some case is a red flag to me. Pass enum bpf_reg_type type to avoid passing true/false. Or let's invent a new enum. Or extend enum bpf_access_type to have READ_NOPTR/WRITE_NOPTR or something like that. Don't know. This isn't a major issue, I can live with this just fine, but this definitely doesn't feel like a clean approach.
On Tue, Nov 8, 2022 at 4:36 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Tue, Nov 8, 2022 at 4:00 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > On Wed, Nov 09, 2022 at 04:59:41AM IST, Andrii Nakryiko wrote: > > > On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > > > > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > > > > reg->type to avoid having to check btf_is_kernel when trying to match > > > > argument types in helpers. > > > > > > > > Refactor btf_struct_access callback to just take bpf_reg_state instead > > > > of btf and btf_type paramters. Note that the call site in > > > > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > > > > dummy reg on stack. Since only the type, btf, and btf_id of the register > > > > matter for the checks, it can be done so without complicating the usual > > > > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > > > > verbatim. > > > > > > > > Whenever walking such types, any pointers being walked will always yield > > > > a SCALAR instead of pointer. In the future we might permit kptr inside > > > > local kptr (either kernel or local), and it would be permitted only in > > > > that case. > > > > > > > > For now, these local kptrs will always be referenced in verifier > > > > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > > > > to such objects, as long fields that are special are not touched > > > > (support for which will be added in subsequent patches). Note that once > > > > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > > > > write to it. > > > > > > > > No PROBE_MEM handling is therefore done for loads into this type unless > > > > PTR_UNTRUSTED is part of the register type, since they can never be in > > > > an undefined state, and their lifetime will always be valid. > > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > > --- > > > > include/linux/bpf.h | 28 ++++++++++++++++-------- > > > > include/linux/filter.h | 8 +++---- > > > > kernel/bpf/btf.c | 16 ++++++++++---- > > > > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > > > > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > > > > net/core/filter.c | 34 ++++++++++++----------------- > > > > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > > > > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > > > > 8 files changed, 99 insertions(+), 68 deletions(-) > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > > index afc1c51b59ff..75dbd2ecf80a 100644 > > > > --- a/include/linux/bpf.h > > > > +++ b/include/linux/bpf.h > > > > @@ -524,6 +524,11 @@ enum bpf_type_flag { > > > > /* Size is known at compile time. */ > > > > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > > > > > > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > > > > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > > > > + */ > > > > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > > > > + > > > > > > you fixed one naming confusion with RINGBUF and basically are creating > > > a new one, where "ALLOC" means "local kptr"... If we are stuck with > > > "local kptr" (which I find very confusing as well, but that's beside > > > the point), why not stick to the whole "local" terminology here? > > > MEM_LOCAL? > > > > > > > See the discussion about this in v4: > > https://lore.kernel.org/bpf/20221104075113.5ighwdvero4mugu7@apollo > > > > It was MEM_TYPE_LOCAL before. Also, better naming suggestions are always > > welcome, I asked the same in that message as well. > > Sorry, I haven't followed <v5. Don't have perfect name, but logically > this is BPF program memory. So "prog_kptr" would be something to > convert this, but I'm not super happy with such a name. "user_kptr" > would be too confusing, drawing incorrect "kernel space vs user space" > comparison, while both are kernel memory. It's BPF-side kptr, so > "bpf_kptr", but also not great. yep. I went through the same thinking process. > So that's why didn't suggest anything specific, but at least as far as > MEM_xxx flag goes, MEM_LOCAL_KPTR is better than MEM_ALLOC, IMO. It's > at least consistent with the official name of the concept it > represents. "local kptr" doesn't fit here. In libbpf, "local" is equally badly named. If "local" was a good name we wouldn't have had this discussion. So we need to fix it libbpf, but we should start with a proper name in the kernel. And "local kptr" is not it. We must avoid exhausting bikeshedding too. MEM_ALLOC is something we can use right now and as long as "local kptr" doesn't appear in docs, comments and commit logs we're good. We can rename MEM_ALLOC to something else later. In commit logs we can just say that this is a pointer to an object allocated by the bpf program. It's crystal clear definition whereas "local kptr" is nonsensical. Going back to the kptr definition. kptr was supposed to mean a pointer to a kernel object. In that light "pointer to an object allocated by the bpf prog" is something else. Maybe "bptr" ? In some ways bpf is a layer different from kernel space and user space. Some people joked that there is ring-0 for kernel, ring-3 for user space while bpf runs in ring-B. Two new btf_tags __bptr and __bptr_ref (or may be just one?) might be necessary as well to make it easier to distinguish kernel and bpf prog allocated objects. > > > > It was just a bool in the RFC. > > But in https://lore.kernel.org/bpf/20220907233023.x3uclwlnjuhftvtb@macbook-pro-4.dhcp.thefacebook.com > > Alexei suggested passing reg instead. > > From the link: > > > imo it's cleaner to pass 'reg' instead of 'reg->btf', > > > so we don't have to pass another boolean. > > > And check type_is_local(reg) inside btf_struct_access(). > > I sympathize with "too many input args" (especially if it's a bunch of > bools) argument, but see above, I find it increasingly harder to know > what parts of complex internal register state is used by helper > functions and which are not. > > And the fact that we have to construct a fake register state in some > case is a red flag to me. Pass enum bpf_reg_type type to avoid passing > true/false. Or let's invent a new enum. Or extend enum bpf_access_type > to have READ_NOPTR/WRITE_NOPTR or something like that. Don't know. > > This isn't a major issue, I can live with this just fine, but this > definitely doesn't feel like a clean approach. I think passing bpf_reg_state is a lesser evil _right_now_ and I prefer we proceed this way, since other works are blocked on this patch set. We did plenty of refactoring of the verifier in the past. There will be more in the future.
On Wed, Nov 09, 2022 at 07:02:11AM IST, Alexei Starovoitov wrote: > On Tue, Nov 8, 2022 at 4:36 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Tue, Nov 8, 2022 at 4:00 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > On Wed, Nov 09, 2022 at 04:59:41AM IST, Andrii Nakryiko wrote: > > > > On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > > > > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > > > > > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > > > > > reg->type to avoid having to check btf_is_kernel when trying to match > > > > > argument types in helpers. > > > > > > > > > > Refactor btf_struct_access callback to just take bpf_reg_state instead > > > > > of btf and btf_type paramters. Note that the call site in > > > > > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > > > > > dummy reg on stack. Since only the type, btf, and btf_id of the register > > > > > matter for the checks, it can be done so without complicating the usual > > > > > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > > > > > verbatim. > > > > > > > > > > Whenever walking such types, any pointers being walked will always yield > > > > > a SCALAR instead of pointer. In the future we might permit kptr inside > > > > > local kptr (either kernel or local), and it would be permitted only in > > > > > that case. > > > > > > > > > > For now, these local kptrs will always be referenced in verifier > > > > > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > > > > > to such objects, as long fields that are special are not touched > > > > > (support for which will be added in subsequent patches). Note that once > > > > > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > > > > > write to it. > > > > > > > > > > No PROBE_MEM handling is therefore done for loads into this type unless > > > > > PTR_UNTRUSTED is part of the register type, since they can never be in > > > > > an undefined state, and their lifetime will always be valid. > > > > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > > > --- > > > > > include/linux/bpf.h | 28 ++++++++++++++++-------- > > > > > include/linux/filter.h | 8 +++---- > > > > > kernel/bpf/btf.c | 16 ++++++++++---- > > > > > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > > > > > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > > > > > net/core/filter.c | 34 ++++++++++++----------------- > > > > > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > > > > > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > > > > > 8 files changed, 99 insertions(+), 68 deletions(-) > > > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > > > index afc1c51b59ff..75dbd2ecf80a 100644 > > > > > --- a/include/linux/bpf.h > > > > > +++ b/include/linux/bpf.h > > > > > @@ -524,6 +524,11 @@ enum bpf_type_flag { > > > > > /* Size is known at compile time. */ > > > > > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > > > > > > > > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > > > > > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > > > > > + */ > > > > > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > > > > > + > > > > > > > > you fixed one naming confusion with RINGBUF and basically are creating > > > > a new one, where "ALLOC" means "local kptr"... If we are stuck with > > > > "local kptr" (which I find very confusing as well, but that's beside > > > > the point), why not stick to the whole "local" terminology here? > > > > MEM_LOCAL? > > > > > > > > > > See the discussion about this in v4: > > > https://lore.kernel.org/bpf/20221104075113.5ighwdvero4mugu7@apollo > > > > > > It was MEM_TYPE_LOCAL before. Also, better naming suggestions are always > > > welcome, I asked the same in that message as well. > > > > Sorry, I haven't followed <v5. Don't have perfect name, but logically > > this is BPF program memory. So "prog_kptr" would be something to > > convert this, but I'm not super happy with such a name. "user_kptr" > > would be too confusing, drawing incorrect "kernel space vs user space" > > comparison, while both are kernel memory. It's BPF-side kptr, so > > "bpf_kptr", but also not great. > > yep. I went through the same thinking process. > > > So that's why didn't suggest anything specific, but at least as far as > > MEM_xxx flag goes, MEM_LOCAL_KPTR is better than MEM_ALLOC, IMO. It's > > at least consistent with the official name of the concept it > > represents. > > "local kptr" doesn't fit here. > In libbpf, "local" is equally badly named. > If "local" was a good name we wouldn't have had this discussion. > So we need to fix it libbpf, but we should start with a proper > name in the kernel. > And "local kptr" is not it. > > We must avoid exhausting bikeshedding too. > MEM_ALLOC is something we can use right now and > as long as "local kptr" doesn't appear in docs, comments and > commit logs we're good. > We can rename MEM_ALLOC to something else later. > > In commit logs we can just say that this is > a pointer to an object allocated by the bpf program. > It's crystal clear definition whereas "local kptr" is nonsensical. > Ok, I'll drop the naming everywhere. > Going back to the kptr definition. > kptr was supposed to mean a pointer to a kernel object. > In that light "pointer to an object allocated by the bpf prog" > is something else. > Maybe "bptr" ? > In some ways bpf is a layer different from kernel space and user space. > Some people joked that there is ring-0 for kernel, ring-3 for user space > while bpf runs in ring-B. > Two new btf_tags __bptr and __bptr_ref (or may be just one?) > might be necessary as well to make it easier to distinguish > kernel and bpf prog allocated objects. > There's also the option of simply using __kptr and __kptr_ref for these (without __local tag in BPF maps) and doing two stage name lookup for the types. Kernel BTF takes precedence, if not found there, then it searches program BTF for a local type. It would probably the simplest for users. struct map_value { struct nf_conn __kptr_ref *ct; // kernel struct foo __kptr_ref *f; // local struct task_struct __kptr_ref *t; // kernel struct bar __kptr_ref *b; // local } We can revisit this again once the post the follow up to store them in maps.
On Tue, Nov 8, 2022 at 5:32 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Tue, Nov 8, 2022 at 4:36 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Tue, Nov 8, 2022 at 4:00 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > On Wed, Nov 09, 2022 at 04:59:41AM IST, Andrii Nakryiko wrote: > > > > On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > > > > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > > > > > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > > > > > reg->type to avoid having to check btf_is_kernel when trying to match > > > > > argument types in helpers. > > > > > > > > > > Refactor btf_struct_access callback to just take bpf_reg_state instead > > > > > of btf and btf_type paramters. Note that the call site in > > > > > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > > > > > dummy reg on stack. Since only the type, btf, and btf_id of the register > > > > > matter for the checks, it can be done so without complicating the usual > > > > > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > > > > > verbatim. > > > > > > > > > > Whenever walking such types, any pointers being walked will always yield > > > > > a SCALAR instead of pointer. In the future we might permit kptr inside > > > > > local kptr (either kernel or local), and it would be permitted only in > > > > > that case. > > > > > > > > > > For now, these local kptrs will always be referenced in verifier > > > > > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > > > > > to such objects, as long fields that are special are not touched > > > > > (support for which will be added in subsequent patches). Note that once > > > > > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > > > > > write to it. > > > > > > > > > > No PROBE_MEM handling is therefore done for loads into this type unless > > > > > PTR_UNTRUSTED is part of the register type, since they can never be in > > > > > an undefined state, and their lifetime will always be valid. > > > > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > > > --- > > > > > include/linux/bpf.h | 28 ++++++++++++++++-------- > > > > > include/linux/filter.h | 8 +++---- > > > > > kernel/bpf/btf.c | 16 ++++++++++---- > > > > > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > > > > > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > > > > > net/core/filter.c | 34 ++++++++++++----------------- > > > > > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > > > > > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > > > > > 8 files changed, 99 insertions(+), 68 deletions(-) > > > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > > > index afc1c51b59ff..75dbd2ecf80a 100644 > > > > > --- a/include/linux/bpf.h > > > > > +++ b/include/linux/bpf.h > > > > > @@ -524,6 +524,11 @@ enum bpf_type_flag { > > > > > /* Size is known at compile time. */ > > > > > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > > > > > > > > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > > > > > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > > > > > + */ > > > > > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > > > > > + > > > > > > > > you fixed one naming confusion with RINGBUF and basically are creating > > > > a new one, where "ALLOC" means "local kptr"... If we are stuck with > > > > "local kptr" (which I find very confusing as well, but that's beside > > > > the point), why not stick to the whole "local" terminology here? > > > > MEM_LOCAL? > > > > > > > > > > See the discussion about this in v4: > > > https://lore.kernel.org/bpf/20221104075113.5ighwdvero4mugu7@apollo > > > > > > It was MEM_TYPE_LOCAL before. Also, better naming suggestions are always > > > welcome, I asked the same in that message as well. > > > > Sorry, I haven't followed <v5. Don't have perfect name, but logically > > this is BPF program memory. So "prog_kptr" would be something to > > convert this, but I'm not super happy with such a name. "user_kptr" > > would be too confusing, drawing incorrect "kernel space vs user space" > > comparison, while both are kernel memory. It's BPF-side kptr, so > > "bpf_kptr", but also not great. > > yep. I went through the same thinking process. > > > So that's why didn't suggest anything specific, but at least as far as > > MEM_xxx flag goes, MEM_LOCAL_KPTR is better than MEM_ALLOC, IMO. It's > > at least consistent with the official name of the concept it > > represents. > > "local kptr" doesn't fit here. > In libbpf, "local" is equally badly named. > If "local" was a good name we wouldn't have had this discussion. > So we need to fix it libbpf, but we should start with a proper > name in the kernel. > And "local kptr" is not it. > > We must avoid exhausting bikeshedding too. > MEM_ALLOC is something we can use right now and > as long as "local kptr" doesn't appear in docs, comments and > commit logs we're good. > We can rename MEM_ALLOC to something else later. agreed > > In commit logs we can just say that this is > a pointer to an object allocated by the bpf program. > It's crystal clear definition whereas "local kptr" is nonsensical. > > Going back to the kptr definition. > kptr was supposed to mean a pointer to a kernel object. > In that light "pointer to an object allocated by the bpf prog" > is something else. > Maybe "bptr" ? > In some ways bpf is a layer different from kernel space and user space. > Some people joked that there is ring-0 for kernel, ring-3 for user space > while bpf runs in ring-B. > Two new btf_tags __bptr and __bptr_ref (or may be just one?) > might be necessary as well to make it easier to distinguish > kernel and bpf prog allocated objects. bptr isn't terrific, but I don't have any good suggestions. bptr, bpfptr, tptr (for "typed pointer" as opposed to untyped dynptr), all kind of meh. I'm fine with whatever. > > > > > > > It was just a bool in the RFC. > > > But in https://lore.kernel.org/bpf/20220907233023.x3uclwlnjuhftvtb@macbook-pro-4.dhcp.thefacebook.com > > > Alexei suggested passing reg instead. > > > From the link: > > > > imo it's cleaner to pass 'reg' instead of 'reg->btf', > > > > so we don't have to pass another boolean. > > > > And check type_is_local(reg) inside btf_struct_access(). > > > > I sympathize with "too many input args" (especially if it's a bunch of > > bools) argument, but see above, I find it increasingly harder to know > > what parts of complex internal register state is used by helper > > functions and which are not. > > > > And the fact that we have to construct a fake register state in some > > case is a red flag to me. Pass enum bpf_reg_type type to avoid passing > > true/false. Or let's invent a new enum. Or extend enum bpf_access_type > > to have READ_NOPTR/WRITE_NOPTR or something like that. Don't know. > > > > This isn't a major issue, I can live with this just fine, but this > > definitely doesn't feel like a clean approach. > > I think passing bpf_reg_state is a lesser evil _right_now_ and > I prefer we proceed this way, since other works are blocked on > this patch set. > We did plenty of refactoring of the verifier in the past. > There will be more in the future. sure, not crucial
On Wed, Nov 9, 2022 at 9:00 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Wed, Nov 09, 2022 at 07:02:11AM IST, Alexei Starovoitov wrote: > > On Tue, Nov 8, 2022 at 4:36 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > > > > > > On Tue, Nov 8, 2022 at 4:00 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > > > On Wed, Nov 09, 2022 at 04:59:41AM IST, Andrii Nakryiko wrote: > > > > > On Mon, Nov 7, 2022 at 3:10 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > > > > > > > > > Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in > > > > > > program BTF. This is indicated by the presence of MEM_ALLOC type flag in > > > > > > reg->type to avoid having to check btf_is_kernel when trying to match > > > > > > argument types in helpers. > > > > > > > > > > > > Refactor btf_struct_access callback to just take bpf_reg_state instead > > > > > > of btf and btf_type paramters. Note that the call site in > > > > > > check_map_access now simulates access to a PTR_TO_BTF_ID by creating a > > > > > > dummy reg on stack. Since only the type, btf, and btf_id of the register > > > > > > matter for the checks, it can be done so without complicating the usual > > > > > > cases elsewhere in the verifier where reg->btf and reg->btf_id is used > > > > > > verbatim. > > > > > > > > > > > > Whenever walking such types, any pointers being walked will always yield > > > > > > a SCALAR instead of pointer. In the future we might permit kptr inside > > > > > > local kptr (either kernel or local), and it would be permitted only in > > > > > > that case. > > > > > > > > > > > > For now, these local kptrs will always be referenced in verifier > > > > > > context, hence ref_obj_id == 0 for them is a bug. It is allowed to write > > > > > > to such objects, as long fields that are special are not touched > > > > > > (support for which will be added in subsequent patches). Note that once > > > > > > such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to > > > > > > write to it. > > > > > > > > > > > > No PROBE_MEM handling is therefore done for loads into this type unless > > > > > > PTR_UNTRUSTED is part of the register type, since they can never be in > > > > > > an undefined state, and their lifetime will always be valid. > > > > > > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > > > > --- > > > > > > include/linux/bpf.h | 28 ++++++++++++++++-------- > > > > > > include/linux/filter.h | 8 +++---- > > > > > > kernel/bpf/btf.c | 16 ++++++++++---- > > > > > > kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ > > > > > > net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ > > > > > > net/core/filter.c | 34 ++++++++++++----------------- > > > > > > net/ipv4/bpf_tcp_ca.c | 13 ++++++----- > > > > > > net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- > > > > > > 8 files changed, 99 insertions(+), 68 deletions(-) > > > > > > > > > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > > > > > index afc1c51b59ff..75dbd2ecf80a 100644 > > > > > > --- a/include/linux/bpf.h > > > > > > +++ b/include/linux/bpf.h > > > > > > @@ -524,6 +524,11 @@ enum bpf_type_flag { > > > > > > /* Size is known at compile time. */ > > > > > > MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), > > > > > > > > > > > > + /* MEM is of a type from program BTF, not kernel BTF. This is used to > > > > > > + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. > > > > > > + */ > > > > > > + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), > > > > > > + > > > > > > > > > > you fixed one naming confusion with RINGBUF and basically are creating > > > > > a new one, where "ALLOC" means "local kptr"... If we are stuck with > > > > > "local kptr" (which I find very confusing as well, but that's beside > > > > > the point), why not stick to the whole "local" terminology here? > > > > > MEM_LOCAL? > > > > > > > > > > > > > See the discussion about this in v4: > > > > https://lore.kernel.org/bpf/20221104075113.5ighwdvero4mugu7@apollo > > > > > > > > It was MEM_TYPE_LOCAL before. Also, better naming suggestions are always > > > > welcome, I asked the same in that message as well. > > > > > > Sorry, I haven't followed <v5. Don't have perfect name, but logically > > > this is BPF program memory. So "prog_kptr" would be something to > > > convert this, but I'm not super happy with such a name. "user_kptr" > > > would be too confusing, drawing incorrect "kernel space vs user space" > > > comparison, while both are kernel memory. It's BPF-side kptr, so > > > "bpf_kptr", but also not great. > > > > yep. I went through the same thinking process. > > > > > So that's why didn't suggest anything specific, but at least as far as > > > MEM_xxx flag goes, MEM_LOCAL_KPTR is better than MEM_ALLOC, IMO. It's > > > at least consistent with the official name of the concept it > > > represents. > > > > "local kptr" doesn't fit here. > > In libbpf, "local" is equally badly named. > > If "local" was a good name we wouldn't have had this discussion. > > So we need to fix it libbpf, but we should start with a proper > > name in the kernel. > > And "local kptr" is not it. > > > > We must avoid exhausting bikeshedding too. > > MEM_ALLOC is something we can use right now and > > as long as "local kptr" doesn't appear in docs, comments and > > commit logs we're good. > > We can rename MEM_ALLOC to something else later. > > > > In commit logs we can just say that this is > > a pointer to an object allocated by the bpf program. > > It's crystal clear definition whereas "local kptr" is nonsensical. > > > > Ok, I'll drop the naming everywhere. > > > Going back to the kptr definition. > > kptr was supposed to mean a pointer to a kernel object. > > In that light "pointer to an object allocated by the bpf prog" > > is something else. > > Maybe "bptr" ? > > In some ways bpf is a layer different from kernel space and user space. > > Some people joked that there is ring-0 for kernel, ring-3 for user space > > while bpf runs in ring-B. > > Two new btf_tags __bptr and __bptr_ref (or may be just one?) > > might be necessary as well to make it easier to distinguish > > kernel and bpf prog allocated objects. > > > > There's also the option of simply using __kptr and __kptr_ref for these (without > __local tag in BPF maps) and doing two stage name lookup for the types. Kernel > BTF takes precedence, if not found there, then it searches program BTF for a > local type. It would probably the simplest for users. Disagree about the simplest for users. It's going to be quite confusing for anyone reading the code and trying to understand what's going on. Explicit tag seems better to me. But it's subjective. > > struct map_value { > struct nf_conn __kptr_ref *ct; // kernel > struct foo __kptr_ref *f; // local > struct task_struct __kptr_ref *t; // kernel > struct bar __kptr_ref *b; // local > } > > We can revisit this again once the post the follow up to store them in maps.
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index afc1c51b59ff..75dbd2ecf80a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -524,6 +524,11 @@ enum bpf_type_flag { /* Size is known at compile time. */ MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS), + /* MEM is of a type from program BTF, not kernel BTF. This is used to + * tag PTR_TO_BTF_ID allocated using bpf_obj_new. + */ + MEM_ALLOC = BIT(11 + BPF_BASE_TYPE_BITS), + __BPF_TYPE_FLAG_MAX, __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1, }; @@ -771,6 +776,7 @@ struct bpf_prog_ops { union bpf_attr __user *uattr); }; +struct bpf_reg_state; struct bpf_verifier_ops { /* return eBPF function prototype for verification */ const struct bpf_func_proto * @@ -792,9 +798,8 @@ struct bpf_verifier_ops { struct bpf_insn *dst, struct bpf_prog *prog, u32 *target_size); int (*btf_struct_access)(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, int size, - enum bpf_access_type atype, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, u32 *next_btf_id, enum bpf_type_flag *flag); }; @@ -2080,9 +2085,9 @@ static inline bool bpf_tracing_btf_ctx_access(int off, int size, return btf_ctx_access(off, size, type, prog, info); } -int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, - const struct btf_type *t, int off, int size, - enum bpf_access_type atype, +int btf_struct_access(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, u32 *next_btf_id, enum bpf_type_flag *flag); bool btf_struct_ids_match(struct bpf_verifier_log *log, const struct btf *btf, u32 id, int off, @@ -2333,9 +2338,8 @@ static inline struct bpf_prog *bpf_prog_by_id(u32 id) } static inline int btf_struct_access(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, int size, - enum bpf_access_type atype, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, u32 *next_btf_id, enum bpf_type_flag *flag) { return -EACCES; @@ -2792,4 +2796,10 @@ struct bpf_key { bool has_ref; }; #endif /* CONFIG_KEYS */ + +static inline bool type_is_local_kptr(u32 type) +{ + return type & MEM_ALLOC; +} + #endif /* _LINUX_BPF_H */ diff --git a/include/linux/filter.h b/include/linux/filter.h index efc42a6e3aed..787d35dbf5b0 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -568,10 +568,10 @@ struct sk_filter { DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key); extern struct mutex nf_conn_btf_access_lock; -extern int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, const struct btf *btf, - const struct btf_type *t, int off, int size, - enum bpf_access_type atype, u32 *next_btf_id, - enum bpf_type_flag *flag); +extern int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, + u32 *next_btf_id, enum bpf_type_flag *flag); typedef unsigned int (*bpf_dispatcher_fn)(const void *ctx, const struct bpf_insn *insnsi, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index d8f083b09e5e..4d6c8577bf17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6015,20 +6015,28 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf, return -EINVAL; } -int btf_struct_access(struct bpf_verifier_log *log, const struct btf *btf, - const struct btf_type *t, int off, int size, - enum bpf_access_type atype __maybe_unused, +int btf_struct_access(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype __maybe_unused, u32 *next_btf_id, enum bpf_type_flag *flag) { + const struct btf *btf = reg->btf; enum bpf_type_flag tmp_flag = 0; + const struct btf_type *t; + u32 id = reg->btf_id; int err; - u32 id; + t = btf_type_by_id(btf, id); do { err = btf_struct_walk(log, btf, t, off, size, &id, &tmp_flag); switch (err) { case WALK_PTR: + /* For local types, the destination register cannot + * become a pointer again. + */ + if (type_is_local_kptr(reg->type)) + return SCALAR_VALUE; /* If we found the pointer or scalar on t+off, * we're done. */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 5fca156eca43..7dcb4629f764 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4682,17 +4682,28 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env, return -EACCES; } - if (env->ops->btf_struct_access) { - ret = env->ops->btf_struct_access(&env->log, reg->btf, t, - off, size, atype, &btf_id, &flag); + if (env->ops->btf_struct_access && !type_is_local_kptr(reg->type)) { + if (!btf_is_kernel(reg->btf)) { + verbose(env, "verifier internal error: reg->btf must be kernel btf\n"); + return -EFAULT; + } + ret = env->ops->btf_struct_access(&env->log, reg, off, size, atype, &btf_id, &flag); } else { - if (atype != BPF_READ) { + /* Writes are permitted with default btf_struct_access for local + * kptrs (which always have ref_obj_id > 0), but not for + * _untrusted_ local kptrs. + */ + if (atype != BPF_READ && reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { verbose(env, "only read is supported\n"); return -EACCES; } - ret = btf_struct_access(&env->log, reg->btf, t, off, size, - atype, &btf_id, &flag); + if (type_is_local_kptr(reg->type) && !reg->ref_obj_id) { + verbose(env, "verifier internal error: ref_obj_id for local kptr must be non-zero\n"); + return -EFAULT; + } + + ret = btf_struct_access(&env->log, reg, off, size, atype, &btf_id, &flag); } if (ret < 0) @@ -4718,6 +4729,7 @@ static int check_ptr_to_map_access(struct bpf_verifier_env *env, { struct bpf_reg_state *reg = regs + regno; struct bpf_map *map = reg->map_ptr; + struct bpf_reg_state map_reg; enum bpf_type_flag flag = 0; const struct btf_type *t; const char *tname; @@ -4756,7 +4768,10 @@ static int check_ptr_to_map_access(struct bpf_verifier_env *env, return -EACCES; } - ret = btf_struct_access(&env->log, btf_vmlinux, t, off, size, atype, &btf_id, &flag); + /* Simulate access to a PTR_TO_BTF_ID */ + memset(&map_reg, 0, sizeof(map_reg)); + mark_btf_ld_reg(env, &map_reg, 0, PTR_TO_BTF_ID, btf_vmlinux, *map->ops->map_btf_id, 0); + ret = btf_struct_access(&env->log, &map_reg, off, size, atype, &btf_id, &flag); if (ret < 0) return ret; @@ -5966,6 +5981,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env, * fixed offset. */ case PTR_TO_BTF_ID: + case PTR_TO_BTF_ID | MEM_ALLOC: /* When referenced PTR_TO_BTF_ID is passed to release function, * it's fixed offset must be 0. In the other cases, fixed offset * can be non-zero. @@ -13648,6 +13664,13 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) break; case PTR_TO_BTF_ID: case PTR_TO_BTF_ID | PTR_UNTRUSTED: + /* PTR_TO_BTF_ID | MEM_ALLOC always has a valid lifetime, unlike + * PTR_TO_BTF_ID, and an active ref_obj_id, but the same cannot + * be said once it is marked PTR_UNTRUSTED, hence we must handle + * any faults for loads into such types. BPF_WRITE is disallowed + * for this case. + */ + case PTR_TO_BTF_ID | MEM_ALLOC | PTR_UNTRUSTED: if (type == BPF_READ) { insn->code = BPF_LDX | BPF_PROBE_MEM | BPF_SIZE((insn)->code); diff --git a/net/bpf/bpf_dummy_struct_ops.c b/net/bpf/bpf_dummy_struct_ops.c index e78dadfc5829..2d434c1f4617 100644 --- a/net/bpf/bpf_dummy_struct_ops.c +++ b/net/bpf/bpf_dummy_struct_ops.c @@ -156,29 +156,29 @@ static bool bpf_dummy_ops_is_valid_access(int off, int size, } static int bpf_dummy_ops_btf_struct_access(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, - int size, enum bpf_access_type atype, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, u32 *next_btf_id, enum bpf_type_flag *flag) { const struct btf_type *state; + const struct btf_type *t; s32 type_id; int err; - type_id = btf_find_by_name_kind(btf, "bpf_dummy_ops_state", + type_id = btf_find_by_name_kind(reg->btf, "bpf_dummy_ops_state", BTF_KIND_STRUCT); if (type_id < 0) return -EINVAL; - state = btf_type_by_id(btf, type_id); + t = btf_type_by_id(reg->btf, reg->btf_id); + state = btf_type_by_id(reg->btf, type_id); if (t != state) { bpf_log(log, "only access to bpf_dummy_ops_state is supported\n"); return -EACCES; } - err = btf_struct_access(log, btf, t, off, size, atype, next_btf_id, - flag); + err = btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); if (err < 0) return err; diff --git a/net/core/filter.c b/net/core/filter.c index cb3b635e35be..199632e6a7cb 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -8651,28 +8651,25 @@ static bool tc_cls_act_is_valid_access(int off, int size, DEFINE_MUTEX(nf_conn_btf_access_lock); EXPORT_SYMBOL_GPL(nf_conn_btf_access_lock); -int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, const struct btf *btf, - const struct btf_type *t, int off, int size, - enum bpf_access_type atype, u32 *next_btf_id, - enum bpf_type_flag *flag); +int (*nfct_btf_struct_access)(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, + u32 *next_btf_id, enum bpf_type_flag *flag); EXPORT_SYMBOL_GPL(nfct_btf_struct_access); static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, - int size, enum bpf_access_type atype, - u32 *next_btf_id, - enum bpf_type_flag *flag) + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, + u32 *next_btf_id, enum bpf_type_flag *flag) { int ret = -EACCES; if (atype == BPF_READ) - return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, - flag); + return btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); mutex_lock(&nf_conn_btf_access_lock); if (nfct_btf_struct_access) - ret = nfct_btf_struct_access(log, btf, t, off, size, atype, next_btf_id, flag); + ret = nfct_btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); mutex_unlock(&nf_conn_btf_access_lock); return ret; @@ -8738,21 +8735,18 @@ void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action); static int xdp_btf_struct_access(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, - int size, enum bpf_access_type atype, - u32 *next_btf_id, - enum bpf_type_flag *flag) + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, + u32 *next_btf_id, enum bpf_type_flag *flag) { int ret = -EACCES; if (atype == BPF_READ) - return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, - flag); + return btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); mutex_lock(&nf_conn_btf_access_lock); if (nfct_btf_struct_access) - ret = nfct_btf_struct_access(log, btf, t, off, size, atype, next_btf_id, flag); + ret = nfct_btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); mutex_unlock(&nf_conn_btf_access_lock); return ret; diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 6da16ae6a962..d15c91de995f 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -69,18 +69,17 @@ static bool bpf_tcp_ca_is_valid_access(int off, int size, } static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, - int size, enum bpf_access_type atype, - u32 *next_btf_id, - enum bpf_type_flag *flag) + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, + u32 *next_btf_id, enum bpf_type_flag *flag) { + const struct btf_type *t; size_t end; if (atype == BPF_READ) - return btf_struct_access(log, btf, t, off, size, atype, next_btf_id, - flag); + return btf_struct_access(log, reg, off, size, atype, next_btf_id, flag); + t = btf_type_by_id(reg->btf, reg->btf_id); if (t != tcp_sock_type) { bpf_log(log, "only read is supported\n"); return -EACCES; diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c index 8639e7efd0e2..24002bc61e07 100644 --- a/net/netfilter/nf_conntrack_bpf.c +++ b/net/netfilter/nf_conntrack_bpf.c @@ -191,19 +191,16 @@ BTF_ID(struct, nf_conn___init) /* Check writes into `struct nf_conn` */ static int _nf_conntrack_btf_struct_access(struct bpf_verifier_log *log, - const struct btf *btf, - const struct btf_type *t, int off, - int size, enum bpf_access_type atype, - u32 *next_btf_id, - enum bpf_type_flag *flag) + const struct bpf_reg_state *reg, + int off, int size, enum bpf_access_type atype, + u32 *next_btf_id, enum bpf_type_flag *flag) { - const struct btf_type *ncit; - const struct btf_type *nct; + const struct btf_type *ncit, *nct, *t; size_t end; - ncit = btf_type_by_id(btf, btf_nf_conn_ids[1]); - nct = btf_type_by_id(btf, btf_nf_conn_ids[0]); - + ncit = btf_type_by_id(reg->btf, btf_nf_conn_ids[1]); + nct = btf_type_by_id(reg->btf, btf_nf_conn_ids[0]); + t = btf_type_by_id(reg->btf, reg->btf_id); if (t != nct && t != ncit) { bpf_log(log, "only read is supported\n"); return -EACCES;
Introduce local kptrs, i.e. PTR_TO_BTF_ID that point to a type in program BTF. This is indicated by the presence of MEM_ALLOC type flag in reg->type to avoid having to check btf_is_kernel when trying to match argument types in helpers. Refactor btf_struct_access callback to just take bpf_reg_state instead of btf and btf_type paramters. Note that the call site in check_map_access now simulates access to a PTR_TO_BTF_ID by creating a dummy reg on stack. Since only the type, btf, and btf_id of the register matter for the checks, it can be done so without complicating the usual cases elsewhere in the verifier where reg->btf and reg->btf_id is used verbatim. Whenever walking such types, any pointers being walked will always yield a SCALAR instead of pointer. In the future we might permit kptr inside local kptr (either kernel or local), and it would be permitted only in that case. For now, these local kptrs will always be referenced in verifier context, hence ref_obj_id == 0 for them is a bug. It is allowed to write to such objects, as long fields that are special are not touched (support for which will be added in subsequent patches). Note that once such a local kptr is marked PTR_UNTRUSTED, it is no longer allowed to write to it. No PROBE_MEM handling is therefore done for loads into this type unless PTR_UNTRUSTED is part of the register type, since they can never be in an undefined state, and their lifetime will always be valid. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> --- include/linux/bpf.h | 28 ++++++++++++++++-------- include/linux/filter.h | 8 +++---- kernel/bpf/btf.c | 16 ++++++++++---- kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++------ net/bpf/bpf_dummy_struct_ops.c | 14 ++++++------ net/core/filter.c | 34 ++++++++++++----------------- net/ipv4/bpf_tcp_ca.c | 13 ++++++----- net/netfilter/nf_conntrack_bpf.c | 17 ++++++--------- 8 files changed, 99 insertions(+), 68 deletions(-)