Message ID | 20231025214007.2920506-1-davemarchevsky@fb.com (mailing list archive) |
---|---|
Headers | show |
Series | Allow bpf_refcount_acquire of mapval | expand |
The name of this series was cut off. It's supposed to be "Allow bpf_refcount_acquire of mapval obtained via direct LD". Respins of this series will have the correct name.
On 10/25/23 2:40 PM, Dave Marchevsky wrote: > Consider this BPF program: > > struct cgv_node { > int d; > struct bpf_refcount r; > struct bpf_rb_node rb; > }; > > struct val_stash { > struct cgv_node __kptr *v; > }; > > struct { > __uint(type, BPF_MAP_TYPE_ARRAY); > __type(key, int); > __type(value, struct val_stash); > __uint(max_entries, 10); > } array_map SEC(".maps"); > > long bpf_program(void *ctx) > { > struct val_stash *mapval; > struct cgv_node *p; > int idx = 0; > > mapval = bpf_map_lookup_elem(&array_map, &idx); > if (!mapval || !mapval->v) { /* omitted */ } > > p = bpf_refcount_acquire(mapval->v); /* Verification FAILs here */ > > /* Add p to some tree */ > return 0; > } > > Verification fails on the refcount_acquire: > > 160: (79) r1 = *(u64 *)(r8 +8) ; R1_w=untrusted_ptr_or_null_cgv_node(id=11,off=0,imm=0) R8_w=map_value(id=10,off=0,ks=8,vs=16,imm=0) refs=6 > 161: (b7) r2 = 0 ; R2_w=0 refs=6 > 162: (85) call bpf_refcount_acquire_impl#117824 > arg#0 is neither owning or non-owning ref > > The above verifier dump is actually from sched_ext's scx_flatcg [0], > which is the motivating usecase for this series' changes. Specifically, > scx_flatcg stashes a rb_node type w/ cgroup-specific info (struct > cgv_node) in a map when the cgroup is created, then later puts that > cgroup's node in a rbtree in .enqueue . Making struct cgv_node > refcounted would simplify the code a bit by virtue of allowing us to > remove the kptr_xchg's, but "later puts that cgroups node in a rbtree" > is not possible without a refcount_acquire, which suffers from the above > verification failure. > > If we get rid of PTR_UNTRUSTED flag, and add MEM_ALLOC | NON_OWN_REF, > mapval->v would be a non-owning ref and verification would succeed. Due > to the most recent set of refcount changes [1], which modified > bpf_obj_drop behavior to not reuse refcounted graph node's underlying > memory until after RCU grace period, this is safe to do. Once mapval->v > has the correct flags it _is_ a non-owning reference and verification of > the motivating example will succeed. > > [0]: https://github.com/sched-ext/sched_ext/blob/52911e1040a0f94b9c426dddcc00be5364a7ae9f/tools/sched_ext/scx_flatcg.bpf.c#L275 > [1]: https://lore.kernel.org/bpf/20230821193311.3290257-1-davemarchevsky@fb.com/ > > Summary of patches: > * Patch 1 fixes an issue with bpf_refcount_acquire verification > letting MAYBE_NULL ptrs through > * Patch 2 tests Patch 1's fix > * Patch 3 broadens the use of "free only after RCU GP" to all > user-allocated types > * Patch 4 is a small nonfunctional refactoring > * Patch 5 changes verifier to mark direct LD of stashed graph node > kptr as non-owning ref > * Patch 6 tests Patch 5's verifier changes > > Dave Marchevsky (6): > bpf: Add KF_RCU flag to bpf_refcount_acquire_impl > selftests/bpf: Add test passing MAYBE_NULL reg to bpf_refcount_acquire > bpf: Use bpf_mem_free_rcu when bpf_obj_dropping non-refcounted nodes > bpf: Move GRAPH_{ROOT,NODE}_MASK macros into btf_field_type enum > bpf: Mark direct ld of stashed bpf_{rb,list}_node as non-owning ref > selftests/bpf: Test bpf_refcount_acquire of node obtained via direct > ld > > include/linux/bpf.h | 4 +- > kernel/bpf/btf.c | 11 ++- > kernel/bpf/helpers.c | 7 +- > kernel/bpf/verifier.c | 36 ++++++++-- > .../bpf/prog_tests/local_kptr_stash.c | 33 +++++++++ > .../selftests/bpf/progs/local_kptr_stash.c | 68 +++++++++++++++++++ > .../bpf/progs/refcounted_kptr_fail.c | 19 ++++++ > 7 files changed, 159 insertions(+), 19 deletions(-) > The patch looks good to me from high level. There is a test failure and I added some comment in Patch 5. Please take a look and address the test failure. Thanks!