Message ID | 20240510192412.3297104-3-amery.hung@bytedance.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | bpf qdisc | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply, async |
On 5/10/24 12:23, Amery Hung wrote: > A reference is automatically acquired for a referenced kptr argument > annotated via the stub function with "__ref_acquired" in a struct_ops > program. It must be released and cannot be acquired more than once. > > The test first checks whether a reference to the correct type is acquired > in "ref_acquire". Then, we check if the verifier correctly rejects the > program that fails to release the reference (i.e., reference leak) in > "ref_acquire_ref_leak". Finally, we check if the reference can be only > acquired once through the argument in "ref_acquire_dup_ref". > > Signed-off-by: Amery Hung <amery.hung@bytedance.com> > --- > .../selftests/bpf/bpf_testmod/bpf_testmod.c | 7 +++ > .../selftests/bpf/bpf_testmod/bpf_testmod.h | 2 + > .../prog_tests/test_struct_ops_ref_acquire.c | 58 +++++++++++++++++++ > .../bpf/progs/struct_ops_ref_acquire.c | 27 +++++++++ > .../progs/struct_ops_ref_acquire_dup_ref.c | 24 ++++++++ > .../progs/struct_ops_ref_acquire_ref_leak.c | 19 ++++++ > 6 files changed, 137 insertions(+) > create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_ref_acquire.c > create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c > create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_dup_ref.c > create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_ref_leak.c > > ... skipped ... > + > diff --git a/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c > new file mode 100644 > index 000000000000..bae342db0fdb > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c > @@ -0,0 +1,27 @@ > +#include <vmlinux.h> > +#include <bpf/bpf_tracing.h> > +#include "../bpf_testmod/bpf_testmod.h" > + > +char _license[] SEC("license") = "GPL"; > + > +void bpf_task_release(struct task_struct *p) __ksym; > + > +/* This is a test BPF program that uses struct_ops to access a referenced > + * kptr argument. This is a test for the verifier to ensure that it recongnizes > + * the task as a referenced object (i.e., ref_obj_id > 0). > + */ > +SEC("struct_ops/test_ref_acquire") > +int BPF_PROG(test_ref_acquire, int dummy, > + struct task_struct *task) > +{ > + bpf_task_release(task); This looks weird for me. According to what you mentioned in the patch 1, the purpose is to prevent acquiring multiple references from happening. So, is it possible to return NULL from the acquire function if having returned a reference before? > + > + return 0; > +} > + > + ... skipped ...
On Fri, May 10, 2024 at 2:33 PM Kui-Feng Lee <sinquersw@gmail.com> wrote: > > > > On 5/10/24 12:23, Amery Hung wrote: > > A reference is automatically acquired for a referenced kptr argument > > annotated via the stub function with "__ref_acquired" in a struct_ops > > program. It must be released and cannot be acquired more than once. > > > > The test first checks whether a reference to the correct type is acquired > > in "ref_acquire". Then, we check if the verifier correctly rejects the > > program that fails to release the reference (i.e., reference leak) in > > "ref_acquire_ref_leak". Finally, we check if the reference can be only > > acquired once through the argument in "ref_acquire_dup_ref". > > > > Signed-off-by: Amery Hung <amery.hung@bytedance.com> > > --- > > .../selftests/bpf/bpf_testmod/bpf_testmod.c | 7 +++ > > .../selftests/bpf/bpf_testmod/bpf_testmod.h | 2 + > > .../prog_tests/test_struct_ops_ref_acquire.c | 58 +++++++++++++++++++ > > .../bpf/progs/struct_ops_ref_acquire.c | 27 +++++++++ > > .../progs/struct_ops_ref_acquire_dup_ref.c | 24 ++++++++ > > .../progs/struct_ops_ref_acquire_ref_leak.c | 19 ++++++ > > 6 files changed, 137 insertions(+) > > create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_ref_acquire.c > > create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c > > create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_dup_ref.c > > create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_ref_leak.c > > > > > ... skipped ... > > + > > diff --git a/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c > > new file mode 100644 > > index 000000000000..bae342db0fdb > > --- /dev/null > > +++ b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c > > @@ -0,0 +1,27 @@ > > +#include <vmlinux.h> > > +#include <bpf/bpf_tracing.h> > > +#include "../bpf_testmod/bpf_testmod.h" > > + > > +char _license[] SEC("license") = "GPL"; > > + > > +void bpf_task_release(struct task_struct *p) __ksym; > > + > > +/* This is a test BPF program that uses struct_ops to access a referenced > > + * kptr argument. This is a test for the verifier to ensure that it recongnizes > > + * the task as a referenced object (i.e., ref_obj_id > 0). > > + */ > > +SEC("struct_ops/test_ref_acquire") > > +int BPF_PROG(test_ref_acquire, int dummy, > > + struct task_struct *task) > > +{ > > + bpf_task_release(task); > > This looks weird for me. > > According to what you mentioned in the patch 1, the purpose is to > prevent acquiring multiple references from happening. So, is it possible > to return NULL from the acquire function if having returned a reference > before? The purpose of req_acquired is to allow acquiring a referenced kptr in struct_ops argument just once. Whether multiple references can be acquired/duplicated later I think could be orthogonal. In bpf qdisc, we ensure unique reference of skb through ref_acquired and the fact that there is no bpf_ref_count in sk_buff (so that users cannot use bpf_ref_acquire()). In this case, it is true that programs like below will be able to get multiple references to task (Is this the scenario you have in mind?). Thus, if the users want to enforce the unique reference semantic, they need to make bpf_task_acquire() unavailable as well. SEC("struct_ops/test_ref_acquire") int BPF_PROG(test_ref_acquire, int dummy, struct task_struct *task) { struct task_struct task2; task2 = bpf_task_acquire(task); bpf_task_release(task); if (task2) bpf_task_release(task2); return 0; } > > > > + > > + return 0; > > +} > > + > > + > ... skipped ...
I thought about patch 1-4 a bit more after the discussion in LSFMMBPF and I think we should keep what "ref_acquried" does, but maybe rename it to "ref_moved". We discussed the lifecycle of skb in qdisc and changes to struct_ops and bpf semantics. In short, At the beginning of .enqueue, the kernel passes the ownership of an skb to a qdisc. We do not increase the reference count of skb since this is an ownership transfer, not kernel and qdisc both holding references to the skb. (The counterexample can be found in RFC v7. See how weird skb release kfuncs look[0]). The skb should be either enqueued or dropped. Then, in .dequeue, an skb will be removed from the queue and the ownership will be returned to the kernel. Referenced kptr in bpf already carries the semantic of ownership. Thus, what we need here is to enable struct_ops programs to get a referenced kptr from the argument and returning referenced kptr (achieved via patch 1-4). Proper handling of referenced objects is important for safety reasons. In the case of bpf qdisc, there are three problematic situations as listed below, and referenced kptr has taken care of (1) and (2). (1) .enqueue not enqueueing nor dropping the skb, causing reference leak (2) .dequeue making up an invalid skb ptr and returning to kernel (3) If bpf qdisc operators can duplicate skb references, multiple references to the same skb can be present. If we enqueue these references to a collection and dequeue one, since skb->dev will be restored after the skb is removed from the collection, other skb in the collection will then have invalid skb->rbnode as "dev" and "rbnode" share the same memory. A discussion point was about introducing and enforcing a unique reference semantic (PTR_UNIQUE) to mitigate (3). After giving it more thoughts, I think we should keep "ref_acquired", and be careful about kernel-side implementation that could return referenced kptr. Taking a step back, (3) is only problematic because I made an assumption that the kfunc only increases the reference count of skb (i.e., skb_get()). It could have been done safely using skb_copy() or maybe pskb_copy(). In other words, it is a kernel implementation issue, and not a verifier issue. Besides, the verifier has no knowledge about what a kfunc with KF_ACQUIRE does internally whatsoever. In v8, we try to do this safely by only allowing reading "ref_acquired"- annotated argument once. Since the argument passed to struct_ops never changes when during a single invocation, it will always be referencing the same kernel object. Therefore, reading more than once and returning mulitple references shouldn't be allowed. Maybe "ref_moved" is a more precise annotation label, hinting that the ownership is transferred. [0] https://lore.kernel.org/netdev/2d31261b245828d09d2f80e0953e911a9c38573a.1705432850.git.amery.hung@bytedance.com/
On 5/16/24 4:14 PM, Amery Hung wrote: > I thought about patch 1-4 a bit more after the discussion in LSFMMBPF and > I think we should keep what "ref_acquried" does, but maybe rename it to > "ref_moved". > > We discussed the lifecycle of skb in qdisc and changes to struct_ops and > bpf semantics. In short, At the beginning of .enqueue, the kernel passes > the ownership of an skb to a qdisc. We do not increase the reference count > of skb since this is an ownership transfer, not kernel and qdisc both > holding references to the skb. (The counterexample can be found in RFC v7. > See how weird skb release kfuncs look[0]). The skb should be either > enqueued or dropped. Then, in .dequeue, an skb will be removed from the > queue and the ownership will be returned to the kernel. > > Referenced kptr in bpf already carries the semantic of ownership. Thus, > what we need here is to enable struct_ops programs to get a referenced > kptr from the argument and returning referenced kptr (achieved via patch > 1-4). > > Proper handling of referenced objects is important for safety reasons. > In the case of bpf qdisc, there are three problematic situations as listed > below, and referenced kptr has taken care of (1) and (2). > > (1) .enqueue not enqueueing nor dropping the skb, causing reference leak > > (2) .dequeue making up an invalid skb ptr and returning to kernel > > (3) If bpf qdisc operators can duplicate skb references, multiple > references to the same skb can be present. If we enqueue these > references to a collection and dequeue one, since skb->dev will be > restored after the skb is removed from the collection, other skb in > the collection will then have invalid skb->rbnode as "dev" and "rbnode" > share the same memory. > > A discussion point was about introducing and enforcing a unique reference > semantic (PTR_UNIQUE) to mitigate (3). After giving it more thoughts, I > think we should keep "ref_acquired", and be careful about kernel-side > implementation that could return referenced kptr. Taking a step back, (3) > is only problematic because I made an assumption that the kfunc only > increases the reference count of skb (i.e., skb_get()). It could have been > done safely using skb_copy() or maybe pskb_copy(). In other words, it is a > kernel implementation issue, and not a verifier issue. Besides, the > verifier has no knowledge about what a kfunc with KF_ACQUIRE does > internally whatsoever. > > In v8, we try to do this safely by only allowing reading "ref_acquired"- > annotated argument once. Since the argument passed to struct_ops never > changes when during a single invocation, it will always be referencing the > same kernel object. Therefore, reading more than once and returning > mulitple references shouldn't be allowed. Maybe "ref_moved" is a more > precise annotation label, hinting that the ownership is transferred. The part that no skb acquire kfunc should be available to the qdisc struct_ops prog is understood. I think it just needs to clarify the commit message and remove the "It must be released and cannot be acquired more than once" part. > > [0] https://lore.kernel.org/netdev/2d31261b245828d09d2f80e0953e911a9c38573a.1705432850.git.amery.hung@bytedance.com/
On Thu, May 16, 2024 at 4:45 PM Martin KaFai Lau <martin.lau@linux.dev> wrote: > > On 5/16/24 4:14 PM, Amery Hung wrote: > > I thought about patch 1-4 a bit more after the discussion in LSFMMBPF and > > I think we should keep what "ref_acquried" does, but maybe rename it to > > "ref_moved". > > > > We discussed the lifecycle of skb in qdisc and changes to struct_ops and > > bpf semantics. In short, At the beginning of .enqueue, the kernel passes > > the ownership of an skb to a qdisc. We do not increase the reference count > > of skb since this is an ownership transfer, not kernel and qdisc both > > holding references to the skb. (The counterexample can be found in RFC v7. > > See how weird skb release kfuncs look[0]). The skb should be either > > enqueued or dropped. Then, in .dequeue, an skb will be removed from the > > queue and the ownership will be returned to the kernel. > > > > Referenced kptr in bpf already carries the semantic of ownership. Thus, > > what we need here is to enable struct_ops programs to get a referenced > > kptr from the argument and returning referenced kptr (achieved via patch > > 1-4). > > > > Proper handling of referenced objects is important for safety reasons. > > In the case of bpf qdisc, there are three problematic situations as listed > > below, and referenced kptr has taken care of (1) and (2). > > > > (1) .enqueue not enqueueing nor dropping the skb, causing reference leak > > > > (2) .dequeue making up an invalid skb ptr and returning to kernel > > > > (3) If bpf qdisc operators can duplicate skb references, multiple > > references to the same skb can be present. If we enqueue these > > references to a collection and dequeue one, since skb->dev will be > > restored after the skb is removed from the collection, other skb in > > the collection will then have invalid skb->rbnode as "dev" and "rbnode" > > share the same memory. > > > > A discussion point was about introducing and enforcing a unique reference > > semantic (PTR_UNIQUE) to mitigate (3). After giving it more thoughts, I > > think we should keep "ref_acquired", and be careful about kernel-side > > implementation that could return referenced kptr. Taking a step back, (3) > > is only problematic because I made an assumption that the kfunc only > > increases the reference count of skb (i.e., skb_get()). It could have been > > done safely using skb_copy() or maybe pskb_copy(). In other words, it is a > > kernel implementation issue, and not a verifier issue. Besides, the > > verifier has no knowledge about what a kfunc with KF_ACQUIRE does > > internally whatsoever. > > > > In v8, we try to do this safely by only allowing reading "ref_acquired"- > > annotated argument once. Since the argument passed to struct_ops never > > changes when during a single invocation, it will always be referencing the > > same kernel object. Therefore, reading more than once and returning > > mulitple references shouldn't be allowed. Maybe "ref_moved" is a more > > precise annotation label, hinting that the ownership is transferred. > > The part that no skb acquire kfunc should be available to the qdisc struct_ops > prog is understood. I think it just needs to clarify the commit message and > remove the "It must be released and cannot be acquired more than once" part. > Got it. I will improve the clarity of the commit message. In addition, I will also remove "struct_ops_ref_acquire_dup_ref.c" as whether duplicate references can be acquired through kfunc is out of scope (should be taken care of by struct_ops implementer). Actually, this testcase should load the and it does load... As for the name, do you have any thoughts? Thanks, Amery > > > > > [0] https://lore.kernel.org/netdev/2d31261b245828d09d2f80e0953e911a9c38573a.1705432850.git.amery.hung@bytedance.com/ >
On 5/16/24 5:54 PM, Amery Hung wrote: >> The part that no skb acquire kfunc should be available to the qdisc struct_ops >> prog is understood. I think it just needs to clarify the commit message and >> remove the "It must be released and cannot be acquired more than once" part. >> > Got it. I will improve the clarity of the commit message. > > In addition, I will also remove "struct_ops_ref_acquire_dup_ref.c" as > whether duplicate references can be acquired through kfunc is out of > scope (should be taken care of by struct_ops implementer). Actually, > this testcase should load the and it does load... > > As for the name, do you have any thoughts? Naming is hard... :( May be just keep it short, just "__ref"?
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c index 39ad96a18123..64dcab25b539 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c @@ -594,10 +594,17 @@ static int bpf_testmod_ops__test_maybe_null(int dummy, return 0; } +static int bpf_testmod_ops__test_ref_acquire(int dummy, + struct task_struct *task__ref_acquired) +{ + return 0; +} + static struct bpf_testmod_ops __bpf_testmod_ops = { .test_1 = bpf_testmod_test_1, .test_2 = bpf_testmod_test_2, .test_maybe_null = bpf_testmod_ops__test_maybe_null, + .test_ref_acquire = bpf_testmod_ops__test_ref_acquire, }; struct bpf_struct_ops bpf_bpf_testmod_ops = { diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h index 23fa1872ee67..a0233990fb0e 100644 --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h @@ -35,6 +35,8 @@ struct bpf_testmod_ops { void (*test_2)(int a, int b); /* Used to test nullable arguments. */ int (*test_maybe_null)(int dummy, struct task_struct *task); + /* Used to test ref_acquired arguments. */ + int (*test_ref_acquire)(int dummy, struct task_struct *task); /* The following fields are used to test shadow copies. */ char onebyte; diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_ref_acquire.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_ref_acquire.c new file mode 100644 index 000000000000..779287a00ed8 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_ref_acquire.c @@ -0,0 +1,58 @@ +#include <test_progs.h> + +#include "struct_ops_ref_acquire.skel.h" +#include "struct_ops_ref_acquire_ref_leak.skel.h" +#include "struct_ops_ref_acquire_dup_ref.skel.h" + +/* Test that the verifier accepts a program that acquires a referenced + * kptr and releases the reference + */ +static void ref_acquire(void) +{ + struct struct_ops_ref_acquire *skel; + + skel = struct_ops_ref_acquire__open_and_load(); + if (!ASSERT_OK_PTR(skel, "struct_ops_module_open_and_load")) + return; + + struct_ops_ref_acquire__destroy(skel); +} + +/* Test that the verifier rejects a program that acquires a referenced + * kptr without releasing the reference + */ +static void ref_acquire_ref_leak(void) +{ + struct struct_ops_ref_acquire_ref_leak *skel; + + skel = struct_ops_ref_acquire_ref_leak__open_and_load(); + if (ASSERT_ERR_PTR(skel, "struct_ops_module_fail__open_and_load")) + return; + + struct_ops_ref_acquire_ref_leak__destroy(skel); +} + +/* Test that the verifier rejects a program that tries to acquire a + * referenced twice + */ +static void ref_acquire_dup_ref(void) +{ + struct struct_ops_ref_acquire_dup_ref *skel; + + skel = struct_ops_ref_acquire_dup_ref__open_and_load(); + if (ASSERT_ERR_PTR(skel, "struct_ops_module_fail__open_and_load")) + return; + + struct_ops_ref_acquire_dup_ref__destroy(skel); +} + +void test_struct_ops_ref_acquire(void) +{ + if (test__start_subtest("ref_acquire")) + ref_acquire(); + if (test__start_subtest("ref_acquire_ref_leak")) + ref_acquire_ref_leak(); + if (test__start_subtest("ref_acquire_dup_ref")) + ref_acquire_dup_ref(); +} + diff --git a/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c new file mode 100644 index 000000000000..bae342db0fdb --- /dev/null +++ b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c @@ -0,0 +1,27 @@ +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include "../bpf_testmod/bpf_testmod.h" + +char _license[] SEC("license") = "GPL"; + +void bpf_task_release(struct task_struct *p) __ksym; + +/* This is a test BPF program that uses struct_ops to access a referenced + * kptr argument. This is a test for the verifier to ensure that it recongnizes + * the task as a referenced object (i.e., ref_obj_id > 0). + */ +SEC("struct_ops/test_ref_acquire") +int BPF_PROG(test_ref_acquire, int dummy, + struct task_struct *task) +{ + bpf_task_release(task); + + return 0; +} + +SEC(".struct_ops.link") +struct bpf_testmod_ops testmod_ref_acquire = { + .test_ref_acquire = (void *)test_ref_acquire, +}; + + diff --git a/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_dup_ref.c b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_dup_ref.c new file mode 100644 index 000000000000..489db98a47fb --- /dev/null +++ b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_dup_ref.c @@ -0,0 +1,24 @@ +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include "../bpf_testmod/bpf_testmod.h" + +char _license[] SEC("license") = "GPL"; + +void bpf_task_release(struct task_struct *p) __ksym; + +SEC("struct_ops/test_ref_acquire") +int BPF_PROG(test_ref_acquire, int dummy, + struct task_struct *task) +{ + bpf_task_release(task); + bpf_task_release(task); + + return 0; +} + +SEC(".struct_ops.link") +struct bpf_testmod_ops testmod_ref_acquire = { + .test_ref_acquire = (void *)test_ref_acquire, +}; + + diff --git a/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_ref_leak.c b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_ref_leak.c new file mode 100644 index 000000000000..c5b9a1d748a1 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_ref_leak.c @@ -0,0 +1,19 @@ +#include <vmlinux.h> +#include <bpf/bpf_tracing.h> +#include "../bpf_testmod/bpf_testmod.h" + +char _license[] SEC("license") = "GPL"; + +SEC("struct_ops/test_ref_acquire") +int BPF_PROG(test_ref_acquire, int dummy, + struct task_struct *task) +{ + return 0; +} + +SEC(".struct_ops.link") +struct bpf_testmod_ops testmod_ref_acquire = { + .test_ref_acquire = (void *)test_ref_acquire, +}; + +
A reference is automatically acquired for a referenced kptr argument annotated via the stub function with "__ref_acquired" in a struct_ops program. It must be released and cannot be acquired more than once. The test first checks whether a reference to the correct type is acquired in "ref_acquire". Then, we check if the verifier correctly rejects the program that fails to release the reference (i.e., reference leak) in "ref_acquire_ref_leak". Finally, we check if the reference can be only acquired once through the argument in "ref_acquire_dup_ref". Signed-off-by: Amery Hung <amery.hung@bytedance.com> --- .../selftests/bpf/bpf_testmod/bpf_testmod.c | 7 +++ .../selftests/bpf/bpf_testmod/bpf_testmod.h | 2 + .../prog_tests/test_struct_ops_ref_acquire.c | 58 +++++++++++++++++++ .../bpf/progs/struct_ops_ref_acquire.c | 27 +++++++++ .../progs/struct_ops_ref_acquire_dup_ref.c | 24 ++++++++ .../progs/struct_ops_ref_acquire_ref_leak.c | 19 ++++++ 6 files changed, 137 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_ref_acquire.c create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire.c create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_dup_ref.c create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_ref_acquire_ref_leak.c