Message ID | 20211123205607.452497-1-zenczykowski@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | [bpf-next] bpf: allow readonly direct path access for skfilter |
Note: this is more of an RFC... question in patch format... is this
even a good idea?

On Tue, Nov 23, 2021 at 12:56 PM Maciej Żenczykowski <zenczykowski@gmail.com> wrote:
>
> From: Maciej Żenczykowski <maze@google.com>
>
> skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
> /half/word which are 8/16/32-bit primitive bpf instructions and thus
> behave basically as well as DPA reads. But there is no 64-bit equivalent,
> due to the support for the equivalent 64-bit bpf opcode never having been
> added (unclear why, there was a patch posted).
> DPA uses a slightly different mechanism, so doesn't suffer this limitation.
>
> Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
> 2 steps, instead of the 4 steps needed with llvm.bpf.word.
>
> This should hopefully allow simpler (less instructions, and possibly less
> logic and maybe even less jumps) programs. Less jumps may also mean vastly
> faster bpf verifier times (it can be exponential in the number of jumps...).
>
> This can be particularly important when trying to do something like scan
> a netlink message for a pattern (2000 iteration loop) to decide whether
> a message should be dropped, or delivered to userspace (thus waking it up).
>
> I'm requiring CAP_NET_ADMIN because I'm not sure of the security
> implications...
>
> Tested: only build tested
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  kernel/bpf/verifier.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 331b170d9fcc..0c2e25fb9844 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3258,6 +3258,11 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
>  	enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
>
>  	switch (prog_type) {
> +	case BPF_PROG_TYPE_SOCKET_FILTER:
> +		if (meta || !capable(CAP_NET_ADMIN))
> +			return false;
> +		fallthrough;
> +
>  	/* Program types only with direct read access go here! */
>  	case BPF_PROG_TYPE_LWT_IN:
>  	case BPF_PROG_TYPE_LWT_OUT:
> --
> 2.34.0.rc2.393.gf8c9666880-goog
>
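
The "2 steps instead of 4" claim in the commit message can be illustrated with a minimal sketch in plain userspace C (not BPF, and not code from the patch). All helper names here (`load_u32`, `load_u64`, `ipv6_equal_32`, `ipv6_equal_64`, `pkt_ipv6_matches`) are hypothetical. The bounds check only imitates the `data`/`data_end` style of check that direct packet access relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: alignment-safe loads from a byte buffer. */
static inline uint32_t load_u32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

static inline uint64_t load_u64(const uint8_t *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Four 32-bit loads per address: the shape forced by word-sized reads. */
static bool ipv6_equal_32(const uint8_t *a, const uint8_t *b)
{
	return load_u32(a) == load_u32(b) &&
	       load_u32(a + 4) == load_u32(b + 4) &&
	       load_u32(a + 8) == load_u32(b + 8) &&
	       load_u32(a + 12) == load_u32(b + 12);
}

/* Two 64-bit loads per address: half the compares and half the branches. */
static bool ipv6_equal_64(const uint8_t *a, const uint8_t *b)
{
	return load_u64(a) == load_u64(b) &&
	       load_u64(a + 8) == load_u64(b + 8);
}

/*
 * Bounds-checked variant: refuse to read unless 16 bytes starting at
 * 'off' lie inside [data, data_end). Parameters are assumptions of this
 * sketch, chosen to resemble a packet start/end pointer pair.
 */
static bool pkt_ipv6_matches(const uint8_t *data, const uint8_t *data_end,
			     size_t off, const uint8_t addr[16])
{
	size_t len = (size_t)(data_end - data);

	if (off > len || len - off < 16)
		return false;	/* too short: treat as no match */
	return ipv6_equal_64(data + off, addr);
}
```

Both comparison functions return the same result for any pair of 16-byte addresses; the 64-bit version simply needs two equality tests where the 32-bit version needs four. Whether that translates into fewer verifier states for a real program would have to be measured.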
On Tue, Nov 23, 2021 at 3:02 PM Maciej Żenczykowski <maze@google.com> wrote:
>
> Note: this is more of an RFC... question in patch format... is this
> even a good idea?
>
> On Tue, Nov 23, 2021 at 12:56 PM Maciej Żenczykowski
> <zenczykowski@gmail.com> wrote:
> >
> > From: Maciej Żenczykowski <maze@google.com>
> >
> > skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
> > /half/word which are 8/16/32-bit primitive bpf instructions and thus
> > behave basically as well as DPA reads. But there is no 64-bit equivalent,
> > due to the support for the equivalent 64-bit bpf opcode never having been
> > added (unclear why, there was a patch posted).
> > DPA uses a slightly different mechanism, so doesn't suffer this limitation.
> >
> > Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
> > 2 steps, instead of the 4 steps needed with llvm.bpf.word.
> >
> > This should hopefully allow simpler (less instructions, and possibly less
> > logic and maybe even less jumps) programs. Less jumps may also mean vastly
> > faster bpf verifier times (it can be exponential in the number of jumps...).
> >
> > This can be particularly important when trying to do something like scan
> > a netlink message for a pattern (2000 iteration loop) to decide whether
> > a message should be dropped, or delivered to userspace (thus waking it up).
> >
> > I'm requiring CAP_NET_ADMIN because I'm not sure of the security
> > implications...

I don't know BPF_PROG_TYPE_SOCKET_FILTER very well, but the patch seems
reasonable to me. It will be great if we can show the performance impact
with a benchmark or a selftests.

Thanks,
Song
On Tue, Nov 23, 2021 at 12:56 PM Maciej Żenczykowski <zenczykowski@gmail.com> wrote:
>
> From: Maciej Żenczykowski <maze@google.com>
>
> skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
> /half/word which are 8/16/32-bit primitive bpf instructions and thus
> behave basically as well as DPA reads. But there is no 64-bit equivalent,
> due to the support for the equivalent 64-bit bpf opcode never having been
> added (unclear why, there was a patch posted).
> DPA uses a slightly different mechanism, so doesn't suffer this limitation.
>
> Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
> 2 steps, instead of the 4 steps needed with llvm.bpf.word.

llvm.bpf.word is a pseudo instruction. It's actually a function call for
classic bpf. See bpf_gen_ld_abs.
We used to have ugly special cases for them in JITs, but then got rid of it.
Don't use them if performance is a requirement.

> This should hopefully allow simpler (less instructions, and possibly less
> logic and maybe even less jumps) programs. Less jumps may also mean vastly
> faster bpf verifier times (it can be exponential in the number of jumps...).
>
> This can be particularly important when trying to do something like scan
> a netlink message for a pattern (2000 iteration loop) to decide whether
> a message should be dropped, or delivered to userspace (thus waking it up).
>
> I'm requiring CAP_NET_ADMIN because I'm not sure of the security
> implications...
>
> Tested: only build tested
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  kernel/bpf/verifier.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 331b170d9fcc..0c2e25fb9844 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3258,6 +3258,11 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
>  	enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
>
>  	switch (prog_type) {
> +	case BPF_PROG_TYPE_SOCKET_FILTER:
> +		if (meta || !capable(CAP_NET_ADMIN))
> +			return false;

probably needs CAP_BPF too.
Other than that I think it's fine.
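
The gate with the suggested CAP_BPF requirement added can be modeled in userspace as a rough sketch. This is not kernel code: `struct caps` and `skfilter_may_read_pkt` are hypothetical names, the capability state is passed in as plain booleans instead of calling `capable()`, and the fallthrough into the read-only case is reduced to a single write check, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the loading task's capability set. */
struct caps {
	bool net_admin;	/* models CAP_NET_ADMIN */
	bool bpf;	/* models CAP_BPF */
};

/*
 * Models the decision for a socket filter program only.
 *   via_helper: the packet pointer is being passed to a helper
 *               (corresponds to a non-NULL 'meta' in the patch)
 *   is_write:   the access is a store rather than a load
 */
static bool skfilter_may_read_pkt(bool via_helper, bool is_write,
				  const struct caps *c)
{
	/* Gate from the patch, plus the CAP_BPF check raised in review. */
	if (via_helper || !c->net_admin || !c->bpf)
		return false;

	/* Read-only program types never get direct write access. */
	if (is_write)
		return false;

	return true;
}
```

Under this model, a loader holding only CAP_NET_ADMIN is refused, and even a fully privileged loader gets load-only access that cannot be handed to helpers.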
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 331b170d9fcc..0c2e25fb9844 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3258,6 +3258,11 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
 	enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
 
 	switch (prog_type) {
+	case BPF_PROG_TYPE_SOCKET_FILTER:
+		if (meta || !capable(CAP_NET_ADMIN))
+			return false;
+		fallthrough;
+
 	/* Program types only with direct read access go here! */
 	case BPF_PROG_TYPE_LWT_IN:
 	case BPF_PROG_TYPE_LWT_OUT: