mbox series

[bpf-next,0/6] bpf: add netfilter program type

Message ID 20230405161116.13565-1-fw@strlen.de (mailing list archive)
Headers show
Series bpf: add netfilter program type | expand

Message

Florian Westphal April 5, 2023, 4:11 p.m. UTC
Add minimal support to hook bpf programs to netfilter hooks, e.g.
PREROUTING or FORWARD.

For this the most relevant parts for registering a netfilter
hook via the in-kernel api are exposed to userspace via bpf_link.

The new program type is 'tracing style', i.e. there is no context
access rewrite done by verifier, the function argument (struct bpf_nf_ctx)
isn't stable.
There is no support for direct packet access, dynptr api should be used
instead.

With this its possible to build a small test program such as:

 #include "vmlinux.h"
extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
                               struct bpf_dynptr *ptr__uninit) __ksym;
extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
                                   void *buffer, uint32_t buffer__sz) __ksym;
SEC("netfilter")
int nf_test(struct bpf_nf_ctx *ctx)
{
	struct nf_hook_state *state = ctx->state;
	struct sk_buff *skb = ctx->skb;
	const struct iphdr *iph, _iph;
	const struct tcphdr *th, _th;
	struct bpf_dynptr ptr;

	if (bpf_dynptr_from_skb(skb, 0, &ptr))
		return NF_DROP;

	iph = bpf_dynptr_slice(&ptr, 0, &_iph, sizeof(_iph));
	if (!iph)
		return NF_DROP;

	th = bpf_dynptr_slice(&ptr, iph->ihl << 2, &_th, sizeof(_th));
	if (!th)
		return NF_DROP;

	bpf_printk("accept %x:%d->%x:%d, hook %d ifin %d\n", iph->saddr, bpf_ntohs(th->source), iph->daddr, bpf_ntohs(th->dest), state->hook, state->in->ifindex);
        return NF_ACCEPT;
}

Then, tail /sys/kernel/tracing/trace_pipe.

Changes since last RFC version:
1. extend 'bpftool link show' to print prio/hooknum etc
2. extend 'nft list hooks' so it can print the bpf program id
3. Add an extra patch to artificially restrict bpf progs with
   same priority.  Its fine from a technical pov but it will
   cause ordering issues (most recent one comes first).
   Can be removed later.
4. Add test_run support for netfilter prog type and a small
   extension to verifier tests to make sure we can't return
   verdicts like NF_STOLEN.
5. Alter the netfilter part of the bpf_link uapi struct:
   - add flags/reserved members.
  Not used here except returning errors when they are nonzero.
  Plan is to allow the bpf_link users to enable netfilter
  defrag or conntrack engine by setting feature flags at
  link create time in the future.

Let me know if there is anything missing that has to be addressed
before this can be merged.

Thanks!

Florian Westphal (6):
  bpf: add bpf_link support for BPF_NETFILTER programs
  bpf: minimal support for programs hooked into netfilter framework
  netfilter: nfnetlink hook: dump bpf prog id
  netfilter: disallow bpf hook attachment at same priority
  tools: bpftool: print netfilter link info
  bpf: add test_run support for netfilter program type

 include/linux/bpf.h                           |   3 +
 include/linux/bpf_types.h                     |   4 +
 include/linux/netfilter.h                     |   1 +
 include/net/netfilter/nf_bpf_link.h           |   8 +
 include/uapi/linux/bpf.h                      |  15 ++
 include/uapi/linux/netfilter/nfnetlink_hook.h |  20 +-
 kernel/bpf/btf.c                              |   6 +
 kernel/bpf/syscall.c                          |   6 +
 kernel/bpf/verifier.c                         |   3 +
 net/bpf/test_run.c                            | 143 +++++++++++++
 net/core/filter.c                             |   1 +
 net/netfilter/Kconfig                         |   3 +
 net/netfilter/Makefile                        |   1 +
 net/netfilter/core.c                          |  12 ++
 net/netfilter/nf_bpf_link.c                   | 190 ++++++++++++++++++
 net/netfilter/nfnetlink_hook.c                |  81 ++++++--
 tools/bpf/bpftool/link.c                      |  24 +++
 tools/include/uapi/linux/bpf.h                |  15 ++
 tools/lib/bpf/libbpf.c                        |   1 +
 .../selftests/bpf/verifier/netfilter.c        |  23 +++
 20 files changed, 546 insertions(+), 14 deletions(-)
 create mode 100644 include/net/netfilter/nf_bpf_link.h
 create mode 100644 net/netfilter/nf_bpf_link.c
 create mode 100644 tools/testing/selftests/bpf/verifier/netfilter.c

Comments

Quentin Deslandes April 12, 2023, 8:20 a.m. UTC | #1
On 05/04/2023 18:11, Florian Westphal wrote:
> Add minimal support to hook bpf programs to netfilter hooks, e.g.
> PREROUTING or FORWARD.
> 
> For this the most relevant parts for registering a netfilter
> hook via the in-kernel api are exposed to userspace via bpf_link.
> 
> The new program type is 'tracing style', i.e. there is no context
> access rewrite done by verifier, the function argument (struct bpf_nf_ctx)
> isn't stable.
> There is no support for direct packet access, dynptr api should be used
> instead.

Does this mean the verifier will reject any program accessing ctx->skb
(e.g. ctx->skb + X)?

> With this its possible to build a small test program such as:
> 
>  #include "vmlinux.h"
> extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
>                                struct bpf_dynptr *ptr__uninit) __ksym;
> extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
>                                    void *buffer, uint32_t buffer__sz) __ksym;
> SEC("netfilter")
> int nf_test(struct bpf_nf_ctx *ctx)
> {
> 	struct nf_hook_state *state = ctx->state;
> 	struct sk_buff *skb = ctx->skb;
> 	const struct iphdr *iph, _iph;
> 	const struct tcphdr *th, _th;
> 	struct bpf_dynptr ptr;
> 
> 	if (bpf_dynptr_from_skb(skb, 0, &ptr))
> 		return NF_DROP;
> 
> 	iph = bpf_dynptr_slice(&ptr, 0, &_iph, sizeof(_iph));
> 	if (!iph)
> 		return NF_DROP;
> 
> 	th = bpf_dynptr_slice(&ptr, iph->ihl << 2, &_th, sizeof(_th));
> 	if (!th)
> 		return NF_DROP;
> 
> 	bpf_printk("accept %x:%d->%x:%d, hook %d ifin %d\n", iph->saddr, bpf_ntohs(th->source), iph->daddr, bpf_ntohs(th->dest), state->hook, state->in->ifindex);
>         return NF_ACCEPT;
> }
> 
> Then, tail /sys/kernel/tracing/trace_pipe.
> 
> Changes since last RFC version:
> 1. extend 'bpftool link show' to print prio/hooknum etc
> 2. extend 'nft list hooks' so it can print the bpf program id
> 3. Add an extra patch to artificially restrict bpf progs with
>    same priority.  Its fine from a technical pov but it will
>    cause ordering issues (most recent one comes first).
>    Can be removed later.
> 4. Add test_run support for netfilter prog type and a small
>    extension to verifier tests to make sure we can't return
>    verdicts like NF_STOLEN.
> 5. Alter the netfilter part of the bpf_link uapi struct:
>    - add flags/reserved members.
>   Not used here except returning errors when they are nonzero.
>   Plan is to allow the bpf_link users to enable netfilter
>   defrag or conntrack engine by setting feature flags at
>   link create time in the future.
> 
> Let me know if there is anything missing that has to be addressed
> before this can be merged.
> 
> Thanks!
> 
> Florian Westphal (6):
>   bpf: add bpf_link support for BPF_NETFILTER programs
>   bpf: minimal support for programs hooked into netfilter framework
>   netfilter: nfnetlink hook: dump bpf prog id
>   netfilter: disallow bpf hook attachment at same priority
>   tools: bpftool: print netfilter link info
>   bpf: add test_run support for netfilter program type
> 
>  include/linux/bpf.h                           |   3 +
>  include/linux/bpf_types.h                     |   4 +
>  include/linux/netfilter.h                     |   1 +
>  include/net/netfilter/nf_bpf_link.h           |   8 +
>  include/uapi/linux/bpf.h                      |  15 ++
>  include/uapi/linux/netfilter/nfnetlink_hook.h |  20 +-
>  kernel/bpf/btf.c                              |   6 +
>  kernel/bpf/syscall.c                          |   6 +
>  kernel/bpf/verifier.c                         |   3 +
>  net/bpf/test_run.c                            | 143 +++++++++++++
>  net/core/filter.c                             |   1 +
>  net/netfilter/Kconfig                         |   3 +
>  net/netfilter/Makefile                        |   1 +
>  net/netfilter/core.c                          |  12 ++
>  net/netfilter/nf_bpf_link.c                   | 190 ++++++++++++++++++
>  net/netfilter/nfnetlink_hook.c                |  81 ++++++--
>  tools/bpf/bpftool/link.c                      |  24 +++
>  tools/include/uapi/linux/bpf.h                |  15 ++
>  tools/lib/bpf/libbpf.c                        |   1 +
>  .../selftests/bpf/verifier/netfilter.c        |  23 +++
>  20 files changed, 546 insertions(+), 14 deletions(-)
>  create mode 100644 include/net/netfilter/nf_bpf_link.h
>  create mode 100644 net/netfilter/nf_bpf_link.c
>  create mode 100644 tools/testing/selftests/bpf/verifier/netfilter.c
>
Florian Westphal April 12, 2023, 9:45 a.m. UTC | #2
Quentin Deslandes <qde@naccy.de> wrote:
> On 05/04/2023 18:11, Florian Westphal wrote:
> > Add minimal support to hook bpf programs to netfilter hooks, e.g.
> > PREROUTING or FORWARD.
> > 
> > For this the most relevant parts for registering a netfilter
> > hook via the in-kernel api are exposed to userspace via bpf_link.
> > 
> > The new program type is 'tracing style', i.e. there is no context
> > access rewrite done by verifier, the function argument (struct bpf_nf_ctx)
> > isn't stable.
> > There is no support for direct packet access, dynptr api should be used
> > instead.
> 
> Does this mean the verifier will reject any program accessing ctx->skb
> (e.g. ctx->skb + X)?

Do you mean access to ctx->skb->data + X?  If so, yes, that won't work.

Otherwise, then no, it just means that programs might have to be recompiled
if they lack needed relocation information, but only if bpf_nf_ctx structure is
changed.

Initial version used "__sk_buff *skb", like e.g. clsact.  I was told
to not do that and expose the real kernel-side structure instead and to
not bother with direct packet access (skb->data access) support.

> >  #include "vmlinux.h"
> > extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
> >                                struct bpf_dynptr *ptr__uninit) __ksym;
> > extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
> >                                    void *buffer, uint32_t buffer__sz) __ksym;
> > SEC("netfilter")
> > int nf_test(struct bpf_nf_ctx *ctx)
> > {
> > 	struct nf_hook_state *state = ctx->state;
> > 	struct sk_buff *skb = ctx->skb;

ctx->skb is dereferenced...

> > 	if (bpf_dynptr_from_skb(skb, 0, &ptr))
> > 		return NF_DROP;

... dynptr is created ...

> > 	iph = bpf_dynptr_slice(&ptr, 0, &_iph, sizeof(_iph));
> > 	if (!iph)
> > 		return NF_DROP;
> > 	th = bpf_dynptr_slice(&ptr, iph->ihl << 2, &_th, sizeof(_th));

ip header access.
Quentin Deslandes April 13, 2023, 9:26 a.m. UTC | #3
On 12/04/2023 11:45, Florian Westphal wrote:
> Quentin Deslandes <qde@naccy.de> wrote:
>> On 05/04/2023 18:11, Florian Westphal wrote:
>>> Add minimal support to hook bpf programs to netfilter hooks, e.g.
>>> PREROUTING or FORWARD.
>>>
>>> For this the most relevant parts for registering a netfilter
>>> hook via the in-kernel api are exposed to userspace via bpf_link.
>>>
>>> The new program type is 'tracing style', i.e. there is no context
>>> access rewrite done by verifier, the function argument (struct bpf_nf_ctx)
>>> isn't stable.
>>> There is no support for direct packet access, dynptr api should be used
>>> instead.
>>
>> Does this mean the verifier will reject any program accessing ctx->skb
>> (e.g. ctx->skb + X)?
> 
> Do you mean access to ctx->skb->data + X?  If so, yes, that won't work.
> 
> Otherwise, then no, it just means that programs might have to be recompiled
> if they lack needed relocation information, but only if bpf_nf_ctx structure is
> changed.
> 
> Initial version used "__sk_buff *skb", like e.g. clsact.  I was told
> to not do that and expose the real kernel-side structure instead and to
> not bother with direct packet access (skb->data access) support.
> 

That's exactly what I meant, thanks.

>>>  #include "vmlinux.h"
>>> extern int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
>>>                                struct bpf_dynptr *ptr__uninit) __ksym;
>>> extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
>>>                                    void *buffer, uint32_t buffer__sz) __ksym;
>>> SEC("netfilter")
>>> int nf_test(struct bpf_nf_ctx *ctx)
>>> {
>>> 	struct nf_hook_state *state = ctx->state;
>>> 	struct sk_buff *skb = ctx->skb;
> 
> ctx->skb is dereferenced...
> 
>>> 	if (bpf_dynptr_from_skb(skb, 0, &ptr))
>>> 		return NF_DROP;
> 
> ... dynptr is created ...
> 
>>> 	iph = bpf_dynptr_slice(&ptr, 0, &_iph, sizeof(_iph));
>>> 	if (!iph)
>>> 		return NF_DROP;
>>> 	th = bpf_dynptr_slice(&ptr, iph->ihl << 2, &_th, sizeof(_th));
> 
> ip header access.