mbox series

[RFC,bpf-next,00/11] bpf: inlinable kfuncs for BPF

Message ID 20241107175040.1659341-1-eddyz87@gmail.com (mailing list archive)
Headers show
Series bpf: inlinable kfuncs for BPF | expand

Message

Eduard Zingerman Nov. 7, 2024, 5:50 p.m. UTC
Some time ago, in an off-list discussion, Alexei Starovoitov suggested
compiling certain kfuncs to BPF to allow inlining calls to such kfuncs
during verification. This RFC explores the idea.

This RFC introduces a notion of inlinable BPF kfuncs.
Inlinable kfuncs are compiled to BPF and are inlined by verifier after
program verification. Inlined kfunc bodies are subject to dead code
removal and removal of conditional jumps, if such jumps are proved to
always follow a single branch.

Motivation
----------

The patch set uses bpf_dynptr_slice() as guinea pig, as this function
is relatively complex and has a few conditionals that could be
removed most of the times. The function has the following structure:

    void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset,
                           void *buffer__opt, u32 buffer__szk)
    {
        ...
        type = bpf_dynptr_get_type(ptr);

        switch (type) {
        ...
        case BPF_DYNPTR_TYPE_SKB:
                if (buffer__opt)
                        return skb_header_pointer(...);
                else
                        return skb_pointer_if_linear(...);
        ...
    }

Parameters 'type' and 'buffer__opt' are most likely to be known at
callsite, and thus the function could be inlined w/o 'switch' and 'if'
checks above.

This has a measurable speedup on microbenchmark, e.g. a simple test
measuring number of bpf_dynptr_slice() executions per second shows
~1.5 speedup compared to master (see last patch in the series).
Granted, real world programs do some other work beside slice calls.

Mechanism
---------

Implementation pursues three main goals:
- avoid differences in program verification whether or not certain
  kfunc is inlinable;
- predict branches in inlined kfuncs bodies;
- allow inlined kfunc bodies to do arbitrary computations.

The goals are achieved in the following steps:
- Inlinable kfuncs are defined in kernel/bpf/inlinable_kfuncs.c.
- In order to include kernel headers as-is the C file is compiled to
  llvm bitcode targeting native architecture and then to BPF elf
  object using llc utility.
- BPF elf object is embedded in kernel data section.
- At kernel initialization time the elf object is parsed and functions
  defined in the object are deemed to be inlinable kfuncs.
- Before main verification pass, for each inlinable kfunc callsite,
  inlinable kfunc is cloned as a hidden subprogram. Such subprograms
  are called kfunc instances.
- A new KERNEL_VALUE register type is added:
  - ALU operations on any type and KERNEL_VALUE return KERNEL_VALUE;
  - load / store operations with base register having type
    KERNEL_VALUE return KERNEL_VALUE.
- During main verification pass:
  - inlinable kfunc calls are verified as regular kfunc calls;
  - the bodies of the corresponding kfunc instances are visited
    in a "distilled" context:
    - no caller stack frame;
    - scalar or null pointer r1-r5 from caller stack frame are copied
      to instance call frame verbatim;
    - pointer to dynptr r1-r5 from caller stack frame are represented
      in the instance call frame as pointers to a fake stack frame.
      For each dynptr this fake stack frame contains two register spills:
      - one scalar for 'size' field, which also encodes type;
      - one KERNEL_VALUE for 'data' field;
    - r1-r5 of all other types are represented in the instance call
      frame as KERNEL_VALUEs;
  - when 'exit' instruction within kfunc instance body is processed,
    verification for the current path is assumed complete.
- After main verification pass:
  - rely on existing passes opt_hard_wire_dead_code_branches() and
    opt_remove_dead_code() to simplify kfunc instance bodies;
  - calls to inlinable kfuncs are replaced with corresponding kfunc
    instance bodies, instance subprograms are removed.

Patch-set structure
-------------------

Implementation of the above mechanics is split in several patches to
simplify the review. The following patches are most interesting:

- "bpf: shared BPF/native kfuncs":
  Build system integration and kfuncs inlining after verification.

- "bpf: KERNEL_VALUE register type"
  Adds an opaque type for registers that could be used for ALU
  and memory access.

- "bpf: allow specifying inlinable kfuncs in modules"
  This is mostly for tests: for testing purposes
  it is necessary to control exact assembly code for
  both test case calling kfunc and kfunc itself.

- "bpf: instantiate inlinable kfuncs before verification"
  Adds a pass that clones inlinable kfunc bodies as hidden
  subprograms, one subprogram per callsite.

- "bpf: special rules for kernel function calls inside inlinable kfuncs"
  Allows to call arbitrary kernel functions from kfunc instances.

Limitations
-----------

Some existing verifier restrictions are not lifted for inlinable kfuncs:
- stack is limited to 512 bytes;
- max 8 call frames (7 with the fake call frame);
- loops are still verified in a "brute force" way;
- instructions processed for kfunc instances are counted in 1M
  instructions budget.

The list is probably not exhaustive.

TODO items
----------

The following items are currently on the TODO list:
- consider getting rid of bpftool linking phase for BPF elf object;
- consider getting rid of clang->bitcode->llc->elf dance;
- make bpf_dynptr_from_skb() inlinable and allow passing objects state
  between kfunc instances, thus avoiding special case for dynptr;
- determine the set of kfuncs for which inlining makes sense.

Alternatives
------------

Imo, this RFC is worth following through only if number of kfuncs
benefiting from inlining is big. If the set is limited to dynptr
family of functions, it is simpler to add a number of hard-coded
inlining templates for such functions (similarly to what is currently
done for some helpers).

Eduard Zingerman (11):
  bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  selftests/bpf: tests for opt_hard_wire_dead_code_branches()
  bpf: shared BPF/native kfuncs
  bpf: allow specifying inlinable kfuncs in modules
  bpf: dynamic allocation for bpf_verifier_env->subprog_info
  bpf: KERNEL_VALUE register type
  bpf: instantiate inlinable kfuncs before verification
  bpf: special rules for kernel function calls inside inlinable kfuncs
  bpf: move selected dynptr kfuncs to inlinable_kfuncs.c
  selftests/bpf: tests to verify handling of inlined kfuncs
  selftests/bpf: dynptr_slice benchmark

 Makefile                                      |   22 +-
 include/linux/bpf.h                           |   40 +-
 include/linux/bpf_verifier.h                  |   12 +-
 include/linux/btf.h                           |    7 +
 kernel/bpf/Makefile                           |   25 +-
 kernel/bpf/btf.c                              |    2 +-
 kernel/bpf/helpers.c                          |  130 +-
 kernel/bpf/inlinable_kfuncs.c                 |  113 ++
 kernel/bpf/log.c                              |    1 +
 kernel/bpf/verifier.c                         | 1187 ++++++++++++++++-
 tools/testing/selftests/bpf/Makefile          |    2 +
 tools/testing/selftests/bpf/bench.c           |    2 +
 .../selftests/bpf/benchs/bench_dynptr_slice.c |   76 ++
 .../selftests/bpf/bpf_testmod/Makefile        |   24 +-
 .../{bpf_testmod.c => bpf_testmod_core.c}     |   25 +
 .../bpf/bpf_testmod/test_inlinable_kfuncs.c   |  132 ++
 .../selftests/bpf/prog_tests/verifier.c       |    9 +
 .../selftests/bpf/progs/dynptr_slice_bench.c  |   29 +
 .../selftests/bpf/progs/verifier_dead_code.c  |   63 +
 .../bpf/progs/verifier_inlinable_kfuncs.c     |  253 ++++
 20 files changed, 1970 insertions(+), 184 deletions(-)
 create mode 100644 kernel/bpf/inlinable_kfuncs.c
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_dynptr_slice.c
 rename tools/testing/selftests/bpf/bpf_testmod/{bpf_testmod.c => bpf_testmod_core.c} (97%)
 create mode 100644 tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
 create mode 100644 tools/testing/selftests/bpf/progs/dynptr_slice_bench.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_dead_code.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_inlinable_kfuncs.c

Comments

Toke Høiland-Jørgensen Nov. 8, 2024, 8:41 p.m. UTC | #1
Eduard Zingerman <eddyz87@gmail.com> writes:

> Some time ago, in an off-list discussion, Alexei Starovoitov suggested
> compiling certain kfuncs to BPF to allow inlining calls to such kfuncs
> during verification. This RFC explores the idea.
>
> This RFC introduces a notion of inlinable BPF kfuncs.
> Inlinable kfuncs are compiled to BPF and are inlined by verifier after
> program verification. Inlined kfunc bodies are subject to dead code
> removal and removal of conditional jumps, if such jumps are proved to
> always follow a single branch.

Ohh, this is very exciting!

Mostly want to comment on this bit:

> Imo, this RFC is worth following through only if number of kfuncs
> benefiting from inlining is big. If the set is limited to dynptr
> family of functions, it is simpler to add a number of hard-coded
> inlining templates for such functions (similarly to what is currently
> done for some helpers).

One place where this would definitely be applicable is in all the XDP HW
metadata kfuncs. Right now, there's a function call for each piece of HW
metadata that an XDP program wants to read, which quickly adds up. And
in XDP land we are counting function calls, as the overhead (~1.1 ns) is
directly measurable in XDP PPS performance.

Back when we settled on the kfunc approach to reading metadata, we were
discussing this overhead, obviously, and whether we should do the
bespoke BPF assembly type inlining that we currently do for map lookups
and that sort of thing. We were told that the "right" way to do the
inlining is something along the lines of what you are proposing here, so
I would very much encourage you to continue working on this!

One complication for the XDP kfuncs is that the kfunc that the BPF
program calls is actually a stub function in the kernel core; at
verification time, the actual function call is replaced with one from
the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
supporting this (with kfuncs defined in drivers, i.e., in modules) would
be needed for the XDP use case.

Happy to help with benchmarking for the XDP use case when/if this can be
supported, of course! :)

(+Jesper, who I'm sure will be happy to help as well)

-Toke
Eduard Zingerman Nov. 8, 2024, 11:01 p.m. UTC | #2
On Fri, 2024-11-08 at 21:41 +0100, Toke Høiland-Jørgensen wrote:

[...]

Hi Toke,

> Back when we settled on the kfunc approach to reading metadata, we were
> discussing this overhead, obviously, and whether we should do the
> bespoke BPF assembly type inlining that we currently do for map lookups
> and that sort of thing. We were told that the "right" way to do the
> inlining is something along the lines of what you are proposing here, so
> I would very much encourage you to continue working on this!
> 
> One complication for the XDP kfuncs is that the kfunc that the BPF
> program calls is actually a stub function in the kernel core; at
> verification time, the actual function call is replaced with one from
> the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
> supporting this (with kfuncs defined in drivers, i.e., in modules) would
> be needed for the XDP use case.

Thank you for the pointer to bpf_dev_bound_resolve_kfunc().
Looking at specialize_kfunc(), I will have to extend the interface for
selecting inlinable function body. The inlinable kfuncs already could
be defined in modules, so this should be a relatively small adjustment.

> Happy to help with benchmarking for the XDP use case when/if this can be
> supported, of course! :)
> 
> (+Jesper, who I'm sure will be happy to help as well)

Thank you, help with benchmarking is most welcome.
Very interested in real-world benchmarks, as I'm not fully sold on
this feature, it adds significant layer of complexity to the verifier.
I'll reach to you and Jesper after adding support for inlining of XDP
metadata kfuncs.

Thanks,
Eduard
Toke Høiland-Jørgensen Nov. 11, 2024, 6:42 p.m. UTC | #3
Eduard Zingerman <eddyz87@gmail.com> writes:

> On Fri, 2024-11-08 at 21:41 +0100, Toke Høiland-Jørgensen wrote:
>
> [...]
>
> Hi Toke,
>
>> Back when we settled on the kfunc approach to reading metadata, we were
>> discussing this overhead, obviously, and whether we should do the
>> bespoke BPF assembly type inlining that we currently do for map lookups
>> and that sort of thing. We were told that the "right" way to do the
>> inlining is something along the lines of what you are proposing here, so
>> I would very much encourage you to continue working on this!
>> 
>> One complication for the XDP kfuncs is that the kfunc that the BPF
>> program calls is actually a stub function in the kernel core; at
>> verification time, the actual function call is replaced with one from
>> the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
>> supporting this (with kfuncs defined in drivers, i.e., in modules) would
>> be needed for the XDP use case.
>
> Thank you for the pointer to bpf_dev_bound_resolve_kfunc().
> Looking at specialize_kfunc(), I will have to extend the interface for
> selecting inlinable function body. The inlinable kfuncs already could
> be defined in modules, so this should be a relatively small adjustment.

Awesome!

>> Happy to help with benchmarking for the XDP use case when/if this can be
>> supported, of course! :)
>> 
>> (+Jesper, who I'm sure will be happy to help as well)
>
> Thank you, help with benchmarking is most welcome.
> Very interested in real-world benchmarks, as I'm not fully sold on
> this feature, it adds significant layer of complexity to the verifier.
> I'll reach to you and Jesper after adding support for inlining of XDP
> metadata kfuncs.

Sounds good, thanks!

-Toke