Message ID: 20201116065305.1010651-1-haliu@redhat.com (mailing list archive)
Series: iproute2: add libbpf support
On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
>
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
>
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. LIBBPF_FORCE can be set to "on" to force
> configure to abort if no suitable libbpf is found (useful for automatic
> packaging that wants to enforce the dependency), or to "off" to disable
> the libbpf check and build iproute2 with legacy bpf.
>
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code ensures that iproute2 will
> still understand the old map definition format, including populating
> map-in-map and tail call maps before load.
>
> The examples in bpf/examples are kept, and a separate set of examples
> is added with BTF-based map definitions for those examples where this
> is possible (libbpf doesn't currently support declaratively populating
> tail call maps).
>
> Finally, thanks a lot to Toke for his help on this patch set.
>
> v5:
> a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
>    dest.
> b) Fix bpf_prog_load_dev typo.
> c) Rebase to latest iproute2-next.

For the reasons explained multiple times earlier:

Nacked-by: Alexei Starovoitov <ast@kernel.org>
On Sun, 15 Nov 2020 23:19:26 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > [... cover letter trimmed ...]
>
> For the reasons explained multiple times earlier:
> Nacked-by: Alexei Starovoitov <ast@kernel.org>

We really need to get another BPF-ELF loader into iproute2. I have done a
number of practical projects with TC-BPF, and it is painful that iproute2
still has this outdated (compiled-in) BPF loader. Examples: jumping through
hoops to get XDP + TC to collaborate[1], and dealing with the iproute2
map-ELF layout[2].

Thus, IMHO we MUST move forward and get started with converting iproute2 to
libbpf, and start on the work to deprecate the built-in BPF-ELF loader. I
would prefer ripping out the BPF-ELF loader and replacing it with libbpf
code that handles the older binary ELF map layout, but I do understand if
you want to keep the old loader around (at least for the next couple of
releases).

Maybe we can get a little closer to what Alexei wants?

When compiled against a dynamic libbpf, I would use the 'ldd' command to see
which libbpf version is used. When compiled/linked statically against a
custom libbpf version (already supported via LIBBPF_DIR), *I* think it is
difficult to figure out which version of libbpf I'm using. Could we add the
libbpf version info to 'tc -V'? That would remove one of my concerns with
static linking.

I actually fear that it will be a bad user experience when we start to have
multiple userspace tools that load BPF, but each is compiled and statically
linked with its own version of libbpf (with git submodules, an increasing
number of tools will have even more variations!). Small variations in
supported features can cause strange and difficult troubleshooting.
A practical example is xdp-cpumap-tc[1], where I had to instruct the
customer to load the XDP program *BEFORE* the TC program, so that the map
(which is shared between TC and XDP) gets created correctly and a userspace
tool written with libbpf has proper map access and info.

I actually think it makes sense to have iproute2 require a specific libbpf
version, and also to move this version requirement forward as the kernel
evolves and features get added into libbpf. I know this is kind of
controversial, and an attempt to pressure distro vendors to update libbpf.
Maybe it will actually backfire, as the person generating the DEB/RPM
software package will/can choose to compile iproute2 without ELF-BPF/libbpf
support.

[1] https://github.com/xdp-project/xdp-cpumap-tc
[2] https://github.com/netoptimizer/bpf-examples/blob/71db45b28ec/traffic-pacing-edt/edt_pacer02.c#L33-L35
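[Editor's note: to make the shared-map ordering problem concrete, here is a
minimal sketch; it is an illustration, not code from this series or from
xdp-cpumap-tc, and the names shared_cnt, xdp_prog and tc_prog are invented.
A single BTF-defined map is pinned by name, so whichever of the XDP or TC
programs is loaded first through a BTF-aware loader creates the bpffs entry
with full type info, and the other one simply reuses it.]

```c
// SPDX-License-Identifier: GPL-2.0
/* Illustrative sketch: one BTF-defined map pinned by name so that an XDP
 * program and a TC program resolve the same /sys/fs/bpf entry, instead of
 * each loader creating its own copy of the map. */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, __u64);
	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* shared via bpffs */
} shared_cnt SEC(".maps");

static __always_inline void bump(void)
{
	__u32 key = 0;
	__u64 *val = bpf_map_lookup_elem(&shared_cnt, &key);

	if (val)
		__sync_fetch_and_add(val, 1);
}

SEC("xdp")
int xdp_prog(struct xdp_md *ctx)
{
	bump();
	return XDP_PASS;
}

SEC("classifier")
int tc_prog(struct __sk_buff *skb)
{
	bump();
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";
```

With pinning by name, the load order stops mattering as long as both
programs go through a loader that honors the pinning attribute, which is
exactly what converting iproute2 to libbpf would buy here.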
On Sun, 15 Nov 2020 23:19:26 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > [... cover letter trimmed ...]
>
> For the reasons explained multiple times earlier:
> Nacked-by: Alexei Starovoitov <ast@kernel.org>

Could you propose a trial balloon patch to show what you would like to see
in iproute2?
Jesper Dangaard Brouer <brouer@redhat.com> writes:

> When compiled against a dynamic libbpf, I would use the 'ldd' command to see
> which libbpf version is used. When compiled/linked statically against a
> custom libbpf version (already supported via LIBBPF_DIR), *I* think it is
> difficult to figure out which version of libbpf I'm using. Could we add the
> libbpf version info to 'tc -V'? That would remove one of my concerns with
> static linking.

Agreed, I think we should definitely add the libbpf version to the tool
version output.

-Toke
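[Editor's note: as a sketch of how that could look, assuming configure
records the detected libbpf version in build-time macros; HAVE_LIBBPF and
LIBBPF_VERSION below are hypothetical defines, not libbpf APIs.]

```c
#include <stdio.h>

/* Assumption: ./configure detects libbpf and passes something like
 * -DHAVE_LIBBPF -DLIBBPF_VERSION=\"0.1.0\" in CFLAGS. This works for both
 * static and dynamic linking, since the version is captured at build time. */
#ifndef LIBBPF_VERSION
#define LIBBPF_VERSION "0.1.0"
#endif

static void print_version(const char *tool)
{
#ifdef HAVE_LIBBPF
	printf("%s utility, iproute2-5.9.0, libbpf %s\n", tool, LIBBPF_VERSION);
#else
	printf("%s utility, iproute2-5.9.0\n", tool);
#endif
}

int main(void)
{
	print_version("tc");	/* e.g.: tc utility, iproute2-5.9.0, libbpf 0.1.0 */
	return 0;
}
```

This matches the output format shown in the v6 test log further down
("tc utility, iproute2-5.9.0, libbpf 0.1.0").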
On Mon, Nov 16, 2020 at 03:54:46PM +0100, Jesper Dangaard Brouer wrote:
>
> Thus, IMHO we MUST move forward and get started with converting iproute2 to
> libbpf, and start on the work to deprecate the built-in BPF-ELF loader. I
> would prefer ripping out the BPF-ELF loader and replacing it with libbpf
> code that handles the older binary ELF map layout, but I do understand if
> you want to keep the old loader around (at least for the next couple of
> releases).

I don't understand why the legacy code has to stay around. Having the legacy
code and an option to build tc without libbpf creates a backward
compatibility risk for tc users: a newer tc may not load bpf progs that an
older tc did.

> I actually fear that it will be a bad user experience when we start to have
> multiple userspace tools that load BPF, but each is compiled and statically
> linked with its own version of libbpf (with git submodules, an increasing
> number of tools will have even more variations!).

So far people either freeze the bpftool they use to load progs, or they use
libbpf directly in their applications. Any other way means that the
application behavior will be unpredictable.

If a company built a bpf-based product and wants to distribute such a
product as a package, it needs a way to specify this dependency in the
package config. 'tc -V' is not something that can be put in a spec. The
main iproute2 version can be used as a dependency, but it's meaningless when
the presence of libbpf and its version is not strictly derived from the
iproute2 spec. The users should be able to write in their spec:

    BuildRequires: iproute-tc >= 5.10

and be confident that tc will load the prog they've developed and tested.

> I actually think it makes sense to have iproute2 require a specific libbpf
> version, and also to move this version requirement forward as the kernel
> evolves and features get added into libbpf.

+1
On Mon, Nov 16, 2020 at 06:37:57PM -0800, Alexei Starovoitov wrote:
> I don't understand why the legacy code has to stay around. Having the legacy
> code and an option to build tc without libbpf creates a backward
> compatibility risk for tc users: a newer tc may not load bpf progs that an
> older tc did.

If a distro chooses to compile iproute2 with libbpf, I don't think they will
compile iproute2 without libbpf in a newer version. So a yum/apt-get update
from an official source doesn't look like a problem. Unless a user chooses a
self-built iproute2 version; but then the self-built version may also lack
other support, like libelf, libmnl, libcap, etc.

> So far people either freeze the bpftool they use to load progs, or they use
> libbpf directly in their applications. Any other way means that the
> application behavior will be unpredictable.
> [...]
> The users should be able to write in their spec:
>     BuildRequires: iproute-tc >= 5.10
> and be confident that tc will load the prog they've developed and tested.

The current patch does have a libbpf version check; it needs at least libbpf
0.1.0. So if a distro starts to build iproute2 against libbpf, there will be
a dependency. The rule could be added to the rpm spec file, or whatever else
the distro chooses. That's the distro packager's job.

Unless you mean a company built a bpf-based product, added only an iproute2
version dependency (say some distro ships iproute2 5.12 with libbpf
support), and somehow forgot to add a libbpf version dependency and a distro
check; and at the same time a user runs the product on a distro whose
iproute2 5.12 was compiled without libbpf. That would indeed cause a
problem. But if I were the user, I would think the company is not
professional about its bpf product if it doesn't even know that libbpf is
needed...

So my opinion: for end users, the distro should take care of libbpf and
iproute2 version control. A bpf company should take care of whether libbpf
is used by iproute2 and which distros they support.

Please correct me if I missed something.

Thanks
Hangbin
On 11/16/20 7:54 AM, Jesper Dangaard Brouer wrote:
> When compiled against a dynamic libbpf, I would use the 'ldd' command to see
> which libbpf version is used. When compiled/linked statically against a
> custom libbpf version (already supported via LIBBPF_DIR), *I* think it is
> difficult to figure out which version of libbpf I'm using. Could we add the
> libbpf version info to 'tc -V'? That would remove one of my concerns with
> static linking.

Adding the libbpf version to 'tc -V' and 'ip -V' seems reasonable.

As for the bigger problem, trying to force userspace components to
constantly chase the latest and greatest S/W versions is not the right
answer. The crux of the problem here is loading bpf object files, and what
will most likely be a never-ending stream of enhancements that affect the
proper loading of them.

bpftool is much better suited to the job of managing bpf files than
iproute2, which is the de facto implementation for networking APIs. bpftool
ships as part of a common linux-tools package, so it will naturally track
kernel versions for those who want / need the latest and greatest versions.
Users who are not building their own agents for managing bpf files (which I
think is much more appropriate for production use cases than forking command
line utilities) can use bpftool to load files and manage the maps that are
then attached to the programs, and then invoke iproute2 to handle the
networking attach / detach / list with detailed information.

That said, the legacy bpf code in iproute2 has created some expectations,
and iproute2 cannot simply remove existing capabilities. Moving iproute2 to
libbpf provides an improvement over the current status by allowing 'modern'
bpf object files to be loaded without affecting legacy users, even if it
does not allow the latest and greatest bpf capabilities at every moment in
time (again, a constantly moving reference point).

iproute2 is a networking configuration tool, not a bpf management tool.
Hangbin's approach gives full flexibility to those who roll their own and to
distributions that value stability: iproute2 can use the latest and greatest
libbpf for those who want to chase the pot of gold at the end of the
rainbow, or they can choose stability with an OS distro's libbpf or legacy
bpf. I believe this is the right compromise at this point in time.
On 17/11/2020 02:37, Alexei Starovoitov wrote:
> If a company built a bpf-based product and wants to distribute such a
> product as a package, it needs a way to specify this dependency in the
> package config. 'tc -V' is not something that can be put in a spec. The
> main iproute2 version can be used as a dependency, but it's meaningless when
> the presence of libbpf and its version is not strictly derived from the
> iproute2 spec.

But if libbpf is dynamically linked, they can put

    Requires: libbpf >= 0.3.0
    Requires: iproute-tc >= 5.10

and get the dependency behaviour they need. No?

-ed
On Mon, Nov 16, 2020 at 08:38:15PM -0700, David Ahern wrote:
>
> As for the bigger problem, trying to force userspace components to
> constantly chase the latest and greatest S/W versions is not the right
> answer.

Your own nexthop enhancements in the kernel code go 1-1 with iproute2
changes. So users do chase the latest kernel and the latest iproute2 if they
want that networking feature. Yet you're arguing that for bpf features they
shouldn't have such expectations, and that iproute2 will not support the
latest kernel bpf features. I sense a lot of bias here.

> The crux of the problem here is loading bpf object files, and what will most
> likely be a never-ending stream of enhancements that affect the proper
> loading of them.

Please stop spreading this misinformation. Multiple people have explained
numerous times that libbpf takes care of backward compatibility.

> That said, the legacy bpf code in iproute2 has created some expectations,
> and iproute2 cannot simply remove existing capabilities.

It certainly can remove them by moving to libbpf.

> iproute2 is a networking configuration tool, not a bpf management tool.
> Hangbin's approach gives full flexibility to those who roll their own and to
> distributions that value stability: iproute2 can use the latest and greatest
> libbpf for those who want to chase the pot of gold at the end of the
> rainbow, or they can choose stability with an OS distro's libbpf or legacy
> bpf. I believe this is the right compromise at this point in time.

In other words you're saying that upstream iproute2 is a kitchen sink of
untested combinations of libraries, and that distros are supposed to do a
ton of extra work to provide their users a quality iproute2.
On Tue, Nov 17, 2020 at 11:19:33AM +0800, Hangbin Liu wrote:
> On Mon, Nov 16, 2020 at 06:37:57PM -0800, Alexei Starovoitov wrote:
> > [...]
> > I don't understand why the legacy code has to stay around. Having the
> > legacy code and an option to build tc without libbpf creates a backward
> > compatibility risk for tc users: a newer tc may not load bpf progs that
> > an older tc did.
>
> If a distro chooses to compile iproute2 with libbpf, I don't think they will
> compile iproute2 without libbpf in a newer version. So a yum/apt-get update
> from an official source doesn't look like a problem.
> [...]
> Unless you mean a company built a bpf-based product, added only an iproute2
> version dependency (say some distro ships iproute2 5.12 with libbpf
> support), and somehow forgot to add a libbpf version dependency and a distro
> check; and at the same time a user runs the product on a distro whose
> iproute2 5.12 was compiled without libbpf. That would indeed cause a
> problem.

Right. You've answered Ed's question:

> But if libbpf is dynamically linked, they can put
>     Requires: libbpf >= 0.3.0
>     Requires: iproute-tc >= 5.10
> and get the dependency behaviour they need. No?

It is a problem because ">= 5.10" cannot capture legacy vs libbpf.

> But if I were the user, I would think the company is not professional about
> its bpf product if it doesn't even know that libbpf is needed...
>
> So my opinion: for end users, the distro should take care of libbpf and
> iproute2 version control. A bpf company should take care of whether libbpf
> is used by iproute2 and which distros they support.
So you're saying that the bpf community shouldn't care about their users,
and that the distros are supposed to step forward and provide proper bpf
support in tools like iproute2? In other words, upstream iproute2 doesn't
care about shipping a quality product; it's the distros' job now. Thanks,
but no. iproute2 should stay with the legacy, obsolete prog loader, and
users should switch to the bpftool + iproute2 combination: bpftool for
loading progs and iproute2 for networking configs.
This series converts iproute2 to use libbpf for loading and attaching BPF
programs when it is available. This means that iproute2 will correctly
process BTF information and support the new-style BTF-defined maps, while
keeping compatibility with the old internal map definition syntax.

This is achieved by checking for libbpf at './configure' time, and using it
if available. By default the system libbpf will be used, but static linking
against a custom libbpf version can be achieved by passing LIBBPF_DIR to
configure. LIBBPF_FORCE can be set to "on" to force configure to abort if no
suitable libbpf is found (useful for automatic packaging that wants to
enforce the dependency), or to "off" to disable the libbpf check and build
iproute2 with legacy bpf.

The old iproute2 bpf code is kept and will be used if no suitable libbpf is
available. When using libbpf, wrapper code ensures that iproute2 will still
understand the old map definition format, including populating map-in-map
and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples is
added with BTF-based map definitions for those examples where this is
possible (libbpf doesn't currently support declaratively populating tail
call maps).

Finally, thanks a lot to Toke for his help on this patch set.

v6:
a) Print the runtime libbpf version in 'ip -V' and 'tc -V'.

v5:
a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
   dest.
b) Fix bpf_prog_load_dev typo.
c) Rebase to latest iproute2-next.

v4:
a) Make the variable LIBBPF_FORCE able to control whether to build iproute2
   with libbpf or not.
b) Add new file bpf_glue.c for libbpf/legacy mixed bpf calls.
c) Fix some build issues and a shell compatibility error.

v3:
a) Update configure to check the function bpf_program__section_name()
   separately.
b) Add a new function get_bpf_program__section_name() to choose whether to
   use bpf_program__title() or not.
c) Test-build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and
   libbpf-devel-0.1.0-1.fc33.

v2:
a) Remove self-defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.
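[Editor's note: for readers who have not seen the two map-definition
syntaxes the series reconciles, here is a minimal sketch modeled on the
map_sh example; the struct bpf_elf_map layout is abridged, and in iproute2
it normally comes from include/bpf_elf.h.]

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Old iproute2-internal syntax: a struct placed in a "maps" section that
 * only iproute2's built-in loader understands (layout abridged here). */
struct bpf_elf_map {
	__u32 type;
	__u32 size_key;
	__u32 size_value;
	__u32 max_elem;
	__u32 flags;
	__u32 id;
	__u32 pinning;
};

struct bpf_elf_map SEC("maps") map_sh_legacy = {
	.type		= BPF_MAP_TYPE_ARRAY,
	.size_key	= sizeof(__u32),
	.size_value	= sizeof(__u32),
	.max_elem	= 1,
};

/* New BTF-defined syntax in a ".maps" section, understood by libbpf; the
 * key/value types are carried as BTF instead of raw sizes. */
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, __u32);
} map_sh SEC(".maps");
```

Similarly, the v3 note about bpf_program__section_name() refers to libbpf's
rename of bpf_program__title(); a compatibility shim along these lines is
implied (HAVE_LIBBPF_SECTION_NAME is an assumed configure-provided define,
not part of libbpf):

```c
#include <bpf/libbpf.h>

/* Use the new accessor when the detected libbpf provides it, and fall back
 * to the deprecated bpf_program__title() on older libbpf versions. */
static const char *get_bpf_program__section_name(const struct bpf_program *prog)
{
#ifdef HAVE_LIBBPF_SECTION_NAME
	return bpf_program__section_name(prog);
#else
	return bpf_program__title(prog, false);
#endif
}
```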
Here are the test results with patched iproute2:

== Show libbpf version
# ip -V
ip utility, iproute2-5.9.0, libbpf 0.1.0
# tc -V
tc utility, iproute2-5.9.0, libbpf 0.1.0

== Setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up

== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl  loaded_at 2020-10-22T08:04:21-0400  uid 0  xlated 80B  jited 71B  memlock 4096B  btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl  loaded_at 2020-10-22T08:04:23-0400  uid 0  xlated 336B  jited 193B  memlock 4096B  map_ids 3  btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl  loaded_at 2020-10-22T08:04:25-0400  uid 0  xlated 224B  jited 139B  memlock 4096B  map_ids 4  btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off

== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl  loaded_at 2020-10-22T08:04:27-0400  uid 0  xlated 80B  jited 71B  memlock 4096B  btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl  loaded_at 2020-10-22T08:04:29-0400  uid 0  xlated 336B  jited 193B  memlock 4096B  map_ids 3  btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl  loaded_at 2020-10-22T08:04:31-0400  uid 0  xlated 224B  jited 139B  memlock 4096B  map_ids 4  btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0  key 4B  value 16B  max_entries 5  memlock 4096B  btf_id 35
6: percpu_array  name filter_ports  flags 0x0  key 4B  value 8B  max_entries 65536  memlock 1576960B  btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0  key 4B  value 8B  max_entries 10000  memlock 1064960B  btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0  key 16B  value 8B  max_entries 10000  memlock 1142784B  btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0  key 6B  value 8B  max_entries 10000  memlock 1064960B  btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl  loaded_at 2020-10-22T08:04:33-0400  uid 0  xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6  btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0  key 4B  value 16B  max_entries 5  memlock 4096B  btf_id 35
6: percpu_array  name filter_ports  flags 0x0  key 4B  value 8B  max_entries 65536  memlock 1576960B  btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0  key 4B  value 8B  max_entries 10000  memlock 1064960B  btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0  key 16B  value 8B  max_entries 10000  memlock 1142784B  btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0  key 6B  value 8B  max_entries 10000  memlock 1064960B  btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl  loaded_at 2020-10-22T08:04:35-0400  uid 0  xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5  btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0  key 4B  value 16B  max_entries 5  memlock 4096B  btf_id 35
6: percpu_array  name filter_ports  flags 0x0  key 4B  value 8B  max_entries 65536  memlock 1576960B  btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0  key 4B  value 8B  max_entries 10000  memlock 1064960B  btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0  key 16B  value 8B  max_entries 10000  memlock 1142784B  btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0  key 6B  value 8B  max_entries 10000  memlock 1064960B  btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl  loaded_at 2020-10-22T08:04:37-0400  uid 0  xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5  btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals

== Load new BTF-defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl  loaded_at 2020-10-22T08:04:39-0400  uid 0  xlated 80B  jited 71B  memlock 4096B  btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl  loaded_at 2020-10-22T08:04:41-0400  uid 0  xlated 336B  jited 193B  memlock 4096B  map_ids 13  btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off

# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl  loaded_at 2020-10-22T08:04:43-0400  uid 0  xlated 224B  jited 139B  memlock 4096B  map_ids 14  btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals

== Test loading objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B  owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B  owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B  owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0  key 4B  value 4B  max_entries 2  memlock 4096B  owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0  key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 168B  jited 133B  memlock 4096B  map_ids 15  btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 144B  jited 121B  memlock 4096B  map_ids 16  btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 328B  jited 216B  memlock 4096B  map_ids 19,17  btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 328B  jited 216B  memlock 4096B  map_ids 19,17  btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 328B  jited 216B  memlock 4096B  map_ids 19,17  btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 328B  jited 216B  memlock 4096B  map_ids 19,17  btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 336B  jited 218B  memlock 4096B  map_ids 19,18  btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl  loaded_at 2020-10-22T08:04:45-0400  uid 0  xlated 288B  jited 177B  memlock 4096B  map_ids 19  btf_id 90

I also ran the following upstream kselftests with patched iproute2, and all
passed:

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh

Hangbin Liu (5):
  iproute2: add check_libbpf() and get_libbpf_version()
  lib: make ipvrf able to use libbpf and fix function name conflicts
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                | 113 ++++++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  30 +-
 ip/ip.c                                  |  10 +-
 ip/ipvrf.c                               |   6 +-
 lib/Makefile                             |   8 +-
 lib/bpf_glue.c                           |  86 ++++++
 lib/{bpf.c => bpf_legacy.c}              | 193 ++++++++++++-
 lib/bpf_libbpf.c                         | 348 +++++++++++++++++++++++
 tc/tc.c                                  |  10 +-
 19 files changed, 1017 insertions(+), 62 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c
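[Editor's note: as context for the 'sec 42/0' tc filters in the test log
above, the legacy loader populates tail-call prog arrays declaratively by
matching the .id field of a legacy map against ELF section names of the form
'<id>/<slot>'. Below is a sketch in the style of
examples/bpf/legacy/bpf_tailcall.c; it is abridged and illustrative, not the
exact file contents.]

```c
/* Sketch of the legacy tail-call convention. bpf_api.h is iproute2's
 * legacy helper header; the .id field ties the prog array to the
 * "42/<slot>" section names below, and the built-in loader fills the
 * slots before attaching the filter. */
#include <linux/bpf.h>
#include "bpf_api.h"

struct bpf_elf_map __section_maps jmp_tc = {
	.type		= BPF_MAP_TYPE_PROG_ARRAY,
	.id		= 42,		/* matched against "42/<slot>" */
	.size_key	= sizeof(uint32_t),
	.size_value	= sizeof(uint32_t),
	.max_elem	= 2,
};

__section("42/0") int cls_case1(struct __sk_buff *skb)
{
	return 0;	/* TC_ACT_OK */
}

__section("entry") int cls_entry(struct __sk_buff *skb)
{
	tail_call(skb, &jmp_tc, 0);	/* jumps to slot 0 if populated */
	return 0;	/* falls through when the slot is empty */
}

BPF_LICENSE("GPL");
```

This is the declarative population that libbpf does not yet provide, which
is why the cover letter keeps the legacy examples and has the wrapper code
fill such maps before load.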