Message ID | 20221025222802.2295103-1-eddyz87@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Use uapi kernel headers with vmlinux.h | expand |
On Wed, Oct 26, 2022 at 01:27:49AM +0300, Eduard Zingerman wrote: > > Include the following system header: > - /usr/include/sys/socket.h (all via linux/if.h) > > The sys/socket.h conflicts with vmlinux.h in: > - types: struct iovec, struct sockaddr, struct msghdr, ... > - constants: SOCK_STREAM, SOCK_DGRAM, ... > > However, only two types are actually used: > - struct sockaddr > - struct sockaddr_storage (used only in linux/mptcp.h) > > In 'vmlinux.h' this type originates from 'kernel/include/socket.h' > (non UAPI header), thus does not have a header guard. > > The only workaround that I see is to: > - define a stub sys/socket.h as follows: > > #ifndef __BPF_SOCKADDR__ > #define __BPF_SOCKADDR__ > > /* For __kernel_sa_family_t */ > #include <linux/socket.h> > > struct sockaddr { > __kernel_sa_family_t sa_family; > char sa_data[14]; > }; > > #endif > > - hardcode generation of __BPF_SOCKADDR__ bracket for > 'struct sockaddr' in vmlinux.h. we don't need to hack sys/socket.h and can hardcode #ifdef _SYS_SOCKET_H as header guard for sockaddr instead, right? bits/socket.h has this: #ifndef _SYS_SOCKET_H # error "Never include <bits/socket.h> directly; use <sys/socket.h> instead." So that ifdef is kinda stable. > Another possibility is to move the definition of 'struct sockaddr' > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h', > but I expect that this won't fly with the mainline as it might break > the programs that include both 'linux/socket.h' and 'sys/socket.h'. > > Conflict with vmlinux.h > ---- > > Uapi header: > - linux/signal.h > > Conflict with vmlinux.h in definition of 'struct sigaction'. > Defined in: > - vmlinux.h: kernel/include/linux/signal_types.h > - uapi: kernel/arch/x86/include/asm/signal.h > > Uapi headers: > - linux/tipc_sockets_diag.h > - linux/sock_diag.h > > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'. Interesting one! I think we can hard code '#undef SOCK_DESTROY' in vmlinux.h The goal is not to be able to mix arbitrary uapi header with vmlinux.h, but only those that could be useful out of bpf progs. Currently it's tcp.h and few other network related headers because they have #define-s in them that are useful inside bpf progs. As long as the solution covers this small subset we're fine.
On 25/10/2022 23:27, Eduard Zingerman wrote: > Hi BPF community, > > AFAIK there is a long standing feature request to use kernel headers > alongside `vmlinux.h` generated by `bpftool`. For example significant > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > clang, unfortunately this effort was stuck due to concerns regarding C > language semantics. > Thanks for the details! It's a tricky problem to solve. Before diving into this, can I ask if there's another way round this; is there no way we could teach vmlinux.h generation which types to skip via some kind of bootstrapping process? For a bpf object foo.bpf.c that wants to include linux/tcp.h and vmlinux.h, something like this seems possible; 1. run the preprocessor (gcc -E) over the BPF program to generate a bootstrap header foo.bpf.exclude_types.h consisting of all the types mentioned in it and associated includes, but not in vmlinux.h It would need to -D__VMLINUX_H__ to avoid including vmlinux.h definitions and -D__BPF_EXCLUDE_TYPE_BOOTSTRAP__ to skip the programs themselves, which would need a guard around them I think: #include <stddef.h> #include <stdbool.h> #include <vmlinux.h> #include <linux/tcp.h> #ifndef __BPF_EXCLUDE_TYPE_BOOTSTRAP__ //rest of program here #endif So as a result of this, we now have a single header file that contains all the types that non-vmlinux.h include files define. 2. now to generate vmlinux.h, pass foo.bpf.exclude_types.h into "bpftool btf dump" as an exclude header: bpftool btf dump --exclude /tmp/foo.bpf.types.h file /sys/kernel/btf/vmlinux format c > vmlinux.h bpftool would have to parse the exclude header for actual type definitions, spotting struct, enum and typedef definitions. This is likely made easier by running the preprocessor at least since formatting is probably quite uniform. vmlinux.h could simply emit forward declarations for types described both in vmlinux BTF and in the exclude file. So the build process would be - start with empty vmlinux.h - bootstrap a header consisting of the set of types to exclude via c preprocessor - generate new vmlinux.h based on above - build bpf program Build processes for BPF objects already has bootstrapping elements like this for generating vmlinux.h and skeletons, so while it's not perfect it might be a simpler approach. There may be problems with this I'm not seeing though? Thanks! Alan > After some discussion with Alexei and Yonghong I'd like to request > your comments regarding a somewhat brittle and partial solution to > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > the generated `vmlinux.h`. > > The basic idea > --- > > The goal of the patch set is to allow usage of header files from > `include/uapi` alongside `vmlinux.h` as follows: > > #include <uapi/linux/tcp.h> > #include "vmlinux.h" > > This goal is achieved by adding `#ifndef ... #endif` guards in > `vmlinux.h` around definitions that originate from the `include/uapi` > headers. The guards emitted match the guards used in the original > headers. E.g. as follows: > > include/uapi/linux/tcp.h: > > #ifndef _UAPI_LINUX_TCP_H > #define _UAPI_LINUX_TCP_H > ... > union tcp_word_hdr { > struct tcphdr hdr; > __be32 words[5]; > }; > ... > #endif /* _UAPI_LINUX_TCP_H */ > > vmlinux.h: > > ... > #ifndef _UAPI_LINUX_TCP_H > > union tcp_word_hdr { > struct tcphdr hdr; > __be32 words[5]; > }; > > #endif /* _UAPI_LINUX_TCP_H */ > ... > > To get to this state the following steps are necessary: > - "header guard" name should be identified for each header file; > - the correspondence between data type and it's header guard has to be > encoded in BTF; > - `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif` > brackets. > > It is not possible to identify header guard names for all uapi headers > basing only on the file name. However a simple script could devised to > identify the guards basing on the file name and it's content. Thus it > is possible to obtain the list of header names with corresponding > header guards. > > The correspondence between type and it's declaration file (header) is > available in DWARF as `DW_AT_decl_file` attribute. The > `DW_AT_decl_file` can be matched with the list of header guards > described above to obtain the header guard name for a specific type. > > The `pahole` generates BTF using DWARF. It is possible to modify > `pahole` to accept the header guards list as an additional parameter > and to encode the header guard names in BTF. > > Implementation details > --- > > Present patch-set implements these ideas as follows: > - A parameter `--header_guards_db` is added to `pahole`. If present it > points to a file with a list of `<header> <guard>` records. > - `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header > guard for each type emitted to BTF. If header guard is present it is > encoded alongside the type. > - Header guards are encoded in BTF as `BTF_DECL_TAG` records with a > special prefix. The prefix "header_guard:" is added to a value of > such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format > changes). > - A special script `infer_header_guards.pl` is added as a part of > kbuild, it can infer header guard names for each UAPI header basing > on the header content. > - This script is invoked from `link-vmlinux.sh` prior to BTF > generation during kernel build. The output of the script is saved to > a file, the file is passed to `pahole` as `--header_guards_db` > parameter. > - `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each > type and to emit `#ifndef FOO_H ... #endif` brackets when > "header_guard:" tag is present for a type. > > Details for each patch in a set: > - libbpf: Deduplicate unambigous standalone forward declarations > - selftests/bpf: Tests for standalone forward BTF declarations deduplication > > There is a small number (63 for defconfig) of forward declarations > that are not de-duplicated with the main type declaration under > certain conditions. This hinders the header guard brackets > generation. This patch addresses this de-duplication issue. > > - libbpf: Support for BTF_DECL_TAG dump in C format > - selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > Currently libbpf does not process BTF_DECL_TAG when btf is dumped in > C format. This patch adds a hash table matching btf type ids with a > list of decl tags to the struct btf_dump. > The `btf_dump_emit_decl_tags` is not necessary for the overall > patch-set to function but simplifies testing a bit. > > - libbpf: Header guards for selected data structures in vmlinux.h > - selftests/bpf: Tests for header guards printing in BTF dump > > Adds option `emit_header_guards` to `struct btf_dump_opts`. > When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif` > brackets around types for which header guard information is present > in BTF. > > - bpftool: Enable header guards generation > > Unconditionally enables `emit_header_guards` for BTF dump in C format. > > - kbuild: Script to infer header guard values for uapi headers > - kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > Adds `scripts/infer_header_guards.pl` and integrates it with > `link-vmlinux.sh`. > > - selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > Adds a script `test_uapi_headers.py` that tests header guards with > vmlinux.h by compiling a simple C snippet. The snippet looks as > follows: > > #include <some_uapi_header.h> > #include "vmlinux.h" > > __attribute__((section("tc"), used)) > int syncookie_tc(struct __sk_buff *skb) { return 0; } > > The list of headers to test comes from > `tools/testing/selftests/bpf/good_uapi_headers.txt`. > > - selftests/bpf: Known good uapi headers for test_uapi_headers.py > > The list of uapi headers that could be included alongside vmlinux.h. > The headers are peeked from the following locations: > - <headers-export-dir>/linux/*.h > - <headers-export-dir>/linux/**/*.h > This choice of locations is somewhat arbitrary. > > - selftests/bpf: script for infer_header_guards.pl testing > > The test case for `scripts/infer_header_guards.pl`, verifies that > header guards can be inferred for all uapi headers. > > - There is also a patch for dwarves that adds `--header_guards_db` > option (see [2]). > > The `test_uapi_headers.py` is important as it demonstrates the > the necessary compiler flags: > > clang ... \ > -D__x86_64__ \ > -Xclang -fwchar-type=short \ > -Xclang -fno-signed-wchar \ > -I{exported_kernel_headers}/include/ \ > ... > > - `-fwchar-type=short` and `-fno-signed-wchar` had to be added because > BPF target uses `int` for `wchar_t` by default and this differs from > `vmlinux.h` definition of the type (at least for x86_64). > - `__x86_64__` had to be added for uapi headers that include > `stddef.h` (the one that is supplied my CLANG itself), in order to > define correct sizes for `size_t` and `ptrdiff_t`. > - The `{exported_kernel_headers}` stands for exported kernel headers > directory (the headers obtained by `make headers_install` or via > distribution package). > > When it works > --- > > The mechanics described above works for a significant number of UAPI > headers. For example, for the test case above I chose the headers from > the following locations: > - linux/*.h > - linux/**/*.h > There are 759 such headers and for 677 of them the test described > above passes. > > I excluded the headers from the following sub-directories as > potentially not interesting: > > asm rdma video xen > asm-generic misc scsi > drm mtd sound > > Thus saving some time for both discussion and CI but the choice is > somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all > headers) test passes for 834 out of 972 headers. > > When it breaks > --- > > There several scenarios when this mechanics breaks. > Specifically I found the following cases: > - When uapi header includes some system header that conflicts with > vmlinux.h. > - When uapi header itself conflicts with vmlinux.h. > > Below are examples for both cases. > > Conflict with system headers > ---- > > The following uapi headers: > - linux/atmbr2684.h > - linux/bpfilter.h > - linux/gsmmux.h > - linux/icmp.h > - linux/if.h > - linux/if_arp.h > - linux/if_bonding.h > - linux/if_pppox.h > - linux/if_tunnel.h > - linux/ip6_tunnel.h > - linux/llc.h > - linux/mctp.h > - linux/mptcp.h > - linux/netdevice.h > - linux/netfilter/xt_RATEEST.h > - linux/netfilter/xt_hashlimit.h > - linux/netfilter/xt_physdev.h > - linux/netfilter/xt_rateest.h > - linux/netfilter_arp/arp_tables.h > - linux/netfilter_arp/arpt_mangle.h > - linux/netfilter_bridge.h > - linux/netfilter_bridge/ebtables.h > - linux/netfilter_ipv4/ip_tables.h > - linux/netfilter_ipv6/ip6_tables.h > - linux/route.h > - linux/wireless.h > > Include the following system header: > - /usr/include/sys/socket.h (all via linux/if.h) > > The sys/socket.h conflicts with vmlinux.h in: > - types: struct iovec, struct sockaddr, struct msghdr, ... > - constants: SOCK_STREAM, SOCK_DGRAM, ... > > However, only two types are actually used: > - struct sockaddr > - struct sockaddr_storage (used only in linux/mptcp.h) > > In 'vmlinux.h' this type originates from 'kernel/include/socket.h' > (non UAPI header), thus does not have a header guard. > > The only workaround that I see is to: > - define a stub sys/socket.h as follows: > > #ifndef __BPF_SOCKADDR__ > #define __BPF_SOCKADDR__ > > /* For __kernel_sa_family_t */ > #include <linux/socket.h> > > struct sockaddr { > __kernel_sa_family_t sa_family; > char sa_data[14]; > }; > > #endif > > - hardcode generation of __BPF_SOCKADDR__ bracket for > 'struct sockaddr' in vmlinux.h. > > Another possibility is to move the definition of 'struct sockaddr' > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h', > but I expect that this won't fly with the mainline as it might break > the programs that include both 'linux/socket.h' and 'sys/socket.h'. > > Conflict with vmlinux.h > ---- > > Uapi header: > - linux/signal.h > > Conflict with vmlinux.h in definition of 'struct sigaction'. > Defined in: > - vmlinux.h: kernel/include/linux/signal_types.h > - uapi: kernel/arch/x86/include/asm/signal.h > > Uapi headers: > - linux/tipc_sockets_diag.h > - linux/sock_diag.h > > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'. > Defined in: > - vmlinux.h: kernel/include/net/sock.h > - uapi: kernel/include/uapi/linux/sock_diag.h > Constants seem to be unrelated. > > And so on... I have details for many other headers but omit those for > brevity. > > In conclusion > --- > > Except from the general feasibility I have a few questions: > - What UAPI headers are the candidates for such use? If there are some > interesting headers currently not working with this patch-set some > hacks have to be added (e.g. like with `linux/if.h`). > - Is it ok to encode header guards as special `BTF_DECL_TAG` or should > I change the BTF format a bit to save some bytes. > > Thanks, > Eduard > > > [1] https://reviews.llvm.org/D111307 > [clang] __attribute__ bpf_dominating_decl > [2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@gmail.com/T/ > [RFC dwarves] pahole: Save header guard names when > --header_guards_db is passed > > Eduard Zingerman (12): > libbpf: Deduplicate unambigous standalone forward declarations > selftests/bpf: Tests for standalone forward BTF declarations > deduplication > libbpf: Support for BTF_DECL_TAG dump in C format > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > libbpf: Header guards for selected data structures in vmlinux.h > selftests/bpf: Tests for header guards printing in BTF dump > bpftool: Enable header guards generation > kbuild: Script to infer header guard values for uapi headers > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > selftests/bpf: Known good uapi headers for test_uapi_headers.py > selftests/bpf: script for infer_header_guards.pl testing > > scripts/infer_header_guards.pl | 191 +++++ > scripts/link-vmlinux.sh | 13 +- > tools/bpf/bpftool/btf.c | 4 +- > tools/lib/bpf/btf.c | 178 ++++- > tools/lib/bpf/btf.h | 7 +- > tools/lib/bpf/btf_dump.c | 232 +++++- > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > .../bpf/test_uapi_header_guards_infer.sh | 33 + > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > 13 files changed, 1816 insertions(+), 12 deletions(-) > create mode 100755 scripts/infer_header_guards.pl > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py >
On Tue, 2022-10-25 at 16:46 -0700, Alexei Starovoitov wrote: > On Wed, Oct 26, 2022 at 01:27:49AM +0300, Eduard Zingerman wrote: > > > > Include the following system header: > > - /usr/include/sys/socket.h (all via linux/if.h) > > > > The sys/socket.h conflicts with vmlinux.h in: > > - types: struct iovec, struct sockaddr, struct msghdr, ... > > - constants: SOCK_STREAM, SOCK_DGRAM, ... > > > > However, only two types are actually used: > > - struct sockaddr > > - struct sockaddr_storage (used only in linux/mptcp.h) > > > > In 'vmlinux.h' this type originates from 'kernel/include/socket.h' > > (non UAPI header), thus does not have a header guard. > > > > The only workaround that I see is to: > > - define a stub sys/socket.h as follows: > > > > #ifndef __BPF_SOCKADDR__ > > #define __BPF_SOCKADDR__ > > > > /* For __kernel_sa_family_t */ > > #include <linux/socket.h> > > > > struct sockaddr { > > __kernel_sa_family_t sa_family; > > char sa_data[14]; > > }; > > > > #endif > > > > - hardcode generation of __BPF_SOCKADDR__ bracket for > > 'struct sockaddr' in vmlinux.h. > > we don't need to hack sys/socket.h and can hardcode > #ifdef _SYS_SOCKET_H as header guard for sockaddr instead, right? > bits/socket.h has this: > #ifndef _SYS_SOCKET_H > # error "Never include <bits/socket.h> directly; use <sys/socket.h> instead." > > So that ifdef is kinda stable. The `if.h` only uses two types from `sys/socket.h`, namely: - `struct sockaddr` - `struct sockaddr_storage` However `sys/socket.h` itself defines more types, here is a complete list of types from `sys/socket.h` that conflict with `vmlinux.h` (generated for my x86_64 laptop): Type name System header iovec /usr/include/bits/types/struct_iovec.h loff_t /usr/include/sys/types.h dev_t /usr/include/sys/types.h nlink_t /usr/include/sys/types.h timer_t /usr/include/bits/types/timer_t.h int16_t /usr/include/bits/stdint-intn.h int32_t /usr/include/bits/stdint-intn.h int64_t /usr/include/bits/stdint-intn.h u_int64_t /usr/include/sys/types.h sigset_t /usr/include/bits/types/sigset_t.h fd_set /usr/include/sys/select.h blkcnt_t /usr/include/sys/types.h SOCK_STREAM /usr/include/bits/socket_type.h SOCK_DGRAM /usr/include/bits/socket_type.h SOCK_RAW /usr/include/bits/socket_type.h SOCK_RDM /usr/include/bits/socket_type.h SOCK_SEQPACKET /usr/include/bits/socket_type.h SOCK_DCCP /usr/include/bits/socket_type.h SOCK_PACKET /usr/include/bits/socket_type.h sockaddr /usr/include/bits/socket.h msghdr /usr/include/bits/socket.h cmsghdr /usr/include/bits/socket.h linger /usr/include/bits/socket.h SHUT_RD /usr/include/sys/socket.h SHUT_WR /usr/include/sys/socket.h SHUT_RDWR /usr/include/sys/socket.h It would be safe to wrap the corresponding types in the vmlinux.h with _SYS_SOCKET_H / _SYS_TYPES guards if the definitions above match between libc and kernel. To my surprise not all of them match. Here is the list of genuine conflicts (for typedefs I skip the intermediate definitions and print the last typedef in the chain): Type: dev_t typedef unsigned int __u32 vmlinux.h typedef unsigned long int __dev_t /usr/include/bits/types.h Type: nlink_t typedef unsigned int __u32 vmlinux.h typedef unsigned long int __nlink_t /usr/include/bits/types.h Type: timer_t typedef int __kernel_timer _t vmlinux.h typedef void *__timer_t /usr/include/bits/types.h Type: sigset_t typedef struct vmlinux.h { long unsigned int sig[1]; } sigset_t typedef struct /usr/include/bits/types/__sigset_t.h { unsigned long int __val[1024 / (8 * (sizeof(unsigned long int)))]; } __sigset_t Type: fd_set typedef struct vmlinux.h { long unsigned int fds_bits[16]; } __kernel_fd_set typedef struct /usr/include/sys/select.h { __fd_mask __fds_bits[1024 / (8 * ((int) (sizeof(__fd_mask))))]; } fd_set Type: sigset_t typedef struct vmlinux.h { long unsigned int sig[1]; } sigset_t typedef struct /usr/include/bits/types/__sigset_t.h { unsigned long int __val[1024 / (8 * (sizeof(unsigned long int)))]; } __sigset_t Type: msghdr struct msghdr vmlinux.h { void *msg_name; int msg_namelen; int msg_inq; struct iov_iter msg_iter; union { void *msg_control; void *msg_control_user; }; bool msg_control_is_user : 1; bool msg_get_inq : 1; unsigned int msg_flags; __kernel_size_t msg_controllen; struct kiocb *msg_iocb; struct ubuf_info *msg_ubuf; struct sock *, struct sk_buff *, struct iov_iter *, size_tint; } struct msghdr /usr/include/bits/socket.h { void *msg_name; socklen_t msg_namelen; struct iovec *msg_iov; size_t msg_iovlen; void *msg_control; size_t msg_controllen; int msg_flags; } > > > Another possibility is to move the definition of 'struct sockaddr' > > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h', > > but I expect that this won't fly with the mainline as it might break > > the programs that include both 'linux/socket.h' and 'sys/socket.h'. > > > > Conflict with vmlinux.h > > ---- > > > > Uapi header: > > - linux/signal.h > > > > Conflict with vmlinux.h in definition of 'struct sigaction'. > > Defined in: > > - vmlinux.h: kernel/include/linux/signal_types.h > > - uapi: kernel/arch/x86/include/asm/signal.h > > > > Uapi headers: > > - linux/tipc_sockets_diag.h > > - linux/sock_diag.h > > > > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'. > > Interesting one! > I think we can hard code '#undef SOCK_DESTROY' in vmlinux.h > > The goal is not to be able to mix arbitrary uapi header with > vmlinux.h, but only those that could be useful out of bpf progs. > Currently it's tcp.h and few other network related headers > because they have #define-s in them that are useful inside bpf progs. > As long as the solution covers this small subset we're fine. Well, tcp.h works :) It would be great if someone could list the interesting headers. Thanks, Eduard
On Wed, 2022-10-26 at 12:10 +0100, Alan Maguire wrote: > On 25/10/2022 23:27, Eduard Zingerman wrote: > > Hi BPF community, > > > > AFAIK there is a long standing feature request to use kernel headers > > alongside `vmlinux.h` generated by `bpftool`. For example significant > > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > > clang, unfortunately this effort was stuck due to concerns regarding C > > language semantics. > > > > Thanks for the details! It's a tricky problem to solve. > > Before diving into this, can I ask if there's another way round this; > is there no way we could teach vmlinux.h generation which types to > skip via some kind of bootstrapping process? > > For a bpf object foo.bpf.c that wants to include linux/tcp.h and > vmlinux.h, something like this seems possible; > > 1. run the preprocessor (gcc -E) over the BPF program to generate > a bootstrap header foo.bpf.exclude_types.h consisting of all the types mentioned > in it and associated includes, but not in vmlinux.h It would need to -D__VMLINUX_H__ > to avoid including vmlinux.h definitions and -D__BPF_EXCLUDE_TYPE_BOOTSTRAP__ to > skip the programs themselves, which would need a guard around them I think: > > #include <stddef.h> > #include <stdbool.h> > #include <vmlinux.h> > #include <linux/tcp.h> > > #ifndef __BPF_EXCLUDE_TYPE_BOOTSTRAP__ > //rest of program here > #endif > > So as a result of this, we now have a single header file that contains all the types > that non-vmlinux.h include files define. > > 2. now to generate vmlinux.h, pass foo.bpf.exclude_types.h into "bpftool btf dump" as an > exclude header: > > bpftool btf dump --exclude /tmp/foo.bpf.types.h file /sys/kernel/btf/vmlinux format c > vmlinux.h > > bpftool would have to parse the exclude header for actual type definitions, spotting struct, > enum and typedef definitions. This is likely made easier by running the preprocessor > at least since formatting is probably quite uniform. vmlinux.h could simply emit forward declarations > for types described both in vmlinux BTF and in the exclude file. > > So the build process would be > > - start with empty vmlinux.h > - bootstrap a header consisting of the set of types to exclude via c preprocessor > - generate new vmlinux.h based on above > - build bpf program > > Build processes for BPF objects already has bootstrapping elements like > this for generating vmlinux.h and skeletons, so while it's not perfect it might > be a simpler approach. There may be problems with this I'm not seeing though? I like the tool based approach more but I heard there were some reservations about separate tool complicating the build process. - the tool does not require changes to the kbuild; - the tool is not limited to uapi headers; - the tool could imitate the dominating declarations attribute and address the issue with definitions miss-match between vmlinux.h and system headers (see my reply to Alexei in the parallel sub-thread). To support this point it would have to work in a somewhat different order: - read/pre-process the input file; - remove from it the definitions that are also present in kernel BTF; - prepend vmlinux.h to the beginning of the file. On the other hand I think that implementation would need a C parser / type analysis step. Thus it would be harder to implement it as a part of bpftool. But the prototype using something like [1] should be simple. Thanks, Eduard [1] https://github.com/inducer/pycparserext > > Thanks! > > Alan > > > After some discussion with Alexei and Yonghong I'd like to request > > your comments regarding a somewhat brittle and partial solution to > > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > > the generated `vmlinux.h`. > > > > The basic idea > > --- > > > > The goal of the patch set is to allow usage of header files from > > `include/uapi` alongside `vmlinux.h` as follows: > > > > #include <uapi/linux/tcp.h> > > #include "vmlinux.h" > > > > This goal is achieved by adding `#ifndef ... #endif` guards in > > `vmlinux.h` around definitions that originate from the `include/uapi` > > headers. The guards emitted match the guards used in the original > > headers. E.g. as follows: > > > > include/uapi/linux/tcp.h: > > > > #ifndef _UAPI_LINUX_TCP_H > > #define _UAPI_LINUX_TCP_H > > ... > > union tcp_word_hdr { > > struct tcphdr hdr; > > __be32 words[5]; > > }; > > ... > > #endif /* _UAPI_LINUX_TCP_H */ > > > > vmlinux.h: > > > > ... > > #ifndef _UAPI_LINUX_TCP_H > > > > union tcp_word_hdr { > > struct tcphdr hdr; > > __be32 words[5]; > > }; > > > > #endif /* _UAPI_LINUX_TCP_H */ > > ... > > > > To get to this state the following steps are necessary: > > - "header guard" name should be identified for each header file; > > - the correspondence between data type and it's header guard has to be > > encoded in BTF; > > - `bpftool` should be adjusted to emit `#ifndef FOO_H ... #endif` > > brackets. > > > > It is not possible to identify header guard names for all uapi headers > > basing only on the file name. However a simple script could devised to > > identify the guards basing on the file name and it's content. Thus it > > is possible to obtain the list of header names with corresponding > > header guards. > > > > The correspondence between type and it's declaration file (header) is > > available in DWARF as `DW_AT_decl_file` attribute. The > > `DW_AT_decl_file` can be matched with the list of header guards > > described above to obtain the header guard name for a specific type. > > > > The `pahole` generates BTF using DWARF. It is possible to modify > > `pahole` to accept the header guards list as an additional parameter > > and to encode the header guard names in BTF. > > > > Implementation details > > --- > > > > Present patch-set implements these ideas as follows: > > - A parameter `--header_guards_db` is added to `pahole`. If present it > > points to a file with a list of `<header> <guard>` records. > > - `pahole` uses DWARF `DW_AT_decl_file` value to lookup the header > > guard for each type emitted to BTF. If header guard is present it is > > encoded alongside the type. > > - Header guards are encoded in BTF as `BTF_DECL_TAG` records with a > > special prefix. The prefix "header_guard:" is added to a value of > > such tags. (Here `BTF_DECL_TAG` is used to avoid BTF binary format > > changes). > > - A special script `infer_header_guards.pl` is added as a part of > > kbuild, it can infer header guard names for each UAPI header basing > > on the header content. > > - This script is invoked from `link-vmlinux.sh` prior to BTF > > generation during kernel build. The output of the script is saved to > > a file, the file is passed to `pahole` as `--header_guards_db` > > parameter. > > - `libbpf` is modified to aggregate `BTF_DECL_TAG` records for each > > type and to emit `#ifndef FOO_H ... #endif` brackets when > > "header_guard:" tag is present for a type. > > > > Details for each patch in a set: > > - libbpf: Deduplicate unambigous standalone forward declarations > > - selftests/bpf: Tests for standalone forward BTF declarations deduplication > > > > There is a small number (63 for defconfig) of forward declarations > > that are not de-duplicated with the main type declaration under > > certain conditions. This hinders the header guard brackets > > generation. This patch addresses this de-duplication issue. > > > > - libbpf: Support for BTF_DECL_TAG dump in C format > > - selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > > > Currently libbpf does not process BTF_DECL_TAG when btf is dumped in > > C format. This patch adds a hash table matching btf type ids with a > > list of decl tags to the struct btf_dump. > > The `btf_dump_emit_decl_tags` is not necessary for the overall > > patch-set to function but simplifies testing a bit. > > > > - libbpf: Header guards for selected data structures in vmlinux.h > > - selftests/bpf: Tests for header guards printing in BTF dump > > > > Adds option `emit_header_guards` to `struct btf_dump_opts`. > > When enabled the `btf_dump__dump_type` prints `#ifndef ... #endif` > > brackets around types for which header guard information is present > > in BTF. > > > > - bpftool: Enable header guards generation > > > > Unconditionally enables `emit_header_guards` for BTF dump in C format. > > > > - kbuild: Script to infer header guard values for uapi headers > > - kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > > > Adds `scripts/infer_header_guards.pl` and integrates it with > > `link-vmlinux.sh`. > > > > - selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > > > Adds a script `test_uapi_headers.py` that tests header guards with > > vmlinux.h by compiling a simple C snippet. The snippet looks as > > follows: > > > > #include <some_uapi_header.h> > > #include "vmlinux.h" > > > > __attribute__((section("tc"), used)) > > int syncookie_tc(struct __sk_buff *skb) { return 0; } > > > > The list of headers to test comes from > > `tools/testing/selftests/bpf/good_uapi_headers.txt`. > > > > - selftests/bpf: Known good uapi headers for test_uapi_headers.py > > > > The list of uapi headers that could be included alongside vmlinux.h. > > The headers are peeked from the following locations: > > - <headers-export-dir>/linux/*.h > > - <headers-export-dir>/linux/**/*.h > > This choice of locations is somewhat arbitrary. > > > > - selftests/bpf: script for infer_header_guards.pl testing > > > > The test case for `scripts/infer_header_guards.pl`, verifies that > > header guards can be inferred for all uapi headers. > > > > - There is also a patch for dwarves that adds `--header_guards_db` > > option (see [2]). > > > > The `test_uapi_headers.py` is important as it demonstrates the > > the necessary compiler flags: > > > > clang ... \ > > -D__x86_64__ \ > > -Xclang -fwchar-type=short \ > > -Xclang -fno-signed-wchar \ > > -I{exported_kernel_headers}/include/ \ > > ... > > > > - `-fwchar-type=short` and `-fno-signed-wchar` had to be added because > > BPF target uses `int` for `wchar_t` by default and this differs from > > `vmlinux.h` definition of the type (at least for x86_64). > > - `__x86_64__` had to be added for uapi headers that include > > `stddef.h` (the one that is supplied my CLANG itself), in order to > > define correct sizes for `size_t` and `ptrdiff_t`. > > - The `{exported_kernel_headers}` stands for exported kernel headers > > directory (the headers obtained by `make headers_install` or via > > distribution package). > > > > When it works > > --- > > > > The mechanics described above works for a significant number of UAPI > > headers. For example, for the test case above I chose the headers from > > the following locations: > > - linux/*.h > > - linux/**/*.h > > There are 759 such headers and for 677 of them the test described > > above passes. > > > > I excluded the headers from the following sub-directories as > > potentially not interesting: > > > > asm rdma video xen > > asm-generic misc scsi > > drm mtd sound > > > > Thus saving some time for both discussion and CI but the choice is > > somewhat arbitrary. If I run `test_uapi_headers.py --test '*'` (all > > headers) test passes for 834 out of 972 headers. > > > > When it breaks > > --- > > > > There several scenarios when this mechanics breaks. > > Specifically I found the following cases: > > - When uapi header includes some system header that conflicts with > > vmlinux.h. > > - When uapi header itself conflicts with vmlinux.h. > > > > Below are examples for both cases. > > > > Conflict with system headers > > ---- > > > > The following uapi headers: > > - linux/atmbr2684.h > > - linux/bpfilter.h > > - linux/gsmmux.h > > - linux/icmp.h > > - linux/if.h > > - linux/if_arp.h > > - linux/if_bonding.h > > - linux/if_pppox.h > > - linux/if_tunnel.h > > - linux/ip6_tunnel.h > > - linux/llc.h > > - linux/mctp.h > > - linux/mptcp.h > > - linux/netdevice.h > > - linux/netfilter/xt_RATEEST.h > > - linux/netfilter/xt_hashlimit.h > > - linux/netfilter/xt_physdev.h > > - linux/netfilter/xt_rateest.h > > - linux/netfilter_arp/arp_tables.h > > - linux/netfilter_arp/arpt_mangle.h > > - linux/netfilter_bridge.h > > - linux/netfilter_bridge/ebtables.h > > - linux/netfilter_ipv4/ip_tables.h > > - linux/netfilter_ipv6/ip6_tables.h > > - linux/route.h > > - linux/wireless.h > > > > Include the following system header: > > - /usr/include/sys/socket.h (all via linux/if.h) > > > > The sys/socket.h conflicts with vmlinux.h in: > > - types: struct iovec, struct sockaddr, struct msghdr, ... > > - constants: SOCK_STREAM, SOCK_DGRAM, ... > > > > However, only two types are actually used: > > - struct sockaddr > > - struct sockaddr_storage (used only in linux/mptcp.h) > > > > In 'vmlinux.h' this type originates from 'kernel/include/socket.h' > > (non UAPI header), thus does not have a header guard. > > > > The only workaround that I see is to: > > - define a stub sys/socket.h as follows: > > > > #ifndef __BPF_SOCKADDR__ > > #define __BPF_SOCKADDR__ > > > > /* For __kernel_sa_family_t */ > > #include <linux/socket.h> > > > > struct sockaddr { > > __kernel_sa_family_t sa_family; > > char sa_data[14]; > > }; > > > > #endif > > > > - hardcode generation of __BPF_SOCKADDR__ bracket for > > 'struct sockaddr' in vmlinux.h. > > > > Another possibility is to move the definition of 'struct sockaddr' > > from 'kernel/include/socket.h' to 'kernel/include/uapi/linux/socket.h', > > but I expect that this won't fly with the mainline as it might break > > the programs that include both 'linux/socket.h' and 'sys/socket.h'. > > > > Conflict with vmlinux.h > > ---- > > > > Uapi header: > > - linux/signal.h > > > > Conflict with vmlinux.h in definition of 'struct sigaction'. > > Defined in: > > - vmlinux.h: kernel/include/linux/signal_types.h > > - uapi: kernel/arch/x86/include/asm/signal.h > > > > Uapi headers: > > - linux/tipc_sockets_diag.h > > - linux/sock_diag.h > > > > Conflict with vmlinux.h in definition of 'SOCK_DESTROY'. > > Defined in: > > - vmlinux.h: kernel/include/net/sock.h > > - uapi: kernel/include/uapi/linux/sock_diag.h > > Constants seem to be unrelated. > > > > And so on... I have details for many other headers but omit those for > > brevity. > > > > In conclusion > > --- > > > > Except from the general feasibility I have a few questions: > > - What UAPI headers are the candidates for such use? If there are some > > interesting headers currently not working with this patch-set some > > hacks have to be added (e.g. like with `linux/if.h`). > > - Is it ok to encode header guards as special `BTF_DECL_TAG` or should > > I change the BTF format a bit to save some bytes. > > > > Thanks, > > Eduard > > > > > > [1] https://reviews.llvm.org/D111307 > > [clang] __attribute__ bpf_dominating_decl > > [2] https://lore.kernel.org/dwarves/20221025220729.2293891-1-eddyz87@gmail.com/T/ > > [RFC dwarves] pahole: Save header guard names when > > --header_guards_db is passed > > > > Eduard Zingerman (12): > > libbpf: Deduplicate unambigous standalone forward declarations > > selftests/bpf: Tests for standalone forward BTF declarations > > deduplication > > libbpf: Support for BTF_DECL_TAG dump in C format > > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > libbpf: Header guards for selected data structures in vmlinux.h > > selftests/bpf: Tests for header guards printing in BTF dump > > bpftool: Enable header guards generation > > kbuild: Script to infer header guard values for uapi headers > > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > selftests/bpf: Known good uapi headers for test_uapi_headers.py > > selftests/bpf: script for infer_header_guards.pl testing > > > > scripts/infer_header_guards.pl | 191 +++++ > > scripts/link-vmlinux.sh | 13 +- > > tools/bpf/bpftool/btf.c | 4 +- > > tools/lib/bpf/btf.c | 178 ++++- > > tools/lib/bpf/btf.h | 7 +- > > tools/lib/bpf/btf_dump.c | 232 +++++- > > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > > .../bpf/test_uapi_header_guards_infer.sh | 33 + > > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > > 13 files changed, 1816 insertions(+), 12 deletions(-) > > create mode 100755 scripts/infer_header_guards.pl > > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > >
On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > Hi BPF community, > > AFAIK there is a long standing feature request to use kernel headers > alongside `vmlinux.h` generated by `bpftool`. For example significant > effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > clang, unfortunately this effort was stuck due to concerns regarding C > language semantics. > Maybe we should make another attempt to implement bpf_dominating_decl? That seems like a more elegant solution than any other implemented or discussed alternative. Yonghong, WDYT? BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a separate series and sending them as non-RFC sooner. Those improvements are independent of all the header guards stuff, let's get them landed sooner. > After some discussion with Alexei and Yonghong I'd like to request > your comments regarding a somewhat brittle and partial solution to > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > the generated `vmlinux.h`. > [...] > Eduard Zingerman (12): > libbpf: Deduplicate unambigous standalone forward declarations > selftests/bpf: Tests for standalone forward BTF declarations > deduplication > libbpf: Support for BTF_DECL_TAG dump in C format > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > libbpf: Header guards for selected data structures in vmlinux.h > selftests/bpf: Tests for header guards printing in BTF dump > bpftool: Enable header guards generation > kbuild: Script to infer header guard values for uapi headers > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > selftests/bpf: Known good uapi headers for test_uapi_headers.py > selftests/bpf: script for infer_header_guards.pl testing > > scripts/infer_header_guards.pl | 191 +++++ > scripts/link-vmlinux.sh | 13 +- > tools/bpf/bpftool/btf.c | 4 +- > tools/lib/bpf/btf.c | 178 ++++- > tools/lib/bpf/btf.h | 7 +- > tools/lib/bpf/btf_dump.c | 232 +++++- > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > .../bpf/test_uapi_header_guards_infer.sh | 33 + > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > 13 files changed, 1816 insertions(+), 12 deletions(-) > create mode 100755 scripts/infer_header_guards.pl > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > > -- > 2.34.1 >
On 10/27/22 4:14 PM, Andrii Nakryiko wrote: > On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote: >> >> Hi BPF community, >> >> AFAIK there is a long standing feature request to use kernel headers >> alongside `vmlinux.h` generated by `bpftool`. For example significant >> effort was put to add an attribute `bpf_dominating_decl` (see [1]) to >> clang, unfortunately this effort was stuck due to concerns regarding C >> language semantics. >> > > Maybe we should make another attempt to implement bpf_dominating_decl? > That seems like a more elegant solution than any other implemented or > discussed alternative. Yonghong, WDYT? I would say it would be very difficult for upstream to agree with bpf_dominating_decl. We already have lots of discussions and we likely won't be able to satisfy Aaron who wants us to emit adequate diagnostics which will involve lots of other work and he also thinks this is too far away from C standard and he wants us to implement in a llvm/clang tool which is not what we want. > > BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a > separate series and sending them as non-RFC sooner. Those improvements > are independent of all the header guards stuff, let's get them landed > sooner. > >> After some discussion with Alexei and Yonghong I'd like to request >> your comments regarding a somewhat brittle and partial solution to >> this issue that relies on adding `#ifndef FOO_H ... #endif` guards in >> the generated `vmlinux.h`. >> > > [...] > >> Eduard Zingerman (12): >> libbpf: Deduplicate unambigous standalone forward declarations >> selftests/bpf: Tests for standalone forward BTF declarations >> deduplication >> libbpf: Support for BTF_DECL_TAG dump in C format >> selftests/bpf: Tests for BTF_DECL_TAG dump in C format >> libbpf: Header guards for selected data structures in vmlinux.h >> selftests/bpf: Tests for header guards printing in BTF dump >> bpftool: Enable header guards generation >> kbuild: Script to infer header guard values for uapi headers >> kbuild: Header guards for types from include/uapi/*.h in kernel BTF >> selftests/bpf: Script to verify uapi headers usage with vmlinux.h >> selftests/bpf: Known good uapi headers for test_uapi_headers.py >> selftests/bpf: script for infer_header_guards.pl testing >> >> scripts/infer_header_guards.pl | 191 +++++ >> scripts/link-vmlinux.sh | 13 +- >> tools/bpf/bpftool/btf.c | 4 +- >> tools/lib/bpf/btf.c | 178 ++++- >> tools/lib/bpf/btf.h | 7 +- >> tools/lib/bpf/btf_dump.c | 232 +++++- >> .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ >> tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ >> .../selftests/bpf/prog_tests/btf_dump.c | 11 +- >> .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + >> .../progs/btf_dump_test_case_header_guards.c | 94 +++ >> .../bpf/test_uapi_header_guards_infer.sh | 33 + >> .../selftests/bpf/test_uapi_headers.py | 197 +++++ >> 13 files changed, 1816 insertions(+), 12 deletions(-) >> create mode 100755 scripts/infer_header_guards.pl >> create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt >> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c >> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c >> create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh >> create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py >> >> -- >> 2.34.1 >>
On Thu, Oct 27, 2022 at 6:33 PM Yonghong Song <yhs@meta.com> wrote: > > > > On 10/27/22 4:14 PM, Andrii Nakryiko wrote: > > On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > >> > >> Hi BPF community, > >> > >> AFAIK there is a long standing feature request to use kernel headers > >> alongside `vmlinux.h` generated by `bpftool`. For example significant > >> effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > >> clang, unfortunately this effort was stuck due to concerns regarding C > >> language semantics. > >> > > > > Maybe we should make another attempt to implement bpf_dominating_decl? > > That seems like a more elegant solution than any other implemented or > > discussed alternative. Yonghong, WDYT? > > I would say it would be very difficult for upstream to agree with > bpf_dominating_decl. We already have lots of discussions and we > likely won't be able to satisfy Aaron who wants us to emit > adequate diagnostics which will involve lots of other work > and he also thinks this is too far away from C standard and he > wants us to implement in a llvm/clang tool which is not what > we want. Ok, could we change the problem to detecting if some type is defined. Would it be possible to have something like #if !__is_type_defined(struct abc) struct abc { }; #endif I think we talked about this and there were problems with this approach, but I don't remember details and how insurmountable the problem is. Having a way to check whether some type is defined would be very useful even outside of -target bpf parlance, though, so maybe it's the problem worth attacking? > > > > > BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a > > separate series and sending them as non-RFC sooner. Those improvements > > are independent of all the header guards stuff, let's get them landed > > sooner. > > > >> After some discussion with Alexei and Yonghong I'd like to request > >> your comments regarding a somewhat brittle and partial solution to > >> this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > >> the generated `vmlinux.h`. > >> > > > > [...] > > > >> Eduard Zingerman (12): > >> libbpf: Deduplicate unambigous standalone forward declarations > >> selftests/bpf: Tests for standalone forward BTF declarations > >> deduplication > >> libbpf: Support for BTF_DECL_TAG dump in C format > >> selftests/bpf: Tests for BTF_DECL_TAG dump in C format > >> libbpf: Header guards for selected data structures in vmlinux.h > >> selftests/bpf: Tests for header guards printing in BTF dump > >> bpftool: Enable header guards generation > >> kbuild: Script to infer header guard values for uapi headers > >> kbuild: Header guards for types from include/uapi/*.h in kernel BTF > >> selftests/bpf: Script to verify uapi headers usage with vmlinux.h > >> selftests/bpf: Known good uapi headers for test_uapi_headers.py > >> selftests/bpf: script for infer_header_guards.pl testing > >> > >> scripts/infer_header_guards.pl | 191 +++++ > >> scripts/link-vmlinux.sh | 13 +- > >> tools/bpf/bpftool/btf.c | 4 +- > >> tools/lib/bpf/btf.c | 178 ++++- > >> tools/lib/bpf/btf.h | 7 +- > >> tools/lib/bpf/btf_dump.c | 232 +++++- > >> .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > >> tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > >> .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > >> .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > >> .../progs/btf_dump_test_case_header_guards.c | 94 +++ > >> .../bpf/test_uapi_header_guards_infer.sh | 33 + > >> .../selftests/bpf/test_uapi_headers.py | 197 +++++ > >> 13 files changed, 1816 insertions(+), 12 deletions(-) > >> create mode 100755 scripts/infer_header_guards.pl > >> create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > >> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > >> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > >> create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > >> create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > >> > >> -- > >> 2.34.1 > >>
On 10/28/22 10:13 AM, Andrii Nakryiko wrote: > On Thu, Oct 27, 2022 at 6:33 PM Yonghong Song <yhs@meta.com> wrote: >> >> >> >> On 10/27/22 4:14 PM, Andrii Nakryiko wrote: >>> On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote: >>>> >>>> Hi BPF community, >>>> >>>> AFAIK there is a long standing feature request to use kernel headers >>>> alongside `vmlinux.h` generated by `bpftool`. For example significant >>>> effort was put to add an attribute `bpf_dominating_decl` (see [1]) to >>>> clang, unfortunately this effort was stuck due to concerns regarding C >>>> language semantics. >>>> >>> >>> Maybe we should make another attempt to implement bpf_dominating_decl? >>> That seems like a more elegant solution than any other implemented or >>> discussed alternative. Yonghong, WDYT? >> >> I would say it would be very difficult for upstream to agree with >> bpf_dominating_decl. We already have lots of discussions and we >> likely won't be able to satisfy Aaron who wants us to emit >> adequate diagnostics which will involve lots of other work >> and he also thinks this is too far away from C standard and he >> wants us to implement in a llvm/clang tool which is not what >> we want. > > Ok, could we change the problem to detecting if some type is defined. > Would it be possible to have something like > > #if !__is_type_defined(struct abc) > struct abc { > }; > #endif > > I think we talked about this and there were problems with this > approach, but I don't remember details and how insurmountable the > problem is. Having a way to check whether some type is defined would > be very useful even outside of -target bpf parlance, though, so maybe > it's the problem worth attacking? Yes, we discussed this before. This will need to add additional work in preprocessor. I just made a discussion topic in llvm discourse https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 Let us see whether we can get some upstream agreement or not. > >> >>> >>> BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a >>> separate series and sending them as non-RFC sooner. Those improvements >>> are independent of all the header guards stuff, let's get them landed >>> sooner. >>> >>>> After some discussion with Alexei and Yonghong I'd like to request >>>> your comments regarding a somewhat brittle and partial solution to >>>> this issue that relies on adding `#ifndef FOO_H ... #endif` guards in >>>> the generated `vmlinux.h`. >>>> >>> >>> [...] >>> >>>> Eduard Zingerman (12): >>>> libbpf: Deduplicate unambigous standalone forward declarations >>>> selftests/bpf: Tests for standalone forward BTF declarations >>>> deduplication >>>> libbpf: Support for BTF_DECL_TAG dump in C format >>>> selftests/bpf: Tests for BTF_DECL_TAG dump in C format >>>> libbpf: Header guards for selected data structures in vmlinux.h >>>> selftests/bpf: Tests for header guards printing in BTF dump >>>> bpftool: Enable header guards generation >>>> kbuild: Script to infer header guard values for uapi headers >>>> kbuild: Header guards for types from include/uapi/*.h in kernel BTF >>>> selftests/bpf: Script to verify uapi headers usage with vmlinux.h >>>> selftests/bpf: Known good uapi headers for test_uapi_headers.py >>>> selftests/bpf: script for infer_header_guards.pl testing >>>> >>>> scripts/infer_header_guards.pl | 191 +++++ >>>> scripts/link-vmlinux.sh | 13 +- >>>> tools/bpf/bpftool/btf.c | 4 +- >>>> tools/lib/bpf/btf.c | 178 ++++- >>>> tools/lib/bpf/btf.h | 7 +- >>>> tools/lib/bpf/btf_dump.c | 232 +++++- >>>> .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ >>>> tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ >>>> .../selftests/bpf/prog_tests/btf_dump.c | 11 +- >>>> .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + >>>> .../progs/btf_dump_test_case_header_guards.c | 94 +++ >>>> .../bpf/test_uapi_header_guards_infer.sh | 33 + >>>> .../selftests/bpf/test_uapi_headers.py | 197 +++++ >>>> 13 files changed, 1816 insertions(+), 12 deletions(-) >>>> create mode 100755 scripts/infer_header_guards.pl >>>> create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt >>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c >>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c >>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh >>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py >>>> >>>> -- >>>> 2.34.1 >>>>
On Fri, Oct 28, 2022 at 11:57 AM Yonghong Song <yhs@meta.com> wrote: > > > > On 10/28/22 10:13 AM, Andrii Nakryiko wrote: > > On Thu, Oct 27, 2022 at 6:33 PM Yonghong Song <yhs@meta.com> wrote: > >> > >> > >> > >> On 10/27/22 4:14 PM, Andrii Nakryiko wrote: > >>> On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > >>>> > >>>> Hi BPF community, > >>>> > >>>> AFAIK there is a long standing feature request to use kernel headers > >>>> alongside `vmlinux.h` generated by `bpftool`. For example significant > >>>> effort was put to add an attribute `bpf_dominating_decl` (see [1]) to > >>>> clang, unfortunately this effort was stuck due to concerns regarding C > >>>> language semantics. > >>>> > >>> > >>> Maybe we should make another attempt to implement bpf_dominating_decl? > >>> That seems like a more elegant solution than any other implemented or > >>> discussed alternative. Yonghong, WDYT? > >> > >> I would say it would be very difficult for upstream to agree with > >> bpf_dominating_decl. We already have lots of discussions and we > >> likely won't be able to satisfy Aaron who wants us to emit > >> adequate diagnostics which will involve lots of other work > >> and he also thinks this is too far away from C standard and he > >> wants us to implement in a llvm/clang tool which is not what > >> we want. > > > > Ok, could we change the problem to detecting if some type is defined. > > Would it be possible to have something like > > > > #if !__is_type_defined(struct abc) > > struct abc { > > }; > > #endif > > > > I think we talked about this and there were problems with this > > approach, but I don't remember details and how insurmountable the > > problem is. Having a way to check whether some type is defined would > > be very useful even outside of -target bpf parlance, though, so maybe > > it's the problem worth attacking? > > Yes, we discussed this before. This will need to add additional work > in preprocessor. I just made a discussion topic in llvm discourse > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > Let us see whether we can get some upstream agreement or not. > Thanks for starting the conversation! I'll be following along. > > > >> > >>> > >>> BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a > >>> separate series and sending them as non-RFC sooner. Those improvements > >>> are independent of all the header guards stuff, let's get them landed > >>> sooner. > >>> > >>>> After some discussion with Alexei and Yonghong I'd like to request > >>>> your comments regarding a somewhat brittle and partial solution to > >>>> this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > >>>> the generated `vmlinux.h`. > >>>> > >>> > >>> [...] > >>> > >>>> Eduard Zingerman (12): > >>>> libbpf: Deduplicate unambigous standalone forward declarations > >>>> selftests/bpf: Tests for standalone forward BTF declarations > >>>> deduplication > >>>> libbpf: Support for BTF_DECL_TAG dump in C format > >>>> selftests/bpf: Tests for BTF_DECL_TAG dump in C format > >>>> libbpf: Header guards for selected data structures in vmlinux.h > >>>> selftests/bpf: Tests for header guards printing in BTF dump > >>>> bpftool: Enable header guards generation > >>>> kbuild: Script to infer header guard values for uapi headers > >>>> kbuild: Header guards for types from include/uapi/*.h in kernel BTF > >>>> selftests/bpf: Script to verify uapi headers usage with vmlinux.h > >>>> selftests/bpf: Known good uapi headers for test_uapi_headers.py > >>>> selftests/bpf: script for infer_header_guards.pl testing > >>>> > >>>> scripts/infer_header_guards.pl | 191 +++++ > >>>> scripts/link-vmlinux.sh | 13 +- > >>>> tools/bpf/bpftool/btf.c | 4 +- > >>>> tools/lib/bpf/btf.c | 178 ++++- > >>>> tools/lib/bpf/btf.h | 7 +- > >>>> tools/lib/bpf/btf_dump.c | 232 +++++- > >>>> .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > >>>> tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > >>>> .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > >>>> .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > >>>> .../progs/btf_dump_test_case_header_guards.c | 94 +++ > >>>> .../bpf/test_uapi_header_guards_infer.sh | 33 + > >>>> .../selftests/bpf/test_uapi_headers.py | 197 +++++ > >>>> 13 files changed, 1816 insertions(+), 12 deletions(-) > >>>> create mode 100755 scripts/infer_header_guards.pl > >>>> create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > >>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > >>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > >>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > >>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > >>>> > >>>> -- > >>>> 2.34.1 > >>>>
On 28/10/2022 22:35, Andrii Nakryiko wrote: > On Fri, Oct 28, 2022 at 11:57 AM Yonghong Song <yhs@meta.com> wrote: >> >> >> >> On 10/28/22 10:13 AM, Andrii Nakryiko wrote: >>> On Thu, Oct 27, 2022 at 6:33 PM Yonghong Song <yhs@meta.com> wrote: >>>> >>>> >>>> >>>> On 10/27/22 4:14 PM, Andrii Nakryiko wrote: >>>>> On Tue, Oct 25, 2022 at 3:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote: >>>>>> >>>>>> Hi BPF community, >>>>>> >>>>>> AFAIK there is a long standing feature request to use kernel headers >>>>>> alongside `vmlinux.h` generated by `bpftool`. For example significant >>>>>> effort was put to add an attribute `bpf_dominating_decl` (see [1]) to >>>>>> clang, unfortunately this effort was stuck due to concerns regarding C >>>>>> language semantics. >>>>>> >>>>> >>>>> Maybe we should make another attempt to implement bpf_dominating_decl? >>>>> That seems like a more elegant solution than any other implemented or >>>>> discussed alternative. Yonghong, WDYT? >>>> >>>> I would say it would be very difficult for upstream to agree with >>>> bpf_dominating_decl. We already have lots of discussions and we >>>> likely won't be able to satisfy Aaron who wants us to emit >>>> adequate diagnostics which will involve lots of other work >>>> and he also thinks this is too far away from C standard and he >>>> wants us to implement in a llvm/clang tool which is not what >>>> we want. >>> >>> Ok, could we change the problem to detecting if some type is defined. >>> Would it be possible to have something like >>> >>> #if !__is_type_defined(struct abc) >>> struct abc { >>> }; >>> #endif >>> >>> I think we talked about this and there were problems with this >>> approach, but I don't remember details and how insurmountable the >>> problem is. Having a way to check whether some type is defined would >>> be very useful even outside of -target bpf parlance, though, so maybe >>> it's the problem worth attacking? >> >> Yes, we discussed this before. This will need to add additional work >> in preprocessor. I just made a discussion topic in llvm discourse >> >> https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 >> >> Let us see whether we can get some upstream agreement or not. >> > > Thanks for starting the conversation! I'll be following along. > I think this sort of approach assumes that vmlinux.h is included after any uapi headers, and would guard type definitions with #if type_is_defined(foo) struct foo { }; #endif ...is that right? My concern is that the vmlinux.h definitions have the CO-RE attributes. From a BPF perspective, would we like the vmlinux.h definitions to dominate over UAPI definitions do you think, or does it matter? I was wondering if there might be yet another way to crack this; if we did want the vmlinux.h type definitions to be authoritative because they have the preserve access index attribute, and because bpftool knows all vmlinux types, it could use that info to selectively redefine those type names such that we avoid name clashes when later including UAPI headers. Something like #ifdef __VMLINUX_H__ //usual vmlinux.h type definitions #endif /* __VMLINUX_H__ */ #ifdef __VMLINUX_ALIAS__ if !defined(timespec64) #define timespec64 __VMLINUX_ALIAS__timespec64 #endif // rest of the types define aliases here #undef __VMLINUX_ALIAS__ #else /* unalias */ #if defined(timespec64) #undef timespec64 #endif // rest of types undef aliases here #endif /* __VMLINUX_ALIAS__ */ Then the consumer does this: #define __VMLINUX_ALIAS__ #include "vmlinux.h" // include uapi headers #include "vmlinux.h" (the latter include of vmlinux.h is needed to undef all the type aliases) I tried hacking up bpftool to support this aliasing scheme and while it is kind of hacky it does seem to work, aside from some issues with IPPROTO_* definitions - for the enumerated IPPROTO_ values linux/in.h does this: enum { IPPROTO_IP = 0, /* Dummy protocol for TCP */ #define IPPROTO_IP IPPROTO_IP IPPROTO_ICMP = 1, /* Internet Control Message Protocol */ #define IPPROTO_ICMP IPPROTO_ICMP ...so our enum value definitions for IPPROTO_ values clash with the above definitions. These could be individually ifdef-guarded if needed though I think. I can send the proof-of-concept patch if it would help, I just wanted to check in case that might be a workable path too, since it just requires changes to bpftool (and changes to in.h). Thanks! Alan >>> >>>> >>>>> >>>>> BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a >>>>> separate series and sending them as non-RFC sooner. Those improvements >>>>> are independent of all the header guards stuff, let's get them landed >>>>> sooner. >>>>> >>>>>> After some discussion with Alexei and Yonghong I'd like to request >>>>>> your comments regarding a somewhat brittle and partial solution to >>>>>> this issue that relies on adding `#ifndef FOO_H ... #endif` guards in >>>>>> the generated `vmlinux.h`. >>>>>> >>>>> >>>>> [...] >>>>> >>>>>> Eduard Zingerman (12): >>>>>> libbpf: Deduplicate unambigous standalone forward declarations >>>>>> selftests/bpf: Tests for standalone forward BTF declarations >>>>>> deduplication >>>>>> libbpf: Support for BTF_DECL_TAG dump in C format >>>>>> selftests/bpf: Tests for BTF_DECL_TAG dump in C format >>>>>> libbpf: Header guards for selected data structures in vmlinux.h >>>>>> selftests/bpf: Tests for header guards printing in BTF dump >>>>>> bpftool: Enable header guards generation >>>>>> kbuild: Script to infer header guard values for uapi headers >>>>>> kbuild: Header guards for types from include/uapi/*.h in kernel BTF >>>>>> selftests/bpf: Script to verify uapi headers usage with vmlinux.h >>>>>> selftests/bpf: Known good uapi headers for test_uapi_headers.py >>>>>> selftests/bpf: script for infer_header_guards.pl testing >>>>>> >>>>>> scripts/infer_header_guards.pl | 191 +++++ >>>>>> scripts/link-vmlinux.sh | 13 +- >>>>>> tools/bpf/bpftool/btf.c | 4 +- >>>>>> tools/lib/bpf/btf.c | 178 ++++- >>>>>> tools/lib/bpf/btf.h | 7 +- >>>>>> tools/lib/bpf/btf_dump.c | 232 +++++- >>>>>> .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ >>>>>> tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ >>>>>> .../selftests/bpf/prog_tests/btf_dump.c | 11 +- >>>>>> .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + >>>>>> .../progs/btf_dump_test_case_header_guards.c | 94 +++ >>>>>> .../bpf/test_uapi_header_guards_infer.sh | 33 + >>>>>> .../selftests/bpf/test_uapi_headers.py | 197 +++++ >>>>>> 13 files changed, 1816 insertions(+), 12 deletions(-) >>>>>> create mode 100755 scripts/infer_header_guards.pl >>>>>> create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt >>>>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c >>>>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c >>>>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh >>>>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py >>>>>> >>>>>> -- >>>>>> 2.34.1 >>>>>>
On Tue, Nov 1, 2022 at 9:02 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > >> Yes, we discussed this before. This will need to add additional work > >> in preprocessor. I just made a discussion topic in llvm discourse > >> > >> https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 That would be a great clang feature. Thanks Yonghong! > >> > >> Let us see whether we can get some upstream agreement or not. > >> > > > > Thanks for starting the conversation! I'll be following along. > > > > > I think this sort of approach assumes that vmlinux.h is included after > any uapi headers, and would guard type definitions with > > #if type_is_defined(foo) > struct foo { > > }; > #endif > > ...is that right? My concern is that the vmlinux.h definitions have > the CO-RE attributes. From a BPF perspective, would we like the vmlinux.h > definitions to dominate over UAPI definitions do you think, or does it > matter? I think it's totally fine to require #include "vmlinux.h" to be last. The attr(preserve_access_index) is only useful for kernel internal structs. uapi structs don't need it. > > I was wondering if there might be yet another way to crack this; > if we did want the vmlinux.h type definitions to be authoritative > because they have the preserve access index attribute, and because > bpftool knows all vmlinux types, it could use that info to selectively > redefine those type names such that we avoid name clashes when later > including UAPI headers. Something like > > #ifdef __VMLINUX_H__ > //usual vmlinux.h type definitions > #endif /* __VMLINUX_H__ */ > > #ifdef __VMLINUX_ALIAS__ > if !defined(timespec64) > #define timespec64 __VMLINUX_ALIAS__timespec64 > #endif > // rest of the types define aliases here > #undef __VMLINUX_ALIAS__ > #else /* unalias */ > #if defined(timespec64) > #undef timespec64 > #endif > // rest of types undef aliases here > #endif /* __VMLINUX_ALIAS__ */ > > > Then the consumer does this: > > #define __VMLINUX_ALIAS__ > #include "vmlinux.h" > // include uapi headers > #include "vmlinux.h" > > (the latter include of vmlinux.h is needed to undef all the type aliases) Sounds like a bunch of complexity for the use case that is not clear to me. > > I tried hacking up bpftool to support this aliasing scheme and while > it is kind of hacky it does seem to work, aside from some issues with > IPPROTO_* definitions - for the enumerated IPPROTO_ values linux/in.h does > this: > > enum { > IPPROTO_IP = 0, /* Dummy protocol for TCP */ > #define IPPROTO_IP IPPROTO_IP > IPPROTO_ICMP = 1, /* Internet Control Message Protocol */ > #define IPPROTO_ICMP IPPROTO_ICMP > > > ...so our enum value definitions for IPPROTO_ values clash with the above > definitions. These could be individually ifdef-guarded if needed though I think. Including vmlinux.h last won't have this enum conflicts, right? > I can send the proof-of-concept patch if it would help, I just wanted to > check in case that might be a workable path too, since it just requires > changes to bpftool (and changes to in.h). I think changing the uapi header like in.h is no-go. Touching anything in uapi is too much of a risk.
On Tue, 2022-11-01 at 11:35 -0700, Alexei Starovoitov wrote: > On Tue, Nov 1, 2022 at 9:02 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > > > Yes, we discussed this before. This will need to add additional work > > > > in preprocessor. I just made a discussion topic in llvm discourse > > > > > > > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > That would be a great clang feature. > Thanks Yonghong! > > > > > > > > > Let us see whether we can get some upstream agreement or not. > > > > > > > > > > Thanks for starting the conversation! I'll be following along. > > > > > > > > > I think this sort of approach assumes that vmlinux.h is included after > > any uapi headers, and would guard type definitions with > > > > #if type_is_defined(foo) > > struct foo { > > > > }; > > #endif > > > > ...is that right? My concern is that the vmlinux.h definitions have > > the CO-RE attributes. From a BPF perspective, would we like the vmlinux.h > > definitions to dominate over UAPI definitions do you think, or does it > > matter? > > I think it's totally fine to require #include "vmlinux.h" to be last. > The attr(preserve_access_index) is only useful for kernel internal > structs. uapi structs don't need it. > > > > > I was wondering if there might be yet another way to crack this; > > if we did want the vmlinux.h type definitions to be authoritative > > because they have the preserve access index attribute, and because > > bpftool knows all vmlinux types, it could use that info to selectively > > redefine those type names such that we avoid name clashes when later > > including UAPI headers. Something like > > > > #ifdef __VMLINUX_H__ > > //usual vmlinux.h type definitions > > #endif /* __VMLINUX_H__ */ > > > > #ifdef __VMLINUX_ALIAS__ > > if !defined(timespec64) > > #define timespec64 __VMLINUX_ALIAS__timespec64 > > #endif > > // rest of the types define aliases here > > #undef __VMLINUX_ALIAS__ > > #else /* unalias */ > > #if defined(timespec64) > > #undef timespec64 > > #endif > > // rest of types undef aliases here > > #endif /* __VMLINUX_ALIAS__ */ > > > > > > Then the consumer does this: > > > > #define __VMLINUX_ALIAS__ > > #include "vmlinux.h" > > // include uapi headers > > #include "vmlinux.h" > > > > (the latter include of vmlinux.h is needed to undef all the type aliases) > > Sounds like a bunch of complexity for the use case that is not > clear to me. Well, my RFC is not shy of complexity :) What Alan suggests should solve the confilicts described in [1] or any other confilicts of such kind. [1] https://lore.kernel.org/bpf/999da51bdf050f155ba299500061b3eb6e0dcd0d.camel@gmail.com/ > > > > I tried hacking up bpftool to support this aliasing scheme and while > > it is kind of hacky it does seem to work, aside from some issues with > > IPPROTO_* definitions - for the enumerated IPPROTO_ values linux/in.h does > > this: > > > > enum { > > IPPROTO_IP = 0, /* Dummy protocol for TCP */ > > #define IPPROTO_IP IPPROTO_IP > > IPPROTO_ICMP = 1, /* Internet Control Message Protocol */ > > #define IPPROTO_ICMP IPPROTO_ICMP > > > > > > ...so our enum value definitions for IPPROTO_ values clash with the above > > definitions. These could be individually ifdef-guarded if needed though I think. > > Including vmlinux.h last won't have this enum conflicts, right? > > > I can send the proof-of-concept patch if it would help, I just wanted to > > check in case that might be a workable path too, since it just requires > > changes to bpftool (and changes to in.h). > > I think changing the uapi header like in.h is no-go. > Touching anything in uapi is too much of a risk.
On Tue, Nov 1, 2022 at 12:21 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Tue, 2022-11-01 at 11:35 -0700, Alexei Starovoitov wrote: > > On Tue, Nov 1, 2022 at 9:02 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > > > > > Yes, we discussed this before. This will need to add additional work > > > > > in preprocessor. I just made a discussion topic in llvm discourse > > > > > > > > > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > > > That would be a great clang feature. > > Thanks Yonghong! > > > > > > > > > > > > Let us see whether we can get some upstream agreement or not. > > > > > > > > > > > > > Thanks for starting the conversation! I'll be following along. > > > > > > > > > > > > > I think this sort of approach assumes that vmlinux.h is included after > > > any uapi headers, and would guard type definitions with > > > > > > #if type_is_defined(foo) > > > struct foo { > > > > > > }; > > > #endif > > > > > > ...is that right? My concern is that the vmlinux.h definitions have > > > the CO-RE attributes. From a BPF perspective, would we like the vmlinux.h > > > definitions to dominate over UAPI definitions do you think, or does it > > > matter? > > > > I think it's totally fine to require #include "vmlinux.h" to be last. > > The attr(preserve_access_index) is only useful for kernel internal > > structs. uapi structs don't need it. > > > > > > > > I was wondering if there might be yet another way to crack this; > > > if we did want the vmlinux.h type definitions to be authoritative > > > because they have the preserve access index attribute, and because > > > bpftool knows all vmlinux types, it could use that info to selectively > > > redefine those type names such that we avoid name clashes when later > > > including UAPI headers. Something like > > > > > > #ifdef __VMLINUX_H__ > > > //usual vmlinux.h type definitions > > > #endif /* __VMLINUX_H__ */ > > > > > > #ifdef __VMLINUX_ALIAS__ > > > if !defined(timespec64) > > > #define timespec64 __VMLINUX_ALIAS__timespec64 > > > #endif > > > // rest of the types define aliases here > > > #undef __VMLINUX_ALIAS__ > > > #else /* unalias */ > > > #if defined(timespec64) > > > #undef timespec64 > > > #endif > > > // rest of types undef aliases here > > > #endif /* __VMLINUX_ALIAS__ */ > > > > > > > > > Then the consumer does this: > > > > > > #define __VMLINUX_ALIAS__ > > > #include "vmlinux.h" > > > // include uapi headers > > > #include "vmlinux.h" > > > > > > (the latter include of vmlinux.h is needed to undef all the type aliases) > > > > Sounds like a bunch of complexity for the use case that is not > > clear to me. > > Well, my RFC is not shy of complexity :) > What Alan suggests should solve the confilicts described in [1] or any > other confilicts of such kind. > > [1] https://lore.kernel.org/bpf/999da51bdf050f155ba299500061b3eb6e0dcd0d.camel@gmail.com/ I don't quite see how renaming all types in the 1st vmlinux.h will help with name collisions inside enum {}. The enums will conflict with 2nd vmlinux.h too. Unless the proposal is to rename them as well, but then what's the point?
On Fri, 2022-10-28 at 11:56 -0700, Yonghong Song wrote: > > > [...] > > > > Ok, could we change the problem to detecting if some type is defined. > > Would it be possible to have something like > > > > #if !__is_type_defined(struct abc) > > struct abc { > > }; > > #endif > > > > I think we talked about this and there were problems with this > > approach, but I don't remember details and how insurmountable the > > problem is. Having a way to check whether some type is defined would > > be very useful even outside of -target bpf parlance, though, so maybe > > it's the problem worth attacking? > > Yes, we discussed this before. This will need to add additional work > in preprocessor. I just made a discussion topic in llvm discourse > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > Let us see whether we can get some upstream agreement or not. I did a small investigation of this feature. The main pre-requirement is construction of the symbol table during source code pre-processing, which implies necessity to parse the source code at the same time. It is technically possible in clang, as lexing, pre-processing and AST construction happens at the same time when in compilation mode. The prototype is available here [1], it includes: - Change in the pre-processor that adds an optional callback "IsTypeDefinedFn" & necessary parsing of __is_type_defined construct. - Change in Sema module (responsible for parsing/AST & symbol table) that installs the appropriate "IsTypeDefinedFn" in the pre-processor instance. However, this prototype builds a backward dependency between pre-processor and semantic analysis. There are currently no such dependencies in the clang code base. This makes it impossible to do pre-processing and compilation separately, e.g. consider the following example: $ cat test.c struct foo { int x; }; #if __is_type_defined(foo) const int x = 1; #else const int x = 2; #endif $ clang -cc1 -ast-print test.c -o - struct foo { int x; }; const int x = 1; $ clang -E test.c -o - # ... some line directives ... struct foo { int x; }; const int x = 2; Note that __is_type_defined is computed to different value in the first and second calls. This is so because semantic analysis (AST, symbol table) is not done for -E. It also breaks that C11 standard which clearly separates pre-processing and semantic analysis phases, see [2] 5.1.1.2. So, my conclusion is as follows: this is technically possible in clang but has no chance to reach llvm upstream. Thanks, Eduard [1] https://github.com/llvm/llvm-project/compare/main...eddyz87:llvm-project:is-type-defined-experiment [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf > > > > > > > > > > > > > > BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a > > > > separate series and sending them as non-RFC sooner. Those improvements > > > > are independent of all the header guards stuff, let's get them landed > > > > sooner. > > > > > > > > > After some discussion with Alexei and Yonghong I'd like to request > > > > > your comments regarding a somewhat brittle and partial solution to > > > > > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > > > > > the generated `vmlinux.h`. > > > > > > > > > > > > > [...] > > > > > > > > > Eduard Zingerman (12): > > > > > libbpf: Deduplicate unambigous standalone forward declarations > > > > > selftests/bpf: Tests for standalone forward BTF declarations > > > > > deduplication > > > > > libbpf: Support for BTF_DECL_TAG dump in C format > > > > > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > > > > libbpf: Header guards for selected data structures in vmlinux.h > > > > > selftests/bpf: Tests for header guards printing in BTF dump > > > > > bpftool: Enable header guards generation > > > > > kbuild: Script to infer header guard values for uapi headers > > > > > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > > > > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > > > > selftests/bpf: Known good uapi headers for test_uapi_headers.py > > > > > selftests/bpf: script for infer_header_guards.pl testing > > > > > > > > > > scripts/infer_header_guards.pl | 191 +++++ > > > > > scripts/link-vmlinux.sh | 13 +- > > > > > tools/bpf/bpftool/btf.c | 4 +- > > > > > tools/lib/bpf/btf.c | 178 ++++- > > > > > tools/lib/bpf/btf.h | 7 +- > > > > > tools/lib/bpf/btf_dump.c | 232 +++++- > > > > > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > > > > > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > > > > > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > > > > > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > > > > > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > > > > > .../bpf/test_uapi_header_guards_infer.sh | 33 + > > > > > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > > > > > 13 files changed, 1816 insertions(+), 12 deletions(-) > > > > > create mode 100755 scripts/infer_header_guards.pl > > > > > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > > > > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > > > > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > > > > > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > > > > > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > > > > > > > > > > -- > > > > > 2.34.1 > > > > >
On 11/11/22 1:55 PM, Eduard Zingerman wrote: > On Fri, 2022-10-28 at 11:56 -0700, Yonghong Song wrote: >>>> [...] >>> >>> Ok, could we change the problem to detecting if some type is defined. >>> Would it be possible to have something like >>> >>> #if !__is_type_defined(struct abc) >>> struct abc { >>> }; >>> #endif >>> >>> I think we talked about this and there were problems with this >>> approach, but I don't remember details and how insurmountable the >>> problem is. Having a way to check whether some type is defined would >>> be very useful even outside of -target bpf parlance, though, so maybe >>> it's the problem worth attacking? >> >> Yes, we discussed this before. This will need to add additional work >> in preprocessor. I just made a discussion topic in llvm discourse >> >> https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 >> >> Let us see whether we can get some upstream agreement or not. > > I did a small investigation of this feature. > > The main pre-requirement is construction of the symbol table during > source code pre-processing, which implies necessity to parse the > source code at the same time. It is technically possible in clang, as > lexing, pre-processing and AST construction happens at the same time > when in compilation mode. > > The prototype is available here [1], it includes: > - Change in the pre-processor that adds an optional callback > "IsTypeDefinedFn" & necessary parsing of __is_type_defined > construct. > - Change in Sema module (responsible for parsing/AST & symbol table) > that installs the appropriate "IsTypeDefinedFn" in the pre-processor > instance. > > However, this prototype builds a backward dependency between > pre-processor and semantic analysis. There are currently no such > dependencies in the clang code base. > > This makes it impossible to do pre-processing and compilation > separately, e.g. consider the following example: > > $ cat test.c > > struct foo { int x; }; > > #if __is_type_defined(foo) > const int x = 1; > #else > const int x = 2; > #endif > > $ clang -cc1 -ast-print test.c -o - > > struct foo { > int x; > }; > const int x = 1; > > $ clang -E test.c -o - > > # ... some line directives ... > struct foo { int x; }; > const int x = 2; Is it any chance '-E' could output the same one as '-cc1 -ast-print'? That is, even with -E we could do some semantics analysis as well, using either current clang semantics analysis or creating an minimal version of sema analysis in preprocessor itself? > > Note that __is_type_defined is computed to different value in the > first and second calls. This is so because semantic analysis (AST, > symbol table) is not done for -E. > > It also breaks that C11 standard which clearly separates > pre-processing and semantic analysis phases, see [2] 5.1.1.2. > > So, my conclusion is as follows: this is technically possible in clang > but has no chance to reach llvm upstream. > > Thanks, > Eduard > > [1] https://github.com/llvm/llvm-project/compare/main...eddyz87:llvm-project:is-type-defined-experiment > [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf > > >> >>> >>>> >>>>> >>>>> BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a >>>>> separate series and sending them as non-RFC sooner. Those improvements >>>>> are independent of all the header guards stuff, let's get them landed >>>>> sooner. >>>>> >>>>>> After some discussion with Alexei and Yonghong I'd like to request >>>>>> your comments regarding a somewhat brittle and partial solution to >>>>>> this issue that relies on adding `#ifndef FOO_H ... #endif` guards in >>>>>> the generated `vmlinux.h`. >>>>>> >>>>> >>>>> [...] >>>>> >>>>>> Eduard Zingerman (12): >>>>>> libbpf: Deduplicate unambigous standalone forward declarations >>>>>> selftests/bpf: Tests for standalone forward BTF declarations >>>>>> deduplication >>>>>> libbpf: Support for BTF_DECL_TAG dump in C format >>>>>> selftests/bpf: Tests for BTF_DECL_TAG dump in C format >>>>>> libbpf: Header guards for selected data structures in vmlinux.h >>>>>> selftests/bpf: Tests for header guards printing in BTF dump >>>>>> bpftool: Enable header guards generation >>>>>> kbuild: Script to infer header guard values for uapi headers >>>>>> kbuild: Header guards for types from include/uapi/*.h in kernel BTF >>>>>> selftests/bpf: Script to verify uapi headers usage with vmlinux.h >>>>>> selftests/bpf: Known good uapi headers for test_uapi_headers.py >>>>>> selftests/bpf: script for infer_header_guards.pl testing >>>>>> >>>>>> scripts/infer_header_guards.pl | 191 +++++ >>>>>> scripts/link-vmlinux.sh | 13 +- >>>>>> tools/bpf/bpftool/btf.c | 4 +- >>>>>> tools/lib/bpf/btf.c | 178 ++++- >>>>>> tools/lib/bpf/btf.h | 7 +- >>>>>> tools/lib/bpf/btf_dump.c | 232 +++++- >>>>>> .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ >>>>>> tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ >>>>>> .../selftests/bpf/prog_tests/btf_dump.c | 11 +- >>>>>> .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + >>>>>> .../progs/btf_dump_test_case_header_guards.c | 94 +++ >>>>>> .../bpf/test_uapi_header_guards_infer.sh | 33 + >>>>>> .../selftests/bpf/test_uapi_headers.py | 197 +++++ >>>>>> 13 files changed, 1816 insertions(+), 12 deletions(-) >>>>>> create mode 100755 scripts/infer_header_guards.pl >>>>>> create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt >>>>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c >>>>>> create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c >>>>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh >>>>>> create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py >>>>>> >>>>>> -- >>>>>> 2.34.1 >>>>>> >
On Sun, 2022-11-13 at 23:52 -0800, Yonghong Song wrote: > > On 11/11/22 1:55 PM, Eduard Zingerman wrote: > > On Fri, 2022-10-28 at 11:56 -0700, Yonghong Song wrote: > > > > > [...] > > > > > > > > Ok, could we change the problem to detecting if some type is defined. > > > > Would it be possible to have something like > > > > > > > > #if !__is_type_defined(struct abc) > > > > struct abc { > > > > }; > > > > #endif > > > > > > > > I think we talked about this and there were problems with this > > > > approach, but I don't remember details and how insurmountable the > > > > problem is. Having a way to check whether some type is defined would > > > > be very useful even outside of -target bpf parlance, though, so maybe > > > > it's the problem worth attacking? > > > > > > Yes, we discussed this before. This will need to add additional work > > > in preprocessor. I just made a discussion topic in llvm discourse > > > > > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > > > > > Let us see whether we can get some upstream agreement or not. > > > > I did a small investigation of this feature. > > > > The main pre-requirement is construction of the symbol table during > > source code pre-processing, which implies necessity to parse the > > source code at the same time. It is technically possible in clang, as > > lexing, pre-processing and AST construction happens at the same time > > when in compilation mode. > > > > The prototype is available here [1], it includes: > > - Change in the pre-processor that adds an optional callback > > "IsTypeDefinedFn" & necessary parsing of __is_type_defined > > construct. > > - Change in Sema module (responsible for parsing/AST & symbol table) > > that installs the appropriate "IsTypeDefinedFn" in the pre-processor > > instance. > > > > However, this prototype builds a backward dependency between > > pre-processor and semantic analysis. There are currently no such > > dependencies in the clang code base. > > > > This makes it impossible to do pre-processing and compilation > > separately, e.g. consider the following example: > > > > $ cat test.c > > > > struct foo { int x; }; > > > > #if __is_type_defined(foo) > > const int x = 1; > > #else > > const int x = 2; > > #endif > > > > $ clang -cc1 -ast-print test.c -o - > > > > struct foo { > > int x; > > }; > > const int x = 1; > > > > $ clang -E test.c -o - > > > > # ... some line directives ... > > struct foo { int x; }; > > const int x = 2; > > Is it any chance '-E' could output the same one as '-cc1 -ast-print'? > That is, even with -E we could do some semantics analysis > as well, using either current clang semantics analysis or creating > an minimal version of sema analysis in preprocessor itself? Sema drives consumption of tokens from Preprocessor. Calls to Preprocessor are done on a parsing recursive descent. Extracting a stream of tokens would require an incremental parser instead. A minimal version of such parser is possible to implement for C. It might be the case that matching open / closing braces and identifiers following 'struct' / 'union' / 'enum' keywords might be almost sufficient but I need to try to be sure (e.g. it is more complex for 'typedef'). I can work on it but I don't think there is a chance to upstream this work. Thanks, Eduard > > > > > Note that __is_type_defined is computed to different value in the > > first and second calls. This is so because semantic analysis (AST, > > symbol table) is not done for -E. > > > > It also breaks that C11 standard which clearly separates > > pre-processing and semantic analysis phases, see [2] 5.1.1.2. > > > > So, my conclusion is as follows: this is technically possible in clang > > but has no chance to reach llvm upstream. > > > > Thanks, > > Eduard > > > > [1] https://github.com/llvm/llvm-project/compare/main...eddyz87:llvm-project:is-type-defined-experiment > > [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf > > > > > > > > > > > > > > > > > > > > > > > > > > > > BTW, I suggest splitting libbpf btf_dedup and btf_dump changes into a > > > > > > separate series and sending them as non-RFC sooner. Those improvements > > > > > > are independent of all the header guards stuff, let's get them landed > > > > > > sooner. > > > > > > > > > > > > > After some discussion with Alexei and Yonghong I'd like to request > > > > > > > your comments regarding a somewhat brittle and partial solution to > > > > > > > this issue that relies on adding `#ifndef FOO_H ... #endif` guards in > > > > > > > the generated `vmlinux.h`. > > > > > > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > Eduard Zingerman (12): > > > > > > > libbpf: Deduplicate unambigous standalone forward declarations > > > > > > > selftests/bpf: Tests for standalone forward BTF declarations > > > > > > > deduplication > > > > > > > libbpf: Support for BTF_DECL_TAG dump in C format > > > > > > > selftests/bpf: Tests for BTF_DECL_TAG dump in C format > > > > > > > libbpf: Header guards for selected data structures in vmlinux.h > > > > > > > selftests/bpf: Tests for header guards printing in BTF dump > > > > > > > bpftool: Enable header guards generation > > > > > > > kbuild: Script to infer header guard values for uapi headers > > > > > > > kbuild: Header guards for types from include/uapi/*.h in kernel BTF > > > > > > > selftests/bpf: Script to verify uapi headers usage with vmlinux.h > > > > > > > selftests/bpf: Known good uapi headers for test_uapi_headers.py > > > > > > > selftests/bpf: script for infer_header_guards.pl testing > > > > > > > > > > > > > > scripts/infer_header_guards.pl | 191 +++++ > > > > > > > scripts/link-vmlinux.sh | 13 +- > > > > > > > tools/bpf/bpftool/btf.c | 4 +- > > > > > > > tools/lib/bpf/btf.c | 178 ++++- > > > > > > > tools/lib/bpf/btf.h | 7 +- > > > > > > > tools/lib/bpf/btf_dump.c | 232 +++++- > > > > > > > .../selftests/bpf/good_uapi_headers.txt | 677 ++++++++++++++++++ > > > > > > > tools/testing/selftests/bpf/prog_tests/btf.c | 152 ++++ > > > > > > > .../selftests/bpf/prog_tests/btf_dump.c | 11 +- > > > > > > > .../bpf/progs/btf_dump_test_case_decl_tag.c | 39 + > > > > > > > .../progs/btf_dump_test_case_header_guards.c | 94 +++ > > > > > > > .../bpf/test_uapi_header_guards_infer.sh | 33 + > > > > > > > .../selftests/bpf/test_uapi_headers.py | 197 +++++ > > > > > > > 13 files changed, 1816 insertions(+), 12 deletions(-) > > > > > > > create mode 100755 scripts/infer_header_guards.pl > > > > > > > create mode 100644 tools/testing/selftests/bpf/good_uapi_headers.txt > > > > > > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_decl_tag.c > > > > > > > create mode 100644 tools/testing/selftests/bpf/progs/btf_dump_test_case_header_guards.c > > > > > > > create mode 100755 tools/testing/selftests/bpf/test_uapi_header_guards_infer.sh > > > > > > > create mode 100755 tools/testing/selftests/bpf/test_uapi_headers.py > > > > > > > > > > > > > > -- > > > > > > > 2.34.1 > > > > > > > > >
On Mon, Nov 14, 2022 at 1:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Sun, 2022-11-13 at 23:52 -0800, Yonghong Song wrote: > > > > On 11/11/22 1:55 PM, Eduard Zingerman wrote: > > > On Fri, 2022-10-28 at 11:56 -0700, Yonghong Song wrote: > > > > > > [...] > > > > > > > > > > Ok, could we change the problem to detecting if some type is defined. > > > > > Would it be possible to have something like > > > > > > > > > > #if !__is_type_defined(struct abc) > > > > > struct abc { > > > > > }; > > > > > #endif > > > > > > > > > > I think we talked about this and there were problems with this > > > > > approach, but I don't remember details and how insurmountable the > > > > > problem is. Having a way to check whether some type is defined would > > > > > be very useful even outside of -target bpf parlance, though, so maybe > > > > > it's the problem worth attacking? > > > > > > > > Yes, we discussed this before. This will need to add additional work > > > > in preprocessor. I just made a discussion topic in llvm discourse > > > > > > > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > > > > > > > Let us see whether we can get some upstream agreement or not. > > > > > > I did a small investigation of this feature. > > > > > > The main pre-requirement is construction of the symbol table during > > > source code pre-processing, which implies necessity to parse the > > > source code at the same time. It is technically possible in clang, as > > > lexing, pre-processing and AST construction happens at the same time > > > when in compilation mode. > > > > > > The prototype is available here [1], it includes: > > > - Change in the pre-processor that adds an optional callback > > > "IsTypeDefinedFn" & necessary parsing of __is_type_defined > > > construct. > > > - Change in Sema module (responsible for parsing/AST & symbol table) > > > that installs the appropriate "IsTypeDefinedFn" in the pre-processor > > > instance. > > > > > > However, this prototype builds a backward dependency between > > > pre-processor and semantic analysis. There are currently no such > > > dependencies in the clang code base. > > > > > > This makes it impossible to do pre-processing and compilation > > > separately, e.g. consider the following example: > > > > > > $ cat test.c > > > > > > struct foo { int x; }; > > > > > > #if __is_type_defined(foo) > > > const int x = 1; > > > #else > > > const int x = 2; > > > #endif > > > > > > $ clang -cc1 -ast-print test.c -o - > > > > > > struct foo { > > > int x; > > > }; > > > const int x = 1; > > > > > > $ clang -E test.c -o - > > > > > > # ... some line directives ... > > > struct foo { int x; }; > > > const int x = 2; > > > > Is it any chance '-E' could output the same one as '-cc1 -ast-print'? > > That is, even with -E we could do some semantics analysis > > as well, using either current clang semantics analysis or creating > > an minimal version of sema analysis in preprocessor itself? > > Sema drives consumption of tokens from Preprocessor. Calls to > Preprocessor are done on a parsing recursive descent. Extracting a > stream of tokens would require an incremental parser instead. > > A minimal version of such parser is possible to implement for C. > It might be the case that matching open / closing braces and > identifiers following 'struct' / 'union' / 'enum' keywords might be > almost sufficient but I need to try to be sure (e.g. it is more > complex for 'typedef'). > > I can work on it but I don't think there is a chance to upstream this work. Right. It's going to be C only. C++ with namespaces and nested class decls won't work with simple type parser. On the other side if we're asking preprocessor to look for 'struct foo' and remember that 'foo' is a type maybe we can add a regex-search instead? It would be a bit more generic and will work for basic union/struct foo definition? Something like instead of: #if __is_type_defined(foo) use: #if regex(struct[\t]+foo) enums are harder in this approach, but higher chance to land? regex() would mean "search for this pattern in the file until this line. Or some other preprocessor "language" tricks? For example: The preprocessor would grep for 'struct *' in a single line while processing a file and emit #define __secret_prefix_##$1 where $1 would be a capture from "single line regex". Then later in the same file instead of: #if __is_type_defined(foo) use: #ifdef __secret_prefix_foo This "single line regex" may look like: #if regex_in_any_later_line(struct[\t]+[a-zA-Z_]+) define __secret_prefix_$2
On Mon, 2022-11-14 at 13:50 -0800, Alexei Starovoitov wrote: > On Mon, Nov 14, 2022 at 1:13 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > On Sun, 2022-11-13 at 23:52 -0800, Yonghong Song wrote: > > > > > > On 11/11/22 1:55 PM, Eduard Zingerman wrote: > > > > On Fri, 2022-10-28 at 11:56 -0700, Yonghong Song wrote: > > > > > > > [...] > > > > > > > > > > > > Ok, could we change the problem to detecting if some type is defined. > > > > > > Would it be possible to have something like > > > > > > > > > > > > #if !__is_type_defined(struct abc) > > > > > > struct abc { > > > > > > }; > > > > > > #endif > > > > > > > > > > > > I think we talked about this and there were problems with this > > > > > > approach, but I don't remember details and how insurmountable the > > > > > > problem is. Having a way to check whether some type is defined would > > > > > > be very useful even outside of -target bpf parlance, though, so maybe > > > > > > it's the problem worth attacking? > > > > > > > > > > Yes, we discussed this before. This will need to add additional work > > > > > in preprocessor. I just made a discussion topic in llvm discourse > > > > > > > > > > https://discourse.llvm.org/t/add-a-type-checking-macro-is-type-defined-type/66268 > > > > > > > > > > Let us see whether we can get some upstream agreement or not. > > > > > > > > I did a small investigation of this feature. > > > > > > > > The main pre-requirement is construction of the symbol table during > > > > source code pre-processing, which implies necessity to parse the > > > > source code at the same time. It is technically possible in clang, as > > > > lexing, pre-processing and AST construction happens at the same time > > > > when in compilation mode. > > > > > > > > The prototype is available here [1], it includes: > > > > - Change in the pre-processor that adds an optional callback > > > > "IsTypeDefinedFn" & necessary parsing of __is_type_defined > > > > construct. > > > > - Change in Sema module (responsible for parsing/AST & symbol table) > > > > that installs the appropriate "IsTypeDefinedFn" in the pre-processor > > > > instance. > > > > > > > > However, this prototype builds a backward dependency between > > > > pre-processor and semantic analysis. There are currently no such > > > > dependencies in the clang code base. > > > > > > > > This makes it impossible to do pre-processing and compilation > > > > separately, e.g. consider the following example: > > > > > > > > $ cat test.c > > > > > > > > struct foo { int x; }; > > > > > > > > #if __is_type_defined(foo) > > > > const int x = 1; > > > > #else > > > > const int x = 2; > > > > #endif > > > > > > > > $ clang -cc1 -ast-print test.c -o - > > > > > > > > struct foo { > > > > int x; > > > > }; > > > > const int x = 1; > > > > > > > > $ clang -E test.c -o - > > > > > > > > # ... some line directives ... > > > > struct foo { int x; }; > > > > const int x = 2; > > > > > > Is it any chance '-E' could output the same one as '-cc1 -ast-print'? > > > That is, even with -E we could do some semantics analysis > > > as well, using either current clang semantics analysis or creating > > > an minimal version of sema analysis in preprocessor itself? > > > > Sema drives consumption of tokens from Preprocessor. Calls to > > Preprocessor are done on a parsing recursive descent. Extracting a > > stream of tokens would require an incremental parser instead. > > > > A minimal version of such parser is possible to implement for C. > > It might be the case that matching open / closing braces and > > identifiers following 'struct' / 'union' / 'enum' keywords might be > > almost sufficient but I need to try to be sure (e.g. it is more > > complex for 'typedef'). > > > > I can work on it but I don't think there is a chance to upstream this work. > > Right. It's going to be C only. > C++ with namespaces and nested class decls won't work with simple > type parser. > > On the other side if we're asking preprocessor to look for > 'struct foo' and remember that 'foo' is a type > maybe we can add a regex-search instead? > It would be a bit more generic and will work for basic > union/struct foo definition? > Something like instead of: > #if __is_type_defined(foo) > use: > #if regex(struct[\t]+foo) > > enums are harder in this approach, but higher chance to land? > > regex() would mean "search for this pattern in the file until this line. > > Or some other preprocessor "language" tricks? > I talked to Yonhong today and he suggests to investigate whether pre-processor changes could be made BPF target specific. E.g. there are extension points in the clang pre-processor right now but those for tooling. There might be a way to extend this mechanism to allow target specific pre-processor behavior. I'll take a look and write another email here. > For example: > The preprocessor would grep for 'struct *' in a single line > while processing a file and emit #define __secret_prefix_##$1 > where $1 would be a capture from "single line regex". > Then later in the same file instead of: > #if __is_type_defined(foo) > use: > #ifdef __secret_prefix_foo > > This "single line regex" may look like: > #if regex_in_any_later_line(struct[\t]+[a-zA-Z_]+) define __secret_prefix_$2