[RFC,bpf-next,1/2] libbpf: Add support for dynamic tracepoint

Message ID 20250105124403.991-2-laoar.shao@gmail.com
State RFC
Delegated to: BPF
Series libbpf: Add support for dynamic tracepoint

Commit Message

Yafang Shao Jan. 5, 2025, 12:44 p.m. UTC
Dynamic tracepoints can be created using debugfs. For example:

   echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events

This command creates a new tracepoint under debugfs:

  $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
  enable  filter  format  hist  id  trigger

Although this dynamic tracepoint appears as a tracepoint, it is internally
implemented as a kprobe. However, it must be attached as a tracepoint to
function correctly in certain contexts.

This update adds support in libbpf for handling such tracepoints,
simplifying their usage and integration in BPF workflows.
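
As a rough illustration of what "attached as a tracepoint" involves: the event directory shown above contains an `id` file, and a tracepoint-style attach generally passes that id as the `config` of a `PERF_TYPE_TRACEPOINT` perf event. The sketch below only locates and reads that id. The helper names `event_id_path` and `read_event_id` are made up for this example and are not libbpf functions, and the tracefs mount point is assumed to be the debugfs path used in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed tracefs location, copied from the commit message above. */
#define TRACEFS_EVENTS "/sys/kernel/debug/tracing/events"

/*
 * Build the path of an event's "id" file, e.g. .../kprobes/myprobe/id.
 * Returns 0 on success, -1 if the buffer is too small.
 * (Hypothetical helper, not part of libbpf.)
 */
static int event_id_path(char *buf, size_t sz, const char *category,
			 const char *name)
{
	int n;

	assert(buf && category && name);
	n = snprintf(buf, sz, "%s/%s/%s/id", TRACEFS_EVENTS, category, name);
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}

/*
 * Read the numeric event id from an "id" file.
 * Returns the id, or -1 if the file cannot be opened or parsed.
 */
static long read_event_id(const char *path)
{
	FILE *f;
	long id = -1;

	assert(path);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &id) != 1)
		id = -1;
	fclose(f);
	return id;
}
```

For the example event, `event_id_path(buf, sizeof(buf), "kprobes", "myprobe")` yields the `id` file inside the directory listed above. Reading it normally requires root and a mounted tracefs.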

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 tools/lib/bpf/libbpf.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Alexei Starovoitov Jan. 6, 2025, 12:16 a.m. UTC | #1
On Sun, Jan 5, 2025 at 4:44 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> Dynamic tracepoints can be created using debugfs. For example:
>
>    echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events
>
> This command creates a new tracepoint under debugfs:
>
>   $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
>   enable  filter  format  hist  id  trigger
>
> Although this dynamic tracepoint appears as a tracepoint, it is internally
> implemented as a kprobe. However, it must be attached as a tracepoint to
> function correctly in certain contexts.

Nack.
There are multiple mechanisms to create kprobe/tp via text interfaces.
We're not going to mix them with the programmatic libbpf api.

Yafang Shao Jan. 6, 2025, 2:32 a.m. UTC | #2
On Mon, Jan 6, 2025 at 8:16 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Jan 5, 2025 at 4:44 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > Dynamic tracepoints can be created using debugfs. For example:
> >
> >    echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events
> >
> > This command creates a new tracepoint under debugfs:
> >
> >   $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
> >   enable  filter  format  hist  id  trigger
> >
> > Although this dynamic tracepoint appears as a tracepoint, it is internally
> > implemented as a kprobe. However, it must be attached as a tracepoint to
> > function correctly in certain contexts.
>
> Nack.
> There are multiple mechanisms to create kprobe/tp via text interfaces.
> We're not going to mix them with the programmatic libbpf api.

It appears that bpftrace still lacks support for adding a kprobe/tp
and then attaching to it directly. Is that correct?
What do you think about introducing this mechanism into bpftrace? With
such a feature, we could easily attach to inlined kernel functions
using bpftrace.

--
Regards
Yafang

Alexei Starovoitov Jan. 6, 2025, 10:33 p.m. UTC | #3
On Sun, Jan 5, 2025 at 6:32 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Mon, Jan 6, 2025 at 8:16 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sun, Jan 5, 2025 at 4:44 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > Dynamic tracepoints can be created using debugfs. For example:
> > >
> > >    echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events
> > >
> > > This command creates a new tracepoint under debugfs:
> > >
> > >   $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
> > >   enable  filter  format  hist  id  trigger
> > >
> > > Although this dynamic tracepoint appears as a tracepoint, it is internally
> > > implemented as a kprobe. However, it must be attached as a tracepoint to
> > > function correctly in certain contexts.
> >
> > Nack.
> > There are multiple mechanisms to create kprobe/tp via text interfaces.
> > We're not going to mix them with the programmatic libbpf api.
>
> It appears that bpftrace still lacks support for adding a kprobe/tp
> and then attaching to it directly. Is that correct?

what do you mean?
bpftrace supports both kprobe attaching and tp too.

> What do you think about introducing this mechanism into bpftrace? With
> such a feature, we could easily attach to inlined kernel functions
> using bpftrace.

Attaching to inlined funcs also sort-of works. It relies on dwarf,
and there is work in progress to add a special section to vmlinux
to annotate inlined sites, so it can work without dwarf.

Yafang Shao Jan. 7, 2025, 2:41 a.m. UTC | #4
On Tue, Jan 7, 2025 at 6:33 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Jan 5, 2025 at 6:32 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Mon, Jan 6, 2025 at 8:16 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Sun, Jan 5, 2025 at 4:44 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > Dynamic tracepoints can be created using debugfs. For example:
> > > >
> > > >    echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events
> > > >
> > > > This command creates a new tracepoint under debugfs:
> > > >
> > > >   $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
> > > >   enable  filter  format  hist  id  trigger
> > > >
> > > > Although this dynamic tracepoint appears as a tracepoint, it is internally
> > > > implemented as a kprobe. However, it must be attached as a tracepoint to
> > > > function correctly in certain contexts.
> > >
> > > Nack.
> > > There are multiple mechanisms to create kprobe/tp via text interfaces.
> > > We're not going to mix them with the programmatic libbpf api.
> >
> > It appears that bpftrace still lacks support for adding a kprobe/tp
> > and then attaching to it directly. Is that correct?
>
> what do you mean?

Take the inlined kernel function tcp_listendrop() as an example:

$ perf probe -a 'tcp_listendrop sk'
Added new events:
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)
  probe:tcp_listendrop (on tcp_listendrop with sk)

You can now use it in all perf tools, such as:

        perf record -e probe:tcp_listendrop -aR sleep 1

Similarly, we can also use bpftrace to trace inlined kernel functions.
For example:

- add a dynamic tracepoint
  $ bpftrace probe -a 'tcp_listendrop sk'

- trace the dynamic tracepoint
  $ bpftrace probe -e 'probe:tcp_listendrop {print(args->sk)}'

> bpftrace supports both kprobe attaching and tp too.

The dynamic tracepoint is not supported yet.

>
> > What do you think about introducing this mechanism into bpftrace? With
> > such a feature, we could easily attach to inlined kernel functions
> > using bpftrace.
>
> Attaching to inlined funcs also sort-of works. It relies on dwarf,
> and there is work in progress to add a special section to vmlinux
> to annotate inlined sites, so it can work without dwarf.

What’s the benefit of doing this? Why not simply read the DWARF
information directly from vmlinux?

$ readelf -S /boot/vmlinux  | grep debug_info
  [63] .debug_info       PROGBITS         0000000000000000  03e2bc20

The DWARF information embedded in vmlinux makes it straightforward to
trace inlined functions without requiring any kernel modifications.
This approach allows all existing kernel releases to immediately take
advantage of the functionality, eliminating the need for kernel
recompilation or patching.


--
Regards
Yafang

Jiri Olsa Jan. 7, 2025, 12:16 p.m. UTC | #5
On Mon, Jan 06, 2025 at 10:32:15AM +0800, Yafang Shao wrote:
> On Mon, Jan 6, 2025 at 8:16 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sun, Jan 5, 2025 at 4:44 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > Dynamic tracepoints can be created using debugfs. For example:
> > >
> > >    echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events
> > >
> > > This command creates a new tracepoint under debugfs:
> > >
> > >   $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
> > >   enable  filter  format  hist  id  trigger
> > >
> > > Although this dynamic tracepoint appears as a tracepoint, it is internally
> > > implemented as a kprobe. However, it must be attached as a tracepoint to
> > > function correctly in certain contexts.
> >
> > Nack.
> > There are multiple mechanisms to create kprobe/tp via text interfaces.
> > We're not going to mix them with the programmatic libbpf api.
> 
> It appears that bpftrace still lacks support for adding a kprobe/tp
> and then attaching to it directly. Is that correct?
> What do you think about introducing this mechanism into bpftrace? With
> such a feature, we could easily attach to inlined kernel functions
> using bpftrace.

so with the 'echo .. > kprobe_events' you create kprobe which will be
exported through tracefs together with other tracepoints and bpftrace
sees it as another tracepoint.. but it's a kprobe :-\

how about we add support for kprobe section like SEC("kprobe/SUBSYSTEM/PROBE"),
so in your case above it'd be SEC("kprobe/kprobes/myprobe")

then attach_kprobe would parse that out and use a new probe_attach_mode
for bpf_program__attach_kprobe_opts to attach it correctly
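
A minimal sketch of the parsing step suggested here, assuming the two forms are told apart purely by whether a second path component is present. `struct kprobe_target` and `parse_kprobe_sec` are hypothetical names for illustration only. This is not how libbpf's attach_kprobe is written today, and function offsets such as `func+0x10` are not handled specially.

```c
#include <assert.h>
#include <string.h>

/* Result of parsing a "kprobe/..." section name (hypothetical type). */
struct kprobe_target {
	int is_event;       /* 1: existing tracefs event, 0: plain function */
	char subsystem[64]; /* only filled in when is_event == 1 */
	char name[64];      /* function name or event name */
};

/*
 * Parse "kprobe/FUNC" or "kprobe/SUBSYSTEM/PROBE".
 * Returns 0 on success, -1 on malformed or oversized input.
 */
static int parse_kprobe_sec(const char *sec, struct kprobe_target *out)
{
	static const char pfx[] = "kprobe/";
	const char *rest, *slash;
	size_t len;

	assert(sec && out);
	if (strncmp(sec, pfx, sizeof(pfx) - 1) != 0)
		return -1;
	rest = sec + sizeof(pfx) - 1;
	/* Zero everything so the copied strings stay NUL-terminated. */
	memset(out, 0, sizeof(*out));

	slash = strchr(rest, '/');
	if (!slash) {
		/* kprobe/FUNC: attach to a kernel function by name */
		len = strlen(rest);
		if (len == 0 || len >= sizeof(out->name))
			return -1;
		memcpy(out->name, rest, len);
		return 0;
	}

	/* kprobe/SUBSYSTEM/PROBE: refers to an already created event */
	len = (size_t)(slash - rest);
	if (len == 0 || len >= sizeof(out->subsystem))
		return -1;
	memcpy(out->subsystem, rest, len);

	len = strlen(slash + 1);
	if (len == 0 || len >= sizeof(out->name) || strchr(slash + 1, '/'))
		return -1;
	memcpy(out->name, slash + 1, len);
	out->is_event = 1;
	return 0;
}
```

With this split, "kprobe/kprobes/myprobe" would be routed to the event-based attach path, while "kprobe/kernel_clone" keeps its current meaning.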

cc-ing Viktor

jirka

Yafang Shao Jan. 7, 2025, 1:32 p.m. UTC | #6
On Tue, Jan 7, 2025 at 8:16 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Mon, Jan 06, 2025 at 10:32:15AM +0800, Yafang Shao wrote:
> > On Mon, Jan 6, 2025 at 8:16 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Sun, Jan 5, 2025 at 4:44 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > Dynamic tracepoints can be created using debugfs. For example:
> > > >
> > > >    echo 'p:myprobe kernel_clone args' >> /sys/kernel/debug/tracing/kprobe_events
> > > >
> > > > This command creates a new tracepoint under debugfs:
> > > >
> > > >   $ ls /sys/kernel/debug/tracing/events/kprobes/myprobe/
> > > >   enable  filter  format  hist  id  trigger
> > > >
> > > > Although this dynamic tracepoint appears as a tracepoint, it is internally
> > > > implemented as a kprobe. However, it must be attached as a tracepoint to
> > > > function correctly in certain contexts.
> > >
> > > Nack.
> > > There are multiple mechanisms to create kprobe/tp via text interfaces.
> > > We're not going to mix them with the programmatic libbpf api.
> >
> > It appears that bpftrace still lacks support for adding a kprobe/tp
> > and then attaching to it directly. Is that correct?
> > What do you think about introducing this mechanism into bpftrace? With
> > such a feature, we could easily attach to inlined kernel functions
> > using bpftrace.
>
> so with the 'echo .. > kprobe_events' you create kprobe which will be
> exported through tracefs together with other tracepoints and bpftrace
> sees it as another tracepoint.. but it's a kprobe :-\

exactly.

>
> how about we add support for kprobe section like SEC("kprobe/SUBSYSTEM/PROBE"),
> so in your case above it'd be SEC("kprobe/kprobes/myprobe")

This is similar to what I'm currently proposing:

  SEC("dynamic_tp/kprobes/my_dynamic_tp")

My proposal requires only a 3-line change. In contrast, if we
implement it as you suggested, it may require significantly more code
changes. I prefer to introduce a new section, such as
SEC("dynamic_tracepoint/"), SEC("kprobe_tracepoint/"), or something
similar, for this special type of kprobe. However, if you believe
SEC("kprobe/SUBSYSTEM/PROBE") is a better approach, I’m happy to
implement it that way.

>
> then attach_kprobe would parse that out and use a new probe_attach_mode
> for bpf_program__attach_kprobe_opts to attach it correctly

Yes, that would be a great enhancement for tracing inlined kernel functions.

--
Regards
Yafang

Patch

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 66173ddb5a2d..077bec761ebf 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9504,6 +9504,7 @@  static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("struct_ops.s+",	STRUCT_OPS, 0, SEC_SLEEPABLE),
 	SEC_DEF("sk_lookup",		SK_LOOKUP, BPF_SK_LOOKUP, SEC_ATTACHABLE),
 	SEC_DEF("netfilter",		NETFILTER, BPF_NETFILTER, SEC_NONE),
+	SEC_DEF("dynamic_tp+",          KPROBE, 0, SEC_NONE, attach_tp),
 };
 
 int libbpf_register_prog_handler(const char *sec,
@@ -12500,6 +12501,8 @@  static int attach_tp(const struct bpf_program *prog, long cookie, struct bpf_lin
 	/* extract "tp/<category>/<name>" or "tracepoint/<category>/<name>" */
 	if (str_has_pfx(prog->sec_name, "tp/"))
 		tp_cat = sec_name + sizeof("tp/") - 1;
+	else if (str_has_pfx(prog->sec_name, "dynamic_tp/"))
+		tp_cat = sec_name + sizeof("dynamic_tp/") - 1;
 	else
 		tp_cat = sec_name + sizeof("tracepoint/") - 1;
 	tp_name = strchr(tp_cat, '/');
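
For readers who want to try the prefix handling outside libbpf, the following standalone sketch mirrors the logic in the hunk above: strip one of the accepted prefixes, then split the rest into category and name at the first '/'. `split_tp_sec` is a hypothetical helper written for this illustration. Unlike the hunk, which falls through to the "tracepoint/" case, it rejects unknown prefixes explicitly.

```c
#include <assert.h>
#include <string.h>

/*
 * Split "<prefix>/<category>/<name>" into category and name. Accepts the
 * existing "tp/" and "tracepoint/" prefixes plus the "dynamic_tp/" prefix
 * proposed by this patch. Returns 0 on success, -1 on error.
 * (Hypothetical helper, not part of libbpf.)
 */
static int split_tp_sec(const char *sec, char *cat, size_t cat_sz,
			char *name, size_t name_sz)
{
	static const char *const prefixes[] = {
		"tp/", "dynamic_tp/", "tracepoint/",
	};
	const char *rest = NULL, *slash;
	size_t i, len;

	assert(sec && cat && name);
	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
		len = strlen(prefixes[i]);
		if (strncmp(sec, prefixes[i], len) == 0) {
			rest = sec + len;
			break;
		}
	}
	if (!rest)
		return -1;

	/* The category ends at the first '/', as with strchr() in attach_tp. */
	slash = strchr(rest, '/');
	if (!slash)
		return -1;
	len = (size_t)(slash - rest);
	if (len == 0 || len >= cat_sz)
		return -1;
	memcpy(cat, rest, len);
	cat[len] = '\0';

	len = strlen(slash + 1);
	if (len == 0 || len >= name_sz)
		return -1;
	memcpy(name, slash + 1, len + 1); /* copies the terminating NUL too */
	return 0;
}
```

Given "dynamic_tp/kprobes/my_dynamic_tp", this produces the category "kprobes" and the name "my_dynamic_tp", which is the pair a tracepoint attach needs to locate the event under tracefs.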