mbox series

[0/8] bpf: Add fprobe link

Message ID 20220202135333.190761-1-jolsa@kernel.org (mailing list archive)
Headers show
Series bpf: Add fprobe link | expand

Message

Jiri Olsa Feb. 2, 2022, 1:53 p.m. UTC
hi,
this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
program through fprobe API [1] instroduced by Masami.

The fprobe API allows to attach probe on multiple functions at once very
fast, because it works on top of ftrace. On the other hand this limits
the probe point to the function entry or return.

With bpftrace support I see following attach speed:

  # perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
  Attaching 2 probes...
  Attaching 3342 functions
  ...

  1.4960 +- 0.0285 seconds time elapsed  ( +-  1.91% )

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/fprobe_link

thanks,
jirka


[1] https://lore.kernel.org/bpf/20220202162925.bd74e7970fc35cb4236eef48@kernel.org/T/#t
---
Jiri Olsa (8):
      bpf: Add support to attach kprobe program with fprobe
      bpf: Add bpf_get_func_ip kprobe helper for fprobe link
      bpf: Add bpf_cookie support to fprobe
      libbpf: Add libbpf__kallsyms_parse function
      libbpf: Add bpf_link_create support for multi kprobes
      libbpf: Add bpf_program__attach_kprobe_opts for multi kprobes
      selftest/bpf: Add fprobe attach test
      selftest/bpf: Add fprobe test for bpf_cookie values

 include/linux/bpf.h                                   |   2 +
 include/linux/bpf_types.h                             |   1 +
 include/uapi/linux/bpf.h                              |  14 +++++
 kernel/bpf/syscall.c                                  | 327 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/bpf/verifier.c                                 |  19 +++++-
 kernel/trace/bpf_trace.c                              |  32 +++++++++-
 tools/include/uapi/linux/bpf.h                        |  14 +++++
 tools/lib/bpf/bpf.c                                   |   7 +++
 tools/lib/bpf/bpf.h                                   |   9 ++-
 tools/lib/bpf/libbpf.c                                | 198 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
 tools/lib/bpf/libbpf_internal.h                       |   5 ++
 tools/testing/selftests/bpf/prog_tests/bpf_cookie.c   |  73 ++++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/fprobe_test.c  | 117 +++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/fprobe.c            |  58 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c |  62 +++++++++++++++++++
 15 files changed, 902 insertions(+), 36 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fprobe_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fprobe.c
 create mode 100644 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c

Comments

Alexei Starovoitov Feb. 2, 2022, 5:09 p.m. UTC | #1
On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> hi,
> this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> program through fprobe API [1] instroduced by Masami.

No new prog type please.
I thought I made my reasons clear earlier.
It's a multi kprobe. Not a fprobe or any other name.
The kernel internal names should not leak into uapi.
Jiri Olsa Feb. 2, 2022, 5:24 p.m. UTC | #2
On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > hi,
> > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > program through fprobe API [1] instroduced by Masami.
> 
> No new prog type please.
> I thought I made my reasons clear earlier.
> It's a multi kprobe. Not a fprobe or any other name.
> The kernel internal names should not leak into uapi.
> 

well it's not new prog type, it's new link type that allows
to attach kprobe program to multiple functions

the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
seem to fit anymore, so I moved to FPROBE, because that's what
it is ;-)

but if you don't want new name in uapi we could make this more
obvious with link name:
  BPF_LINK_TYPE_MULTI_KPROBE

and bpf_attach_type:
  BPF_TRACE_MULTI_KPROBE

jirka
Alexei Starovoitov Feb. 2, 2022, 5:30 p.m. UTC | #3
On Wed, Feb 2, 2022 at 9:24 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > hi,
> > > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > > program through fprobe API [1] instroduced by Masami.
> >
> > No new prog type please.
> > I thought I made my reasons clear earlier.
> > It's a multi kprobe. Not a fprobe or any other name.
> > The kernel internal names should not leak into uapi.
> >
>
> well it's not new prog type, it's new link type that allows
> to attach kprobe program to multiple functions
>
> the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
> seem to fit anymore, so I moved to FPROBE, because that's what
> it is ;-)

Now I don't like the fprobe name even more.
Why invent new names? It's an ftrace interface.

> but if you don't want new name in uapi we could make this more
> obvious with link name:
>   BPF_LINK_TYPE_MULTI_KPROBE
>
> and bpf_attach_type:
>   BPF_TRACE_MULTI_KPROBE

I'd rather get rid of fprobe name first.
Jiri Olsa Feb. 3, 2022, 3:06 p.m. UTC | #4
On Wed, Feb 02, 2022 at 09:30:21AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 2, 2022 at 9:24 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> > > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > hi,
> > > > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > > > program through fprobe API [1] instroduced by Masami.
> > >
> > > No new prog type please.
> > > I thought I made my reasons clear earlier.
> > > It's a multi kprobe. Not a fprobe or any other name.
> > > The kernel internal names should not leak into uapi.
> > >
> >
> > well it's not new prog type, it's new link type that allows
> > to attach kprobe program to multiple functions
> >
> > the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
> > seem to fit anymore, so I moved to FPROBE, because that's what
> > it is ;-)
> 
> Now I don't like the fprobe name even more.
> Why invent new names? It's an ftrace interface.

how about ftrace_probe ?

> 
> > but if you don't want new name in uapi we could make this more
> > obvious with link name:
> >   BPF_LINK_TYPE_MULTI_KPROBE
> >
> > and bpf_attach_type:
> >   BPF_TRACE_MULTI_KPROBE
> 
> I'd rather get rid of fprobe name first.
>

Masami, any idea?

thanks,
jirka
Masami Hiramatsu (Google) Feb. 4, 2022, 12:46 a.m. UTC | #5
On Thu, 3 Feb 2022 16:06:36 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> On Wed, Feb 02, 2022 at 09:30:21AM -0800, Alexei Starovoitov wrote:
> > On Wed, Feb 2, 2022 at 9:24 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> > > > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > hi,
> > > > > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > > > > program through fprobe API [1] instroduced by Masami.
> > > >
> > > > No new prog type please.
> > > > I thought I made my reasons clear earlier.
> > > > It's a multi kprobe. Not a fprobe or any other name.
> > > > The kernel internal names should not leak into uapi.
> > > >
> > >
> > > well it's not new prog type, it's new link type that allows
> > > to attach kprobe program to multiple functions
> > >
> > > the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
> > > seem to fit anymore, so I moved to FPROBE, because that's what
> > > it is ;-)
> > 
> > Now I don't like the fprobe name even more.
> > Why invent new names? It's an ftrace interface.
> 
> how about ftrace_probe ?

I thought What Alexei pointed was that don't expose the FPROBE name
to user space. If so, I agree with that. We can continue to use
KPROBE for user space. Using fprobe is just for kernel implementation.

It means that we may better to keep simple mind model (there are only
static event or dynamic kprobe event).


> > > but if you don't want new name in uapi we could make this more
> > > obvious with link name:
> > >   BPF_LINK_TYPE_MULTI_KPROBE
> > >
> > > and bpf_attach_type:
> > >   BPF_TRACE_MULTI_KPROBE
> > 
> > I'd rather get rid of fprobe name first.
> >
> 
> Masami, any idea?

Can't we continue to use kprobe prog type for user interface
and internally, if there are multiple kprobes or kretprobes
required, switch to use fprobe?

Thank you,

> 
> thanks,
> jirka
>
Alexei Starovoitov Feb. 4, 2022, 1:34 a.m. UTC | #6
On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> I thought What Alexei pointed was that don't expose the FPROBE name
> to user space. If so, I agree with that. We can continue to use
> KPROBE for user space. Using fprobe is just for kernel implementation.

Clearly that intent is not working.
The "fprobe" name is already leaking outside of the kernel internals.
The module interface is being proposed.
You'd need to document it, etc.
I think it's only causing confusion to users.
The new name serves no additional purpose other than
being new and unheard of.
fprobe is kprobe on ftrace. That's it.
Just call it kprobe on ftrace in api and everywhere.
Please?
Masami Hiramatsu (Google) Feb. 4, 2022, 2:07 a.m. UTC | #7
On Thu, 3 Feb 2022 17:34:54 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > I thought What Alexei pointed was that don't expose the FPROBE name
> > to user space. If so, I agree with that. We can continue to use
> > KPROBE for user space. Using fprobe is just for kernel implementation.
> 
> Clearly that intent is not working.

Thanks for confirmation :-)

> The "fprobe" name is already leaking outside of the kernel internals.
> The module interface is being proposed.

Yes, but that is only for making the example module.
It is easy for me to enclose it inside kernel. I'm preparing KUnit
selftest code for next version. After integrated that, we don't need
that example module anymore.

> You'd need to document it, etc.

Yes, I've added a document of the APIs for the series.  :-)

> I think it's only causing confusion to users.
> The new name serves no additional purpose other than
> being new and unheard of.
> fprobe is kprobe on ftrace. That's it.

No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
transparently.

> Just call it kprobe on ftrace in api and everywhere.
> Please?

Hmm, no, I think that's the work for who provide user-interface, isn't it?.
Inside kernel, IMHO, the interface named from the programing viewpoint, and
from that viewpoint, fprobe and kprobe interface are similar but different.

I'm able to allow kprobe-event (of ftrace) to accept "func*" (yeah, that's
actually good idea), but ftrace interface will not export as fprobe. Even if
it internally uses fprobe, I don't call it fprobe. It's kprobes from the
viewpoint of ftrace user. (Yeah, I think it should be called as 
"dynamic-probe-event-for-kernel" but historically, it is called as kprobe-event.)

Thank you,
Alexei Starovoitov Feb. 4, 2022, 2:12 a.m. UTC | #8
On Thu, Feb 3, 2022 at 6:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 3 Feb 2022 17:34:54 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > I thought What Alexei pointed was that don't expose the FPROBE name
> > > to user space. If so, I agree with that. We can continue to use
> > > KPROBE for user space. Using fprobe is just for kernel implementation.
> >
> > Clearly that intent is not working.
>
> Thanks for confirmation :-)
>
> > The "fprobe" name is already leaking outside of the kernel internals.
> > The module interface is being proposed.
>
> Yes, but that is only for making the example module.
> It is easy for me to enclose it inside kernel. I'm preparing KUnit
> selftest code for next version. After integrated that, we don't need
> that example module anymore.
>
> > You'd need to document it, etc.
>
> Yes, I've added a document of the APIs for the series.  :-)
>
> > I think it's only causing confusion to users.
> > The new name serves no additional purpose other than
> > being new and unheard of.
> > fprobe is kprobe on ftrace. That's it.
>
> No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> transparently.

Not true.
fprobe is nothing but _explicit_ kprobe on ftrace.
There was an implicit optimization for kprobe when ftrace
could be used.
All this new interface is doing is making it explicit.
So a new name is not warranted here.

> from that viewpoint, fprobe and kprobe interface are similar but different.

What is the difference?
I don't see it.
Steven Rostedt Feb. 4, 2022, 2:19 a.m. UTC | #9
On Thu, 3 Feb 2022 18:12:11 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > transparently.  
> 
> Not true.
> fprobe is nothing but _explicit_ kprobe on ftrace.
> There was an implicit optimization for kprobe when ftrace
> could be used.
> All this new interface is doing is making it explicit.
> So a new name is not warranted here.
> 
> > from that viewpoint, fprobe and kprobe interface are similar but different.  
> 
> What is the difference?
> I don't see it.

IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
abilities that a normal kprobe does not. Namely, "what is the function
parameters?"

You can only reliably get the parameters at function entry. Hence, by
having a probe that is unique to functions as supposed to the middle of a
function, makes sense to me.

That is, the API can change. "Give me parameter X". That along with some
BTF reading, could figure out how to get parameter X, and record that.

-- Steve
Alexei Starovoitov Feb. 4, 2022, 2:42 a.m. UTC | #10
On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 3 Feb 2022 18:12:11 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > transparently.
> >
> > Not true.
> > fprobe is nothing but _explicit_ kprobe on ftrace.
> > There was an implicit optimization for kprobe when ftrace
> > could be used.
> > All this new interface is doing is making it explicit.
> > So a new name is not warranted here.
> >
> > > from that viewpoint, fprobe and kprobe interface are similar but different.
> >
> > What is the difference?
> > I don't see it.
>
> IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> abilities that a normal kprobe does not. Namely, "what is the function
> parameters?"
>
> You can only reliably get the parameters at function entry. Hence, by
> having a probe that is unique to functions as supposed to the middle of a
> function, makes sense to me.
>
> That is, the API can change. "Give me parameter X". That along with some
> BTF reading, could figure out how to get parameter X, and record that.

This is more or less a description of kprobe on ftrace :)
The bpf+kprobe users were relying on that for a long time.
See PT_REGS_PARM1() macros in bpf_tracing.h
They're meaningful only with kprobe on ftrace.
So, no, fprobe is not inventing anything new here.

No one is using kprobe in the middle of the function.
It's too difficult to make anything useful out of it,
so no one bothers.
When people say "kprobe" 99 out of 100 they mean
kprobe on ftrace/fentry.
Masami Hiramatsu (Google) Feb. 4, 2022, 3:14 a.m. UTC | #11
On Thu, 3 Feb 2022 18:12:11 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 6:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 3 Feb 2022 17:34:54 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > I thought What Alexei pointed was that don't expose the FPROBE name
> > > > to user space. If so, I agree with that. We can continue to use
> > > > KPROBE for user space. Using fprobe is just for kernel implementation.
> > >
> > > Clearly that intent is not working.
> >
> > Thanks for confirmation :-)
> >
> > > The "fprobe" name is already leaking outside of the kernel internals.
> > > The module interface is being proposed.
> >
> > Yes, but that is only for making the example module.
> > It is easy for me to enclose it inside kernel. I'm preparing KUnit
> > selftest code for next version. After integrated that, we don't need
> > that example module anymore.
> >
> > > You'd need to document it, etc.
> >
> > Yes, I've added a document of the APIs for the series.  :-)
> >
> > > I think it's only causing confusion to users.
> > > The new name serves no additional purpose other than
> > > being new and unheard of.
> > > fprobe is kprobe on ftrace. That's it.
> >
> > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > transparently.
> 
> Not true.
> fprobe is nothing but _explicit_ kprobe on ftrace.
> There was an implicit optimization for kprobe when ftrace
> could be used.
> All this new interface is doing is making it explicit.
> So a new name is not warranted here.
> 
> > from that viewpoint, fprobe and kprobe interface are similar but different.
> 
> What is the difference?
> I don't see it.

From the raw-kernel programer's viewpoint, here are the differences.

kprobes is focusing on probing just a single probe point, and it can probe
everywhere including function body. With this charactoristics, user can
made a callback logic which is specialized for a specific address.

typedef int (*kprobe_pre_handler_t) (struct kprobe *, struct pt_regs *);


On the other hand, fprobe focuses on the multiple function entry and exit.
That is just a wrapper of ftrace. So callbacks will need to check the
function IP and change their behavior according to the IP.

        void (*entry_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);
        void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);

This is why the fprobe handler gets @entry_ip for the handlers.

However, from viewpoint of the higher level users, those may look same
because both interrupts the kernel execution and callback their program
like BPF. BPF can select collect program according to the instruction_pointer
of @regs in both case.

In that case, I think it is natual that the BPF layer hides those differences
from user, by abstracting those as a generic "kprobe" which means an idea of
the general kernel instrumentation.

Thank you,
Masami Hiramatsu (Google) Feb. 4, 2022, 3:17 a.m. UTC | #12
On Thu, 3 Feb 2022 18:42:22 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Thu, 3 Feb 2022 18:12:11 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > transparently.
> > >
> > > Not true.
> > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > There was an implicit optimization for kprobe when ftrace
> > > could be used.
> > > All this new interface is doing is making it explicit.
> > > So a new name is not warranted here.
> > >
> > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > >
> > > What is the difference?
> > > I don't see it.
> >
> > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > abilities that a normal kprobe does not. Namely, "what is the function
> > parameters?"
> >
> > You can only reliably get the parameters at function entry. Hence, by
> > having a probe that is unique to functions as supposed to the middle of a
> > function, makes sense to me.
> >
> > That is, the API can change. "Give me parameter X". That along with some
> > BTF reading, could figure out how to get parameter X, and record that.
> 
> This is more or less a description of kprobe on ftrace :)
> The bpf+kprobe users were relying on that for a long time.
> See PT_REGS_PARM1() macros in bpf_tracing.h
> They're meaningful only with kprobe on ftrace.
> So, no, fprobe is not inventing anything new here.
> 
> No one is using kprobe in the middle of the function.
> It's too difficult to make anything useful out of it,
> so no one bothers.

Perf-probe makes it very easy, as easy as gdb does. :-)

Thank you,

> When people say "kprobe" 99 out of 100 they mean
> kprobe on ftrace/fentry.
Masami Hiramatsu (Google) Feb. 4, 2022, 3:59 a.m. UTC | #13
Hi Alexei,

On Thu, 3 Feb 2022 18:42:22 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Thu, 3 Feb 2022 18:12:11 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > transparently.
> > >
> > > Not true.
> > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > There was an implicit optimization for kprobe when ftrace
> > > could be used.
> > > All this new interface is doing is making it explicit.
> > > So a new name is not warranted here.
> > >
> > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > >
> > > What is the difference?
> > > I don't see it.
> >
> > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > abilities that a normal kprobe does not. Namely, "what is the function
> > parameters?"
> >
> > You can only reliably get the parameters at function entry. Hence, by
> > having a probe that is unique to functions as supposed to the middle of a
> > function, makes sense to me.
> >
> > That is, the API can change. "Give me parameter X". That along with some
> > BTF reading, could figure out how to get parameter X, and record that.
> 
> This is more or less a description of kprobe on ftrace :)
> The bpf+kprobe users were relying on that for a long time.
> See PT_REGS_PARM1() macros in bpf_tracing.h
> They're meaningful only with kprobe on ftrace.
> So, no, fprobe is not inventing anything new here.

Hmm, you may be misleading why PT_REGS_PARAM1() macro works. You can use
it even if CONFIG_FUNCITON_TRACER=n if your kernel is built with
CONFIG_KPROBES=y. It is valid unless you put a probe out of function
entry.

> No one is using kprobe in the middle of the function.
> It's too difficult to make anything useful out of it,
> so no one bothers.
> When people say "kprobe" 99 out of 100 they mean
> kprobe on ftrace/fentry.

I see. But the kprobe is kprobe. It is not designed to support multiple
probe points. If I'm forced to say, I can rename the struct fprobe to
struct multi_kprobe, but that doesn't change the essence. You may need
to use both of kprobes and so-called multi_kprobe properly. (Someone
need to do that.)

Thank you,
Jiri Olsa Feb. 15, 2022, 1:21 p.m. UTC | #14
On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> Hi Alexei,
> 
> On Thu, 3 Feb 2022 18:42:22 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > >
> > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > transparently.
> > > >
> > > > Not true.
> > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > There was an implicit optimization for kprobe when ftrace
> > > > could be used.
> > > > All this new interface is doing is making it explicit.
> > > > So a new name is not warranted here.
> > > >
> > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > >
> > > > What is the difference?
> > > > I don't see it.
> > >
> > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > abilities that a normal kprobe does not. Namely, "what is the function
> > > parameters?"
> > >
> > > You can only reliably get the parameters at function entry. Hence, by
> > > having a probe that is unique to functions as supposed to the middle of a
> > > function, makes sense to me.
> > >
> > > That is, the API can change. "Give me parameter X". That along with some
> > > BTF reading, could figure out how to get parameter X, and record that.
> > 
> > This is more or less a description of kprobe on ftrace :)
> > The bpf+kprobe users were relying on that for a long time.
> > See PT_REGS_PARM1() macros in bpf_tracing.h
> > They're meaningful only with kprobe on ftrace.
> > So, no, fprobe is not inventing anything new here.
> 
> Hmm, you may be misleading why PT_REGS_PARAM1() macro works. You can use
> it even if CONFIG_FUNCITON_TRACER=n if your kernel is built with
> CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> entry.
> 
> > No one is using kprobe in the middle of the function.
> > It's too difficult to make anything useful out of it,
> > so no one bothers.
> > When people say "kprobe" 99 out of 100 they mean
> > kprobe on ftrace/fentry.
> 
> I see. But the kprobe is kprobe. It is not designed to support multiple
> probe points. If I'm forced to say, I can rename the struct fprobe to
> struct multi_kprobe, but that doesn't change the essence. You may need
> to use both of kprobes and so-called multi_kprobe properly. (Someone
> need to do that.)

hi,
tying to kick things further ;-) I was thinking about bpf side of this
and we could use following interface:

  enum bpf_attach_type {
    ...
    BPF_TRACE_KPROBE_MULTI
  };

  enum bpf_link_type {
    ...
    BPF_LINK_TYPE_KPROBE_MULTI
  };

  union bpf_attr {

    struct {
      ...
      struct {
        __aligned_u64   syms;
        __aligned_u64   addrs;
        __aligned_u64   cookies;
        __u32           cnt;
        __u32           flags;
      } kprobe_multi;
    } link_create;
  }

because from bpf user POV it's new link for attaching multiple kprobes
and I agree new 'fprobe' type name in here brings more confusion, using
kprobe_multi is straightforward

thoguhts?

thanks,
jirka
Andrii Nakryiko Feb. 16, 2022, 6:27 p.m. UTC | #15
On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > Hi Alexei,
> >
> > On Thu, 3 Feb 2022 18:42:22 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > >
> > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > transparently.
> > > > >
> > > > > Not true.
> > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > There was an implicit optimization for kprobe when ftrace
> > > > > could be used.
> > > > > All this new interface is doing is making it explicit.
> > > > > So a new name is not warranted here.
> > > > >
> > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > >
> > > > > What is the difference?
> > > > > I don't see it.
> > > >
> > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > parameters?"
> > > >
> > > > You can only reliably get the parameters at function entry. Hence, by
> > > > having a probe that is unique to functions as supposed to the middle of a
> > > > function, makes sense to me.
> > > >
> > > > That is, the API can change. "Give me parameter X". That along with some
> > > > BTF reading, could figure out how to get parameter X, and record that.
> > >
> > > This is more or less a description of kprobe on ftrace :)
> > > The bpf+kprobe users were relying on that for a long time.
> > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > They're meaningful only with kprobe on ftrace.
> > > So, no, fprobe is not inventing anything new here.
> >
> > Hmm, you may be misleading why PT_REGS_PARAM1() macro works. You can use
> > it even if CONFIG_FUNCITON_TRACER=n if your kernel is built with
> > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > entry.
> >
> > > No one is using kprobe in the middle of the function.
> > > It's too difficult to make anything useful out of it,
> > > so no one bothers.
> > > When people say "kprobe" 99 out of 100 they mean
> > > kprobe on ftrace/fentry.
> >
> > I see. But the kprobe is kprobe. It is not designed to support multiple
> > probe points. If I'm forced to say, I can rename the struct fprobe to
> > struct multi_kprobe, but that doesn't change the essence. You may need
> > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > need to do that.)
>
> hi,
> tying to kick things further ;-) I was thinking about bpf side of this
> and we could use following interface:
>
>   enum bpf_attach_type {
>     ...
>     BPF_TRACE_KPROBE_MULTI
>   };
>
>   enum bpf_link_type {
>     ...
>     BPF_LINK_TYPE_KPROBE_MULTI
>   };
>
>   union bpf_attr {
>
>     struct {
>       ...
>       struct {
>         __aligned_u64   syms;
>         __aligned_u64   addrs;
>         __aligned_u64   cookies;
>         __u32           cnt;
>         __u32           flags;
>       } kprobe_multi;
>     } link_create;
>   }
>
> because from bpf user POV it's new link for attaching multiple kprobes
> and I agree new 'fprobe' type name in here brings more confusion, using
> kprobe_multi is straightforward
>
> thoguhts?

I think this makes sense. We do need new type of link to store ip ->
cookie mapping anyways.

Is there any chance to support this fast multi-attach for uprobe? If
yes, we might want to reuse the same link for both (so should we name
it more generically? on the other hand BPF program type for uprobe is
BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
consistent with what we have today).

But yeah, the main question is whether there is something preventing
us from supporting multi-attach uprobe as well? It would be really
great for USDT use case.

>
> thanks,
> jirka
Masami Hiramatsu (Google) Feb. 17, 2022, 2:03 p.m. UTC | #16
On Wed, 16 Feb 2022 10:27:19 -0800
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > Hi Alexei,
> > >
> > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > >
> > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > >
> > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > transparently.
> > > > > >
> > > > > > Not true.
> > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > could be used.
> > > > > > All this new interface is doing is making it explicit.
> > > > > > So a new name is not warranted here.
> > > > > >
> > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > >
> > > > > > What is the difference?
> > > > > > I don't see it.
> > > > >
> > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > > parameters?"
> > > > >
> > > > > You can only reliably get the parameters at function entry. Hence, by
> > > > > having a probe that is unique to functions as supposed to the middle of a
> > > > > function, makes sense to me.
> > > > >
> > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > >
> > > > This is more or less a description of kprobe on ftrace :)
> > > > The bpf+kprobe users were relying on that for a long time.
> > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > They're meaningful only with kprobe on ftrace.
> > > > So, no, fprobe is not inventing anything new here.
> > >
> > > Hmm, you may be misleading why PT_REGS_PARAM1() macro works. You can use
> > > it even if CONFIG_FUNCITON_TRACER=n if your kernel is built with
> > > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > > entry.
> > >
> > > > No one is using kprobe in the middle of the function.
> > > > It's too difficult to make anything useful out of it,
> > > > so no one bothers.
> > > > When people say "kprobe" 99 out of 100 they mean
> > > > kprobe on ftrace/fentry.
> > >
> > > I see. But the kprobe is kprobe. It is not designed to support multiple
> > > probe points. If I'm forced to say, I can rename the struct fprobe to
> > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > > need to do that.)
> >
> > hi,
> > tying to kick things further ;-) I was thinking about bpf side of this
> > and we could use following interface:
> >
> >   enum bpf_attach_type {
> >     ...
> >     BPF_TRACE_KPROBE_MULTI
> >   };
> >
> >   enum bpf_link_type {
> >     ...
> >     BPF_LINK_TYPE_KPROBE_MULTI
> >   };
> >
> >   union bpf_attr {
> >
> >     struct {
> >       ...
> >       struct {
> >         __aligned_u64   syms;
> >         __aligned_u64   addrs;
> >         __aligned_u64   cookies;
> >         __u32           cnt;
> >         __u32           flags;
> >       } kprobe_multi;
> >     } link_create;
> >   }
> >
> > because from bpf user POV it's new link for attaching multiple kprobes
> > and I agree new 'fprobe' type name in here brings more confusion, using
> > kprobe_multi is straightforward
> >
> > thoguhts?
> 
> I think this makes sense. We do need new type of link to store ip ->
> cookie mapping anyways.

This looks good to me too.

> 
> Is there any chance to support this fast multi-attach for uprobe? If
> yes, we might want to reuse the same link for both (so should we name
> it more generically?

There is no interface to do that but also there is no limitation to
expand uprobes. For the kprobes, there are some limitations for the
function entry because it needs to share the space with ftrace. So
I introduced fprobe for easier to use.

> on the other hand BPF program type for uprobe is
> BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> consistent with what we have today).

Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
the target program.)


> But yeah, the main question is whether there is something preventing
> us from supporting multi-attach uprobe as well? It would be really
> great for USDT use case.

Ah, for the USDT, it will be useful. But since now we will have "user-event"
which is faster than uprobes, we may be better to consider to use it.

I'm not so sure how uprobes probes the target process, but maybe it has
to manage some memory pages and task related things. If we can split
those task-related part from struct uprobe software-breakpoint part,
it maybe easy to support multiple probe (one task-related part + multiple
software-breakpoint parts.)

Thank you,
Andrii Nakryiko Feb. 17, 2022, 10:01 p.m. UTC | #17
On Thu, Feb 17, 2022 at 6:04 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Wed, 16 Feb 2022 10:27:19 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > > Hi Alexei,
> > > >
> > > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > > >
> > > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > > transparently.
> > > > > > >
> > > > > > > Not true.
> > > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > > could be used.
> > > > > > > All this new interface is doing is making it explicit.
> > > > > > > So a new name is not warranted here.
> > > > > > >
> > > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > > >
> > > > > > > What is the difference?
> > > > > > > I don't see it.
> > > > > >
> > > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > > > parameters?"
> > > > > >
> > > > > > You can only reliably get the parameters at function entry. Hence, by
> > > > > > having a probe that is unique to functions as supposed to the middle of a
> > > > > > function, makes sense to me.
> > > > > >
> > > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > > >
> > > > > This is more or less a description of kprobe on ftrace :)
> > > > > The bpf+kprobe users were relying on that for a long time.
> > > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > > They're meaningful only with kprobe on ftrace.
> > > > > So, no, fprobe is not inventing anything new here.
> > > >
> > > > Hmm, you may be misleading why PT_REGS_PARAM1() macro works. You can use
> > > > it even if CONFIG_FUNCITON_TRACER=n if your kernel is built with
> > > > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > > > entry.
> > > >
> > > > > No one is using kprobe in the middle of the function.
> > > > > It's too difficult to make anything useful out of it,
> > > > > so no one bothers.
> > > > > When people say "kprobe" 99 out of 100 they mean
> > > > > kprobe on ftrace/fentry.
> > > >
> > > > I see. But the kprobe is kprobe. It is not designed to support multiple
> > > > probe points. If I'm forced to say, I can rename the struct fprobe to
> > > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > > > need to do that.)
> > >
> > > hi,
> > > tying to kick things further ;-) I was thinking about bpf side of this
> > > and we could use following interface:
> > >
> > >   enum bpf_attach_type {
> > >     ...
> > >     BPF_TRACE_KPROBE_MULTI
> > >   };
> > >
> > >   enum bpf_link_type {
> > >     ...
> > >     BPF_LINK_TYPE_KPROBE_MULTI
> > >   };
> > >
> > >   union bpf_attr {
> > >
> > >     struct {
> > >       ...
> > >       struct {
> > >         __aligned_u64   syms;
> > >         __aligned_u64   addrs;
> > >         __aligned_u64   cookies;
> > >         __u32           cnt;
> > >         __u32           flags;
> > >       } kprobe_multi;
> > >     } link_create;
> > >   }
> > >
> > > because from bpf user POV it's new link for attaching multiple kprobes
> > > and I agree new 'fprobe' type name in here brings more confusion, using
> > > kprobe_multi is straightforward
> > >
> > > thoguhts?
> >
> > I think this makes sense. We do need new type of link to store ip ->
> > cookie mapping anyways.
>
> This looks good to me too.
>
> >
> > Is there any chance to support this fast multi-attach for uprobe? If
> > yes, we might want to reuse the same link for both (so should we name
> > it more generically?
>
> There is no interface to do that but also there is no limitation to
> expand uprobes. For the kprobes, there are some limitations for the
> function entry because it needs to share the space with ftrace. So
> I introduced fprobe for easier to use.
>
> > on the other hand BPF program type for uprobe is
> > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > consistent with what we have today).
>
> Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> the target program.)
>

We've been talking about sleepable uprobe programs, so we might need
to add uprobe-specific program type, probably. But historically, from
BPF point of view there was no difference between kprobe and uprobe
programs (in terms of how they are run and what's available to them).
From BPF point of view, it was just attaching BPF program to a
perf_event.

>
> > But yeah, the main question is whether there is something preventing
> > us from supporting multi-attach uprobe as well? It would be really
> > great for USDT use case.
>
> Ah, for the USDT, it will be useful. But since now we will have "user-event"
> which is faster than uprobes, we may be better to consider to use it.

Any pointers? I'm not sure what "user-event" refers to.

>
> I'm not so sure how uprobes probes the target process, but maybe it has
> to manage some memory pages and task related things. If we can split
> those task-related part from struct uprobe software-breakpoint part,
> it maybe easy to support multiple probe (one task-related part + multiple
> software-breakpoint parts.)
>
> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>
Masami Hiramatsu (Google) Feb. 18, 2022, 4:07 a.m. UTC | #18
On Thu, 17 Feb 2022 14:01:30 -0800
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:


> > > Is there any chance to support this fast multi-attach for uprobe? If
> > > yes, we might want to reuse the same link for both (so should we name
> > > it more generically?
> >
> > There is no interface to do that but also there is no limitation to
> > expand uprobes. For the kprobes, there are some limitations for the
> > function entry because it needs to share the space with ftrace. So
> > I introduced fprobe for easier to use.
> >
> > > on the other hand BPF program type for uprobe is
> > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > consistent with what we have today).
> >
> > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > the target program.)
> >
> 
> We've been talking about sleepable uprobe programs, so we might need
> to add uprobe-specific program type, probably. But historically, from
> BPF point of view there was no difference between kprobe and uprobe
> programs (in terms of how they are run and what's available to them).
> From BPF point of view, it was just attaching BPF program to a
> perf_event.

Got it, so that will reuse the uprobe_events in ftrace. But I think
the uprobe requires a "path" to the attached binary, how is it
specified?

> > > But yeah, the main question is whether there is something preventing
> > > us from supporting multi-attach uprobe as well? It would be really
> > > great for USDT use case.
> >
> > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > which is faster than uprobes, we may be better to consider to use it.
> 
> Any pointers? I'm not sure what "user-event" refers to.

Here is the user-events series, which allows user program to define
raw dynamic events and it can write raw event data directly from
user space.

https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/

Thank you,
Andrii Nakryiko Feb. 18, 2022, 7:46 p.m. UTC | #19
On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 17 Feb 2022 14:01:30 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
>
> > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > yes, we might want to reuse the same link for both (so should we name
> > > > it more generically?
> > >
> > > There is no interface to do that but also there is no limitation to
> > > expand uprobes. For the kprobes, there are some limitations for the
> > > function entry because it needs to share the space with ftrace. So
> > > I introduced fprobe for easier to use.
> > >
> > > > on the other hand BPF program type for uprobe is
> > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > consistent with what we have today).
> > >
> > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > the target program.)
> > >
> >
> > We've been talking about sleepable uprobe programs, so we might need
> > to add uprobe-specific program type, probably. But historically, from
> > BPF point of view there was no difference between kprobe and uprobe
> > programs (in terms of how they are run and what's available to them).
> > From BPF point of view, it was just attaching BPF program to a
> > perf_event.
>
> Got it, so that will reuse the uprobe_events in ftrace. But I think
> the uprobe requires a "path" to the attached binary, how is it
> specified?

It's passed as a string to perf subsystem during perf_event_open() syscall.

>
> > > > But yeah, the main question is whether there is something preventing
> > > > us from supporting multi-attach uprobe as well? It would be really
> > > > great for USDT use case.
> > >
> > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > which is faster than uprobes, we may be better to consider to use it.
> >
> > Any pointers? I'm not sure what "user-event" refers to.
>
> Here is the user-events series, which allows user program to define
> raw dynamic events and it can write raw event data directly from
> user space.
>
> https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/
>

Thanks for the link! I'll check it out.

> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>
Alexei Starovoitov Feb. 19, 2022, 2:10 a.m. UTC | #20
On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 17 Feb 2022 14:01:30 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
>
> > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > yes, we might want to reuse the same link for both (so should we name
> > > > it more generically?
> > >
> > > There is no interface to do that but also there is no limitation to
> > > expand uprobes. For the kprobes, there are some limitations for the
> > > function entry because it needs to share the space with ftrace. So
> > > I introduced fprobe for easier to use.
> > >
> > > > on the other hand BPF program type for uprobe is
> > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > consistent with what we have today).
> > >
> > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > the target program.)
> > >
> >
> > We've been talking about sleepable uprobe programs, so we might need
> > to add uprobe-specific program type, probably. But historically, from
> > BPF point of view there was no difference between kprobe and uprobe
> > programs (in terms of how they are run and what's available to them).
> > From BPF point of view, it was just attaching BPF program to a
> > perf_event.
>
> Got it, so that will reuse the uprobe_events in ftrace. But I think
> the uprobe requires a "path" to the attached binary, how is it
> specified?
>
> > > > But yeah, the main question is whether there is something preventing
> > > > us from supporting multi-attach uprobe as well? It would be really
> > > > great for USDT use case.
> > >
> > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > which is faster than uprobes, we may be better to consider to use it.
> >
> > Any pointers? I'm not sure what "user-event" refers to.
>
> Here is the user-events series, which allows user program to define
> raw dynamic events and it can write raw event data directly from
> user space.
>
> https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/

Is this a way for user space to inject user bytes into kernel events?
What is the use case?
Masami Hiramatsu (Google) Feb. 21, 2022, 7:18 a.m. UTC | #21
On Fri, 18 Feb 2022 18:10:08 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 17 Feb 2022 14:01:30 -0800
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> >
> > > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > > yes, we might want to reuse the same link for both (so should we name
> > > > > it more generically?
> > > >
> > > > There is no interface to do that but also there is no limitation to
> > > > expand uprobes. For the kprobes, there are some limitations for the
> > > > function entry because it needs to share the space with ftrace. So
> > > > I introduced fprobe for easier to use.
> > > >
> > > > > on the other hand BPF program type for uprobe is
> > > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > > consistent with what we have today).
> > > >
> > > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > > the target program.)
> > > >
> > >
> > > We've been talking about sleepable uprobe programs, so we might need
> > > to add uprobe-specific program type, probably. But historically, from
> > > BPF point of view there was no difference between kprobe and uprobe
> > > programs (in terms of how they are run and what's available to them).
> > > From BPF point of view, it was just attaching BPF program to a
> > > perf_event.
> >
> > Got it, so that will reuse the uprobe_events in ftrace. But I think
> > the uprobe requires a "path" to the attached binary, how is it
> > specified?
> >
> > > > > But yeah, the main question is whether there is something preventing
> > > > > us from supporting multi-attach uprobe as well? It would be really
> > > > > great for USDT use case.
> > > >
> > > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > > which is faster than uprobes, we may be better to consider to use it.
> > >
> > > Any pointers? I'm not sure what "user-event" refers to.
> >
> > Here is the user-events series, which allows user program to define
> > raw dynamic events and it can write raw event data directly from
> > user space.
> >
> > https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/
> 
> Is this a way for user space to inject user bytes into kernel events?

Yes, it is.

> What is the use case?

This is like trace_marker but more ftrace/perf friendly version. The trace_marker
can only send a user string, and the kernel can not parse it. Thus, the traced
data will be shown in the trace buffer, but the event filter, event trigger,
histogram etc didn't work with trace_marker.

On the other hand, the user-events allows user-space defines new events with
various arguments with types, and the application can send the formatted raw
data to the kernel. Thus the kernel can apply event filter, event trigger and
histograms on those events as same as other kernel defined events.

This will be helpful for users to push their own data as events of ftrace
and perf (and eBPF I think) so that they can use those tracing tools to analyze
both of their events and kernel events. :-)

Thank you,
Jiri Olsa Feb. 22, 2022, 12:42 p.m. UTC | #22
On Wed, Feb 16, 2022 at 10:27:19AM -0800, Andrii Nakryiko wrote:

SNIP

> >
> > hi,
> > tying to kick things further ;-) I was thinking about bpf side of this
> > and we could use following interface:
> >
> >   enum bpf_attach_type {
> >     ...
> >     BPF_TRACE_KPROBE_MULTI
> >   };
> >
> >   enum bpf_link_type {
> >     ...
> >     BPF_LINK_TYPE_KPROBE_MULTI
> >   };
> >
> >   union bpf_attr {
> >
> >     struct {
> >       ...
> >       struct {
> >         __aligned_u64   syms;
> >         __aligned_u64   addrs;
> >         __aligned_u64   cookies;
> >         __u32           cnt;
> >         __u32           flags;
> >       } kprobe_multi;
> >     } link_create;
> >   }
> >
> > because from bpf user POV it's new link for attaching multiple kprobes
> > and I agree new 'fprobe' type name in here brings more confusion, using
> > kprobe_multi is straightforward
> >
> > thoguhts?
> 
> I think this makes sense. We do need new type of link to store ip ->
> cookie mapping anyways.
> 
> Is there any chance to support this fast multi-attach for uprobe? If
> yes, we might want to reuse the same link for both (so should we name
> it more generically? on the other hand BPF program type for uprobe is
> BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> consistent with what we have today).
> 
> But yeah, the main question is whether there is something preventing
> us from supporting multi-attach uprobe as well? It would be really
> great for USDT use case.

I need to check with uprobes, my understanding ends at perf/trace
code calling uprobe_register ;-)

maybe I should first try if uprobes suffer the same performance issue

I'll send another version with above interface, because there's
tons of other fixes, and by the time for next version we might
have answer for the interface change

jirka