Message ID | 20220104080943.113249-9-jolsa@kernel.org (mailing list archive)
---|---
State | RFC
Delegated to: | BPF
Series | kprobe/bpf: Add support to attach multiple kprobes
Context | Check | Description |
---|---|---|
bpf/vmtest-bpf-next | fail | VM_Test |
bpf/vmtest-bpf-next-PR | fail | PR summary |
netdev/tree_selection | success | Not a local patch, async |
On Tue, Jan 4, 2022 at 12:10 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Adding new link type BPF_LINK_TYPE_KPROBE to attach kprobes
> directly through register_kprobe/kretprobe API.
>
> Adding new attach type BPF_TRACE_RAW_KPROBE that enables
> such link for kprobe program.
>
> The new link allows to create multiple kprobes link by using
> new link_create interface:
>
>         struct {
>                 __aligned_u64   addrs;
>                 __u32           cnt;
>                 __u64           bpf_cookie;

I'm afraid bpf_cookie has to be different for each addr, otherwise
it's severely limiting. So it would be an array of cookies alongside
an array of addresses

>         } kprobe;
>
> Plus new flag BPF_F_KPROBE_RETURN for link_create.flags to
> create return probe.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/bpf_types.h      |   1 +
>  include/uapi/linux/bpf.h       |  12 +++
>  kernel/bpf/syscall.c           | 191 ++++++++++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h |  12 +++
>  4 files changed, 211 insertions(+), 5 deletions(-)
>

[...]

> @@ -1111,6 +1113,11 @@ enum bpf_link_type {
>   */
>  #define BPF_F_SLEEPABLE         (1U << 4)
>
> +/* link_create flags used in LINK_CREATE command for BPF_TRACE_RAW_KPROBE
> + * attach type.
> + */
> +#define BPF_F_KPROBE_RETURN     (1U << 0)
> +

we have plenty of flexibility to have per-link type fields, so why not
add `bool is_retprobe` next to addrs and cnt?

>  /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
>   * the following extensions:
>   *
> @@ -1465,6 +1472,11 @@ union bpf_attr {
>                          */
>                         __u64           bpf_cookie;
>                 } perf_event;
> +               struct {
> +                       __aligned_u64   addrs;
> +                       __u32           cnt;
> +                       __u64           bpf_cookie;
> +               } kprobe;
>         };
> } link_create;

[...]
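For illustration, the per-address cookie suggestion above might end up looking something like the sketch below in the link_create union. This is a sketch only; the cookies field name and layout are assumptions, not part of this RFC.

```c
/* Hypothetical layout, not part of this patch: userspace passes two
 * parallel arrays of length cnt, so each attached address gets its own
 * bpf_cookie value.
 */
struct {
	__aligned_u64	addrs;		/* user pointer to cnt u64 kernel addresses */
	__aligned_u64	cookies;	/* user pointer to cnt u64 cookies, one per addr */
	__u32		cnt;		/* number of entries in both arrays */
} kprobe;
```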
On Wed, Jan 05, 2022 at 08:30:56PM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 4, 2022 at 12:10 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding new link type BPF_LINK_TYPE_KPROBE to attach kprobes
> > directly through register_kprobe/kretprobe API.
> >
[...]
> >         struct {
> >                 __aligned_u64   addrs;
> >                 __u32           cnt;
> >                 __u64           bpf_cookie;
>
> I'm afraid bpf_cookie has to be different for each addr, otherwise
> it's severely limiting. So it would be an array of cookies alongside
> an array of addresses

ok

> >         } kprobe;
> >
[...]
> > +/* link_create flags used in LINK_CREATE command for BPF_TRACE_RAW_KPROBE
> > + * attach type.
> > + */
> > +#define BPF_F_KPROBE_RETURN     (1U << 0)
> > +
>
> we have plenty of flexibility to have per-link type fields, so why not
> add `bool is_retprobe` next to addrs and cnt?

well I thought if I do that, people would suggest to use the empty
flags field instead ;-)

we can move it there as you suggest, but I wonder it's good idea to
use bool in uapi headers, because the bool size definition is vague

jirka

[...]
On Thu, Jan 6, 2022 at 12:41 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Jan 05, 2022 at 08:30:56PM -0800, Andrii Nakryiko wrote:
> > On Tue, Jan 4, 2022 at 12:10 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
[...]
> > > +/* link_create flags used in LINK_CREATE command for BPF_TRACE_RAW_KPROBE
> > > + * attach type.
> > > + */
> > > +#define BPF_F_KPROBE_RETURN     (1U << 0)
> > > +
> >
> > we have plenty of flexibility to have per-link type fields, so why not
> > add `bool is_retprobe` next to addrs and cnt?
>
> well I thought if I do that, people would suggest to use the empty
> flags field instead ;-)
>
> we can move it there as you suggest, but I wonder it's good idea to
> use bool in uapi headers, because the bool size definition is vague

Good point. No 'bool' please.
grep bool include/uapi/linux/
Only gives openvswitch.h and it's guarded by ifdef KERNEL
So not a single uapi header has bool in it.
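The size concern is real: unlike the fixed-width __u8/__u32 types, _Bool has no guaranteed size in the C standard, so it cannot define a stable uapi layout. A quick userspace illustration, unrelated to the patch itself:

```c
/* Illustrative only: sizeof(_Bool) is implementation-defined (commonly 1 on
 * x86-64 with GCC/Clang), which is why uapi headers stick to __u8, __u32, etc.
 */
#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	printf("sizeof(bool) = %zu\n", sizeof(bool));
	return 0;
}
```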
On Thu, Jan 6, 2022 at 8:32 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jan 6, 2022 at 12:41 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Jan 05, 2022 at 08:30:56PM -0800, Andrii Nakryiko wrote:
> > > On Tue, Jan 4, 2022 at 12:10 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
[...]
> > > we have plenty of flexibility to have per-link type fields, so why not
> > > add `bool is_retprobe` next to addrs and cnt?
> >
> > well I thought if I do that, people would suggest to use the empty
> > flags field instead ;-)
> >
> > we can move it there as you suggest, but I wonder it's good idea to
> > use bool in uapi headers, because the bool size definition is vague
>
> Good point. No 'bool' please.
> grep bool include/uapi/linux/
> Only gives openvswitch.h and it's guarded by ifdef KERNEL
> So not a single uapi header has bool in it.

Yeah, I don't insist on bool specifically. But I was trying to avoid
link_create.flags to become map_flags where we have various flags, each
taking a separate bit, but most flags don't apply to most map types.

Ideally link_create.flags would have few flags that apply to all or most
link types (i.e., something orthogonal to a specific link type), and for
this case we could have kprobe-specific flags (link_create.kprobe.flags)
to adjust kprobe link creation behavior. But I don't know, maybe I'm
overthinking this.
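A kprobe-scoped flags field, as floated above, might look like the sketch below. Field names are assumptions; this is not what the RFC currently proposes.

```c
/* Hypothetical alternative to BPF_F_KPROBE_RETURN in the generic
 * link_create.flags: keep kprobe-link-specific bits next to addrs/cnt.
 */
struct {
	__aligned_u64	addrs;
	__u32		cnt;
	__u32		flags;		/* e.g. a "return probe" bit, scoped to this link type */
} kprobe;
```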
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 48a91c51c015..a9000feab34e 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -140,3 +140,4 @@ BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
 #ifdef CONFIG_PERF_EVENTS
 BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
 #endif
+BPF_LINK_TYPE(BPF_LINK_TYPE_KPROBE, kprobe)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b0383d371b9a..5216b333c688 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -995,6 +995,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT,
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
+	BPF_TRACE_RAW_KPROBE,
 	__MAX_BPF_ATTACH_TYPE
 };

@@ -1009,6 +1010,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_NETNS = 5,
 	BPF_LINK_TYPE_XDP = 6,
 	BPF_LINK_TYPE_PERF_EVENT = 7,
+	BPF_LINK_TYPE_KPROBE = 8,

 	MAX_BPF_LINK_TYPE,
 };

@@ -1111,6 +1113,11 @@ enum bpf_link_type {
  */
 #define BPF_F_SLEEPABLE		(1U << 4)

+/* link_create flags used in LINK_CREATE command for BPF_TRACE_RAW_KPROBE
+ * attach type.
+ */
+#define BPF_F_KPROBE_RETURN	(1U << 0)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
@@ -1465,6 +1472,11 @@ union bpf_attr {
 			 */
 			__u64		bpf_cookie;
 		} perf_event;
+		struct {
+			__aligned_u64	addrs;
+			__u32		cnt;
+			__u64		bpf_cookie;
+		} kprobe;
 	};
 } link_create;

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index fa4505f9b611..7e5be63c820f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -32,6 +32,7 @@
 #include <linux/bpf-netns.h>
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
+#include <linux/kprobes.h>

 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -3014,8 +3015,178 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
 	fput(perf_file);
 	return err;
 }
+#else
+static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	return -ENOTSUPP;
+}
 #endif /* CONFIG_PERF_EVENTS */

+#ifdef CONFIG_HAVE_KPROBES_MULTI_ON_FTRACE
+
+struct bpf_kprobe_link {
+	struct bpf_link link;
+	struct kretprobe rp;
+	bool is_return;
+	kprobe_opcode_t **addrs;
+	u32 cnt;
+	u64 bpf_cookie;
+};
+
+static void bpf_kprobe_link_release(struct bpf_link *link)
+{
+	struct bpf_kprobe_link *kprobe_link;
+
+	kprobe_link = container_of(link, struct bpf_kprobe_link, link);
+
+	if (kprobe_link->is_return)
+		unregister_kretprobe(&kprobe_link->rp);
+	else
+		unregister_kprobe(&kprobe_link->rp.kp);
+}
+
+static void bpf_kprobe_link_dealloc(struct bpf_link *link)
+{
+	struct bpf_kprobe_link *kprobe_link;
+
+	kprobe_link = container_of(link, struct bpf_kprobe_link, link);
+	kfree(kprobe_link->addrs);
+	kfree(kprobe_link);
+}
+
+static const struct bpf_link_ops bpf_kprobe_link_lops = {
+	.release = bpf_kprobe_link_release,
+	.dealloc = bpf_kprobe_link_dealloc,
+};
+
+static int kprobe_link_prog_run(struct bpf_kprobe_link *kprobe_link,
+				struct pt_regs *regs)
+{
+	struct bpf_trace_run_ctx run_ctx;
+	struct bpf_run_ctx *old_run_ctx;
+	int err;
+
+	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
+		err = 0;
+		goto out;
+	}
+
+	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
+	run_ctx.bpf_cookie = kprobe_link->bpf_cookie;
+
+	rcu_read_lock();
+	migrate_disable();
+	err = bpf_prog_run(kprobe_link->link.prog, regs);
+	migrate_enable();
+	rcu_read_unlock();
+
+	bpf_reset_run_ctx(old_run_ctx);
+
+ out:
+	__this_cpu_dec(bpf_prog_active);
+	return err;
+}
+
+static int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs)
+{
+	struct bpf_kprobe_link *kprobe_link;
+
+	kprobe_link = container_of(kp, struct bpf_kprobe_link, rp.kp);
+	return kprobe_link_prog_run(kprobe_link, regs);
+}
+
+static int
+kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
+{
+	struct kretprobe *rp = get_kretprobe(ri);
+	struct bpf_kprobe_link *kprobe_link;
+
+	kprobe_link = container_of(rp, struct bpf_kprobe_link, rp);
+	return kprobe_link_prog_run(kprobe_link, regs);
+}
+
+static int bpf_kprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	struct bpf_link_primer link_primer;
+	struct bpf_kprobe_link *link;
+	kprobe_opcode_t **addrs;
+	u32 flags, cnt, size;
+	void __user *uaddrs;
+	u64 **tmp;
+	int err;
+
+	flags = attr->link_create.flags;
+	if (flags & ~BPF_F_KPROBE_RETURN)
+		return -EINVAL;
+
+	uaddrs = u64_to_user_ptr(attr->link_create.kprobe.addrs);
+	cnt = attr->link_create.kprobe.cnt;
+	size = cnt * sizeof(*tmp);
+
+	tmp = kzalloc(size, GFP_KERNEL);
+	if (!tmp)
+		return -ENOMEM;
+
+	if (copy_from_user(tmp, uaddrs, size)) {
+		err = -EFAULT;
+		goto error;
+	}
+
+	/* TODO add extra copy for 32bit archs */
+	if (sizeof(u64) != sizeof(void *))
+		return -EINVAL;
+
+	addrs = (kprobe_opcode_t **) tmp;
+
+	link = kzalloc(sizeof(*link), GFP_KERNEL);
+	if (!link)
+		return -ENOMEM;
+
+	bpf_link_init(&link->link, BPF_LINK_TYPE_KPROBE, &bpf_kprobe_link_lops, prog);
+
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err) {
+		kfree(link);
+		goto error;
+	}
+
+	INIT_HLIST_NODE(&link->rp.kp.hlist);
+	INIT_LIST_HEAD(&link->rp.kp.list);
+	link->is_return = flags & BPF_F_KPROBE_RETURN;
+	link->addrs = addrs;
+	link->cnt = cnt;
+	link->bpf_cookie = attr->link_create.kprobe.bpf_cookie;
+
+	link->rp.kp.multi.addrs = addrs;
+	link->rp.kp.multi.cnt = cnt;
+
+	if (link->is_return)
+		link->rp.handler = kretprobe_dispatcher;
+	else
+		link->rp.kp.pre_handler = kprobe_dispatcher;
+
+	if (link->is_return)
+		err = register_kretprobe(&link->rp);
+	else
+		err = register_kprobe(&link->rp.kp);
+
+	if (err) {
+		bpf_link_cleanup(&link_primer);
+		return err;
+	}
+	return bpf_link_settle(&link_primer);
+
+error:
+	kfree(tmp);
+	return err;
+}
+#else /* !CONFIG_HAVE_KPROBES_MULTI_ON_FTRACE */
+static int bpf_kprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	return -ENOTSUPP;
+}
+#endif
+
 #define BPF_RAW_TRACEPOINT_OPEN_LAST_FIELD raw_tracepoint.prog_fd

 static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
@@ -4242,7 +4413,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
 	return -EINVAL;
 }

-#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
+#define BPF_LINK_CREATE_LAST_FIELD link_create.kprobe.bpf_cookie
 static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type ptype;
@@ -4266,7 +4437,6 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 		ret = tracing_bpf_link_attach(attr, uattr, prog);
 		goto out;
 	case BPF_PROG_TYPE_PERF_EVENT:
-	case BPF_PROG_TYPE_KPROBE:
 	case BPF_PROG_TYPE_TRACEPOINT:
 		if (attr->link_create.attach_type != BPF_PERF_EVENT) {
 			ret = -EINVAL;
@@ -4274,6 +4444,14 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 		}
 		ptype = prog->type;
 		break;
+	case BPF_PROG_TYPE_KPROBE:
+		if (attr->link_create.attach_type != BPF_PERF_EVENT &&
+		    attr->link_create.attach_type != BPF_TRACE_RAW_KPROBE) {
+			ret = -EINVAL;
+			goto out;
+		}
+		ptype = prog->type;
+		break;
 	default:
 		ptype = attach_type_to_prog_type(attr->link_create.attach_type);
 		if (ptype == BPF_PROG_TYPE_UNSPEC || ptype != prog->type) {
@@ -4305,13 +4483,16 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 		ret = bpf_xdp_link_attach(attr, prog);
 		break;
 #endif
-#ifdef CONFIG_PERF_EVENTS
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_TRACEPOINT:
-	case BPF_PROG_TYPE_KPROBE:
 		ret = bpf_perf_link_attach(attr, prog);
 		break;
-#endif
+	case BPF_PROG_TYPE_KPROBE:
+		if (attr->link_create.attach_type == BPF_PERF_EVENT)
+			ret = bpf_perf_link_attach(attr, prog);
+		else
+			ret = bpf_kprobe_link_attach(attr, prog);
+		break;
 	default:
 		ret = -EINVAL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b0383d371b9a..5216b333c688 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -995,6 +995,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT,
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
+	BPF_TRACE_RAW_KPROBE,
 	__MAX_BPF_ATTACH_TYPE
 };

@@ -1009,6 +1010,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_NETNS = 5,
 	BPF_LINK_TYPE_XDP = 6,
 	BPF_LINK_TYPE_PERF_EVENT = 7,
+	BPF_LINK_TYPE_KPROBE = 8,

 	MAX_BPF_LINK_TYPE,
 };

@@ -1111,6 +1113,11 @@ enum bpf_link_type {
  */
 #define BPF_F_SLEEPABLE		(1U << 4)

+/* link_create flags used in LINK_CREATE command for BPF_TRACE_RAW_KPROBE
+ * attach type.
+ */
+#define BPF_F_KPROBE_RETURN	(1U << 0)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
@@ -1465,6 +1472,11 @@ union bpf_attr {
 			 */
 			__u64		bpf_cookie;
 		} perf_event;
+		struct {
+			__aligned_u64	addrs;
+			__u32		cnt;
+			__u64		bpf_cookie;
+		} kprobe;
 	};
 } link_create;
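As a rough usage sketch (not part of the patch, and assuming the patched include/uapi/linux/bpf.h so that BPF_TRACE_RAW_KPROBE, BPF_F_KPROBE_RETURN and link_create.kprobe exist), userspace could exercise the new link through the raw bpf(2) syscall roughly as below. kprobe_link_create is a hypothetical helper; prog_fd is assumed to be a loaded BPF_PROG_TYPE_KPROBE program and addrs already-resolved kernel addresses (e.g. from /proc/kallsyms).

```c
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* Attach one kprobe program to several kernel addresses via the new
 * BPF_TRACE_RAW_KPROBE link.  Returns a link FD on success, -1 on error.
 */
static int kprobe_link_create(int prog_fd, const __u64 *addrs, __u32 cnt,
			      __u64 cookie, int retprobe)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.link_create.prog_fd = prog_fd;
	attr.link_create.attach_type = BPF_TRACE_RAW_KPROBE;
	attr.link_create.flags = retprobe ? BPF_F_KPROBE_RETURN : 0;
	attr.link_create.kprobe.addrs = (__u64)(unsigned long)addrs;
	attr.link_create.kprobe.cnt = cnt;
	attr.link_create.kprobe.bpf_cookie = cookie;

	return (int)syscall(__NR_bpf, BPF_LINK_CREATE, &attr, sizeof(attr));
}
```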
Adding new link type BPF_LINK_TYPE_KPROBE to attach kprobes
directly through register_kprobe/kretprobe API.

Adding new attach type BPF_TRACE_RAW_KPROBE that enables
such link for kprobe program.

The new link allows to create multiple kprobes link by using
new link_create interface:

        struct {
                __aligned_u64   addrs;
                __u32           cnt;
                __u64           bpf_cookie;
        } kprobe;

Plus new flag BPF_F_KPROBE_RETURN for link_create.flags to
create return probe.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf_types.h      |   1 +
 include/uapi/linux/bpf.h       |  12 +++
 kernel/bpf/syscall.c           | 191 ++++++++++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h |  12 +++
 4 files changed, 211 insertions(+), 5 deletions(-)