Message ID | 162400002631.506599.2413605639666466945.stgit@devnote2 (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | kprobes: Fix stacktrace with kretprobes on x86 | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Fri, Jun 18, 2021 at 04:07:06PM +0900, Masami Hiramatsu wrote: > @@ -549,7 +548,15 @@ bool unwind_next_frame(struct unwind_state *state) > (void *)orig_ip); > goto err; > } > - > + /* > + * There is a small chance to interrupt at the entry of > + * kretprobe_trampoline where the ORC info doesn't exist. > + * That point is right after the RET to kretprobe_trampoline > + * which was modified return address. So the @addr_p must > + * be right before the regs->sp. > + */ > + state->ip = unwind_recover_kretprobe(state, state->ip, > + (unsigned long *)(state->sp - sizeof(long))); > state->regs = (struct pt_regs *)sp; > state->prev_regs = NULL; > state->full_regs = true; > @@ -562,6 +569,9 @@ bool unwind_next_frame(struct unwind_state *state) > (void *)orig_ip); > goto err; > } > + /* See UNWIND_HINT_TYPE_REGS case comment. */ > + state->ip = unwind_recover_kretprobe(state, state->ip, > + (unsigned long *)(state->sp - sizeof(long))); > > if (state->full_regs) > state->prev_regs = state->regs; Why doesn't the ftrace case have this? That is, why aren't both return trampolines having the same general shape?
On Mon, 5 Jul 2021 13:36:14 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Fri, Jun 18, 2021 at 04:07:06PM +0900, Masami Hiramatsu wrote: > > @@ -549,7 +548,15 @@ bool unwind_next_frame(struct unwind_state *state) > > (void *)orig_ip); > > goto err; > > } > > - > > + /* > > + * There is a small chance to interrupt at the entry of > > + * kretprobe_trampoline where the ORC info doesn't exist. > > + * That point is right after the RET to kretprobe_trampoline > > + * which was modified return address. So the @addr_p must > > + * be right before the regs->sp. > > + */ > > + state->ip = unwind_recover_kretprobe(state, state->ip, > > + (unsigned long *)(state->sp - sizeof(long))); > > state->regs = (struct pt_regs *)sp; > > state->prev_regs = NULL; > > state->full_regs = true; > > @@ -562,6 +569,9 @@ bool unwind_next_frame(struct unwind_state *state) > > (void *)orig_ip); > > goto err; > > } > > + /* See UNWIND_HINT_TYPE_REGS case comment. */ > > + state->ip = unwind_recover_kretprobe(state, state->ip, > > + (unsigned long *)(state->sp - sizeof(long))); > > > > if (state->full_regs) > > state->prev_regs = state->regs; > > Why doesn't the ftrace case have this? That is, why aren't both return > trampolines having the same general shape? Ah, this strongly depends what the trampoline code does. For the kretprobe case, the PUSHQ at the entry of the kretprobe_trampoline() does not covered by UNWIND_HINT_FUNC. Thus it needs to find 'correct_ret_addr' by the frame pointer (which is next to the sp). "kretprobe_trampoline:\n" #ifdef CONFIG_X86_64 /* Push fake return address to tell the unwinder it's a kretprobe */ " pushq $kretprobe_trampoline\n" UNWIND_HINT_FUNC But I'm not so sure how ftrace treat it. It seems that the return_to_handler() doesn't care such case. (anyway, return_to_handler() does not return but jump to the original call-site, in that case, the information will be lost.) Thank you,
On Tue, Jul 06, 2021 at 12:42:57AM +0900, Masami Hiramatsu wrote: > On Mon, 5 Jul 2021 13:36:14 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > > > On Fri, Jun 18, 2021 at 04:07:06PM +0900, Masami Hiramatsu wrote: > > > @@ -549,7 +548,15 @@ bool unwind_next_frame(struct unwind_state *state) > > > (void *)orig_ip); > > > goto err; > > > } > > > - > > > + /* > > > + * There is a small chance to interrupt at the entry of > > > + * kretprobe_trampoline where the ORC info doesn't exist. > > > + * That point is right after the RET to kretprobe_trampoline > > > + * which was modified return address. So the @addr_p must > > > + * be right before the regs->sp. > > > + */ > > > + state->ip = unwind_recover_kretprobe(state, state->ip, > > > + (unsigned long *)(state->sp - sizeof(long))); > > > state->regs = (struct pt_regs *)sp; > > > state->prev_regs = NULL; > > > state->full_regs = true; > > > @@ -562,6 +569,9 @@ bool unwind_next_frame(struct unwind_state *state) > > > (void *)orig_ip); > > > goto err; > > > } > > > + /* See UNWIND_HINT_TYPE_REGS case comment. */ > > > + state->ip = unwind_recover_kretprobe(state, state->ip, > > > + (unsigned long *)(state->sp - sizeof(long))); > > > > > > if (state->full_regs) > > > state->prev_regs = state->regs; > > > > Why doesn't the ftrace case have this? That is, why aren't both return > > trampolines having the same general shape? > > Ah, this strongly depends what the trampoline code does. > For the kretprobe case, the PUSHQ at the entry of the kretprobe_trampoline() > does not covered by UNWIND_HINT_FUNC. Thus it needs to find 'correct_ret_addr' > by the frame pointer (which is next to the sp). > > "kretprobe_trampoline:\n" > #ifdef CONFIG_X86_64 > /* Push fake return address to tell the unwinder it's a kretprobe */ > " pushq $kretprobe_trampoline\n" > UNWIND_HINT_FUNC > > But I'm not so sure how ftrace treat it. It seems that the return_to_handler() > doesn't care such case. (anyway, return_to_handler() does not return but jump > to the original call-site, in that case, the information will be lost.) I find it bothersome (OCD, sorry :-) that both return trampolines behave differently. Doubly so because I know people (Steve in particular) have been talking about unifying them. Steve, can you clarify the ftrace side here? Afaict return_to_handler() is similarly affected.
On Tue, 6 Jul 2021 09:55:03 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > > But I'm not so sure how ftrace treat it. It seems that the return_to_handler() > > doesn't care such case. (anyway, return_to_handler() does not return but jump > > to the original call-site, in that case, the information will be lost.) > > I find it bothersome (OCD, sorry :-) that both return trampolines behave > differently. Doubly so because I know people (Steve in particular) have > been talking about unifying them. They were developed separately, and designed differently with different goals in mind. Yes, I want to unify them, but trying to get the different goals together, compounded by the fact that almost every arch also implemented them differently (in which case, we need to find a way to do it one arch at a time), makes the process extremely frustrating. > > Steve, can you clarify the ftrace side here? Afaict return_to_handler() > is similarly affected. I'm not exactly sure what the issue is. As Masami stated, kretprobe uses a ret to return to the calling function, but ftrace uses a jmp. kretprobe return tracing is more complex than the function graph return tracing is (which is one of the issues I need to overcome to unify them), and when the function graph return trampoline was created, it did things as simple as possible (and before ORC existed). Is this something to worry about now, or should we look to fix his in the unifying process? -- Steve
On Tue, Jul 06, 2021 at 11:11:36AM -0400, Steven Rostedt wrote: > On Tue, 6 Jul 2021 09:55:03 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > > > > But I'm not so sure how ftrace treat it. It seems that the return_to_handler() > > > doesn't care such case. (anyway, return_to_handler() does not return but jump > > > to the original call-site, in that case, the information will be lost.) > > > > I find it bothersome (OCD, sorry :-) that both return trampolines behave > > differently. Doubly so because I know people (Steve in particular) have > > been talking about unifying them. > > They were developed separately, and designed differently with different > goals in mind. Yes, I want to unify them, but trying to get the > different goals together, compounded by the fact that almost every arch > also implemented them differently (in which case, we need to find a way > to do it one arch at a time), makes the process extremely frustrating. Yeah.. that's going to be somewhat painful. > > Steve, can you clarify the ftrace side here? Afaict return_to_handler() > > is similarly affected. > > I'm not exactly sure what the issue is. As Masami stated, kretprobe > uses a ret to return to the calling function, but ftrace uses a jmp. I'll have to re-read the ftrace bits, but from the top of my head you cannot do an indirect jump and preserve all registers at the same time, so a return stub must use jump from stack aka. ret. > kretprobe return tracing is more complex than the function graph return > tracing is (which is one of the issues I need to overcome to unify > them), I'm not sure it is. IIRC the biggest pain point with kretprobe is that 'silly' property that the kretprobe_instance are not the same between kretprobes. Luckily, that's not actually used anywhere, so we can simply rip that out. That should also help Matt make the whole freelist thing faster, because now the kretprobe instances are global. > and when the function graph return trampoline was created, it > did things as simple as possible (and before ORC existed). > > Is this something to worry about now, or should we look to fix his in > the unifying process? There seems to be a lot of kretprobe activity now; so I figured we ought to at least consider where we want to go so we don't make it harder still.
On Wed, Jul 07, 2021 at 10:20:41AM +0200, Peter Zijlstra wrote: > > > Steve, can you clarify the ftrace side here? Afaict return_to_handler() > > > is similarly affected. > > > > I'm not exactly sure what the issue is. As Masami stated, kretprobe > > uses a ret to return to the calling function, but ftrace uses a jmp. > > I'll have to re-read the ftrace bits, but from the top of my head you > cannot do an indirect jump and preserve all registers at the same time, > so a return stub must use jump from stack aka. ret. Hmm... there's callee clobbered regs ofcourse, which don't need to be preserved. And that's exactly what ftrace seems to be doing, and I don't think there's any reason why kretprobe cannot do the same. Lemme try.
On Wed, 7 Jul 2021 10:20:41 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > > > Steve, can you clarify the ftrace side here? Afaict return_to_handler() > > > is similarly affected. > > > > I'm not exactly sure what the issue is. As Masami stated, kretprobe > > uses a ret to return to the calling function, but ftrace uses a jmp. > > I'll have to re-read the ftrace bits, but from the top of my head you > cannot do an indirect jump and preserve all registers at the same time, > so a return stub must use jump from stack aka. ret. > > > kretprobe return tracing is more complex than the function graph return > > tracing is (which is one of the issues I need to overcome to unify > > them), > > I'm not sure it is. IIRC the biggest pain point with kretprobe is that > 'silly' property that the kretprobe_instance are not the same between > kretprobes. Luckily, that's not actually used anywhere, so we can simply > rip that out. I actually don't want to keep this feature because no one use it. (only systemtap needs it?) Anyway, if we keep the idea-level compatibility (not code level), what we need is 'void *data' in the struct kretprobe_instance. User who needs it can allocate their own instance data for their kretprobes when initialising it and sets in their entry handler. Then we can have a simple kretprobe_instance. Thank you,
On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: > I actually don't want to keep this feature because no one use it. > (only systemtap needs it?) Yeah, you mentioned systemtap, but since that's out-of-tree I don't care. Their problem. > Anyway, if we keep the idea-level compatibility (not code level), > what we need is 'void *data' in the struct kretprobe_instance. > User who needs it can allocate their own instance data for their > kretprobes when initialising it and sets in their entry handler. > > Then we can have a simple kretprobe_instance. When would you do the alloc? When installing the retprobe, but that might be inside the allocator, which means you can't call the allocator etc.. :-) If we look at struct ftrace_ret_stack, it has a few fixed function fields. The calltime one is all that is needed for the kretprobe example code.
On Wed, 7 Jul 2021 12:20:57 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: > > > I actually don't want to keep this feature because no one use it. > > (only systemtap needs it?) > > Yeah, you mentioned systemtap, but since that's out-of-tree I don't > care. Their problem. > > > Anyway, if we keep the idea-level compatibility (not code level), > > what we need is 'void *data' in the struct kretprobe_instance. > > User who needs it can allocate their own instance data for their > > kretprobes when initialising it and sets in their entry handler. > > > > Then we can have a simple kretprobe_instance. > > When would you do the alloc? When installing the retprobe, but that > might be inside the allocator, which means you can't call the allocator > etc.. :-) Yes, so the user may need to allocate a pool right before register_kretprobe(). (whether per-kretprobe or per-task or global pool, that is user's choice.) > > If we look at struct ftrace_ret_stack, it has a few fixed function > fields. The calltime one is all that is needed for the kretprobe > example code. kretprobe consumes 3 fields, a pointer to 'struct kretprobe' (which stores callee function address in 'kretprobe::kp.addr'), a return address and a frame pointer (*). * note that this frame pointer might be used for fixing up the stack trace, but the fixup method depends on the architecture. Thank you,
On Wed, 7 Jul 2021 19:45:30 +0900 Masami Hiramatsu <mhiramat@kernel.org> wrote: > On Wed, 7 Jul 2021 12:20:57 +0200 > Peter Zijlstra <peterz@infradead.org> wrote: > > > On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: > > > > > I actually don't want to keep this feature because no one use it. > > > (only systemtap needs it?) > > > > Yeah, you mentioned systemtap, but since that's out-of-tree I don't > > care. Their problem. Yeah, maybe it is not hard to update. > > > > > Anyway, if we keep the idea-level compatibility (not code level), > > > what we need is 'void *data' in the struct kretprobe_instance. > > > User who needs it can allocate their own instance data for their > > > kretprobes when initialising it and sets in their entry handler. > > > > > > Then we can have a simple kretprobe_instance. > > > > When would you do the alloc? When installing the retprobe, but that > > might be inside the allocator, which means you can't call the allocator > > etc.. :-) > > Yes, so the user may need to allocate a pool right before register_kretprobe(). > (whether per-kretprobe or per-task or global pool, that is user's choice.) > > > > > If we look at struct ftrace_ret_stack, it has a few fixed function > > fields. The calltime one is all that is needed for the kretprobe > > example code. > > kretprobe consumes 3 fields, a pointer to 'struct kretprobe' (which > stores callee function address in 'kretprobe::kp.addr'), a return > address and a frame pointer (*). Oops, I forgot to add "void *data" for storing user data. Thank you,
On 2021/7/7 PM9:29, Masami Hiramatsu wrote: > On Wed, 7 Jul 2021 19:45:30 +0900 > Masami Hiramatsu <mhiramat@kernel.org> wrote: > >> On Wed, 7 Jul 2021 12:20:57 +0200 >> Peter Zijlstra <peterz@infradead.org> wrote: >> >>> On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: >>> >>>> I actually don't want to keep this feature because no one use it. >>>> (only systemtap needs it?) >>> >>> Yeah, you mentioned systemtap, but since that's out-of-tree I don't >>> care. Their problem. > > Yeah, maybe it is not hard to update. > >>> >>>> Anyway, if we keep the idea-level compatibility (not code level), >>>> what we need is 'void *data' in the struct kretprobe_instance. >>>> User who needs it can allocate their own instance data for their >>>> kretprobes when initialising it and sets in their entry handler. >>>> >>>> Then we can have a simple kretprobe_instance. >>> >>> When would you do the alloc? When installing the retprobe, but that >>> might be inside the allocator, which means you can't call the allocator >>> etc.. :-) >> >> Yes, so the user may need to allocate a pool right before register_kretprobe(). >> (whether per-kretprobe or per-task or global pool, that is user's choice.) >> >>> >>> If we look at struct ftrace_ret_stack, it has a few fixed function >>> fields. The calltime one is all that is needed for the kretprobe >>> example code. >> >> kretprobe consumes 3 fields, a pointer to 'struct kretprobe' (which >> stores callee function address in 'kretprobe::kp.addr'), a return >> address and a frame pointer (*). > > Oops, I forgot to add "void *data" for storing user data. > Should use "struct kretprobe_holder *rph", since "struct kretprobe" belongs to 3rd-party module (which might be unloaded any time). User's own pool might not work if the module can be unloaded. Better manage the pool in kretprobe_holder, which needs no changes from user side.
On Wed, 7 Jul 2021 22:42:47 +0800 Matt Wu <wuqiang.matt@bytedance.com> wrote: > On 2021/7/7 PM9:29, Masami Hiramatsu wrote: > > On Wed, 7 Jul 2021 19:45:30 +0900 > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > >> On Wed, 7 Jul 2021 12:20:57 +0200 > >> Peter Zijlstra <peterz@infradead.org> wrote: > >> > >>> On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: > >>> > >>>> I actually don't want to keep this feature because no one use it. > >>>> (only systemtap needs it?) > >>> > >>> Yeah, you mentioned systemtap, but since that's out-of-tree I don't > >>> care. Their problem. > > > > Yeah, maybe it is not hard to update. > > > >>> > >>>> Anyway, if we keep the idea-level compatibility (not code level), > >>>> what we need is 'void *data' in the struct kretprobe_instance. > >>>> User who needs it can allocate their own instance data for their > >>>> kretprobes when initialising it and sets in their entry handler. > >>>> > >>>> Then we can have a simple kretprobe_instance. > >>> > >>> When would you do the alloc? When installing the retprobe, but that > >>> might be inside the allocator, which means you can't call the allocator > >>> etc.. :-) > >> > >> Yes, so the user may need to allocate a pool right before register_kretprobe(). > >> (whether per-kretprobe or per-task or global pool, that is user's choice.) > >> > >>> > >>> If we look at struct ftrace_ret_stack, it has a few fixed function > >>> fields. The calltime one is all that is needed for the kretprobe > >>> example code. > >> > >> kretprobe consumes 3 fields, a pointer to 'struct kretprobe' (which > >> stores callee function address in 'kretprobe::kp.addr'), a return > >> address and a frame pointer (*). > > > Oops, I forgot to add "void *data" for storing user data. > > > > Should use "struct kretprobe_holder *rph", since "struct kretprobe" belongs > to 3rd-party module (which might be unloaded any time). Good catch. Yes, instead of 'struct kretprobe', we need to use the holder. > User's own pool might not work if the module can be unloaded. Better manage > the pool in kretprobe_holder, which needs no changes from user side. No, since the 'data' will be only refered from user handler. If the kretprobe is released, then the kretprobe_holder will clear the refernce to the 'struct kretprobe'. Then, the user handler is never called. No one access the 'data'. Thank you,
On 2021/7/11 PM10:09, Masami Hiramatsu wrote: > On Wed, 7 Jul 2021 22:42:47 +0800 > Matt Wu <wuqiang.matt@bytedance.com> wrote: > >> On 2021/7/7 PM9:29, Masami Hiramatsu wrote: >>> On Wed, 7 Jul 2021 19:45:30 +0900 >>> Masami Hiramatsu <mhiramat@kernel.org> wrote: >>> >>>> On Wed, 7 Jul 2021 12:20:57 +0200 >>>> Peter Zijlstra <peterz@infradead.org> wrote: >>>> >>>>> On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: >>>>> >>>>>> I actually don't want to keep this feature because no one use it. >>>>>> (only systemtap needs it?) >>>>> >>>>> Yeah, you mentioned systemtap, but since that's out-of-tree I don't >>>>> care. Their problem. >>> >>> Yeah, maybe it is not hard to update. >>> >>>>> >>>>>> Anyway, if we keep the idea-level compatibility (not code level), >>>>>> what we need is 'void *data' in the struct kretprobe_instance. >>>>>> User who needs it can allocate their own instance data for their >>>>>> kretprobes when initialising it and sets in their entry handler. >>>>>> >>>>>> Then we can have a simple kretprobe_instance. >>>>> >>>>> When would you do the alloc? When installing the retprobe, but that >>>>> might be inside the allocator, which means you can't call the allocator >>>>> etc.. :-) >>>> >>>> Yes, so the user may need to allocate a pool right before register_kretprobe(). >>>> (whether per-kretprobe or per-task or global pool, that is user's choice.) >>>> >>>>> >>>>> If we look at struct ftrace_ret_stack, it has a few fixed function >>>>> fields. The calltime one is all that is needed for the kretprobe >>>>> example code. >>>> >>>> kretprobe consumes 3 fields, a pointer to 'struct kretprobe' (which >>>> stores callee function address in 'kretprobe::kp.addr'), a return >>>> address and a frame pointer (*). >>> > Oops, I forgot to add "void *data" for storing user data. >>> >> >> Should use "struct kretprobe_holder *rph", since "struct kretprobe" belongs >> to 3rd-party module (which might be unloaded any time). > > Good catch. Yes, instead of 'struct kretprobe', we need to use the holder. > >> User's own pool might not work if the module can be unloaded. Better manage >> the pool in kretprobe_holder, which needs no changes from user side. > > No, since the 'data' will be only refered from user handler. If the kretprobe > is released, then the kretprobe_holder will clear the refernce to the 'struct > kretprobe'. Then, the user handler is never called. No one access the 'data'. Indeed, there is no race of "data" accessing, since unregister_kretprobes() is taking care of it. This implementation just increases the complexity of caller to keep track of all allocated instances and release them after unregistration. But guys are likely to use kmalloc in pre-handler and kfree in post-handler, which will lead to memory leaks.
On Sun, 11 Jul 2021 23:28:49 +0800 Matt Wu <wuqiang.matt@bytedance.com> wrote: > On 2021/7/11 PM10:09, Masami Hiramatsu wrote: > > On Wed, 7 Jul 2021 22:42:47 +0800 > > Matt Wu <wuqiang.matt@bytedance.com> wrote: > > > >> On 2021/7/7 PM9:29, Masami Hiramatsu wrote: > >>> On Wed, 7 Jul 2021 19:45:30 +0900 > >>> Masami Hiramatsu <mhiramat@kernel.org> wrote: > >>> > >>>> On Wed, 7 Jul 2021 12:20:57 +0200 > >>>> Peter Zijlstra <peterz@infradead.org> wrote: > >>>> > >>>>> On Wed, Jul 07, 2021 at 07:15:10PM +0900, Masami Hiramatsu wrote: > >>>>> > >>>>>> I actually don't want to keep this feature because no one use it. > >>>>>> (only systemtap needs it?) > >>>>> > >>>>> Yeah, you mentioned systemtap, but since that's out-of-tree I don't > >>>>> care. Their problem. > >>> > >>> Yeah, maybe it is not hard to update. > >>> > >>>>> > >>>>>> Anyway, if we keep the idea-level compatibility (not code level), > >>>>>> what we need is 'void *data' in the struct kretprobe_instance. > >>>>>> User who needs it can allocate their own instance data for their > >>>>>> kretprobes when initialising it and sets in their entry handler. > >>>>>> > >>>>>> Then we can have a simple kretprobe_instance. > >>>>> > >>>>> When would you do the alloc? When installing the retprobe, but that > >>>>> might be inside the allocator, which means you can't call the allocator > >>>>> etc.. :-) > >>>> > >>>> Yes, so the user may need to allocate a pool right before register_kretprobe(). > >>>> (whether per-kretprobe or per-task or global pool, that is user's choice.) > >>>> > >>>>> > >>>>> If we look at struct ftrace_ret_stack, it has a few fixed function > >>>>> fields. The calltime one is all that is needed for the kretprobe > >>>>> example code. > >>>> > >>>> kretprobe consumes 3 fields, a pointer to 'struct kretprobe' (which > >>>> stores callee function address in 'kretprobe::kp.addr'), a return > >>>> address and a frame pointer (*). > >>> > Oops, I forgot to add "void *data" for storing user data. > >>> > >> > >> Should use "struct kretprobe_holder *rph", since "struct kretprobe" belongs > >> to 3rd-party module (which might be unloaded any time). > > > > Good catch. Yes, instead of 'struct kretprobe', we need to use the holder. > > > >> User's own pool might not work if the module can be unloaded. Better manage > >> the pool in kretprobe_holder, which needs no changes from user side. > > > > No, since the 'data' will be only refered from user handler. If the kretprobe > > is released, then the kretprobe_holder will clear the refernce to the 'struct > > kretprobe'. Then, the user handler is never called. No one access the 'data'. > > Indeed, there is no race of "data" accessing, since unregister_kretprobes() > is taking care of it. > > This implementation just increases the complexity of caller to keep track > of all allocated instances and release them after unregistration. Yes, but user can manage it with an array of pointers (or directly allocate an array of their desired data). Not hard to track it in that case. > But guys are likely to use kmalloc in pre-handler and kfree in post-handler, > which will lead to memory leaks. I will note "do not allocate memory inside kprobe handler" on manual. I think that's all what we need. We cannot stop someone shooting their feet especially in the kernel... Thank you,
diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h index 70fc159ebe69..36d3971c0a2c 100644 --- a/arch/x86/include/asm/unwind.h +++ b/arch/x86/include/asm/unwind.h @@ -4,6 +4,7 @@ #include <linux/sched.h> #include <linux/ftrace.h> +#include <linux/kprobes.h> #include <asm/ptrace.h> #include <asm/stacktrace.h> @@ -15,6 +16,7 @@ struct unwind_state { unsigned long stack_mask; struct task_struct *task; int graph_idx; + struct llist_node *kr_cur; bool error; #if defined(CONFIG_UNWINDER_ORC) bool signal, full_regs; @@ -99,6 +101,27 @@ void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size, void *orc, size_t orc_size) {} #endif +static inline +unsigned long unwind_recover_kretprobe(struct unwind_state *state, + unsigned long addr, unsigned long *addr_p) +{ + return is_kretprobe_trampoline(addr) ? + kretprobe_find_ret_addr(state->task, addr_p, &state->kr_cur) : + addr; +} + +/* Recover the return address modified by instrumentation (e.g. kretprobe) */ +static inline +unsigned long unwind_recover_ret_addr(struct unwind_state *state, + unsigned long addr, unsigned long *addr_p) +{ + unsigned long ret; + + ret = ftrace_graph_ret_addr(state->task, &state->graph_idx, + addr, addr_p); + return unwind_recover_kretprobe(state, ret, addr_p); +} + /* * This disables KASAN checking when reading a value from another task's stack, * since the other task could be running on another CPU and could have poisoned diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c index d7c44b257f7f..8e1c50c86e5d 100644 --- a/arch/x86/kernel/unwind_frame.c +++ b/arch/x86/kernel/unwind_frame.c @@ -240,8 +240,7 @@ static bool update_stack_state(struct unwind_state *state, else { addr_p = unwind_get_return_address_ptr(state); addr = READ_ONCE_TASK_STACK(state->task, *addr_p); - state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx, - addr, addr_p); + state->ip = unwind_recover_ret_addr(state, addr, addr_p); } /* Save the original stack pointer for unwind_dump(): */ diff --git a/arch/x86/kernel/unwind_guess.c b/arch/x86/kernel/unwind_guess.c index c49f10ffd8cd..884d68a6e714 100644 --- a/arch/x86/kernel/unwind_guess.c +++ b/arch/x86/kernel/unwind_guess.c @@ -15,8 +15,7 @@ unsigned long unwind_get_return_address(struct unwind_state *state) addr = READ_ONCE_NOCHECK(*state->sp); - return ftrace_graph_ret_addr(state->task, &state->graph_idx, - addr, state->sp); + return unwind_recover_ret_addr(state, addr, state->sp); } EXPORT_SYMBOL_GPL(unwind_get_return_address); diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index a1202536fc57..ad6a9aece379 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -534,9 +534,8 @@ bool unwind_next_frame(struct unwind_state *state) if (!deref_stack_reg(state, ip_p, &state->ip)) goto err; - state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx, - state->ip, (void *)ip_p); - + state->ip = unwind_recover_ret_addr(state, state->ip, + (unsigned long *)ip_p); state->sp = sp; state->regs = NULL; state->prev_regs = NULL; @@ -549,7 +548,15 @@ bool unwind_next_frame(struct unwind_state *state) (void *)orig_ip); goto err; } - + /* + * There is a small chance to interrupt at the entry of + * kretprobe_trampoline where the ORC info doesn't exist. + * That point is right after the RET to kretprobe_trampoline + * which was modified return address. So the @addr_p must + * be right before the regs->sp. + */ + state->ip = unwind_recover_kretprobe(state, state->ip, + (unsigned long *)(state->sp - sizeof(long))); state->regs = (struct pt_regs *)sp; state->prev_regs = NULL; state->full_regs = true; @@ -562,6 +569,9 @@ bool unwind_next_frame(struct unwind_state *state) (void *)orig_ip); goto err; } + /* See UNWIND_HINT_TYPE_REGS case comment. */ + state->ip = unwind_recover_kretprobe(state, state->ip, + (unsigned long *)(state->sp - sizeof(long))); if (state->full_regs) state->prev_regs = state->regs;