Message ID | ZUKLnmYyHpthlMEE@krava (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [QUESTION] ftrace_test_recursion_trylock behaviour | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Wed, 1 Nov 2023 18:32:14 +0100 Jiri Olsa <olsajiri@gmail.com> wrote: > hi, > I'm doing some testing on top of fprobes and noticed that the > ftrace_test_recursion_trylock allows caller from the same context > going through twice. > > The change below adds extra fprobe on stack_trace_print, which is > called within the sample_entry_handler and I can see it being executed > with following trace output: > > <...>-457 [003] ...1. 32.352554: sample_entry_handler: > Enter <kernel_clone+0x0/0x380> ip = 0xffffffff81177420 <...>-457 > [003] ...2. 32.352578: sample_entry_handler_extra: Enter > <stack_trace_print+0x0/0x60> ip = 0xffffffff8127ae70 > > IOW nested ftrace_test_recursion_trylock call in the same context > succeeded. > > It seems the reason is the TRACE_CTX_TRANSITION bit logic. > > Just making sure it's intentional.. we have kprobe_multi code on top of > fprobe with another re-entry logic and that might behave differently based > on ftrace_test_recursion_trylock logic. Yes it's intentional, as it's a work around for an issue that may be cleared up now with Peter Zijlstra's noinstr updates. The use case for that TRACE_CTX_TRANSITION is when a function is traced just after an interrupt was triggered but before the preempt count was updated to let us know that we are in an interrupt context. Daniel Bristot reported a regression after the trylock was first introduced where the interrupt entry function was traced sometimes but not always. That's because if the interrupt happened normally, it would be traced, but if the interrupt happened when another event was being traced, the recursion logic would see that the trace of the interrupt was happening in the same context as the event it interrupted and drop the interrupt trace. But after the preempt count was updated, the other functions in the interrupt would be seen. This led to very confusing trace output. The solution to that was this workaround hack, where the trace recursion logic would allow a single recursion (the interrupt preempting another trace before it set preempt count). But with noinstr, there should be no more instances of this problem and we can drop that extra bit. But the last I checked, there were a few places that still could be traced without the preempt_count set. I'll have to re-investigate. -- Steve
On Wed, Nov 01, 2023 at 01:45:56PM -0400, Steven Rostedt wrote: > On Wed, 1 Nov 2023 18:32:14 +0100 > Jiri Olsa <olsajiri@gmail.com> wrote: > > > hi, > > I'm doing some testing on top of fprobes and noticed that the > > ftrace_test_recursion_trylock allows caller from the same context > > going through twice. > > > > The change below adds extra fprobe on stack_trace_print, which is > > called within the sample_entry_handler and I can see it being executed > > with following trace output: > > > > <...>-457 [003] ...1. 32.352554: sample_entry_handler: > > Enter <kernel_clone+0x0/0x380> ip = 0xffffffff81177420 <...>-457 > > [003] ...2. 32.352578: sample_entry_handler_extra: Enter > > <stack_trace_print+0x0/0x60> ip = 0xffffffff8127ae70 > > > > IOW nested ftrace_test_recursion_trylock call in the same context > > succeeded. > > > > It seems the reason is the TRACE_CTX_TRANSITION bit logic. > > > > Just making sure it's intentional.. we have kprobe_multi code on top of > > fprobe with another re-entry logic and that might behave differently based > > on ftrace_test_recursion_trylock logic. > > Yes it's intentional, as it's a work around for an issue that may be > cleared up now with Peter Zijlstra's noinstr updates. > > The use case for that TRACE_CTX_TRANSITION is when a function is traced > just after an interrupt was triggered but before the preempt count was > updated to let us know that we are in an interrupt context. > > Daniel Bristot reported a regression after the trylock was first introduced > where the interrupt entry function was traced sometimes but not always. > That's because if the interrupt happened normally, it would be traced, but > if the interrupt happened when another event was being traced, the recursion > logic would see that the trace of the interrupt was happening in the same > context as the event it interrupted and drop the interrupt trace. But after > the preempt count was updated, the other functions in the interrupt would be > seen. This led to very confusing trace output. > > The solution to that was this workaround hack, where the trace recursion > logic would allow a single recursion (the interrupt preempting another > trace before it set preempt count). > > But with noinstr, there should be no more instances of this problem and we > can drop that extra bit. But the last I checked, there were a few places > that still could be traced without the preempt_count set. I'll have to > re-investigate. I see, so I'll keep in mind that it could change in the future thanks, jirka
diff --git a/samples/fprobe/fprobe_example.c b/samples/fprobe/fprobe_example.c index 64e715e7ed11..531272af0682 100644 --- a/samples/fprobe/fprobe_example.c +++ b/samples/fprobe/fprobe_example.c @@ -23,6 +23,9 @@ static struct fprobe sample_probe; static unsigned long nhit; +static struct fprobe sample_probe_extra; +static char *symbol_extra = "stack_trace_print"; + static char symbol[MAX_SYMBOL_LEN] = "kernel_clone"; module_param_string(symbol, symbol, sizeof(symbol), 0644); MODULE_PARM_DESC(symbol, "Probed symbol(s), given by comma separated symbols or a wildcard pattern."); @@ -48,6 +51,15 @@ static void show_backtrace(void) stack_trace_print(stacks, len, 24); } +static int sample_entry_handler_extra(struct fprobe *fp, unsigned long ip, + unsigned long ret_ip, + struct pt_regs *regs, void *data) +{ + if (use_trace) + trace_printk("Enter <%pS> ip = 0x%p\n", (void *)ip, (void *)ip); + return 0; +} + static int sample_entry_handler(struct fprobe *fp, unsigned long ip, unsigned long ret_ip, struct pt_regs *regs, void *data) @@ -96,6 +108,8 @@ static int __init fprobe_init(void) sample_probe.entry_handler = sample_entry_handler; sample_probe.exit_handler = sample_exit_handler; + sample_probe_extra.entry_handler = sample_entry_handler_extra; + if (strchr(symbol, '*')) { /* filter based fprobe */ ret = register_fprobe(&sample_probe, symbol, @@ -137,12 +151,19 @@ static int __init fprobe_init(void) else pr_info("Planted fprobe at %s\n", symbol); + ret = register_fprobe_syms(&sample_probe_extra, (const char **) &symbol_extra, 1); + if (ret < 0) + pr_err("register_fprobe extra failed, returned %d\n", ret); + else + pr_info("Planted extra fprobe at %s\n", symbol_extra); + return ret; } static void __exit fprobe_exit(void) { unregister_fprobe(&sample_probe); + unregister_fprobe(&sample_probe_extra); pr_info("fprobe at %s unregistered. %ld times hit, %ld times missed\n", symbol, nhit, sample_probe.nmissed);