Message ID | 3c175a7a1f7c2e08098f6d5e84dc247ce94846d2.1504244801.git.panand@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Pratyush, On 01/09/17 06:48, Pratyush Anand wrote: > do_task_stat() calls get_wchan(), which further does unbind_frame(). > unbind_frame() restores frame->pc to original value in case function > graph tracer has modified a return address (LR) in a stack frame to hook > a function return. However, if function graph tracer has hit a filtered > function, then we can't unwind it as ftrace_push_return_trace() has > biased the index(frame->graph) with a 'huge negative' > offset(-FTRACE_NOTRACE_DEPTH). > > Moreover, arm64 stack walker defines index(frame->graph) as unsigned > int, which can not compare a -ve number. > > Similar problem we can have with calling of walk_stackframe() from > save_stack_trace_tsk() or dump_backtrace(). > > This patch fixes unwind_frame() to test the index for -ve value and > restore index accordingly before we can restore frame->pc. I've just spotted arm64's profile_pc, which does this: From arch/arm64/kernel/time.c:profile_pc(): > #ifdef CONFIG_FUNCTION_GRAPH_TRACER > frame.graph = -1; /* no task info */ > #endif Is this another elaborate way of hitting this problem? I guess the options are skip any return-address restore in the unwinder if frame.graph is -1. (and profile_pc may have a bug here). Or, put current->curr_ret_stack in there. profile_pc() always passes tsk=NULL, so the unwinder assumes its current... kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is updated by handle_IPI ... so it looks like this should always be current... Thanks, James > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index 09d37d66b630..4c47147d0554 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -75,6 +75,9 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) > #ifdef CONFIG_FUNCTION_GRAPH_TRACER > if (tsk->ret_stack && > (frame->pc == (unsigned long)return_to_handler)) { > + if (frame->graph < 0) > + frame->graph += FTRACE_NOTRACE_DEPTH; > + > /* > * This is a case where function graph tracer has > * modified a return address (LR) in a stack frame >
On Tue, Sep 12, 2017 at 10:54:28AM +0100, James Morse wrote: > Hi Pratyush, > > On 01/09/17 06:48, Pratyush Anand wrote: > > do_task_stat() calls get_wchan(), which further does unbind_frame(). > > unbind_frame() restores frame->pc to original value in case function > > graph tracer has modified a return address (LR) in a stack frame to hook > > a function return. However, if function graph tracer has hit a filtered > > function, then we can't unwind it as ftrace_push_return_trace() has > > biased the index(frame->graph) with a 'huge negative' > > offset(-FTRACE_NOTRACE_DEPTH). > > > > Moreover, arm64 stack walker defines index(frame->graph) as unsigned > > int, which can not compare a -ve number. > > > > Similar problem we can have with calling of walk_stackframe() from > > save_stack_trace_tsk() or dump_backtrace(). > > > > This patch fixes unwind_frame() to test the index for -ve value and > > restore index accordingly before we can restore frame->pc. > > I've just spotted arm64's profile_pc, which does this: > From arch/arm64/kernel/time.c:profile_pc(): > > #ifdef CONFIG_FUNCTION_GRAPH_TRACER > > frame.graph = -1; /* no task info */ > > #endif > > Is this another elaborate way of hitting this problem? > > I guess the options are skip any return-address restore in the unwinder if > frame.graph is -1. (and profile_pc may have a bug here). Or, put > current->curr_ret_stack in there. > > profile_pc() always passes tsk=NULL, so the unwinder assumes its current... > kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is > updated by handle_IPI ... so it looks like this should always be current... Hmmm... is profile_pc the *only* case where frame->graph isn't equal to tsk->curr_ret_stack in unwind_frame? If so, maybe unwind_frame should just use that, and we could kill the graph member of struct stackframe completely? Will
On Wednesday 13 September 2017 08:12 AM, Will Deacon wrote: > On Tue, Sep 12, 2017 at 10:54:28AM +0100, James Morse wrote: >> Hi Pratyush, >> >> On 01/09/17 06:48, Pratyush Anand wrote: >>> do_task_stat() calls get_wchan(), which further does unbind_frame(). >>> unbind_frame() restores frame->pc to original value in case function >>> graph tracer has modified a return address (LR) in a stack frame to hook >>> a function return. However, if function graph tracer has hit a filtered >>> function, then we can't unwind it as ftrace_push_return_trace() has >>> biased the index(frame->graph) with a 'huge negative' >>> offset(-FTRACE_NOTRACE_DEPTH). >>> >>> Moreover, arm64 stack walker defines index(frame->graph) as unsigned >>> int, which can not compare a -ve number. >>> >>> Similar problem we can have with calling of walk_stackframe() from >>> save_stack_trace_tsk() or dump_backtrace(). >>> >>> This patch fixes unwind_frame() to test the index for -ve value and >>> restore index accordingly before we can restore frame->pc. >> >> I've just spotted arm64's profile_pc, which does this: >> From arch/arm64/kernel/time.c:profile_pc(): >>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER >>> frame.graph = -1; /* no task info */ >>> #endif >> >> Is this another elaborate way of hitting this problem? >> >> I guess the options are skip any return-address restore in the unwinder if >> frame.graph is -1. (and profile_pc may have a bug here). Or, put >> current->curr_ret_stack in there. I think we should go with latter, ie assign frame.graph = current->curr_ret_stack in profile_pc(). >> >> profile_pc() always passes tsk=NULL, so the unwinder assumes its current... >> kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is >> updated by handle_IPI ... so it looks like this should always be current... > > Hmmm... is profile_pc the *only* case where frame->graph isn't equal to > tsk->curr_ret_stack in unwind_frame? If so, maybe unwind_frame should just Yes, it is the only place. > use that, and we could kill the graph member of struct stackframe completely? > Humm, not sure, we initialize frame->graph out of the while loop in unwind_frame()'s caller and then keep in decrementing it in looped function.
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h index 5b6eafccc5d8..816db5b9b874 100644 --- a/arch/arm64/include/asm/stacktrace.h +++ b/arch/arm64/include/asm/stacktrace.h @@ -23,7 +23,7 @@ struct stackframe { unsigned long sp; unsigned long pc; #ifdef CONFIG_FUNCTION_GRAPH_TRACER - unsigned int graph; + int graph; #endif }; diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index 09d37d66b630..4c47147d0554 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -75,6 +75,9 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) #ifdef CONFIG_FUNCTION_GRAPH_TRACER if (tsk->ret_stack && (frame->pc == (unsigned long)return_to_handler)) { + if (frame->graph < 0) + frame->graph += FTRACE_NOTRACE_DEPTH; + /* * This is a case where function graph tracer has * modified a return address (LR) in a stack frame
do_task_stat() calls get_wchan(), which further does unbind_frame(). unbind_frame() restores frame->pc to original value in case function graph tracer has modified a return address (LR) in a stack frame to hook a function return. However, if function graph tracer has hit a filtered function, then we can't unwind it as ftrace_push_return_trace() has biased the index(frame->graph) with a 'huge negative' offset(-FTRACE_NOTRACE_DEPTH). Moreover, arm64 stack walker defines index(frame->graph) as unsigned int, which can not compare a -ve number. Similar problem we can have with calling of walk_stackframe() from save_stack_trace_tsk() or dump_backtrace(). This patch fixes unwind_frame() to test the index for -ve value and restore index accordingly before we can restore frame->pc. Reproducer: cd /sys/kernel/debug/tracing/ echo schedule > set_graph_notrace echo 1 > options/display-graph echo wakeup > current_tracer ps -ef | grep -i agent Above commands result in: Unable to handle kernel paging request at virtual address ffff801bd3d1e000 pgd = ffff8003cbe97c00 [ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000 Internal error: Oops: 96000006 [#1] SMP [...] CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33 [...] task: ffff8003c21ba000 task.stack: ffff8003cc6c0000 PC is at unwind_frame+0x12c/0x180 LR is at get_wchan+0xd4/0x134 pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145 sp : ffff8003cc6c3ab0 x29: ffff8003cc6c3ab0 x28: 0000000000000001 x27: 0000000000000026 x26: 0000000000000026 x25: 00000000000012d8 x24: 0000000000000000 x23: ffff8003c1c04000 x22: ffff000008c83000 x21: ffff8003c1c00000 x20: 000000000000000f x19: ffff8003c1bc0000 x18: 0000fffffc593690 x17: 0000000000000000 x16: 0000000000000001 x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f x13: 0000000000000001 x12: 0000000000000000 x11: 00000000e8f4883e x10: 0000000154f47ec8 x9 : 0000000070f367c0 x8 : 0000000000000000 x7 : 00008003f7290000 x6 : 0000000000000018 x5 : 0000000000000000 x4 : ffff8003c1c03cb0 x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000 x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000 Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000) Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000) [...] [<ffff00000808892c>] unwind_frame+0x12c/0x180 [<ffff000008305008>] do_task_stat+0x864/0x870 [<ffff000008305c44>] proc_tgid_stat+0x3c/0x48 [<ffff0000082fde0c>] proc_single_show+0x5c/0xb8 [<ffff0000082b27e0>] seq_read+0x160/0x414 [<ffff000008289e6c>] __vfs_read+0x58/0x164 [<ffff00000828b164>] vfs_read+0x88/0x144 [<ffff00000828c2e8>] SyS_read+0x60/0xc0 [<ffff0000080834a0>] __sys_trace_return+0x0/0x4 fixes: 20380bb390a4 (arm64: ftrace: fix a stack tracer's output under function graph tracer) Signed-off-by: Pratyush Anand <panand@redhat.com> --- v1 -> v2: - improved commit log - now index is restored and thereafter frame->pc as well. arch/arm64/include/asm/stacktrace.h | 2 +- arch/arm64/kernel/stacktrace.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)