diff mbox

[v2] arm64: fix unwind_frame() for filtered out fn for function graph tracing

Message ID 3c175a7a1f7c2e08098f6d5e84dc247ce94846d2.1504244801.git.panand@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Pratyush Anand Sept. 1, 2017, 5:48 a.m. UTC
do_task_stat() calls get_wchan(), which further does unbind_frame().
unbind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).

Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.

Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().

This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.

Reproducer:

cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent

Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000

Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4

fixes: 20380bb390a4 (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
---
v1 -> v2:
	- improved commit log
	- now index is restored and thereafter frame->pc as well.

 arch/arm64/include/asm/stacktrace.h | 2 +-
 arch/arm64/kernel/stacktrace.c      | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

Comments

James Morse Sept. 12, 2017, 9:54 a.m. UTC | #1
Hi Pratyush,

On 01/09/17 06:48, Pratyush Anand wrote:
> do_task_stat() calls get_wchan(), which further does unbind_frame().
> unbind_frame() restores frame->pc to original value in case function
> graph tracer has modified a return address (LR) in a stack frame to hook
> a function return. However, if function graph tracer has hit a filtered
> function, then we can't unwind it as ftrace_push_return_trace() has
> biased the index(frame->graph) with a 'huge negative'
> offset(-FTRACE_NOTRACE_DEPTH).
> 
> Moreover, arm64 stack walker defines index(frame->graph) as unsigned
> int, which can not compare a -ve number.
> 
> Similar problem we can have with calling of walk_stackframe() from
> save_stack_trace_tsk() or dump_backtrace().
> 
> This patch fixes unwind_frame() to test the index for -ve value and
> restore index accordingly before we can restore frame->pc.

I've just spotted arm64's profile_pc, which does this:
From arch/arm64/kernel/time.c:profile_pc():
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> 	frame.graph = -1; /* no task info */
> #endif

Is this another elaborate way of hitting this problem?

I guess the options are skip any return-address restore in the unwinder if
frame.graph is -1. (and profile_pc may have a bug here). Or, put
current->curr_ret_stack in there.

profile_pc() always passes tsk=NULL, so the unwinder assumes its current...
kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is
updated by handle_IPI ... so it looks like this should always be current...


Thanks,

James


> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 09d37d66b630..4c47147d0554 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -75,6 +75,9 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  	if (tsk->ret_stack &&
>  			(frame->pc == (unsigned long)return_to_handler)) {
> +		if (frame->graph < 0)
> +			frame->graph += FTRACE_NOTRACE_DEPTH;
> +
>  		/*
>  		 * This is a case where function graph tracer has
>  		 * modified a return address (LR) in a stack frame
>
Will Deacon Sept. 13, 2017, 2:42 a.m. UTC | #2
On Tue, Sep 12, 2017 at 10:54:28AM +0100, James Morse wrote:
> Hi Pratyush,
> 
> On 01/09/17 06:48, Pratyush Anand wrote:
> > do_task_stat() calls get_wchan(), which further does unbind_frame().
> > unbind_frame() restores frame->pc to original value in case function
> > graph tracer has modified a return address (LR) in a stack frame to hook
> > a function return. However, if function graph tracer has hit a filtered
> > function, then we can't unwind it as ftrace_push_return_trace() has
> > biased the index(frame->graph) with a 'huge negative'
> > offset(-FTRACE_NOTRACE_DEPTH).
> > 
> > Moreover, arm64 stack walker defines index(frame->graph) as unsigned
> > int, which can not compare a -ve number.
> > 
> > Similar problem we can have with calling of walk_stackframe() from
> > save_stack_trace_tsk() or dump_backtrace().
> > 
> > This patch fixes unwind_frame() to test the index for -ve value and
> > restore index accordingly before we can restore frame->pc.
> 
> I've just spotted arm64's profile_pc, which does this:
> From arch/arm64/kernel/time.c:profile_pc():
> > #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > 	frame.graph = -1; /* no task info */
> > #endif
> 
> Is this another elaborate way of hitting this problem?
> 
> I guess the options are skip any return-address restore in the unwinder if
> frame.graph is -1. (and profile_pc may have a bug here). Or, put
> current->curr_ret_stack in there.
> 
> profile_pc() always passes tsk=NULL, so the unwinder assumes its current...
> kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is
> updated by handle_IPI ... so it looks like this should always be current...

Hmmm... is profile_pc the *only* case where frame->graph isn't equal to
tsk->curr_ret_stack in unwind_frame? If so, maybe unwind_frame should just
use that, and we could kill the graph member of struct stackframe completely?

Will
Pratyush Anand Sept. 13, 2017, 4:59 a.m. UTC | #3
On Wednesday 13 September 2017 08:12 AM, Will Deacon wrote:
> On Tue, Sep 12, 2017 at 10:54:28AM +0100, James Morse wrote:
>> Hi Pratyush,
>>
>> On 01/09/17 06:48, Pratyush Anand wrote:
>>> do_task_stat() calls get_wchan(), which further does unbind_frame().
>>> unbind_frame() restores frame->pc to original value in case function
>>> graph tracer has modified a return address (LR) in a stack frame to hook
>>> a function return. However, if function graph tracer has hit a filtered
>>> function, then we can't unwind it as ftrace_push_return_trace() has
>>> biased the index(frame->graph) with a 'huge negative'
>>> offset(-FTRACE_NOTRACE_DEPTH).
>>>
>>> Moreover, arm64 stack walker defines index(frame->graph) as unsigned
>>> int, which can not compare a -ve number.
>>>
>>> Similar problem we can have with calling of walk_stackframe() from
>>> save_stack_trace_tsk() or dump_backtrace().
>>>
>>> This patch fixes unwind_frame() to test the index for -ve value and
>>> restore index accordingly before we can restore frame->pc.
>>
>> I've just spotted arm64's profile_pc, which does this:
>>  From arch/arm64/kernel/time.c:profile_pc():
>>> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>> 	frame.graph = -1; /* no task info */
>>> #endif
>>
>> Is this another elaborate way of hitting this problem?
>>
>> I guess the options are skip any return-address restore in the unwinder if
>> frame.graph is -1. (and profile_pc may have a bug here). Or, put
>> current->curr_ret_stack in there.

I think we should go with latter, ie assign frame.graph = 
current->curr_ret_stack in profile_pc().

>>
>> profile_pc() always passes tsk=NULL, so the unwinder assumes its current...
>> kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is
>> updated by handle_IPI ... so it looks like this should always be current...
> 
> Hmmm... is profile_pc the *only* case where frame->graph isn't equal to
> tsk->curr_ret_stack in unwind_frame? If so, maybe unwind_frame should just

Yes, it is the only place.

> use that, and we could kill the graph member of struct stackframe completely?
> 

Humm, not sure, we initialize frame->graph out of the while loop in 
unwind_frame()'s caller and then keep in decrementing it in looped function.
diff mbox

Patch

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 5b6eafccc5d8..816db5b9b874 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -23,7 +23,7 @@  struct stackframe {
 	unsigned long sp;
 	unsigned long pc;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	unsigned int graph;
+	int graph;
 #endif
 };
 
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 09d37d66b630..4c47147d0554 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -75,6 +75,9 @@  int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	if (tsk->ret_stack &&
 			(frame->pc == (unsigned long)return_to_handler)) {
+		if (frame->graph < 0)
+			frame->graph += FTRACE_NOTRACE_DEPTH;
+
 		/*
 		 * This is a case where function graph tracer has
 		 * modified a return address (LR) in a stack frame