Message ID | 20210429014321.196606-1-leo.yan@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: stacktrace: Stop unwinding when the PC is zero | expand |
Hi Leo, On Thu, Apr 29, 2021 at 09:43:21AM +0800, Leo Yan wrote: > When use ftrace for stack trace, it reports the spurious frame with the > PC value is zero. This can be reproduced with commands: > > # cd /sys/kernel/debug/tracing/ > # echo "prev_pid == 0" > events/sched/sched_switch/filter > # echo stacktrace > events/sched/sched_switch/trigger > # echo 1 > events/sched/sched_switch/enable > # cat trace > > <idle>-0 [005] d..2 259.621390: sched_switch: ... > <idle>-0 [005] d..3 259.621394: <stack trace> > => __schedule > => schedule_idle > => do_idle > => cpu_startup_entry > => secondary_start_kernel > => 0 IIUC, this is my fault, and is an unintended side-effect of commit: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") ... since before prior to that, we'd implicitly create a terminal record in start_kernel and secondary_start_kernel by virtue of entering those functions with both FP and LR set to NULL. After that commit, we report the NULL LR before trying to unwind the NULL FP. > The kernel initializes FP/PC values as zero for swapper threads in > head.S, when walk the stack frame, this patch stops unwinding if detect > the PC value is zero, therefore can avoid the spurious frame. > > Below is the stacktrace after applying the change: > > # cat trace > > <idle>-0 [005] d..2 259.621390: sched_switch: ... > <idle>-0 [005] d..3 259.621394: <stack trace> > => __schedule > => schedule_idle > => do_idle > => cpu_startup_entry > => secondary_start_kernel > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > --- > arch/arm64/kernel/stacktrace.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index 84b676bcf867..02b1e85b2026 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -145,7 +145,11 @@ void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame, > if (!fn(data, frame->pc)) > break; > ret = unwind_frame(tsk, frame); > - if (ret < 0) > + /* > + * When the frame->pc is zero, it has reached to the initial pc > + * and fp values; stop unwinding for this case. > + */ > + if (ret < 0 || !frame->pc) > break; I don't think this is the right place for this, since we intend unwind_frame() to detect when unwinding is finished; see commit: 3c02600144bdb0a1 ("arm64: stacktrace: Report when we reach the end of the stack") I think we have three options for what to do here: a) Revert 6106e1112cc69a36, and identify these cases as terminal records where FP and LR are both NULL. b) Have __primary_switched and __secondary_switched call start_kernel and secondary_start_kernel with BL rather than B. The __*_switched functions will show up in the trace, but we won't unwind any further as the next record will have a NULL FP. c) Revert 6106e1112cc69a36, create terminal records in __primary_switched and __secondary_switched, and call start_kernel and secondary_start_kernel with BL rather than B. The __*_switched functions will show up in the trace, but we won't unwind any further as the next record will be a terminal record. For RELIABLE_STACKTRACE, we're going to have to do (c), I think, but for now we could do (a) so as to have a minimal fix, and we can build (c) atop that. How about the patch below? I've tested it with your instructions and also by inspecting /proc/self/stack. Thanks, Mark. ---->8---- From b99e647b34b74059f3013c09f12fbd542c7679fd Mon Sep 17 00:00:00 2001 From: Mark Rutland <mark.rutland@arm.com> Date: Thu, 29 Apr 2021 11:20:04 +0100 Subject: [PATCH] arm64: stacktrace: restore terminal records We removed the terminal frame records in commit: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") ... on the assumption that as we no longer used them to find the pt_regs at exception boundaries, they were no longer necessary. However, Leo reports that as an unintended side-effect, this causes traces which cross secondary_start_kernel to terminate one entry too late, with a spurious "0" entry. There are a few ways we could sovle this, but as we're planning to use terminal records for RELIABLE_STACKTRACE, let's revert the logic change for now, keeping the update comments and accounting for the changes in commit: 3c02600144bdb0a1 ("arm64: stacktrace: Report when we reach the end of the stack") This is effectively a partial revert of commit: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") Reported-by: Leo Yan <leo.yan@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> --- arch/arm64/kernel/entry.S | 6 +++--- arch/arm64/kernel/stacktrace.c | 10 ++++++---- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 6acfc5e6b5e0..9b205744a233 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -263,16 +263,16 @@ alternative_else_nop_endif stp lr, x21, [sp, #S_LR] /* - * For exceptions from EL0, terminate the callchain here. + * For exceptions from EL0, create a terminal frame record. * For exceptions from EL1, create a synthetic frame record so the * interrupted code shows up in the backtrace. */ .if \el == 0 - mov x29, xzr + stp xzr, xzr, [sp, #S_STACKFRAME] .else stp x29, x22, [sp, #S_STACKFRAME] - add x29, sp, #S_STACKFRAME .endif + add x29, sp, #S_STACKFRAME #ifdef CONFIG_ARM64_SW_TTBR0_PAN alternative_if_not ARM64_HAS_PAN diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index d55bdfb7789c..7032a5f9e624 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -44,10 +44,6 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) unsigned long fp = frame->fp; struct stack_info info; - /* Terminal record; nothing to unwind */ - if (!fp) - return -ENOENT; - if (fp & 0xf) return -EINVAL; @@ -108,6 +104,12 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) frame->pc = ptrauth_strip_insn_pac(frame->pc); + /* + * This is a terminal record, so we have finished unwinding. + */ + if (!frame->fp && !frame->pc) + return -ENOENT; + return 0; } NOKPROBE_SYMBOL(unwind_frame);
Hi Mark, On Thu, Apr 29, 2021 at 11:48:13AM +0100, Mark Rutland wrote: [...] > > --- a/arch/arm64/kernel/stacktrace.c > > +++ b/arch/arm64/kernel/stacktrace.c > > @@ -145,7 +145,11 @@ void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame, > > if (!fn(data, frame->pc)) > > break; > > ret = unwind_frame(tsk, frame); > > - if (ret < 0) > > + /* > > + * When the frame->pc is zero, it has reached to the initial pc > > + * and fp values; stop unwinding for this case. > > + */ > > + if (ret < 0 || !frame->pc) > > break; > > I don't think this is the right place for this, since we intend > unwind_frame() to detect when unwinding is finished; see commit: > > 3c02600144bdb0a1 ("arm64: stacktrace: Report when we reach the end of the stack") > > I think we have three options for what to do here: > > a) Revert 6106e1112cc69a36, and identify these cases as terminal records > where FP and LR are both NULL. > > b) Have __primary_switched and __secondary_switched call start_kernel > and secondary_start_kernel with BL rather than B. The __*_switched > functions will show up in the trace, but we won't unwind any further > as the next record will have a NULL FP. > > c) Revert 6106e1112cc69a36, create terminal records in > __primary_switched and __secondary_switched, and call start_kernel > and secondary_start_kernel with BL rather than B. The __*_switched > functions will show up in the trace, but we won't unwind any further > as the next record will be a terminal record. > > For RELIABLE_STACKTRACE, we're going to have to do (c), I think, but for > now we could do (a) so as to have a minimal fix, and we can build (c) > atop that. > > How about the patch below? I've tested it with your instructions and > also by inspecting /proc/self/stack. Thanks a lot for the quick fixing, and appreciate for sharing the background knowledge! > ---->8---- > From b99e647b34b74059f3013c09f12fbd542c7679fd Mon Sep 17 00:00:00 2001 > From: Mark Rutland <mark.rutland@arm.com> > Date: Thu, 29 Apr 2021 11:20:04 +0100 > Subject: [PATCH] arm64: stacktrace: restore terminal records > > We removed the terminal frame records in commit: > > 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") > > ... on the assumption that as we no longer used them to find the pt_regs > at exception boundaries, they were no longer necessary. > > However, Leo reports that as an unintended side-effect, this causes > traces which cross secondary_start_kernel to terminate one entry too > late, with a spurious "0" entry. > > There are a few ways we could sovle this, but as we're planning to use > terminal records for RELIABLE_STACKTRACE, let's revert the logic change > for now, keeping the update comments and accounting for the changes in > commit: > > 3c02600144bdb0a1 ("arm64: stacktrace: Report when we reach the end of the stack") > > This is effectively a partial revert of commit: > > 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > Fixes: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") > Reported-by: Leo Yan <leo.yan@linaro.org> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: Mark Brown <broonie@kernel.org> > Cc: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> For this patch, I tested at my side and it works as expected. Though I don't have complete knowledge for reviewing this patch, I went through the history commits your mentioned and connected with this patch, it looks good to me: Tested-by: Leo Yan <leo.yan@linaro.org> > --- > arch/arm64/kernel/entry.S | 6 +++--- > arch/arm64/kernel/stacktrace.c | 10 ++++++---- > 2 files changed, 9 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S > index 6acfc5e6b5e0..9b205744a233 100644 > --- a/arch/arm64/kernel/entry.S > +++ b/arch/arm64/kernel/entry.S > @@ -263,16 +263,16 @@ alternative_else_nop_endif > stp lr, x21, [sp, #S_LR] > > /* > - * For exceptions from EL0, terminate the callchain here. > + * For exceptions from EL0, create a terminal frame record. > * For exceptions from EL1, create a synthetic frame record so the > * interrupted code shows up in the backtrace. > */ > .if \el == 0 > - mov x29, xzr > + stp xzr, xzr, [sp, #S_STACKFRAME] > .else > stp x29, x22, [sp, #S_STACKFRAME] > - add x29, sp, #S_STACKFRAME > .endif > + add x29, sp, #S_STACKFRAME > > #ifdef CONFIG_ARM64_SW_TTBR0_PAN > alternative_if_not ARM64_HAS_PAN > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index d55bdfb7789c..7032a5f9e624 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -44,10 +44,6 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) > unsigned long fp = frame->fp; > struct stack_info info; > > - /* Terminal record; nothing to unwind */ > - if (!fp) > - return -ENOENT; > - > if (fp & 0xf) > return -EINVAL; > > @@ -108,6 +104,12 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame) > > frame->pc = ptrauth_strip_insn_pac(frame->pc); > > + /* > + * This is a terminal record, so we have finished unwinding. > + */ > + if (!frame->fp && !frame->pc) > + return -ENOENT; > + > return 0; > } > NOKPROBE_SYMBOL(unwind_frame); > -- > 2.11.0 > > >
On Thu, Apr 29, 2021 at 11:48:13AM +0100, Mark Rutland wrote: > From b99e647b34b74059f3013c09f12fbd542c7679fd Mon Sep 17 00:00:00 2001 > From: Mark Rutland <mark.rutland@arm.com> > Date: Thu, 29 Apr 2021 11:20:04 +0100 > Subject: [PATCH] arm64: stacktrace: restore terminal records > > We removed the terminal frame records in commit: > > 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") > > ... on the assumption that as we no longer used them to find the pt_regs > at exception boundaries, they were no longer necessary. > > However, Leo reports that as an unintended side-effect, this causes > traces which cross secondary_start_kernel to terminate one entry too > late, with a spurious "0" entry. > > There are a few ways we could sovle this, but as we're planning to use > terminal records for RELIABLE_STACKTRACE, let's revert the logic change > for now, keeping the update comments and accounting for the changes in > commit: > > 3c02600144bdb0a1 ("arm64: stacktrace: Report when we reach the end of the stack") > > This is effectively a partial revert of commit: > > 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > Fixes: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") > Reported-by: Leo Yan <leo.yan@linaro.org> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: Mark Brown <broonie@kernel.org> > Cc: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> Thanks Mark. I applied it to for-next/core (couldn't figure out the combination of b4 and git am + the scissors and not replying to the top message).
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index 84b676bcf867..02b1e85b2026 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -145,7 +145,11 @@ void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame, if (!fn(data, frame->pc)) break; ret = unwind_frame(tsk, frame); - if (ret < 0) + /* + * When the frame->pc is zero, it has reached to the initial pc + * and fp values; stop unwinding for this case. + */ + if (ret < 0 || !frame->pc) break; } }
When use ftrace for stack trace, it reports the spurious frame with the PC value is zero. This can be reproduced with commands: # cd /sys/kernel/debug/tracing/ # echo "prev_pid == 0" > events/sched/sched_switch/filter # echo stacktrace > events/sched/sched_switch/trigger # echo 1 > events/sched/sched_switch/enable # cat trace <idle>-0 [005] d..2 259.621390: sched_switch: ... <idle>-0 [005] d..3 259.621394: <stack trace> => __schedule => schedule_idle => do_idle => cpu_startup_entry => secondary_start_kernel => 0 The kernel initializes FP/PC values as zero for swapper threads in head.S, when walk the stack frame, this patch stops unwinding if detect the PC value is zero, therefore can avoid the spurious frame. Below is the stacktrace after applying the change: # cat trace <idle>-0 [005] d..2 259.621390: sched_switch: ... <idle>-0 [005] d..3 259.621394: <stack trace> => __schedule => schedule_idle => do_idle => cpu_startup_entry => secondary_start_kernel Signed-off-by: Leo Yan <leo.yan@linaro.org> --- arch/arm64/kernel/stacktrace.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)