[0/2] tracing/arm: Fix the stack tracer when LR is saved after local storage

Message ID: 20190807163401.570339297@goodmis.org

Message

Steven Rostedt Aug. 7, 2019, 4:34 p.m. UTC
As arm64 saves the link register after a function's local variables are
stored, it causes the max stack tracer to be off by one in its output
of which function has the bloated stack frame.

The first patch fixes this by creating an ARCH_RET_ADDR_BEFORE_LOCAL_VARS
define that an architecture (arm64) may set in asm/ftrace.h, and this
will cause the stack tracer to make the shift.
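
To illustrate the off-by-one described above, here is a simplified
user-space model in C. It is a sketch under stated assumptions, not the
patch itself, and every name in it is hypothetical. It assumes the tracer
records, for each call-chain entry, the depth from the top of the stack at
which that entry's return address was found, and that it reports an entry's
frame size as the difference between adjacent depths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model only; all names here are hypothetical.
 * index[i] is the depth (bytes from the top of the stack) at which the
 * return address for call-chain entry i was found. Entry 0 is innermost.
 */

/* Reported size of entry i: difference to the next (outer) entry's depth. */
static size_t reported_size(const size_t *index, size_t n, size_t i)
{
	assert(i < n);
	return (i + 1 < n) ? index[i] - index[i + 1] : index[i];
}

/* Entry with the largest reported size (the "bloated" frame). */
static size_t largest_entry(const size_t *index, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (reported_size(index, n, i) > reported_size(index, n, best))
			best = i;
	return best;
}

/*
 * Assumed form of the shift: each entry takes the depth recorded for the
 * next outer entry, and the outermost entry is dropped.
 * Returns the new entry count.
 */
static size_t shift_indexes(size_t *index, size_t n)
{
	if (n > 1) {
		memmove(&index[0], &index[1], sizeof(index[0]) * (n - 1));
		n--;
	}
	return n;
}
```

As a worked case, take true frame sizes of 32, 400, 48 and 16 bytes for
entries 0 to 3, plus a 64-byte callee below entry 0. If each return address
is saved at the lowest address of the callee's frame, the recorded depths
are {560, 496, 464, 64}. largest_entry() then points at entry 2, although
entry 1 owns the 400 bytes; after shift_indexes() it points at entry 1.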

As it has been proven that the stack tracer isn't the most trivial
algorithm to understand by staring at the code, the second patch adds
comments to the code to explain the algorithm with and without the
ARCH_RET_ADDR_BEFORE_LOCAL_VARS.

Hmm, should this be sent to stable (and for inclusion now?)

-- Steve

Steven Rostedt (VMware) (2):
      tracing/arm64: Have max stack tracer handle the case of return address after data
      tracing: Document the stack trace algorithm in the comments

----
 arch/arm64/include/asm/ftrace.h |   1 +
 kernel/trace/trace_stack.c      | 112 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+)

Comments

Mark Rutland Aug. 7, 2019, 5:08 p.m. UTC | #1
Hi Steve,

On Wed, Aug 07, 2019 at 12:34:01PM -0400, Steven Rostedt wrote:
> As arm64 saves the link register after a function's local variables are
> stored, it causes the max stack tracer to be off by one in its output
> of which function has the bloated stack frame.

For reference, it's a bit more complex than that. :/

Our procedure call standard (the AAPCS) says that the frame record may
be placed anywhere within a stackframe, so we don't have a guarantee as
to where the saved lr will fall w.r.t local variables.
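
As a rough illustration of the frame record mentioned here, the sketch
below models it in C as a pair of slots (the caller's frame pointer and the
saved LR) and follows the chain to collect return addresses. The struct and
function names are hypothetical, not kernel APIs. Note what the model
leaves out: nothing in the record says where a function's local variables
sit relative to it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a frame record: two pointer-sized slots. */
struct frame_record {
	const struct frame_record *prev;	/* caller's frame record (saved x29) */
	uintptr_t lr;				/* saved return address (x30) */
};

/*
 * Follow the chain from the innermost record outwards, copying at most
 * max return addresses into out. Returns the number of entries written.
 */
static size_t walk_frame_records(const struct frame_record *fp,
				 uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (fp != NULL && n < max) {
		out[n++] = fp->lr;
		fp = fp->prev;
	}
	return n;
}
```

The walk yields the call chain, but attributing stack bytes to each
function still needs an assumption about where the record sits within its
frame.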

Today, GCC happens to create the stack frame by creating the stack
record, so the LR is saved at a lower address than the local variables.

However, I am aware that there are reasons why a compiler may choose to
place the frame record at a different location, e.g. using pointer
authentication to provide an implicit stack canary, so this could change
in future, or potentially differ across functions.

Maybe that's a bridge we'll have to cross in future.

Thanks,
Mark.

> 
> The first patch fixes this by creating an ARCH_RET_ADDR_BEFORE_LOCAL_VARS
> define that an architecture (arm64) may set in asm/ftrace.h, and this
> will cause the stack tracer to make the shift.
> 
> As it has been proven that the stack tracer isn't the most trivial
> algorithm to understand by staring at the code, the second patch adds
> comments to the code to explain the algorithm with and without the
> ARCH_RET_ADDR_BEFORE_LOCAL_VARS.
> 
> Hmm, should this be sent to stable (and for inclusion now?)
> 
> -- Steve
> 
> Steven Rostedt (VMware) (2):
>       tracing/arm64: Have max stack tracer handle the case of return address after data
>       tracing: Document the stack trace algorithm in the comments
> 
> ----
>  arch/arm64/include/asm/ftrace.h |   1 +
>  kernel/trace/trace_stack.c      | 112 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 113 insertions(+)

Steven Rostedt Aug. 7, 2019, 5:20 p.m. UTC | #2
On Wed, 7 Aug 2019 18:08:14 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> Hi Steve,
> 
> On Wed, Aug 07, 2019 at 12:34:01PM -0400, Steven Rostedt wrote:
> > As arm64 saves the link register after a function's local variables are
> > stored, it causes the max stack tracer to be off by one in its output
> > of which function has the bloated stack frame.
> 
> For reference, it's a bit more complex than that. :/

Yeah, I know it is. ;-)

> 
> Our procedure call standard (the AAPCS) says that the frame record may
> be placed anywhere within a stackframe, so we don't have a guarantee as
> to where the saved lr will fall w.r.t local variables.

Yep.

> 
> Today, GCC happens to create the stack frame by creating the stack
> record, so the LR is saved at a lower address than the local variables.

Which is what breaks the current algorithm (without this update).

> 
> However, I am aware that there are reasons why a compiler may choose to
> place the frame record at a different location, e.g. using pointer
> authentication to provide an implicit stack canary, so this could change
> in future, or potentially differ across functions.
> 
> Maybe that's a bridge we'll have to cross in future.

OK, how about I update the change log and add a comment stating that
this can change? Even if it does, it won't break anything; it will only
show the wrong stack size, which is usually only important for us
kernel developers anyway ;-)

Let me send a v2.

-- Steve