
[V2,00/29] stacktrace: Consolidate stack trace usage

Message ID 20190418084119.056416939@linutronix.de (mailing list archive)

Message

Thomas Gleixner April 18, 2019, 8:41 a.m. UTC
This is an update to V1:

 https://lkml.kernel.org/r/20190410102754.387743324@linutronix.de

Struct stack_trace is a sinkhole for input and output parameters, which is
largely pointless for most usage sites. In fact, when embedded into other data
structures, it creates indirections and extra storage overhead for no benefit.

Looking at all usage sites makes it clear that they just require an
interface which is based on a storage array. That array is either on stack,
global or embedded into some other data structure.

Some of the stack depot usage sites are outright wrong, but fortunately the
wrongness just causes more stack to be used for nothing and has no functional
impact.

Fix this up by:

  1) Providing plain storage array based interfaces for stacktrace and
     stackdepot (a rough sketch follows below).

  2) Cleaning up the mess at the callsites including some related
     cleanups.

  3) Removing the struct stack_trace based interfaces

  This is not yet changing the struct stack_trace interfaces at the
  architecture level, but it removes the exposure to the usage sites.
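
For reference, a rough sketch of the shape such storage array based
interfaces take (point #1 above); the prototypes follow the naming used in
the series, while the call site is purely illustrative:

   unsigned int stack_trace_save(unsigned long *store, unsigned int size,
                                 unsigned int skipnr);
   void stack_trace_print(const unsigned long *entries, unsigned int nr_entries,
                          int spaces);

   /* Hypothetical call site: the storage lives on stack or in the caller's
    * own data structure, no struct stack_trace involved.
    */
   static void example_dump_stack(void)
   {
           unsigned long entries[16];
           unsigned int nr;

           nr = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
           stack_trace_print(entries, nr, 0);
   }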

The last two patches extend the cleanup to the architecture level by
replacing the various save_stack_trace.* architecture interfaces with a more
unified arch_stack_walk() interface. x86 is converted, but I have already
worked through all architectures; it removes lots of duplicated code and
allows consolidation across the board. The rest of the architecture patches
are not included in this posting as I want to get feedback on the approach
itself. The current diffstat of cleaning up the remaining architectures, on
top of this lot, is:

   47 files changed, 402 insertions(+), 1196 deletions(-)

Once this has settled, the core interfaces can be improved by adding
features which allow getting rid of the imprecise 'skip number of entries'
approach that tries to remove the stack tracer and the callsites themselves
from the trace. That's error prone due to inlining and other issues. Having
e.g. a _RET_IP_ based filter would allow doing that far more reliably.
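
A minimal sketch of the _RET_IP_ idea, assuming a consume-callback style
interface like the one introduced later in the series; the names are
hypothetical:

   struct ret_ip_cookie {
           unsigned long   ret_ip;         /* _RET_IP_ captured at the callsite */
           bool            found;
           unsigned long   *store;
           unsigned int    size;
           unsigned int    len;
   };

   static bool ret_ip_consume(void *data, unsigned long addr, bool reliable)
   {
           struct ret_ip_cookie *c = data;

           if (!c->found) {
                   /* Still inside the stack trace machinery / the callsite */
                   c->found = (addr == c->ret_ip);
                   return true;
           }
           if (c->len >= c->size)
                   return false;           /* storage full, stop walking */
           c->store[c->len++] = addr;
           return true;
   }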

The series is based on:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/stacktrace

which contains the removal of the inconsistent and pointless ULONG_MAX
termination of stacktraces.

It's also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.core/stacktrace

   up to:  131038eb3e2f ("x86/stacktrace: Use common infrastructure")

Changes vs. V1:

   - Applied the ULONG_MAX termination cleanup in tip

   - Addressed the review comments

   - Fixed up the last users of struct stack_trace outside the stacktrace
     core and architecture code (livepatch, tracing)

   - Added the new arch_stack_walk() model and converted x86 to it

Thanks,

	tglx

---
 arch/x86/Kconfig                              |    1 
 arch/x86/kernel/stacktrace.c                  |  116 +--------
 drivers/gpu/drm/drm_mm.c                      |   22 -
 drivers/gpu/drm/i915/i915_vma.c               |   11 
 drivers/gpu/drm/i915/intel_runtime_pm.c       |   21 -
 drivers/md/dm-bufio.c                         |   15 -
 drivers/md/persistent-data/dm-block-manager.c |   19 -
 fs/btrfs/ref-verify.c                         |   15 -
 fs/proc/base.c                                |   14 -
 include/linux/ftrace.h                        |   18 -
 include/linux/lockdep.h                       |    9 
 include/linux/stackdepot.h                    |    8 
 include/linux/stacktrace.h                    |   80 +++++-
 kernel/backtracetest.c                        |   11 
 kernel/dma/debug.c                            |   13 -
 kernel/latencytop.c                           |   17 -
 kernel/livepatch/transition.c                 |   22 -
 kernel/locking/lockdep.c                      |   81 ++----
 kernel/stacktrace.c                           |  323 ++++++++++++++++++++++++--
 kernel/trace/trace.c                          |  105 +++-----
 kernel/trace/trace.h                          |    8 
 kernel/trace/trace_events_hist.c              |   12 
 kernel/trace/trace_stack.c                    |   76 ++----
 lib/Kconfig                                   |    4 
 lib/fault-inject.c                            |   12 
 lib/stackdepot.c                              |   50 ++--
 mm/kasan/common.c                             |   30 --
 mm/kasan/report.c                             |    7 
 mm/kmemleak.c                                 |   24 -
 mm/page_owner.c                               |   79 ++----
 mm/slub.c                                     |   12 
 31 files changed, 664 insertions(+), 571 deletions(-)

Comments

Steven Rostedt April 18, 2019, 8:13 p.m. UTC | #1
On Thu, 18 Apr 2019 14:58:55 -0500
Tom Zanussi <tom.zanussi@linux.intel.com> wrote:

> > Tom,
> > 
> > Can you review this too?  
> 
> Looks good to me too!
> 
> Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
> 

Would you be OK to upgrade this to a Reviewed-by tag?

Thanks!

-- Steve
Steven Rostedt April 18, 2019, 9:24 p.m. UTC | #2
On Thu, 18 Apr 2019 23:14:45 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Thu, 18 Apr 2019, Josh Poimboeuf wrote:
> 
> > On Thu, Apr 18, 2019 at 10:41:20AM +0200, Thomas Gleixner wrote:  
> > > - Remove the extra array member of stack_dump_trace[]. It's not required as
> > >   the stack tracer stores at max array size - 1 entries so there is still
> > >   an empty slot.  
> > 
> > What is the empty slot used for?  
> 
> I was trying to find an answer but failed. Maybe it's just historical
> leftovers or Steven knows where the magic is in this maze.
>

I believe it was a historical leftover (there was a time it was
required), and was left there for "paranoid" sake. But let me apply the
patch and see if it is really needed.

-- Steve
Steven Rostedt April 18, 2019, 9:50 p.m. UTC | #3
On Thu, 18 Apr 2019 17:24:43 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> I believe it was a historical leftover (there was a time it was
> required), and was left there for "paranoid" sake. But let me apply the
> patch and see if it is really needed.

I removed the +1 on the max_entries and set STACK_TRACE_ENTRIES to 5 (a
bit extreme). Then I ran the stack tracing with KASAN enabled and it
never complained.

As stated, it was there for historical reasons and I felt 500 was way
more than enough and left the buffer there just out of laziness and
paranoia.

Feel free to remove that if you want.

-- Steve
Steven Rostedt April 18, 2019, 10:19 p.m. UTC | #4
On Thu, 18 Apr 2019 10:41:20 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:


> @@ -412,23 +404,20 @@ stack_trace_sysctl(struct ctl_table *tab
>  		   void __user *buffer, size_t *lenp,
>  		   loff_t *ppos)
>  {
> -	int ret;
> +	int ret, was_enabled;

One small nit. Could this be:

	int was_enabled;
	int ret;

I prefer only joining variables that are related on the same line.
Makes it look cleaner IMO.

>  
>  	mutex_lock(&stack_sysctl_mutex);
> +	was_enabled = !!stack_tracer_enabled;
>  

Bah, not sure why I didn't do it this way to begin with. I think I
copied something else that couldn't do it this way for some reason and
didn't put any brain power behind the copy. :-/ But that was back in
2008 so I blame it on being "young and stupid" ;-)

Other than the above nit and removing the unneeded +1 in max_entries:

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


>  	ret = proc_dointvec(table, write, buffer, lenp, ppos);
>  
> -	if (ret || !write ||
> -	    (last_stack_tracer_enabled == !!stack_tracer_enabled))
> +	if (ret || !write || (was_enabled == !!stack_tracer_enabled))
>  		goto out;
>  
> -	last_stack_tracer_enabled = !!stack_tracer_enabled;
> -
>  	if (stack_tracer_enabled)
>  		register_ftrace_function(&trace_ops);
>  	else
>  		unregister_ftrace_function(&trace_ops);
> -
>   out:
>  	mutex_unlock(&stack_sysctl_mutex);
>  	return ret;
> @@ -444,7 +433,6 @@ static __init int enable_stacktrace(char
>  		strncpy(stack_trace_filter_buf, str + len, COMMAND_LINE_SIZE);
>  
>  	stack_tracer_enabled = 1;
> -	last_stack_tracer_enabled = 1;
>  	return 1;
>  }
>  __setup("stacktrace", enable_stacktrace);
>
Steven Rostedt April 19, 2019, 12:39 a.m. UTC | #5
On Fri, 19 Apr 2019 00:44:17 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Thu, 18 Apr 2019, Steven Rostedt wrote:
> > On Thu, 18 Apr 2019 10:41:20 +0200
> > Thomas Gleixner <tglx@linutronix.de> wrote:
> >   
> > > @@ -412,23 +404,20 @@ stack_trace_sysctl(struct ctl_table *tab
> > >  		   void __user *buffer, size_t *lenp,
> > >  		   loff_t *ppos)
> > >  {
> > > -	int ret;
> > > +	int ret, was_enabled;  
> > 
> > One small nit. Could this be:
> > 
> > 	int was_enabled;
> > 	int ret;
> > 
> > I prefer only joining variables that are related on the same line.
> > Makes it look cleaner IMO.  
> 
> If you wish so. To me it's a waste of screen space :)

At least you didn't say it helps the compiler ;-)

> 
> > >  
> > >  	mutex_lock(&stack_sysctl_mutex);
> > > +	was_enabled = !!stack_tracer_enabled;
> > >    
> > 
> > Bah, not sure why I didn't do it this way to begin with. I think I
> > copied something else that couldn't do it this way for some reason and
> > didn't put any brain power behind the copy. :-/ But that was back in
> > 2008 so I blame it on being "young and stupid" ;-)  
> 
> The young part is gone for sure :)

I purposely set you up for that response.

> 
> > Other than the above nit and removing the unneeded +1 in max_entries:
> 
> s/+1/-1/

That was an ode to G+

-- Steve
Peter Zijlstra April 19, 2019, 7:02 a.m. UTC | #6
On Thu, Apr 18, 2019 at 05:42:55PM +0200, Thomas Gleixner wrote:
> On Thu, 18 Apr 2019, Josh Poimboeuf wrote:

> > Another idea I had (but never got a chance to work on) was to extend the
> > x86 unwind interface to all arches.  So instead of the callbacks, each
> > arch would implement something like this API:

> I surely thought about that, but after staring at all incarnations of
> arch/*/stacktrace.c I just gave up.
> 
> Aside of that quite some archs already have callback based unwinders
> because they use them for more than stacktracing and just have a single
> implementation of that loop.
> 
> I'm fine either way. We can start with x86 and then let archs convert over
> their stuff, but I wouldn't hold my breath that this will be completed in
> the foreseeable future.

I suggested the same to Thomas early on, and I even spent the time to
convert some $random arch to the iterator interface, and while it is
indeed entirely feasible, it is _far_ more work.

The callback thing OTOH is flexible enough to do what we want to do now,
and allows converting most archs to it without too much pain (as Thomas
said, many archs are already in this form and only need minor API
adjustments), which gets us in a far better place than we are now.

And we can always go to iterators later on. But I think getting the
generic unwinder improved across all archs is a really important first
step here.
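
To illustrate why the conversion is cheap for archs that already have an
unwind loop: on x86 the callback variant can be little more than a wrapper
around the existing unwind iterator (sketch, not the actual patch):

   void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
                        struct task_struct *task, struct pt_regs *regs)
   {
           struct unwind_state state;
           unsigned long addr;

           for (unwind_start(&state, task, regs, NULL); !unwind_done(&state);
                unwind_next_frame(&state)) {
                   addr = unwind_get_return_address(&state);
                   if (!addr || !consume_entry(cookie, addr, false))
                           break;
           }
   }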
Peter Zijlstra April 19, 2019, 7:18 a.m. UTC | #7
On Thu, Apr 18, 2019 at 10:41:47AM +0200, Thomas Gleixner wrote:

> +typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr,
> +                                      bool reliable);

> +void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> +		     struct task_struct *task, struct pt_regs *regs);
> +int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry, void *cookie,
> +			     struct task_struct *task);

This bugs me a little; ideally the _reliable() thing would not exist.

Thomas said that the existing __save_stack_trace_reliable() is different
enough for the unification to be non-trivial, but maybe Josh can help
out?

From what I can see the biggest significant differences are:

 - it looks at the regs sets on the stack and for FP bails early
 - bails for kthreads and idle (after it does all the hard work!?!)

The first (FP checking for exceptions) should probably be reflected in
consume_fn(.reliable) anyway -- although that would mean a lot of extra
'?' entries where there are none today.

And the second (KTHREAD/IDLE) is something that the generic code can
easily do before calling into the arch unwinder.

Hmm?
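
A sketch of how both points could be handled outside the arch unwinder,
assuming the interfaces quoted above; the function name and error handling
are illustrative:

   struct reliable_cookie {
           unsigned long   *store;
           unsigned int    size;
           unsigned int    len;
           bool            unreliable;
   };

   static bool consume_reliable(void *cookie, unsigned long addr, bool reliable)
   {
           struct reliable_cookie *c = cookie;

           if (!reliable)
                   c->unreliable = true;   /* noted via the callback, no arch bail-out */
           if (c->len >= c->size)
                   return false;
           c->store[c->len++] = addr;
           return true;
   }

   int stack_trace_save_tsk_reliable(struct task_struct *tsk,
                                     unsigned long *store, unsigned int size)
   {
           struct reliable_cookie c = { .store = store, .size = size };

           /* The kthread/idle bail-out can live in generic code */
           if (tsk->flags & (PF_KTHREAD | PF_IDLE))
                   return -EINVAL;

           if (arch_stack_walk_reliable(consume_reliable, &c, tsk) || c.unreliable)
                   return -EINVAL;
           return c.len;
   }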
Peter Zijlstra April 19, 2019, 9:07 a.m. UTC | #8
On Fri, Apr 19, 2019 at 10:32:30AM +0200, Thomas Gleixner wrote:
> On Fri, 19 Apr 2019, Peter Zijlstra wrote:
> > On Thu, Apr 18, 2019 at 10:41:47AM +0200, Thomas Gleixner wrote:
> > 
> > > +typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr,
> > > +                                      bool reliable);
> > 
> > > +void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> > > +		     struct task_struct *task, struct pt_regs *regs);
> > > +int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry, void *cookie,
> > > +			     struct task_struct *task);
> > 
> > This bugs me a little; ideally the _reliable() thing would not exist.
> > 
> > Thomas said that the existing __save_stack_trace_reliable() is different
> > enough for the unification to be non-trivial, but maybe Josh can help
> > out?
> > 
> > From what I can see the biggest significant differences are:
> > 
> >  - it looks at the regs sets on the stack and for FP bails early
> >  - bails for kthreads and idle (after it does all the hard work!?!)
> > 
> > The first (FP checking for exceptions) should probably be reflected in
> > consume_fn(.reliable) anyway -- although that would mean a lot of extra
> > '?' entries where there are none today.
> > 
> > And the second (KTHREAD/IDLE) is something that the generic code can
> > easily do before calling into the arch unwinder.
> 
> And looking at the powerpc version of it, that has even more interesting
> extra checks in that function.

Right, but not fundamentally different from determining @reliable I
think.

Anyway, it would be good if someone knowledgeable could have a look at
this.
Steven Rostedt April 19, 2019, 1:28 p.m. UTC | #9
On Thu, 18 Apr 2019 10:41:41 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> It's only used in trace.c and there is absolutely no point in compiling it
> in when user space stack traces are not supported.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>

Funny, these were made global functions along with ftrace_trace_stack(),
but I guess they were never used.

This basically just does a partial revert of:

 c0a0d0d3f6528 ("tracing/core: Make the stack entry helpers global")


> ---
>  kernel/trace/trace.c |   14 ++++++++------
>  kernel/trace/trace.h |    8 --------
>  2 files changed, 8 insertions(+), 14 deletions(-)
> 
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -159,6 +159,8 @@ static union trace_eval_map_item *trace_
>  #endif /* CONFIG_TRACE_EVAL_MAP_FILE */
>  
>  static int tracing_set_tracer(struct trace_array *tr, const char *buf);
> +static void ftrace_trace_userstack(struct ring_buffer *buffer,
> +				   unsigned long flags, int pc);
>  
>  #define MAX_TRACER_SIZE		100
>  static char bootup_tracer_buf[MAX_TRACER_SIZE] __initdata;
> @@ -2905,9 +2907,10 @@ void trace_dump_stack(int skip)
>  }
>  EXPORT_SYMBOL_GPL(trace_dump_stack);
>  
> +#ifdef CONFIG_USER_STACKTRACE_SUPPORT
>  static DEFINE_PER_CPU(int, user_stack_count);
>  
> -void
> +static void
>  ftrace_trace_userstack(struct ring_buffer *buffer, unsigned long flags, int pc)
>  {
>  	struct trace_event_call *call = &event_user_stack;
> @@ -2958,13 +2961,12 @@ ftrace_trace_userstack(struct ring_buffe
>   out:
>  	preempt_enable();
>  }
> -
> -#ifdef UNUSED

Strange, I never knew about this ifdef. I would have nuked it when I
saw it.

Anyway,

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


> -static void __trace_userstack(struct trace_array *tr, unsigned long flags)
> +#else /* CONFIG_USER_STACKTRACE_SUPPORT */
> +static void ftrace_trace_userstack(struct ring_buffer *buffer,
> +				   unsigned long flags, int pc)
>  {
> -	ftrace_trace_userstack(tr, flags, preempt_count());
>  }
> -#endif /* UNUSED */
> +#endif /* !CONFIG_USER_STACKTRACE_SUPPORT */
>  
>  #endif /* CONFIG_STACKTRACE */
>  
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -782,17 +782,9 @@ void update_max_tr_single(struct trace_a
>  #endif /* CONFIG_TRACER_MAX_TRACE */
>  
>  #ifdef CONFIG_STACKTRACE
> -void ftrace_trace_userstack(struct ring_buffer *buffer, unsigned long flags,
> -			    int pc);
> -
>  void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
>  		   int pc);
>  #else
> -static inline void ftrace_trace_userstack(struct ring_buffer *buffer,
> -					  unsigned long flags, int pc)
> -{
> -}
> -
>  static inline void __trace_stack(struct trace_array *tr, unsigned long flags,
>  				 int skip, int pc)
>  {
>
Josh Poimboeuf April 19, 2019, 3:50 p.m. UTC | #10
On Fri, Apr 19, 2019 at 09:02:11AM +0200, Peter Zijlstra wrote:
> On Thu, Apr 18, 2019 at 05:42:55PM +0200, Thomas Gleixner wrote:
> > On Thu, 18 Apr 2019, Josh Poimboeuf wrote:
> 
> > > Another idea I had (but never got a chance to work on) was to extend the
> > > x86 unwind interface to all arches.  So instead of the callbacks, each
> > > arch would implement something like this API:
> 
> > I surely thought about that, but after staring at all incarnations of
> > arch/*/stacktrace.c I just gave up.
> > 
> > Aside of that quite some archs already have callback based unwinders
> > because they use them for more than stacktracing and just have a single
> > implementation of that loop.
> > 
> > I'm fine either way. We can start with x86 and then let archs convert over
> > their stuff, but I wouldn't hold my breath that this will be completed in
> > the foreseeable future.
> 
> I suggested the same to Thomas early on, and I even spent the time to
> convert some $random arch to the iterator interface, and while it is
> indeed entirely feasible, it is _far_ more work.
> 
> The callback thing OTOH is flexible enough to do what we want to do now,
> and allows converting most archs to it without too much pain (as Thomas
> said, many archs are already in this form and only need minor API
> adjustments), which gets us in a far better place than we are now.
> 
> And we can always go to iterators later on. But I think getting the
> generic unwinder improved across all archs is a really important first
> step here.

Fair enough.
Josh Poimboeuf April 19, 2019, 4:17 p.m. UTC | #11
On Fri, Apr 19, 2019 at 11:07:17AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 19, 2019 at 10:32:30AM +0200, Thomas Gleixner wrote:
> > On Fri, 19 Apr 2019, Peter Zijlstra wrote:
> > > On Thu, Apr 18, 2019 at 10:41:47AM +0200, Thomas Gleixner wrote:
> > > 
> > > > +typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr,
> > > > +                                      bool reliable);
> > > 
> > > > +void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> > > > +		     struct task_struct *task, struct pt_regs *regs);
> > > > +int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry, void *cookie,
> > > > +			     struct task_struct *task);
> > > 
> > > This bugs me a little; ideally the _reliable() thing would not exist.
> > > 
> > > Thomas said that the existing __save_stack_trace_reliable() is different
> > > enough for the unification to be non-trivial, but maybe Josh can help
> > > out?
> > > 
> > > From what I can see the biggest significant differences are:
> > > 
> > >  - it looks at the regs sets on the stack and for FP bails early
> > >  - bails for kthreads and idle (after it does all the hard work!?!)

That's done for a reason, see the "Success path" comments.

> > > The first (FP checking for exceptions) should probably be reflected in
> > > consume_fn(.reliable) anyway -- although that would mean a lot of extra
> > > '?' entries where there are none today.
> > > 
> > > And the second (KTHREAD/IDLE) is something that the generic code can
> > > easily do before calling into the arch unwinder.
> > 
> > And looking at the powerpc version of it, that has even more interesting
> > extra checks in that function.
> 
> Right, but not fundamentally different from determining @reliable I
> think.
> 
> Anyway, it would be good if someone knowledgeable could have a look at
> this.

Yeah, we could probably do that.

The flow would need to be changed a bit -- some of the errors are soft
errors which most users don't care about because they just want a best
effort.  The soft errors can be remembered without breaking out of the
loop, and just returned at the end.  Most users could just ignore the
return code.

The only thing I'd be worried about is performance for the non-livepatch
users, but I guess those checks don't look very expensive.  And the x86
unwinders are already pretty slow anyway...
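
Roughly, the flow change described here could look like the following on
x86, as an illustrative sketch on top of the existing unwind API rather than
a proposed patch:

   int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
                                void *cookie, struct task_struct *task)
   {
           struct unwind_state state;
           unsigned long addr;
           int ret = 0;

           for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
                unwind_next_frame(&state)) {
                   /* e.g. in-kernel pt_regs on the stack: fatal for livepatch,
                    * harmless for a best-effort trace */
                   if (unwind_get_entry_regs(&state, NULL))
                           ret = -EINVAL;

                   addr = unwind_get_return_address(&state);
                   if (!addr)
                           return -EINVAL;         /* hard error, unwind is broken */

                   if (!consume_entry(cookie, addr, !ret))
                           return -EINVAL;         /* out of storage */
           }

           if (unwind_error(&state))
                   ret = -EINVAL;
           return ret;                             /* soft errors reported at the end */
   }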
Steven Rostedt April 19, 2019, 8:11 p.m. UTC | #12
On Thu, 18 Apr 2019 10:41:42 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Replace the indirection through struct stack_trace by using the storage
> array based interfaces.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Rostedt <rostedt@goodmis.org>

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve
Steven Rostedt April 19, 2019, 8:11 p.m. UTC | #13
On Thu, 18 Apr 2019 10:41:43 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Simplify the stack retrieval code by using the storage array based
> interface.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve
Miroslav Benes April 23, 2019, 8:18 a.m. UTC | #14
On Thu, 18 Apr 2019, Thomas Gleixner wrote:

> Replace the indirection through struct stack_trace by using the storage
> array based interfaces.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Miroslav Benes <mbenes@suse.cz>

Feel free to take it through tip or let us know to pick it up.

Miroslav
Peter Zijlstra April 24, 2019, 7:45 p.m. UTC | #15
On Thu, Apr 18, 2019 at 10:41:37AM +0200, Thomas Gleixner wrote:
> There is only one caller of check_prev_add() which hands in a zeroed struct
> stack trace and a function pointer to save_stack(). Inside check_prev_add()
> the stack_trace struct is checked for being empty, which is always
> true. Based on that one code path stores a stack trace which is unused. The
> comment there does not make sense either. It's all leftovers from
> historical lockdep code (cross release).

I was more or less expecting a revert of:

ce07a9415f26 ("locking/lockdep: Make check_prev_add() able to handle external stack_trace")

And then I read the comment that went with the "static struct
stack_trace trace" that got removed (in the above commit) and realized
that your patch will consume more stack entries.

The problem is when the held lock stack in check_prevs_add() has multiple
trylock entries on top, in that case we call check_prev_add() multiple
times, and this patch will then save the exact same stack-trace multiple
times, consuming static resources.

Possibly we should copy what stackdepot does (but we cannot use it
directly because stackdepot uses locks; but possibly we can share bits),
but that is a patch for another day I think.

So while convoluted, perhaps we should retain this code for now.
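
A sketch of the stackdepot-like direction for lockdep's static pool, with
illustrative names and the locking/pool allocation left out:

   #include <linux/jhash.h>
   #include <linux/list.h>
   #include <linux/string.h>

   struct lock_trace {
           struct hlist_node       hash_entry;
           u32                     hash;
           u32                     nr_entries;
           unsigned long           entries[];
   };

   static struct hlist_head trace_hash[256];

   /* Reuse an identical, earlier saved trace instead of storing the same
    * trylock-induced trace again. */
   static struct lock_trace *find_trace(const unsigned long *entries,
                                        unsigned int nr_entries)
   {
           u32 hash = jhash(entries, nr_entries * sizeof(entries[0]), 0);
           struct hlist_head *head = &trace_hash[hash % ARRAY_SIZE(trace_hash)];
           struct lock_trace *t;

           hlist_for_each_entry(t, head, hash_entry) {
                   if (t->hash == hash && t->nr_entries == nr_entries &&
                       !memcmp(t->entries, entries, nr_entries * sizeof(entries[0])))
                           return t;       /* deduplicated, no extra storage */
           }
           return NULL;                    /* caller allocates from the static pool */
   }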
Peter Zijlstra April 24, 2019, 7:45 p.m. UTC | #16
On Thu, Apr 18, 2019 at 10:41:38AM +0200, Thomas Gleixner wrote:
> Replace the indirection through struct stack_trace by using the storage
> array based interfaces and storing the information is a small lockdep
> specific data structure.
> 

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Thomas Gleixner April 24, 2019, 7:51 p.m. UTC | #17
On Wed, 24 Apr 2019, Peter Zijlstra wrote:
> On Thu, Apr 18, 2019 at 10:41:37AM +0200, Thomas Gleixner wrote:
> > There is only one caller of check_prev_add() which hands in a zeroed struct
> > stack trace and a function pointer to save_stack(). Inside check_prev_add()
> > the stack_trace struct is checked for being empty, which is always
> > true. Based on that one code path stores a stack trace which is unused. The
> > comment there does not make sense either. It's all leftovers from
> > historical lockdep code (cross release).
> 
> I was more or less expecting a revert of:
> 
> ce07a9415f26 ("locking/lockdep: Make check_prev_add() able to handle external stack_trace")
> 
> And then I read the comment that went with the "static struct
> stack_trace trace" that got removed (in the above commit) and realized
> that your patch will consume more stack entries.
> 
> The problem is when the held lock stack in check_prevs_add() has multiple
> trylock entries on top, in that case we call check_prev_add() multiple
> times, and this patch will then save the exact same stack-trace multiple
> times, consuming static resources.
> 
> Possibly we should copy what stackdepot does (but we cannot use it
> directly because stackdepot uses locks; but possibly we can share bits),
> but that is a patch for another day I think.
> 
> So while convoluted, perhaps we should retain this code for now.

Uurg, what a mess.