diff mbox

[v3,2/8] lockdep: Introduce CROSSRELEASE_STACK_TRACE and make it not unwind as default

Message ID 1508837889-16932-3-git-send-email-byungchul.park@lge.com (mailing list archive)
State New, archived
Headers show

Commit Message

Byungchul Park Oct. 24, 2017, 9:38 a.m. UTC
Johan Hovold reported a performance regression by crossrelease like:

> Boot time (from "Linux version" to login prompt) had in fact doubled
> since 4.13 where it took 17 seconds (with my current config) compared to
> the 35 seconds I now see with 4.14-rc4.
>
> I quick bisect pointed to lockdep and specifically the following commit:
>
> 	28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition
> 	               of a crosslock")
>
> which I've verified is the commit which doubled the boot time (compared
> to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently crossrelease performs unwind on every acquisition. But, that
overloads systems too much. So this patch makes unwind optional and set
it to N as default. Instead, it records only acquire_ip normally. Of
course, unwind is sometimes required for full analysis. In that case, we
can set CROSSRELEASE_STACK_TRACE to Y and use it.

In my qemu ubuntu machin (x86_64, 4 cores, 512M), the regression was
fixed like, measuring booting times with 'perf stat --null --repeat 10
$QEMU', where $QEMU launchs a kernel with init=/bin/true:

1. No lockdep enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + crossrelease enabled

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + crossrelease enabled + this patch applied

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/lockdep.h  |  4 ++++
 kernel/locking/lockdep.c |  5 +++++
 lib/Kconfig.debug        | 15 +++++++++++++++
 3 files changed, 24 insertions(+)

Comments

Ingo Molnar Oct. 24, 2017, 10:05 a.m. UTC | #1
* Byungchul Park <byungchul.park@lge.com> wrote:

> Johan Hovold reported a performance regression by crossrelease like:

Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!

Thanks,

	Ingo
Ingo Molnar Oct. 24, 2017, 10:06 a.m. UTC | #2
Cannot pick up this series yet, but I have enhanced the changelog to:

=============>
Subject: locking/lockdep: Introduce CONFIG_CROSSRELEASE_STACK_TRACE and make it not unwind by default
From: Byungchul Park <byungchul.park@lge.com>
Date: Tue, 24 Oct 2017 18:38:03 +0900

Johan Hovold reported a heavy performance regression caused by
lockdep cross-release:

 > Boot time (from "Linux version" to login prompt) had in fact doubled
 > since 4.13 where it took 17 seconds (with my current config) compared to
 > the 35 seconds I now see with 4.14-rc4.
 >
 > I quick bisect pointed to lockdep and specifically the following commit:
 >
 >	28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition
 >	               of a crosslock")
 >
 > which I've verified is the commit which doubled the boot time (compared
 > to 28a903f63ec0^) (added by lockdep crossrelease series [1]).

Currently crossrelease performs unwind on every acquisition, but that is
very expensive.

This patch makes unwind optional and disables it by default and only records
acquire_ip.

Full stack traces are sometimes required for full analysis, in which
case CROSSRELEASE_STACK_TRACE can be enabled.

On my qemu Ubuntu machine (x86_64, 4 cores, 512M), the regression was
fixed. We measure boot times with 'perf stat --null --repeat 10 $QEMU',
where $QEMU launches a kernel with init=/bin/true:

1. No lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.756558155 seconds time elapsed                    ( +-  0.09% )

2. Lockdep enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.968710420 seconds time elapsed                    ( +-  0.12% )

3. Lockdep enabled + crossrelease enabled:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       3.153839636 seconds time elapsed                    ( +-  0.31% )

4. Lockdep enabled + crossrelease enabled + this patch applied:

 Performance counter stats for 'qemu_booting_time.sh bzImage' (10 runs):

       2.963669551 seconds time elapsed                    ( +-  0.11% )

I.e. lockdep-crossrelease performance is now indistinguishable
from vanilla lockdep.
Byungchul Park Oct. 25, 2017, 1:01 a.m. UTC | #3
On Tue, Oct 24, 2017 at 12:05:16PM +0200, Ingo Molnar wrote:
> 
> * Byungchul Park <byungchul.park@lge.com> wrote:
> 
> > Johan Hovold reported a performance regression by crossrelease like:
> 
> Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!

Excuse me but, I am sure, whom is the issue analyzed by? Me?

> Thanks,
> 
> 	Ingo
Ingo Molnar Oct. 25, 2017, 5:53 a.m. UTC | #4
* Byungchul Park <byungchul.park@lge.com> wrote:

> On Tue, Oct 24, 2017 at 12:05:16PM +0200, Ingo Molnar wrote:
> > 
> > * Byungchul Park <byungchul.park@lge.com> wrote:
> > 
> > > Johan Hovold reported a performance regression by crossrelease like:
> > 
> > Pplease add Reported-by and Analyzed-by tags - you didn't even Cc: Johan!
> 
> Excuse me but, I am sure, whom is the issue analyzed by? Me?

Well, Johan tracked it all down for us, Thomas gave the right suggestion to fix 
the performance regression, so I meant something like:

  Reported-by: Johan Hovold <johan@kernel.org>
  Bisected-by: Johan Hovold <johan@kernel.org>
  Analyzed-by: Thomas Gleixner <tglx@linutronix.de>

Thanks,

	Ingo
diff mbox

Patch

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index bfa8e0b..70358b5 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -278,7 +278,11 @@  struct held_lock {
 };
 
 #ifdef CONFIG_LOCKDEP_CROSSRELEASE
+#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
 #define MAX_XHLOCK_TRACE_ENTRIES 5
+#else
+#define MAX_XHLOCK_TRACE_ENTRIES 1
+#endif
 
 /*
  * This is for keeping locks waiting for commit so that true dependencies
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e36e652..5c2ddf2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4863,8 +4863,13 @@  static void add_xhlock(struct held_lock *hlock)
 	xhlock->trace.nr_entries = 0;
 	xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES;
 	xhlock->trace.entries = xhlock->trace_entries;
+#ifdef CONFIG_CROSSRELEASE_STACK_TRACE
 	xhlock->trace.skip = 3;
 	save_stack_trace(&xhlock->trace);
+#else
+	xhlock->trace.nr_entries = 1;
+	xhlock->trace.entries[0] = hlock->acquire_ip;
+#endif
 }
 
 static inline int same_context_xhlock(struct hist_lock *xhlock)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3db9167..90ea784 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1225,6 +1225,21 @@  config LOCKDEP_COMPLETIONS
 	 A deadlock caused by wait_for_completion() and complete() can be
 	 detected by lockdep using crossrelease feature.
 
+config CROSSRELEASE_STACK_TRACE
+	bool "Record more than one entity of stack trace in crossrelease"
+	depends on LOCKDEP_CROSSRELEASE
+	default n
+	help
+	 The lockdep "cross-release" feature needs to record stack traces
+	 (of calling functions) for all acquisitions, for eventual later
+	 use during analysis. By default only a single caller is recorded,
+	 because the unwind operation can be very expensive with deeper
+	 stack chains. However, sometimes deeper traces are required for
+	 full analysis. This option turns on the saving of the full stack
+	 trace entries.
+
+	 If unsure, say N.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP