diff mbox

[arm,1/1] arm: Use _rcuidle tracepoint to allow use from idle

Message ID 20160426175214.GJ3715@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Paul E. McKenney April 26, 2016, 5:52 p.m. UTC
On Tue, Apr 26, 2016 at 08:00:31AM -0700, Tony Lindgren wrote:
> Hi,
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> [160425 13:49]:
> > Testing on ARM encountered the following pair of lockdep-RCU splats:
> > 
> > ------------------------------------------------------------------------
> > 
> > ===============================
> > [ INFO: suspicious RCU usage. ]
> > 4.6.0-rc4-next-20160422 #1 Not tainted
> > -------------------------------
> > include/trace/events/power.h:328 suspicious rcu_dereference_check() usage!
> > 
> > other info that might help us debug this:
> > 
> > RCU used illegally from idle CPU!
> > rcu_scheduler_active = 1, debug_locks = 0
> > RCU used illegally from extended quiescent state!
> > no locks held by swapper/0/0.
> > 
> > stack backtrace:
> > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc4-next-20160422 #1
> > Hardware name: Generic OMAP3-GP (Flattened Device Tree)
> > [<c010f55c>] (unwind_backtrace) from [<c010b64c>] (show_stack+0x10/0x14)
> > [<c010b64c>] (show_stack) from [<c047acbc>] (dump_stack+0xa8/0xe0)
> > [<c047acbc>] (dump_stack) from [<c012bc10>] (pwrdm_set_next_pwrst+0xf8/0x1cc)
> > [<c012bc10>] (pwrdm_set_next_pwrst) from [<c01269fc>] (omap3_enter_idle_bm+0x1b8/0x1e8)
> > [<c01269fc>] (omap3_enter_idle_bm) from [<c05fa0b8>] (cpuidle_enter_state+0x84/0x408)
> > [<c05fa0b8>] (cpuidle_enter_state) from [<c0182c1c>] (cpu_startup_entry+0x1c8/0x3f0)
> > [<c0182c1c>] (cpu_startup_entry) from [<c0b00c20>] (start_kernel+0x354/0x3cc)
> > 
> > ------------------------------------------------------------------------
> > 
> > [<c010f55c>] (unwind_backtrace) from [<c010b64c>] (show_stack+0x10/0x14)
> > [<c010b64c>] (show_stack) from [<c047ac3c>] (dump_stack+0xa8/0xe0)
> > [<c047ac3c>] (dump_stack) from [<c012c340>] (_pwrdm_state_switch+0x188/0x32c)
> > [<c012c340>] (_pwrdm_state_switch) from [<c012c4f0>] (_pwrdm_post_transition_cb+0xc/0x14)
> > [<c012c4f0>] (_pwrdm_post_transition_cb) from [<c012ba74>] (pwrdm_for_each+0x30/0x5c)
> > [<c012ba74>] (pwrdm_for_each) from [<c012c72c>] (pwrdm_post_transition+0x24/0x30)
> > [<c012c72c>] (pwrdm_post_transition) from [<c012548c>] (omap_sram_idle+0xfc/0x240)
> > [<c012548c>] (omap_sram_idle) from [<c0126934>] (omap3_enter_idle_bm+0xf0/0x1e8)
> > [<c0126934>] (omap3_enter_idle_bm) from [<c05fa038>] (cpuidle_enter_state+0x84/0x408)
> > [<c05fa038>] (cpuidle_enter_state) from [<c0182b90>] (cpu_startup_entry+0x1c8/0x3f0)
> > [<c0182b90>] (cpu_startup_entry) from [<c0b00c20>] (start_kernel+0x354/0x3cc)
> > 
> > ------------------------------------------------------------------------
> > 
> > These are caused by event tracing from the idle loop.  This commit
> > therefore adds the _rcuidle suffix to make RCU aware of this implicit
> > use of RCU by event tracing, thus preventing both splats.
> 
> Maybe also mention which patch in the RCU tree caused these?

Or, more accurately, the patch that removed the false negatives that
suppressed them.  These lockdep-RCU splats were bugs all along, just
not reported.

> I'm still seeing the following warnings with next with your patch applied.

Does the following patch help?

It is quite possible that there are quite a few more of these.  If this
is the case, then one way to make the kernel list more of them on a
given boot is to build with CONFIG_PROVE_RCU_REPEATEDLY=y.

							Thanx, Paul

------------------------------------------------------------------------

commit 99cd56ad86d2d5fbe7ceab303b47f34e4156e5e7
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Tue Apr 26 10:42:25 2016 -0700

    rm: Use more _rcuidle tracepoints to allow use from idle
    
    Further testing with false negatives suppressed by commit 293e2421fe25
    ("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
    identified a few more unprotected uses of RCU from the idle loop.
    Because RCU actively ignores idle-loop code (for energy-efficiency
    reasons, among other things), using RCU from the idle loop can result
    in too-short grace periods, in turn resulting in arbitrary misbehavior.
    
    The resulting lockdep-RCU splats are as follows:
    
    ------------------------------------------------------------------------
    
    ===============================
    [ INFO: suspicious RCU usage. ]
    4.6.0-rc5-next-20160426+ #1112 Not tainted
    -------------------------------
    include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage!
    
    other info that might help us debug this:
    
    RCU used illegally from idle CPU!
    rcu_scheduler_active = 1, debug_locks = 0
    RCU used illegally from extended quiescent state!
    no locks held by swapper/0/0.
    
    stack backtrace:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
    Hardware name: Generic OMAP4 (Flattened Device Tree)
    [<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
    [<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
    [<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188)
    [<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c)
    [<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0 x38/0x9c)
    [<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8)
    [<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390)
    [<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
    [<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
    [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
    
    ------------------------------------------------------------------------
    
    Warning from omap3
    ===============================
    [ INFO: suspicious RCU usage. ]
    4.6.0-rc5-next-20160426+ #1112 Not tainted
    -------------------------------
    include/trace/events/rpm.h:63 suspicious rcu_dereference_check() usage!
    
    other info that might help us debug this:
    
    RCU used illegally from idle CPU!
    rcu_scheduler_active = 1, debug_locks = 0
    RCU used illegally from extended quiescent state!
    1 lock held by swapper/0/0:
     #0:  (&(&dev->power.lock)->rlock){-.-...}, at: [<c052ee24>] __pm_runtime_suspend+0x54/0x84
    
    stack backtrace:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
    Hardware name: Generic OMAP36xx (Flattened Device Tree)
    [<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
    [<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
    [<c047fec8>] (dump_stack) from [<c052d7b4>] (rpm_suspend+0x604/0x7e4)
    [<c052d7b4>] (rpm_suspend) from [<c052ee34>] (__pm_runtime_suspend+0x64/0x84)
    [<c052ee34>] (__pm_runtime_suspend) from [<c04bf3bc>] (omap2_gpio_prepare_for_idle+0x5c/0x70)
    [<c04bf3bc>] (omap2_gpio_prepare_for_idle) from [<c01255e8>] (omap_sram_idle+0x140/0x244)
    [<c01255e8>] (omap_sram_idle) from [<c0126b48>] (omap3_enter_idle_bm+0xfc/0x1ec)
    [<c0126b48>] (omap3_enter_idle_bm) from [<c0601db8>] (cpuidle_enter_state+0x80/0x3d4)
    [<c0601db8>] (cpuidle_enter_state) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
    [<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
    [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
    
    ------------------------------------------------------------------------
    
    Reported-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Tony Lindgren <tony@atomide.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: <linux-omap@vger.kernel.org>
    Cc: <linux-arm-kernel@lists.infradead.org>

Comments

Tony Lindgren April 26, 2016, 6:29 p.m. UTC | #1
* Paul E. McKenney <paulmck@linux.vnet.ibm.com> [160426 10:53]:
> Does the following patch help?

It just changes the output.. See below.

> It is quite possible that there are quite a few more of these.  If this
> is the case, then one way to make the kernel list more of them on a
> given boot is to build with CONFIG_PROVE_RCU_REPEATEDLY=y.

OK

Regards,

Tony

8< ------------------
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000


RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/1/0.

stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.6.0-rc5-next-20160426+ #1113
Hardware name: Generic OMAP4 (Flattened Device Tree)
[<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047ff88>] (dump_stack+0xb0/0xe4)
[<c047ff88>] (dump_stack) from [<c012c014>] (pwrdm_set_next_pwrst+0x100/0x1d4)
[<c012c014>] (pwrdm_set_next_pwrst) from [<c0126120>] (omap4_enter_lowpower+0xc8/0x230)
[<c0126120>] (omap4_enter_lowpower) from [<c0126c24>] (omap_enter_idle_coupled+0x6c/0x254)
[<c0126c24>] (omap_enter_idle_coupled) from [<c0601dfc>] (cpuidle_enter_state+0x80/0x3d4)
[<c0601dfc>] (cpuidle_enter_state) from [<c0603d30>] (cpuidle_enter_state_coupled+0x348/0x390)
[<c0603d30>] (cpuidle_enter_state_coupled) from [<c0183d34>] (cpu_startup_entry+0x198/0x3a0)
[<c0183d34>] (cpu_startup_entry) from [<8010162c>] (0x8010162c)
hw-breakpoint: Failed to enable monitor mode on CPU 0.
Paul E. McKenney April 26, 2016, 7:07 p.m. UTC | #2
On Tue, Apr 26, 2016 at 11:29:39AM -0700, Tony Lindgren wrote:
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> [160426 10:53]:
> > Does the following patch help?
> 
> It just changes the output.. See below.

Ah -- you need both patches.

These are on -rcu at:

501889db6fee (arm: Use _rcuidle tracepoint to allow use from idle)
99cd56ad86d2 (rm: Use more _rcuidle tracepoints to allow use from idle)

And -rcu may be found at:

	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

							Thanx, Paul

> > It is quite possible that there are quite a few more of these.  If this
> > is the case, then one way to make the kernel list more of them on a
> > given boot is to build with CONFIG_PROVE_RCU_REPEATEDLY=y.
> 
> OK
> 
> Regards,
> 
> Tony
> 
> 8< ------------------
> CPU: Testing write buffer coherency: ok
> CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
> 
> 
> RCU used illegally from idle CPU!
> rcu_scheduler_active = 1, debug_locks = 0
> RCU used illegally from extended quiescent state!
> no locks held by swapper/1/0.
> 
> stack backtrace:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.6.0-rc5-next-20160426+ #1113
> Hardware name: Generic OMAP4 (Flattened Device Tree)
> [<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
> [<c010c3a8>] (show_stack) from [<c047ff88>] (dump_stack+0xb0/0xe4)
> [<c047ff88>] (dump_stack) from [<c012c014>] (pwrdm_set_next_pwrst+0x100/0x1d4)
> [<c012c014>] (pwrdm_set_next_pwrst) from [<c0126120>] (omap4_enter_lowpower+0xc8/0x230)
> [<c0126120>] (omap4_enter_lowpower) from [<c0126c24>] (omap_enter_idle_coupled+0x6c/0x254)
> [<c0126c24>] (omap_enter_idle_coupled) from [<c0601dfc>] (cpuidle_enter_state+0x80/0x3d4)
> [<c0601dfc>] (cpuidle_enter_state) from [<c0603d30>] (cpuidle_enter_state_coupled+0x348/0x390)
> [<c0603d30>] (cpuidle_enter_state_coupled) from [<c0183d34>] (cpu_startup_entry+0x198/0x3a0)
> [<c0183d34>] (cpu_startup_entry) from [<8010162c>] (0x8010162c)
> hw-breakpoint: Failed to enable monitor mode on CPU 0.
>
Tony Lindgren April 26, 2016, 7:39 p.m. UTC | #3
* Paul E. McKenney <paulmck@linux.vnet.ibm.com> [160426 12:08]:
> On Tue, Apr 26, 2016 at 11:29:39AM -0700, Tony Lindgren wrote:
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> [160426 10:53]:
> > > Does the following patch help?
> > 
> > It just changes the output.. See below.
> 
> Ah -- you need both patches.

Oops sorry, I though I had the first one already from next..

> These are on -rcu at:
> 
> 501889db6fee (arm: Use _rcuidle tracepoint to allow use from idle)
> 99cd56ad86d2 (rm: Use more _rcuidle tracepoints to allow use from idle)

With these omap4 is now OK. But omap3 produces yet another warning,
See below :)

Regards,

Tony

8< ------------------
===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1114 Not tainted
-------------------------------
include/trace/events/clk.h:59 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by swapper/0/0:
 #0:  (&oh->hwmod_key#30){......}, at: [<c0121b40>] omap_hwmod_idle+0x18/0x44
 #1:  (enable_lock){......}, at: [<c0630998>] clk_enable_lock+0x18/0x124

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1114
Hardware name: Generic OMAP36xx (Flattened Device Tree)
[<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fd68>] (dump_stack+0xb0/0xe4)
[<c047fd68>] (dump_stack) from [<c0631618>] (clk_core_disable+0x17c/0x348)
[<c0631618>] (clk_core_disable) from [<c0632774>] (clk_disable+0x24/0x30)
[<c0632774>] (clk_disable) from [<c0120590>] (_disable_clocks+0x18/0x7c)
[<c0120590>] (_disable_clocks) from [<c0121680>] (_idle+0x12c/0x230)
[<c0121680>] (_idle) from [<c0121b4c>] (omap_hwmod_idle+0x24/0x44)
[<c0121b4c>] (omap_hwmod_idle) from [<c0122c24>] (omap_device_idle+0x3c/0x90)
[<c0122c24>] (omap_device_idle) from [<c052cc00>] (__rpm_callback+0x2c/0x60)
[<c052cc00>] (__rpm_callback) from [<c052cc54>] (rpm_callback+0x20/0x80)
[<c052cc54>] (rpm_callback) from [<c052d150>] (rpm_suspend+0x100/0x768)
[<c052d150>] (rpm_suspend) from [<c052ec58>] (__pm_runtime_suspend+0x64/0x84)
[<c052ec58>] (__pm_runtime_suspend) from [<c04bf25c>] (omap2_gpio_prepare_for_idle+0x5
c/0x70)
[<c04bf25c>] (omap2_gpio_prepare_for_idle) from [<c0125568>] (omap_sram_idle+0x140/0x2
44)
[<c0125568>] (omap_sram_idle) from [<c01269dc>] (omap3_enter_idle_bm+0xfc/0x1ec)
[<c01269dc>] (omap3_enter_idle_bm) from [<c0601bdc>] (cpuidle_enter_state+0x80/0x3d4)
[<c0601bdc>] (cpuidle_enter_state) from [<c0183b08>] (cpu_startup_entry+0x198/0x3a0)
[<c0183b08>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
diff mbox

Patch

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index baee70267f29..7afe48ae5d76 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -486,7 +486,7 @@  static const char *ipi_types[NR_IPI] __tracepoint_string = {
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
 {
-	trace_ipi_raise(target, ipi_types[ipinr]);
+	trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
 	__smp_cross_call(target, ipinr);
 }
 
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 4c7055009bd6..d9358c202763 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -419,7 +419,7 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 	struct device *parent = NULL;
 	int retval;
 
-	trace_rpm_suspend(dev, rpmflags);
+	trace_rpm_suspend_rcuidle(dev, rpmflags);
 
  repeat:
 	retval = rpm_check_suspend_allowed(dev);