Message ID | 20120922000537.GH2454@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Paul On Fri, 21 Sep 2012, Paul E. McKenney wrote: > I am wondering if your system somehow figured out how to start a grace > period that had no RCU callbacks waiting for it. If that happened, > then a CONFIG_NO_HZ=y system could in theory get into a state where all > CPUs are in dyntick-idle mode, so that none of them is doing anything > to force the grace period to complete. > > That should be easy to diagnose, anyway. Please see below, which > includes the earlier diagnostic patch. Here you go. - Paul [ 248.902618] INFO: rcu_sched self-detected stall on CPU [ 248.905456] 0: (1 ticks this GP) idle=933/1/0 [ 248.907897] (t=26570 jiffies g=11 c=10 q=0) [ 248.910339] [<c001bc90>] (unwind_backtrace+0x0/0xf0) from [<c00ad800>] (rcu_check_callbacks+0x220/0x714) [ 248.915527] [<c00ad800>] (rcu_check_callbacks+0x220/0x714) from [<c00532a0>] (update_process_times+0x38/0x68) [ 248.920928] [<c00532a0>] (update_process_times+0x38/0x68) from [<c008c9e8>] (tick_sched_timer+0x80/0xec) [ 248.926116] [<c008c9e8>] (tick_sched_timer+0x80/0xec) from [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) [ 248.930999] [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) from [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) [ 248.936035] [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) from [<c001a3cc>] (twd_handler+0x30/0x44) [ 248.940948] [<c001a3cc>] (twd_handler+0x30/0x44) from [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) [ 248.946075] [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) from [<c00a4344>] (generic_handle_irq+0x30/0x48) [ 248.951538] [<c00a4344>] (generic_handle_irq+0x30/0x48) from [<c0014e38>] (handle_IRQ+0x4c/0xac) [ 248.956329] [<c0014e38>] (handle_IRQ+0x4c/0xac) from [<c00084cc>] (gic_handle_irq+0x28/0x5c) [ 248.960937] [<c00084cc>] (gic_handle_irq+0x28/0x5c) from [<c04fb1a4>] (__irq_svc+0x44/0x5c) [ 248.965484] Exception stack(0xc0729f58 to 0xc0729fa0) [ 248.968231] 9f40: 0003b832 00000001 [ 248.972686] 9f60: 00000000 c074a8e8 c0728000 c07c42c8 c05065a0 c074bdc8 00000000 411fc092 [ 248.977142] 9f80: c074bfe8 00000000 00000001 c0729fa0 0003b833 c0015130 20000113 ffffffff [ 248.981597] [<c04fb1a4>] (__irq_svc+0x44/0x5c) from [<c0015130>] (default_idle+0x20/0x44) [ 248.986083] [<c0015130>] (default_idle+0x20/0x44) from [<c001535c>] (cpu_idle+0x9c/0x114) [ 248.990539] [<c001535c>] (cpu_idle+0x9c/0x114) from [<c06d77b0>] (start_kernel+0x2b4/0x304) -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Sep 22, 2012 at 06:16:15PM +0000, Paul Walmsley wrote: > Hi Paul > > On Fri, 21 Sep 2012, Paul E. McKenney wrote: > > > I am wondering if your system somehow figured out how to start a grace > > period that had no RCU callbacks waiting for it. If that happened, > > then a CONFIG_NO_HZ=y system could in theory get into a state where all > > CPUs are in dyntick-idle mode, so that none of them is doing anything > > to force the grace period to complete. > > > > That should be easy to diagnose, anyway. Please see below, which > > includes the earlier diagnostic patch. > > Here you go. > > - Paul > > [ 248.902618] INFO: rcu_sched self-detected stall on CPU > [ 248.905456] 0: (1 ticks this GP) idle=933/1/0 > [ 248.907897] (t=26570 jiffies g=11 c=10 q=0) Bingo!!! (q=0, in case you were wondering. And thank you for testing this!) Strangely enough, I believe that I have inadvertently fixed this in my -rcu tree: git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next Nevertheless, if you get a chance to try it, I would be interested to hear if my guess is correct. The trick is that a kthread drives the grace period in -rcu, regardless of whether or not there are callbacks. However, the backport would not be something that -stable would be happy with, so I will be putting together a fix for mainline. This thing has been in the kernel since about 2004, not sure why you didn't hit it earlier. Thanx, Paul > [ 248.910339] [<c001bc90>] (unwind_backtrace+0x0/0xf0) from [<c00ad800>] (rcu_check_callbacks+0x220/0x714) > [ 248.915527] [<c00ad800>] (rcu_check_callbacks+0x220/0x714) from [<c00532a0>] (update_process_times+0x38/0x68) > [ 248.920928] [<c00532a0>] (update_process_times+0x38/0x68) from [<c008c9e8>] (tick_sched_timer+0x80/0xec) > [ 248.926116] [<c008c9e8>] (tick_sched_timer+0x80/0xec) from [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) > [ 248.930999] [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) from [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) > [ 248.936035] [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) from [<c001a3cc>] (twd_handler+0x30/0x44) > [ 248.940948] [<c001a3cc>] (twd_handler+0x30/0x44) from [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) > [ 248.946075] [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) from [<c00a4344>] (generic_handle_irq+0x30/0x48) > [ 248.951538] [<c00a4344>] (generic_handle_irq+0x30/0x48) from [<c0014e38>] (handle_IRQ+0x4c/0xac) > [ 248.956329] [<c0014e38>] (handle_IRQ+0x4c/0xac) from [<c00084cc>] (gic_handle_irq+0x28/0x5c) > [ 248.960937] [<c00084cc>] (gic_handle_irq+0x28/0x5c) from [<c04fb1a4>] (__irq_svc+0x44/0x5c) > [ 248.965484] Exception stack(0xc0729f58 to 0xc0729fa0) > [ 248.968231] 9f40: 0003b832 00000001 > [ 248.972686] 9f60: 00000000 c074a8e8 c0728000 c07c42c8 c05065a0 c074bdc8 00000000 411fc092 > [ 248.977142] 9f80: c074bfe8 00000000 00000001 c0729fa0 0003b833 c0015130 20000113 ffffffff > [ 248.981597] [<c04fb1a4>] (__irq_svc+0x44/0x5c) from [<c0015130>] (default_idle+0x20/0x44) > [ 248.986083] [<c0015130>] (default_idle+0x20/0x44) from [<c001535c>] (cpu_idle+0x9c/0x114) > [ 248.990539] [<c001535c>] (cpu_idle+0x9c/0x114) from [<c06d77b0>] (start_kernel+0x2b4/0x304) > -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Paul On Sat, 22 Sep 2012, Paul E. McKenney wrote: > Strangely enough, I believe that I have inadvertently fixed this in > my -rcu tree: > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next > > Nevertheless, if you get a chance to try it, I would be interested to > hear if my guess is correct. Yes, good news: the stall warnings go away with that branch. > The trick is that a kthread drives the grace period in -rcu, regardless > of whether or not there are callbacks. This is "rcu: Move quiescent-state forcing into kthread" ? Added some debugging into rcu_gp_kthread() after that commit and can confirm that the quiescent-state forcing loop does start a few times when there are zero callbacks pending (modulo any races in my measurement code). > However, the backport would not be something that -stable would be happy > with, so I will be putting together a fix for mainline. This thing > has been in the kernel since about 2004, not sure why you didn't hit > it earlier. One other data point in that regard - noticed the warnings don't appear when the board is booted with: commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 Author: Paul E. McKenney <paul.mckenney@linaro.org> Date: Tue Jun 5 15:53:53 2012 -0700 rcu: Fix qlen_lazy breakage ... - Paul -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Sep 22, 2012 at 10:20:19PM +0000, Paul Walmsley wrote: > Hi Paul > > On Sat, 22 Sep 2012, Paul E. McKenney wrote: > > > Strangely enough, I believe that I have inadvertently fixed this in > > my -rcu tree: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next > > > > Nevertheless, if you get a chance to try it, I would be interested to > > hear if my guess is correct. > > Yes, good news: the stall warnings go away with that branch. Very good! > > The trick is that a kthread drives the grace period in -rcu, regardless > > of whether or not there are callbacks. > > This is "rcu: Move quiescent-state forcing into kthread" ? Yep, plus the preceding commits moving grace-period initialization and cleanup into that same kthread. This was motivated by a bug report last February complaining about 200-microsecond latency spikes from RCU grace-period initialization. On systems with 4096 CPUs. Real-time response. It is far bigger than I thought. ;-) > Added some debugging into rcu_gp_kthread() after that commit and can > confirm that the quiescent-state forcing loop does start a few times when > there are zero callbacks pending (modulo any races in my measurement > code). Cool, thank you! Assuming it works, that indicates that there is long-term value to the fix for this problem. On larger systems, extra grace periods are not what you want, as their expense increases with the number of CPUs. > > However, the backport would not be something that -stable would be happy > > with, so I will be putting together a fix for mainline. This thing > > has been in the kernel since about 2004, not sure why you didn't hit > > it earlier. > > One other data point in that regard - noticed the warnings don't appear > when the board is booted with: > > commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 > Author: Paul E. McKenney <paul.mckenney@linaro.org> > Date: Tue Jun 5 15:53:53 2012 -0700 > > rcu: Fix qlen_lazy breakage You lost me on this one. This is already in mainline, so if you were using (say) 3.6-rc6, you would already have this commit applied. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 22 Sep 2012, Paul E. McKenney wrote: > On Sat, Sep 22, 2012 at 10:20:19PM +0000, Paul Walmsley wrote: > > On Sat, 22 Sep 2012, Paul E. McKenney wrote: > > > > > This thing has been in the kernel since about 2004, not sure why you > > > didn't hit it earlier. > > > > One other data point in that regard - noticed the warnings don't appear > > when the board is booted with: > > > > commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 > > Author: Paul E. McKenney <paul.mckenney@linaro.org> > > Date: Tue Jun 5 15:53:53 2012 -0700 > > > > rcu: Fix qlen_lazy breakage > > You lost me on this one. This is already in mainline, so if you were > using (say) 3.6-rc6, you would already have this commit applied. If I check out a kernel at this commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 + the zero-callback-in-tickless-idle diagnostic patch, build and boot it, then the stall warnings don't appear (in 25 minutes of testing). Messages from the diagnostic patch indicate that the kernel is entering idle during a grace period with no RCU callbacks, though. This is not a big deal and does not need any further attention. Just wanted to place a time boundary on the point when these messages started appearing. (It is unlikely to be an optimal bound: i.e., there are probably later commits where the warnings also don't appear.) - Paul -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 24, 2012 at 09:54:00PM +0000, Paul Walmsley wrote: > On Sat, 22 Sep 2012, Paul E. McKenney wrote: > > > On Sat, Sep 22, 2012 at 10:20:19PM +0000, Paul Walmsley wrote: > > > On Sat, 22 Sep 2012, Paul E. McKenney wrote: > > > > > > > This thing has been in the kernel since about 2004, not sure why you > > > > didn't hit it earlier. > > > > > > One other data point in that regard - noticed the warnings don't appear > > > when the board is booted with: > > > > > > commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 > > > Author: Paul E. McKenney <paul.mckenney@linaro.org> > > > Date: Tue Jun 5 15:53:53 2012 -0700 > > > > > > rcu: Fix qlen_lazy breakage > > > > You lost me on this one. This is already in mainline, so if you were > > using (say) 3.6-rc6, you would already have this commit applied. > > If I check out a kernel at this commit > 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 + the > zero-callback-in-tickless-idle diagnostic patch, build and boot it, then > the stall warnings don't appear (in 25 minutes of testing). > > Messages from the diagnostic patch indicate that the kernel is entering > idle during a grace period with no RCU callbacks, though. > > This is not a big deal and does not need any further attention. Just > wanted to place a time boundary on the point when these messages started > appearing. (It is unlikely to be an optimal bound: i.e., there are > probably later commits where the warnings also don't appear.) Ah, got it, thank you! Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/kernel/rcutree.c b/kernel/rcutree.c index 307caf1..696f189 100644 --- a/kernel/rcutree.c +++ b/kernel/rcutree.c @@ -879,6 +879,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp) unsigned long flags; int ndetected = 0; struct rcu_node *rnp = rcu_get_root(rsp); + long totqlen = 0; /* Only let one CPU complain about others per time interval. */ @@ -923,8 +924,11 @@ static void print_other_cpu_stall(struct rcu_state *rsp) raw_spin_unlock_irqrestore(&rnp->lock, flags); print_cpu_stall_info_end(); - printk(KERN_CONT "(detected by %d, t=%ld jiffies)\n", - smp_processor_id(), (long)(jiffies - rsp->gp_start)); + for_each_possible_cpu(cpu) + totqlen += per_cpu_ptr(rsp->rda, cpu)->qlen; + pr_cont("(detected by %d, t=%ld jiffies, g=%lu, c=%lu, q=%lu)\n", + smp_processor_id(), (long)(jiffies - rsp->gp_start), + rsp->gpnum, rsp->completed, totqlen); if (ndetected == 0) printk(KERN_ERR "INFO: Stall ended before state dump start\n"); else if (!trigger_all_cpu_backtrace()) @@ -939,8 +943,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp) static void print_cpu_stall(struct rcu_state *rsp) { + int cpu; unsigned long flags; struct rcu_node *rnp = rcu_get_root(rsp); + long totqlen = 0; /* * OK, time to rat on ourselves... @@ -951,7 +957,10 @@ static void print_cpu_stall(struct rcu_state *rsp) print_cpu_stall_info_begin(); print_cpu_stall_info(rsp, smp_processor_id()); print_cpu_stall_info_end(); - printk(KERN_CONT " (t=%lu jiffies)\n", jiffies - rsp->gp_start); + for_each_possible_cpu(cpu) + totqlen += per_cpu_ptr(rsp->rda, cpu)->qlen; + pr_cont(" (t=%lu jiffies g=%lu c=%lu q=%lu)\n", + jiffies - rsp->gp_start, rsp->gpnum, rsp->completed, totqlen); if (!trigger_all_cpu_backtrace()) dump_stack();