Message ID: 20120922215923.GA13161@linux.vnet.ibm.com (mailing list archive)
State: New, archived
On Sat, 22 Sep 2012, Paul E. McKenney wrote:

> And here is a patch.  I am still having trouble reproducing the problem,
> but figured that I should avoid serializing things.

Thanks, testing this now on v3.6-rc6.  One question though about the patch
description:

> All this begs the question of exactly how a callback-free grace period
> gets started in the first place.  This can happen due to the fact that
> CPUs do not necessarily agree on which grace period is in progress.
> If a CPU still believes that the grace period that just completed is
> still ongoing, it will believe that it has callbacks that need to wait
> for another grace period, never mind the fact that the grace period
> that they were waiting for just completed.  This CPU can therefore
> erroneously decide to start a new grace period.

Doesn't this imply that this bug would only affect multi-CPU systems?

The recent tests here have been on Pandaboard, which is dual-CPU, but my
recollection is that I also observed the warnings on a single-core
Beagleboard.  Will re-test.

- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Sep 22, 2012 at 10:25:59PM +0000, Paul Walmsley wrote:
> On Sat, 22 Sep 2012, Paul E. McKenney wrote:
>
> > And here is a patch.  I am still having trouble reproducing the problem,
> > but figured that I should avoid serializing things.
>
> Thanks, testing this now on v3.6-rc6.

Very cool, thank you!

> One question though about the patch description:
>
> > All this begs the question of exactly how a callback-free grace period
> > gets started in the first place.  This can happen due to the fact that
> > CPUs do not necessarily agree on which grace period is in progress.
> > If a CPU still believes that the grace period that just completed is
> > still ongoing, it will believe that it has callbacks that need to wait
> > for another grace period, never mind the fact that the grace period
> > that they were waiting for just completed.  This CPU can therefore
> > erroneously decide to start a new grace period.
>
> Doesn't this imply that this bug would only affect multi-CPU systems?

Surprisingly not, at least when running TREE_RCU or TREE_PREEMPT_RCU.

In order to keep lock contention down to a dull roar on larger systems,
TREE_RCU keeps three sets of books: (1) the global state in the rcu_state
structure, (2) the combining-tree per-node state in the rcu_node
structures, and (3) the per-CPU state in the rcu_data structures.  A CPU
is not officially aware of the end of a grace period until that end is
reflected in its rcu_data structure.  This has the perhaps-surprising
consequence that the CPU that detected the end of the old grace period
might start a new one before becoming officially aware that the old one
ended.

Why not have the CPU inform itself immediately upon noticing that the
old grace period ended?  Deadlock.  The rcu_node locks must be acquired
from leaf towards root, and the CPU is holding the root rcu_node lock
when it notices that the grace period has ended.

I have made this a bit less problematic in the bigrt branch, working
towards a goal of getting RCU into a state where automatic formal
validation might one day be possible.  And yes, I am starting to get
some formal-validation people interested in this lofty goal; see for
example: http://sites.google.com/site/popl13grace/paper.pdf

> The recent tests here have been on Pandaboard, which is dual-CPU, but my
> recollection is that I also observed the warnings on a single-core
> Beagleboard.  Will re-test.

Anxiously awaiting the results.  This has been a strange one, even by
RCU's standards.  Plus I need to add a few Reported-by lines.  Next
version...

							Thanx, Paul
On Sat, 22 Sep 2012, Paul E. McKenney wrote:

> On Sat, Sep 22, 2012 at 10:25:59PM +0000, Paul Walmsley wrote:
>
> > The recent tests here have been on Pandaboard, which is dual-CPU, but my
> > recollection is that I also observed the warnings on a single-core
> > Beagleboard.  Will re-test.
>
> Anxiously awaiting the results.

The same problem exists on BeagleBoard XM (OMAP3730, single-core
Cortex-A8):

http://www.pwsan.com/omap/transcripts/20120922-beaglexm-rcu-stall-debug-pre-fix.txt

and the same patch fixes it:

http://www.pwsan.com/omap/transcripts/20120922-beaglexm-rcu-stall-debug-post-fix.txt

Please feel free to update my Tested-by:, if you wish.

Tested-by: Paul Walmsley <paul@pwsan.com> # OMAP3730, OMAP4430

- Paul
On Sun, Sep 23, 2012 at 07:55:50AM +0000, Paul Walmsley wrote:
> On Sat, 22 Sep 2012, Paul E. McKenney wrote:
>
> > Anxiously awaiting the results.
>
> The same problem exists on BeagleBoard XM (OMAP3730, single-core
> Cortex-A8):
>
> http://www.pwsan.com/omap/transcripts/20120922-beaglexm-rcu-stall-debug-pre-fix.txt
>
> and the same patch fixes it:
>
> http://www.pwsan.com/omap/transcripts/20120922-beaglexm-rcu-stall-debug-post-fix.txt
>
> Please feel free to update my Tested-by:, if you wish.
>
> Tested-by: Paul Walmsley <paul@pwsan.com> # OMAP3730, OMAP4430

Very good, thank you very much!!!

							Thanx, Paul
On Sun, Sep 23, 2012 at 3:29 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Sat, Sep 22, 2012 at 01:10:43PM -0700, Paul E. McKenney wrote:
>> On Sat, Sep 22, 2012 at 06:42:08PM +0000, Paul Walmsley wrote:
>> > On Fri, 21 Sep 2012, Paul E. McKenney wrote:

[...]

> And here is a patch.  I am still having trouble reproducing the problem,
> but figured that I should avoid serializing things.
>
>							Thanx, Paul
>
> ------------------------------------------------------------------------
>
>  b/kernel/rcutree.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> rcu: Fix day-one dyntick-idle stall-warning bug
>
> Each grace period is supposed to have at least one callback waiting
> for that grace period to complete.  However, if CONFIG_NO_HZ=n, an
> extra callback-free grace period is no big problem -- it will chew up
> a tiny bit of CPU time, but it will complete normally.  In contrast,
> CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
> sleep indefinitely, in turn indefinitely delaying completion of the
> callback-free grace period.  Given that nothing is waiting on this grace
> period, this is also not a problem.
>
> Unless RCU CPU stall warnings are also enabled, as they are in recent
> kernels.  In this case, if a CPU wakes up after at least one minute
> of inactivity, an RCU CPU stall warning will result.  The reason that
> no one noticed until quite recently is that most systems have enough
> OS noise that they will never remain absolutely idle for a full minute.
> But there are some embedded systems with cut-down userspace
> configurations that get into this mode quite easily.
>
> All this begs the question of exactly how a callback-free grace period
> gets started in the first place.  This can happen due to the fact that
> CPUs do not necessarily agree on which grace period is in progress.
> If a CPU still believes that the grace period that just completed is
> still ongoing, it will believe that it has callbacks that need to wait
> for another grace period, never mind the fact that the grace period
> that they were waiting for just completed.  This CPU can therefore
> erroneously decide to start a new grace period.
>
> Once this CPU notices that the earlier grace period completed, it will
> invoke its callbacks.  It then won't have any callbacks left.  If no
> other CPU has any callbacks, we now have a callback-free grace period.
>
> This commit therefore makes CPUs check more carefully before starting a
> new grace period.  This new check relies on an array of tail pointers
> into each CPU's list of callbacks.  If the CPU is up to date on which
> grace periods have completed, it checks to see if any callbacks follow
> the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
> follow the RCU_WAIT_TAIL segment.  The reason that this works is that
> the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
> as soon as the CPU figures out that the old grace period has ended.
>
> This change is to cpu_needs_another_gp(), which is called in a number
> of places.  The only one that really matters is in rcu_start_gp(), where
> the root rcu_node structure's ->lock is held, which prevents any
> other CPU from starting or completing a grace period, so that the
> comparison that determines whether the CPU is missing the completion
> of a grace period is stable.
>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

As already confirmed by Paul W and others, I too no longer see the RCU
dumps with the above patch.  Thanks a lot for the fix.

Regards
Santosh
On Mon, Sep 24, 2012 at 03:11:34PM +0530, Shilimkar, Santosh wrote:
> On Sun, Sep 23, 2012 at 3:29 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Sat, Sep 22, 2012 at 01:10:43PM -0700, Paul E. McKenney wrote:
> >> On Sat, Sep 22, 2012 at 06:42:08PM +0000, Paul Walmsley wrote:
> >> > On Fri, 21 Sep 2012, Paul E. McKenney wrote:

[...]

> > And here is a patch.  I am still having trouble reproducing the problem,
> > but figured that I should avoid serializing things.
> >
> > rcu: Fix day-one dyntick-idle stall-warning bug

[...]

> As already confirmed by Paul W and others, I too no longer see the RCU
> dumps any more with the above patch.  Thanks a lot for the fix.

Glad it finally works!

							Thanx, Paul
On Sat, Sep 22, 2012 at 11:59 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> rcu: Fix day-one dyntick-idle stall-warning bug
As mentioned in another thread this solves the same problem for ux500.
Reported/Tested-by: Linus Walleij <linus.walleij@linaro.org>
But now it appears that this commit didn't make it into v3.6 so
it definitely needs to be tagged with Cc: stable@kernel.org
before it gets merged since the stall warnings are kinda scary.
Yours,
Linus Walleij
On Mon, Oct 01, 2012 at 10:55:11AM +0200, Linus Walleij wrote:
> On Sat, Sep 22, 2012 at 11:59 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
>
> > rcu: Fix day-one dyntick-idle stall-warning bug
>
> As mentioned in another thread this solves the same problem for ux500.
> Reported/Tested-by: Linus Walleij <linus.walleij@linaro.org>
>
> But now it appears that this commit didn't make it into v3.6 so
> it definitely needs to be tagged with Cc: stable@kernel.org
> before it gets merged since the stall warnings are kinda scary.

Ingo submitted this to Linus Torvalds earlier today, so we should be
able to send it to stable shortly.

							Thanx, Paul
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..f7bcd9e 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -305,7 +305,9 @@ cpu_has_callbacks_ready_to_invoke(struct rcu_data *rdp)
 static int
 cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
 {
-	return *rdp->nxttail[RCU_DONE_TAIL] && !rcu_gp_in_progress(rsp);
+	return *rdp->nxttail[RCU_DONE_TAIL +
+			     ACCESS_ONCE(rsp->completed) != rdp->completed] &&
+	       !rcu_gp_in_progress(rsp);
 }
 
 /*