Message ID | 20141022185712.GA9570@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Wed, 22 Oct 2014, Paul E. McKenney wrote: > rcu: More on deadlock between CPU hotplug and expedited grace periods > > Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and > expedited grace periods) was incomplete. Although it did eliminate > deadlocks involving synchronize_sched_expedited()'s acquisition of > cpu_hotplug.lock via get_online_cpus(), it did nothing about the similar > deadlock involving acquisition of this same lock via put_online_cpus(). > This deadlock became apparent with testing involving hibernation. > > This commit therefore changes put_online_cpus() acquisition of this lock > to be conditional, and increments a new cpu_hotplug.puts_pending field > in case of acquisition failure. Then cpu_hotplug_begin() checks for this > new field being non-zero, and applies any changes to cpu_hotplug.refcount. > Yes, this works. FWIW, please feel free to add Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz> once merging it. Why lockdep produced such an incomplete stacktrace still remains unexplained. Thanks,
On Wed, Oct 22, 2014 at 10:57:25PM +0200, Jiri Kosina wrote: > On Wed, 22 Oct 2014, Paul E. McKenney wrote: > > > rcu: More on deadlock between CPU hotplug and expedited grace periods > > > > Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and > > expedited grace periods) was incomplete. Although it did eliminate > > deadlocks involving synchronize_sched_expedited()'s acquisition of > > cpu_hotplug.lock via get_online_cpus(), it did nothing about the similar > > deadlock involving acquisition of this same lock via put_online_cpus(). > > This deadlock became apparent with testing involving hibernation. > > > > This commit therefore changes put_online_cpus() acquisition of this lock > > to be conditional, and increments a new cpu_hotplug.puts_pending field > > in case of acquisition failure. Then cpu_hotplug_begin() checks for this > > new field being non-zero, and applies any changes to cpu_hotplug.refcount. > > > > Yes, this works. FWIW, please feel free to add > > Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz> > > once merging it. Done, and thank you for both the bug report and the testing! > Why lockdep produced such an incomplete stacktrace still remains > unexplained. On that, I must defer to people more familiar with stack frames. Thanx, Paul > Thanks, > > -- > Jiri Kosina > SUSE Labs > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 22, 2014 at 02:09:43PM -0700, Paul E. McKenney wrote: > > Yes, this works. FWIW, please feel free to add > > > > Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz> > > > > once merging it. > > Done, and thank you for both the bug report and the testing! Works here too. Tested-by: Borislav Petkov <bp@suse.de> Thanks.
On Thu, Oct 23, 2014 at 10:11:25AM +0200, Borislav Petkov wrote: > On Wed, Oct 22, 2014 at 02:09:43PM -0700, Paul E. McKenney wrote: > > > Yes, this works. FWIW, please feel free to add > > > > > > Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz> > > > > > > once merging it. > > > > Done, and thank you for both the bug report and the testing! > > Works here too. > > Tested-by: Borislav Petkov <bp@suse.de> Thank you as well, recorded! Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/kernel/cpu.c b/kernel/cpu.c index 356450f09c1f..90a3d017b90c 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -64,6 +64,8 @@ static struct { * an ongoing cpu hotplug operation. */ int refcount; + /* And allows lockless put_online_cpus(). */ + atomic_t puts_pending; #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; @@ -113,7 +115,11 @@ void put_online_cpus(void) { if (cpu_hotplug.active_writer == current) return; - mutex_lock(&cpu_hotplug.lock); + if (!mutex_trylock(&cpu_hotplug.lock)) { + atomic_inc(&cpu_hotplug.puts_pending); + cpuhp_lock_release(); + return; + } if (WARN_ON(!cpu_hotplug.refcount)) cpu_hotplug.refcount++; /* try to fix things up */ @@ -155,6 +161,12 @@ void cpu_hotplug_begin(void) cpuhp_lock_acquire(); for (;;) { mutex_lock(&cpu_hotplug.lock); + if (atomic_read(&cpu_hotplug.puts_pending)) { + int delta; + + delta = atomic_xchg(&cpu_hotplug.puts_pending, 0); + cpu_hotplug.refcount -= delta; + } if (likely(!cpu_hotplug.refcount)) break; __set_current_state(TASK_UNINTERRUPTIBLE);