From patchwork Wed Oct 22 18:57:12 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 5136371 Return-Path: X-Original-To: patchwork-linux-pm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 3543DC11AC for ; Wed, 22 Oct 2014 18:57:23 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 3B80320260 for ; Wed, 22 Oct 2014 18:57:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BA2F52025A for ; Wed, 22 Oct 2014 18:57:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754183AbaJVS5T (ORCPT ); Wed, 22 Oct 2014 14:57:19 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:60194 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752591AbaJVS5S (ORCPT ); Wed, 22 Oct 2014 14:57:18 -0400 Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 22 Oct 2014 12:57:17 -0600 Received: from d03dlp02.boulder.ibm.com (9.17.202.178) by e35.co.us.ibm.com (192.168.1.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Wed, 22 Oct 2014 12:57:15 -0600 Received: from b03cxnp08027.gho.boulder.ibm.com (b03cxnp08027.gho.boulder.ibm.com [9.17.130.19]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id B3AB53E4003B; Wed, 22 Oct 2014 12:57:14 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s9MIvEJk55509078; Wed, 22 Oct 2014 20:57:14 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s9MJ1ucC014950; Wed, 22 Oct 2014 13:01:57 -0600 Received: from paulmck-ThinkPad-W500 ([9.70.82.148]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id s9MJ1u6k014933; Wed, 22 Oct 2014 13:01:56 -0600 Received: by paulmck-ThinkPad-W500 (Postfix, from userid 1000) id BA5E3382273; Wed, 22 Oct 2014 11:57:12 -0700 (PDT) Date: Wed, 22 Oct 2014 11:57:12 -0700 From: "Paul E. McKenney" To: Jiri Kosina Cc: Peter Zijlstra , Ingo Molnar , "Rafael J. Wysocki" , Pavel Machek , Steven Rostedt , Dave Jones , Daniel Lezcano , Nicolas Pitre , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org Subject: Re: lockdep splat in CPU hotplug Message-ID: <20141022185712.GA9570@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20141022143837.GW4977@linux.vnet.ibm.com> <20141022165433.GA22874@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20141022165433.GA22874@linux.vnet.ibm.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14102218-0013-0000-0000-000005B96C0E Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Spam-Status: No, score=-8.3 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Wed, Oct 22, 2014 at 09:54:33AM -0700, Paul E. McKenney wrote: > On Wed, Oct 22, 2014 at 07:38:37AM -0700, Paul E. McKenney wrote: > > On Wed, Oct 22, 2014 at 11:53:49AM +0200, Jiri Kosina wrote: > > > On Tue, 21 Oct 2014, Jiri Kosina wrote: > > > > > > > Hi, > > > > > > > > I am seeing the lockdep report below when resuming from suspend-to-disk > > > > with current Linus' tree (c2661b80609). > > > > > > > > The reason for CCing Ingo and Peter is that I can't make any sense of one > > > > of the stacktraces lockdep is providing. > > > > > > > > Please have a look at the very first stacktrace in the dump, where lockdep > > > > is trying to explain where cpu_hotplug.lock#2 has been acquired. It seems > > > > to imply that cpuidle_pause() is taking cpu_hotplug.lock, but that's not > > > > the case at all. > > > > > > > > What am I missing? > > > > > > Okay, reverting 442bf3aaf55a ("sched: Let the scheduler see CPU idle > > > states") and followup 83a0a96a5f26 ("sched/fair: Leverage the idle state > > > info when choosing the "idlest" cpu") which depends on it makes the splat > > > go away. > > > > > > Just for the sake of testing the hypothesis, I did just the minimal change > > > below on top of current Linus' tree, and it also makes the splat go away > > > (of course it's totally incorrect thing to do by itself alone): > > > > > > diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c > > > index 125150d..d31e04c 100644 > > > --- a/drivers/cpuidle/cpuidle.c > > > +++ b/drivers/cpuidle/cpuidle.c > > > @@ -225,12 +225,6 @@ void cpuidle_uninstall_idle_handler(void) > > > initialized = 0; > > > wake_up_all_idle_cpus(); > > > } > > > - > > > - /* > > > - * Make sure external observers (such as the scheduler) > > > - * are done looking at pointed idle states. > > > - */ > > > - synchronize_rcu(); > > > } > > > > > > /** > > > > > > So indeed 442bf3aaf55a is guilty. > > > > > > Paul was stating yesterday that it can't be the try_get_online_cpus() in > > > synchronize_sched_expedited(), as it's doing only trylock. There are > > > however more places where synchronize_sched_expedited() is acquiring > > > cpu_hotplug.lock unconditionally by calling put_online_cpus(), so the race > > > seems real. > > > > Gah! So I only half-eliminated the deadlock between > > synchronize_sched_expedited() and CPU hotplug. Back to the drawing > > board... > > Please see below for an untested alleged fix. And that patch had a lockdep issue. The following replacement patch passes light rcutorture testing, but your mileage may vary. Thanx, Paul ------------------------------------------------------------------------ rcu: More on deadlock between CPU hotplug and expedited grace periods Commit dd56af42bd82 (rcu: Eliminate deadlock between CPU hotplug and expedited grace periods) was incomplete. Although it did eliminate deadlocks involving synchronize_sched_expedited()'s acquisition of cpu_hotplug.lock via get_online_cpus(), it did nothing about the similar deadlock involving acquisition of this same lock via put_online_cpus(). This deadlock became apparent with testing involving hibernation. This commit therefore changes put_online_cpus() acquisition of this lock to be conditional, and increments a new cpu_hotplug.puts_pending field in case of acquisition failure. Then cpu_hotplug_begin() checks for this new field being non-zero, and applies any changes to cpu_hotplug.refcount. Reported by: Jiri Kosina Signed-off-by: Paul E. McKenney Tested-by: Borislav Petkov --- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/kernel/cpu.c b/kernel/cpu.c index 356450f09c1f..90a3d017b90c 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -64,6 +64,8 @@ static struct { * an ongoing cpu hotplug operation. */ int refcount; + /* And allows lockless put_online_cpus(). */ + atomic_t puts_pending; #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; @@ -113,7 +115,11 @@ void put_online_cpus(void) { if (cpu_hotplug.active_writer == current) return; - mutex_lock(&cpu_hotplug.lock); + if (!mutex_trylock(&cpu_hotplug.lock)) { + atomic_inc(&cpu_hotplug.puts_pending); + cpuhp_lock_release(); + return; + } if (WARN_ON(!cpu_hotplug.refcount)) cpu_hotplug.refcount++; /* try to fix things up */ @@ -155,6 +161,12 @@ void cpu_hotplug_begin(void) cpuhp_lock_acquire(); for (;;) { mutex_lock(&cpu_hotplug.lock); + if (atomic_read(&cpu_hotplug.puts_pending)) { + int delta; + + delta = atomic_xchg(&cpu_hotplug.puts_pending, 0); + cpu_hotplug.refcount -= delta; + } if (likely(!cpu_hotplug.refcount)) break; __set_current_state(TASK_UNINTERRUPTIBLE);