From patchwork Mon May 18 09:03:37 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michal Hocko X-Patchwork-Id: 6426941 Return-Path: X-Original-To: patchwork-linux-pm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 605509F1C1 for ; Mon, 18 May 2015 09:03:44 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4E5E12061F for ; Mon, 18 May 2015 09:03:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 30ABF203B6 for ; Mon, 18 May 2015 09:03:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752171AbbERJDl (ORCPT ); Mon, 18 May 2015 05:03:41 -0400 Received: from mail-wg0-f43.google.com ([74.125.82.43]:35125 "EHLO mail-wg0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751207AbbERJDk (ORCPT ); Mon, 18 May 2015 05:03:40 -0400 Received: by wgfl8 with SMTP id l8so27823426wgf.2; Mon, 18 May 2015 02:03:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=fU58Jeoqoyp3SxU7n0n2hrnKE7VYl6IUBTkGkDfMTX0=; b=I8HhkeoQ6/hudfbou+2ymO0eUZzBQUmf3cjvKabICx7srtrnZG3YHm1AxUzjmFuo/x DsfpyUuhDpCIJJ9995l4wOOHDQUKZqHeUcqZgFmEmxdPTa1rcAccmiGomeE5IuSpIk8c CXg24VT0XnZm6EMNKoGYFTO7X5oqGguqiw99LUxfSKAL5LUn/j+iz3snRvAEaT0SWQiR PKocjgw0LrhhZwsSQXRGFFddk3rGdQwt4qbTqIO80+wvG5XtrWquHGuEk5x5xAfOAsG/ BuLCX7Wv9LnsBTufiidRNL/pzgZp+H7oan7IQXyR6HPcI1UKPPS+QjAaonB2U2zCw+IZ iqfw== X-Received: by 10.194.103.232 with SMTP id fz8mr40490698wjb.130.1431939818696; Mon, 18 May 2015 02:03:38 -0700 (PDT) Received: from localhost (ip-86-49-65-8.net.upcbroadband.cz. [86.49.65.8]) by mx.google.com with ESMTPSA id g8sm15853797wjn.18.2015.05.18.02.03.37 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 18 May 2015 02:03:37 -0700 (PDT) Date: Mon, 18 May 2015 11:03:37 +0200 From: Michal Hocko To: Peter Zijlstra Cc: Linus Torvalds , Stephane Eranian , Ulrich Obergfell , Don Zickus , Ingo Molnar , Andrew Morton , "Rafael J. Wysocki" , Kevin Hilman , Ulf Hansson , "linux-pm@vger.kernel.org" , LKML Subject: Re: suspend regression in 4.1-rc1 Message-ID: <20150518090336.GA6393@dhcp22.suse.cz> References: <20150517185041.GA5897@dhcp22.suse.cz> <20150518073046.GO17717@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20150518073046.GO17717@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Mon 18-05-15 09:30:46, Peter Zijlstra wrote: > On Sun, May 17, 2015 at 09:33:56PM -0700, Linus Torvalds wrote: > > On Sun, May 17, 2015 at 11:50 AM, Michal Hocko wrote: > > > > > > The merge commit is empty and both 80dcc31fbe55 and e4b0db72be24 work > > > properly but the merge is bad. So it seems like some of the commits in > > > either branch has a side effect which needs other branch in order to > > > reproduce. > > > > > > So've tried to bisect ^80dcc31fbe55 e4b0db72be24 and merged 80dcc31fbe55 > > > in each step. > > > > Good extra work! Thanks. > > > > > This lead to: > > > > > > commit 195daf665a6299de98a4da3843fed2dd9de19d3a > > > Author: Ulrich Obergfell > > > Date: Tue Apr 14 15:44:13 2015 -0700 > > > > > > watchdog: enable the new user interface of the watchdog mechanism > > > > > > The patch doesn't revert because of follow up changes so I have reverted > > > all three: > > > 692297d8f968 ("watchdog: introduce the hardlockup_detector_disable() function") > > > b2f57c3a0df9 ("watchdog: clean up some function names and arguments") > > > 195daf665a62 ("watchdog: enable the new user interface of the watchdog mechanism") > > > > Hmm. I guess we should just revert those three then. Unless somebody > > can see what the subtle interaction is. > > > > Actually, looking closer, on the *other* side of the merge, the only > > commit that looks like it might be conflicting is > > > > b3738d293233 "watchdog: Add watchdog enable/disable all functions" > > > > which is then used by > > > > b37609c30e41 "perf/x86/intel: Make the HT bug workaround > > conditional on HT enabled" > > > > Does the problem go away if you revert *those* two commits instead? > > > > At least that would tell is what the exact bad interaction is. > > > > Adding Stephane (author of those watchdog/perf patches) to the Cc. And > > PeterZ, who signed them off (Ingo also did, but was already on the > > participants list). > > > > Anybody see it? > > The 'obvious' discrepancy is that 195daf665a62 ("watchdog: enable the > new user interface of the watchdog mechanism") changes the semantics of > watchdog_user_enabled, which thereafter is only used by the functions > introduced by b3738d293233 ("watchdog: Add watchdog enable/disable all > functions"). Yeah, this is it! b3738d293233 was definitely in the range I was testing when merging 195daf665 into e95e7f627062..80dcc31fbe55. I must have screwed something. > There further appears to be a distinct lack of serialization between > setting and using watchdog_enabled, so perhaps we should wrap the > {en,dis}able_all() things in watchdog_proc_mutex. > > Let me go see if I can reproduce / test this.. as is the below is > entirely untested. This doesn't hang anymore. I've just had to move the mutex definition up to make it compile. So feel free to add my Reported-and-tested-by: Michal Hocko Thanks! > > --- > kernel/watchdog.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 2316f50b07a4..56aeedb087e3 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -608,19 +608,25 @@ void watchdog_nmi_enable_all(void) > { > int cpu; > > - if (!watchdog_user_enabled) > + mutex_lock(&watchdog_proc_mutex); > + > + if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED)) > return; > > get_online_cpus(); > for_each_online_cpu(cpu) > watchdog_nmi_enable(cpu); > put_online_cpus(); > + > + mutex_unlock(&watchdog_proc_mutex); > } > > void watchdog_nmi_disable_all(void) > { > int cpu; > > + mutex_lock(&watchdog_proc_mutex); > + > if (!watchdog_running) > return; > > @@ -628,6 +634,8 @@ void watchdog_nmi_disable_all(void) > for_each_online_cpu(cpu) > watchdog_nmi_disable(cpu); > put_online_cpus(); > + > + mutex_unlock(&watchdog_proc_mutex); > } > #else > static int watchdog_nmi_enable(unsigned int cpu) { return 0; } diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 56aeedb087e3..c398596c35b8 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -604,6 +604,8 @@ static void watchdog_nmi_disable(unsigned int cpu) } } +static DEFINE_MUTEX(watchdog_proc_mutex); + void watchdog_nmi_enable_all(void) { int cpu; @@ -752,8 +754,6 @@ static int proc_watchdog_update(void) } -static DEFINE_MUTEX(watchdog_proc_mutex); - /* * common function for watchdog, nmi_watchdog and soft_watchdog parameter *