From patchwork Wed Aug 16 15:31:47 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 9903959
Date: Thu, 17 Aug 2017 01:31:47 +1000
From: Nicholas Piggin
To: "Paul E. McKenney"
Subject: Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one
 else seeing this?
Message-ID: <20170817013147.586e0eff@roar.ozlabs.ibm.com>
In-Reply-To: <20170816125617.GY7017@linux.vnet.ibm.com>
References: <20170728165529.GF3730@linux.vnet.ibm.com>
 <20170728182053.000072aa@huawei.com>
 <20170728190349.GM3730@linux.vnet.ibm.com>
 <20170731120847.00003d5c@huawei.com>
 <20170731150411.GA3730@linux.vnet.ibm.com>
 <20170731162757.000058ba@huawei.com>
 <20170801184646.GE3730@linux.vnet.ibm.com>
 <20170802172555.0000468a@huawei.com>
 <20170815154743.GK7017@linux.vnet.ibm.com>
 <87wp63smwn.fsf@concordia.ellerman.id.au>
 <20170816125617.GY7017@linux.vnet.ibm.com>
Organization: IBM
Cc: dzickus@redhat.com, sfr@canb.auug.org.au, Michael Ellerman,
 "Rafael J. Wysocki", linuxarm@huawei.com, David Miller,
 abdhalee@linux.vnet.ibm.com, Jonathan Cameron,
 sparclinux@vger.kernel.org, tglx@linutronix.de,
 linuxppc-dev@lists.ozlabs.org, akpm@linux-foundation.org,
 linux-arm-kernel@lists.infradead.org

On Wed, 16 Aug 2017 05:56:17 -0700
"Paul E. McKenney" wrote:

> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney" writes:
> > ...
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > >     EXP: Trace tick return from tick_nohz_stop_sched_tick
> > >
> > >     Signed-off-by: Paul E. McKenney
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
> > >  	 * (not only the tick).
> > >  	 */
> > >  	ts->sleep_length = ktime_sub(dev->next_event, now);
> > > +	trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) / 1000);
> > >  	return tick;
> > >  }
> >
> > Should I be seeing negative values? A small sample:
>
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
>
> >   <idle>-0  [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >   <idle>-0  [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0  [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0  [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >   <idle>-0  [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >   <idle>-0  [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >   <idle>-0  [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >   <idle>-0  [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0  [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0  [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >
> >
> > I have a full trace, I'll send it to you off-list.
>
> I will take a look!

I found this, I can't see that it would cause our symptoms, but it's
worth someone who knows the code taking a look at it.

---
cpuidle: fix broadcast control when broadcast can not be entered

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases, but otherwise does not appear to cause problems.

Signed-off-by: Nicholas Piggin
---
 drivers/cpuidle/cpuidle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 60bb64f4329d..4453e27f855e 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 			return -EBUSY;
 		}
 		target_state = &drv->states[index];
+		broadcast = false;
 	}

 	/* Take note of the planned idle state. */