From patchwork Wed Mar 26 03:56:48 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: preeti
X-Patchwork-Id: 3891901
Subject: [PATCH] tick, broadcast: Prevent false alarm when force mask
 contains offline cpus
To: tglx@linutronix.de, mingo@redhat.com, linux-kernel@vger.kernel.org
From: Preeti U Murthy
Cc: linux-pm@vger.kernel.org, peterz@infradead.org, rjw@rjwysocki.net,
 srivatsa.bhat@linux.vnet.ibm.com, paulmck@linux.vnet.ibm.com,
 davem@davemloft.net
Date: Wed, 26 Mar 2014 09:26:48 +0530
Message-ID: <20140326035648.21736.85740.stgit@preeti.in.ibm.com>
User-Agent: StGit/0.16-38-g167d
X-Mailing-List: linux-pm@vger.kernel.org

It is possible that tick_broadcast_force_mask contains cpus which are not
in cpu_online_mask when a broadcast tick occurs.
This could happen under the following circumstance, assuming CPU1 is among
the CPUs waiting for broadcast.

CPU0                                    CPU1

Run CPU_DOWN_PREPARE notifiers

Start stop_machine                      Gets woken up by IPI to run
                                        stop_machine, sets itself in
                                        tick_broadcast_force_mask if the
                                        time of broadcast interrupt is
                                        around the same time as this IPI.

                                        Start stop_machine
                                          set_cpu_online(cpu1, false)
End stop_machine                        End stop_machine

Broadcast interrupt
  Finds that cpu1 in
  tick_broadcast_force_mask is offline
  and triggers the WARN_ON in
  tick_handle_oneshot_broadcast()

                                        Clears all broadcast masks
                                        in CPU_DEAD stage.

This WARN_ON was added to capture scenarios where a broadcast mask, be it
the oneshot, pending or force mask, contains offline cpus whose tick
devices have been removed. But here is a case where we trigger the WARN_ON
in a valid scenario.

One could argue that the scenario is invalid and ought to be warned
against, because ideally the cpus about to go offline should be cleared
from the broadcast masks before they are cleared from the online mask, so
that we don't hit these scenarios. This would mean clearing the masks in
the CPU_DOWN_PREPARE stage. But it is quite possible that this stage itself
will fail and cpu hotplug will not go through. We would then end up in a
situation where the cpu has not gone offline and continues to wait for the
broadcast interrupt like before. However, it has been cleared from the
broadcast masks, so this interrupt will never be delivered to it. Hence
clearing of the masks is best deferred until we are sure that the cpu is
dead, i.e. until the CPU_DEAD stage.

Hence simply ensure that tick_broadcast_force_mask is a subset of the
online cpus, to take care of rare occurrences such as the one above.
Moreover, this is not the harmful scenario in which a cpu is in the mask
but its tick device has been shut down. The WARN_ON will then continue to
capture cases where we could possibly cause a kernel crash.

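The race and the fix can be modelled outside the kernel. What follows is
only a minimal user-space sketch, not the kernel code: cpu_mask_t,
struct broadcast_state and the three helpers are hypothetical names
invented for illustration, a 32-bit word stands in for a cpumask, and the
WARN_ON is assumed to fire whenever the set of broadcast targets contains
an offline cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a cpumask: bit N represents CPU N. */
typedef uint32_t cpu_mask_t;

struct broadcast_state {
	cpu_mask_t online;	/* models cpu_online_mask */
	cpu_mask_t force;	/* models tick_broadcast_force_mask */
};

static cpu_mask_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 32);
	return (cpu_mask_t)1 << cpu;
}

/* A cpu woken close to the broadcast expiry marks itself in the force mask. */
static void force_mask_set(struct broadcast_state *s, unsigned int cpu)
{
	s->force |= cpu_bit(cpu);
}

/* Models set_cpu_online(cpu, false) running inside stop_machine. */
static void cpu_set_offline(struct broadcast_state *s, unsigned int cpu)
{
	s->online &= ~cpu_bit(cpu);
}

/*
 * Models only the force-mask step of the broadcast handler. Returns the
 * cpus that would be targeted and reports via *would_warn whether an
 * offline cpu is among them (the condition assumed to trip the WARN_ON).
 */
static cpu_mask_t broadcast_targets(struct broadcast_state *s,
				    bool filter_offline, bool *would_warn)
{
	cpu_mask_t tmpmask = 0;

	if (filter_offline)
		s->force &= s->online;	/* the added cpumask_and() */

	tmpmask |= s->force;		/* cpumask_or() */
	s->force = 0;			/* cpumask_clear() */

	*would_warn = (tmpmask & ~s->online) != 0;
	return tmpmask;
}
```

Replaying the timeline with CPU0 and CPU1 online (force_mask_set() for
CPU1, then cpu_set_offline() for CPU1, then broadcast_targets()) reports a
would-be warning when filter_offline is false and none when it is true,
because the offline bit is dropped before the masks are merged.
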
Signed-off-by: Preeti U Murthy
---
 kernel/time/tick-broadcast.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 63c7b2d..30b8731 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -606,7 +606,12 @@ again:
 	 */
 	cpumask_clear_cpu(smp_processor_id(), tick_broadcast_pending_mask);
 
-	/* Take care of enforced broadcast requests */
+	/* Take care of enforced broadcast requests. We could have offline
+	 * cpus in the tick_broadcast_force_mask. That's ok, we got the interrupt
+	 * before we could clear the mask.
+	 */
+	cpumask_and(tick_broadcast_force_mask,
+		    tick_broadcast_force_mask, cpu_online_mask);
 	cpumask_or(tmpmask, tmpmask, tick_broadcast_force_mask);
 	cpumask_clear(tick_broadcast_force_mask);
 