From patchwork Fri Jan 11 21:51:48 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Colin Cross X-Patchwork-Id: 1967551 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) by patchwork1.kernel.org (Postfix) with ESMTP id 28D5B3FE37 for ; Fri, 11 Jan 2013 21:55:33 +0000 (UTC) Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1TtmW8-0001tW-Fe; Fri, 11 Jan 2013 21:52:00 +0000 Received: from mail-qa0-f74.google.com ([209.85.216.74]) by merlin.infradead.org with esmtps (Exim 4.76 #1 (Red Hat Linux)) id 1TtmW4-0001t4-Ta for linux-arm-kernel@lists.infradead.org; Fri, 11 Jan 2013 21:51:58 +0000 Received: by mail-qa0-f74.google.com with SMTP id r4so33963qaq.1 for ; Fri, 11 Jan 2013 13:51:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:cc:subject:date:message-id:x-mailer :x-gm-message-state; bh=V8vI1DdCIieaj7K7XCpGNoyZc57+7dFYe/LyCxMMGSk=; b=ITUjmn5mrbCSbil4lRlIn76Ooiape2WN2Gcsnn+3Ae7ZYlGiYX1ZQ3RfBUXEez3L41 kJeWh3+kRWKInYgUn1uBu3KP8XuZ5y8N7pU0cJg1bxUvTe+eQ3UajGEWUkBo4dEy5Oyx rjh493o3QQHrxHXP4MBbl5qYRPIrWilIqH/QWMYCMfWvfFgbA6D61O970nzNiXQgt5Nf pAat3XPHpO4zoq94I2M5jGhu+eoX8Mt1m0mrZFsmmmqGraWgvqgPCQeMKu1LYLCujRNN rxolhWmqZOH6ljkZNT3ADVEKS4YnFMn2hj35U0E+9HuNasWjzfFfXrNVcJXDzhg5ymH2 k21Q== X-Received: by 10.236.124.238 with SMTP id x74mr37550863yhh.13.1357941114420; Fri, 11 Jan 2013 13:51:54 -0800 (PST) Received: from wpzn3.hot.corp.google.com (216-239-44-65.google.com [216.239.44.65]) by gmr-mx.google.com with ESMTPS id s58si305316yhi.6.2013.01.11.13.51.54 (version=TLSv1 cipher=AES128-SHA bits=128/128); Fri, 11 Jan 2013 13:51:54 -0800 (PST) Received: from walnut.mtv.corp.google.com (walnut.mtv.corp.google.com [172.18.105.48]) by wpzn3.hot.corp.google.com (Postfix) with ESMTP id 28753100047; Fri, 11 Jan 2013 13:51:54 -0800 (PST) Received: by walnut.mtv.corp.google.com (Postfix, from userid 99897) id C1309160826; Fri, 11 Jan 2013 13:51:53 -0800 (PST) From: Colin Cross To: linux-kernel@vger.kernel.org Subject: [PATCH v2] hardlockup: detect hard lockups without NMIs using secondary cpus Date: Fri, 11 Jan 2013 13:51:48 -0800 Message-Id: <1357941108-14138-1-git-send-email-ccross@android.com> X-Mailer: git-send-email 1.7.7.3 X-Gm-Message-State: ALoCoQmhR4ny0ZqgNeNs84g+kNexfK7wAIbILjXqfP2vmhNrUhjk5nuV024RANPYgLBk41hGfQfCP9DjrIaW5rdu5yfKiORdvEMYk2WDpaNbgXZs+lFg13Sb+vE+cZT/MSuWrOYT0ZJHtRhLoN5+OJht/mTvajnZcEsw6wfGW/1x9jkenWhHd+c6tn8sze3z4ZAie78BMB1D6AkU7NvZuFs//J/P+GgNgQ== X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20130111_165157_130416_055A4BB2 X-CRM114-Status: GOOD ( 27.22 ) X-Spam-Score: -0.3 (/) X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary: Content analysis details: (-0.3 points) pts rule name description ---- ---------------------- -------------------------------------------------- 3.0 KHOP_BIG_TO_CC Sent to 10+ recipients instaed of Bcc or a list -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.216.74 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -0.7 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] Cc: Don Zickus , Russell King - ARM Linux , Tony Lindgren , Frederic Weisbecker , Ingo Molnar , Colin Cross , Andrew Morton , liu chuansheng , Thomas Gleixner , linux-arm-kernel@lists.infradead.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: linux-arm-kernel-bounces@lists.infradead.org Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org Emulate NMIs on systems where they are not available by using timer interrupts on other cpus. Each cpu will use its softlockup hrtimer to check that the next cpu is processing hrtimer interrupts by verifying that a counter is increasing. This patch is useful on systems where the hardlockup detector is not available due to a lack of NMIs, for example most ARM SoCs. Without this patch any cpu stuck with interrupts disabled can cause a hardware watchdog reset with no debugging information, but with this patch the kernel can detect the lockup and panic, which can result in useful debugging info. Signed-off-by: Colin Cross --- include/linux/nmi.h | 5 ++- kernel/watchdog.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++-- lib/Kconfig.debug | 14 +++++- 3 files changed, 135 insertions(+), 7 deletions(-) Changes since v1: renamed variables to clarify when referring to the next cpu factored out function to get next cpu fixed int vs unsigned int mismatches fixed race condition when offlining cpus diff --git a/include/linux/nmi.h b/include/linux/nmi.h index db50840..c8f8aa0 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -14,8 +14,11 @@ * may be used to reset the timeout - for code which intentionally * disables interrupts for a long time. This call is stateless. */ -#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR) +#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR_NMI) #include +#endif + +#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR) extern void touch_nmi_watchdog(void); #else static inline void touch_nmi_watchdog(void) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 75a2ab3..61a0595 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -44,6 +44,11 @@ static DEFINE_PER_CPU(bool, hard_watchdog_warn); static DEFINE_PER_CPU(bool, watchdog_nmi_touch); static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); +#endif +#ifdef CONFIG_HARDLOCKUP_DETECTOR_OTHER_CPU +static cpumask_t __read_mostly watchdog_cpus; +#endif +#ifdef CONFIG_HARDLOCKUP_DETECTOR_NMI static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); #endif @@ -179,7 +184,7 @@ void touch_softlockup_watchdog_sync(void) __raw_get_cpu_var(watchdog_touch_ts) = 0; } -#ifdef CONFIG_HARDLOCKUP_DETECTOR +#ifdef CONFIG_HARDLOCKUP_DETECTOR_NMI /* watchdog detector functions */ static int is_hardlockup(void) { @@ -193,6 +198,76 @@ static int is_hardlockup(void) } #endif +#ifdef CONFIG_HARDLOCKUP_DETECTOR_OTHER_CPU +static unsigned int watchdog_next_cpu(unsigned int cpu) +{ + cpumask_t cpus = watchdog_cpus; + unsigned int next_cpu; + + next_cpu = cpumask_next(cpu, &cpus); + if (next_cpu >= nr_cpu_ids) + next_cpu = cpumask_first(&cpus); + + if (next_cpu == cpu) + return nr_cpu_ids; + + return next_cpu; +} + +static int is_hardlockup_other_cpu(unsigned int cpu) +{ + unsigned long hrint = per_cpu(hrtimer_interrupts, cpu); + + if (per_cpu(hrtimer_interrupts_saved, cpu) == hrint) + return 1; + + per_cpu(hrtimer_interrupts_saved, cpu) = hrint; + return 0; +} + +static void watchdog_check_hardlockup_other_cpu(void) +{ + unsigned int next_cpu; + + /* + * Test for hardlockups every 3 samples. The sample period is + * watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over + * watchdog_thresh (over by 20%). + */ + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0) + return; + + /* check for a hardlockup on the next cpu */ + next_cpu = watchdog_next_cpu(smp_processor_id()); + if (next_cpu >= nr_cpu_ids) + return; + + smp_rmb(); + + if (per_cpu(watchdog_nmi_touch, next_cpu) == true) { + per_cpu(watchdog_nmi_touch, next_cpu) = false; + return; + } + + if (is_hardlockup_other_cpu(next_cpu)) { + /* only warn once */ + if (per_cpu(hard_watchdog_warn, next_cpu) == true) + return; + + if (hardlockup_panic) + panic("Watchdog detected hard LOCKUP on cpu %u", next_cpu); + else + WARN(1, "Watchdog detected hard LOCKUP on cpu %u", next_cpu); + + per_cpu(hard_watchdog_warn, next_cpu) = true; + } else { + per_cpu(hard_watchdog_warn, next_cpu) = false; + } +} +#else +static inline void watchdog_check_hardlockup_other_cpu(void) { return; } +#endif + static int is_softlockup(unsigned long touch_ts) { unsigned long now = get_timestamp(smp_processor_id()); @@ -204,7 +279,7 @@ static int is_softlockup(unsigned long touch_ts) return 0; } -#ifdef CONFIG_HARDLOCKUP_DETECTOR +#ifdef CONFIG_HARDLOCKUP_DETECTOR_NMI static struct perf_event_attr wd_hw_attr = { .type = PERF_TYPE_HARDWARE, @@ -252,7 +327,7 @@ static void watchdog_overflow_callback(struct perf_event *event, __this_cpu_write(hard_watchdog_warn, false); return; } -#endif /* CONFIG_HARDLOCKUP_DETECTOR */ +#endif /* CONFIG_HARDLOCKUP_DETECTOR_NMI */ static void watchdog_interrupt_count(void) { @@ -272,6 +347,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) /* kick the hardlockup detector */ watchdog_interrupt_count(); + /* test for hardlockups on the next cpu */ + watchdog_check_hardlockup_other_cpu(); + /* kick the softlockup detector */ wake_up_process(__this_cpu_read(softlockup_watchdog)); @@ -396,7 +474,7 @@ static void watchdog(unsigned int cpu) __touch_watchdog(); } -#ifdef CONFIG_HARDLOCKUP_DETECTOR +#ifdef CONFIG_HARDLOCKUP_DETECTOR_NMI /* * People like the simple clean cpu node info on boot. * Reduce the watchdog noise by only printing messages @@ -472,9 +550,44 @@ static void watchdog_nmi_disable(unsigned int cpu) return; } #else +#ifdef CONFIG_HARDLOCKUP_DETECTOR_OTHER_CPU +static int watchdog_nmi_enable(unsigned int cpu) +{ + /* + * The new cpu will be marked online before the first hrtimer interrupt + * runs on it. If another cpu tests for a hardlockup on the new cpu + * before it has run its first hrtimer, it will get a false positive. + * Touch the watchdog on the new cpu to delay the first check for at + * least 3 sampling periods to guarantee one hrtimer has run on the new + * cpu. + */ + per_cpu(watchdog_nmi_touch, cpu) = true; + smp_wmb(); + cpumask_set_cpu(cpu, &watchdog_cpus); + return 0; +} + +static void watchdog_nmi_disable(unsigned int cpu) +{ + unsigned int next_cpu = watchdog_next_cpu(cpu); + + /* + * Offlining this cpu will cause the cpu before this one to start + * checking the one after this one. If this cpu just finished checking + * the next cpu and updating hrtimer_interrupts_saved, and then the + * previous cpu checks it within one sample period, it will trigger a + * false positive. Touch the watchdog on the next cpu to prevent it. + */ + if (next_cpu < nr_cpu_ids) + per_cpu(watchdog_nmi_touch, next_cpu) = true; + smp_wmb(); + cpumask_clear_cpu(cpu, &watchdog_cpus); +} +#else static int watchdog_nmi_enable(unsigned int cpu) { return 0; } static void watchdog_nmi_disable(unsigned int cpu) { return; } -#endif /* CONFIG_HARDLOCKUP_DETECTOR */ +#endif /* CONFIG_HARDLOCKUP_DETECTOR_OTHER_CPU */ +#endif /* CONFIG_HARDLOCKUP_DETECTOR_NMI */ /* prepare/enable/disable routines */ /* sysctl functions */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index aaf8baf..f7c4859 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -191,15 +191,27 @@ config LOCKUP_DETECTOR The overhead should be minimal. A periodic hrtimer runs to generate interrupts and kick the watchdog task every 4 seconds. An NMI is generated every 10 seconds or so to check for hardlockups. + If NMIs are not available on the platform, every 12 seconds the + hrtimer interrupt on one cpu will be used to check for hardlockups + on the next cpu. The frequency of hrtimer and NMI events and the soft and hard lockup thresholds can be controlled through the sysctl watchdog_thresh. -config HARDLOCKUP_DETECTOR +config HARDLOCKUP_DETECTOR_NMI def_bool y depends on LOCKUP_DETECTOR && !HAVE_NMI_WATCHDOG depends on PERF_EVENTS && HAVE_PERF_EVENTS_NMI +config HARDLOCKUP_DETECTOR_OTHER_CPU + def_bool y + depends on LOCKUP_DETECTOR && SMP + depends on !HARDLOCKUP_DETECTOR_NMI && !HAVE_NMI_WATCHDOG + +config HARDLOCKUP_DETECTOR + def_bool y + depends on HARDLOCKUP_DETECTOR_NMI || HARDLOCKUP_DETECTOR_OTHER_CPU + config BOOTPARAM_HARDLOCKUP_PANIC bool "Panic (Reboot) On Hard Lockups" depends on HARDLOCKUP_DETECTOR