From patchwork Thu Dec 19 12:18:54 2019
X-Patchwork-Submitter: Cristian Marussi
X-Patchwork-Id: 11303377
From: Cristian Marussi
To: linux-kernel@vger.kernel.org
Subject: [RFC PATCH v3 01/12] smp: add generic SMP-stop support to common code
Date: Thu, 19 Dec 2019 12:18:54 +0000
Message-Id: <20191219121905.26905-2-cristian.marussi@arm.com>
In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com>
References: <20191219121905.26905-1-cristian.marussi@arm.com>
Cc: 
 linux-arch@vger.kernel.org, mark.rutland@arm.com,
 sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com,
 peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org,
 linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org,
 mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com,
 tglx@linutronix.de, will@kernel.org, davem@davemloft.net,
 linux-arm-kernel@lists.infradead.org

Much of the SMP stop logic was duplicated across architectures, and some
of the duplicated code was faulty in the same way on several of them.
While fixing that logic, move as much of it as possible into common code.

Collect the logic common to SMP stop operations in the common SMP code.
An architecture that wants to use this centralized logic selects
CONFIG_ARCH_USE_COMMON_SMP_STOP=y and provides the related arch-specific
helpers; it then transparently starts using the common smp_send_stop()
implementation.

Architectures that do not want the common SMP stop logic simply leave
CONFIG_ARCH_USE_COMMON_SMP_STOP unset and keep using their own
arch-specific smp_send_stop() as before.

Suggested-by: Dave Martin
Signed-off-by: Cristian Marussi
---
v2 --> v3
- reviewed wait/timeout helpers avoiding broken shared static globals
v1 --> v2
- moved related Kconfig to common code inside arch/Kconfig
- introduced additional CONFIG_USE_COMMON_SMP_STOP selected by
  CONFIG_ARCH_USE_COMMON_SMP_STOP
- introduced helpers to let architectures optionally alter the default
  common code behaviour while waiting for CPUs: change timeout or wait
  forever.
  (will be needed by x86)
---
 arch/Kconfig        |  7 ++++
 include/linux/smp.h | 39 +++++++++++++++++++
 kernel/smp.c        | 91 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 137 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 48b5e103bdb0..99550754e2ea 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -169,6 +169,13 @@ config ARCH_USE_BUILTIN_BSWAP
 	  instructions should set this. And it shouldn't hurt to set it
 	  on architectures that don't have such instructions.
 
+config ARCH_USE_COMMON_SMP_STOP
+	def_bool n
+
+config USE_COMMON_SMP_STOP
+	depends on SMP && ARCH_USE_COMMON_SMP_STOP
+	def_bool y
+
 config KRETPROBES
 	def_bool y
 	depends on KPROBES && HAVE_KRETPROBES
diff --git a/include/linux/smp.h b/include/linux/smp.h
index 6fc856c9eda5..22a0adeb7350 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -77,6 +77,45 @@ int smp_call_function_single_async(int cpu, call_single_data_t *csd);
  */
 extern void smp_send_stop(void);
 
+#ifdef CONFIG_USE_COMMON_SMP_STOP
+/*
+ * An Architecture can optionally use this helper to change the default
+ * waiting behaviour of common STOP logic.
+ */
+void smp_stop_set_wait_forever(bool wait_forever);
+
+/*
+ * An Architecture can optionally use this helper to change the default
+ * waiting timeout of common STOP logic.
+ * A ZERO timeout means no waiting at all as long as waiting forever was
+ * not also previously set.
+ *
+ * Note that, in this way, waiting forever and timeout period settings
+ * remain disjoint.
+ */
+void smp_stop_set_wait_timeout_us(unsigned long timeout);
+
+/*
+ * Retrieve the current SMP STOP wait settings.
+ * Returns true if waiting forever is set.
+ */
+bool smp_stop_get_wait_settings(unsigned long *timeout);
+
+/*
+ * Any Architecture willing to use STOP common logic implementation
+ * MUST at least provide the arch_smp_stop_call() helper which implements
+ * the arch-specific CPU-stop mechanism.
+ */
+extern void arch_smp_stop_call(cpumask_t *cpus);
+
+/*
+ * An Architecture CAN also provide the arch_smp_cpus_stop_complete()
+ * dedicated helper, to perform any final arch-specific operation on
+ * the local CPU once the other CPUs have been successfully stopped.
+ */
+void arch_smp_cpus_stop_complete(void);
+#endif
+
 /*
  * sends a 'reschedule' event to another CPU:
  */
diff --git a/kernel/smp.c b/kernel/smp.c
index 7dbcb402c2fc..de29cd94a948 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #include "smpboot.h"
 
@@ -817,3 +818,93 @@ int smp_call_on_cpu(unsigned int cpu, int (*func)(void *), void *par, bool phys)
 	return sscs.ret;
 }
 EXPORT_SYMBOL_GPL(smp_call_on_cpu);
+
+#ifdef CONFIG_USE_COMMON_SMP_STOP
+static atomic_t smp_stop_wait_forever;
+static atomic_t smp_stop_wait_timeout = ATOMIC_INIT(USEC_PER_SEC);
+
+void smp_stop_set_wait_forever(bool wait_forever)
+{
+	atomic_set(&smp_stop_wait_forever, wait_forever);
+	/* ensure wait atomic-op is visible */
+	smp_mb__after_atomic();
+}
+
+void smp_stop_set_wait_timeout_us(unsigned long timeout)
+{
+	atomic_set(&smp_stop_wait_timeout, timeout);
+	/* ensure timeout atomic-op is visible */
+	smp_mb__after_atomic();
+}
+
+bool smp_stop_get_wait_settings(unsigned long *timeout)
+{
+	if (timeout)
+		*timeout = atomic_read(&smp_stop_wait_timeout);
+	return atomic_read(&smp_stop_wait_forever);
+}
+
+void __weak arch_smp_cpus_stop_complete(void) { }
+
+static inline bool any_other_cpus_online(cpumask_t *mask,
+					 unsigned int this_cpu_id)
+{
+	cpumask_copy(mask, cpu_online_mask);
+	cpumask_clear_cpu(this_cpu_id, mask);
+
+	return !cpumask_empty(mask);
+}
+
+/*
+ * This centralizes the common logic to:
+ *
+ * - evaluate which CPUs are online and needs to be notified for stop,
+ *   while considering properly the status of the calling CPU
+ *
+ * - call the arch-specific helpers to request the effective stop
+ *
+ * - wait for the stop operation to be completed across all involved CPUs
+ *   monitoring the cpu_online_mask
+ */
+void smp_send_stop(void)
+{
+	unsigned int this_cpu_id;
+	cpumask_t mask;
+
+	this_cpu_id = smp_processor_id();
+	if (any_other_cpus_online(&mask, this_cpu_id)) {
+		bool wait_forever;
+		unsigned long timeout;
+		unsigned int this_cpu_online = cpu_online(this_cpu_id);
+
+		if (system_state <= SYSTEM_RUNNING)
+			pr_crit("stopping secondary CPUs\n");
+		arch_smp_stop_call(&mask);
+
+		/*
+		 * Defaults to wait up to one second for other CPUs to stop;
+		 * architectures can modify the default timeout or request
+		 * to wait forever.
+		 *
+		 * Here we rely simply on cpu_online_mask to sync with
+		 * arch-specific stop code without bloating the code with an
+		 * additional atomic_t ad-hoc counter.
+		 *
+		 * As a consequence we'll need proper explicit memory barriers
+		 * in case the other CPUs running the arch-specific stop-code
+		 * would need to commit to memory some data (like saved_regs).
+		 */
+		wait_forever = smp_stop_get_wait_settings(&timeout);
+		while (num_online_cpus() > this_cpu_online &&
+		       (wait_forever || timeout--))
+			udelay(1);
+		/* ensure any stopping-CPUs memory access is made visible */
+		smp_rmb();
+		if (num_online_cpus() > this_cpu_online)
+			pr_warn("failed to stop secondary CPUs %*pbl\n",
+				cpumask_pr_args(cpu_online_mask));
+	}
+	/* Perform final (possibly arch-specific) work on this CPU */
+	arch_smp_cpus_stop_complete();
+}
+#endif

From patchwork Thu Dec 19 12:18:55 2019
X-Patchwork-Submitter: Cristian Marussi
X-Patchwork-Id: 11303379
From: Cristian Marussi
To: linux-kernel@vger.kernel.org
Subject: [RFC PATCH v3 02/12] smp: unify crash_ and smp_send_stop() logic
Date: Thu, 19 Dec 2019 12:18:55 +0000
Message-Id: <20191219121905.26905-3-cristian.marussi@arm.com>
In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com>
References: <20191219121905.26905-1-cristian.marussi@arm.com>
Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com,
 sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com,
 peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org,
 linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org,
 mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com,
 tglx@linutronix.de, will@kernel.org, davem@davemloft.net,
 linux-arm-kernel@lists.infradead.org

crash_smp_send_stop() logic was fairly similar to
smp_send_stop(): a lot of logic and code was duplicated between the two
code paths and across a few different architectures.

Unify this common logic in the existing common SMP stop code: use a
single workhorse function for both paths to perform the common tasks,
while propagating to the underlying architecture code the intent of the
stop operation: a simple stop or a crash dump stop.

Relocate the __weak crash_smp_send_stop() function from panic.c to smp.c,
since it is the entry point of the crash stop process and now calls into
this new common logic (only when the latter is enabled by
ARCH_USE_COMMON_SMP_STOP=y).

Introduce a few more helpers so that architectures using the common code
can provide the arch-specific bits that handle the differences between a
stop and a crash stop; an architecture can still override the common
logic as a whole by providing its own crash_smp_send_stop(), as before.

Also provide a new common code method to query the outcome of an ongoing
crash_stop procedure: smp_crash_stop_failed().

Signed-off-by: Cristian Marussi
---
v2 --> v3
- removed inlining on __smp_send_stop_all()
- renamed __smp_send_stop_all() parameter
v1 --> v2
- using new CONFIG_USE_COMMON_SMP_STOP
- added arch_smp_cpus_crash_complete()
---
 include/linux/smp.h | 34 +++++++++++++++++++++++
 kernel/panic.c      | 26 ------------------
 kernel/smp.c        | 67 ++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 97 insertions(+), 30 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 22a0adeb7350..1886e49a65bb 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -114,8 +114,36 @@ extern void arch_smp_stop_call(cpumask_t *cpus);
  * the local CPU once the other CPUs have been successfully stopped.
  */
 void arch_smp_cpus_stop_complete(void);
+
+/*
+ * An Architecture CAN also provide the arch_smp_cpus_crash_complete()
+ * dedicated helper, to perform any final arch-specific operation on
+ * the local CPU once the other CPUs have been successfully crash stopped.
+ * When not overridden by the user, this defaults to call straight away
+ * arch_smp_cpus_stop_complete()
+ */
+void arch_smp_cpus_crash_complete(void);
+
+/*
+ * An Architecture CAN additionally provide the arch_smp_crash_call()
+ * helper which implements the arch-specific crash dump related operations.
+ *
+ * If such arch wants to fully support crash dump, this MUST be provided;
+ * when not provided the crash dump procedure will fallback to behave like
+ * a normal stop. (no saved regs, no arch-specific features disabled)
+ */
+extern void arch_smp_crash_call(cpumask_t *cpus);
+
+/* Helper to query the outcome of an ongoing crash_stop operation */
+bool smp_crash_stop_failed(void);
 #endif
 
+/*
+ * stops all CPUs but the current one propagating to all other CPUs
+ * the information that a crash_kexec is ongoing:
+ */
+void crash_smp_send_stop(void);
+
 /*
  * sends a 'reschedule' event to another CPU:
  */
@@ -179,6 +207,12 @@ static inline int get_boot_cpu_id(void)
 
 static inline void smp_send_stop(void) { }
+static inline void crash_smp_send_stop(void) { }
+
+#ifdef CONFIG_USE_COMMON_SMP_STOP
+static inline bool smp_crash_stop_failed(void) { return false; }
+#endif
+
 /*
  * These macros fold the SMP functionality into a single CPU system
  */
diff --git a/kernel/panic.c b/kernel/panic.c
index b69ee9e76cb2..3965af34ac38 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -87,32 +87,6 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)
 	panic_smp_self_stop();
 }
 
-/*
- * Stop other CPUs in panic. Architecture dependent code may override this
- * with more suitable version. For example, if the architecture supports
- * crash dump, it should save registers of each stopped CPU and disable
- * per-CPU features such as virtualization extensions.
- */
-void __weak crash_smp_send_stop(void)
-{
-	static int cpus_stopped;
-
-	/*
-	 * This function can be called twice in panic path, but obviously
-	 * we execute this only once.
-	 */
-	if (cpus_stopped)
-		return;
-
-	/*
-	 * Note smp_send_stop is the usual smp shutdown function, which
-	 * unfortunately means it may not be hardened to work in a panic
-	 * situation.
-	 */
-	smp_send_stop();
-	cpus_stopped = 1;
-}
-
 atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
 
 /*
diff --git a/kernel/smp.c b/kernel/smp.c
index de29cd94a948..6224b0b1208b 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -846,6 +846,11 @@ bool smp_stop_get_wait_settings(unsigned long *timeout)
 
 void __weak arch_smp_cpus_stop_complete(void) { }
 
+void __weak arch_smp_cpus_crash_complete(void)
+{
+	arch_smp_cpus_stop_complete();
+}
+
 static inline bool any_other_cpus_online(cpumask_t *mask,
 					 unsigned int this_cpu_id)
 {
@@ -855,6 +860,12 @@ static inline bool any_other_cpus_online(cpumask_t *mask,
 	return !cpumask_empty(mask);
 }
 
+void __weak arch_smp_crash_call(cpumask_t *cpus)
+{
+	pr_debug("SMP: Using generic %s() as SMP crash call.\n", __func__);
+	arch_smp_stop_call(cpus);
+}
+
 /*
  * This centralizes the common logic to:
  *
@@ -866,7 +877,7 @@ static inline bool any_other_cpus_online(cpumask_t *mask,
  * - wait for the stop operation to be completed across all involved CPUs
  *   monitoring the cpu_online_mask
  */
-void smp_send_stop(void)
+static void __smp_send_stop_all(bool crash)
 {
 	unsigned int this_cpu_id;
 	cpumask_t mask;
@@ -879,8 +890,11 @@ void smp_send_stop(void)
 
 		if (system_state <= SYSTEM_RUNNING)
 			pr_crit("stopping secondary CPUs\n");
-		arch_smp_stop_call(&mask);
-
+		/* smp and crash arch-backends helpers are kept distinct */
+		if (!crash)
+			arch_smp_stop_call(&mask);
+		else
+			arch_smp_crash_call(&mask);
 		/*
 		 * Defaults to wait up to one second for other CPUs to stop;
 		 * architectures can modify the default timeout or request
@@ -905,6 +919,51 @@ void smp_send_stop(void)
 				cpumask_pr_args(cpu_online_mask));
 	}
 	/* Perform final (possibly arch-specific) work on this CPU */
-	arch_smp_cpus_stop_complete();
+	if (!crash)
+		arch_smp_cpus_stop_complete();
+	else
+		arch_smp_cpus_crash_complete();
+}
+
+void smp_send_stop(void)
+{
+	__smp_send_stop_all(false);
+}
+
+bool __weak smp_crash_stop_failed(void)
+{
+	return (num_online_cpus() > cpu_online(smp_processor_id()));
 }
 #endif
+
+/*
+ * Stop other CPUs while passing down the additional information that a
+ * crash_kexec is ongoing: it's up to the architecture implementation
+ * to decide what to do.
+ *
+ * For example, Architectures supporting crash dump should provide
+ * specialized support for saving registers and disabling per-CPU features
+ * like virtualization extensions.
+ *
+ * Behaviour in the CONFIG_USE_COMMON_SMP_STOP=n case is preserved
+ * as it was before.
+ */
+void __weak crash_smp_send_stop(void)
+{
+	static int cpus_stopped;
+
+	/*
+	 * This function can be called twice in panic path, but obviously
+	 * we execute this only once.
+	 */
+	if (cpus_stopped)
+		return;
+
+#ifdef CONFIG_USE_COMMON_SMP_STOP
+	__smp_send_stop_all(true);
+#else
+	smp_send_stop();
+#endif
+
+	cpus_stopped = 1;
+}

From patchwork Thu Dec 19 12:18:56 2019
X-Patchwork-Submitter: Cristian Marussi
X-Patchwork-Id: 11303381
From: Cristian Marussi
To: linux-kernel@vger.kernel.org
Subject: [RFC PATCH v3 03/12] smp: coordinate concurrent crash/smp stop calls
Date: Thu, 19 Dec 2019 12:18:56 +0000
Message-Id: <20191219121905.26905-4-cristian.marussi@arm.com>
In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com>
References: <20191219121905.26905-1-cristian.marussi@arm.com>
Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com,
 sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com,
 peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org,
 linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org,
 mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com,
 tglx@linutronix.de, will@kernel.org, davem@davemloft.net,
 linux-arm-kernel@lists.infradead.org

Once a stop request is in progress on one CPU, what happens when another
stop request is issued concurrently on another CPU must be handled with
care.

Since the panic and crash dump code flows are mutually exclusive, the
only scenario that can lead to concurrent stop requests is a
halt/reboot/shutdown issued at the same time.

In such a case, prioritize the panic/crash procedure and, most
importantly, immediately park the offending CPU that was attempting the
concurrent stop request: force it to idle quietly, waiting for the
ongoing stop/dump request to reach it. Failing to do so would result in
the offending CPU being effectively lost on the next reboot triggered by
the crash dump. [1]

Another scenario in which the crash/stop code path was known to be
executed twice is a normal panic/crash with
crash_kexec_post_notifiers=1; since the issue is similar, fold the
handling of this case into the new logic too.

[1]
<<<<<---------- TRIGGERED PANIC
[ 225.676014] ------------[ cut here ]------------
[ 225.676656] kernel BUG at arch/arm64/kernel/cpufeature.c:852!
[ 225.677253] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 225.677660] Modules linked in:
[ 225.678458] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.3.0-rc5-00004-gb8a210cd3c32-dirty #2
[ 225.678621] Hardware name: Foundation-v8A (DT)
[ 225.678868] pstate: 000001c5 (nzcv dAIF -PAN -UAO)
[ 225.679523] pc : has_cpuid_feature+0x35c/0x360
[ 225.679649] lr : verify_local_elf_hwcaps+0x6c/0xf0
[ 225.679759] sp : ffff0000118cbf60
[ 225.679855] x29: ffff0000118cbf60 x28: 0000000000000000
[ 225.680026] x27: 0000000000000000 x26: 0000000000000000
[ 225.680115] x25: ffff00001167a010 x24: ffff0000112f59f8
[ 225.680207] x23: 0000000000000000 x22: 0000000000000000
[ 225.680290] x21: ffff0000112ea018 x20: ffff000010fe5538
[ 225.680372] x19: ffff000010ba3f30 x18: 000000000000001e
[ 225.680465] x17: 0000000000000000 x16: 0000000000000000
[ 225.680546] x15: 0000000000000000 x14: 0000000000000008
[ 225.680629] x13: 0209018b7a9404f4 x12: 0000000000000001
[ 225.680719] x11: 0000000000000080 x10: 00400032b5503510
[ 225.680803] x9 : 0000000000000000 x8 : ffff000010b93204
[ 225.680884] x7 : 00000000800001d8 x6 : 0000000000000005
[ 225.680975] x5 : 0000000000000000 x4 : 0000000000000000
[ 225.681056] x3 : 0000000000000000 x2 : 0000000000008000
[ 225.681139] x1 : 0000000000180480 x0 : 0000000000180480
[ 225.681423] Call trace:
[ 225.681669]  has_cpuid_feature+0x35c/0x360
[ 225.681762]  verify_local_elf_hwcaps+0x6c/0xf0
[ 225.681836]  check_local_cpu_capabilities+0x88/0x118
[ 225.681939]  secondary_start_kernel+0xc4/0x168
[ 225.682432] Code: d53801e0 17ffff58 d5380600 17ffff56 (d4210000)
[ 225.683998] smp: stopping secondary CPUs
[ 225.684130] Delaying stop.... <<<<<------ INSTRUMENTED DEBUG_DELAY
Rebooting. <<<<<------ MANUAL SIMULTANEOUS REBOOT
[ 232.647472] reboot: Restarting system
[ 232.648363] Reboot failed -- System halted
[ 239.552413] smp: failed to stop secondary CPUs 0
[ 239.554730] Starting crashdump kernel...
[ 239.555194] ------------[ cut here ]------------
[ 239.555406] Some CPUs may be stale, kdump will be unreliable.
[ 239.555802] WARNING: CPU: 3 PID: 0 at arch/arm64/kernel/machine_kexec.c:156 machine_kexec+0x3c/0x2b0
[ 239.556044] Modules linked in:
[ 239.556244] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.3.0-rc5-00004-gb8a210cd3c32-dirty #2
[ 239.556340] Hardware name: Foundation-v8A (DT)
[ 239.556459] pstate: 600003c5 (nZCv DAIF -PAN -UAO)
[ 239.556587] pc : machine_kexec+0x3c/0x2b0
[ 239.556700] lr : machine_kexec+0x3c/0x2b0
[ 239.556775] sp : ffff0000118cbad0
[ 239.556876] x29: ffff0000118cbad0 x28: ffff80087a8c3700
[ 239.557012] x27: 0000000000000000 x26: 0000000000000000
[ 239.557278] x25: ffff000010fe3ef0 x24: 00000000000003c0
....
[ 239.561568] Bye!
[ 0.000000] Booting Linux on physical CPU 0x0000000003 [0x410fd0f0]
[ 0.000000] Linux version 5.2.0-rc4-00001-g93bd4bc234d5-dirty
[ 0.000000] Machine model: Foundation-v8A
...
[ 0.197991] smp: Bringing up secondary CPUs ...
[ 0.232643] psci: failed to boot CPU1 (-22) <<<<--- LOST CPU ON REBOOT
[ 0.232861] CPU1: failed to boot: -22
[ 0.306291] Detected PIPT I-cache on CPU2
[ 0.310524] GICv3: CPU2: found redistributor 1 region 0:0x000000002f120000
[ 0.315618] CPU2: Booted secondary processor 0x0000000001 [0x410fd0f0]
[ 0.395576] Detected PIPT I-cache on CPU3
[ 0.400431] GICv3: CPU3: found redistributor 2 region 0:0x000000002f140000
[ 0.407252] CPU3: Booted secondary processor 0x0000000002 [0x410fd0f0]
[ 0.431811] smp: Brought up 1 node, 3 CPUs
[ 0.439898] SMP: Total of 3 processors activated.
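
For illustration only (not part of this patch): the arbitration between
concurrent stop requests described in this commit message can be modelled
in userspace with C11 atomics. This is a minimal sketch under stated
assumptions: stop_arbitrate() and enum stop_action are hypothetical names
introduced here, and atomic_compare_exchange_strong() stands in for the
kernel's atomic_cmpxchg(). It mirrors only the decision taken on entry to
__smp_send_stop_all(), not the actual stopping or parking of CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Reason recorded by the first caller; 0 means no stop in progress. */
#define REASON_STOP	1
#define REASON_CRASH	2

/* Hypothetical outcome codes, used only by this sketch. */
enum stop_action {
	STOP_PROCEED,	/* go on and stop the other CPUs */
	STOP_RETURN,	/* repeated crash request: nothing left to do */
	STOP_PARK,	/* plain stop lost the race: park this CPU */
};

/* Statically zero-initialized: no stop request recorded yet. */
static atomic_int stopping;

static enum stop_action stop_arbitrate(bool crash)
{
	int expected = 0;
	int reason = crash ? REASON_CRASH : REASON_STOP;

	/* Only the first caller manages to record its reason. */
	if (atomic_compare_exchange_strong(&stopping, &expected, reason))
		return STOP_PROCEED;

	/* 'expected' now holds the reason recorded by the earlier caller. */
	if (crash && expected == REASON_CRASH)
		return STOP_RETURN;
	if (!crash)
		return STOP_PARK;

	/* A crash request arriving during a plain stop still goes ahead. */
	return STOP_PROCEED;
}
```

In this model a plain stop that finds any request already recorded parks
itself, a repeated crash request returns early, and a crash request that
arrives while a plain stop is in flight is still allowed to proceed.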
Signed-off-by: Cristian Marussi --- v2 --> v3 - local var renamded v1 --> v2 - using new CONFIG_USE_COMMON_SMP_STOP --- include/linux/smp.h | 3 +++ kernel/smp.c | 48 ++++++++++++++++++++++++++++++++++++++++----- 2 files changed, 46 insertions(+), 5 deletions(-) diff --git a/include/linux/smp.h b/include/linux/smp.h index 1886e49a65bb..42be03ac1c0c 100644 --- a/include/linux/smp.h +++ b/include/linux/smp.h @@ -136,6 +136,9 @@ extern void arch_smp_crash_call(cpumask_t *cpus); /* Helper to query the outcome of an ongoing crash_stop operation */ bool smp_crash_stop_failed(void); + +/* A generic cpu parking helper, possibly overridden by architecture code */ +void arch_smp_cpu_park(void) __noreturn; #endif /* diff --git a/kernel/smp.c b/kernel/smp.c index 6224b0b1208b..29eb6eff2063 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -844,6 +844,12 @@ bool smp_stop_get_wait_settings(unsigned long *timeout) return atomic_read(&smp_stop_wait_forever); } +void __weak arch_smp_cpu_park(void) +{ + while (1) + cpu_relax(); +} + void __weak arch_smp_cpus_stop_complete(void) { } void __weak arch_smp_cpus_crash_complete(void) @@ -866,6 +872,9 @@ void __weak arch_smp_crash_call(cpumask_t *cpus) arch_smp_stop_call(cpus); } +#define REASON_STOP 1 +#define REASON_CRASH 2 + /* * This centralizes the common logic to: * @@ -881,8 +890,38 @@ static void __smp_send_stop_all(bool crash) { unsigned int this_cpu_id; cpumask_t mask; + static atomic_t stopping; + int was_stopping; this_cpu_id = smp_processor_id(); + /* make sure that concurrent stop requests are handled properly */ + was_stopping = atomic_cmpxchg(&stopping, 0, + crash ? REASON_CRASH : REASON_STOP); + if (was_stopping) { + /* + * This function can be called twice in panic path if + * crash_kexec is called with crash_kexec_post_notifiers=1, + * but obviously we execute this only once. 
+ */ + if (crash && was_stopping == REASON_CRASH) + return; + /* + * In case of other concurrent STOP calls (like in a REBOOT + * started simultaneously as an ongoing PANIC/CRASH/REBOOT) + * we want to prioritize the ongoing PANIC/KEXEC call and + * force here the offending CPU that was attempting the + * concurrent STOP to just sit idle waiting to die. + * Failing to do so would result in a lost CPU on the next + * possible reboot triggered by crash_kexec(). + */ + if (!crash) { + pr_crit("CPU%d - kernel already stopping, parking myself.\n", + this_cpu_id); + local_irq_enable(); + /* does not return */ + arch_smp_cpu_park(); + } + } if (any_other_cpus_online(&mask, this_cpu_id)) { bool wait_forever; unsigned long timeout; @@ -950,6 +989,9 @@ bool __weak smp_crash_stop_failed(void) */ void __weak crash_smp_send_stop(void) { +#ifdef CONFIG_USE_COMMON_SMP_STOP + __smp_send_stop_all(true); +#else static int cpus_stopped; /* @@ -959,11 +1001,7 @@ void __weak crash_smp_send_stop(void) if (cpus_stopped) return; -#ifdef CONFIG_USE_COMMON_SMP_STOP - __smp_send_stop_all(true); -#else smp_send_stop(); -#endif - cpus_stopped = 1; +#endif } From patchwork Thu Dec 19 12:18:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303383 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E4DCA139A for ; Thu, 19 Dec 2019 12:21:05 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C1C6120716 for ; Thu, 19 Dec 2019 12:21:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="Y3PW1Ggp" 
From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH
v3 04/12] smp: address races of starting CPUs while stopping
Date: Thu, 19 Dec 2019 12:18:57 +0000
Message-Id: <20191219121905.26905-5-cristian.marussi@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com>
References: <20191219121905.26905-1-cristian.marussi@arm.com>
Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org

Add best-effort retry logic to the SMP stop common code, re-issuing a stop request when any CPU is detected to be still online after the first stop request cycle has completed.
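As a rough illustration of the retry cycle, here is a user-space sketch, not the kernel implementation. The names stop_call_fn, struct stop_result, stop_with_retries() and flaky_stop_call() are hypothetical. The loop condition mirrors the one this patch adds to __smp_send_stop_all(); the real code waits on a timeout and re-reads cpu_online_mask, whereas this model simply has the helper return the CPUs that are still online.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the arch stop/crash helpers: takes a bitmask of
 * the other CPUs still online plus the 1-based attempt number, and returns
 * the bitmask of CPUs still online after the request.
 */
typedef unsigned int (*stop_call_fn)(unsigned int online_others,
				     unsigned int attempt);

struct stop_result {
	unsigned int still_online;	/* CPUs that never stopped */
	unsigned int attempts;		/* stop requests actually issued */
};

static struct stop_result stop_with_retries(unsigned int online_others,
					    unsigned int max_retries,
					    stop_call_fn stop_call)
{
	struct stop_result res;
	unsigned int attempt = 0;

	/* Retry only while attempts remain and some other CPU is online. */
	while (++attempt <= max_retries && online_others != 0)
		online_others = stop_call(online_others, attempt);

	res.still_online = online_others;
	res.attempts = attempt - 1;
	return res;
}

/* Example backend: the first attempt misses CPU 2, a retry stops it. */
static unsigned int flaky_stop_call(unsigned int online_others,
				    unsigned int attempt)
{
	return attempt == 1 ? (online_others & 0x4u) : 0u;
}
```

With two attempts allowed, the CPU missed by the first request is caught by the second; with a single attempt it stays online, which is the residual window that a best-effort approach leaves open.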
While retrying, pass an 'attempt' argument to the architectures' helpers so that, after a first failed attempt, they can autonomously decide to adopt a different approach in subsequent retries.

Address the case in which some CPUs are still starting up when the stop process is initiated: such CPUs, officially still offline, go undetected and come fully online only after the SMP stop procedure has already started, so an ongoing stop procedure would miss them.

Being a best-effort approach, though, it is not guaranteed to stop a CPU that finally comes online after the whole SMP stop retry cycle has completed (i.e. the race window is reduced but not eliminated).

Signed-off-by: Cristian Marussi
---
v2 --> v3
- reviewed max_retries get/set helpers to avoid header global static

v1 --> v2
- added attempt_num param to arch helpers, to let the arch implementation know if a retry is ongoing because some CPU failed to stop (some archs, like x86, attempt the retry with different means, like NMI)
- added some helpers to let archs decide on the number of retries
---
A more deterministic approach was also attempted, in order to catch any concurrently starting CPUs at the very last moment and make them kill themselves, by:

- adding a new set_cpus_stopping() in cpumask.h, used to set a __cpus_stopping atomic internal flag
- modifying set_cpu_online() to check __cpus_stopping only when coming online, and force the offending CPU to kill itself in that case

However, it proved tricky and complex (besides faulty) to implement the above 'kill-myself' phase reliably while remaining architecture agnostic and still properly distinguishing regular stops from crash kexec.
So, given that the main idea underlying this patch series was rather to simplify and unify code, and that the residual races not caught by the best-effort logic did not seem very likely, this more deterministic approach has been dropped in favour of the best-effort retry logic. Moreover, the current retry logic will be needed anyway to support some architectures, like x86, that prefer to use different CPU-stopping methods in subsequent retries.
---
include/linux/smp.h | 11 +++++++++-- kernel/smp.c | 34 +++++++++++++++++++++++++++------- 2 files changed, 36 insertions(+), 9 deletions(-) diff --git a/include/linux/smp.h b/include/linux/smp.h index 42be03ac1c0c..247c78434a3d 100644 --- a/include/linux/smp.h +++ b/include/linux/smp.h @@ -78,6 +78,7 @@ int smp_call_function_single_async(int cpu, call_single_data_t *csd); extern void smp_send_stop(void); #ifdef CONFIG_USE_COMMON_SMP_STOP + /* * An Architecture can optionally use this helper to change the default * waiting behaviour of common STOP logic. @@ -101,12 +102,18 @@ void smp_stop_set_wait_timeout_us(unsigned long timeout); */ bool smp_stop_get_wait_settings(unsigned long *timeout); +/* Change common SMP STOP logic maximum retries */ +void smp_stop_set_max_retries(unsigned int max_retries); + +/* Get currently set maximum retries attempt */ +unsigned int smp_stop_get_max_retries(void); + /* * Any Architecture willing to use STOP common logic implementation * MUST at least provide the arch_smp_stop_call() helper which implements * the arch-specific CPU-stop mechanism. */ -extern void arch_smp_stop_call(cpumask_t *cpus); +extern void arch_smp_stop_call(cpumask_t *cpus, unsigned int attempt_num); /* * An Architecture CAN also provide the arch_smp_cpus_stop_complete() @@ -132,7 +139,7 @@ void arch_smp_cpus_crash_complete(void); * when not provided the crash dump procedure will fallback to behave like * a normal stop.
(no saved regs, no arch-specific features disabled) */ -extern void arch_smp_crash_call(cpumask_t *cpus); +extern void arch_smp_crash_call(cpumask_t *cpus, unsigned int attempt_num); /* Helper to query the outcome of an ongoing crash_stop operation */ bool smp_crash_stop_failed(void); diff --git a/kernel/smp.c b/kernel/smp.c index 29eb6eff2063..46a307d2351e 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -823,6 +823,9 @@ EXPORT_SYMBOL_GPL(smp_call_on_cpu); static atomic_t smp_stop_wait_forever; static atomic_t smp_stop_wait_timeout = ATOMIC_INIT(USEC_PER_SEC); +#define DEFAULT_MAX_STOP_RETRIES 2 +static atomic_t max_stop_retries = ATOMIC_INIT(DEFAULT_MAX_STOP_RETRIES); + void smp_stop_set_wait_forever(bool wait_forever) { atomic_set(&smp_stop_wait_forever, wait_forever); @@ -844,6 +847,18 @@ bool smp_stop_get_wait_settings(unsigned long *timeout) return atomic_read(&smp_stop_wait_forever); } +void smp_stop_set_max_retries(unsigned int max_retries) +{ + atomic_set(&max_stop_retries, max_retries); + /* ensure retries atomics are visible */ + smp_mb__after_atomic(); +} + +unsigned int smp_stop_get_max_retries(void) +{ + return atomic_read(&max_stop_retries); +} + void __weak arch_smp_cpu_park(void) { while (1) @@ -866,10 +881,10 @@ static inline bool any_other_cpus_online(cpumask_t *mask, return !cpumask_empty(mask); } -void __weak arch_smp_crash_call(cpumask_t *cpus) +void __weak arch_smp_crash_call(cpumask_t *cpus, unsigned int attempt_num) { pr_debug("SMP: Using generic %s() as SMP crash call.\n", __func__); - arch_smp_stop_call(cpus); + arch_smp_stop_call(cpus, attempt_num); } #define REASON_STOP 1 @@ -888,10 +903,10 @@ void __weak arch_smp_crash_call(cpumask_t *cpus) */ static void __smp_send_stop_all(bool crash) { - unsigned int this_cpu_id; cpumask_t mask; static atomic_t stopping; int was_stopping; + unsigned int this_cpu_id, max_retries, attempt = 0; this_cpu_id = smp_processor_id(); /* make sure that concurrent stop requests are handled properly */ @@ -922,7 
+937,9 @@ static void __smp_send_stop_all(bool crash) arch_smp_cpu_park(); } } - if (any_other_cpus_online(&mask, this_cpu_id)) { + max_retries = smp_stop_get_max_retries(); + while (++attempt <= max_retries && + any_other_cpus_online(&mask, this_cpu_id)) { bool wait_forever; unsigned long timeout; unsigned int this_cpu_online = cpu_online(this_cpu_id); @@ -931,9 +948,9 @@ static void __smp_send_stop_all(bool crash) pr_crit("stopping secondary CPUs\n"); /* smp and crash arch-backends helpers are kept distinct */ if (!crash) - arch_smp_stop_call(&mask); + arch_smp_stop_call(&mask, attempt); else - arch_smp_crash_call(&mask); + arch_smp_crash_call(&mask, attempt); /* * Defaults to wait up to one second for other CPUs to stop; * architectures can modify the default timeout or request @@ -953,9 +970,12 @@ static void __smp_send_stop_all(bool crash) udelay(1); /* ensure any stopping-CPUs memory access is made visible */ smp_rmb(); - if (num_online_cpus() > this_cpu_online) + if (num_online_cpus() > this_cpu_online) { pr_warn("failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(cpu_online_mask)); + if (attempt < max_retries) + pr_warn("Retrying SMP stop call...\n"); + } } /* Perform final (possibly arch-specific) work on this CPU */ if (!crash) From patchwork Thu Dec 19 12:18:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303385 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C029B109A for ; Thu, 19 Dec 2019 12:21:26 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 707092068F for ; Thu, 19 Dec 2019 12:21:26 +0000 (UTC) Authentication-Results: 
mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="CqiNrJ6F" Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id
42B5D3F719; Thu, 19 Dec 2019 04:19:34 -0800 (PST)
From: Cristian Marussi
To: linux-kernel@vger.kernel.org
Subject: [RFC PATCH v3 05/12] arm64: smp: use generic SMP stop common code
Date: Thu, 19 Dec 2019 12:18:58 +0000
Message-Id: <20191219121905.26905-6-cristian.marussi@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com>
References: <20191219121905.26905-1-cristian.marussi@arm.com>
Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org

Make arm64 use the generic SMP-stop logic provided by the common code's unified smp_send_stop() function.
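One thing the common code does differently from the arm64 smp_send_stop() removed by this patch is that it decides whether stop requests are needed from a mask of the other online CPUs, rather than from a count of online CPUs. A minimal user-space sketch of the two checks follows. It is not kernel code: cpumask_bits, count_online(), need_stop_by_count() and need_stop_by_mask() are hypothetical names, and a plain bitmask stands in for cpu_online_mask.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per CPU; a set bit means that CPU is marked online. */
typedef unsigned int cpumask_bits;

static unsigned int count_online(cpumask_bits online)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; online; online &= online - 1)
		n++;
	return n;
}

/* Count-based check, as in 'if (num_online_cpus() > 1)'. */
static bool need_stop_by_count(cpumask_bits online)
{
	return count_online(online) > 1;
}

/* Mask-based check: drop the caller from the mask, then see who is left. */
static bool need_stop_by_mask(cpumask_bits online, unsigned int this_cpu)
{
	cpumask_bits others = online & ~(1u << this_cpu);

	return others != 0;
}
```

On a 2-CPU system where CPU1 panics before it has been marked online, the online mask is 0x1 and the caller is CPU1: the count-based check sees a single online CPU and sends nothing, while the mask-based check still finds CPU0.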
The arm64 smp_send_stop() logic had a bug: it failed to consider the online status of the calling CPU when evaluating whether any stop message needed to be sent to other CPUs at all. On a 2-CPU system, this resulted in a failure to stop all CPUs if one panicked while starting up, leaving the system in an unexpectedly lively state.

[root@arch ~]# echo 1 > /sys/devices/system/cpu/cpu1/online
[root@arch ~]# [ 152.583368] ------------[ cut here ]------------
[ 152.583872] kernel BUG at arch/arm64/kernel/cpufeature.c:852!
[ 152.584693] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 152.585228] Modules linked in:
[ 152.586040] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.3.0-rc5-00001-gcabd12118c4a-dirty #2
[ 152.586218] Hardware name: Foundation-v8A (DT)
[ 152.586478] pstate: 000001c5 (nzcv dAIF -PAN -UAO)
[ 152.587260] pc : has_cpuid_feature+0x35c/0x360
[ 152.587398] lr : verify_local_elf_hwcaps+0x6c/0xf0
[ 152.587520] sp : ffff0000118bbf60
[ 152.587605] x29: ffff0000118bbf60 x28: 0000000000000000
[ 152.587784] x27: 0000000000000000 x26: 0000000000000000
[ 152.587882] x25: ffff00001167a010 x24: ffff0000112f59f8
[ 152.587992] x23: 0000000000000000 x22: 0000000000000000
[ 152.588085] x21: ffff0000112ea018 x20: ffff000010fe5518
[ 152.588180] x19: ffff000010ba3f30 x18: 0000000000000036
[ 152.588285] x17: 0000000000000000 x16: 0000000000000000
[ 152.588380] x15: 0000000000000000 x14: ffff80087a821210
[ 152.588481] x13: 0000000000000000 x12: 0000000000000000
[ 152.588599] x11: 0000000000000080 x10: 00400032b5503510
[ 152.588709] x9 : 0000000000000000 x8 : ffff000010b93204
[ 152.588810] x7 : 00000000800001d8 x6 : 0000000000000005
[ 152.588910] x5 : 0000000000000000 x4 : 0000000000000000
[ 152.589021] x3 : 0000000000000000 x2 : 0000000000008000
[ 152.589121] x1 : 0000000000180480 x0 : 0000000000180480
[ 152.589379] Call trace:
[ 152.589646] has_cpuid_feature+0x35c/0x360
[ 152.589763] verify_local_elf_hwcaps+0x6c/0xf0
[ 152.589858] check_local_cpu_capabilities+0x88/0x118 [
152.589968] secondary_start_kernel+0xc4/0x168
[ 152.590530] Code: d53801e0 17ffff58 d5380600 17ffff56 (d4210000)
[ 152.592215] ---[ end trace 80ea98416149c87e ]---
[ 152.592734] Kernel panic - not syncing: Attempted to kill the idle task!
[ 152.593173] Kernel Offset: disabled
[ 152.593501] CPU features: 0x0004,20c02008
[ 152.593678] Memory Limit: none
[ 152.594208] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
[root@arch ~]# bash: echo: write error: Input/output error
[root@arch ~]#
[root@arch ~]#
[root@arch ~]# echo HELO
HELO

Get rid of this bug by switching arm64 to use the common SMP stop code.

Reported-by: Dave Martin
Signed-off-by: Cristian Marussi
---
v1 --> v2
- now selecting arch/Kconfig ARCH_USE_COMMON_SMP_STOP
- added attempt_num arch_smp_stop_call() helper
---
arch/arm64/Kconfig | 1 + arch/arm64/kernel/smp.c | 29 ++++++----------------------- 2 files changed, 7 insertions(+), 23 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index b1b4476ddb83..618e2c2052dd 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -64,6 +64,7 @@ config ARM64 select ARCH_USE_CMPXCHG_LOCKREF select ARCH_USE_QUEUED_RWLOCKS select ARCH_USE_QUEUED_SPINLOCKS + select ARCH_USE_COMMON_SMP_STOP select ARCH_SUPPORTS_MEMORY_FAILURE select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && (GCC_VERSION >= 50000 || CC_IS_CLANG) diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index d4ed9a19d8fe..7c1869161b5e 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -958,33 +958,16 @@ void tick_broadcast(const struct cpumask *mask) } #endif -void smp_send_stop(void) +void arch_smp_cpus_stop_complete(void) { - unsigned long timeout; - - if (num_online_cpus() > 1) { - cpumask_t mask; - - cpumask_copy(&mask, cpu_online_mask); - cpumask_clear_cpu(smp_processor_id(), &mask); - - if (system_state <= SYSTEM_RUNNING) - pr_crit("SMP: stopping secondary CPUs\n"); - smp_cross_call(&mask,
IPI_CPU_STOP); - } - - /* Wait up to one second for other CPUs to stop */ - timeout = USEC_PER_SEC; - while (num_online_cpus() > 1 && timeout--) - udelay(1); - - if (num_online_cpus() > 1) - pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", - cpumask_pr_args(cpu_online_mask)); - sdei_mask_local_cpu(); } +void arch_smp_stop_call(cpumask_t *cpus, unsigned int __unused) +{ + smp_cross_call(cpus, IPI_CPU_STOP); +} + #ifdef CONFIG_KEXEC_CORE void crash_smp_send_stop(void) { From patchwork Thu Dec 19 12:18:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303387 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F08D1139A for ; Thu, 19 Dec 2019 12:21:41 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C9C732068F for ; Thu, 19 Dec 2019 12:21:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="u6MYaPl7" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C9C732068F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: 
Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=MoKvisOdU0Nn+2QK02XmM0zSYkFrxFYx8tSjMjJZfzg=; b=u6MYaPl76EdD6gQuV/F8USgZw7 tawa3co/kiCM9bspLmld8hl+k9jqAUSS3dNvx3VhR6WeiXJkvuY1B2uNanYUUMkuIpt4wmEaz4uGE A/MmmxicIyG5CXVHGfl1nJxnFohhuJgX6suaLrPOv1DVZHF4dUKVyy/ZyV+10aesyZyFdmGDvQ5mm pSjsTckIH5ZqnjQ7WXisq3i4bMr1jn0dNkv+Q8Q5M2wqkkUg/tljZn30ItoPxuTC7EynwfFNKJYxx 8x6ujZBnIIDCCLDxhHlffoev3xBXUFXNg3x4h48kmWHbV0Y1AAtqjECLetEXOZdHL27KeB0Fs6SrJ xJ+NKCJA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihuoE-000500-J9; Thu, 19 Dec 2019 12:21:39 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumJ-0002Ef-IZ for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:41 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 16F93DA7; Thu, 19 Dec 2019 04:19:39 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C12853F719; Thu, 19 Dec 2019 04:19:36 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 06/12] arm64: smp: use SMP crash-stop common code Date: Thu, 19 Dec 2019 12:18:59 +0000 Message-Id: <20191219121905.26905-7-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041939_722099_EEB724A3 X-CRM114-Status: GOOD ( 14.94 ) X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name 
description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record
Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org

Make arm64 use the SMP common implementation of crash_smp_send_stop() and its generic logic, by removing the arm64 crash_smp_send_stop() definition and providing the needed arch-specific helpers.

Additionally, simplify the arch-specific stop and crash dump ISR backends (which are in charge of actually receiving and interpreting the stop/crash messages) and unify them as much as possible.

With the SMP common code, an atomic_t counter is no longer needed to make sure that each CPU has had time to perform its crash-dump-related shutdown ops before the world ends: it is enough to synchronize on cpu_online_mask and add proper explicit memory barriers where needed.

Moreover, remove the arm64-specific smp_crash_stop_failed() helper altogether and rely on the function of the same name provided by common code to look up the state of an ongoing crash_stop operation.
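The idea of synchronizing on the online mask instead of a dedicated counter can be sketched in user space as below. This is a sketch under stated assumptions rather than the arm64 code. C11 release/acquire operations stand in for the smp_wmb() issued before set_cpu_online(cpu, false) and for the smp_rmb() on the waiting side. The names saved_regs, online_mask, cpu_save_and_go_offline(), cpu_is_offline() and read_saved_regs() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-CPU slot standing in for the registers saved for the dump. */
static long saved_regs[NR_CPUS];

/* One bit per CPU, standing in for cpu_online_mask; all CPUs start online. */
static atomic_uint online_mask = (1u << NR_CPUS) - 1;

/*
 * Stopping side: save state first, then clear the online bit with release
 * semantics, so the saved state is visible to whoever sees the bit cleared.
 */
static void cpu_save_and_go_offline(unsigned int cpu, long regs)
{
	saved_regs[cpu] = regs;
	atomic_fetch_and_explicit(&online_mask, ~(1u << cpu),
				  memory_order_release);
}

/* Waiting side: the acquire load pairs with the release above. */
static bool cpu_is_offline(unsigned int cpu)
{
	unsigned int mask = atomic_load_explicit(&online_mask,
						 memory_order_acquire);

	return !(mask & (1u << cpu));
}

/* Returns -1 while the CPU is still online, the saved value otherwise. */
static long read_saved_regs(unsigned int cpu)
{
	if (!cpu_is_offline(cpu))
		return -1;
	return saved_regs[cpu];
}
```

Because the online bit is cleared only after the state has been written, observing a CPU as offline doubles as the signal that its dump data is complete, which is why a separate counter is not needed in this model.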
Signed-off-by: Cristian Marussi v1 --> v2 - added attempt_num param to arch_smp_crash_call() --- arch/arm64/include/asm/smp.h | 2 - arch/arm64/kernel/smp.c | 100 +++++++++-------------------------- 2 files changed, 26 insertions(+), 76 deletions(-) diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h index a0c8a0b65259..d98c409f9225 100644 --- a/arch/arm64/include/asm/smp.h +++ b/arch/arm64/include/asm/smp.h @@ -150,8 +150,6 @@ static inline void cpu_panic_kernel(void) */ bool cpus_are_stuck_in_kernel(void); -extern void crash_smp_send_stop(void); -extern bool smp_crash_stop_failed(void); #endif /* ifndef __ASSEMBLY__ */ diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 7c1869161b5e..edb2de85507a 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -830,12 +830,30 @@ void arch_irq_work_raise(void) } #endif -static void local_cpu_stop(void) +static void local_cpu_crash_or_stop(struct pt_regs *crash_regs) { - set_cpu_online(smp_processor_id(), false); + unsigned int cpu = smp_processor_id(); - local_daif_mask(); + if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_regs) { +#ifdef CONFIG_KEXEC_CORE + /* crash stop requested: save regs before going offline */ + crash_save_cpu(crash_regs, cpu); +#endif + local_irq_disable(); + } else { + local_daif_mask(); + } sdei_mask_local_cpu(); + /* ensure dumped regs are visible once cpu is seen offline */ + smp_wmb(); + set_cpu_online(cpu, false); + /* ensure all writes are globally visible before cpu parks */ + wmb(); +#if defined(CONFIG_KEXEC_CORE) && defined(CONFIG_HOTPLUG_CPU) + if (cpu_ops[cpu]->cpu_die) + cpu_ops[cpu]->cpu_die(cpu); +#endif + /* just in case */ cpu_park_loop(); } @@ -846,31 +864,7 @@ static void local_cpu_stop(void) */ void panic_smp_self_stop(void) { - local_cpu_stop(); -} - -#ifdef CONFIG_KEXEC_CORE -static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0); -#endif - -static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs) -{ -#ifdef 
CONFIG_KEXEC_CORE - crash_save_cpu(regs, cpu); - - atomic_dec(&waiting_for_crash_ipi); - - local_irq_disable(); - sdei_mask_local_cpu(); - -#ifdef CONFIG_HOTPLUG_CPU - if (cpu_ops[cpu]->cpu_die) - cpu_ops[cpu]->cpu_die(cpu); -#endif - - /* just in case */ - cpu_park_loop(); -#endif + local_cpu_crash_or_stop(NULL); } /* @@ -899,14 +893,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs) case IPI_CPU_STOP: irq_enter(); - local_cpu_stop(); + local_cpu_crash_or_stop(NULL); irq_exit(); break; case IPI_CPU_CRASH_STOP: if (IS_ENABLED(CONFIG_KEXEC_CORE)) { irq_enter(); - ipi_cpu_crash_stop(cpu, regs); + local_cpu_crash_or_stop(regs); unreachable(); } @@ -968,52 +962,10 @@ void arch_smp_stop_call(cpumask_t *cpus, unsigned int __unused) smp_cross_call(cpus, IPI_CPU_STOP); } -#ifdef CONFIG_KEXEC_CORE -void crash_smp_send_stop(void) +void arch_smp_crash_call(cpumask_t *cpus, unsigned int __unused) { - static int cpus_stopped; - cpumask_t mask; - unsigned long timeout; - - /* - * This function can be called twice in panic path, but obviously - * we execute this only once. 
- */ - if (cpus_stopped) - return; - - cpus_stopped = 1; - - if (num_online_cpus() == 1) { - sdei_mask_local_cpu(); - return; - } - - cpumask_copy(&mask, cpu_online_mask); - cpumask_clear_cpu(smp_processor_id(), &mask); - - atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); - - pr_crit("SMP: stopping secondary CPUs\n"); - smp_cross_call(&mask, IPI_CPU_CRASH_STOP); - - /* Wait up to one second for other CPUs to stop */ - timeout = USEC_PER_SEC; - while ((atomic_read(&waiting_for_crash_ipi) > 0) && timeout--) - udelay(1); - - if (atomic_read(&waiting_for_crash_ipi) > 0) - pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", - cpumask_pr_args(&mask)); - - sdei_mask_local_cpu(); -} - -bool smp_crash_stop_failed(void) -{ - return (atomic_read(&waiting_for_crash_ipi) > 0); + smp_cross_call(cpus, IPI_CPU_CRASH_STOP); } -#endif /* * not supported here From patchwork Thu Dec 19 12:19:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303389 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CD0E8139A for ; Thu, 19 Dec 2019 12:21:57 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A5CC82068F for ; Thu, 19 Dec 2019 12:21:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="sweuhwQs" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A5CC82068F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none 
smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=UbG8M46S0buFyv5Ip3vQithmtDyYD9saSQBFlDyYd7E=; b=sweuhwQsob16noOcmr5PwxHg2/ Vec4nazcYBTJRRFh/ZljlMKMjHIeNlfacuT8sabemGFJXgnPfkVjMFKQP+TwsYpqP5TG1hYXTlU8f eACi8V9cwEGw9uIxIosovQaDcN8U5A6tFGZLVaHmxIMJ7oRAQW7wGwJB4DmPnT903iKA0PZ7ssSiv IYj0+yjlc+WUGAlGWCvaVYNUgbRKjd+FHDGv9xbSBpMH/V4+qlnCLOgDclSG8Vm1bBxkdAPZX1yOP atzWgkux/rczVbohphHqDlHTys9tMTdK7qMdjbYzsULMGVphQOX3+ptOr0hjI8wxAPjQwx/TG3ZRt owBW3iCA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihuoV-0005GL-SW; Thu, 19 Dec 2019 12:21:55 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumM-0002Hb-5L for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:43 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 966C231B; Thu, 19 Dec 2019 04:19:41 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4BE753F719; Thu, 19 Dec 2019 04:19:39 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 07/12] arm64: smp: add arch specific cpu parking helper Date: Thu, 19 Dec 2019 12:19:00 +0000 Message-Id: <20191219121905.26905-8-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: 
<20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041942_327671_9CF3FBB6 X-CRM114-Status: UNSURE ( 7.27 ) X-CRM114-Notice: Please train this message. X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org Add an arm64 specific helper which parks the cpu in a more architecture efficient way. 
Signed-off-by: Cristian Marussi --- arch/arm64/kernel/smp.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index edb2de85507a..3f108be544f8 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -952,6 +952,12 @@ void tick_broadcast(const struct cpumask *mask) } #endif +void arch_smp_cpu_park(void) +{ + while (1) + cpu_park_loop(); +} + void arch_smp_cpus_stop_complete(void) { sdei_mask_local_cpu(); From patchwork Thu Dec 19 12:19:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303391 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7FD2A109A for ; Thu, 19 Dec 2019 12:22:11 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 50209227BF for ; Thu, 19 Dec 2019 12:22:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="eWkbBsdE" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 50209227BF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: 
Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=E9rKyo2gHQHHeWg9Ok3h5XmPq8Fag3SAroXonEDRt94=; b=eWkbBsdEAUiBceDkfureO1qq+z hb/LutR8lB8eEqSELJWEO+iTbzf78EwugU7XOzfRaystphbQxupfn8G2w44VIvvA6uLLOkRxQyQoi nfemna8+e4ObW+pskxFnI5u29dJW1P29eiiABH8MccsS0iRqsUBEY5LX4NzXSf8bYxHdCkuff3dxj NPAw+lfUTwC7i7XGWEDofmLO2d7ary6yECA2NEBFw5B3wFc2gkzHxD3gM2RpZEeKxeiiBfhQce571 5gb8uR6dCbMMfFEWG7D5IVk706u84w9iwrDQoVvdTfFWQBbWuaRjKsI9a+zHGYnJc++2z7aKdkhhG jhTwN9yQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihuoj-0005U7-VK; Thu, 19 Dec 2019 12:22:10 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumO-0002Jj-V7 for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:47 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 237AF1007; Thu, 19 Dec 2019 04:19:44 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CB8CD3F719; Thu, 19 Dec 2019 04:19:41 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 08/12] x86: smp: use generic SMP stop common code Date: Thu, 19 Dec 2019 12:19:01 +0000 Message-Id: <20191219121905.26905-9-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041945_210578_EBEC55ED X-CRM114-Status: GOOD ( 19.73 ) X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name 
description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Make x86 use the generic SMP-stop logic provided by the unified smp_send_stop() function in common code.

Introduce the needed arch_smp_stop_call()/arch_smp_cpus_stop_complete() helpers, which implement the architecture-specific backend functionality previously provided by native_stop_other_cpus(); the common logic is now delegated to the common SMP stop code.

Remove the arch-specific smp_send_stop() and redefine native_stop_other_cpus() to rely on the common code version of smp_send_stop() instead. native_stop_other_cpus() itself is kept because it is wired to smp_ops.stop_other_cpus(), which gets called at reboot time with particular waiting settings.
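For illustration only, the per-attempt policy that arch_smp_stop_call() takes over from native_stop_other_cpus() can be modelled in user space roughly as below. This is a sketch under stated assumptions, not kernel code: model_stop_call(), struct stop_wait, wait_cfg, saved_forever and the VEC_* names are hypothetical stand-ins for the common-code wait settings and for the REBOOT_VECTOR/NMI_VECTOR IPIs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for REBOOT_VECTOR / NMI_VECTOR. */
enum stop_vector { VEC_NONE, VEC_REBOOT, VEC_NMI };

/* Hypothetical model of the wait settings kept by the common SMP stop code. */
struct stop_wait {
	bool forever;
	unsigned long timeout_us;
};

struct stop_wait wait_cfg;	/* what the common code would consult */
bool saved_forever;		/* plays the role of 'saved_wait' in the diff */

/*
 * Model of the per-attempt decision: the first attempt uses the regular
 * reboot IPI with a bounded wait, any later attempt escalates to the NMI
 * (when usable) and restores the caller's original wait-forever request.
 */
enum stop_vector model_stop_call(unsigned int attempt_num, bool nmi_usable)
{
	if (attempt_num == 1) {
		saved_forever = wait_cfg.forever;
		/* never wait forever here, or the NMI stage is never reached */
		wait_cfg.forever = false;
		/* mirrors USEC_PER_MSEC as written in the diff */
		wait_cfg.timeout_us = 1000;
		return VEC_REBOOT;
	}
	if (attempt_num > 1) {
		wait_cfg.forever = saved_forever;
		/* mirrors USEC_PER_MSEC * 10 in the diff */
		wait_cfg.timeout_us = 10 * 1000;
		return nmi_usable ? VEC_NMI : VEC_NONE;
	}
	return VEC_NONE;
}
```

The point of the model is the ordering: the caller's wait-forever request is parked during the first attempt and only honoured once the NMI stage has been reached.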
Signed-off-by: Cristian Marussi --- Note that in this patch we kept in use the original x86 local handling of 'stopping_cpu' variable: atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()); Instead, common SMP stop code could have been easily extended to keep and expose to architectures backends such value using some additional helper like smp_stop_get_stopping_cpu_id(). This has not been addressed in this series. v2 ---> v3 - added new wait_forever change capabilities - better handling of x86 reboot_force flag --- arch/x86/Kconfig | 1 + arch/x86/include/asm/smp.h | 5 --- arch/x86/kernel/smp.c | 88 +++++++++++++++++++------------------- 3 files changed, 44 insertions(+), 50 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5e8949953660..0bc274426875 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -92,6 +92,7 @@ config X86 select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_QUEUED_RWLOCKS select ARCH_USE_QUEUED_SPINLOCKS + select ARCH_USE_COMMON_SMP_STOP select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH select ARCH_WANTS_DYNAMIC_TASK_STRUCT select ARCH_WANT_HUGE_PMD_SHARE diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h index e15f364efbcc..e937fab6474b 100644 --- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -67,11 +67,6 @@ extern void set_cpu_sibling_map(int cpu); #ifdef CONFIG_SMP extern struct smp_ops smp_ops; -static inline void smp_send_stop(void) -{ - smp_ops.stop_other_cpus(0); -} - static inline void stop_other_cpus(void) { smp_ops.stop_other_cpus(1); diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index b8d4e9c3c070..7aeb45c512f7 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -147,71 +147,69 @@ static int register_stop_handler(void) static void native_stop_other_cpus(int wait) { - unsigned long flags; - unsigned long timeout; - if (reboot_force) return; + smp_stop_set_wait_forever(wait); + /* use common SMP stop code */ + smp_send_stop(); +} - /* - * Use an own vector here 
because smp_call_function - * does lots of things not suitable in a panic situation. - */ - - /* - * We start by using the REBOOT_VECTOR irq. - * The irq is treated as a sync point to allow critical - * regions of code on other cpus to release their spin locks - * and re-enable irqs. Jumping straight to an NMI might - * accidentally cause deadlocks with further shutdown/panic - * code. By syncing, we give the cpus up to one second to - * finish their work before we force them off with the NMI. - */ - if (num_online_cpus() > 1) { - /* did someone beat us here? */ - if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1) - return; - - /* sync above data before sending IRQ */ - wmb(); +void arch_smp_stop_call(cpumask_t *cpus, unsigned int attempt_num) +{ + static bool saved_wait; - apic_send_IPI_allbutself(REBOOT_VECTOR); + if (attempt_num == 1) { + /* + * We start by using the REBOOT_VECTOR irq. + * The irq is treated as a sync point to allow critical + * regions of code on other cpus to release their spin locks + * and re-enable irqs. Jumping straight to an NMI might + * accidentally cause deadlocks with further shutdown/panic + * code. By syncing, we give the cpus up to one second to + * finish their work before we force them off with the NMI. + */ /* - * Don't wait longer than a second for IPI completion. The - * wait request is not checked here because that would - * prevent an NMI shutdown attempt in case that not all - * CPUs reach shutdown state. + * Don't wait longer than a second for IPI completion. + * Wait forever request is explicitly disabled here because + * that would prevent an NMI shutdown attempt in case that + * not all CPUs reach shutdown state. */ - timeout = USEC_PER_SEC; - while (num_online_cpus() > 1 && timeout--) - udelay(1); - } + saved_wait = smp_stop_get_wait_settings(NULL); + smp_stop_set_wait_forever(false); + smp_stop_set_wait_timeout_us(USEC_PER_MSEC); + + /* Used by NMI handler callback to skip the stopping_cpu. 
*/ + atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()); + + /* sync above data before sending IRQ */ + wmb(); + apic->send_IPI_mask(cpus, REBOOT_VECTOR); + } else if (attempt_num > 1) { + /* if the REBOOT_VECTOR didn't work, try with the NMI */ + smp_stop_set_wait_forever(saved_wait); + /* Don't wait longer than 10 ms when not asked to wait */ + smp_stop_set_wait_timeout_us(USEC_PER_MSEC * 10); - /* if the REBOOT_VECTOR didn't work, try with the NMI */ - if (num_online_cpus() > 1) { /* * If NMI IPI is enabled, try to register the stop handler * and send the IPI. In any case try to wait for the other * CPUs to stop. */ if (!smp_no_nmi_ipi && !register_stop_handler()) { - /* Sync above data before sending IRQ */ + /* sync above data before sending IRQ */ wmb(); pr_emerg("Shutting down cpus with NMI\n"); - apic_send_IPI_allbutself(NMI_VECTOR); + apic->send_IPI_mask(cpus, NMI_VECTOR); } - /* - * Don't wait longer than 10 ms if the caller didn't - * reqeust it. If wait is true, the machine hangs here if - * one or more CPUs do not reach shutdown state. 
- */ - timeout = USEC_PER_MSEC * 10; - while (num_online_cpus() > 1 && (wait || timeout--)) - udelay(1); } +} + +void arch_smp_cpus_stop_complete(void) +{ + unsigned long flags; local_irq_save(flags); disable_local_APIC(); From patchwork Thu Dec 19 12:19:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303393 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7FA6D139A for ; Thu, 19 Dec 2019 12:22:28 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 499082068F for ; Thu, 19 Dec 2019 12:22:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="uOLOWw6V" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 499082068F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=guZnn+LeiPekvgiDQ0/QnXYP45R8lUqlzvLoXkJIg00=; b=uOLOWw6VJYw7xednBNfFdrjZ4X thLmS1LbvCKemFD4eAE0dh1CUITyB8OXDirgx1my6PJXYCkHQ9poqt7ccD5/gU2Uwus/Dxcnkugnl 
5wBbkugCN4yU5N1BcO1zdKMSTDHRadFlLC7+3+JRcpJFHwXNRtzlDWJPzEWT4tIaoPMRSpSruKMtO 2hpnNmy3AC6CNTBwn2AP0aMgEOwkO4b8zsuwgtbkCM0qy+zxqTetTJ6EetQ+fw6lsUh/UpXAS9w7e 3zFjCCQV3e3+wJviV2kN4BMhq4DTd96A7rKhhT7xD+zXVxS31HT+oowwmV76TLHbYrY6mSZ4BJf++ WPYor6kw==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihup0-0005jv-Dc; Thu, 19 Dec 2019 12:22:26 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumR-0002MO-BU for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:49 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A1341328; Thu, 19 Dec 2019 04:19:46 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 575023F719; Thu, 19 Dec 2019 04:19:44 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 09/12] x86: smp: use SMP crash-stop common code Date: Thu, 19 Dec 2019 12:19:02 +0000 Message-Id: <20191219121905.26905-10-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041947_513466_D522C54C X-CRM114-Status: GOOD ( 20.27 ) X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 
-0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Make x86 use the common SMP implementation of crash_smp_send_stop() and its generic logic by removing the x86 crash_smp_send_stop() definition and providing the needed arch-specific helpers.

Also remove the now redundant smp_ops.crash_stop_other_cpus() and add a shared utility function, do_nmi_shootdown_cpus(), which generalizes the previous nmi_shootdown_cpus() and is used by the architecture backend helper arch_smp_crash_call().

Modify crash_nmi_callback() to mark the CPU offline and add the needed memory barriers.

Modify nmi_shootdown_cpus() to rely on the common code logic provided by the generic crash_smp_send_stop(): this is needed because nmi_shootdown_cpus() is also used on the emergency reboot path, with a different callback. Reuse the same shootdown_callback mechanism to handle both a crash and an emergency reboot through the same common crash path.
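For illustration only, the way a single shootdown_callback can serve both the crash and the emergency reboot path may be modelled in user space roughly as below. This is a sketch, not the kernel implementation: every model_* name, installed_cb, nmi_sent, the two dummy callbacks and the CPU mask value are hypothetical. The first callback installed wins; a NULL cpu mask means "go through the common crash-stop path", which then re-enters with a real mask.

```c
#include <stddef.h>

typedef void (*shootdown_cb)(int cpu);

shootdown_cb installed_cb;	/* plays the role of 'shootdown_callback' */
int nmi_sent;			/* counts modelled NMI broadcasts */

void model_do_shootdown(const unsigned long *cpus, shootdown_cb cb);

/* Dummy callbacks standing in for the kdump and reboot handlers. */
void kdump_cb(int cpu) { (void)cpu; }
void reboot_cb(int cpu) { (void)cpu; }

/*
 * Models the common crash_smp_send_stop() calling back into the arch
 * helper with an explicit mask and the kdump callback.
 */
void model_crash_smp_send_stop(void)
{
	unsigned long others = 0xeUL;	/* hypothetical mask: CPUs 1-3 */

	model_do_shootdown(&others, kdump_cb);
}

void model_do_shootdown(const unsigned long *cpus, shootdown_cb cb)
{
	/* install the callback only if nobody did so before */
	if (!installed_cb)
		installed_cb = cb;

	if (!cpus) {
		/* no mask given: reuse the generic crash-stop logic */
		model_crash_smp_send_stop();
		return;
	}

	/* a real mask: this is where the NMI IPI would be sent */
	nmi_sent++;
}

/* Models nmi_shootdown_cpus() as used on the emergency reboot path. */
void model_nmi_shootdown_cpus(shootdown_cb cb)
{
	model_do_shootdown(NULL, cb);
}
```

In this model a reboot-path call ends up sending exactly one NMI round while keeping the reboot callback installed, even though the common path passes the kdump callback on re-entry.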
Signed-off-by: Cristian Marussi --- Note that in this patch we kept in use the original x86 local handling of 'crashing_cpu' variable: crashing_cpu = safe_smp_processor_id(); Instead, common SMP stop code could have been easily extended to keep and expose to architectures backends such value using some additional helper like smp_stop_get_stopping_cpu_id(). This has not been addressed in this series. v2 --> v3 - conflicts - simplified _shootdown_nmi_cpus calls --- arch/x86/include/asm/reboot.h | 2 ++ arch/x86/include/asm/smp.h | 1 - arch/x86/kernel/crash.c | 27 +++------------- arch/x86/kernel/reboot.c | 58 ++++++++++++++++++++++------------- arch/x86/kernel/smp.c | 3 -- 5 files changed, 43 insertions(+), 48 deletions(-) diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h index 04c17be9b5fd..bae3ecf84659 100644 --- a/arch/x86/include/asm/reboot.h +++ b/arch/x86/include/asm/reboot.h @@ -3,6 +3,7 @@ #define _ASM_X86_REBOOT_H #include +#include struct pt_regs; @@ -28,6 +29,7 @@ void __noreturn machine_real_restart(unsigned int type); typedef void (*nmi_shootdown_cb)(int, struct pt_regs*); void nmi_panic_self_stop(struct pt_regs *regs); void nmi_shootdown_cpus(nmi_shootdown_cb callback); +void do_nmi_shootdown_cpus(cpumask_t *cpus, nmi_shootdown_cb callback); void run_crash_ipi_callback(struct pt_regs *regs); #endif /* _ASM_X86_REBOOT_H */ diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h index e937fab6474b..22db383fc2d3 100644 --- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -49,7 +49,6 @@ struct smp_ops { void (*smp_cpus_done)(unsigned max_cpus); void (*stop_other_cpus)(int wait); - void (*crash_stop_other_cpus)(void); void (*smp_send_reschedule)(int cpu); int (*cpu_up)(unsigned cpu, struct task_struct *tidle); diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c index 00fc55ac7ffa..c311a70bcb76 100644 --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -111,34 +111,16 @@ static void 
kdump_nmi_callback(int cpu, struct pt_regs *regs) disable_local_APIC(); } -void kdump_nmi_shootdown_cpus(void) +void arch_smp_crash_call(cpumask_t *cpus, unsigned int __unused) { - nmi_shootdown_cpus(kdump_nmi_callback); - - disable_local_APIC(); + do_nmi_shootdown_cpus(cpus, kdump_nmi_callback); } -/* Override the weak function in kernel/panic.c */ -void crash_smp_send_stop(void) +void arch_smp_cpus_crash_complete(void) { - static int cpus_stopped; - - if (cpus_stopped) - return; - - if (smp_ops.crash_stop_other_cpus) - smp_ops.crash_stop_other_cpus(); - else - smp_send_stop(); - - cpus_stopped = 1; + disable_local_APIC(); } -#else -void crash_smp_send_stop(void) -{ - /* There are no cpus to shootdown */ -} #endif void native_machine_crash_shutdown(struct pt_regs *regs) @@ -154,6 +136,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs) /* The kernel is broken so disable interrupts */ local_irq_disable(); + /* calling into SMP common stop code */ crash_smp_send_stop(); /* diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index 0cc7c0b106bb..0d1bf44643e9 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -799,7 +799,6 @@ int crashing_cpu = -1; static nmi_shootdown_cb shootdown_callback; -static atomic_t waiting_for_crash_ipi; static int crash_ipi_issued; static int crash_nmi_callback(unsigned int val, struct pt_regs *regs) @@ -819,7 +818,12 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs) shootdown_callback(cpu, regs); - atomic_dec(&waiting_for_crash_ipi); + /* ensure all shootdown writes are visible once cpu is seen offline */ + smp_wmb(); + set_cpu_online(cpu, false); + /* ensure all writes are globally visible before cpu parks */ + wmb(); + /* Assume hlt works */ halt(); for (;;) @@ -829,23 +833,26 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs) } /* - * Halt all other CPUs, calling the specified function on each of them - * - * This function can be used to halt all 
other CPUs on crash - * or emergency reboot time. The function passed as parameter - * will be called inside a NMI handler on all CPUs. + * Halt the specified @cpus, calling the provided @callback on each of them + * unless a shootdown_callback was already installed previously: this way + * we can handle here also the emergency reboot requests issued via + * nmi_shootdown_cpus() and routed via usual common code crash_smp_send_stop() */ -void nmi_shootdown_cpus(nmi_shootdown_cb callback) +void do_nmi_shootdown_cpus(cpumask_t *cpus, nmi_shootdown_cb callback) { - unsigned long msecs; - local_irq_disable(); + if (!shootdown_callback) + shootdown_callback = callback; + + if (!cpus) { + /* ensure callback in place before calling common SMP */ + wmb(); + /* call into common SMP to reuse the generic logic */ + return crash_smp_send_stop(); + } + local_irq_disable(); /* Make a note of crashing cpu. Will be used in NMI callback. */ crashing_cpu = safe_smp_processor_id(); - - shootdown_callback = callback; - - atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); /* Would it be better to replace the trap vector here? */ if (register_nmi_handler(NMI_LOCAL, crash_nmi_callback, NMI_FLAG_FIRST, "crash")) @@ -855,21 +862,28 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback) * out the NMI */ wmb(); - - apic_send_IPI_allbutself(NMI_VECTOR); + apic->send_IPI_mask(cpus, NMI_VECTOR); /* Kick CPUs looping in NMI context. */ WRITE_ONCE(crash_ipi_issued, 1); - msecs = 1000; /* Wait at most a second for the other cpus to stop */ - while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { - mdelay(1); - msecs--; - } - /* Leave the nmi callback set */ } +/* + * Halt all other CPUs, calling the specified function on each of them + * + * This function can be used to halt all other CPUs on crash + * or emergency reboot time. The function passed as parameter + * will be called inside a NMI handler on all CPUs.
+ * + * It relies on crash_smp_send_stop() common code logic to shutdown CPUs. + */ +void nmi_shootdown_cpus(nmi_shootdown_cb callback) +{ + do_nmi_shootdown_cpus(NULL, callback); +} + /* * Check if the crash dumping IPI got issued and if so, call its callback * directly. This function is used when we have already been in NMI handler. diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 7aeb45c512f7..3bd93912898a 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -276,9 +276,6 @@ struct smp_ops smp_ops = { .smp_cpus_done = native_smp_cpus_done, .stop_other_cpus = native_stop_other_cpus, -#if defined(CONFIG_KEXEC_CORE) - .crash_stop_other_cpus = kdump_nmi_shootdown_cpus, -#endif .smp_send_reschedule = native_smp_send_reschedule, .cpu_up = native_cpu_up, From patchwork Thu Dec 19 12:19:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303395 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1CD0E109A for ; Thu, 19 Dec 2019 12:22:43 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E19FA2068F for ; Thu, 19 Dec 2019 12:22:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="cIBu85EP" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E19FA2068F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; 
c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=WpFUj9WrasLgsCWFf6qEZTk3+iSPP9j4/7yVDCwgHeQ=; b=cIBu85EPEMx6z8L9AlMtQspV2I IP3xYJvABVBVnmUGGNAPVm88QlkHGErfB3BUr55o9gNvCIiTSHIL2FUL2y+1zYXMWOqaNDnVDIzKK yiE/H/AWz/CV55M7Q69yy2BitwIY5gv8pSMyDuqrWqr96tdE8Z0GJrBMzhkJ+MLV+OzP2rLaLYSeg 5bGMkjMjC96p3+Ir1zE88toDDCdNMKMnBDC4kIM3jNP3FoGIFNvO2obsjRuU86wxZXy1PEfCxWEG4 EjoqeSxei03Ip48KAqANZDoIGZ6yBt+dF50EAlGkAKj3cnETGm+ufgWQVbQSTdTPMYf2XjM550hEx YE25oktQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihupE-000602-TU; Thu, 19 Dec 2019 12:22:40 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumT-0002OI-Ok for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:51 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2D23631B; Thu, 19 Dec 2019 04:19:49 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D60563F719; Thu, 19 Dec 2019 04:19:46 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 10/12] arm: smp: use generic SMP stop common code Date: Thu, 19 Dec 2019 12:19:03 +0000 Message-Id: <20191219121905.26905-11-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 
20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041949_982627_C43480CC X-CRM114-Status: GOOD ( 12.88 ) X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Make arm use the generic SMP-stop logic provided by the unified smp_send_stop() function in common code.
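For illustration only, the split between the common stop logic and the arch hook can be modelled in user space roughly as below. This is a sketch, not the common code from this series: all model_* names, MODEL_NR_CPUS and the stuck_cpu parameter are hypothetical. The common side builds the target mask and does the bounded wait, as the arm smp_send_stop() used to; the arch side only delivers the stop request.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for cpu_online_mask. */
bool model_online[MODEL_NR_CPUS] = { true, true, true, true };

int model_num_online(void)
{
	int n = 0;

	for (int i = 0; i < MODEL_NR_CPUS; i++)
		n += model_online[i];
	return n;
}

/*
 * Models the arch hook: "send" the stop IPI. Every targeted CPU except
 * stuck_cpu complies immediately and marks itself offline.
 */
void model_arch_stop_call(const bool *targets, int stuck_cpu)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (targets[i] && i != stuck_cpu)
			model_online[i] = false;
}

/*
 * Models the common side: target all online CPUs but the caller, invoke
 * the arch hook, then wait a bounded number of iterations for the others
 * to go offline. Returns true if only the caller is left online.
 */
bool model_smp_send_stop(int self, int stuck_cpu, unsigned long budget)
{
	bool targets[MODEL_NR_CPUS];
	bool any = false;

	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		targets[i] = model_online[i] && i != self;
		any |= targets[i];
	}

	if (any)
		model_arch_stop_call(targets, stuck_cpu);

	while (model_num_online() > 1 && budget--)
		;	/* the removed arm code used udelay(1) per iteration */

	return model_num_online() == 1;
}
```

Passing a stuck_cpu that never goes offline exercises the timeout branch, which corresponds to the "failed to stop secondary CPUs" warning in the code being removed.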
Signed-off-by: Cristian Marussi --- v2 --> v3 - conflicts - added missing barriers --- arch/arm/Kconfig | 1 + arch/arm/kernel/smp.c | 22 ++++++---------------- 2 files changed, 7 insertions(+), 16 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index ba75e3661a41..40f6961c449c 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -35,6 +35,7 @@ config ARM select ARCH_USE_CMPXCHG_LOCKREF select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU select ARCH_WANT_IPC_PARSE_VERSION + select ARCH_USE_COMMON_SMP_STOP select BINFMT_FLAT_ARGVP_ENVP_ON_STACK select BUILDTIME_EXTABLE_SORT if MMU select CLONE_BACKWARDS diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 46e1be9e57a8..f03e9bbf4116 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -598,6 +598,8 @@ static void ipi_cpu_stop(unsigned int cpu) } set_cpu_online(cpu, false); + /* ensure all writes are globally visible before cpu parks */ + wmb(); local_fiq_disable(); local_irq_disable(); @@ -705,23 +707,9 @@ void smp_send_reschedule(int cpu) smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE); } -void smp_send_stop(void) +void arch_smp_stop_call(cpumask_t *cpus, unsigned int __unused) { - unsigned long timeout; - struct cpumask mask; - - cpumask_copy(&mask, cpu_online_mask); - cpumask_clear_cpu(smp_processor_id(), &mask); - if (!cpumask_empty(&mask)) - smp_cross_call(&mask, IPI_CPU_STOP); - - /* Wait up to one second for other CPUs to stop */ - timeout = USEC_PER_SEC; - while (num_online_cpus() > 1 && timeout--) - udelay(1); - - if (num_online_cpus() > 1) - pr_warn("SMP: failed to stop secondary CPUs\n"); + smp_cross_call(cpus, IPI_CPU_STOP); } /* In case panic() and panic() called at the same time on CPU1 and CPU2, @@ -735,6 +723,8 @@ void panic_smp_self_stop(void) pr_debug("CPU %u will stop doing anything useful since another CPU has paniced\n", smp_processor_id()); set_cpu_online(smp_processor_id(), false); + /* ensure all writes visible before parking */ + wmb(); while (1) 
cpu_relax(); } From patchwork Thu Dec 19 12:19:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303397 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 400CA109A for ; Thu, 19 Dec 2019 12:22:57 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 18FCA2068F for ; Thu, 19 Dec 2019 12:22:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="GW1ESJpK" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 18FCA2068F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=RnAMkFLxa8MJW4qoh91vUTjeth1hjI5+pfs3fiZbnoI=; b=GW1ESJpKvxucth//cpLlr9UUFD cCRVKTqHnmxY9L338Y8SAGrxjPo1EAr0GtPPjzONRMzP6+cx7RhtxVpJTbZNCY3UnO9FDGKBSlPad eL50NjNEhTRyxxW5GPT+DZ0WoabY1byeevzuwkVxaVp+TPcbKnKAYIGe+ebkFjNcT+Z87JXRMZjp/ +hg802XuEehpwRJk+CXaZ2DcMTVp+Eg/6fgZGO++gaIYAtlWjXfNZJRABjN6T8LESdGoD+XbPXCnY Q1lHBFtsetidTb3s/WUJrH+om21omWP6CmmJfUNx+VAF4vWVZrvs9szBxvkgi7vBJegHoV8e9aPkp 
bVesUr1g==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihupS-0006F1-8a; Thu, 19 Dec 2019 12:22:54 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumW-0002R1-G8 for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:54 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id ACF31DA7; Thu, 19 Dec 2019 04:19:51 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6295E3F719; Thu, 19 Dec 2019 04:19:49 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 11/12] arm: smp: use SMP crash-stop common code Date: Thu, 19 Dec 2019 12:19:04 +0000 Message-Id: <20191219121905.26905-12-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041952_598235_78BF07BD X-CRM114-Status: GOOD ( 10.45 ) X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 
linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org Make arm use the SMP common implementation of crash_smp_send_stop() and its generic logic, by removing the arm crash_smp_send_stop() definition and providing the needed arch specific helpers. Signed-off-by: Cristian Marussi --- arch/arm/kernel/machine_kexec.c | 29 ++++++++--------------------- 1 file changed, 8 insertions(+), 21 deletions(-) diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c index 76300f3813e8..5289984e3941 100644 --- a/arch/arm/kernel/machine_kexec.c +++ b/arch/arm/kernel/machine_kexec.c @@ -29,8 +29,6 @@ extern unsigned long kexec_indirection_page; extern unsigned long kexec_mach_type; extern unsigned long kexec_boot_atags; -static atomic_t waiting_for_crash_ipi; - /* * Provide a dummy crash_notes definition while crash dump arrives to arm. * This prevents breakage of crash_notes attribute in kernel/ksysfs.c. 
@@ -89,34 +87,23 @@ void machine_crash_nonpanic_core(void *unused) crash_save_cpu(&regs, smp_processor_id()); flush_cache_all(); + /* ensure saved regs writes are visible before going offline */ + smp_wmb(); set_cpu_online(smp_processor_id(), false); - atomic_dec(&waiting_for_crash_ipi); + /* ensure all writes visible before parking */ + wmb(); while (1) { cpu_relax(); wfe(); } } -void crash_smp_send_stop(void) +void arch_smp_crash_call(cpumask_t *cpus, unsigned int __unused) { - static int cpus_stopped; - unsigned long msecs; - - if (cpus_stopped) - return; - - atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); - smp_call_function(machine_crash_nonpanic_core, NULL, false); - msecs = 1000; /* Wait at most a second for the other cpus to stop */ - while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { - mdelay(1); - msecs--; - } - if (atomic_read(&waiting_for_crash_ipi) > 0) - pr_warn("Non-crashing CPUs did not react to IPI\n"); - - cpus_stopped = 1; + preempt_disable(); + smp_call_function_many(cpus, machine_crash_nonpanic_core, NULL, false); + preempt_enable(); } static void machine_kexec_mask_interrupts(void) From patchwork Thu Dec 19 12:19:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 11303399 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0C14E139A for ; Thu, 19 Dec 2019 12:23:11 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B87C12068F for ; Thu, 19 Dec 2019 12:23:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="KEkQOXBI" DMARC-Filter: 
OpenDMARC Filter v1.3.2 mail.kernel.org B87C12068F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References: In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=hsEnND9AEuCoJtVtmbASDexYleSRm9qRMbVy/2NpiNM=; b=KEkQOXBIsZTRjVFJWsVN2eZC3A O19K2KF+i2tvG2W01G4FSxP96V2l5CjbPbVd6fTvcpl7y/woI5+8PSWgEftBCE5M9xrpMMSvVob4w km955IdlpAq0c74OPCJbYwaedBeqQq9KyDE2B93DSKNcettv8kTmnQST/Yu0yiQUGkO+wURDk3lN8 iIPSdgdwz87FAkCZ15QFaN20LmDeYvVjzAFWhUVEvT1AKQk61cl0a86agJglFA/1RrxUjO5xkaj4s mFsN37eVTDmiFpwbOsj6PO8YOFCEViPrmmeNNvfLJpd8hFb1n+GZ52EVFoewqfK+lrQ5CADmt6+XO V27DgBxw==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihuph-0006Wj-88; Thu, 19 Dec 2019 12:23:09 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1ihumY-0002TC-Lh for linux-arm-kernel@lists.infradead.org; Thu, 19 Dec 2019 12:19:56 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 380E931B; Thu, 19 Dec 2019 04:19:54 -0800 (PST) Received: from e120937-lin.cambridge.arm.com (e120937-lin.cambridge.arm.com [10.1.197.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E1BD33F719; Thu, 19 Dec 2019 04:19:51 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 12/12] 
sparc64: smp: use generic SMP stop common code Date: Thu, 19 Dec 2019 12:19:05 +0000 Message-Id: <20191219121905.26905-13-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191219121905.26905-1-cristian.marussi@arm.com> References: <20191219121905.26905-1-cristian.marussi@arm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20191219_041954_786736_620EF912 X-CRM114-Status: GOOD ( 10.18 ) X-Spam-Score: 0.0 (/) X-Spam-Report: SpamAssassin version 3.4.2 on bombadil.infradead.org summary: Content analysis details: (0.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [217.140.110.172 listed in list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, mark.rutland@arm.com, sparclinux@vger.kernel.org, dzickus@redhat.com, ehabkost@redhat.com, peterz@infradead.org, catalin.marinas@arm.com, x86@kernel.org, linux@armlinux.org.uk, hch@infradead.org, takahiro.akashi@linaro.org, mingo@redhat.com, james.morse@arm.com, hidehiro.kawai.ez@hitachi.com, tglx@linutronix.de, will@kernel.org, davem@davemloft.net, linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org Make sparc64 use the generic SMP-stop logic provided by common code unified smp_send_stop() function. 
Signed-off-by: Cristian Marussi --- arch/sparc/Kconfig | 1 + arch/sparc/kernel/smp_64.c | 15 ++++++++------- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index eb24cb1afc11..9fedb79209d7 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -95,6 +95,7 @@ config SPARC64 select ARCH_HAS_PTE_SPECIAL select PCI_DOMAINS if PCI select ARCH_HAS_GIGANTIC_PAGE + select ARCH_USE_COMMON_SMP_STOP config ARCH_DEFCONFIG string diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c index 9b4506373353..87a9f3c96193 100644 --- a/arch/sparc/kernel/smp_64.c +++ b/arch/sparc/kernel/smp_64.c @@ -1537,7 +1537,12 @@ static void stop_this_cpu(void *dummy) prom_stopself(); } -void smp_send_stop(void) +void arch_smp_cpus_stop_complete(void) +{ + smp_call_function(stop_this_cpu, NULL, 0); +} + +void arch_smp_stop_call(cpumask_t *cpus, unsigned int __unused) { int cpu; @@ -1546,10 +1551,7 @@ void smp_send_stop(void) #ifdef CONFIG_SERIAL_SUNHV sunhv_migrate_hvcons_irq(this_cpu); #endif - for_each_online_cpu(cpu) { - if (cpu == this_cpu) - continue; - + for_each_cpu(cpu, cpus) { set_cpu_online(cpu, false); #ifdef CONFIG_SUN_LDOMS if (ldom_domaining_enabled) { @@ -1562,8 +1564,7 @@ void smp_send_stop(void) #endif prom_stopcpu_cpuid(cpu); } - } else - smp_call_function(stop_this_cpu, NULL, 0); + } } /**