From patchwork Mon Aug 22 02:15:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pingfan Liu X-Patchwork-Id: 12950115 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 088FBC32774 for ; Mon, 22 Aug 2022 02:16:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=KmHNseDjdwFDYiGXBhKIZte0vWi2G/+kiGRK/hp9jGI=; b=I7D2ZatKFCIKby ITivSBvOoyJsYnzZ3a13kERZlsc1AQnhDCJrrG2GBkABJe7G5CX+YtePYp6/39o59t/TP0JQGJnf6 mkrcSsLR4c2LOCrOjWyJWmwE3BuonCuCo5h4GNBbrlhtaTHWXqFVkuZbsxOumvwo9lhVlHptL1EOm Gz9O8fFwbQc/8/UXfiS617gGlJVfYUu5MXu/mu10gF+P4ucG7NdRMfFoOxKeduXenGiFiRMRGIBJh 24u960Wy9ydeyj5hY0WKg8Tiu06wA70JwbCIUYUIKwWVbMIwDQ3kDJVAulCkgXzkOnUHkZSIS/Nni ZGymAAFpLb2ey7+QFBzA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oPwz5-003lD5-Bs; Mon, 22 Aug 2022 02:16:11 +0000 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oPwyj-003kyB-Vv; Mon, 22 Aug 2022 02:15:53 +0000 Received: by mail-pf1-x434.google.com with SMTP id g129so5704204pfb.8; Sun, 21 Aug 2022 19:15:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc; bh=1iHPy2MJm6s7J5XdJO/siHoKhzaKKWD/EXtuZwe5/AE=; b=mKi9In1cguwoEOIvsqd6r243Uwmb97pem6sMmVbQ/lX7DKmHGQ4Cz8u7NaUAMolab3 HmdgTa5cqnb+GfgxLAPvRC2VcgnhWhU/izaXxri5eH+EgDoV+6M0er9uK9apm/9k6Wa+ O6vWMuPgbVirYyTOwBORRtmkCs2pdpI1E3AlnfMY1g6INcFNsiY5OxCc+eK21YnW41W/ 9gNulkHUPIkdHnMs5921OdcrPwHgoXozCfdiUiIdwYospQ2cvraEremg4xnjuFFKYkFu gDjp9ye/aSfYYNx/jRfdI/eCYmQuHZst3at0UYKdKX8Agaty19wI93E6qRscBDPMtYU4 QtCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=1iHPy2MJm6s7J5XdJO/siHoKhzaKKWD/EXtuZwe5/AE=; b=Ja8bB+KQCCYjXE7jRa6zXBbozm4Hqybi3eNMKGNAcyVeBDNJA2xI96RKzKbb3F26Z6 MXtt4OgYajAhtP9ABGE42ciec4CxHuK/h4OOqezXpg5USSOuJlNO3x36KdBtwsxZ8lpl JOlp/RMsK7oo2mej6wtGk4+BEpWHxiczZi9/TXQWTvqseEOVxcM3JUD8J6IE7V4nJhoq XCjGz6OLIGF+ynuBqAYHIYD0bfXyng5I5npf3XxAsBNLkbshlcnfnaPMYV8r9x7pMV0y qrZhZ2UPAe+JmdZiI+d7R589l7r78WfLrXnr65uZbPzKwNV4GvD3tUBxIP37sBi7VvdR jxog== X-Gm-Message-State: ACgBeo1Ov5y8fKiZDt/r0gy0tMHhBq8uFsvxr3gEGXgftemZ0m/blJDa 2rUsVlShWTUvlTZS/NbRlpV2znyIYg== X-Google-Smtp-Source: AA6agR5RhOvnr+n8C+QckGvAPfLQPD+H4Z9Kyq6FS5FDBt09E1rsRMXpjGgfl8V7Hzvo18Hl1aKTcg== X-Received: by 2002:a63:ce06:0:b0:41d:dcc3:aa6e with SMTP id y6-20020a63ce06000000b0041ddcc3aa6emr15448270pgf.251.1661134548425; Sun, 21 Aug 2022 19:15:48 -0700 (PDT) Received: from piliu.users.ipa.redhat.com ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id k3-20020aa79723000000b005321340753fsm7312139pfg.103.2022.08.21.19.15.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 21 Aug 2022 19:15:47 -0700 (PDT) From: Pingfan Liu To: linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Pingfan Liu , Russell King , Catalin Marinas , Will Deacon , Paul 
Walmsley , Palmer Dabbelt , Albert Ou , Peter Zijlstra , "Eric W. Biederman" , Mark Rutland , Marco Elver , Masami Hiramatsu , Dan Li , Song Liu , Sami Tolvanen , Arnd Bergmann , Linus Walleij , Ard Biesheuvel , Tony Lindgren , Nick Hawkins , John Crispin , Geert Uytterhoeven , Andrew Morton , Bjorn Andersson , Anshuman Khandual , Thomas Gleixner , Steven Price Subject: [RFC 02/10] cpu/hotplug: Compile smp_shutdown_nonboot_cpus() conditioned on CONFIG_SHUTDOWN_NONBOOT_CPUS Date: Mon, 22 Aug 2022 10:15:12 +0800 Message-Id: <20220822021520.6996-3-kernelfans@gmail.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220822021520.6996-1-kernelfans@gmail.com> References: <20220822021520.6996-1-kernelfans@gmail.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220821_191551_208472_6A5C01EA X-CRM114-Status: GOOD ( 15.17 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org

Only arm, arm64, ia64 and riscv use smp_shutdown_nonboot_cpus(), so compile this code only when CONFIG_SHUTDOWN_NONBOOT_CPUS is set. Later in this series, the same macro will also guard the quick kexec reboot code.

Signed-off-by: Pingfan Liu Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Peter Zijlstra Cc: "Eric W. 
Biederman" Cc: Mark Rutland Cc: Marco Elver Cc: Masami Hiramatsu Cc: Dan Li Cc: Song Liu Cc: Sami Tolvanen Cc: Arnd Bergmann Cc: Linus Walleij Cc: Ard Biesheuvel Cc: Tony Lindgren Cc: Nick Hawkins Cc: John Crispin Cc: Geert Uytterhoeven Cc: Andrew Morton Cc: Bjorn Andersson Cc: Anshuman Khandual Cc: Thomas Gleixner Cc: Steven Price To: linux-arm-kernel@lists.infradead.org To: linux-ia64@vger.kernel.org To: linux-riscv@lists.infradead.org To: linux-kernel@vger.kernel.org --- arch/Kconfig | 4 ++++ arch/arm/Kconfig | 1 + arch/arm64/Kconfig | 1 + arch/ia64/Kconfig | 1 + arch/riscv/Kconfig | 1 + kernel/cpu.c | 3 +++ 6 files changed, 11 insertions(+) diff --git a/arch/Kconfig b/arch/Kconfig index f330410da63a..be447537d0f6 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -14,6 +14,10 @@ menu "General architecture-dependent options" config CRASH_CORE bool +config SHUTDOWN_NONBOOT_CPUS + select KEXEC_CORE + bool + config KEXEC_CORE select CRASH_CORE bool diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 87badeae3181..711cfdb4f9f4 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -129,6 +129,7 @@ config ARM select PCI_SYSCALL if PCI select PERF_USE_VMALLOC select RTC_LIB + select SHUTDOWN_NONBOOT_CPUS select SYS_SUPPORTS_APM_EMULATION select THREAD_INFO_IN_TASK select HAVE_ARCH_VMAP_STACK if MMU && ARM_HAS_GROUP_RELOCS diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 571cc234d0b3..8c481a0b1829 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -223,6 +223,7 @@ config ARM64 select PCI_SYSCALL if PCI select POWER_RESET select POWER_SUPPLY + select SHUTDOWN_NONBOOT_CPUS select SPARSE_IRQ select SWIOTLB select SYSCTL_EXCEPTION_TRACE diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 26ac8ea15a9e..8a3ddea97d1b 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -52,6 +52,7 @@ config IA64 select ARCH_CLOCKSOURCE_DATA select GENERIC_TIME_VSYSCALL select LEGACY_TIMER_TICK + select SHUTDOWN_NONBOOT_CPUS select SWIOTLB select 
SYSCTL_ARCH_UNALIGN_NO_WARN select HAVE_MOD_ARCH_SPECIFIC diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index ed66c31e4655..02606a48c5ea 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -120,6 +120,7 @@ config RISCV select PCI_MSI if PCI select RISCV_INTC select RISCV_TIMER if RISCV_SBI + select SHUTDOWN_NONBOOT_CPUS select SPARSE_IRQ select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK diff --git a/kernel/cpu.c b/kernel/cpu.c index 338e1d426c7e..2be6ba811a01 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1258,6 +1258,8 @@ int remove_cpu(unsigned int cpu) } EXPORT_SYMBOL_GPL(remove_cpu); +#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS + void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) { unsigned int cpu; @@ -1299,6 +1301,7 @@ void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) cpu_maps_update_done(); } +#endif #else #define takedown_cpu NULL From patchwork Mon Aug 22 02:15:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pingfan Liu X-Patchwork-Id: 12950116 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DCDB9C28D13 for ; Mon, 22 Aug 2022 02:16:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=72EFHkZqhoyq3/NszQCeWfcdgWNyDe115PeMYP8PlqY=; b=zjR/uYmFmx8aMZ 
l3jvg6DcAIBfRvEAwREg3wR7SC0s7JIbSfpvsQh/3DtdXheVuSatYbQgybzwaypeEXoHkfFQ6pvBP AABWY2To/Rr5khT7UyYOgvezksGyNNxZYuPjhybKmkqQRjtTVelibj0neFopsgPk25Z2kSJs44Eba U9/d8XUUv94OGQFNj0KYyQdAY75Ggb+FArV0mGXOT/RoU+IwdjkmAJ2ENOS/2/9PjGoB7a31yamNm BLdE397n2BvWXm1iNnHBZfx3p02nO5YN7AvCQleNhaLSC+XXJl69W8/Yqi5GNSu97oxRHAjbTxErV 0sTK++1FPfSSthIXoESw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oPwzJ-003lQD-Hi; Mon, 22 Aug 2022 02:16:25 +0000 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oPwyr-003kyB-0B; Mon, 22 Aug 2022 02:15:59 +0000 Received: by mail-pf1-x434.google.com with SMTP id g129so5704368pfb.8; Sun, 21 Aug 2022 19:15:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc; bh=WLszNFH6rAx833ludRXMY401Pm+H+RTV6I0FYUgHTg0=; b=SqqVBMOnC7CzBii1m0ocTxbCUywgAQHCwLfgqx83GK4H1dtDbCF9O6MIHnUWaRicTM 26QmKxmEBTyB+zRhgsS02GmtOBoSUG3n9vNLTx/oVFe/iLv2YXG5DBfeJVvp3N98Nqw3 QrOZY8/GZMChwXBy246ZcaxGqyefNTj4lrsFOg7n9KSwMyGWA/PJun+O3rN7k6IKejoS W0WGG2sDMSITPd9l5SFj5FDjiB8VxiEy3h01jTkhf4lwHS2n4jaEtlm95zcq47ioODO/ +xzMNd/XLGxseTgLin18K/K9szEkCq6SJSsPsuypa2ZcYXvVOjiUTEtup+0S+TuAjbog PD6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=WLszNFH6rAx833ludRXMY401Pm+H+RTV6I0FYUgHTg0=; b=PNUfZZ0eKvKrt4bKFblF0yH6ak5G7DUDpVVhVR32qRdWaxALUCFwpLkjqPf1f6Vd1n iOal/26t+lhBQmX0r/MCnZk7WyVAmq9S0yYqzhSGhbnDh99TGL8NfbZv36Ct4TCTPkt+ YYDKvWy+U4b1UjrGMCPQqRIYr1cRDD+OcmVPZJApKcBAHaoCHH0+yNrYy71w0XYeD8Fa CBzAX+GLSpV7Hb2/ZZVw1FuZAA1veaFN4ALMTOK2k/ciY4dmIMrknhgbBWhOWq4Agx+2 
sQCPQZjTpkUa1v0ZrvUEGCGB5zb9WTtatyQMKLXoi9GRMEU1bZYMFmNhkXGFZG6f/B1w e/mw== X-Gm-Message-State: ACgBeo34Vd51k/qTedY0+up6X6tiPHIMOrTo4k6lJZYL4plI8/vXdyBE TzR6xGeVK8fpK1LNYMQJp0BYSUVgDw== X-Google-Smtp-Source: AA6agR6OqW8j4jhaYEyPSXqhTTqweo/eD8PzrvQ6+13CJ2uOIZ0noYxOQmPKYGKHiep+6to1lt7J9Q== X-Received: by 2002:a05:6a00:23c1:b0:536:463e:e53b with SMTP id g1-20020a056a0023c100b00536463ee53bmr9714834pfc.43.1661134556164; Sun, 21 Aug 2022 19:15:56 -0700 (PDT) Received: from piliu.users.ipa.redhat.com ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id k3-20020aa79723000000b005321340753fsm7312139pfg.103.2022.08.21.19.15.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 21 Aug 2022 19:15:55 -0700 (PDT) From: Pingfan Liu To: linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Pingfan Liu , Thomas Gleixner , Steven Price , Kuppuswamy Sathyanarayanan , "Jason A. Donenfeld" , Frederic Weisbecker , Russell King , Catalin Marinas , Will Deacon , Paul Walmsley , Palmer Dabbelt , Albert Ou , Peter Zijlstra , "Eric W. 
Biederman" Subject: [RFC 03/10] cpu/hotplug: Introduce fast kexec reboot Date: Mon, 22 Aug 2022 10:15:13 +0800 Message-Id: <20220822021520.6996-4-kernelfans@gmail.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220822021520.6996-1-kernelfans@gmail.com> References: <20220822021520.6996-1-kernelfans@gmail.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220821_191557_166431_EB12EB48 X-CRM114-Status: GOOD ( 34.54 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org

*** Current situation: 'slow kexec reboot' ***

At present, some architectures rely on smp_shutdown_nonboot_cpus() to implement "kexec -e". Since smp_shutdown_nonboot_cpus() tears down the cpus serially, it is very slow.

On closer inspection, cpu_down() on a single cpu can be roughly divided into two stages:
  -1. from CPUHP_ONLINE to CPUHP_TEARDOWN_CPU
  -2. from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD, which is done by stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu)); and runs on the cpu being torn down.

If these stages can run in parallel across cpus, the reboot can be sped up. That is the aim of this patch.

*** Contrast with other implementations ***

X86 and PowerPC have their own machine_shutdown(), which does not rely on the cpu hot-removing mechanism. They just identify some critical components and tear them down in a per-cpu NMI handler during the kexec reboot. But for some architectures, arm64 for example, it is not easy to define these critical components because chipmakers' implementations vary. As a result, sticking to the cpu hot-removing mechanism is the simplest way to implement the parallel teardown.
It also opens up the possibility of implementing cpu_down() in parallel in the future (not done by this series).

*** Things worthy of consideration ***

1. The definition of a clean boundary between the first kernel and the new kernel
  -1.1 firmware
       The firmware's internal state should be brought into a proper state. This is achieved by the teardown interface of the firmware's cpuhp_step, if any.
  -1.2 CPU internal
       Whether the cache or PMU needs a clean shutdown before rebooting.

2. The dependency of each cpuhp_step
   The boundary of a clean cut involves only a few cpuhp_steps, but the requirement may propagate to other cpuhp_steps through their dependencies. This series does not try to work out the dependencies; instead, it just iterates down through each cpuhp_step. This strategy demands that each cpuhp_step's teardown interface supports running in parallel.

*** Solution ***

Ideally, if the interface _cpu_down() could be enhanced to run in parallel, the fast reboot could be achieved.

But revisiting the two stages of the current cpu_down() process, the second stage, stop_machine_cpuslocked(), is a blocker. Packed inside _cpu_down(), stop_machine_cpuslocked() only allows one cpu to execute the teardown.

So this patch breaks down the process of _cpu_down() and divides the teardown into three steps. The exposed stop_machine_cpuslocked() can then be used to support parallel teardown.
1. Bring each AP from CPUHP_ONLINE to CPUHP_TEARDOWN_CPU in parallel.
2. Sync on the BP to wait for all APs to enter the CPUHP_TEARDOWN_CPU state.
3. Bring each AP from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD through stop_machine_cpuslocked() in parallel.

Step 2 is introduced to satisfy the condition under which stop_machine_cpuslocked() can start on each cpu.

The remaining issue is how to support parallelism in steps 1 and 3. Fortunately, each subsystem has its own carefully designed locking mechanism. In each cpuhp_step teardown interface, following the subsystem's locking rules will make things work.
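The three steps above can be sketched as a small user-space model. This is only an illustration under stated assumptions: it is single-threaded, so "in parallel" is modelled as "kick every AP before waiting on any of them", and every name in it (sim_state, sim_cpu, sim_kick_async, sim_wait_for_ap, sim_shutdown_nonboot) is hypothetical and not part of the kernel or of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for three of the kernel's cpuhp states. */
enum sim_state {
	SIM_IDLE_DEAD,		/* models CPUHP_AP_IDLE_DEAD */
	SIM_TEARDOWN_CPU,	/* models CPUHP_TEARDOWN_CPU */
	SIM_ONLINE,		/* models CPUHP_ONLINE */
};

struct sim_cpu {
	enum sim_state state;
	enum sim_state target;
	bool work_pending;	/* set by a kick, cleared when the work is done */
};

/* Step 1, first half: record the target and return without waiting. */
static void sim_kick_async(struct sim_cpu *c, enum sim_state target)
{
	c->target = target;
	c->work_pending = true;
}

/* Stand-in for the AP finishing its queued work and the BP observing it. */
static void sim_wait_for_ap(struct sim_cpu *c)
{
	if (c->work_pending) {
		c->state = c->target;
		c->work_pending = false;
	}
}

/* Returns how many non-primary CPUs ended in SIM_IDLE_DEAD. */
static size_t sim_shutdown_nonboot(struct sim_cpu *cpus, size_t n, size_t primary)
{
	size_t i, dead = 0;

	/* Step 1: kick every AP first, so no AP waits on another. */
	for (i = 0; i < n; i++)
		if (i != primary && cpus[i].state > SIM_TEARDOWN_CPU)
			sim_kick_async(&cpus[i], SIM_TEARDOWN_CPU);

	/* Step 2: the BP syncs until every AP has finished step 1. */
	for (i = 0; i < n; i++)
		if (i != primary)
			sim_wait_for_ap(&cpus[i]);

	/* Step 3: one pass over the whole set, like a single stop_machine call. */
	for (i = 0; i < n; i++) {
		if (i == primary || cpus[i].state != SIM_TEARDOWN_CPU)
			continue;	/* no rollback: skip anything that did not arrive */
		cpus[i].state = SIM_IDLE_DEAD;
		dead++;
	}
	return dead;
}
```

The point of the split is visible in the loop structure: the serial path would run kick, wait and takedown for one cpu before touching the next, whereas here each phase covers all APs before the next phase starts.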
*** No rollback on failure ***

During kexec reboot, the devices have already been shut down, so there is no way for the system to roll back to a workable state. Therefore this series does not consider rollback either.

Signed-off-by: Pingfan Liu Cc: Thomas Gleixner Cc: Steven Price Cc: Kuppuswamy Sathyanarayanan Cc: "Jason A. Donenfeld" Cc: Frederic Weisbecker Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Peter Zijlstra Cc: "Eric W. Biederman" To: linux-arm-kernel@lists.infradead.org To: linux-ia64@vger.kernel.org To: linux-riscv@lists.infradead.org To: linux-kernel@vger.kernel.org --- kernel/cpu.c | 139 +++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 129 insertions(+), 10 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 2be6ba811a01..94ab2727d6bb 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1260,10 +1260,125 @@ EXPORT_SYMBOL_GPL(remove_cpu); #ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS -void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) +/* + * Push all of cpus to the state CPUHP_AP_ONLINE_IDLE. + * Since kexec-reboot has already shut down all devices, there is no way to + * roll back, the cpus' teardown also requires no rollback, instead, just throw + * warning. + */ +static void cpus_down_no_rollback(struct cpumask *cpus) { + struct cpuhp_cpu_state *st; unsigned int cpu; + + /* launch ap work one by one, but not wait for completion */ + for_each_cpu(cpu, cpus) { + st = per_cpu_ptr(&cpuhp_state, cpu); + /* + * If the current CPU state is in the range of the AP hotplug thread, + * then we need to kick the thread. + */ + if (st->state > CPUHP_TEARDOWN_CPU) { + cpuhp_set_state(cpu, st, CPUHP_TEARDOWN_CPU); + /* In order to parallel, async. 
And there is no way to rollback */ + cpuhp_kick_ap_work_async(cpu); + } + } + + /* wait for all ap work completion */ + for_each_cpu(cpu, cpus) { + st = per_cpu_ptr(&cpuhp_state, cpu); + wait_for_ap_thread(st, st->bringup); + if (st->result) + pr_warn("cpu %u refuses to offline due to %d\n", cpu, st->result); + else if (st->state > CPUHP_TEARDOWN_CPU) + pr_warn("cpu %u refuses to offline, state: %d\n", cpu, st->state); + } +} + +static int __takedown_cpu_cleanup(unsigned int cpu) +{ + struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu); + + /* + * The teardown callback for CPUHP_AP_SCHED_STARTING will have removed + * all runnable tasks from the CPU, there's only the idle task left now + * that the migration thread is done doing the stop_machine thing. + * + * Wait for the stop thread to go away. + */ + wait_for_ap_thread(st, false); + BUG_ON(st->state != CPUHP_AP_IDLE_DEAD); + + hotplug_cpu__broadcast_tick_pull(cpu); + /* This actually kills the CPU. */ + __cpu_die(cpu); + + tick_cleanup_dead_cpu(cpu); + rcutree_migrate_callbacks(cpu); + return 0; +} + +/* + * There is a sync that all ap threads are done before calling this func. 
+ */ +static void takedown_cpus_no_rollback(struct cpumask *cpus) +{ + struct cpuhp_cpu_state *st; + unsigned int cpu; + + for_each_cpu(cpu, cpus) { + st = per_cpu_ptr(&cpuhp_state, cpu); + WARN_ON(st->state != CPUHP_TEARDOWN_CPU); + /* No invoke to takedown_cpu(), so set the state manually */ + st->state = CPUHP_AP_ONLINE; + cpuhp_set_state(cpu, st, CPUHP_AP_OFFLINE); + } + + irq_lock_sparse(); + /* ask stopper kthreads to execute take_cpu_down() in parallel */ + stop_machine_cpuslocked(take_cpu_down, NULL, cpus); + + /* Finally wait for completion and clean up */ + for_each_cpu(cpu, cpus) + __takedown_cpu_cleanup(cpu); + irq_unlock_sparse(); +} + +static bool check_quick_reboot(void) +{ + return false; +} + +static struct cpumask kexec_ap_map; + +void smp_shutdown_nonboot_cpus_quick_path(unsigned int primary_cpu) +{ + struct cpumask *cpus = &kexec_ap_map; + /* + * To prevent other subsystem from access to __cpu_online_mask, but internally, + * __cpu_disable() accesses the bitmap in parallel and needs its own local lock. + */ + cpus_write_lock(); + + cpumask_copy(cpus, cpu_online_mask); + cpumask_clear_cpu(primary_cpu, cpus); + cpus_down_no_rollback(cpus); + takedown_cpus_no_rollback(cpus); + /* + * For some subsystems, there are still remains for offline cpus from + * CPUHP_BRINGUP_CPU to CPUHP_OFFLINE. But since none of them interact + * with hardware or firmware, they have no effect on the new kernel. 
+ * So skipping the cpuhp callbacks in that range + */ + + cpus_write_unlock(); +} + +void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) +{ int error; + unsigned int cpu; cpu_maps_update_begin(); @@ -1275,15 +1390,19 @@ void smp_shutdown_nonboot_cpus(unsigned int primary_cpu) if (!cpu_online(primary_cpu)) primary_cpu = cpumask_first(cpu_online_mask); - for_each_online_cpu(cpu) { - if (cpu == primary_cpu) - continue; - - error = cpu_down_maps_locked(cpu, CPUHP_OFFLINE); - if (error) { - pr_err("Failed to offline CPU%d - error=%d", - cpu, error); - break; + if (check_quick_reboot()) { + smp_shutdown_nonboot_cpus_quick_path(primary_cpu); + } else { + for_each_online_cpu(cpu) { + if (cpu == primary_cpu) + continue; + + error = cpu_down_maps_locked(cpu, CPUHP_OFFLINE); + if (error) { + pr_err("Failed to offline CPU%d - error=%d", + cpu, error); + break; + } } } From patchwork Mon Aug 22 02:15:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pingfan Liu X-Patchwork-Id: 12950128 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 84B12C00140 for ; Mon, 22 Aug 2022 02:17:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=1AWXSSt+gXkhyddkp8d9+rBxeGlE1Jv0hcc9SKNVKA0=; b=W7iDSxfhWLsOo3 
jrqODvQjgbt0daOykM8gZth7IUvZSS4TDcKWwP/aeXshZGhUE8eFoX78Y3Dwpko+GMRvVAr5FZm5L nCuHj9uXPeIf0ql3kQuDFlLEItOiVLXuBxwkLKR1fAY/blCjFTTsXoiH/TgbO0T90aEVrznzYU+Yl E5ICf03qh3Etud6P0LMCGxNMbA8/Ofs09Y6yzDO5RsB2D/wDjgdHrPNj0sxbCQtZvlm3mLrrvmeOp 3L3M/N1MOkfwDGHTqqt4eWnCTH1FazsgKuZWoT8ngBWrTVv7ryTkVyB/sAr6C4Qp+IzfjlrHbrRP9 TkRo599MKKc2HL3hACnQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oPwzo-003ls7-1h; Mon, 22 Aug 2022 02:16:56 +0000 Received: from mail-pg1-x535.google.com ([2607:f8b0:4864:20::535]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oPwz1-003l8x-Js; Mon, 22 Aug 2022 02:16:10 +0000 Received: by mail-pg1-x535.google.com with SMTP id v4so8164299pgi.10; Sun, 21 Aug 2022 19:16:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc; bh=xSWHQT0Gh+ASwqDbuftZ9hOwt+7ZpIdbcTfHZLJUjew=; b=drdEGnn6bSG1bQXrr6SCO9vL8sEJJnSyNh0s4SAbgA50zj3osmvGAr+RilNPSGW7tp KAfFuc5fnb/IN3BmLw8Lx1KUVaemwzWMqg+7fGeE+qpb2RuR+Kib91bE6zeP6pdAU+Iv rHPp+ANtA230uW3x89/bKkqo7TORCuWAHvqz4rYuwfwa0RHsGIPHziXZEfrA5sbdna54 tQSSr4N/VaS9RF7zANbJqRDu+IaNTN4xu3gRYTJ/+K+ZA/pY1IFky9Y0+JvCObZ18NYD vzacLfpll9SGbrBMx+pDubr0qCbwovinLpHj7a49H5uO4QvUDuMrSrN+nvwfBoxvGTK5 Cm8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=xSWHQT0Gh+ASwqDbuftZ9hOwt+7ZpIdbcTfHZLJUjew=; b=FXghNO9xmt4HbVo97uNW8okrVNi4qRWz5Z/u6YeiPyAS7cdcmphJYtg0s1oc8aLj3e al8QVK/zMGufaj5qgUrBozJmM3NY40hVKuZZw/POcZmnz//8Ug1avALF8QcePU5Wu71u Xg9gvYDxSt1nvaVFACMFSSlUxkVw3i2IB4FwDy5DJaRlyLVV3yJ71H6VlkCU5inVDecP gmIbPArTeOoHN3h1znQ/ZEcqLT3oRvJuiBMpBaT7Wihpws7dAehhkNvo8yP1mevxTPSL 
eEAnsSiDKn6OOduFSDsmaSnPSu/qVdbTQy20fio54gjZt9+HfFfMwHPyYcfGesUBFd57 DwWQ== X-Gm-Message-State: ACgBeo2TMLBXneFMQxUuN1jVH2vwZYnN9O25RaL1MparAucmhKuoZS4f m7uYmAimvA48aILK6A8y6+Jr3k7YQQ== X-Google-Smtp-Source: AA6agR7KljxZ9kbPwullzRTYyS9R74MZRULUzSQZWPtq0VQwcgMNVf0XgsbzpqwnU5ffstdl5YcXvw== X-Received: by 2002:a63:c1f:0:b0:41a:9b73:a89e with SMTP id b31-20020a630c1f000000b0041a9b73a89emr14967609pgl.342.1661134566121; Sun, 21 Aug 2022 19:16:06 -0700 (PDT) Received: from piliu.users.ipa.redhat.com ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id k3-20020aa79723000000b005321340753fsm7312139pfg.103.2022.08.21.19.15.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 21 Aug 2022 19:16:05 -0700 (PDT) From: Pingfan Liu To: linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Pingfan Liu , Thomas Gleixner , Steven Price , Kuppuswamy Sathyanarayanan , "Jason A. Donenfeld" , Frederic Weisbecker , Russell King , Catalin Marinas , Will Deacon , Paul Walmsley , Palmer Dabbelt , Albert Ou , Peter Zijlstra , "Eric W. Biederman" Subject: [RFC 04/10] cpu/hotplug: Check the capability of kexec quick reboot Date: Mon, 22 Aug 2022 10:15:14 +0800 Message-Id: <20220822021520.6996-5-kernelfans@gmail.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220822021520.6996-1-kernelfans@gmail.com> References: <20220822021520.6996-1-kernelfans@gmail.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220821_191607_821173_4099B6E9 X-CRM114-Status: GOOD ( 17.51 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org The kexec quick reboot needs each involved cpuhp_step to run in parallel. 
There are many teardown cpuhp_steps, but not all of them belong to the arm/arm64/riscv kexec reboot path. So introduce a member 'support_kexec_parallel' in struct cpuhp_step to signal whether the teardown supports running in parallel. If a cpuhp_step is used in kexec reboot, it needs to support parallel teardown to enable the quick reboot.

The function check_quick_reboot() checks all teardown cpuhp_steps and reports those that lack support, if any.

Signed-off-by: Pingfan Liu Cc: Thomas Gleixner Cc: Steven Price Cc: Kuppuswamy Sathyanarayanan Cc: "Jason A. Donenfeld" Cc: Frederic Weisbecker Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Peter Zijlstra Cc: "Eric W. Biederman" To: linux-arm-kernel@lists.infradead.org To: linux-ia64@vger.kernel.org To: linux-riscv@lists.infradead.org To: linux-kernel@vger.kernel.org --- include/linux/cpuhotplug.h | 2 ++ kernel/cpu.c | 28 +++++++++++++++++++++++++++- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index f61447913db9..73093fc15aec 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -374,6 +374,8 @@ static inline int cpuhp_setup_state_multi(enum cpuhp_state state, (void *) teardown, true); } +void cpuhp_set_step_parallel(enum cpuhp_state state); + int __cpuhp_state_add_instance(enum cpuhp_state state, struct hlist_node *node, bool invoke); int __cpuhp_state_add_instance_cpuslocked(enum cpuhp_state state, diff --git a/kernel/cpu.c b/kernel/cpu.c index 94ab2727d6bb..1261c3f3be51 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -137,6 +137,9 @@ struct cpuhp_step { /* public: */ bool cant_stop; bool multi_instance; +#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS + bool support_kexec_parallel; +#endif }; static DEFINE_MUTEX(cpuhp_state_mutex); @@ -147,6 +150,14 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state) return cpuhp_hp_states + state; } +#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS 
+void cpuhp_set_step_parallel(enum cpuhp_state state) +{ + cpuhp_hp_states[state].support_kexec_parallel = true; +} +EXPORT_SYMBOL(cpuhp_set_step_parallel); +#endif + static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step) { return bringup ? !step->startup.single : !step->teardown.single; @@ -1347,7 +1358,22 @@ static void takedown_cpus_no_rollback(struct cpumask *cpus) static bool check_quick_reboot(void) { - return false; + struct cpuhp_step *step; + enum cpuhp_state state; + bool ret = true; + + for (state = CPUHP_ONLINE; state >= CPUHP_AP_OFFLINE; state--) { + step = cpuhp_get_step(state); + if (step->teardown.single == NULL) + continue; + if (step->support_kexec_parallel == false) { + pr_info("cpuhp state:%d, %s, does not support cpudown in parallel\n", + state, step->name); + ret = false; + } + } + + return ret; } static struct cpumask kexec_ap_map;
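For readers who want to try the capability check outside the kernel, the scan performed by check_quick_reboot() can be approximated in user space as below. This is a minimal sketch under assumptions: struct demo_step, demo_noop() and demo_check_quick_reboot() are hypothetical names, the step table is a plain array instead of cpuhp_hp_states, and the array index stands in for the cpuhp state number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced version of struct cpuhp_step. */
struct demo_step {
	const char *name;
	void (*teardown)(void);		/* NULL when the step has no teardown */
	bool support_kexec_parallel;
};

/* Placeholder teardown callback used to mark a step as "has teardown". */
static void demo_noop(void)
{
}

/*
 * Walk the table from the highest state down, as the patch does, and
 * report every step that has a teardown callback but is not marked as
 * safe to run in parallel. All offenders are reported, not just the first.
 */
static bool demo_check_quick_reboot(const struct demo_step *steps, size_t n)
{
	bool ok = true;

	for (size_t i = n; i-- > 0; ) {
		if (steps[i].teardown == NULL)
			continue;	/* nothing to tear down, nothing to check */
		if (!steps[i].support_kexec_parallel) {
			fprintf(stderr,
				"step %zu, %s, does not support parallel teardown\n",
				i, steps[i].name);
			ok = false;
		}
	}
	return ok;
}
```

A single unmarked step with a teardown callback is enough to make the check fail, which in the patch makes smp_shutdown_nonboot_cpus() fall back to the serial path.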