From patchwork Fri Apr 24 17:12:12 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Kossifidis
X-Patchwork-Id: 11508671
From: Nick Kossifidis
To: linux-riscv@lists.infradead.org
Subject: [PATCH 1/3] RISC-V: Add kexec support
Date: Fri, 24 Apr 2020 20:12:12 +0300
Message-Id: <20200424171214.1515457-2-mick@ics.forth.gr>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20200424171214.1515457-1-mick@ics.forth.gr>
References: <20200424171214.1515457-1-mick@ics.forth.gr>
Cc: david.abdurachmanov@sifive.com, anup@brainfault.org,
 atish.patra@wdc.com, yibin_liu@c-sky.com, palmer@dabbelt.com,
 paul.walmsley@sifive.com, Nick Kossifidis

This patch adds support for kexec on RISC-V. On SMP systems it depends
on HOTPLUG_CPU in order to be able to bring up all harts after kexec.
It also needs a recent OpenSBI version that supports the HSM extension.
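As a reading aid for the relocation code in this patch, here is a minimal
user-space C sketch of the entry-list walk that riscv_kexec_relocate
performs. The flag values (0x1, 0x2, 0x4, 0x8) and the order of the checks
are taken from the assembly in this patch. The names walk_entries(),
relocate_demo() and SKETCH_PAGE_SIZE are hypothetical, the 4 KiB page size
is an assumption, and the sketch leaves out everything that only makes
sense in the kernel (switching the MMU off, the stvec setup, fences and
register cleanup).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Entry type flags, as tested by the andi instructions in the assembly */
#define IND_DESTINATION	0x1UL
#define IND_INDIRECTION	0x2UL
#define IND_DONE	0x4UL
#define IND_SOURCE	0x8UL

/* Assumption for this sketch: 4 KiB pages */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: walk the entry list starting at `entry`
 * and return the number of pages copied.
 */
static unsigned long walk_entries(const uintptr_t *entry)
{
	unsigned char *dst = NULL;
	unsigned long copied = 0;

	for (;;) {
		uintptr_t e = *entry++;

		if (e & IND_DESTINATION) {
			/* Remember where the next source pages go */
			dst = (unsigned char *)(e & ~(uintptr_t)IND_DESTINATION);
		} else if (e & IND_INDIRECTION) {
			/* Continue with the next page of entries */
			entry = (const uintptr_t *)(e & ~(uintptr_t)IND_INDIRECTION);
		} else if (e & IND_DONE) {
			return copied;
		} else if (e & IND_SOURCE) {
			/* Sketch-only guard, the assembly has no such check */
			assert(dst != NULL);
			memcpy(dst, (const void *)(e & ~(uintptr_t)IND_SOURCE),
			       SKETCH_PAGE_SIZE);
			dst += SKETCH_PAGE_SIZE;
			copied++;
		}
		/* Unknown entry types are ignored, as in the assembly */
	}
}

/* Hypothetical demo: two source pages, entries split over two lists */
static int relocate_demo(void)
{
	static _Alignas(16) unsigned char src[2 * SKETCH_PAGE_SIZE];
	static _Alignas(16) unsigned char dst[2 * SKETCH_PAGE_SIZE];
	static _Alignas(16) uintptr_t second[2];
	uintptr_t first[3];
	unsigned long i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i * 7 + 1);
	memset(dst, 0, sizeof(dst));

	second[0] = (uintptr_t)(src + SKETCH_PAGE_SIZE) | IND_SOURCE;
	second[1] = IND_DONE;

	first[0] = (uintptr_t)dst | IND_DESTINATION;
	first[1] = (uintptr_t)src | IND_SOURCE;
	first[2] = (uintptr_t)second | IND_INDIRECTION;

	if (walk_entries(first) != 2)
		return -1;

	return memcmp(dst, src, sizeof(src)) == 0 ? 0 : -1;
}
```

Calling relocate_demo() builds a small entry list and checks that both
pages end up at the destination. Because the flags live in the low bits of
each entry, every address stored in the list has to be aligned so that
those bits are zero, which is why the buffers in the sketch are explicitly
aligned.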
I tested it on riscv64 QEMU on both an SMP and a non-SMP system.
Note that in my tests, although all harts seem to go down properly,
when using more than 2 harts on QEMU they don't seem to come back as
expected.

The older version of this patch can be found at:
https://patchwork.kernel.org/patch/11486653/

v3:
 * Use the new smp_shutdown_nonboot_cpus() call.
 * Move riscv_kexec_relocate to .rodata

v2:
 * Pass needed parameters as arguments to riscv_kexec_relocate
   instead of using global variables.
 * Use kimage_arch to hold the fdt address of the included fdt.
 * Use SYM_* macros on kexec_relocate.S.
 * Compatibility with STRICT_KERNEL_RWX.
 * Compatibility with HOTPLUG_CPU for SMP
 * Small cleanups

Signed-off-by: Nick Kossifidis
---
 arch/riscv/Kconfig                 |  14 +++
 arch/riscv/include/asm/kexec.h     |  47 ++++++++
 arch/riscv/kernel/Makefile         |   1 +
 arch/riscv/kernel/kexec_relocate.S | 158 ++++++++++++++++++++++++
 arch/riscv/kernel/machine_kexec.c  | 188 +++++++++++++++++++++++++++++
 include/uapi/linux/kexec.h         |   1 +
 6 files changed, 409 insertions(+)
 create mode 100644 arch/riscv/include/asm/kexec.h
 create mode 100644 arch/riscv/kernel/kexec_relocate.S
 create mode 100644 arch/riscv/kernel/machine_kexec.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index cbc652754..4934b2f62 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -331,6 +331,20 @@ config RISCV_SBI_V01
 	help
 	  This config allows kernel to use SBI v0.1 APIs. This will be
 	  deprecated in future once legacy M-mode software are no longer in use.
+
+config KEXEC
+	bool "Kexec system call"
+	select KEXEC_CORE
+	select HOTPLUG_CPU if SMP
+	help
+	  kexec is a system call that implements the ability to shutdown your
+	  current kernel, and to start another kernel. It is like a reboot
+	  but it is independent of the system firmware. And like a reboot
+	  you can start any kernel with it, not just Linux.
+
+	  The name comes from the similarity to the exec system call.
+
+
 endmenu
 
 menu "Boot options"
diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h
new file mode 100644
index 000000000..edff8a84a
--- /dev/null
+++ b/arch/riscv/include/asm/kexec.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 FORTH-ICS/CARV
+ *  Nick Kossifidis
+ */
+
+#ifndef _RISCV_KEXEC_H
+#define _RISCV_KEXEC_H
+
+/* Maximum physical address we can use pages from */
+#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can reach in physical address mode */
+#define KEXEC_DESTINATION_MEMORY_LIMIT (-1UL)
+
+/* Maximum address we can use for the control code buffer */
+#define KEXEC_CONTROL_MEMORY_LIMIT (-1UL)
+
+/* Reserve a page for the control code buffer */
+#define KEXEC_CONTROL_PAGE_SIZE 4096
+
+#define KEXEC_ARCH KEXEC_ARCH_RISCV
+
+static inline void
+crash_setup_regs(struct pt_regs *newregs,
+		 struct pt_regs *oldregs)
+{
+	/* Dummy implementation for now */
+}
+
+
+#define ARCH_HAS_KIMAGE_ARCH
+
+struct kimage_arch {
+	unsigned long fdt_addr;
+};
+
+const extern unsigned char riscv_kexec_relocate[];
+const extern unsigned int riscv_kexec_relocate_size;
+
+typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry,
+					unsigned long jump_addr,
+					unsigned long fdt_addr,
+					unsigned long hartid,
+					unsigned long va_pa_off);
+
+#endif
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 86c830810..28dccbbe6 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -51,5 +51,6 @@ ifeq ($(CONFIG_RISCV_SBI), y)
 obj-$(CONFIG_SMP)		+= cpu_ops_sbi.o
 endif
 obj-$(CONFIG_HOTPLUG_CPU)	+= cpu-hotplug.o
+obj-${CONFIG_KEXEC}		+= kexec_relocate.o machine_kexec.o
 
 clean:
diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S
new file mode 100644
index 000000000..d26e38157
--- /dev/null
+++ b/arch/riscv/kernel/kexec_relocate.S
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2019 FORTH-ICS/CARV
+ *  Nick Kossifidis
+ */
+
+#include /* For RISCV_* and REG_* macros */
+#include /* For PAGE_SHIFT */
+#include /* For SYM_* macros */
+
+.section ".rodata"
+SYM_CODE_START(riscv_kexec_relocate)
+
+	/*
+	 * s0: Pointer to the current entry
+	 * s1: (const) Phys address to jump to after relocation
+	 * s2: (const) Phys address of the FDT image
+	 * s3: (const) The hartid of the current hart
+	 * s4: Pointer to the destination address for the relocation
+	 * s5: (const) Number of words per page
+	 * s6: (const) 1, used for subtraction
+	 * s7: (const) va_pa_offset, used when switching MMU off
+	 * s8: (const) Physical address of the main loop
+	 * s9: (debug) indirection page counter
+	 * s10: (debug) entry counter
+	 * s11: (debug) copied words counter
+	 */
+	mv	s0, a0
+	mv	s1, a1
+	mv	s2, a2
+	mv	s3, a3
+	mv	s4, zero
+	li	s5, ((1 << PAGE_SHIFT) / RISCV_SZPTR)
+	li	s6, 1
+	mv	s7, a4
+	mv	s8, zero
+	mv	s9, zero
+	mv	s10, zero
+	mv	s11, zero
+
+	/* Disable / cleanup interrupts */
+	csrw	sie, zero
+	csrw	sip, zero
+
+	/*
+	 * When we switch SATP.MODE to "Bare" we'll only
+	 * play with physical addresses. However the first time
+	 * we try to jump somewhere, the offset on the jump
+	 * will be relative to pc which will still be on VA. To
+	 * deal with this we set stvec to the physical address at
+	 * the start of the loop below so that we jump there in
+	 * any case.
+	 */
+	la	s8, 1f
+	sub	s8, s8, s7
+	csrw	stvec, s8
+
+	/* Process entries in a loop */
+1:
+	addi	s10, s10, 1
+	REG_L	t0, 0(s0)		/* t0 = *image->entry */
+	addi	s0, s0, RISCV_SZPTR	/* image->entry++ */
+
+	/* IND_DESTINATION entry ? -> save destination address */
+	andi	t1, t0, 0x1
+	beqz	t1, 2f
+	andi	s4, t0, ~0x1
+	j	1b
+
+2:
+	/* IND_INDIRECTION entry ? -> update next entry ptr (PA) */
+	andi	t1, t0, 0x2
+	beqz	t1, 2f
+	andi	s0, t0, ~0x2
+	addi	s9, s9, 1
+	csrw	sptbr, zero
+	jalr	zero, s8, 0
+
+2:
+	/* IND_DONE entry ? -> jump to done label */
+	andi	t1, t0, 0x4
+	beqz	t1, 2f
+	j	4f
+
+2:
+	/*
+	 * IND_SOURCE entry ? -> copy page word by word to the
+	 * destination address we got from IND_DESTINATION
+	 */
+	andi	t1, t0, 0x8
+	beqz	t1, 1b		/* Unknown entry type, ignore it */
+	andi	t0, t0, ~0x8
+	mv	t3, s5		/* i = num words per page */
+3:	/* copy loop */
+	REG_L	t1, (t0)	/* t1 = *src_ptr */
+	REG_S	t1, (s4)	/* *dst_ptr = *src_ptr */
+	addi	t0, t0, RISCV_SZPTR /* src_ptr++ */
+	addi	s4, s4, RISCV_SZPTR /* dst_ptr++ */
+	sub	t3, t3, s6	/* i-- */
+	addi	s11, s11, 1	/* c++ */
+	beqz	t3, 1b		/* copy done ? */
+	j	3b
+
+4:
+	/* Wait for the relocation to be visible by other harts */
+	fence	w,w
+
+	/* Pass the arguments to the next kernel / Cleanup*/
+	mv	a0, s3
+	mv	a1, s2
+	mv	a2, s1
+
+	/* Cleanup */
+	mv	a3, zero
+	mv	a4, zero
+	mv	a5, zero
+	mv	a6, zero
+	mv	a7, zero
+
+	mv	s0, zero
+	mv	s1, zero
+	mv	s2, zero
+	mv	s3, zero
+	mv	s4, zero
+	mv	s5, zero
+	mv	s6, zero
+	mv	s7, zero
+	mv	s8, zero
+	mv	s9, zero
+	mv	s10, zero
+	mv	s11, zero
+
+	mv	t0, zero
+	mv	t1, zero
+	mv	t2, zero
+	mv	t3, zero
+	mv	t4, zero
+	mv	t5, zero
+	mv	t6, zero
+	csrw	sepc, zero
+	csrw	scause, zero
+	csrw	sscratch, zero
+
+	/*
+	 * Make sure the relocated code is visible
+	 * and jump to the new kernel
+	 */
+	fence.i
+
+	jalr	zero, a2, 0
+
+SYM_CODE_END(riscv_kexec_relocate)
+riscv_kexec_relocate_end:
+
+	.section ".rodata"
+SYM_DATA(riscv_kexec_relocate_size,
+	.long riscv_kexec_relocate_end - riscv_kexec_relocate)
+
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
new file mode 100644
index 000000000..b9fc735c3
--- /dev/null
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -0,0 +1,188 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 FORTH-ICS/CARV
+ *  Nick Kossifidis
+ */
+
+#include
+#include /* For riscv_kexec_* symbol defines */
+#include /* For smp_send_stop () */
+#include /* For local_flush_icache_all() */
+#include /* For smp_wmb() */
+#include /* For PAGE_MASK */
+#include /* For fdt_check_header() */
+#include /* For set_memory_x() */
+#include /* For unreachable() */
+#include /* For cpu_down() */
+
+/**
+ * kexec_image_info - Print received image details
+ */
+static void
+kexec_image_info(const struct kimage *image)
+{
+	unsigned long i;
+
+	pr_debug("Kexec image info:\n");
+	pr_debug("\ttype: %d\n", image->type);
+	pr_debug("\tstart: %lx\n", image->start);
+	pr_debug("\thead: %lx\n", image->head);
+	pr_debug("\tnr_segments: %lu\n", image->nr_segments);
+
+	for (i = 0; i < image->nr_segments; i++) {
+		pr_debug("\t segment[%lu]: %016lx - %016lx", i,
+			image->segment[i].mem,
+			image->segment[i].mem + image->segment[i].memsz);
+		pr_debug("\t\t0x%lx bytes, %lu pages\n",
+			(unsigned long) image->segment[i].memsz,
+			(unsigned long) image->segment[i].memsz / PAGE_SIZE);
+	}
+}
+
+/**
+ * machine_kexec_prepare - Initialize kexec
+ *
+ * This function is called from do_kexec_load, when the user has
+ * provided us with an image to be loaded. Its goal is to validate
+ * the image and prepare the control code buffer as needed.
+ * Note that kimage_alloc_init has already been called and the
+ * control buffer has already been allocated.
+ */
+int
+machine_kexec_prepare(struct kimage *image)
+{
+	struct kimage_arch *internal = &image->arch;
+	struct fdt_header fdt = {0};
+	void *control_code_buffer = NULL;
+	int i = 0;
+
+	kexec_image_info(image);
+
+	if (image->type == KEXEC_TYPE_CRASH) {
+		pr_warn("Loading a crash kernel is unsupported for now.\n");
+		return -EINVAL;
+	}
+
+	/* Find the Flattened Device Tree and save its physical address */
+	for (i = 0; i < image->nr_segments; i++) {
+		if (image->segment[i].memsz <= sizeof(fdt))
+			continue;
+
+		if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt)))
+			continue;
+
+		if (fdt_check_header(&fdt))
+			continue;
+
+		internal->fdt_addr = (unsigned long) image->segment[i].mem;
+		break;
+	}
+
+	if (!internal->fdt_addr) {
+		pr_err("Device tree not included in the provided image\n");
+		return -EINVAL;
+	}
+
+	/* Copy the assembler code for relocation to the control page */
+	control_code_buffer = page_address(image->control_code_page);
+	memcpy(control_code_buffer, riscv_kexec_relocate,
+		riscv_kexec_relocate_size);
+
+	/* Mark the control page executable */
+	set_memory_x((unsigned long) control_code_buffer, 1);
+
+#ifdef CONFIG_SMP
+	/*
+	 * Make sure other harts see the copied data
+	 * if they try to read the control buffer
+	 */
+	smp_wmb();
+#endif
+
+	return 0;
+}
+
+
+/**
+ * machine_kexec_cleanup - Cleanup any leftovers from
+ *			   machine_kexec_prepare
+ *
+ * This function is called by kimage_free to handle any arch-specific
+ * allocations done on machine_kexec_prepare. Since we didn't do any
+ * allocations there, this is just an empty function. Note that the
+ * control buffer is freed by kimage_free.
+ */
+void
+machine_kexec_cleanup(struct kimage *image)
+{
+}
+
+
+/*
+ * machine_shutdown - Prepare for a kexec reboot
+ *
+ * This function is called by kernel_kexec just before machine_kexec
+ * below. Its goal is to prepare the rest of the system (the other
+ * harts and possibly devices etc) for a kexec reboot.
+ */
+void machine_shutdown(void)
+{
+	/*
+	 * No more interrupts on this hart
+	 * until we are back up.
+	 */
+	local_irq_disable();
+
+#if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_SMP)
+	smp_shutdown_nonboot_cpus(smp_processor_id());
+#endif
+}
+
+/**
+ * machine_crash_shutdown - Prepare to kexec after a kernel crash
+ *
+ * This function is called by crash_kexec just before machine_kexec
+ * below and its goal is similar to machine_shutdown, but in case of
+ * a kernel crash. Since we don't handle such cases yet, this function
+ * is empty.
+ */
+void
+machine_crash_shutdown(struct pt_regs *regs)
+{
+}
+
+/**
+ * machine_kexec - Jump to the loaded kimage
+ *
+ * This function is called by kernel_kexec which is called by the
+ * reboot system call when the reboot cmd is LINUX_REBOOT_CMD_KEXEC,
+ * or by crash_kexec which is called by the kernel's arch-specific
+ * trap handler in case of a kernel panic. It's the final stage of
+ * the kexec process where the pre-loaded kimage is ready to be
+ * executed. We assume at this point that all other harts are
+ * suspended and this hart will be the new boot hart.
+ */
+void __noreturn
+machine_kexec(struct kimage *image)
+{
+	struct kimage_arch *internal = &image->arch;
+	unsigned long jump_addr = (unsigned long) image->start;
+	unsigned long first_ind_entry = (unsigned long) &image->head;
+	unsigned long this_hart_id = raw_smp_processor_id();
+	unsigned long fdt_addr = internal->fdt_addr;
+	void *control_code_buffer = page_address(image->control_code_page);
+	riscv_kexec_do_relocate do_relocate = control_code_buffer;
+
+	pr_notice("Will call new kernel at %08lx from hart id %lx\n",
+		  jump_addr, this_hart_id);
+	pr_notice("FDT image at %08lx\n", fdt_addr);
+
+	/* Make sure the relocation code is visible to the hart */
+	local_flush_icache_all();
+
+	/* Jump to the relocation code */
+	pr_notice("Bye...\n");
+	do_relocate(first_ind_entry, jump_addr, fdt_addr,
+		    this_hart_id, va_pa_offset);
+	unreachable();
+}
diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index 05669c87a..778dc191c 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -42,6 +42,7 @@
 #define KEXEC_ARCH_MIPS_LE (10 << 16)
 #define KEXEC_ARCH_MIPS    ( 8 << 16)
 #define KEXEC_ARCH_AARCH64 (183 << 16)
+#define KEXEC_ARCH_RISCV   (243 << 16)
 
 /* The artificial cap on the number of segments passed to kexec_load. */
 #define KEXEC_SEGMENT_MAX 16

From patchwork Fri Apr 24 17:12:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Kossifidis
X-Patchwork-Id: 11508667
From: Nick Kossifidis
To: linux-riscv@lists.infradead.org
Subject: [PATCH 2/3] RISC-V: Add kdump support
Date: Fri, 24 Apr 2020 20:12:13 +0300
Message-Id: <20200424171214.1515457-3-mick@ics.forth.gr>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20200424171214.1515457-1-mick@ics.forth.gr>
References: <20200424171214.1515457-1-mick@ics.forth.gr>
Cc: david.abdurachmanov@sifive.com, anup@brainfault.org,
 atish.patra@wdc.com, yibin_liu@c-sky.com, palmer@dabbelt.com,
 paul.walmsley@sifive.com, Nick Kossifidis

This patch adds support for kdump: the kernel will reserve a region
for the crash kernel and jump there on panic. In order for userspace
tools (kexec-tools) to prepare the crash kernel kexec image, we also
need to expose some information in /proc/iomem for the memory regions
used by the kernel and for the region reserved for the crash kernel.
Note that in userspace the device tree is used to determine the
system's memory layout, so the "System RAM" entry in /proc/iomem is
ignored.
I just put it there for compatibility with other archs.

I tested this on riscv64 QEMU and it works as expected. You may test
it by triggering a crash through /proc/sysrq-trigger:

echo c > /proc/sysrq-trigger

Signed-off-by: Nick Kossifidis
---
 arch/riscv/include/asm/kexec.h      | 19 ++++---
 arch/riscv/include/uapi/asm/elf.h   |  6 +++
 arch/riscv/kernel/Makefile          |  2 +-
 arch/riscv/kernel/crash_save_regs.S | 56 +++++++++++++++++++++
 arch/riscv/kernel/kexec_relocate.S  | 61 +++++++++++++++++++++-
 arch/riscv/kernel/machine_kexec.c   | 31 +++++++-----
 arch/riscv/kernel/setup.c           | 50 ++++++++++++++++++
 arch/riscv/mm/init.c                | 78 +++++++++++++++++++++++++++++
 8 files changed, 282 insertions(+), 21 deletions(-)
 create mode 100644 arch/riscv/kernel/crash_save_regs.S

diff --git a/arch/riscv/include/asm/kexec.h b/arch/riscv/include/asm/kexec.h
index edff8a84a..5bfb3431f 100644
--- a/arch/riscv/include/asm/kexec.h
+++ b/arch/riscv/include/asm/kexec.h
@@ -21,11 +21,16 @@
 
 #define KEXEC_ARCH KEXEC_ARCH_RISCV
 
+extern void riscv_crash_save_regs(struct pt_regs *newregs);
+
 static inline void
 crash_setup_regs(struct pt_regs *newregs,
 		 struct pt_regs *oldregs)
 {
-	/* Dummy implementation for now */
+	if (oldregs)
+		memcpy(newregs, oldregs, sizeof(struct pt_regs));
+	else
+		riscv_crash_save_regs(newregs);
 }
 
 
@@ -38,10 +43,12 @@ struct kimage_arch {
 const extern unsigned char riscv_kexec_relocate[];
 const extern unsigned int riscv_kexec_relocate_size;
 
-typedef void (*riscv_kexec_do_relocate)(unsigned long first_ind_entry,
-					unsigned long jump_addr,
-					unsigned long fdt_addr,
-					unsigned long hartid,
-					unsigned long va_pa_off);
+typedef void (*riscv_kexec_method)(unsigned long first_ind_entry,
+				   unsigned long jump_addr,
+				   unsigned long fdt_addr,
+				   unsigned long hartid,
+				   unsigned long va_pa_off);
+
+extern riscv_kexec_method riscv_kexec_norelocate;
 
 #endif
diff --git a/arch/riscv/include/uapi/asm/elf.h b/arch/riscv/include/uapi/asm/elf.h
index d696d6610..5b19f5547 100644
--- a/arch/riscv/include/uapi/asm/elf.h
+++ b/arch/riscv/include/uapi/asm/elf.h
@@ -19,6 +19,12 @@ typedef unsigned long elf_greg_t;
 typedef struct user_regs_struct elf_gregset_t;
 #define ELF_NGREG (sizeof(elf_gregset_t) / sizeof(elf_greg_t))
 
+#define ELF_CORE_COPY_REGS(dest, regs)			\
+do {							\
+	*(struct user_regs_struct *)&(dest) =		\
+		*(struct user_regs_struct *)regs;	\
+} while (0);
+
 /* We don't support f without d, or q. */
 typedef __u64 elf_fpreg_t;
 typedef union __riscv_fp_state elf_fpregset_t;
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 28dccbbe6..65f6c9f1d 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -51,6 +51,6 @@ ifeq ($(CONFIG_RISCV_SBI), y)
 obj-$(CONFIG_SMP)		+= cpu_ops_sbi.o
 endif
 obj-$(CONFIG_HOTPLUG_CPU)	+= cpu-hotplug.o
-obj-${CONFIG_KEXEC}		+= kexec_relocate.o machine_kexec.o
+obj-${CONFIG_KEXEC}		+= kexec_relocate.o crash_save_regs.o machine_kexec.o
 
 clean:
diff --git a/arch/riscv/kernel/crash_save_regs.S b/arch/riscv/kernel/crash_save_regs.S
new file mode 100644
index 000000000..ce5948097
--- /dev/null
+++ b/arch/riscv/kernel/crash_save_regs.S
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 FORTH-ICS/CARV
+ *  Nick Kossifidis
+ */
+
+#include /* For RISCV_* and REG_* macros */
+#include /* For CSR_* macros */
+#include /* For offsets on pt_regs */
+#include /* For SYM_* macros */
+
+.section ".text"
+SYM_CODE_START(riscv_crash_save_regs)
+	REG_S ra,  PT_RA(a0)	/* x1 */
+	REG_S sp,  PT_SP(a0)	/* x2 */
+	REG_S gp,  PT_GP(a0)	/* x3 */
+	REG_S tp,  PT_TP(a0)	/* x4 */
+	REG_S t0,  PT_T0(a0)	/* x5 */
+	REG_S t1,  PT_T1(a0)	/* x6 */
+	REG_S t2,  PT_T2(a0)	/* x7 */
+	REG_S s0,  PT_S0(a0)	/* x8/fp */
+	REG_S s1,  PT_S1(a0)	/* x9 */
+	REG_S a0,  PT_A0(a0)	/* x10 */
+	REG_S a1,  PT_A1(a0)	/* x11 */
+	REG_S a2,  PT_A2(a0)	/* x12 */
+	REG_S a3,  PT_A3(a0)	/* x13 */
+	REG_S a4,  PT_A4(a0)	/* x14 */
+	REG_S a5,  PT_A5(a0)	/* x15 */
+	REG_S a6,  PT_A6(a0)	/* x16 */
+	REG_S a7,  PT_A7(a0)	/* x17 */
+	REG_S s2,  PT_S2(a0)	/* x18 */
+	REG_S s3,  PT_S3(a0)	/* x19 */
+	REG_S s4,  PT_S4(a0)	/* x20 */
+	REG_S s5,  PT_S5(a0)	/* x21 */
+	REG_S s6,  PT_S6(a0)	/* x22 */
+	REG_S s7,  PT_S7(a0)	/* x23 */
+	REG_S s8,  PT_S8(a0)	/* x24 */
+	REG_S s9,  PT_S9(a0)	/* x25 */
+	REG_S s10, PT_S10(a0)	/* x26 */
+	REG_S s11, PT_S11(a0)	/* x27 */
+	REG_S t3,  PT_T3(a0)	/* x28 */
+	REG_S t4,  PT_T4(a0)	/* x29 */
+	REG_S t5,  PT_T5(a0)	/* x30 */
+	REG_S t6,  PT_T6(a0)	/* x31 */
+
+	csrr t1, CSR_STATUS
+	csrr t2, CSR_EPC
+	csrr t3, CSR_TVAL
+	csrr t4, CSR_CAUSE
+
+	REG_S t1, PT_STATUS(a0)
+	REG_S t2, PT_EPC(a0)
+	REG_S t3, PT_BADADDR(a0)
+	REG_S t4, PT_CAUSE(a0)
+	ret
+SYM_CODE_END(riscv_crash_save_regs)
diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S
index d26e38157..aa4a9a95c 100644
--- a/arch/riscv/kernel/kexec_relocate.S
+++ b/arch/riscv/kernel/kexec_relocate.S
@@ -152,7 +152,66 @@ SYM_CODE_START(riscv_kexec_relocate)
 SYM_CODE_END(riscv_kexec_relocate)
 riscv_kexec_relocate_end:
 
-	.section ".rodata"
+
+/* Used for jumping to crashkernel */
+.section ".text"
+SYM_CODE_START(riscv_kexec_norelocate)
+	/*
+	 * s0: (const) Phys address to jump to
+	 * s1: (const) Phys address of the FDT image
+	 * s2: (const) The hartid of the current hart
+	 */
+	mv	s0, a1
+	mv	s1, a2
+	mv	s2, a3
+
+	/* Disable / cleanup interrupts */
+	csrw	sie, zero
+	csrw	sip, zero
+
+	/* Switch to physical addressing */
+	csrw	sptbr, zero
+
+	/* Pass the arguments to the next kernel / Cleanup*/
+	mv	a0, s2
+	mv	a1, s1
+	mv	a2, s0
+
+	/* Cleanup */
+	mv	a3, zero
+	mv	a4, zero
+	mv	a5, zero
+	mv	a6, zero
+	mv	a7, zero
+
+	mv	s0, zero
+	mv	s1, zero
+	mv	s2, zero
+	mv	s3, zero
+	mv	s4, zero
+	mv	s5, zero
+	mv	s6, zero
+	mv	s7, zero
+	mv	s8, zero
+	mv	s9, zero
+	mv	s10, zero
+	mv	s11, zero
+
+	mv	t0, zero
+	mv	t1, zero
+	mv	t2, zero
+	mv	t3, zero
+	mv	t4, zero
+	mv	t5, zero
+	mv	t6, zero
+	csrw	sepc, zero
+	csrw	scause, zero
+	csrw	sscratch, zero
+
+	jalr	zero, a2, 0
+SYM_CODE_END(riscv_kexec_norelocate)
+
+.section ".rodata"
 SYM_DATA(riscv_kexec_relocate_size,
 	.long riscv_kexec_relocate_end - riscv_kexec_relocate)
 
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
index b9fc735c3..14e9622eb 100644
--- a/arch/riscv/kernel/machine_kexec.c
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -58,11 +58,6 @@ machine_kexec_prepare(struct kimage *image)
 
 	kexec_image_info(image);
 
-	if (image->type == KEXEC_TYPE_CRASH) {
-		pr_warn("Loading a crash kernel is unsupported for now.\n");
-		return -EINVAL;
-	}
-
 	/* Find the Flattened Device Tree and save its physical address */
 	for (i = 0; i < image->nr_segments; i++) {
 		if (image->segment[i].memsz <= sizeof(fdt))
@@ -84,12 +79,14 @@ machine_kexec_prepare(struct kimage *image)
 	}
 
 	/* Copy the assembler code for relocation to the control page */
-	control_code_buffer = page_address(image->control_code_page);
-	memcpy(control_code_buffer, riscv_kexec_relocate,
-		riscv_kexec_relocate_size);
+	if (image->type != KEXEC_TYPE_CRASH) {
+		control_code_buffer = page_address(image->control_code_page);
+		memcpy(control_code_buffer, riscv_kexec_relocate,
+			riscv_kexec_relocate_size);
 
-	/* Mark the control page executable */
-	set_memory_x((unsigned long) control_code_buffer, 1);
+		/* Mark the control page executable */
+		set_memory_x((unsigned long) control_code_buffer, 1);
+	}
 
 #ifdef CONFIG_SMP
 	/*
@@ -149,6 +146,9 @@ void machine_shutdown(void)
 void
 machine_crash_shutdown(struct pt_regs *regs)
 {
+	machine_shutdown();
+	crash_save_cpu(regs, smp_processor_id());
+	pr_info("Starting crashdump kernel...\n");
 }
 
 /**
@@ -171,7 +171,12 @@ machine_kexec(struct kimage *image)
 	unsigned long this_hart_id = raw_smp_processor_id();
 	unsigned long fdt_addr = internal->fdt_addr;
 	void *control_code_buffer = page_address(image->control_code_page);
-	riscv_kexec_do_relocate do_relocate = control_code_buffer;
+	riscv_kexec_method kexec_method = NULL;
+
+	if (image->type != KEXEC_TYPE_CRASH)
+		kexec_method = control_code_buffer;
+	else
+		kexec_method = (riscv_kexec_method) &riscv_kexec_norelocate;
 
 	pr_notice("Will call new kernel at %08lx from hart id %lx\n",
 		  jump_addr, this_hart_id);
@@ -182,7 +187,7 @@ machine_kexec(struct kimage *image)
 
 	/* Jump to the relocation code */
 	pr_notice("Bye...\n");
-	do_relocate(first_ind_entry, jump_addr, fdt_addr,
-		    this_hart_id, va_pa_offset);
+	kexec_method(first_ind_entry, jump_addr, fdt_addr,
+		     this_hart_id, va_pa_offset);
 	unreachable();
 }
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 3e528312f..52d057bde 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -4,6 +4,8 @@
  *  Chen Liqin
  *  Lennox Wu
  * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2020 FORTH-ICS/CARV
+ *  Nick Kossifidis
  */
 
 #include
@@ -17,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -50,6 +53,52 @@ atomic_t hart_lottery __section(.sdata);
 unsigned long boot_cpu_hartid;
 static DEFINE_PER_CPU(struct cpu, cpu_devices);
 
+/*
+ * Place kernel memory regions on the resource tree so that
+ * userspace tools such as kexec-tools can retrieve them from
+ * /proc/iomem. While there also add "System RAM" region for
+ * compatibility with other archs.
+ */
+static struct resource sysram_res = { .name = "System RAM", };
+static struct resource code_res = { .name = "Kernel code", };
+static struct resource data_res = { .name = "Kernel data", };
+static struct resource rodata_res = { .name = "Kernel rodata", };
+static struct resource bss_res = { .name = "Kernel bss", };
+
+static void __init kdump_resource_init(void)
+{
+	sysram_res.start = memblock_start_of_DRAM();
+	sysram_res.end = memblock_end_of_DRAM() - 1;
+	sysram_res.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	insert_resource(&iomem_resource, &sysram_res);
+
+	code_res.start = __pa_symbol(_text);
+	code_res.end = __pa_symbol(_etext) - 1;
+	code_res.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	insert_resource(&sysram_res, &code_res);
+
+	rodata_res.start = __pa_symbol(__start_rodata);
+	rodata_res.end = __pa_symbol(__end_rodata) - 1;
+	rodata_res.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	insert_resource(&sysram_res, &rodata_res);
+
+	data_res.start = __pa_symbol(_data);
+	data_res.end = __pa_symbol(_edata) - 1;
+	data_res.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	insert_resource(&sysram_res, &data_res);
+
+	bss_res.start = __pa_symbol(__bss_start);
+	bss_res.end = __pa_symbol(__bss_stop) - 1;
+	bss_res.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+	insert_resource(&sysram_res, &bss_res);
+
+#ifdef CONFIG_KEXEC_CORE
+	/* Add crash kernel region if present */
+	if (crashk_res.end > crashk_res.start)
+		insert_resource(&sysram_res, &crashk_res);
+#endif
+}
+
 void __init parse_dtb(void)
 {
 	if (early_init_dt_scan(dtb_early_va))
@@ -75,6 +124,7 @@ void __init setup_arch(char **cmdline_p)
 
 	setup_bootmem();
 	paging_init();
+	kdump_resource_init();
 #if IS_ENABLED(CONFIG_BUILTIN_DTB)
 	unflatten_and_copy_device_tree();
 #else
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 9d7dfc986..1c4461590 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -2,6 +2,8 @@
 /*
  * Copyright (C) 2012 Regents of the University of California
  * Copyright (C) 2019 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2020 FORTH-ICS/CARV
+ * Nick Kossifidis
  */
 
 #include
@@ -13,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -541,6 +544,78 @@ void mark_rodata_ro(void)
 }
 #endif
 
+#ifdef CONFIG_KEXEC_CORE
+/*
+ * reserve_crashkernel() - reserves memory for crash kernel
+ *
+ * This function reserves memory area given in "crashkernel=" kernel command
+ * line parameter. The memory reserved is used by dump capture kernel when
+ * primary kernel is crashing.
+ */
+static void __init reserve_crashkernel(void)
+{
+	unsigned long long crash_base = 0;
+	unsigned long long crash_size = 0;
+	unsigned long start_pfn = find_min_pfn_with_active_regions();
+	unsigned long search_start = start_pfn << PAGE_SHIFT;
+	unsigned long search_end = (unsigned long) PFN_PHYS(max_low_pfn) - 1;
+
+	int ret = 0;
+
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret || !crash_size)
+		return;
+
+	crash_size = PAGE_ALIGN(crash_size);
+
+	if (crash_base == 0) {
+		/*
+		 * Current riscv boot protocol requires 2MB alignment for
+		 * RV64 and 4MB alignment for RV32 (hugepage size)
+		 */
+		crash_base = memblock_find_in_range(search_start, search_end,
+#ifdef CONFIG_64BIT
+						crash_size, SZ_2M);
+#else
+						crash_size, SZ_4M);
+#endif
+		if (crash_base == 0) {
+			pr_warn("crashkernel: couldn't allocate %lldKB\n",
+				crash_size >> 10);
+			return;
+		}
+	} else {
+		/* User specifies base address explicitly. */
+		if (!memblock_is_region_memory(crash_base, crash_size)) {
+			pr_warn("crashkernel: requested region is not memory\n");
+			return;
+		}
+
+		if (memblock_is_region_reserved(crash_base, crash_size)) {
+			pr_warn("crashkernel: requested region is reserved\n");
+			return;
+		}
+
+#ifdef CONFIG_64BIT
+		if (!IS_ALIGNED(crash_base, SZ_2M)) {
+#else
+		if (!IS_ALIGNED(crash_base, SZ_4M)) {
+#endif
+			pr_warn("crashkernel: requested region is misaligned\n");
+			return;
+		}
+	}
+	memblock_reserve(crash_base, crash_size);
+
+	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
+		crash_base, crash_base + crash_size, crash_size >> 20);
+
+	crashk_res.start = crash_base;
+	crashk_res.end = crash_base + crash_size - 1;
+}
+#endif
+
 void __init paging_init(void)
 {
 	setup_vm_final();
@@ -548,6 +623,9 @@ void __init paging_init(void)
 	sparse_init();
 	setup_zero_page();
 	zone_sizes_init();
+#ifdef CONFIG_KEXEC_CORE
+	reserve_crashkernel();
+#endif
 }
 
 #ifdef CONFIG_SPARSEMEM_VMEMMAP

From patchwork Fri Apr 24 17:12:14 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Kossifidis
X-Patchwork-Id: 11508669
From: Nick Kossifidis
To: linux-riscv@lists.infradead.org
Subject: [PATCH 3/3] RISC-V: Add crash kernel support
Date: Fri, 24 Apr 2020 20:12:14 +0300
Message-Id: <20200424171214.1515457-4-mick@ics.forth.gr>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20200424171214.1515457-1-mick@ics.forth.gr>
References: <20200424171214.1515457-1-mick@ics.forth.gr>
MIME-Version: 1.0
Cc: david.abdurachmanov@sifive.com, anup@brainfault.org, atish.patra@wdc.com,
 yibin_liu@c-sky.com, palmer@dabbelt.com, paul.walmsley@sifive.com,
 Nick Kossifidis

This patch allows Linux to act as a crash kernel for use with kdump.
Userspace will let the crash kernel know about the memory region it can
use through the linux,usable-memory-range property, and about the memory
region where the ELF core header of the previous kernel is saved, through
the linux,elfcorehdr property (both under /chosen). These dtb bindings are
also used on arm64.

I tested this on riscv64 qemu and it works as expected. You may test it by
retrieving the dmesg of the previous kernel through /proc/vmcore, using
the vmcore-dmesg utility from kexec-tools.

Signed-off-by: Nick Kossifidis
---
 arch/riscv/Kconfig             | 10 ++++
 arch/riscv/kernel/Makefile     |  4 +-
 arch/riscv/kernel/crash_dump.c | 46 ++++++++++++++++
 arch/riscv/kernel/setup.c      | 13 +++++
 arch/riscv/mm/init.c           | 97 +++++++++++++++++++++++++++++++++-
 5 files changed, 167 insertions(+), 3 deletions(-)
 create mode 100644 arch/riscv/kernel/crash_dump.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4934b2f62..3802c8888 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -344,6 +344,16 @@ config KEXEC
 
 	  The name comes from the similarity to the exec system call.
 
+config CRASH_DUMP
+	bool "Build kdump crash kernel"
+	help
+	  Generate crash dump after being started by kexec. This should
+	  be normally only set in special crash dump kernels which are
+	  loaded in the main kernel with kexec-tools into a specially
+	  reserved region and then later executed after a crash by
+	  kdump/kexec.
+
+	  For more details see Documentation/admin-guide/kdump/kdump.rst
 endmenu
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 65f6c9f1d..6374b0db5 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -51,6 +51,6 @@ ifeq ($(CONFIG_RISCV_SBI), y)
 obj-$(CONFIG_SMP)	+= cpu_ops_sbi.o
 endif
 obj-$(CONFIG_HOTPLUG_CPU)	+= cpu-hotplug.o
-obj-${CONFIG_KEXEC}	+= kexec_relocate.o crash_save_regs.o machine_kexec.o
-
+obj-$(CONFIG_KEXEC)	+= kexec_relocate.o crash_save_regs.o machine_kexec.o
+obj-$(CONFIG_CRASH_DUMP)	+= crash_dump.o
 clean:
diff --git a/arch/riscv/kernel/crash_dump.c b/arch/riscv/kernel/crash_dump.c
new file mode 100644
index 000000000..81b9d2a71
--- /dev/null
+++ b/arch/riscv/kernel/crash_dump.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This code is taken from arch/arm64/kernel/crash_dump.c
+ * Created by: AKASHI Takahiro
+ * Copyright (C) 2017 Linaro Limited
+ */
+
+#include
+#include
+
+/**
+ * copy_oldmem_page() - copy one page from old kernel memory
+ * @pfn: page frame number to be copied
+ * @buf: buffer where the copied page is placed
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page
+ * @userbuf: if set, @buf is in a user address space
+ *
+ * This function copies one page from old kernel memory into buffer pointed by
+ * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+ * copied or negative error in case of failure.
+ */
+ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+			 size_t csize, unsigned long offset,
+			 int userbuf)
+{
+	void *vaddr;
+
+	if (!csize)
+		return 0;
+
+	vaddr = memremap(__pfn_to_phys(pfn), PAGE_SIZE, MEMREMAP_WB);
+	if (!vaddr)
+		return -ENOMEM;
+
+	if (userbuf) {
+		if (copy_to_user((char __user *)buf, vaddr + offset, csize)) {
+			memunmap(vaddr);
+			return -EFAULT;
+		}
+	} else
+		memcpy(buf, vaddr + offset, csize);
+
+	memunmap(vaddr);
+	return csize;
+}
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 52d057bde..cbd8c8ba6 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -64,6 +64,9 @@ static struct resource code_res = { .name = "Kernel code", };
 static struct resource data_res = { .name = "Kernel data", };
 static struct resource rodata_res = { .name = "Kernel rodata", };
 static struct resource bss_res = { .name = "Kernel bss", };
+#ifdef CONFIG_CRASH_DUMP
+static struct resource elfcorehdr_res = { .name = "ELF Core hdr", };
+#endif
 
 static void __init kdump_resource_init(void)
 {
@@ -97,6 +100,16 @@ static void __init kdump_resource_init(void)
 	if (crashk_res.end > crashk_res.start)
 		insert_resource(&sysram_res, &crashk_res);
 #endif
+
+#ifdef CONFIG_CRASH_DUMP
+	if (elfcorehdr_size) {
+		elfcorehdr_res.start = elfcorehdr_addr;
+		elfcorehdr_res.end = elfcorehdr_addr + elfcorehdr_size;
+		elfcorehdr_res.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM;
+		insert_resource(&sysram_res, &elfcorehdr_res);
+	}
+#endif
+
 }
 
 void __init parse_dtb(void)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 1c4461590..93cd434c5 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -123,6 +123,80 @@ static void __init setup_initrd(void)
 }
 #endif /* CONFIG_BLK_DEV_INITRD */
 
+#ifdef CONFIG_CRASH_DUMP
+/*
+ * These come from arm64
+ */
+static int __init early_init_dt_scan_elfcorehdr(unsigned long node,
+		const char *uname, int depth, void *data)
+{
+	struct memblock_region *elfcorehdr_mem = data;
+	const __be32 *reg;
+	int len;
+
+	if (depth != 1 || strcmp(uname, "chosen") != 0)
+		return 0;
+
+	reg = of_get_flat_dt_prop(node, "linux,elfcorehdr", &len);
+	if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
+		return 1;
+
+	elfcorehdr_mem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+	elfcorehdr_mem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+	return 1;
+}
+
+static void __init reserve_elfcorehdr(void)
+{
+	struct memblock_region elfcorehdr_region = {0};
+
+	of_scan_flat_dt(early_init_dt_scan_elfcorehdr, &elfcorehdr_region);
+
+	if (elfcorehdr_region.size) {
+		memblock_reserve(elfcorehdr_region.base,
+				 elfcorehdr_region.size);
+
+		pr_info("Reserving %lldKB of memory at 0x%llx for elfcorehdr\n",
+			elfcorehdr_region.size >> 10, elfcorehdr_region.base);
+
+		elfcorehdr_addr = elfcorehdr_region.base;
+		elfcorehdr_size = elfcorehdr_region.size;
+	}
+}
+
+static int __init early_init_dt_scan_usablemem(unsigned long node,
+		const char *uname, int depth, void *data)
+{
+	struct memblock_region *usable_mem = data;
+	const __be32 *reg;
+	int len;
+
+	if (depth != 1 || strcmp(uname, "chosen") != 0)
+		return 0;
+
+	reg = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
+	if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
+		return 1;
+
+	usable_mem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+	usable_mem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+	return 1;
+}
+
+static void __init fdt_enforce_memory_region(void)
+{
+	struct memblock_region usablemem_region = {0};
+
+	of_scan_flat_dt(early_init_dt_scan_usablemem, &usablemem_region);
+
+	if (usablemem_region.size)
+		memblock_cap_memory_range(usablemem_region.base,
+					  usablemem_region.size);
+}
+#endif
+
 static phys_addr_t dtb_early_pa __initdata;
 
 void __init setup_bootmem(void)
@@ -132,6 +206,9 @@ void __init setup_bootmem(void)
 	phys_addr_t vmlinux_end = __pa_symbol(&_end);
 	phys_addr_t vmlinux_start = __pa_symbol(&_start);
 
+#ifdef CONFIG_CRASH_DUMP
+	fdt_enforce_memory_region();
+#endif
 	/* Find the memory region containing the kernel */
 	for_each_memblock(memory, reg) {
 		phys_addr_t end = reg->base + reg->size;
@@ -166,6 +243,13 @@ void __init setup_bootmem(void)
 	 */
 	memblock_reserve(dtb_early_pa, fdt_totalsize(dtb_early_va));
 
+	/*
+	 * We need to reserve elfcorehdr here so that it doesn't
+	 * get overwritten later on.
+	 */
+#ifdef CONFIG_CRASH_DUMP
+	reserve_elfcorehdr();
+#endif
 	early_init_fdt_scan_reserved_mem();
 	memblock_allow_resize();
 	memblock_dump_all();
@@ -559,9 +643,20 @@ static void __init reserve_crashkernel(void)
 	unsigned long start_pfn = find_min_pfn_with_active_regions();
 	unsigned long search_start = start_pfn << PAGE_SHIFT;
 	unsigned long search_end = (unsigned long) PFN_PHYS(max_low_pfn) - 1;
-
 	int ret = 0;
 
+	/*
+	 * Don't reserve a region for a crash kernel on a crash kernel
+	 * since it doesn't make much sense and we have limited memory
+	 * resources.
+	 */
+#ifdef CONFIG_CRASH_DUMP
+	if (is_kdump_kernel()) {
+		pr_info("crashkernel: ignore reservation request\n");
+		return;
+	}
+#endif
+
 	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
 	if (ret || !crash_size)