From patchwork Fri Jan 26 12:31:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xie XiuQi X-Patchwork-Id: 10185531 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 82D1A60383 for ; Fri, 26 Jan 2018 12:26:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7735D2916A for ; Fri, 26 Jan 2018 12:26:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 696B62916F; Fri, 26 Jan 2018 12:26:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_MED autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 833BB2916A for ; Fri, 26 Jan 2018 12:26:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=igeDmzJVz4UxsfiRmYyP3iZhOl3NkwRR0Y8QSYE5A8w=; b=PQuql5EY0ULrfj ufm8IJSvgy3c+KD6v8tePV7KAkYYW62IYsG6T5N8sXjydbij90bF6jgXpcxS7UrrSS+T3iyxPuEvK addxtFl9hviNHfG/BrG0lypYYIoasrfWi05UCpcS6DSvU/+FmoY7uV1jHstMLIbV+eEYrDEkJdi3t fpOseKOFdPC13rhnfY8jla76feaCMup/zwRps7e0qsS3hhhO6nZtVYc+Hql7qssvqrTfuPU8MvO8j Zo95ewHV6d7UA69bkRy/KMHkM0BBS9TByTrBSiJM6ZhoXMDVSn8haCisl640bIEzCoksupye15i0W w6gnZM2sAo1lhXJKYY3g==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.89 #1 (Red Hat Linux)) id 1ef35m-0006GF-5j; Fri, 26 Jan 2018 12:26:50 +0000 Received: from [45.249.212.32] (helo=huawei.com) by bombadil.infradead.org with esmtps (Exim 4.89 #1 (Red Hat Linux)) id 1ef34X-0004bR-Js for linux-arm-kernel@lists.infradead.org; Fri, 26 Jan 2018 12:25:40 +0000 Received: from DGGEMS411-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id BA93140FDC1F; Fri, 26 Jan 2018 20:25:04 +0800 (CST) Received: from localhost.localdomain.localdomain (10.175.113.25) by DGGEMS411-HUB.china.huawei.com (10.3.19.211) with Microsoft SMTP Server id 14.3.361.1; Fri, 26 Jan 2018 20:24:58 +0800 From: Xie XiuQi To: , , , , , , , , , , , , , Subject: [PATCH v5 1/3] arm64/ras: support sea error recovery Date: Fri, 26 Jan 2018 20:31:23 +0800 Message-ID: <1516969885-150532-2-git-send-email-xiexiuqi@huawei.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1516969885-150532-1-git-send-email-xiexiuqi@huawei.com> References: <1516969885-150532-1-git-send-email-xiexiuqi@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.113.25] X-CFilter-Loop: Reflected X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: wangkefeng.wang@huawei.com, cj.chengjian@huawei.com, xiexiuqi@huawei.com, linux-kernel@vger.kernel.org, gengdongjiu@huawei.com, lijinyue@huawei.com, linux-acpi@vger.kernel.org, hanjun.guo@linaro.org, huawei.libin@huawei.com, guohanjun@huawei.com, zhengqiang10@huawei.com, wangxiongfeng2@huawei.com, linux-arm-kernel@lists.infradead.org Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors are consumed. According to the existing process, errors occurred in the kernel, leading to direct panic, if it occurred the user-space, we should just kill process. But there is a class of error, in fact, is not necessary to kill process, you can recover and continue to run the process. Such as the instruction data corrupted, where the memory page might be read-only, which is has not been modified, the disk might have the correct data, so you can directly drop the page, ant reload it when necessary. So this patchset is just try to solve such problem: if the error is consumed in user-space and the error occurs on a clean page, you can directly drop the memory page without killing process. If the corrupted page is clean, just dropped it and return to user-space without side effects. And if corrupted page is dirty, memory_failure() will send SIGBUS with code=BUS_MCEERR_AR. While without this patchset, do_sea() will just send SIGBUS, so the process was killed in the same place. Because memory_failure() may sleep, we can not call it directly in SEA exception context. So we saved faulting physical address associated with a process in the ghes handler and set __TIF_SEA_NOTIFY. When we return from SEA exception context and get into do_notify_resume() before the process running, we could check it and call memory_failure() to do recovery. It's safe, because we are in process context. Signed-off-by: Xie XiuQi Tested-by: Wang Xiongfeng Tested-by: Tyler Baicar Cc: James Morse Cc: Borislav Petkov Cc: Julien Thierry Cc: Hanjun Guo --- arch/arm64/Kconfig | 11 +++ arch/arm64/include/asm/ras.h | 23 ++++++ arch/arm64/include/asm/thread_info.h | 4 +- arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/ras.c | 142 +++++++++++++++++++++++++++++++++++ arch/arm64/kernel/signal.c | 7 ++ arch/arm64/mm/fault.c | 27 +++++-- drivers/acpi/apei/ghes.c | 8 +- include/acpi/ghes.h | 3 + 9 files changed, 216 insertions(+), 10 deletions(-) create mode 100644 arch/arm64/include/asm/ras.h create mode 100644 arch/arm64/kernel/ras.c diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c9a7e9e..8892923 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -688,6 +688,17 @@ config HOTPLUG_CPU Say Y here to experiment with turning CPUs off and on. CPUs can be controlled through /sys/devices/system/cpu. +config ARM64_ERR_RECOV + bool "Support arm64 RAS error recovery" + depends on ACPI_APEI_SEA && MEMORY_FAILURE + help + With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors + are consumed. In some cases, if the error address is in a clean page or a + read-only page, there is a chance to recover. Such as error occurs in a + instruction page, we can reread this page from disk instead of killing process. + + Say Y if unsure. + # Common NUMA Features config NUMA bool "Numa Memory Allocation and Scheduler Support" diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h new file mode 100644 index 0000000..f0f18da --- /dev/null +++ b/arch/arm64/include/asm/ras.h @@ -0,0 +1,23 @@ +/* + * ARM64 SEA error recoery support + * + * Copyright 2017 Huawei Technologies Co., Ltd. + * Author: Xie XiuQi + * Author: Wang Xiongfeng + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version + * 2 as published by the Free Software Foundation; + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#ifndef _ASM_RAS_H +#define _ASM_RAS_H + +extern void sea_notify_process(void); + +#endif /*_ASM_RAS_H*/ diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h index eb43128..250d769 100644 --- a/arch/arm64/include/asm/thread_info.h +++ b/arch/arm64/include/asm/thread_info.h @@ -84,6 +84,7 @@ struct thread_info { #define TIF_FOREIGN_FPSTATE 3 /* CPU's FP state is not current's */ #define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */ #define TIF_FSCHECK 5 /* Check FS is USER_DS on return */ +#define TIF_SEA_NOTIFY 6 /* notify to do an error recovery */ #define TIF_NOHZ 7 #define TIF_SYSCALL_TRACE 8 #define TIF_SYSCALL_AUDIT 9 @@ -102,6 +103,7 @@ struct thread_info { #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) #define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE) #define _TIF_NOHZ (1 << TIF_NOHZ) +#define _TIF_SEA_NOTIFY (1 << TIF_SEA_NOTIFY) #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT) @@ -113,7 +115,7 @@ struct thread_info { #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \ - _TIF_UPROBE | _TIF_FSCHECK) + _TIF_UPROBE | _TIF_FSCHECK | _TIF_SEA_NOTIFY) #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \ diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 067baac..a87f0a2 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -42,6 +42,7 @@ arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o arm64-obj-$(CONFIG_PCI) += pci.o arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o arm64-obj-$(CONFIG_ACPI) += acpi.o +arm64-obj-$(CONFIG_ARM64_ERR_RECOV) += ras.o arm64-obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o diff --git a/arch/arm64/kernel/ras.c b/arch/arm64/kernel/ras.c new file mode 100644 index 0000000..05d3871 --- /dev/null +++ b/arch/arm64/kernel/ras.c @@ -0,0 +1,142 @@ +/* + * ARM64 SEA error recoery support + * + * Copyright 2017 Huawei Technologies Co., Ltd. + * Author: Xie XiuQi + * Author: Wang Xiongfeng + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version + * 2 as published by the Free Software Foundation; + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include +#include +#include + +/* + * Need to save faulting physical address associated with a process + * in the sea ghes handler some place where we can grab it back + * later in sea_notify_process() + */ +#define SEA_INFO_MAX 16 + +struct sea_info { + atomic_t inuse; + struct task_struct *t; + __u64 paddr; +} sea_info[SEA_INFO_MAX]; + +static bool sea_save_info(__u64 addr) +{ + struct sea_info *si; + + for (si = sea_info; si < &sea_info[SEA_INFO_MAX]; si++) { + if (atomic_cmpxchg(&si->inuse, 0, 1) == 0) { + si->t = current; + si->paddr = addr; + return true; + } + } + + pr_err("Too many concurrent recoverable errors\n"); + return false; +} + +static struct sea_info *sea_find_info(void) +{ + struct sea_info *si; + + for (si = sea_info; si < &sea_info[SEA_INFO_MAX]; si++) + if (atomic_read(&si->inuse) && si->t == current) + return si; + return NULL; +} + +static void sea_clear_info(struct sea_info *si) +{ + atomic_set(&si->inuse, 0); +} + +/* + * Called in process context that interrupted by SEA and marked with + * TIF_SEA_NOTIFY, just before returning to erroneous userland. + * This code is allowed to sleep. + * Attempt possible recovery such as calling the high level VM handler to + * process any corrupted pages, and kill/signal current process if required. + * Action required errors are handled here. + */ +void sea_notify_process(void) +{ + unsigned long pfn; + int fail = 0, flags = MF_ACTION_REQUIRED; + struct sea_info *si = sea_find_info(); + + if (!si) + panic("Lost physical address for consumed uncorrectable error"); + + clear_thread_flag(TIF_SEA_NOTIFY); + do { + pfn = si->paddr >> PAGE_SHIFT; + + + pr_err("Uncorrected hardware memory error in user-access at %llx\n", + si->paddr); + /* + * We must call memory_failure() here even if the current process is + * doomed. We still need to mark the page as poisoned and alert any + * other users of the page. + */ + if (memory_failure(pfn, 0, flags) < 0) + fail++; + + sea_clear_info(si); + + si = sea_find_info(); + } while (si); + + if (fail) { + pr_err("Memory error not recovered\n"); + force_sig(SIGBUS, current); + } +} + +void ghes_arm_process_error(struct ghes *ghes, struct cper_sec_proc_arm *err) +{ + int i; + bool info_saved = false; + struct cper_arm_err_info *err_info; + + log_arm_hw_error(err); + + if ((ghes->generic->notify.type != ACPI_HEST_NOTIFY_SEA) || + (ghes->estatus->error_severity != CPER_SEV_RECOVERABLE)) + return; + + err_info = (struct cper_arm_err_info *)(err + 1); + for (i = 0; i < err->err_info_num; i++, err_info++) { + if ((err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR) && + (err_info->type == CPER_ARM_CACHE_ERROR)) + info_saved |= sea_save_info(err_info->physical_fault_addr); + } + + if (info_saved) + set_thread_flag(TIF_SEA_NOTIFY); +} diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index b120111..1fe60dd 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -41,6 +41,7 @@ #include #include #include +#include /* * Do a signal return; undo the signal stack. These are aligned to 128-bit. @@ -907,6 +908,12 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, */ trace_hardirqs_off(); +#ifdef CONFIG_ARM64_ERR_RECOV + /* notify userspace of pending SEAs */ + if (unlikely(thread_flags & _TIF_SEA_NOTIFY)) + sea_notify_process(); +#endif /* CONFIG_ARM64_ERR_RECOV */ + do { /* Check valid user FS if needed */ addr_limit_user_check(); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 9b7f89d..15597e1 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -594,14 +594,25 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) nmi_exit(); } - info.si_signo = SIGBUS; - info.si_errno = 0; - info.si_code = 0; - if (esr & ESR_ELx_FnV) - info.si_addr = NULL; - else - info.si_addr = (void __user *)addr; - arm64_notify_die("", regs, &info, esr); + if (user_mode(regs)) { + if (test_thread_flag(TIF_SEA_NOTIFY)) + return 0; + + info.si_signo = SIGBUS; + info.si_errno = 0; + info.si_code = 0; + if (esr & ESR_ELx_FnV) + info.si_addr = NULL; + else + info.si_addr = (void __user *)addr; + + current->thread.fault_address = 0; + current->thread.fault_code = esr; + force_sig_info(info.si_signo, &info, current); + } else { + die("Uncorrected hardware memory error in kernel-access\n", + regs, esr); + } return 0; } diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 6402f7f..cff671d 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -414,6 +414,12 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int #endif } +void __weak ghes_arm_process_error(struct ghes *ghes, + struct cper_sec_proc_arm *err) +{ + log_arm_hw_error(err); +} + static void ghes_do_proc(struct ghes *ghes, const struct acpi_hest_generic_status *estatus) { @@ -476,7 +482,7 @@ static void ghes_do_proc(struct ghes *ghes, else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) { struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata); - log_arm_hw_error(err); + ghes_arm_process_error(ghes, err); } else { void *err = acpi_hest_get_payload(gdata); diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h index 8feb0c8..d626367 100644 --- a/include/acpi/ghes.h +++ b/include/acpi/ghes.h @@ -121,4 +121,7 @@ static inline void *acpi_hest_get_next(struct acpi_hest_generic_data *gdata) int ghes_notify_sea(void); +extern void ghes_arm_process_error(struct ghes *ghes, + struct cper_sec_proc_arm *err); + #endif /* GHES_H */