From patchwork Fri Jan 8 22:22:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Luck
X-Patchwork-Id: 12007711
From: Tony Luck
To: Borislav Petkov
Cc: Tony Luck, x86@kernel.org, Andrew Morton, Peter Zijlstra,
 Darren Hart, Andy Lutomirski, linux-kernel@vger.kernel.org,
 linux-edac@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 1/2] x86/mce: Avoid infinite loop for copy from user recovery
Date: Fri, 8 Jan 2021 14:22:50 -0800
Message-Id: <20210108222251.14391-2-tony.luck@intel.com>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20210108222251.14391-1-tony.luck@intel.com>
References: <20210108222251.14391-1-tony.luck@intel.com>

Recovery action when get_user() triggers a machine check uses the fixup
path to make get_user() return -EFAULT.
Also queue_task_work() sets up so that
kill_me_maybe() will be called on return to user mode to send a SIGBUS
to the current process.

But there are places in the kernel where the code assumes that this
EFAULT return was simply because of a page fault. The code takes some
action to fix that, and then retries the access. This results in a
second machine check.

While processing this second machine check queue_task_work() is called
again. But since this uses the same callback_head structure that was
used in the first call, the net result is an entry on the
current->task_works list that points to itself. When task_work_run()
is called it loops forever in this code:

	do {
		next = work->next;
		work->func(work);
		work = next;
		cond_resched();
	} while (work);

Add a "mce_busy" flag bit to detect this situation and panic when it
happens.

Signed-off-by: Tony Luck
---
 arch/x86/kernel/cpu/mce/core.c | 7 ++++++-
 include/linux/sched.h          | 3 ++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 13d3f1cbda17..1bf11213e093 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1246,6 +1246,7 @@ static void kill_me_maybe(struct callback_head *cb)
 	struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me);
 	int flags = MF_ACTION_REQUIRED;
 
+	p->mce_busy = 0;
 	pr_err("Uncorrected hardware memory error in user-access at %llx", p->mce_addr);
 
 	if (!p->mce_ripv)
@@ -1268,6 +1269,7 @@ static void kill_me_maybe(struct callback_head *cb)
 
 static void queue_task_work(struct mce *m, int kill_current_task)
 {
+	current->mce_busy = 1;
 	current->mce_addr = m->addr;
 	current->mce_kflags = m->kflags;
 	current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
@@ -1431,8 +1433,11 @@ noinstr void do_machine_check(struct pt_regs *regs)
 				mce_panic("Failed kernel mode recovery", &m, msg);
 		}
 
-		if (m.kflags & MCE_IN_KERNEL_COPYIN)
+		if (m.kflags & MCE_IN_KERNEL_COPYIN) {
+			if (current->mce_busy)
+				mce_panic("Multiple copyin", &m, msg);
 			queue_task_work(&m, kill_current_task);
+		}
 	}
 out:
 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e3a5eeec509..a763a76eac57 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1360,7 +1360,8 @@ struct task_struct {
 	u64				mce_addr;
 	__u64				mce_ripv : 1,
 					mce_whole_page : 1,
-					__mce_reserved : 62;
+					mce_busy : 1,
+					__mce_reserved : 61;
 	struct callback_head		mce_kill_me;
 #endif
 

From patchwork Fri Jan 8 22:22:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Luck
X-Patchwork-Id: 12007713
From: Tony Luck
To: Borislav Petkov
Cc: Tony Luck, x86@kernel.org, Andrew Morton, Peter Zijlstra,
 Darren Hart, Andy Lutomirski, linux-kernel@vger.kernel.org,
 linux-edac@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 2/2] futex, x86/mce: Avoid double machine checks
Date: Fri, 8 Jan 2021 14:22:51 -0800
Message-Id: <20210108222251.14391-3-tony.luck@intel.com>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20210108222251.14391-1-tony.luck@intel.com>
References: <20210108222251.14391-1-tony.luck@intel.com>

futex_wait_setup() first tries to read the user value with page faults
disabled (because it holds a lock, and so cannot sleep). If that read
fails it drops the lock and tries again.

But there are now two reasons why the user space read can fail. Either:
1) legacy case of a page fault, in which case it is reasonable to retry
2) machine check on the user address, bad idea to re-read

Add some infrastructure to differentiate these cases.

Signed-off-by: Tony Luck
---
 arch/x86/include/asm/mmu.h     |  7 +++++++
 arch/x86/kernel/cpu/mce/core.c | 10 ++++++++++
 include/linux/mm.h             |  4 ++++
 kernel/futex.c                 |  3 +++
 4 files changed, 24 insertions(+)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 5d7494631ea9..a46c78381388 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -66,4 +66,11 @@ typedef struct {
 void leave_mm(int cpu);
 #define leave_mm leave_mm
 
+#if defined(CONFIG_X86_MCE) && defined(CONFIG_MEMORY_FAILURE)
+#undef arch_memory_failure
+#define arch_memory_failure x86_memory_failure
+#endif
+
+bool x86_memory_failure(u32 __user *addr);
+
 #endif /* _ASM_X86_MMU_H */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 1bf11213e093..b27aa30290bb 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1236,6 +1236,16 @@ static void __mc_scan_banks(struct mce *m, struct pt_regs *regs, struct mce *fin
 	*m = *final;
 }
 
+bool x86_memory_failure(u32 __user *addr)
+{
+	if (current->mce_busy == 0)
+		return false;
+
+	WARN_ON(current->mce_vaddr != addr);
+
+	return true;
+}
+
 static void kill_me_now(struct callback_head *ch)
 {
 	force_sig(SIGBUS);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8cd6ae..470708a71dd3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3177,5 +3177,9 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
 
 extern int sysctl_nr_trim_pages;
 
+#ifndef arch_memory_failure
+#define arch_memory_failure(vaddr) (0)
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/kernel/futex.c b/kernel/futex.c
index c47d1015d759..8fa2fc854026 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2658,6 +2658,9 @@ static int futex_wait_setup(u32 __user *uaddr, u32 val, unsigned int flags,
 	if (ret) {
 		queue_unlock(*hb);
 
+		if (arch_memory_failure(uaddr))
+			return ret;
+
 		ret = get_user(uval, uaddr);
 		if (ret)
 			return ret;
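
Illustration (not part of either patch): the self-referencing list entry
described in patch 1 can be reproduced in userspace with a simplified,
single-threaded model. Everything below is a sketch under assumptions:
struct fake_task, fake_task_work_add(), fake_task_work_run(),
queue_unguarded() and queue_guarded() are hypothetical stand-ins, the
real task_work_add() uses an atomic compare-and-exchange rather than a
plain store, and where the patch calls mce_panic() the sketch only
returns -1.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct callback_head. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *);
};

/* Hypothetical per-task state; only the fields the sketch needs. */
struct fake_task {
	struct callback_head *task_works;	/* head of the pending-work list */
	struct callback_head mce_kill_me;	/* the single embedded list node */
	unsigned int mce_busy : 1;		/* flag added by patch 1 */
	int kills;				/* how often the callback ran */
};

/* Single-threaded simplification: push the node at the head of the list. */
static void fake_task_work_add(struct fake_task *t, struct callback_head *work)
{
	assert(t && work);
	work->next = t->task_works;
	t->task_works = work;
}

/*
 * Same loop shape as quoted in the commit message, plus an iteration cap
 * so the sketch terminates. Returns the number of callbacks run, or -1
 * if the cap was reached (the list would never have ended).
 */
static int fake_task_work_run(struct fake_task *t, int max_iter)
{
	struct callback_head *work = t->task_works, *next;
	int n = 0;

	t->task_works = NULL;
	while (work) {
		if (n == max_iter)
			return -1;
		next = work->next;
		work->func(work);
		work = next;
		n++;
	}
	return n;
}

/* Callback: recover the task from the embedded node and clear the flag. */
static void fake_kill_me(struct callback_head *cb)
{
	struct fake_task *t = (struct fake_task *)
		((char *)cb - offsetof(struct fake_task, mce_kill_me));

	t->mce_busy = 0;
	t->kills++;
}

/* Pre-patch behaviour: queue the embedded node unconditionally. */
static void queue_unguarded(struct fake_task *t)
{
	t->mce_kill_me.func = fake_kill_me;
	fake_task_work_add(t, &t->mce_kill_me);
}

/* Post-patch behaviour: refuse a second queueing while the flag is set. */
static int queue_guarded(struct fake_task *t)
{
	if (t->mce_busy)
		return -1;	/* the patch panics here instead */
	t->mce_busy = 1;
	queue_unguarded(t);
	return 0;
}
```

Calling queue_unguarded() twice on the same fake_task leaves
mce_kill_me.next pointing at mce_kill_me itself, so the run loop only
stops because of the cap. With queue_guarded() the second call is
rejected and the list keeps a single, properly terminated entry.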