From patchwork Mon Oct 17 23:42:03 2022
From: Tony Luck <tony.luck@intel.com>
To: Naoya Horiguchi, Andrew Morton
Cc: Miaohe Lin, Matthew Wilcox, Shuai Xue, Dan Williams,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tony Luck
Subject: [RFC PATCH] mm, hwpoison: Recover from copy-on-write machine checks
Date: Mon, 17 Oct 2022 16:42:03 -0700
Message-Id: <20221017234203.103666-1-tony.luck@intel.com>
X-Mailer: git-send-email 2.37.3

If the kernel is copying a page as the result of a copy-on-write fault
and runs into an uncorrectable error, Linux will crash because it does
not have recovery code for this case where poison is consumed by the
kernel.

It is easy to set up a test case. Just inject an error into a private
page, fork(2), and have the child process write to the page.

I wrapped that neatly into a test at:

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git

Just enable ACPI error injection and run:

  # ./einj_mem-uc -f copy-on-write

[Note: this test needs some better reporting for the case where this
patch has been applied and the system does NOT crash.]

The patch below works, but there are probably many places where it
could fit better into the general "mm" way of doing things. E.g. the
copy_mc_to_kernel() function does what I need here, but the name
doesn't seem quite right.

The basic idea is very simple: if the kernel gets a machine check while
copying the page, just free up the new page that was going to be the
target of the copy and return VM_FAULT_HWPOISON to the calling stack.
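For anyone who doesn't want to pull ras-tools, the copy-on-write test
boils down to roughly the sketch below. The inject_uc() helper is a
hypothetical stand-in for the ACPI EINJ injection step (translate the
virtual address to a physical one, then program the EINJ debugfs
files); none of this is the actual test code, just an illustration of
the sequence. Without this patch the machine crashes when the child
writes; with it only the child should die:

/*
 * Illustrative sketch only -- not the real einj_mem-uc source.
 * inject_uc() is a hypothetical placeholder for the EINJ injection.
 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

static void inject_uc(void *addr)
{
        /* hypothetical: inject an uncorrectable error into addr's page */
}

int main(void)
{
        long pagesize = sysconf(_SC_PAGESIZE);
        int status;
        char *p;

        /* Private anonymous mapping: a write in the child forces a COW copy */
        p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                exit(1);
        memset(p, 0x5a, pagesize);

        inject_uc(p);

        if (fork() == 0) {
                /* Child write faults; the kernel copies the poisoned page */
                p[0] = 'x';
                exit(0);
        }
        wait(&status);

        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGBUS)
                printf("recovered: child killed by SIGBUS\n");
        else
                printf("child status: 0x%x\n", status);
        return 0;
}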
Slightly-signed-off-by: Tony Luck <tony.luck@intel.com>
---
 include/linux/highmem.h | 19 +++++++++++++++++++
 mm/memory.c             | 28 ++++++++++++++++++++--------
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index e9912da5441b..5967541fbf0e 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -319,6 +319,25 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+static inline int copy_user_highpage_mc(struct page *to, struct page *from,
+                                        unsigned long vaddr, struct vm_area_struct *vma)
+{
+        unsigned long ret = 0;
+#ifdef copy_mc_to_kernel
+        char *vfrom, *vto;
+
+        vfrom = kmap_local_page(from);
+        vto = kmap_local_page(to);
+        ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
+        kunmap_local(vto);
+        kunmap_local(vfrom);
+#else
+        copy_user_highpage(to, from, vaddr, vma);
+#endif
+
+        return ret;
+}
+
 #ifndef __HAVE_ARCH_COPY_HIGHPAGE
 
 static inline void copy_highpage(struct page *to, struct page *from)
diff --git a/mm/memory.c b/mm/memory.c
index f88c351aecd4..b5e22bf4c10a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2848,8 +2848,14 @@ static inline int pte_unmap_same(struct vm_fault *vmf)
         return same;
 }
 
-static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
-                                       struct vm_fault *vmf)
+/*
+ * Return:
+ *      -1 = copy failed due to poison in source page
+ *       0 = copy failed (some other reason)
+ *       1 = copy succeeded
+ */
+static inline int __wp_page_copy_user(struct page *dst, struct page *src,
+                                      struct vm_fault *vmf)
 {
         bool ret;
         void *kaddr;
@@ -2860,8 +2866,9 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
         unsigned long addr = vmf->address;
 
         if (likely(src)) {
-                copy_user_highpage(dst, src, addr, vma);
-                return true;
+                if (copy_user_highpage_mc(dst, src, addr, vma))
+                        return -1;
+                return 1;
         }
 
         /*
@@ -2888,7 +2895,7 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
                          * and update local tlb only
                          */
                         update_mmu_tlb(vma, addr, vmf->pte);
-                        ret = false;
+                        ret = 0;
                         goto pte_unlock;
                 }
 
@@ -2913,7 +2920,7 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
                 if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
                         /* The PTE changed under us, update local tlb */
                         update_mmu_tlb(vma, addr, vmf->pte);
-                        ret = false;
+                        ret = 0;
                         goto pte_unlock;
                 }
 
@@ -2932,7 +2939,7 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
                 }
         }
 
-        ret = true;
+        ret = 1;
 
 pte_unlock:
         if (locked)
@@ -3104,6 +3111,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
         pte_t entry;
         int page_copied = 0;
         struct mmu_notifier_range range;
+        int ret;
 
         delayacct_wpcopy_start();
 
@@ -3121,7 +3129,11 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
                 if (!new_page)
                         goto oom;
 
-                if (!__wp_page_copy_user(new_page, old_page, vmf)) {
+                ret = __wp_page_copy_user(new_page, old_page, vmf);
+                if (ret == -1) {
+                        put_page(new_page);
+                        return VM_FAULT_HWPOISON;
+                } else if (ret == 0) {
                         /*
                          * COW failed, if the fault was solved by other,
                          * it's fine. If not, userspace would re-fault on
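
P.S. On the reporting note above: with this patch applied the child
should get a SIGBUS instead of taking the whole machine down (the x86
fault code appears to turn VM_FAULT_HWPOISON into a BUS_MCEERR_AR
SIGBUS). So the test could install something like the sketch below in
the child before it touches the page. Again, this is just an
illustration, not the actual ras-tools code:

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void sigbus_handler(int sig, siginfo_t *si, void *unused)
{
        /* BUS_MCEERR_AR: poison was consumed at si_addr, action required */
        if (si->si_code == BUS_MCEERR_AR)
                fprintf(stderr, "recovered: SIGBUS MCEERR_AR at %p\n", si->si_addr);
        else
                fprintf(stderr, "SIGBUS si_code=%d at %p\n", si->si_code, si->si_addr);
        _exit(2);
}

static void install_sigbus_reporting(void)
{
        struct sigaction sa = {
                .sa_sigaction   = sigbus_handler,
                .sa_flags       = SA_SIGINFO,
        };

        sigaction(SIGBUS, &sa, NULL);
}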