From patchwork Mon Oct 31 20:10:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Luck
X-Patchwork-Id: 13026305
From: Tony Luck
To: Andrew Morton
Cc: Alexander Potapenko, Naoya Horiguchi, Miaohe Lin, Matthew Wilcox,
 Shuai Xue, Dan Williams, Michael Ellerman, Nicholas Piggin,
 Christophe Leroy, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, Tony Luck
Subject: [PATCH v4 1/2] mm, hwpoison: Try to recover from copy-on write faults
Date: Mon, 31 Oct 2022 13:10:28 -0700
Message-Id: <20221031201029.102123-2-tony.luck@intel.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221031201029.102123-1-tony.luck@intel.com>
References: <20221021200120.175753-1-tony.luck@intel.com>
 <20221031201029.102123-1-tony.luck@intel.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

If the kernel is copying a page as the result of a copy-on-write
fault and runs into an uncorrectable error, Linux will crash because
it does not have recovery code for this case where poison is consumed
by the kernel.

It is easy to set up a test case. Just inject an error into a private
page, fork(2), and have the child process write to the page.

I wrapped that neatly into a test at:

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git

just enable ACPI error injection and run:

  # ./einj_mem-uc -f copy-on-write

Add a new copy_mc_user_highpage() function that uses copy_mc_to_kernel()
on architectures where that is available (currently x86 and powerpc).
When an error is detected during the page copy, return VM_FAULT_HWPOISON
to the caller of wp_page_copy(). This propagates up the call stack.
Both x86 and powerpc have code in their fault handlers to deal with this
return code by sending a SIGBUS to the application.

Note that this patch avoids a system crash and signals the process that
triggered the copy-on-write action. It does not take any action for the
memory error that is still in the shared page. To handle that a call to
memory_failure() is needed. But this cannot be done from wp_page_copy()
because it holds the mmap_lock. Perhaps the architecture fault handlers
can deal with this loose end in a subsequent patch?

On Intel/x86 this loose end will often be handled automatically because
the memory controller provides an additional notification of the h/w
poison in memory; the handler for this will call memory_failure(). This
isn't a 100% solution. If there are multiple errors, not all may be
logged in this way.

Reviewed-by: Dan Williams
Reviewed-by: Miaohe Lin
Reviewed-by: Naoya Horiguchi
Tested-by: Shuai Xue
Signed-off-by: Tony Luck
Message-Id: <20221021200120.175753-2-tony.luck@intel.com>
Signed-off-by: Tony Luck
---
 include/linux/highmem.h | 26 ++++++++++++++++++++++++++
 mm/memory.c             | 30 ++++++++++++++++++++----------
 2 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index e9912da5441b..44242268f53b 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -319,6 +319,32 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+#ifdef copy_mc_to_kernel
+static inline int copy_mc_user_highpage(struct page *to, struct page *from,
+					unsigned long vaddr, struct vm_area_struct *vma)
+{
+	unsigned long ret;
+	char *vfrom, *vto;
+
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
+	ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
+	if (!ret)
+		kmsan_unpoison_memory(page_address(to), PAGE_SIZE);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
+
+	return ret;
+}
+#else
+static inline int copy_mc_user_highpage(struct page *to, struct page *from,
+					unsigned long vaddr, struct vm_area_struct *vma)
+{
+	copy_user_highpage(to, from, vaddr, vma);
+	return 0;
+}
+#endif
+
 #ifndef __HAVE_ARCH_COPY_HIGHPAGE
 
 static inline void copy_highpage(struct page *to, struct page *from)
diff --git a/mm/memory.c b/mm/memory.c
index f88c351aecd4..b6056eef2f72 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2848,10 +2848,16 @@ static inline int pte_unmap_same(struct vm_fault *vmf)
 	return same;
 }
 
-static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
-				       struct vm_fault *vmf)
+/*
+ * Return:
+ *	0:		copy succeeded
+ *	-EHWPOISON:	copy failed due to hwpoison in source page
+ *	-EAGAIN:	copy failed (some other reason)
+ */
+static inline int __wp_page_copy_user(struct page *dst, struct page *src,
+				      struct vm_fault *vmf)
 {
-	bool ret;
+	int ret;
 	void *kaddr;
 	void __user *uaddr;
 	bool locked = false;
@@ -2860,8 +2866,9 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;
 
 	if (likely(src)) {
-		copy_user_highpage(dst, src, addr, vma);
-		return true;
+		if (copy_mc_user_highpage(dst, src, addr, vma))
+			return -EHWPOISON;
+		return 0;
 	}
 
 	/*
@@ -2888,7 +2895,7 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
 			 * and update local tlb only
 			 */
 			update_mmu_tlb(vma, addr, vmf->pte);
-			ret = false;
+			ret = -EAGAIN;
 			goto pte_unlock;
 		}
 
@@ -2913,7 +2920,7 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 			/* The PTE changed under us, update local tlb */
 			update_mmu_tlb(vma, addr, vmf->pte);
-			ret = false;
+			ret = -EAGAIN;
 			goto pte_unlock;
 		}
 
@@ -2932,7 +2939,7 @@ static inline bool __wp_page_copy_user(struct page *dst, struct page *src,
 		}
 	}
 
-	ret = true;
+	ret = 0;
 
 pte_unlock:
 	if (locked)
@@ -3104,6 +3111,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 	pte_t entry;
 	int page_copied = 0;
 	struct mmu_notifier_range range;
+	int ret;
 
 	delayacct_wpcopy_start();
 
@@ -3121,19 +3129,21 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		if (!new_page)
 			goto oom;
 
-		if (!__wp_page_copy_user(new_page, old_page, vmf)) {
+		ret = __wp_page_copy_user(new_page, old_page, vmf);
+		if (ret) {
 			/*
 			 * COW failed, if the fault was solved by other,
 			 * it's fine. If not, userspace would re-fault on
 			 * the same address and we will handle the fault
 			 * from the second attempt.
+			 * The -EHWPOISON case will not be retried.
 			 */
 			put_page(new_page);
 			if (old_page)
 				put_page(old_page);
 
 			delayacct_wpcopy_end();
-			return 0;
+			return ret == -EHWPOISON ? VM_FAULT_HWPOISON : 0;
 		}
 		kmsan_copy_page_meta(new_page, old_page);
 	}
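
For reference, the user-space side of the test scenario described in the commit message (private page, fork(2), child writes) can be sketched roughly as below. This is only an illustration under stated assumptions: it injects no error (that still needs ACPI error injection and the einj_mem-uc tool), and cow_write_in_child() is a hypothetical helper name, not something taken from ras-tools.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from ras-tools): map one private anonymous
 * page, fork, and let the child write to it so that the kernel has to
 * service a copy-on-write fault. No error is injected here.
 *
 * Returns 0 if the child exited normally and the parent's copy is
 * unchanged, the signal number if the child was killed by a signal,
 * or -1 on a setup failure.
 */
int cow_write_in_child(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *page;
	pid_t pid;
	int status;
	int result = -1;

	if (page_size <= 0)
		return -1;

	page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return -1;

	/* Touch the page in the parent so it is populated before fork(). */
	memset(page, 'P', (size_t)page_size);

	pid = fork();
	if (pid < 0)
		goto out;

	if (pid == 0) {
		/* Child: the first write takes the copy-on-write fault. */
		page[0] = 'C';
		_exit(page[0] == 'C' ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) != pid)
		goto out;

	if (WIFSIGNALED(status))
		result = WTERMSIG(status);
	else if (WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
		 page[0] == 'P')
		result = 0;

out:
	munmap(page, (size_t)page_size);
	return result;
}
```

On a healthy page the helper should return 0. A return value equal to SIGBUS would correspond to the recovered case this patch adds, which can only be observed once an uncorrectable error has actually been injected into the page before the fork.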