From patchwork Thu Jan 11 13:55:48 2024
X-Patchwork-Submitter: Tong Tiangen
X-Patchwork-Id: 13517459
From: Tong Tiangen <tongtiangen@huawei.com>
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	"H. Peter Anvin", Tony Luck, Andy Lutomirski, Peter Zijlstra,
	Andrew Morton, Naoya Horiguchi
CC: Tong Tiangen, Guohanjun
Subject: [PATCH -next v4 3/3] x86/mce: set MCE_IN_KERNEL_COPY_MC for
	DEFAULT_MCE_SAFE exception
Date: Thu, 11 Jan 2024 21:55:48 +0800
Message-ID: <20240111135548.3207437-4-tongtiangen@huawei.com>
In-Reply-To: <20240111135548.3207437-1-tongtiangen@huawei.com>
References: <20240111135548.3207437-1-tongtiangen@huawei.com>

From: Kefeng Wang

If an MCE happens in kernel space and the kernel can recover from it,
error_context() sets MCE_IN_KERNEL_RECOV in mce.kflags and the MCE is
then handled in do_machine_check(). But because MCE_IN_KERNEL_COPY_MC
is not set, the kernel does not panic, yet the corrupted page is not
isolated either: a new consumer may run into it again, which is not
what we expect.

To avoid this, several hwpoison recovery paths [1][2][3] call
memory_failure_queue() to cope with such unhandled corrupted pages.
There are also other existing MC-safe copy scenarios, e.g. nvdimm,
dm-writecache and dax, which do not isolate corrupted pages at all.
The best way to fix them all is to set MCE_IN_KERNEL_COPY_MC for
MC-safe copies and let the core do_machine_check() isolate the
corrupted page, instead of doing it one caller at a time.

EX_TYPE_FAULT_MCE_SAFE is used for the FPU; its logic is not touched
here. Only the logic of EX_TYPE_DEFAULT_MCE_SAFE, which is used in the
scenarios described above, is modified.
[1] commit d302c2398ba2 ("mm, hwpoison: when copy-on-write hits poison, take page offline")
[2] commit 1cb9dc4b475c ("mm: hwpoison: support recovery from HugePage copy-on-write faults")
[3] commit 6b970599e807 ("mm: hwpoison: support recovery from ksm_might_need_to_copy()")

Reviewed-by: Naoya Horiguchi
Reviewed-by: Tony Luck
Signed-off-by: Kefeng Wang
Signed-off-by: Tong Tiangen
---
 arch/x86/kernel/cpu/mce/severity.c |  4 ++--
 mm/ksm.c                           |  1 -
 mm/memory.c                        | 13 ++++---------
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index df67a7a13034..b4b1d028cbb3 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -292,11 +292,11 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
 	case EX_TYPE_UACCESS:
 		if (!copy_user)
 			return IN_KERNEL;
+		fallthrough;
+	case EX_TYPE_DEFAULT_MCE_SAFE:
 		m->kflags |= MCE_IN_KERNEL_COPY_MC;
 		fallthrough;
-	case EX_TYPE_FAULT_MCE_SAFE:
-	case EX_TYPE_DEFAULT_MCE_SAFE:
 		m->kflags |= MCE_IN_KERNEL_RECOV;
 		return IN_KERNEL_RECOV;

diff --git a/mm/ksm.c b/mm/ksm.c
index 8c001819cf10..ba9d324ea1c6 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -3084,7 +3084,6 @@ struct folio *ksm_might_need_to_copy(struct folio *folio,
 	if (copy_mc_user_highpage(folio_page(new_folio, 0), page,
 				addr, vma)) {
 		folio_put(new_folio);
-		memory_failure_queue(folio_pfn(folio), 0);
 		return ERR_PTR(-EHWPOISON);
 	}
 	folio_set_dirty(new_folio);

diff --git a/mm/memory.c b/mm/memory.c
index c66af4520958..33d8903ab2af 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2843,10 +2843,8 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src,
 	unsigned long addr = vmf->address;

 	if (likely(src)) {
-		if (copy_mc_user_highpage(dst, src, addr, vma)) {
-			memory_failure_queue(page_to_pfn(src), 0);
+		if (copy_mc_user_highpage(dst, src, addr, vma))
 			return -EHWPOISON;
-		}
 		return 0;
 	}

@@ -6176,10 +6174,8 @@ static int copy_user_gigantic_page(struct folio *dst, struct folio *src,
 		cond_resched();

 		if (copy_mc_user_highpage(dst_page, src_page,
-				addr + i*PAGE_SIZE, vma)) {
-			memory_failure_queue(page_to_pfn(src_page), 0);
+				addr + i*PAGE_SIZE, vma))
 			return -EHWPOISON;
-		}
 	}
 	return 0;
 }

@@ -6196,10 +6192,9 @@ static int copy_subpage(unsigned long addr, int idx, void *arg)
 	struct page *dst = nth_page(copy_arg->dst, idx);
 	struct page *src = nth_page(copy_arg->src, idx);

-	if (copy_mc_user_highpage(dst, src, addr, copy_arg->vma)) {
-		memory_failure_queue(page_to_pfn(src), 0);
+	if (copy_mc_user_highpage(dst, src, addr, copy_arg->vma))
 		return -EHWPOISON;
-	}
+
 	return 0;
 }