From: Jane Chu
To: linmiaohe@huawei.com, nao.horiguchi@gmail.com, akpm@linux-foundation.org,
    osalvador@suse.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 0/5] Enhance soft hwpoison handling and injection
Date: Fri, 24 May 2024 15:53:01 -0600
Message-Id: <20240524215306.2705454-1-jane.chu@oracle.com>
X-Mailer: git-send-email 2.39.3
X-Patchwork-Submitter: Jane Chu
X-Patchwork-Id: 13673755

Changes in v4:
  - collected R-B from Oscar Salvador
  - collected Acked-by from Miaohe Lin
  - fixed comment on MF_DELAYED, and comments for better coding.
    - Miaohe Lin
Changes in v3:
  - rebased to mainline as of 5/20/2024
  - added an acked-by from Miaohe Lin
  - picked up a R-B from Oscar Salvador
  - fixed/clarified comments about MF_IGNORED/MF_FAILED definition and usage.
    - Oscar Salvador
  - invoke hwpoison_filter slightly earlier to avoid unnecessary THP split,
    and with refcount held.
    - Miaohe Lin
  - added comments to try_to_split_thp_page() on when not to release page
    refcount.
    - Oscar Salvador
  - added action_result() in a couple of cases, taking care not to overwrite
    the intended returns.
    - Oscar Salvador
Changes in v2:
  - rebased to mm-stable as of 5/8/2024
  - added RB by Oscar Salvador
  - comments from Oscar on patch 1-of-3: clarify changelog
  - comments from Miaohe Lin on patch 3-of-3: remove unnecessary user page
    checking and remove incorrect put_page() in kill_procs_now().
    Invoke kill_procs_now() regardless of whether MF_ACTION_REQUIRED is set,
    and move hwpoison_filter() higher up.
  - added two patches 3-of-5 and 4-of-5

This series aims at the following enhancements:
- Let one hwpoison injector, madvise(MADV_HWPOISON), behave more as if a
  real UE had occurred. The other two injectors, hwpoison-inject and
  'einj' on x86, can't, and it seems to me we need a better simulation
  of a real UE scenario.
- For years, if the kernel is unable to unmap a hwpoisoned page, it sends
  a SIGKILL instead of SIGBUS to prevent the user process from potentially
  accessing the page again. But in doing so, the user process also loses
  the information it needs for recovery: the vaddr. Fortunately, the
  kernel already has code to kill a process re-accessing a hwpoisoned
  page, so remove the '!unmap_success' check.
- Right now, if a thp page under GUP longterm pin is hwpoisoned, and the
  kernel cannot split the thp page, memory-failure simply ignores the UE
  and returns. That's not ideal; it could deliver a SIGBUS with useful
  information for userspace recovery.

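For anyone who wants to look at the SIGBUS information from the userspace
side, below is a minimal sketch; it is not part of the series. It installs
an SA_SIGINFO handler that records si_code and si_addr, and poisons one
freshly mapped anonymous page via madvise(MADV_HWPOISON). The helper names
(bus_code_name, on_sigbus, install_sigbus_handler, inject_one_page) are
made up for illustration. It assumes a kernel built with
CONFIG_MEMORY_FAILURE and a caller with CAP_SYS_ADMIN, and whether the
signal shows up as BUS_MCEERR_AR presumably depends on the
MF_ACTION_REQUIRED change in this series being applied.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping: the last SIGBUS seen by the handler. */
static volatile sig_atomic_t last_bus_code;
static void *volatile last_bus_addr;

/* Map a SIGBUS si_code to a short label. */
const char *bus_code_name(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:
		return "BUS_MCEERR_AR";	/* action required */
	case BUS_MCEERR_AO:
		return "BUS_MCEERR_AO";	/* action optional */
	default:
		return "other";
	}
}

/* Only async-signal-safe work: record the code and address, then return. */
void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	last_bus_code = info->si_code;
	last_bus_addr = info->si_addr;	/* the vaddr useful for recovery */
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}

/*
 * DESTRUCTIVE: takes one physical page out of service until reboot.
 * Run only on a test machine. Returns 0 on success, -1 on failure
 * (for example EPERM without CAP_SYS_ADMIN).
 */
int inject_one_page(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	char *p;

	assert(psz > 0);
	p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	p[0] = 1;	/* fault the page in so there is something to poison */

	if (madvise(p, (size_t)psz, MADV_HWPOISON) != 0) {
		perror("madvise(MADV_HWPOISON)");
		munmap(p, (size_t)psz);
		return -1;
	}
	/* The page is not touched again after the injection. */
	printf("injected at %p, handler saw %s at %p\n", (void *)p,
	       bus_code_name(last_bus_code), (void *)last_bus_addr);
	return 0;
}
```

Calling install_sigbus_handler() and then inject_one_page() from main(),
as root on a test machine, is enough to try it. The handler deliberately
does nothing but record, since a real recovery path would act on the
address outside signal context.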
Jane Chu (5):
  mm/memory-failure: try to send SIGBUS even if unmap failed
  mm/madvise: Add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON)
  mm/memory-failure: improve memory failure action_result messages
  mm/memory-failure: move hwpoison_filter() higher up
  mm/memory-failure: send SIGBUS in the event of thp split fail

 include/linux/mm.h      |   2 +
 include/ras/ras_event.h |   2 +
 mm/madvise.c            |   2 +-
 mm/memory-failure.c     | 106 +++++++++++++++++++++++++++++-----------
 4 files changed, 82 insertions(+), 30 deletions(-)