From patchwork Wed Apr 21 00:57:27 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12215267
From: Naoya Horiguchi
To: linux-mm@kvack.org, Tony Luck, Aili Yao
Cc: Andrew Morton, Oscar Salvador, David Hildenbrand, Borislav Petkov,
    Andy Lutomirski, Naoya Horiguchi, Jue Wang, linux-kernel@vger.kernel.org
Subject: [PATCH v3 2/3] mm,hwpoison: return -EHWPOISON when page already
Date: Wed, 21 Apr 2021 09:57:27 +0900
Message-Id: <20210421005728.1994268-3-nao.horiguchi@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210421005728.1994268-1-nao.horiguchi@gmail.com>
References: <20210421005728.1994268-1-nao.horiguchi@gmail.com>

From: Aili Yao

When a page is already poisoned, another memory_failure() call on the
same page currently returns 0, meaning OK. For nested MCE handling, this
behavior can leave one MCE looping. Example:

1. LMCE is enabled, and two processes A and B run on different cores X
   and Y and access the same page. The page turns out to be corrupted
   when process A accesses it, so an MCE is raised on core X and error
   handling for it gets underway.

2. Then B accesses the page and triggers another MCE on core Y. Core Y
   also runs error handling, sees TestSetPageHWPoison() return true, and
   memory_failure() returns 0.

3. kill_me_maybe() checks the return value:

	static void kill_me_maybe(struct callback_head *cb)
	{
		...
		if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
		    !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
			set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
			sync_core();
			return;
		}
		...
	}

4. Error handling for B ends, and possibly nothing happens if kill-early
   is not set. Process B then re-executes the instruction, takes the MCE
   again, and the loop continues. The set_mce_nospec() call here is also
   not appropriate; see commit fd0e786d9d09 ("x86/mm, mm/hwpoison: Don't
   unconditionally unmap kernel 1:1 pages").

Other callers that care about the return value of memory_failure() should
check why they want to process a memory error that has already been
processed. This behavior seems reasonable.

Signed-off-by: Aili Yao
Signed-off-by: Naoya Horiguchi
---
 mm/memory-failure.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git v5.12-rc8/mm/memory-failure.c v5.12-rc8_patched/mm/memory-failure.c
index 4087308e4b32..39d0ff0339b9 100644
--- v5.12-rc8/mm/memory-failure.c
+++ v5.12-rc8_patched/mm/memory-failure.c
@@ -1228,7 +1228,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	if (TestSetPageHWPoison(head)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 		       pfn);
-		return 0;
+		return -EHWPOISON;
 	}
 
 	num_poisoned_pages_inc();
@@ -1437,6 +1437,7 @@ int memory_failure(unsigned long pfn, int flags)
 	if (TestSetPageHWPoison(p)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
+		res = -EHWPOISON;
 		goto unlock_mutex;
 	}
 
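
For illustration only, the changed return-value contract can be modeled in
user space. This is a minimal sketch, not kernel code: every toy_* name, the
fixed-size flag array, and the toy_action enum are hypothetical stand-ins,
and the fallback value for EHWPOISON is the asm-generic one and is only used
where <errno.h> does not define it. It shows just that a second call on the
same pfn now reports -EHWPOISON instead of 0, so a caller can tell "handled
by this call" apart from "already poisoned by someone else".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* asm-generic value; fallback for non-Linux builds */
#endif

#define TOY_NR_PAGES 8	/* hypothetical: size of the toy "page" array */

/* Hypothetical stand-in for the per-page HWPoison flag. */
static bool toy_hwpoison[TOY_NR_PAGES];

/* Models TestSetPageHWPoison(): set the flag and return its old value. */
static bool toy_test_set_hwpoison(unsigned long pfn)
{
	bool old;

	assert(pfn < TOY_NR_PAGES);
	old = toy_hwpoison[pfn];
	toy_hwpoison[pfn] = true;
	return old;
}

/* Models the patched contract: 0 on first poisoning, -EHWPOISON after. */
static int toy_memory_failure(unsigned long pfn)
{
	if (toy_test_set_hwpoison(pfn))
		return -EHWPOISON;
	return 0;
}

enum toy_action {
	TOY_HANDLED_HERE,	/* this call did the poisoning work */
	TOY_ALREADY_POISONED,	/* another context got there first */
};

/* Hypothetical caller that distinguishes the two outcomes. */
static enum toy_action toy_handle_mce(unsigned long pfn)
{
	if (toy_memory_failure(pfn) == -EHWPOISON)
		return TOY_ALREADY_POISONED;
	return TOY_HANDLED_HERE;
}
```

With the old behavior both calls would have looked identical to the caller;
in this model the second toy_handle_mce() on the same pfn yields
TOY_ALREADY_POISONED, which is the information the scenario in the commit
message was missing.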