From patchwork Wed Apr 27 06:10:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xu Yu
X-Patchwork-Id: 12828311
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 424D5C433FE
	for ; Wed, 27 Apr 2022 06:10:42 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id 9ADED6B007B; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id 934266B007E; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 83C926B007D; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.28])
	by kanga.kvack.org (Postfix) with ESMTP id 770DA6B0078
	for ; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay09.hostedemail.com (Postfix) with ESMTP id 5C9AA29013
	for ; Wed, 27 Apr 2022 06:10:41 +0000 (UTC)
X-FDA: 79401635082.04.2BACD5B
Received: from out30-44.freemail.mail.aliyun.com
	(out30-44.freemail.mail.aliyun.com [115.124.30.44])
	by imf19.hostedemail.com (Postfix) with ESMTP id 6F9E91A004C
	for ; Wed, 27 Apr 2022 06:10:35 +0000 (UTC)
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=xuyu@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VBR9y9r_1651039835;
Received: from localhost(mailfrom:xuyu@linux.alibaba.com
	fp:SMTPD_---0VBR9y9r_1651039835)
	by smtp.aliyun-inc.com(127.0.0.1);
	Wed, 27 Apr 2022 14:10:35 +0800
From: Xu Yu
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, naoya.horiguchi@nec.com, shy828301@gmail.com
Subject: [PATCH 1/2] Revert "mm/memory-failure.c: skip huge_zero_page
 in memory_failure()"
Date: Wed, 27 Apr 2022 14:10:16 +0800
Message-Id: <872cefb182ba1dd686b0e7db1e6b2ebe5a4fff87.1651039624.git.xuyu@linux.alibaba.com>
X-Mailer: git-send-email 2.20.1.2432.ga663e714
In-Reply-To:
References:
MIME-Version: 1.0
X-Rspamd-Server: rspam02
X-Rspamd-Queue-Id: 6F9E91A004C
X-Stat-Signature: cxis7zmdui5bh1cx51ymqdkjajybk44m
X-Rspam-User:
Authentication-Results: imf19.hostedemail.com;
	dkim=none;
	spf=pass (imf19.hostedemail.com: domain of xuyu@linux.alibaba.com
	designates 115.124.30.44 as permitted sender)
	smtp.mailfrom=xuyu@linux.alibaba.com;
	dmarc=pass (policy=none) header.from=alibaba.com
X-HE-Tag: 1651039835-195577
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

This reverts commit d173d5417fb67411e623d394aab986d847e47dad.

The commit d173d5417fb6 ("mm/memory-failure.c: skip huge_zero_page in
memory_failure()") explicitly skips huge_zero_page in memory_failure(),
in order to avoid triggering VM_BUG_ON_PAGE on huge_zero_page in
split_huge_page_to_list().

This works, but Yang Shi thinks that,

    Raising BUG is overkilling for splitting huge_zero_page. The
    huge_zero_page can't be met from normal paths other than memory
    failure, but memory failure is a valid caller. So I tend to replace
    the BUG to WARN + returning -EBUSY. If we don't care about the
    reason code in memory failure, we don't have to touch memory
    failure.

And for the issue that huge_zero_page will be set PG_has_hwpoisoned,
Yang Shi comments that,

    The anonymous page fault doesn't check if the page is poisoned or
    not since it typically gets a fresh allocated page and assumes the
    poisoned page (isolated successfully) can't be reallocated again.
    But huge zero page and base zero page are reused every time. So no
    matter what fix we pick, the issue is always there.

Finally, Yang, David, Anshuman and Naoya all agree to fix the bug, i.e.,
to split huge_zero_page, in split_huge_page_to_list().

This reverts the commit d173d5417fb6 ("mm/memory-failure.c: skip
huge_zero_page in memory_failure()"), and the original bug will be fixed
by the next patch.

Suggested-by: Yang Shi
Cc: Naoya Horiguchi
Signed-off-by: Xu Yu
Reviewed-by: Yang Shi
Reviewed-by: Miaohe Lin
---
 mm/memory-failure.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 27760c19bad7..2020944398c9 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1860,19 +1860,6 @@ int memory_failure(unsigned long pfn, int flags)
 	}
 
 	if (PageTransHuge(hpage)) {
-		/*
-		 * Bail out before SetPageHasHWPoisoned() if hpage is
-		 * huge_zero_page, although PG_has_hwpoisoned is not
-		 * checked in set_huge_zero_page().
-		 *
-		 * TODO: Handle memory failure of huge_zero_page thoroughly.
-		 */
-		if (is_huge_zero_page(hpage)) {
-			action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
-			res = -EBUSY;
-			goto unlock_mutex;
-		}
-
 		/*
 		 * The flag must be set after the refcount is bumped
 		 * otherwise it may race with THP split.

From patchwork Wed Apr 27 06:10:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xu Yu
X-Patchwork-Id: 12828312
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id DFAD3C433EF
	for ; Wed, 27 Apr 2022 06:10:43 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id E0CBB6B0078; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id D703E6B007D; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id A8D226B0080; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (relay.a.hostedemail.com [64.99.140.24])
	by kanga.kvack.org (Postfix) with ESMTP id 902706B0078
	for ; Wed, 27 Apr 2022 02:10:41 -0400 (EDT)
Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay08.hostedemail.com (Postfix) with ESMTP id 59CFA21A47
	for ; Wed, 27 Apr 2022 06:10:41 +0000 (UTC)
X-FDA: 79401635082.05.D1F0E47
Received: from out30-131.freemail.mail.aliyun.com
	(out30-131.freemail.mail.aliyun.com [115.124.30.131])
	by imf21.hostedemail.com (Postfix) with ESMTP id 627A61C0059
	for ; Wed, 27 Apr 2022 06:10:36 +0000 (UTC)
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R531e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=xuyu@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VBRGlj2_1651039836;
Received: from localhost(mailfrom:xuyu@linux.alibaba.com
	fp:SMTPD_---0VBRGlj2_1651039836)
	by smtp.aliyun-inc.com(127.0.0.1);
	Wed, 27 Apr 2022 14:10:36 +0800
From: Xu Yu
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, naoya.horiguchi@nec.com, shy828301@gmail.com
Subject: [PATCH 2/2] mm/huge_memory: do not overkill when
 splitting huge_zero_page
Date: Wed, 27 Apr 2022 14:10:17 +0800
Message-Id:
X-Mailer: git-send-email 2.20.1.2432.ga663e714
In-Reply-To:
References:
MIME-Version: 1.0
X-Rspamd-Server: rspam04
X-Rspamd-Queue-Id: 627A61C0059
X-Stat-Signature: q1ndxadhn5zt55oipobm7rfi87fx6fic
X-Rspam-User:
Authentication-Results: imf21.hostedemail.com;
	dkim=none;
	dmarc=pass (policy=none) header.from=alibaba.com;
	spf=pass (imf21.hostedemail.com: domain of xuyu@linux.alibaba.com
	designates 115.124.30.131 as permitted sender)
	smtp.mailfrom=xuyu@linux.alibaba.com
X-HE-Tag: 1651039836-633325
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

The kernel panics when a memory failure is injected for the global
huge_zero_page with CONFIG_DEBUG_VM enabled, as follows.

 Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
 page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
 head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
 flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
 raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
 raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
 page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
 ------------[ cut here ]------------
 kernel BUG at mm/huge_memory.c:2499!
 invalid opcode: 0000 [#1] PREEMPT SMP PTI
 CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
 RIP: 0010:split_huge_page_to_list+0x66a/0x880
 Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
 RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
 RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
 RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
 R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
 R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
 FS: 00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  try_to_split_thp_page+0x3a/0x130
  memory_failure+0x128/0x800
  madvise_inject_error.cold+0x8b/0xa1
  __x64_sys_madvise+0x54/0x60
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7fc3754f8bf9
 Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
 RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
 RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
 RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
 R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
 R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000

We think that raising BUG is overkill for splitting huge_zero_page: the
huge_zero_page can't be reached from normal paths other than memory
failure, but memory failure is a valid caller.
So replace the BUG with a WARN and return -EBUSY, so that the panic
above won't happen again.

Suggested-by: Yang Shi
Cc: Naoya Horiguchi
Signed-off-by: Xu Yu
Reviewed-by: Naoya Horiguchi
Reported-by: kernel test robot
---
 mm/huge_memory.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c468fee595ff..3bb464509518 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2496,10 +2496,12 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	int extra_pins, ret;
 	pgoff_t end;
 
-	VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
 	VM_BUG_ON_PAGE(!PageLocked(head), head);
 	VM_BUG_ON_PAGE(!PageCompound(head), head);
 
+	if (VM_WARN_ON_ONCE_PAGE(is_huge_zero_page(head), head))
+		return -EBUSY;
+
 	if (PageWriteback(head))
 		return -EBUSY;
 
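
For readers who want to see the behavioral change outside the kernel, the following is a minimal user-space sketch of the entry checks after this patch. It is not the kernel code. The names struct page_model, warn_on_once(), model_split_huge_page() and warnings_emitted are hypothetical and exist only for illustration; assert() stands in for VM_BUG_ON_PAGE, and the helper only approximates what VM_WARN_ON_ONCE_PAGE does (report once, evaluate to the condition).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the state of a compound page head. */
struct page_model {
	bool locked;
	bool compound;
	bool huge_zero;
	bool writeback;
};

/* Counts warnings emitted, so callers can observe the once-only behavior. */
static int warnings_emitted;

/*
 * Report the first time cond is true; always return cond so that the
 * caller can branch on it. This only approximates a warn-once helper.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Models only the entry checks of the split path, not the split itself. */
static int model_split_huge_page(const struct page_model *head)
{
	static bool warned;

	/* Caller contract violations still abort (assert models the BUG). */
	assert(head->locked);
	assert(head->compound);

	/* A valid caller may pass the huge zero page: warn once, then refuse. */
	if (warn_on_once(head->huge_zero, &warned, "is_huge_zero_page(head)"))
		return -EBUSY;

	if (head->writeback)
		return -EBUSY;

	return 0;
}
```

In this model, calling model_split_huge_page() repeatedly on a page with huge_zero set returns -EBUSY every time while printing the warning only once, so a caller such as the memory failure path sees an ordinary error instead of a crash.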