From patchwork Fri Dec 15 12:07:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13494380
From: Baolin Wang
To: akpm@linux-foundation.org
Cc: david@redhat.com, ying.huang@intel.com, ziy@nvidia.com,
 xuyu@linux.alibaba.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH] mm: migrate: fix getting incorrect page mapping during page migration
Date: Fri, 15 Dec 2023 20:07:52 +0800
X-Mailer: git-send-email 2.39.3

When running stress-ng testing, we found the kernel crash below after a
few hours:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
pc : dentry_name+0xd8/0x224
lr : pointer+0x22c/0x370
sp : ffff800025f134c0
......
Call trace:
  dentry_name+0xd8/0x224
  pointer+0x22c/0x370
  vsnprintf+0x1ec/0x730
  vscnprintf+0x2c/0x60
  vprintk_store+0x70/0x234
  vprintk_emit+0xe0/0x24c
  vprintk_default+0x3c/0x44
  vprintk_func+0x84/0x2d0
  printk+0x64/0x88
  __dump_page+0x52c/0x530
  dump_page+0x14/0x20
  set_migratetype_isolate+0x110/0x224
  start_isolate_page_range+0xc4/0x20c
  offline_pages+0x124/0x474
  memory_block_offline+0x44/0xf4
  memory_subsys_offline+0x3c/0x70
  device_offline+0xf0/0x120
  ......

After analyzing the vmcore, I found that this issue is caused by page
migration. The scenario is as follows: one thread is doing page
migration, and between page unmap and page move we use the target page's
->mapping field to save the 'anon_vma' pointer. At that point the target
page is locked and its refcount is 1.

Meanwhile, another stress-ng thread is performing memory hotplug and
attempts to offline the target page that is being migrated. It discovers
that the refcount of this target page is 1, which prevents the offline
operation, so it proceeds to dump the page. However, page_mapping() of
the target page may return an incorrect file mapping and crash the
system in dump_mapping(), since the target page's ->mapping only holds
the 'anon_vma' pointer without the PAGE_MAPPING_ANON flag set.

There are several ways to fix this issue:
(1) Set the PAGE_MAPPING_ANON flag in the target page's ->mapping when
saving 'anon_vma'. However, this can confuse PageAnon() for PFN walkers,
since the target page has not built its mappings yet.
(2) Take the page lock before calling page_mapping() in __dump_page() to
avoid crashing the system. However, there are still some PFN walkers
that call page_mapping() without holding the page lock, such as
compaction.
(3) Use the target page's ->private field to save the 'anon_vma' pointer
together with the 2 bits of page state, just as page->mapping records an
anonymous page. This removes the page_mapping() impact on PFN walkers
and also seems to be a simple approach.

So I chose option 3 to fix this issue. It can also fix other potential
issues for PFN walkers, such as compaction.
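
The misinterpretation described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: the model_*
names and the stand-in structs are hypothetical, and
model_page_mapping() keeps just the low-bit check of the real
page_mapping(), leaving out its other cases (slab, swap cache and so
on). The two flag values are assumed to match
include/linux/page-flags.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Low bits of ->mapping used as type tags (assumed kernel values). */
#define PAGE_MAPPING_ANON	0x1UL
#define PAGE_MAPPING_FLAGS	0x3UL

/* Hypothetical stand-ins for the kernel types. */
struct anon_vma { long dummy; };
struct address_space { long dummy; };
struct model_page { void *mapping; };

/*
 * Simplified model of page_mapping(): a tagged pointer is not a file
 * mapping, so NULL is returned. Anything untagged is assumed to be a
 * struct address_space.
 */
static struct address_space *model_page_mapping(const struct model_page *page)
{
	uintptr_t m = (uintptr_t)page->mapping;

	if (m & PAGE_MAPPING_FLAGS)
		return NULL;
	return (struct address_space *)m;
}

/* What migration did before this patch: store the raw anon_vma pointer. */
static void model_store_untagged(struct model_page *page, struct anon_vma *av)
{
	page->mapping = av;
}

/* What a fully set-up anonymous page looks like: pointer plus ANON tag. */
static void model_store_tagged(struct model_page *page, struct anon_vma *av)
{
	page->mapping = (void *)((uintptr_t)av | PAGE_MAPPING_ANON);
}
```

With the untagged store, model_page_mapping() hands the anon_vma back as
if it were an address_space, which is the pointer a dump_mapping()-style
caller would then dereference. With the tagged store it returns NULL.
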

Fixes: 64c8902ed441 ("migrate_pages: split unmap_and_move() to _unmap() and _move()")
Signed-off-by: Baolin Wang
Reviewed-by: "Huang, Ying"
---
 mm/migrate.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 397f2a6e34cb..bad3039d165e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1025,38 +1025,31 @@ static int move_to_new_folio(struct folio *dst, struct folio *src,
 }
 
 /*
- * To record some information during migration, we use some unused
- * fields (mapping and private) of struct folio of the newly allocated
- * destination folio. This is safe because nobody is using them
- * except us.
+ * To record some information during migration, we use unused private
+ * field of struct folio of the newly allocated destination folio.
+ * This is safe because nobody is using it except us.
  */
-union migration_ptr {
-	struct anon_vma *anon_vma;
-	struct address_space *mapping;
-};
-
 enum {
 	PAGE_WAS_MAPPED = BIT(0),
 	PAGE_WAS_MLOCKED = BIT(1),
+	PAGE_OLD_STATES = PAGE_WAS_MAPPED | PAGE_WAS_MLOCKED,
 };
 
 static void __migrate_folio_record(struct folio *dst,
-				   unsigned long old_page_state,
+				   int old_page_state,
 				   struct anon_vma *anon_vma)
 {
-	union migration_ptr ptr = { .anon_vma = anon_vma };
-	dst->mapping = ptr.mapping;
-	dst->private = (void *)old_page_state;
+	dst->private = (void *)anon_vma + old_page_state;
 }
 
 static void __migrate_folio_extract(struct folio *dst,
 				   int *old_page_state,
 				   struct anon_vma **anon_vmap)
 {
-	union migration_ptr ptr = { .mapping = dst->mapping };
-	*anon_vmap = ptr.anon_vma;
-	*old_page_state = (unsigned long)dst->private;
-	dst->mapping = NULL;
+	unsigned long private = (unsigned long)dst->private;
+
+	*anon_vmap = (struct anon_vma *)(private & ~PAGE_OLD_STATES);
+	*old_page_state = private & PAGE_OLD_STATES;
 	dst->private = NULL;
 }
 
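
For readers who want to check the pointer packing outside the kernel,
here is a minimal userspace sketch of the record/extract round trip. It
is not the kernel implementation: struct model_folio, the stand-in
struct anon_vma, and the model_record()/model_extract() names are
hypothetical, and the asserts spell out the assumption the scheme relies
on, namely that the anon_vma pointer is at least 4-byte aligned so its
two low bits are free for the state flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	PAGE_WAS_MAPPED  = 1 << 0,
	PAGE_WAS_MLOCKED = 1 << 1,
	PAGE_OLD_STATES  = PAGE_WAS_MAPPED | PAGE_WAS_MLOCKED,
};

/* Hypothetical stand-ins for the kernel types. */
struct anon_vma { long dummy; };
struct model_folio { void *private; };

/* Pack the anon_vma pointer and the 2-bit state into one word. */
static void model_record(struct model_folio *dst, int old_page_state,
			 struct anon_vma *anon_vma)
{
	uintptr_t ptr = (uintptr_t)anon_vma;

	/* Assumptions: low two pointer bits are zero, state fits in them. */
	assert((ptr & PAGE_OLD_STATES) == 0);
	assert((old_page_state & ~PAGE_OLD_STATES) == 0);
	dst->private = (void *)(ptr | (uintptr_t)old_page_state);
}

/* Split the word back into pointer and state, then clear the field. */
static void model_extract(struct model_folio *dst, int *old_page_state,
			  struct anon_vma **anon_vmap)
{
	uintptr_t private = (uintptr_t)dst->private;

	*anon_vmap = (struct anon_vma *)(private & ~(uintptr_t)PAGE_OLD_STATES);
	*old_page_state = (int)(private & PAGE_OLD_STATES);
	dst->private = NULL;
}
```

Recording a pointer with any combination of the two flags and then
extracting it should give back the same pointer and the same flags, and
a NULL anon_vma round-trips as NULL. The sketch uses a bitwise OR where
the patch uses addition on a void pointer. The two are equivalent as
long as the low bits of the pointer are zero.
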