From patchwork Tue May 21 04:02:43 2024
X-Patchwork-Submitter: Lance Yang
X-Patchwork-Id: 13668894
From: Lance Yang <ioworker0@gmail.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, sj@kernel.org, baolin.wang@linux.alibaba.com,
	maskray@google.com, ziy@nvidia.com, ryan.roberts@arm.com,
	david@redhat.com, 21cnbao@gmail.com, mhocko@suse.com,
	fengwei.yin@intel.com, zokeefe@google.com, shy828301@gmail.com,
	xiehuan09@gmail.com, libang.li@antgroup.com,
	wangkefeng.wang@huawei.com, songmuchun@bytedance.com,
	peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Lance Yang <ioworker0@gmail.com>
Subject: [PATCH v6 2/3] mm/rmap: integrate PMD-mapped folio splitting into pagewalk loop
Date: Tue, 21 May 2024 12:02:43 +0800
Message-Id: <20240521040244.48760-3-ioworker0@gmail.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20240521040244.48760-1-ioworker0@gmail.com>
References: <20240521040244.48760-1-ioworker0@gmail.com>
In preparation for supporting try_to_unmap_one() to unmap PMD-mapped
folios, start the pagewalk first, then call split_huge_pmd_locked() to
split the folio within the walk.

Since TTU_SPLIT_HUGE_PMD no longer performs the split immediately, we
might encounter a PMD-mapped THP that is missing the mlock in a
VM_LOCKED range during the page walk. We mlock such a THP during the
walk to prevent it from being picked up during page reclaim.
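Roughly, the new flow inside the pagewalk loop of try_to_unmap_one()
looks like the sketch below (a simplified condensation of the mm/rmap.c
hunk that follows, not the literal diff; TTU_IGNORE_MLOCK handling and
the per-PTE unmap work are elided):

	while (page_vma_mapped_walk(&pvmw)) {
		/* !pvmw.pte means the walk stopped at a PMD mapping. */
		if (vma->vm_flags & VM_LOCKED) {
			/* Restore the mlock the deferred split missed. */
			if (!folio_test_large(folio) ||
			    (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)))
				mlock_vma_folio(folio, vma);
			goto walk_done_err;
		}

		if (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)) {
			/*
			 * Split under the PMD lock the walk already
			 * holds, then drop the PTL and restart from the
			 * now-PTE-mapped page table.
			 */
			split_huge_pmd_locked(vma, range.start, pvmw.pmd,
					      false, folio);
			pvmw.pmd = NULL;
			spin_unlock(pvmw.ptl);
			pvmw.ptl = NULL;
			flags &= ~TTU_SPLIT_HUGE_PMD;
			continue;
		}

		/* ... the usual per-PTE unmap work continues here ... */
	}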
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Lance Yang <ioworker0@gmail.com>
---
 include/linux/huge_mm.h |  6 ++++++
 mm/huge_memory.c        | 42 +++++++++++++++++++++--------------------
 mm/rmap.c               | 26 ++++++++++++++++++-------
 3 files changed, 47 insertions(+), 27 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index c8d3ec116e29..9fcb0b0b6ed1 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -409,6 +409,9 @@ static inline bool thp_migration_supported(void)
 	return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);
 }
 
+void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
+			   pmd_t *pmd, bool freeze, struct folio *folio);
+
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static inline bool folio_test_pmd_mappable(struct folio *folio)
@@ -471,6 +474,9 @@ static inline void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze, struct folio *folio) {}
 static inline void split_huge_pmd_address(struct vm_area_struct *vma,
 		unsigned long address, bool freeze, struct folio *folio) {}
+static inline void split_huge_pmd_locked(struct vm_area_struct *vma,
+					 unsigned long address, pmd_t *pmd,
+					 bool freeze, struct folio *folio) {}
 
 #define split_huge_pud(__vma, __pmd, __address)	\
 	do { } while (0)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 317de2afd371..425272c6c50b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2581,6 +2581,27 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	pmd_populate(mm, pmd, pgtable);
 }
 
+void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
+			   pmd_t *pmd, bool freeze, struct folio *folio)
+{
+	VM_WARN_ON_ONCE(folio && !folio_test_pmd_mappable(folio));
+	VM_WARN_ON_ONCE(!IS_ALIGNED(address, HPAGE_PMD_SIZE));
+	VM_WARN_ON_ONCE(folio && !folio_test_locked(folio));
+	VM_BUG_ON(freeze && !folio);
+
+	/*
+	 * When the caller requests to set up a migration entry, we
+	 * require a folio to check the PMD against. Otherwise, there
+	 * is a risk of replacing the wrong folio.
+	 */
+	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
+	    is_pmd_migration_entry(*pmd)) {
+		if (folio && folio != pmd_folio(*pmd))
+			return;
+		__split_huge_pmd_locked(vma, pmd, address, freeze);
+	}
+}
+
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze, struct folio *folio)
 {
@@ -2592,26 +2613,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 				(address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	ptl = pmd_lock(vma->vm_mm, pmd);
-
-	/*
-	 * If caller asks to setup a migration entry, we need a folio to check
-	 * pmd against. Otherwise we can end up replacing wrong folio.
-	 */
-	VM_BUG_ON(freeze && !folio);
-	VM_WARN_ON_ONCE(folio && !folio_test_locked(folio));
-
-	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
-	    is_pmd_migration_entry(*pmd)) {
-		/*
-		 * It's safe to call pmd_page when folio is set because it's
-		 * guaranteed that pmd is present.
-		 */
-		if (folio && folio != pmd_folio(*pmd))
-			goto out;
-		__split_huge_pmd_locked(vma, pmd, range.start, freeze);
-	}
-
-out:
+	split_huge_pmd_locked(vma, range.start, pmd, freeze, folio);
 	spin_unlock(ptl);
 	mmu_notifier_invalidate_range_end(&range);
 }
diff --git a/mm/rmap.c b/mm/rmap.c
index ddffa30c79fb..08a93347f283 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1640,9 +1640,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	if (flags & TTU_SYNC)
 		pvmw.flags = PVMW_SYNC;
 
-	if (flags & TTU_SPLIT_HUGE_PMD)
-		split_huge_pmd_address(vma, address, false, folio);
-
 	/*
 	 * For THP, we have to assume the worse case ie pmd for invalidation.
 	 * For hugetlb, it could be much worse if we need to do pud
@@ -1668,20 +1665,35 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	mmu_notifier_invalidate_range_start(&range);
 
 	while (page_vma_mapped_walk(&pvmw)) {
-		/* Unexpected PMD-mapped THP? */
-		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
-
 		/*
 		 * If the folio is in an mlock()d vma, we must not swap it out.
 		 */
 		if (!(flags & TTU_IGNORE_MLOCK) &&
 		    (vma->vm_flags & VM_LOCKED)) {
 			/* Restore the mlock which got missed */
-			if (!folio_test_large(folio))
+			if (!folio_test_large(folio) ||
+			    (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)))
 				mlock_vma_folio(folio, vma);
 			goto walk_done_err;
 		}
 
+		if (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)) {
+			/*
+			 * We temporarily have to drop the PTL and start once
+			 * again from that now-PTE-mapped page table.
+			 */
+			split_huge_pmd_locked(vma, range.start, pvmw.pmd, false,
+					      folio);
+			pvmw.pmd = NULL;
+			spin_unlock(pvmw.ptl);
+			pvmw.ptl = NULL;
+			flags &= ~TTU_SPLIT_HUGE_PMD;
+			continue;
+		}
+
+		/* Unexpected PMD-mapped THP? */
+		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
+
 		pfn = pte_pfn(ptep_get(pvmw.pte));
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
 		address = pvmw.address;