From patchwork Tue Feb 11 11:13:22 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969526
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 13/17] khugepaged: Lock all VMAs mapping the PTE table
Date: Tue, 11 Feb 2025 16:43:22 +0530
Message-Id: <20250211111326.14295-14-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20250211111326.14295-1-dev.jain@arm.com>
References: <20250211111326.14295-1-dev.jain@arm.com>

After enabling khugepaged to handle VMAs of any size, it may happen that
the process faults on a VMA other than the VMA under collapse, and
both these VMAs span the same PTE table. As a result, the fault handler
may install a new PTE table after khugepaged isolates the old one.
Therefore, scan the address range covered by the PTE table, look up every
VMA in it, and write-lock each one. Note that rmap can still reach the
PTE table from folios not under collapse; this is fine since it does not
interfere with the PTEs under collapse, nor the folios under collapse,
nor can rmap fill the PMD.

Signed-off-by: Dev Jain
---
 mm/khugepaged.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 048f990d8507..e1c2c5b89f6d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1139,6 +1139,23 @@ static int alloc_charge_folio(struct folio **foliop, struct mm_struct *mm,
 	return SCAN_SUCCEED;
 }
 
+static void take_vma_locks_per_pte(struct mm_struct *mm, unsigned long haddress)
+{
+	struct vm_area_struct *vma;
+	unsigned long start = haddress;
+	unsigned long end = haddress + HPAGE_PMD_SIZE;
+
+	while (start < end) {
+		vma = vma_lookup(mm, start);
+		if (!vma) {
+			start += PAGE_SIZE;
+			continue;
+		}
+		vma_start_write(vma);
+		start = vma->vm_end;
+	}
+}
+
 static int vma_collapse_anon_folio_pmd(struct mm_struct *mm, unsigned long address,
 		struct vm_area_struct *vma, struct collapse_control *cc, pmd_t *pmd,
 		struct folio *folio)
@@ -1270,7 +1287,9 @@ static int vma_collapse_anon_folio(struct mm_struct *mm, unsigned long address,
 	if (result != SCAN_SUCCEED)
 		goto out;
 
-	vma_start_write(vma);
+	/* Faulting may fill the PMD after flush; lock all VMAs mapping this PTE */
+	take_vma_locks_per_pte(mm, haddress);
+
 	anon_vma_lock_write(vma->anon_vma);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, haddress,
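
For readers following the walk in take_vma_locks_per_pte(), the loop can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions, not kernel code: struct range, lookup() and lock_ranges_in_pmd()
are hypothetical stand-ins for vm_area_struct, vma_lookup() and the
vma_start_write() calls, and the 4 KiB page and 2 MiB PMD sizes are assumed
for illustration.

```c
#include <stddef.h>

/* Assumed sizes for illustration: 4 KiB pages, 2 MiB PMD range. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_SIZE  (512UL * SKETCH_PAGE_SIZE)

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
	int locked;
};

/* Hypothetical stand-in for vma_lookup(): range containing addr, or NULL. */
static struct range *lookup(struct range *r, size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= r[i].start && addr < r[i].end)
			return &r[i];
	return NULL;
}

/*
 * Mirrors the loop in the patch: step page by page through holes, and
 * jump to the end of each range found after marking it locked.
 * Returns the number of ranges marked.
 */
static int lock_ranges_in_pmd(struct range *r, size_t n, unsigned long haddr)
{
	unsigned long start = haddr;
	unsigned long end = haddr + SKETCH_PMD_SIZE;
	int count = 0;

	while (start < end) {
		struct range *hit = lookup(r, n, start);

		if (!hit) {
			start += SKETCH_PAGE_SIZE;
			continue;
		}
		hit->locked = 1;	/* the patch calls vma_start_write() here */
		count++;
		start = hit->end;
	}
	return count;
}
```

In this model a range that merely straddles the end of the PMD-sized window
is still marked, while a range lying entirely outside the window is left
untouched.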