From patchwork Wed Feb 26 12:06:52 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13992226
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Huacai Chen, WANG Xuerui,
 Thomas Bogendoerfer, "James E.J. Bottomley", Helge Deller,
 Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
 Christophe Leroy, Naveen N Rao, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
 Christian Borntraeger, Sven Schnelle, Gerald Schaefer,
 "David S. Miller", Andreas Larsson, Arnd Bergmann, Muchun Song,
 Andrew Morton, Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
 "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual, Dev Jain,
 Kevin Brodsky, Alexandre Ghiti
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH v3 2/3] arm64: hugetlb: Fix huge_ptep_get_and_clear() for
 non-present ptes
Date: Wed, 26 Feb 2025 12:06:52 +0000
Message-ID: <20250226120656.2400136-3-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250226120656.2400136-1-ryan.roberts@arm.com>
References: <20250226120656.2400136-1-ryan.roberts@arm.com>

arm64 supports multiple huge_pte sizes. Some of the sizes are covered by
a single pte entry at a particular level (PMD_SIZE, PUD_SIZE), and some
are covered by multiple ptes at a particular level (CONT_PTE_SIZE,
CONT_PMD_SIZE). So the function has to figure out the size from the
huge_pte pointer. This was previously done by walking the pgtable to
determine the level and by using the PTE_CONT bit to determine the
number of ptes at the level.

But the PTE_CONT bit is only valid when the pte is present. For
non-present pte values (e.g. markers, migration entries), the previous
implementation was therefore erroneously determining the size. There is
at least one known caller in core-mm, move_huge_pte(), which may call
huge_ptep_get_and_clear() for a non-present pte. So we must be robust to
this case. Additionally the "regular" ptep_get_and_clear() is robust to
being called for non-present ptes so it makes sense to follow the
behavior.

Fix this by using the new sz parameter which is now provided to the
function. Additionally when clearing each pte in a contig range, don't
gather the access and dirty bits if the pte is not present.

An alternative approach that would not require API changes would be to
store the PTE_CONT bit in a spare bit in the swap entry pte for the
non-present case. But it felt cleaner to follow other APIs' lead and
just pass in the size.

As an aside, PTE_CONT is bit 52, which corresponds to bit 40 in the swap
entry offset field (layout of non-present pte).
Since hugetlb is never swapped to disk, this field will only be
populated for markers, which always set this bit to 0, and hwpoison swap
entries, which set the offset field to a PFN; so it would only ever be 1
for a 52-bit PVA system where memory in that high half was poisoned (I
think!). So in practice, this bit would almost always be zero for
non-present ptes and we would only clear the first entry if it was
actually a contiguous block. That's probably a less severe symptom than
if it was always interpreted as 1 and cleared out potentially-present
neighboring PTEs.

Cc: stable@vger.kernel.org
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Reviewed-by: Catalin Marinas
Signed-off-by: Ryan Roberts

tmp
---
 arch/arm64/mm/hugetlbpage.c | 53 ++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 06db4649af91..b3a7fafe8892 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -100,20 +100,11 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
 
 static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
 {
-	int contig_ptes = 0;
+	int contig_ptes = 1;
 
 	*pgsize = size;
 
 	switch (size) {
-#ifndef __PAGETABLE_PMD_FOLDED
-	case PUD_SIZE:
-		if (pud_sect_supported())
-			contig_ptes = 1;
-		break;
-#endif
-	case PMD_SIZE:
-		contig_ptes = 1;
-		break;
 	case CONT_PMD_SIZE:
 		*pgsize = PMD_SIZE;
 		contig_ptes = CONT_PMDS;
@@ -122,6 +113,8 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
 		*pgsize = PAGE_SIZE;
 		contig_ptes = CONT_PTES;
 		break;
+	default:
+		WARN_ON(!__hugetlb_valid_size(size));
 	}
 
 	return contig_ptes;
@@ -163,24 +156,23 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 			     unsigned long pgsize,
 			     unsigned long ncontig)
 {
-	pte_t orig_pte = __ptep_get(ptep);
-	unsigned long i;
-
-	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
-		pte_t pte = __ptep_get_and_clear(mm, addr, ptep);
-
-		/*
-		 * If HW_AFDBM is enabled, then the HW could turn on
-		 * the dirty or accessed bit for any page in the set,
-		 * so check them all.
-		 */
-		if (pte_dirty(pte))
-			orig_pte = pte_mkdirty(orig_pte);
-
-		if (pte_young(pte))
-			orig_pte = pte_mkyoung(orig_pte);
+	pte_t pte, tmp_pte;
+	bool present;
+
+	pte = __ptep_get_and_clear(mm, addr, ptep);
+	present = pte_present(pte);
+	while (--ncontig) {
+		ptep++;
+		addr += pgsize;
+		tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+		if (present) {
+			if (pte_dirty(tmp_pte))
+				pte = pte_mkdirty(pte);
+			if (pte_young(tmp_pte))
+				pte = pte_mkyoung(pte);
+		}
 	}
-	return orig_pte;
+	return pte;
 }
 
 static pte_t get_clear_contig_flush(struct mm_struct *mm,
@@ -401,13 +393,8 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 {
 	int ncontig;
 	size_t pgsize;
-	pte_t orig_pte = __ptep_get(ptep);
-
-	if (!pte_cont(orig_pte))
-		return __ptep_get_and_clear(mm, addr, ptep);
-
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
 
+	ncontig = num_contig_ptes(sz, &pgsize);
 	return get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 }
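
Illustration only, not part of the patch: the reworked get_clear_contig()
loop can be modelled in user space to show why the access/dirty bits are
gathered only when the first entry is present. Everything prefixed
model_ or MODEL_ below is a hypothetical stand-in (the flag bit positions
are invented and do not match the arm64 pte layout), and the plain store
used for clearing is an assumption that ignores the atomicity of the real
__ptep_get_and_clear().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for pte_t; bit positions are illustrative only. */
typedef struct { unsigned long val; } model_pte_t;

#define MODEL_PRESENT (1UL << 0)
#define MODEL_YOUNG   (1UL << 1)
#define MODEL_DIRTY   (1UL << 2)

static bool model_present(model_pte_t pte)
{
	return (pte.val & MODEL_PRESENT) != 0;
}

/* Return the old entry and zero the slot (no atomics in this model). */
static model_pte_t model_get_and_clear(model_pte_t *ptep)
{
	model_pte_t old = *ptep;

	ptep->val = 0;
	return old;
}

/*
 * Mirrors the shape of the patched loop: clear every entry in the range,
 * return the first one, and fold in young/dirty from the remaining
 * entries only if the first entry was present. ncontig must be >= 1,
 * which matches num_contig_ptes() now defaulting to 1.
 */
model_pte_t model_get_clear_contig(model_pte_t *ptep, unsigned long ncontig)
{
	model_pte_t pte = model_get_and_clear(ptep);
	bool present = model_present(pte);

	while (--ncontig) {
		model_pte_t tmp;

		ptep++;
		tmp = model_get_and_clear(ptep);
		if (present) {
			if (tmp.val & MODEL_DIRTY)
				pte.val |= MODEL_DIRTY;
			if (tmp.val & MODEL_YOUNG)
				pte.val |= MODEL_YOUNG;
		}
	}
	return pte;
}
```

For a non-present first entry, the model returns that entry's value
untouched, so bits that merely happen to sit at the young/dirty positions
in neighbouring entries are never merged into it; every slot in the range
is still cleared.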