From patchwork Mon Feb 17 14:07:53 2025
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton,
    Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
    "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
    Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v2 01/14] arm64: hugetlb: Cleanup huge_pte size discovery
 mechanisms
Date: Mon, 17 Feb 2025 14:07:53 +0000
Message-ID: <20250217140809.1702789-2-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>

Not all huge_pte helper APIs explicitly provide the size of the
huge_pte, so the helpers have to depend on various methods to determine
the size. Some of these methods are dubious. Let's clean up the code to
use the preferred methods and retire the dubious ones.

The options, in order of preference:

 - If the size is provided as a parameter, use it together with
   num_contig_ptes(). This is explicit and works for both present and
   non-present ptes.

 - If the vma is provided as a parameter, retrieve the size via
   huge_page_size(hstate_vma(vma)) and use it together with
   num_contig_ptes(). This is explicit and works for both present and
   non-present ptes.

 - If the pte is present and contiguous, use find_num_contig() to walk
   the pgtable, find the level and infer the number of ptes from the
   level. This only works for *present* ptes.

 - If the pte is present and not contiguous, infer from this that only
   1 pte needs to be operated on. This is ok if you don't care about
   the absolute size and just want to know the number of ptes.

 - NEVER rely on resolving the PFN of a present pte to a folio and
   using the folio's size. This is fragile at best, because there is
   nothing to stop the core-mm from allocating a folio twice as big as
   the huge_pte and mapping it across 2 consecutive huge_ptes, or just
   partially mapping it.

Where we require that the pte is present, add warnings if it is not.
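
To make the preference order concrete, a short sketch using only helpers
that already appear in this series (illustrative only, not part of the
patch):

	/* 1. Size passed in explicitly: works for present and non-present ptes. */
	ncontig = num_contig_ptes(sz, &pgsize);

	/* 2. Size derived from the vma: also works for non-present ptes. */
	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);

	/* 3. Walk the pgtable: only legitimate when the pte is present. */
	ncontig = find_num_contig(mm, addr, ptep, &pgsize);

	/* NEVER: infer the size from the folio backing a present pte. */
	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);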
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/mm/hugetlbpage.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 614b2feddba2..31ea826a8a09 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -136,7 +136,7 @@ pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
 		return orig_pte;

-	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
+	ncontig = find_num_contig(mm, addr, ptep, &pgsize);

 	for (i = 0; i < ncontig; i++, ptep++) {
 		pte_t pte = __ptep_get(ptep);
@@ -445,16 +445,19 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	pgprot_t hugeprot;
 	pte_t orig_pte;

+	VM_WARN_ON(!pte_present(pte));
+
 	if (!pte_cont(pte))
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);

-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 	dpfn = pgsize >> PAGE_SHIFT;

 	if (!__cont_access_flags_changed(ptep, pte, ncontig))
 		return 0;

 	orig_pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
+	VM_WARN_ON(!pte_present(orig_pte));

 	/* Make sure we don't lose the dirty or young state */
 	if (pte_dirty(orig_pte))
@@ -479,7 +482,10 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	size_t pgsize;
 	pte_t pte;

-	if (!pte_cont(__ptep_get(ptep))) {
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));
+
+	if (!pte_cont(pte)) {
 		__ptep_set_wrprotect(mm, addr, ptep);
 		return;
 	}
@@ -503,11 +509,15 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	size_t pgsize;
 	int ncontig;
+	pte_t pte;
+
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));

-	if (!pte_cont(__ptep_get(ptep)))
+	if (!pte_cont(pte))
 		return ptep_clear_flush(vma, addr, ptep);

-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);

 	return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 }

From patchwork Mon Feb 17 14:07:54 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 02/14] arm64: hugetlb: Refine tlb maintenance scope
Date: Mon, 17 Feb 2025 14:07:54 +0000
Message-ID: <20250217140809.1702789-3-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>
When operating on contiguous blocks of ptes (or pmds) for some hugetlb
sizes, we must honour break-before-make requirements: clear the block
down to invalid state in the pgtable, then invalidate the relevant tlb
entries, and only then make the pgtable entries valid again.

However, the tlb maintenance is currently always done assuming the
worst case stride (PAGE_SIZE), last_level (false) and tlb_level
(TLBI_TTL_UNKNOWN). We can do much better with the hinting: in reality,
we know the stride from the huge_pte pgsize, we are always operating
only on the last level, and we always know the tlb_level, again based
on pgsize. So let's start providing these hints.

Additionally, avoid tlb maintenance in set_huge_pte_at().
Break-before-make is only required if we are transitioning the
contiguous pte block from valid -> valid. So let's elide the
clear-and-flush ("break") if the pte range was previously invalid.
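
For example (a sketch only; the stride/last_level/tlb_level values
follow the switch statement in the helper added below, and the variable
names are illustrative):

	unsigned long stride = huge_page_size(hstate_vma(vma)); /* e.g. CONT_PTE_SIZE */

	/* Old: worst-case hints on every flush. */
	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);

	/*
	 * New: hints derived from the huge_pte size; for CONT_PTE_SIZE this
	 * becomes __flush_tlb_range(vma, start, end, PAGE_SIZE, true, 3).
	 */
	__flush_hugetlb_tlb_range(vma, start, end, stride, true);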
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/hugetlb.h | 29 +++++++++++++++++++----------
 arch/arm64/mm/hugetlbpage.c      |  9 ++++++---
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 07fbf5bf85a7..2a8155c4a882 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -69,29 +69,38 @@ extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,

 #include

-#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
-static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
-					   unsigned long start,
-					   unsigned long end)
+static inline void __flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					     unsigned long start,
+					     unsigned long end,
+					     unsigned long stride,
+					     bool last_level)
 {
-	unsigned long stride = huge_page_size(hstate_vma(vma));
-
 	switch (stride) {
 #ifndef __PAGETABLE_PMD_FOLDED
 	case PUD_SIZE:
-		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+		__flush_tlb_range(vma, start, end, PUD_SIZE, last_level, 1);
 		break;
 #endif
 	case CONT_PMD_SIZE:
 	case PMD_SIZE:
-		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+		__flush_tlb_range(vma, start, end, PMD_SIZE, last_level, 2);
 		break;
 	case CONT_PTE_SIZE:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, 3);
 		break;
 	default:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, TLBI_TTL_UNKNOWN);
 	}
 }

+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	unsigned long stride = huge_page_size(hstate_vma(vma));
+
+	__flush_hugetlb_tlb_range(vma, start, end, stride, false);
+}
+
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 31ea826a8a09..b7434ed1b93b 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -190,8 +190,9 @@ static pte_t get_clear_contig_flush(struct mm_struct *mm,
 {
 	pte_t orig_pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long end = addr + (pgsize * ncontig);

-	flush_tlb_range(&vma, addr, addr + (pgsize * ncontig));
+	__flush_hugetlb_tlb_range(&vma, addr, end, pgsize, true);
 	return orig_pte;
 }

@@ -216,7 +217,7 @@ static void clear_flush(struct mm_struct *mm,
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		__ptep_get_and_clear(mm, addr, ptep);

-	flush_tlb_range(&vma, saddr, addr);
+	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }

 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
@@ -245,7 +246,9 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	dpfn = pgsize >> PAGE_SHIFT;
 	hugeprot = pte_pgprot(pte);

-	clear_flush(mm, addr, ptep, pgsize, ncontig);
+	/* Only need to "break" if transitioning valid -> valid. */
+	if (pte_valid(__ptep_get(ptep)))
+		clear_flush(mm, addr, ptep, pgsize, ncontig);

 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
 		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);

From patchwork Mon Feb 17 14:07:55 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 03/14] mm/page_table_check: Batch-check pmds/puds just like ptes
Date: Mon, 17 Feb 2025 14:07:55 +0000
Message-ID: <20250217140809.1702789-4-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>

Convert page_table_check_p[mu]d_set(...) to
page_table_check_p[mu]ds_set(..., nr) to allow checking a contiguous
set of pmds/puds in a single batch. We retain
page_table_check_p[mu]d_set(...) as macros that call the new batch
functions with nr=1 for compatibility.

arm64 is about to reorganise its pte/pmd/pud helpers to reuse more code
and to allow the implementation for huge_pte to more efficiently set
ptes/pmds/puds in batches. We need these batch-helpers to make the
refactoring possible.
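
For example (a sketch of the intended usage; CONT_PMDS, the arm64
contiguous-pmd count, is used here purely for illustration):

	/* Check a whole contiguous block of pmds in one call... */
	page_table_check_pmds_set(mm, pmdp, pmd, CONT_PMDS);

	/* ...while existing callers keep using the nr=1 wrappers unchanged. */
	page_table_check_pmd_set(mm, pmdp, pmd);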
Reviewed-by: Anshuman Khandual
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/page_table_check.h | 30 +++++++++++++++++-----------
 mm/page_table_check.c            | 34 +++++++++++++++++++-------------
 2 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h
index 6722941c7cb8..289620d4aad3 100644
--- a/include/linux/page_table_check.h
+++ b/include/linux/page_table_check.h
@@ -19,8 +19,10 @@ void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd);
 void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud);
 void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 				 unsigned int nr);
-void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd);
-void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud);
+void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd,
+				 unsigned int nr);
+void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud,
+				 unsigned int nr);
 void __page_table_check_pte_clear_range(struct mm_struct *mm,
 					unsigned long addr, pmd_t pmd);

@@ -74,22 +76,22 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm,
 	__page_table_check_ptes_set(mm, ptep, pte, nr);
 }

-static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
-					    pmd_t pmd)
+static inline void page_table_check_pmds_set(struct mm_struct *mm,
+		pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
 	if (static_branch_likely(&page_table_check_disabled))
 		return;

-	__page_table_check_pmd_set(mm, pmdp, pmd);
+	__page_table_check_pmds_set(mm, pmdp, pmd, nr);
 }

-static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp,
-					    pud_t pud)
+static inline void page_table_check_puds_set(struct mm_struct *mm,
+		pud_t *pudp, pud_t pud, unsigned int nr)
 {
 	if (static_branch_likely(&page_table_check_disabled))
 		return;

-	__page_table_check_pud_set(mm, pudp, pud);
+	__page_table_check_puds_set(mm, pudp, pud, nr);
 }

 static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
@@ -129,13 +131,13 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm,
 {
 }

-static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
-					    pmd_t pmd)
+static inline void page_table_check_pmds_set(struct mm_struct *mm,
+		pmd_t *pmdp, pmd_t pmd, unsigned int nr)
 {
 }

-static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp,
-					    pud_t pud)
+static inline void page_table_check_puds_set(struct mm_struct *mm,
+		pud_t *pudp, pud_t pud, unsigned int nr)
 {
 }

@@ -146,4 +148,8 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
 }

 #endif /* CONFIG_PAGE_TABLE_CHECK */
+
+#define page_table_check_pmd_set(mm, pmdp, pmd)	page_table_check_pmds_set(mm, pmdp, pmd, 1)
+#define page_table_check_pud_set(mm, pudp, pud)	page_table_check_puds_set(mm, pudp, pud, 1)
+
 #endif /* __LINUX_PAGE_TABLE_CHECK_H */
diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 509c6ef8de40..dae4a7d776b3 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -234,33 +234,39 @@ static inline void page_table_check_pmd_flags(pmd_t pmd)
 		WARN_ON_ONCE(swap_cached_writable(pmd_to_swp_entry(pmd)));
 }

-void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd)
+void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd,
+				 unsigned int nr)
 {
+	unsigned int i;
+	unsigned long stride = PMD_SIZE >> PAGE_SHIFT;
+
 	if (&init_mm == mm)
 		return;
 	page_table_check_pmd_flags(pmd);

-	__page_table_check_pmd_clear(mm, *pmdp);
-	if (pmd_user_accessible_page(pmd)) {
-		page_table_check_set(pmd_pfn(pmd), PMD_SIZE >> PAGE_SHIFT,
-				     pmd_write(pmd));
-	}
+	for (i = 0; i < nr; i++)
+		__page_table_check_pmd_clear(mm, *(pmdp + i));
+	if (pmd_user_accessible_page(pmd))
+		page_table_check_set(pmd_pfn(pmd), stride * nr, pmd_write(pmd));
 }
-EXPORT_SYMBOL(__page_table_check_pmd_set);
+EXPORT_SYMBOL(__page_table_check_pmds_set);

-void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud)
+void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud,
+				 unsigned int nr)
 {
+	unsigned int i;
+	unsigned long stride = PUD_SIZE >> PAGE_SHIFT;
+
 	if (&init_mm == mm)
 		return;

-	__page_table_check_pud_clear(mm, *pudp);
-	if (pud_user_accessible_page(pud)) {
-		page_table_check_set(pud_pfn(pud), PUD_SIZE >> PAGE_SHIFT,
-				     pud_write(pud));
-	}
+	for (i = 0; i < nr; i++)
+		__page_table_check_pud_clear(mm, *(pudp + i));
+	if (pud_user_accessible_page(pud))
+		page_table_check_set(pud_pfn(pud), stride * nr, pud_write(pud));
 }
-EXPORT_SYMBOL(__page_table_check_pud_set);
+EXPORT_SYMBOL(__page_table_check_puds_set);

 void __page_table_check_pte_clear_range(struct mm_struct *mm,
 					unsigned long addr,

From patchwork Mon Feb 17 14:07:56 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 04/14] arm64/mm: Refactor __set_ptes() and __ptep_get_and_clear()
Date: Mon, 17 Feb 2025 14:07:56 +0000
Message-ID: <20250217140809.1702789-5-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>

Refactor __set_ptes(), set_pmd_at() and set_pud_at() so that they are
all thin wrappers around a new common set_ptes_anysz(), which takes a
pgsize parameter. Additionally, refactor __ptep_get_and_clear() and
pmdp_huge_get_and_clear() to use a new common ptep_get_and_clear_anysz()
which also takes a pgsize parameter.
These changes will permit the huge_pte API to efficiently batch-set
pgtable entries and take advantage of the future barrier optimizations.
Additionally, since the new *_anysz() helpers call the correct
page_table_check_*_set() API based on pgsize, huge_ptes will now get
proper coverage. Currently the huge_pte API always uses the pte API,
which assumes an entry only covers a single page.
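
For example (a sketch only, using the helpers introduced below;
CONT_PMDS, the arm64 contiguous-pmd count, is illustrative):

	/* The existing helpers become thin wrappers... */
	__set_ptes(mm, addr, ptep, pte, nr);	/* set_ptes_anysz(mm, ptep, pte, nr, PAGE_SIZE) */
	set_pmd_at(mm, addr, pmdp, pmd);	/* set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), 1, PMD_SIZE) */
	set_pud_at(mm, addr, pudp, pud);	/* set_ptes_anysz(mm, (pte_t *)pudp, pud_pte(pud), 1, PUD_SIZE) */

	/* ...and a contiguous batch of pmds can be set (and checked) in one call. */
	set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), CONT_PMDS, PMD_SIZE);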
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 108 +++++++++++++++++++------------
 1 file changed, 67 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b2a2ad1b9e8..e255a36380dc 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -420,23 +420,6 @@ static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
 	return pfn_pte(pte_pfn(pte) + nr, pte_pgprot(pte));
 }

-static inline void __set_ptes(struct mm_struct *mm,
-			      unsigned long __always_unused addr,
-			      pte_t *ptep, pte_t pte, unsigned int nr)
-{
-	page_table_check_ptes_set(mm, ptep, pte, nr);
-	__sync_cache_and_tags(pte, nr);
-
-	for (;;) {
-		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
-		if (--nr == 0)
-			break;
-		ptep++;
-		pte = pte_advance_pfn(pte, 1);
-	}
-}
-
 /*
  * Hugetlb definitions.
  */
@@ -641,30 +624,59 @@ static inline pgprot_t pud_pgprot(pud_t pud)
 	return __pgprot(pud_val(pfn_pud(pfn, __pgprot(0))) ^ pud_val(pud));
 }

-static inline void __set_pte_at(struct mm_struct *mm,
-				unsigned long __always_unused addr,
-				pte_t *ptep, pte_t pte, unsigned int nr)
+static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte,
+				  unsigned int nr, unsigned long pgsize)
 {
-	__sync_cache_and_tags(pte, nr);
-	__check_safe_pte_update(mm, ptep, pte);
-	__set_pte(ptep, pte);
+	unsigned long stride = pgsize >> PAGE_SHIFT;
+
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_ptes_set(mm, ptep, pte, nr);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmds_set(mm, (pmd_t *)ptep, pte_pmd(pte), nr);
+		break;
+	case PUD_SIZE:
+		page_table_check_puds_set(mm, (pud_t *)ptep, pte_pud(pte), nr);
+		break;
+	default:
+		VM_WARN_ON(1);
+	}
+
+	__sync_cache_and_tags(pte, nr * stride);
+
+	for (;;) {
+		__check_safe_pte_update(mm, ptep, pte);
+		__set_pte(ptep, pte);
+		if (--nr == 0)
+			break;
+		ptep++;
+		pte = pte_advance_pfn(pte, stride);
+	}
 }

-static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
-			      pmd_t *pmdp, pmd_t pmd)
+static inline void __set_ptes(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pte_t *ptep, pte_t pte, unsigned int nr)
 {
-	page_table_check_pmd_set(mm, pmdp, pmd);
-	return __set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd),
-						PMD_SIZE >> PAGE_SHIFT);
+	set_ptes_anysz(mm, ptep, pte, nr, PAGE_SIZE);
 }

-static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
-			      pud_t *pudp, pud_t pud)
+static inline void __set_pmds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pmd_t *pmdp, pmd_t pmd, unsigned int nr)
+{
+	set_ptes_anysz(mm, (pte_t *)pmdp, pmd_pte(pmd), nr, PMD_SIZE);
+}
+#define set_pmd_at(mm, addr, pmdp, pmd) __set_pmds(mm, addr, pmdp, pmd, 1)
+
+static inline void __set_puds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pud_t *pudp, pud_t pud, unsigned int nr)
 {
-	page_table_check_pud_set(mm, pudp, pud);
-	return __set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud),
-						PUD_SIZE >> PAGE_SHIFT);
+	set_ptes_anysz(mm, (pte_t *)pudp, pud_pte(pud), nr, PUD_SIZE);
 }
+#define set_pud_at(mm, addr, pudp, pud) __set_puds(mm, addr, pudp, pud, 1)

 #define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
 #define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
@@ -1276,16 +1288,34 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */

-static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
-					 unsigned long address, pte_t *ptep)
+static inline pte_t ptep_get_and_clear_anysz(struct mm_struct *mm, pte_t *ptep,
+					     unsigned long pgsize)
 {
 	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));

-	page_table_check_pte_clear(mm, pte);
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_pte_clear(mm, pte);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmd_clear(mm, pte_pmd(pte));
+		break;
+	case PUD_SIZE:
+		page_table_check_pud_clear(mm, pte_pud(pte));
+		break;
+	default:
+		VM_WARN_ON(1);
+	}

 	return pte;
 }

+static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
+					 unsigned long address, pte_t *ptep)
+{
+	return ptep_get_and_clear_anysz(mm, ptep, PAGE_SIZE);
+}
+
 static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long addr,
 				     pte_t *ptep, unsigned int nr, int full)
 {
@@ -1322,11 +1352,7 @@ static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm,
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 					    unsigned long address, pmd_t *pmdp)
 {
-	pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
-
-	page_table_check_pmd_clear(mm, pmd);
-
-	return pmd;
+	return pte_pmd(ptep_get_and_clear_anysz(mm, (pte_t *)pmdp, PMD_SIZE));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

From patchwork Mon Feb 17 14:07:57 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 05/14] arm64: hugetlb: Use set_ptes_anysz() and ptep_get_and_clear_anysz()
Date: Mon, 17 Feb 2025 14:07:57 +0000
Message-ID: <20250217140809.1702789-6-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>

Refactor the huge_pte helpers to use the new common
set_ptes_anysz() and ptep_get_and_clear_anysz() APIs.

This provides 2 benefits: first, when page_table_check=on, hugetlb is
now properly/fully checked; previously only the first page of a hugetlb
folio was checked. Second, instead of having to call __set_ptes(nr=1)
for each pte in a loop, the whole contiguous batch can now be set in
one go, which enables some efficiencies and cleans up the code.

One detail to note is that huge_ptep_clear_flush() was previously
calling ptep_clear_flush() for a non-contiguous pte (i.e. a pud or pmd
block mapping). This has a couple of disadvantages: first,
ptep_clear_flush() calls ptep_get_and_clear(), which transparently
handles contpte. Given we only call it for non-contiguous ptes, it
would be safe, but a waste of effort. It's preferable to go straight to
the layer below. More problematic, however, is that ptep_get_and_clear()
is for PAGE_SIZE entries, so it calls page_table_check_pte_clear() and
would not clear the whole hugetlb folio. So let's stop special-casing
the non-cont case and just rely on get_clear_contig_flush() to do the
right thing for non-cont entries.
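
The shape of the change, roughly (a sketch lifted from the diff below):

	/* Before: one call per pte, advancing the pfn by hand. */
	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);

	/* After: one batched call; page_table_check sees the whole folio. */
	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);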
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/mm/hugetlbpage.c | 52 +++++++------------------------------
 1 file changed, 10 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index b7434ed1b93b..8ac86cd180b3 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -166,12 +166,12 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 	pte_t pte, tmp_pte;
 	bool present;

-	pte = __ptep_get_and_clear(mm, addr, ptep);
+	pte = ptep_get_and_clear_anysz(mm, ptep, pgsize);
 	present = pte_present(pte);
 	while (--ncontig) {
 		ptep++;
 		addr += pgsize;
-		tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+		tmp_pte = ptep_get_and_clear_anysz(mm, ptep, pgsize);
 		if (present) {
 			if (pte_dirty(tmp_pte))
 				pte = pte_mkdirty(pte);
@@ -215,7 +215,7 @@ static void clear_flush(struct mm_struct *mm,
 	unsigned long i, saddr = addr;

 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
-		__ptep_get_and_clear(mm, addr, ptep);
+		ptep_get_and_clear_anysz(mm, ptep, pgsize);

 	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
@@ -226,32 +226,20 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	size_t pgsize;
 	int i;
 	int ncontig;
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;

 	ncontig = num_contig_ptes(sz, &pgsize);

 	if (!pte_present(pte)) {
 		for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
-			__set_ptes(mm, addr, ptep, pte, 1);
+			set_ptes_anysz(mm, ptep, pte, 1, pgsize);
 		return;
 	}

-	if (!pte_cont(pte)) {
-		__set_ptes(mm, addr, ptep, pte, 1);
-		return;
-	}
-
-	pfn = pte_pfn(pte);
-	dpfn = pgsize >> PAGE_SHIFT;
-	hugeprot = pte_pgprot(pte);
-
 	/* Only need to "break" if transitioning valid -> valid. */
-	if (pte_valid(__ptep_get(ptep)))
+	if (pte_cont(pte) && pte_valid(__ptep_get(ptep)))
 		clear_flush(mm, addr, ptep, pgsize, ncontig);

-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 }

 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -441,11 +429,9 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 			       unsigned long addr, pte_t *ptep,
 			       pte_t pte, int dirty)
 {
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize = 0;
-	unsigned long pfn = pte_pfn(pte), dpfn;
 	struct mm_struct *mm = vma->vm_mm;
-	pgprot_t hugeprot;
 	pte_t orig_pte;

 	VM_WARN_ON(!pte_present(pte));
@@ -454,7 +440,6 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);

 	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;

 	if (!__cont_access_flags_changed(ptep, pte, ncontig))
 		return 0;
@@ -469,19 +454,14 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	if (pte_young(orig_pte))
 		pte = pte_mkyoung(pte);

-	hugeprot = pte_pgprot(pte);
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
-
+	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 	return 1;
 }

 void huge_ptep_set_wrprotect(struct mm_struct *mm,
 			     unsigned long addr, pte_t *ptep)
 {
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize;
 	pte_t pte;
@@ -494,16 +474,11 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	}

 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;

 	pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 	pte = pte_wrprotect(pte);

-	hugeprot = pte_pgprot(pte);
-	pfn = pte_pfn(pte);
-
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	set_ptes_anysz(mm, ptep, pte, ncontig, pgsize);
 }

 pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
@@ -512,13 +487,6 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	size_t pgsize;
 	int ncontig;
-	pte_t pte;
-
-	pte = __ptep_get(ptep);
-	VM_WARN_ON(!pte_present(pte));
-
-	if (!pte_cont(pte))
-		return ptep_clear_flush(vma, addr, ptep);

 	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
 	return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);

From patchwork Mon Feb 17 14:07:58 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 06/14] arm64/mm: Hoist barriers out of set_ptes_anysz() loop
Date: Mon, 17 Feb 2025 14:07:58 +0000
Message-ID: <20250217140809.1702789-7-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>
set_ptes_anysz() previously called __set_pte() for each PTE in the
range, which would conditionally issue a DSB and ISB to make the new
PTE value immediately visible to the table walker if the new PTE was
valid and for kernel space.

We can do better than this; let's hoist those barriers out of the loop
so that they are only issued once at the end of the loop. We then
reduce the cost by the number of PTEs in the range.
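
Conceptually (a simplified sketch; the real loop is in the diff below):

	/* Before: __set_pte() may issue dsb(ishst) + isb() for every entry. */
	for (i = 0; i < nr; i++, ptep++, pte = pte_advance_pfn(pte, stride))
		__set_pte(ptep, pte);

	/* After: write all the entries first, then publish once. */
	for (i = 0; i < nr; i++, ptep++, pte = pte_advance_pfn(pte, stride))
		__set_pte_nosync(ptep, pte);
	__set_pte_complete(pte);	/* at most one dsb/isb per batch */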
@@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte) } } +static inline void __set_pte(pte_t *ptep, pte_t pte) +{ + __set_pte_nosync(ptep, pte); + __set_pte_complete(pte); +} + static inline pte_t __ptep_get(pte_t *ptep) { return READ_ONCE(*ptep); @@ -647,12 +651,14 @@ static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte, for (;;) { __check_safe_pte_update(mm, ptep, pte); - __set_pte(ptep, pte); + __set_pte_nosync(ptep, pte); if (--nr == 0) break; ptep++; pte = pte_advance_pfn(pte, stride); } + + __set_pte_complete(pte); } static inline void __set_ptes(struct mm_struct *mm, From patchwork Mon Feb 17 14:07:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13977912 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F716C021A9 for ; Mon, 17 Feb 2025 14:08:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D7DC628005F; Mon, 17 Feb 2025 09:08:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D5601280059; Mon, 17 Feb 2025 09:08:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BF5B328005F; Mon, 17 Feb 2025 09:08:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 9D737280059 for ; Mon, 17 Feb 2025 09:08:40 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 5DF8B479B2 for ; Mon, 17 Feb 2025 14:08:40 +0000 (UTC) X-FDA: 83129617200.07.49F6E8C Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf07.hostedemail.com (Postfix) with ESMTP id B9B094000E for ; Mon, 17 Feb 2025 14:08:38 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf07.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739801318; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YaLcawmwrUWcb66XB/Sg9CcZOpkFUpumQM1gp+Tb5Og=; b=6ssVx3A24QrO2EfljkcB6IgtJAchP35d7oVKRStc2qGkY3+XTTmrWzXIeYzUdR5la7aGE4 azyviqCvr8EqxqmZ4BVUc4vpZgV5ctqMUKBtKtJMmJgmlJgLZNGMoKek/w1YOWT2MC40Vb KL4zVqQUem0s4af4pMEu1SypMpcOTl8= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf07.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739801318; a=rsa-sha256; cv=none; b=4xAR1O7txMD8HSYxTCKH4VVHdwKiewdy4pofw9VMHxvOBKSgO/KQNaRi7llO5g1mXEf9/y h0wafvyCiu/ZXBF6oez3tdbkxxhXQgpKV/VXEQf1xqz74JIsC8qKygYtLs9QY4F7B6QaqF x3hTKYJoZzD5qbEbU3i95s/4uHGG9OM= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5A12A1A2D; Mon, 17 
Feb 2025 06:08:57 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.27]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E02EF3F6A8; Mon, 17 Feb 2025 06:08:35 -0800 (PST) From: Ryan Roberts To: Catalin Marinas , Will Deacon , Pasha Tatashin , Andrew Morton , Uladzislau Rezki , Christoph Hellwig , David Hildenbrand , "Matthew Wilcox (Oracle)" , Mark Rutland , Anshuman Khandual , Alexandre Ghiti , Kevin Brodsky Cc: Ryan Roberts , linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 07/14] arm64/mm: Avoid barriers for invalid or userspace mappings Date: Mon, 17 Feb 2025 14:07:59 +0000 Message-ID: <20250217140809.1702789-8-ryan.roberts@arm.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com> References: <20250217140809.1702789-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: B9B094000E X-Stat-Signature: cd47hckz1senwzwsq6icidkfgpxreb1p X-HE-Tag: 1739801318-190270 X-HE-Meta: U2FsdGVkX188sNUFmBFqnSlHlAYqoRtifSOV+YaDYNIyAnRUn5b/B0YDOyKa4y2NqHPelDjCp1LUjO9ItJmD6wVC7wuEvnk5rHwOH4jGuwbd2A+j3uN90dEr+jaeN1IRLkpltb+cRDIbzC3WLP3kJ30hWJ8QLSaLaPaAB5pVSnl/v4WOEdRT7RDV/LCGA1PKhvnJpaSyG1rEhgOgFcZoc08bHenooKBuYooOSztL7FoKibatz2MwnH4mLKsD6/pS7C871haQyBjcXv7CwVYiYxkMckBkmHK9inRpaIA4EXvlX3ObDkYEXMrv+C7JsTLsVB08YDReTNQM3FKNHvS6fwBxhoTtm4y6FJiaP6MZdJWHd7mhEihxv6JPVqWvZi96Hr3P2sbuuf8lxYQt6kWlfygcNSkvfbi2X/3sC4EDfxIe5Vw9lE5yc/RgfitlNOHIzk0LcG/VENhcHqqeUbDGeSNfOec3UbAMu+eyoTm0UCF/ikiy8KDQfdopj/1OToaZSo2W9YoU/CxwzegJLkTXaPPyaN8mGixORcwTKw+1s0hsqbUSG+sO2hRv80owDHHJbxyzJ6w7MGcxe5Fk+doKjIAKfSTtAutPlbr8KwOMitwmmzD+/Jy9bsQZB/aj//cNPCIb3dEQA2UfnwJJBbQJnTP2AxTis7kaula+hxP0BXZgPEIL29ji+JzQJUoFuFSNcXkHRUauMSOeRXP+43I954k6pUValendVkHJ8OBZG1ezl4jMRiwGWr6ViUp08/36LdFQetG31Yu2Ng5wkZhE5iAvCC43zxFkl1ZBsPuFD7+TP8ADWEndDget3nOJW6ihLXHqHIAu7s6QmVvlWUNQ0Z0xrHzXJ/nkjgzBrFGjAasQPPI+LWb1terPJp1E9WvS/tMn0PSTi5cOBONsbzOk9vNNCafoBv7KQWcpHmuNYihmueuqHjCn9s2NItCcxsIsR3yEY4aqqr+TWFBBwFE K+uOsN/w 5vK4HytidHnY0MtF1E7+Zu5twPgpKxfG+3/9jMI/wewFnrZAOsbH1kSQmddcnw4bnNOrIfHLfMvA+bYmcOKmHn9R2QOj1UUwJxlX+Wj6nhpGZyuohiH9zyz+gfu0RG9gPdVMSrFsj4o3cT0d3hh7E6eazHA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: __set_pte_complete(), set_pmd(), set_pud(), set_p4d() and set_pgd() are used to write entries into pgtables. And they issue barriers (currently dsb and isb) to ensure that the written values are observed by the table walker prior to any program-order-future memory access to the mapped location. Over the years some of these functions have received optimizations: In particular, commit 7f0b1bf04511 ("arm64: Fix barriers used for page table modifications") made it so that the barriers were only emitted for valid-kernel mappings for set_pte() (now __set_pte_complete()). And commit 0795edaf3f1f ("arm64: pgtable: Implement p[mu]d_valid() and check in set_p[mu]d()") made it so that set_pmd()/set_pud() only emitted the barriers for valid mappings. set_p4d()/set_pgd() continue to emit the barriers unconditionally. This is all very confusing to the casual observer; surely the rules should be invariant to the level? Let's change this so that every level consistently emits the barriers only when setting valid, non-user entries (both table and leaf). 
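To make the end state concrete, here is an illustrative sketch of the shape every setter takes after this patch; pxd stands in for pmd/pud/p4d/pgd, and pxd_valid_not_user() stands in for the per-level helpers introduced in the diff below:

static inline void set_pxd_sketch(pxd_t *pxdp, pxd_t pxd)
{
	WRITE_ONCE(*pxdp, pxd);

	/*
	 * Barriers are only needed when the walker may legitimately start
	 * using the new entry right away, i.e. for valid kernel entries.
	 * All other cases are covered by the barriers in the TLB
	 * maintenance that must accompany invalidation or break-before-make.
	 */
	if (pxd_valid_not_user(pxd)) {
		dsb(ishst);
		isb();
	}
}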
It seems obvious that if it is ok to elide barriers for all but valid kernel mappings at pte level, it must also be ok to do this for leaf entries at other levels: If setting an entry to invalid, a TLB maintenance operation must surely follow to synchronise the TLB and this contains the required barriers. If setting a valid user mapping, the previous mapping must have been invalid and there must have been a TLB maintenance operation (complete with barriers) to honour break-before-make. So the worst that can happen is we take an extra fault (which will imply the DSB + ISB) and conclude that there is nothing to do. These are the arguments for doing this optimization at pte level and they also apply to leaf mappings at other levels. For table entries, the same arguments hold: If unsetting a table entry, a TLB maintenance operation is required and this will emit the required barriers. If setting a table entry, the previous value must have been invalid and the table walker must already be able to observe that. Additionally, the contents of the pgtable being pointed to in the newly set entry must be visible before the entry is written, and this is enforced via smp_wmb() (dmb) in the pgtable allocation functions and in __split_huge_pmd_locked(). But this last part could never have been enforced by the barriers in set_pXd() because they occur after updating the entry. So ultimately, the worst that can happen by eliding these barriers for user table entries is an extra fault. I observe roughly the same number of page faults (107M) with and without this change when compiling the kernel on Apple M2. Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/pgtable.h | 34 ++++++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index e4b1946b261f..51128c2956f8 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -767,6 +767,19 @@ static inline bool in_swapper_pgdir(void *addr) ((unsigned long)swapper_pg_dir & PAGE_MASK); } +static inline bool pmd_valid_not_user(pmd_t pmd) +{ + /* + * User-space table entries always have (PXN && !UXN). All other + * combinations indicate it's a table entry for kernel space. + * Valid-not-user leaf entries follow the same rules as + * pte_valid_not_user().
+ */ + if (pmd_table(pmd)) + return !((pmd_val(pmd) & (PMD_TABLE_PXN | PMD_TABLE_UXN)) == PMD_TABLE_PXN); + return pte_valid_not_user(pmd_pte(pmd)); +} + static inline void set_pmd(pmd_t *pmdp, pmd_t pmd) { #ifdef __PAGETABLE_PMD_FOLDED @@ -778,7 +791,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd) WRITE_ONCE(*pmdp, pmd); - if (pmd_valid(pmd)) { + if (pmd_valid_not_user(pmd)) { dsb(ishst); isb(); } @@ -833,6 +846,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd) #define pud_valid(pud) pte_valid(pud_pte(pud)) #define pud_user(pud) pte_user(pud_pte(pud)) #define pud_user_exec(pud) pte_user_exec(pud_pte(pud)) +#define pud_valid_not_user(pud) pmd_valid_not_user(pte_pmd(pud_pte(pud))) static inline bool pgtable_l4_enabled(void); @@ -845,7 +859,7 @@ static inline void set_pud(pud_t *pudp, pud_t pud) WRITE_ONCE(*pudp, pud); - if (pud_valid(pud)) { + if (pud_valid_not_user(pud)) { dsb(ishst); isb(); } @@ -916,6 +930,7 @@ static inline bool mm_pud_folded(const struct mm_struct *mm) #define p4d_none(p4d) (pgtable_l4_enabled() && !p4d_val(p4d)) #define p4d_bad(p4d) (pgtable_l4_enabled() && !(p4d_val(p4d) & P4D_TABLE_BIT)) #define p4d_present(p4d) (!p4d_none(p4d)) +#define p4d_valid_not_user(p4d) pmd_valid_not_user(pte_pmd(p4d_pte(p4d))) static inline void set_p4d(p4d_t *p4dp, p4d_t p4d) { @@ -925,8 +940,11 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d) } WRITE_ONCE(*p4dp, p4d); - dsb(ishst); - isb(); + + if (p4d_valid_not_user(p4d)) { + dsb(ishst); + isb(); + } } static inline void p4d_clear(p4d_t *p4dp) @@ -1043,6 +1061,7 @@ static inline bool mm_p4d_folded(const struct mm_struct *mm) #define pgd_none(pgd) (pgtable_l5_enabled() && !pgd_val(pgd)) #define pgd_bad(pgd) (pgtable_l5_enabled() && !(pgd_val(pgd) & PGD_TABLE_BIT)) #define pgd_present(pgd) (!pgd_none(pgd)) +#define pgd_valid_not_user(pgd) pmd_valid_not_user(pte_pmd(pgd_pte(pgd))) static inline void set_pgd(pgd_t *pgdp, pgd_t pgd) { @@ -1052,8 +1071,11 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd) } WRITE_ONCE(*pgdp, pgd); - dsb(ishst); - isb(); + + if (pgd_valid_not_user(pgd)) { + dsb(ishst); + isb(); + } } static inline void pgd_clear(pgd_t *pgdp) From patchwork Mon Feb 17 14:08:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13977913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17DBCC021A9 for ; Mon, 17 Feb 2025 14:08:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 938BF280060; Mon, 17 Feb 2025 09:08:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C23B280059; Mon, 17 Feb 2025 09:08:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 789DD280060; Mon, 17 Feb 2025 09:08:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 54DA8280059 for ; Mon, 17 Feb 2025 09:08:43 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 06D8F1C71A0 for ; Mon, 17 Feb 2025 14:08:43 +0000 (UTC) X-FDA: 83129617326.24.FFA3DAD Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf24.hostedemail.com (Postfix) with ESMTP id 
From: Ryan Roberts To: Catalin Marinas , Will Deacon , Pasha Tatashin , Andrew Morton , Uladzislau Rezki , Christoph Hellwig , David Hildenbrand , "Matthew Wilcox (Oracle)" , Mark Rutland , Anshuman Khandual , Alexandre Ghiti , Kevin Brodsky Cc: Ryan Roberts , linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 08/14] mm/vmalloc: Warn on improper use of vunmap_range() Date: Mon, 17 Feb 2025 14:08:00 +0000 Message-ID: <20250217140809.1702789-9-ryan.roberts@arm.com> In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com> References: <20250217140809.1702789-1-ryan.roberts@arm.com> MIME-Version: 1.0
A call to vmalloc_huge() may cause memory blocks to be mapped at pmd or pud level. But it is possible to subsequently call vunmap_range() on a sub-range of the mapped memory, which partially overlaps a pmd or pud. In this case, vmalloc unmaps the entire pmd or pud so that the non-overlapping portion is also unmapped. Clearly that would have a bad outcome, but it's not something that any callers do today as far as I can tell. So I guess it's just expected that callers will not do this. However, it would be useful to know if this happens in future; let's add a warning to cover the eventuality.
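For illustration, here is a sketch of the (so far hypothetical) caller pattern the warning is aimed at; the 2M/4K sizes simply assume the allocation ended up mapped with a single PMD block entry:

#include <linux/gfp.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>

static void partial_vunmap_example(void)
{
	/* May be mapped at pmd level when huge vmalloc mappings are in use. */
	void *p = vmalloc_huge(SZ_2M, GFP_KERNEL);
	unsigned long addr = (unsigned long)p;

	/*
	 * Unmapping only the first 4K partially overlaps the 2M block;
	 * vmalloc clears the whole pmd, so the remaining 2M - 4K gets
	 * unmapped too. This is the case the new WARN_ON reports.
	 */
	vunmap_range(addr, addr + SZ_4K);
}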
Signed-off-by: Ryan Roberts Reviewed-by: Anshuman Khandual --- mm/vmalloc.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 61981ee1c9d2..a7e34e6936d2 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -374,8 +374,10 @@ static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end, if (cleared || pmd_bad(*pmd)) *mask |= PGTBL_PMD_MODIFIED; - if (cleared) + if (cleared) { + WARN_ON(next - addr < PMD_SIZE); continue; + } if (pmd_none_or_clear_bad(pmd)) continue; vunmap_pte_range(pmd, addr, next, mask); @@ -399,8 +401,10 @@ static void vunmap_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end, if (cleared || pud_bad(*pud)) *mask |= PGTBL_PUD_MODIFIED; - if (cleared) + if (cleared) { + WARN_ON(next - addr < PUD_SIZE); continue; + } if (pud_none_or_clear_bad(pud)) continue; vunmap_pmd_range(pud, addr, next, mask); From patchwork Mon Feb 17 14:08:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13977914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C6AAC021AA for ; Mon, 17 Feb 2025 14:08:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 31D14280061; Mon, 17 Feb 2025 09:08:46 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2CDC0280059; Mon, 17 Feb 2025 09:08:46 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 146E9280061; Mon, 17 Feb 2025 09:08:46 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id EA559280059 for ; Mon, 17 Feb 2025 09:08:45 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id A0A61B04EC for ; Mon, 17 Feb 2025 14:08:45 +0000 (UTC) X-FDA: 83129617410.12.F99C290 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf02.hostedemail.com (Postfix) with ESMTP id D80E480004 for ; Mon, 17 Feb 2025 14:08:43 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=none; spf=pass (imf02.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739801324; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MNbungrrFJz+FPgf+VcHoNwc+UNtCM9gplHqa2WAA0I=; b=O7vjVN5nsXzkS4ucC41kQQoxbZsAqbKHJE4gAhm0U8IHf8Q7ui5eN0577+uC0+kvY6/Xbj mMMryjB8e4wBySCECmkop7XaE5Xlt/BsBuG0WZyq2uPg452zEiKzy6vsDtHA//x3hwBeoH u9my/omh8hixvYcNZlKhDlIGbBsPm2I= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=none; spf=pass (imf02.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739801324; a=rsa-sha256; cv=none; 
From: Ryan Roberts To: Catalin Marinas , Will Deacon , Pasha Tatashin , Andrew Morton , Uladzislau Rezki , Christoph Hellwig , David Hildenbrand , "Matthew Wilcox (Oracle)" , Mark Rutland , Anshuman Khandual , Alexandre Ghiti , Kevin Brodsky Cc: Ryan Roberts , linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 09/14] mm/vmalloc: Gracefully unmap huge ptes Date: Mon, 17 Feb 2025 14:08:01 +0000 Message-ID: <20250217140809.1702789-10-ryan.roberts@arm.com> In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com> References: <20250217140809.1702789-1-ryan.roberts@arm.com> MIME-Version: 1.0

Commit f7ee1f13d606 ("mm/vmalloc: enable mapping of huge pages at pte level in vmap") added support for pte-level huge mappings in vmap by reusing the set_huge_pte_at() API, which is otherwise only used for user mappings. But when unmapping those huge ptes, it continued to call ptep_get_and_clear(), which is a layering violation. To date, the only arch to implement this support is powerpc and it all happens to work ok for it. But arm64's implementation of ptep_get_and_clear() cannot be safely used to clear a previous set_huge_pte_at(). So let's introduce a new arch opt-in function, arch_vmap_pte_range_unmap_size(), which can provide the size of a (present) pte. Then we can call huge_ptep_get_and_clear() to tear it down properly.
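In outline (a sketch of the intent rather than the exact final code), the pte-level unmap path then becomes:

/*
 * Sketch: if the arch reports that this present pte actually maps a
 * larger naturally-aligned block (e.g. a contpte block), tear the whole
 * block down via the hugetlb API instead of pte-by-pte.
 */
unsigned long size = arch_vmap_pte_range_unmap_size(addr, pte);
pte_t ptent;

if (size != PAGE_SIZE)
	ptent = huge_ptep_get_and_clear(&init_mm, addr, pte, size);
else
	ptent = ptep_get_and_clear(&init_mm, addr, pte);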
Note that if vunmap_range() is called with a range that starts in the middle of a huge pte-mapped page, we must unmap the entire huge page so the behaviour is consistent with pmd and pud block mappings. In this case emit a warning just like we do for pmd/pud mappings. Reviewed-by: Anshuman Khandual Signed-off-by: Ryan Roberts Reviewed-by: Uladzislau Rezki (Sony) --- include/linux/vmalloc.h | 8 ++++++++ mm/vmalloc.c | 18 ++++++++++++++++-- 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 31e9ffd936e3..16dd4cba64f2 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -113,6 +113,14 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, uns } #endif +#ifndef arch_vmap_pte_range_unmap_size +static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr, + pte_t *ptep) +{ + return PAGE_SIZE; +} +#endif + #ifndef arch_vmap_pte_supported_shift static inline int arch_vmap_pte_supported_shift(unsigned long size) { diff --git a/mm/vmalloc.c b/mm/vmalloc.c index a7e34e6936d2..68950b1824d0 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -350,12 +350,26 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, pgtbl_mod_mask *mask) { pte_t *pte; + pte_t ptent; + unsigned long size = PAGE_SIZE; pte = pte_offset_kernel(pmd, addr); do { - pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte); +#ifdef CONFIG_HUGETLB_PAGE + size = arch_vmap_pte_range_unmap_size(addr, pte); + if (size != PAGE_SIZE) { + if (WARN_ON(!IS_ALIGNED(addr, size))) { + addr = ALIGN_DOWN(addr, size); + pte = PTR_ALIGN_DOWN(pte, sizeof(*pte) * (size >> PAGE_SHIFT)); + } + ptent = huge_ptep_get_and_clear(&init_mm, addr, pte, size); + if (WARN_ON(end - addr < size)) + size = end - addr; + } else +#endif + ptent = ptep_get_and_clear(&init_mm, addr, pte); WARN_ON(!pte_none(ptent) && !pte_present(ptent)); - } while (pte++, addr += PAGE_SIZE, addr != end); + } while (pte += (size >> PAGE_SHIFT), addr += size, addr != end); *mask |= PGTBL_PTE_MODIFIED; } From patchwork Mon Feb 17 14:08:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13977915 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B651C021A9 for ; Mon, 17 Feb 2025 14:08:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A99196B00B4; Mon, 17 Feb 2025 09:08:48 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A48D86B00B7; Mon, 17 Feb 2025 09:08:48 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8E84D6B00B6; Mon, 17 Feb 2025 09:08:48 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 675C86B00B2 for ; Mon, 17 Feb 2025 09:08:48 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 2DF09477DA for ; Mon, 17 Feb 2025 14:08:48 +0000 (UTC) X-FDA: 83129617536.04.B1C2456 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf02.hostedemail.com (Postfix) with ESMTP id 604F480005 for ; Mon, 17 Feb 2025 14:08:46 +0000 (UTC) Authentication-Results: 
imf02.hostedemail.com; dkim=none; spf=pass (imf02.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739801326; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YcimCt12M3537/OZTbdudhpXR/S8bHCeIQvJ781xMdc=; b=H5BJQ2q2YW4xZvQ622xKPqrLKtGtQHq1FdqVfwHVWZlTSTIiokhZYJv4snx7SGWoHDQPcL RSySlh/FB+q+sH+5b8XnoRpxh6/b0uDTkiS/tt5l13Q2ZlIwMLlLeiyHIJUTGtsh+u54k0 XMz0zaFndTBoXaWxuSAuRDx1eKxZM6s= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=none; spf=pass (imf02.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739801326; a=rsa-sha256; cv=none; b=Gic9YAIQr01z9SQA8fnNol5czZFHIe09KOUUkcJKnzLRThZEgsYaKBQ/nitBeQ1guHgUAg Ah+lefb1m57udFi703rLflk8a9wqPTEHdOMLvb3d0erUPmyTY1BCdbUmId95yvXMFx0MrY A2nnHg4WcQNvKDcGpAnuwg3rzud/eTE= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id F17D81A2D; Mon, 17 Feb 2025 06:09:04 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.27]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 83BC73F6A8; Mon, 17 Feb 2025 06:08:43 -0800 (PST) From: Ryan Roberts To: Catalin Marinas , Will Deacon , Pasha Tatashin , Andrew Morton , Uladzislau Rezki , Christoph Hellwig , David Hildenbrand , "Matthew Wilcox (Oracle)" , Mark Rutland , Anshuman Khandual , Alexandre Ghiti , Kevin Brodsky Cc: Ryan Roberts , linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 10/14] arm64/mm: Support huge pte-mapped pages in vmap Date: Mon, 17 Feb 2025 14:08:02 +0000 Message-ID: <20250217140809.1702789-11-ryan.roberts@arm.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com> References: <20250217140809.1702789-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 604F480005 X-Stat-Signature: okiwcq8cntjxdrns89xicryzbpj8ucdu X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1739801326-435769 X-HE-Meta: U2FsdGVkX1+L+Jk+UmIwc8dwimrwxTle1Ufmz5/1YB2LSWpqyVxJcAj67qSHg67X1NOOHoV8w558WKLlau9WEGZCRwzs0zaYOnq8p0rWYCI51VS9uwhZzaYR4Bhycb4AyGhbXmPQb6KBkD7PU8uVc5LywoOCYC1yUAo5dkPmbE0PGmnYBfUz5u6qZn5WCtk7DCTWzZsDwebgsu5AlNt5BBMw1LvI9O2IKJJ4N71AoPJzcZCONpiCee4WfOJTdQMAkO0IKZOclj5GTi65W1yTsQ278HINP1HlTB2Yh7J60m6hNziDCyQP80v+vuBjnWt+LaCoNqA+fFS9cKROcy8Z/vZOXGLy1t/IwCJ1A2MMi1tQj+wjCTfgEm5sm7lJtPgeL0w799TWZ+X93yab3aCzax2TcQ+OL/1hScqW0jbo7eiBv0RP/3gAXX4xJT4UNKJ8UnMKubpUtJm3hGeClhJdm8YA6m0UEpKDY6xmoqabyrFcTmlyOmTNiY5ObDGD+p7T5Tuxw13RBB02v7aQg08TERSmzIgB2JX3yKa/sCd59raxZ+VYzt9UgVv5/hU6bCdrU3omC6cShqwOTZitwHy6t5JfVzaaMc5pHIYnMNCo10kqT8jCF6+XHlL0ZLbF8g7DV4XD6dC7TF1vey3RyfuiYyWfTAs464c2KVaYd9a6K8ZHr/n6sDUlynueuX+yq4yd8ebV1bPz/6W+VKJn6xXXuNyo0ascOZbZqkuQ+zdy1aQC/p6LK0HL6M9X1z2dLzcKEullGLMIdN3igSikoF2rfFDKv35yJQHoHWTZ2aIN8URjpm9avAiop/bqDTYKwYW/Wez1P5oGculiplkv0R0V/ZNPTKxos9Bj+Z0IX45OV1VamdtVhdV1pDcgdBnijWrHYdcPG2Vu+LOt30rPBjbqHQ5esJb/rSDUZbh961WSfIyz/hgDEznnFE3XOHhgtI4dd7c3Klc+84Ap4wfwKEb IpyOCs+f 
KYNy/j7z9aRLwCXVY9Xmj5LCR9qrCgvloTYY6GVWDVBfIwvvCOmkDJuBGJq58Lr1LJmrq7a536XxN7X66HNcE5kizBX0vyEibx333UcONvT+5CQ7q7ikqftPqnDOl9xOQMZvzqIFZ5iRBp+fioKMZmztGZQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Implement the required arch functions to enable use of contpte in the vmap when VM_ALLOW_HUGE_VMAP is specified. This speeds up vmap operations due to only having to issue a DSB and ISB per contpte block instead of per pte. But it also means that the TLB pressure reduces due to only needing a single TLB entry for the whole contpte block. Since vmap uses set_huge_pte_at() to set the contpte, that API is now used for kernel mappings for the first time. Although in the vmap case we never expect it to be called to modify a valid mapping so clear_flush() should never be called, it's still wise to make it robust for the kernel case, so amend the tlb flush function if the mm is for kernel space. Tested with vmalloc performance selftests: # kself/mm/test_vmalloc.sh \ run_test_mask=1 test_repeat_count=5 nr_pages=256 test_loop_count=100000 use_huge=1 Duration reduced from 1274243 usec to 1083553 usec on Apple M2 for 15% reduction in time taken. Reviewed-by: Anshuman Khandual Signed-off-by: Ryan Roberts --- arch/arm64/include/asm/vmalloc.h | 46 ++++++++++++++++++++++++++++++++ arch/arm64/mm/hugetlbpage.c | 5 +++- 2 files changed, 50 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h index 38fafffe699f..40ebc664190b 100644 --- a/arch/arm64/include/asm/vmalloc.h +++ b/arch/arm64/include/asm/vmalloc.h @@ -23,6 +23,52 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot) return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS); } +#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size +static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, + unsigned long end, u64 pfn, + unsigned int max_page_shift) +{ + /* + * If the block is at least CONT_PTE_SIZE in size, and is naturally + * aligned in both virtual and physical space, then we can pte-map the + * block using the PTE_CONT bit for more efficient use of the TLB. + */ + + if (max_page_shift < CONT_PTE_SHIFT) + return PAGE_SIZE; + + if (end - addr < CONT_PTE_SIZE) + return PAGE_SIZE; + + if (!IS_ALIGNED(addr, CONT_PTE_SIZE)) + return PAGE_SIZE; + + if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE)) + return PAGE_SIZE; + + return CONT_PTE_SIZE; +} + +#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size +static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr, + pte_t *ptep) +{ + /* + * The caller handles alignment so it's sufficient just to check + * PTE_CONT. + */ + return pte_valid_cont(__ptep_get(ptep)) ? 
CONT_PTE_SIZE : PAGE_SIZE; +} + +#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift +static inline int arch_vmap_pte_supported_shift(unsigned long size) +{ + if (size >= CONT_PTE_SIZE) + return CONT_PTE_SHIFT; + + return PAGE_SHIFT; +} + #endif #define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 8ac86cd180b3..a29f347fea54 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -217,7 +217,10 @@ static void clear_flush(struct mm_struct *mm, for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) ptep_get_and_clear_anysz(mm, ptep, pgsize); - __flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true); + if (mm == &init_mm) + flush_tlb_kernel_range(saddr, addr); + else + __flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true); } void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, From patchwork Mon Feb 17 14:08:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13977916 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE21EC021AA for ; Mon, 17 Feb 2025 14:08:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 320CD6B00B6; Mon, 17 Feb 2025 09:08:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 28381280059; Mon, 17 Feb 2025 09:08:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0AB4C6B00B8; Mon, 17 Feb 2025 09:08:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id DDA216B00B6 for ; Mon, 17 Feb 2025 09:08:50 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 950601203EA for ; Mon, 17 Feb 2025 14:08:50 +0000 (UTC) X-FDA: 83129617620.22.A1A3DAA Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf20.hostedemail.com (Postfix) with ESMTP id 08B591C0011 for ; Mon, 17 Feb 2025 14:08:48 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf20.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739801329; a=rsa-sha256; cv=none; b=KizdGRvCmpzxxi/jkH7U1jKtvdWmOYf8fi30D/MQN5c40em+FHJtAHAln8w/LfXX/PZdB4 N2F3s5+1LUc6z7pO1GDVj1NOLi7X1/IITPlTKhFWd7SlFHtRjKMEwpJXLa5bljsiaLxTgw DO5TM1Ea+yFPI5u4dGprEngqSE9rkjE= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf20.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739801329; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uaCYE6J03jFfScCeArmGchXyS88UX/fmWKyf4HvQMQw=; 
From: Ryan Roberts To: Catalin Marinas , Will Deacon , Pasha Tatashin , Andrew Morton , Uladzislau Rezki , Christoph Hellwig , David Hildenbrand , "Matthew Wilcox (Oracle)" , Mark Rutland , Anshuman Khandual , Alexandre Ghiti , Kevin Brodsky Cc: Ryan Roberts , linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 11/14] mm/vmalloc: Batch arch_sync_kernel_mappings() more efficiently Date: Mon, 17 Feb 2025 14:08:03 +0000 Message-ID: <20250217140809.1702789-12-ryan.roberts@arm.com> In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com> References: <20250217140809.1702789-1-ryan.roberts@arm.com> MIME-Version: 1.0

When page_shift is greater than PAGE_SHIFT, __vmap_pages_range_noflush() calls vmap_range_noflush() once for each individual huge page. But vmap_range_noflush() previously called arch_sync_kernel_mappings() directly, so that hook would end up being called for every huge page. We can do better than this; refactor the call into the outer __vmap_pages_range_noflush() so that it is only called once for the entire batch operation. This will benefit performance on arm64, which is about to opt in to using the hook.
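As a rough sketch of the refactor (names as in the diff below): each vmap_range_noflush() call now only accumulates into a pgtbl_mod_mask, and the sync is issued once for the whole batch rather than once per huge page:

pgtbl_mod_mask mask = 0;
unsigned long start = addr;
int err = 0;

for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
	err = vmap_range_noflush(addr, addr + (1UL << page_shift),
				 page_to_phys(pages[i]), prot,
				 page_shift, &mask);	/* accumulate only */
	if (err)
		break;
	addr += 1UL << page_shift;
}

if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
	arch_sync_kernel_mappings(start, end);		/* at most once */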
Reviewed-by: Anshuman Khandual Signed-off-by: Ryan Roberts --- mm/vmalloc.c | 60 ++++++++++++++++++++++++++-------------------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 68950b1824d0..50fd44439875 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -285,40 +285,38 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end, static int vmap_range_noflush(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot, - unsigned int max_page_shift) + unsigned int max_page_shift, pgtbl_mod_mask *mask) { pgd_t *pgd; - unsigned long start; unsigned long next; int err; - pgtbl_mod_mask mask = 0; might_sleep(); BUG_ON(addr >= end); - start = addr; pgd = pgd_offset_k(addr); do { next = pgd_addr_end(addr, end); err = vmap_p4d_range(pgd, addr, next, phys_addr, prot, - max_page_shift, &mask); + max_page_shift, mask); if (err) break; } while (pgd++, phys_addr += (next - addr), addr = next, addr != end); - if (mask & ARCH_PAGE_TABLE_SYNC_MASK) - arch_sync_kernel_mappings(start, end); - return err; } int vmap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot) { + pgtbl_mod_mask mask = 0; int err; err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot), - ioremap_max_page_shift); + ioremap_max_page_shift, &mask); + if (mask & ARCH_PAGE_TABLE_SYNC_MASK) + arch_sync_kernel_mappings(addr, end); + flush_cache_vmap(addr, end); if (!err) err = kmsan_ioremap_page_range(addr, end, phys_addr, prot, @@ -587,29 +585,24 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr, } static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end, - pgprot_t prot, struct page **pages) + pgprot_t prot, struct page **pages, pgtbl_mod_mask *mask) { - unsigned long start = addr; pgd_t *pgd; unsigned long next; int err = 0; int nr = 0; - pgtbl_mod_mask mask = 0; BUG_ON(addr >= end); pgd = pgd_offset_k(addr); do { next = pgd_addr_end(addr, end); if (pgd_bad(*pgd)) - mask |= PGTBL_PGD_MODIFIED; - err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask); + *mask |= PGTBL_PGD_MODIFIED; + err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, mask); if (err) break; } while (pgd++, addr = next, addr != end); - if (mask & ARCH_PAGE_TABLE_SYNC_MASK) - arch_sync_kernel_mappings(start, end); - return err; } @@ -626,26 +619,33 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, unsigned int page_shift) { unsigned int i, nr = (end - addr) >> PAGE_SHIFT; + unsigned long start = addr; + pgtbl_mod_mask mask = 0; + int err = 0; WARN_ON(page_shift < PAGE_SHIFT); if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) || - page_shift == PAGE_SHIFT) - return vmap_small_pages_range_noflush(addr, end, prot, pages); - - for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) { - int err; - - err = vmap_range_noflush(addr, addr + (1UL << page_shift), - page_to_phys(pages[i]), prot, - page_shift); - if (err) - return err; + page_shift == PAGE_SHIFT) { + err = vmap_small_pages_range_noflush(addr, end, prot, pages, + &mask); + } else { + for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) { + err = vmap_range_noflush(addr, + addr + (1UL << page_shift), + page_to_phys(pages[i]), prot, + page_shift, &mask); + if (err) + break; - addr += 1UL << page_shift; + addr += 1UL << page_shift; + } } - return 0; + if (mask & ARCH_PAGE_TABLE_SYNC_MASK) + arch_sync_kernel_mappings(start, end); + + return err; } int 
vmap_pages_range_noflush(unsigned long addr, unsigned long end, From patchwork Mon Feb 17 14:08:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13977917 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5E56C021A9 for ; Mon, 17 Feb 2025 14:08:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E91836B00B8; Mon, 17 Feb 2025 09:08:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E414F280059; Mon, 17 Feb 2025 09:08:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C1D336B00BC; Mon, 17 Feb 2025 09:08:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id A2A196B00B8 for ; Mon, 17 Feb 2025 09:08:53 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 455E41A049B for ; Mon, 17 Feb 2025 14:08:53 +0000 (UTC) X-FDA: 83129617746.02.802E3FF Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf16.hostedemail.com (Postfix) with ESMTP id 8B953180006 for ; Mon, 17 Feb 2025 14:08:51 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=none; spf=pass (imf16.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739801331; a=rsa-sha256; cv=none; b=jehZpuwrhXmL6ioVnTOvv+RsNbhC/X6roQgsh5ZHoCPOugFn8c5GJf/cRdzFLGBzqson9b M9ilxtY83oi4Z2ZKzwD3HrF/bZNvsPyqqvkhY/LDCcN612B1o21lciYol6yVocMgj7zZod WxbgWtCYJIKc02RdXYeyGtmA916rKGg= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=none; spf=pass (imf16.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739801331; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zw4vZu0X095hQTpAYGpfAGxwgGtSu1Wp4rJvqtPB/Ts=; b=jycMFSAqWTTUDLi/hMxYbvohadix0qPNo9vaZlTHJ7qGnOuN7bJfjuDNxyVV4L8dlBwOno a+mHhe+0fcVPBvOzxy4JyY1zbWAPTwel6KUin66QV/F5qVk941rEE11jJ6GvMsmZLVjEsP xd+a2f1jACraw1toyVmNMf9h+ZPnMPE= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2F5881A2D; Mon, 17 Feb 2025 06:09:10 -0800 (PST) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.27]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 96F693F6A8; Mon, 17 Feb 2025 06:08:48 -0800 (PST) From: Ryan Roberts To: Catalin Marinas , Will Deacon , Pasha Tatashin , Andrew Morton , Uladzislau Rezki , Christoph Hellwig , David Hildenbrand , "Matthew Wilcox (Oracle)" , Mark Rutland , Anshuman Khandual , Alexandre Ghiti , Kevin Brodsky Cc: Ryan Roberts , linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, 
linux-kernel@vger.kernel.org Subject: [PATCH v2 12/14] mm: Generalize arch_sync_kernel_mappings() Date: Mon, 17 Feb 2025 14:08:04 +0000 Message-ID: <20250217140809.1702789-13-ryan.roberts@arm.com> In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com> References: <20250217140809.1702789-1-ryan.roberts@arm.com> MIME-Version: 1.0

arch_sync_kernel_mappings() is an optional hook for arches to allow them to synchronize certain levels of the kernel pgtables after modification. But arm64 could benefit from a hook similar to this, paired with a call prior to starting the batch of modifications. So let's introduce arch_update_kernel_mappings_begin() and arch_update_kernel_mappings_end(). Both have a default implementation which can be overridden by the arch code. The default for the former is a nop, and the default for the latter is to call arch_sync_kernel_mappings(), so the latter replaces previous arch_sync_kernel_mappings() callsites. So by default, the resulting behaviour is unchanged. To avoid include hell, the pgtbl_mod_mask type and its associated macros are moved to their own header. In a future patch, arm64 will opt in to overriding both functions.
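To make the opt-in mechanism concrete, here is a hedged sketch of how an architecture header might override both hooks; the my_arch_*() helpers are invented purely for illustration and are not part of this series:

/* Hypothetical <asm/vmalloc.h> excerpt; illustrative only. */

#define arch_update_kernel_mappings_begin arch_update_kernel_mappings_begin
static inline void arch_update_kernel_mappings_begin(unsigned long start,
						     unsigned long end)
{
	my_arch_enter_lazy_kernel_mapping_mode();	/* made-up helper */
}

#define arch_update_kernel_mappings_end arch_update_kernel_mappings_end
static inline void arch_update_kernel_mappings_end(unsigned long start,
						    unsigned long end,
						    pgtbl_mod_mask mask)
{
	my_arch_exit_lazy_kernel_mapping_mode();	/* made-up helper */

	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
		arch_sync_kernel_mappings(start, end);
}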
Signed-off-by: Ryan Roberts --- include/linux/pgtable.h | 24 +---------------- include/linux/pgtable_modmask.h | 32 ++++++++++++++++++++++ include/linux/vmalloc.h | 47 +++++++++++++++++++++++++++++++++ mm/memory.c | 5 ++-- mm/vmalloc.c | 15 ++++++----- 5 files changed, 92 insertions(+), 31 deletions(-) create mode 100644 include/linux/pgtable_modmask.h diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 94d267d02372..7f70786a73b3 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -4,6 +4,7 @@ #include #include +#include #define PMD_ORDER (PMD_SHIFT - PAGE_SHIFT) #define PUD_ORDER (PUD_SHIFT - PAGE_SHIFT) @@ -1786,29 +1787,6 @@ static inline bool arch_has_pfn_modify_check(void) # define PAGE_KERNEL_EXEC PAGE_KERNEL #endif -/* - * Page Table Modification bits for pgtbl_mod_mask. - * - * These are used by the p?d_alloc_track*() set of functions an in the generic - * vmalloc/ioremap code to track at which page-table levels entries have been - * modified. Based on that the code can better decide when vmalloc and ioremap - * mapping changes need to be synchronized to other page-tables in the system. - */ -#define __PGTBL_PGD_MODIFIED 0 -#define __PGTBL_P4D_MODIFIED 1 -#define __PGTBL_PUD_MODIFIED 2 -#define __PGTBL_PMD_MODIFIED 3 -#define __PGTBL_PTE_MODIFIED 4 - -#define PGTBL_PGD_MODIFIED BIT(__PGTBL_PGD_MODIFIED) -#define PGTBL_P4D_MODIFIED BIT(__PGTBL_P4D_MODIFIED) -#define PGTBL_PUD_MODIFIED BIT(__PGTBL_PUD_MODIFIED) -#define PGTBL_PMD_MODIFIED BIT(__PGTBL_PMD_MODIFIED) -#define PGTBL_PTE_MODIFIED BIT(__PGTBL_PTE_MODIFIED) - -/* Page-Table Modification Mask */ -typedef unsigned int pgtbl_mod_mask; - #endif /* !__ASSEMBLY__ */ #if !defined(MAX_POSSIBLE_PHYSMEM_BITS) && !defined(CONFIG_64BIT) diff --git a/include/linux/pgtable_modmask.h b/include/linux/pgtable_modmask.h new file mode 100644 index 000000000000..5a21b1bb8df3 --- /dev/null +++ b/include/linux/pgtable_modmask.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_PGTABLE_MODMASK_H +#define _LINUX_PGTABLE_MODMASK_H + +#ifndef __ASSEMBLY__ + +/* + * Page Table Modification bits for pgtbl_mod_mask. + * + * These are used by the p?d_alloc_track*() set of functions an in the generic + * vmalloc/ioremap code to track at which page-table levels entries have been + * modified. Based on that the code can better decide when vmalloc and ioremap + * mapping changes need to be synchronized to other page-tables in the system. 
+ */ +#define __PGTBL_PGD_MODIFIED 0 +#define __PGTBL_P4D_MODIFIED 1 +#define __PGTBL_PUD_MODIFIED 2 +#define __PGTBL_PMD_MODIFIED 3 +#define __PGTBL_PTE_MODIFIED 4 + +#define PGTBL_PGD_MODIFIED BIT(__PGTBL_PGD_MODIFIED) +#define PGTBL_P4D_MODIFIED BIT(__PGTBL_P4D_MODIFIED) +#define PGTBL_PUD_MODIFIED BIT(__PGTBL_PUD_MODIFIED) +#define PGTBL_PMD_MODIFIED BIT(__PGTBL_PMD_MODIFIED) +#define PGTBL_PTE_MODIFIED BIT(__PGTBL_PTE_MODIFIED) + +/* Page-Table Modification Mask */ +typedef unsigned int pgtbl_mod_mask; + +#endif /* !__ASSEMBLY__ */ + +#endif /* _LINUX_PGTABLE_MODMASK_H */ diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 16dd4cba64f2..cb5d8f1965a1 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -11,6 +11,7 @@ #include /* pgprot_t */ #include #include +#include #include @@ -213,6 +214,26 @@ extern int remap_vmalloc_range(struct vm_area_struct *vma, void *addr, int vmap_pages_range(unsigned long addr, unsigned long end, pgprot_t prot, struct page **pages, unsigned int page_shift); +#ifndef arch_update_kernel_mappings_begin +/** + * arch_update_kernel_mappings_begin - A batch of kernel pgtable mappings are + * about to be updated. + * @start: Virtual address of start of range to be updated. + * @end: Virtual address of end of range to be updated. + * + * An optional hook to allow architecture code to prepare for a batch of kernel + * pgtable mapping updates. An architecture may use this to enter a lazy mode + * where some operations can be deferred until the end of the batch. + * + * Context: Called in task context and may be preemptible. + */ +static inline void arch_update_kernel_mappings_begin(unsigned long start, + unsigned long end) +{ +} +#endif + +#ifndef arch_update_kernel_mappings_end /* * Architectures can set this mask to a combination of PGTBL_P?D_MODIFIED values * and let generic vmalloc and ioremap code know when arch_sync_kernel_mappings() @@ -229,6 +250,32 @@ int vmap_pages_range(unsigned long addr, unsigned long end, pgprot_t prot, */ void arch_sync_kernel_mappings(unsigned long start, unsigned long end); +/** + * arch_update_kernel_mappings_end - A batch of kernel pgtable mappings have + * been updated. + * @start: Virtual address of start of range that was updated. + * @end: Virtual address of end of range that was updated. + * + * An optional hook to inform architecture code that a batch update is complete. + * This balances a previous call to arch_update_kernel_mappings_begin(). + * + * An architecture may override this for any purpose, such as exiting a lazy + * mode previously entered with arch_update_kernel_mappings_begin() or syncing + * kernel mappings to a secondary pgtable. The default implementation calls an + * arch-provided arch_sync_kernel_mappings() if any arch-defined pgtable level + * was updated. + * + * Context: Called in task context and may be preemptible. + */ +static inline void arch_update_kernel_mappings_end(unsigned long start, + unsigned long end, + pgtbl_mod_mask mask) +{ + if (mask & ARCH_PAGE_TABLE_SYNC_MASK) + arch_sync_kernel_mappings(start, end); +} +#endif + /* * Lowlevel-APIs (not for driver use!) 
 /*
  * Lowlevel-APIs (not for driver use!)
  */
diff --git a/mm/memory.c b/mm/memory.c
index a15f7dd500ea..f80930bc19f6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3035,6 +3035,8 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 	if (WARN_ON(addr >= end))
 		return -EINVAL;
 
+	arch_update_kernel_mappings_begin(start, end);
+
 	pgd = pgd_offset(mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
@@ -3055,8 +3057,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			break;
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, start + size);
+	arch_update_kernel_mappings_end(start, end, mask);
 
 	return err;
 }
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 50fd44439875..c5c51d86ef78 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -312,10 +312,10 @@ int vmap_page_range(unsigned long addr, unsigned long end,
 	pgtbl_mod_mask mask = 0;
 	int err;
 
+	arch_update_kernel_mappings_begin(addr, end);
 	err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot),
 				 ioremap_max_page_shift, &mask);
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(addr, end);
+	arch_update_kernel_mappings_end(addr, end, mask);
 	flush_cache_vmap(addr, end);
 
 	if (!err)
@@ -463,6 +463,9 @@ void __vunmap_range_noflush(unsigned long start, unsigned long end)
 	pgtbl_mod_mask mask = 0;
 
 	BUG_ON(addr >= end);
+
+	arch_update_kernel_mappings_begin(start, end);
+
 	pgd = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
@@ -473,8 +476,7 @@ void __vunmap_range_noflush(unsigned long start, unsigned long end)
 			vunmap_p4d_range(pgd, addr, next, &mask);
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, end);
+	arch_update_kernel_mappings_end(start, end, mask);
 }
 
 void vunmap_range_noflush(unsigned long start, unsigned long end)
@@ -625,6 +627,8 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 
 	WARN_ON(page_shift < PAGE_SHIFT);
 
+	arch_update_kernel_mappings_begin(start, end);
+
 	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) ||
 			page_shift == PAGE_SHIFT) {
 		err = vmap_small_pages_range_noflush(addr, end, prot, pages,
@@ -642,8 +646,7 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 		}
 	}
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, end);
+	arch_update_kernel_mappings_end(start, end, mask);
 
 	return err;
 }

From patchwork Mon Feb 17 14:08:05 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13977918
From: Ryan Roberts
Subject: [PATCH v2 13/14] mm: Only call arch_update_kernel_mappings_[begin|end]() for kernel mappings
Date: Mon, 17 Feb 2025 14:08:05 +0000
Message-ID: <20250217140809.1702789-14-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>

arch_update_kernel_mappings_[begin|end]() is called from
__apply_to_page_range(), which operates on both kernel and user mappings, so
previously the hooks were called unconditionally for both. The existing arch
implementations of arch_sync_kernel_mappings() (which the default
implementation of arch_update_kernel_mappings_end() calls) filter on kernel
address ranges, so this change remains correct for those users. But given that
"kernel_mappings" is in the function name, we really shouldn't be calling the
hooks for user mappings; only call them when operating on init_mm. This change
will also make the upcoming arm64 implementation simpler.
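For context, a minimal sketch (not from this series) of the kind of caller affected here: apply_to_page_range() serves both user mms and init_mm, and only the init_mm case should reach the kernel-mapping hooks. The callback and function names below (set_page_attr_cb, change_kernel_range_prot) are hypothetical; the actual change follows in the diff below.

#include <linux/mm.h>
#include <linux/pgtable.h>

/* Hypothetical callback: invoked for each pte in the range. */
static int set_page_attr_cb(pte_t *ptep, unsigned long addr, void *data)
{
	pgprot_t prot = *(pgprot_t *)data;

	set_pte_at(&init_mm, addr, ptep,
		   pfn_pte(pte_pfn(ptep_get(ptep)), prot));
	return 0;
}

/*
 * Kernel-address caller: __apply_to_page_range() runs with mm == &init_mm,
 * so the new hooks fire. A user-mm caller would pass vma->vm_mm instead,
 * and with this patch the hooks are then skipped.
 */
static int change_kernel_range_prot(unsigned long addr, unsigned long size,
				    pgprot_t prot)
{
	return apply_to_page_range(&init_mm, addr, size, set_page_attr_cb,
				   &prot);
}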
Signed-off-by: Ryan Roberts
---
 mm/memory.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index f80930bc19f6..4e299d254a11 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3035,7 +3035,8 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 	if (WARN_ON(addr >= end))
 		return -EINVAL;
 
-	arch_update_kernel_mappings_begin(start, end);
+	if (mm == &init_mm)
+		arch_update_kernel_mappings_begin(start, end);
 
 	pgd = pgd_offset(mm, addr);
 	do {
@@ -3057,7 +3058,8 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			break;
 	} while (pgd++, addr = next, addr != end);
 
-	arch_update_kernel_mappings_end(start, end, mask);
+	if (mm == &init_mm)
+		arch_update_kernel_mappings_end(start, end, mask);
 
 	return err;
 }

From patchwork Mon Feb 17 14:08:06 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13977919
From: Ryan Roberts
Subject: [PATCH v2 14/14] arm64/mm: Batch barriers when updating kernel mappings
Date: Mon, 17 Feb 2025 14:08:06 +0000
Message-ID: <20250217140809.1702789-15-ryan.roberts@arm.com>
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>

Because the kernel can't tolerate page faults for kernel mappings, arm64
currently emits a dsb(ishst) whenever a valid, kernel-space pte (or
pmd/pud/p4d/pgd) is set, to ensure that the store to the pgtable is observed
by the table walker immediately. It additionally emits an isb() to ensure that
any already speculatively determined invalid mapping fault gets canceled.

We can improve the performance of vmalloc operations by batching these
barriers until the end of a set of entry updates. The newly added
arch_update_kernel_mappings_begin() / arch_update_kernel_mappings_end() hooks
provide the required notifications; vmalloc improves by up to 30% as a result.

A new TIF_ flag, TIF_KMAP_UPDATE_ACTIVE, tells us whether we are in batch mode
and can therefore defer any barriers until the end of the batch.
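As an illustration of the claim above (editorial sketch, not part of the patch; callers and counts are simplified), the barrier cost per batched update changes roughly as follows. The real implementation is in the diff below.

/*
 * Illustrative only. Before this patch, mapping N pages via vmalloc cost
 * roughly N dsb(ishst)+isb() pairs, one per pte store:
 *
 *	for each page:
 *		__set_pte(ptep, pte);	// each call: dsb(ishst); isb();
 *
 * After this patch, the barriers are deferred while TIF_KMAP_UPDATE_ACTIVE
 * is set and emitted once per batch:
 *
 *	arch_update_kernel_mappings_begin(start, end);	// set TIF flag
 *	for each page:
 *		__set_pte(ptep, pte);	// queue_pte_barriers(): no-op in batch
 *	arch_update_kernel_mappings_end(start, end, mask); // clear flag, then
 *							    // dsb(ishst); isb();
 */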
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h     | 73 ++++++++++++++++++++--------
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/kernel/process.c          |  9 ++--
 3 files changed, 59 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 51128c2956f8..f8866dbdfde7 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,6 +39,49 @@
 #include
 #include
 #include
+#include
+
+static inline void emit_pte_barriers(void)
+{
+	/*
+	 * These barriers are emitted under certain conditions after a pte entry
+	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
+	 * visible to the table walker. The isb ensures that any previous
+	 * speculative "invalid translation" marker that is in the CPU's
+	 * pipeline gets cleared, so that any access to that address after
+	 * setting the pte to valid won't cause a spurious fault. If the thread
+	 * gets preempted after storing to the pgtable but before emitting these
+	 * barriers, __switch_to() emits a dsb which ensures the walker gets to
+	 * see the store. There is no guarantee of an isb being issued though.
+	 * This is safe because it will still get issued (albeit on a
+	 * potentially different CPU) when the thread starts running again,
+	 * before any access to the address.
+	 */
+	dsb(ishst);
+	isb();
+}
+
+static inline void queue_pte_barriers(void)
+{
+	if (!test_thread_flag(TIF_KMAP_UPDATE_ACTIVE))
+		emit_pte_barriers();
+}
+
+#define arch_update_kernel_mappings_begin arch_update_kernel_mappings_begin
+static inline void arch_update_kernel_mappings_begin(unsigned long start,
+						     unsigned long end)
+{
+	set_thread_flag(TIF_KMAP_UPDATE_ACTIVE);
+}
+
+#define arch_update_kernel_mappings_end arch_update_kernel_mappings_end
+static inline void arch_update_kernel_mappings_end(unsigned long start,
+						   unsigned long end,
+						   pgtbl_mod_mask mask)
+{
+	clear_thread_flag(TIF_KMAP_UPDATE_ACTIVE);
+	emit_pte_barriers();
+}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -323,10 +366,8 @@ static inline void __set_pte_complete(pte_t pte)
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
	 */
-	if (pte_valid_not_user(pte)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pte_valid_not_user(pte))
+		queue_pte_barriers();
 }
 
 static inline void __set_pte(pte_t *ptep, pte_t pte)
@@ -791,10 +832,8 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid_not_user(pmd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pmd_valid_not_user(pmd))
+		queue_pte_barriers();
 }
 
 static inline void pmd_clear(pmd_t *pmdp)
@@ -859,10 +898,8 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid_not_user(pud)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pud_valid_not_user(pud))
+		queue_pte_barriers();
 }
 
 static inline void pud_clear(pud_t *pudp)
@@ -941,10 +978,8 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 
 	WRITE_ONCE(*p4dp, p4d);
 
-	if (p4d_valid_not_user(p4d)) {
-		dsb(ishst);
-		isb();
-	}
+	if (p4d_valid_not_user(p4d))
+		queue_pte_barriers();
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1072,10 +1107,8 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 
 	WRITE_ONCE(*pgdp, pgd);
 
-	if (pgd_valid_not_user(pgd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pgd_valid_not_user(pgd))
+		queue_pte_barriers();
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 1114c1c3300a..3856e0759cc3 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -82,6 +82,7 @@ void arch_setup_new_exec(void);
 #define TIF_SME_VL_INHERIT	28	/* Inherit SME vl_onexec across exec */
 #define TIF_KERNEL_FPSTATE	29	/* Task is in a kernel mode FPSIMD section */
 #define TIF_TSC_SIGSEGV		30	/* SIGSEGV on counter-timer access */
+#define TIF_KMAP_UPDATE_ACTIVE	31	/* kernel map update in progress */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 42faebb7b712..45a55fe81788 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -680,10 +680,11 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	gcs_thread_switch(next);
 
 	/*
-	 * Complete any pending TLB or cache maintenance on this CPU in case
-	 * the thread migrates to a different CPU.
-	 * This full barrier is also required by the membarrier system
-	 * call.
+	 * Complete any pending TLB or cache maintenance on this CPU in case the
+	 * thread migrates to a different CPU. This full barrier is also
+	 * required by the membarrier system call. Additionally it makes any
+	 * in-progress pgtable writes visible to the table walker; see
+	 * emit_pte_barriers().
 	 */
 	dsb(ish);
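To summarise the preemption argument made in emit_pte_barriers() and __switch_to() above (editorial sketch restating the patch's own comments, not code from the patch):

/*
 * Illustrative timeline: why being preempted mid-batch is safe.
 *
 *	task T on CPU0:
 *		arch_update_kernel_mappings_begin()	// set TIF_KMAP_UPDATE_ACTIVE
 *		__set_pte(...)				// barriers deferred
 *		... preempted ...
 *		__switch_to()				// dsb(ish): the pgtable store is
 *							// now visible to the table walker
 *
 *	task T later, possibly on CPU1:
 *		arch_update_kernel_mappings_end()	// clear flag, emit_pte_barriers()
 *							// (dsb + isb) before any access
 *							// to the newly mapped address
 */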