From patchwork Mon Feb 17 14:07:58 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13977911
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton,
 Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
 "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
 Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 06/14] arm64/mm: Hoist barriers out of set_ptes_anysz() loop
Date: Mon, 17 Feb 2025 14:07:58 +0000
Message-ID: <20250217140809.1702789-7-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>

set_ptes_anysz() previously called __set_pte() for each PTE in the
range, which would conditionally issue a DSB and ISB to make the new PTE
value immediately visible to the table walker if the new PTE was valid
and for kernel space.

We can do better than this; let's hoist those barriers out of the loop
so that they are only issued once, after the loop has written every PTE.
This reduces the barrier cost by a factor equal to the number of PTEs in
the range.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e255a36380dc..e4b1946b261f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -317,10 +317,8 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 	WRITE_ONCE(*ptep, pte);
 }
 
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte_complete(pte_t pte)
 {
-	__set_pte_nosync(ptep, pte);
-
 	/*
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
@@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
 	}
 }
 
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+	__set_pte_nosync(ptep, pte);
+	__set_pte_complete(pte);
+}
+
 static inline pte_t __ptep_get(pte_t *ptep)
 {
 	return READ_ONCE(*ptep);
@@ -647,12 +651,14 @@ static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 
 	for (;;) {
 		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
+		__set_pte_nosync(ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
 		pte = pte_advance_pfn(pte, stride);
 	}
+
+	__set_pte_complete(pte);
 }
 
 static inline void __set_ptes(struct mm_struct *mm,
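
The effect of the hoist can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name, the flag bits, and the PFN shift are hypothetical stand-ins, and a counter replaces the real DSB/ISB sequence. It compares a per-entry variant (the shape before this patch) with a hoisted variant (the shape after it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
typedef uint64_t model_pte_t;

#define MODEL_PTE_VALID   (1ULL << 0)  /* assumed "valid" bit */
#define MODEL_PTE_KERNEL  (1ULL << 1)  /* assumed stand-in for "kernel mapping" */
#define MODEL_PFN_SHIFT   12           /* assumed position of the PFN field */

/* Counts how many times the (modelled) barrier sequence is issued. */
static unsigned long model_barrier_count;

static void model_barrier(void)
{
	model_barrier_count++;  /* stands in for the DSB + ISB pair */
}

/* Plain store, no barrier: models __set_pte_nosync(). */
static void model_set_pte_nosync(model_pte_t *ptep, model_pte_t pte)
{
	*ptep = pte;
}

/* Conditional barrier: models __set_pte_complete(). */
static void model_set_pte_complete(model_pte_t pte)
{
	if ((pte & MODEL_PTE_VALID) && (pte & MODEL_PTE_KERNEL))
		model_barrier();
}

/* Shape before the patch: store and (conditional) barrier per entry. */
static void model_set_ptes_per_entry(model_pte_t *ptep, model_pte_t pte,
				     size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		model_set_pte_nosync(&ptep[i], pte);
		model_set_pte_complete(pte);
		pte += 1ULL << MODEL_PFN_SHIFT;  /* advance to the next PFN */
	}
}

/* Shape after the patch: stores in the loop, one barrier at the end. */
static void model_set_ptes_hoisted(model_pte_t *ptep, model_pte_t pte,
				   size_t nr)
{
	assert(nr > 0);  /* assumption: callers always pass at least one entry */

	for (;;) {
		model_set_pte_nosync(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += 1ULL << MODEL_PFN_SHIFT;
	}

	/* Every entry shares the same flag bits, so checking the last one suffices. */
	model_set_pte_complete(pte);
}
```

Calling both variants on an 8-entry array with a valid kernel entry should leave identical array contents, with the per-entry variant bumping `model_barrier_count` 8 times and the hoisted variant bumping it once. The final check uses the last PTE value because, in this model, advancing the PFN never changes the flag bits.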