From patchwork Mon Feb 17 14:07:59 2025
X-Patchwork-Submitter: Ryan Roberts <ryan.roberts@arm.com>
X-Patchwork-Id: 13977912
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton,
 Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
 "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
 Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 07/14] arm64/mm: Avoid barriers for
 invalid or userspace mappings
Date: Mon, 17 Feb 2025 14:07:59 +0000
Message-ID: <20250217140809.1702789-8-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>

__set_pte_complete(), set_pmd(), set_pud(), set_p4d() and set_pgd() are
used to write entries into pgtables. They issue barriers (currently dsb
and isb) to ensure that the written values are observed by the table
walker prior to any program-order-future memory access to the mapped
location.

Over the years some of these functions have received optimizations. In
particular, commit 7f0b1bf04511 ("arm64: Fix barriers used for page
table modifications") made set_pte() (now __set_pte_complete()) emit
the barriers only for valid kernel mappings. And commit 0795edaf3f1f
("arm64: pgtable: Implement p[mu]d_valid() and check in set_p[mu]d()")
made set_pmd()/set_pud() emit the barriers only for valid mappings.
set_p4d()/set_pgd() continue to emit the barriers unconditionally.

This is all very confusing to the casual observer; surely the rules
should be invariant to the level? Let's change this so that every level
consistently emits the barriers only when setting valid, non-user
entries (both table and leaf).

It seems obvious that if it is ok to elide barriers for all but valid
kernel mappings at pte level, it must also be ok to do this for leaf
entries at other levels: if setting an entry to invalid, a TLB
maintenance operation must surely follow to synchronise the TLB, and
this contains the required barriers. If setting a valid user mapping,
the previous mapping must have been invalid and there must have been a
TLB maintenance operation (complete with barriers) to honour
break-before-make. So the worst that can happen is we take an extra
fault (which will imply the DSB + ISB) and conclude that there is
nothing to do.
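
Concretely, after this change every level converges on the same shape.
This is a sketch of the pmd case only; the other levels are analogous
and appear in the diff below:

	WRITE_ONCE(*pmdp, pmd);

	if (pmd_valid_not_user(pmd)) {
		dsb(ishst);	/* entry observable by the table walker */
		isb();		/* ...before program-order-future accesses */
	}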
These are the arguments for doing this optimization at pte level, and
they apply equally to leaf mappings at the other levels.

For table entries, the same arguments hold: if unsetting a table entry,
TLB maintenance is required and this will emit the required barriers.
If setting a table entry, the previous value must have been invalid and
the table walker must already be able to observe that. Additionally,
the contents of the pgtable pointed to by the newly set entry must be
visible before the entry is written, and this is enforced via
smp_wmb() (dmb) in the pgtable allocation functions and in
__split_huge_pmd_locked(). But this last part could never have been
enforced by the barriers in set_pXd() because they occur after updating
the entry. So ultimately, the worst that can happen by eliding these
barriers for user table entries is an extra fault.

I observe roughly the same number of page faults (107M) with and
without this change when compiling the kernel on Apple M2.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 34 ++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e4b1946b261f..51128c2956f8 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -767,6 +767,19 @@ static inline bool in_swapper_pgdir(void *addr)
 		((unsigned long)swapper_pg_dir & PAGE_MASK);
 }
 
+static inline bool pmd_valid_not_user(pmd_t pmd)
+{
+	/*
+	 * User-space table entries always have (PXN && !UXN). All other
+	 * combinations indicate it's a table entry for kernel space.
+	 * Valid-not-user leaf entries follow the same rules as
+	 * pte_valid_not_user().
+	 */
+	if (pmd_table(pmd))
+		return !((pmd_val(pmd) & (PMD_TABLE_PXN | PMD_TABLE_UXN)) == PMD_TABLE_PXN);
+	return pte_valid_not_user(pmd_pte(pmd));
+}
+
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 #ifdef __PAGETABLE_PMD_FOLDED
@@ -778,7 +791,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid(pmd)) {
+	if (pmd_valid_not_user(pmd)) {
 		dsb(ishst);
 		isb();
 	}
@@ -833,6 +846,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define pud_valid(pud)		pte_valid(pud_pte(pud))
 #define pud_user(pud)		pte_user(pud_pte(pud))
 #define pud_user_exec(pud)	pte_user_exec(pud_pte(pud))
+#define pud_valid_not_user(pud)	pmd_valid_not_user(pte_pmd(pud_pte(pud)))
 
 static inline bool pgtable_l4_enabled(void);
 
@@ -845,7 +859,7 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid(pud)) {
+	if (pud_valid_not_user(pud)) {
 		dsb(ishst);
 		isb();
 	}
@@ -916,6 +930,7 @@ static inline bool mm_pud_folded(const struct mm_struct *mm)
 #define p4d_none(p4d)		(pgtable_l4_enabled() && !p4d_val(p4d))
 #define p4d_bad(p4d)		(pgtable_l4_enabled() && !(p4d_val(p4d) & P4D_TABLE_BIT))
 #define p4d_present(p4d)	(!p4d_none(p4d))
+#define p4d_valid_not_user(p4d)	pmd_valid_not_user(pte_pmd(p4d_pte(p4d)))
 
 static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
@@ -925,8 +940,11 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 	}
 
 	WRITE_ONCE(*p4dp, p4d);
-	dsb(ishst);
-	isb();
+
+	if (p4d_valid_not_user(p4d)) {
+		dsb(ishst);
+		isb();
+	}
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1043,6 +1061,7 @@ static inline bool mm_p4d_folded(const struct mm_struct *mm)
 #define pgd_none(pgd)		(pgtable_l5_enabled() && !pgd_val(pgd))
 #define pgd_bad(pgd)		(pgtable_l5_enabled() && !(pgd_val(pgd) & PGD_TABLE_BIT))
 #define pgd_present(pgd)	(!pgd_none(pgd))
+#define pgd_valid_not_user(pgd)	pmd_valid_not_user(pte_pmd(pgd_pte(pgd)))
 
 static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 {
@@ -1052,8 +1071,11 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 	}
 
 	WRITE_ONCE(*pgdp, pgd);
-	dsb(ishst);
-	isb();
+
+	if (pgd_valid_not_user(pgd)) {
+		dsb(ishst);
+		isb();
+	}
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
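
As a standalone illustration (not part of the patch), the table-entry
classification used above can be modelled in plain C. The helper name
is hypothetical; the bit positions are taken from the Arm ARM table
descriptor format and match the kernel's PMD_TABLE_PXN/PMD_TABLE_UXN
definitions (bits 59 and 60):

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	/* Table descriptor attribute bits (Arm ARM, stage 1). */
	#define TABLE_PXN	(1ULL << 59)	/* PXNTable */
	#define TABLE_UXN	(1ULL << 60)	/* UXNTable (XNTable) */

	/*
	 * User-space table entries always have (PXN && !UXN); every
	 * other combination is a kernel-space table entry, which is
	 * the only case that needs dsb/isb after the pgtable write.
	 */
	static bool table_entry_is_kernel(uint64_t desc)
	{
		return (desc & (TABLE_PXN | TABLE_UXN)) != TABLE_PXN;
	}

	int main(void)
	{
		printf("%d\n", table_entry_is_kernel(TABLE_PXN));	/* 0: user */
		printf("%d\n", table_entry_is_kernel(TABLE_UXN));	/* 1: kernel */
		printf("%d\n", table_entry_is_kernel(0));		/* 1: kernel */
		return 0;
	}

The point of the encoding is that a user table entry is identifiable
purely from the descriptor's PXNTable/UXNTable bits, so set_pXd() needs
no extra state to decide whether the barriers can be elided.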