From patchwork Mon Feb 17 14:08:06 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13977919
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton,
 Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
 "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
 Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v2 14/14] arm64/mm: Batch barriers when updating kernel mappings
Date: Mon, 17 Feb 2025 14:08:06 +0000
Message-ID: <20250217140809.1702789-15-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250217140809.1702789-1-ryan.roberts@arm.com>
References: <20250217140809.1702789-1-ryan.roberts@arm.com>
MIME-Version: 1.0

Because the kernel can't tolerate page faults for kernel mappings, when
setting a valid kernel-space pte (or pmd/pud/p4d/pgd) the kernel emits a
dsb(ishst) to ensure that the store to the pgtable is observed by the
table walker immediately. It additionally emits an isb() to ensure that
any "invalid mapping" fault the CPU has already speculatively determined
for that address gets cancelled.

We can improve the performance of vmalloc operations by batching these
barriers until the end of a set of entry updates. The newly added
arch_update_kernel_mappings_begin() / arch_update_kernel_mappings_end()
provide the required hooks. vmalloc improves by up to 30% as a result.

A new TIF_ flag is created; TIF_KMAP_UPDATE_ACTIVE tells us whether we
are in batch mode and can therefore defer any barriers until the end of
the batch.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h     | 73 ++++++++++++++++++++--------
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/kernel/process.c          |  9 ++--
 3 files changed, 59 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 51128c2956f8..f8866dbdfde7 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,6 +39,49 @@
 #include
 #include
 #include
+#include
+
+static inline void emit_pte_barriers(void)
+{
+	/*
+	 * These barriers are emitted under certain conditions after a pte entry
+	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
+	 * visible to the table walker. The isb ensures that any previous
+	 * speculative "invalid translation" marker that is in the CPU's
+	 * pipeline gets cleared, so that any access to that address after
+	 * setting the pte to valid won't cause a spurious fault.
+	 * If the thread gets preempted after storing to the pgtable but before
+	 * emitting these barriers, __switch_to() emits a dsb which ensures the
+	 * walker gets to see the store. There is no guarantee of an isb being
+	 * issued though. This is safe because it will still get issued (albeit
+	 * on a potentially different CPU) when the thread starts running again,
+	 * before any access to the address.
+	 */
+	dsb(ishst);
+	isb();
+}
+
+static inline void queue_pte_barriers(void)
+{
+	if (!test_thread_flag(TIF_KMAP_UPDATE_ACTIVE))
+		emit_pte_barriers();
+}
+
+#define arch_update_kernel_mappings_begin arch_update_kernel_mappings_begin
+static inline void arch_update_kernel_mappings_begin(unsigned long start,
+						     unsigned long end)
+{
+	set_thread_flag(TIF_KMAP_UPDATE_ACTIVE);
+}
+
+#define arch_update_kernel_mappings_end arch_update_kernel_mappings_end
+static inline void arch_update_kernel_mappings_end(unsigned long start,
+						   unsigned long end,
+						   pgtbl_mod_mask mask)
+{
+	clear_thread_flag(TIF_KMAP_UPDATE_ACTIVE);
+	emit_pte_barriers();
+}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -323,10 +366,8 @@ static inline void __set_pte_complete(pte_t pte)
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
 	 */
-	if (pte_valid_not_user(pte)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pte_valid_not_user(pte))
+		queue_pte_barriers();
 }
 
 static inline void __set_pte(pte_t *ptep, pte_t pte)
@@ -791,10 +832,8 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid_not_user(pmd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pmd_valid_not_user(pmd))
+		queue_pte_barriers();
 }
 
 static inline void pmd_clear(pmd_t *pmdp)
@@ -859,10 +898,8 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid_not_user(pud)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pud_valid_not_user(pud))
+		queue_pte_barriers();
 }
 
 static inline void pud_clear(pud_t *pudp)
@@ -941,10 +978,8 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 
 	WRITE_ONCE(*p4dp, p4d);
 
-	if (p4d_valid_not_user(p4d)) {
-		dsb(ishst);
-		isb();
-	}
+	if (p4d_valid_not_user(p4d))
+		queue_pte_barriers();
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1072,10 +1107,8 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 
 	WRITE_ONCE(*pgdp, pgd);
 
-	if (pgd_valid_not_user(pgd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pgd_valid_not_user(pgd))
+		queue_pte_barriers();
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 1114c1c3300a..3856e0759cc3 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -82,6 +82,7 @@ void arch_setup_new_exec(void);
 #define TIF_SME_VL_INHERIT	28	/* Inherit SME vl_onexec across exec */
 #define TIF_KERNEL_FPSTATE	29	/* Task is in a kernel mode FPSIMD section */
 #define TIF_TSC_SIGSEGV		30	/* SIGSEGV on counter-timer access */
+#define TIF_KMAP_UPDATE_ACTIVE	31	/* kernel map update in progress */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 42faebb7b712..45a55fe81788 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -680,10 +680,11 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	gcs_thread_switch(next);
 
 	/*
-	 * Complete any pending TLB or cache maintenance on this CPU in case
-	 * the thread migrates to a different CPU.
-	 * This full barrier is also required by the membarrier system
-	 * call.
+	 * Complete any pending TLB or cache maintenance on this CPU in case the
+	 * thread migrates to a different CPU. This full barrier is also
+	 * required by the membarrier system call. Additionally it makes any
+	 * in-progress pgtable writes visible to the table walker; see
+	 * emit_pte_barriers().
 	 */
 	dsb(ish);
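
[Editor's note, for illustration only and not part of the patch: a minimal
sketch of how a caller is expected to use the new hooks. The helper names
kernel_map_range_sketch() and map_one_kernel_pte() are hypothetical; per
the commit message, the real callers are the core vmalloc paths.]

	/*
	 * Hypothetical kernel-context sketch. Between begin() and end(),
	 * TIF_KMAP_UPDATE_ACTIVE is set, so __set_pte() (and the
	 * pmd/pud/p4d/pgd variants) skip their dsb(ishst)/isb() pair via
	 * queue_pte_barriers(); end() then emits one pair for the whole batch.
	 */
	static void kernel_map_range_sketch(unsigned long start,
					    unsigned long end)
	{
		pgtbl_mod_mask mask = 0;
		unsigned long addr;

		arch_update_kernel_mappings_begin(start, end);

		for (addr = start; addr < end; addr += PAGE_SIZE)
			map_one_kernel_pte(addr);	/* reaches __set_pte() */

		/* Emits the single deferred dsb(ishst); isb() pair. */
		arch_update_kernel_mappings_end(start, end, mask);
	}

[If the task is preempted mid-batch, __switch_to()'s dsb(ish) makes the
pgtable stores visible to the walker, and the deferred isb is still issued
when the batch completes, as the emit_pte_barriers() comment explains.]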