From patchwork Fri Feb 21 00:53:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984687
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 01/16] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional
Date: Thu, 20 Feb 2025 19:53:00 -0500
Message-ID: <20250221005345.2156760-2-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org

Currently x86 uses CONFIG_MMU_GATHER_TABLE_FREE when using
paravirt, and not when running on bare metal.

There is no real good reason to do things differently for
each setup. Make them all the same.

Currently get_user_pages_fast synchronizes against page table
freeing in two different ways:
- on bare metal, by blocking IRQs, which block TLB flush IPIs
- on paravirt, with MMU_GATHER_RCU_TABLE_FREE

This is done because some paravirt TLB flush implementations
handle the TLB flush in the hypervisor, and will do the flush
even when the target CPU has interrupts disabled.

Always handle page table freeing with MMU_GATHER_RCU_TABLE_FREE.
Using RCU synchronization between page table freeing and
get_user_pages_fast() allows bare metal to also do TLB flushing
while interrupts are disabled.

Various places in the mm do still block IRQs or disable preemption
as an implicit way to block RCU frees.

That makes it safe to use INVLPGB on AMD CPUs.

Signed-off-by: Rik van Riel
Suggested-by: Peter Zijlstra
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/Kconfig           |  2 +-
 arch/x86/kernel/paravirt.c | 17 +----------------
 arch/x86/mm/pgtable.c      | 27 ++++-----------------------
 3 files changed, 6 insertions(+), 40 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6df7779ed6da..aeb07da762fc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -278,7 +278,7 @@ config X86
 	select HAVE_PCI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
-	select MMU_GATHER_RCU_TABLE_FREE		if PARAVIRT
+	select MMU_GATHER_RCU_TABLE_FREE
 	select MMU_GATHER_MERGE_VMAS
 	select HAVE_POSIX_CPU_TIMERS_TASK_WORK
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1ccaa3397a67..527f5605aa3e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,21 +59,6 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }
 
-#ifndef CONFIG_PT_RECLAIM
-static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct ptdesc *ptdesc = (struct ptdesc *)table;
-
-	pagetable_dtor(ptdesc);
-	tlb_remove_page(tlb, ptdesc_page(ptdesc));
-}
-#else
-static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	tlb_remove_table(tlb, table);
-}
-#endif
-
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
@@ -195,7 +180,7 @@ struct paravirt_patch_template pv_ops = {
 	.mmu.flush_tlb_kernel	= native_flush_tlb_global,
 	.mmu.flush_tlb_one_user	= native_flush_tlb_one_user,
 	.mmu.flush_tlb_multi	= native_flush_tlb_multi,
-	.mmu.tlb_remove_table	= native_tlb_remove_table,
+	.mmu.tlb_remove_table	= tlb_remove_table,
 
 	.mmu.exit_mmap		= paravirt_nop,
 	.mmu.notify_page_enc_status_changed	= paravirt_nop,
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 1fef5ad32d5a..b1c1f72c1fd1 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -18,25 +18,6 @@ EXPORT_SYMBOL(physical_mask);
 #define PGTABLE_HIGHMEM 0
 #endif
 
-#ifndef CONFIG_PARAVIRT
-#ifndef CONFIG_PT_RECLAIM
-static inline
-void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct ptdesc *ptdesc = (struct ptdesc *)table;
-
-	pagetable_dtor(ptdesc);
-	tlb_remove_page(tlb, ptdesc_page(ptdesc));
-}
-#else
-static inline
-void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	tlb_remove_table(tlb, table);
-}
-#endif /* !CONFIG_PT_RECLAIM */
-#endif /* !CONFIG_PARAVIRT */
-
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
 pgtable_t pte_alloc_one(struct mm_struct *mm)
@@ -64,7 +45,7 @@ early_param("userpte", setup_userpte);
 void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 {
 	paravirt_release_pte(page_to_pfn(pte));
-	paravirt_tlb_remove_table(tlb, page_ptdesc(pte));
+	tlb_remove_table(tlb, page_ptdesc(pte));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 2
@@ -78,21 +59,21 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 #ifdef CONFIG_X86_PAE
 	tlb->need_flush_all = 1;
 #endif
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(pmd));
+	tlb_remove_table(tlb, virt_to_ptdesc(pmd));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 3
 void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 {
 	paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(pud));
+	tlb_remove_table(tlb, virt_to_ptdesc(pud));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(p4d));
+	tlb_remove_table(tlb, virt_to_ptdesc(p4d));
 }
 #endif	/* CONFIG_PGTABLE_LEVELS > 4 */
 #endif	/* CONFIG_PGTABLE_LEVELS > 3 */

From patchwork Fri Feb 21 00:53:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984688
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 02/16] x86/mm: remove pv_ops.mmu.tlb_remove_table call
Date: Thu, 20 Feb 2025 19:53:01 -0500
Message-ID: <20250221005345.2156760-3-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>

Every pv_ops.mmu.tlb_remove_table call ends up calling tlb_remove_table.

Get rid of the indirection by simply calling tlb_remove_table directly,
and not going through the paravirt function pointers.

Signed-off-by: Rik van Riel
Suggested-by: Qi Zheng
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/hyperv/mmu.c                 | 1 -
 arch/x86/include/asm/paravirt.h       | 5 -----
 arch/x86/include/asm/paravirt_types.h | 2 --
 arch/x86/kernel/kvm.c                 | 1 -
 arch/x86/kernel/paravirt.c            | 1 -
 arch/x86/xen/mmu_pv.c                 | 1 -
 6 files changed, 11 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index cc8c3bd0e7c2..1f7c3082a36d 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -239,5 +239,4 @@ void hyperv_setup_mmu_ops(void)
 
 	pr_info("Using hypercall for remote TLB flush\n");
 	pv_ops.mmu.flush_tlb_multi = hyperv_flush_tlb_multi;
-	pv_ops.mmu.tlb_remove_table = tlb_remove_table;
 }
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 041aff51eb50..38a632a282d4 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -91,11 +91,6 @@ static inline void __flush_tlb_multi(const struct cpumask *cpumask,
 	PVOP_VCALL2(mmu.flush_tlb_multi, cpumask, info);
 }
 
-static inline void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	PVOP_VCALL2(mmu.tlb_remove_table, tlb, table);
-}
-
 static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
 {
 	PVOP_VCALL1(mmu.exit_mmap, mm);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index fea56b04f436..e26633c00455 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -134,8 +134,6 @@ struct pv_mmu_ops {
 	void (*flush_tlb_multi)(const struct cpumask *cpus,
 				const struct flush_tlb_info *info);
 
-	void (*tlb_remove_table)(struct mmu_gather *tlb, void *table);
-
 	/* Hook for intercepting the destruction of an mm_struct. */
 	void (*exit_mmap)(struct mm_struct *mm);
 	void (*notify_page_enc_status_changed)(unsigned long pfn, int npages, bool enc);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 7a422a6c5983..3be9b3342c67 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -838,7 +838,6 @@ static void __init kvm_guest_init(void)
 #ifdef CONFIG_SMP
 	if (pv_tlb_flush_supported()) {
 		pv_ops.mmu.flush_tlb_multi = kvm_flush_tlb_multi;
-		pv_ops.mmu.tlb_remove_table = tlb_remove_table;
 		pr_info("KVM setup pv remote TLB flush\n");
 	}
 
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 527f5605aa3e..2aa251d0b308 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -180,7 +180,6 @@ struct paravirt_patch_template pv_ops = {
 	.mmu.flush_tlb_kernel	= native_flush_tlb_global,
 	.mmu.flush_tlb_one_user	= native_flush_tlb_one_user,
 	.mmu.flush_tlb_multi	= native_flush_tlb_multi,
-	.mmu.tlb_remove_table	= tlb_remove_table,
 
 	.mmu.exit_mmap		= paravirt_nop,
 	.mmu.notify_page_enc_status_changed	= paravirt_nop,
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 2c70cd35e72c..a0b371557125 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2141,7 +2141,6 @@ static const typeof(pv_ops) xen_mmu_ops __initconst = {
 		.flush_tlb_kernel = xen_flush_tlb,
 		.flush_tlb_one_user = xen_flush_tlb_one_user,
 		.flush_tlb_multi = xen_flush_tlb_multi,
-		.tlb_remove_table = tlb_remove_table,
 
 		.pgd_alloc = xen_pgd_alloc,
 		.pgd_free = xen_pgd_free,

From patchwork Fri Feb 21 00:53:02 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984677
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel, Dave Hansen
Subject: [PATCH v12 03/16] x86/mm: consolidate full flush threshold decision
Date: Thu, 20 Feb 2025 19:53:02 -0500
Message-ID: <20250221005345.2156760-4-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>

Reduce code duplication by consolidating the decision point
for whether to do individual invalidations or a full flush
inside get_flush_tlb_info.

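For illustration only, the consolidated decision can be sketched in user space as below. This is not the kernel code: struct flush_range, pick_flush_range(), the definition of TLB_FLUSH_ALL as ULONG_MAX, and the ceiling value of 33 are all assumptions made for the example. Only the names TLB_FLUSH_ALL, stride_shift and tlb_single_page_flush_ceiling come from the patch. The sketch also suggests why callers may not need their own "end == TLB_FLUSH_ALL" test: a range that already ends at TLB_FLUSH_ALL is far above any realistic ceiling.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: stands in for the kernel's "flush everything" marker. */
#define TLB_FLUSH_ALL ULONG_MAX

/* Hypothetical container for the range that will actually be flushed. */
struct flush_range {
	unsigned long start;
	unsigned long end;
};

/* Assumption: illustrative tunable value, chosen for this example. */
static unsigned long tlb_single_page_flush_ceiling = 33;

/*
 * Hypothetical helper mirroring the check the patch adds to
 * get_flush_tlb_info(): widen to a full flush when the range covers
 * more than the ceiling's worth of (1 << stride_shift)-sized entries.
 */
static struct flush_range pick_flush_range(unsigned long start,
					   unsigned long end,
					   unsigned int stride_shift)
{
	struct flush_range r = { .start = start, .end = end };

	if (((end - start) >> stride_shift) > tlb_single_page_flush_ceiling) {
		r.start = 0;
		r.end = TLB_FLUSH_ALL;
	}
	return r;
}
```

With a stride_shift of 12, a four-page range is returned unchanged, while a 34-page range or a range ending at TLB_FLUSH_ALL comes back as start 0, end TLB_FLUSH_ALL, so every caller sees the same threshold.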
Signed-off-by: Rik van Riel Suggested-by: Dave Hansen Tested-by: Michael Kelley Acked-by: Dave Hansen --- arch/x86/mm/tlb.c | 41 +++++++++++++++++++---------------------- 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index ffc25b348041..dbcb5c968ff9 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -1000,6 +1000,15 @@ static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm, BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1); #endif + /* + * If the number of flushes is so large that a full flush + * would be faster, do a full flush. + */ + if ((end - start) >> stride_shift > tlb_single_page_flush_ceiling) { + start = 0; + end = TLB_FLUSH_ALL; + } + info->start = start; info->end = end; info->mm = mm; @@ -1026,17 +1035,8 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, bool freed_tables) { struct flush_tlb_info *info; + int cpu = get_cpu(); u64 new_tlb_gen; - int cpu; - - cpu = get_cpu(); - - /* Should we flush just the requested range? */ - if ((end == TLB_FLUSH_ALL) || - ((end - start) >> stride_shift) > tlb_single_page_flush_ceiling) { - start = 0; - end = TLB_FLUSH_ALL; - } /* This is also a barrier that synchronizes with switch_mm(). 
*/ new_tlb_gen = inc_mm_tlb_gen(mm); @@ -1089,22 +1089,19 @@ static void do_kernel_range_flush(void *info) void flush_tlb_kernel_range(unsigned long start, unsigned long end) { - /* Balance as user space task's flush, a bit conservative */ - if (end == TLB_FLUSH_ALL || - (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) { - on_each_cpu(do_flush_tlb_all, NULL, 1); - } else { - struct flush_tlb_info *info; + struct flush_tlb_info *info; + + guard(preempt)(); - preempt_disable(); - info = get_flush_tlb_info(NULL, start, end, 0, false, - TLB_GENERATION_INVALID); + info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false, + TLB_GENERATION_INVALID); + if (info->end == TLB_FLUSH_ALL) + on_each_cpu(do_flush_tlb_all, NULL, 1); + else on_each_cpu(do_kernel_range_flush, info, 1); - put_flush_tlb_info(); - preempt_enable(); - } + put_flush_tlb_info(); } /* From patchwork Fri Feb 21 00:53:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13984679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 047D1C021B3 for ; Fri, 21 Feb 2025 00:55:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3DD0D6B00AC; Thu, 20 Feb 2025 19:55:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E628028000C; Thu, 20 Feb 2025 19:55:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BAB6F28000B; Thu, 20 Feb 2025 19:55:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 445856B00AE for ; Thu, 20 Feb 2025 19:55:09 -0500 (EST) Received: from smtpin06.hostedemail.com (a10.router.float.18 
[10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id C5E3148E27 for ; Fri, 21 Feb 2025 00:55:08 +0000 (UTC) X-FDA: 83142132696.06.46DCEF3 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf12.hostedemail.com (Postfix) with ESMTP id 2210B40013 for ; Fri, 21 Feb 2025 00:55:06 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf12.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099307; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l+vicx6yTnFJ2B8TG3u6etT+xSitiRT+7lgNVkCj/zA=; b=sCt9yDj9dGWSHx4gRqv4w/XsWwAHPTjtSs2ipyaN9Vn/32tuwUDVkfVagNywbonIa7XX1P MidQTJdo8UGHYGDv/jsjg2VkuQQ2whoa3xum8bzk3n3sev/pfXPlPm/k7UY3ITgEnULtTD 83RNrHT6YKyoWC8KziRgy/Jvw65ULE4= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf12.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099307; a=rsa-sha256; cv=none; b=2PTbeRY8SeQYGcNSy2b6J8G+Ko2h6F1sA/1eagd5b3Bs4+NCjw6v/AtjXkrSuOJs7g+Lur BdVqvTAo7HA4c6NbAFP5Ky7jTVpqA+w9noToLfhOVujd+NlHKuxOz0Is08F1fYueIYku+I GiJVWmISFOI6Kjl6jiqoosYjPrAEtZE= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-0tox; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, 
zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel , Dave Hansen Subject: [PATCH v12 04/16] x86/mm: get INVLPGB count max from CPUID Date: Thu, 20 Feb 2025 19:53:03 -0500 Message-ID: <20250221005345.2156760-5-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250221005345.2156760-1-riel@surriel.com> References: <20250221005345.2156760-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 2210B40013 X-Stat-Signature: o6oxxeuto749wjyaie6t88xs33sshi5b X-HE-Tag: 1740099306-121877 X-HE-Meta: U2FsdGVkX18tGR0zGAag1/oPsk0wUdGqDyydHc+5VpLeZ/Y9MkMckwfD8AXJEI9EAUeYjADH+4cvx8iJeA2NvAF8pXZUc8AWl0a9UfPSO9QBec5u8dKuPUaI5wx8CRSqrKWneXI0r/oE2rcEDTeS/Fp1BzcKoyJYOClYZ9cAyN/q1hisw8UGPJwUgWnx74EZj9xJ+NL1MDfro5WD5nh7w4wKQuz+eJE/bnaWNg8q6Yxvron5mlh5EWsyvSbZji5nEKDz25Vlph0uMGY925IALVWPyDwsBn/WNS4kh3qrFuD1U17XMk+gGxXpmRD7D/zNedm8jlguCUAO3k6ahhblD0yNZHJKkbWIvhHSRI40YsPIIy1KtDE+gMA8I3pI7PeUmMJI/YAqEGUOfos0K1E41FtOR6R/MLjB4/fXi9PRsvufhQCJPf3ZtwVK0R+Pw0OptM5A/FeZDjOGl2JaClsvRWTftTkQS6F4mypqylX6DQKr+vM80W80PuTLgkTlwNrE+MT2IIXcV64MF27ZzIATiV/ZAa9K8SXph9cK+XuhjHqScWMMpTv6AtQHQ6i5PHmmFxjviQkH7LI+lWJtf4iqqKIlYDEViO5FfY/Qw4xQ34AlOSeMmOzsWJ21krIX+TSpFo8885heCaNSViGVpxL7/YQK63IWz3GA/kan+lbuE91/voMMoq/JO3LKxirmSP8jGNar43UDhNd5BjlmAu5rTz5VhlB5NOgi2jcU39RbQHNvwR8d12S/FVgjTYUZjgW4Vi4xts7X3al+x0KZmX+0ikvcowkRNjZzt4quvp8hAiTn6xjr3fwgccWZtgSGNy1ExVtOE4xzE1xULpym2GUq9xiF4BkvVonr2Kvg6R021xjPUiHmU9XQTLitDEzfFMVXNHjUQar3T5yVgWFAYHR+sXb1yPib8pglZJ6KSoBXqpBwl05VKiLJ1nDxJWOd5nvLrqmJsIIPlGKGA176kg5 I8mn/7lo 
7Rxmnnv2Y6K+OIuwUg3qtElhA4GX4AfNtjjnjSFEKhVdXHajjVGNYx2t9NnCICM0FCo7gB/Lqfo42J8y1cmg1X9As+CEuVSmmT9s7zrzWybZ7sBqajqWi/HCWjFJ+w0AvPJDhYpkbitA2sRMLrGk9aSY49IX0bsPctR6STh4k5AcVGjdjNf1WU/UlaOd5FEtTdWRq3isUD/FtGRsoKklt6cTrGfPz4HphxuRtztUozWAN/D/NMcrLzAqSHx6MlBFRGoIkrl4Kc4THb74fwt9+yLayGsFLZc8zjebQH8WF9ubJsceIEQdchh8FaxDhthn6I4UvC0z/v/IyLqBLvpxuqoP1pwkm5OxQxIUzcOMcD7sGMtgw3mF4BeBsV1OstOwafe+IJFwyoX3O4mucVREtjY6niNSuk8vTwnuALenjCjCWuZMwvgToBgb6b2eCg5P6axogkknqam1kDbrrPSWbFWirHPkxdvUKGa9g X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: The CPU advertises the maximum number of pages that can be shot down with one INVLPGB instruction in the CPUID data. Save that information for later use. Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley Acked-by: Dave Hansen --- arch/x86/Kconfig.cpu | 5 +++++ arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/tlbflush.h | 3 +++ arch/x86/kernel/cpu/common.c | 3 +++ 4 files changed, 12 insertions(+) diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu index 2a7279d80460..bb6943c21b7f 100644 --- a/arch/x86/Kconfig.cpu +++ b/arch/x86/Kconfig.cpu @@ -401,6 +401,10 @@ menuconfig PROCESSOR_SELECT This lets you choose what x86 vendor support code your kernel will include. 
+config X86_BROADCAST_TLB_FLUSH + def_bool y + depends on CPU_SUP_AMD && 64BIT + config CPU_SUP_INTEL default y bool "Support Intel processors" if PROCESSOR_SELECT @@ -431,6 +435,7 @@ config CPU_SUP_CYRIX_32 config CPU_SUP_AMD default y bool "Support AMD processors" if PROCESSOR_SELECT + select X86_BROADCAST_TLB_FLUSH help This enables detection, tunings and quirks for AMD processors diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 508c0dad116b..b5c66b7465ba 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -338,6 +338,7 @@ #define X86_FEATURE_CLZERO (13*32+ 0) /* "clzero" CLZERO instruction */ #define X86_FEATURE_IRPERF (13*32+ 1) /* "irperf" Instructions Retired Count */ #define X86_FEATURE_XSAVEERPTR (13*32+ 2) /* "xsaveerptr" Always save/restore FP error pointers */ +#define X86_FEATURE_INVLPGB (13*32+ 3) /* INVLPGB and TLBSYNC instruction supported. */ #define X86_FEATURE_RDPRU (13*32+ 4) /* "rdpru" Read processor register at user level */ #define X86_FEATURE_WBNOINVD (13*32+ 9) /* "wbnoinvd" WBNOINVD instruction */ #define X86_FEATURE_AMD_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */ diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 3da645139748..09463a2fb05f 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -183,6 +183,9 @@ static inline void cr4_init_shadow(void) extern unsigned long mmu_cr4_features; extern u32 *trampoline_cr4_features; +/* How many pages can we invalidate with one INVLPGB. 
*/ +extern u16 invlpgb_count_max; + extern void initialize_tlbstate_and_flush(void); /* diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 7cce91b19fb2..742bdb0c4846 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -95,6 +95,8 @@ EXPORT_SYMBOL(__num_cores_per_package); unsigned int __num_threads_per_package __ro_after_init = 1; EXPORT_SYMBOL(__num_threads_per_package); +u16 invlpgb_count_max __ro_after_init; + static struct ppin_info { int feature; int msr_ppin_ctl; @@ -1030,6 +1032,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c) if (c->extended_cpuid_level >= 0x80000008) { cpuid(0x80000008, &eax, &ebx, &ecx, &edx); c->x86_capability[CPUID_8000_0008_EBX] = ebx; + invlpgb_count_max = (edx & 0xffff) + 1; } if (c->extended_cpuid_level >= 0x8000000a) From patchwork Fri Feb 21 00:53:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13984689 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75E65C021B3 for ; Fri, 21 Feb 2025 00:55:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C2151280016; Thu, 20 Feb 2025 19:55:28 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B7F6B280014; Thu, 20 Feb 2025 19:55:28 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9335F280016; Thu, 20 Feb 2025 19:55:28 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 6C149280014 for ; Thu, 20 Feb 2025 19:55:28 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 
25EE0120285 for ; Fri, 21 Feb 2025 00:55:28 +0000 (UTC) X-FDA: 83142133536.09.76CAD87 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf24.hostedemail.com (Postfix) with ESMTP id 9FB5418000C for ; Fri, 21 Feb 2025 00:55:26 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf24.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099326; a=rsa-sha256; cv=none; b=5fllDAAXlf3bHJZdOh2y72EVXsaHB3dq+ogUarArHYwowff/HR1pwAlEfyfFxv+IRXydhm ZNTq5VioXuGeXgy/CokU9HvjukLdQnxLI0McWGPFTSU2HcbAM0t02I0caIBx7KO8PsgeeE gVDOz5+h5e1iTsfvzWvnVjwtxgFOm28= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf24.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099326; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FDNW7A3jW76TJvTwQhr2RDVesMXpIDtxJIAfA8ejwZs=; b=6cgWYABbKUZBS8l/GvL2v2wtq01S7mDjPXpAUZtsu+vOK/T9APNhNnSqjALkXDyOsRfvKE LXkjnwMPeOXIsNcdYSM3p+lEUa4JFi4sm/j1fBcdMOnbcjq82optDLtg4vaqiudHS+et2Y WOlpCOeyg54fRhUj/2SR84R2YfFVRGk= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-0zou; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, 
kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel , Dave Hansen Subject: [PATCH v12 05/16] x86/mm: add INVLPGB support code Date: Thu, 20 Feb 2025 19:53:04 -0500 Message-ID: <20250221005345.2156760-6-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250221005345.2156760-1-riel@surriel.com> References: <20250221005345.2156760-1-riel@surriel.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 9FB5418000C X-Stat-Signature: 8c3eyegecq93o9zzxg81roxdozi6e479 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1740099326-355752 X-HE-Meta: U2FsdGVkX19zvlb5pb0Bq3GkQBa+1WkhHaMa/jWnBgqAZUiN2r1AsmAMeO2QhUDH4VYD+fCp3sKey3RNgzJwj/a71xAkIN9BgVKcCdj6GnGObiBcBFO9gpz08+uLz6WfWXI1lMVfpA1bwAUYLYoTChcP/qisLT2Yn60uZJbxuv0z7nEzqx7mhO4yAT+6OwwhhcT3zVYjgoAC6zecxn4br/AH74yREBDkamyYXkJ896EENqMM+xEVqxxNreI2uMZ/SvYuYxJTFHhWAEmRSnYDz/V8Oxxy/DnbMnXMAKnpnm17sB1U+dZEPpbgTU/Ybj/4yHNpXllvWfuzPOoqIy/ECnTaCXfIwmoVkMj26A+pS26Nu+Bc3gJyRavXONcpoHYQzBIIgRNkgFLqeLIIEKgM2ZHlhhY1idpy4E+/UbVfa2WXhmMAoUiNudtmLBRNIqJdAFdSrEQxBDIzkSDO2ZKejmLjZQbREbQIlXGJiuRKirij1TZAVz1Rn/R2EAjtK92AsaCwdDOqlwzDLB0EyO3jsF7R+b6+lIwMeo1evtJlBeGjNk0mYuXc1pDRaVzq14E2GgdMQ0qmFpGOubLEEFnRBOWNicR554nBQnTQp7zQCE2uPdpdZiRfv6zHfaKIwMMBjuyj6Qw0Cuo8M62e/biFNFaa9GwLJ9JJSuymaKqSamtzbhmEEKVeen87pX4EbK2bzJgPxKTY3pYQPct61bm8+wzdz3WTUwG7N6Fvjkx8eLrg6074DLH7TX/fVgC7+dxQg8pvL85A5U0ENMZyh8+qxxezta7c3IewFJ5wMUbK8AM0k1hHsOwVqFAPK+J2UPJako9xKCTslk3FrsSEpOSpC2MMukhB4PgQ3dQIKr+zRj1bFN2x0vEsRsZfGPQarkZuLt419sdIKhi6evkeSgjZzfwePF9btZQJ8DgR5XsZBoKxBDcCWc6gUS2baKDryIkziknsNb3gWs3Obnyts+C tSqTFkcH 
iNkInTqnMrLVA6ktWhmZc/4W4+jBMUP/7jc7QBJL/3vbum7VBh5jY+K2b3salncF76br1GS9jE5hAdQc06h9OCFmlRfPEX9hEYj5Da6e/Q98xE9qIIF//ldoJCV8BCcE8f9diGmPnKAJ3zNi9gSa7b+dExYtUgDbkjiFS7MadrAgZeZpPuqWHXLXd14rkpByYvLUNsXNdFBjmVxh8h9QfpB0BTyyU/ftvYP316Zpiacj/lczIDAmoF8DEJ0B4nLu5GTgUKYVOIMOZeJokWho9LUWEZHnXuGiLZIrpz5Go3ILwgNwwaywQcOdf8NFPCLsImP027uexw8mlI9sCndl3UPIip2jwIfCcYH7O/88hYrU3SK/jmXqjdmZ7XUDPCTA++YashgNe9c7zxRr+XuPDcV0Ik5I3FX1mPMrvwGIVt4KhcWMkKpTfFIPEIWPtz8vyiG82Sk6Mf1YUYWjuRj4SAcfwZya0TBpnxt1R X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Add invlpgb.h with the helper functions and definitions needed to use broadcast TLB invalidation on AMD EPYC 3 and newer CPUs. All the functions defined in invlpgb.h are used later in the series. Compile time disabling X86_FEATURE_INVLPGB when the config option is not set allows the compiler to omit unnecessary code. Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley Acked-by: Dave Hansen --- arch/x86/include/asm/disabled-features.h | 9 ++- arch/x86/include/asm/tlb.h | 92 ++++++++++++++++++++++++ 2 files changed, 100 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index c492bdc97b05..95997caf0935 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -129,6 +129,13 @@ #define DISABLE_SEV_SNP (1 << (X86_FEATURE_SEV_SNP & 31)) #endif +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH +#define DISABLE_INVLPGB 0 +#else +/* Keep 32 bit kernels smaller by compiling out the INVLPGB code. 
*/ +#define DISABLE_INVLPGB (1 << (X86_FEATURE_INVLPGB & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -146,7 +153,7 @@ #define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \ DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK) #define DISABLED_MASK12 (DISABLE_FRED|DISABLE_LAM) -#define DISABLED_MASK13 0 +#define DISABLED_MASK13 (DISABLE_INVLPGB) #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \ diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index 77f52bc1578a..b3cd521e5e2f 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -6,6 +6,9 @@ static inline void tlb_flush(struct mmu_gather *tlb); #include +#include +#include +#include static inline void tlb_flush(struct mmu_gather *tlb) { @@ -25,4 +28,93 @@ static inline void invlpg(unsigned long addr) asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); } + +/* + * INVLPGB does broadcast TLB invalidation across all the CPUs in the system. + * + * The INVLPGB instruction is weakly ordered, and a batch of invalidations can + * be done in a parallel fashion. + * + * The instruction takes the number of extra pages to invalidate, beyond + * the first page, while __invlpgb gets the more human readable number of + * pages to invalidate. + * + * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from + * this CPU have completed. + */ +static inline void __invlpgb(unsigned long asid, unsigned long pcid, + unsigned long addr, u16 nr_pages, + bool pmd_stride, u8 flags) +{ + u32 edx = (pcid << 16) | asid; + u32 ecx = (pmd_stride << 31) | (nr_pages - 1); + u64 rax = addr | flags; + + /* The low bits in rax are for flags. Verify addr is clean. */ + VM_WARN_ON_ONCE(addr & ~PAGE_MASK); + + /* INVLPGB; supported in binutils >= 2.36. 
*/ + asm volatile(".byte 0x0f, 0x01, 0xfe" : : "a" (rax), "c" (ecx), "d" (edx)); +} + +/* Wait for INVLPGB originated by this CPU to complete. */ +static inline void __tlbsync(void) +{ + cant_migrate(); + /* TLBSYNC: supported in binutils >= 2.36. */ + asm volatile(".byte 0x0f, 0x01, 0xff" ::: "memory"); +} + +/* + * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination + * of the three. For example: + * - INVLPGB_VA | INVLPGB_INCLUDE_GLOBAL: invalidate all TLB entries at the address + * - INVLPGB_PCID: invalidate all TLB entries matching the PCID + * + * The first can be used to invalidate (kernel) mappings at a particular + * address across all processes. + * + * The latter invalidates all TLB entries matching a PCID. + */ +#define INVLPGB_VA BIT(0) +#define INVLPGB_PCID BIT(1) +#define INVLPGB_ASID BIT(2) +#define INVLPGB_INCLUDE_GLOBAL BIT(3) +#define INVLPGB_FINAL_ONLY BIT(4) +#define INVLPGB_INCLUDE_NESTED BIT(5) + +static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, + bool pmd_stride) +{ + __invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA); +} + +/* Flush all mappings for a given PCID, not including globals. */ +static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +{ + __invlpgb(0, pcid, 0, 1, 0, INVLPGB_PCID); +} + +/* Flush all mappings, including globals, for all PCIDs. */ +static inline void invlpgb_flush_all(void) +{ + __invlpgb(0, 0, 0, 1, 0, INVLPGB_INCLUDE_GLOBAL); + __tlbsync(); +} + +/* Flush addr, including globals, for all PCIDs. */ +static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +{ + __invlpgb(0, 0, addr, nr, 0, INVLPGB_INCLUDE_GLOBAL); +} + +/* Flush all mappings for all PCIDs except globals. 
*/ +static inline void invlpgb_flush_all_nonglobals(void) +{ + __invlpgb(0, 0, 0, 1, 0, 0); + __tlbsync(); +} + #endif /* _ASM_X86_TLB_H */ From patchwork Fri Feb 21 00:53:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13984680 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD44DC021B3 for ; Fri, 21 Feb 2025 00:55:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 406EC280009; Thu, 20 Feb 2025 19:55:11 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 319996B00AD; Thu, 20 Feb 2025 19:55:11 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 145786B00AE; Thu, 20 Feb 2025 19:55:11 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id E8BBD6B00AD for ; Thu, 20 Feb 2025 19:55:10 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 8E36D8021E for ; Fri, 21 Feb 2025 00:55:10 +0000 (UTC) X-FDA: 83142132780.17.B6DCA9D Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf16.hostedemail.com (Postfix) with ESMTP id 17DC5180006 for ; Fri, 21 Feb 2025 00:55:08 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf16.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099309; h=from:from:sender:sender:reply-to:subject:subject:date:date: 
message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hF8xMwU6gtFHerZ+8FO0Ift86ipYBUao4K8kpXc1fTk=; b=hTd8z8xFXyW1OZPDZXrQfJSMLeJLw4Hfb797F3tGZd0Elg/LnqG/h0tly90I3w0AcaayWa A4HQX+sCqayaiC3KCs7SfXCyREAk/+Gm96FH18WXWPC7zDUXHPYYsIgXA9w9eq9OYs921F YG/10ci+QmOXvI9WWLToerfVvun0oNI= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf16.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099309; a=rsa-sha256; cv=none; b=GQl4jZ4Y6UEOxTD6RC/YKGW/fcEilmQYwEEgux+0rBsS9c/biwJvhAeZuO5lP52lNqvmAU jRdipqvoegohuwIDKfC+ENw8ZQn0KQ5IHmf6ZcQG9u67TSfYRa+lxHaZR/WU0K24q9lLLn r2zvnJUGSvGh8h2Mk4gzKGnl4RoD48I= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-15zW; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel Subject: [PATCH v12 06/16] x86/mm: use INVLPGB for kernel TLB flushes Date: Thu, 20 Feb 2025 19:53:05 -0500 Message-ID: <20250221005345.2156760-7-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250221005345.2156760-1-riel@surriel.com> References: <20250221005345.2156760-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 17DC5180006 X-Stat-Signature: hxswjh38y7m54umn1ftxf1wq5atpk461 X-HE-Tag: 
1740099308-377796 X-HE-Meta: U2FsdGVkX1/gSHjhoID21GPKWmQbqRUI5ieYn720WR8JY3Qv59nmwLNTNFKrxpzy8hc/cbga9IFf5dibzrQRND2+xfDdZqJwnktpPhKS6qZE3YXvnkwplF+qJy8i/8EzcmRKvjbixaBJIMYLFw/b2/Egko9RNdYTUG8mPKytZ8nYwWfDvdzmorR0GElUQ9NyKLLoZn6V2o7RYn5S7iTMniq5BW5aLwhpL4XZzvRCpps/ytDrpjkVzRbXiZm0kK8D8Uaxbj0vfNyUjCGU1R76Zd1tCG0lul2iHIlhZ2wKo/5/oD8U6hv6TFT/82Na468LXG6gJCPVNWYjtaS5mvE+8O4gblMTTwsYnUttVlrcVMXWs/oQJS6GkZvJ2xBDTAzPsISK7fN96sV1jUVlqbTDsBv88QjNhVP7ij5hyTkhcSfnyWmcwZxbP7E0llAQATwuMlowXfaImRHnLMfB7NuaGq+IN0x0v+3NHVmSDHHYA4V5wwCqtwhAHWREJjgN7D4bNTW6OLIEEayQB/DjKOwsJfxdUBcp2Dz6qHE92RykbIYwbO//dTOGvnu4PiF7bcqzo2tcdM5C0eGLb32dh9eMJMCW8kiv8JaatACPyjv4gcwPquwufSBoU02spxxd/KwE5JplJ8dyXvaISXV5vUzlPgvBXn2Le5l3cpejbtlHHkSsLGdCsvFIIDpOf5weRJ0QudWLCTVlxVcN95q1v5eZxhvviaX1+jzwQDRNDZO46gaZqFgyOzsuTc2Tq9aXzvPyfVcoiaBrNBUZM4/fC9smu3Kl6sDKKHxTJjc+g+560ndDkE/jbl6Ixf2VknzWMHmuZYmMUW1yExIlSVtH45bfkL5PtEpwsfCZvNUv6og5jacETMCEvg7esKoUMNPHh3+PbOuy+jfjSKzgvezT5Ctq12L+DfMqAa1OOVN838f0iidnuCanqOf2d+g+Vukl/mrKma4Kj68gJdMMCI1rr6i iIoGJdn/ inlx4hUqoOeE7on8Q7BOpNQkGjI+sDN/H1umFda8JpxZ72X/DF1rR4JRBBjwn+c/mkrIAYsFPhsQ8sBG3SBhBYme+q+/7WjEWYVE8jzL1+Zwf5oYfdyTU98iedMRn0JyXUJtuPIK4hLoPQTxkDhMgSZ6Ee2yLyPkkhQNEjwrbraacB6CB54FTf7c4hbxz9prmPiVSoxBmb8W9qnw+FyWgK7oDYtl0vh8rV2gAQJJ9oXRpvo3Mwbpg7sapH608qdzRgXzt2nkA99gulUnjwAttUbtBV+X0x+ok2IgGyLh4jq7kd4JtKlp0dO/AHZA4DpwAutb1k3XJIeGiCBV3LIfBgHC0FcF9KaeBzxhYcxvVo3oIKZy2fLKwR92p7mdzTwuHzlhm2E0K7iCRUQT+CMBDgw9M8xNP+WtvZ7q6DY8PzoakHCAuiythjkXQKPHZglNeXeLg/HMpL6tWMO/7R0eR328/eSSBgYkkhpnm X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Use broadcast TLB invalidation for kernel addresses when available. Remove the need to send IPIs for kernel TLB flushes. 
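For illustration only, here is a userspace sketch of how the range flush in this patch splits a kernel address range into INVLPGB batches of at most invlpgb_count_max pages. It is not kernel code: sketch_split_range(), struct sketch_batch and SKETCH_PAGE_SHIFT are hypothetical names, 4 KiB pages are assumed, and the batches are recorded in an array where the real code would issue one INVLPGB per batch followed by a single TLBSYNC.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12UL

/* Hypothetical stand-in for one INVLPGB covering nr pages starting at addr. */
struct sketch_batch {
	unsigned long addr;
	unsigned long nr;
};

/*
 * Walk [start, end) in steps of at most count_max pages, mirroring the
 * loop shape of invlpgb_kernel_range_flush(). Each step is stored in out
 * (up to cap entries) instead of being sent to the hardware.
 * Returns the total number of batches the range needs.
 */
static size_t sketch_split_range(unsigned long start, unsigned long end,
				 unsigned long count_max,
				 struct sketch_batch *out, size_t cap)
{
	unsigned long addr, nr;
	size_t n = 0;

	assert(count_max >= 1);

	for (addr = start; addr < end; addr += nr << SKETCH_PAGE_SHIFT) {
		/* Pages left in the range, clamped to [1, count_max]. */
		nr = (end - addr) >> SKETCH_PAGE_SHIFT;
		if (nr < 1)
			nr = 1;
		if (nr > count_max)
			nr = count_max;

		if (n < cap) {
			out[n].addr = addr;
			out[n].nr = nr;
		}
		n++;
	}
	return n;
}
```

With a 20 page range and count_max of 8, the sketch yields three batches of 8, 8 and 4 pages. The lower clamp of 1 keeps the loop advancing when fewer than one full page remains in the range.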
Signed-off-by: Rik van Riel Reviewed-by: Nadav Amit Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/mm/tlb.c | 34 ++++++++++++++++++++++++++++++++-- 1 file changed, 32 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index dbcb5c968ff9..59396a3c6e9c 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -1077,6 +1077,20 @@ void flush_tlb_all(void) on_each_cpu(do_flush_tlb_all, NULL, 1); } +static bool invlpgb_kernel_range_flush(struct flush_tlb_info *info) +{ + unsigned long addr; + unsigned long nr; + + for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) { + nr = (info->end - addr) >> PAGE_SHIFT; + nr = clamp_val(nr, 1, invlpgb_count_max); + invlpgb_flush_addr_nosync(addr, nr); + } + __tlbsync(); + return true; +} + static void do_kernel_range_flush(void *info) { struct flush_tlb_info *f = info; @@ -1087,6 +1101,22 @@ static void do_kernel_range_flush(void *info) flush_tlb_one_kernel(addr); } +static void kernel_tlb_flush_all(struct flush_tlb_info *info) +{ + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) + invlpgb_flush_all(); + else + on_each_cpu(do_flush_tlb_all, NULL, 1); +} + +static void kernel_tlb_flush_range(struct flush_tlb_info *info) +{ + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) + invlpgb_kernel_range_flush(info); + else + on_each_cpu(do_kernel_range_flush, info, 1); +} + void flush_tlb_kernel_range(unsigned long start, unsigned long end) { struct flush_tlb_info *info; @@ -1097,9 +1127,9 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end) TLB_GENERATION_INVALID); if (info->end == TLB_FLUSH_ALL) - on_each_cpu(do_flush_tlb_all, NULL, 1); + kernel_tlb_flush_all(info); else - on_each_cpu(do_kernel_range_flush, info, 1); + kernel_tlb_flush_range(info); put_flush_tlb_info(); } From patchwork Fri Feb 21 00:53:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel 
X-Patchwork-Id: 13984681 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4FD36C021B4 for ; Fri, 21 Feb 2025 00:55:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C1CD128000C; Thu, 20 Feb 2025 19:55:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B7E6C28000B; Thu, 20 Feb 2025 19:55:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9A93328000C; Thu, 20 Feb 2025 19:55:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 6C62628000B for ; Thu, 20 Feb 2025 19:55:15 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 23198A07B1 for ; Fri, 21 Feb 2025 00:55:15 +0000 (UTC) X-FDA: 83142132990.29.302EB88 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf03.hostedemail.com (Postfix) with ESMTP id 872142000B for ; Fri, 21 Feb 2025 00:55:13 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf03.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099313; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H8e1q/ijWxutZBBUZ4egEFqPViZpFe2V2vsQQKwVkPs=; b=DOgJhSkk/S4jKxSI72JeOWAMgSgd3RWCS4nlK68nmEEXBztJG8pVCiIefSgzOauYxXbIof 
yF5IXAXHXx5qjh7OqXM0XxNYe4WcWa/9DKi0VIS5brPzonESezXG9ZoGsFIDHAjlR01xep RpUbY9gcnGVUJ0mI8kwpBusypZt/KuQ= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf03.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099313; a=rsa-sha256; cv=none; b=Wzt6ZLRwLqC5sCwR8zx0SpHVkPMxV6Yo2Pj9dCqvfEfjDd9091xtaYQz/fipQLNkprxWfV jLUy05DR1NfKDoyQ1SlaPEvN54ZAg9KD8XLpGxGl56Y/uWtSFg9wAoLvthQvYfT/HHb20l L0a5ETwGPAjsDB3eJLd1zWWyvDjHgpU= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-1C5m; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel Subject: [PATCH v12 07/16] x86/mm: use INVLPGB in flush_tlb_all Date: Thu, 20 Feb 2025 19:53:06 -0500 Message-ID: <20250221005345.2156760-8-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250221005345.2156760-1-riel@surriel.com> References: <20250221005345.2156760-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Stat-Signature: c1qurmrtcymthescibo6j1woraa914ye X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 872142000B X-HE-Tag: 1740099313-152044 X-HE-Meta: 
U2FsdGVkX1+T9j4URxiCeilARLbEf0K5LvbUwV2xTEr6lrpkUfr0fqBF29759F7bmhpoDYPtiEjapCrdf2bQAFtuaj7ajOw3s9kv1sRnEmxoPC/ARWzSiPJ36uqB37JsIFu3SCydpuQ6kczKuYJoM0qnSxXv/anUZESOVaCg1Dfbh4gI3lXrBahwKVuzU9mpxBydhjEheaCr/F7w8p/KLiF76Sye1JIBWcBxpG3AUA4DY5XWyj1hOeaOfFBFQh1G/l49i414aQ48K5HSWJ2/n1ZBEWZAADUXm/IxqgLujYNTqFQ9auyHMsJExkuVCSbs1nMWPbSCdh7Gg+fM1jQDCaFQmynj5GTNt2pqb9V/ckwV4ZX8/DZCBTAlsdiY6T80WWXDOaSaX6YcHP/nCva2SNNPHKhi1oE1uFSpqTFiS031KlZoy2tpFupYRchuqhd/0fXwKV/WIYsIym/tQifFWw1TqrU3XRlIg+e1zIpuAIXiBgfBWM8IqdPucHjICcR+QkcZoJTAKhTCiV4OD/rkc2IhEDTR/z1zytlczF3naEo/X1yN5vuJiNP45kqTZfx/UsMqNmaBFyvoFxuI1Ra2ZlYRj805G5/Jg8BYbF5Wv8Lh/2CJ36M3y/gzrQOFcm9gsMFUtru9QhuF7IFA+TlSehsTOLFlrFB29JQFFfzdOsHaPRa/tjj6YvTRqQCJ7eFYWpv1QvM+KmmqurxmcpgMJNwRuTSXygxW7+JJrFW6oR10r0M1uZBliL5QnrrD9k9xy9KM6KALPxxqv0FlgZ4Gp80jHgHac8GaY31iRz7PJr4q1f0dlzGLMWm+DzbbmGtjkH9TyOuCTnpzTAMQv8PE07C+8O9rW4ZpAq4sziCuG84h88InjAyw5zEcEcNDwfYl8vLlNOKpl2lxYUrGieHTfLpFp9wPry+ITy24F9PoA25RmTCFop6MRcucT/yoV1g62KH0bUUP95tpbXZMm+m mhlCp6k3 2uLVWhDBHvaeahA9qyNqc0lgvYEF2L7MmgU0iu1GmahCrAgGoLSH6o7TYhN1UjAo1hD7FOih+yKFSewrJZNySgD22UkO6cbUrz+uSeWhCooQ955nl7ofvBgP6hn714V9v6mrhfXTqHw1hzKOo0i27rtxLFS3udU0IXuWpbNeh6NFj6fIqLRNS6nx6bwYXrFZ/zOPBKJULxd6insS7QEz6Xl+D0JmSwb6UBEW9QF4lArM+REag5/ZYapcFQxd6VlOMinUL4Jx9hK+qid9+aYoJeOQoDi3sHvMxngSkfcs0Ne8d1DR0E/7MwyVi1JAc7Va2Vc/4IE/Yyfjg4AxFbBbiGeCYa6hj7cCZelfQOJYzreaUEf5KNH3rwOqd1g3zsUTyYlBvsOOXzU6ohUiPLHi89AOn7E0ruoJYpjWGltACcYXLCb/K3K9nmx1gyqk74m1pN8hLsNOGLDVnc2U4fmLOzYmouN1PO1cMfs7nh8+rpIe+6V/KFzVZ1LR+2g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: The flush_tlb_all() function is not used a whole lot, but we might as well use broadcast TLB flushing there, too. 
Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/mm/tlb.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 59396a3c6e9c..2d7ed0fda61f 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -1065,6 +1065,16 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, } +static bool broadcast_flush_tlb_all(void) +{ + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + return false; + + guard(preempt)(); + invlpgb_flush_all(); + return true; +} + static void do_flush_tlb_all(void *info) { count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED); @@ -1074,6 +1084,12 @@ static void do_flush_tlb_all(void *info) void flush_tlb_all(void) { count_vm_tlb_event(NR_TLB_REMOTE_FLUSH); + + /* First try (faster) hardware-assisted TLB invalidation. */ + if (broadcast_flush_tlb_all()) + return; + + /* Fall back to the IPI-based invalidation. */ on_each_cpu(do_flush_tlb_all, NULL, 1); } From patchwork Fri Feb 21 00:53:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13984676 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46C49C021B4 for ; Fri, 21 Feb 2025 00:55:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B3397280005; Thu, 20 Feb 2025 19:55:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9CE4D28000D; Thu, 20 Feb 2025 19:55:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 783AC28000C; Thu, 20 Feb 2025 19:55:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org 
(Postfix) with ESMTP id 38A456B00AD for ; Thu, 20 Feb 2025 19:55:09 -0500 (EST)
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 08/16] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing
Date: Thu, 20 Feb 2025 19:53:07 -0500
Message-ID: <20250221005345.2156760-9-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>
MIME-Version: 1.0

In the page reclaim code, we only track the CPU(s) where the TLB needs
to be flushed, rather than all the individual mappings that may be
getting invalidated.

Use broadcast TLB flushing when that is available.

This is a temporary hack to ensure that the PCID context for tasks in
the next patch gets properly flushed from the page reclaim code, because
the IPI based flushing in arch_tlbbatch_flush only flushes the currently
loaded TLB context on each CPU.

Patch 10 replaces this with the actual mechanism used to do broadcast
TLB flushing from the page reclaim code.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/mm/tlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2d7ed0fda61f..16839651f67f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1326,7 +1326,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	 * a local TLB flush is needed. Optimize this use-case by calling
 	 * flush_tlb_func_local() directly in this case.
 	 */
-	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		invlpgb_flush_all_nonglobals();
+	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
 		flush_tlb_multi(&batch->cpumask, info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();

From patchwork Fri Feb 21 00:53:08 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984686
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 09/16] x86/mm: global ASID allocation helper functions
Date: Thu, 20 Feb 2025 19:53:08 -0500
Message-ID: <20250221005345.2156760-10-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>

Functions to manage global ASID space.

Multithreaded processes that are simultaneously active on 4 or more
CPUs can get a global ASID, resulting in the same PCID being used for
that process on every CPU.

This in turn will allow the kernel to use hardware-assisted TLB flushing
through AMD INVLPGB or Intel RAR for these processes.

Helper functions split out by request.

Signed-off-by: Rik van Riel
Reviewed-by: Nadav Amit
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/mmu.h      |  11 +++
 arch/x86/include/asm/tlbflush.h |  43 ++++++++++
 arch/x86/mm/tlb.c               | 144 +++++++++++++++++++++++++++++++-
 3 files changed, 195 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 3b496cdcb74b..edb5942d4829 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -69,6 +69,17 @@ typedef struct {
 	u16 pkey_allocation_map;
 	s16 execute_only_pkey;
 #endif
+
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+	/*
+	 * The global ASID will be a non-zero value when the process has
+	 * the same ASID across all CPUs, allowing it to make use of
+	 * hardware-assisted remote TLB invalidation like AMD INVLPGB.
+	 */
+	u16 global_asid;
+	/* The process is transitioning to a new global ASID number. */
+	bool asid_transition;
+#endif
 } mm_context_t;
 
 #define INIT_MM_CONTEXT(mm) \
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 09463a2fb05f..83f1da2f1e4a 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -6,6 +6,7 @@
 #include
 #include
 
+#include
 #include
 #include
 #include
@@ -234,6 +235,48 @@ void flush_tlb_one_kernel(unsigned long addr);
 void flush_tlb_multi(const struct cpumask *cpumask,
 		      const struct flush_tlb_info *info);
 
+static inline bool is_dyn_asid(u16 asid)
+{
+	return asid < TLB_NR_DYN_ASIDS;
+}
+
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+static inline u16 mm_global_asid(struct mm_struct *mm)
+{
+	u16 asid;
+
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		return 0;
+
+	asid = smp_load_acquire(&mm->context.global_asid);
+
+	/* mm->context.global_asid is either 0, or a global ASID */
+	VM_WARN_ON_ONCE(asid && is_dyn_asid(asid));
+
+	return asid;
+}
+
+static inline void assign_mm_global_asid(struct mm_struct *mm, u16 asid)
+{
+	/*
+	 * Notably flush_tlb_mm_range() -> broadcast_tlb_flush() ->
+	 * finish_asid_transition() needs to observe asid_transition = true
+	 * once it observes global_asid.
+	 */
+	mm->context.asid_transition = true;
+	smp_store_release(&mm->context.global_asid, asid);
+}
+#else
+static inline u16 mm_global_asid(struct mm_struct *mm)
+{
+	return 0;
+}
+
+static inline void assign_mm_global_asid(struct mm_struct *mm, u16 asid)
+{
+}
+#endif
+
 #ifdef CONFIG_PARAVIRT
 #include
 #endif
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 16839651f67f..405630479b90 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -74,13 +74,15 @@
  * use different names for each of them:
  *
  * ASID  - [0, TLB_NR_DYN_ASIDS-1]
- *         the canonical identifier for an mm
+ *         the canonical identifier for an mm, dynamically allocated on each CPU
+ *         [TLB_NR_DYN_ASIDS, MAX_ASID_AVAILABLE-1]
+ *         the canonical, global identifier for an mm, identical across all CPUs
  *
- * kPCID - [1, TLB_NR_DYN_ASIDS]
+ * kPCID - [1, MAX_ASID_AVAILABLE]
  *         the value we write into the PCID part of CR3; corresponds to the
  *         ASID+1, because PCID 0 is special.
  *
- * uPCID - [2048 + 1, 2048 + TLB_NR_DYN_ASIDS]
+ * uPCID - [2048 + 1, 2048 + MAX_ASID_AVAILABLE]
  *         for KPTI each mm has two address spaces and thus needs two
  *         PCID values, but we can still do with a single ASID denomination
  *         for each mm. Corresponds to kPCID + 2048.
@@ -251,6 +253,142 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
 	*need_flush = true;
 }
 
+/*
+ * Global ASIDs are allocated for multi-threaded processes that are
+ * active on multiple CPUs simultaneously, giving each of those
+ * processes the same PCIDs on every CPU, for use with hardware-assisted
+ * TLB shootdown on remote CPUs, like AMD INVLPGB or Intel RAR.
+ *
+ * These global ASIDs are held for the lifetime of the process.
+ */
+static DEFINE_RAW_SPINLOCK(global_asid_lock);
+static u16 last_global_asid = MAX_ASID_AVAILABLE;
+static DECLARE_BITMAP(global_asid_used, MAX_ASID_AVAILABLE);
+static DECLARE_BITMAP(global_asid_freed, MAX_ASID_AVAILABLE);
+static int global_asid_available = MAX_ASID_AVAILABLE - TLB_NR_DYN_ASIDS - 1;
+
+/*
+ * When the search for a free ASID in the global ASID space reaches
+ * MAX_ASID_AVAILABLE, a global TLB flush guarantees that previously
+ * freed global ASIDs are safe to re-use.
+ *
+ * This way the global flush only needs to happen at ASID rollover
+ * time, and not at ASID allocation time.
+ */
+static void reset_global_asid_space(void)
+{
+	lockdep_assert_held(&global_asid_lock);
+
+	invlpgb_flush_all_nonglobals();
+
+	/*
+	 * The TLB flush above makes it safe to re-use the previously
+	 * freed global ASIDs.
+	 */
+	bitmap_andnot(global_asid_used, global_asid_used,
+			global_asid_freed, MAX_ASID_AVAILABLE);
+	bitmap_clear(global_asid_freed, 0, MAX_ASID_AVAILABLE);
+
+	/* Restart the search from the start of global ASID space. */
+	last_global_asid = TLB_NR_DYN_ASIDS;
+}
+
+static u16 allocate_global_asid(void)
+{
+	u16 asid;
+
+	lockdep_assert_held(&global_asid_lock);
+
+	/* The previous allocation hit the edge of available address space */
+	if (last_global_asid >= MAX_ASID_AVAILABLE - 1)
+		reset_global_asid_space();
+
+	asid = find_next_zero_bit(global_asid_used, MAX_ASID_AVAILABLE, last_global_asid);
+
+	if (asid >= MAX_ASID_AVAILABLE) {
+		/* This should never happen. */
+		VM_WARN_ONCE(1, "Unable to allocate global ASID despite %d available\n",
+				global_asid_available);
+		return 0;
+	}
+
+	/* Claim this global ASID. */
+	__set_bit(asid, global_asid_used);
+	last_global_asid = asid;
+	global_asid_available--;
+	return asid;
+}
+
+/*
+ * Check whether a process is currently active on more than "threshold" CPUs.
+ * This is a cheap estimation on whether or not it may make sense to assign
+ * a global ASID to this process, and use broadcast TLB invalidation.
+ */
+static bool mm_active_cpus_exceeds(struct mm_struct *mm, int threshold)
+{
+	int count = 0;
+	int cpu;
+
+	/* This quick check should eliminate most single threaded programs. */
+	if (cpumask_weight(mm_cpumask(mm)) <= threshold)
+		return false;
+
+	/* Slower check to make sure. */
+	for_each_cpu(cpu, mm_cpumask(mm)) {
+		/* Skip the CPUs that aren't really running this process. */
+		if (per_cpu(cpu_tlbstate.loaded_mm, cpu) != mm)
+			continue;
+
+		if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu))
+			continue;
+
+		if (++count > threshold)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * Assign a global ASID to the current process, protecting against
+ * races between multiple threads in the process.
+ */
+static void use_global_asid(struct mm_struct *mm)
+{
+	u16 asid;
+
+	guard(raw_spinlock_irqsave)(&global_asid_lock);
+
+	/* This process is already using broadcast TLB invalidation. */
+	if (mm_global_asid(mm))
+		return;
+
+	/* The last global ASID was consumed while waiting for the lock. */
+	if (!global_asid_available) {
+		VM_WARN_ONCE(1, "Ran out of global ASIDs\n");
+		return;
+	}
+
+	asid = allocate_global_asid();
+	if (!asid)
+		return;
+
+	assign_mm_global_asid(mm, asid);
+}
+
+void destroy_context_free_global_asid(struct mm_struct *mm)
+{
+	if (!mm_global_asid(mm))
+		return;
+
+	guard(raw_spinlock_irqsave)(&global_asid_lock);
+
+	/* The global ASID can be re-used only after flush at wrap-around. */
+	__set_bit(mm->context.global_asid, global_asid_freed);
+
+	mm->context.global_asid = 0;
+	global_asid_available++;
+}
+
 /*
  * Given an ASID, flush the corresponding user ASID. We can delay this
  * until the next time we switch to it.
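
For readers following the allocator in patch 09, the scheme it describes (a "used" bitmap, a "freed" bitmap, and one flush at rollover that makes freed IDs reusable) can be modelled in plain user-space C. This is only a minimal sketch under stated assumptions, not kernel code: the names alloc_global_asid(), free_global_asid(), reset_asid_space() and nr_flushes are hypothetical, the sizes are toy values rather than TLB_NR_DYN_ASIDS and MAX_ASID_AVAILABLE, the TLB flush is replaced by a counter, and locking and the global_asid_available accounting are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy sizes, assumed for illustration only. */
#define NR_DYN_ASIDS	6	/* IDs [0, 5] model the per-CPU dynamic range */
#define MAX_ASID	16	/* IDs [6, 15] model the global range */

static bool asid_used[MAX_ASID];	/* models global_asid_used */
static bool asid_freed[MAX_ASID];	/* models global_asid_freed */
static uint16_t last_asid = MAX_ASID;	/* forces a reset on first use */
static unsigned int nr_flushes;		/* stand-in for the global TLB flush */

/* Rollover: "flush" once, then recycle every ID freed since the last reset. */
void reset_asid_space(void)
{
	nr_flushes++;

	for (int i = 0; i < MAX_ASID; i++) {
		if (asid_freed[i])
			asid_used[i] = false;	/* used &= ~freed */
		asid_freed[i] = false;		/* freed = 0 */
	}

	/* Restart the search at the bottom of the global range. */
	last_asid = NR_DYN_ASIDS;
}

/* Returns a global ID, or 0 if nothing is free ahead of the cursor. */
uint16_t alloc_global_asid(void)
{
	/* The previous allocation reached the top of the range. */
	if (last_asid >= MAX_ASID - 1)
		reset_asid_space();

	for (int i = last_asid; i < MAX_ASID; i++) {
		if (!asid_used[i]) {
			asid_used[i] = true;
			last_asid = (uint16_t)i;
			return (uint16_t)i;
		}
	}
	return 0;
}

/* Freeing only marks the ID; it stays "used" until the next reset. */
void free_global_asid(uint16_t asid)
{
	assert(asid >= NR_DYN_ASIDS && asid < MAX_ASID);
	asid_freed[asid] = true;
}
```

In this model an ID released with free_global_asid() is not handed out again until the search cursor wraps and the simulated flush runs, which is the property the patch relies on to avoid a flush on every allocation.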
From patchwork Fri Feb 21 00:53:09 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984685
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 10/16] x86/mm: global ASID context switch & TLB flush handling
Date: Thu, 20 Feb 2025 19:53:09 -0500
Message-ID: <20250221005345.2156760-11-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>

Context switch and TLB flush support for processes that use a global
ASID & PCID across all CPUs.

At both context switch time and TLB flush time, we need to check
whether a task is switching to a global ASID, and reload the TLB with
the new ASID as appropriate.

In both code paths, we also short-circuit the TLB flush if we are using
a global ASID, because the global ASIDs are always kept up to date
across CPUs, even while the process is not running on a CPU.

Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/tlbflush.h | 13 ++++++
 arch/x86/mm/tlb.c               | 77 ++++++++++++++++++++++++++++++---
 2 files changed, 83 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 83f1da2f1e4a..f1f82571249b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -240,6 +240,19 @@ static inline bool is_dyn_asid(u16 asid)
 	return asid < TLB_NR_DYN_ASIDS;
 }
 
+static inline bool is_global_asid(u16 asid)
+{
+	return !is_dyn_asid(asid);
+}
+
+static inline bool in_asid_transition(struct mm_struct *mm)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		return false;
+
+	return mm && READ_ONCE(mm->context.asid_transition);
+}
+
 #ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
 static inline u16 mm_global_asid(struct mm_struct *mm)
 {
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 405630479b90..d8a04e398615 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -227,6 +227,20 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
 		return;
 	}
 
+	/*
+	 * TLB consistency for global ASIDs is maintained with hardware assisted
+	 * remote TLB flushing. Global ASIDs are always up to date.
+	 */
+	if (static_cpu_has(X86_FEATURE_INVLPGB)) {
+		u16 global_asid = mm_global_asid(next);
+
+		if (global_asid) {
+			*new_asid = global_asid;
+			*need_flush = false;
+			return;
+		}
+	}
+
 	if (this_cpu_read(cpu_tlbstate.invalidate_other))
 		clear_asid_other();
 
@@ -389,6 +403,23 @@ void destroy_context_free_global_asid(struct mm_struct *mm)
 	global_asid_available++;
 }
 
+/*
+ * Is the mm transitioning from a CPU-local ASID to a global ASID?
+ */
+static bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid)
+{
+	u16 global_asid = mm_global_asid(next);
+
+	if (!static_cpu_has(X86_FEATURE_INVLPGB))
+		return false;
+
+	/* Process is transitioning to a global ASID */
+	if (global_asid && prev_asid != global_asid)
+		return true;
+
+	return false;
+}
+
 /*
  * Given an ASID, flush the corresponding user ASID. We can delay this
  * until the next time we switch to it.
@@ -694,7 +725,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 	 */
 	if (prev == next) {
 		/* Not actually switching mm's */
-		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
+		VM_WARN_ON(is_dyn_asid(prev_asid) &&
+			   this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
 			   next->context.ctx_id);
 
 		/*
@@ -711,6 +743,20 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
+		/* Check if the current mm is transitioning to a global ASID */
+		if (needs_global_asid_reload(next, prev_asid)) {
+			next_tlb_gen = atomic64_read(&next->context.tlb_gen);
+			choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
+			goto reload_tlb;
+		}
+
+		/*
+		 * Broadcast TLB invalidation keeps this PCID up to date
+		 * all the time.
+		 */
+		if (is_global_asid(prev_asid))
+			return;
+
 		/*
 		 * If the CPU is not in lazy TLB mode, we are just switching
 		 * from one thread in a process to another thread in the same
@@ -744,6 +790,13 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		 */
 		cond_mitigation(tsk);
 
+		/*
+		 * Let nmi_uaccess_okay() and finish_asid_transition()
+		 * know that we're changing CR3.
+		 */
+		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
+		barrier();
+
 		/*
 		 * Leave this CPU in prev's mm_cpumask. Atomic writes to
 		 * mm_cpumask can be expensive under contention. The CPU
@@ -758,14 +811,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 		choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
-
-		/* Let nmi_uaccess_okay() know that we're changing CR3. */
-		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
-		barrier();
 	}
 
+reload_tlb:
 	new_lam = mm_lam_cr3_mask(next);
 	if (need_flush) {
+		VM_WARN_ON_ONCE(is_global_asid(new_asid));
 		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
 		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
 		load_new_mm_cr3(next->pgd, new_asid, new_lam, true);
@@ -884,7 +935,7 @@ static void flush_tlb_func(void *info)
 	const struct flush_tlb_info *f = info;
 	struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
 	u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
-	u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen);
+	u64 local_tlb_gen;
 	bool local = smp_processor_id() == f->initiating_cpu;
 	unsigned long nr_invalidate = 0;
 	u64 mm_tlb_gen;
@@ -907,6 +958,16 @@ static void flush_tlb_func(void *info)
 	if (unlikely(loaded_mm == &init_mm))
 		return;
 
+	/* Reload the ASID if transitioning into or out of a global ASID */
+	if (needs_global_asid_reload(loaded_mm, loaded_mm_asid)) {
+		switch_mm_irqs_off(NULL, loaded_mm, NULL);
+		loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	}
+
+	/* Broadcast ASIDs are always kept up to date with INVLPGB. */
+	if (is_global_asid(loaded_mm_asid))
+		return;
+
 	VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].ctx_id) !=
 		   loaded_mm->context.ctx_id);
 
@@ -924,6 +985,8 @@ static void flush_tlb_func(void *info)
 		return;
 	}
 
+	local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen);
+
 	if (unlikely(f->new_tlb_gen != TLB_GENERATION_INVALID &&
 		     f->new_tlb_gen <= local_tlb_gen)) {
 		/*
@@ -1091,7 +1154,7 @@ STATIC_NOPV void native_flush_tlb_multi(const struct cpumask *cpumask,
 	 * up on the new contents of what used to be page tables, while
 	 * doing a speculative memory access.
 	 */
-	if (info->freed_tables)
+	if (info->freed_tables || in_asid_transition(info->mm))
 		on_each_cpu_mask(cpumask, flush_tlb_func, (void *)info, true);
 	else
 		on_each_cpu_cond_mask(should_flush_tlb, flush_tlb_func,

From patchwork Fri Feb 21 00:53:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984674
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 11/16] x86/mm: global ASID process exit helpers
Date: Thu, 20 Feb 2025 19:53:10 -0500
Message-ID: <20250221005345.2156760-12-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>

A global ASID is allocated for the lifetime of a process.

Free the global ASID at process exit time.

Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/mmu_context.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 795fdd53bd0a..d670699d32c2 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -139,6 +139,8 @@ static inline void mm_reset_untag_mask(struct mm_struct *mm)
 #define enter_lazy_tlb enter_lazy_tlb
 extern void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk);
 
+extern void destroy_context_free_global_asid(struct mm_struct *mm);
+
 /*
  * Init a new mm. Used on mm copies, like at fork()
  * and on mm's that are brand-new, like at execve().
@@ -161,6 +163,14 @@ static inline int init_new_context(struct task_struct *tsk,
 		mm->context.execute_only_pkey = -1;
 	}
 #endif
+
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		mm->context.global_asid = 0;
+		mm->context.asid_transition = false;
+	}
+#endif
+
 	mm_reset_untag_mask(mm);
 	init_new_context_ldt(mm);
 	return 0;
@@ -170,6 +180,10 @@ static inline int init_new_context(struct task_struct *tsk,
 static inline void destroy_context(struct mm_struct *mm)
 {
 	destroy_context_ldt(mm);
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		destroy_context_free_global_asid(mm);
+#endif
 }
 
 extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,

From patchwork Fri Feb 21 00:53:11 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984682
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D950C021B2 for ; Fri, 21 Feb 2025 00:55:29 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 9AFB828000E; Thu, 20 Feb 2025 19:55:18 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 86FCE28000B; Thu, 20 Feb 2025 19:55:18 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 64DAD28000E; Thu, 20 Feb 2025 19:55:18 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 38CDE28000B for ; Thu, 20 Feb 2025 19:55:18 -0500 (EST)
Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id F224C1A0205 for ; Fri, 21 Feb 2025 00:55:17 +0000 (UTC)
X-FDA:
83142133074.26.A20CA2F Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf04.hostedemail.com (Postfix) with ESMTP id 6D8D940007 for ; Fri, 21 Feb 2025 00:55:16 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf04.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099316; a=rsa-sha256; cv=none; b=6JmuJBgY35lc9gaVlEs6yrgAjTemxQVX25RFjzAnUrHyDy6RC1d/GOOYncx6lSXvOh+nZ5 G0qIYhsxHdi97+1wnSUFoJqCkOX/gsAU15dGsLW9irr3j/XUq+5mYKcUg2UAHPoYb8Y3Lb 2McKdkFM8h1jDzBRz7jYbPZ9BQsVAMU= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf04.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099316; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CLu07lJ1w+aXrVKjC6QVxImYk8xVdT6cyZPEzHI0oRM=; b=o18vZXpGZQ9Gijjlv0zoiYi4h3dh660P811EpFGeQP6efKrVgspbrGhMBwOlDRaXNcah94 CpqwJvg46aCCVu5Z0nmdoAf8J4T1qsg9VfZyS3HrnEhFQjRTviNEFPoglZNwjCU0HRewxD b89gsN/JzAwunCAw0Dfh0r/R1E80UJ8= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-1l2C; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, 
jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel Subject: [PATCH v12 12/16] x86/mm: enable broadcast TLB invalidation for multi-threaded processes Date: Thu, 20 Feb 2025 19:53:11 -0500 Message-ID: <20250221005345.2156760-13-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250221005345.2156760-1-riel@surriel.com> References: <20250221005345.2156760-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: 6D8D940007 X-Rspamd-Server: rspam12 X-Stat-Signature: eha1hgw39iteko77acd9sdjxcupe3bqr X-HE-Tag: 1740099316-937835 X-HE-Meta: U2FsdGVkX18KhOsWjsKrSPx6Iz3Ie11FR/h+tdcCUjMttdUl0CjqixHoTSq3rooG/oTtcFIDA+YHa+nw8aZv/0b2jJDY4JrLS+SBfhTBl2vfIm5DVa7HAqYoO/n1HM6a4xSsaO+v2gUyME1uDHq3XU6rEHiEbMD1WHvQI43TpBwHXSV/AaC8PXwkYMCgPg+SIXdS5JXi2p00mpSo9nisaFkHu97dliacLcUXrOXXFc1O2tljJjPGs0EtgCSDNFHzKIeAV3my7yqn23PjtARCqJ9DUN7WL+AcjfiP4CPkT9DSerQoBrYSGr0cppPBhqhnJrTMXCWWrYQUVbsHetFg5q9qGMltNarh3Q6ubGreydutTM03IKIcyEuOeU3h8o6Aq3aGn89SSnP3ifWJsdxh4ZhT/yfARYqhA/NZOibUvmPn/ISMve37A6B4JYKoFVTMdzhz8sSczWx4hadB4cLkh8qoHllSyqpHFavVK23eewZFYYJwBuOnsZMlro9xA/k9lfPnThs3NnAjGSsyb/u1XMhU92+TQ7rimXQ1236bfWkSTNETTcU/dAz9txx8b14jv2ZRhDwYG5U8V/nOJj5/jEPWUtmNpJG62YFdf8DnUvuHZ5kM97PGijYv+UbxGz2K7LvdxssC0PNU+U12o4Z3V3VfEvZYKU1YfRb8OXTxAlvT1T+EsgAKzdx7oBbopvgehe+guPRsBjWtZOw+HY8cg9X1JXXAz5CyQ/RUA9jKEifTCYkXnp28oMVeOTfNQhirUYZQW6F3ybnUlWTsVVvV+RxnyQ+uUJOTiNWvZGR96YCPCyaw8+VfY8F2Rm1xtITnzYcVLL4dQ3Mw4857I95c9Buo/IV8y+N9+tRa/DRNOLYg44f0ZqmH7l13KZ/vN3E8sJ7kT6MnukUySfinydXmqSdaOSUpeKRZvKtS87+Sw9BWi3PH+2YbtyZnMzjZ6URvLrrMY2p898ZCR9c7h1/ pEtLyclc 
qjRS2L02wqLiR+Ir0ldGoU7pFk2RY8NgdaiRlBL8z+76jdZFmxCzyMAbV/243HiCpgFQ1TuDAtj0dUzlXtJ0UnO+SM7ioc1DhDeGpKzLqImgbP6fo4/rVfR5hDjilxg99v5ccbpgYE5dZIgTaXp9THL+An9t9VRMkqakypEilqOGoLmd5l4MxPHm8N++1QQw49jQcwlkk4bYlkiXmRRswcaVkE/kfBqbVj2lxiQd4RiQpEoVnvTLlzPDKfn7BBCb43Ez8Ih6oqP6esA6aQsWe7e/lNVJos2E0NaVCTWe48l0Ng/eG/wCy6mnL0Ly79i2k5I1khbO/qPU0dqAQhfc+tZJdg/aCh9rUBYH5D30vIbo36CQ2eGl00V+e0sfW0SV0qcEagT1fc3v8F508pPMfp7sRwYVqM2x4An4ehDUN+9xrMzIsGaYWvX8pXhRtsDG530nj2ev3tX4YBd94YeVxNXUhrJymdrto6yxp X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Use broadcast TLB invalidation, using the INVLPGB instruction. There is not enough room in the 12-bit ASID address space to hand out broadcast ASIDs to every process. Only hand out broadcast ASIDs to processes when they are observed to be simultaneously running on 4 or more CPUs. This also allows single-threaded processes to continue using the cheaper, local TLB invalidation instructions like INVLPG. Signed-off-by: Rik van Riel Reviewed-by: Nadav Amit Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/mm/tlb.c | 107 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 106 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index d8a04e398615..01a5edb51ebe 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -420,6 +420,108 @@ static bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid) return false; } +/* + * x86 has 4k ASIDs (2k when compiled with KPTI), but the largest + * x86 systems have over 8k CPUs. Because of this potential ASID + * shortage, global ASIDs are handed out to processes that have + * frequent TLB flushes and are active on 4 or more CPUs simultaneously. 
+ */ +static void consider_global_asid(struct mm_struct *mm) +{ + if (!static_cpu_has(X86_FEATURE_INVLPGB)) + return; + + /* Check every once in a while. */ + if ((current->pid & 0x1f) != (jiffies & 0x1f)) + return; + + if (!READ_ONCE(global_asid_available)) + return; + + /* + * Assign a global ASID if the process is active on + * 4 or more CPUs simultaneously. + */ + if (mm_active_cpus_exceeds(mm, 3)) + use_global_asid(mm); +} + +static void finish_asid_transition(struct flush_tlb_info *info) +{ + struct mm_struct *mm = info->mm; + int bc_asid = mm_global_asid(mm); + int cpu; + + if (!READ_ONCE(mm->context.asid_transition)) + return; + + for_each_cpu(cpu, mm_cpumask(mm)) { + /* + * The remote CPU is context switching. Wait for that to + * finish, to catch the unlikely case of it switching to + * the target mm with an out of date ASID. + */ + while (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) == LOADED_MM_SWITCHING) + cpu_relax(); + + if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) != mm) + continue; + + /* + * If at least one CPU is not using the global ASID yet, + * send a TLB flush IPI. The IPI should cause stragglers + * to transition soon. + * + * This can race with the CPU switching to another task; + * that results in a (harmless) extra IPI. + */ + if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm_asid, cpu)) != bc_asid) { + flush_tlb_multi(mm_cpumask(info->mm), info); + return; + } + } + + /* All the CPUs running this process are using the global ASID. */ + WRITE_ONCE(mm->context.asid_transition, false); +} + +static void broadcast_tlb_flush(struct flush_tlb_info *info) +{ + bool pmd = info->stride_shift == PMD_SHIFT; + unsigned long asid = info->mm->context.global_asid; + unsigned long addr = info->start; + + /* + * TLB flushes with INVLPGB are kicked off asynchronously. + * The inc_mm_tlb_gen() guarantees page table updates are done + * before these TLB flushes happen. 
+ */ + if (info->end == TLB_FLUSH_ALL) { + invlpgb_flush_single_pcid_nosync(kern_pcid(asid)); + /* Do any CPUs supporting INVLPGB need PTI? */ + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_single_pcid_nosync(user_pcid(asid)); + } else do { + unsigned long nr = 1; + + if (info->stride_shift <= PMD_SHIFT) { + nr = (info->end - addr) >> info->stride_shift; + nr = clamp_val(nr, 1, invlpgb_count_max); + } + + invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd); + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd); + + addr += nr << info->stride_shift; + } while (addr < info->end); + + finish_asid_transition(info); + + /* Wait for the INVLPGBs kicked off above to finish. */ + __tlbsync(); +} + /* * Given an ASID, flush the corresponding user ASID. We can delay this * until the next time we switch to it. @@ -1250,9 +1352,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. 
*/ - if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { + if (mm_global_asid(mm)) { + broadcast_tlb_flush(info); + } else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { info->trim_cpumask = should_trim_cpumask(mm); flush_tlb_multi(mm_cpumask(mm), info); + consider_global_asid(mm); } else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) { lockdep_assert_irqs_enabled(); local_irq_disable(); From patchwork Fri Feb 21 00:53:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13984683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BB44C021B2 for ; Fri, 21 Feb 2025 00:55:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7AF3A28000F; Thu, 20 Feb 2025 19:55:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 6EBB328000B; Thu, 20 Feb 2025 19:55:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4CE6728000F; Thu, 20 Feb 2025 19:55:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 26FFD28000B for ; Thu, 20 Feb 2025 19:55:24 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id D7146120259 for ; Fri, 21 Feb 2025 00:55:23 +0000 (UTC) X-FDA: 83142133326.22.3F56729 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf15.hostedemail.com (Postfix) with ESMTP id 54A88A0008 for ; Fri, 21 Feb 2025 00:55:22 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf15.hostedemail.com: domain of riel@shelob.surriel.com 
designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099322; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LdvwmofdFF3hDY0t7ucFzvQKImtHH2rUisVGBSrpDs4=; b=wrkBislADukh9A4rGxhmA6rtjswXMA1snY0JJ0QE5wrhpr4X92DZumTJiiIlz9df9RGC5a vOTU98vb7Ndh+xNM3N+xzmaF9Jj31kSh20vchO/BfTf+tCfhrd5FY6xXkPBl2iwlEuNJ88 dC+JqNeZYZcFtTlpYSw20JXx5xStQGc= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf15.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099322; a=rsa-sha256; cv=none; b=TO0nJYF7EnJUdhgaZ78kP5uX3dyW0EveGlJxdihzMUJWEvZv0WaZwHVc51E94UUjPhMkYU 0+RZ/Dj4u4FkjoLUm1Zl0MuVKOOTk2GlA8trnhNGcxR+uRKVLIf1fX+6pzgY/QW3KGb7Pd sUv/LL3DocDnrMS5OPlNk9WSzj8e9fc= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-1rzG; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel Subject: [PATCH v12 13/16] x86/mm: do targeted broadcast flushing from tlbbatch code Date: Thu, 20 Feb 2025 19:53:12 -0500 Message-ID: <20250221005345.2156760-14-riel@surriel.com> X-Mailer: git-send-email 2.47.1 
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com> References: <20250221005345.2156760-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Stat-Signature: qdhqtft4wh9hygwu3k5u8d94wnpbpgs8 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 54A88A0008 X-HE-Tag: 1740099322-110206 X-HE-Meta: U2FsdGVkX1++PvCeALTOGY2K3fCgzQ6AnnwC4oXMsRQt+AR3xoeGOi/Ba6X30XxW/tTz7S4zKgmQapFoRxDnNcLQZ4N2oopHllytON0YELQj2OTim162IMmhNTqCwu4v+2i3ZXDx6X/TWQ/ud1/lE76SXWvtejglWOGRf0i/bxnp+0pPCqXQkQ7WmLtN18HaDu360YAKkvGr1+PG1th+vtn0ehlnCWRGUGgKUIR7fljzK96ZU0tmLEl6b53LMS2t7RYGh4p8+mi/xNIlC1M05SfabFC2hfxxk9Jv1gU6seDeN4m85W4HUhenPUgPp044i8ToV8zg8QZ8MxGoUO/B+gJL2bdpbn7172KDV6ozVEL9xwokiSPwkQc+LJGWZujUxTnc1/2NT9Xbn2WHqhMdn4IXsVQXCyos+peghYTjfGP/BNjQiSiHDfT3V+eTY37Cd36vsaUi8VytbFASvg4devI/YJmIPx0LAe7qfWgiEZXcvG5uBTFnpCTUPPUT1Y3x6MTbh7n/yXKcg/H2fyUyijcPKqX/gdbgUuoIkG1Qmi2f694Wuxwj9iTvM1rpY6GkLk12qfqolFV7m4oYuuSYqEfiptubzww/DOIMp62pwiTOw/aC3ajugx8YxhV32Ruwf8MmWcS/9XOC8kaTU67/1cEV2f0LPLvfK+1Lq2WDWYJEE7iqtxPghCXPwomMNikGlN2LGrJXHgiQ2yIVvIVOysIyURSYuGlE31yvtQxj7RkC/3667LKDbLwlVVI9SSWcAQPvhCz/TXhnHFE52GYH/ueXTLmPRiBiA+jyl3UHeeftCbrmnLDw3stWrPi/Cf27ZYQRUr3iDKuCeubJ4RpxVXsLr+PRsUMOUtDmZVHZ28K5Pqr9UOat4a5Qq3oFzc7ni8zMWt2qNbDr+y3p9GrahdoEkb3G3YKp83dTvKcWQvCxE+62ug8qebNYlCglxrkQqbTTMv0oe2GfE84IPMC fmcczH7p VkTchbqCi3uzxC2WgFYjpK/4Shs4dqQdoi2F/1+/IEQrvPn6W0mEKxzfgSZPdZ4CVOeOzn/fHx5l8LW6xwI0w9gSUC0J07bkJDQ5iqcYMsXbA/QsitcwchgI5IxEOuF/DDKa4Vi/Rglh809FKwZSgSJLgNCQqKi1qx/5NZAO8ahs5tejHdup6JJayW2DCZUPcz+nY5uGqyDZB69J7ZtjXO3IWYhH32009lU8iTB19EeUqDdJF3eGeKjGitS/7hKsOMlgVjnQVGP9ZfNAVdkOXZnwuxacqU8rCFP+RmU+hGahIZ+uYm0CtVohZfBksGbpyHqOfYzvkx+KOhjdVoDdxWS6eTbLfb7g1IN6ZIuZ/Yg5v9TFQ0j52SCLwRiGwtE11ENw4iiER6Sf+vqInyw5q/dw7fUpXyNZAed3HSvXbLEyFN9/lC57Om41t+B3Iu2E5eNbzfumhf6cVl1k= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Instead of doing a system-wide TLB flush from 
arch_tlbbatch_flush, queue up asynchronous, targeted flushes from arch_tlbbatch_add_pending. This also allows us to avoid adding the CPUs of processes using broadcast flushing to the batch->cpumask, and will hopefully further reduce TLB flushing from the reclaim and compaction paths. Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/include/asm/tlb.h | 12 ++--- arch/x86/include/asm/tlbflush.h | 19 ++++---- arch/x86/mm/tlb.c | 79 +++++++++++++++++++++++++++++++-- 3 files changed, 92 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index b3cd521e5e2f..f69b243683e1 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -83,16 +83,16 @@ static inline void __tlbsync(void) #define INVLPGB_FINAL_ONLY BIT(4) #define INVLPGB_INCLUDE_NESTED BIT(5) -static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, - unsigned long addr, - u16 nr, - bool pmd_stride) +static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, + bool pmd_stride) { __invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA); } /* Flush all mappings for a given PCID, not including globals. */ -static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +static inline void __invlpgb_flush_single_pcid_nosync(unsigned long pcid) { __invlpgb(0, pcid, 0, 1, 0, INVLPGB_PCID); } @@ -105,7 +105,7 @@ static inline void invlpgb_flush_all(void) } /* Flush addr, including globals, for all PCIDs. 
*/ -static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +static inline void __invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) { __invlpgb(0, 0, addr, nr, 0, INVLPGB_INCLUDE_GLOBAL); } diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index f1f82571249b..241fa1435375 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -105,6 +105,9 @@ struct tlb_state { * need to be invalidated. */ bool invalidate_other; +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + bool need_tlbsync; +#endif #ifdef CONFIG_ADDRESS_MASKING /* @@ -288,6 +291,10 @@ static inline u16 mm_global_asid(struct mm_struct *mm) static inline void assign_mm_global_asid(struct mm_struct *mm, u16 asid) { } + +static inline void tlbsync(void) +{ +} #endif #ifdef CONFIG_PARAVIRT @@ -337,21 +344,15 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) return atomic64_inc_return(&mm->context.tlb_gen); } -static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, - struct mm_struct *mm, - unsigned long uaddr) -{ - inc_mm_tlb_gen(mm); - cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); - mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); -} - static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm) { flush_tlb_mm(mm); } extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch); +extern void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm, + unsigned long uaddr); static inline bool pte_flags_need_flush(unsigned long oldflags, unsigned long newflags, diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 01a5edb51ebe..9ca22c504f82 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -485,6 +485,37 @@ static void finish_asid_transition(struct flush_tlb_info *info) WRITE_ONCE(mm->context.asid_transition, false); } +static inline void tlbsync(void) +{ + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + return; + 
__tlbsync(); + this_cpu_write(cpu_tlbstate.need_tlbsync, false); +} + +static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, bool pmd_stride) +{ + __invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride); + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + this_cpu_write(cpu_tlbstate.need_tlbsync, true); +} + +static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +{ + __invlpgb_flush_single_pcid_nosync(pcid); + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + this_cpu_write(cpu_tlbstate.need_tlbsync, true); +} + +static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +{ + __invlpgb_flush_addr_nosync(addr, nr); + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + this_cpu_write(cpu_tlbstate.need_tlbsync, true); +} + static void broadcast_tlb_flush(struct flush_tlb_info *info) { bool pmd = info->stride_shift == PMD_SHIFT; @@ -783,6 +814,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, if (IS_ENABLED(CONFIG_PROVE_LOCKING)) WARN_ON_ONCE(!irqs_disabled()); + tlbsync(); + /* * Verify that CR3 is what we think it is. This will catch * hypothetical buggy code that directly switches to swapper_pg_dir @@ -959,6 +992,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) { + tlbsync(); + if (this_cpu_read(cpu_tlbstate.loaded_mm) == &init_mm) return; @@ -1632,9 +1667,7 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. 
*/ - if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { - invlpgb_flush_all_nonglobals(); - } else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { + if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { flush_tlb_multi(&batch->cpumask, info); } else if (cpumask_test_cpu(cpu, &batch->cpumask)) { lockdep_assert_irqs_enabled(); @@ -1643,12 +1676,52 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) local_irq_enable(); } + /* + * If we issued (asynchronous) INVLPGB flushes, wait for them here. + * The cpumask above contains only CPUs that were running tasks + * not using broadcast TLB flushing. + */ + tlbsync(); + cpumask_clear(&batch->cpumask); put_flush_tlb_info(); put_cpu(); } +void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm, + unsigned long uaddr) +{ + u16 asid = mm_global_asid(mm); + + if (asid) { + invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false); + /* Do any CPUs supporting INVLPGB need PTI? */ + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false); + + /* + * Some CPUs might still be using a local ASID for this + * process, and require IPIs, while others are using the + * global ASID. + * + * In this corner case we need to do both the broadcast + * TLB invalidation, and send IPIs. The IPIs will help + * stragglers transition to the broadcast ASID. 
+ */ + if (in_asid_transition(mm)) + asid = 0; + } + + if (!asid) { + inc_mm_tlb_gen(mm); + cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); + } + + mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); +} + /* * Blindly accessing user memory from NMI context can be dangerous * if we're in the middle of switching the current user task or From patchwork Fri Feb 21 00:53:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13984678 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22D51C021B2 for ; Fri, 21 Feb 2025 00:55:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1D7726B008A; Thu, 20 Feb 2025 19:55:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D76F3280009; Thu, 20 Feb 2025 19:55:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AE1EA28000C; Thu, 20 Feb 2025 19:55:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 4542B280005 for ; Thu, 20 Feb 2025 19:55:09 -0500 (EST) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id BA40A801F4 for ; Fri, 21 Feb 2025 00:55:08 +0000 (UTC) X-FDA: 83142132696.10.BFC063F Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf27.hostedemail.com (Postfix) with ESMTP id 1177340003 for ; Fri, 21 Feb 2025 00:55:06 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=none; spf=pass (imf27.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) 
smtp.mailfrom=riel@shelob.surriel.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740099307; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lx8kkE5bATflrMu8IGtIlQjO2P3voRfd0uK9epWuJlI=; b=lM1uV8iJjkAqKzpx1ofEMxPk05ckN9BXqiuQwdG1Is+5b4y2PjDQ8NE1xopWxSB8QBAkS5 rMK+cyZBEkL24bhHhmwz7favjEMyj/eYUsu8E/s8POJ+Xkic9TCcPOj4xp/zyO8Pi3Gy7p AAE4h/cMNOMnW4nCtoL4zV8mB7wnhCY= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=none; spf=pass (imf27.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740099307; a=rsa-sha256; cv=none; b=ubGUYS6TUmNvPtqzYxuDBPKXra9N8kSDWAbKlfLu7fmlzoMmZoFXIP/0bkOe0lBrk5JjyG h/CHwS2on9uHElychsu3+dvxgpex8xVTxclhT3bW/8IMtEzp6AOIy+az6MjgIpg2dLuuWh CRL6HwDbb4cdZ3e8E3CEJMNRF6m5Cl8= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tlHIZ-000000003Qf-1z6G; Thu, 20 Feb 2025 19:53:47 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, Rik van Riel Subject: [PATCH v12 14/16] x86/mm: enable AMD translation cache extensions Date: Thu, 20 Feb 2025 19:53:13 -0500 Message-ID: <20250221005345.2156760-15-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: 
<20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>
MIME-Version: 1.0

With AMD TCE (translation cache extensions) only the intermediate
mappings that cover the address range zapped by INVLPG / INVLPGB
get invalidated, rather than all intermediate mappings getting
zapped at every TLB invalidation.

This can help reduce the TLB miss rate, by keeping more intermediate
mappings in the cache.

From the AMD manual:

Translation Cache Extension (TCE) Bit. Bit 15, read/write. Setting this bit
to 1 changes how the INVLPG, INVLPGB, and INVPCID instructions operate on
TLB entries. When this bit is 0, these instructions remove the target PTE
from the TLB as well as all upper-level table entries that are cached
in the TLB, whether or not they are associated with the target PTE.
When this bit is set, these instructions will remove the target PTE and
only those upper-level entries that lead to the target PTE in
the page table hierarchy, leaving unrelated upper-level entries intact.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/msr-index.h       | 2 ++
 arch/x86/kernel/cpu/amd.c              | 4 ++++
 tools/arch/x86/include/asm/msr-index.h | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9a71880eec07..a7ea9720ba3c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -25,6 +25,7 @@
 #define _EFER_SVME		12 /* Enable virtualization */
 #define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
+#define _EFER_TCE		15 /* Enable Translation Cache Extensions */
 #define _EFER_AUTOIBRS		21 /* Enable Automatic IBRS */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -34,6 +35,7 @@
 #define EFER_SVME		(1<<_EFER_SVME)
 #define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
+#define EFER_TCE		(1<<_EFER_TCE)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
 /*
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 54194f5995de..b9b67d44c279 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1073,6 +1073,10 @@ static void init_amd(struct cpuinfo_x86 *c)
 
 	/* AMD CPUs don't need fencing after x2APIC/TSC_DEADLINE MSR writes. */
 	clear_cpu_cap(c, X86_FEATURE_APIC_MSRS_FENCE);
+
+	/* Enable Translation Cache Extension */
+	if (cpu_feature_enabled(X86_FEATURE_TCE))
+		msr_set_bit(MSR_EFER, _EFER_TCE);
 }
 
 #ifdef CONFIG_X86_32
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..dc1c1057f26e 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -25,6 +25,7 @@
 #define _EFER_SVME		12 /* Enable virtualization */
 #define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
+#define _EFER_TCE		15 /* Enable Translation Cache Extensions */
 #define _EFER_AUTOIBRS		21 /* Enable Automatic IBRS */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -34,6 +35,7 @@
 #define EFER_SVME		(1<<_EFER_SVME)
 #define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
+#define EFER_TCE		(1<<_EFER_TCE)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
 /*

From patchwork Fri Feb 21 00:53:14 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984675
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 15/16] x86/mm: only invalidate final translations with INVLPGB
Date: Thu, 20 Feb 2025 19:53:14 -0500
Message-ID: <20250221005345.2156760-16-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>
MIME-Version: 1.0

Use the INVLPGB_FINAL_ONLY flag when invalidating mappings with INVLPGB.
This way only leaf mappings get removed from the TLB, leaving intermediate
translations cached.

On the (rare) occasions where we free page tables we do a full flush,
ensuring intermediate translations get flushed from the TLB.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/tlb.h | 10 ++++++++--
 arch/x86/mm/tlb.c          | 13 +++++++------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index f69b243683e1..b1a18fe30d9b 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -86,9 +86,15 @@ static inline void __tlbsync(void)
 static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						  unsigned long addr,
 						  u16 nr,
-						  bool pmd_stride)
+						  bool pmd_stride,
+						  bool freed_tables)
 {
-	__invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
+	u8 flags = INVLPGB_PCID | INVLPGB_VA;
+
+	if (!freed_tables)
+		flags |= INVLPGB_FINAL_ONLY;
+
+	__invlpgb(0, pcid, addr, nr, pmd_stride, flags);
 }
 
 /* Flush all mappings for a given PCID, not including globals. */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 9ca22c504f82..8494d14d2fb7 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -495,9 +495,10 @@ static inline void tlbsync(void)
 
 static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						unsigned long addr,
-						u16 nr, bool pmd_stride)
+						u16 nr, bool pmd_stride,
+						bool freed_tables)
 {
-	__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride);
+	__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride, freed_tables);
 	if (!this_cpu_read(cpu_tlbstate.need_tlbsync))
 		this_cpu_write(cpu_tlbstate.need_tlbsync, true);
 }
@@ -540,9 +541,9 @@ static void broadcast_tlb_flush(struct flush_tlb_info *info)
 			nr = clamp_val(nr, 1, invlpgb_count_max);
 		}
 
-		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd);
+		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd, info->freed_tables);
 		if (static_cpu_has(X86_FEATURE_PTI))
-			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd);
+			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd, info->freed_tables);
 
 		addr += nr << info->stride_shift;
 	} while (addr < info->end);
@@ -1696,10 +1697,10 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
 	u16 asid = mm_global_asid(mm);
 
 	if (asid) {
-		invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false);
+		invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false, false);
 		/* Do any CPUs supporting INVLPGB need PTI? */
 		if (static_cpu_has(X86_FEATURE_PTI))
-			invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false);
+			invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false, false);
 
 		/*
 		 * Some CPUs might still be using a local ASID for this

From patchwork Fri Feb 21 00:53:15 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13984690
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Manali.Shukla@amd.com, Rik van Riel
Subject: [PATCH v12 16/16] x86/mm: add noinvlpgb commandline option
Date: Thu, 20 Feb 2025 19:53:15 -0500
Message-ID: <20250221005345.2156760-17-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250221005345.2156760-1-riel@surriel.com>
References: <20250221005345.2156760-1-riel@surriel.com>
MIME-Version: 1.0

Add a "noinvlpgb" commandline option to disable AMD broadcast TLB
flushing at boot time.

Also fix up the "nopcid" boot option to automatically disable
INVLPGB functionality, which relies on processes running on
globally allocated PCIDs.

Signed-off-by: Rik van Riel
Suggested-by: Brendan Jackman
---
 .../admin-guide/kernel-parameters.txt |  3 ++
 arch/x86/kernel/cpu/common.c          | 28 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8..91260e1949fb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4182,6 +4182,8 @@
 
 	nomodule	Disable module load
 
+	noinvlpgb	[X86-64,EARLY] Disable the INVLPGB cpu feature.
+
 	nonmi_ipi	[X86] Disable using NMI IPIs during panic/reboot to
 			shutdown the other cpus. Instead use the REBOOT_VECTOR
 			irq.
@@ -4190,6 +4192,7 @@
 			pagetables) support.
 
 	nopcid		[X86-64,EARLY] Disable the PCID cpu feature.
+			This also disables INVLPGB, which relies on PCID.
 
 	nopku		[X86] Disable Memory Protection Keys CPU feature found
 			in some Intel CPUs.
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 742bdb0c4846..b1ead1136d5c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -245,6 +245,33 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
 } };
 EXPORT_PER_CPU_SYMBOL_GPL(gdt_page);
 
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+static void disable_invlpgb(void)
+{
+	/* do not emit a message if the feature is not present */
+	if (!boot_cpu_has(X86_FEATURE_INVLPGB))
+		return;
+
+	setup_clear_cpu_cap(X86_FEATURE_INVLPGB);
+	pr_info("INVLPGB feature disabled\n");
+}
+
+static int __init x86_noinvlpgb_setup(char *s)
+{
+	/* noinvlpgb doesn't accept parameters */
+	if (s)
+		return -EINVAL;
+
+	disable_invlpgb();
+	return 0;
+}
+early_param("noinvlpgb", x86_noinvlpgb_setup);
+#else
+static void disable_invlpgb(void)
+{
+}
+#endif
+
 #ifdef CONFIG_X86_64
 static int __init x86_nopcid_setup(char *s)
 {
@@ -258,6 +285,7 @@ static int __init x86_nopcid_setup(char *s)
 
 	setup_clear_cpu_cap(X86_FEATURE_PCID);
 	pr_info("nopcid: PCID feature disabled\n");
+	disable_invlpgb();
 	return 0;
 }
 early_param("nopcid", x86_nopcid_setup);
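
For illustration only, the dependency this patch sets up between the two
boot options can be modelled in userspace C. This is a rough sketch, not
kernel code: FEAT_PCID, FEAT_INVLPGB and apply_boot_option() are
hypothetical names invented here, and a plain bitmask stands in for the
kernel's CPU capability handling (setup_clear_cpu_cap() and friends).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for X86_FEATURE_PCID / X86_FEATURE_INVLPGB. */
#define FEAT_PCID    (1u << 0)
#define FEAT_INVLPGB (1u << 1)

/*
 * Return the feature mask that remains after one boot option is applied.
 * Unknown options leave the mask unchanged.
 */
static unsigned int apply_boot_option(unsigned int feats, const char *opt)
{
	/* "noinvlpgb" only turns off broadcast TLB flushing. */
	if (strcmp(opt, "noinvlpgb") == 0)
		return feats & ~FEAT_INVLPGB;

	/* "nopcid" also turns off INVLPGB, which relies on PCID. */
	if (strcmp(opt, "nopcid") == 0)
		return feats & ~(FEAT_PCID | FEAT_INVLPGB);

	return feats;
}
```

With both bits set on entry, "nopcid" leaves neither feature enabled,
while "noinvlpgb" leaves only FEAT_PCID. Clearing an absent bit is a
no-op, which loosely mirrors the early return in disable_invlpgb() when
the CPU lacks the feature.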