From patchwork Thu Feb 13 16:13:52 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13973610
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Rik van Riel, Manali Shukla
Subject: [PATCH v11 01/12] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional
Date: Thu, 13 Feb 2025 11:13:52 -0500
Message-ID: <20250213161423.449435-2-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>

Currently x86 uses CONFIG_MMU_GATHER_TABLE_FREE when using
paravirt, and not when running on bare metal.

There is no real good reason to do things differently for
each setup. Make them all the same.

Currently get_user_pages_fast synchronizes against page table
freeing in two different ways:
- on bare metal, by blocking IRQs, which block TLB flush IPIs
- on paravirt, with MMU_GATHER_RCU_TABLE_FREE

This is done because some paravirt TLB flush implementations
handle the TLB flush in the hypervisor, and will do the flush
even when the target CPU has interrupts disabled.

Always handle page table freeing with MMU_GATHER_RCU_TABLE_FREE.
Using RCU synchronization between page table freeing and
get_user_pages_fast() allows bare metal to also do TLB flushing
while interrupts are disabled.

Various places in the mm do still block IRQs or disable preemption
as an implicit way to block RCU frees.

That makes it safe to use INVLPGB on AMD CPUs.

Signed-off-by: Rik van Riel
Suggested-by: Peter Zijlstra
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/Kconfig           |  2 +-
 arch/x86/kernel/paravirt.c | 17 +----------------
 arch/x86/mm/pgtable.c      | 27 ++++-----------------------
 3 files changed, 6 insertions(+), 40 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6df7779ed6da..aeb07da762fc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -278,7 +278,7 @@ config X86
 	select HAVE_PCI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
-	select MMU_GATHER_RCU_TABLE_FREE if PARAVIRT
+	select MMU_GATHER_RCU_TABLE_FREE
 	select MMU_GATHER_MERGE_VMAS
 	select HAVE_POSIX_CPU_TIMERS_TASK_WORK
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1ccaa3397a67..527f5605aa3e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,21 +59,6 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }
 
-#ifndef CONFIG_PT_RECLAIM
-static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct ptdesc *ptdesc = (struct ptdesc *)table;
-
-	pagetable_dtor(ptdesc);
-	tlb_remove_page(tlb, ptdesc_page(ptdesc));
-}
-#else
-static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	tlb_remove_table(tlb, table);
-}
-#endif
-
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
@@ -195,7 +180,7 @@ struct paravirt_patch_template pv_ops = {
 	.mmu.flush_tlb_kernel = native_flush_tlb_global,
 	.mmu.flush_tlb_one_user = native_flush_tlb_one_user,
 	.mmu.flush_tlb_multi = native_flush_tlb_multi,
-	.mmu.tlb_remove_table = native_tlb_remove_table,
+	.mmu.tlb_remove_table = tlb_remove_table,
 
 	.mmu.exit_mmap = paravirt_nop,
 	.mmu.notify_page_enc_status_changed = paravirt_nop,
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 1fef5ad32d5a..b1c1f72c1fd1 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -18,25 +18,6 @@ EXPORT_SYMBOL(physical_mask);
 #define PGTABLE_HIGHMEM 0
 #endif
 
-#ifndef CONFIG_PARAVIRT
-#ifndef CONFIG_PT_RECLAIM
-static inline
-void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct ptdesc *ptdesc = (struct ptdesc *)table;
-
-	pagetable_dtor(ptdesc);
-	tlb_remove_page(tlb, ptdesc_page(ptdesc));
-}
-#else
-static inline
-void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	tlb_remove_table(tlb, table);
-}
-#endif /* !CONFIG_PT_RECLAIM */
-#endif /* !CONFIG_PARAVIRT */
-
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
 pgtable_t pte_alloc_one(struct mm_struct *mm)
@@ -64,7 +45,7 @@ early_param("userpte", setup_userpte);
 void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 {
 	paravirt_release_pte(page_to_pfn(pte));
-	paravirt_tlb_remove_table(tlb, page_ptdesc(pte));
+	tlb_remove_table(tlb, page_ptdesc(pte));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 2
@@ -78,21 +59,21 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 #ifdef CONFIG_X86_PAE
 	tlb->need_flush_all = 1;
 #endif
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(pmd));
+	tlb_remove_table(tlb, virt_to_ptdesc(pmd));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 3
 void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 {
 	paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(pud));
+	tlb_remove_table(tlb, virt_to_ptdesc(pud));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(p4d));
+	tlb_remove_table(tlb, virt_to_ptdesc(p4d));
 }
 #endif /* CONFIG_PGTABLE_LEVELS > 4 */
 #endif /* CONFIG_PGTABLE_LEVELS > 3 */
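
Illustration only, not part of the patch: the deferred-free ordering
that the commit message relies on can be sketched in userspace roughly
as below. This is a single-threaded toy model under loud assumptions.
Every name in it (struct table, struct gather, walker_enter(),
walker_exit(), gather_remove_table(), gather_flush()) is hypothetical
and is not a kernel API, and a plain counter stands in for what the
kernel tracks with RCU grace periods. It only shows the invariant: a
queued page table is not freed while a lockless walker may still be
looking at it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define MAX_BATCH 8

struct table {
	bool freed;			/* set once the table is "freed" */
};

struct gather {
	struct table *batch[MAX_BATCH];	/* tables queued for freeing */
	size_t nr;
};

/* Counter standing in for "a lockless walker is in its read side". */
static int walkers_in_flight;

static void walker_enter(void) { walkers_in_flight++; }
static void walker_exit(void)  { walkers_in_flight--; }

/* Queue a table instead of freeing it right away. */
static bool gather_remove_table(struct gather *g, struct table *t)
{
	if (g->nr == MAX_BATCH)
		return false;		/* batch full, caller must flush */
	g->batch[g->nr++] = t;
	return true;
}

/*
 * Free the queued tables, but only when no walker is in flight.
 * Returns the number of tables freed by this call.
 */
static size_t gather_flush(struct gather *g)
{
	size_t i, n = g->nr;

	if (walkers_in_flight > 0)
		return 0;		/* walkers still active: defer */

	for (i = 0; i < n; i++)
		g->batch[i]->freed = true;
	g->nr = 0;
	return n;
}
```

In this model a flush attempted between walker_enter() and
walker_exit() frees nothing, and the same flush repeated after
walker_exit() frees the whole batch. That is the property the commit
message wants on bare metal as well as on paravirt, without depending
on TLB flush IPIs being held off by disabled interrupts.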