From patchwork Wed Feb 26 03:00:36 2025
X-Patchwork-Id: 13991448
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel, Dave Hansen
Subject: [PATCH v14 01/13] x86/mm: consolidate full flush threshold decision
Date: Tue, 25 Feb 2025 22:00:36 -0500
Message-ID: <20250226030129.530345-2-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

Reduce code duplication by consolidating the decision point for whether
to do individual invalidations or a full flush inside get_flush_tlb_info().

Signed-off-by: Rik van Riel
Suggested-by: Dave Hansen
Tested-by: Michael Kelley
Acked-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
---
 arch/x86/mm/tlb.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ffc25b348041..dbcb5c968ff9 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1000,6 +1000,15 @@ static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
 	BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
 #endif
 
+	/*
+	 * If the number of flushes is so large that a full flush
+	 * would be faster, do a full flush.
+	 */
+	if ((end - start) >> stride_shift > tlb_single_page_flush_ceiling) {
+		start = 0;
+		end = TLB_FLUSH_ALL;
+	}
+
 	info->start		= start;
 	info->end		= end;
 	info->mm		= mm;
@@ -1026,17 +1035,8 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				bool freed_tables)
 {
 	struct flush_tlb_info *info;
+	int cpu = get_cpu();
 	u64 new_tlb_gen;
-	int cpu;
-
-	cpu = get_cpu();
-
-	/* Should we flush just the requested range? */
-	if ((end == TLB_FLUSH_ALL) ||
-	    ((end - start) >> stride_shift) > tlb_single_page_flush_ceiling) {
-		start = 0;
-		end = TLB_FLUSH_ALL;
-	}
 
 	/* This is also a barrier that synchronizes with switch_mm(). */
 	new_tlb_gen = inc_mm_tlb_gen(mm);
 
@@ -1089,22 +1089,19 @@ static void do_kernel_range_flush(void *info)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-	/* Balance as user space task's flush, a bit conservative */
-	if (end == TLB_FLUSH_ALL ||
-	    (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
-		on_each_cpu(do_flush_tlb_all, NULL, 1);
-	} else {
-		struct flush_tlb_info *info;
+	struct flush_tlb_info *info;
+
+	guard(preempt)();
 
-		preempt_disable();
-		info = get_flush_tlb_info(NULL, start, end, 0, false,
-					  TLB_GENERATION_INVALID);
+	info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false,
+				  TLB_GENERATION_INVALID);
 
+	if (info->end == TLB_FLUSH_ALL)
+		on_each_cpu(do_flush_tlb_all, NULL, 1);
+	else
 		on_each_cpu(do_kernel_range_flush, info, 1);
 
-		put_flush_tlb_info();
-		preempt_enable();
-	}
+	put_flush_tlb_info();
 }
 
 /*
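
For readers skimming the series, a stand-alone sketch of the threshold check that
get_flush_tlb_info() now centralizes is shown below. The ceiling value of 33 pages
is only an assumed example (it mirrors the usual x86 default for
tlb_single_page_flush_ceiling, which is tunable at runtime in the real kernel).

	#include <stdbool.h>
	#include <stdio.h>

	#define PAGE_SHIFT	12

	/* Assumed example value standing in for tlb_single_page_flush_ceiling. */
	static unsigned long tlb_single_page_flush_ceiling = 33;

	/* Same decision get_flush_tlb_info() now makes for every caller. */
	static bool wants_full_flush(unsigned long start, unsigned long end,
				     unsigned int stride_shift)
	{
		return ((end - start) >> stride_shift) > tlb_single_page_flush_ceiling;
	}

	int main(void)
	{
		/* 33 pages stay as individual invalidations; 40 pages become a full flush. */
		printf("%d\n", wants_full_flush(0, 33UL << PAGE_SHIFT, PAGE_SHIFT));
		printf("%d\n", wants_full_flush(0, 40UL << PAGE_SHIFT, PAGE_SHIFT));
		return 0;
	}
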
From patchwork Wed Feb 26 03:00:37 2025
X-Patchwork-Id: 13991450
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel, Dave Hansen
Subject: [PATCH v14 02/13] x86/mm: get INVLPGB count max from CPUID
Date: Tue, 25 Feb 2025 22:00:37 -0500
Message-ID: <20250226030129.530345-3-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

The CPU advertises the maximum number of pages that can be shot down
with one INVLPGB instruction in the CPUID data. Save that information
for later use.
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
Acked-by: Dave Hansen
---
 arch/x86/Kconfig.cpu               | 4 ++++
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/tlbflush.h    | 3 +++
 arch/x86/kernel/cpu/amd.c          | 6 ++++++
 4 files changed, 14 insertions(+)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 2a7279d80460..981def9cbfac 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -401,6 +401,10 @@ menuconfig PROCESSOR_SELECT
 	  This lets you choose what x86 vendor support code your kernel
 	  will include.
 
+config X86_BROADCAST_TLB_FLUSH
+	def_bool y
+	depends on CPU_SUP_AMD && 64BIT
+
 config CPU_SUP_INTEL
 	default y
 	bool "Support Intel processors" if PROCESSOR_SELECT
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116b..b5c66b7465ba 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -338,6 +338,7 @@
 #define X86_FEATURE_CLZERO		(13*32+ 0) /* "clzero" CLZERO instruction */
 #define X86_FEATURE_IRPERF		(13*32+ 1) /* "irperf" Instructions Retired Count */
 #define X86_FEATURE_XSAVEERPTR		(13*32+ 2) /* "xsaveerptr" Always save/restore FP error pointers */
+#define X86_FEATURE_INVLPGB		(13*32+ 3) /* INVLPGB and TLBSYNC instruction supported. */
 #define X86_FEATURE_RDPRU		(13*32+ 4) /* "rdpru" Read processor register at user level */
 #define X86_FEATURE_WBNOINVD		(13*32+ 9) /* "wbnoinvd" WBNOINVD instruction */
 #define X86_FEATURE_AMD_IBPB		(13*32+12) /* Indirect Branch Prediction Barrier */
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 3da645139748..855c13da2045 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -183,6 +183,9 @@ static inline void cr4_init_shadow(void)
 
 extern unsigned long mmu_cr4_features;
 extern u32 *trampoline_cr4_features;
 
+/* How many pages can be invalidated with one INVLPGB. */
+extern u16 invlpgb_count_max;
+
 extern void initialize_tlbstate_and_flush(void);
 
 /*
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 54194f5995de..3c75c174a274 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -29,6 +29,8 @@
 
 #include "cpu.h"
 
+u16 invlpgb_count_max __ro_after_init;
+
 static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
 {
 	u32 gprs[8] = { 0 };
@@ -1139,6 +1141,10 @@ static void cpu_detect_tlb_amd(struct cpuinfo_x86 *c)
 
 	tlb_lli_2m[ENTRIES] = eax & mask;
 	tlb_lli_4m[ENTRIES] = tlb_lli_2m[ENTRIES] >> 1;
+
+	/* Max number of pages INVLPGB can invalidate in one shot */
+	if (boot_cpu_has(X86_FEATURE_INVLPGB))
+		invlpgb_count_max = (cpuid_edx(0x80000008) & 0xffff) + 1;
 }
 
 static const struct cpu_dev amd_cpu_dev = {
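
As a side note, the same CPUID field can be inspected from user space. The sketch
below only illustrates the bit layout used in the hunk above (leaf 0x80000008,
EDX[15:0] holding the page count minus one); the check of EBX bit 3 for INVLPGB
support is an assumption derived from the X86_FEATURE_INVLPGB definition
(CPUID word 13, bit 3) rather than something spelled out in this patch.

	#include <cpuid.h>
	#include <stdio.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx;

		if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
			return 1;

		/* Assumption: EBX bit 3 of this leaf is the INVLPGB/TLBSYNC feature bit. */
		if (!(ebx & (1u << 3))) {
			puts("INVLPGB not supported");
			return 0;
		}

		/* EDX[15:0] holds "pages per INVLPGB minus one", as in the patch. */
		printf("invlpgb_count_max = %u\n", (edx & 0xffff) + 1);
		return 0;
	}
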
From patchwork Wed Feb 26 03:00:38 2025
X-Patchwork-Id: 13991452
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel, Dave Hansen
Subject: [PATCH v14 03/13] x86/mm: add INVLPGB support code
Date: Tue, 25 Feb 2025 22:00:38 -0500
Message-ID: <20250226030129.530345-4-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

Add the helper functions and definitions needed to use broadcast TLB
invalidation on AMD CPUs (EPYC 3 and newer). All of the helpers added
here are used later in the series.

Disabling X86_FEATURE_INVLPGB at compile time, when the config option
is not set, allows the compiler to omit the unnecessary code.
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
Acked-by: Dave Hansen
---
 arch/x86/include/asm/disabled-features.h |  8 +-
 arch/x86/include/asm/tlb.h               | 98 ++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..625a89259968 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -129,6 +129,12 @@
 #define DISABLE_SEV_SNP		(1 << (X86_FEATURE_SEV_SNP & 31))
 #endif
 
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+#define DISABLE_INVLPGB		0
+#else
+#define DISABLE_INVLPGB		(1 << (X86_FEATURE_INVLPGB & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -146,7 +152,7 @@
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
 			 DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK)
 #define DISABLED_MASK12	(DISABLE_FRED|DISABLE_LAM)
-#define DISABLED_MASK13	0
+#define DISABLED_MASK13	(DISABLE_INVLPGB)
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 77f52bc1578a..91c9a4da3ace 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -6,6 +6,9 @@
 static inline void tlb_flush(struct mmu_gather *tlb);
 
 #include
+#include
+#include
+#include
 
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
@@ -25,4 +28,99 @@ static inline void invlpg(unsigned long addr)
 	asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
 }
 
+/*
+ * INVLPGB does broadcast TLB invalidation across all the CPUs in the system.
+ *
+ * The INVLPGB instruction is weakly ordered, and a batch of invalidations can
+ * be done in a parallel fashion.
+ *
+ * The instruction takes the number of extra pages to invalidate, beyond
+ * the first page, while __invlpgb gets the more human readable number of
+ * pages to invalidate.
+ *
+ * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
+ * this CPU have completed.
+ */
+static inline void __invlpgb(unsigned long asid, unsigned long pcid,
+			     unsigned long addr, u16 nr_pages,
+			     bool pmd_stride, u8 flags)
+{
+	u32 edx = (pcid << 16) | asid;
+	u32 ecx = (pmd_stride << 31) | (nr_pages - 1);
+	u64 rax = addr | flags;
+
+	/* The low bits in rax are for flags. Verify addr is clean. */
+	VM_WARN_ON_ONCE(addr & ~PAGE_MASK);
+
+	/* INVLPGB; supported in binutils >= 2.36. */
+	asm volatile(".byte 0x0f, 0x01, 0xfe" : : "a" (rax), "c" (ecx), "d" (edx));
+}
+
+static inline void __tlbsync(void)
+{
+	/*
+	 * TLBSYNC waits for INVLPGB instructions originating on the
+	 * same CPU to have completed. Print a warning if we could have
+	 * migrated, and might not be waiting on all the INVLPGBs issued
+	 * during this TLB invalidation sequence.
+	 */
+	cant_migrate();
+
+	/* TLBSYNC; supported in binutils >= 2.36. */
+	asm volatile(".byte 0x0f, 0x01, 0xff" ::: "memory");
+}
+
+/*
+ * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
+ * of the three. For example:
+ * - INVLPGB_VA | INVLPGB_INCLUDE_GLOBAL: invalidate all TLB entries at the address
+ * - INVLPGB_PCID:			  invalidate all TLB entries matching the PCID
+ *
+ * The first can be used to invalidate (kernel) mappings at a particular
+ * address across all processes.
+ *
+ * The latter invalidates all TLB entries matching a PCID.
+ */
+#define INVLPGB_VA			BIT(0)
+#define INVLPGB_PCID			BIT(1)
+#define INVLPGB_ASID			BIT(2)
+#define INVLPGB_INCLUDE_GLOBAL		BIT(3)
+#define INVLPGB_FINAL_ONLY		BIT(4)
+#define INVLPGB_INCLUDE_NESTED		BIT(5)
+
+static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
+						unsigned long addr,
+						u16 nr,
+						bool pmd_stride)
+{
+	__invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
+}
+
+/* Flush all mappings for a given PCID, not including globals. */
+static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid)
+{
+	__invlpgb(0, pcid, 0, 1, 0, INVLPGB_PCID);
+}
+
+/* Flush all mappings, including globals, for all PCIDs. */
+static inline void invlpgb_flush_all(void)
+{
+	__invlpgb(0, 0, 0, 1, 0, INVLPGB_INCLUDE_GLOBAL);
+	__tlbsync();
+}
+
+/* Flush addr, including globals, for all PCIDs. */
+static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr)
+{
+	__invlpgb(0, 0, addr, nr, 0, INVLPGB_INCLUDE_GLOBAL);
+}
+
+/* Flush all mappings for all PCIDs except globals. */
+static inline void invlpgb_flush_all_nonglobals(void)
+{
+	__invlpgb(0, 0, 0, 1, 0, 0);
+	__tlbsync();
+}
+
 #endif /* _ASM_X86_TLB_H */
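
To make the register packing in __invlpgb() easier to follow, here is a small
user-space sketch that only computes and prints the three operand registers for
one example request (an 8-page flush targeted by PCID and address); it does not
execute the instruction, and the example address and PCID are arbitrary.

	#include <stdint.h>
	#include <stdio.h>

	#define INVLPGB_VA	(1u << 0)
	#define INVLPGB_PCID	(1u << 1)

	int main(void)
	{
		uint64_t addr = 0x7f0000000000ULL;	/* example, page aligned */
		uint32_t pcid = 5, asid = 0;		/* example PCID */
		uint16_t nr_pages = 8;
		uint32_t pmd_stride = 0;
		uint8_t flags = INVLPGB_PCID | INVLPGB_VA;

		/* Same packing as __invlpgb() above. */
		uint64_t rax = addr | flags;
		uint32_t ecx = (pmd_stride << 31) | (nr_pages - 1);
		uint32_t edx = (pcid << 16) | asid;

		printf("rax=%#llx ecx=%#x edx=%#x\n",
		       (unsigned long long)rax, ecx, edx);
		return 0;
	}
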
From patchwork Wed Feb 26 03:00:39 2025
X-Patchwork-Id: 13991459
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel
Subject: [PATCH v14 04/13] x86/mm: use INVLPGB for kernel TLB flushes
Date: Tue, 25 Feb 2025 22:00:39 -0500
Message-ID: <20250226030129.530345-5-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

Use broadcast TLB invalidation for kernel addresses when available.
This removes the need to send IPIs for kernel TLB flushes.
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/mm/tlb.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index dbcb5c968ff9..f44a03bca41c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1077,6 +1077,18 @@ void flush_tlb_all(void)
 	on_each_cpu(do_flush_tlb_all, NULL, 1);
 }
 
+static void invlpgb_kernel_range_flush(struct flush_tlb_info *info)
+{
+	unsigned long addr, nr;
+
+	for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) {
+		nr = (info->end - addr) >> PAGE_SHIFT;
+		nr = clamp_val(nr, 1, invlpgb_count_max);
+		invlpgb_flush_addr_nosync(addr, nr);
+	}
+	__tlbsync();
+}
+
 static void do_kernel_range_flush(void *info)
 {
 	struct flush_tlb_info *f = info;
@@ -1087,6 +1099,22 @@ static void do_kernel_range_flush(void *info)
 		flush_tlb_one_kernel(addr);
 }
 
+static void kernel_tlb_flush_all(struct flush_tlb_info *info)
+{
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		invlpgb_flush_all();
+	else
+		on_each_cpu(do_flush_tlb_all, NULL, 1);
+}
+
+static void kernel_tlb_flush_range(struct flush_tlb_info *info)
+{
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		invlpgb_kernel_range_flush(info);
+	else
+		on_each_cpu(do_kernel_range_flush, info, 1);
+}
+
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
 	struct flush_tlb_info *info;
@@ -1097,9 +1125,9 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 				  TLB_GENERATION_INVALID);
 
 	if (info->end == TLB_FLUSH_ALL)
-		on_each_cpu(do_flush_tlb_all, NULL, 1);
+		kernel_tlb_flush_all(info);
 	else
-		on_each_cpu(do_kernel_range_flush, info, 1);
+		kernel_tlb_flush_range(info);
 
 	put_flush_tlb_info();
 }
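
The loop in invlpgb_kernel_range_flush() walks the range in chunks of at most
invlpgb_count_max pages and issues a single TLBSYNC at the end. The sketch below
reproduces just that chunking arithmetic in user space; the limit of 8 pages per
instruction is an assumed example value, not a real CPUID result.

	#include <stdio.h>

	#define PAGE_SHIFT 12

	int main(void)
	{
		unsigned long start = 0xffffc90000000000UL;		/* example range */
		unsigned long end   = start + (20UL << PAGE_SHIFT);	/* 20 pages */
		unsigned long invlpgb_count_max = 8;			/* assumed HW limit */
		unsigned long addr, nr;

		for (addr = start; addr < end; addr += nr << PAGE_SHIFT) {
			nr = (end - addr) >> PAGE_SHIFT;
			if (nr > invlpgb_count_max)
				nr = invlpgb_count_max;
			/* One INVLPGB would cover [addr, addr + nr pages) here. */
			printf("flush %2lu pages at %#lx\n", nr, addr);
		}
		return 0;
	}
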
From patchwork Wed Feb 26 03:00:40 2025
X-Patchwork-Id: 13991456
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel
Subject: [PATCH v14 05/13] x86/mm: use INVLPGB in flush_tlb_all
Date: Tue, 25 Feb 2025 22:00:40 -0500
Message-ID: <20250226030129.530345-6-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

The flush_tlb_all() function is not used a whole lot, but we might
as well use broadcast TLB flushing there, too.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/mm/tlb.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f44a03bca41c..a6cd61d5f423 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1064,7 +1064,6 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
-
 static void do_flush_tlb_all(void *info)
 {
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
@@ -1074,6 +1073,15 @@ static void do_flush_tlb_all(void *info)
 void flush_tlb_all(void)
 {
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
+
+	/* First try (faster) hardware-assisted TLB invalidation. */
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		guard(preempt)();
+		invlpgb_flush_all();
+		return;
+	}
+
+	/* Fall back to the IPI-based invalidation. */
 	on_each_cpu(do_flush_tlb_all, NULL, 1);
 }
From patchwork Wed Feb 26 03:00:41 2025
X-Patchwork-Id: 13991455
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel
Subject: [PATCH v14 06/13] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing
Date: Tue, 25 Feb 2025 22:00:41 -0500
Message-ID: <20250226030129.530345-7-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

In the page reclaim code, we only track the CPU(s) where the TLB needs
to be flushed, rather than all the individual mappings that may be
getting invalidated.

Use broadcast TLB flushing when that is available.

This is a temporary measure to ensure that the PCID contexts of tasks
using the global ASIDs introduced in the next patch get properly flushed
from the page reclaim code, because the IPI-based flushing in
arch_tlbbatch_flush() only flushes the currently loaded TLB context on
each CPU.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/mm/tlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a6cd61d5f423..1cc25e83bd34 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1316,7 +1316,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	 * a local TLB flush is needed. Optimize this use-case by calling
 	 * flush_tlb_func_local() directly in this case.
 	 */
-	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		invlpgb_flush_all_nonglobals();
+	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
 		flush_tlb_multi(&batch->cpumask, info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();
From patchwork Wed Feb 26 03:00:42 2025
X-Patchwork-Id: 13991449
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
 thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
 akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
 mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
 mingo@kernel.org, Rik van Riel
Subject: [PATCH v14 07/13] x86/mm: add global ASID allocation helper functions
Date: Tue, 25 Feb 2025 22:00:42 -0500
Message-ID: <20250226030129.530345-8-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

Add functions to manage the global ASID space.

Multithreaded processes that are simultaneously active on 4 or more CPUs
can get a global ASID, resulting in the same PCID being used for that
process on every CPU.

This in turn will allow the kernel to use hardware-assisted TLB flushing
through AMD INVLPGB or Intel RAR for these processes.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/mmu.h         |  11 +++
 arch/x86/include/asm/mmu_context.h |   2 +
 arch/x86/include/asm/tlbflush.h    |  43 +++++++++
 arch/x86/mm/tlb.c                  | 146 ++++++++++++++++++++++++++++-
 4 files changed, 199 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 3b496cdcb74b..edb5942d4829 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -69,6 +69,17 @@ typedef struct {
 	u16 pkey_allocation_map;
 	s16 execute_only_pkey;
 #endif
+
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+	/*
+	 * The global ASID will be a non-zero value when the process has
+	 * the same ASID across all CPUs, allowing it to make use of
+	 * hardware-assisted remote TLB invalidation like AMD INVLPGB.
+	 */
+	u16 global_asid;
+	/* The process is transitioning to a new global ASID number. */
+	bool asid_transition;
+#endif
 } mm_context_t;
 
 #define INIT_MM_CONTEXT(mm)						\
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 795fdd53bd0a..a2c70e495b1b 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -139,6 +139,8 @@ static inline void mm_reset_untag_mask(struct mm_struct *mm)
 #define enter_lazy_tlb enter_lazy_tlb
 extern void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk);
 
+extern void mm_free_global_asid(struct mm_struct *mm);
+
 /*
  * Init a new mm. Used on mm copies, like at fork()
  * and on mm's that are brand-new, like at execve().
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 855c13da2045..8e7df0ed7005 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -6,6 +6,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -234,6 +235,48 @@ void flush_tlb_one_kernel(unsigned long addr);
 void flush_tlb_multi(const struct cpumask *cpumask,
 		      const struct flush_tlb_info *info);
 
+static inline bool is_dyn_asid(u16 asid)
+{
+	return asid < TLB_NR_DYN_ASIDS;
+}
+
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+static inline u16 mm_global_asid(struct mm_struct *mm)
+{
+	u16 asid;
+
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		return 0;
+
+	asid = smp_load_acquire(&mm->context.global_asid);
+
+	/* mm->context.global_asid is either 0, or a global ASID */
+	VM_WARN_ON_ONCE(asid && is_dyn_asid(asid));
+
+	return asid;
+}
+
+static inline void mm_assign_global_asid(struct mm_struct *mm, u16 asid)
+{
+	/*
+	 * Notably flush_tlb_mm_range() -> broadcast_tlb_flush() ->
+	 * finish_asid_transition() needs to observe asid_transition = true
+	 * once it observes global_asid.
+	 */
+	mm->context.asid_transition = true;
+	smp_store_release(&mm->context.global_asid, asid);
+}
+#else
+static inline u16 mm_global_asid(struct mm_struct *mm)
+{
+	return 0;
+}
+
+static inline void mm_assign_global_asid(struct mm_struct *mm, u16 asid)
+{
+}
+#endif
+
 #ifdef CONFIG_PARAVIRT
 #include
 #endif
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 1cc25e83bd34..9b1652c02452 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -74,13 +74,15 @@
  * use different names for each of them:
  *
  * ASID  - [0, TLB_NR_DYN_ASIDS-1]
- *         the canonical identifier for an mm
+ *         the canonical identifier for an mm, dynamically allocated on each CPU
+ *         [TLB_NR_DYN_ASIDS, MAX_ASID_AVAILABLE-1]
+ *         the canonical, global identifier for an mm, identical across all CPUs
  *
- * kPCID - [1, TLB_NR_DYN_ASIDS]
+ * kPCID - [1, MAX_ASID_AVAILABLE]
  *         the value we write into the PCID part of CR3; corresponds to the
  *         ASID+1, because PCID 0 is special.
  *
- * uPCID - [2048 + 1, 2048 + TLB_NR_DYN_ASIDS]
+ * uPCID - [2048 + 1, 2048 + MAX_ASID_AVAILABLE]
  *         for KPTI each mm has two address spaces and thus needs two
  *         PCID values, but we can still do with a single ASID denomination
  *         for each mm. Corresponds to kPCID + 2048.
@@ -251,6 +253,144 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
 	*need_flush = true;
 }
 
+/*
+ * Global ASIDs are allocated for multi-threaded processes that are
+ * active on multiple CPUs simultaneously, giving each of those
+ * processes the same PCID on every CPU, for use with hardware-assisted
+ * TLB shootdown on remote CPUs, like AMD INVLPGB or Intel RAR.
+ *
+ * These global ASIDs are held for the lifetime of the process.
+ */
+static DEFINE_RAW_SPINLOCK(global_asid_lock);
+static u16 last_global_asid = MAX_ASID_AVAILABLE;
+static DECLARE_BITMAP(global_asid_used, MAX_ASID_AVAILABLE);
+static DECLARE_BITMAP(global_asid_freed, MAX_ASID_AVAILABLE);
+static int global_asid_available = MAX_ASID_AVAILABLE - TLB_NR_DYN_ASIDS - 1;
+
+/*
+ * When the search for a free ASID in the global ASID space reaches
+ * MAX_ASID_AVAILABLE, a global TLB flush guarantees that previously
+ * freed global ASIDs are safe to re-use.
+ *
+ * This way the global flush only needs to happen at ASID rollover
+ * time, and not at ASID allocation time.
+ */ +static void reset_global_asid_space(void) +{ + lockdep_assert_held(&global_asid_lock); + + invlpgb_flush_all_nonglobals(); + + /* + * The TLB flush above makes it safe to re-use the previously + * freed global ASIDs. + */ + bitmap_andnot(global_asid_used, global_asid_used, + global_asid_freed, MAX_ASID_AVAILABLE); + bitmap_clear(global_asid_freed, 0, MAX_ASID_AVAILABLE); + + /* Restart the search from the start of global ASID space. */ + last_global_asid = TLB_NR_DYN_ASIDS; +} + +static u16 allocate_global_asid(void) +{ + u16 asid; + + lockdep_assert_held(&global_asid_lock); + + /* The previous allocation hit the edge of available address space */ + if (last_global_asid >= MAX_ASID_AVAILABLE - 1) + reset_global_asid_space(); + + asid = find_next_zero_bit(global_asid_used, MAX_ASID_AVAILABLE, last_global_asid); + + if (asid >= MAX_ASID_AVAILABLE && !global_asid_available) { + /* This should never happen. */ + VM_WARN_ONCE(1, "Unable to allocate global ASID despite %d available\n", + global_asid_available); + return 0; + } + + /* Claim this global ASID. */ + __set_bit(asid, global_asid_used); + last_global_asid = asid; + global_asid_available--; + return asid; +} + +/* + * Check whether a process is currently active on more than @threshold CPUs. + * This is a cheap estimation on whether or not it may make sense to assign + * a global ASID to this process, and use broadcast TLB invalidation. + */ +static bool mm_active_cpus_exceeds(struct mm_struct *mm, int threshold) +{ + int count = 0; + int cpu; + + /* This quick check should eliminate most single threaded programs. */ + if (cpumask_weight(mm_cpumask(mm)) <= threshold) + return false; + + /* Slower check to make sure. */ + for_each_cpu(cpu, mm_cpumask(mm)) { + /* Skip the CPUs that aren't really running this process. */ + if (per_cpu(cpu_tlbstate.loaded_mm, cpu) != mm) + continue; + + if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu)) + continue; + + if (++count > threshold) + return true; + } + return false; +} + +/* + * Assign a global ASID to the current process, protecting against + * races between multiple threads in the process. + */ +static void use_global_asid(struct mm_struct *mm) +{ + u16 asid; + + guard(raw_spinlock_irqsave)(&global_asid_lock); + + /* This process is already using broadcast TLB invalidation. */ + if (mm_global_asid(mm)) + return; + + /* The last global ASID was consumed while waiting for the lock. */ + if (!global_asid_available) { + VM_WARN_ONCE(1, "Ran out of global ASIDs\n"); + return; + } + + asid = allocate_global_asid(); + if (!asid) + return; + + mm_assign_global_asid(mm, asid); +} + +void mm_free_global_asid(struct mm_struct *mm) +{ + if (!mm_global_asid(mm)) + return; + + guard(raw_spinlock_irqsave)(&global_asid_lock); + + /* The global ASID can be re-used only after flush at wrap-around. */ +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + __set_bit(mm->context.global_asid, global_asid_freed); + + mm->context.global_asid = 0; + global_asid_available++; +#endif +} + /* * Given an ASID, flush the corresponding user ASID. We can delay this * until the next time we switch to it. 
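For readers following along, the allocation scheme above can be condensed into a small stand-alone C sketch: one bitmap tracks used ASIDs, a second tracks freed ones, and a rollover flush makes the freed ones reusable. The constants, the mock flush and main() below are illustrative stand-ins, not the kernel implementation:

#include <stdbool.h>
#include <stdio.h>

#define NR_DYN_ASIDS	6	/* stand-in for TLB_NR_DYN_ASIDS */
#define MAX_ASIDS	32	/* stand-in for MAX_ASID_AVAILABLE */

static unsigned long used, freed;	/* toy bitmaps; 32 bits are enough here */
static int last_asid = MAX_ASIDS;	/* forces a reset on the first allocation */

static void mock_flush_all_nonglobals(void)
{
	/* In the kernel this is invlpgb_flush_all_nonglobals(). */
	printf("-- broadcast flush: freed ASIDs become reusable --\n");
}

static void reset_asid_space(void)
{
	mock_flush_all_nonglobals();
	used &= ~freed;			/* bitmap_andnot() */
	freed = 0;			/* bitmap_clear() */
	last_asid = NR_DYN_ASIDS;	/* restart the search above the per-CPU ASIDs */
}

static int alloc_asid(void)
{
	if (last_asid >= MAX_ASIDS - 1)
		reset_asid_space();

	for (int asid = last_asid; asid < MAX_ASIDS; asid++) {
		if (!(used & (1ul << asid))) {
			used |= 1ul << asid;
			last_asid = asid;
			return asid;
		}
	}
	return 0;	/* out of ASIDs until the next rollover */
}

static void free_asid(int asid)
{
	/* Only marked freed; reuse has to wait for the rollover flush. */
	freed |= 1ul << asid;
}

int main(void)
{
	int a = alloc_asid(), b = alloc_asid();

	printf("allocated %d and %d\n", a, b);
	free_asid(a);
	printf("after free, next is %d (no reuse before rollover)\n", alloc_asid());
	return 0;
}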
From patchwork Wed Feb 26 03:00:43 2025
From: Rik van Riel
To: x86@kernel.org
Subject: [PATCH v14 08/13] x86/mm: global ASID context switch & TLB flush handling
Date: Tue, 25 Feb 2025 22:00:43 -0500
Message-ID: <20250226030129.530345-9-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>
Context switch and TLB flush support for processes that use a global ASID & PCID across all CPUs.

At both context switch time and TLB flush time, we need to check whether a task is switching to a global ASID, and reload the TLB with the new ASID as appropriate. In both code paths, we also short-circuit the TLB flush if we are using a global ASID, because the global ASIDs are always kept up to date across CPUs, even while the process is not running on a CPU.
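The ordering contract these paths rely on (asid_transition must be visible once a non-zero global_asid is observed) can be shown with a small C sketch using C11 atomics; the relaxed accesses stand in for READ_ONCE()/WRITE_ONCE(), and the thread bodies are illustrative only, not kernel code:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for mm->context.asid_transition / mm->context.global_asid. */
static atomic_bool asid_transition;	/* relaxed ops mimic {READ,WRITE}_ONCE() */
static atomic_ushort global_asid;	/* 0 means "no global ASID assigned yet" */

static void *assign_global_asid(void *arg)
{
	(void)arg;
	atomic_store_explicit(&asid_transition, true, memory_order_relaxed);
	/* Pairs with the acquire load in the flush path below. */
	atomic_store_explicit(&global_asid, 123, memory_order_release);
	return NULL;
}

static void *flush_path(void *arg)
{
	(void)arg;
	unsigned short asid = atomic_load_explicit(&global_asid, memory_order_acquire);

	/*
	 * Seeing a non-zero global_asid guarantees the earlier store to
	 * asid_transition is also visible, so the transition is never missed.
	 */
	if (asid)
		printf("asid=%u transition=%d\n", (unsigned)asid,
		       (int)atomic_load_explicit(&asid_transition, memory_order_relaxed));
	else
		printf("no global asid yet\n");
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, assign_global_asid, NULL);
	pthread_create(&b, NULL, flush_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}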
Signed-off-by: Rik van Riel --- arch/x86/include/asm/tlbflush.h | 18 ++++++++ arch/x86/mm/tlb.c | 77 ++++++++++++++++++++++++++++++--- 2 files changed, 88 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 8e7df0ed7005..37b735dcf025 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -240,6 +240,11 @@ static inline bool is_dyn_asid(u16 asid) return asid < TLB_NR_DYN_ASIDS; } +static inline bool is_global_asid(u16 asid) +{ + return !is_dyn_asid(asid); +} + #ifdef CONFIG_X86_BROADCAST_TLB_FLUSH static inline u16 mm_global_asid(struct mm_struct *mm) { @@ -266,6 +271,14 @@ static inline void mm_assign_global_asid(struct mm_struct *mm, u16 asid) mm->context.asid_transition = true; smp_store_release(&mm->context.global_asid, asid); } + +static inline bool in_asid_transition(struct mm_struct *mm) +{ + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + return false; + + return mm && READ_ONCE(mm->context.asid_transition); +} #else static inline u16 mm_global_asid(struct mm_struct *mm) { @@ -275,6 +288,11 @@ static inline u16 mm_global_asid(struct mm_struct *mm) static inline void mm_assign_global_asid(struct mm_struct *mm, u16 asid) { } + +static inline bool in_asid_transition(struct mm_struct *mm) +{ + return false; +} #endif #ifdef CONFIG_PARAVIRT diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 9b1652c02452..b7d461db1b08 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -227,6 +227,20 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen, return; } + /* + * TLB consistency for global ASIDs is maintained with hardware assisted + * remote TLB flushing. Global ASIDs are always up to date. + */ + if (static_cpu_has(X86_FEATURE_INVLPGB)) { + u16 global_asid = mm_global_asid(next); + + if (global_asid) { + *new_asid = global_asid; + *need_flush = false; + return; + } + } + if (this_cpu_read(cpu_tlbstate.invalidate_other)) clear_asid_other(); @@ -391,6 +405,23 @@ void mm_free_global_asid(struct mm_struct *mm) #endif } +/* + * Is the mm transitioning from a CPU-local ASID to a global ASID? + */ +static bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid) +{ + u16 global_asid = mm_global_asid(next); + + if (!static_cpu_has(X86_FEATURE_INVLPGB)) + return false; + + /* Process is transitioning to a global ASID */ + if (global_asid && prev_asid != global_asid) + return true; + + return false; +} + /* * Given an ASID, flush the corresponding user ASID. We can delay this * until the next time we switch to it. @@ -696,7 +727,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ if (prev == next) { /* Not actually switching mm's */ - VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != + VM_WARN_ON(is_dyn_asid(prev_asid) && + this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != next->context.ctx_id); /* @@ -713,6 +745,20 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, !cpumask_test_cpu(cpu, mm_cpumask(next)))) cpumask_set_cpu(cpu, mm_cpumask(next)); + /* Check if the current mm is transitioning to a global ASID */ + if (needs_global_asid_reload(next, prev_asid)) { + next_tlb_gen = atomic64_read(&next->context.tlb_gen); + choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush); + goto reload_tlb; + } + + /* + * Broadcast TLB invalidation keeps this PCID up to date + * all the time. 
+ */ + if (is_global_asid(prev_asid)) + return; + /* * If the CPU is not in lazy TLB mode, we are just switching * from one thread in a process to another thread in the same @@ -746,6 +792,13 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ cond_mitigation(tsk); + /* + * Let nmi_uaccess_okay() and finish_asid_transition() + * know that we're changing CR3. + */ + this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING); + barrier(); + /* * Leave this CPU in prev's mm_cpumask. Atomic writes to * mm_cpumask can be expensive under contention. The CPU @@ -760,14 +813,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, next_tlb_gen = atomic64_read(&next->context.tlb_gen); choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush); - - /* Let nmi_uaccess_okay() know that we're changing CR3. */ - this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING); - barrier(); } +reload_tlb: new_lam = mm_lam_cr3_mask(next); if (need_flush) { + VM_WARN_ON_ONCE(is_global_asid(new_asid)); this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); load_new_mm_cr3(next->pgd, new_asid, new_lam, true); @@ -886,7 +937,7 @@ static void flush_tlb_func(void *info) const struct flush_tlb_info *f = info; struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid); - u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen); + u64 local_tlb_gen; bool local = smp_processor_id() == f->initiating_cpu; unsigned long nr_invalidate = 0; u64 mm_tlb_gen; @@ -909,6 +960,16 @@ static void flush_tlb_func(void *info) if (unlikely(loaded_mm == &init_mm)) return; + /* Reload the ASID if transitioning into or out of a global ASID */ + if (needs_global_asid_reload(loaded_mm, loaded_mm_asid)) { + switch_mm_irqs_off(NULL, loaded_mm, NULL); + loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid); + } + + /* Broadcast ASIDs are always kept up to date with INVLPGB. */ + if (is_global_asid(loaded_mm_asid)) + return; + VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].ctx_id) != loaded_mm->context.ctx_id); @@ -926,6 +987,8 @@ static void flush_tlb_func(void *info) return; } + local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen); + if (unlikely(f->new_tlb_gen != TLB_GENERATION_INVALID && f->new_tlb_gen <= local_tlb_gen)) { /* @@ -1093,7 +1156,7 @@ STATIC_NOPV void native_flush_tlb_multi(const struct cpumask *cpumask, * up on the new contents of what used to be page tables, while * doing a speculative memory access. 
*/ - if (info->freed_tables) + if (info->freed_tables || in_asid_transition(info->mm)) on_each_cpu_mask(cpumask, flush_tlb_func, (void *)info, true); else on_each_cpu_cond_mask(should_flush_tlb, flush_tlb_func,

From patchwork Wed Feb 26 03:00:44 2025
From: Rik van Riel
To: x86@kernel.org
Subject: [PATCH v14 09/13] x86/mm: global ASID process exit helpers
Date: Tue, 25 Feb 2025 22:00:44 -0500
Message-ID: <20250226030129.530345-10-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

A global ASID is allocated for the lifetime of a process. Free the global ASID at process exit time.
Signed-off-by: Rik van Riel --- arch/x86/include/asm/mmu_context.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index a2c70e495b1b..b47ac6d270e6 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -163,6 +163,14 @@ static inline int init_new_context(struct task_struct *tsk, mm->context.execute_only_pkey = -1; } #endif + +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { + mm->context.global_asid = 0; + mm->context.asid_transition = false; + } +#endif + mm_reset_untag_mask(mm); init_new_context_ldt(mm); return 0; @@ -172,6 +180,10 @@ static inline int init_new_context(struct task_struct *tsk, static inline void destroy_context(struct mm_struct *mm) { destroy_context_ldt(mm); +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) + mm_free_global_asid(mm); +#endif } extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,

From patchwork Wed Feb 26 03:00:45 2025
From: Rik van Riel
To: x86@kernel.org
Subject: [PATCH v14 10/13] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
Date: Tue, 25 Feb 2025 22:00:45 -0500
Message-ID: <20250226030129.530345-11-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

Use broadcast TLB invalidation, using the INVLPGB instruction.

There is not enough room in the 12-bit ASID address space to hand out broadcast ASIDs to every process. Only hand out broadcast ASIDs to processes when they are observed to be simultaneously running on 4 or more CPUs.
This also allows single-threaded processes to continue using the cheaper, local TLB invalidation instructions like INVLPG.

Due to the structure of flush_tlb_mm_range, the INVLPGB flushing is done in a generically named broadcast_tlb_flush function, which can later also be used for Intel RAR.

Combined with the removal of unnecessary lru_add_drain calls (see https://lkml.org/lkml/2024/12/19/1388), this results in a nice performance boost for the will-it-scale tlb_flush2_threads test on an AMD Milan system with 36 cores:

 - vanilla kernel:           527k loops/second
 - lru_add_drain removal:    731k loops/second
 - only INVLPGB:             527k loops/second
 - lru_add_drain + INVLPGB: 1157k loops/second

Profiling with only the INVLPGB changes showed that while TLB invalidation went down from 40% of the total CPU time to only around 4% of CPU time, the contention simply moved to the LRU lock. Fixing both at the same time roughly doubles the number of iterations per second for this case.

Comparing will-it-scale tlb_flush2_threads with several different numbers of threads on a 72 CPU AMD Milan shows similar results. The numbers represent the total number of loops per second across all the threads:

threads		tip		invlpgb
1		315k		304k
2		423k		424k
4		644k		1032k
8		652k		1267k
16		737k		1368k
32		759k		1199k
64		636k		1094k
72		609k		993k

1 and 2 thread performance is similar with and without INVLPGB, because INVLPGB is only used on processes using 4 or more CPUs simultaneously. The numbers are the median across 5 runs.

Some numbers closer to real world performance can be found at Phoronix, thanks to Michael: https://www.phoronix.com/news/AMD-INVLPGB-Linux-Benefits

Signed-off-by: Rik van Riel Reviewed-by: Nadav Amit Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/include/asm/tlbflush.h | 9 +++ arch/x86/mm/tlb.c | 107 +++++++++++++++++++++++++++++++- 2 files changed, 115 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 37b735dcf025..811dd70eb6b8 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -272,6 +272,11 @@ static inline void mm_assign_global_asid(struct mm_struct *mm, u16 asid) smp_store_release(&mm->context.global_asid, asid); } +static inline void clear_asid_transition(struct mm_struct *mm) +{ + WRITE_ONCE(mm->context.asid_transition, false); +} + static inline bool in_asid_transition(struct mm_struct *mm) { if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) @@ -289,6 +294,10 @@ static inline void mm_assign_global_asid(struct mm_struct *mm, u16 asid) { } +static inline void clear_asid_transition(struct mm_struct *mm) +{ +} + static inline bool in_asid_transition(struct mm_struct *mm) { return false; diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index b7d461db1b08..cd109bdf0dd9 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -422,6 +422,108 @@ static bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid) return false; } +/* + * x86 has 4k ASIDs (2k when compiled with KPTI), but the largest + * x86 systems have over 8k CPUs. Because of this potential ASID + * shortage, global ASIDs are handed out to processes that have + * frequent TLB flushes and are active on 4 or more CPUs simultaneously. + */ +static void consider_global_asid(struct mm_struct *mm) +{ + if (!static_cpu_has(X86_FEATURE_INVLPGB)) + return; + + /* Check every once in a while.
*/ + if ((current->pid & 0x1f) != (jiffies & 0x1f)) + return; + + if (!READ_ONCE(global_asid_available)) + return; + + /* + * Assign a global ASID if the process is active on + * 4 or more CPUs simultaneously. + */ + if (mm_active_cpus_exceeds(mm, 3)) + use_global_asid(mm); +} + +static void finish_asid_transition(struct flush_tlb_info *info) +{ + struct mm_struct *mm = info->mm; + int bc_asid = mm_global_asid(mm); + int cpu; + + if (!in_asid_transition(mm)) + return; + + for_each_cpu(cpu, mm_cpumask(mm)) { + /* + * The remote CPU is context switching. Wait for that to + * finish, to catch the unlikely case of it switching to + * the target mm with an out of date ASID. + */ + while (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) == LOADED_MM_SWITCHING) + cpu_relax(); + + if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) != mm) + continue; + + /* + * If at least one CPU is not using the global ASID yet, + * send a TLB flush IPI. The IPI should cause stragglers + * to transition soon. + * + * This can race with the CPU switching to another task; + * that results in a (harmless) extra IPI. + */ + if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm_asid, cpu)) != bc_asid) { + flush_tlb_multi(mm_cpumask(info->mm), info); + return; + } + } + + /* All the CPUs running this process are using the global ASID. */ + clear_asid_transition(mm); +} + +static void broadcast_tlb_flush(struct flush_tlb_info *info) +{ + bool pmd = info->stride_shift == PMD_SHIFT; + unsigned long asid = mm_global_asid(info->mm); + unsigned long addr = info->start; + + /* + * TLB flushes with INVLPGB are kicked off asynchronously. + * The inc_mm_tlb_gen() guarantees page table updates are done + * before these TLB flushes happen. + */ + if (info->end == TLB_FLUSH_ALL) { + invlpgb_flush_single_pcid_nosync(kern_pcid(asid)); + /* Do any CPUs supporting INVLPGB need PTI? */ + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_single_pcid_nosync(user_pcid(asid)); + } else do { + unsigned long nr = 1; + + if (info->stride_shift <= PMD_SHIFT) { + nr = (info->end - addr) >> info->stride_shift; + nr = clamp_val(nr, 1, invlpgb_count_max); + } + + invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd); + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd); + + addr += nr << info->stride_shift; + } while (addr < info->end); + + finish_asid_transition(info); + + /* Wait for the INVLPGBs kicked off above to finish. */ + __tlbsync(); +} + /* * Given an ASID, flush the corresponding user ASID. We can delay this * until the next time we switch to it. @@ -1252,9 +1354,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. 
*/ - if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { + if (mm_global_asid(mm)) { + broadcast_tlb_flush(info); + } else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { info->trim_cpumask = should_trim_cpumask(mm); flush_tlb_multi(mm_cpumask(mm), info); + consider_global_asid(mm); } else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) { lockdep_assert_irqs_enabled(); local_irq_disable();
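The range-chunking arithmetic in broadcast_tlb_flush() above is easy to check in isolation. A user-space sketch follows; PAGE_SHIFT and INVLPGB_COUNT_MAX are made-up stand-ins for the real page size and the CPUID-reported limit, and the stride handling is simplified:

#include <stdio.h>

#define PAGE_SHIFT		12
#define INVLPGB_COUNT_MAX	8	/* stand-in for the hardware maximum */

static unsigned long clamp_ul(unsigned long v, unsigned long lo, unsigned long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Mimics the flush loop: one "INVLPGB" per chunk of at most INVLPGB_COUNT_MAX pages. */
static void broadcast_flush_range(unsigned long start, unsigned long end,
				  unsigned int stride_shift)
{
	unsigned long addr = start;

	do {
		unsigned long nr = clamp_ul((end - addr) >> stride_shift,
					    1, INVLPGB_COUNT_MAX);

		printf("INVLPGB addr=%#lx nr=%lu\n", addr, nr);
		addr += nr << stride_shift;
	} while (addr < end);
}

int main(void)
{
	/* 20 pages starting at 0x100000: expect chunks of 8, 8 and 4 pages. */
	broadcast_flush_range(0x100000, 0x100000 + (20ul << PAGE_SHIFT), PAGE_SHIFT);
	return 0;
}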
From patchwork Wed Feb 26 03:00:46 2025
From: Rik van Riel
To: x86@kernel.org
Subject: [PATCH v14 11/13] x86/mm: do targeted broadcast flushing from tlbbatch code
Date: Tue, 25 Feb 2025 22:00:46 -0500
Message-ID: <20250226030129.530345-12-riel@surriel.com>
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>

Instead of doing a system-wide TLB flush from arch_tlbbatch_flush, queue up asynchronous, targeted flushes from arch_tlbbatch_add_pending.

This also allows us to avoid adding the CPUs of processes using broadcast flushing to the batch->cpumask, and will hopefully further reduce TLB flushing from the reclaim and compaction paths.
Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/include/asm/tlb.h | 12 ++--- arch/x86/include/asm/tlbflush.h | 34 ++++++++++---- arch/x86/mm/tlb.c | 79 +++++++++++++++++++++++++++++++-- 3 files changed, 107 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h index 91c9a4da3ace..e645884a1877 100644 --- a/arch/x86/include/asm/tlb.h +++ b/arch/x86/include/asm/tlb.h @@ -89,16 +89,16 @@ static inline void __tlbsync(void) #define INVLPGB_FINAL_ONLY BIT(4) #define INVLPGB_INCLUDE_NESTED BIT(5) -static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, - unsigned long addr, - u16 nr, - bool pmd_stride) +static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, + bool pmd_stride) { __invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA); } /* Flush all mappings for a given PCID, not including globals. */ -static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +static inline void __invlpgb_flush_single_pcid_nosync(unsigned long pcid) { __invlpgb(0, pcid, 0, 1, 0, INVLPGB_PCID); } @@ -111,7 +111,7 @@ static inline void invlpgb_flush_all(void) } /* Flush addr, including globals, for all PCIDs. */ -static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +static inline void __invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) { __invlpgb(0, 0, addr, nr, 0, INVLPGB_INCLUDE_GLOBAL); } diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 811dd70eb6b8..22462bd4b1ee 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -105,6 +105,9 @@ struct tlb_state { * need to be invalidated. */ bool invalidate_other; +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + bool need_tlbsync; +#endif #ifdef CONFIG_ADDRESS_MASKING /* @@ -284,6 +287,16 @@ static inline bool in_asid_transition(struct mm_struct *mm) return mm && READ_ONCE(mm->context.asid_transition); } + +static inline bool cpu_need_tlbsync(void) +{ + return this_cpu_read(cpu_tlbstate.need_tlbsync); +} + +static inline void cpu_write_tlbsync(bool state) +{ + this_cpu_write(cpu_tlbstate.need_tlbsync, state); +} #else static inline u16 mm_global_asid(struct mm_struct *mm) { @@ -302,6 +315,15 @@ static inline bool in_asid_transition(struct mm_struct *mm) { return false; } + +static inline bool cpu_need_tlbsync(void) +{ + return false; +} + +static inline void cpu_write_tlbsync(bool state) +{ +} #endif #ifdef CONFIG_PARAVIRT @@ -351,21 +373,15 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) return atomic64_inc_return(&mm->context.tlb_gen); } -static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, - struct mm_struct *mm, - unsigned long uaddr) -{ - inc_mm_tlb_gen(mm); - cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); - mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); -} - static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm) { flush_tlb_mm(mm); } extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch); +extern void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm, + unsigned long uaddr); static inline bool pte_flags_need_flush(unsigned long oldflags, unsigned long newflags, diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index cd109bdf0dd9..4d56d22b9893 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -487,6 +487,37 @@ static void 
finish_asid_transition(struct flush_tlb_info *info) clear_asid_transition(mm); } +static inline void tlbsync(void) +{ + if (!cpu_need_tlbsync()) + return; + __tlbsync(); + cpu_write_tlbsync(false); +} + +static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, bool pmd_stride) +{ + __invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride); + if (!cpu_need_tlbsync()) + cpu_write_tlbsync(true); +} + +static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +{ + __invlpgb_flush_single_pcid_nosync(pcid); + if (!cpu_need_tlbsync()) + cpu_write_tlbsync(true); +} + +static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +{ + __invlpgb_flush_addr_nosync(addr, nr); + if (!cpu_need_tlbsync()) + cpu_write_tlbsync(true); +} + static void broadcast_tlb_flush(struct flush_tlb_info *info) { bool pmd = info->stride_shift == PMD_SHIFT; @@ -785,6 +816,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, if (IS_ENABLED(CONFIG_PROVE_LOCKING)) WARN_ON_ONCE(!irqs_disabled()); + tlbsync(); + /* * Verify that CR3 is what we think it is. This will catch * hypothetical buggy code that directly switches to swapper_pg_dir @@ -961,6 +994,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) { + tlbsync(); + if (this_cpu_read(cpu_tlbstate.loaded_mm) == &init_mm) return; @@ -1624,9 +1659,7 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. */ - if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { - invlpgb_flush_all_nonglobals(); - } else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { + if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { flush_tlb_multi(&batch->cpumask, info); } else if (cpumask_test_cpu(cpu, &batch->cpumask)) { lockdep_assert_irqs_enabled(); @@ -1635,12 +1668,52 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) local_irq_enable(); } + /* + * If we issued (asynchronous) INVLPGB flushes, wait for them here. + * The cpumask above contains only CPUs that were running tasks + * not using broadcast TLB flushing. + */ + tlbsync(); + cpumask_clear(&batch->cpumask); put_flush_tlb_info(); put_cpu(); } +void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm, + unsigned long uaddr) +{ + u16 asid = mm_global_asid(mm); + + if (asid) { + invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false); + /* Do any CPUs supporting INVLPGB need PTI? */ + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false); + + /* + * Some CPUs might still be using a local ASID for this + * process, and require IPIs, while others are using the + * global ASID. + * + * In this corner case we need to do both the broadcast + * TLB invalidation, and send IPIs. The IPIs will help + * stragglers transition to the broadcast ASID. 
+ */ + if (in_asid_transition(mm)) + asid = 0; + } + + if (!asid) { + inc_mm_tlb_gen(mm); + cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); + } + + mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); +} + /* * Blindly accessing user memory from NMI context can be dangerous * if we're in the middle of switching the current user task or
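The bookkeeping that arch_tlbbatch_add_pending() and arch_tlbbatch_flush() perform above reduces to: broadcast-capable mm, fire asynchronous flushes and owe one TLBSYNC; otherwise, collect CPUs for an IPI flush. A toy sketch of that flow follows; the types, masks and helpers are illustrative, and the asid_transition corner case is left out:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct mm {
	uint16_t global_asid;	/* 0: no global ASID */
	uint64_t cpu_set;	/* toy stand-in for mm_cpumask() */
};

struct flush_batch {
	uint64_t cpu_set;	/* CPUs that will need an IPI at flush time */
};

static bool need_tlbsync;	/* toy stand-in for the per-CPU flag */

static void async_broadcast_flush(uint16_t asid, unsigned long addr)
{
	/* Stands in for invlpgb_flush_user_nr_nosync(); completion is deferred. */
	printf("async INVLPGB: asid=%u addr=%#lx\n", (unsigned)asid, addr);
	need_tlbsync = true;
}

static void batch_add_pending(struct flush_batch *batch, struct mm *mm,
			      unsigned long uaddr)
{
	if (mm->global_asid)
		async_broadcast_flush(mm->global_asid, uaddr);
	else
		batch->cpu_set |= mm->cpu_set;	/* flush these CPUs by IPI later */
}

static void batch_flush(struct flush_batch *batch)
{
	if (batch->cpu_set)
		printf("IPI flush on cpumask %#lx\n", (unsigned long)batch->cpu_set);
	if (need_tlbsync) {
		printf("TLBSYNC\n");		/* wait for the async flushes */
		need_tlbsync = false;
	}
	batch->cpu_set = 0;
}

int main(void)
{
	struct flush_batch batch = { 0 };
	struct mm broadcast_mm = { .global_asid = 100, .cpu_set = 0xf0 };
	struct mm normal_mm    = { .global_asid = 0,   .cpu_set = 0x0f };

	batch_add_pending(&batch, &broadcast_mm, 0x7f0000001000);
	batch_add_pending(&batch, &normal_mm,    0x7f0000002000);
	batch_flush(&batch);
	return 0;
}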
linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com, mingo@kernel.org, Rik van Riel Subject: [PATCH v14 12/13] x86/mm: enable AMD translation cache extensions Date: Tue, 25 Feb 2025 22:00:47 -0500 Message-ID: <20250226030129.530345-13-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250226030129.530345-1-riel@surriel.com> References: <20250226030129.530345-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: B7CDBC000A X-Stat-Signature: 7r114xp9qmo3myr5o3abmzyp3i6i798j X-HE-Tag: 1740538959-847064 X-HE-Meta: U2FsdGVkX198jMy4rNQ+caKZSqcULFDDUNMYX/U8yxmVy/1IEE+W9tTl7aV6SyfRqVZNEaQHsv8656MESY2zFQXocPWmTXq5Y80vkBF1X1FaXj55qmdr+PepL+mxmw/6RsmcgHB28ER9DsY/JJFv76hhm6uOf4TWqpw20M2uOJ3wFRzqopOufPTgbYqvVRhvf5k5oyve3YiBl7d6vuS6TjKvph+PhHZ773N/pDQNAoW5jrgGUhN2Ur8b5X5DeMGToxoKjd5JzGhCtfKm9PjensBFSn1EAVdsUR/4h/DABoLqAhIAIFPF2DNT/jQp0qQrFZxny2ybMNfP/UqK4eJyhzKH/5eGBU7OP/4/3m1R+6+7EcLuqmUujRxpqyXKUt5l4TUgqwhjvafMgL09KAJSs/f7sAdBUjTK8jv9czWSBP27PIuv2Q0usdL9bsI5W9toyjel76WcMx52BHNfQW1ye4Af3vNUrwAm25cac8Lj4QJRezN3AHV7CMNwpdBMDHnf76j4a3wVWt1PCdOAwOaP3gw/oo0PhcCF/M4IxZutuDOB+UEKHpivsgYk80JAjheCOZQ973vHo5QIS2+0Lm6kgNcMYsUGBa3JqHwcJizZzA9vfQ7ZUlJ9QmZetGLcN8b9QI4UnIvB9mLAIbYUI35r/D5/8oPWf1Aj+Ph8KNMsyvryr11OCRBJrXQM3vOalQJd2FvKirSc7jw34xfKe2g3/fuBz/7LxUzUNjlv0VdDPxC+gPYXhe0rgXGs6cLcrXj7GVMuw7/OzFjKJELGLI0KM9JZD7NPTMlO9hh+GTW+i3n6zYRs2zubDMncTcpx8i7QEGvbF4+j8Ytg+jJFKXcoJ/2wy0BVKT2/KV4qgUAOaVi6J2EEi/BsgEmfjTrtaF3jzYKKYu2RPR2kL/bstN/xeWT+1Bg2O0xZ0d3mCKiCVzW70ilq+Kwt8B68AbMPGTUMeC+sfQuvO3mIgyxCOCQ 14CtbkDC FMFoKlMz9hEUDL9r8QkdQ0X4wVzT4LxdlbnzHCHPFjD6QOXYwUTHKYgGZlcKRgZHVmq3alh4Wn2s7aEOLGN9ul2R/dhdCu9A9Ky07cRKAxgCYJgP8+Rju/w8l76FOEE6ALfEehFfqMsKuQVz6uVmk0ncV+1QYN+lOFLJWYGjlCXR37zTuak562FxWEdrVgjn/3cArSwI62BUcZ9pT8pQcNNpkt9tEjjyH0QhaaxhPYtVjzujqBi5jzOxiGM2mSqeYcq2N0Zzub9HpzmXXSgBkMsaHIR2YCDA8FY17VYkHLwGhzOqnlo11bH+Ks18ESRvHAP6jdp486ori4VgA3fx/cHYC1RNlfMsOZ+JqV8tMJHPP/lJ5gzOh8tVj2LWpQjmcdrmouUxa5PiOX3uNw/MKlA68j/ONIkn7D59QgR4gZZ6AFz13aB+DiqDQxwNJ9O6qSMTNXtyH0z1z2+g= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: With AMD TCE (translation cache extensions) only the intermediate mappings that cover the address range zapped by INVLPG / INVLPGB get invalidated, rather than all intermediate mappings getting zapped at every TLB invalidation. This can help reduce the TLB miss rate, by keeping more intermediate mappings in the cache. From the AMD manual: Translation Cache Extension (TCE) Bit. Bit 15, read/write. Setting this bit to 1 changes how the INVLPG, INVLPGB, and INVPCID instructions operate on TLB entries. When this bit is 0, these instructions remove the target PTE from the TLB as well as all upper-level table entries that are cached in the TLB, whether or not they are associated with the target PTE. When this bit is set, these instructions will remove the target PTE and only those upper-level entries that lead to the target PTE in the page table hierarchy, leaving unrelated upper-level entries intact. 
Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/include/asm/msr-index.h | 2 ++ arch/x86/kernel/cpu/amd.c | 4 ++++ tools/arch/x86/include/asm/msr-index.h | 2 ++ 3 files changed, 8 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 9a71880eec07..a7ea9720ba3c 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -25,6 +25,7 @@ #define _EFER_SVME 12 /* Enable virtualization */ #define _EFER_LMSLE 13 /* Long Mode Segment Limit Enable */ #define _EFER_FFXSR 14 /* Enable Fast FXSAVE/FXRSTOR */ +#define _EFER_TCE 15 /* Enable Translation Cache Extensions */ #define _EFER_AUTOIBRS 21 /* Enable Automatic IBRS */ #define EFER_SCE (1<<_EFER_SCE) @@ -34,6 +35,7 @@ #define EFER_SVME (1<<_EFER_SVME) #define EFER_LMSLE (1<<_EFER_LMSLE) #define EFER_FFXSR (1<<_EFER_FFXSR) +#define EFER_TCE (1<<_EFER_TCE) #define EFER_AUTOIBRS (1<<_EFER_AUTOIBRS) /* diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 3c75c174a274..2bd512a1b4d0 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1075,6 +1075,10 @@ static void init_amd(struct cpuinfo_x86 *c) /* AMD CPUs don't need fencing after x2APIC/TSC_DEADLINE MSR writes. */ clear_cpu_cap(c, X86_FEATURE_APIC_MSRS_FENCE); + + /* Enable Translation Cache Extension */ + if (cpu_feature_enabled(X86_FEATURE_TCE)) + msr_set_bit(MSR_EFER, _EFER_TCE); } #ifdef CONFIG_X86_32 diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 3ae84c3b8e6d..dc1c1057f26e 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -25,6 +25,7 @@ #define _EFER_SVME 12 /* Enable virtualization */ #define _EFER_LMSLE 13 /* Long Mode Segment Limit Enable */ #define _EFER_FFXSR 14 /* Enable Fast FXSAVE/FXRSTOR */ +#define _EFER_TCE 15 /* Enable Translation Cache Extensions */ #define _EFER_AUTOIBRS 21 /* Enable Automatic IBRS */ #define EFER_SCE (1<<_EFER_SCE) @@ -34,6 +35,7 @@ #define EFER_SVME (1<<_EFER_SVME) #define EFER_LMSLE (1<<_EFER_LMSLE) #define EFER_FFXSR (1<<_EFER_FFXSR) +#define EFER_TCE (1<<_EFER_TCE) #define EFER_AUTOIBRS (1<<_EFER_AUTOIBRS) /*

From patchwork Wed Feb 26 03:00:48 2025
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
    dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
    thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
    akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com,
    mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali.Shukla@amd.com,
    mingo@kernel.org, Rik van Riel
Subject: [PATCH v14 13/13] x86/mm: only invalidate final translations with INVLPGB
Date: Tue, 25 Feb 2025 22:00:48 -0500
Message-ID: <20250226030129.530345-14-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250226030129.530345-1-riel@surriel.com>
References: <20250226030129.530345-1-riel@surriel.com>
MIME-Version: 1.0
Use the INVLPGB_FINAL_ONLY flag when invalidating mappings with INVLPGB.
This way only leaf mappings get removed from the TLB, leaving intermediate
translations cached.

On the (rare) occasions where we free page tables we do a full flush,
ensuring intermediate translations get flushed from the TLB.
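The decision comes down to one flag: request a leaf-only invalidation
unless page tables were freed. A minimal standalone sketch of that rule is
below; the INVLPGB_* bit values here are placeholders chosen for
illustration, not the kernel's real definitions (those come from earlier
patches in this series), and the authoritative version is the
__invlpgb_flush_user_nr_nosync() hunk that follows.

/*
 * Sketch only: restates the flag selection described above.  The bit
 * positions are placeholder values for illustration purposes.
 */
#include <stdbool.h>
#include <stdio.h>

#define INVLPGB_VA		(1u << 0)	/* placeholder value */
#define INVLPGB_PCID		(1u << 1)	/* placeholder value */
#define INVLPGB_FINAL_ONLY	(1u << 4)	/* placeholder value */

static unsigned int invlpgb_flags(bool freed_tables)
{
	unsigned int flags = INVLPGB_PCID | INVLPGB_VA;

	/*
	 * Leave upper-level (intermediate) translations cached unless
	 * page tables were freed, in which case they must go too.
	 */
	if (!freed_tables)
		flags |= INVLPGB_FINAL_ONLY;

	return flags;
}

int main(void)
{
	printf("range flush:        flags=%#x\n", invlpgb_flags(false));
	printf("page tables freed:  flags=%#x\n", invlpgb_flags(true));
	return 0;
}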
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/tlb.h | 10 ++++++++--
 arch/x86/mm/tlb.c          | 13 +++++++------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index e645884a1877..8d78667a2d1b 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -92,9 +92,15 @@ static inline void __tlbsync(void)
 static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						  unsigned long addr,
 						  u16 nr,
-						  bool pmd_stride)
+						  bool pmd_stride,
+						  bool freed_tables)
 {
-	__invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
+	u8 flags = INVLPGB_PCID | INVLPGB_VA;
+
+	if (!freed_tables)
+		flags |= INVLPGB_FINAL_ONLY;
+
+	__invlpgb(0, pcid, addr, nr, pmd_stride, flags);
 }
 
 /* Flush all mappings for a given PCID, not including globals. */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 4d56d22b9893..91680cfd5868 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -497,9 +497,10 @@ static inline void tlbsync(void)
 
 static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						unsigned long addr,
-						u16 nr, bool pmd_stride)
+						u16 nr, bool pmd_stride,
+						bool freed_tables)
 {
-	__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride);
+	__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride, freed_tables);
 	if (!cpu_need_tlbsync())
 		cpu_write_tlbsync(true);
 }
@@ -542,9 +543,9 @@ static void broadcast_tlb_flush(struct flush_tlb_info *info)
 			nr = clamp_val(nr, 1, invlpgb_count_max);
 		}
 
-		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd);
+		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd, info->freed_tables);
 		if (static_cpu_has(X86_FEATURE_PTI))
-			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd);
+			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd, info->freed_tables);
 
 		addr += nr << info->stride_shift;
 	} while (addr < info->end);
@@ -1688,10 +1689,10 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
 	u16 asid = mm_global_asid(mm);
 
 	if (asid) {
-		invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false);
+		invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false, false);
 		/* Do any CPUs supporting INVLPGB need PTI? */
 		if (static_cpu_has(X86_FEATURE_PTI))
-			invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false);
+			invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false, false);
 
 		/*
 		 * Some CPUs might still be using a local ASID for this