From patchwork Thu Feb 13 16:14:02 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13973611
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Rik van Riel, Manali Shukla
Subject: [PATCH v11 11/12] x86/mm: enable AMD translation cache extensions
Date: Thu, 13 Feb 2025 11:14:02 -0500
Message-ID: <20250213161423.449435-12-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>

With AMD TCE (translation cache extensions) only the intermediate mappings
that cover the address range zapped by INVLPG / INVLPGB get invalidated,
rather than all intermediate mappings getting zapped at every TLB
invalidation.
This can help reduce the TLB miss rate, by keeping more intermediate
mappings in the cache.

From the AMD manual:

Translation Cache Extension (TCE) Bit. Bit 15, read/write. Setting this bit
to 1 changes how the INVLPG, INVLPGB, and INVPCID instructions operate on
TLB entries. When this bit is 0, these instructions remove the target PTE
from the TLB as well as all upper-level table entries that are cached in
the TLB, whether or not they are associated with the target PTE. When this
bit is set, these instructions will remove the target PTE and only those
upper-level entries that lead to the target PTE in the page table
hierarchy, leaving unrelated upper-level entries intact.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/msr-index.h       | 2 ++
 arch/x86/kernel/cpu/amd.c              | 4 ++++
 tools/arch/x86/include/asm/msr-index.h | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9a71880eec07..a7ea9720ba3c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -25,6 +25,7 @@
 #define _EFER_SVME		12 /* Enable virtualization */
 #define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
+#define _EFER_TCE		15 /* Enable Translation Cache Extensions */
 #define _EFER_AUTOIBRS		21 /* Enable Automatic IBRS */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -34,6 +35,7 @@
 #define EFER_SVME		(1<<_EFER_SVME)
 #define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
+#define EFER_TCE		(1<<_EFER_TCE)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
 /*
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 3e8180354303..38f454671c88 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1075,6 +1075,10 @@ static void init_amd(struct cpuinfo_x86 *c)
 
 	/* AMD CPUs don't need fencing after x2APIC/TSC_DEADLINE MSR writes. */
 	clear_cpu_cap(c, X86_FEATURE_APIC_MSRS_FENCE);
+
+	/* Enable Translation Cache Extension */
+	if (cpu_feature_enabled(X86_FEATURE_TCE))
+		msr_set_bit(MSR_EFER, _EFER_TCE);
 }
 
 #ifdef CONFIG_X86_32
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..dc1c1057f26e 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -25,6 +25,7 @@
 #define _EFER_SVME		12 /* Enable virtualization */
 #define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
+#define _EFER_TCE		15 /* Enable Translation Cache Extensions */
 #define _EFER_AUTOIBRS		21 /* Enable Automatic IBRS */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -34,6 +35,7 @@
 #define EFER_SVME		(1<<_EFER_SVME)
 #define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
+#define EFER_TCE		(1<<_EFER_TCE)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
 /*
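
To illustrate what the amd.c hunk does in isolation, here is a minimal
userspace sketch. It is not kernel code and does not touch any real MSR:
EFER is modeled as a plain 64-bit value. The names struct fake_cpu,
fake_msr_set_bit() and fake_init_amd() are hypothetical and exist only for
this example, and the has_tce field is an assumed stand-in for the
cpu_feature_enabled(X86_FEATURE_TCE) check. Only the bit position (15) is
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position and mask, mirroring the defines added by the patch. */
#define _EFER_TCE	15
#define EFER_TCE	(1ULL << _EFER_TCE)

/*
 * Hypothetical model of one CPU. Real code cannot access EFER from
 * userspace; this struct only imitates the state involved.
 */
struct fake_cpu {
	int has_tce;	/* assumed stand-in for the X86_FEATURE_TCE check */
	uint64_t efer;	/* assumed stand-in for the MSR_EFER contents */
};

/*
 * Set one bit in a 64-bit register image.
 * Returns 1 if the bit was changed, 0 if it was already set.
 */
static int fake_msr_set_bit(uint64_t *reg, unsigned int bit)
{
	uint64_t mask;

	assert(bit < 64);
	mask = 1ULL << bit;
	if (*reg & mask)
		return 0;
	*reg |= mask;
	return 1;
}

/* Models the added hunk: enable TCE only when the CPU advertises it. */
static void fake_init_amd(struct fake_cpu *c)
{
	if (c->has_tce)
		fake_msr_set_bit(&c->efer, _EFER_TCE);
}
```

Calling fake_init_amd() on a CPU image with has_tce set turns on bit 15
(mask 0x8000) and leaves the other bits of efer untouched. With has_tce
clear, efer is left unchanged.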