From patchwork Mon Dec 30 17:53:01 2024
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13923395
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
 akpm@linux-foundation.org, nadav.amit@gmail.com,
 zhengqi.arch@bytedance.com, linux-mm@kvack.org
Subject: [PATCH v3 00/12] AMD broadcast TLB invalidation
Date: Mon, 30 Dec 2024 12:53:01 -0500
Message-ID: <20241230175550.4046587-1-riel@surriel.com>

Subject: [RFC PATCH 00/10] AMD broadcast TLB invalidation

Add support for broadcast TLB invalidation using AMD's INVLPGB
instruction.

This allows the kernel to invalidate TLB entries on remote CPUs
without needing to send IPIs, without having to wait for remote
CPUs to handle those interrupts, and with less interruption to
what was running on those CPUs.

Because x86 PCID space is limited, and there are some very large
systems out there, broadcast TLB invalidation is only used for
processes that are active on 3 or more CPUs, with the threshold
being gradually increased the more the PCID space gets exhausted.

Combined with the removal of unnecessary lru_add_drain calls
(see https://lkml.org/lkml/2024/12/19/1388) this results in a
nice performance boost for the will-it-scale tlb_flush2_threads
test on an AMD Milan system with 36 cores:

- vanilla kernel:           527k loops/second
- lru_add_drain removal:    731k loops/second
- only INVLPGB:             527k loops/second
- lru_add_drain + INVLPGB: 1157k loops/second

Profiling with only the INVLPGB changes showed that while TLB
invalidation went down from 40% of the total CPU time to only
around 4% of CPU time, the contention simply moved to the LRU lock.

Fixing both at the same time about doubles the number of
iterations per second for this case.

v3:
 - Remove paravirt tlb_remove_table call (thank you Qi Zheng)
 - More suggested cleanups and changelog fixes by Peter and Nadav
v2:
 - Apply suggestions by Peter and Borislav (thank you!)
 - Fix bug in arch_tlbbatch_flush, where we need to do both
   the TLBSYNC, and flush the CPUs that are in the cpumask.
 - Some updates to comments and changelogs based on questions.
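The PCID-conserving policy described in the cover letter (broadcast
only for processes active on 3 or more CPUs, with the threshold rising
as PCID space fills up) could be sketched roughly as below. This is an
illustration only, not code from the series: the function names, the
MAX_EXTRA_CPUS constant, and the linear growth rule are all assumptions
made for the example. Only the base threshold of 3 comes from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* From the cover letter: broadcast only for processes on 3 or more CPUs. */
#define BASE_CPU_THRESHOLD 3u

/* Hypothetical: extra CPUs required once the ID space is completely full. */
#define MAX_EXTRA_CPUS 12u

/*
 * Hypothetical policy: raise the threshold linearly with the fraction
 * of the broadcast ID space that is already in use.
 */
static unsigned int broadcast_cpu_threshold(unsigned int ids_in_use,
                                            unsigned int ids_total)
{
    assert(ids_total > 0 && ids_in_use <= ids_total);
    return BASE_CPU_THRESHOLD + (MAX_EXTRA_CPUS * ids_in_use) / ids_total;
}

/* Decide whether a process should use broadcast invalidation. */
static bool should_use_broadcast(unsigned int active_cpus,
                                 unsigned int ids_in_use,
                                 unsigned int ids_total)
{
    /* No free ID left: keep using IPI-based flushing (assumed fallback). */
    if (ids_in_use >= ids_total)
        return false;

    return active_cpus >= broadcast_cpu_threshold(ids_in_use, ids_total);
}
```

With an assumed space of 4096 IDs (the size of the 12-bit x86 PCID
field), a process running on 3 CPUs would qualify while the space is
empty, but would need to be active on 9 CPUs once half of the IDs are
taken under this made-up growth rule.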