From patchwork Tue Mar 25 09:36:23 2025
X-Patchwork-Submitter: Mikołaj Lenczewski
X-Patchwork-Id: 14028251
From: Mikołaj Lenczewski
To: ryan.roberts@arm.com, suzuki.poulose@arm.com,
 yang@os.amperecomputing.com, corbet@lwn.net, catalin.marinas@arm.com,
 will@kernel.org, jean-philippe@linaro.org, robin.murphy@arm.com,
 joro@8bytes.org, akpm@linux-foundation.org, ardb@kernel.org,
 mark.rutland@arm.com, joey.gouly@arm.com, maz@kernel.org,
 james.morse@arm.com, broonie@kernel.org, oliver.upton@linux.dev,
 baohua@kernel.org, david@redhat.com, ioworker0@gmail.com, jgg@ziepe.ca,
 nicolinc@nvidia.com, mshavit@google.com, jsnitsel@redhat.com,
 smostafa@google.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 iommu@lists.linux.dev
Cc: Mikołaj Lenczewski
Subject: [PATCH v5 1/3] arm64: Add BBM Level 2 cpu feature
Date: Tue, 25 Mar 2025 09:36:23 +0000
Message-ID: <20250325093625.55184-2-miko.lenczewski@arm.com>
In-Reply-To: <20250325093625.55184-1-miko.lenczewski@arm.com>
References: <20250325093625.55184-1-miko.lenczewski@arm.com>

The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
and this commit adds a dedicated BBML2 cpufeature to test against
support for, as well as a kernel commandline parameter to optionally
disable BBML2 altogether.
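
As a rough illustration of the two mechanisms named above, here is a
minimal user-space sketch (not the kernel implementation). It models the
ID-register half of the capability as "the BBM field of ID_AA64MMFR2_EL1
is at least 2", and models arm64.nobbml2 as an override that forces that
field to 0. The struct, the helper names and the field position (bits
[55:52]) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout: BBM occupies bits [55:52] of ID_AA64MMFR2_EL1. */
#define BBM_SHIFT 52
#define BBM_MASK  (UINT64_C(0xf) << BBM_SHIFT)

/* Hypothetical stand-in for an ID register override (value plus mask). */
struct idreg_override {
	uint64_t val;
	uint64_t mask;
};

/* Models "arm64.nobbml2", i.e. "id_aa64mmfr2.bbm=0": force BBM to 0. */
static struct idreg_override nobbml2_override(void)
{
	struct idreg_override o = { .val = 0, .mask = BBM_MASK };
	return o;
}

/* Masked bits come from the override, all other bits from the hardware. */
static uint64_t apply_override(uint64_t hw, struct idreg_override o)
{
	return (hw & ~o.mask) | (o.val & o.mask);
}

/* Extract the advertised BBM level from a register value. */
static unsigned int bbm_level(uint64_t mmfr2)
{
	return (unsigned int)((mmfr2 & BBM_MASK) >> BBM_SHIFT);
}

/* ID-register half of the check only; the MIDR allow-list is separate. */
static bool idreg_claims_bbml2(uint64_t mmfr2)
{
	return bbm_level(mmfr2) >= 2;
}
```

In this model a CPU advertising level 2 passes idreg_claims_bbml2()
until the override is applied, after which it reads as level 0 and the
feature is treated as absent.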
This is a system feature as we might have a big.LITTLE architecture
where some cores support BBML2 and some don't, but we want all cores to
be available and BBM to default to level 0 (as opposed to having cores
without BBML2 not coming online).

To support BBML2 in as wide a range of contexts as we can, we want not
only the architectural guarantees that BBML2 makes, but additionally
want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
us having to prove that no recursive faults can be induced in any path
that uses BBML2, allowing its use for arbitrary kernel mappings.
Support detection of such CPUs.

Signed-off-by: Mikołaj Lenczewski
Reviewed-by: Suzuki K Poulose
Reviewed-by: Ryan Roberts
---
 .../admin-guide/kernel-parameters.txt |  3 +
 arch/arm64/Kconfig                    | 19 +++++
 arch/arm64/include/asm/cpucaps.h      |  2 +
 arch/arm64/include/asm/cpufeature.h   |  5 ++
 arch/arm64/kernel/cpufeature.c        | 71 +++++++++++++++++++
 arch/arm64/kernel/pi/idreg-override.c |  2 +
 arch/arm64/tools/cpucaps              |  1 +
 7 files changed, 103 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8..3e4cc917a07e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -453,6 +453,9 @@
 	arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
 			32 bit applications.
 
+	arm64.nobbml2 [ARM64] Unconditionally disable Break-Before-Make Level
+			2 support
+
 	arm64.nobti [ARM64] Unconditionally disable Branch Target
 			Identification support
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 940343beb3d4..db63e0d83492 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2057,6 +2057,25 @@ config ARM64_TLB_RANGE
 	  The feature introduces new assembly instructions, and they were
 	  support when binutils >= 2.30.
 
+config ARM64_BBML2_NOABORT
+	bool "Enable support for Break-Before-Make Level 2 detection and usage"
+	default y
+	help
+	  FEAT_BBM provides detection of support levels for break-before-make
+	  sequences. If BBM level 2 is supported, some TLB maintenance requirements
+	  can be relaxed to improve performance. We additionally require the
+	  property that the implementation cannot ever raise TLB Conflict Aborts.
+	  Selecting N causes the kernel to fall back to BBM level 0 behaviour
+	  even if the system supports BBM level 2.
+
+	  To enable detection of BBML2 support, and to make use of it, say Y.
+
+	  Detection of and support for BBM level 2 can optionally be overridden
+	  at runtime via the use of the arm64.nobbml2 kernel commandline
+	  parameter. If your system claims support for BBML2, but is unstable
+	  with this option enabled, either say N or make use of the commandline
+	  parameter override to force BBML0.
+
 endmenu # "ARMv8.4 architectural features"
 
 menu "ARMv8.5 architectural features"
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 0b5ca6e0eb09..2d6db33d4e45 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
 		return IS_ENABLED(CONFIG_ARM64_PAN);
 	case ARM64_HAS_EPAN:
 		return IS_ENABLED(CONFIG_ARM64_EPAN);
+	case ARM64_HAS_BBML2_NOABORT:
+		return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
 	case ARM64_SVE:
 		return IS_ENABLED(CONFIG_ARM64_SVE);
 	case ARM64_SME:
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index e0e4478f5fb5..108ef3fbbc00 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -866,6 +866,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
 	return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
 }
 
+static inline bool system_supports_bbml2_noabort(void)
+{
+	return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d561cf3b8ac7..832b86fca542 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2176,6 +2176,70 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
 	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
 }
 
+static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
+{
+	/*
+	 * We want to allow usage of bbml2 in as wide a range of kernel contexts
+	 * as possible. This list is therefore an allow-list of known-good
+	 * implementations that both support bbml2 and additionally fulfill the
+	 * extra constraint of never generating TLB conflict aborts when using
+	 * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
+	 * kernel contexts difficult to prove safe against recursive aborts).
+	 *
+	 * Note that implementations can only be considered "known-good" if their
+	 * implementors attest to the fact that the implementation never raises
+	 * TLB conflict aborts for bbml2 mapping granularity changes.
+	 */
+	static const struct midr_range supports_bbml2_noabort_list[] = {
+		MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
+		MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
+		{}
+	};
+
+	return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
+}
+
+static inline unsigned int cpu_read_midr(int cpu)
+{
+	WARN_ON_ONCE(!cpu_online(cpu));
+
+	return per_cpu(cpu_data, cpu).reg_midr;
+}
+
+static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
+{
+	if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
+		return false;
+
+	if (scope & SCOPE_SYSTEM) {
+		int cpu;
+
+		/*
+		 * We are a boot CPU, and must verify that all enumerated boot
+		 * CPUs have MIDR values within our allowlist. Otherwise, we do
+		 * not allow the BBML2 feature to avoid potential faults when
+		 * the insufficient CPUs access memory regions using BBML2
+		 * semantics.
+		 */
+		for_each_online_cpu(cpu) {
+			if (!cpu_has_bbml2_noabort(cpu_read_midr(cpu)))
+				return false;
+		}
+	} else if (scope & SCOPE_LOCAL_CPU) {
+		/*
+		 * We are a hot-plugged CPU, so must only check our MIDR.
+		 * If we have the correct MIDR, but the kernel booted on an
+		 * insufficient CPU, we will not use BBML2 (this is safe). If
+		 * we have an incorrect MIDR, but the kernel booted on a
+		 * sufficient CPU, we will not bring up this CPU.
+		 */
+		if (!cpu_has_bbml2_noabort(read_cpuid_id()))
+			return false;
+	}
+
+	return has_cpuid_feature(caps, scope);
+}
+
 #ifdef CONFIG_ARM64_PAN
 static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
 {
@@ -2926,6 +2990,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
 	},
+	{
+		.desc = "BBM Level 2 without conflict abort",
+		.capability = ARM64_HAS_BBML2_NOABORT,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_bbml2_noabort,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, BBM, 2)
+	},
 	{
 		.desc = "52-bit Virtual Addressing for KVM (LPA2)",
 		.capability = ARM64_HAS_LPA2,
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index c6b185b885f7..803a0c99f7b4 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
 	.override = &id_aa64mmfr2_override,
 	.fields = {
 		FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
+		FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
 		{}
 	},
 };
@@ -246,6 +247,7 @@ static const struct {
 	{ "rodata=off", "arm64_sw.rodataoff=1" },
 	{ "arm64.nolva", "id_aa64mmfr2.varange=0" },
 	{ "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
+	{ "arm64.nobbml2", "id_aa64mmfr2.bbm=0" },
 };
 
 static int
__init parse_hexdigit(const char *p, u64 *v)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 1e65f2fb45bd..b03a375e5507 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -14,6 +14,7 @@ HAS_ADDRESS_AUTH_ARCH_QARMA5
 HAS_ADDRESS_AUTH_IMP_DEF
 HAS_AMU_EXTN
 HAS_ARMv8_4_TTL
+HAS_BBML2_NOABORT
 HAS_CACHE_DIC
 HAS_CACHE_IDC
 HAS_CNP

From patchwork Tue Mar 25 09:36:24 2025
X-Patchwork-Submitter: Mikołaj Lenczewski
X-Patchwork-Id: 14028255
From: Mikołaj Lenczewski
Subject: [PATCH v5 2/3] iommu/arm: Add BBM Level 2 smmu feature
Date: Tue, 25 Mar 2025 09:36:24 +0000
Message-ID: <20250325093625.55184-3-miko.lenczewski@arm.com>
In-Reply-To: <20250325093625.55184-1-miko.lenczewski@arm.com>
References: <20250325093625.55184-1-miko.lenczewski@arm.com>

For supporting BBM Level 2 for userspace mappings, we want to ensure
that the smmu also supports its own version of BBM Level 2. Luckily, the
smmu spec (IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI
0487K.a D8.16.2), so already guarantees that no aborts are raised when
BBM level 2 is claimed.

Add the feature and testing for it under arm_smmu_sva_supported().

Signed-off-by: Mikołaj Lenczewski
Reviewed-by: Robin Murphy
Reviewed-by: Ryan Roberts
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c     | 3 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h     | 4 ++++
 3 files changed, 10 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 9ba596430e7c..6ba182572788 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -222,6 +222,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 		feat_mask |= ARM_SMMU_FEAT_VAX;
 	}
 
+	if (system_supports_bbml2_noabort())
+		feat_mask |= ARM_SMMU_FEAT_BBML2;
+
 	if ((smmu->features & feat_mask) != feat_mask)
 		return false;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 358072b4e293..dcee0bdec924 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4406,6 +4406,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 	if (FIELD_GET(IDR3_RIL, reg))
 		smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
 
+	if ((reg & IDR3_BBML) == IDR3_BBML2)
+		smmu->features |= ARM_SMMU_FEAT_BBML2;
+
 	/* IDR5 */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index bd9d7c85576a..85eaf3ab88c2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -60,6 +60,9 @@ struct arm_smmu_device;
 #define ARM_SMMU_IDR3 0xc
 #define IDR3_FWB (1 << 8)
 #define IDR3_RIL (1 << 10)
+#define IDR3_BBML GENMASK(12, 11)
+#define IDR3_BBML1 (1 << 11)
+#define IDR3_BBML2 (2 << 11)
 
 #define ARM_SMMU_IDR5 0x14
 #define IDR5_STALL_MAX GENMASK(31, 16)
@@ -754,6 +757,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_HA (1 << 21)
 #define ARM_SMMU_FEAT_HD (1 << 22)
 #define ARM_SMMU_FEAT_S2FWB (1 << 23)
+#define ARM_SMMU_FEAT_BBML2 (1 << 24)
 	u32 features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)

From patchwork Tue Mar 25 09:36:25 2025
X-Patchwork-Submitter: Mikołaj Lenczewski
X-Patchwork-Id: 14028256
From: Mikołaj Lenczewski
Subject: [PATCH v5 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2
Date: Tue, 25 Mar 2025 09:36:25 +0000
Message-ID: <20250325093625.55184-4-miko.lenczewski@arm.com>
In-Reply-To: <20250325093625.55184-1-miko.lenczewski@arm.com>
References: <20250325093625.55184-1-miko.lenczewski@arm.com>

When converting a region via contpte_convert() to use mTHP, we have two
different goals. We have to mark each entry as contiguous, and we would
like to smear the dirty and young (access) bits across all entries in
the contiguous block. Currently, we do this by first accumulating the
dirty and young bits in the block, using an atomic
__ptep_get_and_clear() and the relevant pte_{dirty,young}() calls,
performing a tlbi, and finally smearing the correct bits across the
block using __set_ptes().

This approach works fine for BBM level 0, but with support for BBM level
2 we are allowed to reorder the tlbi to after setting the pagetable
entries. We expect the time cost of a tlbi to be much greater than the
cost of clearing and resetting the PTEs. As such, this reordering of the
tlbi outside the window where our PTEs are invalid greatly reduces the
duration for which the PTEs are visibly invalid to other threads. This
reduces the likelihood of a concurrent page walk finding an invalid PTE,
reducing the likelihood of a fault in other threads, and improving
performance (more so when there are more threads).

Because we support via allowlist only bbml2 implementations that never
raise conflict aborts and instead invalidate the tlb entries
automatically in hardware, we can avoid the final flush altogether.
Avoiding flushes is a win.

Signed-off-by: Mikołaj Lenczewski
Reviewed-by: Ryan Roberts
Reviewed-by: David Hildenbrand
---
 arch/arm64/mm/contpte.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 55107d27d3f8..77ed03b30b72 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -68,7 +68,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
 			pte = pte_mkyoung(pte);
 	}
 
-	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
+	if (!system_supports_bbml2_noabort())
+		__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
 
 	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
 }
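
The effect of the hunk above can be sketched with a small user-space
model. This is only an illustration under stated assumptions: struct
model_pte, flush_count, the block size of 16 and both function names are
hypothetical stand-ins, and no real page tables or TLB invalidation are
involved. It shows the clear-and-accumulate pass, the flush that is now
skipped when BBML2 without conflict aborts is available, and the final
pass that smears the accumulated bits across the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CONT_PTES 16 /* assumed block size for the model */

/* Hypothetical stand-in for a page-table entry. */
struct model_pte {
	bool valid;
	bool dirty;
	bool young;
	bool cont;
};

static unsigned int flush_count; /* counts modelled TLB flushes */

static void model_flush_tlb_range(void)
{
	flush_count++;
}

/*
 * Clear every entry while accumulating the dirty and young bits, flush
 * only when BBML2-without-abort is unavailable, then rewrite all entries
 * as contiguous with the accumulated bits.
 */
static void model_contpte_convert(struct model_pte *ptes, bool bbml2_noabort)
{
	bool dirty = false, young = false;
	size_t i;

	for (i = 0; i < CONT_PTES; i++) {
		dirty |= ptes[i].dirty;
		young |= ptes[i].young;
		ptes[i] = (struct model_pte){ 0 }; /* get-and-clear */
	}

	if (!bbml2_noabort)
		model_flush_tlb_range();

	for (i = 0; i < CONT_PTES; i++)
		ptes[i] = (struct model_pte){
			.valid = true, .dirty = dirty,
			.young = young, .cont = true,
		};
}
```

Calling model_contpte_convert() with bbml2_noabort set to false bumps
flush_count once per conversion; with it set to true the counter is left
untouched while the resulting entries are identical.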