From patchwork Fri Feb 28 18:24:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mikołaj Lenczewski
X-Patchwork-Id: 13996922
From: Mikołaj Lenczewski
To: ryan.roberts@arm.com, suzuki.poulose@arm.com,
 yang@os.amperecomputing.com, catalin.marinas@arm.com, will@kernel.org,
 joro@8bytes.org, jean-philippe@linaro.org, mark.rutland@arm.com,
 joey.gouly@arm.com, oliver.upton@linux.dev, james.morse@arm.com,
 broonie@kernel.org, maz@kernel.org, david@redhat.com,
 akpm@linux-foundation.org, jgg@ziepe.ca, nicolinc@nvidia.com,
 mshavit@google.com, jsnitsel@redhat.com, smostafa@google.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 iommu@lists.linux.dev
Cc: Mikołaj Lenczewski
Subject: [PATCH v2 0/4] Initial BBML2 support for contpte_convert()
Date: Fri, 28 Feb 2025 18:24:00 +0000
Message-ID: <20250228182403.6269-2-miko.lenczewski@arm.com>
X-Mailer: git-send-email 2.45.3

Hi All,

This patch series adds initial support for eliding break-before-make
requirements on systems that support BBML2 and additionally guarantee
never to raise a conflict abort. This support reorders and optionally
elides a TLB invalidation in contpte_convert(). Eliding that
invalidation yields a 12% improvement when executing a microbenchmark
designed to force the pathological path where contpte_convert() gets
called.
This represents an 80% reduction in the cost of calling
contpte_convert(). However, even without the elision, the reordering
represents a performance improvement due to reducing thread contention,
as there is a smaller time window for racing threads to see an invalid
pagetable entry (especially if they already have a cached entry in their
TLB that they are working off of).

This series is based on v6.14-rc3 (0ad2507d5d93).

Patch 1 implements an allow-list of cpus that support BBML2, but with
the additional constraint of never causing TLB conflict aborts. We
settled on this constraint because we will use the feature for kernel
mappings in the future, for which we cannot handle conflict aborts
safely.

Yang Shi has a series at [1] that aims to use BBML2 to enable splitting
the linear map at runtime. This series partially overlaps with it to add
the cpu feature. We believe this series is fully compatible with Yang's
requirements and could go first, given there is still a lot of
discussion around the best way to manage the mechanics of
splitting/collapsing the linear map.

[1]: https://lore.kernel.org/linux-arm-kernel/20250103011822.1257189-1-yang@os.amperecomputing.com/

Mikołaj Lenczewski (4):
  arm64: Add BBM Level 2 cpu feature
  arm64/mm: Delay tlbi in contpte_convert() under BBML2
  arm64/mm: Elide tlbi in contpte_convert() under BBML2
  iommu/arm: Add BBM Level 2 smmu feature

 arch/arm64/Kconfig                            | 11 +++
 arch/arm64/include/asm/cpucaps.h              |  2 +
 arch/arm64/include/asm/cpufeature.h           |  5 ++
 arch/arm64/kernel/cpufeature.c                | 68 +++++++++++++++++++
 arch/arm64/mm/contpte.c                       |  3 +-
 arch/arm64/tools/cpucaps                      |  1 +
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  3 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  4 ++
 9 files changed, 99 insertions(+), 1 deletion(-)