From patchwork Wed Apr 12 12:55:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ivan T . Ivanov" X-Patchwork-Id: 13209149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A6AE2C77B6E for ; Wed, 12 Apr 2023 12:56:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-Id:Date:Subject:Cc :To:From:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=WUciSXNZH2AsDOUCrquR7uc35EAEJEe94V2+msTMILQ=; b=sNZoadpHURH66H Qm7RETs8VLtd7hvbI1jMZ78K7Oe9iYoA6E5hk+/UrkiNZTj4/hziT3E4mof67le+GPlCsILqz9q3o ffxpeUlL5v7ov8gkSPgMQe5sU2RWjuLUbahdem559s0BVDcMXLqSq7mpNuKPiL1YAKR+3oL/X0QXs DeglfMlDhgTo71HT1XN+b2lCSVdVh9SPPHBfOMyFc3AaEeBeq8p6a9g2KREfyJu29pXtaLj68vQtM ZO5Bkre2jyOpe5SE02/lTK7T3GxV4I/QU+V7REAg7VDXiiViLV3X1jntpOaB+TFkjKF/IxF9o6BiX qmXswhR/STQvhhojNdeQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1pma0P-003EDU-2G; Wed, 12 Apr 2023 12:55:21 +0000 Received: from smtp-out1.suse.de ([195.135.220.28]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1pma0L-003ECK-1o for linux-arm-kernel@lists.infradead.org; Wed, 12 Apr 2023 12:55:19 +0000 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange 
X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 6C998211C3; Wed, 12 Apr 2023 12:55:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1681304113; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=xkcI6j+Je9OflxGdMTR2cCyPb3ZQMgvVoAd4g3hqRUQ=; b=NbEY1d+jEy7CiVoeIp8bFpKh4HErOuJAHS5UMY1UtSQTBBQx6H9BfFnPHzy0OLN8WI2Rg6 gRq4eBjWybfE3xaP0OpQoG05W7a6hlJlILGfembKfi6cAKnCc+uOGroTnOFBJTgx3qn+T+ k+m8+YowgezddYrHLlYXX+x87gqZ//w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1681304113; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=xkcI6j+Je9OflxGdMTR2cCyPb3ZQMgvVoAd4g3hqRUQ=; b=WS+nFuTmilHxkIydP3yDZnj7W76h+dbIRQ41Uz+OvxYDN8TnwpMwA+KH9YEG91Fq/tPca4 4dSZy4kn9aQj0UBw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 5B676132C7; Wed, 12 Apr 2023 12:55:13 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 8or7FTGqNmQWZQAAMHmgww (envelope-from ); Wed, 12 Apr 2023 12:55:13 +0000 From: "Ivan T. Ivanov" To: Catalin Marinas , Will Deacon Cc: Mark Brown , Mark Rutland , Shawn Guo , Dong Aisheng , linux-arm-kernel@lists.infradead.org, linux-imx@nxp.com, "Ivan T. 
Ivanov" Subject: [PATCH] arm64: errata: Add NXP iMX8QM workaround for A53 Cache coherency issue Date: Wed, 12 Apr 2023 15:55:06 +0300 Message-Id: <20230412125506.21634-1-iivanov@suse.de> X-Mailer: git-send-email 2.35.3 MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230412_055517_755918_DF4C2199 X-CRM114-Status: GOOD ( 19.88 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

According to the NXP errata document [1], the i.MX8QuadMax SoC suffers
from a serious cache coherency issue. It was also mentioned in the
initial support [2] for the imx8qm mek machine.

I chose to use the ALTERNATIVE() framework instead of the downstream
solution [3] for this issue, in the hope of reducing the effect of this
fix on unaffected platforms.

Unfortunately I was unable to find a way to identify the SoC ID using
registers. The boot CPU MIDR_EL1 is equal to 0x410fd034. So I fall back
to using devicetree compatible strings for this.

I know this fix is a suboptimal solution for affected machines, but I
haven't been able to come up with a less intrusive fix. I hope that once
the TLB caches are invalidated, any immediate attempt to invalidate them
again (flush_tlb_kernel_range()) will be close to a NOP operation.

I have run a few simple benchmarks and perf tests on affected and
unaffected machines and I was not able to see any obvious issues. iMX8QM
"performance" nearly doubled with the 2 A72 cores brought online.

The following is an excerpt from the NXP IMX8_1N94W "Mask Set Errata"
document, Rev. 5, 3/2023, just in case it gets lost somehow.
---
"ERR050104: Arm/A53: Cache coherency issue"

Description

Some maintenance operations exchanged between the A53 and A72 core
clusters, involving some Translation Look-aside Buffer Invalidate (TLBI)
and Instruction Cache (IC) instructions can be corrupted. The upper bits,
above bit-35, of ARADDR and ACADDR buses within the Arm A53 sub-system
have been incorrectly connected. Therefore ARADDR and ACADDR address bits
above bit-35 should not be used.

Workaround

The following software instructions are required to be downgraded to
TLBI VMALLE1IS: TLBI ASIDE1, TLBI ASIDE1IS, TLBI VAAE1, TLBI VAAE1IS,
TLBI VAALE1, TLBI VAALE1IS, TLBI VAE1, TLBI VAE1IS, TLBI VALE1,
TLBI VALE1IS

The following software instructions are required to be downgraded to
TLBI VMALLS12E1IS: TLBI IPAS2E1IS, TLBI IPAS2LE1IS

The following software instructions are required to be downgraded to
TLBI ALLE2IS: TLBI VAE2IS, TLBI VALE2IS.

The following software instructions are required to be downgraded to
TLBI ALLE3IS: TLBI VAE3IS, TLBI VALE3IS.

The following software instructions are required to be downgraded to
TLBI VMALLE1IS when the Force Broadcast (FB) bit [9] of the Hypervisor
Configuration Register (HCR_EL2) is set: TLBI ASIDE1, TLBI VAAE1,
TLBI VAALE1, TLBI VAE1, TLBI VALE1

The following software instruction is required to be downgraded to
IC IALLUIS: IC IVAU, Xt

Specifically for the IC IVAU, Xt downgrade, setting SCTLR_EL1.UCI to 0
will disable EL0 access to this instruction. Any attempt to execute from
EL0 will generate an EL1 trap, where the downgrade to IC IALLUIS can be
implemented.
--

[1] https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf
[2] 307fd14d4b14 ("arm64: dts: imx: add imx8qm mek support")
[3] https://github.com/nxp-imx/linux-imx/blob/lf-6.1.y/arch/arm64/include/asm/tlbflush.h#L19

Signed-off-by: Ivan T.
Ivanov
---
 Documentation/arm64/silicon-errata.rst |  2 ++
 arch/arm64/Kconfig                     | 10 ++++++++++
 arch/arm64/include/asm/cpufeature.h    |  3 ++-
 arch/arm64/include/asm/tlbflush.h      |  6 +++++-
 arch/arm64/kernel/cpu_errata.c         | 18 ++++++++++++++++++
 arch/arm64/kernel/traps.c              | 22 +++++++++++++++++++++-
 arch/arm64/tools/cpucaps               |  1 +
 7 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index ec5f889d7681..fce231797184 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -175,6 +175,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
 +----------------+-----------------+-----------------+-----------------------------+
+| Freescale/NXP  | i.MX 8QuadMax   | ERR050104       | NXP_IMX8QM_ERRATUM_ERR050104|
++----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
 | Hisilicon      | Hip0{5,6,7}     | #161010101      | HISILICON_ERRATUM_161010101 |
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1023e896d46b..437cb53f8753 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1159,6 +1159,16 @@ config SOCIONEXT_SYNQUACER_PREITS
 
 	  If unsure, say Y.
 
+config NXP_IMX8QM_ERRATUM_ERR050104
+	bool "NXP iMX8QM: Workaround for Arm/A53 Cache coherency issue"
+	default n
+	help
+	  Some maintenance operations exchanged between the A53 and A72 core
+	  clusters, involving some Translation Look-aside Buffer Invalidate
+	  (TLBI) and Instruction Cache (IC) instructions can be corrupted.
+
+	  If unsure, say N.
+
 endmenu # "ARM errata workarounds via the alternatives framework"
 
 choice
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 6bf013fb110d..1ed648f7f29a 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -835,7 +835,8 @@ static inline bool system_supports_bti(void)
 static inline bool system_supports_tlb_range(void)
 {
 	return IS_ENABLED(CONFIG_ARM64_TLB_RANGE) &&
-		cpus_have_const_cap(ARM64_HAS_TLB_RANGE);
+		cpus_have_const_cap(ARM64_HAS_TLB_RANGE) &&
+		!cpus_have_const_cap(ARM64_WORKAROUND_NXP_ERR050104);
 }
 
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..12055b859ce3 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -37,7 +37,11 @@
 			    : : )
 
 #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE			\
-			       "tlbi " #op ", %0\n"			\
+		   ALTERNATIVE("nop\n nop\n tlbi " #op ", %0",		\
+			       "tlbi vmalle1is\n dsb ish\n isb",	\
+			       ARM64_WORKAROUND_NXP_ERR050104)		\
+			    : : "r" (arg));				\
+			  asm (ARM64_ASM_PREAMBLE			\
 		   ALTERNATIVE("nop\n nop",				\
 			       "dsb ish\n tlbi " #op ", %0",		\
 			       ARM64_WORKAROUND_REPEAT_TLBI,		\
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 307faa2b4395..7b702a79bf60 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -55,6 +56,14 @@ is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
 	return model == entry->midr_range.model;
 }
 
+static bool __maybe_unused
+is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	WARN_ON(preemptible());
+
+	return of_machine_is_compatible("fsl,imx8qm");
+}
+
 static bool
 has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
 			  int scope)
@@ -729,6 +738,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
 		.cpu_enable = cpu_clear_bf16_from_user_emulation,
 	},
+#endif
+#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
+	{
+		.desc = "NXP A53 cache coherency issue",
+		.capability = ARM64_WORKAROUND_NXP_ERR050104,
+		.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
+		.matches = is_imx8qm_soc,
+		.cpu_enable = cpu_enable_cache_maint_trap,
+	},
 #endif
 	{
 	}
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 4a79ba100799..4858f8c86fd5 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -529,6 +529,26 @@ void do_el1_fpac(struct pt_regs *regs, unsigned long esr)
 		uaccess_ttbr0_disable();			\
 	}
 
+#define __user_instruction_cache_maint(address, res)		\
+do {								\
+	if (address >= TASK_SIZE_MAX) {				\
+		res = -EFAULT;					\
+	} else {						\
+		uaccess_ttbr0_enable();				\
+		asm volatile (					\
+			"1:\n"					\
+			ALTERNATIVE(" ic ivau, %1\n",		\
+				    " ic ialluis\n",		\
+				    ARM64_WORKAROUND_NXP_ERR050104)	\
+			" mov %w0, #0\n"			\
+			"2:\n"					\
+			_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)	\
+			: "=r" (res)				\
+			: "r" (address));			\
+		uaccess_ttbr0_disable();			\
+	}							\
+} while (0)
+
 static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
 {
 	unsigned long tagged_address, address;
@@ -556,7 +576,7 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
 		__user_cache_maint("dc civac", address, ret);
 		break;
 	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
-		__user_cache_maint("ic ivau", address, ret);
+		__user_instruction_cache_maint(address, ret);
 		break;
 	default:
 		force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 37b1340e9646..e225f1cd1005 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -90,3 +90,4 @@ WORKAROUND_NVIDIA_CARMEL_CNP
 WORKAROUND_QCOM_FALKOR_E1003
 WORKAROUND_REPEAT_TLBI
 WORKAROUND_SPECULATIVE_AT
+WORKAROUND_NXP_ERR050104
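
The commit message says the boot CPU MIDR_EL1 reads 0x410fd034 and that
registers alone did not allow the SoC to be identified. As a minimal
sketch of why that value is not enough, the standalone C below splits a
MIDR_EL1 value into its architectural fields. The names midr_fields,
midr_decode and midr_is_cortex_a53 are hypothetical and are not part of
this patch or of the kernel. The field layout, the Arm implementer code
0x41 and the Cortex-A53 part number 0xd03 are assumed from the Arm
architecture and Cortex-A53 documentation, not from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 fields (hypothetical helper type, for illustration only). */
struct midr_fields {
	uint8_t  implementer;	/* bits [31:24] */
	uint8_t  variant;	/* bits [23:20] */
	uint8_t  architecture;	/* bits [19:16] */
	uint16_t partnum;	/* bits [15:4]  */
	uint8_t  revision;	/* bits [3:0]   */
};

/* Split a raw MIDR_EL1 value into its fields. */
static struct midr_fields midr_decode(uint32_t midr)
{
	struct midr_fields f = {
		.implementer  = (uint8_t)((midr >> 24) & 0xff),
		.variant      = (uint8_t)((midr >> 20) & 0xf),
		.architecture = (uint8_t)((midr >> 16) & 0xf),
		.partnum      = (uint16_t)((midr >> 4) & 0xfff),
		.revision     = (uint8_t)(midr & 0xf),
	};

	return f;
}

/* True for any Arm Ltd. Cortex-A53, whatever SoC it is integrated in. */
static bool midr_is_cortex_a53(uint32_t midr)
{
	struct midr_fields f = midr_decode(midr);

	return f.implementer == 0x41 && f.partnum == 0xd03;
}
```

Decoding 0x410fd034 this way gives implementer 0x41, variant 0, part
number 0xd03 and revision 4, that is, a stock Cortex-A53 r0p4. None of
these fields is specific to the i.MX8QM, which is consistent with the
patch matching on the "fsl,imx8qm" compatible string instead of on MIDR.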