
[1/2] arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD

Message ID 20240116110221.420467-2-mark.rutland@arm.com (mailing list archive)
State New, archived
Series arm64: fix+cleanup for ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD

Commit Message

Mark Rutland Jan. 16, 2024, 11:02 a.m. UTC
Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
quite right, as it is supposed to be applied after the last explicit
memory access, but is immediately followed by an LDR.

The ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround is used to
handle Cortex-A520 erratum 2966298 and Cortex-A510 erratum 3117295,
which are described in:

* https://developer.arm.com/documentation/SDEN2444153/0600/?lang=en
* https://developer.arm.com/documentation/SDEN1873361/1600/?lang=en

In both cases the workaround is described as:

| If pagetable isolation is disabled, the context switch logic in the
| kernel can be updated to execute the following sequence on affected
| cores before exiting to EL0, and after all explicit memory accesses:
|
| 1. A non-shareable TLBI to any context and/or address, including
|    unused contexts or addresses, such as a `TLBI VALE1 Xzr`.
|
| 2. A DSB NSH to guarantee completion of the TLBI.

The important part is that the TLBI+DSB must be placed "after all
explicit memory accesses".

Unfortunately, as implemented, the TLBI+DSB is immediately followed by
an LDR, as we have:

| alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
| 	tlbi	vale1, xzr
| 	dsb	nsh
| alternative_else_nop_endif
| alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
| 	ldr	lr, [sp, #S_LR]
| 	add	sp, sp, #PT_REGS_SIZE		// restore sp
| 	eret
| alternative_else_nop_endif
|
| [ ... KPTI exception return path ... ]

Fix this by reworking the logic to place the TLBI+DSB immediately
before the ERET, after all explicit memory accesses.

The ERET is currently in a separate alternative block, and alternatives
cannot be nested. To account for this, the alternative block for
ARM64_UNMAP_KERNEL_AT_EL0 is replaced with a single alternative branch
to skip the KPTI logic, with the new shape of the logic being:

| alternative_insn "b .L_skip_tramp_exit_\@", nop, ARM64_UNMAP_KERNEL_AT_EL0
| 	[ ... KPTI exception return path ... ]
| .L_skip_tramp_exit_\@:
|
| 	ldr	lr, [sp, #S_LR]
| 	add	sp, sp, #PT_REGS_SIZE		// restore sp
|
| alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
| 	tlbi	vale1, xzr
| 	dsb	nsh
| alternative_else_nop_endif
| 	eret
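
For clarity, here is a sketch of how that alternative_insn behaves at
runtime (per the usual arm64 alternatives semantics, where the first
instruction is the default and is patched to the second when the
capability is detected):

| // Default: ARM64_UNMAP_KERNEL_AT_EL0 not detected (KPTI disabled)
| 	b	.L_skip_tramp_exit_\@		// skip the KPTI exception return path
|
| // Patched at boot when ARM64_UNMAP_KERNEL_AT_EL0 is detected (KPTI enabled)
| 	nop					// fall through into the KPTI exception return path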

The new structure means that the workaround is only applied when KPTI is
not in use; this is fine as noted in the documented implications of the
erratum:

| Pagetable isolation between EL0 and higher level ELs prevents the
| issue from occurring.

... and as per the workaround description quoted above, the workaround
is only necessary "If pagetable isolation is disabled".

Fixes: 471470bc7052d28c ("arm64: errata: Add Cortex-A520 speculative unprivileged load workaround")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
---
 arch/arm64/kernel/entry.S | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

Comments

Rob Herring (Arm) Jan. 19, 2024, 3:11 p.m. UTC | #1
On Tue, Jan 16, 2024 at 5:02 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
> quite right, as it is supposed to be applied after the last explicit
> memory access, but is immediately followed by an LDR.

This isn't necessary. The LDR in question is an unprivileged load from
the EL0 stack. The erratum write-up is not really clear in that
regard.

It's the same as the KPTI case. After switching the page tables, there
are unprivileged loads from the EL0 stack.

Rob
Mark Rutland Jan. 22, 2024, 11:16 a.m. UTC | #2
Hi Rob,

Sorry for the confusion here; I should have synced up with you before sending
this out.

On Fri, Jan 19, 2024 at 09:11:33AM -0600, Rob Herring wrote:
> On Tue, Jan 16, 2024 at 5:02 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
> > quite right, as it is supposed to be applied after the last explicit
> > memory access, but is immediately followed by an LDR.
> 
> This isn't necessary. The LDR in question is an unprivileged load from
> the EL0 stack. The erratum write-up is not really clear in that
> regard.

I see from internal notes that the rationale is that the LDR in question only
loads data that EL0 already has in its registers, and hence it doesn't matter
if that data is leaked to EL0. That's reasonable, but we didn't note that
anywhere (i.e. neither in the commit message nor in any comments).

To avoid confusion, the LDR in question *is* a privileged load (whereas an LDTR
at EL1 would be an unprivileged load); for memory accesses the architecture
uses the terms privileged and unprivileged to distinguish the way those are
handled by the MMU.
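
As a purely illustrative sketch (not taken from the patch), the distinction at
the instruction level is:

	ldr	x0, [sp, #S_LR]		// privileged load: permission-checked as an EL1 access
	ldtr	x0, [x1]		// unprivileged load: permission-checked as if issued from EL0

... where x1 is just an arbitrary example base register; the LDR quoted in the
commit message is the former kind.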

I agree that given the rationale above this patch isn't strictly necessary, but
I would prefer the result of these two patches, as it's less likely that we'll
add loads of sensitive information in future as this code is changed.

> It's the same as the KPTI case. After switching the page tables, there
> are unprivileged loads from the EL0 stack.

I'm not sure what you mean here; maybe I'm missing something?

AFAICT we don't do any loads within the kernel after switching the translation
tables. In tramp_exit we follow tramp_unmap_kernel with:

	MRS; ERET; SB

... and in __sdei_asm_exit_trampoline we follow tramp_unmap_kernel with:

	CMP; B.NE; {HVC,SMC}; B .

... so there are no explicit loads at EL1 before the ERET to EL0. In the SDEI
case any loads at a higher EL don't matter because they're in a different
translation regime.

Thanks,
Mark.

Patch

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 544ab46649f3f..7fcbee0f6c0e4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -428,16 +428,9 @@  alternative_else_nop_endif
 	ldp	x28, x29, [sp, #16 * 14]
 
 	.if	\el == 0
-alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
-	tlbi	vale1, xzr
-	dsb	nsh
-alternative_else_nop_endif
-alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
-	ldr	lr, [sp, #S_LR]
-	add	sp, sp, #PT_REGS_SIZE		// restore sp
-	eret
-alternative_else_nop_endif
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	alternative_insn "b .L_skip_tramp_exit_\@", nop, ARM64_UNMAP_KERNEL_AT_EL0
+
 	msr	far_el1, x29
 
 	ldr_this_cpu	x30, this_cpu_vector, x29
@@ -446,7 +439,18 @@  alternative_else_nop_endif
 	ldr		lr, [sp, #S_LR]		// restore x30
 	add		sp, sp, #PT_REGS_SIZE	// restore sp
 	br		x29
+
+.L_skip_tramp_exit_\@:
 #endif
+	ldr	lr, [sp, #S_LR]
+	add	sp, sp, #PT_REGS_SIZE		// restore sp
+
+	/* This must be after the last explicit memory access */
+alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
+	tlbi	vale1, xzr
+	dsb	nsh
+alternative_else_nop_endif
+	eret
 	.else
 	ldr	lr, [sp, #S_LR]
 	add	sp, sp, #PT_REGS_SIZE		// restore sp