| Message ID | 20240116110221.420467-2-mark.rutland@arm.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | arm64: fix+cleanup for ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD |
On Tue, Jan 16, 2024 at 5:02 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
> quite right, as it is supposed to be applied after the last explicit
> memory access, but is immediately followed by an LDR.

This isn't necessary. The LDR in question is an unprivileged load from
the EL0 stack. The erratum write-up is not really clear in that regard.

It's the same as the KPTI case. After switching the page tables, there
are unprivileged loads from the EL0 stack.

Rob
Hi Rob,

Sorry for the confusion here; I should have synced up with you before
sending this out.

On Fri, Jan 19, 2024 at 09:11:33AM -0600, Rob Herring wrote:
> On Tue, Jan 16, 2024 at 5:02 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
> > quite right, as it is supposed to be applied after the last explicit
> > memory access, but is immediately followed by an LDR.
>
> This isn't necessary. The LDR in question is an unprivileged load from
> the EL0 stack. The erratum write-up is not really clear in that
> regard.

I see from internal notes that the rationale is that the LDR in question
only loads data that EL0 already has in its registers, and hence it
doesn't matter if that data is leaked to EL0. That's reasonable, but we
didn't note that anywhere (i.e. neither in the commit message nor in any
comments).

To avoid confusion, the LDR in question *is* a privileged load (whereas
an LDTR at EL1 would be an unprivileged load); for memory accesses the
architecture uses the terms privileged and unprivileged to distinguish
the way those are handled by the MMU.

I agree that given the rationale above this patch isn't strictly
necessary, but I would prefer the result of these two patches as it's
less likely that we'll add loads of sensitive information in future as
this code is changed.

> It's the same as the KPTI case. After switching the page tables, there
> are unprivileged loads from the EL0 stack.

I'm not sure what you mean here; maybe I'm missing something?

AFAICT we don't do any loads within the kernel after switching the
translation tables. In tramp_exit we follow tramp_unmap_kernel with:

	MRS; ERET; SB

... and in __sdei_asm_exit_trampoline we follow tramp_unmap_kernel with:

	CMP; B.NE; {HVC,SMC}; B .

... so there are no explicit loads at EL1 before the ERET to EL0. In the
SDEI case any loads at a higher EL don't matter because they're in a
different translation regime.

Thanks,
Mark.
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 544ab46649f3f..7fcbee0f6c0e4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -428,16 +428,9 @@ alternative_else_nop_endif
 	ldp	x28, x29, [sp, #16 * 14]

 	.if	\el == 0
-alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
-	tlbi	vale1, xzr
-	dsb	nsh
-alternative_else_nop_endif
-alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
-	ldr	lr, [sp, #S_LR]
-	add	sp, sp, #PT_REGS_SIZE	// restore sp
-	eret
-alternative_else_nop_endif
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	alternative_insn "b .L_skip_tramp_exit_\@", nop, ARM64_UNMAP_KERNEL_AT_EL0
+
 	msr	far_el1, x29

 	ldr_this_cpu	x30, this_cpu_vector, x29
@@ -446,7 +439,18 @@ alternative_else_nop_endif
 	ldr	lr, [sp, #S_LR]		// restore x30
 	add	sp, sp, #PT_REGS_SIZE	// restore sp
 	br	x29
+
+.L_skip_tramp_exit_\@:
 #endif
+	ldr	lr, [sp, #S_LR]
+	add	sp, sp, #PT_REGS_SIZE	// restore sp
+
+	/* This must be after the last explicit memory access */
+alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
+	tlbi	vale1, xzr
+	dsb	nsh
+alternative_else_nop_endif
+
+	eret
 	.else
 	ldr	lr, [sp, #S_LR]
 	add	sp, sp, #PT_REGS_SIZE	// restore sp
Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't
quite right, as it is supposed to be applied after the last explicit
memory access, but is immediately followed by an LDR.

The ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround is used to
handle Cortex-A520 erratum 2966298 and Cortex-A510 erratum 3117295,
which are described in:

* https://developer.arm.com/documentation/SDEN2444153/0600/?lang=en
* https://developer.arm.com/documentation/SDEN1873361/1600/?lang=en

In both cases the workaround is described as:

| If pagetable isolation is disabled, the context switch logic in the
| kernel can be updated to execute the following sequence on affected
| cores before exiting to EL0, and after all explicit memory accesses:
|
| 1. A non-shareable TLBI to any context and/or address, including
|    unused contexts or addresses, such as a `TLBI VALE1 Xzr`.
|
| 2. A DSB NSH to guarantee completion of the TLBI.

The important part being that the TLBI+DSB must be placed "after all
explicit memory accesses".

Unfortunately, as-implemented, the TLBI+DSB is immediately followed by
an LDR, as we have:

| alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
| 	tlbi	vale1, xzr
| 	dsb	nsh
| alternative_else_nop_endif
| alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
| 	ldr	lr, [sp, #S_LR]
| 	add	sp, sp, #PT_REGS_SIZE	// restore sp
| 	eret
| alternative_else_nop_endif
|
| [ ... KPTI exception return path ... ]

This patch fixes this by reworking the logic to place the TLBI+DSB
immediately before the ERET, after all explicit memory accesses.

The ERET is currently in a separate alternative block, and alternatives
cannot be nested. To account for this, the alternative block for
ARM64_UNMAP_KERNEL_AT_EL0 is replaced with a single alternative branch
to skip the KPTI logic, with the new shape of the logic being:

| alternative_insn "b .L_skip_tramp_exit_\@", nop, ARM64_UNMAP_KERNEL_AT_EL0
|
| [ ... KPTI exception return path ... ]
|
| .L_skip_tramp_exit_\@:
|
| 	ldr	lr, [sp, #S_LR]
| 	add	sp, sp, #PT_REGS_SIZE	// restore sp
|
| alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
| 	tlbi	vale1, xzr
| 	dsb	nsh
| alternative_else_nop_endif
|
| 	eret

The new structure means that the workaround is only applied when KPTI is
not in use; this is fine as noted in the documented implications of the
erratum:

| Pagetable isolation between EL0 and higher level ELs prevents the
| issue from occurring.

... and as per the workaround description quoted above, the workaround
is only necessary "If pagetable isolation is disabled".

Fixes: 471470bc7052d28c ("arm64: errata: Add Cortex-A520 speculative unprivileged load workaround")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
---
 arch/arm64/kernel/entry.S | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)
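[Editor's note] The invariant the patch enforces — the TLBI+DSB pair must come after every explicit memory access in the exception-return path — can be modeled as a toy checker over a symbolic instruction list. This is purely illustrative Python (not kernel code, and not part of the patch); the mnemonic set and helper name are invented for the sketch:

```python
# Toy model (illustrative only, NOT kernel code): check that no explicit
# memory access (LDR/LDP/STR/STP) appears after the "tlbi; dsb" pair in
# a symbolic exception-return sequence.
MEM_ACCESS = {"ldr", "ldp", "str", "stp"}

def workaround_is_last_access(seq):
    """Return True if the tlbi/dsb pair is present and is not followed
    by any explicit memory access."""
    mnemonics = [insn.split()[0] for insn in seq]
    if "dsb" not in mnemonics:
        return False  # workaround not present at all
    after_dsb = mnemonics[mnemonics.index("dsb") + 1:]
    return not any(m in MEM_ACCESS for m in after_dsb)

# Before the patch: the TLBI+DSB is immediately followed by an LDR.
before = ["tlbi vale1, xzr", "dsb nsh",
          "ldr lr, [sp, #S_LR]", "add sp, sp, #PT_REGS_SIZE", "eret"]

# After the patch: all loads happen first, TLBI+DSB sits just before ERET.
after = ["ldr lr, [sp, #S_LR]", "add sp, sp, #PT_REGS_SIZE",
         "tlbi vale1, xzr", "dsb nsh", "eret"]
```

Running the checker on the two sequences reflects the fix: the pre-patch ordering violates the erratum's "after all explicit memory accesses" requirement, the post-patch ordering satisfies it.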