Message ID | 20230912121120.380420-2-robh@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] arm64: Add Cortex-A520 CPU part definition | expand |
On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: > Implement the workaround for ARM Cortex-A520 erratum 2966298. On an > affected Cortex-A520 core, a speculatively executed unprivileged load > might leak data from a privileged level via a cache side channel. > > The workaround is to execute a TLBI before returning to EL0. A > non-shareable TLBI to any address is sufficient. Can you elaborate at all on how this works, please? A TLBI addressing a cache side channel feels weird (or is "cache" referring to some TLB structures rather than e.g. the data cache here?). Assuming there's some vulnerable window between the speculative unprivileged load and the completion of the TLBI, what prevents another CPU from observing the side-channel during that time? Also, does the TLBI need to be using the same ASID as the unprivileged load? If so, then a context-switch could widen the vulnerable window quite significantly. Anyway, hopefully I'm barking up the wrong tree, but it would be helpful to have some intuition behind the workaround in order to review (and maintain) this patch. Cheers, Will
On 2023-09-18 11:01, Will Deacon wrote: > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an >> affected Cortex-A520 core, a speculatively executed unprivileged load >> might leak data from a privileged level via a cache side channel. >> >> The workaround is to execute a TLBI before returning to EL0. A >> non-shareable TLBI to any address is sufficient. > > Can you elaborate at all on how this works, please? A TLBI addressing a > cache side channel feels weird (or is "cache" referring to some TLB > structures rather than e.g. the data cache here?). > > Assuming there's some vulnerable window between the speculative > unprivileged load and the completion of the TLBI, what prevents another > CPU from observing the side-channel during that time? Also, does the > TLBI need to be using the same ASID as the unprivileged load? If so, > then > a context-switch could widen the vulnerable window quite significantly. Another 'interesting' case is the KVM world switch. If EL0 is affected, what about EL1? Can such a data leak exist cross-EL1, or from EL2 to El1? Asking for a friend... M.
On Mon, Sep 18, 2023 at 5:01 AM Will Deacon <will@kernel.org> wrote: > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: > > Implement the workaround for ARM Cortex-A520 erratum 2966298. On an > > affected Cortex-A520 core, a speculatively executed unprivileged load > > might leak data from a privileged level via a cache side channel. > > > > The workaround is to execute a TLBI before returning to EL0. A > > non-shareable TLBI to any address is sufficient. > > Can you elaborate at all on how this works, please? Here's the write-up if you haven't read that already: https://developer.arm.com/documentation/SDEN-2444153/0500/?lang=en > A TLBI addressing a > cache side channel feels weird (or is "cache" referring to some TLB > structures rather than e.g. the data cache here?). AIUI, the TLBI is simply enough to ensure the permission check happens on the speculative load. It has nothing to do with actual TLB contents. This core has FEAT_E0PD and FEAT_CSV3 which should mitigate this scenario, but this case is a narrow uarch condition which bypasses those checks. > Assuming there's some vulnerable window between the speculative > unprivileged load and the completion of the TLBI, what prevents another > CPU from observing the side-channel during that time? The cache hit is private to the core. How would another core observe that? > Also, does the > TLBI need to be using the same ASID as the unprivileged load? If so, then > a context-switch could widen the vulnerable window quite significantly. No, the TLBI can be any context and/or address including unused addresses. Rob
On Mon, Sep 18, 2023 at 5:18 AM Marc Zyngier <maz@misterjones.org> wrote: > > On 2023-09-18 11:01, Will Deacon wrote: > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: > >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an > >> affected Cortex-A520 core, a speculatively executed unprivileged load > >> might leak data from a privileged level via a cache side channel. > >> > >> The workaround is to execute a TLBI before returning to EL0. A > >> non-shareable TLBI to any address is sufficient. > > > > Can you elaborate at all on how this works, please? A TLBI addressing a > > cache side channel feels weird (or is "cache" referring to some TLB > > structures rather than e.g. the data cache here?). > > > > Assuming there's some vulnerable window between the speculative > > unprivileged load and the completion of the TLBI, what prevents another > > CPU from observing the side-channel during that time? Also, does the > > TLBI need to be using the same ASID as the unprivileged load? If so, > > then > > a context-switch could widen the vulnerable window quite significantly. > > Another 'interesting' case is the KVM world switch. If EL0 is > affected, what about EL1? Can such a data leak exist cross-EL1, > or from EL2 to El1? Asking for a friend... I'm checking for a definitive answer, but page table isolation also avoids the issue. Wouldn't these scenarios all be similar to page table isolation in that the EL2 or prior EL1 context is unmapped? Rob
On Tue, 19 Sep 2023 13:29:07 +0100, Rob Herring <robh@kernel.org> wrote: > > On Mon, Sep 18, 2023 at 5:18 AM Marc Zyngier <maz@misterjones.org> wrote: > > > > On 2023-09-18 11:01, Will Deacon wrote: > > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: > > >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an > > >> affected Cortex-A520 core, a speculatively executed unprivileged load > > >> might leak data from a privileged level via a cache side channel. > > >> > > >> The workaround is to execute a TLBI before returning to EL0. A > > >> non-shareable TLBI to any address is sufficient. > > > > > > Can you elaborate at all on how this works, please? A TLBI addressing a > > > cache side channel feels weird (or is "cache" referring to some TLB > > > structures rather than e.g. the data cache here?). > > > > > > Assuming there's some vulnerable window between the speculative > > > unprivileged load and the completion of the TLBI, what prevents another > > > CPU from observing the side-channel during that time? Also, does the > > > TLBI need to be using the same ASID as the unprivileged load? If so, > > > then > > > a context-switch could widen the vulnerable window quite significantly. > > > > Another 'interesting' case is the KVM world switch. If EL0 is > > affected, what about EL1? Can such a data leak exist cross-EL1, > > or from EL2 to El1? Asking for a friend... > > I'm checking for a definitive answer, but page table isolation also > avoids the issue. Wouldn't these scenarios all be similar to page > table isolation in that the EL2 or prior EL1 context is unmapped? No, EL2 is always mapped, and we don't have anything like KPTI there. Maybe the saving grace is that EL2 and EL2&0 are different translation regimes from EL1&0, but there's nothing in the commit message that indicates it. As for EL1-to-EL1 leaks, it again completely depends on how the TLBs are tagged. You'd hope that having different VMIDs would save the bacon, but if you can leak EL1 translations into EL0, it means that the associated permission and/or tags do not contain all the required information... M.
On Tue, Sep 19, 2023 at 7:50 AM Marc Zyngier <maz@kernel.org> wrote: > > On Tue, 19 Sep 2023 13:29:07 +0100, > Rob Herring <robh@kernel.org> wrote: > > > > On Mon, Sep 18, 2023 at 5:18 AM Marc Zyngier <maz@misterjones.org> wrote: > > > > > > On 2023-09-18 11:01, Will Deacon wrote: > > > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: > > > >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an > > > >> affected Cortex-A520 core, a speculatively executed unprivileged load > > > >> might leak data from a privileged level via a cache side channel. > > > >> > > > >> The workaround is to execute a TLBI before returning to EL0. A > > > >> non-shareable TLBI to any address is sufficient. > > > > > > > > Can you elaborate at all on how this works, please? A TLBI addressing a > > > > cache side channel feels weird (or is "cache" referring to some TLB > > > > structures rather than e.g. the data cache here?). > > > > > > > > Assuming there's some vulnerable window between the speculative > > > > unprivileged load and the completion of the TLBI, what prevents another > > > > CPU from observing the side-channel during that time? Also, does the > > > > TLBI need to be using the same ASID as the unprivileged load? If so, > > > > then > > > > a context-switch could widen the vulnerable window quite significantly. > > > > > > Another 'interesting' case is the KVM world switch. If EL0 is > > > affected, what about EL1? Can such a data leak exist cross-EL1, > > > or from EL2 to El1? Asking for a friend... > > > > I'm checking for a definitive answer, but page table isolation also > > avoids the issue. Wouldn't these scenarios all be similar to page > > table isolation in that the EL2 or prior EL1 context is unmapped? > > No, EL2 is always mapped, and we don't have anything like KPTI there. > > Maybe the saving grace is that EL2 and EL2&0 are different translation > regimes from EL1&0, but there's nothing in the commit message that > indicates it. As for EL1-to-EL1 leaks, it again completely depends on > how the TLBs are tagged. Different translation regimes are not affected. It must be the same regime and same translation. > You'd hope that having different VMIDs would save the bacon, but if > you can leak EL1 translations into EL0, it means that the associated > permission and/or tags do not contain all the required information... The VMID is part of the equation. See here[1]. Rob [1] https://developer.arm.com/documentation/102517/0001/Memory-management/Translation-Lookaside-Buffer-match-process?lang=en
On Wed, 20 Sep 2023 17:47:35 +0100, Rob Herring <robh@kernel.org> wrote: > > On Tue, Sep 19, 2023 at 7:50 AM Marc Zyngier <maz@kernel.org> wrote: > > > > On Tue, 19 Sep 2023 13:29:07 +0100, > > Rob Herring <robh@kernel.org> wrote: > > > > > > On Mon, Sep 18, 2023 at 5:18 AM Marc Zyngier <maz@misterjones.org> wrote: > > > > > > > > On 2023-09-18 11:01, Will Deacon wrote: > > > > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote: > > > > >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an > > > > >> affected Cortex-A520 core, a speculatively executed unprivileged load > > > > >> might leak data from a privileged level via a cache side channel. > > > > >> > > > > >> The workaround is to execute a TLBI before returning to EL0. A > > > > >> non-shareable TLBI to any address is sufficient. > > > > > > > > > > Can you elaborate at all on how this works, please? A TLBI addressing a > > > > > cache side channel feels weird (or is "cache" referring to some TLB > > > > > structures rather than e.g. the data cache here?). > > > > > > > > > > Assuming there's some vulnerable window between the speculative > > > > > unprivileged load and the completion of the TLBI, what prevents another > > > > > CPU from observing the side-channel during that time? Also, does the > > > > > TLBI need to be using the same ASID as the unprivileged load? If so, > > > > > then > > > > > a context-switch could widen the vulnerable window quite significantly. > > > > > > > > Another 'interesting' case is the KVM world switch. If EL0 is > > > > affected, what about EL1? Can such a data leak exist cross-EL1, > > > > or from EL2 to El1? Asking for a friend... > > > > > > I'm checking for a definitive answer, but page table isolation also > > > avoids the issue. Wouldn't these scenarios all be similar to page > > > table isolation in that the EL2 or prior EL1 context is unmapped? > > > > No, EL2 is always mapped, and we don't have anything like KPTI there. > > > > Maybe the saving grace is that EL2 and EL2&0 are different translation > > regimes from EL1&0, but there's nothing in the commit message that > > indicates it. As for EL1-to-EL1 leaks, it again completely depends on > > how the TLBs are tagged. > > Different translation regimes are not affected. It must be the same > regime and same translation. It would be good to capture this, then. > > > You'd hope that having different VMIDs would save the bacon, but if > > you can leak EL1 translations into EL0, it means that the associated > > permission and/or tags do not contain all the required information... > > The VMID is part of the equation. See here[1]. I have a pretty good idea of how TLB are *supposed* to behave. The fact that you need some sort of invalidation on ERET to EL0 is the proof that this CPU doesn't follow these rules to the letter... M.
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst index e96f057ea2a0..f47f63bcf67c 100644 --- a/Documentation/arch/arm64/silicon-errata.rst +++ b/Documentation/arch/arm64/silicon-errata.rst @@ -71,6 +71,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A510 | #2658417 | ARM64_ERRATUM_2658417 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A520 | #2966298 | ARM64_ERRATUM_2966298 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 | diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index b10515c0200b..78f20e632712 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1037,6 +1037,19 @@ config ARM64_ERRATUM_2645198 If unsure, say Y. +config ARM64_ERRATUM_2966298 + bool "Cortex-A520: 2966298: workaround for speculatively executed unprivileged load" + default y + help + This option adds the workaround for ARM Cortex-A520 erratum 2966298. + + On an affected Cortex-A520 core, a speculatively executed unprivileged + load might leak data from a privileged level via a cache side channel. + + Work around this problem by executing a TLBI before returning to EL0. + + If unsure, say Y. + config CAVIUM_ERRATUM_22375 bool "Cavium erratum 22375, 24313" default y diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index be66e94a21bd..5706e74c5578 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -730,6 +730,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .cpu_enable = cpu_clear_bf16_from_user_emulation, }, #endif +#ifdef CONFIG_ARM64_ERRATUM_2966298 + { + .desc = "ARM erratum 2966298", + .capability = ARM64_WORKAROUND_2966298, + /* Cortex-A520 r0p0 - r0p1 */ + ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A520, 0, 0, 1), + }, +#endif #ifdef CONFIG_AMPERE_ERRATUM_AC03_CPU_38 { .desc = "AmpereOne erratum AC03_CPU_38", diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 6ad61de03d0a..a6030913cd58 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -428,6 +428,10 @@ alternative_else_nop_endif ldp x28, x29, [sp, #16 * 14] .if \el == 0 +alternative_if ARM64_WORKAROUND_2966298 + tlbi vale1, xzr + dsb nsh +alternative_else_nop_endif alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0 ldr lr, [sp, #S_LR] add sp, sp, #PT_REGS_SIZE // restore sp diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index c3f06fdef609..dea3dc89234b 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -84,6 +84,7 @@ WORKAROUND_2077057 WORKAROUND_2457168 WORKAROUND_2645198 WORKAROUND_2658417 +WORKAROUND_2966298 WORKAROUND_AMPERE_AC03_CPU_38 WORKAROUND_TRBE_OVERWRITE_FILL_MODE WORKAROUND_TSB_FLUSH_FAILURE
Implement the workaround for ARM Cortex-A520 erratum 2966298. On an affected Cortex-A520 core, a speculatively executed unprivileged load might leak data from a privileged level via a cache side channel. The workaround is to execute a TLBI before returning to EL0. A non-shareable TLBI to any address is sufficient. The workaround isn't necessary if page table isolation (KPTI) is enabled, but for simplicity it will be. Page table isolation should normally be disabled for Cortex-A520 as it supports the CSV3 feature and the E0PD feature (used when KASLR is enabled). Cc: stable@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> --- Documentation/arch/arm64/silicon-errata.rst | 2 ++ arch/arm64/Kconfig | 13 +++++++++++++ arch/arm64/kernel/cpu_errata.c | 8 ++++++++ arch/arm64/kernel/entry.S | 4 ++++ arch/arm64/tools/cpucaps | 1 + 5 files changed, 28 insertions(+)