Message ID | 1546956464-48825-2-git-send-email-julien.thierry@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | arm64: provide pseudo NMI with GICv3 |
On Tue, Jan 08, 2019 at 02:07:19PM +0000, Julien Thierry wrote:
> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> to interact with guest TLBs, switching from EL2&0 translation regime
> to EL1&0.
>
> However, some non-maskable asynchronous event could happen while TGE is
> cleared like SDEI. Because of this address translation operations
> relying on EL2&0 translation regime could fail (tlb invalidation,
> userspace access, ...).

Why would an NMI context need to access user space? (just curious what
breaks exactly without this patch; otherwise it looks fine)

On 14/01/2019 15:56, Catalin Marinas wrote:
> On Tue, Jan 08, 2019 at 02:07:19PM +0000, Julien Thierry wrote:
>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>> to interact with guest TLBs, switching from EL2&0 translation regime
>> to EL1&0.
>>
>> However, some non-maskable asynchronous event could happen while TGE is
>> cleared like SDEI. Because of this address translation operations
>> relying on EL2&0 translation regime could fail (tlb invalidation,
>> userspace access, ...).
>
> Why would an NMI context need to access user space? (just curious what
> breaks exactly without this patch; otherwise it looks fine)

If I remember correctly, the SDEI interrupt might perform cache
maintenance with EL2&0 translation regime, but James can probably give
more detail (or correct me if I'm wrong).

Otherwise, if we decide to use the pseudo NMI for profiling with perf, I
believe the perf interrupt can access user space (although I'm not
completely sure whether that might be to record profiling data in
buffers shared with user space or something else).

Thanks,

Hi guys,

On 14/01/2019 16:12, Julien Thierry wrote:
> On 14/01/2019 15:56, Catalin Marinas wrote:
>> On Tue, Jan 08, 2019 at 02:07:19PM +0000, Julien Thierry wrote:
>>> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
>>> to interact with guest TLBs, switching from EL2&0 translation regime
>>> to EL1&0.
>>>
>>> However, some non-maskable asynchronous event could happen while TGE is
>>> cleared like SDEI. Because of this address translation operations
>>> relying on EL2&0 translation regime could fail (tlb invalidation,
>>> userspace access, ...).
>>
>> Why would an NMI context need to access user space? (just curious what
>> breaks exactly without this patch; otherwise it looks fine)
>
> If I remember correctly, the SDEI interrupt might perform cache
> maintenance with EL2&0 translation regime, but James can probably give
> more detail (or correct me if I'm wrong).

Yup, spot on. The APEI driver has to map/unmap memory using the fixmap.
If it interrupts a guest, the TLB maintenance would affect EL1&0 instead.

> Otherwise, if we decide to use the pseudo NMI for profiling with perf, I
> believe the perf interrupt can access user space (although I'm not
> completely sure whether that might be to record profiling data in
> buffers shared with user space or something else).

It does a stack walk, I think it's the PERF_SAMPLE_CALLCHAIN feature, and
the code is:
arch/arm64/kernel/perf_callchain.c::user_backtrace()

Thanks,

James
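
To make the failure mode above concrete, here is a minimal user-space sketch of how the TGE bit selects the regime that TLB maintenance lands on when the host runs with VHE. It is a toy model, not kernel code: `enum regime`, `regime_for()` and `nmi_tlb_invalidate()` are hypothetical names made up for illustration, and only the bit position of HCR_EL2.TGE (bit 27) comes from the architecture.

```c
#include <assert.h>
#include <stdint.h>

/* HCR_EL2.TGE is bit 27 in the Arm architecture. */
#define HCR_TGE (UINT64_C(1) << 27)

/* Toy stand-ins for the two translation regimes discussed in the thread. */
enum regime { REGIME_EL1_0, REGIME_EL2_0 };

/*
 * Hypothetical model: on a VHE host, TGE set means the host's EL2&0
 * regime is in use; TGE clear means operations target the guest's EL1&0.
 */
static enum regime regime_for(uint64_t hcr)
{
	return (hcr & HCR_TGE) ? REGIME_EL2_0 : REGIME_EL1_0;
}

/*
 * Models an NMI handler (for example the APEI fixmap unmap mentioned
 * above) issuing TLB maintenance with whatever HCR value it inherited
 * from the interrupted context.
 */
static enum regime nmi_tlb_invalidate(uint64_t hcr_at_nmi)
{
	return regime_for(hcr_at_nmi);
}
```

With this model, an NMI that arrives while the host has cleared TGE to work on guest TLBs gets `REGIME_EL1_0` back, which is the wrong target for a host fixmap mapping; an NMI that arrives with TGE set gets `REGIME_EL2_0`.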

Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v4.20.2, v4.19.15, v4.14.93, v4.9.150, v4.4.170, v3.18.132.

v4.20.2: Build OK!
v4.19.15: Build OK!
v4.14.93: Build OK!
v4.9.150: Failed to apply! Possible dependencies:
    096683724cb2 ("arm64: unwind: avoid percpu indirection for irq stack")
    34be98f4944f ("arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP")
    a9ea0017ebe8 ("arm64: factor out current_stack_pointer")
    c02433dd6de3 ("arm64: split thread_info from task stack")
    c7365330753c ("arm64: unwind: disregard frame.sp when validating frame pointer")
    dbc9344a68e5 ("arm64: clean up THREAD_* definitions")
    f60ad4edcf07 ("arm64: clean up irq stack definitions")
    f60fe78f1332 ("arm64: use an irq stack pointer")

v4.4.170: Failed to apply! Possible dependencies:
    096683724cb2 ("arm64: unwind: avoid percpu indirection for irq stack")
    0a8ea52c3eb1 ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature")
    132cd887b5c5 ("arm64: Modify stack trace and dump for use with irq_stack")
    1ffe199b1c9b ("arm64: when walking onto the task stack, check sp & fp are in current->stack")
    20380bb390a4 ("arm64: ftrace: fix a stack tracer's output under function graph tracer")
    7596abf2e566 ("arm64: irq: fix walking from irq stack to task stack")
    8e23dacd12a4 ("arm64: Add do_softirq_own_stack() and enable irq_stacks")
    971c67ce37cf ("arm64: reduce stack use in irq_handler")
    a80a0eb70c35 ("arm64: make irq_stack_ptr more robust")
    c7365330753c ("arm64: unwind: disregard frame.sp when validating frame pointer")
    f60ad4edcf07 ("arm64: clean up irq stack definitions")
    f60fe78f1332 ("arm64: use an irq stack pointer")
    fe13f95b7200 ("arm64: pass a task parameter to unwind_frame()")

v3.18.132: Failed to apply! Possible dependencies:
    020295b4cb5b ("ACPI / processor: Make it possible to get CPU hardware ID via GICC")
    132cd887b5c5 ("arm64: Modify stack trace and dump for use with irq_stack")
    13ca62b243f6 ("ACPI: Fix minor syntax issues in processor_core.c")
    37655163ce1a ("ARM64 / ACPI: Get RSDP and ACPI boot-time tables")
    587064b610c7 ("arm64: Add framework for legacy instruction emulation")
    652261a7a86c ("ACPI: fix acpi_os_ioremap for arm64")
    828aef376d7a ("ACPI / processor: Introduce phys_cpuid_t for CPU hardware ID")
    96f0e00378d4 ("ARM: add basic support for on-demand backtrace of other CPUs")
    af2c632e234f ("arm64/include/asm: Fixed a warning about 'struct pt_regs'")
    af8f3f514d19 ("ACPI / processor: Convert apic_id to phys_id to make it arch agnostic")
    b4ff8389ed14 ("xen/events: Always allocate legacy interrupts on PV guests")
    d02dc27db0dc ("ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()")
    d60fc3892c4d ("irqchip: Add GICv2 specific ACPI boot support")
    ecf5636dcd59 ("ACPI: Add interfaces to parse IOAPIC ID for IOAPIC hotplug")
    f60ad4edcf07 ("arm64: clean up irq stack definitions")
    f60fe78f1332 ("arm64: use an irq stack pointer")

How should we proceed with this patch?

--
Thanks,
Sasha

On Tue, 08 Jan 2019 14:07:19 +0000, Julien Thierry <julien.thierry@arm.com> wrote:
>
> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> to interact with guest TLBs, switching from EL2&0 translation regime
> to EL1&0.
>
> However, some non-maskable asynchronous event could happen while TGE is
> cleared like SDEI. Because of this address translation operations
> relying on EL2&0 translation regime could fail (tlb invalidation,
> userspace access, ...).
>
> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> clear it if necessary when returning to the interrupted context.
>
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: linux-arch@vger.kernel.org
> Cc: stable@vger.kernel.org

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

Thanks,

M.

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index 1473fc2..94b7481 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -19,6 +19,7 @@
 #include <linux/cache.h>
 #include <linux/threads.h>
 #include <asm/irq.h>
+#include <asm/kvm_arm.h>
 
 #define NR_IPI	7
 
@@ -37,6 +38,33 @@
 
 #define __ARCH_IRQ_EXIT_IRQS_DISABLED	1
 
+struct nmi_ctx {
+	u64 hcr;
+};
+
+DECLARE_PER_CPU(struct nmi_ctx, nmi_contexts);
+
+#define arch_nmi_enter()							\
+	do {									\
+		if (is_kernel_in_hyp_mode()) {					\
+			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
+			nmi_ctx->hcr = read_sysreg(hcr_el2);			\
+			if (!(nmi_ctx->hcr & HCR_TGE)) {			\
+				write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2);	\
+				isb();						\
+			}							\
+		}								\
+	} while (0)
+
+#define arch_nmi_exit()								\
+	do {									\
+		if (is_kernel_in_hyp_mode()) {					\
+			struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts);	\
+			if (!(nmi_ctx->hcr & HCR_TGE))				\
+				write_sysreg(nmi_ctx->hcr, hcr_el2);		\
+		}								\
+	} while (0)
+
 static inline void ack_bad_irq(unsigned int irq)
 {
 	extern unsigned long irq_err_count;
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 780a12f..92fa817 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -33,6 +33,9 @@
 
 unsigned long irq_err_count;
 
+/* Only access this in an NMI enter/exit */
+DEFINE_PER_CPU(struct nmi_ctx, nmi_contexts);
+
 DEFINE_PER_CPU(unsigned long *, irq_stack_ptr);
 
 int arch_show_interrupts(struct seq_file *p, int prec)
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 0fbbcdf..da0af63 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -60,8 +60,14 @@ static inline void rcu_nmi_exit(void)
  */
 extern void irq_exit(void);
 
+#ifndef arch_nmi_enter
+#define arch_nmi_enter()	do { } while (0)
+#define arch_nmi_exit()		do { } while (0)
+#endif
+
 #define nmi_enter()						\
 	do {							\
+		arch_nmi_enter();				\
 		printk_nmi_enter();				\
 		lockdep_off();					\
 		ftrace_nmi_enter();				\
@@ -80,6 +86,7 @@ static inline void rcu_nmi_exit(void)
 		ftrace_nmi_exit();				\
 		lockdep_on();					\
 		printk_nmi_exit();				\
+		arch_nmi_exit();				\
 	} while (0)
 
 #endif /* LINUX_HARDIRQ_H */
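
The include/linux/hardirq.h hunks rely on a common pattern: an architecture header may define a hook macro, and the generic header supplies an empty fallback under `#ifndef`. The stand-alone sketch below shows that pattern and the resulting call order (arch hook first on entry, last on exit). It is illustrative only: `record()`, `trace` and `run_handler()` are hypothetical helpers, and the "generic" steps stand in for printk_nmi_enter(), lockdep_off() and friends.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer used to observe the call order. */
static char trace[128];

static void record(const char *step)
{
	strcat(trace, step);
	strcat(trace, ";");
}

/* What an "architecture header" would provide (here: just a trace entry). */
#define arch_nmi_enter()	record("arch_enter")
#define arch_nmi_exit()		record("arch_exit")

/* Generic fallback, same shape as in the hunk above; unused in this build. */
#ifndef arch_nmi_enter
#define arch_nmi_enter()	do { } while (0)
#define arch_nmi_exit()		do { } while (0)
#endif

/* Simplified nmi_enter()/nmi_exit(): arch hook outermost on both sides. */
#define nmi_enter()					\
	do {						\
		arch_nmi_enter();			\
		record("generic_enter");		\
	} while (0)

#define nmi_exit()					\
	do {						\
		record("generic_exit");			\
		arch_nmi_exit();			\
	} while (0)

/* Runs one simulated NMI and returns the recorded order of steps. */
static const char *run_handler(void)
{
	trace[0] = '\0';
	nmi_enter();
	record("handler");
	nmi_exit();
	return trace;
}
```

Keeping the arch hook outermost means everything else in the NMI path, including the generic bookkeeping, runs with the state the hook established.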

When using VHE, the host needs to clear HCR_EL2.TGE bit in order
to interact with guest TLBs, switching from EL2&0 translation regime
to EL1&0.

However, some non-maskable asynchronous event could happen while TGE is
cleared like SDEI. Because of this address translation operations
relying on EL2&0 translation regime could fail (tlb invalidation,
userspace access, ...).

Fix this by properly setting HCR_EL2.TGE when entering NMI context and
clear it if necessary when returning to the interrupted context.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
---
 arch/arm64/include/asm/hardirq.h | 28 ++++++++++++++++++++++++++++
 arch/arm64/kernel/irq.c          |  3 +++
 include/linux/hardirq.h          |  7 +++++++
 3 files changed, 38 insertions(+)
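
As a rough illustration of the save/set/restore logic described in the commit message, the following user-space sketch replays it against a plain variable instead of the real system register. Several things are assumptions made for the sketch: `fake_hcr_el2`, `hcr_writes`, `model_nmi_enter()`, `model_nmi_exit()` and `run_nmi()` are invented names, a single global replaces the per-CPU `nmi_contexts`, and the `is_kernel_in_hyp_mode()` check and the `isb()` barrier are left out because they have no user-space equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* HCR_EL2.TGE is bit 27 in the Arm architecture. */
#define HCR_TGE (UINT64_C(1) << 27)

static uint64_t fake_hcr_el2;	/* stands in for the HCR_EL2 register */
static unsigned int hcr_writes;	/* counts simulated register writes */

struct nmi_ctx {
	uint64_t hcr;
};

/* The patch uses a per-CPU variable; one global is enough for this model. */
static struct nmi_ctx nmi_context;

/* Save the interrupted HCR value and force TGE on if it was clear. */
static void model_nmi_enter(void)
{
	nmi_context.hcr = fake_hcr_el2;
	if (!(nmi_context.hcr & HCR_TGE)) {
		fake_hcr_el2 = nmi_context.hcr | HCR_TGE;
		hcr_writes++;
	}
}

/* Restore the saved value only if entry had to change it. */
static void model_nmi_exit(void)
{
	if (!(nmi_context.hcr & HCR_TGE)) {
		fake_hcr_el2 = nmi_context.hcr;
		hcr_writes++;
	}
}

/*
 * Simulates one NMI taken on top of interrupted_hcr and returns the
 * HCR value the handler body would observe.
 */
static uint64_t run_nmi(uint64_t interrupted_hcr)
{
	uint64_t seen;

	fake_hcr_el2 = interrupted_hcr;
	model_nmi_enter();
	seen = fake_hcr_el2;
	model_nmi_exit();
	return seen;
}
```

In this model, an NMI taken with TGE clear sees TGE set inside the handler and leaves the original value behind on exit (two writes), while an NMI taken with TGE already set performs no register write at all, which matches the "if necessary" wording of the commit message.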