Message ID | 1403623097-1153-5-git-send-email-daniel.thompson@linaro.org (mailing list archive) |
---|---|
State | New, archived |
On Tue, Jun 24, 2014 at 04:18:17PM +0100, Daniel Thompson wrote:
> +	.align 5
> +__fiq_svc:
> +	svc_entry

Remember that the registers you have on the stack here are r0-r12, plus
the SVC banked sp and lr registers. These may not be the registers
from the mode you took the FIQ (eg, if it was IRQ, or abort mode.)

Also bear in mind that svc_entry calls trace_hardirqs_off - is this
appropriate and safe for the FIQ to call?

> +	fiq_handler
> +	mov	r0, sp
> +	ldmib	r0, {r1 - r14}

So this restores r1 to r12, and the SVC mode sp and lr registers.
Nothing touches the SVC SPSR, so we hope that retains its value
throughout the FIQ processing. Note that the stack pointer at this
point will be above state which we have not yet read, so we better
not take any exceptions from this instruction (not even an imprecise
abort).

> +	msr	cpsr_c, #FIQ_MODE | PSR_I_BIT | PSR_F_BIT

Here we switch to FIQ mode. What about the PSR_A_BIT which prevents
imprecise aborts on ARMv6+ ?

Nevertheless, I think it's safe because the A bit will be set by the
CPU when taking the FIQ exception, and it should remain set since
cpsr_c won't modify it.

> +	add	r8, r0, #S_PC
> +	ldr	r9, [r0, #S_PSR]
> +	msr	spsr_cxsf, r9

Here we update the FIQ SPSR with the calling mode's CPSR, ready to
return...

> +	ldr	r0, [r0, #S_R0]

Load the calling mode's R0 value.

> +	ldmia	r8, {pc}^

And return (restoring CPSR from SPSR_fiq).

This looks pretty good except for the niggles...
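
The register walk-through above can be checked with a small user-space model. This is only a sketch: `struct frame_model`, the `MODEL_S_*` macros and `ldmib_r1_r14_offset()` are hypothetical names, and the layout (r0-r15, then CPSR, then orig_r0, each 32 bits wide) is assumed to match the saved frame that `S_R0`, `S_PC` and `S_PSR` index in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed model of the saved frame: r0-r15, then CPSR, then orig_r0. */
struct frame_model {
	uint32_t uregs[18];
};

/* Hypothetical stand-ins for the S_R0, S_PC and S_PSR offsets. */
#define MODEL_S_R0	(0 * sizeof(uint32_t))
#define MODEL_S_PC	(15 * sizeof(uint32_t))
#define MODEL_S_PSR	(16 * sizeof(uint32_t))

/*
 * Offset read for register `reg` by "ldmib base, {r1 - r14}".
 * Increment-before means the first register in the list (r1) is loaded
 * from base + 4, the second (r2) from base + 8, and so on.
 */
static size_t ldmib_r1_r14_offset(unsigned int reg)
{
	assert(reg >= 1 && reg <= 14);
	/* (reg - 1) is the position in the list; +1 for increment-before. */
	return (size_t)((reg - 1) + 1) * sizeof(uint32_t);
}
```

In this model each of r1-r14 is reloaded from its own slot, while the r0, pc and CPSR slots (offsets 0, 60 and 64) are left for the later `ldr`, `ldmia` and `msr spsr_cxsf` steps, which is why r0 must keep pointing at the frame until the very end.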
On Tue, 24 Jun 2014, Daniel Thompson wrote:
> From: Anton Vorontsov <anton.vorontsov@linaro.org>
>
> The FIQ debugger may be used to debug situations when the kernel stuck
> in uninterruptable sections, e.g. the kernel infinitely loops or
> deadlocked in an interrupt or with interrupts disabled.
>
> By default KGDB FIQ is disabled in runtime, but can be enabled with
> kgdb_fiq.enable=1 kernel command line option.
>
> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Ben Dooks <ben.dooks@codethink.co.uk>
> Cc: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm/Kconfig                 |   2 +
>  arch/arm/Kconfig.debug           |  18 ++++++
>  arch/arm/include/asm/kgdb.h      |   7 +++
>  arch/arm/kernel/Makefile         |   1 +
>  arch/arm/kernel/kgdb_fiq.c       | 124 +++++++++++++++++++++++++++++++++++++++
>  arch/arm/kernel/kgdb_fiq_entry.S |  87 +++++++++++++++++++++++++++
>  6 files changed, 239 insertions(+)
>  create mode 100644 arch/arm/kernel/kgdb_fiq.c
>  create mode 100644 arch/arm/kernel/kgdb_fiq_entry.S

[...]

> +static long kgdb_fiq_setup_stack(void *info)
> +{
> +	struct pt_regs regs;
> +
> +	regs.ARM_sp = __get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER) +
> +			THREAD_START_SP;
> +	WARN_ON(!regs.ARM_sp);

Isn't this rather fatal if you can't allocate any stack? Why not using
BUG_ON(), or better yet propagate a proper error code back?

> +
> +	set_fiq_regs(&regs);
> +	return 0;
> +}
> +
> +/**
> + * kgdb_fiq_enable_nmi - Manage NMI-triggered entry to KGDB
> + * @on: Flag to either enable or disable an NMI
> + *
> + * This function manages NMIs that usually cause KGDB to enter. That is, not
> + * all NMIs should be enabled or disabled, but only those that issue
> + * kgdb_handle_exception().
> + *
> + * The call counts disable requests, and thus allows to nest disables. But
> + * trying to enable already enabled NMI is an error.
> + */
> +static void kgdb_fiq_enable_nmi(bool on)
> +{
> +	static atomic_t cnt;
> +	int ret;
> +
> +	ret = atomic_add_return(on ? 1 : -1, &cnt);
> +	if (ret > 1 && on) {
> +		/*
> +		 * There should be only one instance that calls this function
> +		 * in "enable, disable" order. All other users must call
> +		 * disable first, then enable. If not, something is wrong.
> +		 */
> +		WARN_ON(1);
> +		return;
> +	}

Minor style suggestion:

	/*
	 * There should be only one instance that calls this function
	 * in "enable, disable" order. All other users must call
	 * disable first, then enable. If not, something is wrong.
	 */
	if (WARN_ON(ret > 1 && on))
		return;

Other than that...

Acked-by: Nicolas Pitre <nico@linaro.org>


Nicolas
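
The counting rule described in the kernel-doc comment (disables nest, a second enable is an error) can be exercised in user space. The sketch below is not kernel code: `model_enable_nmi()`, `model_fiq_enabled` and `model_warnings` are hypothetical stand-ins for `kgdb_fiq_enable_nmi()`, the `enable_fiq()`/`disable_fiq()` calls and `WARN_ON()`, and C11 atomics stand in for `atomic_add_return()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int model_cnt;	/* zero-initialised, like the static atomic_t */
static bool model_fiq_enabled;	/* stands in for enable_fiq()/disable_fiq() */
static int model_warnings;	/* counts what WARN_ON() would have reported */

static void model_enable_nmi(bool on)
{
	int delta = on ? 1 : -1;
	/* atomic_fetch_add() returns the old value; add delta for the new one. */
	int ret = atomic_fetch_add(&model_cnt, delta) + delta;

	/* Same test as the suggested "if (WARN_ON(ret > 1 && on)) return;". */
	if (ret > 1 && on) {
		model_warnings++;
		return;
	}

	model_fiq_enabled = ret > 0;
}
```

Calling enable, disable, disable, enable leaves the model disabled (the count is back at 0), and only a further enable turns it on again. A second enable after that trips the warning; as in the quoted code, the counter stays incremented when that happens.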
On 24/06/14 17:08, Russell King - ARM Linux wrote:
> On Tue, Jun 24, 2014 at 04:18:17PM +0100, Daniel Thompson wrote:
>> +	.align 5
>> +__fiq_svc:
>> +	svc_entry
>
> Remember that the registers you have on the stack here are r0-r12, plus
> the SVC banked sp and lr registers. These may not be the registers
> from the mode you took the FIQ (eg, if it was IRQ, or abort mode.)

We probably ought to save/restore lr_abt and spsr_abt but I think sp_abt
and the state for irq and und can be neglected.

The stack pointers are constant anyway and I think it reasonable to
assume the FIQ handler doesn't unmask interrupts or attempt to execute
undefined instructions.

> Also bear in mind that svc_entry calls trace_hardirqs_off - is this
> appropriate and safe for the FIQ to call?

I personally think it appropriate and it looked safe on the lockdep side
of things. However I will look a bit deeper at this since I don't
remember how far I chased things back.

Naturally it is a problem that we don't currently call
trace_hardirqs_on. I'll fix this.

>> +	fiq_handler
>> +	mov	r0, sp
>> +	ldmib	r0, {r1 - r14}
>
> So this restores r1 to r12, and the SVC mode sp and lr registers.
> Nothing touches the SVC SPSR, so we hope that retains its value
> throughout the FIQ processing.

Are you worried about something changing it?

I haven't thought of any good reason for it to change. The FIQ handler
can't safely execute a SVC instruction.

> Note that the stack pointer at this
> point will be above state which we have not yet read, so we better
> not take any exceptions from this instruction (not even an imprecise
> abort).

Can a comment cover this? We shouldn't get an abort reading the SVC
stack and imprecise abort is blocked.

Note we could copy these values back onto the FIQ stack before switching
modes if there's a possibility of an abort we cannot avoid; however, I
don't know whether this is worthwhile.

>> +	msr	cpsr_c, #FIQ_MODE | PSR_I_BIT | PSR_F_BIT
>
> Here we switch to FIQ mode. What about the PSR_A_BIT which prevents
> imprecise aborts on ARMv6+ ?
>
> Nevertheless, I think it's safe because the A bit will be set by the
> CPU when taking the FIQ exception, and it should remain set since
> cpsr_c won't modify it.

Agreed.

Note that while double checking this I realized that this code will drop
the value of PSR_ISETSTATE (T bit) that the vector_stub macro set for
us. I'll fix this.

>> +	add	r8, r0, #S_PC
>> +	ldr	r9, [r0, #S_PSR]
>> +	msr	spsr_cxsf, r9
>
> Here we update the FIQ SPSR with the calling mode's CPSR, ready to
> return...
>
>> +	ldr	r0, [r0, #S_R0]
>
> Load the calling mode's R0 value.
>
>> +	ldmia	r8, {pc}^
>
> And return (restoring CPSR from SPSR_fiq).
>
> This looks pretty good except for the niggles...

Thanks.

I've picked out the following actions from the above:

1. Wrap a save and restore lr_abt and spsr_abt around the FIQ handler
2. Add a paired up trace_hardirqs_on() (and review more deeply).
3. Add comments explaining hazards w.r.t. data abort,
4. Correctly manage T bit during transition back to FIQ mode.

Do I miss anything?


Daniel.
On 24/06/14 17:22, Nicolas Pitre wrote:
> On Tue, 24 Jun 2014, Daniel Thompson wrote:
>
>> From: Anton Vorontsov <anton.vorontsov@linaro.org>
>>
>> The FIQ debugger may be used to debug situations when the kernel stuck
>> in uninterruptable sections, e.g. the kernel infinitely loops or
>> deadlocked in an interrupt or with interrupts disabled.
>>
>> By default KGDB FIQ is disabled in runtime, but can be enabled with
>> kgdb_fiq.enable=1 kernel command line option.
>>
>> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> Cc: Russell King <linux@arm.linux.org.uk>
>> Cc: Ben Dooks <ben.dooks@codethink.co.uk>
>> Cc: Dave Martin <Dave.Martin@arm.com>
>> ---
>>  arch/arm/Kconfig                 |   2 +
>>  arch/arm/Kconfig.debug           |  18 ++++++
>>  arch/arm/include/asm/kgdb.h      |   7 +++
>>  arch/arm/kernel/Makefile         |   1 +
>>  arch/arm/kernel/kgdb_fiq.c       | 124 +++++++++++++++++++++++++++++++++++++++
>>  arch/arm/kernel/kgdb_fiq_entry.S |  87 +++++++++++++++++++++++++++
>>  6 files changed, 239 insertions(+)
>>  create mode 100644 arch/arm/kernel/kgdb_fiq.c
>>  create mode 100644 arch/arm/kernel/kgdb_fiq_entry.S
>
> [...]
>
>> +static long kgdb_fiq_setup_stack(void *info)
>> +{
>> +	struct pt_regs regs;
>> +
>> +	regs.ARM_sp = __get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER) +
>> +			THREAD_START_SP;
>> +	WARN_ON(!regs.ARM_sp);
>
> Isn't this rather fatal if you can't allocate any stack? Why not using
> BUG_ON(), or better yet propagate a proper error code back?

Thanks for raising this.

I think we can get rid of the allocation altogether. This stack is
*way* oversized (it only needs to be 12 bytes).

>> +
>> +	set_fiq_regs(&regs);
>> +	return 0;
>> +}
>> +
>> +/**
>> + * kgdb_fiq_enable_nmi - Manage NMI-triggered entry to KGDB
>> + * @on: Flag to either enable or disable an NMI
>> + *
>> + * This function manages NMIs that usually cause KGDB to enter. That is, not
>> + * all NMIs should be enabled or disabled, but only those that issue
>> + * kgdb_handle_exception().
>> + *
>> + * The call counts disable requests, and thus allows to nest disables. But
>> + * trying to enable already enabled NMI is an error.
>> + */
>> +static void kgdb_fiq_enable_nmi(bool on)
>> +{
>> +	static atomic_t cnt;
>> +	int ret;
>> +
>> +	ret = atomic_add_return(on ? 1 : -1, &cnt);
>> +	if (ret > 1 && on) {
>> +		/*
>> +		 * There should be only one instance that calls this function
>> +		 * in "enable, disable" order. All other users must call
>> +		 * disable first, then enable. If not, something is wrong.
>> +		 */
>> +		WARN_ON(1);
>> +		return;
>> +	}
>
> Minor style suggestion:
>
> 	/*
> 	 * There should be only one instance that calls this function
> 	 * in "enable, disable" order. All other users must call
> 	 * disable first, then enable. If not, something is wrong.
> 	 */
> 	if (WARN_ON(ret > 1 && on))
> 		return;

Will adopt this style.

> Other than that...
>
> Acked-by: Nicolas Pitre <nico@linaro.org>

Thanks for review.
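
The 12-byte figure is consistent with three 32-bit words being parked on the exception-mode stack before the switch to SVC mode. As far as I understand the vector_stub macro, those words are r0, the exception lr and the spsr, but treat that as an assumption rather than something this thread states. A minimal sketch with a hypothetical struct name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the words assumed to be saved on the FIQ-mode
 * stack by the vector stub before it switches to SVC mode.
 */
struct fiq_stub_frame_model {
	uint32_t r0;	/* interrupted context's r0 */
	uint32_t lr;	/* exception lr, i.e. the return address */
	uint32_t spsr;	/* interrupted context's CPSR */
};

/* Three 32-bit words, hence 12 bytes. */
_Static_assert(sizeof(struct fiq_stub_frame_model) == 12,
	       "stub frame model should be 12 bytes");
```

Under that assumption a small static per-CPU buffer would be enough, which removes both the page allocation and the failure path that the WARN_ON() was guarding.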
On 26/06/14 10:54, Daniel Thompson wrote:
>> Also bear in mind that svc_entry calls trace_hardirqs_off - is this
>> appropriate and safe for the FIQ to call?
>
> I personally think it appropriate and it looked safe on the lockdep side
> of things. However I will look a bit deeper at this since I don't
> remember how far I chased things back.

I've reviewed as far as I can.

Regarding safety I can't find anything much to upset the FIQ handler. I
think it might occasionally trigger the trace code's recursion avoidance
causing the trace event to be dropped but that's about it.

I admit I came very close to removing the trace_hardirqs calls from the
FIQ code but in the end I've left it. The hardirqs *are* off during FIQ
execution.

>>> +	msr	cpsr_c, #FIQ_MODE | PSR_I_BIT | PSR_F_BIT
>>
>> Here we switch to FIQ mode. What about the PSR_A_BIT which prevents
>> imprecise aborts on ARMv6+ ?
>>
>> Nevertheless, I think it's safe because the A bit will be set by the
>> CPU when taking the FIQ exception, and it should remain set since
>> cpsr_c won't modify it.
>
> Agreed.
>
> Note that while double checking this I realized that this code will drop
> the value of PSR_ISETSTATE (T bit) that the vector_stub macro set for
> us. I'll fix this.

I was wrong about this. The CPSR T bit is part of the execution state
and cannot be modified by msr.

> I've picked out the following actions from the above:
>
> 1. Wrap a save and restore lr_abt and spsr_abt around the FIQ handler

Done.

> 2. Add a paired up trace_hardirqs_on() (and review more deeply).

Done.

> 3. Add comments explaining hazards w.r.t. data abort,

Done.

> 4. Correctly manage T bit during transition back to FIQ mode.

Not applicable.

> Do I miss anything?

I hope not!


Daniel.
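
Both conclusions about `msr cpsr_c` (the A bit survives, and the T bit cannot be changed) follow from which CPSR bits the instruction writes. The sketch below models that in user space. `model_msr_cpsr_c()` is a hypothetical helper, and the bit values are assumed to match the kernel's ARM ptrace definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel's ARM PSR definitions. */
#define FIQ_MODE	0x11u
#define SVC_MODE	0x13u
#define MODE_MASK	0x1fu
#define PSR_T_BIT	0x20u
#define PSR_F_BIT	0x40u
#define PSR_I_BIT	0x80u
#define PSR_A_BIT	0x100u

/*
 * Model of "msr cpsr_c, #imm": the _c suffix selects only the control
 * field, bits [7:0]. The T bit lives in that field, but writes to it
 * through msr are ignored because it is execution state.
 */
static uint32_t model_msr_cpsr_c(uint32_t cpsr, uint32_t imm)
{
	const uint32_t c_field = 0xffu;
	const uint32_t writable = c_field & ~PSR_T_BIT;

	return (cpsr & ~writable) | (imm & writable);
}
```

Starting from an SVC-mode CPSR with A, I and F set, applying `FIQ_MODE | PSR_I_BIT | PSR_F_BIT` in this model changes only the mode field: bit 8 (A) sits outside the control field and is preserved, and T keeps whatever value it had.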
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 245058b..f385b27 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -297,6 +297,7 @@ choice
 config ARCH_MULTIPLATFORM
 	bool "Allow multiple platforms to be selected"
 	depends on MMU
+	select ARCH_MIGHT_HAVE_KGDB_FIQ
 	select ARCH_WANT_OPTIONAL_GPIOLIB
 	select ARM_HAS_SG_CHAIN
 	select ARM_PATCH_PHYS_VIRT
@@ -346,6 +347,7 @@ config ARCH_REALVIEW
 
 config ARCH_VERSATILE
 	bool "ARM Ltd. Versatile family"
+	select ARCH_MIGHT_HAVE_KGDB_FIQ
 	select ARCH_WANT_OPTIONAL_GPIOLIB
 	select ARM_AMBA
 	select ARM_TIMER_SP804
diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 26536f7..c7342b6 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -2,6 +2,24 @@ menu "Kernel hacking"
 
 source "lib/Kconfig.debug"
 
+config ARCH_MIGHT_HAVE_KGDB_FIQ
+	bool
+
+config KGDB_FIQ
+	bool "KGDB FIQ support"
+	depends on KGDB_KDB && ARCH_MIGHT_HAVE_KGDB_FIQ && !THUMB2_KERNEL
+	select FIQ
+	help
+	  The FIQ debugger may be used to debug situations when the
+	  kernel stuck in uninterruptable sections, e.g. the kernel
+	  infinitely loops or deadlocked in an interrupt or with
+	  interrupts disabled.
+
+	  By default KGDB FIQ is disabled at runtime, but can be
+	  enabled with kgdb_fiq.enable=1 kernel command line option.
+
+	  If unsure, say N.
+
 config ARM_PTDUMP
 	bool "Export kernel pagetable layout to userspace via debugfs"
 	depends on DEBUG_KERNEL
diff --git a/arch/arm/include/asm/kgdb.h b/arch/arm/include/asm/kgdb.h
index 0a9d5dd..5de21f01 100644
--- a/arch/arm/include/asm/kgdb.h
+++ b/arch/arm/include/asm/kgdb.h
@@ -11,7 +11,9 @@
 #define __ARM_KGDB_H__
 
 #include <linux/ptrace.h>
+#include <linux/linkage.h>
 #include <asm/opcodes.h>
+#include <asm/exception.h>
 
 /*
  * GDB assumes that we're a user process being debugged, so
@@ -48,6 +50,11 @@ static inline void arch_kgdb_breakpoint(void)
 extern void kgdb_handle_bus_error(void);
 extern int kgdb_fault_expected;
 
+extern char kgdb_fiq_handler;
+extern char kgdb_fiq_handler_end;
+asmlinkage void __exception_irq_entry kgdb_fiq_do_handle(struct pt_regs *regs);
+extern int kgdb_register_fiq(unsigned int fiq);
+
 #endif /* !__ASSEMBLY__ */
 
 /*
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f..30ee8f3 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -68,6 +68,7 @@ endif
 obj-$(CONFIG_OABI_COMPAT)	+= sys_oabi-compat.o
 obj-$(CONFIG_ARM_THUMBEE)	+= thumbee.o
 obj-$(CONFIG_KGDB)		+= kgdb.o
+obj-$(CONFIG_KGDB_FIQ)		+= kgdb_fiq_entry.o kgdb_fiq.o
 obj-$(CONFIG_ARM_UNWIND)	+= unwind.o
 obj-$(CONFIG_HAVE_TCM)		+= tcm.o
 obj-$(CONFIG_OF)		+= devtree.o
diff --git a/arch/arm/kernel/kgdb_fiq.c b/arch/arm/kernel/kgdb_fiq.c
new file mode 100644
index 0000000..dbf4873
--- /dev/null
+++ b/arch/arm/kernel/kgdb_fiq.c
@@ -0,0 +1,124 @@
+/*
+ * KGDB FIQ
+ *
+ * Copyright 2010 Google, Inc.
+ *		  Arve Hjønnevåg <arve@android.com>
+ *		  Colin Cross <ccross@android.com>
+ * Copyright 2012 Linaro Ltd.
+ *		  Anton Vorontsov <anton.vorontsov@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/hardirq.h>
+#include <linux/atomic.h>
+#include <linux/kdb.h>
+#include <linux/kgdb.h>
+#include <asm/fiq.h>
+#include <asm/exception.h>
+
+static int kgdb_fiq_enabled;
+module_param_named(enable, kgdb_fiq_enabled, int, 0600);
+MODULE_PARM_DESC(enable, "set to 1 to enable FIQ KGDB");
+
+static unsigned int kgdb_fiq;
+
+asmlinkage void __exception_irq_entry kgdb_fiq_do_handle(struct pt_regs *regs)
+{
+	if (kgdb_nmi_poll_knock()) {
+		nmi_enter();
+		kgdb_handle_exception(1, 0, 0, regs);
+		nmi_exit();
+	}
+
+	eoi_fiq(kgdb_fiq);
+}
+
+static struct fiq_handler kgdb_fiq_desc = {
+	.name = "kgdb",
+};
+
+static long kgdb_fiq_setup_stack(void *info)
+{
+	struct pt_regs regs;
+
+	regs.ARM_sp = __get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER) +
+			THREAD_START_SP;
+	WARN_ON(!regs.ARM_sp);
+
+	set_fiq_regs(&regs);
+	return 0;
+}
+
+/**
+ * kgdb_fiq_enable_nmi - Manage NMI-triggered entry to KGDB
+ * @on: Flag to either enable or disable an NMI
+ *
+ * This function manages NMIs that usually cause KGDB to enter. That is, not
+ * all NMIs should be enabled or disabled, but only those that issue
+ * kgdb_handle_exception().
+ *
+ * The call counts disable requests, and thus allows to nest disables. But
+ * trying to enable already enabled NMI is an error.
+ */
+static void kgdb_fiq_enable_nmi(bool on)
+{
+	static atomic_t cnt;
+	int ret;
+
+	ret = atomic_add_return(on ? 1 : -1, &cnt);
+	if (ret > 1 && on) {
+		/*
+		 * There should be only one instance that calls this function
+		 * in "enable, disable" order. All other users must call
+		 * disable first, then enable. If not, something is wrong.
+		 */
+		WARN_ON(1);
+		return;
+	}
+
+	if (ret > 0)
+		enable_fiq(kgdb_fiq);
+	else
+		disable_fiq(kgdb_fiq);
+}
+
+int kgdb_register_fiq(unsigned int fiq)
+{
+	int err;
+	int cpu;
+
+	if (!kgdb_fiq_enabled)
+		return -ENODEV;
+
+	if (!has_fiq(fiq)) {
+		pr_warn(
+		    "%s: Cannot register %u (no FIQ with this number)\n",
+		    __func__, fiq);
+		return -ENODEV;
+	}
+
+	kgdb_fiq = fiq;
+
+	err = claim_fiq(&kgdb_fiq_desc);
+	if (err) {
+		pr_warn("%s: unable to claim fiq", __func__);
+		return err;
+	}
+
+	for_each_possible_cpu(cpu)
+		work_on_cpu(cpu, kgdb_fiq_setup_stack, NULL);
+
+	set_fiq_handler(&kgdb_fiq_handler,
+			&kgdb_fiq_handler_end - &kgdb_fiq_handler);
+
+	arch_kgdb_ops.enable_nmi = kgdb_fiq_enable_nmi;
+	return 0;
+}
diff --git a/arch/arm/kernel/kgdb_fiq_entry.S b/arch/arm/kernel/kgdb_fiq_entry.S
new file mode 100644
index 0000000..d6becca
--- /dev/null
+++ b/arch/arm/kernel/kgdb_fiq_entry.S
@@ -0,0 +1,87 @@
+/*
+ * KGDB FIQ entry
+ *
+ * Copyright 1996,1997,1998 Russell King.
+ * Copyright 2012 Linaro Ltd.
+ *		  Anton Vorontsov <anton.vorontsov@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/memory.h>
+#include <asm/unwind.h>
+#include "entry-header.S"
+
+	.text
+
+@ This is needed for usr_entry/alignment_trap
+.LCcralign:
+	.long	cr_alignment
+.LCdohandle:
+	.long	kgdb_fiq_do_handle
+
+	.macro	fiq_handler
+	ldr	r1, =.LCdohandle
+	mov	r0, sp
+	adr	lr, BSYM(9997f)
+	ldr	pc, [r1]
+9997:
+	.endm
+
+	.align 5
+__fiq_svc:
+	svc_entry
+	fiq_handler
+	mov	r0, sp
+	ldmib	r0, {r1 - r14}
+	msr	cpsr_c, #FIQ_MODE | PSR_I_BIT | PSR_F_BIT
+	add	r8, r0, #S_PC
+	ldr	r9, [r0, #S_PSR]
+	msr	spsr_cxsf, r9
+	ldr	r0, [r0, #S_R0]
+	ldmia	r8, {pc}^
+
+ UNWIND(.fnend		)
+ENDPROC(__fiq_svc)
+	.ltorg
+
+	.align 5
+__fiq_usr:
+	usr_entry
+	kuser_cmpxchg_check
+	fiq_handler
+	get_thread_info tsk
+	mov	why, #0
+	b	ret_to_user_from_irq
+ UNWIND(.fnend		)
+ENDPROC(__fiq_usr)
+	.ltorg
+
+	.global kgdb_fiq_handler
+kgdb_fiq_handler:
+
+	vector_stub	fiq, FIQ_MODE, 4
+
+	.long	__fiq_usr			@  0  (USR_26 / USR_32)
+	.long	__fiq_svc			@  1  (FIQ_26 / FIQ_32)
+	.long	__fiq_svc			@  2  (IRQ_26 / IRQ_32)
+	.long	__fiq_svc			@  3  (SVC_26 / SVC_32)
+	.long	__fiq_svc			@  4
+	.long	__fiq_svc			@  5
+	.long	__fiq_svc			@  6
+	.long	__fiq_svc			@  7
+	.long	__fiq_svc			@  8
+	.long	__fiq_svc			@  9
+	.long	__fiq_svc			@  a
+	.long	__fiq_svc			@  b
+	.long	__fiq_svc			@  c
+	.long	__fiq_svc			@  d
+	.long	__fiq_svc			@  e
+	.long	__fiq_svc			@  f
+
+	.global kgdb_fiq_handler_end
+kgdb_fiq_handler_end: