Message ID | 1587726554-32018-5-git-send-email-sumit.garg@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Series | arm64: Introduce new IPI as IPI_CALL_NMI_FUNC |
Hi,

On Fri, Apr 24, 2020 at 4:11 AM Sumit Garg <sumit.garg@linaro.org> wrote:
>
> arm64 platforms with GICv3 or later support pseudo NMIs which can be
> leveraged to round up CPUs which are stuck in hard lockup state with
> interrupts disabled that wouldn't be possible with a normal IPI.
>
> So instead switch to round up CPUs using IPI_CALL_NMI_FUNC. And in
> case a particular arm64 platform doesn't support pseudo NMIs,
> IPI_CALL_NMI_FUNC will act as a normal IPI which maintains existing
> kgdb functionality.
>
> Also, one thing to note here is that with CPUs running in NMI context,
> kernel has special handling for printk() which involves CPU specific
> buffers and deferring printk() until exit from NMI context. But with kgdb
> we don't want to defer printk() especially backtrace on corresponding
> CPUs. So switch to normal printk() context instead prior to entering
> kgdb context.
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---
>  arch/arm64/kernel/kgdb.c | 15 +++++++++++++++
>  arch/arm64/kernel/smp.c  | 17 ++++++++++++++---
>  2 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> index 4311992..0851ead 100644
> --- a/arch/arm64/kernel/kgdb.c
> +++ b/arch/arm64/kernel/kgdb.c
> @@ -14,6 +14,7 @@
>  #include <linux/kgdb.h>
>  #include <linux/kprobes.h>
>  #include <linux/sched/task_stack.h>
> +#include <linux/smp.h>
>
>  #include <asm/debug-monitors.h>
>  #include <asm/insn.h>
> @@ -353,3 +354,17 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
>  	return aarch64_insn_write((void *)bpt->bpt_addr,
>  			*(u32 *)bpt->saved_instr);
>  }
> +
> +#ifdef CONFIG_SMP
> +void kgdb_roundup_cpus(void)
> +{
> +	struct cpumask mask;
> +
> +	cpumask_copy(&mask, cpu_online_mask);
> +	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> +	if (cpumask_empty(&mask))
> +		return;
> +
> +	arch_send_call_nmi_func_ipi_mask(&mask);
> +}
> +#endif
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 27c8ee1..c7158f6e8 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -31,6 +31,7 @@
>  #include <linux/of.h>
>  #include <linux/irq_work.h>
>  #include <linux/kexec.h>
> +#include <linux/kgdb.h>
>  #include <linux/kvm_host.h>
>
>  #include <asm/alternative.h>
> @@ -976,9 +977,19 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		/* Handle it as a normal interrupt if not in NMI context */
>  		if (!in_nmi())
>  			irq_enter();
> -
> -		/* nop, IPI handlers for special features can be added here. */
> -
> +#ifdef CONFIG_KGDB

My vote would be to keep "ifdef"s out of the middle of functions. Can
you put your code in "arch/arm64/kernel/kgdb.c" and then have a dummy
no-op function if "CONFIG_KGDB" isn't defined?

> +		if (atomic_read(&kgdb_active) != -1) {
> +			/*
> +			 * For kgdb to work properly, we need printk to operate
> +			 * in normal context.
> +			 */
> +			if (in_nmi())
> +				printk_nmi_exit();

It feels like all the printk management belongs in kgdb_nmicallback().
...or is there some reason that this isn't a problem for other
platforms using NMI? Maybe it's just that nobody has noticed it yet?

> +			kgdb_nmicallback(raw_smp_processor_id(), regs);

Why do you need to call raw_smp_processor_id()? Are you expecting a
different value than the local variable "cpu"?

> +			if (in_nmi())
> +				printk_nmi_enter();
> +		}
> +#endif
>  		if (!in_nmi())
>  			irq_exit();
>  		break;

Not that I really know what I'm talking about since I really don't
know arm64 at this level very well, but I'll ask anyway and probably
look like a fool... I had a note that said:

 * Will Deacon says:
 *
 * the whole roundup code is sketchy and it's the only place in the kernel
 * which tries to perform I-cache maintenance with irqs disabled, leading
 * to this nasty hack in the arch code:
 *
 * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/cacheflush.h#n74

I presume that, if nothing else, the comment needs to be updated.
...but is the situation any better (or worse?) with your new solution?

-Doug
Hi Doug,

Thanks for your comments.

On Sat, 25 Apr 2020 at 02:17, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Fri, Apr 24, 2020 at 4:11 AM Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > arm64 platforms with GICv3 or later support pseudo NMIs which can be
> > leveraged to round up CPUs which are stuck in hard lockup state with
> > interrupts disabled that wouldn't be possible with a normal IPI.
> >
> > So instead switch to round up CPUs using IPI_CALL_NMI_FUNC. And in
> > case a particular arm64 platform doesn't support pseudo NMIs,
> > IPI_CALL_NMI_FUNC will act as a normal IPI which maintains existing
> > kgdb functionality.
> >
> > Also, one thing to note here is that with CPUs running in NMI context,
> > kernel has special handling for printk() which involves CPU specific
> > buffers and deferring printk() until exit from NMI context. But with kgdb
> > we don't want to defer printk() especially backtrace on corresponding
> > CPUs. So switch to normal printk() context instead prior to entering
> > kgdb context.
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > ---
> >  arch/arm64/kernel/kgdb.c | 15 +++++++++++++++
> >  arch/arm64/kernel/smp.c  | 17 ++++++++++++++---
> >  2 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> > index 4311992..0851ead 100644
> > --- a/arch/arm64/kernel/kgdb.c
> > +++ b/arch/arm64/kernel/kgdb.c
> > @@ -14,6 +14,7 @@
> >  #include <linux/kgdb.h>
> >  #include <linux/kprobes.h>
> >  #include <linux/sched/task_stack.h>
> > +#include <linux/smp.h>
> >
> >  #include <asm/debug-monitors.h>
> >  #include <asm/insn.h>
> > @@ -353,3 +354,17 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> >  	return aarch64_insn_write((void *)bpt->bpt_addr,
> >  			*(u32 *)bpt->saved_instr);
> >  }
> > +
> > +#ifdef CONFIG_SMP
> > +void kgdb_roundup_cpus(void)
> > +{
> > +	struct cpumask mask;
> > +
> > +	cpumask_copy(&mask, cpu_online_mask);
> > +	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> > +	if (cpumask_empty(&mask))
> > +		return;
> > +
> > +	arch_send_call_nmi_func_ipi_mask(&mask);
> > +}
> > +#endif
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 27c8ee1..c7158f6e8 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/of.h>
> >  #include <linux/irq_work.h>
> >  #include <linux/kexec.h>
> > +#include <linux/kgdb.h>
> >  #include <linux/kvm_host.h>
> >
> >  #include <asm/alternative.h>
> > @@ -976,9 +977,19 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
> >  		/* Handle it as a normal interrupt if not in NMI context */
> >  		if (!in_nmi())
> >  			irq_enter();
> > -
> > -		/* nop, IPI handlers for special features can be added here. */
> > -
> > +#ifdef CONFIG_KGDB
>
> My vote would be to keep "ifdef"s out of the middle of functions. Can
> you put your code in "arch/arm64/kernel/kgdb.c" and then have a dummy
> no-op function if "CONFIG_KGDB" isn't defined?
>

Sure.

> > +		if (atomic_read(&kgdb_active) != -1) {
> > +			/*
> > +			 * For kgdb to work properly, we need printk to operate
> > +			 * in normal context.
> > +			 */
> > +			if (in_nmi())
> > +				printk_nmi_exit();
>
> It feels like all the printk management belongs in kgdb_nmicallback().
> ...or is there some reason that this isn't a problem for other
> platforms using NMI? Maybe it's just that nobody has noticed it yet?
>

Initially I was skeptical about moving this printk handling into the
common kgdb framework, but after looking at other platforms such as x86
(where this is probably an unnoticed bug), I agree with you that it
belongs in kgdb_nmicallback(). So I will move it there.

> > +			kgdb_nmicallback(raw_smp_processor_id(), regs);
>
> Why do you need to call raw_smp_processor_id()? Are you expecting a
> different value than the local variable "cpu"?

Ah, no. I will use the local variable "cpu" instead.

>
> > +			if (in_nmi())
> > +				printk_nmi_enter();
> > +		}
> > +#endif
> >  		if (!in_nmi())
> >  			irq_exit();
> >  		break;
>
> Not that I really know what I'm talking about since I really don't
> know arm64 at this level very well, but I'll ask anyway and probably
> look like a fool... I had a note that said:
>
>  * Will Deacon says:
>  *
>  * the whole roundup code is sketchy and it's the only place in the kernel
>  * which tries to perform I-cache maintenance with irqs disabled, leading
>  * to this nasty hack in the arch code:
>  *
>  * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/cacheflush.h#n74
>
> I presume that, if nothing else, the comment needs to be updated.
> ...but is the situation any better (or worse?) with your new solution?

I think the situation remains the same with the new solution. Whether
the roundup IPI is delivered as a pseudo NMI or as a normal IRQ, kgdb
still does I-cache maintenance with irqs disabled, which could lead to
a deadlock when trying to IPI the secondary CPUs without this nasty
hack in the arch code.

-Sumit

>
> -Doug
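The two changes agreed on in this exchange (a no-op stub in place of the in-function #ifdef, and the printk context switch moved into the kgdb callback) can be illustrated with a small userspace sketch. This is only a toy model and not the follow-up patch: every sketch_* name is hypothetical, plain variables stand in for in_nmi(), kgdb_active and the printk NMI state, and SKETCH_CONFIG_KGDB stands in for CONFIG_KGDB.

```c
#include <assert.h>
#include <stdbool.h>

/* Models CONFIG_KGDB=y; delete this line to model a build without kgdb. */
#define SKETCH_CONFIG_KGDB 1

/* Toy state standing in for kernel context. None of these are kernel symbols. */
static bool sketch_in_nmi;              /* models in_nmi() */
static bool sketch_printk_deferred;     /* true while printk output would be deferred */
static int sketch_kgdb_active = -1;     /* models kgdb_active; -1 means no debug session */
static int sketch_callbacks_run;        /* how many times the debugger callback ran */
static bool sketch_saw_deferred_printk; /* set if the callback ran with printk deferred */

#ifdef SKETCH_CONFIG_KGDB
/* Models kgdb_nmicallback() once the printk handling lives inside it. */
static void sketch_kgdb_nmicallback(int cpu)
{
	(void)cpu;
	if (sketch_in_nmi)
		sketch_printk_deferred = false; /* models printk_nmi_exit() */

	/* The debugger work would happen here, with printk in normal context. */
	sketch_callbacks_run++;
	if (sketch_printk_deferred)
		sketch_saw_deferred_printk = true;

	if (sketch_in_nmi)
		sketch_printk_deferred = true;  /* models printk_nmi_enter() */
}

/* Hypothetical arch hook that the IPI handler calls unconditionally. */
static void sketch_kgdb_ipi(int cpu)
{
	if (sketch_kgdb_active != -1)
		sketch_kgdb_nmicallback(cpu);
}
#else
/* No-op stub, so the caller needs no #ifdef of its own. */
static inline void sketch_kgdb_ipi(int cpu)
{
	(void)cpu;
}
#endif

/* Models the IPI_CALL_NMI_FUNC case of handle_IPI(): no #ifdef in the body. */
static void sketch_handle_ipi(int cpu)
{
	assert(cpu >= 0);
	sketch_kgdb_ipi(cpu); /* reuses the handler's own "cpu" value */
}
```

With this shape the IPI handler stays free of configuration conditionals, and any architecture that enters the callback from NMI context would get the same printk treatment without duplicating it.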
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 4311992..0851ead 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -14,6 +14,7 @@
 #include <linux/kgdb.h>
 #include <linux/kprobes.h>
 #include <linux/sched/task_stack.h>
+#include <linux/smp.h>
 
 #include <asm/debug-monitors.h>
 #include <asm/insn.h>
@@ -353,3 +354,17 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
 	return aarch64_insn_write((void *)bpt->bpt_addr,
 			*(u32 *)bpt->saved_instr);
 }
+
+#ifdef CONFIG_SMP
+void kgdb_roundup_cpus(void)
+{
+	struct cpumask mask;
+
+	cpumask_copy(&mask, cpu_online_mask);
+	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
+	if (cpumask_empty(&mask))
+		return;
+
+	arch_send_call_nmi_func_ipi_mask(&mask);
+}
+#endif
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 27c8ee1..c7158f6e8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -31,6 +31,7 @@
 #include <linux/of.h>
 #include <linux/irq_work.h>
 #include <linux/kexec.h>
+#include <linux/kgdb.h>
 #include <linux/kvm_host.h>
 
 #include <asm/alternative.h>
@@ -976,9 +977,19 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		/* Handle it as a normal interrupt if not in NMI context */
 		if (!in_nmi())
 			irq_enter();
-
-		/* nop, IPI handlers for special features can be added here. */
-
+#ifdef CONFIG_KGDB
+		if (atomic_read(&kgdb_active) != -1) {
+			/*
+			 * For kgdb to work properly, we need printk to operate
+			 * in normal context.
+			 */
+			if (in_nmi())
+				printk_nmi_exit();
+			kgdb_nmicallback(raw_smp_processor_id(), regs);
+			if (in_nmi())
+				printk_nmi_enter();
+		}
+#endif
 		if (!in_nmi())
 			irq_exit();
 		break;
arm64 platforms with GICv3 or later support pseudo NMIs which can be
leveraged to round up CPUs which are stuck in hard lockup state with
interrupts disabled that wouldn't be possible with a normal IPI.

So instead switch to round up CPUs using IPI_CALL_NMI_FUNC. And in
case a particular arm64 platform doesn't support pseudo NMIs,
IPI_CALL_NMI_FUNC will act as a normal IPI which maintains existing
kgdb functionality.

Also, one thing to note here is that with CPUs running in NMI context,
kernel has special handling for printk() which involves CPU specific
buffers and deferring printk() until exit from NMI context. But with kgdb
we don't want to defer printk() especially backtrace on corresponding
CPUs. So switch to normal printk() context instead prior to entering
kgdb context.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/arm64/kernel/kgdb.c | 15 +++++++++++++++
 arch/arm64/kernel/smp.c  | 17 ++++++++++++++---
 2 files changed, 29 insertions(+), 3 deletions(-)
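
For readers less familiar with the cpumask helpers, the target selection in kgdb_roundup_cpus() (all online CPUs except the caller, with an early return when nobody is left) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: a 64-bit word stands in for struct cpumask, and sketch_roundup_targets() and SKETCH_MAX_CPUS are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
#define SKETCH_MAX_CPUS 64u

/*
 * Return the set of CPUs to round up: every online CPU except the caller.
 * A result of 0 means there is nobody to signal, which corresponds to the
 * cpumask_empty() early return in the patch.
 */
static uint64_t sketch_roundup_targets(uint64_t online_mask, unsigned int self)
{
	assert(self < SKETCH_MAX_CPUS);
	/* Models cpumask_copy() followed by cpumask_clear_cpu(self, ...). */
	return online_mask & ~(UINT64_C(1) << self);
}
```

In the patch, a non-empty result is what gets passed to arch_send_call_nmi_func_ipi_mask(); on a single-CPU system the mask is empty and no IPI is sent.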