Message ID | 20230830191314.1618136-1-dianders@chromium.org (mailing list archive) |
---|---|
Headers | show |
Series | arm64: Add IPI for backtraces / kgdb; try to use NMI for some IPIs | expand |
On Thu, Aug 31, 2023 at 3:14 AM Douglas Anderson <dianders@chromium.org> wrote: > > This is an attempt to resurrect Sumit's old patch series [1] that > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and > also to round up CPUs in kdb/kgdb. The last post from Sumit that I > could find was v7, so I started my series at v8. I haven't copied all > of his old changelongs here, but you can find them from the link. > > This patch series targets v6.6. Specifically it can't land in v6.5 > since it depends on commit 8d539b84f1e3 ("nmi_backtrace: allow > excluding an arbitrary CPU"). > > It should be noted that Mark still feels there might be some corner > cases where pseudo-NMI is not production ready [2] [3], but as far as > I'm aware there are no concrete/documented issues. Regardless of > whether this should be enabled for production, though, this series > will be invaluable to anyone trying to debug crashes on arm64 > machines. > > v12 of this series collects tags, fixes a few small nits in comments > and commit messages from v11 and adds a new (and somewhat unrelated) > small patch to the end of the series. There are no code changes other > than the last patch, which is tiny. > > v11 of this series addressed Stephen Boyd's feedback on v10 and added > a missing "static" that the patches robot found. > > v10 of this series attempted to address all of Mark's feedback on > v9. As a quick summary: > - It includes his patch to remove IPI_WAKEUP, freeing up an extra IPI. > - It no longer combines the "kgdb" and "backtrace" IPIs. If we need > another IPI these could always be recombined later. > - It promotes IPI_CPU_STOP and IPI_CPU_CRASH_STOP to NMI. > - It puts nearly all the code directly in smp.c. > - Several of the patches are squashed together. > - Patch #6 ("kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB") > was dropped from the series since it landed. > > Between v8 and v9, I had cleaned up this patch series by integrating > the 10th patch from v8 [4] into the whole series. As part of this, I > renamed the "NMI IPI" to the "debug IPI" since it could now be backed > by a regular IPI in the case that pseudo NMIs weren't available. With > the fallback, this allowed me to drop some extra patches from the > series. This feels (to me) to be pretty clean and hopefully others > agree. Any patch I touched significantly I removed Masayoshi and > Chen-Yu's tags from. > > ...also in v8, I reorderd the patches a bit in a way that seemed a > little cleaner to me. > > Since v7, I have: > * Addressed the small amount of feedback that was there for v7. > * Rebased. > * Added a new patch that prevents us from spamming the logs with idle > tasks. > * Added an extra patch to gracefully fall back to regular IPIs if > pseudo-NMIs aren't there. > > It can be noted that this patch series works very well with the recent > "hardlockup" patches that have landed through Andrew Morton's tree and > are currently in mainline. It works especially well with the "buddy" > lockup detector. > > [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ > [2] https://lore.kernel.org/lkml/ZFvGqD%2F%2Fpm%2FlZb+p@FVFF77S0Q05N.cambridge.arm.com/ > [3] https://lore.kernel.org/lkml/ZNDKVP2m-iiZCz3v@FVFF77S0Q05N.cambridge.arm.com > [4] https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid/ > > Changes in v12: > - ("arm64: smp: Mark IPI globals as __ro_after_init") new for v12. > - Added a comment about why we account for 16 SGIs when Linux uses 8. > - Minor comment change to add "()" after nmi_trigger_cpumask_backtrace. > - Updated the commit hash of the commit this depends on. > > Changes in v11: > - Adjust comment about NR_IPI/MAX_IPI. > - Don't use confusing "backed by" idiom in comment. > - Made arm64_backtrace_ipi() static. > - Updated commit message as per Stephen. > - arch_send_wakeup_ipi() now takes an unsigned int. > > Changes in v10: > - ("IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI") new for v10. > - ("arm64: smp: Remove dedicated wakeup IPI") new for v10. > - Backtrace now directly supported in smp.c > - Don't allocate the cpumask on the stack; just iterate. > - Moved kgdb calls to smp.c to avoid needing to export IPI info. > - Rewrite as needed for 5.11+ as per Mark Rutland and Sumit. > - Squash backtrace into patch adding support for pseudo-NMI IPIs. > - kgdb now has its own IPI. > > Changes in v9: > - Added comments that we might not be using NMI always. > - Added to commit message that this doesn't catch all cases. > - Fold in v8 patch #10 ("Fallback to a regular IPI if NMI isn't enabled") > - Moved header file out of "include" since it didn't need to be there. > - Remove arm64_supports_nmi() > - Remove fallback for when debug IPI isn't available. > - Renamed "NMI IPI" to "debug IPI" since it might not be backed by NMI. > - arch_trigger_cpumask_backtrace() no longer returns bool > > Changes in v8: > - "Tag the arm64 idle functions as __cpuidle" new for v8 > - Removed "#ifdef CONFIG_SMP" since arm64 is always SMP > - debug_ipi_setup() and debug_ipi_teardown() no longer take cpu param > > Douglas Anderson (6): > irqchip/gic-v3: Enable support for SGIs to act as NMIs > arm64: idle: Tag the arm64 idle functions as __cpuidle > arm64: smp: Add arch support for backtrace using pseudo-NMI > arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI > arm64: kgdb: Implement kgdb_roundup_cpus() to enable pseudo-NMI > roundup > arm64: smp: Mark IPI globals as __ro_after_init > > Mark Rutland (1): > arm64: smp: Remove dedicated wakeup IPI Whole series is Tested-by: Chen-Yu Tsai <wenst@chromium.org> on SolidRun i.MX8MM Hummingboard Pulse by injecting HARDLOCKUP with lkdtm.