[v5] arm64: Enable perf events based hard lockup detector

Message ID 1610712101-14929-1-git-send-email-sumit.garg@linaro.org (mailing list archive)
State New, archived

Commit Message

Sumit Garg Jan. 15, 2021, 12:01 p.m. UTC
With the recently added support for delivering perf events as pseudo-NMIs
on platforms which support GICv3 or later, it is now possible to enable
the hard lockup detector (or NMI watchdog) on arm64 platforms. So enable
the corresponding support.

One thing to note here is that the lockup detector is normally initialized
just after the early initcalls, but the PMU on arm64 comes up much later,
as a device_initcall(). So we need to re-initialize lockup detection once
the PMU has been initialized.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---

Changes in v5:
- Invoke lockup_detector_init() from a CPU-bound context, as it makes
  heavy use of per-cpu variables and shouldn't be invoked from a
  preemptible context.

Changes in v4:
- Rebased to latest pmu v7 NMI patch-set [1] and in turn use "has_nmi"
  hook to know if PMU IRQ has been requested as an NMI.
- Add check for return value prior to initializing hard-lockup detector.

[1] https://lkml.org/lkml/2020/9/24/458

Changes in v3:
- Rebased to latest pmu NMI patch-set [1].
- Addressed misc. comments from Stephen.

[1] https://lkml.org/lkml/2020/8/19/671

Changes since RFC:
- Rebased on top of Alex's WIP-pmu-nmi branch.
- Add comment for safe max. CPU frequency.
- Misc. cleanup.

 arch/arm64/Kconfig             |  2 ++
 arch/arm64/kernel/perf_event.c | 48 ++++++++++++++++++++++++++++++++++++++++--
 drivers/perf/arm_pmu.c         |  5 +++++
 include/linux/perf/arm_pmu.h   |  2 ++
 4 files changed, 55 insertions(+), 2 deletions(-)
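
The ordering problem described in the commit message can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: detector_init(), pmu_driver_init(), detector_is_enabled() and the two flags are hypothetical stand-ins, not kernel APIs. It assumes the first detector init attempt fails simply because no PMU is available yet, and that the retry is made only when the PMU interrupt was requested as an NMI.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical stand-ins, not kernel APIs. */
static bool pmu_ready;        /* true once the PMU driver has probed */
static bool detector_enabled; /* true once the hard lockup detector is armed */

/* Model of detector init: arming it needs a usable PMU counter. */
int detector_init(void)
{
	if (!pmu_ready)
		return -1;	/* no PMU yet, detector stays off */
	detector_enabled = true;
	return 0;
}

/* Model of the PMU driver's init, which runs later in boot. */
int pmu_driver_init(bool irq_is_nmi)
{
	assert(!pmu_ready);	/* the initcall runs only once */
	pmu_ready = true;

	/* Retry detector init, but only if the PMU IRQ is an NMI. */
	if (irq_is_nmi)
		(void)detector_init();
	return 0;
}

bool detector_is_enabled(void)
{
	return detector_enabled;
}
```

Calling detector_init() first and pmu_driver_init(true) second mirrors the boot order: the early attempt fails, and the detector only becomes enabled after the PMU driver retries it.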

Comments

Will Deacon Jan. 26, 2021, 2:18 p.m. UTC | #1
Hi Sumit,

On Fri, Jan 15, 2021 at 05:31:41PM +0530, Sumit Garg wrote:
> With the recent feature added to enable perf events to use pseudo NMIs
> as interrupts on platforms which support GICv3 or later, its now been
> possible to enable hard lockup detector (or NMI watchdog) on arm64
> platforms. So enable corresponding support.
> 
> One thing to note here is that normally lockup detector is initialized
> just after the early initcalls but PMU on arm64 comes up much later as
> device_initcall(). So we need to re-initialize lockup detection once
> PMU has been initialized.
> 
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---

[...]

> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 3605f77a..bafb7c8 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -23,6 +23,8 @@
>  #include <linux/platform_device.h>
>  #include <linux/sched_clock.h>
>  #include <linux/smp.h>
> +#include <linux/nmi.h>
> +#include <linux/cpufreq.h>
>  
>  /* ARMv8 Cortex-A53 specific event types. */
>  #define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
> @@ -1246,12 +1248,30 @@ static struct platform_driver armv8_pmu_driver = {
>  	.probe		= armv8_pmu_device_probe,
>  };
>  
> +static int __init lockup_detector_init_fn(void *data)
> +{
> +	lockup_detector_init();
> +	return 0;
> +}
> +
>  static int __init armv8_pmu_driver_init(void)
>  {
> +	int ret;
> +
>  	if (acpi_disabled)
> -		return platform_driver_register(&armv8_pmu_driver);
> +		ret = platform_driver_register(&armv8_pmu_driver);
>  	else
> -		return arm_pmu_acpi_probe(armv8_pmuv3_init);
> +		ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> +
> +	/*
> +	 * Try to re-initialize lockup detector after PMU init in
> +	 * case PMU events are triggered via NMIs.
> +	 */
> +	if (ret == 0 && arm_pmu_irq_is_nmi())
> +		smp_call_on_cpu(raw_smp_processor_id(), lockup_detector_init_fn,
> +				NULL, false);
> +
> +	return ret;

What's wrong with the alternative approach outlined by Mark:

https://lore.kernel.org/r/20210113130235.GB19011@C02TD0UTHF1T.local

?

Will
Sumit Garg Jan. 28, 2021, 7:07 a.m. UTC | #2
Hi Will,

On Tue, 26 Jan 2021 at 19:48, Will Deacon <will@kernel.org> wrote:
>
> Hi Sumit,
>
> On Fri, Jan 15, 2021 at 05:31:41PM +0530, Sumit Garg wrote:
> > With the recent feature added to enable perf events to use pseudo NMIs
> > as interrupts on platforms which support GICv3 or later, its now been
> > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > platforms. So enable corresponding support.
> >
> > One thing to note here is that normally lockup detector is initialized
> > just after the early initcalls but PMU on arm64 comes up much later as
> > device_initcall(). So we need to re-initialize lockup detection once
> > PMU has been initialized.
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > ---
>
> [...]
>
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 3605f77a..bafb7c8 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -23,6 +23,8 @@
> >  #include <linux/platform_device.h>
> >  #include <linux/sched_clock.h>
> >  #include <linux/smp.h>
> > +#include <linux/nmi.h>
> > +#include <linux/cpufreq.h>
> >
> >  /* ARMv8 Cortex-A53 specific event types. */
> >  #define ARMV8_A53_PERFCTR_PREF_LINEFILL                              0xC2
> > @@ -1246,12 +1248,30 @@ static struct platform_driver armv8_pmu_driver = {
> >       .probe          = armv8_pmu_device_probe,
> >  };
> >
> > +static int __init lockup_detector_init_fn(void *data)
> > +{
> > +     lockup_detector_init();
> > +     return 0;
> > +}
> > +
> >  static int __init armv8_pmu_driver_init(void)
> >  {
> > +     int ret;
> > +
> >       if (acpi_disabled)
> > -             return platform_driver_register(&armv8_pmu_driver);
> > +             ret = platform_driver_register(&armv8_pmu_driver);
> >       else
> > -             return arm_pmu_acpi_probe(armv8_pmuv3_init);
> > +             ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> > +
> > +     /*
> > +      * Try to re-initialize lockup detector after PMU init in
> > +      * case PMU events are triggered via NMIs.
> > +      */
> > +     if (ret == 0 && arm_pmu_irq_is_nmi())
> > +             smp_call_on_cpu(raw_smp_processor_id(), lockup_detector_init_fn,
> > +                             NULL, false);
> > +
> > +     return ret;
>
> What's wrong with the alternative approach outlined by Mark:
>
> https://lore.kernel.org/r/20210113130235.GB19011@C02TD0UTHF1T.local
>
> ?

I have replied on this thread.

-Sumit

>
> Will
Sumit Garg Feb. 19, 2021, 9:37 a.m. UTC | #3
Hi Will, Mark,

On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
>
> With the recent feature added to enable perf events to use pseudo NMIs
> as interrupts on platforms which support GICv3 or later, its now been
> possible to enable hard lockup detector (or NMI watchdog) on arm64
> platforms. So enable corresponding support.
>
> One thing to note here is that normally lockup detector is initialized
> just after the early initcalls but PMU on arm64 comes up much later as
> device_initcall(). So we need to re-initialize lockup detection once
> PMU has been initialized.
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---
>
> Changes in v5:
> - Fix lockup_detector_init() invocation to be rather invoked from CPU
>   binded context as it makes heavy use of per-cpu variables and shouldn't
>   be invoked from preemptible context.
>

Do you have any further comments on this?

Lecopzer,

Does this feature work fine for you now?

-Sumit

Lecopzer Chen March 30, 2021, 8:06 a.m. UTC | #4
> Hi Will, Mark,
> 
> On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > With the recent feature added to enable perf events to use pseudo NMIs
> > as interrupts on platforms which support GICv3 or later, its now been
> > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > platforms. So enable corresponding support.
> >
> > One thing to note here is that normally lockup detector is initialized
> > just after the early initcalls but PMU on arm64 comes up much later as
> > device_initcall(). So we need to re-initialize lockup detection once
> > PMU has been initialized.
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > ---
> >
> > Changes in v5:
> > - Fix lockup_detector_init() invocation to be rather invoked from CPU
> >   binded context as it makes heavy use of per-cpu variables and shouldn't
> >   be invoked from preemptible context.
> >
> 
> Do you have any further comments on this?
> 
> Lecopzer,
> 
> Does this feature work fine for you now?

This really fixes the warning; I now have real hardware to test this on.
But do we need to call lockup_detector_init() on each CPU?

In init/main.c, it's called only once, by CPU 0.


BRs,
Lecopzer
Lecopzer Chen March 30, 2021, 8:32 a.m. UTC | #5
> > Hi Will, Mark,
> > 
> > On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
> > >
> > > With the recent feature added to enable perf events to use pseudo NMIs
> > > as interrupts on platforms which support GICv3 or later, its now been
> > > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > > platforms. So enable corresponding support.
> > >
> > > One thing to note here is that normally lockup detector is initialized
> > > just after the early initcalls but PMU on arm64 comes up much later as
> > > device_initcall(). So we need to re-initialize lockup detection once
> > > PMU has been initialized.
> > >
> > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > ---
> > >
> > > Changes in v5:
> > > - Fix lockup_detector_init() invocation to be rather invoked from CPU
> > >   binded context as it makes heavy use of per-cpu variables and shouldn't
> > >   be invoked from preemptible context.
> > >
> > 
> > Do you have any further comments on this?
> > 
> > Lecopzer,
> > 
> > Does this feature work fine for you now?
> 
> This really fixes the warning, I have a real hardware for testing this now.
> but do we need to call lockup_detector_init() for each cpu?
> 
> In init/main.c, it's only called by cpu 0 for once.
 
Oh sorry, I just misread the code; please ignore my previous mail.
 

BRs,
Lecopzer
Sumit Garg March 30, 2021, 12:30 p.m. UTC | #6
On Tue, 30 Mar 2021 at 14:07, Lecopzer Chen <lecopzer.chen@mediatek.com> wrote:
>
> > > Hi Will, Mark,
> > >
> > > On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > >
> > > > With the recent feature added to enable perf events to use pseudo NMIs
> > > > as interrupts on platforms which support GICv3 or later, its now been
> > > > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > > > platforms. So enable corresponding support.
> > > >
> > > > One thing to note here is that normally lockup detector is initialized
> > > > just after the early initcalls but PMU on arm64 comes up much later as
> > > > device_initcall(). So we need to re-initialize lockup detection once
> > > > PMU has been initialized.
> > > >
> > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > ---
> > > >
> > > > Changes in v5:
> > > > - Fix lockup_detector_init() invocation to be rather invoked from CPU
> > > >   binded context as it makes heavy use of per-cpu variables and shouldn't
> > > >   be invoked from preemptible context.
> > > >
> > >
> > > Do you have any further comments on this?
> > >
> > > Lecopzer,
> > >
> > > Does this feature work fine for you now?
> >
> > This really fixes the warning, I have a real hardware for testing this now.

Thanks for testing. I'll take that as an implicit Tested-by.

> > but do we need to call lockup_detector_init() for each cpu?
> >
> > In init/main.c, it's only called by cpu 0 for once.
>
> Oh sorry, I just misread the code, please ignore previous mail.
>

No worries.

-Sumit

>
> BRs,
> Lecopzer
Sumit Garg April 12, 2021, 12:01 p.m. UTC | #7
Hi Will,

On Tue, 30 Mar 2021 at 18:00, Sumit Garg <sumit.garg@linaro.org> wrote:
>
> On Tue, 30 Mar 2021 at 14:07, Lecopzer Chen <lecopzer.chen@mediatek.com> wrote:
> >
> > > > Hi Will, Mark,
> > > >
> > > > On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > >
> > > > > With the recent feature added to enable perf events to use pseudo NMIs
> > > > > as interrupts on platforms which support GICv3 or later, its now been
> > > > > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > > > > platforms. So enable corresponding support.
> > > > >
> > > > > One thing to note here is that normally lockup detector is initialized
> > > > > just after the early initcalls but PMU on arm64 comes up much later as
> > > > > device_initcall(). So we need to re-initialize lockup detection once
> > > > > PMU has been initialized.
> > > > >
> > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > ---
> > > > >
> > > > > Changes in v5:
> > > > > - Fix lockup_detector_init() invocation to be rather invoked from CPU
> > > > >   binded context as it makes heavy use of per-cpu variables and shouldn't
> > > > >   be invoked from preemptible context.
> > > > >
> > > >
> > > > Do you have any further comments on this?
> > > >

Since there aren't any further comments, can you re-pick this feature for 5.13?

-Sumit

Will Deacon April 19, 2021, 5:03 p.m. UTC | #8
On Mon, Apr 12, 2021 at 05:31:13PM +0530, Sumit Garg wrote:
> On Tue, 30 Mar 2021 at 18:00, Sumit Garg <sumit.garg@linaro.org> wrote:
> > On Tue, 30 Mar 2021 at 14:07, Lecopzer Chen <lecopzer.chen@mediatek.com> wrote:
> > > > > On Fri, 15 Jan 2021 at 17:32, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > > >
> > > > > > With the recent feature added to enable perf events to use pseudo NMIs
> > > > > > as interrupts on platforms which support GICv3 or later, its now been
> > > > > > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > > > > > platforms. So enable corresponding support.
> > > > > >
> > > > > > One thing to note here is that normally lockup detector is initialized
> > > > > > just after the early initcalls but PMU on arm64 comes up much later as
> > > > > > device_initcall(). So we need to re-initialize lockup detection once
> > > > > > PMU has been initialized.
> > > > > >
> > > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > ---
> > > > > >
> > > > > > Changes in v5:
> > > > > > - Fix lockup_detector_init() invocation to be rather invoked from CPU
> > > > > >   binded context as it makes heavy use of per-cpu variables and shouldn't
> > > > > >   be invoked from preemptible context.
> > > > > >
> > > > >
> > > > > Do you have any further comments on this?
> > > > >
> 
> Since there aren't any further comments, can you re-pick this feature for 5.13?

I'd still like Mark's Ack on this, as the approach you have taken doesn't
really sit with what he was suggesting.

I also don't understand how all the CPUs get initialised with your patch,
since the PMU driver will be initialised after SMP is up and running.

Will
Huang Shijie July 19, 2021, 6:35 a.m. UTC | #9
On Mon, Jul 19, 2021 at 11:48:33AM +0530, Sumit Garg wrote:
> [...]
Tested-by: Huang Shijie <shijie@os.amperecomputing.com>
Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f39568b..05e1735 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -174,6 +174,8 @@  config ARM64
 	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
+	select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI && HW_PERF_EVENTS
+	select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 3605f77a..bafb7c8 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -23,6 +23,8 @@ 
 #include <linux/platform_device.h>
 #include <linux/sched_clock.h>
 #include <linux/smp.h>
+#include <linux/nmi.h>
+#include <linux/cpufreq.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
@@ -1246,12 +1248,30 @@  static struct platform_driver armv8_pmu_driver = {
 	.probe		= armv8_pmu_device_probe,
 };
 
+static int __init lockup_detector_init_fn(void *data)
+{
+	lockup_detector_init();
+	return 0;
+}
+
 static int __init armv8_pmu_driver_init(void)
 {
+	int ret;
+
 	if (acpi_disabled)
-		return platform_driver_register(&armv8_pmu_driver);
+		ret = platform_driver_register(&armv8_pmu_driver);
 	else
-		return arm_pmu_acpi_probe(armv8_pmuv3_init);
+		ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
+
+	/*
+	 * Try to re-initialize lockup detector after PMU init in
+	 * case PMU events are triggered via NMIs.
+	 */
+	if (ret == 0 && arm_pmu_irq_is_nmi())
+		smp_call_on_cpu(raw_smp_processor_id(), lockup_detector_init_fn,
+				NULL, false);
+
+	return ret;
 }
 device_initcall(armv8_pmu_driver_init)
 
@@ -1309,3 +1329,27 @@  void arch_perf_update_userpage(struct perf_event *event,
 	userpg->cap_user_time_zero = 1;
 	userpg->cap_user_time_short = 1;
 }
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
+/*
+ * Safe maximum CPU frequency, used when a platform doesn't implement a
+ * cpufreq driver. The architecture puts no restriction on the maximum
+ * frequency, but 5 GHz seems a safe upper bound since available Arm CPUs
+ * are clocked well below that. On the other hand, it can't be made much
+ * higher, as that would lead to a long hard-lockup detection timeout on
+ * slower parts (e.g. 1 GHz on Developerbox) that lack a cpufreq driver.
+ */
+#define SAFE_MAX_CPU_FREQ	5000000000UL // 5 GHz
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
+{
+	unsigned int cpu = smp_processor_id();
+	unsigned long max_cpu_freq;
+
+	max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
+	if (!max_cpu_freq)
+		max_cpu_freq = SAFE_MAX_CPU_FREQ;
+
+	return (u64)max_cpu_freq * watchdog_thresh;
+}
+#endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index cb2f55f..794a37d 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -726,6 +726,11 @@  static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
 	return per_cpu(hw_events->irq, cpu);
 }
 
+bool arm_pmu_irq_is_nmi(void)
+{
+	return has_nmi;
+}
+
 /*
  * PMU hardware loses all context when a CPU goes offline.
  * When a CPU is hotplugged back in, since some hardware registers are
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 5054802..bf79667 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -163,6 +163,8 @@  int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
 static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
 #endif
 
+bool arm_pmu_irq_is_nmi(void);
+
 /* Internal functions only for core arm_pmu code */
 struct arm_pmu *armpmu_alloc(void);
 struct arm_pmu *armpmu_alloc_atomic(void);
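
The fallback in hw_nmi_get_sample_period() is easier to judge with numbers. The following is a user-space sketch, not kernel code: sample_period_cycles() mirrors the arithmetic in the patch, while timeout_seconds() and the hw_max_khz parameter (standing in for the value returned by cpufreq_get_hw_max_freq()) are hypothetical names introduced for illustration. It assumes the watchdog event counts CPU cycles, so the sample period divided by the real clock rate gives the time until the NMI fires.

```c
#include <assert.h>
#include <stdint.h>

/* Fallback used by the patch when cpufreq reports no maximum: 5 GHz. */
#define SAFE_MAX_CPU_FREQ_HZ 5000000000ULL

/*
 * Model of hw_nmi_get_sample_period(): the number of CPU cycles counted
 * before the watchdog NMI fires. hw_max_khz is the maximum frequency in
 * kHz; 0 means the platform has no cpufreq driver.
 */
uint64_t sample_period_cycles(uint32_t hw_max_khz, int watchdog_thresh)
{
	uint64_t max_hz = (uint64_t)hw_max_khz * 1000ULL;

	assert(watchdog_thresh > 0);
	if (!max_hz)
		max_hz = SAFE_MAX_CPU_FREQ_HZ;
	return max_hz * (uint64_t)watchdog_thresh;
}

/*
 * Hypothetical helper: whole seconds until the period elapses on a CPU
 * that is really running at actual_hz.
 */
uint64_t timeout_seconds(uint64_t period_cycles, uint64_t actual_hz)
{
	assert(actual_hz > 0);
	return period_cycles / actual_hz;
}
```

With a threshold of 10 and no cpufreq driver, the period is 5e9 * 10 = 5e10 cycles, so a part really clocked at 1 GHz would need about 50 seconds to trigger rather than 10. When cpufreq reports 1 GHz (1000000 kHz), the period drops to 1e10 cycles and the timeout matches the threshold. This is the trade-off the comment above SAFE_MAX_CPU_FREQ describes.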