Message ID | 20200226132947.29738-2-ionela.voinescu@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: ARMv8.4 Activity Monitors support | expand |
Hi Ionela, On Wed, Feb 26, 2020 at 01:29:41PM +0000, Ionela Voinescu wrote: > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index dbc22d684627..49f0c436928f 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -318,6 +318,15 @@ > Format: <a>,<b> > See also Documentation/input/joydev/joystick.rst > > + amu= [ARM64] > + Enables or disables detection, enablement and access to > + counter registers of the Activity Monitors Unit (AMU). > + Format: amu=[0/1/on/off/y/n] > + amu=[0/off/n] ensures access to AMU's counter registers > + is not attempted. > + amu=[1/on/y] (default) enables detection and access to > + AMU's counter registers. Is the only reason for this parameter to be able to disable the feature if the firmware doesn't support it? According to the Kconfig entry, you may see weird behaviour, firmware lock-up. Is the user supposed to try again with amu=0? I'm not particularly fond of adding kernel parameters to work around broken firmware. We have other architecture features (e.g. PtrAuth) that need enabling at EL3 but we don't have such parameters. If it's likely that we hit this issue in practice, I'd rather have the firmware describing the presence of AMU via some DT entry. But I'd rather not bother at all, just get the vendors to update their firmware. If we drop this parameter, patch 1 would need to change. Otherwise the patches look fine. Thanks.
Hi Catalin, On Friday 28 Feb 2020 at 10:32:34 (+0000), Catalin Marinas wrote: > Hi Ionela, > > On Wed, Feb 26, 2020 at 01:29:41PM +0000, Ionela Voinescu wrote: > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > > index dbc22d684627..49f0c436928f 100644 > > --- a/Documentation/admin-guide/kernel-parameters.txt > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > @@ -318,6 +318,15 @@ > > Format: <a>,<b> > > See also Documentation/input/joydev/joystick.rst > > > > + amu= [ARM64] > > + Enables or disables detection, enablement and access to > > + counter registers of the Activity Monitors Unit (AMU). > > + Format: amu=[0/1/on/off/y/n] > > + amu=[0/off/n] ensures access to AMU's counter registers > > + is not attempted. > > + amu=[1/on/y] (default) enables detection and access to > > + AMU's counter registers. > > Is the only reason for this parameter to be able to disable the feature > if the firmware doesn't support it? According to the Kconfig entry, you > may see weird behaviour, firmware lock-up. Is the user supposed to try > again with amu=0? > > I'm not particularly fond of adding kernel parameters to work around > broken firmware. We have other architecture features (e.g. PtrAuth) that > need enabling at EL3 but we don't have such parameters. If it's likely > that we hit this issue in practice, I'd rather have the firmware > describing the presence of AMU via some DT entry. But I'd rather not > bother at all, just get the vendors to update their firmware. > The firmware is supposed to do three actions for the kernel to be able to use the counters: enable access to EL2/EL1, enable the counters and save/restore the counters before/after core-off. Improper firmware support can trigger different issues: kernel/firmware lockup/panic, invalid counter values (0, non-monotonic). Some of them might be less likely (firmware lockups), and some might just be due to present but improper support(save/restore) and therefore more likely. The users of the counters, for example frequency invariance [6/7], does some validation for this, but unfortunately not all conditions can be fully mitigated - validate and bail out if some condition is not accomplished - for example the save and restore functionality. This might result in improper scale values after idle. Therefore, the amu kernel parameter is not only there if the firmware does not support AMU, but it's also there if the firmware support is broken/improper. The kernel parameter was added at Suzuki's recommendation to be able to bypass its use in single kernels that are meant to run on multiple platforms. I also believe this is nice to have even for platforms that properly support AMU, but they might not want the use of the feature in the kernel. > If we drop this parameter, patch 1 would need to change. Otherwise the > patches look fine. > This being said, I agree this was added as a 'just in case' and not as support for a likely scenario, therefore, I don't fully disagree to drop it for now. Let me know what you think. If you'd still rather I drop it, I can do that and rebase on top of Marc's changes and push v6. Thanks, Ionela.
On Mon, Mar 02, 2020 at 02:23:26PM +0000, Ionela Voinescu wrote: > On Friday 28 Feb 2020 at 10:32:34 (+0000), Catalin Marinas wrote: > > On Wed, Feb 26, 2020 at 01:29:41PM +0000, Ionela Voinescu wrote: > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > > > index dbc22d684627..49f0c436928f 100644 > > > --- a/Documentation/admin-guide/kernel-parameters.txt > > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > > @@ -318,6 +318,15 @@ > > > Format: <a>,<b> > > > See also Documentation/input/joydev/joystick.rst > > > > > > + amu= [ARM64] > > > + Enables or disables detection, enablement and access to > > > + counter registers of the Activity Monitors Unit (AMU). > > > + Format: amu=[0/1/on/off/y/n] > > > + amu=[0/off/n] ensures access to AMU's counter registers > > > + is not attempted. > > > + amu=[1/on/y] (default) enables detection and access to > > > + AMU's counter registers. > > > > Is the only reason for this parameter to be able to disable the feature > > if the firmware doesn't support it? According to the Kconfig entry, you > > may see weird behaviour, firmware lock-up. Is the user supposed to try > > again with amu=0? > > > > I'm not particularly fond of adding kernel parameters to work around > > broken firmware. We have other architecture features (e.g. PtrAuth) that > > need enabling at EL3 but we don't have such parameters. If it's likely > > that we hit this issue in practice, I'd rather have the firmware > > describing the presence of AMU via some DT entry. But I'd rather not > > bother at all, just get the vendors to update their firmware. > > The firmware is supposed to do three actions for the kernel to be able > to use the counters: enable access to EL2/EL1, enable the counters and > save/restore the counters before/after core-off. [...] > Therefore, the amu kernel parameter is not only there if the firmware > does not support AMU, but it's also there if the firmware support is > broken/improper. The kernel parameter was added at Suzuki's > recommendation to be able to bypass its use in single kernels that are > meant to run on multiple platforms. Single kernel images are supposed to run on multiple platforms while using the same command line arguments. There are many other ways firmware can screw up but I'm not keen on working on such assumption and preemptively adding options to ignore CPU features. > I also believe this is nice to have even for platforms that properly > support AMU, but they might not want the use of the feature in the > kernel. Are there any downsides to this feature? If you want it for testing purposes, i.e. different scheduler behaviour, fine by me but I'd change the text in the Kconfig to not even imply that firmware is allowed to be broken as we have a workaround in the kernel. > > If we drop this parameter, patch 1 would need to change. Otherwise the > > patches look fine. > > This being said, I agree this was added as a 'just in case' and not as > support for a likely scenario, therefore, I don't fully disagree to drop > it for now. If you need it for testing different scheduler behaviours, maybe big.LITTLE where AMU is only supported on some CPUs, than keep it. If it's only on the assumption that the firmware may be broken, please remove it. Thanks.
Hi Catalin, On Tuesday 03 Mar 2020 at 16:58:45 (+0000), Catalin Marinas wrote: > On Mon, Mar 02, 2020 at 02:23:26PM +0000, Ionela Voinescu wrote: > > On Friday 28 Feb 2020 at 10:32:34 (+0000), Catalin Marinas wrote: > > > On Wed, Feb 26, 2020 at 01:29:41PM +0000, Ionela Voinescu wrote: > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > > > > index dbc22d684627..49f0c436928f 100644 > > > > --- a/Documentation/admin-guide/kernel-parameters.txt > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > > > @@ -318,6 +318,15 @@ > > > > Format: <a>,<b> > > > > See also Documentation/input/joydev/joystick.rst > > > > > > > > + amu= [ARM64] > > > > + Enables or disables detection, enablement and access to > > > > + counter registers of the Activity Monitors Unit (AMU). > > > > + Format: amu=[0/1/on/off/y/n] > > > > + amu=[0/off/n] ensures access to AMU's counter registers > > > > + is not attempted. > > > > + amu=[1/on/y] (default) enables detection and access to > > > > + AMU's counter registers. > > > > > > Is the only reason for this parameter to be able to disable the feature > > > if the firmware doesn't support it? According to the Kconfig entry, you > > > may see weird behaviour, firmware lock-up. Is the user supposed to try > > > again with amu=0? > > > > > > I'm not particularly fond of adding kernel parameters to work around > > > broken firmware. We have other architecture features (e.g. PtrAuth) that > > > need enabling at EL3 but we don't have such parameters. If it's likely > > > that we hit this issue in practice, I'd rather have the firmware > > > describing the presence of AMU via some DT entry. But I'd rather not > > > bother at all, just get the vendors to update their firmware. > > > > The firmware is supposed to do three actions for the kernel to be able > > to use the counters: enable access to EL2/EL1, enable the counters and > > save/restore the counters before/after core-off. > [...] > > Therefore, the amu kernel parameter is not only there if the firmware > > does not support AMU, but it's also there if the firmware support is > > broken/improper. The kernel parameter was added at Suzuki's > > recommendation to be able to bypass its use in single kernels that are > > meant to run on multiple platforms. > > Single kernel images are supposed to run on multiple platforms while > using the same command line arguments. > > There are many other ways firmware can screw up but I'm not keen on > working on such assumption and preemptively adding options to ignore CPU > features. > > > I also believe this is nice to have even for platforms that properly > > support AMU, but they might not want the use of the feature in the > > kernel. > > Are there any downsides to this feature? If you want it for testing > purposes, i.e. different scheduler behaviour, fine by me but I'd change > the text in the Kconfig to not even imply that firmware is allowed to be > broken as we have a workaround in the kernel. > This solution would not be appropriate to select different scheduler behaviours. That would be the end result of "amu=0", but it would not be a clean way to select the source of information for frequency invariance. The scheduler and the frequency invariance engine is only one of the potential users of activity monitors, while this kernel parameter would disable detection for the full feature so it will affect frequency invariance behaviour and other future users of the counters. In no way I want to send the message that firmware is allowed to be broken or that this is a good way to tune scheduler behaviour. My flawed logic above was to suggest that a few small reasons (potential broken firmware, potential interest to turn off the feature at runtime) would make a big one to justify the parameter, but none of these fully stand on their own. > > > If we drop this parameter, patch 1 would need to change. Otherwise the > > > patches look fine. > > > > This being said, I agree this was added as a 'just in case' and not as > > support for a likely scenario, therefore, I don't fully disagree to drop > > it for now. > > If you need it for testing different scheduler behaviours, maybe > big.LITTLE where AMU is only supported on some CPUs, than keep it. If > it's only on the assumption that the firmware may be broken, please > remove it. > In regards to frequency invariance, there is no downside of using the feature and, if available, it's preferred to use the counters independent on whether all CPUs support them on not. But this would be a bad way to switch between sources of information (cpufreq or counters) for frequency invariance in any case, so I'll remove the parameter. Thanks, Ionela. > Thanks. > > -- > Catalin > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index dbc22d684627..49f0c436928f 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -318,6 +318,15 @@ Format: <a>,<b> See also Documentation/input/joydev/joystick.rst + amu= [ARM64] + Enables or disables detection, enablement and access to + counter registers of the Activity Monitors Unit (AMU). + Format: amu=[0/1/on/off/y/n] + amu=[0/off/n] ensures access to AMU's counter registers + is not attempted. + amu=[1/on/y] (default) enables detection and access to + AMU's counter registers. + analog.map= [HW,JOY] Analog joystick and gamepad support Specifies type or capabilities of an analog joystick connected to one of 16 gameports diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 0b30e884e088..3404ca7004c3 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1517,6 +1517,36 @@ config ARM64_PTR_AUTH endmenu +menu "ARMv8.4 architectural features" + +config ARM64_AMU_EXTN + bool "Enable support for the Activity Monitors Unit CPU extension" + default y + help + The activity monitors extension is an optional extension introduced + by the ARMv8.4 CPU architecture. This enables support for version 1 + of the activity monitors architecture, AMUv1. + + To enable the use of this extension on CPUs that implement it, say Y. + + Note that for architectural reasons, firmware _must_ implement AMU + support when running on CPUs that present the activity monitors + extension. The required support is present in: + * Version 1.5 and later of the ARM Trusted Firmware + + For kernels that have this configuration enabled but boot with broken + firmware, you may need to say N here until the firmware is fixed. + Otherwise you may experience firmware panics or lockups when + accessing the counter registers. Even if you are not observing these + symptoms, the values returned by the register reads might not + correctly reflect reality. Most commonly, the value read will be 0, + indicating that the counter is not enabled. + + Alternatively, a kernel parameter is provided to disable detection, + enablement and access to the counter registers of the Activity + Monitors Unit at runtime. +endmenu + menu "ARMv8.5 architectural features" config ARM64_E0PD diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 865e0253fc1e..185e44aa2713 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -58,7 +58,8 @@ #define ARM64_WORKAROUND_SPECULATIVE_AT_NVHE 48 #define ARM64_HAS_E0PD 49 #define ARM64_HAS_RNG 50 +#define ARM64_HAS_AMU_EXTN 51 -#define ARM64_NCAPS 51 +#define ARM64_NCAPS 52 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 92ef9539874a..485e069d8768 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -678,6 +678,11 @@ static inline bool cpu_has_hw_af(void) ID_AA64MMFR1_HADBS_SHIFT); } +#ifdef CONFIG_ARM64_AMU_EXTN +/* Check whether the cpu supports the Activity Monitors Unit (AMU) */ +extern bool cpu_has_amu_feat(int cpu); +#endif + #endif /* __ASSEMBLY__ */ #endif diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index b91570ff9db1..a6ad64e324a4 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -386,6 +386,42 @@ #define SYS_TPIDR_EL0 sys_reg(3, 3, 13, 0, 2) #define SYS_TPIDRRO_EL0 sys_reg(3, 3, 13, 0, 3) +/* Definitions for system register interface to AMU for ARMv8.4 onwards */ +#define SYS_AM_EL0(crm, op2) sys_reg(3, 3, 13, (crm), (op2)) +#define SYS_AMCR_EL0 SYS_AM_EL0(2, 0) +#define SYS_AMCFGR_EL0 SYS_AM_EL0(2, 1) +#define SYS_AMCGCR_EL0 SYS_AM_EL0(2, 2) +#define SYS_AMUSERENR_EL0 SYS_AM_EL0(2, 3) +#define SYS_AMCNTENCLR0_EL0 SYS_AM_EL0(2, 4) +#define SYS_AMCNTENSET0_EL0 SYS_AM_EL0(2, 5) +#define SYS_AMCNTENCLR1_EL0 SYS_AM_EL0(3, 0) +#define SYS_AMCNTENSET1_EL0 SYS_AM_EL0(3, 1) + +/* + * Group 0 of activity monitors (architected): + * op0 op1 CRn CRm op2 + * Counter: 11 011 1101 010:n<3> n<2:0> + * Type: 11 011 1101 011:n<3> n<2:0> + * n: 0-15 + * + * Group 1 of activity monitors (auxiliary): + * op0 op1 CRn CRm op2 + * Counter: 11 011 1101 110:n<3> n<2:0> + * Type: 11 011 1101 111:n<3> n<2:0> + * n: 0-15 + */ + +#define SYS_AMEVCNTR0_EL0(n) SYS_AM_EL0(4 + ((n) >> 3), (n) & 7) +#define SYS_AMEVTYPE0_EL0(n) SYS_AM_EL0(6 + ((n) >> 3), (n) & 7) +#define SYS_AMEVCNTR1_EL0(n) SYS_AM_EL0(12 + ((n) >> 3), (n) & 7) +#define SYS_AMEVTYPE1_EL0(n) SYS_AM_EL0(14 + ((n) >> 3), (n) & 7) + +/* AMU v1: Fixed (architecturally defined) activity monitors */ +#define SYS_AMEVCNTR0_CORE_EL0 SYS_AMEVCNTR0_EL0(0) +#define SYS_AMEVCNTR0_CONST_EL0 SYS_AMEVCNTR0_EL0(1) +#define SYS_AMEVCNTR0_INST_RET_EL0 SYS_AMEVCNTR0_EL0(2) +#define SYS_AMEVCNTR0_MEM_STALL SYS_AMEVCNTR0_EL0(3) + #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) #define SYS_CNTP_TVAL_EL0 sys_reg(3, 3, 14, 2, 0) @@ -598,6 +634,7 @@ #define ID_AA64PFR0_CSV3_SHIFT 60 #define ID_AA64PFR0_CSV2_SHIFT 56 #define ID_AA64PFR0_DIT_SHIFT 48 +#define ID_AA64PFR0_AMU_SHIFT 44 #define ID_AA64PFR0_SVE_SHIFT 32 #define ID_AA64PFR0_RAS_SHIFT 28 #define ID_AA64PFR0_GIC_SHIFT 24 @@ -608,6 +645,7 @@ #define ID_AA64PFR0_EL1_SHIFT 4 #define ID_AA64PFR0_EL0_SHIFT 0 +#define ID_AA64PFR0_AMU 0x1 #define ID_AA64PFR0_SVE 0x1 #define ID_AA64PFR0_RAS_V1 0x1 #define ID_AA64PFR0_FP_NI 0xf diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 0b6715625cf6..60cebc071603 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -163,6 +163,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_DIT_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_AMU_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0), @@ -1222,6 +1223,78 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap, #endif +#ifdef CONFIG_ARM64_AMU_EXTN + +/* + * The "amu_cpus" cpumask only signals that the CPU implementation for the + * flagged CPUs supports the Activity Monitors Unit (AMU) but does not provide + * information regarding all the events that it supports. When a CPU bit is + * set in the cpumask, the user of this feature can only rely on the presence + * of the 4 fixed counters for that CPU. But this does not guarantee that the + * counters are enabled or access to these counters is enabled by code + * executed at higher exception levels (firmware). + */ +static struct cpumask amu_cpus __read_mostly; + +bool cpu_has_amu_feat(int cpu) +{ + return cpumask_test_cpu(cpu, &amu_cpus); +} + +static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap) +{ + if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) { + pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n", + smp_processor_id()); + cpumask_set_cpu(smp_processor_id(), &amu_cpus); + } +} + +/* + * For known broken firmware, a kernel parameter ("amu=[0/off/n]") is provided + * to ensure access to AMU counter registers is not attempted. By default, the + * feature is enabled, but the kernel parameter can both be used to disable or + * enable the capability at runtime in case the default changes in the future. + * + * To be noted that for security considerations, this does not bypass the + * setting of AMUSERENR_EL0 to trap accesses from EL0 (userspace) to EL1 + * (kernel). Therefore, firmware should still ensure accesses to AMU registers + * are not trapped in EL2/EL3. + */ +static bool enable_amu = true; + +static int __init set_enable_amu(char *str) +{ + bool value; + int ret = strtobool(str, &value); + + if (!ret) + enable_amu = value; + + return ret; +} +early_param("amu", set_enable_amu); + +static bool has_amu(const struct arm64_cpu_capabilities *cap, + int __unused) +{ + /* + * The AMU extension is a non-conflicting feature: the kernel can + * safely run a mix of CPUs with and without support for the + * activity monitors extension. Therefore, if not disabled through + * the kernel command line early parameter, enable the capability + * to allow any late CPU to use the feature. + * + * With this feature enabled, the cpu_enable function will be called + * for all CPUs that match the criteria, including secondary and + * hotplugged, marking this feature as present on that respective CPU. + * The enable function will also print a detection message. + */ + + return enable_amu; +} +#endif + #ifdef CONFIG_ARM64_VHE static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused) { @@ -1499,6 +1572,24 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .cpu_enable = cpu_clear_disr, }, #endif /* CONFIG_ARM64_RAS_EXTN */ +#ifdef CONFIG_ARM64_AMU_EXTN + { + /* + * The feature is enabled by default if CONFIG_ARM64_AMU_EXTN=y. + * Therefore, don't provide .desc as we don't want the detection + * message to be shown until at least one CPU is detected to + * support the feature. + */ + .capability = ARM64_HAS_AMU_EXTN, + .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE, + .matches = has_amu, + .sys_reg = SYS_ID_AA64PFR0_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64PFR0_AMU_SHIFT, + .min_field_value = ID_AA64PFR0_AMU, + .cpu_enable = cpu_amu_enable, + }, +#endif /* CONFIG_ARM64_AMU_EXTN */ { .desc = "Data cache clean to the PoU not required for I/D coherence", .capability = ARM64_HAS_CACHE_IDC,