Message ID | 20210811171505.1502090-1-wenst@chromium.org (mailing list archive) |
---|---|
State | Mainlined |
Headers | show |
Series | irqchip/gic-v3: Fix priority comparison when non-secure priorities are used | expand |
+ Alex, who introduced this. On Wed, 11 Aug 2021 18:15:05 +0100, Chen-Yu Tsai <wenst@chromium.org> wrote: > > When non-secure priorities are used, compared to the raw priority set, > the value read back from RPR is also right-shifted by one and the > highest bit set. > > Add a macro to do the modifications to the raw priority when doing the > comparison against the RPR value. This corrects the pseudo-NMI behavior > when non-secure priorities in the GIC are used. Tested on 5.10 with > the "IPI as pseudo-NMI" series [1] applied on MT8195. > > [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ > > Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> > --- > drivers/irqchip/irq-gic-v3.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > index e0f4debe64e1..e7a0b55413db 100644 > --- a/drivers/irqchip/irq-gic-v3.c > +++ b/drivers/irqchip/irq-gic-v3.c > @@ -100,6 +100,15 @@ EXPORT_SYMBOL(gic_pmr_sync); > DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities); > EXPORT_SYMBOL(gic_nonsecure_priorities); > > +#define GICD_INT_RPR_PRI(priority) \ > + ({ \ > + u32 __priority = (priority); \ > + if (static_branch_unlikely(&gic_nonsecure_priorities)) \ > + __priority = 0x80 | (__priority >> 1); \ > + \ > + __priority; \ This doesn't reflect what the pseudocode says of a read of ICC_RPR_EL1 AFAICS. When the priority is activated, it is indeed shifted. But a read of RPR does appear to shift things back (and you loose the lowest bit in the process). Please see 'aarch64/support/ICC_RPR_EL1' in the architecture spec. Can you confirm that SCR_EL3.FIQ is set on your system? Thanks, M. > + }) > + > /* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */ > static refcount_t *ppi_nmi_refs; > > @@ -687,7 +696,7 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs > return; > > if (gic_supports_nmi() && > - unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) { > gic_handle_nmi(irqnr, regs); > return; > }
Hi, After re-familiarizing myself with the spec, it starting to look to me like indeed there's something not quite right (read as: totally broken) with my patch. Arm IHI 0069F, the pseudocode for reading ICC_RPR_EL1 (page 11-797), says that the priority returned is unchanged if SCTLR_EL3.FIQ == 0. This means that the ICC_RPR_EL1 read will return the secure view (the value as it is stored by the GIC) of the priority, so for pseudo-nmis it will return (GICD_INT_NMI_PRI >> 1) | 0x80, which definitely != GICD_INT_NMI_PRI. This is further confirmed by this statement on page 4-67: "When GICD_CTLR.DS == 0, [..] For Non-secure access to ICC_PMR_EL1 and ICC_RPR_EL1 when SCR_EL3.FIQ == 0: The Secure, unshifted view applies." I don't know how I missed that during testing. Did a quick test on the model with PMU NMIs (GICD_CTRL.DS = 0, SCTLR_EL2.FIQ = 0), gic_handle_nmi() was not being called at all, but when I changed the comparison to gic_read_rpr() == ((GICD_INT_NMI_PRI >> 1) | 0x80), NMIs were being correctly handled again. Will try to run some more tests on real hardware and see if I can confirm this. Thanks, Alex On 8/11/21 7:31 PM, Marc Zyngier wrote: > + Alex, who introduced this. > > On Wed, 11 Aug 2021 18:15:05 +0100, > Chen-Yu Tsai <wenst@chromium.org> wrote: >> When non-secure priorities are used, compared to the raw priority set, >> the value read back from RPR is also right-shifted by one and the >> highest bit set. >> >> Add a macro to do the modifications to the raw priority when doing the >> comparison against the RPR value. This corrects the pseudo-NMI behavior >> when non-secure priorities in the GIC are used. Tested on 5.10 with >> the "IPI as pseudo-NMI" series [1] applied on MT8195. >> >> [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ >> >> Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") >> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> >> --- >> drivers/irqchip/irq-gic-v3.c | 11 ++++++++++- >> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c >> index e0f4debe64e1..e7a0b55413db 100644 >> --- a/drivers/irqchip/irq-gic-v3.c >> +++ b/drivers/irqchip/irq-gic-v3.c >> @@ -100,6 +100,15 @@ EXPORT_SYMBOL(gic_pmr_sync); >> DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities); >> EXPORT_SYMBOL(gic_nonsecure_priorities); >> >> +#define GICD_INT_RPR_PRI(priority) \ >> + ({ \ >> + u32 __priority = (priority); \ >> + if (static_branch_unlikely(&gic_nonsecure_priorities)) \ >> + __priority = 0x80 | (__priority >> 1); \ >> + \ >> + __priority; \ > This doesn't reflect what the pseudocode says of a read of ICC_RPR_EL1 > AFAICS. When the priority is activated, it is indeed shifted. But a > read of RPR does appear to shift things back (and you loose the lowest > bit in the process). Please see 'aarch64/support/ICC_RPR_EL1' in the > architecture spec. > > Can you confirm that SCR_EL3.FIQ is set on your system? > > Thanks, > > M. > >> + }) >> + >> /* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */ >> static refcount_t *ppi_nmi_refs; >> >> @@ -687,7 +696,7 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs >> return; >> >> if (gic_supports_nmi() && >> - unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { >> + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) { >> gic_handle_nmi(irqnr, regs); >> return; >> } >
On Thu, 12 Aug 2021 12:51:34 +0100, Alexandru Elisei <alexandru.elisei@arm.com> wrote: > > Hi, > > After re-familiarizing myself with the spec, it starting to look to > me like indeed there's something not quite right (read as: totally > broken) with my patch. > > Arm IHI 0069F, the pseudocode for reading ICC_RPR_EL1 (page 11-797), > says that the priority returned is unchanged if SCTLR_EL3.FIQ == > 0. Sure, but look at what ICC_RPR_EL1 does for FIQ==1: <quote> if HaveEL(EL3) && !IsSecure() && SCR_EL3.FIQ == '1' then // A Non-secure GIC access and Group 0 inaccessible to Non-secure. if pPriority<7> == '0' then // Priority is in Secure half and not visible to Non-secure Priority = Zeros(); elsif !IsOnes(pPriority) then // Non-secure access and not idle, so physical priority must be shifted pPriority<7:0> = (pPriority AND PRIMask())<6:0>:'0'; return ZeroExtend(pPriority); </quote> See how the the priority is shifted *left* (bits [6:0] end up in [7:1])? > This means that the ICC_RPR_EL1 read will return the secure view > (the value as it is stored by the GIC) of the priority, so for > pseudo-nmis it will return (GICD_INT_NMI_PRI >> 1) | 0x80, which > definitely != GICD_INT_NMI_PRI. That's not my reading of the pseudocode. > This is further confirmed by this statement on page 4-67: > > "When GICD_CTLR.DS == 0, [..] For Non-secure access to ICC_PMR_EL1 > and ICC_RPR_EL1 when SCR_EL3.FIQ == 0: The Secure, unshifted view > applies." > > I don't know how I missed that during testing. > > Did a quick test on the model with PMU NMIs (GICD_CTRL.DS = 0, > SCTLR_EL2.FIQ = 0), gic_handle_nmi() was not being called at all, 0? Really???? > but when I changed the comparison to gic_read_rpr() == > ((GICD_INT_NMI_PRI >> 1) | 0x80), NMIs were being correctly handled > again. You have completely lost me. This contradicts what you have written above. Thanks, M.
Hi Marc, On 8/12/21 2:09 PM, Marc Zyngier wrote: > On Thu, 12 Aug 2021 12:51:34 +0100, > Alexandru Elisei <alexandru.elisei@arm.com> wrote: >> Hi, >> >> After re-familiarizing myself with the spec, it starting to look to >> me like indeed there's something not quite right (read as: totally >> broken) with my patch. >> >> Arm IHI 0069F, the pseudocode for reading ICC_RPR_EL1 (page 11-797), >> says that the priority returned is unchanged if SCTLR_EL3.FIQ == >> 0. > Sure, but look at what ICC_RPR_EL1 does for FIQ==1: > > <quote> > if HaveEL(EL3) && !IsSecure() && SCR_EL3.FIQ == '1' then > // A Non-secure GIC access and Group 0 inaccessible to Non-secure. > if pPriority<7> == '0' then > // Priority is in Secure half and not visible to Non-secure > Priority = Zeros(); > elsif !IsOnes(pPriority) then > // Non-secure access and not idle, so physical priority must be shifted > pPriority<7:0> = (pPriority AND PRIMask())<6:0>:'0'; > > return ZeroExtend(pPriority); > </quote> > > See how the the priority is shifted *left* (bits [6:0] end up in > [7:1])? Yes, when SCR_EL3.FIQ=1, but gic_nonsecure_priorities is enabled when SCR_EL3.FIQ=0 (gic_has_group0()). In that case ICC_RPR_EL1 returns (what I assume to be) the highest priority interrupt from ICC_AP0R_EL1, ICC_AP1R_EL1NS and ICC_AP1R_EL1S. Isn't that the secure view (or Distributor value) of the priority? > >> This means that the ICC_RPR_EL1 read will return the secure view >> (the value as it is stored by the GIC) of the priority, so for >> pseudo-nmis it will return (GICD_INT_NMI_PRI >> 1) | 0x80, which >> definitely != GICD_INT_NMI_PRI. > That's not my reading of the pseudocode. > >> This is further confirmed by this statement on page 4-67: >> >> "When GICD_CTLR.DS == 0, [..] For Non-secure access to ICC_PMR_EL1 >> and ICC_RPR_EL1 when SCR_EL3.FIQ == 0: The Secure, unshifted view >> applies." >> >> I don't know how I missed that during testing. >> >> Did a quick test on the model with PMU NMIs (GICD_CTRL.DS = 0, >> SCTLR_EL2.FIQ = 0), gic_handle_nmi() was not being called at all, > 0? Really???? Yes, I was also very surprised myself, I was certain that I tested my patch on the model with PMU NMIs. Here's what I did to get this result, I would very much appreciate anyone pointing out what I did wrong here. $ git diff diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e0f4debe64e1..749748df2466 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -623,6 +623,8 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) bool irqs_enabled = interrupts_enabled(regs); int err; + panic("Pseudo-NMI"); + if (irqs_enabled) nmi_enter(); @@ -1654,8 +1656,10 @@ static void gic_enable_nmi_support(void) * be in the non-secure range, we use a different PMR value to mask IRQs * and the rest of the values that we use remain unchanged. */ - if (gic_has_group0() && !gic_dist_security_disabled()) + if (gic_has_group0() && !gic_dist_security_disabled()) { + pr_info("enabling gic_nonsecure_priorities"); static_branch_enable(&gic_nonsecure_priorities); + } static_branch_enable(&supports_pseudo_nmis); And a truncated log for FVP: Boot-wrapper v0.2 [..] [ 0.000000] Kernel command line: earlycon=pl011,0x1c090000 console=ttyAMA0 kpti=off root=/dev/vda irqchip.gicv3_pseudo_nmi=1 [..] [ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0 [ 0.000000] GICv3: GIC: Using split EOI/Deactivate mode [ 0.000000] GICv3: 224 SPIs implemented [ 0.000000] GICv3: 0 Extended SPIs implemented [ 0.000000] GICv3: Distributor has no Range Selector support [ 0.000000] Root IRQ handler: gic_handle_irq [ 0.000000] GICv3: 16 PPIs implemented [ 0.000000] GICv3: CPU0: found redistributor 0 region 0:0x000000002f100000 [ 0.000000] GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation [ 0.000000] GICv3: enabling gic_nonsecure_priorities [..] [ 0.037784] armv8-pmu pmu: hw perfevents: no interrupt-affinity property, guessing. [ 0.037901] hw perfevents: enabled with armv8_pmuv3 PMU driver, 9 counters available, using NMIs [..] # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 9: 0 0 0 0 GICv3 25 Level vgic 11: 0 0 0 0 GICv3 30 Level kvm guest ptimer 12: 0 0 0 0 GICv3 27 Level kvm guest vtimer 13: 83 77 69 88 GICv3 26 Level arch_timer 16: 0 0 0 0 GICv3 92 Level arm-pmu 17: 0 0 0 0 GICv3 93 Level arm-pmu 18: 0 0 0 0 GICv3 94 Level arm-pmu 19: 0 0 0 0 GICv3 95 Level arm-pmu 22: 0 0 0 0 GICv3 47 Level eth0 24: 33 0 0 0 GICv3 37 Level uart-pl011 29: 0 0 0 0 GICv3 36 Level rtc-pl031 31: 138 0 0 0 GICv3 74 Level virtio0 IPI0: 251 334 252 172 Rescheduling interrupts IPI1: 26 23 10 20 Function call interrupts IPI2: 0 0 0 0 CPU stop interrupts IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts IPI4: 0 0 0 0 Timer broadcast interrupts IPI5: 0 0 0 0 IRQ work interrupts IPI6: 0 0 0 0 CPU wake-up interrupts Err: 0 # perf record -a -- sleep 5 ^C[ perf record: Woken up 1 times to write data ] [ 6.908820] random: perf: uninitialized urandom read (6 bytes read) [ 7.596139] random: perf: uninitialized urandom read (6 bytes read) [ perf record: Captured and wrote 0.013 MB perf.data (24 samples) ] # # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 9: 0 0 0 0 GICv3 25 Level vgic 11: 0 0 0 0 GICv3 30 Level kvm guest ptimer 12: 0 0 0 0 GICv3 27 Level kvm guest vtimer 13: 546 115 82 132 GICv3 26 Level arch_timer 16: 6 0 0 0 GICv3 92 Level arm-pmu 17: 0 10 0 0 GICv3 93 Level arm-pmu 18: 0 0 4 0 GICv3 94 Level arm-pmu 19: 0 0 0 4 GICv3 95 Level arm-pmu 22: 0 0 0 0 GICv3 47 Level eth0 24: 80 0 0 0 GICv3 37 Level uart-pl011 29: 0 0 0 0 GICv3 36 Level rtc-pl031 31: 195 0 0 0 GICv3 74 Level virtio0 IPI0: 259 367 263 224 Rescheduling interrupts IPI1: 27 51 16 24 Function call interrupts IPI2: 0 0 0 0 CPU stop interrupts IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts IPI4: 0 0 0 0 Timer broadcast interrupts IPI5: 0 0 0 0 IRQ work interrupts IPI6: 0 0 0 0 CPU wake-up interrupts Err: 0 # With this change to the driver: $ git diff diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e0f4debe64e1..e58e62ab64fe 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -623,6 +623,8 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) bool irqs_enabled = interrupts_enabled(regs); int err; + panic("Pseudo-NMI"); + if (irqs_enabled) nmi_enter(); @@ -687,7 +689,7 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs return; if (gic_supports_nmi() && - unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { + unlikely(gic_read_rpr() == ((GICD_INT_NMI_PRI >> 1) | 0x80))) { gic_handle_nmi(irqnr, regs); return; } @@ -1654,8 +1656,10 @@ static void gic_enable_nmi_support(void) * be in the non-secure range, we use a different PMR value to mask IRQs * and the rest of the values that we use remain unchanged. */ - if (gic_has_group0() && !gic_dist_security_disabled()) + if (gic_has_group0() && !gic_dist_security_disabled()) { + pr_info("enabling gic_nonsecure_priorities"); static_branch_enable(&gic_nonsecure_priorities); + } static_branch_enable(&supports_pseudo_nmis); This is what I get from FVP (also truncated): Boot-wrapper v0.2 [..] [ 0.000000] Kernel command line: earlycon=pl011,0x1c090000 console=ttyAMA0 kpti=off root=/dev/vda irqchip.gicv3_pseudo_nmi=1 [..] [ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0 [ 0.000000] GICv3: GIC: Using split EOI/Deactivate mode [ 0.000000] GICv3: 224 SPIs implemented [ 0.000000] GICv3: 0 Extended SPIs implemented [ 0.000000] GICv3: Distributor has no Range Selector support [ 0.000000] Root IRQ handler: gic_handle_irq [ 0.000000] GICv3: 16 PPIs implemented [ 0.000000] GICv3: CPU0: found redistributor 0 region 0:0x000000002f100000 [ 0.000000] GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation [ 0.000000] GICv3: enabling gic_nonsecure_priorities [..] [ 0.037749] armv8-pmu pmu: hw perfevents: no interrupt-affinity property, guessing. [ 0.037860] hw perfevents: enabled with armv8_pmuv3 PMU driver, 9 counters available, using NMIs [..] # perf record -a -- sleep 10 [ 3.556669] Kernel panic - not syncing: Pseudo-NMI [ 3.556676] CPU: 0 PID: 145 Comm: perf Not tainted 5.14.0-rc5-dirty #23 [ 3.556690] Hardware name: FVP Base (DT) [ 3.556692] Call trace: [ 3.556700] dump_backtrace+0x0/0x19c [ 3.556710] show_stack+0x1c/0x70 [ 3.556725] dump_stack_lvl+0x68/0x84 [ 3.556729] dump_stack+0x1c/0x38 [ 3.556741] panic+0x16c/0x340 [ 3.556750] gic_handle_irq+0x178/0x1a4 [ 3.556760] call_on_irq_stack+0x2c/0x58 [ 3.556776] do_interrupt_handler+0x54/0x60 [ 3.556791] el1_interrupt+0x30/0xa0 [ 3.556800] el1h_64_irq_handler+0x1c/0x2c [ 3.556809] el1h_64_irq+0x78/0x7c [ 3.556819] do_vfs_ioctl+0x90/0xd30 [ 3.556830] __arm64_sys_ioctl+0x88/0xf0 [ 3.556842] invoke_syscall+0x48/0x114 [ 3.556849] el0_svc_common+0x48/0x17c [ 3.556859] do_el0_svc+0x2c/0x94 [ 3.556873] el0_svc+0x2c/0x5c [ 3.556879] el0t_64_sync_handler+0xa8/0x130 [ 3.556889] el0t_64_sync+0x198/0x19c [ 3.556907] SMP: stopping secondary CPUs [ 3.556919] Kernel Offset: 0x57c5cb480000 from 0xffff800010000000 [ 3.556922] PHYS_OFFSET: 0xffffebb580000000 [ 3.556930] CPU features: 0x000000c1,ff3d4eef [ 3.556940] Memory Limit: none [ 3.556949] ---[ end Kernel panic - not syncing: Pseudo-NMI ]--- > >> but when I changed the comparison to gic_read_rpr() == >> ((GICD_INT_NMI_PRI >> 1) | 0x80), NMIs were being correctly handled >> again. > You have completely lost me. This contradicts what you have written > above. I don't see how that is the case - ICC_RPR_EL1 contains the priority value as seen by the Distributor, and non-secure priorities get right-shifted when SCR_EL3.FIQ=0, meaning that GICD_INT_NMI_PRI becomes (GICD_INT_NMI_PRI >> 1) | 0x80 in the Distributor. Can you elaborate where I'm contradicting myself? Thanks, Alex
Hi, On Thu, Aug 12, 2021 at 2:31 AM Marc Zyngier <maz@kernel.org> wrote: > > + Alex, who introduced this. > > On Wed, 11 Aug 2021 18:15:05 +0100, > Chen-Yu Tsai <wenst@chromium.org> wrote: > > > > When non-secure priorities are used, compared to the raw priority set, > > the value read back from RPR is also right-shifted by one and the > > highest bit set. > > > > Add a macro to do the modifications to the raw priority when doing the > > comparison against the RPR value. This corrects the pseudo-NMI behavior > > when non-secure priorities in the GIC are used. Tested on 5.10 with > > the "IPI as pseudo-NMI" series [1] applied on MT8195. > > > > [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ > > > > Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") > > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> > > --- > > drivers/irqchip/irq-gic-v3.c | 11 ++++++++++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > index e0f4debe64e1..e7a0b55413db 100644 > > --- a/drivers/irqchip/irq-gic-v3.c > > +++ b/drivers/irqchip/irq-gic-v3.c > > @@ -100,6 +100,15 @@ EXPORT_SYMBOL(gic_pmr_sync); > > DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities); > > EXPORT_SYMBOL(gic_nonsecure_priorities); > > > > +#define GICD_INT_RPR_PRI(priority) \ > > + ({ \ > > + u32 __priority = (priority); \ > > + if (static_branch_unlikely(&gic_nonsecure_priorities)) \ > > + __priority = 0x80 | (__priority >> 1); \ > > + \ > > + __priority; \ > > This doesn't reflect what the pseudocode says of a read of ICC_RPR_EL1 > AFAICS. When the priority is activated, it is indeed shifted. But a > read of RPR does appear to shift things back (and you loose the lowest > bit in the process). Please see 'aarch64/support/ICC_RPR_EL1' in the > architecture spec. > > Can you confirm that SCR_EL3.FIQ is set on your system? I gave this a test with mainline on an ROC-RK3399-PC. I figure this is more available and stable than the MT8195 [1]. My board is running: - ATF v2.4-561-g465af20ce (based on git commit 8078b5c5a) with console and some erratas enabled, but otherwise standard rk3399 config "-dirty" was due to the SCR_EL3 line. NOTICE: BL31: v2.4(debug):v2.4-561-g465af20ce-dirty NOTICE: BL31: Built : 22:59:42, Aug 16 2021 INFO: GICv3 with legacy support detected. INFO: ARM GICv3 driver initialized in EL3 INFO: Maximum SPI INTID supported: 287 INFO: plat_rockchip_pmu_init(1628): pd status 3e INFO: BL31: Initializing runtime services INFO: BL31: cortex_a53: CPU workaround for 855873 was applied INFO: BL31: cortex_a53: CPU workaround for 1530924 was applied INFO: BL31: Preparing for EL3 exit to normal world INFO: BL31: SCR_EL3=0x238, FIQ not set INFO: Entry point address = 0x200000 INFO: SPSR = 0x3c9 - kernel next-20210813 + "NMI as IPI" series + debug printks scattered in the GICv3 driver GICv3: GIC: Using split EOI/Deactivate mode GICv3: 256 SPIs implemented GICv3: 0 Extended SPIs implemented GICv3: Distributor has no Range Selector support Root IRQ handler: gic_handle_irq GICv3: 16 PPIs implemented GICv3: CPU0: found redistributor 0 region 0:0x00000000fef00000 GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation GICv3: Priority bits (gic_get_pribits()): 5 GICv3: Group0 available (gic_has_group0()) GICv3: Distributor security enabled (gic_dist_security_disabled() GICv3: non-secure priorities used (gic_nonsecure_priorities) GICv3: gic_irq_nmi_setup start: irq 7 GICv3: interrupt 7 priority a0 GICv3: interrupt 7 priority 20 So the RK3399 actually works correctly _without_ my fix. The NMI backtrace triggers going through `gic_handle_nmi()`. If my fix is applied, it goes through `gic_handle_irq()` instead. However the MT8195 needs this to work correctly. === ATF === NOTICE: MT8195 bl31_setup NOTICE: BL31: v2.5(debug): NOTICE: BL31: Built : Wed Jul 7 11:17:50 UTC 2021 INFO: GICv3 without legacy support detected. INFO: ARM GICv3 driver initialized in EL3 INFO: Maximum SPI INTID supported: 895 NOTICE: MT8195 spm_boot_init INFO: BL31: Initializing runtime services INFO: BL31: cortex_a55: CPU workaround for 1530923 was applied INFO: SPM: enable CPC mode INFO: mcdi ready for mcusys-off-idle and system suspend INFO: BL31: Preparing for EL3 exit to normal world INFO: Entry point address = 0x80000000 INFO: SPSR = 0x8 === kernel GICv3 === GICv3: GIC: Using split EOI/Deactivate mode GICv3: 864 SPIs implemented GICv3: 0 Extended SPIs implemented GICv3: Distributor has no Range Selector support GICv3: 16 PPIs implemented GICv3: CPU0: found redistributor 0 region 0:0x000000000c040000 GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation GICv3: Priority bits: 5 GICv3: Group0 available GICv3: Distributor security enabled GICv3: non-secure priorities used GICv3: gic_irq_nmi_setup start: irq 7 GICv3: interrupt 7 priority a0 GICv3: interrupt 7 priority 20 I will dig through our ATF code base and try to see what's different. ChenYu [1] On a side note, the latest Chrome OS kernel 5.10 for the MT8195 gives a spinlock recursion during boot and hangs if pseudo-NMIs are enabled. I had to dig through my reflog to find the early version I developed on. > Thanks, > > M. > > > + }) > > + > > /* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */ > > static refcount_t *ppi_nmi_refs; > > > > @@ -687,7 +696,7 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs > > return; > > > > if (gic_supports_nmi() && > > - unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { > > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) { > > gic_handle_nmi(irqnr, regs); > > return; > > } > > > -- > Without deviation from the norm, progress is not possible.
Hi Alex, On Thu, 12 Aug 2021 15:24:03 +0100, Alexandru Elisei <alexandru.elisei@arm.com> wrote: > > Hi Marc, > > On 8/12/21 2:09 PM, Marc Zyngier wrote: > > On Thu, 12 Aug 2021 12:51:34 +0100, > > Alexandru Elisei <alexandru.elisei@arm.com> wrote: > >> Hi, > >> > >> After re-familiarizing myself with the spec, it starting to look to > >> me like indeed there's something not quite right (read as: totally > >> broken) with my patch. > >> > >> Arm IHI 0069F, the pseudocode for reading ICC_RPR_EL1 (page 11-797), > >> says that the priority returned is unchanged if SCTLR_EL3.FIQ == > >> 0. > > Sure, but look at what ICC_RPR_EL1 does for FIQ==1: > > > > <quote> > > if HaveEL(EL3) && !IsSecure() && SCR_EL3.FIQ == '1' then > > // A Non-secure GIC access and Group 0 inaccessible to Non-secure. > > if pPriority<7> == '0' then > > // Priority is in Secure half and not visible to Non-secure > > Priority = Zeros(); > > elsif !IsOnes(pPriority) then > > // Non-secure access and not idle, so physical priority must be shifted > > pPriority<7:0> = (pPriority AND PRIMask())<6:0>:'0'; > > > > return ZeroExtend(pPriority); > > </quote> > > > > See how the the priority is shifted *left* (bits [6:0] end up in > > [7:1])? > > Yes, when SCR_EL3.FIQ=1, but gic_nonsecure_priorities is enabled > when SCR_EL3.FIQ=0 (gic_has_group0()). In that case ICC_RPR_EL1 > returns (what I assume to be) the highest priority interrupt from > ICC_AP0R_EL1, ICC_AP1R_EL1NS and ICC_AP1R_EL1S. Isn't that the > secure view (or Distributor value) of the priority? Yup. I guess I got confused with what "non-secure" priorities mean in this context. [...] > I don't see how that is the case - ICC_RPR_EL1 contains the priority > value as seen by the Distributor, and non-secure priorities get > right-shifted when SCR_EL3.FIQ=0, meaning that GICD_INT_NMI_PRI > becomes (GICD_INT_NMI_PRI >> 1) | 0x80 in the Distributor. Can you > elaborate where I'm contradicting myself? I think I know why I confused myself. When FIQ==0, G0 is NS. On the face of it, this should mean that no shift occurs. However, G1S is still in the picture, and we get the extra shift to preserve the ordering with G1S. This is a different configuration from that of a guest, where G0 is also NS, but there is no shift at all, as there is no G1S. The GIC strikes back. Again. I run some more tests with this patch, and merge it of nothing breaks. Thanks, M.
Hello, Pending Marc's testing (I realized I don't have any hardware to test this on at the moment), this patch looks correct to me. One comment below. On 8/11/21 6:15 PM, Chen-Yu Tsai wrote: > When non-secure priorities are used, compared to the raw priority set, > the value read back from RPR is also right-shifted by one and the > highest bit set. > > Add a macro to do the modifications to the raw priority when doing the > comparison against the RPR value. This corrects the pseudo-NMI behavior > when non-secure priorities in the GIC are used. Tested on 5.10 with > the "IPI as pseudo-NMI" series [1] applied on MT8195. > > [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ > > Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> > --- > drivers/irqchip/irq-gic-v3.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > index e0f4debe64e1..e7a0b55413db 100644 > --- a/drivers/irqchip/irq-gic-v3.c > +++ b/drivers/irqchip/irq-gic-v3.c > @@ -100,6 +100,15 @@ EXPORT_SYMBOL(gic_pmr_sync); > DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities); > EXPORT_SYMBOL(gic_nonsecure_priorities); > > +#define GICD_INT_RPR_PRI(priority) \ > + ({ \ > + u32 __priority = (priority); \ > + if (static_branch_unlikely(&gic_nonsecure_priorities)) \ > + __priority = 0x80 | (__priority >> 1); \ > + \ > + __priority; \ > + }) Would you mind adding a comment to the macro explaining why it's needed? This behaviour is rather subtle and I'm hoping it will save someone's time at some point in the future. I'm thinking something like this (please ignore it if you can think of something better): When the Non-secure world has access to group 0 interrupts (SCR_EL3.FIQ = 0), reading the ICC_RPR_EL1 register will return the Distributor's view of the interrupt priority. When GIC security is enabled (GICD_CTLR.DS = 0), the interrupt priority written by software is moved to the Non-secure range by the Distributor. If both are true (which is the situation where gic_nonsecure_priorities gets enabled), then we need to shift down the priority programmed by software if we want match it against the value returned from ICC_RPR_EL1. With a comment added: Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Thanks, Alex > + > /* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */ > static refcount_t *ppi_nmi_refs; > > @@ -687,7 +696,7 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs > return; > > if (gic_supports_nmi() && > - unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) { > gic_handle_nmi(irqnr, regs); > return; > }
On Fri, 20 Aug 2021 14:31:39 +0100, Alexandru Elisei <alexandru.elisei@arm.com> wrote: > > Hello, > > Pending Marc's testing (I realized I don't have any hardware to test > this on at the moment), this patch looks correct to me. One comment > below. Seems good so far. I tested it both in a VM, on a FIQ==1 host, and on D05, which runs with FIQ==0. Can't be more broken than it was... ;-) > > On 8/11/21 6:15 PM, Chen-Yu Tsai wrote: > > When non-secure priorities are used, compared to the raw priority set, > > the value read back from RPR is also right-shifted by one and the > > highest bit set. > > > > Add a macro to do the modifications to the raw priority when doing the > > comparison against the RPR value. This corrects the pseudo-NMI behavior > > when non-secure priorities in the GIC are used. Tested on 5.10 with > > the "IPI as pseudo-NMI" series [1] applied on MT8195. > > > > [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ > > > > Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") > > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> > > --- > > drivers/irqchip/irq-gic-v3.c | 11 ++++++++++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > index e0f4debe64e1..e7a0b55413db 100644 > > --- a/drivers/irqchip/irq-gic-v3.c > > +++ b/drivers/irqchip/irq-gic-v3.c > > @@ -100,6 +100,15 @@ EXPORT_SYMBOL(gic_pmr_sync); > > DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities); > > EXPORT_SYMBOL(gic_nonsecure_priorities); > > > > +#define GICD_INT_RPR_PRI(priority) \ > > + ({ \ > > + u32 __priority = (priority); \ > > + if (static_branch_unlikely(&gic_nonsecure_priorities)) \ > > + __priority = 0x80 | (__priority >> 1); \ > > + \ > > + __priority; \ > > + }) > > Would you mind adding a comment to the macro explaining why it's > needed? This behaviour is rather subtle and I'm hoping it will save > someone's time at some point in the future. I'm thinking something > like this (please ignore it if you can think of something better): > > When the Non-secure world has access to group 0 interrupts > (SCR_EL3.FIQ = 0), reading the ICC_RPR_EL1 register will return the > Distributor's view of the interrupt priority. When GIC security is > enabled (GICD_CTLR.DS = 0), the interrupt priority written by > software is moved to the Non-secure range by the Distributor. If > both are true (which is the situation where gic_nonsecure_priorities > gets enabled), then we need to shift down the priority programmed by > software if we want match it against the value returned from > ICC_RPR_EL1. > > With a comment added: > > Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Let me fold this into the commit and push it out again. Thanks, M.
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index e0f4debe64e1..e7a0b55413db 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -100,6 +100,15 @@ EXPORT_SYMBOL(gic_pmr_sync); DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities); EXPORT_SYMBOL(gic_nonsecure_priorities); +#define GICD_INT_RPR_PRI(priority) \ + ({ \ + u32 __priority = (priority); \ + if (static_branch_unlikely(&gic_nonsecure_priorities)) \ + __priority = 0x80 | (__priority >> 1); \ + \ + __priority; \ + }) + /* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */ static refcount_t *ppi_nmi_refs; @@ -687,7 +696,7 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs return; if (gic_supports_nmi() && - unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) { + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) { gic_handle_nmi(irqnr, regs); return; }
When non-secure priorities are used, compared to the raw priority set, the value read back from RPR is also right-shifted by one and the highest bit set. Add a macro to do the modifications to the raw priority when doing the comparison against the RPR value. This corrects the pseudo-NMI behavior when non-secure priorities in the GIC are used. Tested on 5.10 with the "IPI as pseudo-NMI" series [1] applied on MT8195. [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> --- drivers/irqchip/irq-gic-v3.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)