Message ID | 20211116082450.10357-4-kernelfans@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: Fixes RCU deadlock due to a mistaken | expand |
Hi Pingfan, On Tue, 16 Nov 2021 08:24:49 +0000, Pingfan Liu <kernelfans@gmail.com> wrote: > > Arch level code is ready to take over the nmi_enter()/nmi_exit() > housekeeping. > > GICv3 can expose the pNMI discriminator, then simply remove the > housekeeping. > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: Will Deacon <will@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Marc Zyngier <maz@kernel.org> > Cc: Joey Gouly <joey.gouly@arm.com> > Cc: Sami Tolvanen <samitolvanen@google.com> > Cc: Julien Thierry <julien.thierry@arm.com> > Cc: Yuichi Ito <ito-yuichi@fujitsu.com> > Cc: rcu@vger.kernel.org > To: linux-arm-kernel@lists.infradead.org > --- > drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++------ > 1 file changed, 12 insertions(+), 6 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > index daec3309b014..aa2bcb47b47e 100644 > --- a/drivers/irqchip/irq-gic-v3.c > +++ b/drivers/irqchip/irq-gic-v3.c > @@ -646,12 +646,8 @@ static void gic_deactivate_unhandled(u32 irqnr) > > static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > { > - bool irqs_enabled = interrupts_enabled(regs); > int err; > > - if (irqs_enabled) > - nmi_enter(); > - > if (static_branch_likely(&supports_deactivate_key)) > gic_write_eoir(irqnr); > /* > @@ -664,8 +660,6 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > if (err) > gic_deactivate_unhandled(irqnr); > > - if (irqs_enabled) > - nmi_exit(); > } > > static u32 do_read_iar(struct pt_regs *regs) > @@ -702,6 +696,15 @@ static u32 do_read_iar(struct pt_regs *regs) > return iar; > } > > +static bool gic_is_in_nmi(void) > +{ > + if (gic_supports_nmi() && > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) > + return true; I don't think this fixes anything. RPR stands for 'Running Priority Register', which in GIC speak reports the priority of the most recently Ack'ed interrupt. You cannot use this to find out whether the interrupt that you /will/ ack is a NMI or not. Actually, you cannot find out about *any* priority until you actually ack the interrupt. What you are asking for is the equivalent of a crystal ball, and we're in short supply... ;-) The only case where ICC_RPR_EL1 will return something that is equal to GICD_INT_NMI_PRI is when you are *already* in an NMI context. So unless I have completely misunderstood your approach (which is always possible), I don't see how this can work. If you want to distinguish between NMI and IRQ early on (before acknowledging the interrupt), the only solution is to turn the NMI into a Group-0 interrupt so that it is presented to the CPU as a FIQ. At which point, you have the information by construction. Unfortunately, this will only work in VMs, as Group-0 interrupts are usually routed to EL3 on bare metal systems. M.
On Tue, Nov 16, 2021 at 09:53:25AM +0000, Marc Zyngier wrote: > Hi Pingfan, > > On Tue, 16 Nov 2021 08:24:49 +0000, > Pingfan Liu <kernelfans@gmail.com> wrote: > > > > Arch level code is ready to take over the nmi_enter()/nmi_exit() > > housekeeping. > > > > GICv3 can expose the pNMI discriminator, then simply remove the > > housekeeping. > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com> > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > Cc: Will Deacon <will@kernel.org> > > Cc: Mark Rutland <mark.rutland@arm.com> > > Cc: Marc Zyngier <maz@kernel.org> > > Cc: Joey Gouly <joey.gouly@arm.com> > > Cc: Sami Tolvanen <samitolvanen@google.com> > > Cc: Julien Thierry <julien.thierry@arm.com> > > Cc: Yuichi Ito <ito-yuichi@fujitsu.com> > > Cc: rcu@vger.kernel.org > > To: linux-arm-kernel@lists.infradead.org > > --- > > drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++------ > > 1 file changed, 12 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > index daec3309b014..aa2bcb47b47e 100644 > > --- a/drivers/irqchip/irq-gic-v3.c > > +++ b/drivers/irqchip/irq-gic-v3.c > > @@ -646,12 +646,8 @@ static void gic_deactivate_unhandled(u32 irqnr) > > > > static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > { > > - bool irqs_enabled = interrupts_enabled(regs); > > int err; > > > > - if (irqs_enabled) > > - nmi_enter(); > > - > > if (static_branch_likely(&supports_deactivate_key)) > > gic_write_eoir(irqnr); > > /* > > @@ -664,8 +660,6 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > if (err) > > gic_deactivate_unhandled(irqnr); > > > > - if (irqs_enabled) > > - nmi_exit(); > > } > > > > static u32 do_read_iar(struct pt_regs *regs) > > @@ -702,6 +696,15 @@ static u32 do_read_iar(struct pt_regs *regs) > > return iar; > > } > > > > +static bool gic_is_in_nmi(void) > > +{ > > + if (gic_supports_nmi() && > > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) > > + return true; > > I don't think this fixes anything. > > RPR stands for 'Running Priority Register', which in GIC speak reports > the priority of the most recently Ack'ed interrupt. > > You cannot use this to find out whether the interrupt that you /will/ > ack is a NMI or not. Actually, you cannot find out about *any* > priority until you actually ack the interrupt. What you are asking for > is the equivalent of a crystal ball, and we're in short supply... ;-) > > The only case where ICC_RPR_EL1 will return something that is equal to > GICD_INT_NMI_PRI is when you are *already* in an NMI context. So > unless I have completely misunderstood your approach (which is always > possible), I don't see how this can work. > Thank you for the clear explanation. Also I revist this part in "GIC v3 and v4 overview" and have a deeper understanding. (Need to spare time to go through all later) You totally got my idea, and I need to find a bail-out. As all kinds of PIC at least have two parts of functions: active (Ack) and deactive(EOI), is it possible to split handle_arch_irq into two parts? I.e let irqchip expose two interfaces: u32 (*read_irqinfo*)(struct pt_regs *regs, bool *is_nmi) void (*handle_arch_irq)(struct pt_regs *regs, u32 irqnr) to replace the current interface: void (*handle_arch_irq)(struct pt_regs *regs) I have thought about such stuff for some days. And the benefits include: -1. For this bugfix (by the parameter 'is_nmi') -2. IPI_RESCHEDULE performance drop issue can be resolved at arch code level. (by irqnr - ipi_irq_base == IPI_RESCHEDULE ?) -3. The arch level can provide a similar loop as aic_handle_irq() in irq-apple-aic.c, which can save cpu by avoiding heavy context sync when irq is intensive. Do you think it is doable? Thanks, Pingfan > If you want to distinguish between NMI and IRQ early on (before > acknowledging the interrupt), the only solution is to turn the NMI > into a Group-0 interrupt so that it is presented to the CPU as a > FIQ. At which point, you have the information by construction. > > Unfortunately, this will only work in VMs, as Group-0 interrupts are > usually routed to EL3 on bare metal systems. > > M. > > -- > Without deviation from the norm, progress is not possible. > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Wed, 17 Nov 2021 10:16:53 +0000, Pingfan Liu <kernelfans@gmail.com> wrote: > > On Tue, Nov 16, 2021 at 09:53:25AM +0000, Marc Zyngier wrote: > > Hi Pingfan, > > > > On Tue, 16 Nov 2021 08:24:49 +0000, > > Pingfan Liu <kernelfans@gmail.com> wrote: > > > > > > Arch level code is ready to take over the nmi_enter()/nmi_exit() > > > housekeeping. > > > > > > GICv3 can expose the pNMI discriminator, then simply remove the > > > housekeeping. > > > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com> > > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > > Cc: Will Deacon <will@kernel.org> > > > Cc: Mark Rutland <mark.rutland@arm.com> > > > Cc: Marc Zyngier <maz@kernel.org> > > > Cc: Joey Gouly <joey.gouly@arm.com> > > > Cc: Sami Tolvanen <samitolvanen@google.com> > > > Cc: Julien Thierry <julien.thierry@arm.com> > > > Cc: Yuichi Ito <ito-yuichi@fujitsu.com> > > > Cc: rcu@vger.kernel.org > > > To: linux-arm-kernel@lists.infradead.org > > > --- > > > drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++------ > > > 1 file changed, 12 insertions(+), 6 deletions(-) > > > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > > index daec3309b014..aa2bcb47b47e 100644 > > > --- a/drivers/irqchip/irq-gic-v3.c > > > +++ b/drivers/irqchip/irq-gic-v3.c > > > @@ -646,12 +646,8 @@ static void gic_deactivate_unhandled(u32 irqnr) > > > > > > static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > > { > > > - bool irqs_enabled = interrupts_enabled(regs); > > > int err; > > > > > > - if (irqs_enabled) > > > - nmi_enter(); > > > - > > > if (static_branch_likely(&supports_deactivate_key)) > > > gic_write_eoir(irqnr); > > > /* > > > @@ -664,8 +660,6 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > > if (err) > > > gic_deactivate_unhandled(irqnr); > > > > > > - if (irqs_enabled) > > > - nmi_exit(); > > > } > > > > > > static u32 do_read_iar(struct pt_regs *regs) > > > @@ -702,6 +696,15 @@ static u32 do_read_iar(struct pt_regs *regs) > > > return iar; > > > } > > > > > > +static bool gic_is_in_nmi(void) > > > +{ > > > + if (gic_supports_nmi() && > > > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) > > > + return true; > > > > I don't think this fixes anything. > > > > RPR stands for 'Running Priority Register', which in GIC speak reports > > the priority of the most recently Ack'ed interrupt. > > > > You cannot use this to find out whether the interrupt that you /will/ > > ack is a NMI or not. Actually, you cannot find out about *any* > > priority until you actually ack the interrupt. What you are asking for > > is the equivalent of a crystal ball, and we're in short supply... ;-) > > > > The only case where ICC_RPR_EL1 will return something that is equal to > > GICD_INT_NMI_PRI is when you are *already* in an NMI context. So > > unless I have completely misunderstood your approach (which is always > > possible), I don't see how this can work. > > > > Thank you for the clear explanation. Also I revist this part in "GIC v3 > and v4 overview" and have a deeper understanding. (Need to spare time to > go through all later) > > You totally got my idea, and I need to find a bail-out. > > As all kinds of PIC at least have two parts of functions: active (Ack) and > deactive(EOI), is it possible to split handle_arch_irq into two parts? > I.e let irqchip expose two interfaces: > u32 (*read_irqinfo*)(struct pt_regs *regs, bool *is_nmi) > void (*handle_arch_irq)(struct pt_regs *regs, u32 irqnr) > to replace the current interface: No. There is no way we will move to such a scheme. We want to isolate the irqchip stuff in its own corner, and not propagate it into the arch code. If the pseudo-NMI is such a problem, I'm all for removing it *now*, never to come back again. > void (*handle_arch_irq)(struct pt_regs *regs) > > I have thought about such stuff for some days. And the benefits include: > -1. For this bugfix (by the parameter 'is_nmi') > -2. IPI_RESCHEDULE performance drop issue can be resolved at arch > code level. (by irqnr - ipi_irq_base == IPI_RESCHEDULE ?) > -3. The arch level can provide a similar loop as aic_handle_irq() in irq-apple-aic.c, > which can save cpu by avoiding heavy context sync when irq is > intensive. > > Do you think it is doable? I don't think this is even needed, because I don't believe that the whole thing is a real problem. In patch #1, you are claiming that handling a NMI as an IRQ first, and then upgrading to NMI once we know it really is an NMI is a problem. How different is this from an IRQ being preempted by a NMI? Your own conclusion is that the this later case isn't a problem. So why is the former an issue? I'm not saying that there is no issue at all, and it could well be that you have spotted something that I cannot see yet. But if that's the case, it means that the core code is broken as well. M.
On Wed, Nov 17, 2021 at 11:01:23AM +0000, Marc Zyngier wrote: > On Wed, 17 Nov 2021 10:16:53 +0000, > Pingfan Liu <kernelfans@gmail.com> wrote: > > > > On Tue, Nov 16, 2021 at 09:53:25AM +0000, Marc Zyngier wrote: > > > Hi Pingfan, > > > > > > On Tue, 16 Nov 2021 08:24:49 +0000, > > > Pingfan Liu <kernelfans@gmail.com> wrote: > > > > > > > > Arch level code is ready to take over the nmi_enter()/nmi_exit() > > > > housekeeping. > > > > > > > > GICv3 can expose the pNMI discriminator, then simply remove the > > > > housekeeping. > > > > > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com> > > > > Cc: Catalin Marinas <catalin.marinas@arm.com> > > > > Cc: Will Deacon <will@kernel.org> > > > > Cc: Mark Rutland <mark.rutland@arm.com> > > > > Cc: Marc Zyngier <maz@kernel.org> > > > > Cc: Joey Gouly <joey.gouly@arm.com> > > > > Cc: Sami Tolvanen <samitolvanen@google.com> > > > > Cc: Julien Thierry <julien.thierry@arm.com> > > > > Cc: Yuichi Ito <ito-yuichi@fujitsu.com> > > > > Cc: rcu@vger.kernel.org > > > > To: linux-arm-kernel@lists.infradead.org > > > > --- > > > > drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++------ > > > > 1 file changed, 12 insertions(+), 6 deletions(-) > > > > > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > > > index daec3309b014..aa2bcb47b47e 100644 > > > > --- a/drivers/irqchip/irq-gic-v3.c > > > > +++ b/drivers/irqchip/irq-gic-v3.c > > > > @@ -646,12 +646,8 @@ static void gic_deactivate_unhandled(u32 irqnr) > > > > > > > > static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > > > { > > > > - bool irqs_enabled = interrupts_enabled(regs); > > > > int err; > > > > > > > > - if (irqs_enabled) > > > > - nmi_enter(); > > > > - > > > > if (static_branch_likely(&supports_deactivate_key)) > > > > gic_write_eoir(irqnr); > > > > /* > > > > @@ -664,8 +660,6 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > > > if (err) > > > > gic_deactivate_unhandled(irqnr); > > > > > > > > - if (irqs_enabled) > > > > - nmi_exit(); > > > > } > > > > > > > > static u32 do_read_iar(struct pt_regs *regs) > > > > @@ -702,6 +696,15 @@ static u32 do_read_iar(struct pt_regs *regs) > > > > return iar; > > > > } > > > > > > > > +static bool gic_is_in_nmi(void) > > > > +{ > > > > + if (gic_supports_nmi() && > > > > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) > > > > + return true; > > > > > > I don't think this fixes anything. > > > > > > RPR stands for 'Running Priority Register', which in GIC speak reports > > > the priority of the most recently Ack'ed interrupt. > > > > > > You cannot use this to find out whether the interrupt that you /will/ > > > ack is a NMI or not. Actually, you cannot find out about *any* > > > priority until you actually ack the interrupt. What you are asking for > > > is the equivalent of a crystal ball, and we're in short supply... ;-) > > > > > > The only case where ICC_RPR_EL1 will return something that is equal to > > > GICD_INT_NMI_PRI is when you are *already* in an NMI context. So > > > unless I have completely misunderstood your approach (which is always > > > possible), I don't see how this can work. > > > > > > > Thank you for the clear explanation. Also I revist this part in "GIC v3 > > and v4 overview" and have a deeper understanding. (Need to spare time to > > go through all later) > > > > You totally got my idea, and I need to find a bail-out. > > > > As all kinds of PIC at least have two parts of functions: active (Ack) and > > deactive(EOI), is it possible to split handle_arch_irq into two parts? > > I.e let irqchip expose two interfaces: > > u32 (*read_irqinfo*)(struct pt_regs *regs, bool *is_nmi) > > void (*handle_arch_irq)(struct pt_regs *regs, u32 irqnr) > > to replace the current interface: > > No. There is no way we will move to such a scheme. We want to isolate > the irqchip stuff in its own corner, and not propagate it into the > arch code. I understood this beautiful design of isolation. But since the x86 and powerpc have integrated the PIC as part of arch at hardware design, they can tell the source of the interrupt at the arch level code. That is why their IPI_RESCHEDULE is cheaper than arm64. But yeah, the benifit may not persuasive enough to breach the design goal. > > If the pseudo-NMI is such a problem, I'm all for removing it *now*, > never to come back again. > It is a pity that pNMI has not fulfilled the roles like other Arches yet. Two benefits come with NMI on x86 and powerpc: -1. hardlockup detector uses NMI to ease the detection and analysis of unpaired irq enable/disable. It can accurately report the stack trace -2. In kdump case, all cpus can be forced into a known code piece. But on arm64, it does not play such a role yet. Once I hit a boot hang issue on a customed kdump kernel, there is no tick irq after adding some printk, and if there is hardlockup detector, it will be easy to debug. > > void (*handle_arch_irq)(struct pt_regs *regs) > > > > I have thought about such stuff for some days. And the benefits include: > > -1. For this bugfix (by the parameter 'is_nmi') > > -2. IPI_RESCHEDULE performance drop issue can be resolved at arch > > code level. (by irqnr - ipi_irq_base == IPI_RESCHEDULE ?) > > -3. The arch level can provide a similar loop as aic_handle_irq() in irq-apple-aic.c, > > which can save cpu by avoiding heavy context sync when irq is > > intensive. > > > > Do you think it is doable? > > I don't think this is even needed, because I don't believe that the > whole thing is a real problem. > > In patch #1, you are claiming that handling a NMI as an IRQ first, and > then upgrading to NMI once we know it really is an NMI is a problem. > How different is this from an IRQ being preempted by a NMI? > It turned out a false alarm as my reply to Mark. In that case, the code should take __el1_pnmi(), but I mistake it as __el1_irq(). Thank again for your time and patient reply. Regards, Pingfan > Your own conclusion is that the this later case isn't a problem. So > why is the former an issue? > > I'm not saying that there is no issue at all, and it could well be > that you have spotted something that I cannot see yet. But if that's > the case, it means that the core code is broken as well. > > M. > > -- > Without deviation from the norm, progress is not possible. > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index daec3309b014..aa2bcb47b47e 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -646,12 +646,8 @@ static void gic_deactivate_unhandled(u32 irqnr) static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) { - bool irqs_enabled = interrupts_enabled(regs); int err; - if (irqs_enabled) - nmi_enter(); - if (static_branch_likely(&supports_deactivate_key)) gic_write_eoir(irqnr); /* @@ -664,8 +660,6 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) if (err) gic_deactivate_unhandled(irqnr); - if (irqs_enabled) - nmi_exit(); } static u32 do_read_iar(struct pt_regs *regs) @@ -702,6 +696,15 @@ static u32 do_read_iar(struct pt_regs *regs) return iar; } +static bool gic_is_in_nmi(void) +{ + if (gic_supports_nmi() && + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) + return true; + + return false; +} + static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs) { u32 irqnr; @@ -1791,6 +1794,9 @@ static int __init gic_init_bases(void __iomem *dist_base, } set_handle_irq(gic_handle_irq); +#if IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) + set_nmi_discriminator(gic_is_in_nmi); +#endif gic_update_rdist_properties();
Arch level code is ready to take over the nmi_enter()/nmi_exit() housekeeping. GICv3 can expose the pNMI discriminator, then simply remove the housekeeping. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Yuichi Ito <ito-yuichi@fujitsu.com> Cc: rcu@vger.kernel.org To: linux-arm-kernel@lists.infradead.org --- drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)