Message ID | 20240126223836.202321-1-namcao@linutronix.de (mailing list archive) |
---|---|
State | Superseded |
Series | irqchip/sifive-plic: enable interrupt if needed before EOI |
Hi,

On 2024-01-26 4:38 PM, Nam Cao wrote:
> RISC-V PLIC cannot EOI disabled interrupts, as explained in the
> description of Interrupt Completion in the PLIC spec:
>
> "The PLIC signals it has completed executing an interrupt handler by
> writing the interrupt ID it received from the claim to the claim/complete
> register. The PLIC does not check whether the completion ID is the same
> as the last claim ID for that target. If the completion ID does not match
> an interrupt source that *is currently enabled* for the target, the
> completion is silently ignored."
>
> Commit 69ea463021be ("irqchip/sifive-plic: Fixup EOI failed when masked")
> ensured that by enabling the interrupt if needed before EOI.
>
> Commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask
> operations") removed the interrupt enabling code from the previous
> commit, because it assumes that interrupt should be enabled at the point
> of EOI. However, this is incorrect: there is a small window after a hart
> claiming an interrupt and before irq_desc->lock getting acquired,
> interrupt can be disabled during this window. Thus, EOI can be invoked
> while the interrupt is disabled, effectively nullify this EOI.
>
> Make sure that interrupt is really enabled before EOI.

Could you please try the patch I previously sent for this issue[1]? It should
fix the bug without complicating the IRQ hot path.

Thanks,
Samuel

[1]: https://lore.kernel.org/lkml/20230717185841.1294425-1-samuel.holland@sifive.com/

> Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
>  drivers/irqchip/irq-sifive-plic.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> index 5b7bc4fd9517..0857a516c35b 100644
> --- a/drivers/irqchip/irq-sifive-plic.c
> +++ b/drivers/irqchip/irq-sifive-plic.c
> @@ -148,7 +148,13 @@ static void plic_irq_eoi(struct irq_data *d)
>  {
>  	struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
>
> -	writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> +	if (irqd_irq_disabled(d)) {
> +		plic_toggle(handler, d->hwirq, 1);
> +		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> +		plic_toggle(handler, d->hwirq, 0);
> +	} else {
> +		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> +	}
>  }
>
>  #ifdef CONFIG_SMP
On Fri, 26 Jan 2024 18:31:19 -0600
Samuel Holland <samuel.holland@sifive.com> wrote:

> On 2024-01-26 4:38 PM, Nam Cao wrote:
> > RISC-V PLIC cannot EOI disabled interrupts, as explained in the
> > description of Interrupt Completion in the PLIC spec:
> >
> > "The PLIC signals it has completed executing an interrupt handler by
> > writing the interrupt ID it received from the claim to the claim/complete
> > register. The PLIC does not check whether the completion ID is the same
> > as the last claim ID for that target. If the completion ID does not match
> > an interrupt source that *is currently enabled* for the target, the
> > completion is silently ignored."
> >
> > Commit 69ea463021be ("irqchip/sifive-plic: Fixup EOI failed when masked")
> > ensured that by enabling the interrupt if needed before EOI.
> >
> > Commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask
> > operations") removed the interrupt enabling code from the previous
> > commit, because it assumes that interrupt should be enabled at the point
> > of EOI. However, this is incorrect: there is a small window after a hart
> > claiming an interrupt and before irq_desc->lock getting acquired,
> > interrupt can be disabled during this window. Thus, EOI can be invoked
> > while the interrupt is disabled, effectively nullify this EOI.
> >
> > Make sure that interrupt is really enabled before EOI.
>
> Could you please try the patch I previously sent for this issue[1]?

Unfortunately my system still gets frozen with the patch applied :(
I think because the patch doesn't prevent plic_irq_shutdown() from getting
called after the hart claiming the interrupt and before irq_desc is locked up.

> It should fix the bug without complicating the IRQ hot path.

I can add an unlikely() to help that a bit, because from my experience, it is
quite rare that EOI happens with interrupt disabled.

Best regards,
Nam
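For readers unfamiliar with the hint mentioned above, the following is a minimal user-space sketch of what an EOI path guarded by `unlikely()` might look like. It is not the driver code: `struct mock_ctx`, `mock_toggle()`, `mock_complete()` and `mock_eoi()` are hypothetical stand-ins for the per-context PLIC state, `plic_toggle()`, the `writel()` to `CONTEXT_CLAIM`, and `plic_irq_eoi()`. The `unlikely()` macro is defined locally in terms of the GCC/Clang `__builtin_expect` builtin, mirroring the kernel's definition.

```c
#include <stdbool.h>

/* Branch hint, defined here the way the kernel defines it (GCC/Clang). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for one interrupt source on one PLIC context. */
struct mock_ctx {
	bool enabled;         /* enable bit of the source for this context */
	unsigned completions; /* completion writes that actually took effect */
	unsigned toggles;     /* enable-bit writes issued from the EOI path */
};

/* Stand-in for plic_toggle(): set or clear the enable bit. */
static void mock_toggle(struct mock_ctx *c, bool enable)
{
	c->enabled = enable;
	c->toggles++;
}

/* Stand-in for the claim/complete write: ignored while disabled. */
static void mock_complete(struct mock_ctx *c)
{
	if (c->enabled)
		c->completions++;
}

/* EOI with the rare disabled case marked as the cold branch. */
void mock_eoi(struct mock_ctx *c, bool irq_disabled)
{
	if (unlikely(irq_disabled)) {
		mock_toggle(c, true);
		mock_complete(c);
		mock_toggle(c, false);
	} else {
		mock_complete(c);
	}
}
```

The hint only influences how the compiler lays out the two branches; the common path still performs a single completion write and no enable-bit accesses.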
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 5b7bc4fd9517..0857a516c35b 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -148,7 +148,13 @@ static void plic_irq_eoi(struct irq_data *d)
 {
 	struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
 
-	writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+	if (irqd_irq_disabled(d)) {
+		plic_toggle(handler, d->hwirq, 1);
+		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+		plic_toggle(handler, d->hwirq, 0);
+	} else {
+		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+	}
 }
 
 #ifdef CONFIG_SMP
RISC-V PLIC cannot EOI disabled interrupts, as explained in the
description of Interrupt Completion in the PLIC spec:

"The PLIC signals it has completed executing an interrupt handler by
writing the interrupt ID it received from the claim to the claim/complete
register. The PLIC does not check whether the completion ID is the same
as the last claim ID for that target. If the completion ID does not match
an interrupt source that *is currently enabled* for the target, the
completion is silently ignored."

Commit 69ea463021be ("irqchip/sifive-plic: Fixup EOI failed when masked")
ensured that by enabling the interrupt if needed before EOI.

Commit a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask
operations") removed the interrupt enabling code from the previous
commit, because it assumes that the interrupt is always enabled at the
point of EOI. However, this is incorrect: there is a small window after a
hart claims an interrupt and before irq_desc->lock is acquired, and the
interrupt can be disabled during this window. Thus, EOI can be invoked
while the interrupt is disabled, effectively nullifying this EOI.

Make sure that the interrupt is really enabled before EOI.

Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations")
Cc: <stable@vger.kernel.org>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 drivers/irqchip/irq-sifive-plic.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
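The race described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct toy_plic` and the `toy_*` functions are hypothetical names, a single source on a single target is modelled, and the "claimed but not completed" state is reduced to one flag. The model follows the quoted spec text in one respect only: a completion for a source that is not currently enabled is silently ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one PLIC source as seen by one target (hypothetical). */
struct toy_plic {
	bool enabled;   /* enable bit of the source for this target */
	bool in_flight; /* claimed, completion not yet accepted */
};

/* Claim: the source stays blocked until a completion is accepted. */
static void toy_claim(struct toy_plic *p)
{
	p->in_flight = true;
}

/* Completion write: silently ignored while the source is disabled. */
static void toy_complete(struct toy_plic *p)
{
	if (p->enabled)
		p->in_flight = false;
}

/* EOI shaped like the patch: enable around the completion if needed. */
static void toy_eoi_fixed(struct toy_plic *p)
{
	if (!p->enabled) {
		p->enabled = true;
		toy_complete(p);
		p->enabled = false;
	} else {
		toy_complete(p);
	}
}

/*
 * Replay the window from the commit message: claim, then a disable that
 * lands before EOI. Returns true if the source is left stuck in flight.
 */
bool toy_race_leaves_stuck(bool use_fix)
{
	struct toy_plic p = { .enabled = true, .in_flight = false };

	toy_claim(&p);
	assert(p.in_flight);
	p.enabled = false; /* disabled inside the race window */

	if (use_fix)
		toy_eoi_fixed(&p);
	else
		toy_complete(&p); /* plain completion write, as before the patch */

	return p.in_flight;
}
```

In this model, `toy_race_leaves_stuck(false)` returns true because the completion is dropped and the source never leaves the claimed state, while `toy_race_leaves_stuck(true)` returns false and leaves the enable bit cleared, as the disable requested.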