Message ID | 20211016032200.2869998-2-guoren@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | irqchip: riscv: Add thead,c900-plic support |
On 10/15/21 10:21 PM, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
> for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
>
> 2) The C9xx PLIC does not comply with the interrupt claim/completion
> process defined by the RISC-V PLIC specification because C9xx PLIC
> will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> the generic handle_fasteoi_irq() used in the PLIC driver.
>
> 3) This patch adds an errata fix for IRQS_ONESHOT handling on
> C9xx PLIC by using irq_enable/disable() callbacks instead of
> irq_mask/unmask().
>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Cc: Anup Patel <anup@brainfault.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Atish Patra <atish.patra@wdc.com>
>
> ---
>
> Changes since V4:
>  - Update comment by Anup
>
> Changes since V3:
>  - Rename "c9xx" to "c900"
>  - Add sifive_plic_chip and thead_plic_chip for difference
>
> Changes since V2:
>  - Add a separate compatible string "thead,c9xx-plic"
>  - set irq_mask/unmask of "plic_chip" to NULL and point
>    irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
>  - Add a detailed comment block in plic_init() about the
>    differences in Claim/Completion process of RISC-V PLIC and C9xx
>    PLIC.
> ---
>  drivers/irqchip/irq-sifive-plic.c | 34 +++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> index cf74cfa82045..960b29d02070 100644
> --- a/drivers/irqchip/irq-sifive-plic.c
> +++ b/drivers/irqchip/irq-sifive-plic.c
> @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
>          writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
>  }
>
> -static struct irq_chip plic_chip = {
> +static struct irq_chip sifive_plic_chip = {
>          .name = "SiFive PLIC",
>          .irq_mask = plic_irq_mask,
>          .irq_unmask = plic_irq_unmask,
> @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
>  #endif
>  };
>
> +/*
> + * The C9xx PLIC does not comply with the interrupt claim/completion
> + * process defined by the RISC-V PLIC specification because C9xx PLIC
> + * will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> + * the generic handle_fasteoi_irq() used in the PLIC driver.
> + */
> +static struct irq_chip thead_plic_chip = {
> +        .name = "T-Head PLIC",
> +        .irq_disable = plic_irq_mask,
> +        .irq_enable = plic_irq_unmask,
> +        .irq_eoi = plic_irq_eoi,
> +#ifdef CONFIG_SMP
> +        .irq_set_affinity = plic_set_affinity,
> +#endif

I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
.irq_eoi is called at the end of the hard IRQ handler. This unmasks the
IRQ before the irqthread has a chance to run, so it causes an interrupt
storm for any threaded level IRQ (I saw this happen for sun8i_thermal).

With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the irqthread
runs. This is good. Except that the call to unmask_threaded_irq() is
inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
set because .irq_mask is NULL. So the end result is that the IRQ is
never EOI'd and is masked permanently.

If you set .flags = IRQCHIP_EOI_THREADED, and additionally set .irq_mask
and .irq_unmask to a dummy function that does nothing, the IRQ core will
properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
But adding dummy functions seems not so ideal, so I am not sure if this
is the best solution.

Regards,
Samuel

> +};
> +
> +static struct irq_chip *def_plic_chip = &sifive_plic_chip;
> +
>  static int plic_irqdomain_map(struct irq_domain *d, unsigned int irq,
>                                irq_hw_number_t hwirq)
>  {
>          struct plic_priv *priv = d->host_data;
>
> -        irq_domain_set_info(d, irq, hwirq, &plic_chip, d->host_data,
> +        irq_domain_set_info(d, irq, hwirq, def_plic_chip, d->host_data,
>                              handle_fasteoi_irq, NULL, NULL);
>          irq_set_noprobe(irq);
>          irq_set_affinity(irq, &priv->lmask);
> @@ -390,5 +410,15 @@ static int __init plic_init(struct device_node *node,
>          return error;
>  }
>
> +static int __init thead_c900_plic_init(struct device_node *node,
> +                                       struct device_node *parent)
> +{
> +        def_plic_chip = &thead_plic_chip;
> +
> +        return plic_init(node, parent);
> +}
> +
>  IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
>  IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
> +IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", thead_c900_plic_init);
> +IRQCHIP_DECLARE(allwinner_sun20i_d1_plic, "allwinner,sun20i-d1-plic", thead_c900_plic_init);
>
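
[Editorial sketch, not part of the thread] The three cases Samuel
describes can be illustrated with a small user-space model. This is only
a sketch under stated assumptions: every toy_* name is hypothetical, the
two stages are a heavily simplified stand-in for the hard IRQ path and
the post-irqthread path of a ONESHOT interrupt, and none of this is
kernel code. The one property it tries to capture is that the core only
records "masked" when the chip provides a mask callback, and only
EOIs/unmasks after the thread when "masked" is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; this is a toy model, not kernel code. */
struct toy_chip {
	void (*mask)(void);	/* NULL models a chip without .irq_mask */
	bool eoi_threaded;	/* models the IRQCHIP_EOI_THREADED flag */
};

struct toy_desc {
	const struct toy_chip *chip;
	bool masked;		/* models IRQD_IRQ_MASKED */
	int early_eoi;		/* EOIs issued before the thread ran */
	int late_eoi;		/* EOIs issued after the thread ran */
};

/* A mask callback that does nothing, like the dummy Samuel tried. */
static void toy_noop(void)
{
}

/* The core only records "masked" if the chip can actually mask. */
static void toy_mask(struct toy_desc *d)
{
	if (d->chip->mask) {
		d->chip->mask();
		d->masked = true;
	}
}

/* Hard IRQ stage: mask, (wake the thread), EOI now unless it is deferred. */
static void toy_hardirq(struct toy_desc *d)
{
	toy_mask(d);
	if (!d->chip->eoi_threaded)
		d->early_eoi++;
}

/* Post-thread stage: deferred EOI and unmask happen only if "masked" is set. */
static void toy_thread_done(struct toy_desc *d)
{
	if (!d->masked)
		return;
	if (d->chip->eoi_threaded)
		d->late_eoi++;
	d->masked = false;
}

/* Push one interrupt through both stages and report what happened. */
static struct toy_desc toy_run(const struct toy_chip *chip)
{
	struct toy_desc d = { .chip = chip };

	toy_hardirq(&d);
	toy_thread_done(&d);
	return d;
}
```

In this model, a chip with no mask callback and no deferred EOI gets its
EOI before the thread runs (the storm case), a chip with deferred EOI
but no mask callback never gets an EOI at all, and adding a do-nothing
mask callback makes the deferred EOI happen after the thread.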
On Mon, Oct 18, 2021 at 10:47 AM Samuel Holland <samuel@sholland.org> wrote:
>
> On 10/15/21 10:21 PM, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
> > for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> > due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> > drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
> >
> > 2) The C9xx PLIC does not comply with the interrupt claim/completion
> > process defined by the RISC-V PLIC specification because C9xx PLIC
> > will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > the generic handle_fasteoi_irq() used in the PLIC driver.
> >
> > 3) This patch adds an errata fix for IRQS_ONESHOT handling on
> > C9xx PLIC by using irq_enable/disable() callbacks instead of
> > irq_mask/unmask().
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Cc: Anup Patel <anup@brainfault.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Palmer Dabbelt <palmer@dabbelt.com>
> > Cc: Atish Patra <atish.patra@wdc.com>
> >
> > ---
> >
> > Changes since V4:
> >  - Update comment by Anup
> >
> > Changes since V3:
> >  - Rename "c9xx" to "c900"
> >  - Add sifive_plic_chip and thead_plic_chip for difference
> >
> > Changes since V2:
> >  - Add a separate compatible string "thead,c9xx-plic"
> >  - set irq_mask/unmask of "plic_chip" to NULL and point
> >    irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
> >  - Add a detailed comment block in plic_init() about the
> >    differences in Claim/Completion process of RISC-V PLIC and C9xx
> >    PLIC.
> > ---
> >  drivers/irqchip/irq-sifive-plic.c | 34 +++++++++++++++++++++++++++++--
> >  1 file changed, 32 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > index cf74cfa82045..960b29d02070 100644
> > --- a/drivers/irqchip/irq-sifive-plic.c
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
> >          writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> >  }
> >
> > -static struct irq_chip plic_chip = {
> > +static struct irq_chip sifive_plic_chip = {
> >          .name = "SiFive PLIC",
> >          .irq_mask = plic_irq_mask,
> >          .irq_unmask = plic_irq_unmask,
> > @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
> >  #endif
> >  };
> >
> > +/*
> > + * The C9xx PLIC does not comply with the interrupt claim/completion
> > + * process defined by the RISC-V PLIC specification because C9xx PLIC
> > + * will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > + * the generic handle_fasteoi_irq() used in the PLIC driver.
> > + */
> > +static struct irq_chip thead_plic_chip = {
> > +        .name = "T-Head PLIC",
> > +        .irq_disable = plic_irq_mask,
> > +        .irq_enable = plic_irq_unmask,
> > +        .irq_eoi = plic_irq_eoi,
> > +#ifdef CONFIG_SMP
> > +        .irq_set_affinity = plic_set_affinity,
> > +#endif
> I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> IRQ before the irqthread has a chance to run, so it causes an interrupt
> storm for any threaded level IRQ (I saw this happen for sun8i_thermal).
>
> With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the irqthread
> runs. This is good. Except that the call to unmask_threaded_irq() is
> inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> set because .irq_mask is NULL. So the end result is that the IRQ is
> never EOI'd and is masked permanently.
>
> If you set .flags = IRQCHIP_EOI_THREADED, and additionally set .irq_mask
> and .irq_unmask to a dummy function that does nothing, the IRQ core will
> properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> But adding dummy functions seems not so ideal, so I am not sure if this
> is the best solution.

This series only tries to optimize a particular case in
handle_fasteoi_irq() for T-HEAD PLIC. I am not sure about this series
either.

Although, we do need separate compatible strings for T-HEAD PLIC
because T-HEAD PLIC is not compliant with RISC-V PLIC specification.

Regards,
Anup

>
> Regards,
> Samuel
>
> > +};
> > +
> > +static struct irq_chip *def_plic_chip = &sifive_plic_chip;
> > +
> >  static int plic_irqdomain_map(struct irq_domain *d, unsigned int irq,
> >                                irq_hw_number_t hwirq)
> >  {
> >          struct plic_priv *priv = d->host_data;
> >
> > -        irq_domain_set_info(d, irq, hwirq, &plic_chip, d->host_data,
> > +        irq_domain_set_info(d, irq, hwirq, def_plic_chip, d->host_data,
> >                              handle_fasteoi_irq, NULL, NULL);
> >          irq_set_noprobe(irq);
> >          irq_set_affinity(irq, &priv->lmask);
> > @@ -390,5 +410,15 @@ static int __init plic_init(struct device_node *node,
> >          return error;
> >  }
> >
> > +static int __init thead_c900_plic_init(struct device_node *node,
> > +                                       struct device_node *parent)
> > +{
> > +        def_plic_chip = &thead_plic_chip;
> > +
> > +        return plic_init(node, parent);
> > +}
> > +
> >  IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
> >  IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
> > +IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", thead_c900_plic_init);
> > +IRQCHIP_DECLARE(allwinner_sun20i_d1_plic, "allwinner,sun20i-d1-plic", thead_c900_plic_init);
> >
>
Hi Samuel,

On Mon, Oct 18, 2021 at 1:17 PM Samuel Holland <samuel@sholland.org> wrote:
>
> On 10/15/21 10:21 PM, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
> > for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> > due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> > drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
> >
> > 2) The C9xx PLIC does not comply with the interrupt claim/completion
> > process defined by the RISC-V PLIC specification because C9xx PLIC
> > will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > the generic handle_fasteoi_irq() used in the PLIC driver.
> >
> > 3) This patch adds an errata fix for IRQS_ONESHOT handling on
> > C9xx PLIC by using irq_enable/disable() callbacks instead of
> > irq_mask/unmask().
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Cc: Anup Patel <anup@brainfault.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Palmer Dabbelt <palmer@dabbelt.com>
> > Cc: Atish Patra <atish.patra@wdc.com>
> >
> > ---
> >
> > Changes since V4:
> >  - Update comment by Anup
> >
> > Changes since V3:
> >  - Rename "c9xx" to "c900"
> >  - Add sifive_plic_chip and thead_plic_chip for difference
> >
> > Changes since V2:
> >  - Add a separate compatible string "thead,c9xx-plic"
> >  - set irq_mask/unmask of "plic_chip" to NULL and point
> >    irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
> >  - Add a detailed comment block in plic_init() about the
> >    differences in Claim/Completion process of RISC-V PLIC and C9xx
> >    PLIC.
> > ---
> >  drivers/irqchip/irq-sifive-plic.c | 34 +++++++++++++++++++++++++++++--
> >  1 file changed, 32 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > index cf74cfa82045..960b29d02070 100644
> > --- a/drivers/irqchip/irq-sifive-plic.c
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
> >          writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> >  }
> >
> > -static struct irq_chip plic_chip = {
> > +static struct irq_chip sifive_plic_chip = {
> >          .name = "SiFive PLIC",
> >          .irq_mask = plic_irq_mask,
> >          .irq_unmask = plic_irq_unmask,
> > @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
> >  #endif
> >  };
> >
> > +/*
> > + * The C9xx PLIC does not comply with the interrupt claim/completion
> > + * process defined by the RISC-V PLIC specification because C9xx PLIC
> > + * will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > + * the generic handle_fasteoi_irq() used in the PLIC driver.
> > + */
> > +static struct irq_chip thead_plic_chip = {
> > +        .name = "T-Head PLIC",
> > +        .irq_disable = plic_irq_mask,
> > +        .irq_enable = plic_irq_unmask,
> > +        .irq_eoi = plic_irq_eoi,
> > +#ifdef CONFIG_SMP
> > +        .irq_set_affinity = plic_set_affinity,
> > +#endif
> I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> IRQ before the irqthread has a chance to run, so it causes an interrupt
> storm for any threaded level IRQ (I saw this happen for sun8i_thermal).

devm_request_threaded_irq(struct device *dev, unsigned int irq,
                          irq_handler_t handler, irq_handler_t thread_fn

I think you should pull down the IRQ level signal in "handler" and put
the backend progress into "thread_fn". Could you give out your driver
code?

>
> With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the irqthread
> runs. This is good. Except that the call to unmask_threaded_irq() is
> inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> set because .irq_mask is NULL. So the end result is that the IRQ is
> never EOI'd and is masked permanently.

I don't think we should use IRQCHIP_EOI_THREADED because it makes the
IRQ path complex, we need to let the driver separate their "handler" &
"thread_fn" properly. How do you think?

>
> If you set .flags = IRQCHIP_EOI_THREADED, and additionally set .irq_mask
> and .irq_unmask to a dummy function that does nothing, the IRQ core will
> properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> But adding dummy functions seems not so ideal, so I am not sure if this
> is the best solution.

It's ununderstandable, we need to find a way. Thx for the test & the
question.

>
> Regards,
> Samuel
>
> > +};
> > +
> > +static struct irq_chip *def_plic_chip = &sifive_plic_chip;
> > +
> >  static int plic_irqdomain_map(struct irq_domain *d, unsigned int irq,
> >                                irq_hw_number_t hwirq)
> >  {
> >          struct plic_priv *priv = d->host_data;
> >
> > -        irq_domain_set_info(d, irq, hwirq, &plic_chip, d->host_data,
> > +        irq_domain_set_info(d, irq, hwirq, def_plic_chip, d->host_data,
> >                              handle_fasteoi_irq, NULL, NULL);
> >          irq_set_noprobe(irq);
> >          irq_set_affinity(irq, &priv->lmask);
> > @@ -390,5 +410,15 @@ static int __init plic_init(struct device_node *node,
> >          return error;
> >  }
> >
> > +static int __init thead_c900_plic_init(struct device_node *node,
> > +                                       struct device_node *parent)
> > +{
> > +        def_plic_chip = &thead_plic_chip;
> > +
> > +        return plic_init(node, parent);
> > +}
> > +
> >  IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
> >  IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
> > +IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", thead_c900_plic_init);
> > +IRQCHIP_DECLARE(allwinner_sun20i_d1_plic, "allwinner,sun20i-d1-plic", thead_c900_plic_init);
> >
>
On 2021-10-18 06:17, Samuel Holland wrote:
> On 10/15/21 10:21 PM, guoren@kernel.org wrote:
>> From: Guo Ren <guoren@linux.alibaba.com>
>>
>> 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly

Drop this useless numbering.

>> for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
>> due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
>> drivers using handle_fasteoi_irq() also implement irq_mask/unmask().

This paragraph doesn't provide any useful information in the context
of this patch. That's at best cover-letter material.

>> 2) The C9xx PLIC does not comply with the interrupt claim/completion
>> process defined by the RISC-V PLIC specification because C9xx PLIC
>> will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
>> and the IRQ will be unmasked upon completion by PLIC driver (i.e.
>> writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
>> the generic handle_fasteoi_irq() used in the PLIC driver.
>>
>> 3) This patch adds an errata fix for IRQS_ONESHOT handling on

s/fix/workaround/

>> C9xx PLIC by using irq_enable/disable() callbacks instead of
>> irq_mask/unmask().

From Documentation/process/submitting-patches.rst:

<quote>
Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour.
</quote>

>>
>> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
>> Cc: Anup Patel <anup@brainfault.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Palmer Dabbelt <palmer@dabbelt.com>
>> Cc: Atish Patra <atish.patra@wdc.com>
>>
>> ---
>>
>> Changes since V4:
>>  - Update comment by Anup
>>
>> Changes since V3:
>>  - Rename "c9xx" to "c900"
>>  - Add sifive_plic_chip and thead_plic_chip for difference
>>
>> Changes since V2:
>>  - Add a separate compatible string "thead,c9xx-plic"
>>  - set irq_mask/unmask of "plic_chip" to NULL and point
>>    irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
>>  - Add a detailed comment block in plic_init() about the
>>    differences in Claim/Completion process of RISC-V PLIC and C9xx
>>    PLIC.
>> ---
>>  drivers/irqchip/irq-sifive-plic.c | 34
>> +++++++++++++++++++++++++++++--
>>  1 file changed, 32 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-sifive-plic.c
>> b/drivers/irqchip/irq-sifive-plic.c
>> index cf74cfa82045..960b29d02070 100644
>> --- a/drivers/irqchip/irq-sifive-plic.c
>> +++ b/drivers/irqchip/irq-sifive-plic.c
>> @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
>>          writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
>>  }
>>
>> -static struct irq_chip plic_chip = {
>> +static struct irq_chip sifive_plic_chip = {
>>          .name = "SiFive PLIC",
>>          .irq_mask = plic_irq_mask,
>>          .irq_unmask = plic_irq_unmask,
>> @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
>>  #endif
>>  };
>>
>> +/*
>> + * The C9xx PLIC does not comply with the interrupt claim/completion
>> + * process defined by the RISC-V PLIC specification because C9xx PLIC
>> + * will mask an IRQ when it is claimed by PLIC driver (i.e.
>> readl(claim)
>> + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
>> + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT
>> by
>> + * the generic handle_fasteoi_irq() used in the PLIC driver.
>> + */
>> +static struct irq_chip thead_plic_chip = {
>> +        .name = "T-Head PLIC",
>> +        .irq_disable = plic_irq_mask,
>> +        .irq_enable = plic_irq_unmask,
>> +        .irq_eoi = plic_irq_eoi,
>> +#ifdef CONFIG_SMP
>> +        .irq_set_affinity = plic_set_affinity,
>> +#endif
> I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> IRQ before the irqthread has a chance to run, so it causes an interrupt
> storm for any threaded level IRQ (I saw this happen for sun8i_thermal).
>
> With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the
> irqthread
> runs. This is good. Except that the call to unmask_threaded_irq() is
> inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> set because .irq_mask is NULL. So the end result is that the IRQ is
> never EOI'd and is masked permanently.
>
> If you set .flags = IRQCHIP_EOI_THREADED, and additionally set
> .irq_mask
> and .irq_unmask to a dummy function that does nothing, the IRQ core
> will
> properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> But adding dummy functions seems not so ideal, so I am not sure if this
> is the best solution.

This series is totally broken indeed, because it assumes that
enable/disable are a substitute to mask/unmask. Nothing could be further
from the truth. mask/unmask must be implemented, and enable/disable
supplement them if the HW requires something different at startup time.

If you have an 'automask' behaviour and yet the HW doesn't record this
in a separate bit, then you need to track this by yourself in the
irq_eoi() callback instead. I guess that you would skip the write to
the CLAIM register in this case, though I have no idea whether this
breaks the HW interrupt state or not.

There is an example of this in the Apple AIC driver.

        M.
Thx Marc,

On Mon, Oct 18, 2021 at 3:21 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On 2021-10-18 06:17, Samuel Holland wrote:
> > On 10/15/21 10:21 PM, guoren@kernel.org wrote:
> >> From: Guo Ren <guoren@linux.alibaba.com>
> >>
> >> 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
>
> Drop this useless numbering.

Okay

>
> >> for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> >> due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> >> drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
>
> This paragraph doesn't provide any useful information in the context
> of this patch. That's at best cover-letter material.

Okay. I would reconstruct the sentence.

>
> >> 2) The C9xx PLIC does not comply with the interrupt claim/completion
> >> process defined by the RISC-V PLIC specification because C9xx PLIC
> >> will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> >> and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> >> writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> >> the generic handle_fasteoi_irq() used in the PLIC driver.
> >>
> >> 3) This patch adds an errata fix for IRQS_ONESHOT handling on
>
> s/fix/workaround/

Okay

>
> >> C9xx PLIC by using irq_enable/disable() callbacks instead of
> >> irq_mask/unmask().
>
> From Documentation/process/submitting-patches.rst:
>
> <quote>
> Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
> to do frotz", as if you are giving orders to the codebase to change
> its behaviour.
> </quote>

I would try the style in the next version of the patch.

>
> >>
> >> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> >> Cc: Anup Patel <anup@brainfault.org>
> >> Cc: Thomas Gleixner <tglx@linutronix.de>
> >> Cc: Marc Zyngier <maz@kernel.org>
> >> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> >> Cc: Atish Patra <atish.patra@wdc.com>
> >>
> >> ---
> >>
> >> Changes since V4:
> >>  - Update comment by Anup
> >>
> >> Changes since V3:
> >>  - Rename "c9xx" to "c900"
> >>  - Add sifive_plic_chip and thead_plic_chip for difference
> >>
> >> Changes since V2:
> >>  - Add a separate compatible string "thead,c9xx-plic"
> >>  - set irq_mask/unmask of "plic_chip" to NULL and point
> >>    irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
> >>  - Add a detailed comment block in plic_init() about the
> >>    differences in Claim/Completion process of RISC-V PLIC and C9xx
> >>    PLIC.
> >> ---
> >>  drivers/irqchip/irq-sifive-plic.c | 34
> >> +++++++++++++++++++++++++++++--
> >>  1 file changed, 32 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/irqchip/irq-sifive-plic.c
> >> b/drivers/irqchip/irq-sifive-plic.c
> >> index cf74cfa82045..960b29d02070 100644
> >> --- a/drivers/irqchip/irq-sifive-plic.c
> >> +++ b/drivers/irqchip/irq-sifive-plic.c
> >> @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
> >>          writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> >>  }
> >>
> >> -static struct irq_chip plic_chip = {
> >> +static struct irq_chip sifive_plic_chip = {
> >>          .name = "SiFive PLIC",
> >>          .irq_mask = plic_irq_mask,
> >>          .irq_unmask = plic_irq_unmask,
> >> @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
> >>  #endif
> >>  };
> >>
> >> +/*
> >> + * The C9xx PLIC does not comply with the interrupt claim/completion
> >> + * process defined by the RISC-V PLIC specification because C9xx PLIC
> >> + * will mask an IRQ when it is claimed by PLIC driver (i.e.
> >> readl(claim)
> >> + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> >> + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT
> >> by
> >> + * the generic handle_fasteoi_irq() used in the PLIC driver.
> >> + */
> >> +static struct irq_chip thead_plic_chip = {
> >> +        .name = "T-Head PLIC",
> >> +        .irq_disable = plic_irq_mask,
> >> +        .irq_enable = plic_irq_unmask,
> >> +        .irq_eoi = plic_irq_eoi,
> >> +#ifdef CONFIG_SMP
> >> +        .irq_set_affinity = plic_set_affinity,
> >> +#endif
> > I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> > .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> > IRQ before the irqthread has a chance to run, so it causes an interrupt
> > storm for any threaded level IRQ (I saw this happen for sun8i_thermal).
> >
> > With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the
> > irqthread
> > runs. This is good. Except that the call to unmask_threaded_irq() is
> > inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> > set because .irq_mask is NULL. So the end result is that the IRQ is
> > never EOI'd and is masked permanently.
> >
> > If you set .flags = IRQCHIP_EOI_THREADED, and additionally set
> > .irq_mask
> > and .irq_unmask to a dummy function that does nothing, the IRQ core
> > will
> > properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> > But adding dummy functions seems not so ideal, so I am not sure if this
> > is the best solution.
>
> This series is totally broken indeed, because it assumes that
> enable/disable are a substitute to mask/unmask. Nothing could be further
> from the truth. mask/unmask must be implemented, and enable/disable
> supplement them if the HW requires something different at startup time.

After re-studying irqchip, I agree that you are right. The csky-mpintc
driver needs to be corrected, I will send patches asap. I hope you can
continue to help review.

handle_fasteoi_irq itself has avoided mask/unmask, so my understanding
is wrong. The mask/unmask design can prevent "rogue interrupts" from
damaging the system.

C-SKY guys encountered the thread_irq interrupt storm problem. The
solution at that time was to pull the interrupt signal in the handler
and put the rest in thread_fn. If we implemented the mask/unmask
correctly in csky-mpintc, it was unnecessary.

>
> If you have an 'automask' behavior and yet the HW doesn't record this
> in a separate bit, then you need to track this by yourself in the
> irq_eoi() callback instead. I guess that you would skip the write to
> the CLAIM register in this case, though I have no idea whether this
> breaks
> the HW interrupt state or not.

The problem is when enable bit is 0 for that irq_number,
"writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
the hw state machine. Then this irq would enter in ack state and no
continues irqs could come in.

>
> There is an example of this in the Apple AIC driver.

Thx for the tip, I think your suggestion is:

+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -163,7 +163,12 @@ static void plic_irq_eoi(struct irq_data *d)
 {
        struct plic_handler *handler = this_cpu_ptr(&plic_handlers);

-       writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+       if (irqd_irq_masked(d)) {
+               plic_irq_unmask(d);
+               writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+               plic_irq_mask(d);
+       } else {
+               writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+       }
 }

The above could solve the problem, I've tested it on qemu & our hw
platform.

>
>          M.
> --
> Jazz is not dead. It just smells funny...

--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
On Tue, 19 Oct 2021 10:33:49 +0100,
Guo Ren <guoren@kernel.org> wrote:

> > If you have an 'automask' behavior and yet the HW doesn't record this
> > in a separate bit, then you need to track this by yourself in the
> > irq_eoi() callback instead. I guess that you would skip the write to
> > the CLAIM register in this case, though I have no idea whether this
> > breaks
> > the HW interrupt state or not.
> The problem is when enable bit is 0 for that irq_number,
> "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> the hw state machine. Then this irq would enter in ack state and no
> continues irqs could come in.

Really? This means that you cannot mask an interrupt while it is being
handled? How great...

> >
> > There is an example of this in the Apple AIC driver.
> Thx for the tip, I think your suggestion is:
> +++ b/drivers/irqchip/irq-sifive-plic.c
> @@ -163,7 +163,12 @@ static void plic_irq_eoi(struct irq_data *d)
>  {
>         struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
>
> -       writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> +       if (irqd_irq_masked(d)) {
> +               plic_irq_unmask(d);
> +               writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> +               plic_irq_mask(d);

This looks pretty dodgy. You are relying on interrupts being globally
masked on the CPU, I guess. It probably works today, but man, what a
terrible HW implementation.

You'll definitely have to move this into a c900-specific callback.

        M.
On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 19 Oct 2021 10:33:49 +0100,
> Guo Ren <guoren@kernel.org> wrote:
>
> > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > in a separate bit, then you need to track this by yourself in the
> > > irq_eoi() callback instead. I guess that you would skip the write to
> > > the CLAIM register in this case, though I have no idea whether this
> > > breaks
> > > the HW interrupt state or not.
> > The problem is when enable bit is 0 for that irq_number,
> > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > the hw state machine. Then this irq would enter in ack state and no
> > continues irqs could come in.
>
> Really? This means that you cannot mask an interrupt while it is being
> handled? How great...

If the completion ID does not match an interrupt source that is
currently enabled for the target, the completion is silently ignored.
So, C9xx completion depends on enable-bit.

>
> > >
> > > There is an example of this in the Apple AIC driver.
> > Thx for the tip, I think your suggestion is:
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -163,7 +163,12 @@ static void plic_irq_eoi(struct irq_data *d)
> >  {
> >         struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
> >
> > -       writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> > +       if (irqd_irq_masked(d)) {
> > +               plic_irq_unmask(d);
> > +               writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> > +               plic_irq_mask(d);
>
> This looks pretty dodgy. You are relying on interrupts being globally
> masked on the CPU, I guess. It probably works today, but man, what a
> terrible HW implementation.
>
> You'll definitely have to move this into a c900-specific callback.

Yes, it's an errata.

>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
On Tue, 19 Oct 2021 14:27:02 +0100, Guo Ren <guoren@kernel.org> wrote: > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > Guo Ren <guoren@kernel.org> wrote: > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > in a separate bit, then you need to track this by yourself in the > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > the CLAIM register in this case, though I have no idea whether this > > > > breaks > > > > the HW interrupt state or not. > > > The problem is when enable bit is 0 for that irq_number, > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > the hw state machine. Then this irq would enter in ack state and no > > > continues irqs could come in. > > > > Really? This means that you cannot mask an interrupt while it is being > > handled? How great... > If the completion ID does not match an interrupt source that is > currently enabled for the target, the completion is silently ignored. > So, C9xx completion depends on enable-bit. Is that what the PLIC spec says? Or what your implementation does? I can understand that one implementation would be broken, but if the PLIC architecture itself is broken, that's far more concerning. M.
On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <maz@kernel.org> wrote: > > On Tue, 19 Oct 2021 14:27:02 +0100, > Guo Ren <guoren@kernel.org> wrote: > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > in a separate bit, then you need to track this by yourself in the > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > breaks > > > > > the HW interrupt state or not. > > > > The problem is when enable bit is 0 for that irq_number, > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > the hw state machine. Then this irq would enter in ack state and no > > > > continues irqs could come in. > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > handled? How great... > > If the completion ID does not match an interrupt source that is > > currently enabled for the target, the completion is silently ignored. > > So, C9xx completion depends on enable-bit. > > Is that what the PLIC spec says? Or what your implementation does? I > can understand that one implementation would be broken, but if the > PLIC architecture itself is broken, that's far more concerning.

Here is the description of Interrupt Completion in PLIC spec [1]:

  The PLIC signals it has completed executing an interrupt handler by
  writing the interrupt ID it received from the claim to the claim/complete
  register. The PLIC does not check whether the completion ID is the same
  as the last claim ID for that target. If the completion ID does not match
  an interrupt source that is currently enabled for the target, the
                           ^^ ^^^^^^^^^ ^^^^^^^
  completion is silently ignored.

[1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc Did we misunderstand the PLIC spec? > > M. > > -- > Without deviation from the norm, progress is not possible. -- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/
On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote: > > On Tue, 19 Oct 2021 14:27:02 +0100, > Guo Ren <guoren@kernel.org> wrote: > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > in a separate bit, then you need to track this by yourself in the > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > breaks > > > > > the HW interrupt state or not. > > > > The problem is when enable bit is 0 for that irq_number, > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > the hw state machine. Then this irq would enter in ack state and no > > > > continues irqs could come in. > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > handled? How great... > > If the completion ID does not match an interrupt source that is > > currently enabled for the target, the completion is silently ignored. > > So, C9xx completion depends on enable-bit. > > Is that what the PLIC spec says? Or what your implementation does? I > can understand that one implementation would be broken, but if the > PLIC architecture itself is broken, that's far more concerning. Yes, we are dealing with a broken/non-compliant PLIC implementation. The RISC-V PLIC spec defines a very different behaviour for the interrupt claim (i.e. readl(claim)) and interrupt completion (i.e. writel(claim)). The T-HEAD PLIC implementation does things different from what the RISC-V PLIC spec says because it will mask an interrupt upon interrupt claim whereas PLIC spec says it should only clear the interrupt pending bit (not mask the interrupt). 
Quoting interrupt claim process (chapter 9) from PLIC spec: "The PLIC can perform an interrupt claim by reading the claim/complete register, which returns the ID of the highest priority pending interrupt or zero if there is no pending interrupt. A successful claim will also atomically clear the corresponding pending bit on the interrupt source." Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc Regards, Anup > > M. > > -- > Without deviation from the norm, progress is not possible.
On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote: > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > Guo Ren <guoren@kernel.org> wrote: > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > breaks > > > > > > the HW interrupt state or not. > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > continues irqs could come in. > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > handled? How great... > > > If the completion ID does not match an interrupt source that is > > > currently enabled for the target, the completion is silently ignored. > > > So, C9xx completion depends on enable-bit. > > > > Is that what the PLIC spec says? Or what your implementation does? I > > can understand that one implementation would be broken, but if the > > PLIC architecture itself is broken, that's far more concerning. > > Here is the description of Interrupt Completion in PLIC spec [1]: > > The PLIC signals it has completed executing an interrupt handler by > writing the interrupt ID it received from the claim to the claim/complete > register. The PLIC does not check whether the completion ID is the same > as the last claim ID for that target. 
If the completion ID does not match > an interrupt source that is currently enabled for the target, the > ^^ ^^^^^^^^^ ^^^^^^^ > completion is silently ignored. > > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc > > Did we misunderstand the PLIC spec? > That clause sounds to me like it is due to the SiFive implementation, which the RISC-V PLIC specification is based on. Since the PLIC spec is still a draft I would expect it to change before release.
On Wed, 20 Oct 2021 15:33:49 +0100, Anup Patel <anup@brainfault.org> wrote: > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > Guo Ren <guoren@kernel.org> wrote: > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > breaks > > > > > > the HW interrupt state or not. > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > continues irqs could come in. > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > handled? How great... > > > If the completion ID does not match an interrupt source that is > > > currently enabled for the target, the completion is silently ignored. > > > So, C9xx completion depends on enable-bit. > > > > Is that what the PLIC spec says? Or what your implementation does? I > > can understand that one implementation would be broken, but if the > > PLIC architecture itself is broken, that's far more concerning. > > Yes, we are dealing with a broken/non-compliant PLIC > implementation. > > The RISC-V PLIC spec defines a very different behaviour for the > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e. > writel(claim)). 
The T-HEAD PLIC implementation does things > different from what the RISC-V PLIC spec says because it will > mask an interrupt upon interrupt claim whereas PLIC spec says > it should only clear the interrupt pending bit (not mask the interrupt). > > Quoting interrupt claim process (chapter 9) from PLIC spec: > "The PLIC can perform an interrupt claim by reading the claim/complete > register, which returns the ID of the highest priority pending interrupt or > zero if there is no pending interrupt. A successful claim will also atomically > clear the corresponding pending bit on the interrupt source." > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc

That's not the point I'm making. According to Guo, the PLIC (any implementation of it) will ignore a write to claim on a masked interrupt.

If that's indeed correct, then a sequence such as:

(1) irq = read(claim)
(2) mask from the interrupt handler with the right flags so that it
    isn't done lazily
(3) write(irq, claim)

will result in an interrupt blocked in ack state (and probably no more interrupt for this CPU at this priority). That would be an interesting bug in the current code, but also a pretty bad architectural choice.

M.
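The three-step sequence above can be sketched as a small user-space C model (not kernel code). It assumes the behaviour discussed in this thread: a completion is silently dropped when the enable bit is clear, and a claimed source forwards no further requests until a completion is accepted. The struct plic_model type, the in_service flag standing in for the gateway state, and all function names are hypothetical; priorities and multiple contexts are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SOURCES 8 /* hypothetical size; source 0 means "no interrupt" */

/* Toy model of one PLIC context. All names here are illustrative. */
struct plic_model {
    bool pending[NR_SOURCES];    /* request latched by the gateway */
    bool enabled[NR_SOURCES];    /* enable bit, toggled by mask/unmask */
    bool in_service[NR_SOURCES]; /* claimed, completion not yet accepted */
};

/* A source raises its line; the gateway forwards it only when idle. */
static bool plic_raise(struct plic_model *p, uint32_t src)
{
    if (p->in_service[src])
        return false; /* blocked until a completion is accepted */
    p->pending[src] = true;
    return true;
}

/* Claim: pick a pending, enabled source and clear its pending bit. */
static uint32_t plic_claim(struct plic_model *p)
{
    for (uint32_t src = 1; src < NR_SOURCES; src++) {
        if (p->pending[src] && p->enabled[src]) {
            p->pending[src] = false;
            p->in_service[src] = true;
            return src;
        }
    }
    return 0;
}

/* Completion: silently ignored if the source is not enabled. */
static void plic_complete(struct plic_model *p, uint32_t src)
{
    if (src == 0 || src >= NR_SOURCES || !p->enabled[src])
        return;
    p->in_service[src] = false;
}

/* Run steps (1)-(3), unmask later, and report whether the source is stuck. */
static bool masked_completion_blocks_source(void)
{
    struct plic_model p = {0};
    const uint32_t src = 3;

    p.enabled[src] = true;
    plic_raise(&p, src);

    uint32_t irq = plic_claim(&p); /* (1) irq = read(claim) */
    assert(irq == src);
    p.enabled[irq] = false;        /* (2) mask from the handler */
    plic_complete(&p, irq);        /* (3) write(irq, claim), ignored */
    p.enabled[irq] = true;         /* later unmask, no second completion */

    return !plic_raise(&p, src);   /* true: new requests never get through */
}
```

Under these assumptions masked_completion_blocks_source() returns true: unmasking afterwards does not help, because nothing ever writes the completion again.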
On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <maz@kernel.org> wrote: > > On Wed, 20 Oct 2021 15:33:49 +0100, > Anup Patel <anup@brainfault.org> wrote: > > > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > > breaks > > > > > > > the HW interrupt state or not. > > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > > continues irqs could come in. > > > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > > handled? How great... > > > > If the completion ID does not match an interrupt source that is > > > > currently enabled for the target, the completion is silently ignored. > > > > So, C9xx completion depends on enable-bit. > > > > > > Is that what the PLIC spec says? Or what your implementation does? I > > > can understand that one implementation would be broken, but if the > > > PLIC architecture itself is broken, that's far more concerning. > > > > Yes, we are dealing with a broken/non-compliant PLIC > > implementation. > > > > The RISC-V PLIC spec defines a very different behaviour for the > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e. > > writel(claim)). 
The T-HEAD PLIC implementation does things > > different from what the RISC-V PLIC spec says because it will > > mask an interrupt upon interrupt claim whereas PLIC spec says > > it should only clear the interrupt pending bit (not mask the interrupt). > > > > Quoting interrupt claim process (chapter 9) from PLIC spec: > > "The PLIC can perform an interrupt claim by reading the claim/complete > > register, which returns the ID of the highest priority pending interrupt or > > zero if there is no pending interrupt. A successful claim will also atomically > > clear the corresponding pending bit on the interrupt source." > > > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc > > That's not the point I'm making. According to Guo, the PLIC (any > implementation of it) will ignore a write to claim on a masked > interrupt. Yes, write to claim on a masked interrupt is certainly ignored but read to claim does not automatically mask the interrupt. > > If that's indeed correct, then a sequence such as: > > (1) irq = read(claim) This will return highest priority pending interrupt and clear the pending bit as-per RISC-V PLIC spec. > (2) mask from the interrupt handler with the right flags so that it > isn't done lazily > (3) write(irq, claim) > > will result in an interrupt blocked in ack state (and probably no more > interrupt for this CPU at this priority). That would be an interesting > bug in the current code, but also a pretty bad architectural choice. The interrupt claim/completion is for each interrupt and not at CPU level so if an interrupt is masked then only that interrupt is blocked for all CPUs but other interrupts can still be raised. Regards, Anup > > M. > > -- > Without deviation from the norm, progress is not possible.
On Wed, Oct 20, 2021 at 8:29 PM Darius Rad <darius@bluespec.com> wrote: > > On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote: > > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > > breaks > > > > > > > the HW interrupt state or not. > > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > > continues irqs could come in. > > > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > > handled? How great... > > > > If the completion ID does not match an interrupt source that is > > > > currently enabled for the target, the completion is silently ignored. > > > > So, C9xx completion depends on enable-bit. > > > > > > Is that what the PLIC spec says? Or what your implementation does? I > > > can understand that one implementation would be broken, but if the > > > PLIC architecture itself is broken, that's far more concerning. > > > > Here is the description of Interrupt Completion in PLIC spec [1]: > > > > The PLIC signals it has completed executing an interrupt handler by > > writing the interrupt ID it received from the claim to the claim/complete > > register. 
The PLIC does not check whether the completion ID is the same > > as the last claim ID for that target. If the completion ID does not match > > an interrupt source that is currently enabled for the target, the > > ^^ ^^^^^^^^^ ^^^^^^^ > > completion is silently ignored. > > > > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc > > > > Did we misunderstand the PLIC spec? > > > > That clause sounds to me like it is due to the SiFive implementation, which > the RISC-V PLIC specification is based on. Since the PLIC spec is still a > draft I would expect it to change before release. The SiFive PLIC has been adopted by various RISC-V platforms (including SiFive themselves). Almost all existing RISC-V boards have PLIC as the interrupt controller. Considering the wide usage of PLIC across existing platforms, the RISC-V International has adopted it as an official RISC-V non-ISA spec. Of course, the RISC-V PLIC spec needs to follow the process for RISC-V non-ISA spec but changing the RISC-V PLIC spec now would mean all existing RISC-V platforms will become non-compliant. The RISC-V AIA spec is intended to replace the RISC-V PLIC spec as the new interrupt controller spec for future RISC-V platforms. Regards, Anup
On Wed, 20 Oct 2021 17:08:36 +0100, Anup Patel <anup@brainfault.org> wrote: > > On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Wed, 20 Oct 2021 15:33:49 +0100, > > Anup Patel <anup@brainfault.org> wrote: > > > > > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > > > breaks > > > > > > > > the HW interrupt state or not. > > > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > > > continues irqs could come in. > > > > > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > > > handled? How great... > > > > > If the completion ID does not match an interrupt source that is > > > > > currently enabled for the target, the completion is silently ignored. > > > > > So, C9xx completion depends on enable-bit. > > > > > > > > Is that what the PLIC spec says? Or what your implementation does? I > > > > can understand that one implementation would be broken, but if the > > > > PLIC architecture itself is broken, that's far more concerning. > > > > > > Yes, we are dealing with a broken/non-compliant PLIC > > > implementation. 
> > > > > > The RISC-V PLIC spec defines a very different behaviour for the > > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e. > > > writel(claim)). The T-HEAD PLIC implementation does things > > > different from what the RISC-V PLIC spec says because it will > > > mask an interrupt upon interrupt claim whereas PLIC spec says > > > it should only clear the interrupt pending bit (not mask the interrupt). > > > > > > Quoting interrupt claim process (chapter 9) from PLIC spec: > > > "The PLIC can perform an interrupt claim by reading the claim/complete > > > register, which returns the ID of the highest priority pending interrupt or > > > zero if there is no pending interrupt. A successful claim will also atomically > > > clear the corresponding pending bit on the interrupt source." > > > > > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc > > > > That's not the point I'm making. According to Guo, the PLIC (any > > implementation of it) will ignore a write to claim on a masked > > interrupt. > > Yes, write to claim on a masked interrupt is certainly ignored but > read to claim does not automatically mask the interrupt. > > > > > If that's indeed correct, then a sequence such as: > > > > (1) irq = read(claim) > > This will return highest priority pending interrupt and clear the > pending bit as-per RISC-V PLIC spec. > > > (2) mask from the interrupt handler with the right flags so that it > > isn't done lazily > > (3) write(irq, claim) > > > > will result in an interrupt blocked in ack state (and probably no more > > interrupt for this CPU at this priority). That would be an interesting > > bug in the current code, but also a pretty bad architectural choice. > > The interrupt claim/completion is for each interrupt and not at CPU > level so if an interrupt is masked then only that interrupt is blocked > for all CPUs but other interrupts can still be raised. 
Do you mean that another interrupt of the same priority will be able to be taken on *this* CPU, despite the completion being silently ignored? M.
On Wed, Oct 20, 2021 at 09:48:36PM +0530, Anup Patel wrote: > On Wed, Oct 20, 2021 at 8:29 PM Darius Rad <darius@bluespec.com> wrote: > > > > On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote: > > > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > > > breaks > > > > > > > > the HW interrupt state or not. > > > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > > > continues irqs could come in. > > > > > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > > > handled? How great... > > > > > If the completion ID does not match an interrupt source that is > > > > > currently enabled for the target, the completion is silently ignored. > > > > > So, C9xx completion depends on enable-bit. > > > > > > > > Is that what the PLIC spec says? Or what your implementation does? I > > > > can understand that one implementation would be broken, but if the > > > > PLIC architecture itself is broken, that's far more concerning. 
> > > > > > Here is the description of Interrupt Completion in PLIC spec [1]: > > > > > > The PLIC signals it has completed executing an interrupt handler by > > > writing the interrupt ID it received from the claim to the claim/complete > > > register. The PLIC does not check whether the completion ID is the same > > > as the last claim ID for that target. If the completion ID does not match > > > an interrupt source that is currently enabled for the target, the > > > ^^ ^^^^^^^^^ ^^^^^^^ > > > completion is silently ignored. > > > > > > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc > > > > > > Did we misunderstand the PLIC spec? > > > > > > > That clause sounds to me like it is due to the SiFive implementation, which > > the RISC-V PLIC specification is based on. Since the PLIC spec is still a > > draft I would expect it to change before release. > > The SiFive PLIC has been adopted by various RISC-V platforms (including > SiFive themselves). Almost all existing RISC-V boards have PLIC as the > interrupt controller. > > Considering the wide usage of PLIC across existing platforms, the RISC-V > International has adopted it as an official RISC-V non-ISA spec. ...

You mean is in the process of adopting it, right?

> ... Of course, > the RISC-V PLIC spec needs to follow the process for RISC-V non-ISA spec > but changing the RISC-V PLIC spec now would mean all existing RISC-V > platforms will become non-compliant. >

I would expect the review process to produce a proper specification, rather than a verbatim copy of the SiFive datasheet, and clarify some ambiguous and implementation-specific language. Clarifying the specification does not necessarily make all existing implementations non-compliant, as this has been done numerous times with other RISC-V specifications.

> The RISC-V AIA spec is intended to replace the RISC-V PLIC spec as the > new interrupt controller spec for future RISC-V platforms.
On Thu, Oct 21, 2021 at 12:08 AM Anup Patel <anup@brainfault.org> wrote: > > On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <maz@kernel.org> wrote: > > > > On Wed, 20 Oct 2021 15:33:49 +0100, > > Anup Patel <anup@brainfault.org> wrote: > > > > > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > On Tue, 19 Oct 2021 14:27:02 +0100, > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote: > > > > > > > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100, > > > > > > Guo Ren <guoren@kernel.org> wrote: > > > > > > > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this > > > > > > > > in a separate bit, then you need to track this by yourself in the > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to > > > > > > > > the CLAIM register in this case, though I have no idea whether this > > > > > > > > breaks > > > > > > > > the HW interrupt state or not. > > > > > > > The problem is when enable bit is 0 for that irq_number, > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect > > > > > > > the hw state machine. Then this irq would enter in ack state and no > > > > > > > continues irqs could come in. > > > > > > > > > > > > Really? This means that you cannot mask an interrupt while it is being > > > > > > handled? How great... > > > > > If the completion ID does not match an interrupt source that is > > > > > currently enabled for the target, the completion is silently ignored. > > > > > So, C9xx completion depends on enable-bit. > > > > > > > > Is that what the PLIC spec says? Or what your implementation does? I > > > > can understand that one implementation would be broken, but if the > > > > PLIC architecture itself is broken, that's far more concerning. > > > > > > Yes, we are dealing with a broken/non-compliant PLIC > > > implementation. 
> > > > > > The RISC-V PLIC spec defines a very different behaviour for the > > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e. > > > writel(claim)). The T-HEAD PLIC implementation does things > > > different from what the RISC-V PLIC spec says because it will > > > mask an interrupt upon interrupt claim whereas PLIC spec says > > > it should only clear the interrupt pending bit (not mask the interrupt). > > > > > > Quoting interrupt claim process (chapter 9) from PLIC spec: > > > "The PLIC can perform an interrupt claim by reading the claim/complete > > > register, which returns the ID of the highest priority pending interrupt or > > > zero if there is no pending interrupt. A successful claim will also atomically > > > clear the corresponding pending bit on the interrupt source." > > > > > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc > > > > That's not the point I'm making. According to Guo, the PLIC (any > > implementation of it) will ignore a write to claim on a masked > > interrupt. > > Yes, write to claim on a masked interrupt is certainly ignored but > read to claim does not automatically mask the interrupt. > > > > > If that's indeed correct, then a sequence such as: > > > > (1) irq = read(claim) > > This will return highest priority pending interrupt and clear the > pending bit as-per RISC-V PLIC spec. > > > (2) mask from the interrupt handler with the right flags so that it > > isn't done lazily > > (3) write(irq, claim) > > > > will result in an interrupt blocked in ack state (and probably no more > > interrupt for this CPU at this priority). That would be an interesting > > bug in the current code, but also a pretty bad architectural choice. > > The interrupt claim/completion is for each interrupt and not at CPU > level so if an interrupt is masked then only that interrupt is blocked > for all CPUs but other interrupts can still be raised. 1. 
I think the PLIC can only receive a new incoming IRQ after completion:

  claim IRQ-0
  complete IRQ-0
  claim IRQ-1
  complete IRQ-1
  claim IRQ-2
  complete IRQ-2

Any recursion would break the PLIC, right? That's why we need to mask the IRQ before entering this IRQ's thread_fn.

2.
  plic_handle_irq -> readl(claim)
  handle_fasteoi_irq -> if (desc->istate & IRQS_ONESHOT) mask_irq(desc);
  handle_fasteoi_irq -> chip->irq_eoi(&desc->irq_data); // failed

It seems all ONESHOT IRQs would be broken, right?

> > Regards, > Anup > > > > > M. > > > > -- > > Without deviation from the norm, progress is not possible.

-- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/
On Wed, Oct 20, 2021 at 11:08 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 20 Oct 2021 15:33:49 +0100,
> Anup Patel <anup@brainfault.org> wrote:
> >
> > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > Guo Ren <guoren@kernel.org> wrote:
> > > >
> > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > >
> > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > breaks the HW interrupt state or not.
> > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > continues irqs could come in.
> > > > > >
> > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > handled? How great...
> > > > If the completion ID does not match an interrupt source that is
> > > > currently enabled for the target, the completion is silently ignored.
> > > > So, C9xx completion depends on enable-bit.
> > > >
> > > Is that what the PLIC spec says? Or what your implementation does? I
> > > can understand that one implementation would be broken, but if the
> > > PLIC architecture itself is broken, that's far more concerning.
> > >
> > Yes, we are dealing with a broken/non-compliant PLIC
> > implementation.
> >
> > The RISC-V PLIC spec defines a very different behaviour for the
> > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > writel(claim)). The T-HEAD PLIC implementation does things
> > different from what the RISC-V PLIC spec says because it will
> > mask an interrupt upon interrupt claim whereas PLIC spec says
> > it should only clear the interrupt pending bit (not mask the interrupt).
> >
> > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > "The PLIC can perform an interrupt claim by reading the claim/complete
> > register, which returns the ID of the highest priority pending interrupt or
> > zero if there is no pending interrupt. A successful claim will also atomically
> > clear the corresponding pending bit on the interrupt source."
> >
> > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
>
> That's not the point I'm making. According to Guo, the PLIC (any
> implementation of it) will ignore a write to claim on a masked
> interrupt.
>
> If that's indeed correct, then a sequence such as:
>
> (1) irq = read(claim)
> (2) mask from the interrupt handler with the right flags so that it
> isn't done lazily
> (3) write(irq, claim)

How about letting the IRQ chip change?

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index a98bcfc4be7b..ed6ace1058ac 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -444,10 +444,10 @@ void unmask_threaded_irq(struct irq_desc *desc)
 {
 	struct irq_chip *chip = desc->irq_data.chip;

+	unmask_irq(desc);
+
 	if (chip->flags & IRQCHIP_EOI_THREADED)
 		chip->irq_eoi(&desc->irq_data);
-
-	unmask_irq(desc);
 }

 /*
@@ -673,8 +673,8 @@ static void cond_unmask_eoi_irq(struct irq_desc *desc, struct irq_chip *chip)
 	 */
 	if (!irqd_irq_disabled(&desc->irq_data) &&
 	    irqd_irq_masked(&desc->irq_data) && !desc->threads_oneshot) {
-		chip->irq_eoi(&desc->irq_data);
 		unmask_irq(desc);
+		chip->irq_eoi(&desc->irq_data);
 	} else if (!(chip->flags & IRQCHIP_EOI_THREADED)) {
 		chip->irq_eoi(&desc->irq_data);
 	}

>
> will result in an interrupt blocked in ack state (and probably no more
> interrupt for this CPU at this priority). That would be an interesting
> bug in the current code, but also a pretty bad architectural choice.
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
On Thu, 21 Oct 2021 03:00:43 +0100,
Guo Ren <guoren@kernel.org> wrote:
>
> On Wed, Oct 20, 2021 at 11:08 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Wed, 20 Oct 2021 15:33:49 +0100,
> > Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > Guo Ren <guoren@kernel.org> wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > > >
> > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > >
> > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > breaks the HW interrupt state or not.
> > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > continues irqs could come in.
> > > > > > >
> > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > handled? How great...
> > > > > If the completion ID does not match an interrupt source that is
> > > > > currently enabled for the target, the completion is silently ignored.
> > > > > So, C9xx completion depends on enable-bit.
> > > > >
> > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > can understand that one implementation would be broken, but if the
> > > > PLIC architecture itself is broken, that's far more concerning.
> > > >
> > > Yes, we are dealing with a broken/non-compliant PLIC
> > > implementation.
> > >
> > > The RISC-V PLIC spec defines a very different behaviour for the
> > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > writel(claim)). The T-HEAD PLIC implementation does things
> > > different from what the RISC-V PLIC spec says because it will
> > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > it should only clear the interrupt pending bit (not mask the interrupt).
> > >
> > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > register, which returns the ID of the highest priority pending interrupt or
> > > zero if there is no pending interrupt. A successful claim will also atomically
> > > clear the corresponding pending bit on the interrupt source."
> > >
> > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> >
> > That's not the point I'm making. According to Guo, the PLIC (any
> > implementation of it) will ignore a write to claim on a masked
> > interrupt.
> >
> > If that's indeed correct, then a sequence such as:
> >
> > (1) irq = read(claim)
> > (2) mask from the interrupt handler with the right flags so that it
> > isn't done lazily
> > (3) write(irq, claim)
>
> How about letting the IRQ chip change?
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index a98bcfc4be7b..ed6ace1058ac 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -444,10 +444,10 @@ void unmask_threaded_irq(struct irq_desc *desc)
>  {
>  	struct irq_chip *chip = desc->irq_data.chip;
>
> +	unmask_irq(desc);
> +
>  	if (chip->flags & IRQCHIP_EOI_THREADED)
>  		chip->irq_eoi(&desc->irq_data);
> -
> -	unmask_irq(desc);
>  }
>
>  /*
> @@ -673,8 +673,8 @@ static void cond_unmask_eoi_irq(struct irq_desc *desc, struct irq_chip *chip)
>  	 */
>  	if (!irqd_irq_disabled(&desc->irq_data) &&
>  	    irqd_irq_masked(&desc->irq_data) && !desc->threads_oneshot) {
> -		chip->irq_eoi(&desc->irq_data);
>  		unmask_irq(desc);
> +		chip->irq_eoi(&desc->irq_data);
>  	} else if (!(chip->flags & IRQCHIP_EOI_THREADED)) {
>  		chip->irq_eoi(&desc->irq_data);
>  	}

No, I don't think that's acceptable, and I strongly suspect that other
irqchips have the opposite requirement. You'll have to keep the
workaround in the PLIC code and track the EOI vs unmask to do the
right thing in both callbacks.

	M.
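One way to read "track the EOI vs unmask" is for the driver to remember that an EOI arrived while the source was masked and to issue the completion once the source is enabled again. The following is a minimal userspace sketch of that idea only. Every `toy_*` name and field is hypothetical, it is not the PLIC driver's code, and whether deferring the completion like this is safe on real C9xx hardware is not established anywhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy hardware state for one source; names are hypothetical. */
struct toy_src {
	bool enabled;    /* enable bit seen by the interrupt controller */
	bool in_service; /* claimed but not yet completed */
};

/* Toy driver-side state wrapped around the hardware model. */
struct toy_irq {
	struct toy_src hw;
	bool masked;       /* software view of the mask state */
	bool eoi_deferred; /* an EOI arrived while the source was masked */
};

/* Completion is ignored by the model when the source is disabled. */
static void toy_hw_complete(struct toy_src *hw)
{
	if (hw->enabled)
		hw->in_service = false;
}

void toy_irq_mask(struct toy_irq *irq)
{
	irq->hw.enabled = false;
	irq->masked = true;
}

/* EOI callback: defer the completion if it would be ignored right now. */
void toy_irq_eoi(struct toy_irq *irq)
{
	if (irq->masked) {
		irq->eoi_deferred = true;
		return;
	}
	toy_hw_complete(&irq->hw);
}

/* Unmask callback: enable first, then replay any deferred completion. */
void toy_irq_unmask(struct toy_irq *irq)
{
	irq->hw.enabled = true;
	irq->masked = false;
	if (irq->eoi_deferred) {
		irq->eoi_deferred = false;
		toy_hw_complete(&irq->hw);
	}
}

/* Runs mask -> eoi -> unmask on a claimed source and reports the result. */
bool toy_oneshot_completes(void)
{
	struct toy_irq irq = { .hw = { .enabled = true, .in_service = true } };

	toy_irq_mask(&irq);
	toy_irq_eoi(&irq);          /* deferred, not lost */
	assert(irq.hw.in_service);  /* still in service while masked */
	toy_irq_unmask(&irq);       /* the completion is replayed here */
	return !irq.hw.in_service;
}
```

The point of the sketch is that both callbacks cooperate through one extra bit of state, so the generic flow code in kernel/irq/chip.c does not need to change its EOI/unmask ordering.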
On Wed, Oct 20, 2021 at 11:32 PM Darius Rad <darius@bluespec.com> wrote:
>
> On Wed, Oct 20, 2021 at 09:48:36PM +0530, Anup Patel wrote:
> > On Wed, Oct 20, 2021 at 8:29 PM Darius Rad <darius@bluespec.com> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote:
> > > > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > > > >
> > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > > >
> > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > > breaks the HW interrupt state or not.
> > > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > > continues irqs could come in.
> > > > > > > >
> > > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > > handled? How great...
> > > > > > If the completion ID does not match an interrupt source that is
> > > > > > currently enabled for the target, the completion is silently ignored.
> > > > > > So, C9xx completion depends on enable-bit.
> > > > > >
> > > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > > can understand that one implementation would be broken, but if the
> > > > > PLIC architecture itself is broken, that's far more concerning.
> > > >
> > > > Here is the description of Interrupt Completion in PLIC spec [1]:
> > > >
> > > > The PLIC signals it has completed executing an interrupt handler by
> > > > writing the interrupt ID it received from the claim to the claim/complete
> > > > register. The PLIC does not check whether the completion ID is the same
> > > > as the last claim ID for that target. If the completion ID does not match
> > > > an interrupt source that is currently enabled for the target, the
> > > >                          ^^ ^^^^^^^^^ ^^^^^^^
> > > > completion is silently ignored.
> > > >
> > > > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> > > >
> > > > Did we misunderstand the PLIC spec?
> > > >
> > >
> > > That clause sounds to me like it is due to the SiFive implementation, which
> > > the RISC-V PLIC specification is based on. Since the PLIC spec is still a
> > > draft I would expect it to change before release.
> >
> > The SiFive PLIC has been adopted by various RISC-V platforms (including
> > SiFive themselves). Almost all existing RISC-V boards have PLIC as the
> > interrupt controller.
> >
> > Considering the wide usage of PLIC across existing platforms, the RISC-V
> > International has adopted it as an official RISC-V non-ISA spec. ...
>
> You mean is in the process of adopting it, right?

Yes, it is in the process.

>
> > ... Of course,
> > the RISC-V PLIC spec needs to follow the process for RISC-V non-ISA spec
> > but changing the RISC-V PLIC spec now would mean all existing RISC-V
> > platforms will become non-compliant.
> >
>
> I would expect the review process to produce a proper specification, rather
> than a verbatim copy of the SiFive datasheet, and clarify some ambgiuous
> and implementation specific language. Clarifying the specification does
> not necessarily make all existing implementations non-compliant, as this
> has been done numerous times with other RISC-V specifications.

Yes, clarification can definitely be done.

Regards,
Anup

>
> > The RISC-V AIA spec is intended to replace the RISC-V PLIC spec as the
> > new interrupt controller spec for future RISC-V platforms.
> >
> > Regards,
> > Anup
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Wed, Oct 20, 2021 at 10:18 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 20 Oct 2021 17:08:36 +0100,
> Anup Patel <anup@brainfault.org> wrote:
> >
> > On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Wed, 20 Oct 2021 15:33:49 +0100,
> > > Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > > > >
> > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > > >
> > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > > breaks the HW interrupt state or not.
> > > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > > continues irqs could come in.
> > > > > > > >
> > > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > > handled? How great...
> > > > > > If the completion ID does not match an interrupt source that is
> > > > > > currently enabled for the target, the completion is silently ignored.
> > > > > > So, C9xx completion depends on enable-bit.
> > > > > >
> > > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > > can understand that one implementation would be broken, but if the
> > > > > PLIC architecture itself is broken, that's far more concerning.
> > > > >
> > > > Yes, we are dealing with a broken/non-compliant PLIC
> > > > implementation.
> > > >
> > > > The RISC-V PLIC spec defines a very different behaviour for the
> > > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > > writel(claim)). The T-HEAD PLIC implementation does things
> > > > different from what the RISC-V PLIC spec says because it will
> > > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > > it should only clear the interrupt pending bit (not mask the interrupt).
> > > >
> > > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > > register, which returns the ID of the highest priority pending interrupt or
> > > > zero if there is no pending interrupt. A successful claim will also atomically
> > > > clear the corresponding pending bit on the interrupt source."
> > > >
> > > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> > >
> > > That's not the point I'm making. According to Guo, the PLIC (any
> > > implementation of it) will ignore a write to claim on a masked
> > > interrupt.
> >
> > Yes, write to claim on a masked interrupt is certainly ignored but
> > read to claim does not automatically mask the interrupt.
> >
> > >
> > > If that's indeed correct, then a sequence such as:
> > >
> > > (1) irq = read(claim)
> >
> > This will return highest priority pending interrupt and clear the
> > pending bit as-per RISC-V PLIC spec.
> >
> > > (2) mask from the interrupt handler with the right flags so that it
> > > isn't done lazily
> > > (3) write(irq, claim)
> > >
> > > will result in an interrupt blocked in ack state (and probably no more
> > > interrupt for this CPU at this priority). That would be an interesting
> > > bug in the current code, but also a pretty bad architectural choice.
> >
> > The interrupt claim/completion is for each interrupt and not at CPU
> > level so if an interrupt is masked then only that interrupt is blocked
> > for all CPUs but other interrupts can still be raised.
>
> Do you mean that another interrupt of the same priority will be able
> to be taken on *this* CPU, despite the completion being silently
> ignored?

This part is not clear in the RISC-V PLIC spec, so I will request that a
clarification be added.

Regards,
Anup

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
On Thu, Oct 21, 2021 at 4:33 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 21 Oct 2021 03:00:43 +0100,
> Guo Ren <guoren@kernel.org> wrote:
> >
> > On Wed, Oct 20, 2021 at 11:08 PM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Wed, 20 Oct 2021 15:33:49 +0100,
> > > Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <maz@kernel.org> wrote:
> > > > > > >
> > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > > Guo Ren <guoren@kernel.org> wrote:
> > > > > > >
> > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > > breaks the HW interrupt state or not.
> > > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > > continues irqs could come in.
> > > > > > > >
> > > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > > handled? How great...
> > > > > > If the completion ID does not match an interrupt source that is
> > > > > > currently enabled for the target, the completion is silently ignored.
> > > > > > So, C9xx completion depends on enable-bit.
> > > > > >
> > > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > > can understand that one implementation would be broken, but if the
> > > > > PLIC architecture itself is broken, that's far more concerning.
> > > > >
> > > > Yes, we are dealing with a broken/non-compliant PLIC
> > > > implementation.
> > > >
> > > > The RISC-V PLIC spec defines a very different behaviour for the
> > > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > > writel(claim)). The T-HEAD PLIC implementation does things
> > > > different from what the RISC-V PLIC spec says because it will
> > > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > > it should only clear the interrupt pending bit (not mask the interrupt).
> > > >
> > > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > > register, which returns the ID of the highest priority pending interrupt or
> > > > zero if there is no pending interrupt. A successful claim will also atomically
> > > > clear the corresponding pending bit on the interrupt source."
> > > >
> > > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> > >
> > > That's not the point I'm making. According to Guo, the PLIC (any
> > > implementation of it) will ignore a write to claim on a masked
> > > interrupt.
> > >
> > > If that's indeed correct, then a sequence such as:
> > >
> > > (1) irq = read(claim)
> > > (2) mask from the interrupt handler with the right flags so that it
> > > isn't done lazily
> > > (3) write(irq, claim)
> >
> > How about letting the IRQ chip change?
> >
> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> > index a98bcfc4be7b..ed6ace1058ac 100644
> > --- a/kernel/irq/chip.c
> > +++ b/kernel/irq/chip.c
> > @@ -444,10 +444,10 @@ void unmask_threaded_irq(struct irq_desc *desc)
> >  {
> >  	struct irq_chip *chip = desc->irq_data.chip;
> >
> > +	unmask_irq(desc);
> > +
> >  	if (chip->flags & IRQCHIP_EOI_THREADED)
> >  		chip->irq_eoi(&desc->irq_data);
> > -
> > -	unmask_irq(desc);
> >  }
> >
> >  /*
> > @@ -673,8 +673,8 @@ static void cond_unmask_eoi_irq(struct irq_desc *desc, struct irq_chip *chip)
> >  	 */
> >  	if (!irqd_irq_disabled(&desc->irq_data) &&
> >  	    irqd_irq_masked(&desc->irq_data) && !desc->threads_oneshot) {
> > -		chip->irq_eoi(&desc->irq_data);
> >  		unmask_irq(desc);
> > +		chip->irq_eoi(&desc->irq_data);
> >  	} else if (!(chip->flags & IRQCHIP_EOI_THREADED)) {
> >  		chip->irq_eoi(&desc->irq_data);
> >  	}
>
> No, I don't think that's acceptable, and I strongly suspect that other
> irqchips have the opposite requirement. You'll have to keep the
> workaround in the PLIC code and track the EOI vs unmask to do the
> right thing in both callbacks.

Okay...

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index cf74cfa82045..960b29d02070 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
 	writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
 }

-static struct irq_chip plic_chip = {
+static struct irq_chip sifive_plic_chip = {
 	.name		= "SiFive PLIC",
 	.irq_mask	= plic_irq_mask,
 	.irq_unmask	= plic_irq_unmask,
@@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
 #endif
 };

+/*
+ * The C9xx PLIC does not comply with the interrupt claim/completion
+ * process defined by the RISC-V PLIC specification because C9xx PLIC
+ * will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
+ * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
+ * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
+ * the generic handle_fasteoi_irq() used in the PLIC driver.
+ */
+static struct irq_chip thead_plic_chip = {
+	.name		= "T-Head PLIC",
+	.irq_disable	= plic_irq_mask,
+	.irq_enable	= plic_irq_unmask,
+	.irq_eoi	= plic_irq_eoi,
+#ifdef CONFIG_SMP
+	.irq_set_affinity = plic_set_affinity,
+#endif
+};
+
+static struct irq_chip *def_plic_chip = &sifive_plic_chip;
+
 static int plic_irqdomain_map(struct irq_domain *d, unsigned int irq,
 			      irq_hw_number_t hwirq)
 {
 	struct plic_priv *priv = d->host_data;

-	irq_domain_set_info(d, irq, hwirq, &plic_chip, d->host_data,
+	irq_domain_set_info(d, irq, hwirq, def_plic_chip, d->host_data,
 			    handle_fasteoi_irq, NULL, NULL);
 	irq_set_noprobe(irq);
 	irq_set_affinity(irq, &priv->lmask);
@@ -390,5 +410,15 @@ static int __init plic_init(struct device_node *node,
 	return error;
 }

+static int __init thead_c900_plic_init(struct device_node *node,
+				       struct device_node *parent)
+{
+	def_plic_chip = &thead_plic_chip;
+
+	return plic_init(node, parent);
+}
+
 IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
 IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
+IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", thead_c900_plic_init);
+IRQCHIP_DECLARE(allwinner_sun20i_d1_plic, "allwinner,sun20i-d1-plic", thead_c900_plic_init);