irqchip/sifive-plic: allow many cores to handle IRQs

Message ID	20200426110740.123638-1-zong.li@sifive.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=gGvM=6K=lists.infradead.org=linux-riscv-bounces+patchwork-linux-riscv=patchwork.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DBDF2206CD From: Zong Li <zong.li@sifive.com> To: palmer@dabbelt.com, paul.walmsley@sifive.com, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, anup@brainfault.org Subject: [PATCH] irqchip/sifive-plic: allow many cores to handle IRQs Date: Sun, 26 Apr 2020 19:07:40 +0800 Message-Id: <20200426110740.123638-1-zong.li@sifive.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit summary: Content analysis details: (-0.2 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [2607:f8b0:4864:20:0:0:0:1043 listed in] [list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature Precedence: list Cc: david.abdurachmanov@sifive.com, Zong Li <zong.li@sifive.com> Sender: "linux-riscv" <linux-riscv-bounces@lists.infradead.org> Errors-To: linux-riscv-bounces+patchwork-linux-riscv=patchwork.kernel.org@lists.infradead.org
Series	irqchip/sifive-plic: allow many cores to handle IRQs \| expand irqchip/sifive-plic: allow many cores to handle IRQs

Zong Li April 26, 2020, 11:07 a.m. UTC

Currently, driver forces the IRQs to be handled by only one core. This
patch provides the way to enable others cores to handle IRQs if needed,
so users could decide how many cores they wanted on default by boot
argument.

Use 'irqaffinity' boot argument to determine affinity. If there is no
irqaffinity in dts or kernel configuration, use irq default affinity,
so all harts would try to claim IRQ.

For example, add irqaffinity=0 in chosen node to set irq affinity to
hart 0. It also supports more than one harts to handle irq, such as set
irqaffinity=0,3,4.

You can change IRQ affinity from user-space using procfs. For example,
you can make CPU0 and CPU2 serve IRQ together by the following command:

echo 4 > /proc/irq/<x>/smp_affinity

Signed-off-by: Zong Li <zong.li@sifive.com>
---
 drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

Anup Patel April 26, 2020, 12:46 p.m. UTC | #1

On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
>
> Currently, driver forces the IRQs to be handled by only one core. This
> patch provides the way to enable others cores to handle IRQs if needed,
> so users could decide how many cores they wanted on default by boot
> argument.
>
> Use 'irqaffinity' boot argument to determine affinity. If there is no
> irqaffinity in dts or kernel configuration, use irq default affinity,
> so all harts would try to claim IRQ.
>
> For example, add irqaffinity=0 in chosen node to set irq affinity to
> hart 0. It also supports more than one harts to handle irq, such as set
> irqaffinity=0,3,4.
>
> You can change IRQ affinity from user-space using procfs. For example,
> you can make CPU0 and CPU2 serve IRQ together by the following command:
>
> echo 4 > /proc/irq/<x>/smp_affinity
>
> Signed-off-by: Zong Li <zong.li@sifive.com>
> ---
>  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
>  1 file changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> index d0a71febdadc..bc1440d54185 100644
> --- a/drivers/irqchip/irq-sifive-plic.c
> +++ b/drivers/irqchip/irq-sifive-plic.c
> @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
>  static void plic_irq_unmask(struct irq_data *d)
>  {
>         struct cpumask amask;
> -       unsigned int cpu;
>         struct plic_priv *priv = irq_get_chip_data(d->irq);
>
>         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> -                                          &amask);
> -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> -               return;
> -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> +
> +       plic_irq_toggle(&amask, d, 1);
>  }
>
>  static void plic_irq_mask(struct irq_data *d)
> @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
>  static int plic_set_affinity(struct irq_data *d,
>                              const struct cpumask *mask_val, bool force)
>  {
> -       unsigned int cpu;
>         struct cpumask amask;
>         struct plic_priv *priv = irq_get_chip_data(d->irq);
>
>         cpumask_and(&amask, &priv->lmask, mask_val);
>
>         if (force)
> -               cpu = cpumask_first(&amask);
> +               cpumask_copy(&amask, mask_val);
>         else
> -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> -
> -       if (cpu >= nr_cpu_ids)
> -               return -EINVAL;
> +               cpumask_and(&amask, &amask, cpu_online_mask);
>
>         plic_irq_toggle(&priv->lmask, d, 0);
> -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> +       plic_irq_toggle(&amask, d, 1);
>
> -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +       irq_data_update_effective_affinity(d, &amask);
>
>         return IRQ_SET_MASK_OK_DONE;
>  }
> --
> 2.26.1
>

I strongly oppose (NACK) this patch due to performance reasons.

In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
1) All N CPUs will take interrupt
2) All N CPUs will try to read PLIC CLAIM register
3) Only one of the CPUs will see IRQ X using the CLAIM register
but other N - 1 CPUs will see no interrupt and return back to what
they were doing. In other words, N - 1 CPUs will just waste CPU
every time IRQ X occurs.

Example1, one Application doing heavy network traffic will
degrade performance of other applications because with every
network RX/TX interrupt N-1 CPUs will waste CPU trying to
process network interrupt.

Example1, one Application doing heavy MMC/SD traffic will
degrade performance of other applications because with every
SPI read/write interrupt N-1 CPUs will waste CPU trying to
process it.

In fact, the current PLIC approach is actually a performance
optimization. This implementation also works fine with in-kernel
load-balancer and user space load balancer.

Regards,
Anup

Zong Li April 26, 2020, 1:19 p.m. UTC | #2

On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> >
> > Currently, driver forces the IRQs to be handled by only one core. This
> > patch provides the way to enable others cores to handle IRQs if needed,
> > so users could decide how many cores they wanted on default by boot
> > argument.
> >
> > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > irqaffinity in dts or kernel configuration, use irq default affinity,
> > so all harts would try to claim IRQ.
> >
> > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > hart 0. It also supports more than one harts to handle irq, such as set
> > irqaffinity=0,3,4.
> >
> > You can change IRQ affinity from user-space using procfs. For example,
> > you can make CPU0 and CPU2 serve IRQ together by the following command:
> >
> > echo 4 > /proc/irq/<x>/smp_affinity
> >
> > Signed-off-by: Zong Li <zong.li@sifive.com>
> > ---
> >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> >  1 file changed, 7 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > index d0a71febdadc..bc1440d54185 100644
> > --- a/drivers/irqchip/irq-sifive-plic.c
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> >  static void plic_irq_unmask(struct irq_data *d)
> >  {
> >         struct cpumask amask;
> > -       unsigned int cpu;
> >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> >
> >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > -                                          &amask);
> > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > -               return;
> > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > +
> > +       plic_irq_toggle(&amask, d, 1);
> >  }
> >
> >  static void plic_irq_mask(struct irq_data *d)
> > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> >  static int plic_set_affinity(struct irq_data *d,
> >                              const struct cpumask *mask_val, bool force)
> >  {
> > -       unsigned int cpu;
> >         struct cpumask amask;
> >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> >
> >         cpumask_and(&amask, &priv->lmask, mask_val);
> >
> >         if (force)
> > -               cpu = cpumask_first(&amask);
> > +               cpumask_copy(&amask, mask_val);
> >         else
> > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > -
> > -       if (cpu >= nr_cpu_ids)
> > -               return -EINVAL;
> > +               cpumask_and(&amask, &amask, cpu_online_mask);
> >
> >         plic_irq_toggle(&priv->lmask, d, 0);
> > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > +       plic_irq_toggle(&amask, d, 1);
> >
> > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +       irq_data_update_effective_affinity(d, &amask);
> >
> >         return IRQ_SET_MASK_OK_DONE;
> >  }
> > --
> > 2.26.1
> >
>
> I strongly oppose (NACK) this patch due to performance reasons.
>
> In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> 1) All N CPUs will take interrupt
> 2) All N CPUs will try to read PLIC CLAIM register
> 3) Only one of the CPUs will see IRQ X using the CLAIM register
> but other N - 1 CPUs will see no interrupt and return back to what
> they were doing. In other words, N - 1 CPUs will just waste CPU
> every time IRQ X occurs.
>
> Example1, one Application doing heavy network traffic will
> degrade performance of other applications because with every
> network RX/TX interrupt N-1 CPUs will waste CPU trying to
> process network interrupt.
>
> Example1, one Application doing heavy MMC/SD traffic will
> degrade performance of other applications because with every
> SPI read/write interrupt N-1 CPUs will waste CPU trying to
> process it.
>
> In fact, the current PLIC approach is actually a performance
> optimization. This implementation also works fine with in-kernel
> load-balancer and user space load balancer.
>

Yes, it's exactly, I know what you pointed out. But the idea of this
patch is just providing a way that users could enable other cores if
they wanted, it could still enable only one core by this change. The
purpose here is thinking of flexible use, rather than limitation.
Maybe it would be a happy medium that we make the default case enable
only one core? It is a good open discussion.

> Regards,
> Anup

Anup Patel April 26, 2020, 1:38 p.m. UTC | #3

+Mark Z

On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
>
> On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> > >
> > > Currently, driver forces the IRQs to be handled by only one core. This
> > > patch provides the way to enable others cores to handle IRQs if needed,
> > > so users could decide how many cores they wanted on default by boot
> > > argument.
> > >
> > > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > > irqaffinity in dts or kernel configuration, use irq default affinity,
> > > so all harts would try to claim IRQ.
> > >
> > > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > > hart 0. It also supports more than one harts to handle irq, such as set
> > > irqaffinity=0,3,4.
> > >
> > > You can change IRQ affinity from user-space using procfs. For example,
> > > you can make CPU0 and CPU2 serve IRQ together by the following command:
> > >
> > > echo 4 > /proc/irq/<x>/smp_affinity
> > >
> > > Signed-off-by: Zong Li <zong.li@sifive.com>
> > > ---
> > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> > >  1 file changed, 7 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > > index d0a71febdadc..bc1440d54185 100644
> > > --- a/drivers/irqchip/irq-sifive-plic.c
> > > +++ b/drivers/irqchip/irq-sifive-plic.c
> > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> > >  static void plic_irq_unmask(struct irq_data *d)
> > >  {
> > >         struct cpumask amask;
> > > -       unsigned int cpu;
> > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > >
> > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > > -                                          &amask);
> > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > > -               return;
> > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > > +
> > > +       plic_irq_toggle(&amask, d, 1);
> > >  }
> > >
> > >  static void plic_irq_mask(struct irq_data *d)
> > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> > >  static int plic_set_affinity(struct irq_data *d,
> > >                              const struct cpumask *mask_val, bool force)
> > >  {
> > > -       unsigned int cpu;
> > >         struct cpumask amask;
> > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > >
> > >         cpumask_and(&amask, &priv->lmask, mask_val);
> > >
> > >         if (force)
> > > -               cpu = cpumask_first(&amask);
> > > +               cpumask_copy(&amask, mask_val);
> > >         else
> > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > -
> > > -       if (cpu >= nr_cpu_ids)
> > > -               return -EINVAL;
> > > +               cpumask_and(&amask, &amask, cpu_online_mask);
> > >
> > >         plic_irq_toggle(&priv->lmask, d, 0);
> > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > +       plic_irq_toggle(&amask, d, 1);
> > >
> > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > +       irq_data_update_effective_affinity(d, &amask);
> > >
> > >         return IRQ_SET_MASK_OK_DONE;
> > >  }
> > > --
> > > 2.26.1
> > >
> >
> > I strongly oppose (NACK) this patch due to performance reasons.
> >
> > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> > 1) All N CPUs will take interrupt
> > 2) All N CPUs will try to read PLIC CLAIM register
> > 3) Only one of the CPUs will see IRQ X using the CLAIM register
> > but other N - 1 CPUs will see no interrupt and return back to what
> > they were doing. In other words, N - 1 CPUs will just waste CPU
> > every time IRQ X occurs.
> >
> > Example1, one Application doing heavy network traffic will
> > degrade performance of other applications because with every
> > network RX/TX interrupt N-1 CPUs will waste CPU trying to
> > process network interrupt.
> >
> > Example1, one Application doing heavy MMC/SD traffic will
> > degrade performance of other applications because with every
> > SPI read/write interrupt N-1 CPUs will waste CPU trying to
> > process it.
> >
> > In fact, the current PLIC approach is actually a performance
> > optimization. This implementation also works fine with in-kernel
> > load-balancer and user space load balancer.
> >
>
> Yes, it's exactly, I know what you pointed out. But the idea of this
> patch is just providing a way that users could enable other cores if
> they wanted, it could still enable only one core by this change. The
> purpose here is thinking of flexible use, rather than limitation.
> Maybe it would be a happy medium that we make the default case enable
> only one core? It is a good open discussion.

Making the default case as enable only one core is just a work-around.

As-per my understanding, if we set affinity mask of N CPUs for IRQ X
then it does not mean all N CPUs should receive IRQ X rather it means
that exactly one of the N CPUs will receive IRQ X and the IRQ receiving
CPU will be fixed (reflected by effective affinity returned by the driver).

If we ignore above semantics and still provide a mechanism to target
IRQ X to N CPUs then most likely someone will try and run into
performance issues.

Please don't go this path. The performance impact in case of Guest/VM
is even worst because PLIC is trap-n-emulated by hypervisors as MMIO
device.

Regards,
Anup

Zong Li April 26, 2020, 3:12 p.m. UTC | #4

On Sun, Apr 26, 2020 at 9:38 PM Anup Patel <anup@brainfault.org> wrote:
>
> +Mark Z
>
> On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
> >
> > On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> > > >
> > > > Currently, driver forces the IRQs to be handled by only one core. This
> > > > patch provides the way to enable others cores to handle IRQs if needed,
> > > > so users could decide how many cores they wanted on default by boot
> > > > argument.
> > > >
> > > > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > > > irqaffinity in dts or kernel configuration, use irq default affinity,
> > > > so all harts would try to claim IRQ.
> > > >
> > > > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > > > hart 0. It also supports more than one harts to handle irq, such as set
> > > > irqaffinity=0,3,4.
> > > >
> > > > You can change IRQ affinity from user-space using procfs. For example,
> > > > you can make CPU0 and CPU2 serve IRQ together by the following command:
> > > >
> > > > echo 4 > /proc/irq/<x>/smp_affinity
> > > >
> > > > Signed-off-by: Zong Li <zong.li@sifive.com>
> > > > ---
> > > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> > > >  1 file changed, 7 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > > > index d0a71febdadc..bc1440d54185 100644
> > > > --- a/drivers/irqchip/irq-sifive-plic.c
> > > > +++ b/drivers/irqchip/irq-sifive-plic.c
> > > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> > > >  static void plic_irq_unmask(struct irq_data *d)
> > > >  {
> > > >         struct cpumask amask;
> > > > -       unsigned int cpu;
> > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > >
> > > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > > > -                                          &amask);
> > > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > > > -               return;
> > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > > > +
> > > > +       plic_irq_toggle(&amask, d, 1);
> > > >  }
> > > >
> > > >  static void plic_irq_mask(struct irq_data *d)
> > > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> > > >  static int plic_set_affinity(struct irq_data *d,
> > > >                              const struct cpumask *mask_val, bool force)
> > > >  {
> > > > -       unsigned int cpu;
> > > >         struct cpumask amask;
> > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > >
> > > >         cpumask_and(&amask, &priv->lmask, mask_val);
> > > >
> > > >         if (force)
> > > > -               cpu = cpumask_first(&amask);
> > > > +               cpumask_copy(&amask, mask_val);
> > > >         else
> > > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > -
> > > > -       if (cpu >= nr_cpu_ids)
> > > > -               return -EINVAL;
> > > > +               cpumask_and(&amask, &amask, cpu_online_mask);
> > > >
> > > >         plic_irq_toggle(&priv->lmask, d, 0);
> > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > +       plic_irq_toggle(&amask, d, 1);
> > > >
> > > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > +       irq_data_update_effective_affinity(d, &amask);
> > > >
> > > >         return IRQ_SET_MASK_OK_DONE;
> > > >  }
> > > > --
> > > > 2.26.1
> > > >
> > >
> > > I strongly oppose (NACK) this patch due to performance reasons.
> > >
> > > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> > > 1) All N CPUs will take interrupt
> > > 2) All N CPUs will try to read PLIC CLAIM register
> > > 3) Only one of the CPUs will see IRQ X using the CLAIM register
> > > but other N - 1 CPUs will see no interrupt and return back to what
> > > they were doing. In other words, N - 1 CPUs will just waste CPU
> > > every time IRQ X occurs.
> > >
> > > Example1, one Application doing heavy network traffic will
> > > degrade performance of other applications because with every
> > > network RX/TX interrupt N-1 CPUs will waste CPU trying to
> > > process network interrupt.
> > >
> > > Example1, one Application doing heavy MMC/SD traffic will
> > > degrade performance of other applications because with every
> > > SPI read/write interrupt N-1 CPUs will waste CPU trying to
> > > process it.
> > >
> > > In fact, the current PLIC approach is actually a performance
> > > optimization. This implementation also works fine with in-kernel
> > > load-balancer and user space load balancer.
> > >
> >
> > Yes, it's exactly, I know what you pointed out. But the idea of this
> > patch is just providing a way that users could enable other cores if
> > they wanted, it could still enable only one core by this change. The
> > purpose here is thinking of flexible use, rather than limitation.
> > Maybe it would be a happy medium that we make the default case enable
> > only one core? It is a good open discussion.
>
> Making the default case as enable only one core is just a work-around.
>
> As-per my understanding, if we set affinity mask of N CPUs for IRQ X
> then it does not mean all N CPUs should receive IRQ X rather it means
> that exactly one of the N CPUs will receive IRQ X and the IRQ receiving
> CPU will be fixed (reflected by effective affinity returned by the driver).

is there a case that we only bundle the IRQ to CPU0, but CPU0 is more
much busy than other CPUs, and it would be better if another CPU could
take the IRQ?

>
> If we ignore above semantics and still provide a mechanism to target
> IRQ X to N CPUs then most likely someone will try and run into
> performance issues.
>
> Please don't go this path. The performance impact in case of Guest/VM
> is even worst because PLIC is trap-n-emulated by hypervisors as MMIO
> device.

OK, I won't persist in that, just wanna figure out the situation.

>
> Regards,
> Anup

Anup Patel April 26, 2020, 3:21 p.m. UTC | #5

On Sun, Apr 26, 2020 at 8:42 PM Zong Li <zong.li@sifive.com> wrote:
>
> On Sun, Apr 26, 2020 at 9:38 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > +Mark Z
> >
> > On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
> > >
> > > On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> > > > >
> > > > > Currently, driver forces the IRQs to be handled by only one core. This
> > > > > patch provides the way to enable others cores to handle IRQs if needed,
> > > > > so users could decide how many cores they wanted on default by boot
> > > > > argument.
> > > > >
> > > > > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > > > > irqaffinity in dts or kernel configuration, use irq default affinity,
> > > > > so all harts would try to claim IRQ.
> > > > >
> > > > > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > > > > hart 0. It also supports more than one harts to handle irq, such as set
> > > > > irqaffinity=0,3,4.
> > > > >
> > > > > You can change IRQ affinity from user-space using procfs. For example,
> > > > > you can make CPU0 and CPU2 serve IRQ together by the following command:
> > > > >
> > > > > echo 4 > /proc/irq/<x>/smp_affinity
> > > > >
> > > > > Signed-off-by: Zong Li <zong.li@sifive.com>
> > > > > ---
> > > > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> > > > >  1 file changed, 7 insertions(+), 14 deletions(-)
> > > > >
> > > > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > > > > index d0a71febdadc..bc1440d54185 100644
> > > > > --- a/drivers/irqchip/irq-sifive-plic.c
> > > > > +++ b/drivers/irqchip/irq-sifive-plic.c
> > > > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> > > > >  static void plic_irq_unmask(struct irq_data *d)
> > > > >  {
> > > > >         struct cpumask amask;
> > > > > -       unsigned int cpu;
> > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > >
> > > > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > > > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > > > > -                                          &amask);
> > > > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > > > > -               return;
> > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > > > > +
> > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > >  }
> > > > >
> > > > >  static void plic_irq_mask(struct irq_data *d)
> > > > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> > > > >  static int plic_set_affinity(struct irq_data *d,
> > > > >                              const struct cpumask *mask_val, bool force)
> > > > >  {
> > > > > -       unsigned int cpu;
> > > > >         struct cpumask amask;
> > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > >
> > > > >         cpumask_and(&amask, &priv->lmask, mask_val);
> > > > >
> > > > >         if (force)
> > > > > -               cpu = cpumask_first(&amask);
> > > > > +               cpumask_copy(&amask, mask_val);
> > > > >         else
> > > > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > -
> > > > > -       if (cpu >= nr_cpu_ids)
> > > > > -               return -EINVAL;
> > > > > +               cpumask_and(&amask, &amask, cpu_online_mask);
> > > > >
> > > > >         plic_irq_toggle(&priv->lmask, d, 0);
> > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > >
> > > > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > +       irq_data_update_effective_affinity(d, &amask);
> > > > >
> > > > >         return IRQ_SET_MASK_OK_DONE;
> > > > >  }
> > > > > --
> > > > > 2.26.1
> > > > >
> > > >
> > > > I strongly oppose (NACK) this patch due to performance reasons.
> > > >
> > > > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> > > > 1) All N CPUs will take interrupt
> > > > 2) All N CPUs will try to read PLIC CLAIM register
> > > > 3) Only one of the CPUs will see IRQ X using the CLAIM register
> > > > but other N - 1 CPUs will see no interrupt and return back to what
> > > > they were doing. In other words, N - 1 CPUs will just waste CPU
> > > > every time IRQ X occurs.
> > > >
> > > > Example1, one Application doing heavy network traffic will
> > > > degrade performance of other applications because with every
> > > > network RX/TX interrupt N-1 CPUs will waste CPU trying to
> > > > process network interrupt.
> > > >
> > > > Example1, one Application doing heavy MMC/SD traffic will
> > > > degrade performance of other applications because with every
> > > > SPI read/write interrupt N-1 CPUs will waste CPU trying to
> > > > process it.
> > > >
> > > > In fact, the current PLIC approach is actually a performance
> > > > optimization. This implementation also works fine with in-kernel
> > > > load-balancer and user space load balancer.
> > > >
> > >
> > > Yes, it's exactly, I know what you pointed out. But the idea of this
> > > patch is just providing a way that users could enable other cores if
> > > they wanted, it could still enable only one core by this change. The
> > > purpose here is thinking of flexible use, rather than limitation.
> > > Maybe it would be a happy medium that we make the default case enable
> > > only one core? It is a good open discussion.
> >
> > Making the default case as enable only one core is just a work-around.
> >
> > As-per my understanding, if we set affinity mask of N CPUs for IRQ X
> > then it does not mean all N CPUs should receive IRQ X rather it means
> > that exactly one of the N CPUs will receive IRQ X and the IRQ receiving
> > CPU will be fixed (reflected by effective affinity returned by the driver).
>
> is there a case that we only bundle the IRQ to CPU0, but CPU0 is more
> much busy than other CPUs, and it would be better if another CPU could
> take the IRQ?

This is a common problem across architectures.

To tackle this, we typically run irqbalance daemon in user-space which will
change IRQ affinity based on CPU load.

Refer, https://linux.die.net/man/1/irqbalance

>
> >
> > If we ignore above semantics and still provide a mechanism to target
> > IRQ X to N CPUs then most likely someone will try and run into
> > performance issues.
> >
> > Please don't go this path. The performance impact in case of Guest/VM
> > is even worst because PLIC is trap-n-emulated by hypervisors as MMIO
> > device.
>
> OK, I won't persist in that, just wanna figure out the situation.
>
> >
> > Regards,
> > Anup

Regards,
Anup

Zong Li April 26, 2020, 3:35 p.m. UTC | #6

On Sun, Apr 26, 2020 at 11:21 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Sun, Apr 26, 2020 at 8:42 PM Zong Li <zong.li@sifive.com> wrote:
> >
> > On Sun, Apr 26, 2020 at 9:38 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > +Mark Z
> > >
> > > On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
> > > >
> > > > On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
> > > > >
> > > > > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> > > > > >
> > > > > > Currently, driver forces the IRQs to be handled by only one core. This
> > > > > > patch provides the way to enable others cores to handle IRQs if needed,
> > > > > > so users could decide how many cores they wanted on default by boot
> > > > > > argument.
> > > > > >
> > > > > > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > > > > > irqaffinity in dts or kernel configuration, use irq default affinity,
> > > > > > so all harts would try to claim IRQ.
> > > > > >
> > > > > > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > > > > > hart 0. It also supports more than one harts to handle irq, such as set
> > > > > > irqaffinity=0,3,4.
> > > > > >
> > > > > > You can change IRQ affinity from user-space using procfs. For example,
> > > > > > you can make CPU0 and CPU2 serve IRQ together by the following command:
> > > > > >
> > > > > > echo 4 > /proc/irq/<x>/smp_affinity
> > > > > >
> > > > > > Signed-off-by: Zong Li <zong.li@sifive.com>
> > > > > > ---
> > > > > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> > > > > >  1 file changed, 7 insertions(+), 14 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > > > > > index d0a71febdadc..bc1440d54185 100644
> > > > > > --- a/drivers/irqchip/irq-sifive-plic.c
> > > > > > +++ b/drivers/irqchip/irq-sifive-plic.c
> > > > > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> > > > > >  static void plic_irq_unmask(struct irq_data *d)
> > > > > >  {
> > > > > >         struct cpumask amask;
> > > > > > -       unsigned int cpu;
> > > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > > >
> > > > > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > > > > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > > > > > -                                          &amask);
> > > > > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > > > > > -               return;
> > > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > > > > > +
> > > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > > >  }
> > > > > >
> > > > > >  static void plic_irq_mask(struct irq_data *d)
> > > > > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> > > > > >  static int plic_set_affinity(struct irq_data *d,
> > > > > >                              const struct cpumask *mask_val, bool force)
> > > > > >  {
> > > > > > -       unsigned int cpu;
> > > > > >         struct cpumask amask;
> > > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > > >
> > > > > >         cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > >
> > > > > >         if (force)
> > > > > > -               cpu = cpumask_first(&amask);
> > > > > > +               cpumask_copy(&amask, mask_val);
> > > > > >         else
> > > > > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > > -
> > > > > > -       if (cpu >= nr_cpu_ids)
> > > > > > -               return -EINVAL;
> > > > > > +               cpumask_and(&amask, &amask, cpu_online_mask);
> > > > > >
> > > > > >         plic_irq_toggle(&priv->lmask, d, 0);
> > > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > > >
> > > > > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > > +       irq_data_update_effective_affinity(d, &amask);
> > > > > >
> > > > > >         return IRQ_SET_MASK_OK_DONE;
> > > > > >  }
> > > > > > --
> > > > > > 2.26.1
> > > > > >
> > > > >
> > > > > I strongly oppose (NACK) this patch due to performance reasons.
> > > > >
> > > > > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> > > > > 1) All N CPUs will take interrupt
> > > > > 2) All N CPUs will try to read PLIC CLAIM register
> > > > > 3) Only one of the CPUs will see IRQ X using the CLAIM register
> > > > > but other N - 1 CPUs will see no interrupt and return back to what
> > > > > they were doing. In other words, N - 1 CPUs will just waste CPU
> > > > > every time IRQ X occurs.
> > > > >
> > > > > Example1, one Application doing heavy network traffic will
> > > > > degrade performance of other applications because with every
> > > > > network RX/TX interrupt N-1 CPUs will waste CPU trying to
> > > > > process network interrupt.
> > > > >
> > > > > Example1, one Application doing heavy MMC/SD traffic will
> > > > > degrade performance of other applications because with every
> > > > > SPI read/write interrupt N-1 CPUs will waste CPU trying to
> > > > > process it.
> > > > >
> > > > > In fact, the current PLIC approach is actually a performance
> > > > > optimization. This implementation also works fine with in-kernel
> > > > > load-balancer and user space load balancer.
> > > > >
> > > >
> > > > Yes, it's exactly, I know what you pointed out. But the idea of this
> > > > patch is just providing a way that users could enable other cores if
> > > > they wanted, it could still enable only one core by this change. The
> > > > purpose here is thinking of flexible use, rather than limitation.
> > > > Maybe it would be a happy medium that we make the default case enable
> > > > only one core? It is a good open discussion.
> > >
> > > Making the default case as enable only one core is just a work-around.
> > >
> > > As-per my understanding, if we set affinity mask of N CPUs for IRQ X
> > > then it does not mean all N CPUs should receive IRQ X rather it means
> > > that exactly one of the N CPUs will receive IRQ X and the IRQ receiving
> > > CPU will be fixed (reflected by effective affinity returned by the driver).
> >
> > is there a case that we only bundle the IRQ to CPU0, but CPU0 is more
> > much busy than other CPUs, and it would be better if another CPU could
> > take the IRQ?
>
> This is a common problem across architectures.
>
> To tackle this, we typically run irqbalance daemon in user-space which will
> change IRQ affinity based on CPU load.
>
> Refer, https://linux.die.net/man/1/irqbalance

OK, thank you for the information and figuring out the issue. Let's
stop this patch unless there are other voices.

>
> >
> > >
> > > If we ignore above semantics and still provide a mechanism to target
> > > IRQ X to N CPUs then most likely someone will try and run into
> > > performance issues.
> > >
> > > Please don't go this path. The performance impact in case of Guest/VM
> > > is even worst because PLIC is trap-n-emulated by hypervisors as MMIO
> > > device.
> >
> > OK, I won't persist in that, just wanna figure out the situation.
> >
> > >
> > > Regards,
> > > Anup
>
> Regards,
> Anup

Greentime Hu April 27, 2020, 6:49 a.m. UTC | #7

Zong Li <zong.li@sifive.com> 於 2020年4月26日 週日 下午11:35寫道：
>
> On Sun, Apr 26, 2020 at 11:21 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Sun, Apr 26, 2020 at 8:42 PM Zong Li <zong.li@sifive.com> wrote:
> > >
> > > On Sun, Apr 26, 2020 at 9:38 PM Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > +Mark Z
> > > >
> > > > On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
> > > > >
> > > > > On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
> > > > > >
> > > > > > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> > > > > > >
> > > > > > > Currently, driver forces the IRQs to be handled by only one core. This
> > > > > > > patch provides the way to enable others cores to handle IRQs if needed,
> > > > > > > so users could decide how many cores they wanted on default by boot
> > > > > > > argument.
> > > > > > >
> > > > > > > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > > > > > > irqaffinity in dts or kernel configuration, use irq default affinity,
> > > > > > > so all harts would try to claim IRQ.
> > > > > > >
> > > > > > > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > > > > > > hart 0. It also supports more than one harts to handle irq, such as set
> > > > > > > irqaffinity=0,3,4.
> > > > > > >
> > > > > > > You can change IRQ affinity from user-space using procfs. For example,
> > > > > > > you can make CPU0 and CPU2 serve IRQ together by the following command:
> > > > > > >
> > > > > > > echo 4 > /proc/irq/<x>/smp_affinity
> > > > > > >
> > > > > > > Signed-off-by: Zong Li <zong.li@sifive.com>
> > > > > > > ---
> > > > > > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> > > > > > >  1 file changed, 7 insertions(+), 14 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > > > > > > index d0a71febdadc..bc1440d54185 100644
> > > > > > > --- a/drivers/irqchip/irq-sifive-plic.c
> > > > > > > +++ b/drivers/irqchip/irq-sifive-plic.c
> > > > > > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> > > > > > >  static void plic_irq_unmask(struct irq_data *d)
> > > > > > >  {
> > > > > > >         struct cpumask amask;
> > > > > > > -       unsigned int cpu;
> > > > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > > > >
> > > > > > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > > > > > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > > > > > > -                                          &amask);
> > > > > > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > > > > > > -               return;
> > > > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > > > > > > +
> > > > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > > > >  }
> > > > > > >
> > > > > > >  static void plic_irq_mask(struct irq_data *d)
> > > > > > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> > > > > > >  static int plic_set_affinity(struct irq_data *d,
> > > > > > >                              const struct cpumask *mask_val, bool force)
> > > > > > >  {
> > > > > > > -       unsigned int cpu;
> > > > > > >         struct cpumask amask;
> > > > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > > > >
> > > > > > >         cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > > >
> > > > > > >         if (force)
> > > > > > > -               cpu = cpumask_first(&amask);
> > > > > > > +               cpumask_copy(&amask, mask_val);
> > > > > > >         else
> > > > > > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > > > -
> > > > > > > -       if (cpu >= nr_cpu_ids)
> > > > > > > -               return -EINVAL;
> > > > > > > +               cpumask_and(&amask, &amask, cpu_online_mask);
> > > > > > >
> > > > > > >         plic_irq_toggle(&priv->lmask, d, 0);
> > > > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > > > >
> > > > > > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > > > +       irq_data_update_effective_affinity(d, &amask);
> > > > > > >
> > > > > > >         return IRQ_SET_MASK_OK_DONE;
> > > > > > >  }
> > > > > > > --
> > > > > > > 2.26.1
> > > > > > >
> > > > > >
> > > > > > I strongly oppose (NACK) this patch due to performance reasons.
> > > > > >
> > > > > > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> > > > > > 1) All N CPUs will take interrupt
> > > > > > 2) All N CPUs will try to read PLIC CLAIM register
> > > > > > 3) Only one of the CPUs will see IRQ X using the CLAIM register
> > > > > > but other N - 1 CPUs will see no interrupt and return back to what
> > > > > > they were doing. In other words, N - 1 CPUs will just waste CPU
> > > > > > every time IRQ X occurs.
> > > > > >
> > > > > > Example1, one Application doing heavy network traffic will
> > > > > > degrade performance of other applications because with every
> > > > > > network RX/TX interrupt N-1 CPUs will waste CPU trying to
> > > > > > process network interrupt.
> > > > > >
> > > > > > Example1, one Application doing heavy MMC/SD traffic will
> > > > > > degrade performance of other applications because with every
> > > > > > SPI read/write interrupt N-1 CPUs will waste CPU trying to
> > > > > > process it.
> > > > > >

Hi Anup,

If there is a case that there are a lot of interrupts from a single
irq number coming very quickly, the first hart is still serving the
first interrupt and still in its top half so it doesn't enable
interrupt yet but the second interrupt is coming and waiting, only the
rest of the harts can serve it. If they are masked, the interrupt
won't be able to be served it right away. In this case, I think this
patch can get better performance. Maybe we can still keep this patch
for user to define their usage in his case?

Anup Patel April 27, 2020, 6:59 a.m. UTC | #8

On Mon, Apr 27, 2020 at 12:19 PM Greentime Hu <greentime.hu@sifive.com> wrote:
>
> Zong Li <zong.li@sifive.com> 於 2020年4月26日 週日 下午11:35寫道：
> >
> > On Sun, Apr 26, 2020 at 11:21 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Sun, Apr 26, 2020 at 8:42 PM Zong Li <zong.li@sifive.com> wrote:
> > > >
> > > > On Sun, Apr 26, 2020 at 9:38 PM Anup Patel <anup@brainfault.org> wrote:
> > > > >
> > > > > +Mark Z
> > > > >
> > > > > On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
> > > > > >
> > > > > > On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> wrote:
> > > > > > >
> > > > > > > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
> > > > > > > >
> > > > > > > > Currently, driver forces the IRQs to be handled by only one core. This
> > > > > > > > patch provides the way to enable others cores to handle IRQs if needed,
> > > > > > > > so users could decide how many cores they wanted on default by boot
> > > > > > > > argument.
> > > > > > > >
> > > > > > > > Use 'irqaffinity' boot argument to determine affinity. If there is no
> > > > > > > > irqaffinity in dts or kernel configuration, use irq default affinity,
> > > > > > > > so all harts would try to claim IRQ.
> > > > > > > >
> > > > > > > > For example, add irqaffinity=0 in chosen node to set irq affinity to
> > > > > > > > hart 0. It also supports more than one harts to handle irq, such as set
> > > > > > > > irqaffinity=0,3,4.
> > > > > > > >
> > > > > > > > You can change IRQ affinity from user-space using procfs. For example,
> > > > > > > > you can make CPU0 and CPU2 serve IRQ together by the following command:
> > > > > > > >
> > > > > > > > echo 4 > /proc/irq/<x>/smp_affinity
> > > > > > > >
> > > > > > > > Signed-off-by: Zong Li <zong.li@sifive.com>
> > > > > > > > ---
> > > > > > > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
> > > > > > > >  1 file changed, 7 insertions(+), 14 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > > > > > > > index d0a71febdadc..bc1440d54185 100644
> > > > > > > > --- a/drivers/irqchip/irq-sifive-plic.c
> > > > > > > > +++ b/drivers/irqchip/irq-sifive-plic.c
> > > > > > > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
> > > > > > > >  static void plic_irq_unmask(struct irq_data *d)
> > > > > > > >  {
> > > > > > > >         struct cpumask amask;
> > > > > > > > -       unsigned int cpu;
> > > > > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > > > > >
> > > > > > > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
> > > > > > > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
> > > > > > > > -                                          &amask);
> > > > > > > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> > > > > > > > -               return;
> > > > > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > > > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
> > > > > > > > +
> > > > > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  static void plic_irq_mask(struct irq_data *d)
> > > > > > > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
> > > > > > > >  static int plic_set_affinity(struct irq_data *d,
> > > > > > > >                              const struct cpumask *mask_val, bool force)
> > > > > > > >  {
> > > > > > > > -       unsigned int cpu;
> > > > > > > >         struct cpumask amask;
> > > > > > > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
> > > > > > > >
> > > > > > > >         cpumask_and(&amask, &priv->lmask, mask_val);
> > > > > > > >
> > > > > > > >         if (force)
> > > > > > > > -               cpu = cpumask_first(&amask);
> > > > > > > > +               cpumask_copy(&amask, mask_val);
> > > > > > > >         else
> > > > > > > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> > > > > > > > -
> > > > > > > > -       if (cpu >= nr_cpu_ids)
> > > > > > > > -               return -EINVAL;
> > > > > > > > +               cpumask_and(&amask, &amask, cpu_online_mask);
> > > > > > > >
> > > > > > > >         plic_irq_toggle(&priv->lmask, d, 0);
> > > > > > > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
> > > > > > > > +       plic_irq_toggle(&amask, d, 1);
> > > > > > > >
> > > > > > > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > > > > > > > +       irq_data_update_effective_affinity(d, &amask);
> > > > > > > >
> > > > > > > >         return IRQ_SET_MASK_OK_DONE;
> > > > > > > >  }
> > > > > > > > --
> > > > > > > > 2.26.1
> > > > > > > >
> > > > > > >
> > > > > > > I strongly oppose (NACK) this patch due to performance reasons.
> > > > > > >
> > > > > > > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
> > > > > > > 1) All N CPUs will take interrupt
> > > > > > > 2) All N CPUs will try to read PLIC CLAIM register
> > > > > > > 3) Only one of the CPUs will see IRQ X using the CLAIM register
> > > > > > > but other N - 1 CPUs will see no interrupt and return back to what
> > > > > > > they were doing. In other words, N - 1 CPUs will just waste CPU
> > > > > > > every time IRQ X occurs.
> > > > > > >
> > > > > > > Example1, one Application doing heavy network traffic will
> > > > > > > degrade performance of other applications because with every
> > > > > > > network RX/TX interrupt N-1 CPUs will waste CPU trying to
> > > > > > > process network interrupt.
> > > > > > >
> > > > > > > Example1, one Application doing heavy MMC/SD traffic will
> > > > > > > degrade performance of other applications because with every
> > > > > > > SPI read/write interrupt N-1 CPUs will waste CPU trying to
> > > > > > > process it.
> > > > > > >
>
> Hi Anup,
>
> If there is a case that there are a lot of interrupts from a single
> irq number coming very quickly, the first hart is still serving the
> first interrupt and still in its top half so it doesn't enable
> interrupt yet but the second interrupt is coming and waiting, only the
> rest of the harts can serve it. If they are masked, the interrupt
> won't be able to be served it right away. In this case, I think this
> patch can get better performance. Maybe we can still keep this patch
> for user to define their usage in his case?

The way you described issue, it clearly means that other device driver
you are using does not have a optimized interrupt handler and the driver
should be fixed right away. I don't see the point in hacking PLIC driver
and slowing it down because some other device driver spending too
much time in interrupt context.

Seeing your comment, I quickly looked at Cadance MACB driver which
is very important on SiFive Unleashed. It turns out that Cadance MACB
driver does not have well designed top-half and bottom-half. Best thing
would be to change Candance MACB driver to use threaded irqs so that
majority of packet processing work on SiFive unleashed is done in kernel
thread (which is preemtible).

Regards,
Anup

Marc Zyngier April 27, 2020, 8:27 a.m. UTC | #9

Hi Anup,

On 2020-04-26 14:38, Anup Patel wrote:
> +Mark Z

s/k/c/ ;-)

> 
> On Sun, Apr 26, 2020 at 6:49 PM Zong Li <zong.li@sifive.com> wrote:
>> 
>> On Sun, Apr 26, 2020 at 8:47 PM Anup Patel <anup@brainfault.org> 
>> wrote:
>> >
>> > On Sun, Apr 26, 2020 at 4:37 PM Zong Li <zong.li@sifive.com> wrote:
>> > >
>> > > Currently, driver forces the IRQs to be handled by only one core. This
>> > > patch provides the way to enable others cores to handle IRQs if needed,
>> > > so users could decide how many cores they wanted on default by boot
>> > > argument.
>> > >
>> > > Use 'irqaffinity' boot argument to determine affinity. If there is no
>> > > irqaffinity in dts or kernel configuration, use irq default affinity,
>> > > so all harts would try to claim IRQ.
>> > >
>> > > For example, add irqaffinity=0 in chosen node to set irq affinity to
>> > > hart 0. It also supports more than one harts to handle irq, such as set
>> > > irqaffinity=0,3,4.
>> > >
>> > > You can change IRQ affinity from user-space using procfs. For example,
>> > > you can make CPU0 and CPU2 serve IRQ together by the following command:
>> > >
>> > > echo 4 > /proc/irq/<x>/smp_affinity
>> > >
>> > > Signed-off-by: Zong Li <zong.li@sifive.com>
>> > > ---
>> > >  drivers/irqchip/irq-sifive-plic.c | 21 +++++++--------------
>> > >  1 file changed, 7 insertions(+), 14 deletions(-)
>> > >
>> > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
>> > > index d0a71febdadc..bc1440d54185 100644
>> > > --- a/drivers/irqchip/irq-sifive-plic.c
>> > > +++ b/drivers/irqchip/irq-sifive-plic.c
>> > > @@ -111,15 +111,12 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
>> > >  static void plic_irq_unmask(struct irq_data *d)
>> > >  {
>> > >         struct cpumask amask;
>> > > -       unsigned int cpu;
>> > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
>> > >
>> > >         cpumask_and(&amask, &priv->lmask, cpu_online_mask);
>> > > -       cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
>> > > -                                          &amask);
>> > > -       if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
>> > > -               return;
>> > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
>> > > +       cpumask_and(&amask, &amask, irq_data_get_affinity_mask(d));
>> > > +
>> > > +       plic_irq_toggle(&amask, d, 1);
>> > >  }
>> > >
>> > >  static void plic_irq_mask(struct irq_data *d)
>> > > @@ -133,24 +130,20 @@ static void plic_irq_mask(struct irq_data *d)
>> > >  static int plic_set_affinity(struct irq_data *d,
>> > >                              const struct cpumask *mask_val, bool force)
>> > >  {
>> > > -       unsigned int cpu;
>> > >         struct cpumask amask;
>> > >         struct plic_priv *priv = irq_get_chip_data(d->irq);
>> > >
>> > >         cpumask_and(&amask, &priv->lmask, mask_val);
>> > >
>> > >         if (force)
>> > > -               cpu = cpumask_first(&amask);
>> > > +               cpumask_copy(&amask, mask_val);
>> > >         else
>> > > -               cpu = cpumask_any_and(&amask, cpu_online_mask);
>> > > -
>> > > -       if (cpu >= nr_cpu_ids)
>> > > -               return -EINVAL;
>> > > +               cpumask_and(&amask, &amask, cpu_online_mask);
>> > >
>> > >         plic_irq_toggle(&priv->lmask, d, 0);
>> > > -       plic_irq_toggle(cpumask_of(cpu), d, 1);
>> > > +       plic_irq_toggle(&amask, d, 1);
>> > >
>> > > -       irq_data_update_effective_affinity(d, cpumask_of(cpu));
>> > > +       irq_data_update_effective_affinity(d, &amask);
>> > >
>> > >         return IRQ_SET_MASK_OK_DONE;
>> > >  }
>> > > --
>> > > 2.26.1
>> > >
>> >
>> > I strongly oppose (NACK) this patch due to performance reasons.
>> >
>> > In PLIC, if we enable an IRQ X for N CPUs then when IRQ X occurs:
>> > 1) All N CPUs will take interrupt
>> > 2) All N CPUs will try to read PLIC CLAIM register
>> > 3) Only one of the CPUs will see IRQ X using the CLAIM register
>> > but other N - 1 CPUs will see no interrupt and return back to what
>> > they were doing. In other words, N - 1 CPUs will just waste CPU
>> > every time IRQ X occurs.
>> >
>> > Example1, one Application doing heavy network traffic will
>> > degrade performance of other applications because with every
>> > network RX/TX interrupt N-1 CPUs will waste CPU trying to
>> > process network interrupt.
>> >
>> > Example1, one Application doing heavy MMC/SD traffic will
>> > degrade performance of other applications because with every
>> > SPI read/write interrupt N-1 CPUs will waste CPU trying to
>> > process it.
>> >
>> > In fact, the current PLIC approach is actually a performance
>> > optimization. This implementation also works fine with in-kernel
>> > load-balancer and user space load balancer.
>> >
>> 
>> Yes, it's exactly, I know what you pointed out. But the idea of this
>> patch is just providing a way that users could enable other cores if
>> they wanted, it could still enable only one core by this change. The
>> purpose here is thinking of flexible use, rather than limitation.
>> Maybe it would be a happy medium that we make the default case enable
>> only one core? It is a good open discussion.
> 
> Making the default case as enable only one core is just a work-around.
> 
> As-per my understanding, if we set affinity mask of N CPUs for IRQ X
> then it does not mean all N CPUs should receive IRQ X rather it means
> that exactly one of the N CPUs will receive IRQ X and the IRQ receiving
> CPU will be fixed (reflected by effective affinity returned by the 
> driver).
> 
> If we ignore above semantics and still provide a mechanism to target
> IRQ X to N CPUs then most likely someone will try and run into
> performance issues.
> 
> Please don't go this path. The performance impact in case of Guest/VM
> is even worst because PLIC is trap-n-emulated by hypervisors as MMIO
> device.

Just to add my view on this, as we get the same request on the ARM side
once every other year:

- the "broadcast" aspect of the interrupt signalling is usually a waste
   of CPU cycles on all CPUs but the one that has actually takes it.

- the fact that you have to use MMIO to interact with the interrupt
   controller potentially makes this even worse if you don't have some
   per-CPU datapath that will handle this really fast.

- not hitting the same CPU at all times for a given interrupt means that
   the caches are usually cold for this interrupt, meaning the gain in
   latency is pretty dubious.

There is a very small (but also very vocal) fringe of users that insist
that they cannot live without this broadcast aspect, because it allows
them to shave a couple of cycles off a trivial interrupt handler on 
systems
that are otherwise mostly idle.

My answer to this is that Linux is a general purpose OS, and not a
glorified interrupt handler. We aim to have balanced response times, and
give *userspace* control over where the interrupt happens in the long 
run.

HTH,

         M.

irqchip/sifive-plic: allow many cores to handle IRQs

Commit Message

Comments

Patch