diff mbox series

[06/13] PCI: aardvark: Do not clear status bits of masked interrupts

Message ID 20211001195856.10081-7-kabel@kernel.org (mailing list archive)
State Superseded
Delegated to: Lorenzo Pieralisi
Headers show
Series PCI: aardvark controller fixes | expand

Commit Message

Marek Behún Oct. 1, 2021, 7:58 p.m. UTC
From: Pali Rohár <pali@kernel.org>

It is incorrect to clear status bits of masked interrupts.

The aardvark driver clears all status interrupt bits if no unmasked
status bit is set. Masked bits should never be cleared.

Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host controller driver")
Signed-off-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org
---
 drivers/pci/controller/pci-aardvark.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

Comments

Lorenzo Pieralisi Oct. 4, 2021, 2:06 p.m. UTC | #1
[+Marc - always better to have his eyes on IRQ handling code]

On Fri, Oct 01, 2021 at 09:58:49PM +0200, Marek Behún wrote:
> From: Pali Rohár <pali@kernel.org>
> 
> It is incorrect to clear status bits of masked interrupts.
> 
> The aardvark driver clears all status interrupt bits if no unmasked
> status bit is set. Masked bits should never be cleared.
> 
> Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host controller driver")
> Signed-off-by: Pali Rohár <pali@kernel.org>
> Reviewed-by: Marek Behún <kabel@kernel.org>
> Signed-off-by: Marek Behún <kabel@kernel.org>
> Cc: stable@vger.kernel.org
> ---
>  drivers/pci/controller/pci-aardvark.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index d5d6f92e5143..e4986806a189 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -1295,11 +1295,8 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
>  	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
>  	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
>  
> -	if (!isr0_status && !isr1_status) {
> -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);

This looks fine - on the other hand if no interrupt is set in the status
registers (that are filtered with the masks) we are dealing with a
spurious IRQ right ? Just gauging how severe this is.

Lorenzo

> +	if (!isr0_status && !isr1_status)
>  		return;
> -	}
>  
>  	/* Process MSI interrupts */
>  	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
> -- 
> 2.32.0
>
Marek Behún Oct. 4, 2021, 3:18 p.m. UTC | #2
On Mon, 4 Oct 2021 15:06:53 +0100
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:

> [+Marc - always better to have his eyes on IRQ handling code]
> 
> > -	if (!isr0_status && !isr1_status) {
> > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);  
> 
> This looks fine - on the other hand if no interrupt is set in the status
> registers (that are filtered with the masks) we are dealing with a
> spurious IRQ right ? Just gauging how severe this is.

Yes, spurious IRQ can really happen.

Patch 7 in this series fixes an issue where aardvark does not mask all
interrupts, and then kernel can think that an interrupt is masked but
it really isn't.

Also, some interrupts may be masked by the user of the emulated bridge
(some other driver), so that they can be polled. But if we clear all of
them in the status, even the masked ones, then the other driver which
is polling will always get a zero.

Marek
Marc Zyngier Oct. 4, 2021, 3:31 p.m. UTC | #3
On Mon, 04 Oct 2021 15:06:53 +0100,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
> 
> [+Marc - always better to have his eyes on IRQ handling code]
> 
> On Fri, Oct 01, 2021 at 09:58:49PM +0200, Marek Behún wrote:
> > From: Pali Rohár <pali@kernel.org>
> > 
> > It is incorrect to clear status bits of masked interrupts.
> > 
> > The aardvark driver clears all status interrupt bits if no unmasked
> > status bit is set. Masked bits should never be cleared.
> > 
> > Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host controller driver")
> > Signed-off-by: Pali Rohár <pali@kernel.org>
> > Reviewed-by: Marek Behún <kabel@kernel.org>
> > Signed-off-by: Marek Behún <kabel@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/pci/controller/pci-aardvark.c | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index d5d6f92e5143..e4986806a189 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -1295,11 +1295,8 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
> >  	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
> >  	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
> >  
> > -	if (!isr0_status && !isr1_status) {
> > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
> 
> This looks fine - on the other hand if no interrupt is set in the status
> registers (that are filtered with the masks) we are dealing with a
> spurious IRQ right ? Just gauging how severe this is.
> 
> Lorenzo
> 
> > +	if (!isr0_status && !isr1_status)
> >  		return;

The whole thing is a bit odd. What the commit message doesn't say is
whether the status register shows the status of the line before
masking, or after masking.

The code seems to imply the former, but then the behaviour is
awkward. How did we end-up here the first place? if that's only a
spurious interrupt, then I'd probably simplify the code altogether,
and drop all the early return code. Something like below, as usual
completely untested.

	M.

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 596ebcfcc82d..1d8f257ecb63 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1275,7 +1275,8 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
 static void advk_pcie_handle_int(struct advk_pcie *pcie)
 {
 	u32 isr0_val, isr0_mask, isr0_status;
-	u32 isr1_val, isr1_mask, isr1_status;
+	u32 isr1_val, isr1_mask;
+	unsigned long isr1_status;
 	int i;
 
 	isr0_val = advk_readl(pcie, PCIE_ISR0_REG);
@@ -1285,22 +1286,14 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
 	isr1_val = advk_readl(pcie, PCIE_ISR1_REG);
 	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
 	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
-
-	if (!isr0_status && !isr1_status) {
-		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
-		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
-		return;
-	}
+	isr1_status >> 8;
 
 	/* Process MSI interrupts */
 	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
 		advk_pcie_handle_msi(pcie);
 
 	/* Process legacy interrupts */
-	for (i = 0; i < PCI_NUM_INTX; i++) {
-		if (!(isr1_status & PCIE_ISR1_INTX_ASSERT(i)))
-			continue;
-
+	for_each_set_bit(i, &isr1_status, PCI_NUM_INTX) {
 		advk_writel(pcie, PCIE_ISR1_INTX_ASSERT(i),
 			    PCIE_ISR1_REG);
Marek Behún Oct. 5, 2021, 12:13 p.m. UTC | #4
On Mon, 04 Oct 2021 16:31:54 +0100
Marc Zyngier <maz@kernel.org> wrote:

> On Mon, 04 Oct 2021 15:06:53 +0100,
> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
> > 
> > [+Marc - always better to have his eyes on IRQ handling code]
> > 
> > On Fri, Oct 01, 2021 at 09:58:49PM +0200, Marek Behún wrote:  
> > > From: Pali Rohár <pali@kernel.org>
> > > 
> > > It is incorrect to clear status bits of masked interrupts.
> > > 
> > > The aardvark driver clears all status interrupt bits if no
> > > unmasked status bit is set. Masked bits should never be cleared.
> > > 
> > > Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host
> > > controller driver") Signed-off-by: Pali Rohár <pali@kernel.org>
> > > Reviewed-by: Marek Behún <kabel@kernel.org>
> > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/pci/controller/pci-aardvark.c | 5 +----
> > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > b/drivers/pci/controller/pci-aardvark.c index
> > > d5d6f92e5143..e4986806a189 100644 ---
> > > a/drivers/pci/controller/pci-aardvark.c +++
> > > b/drivers/pci/controller/pci-aardvark.c @@ -1295,11 +1295,8 @@
> > > static void advk_pcie_handle_int(struct advk_pcie *pcie)
> > > isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG); isr1_status =
> > > isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK); 
> > > -	if (!isr0_status && !isr1_status) {
> > > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);  
> > 
> > This looks fine - on the other hand if no interrupt is set in the
> > status registers (that are filtered with the masks) we are dealing
> > with a spurious IRQ right ? Just gauging how severe this is.
> > 
> > Lorenzo
> >   
> > > +	if (!isr0_status && !isr1_status)
> > >  		return;  
> 
> The whole thing is a bit odd. What the commit message doesn't say is
> whether the status register shows the status of the line before
> masking, or after masking.

I don't quite understand what you are asking about.
If you are asking about the register itself:
the PCIE_ISR1_REG says which interrupts are currently set / active,
including those which are masked.

If you are asking about the isr1_status variable, it is the
status of the line after masking. I.e. masked interrupts are not set in
this variable, only active interrupts which are also unmasked. That is
obvious from the code.

> The code seems to imply the former, but then the behaviour is
> awkward. How did we end-up here the first place?

I answered this in reply to Lorenzo's comment on this patch, see
https://lore.kernel.org/linux-pci/20211004171823.0288684e@thinkpad/

> if that's only a
> spurious interrupt, then I'd probably simplify the code altogether,
> and drop all the early return code. Something like below, as usual
> completely untested.
> 
> 	M.
> 
> diff --git a/drivers/pci/controller/pci-aardvark.c
> b/drivers/pci/controller/pci-aardvark.c index
> 596ebcfcc82d..1d8f257ecb63 100644 ---
> a/drivers/pci/controller/pci-aardvark.c +++
> b/drivers/pci/controller/pci-aardvark.c @@ -1275,7 +1275,8 @@ static
> void advk_pcie_handle_msi(struct advk_pcie *pcie) static void
> advk_pcie_handle_int(struct advk_pcie *pcie) {
>  	u32 isr0_val, isr0_mask, isr0_status;
> -	u32 isr1_val, isr1_mask, isr1_status;
> +	u32 isr1_val, isr1_mask;
> +	unsigned long isr1_status;
>  	int i;
>  
>  	isr0_val = advk_readl(pcie, PCIE_ISR0_REG);
> @@ -1285,22 +1286,14 @@ static void advk_pcie_handle_int(struct
> advk_pcie *pcie) isr1_val = advk_readl(pcie, PCIE_ISR1_REG);
>  	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
>  	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
> -
> -	if (!isr0_status && !isr1_status) {
> -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
> -		return;
> -	}
> +	isr1_status >> 8;
>
>  	/* Process MSI interrupts */
>  	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
>  		advk_pcie_handle_msi(pcie);
>  
>  	/* Process legacy interrupts */
> -	for (i = 0; i < PCI_NUM_INTX; i++) {
> -		if (!(isr1_status & PCIE_ISR1_INTX_ASSERT(i)))
> -			continue;
> -
> +	for_each_set_bit(i, &isr1_status, PCI_NUM_INTX) {
>  		advk_writel(pcie, PCIE_ISR1_INTX_ASSERT(i),
>  			    PCIE_ISR1_REG);

1. what you are doing here is code cleanup. We are currently in the
   state where we have lots of fixes for this driver, which we are
   hoping will go also to stable. Some of them depend on these changes.
   Can we please first apply those fixes (we want to send them in
   batches to avoid sending 60 patchs in one series, since last time
   nobody wanted to review all of that) and do this afterwards?

2. you are throwing away lower 8 bits of isr1_status. We have follow-up
   patches (not in this series, but in another batch which we want to
   send after this) that will be using those lower 8 bits, so we do not
   want to throw away them now.

Marek

PS: You can look at all the prepared changes at
https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark
Marc Zyngier Oct. 5, 2021, 12:42 p.m. UTC | #5
On Tue, 05 Oct 2021 13:13:40 +0100,
Marek Behún <kabel@kernel.org> wrote:
> 
> On Mon, 04 Oct 2021 16:31:54 +0100
> Marc Zyngier <maz@kernel.org> wrote:
> 
> > On Mon, 04 Oct 2021 15:06:53 +0100,
> > Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
> > > 
> > > [+Marc - always better to have his eyes on IRQ handling code]
> > > 
> > > On Fri, Oct 01, 2021 at 09:58:49PM +0200, Marek Behún wrote:  
> > > > From: Pali Rohár <pali@kernel.org>
> > > > 
> > > > It is incorrect to clear status bits of masked interrupts.
> > > > 
> > > > The aardvark driver clears all status interrupt bits if no
> > > > unmasked status bit is set. Masked bits should never be cleared.
> > > > 
> > > > Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host
> > > > controller driver") Signed-off-by: Pali Rohár <pali@kernel.org>
> > > > Reviewed-by: Marek Behún <kabel@kernel.org>
> > > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > > Cc: stable@vger.kernel.org
> > > > ---
> > > >  drivers/pci/controller/pci-aardvark.c | 5 +----
> > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > > b/drivers/pci/controller/pci-aardvark.c index
> > > > d5d6f92e5143..e4986806a189 100644 ---
> > > > a/drivers/pci/controller/pci-aardvark.c +++
> > > > b/drivers/pci/controller/pci-aardvark.c @@ -1295,11 +1295,8 @@
> > > > static void advk_pcie_handle_int(struct advk_pcie *pcie)
> > > > isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG); isr1_status =
> > > > isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK); 
> > > > -	if (!isr0_status && !isr1_status) {
> > > > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > > > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);  
> > > 
> > > This looks fine - on the other hand if no interrupt is set in the
> > > status registers (that are filtered with the masks) we are dealing
> > > with a spurious IRQ right ? Just gauging how severe this is.
> > > 
> > > Lorenzo
> > >   
> > > > +	if (!isr0_status && !isr1_status)
> > > >  		return;  
> > 
> > The whole thing is a bit odd. What the commit message doesn't say is
> > whether the status register shows the status of the line before
> > masking, or after masking.
> 
> I don't quite understand what you are asking about.
> If you are asking about the register itself:
> the PCIE_ISR1_REG says which interrupts are currently set / active,
> including those which are masked.

Then please say so in the commit message.

> If you are asking about the isr1_status variable, it is the
> status of the line after masking. I.e. masked interrupts are not set in
> this variable, only active interrupts which are also unmasked. That is
> obvious from the code.

Which is what I have said... two lines below. If you are going to
reply, please do so in context.

> 
> > The code seems to imply the former, but then the behaviour is
> > awkward. How did we end-up here the first place?
> 
> I answered this in reply to Lorenzo's comment on this patch, see
> https://lore.kernel.org/linux-pci/20211004171823.0288684e@thinkpad/

It did grace my inbox, thanks.

> > if that's only a
> > spurious interrupt, then I'd probably simplify the code altogether,
> > and drop all the early return code. Something like below, as usual
> > completely untested.
> > 
> > 	M.
> > 
> > diff --git a/drivers/pci/controller/pci-aardvark.c
> > b/drivers/pci/controller/pci-aardvark.c index
> > 596ebcfcc82d..1d8f257ecb63 100644 ---
> > a/drivers/pci/controller/pci-aardvark.c +++
> > b/drivers/pci/controller/pci-aardvark.c @@ -1275,7 +1275,8 @@ static
> > void advk_pcie_handle_msi(struct advk_pcie *pcie) static void
> > advk_pcie_handle_int(struct advk_pcie *pcie) {
> >  	u32 isr0_val, isr0_mask, isr0_status;
> > -	u32 isr1_val, isr1_mask, isr1_status;
> > +	u32 isr1_val, isr1_mask;
> > +	unsigned long isr1_status;
> >  	int i;
> >  
> >  	isr0_val = advk_readl(pcie, PCIE_ISR0_REG);
> > @@ -1285,22 +1286,14 @@ static void advk_pcie_handle_int(struct
> > advk_pcie *pcie) isr1_val = advk_readl(pcie, PCIE_ISR1_REG);
> >  	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
> >  	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
> > -
> > -	if (!isr0_status && !isr1_status) {
> > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
> > -		return;
> > -	}
> > +	isr1_status >> 8;
> >
> >  	/* Process MSI interrupts */
> >  	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
> >  		advk_pcie_handle_msi(pcie);
> >  
> >  	/* Process legacy interrupts */
> > -	for (i = 0; i < PCI_NUM_INTX; i++) {
> > -		if (!(isr1_status & PCIE_ISR1_INTX_ASSERT(i)))
> > -			continue;
> > -
> > +	for_each_set_bit(i, &isr1_status, PCI_NUM_INTX) {
> >  		advk_writel(pcie, PCIE_ISR1_INTX_ASSERT(i),
> >  			    PCIE_ISR1_REG);
> 
> 1. what you are doing here is code cleanup. We are currently in the
>    state where we have lots of fixes for this driver, which we are
>    hoping will go also to stable.

Yes, it is code cleanup. Because I don't find this patch to be very
good, TBH. As for going into stable, that's not relevant for this
discussion.

>    Some of them depend on these changes.
>    Can we please first apply those fixes (we want to send them in
>    batches to avoid sending 60 patchs in one series, since last time
>    nobody wanted to review all of that) and do this afterwards?

It would be better to start with patches that are in a better
shape. After all, this is what the code review process is about. This
isn't "just take my patches".

> 2. you are throwing away lower 8 bits of isr1_status. We have follow-up
>    patches (not in this series, but in another batch which we want to
>    send after this) that will be using those lower 8 bits, so we do not
>    want to throw away them now.

I'm discarding these bits because *in isolation*, that's the correct
thing to do. Feel free to propose a better patch that doesn't discard
these bits and still makes the code more palatable.

Thanks,

	M.
Pali Rohár Oct. 5, 2021, 1:15 p.m. UTC | #6
On Tuesday 05 October 2021 13:42:02 Marc Zyngier wrote:
> On Tue, 05 Oct 2021 13:13:40 +0100,
> Marek Behún <kabel@kernel.org> wrote:
> > 
> > On Mon, 04 Oct 2021 16:31:54 +0100
> > Marc Zyngier <maz@kernel.org> wrote:
> > 
> > > On Mon, 04 Oct 2021 15:06:53 +0100,
> > > Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
> > > > 
> > > > [+Marc - always better to have his eyes on IRQ handling code]
> > > > 
> > > > On Fri, Oct 01, 2021 at 09:58:49PM +0200, Marek Behún wrote:  
> > > > > From: Pali Rohár <pali@kernel.org>
> > > > > 
> > > > > It is incorrect to clear status bits of masked interrupts.
> > > > > 
> > > > > The aardvark driver clears all status interrupt bits if no
> > > > > unmasked status bit is set. Masked bits should never be cleared.
> > > > > 
> > > > > Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host
> > > > > controller driver") Signed-off-by: Pali Rohár <pali@kernel.org>
> > > > > Reviewed-by: Marek Behún <kabel@kernel.org>
> > > > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > > > Cc: stable@vger.kernel.org
> > > > > ---
> > > > >  drivers/pci/controller/pci-aardvark.c | 5 +----
> > > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > > > b/drivers/pci/controller/pci-aardvark.c index
> > > > > d5d6f92e5143..e4986806a189 100644 ---
> > > > > a/drivers/pci/controller/pci-aardvark.c +++
> > > > > b/drivers/pci/controller/pci-aardvark.c @@ -1295,11 +1295,8 @@
> > > > > static void advk_pcie_handle_int(struct advk_pcie *pcie)
> > > > > isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG); isr1_status =
> > > > > isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK); 
> > > > > -	if (!isr0_status && !isr1_status) {
> > > > > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > > > > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);  
> > > > 
> > > > This looks fine - on the other hand if no interrupt is set in the
> > > > status registers (that are filtered with the masks) we are dealing
> > > > with a spurious IRQ right ? Just gauging how severe this is.
> > > > 
> > > > Lorenzo
> > > >   
> > > > > +	if (!isr0_status && !isr1_status)
> > > > >  		return;  
> > > 
> > > The whole thing is a bit odd. What the commit message doesn't say is
> > > whether the status register shows the status of the line before
> > > masking, or after masking.
> > 
> > I don't quite understand what you are asking about.
> > If you are asking about the register itself:
> > the PCIE_ISR1_REG says which interrupts are currently set / active,
> > including those which are masked.
> 
> Then please say so in the commit message.

Very well, we shall do so.

> > If you are asking about the isr1_status variable, it is the
> > status of the line after masking. I.e. masked interrupts are not set in
> > this variable, only active interrupts which are also unmasked. That is
> > obvious from the code.
> 
> Which is what I have said... two lines below. If you are going to
> reply, please do so in context.
> 
> > 
> > > The code seems to imply the former, but then the behaviour is
> > > awkward. How did we end-up here the first place?
> > 
> > I answered this in reply to Lorenzo's comment on this patch, see
> > https://lore.kernel.org/linux-pci/20211004171823.0288684e@thinkpad/
> 
> It did grace my inbox, thanks.
> 
> > > if that's only a
> > > spurious interrupt, then I'd probably simplify the code altogether,
> > > and drop all the early return code. Something like below, as usual
> > > completely untested.
> > > 
> > > 	M.
> > > 
> > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > b/drivers/pci/controller/pci-aardvark.c index
> > > 596ebcfcc82d..1d8f257ecb63 100644 ---
> > > a/drivers/pci/controller/pci-aardvark.c +++
> > > b/drivers/pci/controller/pci-aardvark.c @@ -1275,7 +1275,8 @@ static
> > > void advk_pcie_handle_msi(struct advk_pcie *pcie) static void
> > > advk_pcie_handle_int(struct advk_pcie *pcie) {
> > >  	u32 isr0_val, isr0_mask, isr0_status;
> > > -	u32 isr1_val, isr1_mask, isr1_status;
> > > +	u32 isr1_val, isr1_mask;
> > > +	unsigned long isr1_status;
> > >  	int i;
> > >  
> > >  	isr0_val = advk_readl(pcie, PCIE_ISR0_REG);
> > > @@ -1285,22 +1286,14 @@ static void advk_pcie_handle_int(struct
> > > advk_pcie *pcie) isr1_val = advk_readl(pcie, PCIE_ISR1_REG);
> > >  	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
> > >  	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
> > > -
> > > -	if (!isr0_status && !isr1_status) {
> > > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
> > > -		return;
> > > -	}
> > > +	isr1_status >> 8;

Hello!

I dislike this approach. It adds another magic number which is just
causing issues. Please read commit message for patch 11/13 where we
describe why such magic constants are bad and already caused lot of
issues in this driver.

> > >  	/* Process MSI interrupts */
> > >  	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
> > >  		advk_pcie_handle_msi(pcie);
> > >  
> > >  	/* Process legacy interrupts */
> > > -	for (i = 0; i < PCI_NUM_INTX; i++) {
> > > -		if (!(isr1_status & PCIE_ISR1_INTX_ASSERT(i)))
> > > -			continue;
> > > -
> > > +	for_each_set_bit(i, &isr1_status, PCI_NUM_INTX) {
> > >  		advk_writel(pcie, PCIE_ISR1_INTX_ASSERT(i),
> > >  			    PCIE_ISR1_REG);
> > 
> > 1. what you are doing here is code cleanup. We are currently in the
> >    state where we have lots of fixes for this driver, which we are
> >    hoping will go also to stable.
> 
> Yes, it is code cleanup. Because I don't find this patch to be very
> good, TBH. As for going into stable, that's not relevant for this
> discussion.
> 
> >    Some of them depend on these changes.
> >    Can we please first apply those fixes (we want to send them in
> >    batches to avoid sending 60 patchs in one series, since last time
> >    nobody wanted to review all of that) and do this afterwards?
> 
> It would be better to start with patches that are in a better
> shape. After all, this is what the code review process is about. This
> isn't "just take my patches".
> 
> > 2. you are throwing away lower 8 bits of isr1_status. We have follow-up
> >    patches (not in this series, but in another batch which we want to
> >    send after this) that will be using those lower 8 bits, so we do not
> >    want to throw away them now.
> 
> I'm discarding these bits because *in isolation*, that's the correct
> thing to do. Feel free to propose a better patch that doesn't discard
> these bits and still makes the code more palatable.

The code pattern in this function is: compose irs*_status variable and
then compare it with register macros defined at the top of driver. Each
bit in this register represent some event and for each event there is
simple macro to match.

So with your proposed change it would break all macros (as they are
going to be shifted by magic constant) and then this code disallow
access to events represented by low bits. And also it makes code pattern
different for isr0_status and isr1_status variables which is very
confusing and probably source for introduction of new bugs.

Also the whole early-return optimization can be removed as it does not
change functionality. So we will do so.

But we do not agree with the lower 8 bit discard of the isr1_status
variable as explained above.

So if we add the explanation to commit message and drop the early
return, would it be ok?
Marc Zyngier Oct. 5, 2021, 3:42 p.m. UTC | #7
On Tue, 05 Oct 2021 14:15:45 +0100,
Pali Rohár <pali@kernel.org> wrote:
> 
> Hello!
> 
> I dislike this approach. It adds another magic number which is just
> causing issues. Please read commit message for patch 11/13 where we
> describe why such magic constants are bad and already caused lot of
> issues in this driver.

As I said, feel free to write something better.

> 
> > > >  	/* Process MSI interrupts */
> > > >  	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
> > > >  		advk_pcie_handle_msi(pcie);
> > > >  
> > > >  	/* Process legacy interrupts */
> > > > -	for (i = 0; i < PCI_NUM_INTX; i++) {
> > > > -		if (!(isr1_status & PCIE_ISR1_INTX_ASSERT(i)))
> > > > -			continue;
> > > > -
> > > > +	for_each_set_bit(i, &isr1_status, PCI_NUM_INTX) {
> > > >  		advk_writel(pcie, PCIE_ISR1_INTX_ASSERT(i),
> > > >  			    PCIE_ISR1_REG);
> > > 
> > > 1. what you are doing here is code cleanup. We are currently in the
> > >    state where we have lots of fixes for this driver, which we are
> > >    hoping will go also to stable.
> > 
> > Yes, it is code cleanup. Because I don't find this patch to be very
> > good, TBH. As for going into stable, that's not relevant for this
> > discussion.
> > 
> > >    Some of them depend on these changes.
> > >    Can we please first apply those fixes (we want to send them in
> > >    batches to avoid sending 60 patchs in one series, since last time
> > >    nobody wanted to review all of that) and do this afterwards?
> > 
> > It would be better to start with patches that are in a better
> > shape. After all, this is what the code review process is about. This
> > isn't "just take my patches".
> > 
> > > 2. you are throwing away lower 8 bits of isr1_status. We have follow-up
> > >    patches (not in this series, but in another batch which we want to
> > >    send after this) that will be using those lower 8 bits, so we do not
> > >    want to throw away them now.
> > 
> > I'm discarding these bits because *in isolation*, that's the correct
> > thing to do. Feel free to propose a better patch that doesn't discard
> > these bits and still makes the code more palatable.
> 
> The code pattern in this function is: compose irs*_status variable and
> then compare it with register macros defined at the top of driver. Each
> bit in this register represent some event and for each event there is
> simple macro to match.
> 
> So with your proposed change it would break all macros (as they are
> going to be shifted by magic constant) and then this code disallow
> access to events represented by low bits. And also it makes code pattern
> different for isr0_status and isr1_status variables which is very
> confusing and probably source for introduction of new bugs.

Read what I have said: I'm suggesting changes based on this patch *in
isolation*. I don't see any other related patch in my inbox (nor do I
want to receive any). Feel free to suggest something better (that's
the third time I write this...).

> Also the whole early-return optimization can be removed as it does not
> change functionality. So we will do so.
> 
> But we do not agree with the lower 8 bit discard of the isr1_status
> variable as explained above.
> 
> So if we add the explanation to commit message and drop the early
> return, would it be ok?

Do what you want. I have no interest in this particular piece of code,
and only replied to Lorenzo's request. It doesn't change what I think
of this patch, but I have too much on my plate to get dragged into
sterile arguments.

	M.
Lorenzo Pieralisi Oct. 5, 2021, 4:04 p.m. UTC | #8
On Tue, Oct 05, 2021 at 03:15:45PM +0200, Pali Rohár wrote:
> On Tuesday 05 October 2021 13:42:02 Marc Zyngier wrote:
> > On Tue, 05 Oct 2021 13:13:40 +0100,
> > Marek Behún <kabel@kernel.org> wrote:
> > > 
> > > On Mon, 04 Oct 2021 16:31:54 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:
> > > 
> > > > On Mon, 04 Oct 2021 15:06:53 +0100,
> > > > Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
> > > > > 
> > > > > [+Marc - always better to have his eyes on IRQ handling code]
> > > > > 
> > > > > On Fri, Oct 01, 2021 at 09:58:49PM +0200, Marek Behún wrote:  
> > > > > > From: Pali Rohár <pali@kernel.org>
> > > > > > 
> > > > > > It is incorrect to clear status bits of masked interrupts.
> > > > > > 
> > > > > > The aardvark driver clears all status interrupt bits if no
> > > > > > unmasked status bit is set. Masked bits should never be cleared.
> > > > > > 
> > > > > > Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host
> > > > > > controller driver") Signed-off-by: Pali Rohár <pali@kernel.org>
> > > > > > Reviewed-by: Marek Behún <kabel@kernel.org>
> > > > > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > > > > Cc: stable@vger.kernel.org
> > > > > > ---
> > > > > >  drivers/pci/controller/pci-aardvark.c | 5 +----
> > > > > >  1 file changed, 1 insertion(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > > > > b/drivers/pci/controller/pci-aardvark.c index
> > > > > > d5d6f92e5143..e4986806a189 100644 ---
> > > > > > a/drivers/pci/controller/pci-aardvark.c +++
> > > > > > b/drivers/pci/controller/pci-aardvark.c @@ -1295,11 +1295,8 @@
> > > > > > static void advk_pcie_handle_int(struct advk_pcie *pcie)
> > > > > > isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG); isr1_status =
> > > > > > isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK); 
> > > > > > -	if (!isr0_status && !isr1_status) {
> > > > > > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > > > > > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);  
> > > > > 
> > > > > This looks fine - on the other hand if no interrupt is set in the
> > > > > status registers (that are filtered with the masks) we are dealing
> > > > > with a spurious IRQ right ? Just gauging how severe this is.
> > > > > 
> > > > > Lorenzo
> > > > >   
> > > > > > +	if (!isr0_status && !isr1_status)
> > > > > >  		return;  
> > > > 
> > > > The whole thing is a bit odd. What the commit message doesn't say is
> > > > whether the status register shows the status of the line before
> > > > masking, or after masking.
> > > 
> > > I don't quite understand what you are asking about.
> > > If you are asking about the register itself:
> > > the PCIE_ISR1_REG says which interrupts are currently set / active,
> > > including those which are masked.
> > 
> > Then please say so in the commit message.
> 
> Very well, we shall do so.
> 
> > > If you are asking about the isr1_status variable, it is the
> > > status of the line after masking. I.e. masked interrupts are not set in
> > > this variable, only active interrupts which are also unmasked. That is
> > > obvious from the code.
> > 
> > Which is what I have said... two lines below. If you are going to
> > reply, please do so in context.
> > 
> > > 
> > > > The code seems to imply the former, but then the behaviour is
> > > > awkward. How did we end-up here the first place?
> > > 
> > > I answered this in reply to Lorenzo's comment on this patch, see
> > > https://lore.kernel.org/linux-pci/20211004171823.0288684e@thinkpad/
> > 
> > It did grace my inbox, thanks.
> > 
> > > > if that's only a
> > > > spurious interrupt, then I'd probably simplify the code altogether,
> > > > and drop all the early return code. Something like below, as usual
> > > > completely untested.
> > > > 
> > > > 	M.
> > > > 
> > > > diff --git a/drivers/pci/controller/pci-aardvark.c
> > > > b/drivers/pci/controller/pci-aardvark.c index
> > > > 596ebcfcc82d..1d8f257ecb63 100644 ---
> > > > a/drivers/pci/controller/pci-aardvark.c +++
> > > > b/drivers/pci/controller/pci-aardvark.c @@ -1275,7 +1275,8 @@ static
> > > > void advk_pcie_handle_msi(struct advk_pcie *pcie) static void
> > > > advk_pcie_handle_int(struct advk_pcie *pcie) {
> > > >  	u32 isr0_val, isr0_mask, isr0_status;
> > > > -	u32 isr1_val, isr1_mask, isr1_status;
> > > > +	u32 isr1_val, isr1_mask;
> > > > +	unsigned long isr1_status;
> > > >  	int i;
> > > >  
> > > >  	isr0_val = advk_readl(pcie, PCIE_ISR0_REG);
> > > > @@ -1285,22 +1286,14 @@ static void advk_pcie_handle_int(struct
> > > > advk_pcie *pcie) isr1_val = advk_readl(pcie, PCIE_ISR1_REG);
> > > >  	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
> > > >  	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
> > > > -
> > > > -	if (!isr0_status && !isr1_status) {
> > > > -		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
> > > > -		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
> > > > -		return;
> > > > -	}
> > > > +	isr1_status >> 8;
> 
> Hello!
> 
> I dislike this approach. It adds another magic number which is just
> causing issues. Please read commit message for patch 11/13 where we
> describe why such magic constants are bad and already caused lot of
> issues in this driver.
> 
> > > >  	/* Process MSI interrupts */
> > > >  	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
> > > >  		advk_pcie_handle_msi(pcie);
> > > >  
> > > >  	/* Process legacy interrupts */
> > > > -	for (i = 0; i < PCI_NUM_INTX; i++) {
> > > > -		if (!(isr1_status & PCIE_ISR1_INTX_ASSERT(i)))
> > > > -			continue;
> > > > -
> > > > +	for_each_set_bit(i, &isr1_status, PCI_NUM_INTX) {
> > > >  		advk_writel(pcie, PCIE_ISR1_INTX_ASSERT(i),
> > > >  			    PCIE_ISR1_REG);
> > > 
> > > 1. what you are doing here is code cleanup. We are currently in the
> > >    state where we have lots of fixes for this driver, which we are
> > >    hoping will go also to stable.
> > 
> > Yes, it is code cleanup. Because I don't find this patch to be very
> > good, TBH. As for going into stable, that's not relevant for this
> > discussion.
> > 
> > >    Some of them depend on these changes.
> > >    Can we please first apply those fixes (we want to send them in
> > >    batches to avoid sending 60 patchs in one series, since last time
> > >    nobody wanted to review all of that) and do this afterwards?
> > 
> > It would be better to start with patches that are in a better
> > shape. After all, this is what the code review process is about. This
> > isn't "just take my patches".
> > 
> > > 2. you are throwing away lower 8 bits of isr1_status. We have follow-up
> > >    patches (not in this series, but in another batch which we want to
> > >    send after this) that will be using those lower 8 bits, so we do not
> > >    want to throw away them now.
> > 
> > I'm discarding these bits because *in isolation*, that's the correct
> > thing to do. Feel free to propose a better patch that doesn't discard
> > these bits and still makes the code more palatable.
> 
> The code pattern in this function is: compose irs*_status variable and
> then compare it with register macros defined at the top of driver. Each
> bit in this register represent some event and for each event there is
> simple macro to match.
> 
> So with your proposed change it would break all macros (as they are
> going to be shifted by magic constant) and then this code disallow
> access to events represented by low bits. And also it makes code pattern
> different for isr0_status and isr1_status variables which is very
> confusing and probably source for introduction of new bugs.
> 
> Also the whole early-return optimization can be removed as it does not
> change functionality. So we will do so.
> 
> But we do not agree with the lower 8 bit discard of the isr1_status
> variable as explained above.
> 
> So if we add the explanation to commit message and drop the early
> return, would it be ok?

Please repost a v2 of this series with the review comments you
received (and tags removed if they are not applicable as things
stand in mainline) and we will take it from there.

Thanks,
Lorenzo
Pali Rohár Oct. 5, 2021, 6:48 p.m. UTC | #9
On Tuesday 05 October 2021 16:42:29 Marc Zyngier wrote:
> As I said, feel free to write something better.

Marek now sent a new version, I hope it is better:
https://lore.kernel.org/linux-pci/20211005180952.6812-7-kabel@kernel.org/


Anyway, I was thinking more if it is possible to end up in state when
*_status variables are zero also after applying patch 7/13.

I do not know how precisely are bits 8-11 of ISR1 reg handled in HW as
description is completely missing in all documentations which I read.
These bits represents events for legacy INTx interrupts.

And maybe it is possible that driver for endpoint card run some function
in context when all CPU interrupts are disabled and do something which
cause card to send Assert_INTA message followed by Deassert_INTA (e.g.
poke some register and immediately process event which deassert intx).
After endpoint driver function finish its execution then CPU receive
PCIe summary interrupt and pci controller driver would see that no event
is pending and noting to process.
diff mbox series

Patch

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index d5d6f92e5143..e4986806a189 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1295,11 +1295,8 @@  static void advk_pcie_handle_int(struct advk_pcie *pcie)
 	isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
 	isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);
 
-	if (!isr0_status && !isr1_status) {
-		advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
-		advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
+	if (!isr0_status && !isr1_status)
 		return;
-	}
 
 	/* Process MSI interrupts */
 	if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)