[1/2] pcie/dpc: Skip DPC event if device is not present

Message ID 1493395369-20225-2-git-send-email-keith.busch@intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Commit Message

Keith Busch April 28, 2017, 4:02 p.m. UTC
The DPC interrupt may be executed on a device that is being removed. Skip
queuing event handling if the status is all 1's, which should be seen
only if the device is not present.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/pcie-dpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Wei Zhang May 10, 2017, 3:39 a.m. UTC | #1
Hi Keith and Wes,

I wonder if getting an all 1’s read on the DPC status register is a valid scenario. The DPC register is on the switch, so why would the status register return all 1’s even if the device is removed?

Thanks,
-Wei

--
wei zhang | software engineer | facebook
wzhang@fb.com | (408) 460-4803

Keith Busch May 10, 2017, 1:17 p.m. UTC | #2
On Wed, May 10, 2017 at 03:39:27AM +0000, Wei Zhang wrote:
> Hi Keith and Wes,
>
> I wonder if getting an all 1’s read on the DPC status register is a
> valid scenario. The DPC register is on the switch, so why would the
> status register return all 1’s even if the device is removed?

Ah, this isn't about the downstream device's presence. This is about
the DPC switch device itself, like if you pull the cable out of the
enclosure. Reading anything off the DSPs in it will see all 1's.
Wesley Yung May 10, 2017, 2:16 p.m. UTC | #3
Oh, I see. If the whole switch is removed, any attempt to read the DSPs that were there would result in a timeout, and that looks like all 1s.

Cheers
Wes

Wei Zhang May 10, 2017, 4:35 p.m. UTC | #4
Hi Keith,

I see. I thought the current CPU root complex does not support such a use case, i.e., removing the DPC switch device itself, and that doing so might result in a kernel panic. But I agree this will make the code future-proof for when the CPU does support such a case.

Thanks for the clarification!
-Wei

--
wei zhang | software engineer | facebook
wzhang@fb.com | (408) 460-4803

Wesley Yung May 10, 2017, 4:43 p.m. UTC | #5
Hey Keith,

Do the RC downstream ports also have DPC capability, so that the driver actually loads for the Root Port DSPs? Just trying to understand the mechanism.

For Wei's specific use case, removing a switch from the RP is not officially supported but technically it should be possible to do so.  On the switch end we just need to know where the ecosystem is at so we can make sure to play nice.

Wes

Keith Busch May 10, 2017, 4:44 p.m. UTC | #6
On Wed, May 10, 2017 at 04:35:06PM +0000, Wei Zhang wrote:
> Hi Keith,
>
> I see. I thought the current CPU root complex does not support such a use case, i.e., removing the DPC switch device itself, and that doing so might result in a kernel panic. But I agree this will make the code future-proof for when the CPU does support such a case.

What do you mean in the future? I do this today (hotplug enclosures),
but I need this fix in place; otherwise we get a use-after-free when
the DPC work queue runs after the hotplug code has freed the topology
that includes the DPC parts.
Wesley Yung May 10, 2017, 4:52 p.m. UTC | #7
Just to add more colour:

If a switch is removed from the RP, all non-posted requests will time out. This includes a config read of any switch P2P. I believe Keith is saying that if a read comes down to read the DPC status on a switch DS P2P, that read will time out after the host TMO. Once that occurs, the IIO will generate all 1s back to the DPC driver.

It means all the back end drivers (DPC, AER, Hot Plug, NVMe) need to have some understanding that all 1s means "something happened". In the case of the DPC driver, we know the DPC status should never have 1s in the most significant bits, so a read returning all 1s is a special value that needs to be handled.

Wes
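
Wes's point that every back end driver needs to recognize all 1s can be illustrated with a small userspace sketch. This is not kernel code: the helper names `cfg_word_is_all_ones` and `cfg_dword_is_all_ones` are hypothetical, and the only assumption taken from the thread is that a config read of an unreachable device comes back with every bit set.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel APIs): report whether a value
 * returned by a config read has every bit set, which the thread
 * treats as "the device did not answer".
 */
static bool cfg_word_is_all_ones(uint16_t val)
{
	return val == UINT16_MAX;	/* 0xFFFF for a 16-bit register */
}

static bool cfg_dword_is_all_ones(uint32_t val)
{
	return val == UINT32_MAX;	/* 0xFFFFFFFF for a 32-bit register */
}
```

A check like this is only safe for registers where all 1s can never be a legitimate value, which is the property Wes describes for the DPC status register.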

Keith Busch May 10, 2017, 5:42 p.m. UTC | #8
Oh, I see what you're asking about. I'm not testing DPC-capable RPs; I
do not have such capabilities on my platforms. In fact, the machines I
have are quite old, Sandy Bridge generation.

I have a switch connected to the RP, and the external enclosures are
downstream of that. This topology has multiple levels of switches and
devices, but, as you expected, I still can't unplug the top level; I
can only unplug from somewhere in the middle.

Removing the enclosure triggers DPC on the switch connected to the RP.
Sometimes the interrupt handler for the now-removed enclosure switches
is triggered as well. The driver checking the DPC status gets a
Completer Abort and the reader sees all 1's, so this is not a CTO type
situation.

On Wed, May 10, 2017 at 04:52:16PM +0000, Wesley Yung wrote:
> Just to add more colour:
> 
> If a switch is removed from the RP.  All non posted's will time out.  This includes a config read of any switch P2P.  I believe Keith is saying if a read comes down to read the DPC status on a Switch DS P2P that read will time out after the host TMO.  Once that occurs, the IIO will generate all 1s back to the DPC driver. 
> 
> It means all the back end drivers (DPC, AER, Hot Plug, NVMe) need to have some understanding that all 1s means "something happened".  In the case of the DPC driver, we know the DPC status should never have 1s in the most significant bits so returning all 1s is a special return type that needs to be handled.
> 
> Wes
> 
> -----Original Message-----
> From: Wesley Yung 
> Sent: Wednesday, May 10, 2017 9:43 AM
> To: 'Keith Busch' <keith.busch@intel.com>; Wei Zhang <wzhang@fb.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>; linux-pci@vger.kernel.org; Sammy Hui <sammy.hui@intel.com>; Joseph Gruher <joseph.r.gruher@intel.com>; Krishna Dhulipala <krishnad@fb.com>
> Subject: RE: [PATCH 1/2] pcie/dpc: Skip DPC event if device is not present
> 
> Hey Keith,
> 
> Does the RC Downstream ports also have DPC capability so the driver actually loads for the Root Port DSPs?  Just trying to understand the mechanism.
> 
> For Wei's specific use case, removing a switch from the RP is not officially supported but technically it should be possible to do so.  On the switch end we just need to know where the ecosystem is at so we can make sure to play nice.
> 
> Wes
Patch

diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c
index d4d70ef..fc2a4a7 100644
--- a/drivers/pci/pcie/pcie-dpc.c
+++ b/drivers/pci/pcie/pcie-dpc.c
@@ -87,7 +87,7 @@  static irqreturn_t dpc_irq(int irq, void *context)
 	pci_read_config_word(pdev, dpc->cap_pos + PCI_EXP_DPC_STATUS, &status);
 	pci_read_config_word(pdev, dpc->cap_pos + PCI_EXP_DPC_SOURCE_ID,
 			     &source);
-	if (!status)
+	if (!status || status == (u16)(~0))
 		return IRQ_NONE;
 
 	dev_info(&dpc->dev->device, "DPC containment event, status:%#06x source:%#06x\n",
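
The effect of the one-line change can be modeled outside the kernel. The sketch below is a userspace approximation under stated assumptions: `model_port`, `model_read_status`, `model_dpc_irq` and the `MODEL_IRQ_*` values are hypothetical stand-ins rather than kernel APIs, and a removed device is assumed to read back as all 1's, as described in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins; every name here is hypothetical, not a kernel API. */
enum model_irqreturn { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

struct model_port {
	bool present;		/* false once the switch has been unplugged */
	uint16_t dpc_status;	/* value the status register would hold */
};

/* Mimic a config read: an unreachable device reads back as all 1's. */
static uint16_t model_read_status(const struct model_port *port)
{
	return port->present ? port->dpc_status : (uint16_t)~0;
}

/* Same test as the patched dpc_irq(): bail out on 0 or on all 1's. */
static enum model_irqreturn model_dpc_irq(const struct model_port *port)
{
	uint16_t status = model_read_status(port);

	if (!status || status == (uint16_t)(~0))
		return MODEL_IRQ_NONE;

	/* The real handler would log and queue event handling here. */
	return MODEL_IRQ_HANDLED;
}
```

The cast in the comparison matters: in C, a 16-bit `status` is promoted to `int` before comparing, so comparing it against a bare `~0` (which is -1) would never match.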