diff mbox

[RFC] Let device drivers disable msi on shutdown

Message ID 1403628537-16367-1-git-send-email-keith.busch@intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Keith Busch June 24, 2014, 4:48 p.m. UTC
I'd like to do shutdowns asynchronously so I can shutdown multiple
devices in parallel, but the pci-driver disables interrupts after my
driver returns from its '.shutdown', though it needs to rely on these
interrupts in its asynchronously scheduled shutdown.

I tracked the reason for pci disabling msi to ...

| commit d52877c7b1afb8c37ebe17e2005040b79cb618b0
| Author: Yinghai Lu <yhlu.kernel.send@gmail.com>
| Date:   Wed Apr 23 14:58:09 2008 -0700
|
|     pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2

... because mptfusion doesn't disable msi in its shutdown path.

Any reason we can't let the drivers do this instead?

To provide context why I want to do this asynchronously, NVM-Express has
one PCI device per controller, of which there could be dozens in a system,
and each one may take many seconds (I've heard over ten in some cases)
to safely shutdown.

In this patch, mptfusion was compile tested only; I didn't observe any
adverse affects from running the pci portion.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/message/fusion/mptscsih.c |    3 +++
 drivers/pci/pci-driver.c          |    2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

Comments

Elliott, Robert (Server Storage) June 24, 2014, 9:21 p.m. UTC | #1
> -----Original Message-----
> From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-
> owner@vger.kernel.org] On Behalf Of Keith Busch
> Sent: Tuesday, 24 June, 2014 11:49 AM
> To: linux-pci@vger.kernel.org; linux-scsi@vger.kernel.org
> Cc: Keith Busch; Nagalakshmi Nandigama; Sreekanth Reddy; Bjorn Helgaas
> Subject: [RFC PATCH] Let device drivers disable msi on shutdown
> 
> I'd like to do shutdowns asynchronously so I can shutdown multiple
> devices in parallel, but the pci-driver disables interrupts after my
> driver returns from its '.shutdown', though it needs to rely on these
> interrupts in its asynchronously scheduled shutdown.
> 
> I tracked the reason for pci disabling msi to ...
> 
> | commit d52877c7b1afb8c37ebe17e2005040b79cb618b0
> | Author: Yinghai Lu <yhlu.kernel.send@gmail.com>
> | Date:   Wed Apr 23 14:58:09 2008 -0700
> |
> |     pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2
> 
> ... because mptfusion doesn't disable msi in its shutdown path.
> 
> Any reason we can't let the drivers do this instead?
> 
> To provide context why I want to do this asynchronously, NVM-Express has
> one PCI device per controller, of which there could be dozens in a system,
> and each one may take many seconds (I've heard over ten in some cases)
> to safely shutdown.
> 
> In this patch, mptfusion was compile tested only; I didn't observe any
> adverse affects from running the pci portion.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/message/fusion/mptscsih.c |    3 +++
>  drivers/pci/pci-driver.c          |    2 --
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/message/fusion/mptscsih.c
> b/drivers/message/fusion/mptscsih.c
> index 2a1c6f2..3186e17 100644
> --- a/drivers/message/fusion/mptscsih.c
> +++ b/drivers/message/fusion/mptscsih.c
> @@ -1215,6 +1215,9 @@ mptscsih_remove(struct pci_dev *pdev)
>  void
>  mptscsih_shutdown(struct pci_dev *pdev)
>  {
> +	MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
> +	if (ioc->msi_enable)
> +		pci_disable_msi(pdev);
>  }
> 
>  #ifdef CONFIG_PM
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3f8e3db..8079d98 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -453,8 +453,6 @@ static void pci_device_shutdown(struct device *dev)
> 
>  	if (drv && drv->shutdown)
>  		drv->shutdown(pci_dev);
> -	pci_msi_shutdown(pci_dev);
> -	pci_msix_shutdown(pci_dev);
> 
>  #ifdef CONFIG_KEXEC
>  	/*
> --
> 1.7.10.4
> 

1. That will cover the .shutdown function used by mptfc.c, mptspi.c, 
and mptscsih.c, but mptsas.c uses mptsas_shutdown rather than 
mptscsih_shutdown.  It doesn't call pci_disable_msi either.

2. mptscsih_suspend is another caller of mptscsih_shutdown
(although the latter does nothing right now).  This is done
before calling mpt_suspend, which writes to some chip registers 
to disable interrupts before calling pci_disable_msi.  Adding a 
pci_disable_msi call before those writes might not be safe.  

3. That mptscsih_suspend call chain will result in calling 
pci_disable_msi twice, which might trigger this in 
pci_disable_msi -> pci_msi_shutdown:
        BUG_ON(list_empty(&dev->msi_list));


---
Rob Elliott    HP Server Storage



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch June 24, 2014, 10:25 p.m. UTC | #2
On Tue, 24 Jun 2014, Elliott, Robert (Server Storage) wrote:
> 1. That will cover the .shutdown function used by mptfc.c, mptspi.c,
> and mptscsih.c, but mptsas.c uses mptsas_shutdown rather than
> mptscsih_shutdown.  It doesn't call pci_disable_msi either.

Missed that; thanks.

> 2. mptscsih_suspend is another caller of mptscsih_shutdown
> (although the latter does nothing right now).  This is done
> before calling mpt_suspend, which writes to some chip registers
> to disable interrupts before calling pci_disable_msi.  Adding a
> pci_disable_msi call before those writes might not be safe.

Clearly more paths into this function than I investigated.

> 3. That mptscsih_suspend call chain will result in calling
> pci_disable_msi twice, which might trigger this in
> pci_disable_msi -> pci_msi_shutdown:
>        BUG_ON(list_empty(&dev->msi_list));

Ah, but pci_disable_msi will not even invoke pci_msi_shutdown:

         if (!pci_msi_enable || !dev || !dev->msi_enabled)
 	                return;

The same check is done in pci_msi_shutdown. If that check wasn't there,
every driver would hit that BUG_ON since most every other pci driver
disables their interrupts in their shutdown path before returning, and
then pci-driver calls it again! MPT appears to be the exception.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 10, 2014, 6:38 p.m. UTC | #3
[+cc LKML, Greg KH for driver core async shutdown question]

On Tue, Jun 24, 2014 at 10:48:57AM -0600, Keith Busch wrote:
> I'd like to do shutdowns asynchronously so I can shutdown multiple
> devices in parallel, but the pci-driver disables interrupts after my
> driver returns from its '.shutdown', though it needs to rely on these
> interrupts in its asynchronously scheduled shutdown.
> 
> I tracked the reason for pci disabling msi to ...
> 
> | commit d52877c7b1afb8c37ebe17e2005040b79cb618b0
> | Author: Yinghai Lu <yhlu.kernel.send@gmail.com>
> | Date:   Wed Apr 23 14:58:09 2008 -0700
> |
> |     pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2
> 
> ... because mptfusion doesn't disable msi in its shutdown path.
> 
> Any reason we can't let the drivers do this instead?

I guess they *could*, but my general idea is that when we can do
something in the core, we probably should, because then we're not
depending on all the drivers to do the right thing.  So I would turn
this around and say "we need a pretty good reason to move something
out of the core and into the drivers."

A lot of the issues in the shutdown path are related to getting the
device to shut up so it doesn't cause problems with kexec.  So it does
seem like a good idea to have pci_device_shutdown() disable MSI to
remove one source of bothersome interrupts.

> To provide context why I want to do this asynchronously, NVM-Express has
> one PCI device per controller, of which there could be dozens in a system,
> and each one may take many seconds (I've heard over ten in some cases)
> to safely shutdown.

I don't see anything in device_shutdown() that would wait for this
sort of asynchronous shutdown to complete.  So how do we know it's
finished before we turn off the power, reset, kexec, etc.?

If we need to do asynchronous shutdown, it seems like we need some
sort of driver core infrastructure to manage that.

> In this patch, mptfusion was compile tested only; I didn't observe any
> adverse affects from running the pci portion.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/message/fusion/mptscsih.c |    3 +++
>  drivers/pci/pci-driver.c          |    2 --
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c
> index 2a1c6f2..3186e17 100644
> --- a/drivers/message/fusion/mptscsih.c
> +++ b/drivers/message/fusion/mptscsih.c
> @@ -1215,6 +1215,9 @@ mptscsih_remove(struct pci_dev *pdev)
>  void
>  mptscsih_shutdown(struct pci_dev *pdev)
>  {
> +	MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
> +	if (ioc->msi_enable)
> +		pci_disable_msi(pdev);
>  }
>  
>  #ifdef CONFIG_PM
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 3f8e3db..8079d98 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -453,8 +453,6 @@ static void pci_device_shutdown(struct device *dev)
>  
>  	if (drv && drv->shutdown)
>  		drv->shutdown(pci_dev);
> -	pci_msi_shutdown(pci_dev);
> -	pci_msix_shutdown(pci_dev);
>  
>  #ifdef CONFIG_KEXEC
>  	/*
> -- 
> 1.7.10.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch July 10, 2014, 6:53 p.m. UTC | #4
On Thu, 10 Jul 2014, Bjorn Helgaas wrote:
> [+cc LKML, Greg KH for driver core async shutdown question]
> On Tue, Jun 24, 2014 at 10:48:57AM -0600, Keith Busch wrote:
>> To provide context why I want to do this asynchronously, NVM-Express has
>> one PCI device per controller, of which there could be dozens in a system,
>> and each one may take many seconds (I've heard over ten in some cases)
>> to safely shutdown.
>
> I don't see anything in device_shutdown() that would wait for this
> sort of asynchronous shutdown to complete.  So how do we know it's
> finished before we turn off the power, reset, kexec, etc.?
>
> If we need to do asynchronous shutdown, it seems like we need some
> sort of driver core infrastructure to manage that.

Yes, good point! To address that, I did submit this patch:

http://lists.infradead.org/pipermail/linux-nvme/2014-May/000827.html

I need to fix the EXPORT_SYMBOL_GPL usage for a v2, but before that, I
wan't to know the reason the driver can't use MSIx in an async shutdown
shutdown, and came to the patch mentioned above.

I'd originally had the async shutdown use legacy interrupts, but I
know some NVMe devices do not support legacy, so can't use my original
proposal. If I can't rely on MSI/MSI-x being enabled in an async shutdown,
then I have to add polling, which I suppose we can live with.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 10, 2014, 7:53 p.m. UTC | #5
On Thu, Jul 10, 2014 at 12:53 PM, Keith Busch <keith.busch@intel.com> wrote:
> On Thu, 10 Jul 2014, Bjorn Helgaas wrote:
>>
>> [+cc LKML, Greg KH for driver core async shutdown question]
>> On Tue, Jun 24, 2014 at 10:48:57AM -0600, Keith Busch wrote:
>>>
>>> To provide context why I want to do this asynchronously, NVM-Express has
>>> one PCI device per controller, of which there could be dozens in a
>>> system,
>>> and each one may take many seconds (I've heard over ten in some cases)
>>> to safely shutdown.
>>
>>
>> I don't see anything in device_shutdown() that would wait for this
>> sort of asynchronous shutdown to complete.  So how do we know it's
>> finished before we turn off the power, reset, kexec, etc.?
>>
>> If we need to do asynchronous shutdown, it seems like we need some
>> sort of driver core infrastructure to manage that.
>
>
> Yes, good point! To address that, I did submit this patch:
>
> http://lists.infradead.org/pipermail/linux-nvme/2014-May/000827.html
>
> I need to fix the EXPORT_SYMBOL_GPL usage for a v2, but before that, I
> wan't to know the reason the driver can't use MSIx in an async shutdown
> shutdown, and came to the patch mentioned above.
>
> I'd originally had the async shutdown use legacy interrupts, but I
> know some NVMe devices do not support legacy, so can't use my original
> proposal. If I can't rely on MSI/MSI-x being enabled in an async shutdown,
> then I have to add polling, which I suppose we can live with.

OK, I wasn't aware of your async shutdown patch.  Given that, I would
think we'd want to somehow delay the rest of pci_device_shutdown()
until after the driver's async shutdown completes.  We don't want to
do either the MSI shutdown or the pci_clear_master() until after it
completes, do we?  And if we delay that, there would be no need to
move the MSI shutdown from the core to the driver.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c
index 2a1c6f2..3186e17 100644
--- a/drivers/message/fusion/mptscsih.c
+++ b/drivers/message/fusion/mptscsih.c
@@ -1215,6 +1215,9 @@  mptscsih_remove(struct pci_dev *pdev)
 void
 mptscsih_shutdown(struct pci_dev *pdev)
 {
+	MPT_ADAPTER *ioc = pci_get_drvdata(pdev);
+	if (ioc->msi_enable)
+		pci_disable_msi(pdev);
 }
 
 #ifdef CONFIG_PM
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3f8e3db..8079d98 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -453,8 +453,6 @@  static void pci_device_shutdown(struct device *dev)
 
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
-	pci_msi_shutdown(pci_dev);
-	pci_msix_shutdown(pci_dev);
 
 #ifdef CONFIG_KEXEC
 	/*