diff mbox

[1/2] pci: Error disabling SR-IOV if in VFs assigned

Message ID 1463521199-16604-1-git-send-email-keith.busch@intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Keith Busch May 17, 2016, 9:39 p.m. UTC
Every sriov capable driver has to check if any guest is using a virtual
function prior to disabling, so let's make it common code.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pci-sysfs.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Alex Williamson May 17, 2016, 10:08 p.m. UTC | #1
On Tue, 17 May 2016 15:39:58 -0600
Keith Busch <keith.busch@intel.com> wrote:

> Every sriov capable driver has to check if any guest is using a virtual
> function prior to disabling, so let's make it common code.

This is not true, the PCI_DEV_FLAGS_ASSIGNED flag is inherently racy,
so checking it is really only a courtesy for broken drivers that
still make use of it.  I don't object to adding it here, though I
wish the entire interface was deprecated, but it's only a minimal amount
of insurance as a VF might get assigned immediately following your added
test or might not participate in the assigned device flagging at all.  I
believe the better way to handle this is with proper host drivers for
assigned devices that manage the driver .remove callback properly while
devices are in use.  The only reason to handle assigned devices
specially in this case is when they don't have proper host drivers
managing them, which is a problem that we've fixed.  Thanks,

Alex

> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/pci-sysfs.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 342b691..5011fa9 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -487,6 +487,11 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>  
>  	if (num_vfs == 0) {
>  		/* disable VFs */
> +		if (pci_vfs_assigned(pdev)) {
> +			dev_warn(&pdev->dev,
> +				"Cannot disable SR-IOV VFs while assigned\n");
> +			return -EPERM;
> +		}
>  		ret = pdev->driver->sriov_configure(pdev, 0);
>  		if (ret < 0)
>  			return ret;
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig May 23, 2016, 10:55 a.m. UTC | #2
On Tue, May 17, 2016 at 04:08:32PM -0600, Alex Williamson wrote:
> On Tue, 17 May 2016 15:39:58 -0600
> Keith Busch <keith.busch@intel.com> wrote:
> 
> > Every sriov capable driver has to check if any guest is using a virtual
> > function prior to disabling, so let's make it common code.
> 
> This is not true, the PCI_DEV_FLAGS_ASSIGNED flag is inherently racy,
> so checking it is really only a courtesy for broken drivers that
> still make use of it.  I don't object to adding it here, though I
> wish the entire interface was deprecated, but it's only a minimal amount
> of insurance as a VF might get assigned immediately following your added
> test or might not participate in the assigned device flagging at all.

Si should we just kill it? As far as I can tell it's only used in these
kinds of boilerplate checks.

> I
> believe the better way to handle this is with proper host drivers for
> assigned devices that manage the driver .remove callback properly while
> devices are in use.  The only reason to handle assigned devices
> specially in this case is when they don't have proper host drivers
> managing them, which is a problem that we've fixed.  Thanks,

We always use pci-stub now, don't we?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Williamson May 23, 2016, 3:07 p.m. UTC | #3
On Mon, 23 May 2016 03:55:32 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, May 17, 2016 at 04:08:32PM -0600, Alex Williamson wrote:
> > On Tue, 17 May 2016 15:39:58 -0600
> > Keith Busch <keith.busch@intel.com> wrote:
> >   
> > > Every sriov capable driver has to check if any guest is using a virtual
> > > function prior to disabling, so let's make it common code.  
> > 
> > This is not true, the PCI_DEV_FLAGS_ASSIGNED flag is inherently racy,
> > so checking it is really only a courtesy for broken drivers that
> > still make use of it.  I don't object to adding it here, though I
> > wish the entire interface was deprecated, but it's only a minimal amount
> > of insurance as a VF might get assigned immediately following your added
> > test or might not participate in the assigned device flagging at all.  
> 
> Si should we just kill it? As far as I can tell it's only used in these
> kinds of boilerplate checks.

Long term, yes, but perhaps KVM legacy PCI device assignment needs to
be not only deprecated, but removed first.

> > I
> > believe the better way to handle this is with proper host drivers for
> > assigned devices that manage the driver .remove callback properly while
> > devices are in use.  The only reason to handle assigned devices
> > specially in this case is when they don't have proper host drivers
> > managing them, which is a problem that we've fixed.  Thanks,  
> 
> We always use pci-stub now, don't we?

pci-stub's only purpose is to prevent other drivers from binding to a
device, such as while it's in used by legacy KVM device assignment.
pci-stub has no visibility whether a device is in use or not, and will
happily unbind the device at any point in time.  Thus legacy KVM device
assignment with pci-stub makes use of this horrible flag in a vain
attempt to prevent devices from disappearing, littering every possible
remove path with these sorts of checks when really the driver holding
the assigned device should block or maybe allow an error return from
the .remove callback.  Xen pci-back also sets this flag, but I would
hope there's a sensible solution available there and they just adopted
use of this flag without really questioning that it works or makes
sense. Thanks,

Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig May 23, 2016, 3:10 p.m. UTC | #4
On Mon, May 23, 2016 at 09:07:11AM -0600, Alex Williamson wrote:
> On Mon, 23 May 2016 03:55:32 -0700
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Tue, May 17, 2016 at 04:08:32PM -0600, Alex Williamson wrote:
> > > On Tue, 17 May 2016 15:39:58 -0600
> > > Keith Busch <keith.busch@intel.com> wrote:
> > >   
> > > > Every sriov capable driver has to check if any guest is using a virtual
> > > > function prior to disabling, so let's make it common code.  
> > > 
> > > This is not true, the PCI_DEV_FLAGS_ASSIGNED flag is inherently racy,
> > > so checking it is really only a courtesy for broken drivers that
> > > still make use of it.  I don't object to adding it here, though I
> > > wish the entire interface was deprecated, but it's only a minimal amount
> > > of insurance as a VF might get assigned immediately following your added
> > > test or might not participate in the assigned device flagging at all.  
> > 
> > Si should we just kill it? As far as I can tell it's only used in these
> > kinds of boilerplate checks.
> 
> Long term, yes, but perhaps KVM legacy PCI device assignment needs to
> be not only deprecated, but removed first.

Oh well.  In that case I'd like to at least move it to the PCI core
so that drivers don't have to care about it.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 13, 2016, 9:14 p.m. UTC | #5
On Tue, May 17, 2016 at 03:39:58PM -0600, Keith Busch wrote:
> Every sriov capable driver has to check if any guest is using a virtual
> function prior to disabling, so let's make it common code.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>

If I understand the discussion correctly, this is still racy but
nobody objects to adding this until we have a better, non-racy
solution.

However, you added this in common code and took advantage of it in
nvme.  Good so far.  But we have about a dozen other drivers that call
pci_vfs_assigned().  I assume some of those places could be changed so
they take advantage of this check in the core instead?

Can we do that at the same time?  If we add good new stuff and only
use it one place, there's not as much overall goodness as there would
be if we updated everybody to do it similarly.

> ---
>  drivers/pci/pci-sysfs.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 342b691..5011fa9 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -487,6 +487,11 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>  
>  	if (num_vfs == 0) {
>  		/* disable VFs */
> +		if (pci_vfs_assigned(pdev)) {
> +			dev_warn(&pdev->dev,
> +				"Cannot disable SR-IOV VFs while assigned\n");
> +			return -EPERM;
> +		}
>  		ret = pdev->driver->sriov_configure(pdev, 0);
>  		if (ret < 0)
>  			return ret;
> -- 
> 2.7.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch June 13, 2016, 9:28 p.m. UTC | #6
On Mon, Jun 13, 2016 at 04:14:11PM -0500, Bjorn Helgaas wrote:
> On Tue, May 17, 2016 at 03:39:58PM -0600, Keith Busch wrote:
> > Every sriov capable driver has to check if any guest is using a virtual
> > function prior to disabling, so let's make it common code.
> > 
> > Signed-off-by: Keith Busch <keith.busch@intel.com>
> 
> If I understand the discussion correctly, this is still racy but
> nobody objects to adding this until we have a better, non-racy
> solution.

My understanding as well: there's a potential race, but no different
than what exists with today.
 
> However, you added this in common code and took advantage of it in
> nvme.  Good so far.  But we have about a dozen other drivers that call
> pci_vfs_assigned().  I assume some of those places could be changed so
> they take advantage of this check in the core instead?

> Can we do that at the same time?  If we add good new stuff and only
> use it one place, there's not as much overall goodness as there would
> be if we updated everybody to do it similarly.

Sounds good, I'll send a series taking advantage of this for all the
other PF drivers duplicating this check in their sriov_configure.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch June 13, 2016, 9:57 p.m. UTC | #7
On Mon, Jun 13, 2016 at 05:28:10PM -0400, Keith Busch wrote:
> On Mon, Jun 13, 2016 at 04:14:11PM -0500, Bjorn Helgaas wrote:
> > Can we do that at the same time?  If we add good new stuff and only
> > use it one place, there's not as much overall goodness as there would
> > be if we updated everybody to do it similarly.
> 
> Sounds good, I'll send a series taking advantage of this for all the
> other PF drivers duplicating this check in their sriov_configure.

Heh, I thought "no big deal", thinking all use was similar to
NVMe's. However, most network drivers have multiple paths to sriov
configuration, or have other requirements to changing the live
count! There's only two drivers I find that can immediately use the
simplification safely. I'll send those updates to just those ones.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 13, 2016, 10:26 p.m. UTC | #8
On Mon, Jun 13, 2016 at 05:57:37PM -0400, Keith Busch wrote:
> On Mon, Jun 13, 2016 at 05:28:10PM -0400, Keith Busch wrote:
> > On Mon, Jun 13, 2016 at 04:14:11PM -0500, Bjorn Helgaas wrote:
> > > Can we do that at the same time?  If we add good new stuff and only
> > > use it one place, there's not as much overall goodness as there would
> > > be if we updated everybody to do it similarly.
> > 
> > Sounds good, I'll send a series taking advantage of this for all the
> > other PF drivers duplicating this check in their sriov_configure.
> 
> Heh, I thought "no big deal", thinking all use was similar to
> NVMe's. However, most network drivers have multiple paths to sriov
> configuration, or have other requirements to changing the live
> count! There's only two drivers I find that can immediately use the
> simplification safely. I'll send those updates to just those ones.

My take on that is, "it's a pretty trival check; just put it directly
in nvme and don't bother doing anything in the PCI core."  Maybe
someday we'll come up with a way to make things more common.  It
doesn't really seem worth it to add something that only helps three
drivers.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch June 13, 2016, 10:35 p.m. UTC | #9
On Mon, Jun 13, 2016 at 05:26:30PM -0500, Bjorn Helgaas wrote:
> My take on that is, "it's a pretty trival check; just put it directly
> in nvme and don't bother doing anything in the PCI core."  Maybe
> someday we'll come up with a way to make things more common.  It
> doesn't really seem worth it to add something that only helps three
> drivers.

Fair enough. Will withdraw the check in pci and resend nvme.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig June 15, 2016, 10:26 a.m. UTC | #10
On Mon, Jun 13, 2016 at 06:35:52PM -0400, Keith Busch wrote:
> On Mon, Jun 13, 2016 at 05:26:30PM -0500, Bjorn Helgaas wrote:
> > My take on that is, "it's a pretty trival check; just put it directly
> > in nvme and don't bother doing anything in the PCI core."  Maybe
> > someday we'll come up with a way to make things more common.  It
> > doesn't really seem worth it to add something that only helps three
> > drivers.
> 
> Fair enough. Will withdraw the check in pci and resend nvme.

Sorry for leading you down this path..
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 15, 2016, 3:38 p.m. UTC | #11
On Wed, Jun 15, 2016 at 5:26 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Jun 13, 2016 at 06:35:52PM -0400, Keith Busch wrote:
>> On Mon, Jun 13, 2016 at 05:26:30PM -0500, Bjorn Helgaas wrote:
>> > My take on that is, "it's a pretty trival check; just put it directly
>> > in nvme and don't bother doing anything in the PCI core."  Maybe
>> > someday we'll come up with a way to make things more common.  It
>> > doesn't really seem worth it to add something that only helps three
>> > drivers.
>>
>> Fair enough. Will withdraw the check in pci and resend nvme.
>
> Sorry for leading you down this path..

No problem, I think it would be awesome if we could unify these
drivers somehow, and unification always involves lots of dead-ends :)
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 342b691..5011fa9 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -487,6 +487,11 @@  static ssize_t sriov_numvfs_store(struct device *dev,
 
 	if (num_vfs == 0) {
 		/* disable VFs */
+		if (pci_vfs_assigned(pdev)) {
+			dev_warn(&pdev->dev,
+				"Cannot disable SR-IOV VFs while assigned\n");
+			return -EPERM;
+		}
 		ret = pdev->driver->sriov_configure(pdev, 0);
 		if (ret < 0)
 			return ret;