[V1,mlx5-next,04/13] PCI/IOV: Allow SRIOV VF drivers to reach the drvdata of a PF

Message ID 20211013094707.163054-5-yishaih@nvidia.com (mailing list archive)
State New, archived
Series Add mlx5 live migration driver

Commit Message

Yishai Hadas Oct. 13, 2021, 9:46 a.m. UTC
From: Jason Gunthorpe <jgg@nvidia.com>

There are some cases where a SRIOV VF driver will need to reach into and
interact with the PF driver. This requires accessing the drvdata of the PF.

Provide a function pci_iov_get_pf_drvdata() to return this PF drvdata in a
safe way. Normally accessing a drvdata of a foreign struct device would be
done using the device_lock() to protect against device driver
probe()/remove() races.

However, due to the design of pci_enable_sriov() this will result in an
ABBA deadlock on the device_lock as the PF's device_lock is held during PF
sriov_configure() while calling pci_enable_sriov() which in turn holds the
VF's device_lock while calling VF probe(), and similarly for remove.

This means the VF driver can never obtain the PF's device_lock.

Instead use the implicit locking created by pci_enable/disable_sriov(). A
VF driver can access its PF drvdata only while its own driver is attached,
and the PF driver can control access to its own drvdata based on when it
calls pci_enable/disable_sriov().

To use this API the PF driver will set up the PF drvdata in the probe()
function. pci_enable_sriov() is only called from sriov_configure() which
cannot happen until probe() completes, ensuring no VF races with drvdata
setup.

For removal, the PF driver must call pci_disable_sriov() in its remove
function before destroying any of the drvdata. This ensures that all VF
drivers are unbound before returning, fencing concurrent access to the
drvdata.

The introduction of a new function to do this access makes clear the
special locking scheme and documents the requirements on the PF/VF
drivers using this.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/pci/iov.c   | 29 +++++++++++++++++++++++++++++
 include/linux/pci.h |  7 +++++++
 2 files changed, 36 insertions(+)
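
Note (not part of the commit): the ordering rules described in the commit
message can be modelled in plain userspace C. The sketch below is only an
illustration under stated assumptions. struct model_pf, struct pf_state and
all model_* functions are hypothetical stand-ins invented for this note, not
kernel APIs, and the VF side is reduced to a counter of attached VF drivers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
struct pf_state {
	int num_vfs_requested;     /* example of PF driver private data */
};

struct model_pf {
	struct pf_state *drvdata;  /* what pci_set_drvdata() would store */
	int vfs_bound;             /* VF drivers currently attached */
};

/* Models pci_enable_sriov(): VFs appear and their drivers probe. */
static void model_enable_sriov(struct model_pf *pf, int nr)
{
	/* A VF probe() may read the PF drvdata, so it must exist already. */
	assert(pf->drvdata != NULL);
	pf->drvdata->num_vfs_requested = nr;
	pf->vfs_bound = nr;
}

/* Models pci_disable_sriov(): returns only after every VF remove(). */
static void model_disable_sriov(struct model_pf *pf)
{
	pf->vfs_bound = 0;
}

/* PF probe(): publish drvdata before SR-IOV can be enabled. */
static int model_pf_probe(struct model_pf *pf)
{
	pf->drvdata = calloc(1, sizeof(*pf->drvdata));
	pf->vfs_bound = 0;
	return pf->drvdata ? 0 : -ENOMEM;
}

/* PF remove(): fence VF access first, then tear down drvdata. */
static void model_pf_remove(struct model_pf *pf)
{
	model_disable_sriov(pf);
	assert(pf->vfs_bound == 0);  /* no VF can still hold the pointer */
	free(pf->drvdata);
	pf->drvdata = NULL;
}
```

The point of the model is the order inside model_pf_remove(): freeing the
drvdata before the disable step would leave a window in which an attached VF
driver could still dereference it.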

Comments

Bjorn Helgaas Oct. 13, 2021, 6:27 p.m. UTC | #1
On Wed, Oct 13, 2021 at 12:46:58PM +0300, Yishai Hadas wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> There are some cases where a SRIOV VF driver will need to reach into and
> interact with the PF driver. This requires accessing the drvdata of the PF.
> 
> Provide a function pci_iov_get_pf_drvdata() to return this PF drvdata in a
> safe way. Normally accessing a drvdata of a foreign struct device would be
> done using the device_lock() to protect against device driver
> probe()/remove() races.
> 
> However, due to the design of pci_enable_sriov() this will result in a
> ABBA deadlock on the device_lock as the PF's device_lock is held during PF
> sriov_configure() while calling pci_enable_sriov() which in turn holds the
> VF's device_lock while calling VF probe(), and similarly for remove.
> 
> This means the VF driver can never obtain the PF's device_lock.
> 
> Instead use the implicit locking created by pci_enable/disable_sriov(). A
> VF driver can access its PF drvdata only while its own driver is attached,
> and the PF driver can control access to its own drvdata based on when it
> calls pci_enable/disable_sriov().
> 
> To use this API the PF driver will setup the PF drvdata in the probe()
> function. pci_enable_sriov() is only called from sriov_configure() which
> cannot happen until probe() completes, ensuring no VF races with drvdata
> setup.
> 
> For removal, the PF driver must call pci_disable_sriov() in its remove
> function before destroying any of the drvdata. This ensures that all VF
> drivers are unbound before returning, fencing concurrent access to the
> drvdata.
> 
> The introduction of a new function to do this access makes clear the
> special locking scheme and the documents the requirements on the PF/VF
> drivers using this.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Nit: s/SRIOV/SR-IOV/ above so it matches usage in the spec.

I think it's nice to include the actual interface in the subject when
practical.

> ---
>  drivers/pci/iov.c   | 29 +++++++++++++++++++++++++++++
>  include/linux/pci.h |  7 +++++++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index e7751fa3fe0b..ca696730f761 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -47,6 +47,35 @@ int pci_iov_vf_id(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pci_iov_vf_id);
>  
> +/**
> + * pci_iov_get_pf_drvdata - Return the drvdata of a PF
> + * @dev - VF pci_dev
> + * @pf_driver - Device driver required to own the PF
> + *
> + * This must be called from a context that ensures that a VF driver is attached.
> + * The value returned is invalid once the VF driver completes its remove()
> + * callback.
> + *
> + * Locking is achieved by the driver core. A VF driver cannot be probed until
> + * pci_enable_sriov() is called and pci_disable_sriov() does not return until
> + * all VF drivers have completed their remove().
> + *
> + * The PF driver must call pci_disable_sriov() before it begins to destroy the
> + * drvdata.
> + */
> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
> +{
> +	struct pci_dev *pf_dev;
> +
> +	if (dev->is_physfn)
> +		return ERR_PTR(-EINVAL);
> +	pf_dev = dev->physfn;
> +	if (pf_dev->driver != pf_driver)
> +		return ERR_PTR(-EINVAL);
> +	return pci_get_drvdata(pf_dev);
> +}
> +EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);
> +
>  /*
>   * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
>   * change when NumVFs changes.
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2337512e67f0..639a0a239774 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -2154,6 +2154,7 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
>  int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
>  int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
>  int pci_iov_vf_id(struct pci_dev *dev);
> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver);
>  int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
>  void pci_disable_sriov(struct pci_dev *dev);
>  
> @@ -2187,6 +2188,12 @@ static inline int pci_iov_vf_id(struct pci_dev *dev)
>  	return -ENOSYS;
>  }
>  
> +static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
> +					   struct pci_driver *pf_driver)
> +{
> +	return ERR_PTR(-EINVAL);
> +}
> +
>  static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
>  { return -ENODEV; }
>  
> -- 
> 2.18.1
>

Alex Williamson Oct. 14, 2021, 10:11 p.m. UTC | #2
On Wed, 13 Oct 2021 12:46:58 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> There are some cases where a SRIOV VF driver will need to reach into and
> interact with the PF driver. This requires accessing the drvdata of the PF.
> 
> Provide a function pci_iov_get_pf_drvdata() to return this PF drvdata in a
> safe way. Normally accessing a drvdata of a foreign struct device would be
> done using the device_lock() to protect against device driver
> probe()/remove() races.
> 
> However, due to the design of pci_enable_sriov() this will result in a
> ABBA deadlock on the device_lock as the PF's device_lock is held during PF
> sriov_configure() while calling pci_enable_sriov() which in turn holds the
> VF's device_lock while calling VF probe(), and similarly for remove.
> 
> This means the VF driver can never obtain the PF's device_lock.
> 
> Instead use the implicit locking created by pci_enable/disable_sriov(). A
> VF driver can access its PF drvdata only while its own driver is attached,
> and the PF driver can control access to its own drvdata based on when it
> calls pci_enable/disable_sriov().
> 
> To use this API the PF driver will setup the PF drvdata in the probe()
> function. pci_enable_sriov() is only called from sriov_configure() which
> cannot happen until probe() completes, ensuring no VF races with drvdata
> setup.
> 
> For removal, the PF driver must call pci_disable_sriov() in its remove
> function before destroying any of the drvdata. This ensures that all VF
> drivers are unbound before returning, fencing concurrent access to the
> drvdata.
> 
> The introduction of a new function to do this access makes clear the
> special locking scheme and the documents the requirements on the PF/VF
> drivers using this.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  drivers/pci/iov.c   | 29 +++++++++++++++++++++++++++++
>  include/linux/pci.h |  7 +++++++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index e7751fa3fe0b..ca696730f761 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -47,6 +47,35 @@ int pci_iov_vf_id(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pci_iov_vf_id);
>  
> +/**
> + * pci_iov_get_pf_drvdata - Return the drvdata of a PF
> + * @dev - VF pci_dev
> + * @pf_driver - Device driver required to own the PF
> + *
> + * This must be called from a context that ensures that a VF driver is attached.
> + * The value returned is invalid once the VF driver completes its remove()
> + * callback.
> + *
> + * Locking is achieved by the driver core. A VF driver cannot be probed until
> + * pci_enable_sriov() is called and pci_disable_sriov() does not return until
> + * all VF drivers have completed their remove().
> + *
> + * The PF driver must call pci_disable_sriov() before it begins to destroy the
> + * drvdata.
> + */
> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
> +{
> +	struct pci_dev *pf_dev;
> +
> +	if (dev->is_physfn)
> +		return ERR_PTR(-EINVAL);

I think we're trying to make this only accessible to VFs, so shouldn't
we test (!dev->is_virtfn)?  is_physfn will be zero for either a PF with
failed SR-IOV configuration or for a non-SR-IOV device afaict.  Thanks,

Alex

> +	pf_dev = dev->physfn;
> +	if (pf_dev->driver != pf_driver)
> +		return ERR_PTR(-EINVAL);
> +	return pci_get_drvdata(pf_dev);
> +}
> +EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);
> +
>  /*
>   * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
>   * change when NumVFs changes.
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2337512e67f0..639a0a239774 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -2154,6 +2154,7 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
>  int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
>  int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
>  int pci_iov_vf_id(struct pci_dev *dev);
> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver);
>  int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
>  void pci_disable_sriov(struct pci_dev *dev);
>  
> @@ -2187,6 +2188,12 @@ static inline int pci_iov_vf_id(struct pci_dev *dev)
>  	return -ENOSYS;
>  }
>  
> +static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
> +					   struct pci_driver *pf_driver)
> +{
> +	return ERR_PTR(-EINVAL);
> +}
> +
>  static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
>  { return -ENODEV; }
>

Yishai Hadas Oct. 17, 2021, 1:43 p.m. UTC | #3
On 10/15/2021 1:11 AM, Alex Williamson wrote:
> On Wed, 13 Oct 2021 12:46:58 +0300
> Yishai Hadas <yishaih@nvidia.com> wrote:
>
>> From: Jason Gunthorpe <jgg@nvidia.com>
>>
>> There are some cases where a SRIOV VF driver will need to reach into and
>> interact with the PF driver. This requires accessing the drvdata of the PF.
>>
>> Provide a function pci_iov_get_pf_drvdata() to return this PF drvdata in a
>> safe way. Normally accessing a drvdata of a foreign struct device would be
>> done using the device_lock() to protect against device driver
>> probe()/remove() races.
>>
>> However, due to the design of pci_enable_sriov() this will result in a
>> ABBA deadlock on the device_lock as the PF's device_lock is held during PF
>> sriov_configure() while calling pci_enable_sriov() which in turn holds the
>> VF's device_lock while calling VF probe(), and similarly for remove.
>>
>> This means the VF driver can never obtain the PF's device_lock.
>>
>> Instead use the implicit locking created by pci_enable/disable_sriov(). A
>> VF driver can access its PF drvdata only while its own driver is attached,
>> and the PF driver can control access to its own drvdata based on when it
>> calls pci_enable/disable_sriov().
>>
>> To use this API the PF driver will setup the PF drvdata in the probe()
>> function. pci_enable_sriov() is only called from sriov_configure() which
>> cannot happen until probe() completes, ensuring no VF races with drvdata
>> setup.
>>
>> For removal, the PF driver must call pci_disable_sriov() in its remove
>> function before destroying any of the drvdata. This ensures that all VF
>> drivers are unbound before returning, fencing concurrent access to the
>> drvdata.
>>
>> The introduction of a new function to do this access makes clear the
>> special locking scheme and the documents the requirements on the PF/VF
>> drivers using this.
>>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
>> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
>> ---
>>   drivers/pci/iov.c   | 29 +++++++++++++++++++++++++++++
>>   include/linux/pci.h |  7 +++++++
>>   2 files changed, 36 insertions(+)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index e7751fa3fe0b..ca696730f761 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -47,6 +47,35 @@ int pci_iov_vf_id(struct pci_dev *dev)
>>   }
>>   EXPORT_SYMBOL_GPL(pci_iov_vf_id);
>>   
>> +/**
>> + * pci_iov_get_pf_drvdata - Return the drvdata of a PF
>> + * @dev - VF pci_dev
>> + * @pf_driver - Device driver required to own the PF
>> + *
>> + * This must be called from a context that ensures that a VF driver is attached.
>> + * The value returned is invalid once the VF driver completes its remove()
>> + * callback.
>> + *
>> + * Locking is achieved by the driver core. A VF driver cannot be probed until
>> + * pci_enable_sriov() is called and pci_disable_sriov() does not return until
>> + * all VF drivers have completed their remove().
>> + *
>> + * The PF driver must call pci_disable_sriov() before it begins to destroy the
>> + * drvdata.
>> + */
>> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
>> +{
>> +	struct pci_dev *pf_dev;
>> +
>> +	if (dev->is_physfn)
>> +		return ERR_PTR(-EINVAL);
> I think we're trying to make this only accessible to VFs, so shouldn't
> we test (!dev->is_virtfn)?  is_physfn will be zero for either a PF with
> failed SR-IOV configuration or for a non-SR-IOV device afaict.  Thanks,
>
> Alex
>

Yes, this should be accessible only for VFs.

We can go with your suggestion to explicitly check (!dev->is_virtfn) as 
this seems cleaner and safer as you mentioned.

We already got an ACK on this patch from Bjorn, but as your suggestion seems
straightforward I may include the Acked-by in V2 in any case.

Yishai
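
For illustration, here is a minimal userspace model of how the check might
look with the (!dev->is_virtfn) test discussed in this thread. The struct
definitions and the ERR_PTR()-style helpers are simplified stand-ins written
for this sketch, and get_pf_drvdata_model() is a hypothetical name; the
actual V2 patch may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types; only the fields the check reads. */
struct pci_driver {
	const char *name;
};

struct pci_dev {
	struct pci_dev *physfn;     /* on a VF: points at its PF */
	struct pci_driver *driver;  /* currently bound driver, if any */
	void *drvdata;
	unsigned int is_physfn:1;   /* PF with SR-IOV successfully set up */
	unsigned int is_virtfn:1;   /* device is a VF */
};

/* Minimal models of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. */
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

/* Model of the lookup with the stricter "must be a VF" test. */
static void *get_pf_drvdata_model(struct pci_dev *dev,
				  struct pci_driver *pf_driver)
{
	assert(dev != NULL);

	/* Rejects PFs, PFs whose SR-IOV setup failed, and non-SR-IOV devices. */
	if (!dev->is_virtfn)
		return ERR_PTR(-EINVAL);
	/* Only hand out drvdata if the expected driver owns the PF. */
	if (dev->physfn->driver != pf_driver)
		return ERR_PTR(-EINVAL);
	return dev->physfn->drvdata;
}
```

With the V1 test (dev->is_physfn), a device that has both flags clear would
pass the first check and the function would go on to dereference
dev->physfn, which in this model is NULL for such a device. Testing
is_virtfn closes that path.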

Patch

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index e7751fa3fe0b..ca696730f761 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -47,6 +47,35 @@  int pci_iov_vf_id(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_iov_vf_id);
 
+/**
+ * pci_iov_get_pf_drvdata - Return the drvdata of a PF
+ * @dev - VF pci_dev
+ * @pf_driver - Device driver required to own the PF
+ *
+ * This must be called from a context that ensures that a VF driver is attached.
+ * The value returned is invalid once the VF driver completes its remove()
+ * callback.
+ *
+ * Locking is achieved by the driver core. A VF driver cannot be probed until
+ * pci_enable_sriov() is called and pci_disable_sriov() does not return until
+ * all VF drivers have completed their remove().
+ *
+ * The PF driver must call pci_disable_sriov() before it begins to destroy the
+ * drvdata.
+ */
+void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
+{
+	struct pci_dev *pf_dev;
+
+	if (dev->is_physfn)
+		return ERR_PTR(-EINVAL);
+	pf_dev = dev->physfn;
+	if (pf_dev->driver != pf_driver)
+		return ERR_PTR(-EINVAL);
+	return pci_get_drvdata(pf_dev);
+}
+EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);
+
 /*
  * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
  * change when NumVFs changes.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2337512e67f0..639a0a239774 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2154,6 +2154,7 @@  void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
 int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
 int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 int pci_iov_vf_id(struct pci_dev *dev);
+void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver);
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
 
@@ -2187,6 +2188,12 @@  static inline int pci_iov_vf_id(struct pci_dev *dev)
 	return -ENOSYS;
 }
 
+static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
+					   struct pci_driver *pf_driver)
+{
+	return ERR_PTR(-EINVAL);
+}
+
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 { return -ENODEV; }