Message ID | 20211013094707.163054-5-yishaih@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add mlx5 live migration driver |
On Wed, Oct 13, 2021 at 12:46:58PM +0300, Yishai Hadas wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
>
> There are some cases where a SRIOV VF driver will need to reach into and
> interact with the PF driver. This requires accessing the drvdata of the PF.
>
> Provide a function pci_iov_get_pf_drvdata() to return this PF drvdata in a
> safe way. Normally accessing a drvdata of a foreign struct device would be
> done using the device_lock() to protect against device driver
> probe()/remove() races.
>
> However, due to the design of pci_enable_sriov() this will result in a
> ABBA deadlock on the device_lock as the PF's device_lock is held during PF
> sriov_configure() while calling pci_enable_sriov() which in turn holds the
> VF's device_lock while calling VF probe(), and similarly for remove.
>
> This means the VF driver can never obtain the PF's device_lock.
>
> Instead use the implicit locking created by pci_enable/disable_sriov(). A
> VF driver can access its PF drvdata only while its own driver is attached,
> and the PF driver can control access to its own drvdata based on when it
> calls pci_enable/disable_sriov().
>
> To use this API the PF driver will setup the PF drvdata in the probe()
> function. pci_enable_sriov() is only called from sriov_configure() which
> cannot happen until probe() completes, ensuring no VF races with drvdata
> setup.
>
> For removal, the PF driver must call pci_disable_sriov() in its remove
> function before destroying any of the drvdata. This ensures that all VF
> drivers are unbound before returning, fencing concurrent access to the
> drvdata.
>
> The introduction of a new function to do this access makes clear the
> special locking scheme and the documents the requirements on the PF/VF
> drivers using this.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Nit: s/SRIOV/SR-IOV/ above so it matches usage in the spec.

I think it's nice to include the actual interface in the subject when
practical.

> ---
>  drivers/pci/iov.c   | 29 +++++++++++++++++++++++++++++
>  include/linux/pci.h |  7 +++++++
>  2 files changed, 36 insertions(+)
>
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index e7751fa3fe0b..ca696730f761 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -47,6 +47,35 @@ int pci_iov_vf_id(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pci_iov_vf_id);
>
> +/**
> + * pci_iov_get_pf_drvdata - Return the drvdata of a PF
> + * @dev - VF pci_dev
> + * @pf_driver - Device driver required to own the PF
> + *
> + * This must be called from a context that ensures that a VF driver is attached.
> + * The value returned is invalid once the VF driver completes its remove()
> + * callback.
> + *
> + * Locking is achieved by the driver core. A VF driver cannot be probed until
> + * pci_enable_sriov() is called and pci_disable_sriov() does not return until
> + * all VF drivers have completed their remove().
> + *
> + * The PF driver must call pci_disable_sriov() before it begins to destroy the
> + * drvdata.
> + */
> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
> +{
> +	struct pci_dev *pf_dev;
> +
> +	if (dev->is_physfn)
> +		return ERR_PTR(-EINVAL);
> +	pf_dev = dev->physfn;
> +	if (pf_dev->driver != pf_driver)
> +		return ERR_PTR(-EINVAL);
> +	return pci_get_drvdata(pf_dev);
> +}
> +EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);
> +
>  /*
>   * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
>   * change when NumVFs changes.
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2337512e67f0..639a0a239774 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -2154,6 +2154,7 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
>  int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
>  int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
>  int pci_iov_vf_id(struct pci_dev *dev);
> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver);
>  int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
>  void pci_disable_sriov(struct pci_dev *dev);
>
> @@ -2187,6 +2188,12 @@ static inline int pci_iov_vf_id(struct pci_dev *dev)
>  	return -ENOSYS;
>  }
>
> +static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
> +					   struct pci_driver *pf_driver)
> +{
> +	return ERR_PTR(-EINVAL);
> +}
> +
>  static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
>  { return -ENODEV; }
>
> --
> 2.18.1
>
On Wed, 13 Oct 2021 12:46:58 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> From: Jason Gunthorpe <jgg@nvidia.com>
>
> There are some cases where a SRIOV VF driver will need to reach into and
> interact with the PF driver. This requires accessing the drvdata of the PF.

[...]

> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
> +{
> +	struct pci_dev *pf_dev;
> +
> +	if (dev->is_physfn)
> +		return ERR_PTR(-EINVAL);

I think we're trying to make this only accessible to VFs, so shouldn't
we test (!dev->is_virtfn)?  is_physfn will be zero for either a PF with
failed SR-IOV configuration or for a non-SR-IOV device afaict.  Thanks,

Alex

> +	pf_dev = dev->physfn;
> +	if (pf_dev->driver != pf_driver)
> +		return ERR_PTR(-EINVAL);
> +	return pci_get_drvdata(pf_dev);
> +}
> +EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);

[...]
On 10/15/2021 1:11 AM, Alex Williamson wrote:
> On Wed, 13 Oct 2021 12:46:58 +0300
> Yishai Hadas <yishaih@nvidia.com> wrote:
>
[...]
>> +void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
>> +{
>> +	struct pci_dev *pf_dev;
>> +
>> +	if (dev->is_physfn)
>> +		return ERR_PTR(-EINVAL);
> I think we're trying to make this only accessible to VFs, so shouldn't
> we test (!dev->is_virtfn)?  is_physfn will be zero for either a PF with
> failed SR-IOV configuration or for a non-SR-IOV device afaict.  Thanks,
>
> Alex
>

Yes, this should be accessible only for VFs.

We can go with your suggestion to explicitly check (!dev->is_virtfn), as
this seems cleaner and safer, as you mentioned.

We already got an ACK on this patch from Bjorn, but as your suggestion seems
straightforward I may put the Acked-by as part of V2 in any case.

Yishai
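For illustration, the check discussed in this exchange can be sketched in
userspace. This is only a sketch under stated assumptions, not the V2 patch:
the struct layouts and the local ERR_PTR()/PTR_ERR() helpers below are
simplified stand-ins so the logic compiles outside the kernel. Only the body of
pci_iov_get_pf_drvdata() follows the posted patch, with the first test changed
to (!dev->is_virtfn) as suggested in review.

```c
/*
 * Userspace sketch only. The structs are simplified stand-ins for the
 * kernel's struct pci_dev and struct pci_driver, and ERR_PTR()/PTR_ERR()
 * are reimplemented locally so the example builds outside the kernel.
 */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct pci_driver { const char *name; };

struct pci_dev {
	unsigned int is_physfn:1;	/* PF with working SR-IOV, per the review comment */
	unsigned int is_virtfn:1;	/* set on a VF */
	struct pci_dev *physfn;		/* VF only: the owning PF */
	struct pci_driver *driver;	/* currently bound driver, or NULL */
	void *drvdata;
};

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }

static inline void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->drvdata;
}

/* Variant with the reviewed check: anything that is not a VF is rejected. */
static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
					   struct pci_driver *pf_driver)
{
	struct pci_dev *pf_dev;

	/*
	 * With the original (dev->is_physfn) test, a non-SR-IOV device has
	 * both flags clear, passes the test, and dev->physfn is then used
	 * even though it was never set.
	 */
	if (!dev->is_virtfn)
		return ERR_PTR(-EINVAL);
	pf_dev = dev->physfn;
	if (pf_dev->driver != pf_driver)
		return ERR_PTR(-EINVAL);
	return pci_get_drvdata(pf_dev);
}
```

With this variant, a device with neither flag set is refused up front instead
of reaching the dev->physfn dereference.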
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index e7751fa3fe0b..ca696730f761 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -47,6 +47,35 @@ int pci_iov_vf_id(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_iov_vf_id);
 
+/**
+ * pci_iov_get_pf_drvdata - Return the drvdata of a PF
+ * @dev - VF pci_dev
+ * @pf_driver - Device driver required to own the PF
+ *
+ * This must be called from a context that ensures that a VF driver is attached.
+ * The value returned is invalid once the VF driver completes its remove()
+ * callback.
+ *
+ * Locking is achieved by the driver core. A VF driver cannot be probed until
+ * pci_enable_sriov() is called and pci_disable_sriov() does not return until
+ * all VF drivers have completed their remove().
+ *
+ * The PF driver must call pci_disable_sriov() before it begins to destroy the
+ * drvdata.
+ */
+void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
+{
+	struct pci_dev *pf_dev;
+
+	if (dev->is_physfn)
+		return ERR_PTR(-EINVAL);
+	pf_dev = dev->physfn;
+	if (pf_dev->driver != pf_driver)
+		return ERR_PTR(-EINVAL);
+	return pci_get_drvdata(pf_dev);
+}
+EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);
+
 /*
  * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
  * change when NumVFs changes.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2337512e67f0..639a0a239774 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2154,6 +2154,7 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
 int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
 int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 int pci_iov_vf_id(struct pci_dev *dev);
+void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver);
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
 
@@ -2187,6 +2188,12 @@ static inline int pci_iov_vf_id(struct pci_dev *dev)
 	return -ENOSYS;
 }
 
+static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
+					   struct pci_driver *pf_driver)
+{
+	return ERR_PTR(-EINVAL);
+}
+
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 { return -ENODEV; }
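
A caller-side sketch may help show how a VF driver would be expected to consume
this helper. It is a userspace approximation under loud assumptions:
my_pf_driver, struct my_pf_priv and my_vf_probe() are hypothetical names that
do not appear in the series, the structs and ERR_PTR()/PTR_ERR()/IS_ERR() are
local stand-ins for the kernel definitions, and the stand-in helper uses the
(!dev->is_virtfn) test agreed on in the replies. The point it illustrates is
that both the real helper and the !CONFIG_PCI_IOV stub report failure with
ERR_PTR(-EINVAL) rather than NULL, so the result has to be tested with IS_ERR().

```c
/*
 * Userspace sketch only: simplified stand-ins for kernel types and helpers
 * so the caller-side pattern can be compiled on its own.
 */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

struct pci_driver { const char *name; };

struct pci_dev {
	unsigned int is_virtfn:1;	/* set on a VF */
	struct pci_dev *physfn;		/* VF only: the owning PF */
	struct pci_driver *driver;	/* currently bound driver, or NULL */
	void *drvdata;
};

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Stand-in for the new helper, with the !is_virtfn test from the review. */
static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
					   struct pci_driver *pf_driver)
{
	if (!dev->is_virtfn || dev->physfn->driver != pf_driver)
		return ERR_PTR(-EINVAL);
	return dev->physfn->drvdata;
}

/* Hypothetical from here on: a PF driver, its private data, a VF probe. */
struct my_pf_priv { int migration_supported; };

static struct pci_driver my_pf_driver = { "my_pf" };

static inline int my_vf_probe(struct pci_dev *vf, int *migration_supported)
{
	struct my_pf_priv *priv = pci_iov_get_pf_drvdata(vf, &my_pf_driver);

	/* Failure is an ERR_PTR(), not NULL, so test with IS_ERR(). */
	if (IS_ERR(priv))
		return (int)PTR_ERR(priv);

	/* Valid only while this VF driver stays bound, per the kernel-doc. */
	*migration_supported = priv->migration_supported;
	return 0;
}
```

Passing the expected PF driver is what lets the VF driver treat the returned
pointer as its own private type: if some other driver is bound to the PF, the
call fails instead of handing back foreign drvdata.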