Message ID | 20180618220142.16527-1-jakub.kicinski@netronome.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On Mon, 18 Jun 2018 15:01:42 -0700, Jakub Kicinski wrote:
> Bjorn points out that currently core and most of the drivers don't
> clean up dev->sriov->driver_max_VFs settings on .remove(). This
> means that if a different driver is bound afterwards it will
> inherit the old setting:
>
> - load PF driver 1
> - driver 1 calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
> - unload PF driver 1
> - load PF driver 2
>   # driver 2 does not know to call pci_sriov_set_totalvfs()
>
> Some drivers (e.g. nfp) used to do the clean up by calling
> pci_sriov_set_totalvfs(dev, 0), since 0 was equivalent to no driver
> limit set. After commit 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers
> to limit total_VFs to 0") 0 no longer has this special meaning.
>
> The need to reset driver_max_VFs comes from the fact that old FW
> builds may not expose its VF limits to the drivers, and depend on
> the ability to reject the configuration change when driver notifies
> the FW as part of struct pci_driver->sriov_configure() callback.
> Therefore if there is no information on VF count limits we should
> use the PCI capability max, and not the last value set by
> pci_sriov_set_totalvfs().
>
> Reset driver_max_VFs back to total_VFs after device remove. If
> drivers want to reload FW/reconfigure the device while driver is
> bound we may add an explicit pci_sriov_reset_totalvfs(), but for
> now no driver is doing that.
>
> Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Any comments? :( Could this still make it into 4.18?
On Tue, Jun 26, 2018 at 09:40:02AM -0700, Jakub Kicinski wrote:
> On Mon, 18 Jun 2018 15:01:42 -0700, Jakub Kicinski wrote:
> > Bjorn points out that currently core and most of the drivers don't
> > clean up dev->sriov->driver_max_VFs settings on .remove(). [...]
>
> Any comments? :( Could this still make it into 4.18?

Is this a fix for something we broke during the v4.18 merge window?
Or does it fix something that's been broken for a long time?  I think
the latter, but haven't looked carefully yet.
On Thu, Jun 28, 2018 at 6:56 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Tue, Jun 26, 2018 at 09:40:02AM -0700, Jakub Kicinski wrote:
> > Any comments? :( Could this still make it into 4.18?
>
> Is this a fix for something we broke during the v4.18 merge window?
> Or does it fix something that's been broken for a long time?  I think
> the latter, but haven't looked carefully yet.

This is half of a fix, really.  NFP driver does
pci_sriov_set_totalvfs(pdev, 0) to clear the limit.  Now it will
disable SR-IOV completely.  If this patch gets applied I will drop
those pci_sriov_set_totalvfs(pdev, 0) calls from the driver and the
regression will be solved.
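For context, a rough sketch of the driver-side change being described. It is illustrative only, not the actual nfp patch; example_fw_query_vf_limit() is a made-up stand-in for the FW limit lookup the driver performs:

```c
#include <linux/pci.h>

/* Made-up stand-in for the FW limit lookup (illustrative only). */
static int example_fw_query_vf_limit(struct pci_dev *pdev, int *err);

static int example_read_vf_limit(struct pci_dev *pdev)
{
	int err;
	int limit = example_fw_query_vf_limit(pdev, &err);

	/* Old FW without a limit symbol: leave the core default alone
	 * (driver_max_VFs == TotalVFs) instead of calling
	 * pci_sriov_set_totalvfs(pdev, 0), which now means "0 VFs allowed".
	 */
	if (err)
		return 0;

	return pci_sriov_set_totalvfs(pdev, limit);
}
```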
On Thu, Jun 28, 2018 at 09:17:09AM -0700, Jakub Kicinski wrote:
> On Thu, Jun 28, 2018 at 6:56 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > Is this a fix for something we broke during the v4.18 merge window?
> > Or does it fix something that's been broken for a long time?  I think
> > the latter, but haven't looked carefully yet.
>
> This is half of a fix, really.  NFP driver does
> pci_sriov_set_totalvfs(pdev, 0) to clear the limit.  Now it will
> disable SR-IOV completely.  If this patch gets applied I will drop
> those pci_sriov_set_totalvfs(pdev, 0) calls from the driver and the
> regression will be solved.

Sorry, I missed the regression part.  8d85a7a4f2c9 ("PCI/IOV: Allow PF
drivers to limit total_VFs to 0") appeared in v4.18-rc1, so I guess the
problem is that nfp works correctly in v4.17, but is broken in v4.18-rc1?

I'd just like to understand the breakage and fix better.  In v4.17:

  # nfp loaded:
  nfp_pci_probe
    nfp_pcie_sriov_read_nfd_limit
      nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
      if (err)                              # FW doesn't expose limit (?)
        pci_sriov_set_totalvfs(0)
          dev->sriov->driver_max_VFs = 0    # PCI_SRIOV_CTRL_VFE is clear

  # userspace writes N to sysfs "sriov_numvfs":
  sriov_numvfs_store                        # write N
    pci_sriov_get_totalvfs
      return dev->sriov->total_VFs          # since driver_max_VFs == 0
    driver->sriov_configure(N)
      nfp_pcie_sriov_configure(N)
        nfp_pcie_sriov_enable(N)

In v4.18-rc1, it's the same except pci_sriov_get_totalvfs() now returns 0:

  # nfp loaded:
  nfp_pci_probe
    nfp_pcie_sriov_read_nfd_limit
      nfp_rtsym_read_le("nfd_vf_cfg_max_vfs", &err)
      if (err)                              # FW doesn't expose limit (?)
        pci_sriov_set_totalvfs(0)
          dev->sriov->driver_max_VFs = 0    # PCI_SRIOV_CTRL_VFE is clear

  # userspace writes N to sysfs "sriov_numvfs":
  sriov_numvfs_store                        # write N
    pci_sriov_get_totalvfs
      return 0                              # (driver_max_VFs == 0)
    return -ERANGE                          # because N > 0

Am I on the right track?

Sounds like we may want to merge this patch and the nfp patch to
remove the pci_sriov_set_totalvfs() calls together to make sure they
get merged in the correct order.
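To make the difference concrete, here is a before/after sketch of pci_sriov_get_totalvfs() as described in the traces above. The two versions are shown together for comparison only and are not meant to build as one unit:

```c
/* v4.17: driver_max_VFs == 0 meant "no driver limit", so the SR-IOV
 * capability TotalVFs was used as the fallback.
 */
int pci_sriov_get_totalvfs(struct pci_dev *dev)
{
	if (!dev->is_physfn)
		return 0;

	if (dev->sriov->driver_max_VFs)
		return dev->sriov->driver_max_VFs;

	return dev->sriov->total_VFs;
}

/* v4.18-rc1 (after 8d85a7a4f2c9): 0 is a real limit now, so nfp's
 * pci_sriov_set_totalvfs(pdev, 0) makes the sysfs write fail with
 * -ERANGE in sriov_numvfs_store().
 */
int pci_sriov_get_totalvfs(struct pci_dev *dev)
{
	if (!dev->is_physfn)
		return 0;

	return dev->sriov->driver_max_VFs;
}
```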
On Thu, 28 Jun 2018 13:07:05 -0500, Bjorn Helgaas wrote:
[...]
> Am I on the right track?

That's exactly it!

> Sounds like we may want to merge this patch and the nfp patch to
> remove the pci_sriov_set_totalvfs() calls together to make sure they
> get merged in the correct order.

NFP patch coming right up! For some reason I was planning to push it
via the net tree, but on reflection it doesn't matter..
On Thu, Jun 28, 2018 at 11:14:57AM -0700, Jakub Kicinski wrote:
> On Thu, 28 Jun 2018 13:07:05 -0500, Bjorn Helgaas wrote:
> > Am I on the right track?
>
> That's exactly it!

OK.  This regression happens even on the first module load; it doesn't
require an unload and reload.

If that's the case, *this* patch will not actually fix the regression,
because it only changes things when we detach the driver.  The *nfp*
patch that removes the pci_sriov_set_totalvfs(0) calls will fix the
regression because driver_max_VFs will be unchanged from its initial
setting (TotalVFs from the SR-IOV capability).

So if I'd figured this out earlier, we would have squashed the nfp
patch into 8d85a7a4f2c9 and avoided the regression altogether.

*This* patch fixes a legitimate unload/reload issue, but I'm not clear
whether it's a regression anybody actually sees.  If you see it, what
scenario is it?

Bjorn
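For the unload/reload case mentioned here, a minimal sketch of what a second PF driver would observe; driver2_probe_sketch() is hypothetical, while pci_sriov_get_totalvfs() is the real interface:

```c
#include <linux/pci.h>

/* Hypothetical probe fragment for "PF driver 2" from the commit message.
 * Without pci_iov_device_removed(), this reports whatever limit driver 1
 * last set via pci_sriov_set_totalvfs(); with the reset on remove, it
 * reports TotalVFs from the SR-IOV capability again.
 */
static int driver2_probe_sketch(struct pci_dev *pdev)
{
	int totalvfs = pci_sriov_get_totalvfs(pdev);

	dev_info(&pdev->dev, "SR-IOV VFs available: %d\n", totalvfs);
	return 0;
}
```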
On Thu, 28 Jun 2018 17:10:59 -0500, Bjorn Helgaas wrote:
> OK.  This regression happens even on the first module load; it doesn't
> require an unload and reload.
>
> If that's the case, *this* patch will not actually fix the regression,
> because it only changes things when we detach the driver.  The *nfp*
> patch that removes the pci_sriov_set_totalvfs(0) calls will fix the
> regression because driver_max_VFs will be unchanged from its initial
> setting (TotalVFs from the SR-IOV capability).
>
> So if I'd figured this out earlier, we would have squashed the nfp
> patch into 8d85a7a4f2c9 and avoided the regression altogether.

True, this patch by itself does not fix anything for the NFP; the
pci_sriov_set_totalvfs(0) calls have to be removed as well.  I was
planning to route the patches separately because I had dependent
patches for net, but that's no longer true, so we could as well squash
the "PCI: reset driver SR-IOV state after remove" and "nfp: align
setting totalvfs to changes in PCI core".  Would that make sense?

> *This* patch fixes a legitimate unload/reload issue, but I'm not clear
> whether it's a regression anybody actually sees.  If you see it, what
> scenario is it?

The firmware of the card can come either from flash, initramfs or disk.
The flash is limited in size, and the initramfs is sometimes hard to
automatically provision with the right FW for customers.  So there are
deployment scenarios where one firmware is loaded first and then, at a
later stage in boot, a more advanced/different firmware is loaded.  If
the second FW doesn't have the VF limit symbol then, without this patch
and with the pci_sriov_set_totalvfs(0) calls removed, it would inherit
the settings from the first one.

IOW for the NFP we really need both.
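As an aside, the commit message floats an explicit pci_sriov_reset_totalvfs() for drivers that reload FW while still bound; a sketch of what such a helper could look like, mirroring the reset in pci_iov_device_removed(). It is not part of the applied patch:

```c
/* Hypothetical helper, not part of the applied patch.  A PF driver that
 * reloads FW while still bound could call this to drop its earlier limit,
 * the same way pci_iov_device_removed() does on driver detach.  It would
 * live in drivers/pci/iov.c next to the other SR-IOV code.
 */
void pci_sriov_reset_totalvfs(struct pci_dev *dev)
{
	if (!dev->is_physfn)
		return;

	dev->sriov->driver_max_VFs = dev->sriov->total_VFs;
}
```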
On Mon, Jun 18, 2018 at 03:01:42PM -0700, Jakub Kicinski wrote:
> Bjorn points out that currently core and most of the drivers don't
> clean up dev->sriov->driver_max_VFs settings on .remove(). [...]
>
> Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Applied to for-linus for v4.18, thanks!
On Fri, 29 Jun 2018 15:09:34 -0500, Bjorn Helgaas wrote:
[...]
> Applied to for-linus for v4.18, thanks!

Awesome, thanks a lot!
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index d0d73dbbd5ca..cbbe6d8fab0c 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -574,6 +574,22 @@ void pci_iov_release(struct pci_dev *dev)
 		sriov_release(dev);
 }
 
+/**
+ * pci_iov_device_removed - clean up SR-IOV state after PF driver is detached
+ * @dev: the PCI device
+ */
+void pci_iov_device_removed(struct pci_dev *dev)
+{
+	struct pci_sriov *iov = dev->sriov;
+
+	if (!dev->is_physfn)
+		return;
+	iov->driver_max_VFs = iov->total_VFs;
+	if (iov->num_VFs)
+		dev_warn(&dev->dev,
+			 "driver left SR-IOV enabled after remove\n");
+}
+
 /**
  * pci_iov_update_resource - update a VF BAR
  * @dev: the PCI device
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index c125d53033c6..80a281cf5d21 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -445,6 +445,7 @@ static int pci_device_remove(struct device *dev)
 		}
 		pcibios_free_irq(pci_dev);
 		pci_dev->driver = NULL;
+		pci_iov_device_removed(pci_dev);
 	}
 
 	/* Undo the runtime PM settings in local_pci_probe() */
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a07f3f..fc8bd4fdfb95 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -311,6 +311,7 @@ static inline void pci_restore_ats_state(struct pci_dev *dev)
 #ifdef CONFIG_PCI_IOV
 int pci_iov_init(struct pci_dev *dev);
 void pci_iov_release(struct pci_dev *dev);
+void pci_iov_device_removed(struct pci_dev *dev);
 void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
@@ -323,6 +324,9 @@ static inline int pci_iov_init(struct pci_dev *dev)
 }
 static inline void pci_iov_release(struct pci_dev *dev)
+{
+}
+static inline void pci_iov_device_removed(struct pci_dev *dev)
 {
 }
 static inline void pci_restore_iov_state(struct pci_dev *dev)
Bjorn points out that currently core and most of the drivers don't
clean up dev->sriov->driver_max_VFs settings on .remove(). This
means that if a different driver is bound afterwards it will
inherit the old setting:

- load PF driver 1
- driver 1 calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
- unload PF driver 1
- load PF driver 2
  # driver 2 does not know to call pci_sriov_set_totalvfs()

Some drivers (e.g. nfp) used to do the clean up by calling
pci_sriov_set_totalvfs(dev, 0), since 0 was equivalent to no driver
limit set. After commit 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers
to limit total_VFs to 0") 0 no longer has this special meaning.

The need to reset driver_max_VFs comes from the fact that old FW
builds may not expose its VF limits to the drivers, and depend on
the ability to reject the configuration change when driver notifies
the FW as part of struct pci_driver->sriov_configure() callback.
Therefore if there is no information on VF count limits we should
use the PCI capability max, and not the last value set by
pci_sriov_set_totalvfs().

Reset driver_max_VFs back to total_VFs after device remove. If
drivers want to reload FW/reconfigure the device while driver is
bound we may add an explicit pci_sriov_reset_totalvfs(), but for
now no driver is doing that.

Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/pci/iov.c        | 16 ++++++++++++++++
 drivers/pci/pci-driver.c |  1 +
 drivers/pci/pci.h        |  4 ++++
 3 files changed, 21 insertions(+)