Message ID | 20180402224652.4058-1-jakub.kicinski@netronome.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Hi Jakub, On Mon, Apr 02, 2018 at 03:46:52PM -0700, Jakub Kicinski wrote: > Some user space depends on enabling sriov_totalvfs number of VFs > to not fail, e.g.: > > $ cat .../sriov_totalvfs > .../sriov_numvfs > > For devices which VF support depends on loaded FW we have the > pci_sriov_{g,s}et_totalvfs() API. However, this API uses 0 as > a special "unset" value, meaning drivers can't limit sriov_totalvfs > to 0. Remove the special values completely and simply initialize > driver_max_VFs to total_VFs. Then always use driver_max_VFs. > Add a helper for drivers to reset the VF limit back to total. I still can't really make sense out of the changelog. I think part of the reason it's confusing is because there are two things going on: 1) You want this: pci_sriov_set_totalvfs(dev, 0); x = pci_sriov_get_totalvfs(dev) to return 0 instead of total_VFs. That seems to connect with your subject line. It means "sriov_totalvfs" in sysfs could be 0, but I don't know how that is useful (I'm sure it is; just educate me :)) 2) You're adding the pci_sriov_reset_totalvfs() interface. I'm not sure what you intend for this. Is *every* driver supposed to call it in .remove()? Could/should this be done in the core somehow instead of depending on every driver? I'm also having a hard time connecting your user-space command example with the rest of this. Maybe it will make more sense to me tomorrow after some coffee. > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > --- > drivers/net/ethernet/netronome/nfp/nfp_main.c | 6 +++--- > drivers/pci/iov.c | 27 +++++++++++++++++++++------ > include/linux/pci.h | 2 ++ > 3 files changed, 26 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c > index c4b1f344b4da..a76d177e40dd 100644 > --- a/drivers/net/ethernet/netronome/nfp/nfp_main.c > +++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c > @@ -123,7 +123,7 @@ static int nfp_pcie_sriov_read_nfd_limit(struct nfp_pf *pf) > return pci_sriov_set_totalvfs(pf->pdev, pf->limit_vfs); > > pf->limit_vfs = ~0; > - pci_sriov_set_totalvfs(pf->pdev, 0); /* 0 is unset */ > + pci_sriov_reset_totalvfs(pf->pdev); > /* Allow any setting for backwards compatibility if symbol not found */ > if (err == -ENOENT) > return 0; > @@ -537,7 +537,7 @@ static int nfp_pci_probe(struct pci_dev *pdev, > err_net_remove: > nfp_net_pci_remove(pf); > err_sriov_unlimit: > - pci_sriov_set_totalvfs(pf->pdev, 0); > + pci_sriov_reset_totalvfs(pf->pdev); > err_fw_unload: > kfree(pf->rtbl); > nfp_mip_close(pf->mip); > @@ -570,7 +570,7 @@ static void nfp_pci_remove(struct pci_dev *pdev) > nfp_hwmon_unregister(pf); > > nfp_pcie_sriov_disable(pdev); > - pci_sriov_set_totalvfs(pf->pdev, 0); > + pci_sriov_reset_totalvfs(pf->pdev); > > nfp_net_pci_remove(pf); > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c > index 677924ae0350..c63ea870d8be 100644 > --- a/drivers/pci/iov.c > +++ b/drivers/pci/iov.c > @@ -443,6 +443,7 @@ static int sriov_init(struct pci_dev *dev, int pos) > iov->nres = nres; > iov->ctrl = ctrl; > iov->total_VFs = total; > + iov->driver_max_VFs = total; > pci_read_config_word(dev, pos + PCI_SRIOV_VF_DID, &iov->vf_device); > iov->pgsz = pgsz; > iov->self = dev; > @@ -788,12 +789,29 @@ int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs) > } > EXPORT_SYMBOL_GPL(pci_sriov_set_totalvfs); > > +/** > + * pci_sriov_reset_totalvfs -- return the TotalVFs value to the default > + * @dev: the PCI PF device > + * > + * Should be called from PF driver's remove routine with > + * device's mutex held. > + */ > +void pci_sriov_reset_totalvfs(struct pci_dev *dev) > +{ > + /* Shouldn't change if VFs already enabled */ > + if (!dev->is_physfn || dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE) > + return; > + > + dev->sriov->driver_max_VFs = dev->sriov->total_VFs; > +} > +EXPORT_SYMBOL_GPL(pci_sriov_reset_totalvfs); > + > /** > * pci_sriov_get_totalvfs -- get total VFs supported on this device > * @dev: the PCI PF device > * > - * For a PCIe device with SRIOV support, return the PCIe > - * SRIOV capability value of TotalVFs or the value of driver_max_VFs > + * For a PCIe device with SRIOV support, return the value of driver_max_VFs > + * which can be equal to the PCIe SRIOV capability value of TotalVFs or lower > * if the driver reduced it. Otherwise 0. > */ > int pci_sriov_get_totalvfs(struct pci_dev *dev) > @@ -801,9 +819,6 @@ int pci_sriov_get_totalvfs(struct pci_dev *dev) > if (!dev->is_physfn) > return 0; > > - if (dev->sriov->driver_max_VFs) > - return dev->sriov->driver_max_VFs; > - > - return dev->sriov->total_VFs; > + return dev->sriov->driver_max_VFs; > } > EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs); > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 024a1beda008..95fde8850393 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -1952,6 +1952,7 @@ void pci_iov_remove_virtfn(struct pci_dev *dev, int id); > int pci_num_vf(struct pci_dev *dev); > int pci_vfs_assigned(struct pci_dev *dev); > int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs); > +void pci_sriov_reset_totalvfs(struct pci_dev *dev); > int pci_sriov_get_totalvfs(struct pci_dev *dev); > resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno); > void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool probe); > @@ -1978,6 +1979,7 @@ static inline int pci_vfs_assigned(struct pci_dev *dev) > { return 0; } > static inline int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs) > { return 0; } > +static inline void pci_sriov_reset_totalvfs(struct pci_dev *dev) { } > static inline int pci_sriov_get_totalvfs(struct pci_dev *dev) > { return 0; } > static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno) > -- > 2.16.2 >
Hi Bjorn! On Thu, 24 May 2018 18:57:48 -0500, Bjorn Helgaas wrote: > On Mon, Apr 02, 2018 at 03:46:52PM -0700, Jakub Kicinski wrote: > > Some user space depends on enabling sriov_totalvfs number of VFs > > to not fail, e.g.: > > > > $ cat .../sriov_totalvfs > .../sriov_numvfs > > > > For devices which VF support depends on loaded FW we have the > > pci_sriov_{g,s}et_totalvfs() API. However, this API uses 0 as > > a special "unset" value, meaning drivers can't limit sriov_totalvfs > > to 0. Remove the special values completely and simply initialize > > driver_max_VFs to total_VFs. Then always use driver_max_VFs. > > Add a helper for drivers to reset the VF limit back to total. > > I still can't really make sense out of the changelog. > > I think part of the reason it's confusing is because there are two > things going on: > > 1) You want this: > > pci_sriov_set_totalvfs(dev, 0); > x = pci_sriov_get_totalvfs(dev) > > to return 0 instead of total_VFs. That seems to connect with > your subject line. It means "sriov_totalvfs" in sysfs could be > 0, but I don't know how that is useful (I'm sure it is; just > educate me :)) Let me just quote the bug report that got filed on our internal bug tracker :) When testing Juju Openstack with Ubuntu 18.04, enabling SR-IOV causes errors because Juju gets the sriov_totalvfs for SR-IOV-capable device then tries to set that as the sriov_numvfs parameter. For SR-IOV incapable FW, the sriov_totalvfs parameter should be 0, but it's set to max. When FW is switched to flower*, the correct sriov_totalvfs value is presented. * flower is a project name My understanding is OpenStack uses sriov_totalvfs to determine how many VFs can be enabled, looks like this is the code: http://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/tree/hooks/neutron_ovs_utils.py#n464 > 2) You're adding the pci_sriov_reset_totalvfs() interface. I'm not > sure what you intend for this. Is *every* driver supposed to > call it in .remove()? Could/should this be done in the core > somehow instead of depending on every driver? Good question, I was just thinking yesterday we may want to call it from the core, but I don't think it's strictly necessary nor always sufficient (we may reload FW without re-probing). We have a device which supports different number of VFs based on the FW loaded. Some legacy FWs does not inform the driver how many VFs it can support, because it supports max. So the flow in our driver is this: load_fw(dev); ... max_vfs = ask_fw_for_max_vfs(dev); if (max_vfs >= 0) return pci_sriov_set_totalvfs(dev, max_vfs); else /* FW didn't tell us, assume max */ return pci_sriov_reset_totalvfs(dev); We also reset the max on device remove, but that's not strictly necessary. Other users of pci_sriov_set_totalvfs() always know the value to set the total to (either always get it from FW or it's a constant). If you prefer we can work out the correct max for those legacy cases in the driver as well, although it seemed cleaner to just ask the core, since it already has total_VFs value handy :) > I'm also having a hard time connecting your user-space command example > with the rest of this. Maybe it will make more sense to me tomorrow > after some coffee. OpenStack assumes it will always be able to set sriov_numvfs to sriov_totalvfs, see this 'if': http://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/tree/hooks/neutron_ovs_utils.py#n512 I tried to morph that into an concise bash command, but clearly failed. Sorry about the lack of clarity! :(
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c index c4b1f344b4da..a76d177e40dd 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_main.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c @@ -123,7 +123,7 @@ static int nfp_pcie_sriov_read_nfd_limit(struct nfp_pf *pf) return pci_sriov_set_totalvfs(pf->pdev, pf->limit_vfs); pf->limit_vfs = ~0; - pci_sriov_set_totalvfs(pf->pdev, 0); /* 0 is unset */ + pci_sriov_reset_totalvfs(pf->pdev); /* Allow any setting for backwards compatibility if symbol not found */ if (err == -ENOENT) return 0; @@ -537,7 +537,7 @@ static int nfp_pci_probe(struct pci_dev *pdev, err_net_remove: nfp_net_pci_remove(pf); err_sriov_unlimit: - pci_sriov_set_totalvfs(pf->pdev, 0); + pci_sriov_reset_totalvfs(pf->pdev); err_fw_unload: kfree(pf->rtbl); nfp_mip_close(pf->mip); @@ -570,7 +570,7 @@ static void nfp_pci_remove(struct pci_dev *pdev) nfp_hwmon_unregister(pf); nfp_pcie_sriov_disable(pdev); - pci_sriov_set_totalvfs(pf->pdev, 0); + pci_sriov_reset_totalvfs(pf->pdev); nfp_net_pci_remove(pf); diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index 677924ae0350..c63ea870d8be 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -443,6 +443,7 @@ static int sriov_init(struct pci_dev *dev, int pos) iov->nres = nres; iov->ctrl = ctrl; iov->total_VFs = total; + iov->driver_max_VFs = total; pci_read_config_word(dev, pos + PCI_SRIOV_VF_DID, &iov->vf_device); iov->pgsz = pgsz; iov->self = dev; @@ -788,12 +789,29 @@ int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs) } EXPORT_SYMBOL_GPL(pci_sriov_set_totalvfs); +/** + * pci_sriov_reset_totalvfs -- return the TotalVFs value to the default + * @dev: the PCI PF device + * + * Should be called from PF driver's remove routine with + * device's mutex held. + */ +void pci_sriov_reset_totalvfs(struct pci_dev *dev) +{ + /* Shouldn't change if VFs already enabled */ + if (!dev->is_physfn || dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE) + return; + + dev->sriov->driver_max_VFs = dev->sriov->total_VFs; +} +EXPORT_SYMBOL_GPL(pci_sriov_reset_totalvfs); + /** * pci_sriov_get_totalvfs -- get total VFs supported on this device * @dev: the PCI PF device * - * For a PCIe device with SRIOV support, return the PCIe - * SRIOV capability value of TotalVFs or the value of driver_max_VFs + * For a PCIe device with SRIOV support, return the value of driver_max_VFs + * which can be equal to the PCIe SRIOV capability value of TotalVFs or lower * if the driver reduced it. Otherwise 0. */ int pci_sriov_get_totalvfs(struct pci_dev *dev) @@ -801,9 +819,6 @@ int pci_sriov_get_totalvfs(struct pci_dev *dev) if (!dev->is_physfn) return 0; - if (dev->sriov->driver_max_VFs) - return dev->sriov->driver_max_VFs; - - return dev->sriov->total_VFs; + return dev->sriov->driver_max_VFs; } EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs); diff --git a/include/linux/pci.h b/include/linux/pci.h index 024a1beda008..95fde8850393 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1952,6 +1952,7 @@ void pci_iov_remove_virtfn(struct pci_dev *dev, int id); int pci_num_vf(struct pci_dev *dev); int pci_vfs_assigned(struct pci_dev *dev); int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs); +void pci_sriov_reset_totalvfs(struct pci_dev *dev); int pci_sriov_get_totalvfs(struct pci_dev *dev); resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno); void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool probe); @@ -1978,6 +1979,7 @@ static inline int pci_vfs_assigned(struct pci_dev *dev) { return 0; } static inline int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs) { return 0; } +static inline void pci_sriov_reset_totalvfs(struct pci_dev *dev) { } static inline int pci_sriov_get_totalvfs(struct pci_dev *dev) { return 0; } static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
Some user space depends on enabling sriov_totalvfs number of VFs to not fail, e.g.: $ cat .../sriov_totalvfs > .../sriov_numvfs For devices which VF support depends on loaded FW we have the pci_sriov_{g,s}et_totalvfs() API. However, this API uses 0 as a special "unset" value, meaning drivers can't limit sriov_totalvfs to 0. Remove the special values completely and simply initialize driver_max_VFs to total_VFs. Then always use driver_max_VFs. Add a helper for drivers to reset the VF limit back to total. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> --- drivers/net/ethernet/netronome/nfp/nfp_main.c | 6 +++--- drivers/pci/iov.c | 27 +++++++++++++++++++++------ include/linux/pci.h | 2 ++ 3 files changed, 26 insertions(+), 9 deletions(-)