Message ID | 1349134020-62152-1-git-send-email-ddutile@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
> +static void pci_sriov_remove_sysfs_dev_files(struct pci_dev *dev) > +{ > + int pos; > + > + if ((dev->is_physfn) && pci_is_pcie(dev)) { > + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); > + if (pos) > + device_remove_file(&dev->dev, pci_dev_sriov_attrs); Shouldn't it iterate over the attributes when removing them? -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/02/2012 03:32 AM, Yuval Mintz wrote: >> +static void pci_sriov_remove_sysfs_dev_files(struct pci_dev *dev) >> +{ >> + int pos; >> + >> + if ((dev->is_physfn)&& pci_is_pcie(dev)) { >> + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); >> + if (pos) >> + device_remove_file(&dev->dev, pci_dev_sriov_attrs); > > Shouldn't it iterate over the attributes when removing them? > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html good catch; that should be: + int pos, i; .... + for (i = 0; attr_name(pci_dev_sriov_attrs[i]); i++) + device_remove_file(&dev->dev, &pci_dev_sriov_attrs); it doesn't remove the files in reverse order, but neither does the device-generic code for dev-attrs. Another question for RFC: -- above shows I didn't test with hot-plug -- which makes me wonder/think if [get,put]_device([dev,parent]) needed during [create,remove]_file() calls. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Oct 1, 2012 at 4:27 PM, Donald Dutile <ddutile@redhat.com> wrote: > Provide files under sysfs to determine the max number of vfs > an SRIOV-capable PCIe device supports, and methods to enable and > disable the vfs on a per device basis. > > Currently, VF enablement by SRIOV-capable PCIe devices is done > in driver-specific module parameters. If not setup in modprobe files, > it requires admin to unload & reload PF drivers with number of desired > VFs to enable. Additionally, the enablement is system wide: all > devices controlled by the same driver have the same number of VFs > enabled. Although the latter is probably desired, there are PCI > configurations setup by system BIOS that may not enable that to occur. > > Three files are created if a PCIe device has SRIOV support: > sriov_max_vfs -- cat-ing this file returns the maximum number > of VFs a PCIe device supports. > sriov_enable_vfs -- echo'ing a number to this file enables this > number of VFs for this given PCIe device. > -- cat-ing this file will return the number of VFs > currently enabled on this PCIe device. > sriov_disable_vfs -- echo-ing any number other than 0 disables all > VFs associated with this PCIe device. > > VF enable and disablement is invoked much like other PCIe > configuration functions -- via registered callbacks in the driver, > i.e., probe, release, etc. > > Note: I haven't had a chance to refactor an SRIOV PF driver > to test against; hoping to try ixgbe or igb in next couple days. > To date, just tested that cat-ing sriov_max_vfs works, and > cat-ing sriov_enable_vfs returns the correct number when > a PF driver has been loaded with VFs enabled via per-driver param, > e.g., modprobe igb max_vfs=4 > > Send comments and I'll integrate as needed, while modifying > a PF driver to use this interface. > > Signed-off-by: Donald Dutile <ddutile@redhat.com> > > --- > drivers/pci/pci-sysfs.c | 186 ++++++++++++++++++++++++++++++++++++++++++++---- > include/linux/pci.h | 2 + > 2 files changed, 175 insertions(+), 13 deletions(-) > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index 6869009..1b5eab7 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -404,6 +404,152 @@ static ssize_t d3cold_allowed_show(struct device *dev, > } > #endif > > + > +#ifdef CONFIG_PCI_IOV > +static ssize_t sriov_max_vfs_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + struct pci_dev *pdev; > + > + pdev = to_pci_dev(dev); > + return sprintf (buf, "%u\n", pdev->sriov->total); > +} > + > +bool pci_dev_has_sriov(struct pci_dev *pdev) > +{ > + int ret = false; > + int pos; > + > + if (!pci_is_pcie(pdev)) > + goto out; > + > + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV); > + if (pos) > + ret = true; > +out: > + return ret; > +} > + > +static ssize_t sriov_enable_vfs_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + struct pci_dev *pdev; > + int nr_virtfn = 0; > + > + pdev = to_pci_dev(dev); > + > + if (pci_dev_has_sriov(pdev)) > + nr_virtfn = pdev->sriov->nr_virtfn; > + > + return sprintf (buf, "%u\n", nr_virtfn); > +} > + > +static ssize_t sriov_enable_vfs_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > +{ > + struct pci_dev *pdev; > + int num_vf_enabled = 0; > + unsigned long num_vfs; > + pdev = to_pci_dev(dev); > + > + if (!pci_dev_has_sriov(pdev)) > + goto out; > + > + /* Requested VFs to enable < max_vfs > + * and none enabled already > + */ > + if (strict_strtoul(buf, 0, &num_vfs) < 0) > + return -EINVAL; > + > + if ((num_vfs > pdev->sriov->total) || > + (pdev->sriov->nr_virtfn != 0)) > + return -EINVAL; > + > + /* Ready for lift-off! */ > + if (pdev->driver && pdev->driver->sriov_enable) { > + num_vf_enabled = pdev->driver->sriov_enable(pdev, num_vfs); > + } > + > +out: > + if (num_vf_enabled != num_vfs) > + printk(KERN_WARNING > + "%s: %04x:%02x:%02x.%d: Only %d VFs enabled \n", > + pci_name(pdev), pci_domain_nr(pdev->bus), > + pdev->bus->number, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), num_vf_enabled); > + > + return count; > +} > + > +static ssize_t sriov_disable_vfs_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > +{ > + struct pci_dev *pdev; > + > + pdev = to_pci_dev(dev); > + > + /* make sure sriov device & at least 1 vf enabled */ > + if (!pci_dev_has_sriov(pdev) || > + (pdev->sriov->nr_virtfn == 0)) > + goto out; > + > + /* Ready for landing! */ > + if (pdev->driver && pdev->driver->sriov_disable) { > + pdev->driver->sriov_disable(pdev); > + } > +out: > + return count; > +} > + > +struct device_attribute pci_dev_sriov_attrs[] = { > + __ATTR_RO(sriov_max_vfs), > + __ATTR(sriov_enable_vfs, (S_IRUGO|S_IWUSR|S_IWGRP), > + sriov_enable_vfs_show, sriov_enable_vfs_store), > + __ATTR(sriov_disable_vfs, (S_IWUSR|S_IWGRP), > + NULL, sriov_disable_vfs_store), > + __ATTR_NULL, > +}; > + > +static int pci_sriov_create_sysfs_dev_files(struct pci_dev *dev) > +{ > + int pos, i; > + int retval=0; > + > + if ((dev->is_physfn) && pci_is_pcie(dev)) { > + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); > + if (pos) { > + for (i = 0; attr_name(pci_dev_sriov_attrs[i]); i++) { > + retval = device_create_file(&dev->dev, &pci_dev_sriov_attrs[i]); > + if (retval) { > + while (--i >= 0) > + device_remove_file(&dev->dev, &pci_dev_sriov_attrs[i]); > + break; > + } > + } > + } > + } > + return retval; > +} > + > +static void pci_sriov_remove_sysfs_dev_files(struct pci_dev *dev) > +{ > + int pos; > + > + if ((dev->is_physfn) && pci_is_pcie(dev)) { > + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); > + if (pos) > + device_remove_file(&dev->dev, pci_dev_sriov_attrs); > + } > +} ... > @@ -1203,14 +1365,18 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev) > goto error; > dev->reset_fn = 1; > } > + > + /* SRIOV */ > + retval = pci_sriov_create_sysfs_dev_files(dev); > + if (retval) > + goto error; > + No, You should not use function for that. According to Greg, you should use group visible to do that. Please check the patches that i send out before. here I attached them again. Yinghai
On 10/02/2012 04:01 PM, Yinghai Lu wrote: > On Mon, Oct 1, 2012 at 4:27 PM, Donald Dutile<ddutile@redhat.com> wrote: >> Provide files under sysfs to determine the max number of vfs >> an SRIOV-capable PCIe device supports, and methods to enable and >> disable the vfs on a per device basis. >> >> Currently, VF enablement by SRIOV-capable PCIe devices is done >> in driver-specific module parameters. If not setup in modprobe files, >> it requires admin to unload& reload PF drivers with number of desired >> VFs to enable. Additionally, the enablement is system wide: all >> devices controlled by the same driver have the same number of VFs >> enabled. Although the latter is probably desired, there are PCI >> configurations setup by system BIOS that may not enable that to occur. >> >> Three files are created if a PCIe device has SRIOV support: >> sriov_max_vfs -- cat-ing this file returns the maximum number >> of VFs a PCIe device supports. >> sriov_enable_vfs -- echo'ing a number to this file enables this >> number of VFs for this given PCIe device. >> -- cat-ing this file will return the number of VFs >> currently enabled on this PCIe device. >> sriov_disable_vfs -- echo-ing any number other than 0 disables all >> VFs associated with this PCIe device. >> >> VF enable and disablement is invoked much like other PCIe >> configuration functions -- via registered callbacks in the driver, >> i.e., probe, release, etc. >> >> Note: I haven't had a chance to refactor an SRIOV PF driver >> to test against; hoping to try ixgbe or igb in next couple days. >> To date, just tested that cat-ing sriov_max_vfs works, and >> cat-ing sriov_enable_vfs returns the correct number when >> a PF driver has been loaded with VFs enabled via per-driver param, >> e.g., modprobe igb max_vfs=4 >> >> Send comments and I'll integrate as needed, while modifying >> a PF driver to use this interface. >> >> Signed-off-by: Donald Dutile<ddutile@redhat.com> >> >> --- >> drivers/pci/pci-sysfs.c | 186 ++++++++++++++++++++++++++++++++++++++++++++---- >> include/linux/pci.h | 2 + >> 2 files changed, 175 insertions(+), 13 deletions(-) >> >> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >> index 6869009..1b5eab7 100644 >> --- a/drivers/pci/pci-sysfs.c >> +++ b/drivers/pci/pci-sysfs.c >> @@ -404,6 +404,152 @@ static ssize_t d3cold_allowed_show(struct device *dev, >> } >> #endif >> >> + >> +#ifdef CONFIG_PCI_IOV >> +static ssize_t sriov_max_vfs_show(struct device *dev, >> + struct device_attribute *attr, >> + char *buf) >> +{ >> + struct pci_dev *pdev; >> + >> + pdev = to_pci_dev(dev); >> + return sprintf (buf, "%u\n", pdev->sriov->total); >> +} >> + >> +bool pci_dev_has_sriov(struct pci_dev *pdev) >> +{ >> + int ret = false; >> + int pos; >> + >> + if (!pci_is_pcie(pdev)) >> + goto out; >> + >> + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV); >> + if (pos) >> + ret = true; >> +out: >> + return ret; >> +} >> + >> +static ssize_t sriov_enable_vfs_show(struct device *dev, >> + struct device_attribute *attr, >> + char *buf) >> +{ >> + struct pci_dev *pdev; >> + int nr_virtfn = 0; >> + >> + pdev = to_pci_dev(dev); >> + >> + if (pci_dev_has_sriov(pdev)) >> + nr_virtfn = pdev->sriov->nr_virtfn; >> + >> + return sprintf (buf, "%u\n", nr_virtfn); >> +} >> + >> +static ssize_t sriov_enable_vfs_store(struct device *dev, >> + struct device_attribute *attr, >> + const char *buf, size_t count) >> +{ >> + struct pci_dev *pdev; >> + int num_vf_enabled = 0; >> + unsigned long num_vfs; >> + pdev = to_pci_dev(dev); >> + >> + if (!pci_dev_has_sriov(pdev)) >> + goto out; >> + >> + /* Requested VFs to enable< max_vfs >> + * and none enabled already >> + */ >> + if (strict_strtoul(buf, 0,&num_vfs)< 0) >> + return -EINVAL; >> + >> + if ((num_vfs> pdev->sriov->total) || >> + (pdev->sriov->nr_virtfn != 0)) >> + return -EINVAL; >> + >> + /* Ready for lift-off! */ >> + if (pdev->driver&& pdev->driver->sriov_enable) { >> + num_vf_enabled = pdev->driver->sriov_enable(pdev, num_vfs); >> + } >> + >> +out: >> + if (num_vf_enabled != num_vfs) >> + printk(KERN_WARNING >> + "%s: %04x:%02x:%02x.%d: Only %d VFs enabled \n", >> + pci_name(pdev), pci_domain_nr(pdev->bus), >> + pdev->bus->number, PCI_SLOT(pdev->devfn), >> + PCI_FUNC(pdev->devfn), num_vf_enabled); >> + >> + return count; >> +} >> + >> +static ssize_t sriov_disable_vfs_store(struct device *dev, >> + struct device_attribute *attr, >> + const char *buf, size_t count) >> +{ >> + struct pci_dev *pdev; >> + >> + pdev = to_pci_dev(dev); >> + >> + /* make sure sriov device& at least 1 vf enabled */ >> + if (!pci_dev_has_sriov(pdev) || >> + (pdev->sriov->nr_virtfn == 0)) >> + goto out; >> + >> + /* Ready for landing! */ >> + if (pdev->driver&& pdev->driver->sriov_disable) { >> + pdev->driver->sriov_disable(pdev); >> + } >> +out: >> + return count; >> +} >> + >> +struct device_attribute pci_dev_sriov_attrs[] = { >> + __ATTR_RO(sriov_max_vfs), >> + __ATTR(sriov_enable_vfs, (S_IRUGO|S_IWUSR|S_IWGRP), >> + sriov_enable_vfs_show, sriov_enable_vfs_store), >> + __ATTR(sriov_disable_vfs, (S_IWUSR|S_IWGRP), >> + NULL, sriov_disable_vfs_store), >> + __ATTR_NULL, >> +}; >> + >> +static int pci_sriov_create_sysfs_dev_files(struct pci_dev *dev) >> +{ >> + int pos, i; >> + int retval=0; >> + >> + if ((dev->is_physfn)&& pci_is_pcie(dev)) { >> + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); >> + if (pos) { >> + for (i = 0; attr_name(pci_dev_sriov_attrs[i]); i++) { >> + retval = device_create_file(&dev->dev,&pci_dev_sriov_attrs[i]); >> + if (retval) { >> + while (--i>= 0) >> + device_remove_file(&dev->dev,&pci_dev_sriov_attrs[i]); >> + break; >> + } >> + } >> + } >> + } >> + return retval; > >> +} >> + >> +static void pci_sriov_remove_sysfs_dev_files(struct pci_dev *dev) >> +{ >> + int pos; >> + >> + if ((dev->is_physfn)&& pci_is_pcie(dev)) { >> + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); >> + if (pos) >> + device_remove_file(&dev->dev, pci_dev_sriov_attrs); >> + } >> +} > ... >> @@ -1203,14 +1365,18 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev) >> goto error; >> dev->reset_fn = 1; >> } >> + >> + /* SRIOV */ >> + retval = pci_sriov_create_sysfs_dev_files(dev); >> + if (retval) >> + goto error; >> + > > No, You should not use function for that. > > According to Greg, you should use group visible to do that. > > Please check the patches that i send out before. > > here I attached them again. > > Yinghai Greg: Why not use the above functions? and what are 'visible groups' ?? I tried to follow how other optional files were added, which didn't involve groups or visibility. ... well, ok, I followed optional symlinks in iov.c... which didn't use group attributes, and these file seemed no different than other pcie attributes which is handled in the pci-sysfs.c module in this fashion. Yinghai: As for looking at your patches, I did that.... and they made no sense, i.e., I don't see the value/use/need of 'is_visible'. maybe if there was some decent documentation/comments on this stuff, I could confidently use them.... if they exist, please point me in the that direction. if not, then point me to better examples. Looking in Documentation, I didn't find anything to help. Doing a grep on is_visible, all I see are flags being added to subsystem-specific struct's, and the use of device_[create,remove]_files(). -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 2, 2012 at 1:23 PM, Don Dutile <ddutile@redhat.com> wrote: > > Yinghai: As for looking at your patches, I did that.... and > they made no sense, i.e., I don't see the value/use/need of > 'is_visible'. sysfs frame work use it during create sysfs files for that group. > maybe if there was some decent > documentation/comments on this stuff, I could > confidently use them.... if they exist, please point me > in the that direction. if not, then point me to better examples. > Looking in Documentation, sriov_xxx.patch should create one file, you could just expand them to make three. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 02, 2012 at 04:23:40PM -0400, Don Dutile wrote:
> Greg: Why not use the above functions?
What "above functions"? You make a lot of calls in the code :)
You are creating an attribute group, and then manually walking through
it to create/destroy the file? That's not good, use the in-kernel
functions to do that automatically for you.
Also, you need to do this _before_ the device shows up to userspace,
otherwise you have a race that you just lost with udev and the like.
thanks,
greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/02/2012 04:39 PM, Greg Kroah-Hartman wrote: > On Tue, Oct 02, 2012 at 04:23:40PM -0400, Don Dutile wrote: >> Greg: Why not use the above functions? > > What "above functions"? You make a lot of calls in the code :) > > You are creating an attribute group, and then manually walking through > it to create/destroy the file? That's not good, use the in-kernel > functions to do that automatically for you. > > Also, you need to do this _before_ the device shows up to userspace, > otherwise you have a race that you just lost with udev and the like. > > thanks, > > greg k-h So, why is it ok to create the 'reset' file after the device shows up? Like reset, SRIOV is optional. The sysfs files are designed to only be used by userspace utils to enable/disable VFs after the system is up. Or should the design allow a udev rule to be created that if one of these new sriov files exists, then perform a set of read-max-vf & enable n-vf ops ? -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 02, 2012 at 05:06:34PM -0400, Don Dutile wrote: > On 10/02/2012 04:39 PM, Greg Kroah-Hartman wrote: > >On Tue, Oct 02, 2012 at 04:23:40PM -0400, Don Dutile wrote: > >>Greg: Why not use the above functions? > > > >What "above functions"? You make a lot of calls in the code :) > > > >You are creating an attribute group, and then manually walking through > >it to create/destroy the file? That's not good, use the in-kernel > >functions to do that automatically for you. > > > >Also, you need to do this _before_ the device shows up to userspace, > >otherwise you have a race that you just lost with udev and the like. > > > >thanks, > > > >greg k-h > > So, why is it ok to create the 'reset' file after the device shows up? It's not, that should be fixed, patches are always welcome. > Like reset, SRIOV is optional. The sysfs files are designed to only > be used by userspace utils to enable/disable VFs after the system is up. > > Or should the design allow a udev rule to be created that if one > of these new sriov files exists, then perform a set of read-max-vf & > enable n-vf ops ? Yes it should. Never create sysfs files after the device is present in the system, otherwise userspace never knows about it. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 2, 2012 at 8:10 PM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > On Tue, Oct 02, 2012 at 05:06:34PM -0400, Don Dutile wrote: >> >> So, why is it ok to create the 'reset' file after the device shows up? > > It's not, that should be fixed, patches are always welcome. > I tried to move "reset" to device type, but it did not work. Anyway will send out two patches for dev_type and vga again. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Also, you need to do this _before_ the device shows up to userspace, >> otherwise you have a race that you just lost with udev and the like. >> >> thanks, >> >> greg k-h > > > So, why is it ok to create the 'reset' file after the device shows up? > Like reset, SRIOV is optional. The sysfs files are designed to only > be used by userspace utils to enable/disable VFs after the system is up. It should (the attribute) be there when the device is created. Otherwise you end up with the case when udev could call 'reset' attribute the moment the SR-IOV device shows up. Or the userspace utility could. > Or should the design allow a udev rule to be created that if one > of these new sriov files exists, then perform a set of read-max-vf & > enable n-vf ops ? Well, that depends on the ingenuity of the udev rules writers. But we should also make sure that the concept is race-free. You don't want the udev rules to have 'sleep 1' to work-around the kernel code. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/02/2012 11:10 PM, Greg Kroah-Hartman wrote: > On Tue, Oct 02, 2012 at 05:06:34PM -0400, Don Dutile wrote: >> On 10/02/2012 04:39 PM, Greg Kroah-Hartman wrote: >>> On Tue, Oct 02, 2012 at 04:23:40PM -0400, Don Dutile wrote: >>>> Greg: Why not use the above functions? >>> >>> What "above functions"? You make a lot of calls in the code :) >>> >>> You are creating an attribute group, and then manually walking through >>> it to create/destroy the file? That's not good, use the in-kernel >>> functions to do that automatically for you. >>> >>> Also, you need to do this _before_ the device shows up to userspace, >>> otherwise you have a race that you just lost with udev and the like. >>> >>> thanks, >>> >>> greg k-h >> >> So, why is it ok to create the 'reset' file after the device shows up? > > It's not, that should be fixed, patches are always welcome. > >> Like reset, SRIOV is optional. The sysfs files are designed to only >> be used by userspace utils to enable/disable VFs after the system is up. >> >> Or should the design allow a udev rule to be created that if one >> of these new sriov files exists, then perform a set of read-max-vf& >> enable n-vf ops ? > > Yes it should. Never create sysfs files after the device is present in > the system, otherwise userspace never knows about it. > > thanks, > > greg k-h ok, thanks for feedback and bit of education .... still looking for more on this 'visible' tagging though. so, I guess Bjorn ought to add 'cleaning up pci-sysfs' to his PCI to-do list... maybe once i figure out the proper sriov sysfs code, I can duplicate that effort in the reset,vga & pm-capabilities section. - Don -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
First two patches are old patches in my queue. They introduce pci_dev_type and use it with vga in sysfs. other two will add set_max_vfs in sysfs for device that support siov. last one is draft version for ixgbe, still need ixgbe guys to sort it out. Thanks Yinghai Yinghai Lu (5): PCI: Add pci_dev_type PCI, sys: Use is_visable() with boot_vga attribute for pci_dev PCI: add set_max_vfs in pci_driver ops PCI: Add max_vfs in sysfs per pci device where supports ixgbe: add driver set_max_vfs support drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 44 +++++++++--- drivers/pci/pci-sysfs.c | 96 ++++++++++++++++++++++--- drivers/pci/pci.h | 1 + drivers/pci/probe.c | 1 + include/linux/pci.h | 2 + 6 files changed, 126 insertions(+), 20 deletions(-)
On Wed, Oct 3, 2012 at 6:18 AM, Don Dutile <ddutile@redhat.com> wrote: > maybe once i figure out the proper sriov sysfs code, I > can duplicate that effort in the reset,vga & pm-capabilities > section. I resent my patcheset in plain mail instead of attached form. I am thinking we should ixgbe guys to sort out ixgbe change to use set_max_vfs ops. Thanks Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----Original Message----- > From: yhlu.kernel@gmail.com [mailto:yhlu.kernel@gmail.com] On Behalf Of > Yinghai Lu > Sent: Wednesday, October 03, 2012 10:56 AM > To: Don Dutile > Cc: Greg Kroah-Hartman; linux-pci@vger.kernel.org; bhelgaas@google.com; > yuvalmin@broadcom.com; bhutchings@solarflare.com; Rose, Gregory V; > davem@davemloft.net > Subject: Re: [RFC] PCI: enable and disable sriov support via sysfs at per > device level > > On Wed, Oct 3, 2012 at 6:18 AM, Don Dutile <ddutile@redhat.com> wrote: > > maybe once i figure out the proper sriov sysfs code, I can duplicate > > that effort in the reset,vga & pm-capabilities section. > > I resent my patcheset in plain mail instead of attached form. > > I am thinking we should ixgbe guys to sort out ixgbe change to use > set_max_vfs ops. I'm willing to do that but I'm sort of waiting for things to settle out a bit and see which set of patches everyone settles on before diving into it. - Greg > > Thanks > > Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 3, 2012 at 11:16 AM, Rose, Gregory V <gregory.v.rose@intel.com> wrote: >> >> I resent my patcheset in plain mail instead of attached form. >> >> I am thinking we should ixgbe guys to sort out ixgbe change to use >> set_max_vfs ops. > > I'm willing to do that but I'm sort of waiting for things to settle out a bit and see which set of patches everyone settles on before diving into it. That would be great ! Actually your changes to ixgbe should be critical changes. others change to drivers/pci/pci-sysfs.c or adding 3 or 1 methods in driver ops is not that really matter. Thanks Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 6869009..1b5eab7 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -404,6 +404,152 @@ static ssize_t d3cold_allowed_show(struct device *dev, } #endif + +#ifdef CONFIG_PCI_IOV +static ssize_t sriov_max_vfs_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct pci_dev *pdev; + + pdev = to_pci_dev(dev); + return sprintf (buf, "%u\n", pdev->sriov->total); +} + +bool pci_dev_has_sriov(struct pci_dev *pdev) +{ + int ret = false; + int pos; + + if (!pci_is_pcie(pdev)) + goto out; + + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV); + if (pos) + ret = true; +out: + return ret; +} + +static ssize_t sriov_enable_vfs_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct pci_dev *pdev; + int nr_virtfn = 0; + + pdev = to_pci_dev(dev); + + if (pci_dev_has_sriov(pdev)) + nr_virtfn = pdev->sriov->nr_virtfn; + + return sprintf (buf, "%u\n", nr_virtfn); +} + +static ssize_t sriov_enable_vfs_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct pci_dev *pdev; + int num_vf_enabled = 0; + unsigned long num_vfs; + pdev = to_pci_dev(dev); + + if (!pci_dev_has_sriov(pdev)) + goto out; + + /* Requested VFs to enable < max_vfs + * and none enabled already + */ + if (strict_strtoul(buf, 0, &num_vfs) < 0) + return -EINVAL; + + if ((num_vfs > pdev->sriov->total) || + (pdev->sriov->nr_virtfn != 0)) + return -EINVAL; + + /* Ready for lift-off! */ + if (pdev->driver && pdev->driver->sriov_enable) { + num_vf_enabled = pdev->driver->sriov_enable(pdev, num_vfs); + } + +out: + if (num_vf_enabled != num_vfs) + printk(KERN_WARNING + "%s: %04x:%02x:%02x.%d: Only %d VFs enabled \n", + pci_name(pdev), pci_domain_nr(pdev->bus), + pdev->bus->number, PCI_SLOT(pdev->devfn), + PCI_FUNC(pdev->devfn), num_vf_enabled); + + return count; +} + +static ssize_t sriov_disable_vfs_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct pci_dev *pdev; + + pdev = to_pci_dev(dev); + + /* make sure sriov device & at least 1 vf enabled */ + if (!pci_dev_has_sriov(pdev) || + (pdev->sriov->nr_virtfn == 0)) + goto out; + + /* Ready for landing! */ + if (pdev->driver && pdev->driver->sriov_disable) { + pdev->driver->sriov_disable(pdev); + } +out: + return count; +} + +struct device_attribute pci_dev_sriov_attrs[] = { + __ATTR_RO(sriov_max_vfs), + __ATTR(sriov_enable_vfs, (S_IRUGO|S_IWUSR|S_IWGRP), + sriov_enable_vfs_show, sriov_enable_vfs_store), + __ATTR(sriov_disable_vfs, (S_IWUSR|S_IWGRP), + NULL, sriov_disable_vfs_store), + __ATTR_NULL, +}; + +static int pci_sriov_create_sysfs_dev_files(struct pci_dev *dev) +{ + int pos, i; + int retval=0; + + if ((dev->is_physfn) && pci_is_pcie(dev)) { + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); + if (pos) { + for (i = 0; attr_name(pci_dev_sriov_attrs[i]); i++) { + retval = device_create_file(&dev->dev, &pci_dev_sriov_attrs[i]); + if (retval) { + while (--i >= 0) + device_remove_file(&dev->dev, &pci_dev_sriov_attrs[i]); + break; + } + } + } + } + return retval; +} + +static void pci_sriov_remove_sysfs_dev_files(struct pci_dev *dev) +{ + int pos; + + if ((dev->is_physfn) && pci_is_pcie(dev)) { + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV); + if (pos) + device_remove_file(&dev->dev, pci_dev_sriov_attrs); + } +} +#else +static int pci_sriov_create_sysfs_dev_files(struct pci_dev *dev) { return 0; } +static void pci_sriov_remove_sysfs_dev_files(struct pci_dev *dev) { return; } +#endif /* CONFIG_PCI_IOV */ + struct device_attribute pci_dev_attrs[] = { __ATTR_RO(resource), __ATTR_RO(vendor), @@ -1169,6 +1315,22 @@ static ssize_t reset_store(struct device *dev, static struct device_attribute reset_attr = __ATTR(reset, 0200, NULL, reset_store); +static void pci_reset_remove_sysfs_dev_file(struct pci_dev *dev) +{ + if (dev->reset_fn) { + device_remove_file(&dev->dev, &reset_attr); + dev->reset_fn = 0; + } +} + +static void pci_vpd_remove_sysfs_dev_file(struct pci_dev *dev) +{ + if (dev->vpd && dev->vpd->attr) { + sysfs_remove_bin_file(&dev->dev.kobj, dev->vpd->attr); + kfree(dev->vpd->attr); + } +} + static int pci_create_capabilities_sysfs(struct pci_dev *dev) { int retval; @@ -1203,14 +1365,18 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev) goto error; dev->reset_fn = 1; } + + /* SRIOV */ + retval = pci_sriov_create_sysfs_dev_files(dev); + if (retval) + goto error; + return 0; error: + pci_reset_remove_sysfs_dev_file(dev); pcie_aspm_remove_sysfs_dev_files(dev); - if (dev->vpd && dev->vpd->attr) { - sysfs_remove_bin_file(&dev->dev.kobj, dev->vpd->attr); - kfree(dev->vpd->attr); - } + pci_vpd_remove_sysfs_dev_file(dev); return retval; } @@ -1303,16 +1469,10 @@ err: static void pci_remove_capabilities_sysfs(struct pci_dev *dev) { - if (dev->vpd && dev->vpd->attr) { - sysfs_remove_bin_file(&dev->dev.kobj, dev->vpd->attr); - kfree(dev->vpd->attr); - } - + pci_sriov_remove_sysfs_dev_files(dev); + pci_reset_remove_sysfs_dev_file(dev); pcie_aspm_remove_sysfs_dev_files(dev); - if (dev->reset_fn) { - device_remove_file(&dev->dev, &reset_attr); - dev->reset_fn = 0; - } + pci_vpd_remove_sysfs_dev_file(dev); } /** diff --git a/include/linux/pci.h b/include/linux/pci.h index 5faa831..29e10aa 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -596,6 +596,8 @@ struct pci_driver { int (*resume_early) (struct pci_dev *dev); int (*resume) (struct pci_dev *dev); /* Device woken up */ void (*shutdown) (struct pci_dev *dev); + int (*sriov_enable) (struct pci_dev *dev, int num_vfs); /* PF pci dev */ + void (*sriov_disable) (struct pci_dev *dev); struct pci_error_handlers *err_handler; struct device_driver driver; struct pci_dynids dynids;
Provide files under sysfs to determine the max number of vfs an SRIOV-capable PCIe device supports, and methods to enable and disable the vfs on a per device basis. Currently, VF enablement by SRIOV-capable PCIe devices is done in driver-specific module parameters. If not setup in modprobe files, it requires admin to unload & reload PF drivers with number of desired VFs to enable. Additionally, the enablement is system wide: all devices controlled by the same driver have the same number of VFs enabled. Although the latter is probably desired, there are PCI configurations setup by system BIOS that may not enable that to occur. Three files are created if a PCIe device has SRIOV support: sriov_max_vfs -- cat-ing this file returns the maximum number of VFs a PCIe device supports. sriov_enable_vfs -- echo'ing a number to this file enables this number of VFs for this given PCIe device. -- cat-ing this file will return the number of VFs currently enabled on this PCIe device. sriov_disable_vfs -- echo-ing any number other than 0 disables all VFs associated with this PCIe device. VF enable and disablement is invoked much like other PCIe configuration functions -- via registered callbacks in the driver, i.e., probe, release, etc. Note: I haven't had a chance to refactor an SRIOV PF driver to test against; hoping to try ixgbe or igb in next couple days. To date, just tested that cat-ing sriov_max_vfs works, and cat-ing sriov_enable_vfs returns the correct number when a PF driver has been loaded with VFs enabled via per-driver param, e.g., modprobe igb max_vfs=4 Send comments and I'll integrate as needed, while modifying a PF driver to use this interface. Signed-off-by: Donald Dutile <ddutile@redhat.com> --- drivers/pci/pci-sysfs.c | 186 ++++++++++++++++++++++++++++++++++++++++++++---- include/linux/pci.h | 2 + 2 files changed, 175 insertions(+), 13 deletions(-)