Message ID | 20240410115808.12896-1-parav@nvidia.com
---|---
Series | devlink: Support setting max_io_eqs
On 4/10/2024 6:58 AM, Parav Pandit wrote:
> Devices send event notifications for the IO queues,
> such as tx and rx queues, through event queues.
>
> Enable a privileged owner, such as a hypervisor PF, to set the number
> of IO event queues for the VF and SF during the provisioning stage.

How do you provision tx/rx queues for VFs & SFs?
Don't you need a similar mechanism to set up max tx/rx queues too?

> example:
> Get maximum IO event queues of the VF device::
>
>   $ devlink port show pci/0000:06:00.0/2
>   pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
>     function:
>       hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10
>
> Set maximum IO event queues of the VF device::
>
>   $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32
>
>   $ devlink port show pci/0000:06:00.0/2
>   pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
>     function:
>       hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32
>
> patch summary:
> patch-1 updates devlink uapi
> patch-2 adds print, get and set routines for max_io_eqs field
>
> changelog:
> v1->v2:
> - addressed comments from Jiri
> - updated man page for the new parameter
> - corrected print to not have EQs value as optional
> - replaced 'value' with 'EQs'
>
> Parav Pandit (2):
>   uapi: Update devlink kernel headers
>   devlink: Support setting max_io_eqs
>
>  devlink/devlink.c            | 29 ++++++++++++++++++++++++++++-
>  include/uapi/linux/devlink.h |  1 +
>  man/man8/devlink-port.8      | 12 ++++++++++++
>  3 files changed, 41 insertions(+), 1 deletion(-)
Hi Sridhar,

> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Sent: Thursday, April 11, 2024 4:53 AM
>
> On 4/10/2024 6:58 AM, Parav Pandit wrote:
> > Devices send event notifications for the IO queues, such as tx and rx
> > queues, through event queues.
> >
> > Enable a privileged owner, such as a hypervisor PF, to set the number
> > of IO event queues for the VF and SF during the provisioning stage.
>
> How do you provision tx/rx queues for VFs & SFs?
> Don't you need a similar mechanism to set up max tx/rx queues too?

Currently we don't; they are derived from the IO event queues.
As you know, sometimes more txqs than IO event queues are needed, for
XDP, timestamping, and multiple TCs.
If needed, additional knobs for txq and rxq can be added to restrict
device resources.
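To make the derivation above concrete: with one IO EQ per channel, a function's txq count scales with per-channel features, so it is not 1:1 with the EQ count. The sketch below is a hypothetical illustration only; the struct, function, and feature names are invented and are not taken from mlx5 or any other driver.

```c
#include <stdio.h>

/* Hypothetical per-channel features that each add txqs (invented names). */
struct channel_features {
	unsigned int num_tcs;	/* one txq per traffic class */
	int xdp_enabled;	/* extra txq per channel for XDP_TX */
	int ptp_enabled;	/* extra txq for hardware timestamping */
};

/* txqs needed when one channel is created per IO event queue */
static unsigned int derive_txqs(unsigned int num_io_eqs,
				const struct channel_features *f)
{
	unsigned int per_channel = f->num_tcs +
				   (f->xdp_enabled ? 1 : 0) +
				   (f->ptp_enabled ? 1 : 0);

	return num_io_eqs * per_channel;
}

int main(void)
{
	struct channel_features f = { .num_tcs = 2, .xdp_enabled = 1, .ptp_enabled = 0 };

	/* 10 IO EQs can imply 30 txqs: the two counts are not 1:1 */
	printf("txqs = %u\n", derive_txqs(10, &f));
	return 0;
}
```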
On 4/10/2024 9:32 PM, Parav Pandit wrote:
> Hi Sridhar,
>
>> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
>> Sent: Thursday, April 11, 2024 4:53 AM
>>
>> On 4/10/2024 6:58 AM, Parav Pandit wrote:
>>> Devices send event notifications for the IO queues, such as tx and rx
>>> queues, through event queues.
>>>
>>> Enable a privileged owner, such as a hypervisor PF, to set the number
>>> of IO event queues for the VF and SF during the provisioning stage.
>>
>> How do you provision tx/rx queues for VFs & SFs?
>> Don't you need a similar mechanism to set up max tx/rx queues too?
>
> Currently we don't; they are derived from the IO event queues.
> As you know, sometimes more txqs than IO event queues are needed, for
> XDP, timestamping, and multiple TCs.
> If needed, additional knobs for txq and rxq can be added to restrict
> device resources.

Rather than deriving tx and rx queues from IO event queues, isn't it
more user friendly to do it the other way around? Let the host admin
set the max number of tx and rx queues allowed, and let the driver
derive the number of IO event queues from those values. This would be
consistent with what ethtool reports as pre-set maximum values for the
corresponding VF/SF.
On 4/11/24 5:03 PM, Samudrala, Sridhar wrote:
> On 4/10/2024 9:32 PM, Parav Pandit wrote:
>> Hi Sridhar,
>>
>>> How do you provision tx/rx queues for VFs & SFs?
>>> Don't you need a similar mechanism to set up max tx/rx queues too?
>>
>> Currently we don't; they are derived from the IO event queues.
>> As you know, sometimes more txqs than IO event queues are needed, for
>> XDP, timestamping, and multiple TCs.
>> If needed, additional knobs for txq and rxq can be added to restrict
>> device resources.
>
> Rather than deriving tx and rx queues from IO event queues, isn't it
> more user friendly to do it the other way around? Let the host admin
> set the max number of tx and rx queues allowed, and let the driver
> derive the number of IO event queues from those values. This would be
> consistent with what ethtool reports as pre-set maximum values for the
> corresponding VF/SF.

I agree with this point: IO EQs seem to be an mlx5 thing (or maybe I
have not reviewed enough of the other drivers). Rx and Tx queues are
already part of the ethtool API. This devlink feature allows resource
limits to be configured, and a consistent API across tools would be
better for users.
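For context on the ethtool interface David mentions: queue maxima are reported to userspace through the get_channels callback of struct ethtool_ops, using the max_* fields of struct ethtool_channels. Below is a minimal sketch of how a driver whose channels are EQ-bound might report them; struct example_priv and its fields are hypothetical.

```c
#include <linux/ethtool.h>
#include <linux/netdevice.h>

/* Hypothetical driver private data carrying the function's EQ budget. */
struct example_priv {
	u32 max_io_eqs;		/* cap provisioned for this function */
	u32 num_channels;	/* channels currently configured */
};

static void example_get_channels(struct net_device *dev,
				 struct ethtool_channels *ch)
{
	struct example_priv *priv = netdev_priv(dev);

	/* pre-set maximum: one combined channel per available IO EQ */
	ch->max_combined = priv->max_io_eqs;
	ch->combined_count = priv->num_channels;
}

static const struct ethtool_ops example_ethtool_ops = {
	.get_channels = example_get_channels,
};
```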
Hi David, Sridhar,

> From: David Ahern <dsahern@kernel.org>
> Sent: Friday, April 12, 2024 7:36 AM
>
> On 4/11/24 5:03 PM, Samudrala, Sridhar wrote:
> > [...]
> > Rather than deriving tx and rx queues from IO event queues, isn't it
> > more user friendly to do it the other way around? Let the host admin
> > set the max number of tx and rx queues allowed, and let the driver
> > derive the number of IO event queues from those values. This would
> > be consistent with what ethtool reports as pre-set maximum values
> > for the corresponding VF/SF.
>
> I agree with this point: IO EQs seem to be an mlx5 thing (or maybe I
> have not reviewed enough of the other drivers).

IO EQs are used by hns3, mana, mlx5, mlxsw, and be2net. They might
simply not yet have the need to provision them.

> Rx and Tx queues are already part of the ethtool API. This devlink
> feature allows resource limits to be configured, and a consistent API
> across tools would be better for users.

The IO EQs of a function are utilized by the non-netdev stack as well;
on a multi-functionality function they serve, for example, as RDMA
completion vectors.
Txq and rxq are yet another separate resource, so they are not mutually
exclusive with IO EQs.

I can additionally add txq and rxq provisioning knobs too, if this is
useful, yes?

Sridhar,
I haven't checked lately how usable this is for other drivers; will
you also implement the txq and rxq callbacks?
Please let me know; I can start the work later next week on those
additional knobs.
> From: Parav Pandit <parav@nvidia.com>
> Sent: Friday, April 12, 2024 9:02 AM
>
> Hi David, Sridhar,
>
> [...]
>
> I can additionally add txq and rxq provisioning knobs too, if this is
> useful, yes?
>
> Sridhar,
> I haven't checked lately how usable this is for other drivers; will
> you also implement the txq and rxq callbacks?
> Please let me know; I can start the work later next week on those
> additional knobs.

I also forgot to mention in the above reply that some drivers, like
mlx5, create internal tx and rx queues that are not directly visible
as channels: for XDP, timestamping, traffic classes, dropping certain
packets on rx, etc.
So an exact derivation of IO event queues is also hard there.

Regardless, to me both knobs are useful, and the driver will create the
min() of the resources allowed by both device limits.
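A minimal sketch of that min() combination, with invented names, assuming one IO EQ per channel and a fixed number of txqs per channel (txqs_per_channel must be at least 1):

```c
/* Hypothetical helper: the driver creates only as many channels as both
 * provisioning knobs allow, whichever limit is hit first. */
static unsigned int usable_channels(unsigned int max_io_eqs,
				    unsigned int max_txqs,
				    unsigned int txqs_per_channel)
{
	unsigned int by_eqs = max_io_eqs;	/* one IO EQ per channel */
	unsigned int by_txqs = max_txqs / txqs_per_channel;

	return by_eqs < by_txqs ? by_eqs : by_txqs;
}
```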
On 4/12/2024 12:22 AM, Parav Pandit wrote:
>> From: Parav Pandit <parav@nvidia.com>
>> Sent: Friday, April 12, 2024 9:02 AM
>>
>> [...]
>>
>> I can additionally add txq and rxq provisioning knobs too, if this is
>> useful, yes?

Yes. We need knobs for txq and rxq too.
An IO EQ looks like a completion queue. We don't need these knobs for
the ice driver at this time, but for our idpf-based control/switchdev
driver we need a way to set the max number of tx queues, rx queues, rx
buffer queues, and tx completion queues.

>> Sridhar,
>> I haven't checked lately how usable this is for other drivers; will
>> you also implement the txq and rxq callbacks?
>> Please let me know; I can start the work later next week on those
>> additional knobs.

Sure. Our subfunction support for ice is currently under review and we
are defaulting to 1 rx/tx queue for now. These knobs would be required
and useful when we enable more than 1 queue for each SF.
> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Sent: Saturday, April 13, 2024 3:33 AM
>
> On 4/12/2024 12:22 AM, Parav Pandit wrote:
> >> [...]
> >> I can additionally add txq and rxq provisioning knobs too, if this
> >> is useful, yes?
>
> Yes. We need knobs for txq and rxq too.
> An IO EQ looks like a completion queue. We don't need these knobs for
> the ice driver at this time, but for our idpf-based control/switchdev
> driver we need a way to set the max number of tx queues, rx queues,
> rx buffer queues, and tx completion queues.

Understood. Makes sense.

> >> Sridhar,
> >> I haven't checked lately how usable this is for other drivers;
> >> will you also implement the txq and rxq callbacks?
> >> Please let me know; I can start the work later next week on those
> >> additional knobs.
>
> Sure. Our subfunction support for ice is currently under review and
> we are defaulting to 1 rx/tx queue for now. These knobs would be
> required and useful when we enable more than 1 queue for each SF.

Got it. I will start the kernel-side patches and CC you for review
after completing this iproute2 patch. It would be good if you could
help verify them on your device.
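For reference, the kernel-side work Parav refers to pairs this iproute2 change with get/set callbacks for max_io_eqs in struct devlink_port_ops. The sketch below paraphrases their rough shape from the companion netdev series; treat the exact callback names and signatures as illustrative rather than authoritative, and the example_* implementations as hypothetical.

```c
#include <net/devlink.h>

static int example_port_fn_max_io_eqs_get(struct devlink_port *port,
					  u32 *max_io_eqs,
					  struct netlink_ext_ack *extack)
{
	*max_io_eqs = 10;	/* placeholder: query the device instead */
	return 0;
}

static int example_port_fn_max_io_eqs_set(struct devlink_port *port,
					  u32 max_io_eqs,
					  struct netlink_ext_ack *extack)
{
	/* placeholder: program the VF/SF cap into the device */
	return 0;
}

static const struct devlink_port_ops example_port_ops = {
	.port_fn_max_io_eqs_get = example_port_fn_max_io_eqs_get,
	.port_fn_max_io_eqs_set = example_port_fn_max_io_eqs_set,
};
```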
Hello:

This series was applied to iproute2/iproute2-next.git (main)
by David Ahern <dsahern@kernel.org>:

On Wed, 10 Apr 2024 14:58:06 +0300 you wrote:
> Devices send event notifications for the IO queues,
> such as tx and rx queues, through event queues.
>
> Enable a privileged owner, such as a hypervisor PF, to set the number
> of IO event queues for the VF and SF during the provisioning stage.
>
> example:
> Get maximum IO event queues of the VF device::
>
> [...]

Here is the summary with links:
  - [v2,1/2] uapi: Update devlink kernel headers
    (no matching commit)
  - [v2,2/2] devlink: Support setting max_io_eqs
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=e8add23c59b7

You are awesome, thank you!