Message ID | 20231017134217.82497-1-yishaih@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | Introduce a vfio driver over virtio devices | expand |
On 17/10/2023 16:42, Yishai Hadas wrote: > This series introduce a vfio driver over virtio devices to support the > legacy interface functionality for VFs. > > Background, from the virtio spec [1]. > -------------------------------------------------------------------- > In some systems, there is a need to support a virtio legacy driver with > a device that does not directly support the legacy interface. In such > scenarios, a group owner device can provide the legacy interface > functionality for the group member devices. The driver of the owner > device can then access the legacy interface of a member device on behalf > of the legacy member device driver. > > For example, with the SR-IOV group type, group members (VFs) can not > present the legacy interface in an I/O BAR in BAR0 as expected by the > legacy pci driver. If the legacy driver is running inside a virtual > machine, the hypervisor executing the virtual machine can present a > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the > legacy driver accesses to this I/O BAR and forwards them to the group > owner device (PF) using group administration commands. > -------------------------------------------------------------------- > > The first 6 patches are in the virtio area and handle the below: > - Fix common config map for modern device as was reported by Michael Tsirkin. > - Introduce the admin virtqueue infrastcture. > - Expose the layout of the commands that should be used for > supporting the legacy access. > - Expose APIs to enable upper layers as of vfio, net, etc > to execute admin commands. > > The above follows the virtio spec that was lastly accepted in that area > [1]. > > The last 3 patches are in the vfio area and handle the below: > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver. > - Introduce a vfio driver over virtio devices to support the legacy > interface functionality for VFs. > > The series was tested successfully over virtio-net VFs in the host, > while running in the guest both modern and legacy drivers. > > [1] > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c > > Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html > > Virtio: > - Fix the common config map size issue that was reported by Michael > Tsirkin. > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by > Michael, instead skip the AQ specifically. > - Move admin vq implementation into virtio_pci_modern.c as was asked by > Michael. > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra > corresponding renames. > - Remove exported symbols virtio_pci_vf_get_pf_dev(), > virtio_admin_cmd_exec() as now callers are local to the module. > - Handle inflight commands as part of the device reset flow. > - Introduce APIs per admin command in virtio-pci as was asked by Michael. > > Vfio: > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for > vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by > Alex. > - Drop the intermediate patch which prepares the commands and calls the > generic virtio admin command API (i.e. virtio_admin_cmd_exec()). > - Instead, call directly to the new APIs per admin command that are > exported from Virtio - based on Michael's request. > - Enable only virtio-net as part of the pci_device_id table to enforce > upon binding only what is supported as suggested by Alex. > - Add support for byte-wise access (read/write) over the device config > region as was asked by Alex. > - Consider whether MSIX is practically enabled/disabled to choose the > right opcode upon issuing read/write admin command, as mentioned > by Michael. > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines > as was suggested by Michael. > - Set the '.close_device' op to vfio_pci_core_close_device() as was > pointed by Alex. > - Adapt to Vfio multi-line comment style in a few places. > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file > to be CCed for the new driver as was suggested by Jason. > > Yishai > > Feng Liu (5): > virtio-pci: Fix common config map for modern device > virtio: Define feature bit for administration virtqueue > virtio-pci: Introduce admin virtqueue > virtio-pci: Introduce admin command sending function > virtio-pci: Introduce admin commands > > Yishai Hadas (4): > virtio-pci: Introduce APIs to execute legacy IO admin commands > vfio/pci: Expose vfio_pci_core_setup_barmap() > vfio/pci: Expose vfio_pci_iowrite/read##size() > vfio/virtio: Introduce a vfio driver over virtio devices > > MAINTAINERS | 7 + > drivers/vfio/pci/Kconfig | 2 + > drivers/vfio/pci/Makefile | 2 + > drivers/vfio/pci/vfio_pci_core.c | 25 ++ > drivers/vfio/pci/vfio_pci_rdwr.c | 38 +- > drivers/vfio/pci/virtio/Kconfig | 15 + > drivers/vfio/pci/virtio/Makefile | 4 + > drivers/vfio/pci/virtio/main.c | 577 +++++++++++++++++++++++++ > drivers/virtio/virtio.c | 37 +- > drivers/virtio/virtio_pci_common.c | 14 + > drivers/virtio/virtio_pci_common.h | 20 +- > drivers/virtio/virtio_pci_modern.c | 441 ++++++++++++++++++- > drivers/virtio/virtio_pci_modern_dev.c | 24 +- > include/linux/vfio_pci_core.h | 20 + > include/linux/virtio.h | 8 + > include/linux/virtio_config.h | 4 + > include/linux/virtio_pci_admin.h | 18 + > include/linux/virtio_pci_modern.h | 5 + > include/uapi/linux/virtio_config.h | 8 +- > include/uapi/linux/virtio_pci.h | 66 +++ > 20 files changed, 1295 insertions(+), 40 deletions(-) > create mode 100644 drivers/vfio/pci/virtio/Kconfig > create mode 100644 drivers/vfio/pci/virtio/Makefile > create mode 100644 drivers/vfio/pci/virtio/main.c > create mode 100644 include/linux/virtio_pci_admin.h > Hi Michael, Did you have the chance to review the virtio part of that series ? IMO, we addressed all your notes on V0, I would be happy to get your feedback on V1 before sending V2. In my TO-DO list for V2, have for now the below minor items. Virtio: Patch #6: Fix a krobot note where it needs to include the H file as part of the export symbols C file. Vfio: #patch #9: Rename the 'ops' variable to drop the 'acc' and potentially some rename in the description of the module with regards to 'family'. Alex, Are you fine to leave the provisioning of the VF including the control of its transitional capability in the device hands as was suggested by Jason ? Any specific recommendation following the discussion in the ML, for the 'family' note ? Once I'll have the above feedback I may prepare and send V2. Yishai
On Sun, Oct 22, 2023 at 11:20:31AM +0300, Yishai Hadas wrote: > On 17/10/2023 16:42, Yishai Hadas wrote: > > This series introduce a vfio driver over virtio devices to support the > > legacy interface functionality for VFs. > > > > Background, from the virtio spec [1]. > > -------------------------------------------------------------------- > > In some systems, there is a need to support a virtio legacy driver with > > a device that does not directly support the legacy interface. In such > > scenarios, a group owner device can provide the legacy interface > > functionality for the group member devices. The driver of the owner > > device can then access the legacy interface of a member device on behalf > > of the legacy member device driver. > > > > For example, with the SR-IOV group type, group members (VFs) can not > > present the legacy interface in an I/O BAR in BAR0 as expected by the > > legacy pci driver. If the legacy driver is running inside a virtual > > machine, the hypervisor executing the virtual machine can present a > > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the > > legacy driver accesses to this I/O BAR and forwards them to the group > > owner device (PF) using group administration commands. > > -------------------------------------------------------------------- > > > > The first 6 patches are in the virtio area and handle the below: > > - Fix common config map for modern device as was reported by Michael Tsirkin. > > - Introduce the admin virtqueue infrastcture. > > - Expose the layout of the commands that should be used for > > supporting the legacy access. > > - Expose APIs to enable upper layers as of vfio, net, etc > > to execute admin commands. > > > > The above follows the virtio spec that was lastly accepted in that area > > [1]. > > > > The last 3 patches are in the vfio area and handle the below: > > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver. > > - Introduce a vfio driver over virtio devices to support the legacy > > interface functionality for VFs. > > > > The series was tested successfully over virtio-net VFs in the host, > > while running in the guest both modern and legacy drivers. > > > > [1] > > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c > > > > Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html > > > > Virtio: > > - Fix the common config map size issue that was reported by Michael > > Tsirkin. > > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by > > Michael, instead skip the AQ specifically. > > - Move admin vq implementation into virtio_pci_modern.c as was asked by > > Michael. > > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra > > corresponding renames. > > - Remove exported symbols virtio_pci_vf_get_pf_dev(), > > virtio_admin_cmd_exec() as now callers are local to the module. > > - Handle inflight commands as part of the device reset flow. > > - Introduce APIs per admin command in virtio-pci as was asked by Michael. > > > > Vfio: > > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for > > vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by > > Alex. > > - Drop the intermediate patch which prepares the commands and calls the > > generic virtio admin command API (i.e. virtio_admin_cmd_exec()). > > - Instead, call directly to the new APIs per admin command that are > > exported from Virtio - based on Michael's request. > > - Enable only virtio-net as part of the pci_device_id table to enforce > > upon binding only what is supported as suggested by Alex. > > - Add support for byte-wise access (read/write) over the device config > > region as was asked by Alex. > > - Consider whether MSIX is practically enabled/disabled to choose the > > right opcode upon issuing read/write admin command, as mentioned > > by Michael. > > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines > > as was suggested by Michael. > > - Set the '.close_device' op to vfio_pci_core_close_device() as was > > pointed by Alex. > > - Adapt to Vfio multi-line comment style in a few places. > > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file > > to be CCed for the new driver as was suggested by Jason. > > > > Yishai > > > > Feng Liu (5): > > virtio-pci: Fix common config map for modern device > > virtio: Define feature bit for administration virtqueue > > virtio-pci: Introduce admin virtqueue > > virtio-pci: Introduce admin command sending function > > virtio-pci: Introduce admin commands > > > > Yishai Hadas (4): > > virtio-pci: Introduce APIs to execute legacy IO admin commands > > vfio/pci: Expose vfio_pci_core_setup_barmap() > > vfio/pci: Expose vfio_pci_iowrite/read##size() > > vfio/virtio: Introduce a vfio driver over virtio devices > > > > MAINTAINERS | 7 + > > drivers/vfio/pci/Kconfig | 2 + > > drivers/vfio/pci/Makefile | 2 + > > drivers/vfio/pci/vfio_pci_core.c | 25 ++ > > drivers/vfio/pci/vfio_pci_rdwr.c | 38 +- > > drivers/vfio/pci/virtio/Kconfig | 15 + > > drivers/vfio/pci/virtio/Makefile | 4 + > > drivers/vfio/pci/virtio/main.c | 577 +++++++++++++++++++++++++ > > drivers/virtio/virtio.c | 37 +- > > drivers/virtio/virtio_pci_common.c | 14 + > > drivers/virtio/virtio_pci_common.h | 20 +- > > drivers/virtio/virtio_pci_modern.c | 441 ++++++++++++++++++- > > drivers/virtio/virtio_pci_modern_dev.c | 24 +- > > include/linux/vfio_pci_core.h | 20 + > > include/linux/virtio.h | 8 + > > include/linux/virtio_config.h | 4 + > > include/linux/virtio_pci_admin.h | 18 + > > include/linux/virtio_pci_modern.h | 5 + > > include/uapi/linux/virtio_config.h | 8 +- > > include/uapi/linux/virtio_pci.h | 66 +++ > > 20 files changed, 1295 insertions(+), 40 deletions(-) > > create mode 100644 drivers/vfio/pci/virtio/Kconfig > > create mode 100644 drivers/vfio/pci/virtio/Makefile > > create mode 100644 drivers/vfio/pci/virtio/main.c > > create mode 100644 include/linux/virtio_pci_admin.h > > > Hi Michael, > > Did you have the chance to review the virtio part of that series ? Not yet, will take a couple more days. > IMO, we addressed all your notes on V0, I would be happy to get your > feedback on V1 before sending V2. > > In my TO-DO list for V2, have for now the below minor items. > Virtio: > Patch #6: Fix a krobot note where it needs to include the H file as part of > the export symbols C file. > Vfio: > #patch #9: Rename the 'ops' variable to drop the 'acc' and potentially some > rename in the description of the module with regards to 'family'. > > Alex, > Are you fine to leave the provisioning of the VF including the control of > its transitional capability in the device hands as was suggested by Jason ? > Any specific recommendation following the discussion in the ML, for the > 'family' note ? > > Once I'll have the above feedback I may prepare and send V2. > > Yishai
On Sun, 22 Oct 2023 11:20:31 +0300 Yishai Hadas <yishaih@nvidia.com> wrote: > On 17/10/2023 16:42, Yishai Hadas wrote: > > This series introduce a vfio driver over virtio devices to support the > > legacy interface functionality for VFs. > > > > Background, from the virtio spec [1]. > > -------------------------------------------------------------------- > > In some systems, there is a need to support a virtio legacy driver with > > a device that does not directly support the legacy interface. In such > > scenarios, a group owner device can provide the legacy interface > > functionality for the group member devices. The driver of the owner > > device can then access the legacy interface of a member device on behalf > > of the legacy member device driver. > > > > For example, with the SR-IOV group type, group members (VFs) can not > > present the legacy interface in an I/O BAR in BAR0 as expected by the > > legacy pci driver. If the legacy driver is running inside a virtual > > machine, the hypervisor executing the virtual machine can present a > > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the > > legacy driver accesses to this I/O BAR and forwards them to the group > > owner device (PF) using group administration commands. > > -------------------------------------------------------------------- > > > > The first 6 patches are in the virtio area and handle the below: > > - Fix common config map for modern device as was reported by Michael Tsirkin. > > - Introduce the admin virtqueue infrastcture. > > - Expose the layout of the commands that should be used for > > supporting the legacy access. > > - Expose APIs to enable upper layers as of vfio, net, etc > > to execute admin commands. > > > > The above follows the virtio spec that was lastly accepted in that area > > [1]. > > > > The last 3 patches are in the vfio area and handle the below: > > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver. > > - Introduce a vfio driver over virtio devices to support the legacy > > interface functionality for VFs. > > > > The series was tested successfully over virtio-net VFs in the host, > > while running in the guest both modern and legacy drivers. > > > > [1] > > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c > > > > Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html > > > > Virtio: > > - Fix the common config map size issue that was reported by Michael > > Tsirkin. > > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by > > Michael, instead skip the AQ specifically. > > - Move admin vq implementation into virtio_pci_modern.c as was asked by > > Michael. > > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra > > corresponding renames. > > - Remove exported symbols virtio_pci_vf_get_pf_dev(), > > virtio_admin_cmd_exec() as now callers are local to the module. > > - Handle inflight commands as part of the device reset flow. > > - Introduce APIs per admin command in virtio-pci as was asked by Michael. > > > > Vfio: > > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for > > vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by > > Alex. > > - Drop the intermediate patch which prepares the commands and calls the > > generic virtio admin command API (i.e. virtio_admin_cmd_exec()). > > - Instead, call directly to the new APIs per admin command that are > > exported from Virtio - based on Michael's request. > > - Enable only virtio-net as part of the pci_device_id table to enforce > > upon binding only what is supported as suggested by Alex. > > - Add support for byte-wise access (read/write) over the device config > > region as was asked by Alex. > > - Consider whether MSIX is practically enabled/disabled to choose the > > right opcode upon issuing read/write admin command, as mentioned > > by Michael. > > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines > > as was suggested by Michael. > > - Set the '.close_device' op to vfio_pci_core_close_device() as was > > pointed by Alex. > > - Adapt to Vfio multi-line comment style in a few places. > > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file > > to be CCed for the new driver as was suggested by Jason. > > > > Yishai > > > > Feng Liu (5): > > virtio-pci: Fix common config map for modern device > > virtio: Define feature bit for administration virtqueue > > virtio-pci: Introduce admin virtqueue > > virtio-pci: Introduce admin command sending function > > virtio-pci: Introduce admin commands > > > > Yishai Hadas (4): > > virtio-pci: Introduce APIs to execute legacy IO admin commands > > vfio/pci: Expose vfio_pci_core_setup_barmap() > > vfio/pci: Expose vfio_pci_iowrite/read##size() > > vfio/virtio: Introduce a vfio driver over virtio devices > > > > MAINTAINERS | 7 + > > drivers/vfio/pci/Kconfig | 2 + > > drivers/vfio/pci/Makefile | 2 + > > drivers/vfio/pci/vfio_pci_core.c | 25 ++ > > drivers/vfio/pci/vfio_pci_rdwr.c | 38 +- > > drivers/vfio/pci/virtio/Kconfig | 15 + > > drivers/vfio/pci/virtio/Makefile | 4 + > > drivers/vfio/pci/virtio/main.c | 577 +++++++++++++++++++++++++ > > drivers/virtio/virtio.c | 37 +- > > drivers/virtio/virtio_pci_common.c | 14 + > > drivers/virtio/virtio_pci_common.h | 20 +- > > drivers/virtio/virtio_pci_modern.c | 441 ++++++++++++++++++- > > drivers/virtio/virtio_pci_modern_dev.c | 24 +- > > include/linux/vfio_pci_core.h | 20 + > > include/linux/virtio.h | 8 + > > include/linux/virtio_config.h | 4 + > > include/linux/virtio_pci_admin.h | 18 + > > include/linux/virtio_pci_modern.h | 5 + > > include/uapi/linux/virtio_config.h | 8 +- > > include/uapi/linux/virtio_pci.h | 66 +++ > > 20 files changed, 1295 insertions(+), 40 deletions(-) > > create mode 100644 drivers/vfio/pci/virtio/Kconfig > > create mode 100644 drivers/vfio/pci/virtio/Makefile > > create mode 100644 drivers/vfio/pci/virtio/main.c > > create mode 100644 include/linux/virtio_pci_admin.h > > > Hi Michael, > > Did you have the chance to review the virtio part of that series ? > > IMO, we addressed all your notes on V0, I would be happy to get your > feedback on V1 before sending V2. > > In my TO-DO list for V2, have for now the below minor items. > Virtio: > Patch #6: Fix a krobot note where it needs to include the H file as part > of the export symbols C file. > Vfio: > #patch #9: Rename the 'ops' variable to drop the 'acc' and potentially > some rename in the description of the module with regards to 'family'. > > Alex, > Are you fine to leave the provisioning of the VF including the control > of its transitional capability in the device hands as was suggested by > Jason ? If this is the standard we're going to follow, ie. profiling of a device is expected to occur prior to the probe of the vfio-pci variant driver, then we should get the out-of-tree NVIDIA vGPU driver on board with this too. > Any specific recommendation following the discussion in the ML, for the > 'family' note ? It's not super important, it's just overly broad vs what's actually implemented. Limiting the description to virtio-net for the current implementation is fine. > Once I'll have the above feedback I may prepare and send V2. I'll try to take a more thorough look, but also note my comments to Ankit relative to config space emulation. This driver correctly implements the flags for the IO Port BAR, but does not support sizing of the BAR through config space, which I think is a shortcoming relative to that implemented by vfio-pci. QEMU doesn't rely on this, but we don't know there aren't other userspaces that depend on this behavior. Thanks, Alex
On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > Alex, > > Are you fine to leave the provisioning of the VF including the control > > of its transitional capability in the device hands as was suggested by > > Jason ? > > If this is the standard we're going to follow, ie. profiling of a > device is expected to occur prior to the probe of the vfio-pci variant > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > with this too. Those GPU drivers are using mdev not vfio-pci.. mdev doesn't have a way in its uapi to configure the mdev before it is created. I'm hopeful that the SIOV work will develop something better because we clearly need it for the general use cases of SIOV beyond VFIO. Jason
On Mon, 23 Oct 2023 12:42:57 -0300 Jason Gunthorpe <jgg@nvidia.com> wrote: > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > Alex, > > > Are you fine to leave the provisioning of the VF including the control > > > of its transitional capability in the device hands as was suggested by > > > Jason ? > > > > If this is the standard we're going to follow, ie. profiling of a > > device is expected to occur prior to the probe of the vfio-pci variant > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > with this too. > > Those GPU drivers are using mdev not vfio-pci.. The SR-IOV mdev vGPUs rely on the IOMMU backing device support which was removed from upstream. They only exist in the mdev form on downstreams which have retained this interface for compatibility and continuity. I'm not aware of any other means by which the SR-IOV RID can be used in the mdev model, therefore only the pre-SR-IOV GPUs should continue to use the mdev interface. > mdev doesn't have a way in its uapi to configure the mdev before it is > created. Of course. Thanks, Alex
On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote: > On Mon, 23 Oct 2023 12:42:57 -0300 > Jason Gunthorpe <jgg@nvidia.com> wrote: > > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > > > Alex, > > > > Are you fine to leave the provisioning of the VF including the control > > > > of its transitional capability in the device hands as was suggested by > > > > Jason ? > > > > > > If this is the standard we're going to follow, ie. profiling of a > > > device is expected to occur prior to the probe of the vfio-pci variant > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > > with this too. > > > > Those GPU drivers are using mdev not vfio-pci.. > > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which > was removed from upstream. It wasn't, but it changed forms. mdev is a sysfs framework for managing lifecycle with GUIDs only. The thing using mdev can call vfio_register_emulated_iommu_dev() or vfio_register_group_dev(). It doesn't matter to the mdev stuff. The thing using mdev is responsible to get the struct device to pass to vfio_register_group_dev() Jason
On Mon, 23 Oct 2023 13:20:43 -0300 Jason Gunthorpe <jgg@nvidia.com> wrote: > On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote: > > On Mon, 23 Oct 2023 12:42:57 -0300 > > Jason Gunthorpe <jgg@nvidia.com> wrote: > > > > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > > > > > Alex, > > > > > Are you fine to leave the provisioning of the VF including the control > > > > > of its transitional capability in the device hands as was suggested by > > > > > Jason ? > > > > > > > > If this is the standard we're going to follow, ie. profiling of a > > > > device is expected to occur prior to the probe of the vfio-pci variant > > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > > > with this too. > > > > > > Those GPU drivers are using mdev not vfio-pci.. > > > > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which > > was removed from upstream. > > It wasn't, but it changed forms. > > mdev is a sysfs framework for managing lifecycle with GUIDs only. > > The thing using mdev can call vfio_register_emulated_iommu_dev() or > vfio_register_group_dev(). > > It doesn't matter to the mdev stuff. > > The thing using mdev is responsible to get the struct device to pass > to vfio_register_group_dev() Are we describing what can be done (possibly limited to out-of-tree drivers) or what should be done and would be accepted upstream? I'm under the impression that mdev has been redefined to be more narrowly focused for emulated IOMMU devices and that devices based around a PCI VF should be making use of a vfio-pci variant driver. Are you suggesting it's the vendor's choice based on whether they want the mdev lifecycle support? We've defined certain aspects of the vfio-mdev interface as only available for emulated IOMMU devices, ex. page pinning. Thanks, Alex
On Mon, Oct 23, 2023 at 10:45:48AM -0600, Alex Williamson wrote: > On Mon, 23 Oct 2023 13:20:43 -0300 > Jason Gunthorpe <jgg@nvidia.com> wrote: > > > On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote: > > > On Mon, 23 Oct 2023 12:42:57 -0300 > > > Jason Gunthorpe <jgg@nvidia.com> wrote: > > > > > > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > > > > > > > Alex, > > > > > > Are you fine to leave the provisioning of the VF including the control > > > > > > of its transitional capability in the device hands as was suggested by > > > > > > Jason ? > > > > > > > > > > If this is the standard we're going to follow, ie. profiling of a > > > > > device is expected to occur prior to the probe of the vfio-pci variant > > > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > > > > with this too. > > > > > > > > Those GPU drivers are using mdev not vfio-pci.. > > > > > > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which > > > was removed from upstream. > > > > It wasn't, but it changed forms. > > > > mdev is a sysfs framework for managing lifecycle with GUIDs only. > > > > The thing using mdev can call vfio_register_emulated_iommu_dev() or > > vfio_register_group_dev(). > > > > It doesn't matter to the mdev stuff. > > > > The thing using mdev is responsible to get the struct device to pass > > to vfio_register_group_dev() > > Are we describing what can be done (possibly limited to out-of-tree > drivers) or what should be done and would be accepted upstream? Beyond disliking mdev, I'm not really set on how we should try to setup an extensively mediated PCI SRIOV driver. There is quite a lot of similarity to SIOV, so it may be the right answer is to put SIOV and this special mediated SRIOV case on the same, new, infrastructure. SIOV can't use variant vfio PCI drivers. mdev guid lifecycle is really ugly and quite limited anyhow. So I've been thinking we need something else. > I'm under the impression that mdev has been redefined to be more > narrowly focused for emulated IOMMU devices and that devices based > around a PCI VF should be making use of a vfio-pci variant driver. I've been viewing mdev as legacy, just let it die off with the S390 drivers and Intel GPU as the only users, ever. When we solve the SIOV issue we should come with something that can absorb what S390/GPU need too. At the end of the day we need an API to create /dev/vfioXX on demand, to configure them before creating them, and the destroy them. It doesn't matter at all how the driver that owns vfioXX operates, it will call the right iommufd APIs for RID/PASID/access/etc to do whatever its thing is. It would be wonderful if we could get to the point where the new interface can also create/destroy SRIOV vfios directly too. > Are you suggesting it's the vendor's choice based on whether they want > the mdev lifecycle support? So, in tree I would like to discourage new mdev drivers. Out of tree, I don't care, the APIs exist if people want to build things with them then they get the usual out of tree cavet. > We've defined certain aspects of the vfio-mdev interface as only > available for emulated IOMMU devices, ex. page pinning. Thanks, Did we? iommufd made it up to the driver to decide what to do, and a driver can certainly create a concurrent iommufd_access and iommufd_device if it wants. AFAICT the container stuff doesn't check, drivers can do both concurrently? Jason
> From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Monday, October 23, 2023 11:43 PM > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > Alex, > > > Are you fine to leave the provisioning of the VF including the control > > > of its transitional capability in the device hands as was suggested by > > > Jason ? > > > > If this is the standard we're going to follow, ie. profiling of a > > device is expected to occur prior to the probe of the vfio-pci variant > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > with this too. > > Those GPU drivers are using mdev not vfio-pci.. > > mdev doesn't have a way in its uapi to configure the mdev before it is > created. > > I'm hopeful that the SIOV work will develop something better because > we clearly need it for the general use cases of SIOV beyond VFIO. > The internal idxd driver version which I looked at last time leaves provisioning via idxd's own config interface. sure let's brainstorm what'd be (if possible) a general provisioning framework after it's sent out for review.