mbox series

[V1,vfio,0/9] Introduce a vfio driver over virtio devices

Message ID 20231017134217.82497-1-yishaih@nvidia.com (mailing list archive)
Headers show
Series Introduce a vfio driver over virtio devices | expand

Message

Yishai Hadas Oct. 17, 2023, 1:42 p.m. UTC
This series introduce a vfio driver over virtio devices to support the
legacy interface functionality for VFs.

Background, from the virtio spec [1].
--------------------------------------------------------------------
In some systems, there is a need to support a virtio legacy driver with
a device that does not directly support the legacy interface. In such
scenarios, a group owner device can provide the legacy interface
functionality for the group member devices. The driver of the owner
device can then access the legacy interface of a member device on behalf
of the legacy member device driver.

For example, with the SR-IOV group type, group members (VFs) can not
present the legacy interface in an I/O BAR in BAR0 as expected by the
legacy pci driver. If the legacy driver is running inside a virtual
machine, the hypervisor executing the virtual machine can present a
virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
legacy driver accesses to this I/O BAR and forwards them to the group
owner device (PF) using group administration commands.
--------------------------------------------------------------------

The first 6 patches are in the virtio area and handle the below:
- Fix common config map for modern device as was reported by Michael Tsirkin.
- Introduce the admin virtqueue infrastcture.
- Expose the layout of the commands that should be used for
  supporting the legacy access.
- Expose APIs to enable upper layers as of vfio, net, etc
  to execute admin commands.

The above follows the virtio spec that was lastly accepted in that area
[1].

The last 3 patches are in the vfio area and handle the below:
- Expose some APIs from vfio/pci to be used by the vfio/virtio driver.
- Introduce a vfio driver over virtio devices to support the legacy
  interface functionality for VFs. 

The series was tested successfully over virtio-net VFs in the host,
while running in the guest both modern and legacy drivers.

[1]
https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c

Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html

Virtio:
- Fix the common config map size issue that was reported by Michael
  Tsirkin.
- Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by
  Michael, instead skip the AQ specifically.
- Move admin vq implementation into virtio_pci_modern.c as was asked by
  Michael.
- Rename structure virtio_avq to virtio_pci_admin_vq and some extra
  corresponding renames.
- Remove exported symbols virtio_pci_vf_get_pf_dev(),
  virtio_admin_cmd_exec() as now callers are local to the module.
- Handle inflight commands as part of the device reset flow.
- Introduce APIs per admin command in virtio-pci as was asked by Michael.

Vfio:
- Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for
  vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by
  Alex.
- Drop the intermediate patch which prepares the commands and calls the
  generic virtio admin command API (i.e. virtio_admin_cmd_exec()).
- Instead, call directly to the new APIs per admin command that are
  exported from Virtio - based on Michael's request.
- Enable only virtio-net as part of the pci_device_id table to enforce
  upon binding only what is supported as suggested by Alex.
- Add support for byte-wise access (read/write) over the device config
  region as was asked by Alex.
- Consider whether MSIX is practically enabled/disabled to choose the
  right opcode upon issuing read/write admin command, as mentioned
  by Michael.
- Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines
  as was suggested by Michael.
- Set the '.close_device' op to vfio_pci_core_close_device() as was
  pointed by Alex.
- Adapt to Vfio multi-line comment style in a few places.
- Add virtualization@lists.linux-foundation.org in the MAINTAINERS file
  to be CCed for the new driver as was suggested by Jason.

Yishai

Feng Liu (5):
  virtio-pci: Fix common config map for modern device
  virtio: Define feature bit for administration virtqueue
  virtio-pci: Introduce admin virtqueue
  virtio-pci: Introduce admin command sending function
  virtio-pci: Introduce admin commands

Yishai Hadas (4):
  virtio-pci: Introduce APIs to execute legacy IO admin commands
  vfio/pci: Expose vfio_pci_core_setup_barmap()
  vfio/pci: Expose vfio_pci_iowrite/read##size()
  vfio/virtio: Introduce a vfio driver over virtio devices

 MAINTAINERS                            |   7 +
 drivers/vfio/pci/Kconfig               |   2 +
 drivers/vfio/pci/Makefile              |   2 +
 drivers/vfio/pci/vfio_pci_core.c       |  25 ++
 drivers/vfio/pci/vfio_pci_rdwr.c       |  38 +-
 drivers/vfio/pci/virtio/Kconfig        |  15 +
 drivers/vfio/pci/virtio/Makefile       |   4 +
 drivers/vfio/pci/virtio/main.c         | 577 +++++++++++++++++++++++++
 drivers/virtio/virtio.c                |  37 +-
 drivers/virtio/virtio_pci_common.c     |  14 +
 drivers/virtio/virtio_pci_common.h     |  20 +-
 drivers/virtio/virtio_pci_modern.c     | 441 ++++++++++++++++++-
 drivers/virtio/virtio_pci_modern_dev.c |  24 +-
 include/linux/vfio_pci_core.h          |  20 +
 include/linux/virtio.h                 |   8 +
 include/linux/virtio_config.h          |   4 +
 include/linux/virtio_pci_admin.h       |  18 +
 include/linux/virtio_pci_modern.h      |   5 +
 include/uapi/linux/virtio_config.h     |   8 +-
 include/uapi/linux/virtio_pci.h        |  66 +++
 20 files changed, 1295 insertions(+), 40 deletions(-)
 create mode 100644 drivers/vfio/pci/virtio/Kconfig
 create mode 100644 drivers/vfio/pci/virtio/Makefile
 create mode 100644 drivers/vfio/pci/virtio/main.c
 create mode 100644 include/linux/virtio_pci_admin.h

Comments

Yishai Hadas Oct. 22, 2023, 8:20 a.m. UTC | #1
On 17/10/2023 16:42, Yishai Hadas wrote:
> This series introduce a vfio driver over virtio devices to support the
> legacy interface functionality for VFs.
>
> Background, from the virtio spec [1].
> --------------------------------------------------------------------
> In some systems, there is a need to support a virtio legacy driver with
> a device that does not directly support the legacy interface. In such
> scenarios, a group owner device can provide the legacy interface
> functionality for the group member devices. The driver of the owner
> device can then access the legacy interface of a member device on behalf
> of the legacy member device driver.
>
> For example, with the SR-IOV group type, group members (VFs) can not
> present the legacy interface in an I/O BAR in BAR0 as expected by the
> legacy pci driver. If the legacy driver is running inside a virtual
> machine, the hypervisor executing the virtual machine can present a
> virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> legacy driver accesses to this I/O BAR and forwards them to the group
> owner device (PF) using group administration commands.
> --------------------------------------------------------------------
>
> The first 6 patches are in the virtio area and handle the below:
> - Fix common config map for modern device as was reported by Michael Tsirkin.
> - Introduce the admin virtqueue infrastcture.
> - Expose the layout of the commands that should be used for
>    supporting the legacy access.
> - Expose APIs to enable upper layers as of vfio, net, etc
>    to execute admin commands.
>
> The above follows the virtio spec that was lastly accepted in that area
> [1].
>
> The last 3 patches are in the vfio area and handle the below:
> - Expose some APIs from vfio/pci to be used by the vfio/virtio driver.
> - Introduce a vfio driver over virtio devices to support the legacy
>    interface functionality for VFs.
>
> The series was tested successfully over virtio-net VFs in the host,
> while running in the guest both modern and legacy drivers.
>
> [1]
> https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
>
> Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html
>
> Virtio:
> - Fix the common config map size issue that was reported by Michael
>    Tsirkin.
> - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by
>    Michael, instead skip the AQ specifically.
> - Move admin vq implementation into virtio_pci_modern.c as was asked by
>    Michael.
> - Rename structure virtio_avq to virtio_pci_admin_vq and some extra
>    corresponding renames.
> - Remove exported symbols virtio_pci_vf_get_pf_dev(),
>    virtio_admin_cmd_exec() as now callers are local to the module.
> - Handle inflight commands as part of the device reset flow.
> - Introduce APIs per admin command in virtio-pci as was asked by Michael.
>
> Vfio:
> - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for
>    vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by
>    Alex.
> - Drop the intermediate patch which prepares the commands and calls the
>    generic virtio admin command API (i.e. virtio_admin_cmd_exec()).
> - Instead, call directly to the new APIs per admin command that are
>    exported from Virtio - based on Michael's request.
> - Enable only virtio-net as part of the pci_device_id table to enforce
>    upon binding only what is supported as suggested by Alex.
> - Add support for byte-wise access (read/write) over the device config
>    region as was asked by Alex.
> - Consider whether MSIX is practically enabled/disabled to choose the
>    right opcode upon issuing read/write admin command, as mentioned
>    by Michael.
> - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines
>    as was suggested by Michael.
> - Set the '.close_device' op to vfio_pci_core_close_device() as was
>    pointed by Alex.
> - Adapt to Vfio multi-line comment style in a few places.
> - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file
>    to be CCed for the new driver as was suggested by Jason.
>
> Yishai
>
> Feng Liu (5):
>    virtio-pci: Fix common config map for modern device
>    virtio: Define feature bit for administration virtqueue
>    virtio-pci: Introduce admin virtqueue
>    virtio-pci: Introduce admin command sending function
>    virtio-pci: Introduce admin commands
>
> Yishai Hadas (4):
>    virtio-pci: Introduce APIs to execute legacy IO admin commands
>    vfio/pci: Expose vfio_pci_core_setup_barmap()
>    vfio/pci: Expose vfio_pci_iowrite/read##size()
>    vfio/virtio: Introduce a vfio driver over virtio devices
>
>   MAINTAINERS                            |   7 +
>   drivers/vfio/pci/Kconfig               |   2 +
>   drivers/vfio/pci/Makefile              |   2 +
>   drivers/vfio/pci/vfio_pci_core.c       |  25 ++
>   drivers/vfio/pci/vfio_pci_rdwr.c       |  38 +-
>   drivers/vfio/pci/virtio/Kconfig        |  15 +
>   drivers/vfio/pci/virtio/Makefile       |   4 +
>   drivers/vfio/pci/virtio/main.c         | 577 +++++++++++++++++++++++++
>   drivers/virtio/virtio.c                |  37 +-
>   drivers/virtio/virtio_pci_common.c     |  14 +
>   drivers/virtio/virtio_pci_common.h     |  20 +-
>   drivers/virtio/virtio_pci_modern.c     | 441 ++++++++++++++++++-
>   drivers/virtio/virtio_pci_modern_dev.c |  24 +-
>   include/linux/vfio_pci_core.h          |  20 +
>   include/linux/virtio.h                 |   8 +
>   include/linux/virtio_config.h          |   4 +
>   include/linux/virtio_pci_admin.h       |  18 +
>   include/linux/virtio_pci_modern.h      |   5 +
>   include/uapi/linux/virtio_config.h     |   8 +-
>   include/uapi/linux/virtio_pci.h        |  66 +++
>   20 files changed, 1295 insertions(+), 40 deletions(-)
>   create mode 100644 drivers/vfio/pci/virtio/Kconfig
>   create mode 100644 drivers/vfio/pci/virtio/Makefile
>   create mode 100644 drivers/vfio/pci/virtio/main.c
>   create mode 100644 include/linux/virtio_pci_admin.h
>
Hi Michael,

Did you have the chance to review the virtio part of that series ?

IMO, we addressed all your notes on V0, I would be happy to get your 
feedback on V1 before sending V2.

In my TO-DO list for V2, have for now the below minor items.
Virtio:
Patch #6: Fix a krobot note where it needs to include the H file as part 
of the export symbols C file.
Vfio:
#patch #9: Rename the 'ops' variable to drop the 'acc' and potentially 
some rename in the description of the module with regards to 'family'.

Alex,
Are you fine to leave the provisioning of the VF including the control 
of its transitional capability in the device hands as was suggested by 
Jason ?
Any specific recommendation following the discussion in the ML, for the 
'family' note ?

Once I'll have the above feedback I may prepare and send V2.

Yishai
Michael S. Tsirkin Oct. 22, 2023, 9:12 a.m. UTC | #2
On Sun, Oct 22, 2023 at 11:20:31AM +0300, Yishai Hadas wrote:
> On 17/10/2023 16:42, Yishai Hadas wrote:
> > This series introduce a vfio driver over virtio devices to support the
> > legacy interface functionality for VFs.
> > 
> > Background, from the virtio spec [1].
> > --------------------------------------------------------------------
> > In some systems, there is a need to support a virtio legacy driver with
> > a device that does not directly support the legacy interface. In such
> > scenarios, a group owner device can provide the legacy interface
> > functionality for the group member devices. The driver of the owner
> > device can then access the legacy interface of a member device on behalf
> > of the legacy member device driver.
> > 
> > For example, with the SR-IOV group type, group members (VFs) can not
> > present the legacy interface in an I/O BAR in BAR0 as expected by the
> > legacy pci driver. If the legacy driver is running inside a virtual
> > machine, the hypervisor executing the virtual machine can present a
> > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> > legacy driver accesses to this I/O BAR and forwards them to the group
> > owner device (PF) using group administration commands.
> > --------------------------------------------------------------------
> > 
> > The first 6 patches are in the virtio area and handle the below:
> > - Fix common config map for modern device as was reported by Michael Tsirkin.
> > - Introduce the admin virtqueue infrastcture.
> > - Expose the layout of the commands that should be used for
> >    supporting the legacy access.
> > - Expose APIs to enable upper layers as of vfio, net, etc
> >    to execute admin commands.
> > 
> > The above follows the virtio spec that was lastly accepted in that area
> > [1].
> > 
> > The last 3 patches are in the vfio area and handle the below:
> > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver.
> > - Introduce a vfio driver over virtio devices to support the legacy
> >    interface functionality for VFs.
> > 
> > The series was tested successfully over virtio-net VFs in the host,
> > while running in the guest both modern and legacy drivers.
> > 
> > [1]
> > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> > 
> > Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html
> > 
> > Virtio:
> > - Fix the common config map size issue that was reported by Michael
> >    Tsirkin.
> > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by
> >    Michael, instead skip the AQ specifically.
> > - Move admin vq implementation into virtio_pci_modern.c as was asked by
> >    Michael.
> > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra
> >    corresponding renames.
> > - Remove exported symbols virtio_pci_vf_get_pf_dev(),
> >    virtio_admin_cmd_exec() as now callers are local to the module.
> > - Handle inflight commands as part of the device reset flow.
> > - Introduce APIs per admin command in virtio-pci as was asked by Michael.
> > 
> > Vfio:
> > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for
> >    vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by
> >    Alex.
> > - Drop the intermediate patch which prepares the commands and calls the
> >    generic virtio admin command API (i.e. virtio_admin_cmd_exec()).
> > - Instead, call directly to the new APIs per admin command that are
> >    exported from Virtio - based on Michael's request.
> > - Enable only virtio-net as part of the pci_device_id table to enforce
> >    upon binding only what is supported as suggested by Alex.
> > - Add support for byte-wise access (read/write) over the device config
> >    region as was asked by Alex.
> > - Consider whether MSIX is practically enabled/disabled to choose the
> >    right opcode upon issuing read/write admin command, as mentioned
> >    by Michael.
> > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines
> >    as was suggested by Michael.
> > - Set the '.close_device' op to vfio_pci_core_close_device() as was
> >    pointed by Alex.
> > - Adapt to Vfio multi-line comment style in a few places.
> > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file
> >    to be CCed for the new driver as was suggested by Jason.
> > 
> > Yishai
> > 
> > Feng Liu (5):
> >    virtio-pci: Fix common config map for modern device
> >    virtio: Define feature bit for administration virtqueue
> >    virtio-pci: Introduce admin virtqueue
> >    virtio-pci: Introduce admin command sending function
> >    virtio-pci: Introduce admin commands
> > 
> > Yishai Hadas (4):
> >    virtio-pci: Introduce APIs to execute legacy IO admin commands
> >    vfio/pci: Expose vfio_pci_core_setup_barmap()
> >    vfio/pci: Expose vfio_pci_iowrite/read##size()
> >    vfio/virtio: Introduce a vfio driver over virtio devices
> > 
> >   MAINTAINERS                            |   7 +
> >   drivers/vfio/pci/Kconfig               |   2 +
> >   drivers/vfio/pci/Makefile              |   2 +
> >   drivers/vfio/pci/vfio_pci_core.c       |  25 ++
> >   drivers/vfio/pci/vfio_pci_rdwr.c       |  38 +-
> >   drivers/vfio/pci/virtio/Kconfig        |  15 +
> >   drivers/vfio/pci/virtio/Makefile       |   4 +
> >   drivers/vfio/pci/virtio/main.c         | 577 +++++++++++++++++++++++++
> >   drivers/virtio/virtio.c                |  37 +-
> >   drivers/virtio/virtio_pci_common.c     |  14 +
> >   drivers/virtio/virtio_pci_common.h     |  20 +-
> >   drivers/virtio/virtio_pci_modern.c     | 441 ++++++++++++++++++-
> >   drivers/virtio/virtio_pci_modern_dev.c |  24 +-
> >   include/linux/vfio_pci_core.h          |  20 +
> >   include/linux/virtio.h                 |   8 +
> >   include/linux/virtio_config.h          |   4 +
> >   include/linux/virtio_pci_admin.h       |  18 +
> >   include/linux/virtio_pci_modern.h      |   5 +
> >   include/uapi/linux/virtio_config.h     |   8 +-
> >   include/uapi/linux/virtio_pci.h        |  66 +++
> >   20 files changed, 1295 insertions(+), 40 deletions(-)
> >   create mode 100644 drivers/vfio/pci/virtio/Kconfig
> >   create mode 100644 drivers/vfio/pci/virtio/Makefile
> >   create mode 100644 drivers/vfio/pci/virtio/main.c
> >   create mode 100644 include/linux/virtio_pci_admin.h
> > 
> Hi Michael,
> 
> Did you have the chance to review the virtio part of that series ?

Not yet, will take a couple more days.

> IMO, we addressed all your notes on V0, I would be happy to get your
> feedback on V1 before sending V2.
> 
> In my TO-DO list for V2, have for now the below minor items.
> Virtio:
> Patch #6: Fix a krobot note where it needs to include the H file as part of
> the export symbols C file.
> Vfio:
> #patch #9: Rename the 'ops' variable to drop the 'acc' and potentially some
> rename in the description of the module with regards to 'family'.
> 
> Alex,
> Are you fine to leave the provisioning of the VF including the control of
> its transitional capability in the device hands as was suggested by Jason ?
> Any specific recommendation following the discussion in the ML, for the
> 'family' note ?
> 
> Once I'll have the above feedback I may prepare and send V2.
> 
> Yishai
Alex Williamson Oct. 23, 2023, 3:33 p.m. UTC | #3
On Sun, 22 Oct 2023 11:20:31 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> On 17/10/2023 16:42, Yishai Hadas wrote:
> > This series introduce a vfio driver over virtio devices to support the
> > legacy interface functionality for VFs.
> >
> > Background, from the virtio spec [1].
> > --------------------------------------------------------------------
> > In some systems, there is a need to support a virtio legacy driver with
> > a device that does not directly support the legacy interface. In such
> > scenarios, a group owner device can provide the legacy interface
> > functionality for the group member devices. The driver of the owner
> > device can then access the legacy interface of a member device on behalf
> > of the legacy member device driver.
> >
> > For example, with the SR-IOV group type, group members (VFs) can not
> > present the legacy interface in an I/O BAR in BAR0 as expected by the
> > legacy pci driver. If the legacy driver is running inside a virtual
> > machine, the hypervisor executing the virtual machine can present a
> > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> > legacy driver accesses to this I/O BAR and forwards them to the group
> > owner device (PF) using group administration commands.
> > --------------------------------------------------------------------
> >
> > The first 6 patches are in the virtio area and handle the below:
> > - Fix common config map for modern device as was reported by Michael Tsirkin.
> > - Introduce the admin virtqueue infrastcture.
> > - Expose the layout of the commands that should be used for
> >    supporting the legacy access.
> > - Expose APIs to enable upper layers as of vfio, net, etc
> >    to execute admin commands.
> >
> > The above follows the virtio spec that was lastly accepted in that area
> > [1].
> >
> > The last 3 patches are in the vfio area and handle the below:
> > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver.
> > - Introduce a vfio driver over virtio devices to support the legacy
> >    interface functionality for VFs.
> >
> > The series was tested successfully over virtio-net VFs in the host,
> > while running in the guest both modern and legacy drivers.
> >
> > [1]
> > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> >
> > Changes from V0: https://www.spinics.net/lists/linux-virtualization/msg63802.html
> >
> > Virtio:
> > - Fix the common config map size issue that was reported by Michael
> >    Tsirkin.
> > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by
> >    Michael, instead skip the AQ specifically.
> > - Move admin vq implementation into virtio_pci_modern.c as was asked by
> >    Michael.
> > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra
> >    corresponding renames.
> > - Remove exported symbols virtio_pci_vf_get_pf_dev(),
> >    virtio_admin_cmd_exec() as now callers are local to the module.
> > - Handle inflight commands as part of the device reset flow.
> > - Introduce APIs per admin command in virtio-pci as was asked by Michael.
> >
> > Vfio:
> > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for
> >    vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by
> >    Alex.
> > - Drop the intermediate patch which prepares the commands and calls the
> >    generic virtio admin command API (i.e. virtio_admin_cmd_exec()).
> > - Instead, call directly to the new APIs per admin command that are
> >    exported from Virtio - based on Michael's request.
> > - Enable only virtio-net as part of the pci_device_id table to enforce
> >    upon binding only what is supported as suggested by Alex.
> > - Add support for byte-wise access (read/write) over the device config
> >    region as was asked by Alex.
> > - Consider whether MSIX is practically enabled/disabled to choose the
> >    right opcode upon issuing read/write admin command, as mentioned
> >    by Michael.
> > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines
> >    as was suggested by Michael.
> > - Set the '.close_device' op to vfio_pci_core_close_device() as was
> >    pointed by Alex.
> > - Adapt to Vfio multi-line comment style in a few places.
> > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file
> >    to be CCed for the new driver as was suggested by Jason.
> >
> > Yishai
> >
> > Feng Liu (5):
> >    virtio-pci: Fix common config map for modern device
> >    virtio: Define feature bit for administration virtqueue
> >    virtio-pci: Introduce admin virtqueue
> >    virtio-pci: Introduce admin command sending function
> >    virtio-pci: Introduce admin commands
> >
> > Yishai Hadas (4):
> >    virtio-pci: Introduce APIs to execute legacy IO admin commands
> >    vfio/pci: Expose vfio_pci_core_setup_barmap()
> >    vfio/pci: Expose vfio_pci_iowrite/read##size()
> >    vfio/virtio: Introduce a vfio driver over virtio devices
> >
> >   MAINTAINERS                            |   7 +
> >   drivers/vfio/pci/Kconfig               |   2 +
> >   drivers/vfio/pci/Makefile              |   2 +
> >   drivers/vfio/pci/vfio_pci_core.c       |  25 ++
> >   drivers/vfio/pci/vfio_pci_rdwr.c       |  38 +-
> >   drivers/vfio/pci/virtio/Kconfig        |  15 +
> >   drivers/vfio/pci/virtio/Makefile       |   4 +
> >   drivers/vfio/pci/virtio/main.c         | 577 +++++++++++++++++++++++++
> >   drivers/virtio/virtio.c                |  37 +-
> >   drivers/virtio/virtio_pci_common.c     |  14 +
> >   drivers/virtio/virtio_pci_common.h     |  20 +-
> >   drivers/virtio/virtio_pci_modern.c     | 441 ++++++++++++++++++-
> >   drivers/virtio/virtio_pci_modern_dev.c |  24 +-
> >   include/linux/vfio_pci_core.h          |  20 +
> >   include/linux/virtio.h                 |   8 +
> >   include/linux/virtio_config.h          |   4 +
> >   include/linux/virtio_pci_admin.h       |  18 +
> >   include/linux/virtio_pci_modern.h      |   5 +
> >   include/uapi/linux/virtio_config.h     |   8 +-
> >   include/uapi/linux/virtio_pci.h        |  66 +++
> >   20 files changed, 1295 insertions(+), 40 deletions(-)
> >   create mode 100644 drivers/vfio/pci/virtio/Kconfig
> >   create mode 100644 drivers/vfio/pci/virtio/Makefile
> >   create mode 100644 drivers/vfio/pci/virtio/main.c
> >   create mode 100644 include/linux/virtio_pci_admin.h
> >  
> Hi Michael,
> 
> Did you have the chance to review the virtio part of that series ?
> 
> IMO, we addressed all your notes on V0, I would be happy to get your 
> feedback on V1 before sending V2.
> 
> In my TO-DO list for V2, have for now the below minor items.
> Virtio:
> Patch #6: Fix a krobot note where it needs to include the H file as part 
> of the export symbols C file.
> Vfio:
> #patch #9: Rename the 'ops' variable to drop the 'acc' and potentially 
> some rename in the description of the module with regards to 'family'.
> 
> Alex,
> Are you fine to leave the provisioning of the VF including the control 
> of its transitional capability in the device hands as was suggested by 
> Jason ?

If this is the standard we're going to follow, ie. profiling of a
device is expected to occur prior to the probe of the vfio-pci variant
driver, then we should get the out-of-tree NVIDIA vGPU driver on board
with this too.

> Any specific recommendation following the discussion in the ML, for the 
> 'family' note ?

It's not super important, it's just overly broad vs what's actually
implemented.  Limiting the description to virtio-net for the current
implementation is fine.

> Once I'll have the above feedback I may prepare and send V2.

I'll try to take a more thorough look, but also note my comments to
Ankit relative to config space emulation.  This driver correctly
implements the flags for the IO Port BAR, but does not support sizing
of the BAR through config space, which I think is a shortcoming
relative to that implemented by vfio-pci.  QEMU doesn't rely on this,
but we don't know there aren't other userspaces that depend on this
behavior.  Thanks,

Alex
Jason Gunthorpe Oct. 23, 2023, 3:42 p.m. UTC | #4
On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:

> > Alex,
> > Are you fine to leave the provisioning of the VF including the control 
> > of its transitional capability in the device hands as was suggested by 
> > Jason ?
> 
> If this is the standard we're going to follow, ie. profiling of a
> device is expected to occur prior to the probe of the vfio-pci variant
> driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> with this too.

Those GPU drivers are using mdev not vfio-pci..

mdev doesn't have a way in its uapi to configure the mdev before it is
created.

I'm hopeful that the SIOV work will develop something better because
we clearly need it for the general use cases of SIOV beyond VFIO.

Jason
Alex Williamson Oct. 23, 2023, 4:09 p.m. UTC | #5
On Mon, 23 Oct 2023 12:42:57 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> 
> > > Alex,
> > > Are you fine to leave the provisioning of the VF including the control 
> > > of its transitional capability in the device hands as was suggested by 
> > > Jason ?  
> > 
> > If this is the standard we're going to follow, ie. profiling of a
> > device is expected to occur prior to the probe of the vfio-pci variant
> > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > with this too.  
> 
> Those GPU drivers are using mdev not vfio-pci..

The SR-IOV mdev vGPUs rely on the IOMMU backing device support which
was removed from upstream.  They only exist in the mdev form on
downstreams which have retained this interface for compatibility and
continuity.  I'm not aware of any other means by which the SR-IOV RID
can be used in the mdev model, therefore only the pre-SR-IOV GPUs
should continue to use the mdev interface.

> mdev doesn't have a way in its uapi to configure the mdev before it is
> created.

Of course.  Thanks,

Alex
Jason Gunthorpe Oct. 23, 2023, 4:20 p.m. UTC | #6
On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote:
> On Mon, 23 Oct 2023 12:42:57 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> > 
> > > > Alex,
> > > > Are you fine to leave the provisioning of the VF including the control 
> > > > of its transitional capability in the device hands as was suggested by 
> > > > Jason ?  
> > > 
> > > If this is the standard we're going to follow, ie. profiling of a
> > > device is expected to occur prior to the probe of the vfio-pci variant
> > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > > with this too.  
> > 
> > Those GPU drivers are using mdev not vfio-pci..
> 
> The SR-IOV mdev vGPUs rely on the IOMMU backing device support which
> was removed from upstream.  

It wasn't, but it changed forms.

mdev is a sysfs framework for managing lifecycle with GUIDs only.

The thing using mdev can call vfio_register_emulated_iommu_dev() or
vfio_register_group_dev(). 

It doesn't matter to the mdev stuff.

The thing using mdev is responsible to get the struct device to pass
to vfio_register_group_dev()

Jason
Alex Williamson Oct. 23, 2023, 4:45 p.m. UTC | #7
On Mon, 23 Oct 2023 13:20:43 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote:
> > On Mon, 23 Oct 2023 12:42:57 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> > >   
> > > > > Alex,
> > > > > Are you fine to leave the provisioning of the VF including the control 
> > > > > of its transitional capability in the device hands as was suggested by 
> > > > > Jason ?    
> > > > 
> > > > If this is the standard we're going to follow, ie. profiling of a
> > > > device is expected to occur prior to the probe of the vfio-pci variant
> > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > > > with this too.    
> > > 
> > > Those GPU drivers are using mdev not vfio-pci..  
> > 
> > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which
> > was removed from upstream.    
> 
> It wasn't, but it changed forms.
> 
> mdev is a sysfs framework for managing lifecycle with GUIDs only.
> 
> The thing using mdev can call vfio_register_emulated_iommu_dev() or
> vfio_register_group_dev(). 
> 
> It doesn't matter to the mdev stuff.
> 
> The thing using mdev is responsible to get the struct device to pass
> to vfio_register_group_dev()

Are we describing what can be done (possibly limited to out-of-tree
drivers) or what should be done and would be accepted upstream?

I'm under the impression that mdev has been redefined to be more
narrowly focused for emulated IOMMU devices and that devices based
around a PCI VF should be making use of a vfio-pci variant driver.

Are you suggesting it's the vendor's choice based on whether they want
the mdev lifecycle support?

We've defined certain aspects of the vfio-mdev interface as only
available for emulated IOMMU devices, ex. page pinning.  Thanks,

Alex
Jason Gunthorpe Oct. 23, 2023, 5:27 p.m. UTC | #8
On Mon, Oct 23, 2023 at 10:45:48AM -0600, Alex Williamson wrote:
> On Mon, 23 Oct 2023 13:20:43 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote:
> > > On Mon, 23 Oct 2023 12:42:57 -0300
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >   
> > > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> > > >   
> > > > > > Alex,
> > > > > > Are you fine to leave the provisioning of the VF including the control 
> > > > > > of its transitional capability in the device hands as was suggested by 
> > > > > > Jason ?    
> > > > > 
> > > > > If this is the standard we're going to follow, ie. profiling of a
> > > > > device is expected to occur prior to the probe of the vfio-pci variant
> > > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > > > > with this too.    
> > > > 
> > > > Those GPU drivers are using mdev not vfio-pci..  
> > > 
> > > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which
> > > was removed from upstream.    
> > 
> > It wasn't, but it changed forms.
> > 
> > mdev is a sysfs framework for managing lifecycle with GUIDs only.
> > 
> > The thing using mdev can call vfio_register_emulated_iommu_dev() or
> > vfio_register_group_dev(). 
> > 
> > It doesn't matter to the mdev stuff.
> > 
> > The thing using mdev is responsible to get the struct device to pass
> > to vfio_register_group_dev()
> 
> Are we describing what can be done (possibly limited to out-of-tree
> drivers) or what should be done and would be accepted upstream?

Beyond disliking mdev, I'm not really set on how we should try to
setup an extensively mediated PCI SRIOV driver. There is quite a lot
of similarity to SIOV, so it may be the right answer is to put SIOV
and this special mediated SRIOV case on the same, new, infrastructure.

SIOV can't use variant vfio PCI drivers.

mdev guid lifecycle is really ugly and quite limited anyhow.

So I've been thinking we need something else.

> I'm under the impression that mdev has been redefined to be more
> narrowly focused for emulated IOMMU devices and that devices based
> around a PCI VF should be making use of a vfio-pci variant driver.

I've been viewing mdev as legacy, just let it die off with the S390
drivers and Intel GPU as the only users, ever.

When we solve the SIOV issue we should come with something that can
absorb what S390/GPU need too.

At the end of the day we need an API to create /dev/vfioXX on demand,
to configure them before creating them, and the destroy them. It
doesn't matter at all how the driver that owns vfioXX operates, it
will call the right iommufd APIs for RID/PASID/access/etc to do
whatever its thing is.

It would be wonderful if we could get to the point where the new
interface can also create/destroy SRIOV vfios directly too.

> Are you suggesting it's the vendor's choice based on whether they want
> the mdev lifecycle support?

So, in tree I would like to discourage new mdev drivers. Out of tree,
I don't care, the APIs exist if people want to build things with them
then they get the usual out of tree cavet.

> We've defined certain aspects of the vfio-mdev interface as only
> available for emulated IOMMU devices, ex. page pinning.  Thanks,

Did we?

iommufd made it up to the driver to decide what to do, and a driver
can certainly create a concurrent iommufd_access and iommufd_device if
it wants.

AFAICT the container stuff doesn't check, drivers can do both
concurrently?

Jason
Tian, Kevin Oct. 25, 2023, 8:34 a.m. UTC | #9
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, October 23, 2023 11:43 PM
> 
> On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> 
> > > Alex,
> > > Are you fine to leave the provisioning of the VF including the control
> > > of its transitional capability in the device hands as was suggested by
> > > Jason ?
> >
> > If this is the standard we're going to follow, ie. profiling of a
> > device is expected to occur prior to the probe of the vfio-pci variant
> > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > with this too.
> 
> Those GPU drivers are using mdev not vfio-pci..
> 
> mdev doesn't have a way in its uapi to configure the mdev before it is
> created.
> 
> I'm hopeful that the SIOV work will develop something better because
> we clearly need it for the general use cases of SIOV beyond VFIO.
> 

The internal idxd driver version which I looked at last time leaves
provisioning via idxd's own config interface. sure let's brainstorm
what'd be (if possible) a general provisioning framework after it's
sent out for review.