
[RFC,00/24] Control VQ support in vDPA

Message ID 20200924032125.18619-1-jasowang@redhat.com (mailing list archive)

Message

Jason Wang Sept. 24, 2020, 3:21 a.m. UTC
Hi All:

This series tries to add support for a control virtqueue in vDPA.

The control virtqueue is used by networking devices to accept various
commands from the driver. It is required for multiqueue and other
configurations.

When used by the vhost-vDPA bus driver for a VM, the control virtqueue
should be shadowed by the userspace VMM (Qemu) instead of being
assigned directly to the guest. This is because Qemu needs to know the
device state in order to start and stop the device correctly (e.g. for
live migration).

This requires isolating the memory mapping for the control virtqueue
presented by vhost-vDPA, so that the guest cannot access it directly.

To achieve this, vDPA introduces two new abstractions (a rough sketch
follows the list):

- address space: identified by an address space id (ASID); a set of
                 memory mappings is maintained per address space
- virtqueue group: the minimal set of virtqueues that must share an
                 address space
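
A rough sketch of how these two abstractions could surface at the
vDPA core level is below. The field and callback names are
illustrative guesses based on this cover letter, not necessarily the
exact ones added by the patches:

/* Illustrative sketch only; see the patches for the real
 * definitions.
 */
struct vdpa_device {
    /* ... existing fields ... */
    unsigned int ngroups;    /* number of virtqueue groups */
    unsigned int nas;        /* number of address spaces */
};

struct vdpa_config_ops {
    /* ... existing ops ... */
    /* which virtqueue group does virtqueue idx belong to (fixed) */
    u32 (*get_vq_group)(struct vdpa_device *vdev, u16 idx);
    /* associate a virtqueue group with an address space */
    int (*set_group_asid)(struct vdpa_device *vdev, u32 group,
                          unsigned int asid);
};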

The device needs to advertise the following attributes to vDPA:

- the number of address spaces supported by the device
- the number of virtqueue groups supported by the device
- the mapping from a specific virtqueue to its virtqueue group

The mapping from virtqueues to virtqueue groups is fixed and defined
by the vDPA device driver, e.g. (sketched in code after the examples):

- For a device that has hardware ASID support, it can simply
  advertise one virtqueue group per virtqueue.
- For a device that does not have hardware ASID support, it can
  simply advertise a single virtqueue group that contains all
  virtqueues. Or, if it wants a software-emulated control virtqueue,
  it can advertise two virtqueue groups: one for the cvq and another
  for the rest of the virtqueues.
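
Expressed with the (hypothetical) get_vq_group callback from the
sketch above, those two policies could look like:

/* Sketch: a device that emulates the control virtqueue in software
 * puts the cvq in its own group (group 1) and everything else in
 * group 0. EXAMPLE_CVQ_INDEX is a made-up constant.
 */
static u32 example_get_vq_group(struct vdpa_device *vdev, u16 idx)
{
    return idx == EXAMPLE_CVQ_INDEX ? 1 : 0;
}

/* Sketch: a device with per-virtqueue hardware ASID support can
 * instead advertise one virtqueue group per virtqueue.
 */
static u32 example_get_vq_group_per_vq(struct vdpa_device *vdev, u16 idx)
{
    return idx;
}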

vDPA also allows changing the association between a virtqueue group
and an address space. So, in the case of the control virtqueue, the
userspace VMM (Qemu) may use a dedicated address space for the
control virtqueue group to isolate its memory mappings.

vhost/vhost-vDPA is also extended to allow userspace to (see the
sketch after this list):

- query the number of virtqueue groups and address spaces supported
  by the device
- query the virtqueue group of a specific virtqueue
- associate a virtqueue group with an address space
- send ASID-based IOTLB commands
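
To give a rough idea of the flow, a VMM could drive these extensions
along the following lines. The ioctl names and argument layouts below
are assumptions derived from the patch titles; the uAPI patches are
authoritative:

#include <sys/ioctl.h>
#include <linux/vhost.h>
#include <linux/vhost_types.h>

/* Sketch: move the control virtqueue's group into its own ASID. */
static int isolate_cvq(int vdpa_fd, unsigned int cvq_index)
{
    unsigned int ngroups, nas;
    struct vhost_vring_state s;

    if (ioctl(vdpa_fd, VHOST_VDPA_GET_GROUP_NUM, &ngroups) ||
        ioctl(vdpa_fd, VHOST_VDPA_GET_AS_NUM, &nas))
        return -1;
    if (ngroups < 2 || nas < 2)
        return -1;              /* cvq cannot be isolated */

    s.index = cvq_index;        /* which group holds the cvq? */
    if (ioctl(vdpa_fd, VHOST_VDPA_GET_VRING_GROUP, &s))
        return -1;

    s.index = s.num;            /* that group ...            */
    s.num = 1;                  /* ... gets a dedicated ASID */
    if (ioctl(vdpa_fd, VHOST_VDPA_SET_GROUP_ASID, &s))
        return -1;

    /* Subsequent IOTLB updates for the cvq mappings can then be
     * sent with asid 1 via the ASID-aware IOTLB message format.
     */
    return 0;
}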

This helps the userspace VMM (Qemu) detect whether the control vq
can be supported and isolate the memory mappings of the control
virtqueue from the others.

To demonstrate the usage, the vDPA simulator is extended to support
setting the MAC address via an emulated control virtqueue. Please
refer to patch 24 for the implementation details.
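
For reference, the core of the emulated MAC handling looks roughly
like the sketch below. The helper read_from_cvq() and the exact
structure layout are illustrative; patch 24 has the actual code:

/* Sketch of handling VIRTIO_NET_CTRL_MAC_ADDR_SET on the emulated
 * control virtqueue. read_from_cvq() stands in for however the
 * simulator pulls the request out of the descriptor chain.
 */
static virtio_net_ctrl_ack handle_ctrl_mac_set(struct vdpasim *vdpasim)
{
    struct virtio_net_ctrl_hdr hdr;
    u8 mac[ETH_ALEN];

    if (read_from_cvq(vdpasim, &hdr, sizeof(hdr)) != sizeof(hdr))
        return VIRTIO_NET_ERR;
    if (hdr.class != VIRTIO_NET_CTRL_MAC ||
        hdr.cmd != VIRTIO_NET_CTRL_MAC_ADDR_SET)
        return VIRTIO_NET_ERR;
    if (read_from_cvq(vdpasim, mac, ETH_ALEN) != ETH_ALEN)
        return VIRTIO_NET_ERR;

    /* update the MAC exposed through the device config space */
    memcpy(vdpasim->config.mac, mac, ETH_ALEN);
    return VIRTIO_NET_OK;
}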

Please review.

Note that patch 1 and an equivalent of patch 2 have already been
posted to the list. Those two are requirements for this series to
work, so I include them here.

Thanks

Jason Wang (24):
  vhost-vdpa: fix backend feature ioctls
  vhost-vdpa: fix vqs leak in vhost_vdpa_open()
  vhost: move the backend feature bits to vhost_types.h
  virtio-vdpa: don't set callback if virtio doesn't need it
  vhost-vdpa: passing iotlb to IOMMU mapping helpers
  vhost-vdpa: switch to use vhost-vdpa specific IOTLB
  vdpa: add the missing comment for nvqs in struct vdpa_device
  vdpa: introduce virtqueue groups
  vdpa: multiple address spaces support
  vdpa: introduce config operations for associating ASID to a virtqueue
    group
  vhost_iotlb: split out IOTLB initialization
  vhost: support ASID in IOTLB API
  vhost-vdpa: introduce ASID based IOTLB
  vhost-vdpa: introduce uAPI to get the number of virtqueue groups
  vhost-vdpa: introduce uAPI to get the number of address spaces
  vhost-vdpa: uAPI to get virtqueue group id
  vhost-vdpa: introduce uAPI to set group ASID
  vhost-vdpa: support ASID based IOTLB API
  vdpa_sim: use separated iov for reading and writing
  vdpa_sim: advertise VIRTIO_NET_F_MTU
  vdpa_sim: advertise VIRTIO_NET_F_MAC
  vdpa_sim: factor out buffer completion logic
  vdpa_sim: filter destination mac address
  vdpasim: control virtqueue support

 drivers/vdpa/ifcvf/ifcvf_main.c   |   9 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  11 +-
 drivers/vdpa/vdpa.c               |   8 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.c  | 293 ++++++++++++++++++++++++------
 drivers/vhost/iotlb.c             |  23 ++-
 drivers/vhost/vdpa.c              | 259 ++++++++++++++++++++------
 drivers/vhost/vhost.c             |  23 ++-
 drivers/vhost/vhost.h             |   4 +-
 drivers/virtio/virtio_vdpa.c      |   2 +-
 include/linux/vdpa.h              |  42 ++++-
 include/linux/vhost_iotlb.h       |   2 +
 include/uapi/linux/vhost.h        |  19 +-
 include/uapi/linux/vhost_types.h  |  10 +-
 13 files changed, 556 insertions(+), 149 deletions(-)

Comments

Stefan Hajnoczi Sept. 24, 2020, 10:17 a.m. UTC | #1
On Thu, Sep 24, 2020 at 11:21:01AM +0800, Jason Wang wrote:
> This series tries to add support for a control virtqueue in vDPA.

Please include documentation for both driver authors and vhost-vdpa
ioctl users. vhost-vdpa ioctls are only documented with a single
sentence. Please add full information on arguments, return values, and a
high-level explanation of the feature (like this cover letter) to
introduce the API.

What is the policy for using virtqueue groups? My guess is:
1. virtio_vdpa simply enables all virtqueue groups.
2. vhost_vdpa relies on userspace policy on how to use virtqueue groups.
   Are the semantics of virtqueue groups documented somewhere so
   userspace knows what to do? If a vDPA driver author decides to create
   N virtqueue groups, N/2 virtqueue groups, or just 1 virtqueue group,
   how will userspace know what to do?

Maybe a document is needed to describe the recommended device-specific
virtqueue groups that vDPA drivers should implement (e.g. "put the net
control vq into its own virtqueue group")?

This could become messy with guidelines. For example, drivers might be
shipped that aren't usable for certain use cases just because the author
didn't know that a certain virtqueue grouping is advantageous.

BTW I like how general this feature is. It seems to allow vDPA devices
to be split into sub-devices for further passthrough. Who will write the
first vDPA-on-vDPA driver? :)

Stefan
Jason Wang Sept. 25, 2020, 11:36 a.m. UTC | #2
On 2020/9/24 6:17 PM, Stefan Hajnoczi wrote:
> On Thu, Sep 24, 2020 at 11:21:01AM +0800, Jason Wang wrote:
>> This series tries to add support for a control virtqueue in vDPA.
> Please include documentation for both driver authors and vhost-vdpa
> ioctl users. vhost-vdpa ioctls are only documented with a single
> sentence. Please add full information on arguments, return values, and a
> high-level explanation of the feature (like this cover letter) to
> introduce the API.


Right, this is on the TODO list. (And we probably need to start by 
documenting the vDPA bus operations first).


>
> What is the policy for using virtqueue groups? My guess is:
> 1. virtio_vdpa simply enables all virtqueue groups.
> 2. vhost_vdpa relies on userspace policy on how to use virtqueue groups.
>     Are the semantics of virtqueue groups documented somewhere so
>     userspace knows what to do? If a vDPA driver author decides to create
>     N virtqueue groups, N/2 virtqueue groups, or just 1 virtqueue group,
>     how will userspace know what to do?


So the mapping from virtqueue to virtqueue group is mandated by the vDPA 
device (driver). The vDPA bus driver (like vhost-vDPA) can only change 
the association between virtqueue groups and ASIDs.

By default, all virtqueue groups are required to be associated with 
address space 0. This makes sure virtio_vdpa can work without any 
special group/ASID configuration.

I admit we need to document all those semantics/policies.


>
> Maybe a document is needed to describe the recommended device-specific
> virtqueue groups that vDPA drivers should implement (e.g. "put the net
> control vq into its own virtqueue group")?


Yes, note that this actually depends on the hardware capability. The 
control vq can only be put in a separate virtqueue group if:

1) the hardware supports isolating control vq DMA from the rest of the 
virtqueues (PASID, or simply using PA (translated addresses) for the 
control vq), or
2) the control vq is emulated by the vDPA device driver (as vdpa_sim does).


>
> This could become messy with guidelines. For example, drivers might be
> shipped that aren't usable for certain use cases just because the author
> didn't know that a certain virtqueue grouping is advantageous.


Right.


>
> BTW I like how general this feature is. It seems to allow vDPA devices
> to be split into sub-devices for further passthrough. Who will write the
> first vDPA-on-vDPA driver? :)


Yes, that's an interesting question. For now, I can imagine we could 
emulate SR-IOV based virtio-net VFs via this.

If we want to expose the ASID setting to the guest as well, it probably 
needs more thought.

Thanks


>
> Stefan