mbox series

[RFC,00/12] Prefer to use SVQ to stall dataplane at NIC state restore through CVQ

Message ID 20230720181459.607008-1-eperezma@redhat.com (mailing list archive)
Headers show
Series Prefer to use SVQ to stall dataplane at NIC state restore through CVQ | expand

Message

Eugenio Perez Martin July 20, 2023, 6:14 p.m. UTC
At this moment the migration of net features that depends on CVQ is not
possible, as there is no reliable way to restore the device state like mac
address, number of enabled queues, etc to the destination.  This is mainly
caused because the device must only read CVQ, and process all the commands
before resuming the dataplane.

This series uses the VirtIO feature _F_RING_RESET to achieve it, adding an
alternative method to late vq enabling proposed in [1][2].  It expose SVQ to
the device until it process all the CVQ messages, and then replaces the vring
for the guest's one.

As an advantage, it uses a feature well deviced in the VirtIO standard.  As a
disadvantage, current HW already support the late enabling and it does not
support RING_RESET.

This patch must be applied on top of the series ("Enable vdpa net migration
with features depending on CVQ") [1][2].

The patch has been tested with vp_vdpa, but using high IOVA instead of a
sepparated ID for shadow CVQ and shadow temporal vrings.

[1] Message-id: <20230706191227.835526-1-eperezma@redhat.com>
[2] https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg01325.html

Eugenio Pérez (12):
  vhost: add vhost_reset_queue_op
  vhost: add vhost_restart_queue_op
  vhost_net: Use ops->vhost_restart_queue in vhost_net_virtqueue_restart
  vhost_net: Use ops->vhost_reset_queue in vhost_net_virtqueue_reset
  vdpa: add vhost_vdpa_set_vring_ready_internal
  vdpa: add vhost_vdpa_svq_stop
  vdpa: add vhost_vdpa_reset_queue
  vdpa: add vhost_vdpa_svq_start
  vdpa: add vhost_vdpa_restart_queue
  vdpa: enable all vqs if the device support RING_RESET feature
  vdpa: use SVQ to stall dataplane while NIC state is being restored
  vhost: Allow _F_RING_RESET with shadow virtqueue

 include/hw/virtio/vhost-backend.h  |   6 ++
 hw/net/vhost_net.c                 |  16 ++--
 hw/virtio/vhost-shadow-virtqueue.c |   1 +
 hw/virtio/vhost-vdpa.c             | 139 +++++++++++++++++++++--------
 net/vhost-vdpa.c                   |  55 ++++++++++--
 hw/virtio/trace-events             |   2 +-
 6 files changed, 171 insertions(+), 48 deletions(-)

Comments

Eugenio Perez Martin July 21, 2023, 6:48 a.m. UTC | #1
On Thu, Jul 20, 2023 at 8:15 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> At this moment the migration of net features that depends on CVQ is not
> possible, as there is no reliable way to restore the device state like mac
> address, number of enabled queues, etc to the destination.  This is mainly
> caused because the device must only read CVQ, and process all the commands
> before resuming the dataplane.
>
> This series uses the VirtIO feature _F_RING_RESET to achieve it, adding an
> alternative method to late vq enabling proposed in [1][2].  It expose SVQ to
> the device until it process all the CVQ messages, and then replaces the vring
> for the guest's one.
>

A couple of things I forgot to add:
* Assuming the implementation of _F_RING_RESET in vdpa is calling
kernel vdpa ops .set_vq_ready(vq, false). I'm not sure if this is the
best implementation, but it is currently unused in the kernel. At the
same time, .set_vq_ready(vq, true) also enables the vq again.

> As an advantage, it uses a feature well deviced in the VirtIO standard.  As a
> disadvantage, current HW already support the late enabling and it does not
> support RING_RESET.
>
> This patch must be applied on top of the series ("Enable vdpa net migration
> with features depending on CVQ") [1][2].
>
> The patch has been tested with vp_vdpa, but using high IOVA instead of a
> sepparated ID for shadow CVQ and shadow temporal vrings.
>

And with _F_STATE implementation I sent long ago.

Based on this, my suggestion is:
* Leave the late enable for vDPA devices.
* Make them fail if the vDPA parent device does not support it. This
can be considered as a fix.
* Leave _F_RING_RESET to be added on top, as the semantics are not
implemented in vDPA at the moment.

Would that work?

> [1] Message-id: <20230706191227.835526-1-eperezma@redhat.com>
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg01325.html
>
> Eugenio Pérez (12):
>   vhost: add vhost_reset_queue_op
>   vhost: add vhost_restart_queue_op
>   vhost_net: Use ops->vhost_restart_queue in vhost_net_virtqueue_restart
>   vhost_net: Use ops->vhost_reset_queue in vhost_net_virtqueue_reset
>   vdpa: add vhost_vdpa_set_vring_ready_internal
>   vdpa: add vhost_vdpa_svq_stop
>   vdpa: add vhost_vdpa_reset_queue
>   vdpa: add vhost_vdpa_svq_start
>   vdpa: add vhost_vdpa_restart_queue
>   vdpa: enable all vqs if the device support RING_RESET feature
>   vdpa: use SVQ to stall dataplane while NIC state is being restored
>   vhost: Allow _F_RING_RESET with shadow virtqueue
>
>  include/hw/virtio/vhost-backend.h  |   6 ++
>  hw/net/vhost_net.c                 |  16 ++--
>  hw/virtio/vhost-shadow-virtqueue.c |   1 +
>  hw/virtio/vhost-vdpa.c             | 139 +++++++++++++++++++++--------
>  net/vhost-vdpa.c                   |  55 ++++++++++--
>  hw/virtio/trace-events             |   2 +-
>  6 files changed, 171 insertions(+), 48 deletions(-)
>
> --
> 2.39.3
>
>
Jason Wang July 25, 2023, 7:15 a.m. UTC | #2
在 2023/7/21 14:48, Eugenio Perez Martin 写道:
> On Thu, Jul 20, 2023 at 8:15 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>> At this moment the migration of net features that depends on CVQ is not
>> possible, as there is no reliable way to restore the device state like mac
>> address, number of enabled queues, etc to the destination.  This is mainly
>> caused because the device must only read CVQ, and process all the commands
>> before resuming the dataplane.
>>
>> This series uses the VirtIO feature _F_RING_RESET to achieve it, adding an
>> alternative method to late vq enabling proposed in [1][2].  It expose SVQ to
>> the device until it process all the CVQ messages, and then replaces the vring
>> for the guest's one.
>>
> A couple of things I forgot to add:
> * Assuming the implementation of _F_RING_RESET in vdpa is calling
> kernel vdpa ops .set_vq_ready(vq, false). I'm not sure if this is the
> best implementation, but it is currently unused in the kernel. At the
> same time, .set_vq_ready(vq, true) also enables the vq again.


I think we need another ops, as set_vq_ready() tends to be functional 
equivalent to queue_enable.

If we reuse set_vq_ready(vq, false), we would get conflict in the future 
when driver is allowed to stop/resume a specific virtqueue via setting 0 
to queue_enable. And that's also the reason why queue_enable is not 
resued to reset a virtqueue.


>
>> As an advantage, it uses a feature well deviced in the VirtIO standard.  As a
>> disadvantage, current HW already support the late enabling and it does not
>> support RING_RESET.
>>
>> This patch must be applied on top of the series ("Enable vdpa net migration
>> with features depending on CVQ") [1][2].
>>
>> The patch has been tested with vp_vdpa, but using high IOVA instead of a
>> sepparated ID for shadow CVQ and shadow temporal vrings.
>>
> And with _F_STATE implementation I sent long ago.
>
> Based on this, my suggestion is:
> * Leave the late enable for vDPA devices.
> * Make them fail if the vDPA parent device does not support it. This
> can be considered as a fix.
> * Leave _F_RING_RESET to be added on top, as the semantics are not
> implemented in vDPA at the moment.
>
> Would that work?


I think it can work, let's start from late enabling which seems 
lightweight than reset and see. We can leave the vp_vdpa to be done on 
top with another series.

Thanks


>
>> [1] Message-id: <20230706191227.835526-1-eperezma@redhat.com>
>> [2] https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg01325.html
>>
>> Eugenio Pérez (12):
>>    vhost: add vhost_reset_queue_op
>>    vhost: add vhost_restart_queue_op
>>    vhost_net: Use ops->vhost_restart_queue in vhost_net_virtqueue_restart
>>    vhost_net: Use ops->vhost_reset_queue in vhost_net_virtqueue_reset
>>    vdpa: add vhost_vdpa_set_vring_ready_internal
>>    vdpa: add vhost_vdpa_svq_stop
>>    vdpa: add vhost_vdpa_reset_queue
>>    vdpa: add vhost_vdpa_svq_start
>>    vdpa: add vhost_vdpa_restart_queue
>>    vdpa: enable all vqs if the device support RING_RESET feature
>>    vdpa: use SVQ to stall dataplane while NIC state is being restored
>>    vhost: Allow _F_RING_RESET with shadow virtqueue
>>
>>   include/hw/virtio/vhost-backend.h  |   6 ++
>>   hw/net/vhost_net.c                 |  16 ++--
>>   hw/virtio/vhost-shadow-virtqueue.c |   1 +
>>   hw/virtio/vhost-vdpa.c             | 139 +++++++++++++++++++++--------
>>   net/vhost-vdpa.c                   |  55 ++++++++++--
>>   hw/virtio/trace-events             |   2 +-
>>   6 files changed, 171 insertions(+), 48 deletions(-)
>>
>> --
>> 2.39.3
>>
>>
Eugenio Perez Martin July 27, 2023, 12:53 p.m. UTC | #3
On Tue, Jul 25, 2023 at 9:15 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2023/7/21 14:48, Eugenio Perez Martin 写道:
> > On Thu, Jul 20, 2023 at 8:15 PM Eugenio Pérez <eperezma@redhat.com> wrote:
> >> At this moment the migration of net features that depends on CVQ is not
> >> possible, as there is no reliable way to restore the device state like mac
> >> address, number of enabled queues, etc to the destination.  This is mainly
> >> caused because the device must only read CVQ, and process all the commands
> >> before resuming the dataplane.
> >>
> >> This series uses the VirtIO feature _F_RING_RESET to achieve it, adding an
> >> alternative method to late vq enabling proposed in [1][2].  It expose SVQ to
> >> the device until it process all the CVQ messages, and then replaces the vring
> >> for the guest's one.
> >>
> > A couple of things I forgot to add:
> > * Assuming the implementation of _F_RING_RESET in vdpa is calling
> > kernel vdpa ops .set_vq_ready(vq, false). I'm not sure if this is the
> > best implementation, but it is currently unused in the kernel. At the
> > same time, .set_vq_ready(vq, true) also enables the vq again.
>
>
> I think we need another ops, as set_vq_ready() tends to be functional
> equivalent to queue_enable.
>
> If we reuse set_vq_ready(vq, false), we would get conflict in the future
> when driver is allowed to stop/resume a specific virtqueue via setting 0
> to queue_enable. And that's also the reason why queue_enable is not
> resued to reset a virtqueue.
>

Yes, I totally agree.

>
> >
> >> As an advantage, it uses a feature well deviced in the VirtIO standard.  As a
> >> disadvantage, current HW already support the late enabling and it does not
> >> support RING_RESET.
> >>
> >> This patch must be applied on top of the series ("Enable vdpa net migration
> >> with features depending on CVQ") [1][2].
> >>
> >> The patch has been tested with vp_vdpa, but using high IOVA instead of a
> >> sepparated ID for shadow CVQ and shadow temporal vrings.
> >>
> > And with _F_STATE implementation I sent long ago.
> >
> > Based on this, my suggestion is:
> > * Leave the late enable for vDPA devices.
> > * Make them fail if the vDPA parent device does not support it. This
> > can be considered as a fix.
> > * Leave _F_RING_RESET to be added on top, as the semantics are not
> > implemented in vDPA at the moment.
> >
> > Would that work?
>
>
> I think it can work, let's start from late enabling which seems
> lightweight than reset and see. We can leave the vp_vdpa to be done on
> top with another series.
>

great!

Thanks!


> Thanks
>
>
> >
> >> [1] Message-id: <20230706191227.835526-1-eperezma@redhat.com>
> >> [2] https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg01325.html
> >>
> >> Eugenio Pérez (12):
> >>    vhost: add vhost_reset_queue_op
> >>    vhost: add vhost_restart_queue_op
> >>    vhost_net: Use ops->vhost_restart_queue in vhost_net_virtqueue_restart
> >>    vhost_net: Use ops->vhost_reset_queue in vhost_net_virtqueue_reset
> >>    vdpa: add vhost_vdpa_set_vring_ready_internal
> >>    vdpa: add vhost_vdpa_svq_stop
> >>    vdpa: add vhost_vdpa_reset_queue
> >>    vdpa: add vhost_vdpa_svq_start
> >>    vdpa: add vhost_vdpa_restart_queue
> >>    vdpa: enable all vqs if the device support RING_RESET feature
> >>    vdpa: use SVQ to stall dataplane while NIC state is being restored
> >>    vhost: Allow _F_RING_RESET with shadow virtqueue
> >>
> >>   include/hw/virtio/vhost-backend.h  |   6 ++
> >>   hw/net/vhost_net.c                 |  16 ++--
> >>   hw/virtio/vhost-shadow-virtqueue.c |   1 +
> >>   hw/virtio/vhost-vdpa.c             | 139 +++++++++++++++++++++--------
> >>   net/vhost-vdpa.c                   |  55 ++++++++++--
> >>   hw/virtio/trace-events             |   2 +-
> >>   6 files changed, 171 insertions(+), 48 deletions(-)
> >>
> >> --
> >> 2.39.3
> >>
> >>
>
Michael S. Tsirkin July 27, 2023, 1:05 p.m. UTC | #4
On Fri, Jul 21, 2023 at 08:48:02AM +0200, Eugenio Perez Martin wrote:
> * Leave _F_RING_RESET to be added on top, as the semantics are not
> implemented in vDPA at the moment.

We really need _F_RING_RESET in vdpa too though.
You did code it up already - why do you want to leave
it out?
Eugenio Perez Martin July 27, 2023, 3:23 p.m. UTC | #5
On Thu, Jul 27, 2023 at 3:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Jul 21, 2023 at 08:48:02AM +0200, Eugenio Perez Martin wrote:
> > * Leave _F_RING_RESET to be added on top, as the semantics are not
> > implemented in vDPA at the moment.
>
> We really need _F_RING_RESET in vdpa too though.
> You did code it up already - why do you want to leave
> it out?
>

I don't want to leave it out, sorry if it sounded that way.

I'd like to merge the late enable part first, as it has been already
validated and it works with current HW. The _F_RING_RESET part needs
more time regarding vhost vDPA API for Linux >=6.5, HW support, etc.

I can start working on the _F_RING_RESET kernel part needed changes
with a dedicated ioctl if this version looks good enough.

Thanks!