[RFC,v2,06/13] vhost: delay set_vring_ready after DRIVER_OK

Message ID 20230112172434.760850-7-eperezma@redhat.com (mailing list archive)
State New, archived
Series Dynamically switch to vhost shadow virtqueues at vdpa net migration

Commit Message

Eugenio Perez Martin Jan. 12, 2023, 5:24 p.m. UTC
To restore the device at the destination of a live migration, we send the
commands through the control virtqueue. For a device to read the CVQ it
must have received the DRIVER_OK status bit.

However, this opens a window where the device could start receiving
packets in rx queue 0 before it receives the RSS configuration. To avoid
that, we will not send vring_enable until all of the configuration has been
consumed by the device.

As a first step, run vhost_set_vring_ready for all vhost_net backends after
all of them have been started (with DRIVER_OK). This code should not affect
vdpa.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/net/vhost_net.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

Comments

Jason Wang Jan. 13, 2023, 4:36 a.m. UTC | #1
On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> To restore the device at the destination of a live migration we send the
> commands through control virtqueue. For a device to read CVQ it must
> have received the DRIVER_OK status bit.

This probably requires support from the parent driver, and some changes or
fixes in the parent driver.

Some drivers did:

parent_set_status():
if (DRIVER_OK)
    if (queue_enable)
        write queue_enable to the device

Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
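
For illustration, a minimal, self-contained C model of that pattern (assumed
behaviour only, not the actual IFCVF/vp_vdpa code): the parent caches
queue_enable and flushes it to the device when DRIVER_OK arrives, so an
enable issued after DRIVER_OK is lost unless the ready path also checks the
status bit.

#include <stdbool.h>
#include <stdio.h>

#define VIRTIO_CONFIG_S_DRIVER_OK 0x04
#define MODEL_NUM_VQS 3

struct model_vq {
    bool queue_enable;     /* cached from SET_VRING_ENABLE */
    bool ready_on_device;  /* what the hardware actually saw */
};

struct model_dev {
    unsigned char status;
    struct model_vq vq[MODEL_NUM_VQS];
};

static void model_set_vq_ready(struct model_dev *d, int i, bool ready)
{
    d->vq[i].queue_enable = ready;
    /* Parents missing this branch ignore enables sent after DRIVER_OK. */
    if (d->status & VIRTIO_CONFIG_S_DRIVER_OK) {
        d->vq[i].ready_on_device = ready;
    }
}

static void model_set_status(struct model_dev *d, unsigned char status)
{
    d->status = status;
    if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
        /* Flush the cached enables to the device, as described above. */
        for (int i = 0; i < MODEL_NUM_VQS; i++) {
            d->vq[i].ready_on_device = d->vq[i].queue_enable;
        }
    }
}

int main(void)
{
    struct model_dev d = {0};

    model_set_vq_ready(&d, 0, true);                  /* before DRIVER_OK: cached */
    model_set_status(&d, VIRTIO_CONFIG_S_DRIVER_OK);  /* cache flushed */
    model_set_vq_ready(&d, 1, true);                  /* after DRIVER_OK */
    printf("vq0=%d vq1=%d\n", d.vq[0].ready_on_device, d.vq[1].ready_on_device);
    return 0;
}

With the flush in model_set_status() but without the status check in
model_set_vq_ready(), vq1 would stay disabled on the device, which is the
breakage being discussed.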

>
> However this opens a window where the device could start receiving
> packets in rx queue 0 before it receives the RSS configuration. To avoid
> that, we will not send vring_enable until all configuration is used by
> the device.
>
> As a first step, run vhost_set_vring_ready for all vhost_net backend after
> all of them are started (with DRIVER_OK). This code should not affect
> vdpa.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/net/vhost_net.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index c4eecc6f36..3900599465 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>          } else {
>              peer = qemu_get_peer(ncs, n->max_queue_pairs);
>          }
> +        r = vhost_net_start_one(get_vhost_net(peer), dev);
> +        if (r < 0) {
> +            goto err_start;
> +        }
> +    }
> +
> +    for (int j = 0; j < nvhosts; j++) {
> +        if (j < data_queue_pairs) {
> +            peer = qemu_get_peer(ncs, j);
> +        } else {
> +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
> +        }

I fail to understand why we need to change the vhost_net layer. This
is vhost-vDPA specific, so I wonder if we can limit the changes to e.g.
vhost_vdpa_dev_start()?

Thanks

>
>          if (peer->vring_enable) {
>              /* restore vring enable state */
> @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>                  goto err_start;
>              }
>          }
> -
> -        r = vhost_net_start_one(get_vhost_net(peer), dev);
> -        if (r < 0) {
> -            goto err_start;
> -        }
>      }
>
>      return 0;
> --
> 2.31.1
>
Eugenio Perez Martin Jan. 13, 2023, 8:19 a.m. UTC | #2
On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >
> > To restore the device at the destination of a live migration we send the
> > commands through control virtqueue. For a device to read CVQ it must
> > have received the DRIVER_OK status bit.
>
> This probably requires the support from the parent driver and requires
> some changes or fixes in the parent driver.
>
> Some drivers did:
>
> parent_set_status():
> if (DRIVER_OK)
>     if (queue_enable)
>         write queue_enable to the device
>
> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
>

I don't get your point here. No device should start reading CVQ (or
any other VQ) without having received DRIVER_OK.

Some parent drivers do not support sending the queue enable command
after DRIVER_OK, usually because they clean part of the state, like the
one set by set_vring_base. Even vdpa_net_sim needs fixes here.

But my understanding is that it should be supported, so I consider it a
bug, especially after the queue_reset patches. Is that what you mean?

> >
> > However this opens a window where the device could start receiving
> > packets in rx queue 0 before it receives the RSS configuration. To avoid
> > that, we will not send vring_enable until all configuration is used by
> > the device.
> >
> > As a first step, run vhost_set_vring_ready for all vhost_net backend after
> > all of them are started (with DRIVER_OK). This code should not affect
> > vdpa.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  hw/net/vhost_net.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > index c4eecc6f36..3900599465 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >          } else {
> >              peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >          }
> > +        r = vhost_net_start_one(get_vhost_net(peer), dev);
> > +        if (r < 0) {
> > +            goto err_start;
> > +        }
> > +    }
> > +
> > +    for (int j = 0; j < nvhosts; j++) {
> > +        if (j < data_queue_pairs) {
> > +            peer = qemu_get_peer(ncs, j);
> > +        } else {
> > +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
> > +        }
>
> I fail to understand why we need to change the vhost_net layer? This
> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
> vhost_vdpa_dev_start()?
>

The vhost-net layer explicitly calls vhost_set_vring_enable before
vhost_dev_start, and this is exactly the behavior we want to avoid.
Even if we make changes to vhost_dev, this change is still needed.

And we want to explicitly enable CVQ first, and "only" vhost_net
knows which queue that is. To perform that in vhost_vdpa_dev_start would
require quirks, involving one or more of:
* Ignore vq enable calls if the device is not the CVQ one. How to
signal which one is the CVQ? Can we trust it will be the last one for all
kinds of devices?
* Enable queues that do not belong to the last vhost_dev from the enable call.
* Enable the rest of the queues from the last enable in reverse order.
* Intercalate the "net load" callback between enabling the last
vhost_vdpa device and enabling the rest of devices.
* Add an "enable priority" order?

Thanks!

> Thanks
>
> >
> >          if (peer->vring_enable) {
> >              /* restore vring enable state */
> > @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >                  goto err_start;
> >              }
> >          }
> > -
> > -        r = vhost_net_start_one(get_vhost_net(peer), dev);
> > -        if (r < 0) {
> > -            goto err_start;
> > -        }
> >      }
> >
> >      return 0;
> > --
> > 2.31.1
> >
>
Stefano Garzarella Jan. 13, 2023, 9:51 a.m. UTC | #3
On Fri, Jan 13, 2023 at 09:19:00AM +0100, Eugenio Perez Martin wrote:
>On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>> >
>> > To restore the device at the destination of a live migration we send the
>> > commands through control virtqueue. For a device to read CVQ it must
>> > have received the DRIVER_OK status bit.
>>
>> This probably requires the support from the parent driver and requires
>> some changes or fixes in the parent driver.
>>
>> Some drivers did:
>>
>> parent_set_status():
>> if (DRIVER_OK)
>>     if (queue_enable)
>>         write queue_enable to the device
>>
>> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
>>
>
>I don't get your point here. No device should start reading CVQ (or
>any other VQ) without having received DRIVER_OK.
>
>Some parent drivers do not support sending the queue enable command
>after DRIVER_OK, usually because they clean part of the state like the
>set by set_vring_base. Even vdpa_net_sim needs fixes here.
>
>But my understanding is that it should be supported so I consider it a
>bug. Especially after queue_reset patches. Is that what you mean?
>
>> >
>> > However this opens a window where the device could start receiving
>> > packets in rx queue 0 before it receives the RSS configuration. To avoid
>> > that, we will not send vring_enable until all configuration is used by
>> > the device.
>> >
>> > As a first step, run vhost_set_vring_ready for all vhost_net backend after
>> > all of them are started (with DRIVER_OK). This code should not affect
>> > vdpa.
>> >
>> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> > ---
>> >  hw/net/vhost_net.c | 17 ++++++++++++-----
>> >  1 file changed, 12 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>> > index c4eecc6f36..3900599465 100644
>> > --- a/hw/net/vhost_net.c
>> > +++ b/hw/net/vhost_net.c
>> > @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>> >          } else {
>> >              peer = qemu_get_peer(ncs, n->max_queue_pairs);
>> >          }
>> > +        r = vhost_net_start_one(get_vhost_net(peer), dev);
>> > +        if (r < 0) {
>> > +            goto err_start;
>> > +        }
>> > +    }
>> > +
>> > +    for (int j = 0; j < nvhosts; j++) {
>> > +        if (j < data_queue_pairs) {
>> > +            peer = qemu_get_peer(ncs, j);
>> > +        } else {
>> > +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>> > +        }
>>
>> I fail to understand why we need to change the vhost_net layer? This
>> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
>> vhost_vdpa_dev_start()?
>>
>
>The vhost-net layer explicitly calls vhost_set_vring_enable before
>vhost_dev_start, and this is exactly the behavior we want to avoid.
>Even if we make changes to vhost_dev, this change is still needed.

I'm working on something similar since I'd like to re-work the following 
commit we merged just before 7.2 release:
     4daa5054c5 vhost: enable vrings in vhost_dev_start() for vhost-user
     devices

vhost-net wasn't the only one that enabled vrings independently, but it 
was easy enough for the other devices to avoid it and enable them in 
vhost_dev_start().

Do you think we can avoid this special behaviour of vhost-net in some way 
and enable the vrings in vhost_dev_start()?
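
Something along these lines, presumably (just a rough sketch of the idea
with made-up helper names, not the actual code of that commit):
vhost_dev_start() grows a flag so the rings get enabled there, and frontends
like vhost-net stop calling vhost_set_vring_enable() themselves.

#include <stdbool.h>

struct sketch_vhost_dev {
    int nvqs;
};

/* Stubs standing in for the real start and per-ring enable paths. */
static int sketch_do_start(struct sketch_vhost_dev *hdev)
{
    (void)hdev;
    return 0;
}

static int sketch_set_vring_enable(struct sketch_vhost_dev *hdev, bool enable)
{
    (void)hdev; (void)enable;
    return 0;
}

static int sketch_vhost_dev_start(struct sketch_vhost_dev *hdev, bool vrings)
{
    int r = sketch_do_start(hdev);  /* mem table, vring setup, backend start */
    if (r < 0) {
        return r;
    }
    if (vrings) {
        /* Rings are enabled here, in one place, once the device is started. */
        r = sketch_set_vring_enable(hdev, true);
    }
    return r;
}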

Thanks,
Stefano

>
>And we want to explicitly enable CVQ first, which "only" vhost_net
>knows which is. To perform that in vhost_vdpa_dev_start would require
>quirks, involving one or more of:
>* Ignore vq enable calls if the device is not the CVQ one. How to
>signal what is the CVQ? Can we trust it will be the last one for all
>kind of devices?
>* Enable queues that do not belong to the last vhost_dev from the enable call.
>* Enable the rest of the queues from the last enable in reverse order.
>* Intercalate the "net load" callback between enabling the last
>vhost_vdpa device and enabling the rest of devices.
>* Add an "enable priority" order?
>
>Thanks!
>
>> Thanks
>>
>> >
>> >          if (peer->vring_enable) {
>> >              /* restore vring enable state */
>> > @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>> >                  goto err_start;
>> >              }
>> >          }
>> > -
>> > -        r = vhost_net_start_one(get_vhost_net(peer), dev);
>> > -        if (r < 0) {
>> > -            goto err_start;
>> > -        }
>> >      }
>> >
>> >      return 0;
>> > --
>> > 2.31.1
>> >
>>
>
Eugenio Perez Martin Jan. 13, 2023, 10:03 a.m. UTC | #4
On Fri, Jan 13, 2023 at 10:51 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Fri, Jan 13, 2023 at 09:19:00AM +0100, Eugenio Perez Martin wrote:
> >On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >> >
> >> > To restore the device at the destination of a live migration we send the
> >> > commands through control virtqueue. For a device to read CVQ it must
> >> > have received the DRIVER_OK status bit.
> >>
> >> This probably requires the support from the parent driver and requires
> >> some changes or fixes in the parent driver.
> >>
> >> Some drivers did:
> >>
> >> parent_set_status():
> >> if (DRIVER_OK)
> >>     if (queue_enable)
> >>         write queue_enable to the device
> >>
> >> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
> >>
> >
> >I don't get your point here. No device should start reading CVQ (or
> >any other VQ) without having received DRIVER_OK.
> >
> >Some parent drivers do not support sending the queue enable command
> >after DRIVER_OK, usually because they clean part of the state like the
> >set by set_vring_base. Even vdpa_net_sim needs fixes here.
> >
> >But my understanding is that it should be supported so I consider it a
> >bug. Especially after queue_reset patches. Is that what you mean?
> >
> >> >
> >> > However this opens a window where the device could start receiving
> >> > packets in rx queue 0 before it receives the RSS configuration. To avoid
> >> > that, we will not send vring_enable until all configuration is used by
> >> > the device.
> >> >
> >> > As a first step, run vhost_set_vring_ready for all vhost_net backend after
> >> > all of them are started (with DRIVER_OK). This code should not affect
> >> > vdpa.
> >> >
> >> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >> > ---
> >> >  hw/net/vhost_net.c | 17 ++++++++++++-----
> >> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >> > index c4eecc6f36..3900599465 100644
> >> > --- a/hw/net/vhost_net.c
> >> > +++ b/hw/net/vhost_net.c
> >> > @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >> >          } else {
> >> >              peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >> >          }
> >> > +        r = vhost_net_start_one(get_vhost_net(peer), dev);
> >> > +        if (r < 0) {
> >> > +            goto err_start;
> >> > +        }
> >> > +    }
> >> > +
> >> > +    for (int j = 0; j < nvhosts; j++) {
> >> > +        if (j < data_queue_pairs) {
> >> > +            peer = qemu_get_peer(ncs, j);
> >> > +        } else {
> >> > +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >> > +        }
> >>
> >> I fail to understand why we need to change the vhost_net layer? This
> >> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
> >> vhost_vdpa_dev_start()?
> >>
> >
> >The vhost-net layer explicitly calls vhost_set_vring_enable before
> >vhost_dev_start, and this is exactly the behavior we want to avoid.
> >Even if we make changes to vhost_dev, this change is still needed.
>
> I'm working on something similar since I'd like to re-work the following
> commit we merged just before 7.2 release:
>      4daa5054c5 vhost: enable vrings in vhost_dev_start() for vhost-user
>      devices
>
> vhost-net wasn't the only one who enabled vrings independently, but it
> was easy enough for others devices to avoid it and enable them in
> vhost_dev_start().
>
> Do you think can we avoid in some way this special behaviour of
> vhost-net and enable the vrings in vhost_dev_start?
>

Actually looking forward to it :). If that gets merged before this
series, I think we could drop this patch.

If I'm not wrong, the enable/disable dance is used just by vhost-user
at the moment.

Maxime, could you give us some hints about the tests to use to check
that the changes do not introduce regressions in vhost-user?

Thanks!

> Thanks,
> Stefano
>
> >
> >And we want to explicitly enable CVQ first, which "only" vhost_net
> >knows which is. To perform that in vhost_vdpa_dev_start would require
> >quirks, involving one or more of:
> >* Ignore vq enable calls if the device is not the CVQ one. How to
> >signal what is the CVQ? Can we trust it will be the last one for all
> >kind of devices?
> >* Enable queues that do not belong to the last vhost_dev from the enable call.
> >* Enable the rest of the queues from the last enable in reverse order.
> >* Intercalate the "net load" callback between enabling the last
> >vhost_vdpa device and enabling the rest of devices.
> >* Add an "enable priority" order?
> >
> >Thanks!
> >
> >> Thanks
> >>
> >> >
> >> >          if (peer->vring_enable) {
> >> >              /* restore vring enable state */
> >> > @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >> >                  goto err_start;
> >> >              }
> >> >          }
> >> > -
> >> > -        r = vhost_net_start_one(get_vhost_net(peer), dev);
> >> > -        if (r < 0) {
> >> > -            goto err_start;
> >> > -        }
> >> >      }
> >> >
> >> >      return 0;
> >> > --
> >> > 2.31.1
> >> >
> >>
> >
>
Stefano Garzarella Jan. 13, 2023, 10:37 a.m. UTC | #5
On Fri, Jan 13, 2023 at 11:03:17AM +0100, Eugenio Perez Martin wrote:
>On Fri, Jan 13, 2023 at 10:51 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>>
>> On Fri, Jan 13, 2023 at 09:19:00AM +0100, Eugenio Perez Martin wrote:
>> >On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
>> >>
>> >> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>> >> >
>> >> > To restore the device at the destination of a live migration we send the
>> >> > commands through control virtqueue. For a device to read CVQ it must
>> >> > have received the DRIVER_OK status bit.
>> >>
>> >> This probably requires the support from the parent driver and requires
>> >> some changes or fixes in the parent driver.
>> >>
>> >> Some drivers did:
>> >>
>> >> parent_set_status():
>> >> if (DRIVER_OK)
>> >>     if (queue_enable)
>> >>         write queue_enable to the device
>> >>
>> >> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
>> >>
>> >
>> >I don't get your point here. No device should start reading CVQ (or
>> >any other VQ) without having received DRIVER_OK.
>> >
>> >Some parent drivers do not support sending the queue enable command
>> >after DRIVER_OK, usually because they clean part of the state like the
>> >set by set_vring_base. Even vdpa_net_sim needs fixes here.
>> >
>> >But my understanding is that it should be supported so I consider it a
>> >bug. Especially after queue_reset patches. Is that what you mean?
>> >
>> >> >
>> >> > However this opens a window where the device could start receiving
>> >> > packets in rx queue 0 before it receives the RSS configuration. To avoid
>> >> > that, we will not send vring_enable until all configuration is used by
>> >> > the device.
>> >> >
>> >> > As a first step, run vhost_set_vring_ready for all vhost_net backend after
>> >> > all of them are started (with DRIVER_OK). This code should not affect
>> >> > vdpa.
>> >> >
>> >> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> >> > ---
>> >> >  hw/net/vhost_net.c | 17 ++++++++++++-----
>> >> >  1 file changed, 12 insertions(+), 5 deletions(-)
>> >> >
>> >> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>> >> > index c4eecc6f36..3900599465 100644
>> >> > --- a/hw/net/vhost_net.c
>> >> > +++ b/hw/net/vhost_net.c
>> >> > @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>> >> >          } else {
>> >> >              peer = qemu_get_peer(ncs, n->max_queue_pairs);
>> >> >          }
>> >> > +        r = vhost_net_start_one(get_vhost_net(peer), dev);
>> >> > +        if (r < 0) {
>> >> > +            goto err_start;
>> >> > +        }
>> >> > +    }
>> >> > +
>> >> > +    for (int j = 0; j < nvhosts; j++) {
>> >> > +        if (j < data_queue_pairs) {
>> >> > +            peer = qemu_get_peer(ncs, j);
>> >> > +        } else {
>> >> > +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>> >> > +        }
>> >>
>> >> I fail to understand why we need to change the vhost_net layer? This
>> >> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
>> >> vhost_vdpa_dev_start()?
>> >>
>> >
>> >The vhost-net layer explicitly calls vhost_set_vring_enable before
>> >vhost_dev_start, and this is exactly the behavior we want to avoid.
>> >Even if we make changes to vhost_dev, this change is still needed.
>>
>> I'm working on something similar since I'd like to re-work the following
>> commit we merged just before 7.2 release:
>>      4daa5054c5 vhost: enable vrings in vhost_dev_start() for vhost-user
>>      devices
>>
>> vhost-net wasn't the only one who enabled vrings independently, but it
>> was easy enough for others devices to avoid it and enable them in
>> vhost_dev_start().
>>
>> Do you think can we avoid in some way this special behaviour of
>> vhost-net and enable the vrings in vhost_dev_start?
>>
>
>Actually looking forward to it :). If that gets merged before this
>series, I think we could drop this patch.

I hope to send an RFC next week :-) let's see...

>
>If I'm not wrong the enable/disable dance is used just by vhost-user
>at the moment.

Yep, I got the same.

My doubt is that for vhost-user-net we enable only the first 
VirtIONet->curr_queue_pairs queues IIUC, while for other devices (and 
maybe also for vDPA devices) we enable all of them.

I need to figure out if it's safe to do this for vhost-user-net as well; 
otherwise I need to find a way to keep the current behaviour.

>
>Maxime, could you give us some hints about the tests to use to check
>that changes do not introduce regressions in vhost-user?

Yep, any help on how to test is very appreciated.

Thanks,
Stefano
Jason Wang Jan. 16, 2023, 6:36 a.m. UTC | #6
On 2023/1/13 16:19, Eugenio Perez Martin wrote:
> On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>> To restore the device at the destination of a live migration we send the
>>> commands through control virtqueue. For a device to read CVQ it must
>>> have received the DRIVER_OK status bit.
>> This probably requires the support from the parent driver and requires
>> some changes or fixes in the parent driver.
>>
>> Some drivers did:
>>
>> parent_set_status():
>> if (DRIVER_OK)
>>      if (queue_enable)
>>          write queue_enable to the device
>>
>> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
>>
> I don't get your point here. No device should start reading CVQ (or
> any other VQ) without having received DRIVER_OK.


If I understand the code correctly:

For CVQ, we do SET_VRING_ENABLE before DRIVER_OK, which is fine.

For datapath_vq, we do SET_VRING_ENABLE after DRIVER_OK, which requires 
parent driver support (explained above).


>
> Some parent drivers do not support sending the queue enable command
> after DRIVER_OK, usually because they clean part of the state like the
> set by set_vring_base. Even vdpa_net_sim needs fixes here.


Yes, so the question is:

Do we need another backend feature for this? (otherwise things may break 
silently)
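
For example, something like the following (a hypothetical sketch; the
feature bit name and value are made up here and are not an existing UAPI
definition) would let QEMU detect support instead of breaking silently:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical backend feature bit, for illustration only. */
#define MODEL_BACKEND_F_ENABLE_AFTER_DRIVER_OK 63

static bool model_can_enable_after_driver_ok(uint64_t backend_features)
{
    return backend_features & (1ULL << MODEL_BACKEND_F_ENABLE_AFTER_DRIVER_OK);
}

/*
 * Callers would fall back to the old ordering (vring_enable before
 * DRIVER_OK) when the bit is not negotiated, instead of relying on
 * parent-driver behaviour that the spec does not clearly mandate.
 */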


>
> But my understanding is that it should be supported so I consider it a
> bug.


Probably, we need to find some proof in the spec, e.g. in 3.1.1:

"""

7.Perform device-specific setup, including discovery of virtqueues for 
the device, optional per-bus setup, reading and possibly writing the 
device’s virtio configuration space, and population of virtqueues.
8.Set the DRIVER_OK status bit. At this point the device is “live”.

"""

So if my understanding is correct, "discovery of virtqueues for the 
device" implies queue_enable here, which is expected to be done before 
DRIVER_OK. But it doesn't say anything regarding the behaviour of 
setting the queue ready after DRIVER_OK.

I'm not sure whether it's a real bug or not; maybe Michael can comment on this.


>   Especially after queue_reset patches. Is that what you mean?


We haven't supported queue_reset in QEMU yet. But it allows writing 1 
to queue_enable after DRIVER_OK for sure.


>
>>> However this opens a window where the device could start receiving
>>> packets in rx queue 0 before it receives the RSS configuration. To avoid
>>> that, we will not send vring_enable until all configuration is used by
>>> the device.
>>>
>>> As a first step, run vhost_set_vring_ready for all vhost_net backend after
>>> all of them are started (with DRIVER_OK). This code should not affect
>>> vdpa.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>   hw/net/vhost_net.c | 17 ++++++++++++-----
>>>   1 file changed, 12 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>>> index c4eecc6f36..3900599465 100644
>>> --- a/hw/net/vhost_net.c
>>> +++ b/hw/net/vhost_net.c
>>> @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>           } else {
>>>               peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>           }
>>> +        r = vhost_net_start_one(get_vhost_net(peer), dev);
>>> +        if (r < 0) {
>>> +            goto err_start;
>>> +        }
>>> +    }
>>> +
>>> +    for (int j = 0; j < nvhosts; j++) {
>>> +        if (j < data_queue_pairs) {
>>> +            peer = qemu_get_peer(ncs, j);
>>> +        } else {
>>> +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>> +        }
>> I fail to understand why we need to change the vhost_net layer? This
>> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
>> vhost_vdpa_dev_start()?
>>
> The vhost-net layer explicitly calls vhost_set_vring_enable before
> vhost_dev_start, and this is exactly the behavior we want to avoid.
> Even if we make changes to vhost_dev, this change is still needed.


Note that the only user of vhost_set_vring_enable() is vhost-user, where 
the semantics are different:

It uses that to change the number of active queues:

static int peer_attach(VirtIONet *n, int index)

         if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
=>      vhost_set_vring_enable(nc->peer, 1);
     }

This is not the semantics of vhost-vDPA, which tries to be compliant with 
the virtio spec. So I'm not sure how it can help here.


>
> And we want to explicitly enable CVQ first, which "only" vhost_net
> knows which is.


This should be known by net/vhost-vdpa.c.


> To perform that in vhost_vdpa_dev_start would require
> quirks, involving one or more of:
> * Ignore vq enable calls if the device is not the CVQ one. How to
> signal what is the CVQ? Can we trust it will be the last one for all
> kind of devices?
> * Enable queues that do not belong to the last vhost_dev from the enable call.
> * Enable the rest of the queues from the last enable in reverse order.
> * Intercalate the "net load" callback between enabling the last
> vhost_vdpa device and enabling the rest of devices.
> * Add an "enable priority" order?


Haven't had time to think it through, but it would be better if we can 
limit the changes to the vhost-vdpa layer. E.g. currently the 
VHOST_VDPA_SET_VRING_ENABLE is done at vhost_dev_start().
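
For reference, roughly how that looks today (a simplified model, not the
exact hw/virtio/vhost-vdpa.c code): all rings of a vhost_dev are enabled
from the dev_start hook, so any per-queue ordering would have to be
expressed there.

#include <stdbool.h>

/* Simplified model of the current vhost-vdpa start path, for illustration. */
struct model_vhost_dev {
    int vq_index;  /* first ring owned by this vhost_dev */
    int nvqs;      /* number of rings it owns */
};

static int model_set_vring_enable(struct model_vhost_dev *dev, int ring, bool on)
{
    /* Stands in for the VHOST_VDPA_SET_VRING_ENABLE ioctl on 'ring'. */
    (void)dev; (void)ring; (void)on;
    return 0;
}

static int model_vdpa_dev_start(struct model_vhost_dev *dev, bool started)
{
    if (!started) {
        return 0;
    }
    /* Every ring of this vhost_dev gets enabled here, in one place. */
    for (int i = 0; i < dev->nvqs; i++) {
        int r = model_set_vring_enable(dev, dev->vq_index + i, true);
        if (r < 0) {
            return r;
        }
    }
    return 0;
}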

Thanks


>
> Thanks!
>
>> Thanks
>>
>>>           if (peer->vring_enable) {
>>>               /* restore vring enable state */
>>> @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>                   goto err_start;
>>>               }
>>>           }
>>> -
>>> -        r = vhost_net_start_one(get_vhost_net(peer), dev);
>>> -        if (r < 0) {
>>> -            goto err_start;
>>> -        }
>>>       }
>>>
>>>       return 0;
>>> --
>>> 2.31.1
>>>
Eugenio Perez Martin Jan. 16, 2023, 4:16 p.m. UTC | #7
On Mon, Jan 16, 2023 at 7:37 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2023/1/13 16:19, Eugenio Perez Martin 写道:
> > On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>> To restore the device at the destination of a live migration we send the
> >>> commands through control virtqueue. For a device to read CVQ it must
> >>> have received the DRIVER_OK status bit.
> >> This probably requires the support from the parent driver and requires
> >> some changes or fixes in the parent driver.
> >>
> >> Some drivers did:
> >>
> >> parent_set_status():
> >> if (DRIVER_OK)
> >>      if (queue_enable)
> >>          write queue_enable to the device
> >>
> >> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
> >>
> > I don't get your point here. No device should start reading CVQ (or
> > any other VQ) without having received DRIVER_OK.
>
>
> If I understand the code correctly:
>
> For CVQ, we do SET_VRING_ENABLE before DRIVER_OK, that's fine.
>
> For datapath_vq, we do SET_VRING_ENABLE after DRIVER_OK, this requires
> parent driver support (explained above)
>
>
> >
> > Some parent drivers do not support sending the queue enable command
> > after DRIVER_OK, usually because they clean part of the state like the
> > set by set_vring_base. Even vdpa_net_sim needs fixes here.
>
>
> Yes, so the question is:
>
> Do we need another backend feature for this? (otherwise thing may break
> silently)
>
>
> >
> > But my understanding is that it should be supported so I consider it a
> > bug.
>
>
> Probably, we need fine some proof in the spec, e.g in 3.1.1:
>
> """
>
> 7.Perform device-specific setup, including discovery of virtqueues for
> the device, optional per-bus setup, reading and possibly writing the
> device’s virtio configuration space, and population of virtqueues.
> 8.Set the DRIVER_OK status bit. At this point the device is “live”.
>
> """
>
> So if my understanding is correct, "discovery of virtqueues for the
> device" implies queue_enable here which is expected to be done before
> DRIVER_OK. But it doesn't say anything regrading to the behaviour of
> setting queue ready after DRIVER_OK.
>
> I'm not sure it's a real bug or not, may Michael and comment on this.
>

Right, input on this topic would be really appreciated.

>
> >   Especially after queue_reset patches. Is that what you mean?
>
>
> We haven't supported queue_reset yet in Qemu. But it allows to write 1
> to queue_enable after DRIVER_OK for sure.
>

I was not clear; I meant in the emulated device. I'm testing this
series with the _F_STATE proposal.

>
> >
> >>> However this opens a window where the device could start receiving
> >>> packets in rx queue 0 before it receives the RSS configuration. To avoid
> >>> that, we will not send vring_enable until all configuration is used by
> >>> the device.
> >>>
> >>> As a first step, run vhost_set_vring_ready for all vhost_net backend after
> >>> all of them are started (with DRIVER_OK). This code should not affect
> >>> vdpa.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>   hw/net/vhost_net.c | 17 ++++++++++++-----
> >>>   1 file changed, 12 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>> index c4eecc6f36..3900599465 100644
> >>> --- a/hw/net/vhost_net.c
> >>> +++ b/hw/net/vhost_net.c
> >>> @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >>>           } else {
> >>>               peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >>>           }
> >>> +        r = vhost_net_start_one(get_vhost_net(peer), dev);
> >>> +        if (r < 0) {
> >>> +            goto err_start;
> >>> +        }
> >>> +    }
> >>> +
> >>> +    for (int j = 0; j < nvhosts; j++) {
> >>> +        if (j < data_queue_pairs) {
> >>> +            peer = qemu_get_peer(ncs, j);
> >>> +        } else {
> >>> +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
> >>> +        }
> >> I fail to understand why we need to change the vhost_net layer? This
> >> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
> >> vhost_vdpa_dev_start()?
> >>
> > The vhost-net layer explicitly calls vhost_set_vring_enable before
> > vhost_dev_start, and this is exactly the behavior we want to avoid.
> > Even if we make changes to vhost_dev, this change is still needed.
>
>
> Note that the only user of vhost_set_vring_enable() is vhost-user where
> the semantic is different:
>
> It uses that to changes the number of active queues:
>
> static int peer_attach(VirtIONet *n, int index)
>
>          if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> =>      vhost_set_vring_enable(nc->peer, 1);
>      }
>
> This is not the semantic of vhost-vDPA that tries to be complaint with
> virtio-spec. So I'm not sure how it can help here.
>

Right, but previous changes use the enable callback to delay enabling
the datapath virtqueues. I'll try to fit the changes into
virtio/vhost-vdpa though.

Thanks!

>
> >
> > And we want to explicitly enable CVQ first, which "only" vhost_net
> > knows which is.
>
>
> This should be known by net/vhost-vdpa.c.
>
>
> > To perform that in vhost_vdpa_dev_start would require
> > quirks, involving one or more of:
> > * Ignore vq enable calls if the device is not the CVQ one. How to
> > signal what is the CVQ? Can we trust it will be the last one for all
> > kind of devices?
> > * Enable queues that do not belong to the last vhost_dev from the enable call.
> > * Enable the rest of the queues from the last enable in reverse order.
> > * Intercalate the "net load" callback between enabling the last
> > vhost_vdpa device and enabling the rest of devices.
> > * Add an "enable priority" order?
>
>
> Haven't had time in thinking through, but it would be better if we can
> limit the changes in vhost-vdpa layer. E.g currently the
> VHOST_VDPA_SET_VRING_ENABLE is done at vhost_dev_start().
>
> Thanks
>
>
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>>           if (peer->vring_enable) {
> >>>               /* restore vring enable state */
> >>> @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >>>                   goto err_start;
> >>>               }
> >>>           }
> >>> -
> >>> -        r = vhost_net_start_one(get_vhost_net(peer), dev);
> >>> -        if (r < 0) {
> >>> -            goto err_start;
> >>> -        }
> >>>       }
> >>>
> >>>       return 0;
> >>> --
> >>> 2.31.1
> >>>
>
Jason Wang Jan. 17, 2023, 5:36 a.m. UTC | #8
On 2023/1/17 00:16, Eugenio Perez Martin wrote:
> On Mon, Jan 16, 2023 at 7:37 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2023/1/13 16:19, Eugenio Perez Martin wrote:
>>> On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> To restore the device at the destination of a live migration we send the
>>>>> commands through control virtqueue. For a device to read CVQ it must
>>>>> have received the DRIVER_OK status bit.
>>>> This probably requires the support from the parent driver and requires
>>>> some changes or fixes in the parent driver.
>>>>
>>>> Some drivers did:
>>>>
>>>> parent_set_status():
>>>> if (DRIVER_OK)
>>>>       if (queue_enable)
>>>>           write queue_enable to the device
>>>>
>>>> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
>>>>
>>> I don't get your point here. No device should start reading CVQ (or
>>> any other VQ) without having received DRIVER_OK.
>>
>> If I understand the code correctly:
>>
>> For CVQ, we do SET_VRING_ENABLE before DRIVER_OK, that's fine.
>>
>> For datapath_vq, we do SET_VRING_ENABLE after DRIVER_OK, this requires
>> parent driver support (explained above)
>>
>>
>>> Some parent drivers do not support sending the queue enable command
>>> after DRIVER_OK, usually because they clean part of the state like the
>>> set by set_vring_base. Even vdpa_net_sim needs fixes here.
>>
>> Yes, so the question is:
>>
>> Do we need another backend feature for this? (otherwise thing may break
>> silently)
>>
>>
>>> But my understanding is that it should be supported so I consider it a
>>> bug.
>>
>> Probably, we need fine some proof in the spec, e.g in 3.1.1:
>>
>> """
>>
>> 7.Perform device-specific setup, including discovery of virtqueues for
>> the device, optional per-bus setup, reading and possibly writing the
>> device’s virtio configuration space, and population of virtqueues.
>> 8.Set the DRIVER_OK status bit. At this point the device is “live”.
>>
>> """
>>
>> So if my understanding is correct, "discovery of virtqueues for the
>> device" implies queue_enable here which is expected to be done before
>> DRIVER_OK. But it doesn't say anything regrading to the behaviour of
>> setting queue ready after DRIVER_OK.
>>
>> I'm not sure it's a real bug or not, may Michael and comment on this.
>>
> Right, input on this topic would be really appreciated.
>
>>>    Especially after queue_reset patches. Is that what you mean?
>>
>> We haven't supported queue_reset yet in Qemu. But it allows to write 1
>> to queue_enable after DRIVER_OK for sure.
>>
> I was not clear, I meant in the emulated device. I'm testing this
> series with the proposal of _F_STATE.
>
>>>>> However this opens a window where the device could start receiving
>>>>> packets in rx queue 0 before it receives the RSS configuration. To avoid
>>>>> that, we will not send vring_enable until all configuration is used by
>>>>> the device.
>>>>>
>>>>> As a first step, run vhost_set_vring_ready for all vhost_net backend after
>>>>> all of them are started (with DRIVER_OK). This code should not affect
>>>>> vdpa.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>    hw/net/vhost_net.c | 17 ++++++++++++-----
>>>>>    1 file changed, 12 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>>>>> index c4eecc6f36..3900599465 100644
>>>>> --- a/hw/net/vhost_net.c
>>>>> +++ b/hw/net/vhost_net.c
>>>>> @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>>            } else {
>>>>>                peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>>>            }
>>>>> +        r = vhost_net_start_one(get_vhost_net(peer), dev);
>>>>> +        if (r < 0) {
>>>>> +            goto err_start;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    for (int j = 0; j < nvhosts; j++) {
>>>>> +        if (j < data_queue_pairs) {
>>>>> +            peer = qemu_get_peer(ncs, j);
>>>>> +        } else {
>>>>> +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>>> +        }
>>>> I fail to understand why we need to change the vhost_net layer? This
>>>> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
>>>> vhost_vdpa_dev_start()?
>>>>
>>> The vhost-net layer explicitly calls vhost_set_vring_enable before
>>> vhost_dev_start, and this is exactly the behavior we want to avoid.
>>> Even if we make changes to vhost_dev, this change is still needed.
>>
>> Note that the only user of vhost_set_vring_enable() is vhost-user where
>> the semantic is different:
>>
>> It uses that to changes the number of active queues:
>>
>> static int peer_attach(VirtIONet *n, int index)
>>
>>           if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
>> =>      vhost_set_vring_enable(nc->peer, 1);
>>       }
>>
>> This is not the semantic of vhost-vDPA that tries to be complaint with
>> virtio-spec. So I'm not sure how it can help here.
>>
> Right, but previous changes use enable callback to delay the enable of
> the datapath virtqueues. I'll try to fit the changes in
> virtio/vhost-vdpa though.


This would make things more complicated. As mentioned above, 
vhost-user's usage of vhost_set_vring_enable() is not spec compliant, 
while the vhost-vDPA VHOST_VDPA_SET_VRING_ENABLE tries to be compliant 
with the spec.

If we try to mix the two uses, it may result in confusion for the readers.

Thanks


>
> Thanks!
>
>>> And we want to explicitly enable CVQ first, which "only" vhost_net
>>> knows which is.
>>
>> This should be known by net/vhost-vdpa.c.
>>
>>
>>> To perform that in vhost_vdpa_dev_start would require
>>> quirks, involving one or more of:
>>> * Ignore vq enable calls if the device is not the CVQ one. How to
>>> signal what is the CVQ? Can we trust it will be the last one for all
>>> kind of devices?
>>> * Enable queues that do not belong to the last vhost_dev from the enable call.
>>> * Enable the rest of the queues from the last enable in reverse order.
>>> * Intercalate the "net load" callback between enabling the last
>>> vhost_vdpa device and enabling the rest of devices.
>>> * Add an "enable priority" order?
>>
>> Haven't had time in thinking through, but it would be better if we can
>> limit the changes in vhost-vdpa layer. E.g currently the
>> VHOST_VDPA_SET_VRING_ENABLE is done at vhost_dev_start().
>>
>> Thanks
>>
>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>>            if (peer->vring_enable) {
>>>>>                /* restore vring enable state */
>>>>> @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>>                    goto err_start;
>>>>>                }
>>>>>            }
>>>>> -
>>>>> -        r = vhost_net_start_one(get_vhost_net(peer), dev);
>>>>> -        if (r < 0) {
>>>>> -            goto err_start;
>>>>> -        }
>>>>>        }
>>>>>
>>>>>        return 0;
>>>>> --
>>>>> 2.31.1
>>>>>
Maxime Coquelin Jan. 17, 2023, 3:15 p.m. UTC | #9
Hi Eugenio,

On 1/13/23 11:03, Eugenio Perez Martin wrote:
> On Fri, Jan 13, 2023 at 10:51 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>>
>> On Fri, Jan 13, 2023 at 09:19:00AM +0100, Eugenio Perez Martin wrote:
>>> On Fri, Jan 13, 2023 at 5:36 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>> On Fri, Jan 13, 2023 at 1:25 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>>
>>>>> To restore the device at the destination of a live migration we send the
>>>>> commands through control virtqueue. For a device to read CVQ it must
>>>>> have received the DRIVER_OK status bit.
>>>>
>>>> This probably requires the support from the parent driver and requires
>>>> some changes or fixes in the parent driver.
>>>>
>>>> Some drivers did:
>>>>
>>>> parent_set_status():
>>>> if (DRIVER_OK)
>>>>      if (queue_enable)
>>>>          write queue_enable to the device
>>>>
>>>> Examples are IFCVF or even vp_vdpa at least. MLX5 seems to be fine.
>>>>
>>>
>>> I don't get your point here. No device should start reading CVQ (or
>>> any other VQ) without having received DRIVER_OK.
>>>
>>> Some parent drivers do not support sending the queue enable command
>>> after DRIVER_OK, usually because they clean part of the state like the
>>> set by set_vring_base. Even vdpa_net_sim needs fixes here.
>>>
>>> But my understanding is that it should be supported so I consider it a
>>> bug. Especially after queue_reset patches. Is that what you mean?
>>>
>>>>>
>>>>> However this opens a window where the device could start receiving
>>>>> packets in rx queue 0 before it receives the RSS configuration. To avoid
>>>>> that, we will not send vring_enable until all configuration is used by
>>>>> the device.
>>>>>
>>>>> As a first step, run vhost_set_vring_ready for all vhost_net backend after
>>>>> all of them are started (with DRIVER_OK). This code should not affect
>>>>> vdpa.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>   hw/net/vhost_net.c | 17 ++++++++++++-----
>>>>>   1 file changed, 12 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>>>>> index c4eecc6f36..3900599465 100644
>>>>> --- a/hw/net/vhost_net.c
>>>>> +++ b/hw/net/vhost_net.c
>>>>> @@ -399,6 +399,18 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>>           } else {
>>>>>               peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>>>           }
>>>>> +        r = vhost_net_start_one(get_vhost_net(peer), dev);
>>>>> +        if (r < 0) {
>>>>> +            goto err_start;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    for (int j = 0; j < nvhosts; j++) {
>>>>> +        if (j < data_queue_pairs) {
>>>>> +            peer = qemu_get_peer(ncs, j);
>>>>> +        } else {
>>>>> +            peer = qemu_get_peer(ncs, n->max_queue_pairs);
>>>>> +        }
>>>>
>>>> I fail to understand why we need to change the vhost_net layer? This
>>>> is vhost-vDPA specific, so I wonder if we can limit the changes to e.g
>>>> vhost_vdpa_dev_start()?
>>>>
>>>
>>> The vhost-net layer explicitly calls vhost_set_vring_enable before
>>> vhost_dev_start, and this is exactly the behavior we want to avoid.
>>> Even if we make changes to vhost_dev, this change is still needed.
>>
>> I'm working on something similar since I'd like to re-work the following
>> commit we merged just before 7.2 release:
>>       4daa5054c5 vhost: enable vrings in vhost_dev_start() for vhost-user
>>       devices
>>
>> vhost-net wasn't the only one who enabled vrings independently, but it
>> was easy enough for others devices to avoid it and enable them in
>> vhost_dev_start().
>>
>> Do you think can we avoid in some way this special behaviour of
>> vhost-net and enable the vrings in vhost_dev_start?
>>
> 
> Actually looking forward to it :). If that gets merged before this
> series, I think we could drop this patch.
> 
> If I'm not wrong the enable/disable dance is used just by vhost-user
> at the moment.
> 
> Maxime, could you give us some hints about the tests to use to check
> that changes do not introduce regressions in vhost-user?

You can use DPDK's testpmd [0] tool with Vhost PMD, e.g.:

#single queue pair
# dpdk-testpmd -l <CORE IDs> --no-pci \
    --vdev=net_vhost0,iface=/tmp/vhost-user1 -- -i

#multiqueue
# dpdk-testpmd -l <CORE IDs> --no-pci \
    --vdev=net_vhost0,iface=/tmp/vhost-user1,queues=4 -- -i --rxq=4 --txq=4

[0]: https://doc.dpdk.org/guides/testpmd_app_ug/index.html
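
And for completeness, one possible QEMU side to pair with the multiqueue
case above (paths, sizes and IDs are only examples; vhost-user needs
shareable guest memory, hence the memfd backend):

# qemu-system-x86_64 -machine q35,accel=kvm -m 1G \
    -object memory-backend-memfd,id=mem,size=1G,share=on \
    -numa node,memdev=mem \
    -chardev socket,id=char0,path=/tmp/vhost-user1 \
    -netdev vhost-user,id=net0,chardev=char0,queues=4 \
    -device virtio-net-pci,netdev=net0,mq=on,vectors=10 \
    <disk/kernel options as usual>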

Maxime

> Thanks!
> 
>> Thanks,
>> Stefano
>>
>>>
>>> And we want to explicitly enable CVQ first, which "only" vhost_net
>>> knows which is. To perform that in vhost_vdpa_dev_start would require
>>> quirks, involving one or more of:
>>> * Ignore vq enable calls if the device is not the CVQ one. How to
>>> signal what is the CVQ? Can we trust it will be the last one for all
>>> kind of devices?
>>> * Enable queues that do not belong to the last vhost_dev from the enable call.
>>> * Enable the rest of the queues from the last enable in reverse order.
>>> * Intercalate the "net load" callback between enabling the last
>>> vhost_vdpa device and enabling the rest of devices.
>>> * Add an "enable priority" order?
>>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>>
>>>>>           if (peer->vring_enable) {
>>>>>               /* restore vring enable state */
>>>>> @@ -408,11 +420,6 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>>>>>                   goto err_start;
>>>>>               }
>>>>>           }
>>>>> -
>>>>> -        r = vhost_net_start_one(get_vhost_net(peer), dev);
>>>>> -        if (r < 0) {
>>>>> -            goto err_start;
>>>>> -        }
>>>>>       }
>>>>>
>>>>>       return 0;
>>>>> --
>>>>> 2.31.1
>>>>>
>>>>
>>>
>>
>

Patch

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index c4eecc6f36..3900599465 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -399,6 +399,18 @@  int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         } else {
             peer = qemu_get_peer(ncs, n->max_queue_pairs);
         }
+        r = vhost_net_start_one(get_vhost_net(peer), dev);
+        if (r < 0) {
+            goto err_start;
+        }
+    }
+
+    for (int j = 0; j < nvhosts; j++) {
+        if (j < data_queue_pairs) {
+            peer = qemu_get_peer(ncs, j);
+        } else {
+            peer = qemu_get_peer(ncs, n->max_queue_pairs);
+        }
 
         if (peer->vring_enable) {
             /* restore vring enable state */
@@ -408,11 +420,6 @@  int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
                 goto err_start;
             }
         }
-
-        r = vhost_net_start_one(get_vhost_net(peer), dev);
-        if (r < 0) {
-            goto err_start;
-        }
     }
 
     return 0;