diff mbox series

[RFC,v2,12/13] vdpa: preemptive kick at enable

Message ID 20230112172434.760850-13-eperezma@redhat.com (mailing list archive)
State New, archived
Headers show
Series Dinamycally switch to vhost shadow virtqueues at vdpa net migration | expand

Commit Message

Eugenio Perez Martin Jan. 12, 2023, 5:24 p.m. UTC
Spuriously kick the destination device's queue so it knows in case there
are new descriptors.

RFC: This is somehow a gray area. The guest may have placed descriptors
in a virtqueue but not kicked it, so it might be surprised if the device
starts processing it.

However, that information is not in the migration stream and it should
be an edge case anyhow, being resilient to parallel notifications from
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Jason Wang Jan. 13, 2023, 2:31 a.m. UTC | #1
On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Spuriously kick the destination device's queue so it knows in case there
> are new descriptors.
>
> RFC: This is somehow a gray area. The guest may have placed descriptors
> in a virtqueue but not kicked it, so it might be surprised if the device
> starts processing it.

So I think this is kind of the work of the vDPA parent. For the parent
that needs this trick, we should do it in the parent driver.

Thanks

>
> However, that information is not in the migration stream and it should
> be an edge case anyhow, being resilient to parallel notifications from
> the guest.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 40b7e8706a..dff94355dd 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
>      }
>      trace_vhost_vdpa_set_vring_ready(dev);
>      for (i = 0; i < dev->nvqs; ++i) {
> +        VirtQueue *vq;
>          struct vhost_vring_state state = {
>              .index = dev->vq_index + i,
>              .num = 1,
>          };
>          vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> +
> +        /* Preemptive kick */
> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
>      }
>      return 0;
>  }
> --
> 2.31.1
>
Zhu, Lingshan Jan. 13, 2023, 3:25 a.m. UTC | #2
On 1/13/2023 10:31 AM, Jason Wang wrote:
> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>> Spuriously kick the destination device's queue so it knows in case there
>> are new descriptors.
>>
>> RFC: This is somehow a gray area. The guest may have placed descriptors
>> in a virtqueue but not kicked it, so it might be surprised if the device
>> starts processing it.
> So I think this is kind of the work of the vDPA parent. For the parent
> that needs this trick, we should do it in the parent driver.
Agree, it looks easier implementing this in parent driver,
I can implement it in ifcvf set_vq_ready right now

Thanks
Zhu Lingshan
>
> Thanks
>
>> However, that information is not in the migration stream and it should
>> be an edge case anyhow, being resilient to parallel notifications from
>> the guest.
>>
>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> ---
>>   hw/virtio/vhost-vdpa.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>> index 40b7e8706a..dff94355dd 100644
>> --- a/hw/virtio/vhost-vdpa.c
>> +++ b/hw/virtio/vhost-vdpa.c
>> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
>>       }
>>       trace_vhost_vdpa_set_vring_ready(dev);
>>       for (i = 0; i < dev->nvqs; ++i) {
>> +        VirtQueue *vq;
>>           struct vhost_vring_state state = {
>>               .index = dev->vq_index + i,
>>               .num = 1,
>>           };
>>           vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
>> +
>> +        /* Preemptive kick */
>> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
>> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
>>       }
>>       return 0;
>>   }
>> --
>> 2.31.1
>>
Jason Wang Jan. 13, 2023, 3:39 a.m. UTC | #3
On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>
>
>
> On 1/13/2023 10:31 AM, Jason Wang wrote:
> > On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >> Spuriously kick the destination device's queue so it knows in case there
> >> are new descriptors.
> >>
> >> RFC: This is somehow a gray area. The guest may have placed descriptors
> >> in a virtqueue but not kicked it, so it might be surprised if the device
> >> starts processing it.
> > So I think this is kind of the work of the vDPA parent. For the parent
> > that needs this trick, we should do it in the parent driver.
> Agree, it looks easier implementing this in parent driver,
> I can implement it in ifcvf set_vq_ready right now

Great, but please check whether or not it is really needed.

Some device implementation could check the available descriptions
after DRIVER_OK without waiting for a kick.

Thanks

>
> Thanks
> Zhu Lingshan
> >
> > Thanks
> >
> >> However, that information is not in the migration stream and it should
> >> be an edge case anyhow, being resilient to parallel notifications from
> >> the guest.
> >>
> >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >> ---
> >>   hw/virtio/vhost-vdpa.c | 5 +++++
> >>   1 file changed, 5 insertions(+)
> >>
> >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >> index 40b7e8706a..dff94355dd 100644
> >> --- a/hw/virtio/vhost-vdpa.c
> >> +++ b/hw/virtio/vhost-vdpa.c
> >> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
> >>       }
> >>       trace_vhost_vdpa_set_vring_ready(dev);
> >>       for (i = 0; i < dev->nvqs; ++i) {
> >> +        VirtQueue *vq;
> >>           struct vhost_vring_state state = {
> >>               .index = dev->vq_index + i,
> >>               .num = 1,
> >>           };
> >>           vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> >> +
> >> +        /* Preemptive kick */
> >> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> >> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
> >>       }
> >>       return 0;
> >>   }
> >> --
> >> 2.31.1
> >>
>
Eugenio Perez Martin Jan. 13, 2023, 9:06 a.m. UTC | #4
On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >
> >
> >
> > On 1/13/2023 10:31 AM, Jason Wang wrote:
> > > On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >> Spuriously kick the destination device's queue so it knows in case there
> > >> are new descriptors.
> > >>
> > >> RFC: This is somehow a gray area. The guest may have placed descriptors
> > >> in a virtqueue but not kicked it, so it might be surprised if the device
> > >> starts processing it.
> > > So I think this is kind of the work of the vDPA parent. For the parent
> > > that needs this trick, we should do it in the parent driver.
> > Agree, it looks easier implementing this in parent driver,
> > I can implement it in ifcvf set_vq_ready right now
>
> Great, but please check whether or not it is really needed.
>
> Some device implementation could check the available descriptions
> after DRIVER_OK without waiting for a kick.
>

So IIUC we can entirely drop this from the series (and I hope we can).
But then, what with the devices that does *not* check for them?

If we drop it it seems to me we must mandate devices to check for
descriptors at queue_enable. The queue could stall if not, isn't it?

Thanks!

> Thanks
>
> >
> > Thanks
> > Zhu Lingshan
> > >
> > > Thanks
> > >
> > >> However, that information is not in the migration stream and it should
> > >> be an edge case anyhow, being resilient to parallel notifications from
> > >> the guest.
> > >>
> > >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >> ---
> > >>   hw/virtio/vhost-vdpa.c | 5 +++++
> > >>   1 file changed, 5 insertions(+)
> > >>
> > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >> index 40b7e8706a..dff94355dd 100644
> > >> --- a/hw/virtio/vhost-vdpa.c
> > >> +++ b/hw/virtio/vhost-vdpa.c
> > >> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
> > >>       }
> > >>       trace_vhost_vdpa_set_vring_ready(dev);
> > >>       for (i = 0; i < dev->nvqs; ++i) {
> > >> +        VirtQueue *vq;
> > >>           struct vhost_vring_state state = {
> > >>               .index = dev->vq_index + i,
> > >>               .num = 1,
> > >>           };
> > >>           vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> > >> +
> > >> +        /* Preemptive kick */
> > >> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> > >> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
> > >>       }
> > >>       return 0;
> > >>   }
> > >> --
> > >> 2.31.1
> > >>
> >
>
Jason Wang Jan. 16, 2023, 7:02 a.m. UTC | #5
在 2023/1/13 17:06, Eugenio Perez Martin 写道:
> On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>>
>>>
>>> On 1/13/2023 10:31 AM, Jason Wang wrote:
>>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> Spuriously kick the destination device's queue so it knows in case there
>>>>> are new descriptors.
>>>>>
>>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
>>>>> in a virtqueue but not kicked it, so it might be surprised if the device
>>>>> starts processing it.
>>>> So I think this is kind of the work of the vDPA parent. For the parent
>>>> that needs this trick, we should do it in the parent driver.
>>> Agree, it looks easier implementing this in parent driver,
>>> I can implement it in ifcvf set_vq_ready right now
>> Great, but please check whether or not it is really needed.
>>
>> Some device implementation could check the available descriptions
>> after DRIVER_OK without waiting for a kick.
>>
> So IIUC we can entirely drop this from the series (and I hope we can).
> But then, what with the devices that does *not* check for them?


It needs mediation in the vDPA parent driver.


>
> If we drop it it seems to me we must mandate devices to check for
> descriptors at queue_enable. The queue could stall if not, isn't it?


I'm not sure, did you see real issue with this? (Note that we don't do 
this for vhost-user-(vDPA))

Btw, the code can result of kick before DRIVER_OK, which seems racy.

Thanks


>
> Thanks!
>
>> Thanks
>>
>>> Thanks
>>> Zhu Lingshan
>>>> Thanks
>>>>
>>>>> However, that information is not in the migration stream and it should
>>>>> be an edge case anyhow, being resilient to parallel notifications from
>>>>> the guest.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>    hw/virtio/vhost-vdpa.c | 5 +++++
>>>>>    1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>> index 40b7e8706a..dff94355dd 100644
>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
>>>>>        }
>>>>>        trace_vhost_vdpa_set_vring_ready(dev);
>>>>>        for (i = 0; i < dev->nvqs; ++i) {
>>>>> +        VirtQueue *vq;
>>>>>            struct vhost_vring_state state = {
>>>>>                .index = dev->vq_index + i,
>>>>>                .num = 1,
>>>>>            };
>>>>>            vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
>>>>> +
>>>>> +        /* Preemptive kick */
>>>>> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
>>>>> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
>>>>>        }
>>>>>        return 0;
>>>>>    }
>>>>> --
>>>>> 2.31.1
>>>>>
Si-Wei Liu Feb. 2, 2023, 12:56 a.m. UTC | #6
On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
> On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>>
>>>
>>> On 1/13/2023 10:31 AM, Jason Wang wrote:
>>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> Spuriously kick the destination device's queue so it knows in case there
>>>>> are new descriptors.
>>>>>
>>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
>>>>> in a virtqueue but not kicked it, so it might be surprised if the device
>>>>> starts processing it.
>>>> So I think this is kind of the work of the vDPA parent. For the parent
>>>> that needs this trick, we should do it in the parent driver.
>>> Agree, it looks easier implementing this in parent driver,
>>> I can implement it in ifcvf set_vq_ready right now
>> Great, but please check whether or not it is really needed.
>>
>> Some device implementation could check the available descriptions
>> after DRIVER_OK without waiting for a kick.
>>
> So IIUC we can entirely drop this from the series (and I hope we can).
> But then, what with the devices that does *not* check for them?
I wonder how the kick can be missed from the first place. Supposedly the 
moment when vhost_dev_stop() calls .suspend() into vdpa driver, the 
vcpus already stopped running (vm_running = false) and all pending kicks 
are delivered through vhost-vdpa's host notifiers or mapped doorbell 
already then device won't get new ones. If the device intends to 
purposely ignore (note: this could be a device bug) pending kicks during 
.suspend(), then consequently it should check available descriptors 
after reaching driver_ok to process outstanding descriptors, making up 
for the missing kick. If the vdpa driver doesn't support .suspend(), 
then it should flush the work before .reset() - vhost-scsi does it this 
way.  Or otherwise I think it's the norm (right thing to do) device 
should process pending kicks before guest memory is to be unmapped at 
the late game of vhost_dev_stop(). Is there any case kicks may be missing?

-Siwei


>
> If we drop it it seems to me we must mandate devices to check for
> descriptors at queue_enable. The queue could stall if not, isn't it?
>
> Thanks!
>
>> Thanks
>>
>>> Thanks
>>> Zhu Lingshan
>>>> Thanks
>>>>
>>>>> However, that information is not in the migration stream and it should
>>>>> be an edge case anyhow, being resilient to parallel notifications from
>>>>> the guest.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>    hw/virtio/vhost-vdpa.c | 5 +++++
>>>>>    1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>> index 40b7e8706a..dff94355dd 100644
>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
>>>>>        }
>>>>>        trace_vhost_vdpa_set_vring_ready(dev);
>>>>>        for (i = 0; i < dev->nvqs; ++i) {
>>>>> +        VirtQueue *vq;
>>>>>            struct vhost_vring_state state = {
>>>>>                .index = dev->vq_index + i,
>>>>>                .num = 1,
>>>>>            };
>>>>>            vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
>>>>> +
>>>>> +        /* Preemptive kick */
>>>>> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
>>>>> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
>>>>>        }
>>>>>        return 0;
>>>>>    }
>>>>> --
>>>>> 2.31.1
>>>>>
Eugenio Perez Martin Feb. 2, 2023, 4:53 p.m. UTC | #7
On Thu, Feb 2, 2023 at 1:57 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
> > On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >>>
> >>>
> >>> On 1/13/2023 10:31 AM, Jason Wang wrote:
> >>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>>>> Spuriously kick the destination device's queue so it knows in case there
> >>>>> are new descriptors.
> >>>>>
> >>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
> >>>>> in a virtqueue but not kicked it, so it might be surprised if the device
> >>>>> starts processing it.
> >>>> So I think this is kind of the work of the vDPA parent. For the parent
> >>>> that needs this trick, we should do it in the parent driver.
> >>> Agree, it looks easier implementing this in parent driver,
> >>> I can implement it in ifcvf set_vq_ready right now
> >> Great, but please check whether or not it is really needed.
> >>
> >> Some device implementation could check the available descriptions
> >> after DRIVER_OK without waiting for a kick.
> >>
> > So IIUC we can entirely drop this from the series (and I hope we can).
> > But then, what with the devices that does *not* check for them?
> I wonder how the kick can be missed from the first place. Supposedly the
> moment when vhost_dev_stop() calls .suspend() into vdpa driver, the
> vcpus already stopped running (vm_running = false) and all pending kicks
> are delivered through vhost-vdpa's host notifiers or mapped doorbell
> already then device won't get new ones.

I'm thinking now in cases like the net rx queue.

When the guest starts it fills it and kicks the device. Let's say
avail_idx is 255.

Following the qemu emulated virtio net,
hw/virtio/virtio.c:virtqueue_split_pop will stash shadow_avail_idx =
255, and it will not check it again until it is out of rx descriptors.

Now the NIC fills N < 255 receive buffers, and VMM migrates. Will the
destination device check rx avail idx even if it has not received any
kick? (here could be at startup or when it needs to receive a packet).
- If the answer is yes, and it will be a bug not to check it, then we
can drop this patch. We're covered even if there is a possibility of
losing a kick in the source.
- If the answer is that it is not mandatory, we need to solve it
somehow. To me, the best way is to spuriously kick as we don't need
changes in the device, all we need is here. A new feature flag
_F_CHECK_AVAIL_ON_STARTUP or equivalent would work the same, but I
think it complicates everything more.

For tx the device should suspend "immediately", so it may receive a
kick, fetch avail_idx with M pending descriptors, transmit P < M and
then receive the suspend. If we don't want to wait indefinitely, the
device should stop processing so there are still pending requests in
the queue for the destination to send. So the case now is the same as
rx, even if the source device actually receives the kick.

Having said that, I didn't check if any code drains the vhost host
notifier. Or, as mentioned in the meeting, check that HW cannot
reorder kick and suspend call.

> If the device intends to
> purposely ignore (note: this could be a device bug) pending kicks during
> .suspend(), then consequently it should check available descriptors
> after reaching driver_ok to process outstanding descriptors, making up
> for the missing kick. If the vdpa driver doesn't support .suspend(),
> then it should flush the work before .reset() - vhost-scsi does it this
> way.  Or otherwise I think it's the norm (right thing to do) device
> should process pending kicks before guest memory is to be unmapped at
> the late game of vhost_dev_stop(). Is there any case kicks may be missing?
>

So process pending kicks means to drain all tx and rx descriptors?
That would be a solution, but then we don't need virtqueue_state at
all as we might simply recover it from guest's vring avail_idx.

Thanks!

> -Siwei
>
>
> >
> > If we drop it it seems to me we must mandate devices to check for
> > descriptors at queue_enable. The queue could stall if not, isn't it?
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>> Thanks
> >>> Zhu Lingshan
> >>>> Thanks
> >>>>
> >>>>> However, that information is not in the migration stream and it should
> >>>>> be an edge case anyhow, being resilient to parallel notifications from
> >>>>> the guest.
> >>>>>
> >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>> ---
> >>>>>    hw/virtio/vhost-vdpa.c | 5 +++++
> >>>>>    1 file changed, 5 insertions(+)
> >>>>>
> >>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>>> index 40b7e8706a..dff94355dd 100644
> >>>>> --- a/hw/virtio/vhost-vdpa.c
> >>>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>>> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
> >>>>>        }
> >>>>>        trace_vhost_vdpa_set_vring_ready(dev);
> >>>>>        for (i = 0; i < dev->nvqs; ++i) {
> >>>>> +        VirtQueue *vq;
> >>>>>            struct vhost_vring_state state = {
> >>>>>                .index = dev->vq_index + i,
> >>>>>                .num = 1,
> >>>>>            };
> >>>>>            vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> >>>>> +
> >>>>> +        /* Preemptive kick */
> >>>>> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> >>>>> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
> >>>>>        }
> >>>>>        return 0;
> >>>>>    }
> >>>>> --
> >>>>> 2.31.1
> >>>>>
>
Eugenio Perez Martin Feb. 2, 2023, 4:55 p.m. UTC | #8
On Mon, Jan 16, 2023 at 8:02 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2023/1/13 17:06, Eugenio Perez Martin 写道:
> > On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >>>
> >>>
> >>> On 1/13/2023 10:31 AM, Jason Wang wrote:
> >>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>>>> Spuriously kick the destination device's queue so it knows in case there
> >>>>> are new descriptors.
> >>>>>
> >>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
> >>>>> in a virtqueue but not kicked it, so it might be surprised if the device
> >>>>> starts processing it.
> >>>> So I think this is kind of the work of the vDPA parent. For the parent
> >>>> that needs this trick, we should do it in the parent driver.
> >>> Agree, it looks easier implementing this in parent driver,
> >>> I can implement it in ifcvf set_vq_ready right now
> >> Great, but please check whether or not it is really needed.
> >>
> >> Some device implementation could check the available descriptions
> >> after DRIVER_OK without waiting for a kick.
> >>
> > So IIUC we can entirely drop this from the series (and I hope we can).
> > But then, what with the devices that does *not* check for them?
>
>
> It needs mediation in the vDPA parent driver.
>
>
> >
> > If we drop it it seems to me we must mandate devices to check for
> > descriptors at queue_enable. The queue could stall if not, isn't it?
>
>
> I'm not sure, did you see real issue with this? (Note that we don't do
> this for vhost-user-(vDPA))
>

Still unchecked, sorry. But not needing it for vhost-user-vDPA is a
very good signal indeed, thanks for pointing that.

> Btw, the code can result of kick before DRIVER_OK, which seems racy.
>

Good catch :). I'll fix it in the next revision if we see we need it.
I really hope to be able to drop it though.

Thanks!

> Thanks
>
>
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>> Thanks
> >>> Zhu Lingshan
> >>>> Thanks
> >>>>
> >>>>> However, that information is not in the migration stream and it should
> >>>>> be an edge case anyhow, being resilient to parallel notifications from
> >>>>> the guest.
> >>>>>
> >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>> ---
> >>>>>    hw/virtio/vhost-vdpa.c | 5 +++++
> >>>>>    1 file changed, 5 insertions(+)
> >>>>>
> >>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>>> index 40b7e8706a..dff94355dd 100644
> >>>>> --- a/hw/virtio/vhost-vdpa.c
> >>>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>>> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
> >>>>>        }
> >>>>>        trace_vhost_vdpa_set_vring_ready(dev);
> >>>>>        for (i = 0; i < dev->nvqs; ++i) {
> >>>>> +        VirtQueue *vq;
> >>>>>            struct vhost_vring_state state = {
> >>>>>                .index = dev->vq_index + i,
> >>>>>                .num = 1,
> >>>>>            };
> >>>>>            vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> >>>>> +
> >>>>> +        /* Preemptive kick */
> >>>>> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> >>>>> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
> >>>>>        }
> >>>>>        return 0;
> >>>>>    }
> >>>>> --
> >>>>> 2.31.1
> >>>>>
>
Si-Wei Liu Feb. 4, 2023, 11:04 a.m. UTC | #9
On 2/2/2023 8:53 AM, Eugenio Perez Martin wrote:
> On Thu, Feb 2, 2023 at 1:57 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
>>> On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>>>>
>>>>> On 1/13/2023 10:31 AM, Jason Wang wrote:
>>>>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>>>> Spuriously kick the destination device's queue so it knows in case there
>>>>>>> are new descriptors.
>>>>>>>
>>>>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
>>>>>>> in a virtqueue but not kicked it, so it might be surprised if the device
>>>>>>> starts processing it.
>>>>>> So I think this is kind of the work of the vDPA parent. For the parent
>>>>>> that needs this trick, we should do it in the parent driver.
>>>>> Agree, it looks easier implementing this in parent driver,
>>>>> I can implement it in ifcvf set_vq_ready right now
>>>> Great, but please check whether or not it is really needed.
>>>>
>>>> Some device implementation could check the available descriptions
>>>> after DRIVER_OK without waiting for a kick.
>>>>
>>> So IIUC we can entirely drop this from the series (and I hope we can).
>>> But then, what with the devices that does *not* check for them?
>> I wonder how the kick can be missed from the first place. Supposedly the
>> moment when vhost_dev_stop() calls .suspend() into vdpa driver, the
>> vcpus already stopped running (vm_running = false) and all pending kicks
>> are delivered through vhost-vdpa's host notifiers or mapped doorbell
>> already then device won't get new ones.
> I'm thinking now in cases like the net rx queue.
>
> When the guest starts it fills it and kicks the device. Let's say
> avail_idx is 255.
>
> Following the qemu emulated virtio net,
> hw/virtio/virtio.c:virtqueue_split_pop will stash shadow_avail_idx =
> 255, and it will not check it again until it is out of rx descriptors.
>
> Now the NIC fills N < 255 receive buffers, and VMM migrates. Will the
> destination device check rx avail idx even if it has not received any
> kick? (here could be at startup or when it needs to receive a packet).
> - If the answer is yes, and it will be a bug not to check it, then we
> can drop this patch. We're covered even if there is a possibility of
> losing a kick in the source.
So this is not an issue of missing delivery of kicks, but more of how 
device is expected to handle pending kicks during suspend? For network 
device, it's not required to process up to avail_idx during suspend, but 
this doesn't mean it should ignore the kick for new descriptors, or 
instead I would say the device shouldn't specifically rely on kick, 
either at suspend or at startup. If at suspend, the device doesn't 
process up to avail_idx, correspondingly the implementation of it should 
sync the avail_idx in memory at startup. Even if the device 
implementation has to process up to avail_idx at suspend, for 
interoperability (i.e. source device didn't sync at suspend) point of 
view it still needs to check avail_idx at startup (resume) time and go 
on to process any pending request, right? So in any case, it seems to me 
the "implicit" kick at startup is needed for any device implementation 
anyway. I wouldn't say mandatory but that's the way how its supposed to 
work I feel.

> - If the answer is that it is not mandatory, we need to solve it
> somehow. To me, the best way is to spuriously kick as we don't need
> changes in the device, all we need is here. A new feature flag
> _F_CHECK_AVAIL_ON_STARTUP or equivalent would work the same, but I
> think it complicates everything more.
>
> For tx the device should suspend "immediately", so it may receive a
> kick, fetch avail_idx with M pending descriptors, transmit P < M and
> then receive the suspend. If we don't want to wait indefinitely, the
> device should stop processing so there are still pending requests in
> the queue for the destination to send. So the case now is the same as
> rx, even if the source device actually receives the kick.
>
> Having said that, I didn't check if any code drains the vhost host
> notifier. Or, as mentioned in the meeting, check that HW cannot
> reorder kick and suspend call.
Not sure how order matters here, though I thought device suspend/resume 
doesn't tie in with kick order?

>
>> If the device intends to
>> purposely ignore (note: this could be a device bug) pending kicks during
>> .suspend(), then consequently it should check available descriptors
>> after reaching driver_ok to process outstanding descriptors, making up
>> for the missing kick. If the vdpa driver doesn't support .suspend(),
>> then it should flush the work before .reset() - vhost-scsi does it this
>> way.  Or otherwise I think it's the norm (right thing to do) device
>> should process pending kicks before guest memory is to be unmapped at
>> the late game of vhost_dev_stop(). Is there any case kicks may be missing?
>>
> So process pending kicks means to drain all tx and rx descriptors?
No it doesn't have to. What I said is it shouldn't ignore pending kicks 
as if there's no more buffer posted to the device. But really, a fair 
expectation for suspending or starting device is device should do a 
spontaneous sync for getting the latest avail_idx in the memory, which 
does not have to depend on kicks. For network hardware device, I thought 
suspend just needs to wait until the completion of ongoing Tx/Rx DMA 
transaction already in the flight, rather than to drain all the upcoming 
packets until avail_idx.

Regards,
-Siwei
> That would be a solution, but then we don't need virtqueue_state at
> all as we might simply recover it from guest's vring avail_idx.
>
> Thanks!
>
>> -Siwei
>>
>>
>>> If we drop it it seems to me we must mandate devices to check for
>>> descriptors at queue_enable. The queue could stall if not, isn't it?
>>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>> Thanks
>>>>> Zhu Lingshan
>>>>>> Thanks
>>>>>>
>>>>>>> However, that information is not in the migration stream and it should
>>>>>>> be an edge case anyhow, being resilient to parallel notifications from
>>>>>>> the guest.
>>>>>>>
>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>> ---
>>>>>>>     hw/virtio/vhost-vdpa.c | 5 +++++
>>>>>>>     1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>>>> index 40b7e8706a..dff94355dd 100644
>>>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>>>> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
>>>>>>>         }
>>>>>>>         trace_vhost_vdpa_set_vring_ready(dev);
>>>>>>>         for (i = 0; i < dev->nvqs; ++i) {
>>>>>>> +        VirtQueue *vq;
>>>>>>>             struct vhost_vring_state state = {
>>>>>>>                 .index = dev->vq_index + i,
>>>>>>>                 .num = 1,
>>>>>>>             };
>>>>>>>             vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
>>>>>>> +
>>>>>>> +        /* Preemptive kick */
>>>>>>> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
>>>>>>> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
>>>>>>>         }
>>>>>>>         return 0;
>>>>>>>     }
>>>>>>> --
>>>>>>> 2.31.1
>>>>>>>
Michael S. Tsirkin Feb. 5, 2023, 10 a.m. UTC | #10
On Sat, Feb 04, 2023 at 03:04:02AM -0800, Si-Wei Liu wrote:
> For network hardware device, I thought suspend
> just needs to wait until the completion of ongoing Tx/Rx DMA transaction
> already in the flight, rather than to drain all the upcoming packets until
> avail_idx.

It depends I guess but if device expects to recover all state from just
ring state in memory then at least it has to drain until some index
value.
Si-Wei Liu Feb. 6, 2023, 5:08 a.m. UTC | #11
On 2/5/2023 2:00 AM, Michael S. Tsirkin wrote:
> On Sat, Feb 04, 2023 at 03:04:02AM -0800, Si-Wei Liu wrote:
>> For network hardware device, I thought suspend
>> just needs to wait until the completion of ongoing Tx/Rx DMA transaction
>> already in the flight, rather than to drain all the upcoming packets until
>> avail_idx.
> It depends I guess but if device expects to recover all state from just
> ring state in memory then at least it has to drain until some index
> value.
Yes, that's the general requirement for other devices than networking 
device. For e.g., if a storage device had posted request before 
suspending and there's no way to replay those requests from destination, 
it needs to drain until all posted requests are completed. For network 
device, this requirement can be lifted up somehow, as network (Ethernet) 
usually is tolerant to packet drops. Jason and I once had a long 
discussion about the expectation for {get,set}_vq_state() driver API and 
we came to conclusion that this is something networking device can stand 
up to:

https://lore.kernel.org/lkml/b2d18964-8cd6-6bb1-1995-5b966207046d@redhat.com/

-Siwei
diff mbox series

Patch

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 40b7e8706a..dff94355dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -732,11 +732,16 @@  static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
     }
     trace_vhost_vdpa_set_vring_ready(dev);
     for (i = 0; i < dev->nvqs; ++i) {
+        VirtQueue *vq;
         struct vhost_vring_state state = {
             .index = dev->vq_index + i,
             .num = 1,
         };
         vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
+
+        /* Preemptive kick */
+        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
+        event_notifier_set(virtio_queue_get_host_notifier(vq));
     }
     return 0;
 }