Message ID | 20230112172434.760850-13-eperezma@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | Dinamycally switch to vhost shadow virtqueues at vdpa net migration |

On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> Spuriously kick the destination device's queue so it knows in case there
> are new descriptors.
>
> RFC: This is somehow a gray area. The guest may have placed descriptors
> in a virtqueue but not kicked it, so it might be surprised if the device
> starts processing it.

So I think this is kind of the work of the vDPA parent. For the parent
that needs this trick, we should do it in the parent driver.

Thanks

>
> However, that information is not in the migration stream and it should
> be an edge case anyhow, being resilient to parallel notifications from
> the guest.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 40b7e8706a..dff94355dd 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
>      }
>      trace_vhost_vdpa_set_vring_ready(dev);
>      for (i = 0; i < dev->nvqs; ++i) {
> +        VirtQueue *vq;
>          struct vhost_vring_state state = {
>              .index = dev->vq_index + i,
>              .num = 1,
>          };
>          vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> +
> +        /* Preemptive kick */
> +        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
> +        event_notifier_set(virtio_queue_get_host_notifier(vq));
>      }
>      return 0;
>  }
> --
> 2.31.1
>

On 1/13/2023 10:31 AM, Jason Wang wrote:
> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>> Spuriously kick the destination device's queue so it knows in case there
>> are new descriptors.
>>
>> RFC: This is somehow a gray area. The guest may have placed descriptors
>> in a virtqueue but not kicked it, so it might be surprised if the device
>> starts processing it.
> So I think this is kind of the work of the vDPA parent. For the parent
> that needs this trick, we should do it in the parent driver.

Agreed, it looks easier to implement this in the parent driver.
I can implement it in ifcvf set_vq_ready right now.

Thanks
Zhu Lingshan

On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>
> On 1/13/2023 10:31 AM, Jason Wang wrote:
> > On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >> Spuriously kick the destination device's queue so it knows in case there
> >> are new descriptors.
> >>
> >> RFC: This is somehow a gray area. The guest may have placed descriptors
> >> in a virtqueue but not kicked it, so it might be surprised if the device
> >> starts processing it.
> > So I think this is kind of the work of the vDPA parent. For the parent
> > that needs this trick, we should do it in the parent driver.
> Agreed, it looks easier to implement this in the parent driver.
> I can implement it in ifcvf set_vq_ready right now.

Great, but please check whether or not it is really needed.

Some device implementations could check the available descriptors
after DRIVER_OK without waiting for a kick.

Thanks

On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >
> > On 1/13/2023 10:31 AM, Jason Wang wrote:
> > > On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > >> Spuriously kick the destination device's queue so it knows in case there
> > >> are new descriptors.
> > >>
> > >> RFC: This is somehow a gray area. The guest may have placed descriptors
> > >> in a virtqueue but not kicked it, so it might be surprised if the device
> > >> starts processing it.
> > > So I think this is kind of the work of the vDPA parent. For the parent
> > > that needs this trick, we should do it in the parent driver.
> > Agreed, it looks easier to implement this in the parent driver.
> > I can implement it in ifcvf set_vq_ready right now.
>
> Great, but please check whether or not it is really needed.
>
> Some device implementations could check the available descriptors
> after DRIVER_OK without waiting for a kick.
>

So IIUC we can entirely drop this from the series (and I hope we can).
But then, what about the devices that do *not* check for them?

If we drop it, it seems to me we must mandate devices to check for
descriptors at queue_enable. The queue could stall if not, couldn't it?

Thanks!

On 2023/1/13 17:06, Eugenio Perez Martin wrote:
> On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>>
>>> On 1/13/2023 10:31 AM, Jason Wang wrote:
>>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> Spuriously kick the destination device's queue so it knows in case there
>>>>> are new descriptors.
>>>>>
>>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
>>>>> in a virtqueue but not kicked it, so it might be surprised if the device
>>>>> starts processing it.
>>>> So I think this is kind of the work of the vDPA parent. For the parent
>>>> that needs this trick, we should do it in the parent driver.
>>> Agreed, it looks easier to implement this in the parent driver.
>>> I can implement it in ifcvf set_vq_ready right now.
>> Great, but please check whether or not it is really needed.
>>
>> Some device implementations could check the available descriptors
>> after DRIVER_OK without waiting for a kick.
>>
> So IIUC we can entirely drop this from the series (and I hope we can).
> But then, what about the devices that do *not* check for them?

It needs mediation in the vDPA parent driver.

>
> If we drop it, it seems to me we must mandate devices to check for
> descriptors at queue_enable. The queue could stall if not, couldn't it?

I'm not sure, did you see a real issue with this? (Note that we don't do
this for vhost-user-(vDPA))

Btw, the code can result in a kick before DRIVER_OK, which seems racy.

Thanks
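
A minimal sketch of the ordering concern above, assuming a toy device that silently drops any kick arriving before DRIVER_OK is set and assuming the queue is enabled before DRIVER_OK. All `toy_*` names are hypothetical and are not QEMU or kernel APIs; only the value of the DRIVER_OK status bit (0x04) comes from the virtio specification. What a real device does with an early kick is implementation specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DRIVER_OK bit of the virtio device status field. */
#define TOY_STATUS_DRIVER_OK 0x04

/* Hypothetical toy device; none of these names exist in QEMU. */
struct toy_device {
    uint8_t status;      /* virtio device status bits */
    bool queue_enabled;  /* set by the toy "enable vring" step */
    unsigned kicks_seen; /* kicks the device acted on */
    unsigned kicks_lost; /* kicks that arrived too early */
};

/* Assumption of this model: a kick before DRIVER_OK is dropped. */
static void toy_kick(struct toy_device *d)
{
    if (d->queue_enabled && (d->status & TOY_STATUS_DRIVER_OK)) {
        d->kicks_seen++;
    } else {
        d->kicks_lost++;
    }
}

/* Enable the queue, optionally kicking right away (as the patch does). */
static void toy_enable_queue(struct toy_device *d, bool kick_here)
{
    d->queue_enabled = true;
    if (kick_here) {
        toy_kick(d);
    }
}

/* Set DRIVER_OK, optionally kicking only once the device is live. */
static void toy_set_driver_ok(struct toy_device *d, bool kick_here)
{
    d->status |= TOY_STATUS_DRIVER_OK;
    if (kick_here) {
        toy_kick(d);
    }
}

/* Returns how many kicks the device acted on for the chosen ordering. */
static unsigned toy_start(bool kick_at_enable)
{
    struct toy_device d = { 0 };

    toy_enable_queue(&d, kick_at_enable);
    toy_set_driver_ok(&d, !kick_at_enable);
    assert(d.kicks_seen + d.kicks_lost == 1); /* exactly one kick was sent */
    return d.kicks_seen;
}
```

In this model `toy_start(true)` loses the kick while `toy_start(false)` delivers it, which is why a preemptive kick, if kept at all, would have to be issued after DRIVER_OK.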

On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
> On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>>
>>> On 1/13/2023 10:31 AM, Jason Wang wrote:
>>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>> Spuriously kick the destination device's queue so it knows in case there
>>>>> are new descriptors.
>>>>>
>>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
>>>>> in a virtqueue but not kicked it, so it might be surprised if the device
>>>>> starts processing it.
>>>> So I think this is kind of the work of the vDPA parent. For the parent
>>>> that needs this trick, we should do it in the parent driver.
>>> Agreed, it looks easier to implement this in the parent driver.
>>> I can implement it in ifcvf set_vq_ready right now.
>> Great, but please check whether or not it is really needed.
>>
>> Some device implementations could check the available descriptors
>> after DRIVER_OK without waiting for a kick.
>>
> So IIUC we can entirely drop this from the series (and I hope we can).
> But then, what about the devices that do *not* check for them?

I wonder how the kick can be missed in the first place. Supposedly, the
moment vhost_dev_stop() calls .suspend() into the vdpa driver, the
vcpus have already stopped running (vm_running = false) and all pending
kicks have already been delivered through vhost-vdpa's host notifiers or
mapped doorbell, so the device won't get new ones. If the device intends to
purposely ignore (note: this could be a device bug) pending kicks during
.suspend(), then consequently it should check available descriptors
after reaching driver_ok to process outstanding descriptors, making up
for the missing kick. If the vdpa driver doesn't support .suspend(),
then it should flush the work before .reset() - vhost-scsi does it this
way. Otherwise, I think it's the norm (the right thing to do) that the
device should process pending kicks before guest memory is unmapped
late in vhost_dev_stop(). Is there any case where kicks may be missing?

-Siwei

On Thu, Feb 2, 2023 at 1:57 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
> > On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >>>
> >>> On 1/13/2023 10:31 AM, Jason Wang wrote:
> >>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>>>> Spuriously kick the destination device's queue so it knows in case there
> >>>>> are new descriptors.
> >>>>>
> >>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
> >>>>> in a virtqueue but not kicked it, so it might be surprised if the device
> >>>>> starts processing it.
> >>>> So I think this is kind of the work of the vDPA parent. For the parent
> >>>> that needs this trick, we should do it in the parent driver.
> >>> Agreed, it looks easier to implement this in the parent driver.
> >>> I can implement it in ifcvf set_vq_ready right now.
> >> Great, but please check whether or not it is really needed.
> >>
> >> Some device implementations could check the available descriptors
> >> after DRIVER_OK without waiting for a kick.
> >>
> > So IIUC we can entirely drop this from the series (and I hope we can).
> > But then, what about the devices that do *not* check for them?
> I wonder how the kick can be missed in the first place. Supposedly, the
> moment vhost_dev_stop() calls .suspend() into the vdpa driver, the
> vcpus have already stopped running (vm_running = false) and all pending
> kicks have already been delivered through vhost-vdpa's host notifiers or
> mapped doorbell, so the device won't get new ones.

I'm thinking now of cases like the net rx queue.

When the guest starts, it fills it and kicks the device. Let's say
avail_idx is 255.

Following the qemu emulated virtio net,
hw/virtio/virtio.c:virtqueue_split_pop will stash shadow_avail_idx =
255, and it will not check it again until it is out of rx descriptors.

Now the NIC fills N < 255 receive buffers, and the VMM migrates. Will the
destination device check rx avail idx even if it has not received any
kick? (here could be at startup or when it needs to receive a packet).
- If the answer is yes, and it will be a bug not to check it, then we
can drop this patch. We're covered even if there is a possibility of
losing a kick in the source.
- If the answer is that it is not mandatory, we need to solve it
somehow. To me, the best way is to spuriously kick as we don't need
changes in the device, all we need is here. A new feature flag
_F_CHECK_AVAIL_ON_STARTUP or equivalent would work the same, but I
think it complicates everything more.

For tx the device should suspend "immediately", so it may receive a
kick, fetch avail_idx with M pending descriptors, transmit P < M and
then receive the suspend. If we don't want to wait indefinitely, the
device should stop processing so there are still pending requests in
the queue for the destination to send. So the case now is the same as
rx, even if the source device actually receives the kick.

Having said that, I didn't check if any code drains the vhost host
notifier. Or, as mentioned in the meeting, check that HW cannot
reorder kick and suspend call.

> If the device intends to
> purposely ignore (note: this could be a device bug) pending kicks during
> .suspend(), then consequently it should check available descriptors
> after reaching driver_ok to process outstanding descriptors, making up
> for the missing kick. If the vdpa driver doesn't support .suspend(),
> then it should flush the work before .reset() - vhost-scsi does it this
> way. Otherwise, I think it's the norm (the right thing to do) that the
> device should process pending kicks before guest memory is unmapped
> late in vhost_dev_stop(). Is there any case where kicks may be missing?
>

So processing pending kicks means draining all tx and rx descriptors?
That would be a solution, but then we don't need virtqueue_state at
all as we might simply recover it from guest's vring avail_idx.

Thanks!
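
The rx scenario above can be sketched as a toy model. This is a minimal sketch, not QEMU code: every `toy_*` name is hypothetical, and it assumes that only `last_avail_idx` reaches the destination and that the destination's cached avail index starts out equal to it. It shows the device seeing no buffers unless it either re-reads `avail_idx` on its own at startup or receives a (possibly spurious) kick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a split virtqueue device; not QEMU code. */
struct toy_vq_dev {
    uint16_t last_avail_idx;   /* next avail entry the device will consume */
    uint16_t shadow_avail_idx; /* device's cached copy of the guest's avail_idx */
};

/* What a kick (or a spontaneous sync) does: re-read avail_idx from the ring. */
static void toy_sync_avail(struct toy_vq_dev *d, uint16_t guest_avail_idx)
{
    d->shadow_avail_idx = guest_avail_idx;
}

/* Buffers the device believes it may use; uint16_t arithmetic wraps like the ring. */
static uint16_t toy_known_buffers(const struct toy_vq_dev *d)
{
    return (uint16_t)(d->shadow_avail_idx - d->last_avail_idx);
}

/*
 * The guest posted 255 rx buffers and kicked once; the source used n_used
 * of them before migration. Returns how many rx buffers the destination
 * device knows about after it starts.
 */
static uint16_t toy_rx_after_migration(uint16_t n_used, bool sync_on_start,
                                       bool spurious_kick)
{
    const uint16_t guest_avail_idx = 255;
    struct toy_vq_dev src = { 0, 0 };
    struct toy_vq_dev dst;

    assert(n_used <= guest_avail_idx);
    toy_sync_avail(&src, guest_avail_idx);     /* the guest's only kick */
    src.last_avail_idx = n_used;               /* NIC filled n_used buffers */

    dst.last_avail_idx = src.last_avail_idx;   /* migrated vq state */
    dst.shadow_avail_idx = dst.last_avail_idx; /* assumption: cache starts empty */
    if (sync_on_start || spurious_kick) {
        toy_sync_avail(&dst, guest_avail_idx);
    }
    return toy_known_buffers(&dst);
}
```

With `n_used = 100`, the destination knows about 155 buffers if it syncs at startup or is kicked, and about none otherwise. In this model the spurious kick and a device-side sync at startup are interchangeable, which is the choice the two bullet points describe.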

On Mon, Jan 16, 2023 at 8:02 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On 2023/1/13 17:06, Eugenio Perez Martin wrote:
> > On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >>>
> >>> On 1/13/2023 10:31 AM, Jason Wang wrote:
> >>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>>>> Spuriously kick the destination device's queue so it knows in case there
> >>>>> are new descriptors.
> >>>>>
> >>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
> >>>>> in a virtqueue but not kicked it, so it might be surprised if the device
> >>>>> starts processing it.
> >>>> So I think this is kind of the work of the vDPA parent. For the parent
> >>>> that needs this trick, we should do it in the parent driver.
> >>> Agreed, it looks easier to implement this in the parent driver.
> >>> I can implement it in ifcvf set_vq_ready right now.
> >> Great, but please check whether or not it is really needed.
> >>
> >> Some device implementations could check the available descriptors
> >> after DRIVER_OK without waiting for a kick.
> >>
> > So IIUC we can entirely drop this from the series (and I hope we can).
> > But then, what about the devices that do *not* check for them?
>
> It needs mediation in the vDPA parent driver.
>
> >
> > If we drop it, it seems to me we must mandate devices to check for
> > descriptors at queue_enable. The queue could stall if not, couldn't it?
>
> I'm not sure, did you see a real issue with this? (Note that we don't do
> this for vhost-user-(vDPA))
>

Still unchecked, sorry. But not needing it for vhost-user-vDPA is a
very good signal indeed, thanks for pointing that out.

> Btw, the code can result in a kick before DRIVER_OK, which seems racy.
>

Good catch :). I'll fix it in the next revision if we see we need it.
I really hope to be able to drop it though.

Thanks!

On 2/2/2023 8:53 AM, Eugenio Perez Martin wrote:
> On Thu, Feb 2, 2023 at 1:57 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>> On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
>>> On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>>>>
>>>>> On 1/13/2023 10:31 AM, Jason Wang wrote:
>>>>>> On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
>>>>>>> Spuriously kick the destination device's queue so it knows in case there
>>>>>>> are new descriptors.
>>>>>>>
>>>>>>> RFC: This is somehow a gray area. The guest may have placed descriptors
>>>>>>> in a virtqueue but not kicked it, so it might be surprised if the device
>>>>>>> starts processing it.
>>>>>> So I think this is kind of the work of the vDPA parent. For the parent
>>>>>> that needs this trick, we should do it in the parent driver.
>>>>> Agreed, it looks easier to implement this in the parent driver.
>>>>> I can implement it in ifcvf set_vq_ready right now.
>>>> Great, but please check whether or not it is really needed.
>>>>
>>>> Some device implementations could check the available descriptors
>>>> after DRIVER_OK without waiting for a kick.
>>>>
>>> So IIUC we can entirely drop this from the series (and I hope we can).
>>> But then, what about the devices that do *not* check for them?
>> I wonder how the kick can be missed in the first place. Supposedly, the
>> moment vhost_dev_stop() calls .suspend() into the vdpa driver, the
>> vcpus have already stopped running (vm_running = false) and all pending
>> kicks have already been delivered through vhost-vdpa's host notifiers or
>> mapped doorbell, so the device won't get new ones.
> I'm thinking now of cases like the net rx queue.
>
> When the guest starts, it fills it and kicks the device. Let's say
> avail_idx is 255.
>
> Following the qemu emulated virtio net,
> hw/virtio/virtio.c:virtqueue_split_pop will stash shadow_avail_idx =
> 255, and it will not check it again until it is out of rx descriptors.
>
> Now the NIC fills N < 255 receive buffers, and the VMM migrates. Will the
> destination device check rx avail idx even if it has not received any
> kick? (here could be at startup or when it needs to receive a packet).
> - If the answer is yes, and it will be a bug not to check it, then we
> can drop this patch. We're covered even if there is a possibility of
> losing a kick in the source.

So this is not an issue of missing delivery of kicks, but more of how
the device is expected to handle pending kicks during suspend? For a
network device, it's not required to process up to avail_idx during
suspend, but this doesn't mean it should ignore the kick for new
descriptors; rather, I would say the device shouldn't specifically rely
on a kick, either at suspend or at startup. If at suspend the device
doesn't process up to avail_idx, correspondingly its implementation
should sync the avail_idx in memory at startup. Even if the device
implementation has to process up to avail_idx at suspend, from an
interoperability point of view (i.e. the source device didn't sync at
suspend) it still needs to check avail_idx at startup (resume) time and
go on to process any pending request, right? So in any case, it seems to
me the "implicit" kick at startup is needed for any device
implementation anyway. I wouldn't say mandatory, but that's how it's
supposed to work, I feel.

> - If the answer is that it is not mandatory, we need to solve it
> somehow. To me, the best way is to spuriously kick as we don't need
> changes in the device, all we need is here. A new feature flag
> _F_CHECK_AVAIL_ON_STARTUP or equivalent would work the same, but I
> think it complicates everything more.
>
> For tx the device should suspend "immediately", so it may receive a
> kick, fetch avail_idx with M pending descriptors, transmit P < M and
> then receive the suspend. If we don't want to wait indefinitely, the
> device should stop processing so there are still pending requests in
> the queue for the destination to send. So the case now is the same as
> rx, even if the source device actually receives the kick.
>
> Having said that, I didn't check if any code drains the vhost host
> notifier. Or, as mentioned in the meeting, check that HW cannot
> reorder kick and suspend call.

Not sure how order matters here, though I thought device suspend/resume
doesn't tie in with kick order?

>
>> If the device intends to
>> purposely ignore (note: this could be a device bug) pending kicks during
>> .suspend(), then consequently it should check available descriptors
>> after reaching driver_ok to process outstanding descriptors, making up
>> for the missing kick. If the vdpa driver doesn't support .suspend(),
>> then it should flush the work before .reset() - vhost-scsi does it this
>> way. Otherwise, I think it's the norm (the right thing to do) that the
>> device should process pending kicks before guest memory is unmapped
>> late in vhost_dev_stop(). Is there any case where kicks may be missing?
>>
> So processing pending kicks means draining all tx and rx descriptors?

No, it doesn't have to. What I said is it shouldn't ignore pending kicks
as if there's no more buffer posted to the device. But really, a fair
expectation for suspending or starting a device is that the device
should do a spontaneous sync to get the latest avail_idx in memory,
which does not have to depend on kicks. For a network hardware device, I
thought suspend just needs to wait until the completion of ongoing Tx/Rx
DMA transactions already in flight, rather than to drain all the
upcoming packets until avail_idx.

Regards,
-Siwei

On Sat, Feb 04, 2023 at 03:04:02AM -0800, Si-Wei Liu wrote:
> For a network hardware device, I thought suspend just needs to wait
> until the completion of ongoing Tx/Rx DMA transactions already in
> flight, rather than to drain all the upcoming packets until avail_idx.

It depends, I guess, but if the device expects to recover all state from
just the ring state in memory, then at least it has to drain until some
index value.

On 2/5/2023 2:00 AM, Michael S. Tsirkin wrote:
> On Sat, Feb 04, 2023 at 03:04:02AM -0800, Si-Wei Liu wrote:
>> For a network hardware device, I thought suspend just needs to wait
>> until the completion of ongoing Tx/Rx DMA transactions already in
>> flight, rather than to drain all the upcoming packets until avail_idx.
> It depends, I guess, but if the device expects to recover all state from
> just the ring state in memory, then at least it has to drain until some
> index value.

Yes, that's the general requirement for devices other than networking
devices. E.g., if a storage device had posted requests before
suspending and there's no way to replay those requests from the
destination, it needs to drain until all posted requests are completed.
For a network device, this requirement can be relaxed somewhat, as the
network (Ethernet) is usually tolerant of packet drops. Jason and I once
had a long discussion about the expectation for the {get,set}_vq_state()
driver API and we came to the conclusion that this is something a
networking device can tolerate:

https://lore.kernel.org/lkml/b2d18964-8cd6-6bb1-1995-5b966207046d@redhat.com/

-Siwei
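
The two suspend policies discussed above can be contrasted with a small model. This is a sketch under stated assumptions, not any driver's implementation: the `toy_*` names are hypothetical, a tx queue is reduced to three indices, and the "resume index" stands in for the state a {get,set}_vq_state()-style call would carry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy tx queue; names are illustrative only. */
struct toy_txq {
    uint16_t avail_idx;   /* descriptors the guest has posted */
    uint16_t fetched_idx; /* descriptors the device has started DMA for */
    uint16_t used_idx;    /* descriptors completed and returned */
};

/* Finish only what is already in flight, then report the resume index. */
static uint16_t toy_suspend_finish_inflight(struct toy_txq *q)
{
    q->used_idx = q->fetched_idx;
    return q->fetched_idx;
}

/* Process everything the guest posted before reporting the resume index. */
static uint16_t toy_suspend_drain(struct toy_txq *q)
{
    q->fetched_idx = q->avail_idx;
    q->used_idx = q->avail_idx;
    return q->avail_idx;
}

/* Descriptors left for the destination: m posted, p of them fetched at suspend. */
static uint16_t toy_left_for_destination(uint16_t m, uint16_t p, bool drain)
{
    struct toy_txq q = { m, p, 0 };
    uint16_t resume_idx;

    assert(p <= m);
    resume_idx = drain ? toy_suspend_drain(&q)
                       : toy_suspend_finish_inflight(&q);
    assert(q.used_idx == resume_idx); /* nothing is left half-done */
    return (uint16_t)(q.avail_idx - resume_idx);
}
```

With M = 10 posted and P = 4 fetched, finishing only the in-flight work leaves 6 descriptors for the destination and the resume index has to be migrated. Draining leaves 0, and the resume index equals `avail_idx`, so it could be recovered from the ring in guest memory alone.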

Spuriously kick the destination device's queue so it knows in case there
are new descriptors.

RFC: This is somehow a gray area. The guest may have placed descriptors
in a virtqueue but not kicked it, so it might be surprised if the device
starts processing it.

However, that information is not in the migration stream and it should
be an edge case anyhow, being resilient to parallel notifications from
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 40b7e8706a..dff94355dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, int ready)
     }
     trace_vhost_vdpa_set_vring_ready(dev);
     for (i = 0; i < dev->nvqs; ++i) {
+        VirtQueue *vq;
         struct vhost_vring_state state = {
             .index = dev->vq_index + i,
             .num = 1,
         };
         vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
+
+        /* Preemptive kick */
+        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
+        event_notifier_set(virtio_queue_get_host_notifier(vq));
     }
     return 0;
 }