Message ID | 20230710165333.17506-1-lingshan.zhu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V2] vhost_vdpa: no need to fetch vring base when poweroff | expand |
On Mon, Jul 10, 2023 at 10:54 AM Zhu Lingshan <lingshan.zhu@intel.com> wrote: > > In the poweroff routine, no need to fetch last available index. > > This commit also provides a better debug message in the vhost > caller vhost_virtqueue_stop, because if vhost does not fetch > the last avail idx successfully, maybe the device does not > suspend, vhost will sync last avail idx to vring used idx as a > work around, not a failure. > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Thanks! > --- > hw/virtio/vhost-vdpa.c | 10 ++++++++++ > hw/virtio/vhost.c | 2 +- > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index 3c575a9a6e..10b445f64e 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -26,6 +26,7 @@ > #include "cpu.h" > #include "trace.h" > #include "qapi/error.h" > +#include "sysemu/runstate.h" > > /* > * Return one past the end of the end of section. Be careful with uint64_t > @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev, > struct vhost_vdpa *v = dev->opaque; > int ret; > > + if (runstate_check(RUN_STATE_SHUTDOWN)) { > + /* > + * Some devices do not support this call properly, > + * and we don't need to retrieve the indexes > + * if it is shutting down > + */ > + return 0; > + } > + > if (v->shadow_vqs_enabled) { > ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index); > return 0; > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 82394331bf..7dd90cff3a 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev, > > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > if (r < 0) { > - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); > + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ %u", idx); > /* Connection to the backend is broken, so let's sync internal > * last avail idx to the device used idx. > */ > -- > 2.39.3 >
On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote: > > In the poweroff routine, no need to fetch last available index. > This is because there's no concept of shutdown in the vhost layer, it only knows start and stop. > This commit also provides a better debug message in the vhost > caller vhost_virtqueue_stop, A separate patch is better. > because if vhost does not fetch > the last avail idx successfully, maybe the device does not > suspend, vhost will sync last avail idx to vring used idx as a > work around, not a failure. This only happens if we return a negative value? > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > --- > hw/virtio/vhost-vdpa.c | 10 ++++++++++ > hw/virtio/vhost.c | 2 +- > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > index 3c575a9a6e..10b445f64e 100644 > --- a/hw/virtio/vhost-vdpa.c > +++ b/hw/virtio/vhost-vdpa.c > @@ -26,6 +26,7 @@ > #include "cpu.h" > #include "trace.h" > #include "qapi/error.h" > +#include "sysemu/runstate.h" > > /* > * Return one past the end of the end of section. Be careful with uint64_t > @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev, > struct vhost_vdpa *v = dev->opaque; > int ret; > > + if (runstate_check(RUN_STATE_SHUTDOWN)) { > + /* > + * Some devices do not support this call properly, > + * and we don't need to retrieve the indexes > + * if it is shutting down > + */ > + return 0; Checking runstate in the vhost code seems like a layer violation. What happens without this patch? Thanks > + } > + > if (v->shadow_vqs_enabled) { > ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index); > return 0; > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 82394331bf..7dd90cff3a 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev, > > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > if (r < 0) { > - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); > + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ %u", idx); > /* Connection to the backend is broken, so let's sync internal > * last avail idx to the device used idx. > */ > -- > 2.39.3 >
On 7/11/2023 10:50 AM, Jason Wang wrote: > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote: >> In the poweroff routine, no need to fetch last available index. >> > This is because there's no concept of shutdown in the vhost layer, it > only knows start and stop. > >> This commit also provides a better debug message in the vhost >> caller vhost_virtqueue_stop, > A separate patch is better. OK > >> because if vhost does not fetch >> the last avail idx successfully, maybe the device does not >> suspend, vhost will sync last avail idx to vring used idx as a >> work around, not a failure. > This only happens if we return a negative value? Yes > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> >> --- >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ >> hw/virtio/vhost.c | 2 +- >> 2 files changed, 11 insertions(+), 1 deletion(-) >> >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c >> index 3c575a9a6e..10b445f64e 100644 >> --- a/hw/virtio/vhost-vdpa.c >> +++ b/hw/virtio/vhost-vdpa.c >> @@ -26,6 +26,7 @@ >> #include "cpu.h" >> #include "trace.h" >> #include "qapi/error.h" >> +#include "sysemu/runstate.h" >> >> /* >> * Return one past the end of the end of section. Be careful with uint64_t >> @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev, >> struct vhost_vdpa *v = dev->opaque; >> int ret; >> >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { >> + /* >> + * Some devices do not support this call properly, >> + * and we don't need to retrieve the indexes >> + * if it is shutting down >> + */ >> + return 0; > Checking runstate in the vhost code seems like a layer violation. > > What happens without this patch? vhost tries to fetch vring base, vhost_vdpa needs suspend the device before retrieving last_avail_idx. However not all devices can support .suspend properly so this call may fail. Then vhost will print an error shows something failed. The error msg is confused, as stated in the commit log, restoring last_avail_idx with guest used idx is a workaround rather than a failure. And no needs to fetch last_avail_idx when power off. Thanks > > Thanks > >> + } >> + >> if (v->shadow_vqs_enabled) { >> ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index); >> return 0; >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c >> index 82394331bf..7dd90cff3a 100644 >> --- a/hw/virtio/vhost.c >> +++ b/hw/virtio/vhost.c >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev, >> >> r = dev->vhost_ops->vhost_get_vring_base(dev, &state); >> if (r < 0) { >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); >> + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ %u", idx); >> /* Connection to the backend is broken, so let's sync internal >> * last avail idx to the device used idx. >> */ >> -- >> 2.39.3 >>
On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote: > > > > On 7/11/2023 10:50 AM, Jason Wang wrote: > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote: > >> In the poweroff routine, no need to fetch last available index. > >> > > This is because there's no concept of shutdown in the vhost layer, it > > only knows start and stop. > > > >> This commit also provides a better debug message in the vhost > >> caller vhost_virtqueue_stop, > > A separate patch is better. > OK > > > >> because if vhost does not fetch > >> the last avail idx successfully, maybe the device does not > >> suspend, vhost will sync last avail idx to vring used idx as a > >> work around, not a failure. > > This only happens if we return a negative value? > Yes > > > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > >> --- > >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ > >> hw/virtio/vhost.c | 2 +- > >> 2 files changed, 11 insertions(+), 1 deletion(-) > >> > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > >> index 3c575a9a6e..10b445f64e 100644 > >> --- a/hw/virtio/vhost-vdpa.c > >> +++ b/hw/virtio/vhost-vdpa.c > >> @@ -26,6 +26,7 @@ > >> #include "cpu.h" > >> #include "trace.h" > >> #include "qapi/error.h" > >> +#include "sysemu/runstate.h" > >> > >> /* > >> * Return one past the end of the end of section. Be careful with uint64_t > >> @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev, > >> struct vhost_vdpa *v = dev->opaque; > >> int ret; > >> > >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { > >> + /* > >> + * Some devices do not support this call properly, > >> + * and we don't need to retrieve the indexes > >> + * if it is shutting down > >> + */ > >> + return 0; > > Checking runstate in the vhost code seems like a layer violation. > > > > What happens without this patch? > vhost tries to fetch vring base, > vhost_vdpa needs suspend the device before retrieving last_avail_idx. > However not all devices can support .suspend properly so this call > may fail. I think this is where I'm lost. If the device doesn't support suspending, any reason we only try to fix the case of shutdown? Btw, the fail is intended: if (!v->suspended) { /* * Cannot trust in value returned by device, let vhost recover used * idx from guest. */ return -1; } And if we return to success here, will we go to set an uninitialized last avail idx? r = dev->vhost_ops->vhost_get_vring_base(dev, &state); if (r < 0) { ... }.else { virtio_queue_set_last_avail_idx(vdev, idx, state.num); } Thanks > Then vhost will print an error shows something failed. > > The error msg is confused, as stated in the commit log, restoring > last_avail_idx with guest used idx > is a workaround rather than a failure. And no needs to fetch last_avail_idx > when power off. > > Thanks > > > > Thanks > > > >> + } > >> + > >> if (v->shadow_vqs_enabled) { > >> ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index); > >> return 0; > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > >> index 82394331bf..7dd90cff3a 100644 > >> --- a/hw/virtio/vhost.c > >> +++ b/hw/virtio/vhost.c > >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev, > >> > >> r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > >> if (r < 0) { > >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); > >> + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ %u", idx); > >> /* Connection to the backend is broken, so let's sync internal > >> * last avail idx to the device used idx. > >> */ > >> -- > >> 2.39.3 > >> >
On Tue, Jul 11, 2023 at 9:05 AM Jason Wang <jasowang@redhat.com> wrote: > > On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote: > > > > > > > > On 7/11/2023 10:50 AM, Jason Wang wrote: > > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan <lingshan.zhu@intel.com> wrote: > > >> In the poweroff routine, no need to fetch last available index. > > >> > > > This is because there's no concept of shutdown in the vhost layer, it > > > only knows start and stop. > > > > > >> This commit also provides a better debug message in the vhost > > >> caller vhost_virtqueue_stop, > > > A separate patch is better. > > OK > > > > > >> because if vhost does not fetch > > >> the last avail idx successfully, maybe the device does not > > >> suspend, vhost will sync last avail idx to vring used idx as a > > >> work around, not a failure. > > > This only happens if we return a negative value? > > Yes > > > > > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > > >> --- > > >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ > > >> hw/virtio/vhost.c | 2 +- > > >> 2 files changed, 11 insertions(+), 1 deletion(-) > > >> > > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > >> index 3c575a9a6e..10b445f64e 100644 > > >> --- a/hw/virtio/vhost-vdpa.c > > >> +++ b/hw/virtio/vhost-vdpa.c > > >> @@ -26,6 +26,7 @@ > > >> #include "cpu.h" > > >> #include "trace.h" > > >> #include "qapi/error.h" > > >> +#include "sysemu/runstate.h" > > >> > > >> /* > > >> * Return one past the end of the end of section. Be careful with uint64_t > > >> @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev, > > >> struct vhost_vdpa *v = dev->opaque; > > >> int ret; > > >> > > >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { > > >> + /* > > >> + * Some devices do not support this call properly, > > >> + * and we don't need to retrieve the indexes > > >> + * if it is shutting down > > >> + */ > > >> + return 0; > > > Checking runstate in the vhost code seems like a layer violation. > > > > > > What happens without this patch? > > vhost tries to fetch vring base, > > vhost_vdpa needs suspend the device before retrieving last_avail_idx. > > However not all devices can support .suspend properly so this call > > may fail. > > I think this is where I'm lost. If the device doesn't support > suspending, any reason we only try to fix the case of shutdown? > > Btw, the fail is intended: > > if (!v->suspended) { > /* > * Cannot trust in value returned by device, let vhost recover used > * idx from guest. > */ > return -1; > } > The fail is intended, but to recover the last used idx, either from device or from the guest, is only useful in the case of migration. I think the main problem is the error message, actually. But I think it has no use either to recover last_avail_idx or print a debug message if we're not migrating. Another solution would be to recover it from the guest at vhost_vdpa_get_vring_base, but I don't like the duplication. > And if we return to success here, will we go to set an uninitialized > last avail idx? > It will be either the default value (is set to 0 at __virtio_queue_reset) or the one received from a migration (at virtio_load). Thanks! > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > if (r < 0) { > ... > }.else { > virtio_queue_set_last_avail_idx(vdev, idx, state.num); > } > > Thanks > > > Then vhost will print an error shows something failed. > > > > The error msg is confused, as stated in the commit log, restoring > > last_avail_idx with guest used idx > > is a workaround rather than a failure. And no needs to fetch last_avail_idx > > when power off. > > > > Thanks > > > > > > Thanks > > > > > >> + } > > >> + > > >> if (v->shadow_vqs_enabled) { > > >> ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index); > > >> return 0; > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > >> index 82394331bf..7dd90cff3a 100644 > > >> --- a/hw/virtio/vhost.c > > >> +++ b/hw/virtio/vhost.c > > >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev, > > >> > > >> r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > > >> if (r < 0) { > > >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); > > >> + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ %u", idx); > > >> /* Connection to the backend is broken, so let's sync internal > > >> * last avail idx to the device used idx. > > >> */ > > >> -- > > >> 2.39.3 > > >> > > >
On Tue, Jul 11, 2023 at 3:25 PM Eugenio Perez Martin <eperezma@redhat.com> wrote: > On Tue, Jul 11, 2023 at 9:05 AM Jason Wang <jasowang@redhat.com> wrote: > > > > On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan <lingshan.zhu@intel.com> > wrote: > > > > > > > > > > > > On 7/11/2023 10:50 AM, Jason Wang wrote: > > > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan <lingshan.zhu@intel.com> > wrote: > > > >> In the poweroff routine, no need to fetch last available index. > > > >> > > > > This is because there's no concept of shutdown in the vhost layer, it > > > > only knows start and stop. > > > > > > > >> This commit also provides a better debug message in the vhost > > > >> caller vhost_virtqueue_stop, > > > > A separate patch is better. > > > OK > > > > > > > >> because if vhost does not fetch > > > >> the last avail idx successfully, maybe the device does not > > > >> suspend, vhost will sync last avail idx to vring used idx as a > > > >> work around, not a failure. > > > > This only happens if we return a negative value? > > > Yes > > > > > > > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > > > >> --- > > > >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ > > > >> hw/virtio/vhost.c | 2 +- > > > >> 2 files changed, 11 insertions(+), 1 deletion(-) > > > >> > > > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > > >> index 3c575a9a6e..10b445f64e 100644 > > > >> --- a/hw/virtio/vhost-vdpa.c > > > >> +++ b/hw/virtio/vhost-vdpa.c > > > >> @@ -26,6 +26,7 @@ > > > >> #include "cpu.h" > > > >> #include "trace.h" > > > >> #include "qapi/error.h" > > > >> +#include "sysemu/runstate.h" > > > >> > > > >> /* > > > >> * Return one past the end of the end of section. Be careful with > uint64_t > > > >> @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct > vhost_dev *dev, > > > >> struct vhost_vdpa *v = dev->opaque; > > > >> int ret; > > > >> > > > >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { > > > >> + /* > > > >> + * Some devices do not support this call properly, > > > >> + * and we don't need to retrieve the indexes > > > >> + * if it is shutting down > > > >> + */ > > > >> + return 0; > > > > Checking runstate in the vhost code seems like a layer violation. > > > > > > > > What happens without this patch? > > > vhost tries to fetch vring base, > > > vhost_vdpa needs suspend the device before retrieving last_avail_idx. > > > However not all devices can support .suspend properly so this call > > > may fail. > > > > I think this is where I'm lost. If the device doesn't support > > suspending, any reason we only try to fix the case of shutdown? > > > > Btw, the fail is intended: > > > > if (!v->suspended) { > > /* > > * Cannot trust in value returned by device, let vhost recover > used > > * idx from guest. > > */ > > return -1; > > } > > > > The fail is intended, but to recover the last used idx, either from > device or from the guest, is only useful in the case of migration. > Note that we had userspace devices like VDUSE now, so it could be useful in the case of a VDUSE daemon crash or reconnect. > I think the main problem is the error message, actually. But I think > it has no use either to recover last_avail_idx or print a debug > message if we're not migrating. Another solution would be to recover > it from the guest at vhost_vdpa_get_vring_base, but I don't like the > duplication. > > > And if we return to success here, will we go to set an uninitialized > > last avail idx? > > > > It will be either the default value (is set to 0 at > __virtio_queue_reset) or the one received from a migration (at > virtio_load). > 0 is even sub-optimal than the index used. Anyhow, VHOST_DEBUG should not be enabled for production environments. Thanks > > Thanks! > > > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > > if (r < 0) { > > ... > > }.else { > > virtio_queue_set_last_avail_idx(vdev, idx, state.num); > > } > > > > Thanks > > > > > Then vhost will print an error shows something failed. > > > > > > The error msg is confused, as stated in the commit log, restoring > > > last_avail_idx with guest used idx > > > is a workaround rather than a failure. And no needs to fetch > last_avail_idx > > > when power off. > > > > > > Thanks > > > > > > > > Thanks > > > > > > > >> + } > > > >> + > > > >> if (v->shadow_vqs_enabled) { > > > >> ring->num = virtio_queue_get_last_avail_idx(dev->vdev, > ring->index); > > > >> return 0; > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > > >> index 82394331bf..7dd90cff3a 100644 > > > >> --- a/hw/virtio/vhost.c > > > >> +++ b/hw/virtio/vhost.c > > > >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev > *dev, > > > >> > > > >> r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > > > >> if (r < 0) { > > > >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", > idx, r); > > > >> + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used > idx for vhost VQ %u", idx); > > > >> /* Connection to the backend is broken, so let's sync > internal > > > >> * last avail idx to the device used idx. > > > >> */ > > > >> -- > > > >> 2.39.3 > > > >> > > > > > > >
On 7/11/2023 3:34 PM, Jason Wang wrote: > > > On Tue, Jul 11, 2023 at 3:25 PM Eugenio Perez Martin > <eperezma@redhat.com> wrote: > > On Tue, Jul 11, 2023 at 9:05 AM Jason Wang <jasowang@redhat.com> > wrote: > > > > On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan > <lingshan.zhu@intel.com> wrote: > > > > > > > > > > > > On 7/11/2023 10:50 AM, Jason Wang wrote: > > > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan > <lingshan.zhu@intel.com> wrote: > > > >> In the poweroff routine, no need to fetch last available index. > > > >> > > > > This is because there's no concept of shutdown in the vhost > layer, it > > > > only knows start and stop. > > > > > > > >> This commit also provides a better debug message in the vhost > > > >> caller vhost_virtqueue_stop, > > > > A separate patch is better. > > > OK > > > > > > > >> because if vhost does not fetch > > > >> the last avail idx successfully, maybe the device does not > > > >> suspend, vhost will sync last avail idx to vring used idx as a > > > >> work around, not a failure. > > > > This only happens if we return a negative value? > > > Yes > > > > > > > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> > > > >> --- > > > >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ > > > >> hw/virtio/vhost.c | 2 +- > > > >> 2 files changed, 11 insertions(+), 1 deletion(-) > > > >> > > > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > > >> index 3c575a9a6e..10b445f64e 100644 > > > >> --- a/hw/virtio/vhost-vdpa.c > > > >> +++ b/hw/virtio/vhost-vdpa.c > > > >> @@ -26,6 +26,7 @@ > > > >> #include "cpu.h" > > > >> #include "trace.h" > > > >> #include "qapi/error.h" > > > >> +#include "sysemu/runstate.h" > > > >> > > > >> /* > > > >> * Return one past the end of the end of section. Be > careful with uint64_t > > > >> @@ -1391,6 +1392,15 @@ static int > vhost_vdpa_get_vring_base(struct vhost_dev *dev, > > > >> struct vhost_vdpa *v = dev->opaque; > > > >> int ret; > > > >> > > > >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { > > > >> + /* > > > >> + * Some devices do not support this call properly, > > > >> + * and we don't need to retrieve the indexes > > > >> + * if it is shutting down > > > >> + */ > > > >> + return 0; > > > > Checking runstate in the vhost code seems like a layer > violation. > > > > > > > > What happens without this patch? > > > vhost tries to fetch vring base, > > > vhost_vdpa needs suspend the device before retrieving > last_avail_idx. > > > However not all devices can support .suspend properly so this call > > > may fail. > > > > I think this is where I'm lost. If the device doesn't support > > suspending, any reason we only try to fix the case of shutdown? > > > > Btw, the fail is intended: > > > > if (!v->suspended) { > > /* > > * Cannot trust in value returned by device, let vhost > recover used > > * idx from guest. > > */ > > return -1; > > } > > > > The fail is intended, but to recover the last used idx, either from > device or from the guest, is only useful in the case of migration. > > > Note that we had userspace devices like VDUSE now, so it could be > useful in the case of a VDUSE daemon crash or reconnect. This code blcok is for vhost_vdpa backend, and I think vduse is another code path. Return a guest used idx may be a good idea but as Eugenio pointed out that may duplicate the code. > > > I think the main problem is the error message, actually. But I think > it has no use either to recover last_avail_idx or print a debug > message if we're not migrating. Another solution would be to recover > it from the guest at vhost_vdpa_get_vring_base, but I don't like the > duplication. > > > And if we return to success here, will we go to set an uninitialized > > last avail idx? > > > > It will be either the default value (is set to 0 at > __virtio_queue_reset) or the one received from a migration (at > virtio_load). > > > 0 is even sub-optimal than the index used. Anyhow, VHOST_DEBUG should > not be enabled for production environments. Returning 0 sounds like a queue reset, yes we can reset a queue if failed to suspend it, I am not sure whther 0 is better than guest used idx. I think we are not able to disable VHOST_DEBUG because customers can build QEMU by their own. Thanks > > Thanks > > > Thanks! > > > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > > if (r < 0) { > > ... > > }.else { > > virtio_queue_set_last_avail_idx(vdev, idx, state.num); > > } > > > > Thanks > > > > > Then vhost will print an error shows something failed. > > > > > > The error msg is confused, as stated in the commit log, restoring > > > last_avail_idx with guest used idx > > > is a workaround rather than a failure. And no needs to fetch > last_avail_idx > > > when power off. > > > > > > Thanks > > > > > > > > Thanks > > > > > > > >> + } > > > >> + > > > >> if (v->shadow_vqs_enabled) { > > > >> ring->num = > virtio_queue_get_last_avail_idx(dev->vdev, ring->index); > > > >> return 0; > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > > >> index 82394331bf..7dd90cff3a 100644 > > > >> --- a/hw/virtio/vhost.c > > > >> +++ b/hw/virtio/vhost.c > > > >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct > vhost_dev *dev, > > > >> > > > >> r = dev->vhost_ops->vhost_get_vring_base(dev, &state); > > > >> if (r < 0) { > > > >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore > failed: %d", idx, r); > > > >> + VHOST_OPS_DEBUG(r, "sync last avail idx to the > guest used idx for vhost VQ %u", idx); > > > >> /* Connection to the backend is broken, so let's > sync internal > > > >> * last avail idx to the device used idx. > > > >> */ > > > >> -- > > > >> 2.39.3 > > > >> > > > > > >
On Wed, Jul 12, 2023 at 2:54 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote: > > > On 7/11/2023 3:34 PM, Jason Wang wrote: > > > > On Tue, Jul 11, 2023 at 3:25 PM Eugenio Perez Martin <eperezma@redhat.com> > wrote: > >> On Tue, Jul 11, 2023 at 9:05 AM Jason Wang <jasowang@redhat.com> wrote: >> > >> > On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan <lingshan.zhu@intel.com> >> wrote: >> > > >> > > >> > > >> > > On 7/11/2023 10:50 AM, Jason Wang wrote: >> > > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan < >> lingshan.zhu@intel.com> wrote: >> > > >> In the poweroff routine, no need to fetch last available index. >> > > >> >> > > > This is because there's no concept of shutdown in the vhost layer, >> it >> > > > only knows start and stop. >> > > > >> > > >> This commit also provides a better debug message in the vhost >> > > >> caller vhost_virtqueue_stop, >> > > > A separate patch is better. >> > > OK >> > > > >> > > >> because if vhost does not fetch >> > > >> the last avail idx successfully, maybe the device does not >> > > >> suspend, vhost will sync last avail idx to vring used idx as a >> > > >> work around, not a failure. >> > > > This only happens if we return a negative value? >> > > Yes >> > > > >> > > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> >> > > >> --- >> > > >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ >> > > >> hw/virtio/vhost.c | 2 +- >> > > >> 2 files changed, 11 insertions(+), 1 deletion(-) >> > > >> >> > > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c >> > > >> index 3c575a9a6e..10b445f64e 100644 >> > > >> --- a/hw/virtio/vhost-vdpa.c >> > > >> +++ b/hw/virtio/vhost-vdpa.c >> > > >> @@ -26,6 +26,7 @@ >> > > >> #include "cpu.h" >> > > >> #include "trace.h" >> > > >> #include "qapi/error.h" >> > > >> +#include "sysemu/runstate.h" >> > > >> >> > > >> /* >> > > >> * Return one past the end of the end of section. Be careful >> with uint64_t >> > > >> @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct >> vhost_dev *dev, >> > > >> struct vhost_vdpa *v = dev->opaque; >> > > >> int ret; >> > > >> >> > > >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { >> > > >> + /* >> > > >> + * Some devices do not support this call properly, >> > > >> + * and we don't need to retrieve the indexes >> > > >> + * if it is shutting down >> > > >> + */ >> > > >> + return 0; >> > > > Checking runstate in the vhost code seems like a layer violation. >> > > > >> > > > What happens without this patch? >> > > vhost tries to fetch vring base, >> > > vhost_vdpa needs suspend the device before retrieving last_avail_idx. >> > > However not all devices can support .suspend properly so this call >> > > may fail. >> > >> > I think this is where I'm lost. If the device doesn't support >> > suspending, any reason we only try to fix the case of shutdown? >> > >> > Btw, the fail is intended: >> > >> > if (!v->suspended) { >> > /* >> > * Cannot trust in value returned by device, let vhost recover >> used >> > * idx from guest. >> > */ >> > return -1; >> > } >> > >> >> The fail is intended, but to recover the last used idx, either from >> device or from the guest, is only useful in the case of migration. >> > > Note that we had userspace devices like VDUSE now, so it could be useful > in the case of a VDUSE daemon crash or reconnect. > > This code blcok is for vhost_vdpa backend, and I think vduse is another > code path. > I'm not sure I understand here, I meant vhost_vdpa + vduse. It works similar to vhost-user. > Return a guest used idx may be a good idea but as Eugenio pointed out that > may duplicate the code. > > > >> I think the main problem is the error message, actually. But I think >> it has no use either to recover last_avail_idx or print a debug >> message if we're not migrating. Another solution would be to recover >> it from the guest at vhost_vdpa_get_vring_base, but I don't like the >> duplication. >> >> > And if we return to success here, will we go to set an uninitialized >> > last avail idx? >> > >> >> It will be either the default value (is set to 0 at >> __virtio_queue_reset) or the one received from a migration (at >> virtio_load). >> > > 0 is even sub-optimal than the index used. Anyhow, VHOST_DEBUG should not > be enabled for production environments. > > Returning 0 sounds like a queue reset, yes we can reset a queue if failed > to suspend it, I am not sure whther > 0 is better than guest used idx. > > I think we are not able to disable VHOST_DEBUG because customers can build > QEMU by their own. > Well, disabling debug information is a common practice in any distribution. Or if you worry about the default, let's have a patch to undef VHOST_DEBUG by defualt. Thanks > > Thanks > > > Thanks > > >> >> Thanks! >> >> > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); >> > if (r < 0) { >> > ... >> > }.else { >> > virtio_queue_set_last_avail_idx(vdev, idx, state.num); >> > } >> > >> > Thanks >> > >> > > Then vhost will print an error shows something failed. >> > > >> > > The error msg is confused, as stated in the commit log, restoring >> > > last_avail_idx with guest used idx >> > > is a workaround rather than a failure. And no needs to fetch >> last_avail_idx >> > > when power off. >> > > >> > > Thanks >> > > > >> > > > Thanks >> > > > >> > > >> + } >> > > >> + >> > > >> if (v->shadow_vqs_enabled) { >> > > >> ring->num = virtio_queue_get_last_avail_idx(dev->vdev, >> ring->index); >> > > >> return 0; >> > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c >> > > >> index 82394331bf..7dd90cff3a 100644 >> > > >> --- a/hw/virtio/vhost.c >> > > >> +++ b/hw/virtio/vhost.c >> > > >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev >> *dev, >> > > >> >> > > >> r = dev->vhost_ops->vhost_get_vring_base(dev, &state); >> > > >> if (r < 0) { >> > > >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", >> idx, r); >> > > >> + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used >> idx for vhost VQ %u", idx); >> > > >> /* Connection to the backend is broken, so let's sync >> internal >> > > >> * last avail idx to the device used idx. >> > > >> */ >> > > >> -- >> > > >> 2.39.3 >> > > >> >> > > >> > >> >> >
On 7/12/2023 3:22 PM, Jason Wang wrote: > > > On Wed, Jul 12, 2023 at 2:54 PM Zhu, Lingshan <lingshan.zhu@intel.com> > wrote: > > > > On 7/11/2023 3:34 PM, Jason Wang wrote: >> >> >> On Tue, Jul 11, 2023 at 3:25 PM Eugenio Perez Martin >> <eperezma@redhat.com> wrote: >> >> On Tue, Jul 11, 2023 at 9:05 AM Jason Wang >> <jasowang@redhat.com> wrote: >> > >> > On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan >> <lingshan.zhu@intel.com> wrote: >> > > >> > > >> > > >> > > On 7/11/2023 10:50 AM, Jason Wang wrote: >> > > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan >> <lingshan.zhu@intel.com> wrote: >> > > >> In the poweroff routine, no need to fetch last >> available index. >> > > >> >> > > > This is because there's no concept of shutdown in the >> vhost layer, it >> > > > only knows start and stop. >> > > > >> > > >> This commit also provides a better debug message in >> the vhost >> > > >> caller vhost_virtqueue_stop, >> > > > A separate patch is better. >> > > OK >> > > > >> > > >> because if vhost does not fetch >> > > >> the last avail idx successfully, maybe the device does not >> > > >> suspend, vhost will sync last avail idx to vring used >> idx as a >> > > >> work around, not a failure. >> > > > This only happens if we return a negative value? >> > > Yes >> > > > >> > > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> >> > > >> --- >> > > >> hw/virtio/vhost-vdpa.c | 10 ++++++++++ >> > > >> hw/virtio/vhost.c | 2 +- >> > > >> 2 files changed, 11 insertions(+), 1 deletion(-) >> > > >> >> > > >> diff --git a/hw/virtio/vhost-vdpa.c >> b/hw/virtio/vhost-vdpa.c >> > > >> index 3c575a9a6e..10b445f64e 100644 >> > > >> --- a/hw/virtio/vhost-vdpa.c >> > > >> +++ b/hw/virtio/vhost-vdpa.c >> > > >> @@ -26,6 +26,7 @@ >> > > >> #include "cpu.h" >> > > >> #include "trace.h" >> > > >> #include "qapi/error.h" >> > > >> +#include "sysemu/runstate.h" >> > > >> >> > > >> /* >> > > >> * Return one past the end of the end of section. Be >> careful with uint64_t >> > > >> @@ -1391,6 +1392,15 @@ static int >> vhost_vdpa_get_vring_base(struct vhost_dev *dev, >> > > >> struct vhost_vdpa *v = dev->opaque; >> > > >> int ret; >> > > >> >> > > >> + if (runstate_check(RUN_STATE_SHUTDOWN)) { >> > > >> + /* >> > > >> + * Some devices do not support this call >> properly, >> > > >> + * and we don't need to retrieve the indexes >> > > >> + * if it is shutting down >> > > >> + */ >> > > >> + return 0; >> > > > Checking runstate in the vhost code seems like a layer >> violation. >> > > > >> > > > What happens without this patch? >> > > vhost tries to fetch vring base, >> > > vhost_vdpa needs suspend the device before retrieving >> last_avail_idx. >> > > However not all devices can support .suspend properly so >> this call >> > > may fail. >> > >> > I think this is where I'm lost. If the device doesn't support >> > suspending, any reason we only try to fix the case of shutdown? >> > >> > Btw, the fail is intended: >> > >> > if (!v->suspended) { >> > /* >> > * Cannot trust in value returned by device, let >> vhost recover used >> > * idx from guest. >> > */ >> > return -1; >> > } >> > >> >> The fail is intended, but to recover the last used idx, >> either from >> device or from the guest, is only useful in the case of >> migration. >> >> >> Note that we had userspace devices like VDUSE now, so it could be >> useful in the case of a VDUSE daemon crash or reconnect. > This code blcok is for vhost_vdpa backend, and I think vduse is > another code path. > > > I'm not sure I understand here, I meant vhost_vdpa + vduse. It works > similar to vhost-user. OK, so do you suggest we set vring state == 0 and return 0 if failed to suspend the device? No matter shutdown or other cases. > > Return a guest used idx may be a good idea but as Eugenio pointed > out that may duplicate the code. >> >> >> I think the main problem is the error message, actually. But >> I think >> it has no use either to recover last_avail_idx or print a debug >> message if we're not migrating. Another solution would be to >> recover >> it from the guest at vhost_vdpa_get_vring_base, but I don't >> like the >> duplication. >> >> > And if we return to success here, will we go to set an >> uninitialized >> > last avail idx? >> > >> >> It will be either the default value (is set to 0 at >> __virtio_queue_reset) or the one received from a migration (at >> virtio_load). >> >> >> 0 is even sub-optimal than the index used. Anyhow, VHOST_DEBUG >> should not be enabled for production environments. > Returning 0 sounds like a queue reset, yes we can reset a queue if > failed to suspend it, I am not sure whther > 0 is better than guest used idx. > > I think we are not able to disable VHOST_DEBUG because customers > can build QEMU by their own. > > > Well, disabling debug information is a common practice in any > distribution. > > Or if you worry about the default, let's have a patch to undef > VHOST_DEBUG by defualt. > I can do this in the next version Thanks > Thanks > > > Thanks >> >> Thanks >> >> >> Thanks! >> >> > r = dev->vhost_ops->vhost_get_vring_base(dev, &state); >> > if (r < 0) { >> > ... >> > }.else { >> > virtio_queue_set_last_avail_idx(vdev, idx, state.num); >> > } >> > >> > Thanks >> > >> > > Then vhost will print an error shows something failed. >> > > >> > > The error msg is confused, as stated in the commit log, >> restoring >> > > last_avail_idx with guest used idx >> > > is a workaround rather than a failure. And no needs to >> fetch last_avail_idx >> > > when power off. >> > > >> > > Thanks >> > > > >> > > > Thanks >> > > > >> > > >> + } >> > > >> + >> > > >> if (v->shadow_vqs_enabled) { >> > > >> ring->num = >> virtio_queue_get_last_avail_idx(dev->vdev, ring->index); >> > > >> return 0; >> > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c >> > > >> index 82394331bf..7dd90cff3a 100644 >> > > >> --- a/hw/virtio/vhost.c >> > > >> +++ b/hw/virtio/vhost.c >> > > >> @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct >> vhost_dev *dev, >> > > >> >> > > >> r = dev->vhost_ops->vhost_get_vring_base(dev, >> &state); >> > > >> if (r < 0) { >> > > >> - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore >> failed: %d", idx, r); >> > > >> + VHOST_OPS_DEBUG(r, "sync last avail idx to >> the guest used idx for vhost VQ %u", idx); >> > > >> /* Connection to the backend is broken, so >> let's sync internal >> > > >> * last avail idx to the device used idx. >> > > >> */ >> > > >> -- >> > > >> 2.39.3 >> > > >> >> > > >> > >> >
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 3c575a9a6e..10b445f64e 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -26,6 +26,7 @@ #include "cpu.h" #include "trace.h" #include "qapi/error.h" +#include "sysemu/runstate.h" /* * Return one past the end of the end of section. Be careful with uint64_t @@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev, struct vhost_vdpa *v = dev->opaque; int ret; + if (runstate_check(RUN_STATE_SHUTDOWN)) { + /* + * Some devices do not support this call properly, + * and we don't need to retrieve the indexes + * if it is shutting down + */ + return 0; + } + if (v->shadow_vqs_enabled) { ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index); return 0; diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 82394331bf..7dd90cff3a 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev, r = dev->vhost_ops->vhost_get_vring_base(dev, &state); if (r < 0) { - VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r); + VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ %u", idx); /* Connection to the backend is broken, so let's sync internal * last avail idx to the device used idx. */
In the poweroff routine, no need to fetch last available index. This commit also provides a better debug message in the vhost caller vhost_virtqueue_stop, because if vhost does not fetch the last avail idx successfully, maybe the device does not suspend, vhost will sync last avail idx to vring used idx as a work around, not a failure. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> --- hw/virtio/vhost-vdpa.c | 10 ++++++++++ hw/virtio/vhost.c | 2 +- 2 files changed, 11 insertions(+), 1 deletion(-)