Message ID | 20230720181459.607008-12-eperezma@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | Prefer to use SVQ to stall dataplane at NIC state restore through CVQ |
On 7/20/2023 11:14 AM, Eugenio Pérez wrote:
> Some dynamic state of a virtio-net vDPA device is restored from CVQ in
> the event of a live migration. However, dataplane needs to be disabled
> so the NIC does not receive buffers in the invalid ring.
>
> As a default method to achieve it, let's offer a shadow vring with 0
> avail idx. As a fallback method, we will enable dataplane vqs later, as
> proposed previously.

Let's not jump to conclusions too early about what will be the default
vs. the fallback [1]. Since this is on a latency-sensitive path, I'm not
fully convinced ring reset could perform better than, or as well as, the
deferred dataplane enablement approach on hardware. At this stage I
think ring_reset has no adoption in vendor devices, while deferred
dataplane enabling is definitely easier for vendors to implement, with
lower hardware overhead. If at some point a vendor's device has to
support RING_RESET for use cases other than live migration (MTU change
propagation, for example, a prerequisite for GRO HW), defaulting to
RING_RESET on this SVQ path has no real benefit but needlessly adds
complications to the vendor's device.

[1]
https://lore.kernel.org/virtualization/bf2164a9-1dfd-14d9-be2a-8bb7620a0619@oracle.com/T/#m15caca6fbb00ca9c00e2b33391297a2d8282ff89

Thanks,
-Siwei
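As background for readers less familiar with vhost-vdpa: the "deferred
dataplane enablement" approach Si-Wei mentions boils down to switching the
data virtqueues on only after the NIC state has been reloaded through CVQ.
The sketch below shows roughly what that looks like at the vhost-vdpa uAPI
level, assuming a raw device fd; VHOST_VDPA_SET_VRING_ENABLE and struct
vhost_vring_state are the real uAPI, while the function name and the
elided CVQ step are illustrative only, not QEMU's implementation.

```c
#include <sys/ioctl.h>
#include <linux/vhost.h>

/*
 * Illustrative sketch of "deferred dataplane enabling": the data vqs of a
 * vhost-vdpa device are enabled only after the state restore through CVQ
 * has finished.  QEMU does the equivalent via vhost_vdpa_set_vring_ready().
 */
static int enable_dataplane_after_cvq_restore(int vdpa_fd, int nr_data_vqs)
{
    /* ... CVQ commands restoring MAC, MQ, RSS, etc. would run here ... */

    for (int i = 0; i < nr_data_vqs; i++) {
        struct vhost_vring_state state = { .index = i, .num = 1 };

        if (ioctl(vdpa_fd, VHOST_VDPA_SET_VRING_ENABLE, &state) < 0) {
            return -1;
        }
    }
    return 0;
}
```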
On Sat, Jul 22, 2023 at 12:59 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
> On 7/20/2023 11:14 AM, Eugenio Pérez wrote:
> > Some dynamic state of a virtio-net vDPA device is restored from CVQ in
> > the event of a live migration. However, dataplane needs to be disabled
> > so the NIC does not receive buffers in the invalid ring.
> >
> > As a default method to achieve it, let's offer a shadow vring with 0
> > avail idx. As a fallback method, we will enable dataplane vqs later, as
> > proposed previously.
>
> Let's not jump to conclusions too early about what will be the default
> vs. the fallback [1]. Since this is on a latency-sensitive path, I'm not
> fully convinced ring reset could perform better than, or as well as, the
> deferred dataplane enablement approach on hardware. At this stage I
> think ring_reset has no adoption in vendor devices, while deferred
> dataplane enabling is definitely easier for vendors to implement, with
> lower hardware overhead. If at some point a vendor's device has to
> support RING_RESET for use cases other than live migration (MTU change
> propagation, for example, a prerequisite for GRO HW), defaulting to
> RING_RESET on this SVQ path has no real benefit but needlessly adds
> complications to the vendor's device.
>

I agree with that. Let's say "*This series* uses RING_RESET as the
default method, and late vq enablement as fallback".

Michael, given the current HW support, would it work to start merging
the late enable for vDPA after the feature freeze, and then add the
use of RING_RESET on top later?

Thanks!
On Tue, Jul 25, 2023 at 3:59 AM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Sat, Jul 22, 2023 at 12:59 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >
> >
> > On 7/20/2023 11:14 AM, Eugenio Pérez wrote:
> > > Some dynamic state of a virtio-net vDPA device is restored from CVQ in
> > > the event of a live migration. However, dataplane needs to be disabled
> > > so the NIC does not receive buffers in the invalid ring.
> > >
> > > As a default method to achieve it, let's offer a shadow vring with 0
> > > avail idx. As a fallback method, we will enable dataplane vqs later, as
> > > proposed previously.
> >
> > Let's not jump to conclusions too early about what will be the default
> > vs. the fallback [1]. Since this is on a latency-sensitive path, I'm not
> > fully convinced ring reset could perform better than, or as well as, the
> > deferred dataplane enablement approach on hardware. At this stage I
> > think ring_reset has no adoption in vendor devices, while deferred
> > dataplane enabling is definitely easier for vendors to implement, with
> > lower hardware overhead.

That's my feeling as well.

> > If at some point a vendor's device has to
> > support RING_RESET for use cases other than live migration (MTU change
> > propagation, for example, a prerequisite for GRO HW),

Currently, it is used for changing ring size, and it is planned for
AF_XDP. But I think neither of them requires low latency.

Thanks

> > defaulting to RING_RESET on this SVQ path has no real benefit but
> > needlessly adds complications to the vendor's device.
> >
>
> I agree with that. Let's say "*This series* uses RING_RESET as the
> default method, and late vq enablement as fallback".
>
> Michael, given the current HW support, would it work to start merging
> the late enable for vDPA after the feature freeze, and then add the
> use of RING_RESET on top later?
>
> Thanks!
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index af83de92f8..e14ae48f23 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -338,10 +338,25 @@ static int vhost_vdpa_net_data_start(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
     struct vhost_vdpa *v = &s->vhost_vdpa;
+    bool has_cvq = v->dev->vq_index_end % 2;
 
     assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
-    if (s->always_svq ||
+    if (has_cvq && (v->dev->features & VIRTIO_F_RING_RESET)) {
+        /*
+         * Offer a fake vring to the device while the state is restored
+         * through CVQ. That way, the guest will not see packets in unexpected
+         * queues.
+         *
+         * This will be undone after loading all state through CVQ, at
+         * vhost_vdpa_net_load.
+         *
+         * TODO: Future optimizations may skip some SVQ setup and teardown,
+         * like set the right kick and call fd or doorbell maps directly, and
+         * the iova tree.
+         */
+        v->shadow_vqs_enabled = true;
+    } else if (s->always_svq ||
         migration_is_setup_or_active(migrate_get_current()->state)) {
         v->shadow_vqs_enabled = true;
         v->shadow_data = true;
@@ -738,10 +753,34 @@ static int vhost_vdpa_net_load(NetClientState *nc)
         return r;
     }
 
-    for (int i = 0; i < v->dev->vq_index; ++i) {
-        r = vhost_vdpa_set_vring_ready(v, i);
-        if (unlikely(r)) {
-            return r;
+    if (v->dev->features & VIRTIO_F_RING_RESET && !s->always_svq &&
+        !migration_is_setup_or_active(migrate_get_current()->state)) {
+        NICState *nic = qemu_get_nic(s->nc.peer);
+        int queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
+
+        for (int i = 0; i < queue_pairs; ++i) {
+            NetClientState *ncs = qemu_get_peer(nic->ncs, i);
+            VhostVDPAState *s_i = DO_UPCAST(VhostVDPAState, nc, ncs);
+
+            for (int j = 0; j < 2; ++j) {
+                vhost_net_virtqueue_reset(v->dev->vdev, ncs->peer, j);
+            }
+
+            s_i->vhost_vdpa.shadow_vqs_enabled = false;
+
+            for (int j = 0; j < 2; ++j) {
+                r = vhost_net_virtqueue_restart(v->dev->vdev, ncs->peer, j);
+                if (unlikely(r < 0)) {
+                    return r;
+                }
+            }
+        }
+    } else {
+        for (int i = 0; i < v->dev->vq_index; ++i) {
+            r = vhost_vdpa_set_vring_ready(v, i);
+            if (unlikely(r)) {
+                return r;
+            }
         }
     }
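For context on the reset/restart pair used above: vhost_net_virtqueue_reset()
and vhost_net_virtqueue_restart() are existing helpers in QEMU's vhost_net
layer, and with a vDPA backend the actual ring reset is ultimately carried
out by the parent (vendor) driver. The sketch below only illustrates the
per-virtqueue handshake that VIRTIO_F_RING_RESET defines for the PCI
transport in virtio 1.2; the struct is a hand-rolled subset using the spec's
field names, not a real transport driver, and endianness and memory-barrier
handling are omitted.

```c
#include <stdint.h>

/*
 * Illustrative subset of the virtio-pci common configuration, using the
 * spec's field names only; a real driver maps this from a PCI capability
 * and uses little-endian accessors and barriers.
 */
struct common_cfg_view {
    uint16_t queue_select;
    uint16_t queue_enable;
    uint16_t queue_reset;    /* only present with VIRTIO_F_RING_RESET */
    uint64_t queue_desc;     /* descriptor table address */
    uint64_t queue_driver;   /* available ring address */
    uint64_t queue_device;   /* used ring address */
};

static void reset_and_restart_vq(volatile struct common_cfg_view *cfg,
                                 uint16_t vq, uint64_t desc,
                                 uint64_t driver, uint64_t device)
{
    cfg->queue_select = vq;
    cfg->queue_reset = 1;            /* request a reset of this queue only */
    while (cfg->queue_reset != 1) {  /* reset is complete once 1 reads back */
        /* busy-wait for the device */
    }

    /* The queue is now disabled; point it at the real rings and re-enable. */
    cfg->queue_desc = desc;
    cfg->queue_driver = driver;
    cfg->queue_device = device;
    cfg->queue_enable = 1;
}
```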
Some dynamic state of a virtio-net vDPA device is restored from CVQ in
the event of a live migration. However, dataplane needs to be disabled
so the NIC does not receive buffers in the invalid ring.

As a default method to achieve it, let's offer a shadow vring with 0
avail idx. As a fallback method, we will enable dataplane vqs later, as
proposed previously.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 49 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 5 deletions(-)
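To see why a shadow vring whose avail idx stays at 0 stalls the dataplane:
in a split virtqueue the device may only use descriptor chains whose heads
the driver has published below avail->idx, so a freshly set up ring with
idx == 0 exposes nothing for the device to consume. A minimal sketch of the
relevant layout (struct vring_avail matches the virtio spec and
linux/virtio_ring.h; the static initializer is purely illustrative):

```c
#include <stdint.h>

/* Split-virtqueue available ring, as defined by the virtio spec. */
struct vring_avail {
    uint16_t flags;
    uint16_t idx;     /* number of entries the driver has published */
    uint16_t ring[];  /* descriptor heads; only entries below idx count */
};

/*
 * A shadow vring whose available index stays at 0: the device sees an
 * empty ring and therefore cannot place incoming packets anywhere while
 * the NIC state is still being restored through CVQ.
 */
static const struct vring_avail stalled_avail = {
    .flags = 0,
    .idx = 0,         /* nothing available: dataplane is effectively stalled */
};
```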