Message ID | 1612993680-29454-4-git-send-email-si-wei.liu@oracle.com (mailing list archive)
State | Awaiting Upstream
Series | mlx5_vdpa bug fixes
Context | Check | Description
---|---|---
netdev/tree_selection | success | Not a local patch
On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:
> While virtq is stopped, get_vq_state() is supposed to
> be called to get sync'ed with the latest internal
> avail_index from device. The saved avail_index is used
> to restate the virtq once device is started. Commit
> b35ccebe3ef7 introduced the clear_virtqueues() routine
> to reset the saved avail_index, however, the index
> gets cleared a bit earlier before get_vq_state() tries
> to read it. This would cause consistency problems when
> virtq is restarted, e.g. through a series of link down
> and link up events. We could defer the clearing of
> avail_index to until the device is to be started,
> i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in
> set_status().
>
> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Acked-by: Jason Wang <jasowang@redhat.com>

Acked-by: Eli Cohen <elic@nvidia.com>

> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 7c1f789..ce6aae8 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
> 	if (!status) {
> 		mlx5_vdpa_info(mvdev, "performing device reset\n");
> 		teardown_driver(ndev);
> -		clear_virtqueues(ndev);
> 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
> 		ndev->mvdev.status = 0;
> 		++mvdev->generation;
> @@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>
> 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
> 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +			clear_virtqueues(ndev);
> 			err = setup_driver(ndev);
> 			if (err) {
> 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
> --
> 1.8.3.1
>
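[Editor's note] For readers following the thread, here is a minimal, self-contained C model of the ordering problem the commit message describes. It is an illustration only: the `model_*` names, the stand-in status constant, and the stripped-down status handling are assumptions for the sketch, not the actual mlx5_vdpa code; only the idea (clearing the saved avail index on the DRIVER_OK transition rather than at reset) mirrors the patch.

```c
/*
 * Illustrative model of the fix (not the real driver): the saved avail
 * index must stay readable between device reset and restart, because
 * get_vq_state() is called in that window.
 */
#include <assert.h>
#include <stdint.h>

#define MODEL_DRIVER_OK 0x04	/* stand-in for VIRTIO_CONFIG_S_DRIVER_OK */

struct model_vq {
	uint16_t avail_idx;	/* avail index saved while the vq is stopped */
};

/* get_vq_state() analogue: report the saved index to the caller. */
static uint16_t model_get_vq_state(const struct model_vq *vq)
{
	return vq->avail_idx;
}

/*
 * set_status() analogue with the patch applied: a reset (status == 0)
 * tears the queue down but leaves avail_idx alone; the clear happens
 * only when DRIVER_OK is set again, just before the queue is recreated.
 */
static void model_set_status(struct model_vq *vq, uint8_t status)
{
	if (!status)
		return;			/* keep avail_idx readable after reset */
	if (status & MODEL_DRIVER_OK)
		vq->avail_idx = 0;	/* cleared only on restart */
}

int main(void)
{
	struct model_vq vq = { .avail_idx = 42 };

	model_set_status(&vq, 0);			/* link down / reset */
	assert(model_get_vq_state(&vq) == 42);		/* index still visible */
	model_set_status(&vq, MODEL_DRIVER_OK);		/* link up / restart */
	assert(model_get_vq_state(&vq) == 0);
	return 0;
}
```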
On Thu, Feb 11, 2021 at 09:33:14AM +0200, Eli Cohen wrote:
> On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:
> > While virtq is stopped, get_vq_state() is supposed to
> > be called to get sync'ed with the latest internal
> > avail_index from device. The saved avail_index is used
> > to restate the virtq once device is started. Commit
> > b35ccebe3ef7 introduced the clear_virtqueues() routine
> > to reset the saved avail_index, however, the index
> > gets cleared a bit earlier before get_vq_state() tries
> > to read it. This would cause consistency problems when
> > virtq is restarted, e.g. through a series of link down
> > and link up events. We could defer the clearing of
> > avail_index to until the device is to be started,
> > i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in
> > set_status().
> >
> > Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
> > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> > Acked-by: Jason Wang <jasowang@redhat.com>
> Acked-by: Eli Cohen <elic@nvidia.com>
>

I take it back. I think we don't need to clear the indexes at all. In
case we need to restore indexes we'll get the right values through
set_vq_state(). If we suspend the virtqueue due to VM being suspended,
qemu will query first and will provide the queried value. In case of
VM reboot, it will provide 0 in set_vq_state().

I am sending a patch that addresses both reboot and suspend.

> > ---
> >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > index 7c1f789..ce6aae8 100644
> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > @@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
> > 	if (!status) {
> > 		mlx5_vdpa_info(mvdev, "performing device reset\n");
> > 		teardown_driver(ndev);
> > -		clear_virtqueues(ndev);
> > 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
> > 		ndev->mvdev.status = 0;
> > 		++mvdev->generation;
> > @@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
> >
> > 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
> > 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> > +			clear_virtqueues(ndev);
> > 			err = setup_driver(ndev);
> > 			if (err) {
> > 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
> > --
> > 1.8.3.1
> >
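[Editor's note] A hedged sketch of the alternative Eli outlines above: skip clearing entirely and let set_vq_state() establish the starting index. The `alt_*` names and flow are made up for illustration and are not the follow-up patch itself; the sketch only shows the contract he describes (on suspend, qemu queries via get_vq_state() and later supplies that value through set_vq_state(); after a reboot it supplies 0).

```c
/*
 * Illustration of the "no clearing needed" approach: the index the
 * device starts from always comes from set_vq_state(), so the driver
 * never has to zero its saved copy itself.  Hypothetical names.
 */
#include <stdint.h>

struct alt_vq {
	uint16_t avail_idx;	/* index the queue will resume from */
};

/* Suspend path: the hypervisor reads the current index... */
static uint16_t alt_get_vq_state(const struct alt_vq *vq)
{
	return vq->avail_idx;
}

/* ...and hands it back before restart; after a guest reboot it simply
 * passes 0, which covers both cases without a clear_virtqueues() call. */
static void alt_set_vq_state(struct alt_vq *vq, uint16_t avail_idx)
{
	vq->avail_idx = avail_idx;
}

int main(void)
{
	struct alt_vq vq = { .avail_idx = 17 };

	/* VM suspend: query, then restore the same value on resume. */
	uint16_t saved = alt_get_vq_state(&vq);
	alt_set_vq_state(&vq, saved);		/* resumes from 17 */

	/* VM reboot: the hypervisor passes 0 instead. */
	alt_set_vq_state(&vq, 0);		/* starts fresh */
	return 0;
}
```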
On 2/16/2021 7:21 AM, Eli Cohen wrote:
> On Thu, Feb 11, 2021 at 09:33:14AM +0200, Eli Cohen wrote:
>> On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:
>>> While virtq is stopped, get_vq_state() is supposed to
>>> be called to get sync'ed with the latest internal
>>> avail_index from device. The saved avail_index is used
>>> to restate the virtq once device is started. Commit
>>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>>> to reset the saved avail_index, however, the index
>>> gets cleared a bit earlier before get_vq_state() tries
>>> to read it. This would cause consistency problems when
>>> virtq is restarted, e.g. through a series of link down
>>> and link up events. We could defer the clearing of
>>> avail_index to until the device is to be started,
>>> i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in
>>> set_status().
>>>
>>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>> Acked-by: Jason Wang <jasowang@redhat.com>
>> Acked-by: Eli Cohen <elic@nvidia.com>
>>
> I take it back. I think we don't need to clear the indexes at all. In
> case we need to restore indexes we'll get the right values through
> set_vq_state(). If we suspend the virtqueue due to VM being suspended,
> qemu will query first and will provide the queried value. In case of
> VM reboot, it will provide 0 in set_vq_state().
>
> I am sending a patch that addresses both reboot and suspend.

With set_vq_state() repurposed to restoring used_index I'm fine with
this approach.

Do I have to repost a v3 of this series while dropping the 3rd patch?

-Siwei

>
>>> ---
>>>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> index 7c1f789..ce6aae8 100644
>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>>> @@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>> 	if (!status) {
>>> 		mlx5_vdpa_info(mvdev, "performing device reset\n");
>>> 		teardown_driver(ndev);
>>> -		clear_virtqueues(ndev);
>>> 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
>>> 		ndev->mvdev.status = 0;
>>> 		++mvdev->generation;
>>> @@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
>>>
>>> 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
>>> 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
>>> +			clear_virtqueues(ndev);
>>> 			err = setup_driver(ndev);
>>> 			if (err) {
>>> 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
>>> --
>>> 1.8.3.1
>>>
On 2021/2/18 5:55 AM, Si-Wei Liu wrote:
>
> On 2/16/2021 7:21 AM, Eli Cohen wrote:
>> On Thu, Feb 11, 2021 at 09:33:14AM +0200, Eli Cohen wrote:
>>> On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:
>>>> While virtq is stopped, get_vq_state() is supposed to
>>>> be called to get sync'ed with the latest internal
>>>> avail_index from device. The saved avail_index is used
>>>> to restate the virtq once device is started. Commit
>>>> b35ccebe3ef7 introduced the clear_virtqueues() routine
>>>> to reset the saved avail_index, however, the index
>>>> gets cleared a bit earlier before get_vq_state() tries
>>>> to read it. This would cause consistency problems when
>>>> virtq is restarted, e.g. through a series of link down
>>>> and link up events. We could defer the clearing of
>>>> avail_index to until the device is to be started,
>>>> i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in
>>>> set_status().
>>>>
>>>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index
>>>> after change map")
>>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
>>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>> Acked-by: Eli Cohen <elic@nvidia.com>
>>>
>> I take it back. I think we don't need to clear the indexes at all. In
>> case we need to restore indexes we'll get the right values through
>> set_vq_state(). If we suspend the virtqueue due to VM being suspended,
>> qemu will query first and will provide the queried value. In case of
>> VM reboot, it will provide 0 in set_vq_state().
>>
>> I am sending a patch that addresses both reboot and suspend.
> With set_vq_state() repurposed to restoring used_index I'm fine with
> this approach.
>
> Do I have to repost a v3 of this series while dropping the 3rd patch?
>
> -Siwei

Yes, please.

Thanks
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7c1f789..ce6aae8 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 	if (!status) {
 		mlx5_vdpa_info(mvdev, "performing device reset\n");
 		teardown_driver(ndev);
-		clear_virtqueues(ndev);
 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
 		ndev->mvdev.status = 0;
 		++mvdev->generation;
@@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 
 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
+			clear_virtqueues(ndev);
 			err = setup_driver(ndev);
 			if (err) {
 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");