Message ID | 20230129025150.119972-3-xuanzhuo@linux.alibaba.com
State      | New, archived
Series     | virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed
On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> Check whether it is per-queue reset state in virtio_net_flush_tx().
>
> Before per-queue reset, we need to recover async tx resources. At this
> time, virtio_net_flush_tx() is called, but we should not try to send
> new packets, so virtio_net_flush_tx() should check the current
> per-queue reset state.
>
> Fixes: 7dc6be52 ("virtio-net: support queue reset")
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> Reported-by: Alexander Bulekov <alxndr@bu.edu>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/net/virtio-net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 3ae909041a..fba6451a50 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>      VirtQueueElement *elem;
>      int32_t num_packets = 0;
>      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> +        virtio_queue_reset_state(q->tx_vq)) {

We have other places that check DRIVER_OK; do we need to check queue
reset as well? E.g.:

    virtio_net_can_receive()
    virtio_net_tx_{timer|bh}()

Thanks

>          return num_packets;
>      }
>
> --
> 2.32.0.3.g01195cf9f
On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> Check whether it is per-queue reset state in virtio_net_flush_tx().
>
> Before per-queue reset, we need to recover async tx resources. At this
> time, virtio_net_flush_tx() is called, but we should not try to send
> new packets, so virtio_net_flush_tx() should check the current
> per-queue reset state.

What does "at this time" mean here? Do you in fact mean it's called
from flush_or_purge_queued_packets? What does the call stack look like?

If yes, introducing a vq state just so virtio_net_flush_tx knows we are
in the process of reset would be a bad idea. We want something much
more local, ideally on stack even ...

> [...]
On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> What does "at this time" mean here?
> Do you in fact mean it's called from flush_or_purge_queued_packets?

Yes:

    virtio_queue_reset
        k->queue_reset
            virtio_net_queue_reset
                flush_or_purge_queued_packets
                    qemu_flush_or_purge_queued_packets
                        .....
                        (callback) virtio_net_tx_complete
                            virtio_net_flush_tx  <-- here we send a new packet; we need to stop it

Because it is inside the callback, I can't pass information through the
stack. I originally thought it was a general situation, so I wanted to
put it in struct VirtQueue. If that is not suitable, it may be better
to put it in VirtIONetQueue.

Thanks.

> What does the call stack look like?
>
> If yes, introducing a vq state just so virtio_net_flush_tx knows we
> are in the process of reset would be a bad idea. We want something
> much more local, ideally on stack even ...
>
> [...]
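A minimal sketch of the VirtIONetQueue variant floated above; the field
name reset_in_progress and the exact point where it is toggled are
assumptions for illustration, not code from this series:

/* Hypothetical flag in VirtIONetQueue (hw/net/virtio-net.c) */
typedef struct VirtIONetQueue {
    VirtQueue *rx_vq;
    VirtQueue *tx_vq;
    /* ... existing fields ... */
    bool reset_in_progress;    /* assumed new field */
} VirtIONetQueue;

static void virtio_net_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
{
    VirtIONet *n = VIRTIO_NET(vdev);
    int index = vq2q(queue_index);
    VirtIONetQueue *q = &n->vqs[index];

    q->reset_in_progress = true;    /* seen by virtio_net_flush_tx() */
    flush_or_purge_queued_packets(qemu_get_subqueue(n->nic, index));
    /* ... recover async tx resources ... */
    q->reset_in_progress = false;
}

virtio_net_flush_tx() would then bail out on q->reset_in_progress the
same way the posted patch bails out on virtio_queue_reset_state().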
On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > [...]
>
> We have other places that check DRIVER_OK; do we need to check queue
> reset as well?

I checked it again. I still think the other places that check DRIVER_OK
do not need to check queue reset.

Thanks.

> E.g.:
>
>     virtio_net_can_receive()
>     virtio_net_tx_{timer|bh}()
>
> Thanks
>
> [...]
On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > Do you in fact mean it's called from flush_or_purge_queued_packets?
>
> Yes
>
> [call stack quoted above]
>
> Because it is inside the callback, I can't pass information through
> the stack. I originally thought it was a general situation, so I
> wanted to put it in struct VirtQueue.
>
> If that is not suitable, it may be better to put it in VirtIONetQueue.

Hmm, maybe. Another idea: isn't virtio_net_tx_complete called with
length 0 here? Are there other cases where length is 0?

> > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > +        virtio_queue_reset_state(q->tx_vq)) {

btw this sounds like you are asking it to reset some state.

> > >          return num_packets;

and then

    ret = virtio_net_flush_tx(q);
    if (ret >= n->tx_burst)

will reschedule automatically, won't it?

Also, why check in virtio_net_flush_tx and not virtio_net_tx_complete?
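A sketch of that length-0 idea, assuming the purge path is the only
caller that completes with len == 0 (which is exactly what the rest of
the thread questions); the surrounding completion logic is elided:

static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
{
    VirtIONet *n = qemu_get_nic_opaque(nc);
    VirtIONetQueue *q = virtio_net_get_subqueue(nc);

    /* ... push the completed element and signal the guest ... */

    if (len == 0) {
        /* Assumed semantics: len == 0 means the packet was purged
         * (qemu_net_queue_purge() passes 0), e.g. during a per-queue
         * reset, so skip flushing further packets. */
        return;
    }

    if (virtio_net_flush_tx(q) >= n->tx_burst) {
        /* ... existing rescheduling of the tx timer/bh ... */
    }
}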
On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> [...]
>
> Also, why check in virtio_net_flush_tx and not virtio_net_tx_complete?

virtio_net_flush_tx may be called by the timer.

Thanks.
On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > Also, why check in virtio_net_flush_tx and not virtio_net_tx_complete?
>
> virtio_net_flush_tx may be called by the timer.

The timer won't run while flush_or_purge_queued_packets is in progress.
On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > virtio_net_flush_tx may be called by the timer.
>
> The timer won't run while flush_or_purge_queued_packets is in progress.

Is the timer not executed during the VMEXIT process? Otherwise, we
still have to consider the window after flush_or_purge_queued_packets
and before the queue structure is cleared.

Even if it can be handled in virtio_net_tx_complete, is there a good
way to do it? It is a callback, so it is not convenient to pass
parameters.

Thanks
On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > The timer won't run while flush_or_purge_queued_packets is in
> > progress.
>
> Is the timer not executed during the VMEXIT process? Otherwise, we
> still have to consider the window after flush_or_purge_queued_packets
> and before the queue structure is cleared.

void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
{
    VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);

    if (k->queue_reset) {
        k->queue_reset(vdev, queue_index);
    }

    __virtio_queue_reset(vdev, queue_index);
}

No, timers do not run between k->queue_reset and __virtio_queue_reset.

> Even if it can be handled in virtio_net_tx_complete, is there a good
> way to do it? It is a callback, so it is not convenient to pass
> parameters.
>
> Thanks

How about checking that length is 0?
On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> [...]
>
> How about checking that length is 0?

I think you are referring to the second parameter of
virtio_net_tx_complete, "len".

void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
{
    NetPacket *packet, *next;

    QTAILQ_FOREACH_SAFE(packet, &queue->packets, entry, next) {
        if (packet->sender == from) {
            QTAILQ_REMOVE(&queue->packets, packet, entry);
            queue->nq_count--;
            if (packet->sent_cb) {
                packet->sent_cb(packet->sender, 0);
            }
            g_free(packet);
        }
    }
}

qemu_net_queue_purge() passes 0 as len, and it is called from many
places, so I am not sure whether keying off a 0 length would have bad
side effects.

I will see if there is a good way to pass the information on the stack.

Thanks.
On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> [...]
>
> How about checking that length is 0?

I think checking the length is not a good way. It modifies the
semantics of 0. That is not friendly to future maintenance. On the
other hand, qemu_net_queue_purge() passes 0, and that function is
called from many places.

How about we add an API in queue.c to replace the sent_cb callback on
the queue?

Thanks.
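A sketch of what such a helper could look like, reusing the
NetQueue/NetPacket layout visible in qemu_net_queue_purge() above; the
name qemu_net_queue_replace_sent_cb() is invented for illustration:

/* Swap the completion callback of every packet queued by @from, so a
 * purge during per-queue reset invokes a reset-aware callback instead
 * of the normal virtio_net_tx_complete(). */
void qemu_net_queue_replace_sent_cb(NetQueue *queue, NetClientState *from,
                                    NetPacketSent *sent_cb)
{
    NetPacket *packet;

    QTAILQ_FOREACH(packet, &queue->packets, entry) {
        if (packet->sender == from) {
            packet->sent_cb = sent_cb;
        }
    }
}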
On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > We have other places that check DRIVER_OK; do we need to check queue
> > reset as well?
>
> I checked it again. I still think the other places that check
> DRIVER_OK do not need to check queue reset.

For example, if we don't disable can_receive() when the queue is reset,
rx may still go through virtio_net_receive_rcu(). That means QEMU keeps
trying to process traffic from the network backend (like tap), which
may waste cpu cycles.

I think the correct way is to return false from can_receive() when the
queue is reset; then the backend poll will be disabled (e.g. for TAP).
When the queue is enabled again, qemu_flush_queued_packets() will wake
up the backend polling.

I haven't had time to check the other places, but it would be better to
mention in the changelog why they don't need a check.

Thanks

> [...]
On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote:
> [...]
>
> I think the correct way is to return false from can_receive() when the
> queue is reset; then the backend poll will be disabled (e.g. for TAP).
> When the queue is enabled again, qemu_flush_queued_packets() will wake
> up the backend polling.
>
> I haven't had time to check the other places, but it would be better
> to mention in the changelog why they don't need a check.

static bool virtio_net_can_receive(NetClientState *nc)
{
    VirtIONet *n = qemu_get_nic_opaque(nc);
    VirtIODevice *vdev = VIRTIO_DEVICE(n);
    VirtIONetQueue *q = virtio_net_get_subqueue(nc);

    if (!vdev->vm_running) {
        return false;
    }

    if (nc->queue_index >= n->curr_queue_pairs) {
        return false;
    }

    if (!virtio_queue_ready(q->rx_vq) ||
        !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
        return false;
    }

    return true;
}

int virtio_queue_ready(VirtQueue *vq)
{
    return vq->vring.avail != 0;
}

static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i)
{
    vdev->vq[i].vring.desc = 0;
    vdev->vq[i].vring.avail = 0;
    vdev->vq[i].vring.used = 0;
    vdev->vq[i].last_avail_idx = 0;
    vdev->vq[i].shadow_avail_idx = 0;
    vdev->vq[i].used_idx = 0;
    vdev->vq[i].last_avail_wrap_counter = true;
    vdev->vq[i].shadow_avail_wrap_counter = true;
    vdev->vq[i].used_wrap_counter = true;
    virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
    vdev->vq[i].signalled_used = 0;
    vdev->vq[i].signalled_used_valid = false;
    vdev->vq[i].notification = true;
    vdev->vq[i].vring.num = vdev->vq[i].vring.num_default;
    vdev->vq[i].inuse = 0;
    virtio_virtqueue_reset_region_cache(&vdev->vq[i]);
}

In the implementation of per-queue reset, for RX we stop RX by setting
vdev->vq[i].vring.avail to 0. The can_receive callback then returns
false.

Thanks.
On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> [...]
>
> In the implementation of per-queue reset, for RX we stop RX by setting
> vdev->vq[i].vring.avail to 0. The can_receive callback then returns
> false.

Ok, but this is kind of fragile (especially when vIOMMU is enabled).
I'd add an explicit check for reset there (probably on top).

Thanks
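A sketch of the explicit check Jason suggests, reusing the
virtio_queue_reset_state() predicate this series introduces; where
exactly it lands in virtio_net_can_receive() is an assumption:

    if (!virtio_queue_ready(q->rx_vq) ||
        !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
        virtio_queue_reset_state(q->rx_vq)) {
        /* Refuse rx explicitly while the queue is being reset instead
         * of relying on vring.avail having been cleared to 0. */
        return false;
    }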
On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> [...]
>
> I think checking the length is not a good way. It modifies the
> semantics of 0. That is not friendly to future maintenance. On the
> other hand, qemu_net_queue_purge() passes 0, and that function is
> called from many places.
>
> How about we add an API in queue.c to replace the sent_cb callback on
> the queue?
>
> Thanks.

OK I guess. Jason?
On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote:
> On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > In the implementation of per-queue reset, for RX we stop RX by
> > setting vdev->vq[i].vring.avail to 0.
>
> Ok, but this is kind of fragile (especially when vIOMMU is enabled).
> I'd add an explicit check for reset there.

It's not great in that the spec says avail == 0 is actually legal. But
I don't really want to see more and more checks. If we are doing
cleanups, the right way is probably a new "live" flag that transports
can set correctly from the combination of DRIVER_OK, desc, kick,
queue_enable, queue_reset and so on.

> (probably on top).
>
> Thanks
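One way to read the "live" flag suggestion is a single predicate rather
than a stored flag; everything below is illustrative
(virtio_queue_ready() and virtio_queue_enabled() exist, but the overall
combination is a guess at what "live" would cover):

/* Hypothetical sketch: a queue is "live" only when the driver is up,
 * the ring is mapped, the queue is enabled and no per-queue reset is
 * in flight. */
static bool virtio_queue_live(VirtIODevice *vdev, VirtQueue *vq)
{
    int i = virtio_get_queue_index(vq);

    return (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
           virtio_queue_ready(vq) &&
           virtio_queue_enabled(vdev, i) &&
           !virtio_queue_reset_state(vq);    /* name from this series */
}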
On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > k->queue_reset > > > > > > > > virtio_net_queue_reset > > > > > > > > flush_or_purge_queued_packets > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > ..... > > > > > > > > (callback) virtio_net_tx_complete > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I > > > > > > > > originally thought it was a general situation, so I wanted to put it in > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue. > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > We want something much more local, ideally on stack even ... 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > > > > --- > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete? > > > > > > > > > > > > virtio_net_flush_tx may been called by timer. We stop timer/bh during device reset, do we need to do the same with vq reset? > > > > > > > > > > > > Thanks. > > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress. > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to > > > > consider that after the flush_or_purge_queued_packets, this process before the > > > > structure is cleared. > > > > > > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index) > > > { > > > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > > if (k->queue_reset) { > > > k->queue_reset(vdev, queue_index); > > > } > > > > > > __virtio_queue_reset(vdev, queue_index); > > > } > > > > > > > > > No timers do not run between k->queue_reset and __virtio_queue_reset. > > > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way? > > > > This is a callback, it is not convenient to pass the parameters. > > > > > > > > Thanks > > > > > > > > > How about checking that length is 0? > > > > > > I think that check length is not a good way. This modifys the semantics of 0. 0 seems to mean "purge" and > It is > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge() > > will pass 0, and this function is called by many places. That's exactly what we want actually, when do purge we don't need a flush? > > > > How about we add an api in queue.c to replace the sent_cb callback on queue? > > > > Thanks. > > OK I guess. Jason? Not sure, anything different from adding a check in virtio_net_tx_complete()? (assuming bh and timer is cancelled or deleted). 
Thanks
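For context, the flush and reschedule Michael refers to happen in the tx completion callback. A simplified sketch of that path (names follow hw/net/virtio-net.c, bh variant only; details are trimmed, so treat it as approximate rather than a verbatim copy):

    static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
    {
        VirtIONet *n = qemu_get_nic_opaque(nc);
        VirtIONetQueue *q = virtio_net_get_subqueue(nc);
        VirtIODevice *vdev = VIRTIO_DEVICE(n);

        /* Complete the element whose transmission was in flight. */
        virtqueue_push(q->tx_vq, q->async_tx.elem, 0);
        virtio_notify(vdev, q->tx_vq);

        g_free(q->async_tx.elem);
        q->async_tx.elem = NULL;

        /* Flush more packets; if the tx_burst budget is exhausted,
         * reschedule the bottom half instead of looping here. */
        virtio_queue_set_notification(q->tx_vq, 1);
        if (virtio_net_flush_tx(q) >= n->tx_burst) {
            virtio_queue_set_notification(q->tx_vq, 0);
            qemu_bh_schedule(q->tx_bh);
        }
    }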
On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote: > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > k->queue_reset > > > > > > > > > virtio_net_queue_reset > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > ..... > > > > > > > > > (callback) virtio_net_tx_complete > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I > > > > > > > > > originally thought it was a general situation, so I wanted to put it in > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue. > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > > We want something much more local, ideally on stack even ... 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > > > > > --- > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete? > > > > > > > > > > > > > > virtio_net_flush_tx may been called by timer. > > We stop timer/bh during device reset, do we need to do the same with vq reset? > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress. > > > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to > > > > > consider that after the flush_or_purge_queued_packets, this process before the > > > > > structure is cleared. > > > > > > > > > > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index) > > > > { > > > > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > > > > if (k->queue_reset) { > > > > k->queue_reset(vdev, queue_index); > > > > } > > > > > > > > __virtio_queue_reset(vdev, queue_index); > > > > } > > > > > > > > > > > > No timers do not run between k->queue_reset and __virtio_queue_reset. > > > > > > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way? > > > > > This is a callback, it is not convenient to pass the parameters. > > > > > > > > > > Thanks > > > > > > > > > > > > How about checking that length is 0? > > > > > > > > > I think that check length is not a good way. This modifys the semantics of 0. > > 0 seems to mean "purge" and > > > It is > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge() > > > will pass 0, and this function is called by many places. > > That's exactly what we want actually, when do purge we don't need a flush? Yes, but I'm not sure. If we stop flush, there will be any other effects. On the other hand, if we use "0" as a judgment condition, do you mean only the implementation of the purge in the flush_or_purge_queued_packets()? 
> > > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue? > > > > > > Thanks. > > > > OK I guess. Jason? > > Not sure, anything different from adding a check in > virtio_net_tx_complete()? (assuming bh and timer is cancelled or > deleted). We would replace the sent_cb with a function that does not flush. Thanks.
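A sketch of what such a replacement might look like, purely for illustration: neither qemu_net_queue_set_sent_cb() nor virtio_net_tx_complete_no_flush() exists in QEMU today, and both names are invented here; the NetPacket fields are as defined inside net/queue.c:

    /* Hypothetical helper in net/queue.c: retarget the completion
     * callback of every packet queued by a given sender. */
    void qemu_net_queue_set_sent_cb(NetQueue *queue, NetClientState *from,
                                    NetPacketSent *sent_cb)
    {
        NetPacket *packet;

        QTAILQ_FOREACH(packet, &queue->packets, entry) {
            if (packet->sender == from) {
                packet->sent_cb = sent_cb;
            }
        }
    }

    /* Hypothetical completion callback for use around a per-queue reset:
     * completes the in-flight element but queues no new transmissions. */
    static void virtio_net_tx_complete_no_flush(NetClientState *nc, ssize_t len)
    {
        VirtIONet *n = qemu_get_nic_opaque(nc);
        VirtIONetQueue *q = virtio_net_get_subqueue(nc);

        virtqueue_push(q->tx_vq, q->async_tx.elem, 0);
        virtio_notify(VIRTIO_DEVICE(n), q->tx_vq);
        g_free(q->async_tx.elem);
        q->async_tx.elem = NULL;
    }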
On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote: > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > > k->queue_reset > > > > > > > > > > virtio_net_queue_reset > > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > > ..... > > > > > > > > > > (callback) virtio_net_tx_complete > > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in > > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue. > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > > > We want something much more local, ideally on stack even ... 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > > > > > > --- > > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete? > > > > > > > > > > > > > > > > virtio_net_flush_tx may been called by timer. > > > > We stop timer/bh during device reset, do we need to do the same with vq reset? > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress. > > > > > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to > > > > > > consider that after the flush_or_purge_queued_packets, this process before the > > > > > > structure is cleared. > > > > > > > > > > > > > > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index) > > > > > { > > > > > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > > > > > > if (k->queue_reset) { > > > > > k->queue_reset(vdev, queue_index); > > > > > } > > > > > > > > > > __virtio_queue_reset(vdev, queue_index); > > > > > } > > > > > > > > > > > > > > > No timers do not run between k->queue_reset and __virtio_queue_reset. > > > > > > > > > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way? > > > > > > This is a callback, it is not convenient to pass the parameters. > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > How about checking that length is 0? > > > > > > > > > > > > I think that check length is not a good way. This modifys the semantics of 0. > > > > 0 seems to mean "purge" and > > > > > It is > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge() > > > > will pass 0, and this function is called by many places. > > > > That's exactly what we want actually, when do purge we don't need a flush? > > Yes, but I'm not sure. If we stop flush, there will be any other effects. 
So we did: virtio_net_queue_reset(): nc = qemu_get_subqueue(n->nic, vq2q(queue_index)); flush_or_purge_queued_packets(nc); qemu_flush_or_purge_queued_packets(nc->peer, true); // [1] if (qemu_net_queue_flush(nc->incoming_queue)) { .... } else if (purge) { qemu_net_queue_purge(nc->incoming_queue, nc->peer); packet->sent_cb() virtio_net_tx_complete() virtio_net_flush_tx() qemu_sendv_packet_async() // [2] } We try to flush the tap's incoming queue and if we fail we will purge in [1]. But the sent_cb() tries to send more packets which could be queued to the tap incoming queue [2]. This breaks the semantic of qemu_flush_or_purge_queued_packets(). > > On the other hand, if we use "0" as a judgment condition, do you mean only the > implementation of the purge in the flush_or_purge_queued_packets()? It should be all the users of qemu_net_queue_purge(). The rest of the users seem all fine: virtio_net_vhost_status(), if we do flush, it may end up touching the vring while vhost is running. filters: all do a flush before. > > > > > > > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue? > > > > > Thanks. > > > OK I guess. Jason? > > > > Not sure, anything different from adding a check in > > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or > > > deleted). > > We replaced the sent_cb with a function without flush. I meant it won't be different from adding an if (virtio_queue_is_reset()) check somewhere in virtio_net_tx_complete()? Thanks
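The "len == 0 means purge" semantic being relied on here comes from net/queue.c, where a purged packet still gets its completion callback, but with a zero length. Roughly (simplified from upstream, so field and macro details are approximate):

    void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
    {
        NetPacket *packet, *next;

        QTAILQ_FOREACH_SAFE(packet, &queue->packets, entry, next) {
            if (packet->sender == from) {
                QTAILQ_REMOVE(&queue->packets, packet, entry);
                queue->nq_count--;
                if (packet->sent_cb) {
                    packet->sent_cb(packet->sender, 0);  /* len == 0 */
                }
                g_free(packet);
            }
        }
    }

This is why a zero length in virtio_net_tx_complete() can distinguish "purged" from "actually sent".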
On Mon, Jan 30, 2023 at 1:50 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Mon, Jan 30, 2023 at 11:53:18AM +0800, Jason Wang wrote: > > On Mon, Jan 30, 2023 at 11:42 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > > > On Mon, 30 Jan 2023 11:01:40 +0800, Jason Wang <jasowang@redhat.com> wrote: > > > > On Sun, Jan 29, 2023 at 3:44 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > > > > > > > On Sun, 29 Jan 2023 14:23:21 +0800, Jason Wang <jasowang@redhat.com> wrote: > > > > > > On Sun, Jan 29, 2023 at 10:52 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > per-queue reset state. > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > --- > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > VirtQueueElement *elem; > > > > > > > int32_t num_packets = 0; > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > We have other places that check DRIVER_OK do we need to check queue > > > > > > reset as well? > > > > > > > > > > I checked it again. I still think that the location of other checking DRIVER_OK > > > > > does not need to check the queue reset. > > > > > > > > For example, if we don't disable can_receive() when the queue is > > > > reset, it means rx may go for virtio_net_receive_rcu(). It means the > > > > Qemu is still trying to process the traffic from the network backend > > > > like tap which may waste cpu cycles. > > > > > > > > I think the correct way is to return false when the queue is reset in > > > > can_receive(), then the backend poll will be disabled (e.g TAP). When > > > > the queue is enabled again, qemu_flush_queued_packets() will wake up > > > > the backend polling. > > > > > > > > Haven't had time to check other places but it would be better to > > > > mention why it doesn't need a check in the changelog.
> > > > > > > > > static bool virtio_net_can_receive(NetClientState *nc) > > > { > > > VirtIONet *n = qemu_get_nic_opaque(nc); > > > VirtIODevice *vdev = VIRTIO_DEVICE(n); > > > VirtIONetQueue *q = virtio_net_get_subqueue(nc); > > > > > > if (!vdev->vm_running) { > > > return false; > > > } > > > > > > if (nc->queue_index >= n->curr_queue_pairs) { > > > return false; > > > } > > > > > > if (!virtio_queue_ready(q->rx_vq) || > > > !(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > return false; > > > } > > > > > > return true; > > > } > > > > > > int virtio_queue_ready(VirtQueue *vq) > > > { > > > return vq->vring.avail != 0; > > > } > > > > > > > > > static void __virtio_queue_reset(VirtIODevice *vdev, uint32_t i) > > > { > > > vdev->vq[i].vring.desc = 0; > > > vdev->vq[i].vring.avail = 0; > > > vdev->vq[i].vring.used = 0; > > > vdev->vq[i].last_avail_idx = 0; > > > vdev->vq[i].shadow_avail_idx = 0; > > > vdev->vq[i].used_idx = 0; > > > vdev->vq[i].last_avail_wrap_counter = true; > > > vdev->vq[i].shadow_avail_wrap_counter = true; > > > vdev->vq[i].used_wrap_counter = true; > > > virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR); > > > vdev->vq[i].signalled_used = 0; > > > vdev->vq[i].signalled_used_valid = false; > > > vdev->vq[i].notification = true; > > > vdev->vq[i].vring.num = vdev->vq[i].vring.num_default; > > > vdev->vq[i].inuse = 0; > > > virtio_virtqueue_reset_region_cache(&vdev->vq[i]); > > > } > > > > > > In the implementation of Per-Queue Reset, for RX, we stop RX by setting vdev->vq[i].vring.avail to 0. > > > > Ok, but this is kind of fragile (especially when vIOMMU is enabled). > > I'd add an explicit check for reset there. > > It's not great in that spec says avail 0 is actually legal. > But I don't really want to see more and more checks. > If we are doing cleanups, the right way is probably a new "live" flag > that transports can set correctly from the combination of > DRIVER_OK, desc, kick, queue_enable, queue_reset and so on. I second this, but for kick, we can only do that if it is mandated by the spec (otherwise we may break drivers silently). Thanks
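The consolidated check Michael proposes might look roughly like the sketch below. virtio_queue_live() is an invented name, and which inputs belong in it (including, per Jason's point, whether kick can be one of them) is exactly the open design question:

    /* Hypothetical predicate combining the conditions under which a
     * virtqueue may actually be processed. */
    static bool virtio_queue_live(VirtIODevice *vdev, VirtQueue *vq)
    {
        int i = virtio_get_queue_index(vq);

        return (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
               virtio_queue_ready(vq) &&          /* desc/avail mapped */
               virtio_queue_enabled(vdev, i) &&   /* queue_enable */
               !virtio_queue_reset_state(vq);     /* helper from this series */
    }

With something like this, the scattered DRIVER_OK / avail != 0 checks in can_receive(), flush_tx() and friends could collapse into one call, which is the cleanup being suggested.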
On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote: > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote: > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > > > k->queue_reset > > > > > > > > > > > virtio_net_queue_reset > > > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > > > ..... > > > > > > > > > > > (callback) virtio_net_tx_complete > > > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in > > > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue. > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > > > > We want something much more local, ideally on stack even ... 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > > > > > > > --- > > > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete? > > > > > > > > > > > > > > > > > > virtio_net_flush_tx may been called by timer. > > > > > > We stop timer/bh during device reset, do we need to do the same with vq reset? > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress. > > > > > > > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to > > > > > > > consider that after the flush_or_purge_queued_packets, this process before the > > > > > > > structure is cleared. > > > > > > > > > > > > > > > > > > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index) > > > > > > { > > > > > > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > > > > > > > > if (k->queue_reset) { > > > > > > k->queue_reset(vdev, queue_index); > > > > > > } > > > > > > > > > > > > __virtio_queue_reset(vdev, queue_index); > > > > > > } > > > > > > > > > > > > > > > > > > No timers do not run between k->queue_reset and __virtio_queue_reset. > > > > > > > > > > > > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way? > > > > > > > This is a callback, it is not convenient to pass the parameters. > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > How about checking that length is 0? > > > > > > > > > > > > > > > I think that check length is not a good way. This modifys the semantics of 0. > > > > > > 0 seems to mean "purge" and > > > > > > > It is > > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge() > > > > > will pass 0, and this function is called by many places. 
> > > > > > That's exactly what we want actually, when do purge we don't need a flush? > > > > Yes, but I'm not sure. If we stop flush, there will be any other effects. > > So we did: > > virtio_net_queue_reset(): > nc = qemu_get_subqueue(n->nic, vq2q(queue_index); > flush_or_purge_queued_packets(nc); > qemu_flush_or_purge_queued_packets(nc->peer, true); // [1] > if (qemu_net_queue_flush(nc->incoming_queue)) { > .... > } else if (purge) { > qemu_net_queue_purge(nc->incoming_queue, nc->peer); > packet->send_cb() > virtio_net_tx_complete() > virtio_net_flush_tx() > qemu_sendv_packet_async() // [2] > } > > We try to flush the tap's incoming queue and if we fail we will purge > in [1]. But the sent_cb() tries to send more packets which could be > queued to the tap incoming queue [2]. This breaks the semantic of > qemu_flush_or_purge_queued_packets(). That analysis sounds right, and I think so too. > > > > On the other hand, if we use "0" as a judgment condition, do you mean only the > > implementation of the purge in the flush_or_purge_queued_packets()? > > It should be all the users of qemu_net_queue_purge(). The rest users > seems all fine: > > virtio_net_vhost_status(), if we do flush, it may end up with touching > vring during vhost is running. > filters: all do a flush before. > > > > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue? > > > > > Thanks. > > > OK I guess. Jason? > > > > Not sure, anything different from adding a check in > > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or > > > deleted). > > > > We replaced the sent_cb with a function without flush. > > I meant it won't be different from adding a > > if (virtio_queue_is_rest()) > > somewhere in virtio_net_tx_complete()? The difference is that it would only use information on the stack, without adding a state variable like disabled_by_reset. Thanks.
On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote: > > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote: > > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > > > > k->queue_reset > > > > > > > > > > > > virtio_net_queue_reset > > > > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > > > > ..... > > > > > > > > > > > > (callback) virtio_net_tx_complete > > > > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I > > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in > > > > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue. > > > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > > > > knows we are in the process of reset would be a bad idea. > > > > > > > > > > > > > We want something much more local, ideally on stack even ... 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete? > > > > > > > > > > > > > > > > > > > > virtio_net_flush_tx may been called by timer. > > > > > > > > We stop timer/bh during device reset, do we need to do the same with vq reset? > > > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress. > > > > > > > > > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to > > > > > > > > consider that after the flush_or_purge_queued_packets, this process before the > > > > > > > > structure is cleared. > > > > > > > > > > > > > > > > > > > > > > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index) > > > > > > > { > > > > > > > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > > > > > > > > > > if (k->queue_reset) { > > > > > > > k->queue_reset(vdev, queue_index); > > > > > > > } > > > > > > > > > > > > > > __virtio_queue_reset(vdev, queue_index); > > > > > > > } > > > > > > > > > > > > > > > > > > > > > No timers do not run between k->queue_reset and __virtio_queue_reset. > > > > > > > > > > > > > > > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way? > > > > > > > > This is a callback, it is not convenient to pass the parameters. > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > How about checking that length is 0? > > > > > > > > > > > > > > > > > > I think that check length is not a good way. This modifys the semantics of 0. 
> > > > > > > > 0 seems to mean "purge" and > > > > > > > > > It is > > > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge() > > > > > > will pass 0, and this function is called by many places. > > > > > > > > That's exactly what we want actually, when do purge we don't need a flush? > > > > > > Yes, but I'm not sure. If we stop flush, there will be any other effects. > > > > So we did: > > > > virtio_net_queue_reset(): > > nc = qemu_get_subqueue(n->nic, vq2q(queue_index); > > flush_or_purge_queued_packets(nc); > > qemu_flush_or_purge_queued_packets(nc->peer, true); // [1] > > if (qemu_net_queue_flush(nc->incoming_queue)) { > > .... > > } else if (purge) { > > qemu_net_queue_purge(nc->incoming_queue, nc->peer); > > packet->send_cb() > > virtio_net_tx_complete() > > virtio_net_flush_tx() > > qemu_sendv_packet_async() // [2] > > } > > > > We try to flush the tap's incoming queue and if we fail we will purge > > in [1]. But the sent_cb() tries to send more packets which could be > > queued to the tap incoming queue [2]. This breaks the semantic of > > qemu_flush_or_purge_queued_packets(). > > Sounds like good news, and I think so too. > > > > > > > > > On the other hand, if we use "0" as a judgment condition, do you mean only the > > > implementation of the purge in the flush_or_purge_queued_packets()? > > > > It should be all the users of qemu_net_queue_purge(). The rest users > > seems all fine: > > > > virtio_net_vhost_status(), if we do flush, it may end up with touching > > vring during vhost is running. > > filters: all do a flush before. > > > > > > > > > > > > > > > > > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue? > > > > > > > > > > > > Thanks. > > > > > > > OK I guess. Jason? > > > > > > > > Not sure, anything different from adding a check in > > > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or > > > > deleted). > > > > > > We replaced the sent_cb with a function without flush. > > > > I meant it won't be different from adding a > > > > if (virtio_queue_is_rest()) > > > > somewhere in virtio_net_tx_complete()? > > > Only modify on the stack, without using a variable like disabled_by_reset. Ok, but per discussion above, it looks to me we can simply check len against zero which seems simpler. (And it may fix other possible bugs) Thanks > > Thanks. > > > > > > Thanks > > > > > > > > Thanks. > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > 2.32.0.3.g01195cf9f > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
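A sketch of the direction agreed here, layered on the completion path shown earlier in this thread: the len == 0 early return is the only addition, and its exact placement relative to virtqueue_push() and the notification handling would need care in the real patch:

    static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
    {
        VirtIONet *n = qemu_get_nic_opaque(nc);
        VirtIONetQueue *q = virtio_net_get_subqueue(nc);
        VirtIODevice *vdev = VIRTIO_DEVICE(n);

        virtqueue_push(q->tx_vq, q->async_tx.elem, 0);
        virtio_notify(vdev, q->tx_vq);
        g_free(q->async_tx.elem);
        q->async_tx.elem = NULL;

        /* A zero length means the packet was purged (e.g. by a per-queue
         * reset) rather than sent: complete the in-flight element, but
         * queue no new transmissions. */
        if (len == 0) {
            return;
        }

        virtio_queue_set_notification(q->tx_vq, 1);
        if (virtio_net_flush_tx(q) >= n->tx_burst) {
            virtio_queue_set_notification(q->tx_vq, 0);
            qemu_bh_schedule(q->tx_bh);
        }
    }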
On Tue, 31 Jan 2023 11:27:42 +0800, Jason Wang <jasowang@redhat.com> wrote: > On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang <jasowang@redhat.com> wrote: > > > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote: > > > > > > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote: > > > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > > > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote: > > > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote: > > > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote: > > > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote: > > > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx(). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this > > > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send > > > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current > > > > > > > > > > > > > > > per-queue reset state. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does "at this time" mean here? > > > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets? > > > > > > > > > > > > > > > > > > > > > > > > > > Yes > > > > > > > > > > > > > > > > > > > > > > > > > > virtio_queue_reset > > > > > > > > > > > > > k->queue_reset > > > > > > > > > > > > > virtio_net_queue_reset > > > > > > > > > > > > > flush_or_purge_queued_packets > > > > > > > > > > > > > qemu_flush_or_purge_queued_packets > > > > > > > > > > > > > ..... > > > > > > > > > > > > > (callback) virtio_net_tx_complete > > > > > > > > > > > > > virtio_net_flush_tx <-- here send new packet. We need stop it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I > > > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in > > > > > > > > > > > > > struct VirtQueue. > > > > > > > > > > > > > > > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue. > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called > > > > > > > > > > > > with length 0 here? Are there other cases where length is 0? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What does the call stack look like? > > > > > > > > > > > > > > > > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx > > > > > > > > > > > > > > knows we are in the process of reset would be a bad idea. 
> > > > > > > > > > > > > > We want something much more local, ideally on stack even ... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset") > > > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451 > > > > > > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu> > > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > > hw/net/virtio-net.c | 3 ++- > > > > > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644 > > > > > > > > > > > > > > > --- a/hw/net/virtio-net.c > > > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c > > > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) > > > > > > > > > > > > > > > VirtQueueElement *elem; > > > > > > > > > > > > > > > int32_t num_packets = 0; > > > > > > > > > > > > > > > int queue_index = vq2q(virtio_get_queue_index(q->tx_vq)); > > > > > > > > > > > > > > > - if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) { > > > > > > > > > > > > > > > + if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) || > > > > > > > > > > > > > > > + virtio_queue_reset_state(q->tx_vq)) { > > > > > > > > > > > > > > > > > > > > > > > > btw this sounds like you are asking it to reset some state. > > > > > > > > > > > > > > > > > > > > > > > > > > > return num_packets; > > > > > > > > > > > > > > > > > > > > > > > > and then > > > > > > > > > > > > > > > > > > > > > > > > ret = virtio_net_flush_tx(q); > > > > > > > > > > > > if (ret >= n->tx_burst) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > will reschedule automatically won't it? > > > > > > > > > > > > > > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete? > > > > > > > > > > > > > > > > > > > > > > virtio_net_flush_tx may been called by timer. > > > > > > > > > > We stop timer/bh during device reset, do we need to do the same with vq reset? > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress. > > > > > > > > > > > > > > > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to > > > > > > > > > consider that after the flush_or_purge_queued_packets, this process before the > > > > > > > > > structure is cleared. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index) > > > > > > > > { > > > > > > > > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > > > > > > > > > > > > > > > > if (k->queue_reset) { > > > > > > > > k->queue_reset(vdev, queue_index); > > > > > > > > } > > > > > > > > > > > > > > > > __virtio_queue_reset(vdev, queue_index); > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > No timers do not run between k->queue_reset and __virtio_queue_reset. > > > > > > > > > > > > > > > > > > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way? > > > > > > > > > This is a callback, it is not convenient to pass the parameters. 
> > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > How about checking that length is 0? > > > > > > > > > > > > > > > > > > > > > I think that check length is not a good way. This modifys the semantics of 0. > > > > > > > > > > 0 seems to mean "purge" and > > > > > > > > > > > It is > > > > > > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge() > > > > > > > will pass 0, and this function is called by many places. > > > > > > > > > > That's exactly what we want actually, when do purge we don't need a flush? > > > > > > > > Yes, but I'm not sure. If we stop flush, there will be any other effects. > > > > > > So we did: > > > > > > virtio_net_queue_reset(): > > > nc = qemu_get_subqueue(n->nic, vq2q(queue_index); > > > flush_or_purge_queued_packets(nc); > > > qemu_flush_or_purge_queued_packets(nc->peer, true); // [1] > > > if (qemu_net_queue_flush(nc->incoming_queue)) { > > > .... > > > } else if (purge) { > > > qemu_net_queue_purge(nc->incoming_queue, nc->peer); > > > packet->send_cb() > > > virtio_net_tx_complete() > > > virtio_net_flush_tx() > > > qemu_sendv_packet_async() // [2] > > > } > > > > > > We try to flush the tap's incoming queue and if we fail we will purge > > > in [1]. But the sent_cb() tries to send more packets which could be > > > queued to the tap incoming queue [2]. This breaks the semantic of > > > qemu_flush_or_purge_queued_packets(). > > > > Sounds like good news, and I think so too. > > > > > > > > > > > > > On the other hand, if we use "0" as a judgment condition, do you mean only the > > > > implementation of the purge in the flush_or_purge_queued_packets()? > > > > > > It should be all the users of qemu_net_queue_purge(). The rest users > > > seems all fine: > > > > > > virtio_net_vhost_status(), if we do flush, it may end up with touching > > > vring during vhost is running. > > > filters: all do a flush before. > > > > > > > > > > > > > > > > > > > > > > > > > > How about we add an api in queue.c to replace the sent_cb callback on queue? > > > > > > > > > > > > > > Thanks. > > > > > > > > OK I guess. Jason? > > > > > > > > > > Not sure, anything different from adding a check in > > > > > virtio_net_tx_complete()? (assuming bh and timer is cancelled or > > > > > deleted). > > > > > > > > We replaced the sent_cb with a function without flush. > > > > > > I meant it won't be different from adding a > > > > > > if (virtio_queue_is_rest()) > > > > > > somewhere in virtio_net_tx_complete()? > > > > > > Only modify on the stack, without using a variable like disabled_by_reset. > > Ok, but per discussion above, it looks to me we can simply check len > against zero which seems simpler. (And it may fix other possible bugs) Yes, if '0' means purge and we should not flush new packet for purge, that is a good way. I will post patch soon. Thanks. > > Thanks > > > > > Thanks. > > > > > > > > > > Thanks > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > 2.32.0.3.g01195cf9f > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > >
> > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > per-queue reset state.
> > > > > > > > > >
> > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > >
> > > > > > > > > Yes
> > > > > > > > >
> > > > > > > > > virtio_queue_reset
> > > > > > > > >     k->queue_reset
> > > > > > > > >         virtio_net_queue_reset
> > > > > > > > >             flush_or_purge_queued_packets
> > > > > > > > >                 qemu_flush_or_purge_queued_packets
> > > > > > > > >                     .....
> > > > > > > > >                     (callback) virtio_net_tx_complete
> > > > > > > > >                         virtio_net_flush_tx <-- here send new packet. We need stop it.
> > > > > > > > >
> > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > struct VirtQueue.
> > > > > > > > >
> > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > >
> > > > > > > > > > What does the call stack look like?
> > > > > > > > > >
> > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > >
> > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > Reported-by: Alexander Bulekov <alxndr@bu.edu>
> > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > >
> > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > >
> > > > > > > > > > >          return num_packets;
> > > > > > > >
> > > > > > > > and then
> > > > > > > >
> > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > >
> > > > > > > > will reschedule automatically won't it?
> > > > > > > >
> > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > >
> > > > > > > virtio_net_flush_tx may been called by timer.
>
> We stop timer/bh during device reset, do we need to do the same with vq reset?

Should I stop timer/bh?

Thanks.

> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > >
> > > > > Is timer not executed during the VMEXIT process? Otherwise, we still have to
> > > > > consider that after the flush_or_purge_queued_packets, this process before the
> > > > > structure is cleared.
> > > >
> > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > {
> > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > >
> > > >     if (k->queue_reset) {
> > > >         k->queue_reset(vdev, queue_index);
> > > >     }
> > > >
> > > >     __virtio_queue_reset(vdev, queue_index);
> > > > }
> > > >
> > > > No timers do not run between k->queue_reset and __virtio_queue_reset.
> > > >
> > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > This is a callback, it is not convenient to pass the parameters.
> > > > >
> > > > > Thanks
> > > >
> > > > How about checking that length is 0?
> > >
> > > I think that check length is not a good way. This modifys the semantics of 0.
> >
> > 0 seems to mean "purge" and
> >
> > > It is
> > > not friendly to the future maintenance. On the other hand, qemu_net_queue_purge()
> > > will pass 0, and this function is called by many places.
> >
> > That's exactly what we want actually, when do purge we don't need a flush?
> >
> > > How about we add an api in queue.c to replace the sent_cb callback on queue?
> > >
> > > Thanks.
> >
> > OK I guess. Jason?
>
> Not sure, anything different from adding a check in
> virtio_net_tx_complete()? (assuming bh and timer is cancelled or
> deleted).
>
> Thanks
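As background for the "check that length is 0" idea: when the peer's incoming
queue is purged rather than flushed, net/queue.c completes every pending
packet through its sent_cb with a length of 0, so len == 0 in
virtio_net_tx_complete() already distinguishes "purged" from "sent" without
any extra per-queue state. Roughly, paraphrased from net/queue.c (consult the
tree for the exact code):

void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
{
    NetPacket *packet, *next;

    QTAILQ_FOREACH_SAFE(packet, &queue->packets, entry, next) {
        if (packet->sender == from) {
            QTAILQ_REMOVE(&queue->packets, packet, entry);
            queue->nq_count--;
            if (packet->sent_cb) {
                /* a purged packet is completed with len == 0, not its size */
                packet->sent_cb(packet->sender, 0);
            }
            g_free(packet);
        }
    }
}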