Message ID | 1648621997-22416-1-git-send-email-si-wei.liu@oracle.com (mailing list archive) |
---|---|
Headers | show |
Series | vhost-vdpa multiqueue fixes | expand |
在 2022/3/30 14:33, Si-Wei Liu 写道: > Hi, > > This patch series attempt to fix a few issues in vhost-vdpa multiqueue functionality. > > Patch #1 is the formal submission for RFC patch in: > https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/ > > Patch #2 and #3 were taken from a previous patchset posted on qemu-devel: > https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/ > > albeit abandoned, two patches in that set turn out to be useful for patch #4, which is to fix a QEMU crash due to race condition. > > Patch #5 through #7 are obviously small bug fixes. Please find the description of each in the commit log. > > Thanks, > -Siwei Hi Si Wei: I think we need another version of this series? Thanks > > --- > > Eugenio Pérez (2): > virtio-net: Fix indentation > virtio-net: Only enable userland vq if using tap backend > > Si-Wei Liu (5): > virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa > virtio: don't read pending event on host notifier if disabled > vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa > vhost-net: fix improper cleanup in vhost_net_start > vhost-vdpa: backend feature should set only once > > hw/net/vhost_net.c | 4 +++- > hw/net/virtio-net.c | 25 +++++++++++++++++++++---- > hw/virtio/vhost-vdpa.c | 2 +- > hw/virtio/virtio-bus.c | 3 ++- > hw/virtio/virtio.c | 21 +++++++++++++-------- > include/hw/virtio/virtio.h | 2 ++ > net/vhost-vdpa.c | 4 +++- > 7 files changed, 45 insertions(+), 16 deletions(-) >
On 4/26/2022 9:28 PM, Jason Wang wrote: > > 在 2022/3/30 14:33, Si-Wei Liu 写道: >> Hi, >> >> This patch series attempt to fix a few issues in vhost-vdpa >> multiqueue functionality. >> >> Patch #1 is the formal submission for RFC patch in: >> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$ >> >> Patch #2 and #3 were taken from a previous patchset posted on >> qemu-devel: >> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$ >> >> albeit abandoned, two patches in that set turn out to be useful for >> patch #4, which is to fix a QEMU crash due to race condition. >> >> Patch #5 through #7 are obviously small bug fixes. Please find the >> description of each in the commit log. >> >> Thanks, >> -Siwei > > > Hi Si Wei: > > I think we need another version of this series? Hi Jason, Apologies for the long delay. I was in the middle of reworking the patch "virtio: don't read pending event on host notifier if disabled", but found out that it would need quite some code change for the userspace fallback handler to work properly (for the queue no. change case specifically). I have to drop it from the series and posted it later on when ready. Will post a v2 with relevant patches removed. Regards, -Siwei > > Thanks > > >> >> --- >> >> Eugenio Pérez (2): >> virtio-net: Fix indentation >> virtio-net: Only enable userland vq if using tap backend >> >> Si-Wei Liu (5): >> virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa >> virtio: don't read pending event on host notifier if disabled >> vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa >> vhost-net: fix improper cleanup in vhost_net_start >> vhost-vdpa: backend feature should set only once >> >> hw/net/vhost_net.c | 4 +++- >> hw/net/virtio-net.c | 25 +++++++++++++++++++++---- >> hw/virtio/vhost-vdpa.c | 2 +- >> hw/virtio/virtio-bus.c | 3 ++- >> hw/virtio/virtio.c | 21 +++++++++++++-------- >> include/hw/virtio/virtio.h | 2 ++ >> net/vhost-vdpa.c | 4 +++- >> 7 files changed, 45 insertions(+), 16 deletions(-) >> >
On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 4/26/2022 9:28 PM, Jason Wang wrote: > > > > 在 2022/3/30 14:33, Si-Wei Liu 写道: > >> Hi, > >> > >> This patch series attempt to fix a few issues in vhost-vdpa > >> multiqueue functionality. > >> > >> Patch #1 is the formal submission for RFC patch in: > >> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$ > >> > >> Patch #2 and #3 were taken from a previous patchset posted on > >> qemu-devel: > >> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$ > >> > >> albeit abandoned, two patches in that set turn out to be useful for > >> patch #4, which is to fix a QEMU crash due to race condition. > >> > >> Patch #5 through #7 are obviously small bug fixes. Please find the > >> description of each in the commit log. > >> > >> Thanks, > >> -Siwei > > > > > > Hi Si Wei: > > > > I think we need another version of this series? > Hi Jason, > > Apologies for the long delay. I was in the middle of reworking the patch > "virtio: don't read pending event on host notifier if disabled", but > found out that it would need quite some code change for the userspace > fallback handler to work properly (for the queue no. change case > specifically). We probably need this fix for -stable, so I wonder if we can have a workaround first and do refactoring on top? > I have to drop it from the series and posted it later on > when ready. Will post a v2 with relevant patches removed. Thanks > > Regards, > -Siwei > > > > > Thanks > > > > > >> > >> --- > >> > >> Eugenio Pérez (2): > >> virtio-net: Fix indentation > >> virtio-net: Only enable userland vq if using tap backend > >> > >> Si-Wei Liu (5): > >> virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa > >> virtio: don't read pending event on host notifier if disabled > >> vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa > >> vhost-net: fix improper cleanup in vhost_net_start > >> vhost-vdpa: backend feature should set only once > >> > >> hw/net/vhost_net.c | 4 +++- > >> hw/net/virtio-net.c | 25 +++++++++++++++++++++---- > >> hw/virtio/vhost-vdpa.c | 2 +- > >> hw/virtio/virtio-bus.c | 3 ++- > >> hw/virtio/virtio.c | 21 +++++++++++++-------- > >> include/hw/virtio/virtio.h | 2 ++ > >> net/vhost-vdpa.c | 4 +++- > >> 7 files changed, 45 insertions(+), 16 deletions(-) > >> > > >
On 4/27/2022 1:38 AM, Jason Wang wrote: > On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 4/26/2022 9:28 PM, Jason Wang wrote: >>> 在 2022/3/30 14:33, Si-Wei Liu 写道: >>>> Hi, >>>> >>>> This patch series attempt to fix a few issues in vhost-vdpa >>>> multiqueue functionality. >>>> >>>> Patch #1 is the formal submission for RFC patch in: >>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$ >>>> >>>> Patch #2 and #3 were taken from a previous patchset posted on >>>> qemu-devel: >>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$ >>>> >>>> albeit abandoned, two patches in that set turn out to be useful for >>>> patch #4, which is to fix a QEMU crash due to race condition. >>>> >>>> Patch #5 through #7 are obviously small bug fixes. Please find the >>>> description of each in the commit log. >>>> >>>> Thanks, >>>> -Siwei >>> >>> Hi Si Wei: >>> >>> I think we need another version of this series? >> Hi Jason, >> >> Apologies for the long delay. I was in the middle of reworking the patch >> "virtio: don't read pending event on host notifier if disabled", but >> found out that it would need quite some code change for the userspace >> fallback handler to work properly (for the queue no. change case >> specifically). > We probably need this fix for -stable, so I wonder if we can have a > workaround first and do refactoring on top? Hmmm, a nasty fix but may well address the segfault is something like this: diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 8ca0b80..3ac93a4 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -368,6 +368,10 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status) int i; uint8_t queue_status; + if (n->status_pending) + return; + + n->status_pending = true; virtio_net_vnet_endian_status(n, status); virtio_net_vhost_status(n, status); @@ -416,6 +420,7 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status) } } } + n->status_pending = false; } static void virtio_net_set_link_status(NetClientState *nc) diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h index eb87032..95efea8 100644 --- a/include/hw/virtio/virtio-net.h +++ b/include/hw/virtio/virtio-net.h @@ -216,6 +216,7 @@ struct VirtIONet { VirtioNetRssData rss_data; struct NetRxPkt *rx_pkt; struct EBPFRSSContext ebpf_rss; + bool status_pending; }; void virtio_net_set_netclient_name(VirtIONet *n, const char *name, To be honest, I am not sure if this is worth a full blown fix to make it completely work. Without applying vq suspend patch (the one I posted in https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/), it's very hard for me to effectively verify my code change - it's very easy for the guest vq index to be out of sync if not stopping the vq once the vhost is up and running (I tested it with repeatedly set_link off and on). I am not sure if there's real chance we can run into issue in practice due to the incomplete fix, if we don't fix the vq stop/suspend issue first. Anyway I will try, as other use case e.g, live migration is likely to get stumbled on it, too. -Siwei > >> I have to drop it from the series and posted it later on >> when ready. Will post a v2 with relevant patches removed. > Thanks > >> Regards, >> -Siwei >> >>> Thanks >>> >>> >>>> --- >>>> >>>> Eugenio Pérez (2): >>>> virtio-net: Fix indentation >>>> virtio-net: Only enable userland vq if using tap backend >>>> >>>> Si-Wei Liu (5): >>>> virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa >>>> virtio: don't read pending event on host notifier if disabled >>>> vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa >>>> vhost-net: fix improper cleanup in vhost_net_start >>>> vhost-vdpa: backend feature should set only once >>>> >>>> hw/net/vhost_net.c | 4 +++- >>>> hw/net/virtio-net.c | 25 +++++++++++++++++++++---- >>>> hw/virtio/vhost-vdpa.c | 2 +- >>>> hw/virtio/virtio-bus.c | 3 ++- >>>> hw/virtio/virtio.c | 21 +++++++++++++-------- >>>> include/hw/virtio/virtio.h | 2 ++ >>>> net/vhost-vdpa.c | 4 +++- >>>> 7 files changed, 45 insertions(+), 16 deletions(-) >>>>
On Wed, Apr 27, 2022 at 5:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 4/27/2022 1:38 AM, Jason Wang wrote: > > On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> > >> > >> On 4/26/2022 9:28 PM, Jason Wang wrote: > >>> 在 2022/3/30 14:33, Si-Wei Liu 写道: > >>>> Hi, > >>>> > >>>> This patch series attempt to fix a few issues in vhost-vdpa > >>>> multiqueue functionality. > >>>> > >>>> Patch #1 is the formal submission for RFC patch in: > >>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$ > >>>> > >>>> Patch #2 and #3 were taken from a previous patchset posted on > >>>> qemu-devel: > >>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$ > >>>> > >>>> albeit abandoned, two patches in that set turn out to be useful for > >>>> patch #4, which is to fix a QEMU crash due to race condition. > >>>> > >>>> Patch #5 through #7 are obviously small bug fixes. Please find the > >>>> description of each in the commit log. > >>>> > >>>> Thanks, > >>>> -Siwei > >>> > >>> Hi Si Wei: > >>> > >>> I think we need another version of this series? > >> Hi Jason, > >> > >> Apologies for the long delay. I was in the middle of reworking the patch > >> "virtio: don't read pending event on host notifier if disabled", but > >> found out that it would need quite some code change for the userspace > >> fallback handler to work properly (for the queue no. change case > >> specifically). > > We probably need this fix for -stable, so I wonder if we can have a > > workaround first and do refactoring on top? > Hmmm, a nasty fix but may well address the segfault is something like this: > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > index 8ca0b80..3ac93a4 100644 > --- a/hw/net/virtio-net.c > +++ b/hw/net/virtio-net.c > @@ -368,6 +368,10 @@ static void virtio_net_set_status(struct > VirtIODevice *vdev, uint8_t status) > int i; > uint8_t queue_status; > > + if (n->status_pending) > + return; > + > + n->status_pending = true; > virtio_net_vnet_endian_status(n, status); > virtio_net_vhost_status(n, status); > > @@ -416,6 +420,7 @@ static void virtio_net_set_status(struct > VirtIODevice *vdev, uint8_t status) > } > } > } > + n->status_pending = false; > } > > static void virtio_net_set_link_status(NetClientState *nc) > diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h > index eb87032..95efea8 100644 > --- a/include/hw/virtio/virtio-net.h > +++ b/include/hw/virtio/virtio-net.h > @@ -216,6 +216,7 @@ struct VirtIONet { > VirtioNetRssData rss_data; > struct NetRxPkt *rx_pkt; > struct EBPFRSSContext ebpf_rss; > + bool status_pending; > }; > > void virtio_net_set_netclient_name(VirtIONet *n, const char *name, > > To be honest, I am not sure if this is worth a full blown fix to make it > completely work. Without applying vq suspend patch (the one I posted in > https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/), > it's very hard for me to effectively verify my code change - it's very > easy for the guest vq index to be out of sync if not stopping the vq > once the vhost is up and running (I tested it with repeatedly set_link > off and on). Can we test via vmstop? > I am not sure if there's real chance we can run into issue > in practice due to the incomplete fix, if we don't fix the vq > stop/suspend issue first. Anyway I will try, as other use case e.g, live > migration is likely to get stumbled on it, too. Ok, so I think we probably don't need the "nasty" fix above. Let's fix it with the issue of stop/resume. Thanks > > -Siwei > > > > > >> I have to drop it from the series and posted it later on > >> when ready. Will post a v2 with relevant patches removed. > > Thanks > > > >> Regards, > >> -Siwei > >> > >>> Thanks > >>> > >>> > >>>> --- > >>>> > >>>> Eugenio Pérez (2): > >>>> virtio-net: Fix indentation > >>>> virtio-net: Only enable userland vq if using tap backend > >>>> > >>>> Si-Wei Liu (5): > >>>> virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa > >>>> virtio: don't read pending event on host notifier if disabled > >>>> vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa > >>>> vhost-net: fix improper cleanup in vhost_net_start > >>>> vhost-vdpa: backend feature should set only once > >>>> > >>>> hw/net/vhost_net.c | 4 +++- > >>>> hw/net/virtio-net.c | 25 +++++++++++++++++++++---- > >>>> hw/virtio/vhost-vdpa.c | 2 +- > >>>> hw/virtio/virtio-bus.c | 3 ++- > >>>> hw/virtio/virtio.c | 21 +++++++++++++-------- > >>>> include/hw/virtio/virtio.h | 2 ++ > >>>> net/vhost-vdpa.c | 4 +++- > >>>> 7 files changed, 45 insertions(+), 16 deletions(-) > >>>> >
On 4/28/2022 7:30 PM, Jason Wang wrote: > On Wed, Apr 27, 2022 at 5:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >> >> >> On 4/27/2022 1:38 AM, Jason Wang wrote: >>> On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: >>>> >>>> On 4/26/2022 9:28 PM, Jason Wang wrote: >>>>> 在 2022/3/30 14:33, Si-Wei Liu 写道: >>>>>> Hi, >>>>>> >>>>>> This patch series attempt to fix a few issues in vhost-vdpa >>>>>> multiqueue functionality. >>>>>> >>>>>> Patch #1 is the formal submission for RFC patch in: >>>>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$ >>>>>> >>>>>> Patch #2 and #3 were taken from a previous patchset posted on >>>>>> qemu-devel: >>>>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$ >>>>>> >>>>>> albeit abandoned, two patches in that set turn out to be useful for >>>>>> patch #4, which is to fix a QEMU crash due to race condition. >>>>>> >>>>>> Patch #5 through #7 are obviously small bug fixes. Please find the >>>>>> description of each in the commit log. >>>>>> >>>>>> Thanks, >>>>>> -Siwei >>>>> Hi Si Wei: >>>>> >>>>> I think we need another version of this series? >>>> Hi Jason, >>>> >>>> Apologies for the long delay. I was in the middle of reworking the patch >>>> "virtio: don't read pending event on host notifier if disabled", but >>>> found out that it would need quite some code change for the userspace >>>> fallback handler to work properly (for the queue no. change case >>>> specifically). >>> We probably need this fix for -stable, so I wonder if we can have a >>> workaround first and do refactoring on top? >> Hmmm, a nasty fix but may well address the segfault is something like this: >> >> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c >> index 8ca0b80..3ac93a4 100644 >> --- a/hw/net/virtio-net.c >> +++ b/hw/net/virtio-net.c >> @@ -368,6 +368,10 @@ static void virtio_net_set_status(struct >> VirtIODevice *vdev, uint8_t status) >> int i; >> uint8_t queue_status; >> >> + if (n->status_pending) >> + return; >> + >> + n->status_pending = true; >> virtio_net_vnet_endian_status(n, status); >> virtio_net_vhost_status(n, status); >> >> @@ -416,6 +420,7 @@ static void virtio_net_set_status(struct >> VirtIODevice *vdev, uint8_t status) >> } >> } >> } >> + n->status_pending = false; >> } >> >> static void virtio_net_set_link_status(NetClientState *nc) >> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h >> index eb87032..95efea8 100644 >> --- a/include/hw/virtio/virtio-net.h >> +++ b/include/hw/virtio/virtio-net.h >> @@ -216,6 +216,7 @@ struct VirtIONet { >> VirtioNetRssData rss_data; >> struct NetRxPkt *rx_pkt; >> struct EBPFRSSContext ebpf_rss; >> + bool status_pending; >> }; >> >> void virtio_net_set_netclient_name(VirtIONet *n, const char *name, >> >> To be honest, I am not sure if this is worth a full blown fix to make it >> completely work. Without applying vq suspend patch (the one I posted in >> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/__;!!ACWV5N9M2RV99hQ!L4qque3YpPr-CGp12NrNdMMT1HROfEY_Juw2vnfZXHjOhtT0XJCR9GB8cvWEbJL9Aeh-WhDogBVArJn91P0$ ), >> it's very hard for me to effectively verify my code change - it's very >> easy for the guest vq index to be out of sync if not stopping the vq >> once the vhost is up and running (I tested it with repeatedly set_link >> off and on). > Can we test via vmstop? Yes, of coz, that's where the segfault happened. The tight loop of set_link on/off doesn't even work for the single queue case, hence that's why I doubted it ever worked for vhost-vdpa. > >> I am not sure if there's real chance we can run into issue >> in practice due to the incomplete fix, if we don't fix the vq >> stop/suspend issue first. Anyway I will try, as other use case e.g, live >> migration is likely to get stumbled on it, too. > Ok, so I think we probably don't need the "nasty" fix above. Let's fix > it with the issue of stop/resume. Ok, then does below tentative code change suffice the need? i.e. it would fail the request of changing queue pairs when the vhost-vdpa backend falls back to the userspace handler, but it's probably the easiest way to fix the vmstop segfault. diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index ed231f9..8ba9f09 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -1177,6 +1177,7 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd, struct virtio_net_ctrl_mq mq; size_t s; uint16_t queue_pairs; + NetClientState *nc = qemu_get_queue(n->nic); s = iov_to_buf(iov, iov_cnt, 0, &mq, sizeof(mq)); if (s != sizeof(mq)) { @@ -1196,6 +1197,13 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } + /* avoid changing the number of queue_pairs for vdpa device in + * userspace handler. + * TODO: get userspace fallback to work with future patch */ + if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { + return VIRTIO_NET_ERR; + } + n->curr_queue_pairs = queue_pairs; /* stop the backend before changing the number of queue_pairs to avoid handling a * disabled queue */ Thanks, -Siwei > > Thanks > >> -Siwei >> >> >>>> I have to drop it from the series and posted it later on >>>> when ready. Will post a v2 with relevant patches removed. >>> Thanks >>> >>>> Regards, >>>> -Siwei >>>> >>>>> Thanks >>>>> >>>>> >>>>>> --- >>>>>> >>>>>> Eugenio Pérez (2): >>>>>> virtio-net: Fix indentation >>>>>> virtio-net: Only enable userland vq if using tap backend >>>>>> >>>>>> Si-Wei Liu (5): >>>>>> virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa >>>>>> virtio: don't read pending event on host notifier if disabled >>>>>> vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa >>>>>> vhost-net: fix improper cleanup in vhost_net_start >>>>>> vhost-vdpa: backend feature should set only once >>>>>> >>>>>> hw/net/vhost_net.c | 4 +++- >>>>>> hw/net/virtio-net.c | 25 +++++++++++++++++++++---- >>>>>> hw/virtio/vhost-vdpa.c | 2 +- >>>>>> hw/virtio/virtio-bus.c | 3 ++- >>>>>> hw/virtio/virtio.c | 21 +++++++++++++-------- >>>>>> include/hw/virtio/virtio.h | 2 ++ >>>>>> net/vhost-vdpa.c | 4 +++- >>>>>> 7 files changed, 45 insertions(+), 16 deletions(-) >>>>>>
On Sat, Apr 30, 2022 at 10:07 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > On 4/28/2022 7:30 PM, Jason Wang wrote: > > On Wed, Apr 27, 2022 at 5:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >> > >> > >> On 4/27/2022 1:38 AM, Jason Wang wrote: > >>> On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > >>>> > >>>> On 4/26/2022 9:28 PM, Jason Wang wrote: > >>>>> 在 2022/3/30 14:33, Si-Wei Liu 写道: > >>>>>> Hi, > >>>>>> > >>>>>> This patch series attempt to fix a few issues in vhost-vdpa > >>>>>> multiqueue functionality. > >>>>>> > >>>>>> Patch #1 is the formal submission for RFC patch in: > >>>>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqg3ysNsk$ > >>>>>> > >>>>>> Patch #2 and #3 were taken from a previous patchset posted on > >>>>>> qemu-devel: > >>>>>> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/__;!!ACWV5N9M2RV99hQ!OoUKcyWauHGQOM4MTAUn88TINQo5ZP4aaYyvyUCK9ggrI_L6diSZo5Nmq55moGH769SD87drxQyqc3mXqDs$ > >>>>>> > >>>>>> albeit abandoned, two patches in that set turn out to be useful for > >>>>>> patch #4, which is to fix a QEMU crash due to race condition. > >>>>>> > >>>>>> Patch #5 through #7 are obviously small bug fixes. Please find the > >>>>>> description of each in the commit log. > >>>>>> > >>>>>> Thanks, > >>>>>> -Siwei > >>>>> Hi Si Wei: > >>>>> > >>>>> I think we need another version of this series? > >>>> Hi Jason, > >>>> > >>>> Apologies for the long delay. I was in the middle of reworking the patch > >>>> "virtio: don't read pending event on host notifier if disabled", but > >>>> found out that it would need quite some code change for the userspace > >>>> fallback handler to work properly (for the queue no. change case > >>>> specifically). > >>> We probably need this fix for -stable, so I wonder if we can have a > >>> workaround first and do refactoring on top? > >> Hmmm, a nasty fix but may well address the segfault is something like this: > >> > >> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > >> index 8ca0b80..3ac93a4 100644 > >> --- a/hw/net/virtio-net.c > >> +++ b/hw/net/virtio-net.c > >> @@ -368,6 +368,10 @@ static void virtio_net_set_status(struct > >> VirtIODevice *vdev, uint8_t status) > >> int i; > >> uint8_t queue_status; > >> > >> + if (n->status_pending) > >> + return; > >> + > >> + n->status_pending = true; > >> virtio_net_vnet_endian_status(n, status); > >> virtio_net_vhost_status(n, status); > >> > >> @@ -416,6 +420,7 @@ static void virtio_net_set_status(struct > >> VirtIODevice *vdev, uint8_t status) > >> } > >> } > >> } > >> + n->status_pending = false; > >> } > >> > >> static void virtio_net_set_link_status(NetClientState *nc) > >> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h > >> index eb87032..95efea8 100644 > >> --- a/include/hw/virtio/virtio-net.h > >> +++ b/include/hw/virtio/virtio-net.h > >> @@ -216,6 +216,7 @@ struct VirtIONet { > >> VirtioNetRssData rss_data; > >> struct NetRxPkt *rx_pkt; > >> struct EBPFRSSContext ebpf_rss; > >> + bool status_pending; > >> }; > >> > >> void virtio_net_set_netclient_name(VirtIONet *n, const char *name, > >> > >> To be honest, I am not sure if this is worth a full blown fix to make it > >> completely work. Without applying vq suspend patch (the one I posted in > >> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/__;!!ACWV5N9M2RV99hQ!L4qque3YpPr-CGp12NrNdMMT1HROfEY_Juw2vnfZXHjOhtT0XJCR9GB8cvWEbJL9Aeh-WhDogBVArJn91P0$ ), > >> it's very hard for me to effectively verify my code change - it's very > >> easy for the guest vq index to be out of sync if not stopping the vq > >> once the vhost is up and running (I tested it with repeatedly set_link > >> off and on). > > Can we test via vmstop? > Yes, of coz, that's where the segfault happened. The tight loop of > set_link on/off doesn't even work for the single queue case, hence > that's why I doubted it ever worked for vhost-vdpa. Right, this is something we need to check. Set_link should stop the vhost device anyhow, otherwise it should be a bug. > > > > >> I am not sure if there's real chance we can run into issue > >> in practice due to the incomplete fix, if we don't fix the vq > >> stop/suspend issue first. Anyway I will try, as other use case e.g, live > >> migration is likely to get stumbled on it, too. > > Ok, so I think we probably don't need the "nasty" fix above. Let's fix > > it with the issue of stop/resume. > Ok, then does below tentative code change suffice the need? i.e. it > would fail the request of changing queue pairs when the vhost-vdpa > backend falls back to the userspace handler, but it's probably the > easiest way to fix the vmstop segfault. Probably, let's see. Thanks > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > index ed231f9..8ba9f09 100644 > --- a/hw/net/virtio-net.c > +++ b/hw/net/virtio-net.c > @@ -1177,6 +1177,7 @@ static int virtio_net_handle_mq(VirtIONet *n, > uint8_t cmd, > struct virtio_net_ctrl_mq mq; > size_t s; > uint16_t queue_pairs; > + NetClientState *nc = qemu_get_queue(n->nic); > > s = iov_to_buf(iov, iov_cnt, 0, &mq, sizeof(mq)); > if (s != sizeof(mq)) { > @@ -1196,6 +1197,13 @@ static int virtio_net_handle_mq(VirtIONet *n, > uint8_t cmd, > return VIRTIO_NET_ERR; > } > > + /* avoid changing the number of queue_pairs for vdpa device in > + * userspace handler. > + * TODO: get userspace fallback to work with future patch */ > + if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) { > + return VIRTIO_NET_ERR; > + } > + > n->curr_queue_pairs = queue_pairs; > /* stop the backend before changing the number of queue_pairs to > avoid handling a > * disabled queue */ > > Thanks, > -Siwei > > > > Thanks > > > >> -Siwei > >> > >> > >>>> I have to drop it from the series and posted it later on > >>>> when ready. Will post a v2 with relevant patches removed. > >>> Thanks > >>> > >>>> Regards, > >>>> -Siwei > >>>> > >>>>> Thanks > >>>>> > >>>>> > >>>>>> --- > >>>>>> > >>>>>> Eugenio Pérez (2): > >>>>>> virtio-net: Fix indentation > >>>>>> virtio-net: Only enable userland vq if using tap backend > >>>>>> > >>>>>> Si-Wei Liu (5): > >>>>>> virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa > >>>>>> virtio: don't read pending event on host notifier if disabled > >>>>>> vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa > >>>>>> vhost-net: fix improper cleanup in vhost_net_start > >>>>>> vhost-vdpa: backend feature should set only once > >>>>>> > >>>>>> hw/net/vhost_net.c | 4 +++- > >>>>>> hw/net/virtio-net.c | 25 +++++++++++++++++++++---- > >>>>>> hw/virtio/vhost-vdpa.c | 2 +- > >>>>>> hw/virtio/virtio-bus.c | 3 ++- > >>>>>> hw/virtio/virtio.c | 21 +++++++++++++-------- > >>>>>> include/hw/virtio/virtio.h | 2 ++ > >>>>>> net/vhost-vdpa.c | 4 +++- > >>>>>> 7 files changed, 45 insertions(+), 16 deletions(-) > >>>>>> >