diff mbox series

[2/2] vsock/virtio: Don't reset the created SOCKET during s2r

Message ID 20250207052033.2222629-2-junnan01.wu@samsung.com (mailing list archive)
State New
Series [1/2] vsock/virtio: Move rx_buf_nr and rx_buf_max_nr initialization position

Commit Message

Junnan Wu Feb. 7, 2025, 5:20 a.m. UTC
From: Ying Gao <ying01.gao@samsung.com>

If suspend is executed during vsock communication and the
socket is reset, the original socket will be unusable after resume.

Judge the value of vdev->priv in function virtio_vsock_vqs_del,
only when the function is invoked by virtio_vsock_remove,
all vsock connections will be reset.

Signed-off-by: Ying Gao <ying01.gao@samsung.com>
Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
---
 net/vmw_vsock/virtio_transport.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Luigi Leonardi Feb. 10, 2025, 11:48 a.m. UTC | #1
Like for the other patch, some maintainers have not been CCd.

On Fri, Feb 07, 2025 at 01:20:33PM +0800, Junnan Wu wrote:
>From: Ying Gao <ying01.gao@samsung.com>
>
>If suspend is executed during vsock communication and the
>socket is reset, the original socket will be unusable after resume.
>
>Judge the value of vdev->priv in function virtio_vsock_vqs_del,
>only when the function is invoked by virtio_vsock_remove,
>all vsock connections will be reset.
>
The second part of the commit message is not that clear, do you mind 
rephrasing it?

>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
Missing Co-developed-by?
>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>


>---
> net/vmw_vsock/virtio_transport.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index 9eefd0fba92b..9df609581755 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -717,8 +717,10 @@ static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
> 	struct sk_buff *skb;
>
> 	/* Reset all connected sockets when the VQs disappear */
>-	vsock_for_each_connected_socket(&virtio_transport.transport,
>-					virtio_vsock_reset_sock);
I would add a comment explaining why you are adding this check.
>+	if (!vdev->priv) {
>+		vsock_for_each_connected_socket(&virtio_transport.transport,
>+						virtio_vsock_reset_sock);
>+	}
>
> 	/* Stop all work handlers to make sure no one is accessing the device,
> 	 * so we can safely call virtio_reset_device().
>-- 
>2.34.1
>

I am not familiar with freeze/resume, but I don't see any problems with 
this patch.

Thank you,
Luigi
Stefano Garzarella Feb. 10, 2025, 4:52 p.m. UTC | #2
On Mon, Feb 10, 2025 at 12:48:03PM +0100, leonardi@redhat.com wrote:
>Like for the other patch, some maintainers have not been CCd.

Yes, please use `scripts/get_maintainer.pl`.

>
>On Fri, Feb 07, 2025 at 01:20:33PM +0800, Junnan Wu wrote:
>>From: Ying Gao <ying01.gao@samsung.com>
>>
>>If suspend is executed during vsock communication and the
>>socket is reset, the original socket will be unusable after resume.

Why? (I mean for a good commit description)

>>
>>Judge the value of vdev->priv in function virtio_vsock_vqs_del,
>>only when the function is invoked by virtio_vsock_remove,
>>all vsock connections will be reset.
>>
>The second part of the commit message is not that clear, do you mind 
>rephrasing it?

+1 on that

Also in this case, why checking `vdev->priv` fixes the issue?

>
>>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
>Missing Co-developed-by?
>>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
>
>
>>---
>>net/vmw_vsock/virtio_transport.c | 6 ++++--
>>1 file changed, 4 insertions(+), 2 deletions(-)
>>
>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>index 9eefd0fba92b..9df609581755 100644
>>--- a/net/vmw_vsock/virtio_transport.c
>>+++ b/net/vmw_vsock/virtio_transport.c
>>@@ -717,8 +717,10 @@ static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
>>	struct sk_buff *skb;
>>
>>	/* Reset all connected sockets when the VQs disappear */
>>-	vsock_for_each_connected_socket(&virtio_transport.transport,
>>-					virtio_vsock_reset_sock);
>I would add a comment explaining why you are adding this check.

Yes, please.

>>+	if (!vdev->priv) {
>>+		vsock_for_each_connected_socket(&virtio_transport.transport,
>>+						virtio_vsock_reset_sock);
>>+	}

Okay, after looking at the code I understood why, but please write it 
into the commit next time!

virtio_vsock_vqs_del() is called in 2 cases:
1 - in virtio_vsock_remove(), after setting `vdev->priv` to NULL, since
     the driver is about to be unloaded because the device is, for
     example, removed (hot-unplug)

2 - in virtio_vsock_freeze() when suspending, but in this case
     `vdev->priv` is not touched.

I don't think it is a good idea to rely on that, because it could change 
in the future. It would be better to add a parameter to 
virtio_vsock_vqs_del() to differentiate the 2 use cases.
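
As a rough illustration of the parameter suggested above, here is a minimal userspace sketch in C. It is not kernel code: `struct vsock_model`, `vqs_del_model()`, `remove_model()`, `freeze_model()` and the `reset_sockets` flag are hypothetical names invented for this example. It only shows the caller stating the use case explicitly instead of the callee inferring it from `vdev->priv`; whether the freeze path should skip the reset at all is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the driver state (hypothetical, not the
 * kernel's struct virtio_vsock).
 */
struct vsock_model {
	int connected_sockets;	/* number of currently connected sockets */
	bool vqs_present;	/* whether the virtqueues still exist */
};

/* Hypothetical variant of virtio_vsock_vqs_del(): the caller says
 * explicitly whether connected sockets must be reset.
 */
static void vqs_del_model(struct vsock_model *vsock, bool reset_sockets)
{
	assert(vsock != NULL);

	if (reset_sockets)
		vsock->connected_sockets = 0;

	/* The virtqueues go away in both use cases. */
	vsock->vqs_present = false;
}

/* Use case 1: device removal (e.g. hot-unplug), always reset. */
static void remove_model(struct vsock_model *vsock)
{
	vqs_del_model(vsock, true);
}

/* Use case 2: suspend. Passing false mirrors what the patch intends. */
static void freeze_model(struct vsock_model *vsock)
{
	vqs_del_model(vsock, false);
}
```

With this shape, `remove_model()` resets every connection while `freeze_model()` leaves the connection count untouched, and both tear down the queues.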


That said, I think this patch is wrong:

We are deallocating virtqueues, so all packets that are "in flight" will 
be completely discarded. Our transport (virtqueues) has no mechanism to 
retransmit them, so those packets would be lost forever. So we cannot 
guarantee the reliability of SOCK_STREAM sockets for example.

In any case, after a suspension, many connections will have expired on 
the host anyway, so does it make sense to keep them open in the guest?

If you want to support this use case, you must first provide a way to 
keep those packets somewhere (e.g. by not removing the virtqueues?), 
but I honestly don't understand the use case.

To be clear, this behavior is intended. It is, for example, the same as 
when the VM is suspended by the hypervisor directly, which afterwards 
sends an event to the guest just to close all connections, because it's 
complicated to keep them active.

Thanks,
Stefano

>>
>>	/* Stop all work handlers to make sure no one is accessing the device,
>>	 * so we can safely call virtio_reset_device().
>>-- 
>>2.34.1
>>
>
>I am not familiar with freeze/resume, but I don't see any problems 
>with this patch.
>
>Thank you,
>Luigi
>
Stefano Garzarella Feb. 11, 2025, 8:35 a.m. UTC | #3
Please read the links we already shared with you!!!

No MIME, no links, no compression, no attachments. Just plain text

https://www.kernel.org/doc/html/latest/process/submitting-patches.html#no-mime-no-links-no-compression-no-attachments-just-plain-text


On Tue, 11 Feb 2025 at 06:24, 吴俊南 <junnan01.wu@samsung.com> wrote:
>
> Hello leonardi  and  stefanha:
>
>     Thanks for your review. I will CC the other maintainers in the 
>     next submission. I would also like to discuss the second patch 
>     further.

Why are you sending a v2 if we didn't reach an agreement on the second 
patch?

>
>
>     Firstly, we think our scenarios are quite different.

This is the information that should be put in the commit description.
We are not oracles imagining scenarios....

> Our scenario is virtio-vsock deployed in an embedded environment, where 
> suspend-to-RAM is used to let the system run at low power 
> consumption. In this scenario, the AF_VSOCK socket is created by an 
> upper-layer application in the guest and is not closed when the driver 
> freezes. After restore, the connections that were communicating before 
> the freeze fail, which causes the upper-layer application that relies 
> on the vsock connection to fail. In this mode, the guest has not 
> received the event to close all connections. That is different from 
> what you mentioned.

I mentioned the second scenario just as an example.

>
>
>     Secondly, consider sockets based on a virtio-net device: they are 
>     not closed during freeze.
>
>     Here we did a test:
>
>    Start an iperf server based on virtio-net in the Host.
>    Start an iperf client based on virtio-net in the Guest and keep it 
>    communicating with the server.
>    Suspend the Guest.
>    Resume the Guest.
>
>     With virtio-net, the iperf communication is still working after 
>     these steps, but iperf based on vsock fails. We think vsock 
>     should behave the same way as virtio-net.

I agree that it would be cool, but this patch is not the right way as I 
explained in the previous email.

virtio-net can easily discard packets because it's an Ethernet device.
As I already explained, virtio-vsock guarantees ordering and delivery of 
packets via virtqueues. If these disappear, you have to add something on 
top that keeps track of undelivered packets.

>
>
>     Thirdly, according to the virtio spec, vsock facilitates data 
>     transfer between the guest and device without using the Ethernet 
>     or IP protocols.

What does this have to do with packet loss?
It simply says that vsock does not need a classic TCP/IP stack, but 
directly connects guest and host sockets via virtqueues.

>
>     Therefore we think packet loss is acceptable for it, and it is 
>     not necessary to keep those packets during the suspend flow.

Where did you read that packet loss is acceptable?

From https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-4800006

  5.10.6.2 Addressing

  ...

  Currently stream and seqpacket sockets are supported. type is 1
  (VIRTIO_VSOCK_TYPE_STREAM) for stream socket types, and 2
  (VIRTIO_VSOCK_TYPE_SEQPACKET) for seqpacket socket types.

  #define VIRTIO_VSOCK_TYPE_STREAM    1
  #define VIRTIO_VSOCK_TYPE_SEQPACKET 2

  Stream sockets provide in-order, guaranteed, connection-oriented
  delivery without message boundaries. Seqpacket sockets provide
  in-order, guaranteed, connection-oriented delivery with message and
  record boundaries.


Please explain in the commit description how this change ensures the 
requirements of the specification: "in-order, guaranteed, 
connection-oriented delivery".

Thanks,
Stefano


>
>
>  Best Wish
>
Junnan Wu Feb. 12, 2025, 4:48 a.m. UTC | #4
>On Mon, Feb 10, 2025 at 12:48:03PM +0100, leonardi@redhat.com wrote:
>>Like for the other patch, some maintainers have not been CCd.
>
>Yes, please use `scripts/get_maintainer.pl`.
>

OK, I will use this script to add the other maintainers in the next submission.
 
>>
>>On Fri, Feb 07, 2025 at 01:20:33PM +0800, Junnan Wu wrote:
>>>From: Ying Gao <ying01.gao@samsung.com>
>>>
>>>If suspend is executed during vsock communication and the
>>>socket is reset, the original socket will be unusable after resume.
>
>Why? (I mean for a good commit description)
>
>>>
>>>Judge the value of vdev->priv in function virtio_vsock_vqs_del,
>>>only when the function is invoked by virtio_vsock_remove,
>>>all vsock connections will be reset.
>>>
>>The second part of the commit message is not that clear, do you mind 
>>rephrasing it?
>
>+1 on that
>

OK, I will rephrase it in the next version.

>Also in this case, why checking `vdev->priv` fixes the issue?
>
>>
>>>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
>>Missing Co-developed-by?
>>>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
>>
>>
>>>---
>>>net/vmw_vsock/virtio_transport.c | 6 ++++--
>>>1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>>index 9eefd0fba92b..9df609581755 100644
>>>--- a/net/vmw_vsock/virtio_transport.c
>>>+++ b/net/vmw_vsock/virtio_transport.c
>>>@@ -717,8 +717,10 @@ static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
>>>	struct sk_buff *skb;
>>>
>>>	/* Reset all connected sockets when the VQs disappear */
>>>-	vsock_for_each_connected_socket(&virtio_transport.transport,
>>>-					virtio_vsock_reset_sock);
>>I would add a comment explaining why you are adding this check.
>
>Yes, please.
>

OK, I will add a comment here in the next version.

>>>+	if (!vdev->priv) {
>>>+		vsock_for_each_connected_socket(&virtio_transport.transport,
>>>+						virtio_vsock_reset_sock);
>>>+	}
>
>Okay, after looking at the code I understood why, but please write it 
>into the commit next time!
>
>virtio_vsock_vqs_del() is called in 2 cases:
>1 - in virtio_vsock_remove() after setting `vdev->priv` to null since
>     the drive is about to be unloaded because the device is for example
>     removed (hot-unplug)
>
>2 - in virtio_vsock_freeze() when suspending, but in this case
>     `vdev->priv` is not touched.
>
>I don't think is a good idea using that because in the future it could 
>change. So better to add a parameter to virtio_vsock_vqs_del() to 
>differentiate the 2 use cases.
>
>
>That said, I think this patch is wrong:
>
>We are deallocating virtqueues, so all packets that are "in flight" will 
>be completely discarded. Our transport (virtqueues) has no mechanism to 
>retransmit them, so those packets would be lost forever. So we cannot 
>guarantee the reliability of SOCK_STREAM sockets for example.
>
>In any case, after a suspension, many connections will be expired in the 
>host anyway, so does it make sense to keep them open in the guest?
>

If the host still holds the vsock connection during suspend,
I think the guest should keep it open in this case.

We found a scenario where the freeze happens while a vsock connection
is communicating. After restore, the upper-layer application tries to
continue sending messages via vsock, and the error `ENOTCONN` is
returned by the function `vsock_connectible_sendmsg`. But the host is not
aware of this and keeps waiting to receive messages on the old connection.
If the host doesn't close the old connection, the guest can never
connect to the host via vsock again, because the error `EPIPE` is returned.

If we freeze vsock after the send and receive operations have completed,
this error does not happen, and the guest can still connect to the host
after resume.

For example:
In situation 1), if we do the following steps
    step 1) Host starts a vsock server
    step 2) Guest starts a vsock client which sends data without limit
    step 3) Guest freezes and resumes
then the vsock connection will be broken and the guest can never connect
to the host via vsock until the Host resets the vsock server.

And in situation 2), if we do the following steps
    step 1) Host starts a vsock server
    step 2) Guest starts a vsock client and sends some data
    step 3) After the client has completed the transmission, Guest freezes
            and resumes
    step 4) Guest starts a new vsock client and sends some data
then the host server doesn't need to be reset, and the guest client works
well after resume.

>If you want to support this use case, you must first provide a way to 
>keep those packets somewhere (e.g. avoiding to remove the virtqueues?), 
>but I honestly don't understand the use case.
>

In cases where the guest is sending packets that require no reply via vsock,
the sending action is suspended when the guest suspends,
and no packets are lost after resume.

And when the host is sending packets via vsock while the guest suspends and
the VQs disappear, those packets will be lost, like you mentioned.
But I think those packets should be kept on the host device side,
with the promise that once the guest resumes,
the host device continues sending them.

Thanks,
Junnan Wu

>To be clear, this behavior is intended, and it's for example the same as 
>when suspending the VM is the hypervisor directly, which after that, it 
>sends an event to the guest, just to close all connections because it's 
>complicated to keep them active.
>
>Thanks,
>Stefano
>
>
>
>>>
>>>	/* Stop all work handlers to make sure no one is accessing the device,
>>>	 * so we can safely call virtio_reset_device().
>>>-- 
>>>2.34.1
>>>
>>
>>I am not familiar with freeze/resume, but I don't see any problems 
>>with this patch.
>>
>>Thank you,
>>Luigi
>>
Stefano Garzarella Feb. 13, 2025, 9:58 a.m. UTC | #5
On Wed, Feb 12, 2025 at 12:48:43PM +0800, Junnan Wu wrote:
>>On Mon, Feb 10, 2025 at 12:48:03PM +0100, leonardi@redhat.com wrote:
>>>Like for the other patch, some maintainers have not been CCd.
>>
>>Yes, please use `scripts/get_maintainer.pl`.
>>
>
>Ok, I will add other maintainers by this script in next push.
>
>>>
>>>On Fri, Feb 07, 2025 at 01:20:33PM +0800, Junnan Wu wrote:
>>>>From: Ying Gao <ying01.gao@samsung.com>
>>>>
>>>>If suspend is executed during vsock communication and the
>>>>socket is reset, the original socket will be unusable after resume.
>>
>>Why? (I mean for a good commit description)
>>
>>>>
>>>>Judge the value of vdev->priv in function virtio_vsock_vqs_del,
>>>>only when the function is invoked by virtio_vsock_remove,
>>>>all vsock connections will be reset.
>>>>
>>>The second part of the commit message is not that clear, do you mind
>>>rephrasing it?
>>
>>+1 on that
>>
>
>Well, I will rephrase it in next version.
>
>>Also in this case, why checking `vdev->priv` fixes the issue?
>>
>>>
>>>>Signed-off-by: Ying Gao <ying01.gao@samsung.com>
>>>Missing Co-developed-by?
>>>>Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
>>>
>>>
>>>>---
>>>>net/vmw_vsock/virtio_transport.c | 6 ++++--
>>>>1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>>>index 9eefd0fba92b..9df609581755 100644
>>>>--- a/net/vmw_vsock/virtio_transport.c
>>>>+++ b/net/vmw_vsock/virtio_transport.c
>>>>@@ -717,8 +717,10 @@ static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
>>>>	struct sk_buff *skb;
>>>>
>>>>	/* Reset all connected sockets when the VQs disappear */
>>>>-	vsock_for_each_connected_socket(&virtio_transport.transport,
>>>>-					virtio_vsock_reset_sock);
>>>I would add a comment explaining why you are adding this check.
>>
>>Yes, please.
>>
>
>Ok, I left a comment here in next version
>
>>>>+	if (!vdev->priv) {
>>>>+		vsock_for_each_connected_socket(&virtio_transport.transport,
>>>>+						virtio_vsock_reset_sock);
>>>>+	}
>>
>>Okay, after looking at the code I understood why, but please write it
>>into the commit next time!
>>
>>virtio_vsock_vqs_del() is called in 2 cases:
>>1 - in virtio_vsock_remove() after setting `vdev->priv` to null since
>>     the drive is about to be unloaded because the device is for example
>>     removed (hot-unplug)
>>
>>2 - in virtio_vsock_freeze() when suspending, but in this case
>>     `vdev->priv` is not touched.
>>
>>I don't think is a good idea using that because in the future it could
>>change. So better to add a parameter to virtio_vsock_vqs_del() to
>>differentiate the 2 use cases.
>>
>>
>>That said, I think this patch is wrong:
>>
>>We are deallocating virtqueues, so all packets that are "in flight" will
>>be completely discarded. Our transport (virtqueues) has no mechanism to
>>retransmit them, so those packets would be lost forever. So we cannot
>>guarantee the reliability of SOCK_STREAM sockets for example.
>>
>>In any case, after a suspension, many connections will be expired in the
>>host anyway, so does it make sense to keep them open in the guest?
>>
>
>If the host still holds the vsock connection during suspend,
>I think the guest should keep it open in this case.
>
>We found a scenario where the freeze happens while a vsock connection
>is communicating. After restore, the upper-layer application tries to
>continue sending messages via vsock, and the error `ENOTCONN` is
>returned by the function `vsock_connectible_sendmsg`. But the host is not
>aware of this and keeps waiting to receive messages on the old connection.
>If the host doesn't close the old connection, the guest can never
>connect to the host via vsock again, because the error `EPIPE` is returned.
>
>If we freeze vsock after the send and receive operations have completed,
>this error does not happen, and the guest can still connect to the host
>after resume.
>
>For example:
>In situation 1), if we do the following steps
>    step 1) Host starts a vsock server
>    step 2) Guest starts a vsock client which sends data without limit
>    step 3) Guest freezes and resumes
>then the vsock connection will be broken and the guest can never connect
>to the host via vsock until the Host resets the vsock server.
>
>And in situation 2), if we do the following steps
>    step 1) Host starts a vsock server
>    step 2) Guest starts a vsock client and sends some data
>    step 3) After the client has completed the transmission, Guest freezes
>            and resumes
>    step 4) Guest starts a new vsock client and sends some data
>then the host server doesn't need to be reset, and the guest client works
>well after resume.

Okay, but this is not what this patch is doing, right?
Or have I missed something?

>
>>If you want to support this use case, you must first provide a way to
>>keep those packets somewhere (e.g. avoiding to remove the virtqueues?),
>>but I honestly don't understand the use case.
>>
>
>In cases where the guest is sending packets that require no reply via vsock,
>the sending action is suspended when the guest suspends,
>and no packets are lost after resume.

You can try this simple example to check if it works or not:

guest$ dd if=/dev/urandom of=bigfile bs=1M count=10240
guest$ md5sum bigfile
e412f2803a89da265d53a28dea0f0da7  bigfile

host$ nc --vsock -p 1234 -l > bigfile
guest$ cat bigfile | nc --vsock 2 1234

# while sending do a suspend/resume cycle

# Without your patch, nc should fail, so the user knows the
# communication went wrong; with your patch it should not fail.

host$ md5sum bigfile


Is the md5sum the same? If not, it means you lost packets, and we can't 
do that.

>
>And when the host is sending packets via vsock while the guest suspends and
>the VQs disappear, those packets will be lost, like you mentioned.
>But I think those packets should be kept on the host device side,
>with the promise that once the guest resumes,
>the host device continues sending them.

The host will stop using the virtqueues after the driver calls 
`virtio_reset_device()`, so we should handle all the packets already 
queued in the RX virtqueue. Once the host has put them in the 
virtqueue, it has no way to track them, so it should be up to the 
driver in the guest to stop the device and then check all the buffers 
already queued.

But currently we also call 
`virtio_vsock_skb_queue_purge(&vsock->send_pkt_queue);` which will 
discard all the packets queued by applications in the guest that 
weren't even queued in the virtqueue yet.

So again, this patch as it stands is absolutely not right.

I understand the use case and it's clear to me now, but please write it 
in the commit description.

In summary, if we want to support your use case (and that is fine by 
me), we need to do better in the driver:

- we must not purge `send_pkt_queue`
- we need to make sure that all buffers that the host has put in the RX 
   virtqueue are handled by the guest
- we need to make sure that all buffers that the guest has put in the TX 
   virtqueue are handled by the host or put back on top of send_pkt_queue
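
The last requirement above (putting unhandled TX buffers back on top of `send_pkt_queue`) can be pictured with a small userspace model in C. This is only a sketch under assumptions: `struct pkt_queue`, `queue_tail()` and `requeue_head()` are hypothetical names, packets are reduced to integer sequence numbers, and the real driver works with `sk_buff` queues rather than fixed arrays.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Toy FIFO of packet sequence numbers; stands in for send_pkt_queue
 * (hypothetical model, not the kernel data structure).
 */
struct pkt_queue {
	int pkts[QUEUE_CAP];
	size_t len;
};

/* Append a packet at the tail. Returns 0 on success, -1 if full. */
static int queue_tail(struct pkt_queue *q, int seq)
{
	assert(q != NULL);

	if (q->len == QUEUE_CAP)
		return -1;
	q->pkts[q->len++] = seq;
	return 0;
}

/* Put packets that were in the TX virtqueue but never consumed by the
 * host back at the head of the queue, keeping their original order, so
 * they are sent before anything queued later.
 * Returns 0 on success, -1 if there is not enough room.
 */
static int requeue_head(struct pkt_queue *q, const int *unsent, size_t n)
{
	assert(q != NULL);
	assert(unsent != NULL || n == 0);

	if (n > QUEUE_CAP - q->len)
		return -1;

	/* Shift existing entries right by n, starting from the end so
	 * nothing is overwritten before it has been moved.
	 */
	for (size_t i = q->len; i > 0; i--)
		q->pkts[i - 1 + n] = q->pkts[i - 1];

	for (size_t i = 0; i < n; i++)
		q->pkts[i] = unsent[i];

	q->len += n;
	return 0;
}
```

For instance, if packets 3 and 4 are still waiting in the queue and packets 1 and 2 come back unhandled from the TX virtqueue, `requeue_head()` leaves the queue as 1, 2, 3, 4, which preserves the in-order property the stream socket depends on.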

Thanks,
Stefano

>
>Thanks,
>Junnan Wu
>
>>To be clear, this behavior is intended, and it's for example the same as
>>when suspending the VM is the hypervisor directly, which after that, it
>>sends an event to the guest, just to close all connections because it's
>>complicated to keep them active.
>>
>>Thanks,
>>Stefano
>>
>>
>>
>>>>
>>>>	/* Stop all work handlers to make sure no one is accessing the device,
>>>>	 * so we can safely call virtio_reset_device().
>>>>--
>>>>2.34.1
>>>>
>>>
>>>I am not familiar with freeze/resume, but I don't see any problems
>>>with this patch.
>>>
>>>Thank you,
>>>Luigi
>>>
>
Patch

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 9eefd0fba92b..9df609581755 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -717,8 +717,10 @@  static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
 	struct sk_buff *skb;
 
 	/* Reset all connected sockets when the VQs disappear */
-	vsock_for_each_connected_socket(&virtio_transport.transport,
-					virtio_vsock_reset_sock);
+	if (!vdev->priv) {
+		vsock_for_each_connected_socket(&virtio_transport.transport,
+						virtio_vsock_reset_sock);
+	}
 
 	/* Stop all work handlers to make sure no one is accessing the device,
 	 * so we can safely call virtio_reset_device().