diff mbox series

[net] vsock: avoid timeout during connect() if the socket is closing

Message ID 20250328141528.420719-1-sgarzare@redhat.com (mailing list archive)
State New
Delegated to: Netdev Maintainers
Series [net] vsock: avoid timeout during connect() if the socket is closing

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 10 of 10 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2025-04-01--00-00 (tests: 902)

Commit Message

Stefano Garzarella March 28, 2025, 2:15 p.m. UTC
From: Stefano Garzarella <sgarzare@redhat.com>

When a peer attempts to establish a connection, vsock_connect() contains
a loop that waits for the state to be TCP_ESTABLISHED. However, the
other peer can be fast enough to accept the connection and close it
immediately, thus moving the state to TCP_CLOSING.

When this happens, the peer in the vsock_connect() is properly woken up,
but since the state is not TCP_ESTABLISHED, it goes back to sleep
until the timeout expires, returning -ETIMEDOUT.

If the socket state is TCP_CLOSING, waiting for the timeout is pointless.
vsock_connect() can return immediately without errors or delay since the
connection actually happened. The socket will be in a closing state,
but this is not an issue, and subsequent calls will fail as expected.

We discovered this issue while developing a test that accepts and
immediately closes connections to stress the transport switch between
two connect() calls, where the first one was interrupted by a signal
(see Closes link).

Reported-by: Luigi Leonardi <leonardi@redhat.com>
Closes: https://lore.kernel.org/virtualization/bq6hxrolno2vmtqwcvb5bljfpb7mvwb3kohrvaed6auz5vxrfv@ijmd2f3grobn/
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/af_vsock.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Paolo Abeni April 1, 2025, 11:56 a.m. UTC | #1
On 3/28/25 3:15 PM, Stefano Garzarella wrote:
> From: Stefano Garzarella <sgarzare@redhat.com>
> 
> When a peer attempts to establish a connection, vsock_connect() contains
> a loop that waits for the state to be TCP_ESTABLISHED. However, the
> other peer can be fast enough to accept the connection and close it
> immediately, thus moving the state to TCP_CLOSING.
> 
> When this happens, the peer in the vsock_connect() is properly woken up,
> but since the state is not TCP_ESTABLISHED, it goes back to sleep
> until the timeout expires, returning -ETIMEDOUT.
> 
> If the socket state is TCP_CLOSING, waiting for the timeout is pointless.
> vsock_connect() can return immediately without errors or delay since the
> connection actually happened. The socket will be in a closing state,
> but this is not an issue, and subsequent calls will fail as expected.
> 
> We discovered this issue while developing a test that accepts and
> immediately closes connections to stress the transport switch between
> two connect() calls, where the first one was interrupted by a signal
> (see Closes link).
> 
> Reported-by: Luigi Leonardi <leonardi@redhat.com>
> Closes: https://lore.kernel.org/virtualization/bq6hxrolno2vmtqwcvb5bljfpb7mvwb3kohrvaed6auz5vxrfv@ijmd2f3grobn/
> Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>

Acked-by: Paolo Abeni <pabeni@redhat.com>

Luigi Leonardi April 1, 2025, 12:27 p.m. UTC | #2
On Fri, Mar 28, 2025 at 03:15:28PM +0100, Stefano Garzarella wrote:
>From: Stefano Garzarella <sgarzare@redhat.com>
>
>When a peer attempts to establish a connection, vsock_connect() contains
>a loop that waits for the state to be TCP_ESTABLISHED. However, the
>other peer can be fast enough to accept the connection and close it
>immediately, thus moving the state to TCP_CLOSING.
>
>When this happens, the peer in the vsock_connect() is properly woken up,
>but since the state is not TCP_ESTABLISHED, it goes back to sleep
>until the timeout expires, returning -ETIMEDOUT.
>
>If the socket state is TCP_CLOSING, waiting for the timeout is pointless.
>vsock_connect() can return immediately without errors or delay since the
>connection actually happened. The socket will be in a closing state,
>but this is not an issue, and subsequent calls will fail as expected.
>
>We discovered this issue while developing a test that accepts and
>immediately closes connections to stress the transport switch between
>two connect() calls, where the first one was interrupted by a signal
>(see Closes link).
>
>Reported-by: Luigi Leonardi <leonardi@redhat.com>
>Closes: https://lore.kernel.org/virtualization/bq6hxrolno2vmtqwcvb5bljfpb7mvwb3kohrvaed6auz5vxrfv@ijmd2f3grobn/
>Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>---
> net/vmw_vsock/af_vsock.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 7e3db87ae433..fc6afbc8d680 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1551,7 +1551,11 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> 	timeout = vsk->connect_timeout;
> 	prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
>
>-	while (sk->sk_state != TCP_ESTABLISHED && sk->sk_err == 0) {
>+	/* If the socket is already closing or it is in an error state, there
>+	 * is no point in waiting.
>+	 */
>+	while (sk->sk_state != TCP_ESTABLISHED &&
>+	       sk->sk_state != TCP_CLOSING && sk->sk_err == 0) {
> 		if (flags & O_NONBLOCK) {
> 			/* If we're not going to block, we schedule a timeout
> 			 * function to generate a timeout on the connection
>-- 
>2.49.0
>

Just tested and fixes the issue! Thanks Stefano!

Tested-by: Luigi Leonardi <leonardi@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>

Patch

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 7e3db87ae433..fc6afbc8d680 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1551,7 +1551,11 @@  static int vsock_connect(struct socket *sock, struct sockaddr *addr,
 	timeout = vsk->connect_timeout;
 	prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
 
-	while (sk->sk_state != TCP_ESTABLISHED && sk->sk_err == 0) {
+	/* If the socket is already closing or it is in an error state, there
+	 * is no point in waiting.
+	 */
+	while (sk->sk_state != TCP_ESTABLISHED &&
+	       sk->sk_state != TCP_CLOSING && sk->sk_err == 0) {
 		if (flags & O_NONBLOCK) {
 			/* If we're not going to block, we schedule a timeout
 			 * function to generate a timeout on the connection
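
The effect of the changed loop condition can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: the enum and the two helper names are hypothetical stand-ins introduced for illustration, and only the boolean conditions mirror the removed and added lines of the hunk above.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's TCP_* socket states; the
 * names and values here are assumptions made for this sketch only.
 */
enum model_state {
	MODEL_SYN_SENT,
	MODEL_ESTABLISHED,
	MODEL_CLOSING,
};

/* Loop condition before the patch: only TCP_ESTABLISHED or a pending
 * error ends the wait.
 */
static bool keeps_waiting_old(enum model_state state, int err)
{
	return state != MODEL_ESTABLISHED && err == 0;
}

/* Loop condition after the patch: TCP_CLOSING also ends the wait. */
static bool keeps_waiting_new(enum model_state state, int err)
{
	return state != MODEL_ESTABLISHED &&
	       state != MODEL_CLOSING && err == 0;
}
```

For a socket that the peer has already accepted and closed (MODEL_CLOSING with no error), keeps_waiting_old() returns true, which corresponds to sleeping again until the timeout expires, while keeps_waiting_new() returns false, so the wait ends immediately. The two conditions agree in the other cases modelled here.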