diff mbox

rsocket: Return ECONNRESET when socket in recv is disconnected

Message ID 1412806773-23776-1-git-send-email-sean.hefty@intel.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Hefty, Sean Oct. 8, 2014, 10:19 p.m. UTC
From: Sean Hefty <sean.hefty@intel.com>

The following behavior difference was reported between rsockets and
sockets:

when remote end is suddenly closed, recv() waiting on it receives
tcp/ip => ECONNRESET error
rsockets => 0 value

Update rrecv() to return ECONNRESET if no data is available and
the connection is no longer readable.

Problem reported by: srkodali@linux.vnet.ibm.com

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
---
 src/rsocket.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

Comments

Jason Gunthorpe Oct. 8, 2014, 11:28 p.m. UTC | #1
On Wed, Oct 08, 2014 at 03:19:33PM -0700, sean.hefty@intel.com wrote:
> The following behavior difference was reported between rsockets and
> sockets:
> 
> when remote end is suddenly closed, recv() waiting on it receives
> tcp/ip => ECONNRESET error
> rsockets => 0 value

That isn't the whole story though - you only get ECONNRESET in some
cases, and it is OS dependent.
 
stackoverflow suggets:

http://stackoverflow.com/questions/2974021/what-does-econnreset-mean-in-the-context-of-an-af-local-socket

Which looks reasonable to me and matches my experience with socket
programming on Linux. The Steven's book might have some authoritative
clarifications on the subject as well.

It certainly is not correct to unconditionally return ECONNRESET for
reads on far-end-closed sockets, they should typically return 0.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean Oct. 8, 2014, 11:38 p.m. UTC | #2
> stackoverflow suggets:
> 
> http://stackoverflow.com/questions/2974021/what-does-econnreset-mean-in-
> the-context-of-an-af-local-socket
> 
> Which looks reasonable to me and matches my experience with socket
> programming on Linux. The Steven's book might have some authoritative
> clarifications on the subject as well.
> 
> It certainly is not correct to unconditionally return ECONNRESET for
> reads on far-end-closed sockets, they should typically return 0.

Thanks for the info. 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sreedhar Kodali Oct. 9, 2014, 11:09 a.m. UTC | #3
Hi Sean,

Thanks for the fix.  Now, we are able to see the similar behavior
between tcp/ip and rsockets when the other end is suddenly closed
while waiting on data.

Thank You.

- Sreedhar

On 2014-10-09 03:49, sean.hefty@intel.com wrote:
> From: Sean Hefty <sean.hefty@intel.com>
> 
> The following behavior difference was reported between rsockets and
> sockets:
> 
> when remote end is suddenly closed, recv() waiting on it receives
> tcp/ip => ECONNRESET error
> rsockets => 0 value
> 
> Update rrecv() to return ECONNRESET if no data is available and
> the connection is no longer readable.
> 
> Problem reported by: srkodali@linux.vnet.ibm.com
> 
> Signed-off-by: Sean Hefty <sean.hefty@intel.com>
> ---
>  src/rsocket.c |   10 ++++++----
>  1 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/src/rsocket.c b/src/rsocket.c
> index 074fb18..7c7af52 100644
> --- a/src/rsocket.c
> +++ b/src/rsocket.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2008-2013 Intel Corporation.  All rights reserved.
> + * Copyright (c) 2008-2014 Intel Corporation.  All rights reserved.
>   *
>   * This software is available to you under a choice of one of two
>   * licenses.  You may choose to be licensed under the terms of the GNU
> @@ -2394,7 +2394,7 @@ ssize_t rrecv(int socket, void *buf, size_t len,
> int flags)
>  	struct rsocket *rs;
>  	size_t left = len;
>  	uint32_t end_size, rsize;
> -	int ret;
> +	int ret = 0;
> 
>  	rs = idm_at(&idm, socket);
>  	if (rs->type == SOCK_DGRAM) {
> @@ -2419,9 +2419,11 @@ ssize_t rrecv(int socket, void *buf, size_t
> len, int flags)
>  					  rs_conn_have_rdata);
>  			if (ret)
>  				break;
> +
> +			if (!(rs->state & rs_readable))
> +				ret = ERR(ECONNRESET);
>  		}
> 
> -		ret = 0;
>  		if (flags & MSG_PEEK) {
>  			left = len - rs_peek(rs, buf, left);
>  			break;
> @@ -2456,7 +2458,7 @@ ssize_t rrecv(int socket, void *buf, size_t len,
> int flags)
>  	} while (left && (flags & MSG_WAITALL) && (rs->state & rs_readable));
> 
>  	fastlock_release(&rs->rlock);
> -	return ret ? ret : len - left;
> +	return (ret && left == len) ? ret : len - left;
>  }
> 
>  ssize_t rrecvfrom(int socket, void *buf, size_t len, int flags,

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean Oct. 9, 2014, 4:44 p.m. UTC | #4
> Thanks for the fix.  Now, we are able to see the similar behavior
> between tcp/ip and rsockets when the other end is suddenly closed
> while waiting on data.

Can you read and respond to Jason's comments on this patch?

> That isn't the whole story though - you only get ECONNRESET in some
> cases, and it is OS dependent.
 
> stackoverflow suggets:

> http://stackoverflow.com/questions/2974021/what-does-econnreset-mean-in-the-context-of-an-af-local-socket

> Which looks reasonable to me and matches my experience with socket
> programming on Linux. The Steven's book might have some authoritative
> clarifications on the subject as well.

Based on this information, it's not clear that the current behavior of returning 0 is wrong.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sreedhar Kodali Oct. 10, 2014, 5:07 a.m. UTC | #5
Hi Jason & Sean,

POSIX recv() manual page describes behavior as follows (reproduced here 
for convenience) -

http://linux.die.net/man/3/recv

==========
Return Value

Upon successful completion, recv() shall return the length of the 
message in bytes. If no messages are available to be received and the 
peer has performed an orderly shutdown, recv() shall return 0. 
Otherwise, -1 shall be returned and errno set to indicate the error.

Errors

The recv() function shall fail if:

ECONNRESET
     A connection was forcibly closed by a peer.
==========

So the current behavior of returning 0 is wrong as there was no orderly 
shutdown performed on the other end before a close() is issued [ 
http://stackoverflow.com/questions/5879560/how-can-i-cause-an-econnreset-in-recv-from-a-client 
].

Thank You.

- Sreedhar


On 2014-10-09 22:14, Hefty, Sean wrote:
>> Thanks for the fix.  Now, we are able to see the similar behavior
>> between tcp/ip and rsockets when the other end is suddenly closed
>> while waiting on data.
> 
> Can you read and respond to Jason's comments on this patch?
> 
>> That isn't the whole story though - you only get ECONNRESET in some
>> cases, and it is OS dependent.
> 
>> stackoverflow suggets:
> 
>> http://stackoverflow.com/questions/2974021/what-does-econnreset-mean-in-the-context-of-an-af-local-socket
> 
>> Which looks reasonable to me and matches my experience with socket
>> programming on Linux. The Steven's book might have some authoritative
>> clarifications on the subject as well.
> 
> Based on this information, it's not clear that the current behavior of
> returning 0 is wrong.
> 
> - Sean

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean Oct. 10, 2014, 5:34 a.m. UTC | #6
> Upon successful completion, recv() shall return the length of the
> message in bytes. If no messages are available to be received and the
> peer has performed an orderly shutdown, recv() shall return 0.
> Otherwise, -1 shall be returned and errno set to indicate the error.
> 
> Errors
> 
> The recv() function shall fail if:
> 
> ECONNRESET
>      A connection was forcibly closed by a peer.
> ==========
> 
> So the current behavior of returning 0 is wrong as there was no orderly
> shutdown performed on the other end before a close() is issued [
> http://stackoverflow.com/questions/5879560/how-can-i-cause-an-econnreset-
> in-recv-from-a-client

The patch is incorrect then.  It always returns ECONNRESET.  Rsockets does not distinguish between the app calling shutdown versus close.  Because everything is in user space, the disconnect message itself comes from the app.  The rclose call invokes rshutdown, otherwise the remote side will may never know that the peer is gone (which can happen if the app exits without calling rclose).

I'm inclined to just leave the current behavior as is.  I don't see why a real app would care if close performed an orderly shutdown or not, as this errors on the side of acting better than expected.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Oct. 10, 2014, 5:45 p.m. UTC | #7
On Fri, Oct 10, 2014 at 10:37:34AM +0530, Sreedhar Kodali wrote:
> So the current behavior of returning 0 is wrong as there was no
> orderly shutdown performed on the other end before a close() is
> issued

'Orderly shutdown' is any call of close() when there is no data in the
recv queue, or shutdown(SHUT_WR).

> http://stackoverflow.com/questions/5879560/how-can-i-cause-an-econnreset-in-recv-from-a-client

This accepted answer is not correct, close on its own does not
generate ECONNRESET.

This is trivally shown with strace and a bit of python:

server:
 import socket
 s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
 s.bind(("127.0.0.1",9090))
 s.listen(1)
 conn, addr = s.accept()
 conn.recv(1024)

client:

import socket
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect(("127.0.0.1",9090))
s.close()

Result:

[..]
accept(3, {sa_family=AF_INET, sin_port=htons(41284), sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
recvfrom(4, "", 1024, 0, NULL, NULL)    = 0

ECONNRESET signals a special condition that it seems rsockets cannot
detect, as it doesn't have the one sided close semantics of TCP.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sreedhar Kodali Oct. 13, 2014, 3:51 a.m. UTC | #8
Hi Sean,

I am fine with your conclusion.  But since R-Sockets emulates socket
functionality on top of RDMA, it would be helpful if this behavioral
difference between TCP and R-Sockets is captured as part of the user
documentation.

Thank You.

- Sreedhar

On 2014-10-10 11:04, Hefty, Sean wrote:
>> Upon successful completion, recv() shall return the length of the
>> message in bytes. If no messages are available to be received and the
>> peer has performed an orderly shutdown, recv() shall return 0.
>> Otherwise, -1 shall be returned and errno set to indicate the error.
>> 
>> Errors
>> 
>> The recv() function shall fail if:
>> 
>> ECONNRESET
>>      A connection was forcibly closed by a peer.
>> ==========
>> 
>> So the current behavior of returning 0 is wrong as there was no 
>> orderly
>> shutdown performed on the other end before a close() is issued [
>> http://stackoverflow.com/questions/5879560/how-can-i-cause-an-econnreset-
>> in-recv-from-a-client
> 
> The patch is incorrect then.  It always returns ECONNRESET.  Rsockets
> does not distinguish between the app calling shutdown versus close.
> Because everything is in user space, the disconnect message itself
> comes from the app.  The rclose call invokes rshutdown, otherwise the
> remote side will may never know that the peer is gone (which can
> happen if the app exits without calling rclose).
> 
> I'm inclined to just leave the current behavior as is.  I don't see
> why a real app would care if close performed an orderly shutdown or
> not, as this errors on the side of acting better than expected.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sreedhar Kodali Oct. 13, 2014, 3:55 a.m. UTC | #9
Hi Jason,

Thanks for your explanation - I got it.  On the other mail thread
I have just requested Sean to mark this as R-Sockets limitation or
behavioral difference so users wouldn't confuse when they try
to compare with TCP.

Thank You.

- Sreedhar

On 2014-10-10 23:15, Jason Gunthorpe wrote:
> On Fri, Oct 10, 2014 at 10:37:34AM +0530, Sreedhar Kodali wrote:
>> So the current behavior of returning 0 is wrong as there was no
>> orderly shutdown performed on the other end before a close() is
>> issued
> 
> 'Orderly shutdown' is any call of close() when there is no data in the
> recv queue, or shutdown(SHUT_WR).
> 
>> http://stackoverflow.com/questions/5879560/how-can-i-cause-an-econnreset-in-recv-from-a-client
> 
> This accepted answer is not correct, close on its own does not
> generate ECONNRESET.
> 
> This is trivally shown with strace and a bit of python:
> 
> server:
>  import socket
>  s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
>  s.bind(("127.0.0.1",9090))
>  s.listen(1)
>  conn, addr = s.accept()
>  conn.recv(1024)
> 
> client:
> 
> import socket
> s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
> s.connect(("127.0.0.1",9090))
> s.close()
> 
> Result:
> 
> [..]
> accept(3, {sa_family=AF_INET, sin_port=htons(41284),
> sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
> recvfrom(4, "", 1024, 0, NULL, NULL)    = 0
> 
> ECONNRESET signals a special condition that it seems rsockets cannot
> detect, as it doesn't have the one sided close semantics of TCP.
> 
> Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/src/rsocket.c b/src/rsocket.c
index 074fb18..7c7af52 100644
--- a/src/rsocket.c
+++ b/src/rsocket.c
@@ -1,5 +1,5 @@ 
 /*
- * Copyright (c) 2008-2013 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2008-2014 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -2394,7 +2394,7 @@  ssize_t rrecv(int socket, void *buf, size_t len, int flags)
 	struct rsocket *rs;
 	size_t left = len;
 	uint32_t end_size, rsize;
-	int ret;
+	int ret = 0;
 
 	rs = idm_at(&idm, socket);
 	if (rs->type == SOCK_DGRAM) {
@@ -2419,9 +2419,11 @@  ssize_t rrecv(int socket, void *buf, size_t len, int flags)
 					  rs_conn_have_rdata);
 			if (ret)
 				break;
+
+			if (!(rs->state & rs_readable))
+				ret = ERR(ECONNRESET);
 		}
 
-		ret = 0;
 		if (flags & MSG_PEEK) {
 			left = len - rs_peek(rs, buf, left);
 			break;
@@ -2456,7 +2458,7 @@  ssize_t rrecv(int socket, void *buf, size_t len, int flags)
 	} while (left && (flags & MSG_WAITALL) && (rs->state & rs_readable));
 
 	fastlock_release(&rs->rlock);
-	return ret ? ret : len - left;
+	return (ret && left == len) ? ret : len - left;
 }
 
 ssize_t rrecvfrom(int socket, void *buf, size_t len, int flags,