diff mbox

[RFC] cifs: retry kernel_sendmsg only in case of -EAGAIN

Message ID 506948FD.3090602@suse.com (mailing list archive)
State New, archived
Headers show

Commit Message

Suresh Jayaraman Oct. 1, 2012, 7:40 a.m. UTC
In smb_sendv(), we seem to retry if kernel_sendmsg() returned either
-ENOSPC or -EAGAIN. In either case after multiple attempts, we set the
error to -EAGAIN before breaking out of the loop.

First, it is not clear to me when kernel_sendmsg() can return -ENOSPC,
and what it would mean and why should we retry. It makes me wonder
whether this check is part of some old code. Also, there seem to be no
need to set the error back to -EAGAIN before we break out the loop.
Fix this by making cifs retry only if kernel_sendmsg() returns -EAGAIN.

If the above discussion make sense, here is a patch to fix this.
---

Retry kernel_sendmsg() only in case of -EAGAIN and remove redundant
error assignment.


Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
---
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jeff Layton Oct. 2, 2012, 1:40 a.m. UTC | #1
On Mon, 01 Oct 2012 13:10:45 +0530
Suresh Jayaraman <sjayaraman@suse.com> wrote:

> In smb_sendv(), we seem to retry if kernel_sendmsg() returned either
> -ENOSPC or -EAGAIN. In either case after multiple attempts, we set the
> error to -EAGAIN before breaking out of the loop.
> 
> First, it is not clear to me when kernel_sendmsg() can return -ENOSPC,
> and what it would mean and why should we retry. It makes me wonder
> whether this check is part of some old code. Also, there seem to be no
> need to set the error back to -EAGAIN before we break out the loop.
> Fix this by making cifs retry only if kernel_sendmsg() returns -EAGAIN.
> 
> If the above discussion make sense, here is a patch to fix this.
> ---
> 
> Retry kernel_sendmsg() only in case of -EAGAIN and remove redundant
> error assignment.
> 
> 
> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
> ---
> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> index d9b639b..a33db4c 100644
> --- a/fs/cifs/transport.c
> +++ b/fs/cifs/transport.c
> @@ -154,7 +154,7 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
>  	while (total_len) {
>  		rc = kernel_sendmsg(ssocket, &smb_msg, &iov[first_vec],
>  				    n_vec - first_vec, total_len);
> -		if ((rc == -ENOSPC) || (rc == -EAGAIN)) {
> +		if (rc == -EAGAIN) {
>  			i++;
>  			/*
>  			 * If blocking send we try 3 times, since each can block
> @@ -177,7 +177,6 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
>  			if ((i >= 14) || (!server->noblocksnd && (i > 2))) {
>  				cERROR(1, "sends on sock %p stuck for 15 seconds",
>  				    ssocket);
> -				rc = -EAGAIN;
>  				break;
>  			}
>  			msleep(1 << i);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

That ENOSPC has been there a long time, and I think you're correct that
it's of questionable value.

Acked-by: Jeff Layton <jlayton@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve French Oct. 2, 2012, 1:46 a.m. UTC | #2
How can we be certain that ENOSPC is no longer returned? When it was
returned then the obvious thing was to block briefly and retry.

Does it do any harm?

On Mon, Oct 1, 2012 at 8:40 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Mon, 01 Oct 2012 13:10:45 +0530
> Suresh Jayaraman <sjayaraman@suse.com> wrote:
>
>> In smb_sendv(), we seem to retry if kernel_sendmsg() returned either
>> -ENOSPC or -EAGAIN. In either case after multiple attempts, we set the
>> error to -EAGAIN before breaking out of the loop.
>>
>> First, it is not clear to me when kernel_sendmsg() can return -ENOSPC,
>> and what it would mean and why should we retry. It makes me wonder
>> whether this check is part of some old code. Also, there seem to be no
>> need to set the error back to -EAGAIN before we break out the loop.
>> Fix this by making cifs retry only if kernel_sendmsg() returns -EAGAIN.
>>
>> If the above discussion make sense, here is a patch to fix this.
>> ---
>>
>> Retry kernel_sendmsg() only in case of -EAGAIN and remove redundant
>> error assignment.
>>
>>
>> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
>> ---
>> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
>> index d9b639b..a33db4c 100644
>> --- a/fs/cifs/transport.c
>> +++ b/fs/cifs/transport.c
>> @@ -154,7 +154,7 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
>>       while (total_len) {
>>               rc = kernel_sendmsg(ssocket, &smb_msg, &iov[first_vec],
>>                                   n_vec - first_vec, total_len);
>> -             if ((rc == -ENOSPC) || (rc == -EAGAIN)) {
>> +             if (rc == -EAGAIN) {
>>                       i++;
>>                       /*
>>                        * If blocking send we try 3 times, since each can block
>> @@ -177,7 +177,6 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
>>                       if ((i >= 14) || (!server->noblocksnd && (i > 2))) {
>>                               cERROR(1, "sends on sock %p stuck for 15 seconds",
>>                                   ssocket);
>> -                             rc = -EAGAIN;
>>                               break;
>>                       }
>>                       msleep(1 << i);
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> That ENOSPC has been there a long time, and I think you're correct that
> it's of questionable value.
>
> Acked-by: Jeff Layton <jlayton@redhat.com>
Jeff Layton Oct. 2, 2012, 1:50 a.m. UTC | #3
On Mon, 1 Oct 2012 20:46:01 -0500
Steve French <smfrench@gmail.com> wrote:

> How can we be certain that ENOSPC is no longer returned? When it was
> returned then the obvious thing was to block briefly and retry.
> 
> Does it do any harm?
> 

Can you explain the situation that caused it to return -ENOSPC in the
first place? Why that and not -EAGAIN? Or -ENOBUFS?

> On Mon, Oct 1, 2012 at 8:40 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Mon, 01 Oct 2012 13:10:45 +0530
> > Suresh Jayaraman <sjayaraman@suse.com> wrote:
> >
> >> In smb_sendv(), we seem to retry if kernel_sendmsg() returned either
> >> -ENOSPC or -EAGAIN. In either case after multiple attempts, we set the
> >> error to -EAGAIN before breaking out of the loop.
> >>
> >> First, it is not clear to me when kernel_sendmsg() can return -ENOSPC,
> >> and what it would mean and why should we retry. It makes me wonder
> >> whether this check is part of some old code. Also, there seem to be no
> >> need to set the error back to -EAGAIN before we break out the loop.
> >> Fix this by making cifs retry only if kernel_sendmsg() returns -EAGAIN.
> >>
> >> If the above discussion make sense, here is a patch to fix this.
> >> ---
> >>
> >> Retry kernel_sendmsg() only in case of -EAGAIN and remove redundant
> >> error assignment.
> >>
> >>
> >> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
> >> ---
> >> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> >> index d9b639b..a33db4c 100644
> >> --- a/fs/cifs/transport.c
> >> +++ b/fs/cifs/transport.c
> >> @@ -154,7 +154,7 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
> >>       while (total_len) {
> >>               rc = kernel_sendmsg(ssocket, &smb_msg, &iov[first_vec],
> >>                                   n_vec - first_vec, total_len);
> >> -             if ((rc == -ENOSPC) || (rc == -EAGAIN)) {
> >> +             if (rc == -EAGAIN) {
> >>                       i++;
> >>                       /*
> >>                        * If blocking send we try 3 times, since each can block
> >> @@ -177,7 +177,6 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
> >>                       if ((i >= 14) || (!server->noblocksnd && (i > 2))) {
> >>                               cERROR(1, "sends on sock %p stuck for 15 seconds",
> >>                                   ssocket);
> >> -                             rc = -EAGAIN;
> >>                               break;
> >>                       }
> >>                       msleep(1 << i);
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> > That ENOSPC has been there a long time, and I think you're correct that
> > it's of questionable value.
> >
> > Acked-by: Jeff Layton <jlayton@redhat.com>
> 
> 
>
Suresh Jayaraman Oct. 2, 2012, 10:22 a.m. UTC | #4
On 10/02/2012 07:20 AM, Jeff Layton wrote:
> On Mon, 1 Oct 2012 20:46:01 -0500
> Steve French <smfrench@gmail.com> wrote:
> 
>> How can we be certain that ENOSPC is no longer returned? When it was

I had traced kernel_sendmsg() and the functions it is calling, but I
couldn't find any of them returning -ENOSPC. Also, I couldn't find any
other callers of kernel_sendmsg() checking for -ENOSPC.

>> returned then the obvious thing was to block briefly and retry.
>>
>> Does it do any harm?

When I tried to analyze what might cause the error reported here
  http://thread.gmane.org/gmane.linux.kernel/1367198/focus=1367498

I bumped into this code. I had a few questions that I couldn't answer like

a) When does kernel_sendmsg return -ENOSPC?
b) Even if we assume it returns -ENOSPC, does it really make sense to retry?
c) Does the error condition that lead to -ENOSPC rectify by the time we
retry?
d) Why do we need to reset -ENOSPC to -EAGAIN before returning the error?

If we can answer these questions reasonably, we may not need this patch.

> 
> Can you explain the situation that caused it to return -ENOSPC in the
> first place? Why that and not -EAGAIN? Or -ENOBUFS?

I too thought -ENOBUFS might represent the condition better.


Thanks
Suresh
> 
>> On Mon, Oct 1, 2012 at 8:40 PM, Jeff Layton <jlayton@redhat.com> wrote:
>>> On Mon, 01 Oct 2012 13:10:45 +0530
>>> Suresh Jayaraman <sjayaraman@suse.com> wrote:
>>>
>>>> In smb_sendv(), we seem to retry if kernel_sendmsg() returned either
>>>> -ENOSPC or -EAGAIN. In either case after multiple attempts, we set the
>>>> error to -EAGAIN before breaking out of the loop.
>>>>
>>>> First, it is not clear to me when kernel_sendmsg() can return -ENOSPC,
>>>> and what it would mean and why should we retry. It makes me wonder
>>>> whether this check is part of some old code. Also, there seem to be no
>>>> need to set the error back to -EAGAIN before we break out the loop.
>>>> Fix this by making cifs retry only if kernel_sendmsg() returns -EAGAIN.
>>>>
>>>> If the above discussion make sense, here is a patch to fix this.
>>>> ---
>>>>
>>>> Retry kernel_sendmsg() only in case of -EAGAIN and remove redundant
>>>> error assignment.
>>>>
>>>>
>>>> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
>>>> ---
>>>> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
>>>> index d9b639b..a33db4c 100644
>>>> --- a/fs/cifs/transport.c
>>>> +++ b/fs/cifs/transport.c
>>>> @@ -154,7 +154,7 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
>>>>       while (total_len) {
>>>>               rc = kernel_sendmsg(ssocket, &smb_msg, &iov[first_vec],
>>>>                                   n_vec - first_vec, total_len);
>>>> -             if ((rc == -ENOSPC) || (rc == -EAGAIN)) {
>>>> +             if (rc == -EAGAIN) {
>>>>                       i++;
>>>>                       /*
>>>>                        * If blocking send we try 3 times, since each can block
>>>> @@ -177,7 +177,6 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
>>>>                       if ((i >= 14) || (!server->noblocksnd && (i > 2))) {
>>>>                               cERROR(1, "sends on sock %p stuck for 15 seconds",
>>>>                                   ssocket);
>>>> -                             rc = -EAGAIN;
>>>>                               break;
>>>>                       }
>>>>                       msleep(1 << i);
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>> That ENOSPC has been there a long time, and I think you're correct that
>>> it's of questionable value.
>>>
>>> Acked-by: Jeff Layton <jlayton@redhat.com>
>>
>>
>>
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton Oct. 2, 2012, 11:12 a.m. UTC | #5
On Tue, 02 Oct 2012 15:52:34 +0530
Suresh Jayaraman <sjayaraman@suse.com> wrote:

> On 10/02/2012 07:20 AM, Jeff Layton wrote:
> > On Mon, 1 Oct 2012 20:46:01 -0500
> > Steve French <smfrench@gmail.com> wrote:
> > 
> >> How can we be certain that ENOSPC is no longer returned? When it was
> 
> I had traced kernel_sendmsg() and the functions it is calling, but I
> couldn't find any of them returning -ENOSPC. Also, I couldn't find any
> other callers of kernel_sendmsg() checking for -ENOSPC.
> 
> >> returned then the obvious thing was to block briefly and retry.
> >>
> >> Does it do any harm?
> 
> When I tried to analyze what might cause the error reported here
>   http://thread.gmane.org/gmane.linux.kernel/1367198/focus=1367498
> 
> I bumped into this code. I had a few questions that I couldn't answer like
> 
> a) When does kernel_sendmsg return -ENOSPC?
> b) Even if we assume it returns -ENOSPC, does it really make sense to retry?
> c) Does the error condition that lead to -ENOSPC rectify by the time we
> retry?
> d) Why do we need to reset -ENOSPC to -EAGAIN before returning the error?
> 
> If we can answer these questions reasonably, we may not need this patch.
> 
> > 
> > Can you explain the situation that caused it to return -ENOSPC in the
> > first place? Why that and not -EAGAIN? Or -ENOBUFS?
> 
> I too thought -ENOBUFS might represent the condition better.
> 

To lend further weight to the argument too. The sendmsg(2) manpage
doesn't list -ENOSPC as an error return.

kernel_sendmsg() is basically the kernelspace wrapper around the same
set of functions that the sendmsg() system call uses. If it's not a
valid error return for sendmsg() then it almost certainly isn't one for
kernel_sendmsg() either.

The harm in these situations is further obfuscation of some code that
is already far too complicated, and the potential for papering over
real bugs. If kernel_sendmsg returns ENOSPC, then that's likely a bug
and is something we should deal with accordingly instead of blindly
retrying.

If you're concerned about it however, one possibility might be to add a
WARN_ON_ONCE that fires on an ENOSPC error for a few releases.
Steve French Oct. 2, 2012, 4:08 p.m. UTC | #6
On Tue, Oct 2, 2012 at 6:12 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Tue, 02 Oct 2012 15:52:34 +0530
> Suresh Jayaraman <sjayaraman@suse.com> wrote:
>
>> On 10/02/2012 07:20 AM, Jeff Layton wrote:
>> > On Mon, 1 Oct 2012 20:46:01 -0500
>> > Steve French <smfrench@gmail.com> wrote:
>> >
>> >> How can we be certain that ENOSPC is no longer returned? When it was
>>
>> I had traced kernel_sendmsg() and the functions it is calling, but I
>> couldn't find any of them returning -ENOSPC. Also, I couldn't find any
>> other callers of kernel_sendmsg() checking for -ENOSPC.
>>
>> >> returned then the obvious thing was to block briefly and retry.
>> >>
>> >> Does it do any harm?
>>
>> When I tried to analyze what might cause the error reported here
>>   http://thread.gmane.org/gmane.linux.kernel/1367198/focus=1367498
>>
>> I bumped into this code. I had a few questions that I couldn't answer like
>>
>> a) When does kernel_sendmsg return -ENOSPC?
>> b) Even if we assume it returns -ENOSPC, does it really make sense to retry?
>> c) Does the error condition that lead to -ENOSPC rectify by the time we
>> retry?
>> d) Why do we need to reset -ENOSPC to -EAGAIN before returning the error?
>>
>> If we can answer these questions reasonably, we may not need this patch.
>>
>> >
>> > Can you explain the situation that caused it to return -ENOSPC in the
>> > first place? Why that and not -EAGAIN? Or -ENOBUFS?
>>
>> I too thought -ENOBUFS might represent the condition better.
>>
>
> To lend further weight to the argument too. The sendmsg(2) manpage
> doesn't list -ENOSPC as an error return.
>
> kernel_sendmsg() is basically the kernelspace wrapper around the same
> set of functions that the sendmsg() system call uses. If it's not a
> valid error return for sendmsg() then it almost certainly isn't one for
> kernel_sendmsg() either.
>
> The harm in these situations is further obfuscation of some code that
> is already far too complicated, and the potential for papering over
> real bugs. If kernel_sendmsg returns ENOSPC, then that's likely a bug
> and is something we should deal with accordingly instead of blindly
> retrying.
>
> If you're concerned about it however, one possibility might be to add a
> WARN_ON_ONCE that fires on an ENOSPC error for a few releases.

Seems reasonable, since if we did get an ENOSPC from a lower level
driver (filtered through either sendmsg interface) we would expect
the same strategy to deal with as EAGAIN - wait and retry.
diff mbox

Patch

diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index d9b639b..a33db4c 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -154,7 +154,7 @@  smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
 	while (total_len) {
 		rc = kernel_sendmsg(ssocket, &smb_msg, &iov[first_vec],
 				    n_vec - first_vec, total_len);
-		if ((rc == -ENOSPC) || (rc == -EAGAIN)) {
+		if (rc == -EAGAIN) {
 			i++;
 			/*
 			 * If blocking send we try 3 times, since each can block
@@ -177,7 +177,6 @@  smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec)
 			if ((i >= 14) || (!server->noblocksnd && (i > 2))) {
 				cERROR(1, "sends on sock %p stuck for 15 seconds",
 				    ssocket);
-				rc = -EAGAIN;
 				break;
 			}
 			msleep(1 << i);