diff mbox

Revert "mountd: handle allocation failures in auth_unix_ip upcall"

Message ID 20121126220343.GA18140@fieldses.org (mailing list archive)
State New, archived
Headers show

Commit Message

J. Bruce Fields Nov. 26, 2012, 10:03 p.m. UTC
From: "J. Bruce Fields" <bfields@redhat.com>

This reverts commit 485f7a21e1649797f29317b865cbb094c1f6a71d.  The
failures handled there could be any sort of name resolution failure, not
just an allocation, and failing to downcall (hence leaving the client
hanging) is not the correct thing to do in those cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 utils/mountd/cache.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

Comments

Chuck Lever Nov. 26, 2012, 10:05 p.m. UTC | #1
On Nov 26, 2012, at 5:03 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:

> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> This reverts commit 485f7a21e1649797f29317b865cbb094c1f6a71d.  The
> failures handled there could be any sort of name resolution failure, not
> just an allocation, and failing to downcall (hence leaving the client
> hanging) is not the correct thing to do in those cases.

The problem is in the kernel, then: a downcall should be allowed to fail, IMO.

> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
> utils/mountd/cache.c | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/utils/mountd/cache.c b/utils/mountd/cache.c
> index 8f14032..6710eca 100644
> --- a/utils/mountd/cache.c
> +++ b/utils/mountd/cache.c
> @@ -84,6 +84,7 @@ static void auth_unix_ip(FILE *f)
> 	char ipaddr[INET6_ADDRSTRLEN];
> 	char *client = NULL;
> 	struct addrinfo *tmp = NULL;
> +	struct addrinfo *ai = NULL;
> 	if (readline(fileno(f), &lbuf, &lbuflen) != 1)
> 		return;
> 
> @@ -106,16 +107,12 @@ static void auth_unix_ip(FILE *f)
> 
> 	/* addr is a valid, interesting address, find the domain name... */
> 	if (!use_ipaddr) {
> -		struct addrinfo *ai = NULL;
> -
> 		ai = client_resolve(tmp->ai_addr);
> -		if (ai == NULL)
> -			goto out;
> 		client = client_compose(ai);
> 		freeaddrinfo(ai);
> -		if (!client)
> -			goto out;
> 	}
> +	freeaddrinfo(tmp);
> +
> 	qword_print(f, "nfsd");
> 	qword_print(f, ipaddr);
> 	qword_printuint(f, time(0) + DEFAULT_TTL);
> @@ -127,9 +124,6 @@ static void auth_unix_ip(FILE *f)
> 	xlog(D_CALL, "auth_unix_ip: client %p '%s'", client, client?client: "DEFAULT");
> 
> 	free(client);
> -out:
> -	freeaddrinfo(tmp);
> -
> }
> 
> static void auth_unix_gid(FILE *f)
> -- 
> 1.7.11.7
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields Nov. 26, 2012, 10:15 p.m. UTC | #2
On Mon, Nov 26, 2012 at 05:05:22PM -0500, Chuck Lever wrote:
> 
> On Nov 26, 2012, at 5:03 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > This reverts commit 485f7a21e1649797f29317b865cbb094c1f6a71d.  The
> > failures handled there could be any sort of name resolution failure, not
> > just an allocation, and failing to downcall (hence leaving the client
> > hanging) is not the correct thing to do in those cases.
> 
> The problem is in the kernel, then: a downcall should be allowed to fail, IMO.

In this case, after a revert, a failure here will result in the downcall
passing down a client named "DEFAULT".  Presumably that won't be
permitted access to the export, so the client will end up getting an
error.

But I may not understand your objection.

--b.

> 
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> > utils/mountd/cache.c | 12 +++---------
> > 1 file changed, 3 insertions(+), 9 deletions(-)
> > 
> > diff --git a/utils/mountd/cache.c b/utils/mountd/cache.c
> > index 8f14032..6710eca 100644
> > --- a/utils/mountd/cache.c
> > +++ b/utils/mountd/cache.c
> > @@ -84,6 +84,7 @@ static void auth_unix_ip(FILE *f)
> > 	char ipaddr[INET6_ADDRSTRLEN];
> > 	char *client = NULL;
> > 	struct addrinfo *tmp = NULL;
> > +	struct addrinfo *ai = NULL;
> > 	if (readline(fileno(f), &lbuf, &lbuflen) != 1)
> > 		return;
> > 
> > @@ -106,16 +107,12 @@ static void auth_unix_ip(FILE *f)
> > 
> > 	/* addr is a valid, interesting address, find the domain name... */
> > 	if (!use_ipaddr) {
> > -		struct addrinfo *ai = NULL;
> > -
> > 		ai = client_resolve(tmp->ai_addr);
> > -		if (ai == NULL)
> > -			goto out;
> > 		client = client_compose(ai);
> > 		freeaddrinfo(ai);
> > -		if (!client)
> > -			goto out;
> > 	}
> > +	freeaddrinfo(tmp);
> > +
> > 	qword_print(f, "nfsd");
> > 	qword_print(f, ipaddr);
> > 	qword_printuint(f, time(0) + DEFAULT_TTL);
> > @@ -127,9 +124,6 @@ static void auth_unix_ip(FILE *f)
> > 	xlog(D_CALL, "auth_unix_ip: client %p '%s'", client, client?client: "DEFAULT");
> > 
> > 	free(client);
> > -out:
> > -	freeaddrinfo(tmp);
> > -
> > }
> > 
> > static void auth_unix_gid(FILE *f)
> > -- 
> > 1.7.11.7
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chuck Lever Nov. 26, 2012, 10:38 p.m. UTC | #3
On Nov 26, 2012, at 5:15 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Mon, Nov 26, 2012 at 05:05:22PM -0500, Chuck Lever wrote:
>> 
>> On Nov 26, 2012, at 5:03 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
>> 
>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>> 
>>> This reverts commit 485f7a21e1649797f29317b865cbb094c1f6a71d.  The
>>> failures handled there could be any sort of name resolution failure, not
>>> just an allocation, and failing to downcall (hence leaving the client
>>> hanging) is not the correct thing to do in those cases.
>> 
>> The problem is in the kernel, then: a downcall should be allowed to fail, IMO.
> 
> In this case, after a revert, a failure here will result in the downcall
> passing down a client named "DEFAULT".  Presumably that won't be
> permitted access to the export, so the client will end up getting an
> error.

"A failure here" can mean either malloc() returned NULL in client_resolve() or client_compose(), or . . . ?

> But I may not understand your objection.

The main problem is I don't understand your patch description.  :-)

I don't seem to have commit 485f7a21 in my nfs-utils git tree (it's helpful to include the short description for such a case).

What exactly is the problem with the current code?

client_resolve() can return a NULL in some cases.  Why is it OK to pass a NULL "ai" to client_compose() ?  Looks like that can result in a mountd segfault.  The kernel won't get any downcall reply in that case!  Is that what you are trying to fix?

WRT my original objection: In general I don't see how to make it impossible for mountd to fail.  Thus the kernel needs to be better about recovering when mountd suddenly disappears.

> 
> --b.
> 
>> 
>>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>>> ---
>>> utils/mountd/cache.c | 12 +++---------
>>> 1 file changed, 3 insertions(+), 9 deletions(-)
>>> 
>>> diff --git a/utils/mountd/cache.c b/utils/mountd/cache.c
>>> index 8f14032..6710eca 100644
>>> --- a/utils/mountd/cache.c
>>> +++ b/utils/mountd/cache.c
>>> @@ -84,6 +84,7 @@ static void auth_unix_ip(FILE *f)
>>> 	char ipaddr[INET6_ADDRSTRLEN];
>>> 	char *client = NULL;
>>> 	struct addrinfo *tmp = NULL;
>>> +	struct addrinfo *ai = NULL;
>>> 	if (readline(fileno(f), &lbuf, &lbuflen) != 1)
>>> 		return;
>>> 
>>> @@ -106,16 +107,12 @@ static void auth_unix_ip(FILE *f)
>>> 
>>> 	/* addr is a valid, interesting address, find the domain name... */
>>> 	if (!use_ipaddr) {
>>> -		struct addrinfo *ai = NULL;
>>> -
>>> 		ai = client_resolve(tmp->ai_addr);
>>> -		if (ai == NULL)
>>> -			goto out;
>>> 		client = client_compose(ai);
>>> 		freeaddrinfo(ai);
>>> -		if (!client)
>>> -			goto out;
>>> 	}
>>> +	freeaddrinfo(tmp);
>>> +
>>> 	qword_print(f, "nfsd");
>>> 	qword_print(f, ipaddr);
>>> 	qword_printuint(f, time(0) + DEFAULT_TTL);
>>> @@ -127,9 +124,6 @@ static void auth_unix_ip(FILE *f)
>>> 	xlog(D_CALL, "auth_unix_ip: client %p '%s'", client, client?client: "DEFAULT");
>>> 
>>> 	free(client);
>>> -out:
>>> -	freeaddrinfo(tmp);
>>> -
>>> }
>>> 
>>> static void auth_unix_gid(FILE *f)
>>> -- 
>>> 1.7.11.7
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
>> 
>> 
>> 

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields Nov. 26, 2012, 10:51 p.m. UTC | #4
On Mon, Nov 26, 2012 at 05:38:49PM -0500, Chuck Lever wrote:
> 
> On Nov 26, 2012, at 5:15 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Mon, Nov 26, 2012 at 05:05:22PM -0500, Chuck Lever wrote:
> >> 
> >> On Nov 26, 2012, at 5:03 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> >> 
> >>> From: "J. Bruce Fields" <bfields@redhat.com>
> >>> 
> >>> This reverts commit 485f7a21e1649797f29317b865cbb094c1f6a71d.  The
> >>> failures handled there could be any sort of name resolution failure, not
> >>> just an allocation, and failing to downcall (hence leaving the client
> >>> hanging) is not the correct thing to do in those cases.
> >> 
> >> The problem is in the kernel, then: a downcall should be allowed to fail, IMO.
> > 
> > In this case, after a revert, a failure here will result in the downcall
> > passing down a client named "DEFAULT".  Presumably that won't be
> > permitted access to the export, so the client will end up getting an
> > error.
> 
> "A failure here" can mean either malloc() returned NULL in client_resolve() or client_compose(), or . . . ?

Looks like it'd also fail if we couldn't map the client's ip address to
a name.

> > But I may not understand your objection.
> 
> The main problem is I don't understand your patch description.  :-)
> 
> I don't seem to have commit 485f7a21 in my nfs-utils git tree (it's
> helpful to include the short description for such a case).

Ah, crap, sorry, looks like I reverted a local commit of that patch....
The upstream commit is bf6a4febaa78bf188896b7b5b02c46562dd08b70 (and
the short description is in the subject line above).

> What exactly is the problem with the current code?
> 
> client_resolve() can return a NULL in some cases.  Why is it OK to
> pass a NULL "ai" to client_compose() ?  Looks like that can result in
> a mountd segfault.

Bah, I thought I'd checked this and found it was prepared to handle
that, but no:

	client_check->check_wildcard()

looks like it can oops.

OK, I'll take another look.

> The kernel won't get any downcall reply in that
> case!  Is that what you are trying to fix?
> 
> WRT my original objection: In general I don't see how to make it
> impossible for mountd to fail.

Sure, but mountd is required for the server to function, so it's just a
question of how we fail.

> Thus the kernel needs to be better about recovering when mountd
> suddenly disappears.

Currently it drops and lets the client retry.  I suspect that's the
correct thing to do, but alternatives are welcomed.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chuck Lever Nov. 26, 2012, 11:10 p.m. UTC | #5
On Nov 26, 2012, at 5:51 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Mon, Nov 26, 2012 at 05:38:49PM -0500, Chuck Lever wrote:
>> 
>> On Nov 26, 2012, at 5:15 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
>> 
>>> On Mon, Nov 26, 2012 at 05:05:22PM -0500, Chuck Lever wrote:
>>>> 
>>>> On Nov 26, 2012, at 5:03 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
>>>> 
>>>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>>>> 
>>>>> This reverts commit 485f7a21e1649797f29317b865cbb094c1f6a71d.  The
>>>>> failures handled there could be any sort of name resolution failure, not
>>>>> just an allocation, and failing to downcall (hence leaving the client
>>>>> hanging) is not the correct thing to do in those cases.
>>>> 
>>>> The problem is in the kernel, then: a downcall should be allowed to fail, IMO.
>>> 
>>> In this case, after a revert, a failure here will result in the downcall
>>> passing down a client named "DEFAULT".  Presumably that won't be
>>> permitted access to the export, so the client will end up getting an
>>> error.
>> 
>> "A failure here" can mean either malloc() returned NULL in client_resolve() or client_compose(), or . . . ?
> 
> Looks like it'd also fail if we couldn't map the client's ip address to
> a name.

That's the common failure mode here, which is now treated just like a malloc(3) failure.

>> What exactly is the problem with the current code?

Anyway, it would help my addled brain if the problem description in the next version of this patch was more clear about what the code is doing wrong now.

>> The kernel won't get any downcall reply in that
>> case!  Is that what you are trying to fix?
>> 
>> WRT my original objection: In general I don't see how to make it
>> impossible for mountd to fail.
> 
> Sure, but mountd is required for the server to function, so it's just a
> question of how we fail.
> 
>> Thus the kernel needs to be better about recovering when mountd
>> suddenly disappears.
> 
> Currently it drops and lets the client retry.  I suspect that's the
> correct thing to do, but alternatives are welcomed.

By "drops" do you mean the server drops the NFS request, and the client retransmits the request?  That's actually pretty unfriendly behavior, IMO, since an application on a client is typically stuck at that point until that RPC gets some sort of result (after possibly several RTOs).

Maybe the server could signal an NFS error of some kind and let the client decide if it wants to retry until the server is working again, or fail that request immediately.

(Also, if NFSv4 is dependent on a mountd upcall, it is not supposed to drop a request, AFAIK).

OK, but this is a separate issue from the case you are trying to fix.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Dickson Nov. 27, 2012, 2:23 p.m. UTC | #6
Hello,

On 26/11/12 17:51, J. Bruce Fields wrote:
>> What exactly is the problem with the current code?
>> > 
>> > client_resolve() can return a NULL in some cases.  Why is it OK to
>> > pass a NULL "ai" to client_compose() ?  Looks like that can result in
>> > a mountd segfault.
> Bah, I thought I'd checked this and found it was prepared to handle
> that, but no:
> 
> 	client_check->check_wildcard()
> 
> looks like it can oops.
Yeah... added client_check->check_netgroup() to this list...

> 
> OK, I'll take another look.
A simply check that ai is not null when calling client_compose() should work..

> 
>> > The kernel won't get any downcall reply in that
>> > case!  Is that what you are trying to fix?
>> > 
>> > WRT my original objection: In general I don't see how to make it
>> > impossible for mountd to fail.
> Sure, but mountd is required for the server to function, so it's just a
> question of how we fail.
> 
>> > Thus the kernel needs to be better about recovering when mountd
>> > suddenly disappears.
> Currently it drops and lets the client retry.  I suspect that's the
> correct thing to do, but alternatives are welcomed.
This one things I didn't see happen... the client retrying... expected it would...

steved.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/utils/mountd/cache.c b/utils/mountd/cache.c
index 8f14032..6710eca 100644
--- a/utils/mountd/cache.c
+++ b/utils/mountd/cache.c
@@ -84,6 +84,7 @@  static void auth_unix_ip(FILE *f)
 	char ipaddr[INET6_ADDRSTRLEN];
 	char *client = NULL;
 	struct addrinfo *tmp = NULL;
+	struct addrinfo *ai = NULL;
 	if (readline(fileno(f), &lbuf, &lbuflen) != 1)
 		return;
 
@@ -106,16 +107,12 @@  static void auth_unix_ip(FILE *f)
 
 	/* addr is a valid, interesting address, find the domain name... */
 	if (!use_ipaddr) {
-		struct addrinfo *ai = NULL;
-
 		ai = client_resolve(tmp->ai_addr);
-		if (ai == NULL)
-			goto out;
 		client = client_compose(ai);
 		freeaddrinfo(ai);
-		if (!client)
-			goto out;
 	}
+	freeaddrinfo(tmp);
+
 	qword_print(f, "nfsd");
 	qword_print(f, ipaddr);
 	qword_printuint(f, time(0) + DEFAULT_TTL);
@@ -127,9 +124,6 @@  static void auth_unix_ip(FILE *f)
 	xlog(D_CALL, "auth_unix_ip: client %p '%s'", client, client?client: "DEFAULT");
 
 	free(client);
-out:
-	freeaddrinfo(tmp);
-
 }
 
 static void auth_unix_gid(FILE *f)