diff mbox series

[v2,2/6] net/handshake: Fix handshake_dup() ref counting

Message ID 168333395123.7813.7077088598355438510.stgit@oracle-102.nfsv4bat.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Bug fixes for net/handshake | expand

Checks

Context Check Description
netdev/series_format warning Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/cc_maintainers fail 2 blamed authors not CCed: kuba@kernel.org chuck.lever@oracle.com; 5 maintainers not CCed: kuba@kernel.org chuck.lever@oracle.com davem@davemloft.net pabeni@redhat.com edumazet@google.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Chuck Lever May 6, 2023, 12:46 a.m. UTC
From: Chuck Lever <chuck.lever@oracle.com>

If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
twice.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/handshake/netlink.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Simon Horman May 6, 2023, 4:07 p.m. UTC | #1
On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
> twice.
> 
> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Leon Romanovsky May 7, 2023, 8:25 a.m. UTC | #2
On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
> twice.
> 
> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/handshake/netlink.c |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> index 7ec8a76c3c8a..032d96152e2f 100644
> --- a/net/handshake/netlink.c
> +++ b/net/handshake/netlink.c
> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock)
>  
>  	file = get_file(sock->file);
>  	newfd = get_unused_fd_flags(O_CLOEXEC);
> -	if (newfd < 0) {
> -		fput(file);
> +	if (newfd < 0)
>  		return newfd;

IMHO, the better way to fix it is to change handshake_nl_accept_doit()
do not call to fput(sock->file) in error case. It is not right thing
to have a call to handshake_dup() and rely on elevated get_file()
for failure too as it will be problematic for future extension of
handshake_dup().

Thanks

> -	}
>  
>  	fd_install(newfd, file);
>  	return newfd;
> 
> 
>
Paolo Abeni May 9, 2023, 7:04 a.m. UTC | #3
On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote:
> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> > 
> > If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
> > twice.
> > 
> > Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >  net/handshake/netlink.c |    4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> > index 7ec8a76c3c8a..032d96152e2f 100644
> > --- a/net/handshake/netlink.c
> > +++ b/net/handshake/netlink.c
> > @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock)
> >  
> >  	file = get_file(sock->file);
> >  	newfd = get_unused_fd_flags(O_CLOEXEC);
> > -	if (newfd < 0) {
> > -		fput(file);
> > +	if (newfd < 0)
> >  		return newfd;
> 
> IMHO, the better way to fix it is to change handshake_nl_accept_doit()
> do not call to fput(sock->file) in error case. It is not right thing
> to have a call to handshake_dup() and rely on elevated get_file()
> for failure too as it will be problematic for future extension of
> handshake_dup().

I agree with the above: I think a failing helper should leave the
larger scope status unmodified. In this case a failing handshake_dup()
should not touch file refcount, and handshake_nl_accept_doit() should
be modified accordingly, something alike:

---
diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
index e865fcf68433..8897a17189ad 100644
--- a/net/handshake/netlink.c
+++ b/net/handshake/netlink.c
@@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info)
 	}
 	err = req->hr_proto->hp_accept(req, info, fd);
 	if (err)
-		goto out_complete;
+		goto out_put;
 
 	trace_handshake_cmd_accept(net, req, req->hr_sk, fd);
 	return 0;
 
+out_put:
+	fput(sock->file);
 out_complete:
 	handshake_complete(req, -EIO, NULL);
-	fput(sock->file);
 out_status:
 	trace_handshake_cmd_accept_err(net, req, NULL, err);
 	return err;
---

Somewhat related: handshake_nl_done_doit() releases the file refcount
even if the req lookup fails. If that is caused by a concurrent
req_cancel - not sure if possible at all, possibly syzkaller could
guess if instructed about the API - such refcount will underflow, as it
is rightfully decremented by req_cancel, too.

I think it should be safer adding a chunk like:

---
diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
index e865fcf68433..3e3e849f302a 100644
--- a/net/handshake/netlink.c
+++ b/net/handshake/netlink.c
@@ -172,7 +173,6 @@ int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info)
 	req = handshake_req_hash_lookup(sock->sk);
 	if (!req) {
 		err = -EBUSY;
-		fput(sock->file);
 		goto out_status;
 	}
---

Possibly explicitly documenting the used ownership rules for the file
refcount in the relevant functions could help with future maintenance.

Finally it's not clear to me if we agreed to a target tree or not ;) I
see no replies so my suggestion.

Thanks!

Paolo
Chuck Lever III May 9, 2023, 1:59 p.m. UTC | #4
> On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> 
> On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote:
>> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
>>> From: Chuck Lever <chuck.lever@oracle.com>
>>> 
>>> If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
>>> twice.
>>> 
>>> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
>>> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> net/handshake/netlink.c |    4 +---
>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>> 
>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>>> index 7ec8a76c3c8a..032d96152e2f 100644
>>> --- a/net/handshake/netlink.c
>>> +++ b/net/handshake/netlink.c
>>> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock)
>>> 
>>> file = get_file(sock->file);
>>> newfd = get_unused_fd_flags(O_CLOEXEC);
>>> - if (newfd < 0) {
>>> - fput(file);
>>> + if (newfd < 0)
>>> return newfd;
>> 
>> IMHO, the better way to fix it is to change handshake_nl_accept_doit()
>> do not call to fput(sock->file) in error case. It is not right thing
>> to have a call to handshake_dup() and rely on elevated get_file()
>> for failure too as it will be problematic for future extension of
>> handshake_dup().
> 
> I agree with the above: I think a failing helper should leave the
> larger scope status unmodified. In this case a failing handshake_dup()
> should not touch file refcount, and handshake_nl_accept_doit() should
> be modified accordingly, something alike:
> 
> ---
> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> index e865fcf68433..8897a17189ad 100644
> --- a/net/handshake/netlink.c
> +++ b/net/handshake/netlink.c
> @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info)
> }
> err = req->hr_proto->hp_accept(req, info, fd);
> if (err)
> - goto out_complete;
> + goto out_put;
> 
> trace_handshake_cmd_accept(net, req, req->hr_sk, fd);
> return 0;
> 
> +out_put:
> + fput(sock->file);
> out_complete:
> handshake_complete(req, -EIO, NULL);
> - fput(sock->file);
> out_status:
> trace_handshake_cmd_accept_err(net, req, NULL, err);
> return err;

I'm happy to accommodate these changes, but it's not clear to me
whether you want this hunk applied /in addition to/ my fix or
/instead of/.


> ---
> 
> Somewhat related: handshake_nl_done_doit() releases the file refcount
> even if the req lookup fails.

That's because sockfd_lookup() increments the file ref count.


> If that is caused by a concurrent
> req_cancel - not sure if possible at all, possibly syzkaller could
> guess if instructed about the API - such refcount will underflow, as it
> is rightfully decremented by req_cancel, too.

More likely, req_cancel might take the file ref count to zero
before sockfd_lookup can increment it, resulting in a UAF?

Let me think about this.


> I think it should be safer adding a chunk like:
> 
> ---
> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> index e865fcf68433..3e3e849f302a 100644
> --- a/net/handshake/netlink.c
> +++ b/net/handshake/netlink.c
> @@ -172,7 +173,6 @@ int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info)
> req = handshake_req_hash_lookup(sock->sk);
> if (!req) {
> err = -EBUSY;
> - fput(sock->file);
> goto out_status;
> }
> ---
> 
> Possibly explicitly documenting the used ownership rules for the file
> refcount in the relevant functions could help with future maintenance.
> 
> Finally it's not clear to me if we agreed to a target tree or not ;) I
> see no replies so my suggestion.

Since we expect other subsystems to use net/handshake besides
SunRPC, I agree to going with netdev, even for this series once
it is acceptable.


--
Chuck Lever
Paolo Abeni May 9, 2023, 2:22 p.m. UTC | #5
On Tue, 2023-05-09 at 13:59 +0000, Chuck Lever III wrote:
> 
> > On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > 
> > On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote:
> > > On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
> > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > 
> > > > If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
> > > > twice.
> > > > 
> > > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> > > > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
> > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > ---
> > > > net/handshake/netlink.c |    4 +---
> > > > 1 file changed, 1 insertion(+), 3 deletions(-)
> > > > 
> > > > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> > > > index 7ec8a76c3c8a..032d96152e2f 100644
> > > > --- a/net/handshake/netlink.c
> > > > +++ b/net/handshake/netlink.c
> > > > @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock)
> > > > 
> > > > file = get_file(sock->file);
> > > > newfd = get_unused_fd_flags(O_CLOEXEC);
> > > > - if (newfd < 0) {
> > > > - fput(file);
> > > > + if (newfd < 0)
> > > > return newfd;
> > > 
> > > IMHO, the better way to fix it is to change handshake_nl_accept_doit()
> > > do not call to fput(sock->file) in error case. It is not right thing
> > > to have a call to handshake_dup() and rely on elevated get_file()
> > > for failure too as it will be problematic for future extension of
> > > handshake_dup().
> > 
> > I agree with the above: I think a failing helper should leave the
> > larger scope status unmodified. In this case a failing handshake_dup()
> > should not touch file refcount, and handshake_nl_accept_doit() should
> > be modified accordingly, something alike:
> > 
> > ---
> > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
> > index e865fcf68433..8897a17189ad 100644
> > --- a/net/handshake/netlink.c
> > +++ b/net/handshake/netlink.c
> > @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info)
> > }
> > err = req->hr_proto->hp_accept(req, info, fd);
> > if (err)
> > - goto out_complete;
> > + goto out_put;
> > 
> > trace_handshake_cmd_accept(net, req, req->hr_sk, fd);
> > return 0;
> > 
> > +out_put:
> > + fput(sock->file);
> > out_complete:
> > handshake_complete(req, -EIO, NULL);
> > - fput(sock->file);
> > out_status:
> > trace_handshake_cmd_accept_err(net, req, NULL, err);
> > return err;
> 
> I'm happy to accommodate these changes, but it's not clear to me
> whether you want this hunk applied /in addition to/ my fix or
> /instead of/.

It's above (completely untested!) chunk is intended to be a replace for
patch 2/6

> > ---
> > 
> > Somewhat related: handshake_nl_done_doit() releases the file refcount
> > even if the req lookup fails.
> 
> That's because sockfd_lookup() increments the file ref count.

Ooops, I missed that.

Then in the successful path handshake_nl_done_doit() should call
fput() twice ?!? 1 for the reference acquired by sockfd_lookup() and 1
for the reference owned by  'req' ?!? Otherwise a ref will be leaked.

> > If that is caused by a concurrent
> > req_cancel - not sure if possible at all, possibly syzkaller could
> > guess if instructed about the API - such refcount will underflow, as it
> > is rightfully decremented by req_cancel, too.
> 
> More likely, req_cancel might take the file ref count to zero
> before sockfd_lookup can increment it, resulting in a UAF?
> 
> Let me think about this.

I now think this race is not possible, but I now fear the refcount leak
mentioned above.

Cheers,

Paolo
Chuck Lever III May 9, 2023, 2:32 p.m. UTC | #6
> On May 9, 2023, at 7:22 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> 
> On Tue, 2023-05-09 at 13:59 +0000, Chuck Lever III wrote:
>> 
>>> On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>>> 
>>> On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote:
>>>> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
>>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>> 
>>>>> If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
>>>>> twice.
>>>>> 
>>>>> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
>>>>> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>>> ---
>>>>> net/handshake/netlink.c |    4 +---
>>>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>>> 
>>>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>>>>> index 7ec8a76c3c8a..032d96152e2f 100644
>>>>> --- a/net/handshake/netlink.c
>>>>> +++ b/net/handshake/netlink.c
>>>>> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock)
>>>>> 
>>>>> file = get_file(sock->file);
>>>>> newfd = get_unused_fd_flags(O_CLOEXEC);
>>>>> - if (newfd < 0) {
>>>>> - fput(file);
>>>>> + if (newfd < 0)
>>>>> return newfd;
>>>> 
>>>> IMHO, the better way to fix it is to change handshake_nl_accept_doit()
>>>> do not call to fput(sock->file) in error case. It is not right thing
>>>> to have a call to handshake_dup() and rely on elevated get_file()
>>>> for failure too as it will be problematic for future extension of
>>>> handshake_dup().
>>> 
>>> I agree with the above: I think a failing helper should leave the
>>> larger scope status unmodified. In this case a failing handshake_dup()
>>> should not touch file refcount, and handshake_nl_accept_doit() should
>>> be modified accordingly, something alike:
>>> 
>>> ---
>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>>> index e865fcf68433..8897a17189ad 100644
>>> --- a/net/handshake/netlink.c
>>> +++ b/net/handshake/netlink.c
>>> @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info)
>>> }
>>> err = req->hr_proto->hp_accept(req, info, fd);
>>> if (err)
>>> - goto out_complete;
>>> + goto out_put;
>>> 
>>> trace_handshake_cmd_accept(net, req, req->hr_sk, fd);
>>> return 0;
>>> 
>>> +out_put:
>>> + fput(sock->file);
>>> out_complete:
>>> handshake_complete(req, -EIO, NULL);
>>> - fput(sock->file);
>>> out_status:
>>> trace_handshake_cmd_accept_err(net, req, NULL, err);
>>> return err;
>> 
>> I'm happy to accommodate these changes, but it's not clear to me
>> whether you want this hunk applied /in addition to/ my fix or
>> /instead of/.
> 
> It's above (completely untested!) chunk is intended to be a replace for
> patch 2/6
> 
>>> ---
>>> 
>>> Somewhat related: handshake_nl_done_doit() releases the file refcount
>>> even if the req lookup fails.
>> 
>> That's because sockfd_lookup() increments the file ref count.
> 
> Ooops, I missed that.
> 
> Then in the successful path handshake_nl_done_doit() should call
> fput() twice ?!? 1 for the reference acquired by sockfd_lookup() and 1
> for the reference owned by  'req' ?!? Otherwise a ref will be leaked.
> 
>>> If that is caused by a concurrent
>>> req_cancel - not sure if possible at all, possibly syzkaller could
>>> guess if instructed about the API - such refcount will underflow, as it
>>> is rightfully decremented by req_cancel, too.
>> 
>> More likely, req_cancel might take the file ref count to zero
>> before sockfd_lookup can increment it, resulting in a UAF?
>> 
>> Let me think about this.
> 
> I now think this race is not possible, but I now fear the refcount leak
> mentioned above.

Not sure why I haven't seen evidence of a leak here. I'll have a closer look.


--
Chuck Lever
Chuck Lever III May 16, 2023, 8:05 p.m. UTC | #7
> On May 9, 2023, at 10:32 AM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On May 9, 2023, at 7:22 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>> 
>> On Tue, 2023-05-09 at 13:59 +0000, Chuck Lever III wrote:
>>> 
>>>> On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>>>> 
>>>> On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote:
>>>>> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote:
>>>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>>> 
>>>>>> If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
>>>>>> twice.
>>>>>> 
>>>>>> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
>>>>>> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
>>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>>>> ---
>>>>>> net/handshake/netlink.c |    4 +---
>>>>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>>>> 
>>>>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>>>>>> index 7ec8a76c3c8a..032d96152e2f 100644
>>>>>> --- a/net/handshake/netlink.c
>>>>>> +++ b/net/handshake/netlink.c
>>>>>> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock)
>>>>>> 
>>>>>> file = get_file(sock->file);
>>>>>> newfd = get_unused_fd_flags(O_CLOEXEC);
>>>>>> - if (newfd < 0) {
>>>>>> - fput(file);
>>>>>> + if (newfd < 0)
>>>>>> return newfd;
>>>>> 
>>>>> IMHO, the better way to fix it is to change handshake_nl_accept_doit()
>>>>> do not call to fput(sock->file) in error case. It is not right thing
>>>>> to have a call to handshake_dup() and rely on elevated get_file()
>>>>> for failure too as it will be problematic for future extension of
>>>>> handshake_dup().
>>>> 
>>>> I agree with the above: I think a failing helper should leave the
>>>> larger scope status unmodified. In this case a failing handshake_dup()
>>>> should not touch file refcount, and handshake_nl_accept_doit() should
>>>> be modified accordingly, something alike:
>>>> 
>>>> ---
>>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
>>>> index e865fcf68433..8897a17189ad 100644
>>>> --- a/net/handshake/netlink.c
>>>> +++ b/net/handshake/netlink.c
>>>> @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info)
>>>> }
>>>> err = req->hr_proto->hp_accept(req, info, fd);
>>>> if (err)
>>>> - goto out_complete;
>>>> + goto out_put;
>>>> 
>>>> trace_handshake_cmd_accept(net, req, req->hr_sk, fd);
>>>> return 0;
>>>> 
>>>> +out_put:
>>>> + fput(sock->file);
>>>> out_complete:
>>>> handshake_complete(req, -EIO, NULL);
>>>> - fput(sock->file);
>>>> out_status:
>>>> trace_handshake_cmd_accept_err(net, req, NULL, err);
>>>> return err;
>>> 
>>> I'm happy to accommodate these changes, but it's not clear to me
>>> whether you want this hunk applied /in addition to/ my fix or
>>> /instead of/.
>> 
>> It's above (completely untested!) chunk is intended to be a replace for
>> patch 2/6
>> 
>>>> ---
>>>> 
>>>> Somewhat related: handshake_nl_done_doit() releases the file refcount
>>>> even if the req lookup fails.
>>> 
>>> That's because sockfd_lookup() increments the file ref count.
>> 
>> Ooops, I missed that.
>> 
>> Then in the successful path handshake_nl_done_doit() should call
>> fput() twice ?!? 1 for the reference acquired by sockfd_lookup() and 1
>> for the reference owned by  'req' ?!? Otherwise a ref will be leaked.
>> 
>>>> If that is caused by a concurrent
>>>> req_cancel - not sure if possible at all, possibly syzkaller could
>>>> guess if instructed about the API - such refcount will underflow, as it
>>>> is rightfully decremented by req_cancel, too.
>>> 
>>> More likely, req_cancel might take the file ref count to zero
>>> before sockfd_lookup can increment it, resulting in a UAF?
>>> 
>>> Let me think about this.
>> 
>> I now think this race is not possible, but I now fear the refcount leak
>> mentioned above.
> 
> Not sure why I haven't seen evidence of a leak here. I'll have a closer look.

I added provisional trace points to report the value of f_count
before the fput() in done_doit and also where the RPC client
destroys the socket. The latter reports f_count is 1, as I would
expect, just before the final __fput_sync().


--
Chuck Lever
diff mbox series

Patch

diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
index 7ec8a76c3c8a..032d96152e2f 100644
--- a/net/handshake/netlink.c
+++ b/net/handshake/netlink.c
@@ -101,10 +101,8 @@  static int handshake_dup(struct socket *sock)
 
 	file = get_file(sock->file);
 	newfd = get_unused_fd_flags(O_CLOEXEC);
-	if (newfd < 0) {
-		fput(file);
+	if (newfd < 0)
 		return newfd;
-	}
 
 	fd_install(newfd, file);
 	return newfd;