Message ID | 168333395123.7813.7077088598355438510.stgit@oracle-102.nfsv4bat.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Bug fixes for net/handshake | expand |
On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > If get_unused_fd_flags() fails, we ended up calling fput(sock->file) > twice. > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org> > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Simon Horman <simon.horman@corigine.com>
On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: > From: Chuck Lever <chuck.lever@oracle.com> > > If get_unused_fd_flags() fails, we ended up calling fput(sock->file) > twice. > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org> > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > --- > net/handshake/netlink.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c > index 7ec8a76c3c8a..032d96152e2f 100644 > --- a/net/handshake/netlink.c > +++ b/net/handshake/netlink.c > @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) > > file = get_file(sock->file); > newfd = get_unused_fd_flags(O_CLOEXEC); > - if (newfd < 0) { > - fput(file); > + if (newfd < 0) > return newfd; IMHO, the better way to fix it is to change handshake_nl_accept_doit() do not call to fput(sock->file) in error case. It is not right thing to have a call to handshake_dup() and rely on elevated get_file() for failure too as it will be problematic for future extension of handshake_dup(). Thanks > - } > > fd_install(newfd, file); > return newfd; > > >
On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote: > On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: > > From: Chuck Lever <chuck.lever@oracle.com> > > > > If get_unused_fd_flags() fails, we ended up calling fput(sock->file) > > twice. > > > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org> > > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > > --- > > net/handshake/netlink.c | 4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c > > index 7ec8a76c3c8a..032d96152e2f 100644 > > --- a/net/handshake/netlink.c > > +++ b/net/handshake/netlink.c > > @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) > > > > file = get_file(sock->file); > > newfd = get_unused_fd_flags(O_CLOEXEC); > > - if (newfd < 0) { > > - fput(file); > > + if (newfd < 0) > > return newfd; > > IMHO, the better way to fix it is to change handshake_nl_accept_doit() > do not call to fput(sock->file) in error case. It is not right thing > to have a call to handshake_dup() and rely on elevated get_file() > for failure too as it will be problematic for future extension of > handshake_dup(). I agree with the above: I think a failing helper should leave the larger scope status unmodified. In this case a failing handshake_dup() should not touch file refcount, and handshake_nl_accept_doit() should be modified accordingly, something alike: --- diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c index e865fcf68433..8897a17189ad 100644 --- a/net/handshake/netlink.c +++ b/net/handshake/netlink.c @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info) } err = req->hr_proto->hp_accept(req, info, fd); if (err) - goto out_complete; + goto out_put; trace_handshake_cmd_accept(net, req, req->hr_sk, fd); return 0; +out_put: + fput(sock->file); out_complete: handshake_complete(req, -EIO, NULL); - fput(sock->file); out_status: trace_handshake_cmd_accept_err(net, req, NULL, err); return err; --- Somewhat related: handshake_nl_done_doit() releases the file refcount even if the req lookup fails. If that is caused by a concurrent req_cancel - not sure if possible at all, possibly syzkaller could guess if instructed about the API - such refcount will underflow, as it is rightfully decremented by req_cancel, too. I think it should be safer adding a chunk like: --- diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c index e865fcf68433..3e3e849f302a 100644 --- a/net/handshake/netlink.c +++ b/net/handshake/netlink.c @@ -172,7 +173,6 @@ int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info) req = handshake_req_hash_lookup(sock->sk); if (!req) { err = -EBUSY; - fput(sock->file); goto out_status; } --- Possibly explicitly documenting the used ownership rules for the file refcount in the relevant functions could help with future maintenance. Finally it's not clear to me if we agreed to a target tree or not ;) I see no replies so my suggestion. Thanks! Paolo
> On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote: > > On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote: >> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: >>> From: Chuck Lever <chuck.lever@oracle.com> >>> >>> If get_unused_fd_flags() fails, we ended up calling fput(sock->file) >>> twice. >>> >>> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> >>> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") >>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>> --- >>> net/handshake/netlink.c | 4 +--- >>> 1 file changed, 1 insertion(+), 3 deletions(-) >>> >>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c >>> index 7ec8a76c3c8a..032d96152e2f 100644 >>> --- a/net/handshake/netlink.c >>> +++ b/net/handshake/netlink.c >>> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) >>> >>> file = get_file(sock->file); >>> newfd = get_unused_fd_flags(O_CLOEXEC); >>> - if (newfd < 0) { >>> - fput(file); >>> + if (newfd < 0) >>> return newfd; >> >> IMHO, the better way to fix it is to change handshake_nl_accept_doit() >> do not call to fput(sock->file) in error case. It is not right thing >> to have a call to handshake_dup() and rely on elevated get_file() >> for failure too as it will be problematic for future extension of >> handshake_dup(). > > I agree with the above: I think a failing helper should leave the > larger scope status unmodified. In this case a failing handshake_dup() > should not touch file refcount, and handshake_nl_accept_doit() should > be modified accordingly, something alike: > > --- > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c > index e865fcf68433..8897a17189ad 100644 > --- a/net/handshake/netlink.c > +++ b/net/handshake/netlink.c > @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info) > } > err = req->hr_proto->hp_accept(req, info, fd); > if (err) > - goto out_complete; > + goto out_put; > > trace_handshake_cmd_accept(net, req, req->hr_sk, fd); > return 0; > > +out_put: > + fput(sock->file); > out_complete: > handshake_complete(req, -EIO, NULL); > - fput(sock->file); > out_status: > trace_handshake_cmd_accept_err(net, req, NULL, err); > return err; I'm happy to accommodate these changes, but it's not clear to me whether you want this hunk applied /in addition to/ my fix or /instead of/. > --- > > Somewhat related: handshake_nl_done_doit() releases the file refcount > even if the req lookup fails. That's because sockfd_lookup() increments the file ref count. > If that is caused by a concurrent > req_cancel - not sure if possible at all, possibly syzkaller could > guess if instructed about the API - such refcount will underflow, as it > is rightfully decremented by req_cancel, too. More likely, req_cancel might take the file ref count to zero before sockfd_lookup can increment it, resulting in a UAF? Let me think about this. > I think it should be safer adding a chunk like: > > --- > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c > index e865fcf68433..3e3e849f302a 100644 > --- a/net/handshake/netlink.c > +++ b/net/handshake/netlink.c > @@ -172,7 +173,6 @@ int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *info) > req = handshake_req_hash_lookup(sock->sk); > if (!req) { > err = -EBUSY; > - fput(sock->file); > goto out_status; > } > --- > > Possibly explicitly documenting the used ownership rules for the file > refcount in the relevant functions could help with future maintenance. > > Finally it's not clear to me if we agreed to a target tree or not ;) I > see no replies so my suggestion. Since we expect other subsystems to use net/handshake besides SunRPC, I agree to going with netdev, even for this series once it is acceptable. -- Chuck Lever
On Tue, 2023-05-09 at 13:59 +0000, Chuck Lever III wrote: > > > On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote: > > > > On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote: > > > On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: > > > > From: Chuck Lever <chuck.lever@oracle.com> > > > > > > > > If get_unused_fd_flags() fails, we ended up calling fput(sock->file) > > > > twice. > > > > > > > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org> > > > > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") > > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com> > > > > --- > > > > net/handshake/netlink.c | 4 +--- > > > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > > > > > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c > > > > index 7ec8a76c3c8a..032d96152e2f 100644 > > > > --- a/net/handshake/netlink.c > > > > +++ b/net/handshake/netlink.c > > > > @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) > > > > > > > > file = get_file(sock->file); > > > > newfd = get_unused_fd_flags(O_CLOEXEC); > > > > - if (newfd < 0) { > > > > - fput(file); > > > > + if (newfd < 0) > > > > return newfd; > > > > > > IMHO, the better way to fix it is to change handshake_nl_accept_doit() > > > do not call to fput(sock->file) in error case. It is not right thing > > > to have a call to handshake_dup() and rely on elevated get_file() > > > for failure too as it will be problematic for future extension of > > > handshake_dup(). > > > > I agree with the above: I think a failing helper should leave the > > larger scope status unmodified. In this case a failing handshake_dup() > > should not touch file refcount, and handshake_nl_accept_doit() should > > be modified accordingly, something alike: > > > > --- > > diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c > > index e865fcf68433..8897a17189ad 100644 > > --- a/net/handshake/netlink.c > > +++ b/net/handshake/netlink.c > > @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info) > > } > > err = req->hr_proto->hp_accept(req, info, fd); > > if (err) > > - goto out_complete; > > + goto out_put; > > > > trace_handshake_cmd_accept(net, req, req->hr_sk, fd); > > return 0; > > > > +out_put: > > + fput(sock->file); > > out_complete: > > handshake_complete(req, -EIO, NULL); > > - fput(sock->file); > > out_status: > > trace_handshake_cmd_accept_err(net, req, NULL, err); > > return err; > > I'm happy to accommodate these changes, but it's not clear to me > whether you want this hunk applied /in addition to/ my fix or > /instead of/. It's above (completely untested!) chunk is intended to be a replace for patch 2/6 > > --- > > > > Somewhat related: handshake_nl_done_doit() releases the file refcount > > even if the req lookup fails. > > That's because sockfd_lookup() increments the file ref count. Ooops, I missed that. Then in the successful path handshake_nl_done_doit() should call fput() twice ?!? 1 for the reference acquired by sockfd_lookup() and 1 for the reference owned by 'req' ?!? Otherwise a ref will be leaked. > > If that is caused by a concurrent > > req_cancel - not sure if possible at all, possibly syzkaller could > > guess if instructed about the API - such refcount will underflow, as it > > is rightfully decremented by req_cancel, too. > > More likely, req_cancel might take the file ref count to zero > before sockfd_lookup can increment it, resulting in a UAF? > > Let me think about this. I now think this race is not possible, but I now fear the refcount leak mentioned above. Cheers, Paolo
> On May 9, 2023, at 7:22 AM, Paolo Abeni <pabeni@redhat.com> wrote: > > On Tue, 2023-05-09 at 13:59 +0000, Chuck Lever III wrote: >> >>> On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote: >>> >>> On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote: >>>> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: >>>>> From: Chuck Lever <chuck.lever@oracle.com> >>>>> >>>>> If get_unused_fd_flags() fails, we ended up calling fput(sock->file) >>>>> twice. >>>>> >>>>> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> >>>>> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") >>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>>> --- >>>>> net/handshake/netlink.c | 4 +--- >>>>> 1 file changed, 1 insertion(+), 3 deletions(-) >>>>> >>>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c >>>>> index 7ec8a76c3c8a..032d96152e2f 100644 >>>>> --- a/net/handshake/netlink.c >>>>> +++ b/net/handshake/netlink.c >>>>> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) >>>>> >>>>> file = get_file(sock->file); >>>>> newfd = get_unused_fd_flags(O_CLOEXEC); >>>>> - if (newfd < 0) { >>>>> - fput(file); >>>>> + if (newfd < 0) >>>>> return newfd; >>>> >>>> IMHO, the better way to fix it is to change handshake_nl_accept_doit() >>>> do not call to fput(sock->file) in error case. It is not right thing >>>> to have a call to handshake_dup() and rely on elevated get_file() >>>> for failure too as it will be problematic for future extension of >>>> handshake_dup(). >>> >>> I agree with the above: I think a failing helper should leave the >>> larger scope status unmodified. In this case a failing handshake_dup() >>> should not touch file refcount, and handshake_nl_accept_doit() should >>> be modified accordingly, something alike: >>> >>> --- >>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c >>> index e865fcf68433..8897a17189ad 100644 >>> --- a/net/handshake/netlink.c >>> +++ b/net/handshake/netlink.c >>> @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info) >>> } >>> err = req->hr_proto->hp_accept(req, info, fd); >>> if (err) >>> - goto out_complete; >>> + goto out_put; >>> >>> trace_handshake_cmd_accept(net, req, req->hr_sk, fd); >>> return 0; >>> >>> +out_put: >>> + fput(sock->file); >>> out_complete: >>> handshake_complete(req, -EIO, NULL); >>> - fput(sock->file); >>> out_status: >>> trace_handshake_cmd_accept_err(net, req, NULL, err); >>> return err; >> >> I'm happy to accommodate these changes, but it's not clear to me >> whether you want this hunk applied /in addition to/ my fix or >> /instead of/. > > It's above (completely untested!) chunk is intended to be a replace for > patch 2/6 > >>> --- >>> >>> Somewhat related: handshake_nl_done_doit() releases the file refcount >>> even if the req lookup fails. >> >> That's because sockfd_lookup() increments the file ref count. > > Ooops, I missed that. > > Then in the successful path handshake_nl_done_doit() should call > fput() twice ?!? 1 for the reference acquired by sockfd_lookup() and 1 > for the reference owned by 'req' ?!? Otherwise a ref will be leaked. > >>> If that is caused by a concurrent >>> req_cancel - not sure if possible at all, possibly syzkaller could >>> guess if instructed about the API - such refcount will underflow, as it >>> is rightfully decremented by req_cancel, too. >> >> More likely, req_cancel might take the file ref count to zero >> before sockfd_lookup can increment it, resulting in a UAF? >> >> Let me think about this. > > I now think this race is not possible, but I now fear the refcount leak > mentioned above. Not sure why I haven't seen evidence of a leak here. I'll have a closer look. -- Chuck Lever
> On May 9, 2023, at 10:32 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > >> On May 9, 2023, at 7:22 AM, Paolo Abeni <pabeni@redhat.com> wrote: >> >> On Tue, 2023-05-09 at 13:59 +0000, Chuck Lever III wrote: >>> >>>> On May 9, 2023, at 12:04 AM, Paolo Abeni <pabeni@redhat.com> wrote: >>>> >>>> On Sun, 2023-05-07 at 11:25 +0300, Leon Romanovsky wrote: >>>>> On Fri, May 05, 2023 at 08:46:01PM -0400, Chuck Lever wrote: >>>>>> From: Chuck Lever <chuck.lever@oracle.com> >>>>>> >>>>>> If get_unused_fd_flags() fails, we ended up calling fput(sock->file) >>>>>> twice. >>>>>> >>>>>> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> >>>>>> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") >>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> >>>>>> --- >>>>>> net/handshake/netlink.c | 4 +--- >>>>>> 1 file changed, 1 insertion(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c >>>>>> index 7ec8a76c3c8a..032d96152e2f 100644 >>>>>> --- a/net/handshake/netlink.c >>>>>> +++ b/net/handshake/netlink.c >>>>>> @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) >>>>>> >>>>>> file = get_file(sock->file); >>>>>> newfd = get_unused_fd_flags(O_CLOEXEC); >>>>>> - if (newfd < 0) { >>>>>> - fput(file); >>>>>> + if (newfd < 0) >>>>>> return newfd; >>>>> >>>>> IMHO, the better way to fix it is to change handshake_nl_accept_doit() >>>>> do not call to fput(sock->file) in error case. It is not right thing >>>>> to have a call to handshake_dup() and rely on elevated get_file() >>>>> for failure too as it will be problematic for future extension of >>>>> handshake_dup(). >>>> >>>> I agree with the above: I think a failing helper should leave the >>>> larger scope status unmodified. In this case a failing handshake_dup() >>>> should not touch file refcount, and handshake_nl_accept_doit() should >>>> be modified accordingly, something alike: >>>> >>>> --- >>>> diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c >>>> index e865fcf68433..8897a17189ad 100644 >>>> --- a/net/handshake/netlink.c >>>> +++ b/net/handshake/netlink.c >>>> @@ -138,14 +138,15 @@ int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *info) >>>> } >>>> err = req->hr_proto->hp_accept(req, info, fd); >>>> if (err) >>>> - goto out_complete; >>>> + goto out_put; >>>> >>>> trace_handshake_cmd_accept(net, req, req->hr_sk, fd); >>>> return 0; >>>> >>>> +out_put: >>>> + fput(sock->file); >>>> out_complete: >>>> handshake_complete(req, -EIO, NULL); >>>> - fput(sock->file); >>>> out_status: >>>> trace_handshake_cmd_accept_err(net, req, NULL, err); >>>> return err; >>> >>> I'm happy to accommodate these changes, but it's not clear to me >>> whether you want this hunk applied /in addition to/ my fix or >>> /instead of/. >> >> It's above (completely untested!) chunk is intended to be a replace for >> patch 2/6 >> >>>> --- >>>> >>>> Somewhat related: handshake_nl_done_doit() releases the file refcount >>>> even if the req lookup fails. >>> >>> That's because sockfd_lookup() increments the file ref count. >> >> Ooops, I missed that. >> >> Then in the successful path handshake_nl_done_doit() should call >> fput() twice ?!? 1 for the reference acquired by sockfd_lookup() and 1 >> for the reference owned by 'req' ?!? Otherwise a ref will be leaked. >> >>>> If that is caused by a concurrent >>>> req_cancel - not sure if possible at all, possibly syzkaller could >>>> guess if instructed about the API - such refcount will underflow, as it >>>> is rightfully decremented by req_cancel, too. >>> >>> More likely, req_cancel might take the file ref count to zero >>> before sockfd_lookup can increment it, resulting in a UAF? >>> >>> Let me think about this. >> >> I now think this race is not possible, but I now fear the refcount leak >> mentioned above. > > Not sure why I haven't seen evidence of a leak here. I'll have a closer look. I added provisional trace points to report the value of f_count before the fput() in done_doit and also where the RPC client destroys the socket. The latter reports f_count is 1, as I would expect, just before the final __fput_sync(). -- Chuck Lever
diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c index 7ec8a76c3c8a..032d96152e2f 100644 --- a/net/handshake/netlink.c +++ b/net/handshake/netlink.c @@ -101,10 +101,8 @@ static int handshake_dup(struct socket *sock) file = get_file(sock->file); newfd = get_unused_fd_flags(O_CLOEXEC); - if (newfd < 0) { - fput(file); + if (newfd < 0) return newfd; - } fd_install(newfd, file); return newfd;