Message ID | 87h91mtvdb.fsf@notabene.neil.brown.name (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: > I can't say that I like this patch at all. > > The problem is that: > > pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. > * We assume one is at most one page > */ > > this assumption is never verified. > To my mind, the "obvious" way to verify this assumption is that an > attempt to generate a multi-page reply should fail if there was a > multi-page request. A third option, by the way, which Ari Kauppi argued for, is adding a null check each time we increment rq_next_page, since we seem to arrange for the page array to always be NULL-terminated. > Failing if there was a little bit of extra noise at the end of the > request seems harsher than necessary, and could result in a regression. You're worrying there might be a weird old client out there somewhere? I guess it seems like a small enough risk to me. I'm more worried the extra garbage might violate assumptions elsewhere in the code. But, this looks good too: > We already know how big replies can get, so we can perform a complete > sanity check quite early: > > diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c > index a08aeb56b8e4..14f4d425cf8c 100644 > --- a/net/sunrpc/svc.c > +++ b/net/sunrpc/svc.c > @@ -1196,6 +1196,12 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) > goto err_bad_proc; > rqstp->rq_procinfo = procp; > > + if ((procp->pc_xdrressize == 0 || > + procp->pc_xdrressize > XDR_QUADLEN(PAGE_SIZE)) && > + rqstp->rq_arg.len > PAGE_SIZE) > + /* The assumption about request/reply sizes in svc_init_buffer() is violated! */ > + goto err_garbage; > + > /* Syntactic check complete */ > serv->sv_stats->rpccnt++; > > > I haven't tested this at all and haven't even convinced myself that > it covers every case (though I cannot immediately think of any likely > corners). > > Does it address your test case? I'll check, it probably does. We'd need to limit the test to v2/v3. I'm also not opposed to doing both (or all three). --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 18 2017, J. Bruce Fields wrote: > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: >> I can't say that I like this patch at all. >> >> The problem is that: >> >> pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. >> * We assume one is at most one page >> */ >> >> this assumption is never verified. >> To my mind, the "obvious" way to verify this assumption is that an >> attempt to generate a multi-page reply should fail if there was a >> multi-page request. > > A third option, by the way, which Ari Kauppi argued for, is adding a > null check each time we increment rq_next_page, since we seem to arrange > for the page array to always be NULL-terminated. Not a bad idea. That is what nfsaclsvc_encode_getaclres() and nfs3svc_encode_getaclres do. Hmm... your change to xdr_argsize_check will break nfsaclsvc_decode_setaclargs(), won't it? It performs the check before the final nfsacl_decode(). > >> Failing if there was a little bit of extra noise at the end of the >> request seems harsher than necessary, and could result in a regression. > > You're worrying there might be a weird old client out there somewhere? > I guess it seems like a small enough risk to me. I'm more worried the > extra garbage might violate assumptions elsewhere in the code. Something like that. Probably no client does that... I wouldn't be overly surprised if some old boot-from-NFS code in a some ROM somewhere took a shortcut like this though. > > But, this looks good too: > >> We already know how big replies can get, so we can perform a complete >> sanity check quite early: >> >> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c >> index a08aeb56b8e4..14f4d425cf8c 100644 >> --- a/net/sunrpc/svc.c >> +++ b/net/sunrpc/svc.c >> @@ -1196,6 +1196,12 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) >> goto err_bad_proc; >> rqstp->rq_procinfo = procp; >> >> + if ((procp->pc_xdrressize == 0 || >> + procp->pc_xdrressize > XDR_QUADLEN(PAGE_SIZE)) && >> + rqstp->rq_arg.len > PAGE_SIZE) >> + /* The assumption about request/reply sizes in svc_init_buffer() is violated! */ >> + goto err_garbage; >> + >> /* Syntactic check complete */ >> serv->sv_stats->rpccnt++; >> >> >> I haven't tested this at all and haven't even convinced myself that >> it covers every case (though I cannot immediately think of any likely >> corners). >> >> Does it address your test case? > > I'll check, it probably does. > > We'd need to limit the test to v2/v3. Why? Does v4 allocate extra pages? Or is it more careful about using them? v4 does need something different, as pc_xdrressize is always zero.. Thanks, NeilBrown > > I'm also not opposed to doing both (or all three). > > --b.
On Wed, Apr 19, 2017 at 10:17:09AM +1000, NeilBrown wrote: > On Tue, Apr 18 2017, J. Bruce Fields wrote: > > > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: > >> I can't say that I like this patch at all. > >> > >> The problem is that: > >> > >> pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. > >> * We assume one is at most one page > >> */ > >> > >> this assumption is never verified. > >> To my mind, the "obvious" way to verify this assumption is that an > >> attempt to generate a multi-page reply should fail if there was a > >> multi-page request. > > > > A third option, by the way, which Ari Kauppi argued for, is adding a > > null check each time we increment rq_next_page, since we seem to arrange > > for the page array to always be NULL-terminated. > > Not a bad idea. That is what nfsaclsvc_encode_getaclres() and > nfs3svc_encode_getaclres do. > Hmm... your change to xdr_argsize_check will break > nfsaclsvc_decode_setaclargs(), won't it? It performs the check before > the final nfsacl_decode(). Ugh, I forget that I don't run any tests for NFSv3 ACLs. Well, that would be easy enough to fix.... > >> I haven't tested this at all and haven't even convinced myself that > >> it covers every case (though I cannot immediately think of any likely > >> corners). > >> > >> Does it address your test case? > > > > I'll check, it probably does. > > > > We'd need to limit the test to v2/v3. > > Why? Does v4 allocate extra pages? Or is it more careful about using > them? > v4 does need something different, as pc_xdrressize is always zero.. NFSv4 compounds just don't have that limitation. You can read and write in the same compound if you want. (Why you'd want to, I've no idea.) (In fact, I think at least in the version >=4.1 case we should probably only be placing limits on argument and reply sizes individually, so our current implementation (which also places limits on the sum of the two) is probably wrong. This doesn't keep me up at night.) --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 18 2017, J. Bruce Fields wrote: > On Wed, Apr 19, 2017 at 10:17:09AM +1000, NeilBrown wrote: >> On Tue, Apr 18 2017, J. Bruce Fields wrote: >> >> > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: >> >> I can't say that I like this patch at all. >> >> >> >> The problem is that: >> >> >> >> pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. >> >> * We assume one is at most one page >> >> */ >> >> >> >> this assumption is never verified. >> >> To my mind, the "obvious" way to verify this assumption is that an >> >> attempt to generate a multi-page reply should fail if there was a >> >> multi-page request. >> > >> > A third option, by the way, which Ari Kauppi argued for, is adding a >> > null check each time we increment rq_next_page, since we seem to arrange >> > for the page array to always be NULL-terminated. >> >> Not a bad idea. That is what nfsaclsvc_encode_getaclres() and >> nfs3svc_encode_getaclres do. >> Hmm... your change to xdr_argsize_check will break >> nfsaclsvc_decode_setaclargs(), won't it? It performs the check before >> the final nfsacl_decode(). > > Ugh, I forget that I don't run any tests for NFSv3 ACLs. Well, that > would be easy enough to fix.... > >> >> I haven't tested this at all and haven't even convinced myself that >> >> it covers every case (though I cannot immediately think of any likely >> >> corners). >> >> >> >> Does it address your test case? >> > >> > I'll check, it probably does. >> > >> > We'd need to limit the test to v2/v3. >> >> Why? Does v4 allocate extra pages? Or is it more careful about using >> them? >> v4 does need something different, as pc_xdrressize is always zero.. > > NFSv4 compounds just don't have that limitation. You can read and write > in the same compound if you want. (Why you'd want to, I've no idea.) I realise NFSv4 compounds don't have that limitation. I wondered what code in the NFSv4 server ensures that we don't try to use more memory than was allocated. I notice lots of calls to xdr_reserve_space() in nfs4xdr.c. Many of them trigger nfserr_resource when xdr_reserve_space() returns NULL. But not all. nfsd4_encode_readv() just pops up a warning. Once. Then will (eventually) de-reference the NULL pointer and crash. So presumably it really cannot happen (should be a BUG_ON anyway)? So why can this not happen? I see that nfsd4_encode_read() limits the size of the read to xdr->buf->buflen - xdr->buf->len and nfsd4_encode_readdir() does a similar thing when computing bytes_left. So, it is more careful about using the allocated pages than v2/3 is. Thanks, NeilBrown > > (In fact, I think at least in the version >=4.1 case we should probably > only be placing limits on argument and reply sizes individually, so our > current implementation (which also places limits on the sum of the two) > is probably wrong. This doesn't keep me up at night.) > > --b.
On Thu, Apr 20, 2017 at 10:57:10AM +1000, NeilBrown wrote: > I realise NFSv4 compounds don't have that limitation. > I wondered what code in the NFSv4 server ensures that we don't try to use > more memory than was allocated. > > I notice lots of calls to xdr_reserve_space() in nfs4xdr.c. Many of them > trigger nfserr_resource when xdr_reserve_space() returns NULL. > But not all. > nfsd4_encode_readv() just pops up a warning. Once. Then will > (eventually) de-reference the NULL pointer and crash. > So presumably it really cannot happen (should be a BUG_ON anyway)? > So why can this not happen? > I see that nfsd4_encode_read() limits the size of the read to > xdr->buf->buflen - xdr->buf->len > and nfsd4_encode_readdir() does a similar thing when computing > bytes_left. > > So, it is more careful about using the allocated pages than v2/3 is. Yes. The v4 code was written from the start with overflow checks preceding any encode or decode. And I tried to think this all through carefully when I rewrote the encoding side a few years ago. But I don't think that really got much review, and test coverage is poor (a big thanks here to the synpsys people for their fuzzing work), so additional skeptical eyes are welcomed.... There's a lot of tricky hand-written code here handling data from the network. Every now and then somebody brings up the idea of trying to autogenerate it, as is traditionally done for rpc programs. No idea how practical that is. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 18, 2017 at 01:13:51PM -0400, J. Bruce Fields wrote: > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: > > I can't say that I like this patch at all. > > > > The problem is that: > > > > pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. > > * We assume one is at most one page > > */ > > > > this assumption is never verified. > > To my mind, the "obvious" way to verify this assumption is that an > > attempt to generate a multi-page reply should fail if there was a > > multi-page request. > > A third option, by the way, which Ari Kauppi argued for, is adding a > null check each time we increment rq_next_page, since we seem to arrange > for the page array to always be NULL-terminated. > > > Failing if there was a little bit of extra noise at the end of the > > request seems harsher than necessary, and could result in a regression. > > You're worrying there might be a weird old client out there somewhere? > I guess it seems like a small enough risk to me. I'm more worried the > extra garbage might violate assumptions elsewhere in the code. > > But, this looks good too: But, I'm not too happy about putting that NFSv2/v3-specific check in common rpc code. Also, I think this check comes too late for some of the damage. I may go with some variation on Ari's idea, let me give it a try.... --b. > > > We already know how big replies can get, so we can perform a complete > > sanity check quite early: > > > > diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c > > index a08aeb56b8e4..14f4d425cf8c 100644 > > --- a/net/sunrpc/svc.c > > +++ b/net/sunrpc/svc.c > > @@ -1196,6 +1196,12 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) > > goto err_bad_proc; > > rqstp->rq_procinfo = procp; > > > > + if ((procp->pc_xdrressize == 0 || > > + procp->pc_xdrressize > XDR_QUADLEN(PAGE_SIZE)) && > > + rqstp->rq_arg.len > PAGE_SIZE) > > + /* The assumption about request/reply sizes in svc_init_buffer() is violated! */ > > + goto err_garbage; > > + > > /* Syntactic check complete */ > > serv->sv_stats->rpccnt++; > > > > > > I haven't tested this at all and haven't even convinced myself that > > it covers every case (though I cannot immediately think of any likely > > corners). > > > > Does it address your test case? > > I'll check, it probably does. > > We'd need to limit the test to v2/v3. > > I'm also not opposed to doing both (or all three). > > --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 20, 2017 at 12:19:35PM -0400, J. Bruce Fields wrote: > On Tue, Apr 18, 2017 at 01:13:51PM -0400, J. Bruce Fields wrote: > > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: > > > I can't say that I like this patch at all. > > > > > > The problem is that: > > > > > > pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. > > > * We assume one is at most one page > > > */ > > > > > > this assumption is never verified. > > > To my mind, the "obvious" way to verify this assumption is that an > > > attempt to generate a multi-page reply should fail if there was a > > > multi-page request. > > > > A third option, by the way, which Ari Kauppi argued for, is adding a > > null check each time we increment rq_next_page, since we seem to arrange > > for the page array to always be NULL-terminated. > > > > > Failing if there was a little bit of extra noise at the end of the > > > request seems harsher than necessary, and could result in a regression. > > > > You're worrying there might be a weird old client out there somewhere? > > I guess it seems like a small enough risk to me. I'm more worried the > > extra garbage might violate assumptions elsewhere in the code. > > > > But, this looks good too: > > But, I'm not too happy about putting that NFSv2/v3-specific check in > common rpc code. Also, I think this check comes too late for some of > the damage. > > I may go with some variation on Ari's idea, let me give it a try.... In the read case, I think Ari's approach wouldn't catch the error until nfsd_direct_splice_actor(), which doesn't actually look capable of handling errors. Maybe that should be fixed. Or maybe read just needs some more checks. Ugh. --b. > > --b. > > > > > > We already know how big replies can get, so we can perform a complete > > > sanity check quite early: > > > > > > diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c > > > index a08aeb56b8e4..14f4d425cf8c 100644 > > > --- a/net/sunrpc/svc.c > > > +++ b/net/sunrpc/svc.c > > > @@ -1196,6 +1196,12 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) > > > goto err_bad_proc; > > > rqstp->rq_procinfo = procp; > > > > > > + if ((procp->pc_xdrressize == 0 || > > > + procp->pc_xdrressize > XDR_QUADLEN(PAGE_SIZE)) && > > > + rqstp->rq_arg.len > PAGE_SIZE) > > > + /* The assumption about request/reply sizes in svc_init_buffer() is violated! */ > > > + goto err_garbage; > > > + > > > /* Syntactic check complete */ > > > serv->sv_stats->rpccnt++; > > > > > > > > > I haven't tested this at all and haven't even convinced myself that > > > it covers every case (though I cannot immediately think of any likely > > > corners). > > > > > > Does it address your test case? > > > > I'll check, it probably does. > > > > We'd need to limit the test to v2/v3. > > > > I'm also not opposed to doing both (or all three). > > > > --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 20 2017, J. Bruce Fields wrote: > On Thu, Apr 20, 2017 at 12:19:35PM -0400, J. Bruce Fields wrote: >> On Tue, Apr 18, 2017 at 01:13:51PM -0400, J. Bruce Fields wrote: >> > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: >> > > I can't say that I like this patch at all. >> > > >> > > The problem is that: >> > > >> > > pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. >> > > * We assume one is at most one page >> > > */ >> > > >> > > this assumption is never verified. >> > > To my mind, the "obvious" way to verify this assumption is that an >> > > attempt to generate a multi-page reply should fail if there was a >> > > multi-page request. >> > >> > A third option, by the way, which Ari Kauppi argued for, is adding a >> > null check each time we increment rq_next_page, since we seem to arrange >> > for the page array to always be NULL-terminated. >> > >> > > Failing if there was a little bit of extra noise at the end of the >> > > request seems harsher than necessary, and could result in a regression. >> > >> > You're worrying there might be a weird old client out there somewhere? >> > I guess it seems like a small enough risk to me. I'm more worried the >> > extra garbage might violate assumptions elsewhere in the code. >> > >> > But, this looks good too: >> >> But, I'm not too happy about putting that NFSv2/v3-specific check in >> common rpc code. Also, I think this check comes too late for some of >> the damage. Too late? It is earlier than anything else. >> >> I may go with some variation on Ari's idea, let me give it a try.... > > In the read case, I think Ari's approach wouldn't catch the error until > nfsd_direct_splice_actor(), which doesn't actually look capable of > handling errors. Maybe that should be fixed. Or maybe read just needs > some more checks. Ugh. By the time you get to nfsd_read(), the 'struct kvec' should be set up and valid. So we need checks is e.g. nfs3svc_decode_readargs(), but not deeper. NeilBrown
On Fri, Apr 21, 2017 at 08:11:59AM +1000, NeilBrown wrote: > On Thu, Apr 20 2017, J. Bruce Fields wrote: > > > On Thu, Apr 20, 2017 at 12:19:35PM -0400, J. Bruce Fields wrote: > >> On Tue, Apr 18, 2017 at 01:13:51PM -0400, J. Bruce Fields wrote: > >> > On Tue, Apr 18, 2017 at 10:25:20AM +1000, NeilBrown wrote: > >> > > I can't say that I like this patch at all. > >> > > > >> > > The problem is that: > >> > > > >> > > pages = size / PAGE_SIZE + 1; /* extra page as we hold both request and reply. > >> > > * We assume one is at most one page > >> > > */ > >> > > > >> > > this assumption is never verified. > >> > > To my mind, the "obvious" way to verify this assumption is that an > >> > > attempt to generate a multi-page reply should fail if there was a > >> > > multi-page request. > >> > > >> > A third option, by the way, which Ari Kauppi argued for, is adding a > >> > null check each time we increment rq_next_page, since we seem to arrange > >> > for the page array to always be NULL-terminated. > >> > > >> > > Failing if there was a little bit of extra noise at the end of the > >> > > request seems harsher than necessary, and could result in a regression. > >> > > >> > You're worrying there might be a weird old client out there somewhere? > >> > I guess it seems like a small enough risk to me. I'm more worried the > >> > extra garbage might violate assumptions elsewhere in the code. > >> > > >> > But, this looks good too: > >> > >> But, I'm not too happy about putting that NFSv2/v3-specific check in > >> common rpc code. Also, I think this check comes too late for some of > >> the damage. > > Too late? It is earlier than anything else. D'oh, yes, I had some idea the check happened after encoding. > >> I may go with some variation on Ari's idea, let me give it a try.... > > > > In the read case, I think Ari's approach wouldn't catch the error until > > nfsd_direct_splice_actor(), which doesn't actually look capable of > > handling errors. Maybe that should be fixed. Or maybe read just needs > > some more checks. Ugh. > > By the time you get to nfsd_read(), the 'struct kvec' should be set up > and valid. That's ignored in the splice case, isn't it? OK, maybe I need to sleep on it and look again in the morning.... --b. > So we need checks is e.g. nfs3svc_decode_readargs(), but not > deeper. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index a08aeb56b8e4..14f4d425cf8c 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -1196,6 +1196,12 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv) goto err_bad_proc; rqstp->rq_procinfo = procp; + if ((procp->pc_xdrressize == 0 || + procp->pc_xdrressize > XDR_QUADLEN(PAGE_SIZE)) && + rqstp->rq_arg.len > PAGE_SIZE) + /* The assumption about request/reply sizes in svc_init_buffer() is violated! */ + goto err_garbage; + /* Syntactic check complete */ serv->sv_stats->rpccnt++;