Message ID | CAN-5tyFX2aZPXy2NRiATwkbv8gDbyafB-4JsRUghuF84ih_WnQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
> On Jan 26, 2018, at 9:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > > Hi Bruce/Chuck, > > There is a problem with nfsd doing a WRITE of size 4093,4094,4095. To > reproduce, mount with RDMA and do a simple dd "dd if=/dev/zero > of=/mnt/testfile bs=4093 count=1". What happens is that nfsd fails to > parse GETATTR after the WRITE in the compound. It fails the operation > with ERR_OP_ILLEGAL. > > The problem happens for the values where XDR round up ends up rounding > up to the page size. I don't know if my fix is the appropriate way to > fix this but with it I don't get the error: > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index 2c61c6b..a8489c3 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -1289,11 +1289,12 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compounda > > len = XDR_QUADLEN(write->wr_buflen) << 2; > if (len >= avail) { > - int pages; > + int pages = 0; > > len -= avail; > > - pages = len >> PAGE_SHIFT; > + if (write->wr_buflen >= PAGE_SIZE) > + pages = len >> PAGE_SHIFT; > argp->pagelist += pages; > argp->pagelen -= pages * PAGE_SIZE; > len -= pages * PAGE_SIZE; > > So the problem is the using "len" instead of "write->wr_buflen" leads > for the values 4093,4094,4095 that are rounded up to 4096, it makes > pages=1 and the argp->pagelen ends up being a negative value leading > to problems of parsing the GETATTR. Would this also be a problem near any page boundary, say, a write length of 8191 bytes? Instead of using the rounded up "len", why not try this: - pages = len >> PAGE_SHIFT; + pages = write->wr_buflen >> PAGE_SHIFT; And be sure to test with TCP as well. > If this looks OK, I can send a patch. > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jan 26, 2018 at 1:23 PM, Chuck Lever <chuck.lever@oracle.com> wrote: > > >> On Jan 26, 2018, at 9:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote: >> >> Hi Bruce/Chuck, >> >> There is a problem with nfsd doing a WRITE of size 4093,4094,4095. To >> reproduce, mount with RDMA and do a simple dd "dd if=/dev/zero >> of=/mnt/testfile bs=4093 count=1". What happens is that nfsd fails to >> parse GETATTR after the WRITE in the compound. It fails the operation >> with ERR_OP_ILLEGAL. >> >> The problem happens for the values where XDR round up ends up rounding >> up to the page size. I don't know if my fix is the appropriate way to >> fix this but with it I don't get the error: >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> index 2c61c6b..a8489c3 100644 >> --- a/fs/nfsd/nfs4xdr.c >> +++ b/fs/nfsd/nfs4xdr.c >> @@ -1289,11 +1289,12 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compounda >> >> len = XDR_QUADLEN(write->wr_buflen) << 2; >> if (len >= avail) { >> - int pages; >> + int pages = 0; >> >> len -= avail; >> >> - pages = len >> PAGE_SHIFT; >> + if (write->wr_buflen >= PAGE_SIZE) >> + pages = len >> PAGE_SHIFT; >> argp->pagelist += pages; >> argp->pagelen -= pages * PAGE_SIZE; >> len -= pages * PAGE_SIZE; >> >> So the problem is the using "len" instead of "write->wr_buflen" leads >> for the values 4093,4094,4095 that are rounded up to 4096, it makes >> pages=1 and the argp->pagelen ends up being a negative value leading >> to problems of parsing the GETATTR. > > Would this also be a problem near any page boundary, say, a > write length of 8191 bytes? > > Instead of using the rounded up "len", why not try this: > > - pages = len >> PAGE_SHIFT; > + pages = write->wr_buflen >> PAGE_SHIFT; You are right. It needs to be that. Otherwise 8191 fails the same way. > And be sure to test with TCP as well. Sigh. It breaks normal (non-RDMA) mounts. I'll figure out why. >> If this looks OK, I can send a patch. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > Chuck Lever > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Jan 26, 2018, at 11:05 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > > On Fri, Jan 26, 2018 at 1:23 PM, Chuck Lever <chuck.lever@oracle.com> wrote: >> >> >>> On Jan 26, 2018, at 9:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote: >>> >>> Hi Bruce/Chuck, >>> >>> There is a problem with nfsd doing a WRITE of size 4093,4094,4095. To >>> reproduce, mount with RDMA and do a simple dd "dd if=/dev/zero >>> of=/mnt/testfile bs=4093 count=1". What happens is that nfsd fails to >>> parse GETATTR after the WRITE in the compound. It fails the operation >>> with ERR_OP_ILLEGAL. >>> >>> The problem happens for the values where XDR round up ends up rounding >>> up to the page size. I don't know if my fix is the appropriate way to >>> fix this but with it I don't get the error: >>> >>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >>> index 2c61c6b..a8489c3 100644 >>> --- a/fs/nfsd/nfs4xdr.c >>> +++ b/fs/nfsd/nfs4xdr.c >>> @@ -1289,11 +1289,12 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compounda >>> >>> len = XDR_QUADLEN(write->wr_buflen) << 2; >>> if (len >= avail) { >>> - int pages; >>> + int pages = 0; >>> >>> len -= avail; >>> >>> - pages = len >> PAGE_SHIFT; >>> + if (write->wr_buflen >= PAGE_SIZE) >>> + pages = len >> PAGE_SHIFT; >>> argp->pagelist += pages; >>> argp->pagelen -= pages * PAGE_SIZE; >>> len -= pages * PAGE_SIZE; >>> >>> So the problem is the using "len" instead of "write->wr_buflen" leads >>> for the values 4093,4094,4095 that are rounded up to 4096, it makes >>> pages=1 and the argp->pagelen ends up being a negative value leading >>> to problems of parsing the GETATTR. >> >> Would this also be a problem near any page boundary, say, a >> write length of 8191 bytes? >> >> Instead of using the rounded up "len", why not try this: >> >> - pages = len >> PAGE_SHIFT; >> + pages = write->wr_buflen >> PAGE_SHIFT; > > You are right. It needs to be that. Otherwise 8191 fails the same way. > >> And be sure to test with TCP as well. > > Sigh. It breaks normal (non-RDMA) mounts. I'll figure out why. OK. Remember that a Read chunk's length does not have to be rounded up. Maybe the transport needs to round up the length of the unmarshaled data content on behalf of the NFSv4 write decoder. > >>> If this looks OK, I can send a patch. >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> -- >> Chuck Lever >> >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 2c61c6b..a8489c3 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -1289,11 +1289,12 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compounda len = XDR_QUADLEN(write->wr_buflen) << 2; if (len >= avail) { - int pages; + int pages = 0; len -= avail; - pages = len >> PAGE_SHIFT; + if (write->wr_buflen >= PAGE_SIZE) + pages = len >> PAGE_SHIFT; argp->pagelist += pages; argp->pagelen -= pages * PAGE_SIZE; len -= pages * PAGE_SIZE;