Message ID | 56AE0302.6050101@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Jan 31, 2016 at 08:50:10PM +0800, Kinglong Mee wrote: > ltp fsync02 will cause nfs sending LAYOUTCOMMIT with length > larger than two pages. nfsd returns NFSERR_BAD_XDR right now. This is with the xfs block layout? Christoph, do we know anything about average or worst-case sizes for that layout update field? > This patch lets nfsd supports read buffer from multiples pages. Hm. We'll end up kmalloc()ing the passed-in field length: p = argp->tmpp = kmalloc(nbytes, GFP_KERNEL); We still do still have that (avail + argp->pagelen) limit, so we're not going to pass arbitrarily large nbytes straight from the network to kmalloc. But we do try to avoid depending on higher-order allocations. --b. > > Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> > --- > fs/nfsd/nfs4xdr.c | 18 ++++++++++++++---- > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index d6ef095..fcf399f 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -143,11 +143,11 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) > * Maybe we need a new page, maybe we have just run out > */ > unsigned int avail = (char *)argp->end - (char *)argp->p; > + unsigned int copied = 0; > __be32 *p; > + > if (avail + argp->pagelen < nbytes) > return NULL; > - if (avail + PAGE_SIZE < nbytes) /* need more than a page !! */ > - return NULL; > /* ok, we can do it with the current plus the next page */ > if (nbytes <= sizeof(argp->tmp)) > p = argp->tmp; > @@ -164,9 +164,19 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) > * guarantee p points to at least nbytes bytes. > */ > memcpy(p, argp->p, avail); > + copied += avail; > + nbytes -= avail; > + > + while (nbytes > PAGE_SIZE) { > + next_decode_page(argp); > + memcpy(((char*)p) + copied, argp->p, PAGE_SIZE); > + copied += PAGE_SIZE; > + nbytes -= PAGE_SIZE; > + } > + > next_decode_page(argp); > - memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); > - argp->p += XDR_QUADLEN(nbytes - avail); > + memcpy(((char*)p) + copied, argp->p, nbytes); > + argp->p += XDR_QUADLEN(nbytes); > return p; > } > > -- > 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Feb 01, 2016 at 01:38:05PM -0500, bfields wrote: > On Sun, Jan 31, 2016 at 08:50:10PM +0800, Kinglong Mee wrote: > > ltp fsync02 will cause nfs sending LAYOUTCOMMIT with length > > larger than two pages. nfsd returns NFSERR_BAD_XDR right now. > > This is with the xfs block layout? > > Christoph, do we know anything about average or worst-case sizes for > that layout update field? > > > This patch lets nfsd supports read buffer from multiples pages. > > Hm. We'll end up kmalloc()ing the passed-in field length: > > p = argp->tmpp = kmalloc(nbytes, GFP_KERNEL); > > We still do still have that (avail + argp->pagelen) limit, so we're not > going to pass arbitrarily large nbytes straight from the network to > kmalloc. But we do try to avoid depending on higher-order allocations. (Which it looks like we were allowing before, possibly by accident. But I doubt they were actually happening in practice, so that's not evidence that we don't need to worry about allocations greater than a page.) --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2/2/2016 02:38, J. Bruce Fields wrote: > On Sun, Jan 31, 2016 at 08:50:10PM +0800, Kinglong Mee wrote: >> ltp fsync02 will cause nfs sending LAYOUTCOMMIT with length >> larger than two pages. nfsd returns NFSERR_BAD_XDR right now. > > This is with the xfs block layout? Yes, xfs block layout. Tested by ltp's fsync02. thanks, Kinglong Mee > > Christoph, do we know anything about average or worst-case sizes for > that layout update field? > >> This patch lets nfsd supports read buffer from multiples pages. > > Hm. We'll end up kmalloc()ing the passed-in field length: > > p = argp->tmpp = kmalloc(nbytes, GFP_KERNEL); > > We still do still have that (avail + argp->pagelen) limit, so we're not > going to pass arbitrarily large nbytes straight from the network to > kmalloc. But we do try to avoid depending on higher-order allocations. > > --b. > >> >> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> >> --- >> fs/nfsd/nfs4xdr.c | 18 ++++++++++++++---- >> 1 file changed, 14 insertions(+), 4 deletions(-) >> >> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c >> index d6ef095..fcf399f 100644 >> --- a/fs/nfsd/nfs4xdr.c >> +++ b/fs/nfsd/nfs4xdr.c >> @@ -143,11 +143,11 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) >> * Maybe we need a new page, maybe we have just run out >> */ >> unsigned int avail = (char *)argp->end - (char *)argp->p; >> + unsigned int copied = 0; >> __be32 *p; >> + >> if (avail + argp->pagelen < nbytes) >> return NULL; >> - if (avail + PAGE_SIZE < nbytes) /* need more than a page !! */ >> - return NULL; >> /* ok, we can do it with the current plus the next page */ >> if (nbytes <= sizeof(argp->tmp)) >> p = argp->tmp; >> @@ -164,9 +164,19 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) >> * guarantee p points to at least nbytes bytes. >> */ >> memcpy(p, argp->p, avail); >> + copied += avail; >> + nbytes -= avail; >> + >> + while (nbytes > PAGE_SIZE) { >> + next_decode_page(argp); >> + memcpy(((char*)p) + copied, argp->p, PAGE_SIZE); >> + copied += PAGE_SIZE; >> + nbytes -= PAGE_SIZE; >> + } >> + >> next_decode_page(argp); >> - memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); >> - argp->p += XDR_QUADLEN(nbytes - avail); >> + memcpy(((char*)p) + copied, argp->p, nbytes); >> + argp->p += XDR_QUADLEN(nbytes); >> return p; >> } >> >> -- >> 2.5.0 > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Feb 01, 2016 at 01:38:05PM -0500, J. Bruce Fields wrote: > This is with the xfs block layout? > > Christoph, do we know anything about average or worst-case sizes for > that layout update field? The average is rather small and fits into a single page, the worst case is basically unlimited: (file size / block size) * sizeof(pnfs_block_extent) by the protocol, and about half that for a non-stupid client as it would merge consecutive blocks and only trigger something close to the worst case for a "block allocated, block hole, block allocated, ..." pattern. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 02, 2016 at 10:20:54AM +0100, Christoph Hellwig wrote: > On Mon, Feb 01, 2016 at 01:38:05PM -0500, J. Bruce Fields wrote: > > This is with the xfs block layout? > > > > Christoph, do we know anything about average or worst-case sizes for > > that layout update field? > > The average is rather small and fits into a single page, the worst > case is basically unlimited: > > (file size / block size) * sizeof(pnfs_block_extent) > > by the protocol, and about half that for a non-stupid client as > it would merge consecutive blocks and only trigger something close > to the worst case for a "block allocated, block hole, block allocated, ..." > pattern. OK. And what's the failure mode if the layoutcommit fails? I guess the client returns the layout and redoes IO through the MDS. For a rare failure maybe that's not so terrible. So I guess the right thing to do is take Kinglong's patch for now. After that, it wouldn't be that hard for nfsd4_block_proc_layoutcommit to decode from an array of pages. But does the iomaps array end up being just as big? --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 02, 2016 at 09:29:48AM -0500, J. Bruce Fields wrote: > OK. And what's the failure mode if the layoutcommit fails? I guess the > client returns the layout and redoes IO through the MDS. For a rare > failure maybe that's not so terrible. > > So I guess the right thing to do is take Kinglong's patch for now. Yes, it would be great to take it. > After that, it wouldn't be that hard for nfsd4_block_proc_layoutcommit > to decode from an array of pages. But does the iomaps array end up > being just as big? The block layout extents are rather bloated, so it will be significantly smaller. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index d6ef095..fcf399f 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -143,11 +143,11 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) * Maybe we need a new page, maybe we have just run out */ unsigned int avail = (char *)argp->end - (char *)argp->p; + unsigned int copied = 0; __be32 *p; + if (avail + argp->pagelen < nbytes) return NULL; - if (avail + PAGE_SIZE < nbytes) /* need more than a page !! */ - return NULL; /* ok, we can do it with the current plus the next page */ if (nbytes <= sizeof(argp->tmp)) p = argp->tmp; @@ -164,9 +164,19 @@ static __be32 *read_buf(struct nfsd4_compoundargs *argp, u32 nbytes) * guarantee p points to at least nbytes bytes. */ memcpy(p, argp->p, avail); + copied += avail; + nbytes -= avail; + + while (nbytes > PAGE_SIZE) { + next_decode_page(argp); + memcpy(((char*)p) + copied, argp->p, PAGE_SIZE); + copied += PAGE_SIZE; + nbytes -= PAGE_SIZE; + } + next_decode_page(argp); - memcpy(((char*)p)+avail, argp->p, (nbytes - avail)); - argp->p += XDR_QUADLEN(nbytes - avail); + memcpy(((char*)p) + copied, argp->p, nbytes); + argp->p += XDR_QUADLEN(nbytes); return p; }
ltp fsync02 will cause nfs sending LAYOUTCOMMIT with length larger than two pages. nfsd returns NFSERR_BAD_XDR right now. This patch lets nfsd supports read buffer from multiples pages. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> --- fs/nfsd/nfs4xdr.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)