Message ID | 20140617210439.GC4510@tonberry.usersys.redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jun 17, 2014 at 5:04 PM, Scott Mayhew <smayhew@redhat.com> wrote: > > No, I don't think it's right to ignore NFS_INO_INVALID_DATA, and > originally I was testing a fix similar to this: > > ---8<--- > diff --git a/fs/nfs/write.c b/fs/nfs/write.c > index 3ee5af4..98ff061 100644 > --- a/fs/nfs/write.c > +++ b/fs/nfs/write.c > @@ -934,12 +934,14 @@ static bool nfs_write_pageuptodate(struct page *page, struct inode *inode) > > if (nfs_have_delegated_attributes(inode)) > goto out; > - if (nfsi->cache_validity & (NFS_INO_INVALID_DATA|NFS_INO_REVAL_PAGECACHE)) > + if (nfsi->cache_validity & NFS_INO_REVAL_PAGECACHE) > return false; > smp_rmb(); > if (test_bit(NFS_INO_INVALIDATING, &nfsi->flags)) > return false; > out: > + if (nfsi->cache_validity & NFS_INO_INVALID_DATA) > + return false; > return PageUptodate(page) != 0; > } > ---8<--- > > > However, > > 1) it wasn't really keeping with the spirit of commit 8d197a56 (NFS: > Always trust the PageUptodate flag when we have a delegation), and > > 2) one of my test programs (used to test commit c7559663 (NFS: Allow > nfs_updatepage to extend a write under additional circumstances))) > started performing poorly again, doing tons of sub page-sized writes > intead of a handful of wsize'd writes. > > I did some more digging and I think I see 2 areas that could be > improved. > > The first would be to clear NFS_INO_INVALID_DATA if we've just > truncated the inode to 0 bytes -- after all, if we've just unmapped > all the pages from the inode's address space then isn't our data > consisitent?: Hi Scott, What say we rather just don't set/clear NFS_INO_INVALID_DATA if mapping->nrpages == 0? That should catch the above case, plus a couple of other potential false positives. > The second thing I noticed is that we're constantly invalidating our > cache due to the change attribute changing on the server. But if we > have a write delegation then the change attribute changing must be the > result of *our* changes, in which case we should be able to just silently > update the change attribute on our side without invalidating our caches: Ack. That's a bug in the current code. Cheers Trond
On Tue, 17 Jun 2014, Trond Myklebust wrote: > On Tue, Jun 17, 2014 at 5:04 PM, Scott Mayhew <smayhew@redhat.com> wrote: > > > > I did some more digging and I think I see 2 areas that could be > > improved. > > > > The first would be to clear NFS_INO_INVALID_DATA if we've just > > truncated the inode to 0 bytes -- after all, if we've just unmapped > > all the pages from the inode's address space then isn't our data > > consisitent?: > > What say we rather just don't set/clear NFS_INO_INVALID_DATA if > mapping->nrpages == 0? That should catch the above case, plus a couple > of other potential false positives. > That'd be fine too, but I'm not clear on whether I should do this check for all files or just for regular files (I'm not sure how it would make that big a difference for directory inodes for instance). -Scott -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Scott, How about the following 2 extra patches on top of yours? Do they help with the performance regression? Cheers Trond Scott Mayhew (1): nfs: Fix cache_validity check in nfs_write_pageuptodate() Trond Myklebust (2): NFS: Clear NFS_INO_REVAL_PAGECACHE when we update the file size NFS: Don't mark the data cache as invalid if it has been flushed fs/nfs/inode.c | 76 +++++++++++++++++++++++++++++++--------------------------- fs/nfs/write.c | 4 +++- 2 files changed, 44 insertions(+), 36 deletions(-)
On Fri, 20 Jun 2014, Trond Myklebust wrote: > Hi Scott, > > How about the following 2 extra patches on top of yours? Do they help > with the performance regression? Yes, they do. Thanks! -Scott > > Cheers > Trond > > Scott Mayhew (1): > nfs: Fix cache_validity check in nfs_write_pageuptodate() > > Trond Myklebust (2): > NFS: Clear NFS_INO_REVAL_PAGECACHE when we update the file size > NFS: Don't mark the data cache as invalid if it has been flushed > > fs/nfs/inode.c | 76 +++++++++++++++++++++++++++++++--------------------------- > fs/nfs/write.c | 4 +++- > 2 files changed, 44 insertions(+), 36 deletions(-) > > -- > 1.9.3 > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 3ee5af4..98ff061 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -934,12 +934,14 @@ static bool nfs_write_pageuptodate(struct page *page, struct inode *inode) if (nfs_have_delegated_attributes(inode)) goto out; - if (nfsi->cache_validity & (NFS_INO_INVALID_DATA|NFS_INO_REVAL_PAGECACHE)) + if (nfsi->cache_validity & NFS_INO_REVAL_PAGECACHE) return false; smp_rmb(); if (test_bit(NFS_INO_INVALIDATING, &nfsi->flags)) return false; out: + if (nfsi->cache_validity & NFS_INO_INVALID_DATA) + return false; return PageUptodate(page) != 0; } ---8<--- However, 1) it wasn't really keeping with the spirit of commit 8d197a56 (NFS: Always trust the PageUptodate flag when we have a delegation), and 2) one of my test programs (used to test commit c7559663 (NFS: Allow nfs_updatepage to extend a write under additional circumstances))) started performing poorly again, doing tons of sub page-sized writes intead of a handful of wsize'd writes. I did some more digging and I think I see 2 areas that could be improved. The first would be to clear NFS_INO_INVALID_DATA if we've just truncated the inode to 0 bytes -- after all, if we've just unmapped all the pages from the inode's address space then isn't our data consisitent?: ---8<--- diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index c496f8a..1078d06 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -584,6 +584,11 @@ void nfs_setattr_update_inode(struct inode *inode, struct iattr *attr) if ((attr->ia_valid & ATTR_SIZE) != 0) { nfs_inc_stats(inode, NFSIOS_SETATTRTRUNC); nfs_vmtruncate(inode, attr->ia_size); + if (attr->ia_size == 0) { + spin_lock(&inode->i_lock); + NFS_I(inode)->cache_validity &= ~NFS_INO_INVALID_DATA; + spin_unlock(&inode->i_lock); + } } } EXPORT_SYMBOL_GPL(nfs_setattr_update_inode); ---8<--- The second thing I noticed is that we're constantly invalidating our cache due to the change attribute changing on the server. But if we have a write delegation then the change attribute changing must be the result of *our* changes, in which case we should be able to just silently update the change attribute on our side without invalidating our caches: ---8<--- diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 1078d06..932c999 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1568,15 +1568,17 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr) /* More cache consistency checks */ if (fattr->valid & NFS_ATTR_FATTR_CHANGE) { if (inode->i_version != fattr->change_attr) { - dprintk("NFS: change_attr change on server for file %s/%ld\n", + if (!NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE)) { + dprintk("NFS: change_attr change on server for file %s/%ld\n", inode->i_sb->s_id, inode->i_ino); - invalid |= NFS_INO_INVALID_ATTR - | NFS_INO_INVALID_DATA - | NFS_INO_INVALID_ACCESS - | NFS_INO_INVALID_ACL - | NFS_INO_REVAL_PAGECACHE; - if (S_ISDIR(inode->i_mode)) - nfs_force_lookup_revalidate(inode); + invalid |= NFS_INO_INVALID_ATTR + | NFS_INO_INVALID_DATA + | NFS_INO_INVALID_ACCESS + | NFS_INO_INVALID_ACL + | NFS_INO_REVAL_PAGECACHE; + if (S_ISDIR(inode->i_mode)) + nfs_force_lookup_revalidate(inode); + } inode->i_version = fattr->change_attr; } } else if (server->caps & NFS_CAP_CHANGE_ATTR)