Message ID | 20231213152350.431591-38-dhowells@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | netfs, afs, 9p: Delegate high-level I/O to netfslib | expand |
On Wed, 2023-12-13 at 15:23 +0000, David Howells wrote: > Track the file position above which the server is not expected to have any > data (the "zero point") and preemptively assume that we can satisfy > requests by filling them with zeroes locally rather than attempting to > download them if they're over that line - even if we've written data back > to the server. Assume that any data that was written back above that > position is held in the local cache. Note that we have to split requests > that straddle the line. > > Make use of this to optimise away some reads from the server. We need to > set the zero point in the following circumstances: > > (1) When we see an extant remote inode and have no cache for it, we set > the zero_point to i_size. > > (2) On local inode creation, we set zero_point to 0. > > (3) On local truncation down, we reduce zero_point to the new i_size if > the new i_size is lower. > > (4) On local truncation up, we don't change zero_point. > The above seems odd, but I guess the assumption is that if there are any writes by a 3rd party above the old zero point, that that would cause an invalidation? > (5) On local modification, we don't change zero_point. > > (6) On remote invalidation, we set zero_point to the new i_size. > > (7) If stored data is discarded from the pagecache or culled from fscache, > we must set zero_point above that if the data also got written to the > server. > > (8) If dirty data is written back to the server, but not fscache, we must > set zero_point above that. > > (9) If a direct I/O write is made, set zero_point above that. > > Assuming the above, any read from the server at or above the zero_point > position will return all zeroes. > > The zero_point value can be stored in the cache, provided the above rules > are applied to it by any code that culls part of the local cache. > > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Jeff Layton <jlayton@kernel.org> > cc: linux-cachefs@redhat.com > cc: linux-fsdevel@vger.kernel.org > cc: linux-mm@kvack.org > --- > fs/afs/inode.c | 22 +++++++++++++--------- > fs/netfs/buffered_write.c | 2 +- > fs/netfs/direct_write.c | 4 ++++ > fs/netfs/io.c | 10 ++++++++++ > fs/netfs/misc.c | 5 +++++ > fs/smb/client/cifsfs.c | 4 ++-- > include/linux/netfs.h | 14 ++++++++++++-- > 7 files changed, 47 insertions(+), 14 deletions(-) > > diff --git a/fs/afs/inode.c b/fs/afs/inode.c > index c43112dcbbbb..dfd940a64e0f 100644 > --- a/fs/afs/inode.c > +++ b/fs/afs/inode.c > @@ -166,6 +166,7 @@ static void afs_apply_status(struct afs_operation *op, > struct inode *inode = &vnode->netfs.inode; > struct timespec64 t; > umode_t mode; > + bool unexpected_jump = false; > bool data_changed = false; > bool change_size = vp->set_size; > > @@ -230,6 +231,7 @@ static void afs_apply_status(struct afs_operation *op, > } > change_size = true; > data_changed = true; > + unexpected_jump = true; > } else if (vnode->status.type == AFS_FTYPE_DIR) { > /* Expected directory change is handled elsewhere so > * that we can locally edit the directory and save on a > @@ -251,6 +253,8 @@ static void afs_apply_status(struct afs_operation *op, > vnode->netfs.remote_i_size = status->size; > if (change_size || status->size > i_size_read(inode)) { > afs_set_i_size(vnode, status->size); > + if (unexpected_jump) > + vnode->netfs.zero_point = status->size; > inode_set_ctime_to_ts(inode, t); > inode_set_atime_to_ts(inode, t); > } > @@ -689,17 +693,17 @@ static void afs_setattr_success(struct afs_operation *op) > static void afs_setattr_edit_file(struct afs_operation *op) > { > struct afs_vnode_param *vp = &op->file[0]; > - struct inode *inode = &vp->vnode->netfs.inode; > + struct afs_vnode *vnode = vp->vnode; > > if (op->setattr.attr->ia_valid & ATTR_SIZE) { > loff_t size = op->setattr.attr->ia_size; > loff_t i_size = op->setattr.old_i_size; > > - if (size < i_size) > - truncate_pagecache(inode, size); > - if (size != i_size) > - fscache_resize_cookie(afs_vnode_cache(vp->vnode), > - vp->scb.status.size); > + if (size != i_size) { > + truncate_setsize(&vnode->netfs.inode, size); > + netfs_resize_file(&vnode->netfs, size, true); > + fscache_resize_cookie(afs_vnode_cache(vnode), size); > + } > } > } > > @@ -767,11 +771,11 @@ int afs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, > */ > if (!(attr->ia_valid & (supported & ~ATTR_SIZE & ~ATTR_MTIME)) && > attr->ia_size < i_size && > - attr->ia_size > vnode->status.size) { > - truncate_pagecache(inode, attr->ia_size); > + attr->ia_size > vnode->netfs.remote_i_size) { > + truncate_setsize(inode, attr->ia_size); > + netfs_resize_file(&vnode->netfs, size, false); > fscache_resize_cookie(afs_vnode_cache(vnode), > attr->ia_size); > - i_size_write(inode, attr->ia_size); > ret = 0; > goto out_unlock; > } > diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c > index dce6995fb644..d7ce424b9188 100644 > --- a/fs/netfs/buffered_write.c > +++ b/fs/netfs/buffered_write.c > @@ -73,7 +73,7 @@ static enum netfs_how_to_modify netfs_how_to_modify(struct netfs_inode *ctx, > if (folio_test_uptodate(folio)) > return NETFS_FOLIO_IS_UPTODATE; > > - if (pos >= ctx->remote_i_size) > + if (pos >= ctx->zero_point) > return NETFS_MODIFY_AND_CLEAR; > > if (!maybe_trouble && offset == 0 && len >= flen) > diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c > index bb0c2718f57b..aad05f2349a4 100644 > --- a/fs/netfs/direct_write.c > +++ b/fs/netfs/direct_write.c > @@ -134,6 +134,7 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from) > struct file *file = iocb->ki_filp; > struct inode *inode = file->f_mapping->host; > struct netfs_inode *ictx = netfs_inode(inode); > + unsigned long long end; > ssize_t ret; > > _enter("%llx,%zx,%llx", iocb->ki_pos, iov_iter_count(from), i_size_read(inode)); > @@ -155,6 +156,9 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from) > ret = kiocb_invalidate_pages(iocb, iov_iter_count(from)); > if (ret < 0) > goto out; > + end = iocb->ki_pos + iov_iter_count(from); > + if (end > ictx->zero_point) > + ictx->zero_point = end; > > fscache_invalidate(netfs_i_cookie(ictx), NULL, i_size_read(inode), > FSCACHE_INVAL_DIO_WRITE); > diff --git a/fs/netfs/io.c b/fs/netfs/io.c > index 5d9098db815a..41a6113aa7fa 100644 > --- a/fs/netfs/io.c > +++ b/fs/netfs/io.c > @@ -569,6 +569,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq, > struct iov_iter *io_iter) > { > enum netfs_io_source source = NETFS_DOWNLOAD_FROM_SERVER; > + struct netfs_inode *ictx = netfs_inode(rreq->inode); > size_t lsize; > > _enter("%llx-%llx,%llx", subreq->start, subreq->start + subreq->len, rreq->i_size); > @@ -586,6 +587,14 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq, > * to make serial calls, it can indicate a short read and then > * we will call it again. > */ > + if (rreq->origin != NETFS_DIO_READ) { > + if (subreq->start >= ictx->zero_point) { > + source = NETFS_FILL_WITH_ZEROES; > + goto set; > + } > + if (subreq->len > ictx->zero_point - subreq->start) > + subreq->len = ictx->zero_point - subreq->start; > + } > if (subreq->len > rreq->i_size - subreq->start) > subreq->len = rreq->i_size - subreq->start; > if (rreq->rsize && subreq->len > rreq->rsize) > @@ -607,6 +616,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq, > } > } > > +set: > if (subreq->len > rreq->len) > pr_warn("R=%08x[%u] SREQ>RREQ %zx > %zx\n", > rreq->debug_id, subreq->debug_index, > diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c > index 40421ced4cd3..31e45dfad5b0 100644 > --- a/fs/netfs/misc.c > +++ b/fs/netfs/misc.c > @@ -240,6 +240,11 @@ EXPORT_SYMBOL(netfs_invalidate_folio); > bool netfs_release_folio(struct folio *folio, gfp_t gfp) > { > struct netfs_inode *ctx = netfs_inode(folio_inode(folio)); > + unsigned long long end; > + > + end = folio_pos(folio) + folio_size(folio); > + if (end > ctx->zero_point) > + ctx->zero_point = end; > > if (folio_test_private(folio)) > return false; > diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c > index 96a65cf9b5ec..07cd88897c33 100644 > --- a/fs/smb/client/cifsfs.c > +++ b/fs/smb/client/cifsfs.c > @@ -1220,7 +1220,7 @@ static int cifs_precopy_set_eof(struct inode *src_inode, struct cifsInodeInfo *s > if (rc < 0) > goto set_failed; > > - netfs_resize_file(&src_cifsi->netfs, src_end); > + netfs_resize_file(&src_cifsi->netfs, src_end, true); > fscache_resize_cookie(cifs_inode_cookie(src_inode), src_end); > return 0; > > @@ -1351,7 +1351,7 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, > smb_file_src, smb_file_target, off, len, destoff); > if (rc == 0 && new_size > i_size_read(target_inode)) { > truncate_setsize(target_inode, new_size); > - netfs_resize_file(&target_cifsi->netfs, new_size); > + netfs_resize_file(&target_cifsi->netfs, new_size, true); > fscache_resize_cookie(cifs_inode_cookie(target_inode), > new_size); > } > diff --git a/include/linux/netfs.h b/include/linux/netfs.h > index fc77f7be220a..2005ad3b0e25 100644 > --- a/include/linux/netfs.h > +++ b/include/linux/netfs.h > @@ -136,6 +136,8 @@ struct netfs_inode { > struct fscache_cookie *cache; > #endif > loff_t remote_i_size; /* Size of the remote file */ > + loff_t zero_point; /* Size after which we assume there's no data > + * on the server */ > unsigned long flags; > #define NETFS_ICTX_ODIRECT 0 /* The file has DIO in progress */ > #define NETFS_ICTX_UNBUFFERED 1 /* I/O should not use the pagecache */ > @@ -465,22 +467,30 @@ static inline void netfs_inode_init(struct netfs_inode *ctx, > { > ctx->ops = ops; > ctx->remote_i_size = i_size_read(&ctx->inode); > + ctx->zero_point = ctx->remote_i_size; > ctx->flags = 0; > #if IS_ENABLED(CONFIG_FSCACHE) > ctx->cache = NULL; > #endif > + /* ->releasepage() drives zero_point */ > + mapping_set_release_always(ctx->inode.i_mapping); > } > > /** > * netfs_resize_file - Note that a file got resized > * @ctx: The netfs inode being resized > * @new_i_size: The new file size > + * @changed_on_server: The change was applied to the server > * > * Inform the netfs lib that a file got resized so that it can adjust its state. > */ > -static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size) > +static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size, > + bool changed_on_server) > { > - ctx->remote_i_size = new_i_size; > + if (changed_on_server) > + ctx->remote_i_size = new_i_size; > + if (new_i_size < ctx->zero_point) > + ctx->zero_point = new_i_size; > } > > /** >
Jeff Layton <jlayton@kernel.org> wrote: > > (4) On local truncation up, we don't change zero_point. > > > > The above seems odd, but I guess the assumption is that if there are any > writes by a 3rd party above the old zero point, that that would cause an > invalidation? All truncating up does is extend the region from which reading would return zeros, so it doesn't affect the zero_point. If a third party interferes, then we have to invalidate the local caches and reset zero_point to the EOF. But if a third party is writing to the file at the same time as you without both of you using locking or exclusive direct writes, your data is probably screwed anyway... Something cifs and ceph can use leasing to make this work; afs uses the data version number, notifications and the principle that you should use file locks. David
diff --git a/fs/afs/inode.c b/fs/afs/inode.c index c43112dcbbbb..dfd940a64e0f 100644 --- a/fs/afs/inode.c +++ b/fs/afs/inode.c @@ -166,6 +166,7 @@ static void afs_apply_status(struct afs_operation *op, struct inode *inode = &vnode->netfs.inode; struct timespec64 t; umode_t mode; + bool unexpected_jump = false; bool data_changed = false; bool change_size = vp->set_size; @@ -230,6 +231,7 @@ static void afs_apply_status(struct afs_operation *op, } change_size = true; data_changed = true; + unexpected_jump = true; } else if (vnode->status.type == AFS_FTYPE_DIR) { /* Expected directory change is handled elsewhere so * that we can locally edit the directory and save on a @@ -251,6 +253,8 @@ static void afs_apply_status(struct afs_operation *op, vnode->netfs.remote_i_size = status->size; if (change_size || status->size > i_size_read(inode)) { afs_set_i_size(vnode, status->size); + if (unexpected_jump) + vnode->netfs.zero_point = status->size; inode_set_ctime_to_ts(inode, t); inode_set_atime_to_ts(inode, t); } @@ -689,17 +693,17 @@ static void afs_setattr_success(struct afs_operation *op) static void afs_setattr_edit_file(struct afs_operation *op) { struct afs_vnode_param *vp = &op->file[0]; - struct inode *inode = &vp->vnode->netfs.inode; + struct afs_vnode *vnode = vp->vnode; if (op->setattr.attr->ia_valid & ATTR_SIZE) { loff_t size = op->setattr.attr->ia_size; loff_t i_size = op->setattr.old_i_size; - if (size < i_size) - truncate_pagecache(inode, size); - if (size != i_size) - fscache_resize_cookie(afs_vnode_cache(vp->vnode), - vp->scb.status.size); + if (size != i_size) { + truncate_setsize(&vnode->netfs.inode, size); + netfs_resize_file(&vnode->netfs, size, true); + fscache_resize_cookie(afs_vnode_cache(vnode), size); + } } } @@ -767,11 +771,11 @@ int afs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, */ if (!(attr->ia_valid & (supported & ~ATTR_SIZE & ~ATTR_MTIME)) && attr->ia_size < i_size && - attr->ia_size > vnode->status.size) { - truncate_pagecache(inode, attr->ia_size); + attr->ia_size > vnode->netfs.remote_i_size) { + truncate_setsize(inode, attr->ia_size); + netfs_resize_file(&vnode->netfs, size, false); fscache_resize_cookie(afs_vnode_cache(vnode), attr->ia_size); - i_size_write(inode, attr->ia_size); ret = 0; goto out_unlock; } diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c index dce6995fb644..d7ce424b9188 100644 --- a/fs/netfs/buffered_write.c +++ b/fs/netfs/buffered_write.c @@ -73,7 +73,7 @@ static enum netfs_how_to_modify netfs_how_to_modify(struct netfs_inode *ctx, if (folio_test_uptodate(folio)) return NETFS_FOLIO_IS_UPTODATE; - if (pos >= ctx->remote_i_size) + if (pos >= ctx->zero_point) return NETFS_MODIFY_AND_CLEAR; if (!maybe_trouble && offset == 0 && len >= flen) diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c index bb0c2718f57b..aad05f2349a4 100644 --- a/fs/netfs/direct_write.c +++ b/fs/netfs/direct_write.c @@ -134,6 +134,7 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from) struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; struct netfs_inode *ictx = netfs_inode(inode); + unsigned long long end; ssize_t ret; _enter("%llx,%zx,%llx", iocb->ki_pos, iov_iter_count(from), i_size_read(inode)); @@ -155,6 +156,9 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from) ret = kiocb_invalidate_pages(iocb, iov_iter_count(from)); if (ret < 0) goto out; + end = iocb->ki_pos + iov_iter_count(from); + if (end > ictx->zero_point) + ictx->zero_point = end; fscache_invalidate(netfs_i_cookie(ictx), NULL, i_size_read(inode), FSCACHE_INVAL_DIO_WRITE); diff --git a/fs/netfs/io.c b/fs/netfs/io.c index 5d9098db815a..41a6113aa7fa 100644 --- a/fs/netfs/io.c +++ b/fs/netfs/io.c @@ -569,6 +569,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq, struct iov_iter *io_iter) { enum netfs_io_source source = NETFS_DOWNLOAD_FROM_SERVER; + struct netfs_inode *ictx = netfs_inode(rreq->inode); size_t lsize; _enter("%llx-%llx,%llx", subreq->start, subreq->start + subreq->len, rreq->i_size); @@ -586,6 +587,14 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq, * to make serial calls, it can indicate a short read and then * we will call it again. */ + if (rreq->origin != NETFS_DIO_READ) { + if (subreq->start >= ictx->zero_point) { + source = NETFS_FILL_WITH_ZEROES; + goto set; + } + if (subreq->len > ictx->zero_point - subreq->start) + subreq->len = ictx->zero_point - subreq->start; + } if (subreq->len > rreq->i_size - subreq->start) subreq->len = rreq->i_size - subreq->start; if (rreq->rsize && subreq->len > rreq->rsize) @@ -607,6 +616,7 @@ netfs_rreq_prepare_read(struct netfs_io_request *rreq, } } +set: if (subreq->len > rreq->len) pr_warn("R=%08x[%u] SREQ>RREQ %zx > %zx\n", rreq->debug_id, subreq->debug_index, diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c index 40421ced4cd3..31e45dfad5b0 100644 --- a/fs/netfs/misc.c +++ b/fs/netfs/misc.c @@ -240,6 +240,11 @@ EXPORT_SYMBOL(netfs_invalidate_folio); bool netfs_release_folio(struct folio *folio, gfp_t gfp) { struct netfs_inode *ctx = netfs_inode(folio_inode(folio)); + unsigned long long end; + + end = folio_pos(folio) + folio_size(folio); + if (end > ctx->zero_point) + ctx->zero_point = end; if (folio_test_private(folio)) return false; diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c index 96a65cf9b5ec..07cd88897c33 100644 --- a/fs/smb/client/cifsfs.c +++ b/fs/smb/client/cifsfs.c @@ -1220,7 +1220,7 @@ static int cifs_precopy_set_eof(struct inode *src_inode, struct cifsInodeInfo *s if (rc < 0) goto set_failed; - netfs_resize_file(&src_cifsi->netfs, src_end); + netfs_resize_file(&src_cifsi->netfs, src_end, true); fscache_resize_cookie(cifs_inode_cookie(src_inode), src_end); return 0; @@ -1351,7 +1351,7 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, smb_file_src, smb_file_target, off, len, destoff); if (rc == 0 && new_size > i_size_read(target_inode)) { truncate_setsize(target_inode, new_size); - netfs_resize_file(&target_cifsi->netfs, new_size); + netfs_resize_file(&target_cifsi->netfs, new_size, true); fscache_resize_cookie(cifs_inode_cookie(target_inode), new_size); } diff --git a/include/linux/netfs.h b/include/linux/netfs.h index fc77f7be220a..2005ad3b0e25 100644 --- a/include/linux/netfs.h +++ b/include/linux/netfs.h @@ -136,6 +136,8 @@ struct netfs_inode { struct fscache_cookie *cache; #endif loff_t remote_i_size; /* Size of the remote file */ + loff_t zero_point; /* Size after which we assume there's no data + * on the server */ unsigned long flags; #define NETFS_ICTX_ODIRECT 0 /* The file has DIO in progress */ #define NETFS_ICTX_UNBUFFERED 1 /* I/O should not use the pagecache */ @@ -465,22 +467,30 @@ static inline void netfs_inode_init(struct netfs_inode *ctx, { ctx->ops = ops; ctx->remote_i_size = i_size_read(&ctx->inode); + ctx->zero_point = ctx->remote_i_size; ctx->flags = 0; #if IS_ENABLED(CONFIG_FSCACHE) ctx->cache = NULL; #endif + /* ->releasepage() drives zero_point */ + mapping_set_release_always(ctx->inode.i_mapping); } /** * netfs_resize_file - Note that a file got resized * @ctx: The netfs inode being resized * @new_i_size: The new file size + * @changed_on_server: The change was applied to the server * * Inform the netfs lib that a file got resized so that it can adjust its state. */ -static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size) +static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size, + bool changed_on_server) { - ctx->remote_i_size = new_i_size; + if (changed_on_server) + ctx->remote_i_size = new_i_size; + if (new_i_size < ctx->zero_point) + ctx->zero_point = new_i_size; } /**
Track the file position above which the server is not expected to have any data (the "zero point") and preemptively assume that we can satisfy requests by filling them with zeroes locally rather than attempting to download them if they're over that line - even if we've written data back to the server. Assume that any data that was written back above that position is held in the local cache. Note that we have to split requests that straddle the line. Make use of this to optimise away some reads from the server. We need to set the zero point in the following circumstances: (1) When we see an extant remote inode and have no cache for it, we set the zero_point to i_size. (2) On local inode creation, we set zero_point to 0. (3) On local truncation down, we reduce zero_point to the new i_size if the new i_size is lower. (4) On local truncation up, we don't change zero_point. (5) On local modification, we don't change zero_point. (6) On remote invalidation, we set zero_point to the new i_size. (7) If stored data is discarded from the pagecache or culled from fscache, we must set zero_point above that if the data also got written to the server. (8) If dirty data is written back to the server, but not fscache, we must set zero_point above that. (9) If a direct I/O write is made, set zero_point above that. Assuming the above, any read from the server at or above the zero_point position will return all zeroes. The zero_point value can be stored in the cache, provided the above rules are applied to it by any code that culls part of the local cache. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- fs/afs/inode.c | 22 +++++++++++++--------- fs/netfs/buffered_write.c | 2 +- fs/netfs/direct_write.c | 4 ++++ fs/netfs/io.c | 10 ++++++++++ fs/netfs/misc.c | 5 +++++ fs/smb/client/cifsfs.c | 4 ++-- include/linux/netfs.h | 14 ++++++++++++-- 7 files changed, 47 insertions(+), 14 deletions(-)