Message ID | 20220407143834.7516-1-lhenriques@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] ceph: invalidate pages when doing direct/sync writes | expand |
On Thu, 2022-04-07 at 15:38 +0100, Luís Henriques wrote: > When doing a direct/sync write, we need to invalidate the page cache in > the range being written to. If we don't do this, the cache will include > invalid data as we just did a write that avoided the page cache. > > Signed-off-by: Luís Henriques <lhenriques@suse.de> > --- > fs/ceph/file.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > Ok, here's a new attempt. After discussion in this thread and on IRC, I > think this is the right fix. generic/647 now passes with and without > encryption. Thanks! > > Changes since v2: > - Invalidation needs to be done after a write > > Changes since v1: > - Replaced truncate_inode_pages_range() by invalidate_inode_pages2_range > - Call fscache_invalidate with FSCACHE_INVAL_DIO_WRITE if we're doing DIO > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c > index 5072570c2203..63e67eb60310 100644 > --- a/fs/ceph/file.c > +++ b/fs/ceph/file.c > @@ -1938,6 +1938,15 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, > break; > } > ceph_clear_error_write(ci); > + ret = invalidate_inode_pages2_range( > + inode->i_mapping, > + pos >> PAGE_SHIFT, > + (pos + len - 1) >> PAGE_SHIFT); > + if (ret < 0) { > + dout("invalidate_inode_pages2_range returned %d\n", > + ret); > + ret = 0; > + } > pos += len; > written += len; > dout("sync_write written %d\n", written); Looks good. I suspect we can also remove the invalidate_indode_pages2_range call earlier in this function too. I may roll that into this patch. I'll give this an xfstests run with fscrypt enabled and see how it does. Thanks!
On 4/7/22 10:38 PM, Luís Henriques wrote: > When doing a direct/sync write, we need to invalidate the page cache in > the range being written to. If we don't do this, the cache will include > invalid data as we just did a write that avoided the page cache. > > Signed-off-by: Luís Henriques <lhenriques@suse.de> > --- > fs/ceph/file.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > Ok, here's a new attempt. After discussion in this thread and on IRC, I > think this is the right fix. generic/647 now passes with and without > encryption. Thanks! > > Changes since v2: > - Invalidation needs to be done after a write > > Changes since v1: > - Replaced truncate_inode_pages_range() by invalidate_inode_pages2_range > - Call fscache_invalidate with FSCACHE_INVAL_DIO_WRITE if we're doing DIO > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c > index 5072570c2203..63e67eb60310 100644 > --- a/fs/ceph/file.c > +++ b/fs/ceph/file.c > @@ -1938,6 +1938,15 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, > break; > } > ceph_clear_error_write(ci); > + ret = invalidate_inode_pages2_range( > + inode->i_mapping, > + pos >> PAGE_SHIFT, > + (pos + len - 1) >> PAGE_SHIFT); > + if (ret < 0) { > + dout("invalidate_inode_pages2_range returned %d\n", > + ret); > + ret = 0; > + } > pos += len; > written += len; > dout("sync_write written %d\n", written); > LGTM. Maybe it worth adding a comment to explain why we need this and where the mapping come from ? Reviewed-by: Xiubo Li <xiubli@redhat.com>
Jeff Layton <jlayton@kernel.org> writes: > On Thu, 2022-04-07 at 15:38 +0100, Luís Henriques wrote: >> When doing a direct/sync write, we need to invalidate the page cache in >> the range being written to. If we don't do this, the cache will include >> invalid data as we just did a write that avoided the page cache. >> >> Signed-off-by: Luís Henriques <lhenriques@suse.de> >> --- >> fs/ceph/file.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> Ok, here's a new attempt. After discussion in this thread and on IRC, I >> think this is the right fix. generic/647 now passes with and without >> encryption. Thanks! >> >> Changes since v2: >> - Invalidation needs to be done after a write >> >> Changes since v1: >> - Replaced truncate_inode_pages_range() by invalidate_inode_pages2_range >> - Call fscache_invalidate with FSCACHE_INVAL_DIO_WRITE if we're doing DIO >> >> diff --git a/fs/ceph/file.c b/fs/ceph/file.c >> index 5072570c2203..63e67eb60310 100644 >> --- a/fs/ceph/file.c >> +++ b/fs/ceph/file.c >> @@ -1938,6 +1938,15 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, >> break; >> } >> ceph_clear_error_write(ci); >> + ret = invalidate_inode_pages2_range( >> + inode->i_mapping, >> + pos >> PAGE_SHIFT, >> + (pos + len - 1) >> PAGE_SHIFT); >> + if (ret < 0) { >> + dout("invalidate_inode_pages2_range returned %d\n", >> + ret); >> + ret = 0; >> + } >> pos += len; >> written += len; >> dout("sync_write written %d\n", written); > > Looks good. I suspect we can also remove the > invalidate_indode_pages2_range call earlier in this function too. I may > roll that into this patch. Right, that occurred to me as well but I wasn't really sure that would be safe. > I'll give this an xfstests run with fscrypt enabled and see how it does. I'll do the same here, I just run a few tests on a vstart cluster. But I'll definitely give it some more testing. Cheers,
Xiubo Li <xiubli@redhat.com> writes: > On 4/7/22 10:38 PM, Luís Henriques wrote: >> When doing a direct/sync write, we need to invalidate the page cache in >> the range being written to. If we don't do this, the cache will include >> invalid data as we just did a write that avoided the page cache. >> >> Signed-off-by: Luís Henriques <lhenriques@suse.de> >> --- >> fs/ceph/file.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> Ok, here's a new attempt. After discussion in this thread and on IRC, I >> think this is the right fix. generic/647 now passes with and without >> encryption. Thanks! >> >> Changes since v2: >> - Invalidation needs to be done after a write >> >> Changes since v1: >> - Replaced truncate_inode_pages_range() by invalidate_inode_pages2_range >> - Call fscache_invalidate with FSCACHE_INVAL_DIO_WRITE if we're doing DIO >> >> diff --git a/fs/ceph/file.c b/fs/ceph/file.c >> index 5072570c2203..63e67eb60310 100644 >> --- a/fs/ceph/file.c >> +++ b/fs/ceph/file.c >> @@ -1938,6 +1938,15 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, >> break; >> } >> ceph_clear_error_write(ci); >> + ret = invalidate_inode_pages2_range( >> + inode->i_mapping, >> + pos >> PAGE_SHIFT, >> + (pos + len - 1) >> PAGE_SHIFT); >> + if (ret < 0) { >> + dout("invalidate_inode_pages2_range returned %d\n", >> + ret); >> + ret = 0; >> + } >> pos += len; >> written += len; >> dout("sync_write written %d\n", written); >> > LGTM. > > Maybe it worth adding a comment to explain why we need this and where the > mapping come from ? > > Reviewed-by: Xiubo Li <xiubli@redhat.com> > Sure, I'll send out v4 with an extra comment. Thanks. Cheers,
diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 5072570c2203..63e67eb60310 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1938,6 +1938,15 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, break; } ceph_clear_error_write(ci); + ret = invalidate_inode_pages2_range( + inode->i_mapping, + pos >> PAGE_SHIFT, + (pos + len - 1) >> PAGE_SHIFT); + if (ret < 0) { + dout("invalidate_inode_pages2_range returned %d\n", + ret); + ret = 0; + } pos += len; written += len; dout("sync_write written %d\n", written);
When doing a direct/sync write, we need to invalidate the page cache in the range being written to. If we don't do this, the cache will include invalid data as we just did a write that avoided the page cache. Signed-off-by: Luís Henriques <lhenriques@suse.de> --- fs/ceph/file.c | 9 +++++++++ 1 file changed, 9 insertions(+) Ok, here's a new attempt. After discussion in this thread and on IRC, I think this is the right fix. generic/647 now passes with and without encryption. Thanks! Changes since v2: - Invalidation needs to be done after a write Changes since v1: - Replaced truncate_inode_pages_range() by invalidate_inode_pages2_range - Call fscache_invalidate with FSCACHE_INVAL_DIO_WRITE if we're doing DIO