Message ID | 20241108124246.198489-5-bfoster@redhat.com (mailing list archive) |
---|---|
State | New |
Series | iomap: zero range flush fixes |
On Fri, Nov 08, 2024 at 07:42:46AM -0500, Brian Foster wrote:
> iomap_zero_range() uses buffered writes for manual zeroing, no
> longer updates i_size for such writes, but is still explicitly
> called for post-eof ranges. The historical use case for this is
> zeroing post-eof speculative preallocation on extending writes from
> XFS. However, XFS also recently changed to convert all post-eof
> delalloc mappings to unwritten in the iomap_begin() handler, which
> means it now never expects manual zeroing of post-eof mappings. In
> other words, all post-eof mappings should be reported as holes or
> unwritten.
>
> This is a subtle dependency that can be hard to detect if violated
> because associated codepaths are likely to update i_size after folio
> locks are dropped, but before writeback happens to occur. For
> example, if XFS reverts back to some form of manual zeroing of
> post-eof blocks on write extension, writeback of those zeroed folios
> will now race with the presumed i_size update from the subsequent
> buffered write.
>
> Since iomap_zero_range() can't correctly zero post-eof mappings
> beyond EOF without updating i_size, warn if this ever occurs. This
> serves as minimal indication that if this use case is reintroduced
> by a filesystem, iomap_zero_range() might need to reconsider i_size
> updates for write extending use cases.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/iomap/buffered-io.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 7f40234a301e..e18830e4809b 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1354,6 +1354,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  {
>  	loff_t pos = iter->pos;
>  	loff_t length = iomap_length(iter);
> +	loff_t isize = iter->inode->i_size;
>  	loff_t written = 0;
>
>  	do {
> @@ -1369,6 +1370,8 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  		if (iter->iomap.flags & IOMAP_F_STALE)
>  			break;
>
> +		/* warn about zeroing folios beyond eof that won't write back */
> +		WARN_ON_ONCE(folio_pos(folio) > isize);

	WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);?

No need to have the extra local variable for something that shouldn't
ever happen.

Do you need i_size_read for correctness here?

--D

>  		offset = offset_in_folio(folio, pos);
>  		if (bytes > folio_size(folio) - offset)
>  			bytes = folio_size(folio) - offset;
> --
> 2.47.0
>
Looks fine:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Fri, Nov 08, 2024 at 07:06:23PM -0800, Darrick J. Wong wrote:
> On Fri, Nov 08, 2024 at 07:42:46AM -0500, Brian Foster wrote:
> > iomap_zero_range() uses buffered writes for manual zeroing, no
> > longer updates i_size for such writes, but is still explicitly
> > called for post-eof ranges. The historical use case for this is
> > zeroing post-eof speculative preallocation on extending writes from
> > XFS. However, XFS also recently changed to convert all post-eof
> > delalloc mappings to unwritten in the iomap_begin() handler, which
> > means it now never expects manual zeroing of post-eof mappings. In
> > other words, all post-eof mappings should be reported as holes or
> > unwritten.
> >
> > This is a subtle dependency that can be hard to detect if violated
> > because associated codepaths are likely to update i_size after folio
> > locks are dropped, but before writeback happens to occur. For
> > example, if XFS reverts back to some form of manual zeroing of
> > post-eof blocks on write extension, writeback of those zeroed folios
> > will now race with the presumed i_size update from the subsequent
> > buffered write.
> >
> > Since iomap_zero_range() can't correctly zero post-eof mappings
> > beyond EOF without updating i_size, warn if this ever occurs. This
> > serves as minimal indication that if this use case is reintroduced
> > by a filesystem, iomap_zero_range() might need to reconsider i_size
> > updates for write extending use cases.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/iomap/buffered-io.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 7f40234a301e..e18830e4809b 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1354,6 +1354,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> >  {
> >  	loff_t pos = iter->pos;
> >  	loff_t length = iomap_length(iter);
> > +	loff_t isize = iter->inode->i_size;
> >  	loff_t written = 0;
> >
> >  	do {
> > @@ -1369,6 +1370,8 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> >  		if (iter->iomap.flags & IOMAP_F_STALE)
> >  			break;
> >
> > +		/* warn about zeroing folios beyond eof that won't write back */
> > +		WARN_ON_ONCE(folio_pos(folio) > isize);
>
> 	WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);?
>
> No need to have the extra local variable for something that shouldn't
> ever happen.
>
> Do you need i_size_read for correctness here?
>

Dropped isize. I didn't think we needed i_size_read() since we're
typically in an fs operation path, but I could be wrong. I haven't seen
any spurious warnings in my testing so far, at least.

Brian

> --D
>
> >  		offset = offset_in_folio(folio, pos);
> >  		if (bytes > folio_size(folio) - offset)
> >  			bytes = folio_size(folio) - offset;
> > --
> > 2.47.0
> >
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7f40234a301e..e18830e4809b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1354,6 +1354,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 {
 	loff_t pos = iter->pos;
 	loff_t length = iomap_length(iter);
+	loff_t isize = iter->inode->i_size;
 	loff_t written = 0;
 
 	do {
@@ -1369,6 +1370,8 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		if (iter->iomap.flags & IOMAP_F_STALE)
 			break;
 
+		/* warn about zeroing folios beyond eof that won't write back */
+		WARN_ON_ONCE(folio_pos(folio) > isize);
 		offset = offset_in_folio(folio, pos);
 		if (bytes > folio_size(folio) - offset)
 			bytes = folio_size(folio) - offset;
iomap_zero_range() uses buffered writes for manual zeroing, no
longer updates i_size for such writes, but is still explicitly
called for post-eof ranges. The historical use case for this is
zeroing post-eof speculative preallocation on extending writes from
XFS. However, XFS also recently changed to convert all post-eof
delalloc mappings to unwritten in the iomap_begin() handler, which
means it now never expects manual zeroing of post-eof mappings. In
other words, all post-eof mappings should be reported as holes or
unwritten.

This is a subtle dependency that can be hard to detect if violated
because associated codepaths are likely to update i_size after folio
locks are dropped, but before writeback happens to occur. For
example, if XFS reverts back to some form of manual zeroing of
post-eof blocks on write extension, writeback of those zeroed folios
will now race with the presumed i_size update from the subsequent
buffered write.

Since iomap_zero_range() can't correctly zero post-eof mappings
beyond EOF without updating i_size, warn if this ever occurs. This
serves as minimal indication that if this use case is reintroduced
by a filesystem, iomap_zero_range() might need to reconsider i_size
updates for write extending use cases.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/iomap/buffered-io.c | 3 +++
 1 file changed, 3 insertions(+)
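
To make the intent of the new check concrete, here is a rough userspace model of it. This is only a sketch and not kernel code: model_loff_t, model_warn_on_once(), model_check_zero_folio() and model_warn_count are hypothetical names invented for illustration, and the once-only behaviour is a simplified imitation of what WARN_ON_ONCE() does at a single call site. It compares a folio's starting file offset against i_size with the same strict ">" as the patch, so a folio that starts exactly at EOF does not warn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's loff_t (assumption: signed 64-bit). */
typedef int64_t model_loff_t;

/* Hypothetical counter so a caller can observe how often a warning fired. */
static int model_warn_count;

/*
 * Simplified imitation of WARN_ON_ONCE(): print a warning the first time
 * the condition is true for a given call site (tracked by *warned), stay
 * silent on later hits, and always return the condition itself.
 */
static bool model_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		model_warn_count++;
		fprintf(stderr, "WARNING (once): %s\n", what);
	}
	return cond;
}

/*
 * Model of the check the patch adds to iomap_zero_iter(): a folio whose
 * starting offset lies strictly beyond i_size is entirely post-EOF, so
 * zeroing it through the page cache is the case the patch warns about.
 */
static bool model_check_zero_folio(model_loff_t folio_pos, model_loff_t isize)
{
	static bool warned;	/* one flag per call site, like the real macro */

	assert(folio_pos >= 0 && isize >= 0);
	return model_warn_on_once(folio_pos > isize, &warned,
				  "zeroing folio beyond EOF");
}
```

With isize set to 4096, calling model_check_zero_folio() for folio offsets 0 and 4096 returns false, while offsets 8192 and 12288 both return true but only the first of them prints a warning.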