[v3,4/4] iomap: warn on zero range of a post-eof folio

Message ID 20241108124246.198489-5-bfoster@redhat.com (mailing list archive)
State New
Series iomap: zero range flush fixes

Commit Message

Brian Foster Nov. 8, 2024, 12:42 p.m. UTC
iomap_zero_range() uses buffered writes for manual zeroing, no
longer updates i_size for such writes, but is still explicitly
called for post-eof ranges. The historical use case for this is
zeroing post-eof speculative preallocation on extending writes from
XFS. However, XFS also recently changed to convert all post-eof
delalloc mappings to unwritten in the iomap_begin() handler, which
means it now never expects manual zeroing of post-eof mappings. In
other words, all post-eof mappings should be reported as holes or
unwritten.

This is a subtle dependency that can be hard to detect if violated
because associated codepaths are likely to update i_size after folio
locks are dropped, but before writeback happens to occur. For
example, if XFS reverts back to some form of manual zeroing of
post-eof blocks on write extension, writeback of those zeroed folios
will now race with the presumed i_size update from the subsequent
buffered write.

Since iomap_zero_range() can't correctly zero post-eof mappings
beyond EOF without updating i_size, warn if this ever occurs. This
serves as minimal indication that if this use case is reintroduced
by a filesystem, iomap_zero_range() might need to reconsider i_size
updates for write extending use cases.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/iomap/buffered-io.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Darrick J. Wong Nov. 9, 2024, 3:06 a.m. UTC | #1
On Fri, Nov 08, 2024 at 07:42:46AM -0500, Brian Foster wrote:
> iomap_zero_range() uses buffered writes for manual zeroing, no
> longer updates i_size for such writes, but is still explicitly
> called for post-eof ranges. The historical use case for this is
> zeroing post-eof speculative preallocation on extending writes from
> XFS. However, XFS also recently changed to convert all post-eof
> delalloc mappings to unwritten in the iomap_begin() handler, which
> means it now never expects manual zeroing of post-eof mappings. In
> other words, all post-eof mappings should be reported as holes or
> unwritten.
> 
> This is a subtle dependency that can be hard to detect if violated
> because associated codepaths are likely to update i_size after folio
> locks are dropped, but before writeback happens to occur. For
> example, if XFS reverts back to some form of manual zeroing of
> post-eof blocks on write extension, writeback of those zeroed folios
> will now race with the presumed i_size update from the subsequent
> buffered write.
> 
> Since iomap_zero_range() can't correctly zero post-eof mappings
> beyond EOF without updating i_size, warn if this ever occurs. This
> serves as minimal indication that if this use case is reintroduced
> by a filesystem, iomap_zero_range() might need to reconsider i_size
> updates for write extending use cases.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/iomap/buffered-io.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 7f40234a301e..e18830e4809b 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1354,6 +1354,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  {
>  	loff_t pos = iter->pos;
>  	loff_t length = iomap_length(iter);
> +	loff_t isize = iter->inode->i_size;
>  	loff_t written = 0;
>  
>  	do {
> @@ -1369,6 +1370,8 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
>  		if (iter->iomap.flags & IOMAP_F_STALE)
>  			break;
>  
> +		/* warn about zeroing folios beyond eof that won't write back */
> +		WARN_ON_ONCE(folio_pos(folio) > isize);

		WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);?

No need to have the extra local variable for something that shouldn't
ever happen.  Do you need i_size_read for correctness here?

--D

>  		offset = offset_in_folio(folio, pos);
>  		if (bytes > folio_size(folio) - offset)
>  			bytes = folio_size(folio) - offset;
> -- 
> 2.47.0
> 
>

Christoph Hellwig Nov. 11, 2024, 6:06 a.m. UTC | #2
Looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Brian Foster Nov. 12, 2024, 2:01 p.m. UTC | #3
On Fri, Nov 08, 2024 at 07:06:23PM -0800, Darrick J. Wong wrote:
> On Fri, Nov 08, 2024 at 07:42:46AM -0500, Brian Foster wrote:
> > iomap_zero_range() uses buffered writes for manual zeroing, no
> > longer updates i_size for such writes, but is still explicitly
> > called for post-eof ranges. The historical use case for this is
> > zeroing post-eof speculative preallocation on extending writes from
> > XFS. However, XFS also recently changed to convert all post-eof
> > delalloc mappings to unwritten in the iomap_begin() handler, which
> > means it now never expects manual zeroing of post-eof mappings. In
> > other words, all post-eof mappings should be reported as holes or
> > unwritten.
> > 
> > This is a subtle dependency that can be hard to detect if violated
> > because associated codepaths are likely to update i_size after folio
> > locks are dropped, but before writeback happens to occur. For
> > example, if XFS reverts back to some form of manual zeroing of
> > post-eof blocks on write extension, writeback of those zeroed folios
> > will now race with the presumed i_size update from the subsequent
> > buffered write.
> > 
> > Since iomap_zero_range() can't correctly zero post-eof mappings
> > beyond EOF without updating i_size, warn if this ever occurs. This
> > serves as minimal indication that if this use case is reintroduced
> > by a filesystem, iomap_zero_range() might need to reconsider i_size
> > updates for write extending use cases.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/iomap/buffered-io.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 7f40234a301e..e18830e4809b 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1354,6 +1354,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> >  {
> >  	loff_t pos = iter->pos;
> >  	loff_t length = iomap_length(iter);
> > +	loff_t isize = iter->inode->i_size;
> >  	loff_t written = 0;
> >  
> >  	do {
> > @@ -1369,6 +1370,8 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> >  		if (iter->iomap.flags & IOMAP_F_STALE)
> >  			break;
> >  
> > +		/* warn about zeroing folios beyond eof that won't write back */
> > +		WARN_ON_ONCE(folio_pos(folio) > isize);
> 
> 		WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);?
> 
> No need to have the extra local variable for something that shouldn't
> ever happen.  Do you need i_size_read for correctness here?
> 

Dropped isize. I didn't think we needed i_size_read() since we're
typically in an fs operation path, but I could be wrong. I haven't seen
any spurious warnings in my testing so far, at least.

Brian

> --D
> 
> >  		offset = offset_in_folio(folio, pos);
> >  		if (bytes > folio_size(folio) - offset)
> >  			bytes = folio_size(folio) - offset;
> > -- 
> > 2.47.0
> > 
> > 
>
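
For context on the i_size_read() question: as far as I know, the helper
exists because a plain load of the 64-bit i_size can tear on 32-bit SMP
builds, where the kernel guards it with a sequence counter, while on
64-bit builds it is essentially a plain load. The following is only a
userspace sketch of that retry-read pattern using C11 atomics. Every
mock_* name is hypothetical and none of this is the kernel's actual
implementation.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode whose 64-bit size is stored as two
 * 32-bit halves, the way a 32-bit CPU would have to load it. */
struct mock_inode {
	atomic_uint seq;		/* even: stable, odd: write in progress */
	_Atomic uint32_t size_lo;
	_Atomic uint32_t size_hi;
};

static void mock_inode_init(struct mock_inode *inode)
{
	atomic_init(&inode->seq, 0);
	atomic_init(&inode->size_lo, 0);
	atomic_init(&inode->size_hi, 0);
}

/* Writer: bump the counter to odd, store both halves, bump back to even.
 * Assumes writers are serialized by some outer lock, as this model has
 * no writer-vs-writer protection. */
static void mock_i_size_write(struct mock_inode *inode, uint64_t size)
{
	atomic_fetch_add(&inode->seq, 1);
	atomic_store(&inode->size_lo, (uint32_t)size);
	atomic_store(&inode->size_hi, (uint32_t)(size >> 32));
	atomic_fetch_add(&inode->seq, 1);
}

/* Reader: retry until both halves were read under one even counter value,
 * so the result is never a mix of an old and a new half. */
static uint64_t mock_i_size_read(struct mock_inode *inode)
{
	unsigned int start;
	uint64_t size;

	do {
		start = atomic_load(&inode->seq);
		size = ((uint64_t)atomic_load(&inode->size_hi) << 32) |
		       atomic_load(&inode->size_lo);
	} while ((start & 1) || atomic_load(&inode->seq) != start);

	return size;
}
```

A torn read in the WARN_ON_ONCE() check would at worst produce a spurious
or a missed warning, which is presumably why a plain i_size load was
considered acceptable for a debug-only assertion.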

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7f40234a301e..e18830e4809b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1354,6 +1354,7 @@  static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 {
 	loff_t pos = iter->pos;
 	loff_t length = iomap_length(iter);
+	loff_t isize = iter->inode->i_size;
 	loff_t written = 0;
 
 	do {
@@ -1369,6 +1370,8 @@  static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		if (iter->iomap.flags & IOMAP_F_STALE)
 			break;
 
+		/* warn about zeroing folios beyond eof that won't write back */
+		WARN_ON_ONCE(folio_pos(folio) > isize);
 		offset = offset_in_folio(folio, pos);
 		if (bytes > folio_size(folio) - offset)
 			bytes = folio_size(folio) - offset;
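
To illustrate the semantics of the new check, here is a small userspace
model. It is a sketch under stated assumptions: struct mock_inode, struct
mock_folio, mock_warn_on_once() and zero_check_folio() are hypothetical
stand-ins, not kernel definitions. It reads i_size inline rather than
through a local variable, along the lines suggested in review.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; not the kernel's inode or folio. */
struct mock_inode { long long i_size; };
struct mock_folio { long long pos; };	/* byte offset of the folio in the file */

static bool warned_once;	/* models the one-shot state of WARN_ON_ONCE() */
static int warn_count;		/* number of warnings actually emitted */

/* Emit a warning the first time cond is true; always return cond. */
static bool mock_warn_on_once(bool cond, const char *msg)
{
	if (cond && !warned_once) {
		warned_once = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Model of the check: warn if the folio starts strictly beyond i_size. */
static bool zero_check_folio(const struct mock_inode *inode,
			     const struct mock_folio *folio)
{
	return mock_warn_on_once(folio->pos > inode->i_size,
				 "zeroing folio beyond EOF");
}
```

Because the comparison is strict, a folio that starts exactly at i_size or
that straddles EOF does not trip the warning in this model; only a folio
that begins past EOF does, and only the first such case is reported.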