Message ID | 4607d44ee4cf72a302d3adeba5d0ae99518a5f36.1712391866.git.sweettea-kernel@dorminy.me (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fallback if compressed IO fails for ENOSPC | expand |
在 2024/4/6 19:15, Sweet Tea Dorminy 写道: > In commit b4ccace878f4 ("btrfs: refactor submit_compressed_extents()"), if > an async extent compressed but failed to find enough space, we changed > from falling back to an uncompressed write to just failing the write > altogether. The principle was that if there's not enough space to write > the compressed version of the data, there can't possibly be enough space > to write the larger, uncompressed version of the data. > > However, this isn't necessarily true: due to fragmentation, there could > be enough discontiguous free blocks to write the uncompressed version, > but not enough contiguous free blocks to write the smaller but > unsplittable compressed version. You indeed got me, since for reserved writes we reserve the whole compressed size, meanwhile for uncompressed fallback we can afford minimal size of just a single sector, it indeed has its own chance. > > This has occurred to an internal workload which relied on write()'s > return value indicating there was space. While rare, it has happened a > few times. Just curious, how rare is such case? > > Thus, in order to prevent early ENOSPC, re-add a fallback to > uncompressed writing. > > Fixes: b4ccace878f4 ("btrfs: refactor submit_compressed_extents()") > Co-developed-by: Neal Gompa <neal@gompa.dev> > Signed-off-by: Neal Gompa <neal@gompa.dev> > Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> And the fix is really small and doesn't fallback to the old hellish code, looks pretty good to me. Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/inode.c | 13 ++++++------- > 1 file changed, 6 insertions(+), 7 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 3442dedff53d..15a13e191ee7 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -1148,13 +1148,13 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, > 0, *alloc_hint, &ins, 1, 1); > if (ret) { > /* > - * Here we used to try again by going back to non-compressed > - * path for ENOSPC. But we can't reserve space even for > - * compressed size, how could it work for uncompressed size > - * which requires larger size? So here we directly go error > - * path. > + * We can't reserve contiguous space for the compressed size. > + * Unlikely, but it's possible that we could have enough > + * non-contiguous space for the uncompressed size instead. So > + * fall back to uncompressed. > */ > - goto out_free; > + submit_uncompressed_range(inode, async_extent, locked_page); > + goto done; > } > > /* Here we're doing allocation and writeback of the compressed pages */ > @@ -1206,7 +1206,6 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, > out_free_reserve: > btrfs_dec_block_group_reservations(fs_info, ins.objectid); > btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1); > -out_free: > mapping_set_error(inode->vfs_inode.i_mapping, -EIO); > extent_clear_unlock_delalloc(inode, start, end, > NULL, EXTENT_LOCKED | EXTENT_DELALLOC |
On Sat, Apr 06, 2024 at 04:45:02AM -0400, Sweet Tea Dorminy wrote: > In commit b4ccace878f4 ("btrfs: refactor submit_compressed_extents()"), if > an async extent compressed but failed to find enough space, we changed > from falling back to an uncompressed write to just failing the write > altogether. The principle was that if there's not enough space to write > the compressed version of the data, there can't possibly be enough space > to write the larger, uncompressed version of the data. > > However, this isn't necessarily true: due to fragmentation, there could > be enough discontiguous free blocks to write the uncompressed version, > but not enough contiguous free blocks to write the smaller but > unsplittable compressed version. > > This has occurred to an internal workload which relied on write()'s > return value indicating there was space. While rare, it has happened a > few times. > > Thus, in order to prevent early ENOSPC, re-add a fallback to > uncompressed writing. > > Fixes: b4ccace878f4 ("btrfs: refactor submit_compressed_extents()") > Co-developed-by: Neal Gompa <neal@gompa.dev> > Signed-off-by: Neal Gompa <neal@gompa.dev> > Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Reviewed-by: David Sterba <dsterba@suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 3442dedff53d..15a13e191ee7 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1148,13 +1148,13 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, 0, *alloc_hint, &ins, 1, 1); if (ret) { /* - * Here we used to try again by going back to non-compressed - * path for ENOSPC. But we can't reserve space even for - * compressed size, how could it work for uncompressed size - * which requires larger size? So here we directly go error - * path. + * We can't reserve contiguous space for the compressed size. + * Unlikely, but it's possible that we could have enough + * non-contiguous space for the uncompressed size instead. So + * fall back to uncompressed. */ - goto out_free; + submit_uncompressed_range(inode, async_extent, locked_page); + goto done; } /* Here we're doing allocation and writeback of the compressed pages */ @@ -1206,7 +1206,6 @@ static void submit_one_async_extent(struct async_chunk *async_chunk, out_free_reserve: btrfs_dec_block_group_reservations(fs_info, ins.objectid); btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1); -out_free: mapping_set_error(inode->vfs_inode.i_mapping, -EIO); extent_clear_unlock_delalloc(inode, start, end, NULL, EXTENT_LOCKED | EXTENT_DELALLOC |