mbox series

[v2,0/8] btrfs: make subpage handling be feature full

Message ID cover.1740635497.git.wqu@suse.com (mailing list archive)
Headers show
Series btrfs: make subpage handling be feature full | expand

Message

Qu Wenruo Feb. 27, 2025, 5:54 a.m. UTC
[CHANGLOG]
v2:
- Add a new bug fix which is exposed by recent 2K block size tests on
  x86_64
  It's a self deadlock where the folio_end_writeback() is called with
  subpage lock hold, and folio_end_writeback() will eventually call
  btrfs_release_folio() and try to lock the same spin lock.

- Various grammar fixes and commit message/comments updates

- Address comments from Filipe and adds his tags

v1:
- Merge previous two patches into one
- Re-order the patches so preparation/fixes are always before feature
  enablement
- Update the commit message of the first patch
  So that we do not focus on the out-of-tree part, but explain why it's
  not ideal in the first place (double page zeroing), and just mention
  the behavior change in the future.

Since the introduction of btrfs subapge support in v5.15, there are
quite some limitations:

- No RAID56 support
  Added in v5.19.

- No memory efficient scrub support
  Added in v6.4.

- No block perfect compressed write support
  Previously btrfs requires the dirty range to be fully page aligned, or
  it will skip compression completely.

  Added in v6.13.

- Various subpage related error handling fixes
  Added in v6.14.

- No support to create inline data extent
- No partial uptodate page support
  This is a long existing optimization supported by EXT4/XFS and
  is required to pass generic/563 with subpage block size.

The last two are addressed in this patchset.

And since all features are implemented for subpage cases, the long
existing experimental warning message can be dropped, as it is already
causing a lot of concerns for users who checks the dmesg carefully.

Qu Wenruo (8):
  btrfs: prevent inline data extents read from touching blocks beyond
    its range
  btrfs: subpage: do not hold subpage spin lock when clearing folio
    writeback
  btrfs: fix the qgroup data free range for inline data extents
  btrfs: introduce a read path dedicated extent lock helper
  btrfs: make btrfs_do_readpage() to do block-by-block read
  btrfs: allow buffered write to avoid full page read if it's block
    aligned
  btrfs: allow inline data extents creation if block size < page size
  btrfs: remove the subpage related warning message

 fs/btrfs/disk-io.c      |   5 -
 fs/btrfs/extent_io.c    | 229 +++++++++++++++++++++++++++++++++++-----
 fs/btrfs/file.c         |   5 +-
 fs/btrfs/inode.c        |  29 ++---
 fs/btrfs/ordered-data.c |  23 +++-
 fs/btrfs/ordered-data.h |   8 +-
 fs/btrfs/subpage.c      |  10 +-
 7 files changed, 246 insertions(+), 63 deletions(-)

Comments

David Sterba Feb. 27, 2025, 11:16 a.m. UTC | #1
On Thu, Feb 27, 2025 at 04:24:38PM +1030, Qu Wenruo wrote:
> [CHANGLOG]
> v2:
> - Add a new bug fix which is exposed by recent 2K block size tests on
>   x86_64
>   It's a self deadlock where the folio_end_writeback() is called with
>   subpage lock hold, and folio_end_writeback() will eventually call
>   btrfs_release_folio() and try to lock the same spin lock.
> 
> Since the introduction of btrfs subapge support in v5.15, there are
> quite some limitations:
> 
> - No RAID56 support
>   Added in v5.19.
> 
> - No memory efficient scrub support
>   Added in v6.4.
> 
> - No block perfect compressed write support
>   Previously btrfs requires the dirty range to be fully page aligned, or
>   it will skip compression completely.
> 
>   Added in v6.13.
> 
> - Various subpage related error handling fixes
>   Added in v6.14.
> 
> - No support to create inline data extent
> - No partial uptodate page support
>   This is a long existing optimization supported by EXT4/XFS and
>   is required to pass generic/563 with subpage block size.
> 
> The last two are addressed in this patchset.

That's great, thank you very much. I think not all patches have a
reviewed-by tag but I'd like to get it to for-next very soon.  The next
is rc5 and we should have all features in. This also means the 2K
nodesize block so we can give it more testing.
Qu Wenruo Feb. 28, 2025, 3:14 a.m. UTC | #2
在 2025/2/27 21:46, David Sterba 写道:
> On Thu, Feb 27, 2025 at 04:24:38PM +1030, Qu Wenruo wrote:
>> [CHANGLOG]
>> v2:
>> - Add a new bug fix which is exposed by recent 2K block size tests on
>>    x86_64
>>    It's a self deadlock where the folio_end_writeback() is called with
>>    subpage lock hold, and folio_end_writeback() will eventually call
>>    btrfs_release_folio() and try to lock the same spin lock.
>>
>> Since the introduction of btrfs subapge support in v5.15, there are
>> quite some limitations:
>>
>> - No RAID56 support
>>    Added in v5.19.
>>
>> - No memory efficient scrub support
>>    Added in v6.4.
>>
>> - No block perfect compressed write support
>>    Previously btrfs requires the dirty range to be fully page aligned, or
>>    it will skip compression completely.
>>
>>    Added in v6.13.
>>
>> - Various subpage related error handling fixes
>>    Added in v6.14.
>>
>> - No support to create inline data extent
>> - No partial uptodate page support
>>    This is a long existing optimization supported by EXT4/XFS and
>>    is required to pass generic/563 with subpage block size.
>>
>> The last two are addressed in this patchset.
> 
> That's great, thank you very much. I think not all patches have a
> reviewed-by tag but I'd like to get it to for-next very soon.  The next
> is rc5 and we should have all features in. This also means the 2K
> nodesize block so we can give it more testing.

Now pushed to for-next, with all patches reviewed (or acked) by Filipe.

Great thanks to Filipe for the review!

For the 2K one, since it's just two small patches I'm also fine pushing 
them now.
Just do not forget that we need progs patches, and a dozen of known 
failures from fstests, and I'm not yet able to address them all any time 
soon.

Thanks,
Qu
David Sterba Feb. 28, 2025, 12:42 p.m. UTC | #3
On Fri, Feb 28, 2025 at 01:44:04PM +1030, Qu Wenruo wrote:
> For the 2K one, since it's just two small patches I'm also fine pushing 
> them now.
> Just do not forget that we need progs patches, and a dozen of known 
> failures from fstests, and I'm not yet able to address them all any time 
> soon.

Yeah, the mkfs support can go to the next minor progs release. About the
status we can print a warning and document it. No need to focus on
fixing the fstests, I think stress testing will be sufficient for now.