mbox series

[0/3] btrfs: extra debug output for sector size < page size cases

Message ID cover.1732680197.git.wqu@suse.com (mailing list archive)
Headers show
Series btrfs: extra debug output for sector size < page size cases | expand

Message

Qu Wenruo Nov. 27, 2024, 4:06 a.m. UTC
The first patch is the long existing bug that full subpage bitmap dump
is not working for checked bitmap.
Thankfully even myself is not affected by the bug.

The second one is for a crash I hit where ASSERT() got triggered in
btrfs_folio_set_locked() after a btrfs_run_delalloc_range() failure.

The last one is for the btrfs_run_delalloc_range() failure, which is
not that rare in my environment, I guess the unsafe cache mode for my
aarch64 VM makes it too easy to hit ENOSPC.

But ENOSPC from btrfs_run_delalloc_range() itself is already a problem
for our data/metadata space reservation code, thus it should be
outputted even for non-debug build.

Qu Wenruo (3):
  btrfs: subpage: fix the bitmap dump for the locked flags
  btrfs: subpage: dump the involved bitmap when ASSERT() failed
  btrfs: add extra error messages for extent_writepage() failure

 fs/btrfs/extent_io.c | 16 +++++++++++++++
 fs/btrfs/subpage.c   | 47 ++++++++++++++++++++++++++++++++------------
 2 files changed, 50 insertions(+), 13 deletions(-)

Comments

David Sterba Nov. 27, 2024, 3:30 p.m. UTC | #1
On Wed, Nov 27, 2024 at 02:36:35PM +1030, Qu Wenruo wrote:
> The first patch is the long existing bug that full subpage bitmap dump
> is not working for checked bitmap.
> Thankfully even myself is not affected by the bug.
> 
> The second one is for a crash I hit where ASSERT() got triggered in
> btrfs_folio_set_locked() after a btrfs_run_delalloc_range() failure.
> 
> The last one is for the btrfs_run_delalloc_range() failure, which is
> not that rare in my environment, I guess the unsafe cache mode for my
> aarch64 VM makes it too easy to hit ENOSPC.
> 
> But ENOSPC from btrfs_run_delalloc_range() itself is already a problem
> for our data/metadata space reservation code, thus it should be
> outputted even for non-debug build.
> 
> Qu Wenruo (3):
>   btrfs: subpage: fix the bitmap dump for the locked flags
>   btrfs: subpage: dump the involved bitmap when ASSERT() failed
>   btrfs: add extra error messages for extent_writepage() failure

Reviewed-by: David Sterba <dsterba@suse.com>
Qu Wenruo Nov. 27, 2024, 8:30 p.m. UTC | #2
在 2024/11/28 02:00, David Sterba 写道:
> On Wed, Nov 27, 2024 at 02:36:35PM +1030, Qu Wenruo wrote:
>> The first patch is the long existing bug that full subpage bitmap dump
>> is not working for checked bitmap.
>> Thankfully even myself is not affected by the bug.
>>
>> The second one is for a crash I hit where ASSERT() got triggered in
>> btrfs_folio_set_locked() after a btrfs_run_delalloc_range() failure.
>>
>> The last one is for the btrfs_run_delalloc_range() failure, which is
>> not that rare in my environment, I guess the unsafe cache mode for my
>> aarch64 VM makes it too easy to hit ENOSPC.
>>
>> But ENOSPC from btrfs_run_delalloc_range() itself is already a problem
>> for our data/metadata space reservation code, thus it should be
>> outputted even for non-debug build.
>>
>> Qu Wenruo (3):
>>    btrfs: subpage: fix the bitmap dump for the locked flags
>>    btrfs: subpage: dump the involved bitmap when ASSERT() failed
>>    btrfs: add extra error messages for extent_writepage() failure
>
> Reviewed-by: David Sterba <dsterba@suse.com>

Thanks for the review.

Although I'd prefer this series to be merged after the double accounting
fix (which also affects non-subpage cases)

The problem is in the 3rd patch, which is touching the code just after
btrfs_run_delalloc_range() returned.

That area is also where the double accounting fix is touching.

To make backport a little easier, I'd prefer make the fix first, then
the extra debug patches.

Thanks,
Qu
David Sterba Nov. 28, 2024, 1:10 p.m. UTC | #3
On Thu, Nov 28, 2024 at 07:00:58AM +1030, Qu Wenruo wrote:
> 
> 
> 在 2024/11/28 02:00, David Sterba 写道:
> > On Wed, Nov 27, 2024 at 02:36:35PM +1030, Qu Wenruo wrote:
> >> The first patch is the long existing bug that full subpage bitmap dump
> >> is not working for checked bitmap.
> >> Thankfully even myself is not affected by the bug.
> >>
> >> The second one is for a crash I hit where ASSERT() got triggered in
> >> btrfs_folio_set_locked() after a btrfs_run_delalloc_range() failure.
> >>
> >> The last one is for the btrfs_run_delalloc_range() failure, which is
> >> not that rare in my environment, I guess the unsafe cache mode for my
> >> aarch64 VM makes it too easy to hit ENOSPC.
> >>
> >> But ENOSPC from btrfs_run_delalloc_range() itself is already a problem
> >> for our data/metadata space reservation code, thus it should be
> >> outputted even for non-debug build.
> >>
> >> Qu Wenruo (3):
> >>    btrfs: subpage: fix the bitmap dump for the locked flags
> >>    btrfs: subpage: dump the involved bitmap when ASSERT() failed
> >>    btrfs: add extra error messages for extent_writepage() failure
> >
> > Reviewed-by: David Sterba <dsterba@suse.com>
> 
> Thanks for the review.
> 
> Although I'd prefer this series to be merged after the double accounting
> fix (which also affects non-subpage cases)
> 
> The problem is in the 3rd patch, which is touching the code just after
> btrfs_run_delalloc_range() returned.
> 
> That area is also where the double accounting fix is touching.
> 
> To make backport a little easier, I'd prefer make the fix first, then
> the extra debug patches.

Right, no objections here. As you've been finding the problems any
debugging and reporting improvement is good.