diff mbox series

btrfs-progs: free extent buffer after repairing wrong transid eb

Message ID 20220830124752.45550-1-glass@fydeos.io (mailing list archive)
State New, archived
Headers show
Series btrfs-progs: free extent buffer after repairing wrong transid eb | expand

Commit Message

Su Yue Aug. 30, 2022, 12:47 p.m. UTC
In read_tree_block, extent buffer EXTENT_BAD_TRANSID flagged will
be added into fs_info->recow_ebs with an increment of its refs.

The corresponding free_extent_buffer should be called after we
fix transid error by cowing extent buffer then remove them from
fs_info->recow_ebs.

Otherwise, extent buffers will be leaked as fsck-tests/002 reports:
===================================================================
====== RUN CHECK /root/btrfs-progs/btrfs check --repair --force ./default_case.img.restored
parent transid verify failed on 29360128 wanted 9 found 755944791
parent transid verify failed on 29360128 wanted 9 found 755944791
parent transid verify failed on 29360128 wanted 9 found 755944791
Ignoring transid failure
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
extent buffer leak: start 29360128 len 4096
enabling repair mode
===================================================================

Fixes: c64485544baa ("Btrfs-progs: keep track of transid failures and fix them if possible")
Signed-off-by: Su Yue <glass@fydeos.io>
---
 check/main.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Qu Wenruo Aug. 31, 2022, 8 a.m. UTC | #1
On 2022/8/30 20:47, Su Yue wrote:
> In read_tree_block, extent buffer EXTENT_BAD_TRANSID flagged will
> be added into fs_info->recow_ebs with an increment of its refs.
>
> The corresponding free_extent_buffer should be called after we
> fix transid error by cowing extent buffer then remove them from
> fs_info->recow_ebs.
>
> Otherwise, extent buffers will be leaked as fsck-tests/002 reports:
> ===================================================================
> ====== RUN CHECK /root/btrfs-progs/btrfs check --repair --force ./default_case.img.restored
> parent transid verify failed on 29360128 wanted 9 found 755944791
> parent transid verify failed on 29360128 wanted 9 found 755944791
> parent transid verify failed on 29360128 wanted 9 found 755944791
> Ignoring transid failure
> [1/7] checking root items
> Fixed 0 roots.
> [2/7] checking extents
> [3/7] checking free space cache
> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs
> [7/7] checking quota groups skipped (not enabled on this FS)
> extent buffer leak: start 29360128 len 4096
> enabling repair mode
> ===================================================================
>
> Fixes: c64485544baa ("Btrfs-progs: keep track of transid failures and fix them if possible")
> Signed-off-by: Su Yue <glass@fydeos.io>

Great to fix the fsck/002 runs.

Have you hit any other eb leaks? My extra noisy patch to crash progs
when eb leaks failed at fsck/002.

Not sure if there are any other remaining.

Thanks,
Qu
> ---
>   check/main.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/check/main.c b/check/main.c
> index 0ba38f73c0a4..0dd18d07ff5d 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -10966,6 +10966,7 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
>   				      struct extent_buffer, recow);
>   		list_del_init(&eb->recow);
>   		ret = recow_extent_buffer(root, eb);
> +		free_extent_buffer(eb);
>   		err |= !!ret;
>   		if (ret) {
>   			error("fails to fix transid errors");
Su Yue Aug. 31, 2022, 10:45 a.m. UTC | #2
On 2022/8/31 16:00, Qu Wenruo wrote:
> 
> 
> On 2022/8/30 20:47, Su Yue wrote:
>> In read_tree_block, extent buffer EXTENT_BAD_TRANSID flagged will
>> be added into fs_info->recow_ebs with an increment of its refs.
>>
>> The corresponding free_extent_buffer should be called after we
>> fix transid error by cowing extent buffer then remove them from
>> fs_info->recow_ebs.
>>
>> Otherwise, extent buffers will be leaked as fsck-tests/002 reports:
>> ===================================================================
>> ====== RUN CHECK /root/btrfs-progs/btrfs check --repair --force 
>> ./default_case.img.restored
>> parent transid verify failed on 29360128 wanted 9 found 755944791
>> parent transid verify failed on 29360128 wanted 9 found 755944791
>> parent transid verify failed on 29360128 wanted 9 found 755944791
>> Ignoring transid failure
>> [1/7] checking root items
>> Fixed 0 roots.
>> [2/7] checking extents
>> [3/7] checking free space cache
>> [4/7] checking fs roots
>> [5/7] checking only csums items (without verifying data)
>> [6/7] checking root refs
>> [7/7] checking quota groups skipped (not enabled on this FS)
>> extent buffer leak: start 29360128 len 4096
>> enabling repair mode
>> ===================================================================
>>
>> Fixes: c64485544baa ("Btrfs-progs: keep track of transid failures and 
>> fix them if possible")
>> Signed-off-by: Su Yue <glass@fydeos.io>
> 
> Great to fix the fsck/002 runs.
> 
> Have you hit any other eb leaks? My extra noisy patch to crash progs
> when eb leaks failed at fsck/002.
> 
No other eb leaks found with your patches.

> Not sure if there are any other remaining.

At least `make tests` seems happy for now.

--
Su
> 
> Thanks,
> Qu
>> ---
>>   check/main.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/check/main.c b/check/main.c
>> index 0ba38f73c0a4..0dd18d07ff5d 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -10966,6 +10966,7 @@ static int cmd_check(const struct cmd_struct 
>> *cmd, int argc, char **argv)
>>                         struct extent_buffer, recow);
>>           list_del_init(&eb->recow);
>>           ret = recow_extent_buffer(root, eb);
>> +        free_extent_buffer(eb);
>>           err |= !!ret;
>>           if (ret) {
>>               error("fails to fix transid errors");
Qu Wenruo Aug. 31, 2022, 10:47 a.m. UTC | #3
On 2022/8/31 18:45, Su Yue wrote:
>
>
> On 2022/8/31 16:00, Qu Wenruo wrote:
>>
>>
>> On 2022/8/30 20:47, Su Yue wrote:
>>> In read_tree_block, extent buffer EXTENT_BAD_TRANSID flagged will
>>> be added into fs_info->recow_ebs with an increment of its refs.
>>>
>>> The corresponding free_extent_buffer should be called after we
>>> fix transid error by cowing extent buffer then remove them from
>>> fs_info->recow_ebs.
>>>
>>> Otherwise, extent buffers will be leaked as fsck-tests/002 reports:
>>> ===================================================================
>>> ====== RUN CHECK /root/btrfs-progs/btrfs check --repair --force
>>> ./default_case.img.restored
>>> parent transid verify failed on 29360128 wanted 9 found 755944791
>>> parent transid verify failed on 29360128 wanted 9 found 755944791
>>> parent transid verify failed on 29360128 wanted 9 found 755944791
>>> Ignoring transid failure
>>> [1/7] checking root items
>>> Fixed 0 roots.
>>> [2/7] checking extents
>>> [3/7] checking free space cache
>>> [4/7] checking fs roots
>>> [5/7] checking only csums items (without verifying data)
>>> [6/7] checking root refs
>>> [7/7] checking quota groups skipped (not enabled on this FS)
>>> extent buffer leak: start 29360128 len 4096
>>> enabling repair mode
>>> ===================================================================
>>>
>>> Fixes: c64485544baa ("Btrfs-progs: keep track of transid failures and
>>> fix them if possible")
>>> Signed-off-by: Su Yue <glass@fydeos.io>
>>
>> Great to fix the fsck/002 runs.
>>
>> Have you hit any other eb leaks? My extra noisy patch to crash progs
>> when eb leaks failed at fsck/002.
>>
> No other eb leaks found with your patches.

Awesome, and it looks like I should look into the problem more carefully
before claiming we have tons of eb leaks.

Thanks,
Qu
>
>> Not sure if there are any other remaining.
>
> At least `make tests` seems happy for now.
>
> --
> Su
>>
>> Thanks,
>> Qu
>>> ---
>>>   check/main.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/check/main.c b/check/main.c
>>> index 0ba38f73c0a4..0dd18d07ff5d 100644
>>> --- a/check/main.c
>>> +++ b/check/main.c
>>> @@ -10966,6 +10966,7 @@ static int cmd_check(const struct cmd_struct
>>> *cmd, int argc, char **argv)
>>>                         struct extent_buffer, recow);
>>>           list_del_init(&eb->recow);
>>>           ret = recow_extent_buffer(root, eb);
>>> +        free_extent_buffer(eb);
>>>           err |= !!ret;
>>>           if (ret) {
>>>               error("fails to fix transid errors");
David Sterba Aug. 31, 2022, 2:37 p.m. UTC | #4
On Tue, Aug 30, 2022 at 08:47:52PM +0800, Su Yue wrote:
> In read_tree_block, extent buffer EXTENT_BAD_TRANSID flagged will
> be added into fs_info->recow_ebs with an increment of its refs.
> 
> The corresponding free_extent_buffer should be called after we
> fix transid error by cowing extent buffer then remove them from
> fs_info->recow_ebs.
> 
> Otherwise, extent buffers will be leaked as fsck-tests/002 reports:
> ===================================================================
> ====== RUN CHECK /root/btrfs-progs/btrfs check --repair --force ./default_case.img.restored
> parent transid verify failed on 29360128 wanted 9 found 755944791
> parent transid verify failed on 29360128 wanted 9 found 755944791
> parent transid verify failed on 29360128 wanted 9 found 755944791
> Ignoring transid failure
> [1/7] checking root items
> Fixed 0 roots.
> [2/7] checking extents
> [3/7] checking free space cache
> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs
> [7/7] checking quota groups skipped (not enabled on this FS)
> extent buffer leak: start 29360128 len 4096
> enabling repair mode
> ===================================================================
> 
> Fixes: c64485544baa ("Btrfs-progs: keep track of transid failures and fix them if possible")
> Signed-off-by: Su Yue <glass@fydeos.io>

Added to devel, thanks.
diff mbox series

Patch

diff --git a/check/main.c b/check/main.c
index 0ba38f73c0a4..0dd18d07ff5d 100644
--- a/check/main.c
+++ b/check/main.c
@@ -10966,6 +10966,7 @@  static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
 				      struct extent_buffer, recow);
 		list_del_init(&eb->recow);
 		ret = recow_extent_buffer(root, eb);
+		free_extent_buffer(eb);
 		err |= !!ret;
 		if (ret) {
 			error("fails to fix transid errors");