mbox series

[0/2] btrfs-progs: rely on "btrfstune --csum" to replace "btrfs check --init-csum-tree"

Message ID cover.1692688214.git.wqu@suse.com (mailing list archive)
Headers show
Series btrfs-progs: rely on "btrfstune --csum" to replace "btrfs check --init-csum-tree" | expand

Message

Qu Wenruo Aug. 22, 2023, 7:15 a.m. UTC
We had a report that "btrfs check --init-csum-tree" corrupted a
seemingly fine btrfs (which can originally pass "btrfs check
--readonly").

The root cause is in how we rebuild the csum tree, in the btrfs check
code, we screw up the csum tree root, then rely on the extent tree
repair code to finish the damage we introduced.

This can lead to unexpected corner cases, if the fs is already fine,
there is no need for such risky move.

Considering there are valid ways to cause data csum mismatch (mostly
O_DIRECT and modifying the buffer when it's still under writeback), some
users expect to use "btrfs check --init-csum-tree" to fix the csum
mismatch, which can lead to the same corruption.

Instead this patchset would recommend the end users to go "btrfstune
--csum", as it is way less risky by its design, and no more damage to
the fs caused by ourselves.

I hope we can completely go that direction when the "btrfstune --csum"
option is moved out of experimental features.

Qu Wenruo (2):
  btrfs-progs: tune: allow --csum to rebuild csum tree
  btrfs-progs: check: add advice to avoid --init-csum-tree

 check/main.c       |  9 +++++++++
 tune/change-csum.c | 35 +++++++++++++++++++++++++----------
 tune/main.c        |  3 ++-
 3 files changed, 36 insertions(+), 11 deletions(-)

--
2.41.0

Comments

David Sterba Aug. 28, 2023, 4:10 p.m. UTC | #1
On Tue, Aug 22, 2023 at 03:15:16PM +0800, Qu Wenruo wrote:
> We had a report that "btrfs check --init-csum-tree" corrupted a
> seemingly fine btrfs (which can originally pass "btrfs check
> --readonly").
> 
> The root cause is in how we rebuild the csum tree, in the btrfs check
> code, we screw up the csum tree root, then rely on the extent tree
> repair code to finish the damage we introduced.
> 
> This can lead to unexpected corner cases, if the fs is already fine,
> there is no need for such risky move.
> 
> Considering there are valid ways to cause data csum mismatch (mostly
> O_DIRECT and modifying the buffer when it's still under writeback), some
> users expect to use "btrfs check --init-csum-tree" to fix the csum
> mismatch, which can lead to the same corruption.
> 
> Instead this patchset would recommend the end users to go "btrfstune
> --csum", as it is way less risky by its design, and no more damage to
> the fs caused by ourselves.
> 
> I hope we can completely go that direction when the "btrfstune --csum"
> option is moved out of experimental features.
> 
> Qu Wenruo (2):
>   btrfs-progs: tune: allow --csum to rebuild csum tree
>   btrfs-progs: check: add advice to avoid --init-csum-tree

We can go farther and label the option as dangerous in a more visible
way, currently there's no warning and it's marked as dangerous only in
documentation and not even in the help text. I'd like to deprecate the
option and remove it from 'check' completely.

Adding support to btrfstune --csum to rebuild the checksum tree is OK.