Message ID | 20241015153957.2099812-1-maharmstone@fb.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | generic: add test for missing btrfs csums in log when doing async on subpage vol | expand |
On Tue, Oct 15, 2024 at 4:42 PM Mark Harmstone <maharmstone@fb.com> wrote: > > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a > race could mean that csums weren't getting written to the log tree, > leading to corruption when it was replayed. > > The patches to detect log this tree corruption are in btrfs-progs 6.11. This shouldn't be needed right? Because after log replay the csums are missing and 'btrfs check' detects (IIRC) missing csums for extents referred by file extent items in a subvolume tree - if it doesn't then it should be improved. > > Signed-off-by: Mark Harmstone <maharmstone@fb.com> > --- > This is a genericized version of the test I originally proposed as > btrfs/333. > > tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/757.out | 2 ++ > 2 files changed, 73 insertions(+) > create mode 100755 tests/generic/757 > create mode 100644 tests/generic/757.out > > diff --git a/tests/generic/757 b/tests/generic/757 > new file mode 100755 > index 00000000..6ad3d01e > --- /dev/null > +++ b/tests/generic/757 > @@ -0,0 +1,71 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# > +# FS QA Test 757 > +# > +# Test async dio with fsync to test a btrfs bug where a race meant that csums > +# weren't getting written to the log tree, causing corruptions on remount. > +# This can be seen on subpage FSes on Linux 6.4. > +# > +. ./common/preamble > +_begin_fstest auto quick metadata log recoveryloop > + > +_fixed_by_kernel_commit e917ff56c8e7 \ > + "btrfs: determine synchronous writers from bio or writeback control" For generic tests what we do is: [ $FSTYP == "btrfs" ] && _fixed_by_kernel_commit ..... As long as the failure has not been observed and fixed on other filesystems. In case one day a regression happens in another fs, we just add a corresponding line using the same logic. Otherwise if the test one days fails on another fs and fstests suggests that that commit is missing, it would be odd. Everything else looks good, so with that fixed (maybe Zorro can change that when picking the patch): Reviewed-by: Filipe Manana <fdmanana@suse.com> Thanks. > + > +fio_config=$tmp.fio > + > +. ./common/dmlogwrites > + > +_require_scratch > +_require_log_writes > + > +cat >$fio_config <<EOF > +[global] > +iodepth=128 > +direct=1 > +ioengine=libaio > +rw=randwrite > +runtime=1s > +[job0] > +rw=randwrite > +filename=$SCRATCH_MNT/file > +size=1g > +fdatasync=1 > +EOF > + > +_require_fio $fio_config > + > +cat $fio_config >> $seqres.full > + > +_log_writes_init $SCRATCH_DEV > +_log_writes_mkfs >> $seqres.full 2>&1 > +_log_writes_mark mkfs > + > +_log_writes_mount > + > +$FIO_PROG $fio_config > /dev/null 2>&1 > +_log_writes_unmount > + > +_log_writes_remove > + > +prev=$(_log_writes_mark_to_entry_number mkfs) > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" > +cur=$(_log_writes_find_next_fua $prev) > +[ -z "$cur" ] && _fail "failed to locate next FUA write" > + > +while [ ! -z "$cur" ]; do > + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full > + > + _check_scratch_fs > + > + prev=$cur > + cur=$(_log_writes_find_next_fua $(($cur + 1))) > + [ -z "$cur" ] && break > +done > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/757.out b/tests/generic/757.out > new file mode 100644 > index 00000000..dfbc8094 > --- /dev/null > +++ b/tests/generic/757.out > @@ -0,0 +1,2 @@ > +QA output created by 757 > +Silence is golden > -- > 2.44.2 > >
On 16/10/24 12:09, Filipe Manana wrote: > > > On Tue, Oct 15, 2024 at 4:42 PM Mark Harmstone <maharmstone@fb.com> wrote: >> >> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a >> race could mean that csums weren't getting written to the log tree, >> leading to corruption when it was replayed. >> >> The patches to detect log this tree corruption are in btrfs-progs 6.11. > > This shouldn't be needed right? > Because after log replay the csums are missing and 'btrfs check' > detects (IIRC) missing csums for extents referred by file extent items > in a subvolume tree - if it doesn't then it should be improved. Yes, but we're not mounting it in the tests between the log_writes calls, so the log isn't getting replayed. The patches to btrfs check make it so that it identifies filesystems that would get corrupted as soon as they're next mounted. > >> >> Signed-off-by: Mark Harmstone <maharmstone@fb.com> >> --- >> This is a genericized version of the test I originally proposed as >> btrfs/333. >> >> tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ >> tests/generic/757.out | 2 ++ >> 2 files changed, 73 insertions(+) >> create mode 100755 tests/generic/757 >> create mode 100644 tests/generic/757.out >> >> diff --git a/tests/generic/757 b/tests/generic/757 >> new file mode 100755 >> index 00000000..6ad3d01e >> --- /dev/null >> +++ b/tests/generic/757 >> @@ -0,0 +1,71 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# >> +# FS QA Test 757 >> +# >> +# Test async dio with fsync to test a btrfs bug where a race meant that csums >> +# weren't getting written to the log tree, causing corruptions on remount. >> +# This can be seen on subpage FSes on Linux 6.4. >> +# >> +. ./common/preamble >> +_begin_fstest auto quick metadata log recoveryloop >> + >> +_fixed_by_kernel_commit e917ff56c8e7 \ >> + "btrfs: determine synchronous writers from bio or writeback control" > > For generic tests what we do is: > > [ $FSTYP == "btrfs" ] && _fixed_by_kernel_commit ..... > > As long as the failure has not been observed and fixed on other filesystems. > In case one day a regression happens in another fs, we just add a > corresponding line using the same logic. > > Otherwise if the test one days fails on another fs and fstests > suggests that that commit is missing, it would be odd. > > Everything else looks good, so with that fixed (maybe Zorro can change > that when picking the patch): > > Reviewed-by: Filipe Manana <fdmanana@suse.com> > > Thanks. > Thanks Filipe. Zorro, let me know if you're happy making this change, or otherwise I'll resubmit. Mark
On Fri, Oct 18, 2024 at 05:36:26PM +0000, Mark Harmstone wrote: > On 16/10/24 12:09, Filipe Manana wrote: > > > > > On Tue, Oct 15, 2024 at 4:42 PM Mark Harmstone <maharmstone@fb.com> wrote: > >> > >> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a > >> race could mean that csums weren't getting written to the log tree, > >> leading to corruption when it was replayed. > >> > >> The patches to detect log this tree corruption are in btrfs-progs 6.11. > > > > This shouldn't be needed right? > > Because after log replay the csums are missing and 'btrfs check' > > detects (IIRC) missing csums for extents referred by file extent items > > in a subvolume tree - if it doesn't then it should be improved. > > Yes, but we're not mounting it in the tests between the log_writes > calls, so the log isn't getting replayed. The patches to btrfs check > make it so that it identifies filesystems that would get corrupted as > soon as they're next mounted. > > > > >> > >> Signed-off-by: Mark Harmstone <maharmstone@fb.com> > >> --- > >> This is a genericized version of the test I originally proposed as > >> btrfs/333. > >> > >> tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ > >> tests/generic/757.out | 2 ++ > >> 2 files changed, 73 insertions(+) > >> create mode 100755 tests/generic/757 > >> create mode 100644 tests/generic/757.out > >> > >> diff --git a/tests/generic/757 b/tests/generic/757 > >> new file mode 100755 > >> index 00000000..6ad3d01e > >> --- /dev/null > >> +++ b/tests/generic/757 > >> @@ -0,0 +1,71 @@ > >> +#! /bin/bash > >> +# SPDX-License-Identifier: GPL-2.0 > >> +# > >> +# FS QA Test 757 > >> +# > >> +# Test async dio with fsync to test a btrfs bug where a race meant that csums > >> +# weren't getting written to the log tree, causing corruptions on remount. > >> +# This can be seen on subpage FSes on Linux 6.4. > >> +# > >> +. ./common/preamble > >> +_begin_fstest auto quick metadata log recoveryloop > >> + > >> +_fixed_by_kernel_commit e917ff56c8e7 \ > >> + "btrfs: determine synchronous writers from bio or writeback control" > > > > For generic tests what we do is: > > > > [ $FSTYP == "btrfs" ] && _fixed_by_kernel_commit ..... > > > > As long as the failure has not been observed and fixed on other filesystems. > > In case one day a regression happens in another fs, we just add a > > corresponding line using the same logic. > > > > Otherwise if the test one days fails on another fs and fstests > > suggests that that commit is missing, it would be odd. > > > > Everything else looks good, so with that fixed (maybe Zorro can change > > that when picking the patch): > > > > Reviewed-by: Filipe Manana <fdmanana@suse.com> > > > > Thanks. > > > > Thanks Filipe. Zorro, let me know if you're happy making this change, or > otherwise I'll resubmit. I can help to change that when I merge it. Thanks you and Filipe. Thanks, Zorro > > Mark > >
On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote: > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a > race could mean that csums weren't getting written to the log tree, > leading to corruption when it was replayed. > > The patches to detect log this tree corruption are in btrfs-progs 6.11. > > Signed-off-by: Mark Harmstone <maharmstone@fb.com> > --- > This is a genericized version of the test I originally proposed as > btrfs/333. > > tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/757.out | 2 ++ > 2 files changed, 73 insertions(+) > create mode 100755 tests/generic/757 > create mode 100644 tests/generic/757.out > > diff --git a/tests/generic/757 b/tests/generic/757 > new file mode 100755 > index 00000000..6ad3d01e > --- /dev/null > +++ b/tests/generic/757 > @@ -0,0 +1,71 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# > +# FS QA Test 757 > +# > +# Test async dio with fsync to test a btrfs bug where a race meant that csums > +# weren't getting written to the log tree, causing corruptions on remount. > +# This can be seen on subpage FSes on Linux 6.4. > +# > +. ./common/preamble > +_begin_fstest auto quick metadata log recoveryloop > + > +_fixed_by_kernel_commit e917ff56c8e7 \ > + "btrfs: determine synchronous writers from bio or writeback control" > + > +fio_config=$tmp.fio > + > +. ./common/dmlogwrites > + > +_require_scratch > +_require_log_writes > + > +cat >$fio_config <<EOF > +[global] > +iodepth=128 > +direct=1 > +ioengine=libaio > +rw=randwrite > +runtime=1s > +[job0] > +rw=randwrite > +filename=$SCRATCH_MNT/file > +size=1g > +fdatasync=1 > +EOF > + > +_require_fio $fio_config > + > +cat $fio_config >> $seqres.full > + > +_log_writes_init $SCRATCH_DEV > +_log_writes_mkfs >> $seqres.full 2>&1 > +_log_writes_mark mkfs > + > +_log_writes_mount For dmlogwrites test, we generally calls _log_writes_cleanup in _cleanup, to recover the SCRATCH_DEV anyway, even if this test is killed at the middle of its testing phase, to avoid later tests failed. I'll add below code in this case, when I merge it. If there's not objection from you. _cleanup() { cd / _log_writes_cleanup rm -f $tmp.* } Thanks, Zorro > + > +$FIO_PROG $fio_config > /dev/null 2>&1 > +_log_writes_unmount > + > +_log_writes_remove > + > +prev=$(_log_writes_mark_to_entry_number mkfs) > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" > +cur=$(_log_writes_find_next_fua $prev) > +[ -z "$cur" ] && _fail "failed to locate next FUA write" > + > +while [ ! -z "$cur" ]; do > + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full > + > + _check_scratch_fs > + > + prev=$cur > + cur=$(_log_writes_find_next_fua $(($cur + 1))) > + [ -z "$cur" ] && break > +done > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/757.out b/tests/generic/757.out > new file mode 100644 > index 00000000..dfbc8094 > --- /dev/null > +++ b/tests/generic/757.out > @@ -0,0 +1,2 @@ > +QA output created by 757 > +Silence is golden > -- > 2.44.2 > >
On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote: > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a > race could mean that csums weren't getting written to the log tree, > leading to corruption when it was replayed. > > The patches to detect log this tree corruption are in btrfs-progs 6.11. > > Signed-off-by: Mark Harmstone <maharmstone@fb.com> > --- Sorry, more review points below. I can help to change these if you say "yes" to all :) > This is a genericized version of the test I originally proposed as > btrfs/333. > > tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/757.out | 2 ++ > 2 files changed, 73 insertions(+) > create mode 100755 tests/generic/757 > create mode 100644 tests/generic/757.out > > diff --git a/tests/generic/757 b/tests/generic/757 > new file mode 100755 > index 00000000..6ad3d01e > --- /dev/null > +++ b/tests/generic/757 > @@ -0,0 +1,71 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# > +# FS QA Test 757 > +# > +# Test async dio with fsync to test a btrfs bug where a race meant that csums > +# weren't getting written to the log tree, causing corruptions on remount. > +# This can be seen on subpage FSes on Linux 6.4. > +# > +. ./common/preamble > +_begin_fstest auto quick metadata log recoveryloop ^^^ aio > + > +_fixed_by_kernel_commit e917ff56c8e7 \ > + "btrfs: determine synchronous writers from bio or writeback control" > + > +fio_config=$tmp.fio > + > +. ./common/dmlogwrites > + > +_require_scratch > +_require_log_writes > + > +cat >$fio_config <<EOF > +[global] > +iodepth=128 > +direct=1 > +ioengine=libaio _require_aiodio ? > +rw=randwrite > +runtime=1s > +[job0] > +rw=randwrite > +filename=$SCRATCH_MNT/file > +size=1g > +fdatasync=1 > +EOF > + > +_require_fio $fio_config > + > +cat $fio_config >> $seqres.full > + > +_log_writes_init $SCRATCH_DEV > +_log_writes_mkfs >> $seqres.full 2>&1 > +_log_writes_mark mkfs > + > +_log_writes_mount > + > +$FIO_PROG $fio_config > /dev/null 2>&1 Don't you care the output of fio running anymore? Maybe use > $seqres.full ? And just make sure, do you want to ignore failures of fio, as you do "2>&1"? Thanks, Zorro > +_log_writes_unmount > + > +_log_writes_remove > + > +prev=$(_log_writes_mark_to_entry_number mkfs) > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" > +cur=$(_log_writes_find_next_fua $prev) > +[ -z "$cur" ] && _fail "failed to locate next FUA write" > + > +while [ ! -z "$cur" ]; do > + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full > + > + _check_scratch_fs > + > + prev=$cur > + cur=$(_log_writes_find_next_fua $(($cur + 1))) > + [ -z "$cur" ] && break > +done > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/757.out b/tests/generic/757.out > new file mode 100644 > index 00000000..dfbc8094 > --- /dev/null > +++ b/tests/generic/757.out > @@ -0,0 +1,2 @@ > +QA output created by 757 > +Silence is golden > -- > 2.44.2 > >
These look good to me. Thank you for your help On 23/10/24 04:53, Zorro Lang wrote: > > > On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote: >> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a >> race could mean that csums weren't getting written to the log tree, >> leading to corruption when it was replayed. >> >> The patches to detect log this tree corruption are in btrfs-progs 6.11. >> >> Signed-off-by: Mark Harmstone <maharmstone@fb.com> >> --- > > Sorry, more review points below. I can help to change these if you say "yes" > to all :) > >> This is a genericized version of the test I originally proposed as >> btrfs/333. >> >> tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ >> tests/generic/757.out | 2 ++ >> 2 files changed, 73 insertions(+) >> create mode 100755 tests/generic/757 >> create mode 100644 tests/generic/757.out >> >> diff --git a/tests/generic/757 b/tests/generic/757 >> new file mode 100755 >> index 00000000..6ad3d01e >> --- /dev/null >> +++ b/tests/generic/757 >> @@ -0,0 +1,71 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# >> +# FS QA Test 757 >> +# >> +# Test async dio with fsync to test a btrfs bug where a race meant that csums >> +# weren't getting written to the log tree, causing corruptions on remount. >> +# This can be seen on subpage FSes on Linux 6.4. >> +# >> +. ./common/preamble >> +_begin_fstest auto quick metadata log recoveryloop > ^^^ > aio > >> + >> +_fixed_by_kernel_commit e917ff56c8e7 \ >> + "btrfs: determine synchronous writers from bio or writeback control" >> + >> +fio_config=$tmp.fio >> + >> +. ./common/dmlogwrites >> + >> +_require_scratch >> +_require_log_writes >> + >> +cat >$fio_config <<EOF >> +[global] >> +iodepth=128 >> +direct=1 >> +ioengine=libaio > > _require_aiodio ? > >> +rw=randwrite >> +runtime=1s >> +[job0] >> +rw=randwrite >> +filename=$SCRATCH_MNT/file >> +size=1g >> +fdatasync=1 >> +EOF >> + >> +_require_fio $fio_config >> + >> +cat $fio_config >> $seqres.full >> + >> +_log_writes_init $SCRATCH_DEV >> +_log_writes_mkfs >> $seqres.full 2>&1 >> +_log_writes_mark mkfs >> + >> +_log_writes_mount >> + >> +$FIO_PROG $fio_config > /dev/null 2>&1 > > Don't you care the output of fio running anymore? Maybe use > $seqres.full ? > > And just make sure, do you want to ignore failures of fio, as you do "2>&1"? > > Thanks, > Zorro > >> +_log_writes_unmount >> + >> +_log_writes_remove >> + >> +prev=$(_log_writes_mark_to_entry_number mkfs) >> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" >> +cur=$(_log_writes_find_next_fua $prev) >> +[ -z "$cur" ] && _fail "failed to locate next FUA write" >> + >> +while [ ! -z "$cur" ]; do >> + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full >> + >> + _check_scratch_fs >> + >> + prev=$cur >> + cur=$(_log_writes_find_next_fua $(($cur + 1))) >> + [ -z "$cur" ] && break >> +done >> + >> +echo "Silence is golden" >> + >> +# success, all done >> +status=0 >> +exit >> diff --git a/tests/generic/757.out b/tests/generic/757.out >> new file mode 100644 >> index 00000000..dfbc8094 >> --- /dev/null >> +++ b/tests/generic/757.out >> @@ -0,0 +1,2 @@ >> +QA output created by 757 >> +Silence is golden >> -- >> 2.44.2 >> >> >
On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote: > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a > race could mean that csums weren't getting written to the log tree, > leading to corruption when it was replayed. > > The patches to detect log this tree corruption are in btrfs-progs 6.11. > > Signed-off-by: Mark Harmstone <maharmstone@fb.com> > --- > This is a genericized version of the test I originally proposed as > btrfs/333. > > tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ > tests/generic/757.out | 2 ++ > 2 files changed, 73 insertions(+) > create mode 100755 tests/generic/757 > create mode 100644 tests/generic/757.out > > diff --git a/tests/generic/757 b/tests/generic/757 > new file mode 100755 > index 00000000..6ad3d01e > --- /dev/null > +++ b/tests/generic/757 > @@ -0,0 +1,71 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# > +# FS QA Test 757 > +# > +# Test async dio with fsync to test a btrfs bug where a race meant that csums > +# weren't getting written to the log tree, causing corruptions on remount. > +# This can be seen on subpage FSes on Linux 6.4. > +# > +. ./common/preamble > +_begin_fstest auto quick metadata log recoveryloop > + > +_fixed_by_kernel_commit e917ff56c8e7 \ > + "btrfs: determine synchronous writers from bio or writeback control" > + > +fio_config=$tmp.fio > + > +. ./common/dmlogwrites > + > +_require_scratch > +_require_log_writes > + > +cat >$fio_config <<EOF > +[global] > +iodepth=128 > +direct=1 > +ioengine=libaio > +rw=randwrite > +runtime=1s > +[job0] > +rw=randwrite > +filename=$SCRATCH_MNT/file > +size=1g > +fdatasync=1 > +EOF > + > +_require_fio $fio_config > + > +cat $fio_config >> $seqres.full > + > +_log_writes_init $SCRATCH_DEV > +_log_writes_mkfs >> $seqres.full 2>&1 > +_log_writes_mark mkfs > + > +_log_writes_mount > + > +$FIO_PROG $fio_config > /dev/null 2>&1 > +_log_writes_unmount > + > +_log_writes_remove > + > +prev=$(_log_writes_mark_to_entry_number mkfs) > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" > +cur=$(_log_writes_find_next_fua $prev) > +[ -z "$cur" ] && _fail "failed to locate next FUA write" > + > +while [ ! -z "$cur" ]; do > + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full > + > + _check_scratch_fs This test fails on xfs because (afaict) replaying the log to $cur results in $SCRATCH_DEV being a filesystem with a dirty log; and xfs_repair fails when it is given a filesystem with a dirty log. I then fixed the test to mount and unmount the filesystem to recovery the dirty log before invoking xfs_repair: # xfs_repair won't run if the log is dirty if [ $FSTYP = "xfs" ]; then _scratch_mount _scratch_unmount fi _check_scratch_fs But now the test takes a very long time to run because (on my system anyway) the fio run can initiate 17,000 FUAs, which means that this loop runs that many times. 100 iterations takes about 45 seconds, which is about two hours. Is it necessary to iterate the loop that many times to reproduce whatever issue btrfs had? --D > + > + prev=$cur > + cur=$(_log_writes_find_next_fua $(($cur + 1))) > + [ -z "$cur" ] && break > +done > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/757.out b/tests/generic/757.out > new file mode 100644 > index 00000000..dfbc8094 > --- /dev/null > +++ b/tests/generic/757.out > @@ -0,0 +1,2 @@ > +QA output created by 757 > +Silence is golden > -- > 2.44.2 > >
On Mon, Oct 28, 2024 at 02:57:28PM -0700, Darrick J. Wong wrote: > On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote: > > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a > > race could mean that csums weren't getting written to the log tree, > > leading to corruption when it was replayed. > > > > The patches to detect log this tree corruption are in btrfs-progs 6.11. > > > > Signed-off-by: Mark Harmstone <maharmstone@fb.com> > > --- > > This is a genericized version of the test I originally proposed as > > btrfs/333. > > > > tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ > > tests/generic/757.out | 2 ++ > > 2 files changed, 73 insertions(+) > > create mode 100755 tests/generic/757 > > create mode 100644 tests/generic/757.out > > > > diff --git a/tests/generic/757 b/tests/generic/757 > > new file mode 100755 > > index 00000000..6ad3d01e > > --- /dev/null > > +++ b/tests/generic/757 > > @@ -0,0 +1,71 @@ > > +#! /bin/bash > > +# SPDX-License-Identifier: GPL-2.0 > > +# > > +# FS QA Test 757 > > +# > > +# Test async dio with fsync to test a btrfs bug where a race meant that csums > > +# weren't getting written to the log tree, causing corruptions on remount. > > +# This can be seen on subpage FSes on Linux 6.4. > > +# > > +. ./common/preamble > > +_begin_fstest auto quick metadata log recoveryloop > > + > > +_fixed_by_kernel_commit e917ff56c8e7 \ > > + "btrfs: determine synchronous writers from bio or writeback control" > > + > > +fio_config=$tmp.fio > > + > > +. ./common/dmlogwrites > > + > > +_require_scratch > > +_require_log_writes > > + > > +cat >$fio_config <<EOF > > +[global] > > +iodepth=128 > > +direct=1 > > +ioengine=libaio > > +rw=randwrite > > +runtime=1s > > +[job0] > > +rw=randwrite > > +filename=$SCRATCH_MNT/file > > +size=1g > > +fdatasync=1 > > +EOF > > + > > +_require_fio $fio_config > > + > > +cat $fio_config >> $seqres.full > > + > > +_log_writes_init $SCRATCH_DEV > > +_log_writes_mkfs >> $seqres.full 2>&1 > > +_log_writes_mark mkfs > > + > > +_log_writes_mount > > + > > +$FIO_PROG $fio_config > /dev/null 2>&1 > > +_log_writes_unmount > > + > > +_log_writes_remove > > + > > +prev=$(_log_writes_mark_to_entry_number mkfs) > > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" > > +cur=$(_log_writes_find_next_fua $prev) > > +[ -z "$cur" ] && _fail "failed to locate next FUA write" > > + > > +while [ ! -z "$cur" ]; do > > + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full > > + > > + _check_scratch_fs > > This test fails on xfs because (afaict) replaying the log to $cur > results in $SCRATCH_DEV being a filesystem with a dirty log; and > xfs_repair fails when it is given a filesystem with a dirty log. > > I then fixed the test to mount and unmount the filesystem to recovery > the dirty log before invoking xfs_repair: > > # xfs_repair won't run if the log is dirty > if [ $FSTYP = "xfs" ]; then > _scratch_mount > _scratch_unmount > fi Thanks Darrick, you're right. I'm wondering can we always do a mount&unmount at here, no matter the $FSTYP, if that doesn't affect the testing of other filesystems? > _check_scratch_fs > > But now the test takes a very long time to run because (on my system > anyway) the fio run can initiate 17,000 FUAs, which means that this loop > runs that many times. 100 iterations takes about 45 seconds, which is > about two hours. > > Is it necessary to iterate the loop that many times to reproduce > whatever issue btrfs had? Yes, it takes much long time on my side too: FSTYP -- ext4 PLATFORM -- Linux/x86_64 dell-per750-47 6.12.0-rc4+ #1 SMP PREEMPT_DYNAMIC Fri Oct 25 14:25:45 EDT 2024 MKFS_OPTIONS -- -F /dev/sda4 MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch generic/757 4247s Ran: generic/757 Passed all 1 tests So better to reduce the testing time as much as possible, and remove it from the "quick" group. (Maybe we can have a tag to mark those cases need much long time too). This patch has been merged into for-next branch, as: cf97fa373 generic: add test for missing btrfs csums in log when doing async on subpage vol Please send another (or other two) patch to fix above 2 problems. Thanks, Zorro > > --D > > > + > > + prev=$cur > > + cur=$(_log_writes_find_next_fua $(($cur + 1))) > > + [ -z "$cur" ] && break > > +done > > + > > +echo "Silence is golden" > > + > > +# success, all done > > +status=0 > > +exit > > diff --git a/tests/generic/757.out b/tests/generic/757.out > > new file mode 100644 > > index 00000000..dfbc8094 > > --- /dev/null > > +++ b/tests/generic/757.out > > @@ -0,0 +1,2 @@ > > +QA output created by 757 > > +Silence is golden > > -- > > 2.44.2 > > > > >
On 29/10/24 05:11, Zorro Lang wrote: > On Mon, Oct 28, 2024 at 02:57:28PM -0700, Darrick J. Wong wrote: >> On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote: >>> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a >>> race could mean that csums weren't getting written to the log tree, >>> leading to corruption when it was replayed. >>> >>> The patches to detect log this tree corruption are in btrfs-progs 6.11. >>> >>> Signed-off-by: Mark Harmstone <maharmstone@fb.com> >>> --- >>> This is a genericized version of the test I originally proposed as >>> btrfs/333. >>> >>> tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ >>> tests/generic/757.out | 2 ++ >>> 2 files changed, 73 insertions(+) >>> create mode 100755 tests/generic/757 >>> create mode 100644 tests/generic/757.out >>> >>> diff --git a/tests/generic/757 b/tests/generic/757 >>> new file mode 100755 >>> index 00000000..6ad3d01e >>> --- /dev/null >>> +++ b/tests/generic/757 >>> @@ -0,0 +1,71 @@ >>> +#! /bin/bash >>> +# SPDX-License-Identifier: GPL-2.0 >>> +# >>> +# FS QA Test 757 >>> +# >>> +# Test async dio with fsync to test a btrfs bug where a race meant that csums >>> +# weren't getting written to the log tree, causing corruptions on remount. >>> +# This can be seen on subpage FSes on Linux 6.4. >>> +# >>> +. ./common/preamble >>> +_begin_fstest auto quick metadata log recoveryloop >>> + >>> +_fixed_by_kernel_commit e917ff56c8e7 \ >>> + "btrfs: determine synchronous writers from bio or writeback control" >>> + >>> +fio_config=$tmp.fio >>> + >>> +. ./common/dmlogwrites >>> + >>> +_require_scratch >>> +_require_log_writes >>> + >>> +cat >$fio_config <<EOF >>> +[global] >>> +iodepth=128 >>> +direct=1 >>> +ioengine=libaio >>> +rw=randwrite >>> +runtime=1s >>> +[job0] >>> +rw=randwrite >>> +filename=$SCRATCH_MNT/file >>> +size=1g >>> +fdatasync=1 >>> +EOF >>> + >>> +_require_fio $fio_config >>> + >>> +cat $fio_config >> $seqres.full >>> + >>> +_log_writes_init $SCRATCH_DEV >>> +_log_writes_mkfs >> $seqres.full 2>&1 >>> +_log_writes_mark mkfs >>> + >>> +_log_writes_mount >>> + >>> +$FIO_PROG $fio_config > /dev/null 2>&1 >>> +_log_writes_unmount >>> + >>> +_log_writes_remove >>> + >>> +prev=$(_log_writes_mark_to_entry_number mkfs) >>> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" >>> +cur=$(_log_writes_find_next_fua $prev) >>> +[ -z "$cur" ] && _fail "failed to locate next FUA write" >>> + >>> +while [ ! -z "$cur" ]; do >>> + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full >>> + >>> + _check_scratch_fs >> >> This test fails on xfs because (afaict) replaying the log to $cur >> results in $SCRATCH_DEV being a filesystem with a dirty log; and >> xfs_repair fails when it is given a filesystem with a dirty log. >> >> I then fixed the test to mount and unmount the filesystem to recovery >> the dirty log before invoking xfs_repair: >> >> # xfs_repair won't run if the log is dirty >> if [ $FSTYP = "xfs" ]; then >> _scratch_mount >> _scratch_unmount >> fi > > Thanks Darrick, you're right. > I'm wondering can we always do a mount&unmount at here, no matter the > $FSTYP, if that doesn't affect the testing of other filesystems? > >> _check_scratch_fs >> >> But now the test takes a very long time to run because (on my system >> anyway) the fio run can initiate 17,000 FUAs, which means that this loop >> runs that many times. 100 iterations takes about 45 seconds, which is >> about two hours. >> >> Is it necessary to iterate the loop that many times to reproduce >> whatever issue btrfs had? > > Yes, it takes much long time on my side too: > FSTYP -- ext4 > PLATFORM -- Linux/x86_64 dell-per750-47 6.12.0-rc4+ #1 SMP PREEMPT_DYNAMIC Fri Oct 25 14:25:45 EDT 2024 > MKFS_OPTIONS -- -F /dev/sda4 > MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch > > generic/757 4247s > Ran: generic/757 > Passed all 1 tests > > So better to reduce the testing time as much as possible, and remove it > from the "quick" group. (Maybe we can have a tag to mark those cases need > much long time too). Or maybe it should be made btrfs-specific again? That's how I originally wrote it, but Filipe Manana thought it ought to be genericized. An advantage of making it FS-specific is that you can use the --fsck option in log-writes, which is about 10 times quicker than doing the loop in bash. Mark
diff --git a/tests/generic/757 b/tests/generic/757 new file mode 100755 index 00000000..6ad3d01e --- /dev/null +++ b/tests/generic/757 @@ -0,0 +1,71 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# FS QA Test 757 +# +# Test async dio with fsync to test a btrfs bug where a race meant that csums +# weren't getting written to the log tree, causing corruptions on remount. +# This can be seen on subpage FSes on Linux 6.4. +# +. ./common/preamble +_begin_fstest auto quick metadata log recoveryloop + +_fixed_by_kernel_commit e917ff56c8e7 \ + "btrfs: determine synchronous writers from bio or writeback control" + +fio_config=$tmp.fio + +. ./common/dmlogwrites + +_require_scratch +_require_log_writes + +cat >$fio_config <<EOF +[global] +iodepth=128 +direct=1 +ioengine=libaio +rw=randwrite +runtime=1s +[job0] +rw=randwrite +filename=$SCRATCH_MNT/file +size=1g +fdatasync=1 +EOF + +_require_fio $fio_config + +cat $fio_config >> $seqres.full + +_log_writes_init $SCRATCH_DEV +_log_writes_mkfs >> $seqres.full 2>&1 +_log_writes_mark mkfs + +_log_writes_mount + +$FIO_PROG $fio_config > /dev/null 2>&1 +_log_writes_unmount + +_log_writes_remove + +prev=$(_log_writes_mark_to_entry_number mkfs) +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'" +cur=$(_log_writes_find_next_fua $prev) +[ -z "$cur" ] && _fail "failed to locate next FUA write" + +while [ ! -z "$cur" ]; do + _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full + + _check_scratch_fs + + prev=$cur + cur=$(_log_writes_find_next_fua $(($cur + 1))) + [ -z "$cur" ] && break +done + +echo "Silence is golden" + +# success, all done +status=0 +exit diff --git a/tests/generic/757.out b/tests/generic/757.out new file mode 100644 index 00000000..dfbc8094 --- /dev/null +++ b/tests/generic/757.out @@ -0,0 +1,2 @@ +QA output created by 757 +Silence is golden
Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a race could mean that csums weren't getting written to the log tree, leading to corruption when it was replayed. The patches to detect log this tree corruption are in btrfs-progs 6.11. Signed-off-by: Mark Harmstone <maharmstone@fb.com> --- This is a genericized version of the test I originally proposed as btrfs/333. tests/generic/757 | 71 +++++++++++++++++++++++++++++++++++++++++++ tests/generic/757.out | 2 ++ 2 files changed, 73 insertions(+) create mode 100755 tests/generic/757 create mode 100644 tests/generic/757.out