diff mbox series

generic: add test for missing btrfs csums in log when doing async on subpage vol

Message ID 20241015153957.2099812-1-maharmstone@fb.com (mailing list archive)
State New, archived
Headers show
Series generic: add test for missing btrfs csums in log when doing async on subpage vol | expand

Commit Message

Mark Harmstone Oct. 15, 2024, 3:39 p.m. UTC
Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
race could mean that csums weren't getting written to the log tree,
leading to corruption when it was replayed.

The patches to detect log this tree corruption are in btrfs-progs 6.11.

Signed-off-by: Mark Harmstone <maharmstone@fb.com>
---
This is a genericized version of the test I originally proposed as
btrfs/333.

 tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
 tests/generic/757.out |  2 ++
 2 files changed, 73 insertions(+)
 create mode 100755 tests/generic/757
 create mode 100644 tests/generic/757.out

Comments

Filipe Manana Oct. 16, 2024, 11:09 a.m. UTC | #1
On Tue, Oct 15, 2024 at 4:42 PM Mark Harmstone <maharmstone@fb.com> wrote:
>
> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> race could mean that csums weren't getting written to the log tree,
> leading to corruption when it was replayed.
>
> The patches to detect log this tree corruption are in btrfs-progs 6.11.

This shouldn't be needed right?
Because after log replay the csums are missing and 'btrfs check'
detects (IIRC) missing csums for extents referred by file extent items
in a subvolume tree - if it doesn't then it should be improved.

>
> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> ---
> This is a genericized version of the test I originally proposed as
> btrfs/333.
>
>  tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/757.out |  2 ++
>  2 files changed, 73 insertions(+)
>  create mode 100755 tests/generic/757
>  create mode 100644 tests/generic/757.out
>
> diff --git a/tests/generic/757 b/tests/generic/757
> new file mode 100755
> index 00000000..6ad3d01e
> --- /dev/null
> +++ b/tests/generic/757
> @@ -0,0 +1,71 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# FS QA Test 757
> +#
> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> +# weren't getting written to the log tree, causing corruptions on remount.
> +# This can be seen on subpage FSes on Linux 6.4.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick metadata log recoveryloop
> +
> +_fixed_by_kernel_commit e917ff56c8e7 \
> +       "btrfs: determine synchronous writers from bio or writeback control"

For generic tests what we do is:

[ $FSTYP == "btrfs" ] && _fixed_by_kernel_commit .....

As long as the failure has not been observed and fixed on other filesystems.
In case one day a regression happens in another fs, we just add a
corresponding line using the same logic.

Otherwise if the test one days fails on another fs and fstests
suggests that that commit is missing, it would be odd.

Everything else looks good, so with that fixed (maybe Zorro can change
that when picking the patch):

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.


> +
> +fio_config=$tmp.fio
> +
> +. ./common/dmlogwrites
> +
> +_require_scratch
> +_require_log_writes
> +
> +cat >$fio_config <<EOF
> +[global]
> +iodepth=128
> +direct=1
> +ioengine=libaio
> +rw=randwrite
> +runtime=1s
> +[job0]
> +rw=randwrite
> +filename=$SCRATCH_MNT/file
> +size=1g
> +fdatasync=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +
> +_log_writes_init $SCRATCH_DEV
> +_log_writes_mkfs >> $seqres.full 2>&1
> +_log_writes_mark mkfs
> +
> +_log_writes_mount
> +
> +$FIO_PROG $fio_config > /dev/null 2>&1
> +_log_writes_unmount
> +
> +_log_writes_remove
> +
> +prev=$(_log_writes_mark_to_entry_number mkfs)
> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
> +cur=$(_log_writes_find_next_fua $prev)
> +[ -z "$cur" ] && _fail "failed to locate next FUA write"
> +
> +while [ ! -z "$cur" ]; do
> +       _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> +
> +       _check_scratch_fs
> +
> +       prev=$cur
> +       cur=$(_log_writes_find_next_fua $(($cur + 1)))
> +       [ -z "$cur" ] && break
> +done
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/757.out b/tests/generic/757.out
> new file mode 100644
> index 00000000..dfbc8094
> --- /dev/null
> +++ b/tests/generic/757.out
> @@ -0,0 +1,2 @@
> +QA output created by 757
> +Silence is golden
> --
> 2.44.2
>
>
Mark Harmstone Oct. 18, 2024, 5:36 p.m. UTC | #2
On 16/10/24 12:09, Filipe Manana wrote:
> > 
> On Tue, Oct 15, 2024 at 4:42 PM Mark Harmstone <maharmstone@fb.com> wrote:
>>
>> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
>> race could mean that csums weren't getting written to the log tree,
>> leading to corruption when it was replayed.
>>
>> The patches to detect log this tree corruption are in btrfs-progs 6.11.
> 
> This shouldn't be needed right?
> Because after log replay the csums are missing and 'btrfs check'
> detects (IIRC) missing csums for extents referred by file extent items
> in a subvolume tree - if it doesn't then it should be improved.

Yes, but we're not mounting it in the tests between the log_writes 
calls, so the log isn't getting replayed. The patches to btrfs check 
make it so that it identifies filesystems that would get corrupted as 
soon as they're next mounted.

> 
>>
>> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
>> ---
>> This is a genericized version of the test I originally proposed as
>> btrfs/333.
>>
>>   tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>>   tests/generic/757.out |  2 ++
>>   2 files changed, 73 insertions(+)
>>   create mode 100755 tests/generic/757
>>   create mode 100644 tests/generic/757.out
>>
>> diff --git a/tests/generic/757 b/tests/generic/757
>> new file mode 100755
>> index 00000000..6ad3d01e
>> --- /dev/null
>> +++ b/tests/generic/757
>> @@ -0,0 +1,71 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# FS QA Test 757
>> +#
>> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
>> +# weren't getting written to the log tree, causing corruptions on remount.
>> +# This can be seen on subpage FSes on Linux 6.4.
>> +#
>> +. ./common/preamble
>> +_begin_fstest auto quick metadata log recoveryloop
>> +
>> +_fixed_by_kernel_commit e917ff56c8e7 \
>> +       "btrfs: determine synchronous writers from bio or writeback control"
> 
> For generic tests what we do is:
> 
> [ $FSTYP == "btrfs" ] && _fixed_by_kernel_commit .....
> 
> As long as the failure has not been observed and fixed on other filesystems.
> In case one day a regression happens in another fs, we just add a
> corresponding line using the same logic.
> 
> Otherwise if the test one days fails on another fs and fstests
> suggests that that commit is missing, it would be odd.
> 
> Everything else looks good, so with that fixed (maybe Zorro can change
> that when picking the patch):
> 
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> 
> Thanks.
> 

Thanks Filipe. Zorro, let me know if you're happy making this change, or 
otherwise I'll resubmit.

Mark
Zorro Lang Oct. 23, 2024, 3:31 a.m. UTC | #3
On Fri, Oct 18, 2024 at 05:36:26PM +0000, Mark Harmstone wrote:
> On 16/10/24 12:09, Filipe Manana wrote:
> > > 
> > On Tue, Oct 15, 2024 at 4:42 PM Mark Harmstone <maharmstone@fb.com> wrote:
> >>
> >> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> >> race could mean that csums weren't getting written to the log tree,
> >> leading to corruption when it was replayed.
> >>
> >> The patches to detect log this tree corruption are in btrfs-progs 6.11.
> > 
> > This shouldn't be needed right?
> > Because after log replay the csums are missing and 'btrfs check'
> > detects (IIRC) missing csums for extents referred by file extent items
> > in a subvolume tree - if it doesn't then it should be improved.
> 
> Yes, but we're not mounting it in the tests between the log_writes 
> calls, so the log isn't getting replayed. The patches to btrfs check 
> make it so that it identifies filesystems that would get corrupted as 
> soon as they're next mounted.
> 
> > 
> >>
> >> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> >> ---
> >> This is a genericized version of the test I originally proposed as
> >> btrfs/333.
> >>
> >>   tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
> >>   tests/generic/757.out |  2 ++
> >>   2 files changed, 73 insertions(+)
> >>   create mode 100755 tests/generic/757
> >>   create mode 100644 tests/generic/757.out
> >>
> >> diff --git a/tests/generic/757 b/tests/generic/757
> >> new file mode 100755
> >> index 00000000..6ad3d01e
> >> --- /dev/null
> >> +++ b/tests/generic/757
> >> @@ -0,0 +1,71 @@
> >> +#! /bin/bash
> >> +# SPDX-License-Identifier: GPL-2.0
> >> +#
> >> +# FS QA Test 757
> >> +#
> >> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> >> +# weren't getting written to the log tree, causing corruptions on remount.
> >> +# This can be seen on subpage FSes on Linux 6.4.
> >> +#
> >> +. ./common/preamble
> >> +_begin_fstest auto quick metadata log recoveryloop
> >> +
> >> +_fixed_by_kernel_commit e917ff56c8e7 \
> >> +       "btrfs: determine synchronous writers from bio or writeback control"
> > 
> > For generic tests what we do is:
> > 
> > [ $FSTYP == "btrfs" ] && _fixed_by_kernel_commit .....
> > 
> > As long as the failure has not been observed and fixed on other filesystems.
> > In case one day a regression happens in another fs, we just add a
> > corresponding line using the same logic.
> > 
> > Otherwise if the test one days fails on another fs and fstests
> > suggests that that commit is missing, it would be odd.
> > 
> > Everything else looks good, so with that fixed (maybe Zorro can change
> > that when picking the patch):
> > 
> > Reviewed-by: Filipe Manana <fdmanana@suse.com>
> > 
> > Thanks.
> > 
> 
> Thanks Filipe. Zorro, let me know if you're happy making this change, or 
> otherwise I'll resubmit.

I can help to change that when I merge it. Thanks you and Filipe.

Thanks,
Zorro

> 
> Mark
> 
>
Zorro Lang Oct. 23, 2024, 3:43 a.m. UTC | #4
On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> race could mean that csums weren't getting written to the log tree,
> leading to corruption when it was replayed.
> 
> The patches to detect log this tree corruption are in btrfs-progs 6.11.
> 
> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> ---
> This is a genericized version of the test I originally proposed as
> btrfs/333.
> 
>  tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/757.out |  2 ++
>  2 files changed, 73 insertions(+)
>  create mode 100755 tests/generic/757
>  create mode 100644 tests/generic/757.out
> 
> diff --git a/tests/generic/757 b/tests/generic/757
> new file mode 100755
> index 00000000..6ad3d01e
> --- /dev/null
> +++ b/tests/generic/757
> @@ -0,0 +1,71 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# FS QA Test 757
> +#
> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> +# weren't getting written to the log tree, causing corruptions on remount.
> +# This can be seen on subpage FSes on Linux 6.4.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick metadata log recoveryloop
> +
> +_fixed_by_kernel_commit e917ff56c8e7 \
> +	"btrfs: determine synchronous writers from bio or writeback control"
> +
> +fio_config=$tmp.fio
> +
> +. ./common/dmlogwrites
> +
> +_require_scratch
> +_require_log_writes
> +
> +cat >$fio_config <<EOF
> +[global]
> +iodepth=128
> +direct=1
> +ioengine=libaio
> +rw=randwrite
> +runtime=1s
> +[job0]
> +rw=randwrite
> +filename=$SCRATCH_MNT/file
> +size=1g
> +fdatasync=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +
> +_log_writes_init $SCRATCH_DEV
> +_log_writes_mkfs >> $seqres.full 2>&1
> +_log_writes_mark mkfs
> +
> +_log_writes_mount

For dmlogwrites test, we generally calls _log_writes_cleanup in _cleanup, to
recover the SCRATCH_DEV anyway, even if this test is killed at the middle of
its testing phase, to avoid later tests failed.

I'll add below code in this case, when I merge it. If there's not objection
from you.

_cleanup()
{
	cd /
	_log_writes_cleanup
	rm -f $tmp.*
}

Thanks,
Zorro

> +
> +$FIO_PROG $fio_config > /dev/null 2>&1
> +_log_writes_unmount
> +
> +_log_writes_remove
> +
> +prev=$(_log_writes_mark_to_entry_number mkfs)
> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
> +cur=$(_log_writes_find_next_fua $prev)
> +[ -z "$cur" ] && _fail "failed to locate next FUA write"
> +
> +while [ ! -z "$cur" ]; do
> +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> +
> +	_check_scratch_fs
> +
> +	prev=$cur
> +	cur=$(_log_writes_find_next_fua $(($cur + 1)))
> +	[ -z "$cur" ] && break
> +done
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/757.out b/tests/generic/757.out
> new file mode 100644
> index 00000000..dfbc8094
> --- /dev/null
> +++ b/tests/generic/757.out
> @@ -0,0 +1,2 @@
> +QA output created by 757
> +Silence is golden
> -- 
> 2.44.2
> 
>
Zorro Lang Oct. 23, 2024, 3:53 a.m. UTC | #5
On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> race could mean that csums weren't getting written to the log tree,
> leading to corruption when it was replayed.
> 
> The patches to detect log this tree corruption are in btrfs-progs 6.11.
> 
> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> ---

Sorry, more review points below. I can help to change these if you say "yes"
to all :)

> This is a genericized version of the test I originally proposed as
> btrfs/333.
> 
>  tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/757.out |  2 ++
>  2 files changed, 73 insertions(+)
>  create mode 100755 tests/generic/757
>  create mode 100644 tests/generic/757.out
> 
> diff --git a/tests/generic/757 b/tests/generic/757
> new file mode 100755
> index 00000000..6ad3d01e
> --- /dev/null
> +++ b/tests/generic/757
> @@ -0,0 +1,71 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# FS QA Test 757
> +#
> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> +# weren't getting written to the log tree, causing corruptions on remount.
> +# This can be seen on subpage FSes on Linux 6.4.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick metadata log recoveryloop
                                                      ^^^
                                                      aio

> +
> +_fixed_by_kernel_commit e917ff56c8e7 \
> +	"btrfs: determine synchronous writers from bio or writeback control"
> +
> +fio_config=$tmp.fio
> +
> +. ./common/dmlogwrites
> +
> +_require_scratch
> +_require_log_writes
> +
> +cat >$fio_config <<EOF
> +[global]
> +iodepth=128
> +direct=1
> +ioengine=libaio

_require_aiodio ?

> +rw=randwrite
> +runtime=1s
> +[job0]
> +rw=randwrite
> +filename=$SCRATCH_MNT/file
> +size=1g
> +fdatasync=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +
> +_log_writes_init $SCRATCH_DEV
> +_log_writes_mkfs >> $seqres.full 2>&1
> +_log_writes_mark mkfs
> +
> +_log_writes_mount
> +
> +$FIO_PROG $fio_config > /dev/null 2>&1

Don't you care the output of fio running anymore? Maybe use > $seqres.full ?

And just make sure, do you want to ignore failures of fio, as you do "2>&1"?

Thanks,
Zorro

> +_log_writes_unmount
> +
> +_log_writes_remove
> +
> +prev=$(_log_writes_mark_to_entry_number mkfs)
> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
> +cur=$(_log_writes_find_next_fua $prev)
> +[ -z "$cur" ] && _fail "failed to locate next FUA write"
> +
> +while [ ! -z "$cur" ]; do
> +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> +
> +	_check_scratch_fs
> +
> +	prev=$cur
> +	cur=$(_log_writes_find_next_fua $(($cur + 1)))
> +	[ -z "$cur" ] && break
> +done
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/757.out b/tests/generic/757.out
> new file mode 100644
> index 00000000..dfbc8094
> --- /dev/null
> +++ b/tests/generic/757.out
> @@ -0,0 +1,2 @@
> +QA output created by 757
> +Silence is golden
> -- 
> 2.44.2
> 
>
Mark Harmstone Oct. 23, 2024, 10:34 a.m. UTC | #6
These look good to me. Thank you for your help

On 23/10/24 04:53, Zorro Lang wrote:
> > 
> On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
>> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
>> race could mean that csums weren't getting written to the log tree,
>> leading to corruption when it was replayed.
>>
>> The patches to detect log this tree corruption are in btrfs-progs 6.11.
>>
>> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
>> ---
> 
> Sorry, more review points below. I can help to change these if you say "yes"
> to all :)
> 
>> This is a genericized version of the test I originally proposed as
>> btrfs/333.
>>
>>   tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>>   tests/generic/757.out |  2 ++
>>   2 files changed, 73 insertions(+)
>>   create mode 100755 tests/generic/757
>>   create mode 100644 tests/generic/757.out
>>
>> diff --git a/tests/generic/757 b/tests/generic/757
>> new file mode 100755
>> index 00000000..6ad3d01e
>> --- /dev/null
>> +++ b/tests/generic/757
>> @@ -0,0 +1,71 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# FS QA Test 757
>> +#
>> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
>> +# weren't getting written to the log tree, causing corruptions on remount.
>> +# This can be seen on subpage FSes on Linux 6.4.
>> +#
>> +. ./common/preamble
>> +_begin_fstest auto quick metadata log recoveryloop
>                                                        ^^^
>                                                        aio
> 
>> +
>> +_fixed_by_kernel_commit e917ff56c8e7 \
>> +	"btrfs: determine synchronous writers from bio or writeback control"
>> +
>> +fio_config=$tmp.fio
>> +
>> +. ./common/dmlogwrites
>> +
>> +_require_scratch
>> +_require_log_writes
>> +
>> +cat >$fio_config <<EOF
>> +[global]
>> +iodepth=128
>> +direct=1
>> +ioengine=libaio
> 
> _require_aiodio ?
> 
>> +rw=randwrite
>> +runtime=1s
>> +[job0]
>> +rw=randwrite
>> +filename=$SCRATCH_MNT/file
>> +size=1g
>> +fdatasync=1
>> +EOF
>> +
>> +_require_fio $fio_config
>> +
>> +cat $fio_config >> $seqres.full
>> +
>> +_log_writes_init $SCRATCH_DEV
>> +_log_writes_mkfs >> $seqres.full 2>&1
>> +_log_writes_mark mkfs
>> +
>> +_log_writes_mount
>> +
>> +$FIO_PROG $fio_config > /dev/null 2>&1
> 
> Don't you care the output of fio running anymore? Maybe use > $seqres.full ?
> 
> And just make sure, do you want to ignore failures of fio, as you do "2>&1"?
> 
> Thanks,
> Zorro
> 
>> +_log_writes_unmount
>> +
>> +_log_writes_remove
>> +
>> +prev=$(_log_writes_mark_to_entry_number mkfs)
>> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
>> +cur=$(_log_writes_find_next_fua $prev)
>> +[ -z "$cur" ] && _fail "failed to locate next FUA write"
>> +
>> +while [ ! -z "$cur" ]; do
>> +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
>> +
>> +	_check_scratch_fs
>> +
>> +	prev=$cur
>> +	cur=$(_log_writes_find_next_fua $(($cur + 1)))
>> +	[ -z "$cur" ] && break
>> +done
>> +
>> +echo "Silence is golden"
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/generic/757.out b/tests/generic/757.out
>> new file mode 100644
>> index 00000000..dfbc8094
>> --- /dev/null
>> +++ b/tests/generic/757.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 757
>> +Silence is golden
>> -- 
>> 2.44.2
>>
>>
>
Darrick J. Wong Oct. 28, 2024, 9:57 p.m. UTC | #7
On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> race could mean that csums weren't getting written to the log tree,
> leading to corruption when it was replayed.
> 
> The patches to detect log this tree corruption are in btrfs-progs 6.11.
> 
> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> ---
> This is a genericized version of the test I originally proposed as
> btrfs/333.
> 
>  tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/757.out |  2 ++
>  2 files changed, 73 insertions(+)
>  create mode 100755 tests/generic/757
>  create mode 100644 tests/generic/757.out
> 
> diff --git a/tests/generic/757 b/tests/generic/757
> new file mode 100755
> index 00000000..6ad3d01e
> --- /dev/null
> +++ b/tests/generic/757
> @@ -0,0 +1,71 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# FS QA Test 757
> +#
> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> +# weren't getting written to the log tree, causing corruptions on remount.
> +# This can be seen on subpage FSes on Linux 6.4.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick metadata log recoveryloop
> +
> +_fixed_by_kernel_commit e917ff56c8e7 \
> +	"btrfs: determine synchronous writers from bio or writeback control"
> +
> +fio_config=$tmp.fio
> +
> +. ./common/dmlogwrites
> +
> +_require_scratch
> +_require_log_writes
> +
> +cat >$fio_config <<EOF
> +[global]
> +iodepth=128
> +direct=1
> +ioengine=libaio
> +rw=randwrite
> +runtime=1s
> +[job0]
> +rw=randwrite
> +filename=$SCRATCH_MNT/file
> +size=1g
> +fdatasync=1
> +EOF
> +
> +_require_fio $fio_config
> +
> +cat $fio_config >> $seqres.full
> +
> +_log_writes_init $SCRATCH_DEV
> +_log_writes_mkfs >> $seqres.full 2>&1
> +_log_writes_mark mkfs
> +
> +_log_writes_mount
> +
> +$FIO_PROG $fio_config > /dev/null 2>&1
> +_log_writes_unmount
> +
> +_log_writes_remove
> +
> +prev=$(_log_writes_mark_to_entry_number mkfs)
> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
> +cur=$(_log_writes_find_next_fua $prev)
> +[ -z "$cur" ] && _fail "failed to locate next FUA write"
> +
> +while [ ! -z "$cur" ]; do
> +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> +
> +	_check_scratch_fs

This test fails on xfs because (afaict) replaying the log to $cur
results in $SCRATCH_DEV being a filesystem with a dirty log; and
xfs_repair fails when it is given a filesystem with a dirty log.

I then fixed the test to mount and unmount the filesystem to recovery
the dirty log before invoking xfs_repair:

	# xfs_repair won't run if the log is dirty
	if [ $FSTYP = "xfs" ]; then
		_scratch_mount
		_scratch_unmount
	fi
	_check_scratch_fs

But now the test takes a very long time to run because (on my system
anyway) the fio run can initiate 17,000 FUAs, which means that this loop
runs that many times.  100 iterations takes about 45 seconds, which is
about two hours.

Is it necessary to iterate the loop that many times to reproduce
whatever issue btrfs had?

--D

> +
> +	prev=$cur
> +	cur=$(_log_writes_find_next_fua $(($cur + 1)))
> +	[ -z "$cur" ] && break
> +done
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/757.out b/tests/generic/757.out
> new file mode 100644
> index 00000000..dfbc8094
> --- /dev/null
> +++ b/tests/generic/757.out
> @@ -0,0 +1,2 @@
> +QA output created by 757
> +Silence is golden
> -- 
> 2.44.2
> 
>
Zorro Lang Oct. 29, 2024, 5:11 a.m. UTC | #8
On Mon, Oct 28, 2024 at 02:57:28PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
> > Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
> > race could mean that csums weren't getting written to the log tree,
> > leading to corruption when it was replayed.
> > 
> > The patches to detect log this tree corruption are in btrfs-progs 6.11.
> > 
> > Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> > ---
> > This is a genericized version of the test I originally proposed as
> > btrfs/333.
> > 
> >  tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/757.out |  2 ++
> >  2 files changed, 73 insertions(+)
> >  create mode 100755 tests/generic/757
> >  create mode 100644 tests/generic/757.out
> > 
> > diff --git a/tests/generic/757 b/tests/generic/757
> > new file mode 100755
> > index 00000000..6ad3d01e
> > --- /dev/null
> > +++ b/tests/generic/757
> > @@ -0,0 +1,71 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# FS QA Test 757
> > +#
> > +# Test async dio with fsync to test a btrfs bug where a race meant that csums
> > +# weren't getting written to the log tree, causing corruptions on remount.
> > +# This can be seen on subpage FSes on Linux 6.4.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick metadata log recoveryloop
> > +
> > +_fixed_by_kernel_commit e917ff56c8e7 \
> > +	"btrfs: determine synchronous writers from bio or writeback control"
> > +
> > +fio_config=$tmp.fio
> > +
> > +. ./common/dmlogwrites
> > +
> > +_require_scratch
> > +_require_log_writes
> > +
> > +cat >$fio_config <<EOF
> > +[global]
> > +iodepth=128
> > +direct=1
> > +ioengine=libaio
> > +rw=randwrite
> > +runtime=1s
> > +[job0]
> > +rw=randwrite
> > +filename=$SCRATCH_MNT/file
> > +size=1g
> > +fdatasync=1
> > +EOF
> > +
> > +_require_fio $fio_config
> > +
> > +cat $fio_config >> $seqres.full
> > +
> > +_log_writes_init $SCRATCH_DEV
> > +_log_writes_mkfs >> $seqres.full 2>&1
> > +_log_writes_mark mkfs
> > +
> > +_log_writes_mount
> > +
> > +$FIO_PROG $fio_config > /dev/null 2>&1
> > +_log_writes_unmount
> > +
> > +_log_writes_remove
> > +
> > +prev=$(_log_writes_mark_to_entry_number mkfs)
> > +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
> > +cur=$(_log_writes_find_next_fua $prev)
> > +[ -z "$cur" ] && _fail "failed to locate next FUA write"
> > +
> > +while [ ! -z "$cur" ]; do
> > +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
> > +
> > +	_check_scratch_fs
> 
> This test fails on xfs because (afaict) replaying the log to $cur
> results in $SCRATCH_DEV being a filesystem with a dirty log; and
> xfs_repair fails when it is given a filesystem with a dirty log.
> 
> I then fixed the test to mount and unmount the filesystem to recovery
> the dirty log before invoking xfs_repair:
> 
> 	# xfs_repair won't run if the log is dirty
> 	if [ $FSTYP = "xfs" ]; then
> 		_scratch_mount
> 		_scratch_unmount
> 	fi

Thanks Darrick, you're right.
I'm wondering can we always do a mount&unmount at here, no matter the
$FSTYP, if that doesn't affect the testing of other filesystems?

> 	_check_scratch_fs
> 
> But now the test takes a very long time to run because (on my system
> anyway) the fio run can initiate 17,000 FUAs, which means that this loop
> runs that many times.  100 iterations takes about 45 seconds, which is
> about two hours.
> 
> Is it necessary to iterate the loop that many times to reproduce
> whatever issue btrfs had?

Yes, it takes much long time on my side too:
 FSTYP         -- ext4
 PLATFORM      -- Linux/x86_64 dell-per750-47 6.12.0-rc4+ #1 SMP PREEMPT_DYNAMIC Fri Oct 25 14:25:45 EDT 2024
 MKFS_OPTIONS  -- -F /dev/sda4
 MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch

 generic/757        4247s
 Ran: generic/757
 Passed all 1 tests

So better to reduce the testing time as much as possible, and remove it
from the "quick" group. (Maybe we can have a tag to mark those cases need
much long time too).

This patch has been merged into for-next branch, as:

  cf97fa373 generic: add test for missing btrfs csums in log when doing async on subpage vol

Please send another (or other two) patch to fix above 2 problems.

Thanks,
Zorro

> 
> --D
> 
> > +
> > +	prev=$cur
> > +	cur=$(_log_writes_find_next_fua $(($cur + 1)))
> > +	[ -z "$cur" ] && break
> > +done
> > +
> > +echo "Silence is golden"
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/757.out b/tests/generic/757.out
> > new file mode 100644
> > index 00000000..dfbc8094
> > --- /dev/null
> > +++ b/tests/generic/757.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 757
> > +Silence is golden
> > -- 
> > 2.44.2
> > 
> > 
>
Mark Harmstone Oct. 29, 2024, 10:03 a.m. UTC | #9
On 29/10/24 05:11, Zorro Lang wrote:
> On Mon, Oct 28, 2024 at 02:57:28PM -0700, Darrick J. Wong wrote:
>> On Tue, Oct 15, 2024 at 04:39:34PM +0100, Mark Harmstone wrote:
>>> Adds a test for a bug we encountered on Linux 6.4 on aarch64, where a
>>> race could mean that csums weren't getting written to the log tree,
>>> leading to corruption when it was replayed.
>>>
>>> The patches to detect log this tree corruption are in btrfs-progs 6.11.
>>>
>>> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
>>> ---
>>> This is a genericized version of the test I originally proposed as
>>> btrfs/333.
>>>
>>>   tests/generic/757     | 71 +++++++++++++++++++++++++++++++++++++++++++
>>>   tests/generic/757.out |  2 ++
>>>   2 files changed, 73 insertions(+)
>>>   create mode 100755 tests/generic/757
>>>   create mode 100644 tests/generic/757.out
>>>
>>> diff --git a/tests/generic/757 b/tests/generic/757
>>> new file mode 100755
>>> index 00000000..6ad3d01e
>>> --- /dev/null
>>> +++ b/tests/generic/757
>>> @@ -0,0 +1,71 @@
>>> +#! /bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +#
>>> +# FS QA Test 757
>>> +#
>>> +# Test async dio with fsync to test a btrfs bug where a race meant that csums
>>> +# weren't getting written to the log tree, causing corruptions on remount.
>>> +# This can be seen on subpage FSes on Linux 6.4.
>>> +#
>>> +. ./common/preamble
>>> +_begin_fstest auto quick metadata log recoveryloop
>>> +
>>> +_fixed_by_kernel_commit e917ff56c8e7 \
>>> +	"btrfs: determine synchronous writers from bio or writeback control"
>>> +
>>> +fio_config=$tmp.fio
>>> +
>>> +. ./common/dmlogwrites
>>> +
>>> +_require_scratch
>>> +_require_log_writes
>>> +
>>> +cat >$fio_config <<EOF
>>> +[global]
>>> +iodepth=128
>>> +direct=1
>>> +ioengine=libaio
>>> +rw=randwrite
>>> +runtime=1s
>>> +[job0]
>>> +rw=randwrite
>>> +filename=$SCRATCH_MNT/file
>>> +size=1g
>>> +fdatasync=1
>>> +EOF
>>> +
>>> +_require_fio $fio_config
>>> +
>>> +cat $fio_config >> $seqres.full
>>> +
>>> +_log_writes_init $SCRATCH_DEV
>>> +_log_writes_mkfs >> $seqres.full 2>&1
>>> +_log_writes_mark mkfs
>>> +
>>> +_log_writes_mount
>>> +
>>> +$FIO_PROG $fio_config > /dev/null 2>&1
>>> +_log_writes_unmount
>>> +
>>> +_log_writes_remove
>>> +
>>> +prev=$(_log_writes_mark_to_entry_number mkfs)
>>> +[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
>>> +cur=$(_log_writes_find_next_fua $prev)
>>> +[ -z "$cur" ] && _fail "failed to locate next FUA write"
>>> +
>>> +while [ ! -z "$cur" ]; do
>>> +	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
>>> +
>>> +	_check_scratch_fs
>>
>> This test fails on xfs because (afaict) replaying the log to $cur
>> results in $SCRATCH_DEV being a filesystem with a dirty log; and
>> xfs_repair fails when it is given a filesystem with a dirty log.
>>
>> I then fixed the test to mount and unmount the filesystem to recovery
>> the dirty log before invoking xfs_repair:
>>
>> 	# xfs_repair won't run if the log is dirty
>> 	if [ $FSTYP = "xfs" ]; then
>> 		_scratch_mount
>> 		_scratch_unmount
>> 	fi
> 
> Thanks Darrick, you're right.
> I'm wondering can we always do a mount&unmount at here, no matter the
> $FSTYP, if that doesn't affect the testing of other filesystems?
> 
>> 	_check_scratch_fs
>>
>> But now the test takes a very long time to run because (on my system
>> anyway) the fio run can initiate 17,000 FUAs, which means that this loop
>> runs that many times.  100 iterations takes about 45 seconds, which is
>> about two hours.
>>
>> Is it necessary to iterate the loop that many times to reproduce
>> whatever issue btrfs had?
> 
> Yes, it takes much long time on my side too:
>   FSTYP         -- ext4
>   PLATFORM      -- Linux/x86_64 dell-per750-47 6.12.0-rc4+ #1 SMP PREEMPT_DYNAMIC Fri Oct 25 14:25:45 EDT 2024
>   MKFS_OPTIONS  -- -F /dev/sda4
>   MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/xfstests/scratch
> 
>   generic/757        4247s
>   Ran: generic/757
>   Passed all 1 tests
> 
> So better to reduce the testing time as much as possible, and remove it
> from the "quick" group. (Maybe we can have a tag to mark those cases need
> much long time too).

Or maybe it should be made btrfs-specific again? That's how I originally 
wrote it, but Filipe Manana thought it ought to be genericized.

An advantage of making it FS-specific is that you can use the --fsck 
option in log-writes, which is about 10 times quicker than doing the 
loop in bash.

Mark
diff mbox series

Patch

diff --git a/tests/generic/757 b/tests/generic/757
new file mode 100755
index 00000000..6ad3d01e
--- /dev/null
+++ b/tests/generic/757
@@ -0,0 +1,71 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# FS QA Test 757
+#
+# Test async dio with fsync to test a btrfs bug where a race meant that csums
+# weren't getting written to the log tree, causing corruptions on remount.
+# This can be seen on subpage FSes on Linux 6.4.
+#
+. ./common/preamble
+_begin_fstest auto quick metadata log recoveryloop
+
+_fixed_by_kernel_commit e917ff56c8e7 \
+	"btrfs: determine synchronous writers from bio or writeback control"
+
+fio_config=$tmp.fio
+
+. ./common/dmlogwrites
+
+_require_scratch
+_require_log_writes
+
+cat >$fio_config <<EOF
+[global]
+iodepth=128
+direct=1
+ioengine=libaio
+rw=randwrite
+runtime=1s
+[job0]
+rw=randwrite
+filename=$SCRATCH_MNT/file
+size=1g
+fdatasync=1
+EOF
+
+_require_fio $fio_config
+
+cat $fio_config >> $seqres.full
+
+_log_writes_init $SCRATCH_DEV
+_log_writes_mkfs >> $seqres.full 2>&1
+_log_writes_mark mkfs
+
+_log_writes_mount
+
+$FIO_PROG $fio_config > /dev/null 2>&1
+_log_writes_unmount
+
+_log_writes_remove
+
+prev=$(_log_writes_mark_to_entry_number mkfs)
+[ -z "$prev" ] && _fail "failed to locate entry mark 'mkfs'"
+cur=$(_log_writes_find_next_fua $prev)
+[ -z "$cur" ] && _fail "failed to locate next FUA write"
+
+while [ ! -z "$cur" ]; do
+	_log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqres.full
+
+	_check_scratch_fs
+
+	prev=$cur
+	cur=$(_log_writes_find_next_fua $(($cur + 1)))
+	[ -z "$cur" ] && break
+done
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/757.out b/tests/generic/757.out
new file mode 100644
index 00000000..dfbc8094
--- /dev/null
+++ b/tests/generic/757.out
@@ -0,0 +1,2 @@ 
+QA output created by 757
+Silence is golden