Message ID | 20230223051049.30086-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: add test case for NODATASUM dev-replace | expand |
On Thu, Feb 23, 2023 at 01:10:49PM +0800, Qu Wenruo wrote: > During my development on dev-replace, I made a mistake in RAID56 > dev-replace code where it can lead to NODATASUM corruption. > Thankfully such corruption didn't reach upstream. > > Inspired by such incident, here comes a new test case for btrfs > dev-replace, the new case would: > > - Populate the filesystem with nodatasum mount option > > - Run fssum to record the contents > Since the test case only cares about data contents, here we don't > include metadata like uid/gid/timestamp. > > - Wipe one device > > - Mount the fs with the missing device > > - Verify the contents is still correct > > - Replace the missing device > > - Verify the contents is still correct again > Before the verification, drop all cache to make sure the 2nd > verification is reading from the disks. > > For current kernels, the test case should pass as expected. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- There's not review points from btrfs list, and this case looks good to me, and I didn't find any problems on this case by a simple test [1]. So I'd like to merge this btrfs specific case directly. Reviewed-by: Zorro Lang <zlang@redhat.com> [1] # ./check -s pooldev btrfs/286 SECTION -- pooldev FSTYP -- btrfs PLATFORM -- Linux/x86_64 hp-dl380pg8-01 6.2.0-rc8-mainline+ #6 SMP PREEMPT_DYNAMIC Wed Feb 15 14:16:45 CST 2023 MKFS_OPTIONS -- /dev/mapper/testvg-scratchdev1 MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/mapper/testvg-scratchdev1 /mnt/scratch btrfs/286 41s Ran: btrfs/286 Passed all 1 tests SECTION -- pooldev ========================= Ran: btrfs/286 Passed all 1 tests > tests/btrfs/286 | 78 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/286.out | 2 ++ > 2 files changed, 80 insertions(+) > create mode 100755 tests/btrfs/286 > create mode 100644 tests/btrfs/286.out > > diff --git a/tests/btrfs/286 b/tests/btrfs/286 > new file mode 100755 > index 00000000..fb805256 > --- /dev/null > +++ b/tests/btrfs/286 > @@ -0,0 +1,78 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 286 > +# > +# Make sure btrfs dev-replace on missing device won't cause data corruption > +# for NODATASUM data. > +# > +. ./common/preamble > +_begin_fstest auto replace > + > +# Import common functions. > +. ./common/filter > + > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs btrfs > +_require_command "$WIPEFS_PROG" wipefs > +_btrfs_get_profile_configs replace-missing > +_require_fssum > +_require_scratch_dev_pool 5 > +_scratch_dev_pool_get 4 > +_spare_dev_get > + > +workload() > +{ > + local profile=$1 > + local victim="$(echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}')" > + > + echo "=== Profile: $profile ===" >> $seqres.full > + rm -f $tmp.fssum > + _scratch_pool_mkfs "$profile" >> $seqres.full 2>&1 > + > + # Use nodatasum mount option, so all data won't have checksum. > + _scratch_mount -o nodatasum > + > + $FSSTRESS_PROG -p 10 -n 200 -d $SCRATCH_MNT > /dev/null 2>&1 > + sync > + > + # Generate fssum for later verification, here we only care > + # about the file contents, thus we don't bother metadata at all. > + $FSSUM_PROG -n -d -f -w $tmp.fssum $SCRATCH_MNT > + _scratch_unmount > + > + # Wipe devid 2 > + $WIPEFS_PROG -a $victim >> $seqres.full 2>&1 > + > + # Mount the fs with the victim device missing > + _scratch_mount -o degraded,nodatasum > + > + # Verify no data corruption first. > + echo "=== Verify the contents before replace ===" >> $seqres.full > + $FSSUM_PROG -r $tmp.fssum $SCRATCH_MNT >> $seqres.full 2>&1 > + > + # Replace the missing device > + $BTRFS_UTIL_PROG replace start -Bf 2 $SPARE_DEV $SCRATCH_MNT >> $seqres.full > + > + # Drop all cache to make sure later read are all from the disks > + echo 3 > /proc/sys/vm/drop_caches > + > + # Re-check the file contents > + echo "=== Verify the contents after replace ===" >> $seqres.full > + $FSSUM_PROG -r $tmp.fssum $SCRATCH_MNT >> $seqres.full 2>&1 > + > + _scratch_unmount > +} > + > +for t in "${_btrfs_profile_configs[@]}"; do > + workload "$t" > +done > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/286.out b/tests/btrfs/286.out > new file mode 100644 > index 00000000..35c48006 > --- /dev/null > +++ b/tests/btrfs/286.out > @@ -0,0 +1,2 @@ > +QA output created by 286 > +Silence is golden > -- > 2.39.0 >
diff --git a/tests/btrfs/286 b/tests/btrfs/286 new file mode 100755 index 00000000..fb805256 --- /dev/null +++ b/tests/btrfs/286 @@ -0,0 +1,78 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2023 SUSE Linux Products GmbH. All Rights Reserved. +# +# FS QA Test 286 +# +# Make sure btrfs dev-replace on missing device won't cause data corruption +# for NODATASUM data. +# +. ./common/preamble +_begin_fstest auto replace + +# Import common functions. +. ./common/filter + +# real QA test starts here + +# Modify as appropriate. +_supported_fs btrfs +_require_command "$WIPEFS_PROG" wipefs +_btrfs_get_profile_configs replace-missing +_require_fssum +_require_scratch_dev_pool 5 +_scratch_dev_pool_get 4 +_spare_dev_get + +workload() +{ + local profile=$1 + local victim="$(echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}')" + + echo "=== Profile: $profile ===" >> $seqres.full + rm -f $tmp.fssum + _scratch_pool_mkfs "$profile" >> $seqres.full 2>&1 + + # Use nodatasum mount option, so all data won't have checksum. + _scratch_mount -o nodatasum + + $FSSTRESS_PROG -p 10 -n 200 -d $SCRATCH_MNT > /dev/null 2>&1 + sync + + # Generate fssum for later verification, here we only care + # about the file contents, thus we don't bother metadata at all. + $FSSUM_PROG -n -d -f -w $tmp.fssum $SCRATCH_MNT + _scratch_unmount + + # Wipe devid 2 + $WIPEFS_PROG -a $victim >> $seqres.full 2>&1 + + # Mount the fs with the victim device missing + _scratch_mount -o degraded,nodatasum + + # Verify no data corruption first. + echo "=== Verify the contents before replace ===" >> $seqres.full + $FSSUM_PROG -r $tmp.fssum $SCRATCH_MNT >> $seqres.full 2>&1 + + # Replace the missing device + $BTRFS_UTIL_PROG replace start -Bf 2 $SPARE_DEV $SCRATCH_MNT >> $seqres.full + + # Drop all cache to make sure later read are all from the disks + echo 3 > /proc/sys/vm/drop_caches + + # Re-check the file contents + echo "=== Verify the contents after replace ===" >> $seqres.full + $FSSUM_PROG -r $tmp.fssum $SCRATCH_MNT >> $seqres.full 2>&1 + + _scratch_unmount +} + +for t in "${_btrfs_profile_configs[@]}"; do + workload "$t" +done + +echo "Silence is golden" + +# success, all done +status=0 +exit diff --git a/tests/btrfs/286.out b/tests/btrfs/286.out new file mode 100644 index 00000000..35c48006 --- /dev/null +++ b/tests/btrfs/286.out @@ -0,0 +1,2 @@ +QA output created by 286 +Silence is golden
During my development on dev-replace, I made a mistake in RAID56 dev-replace code where it can lead to NODATASUM corruption. Thankfully such corruption didn't reach upstream. Inspired by such incident, here comes a new test case for btrfs dev-replace, the new case would: - Populate the filesystem with nodatasum mount option - Run fssum to record the contents Since the test case only cares about data contents, here we don't include metadata like uid/gid/timestamp. - Wipe one device - Mount the fs with the missing device - Verify the contents is still correct - Replace the missing device - Verify the contents is still correct again Before the verification, drop all cache to make sure the 2nd verification is reading from the disks. For current kernels, the test case should pass as expected. Signed-off-by: Qu Wenruo <wqu@suse.com> --- tests/btrfs/286 | 78 +++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/286.out | 2 ++ 2 files changed, 80 insertions(+) create mode 100755 tests/btrfs/286 create mode 100644 tests/btrfs/286.out