Message ID | 20220127054543.28964-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: add test case to make sure autodefrag won't give up the whole cluster when there is a hole in it | expand |
On Thu, Jan 27, 2022 at 01:45:43PM +0800, Qu Wenruo wrote: > In v5.11~v5.15 kernels, there is a regression in autodefrag that if a > cluster (up to 256K in size) has even a single hole, the whole cluster > will be rejected. > > This will greatly reduce the efficiency of autodefrag. > > The behavior is fixed in v5.16 by a full rework, although the rework > itself has other problems, it at least solves this particular > regression. > > Here we add a test case to reproduce the case, where we have a 128K > cluster, the first half is fragmented extents which can be defragged. > The second half is hole. > > Make sure autodefrag can defrag the 64K part. > > This test needs extra debug feature, which is titled: > > [RFC] btrfs: sysfs: introduce <uuid>/debug/cleaner_trigger > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > common/btrfs | 11 +++++ > tests/btrfs/256 | 112 ++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/256.out | 2 + > 3 files changed, 125 insertions(+) > create mode 100755 tests/btrfs/256 > create mode 100644 tests/btrfs/256.out > > diff --git a/common/btrfs b/common/btrfs > index 5de926dd..4e6842d9 100644 > --- a/common/btrfs > +++ b/common/btrfs > @@ -496,3 +496,14 @@ _require_btrfs_support_sectorsize() > grep -wq $sectorsize /sys/fs/btrfs/features/supported_sectorsizes || \ > _notrun "sectorsize $sectorsize is not supported" > } > + > +# Require trigger for cleaner kthread > +_require_btrfs_debug_cleaner_trigger() > +{ > + local fsid > + > + fsid=$($BTRFS_UTIL_PROG filesystem show $TEST_DIR | grep uuid: |\ > + $AWK_PROG '{print $NF}') > + test -f /sys/fs/btrfs/$fsid/debug/cleaner_trigger ||\ > + _notrun "no cleaner kthread trigger" > +} > diff --git a/tests/btrfs/256 b/tests/btrfs/256 > new file mode 100755 > index 00000000..86e6739e > --- /dev/null > +++ b/tests/btrfs/256 > @@ -0,0 +1,112 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 256 > +# > +# Make sure btrfs auto defrag can properly defrag clusters which has hole > +# in the middle > +# > +. ./common/preamble > +_begin_fstest auto defrag quick > + > +. ./common/btrfs > +. ./common/filter > + > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs generic > +_require_scratch > + > +get_extent_disk_sector() > +{ > + local file=$1 > + local offset=$2 > + > + $XFS_IO_PROG -c "fiemap $offset" "$file" | _filter_xfs_io_fiemap |\ > + head -n1 | $AWK_PROG '{print $3}' > +} > + > +# Needs 4K sectorsize, as larger sectorsize can change the file layout. > +_require_btrfs_support_sectorsize 4096 > + > +# We need a way to trigger autodefrag > +_require_btrfs_debug_cleaner_trigger In order to trigger the cleaner, we don't need another special purpose RFC debug patch. Just mount the fs with "-o commit=1", and then leave the "sleep 3" as it is. We do this in other tests that expect the cleaner thread to do something. Every time the transaction kthread wakes up, it will wake up the cleaner kthread, even if it doesn't have any transaction to commit. > + > +_scratch_mkfs >> $seqres.full > + > +# Need datacow to show which range is defragged, and we're testing > +# autodefrag > +_scratch_mount -o datacow,autodefrag > + > +fsid=$($BTRFS_UTIL_PROG filesystem show $SCRATCH_MNT |grep uuid: |\ > + $AWK_PROG '{print $NF}') > + > +# Create a layout where we have fragmented extents at [0, 64k) (sync write in > +# reserve order), then a hole at [64k, 128k) > +$XFS_IO_PROG -f -c "pwrite 48k 16k" -c sync \ > + -c "pwrite 32k 16k" -c sync \ > + -c "pwrite 16k 16k" -c sync \ > + -c "pwrite 0 16k" $SCRATCH_MNT/foobar >> $seqres.full Instead of adding a bunch of "-c sync", you can pass -s to xfs_io: xfs_io -f -s -c "pwrite ..." -c "pwrite ..." ... It does exactly the same. > +truncate -s 128k $SCRATCH_MNT/foobar > +sync > + > +old_csum=$(_md5_checksum $SCRATCH_MNT/foobar) > +echo "=== File extent layout before autodefrag ===" >> $seqres.full > +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full > +echo "old md5=$old_csum" >> $seqres.full > + > +old_regular=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 0) > +old_hole=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 64k) > + > +# For hole only xfs_io fiemap, there will be no output at all. You can get around that by not using _filter_xfs_io_fiemap at get_extent_disk_sector. > +# Re-fill it to "hole" for later comparison > +if [ ! -z $old_hole ]; then > + echo "hole not at 128k" 128K -> 64K > +else > + old_hole="hole" > +fi > + > +# Now trigger autodefrag > +echo 0 > /sys/fs/btrfs/$fsid/debug/cleaner_trigger > + > +# No good way to wait for autodefrag to finish yet > +sleep 3 > +sync > + > +new_csum=$(_md5_checksum $SCRATCH_MNT/foobar) > +new_regular=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 0) > +new_hole=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 64k) > + > +echo "=== File extent layout after autodefrag ===" >> $seqres.full > +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full > +echo "new md5=$new_csum" >> $seqres.full > + > +if [ ! -z $new_hole ]; then > + echo "hole not at 128k" 64K > +else > + new_hole="hole" > +fi > + > +# In v5.11~v5.15 kernels, regular extents won't get defragged, and would trigger > +# the following output > +if [ $new_regular == $old_regular ]; then > + echo "regular extents didn't get defragged" > +fi > + > +# In v5.10 and earlier kernel, autodefrag may choose to defrag holes, > +# which should be avoided. > +if [ "$new_hole" != "$old_hole" ]; then > + echo "hole extents got defragged" > +fi > + > +# Defrag should not change file content > +if [ "$new_csum" != "$old_csum" ]; then > + echo "file content changed" > +fi > + > +echo "Silence is golden" > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/256.out b/tests/btrfs/256.out > new file mode 100644 > index 00000000..7ee8e2e5 > --- /dev/null > +++ b/tests/btrfs/256.out > @@ -0,0 +1,2 @@ > +QA output created by 256 > +Silence is golden > -- > 2.34.1 >
On 2022/1/27 19:23, Filipe Manana wrote: > On Thu, Jan 27, 2022 at 01:45:43PM +0800, Qu Wenruo wrote: >> In v5.11~v5.15 kernels, there is a regression in autodefrag that if a >> cluster (up to 256K in size) has even a single hole, the whole cluster >> will be rejected. >> >> This will greatly reduce the efficiency of autodefrag. >> >> The behavior is fixed in v5.16 by a full rework, although the rework >> itself has other problems, it at least solves this particular >> regression. >> >> Here we add a test case to reproduce the case, where we have a 128K >> cluster, the first half is fragmented extents which can be defragged. >> The second half is hole. >> >> Make sure autodefrag can defrag the 64K part. >> >> This test needs extra debug feature, which is titled: >> >> [RFC] btrfs: sysfs: introduce <uuid>/debug/cleaner_trigger >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> common/btrfs | 11 +++++ >> tests/btrfs/256 | 112 ++++++++++++++++++++++++++++++++++++++++++++ >> tests/btrfs/256.out | 2 + >> 3 files changed, 125 insertions(+) >> create mode 100755 tests/btrfs/256 >> create mode 100644 tests/btrfs/256.out >> >> diff --git a/common/btrfs b/common/btrfs >> index 5de926dd..4e6842d9 100644 >> --- a/common/btrfs >> +++ b/common/btrfs >> @@ -496,3 +496,14 @@ _require_btrfs_support_sectorsize() >> grep -wq $sectorsize /sys/fs/btrfs/features/supported_sectorsizes || \ >> _notrun "sectorsize $sectorsize is not supported" >> } >> + >> +# Require trigger for cleaner kthread >> +_require_btrfs_debug_cleaner_trigger() >> +{ >> + local fsid >> + >> + fsid=$($BTRFS_UTIL_PROG filesystem show $TEST_DIR | grep uuid: |\ >> + $AWK_PROG '{print $NF}') >> + test -f /sys/fs/btrfs/$fsid/debug/cleaner_trigger ||\ >> + _notrun "no cleaner kthread trigger" >> +} >> diff --git a/tests/btrfs/256 b/tests/btrfs/256 >> new file mode 100755 >> index 00000000..86e6739e >> --- /dev/null >> +++ b/tests/btrfs/256 >> @@ -0,0 +1,112 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved. >> +# >> +# FS QA Test 256 >> +# >> +# Make sure btrfs auto defrag can properly defrag clusters which has hole >> +# in the middle >> +# >> +. ./common/preamble >> +_begin_fstest auto defrag quick >> + >> +. ./common/btrfs >> +. ./common/filter >> + >> +# real QA test starts here >> + >> +# Modify as appropriate. >> +_supported_fs generic >> +_require_scratch >> + >> +get_extent_disk_sector() >> +{ >> + local file=$1 >> + local offset=$2 >> + >> + $XFS_IO_PROG -c "fiemap $offset" "$file" | _filter_xfs_io_fiemap |\ >> + head -n1 | $AWK_PROG '{print $3}' >> +} >> + >> +# Needs 4K sectorsize, as larger sectorsize can change the file layout. >> +_require_btrfs_support_sectorsize 4096 >> + >> +# We need a way to trigger autodefrag >> +_require_btrfs_debug_cleaner_trigger > > In order to trigger the cleaner, we don't need another special purpose > RFC debug patch. > > Just mount the fs with "-o commit=1", and then leave the "sleep 3" as it > is. We do this in other tests that expect the cleaner thread to do > something. Every time the transaction kthread wakes up, it will wake up > the cleaner kthread, even if it doesn't have any transaction to commit. Right! That's way better than the RFC patch. > >> + >> +_scratch_mkfs >> $seqres.full >> + >> +# Need datacow to show which range is defragged, and we're testing >> +# autodefrag >> +_scratch_mount -o datacow,autodefrag >> + >> +fsid=$($BTRFS_UTIL_PROG filesystem show $SCRATCH_MNT |grep uuid: |\ >> + $AWK_PROG '{print $NF}') >> + >> +# Create a layout where we have fragmented extents at [0, 64k) (sync write in >> +# reserve order), then a hole at [64k, 128k) >> +$XFS_IO_PROG -f -c "pwrite 48k 16k" -c sync \ >> + -c "pwrite 32k 16k" -c sync \ >> + -c "pwrite 16k 16k" -c sync \ >> + -c "pwrite 0 16k" $SCRATCH_MNT/foobar >> $seqres.full > > Instead of adding a bunch of "-c sync", you can pass -s to xfs_io: > > xfs_io -f -s -c "pwrite ..." -c "pwrite ..." ... > > It does exactly the same. > >> +truncate -s 128k $SCRATCH_MNT/foobar >> +sync >> + >> +old_csum=$(_md5_checksum $SCRATCH_MNT/foobar) >> +echo "=== File extent layout before autodefrag ===" >> $seqres.full >> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full >> +echo "old md5=$old_csum" >> $seqres.full >> + >> +old_regular=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 0) >> +old_hole=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 64k) >> + >> +# For hole only xfs_io fiemap, there will be no output at all. > > You can get around that by not using _filter_xfs_io_fiemap at > get_extent_disk_sector. Unfortunately I have tried and failed. The problem is really xfs_io fiemap -v will just output nothing if there is only hole. $ sudo xfs_io -c "fiemap -v" /mnt/btrfs/file2 /mnt/btrfs/file2: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..31]: 26688..26719 32 0x1 1: [32..63]: hole 32 $ sudo xfs_io -c "fiemap -v 16k" /mnt/btrfs/file2 /mnt/btrfs/file2: Maybe it would be something to fix in the future? Anyway, I'll integrate the special hole handling into the helper. Thanks, Qu > >> +# Re-fill it to "hole" for later comparison >> +if [ ! -z $old_hole ]; then >> + echo "hole not at 128k" > > 128K -> 64K > >> +else >> + old_hole="hole" >> +fi >> + >> +# Now trigger autodefrag >> +echo 0 > /sys/fs/btrfs/$fsid/debug/cleaner_trigger >> + >> +# No good way to wait for autodefrag to finish yet >> +sleep 3 >> +sync >> + >> +new_csum=$(_md5_checksum $SCRATCH_MNT/foobar) >> +new_regular=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 0) >> +new_hole=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 64k) >> + >> +echo "=== File extent layout after autodefrag ===" >> $seqres.full >> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full >> +echo "new md5=$new_csum" >> $seqres.full >> + >> +if [ ! -z $new_hole ]; then >> + echo "hole not at 128k" > > 64K > >> +else >> + new_hole="hole" >> +fi >> + >> +# In v5.11~v5.15 kernels, regular extents won't get defragged, and would trigger >> +# the following output >> +if [ $new_regular == $old_regular ]; then >> + echo "regular extents didn't get defragged" >> +fi >> + >> +# In v5.10 and earlier kernel, autodefrag may choose to defrag holes, >> +# which should be avoided. >> +if [ "$new_hole" != "$old_hole" ]; then >> + echo "hole extents got defragged" >> +fi >> + >> +# Defrag should not change file content >> +if [ "$new_csum" != "$old_csum" ]; then >> + echo "file content changed" >> +fi >> + >> +echo "Silence is golden" >> +# success, all done >> +status=0 >> +exit >> diff --git a/tests/btrfs/256.out b/tests/btrfs/256.out >> new file mode 100644 >> index 00000000..7ee8e2e5 >> --- /dev/null >> +++ b/tests/btrfs/256.out >> @@ -0,0 +1,2 @@ >> +QA output created by 256 >> +Silence is golden >> -- >> 2.34.1 >>
On Fri, Jan 28, 2022 at 07:32:53AM +0800, Qu Wenruo wrote: > On 2022/1/27 19:23, Filipe Manana wrote: > > On Thu, Jan 27, 2022 at 01:45:43PM +0800, Qu Wenruo wrote: > >> +get_extent_disk_sector() > >> +{ > >> + local file=$1 > >> + local offset=$2 > >> + > >> + $XFS_IO_PROG -c "fiemap $offset" "$file" | _filter_xfs_io_fiemap |\ > >> + head -n1 | $AWK_PROG '{print $3}' > >> +} > >> + > >> +# Needs 4K sectorsize, as larger sectorsize can change the file layout. > >> +_require_btrfs_support_sectorsize 4096 > >> + > >> +# We need a way to trigger autodefrag > >> +_require_btrfs_debug_cleaner_trigger > > > > In order to trigger the cleaner, we don't need another special purpose > > RFC debug patch. > > > > Just mount the fs with "-o commit=1", and then leave the "sleep 3" as it > > is. We do this in other tests that expect the cleaner thread to do > > something. Every time the transaction kthread wakes up, it will wake up > > the cleaner kthread, even if it doesn't have any transaction to commit. > > Right! That's way better than the RFC patch. Also the BTRFS_IOC_SYNC ioctl (available as "btrfs filesystem sync") will wake the transaction kthread.
diff --git a/common/btrfs b/common/btrfs index 5de926dd..4e6842d9 100644 --- a/common/btrfs +++ b/common/btrfs @@ -496,3 +496,14 @@ _require_btrfs_support_sectorsize() grep -wq $sectorsize /sys/fs/btrfs/features/supported_sectorsizes || \ _notrun "sectorsize $sectorsize is not supported" } + +# Require trigger for cleaner kthread +_require_btrfs_debug_cleaner_trigger() +{ + local fsid + + fsid=$($BTRFS_UTIL_PROG filesystem show $TEST_DIR | grep uuid: |\ + $AWK_PROG '{print $NF}') + test -f /sys/fs/btrfs/$fsid/debug/cleaner_trigger ||\ + _notrun "no cleaner kthread trigger" +} diff --git a/tests/btrfs/256 b/tests/btrfs/256 new file mode 100755 index 00000000..86e6739e --- /dev/null +++ b/tests/btrfs/256 @@ -0,0 +1,112 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved. +# +# FS QA Test 256 +# +# Make sure btrfs auto defrag can properly defrag clusters which has hole +# in the middle +# +. ./common/preamble +_begin_fstest auto defrag quick + +. ./common/btrfs +. ./common/filter + +# real QA test starts here + +# Modify as appropriate. +_supported_fs generic +_require_scratch + +get_extent_disk_sector() +{ + local file=$1 + local offset=$2 + + $XFS_IO_PROG -c "fiemap $offset" "$file" | _filter_xfs_io_fiemap |\ + head -n1 | $AWK_PROG '{print $3}' +} + +# Needs 4K sectorsize, as larger sectorsize can change the file layout. +_require_btrfs_support_sectorsize 4096 + +# We need a way to trigger autodefrag +_require_btrfs_debug_cleaner_trigger + +_scratch_mkfs >> $seqres.full + +# Need datacow to show which range is defragged, and we're testing +# autodefrag +_scratch_mount -o datacow,autodefrag + +fsid=$($BTRFS_UTIL_PROG filesystem show $SCRATCH_MNT |grep uuid: |\ + $AWK_PROG '{print $NF}') + +# Create a layout where we have fragmented extents at [0, 64k) (sync write in +# reserve order), then a hole at [64k, 128k) +$XFS_IO_PROG -f -c "pwrite 48k 16k" -c sync \ + -c "pwrite 32k 16k" -c sync \ + -c "pwrite 16k 16k" -c sync \ + -c "pwrite 0 16k" $SCRATCH_MNT/foobar >> $seqres.full +truncate -s 128k $SCRATCH_MNT/foobar +sync + +old_csum=$(_md5_checksum $SCRATCH_MNT/foobar) +echo "=== File extent layout before autodefrag ===" >> $seqres.full +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full +echo "old md5=$old_csum" >> $seqres.full + +old_regular=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 0) +old_hole=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 64k) + +# For hole only xfs_io fiemap, there will be no output at all. +# Re-fill it to "hole" for later comparison +if [ ! -z $old_hole ]; then + echo "hole not at 128k" +else + old_hole="hole" +fi + +# Now trigger autodefrag +echo 0 > /sys/fs/btrfs/$fsid/debug/cleaner_trigger + +# No good way to wait for autodefrag to finish yet +sleep 3 +sync + +new_csum=$(_md5_checksum $SCRATCH_MNT/foobar) +new_regular=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 0) +new_hole=$(get_extent_disk_sector "$SCRATCH_MNT/foobar" 64k) + +echo "=== File extent layout after autodefrag ===" >> $seqres.full +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full +echo "new md5=$new_csum" >> $seqres.full + +if [ ! -z $new_hole ]; then + echo "hole not at 128k" +else + new_hole="hole" +fi + +# In v5.11~v5.15 kernels, regular extents won't get defragged, and would trigger +# the following output +if [ $new_regular == $old_regular ]; then + echo "regular extents didn't get defragged" +fi + +# In v5.10 and earlier kernel, autodefrag may choose to defrag holes, +# which should be avoided. +if [ "$new_hole" != "$old_hole" ]; then + echo "hole extents got defragged" +fi + +# Defrag should not change file content +if [ "$new_csum" != "$old_csum" ]; then + echo "file content changed" +fi + +echo "Silence is golden" +# success, all done +status=0 +exit diff --git a/tests/btrfs/256.out b/tests/btrfs/256.out new file mode 100644 index 00000000..7ee8e2e5 --- /dev/null +++ b/tests/btrfs/256.out @@ -0,0 +1,2 @@ +QA output created by 256 +Silence is golden
In v5.11~v5.15 kernels, there is a regression in autodefrag that if a cluster (up to 256K in size) has even a single hole, the whole cluster will be rejected. This will greatly reduce the efficiency of autodefrag. The behavior is fixed in v5.16 by a full rework, although the rework itself has other problems, it at least solves this particular regression. Here we add a test case to reproduce the case, where we have a 128K cluster, the first half is fragmented extents which can be defragged. The second half is hole. Make sure autodefrag can defrag the 64K part. This test needs extra debug feature, which is titled: [RFC] btrfs: sysfs: introduce <uuid>/debug/cleaner_trigger Signed-off-by: Qu Wenruo <wqu@suse.com> --- common/btrfs | 11 +++++ tests/btrfs/256 | 112 ++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/256.out | 2 + 3 files changed, 125 insertions(+) create mode 100755 tests/btrfs/256 create mode 100644 tests/btrfs/256.out