diff mbox series

btrfs: make sure autodefrag won't defrag ranges which don't contribute to fragmentations

Message ID 20220315023417.25129-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: make sure autodefrag won't defrag ranges which don't contribute to fragmentations | expand

Commit Message

Qu Wenruo March 15, 2022, 2:34 a.m. UTC
There is a report that btrfs autodefrag is defragging extents which only
has one single sector.

Such defragging will not reduce the number of extents, but only waste
IO.

The fix for it is titled:

  btrfs: avoid defragging extents whose next extents are not targets

Here we add a test case, which will create an inode with the following
layout:

  0                16K   20K                  64K
  |<-- Extent A -->|<-B->|<----- Extent C --->|
  |Gen: 7          |Gen 9|Gen 7               |

And we trigger autodefrag with newer_than = 8, which means it will only
defrag extents newer than or equal to geneartion 8.

Currently only Extent B meets the condition, but it can not be merged
with Extent A nor Extent C, as they don't meet the geneartion
requirement.

Unpatched kernel will defrag only Extent B, resulting no change in
fragmentation, while cost extra IO.

Patched kernel will not defrag anything.

Although this is still not the ideal case, as we can defrag the whole
64K range, but that's not what autodefrag can do with its geneartion
limitation.

And such "perfect" defrag can cause way more IO than some users can
stand.

At least we should not only defrag extent B.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/btrfs/262     | 99 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/262.out |  2 +
 2 files changed, 101 insertions(+)
 create mode 100755 tests/btrfs/262
 create mode 100644 tests/btrfs/262.out

Comments

Filipe Manana March 15, 2022, 11:32 a.m. UTC | #1
On Tue, Mar 15, 2022 at 10:34:17AM +0800, Qu Wenruo wrote:
> There is a report that btrfs autodefrag is defragging extents which only
> has one single sector.
> 
> Such defragging will not reduce the number of extents, but only waste
> IO.
> 
> The fix for it is titled:
> 
>   btrfs: avoid defragging extents whose next extents are not targets
> 
> Here we add a test case, which will create an inode with the following
> layout:
> 
>   0                16K   20K                  64K
>   |<-- Extent A -->|<-B->|<----- Extent C --->|
>   |Gen: 7          |Gen 9|Gen 7               |
> 
> And we trigger autodefrag with newer_than = 8, which means it will only
> defrag extents newer than or equal to geneartion 8.
> 
> Currently only Extent B meets the condition, but it can not be merged
> with Extent A nor Extent C, as they don't meet the geneartion
> requirement.
> 
> Unpatched kernel will defrag only Extent B, resulting no change in
> fragmentation, while cost extra IO.
> 
> Patched kernel will not defrag anything.
> 
> Although this is still not the ideal case, as we can defrag the whole
> 64K range, but that's not what autodefrag can do with its geneartion
> limitation.
> 
> And such "perfect" defrag can cause way more IO than some users can
> stand.
> 
> At least we should not only defrag extent B.

Looking at the subject:

  btrfs: make sure autodefrag won't defrag ranges which don't contribute to fragmentations

It's a bit long, it gives the idea it's a fix and not a test, plus
"fragmentations" should be "defragmentation".

Maybe something like the following would be more clear and shorter:

  btrfs: test that autdodefrag does not rewrite single extents

> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  tests/btrfs/262     | 99 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/262.out |  2 +
>  2 files changed, 101 insertions(+)
>  create mode 100755 tests/btrfs/262
>  create mode 100644 tests/btrfs/262.out
> 
> diff --git a/tests/btrfs/262 b/tests/btrfs/262
> new file mode 100755
> index 00000000..0f91c4a4
> --- /dev/null
> +++ b/tests/btrfs/262
> @@ -0,0 +1,99 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2022 SUSE Linux Products GmbH.  All Rights Reserved.
> +#
> +# FS QA Test 262
> +#
> +# Make sure btrfs autodefrag will not defrag ranges which won't reduce
> +# defragmentation.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick defrag
> +
> +# Override the default cleanup function.
> +# _cleanup()
> +# {
> +# 	cd /
> +# 	rm -r -f $tmp.*
> +# }

Could go away.

> +
> +# Import common functions.
> +. ./common/filter
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs btrfs
> +_require_scratch
> +_require_xfs_io_command "fiemap" "ranged"
> +
> +# Needs fixed 4K sector size, or the file layout will not match the expected
> +# result
> +_require_btrfs_support_sectorsize 4096
> +
> +_scratch_mkfs >> $seqres.full
> +
> +_scratch_mount -o noautodefrag
> +
> +# Create the initial layout, with a large 64K extent for later fragments
> +$XFS_IO_PROG -f -c "pwrite 0 64K" -c sync "$SCRATCH_MNT/foobar" >> $seqres.full
> +
> +# Need to bump the generation by one, as autodefrag uses the last modified
> +# generation of a subvolume. With this generation bump, autodefrag will
> +# defrag the whole file, not only the new write.

Some of the sentences in comments end with punctuation, others do not.
If not for the sake of formal correctness and to make it clear you ended the
sentences as you intended, than at least finish them all with punctation for
the sake of consistency and aesthetics.

> +touch "$SCRATCH_MNT/trash"
> +sync
> +
> +# Remount to autodefrag
> +_scratch_remount autodefrag
> +
> +# Write a new sector, which should trigger autodefrag
> +$XFS_IO_PROG -f -c "pwrite 16K 4K" -c sync "$SCRATCH_MNT/foobar" >> $seqres.full

We could leave the -f out, as it makes it clear we want to write to an
existing file.

Anyway, the test fails without the kernel patch, and it passes with it.
Minor cosmetics apart:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> +
> +echo "=== File extent layout before autodefrag ===" >> $seqres.full
> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full
> +
> +old_csum=$(_md5_checksum "$SCRATCH_MNT/foobar")
> +old_ext_0=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 0)
> +old_ext_16k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 16384)
> +old_ext_20k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 20480)
> +
> +# Trigger autodefrag
> +_scratch_remount commit=1
> +sleep 3
> +# Make sure the defragged range reach disk
> +sync
> +
> +echo "=== File extent layout after autodefrag ===" >> $seqres.full
> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full
> +
> +new_csum=$(_md5_checksum "$SCRATCH_MNT/foobar")
> +new_ext_0=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 0)
> +new_ext_16k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 16384)
> +new_ext_20k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 20480)
> +
> +# For extent at offset 0 and 20K, they are older than the target generation,
> +# thus they should not be defragged
> +if [ "$new_ext_0" != "$old_ext_0" ]; then
> +	echo "extent at offset 0 got defragged"
> +fi
> +if [ "$new_ext_20k" != "$old_ext_20k" ]; then
> +	echo "extent at offset 20K got defragged"
> +fi
> +
> +# For extent at offset 4K, it's a single sector, and its adjacent extents
> +# are not targets, thus it should not be defragged.
> +if [ "$new_ext_16k" != "$old_ext_16k" ]; then
> +	echo "extent at offset 16K got defragged"
> +fi
> +
> +# Defrag should not change file content
> +if [ "$new_csum" != "$old_csum" ]; then
> +	echo "file content changed"
> +fi
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/262.out b/tests/btrfs/262.out
> new file mode 100644
> index 00000000..404badc3
> --- /dev/null
> +++ b/tests/btrfs/262.out
> @@ -0,0 +1,2 @@
> +QA output created by 262
> +Silence is golden
> -- 
> 2.34.1
>
diff mbox series

Patch

diff --git a/tests/btrfs/262 b/tests/btrfs/262
new file mode 100755
index 00000000..0f91c4a4
--- /dev/null
+++ b/tests/btrfs/262
@@ -0,0 +1,99 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2022 SUSE Linux Products GmbH.  All Rights Reserved.
+#
+# FS QA Test 262
+#
+# Make sure btrfs autodefrag will not defrag ranges which won't reduce
+# defragmentation.
+#
+. ./common/preamble
+_begin_fstest auto quick defrag
+
+# Override the default cleanup function.
+# _cleanup()
+# {
+# 	cd /
+# 	rm -r -f $tmp.*
+# }
+
+# Import common functions.
+. ./common/filter
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+_require_scratch
+_require_xfs_io_command "fiemap" "ranged"
+
+# Needs fixed 4K sector size, or the file layout will not match the expected
+# result
+_require_btrfs_support_sectorsize 4096
+
+_scratch_mkfs >> $seqres.full
+
+_scratch_mount -o noautodefrag
+
+# Create the initial layout, with a large 64K extent for later fragments
+$XFS_IO_PROG -f -c "pwrite 0 64K" -c sync "$SCRATCH_MNT/foobar" >> $seqres.full
+
+# Need to bump the generation by one, as autodefrag uses the last modified
+# generation of a subvolume. With this generation bump, autodefrag will
+# defrag the whole file, not only the new write.
+touch "$SCRATCH_MNT/trash"
+sync
+
+# Remount to autodefrag
+_scratch_remount autodefrag
+
+# Write a new sector, which should trigger autodefrag
+$XFS_IO_PROG -f -c "pwrite 16K 4K" -c sync "$SCRATCH_MNT/foobar" >> $seqres.full
+
+echo "=== File extent layout before autodefrag ===" >> $seqres.full
+$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full
+
+old_csum=$(_md5_checksum "$SCRATCH_MNT/foobar")
+old_ext_0=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 0)
+old_ext_16k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 16384)
+old_ext_20k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 20480)
+
+# Trigger autodefrag
+_scratch_remount commit=1
+sleep 3
+# Make sure the defragged range reach disk
+sync
+
+echo "=== File extent layout after autodefrag ===" >> $seqres.full
+$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $seqres.full
+
+new_csum=$(_md5_checksum "$SCRATCH_MNT/foobar")
+new_ext_0=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 0)
+new_ext_16k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 16384)
+new_ext_20k=$(_get_file_extent_sector "$SCRATCH_MNT/foobar" 20480)
+
+# For extent at offset 0 and 20K, they are older than the target generation,
+# thus they should not be defragged
+if [ "$new_ext_0" != "$old_ext_0" ]; then
+	echo "extent at offset 0 got defragged"
+fi
+if [ "$new_ext_20k" != "$old_ext_20k" ]; then
+	echo "extent at offset 20K got defragged"
+fi
+
+# For extent at offset 4K, it's a single sector, and its adjacent extents
+# are not targets, thus it should not be defragged.
+if [ "$new_ext_16k" != "$old_ext_16k" ]; then
+	echo "extent at offset 16K got defragged"
+fi
+
+# Defrag should not change file content
+if [ "$new_csum" != "$old_csum" ]; then
+	echo "file content changed"
+fi
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/262.out b/tests/btrfs/262.out
new file mode 100644
index 00000000..404badc3
--- /dev/null
+++ b/tests/btrfs/262.out
@@ -0,0 +1,2 @@ 
+QA output created by 262
+Silence is golden