diff mbox series

btrfs: add test case to make sure autodefrag works even the extent maps are read from disk

Message ID 20220208071427.19171-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: add test case to make sure autodefrag works even the extent maps are read from disk | expand

Commit Message

Qu Wenruo Feb. 8, 2022, 7:14 a.m. UTC
There is a long existing problem that extent_map::generation is not
populated (thus always 0) if its read from disk.

This can prevent btrfs autodefrag from working as it relies on
extent_map::generation.
If it's always 0, then autodefrag will not consider the range as a
defrag target.

The test case itself will verify the behavior by:

- Create a fragmented file
  By writing backwards with OSYNC
  This will also queue the file for autodefrag.

- Drop all cache
  Including the extent map cache, meaning later read will
  all get extent map by reading from on-disk file extent items.

- Trigger autodefrag and verify the file layout
  If defrag works, the new file layout should differ from the original
  one.

The kernel fix is titled:

  btrfs: populate extent_map::generation when reading from disk

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/btrfs/259     | 64 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/259.out |  2 ++
 2 files changed, 66 insertions(+)
 create mode 100755 tests/btrfs/259
 create mode 100644 tests/btrfs/259.out

Comments

Filipe Manana Feb. 10, 2022, 4:37 p.m. UTC | #1
On Tue, Feb 8, 2022 at 12:00 PM Qu Wenruo <wqu@suse.com> wrote:
>
> There is a long existing problem that extent_map::generation is not
> populated (thus always 0) if its read from disk.
>
> This can prevent btrfs autodefrag from working as it relies on
> extent_map::generation.
> If it's always 0, then autodefrag will not consider the range as a
> defrag target.
>
> The test case itself will verify the behavior by:
>
> - Create a fragmented file
>   By writing backwards with OSYNC
>   This will also queue the file for autodefrag.
>
> - Drop all cache
>   Including the extent map cache, meaning later read will
>   all get extent map by reading from on-disk file extent items.
>
> - Trigger autodefrag and verify the file layout
>   If defrag works, the new file layout should differ from the original
>   one.
>
> The kernel fix is titled:
>
>   btrfs: populate extent_map::generation when reading from disk
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  tests/btrfs/259     | 64 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/259.out |  2 ++
>  2 files changed, 66 insertions(+)
>  create mode 100755 tests/btrfs/259
>  create mode 100644 tests/btrfs/259.out
>
> diff --git a/tests/btrfs/259 b/tests/btrfs/259
> new file mode 100755
> index 00000000..577e4ce4
> --- /dev/null
> +++ b/tests/btrfs/259
> @@ -0,0 +1,64 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 259
> +#
> +# Make sure autodefrag can still defrag the file even their extent maps are
> +# read from disk
> +#
> +. ./common/preamble
> +_begin_fstest auto quick defrag
> +
> +# Override the default cleanup function.
> +# _cleanup()
> +# {
> +#      cd /
> +#      rm -r -f $tmp.*
> +# }
> +
> +# Import common functions.
> +# . ./common/filter
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs btrfs
> +_require_scratch
> +
> +# Need 4K sectorsize, as the autodefrag threshold is only 64K,
> +# thus 64K sectorsize will not work.
> +_require_btrfs_support_sectorsize 4096

Missing a:

_require_xfs_io_command fiemap

> +_scratch_mkfs -s 4k >> $seqres.full
> +_scratch_mount -o datacow,autodefrag
> +
> +# Create fragmented write
> +$XFS_IO_PROG -f -s -c "pwrite 24k 8k" -c "pwrite 16k 8k" \
> +               -c "pwrite 8k 8k" -c "pwrite 0 8k" \
> +               "$SCRATCH_MNT/foobar" >> $seqres.full
> +sync

A comment on why this sync is needed would be good to have.
It may be confusing to the reader since we were doing synchronous writes before.

> +
> +echo "=== Before autodefrag ===" >> $seqres.full
> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.before
> +cat $tmp.before >> $seqres.full
> +
> +# Drop the cache (including extent map cache per-inode)
> +echo 3 > /proc/sys/vm/drop_caches
> +
> +# Now trigger autodefrag

A bit more explanation would be useful.

Set the commit interval to 1 second, so that 1 second after the
remount the transaction kthread runs
and wakes up the cleaner kthread, which in turn will run autodefrag.

> +_scratch_remount commit=1
> +sleep 3
> +sync

This sync is useless, so it should go away.

Otherwise, it looks good and the test works as expected.

Thanks for doing it.

> +
> +echo "=== After autodefrag ===" >> $seqres.full
> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.after
> +cat $tmp.after >> $seqres.full
> +
> +# The layout should differ if autodefrag is working
> +diff $tmp.before $tmp.after > /dev/null && echo "autodefrag didn't defrag the file"
> +
> +echo "Silence is golden"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/259.out b/tests/btrfs/259.out
> new file mode 100644
> index 00000000..bfbd2dea
> --- /dev/null
> +++ b/tests/btrfs/259.out
> @@ -0,0 +1,2 @@
> +QA output created by 259
> +Silence is golden
> --
> 2.34.1
>
Qu Wenruo Feb. 11, 2022, 12:01 a.m. UTC | #2
On 2022/2/11 00:37, Filipe Manana wrote:
> On Tue, Feb 8, 2022 at 12:00 PM Qu Wenruo <wqu@suse.com> wrote:
>>
>> There is a long existing problem that extent_map::generation is not
>> populated (thus always 0) if its read from disk.
>>
>> This can prevent btrfs autodefrag from working as it relies on
>> extent_map::generation.
>> If it's always 0, then autodefrag will not consider the range as a
>> defrag target.
>>
>> The test case itself will verify the behavior by:
>>
>> - Create a fragmented file
>>    By writing backwards with OSYNC
>>    This will also queue the file for autodefrag.
>>
>> - Drop all cache
>>    Including the extent map cache, meaning later read will
>>    all get extent map by reading from on-disk file extent items.
>>
>> - Trigger autodefrag and verify the file layout
>>    If defrag works, the new file layout should differ from the original
>>    one.
>>
>> The kernel fix is titled:
>>
>>    btrfs: populate extent_map::generation when reading from disk
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   tests/btrfs/259     | 64 +++++++++++++++++++++++++++++++++++++++++++++
>>   tests/btrfs/259.out |  2 ++
>>   2 files changed, 66 insertions(+)
>>   create mode 100755 tests/btrfs/259
>>   create mode 100644 tests/btrfs/259.out
>>
>> diff --git a/tests/btrfs/259 b/tests/btrfs/259
>> new file mode 100755
>> index 00000000..577e4ce4
>> --- /dev/null
>> +++ b/tests/btrfs/259
>> @@ -0,0 +1,64 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
>> +#
>> +# FS QA Test 259
>> +#
>> +# Make sure autodefrag can still defrag the file even their extent maps are
>> +# read from disk
>> +#
>> +. ./common/preamble
>> +_begin_fstest auto quick defrag
>> +
>> +# Override the default cleanup function.
>> +# _cleanup()
>> +# {
>> +#      cd /
>> +#      rm -r -f $tmp.*
>> +# }
>> +
>> +# Import common functions.
>> +# . ./common/filter
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>> +_supported_fs btrfs
>> +_require_scratch
>> +
>> +# Need 4K sectorsize, as the autodefrag threshold is only 64K,
>> +# thus 64K sectorsize will not work.
>> +_require_btrfs_support_sectorsize 4096
>
> Missing a:
>
> _require_xfs_io_command fiemap
>
>> +_scratch_mkfs -s 4k >> $seqres.full
>> +_scratch_mount -o datacow,autodefrag
>> +
>> +# Create fragmented write
>> +$XFS_IO_PROG -f -s -c "pwrite 24k 8k" -c "pwrite 16k 8k" \
>> +               -c "pwrite 8k 8k" -c "pwrite 0 8k" \
>> +               "$SCRATCH_MNT/foobar" >> $seqres.full
>> +sync
>
> A comment on why this sync is needed would be good to have.
> It may be confusing to the reader since we were doing synchronous writes before.
>
>> +
>> +echo "=== Before autodefrag ===" >> $seqres.full
>> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.before
>> +cat $tmp.before >> $seqres.full
>> +
>> +# Drop the cache (including extent map cache per-inode)
>> +echo 3 > /proc/sys/vm/drop_caches
>> +
>> +# Now trigger autodefrag
>
> A bit more explanation would be useful.
>
> Set the commit interval to 1 second, so that 1 second after the
> remount the transaction kthread runs
> and wakes up the cleaner kthread, which in turn will run autodefrag.
>
>> +_scratch_remount commit=1
>> +sleep 3
>> +sync
>
> This sync is useless, so it should go away.

Autodefrag doesn't write data back at all.
It just mark the target range dirty and wait for later writeback.

Thus sync is still needed AFAIK.

Thanks,
Qu

>
> Otherwise, it looks good and the test works as expected.
>
> Thanks for doing it.
>
>> +
>> +echo "=== After autodefrag ===" >> $seqres.full
>> +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.after
>> +cat $tmp.after >> $seqres.full
>> +
>> +# The layout should differ if autodefrag is working
>> +diff $tmp.before $tmp.after > /dev/null && echo "autodefrag didn't defrag the file"
>> +
>> +echo "Silence is golden"
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/259.out b/tests/btrfs/259.out
>> new file mode 100644
>> index 00000000..bfbd2dea
>> --- /dev/null
>> +++ b/tests/btrfs/259.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 259
>> +Silence is golden
>> --
>> 2.34.1
>>
Filipe Manana Feb. 11, 2022, 11:52 a.m. UTC | #3
On Fri, Feb 11, 2022 at 08:01:24AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/2/11 00:37, Filipe Manana wrote:
> > On Tue, Feb 8, 2022 at 12:00 PM Qu Wenruo <wqu@suse.com> wrote:
> > > 
> > > There is a long existing problem that extent_map::generation is not
> > > populated (thus always 0) if its read from disk.
> > > 
> > > This can prevent btrfs autodefrag from working as it relies on
> > > extent_map::generation.
> > > If it's always 0, then autodefrag will not consider the range as a
> > > defrag target.
> > > 
> > > The test case itself will verify the behavior by:
> > > 
> > > - Create a fragmented file
> > >    By writing backwards with OSYNC
> > >    This will also queue the file for autodefrag.
> > > 
> > > - Drop all cache
> > >    Including the extent map cache, meaning later read will
> > >    all get extent map by reading from on-disk file extent items.
> > > 
> > > - Trigger autodefrag and verify the file layout
> > >    If defrag works, the new file layout should differ from the original
> > >    one.
> > > 
> > > The kernel fix is titled:
> > > 
> > >    btrfs: populate extent_map::generation when reading from disk
> > > 
> > > Signed-off-by: Qu Wenruo <wqu@suse.com>
> > > ---
> > >   tests/btrfs/259     | 64 +++++++++++++++++++++++++++++++++++++++++++++
> > >   tests/btrfs/259.out |  2 ++
> > >   2 files changed, 66 insertions(+)
> > >   create mode 100755 tests/btrfs/259
> > >   create mode 100644 tests/btrfs/259.out
> > > 
> > > diff --git a/tests/btrfs/259 b/tests/btrfs/259
> > > new file mode 100755
> > > index 00000000..577e4ce4
> > > --- /dev/null
> > > +++ b/tests/btrfs/259
> > > @@ -0,0 +1,64 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
> > > +#
> > > +# FS QA Test 259
> > > +#
> > > +# Make sure autodefrag can still defrag the file even their extent maps are
> > > +# read from disk
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest auto quick defrag
> > > +
> > > +# Override the default cleanup function.
> > > +# _cleanup()
> > > +# {
> > > +#      cd /
> > > +#      rm -r -f $tmp.*
> > > +# }
> > > +
> > > +# Import common functions.
> > > +# . ./common/filter
> > > +
> > > +# real QA test starts here
> > > +
> > > +# Modify as appropriate.
> > > +_supported_fs btrfs
> > > +_require_scratch
> > > +
> > > +# Need 4K sectorsize, as the autodefrag threshold is only 64K,
> > > +# thus 64K sectorsize will not work.
> > > +_require_btrfs_support_sectorsize 4096
> > 
> > Missing a:
> > 
> > _require_xfs_io_command fiemap
> > 
> > > +_scratch_mkfs -s 4k >> $seqres.full
> > > +_scratch_mount -o datacow,autodefrag
> > > +
> > > +# Create fragmented write
> > > +$XFS_IO_PROG -f -s -c "pwrite 24k 8k" -c "pwrite 16k 8k" \
> > > +               -c "pwrite 8k 8k" -c "pwrite 0 8k" \
> > > +               "$SCRATCH_MNT/foobar" >> $seqres.full
> > > +sync
> > 
> > A comment on why this sync is needed would be good to have.
> > It may be confusing to the reader since we were doing synchronous writes before.
> > 
> > > +
> > > +echo "=== Before autodefrag ===" >> $seqres.full
> > > +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.before
> > > +cat $tmp.before >> $seqres.full
> > > +
> > > +# Drop the cache (including extent map cache per-inode)
> > > +echo 3 > /proc/sys/vm/drop_caches
> > > +
> > > +# Now trigger autodefrag
> > 
> > A bit more explanation would be useful.
> > 
> > Set the commit interval to 1 second, so that 1 second after the
> > remount the transaction kthread runs
> > and wakes up the cleaner kthread, which in turn will run autodefrag.
> > 
> > > +_scratch_remount commit=1
> > > +sleep 3
> > > +sync
> > 
> > This sync is useless, so it should go away.
> 
> Autodefrag doesn't write data back at all.
> It just mark the target range dirty and wait for later writeback.
> 
> Thus sync is still needed AFAIK.

Well, something weird is going on then.

Removing the sync, makes the test still fail on an unpatched kernel,
and on a patched kernel, it still succeeds.

Looking at the .full file, the results are correct:

=== Before autodefrag ===
/home/fdmanana/btrfs-tests/scratch_1/foobar:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..15]:         26672..26687        16   0x0
   1: [16..31]:        26656..26671        16   0x0
   2: [32..47]:        26640..26655        16   0x0
   3: [48..63]:        26624..26639        16   0x1
=== After autodefrag ===
/home/fdmanana/btrfs-tests/scratch_1/foobar:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..63]:         26688..26751        64   0x1

So why this worked seems to be because:

- After doing the writes to the file, a delayed iput on the inode
  did a wakeup on the cleaner kthread;

- The cleaner kthread ran defrag;

- btrfs_remount() calls sync_filesystem(), flushing all delalloc.

So, yes, it's better to leave the 'sync', as there's no way to
guarantee the cleaner ran defrag before remount.

Perhaps a comment about that sync would be useful as well.

Thanks.

> 
> Thanks,
> Qu
> 
> > 
> > Otherwise, it looks good and the test works as expected.
> > 
> > Thanks for doing it.
> > 
> > > +
> > > +echo "=== After autodefrag ===" >> $seqres.full
> > > +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.after
> > > +cat $tmp.after >> $seqres.full
> > > +
> > > +# The layout should differ if autodefrag is working
> > > +diff $tmp.before $tmp.after > /dev/null && echo "autodefrag didn't defrag the file"
> > > +
> > > +echo "Silence is golden"
> > > +
> > > +# success, all done
> > > +status=0
> > > +exit
> > > diff --git a/tests/btrfs/259.out b/tests/btrfs/259.out
> > > new file mode 100644
> > > index 00000000..bfbd2dea
> > > --- /dev/null
> > > +++ b/tests/btrfs/259.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 259
> > > +Silence is golden
> > > --
> > > 2.34.1
> > >
diff mbox series

Patch

diff --git a/tests/btrfs/259 b/tests/btrfs/259
new file mode 100755
index 00000000..577e4ce4
--- /dev/null
+++ b/tests/btrfs/259
@@ -0,0 +1,64 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 259
+#
+# Make sure autodefrag can still defrag the file even their extent maps are
+# read from disk
+#
+. ./common/preamble
+_begin_fstest auto quick defrag
+
+# Override the default cleanup function.
+# _cleanup()
+# {
+# 	cd /
+# 	rm -r -f $tmp.*
+# }
+
+# Import common functions.
+# . ./common/filter
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+_require_scratch
+
+# Need 4K sectorsize, as the autodefrag threshold is only 64K,
+# thus 64K sectorsize will not work.
+_require_btrfs_support_sectorsize 4096
+_scratch_mkfs -s 4k >> $seqres.full
+_scratch_mount -o datacow,autodefrag
+
+# Create fragmented write
+$XFS_IO_PROG -f -s -c "pwrite 24k 8k" -c "pwrite 16k 8k" \
+		-c "pwrite 8k 8k" -c "pwrite 0 8k" \
+		"$SCRATCH_MNT/foobar" >> $seqres.full
+sync
+
+echo "=== Before autodefrag ===" >> $seqres.full
+$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.before
+cat $tmp.before >> $seqres.full
+
+# Drop the cache (including extent map cache per-inode)
+echo 3 > /proc/sys/vm/drop_caches
+
+# Now trigger autodefrag
+_scratch_remount commit=1
+sleep 3
+sync
+
+echo "=== After autodefrag ===" >> $seqres.full
+$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.after
+cat $tmp.after >> $seqres.full
+
+# The layout should differ if autodefrag is working
+diff $tmp.before $tmp.after > /dev/null && echo "autodefrag didn't defrag the file"
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/259.out b/tests/btrfs/259.out
new file mode 100644
index 00000000..bfbd2dea
--- /dev/null
+++ b/tests/btrfs/259.out
@@ -0,0 +1,2 @@ 
+QA output created by 259
+Silence is golden