btrfs/280: run defrag after creating file to get expected extent layout

Message ID 837d97d52fee15653d1dac216d1d75a14bb1916d.1717586749.git.fdmanana@suse.com (mailing list archive)
State New, archived
Series btrfs/280: run defrag after creating file to get expected extent layout

Commit Message

Filipe Manana June 5, 2024, 11:26 a.m. UTC
From: Filipe Manana <fdmanana@suse.com>

The test writes a 128M file and expects to end up with 1024 extents, each
with a size of 128K, which is the maximum size for compressed extents.
Generally this is what happens, but sometimes it's possible for writeback
to kick in while creating the file (due to memory pressure, something
calling sync in parallel, etc.), which may result in more and smaller
extents, making the test fail since its golden output expects exactly
1024 extents of 128K each.

So to work around that, run defrag after creating the file, which ensures
we get only 128K extents in the file.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 tests/btrfs/280 | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
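
For context, a minimal sketch of how the resulting extent layout can be
checked (assuming xfs_io with fiemap support; this is not part of the
patch):

    # List the file's extents with fiemap and count them; after the defrag
    # exactly 1024 extents of 128K each are expected. The first two lines
    # of the verbose output are the file name and the column header.
    $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foo | tail -n +3 | wc -l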

Comments

David Disseldorp June 5, 2024, 12:10 p.m. UTC | #1
On Wed,  5 Jun 2024 12:26:20 +0100, fdmanana@kernel.org wrote:

> From: Filipe Manana <fdmanana@suse.com>
> 
> The test writes a 128M file and expects to end up with 1024 extents, each
> with a size of 128K, which is the maximum size for compressed extents.
> Generally this is what happens, but sometimes it's possible for writeback to
> kick in while creating the file (due to memory pressure, or something
> calling sync in parallel, etc) which may result in creating more and
> smaller extents, which makes the test fail since its golden output
> expects exactly 1024 extents with a size of 128K each.
> 
> So to work around that, run defrag after creating the file, which will ensure
> we get only 128K extents in the file.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Looks fine.
Signed-off-by: David Disseldorp <ddiss@suse.de>

> ---
>  tests/btrfs/280 | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/btrfs/280 b/tests/btrfs/280
> index d4f613ce..0f7f8a37 100755
> --- a/tests/btrfs/280
> +++ b/tests/btrfs/280
> @@ -13,7 +13,7 @@
>  # the backref walking code, used by fiemap to determine if an extent is shared.
>  #
>  . ./common/preamble
> -_begin_fstest auto quick compress snapshot fiemap
> +_begin_fstest auto quick compress snapshot fiemap defrag
>  
>  . ./common/filter
>  . ./common/punch # for _filter_fiemap_flags

_require_defrag might be worth calling, but it doesn't really do
anything for btrfs, so I'm fine either way.
David Sterba June 5, 2024, 3:27 p.m. UTC | #2
On Wed, Jun 05, 2024 at 12:26:20PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> The test writes a 128M file and expects to end up with 1024 extents, each
> with a size of 128K, which is the maximum size for compressed extents.
> Generally this is what happens, but sometimes it's possible for writeback to
> kick in while creating the file (due to memory pressure, or something
> calling sync in parallel, etc) which may result in creating more and
> smaller extents, which makes the test fail since its golden output
> expects exactly 1024 extents with a size of 128K each.
> 
> So to work around that, run defrag after creating the file, which will ensure
> we get only 128K extents in the file.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>
Qu Wenruo June 5, 2024, 10:30 p.m. UTC | #3
On 2024/6/5 20:56, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> The test writes a 128M file and expects to end up with 1024 extents, each
> with a size of 128K, which is the maximum size for compressed extents.
> Generally this is what happens, but sometimes it's possible for writeback to
> kick in while creating the file (due to memory pressure, or something
> calling sync in parallel, etc) which may result in creating more and
> smaller extents, which makes the test fail since its golden output
> expects exactly 1024 extents with a size of 128K each.
>
> So to work around that, run defrag after creating the file, which will ensure
> we get only 128K extents in the file.

But defrag is not much different from reading the page and setting it
dirty for writeback again.

It can be affected by the same memory pressure and get split the same way.

I guess you chose compressed file extents to bump up the subvolume tree
size while also being compatible with all sector sizes.

In that case, what about doing DIO using the sector size of the fs?
That way each DIO write would result in one file extent item, and since
it's a single sector/page, memory pressure will never be able to write
back that sector halfway.
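
Something like the following, just as a rough sketch (assuming a 4K
sector size):

    # Hypothetical sketch: one O_DIRECT write per 4K sector, so each write
    # allocates its extent on submission and can't be split halfway by
    # background writeback; 1024 writes give 1024 file extent items.
    for ((i = 0; i < 1024; i++)); do
        $XFS_IO_PROG -f -d -c "pwrite $((i * 4096)) 4096" \
            "$SCRATCH_MNT/foo" >> "$seqres.full"
    done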

Thanks,
Qu
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>   tests/btrfs/280 | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/tests/btrfs/280 b/tests/btrfs/280
> index d4f613ce..0f7f8a37 100755
> --- a/tests/btrfs/280
> +++ b/tests/btrfs/280
> @@ -13,7 +13,7 @@
>   # the backref walking code, used by fiemap to determine if an extent is shared.
>   #
>   . ./common/preamble
> -_begin_fstest auto quick compress snapshot fiemap
> +_begin_fstest auto quick compress snapshot fiemap defrag
>
>   . ./common/filter
>   . ./common/punch # for _filter_fiemap_flags
> @@ -36,6 +36,14 @@ _scratch_mount -o compress
>   # extent tree (if the root was a leaf, we would have only data references).
>   $XFS_IO_PROG -f -c "pwrite -b 1M 0 128M" $SCRATCH_MNT/foo | _filter_xfs_io
>
> +# While writing the file it's possible, but rare, that writeback kicked in due
> +# to memory pressure or a concurrent sync call for example, so we may end up
> +# with extents smaller than 128K (the maximum size for compressed extents) and
> +# therefore make the test expectations fail because we get more extents than
> +# what the golden output expects. So run defrag to make sure we get exactly
> +# the expected number of 128K extents (1024 extents).
> +$BTRFS_UTIL_PROG filesystem defrag "$SCRATCH_MNT/foo" >> $seqres.full
> +
>   # Create a RW snapshot of the default subvolume.
>   _btrfs subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
>
Filipe Manana June 5, 2024, 11:17 p.m. UTC | #4
On Wed, Jun 5, 2024 at 11:30 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2024/6/5 20:56, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > The test writes a 128M file and expects to end up with 1024 extents, each
> > with a size of 128K, which is the maximum size for compressed extents.
> > Generally this is what happens, but sometimes it's possible for writeback to
> > kick in while creating the file (due to memory pressure, or something
> > calling sync in parallel, etc) which may result in creating more and
> > smaller extents, which makes the test fail since its golden output
> > expects exactly 1024 extents with a size of 128K each.
> >
> > So to work around that, run defrag after creating the file, which will ensure
> > we get only 128K extents in the file.
>
> But defrag is not much different than reading the page and set it dirty
> for writeback again.
>
> It can be affected by the same memory pressure things to get split.

Defrag locks the range and the pages, then dirties the pages and then
unlocks them. So any writeback attempt happening in parallel will wait
for the pages to be unlocked, and we shouldn't get extents smaller than
128K. Did I miss anything?

>
> I guess you choose compressed file extents is to bump up the subvolume
> tree meanwhile also compatible for all sector sizes.

Yes, and to be fast and use very little space.

>
> In that case, what about doing DIO using sectorsize of the fs?
> So that each dio write would result one file extent item, meanwhile
> since it's a single sector/page, memory pressure will never be able to
> writeback that sector halfway.

I thought about DIO, but I would have to leave holes between every
extent (and for that I would rather use buffered IO, for simplicity and
probably speed).
Otherwise fiemap merges all adjacent extents: you get one 8M extent
reported, covering the range of the odd single profile data block group
created by mkfs, and another one for the remainder of the file - it's
just ugly and hard to reason about, plus it could break one day if we
ever get rid of that 8M data block group.
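
Something like this would be needed, just as a sketch (assuming a 4K
block size, with each 4K extent followed by a 4K hole):

    # Hypothetical sketch: 1024 buffered 4K writes, each followed by a 4K
    # hole, so fiemap reports every extent separately instead of merging
    # adjacent ones.
    for ((i = 0; i < 1024; i++)); do
        $XFS_IO_PROG -f -c "pwrite $((i * 8192)) 4096" \
            "$SCRATCH_MNT/foo" >> "$seqres.full"
    done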



>
> Thanks,
> Qu
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >   tests/btrfs/280 | 10 +++++++++-
> >   1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/btrfs/280 b/tests/btrfs/280
> > index d4f613ce..0f7f8a37 100755
> > --- a/tests/btrfs/280
> > +++ b/tests/btrfs/280
> > @@ -13,7 +13,7 @@
> >   # the backref walking code, used by fiemap to determine if an extent is shared.
> >   #
> >   . ./common/preamble
> > -_begin_fstest auto quick compress snapshot fiemap
> > +_begin_fstest auto quick compress snapshot fiemap defrag
> >
> >   . ./common/filter
> >   . ./common/punch # for _filter_fiemap_flags
> > @@ -36,6 +36,14 @@ _scratch_mount -o compress
> >   # extent tree (if the root was a leaf, we would have only data references).
> >   $XFS_IO_PROG -f -c "pwrite -b 1M 0 128M" $SCRATCH_MNT/foo | _filter_xfs_io
> >
> > +# While writing the file it's possible, but rare, that writeback kicked in due
> > +# to memory pressure or a concurrent sync call for example, so we may end up
> > +# with extents smaller than 128K (the maximum size for compressed extents) and
> > +# therefore make the test expectations fail because we get more extents than
> > +# what the golden output expects. So run defrag to make sure we get exactly
> > +# the expected number of 128K extents (1024 extents).
> > +$BTRFS_UTIL_PROG filesystem defrag "$SCRATCH_MNT/foo" >> $seqres.full
> > +
> >   # Create a RW snapshot of the default subvolume.
> >   _btrfs subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
> >
Qu Wenruo June 6, 2024, 12:52 a.m. UTC | #5
On 2024/6/6 08:47, Filipe Manana wrote:
> On Wed, Jun 5, 2024 at 11:30 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2024/6/5 20:56, fdmanana@kernel.org wrote:
>>> From: Filipe Manana <fdmanana@suse.com>
>>>
>>> The test writes a 128M file and expects to end up with 1024 extents, each
>>> with a size of 128K, which is the maximum size for compressed extents.
>>> Generally this is what happens, but sometimes it's possible for writeback to
>>> kick in while creating the file (due to memory pressure, or something
>>> calling sync in parallel, etc) which may result in creating more and
>>> smaller extents, which makes the test fail since its golden output
>>> expects exactly 1024 extents with a size of 128K each.
>>>
>>> So to work around that, run defrag after creating the file, which will ensure
>>> we get only 128K extents in the file.
>>
>> But defrag is not much different than reading the page and set it dirty
>> for writeback again.
>>
>> It can be affected by the same memory pressure things to get split.
> 
> Defrag locks the range, the pages, then dirties the pages and then
> unlocks the pages. So any writeback attempt happening in parallel will
> wait for the pages
> to be unlocked. So we shouldn't get extents smaller than 128K. Did I
> miss anything?
> 

You're right, I forgot the page is also locked, and the defrag cluster
size is 256K, exactly aligned with the compressed extent size.

So it's completely fine.

>>
>> I guess you choose compressed file extents is to bump up the subvolume
>> tree meanwhile also compatible for all sector sizes.
> 
> Yes, and to be fast and use very little space.
> 
>>
>> In that case, what about doing DIO using sectorsize of the fs?
>> So that each dio write would result one file extent item, meanwhile
>> since it's a single sector/page, memory pressure will never be able to
>> writeback that sector halfway.
> 
> I thought about DIO, but would have to leave holes between every
> extent (and for that I would rather use buffered IO for simplicity and
> probably faster).
> Otherwise fiemap merges all adjacent extents, you get one 8M extent
> reported, covering the range of the odd single profile data block group created
> by mkfs, and another one for the remaining of the file - it's just
> ugly and hard to reason about, plus that could break one day if we
> ever get rid of that 8M data block group.

Yep, fiemap merging is another problem.

So this looks totally fine now.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> 
> 
> 
>>
>> Thanks,
>> Qu
>>>
>>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
>>> ---
>>>    tests/btrfs/280 | 10 +++++++++-
>>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tests/btrfs/280 b/tests/btrfs/280
>>> index d4f613ce..0f7f8a37 100755
>>> --- a/tests/btrfs/280
>>> +++ b/tests/btrfs/280
>>> @@ -13,7 +13,7 @@
>>>    # the backref walking code, used by fiemap to determine if an extent is shared.
>>>    #
>>>    . ./common/preamble
>>> -_begin_fstest auto quick compress snapshot fiemap
>>> +_begin_fstest auto quick compress snapshot fiemap defrag
>>>
>>>    . ./common/filter
>>>    . ./common/punch # for _filter_fiemap_flags
>>> @@ -36,6 +36,14 @@ _scratch_mount -o compress
>>>    # extent tree (if the root was a leaf, we would have only data references).
>>>    $XFS_IO_PROG -f -c "pwrite -b 1M 0 128M" $SCRATCH_MNT/foo | _filter_xfs_io
>>>
>>> +# While writing the file it's possible, but rare, that writeback kicked in due
>>> +# to memory pressure or a concurrent sync call for example, so we may end up
>>> +# with extents smaller than 128K (the maximum size for compressed extents) and
>>> +# therefore make the test expectations fail because we get more extents than
>>> +# what the golden output expects. So run defrag to make sure we get exactly
>>> +# the expected number of 128K extents (1024 extents).
>>> +$BTRFS_UTIL_PROG filesystem defrag "$SCRATCH_MNT/foo" >> $seqres.full
>>> +
>>>    # Create a RW snapshot of the default subvolume.
>>>    _btrfs subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
>>>
>

Patch

diff --git a/tests/btrfs/280 b/tests/btrfs/280
index d4f613ce..0f7f8a37 100755
--- a/tests/btrfs/280
+++ b/tests/btrfs/280
@@ -13,7 +13,7 @@ 
 # the backref walking code, used by fiemap to determine if an extent is shared.
 #
 . ./common/preamble
-_begin_fstest auto quick compress snapshot fiemap
+_begin_fstest auto quick compress snapshot fiemap defrag
 
 . ./common/filter
 . ./common/punch # for _filter_fiemap_flags
@@ -36,6 +36,14 @@  _scratch_mount -o compress
 # extent tree (if the root was a leaf, we would have only data references).
 $XFS_IO_PROG -f -c "pwrite -b 1M 0 128M" $SCRATCH_MNT/foo | _filter_xfs_io
 
+# While writing the file it's possible, but rare, that writeback kicked in due
+# to memory pressure or a concurrent sync call for example, so we may end up
+# with extents smaller than 128K (the maximum size for compressed extents) and
+# therefore make the test expectations fail because we get more extents than
+# what the golden output expects. So run defrag to make sure we get exactly
+# the expected number of 128K extents (1024 extents).
+$BTRFS_UTIL_PROG filesystem defrag "$SCRATCH_MNT/foo" >> $seqres.full
+
 # Create a RW snapshot of the default subvolume.
 _btrfs subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap