Message ID | 20190129071857.32473-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] btrfs: Test if btrfs will commit too many transaction for balance | expand |
On Tue, Jan 29, 2019 at 7:19 AM Qu Wenruo <wqu@suse.com> wrote: > > Kernel commit 64403612b73a ("btrfs: rework > btrfs_check_space_for_delayed_refs") is introducing a regression for > btrfs balance performance. > > Since that commit will cause btrfs to commit transaction for nothing > during transaction, to commit transaction for nothing during transaction -> to commit too many transactions for nothing during balance/relocation. Also, for the subject, "transaction" -> "transactions" (plural). > it will slow balance dramatically even we only need > to relocate several megabytes. > > This test case will catch the problem by using super block generation as > failure criteria. > For small chunk relocated, we will commit 6 transaction for each block > group, and the test case should only have 2 block groups, it should only > commit 12 transactions. > > This test case will use 120 as the threshold to detect the failure. > > And in my test environment, with kernel fix btrfs committed 14 > transactions and finished in 6s. > While without the fix btrfs committed 1734 transactions and takes 25s to > finish. > > So the test case should be more than enough to detect the regression, > while still keep the runtime small enough for failure. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > changelog: > v2: > - Remove the unnecessary workaround for ENOSPC balance > Since we know the cause of that false ENOSPC error, a simple sync > would be enough to take care of the problem > > - Remove unnecessary mount option of max_inline > The same reason above. > --- > tests/btrfs/181 | 101 ++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/181.out | 2 + > tests/btrfs/group | 1 + > 3 files changed, 104 insertions(+) > create mode 100755 tests/btrfs/181 > create mode 100644 tests/btrfs/181.out > > diff --git a/tests/btrfs/181 b/tests/btrfs/181 > new file mode 100755 > index 00000000..869cb15f > --- /dev/null > +++ b/tests/btrfs/181 > @@ -0,0 +1,101 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 181 > +# > +# Test if btrfs will commit too many transactions for nothing and cause > +# performance regression during balance. > +# > +# This bug is going to be fixed by a patch for kenerl title kenerl -> kernel > +# "btrfs: don't end the transaction for delayed refs in throttle" > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + cd / > + rm -f $tmp.* > +} > + > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > + > +# Modify as appropriate. > +_supported_fs btrfs > +_supported_os Linux > +_require_scratch > +_require_btrfs_command inspect-internal dump-super > + > +_scratch_mkfs > /dev/null > + > +_scratch_mount > + > +nr_files=1024 > +nr_snapshots=8 nr_snapshots is unused in the test. > + > +get_super_gen() > +{ > + $BTRFS_UTIL_PROG inspect dump-super "$SCRATCH_DEV" | grep ^generation |\ > + awk '{print $2}' > +} > + > +$BTRFS_UTIL_PROG subv create "$SCRATCH_MNT/subvol" > /dev/null subv -> subvolume We always prefer to use full command names. > + > +# Create some small files to take up enough metadata reserved space > +for ((i = 0; i < $nr_files; i++)) do > + _pwrite_byte 0xcd 0 1K "$SCRATCH_MNT/subvol/file_$i" > /dev/null > +done > + > +# Commit the fs so we can get a stable super generation > +sync > + > +before_gen=$(get_super_gen) > +if [ -z $before_gen ]; then > + _fail "failed to get super block genreation" genreation -> generation > +fi Also, this error checking should fit better in the get_super_gen function. We check for the error here but not when getting "after_gen", which seems odd. > + > +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" > /dev/null > + > +after_gen=$(get_super_gen) > + > +# Since the fs is pretty small, we should have only 1 small metadata chunk and > +# one tiny system chunk. > +# Relocating such small chunks only need 6 commits for each, thus 12 commits for need -> needs > +# 2 chunks. > +# Here we use 10x the theoretic value as threshold. > +theoretic_gen=$(( 6 * 2 )) > +threshold_gen=$(( 10 * $theoretic_gen )) > +if [ $(( $after_gen - $before_gen )) -gt 120 ]; then > + echo "balance committed too many transactions" > + echo "super generation before balance: ${before_gen}" > + echo "super generation after balance: ${after_gen}" > + echo "super generation difference: $((after_gen - before_gen))" > + echo "theoretic generation difference: ${theoretic_gen}" > + echo "threshold generation difference: ${threshold_gen}" > +fi > + > +echo "super generation before balance: ${before_gen}" >> $seqres.full > +echo "super generation after balance: ${after_gen}" >> $seqres.full > +echo "super generation difference: $((after_gen - before_gen))" >> $seqres.full > +echo "theoretic generation difference: ${theoretic_gen}" >> $seqres.full > +echo "threshold generation difference: ${threshold_gen}" >> $seqres.full > + > +# success, all done > +echo "Silence is golden" Other than that, it looks good. Reviewed-by: Filipe Manana <fdmanana@suse.com> Thanks. > + > +status=0 > +exit > diff --git a/tests/btrfs/181.out b/tests/btrfs/181.out > new file mode 100644 > index 00000000..a5a93be6 > --- /dev/null > +++ b/tests/btrfs/181.out > @@ -0,0 +1,2 @@ > +QA output created by 181 > +Silence is golden > diff --git a/tests/btrfs/group b/tests/btrfs/group > index 03eb62f9..0db485cb 100644 > --- a/tests/btrfs/group > +++ b/tests/btrfs/group > @@ -183,3 +183,4 @@ > 178 auto quick send > 179 auto qgroup dangerous > 180 auto quick qgroup limit > +181 auto quick balance > -- > 2.18.0 >
diff --git a/tests/btrfs/181 b/tests/btrfs/181 new file mode 100755 index 00000000..869cb15f --- /dev/null +++ b/tests/btrfs/181 @@ -0,0 +1,101 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved. +# +# FS QA Test 181 +# +# Test if btrfs will commit too many transactions for nothing and cause +# performance regression during balance. +# +# This bug is going to be fixed by a patch for kenerl title +# "btrfs: don't end the transaction for delayed refs in throttle" +# +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + cd / + rm -f $tmp.* +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter + +# remove previous $seqres.full before test +rm -f $seqres.full + +# real QA test starts here + +# Modify as appropriate. +_supported_fs btrfs +_supported_os Linux +_require_scratch +_require_btrfs_command inspect-internal dump-super + +_scratch_mkfs > /dev/null + +_scratch_mount + +nr_files=1024 +nr_snapshots=8 + +get_super_gen() +{ + $BTRFS_UTIL_PROG inspect dump-super "$SCRATCH_DEV" | grep ^generation |\ + awk '{print $2}' +} + +$BTRFS_UTIL_PROG subv create "$SCRATCH_MNT/subvol" > /dev/null + +# Create some small files to take up enough metadata reserved space +for ((i = 0; i < $nr_files; i++)) do + _pwrite_byte 0xcd 0 1K "$SCRATCH_MNT/subvol/file_$i" > /dev/null +done + +# Commit the fs so we can get a stable super generation +sync + +before_gen=$(get_super_gen) +if [ -z $before_gen ]; then + _fail "failed to get super block genreation" +fi + +$BTRFS_UTIL_PROG balance start -m "$SCRATCH_MNT" > /dev/null + +after_gen=$(get_super_gen) + +# Since the fs is pretty small, we should have only 1 small metadata chunk and +# one tiny system chunk. +# Relocating such small chunks only need 6 commits for each, thus 12 commits for +# 2 chunks. +# Here we use 10x the theoretic value as threshold. +theoretic_gen=$(( 6 * 2 )) +threshold_gen=$(( 10 * $theoretic_gen )) +if [ $(( $after_gen - $before_gen )) -gt 120 ]; then + echo "balance committed too many transactions" + echo "super generation before balance: ${before_gen}" + echo "super generation after balance: ${after_gen}" + echo "super generation difference: $((after_gen - before_gen))" + echo "theoretic generation difference: ${theoretic_gen}" + echo "threshold generation difference: ${threshold_gen}" +fi + +echo "super generation before balance: ${before_gen}" >> $seqres.full +echo "super generation after balance: ${after_gen}" >> $seqres.full +echo "super generation difference: $((after_gen - before_gen))" >> $seqres.full +echo "theoretic generation difference: ${theoretic_gen}" >> $seqres.full +echo "threshold generation difference: ${threshold_gen}" >> $seqres.full + +# success, all done +echo "Silence is golden" + +status=0 +exit diff --git a/tests/btrfs/181.out b/tests/btrfs/181.out new file mode 100644 index 00000000..a5a93be6 --- /dev/null +++ b/tests/btrfs/181.out @@ -0,0 +1,2 @@ +QA output created by 181 +Silence is golden diff --git a/tests/btrfs/group b/tests/btrfs/group index 03eb62f9..0db485cb 100644 --- a/tests/btrfs/group +++ b/tests/btrfs/group @@ -183,3 +183,4 @@ 178 auto quick send 179 auto qgroup dangerous 180 auto quick qgroup limit +181 auto quick balance
Kernel commit 64403612b73a ("btrfs: rework btrfs_check_space_for_delayed_refs") is introducing a regression for btrfs balance performance. Since that commit will cause btrfs to commit transaction for nothing during transaction, it will slow balance dramatically even we only need to relocate several megabytes. This test case will catch the problem by using super block generation as failure criteria. For small chunk relocated, we will commit 6 transaction for each block group, and the test case should only have 2 block groups, it should only commit 12 transactions. This test case will use 120 as the threshold to detect the failure. And in my test environment, with kernel fix btrfs committed 14 transactions and finished in 6s. While without the fix btrfs committed 1734 transactions and takes 25s to finish. So the test case should be more than enough to detect the regression, while still keep the runtime small enough for failure. Signed-off-by: Qu Wenruo <wqu@suse.com> --- changelog: v2: - Remove the unnecessary workaround for ENOSPC balance Since we know the cause of that false ENOSPC error, a simple sync would be enough to take care of the problem - Remove unnecessary mount option of max_inline The same reason above. --- tests/btrfs/181 | 101 ++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/181.out | 2 + tests/btrfs/group | 1 + 3 files changed, 104 insertions(+) create mode 100755 tests/btrfs/181 create mode 100644 tests/btrfs/181.out