Message ID | 1419060301-26830-1-git-send-email-gux.fnst@cn.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
On Sat, Dec 20, 2014 at 03:25:01PM +0800, Xing Gu wrote:
> This case tests truncate/collapse range race. If
> the race occurs, it will trigger BUG_ON.
>
> Signed-off-by: Xing Gu <gux.fnst@cn.fujitsu.com>
> ---

What changed from the previous version?

...
> +rm -f $seqres.full
> +_scratch_mkfs >>$seqres.full 2>&1
> +_scratch_mount
> +
> +old_bug=`dmesg | grep -c "kernel BUG"`
> +
> +testfile=$SCRATCH_MNT/file.$seq
> +# fcollapse/truncate continuously and simultaneously a same file
> +for ((i=1; i <= 100; i++)); do
> +	for ((i=1; i <= 1000; i++)); do
> +		$XFS_IO_PROG -f -c 'truncate 100k' $testfile 2>> $seqres.full
> +		$XFS_IO_PROG -f -c 'fcollapse 0 16k' $testfile 2>> $seqres.full
> +	done &
> +	for ((i=1; i <= 1000; i++)); do
> +		$XFS_IO_PROG -f -c 'truncate 0' $testfile 2>> $seqres.full
> +	done &
> +done

The previous version of this ran a loop for 3 minutes, which we
talked about being too long. This loop forks 300,000 processes
and generates a 1.5MB $seqres.full file. On my single CPU test VM
it takes:

generic/039	 302s

That is about 5 minutes to run, so it takes longer than the 3 minute
version of the same test we said was too long. FYI, my 16p test VM
still takes 35s to crunch through this test and it pegs all 16 CPUs
to 100% usage.

We don't need to record the output of the xfs_io commands, so
avoiding a fork and throwing away the output such as:

	$XFS_IO_PROG -f -c 'truncate 100k' \
			-c 'fcollapse 0 16k' \
			$testfile > /dev/null 2>&1

makes the runtime on the 16p VM drop by 40% (22s) and by 33% (200s)
on the single CPU VM, but that's still too long on the smaller CPU
systems.

I think the loop iterations need to be tuned to the number of CPUs
in the system. This:

	NCPUS=`$here/src/feature -o`
	OUTER_LOOPS=$((10 * $NCPUS * $LOAD_FACTOR))
	INNER_LOOPS=$((50 * $NCPUS * $LOAD_FACTOR))

plus the above xfs_io optimisations give a runtime of 3s on my 1p
machine and 30s on my 16p machine. That would be more acceptable
to everyone, I think.

> +wait
> +
> +new_bug=`dmesg | grep -c "kernel BUG"`
> +if [ $new_bug -ne $old_bug ]; then
> +	_fail "kernel bug detected, check dmesg for more infomation."
> +fi

A kernel bug in a process with an open file descriptor will cause
the filesystem to be unmountable. It will hang the test and require
a reboot. Hence there's no point in checking dmesg for a bug message
as it will be noticed by the test failing to complete.

> +status=0
> +exit
> diff --git a/tests/generic/039.out b/tests/generic/039.out
> new file mode 100644
> index 0000000..0cacac7
> --- /dev/null
> +++ b/tests/generic/039.out
> @@ -0,0 +1 @@
> +QA output created by 039

The test needs to echo something to indicate that an empty golden
output file is expected. "Silence is golden" is the usual phrase
here....

>  036 auto aio rw stress
>  037 metadata auto quick
>  038 auto stress
> +039 auto metadata rw

With the addition of $LOAD_FACTOR, this can be added to the stress
group as well.

Cheers,

Dave.
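As a rough illustration of the loop-count tuning suggested in the review, here is a minimal standalone sketch. It is not part of the posted patch: the function name loop_counts is hypothetical, and getconf stands in for the `$here/src/feature -o` helper so the snippet can run outside an fstests checkout. The default of 1 for LOAD_FACTOR is likewise an assumption made for the sketch.

```shell
#!/bin/bash
# Sketch only: scale the loop counts by CPU count and load factor,
# using the 10x/50x multipliers proposed in the review.

# loop_counts NCPUS LOAD_FACTOR  ->  prints "OUTER_LOOPS INNER_LOOPS"
# (hypothetical helper name, used here for illustration only)
loop_counts()
{
	local ncpus=$1 load_factor=$2

	echo "$((10 * ncpus * load_factor)) $((50 * ncpus * load_factor))"
}

# Assumption: getconf replaces "$here/src/feature -o" outside fstests.
NCPUS=$(getconf _NPROCESSORS_ONLN)
# Assumption: fall back to a load factor of 1 when none is set.
LOAD_FACTOR=${LOAD_FACTOR:-1}

read -r OUTER_LOOPS INNER_LOOPS <<< "$(loop_counts "$NCPUS" "$LOAD_FACTOR")"
echo "ncpus=$NCPUS outer=$OUTER_LOOPS inner=$INNER_LOOPS"
```

With one CPU and a load factor of 1 this gives 10 outer and 50 inner iterations. Each outer iteration starts two background loops, so with the two xfs_io commands merged into one invocation that is 10 * 2 * 50 = 1,000 xfs_io processes, compared with 100 * (2,000 + 1,000) = 300,000 in the posted loop.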
On 12/24/2014 09:53 AM, Dave Chinner wrote:
> On Sat, Dec 20, 2014 at 03:25:01PM +0800, Xing Gu wrote:
>> This case tests truncate/collapse range race. If
>> the race occurs, it will trigger BUG_ON.
>>
>> Signed-off-by: Xing Gu <gux.fnst@cn.fujitsu.com>
>> ---
>
> What changed from the previous version?
>

Compared with the previous version, there are mainly two changes:
(1) Since this patch only checks for the truncate/collapse range race,
    the description in the previous version was not clear. I changed
    the description.
(2) Considering the different performance of each test machine, it is
    not reasonable to run the loop for a fixed time, e.g. the 3 minutes
    used in the previous version. I changed the form of the loop.

> ...
>> +rm -f $seqres.full
>> +_scratch_mkfs >>$seqres.full 2>&1
>> +_scratch_mount
>> +
>> +old_bug=`dmesg | grep -c "kernel BUG"`
>> +
>> +testfile=$SCRATCH_MNT/file.$seq
>> +# fcollapse/truncate continuously and simultaneously a same file
>> +for ((i=1; i <= 100; i++)); do
>> +	for ((i=1; i <= 1000; i++)); do
>> +		$XFS_IO_PROG -f -c 'truncate 100k' $testfile 2>> $seqres.full
>> +		$XFS_IO_PROG -f -c 'fcollapse 0 16k' $testfile 2>> $seqres.full
>> +	done &
>> +	for ((i=1; i <= 1000; i++)); do
>> +		$XFS_IO_PROG -f -c 'truncate 0' $testfile 2>> $seqres.full
>> +	done &
>> +done
>
> The previous version of this ran a loop for 3 minutes, which we
> talked about being too long. This loop forks 300,000 processes
> and generates a 1.5MB $seqres.full file. On my single CPU test VM
> it takes:
>
> generic/039	 302s
>
> About 5 minutes to run, so it takes longer than the 3 minute version
> of the same test we said was too long. FYI, my 16p test VM still
> takes 35s to crunch through this test and it pegs all 16 CPUs to
> 100% usage.
>
> We don't need to record the output of the xfs_io commands, so
> avoiding a fork and throwing away the output such as:
>
> 	$XFS_IO_PROG -f -c 'truncate 100k' \
> 			-c 'fcollapse 0 16k' \
> 			$testfile > /dev/null 2>&1
>
> makes the runtime on the 16p VM drop by 40% (22s) and by 33% (200s)
> on the single CPU VM, but that's still too long on the smaller CPU
> systems.
>
> I think the loop iterations need to be tuned to the number of CPUs
> in the system. This:
>
> 	NCPUS=`$here/src/feature -o`
> 	OUTER_LOOPS=$((10 * $NCPUS * $LOAD_FACTOR))
> 	INNER_LOOPS=$((50 * $NCPUS * $LOAD_FACTOR))
>
> plus the above xfs_io optimisations give a runtime of 3s on my 1p
> machine and 30s on my 16p machine. That would be more acceptable
> to everyone, I think.
>

Got it.

>> +wait
>> +
>> +new_bug=`dmesg | grep -c "kernel BUG"`
>> +if [ $new_bug -ne $old_bug ]; then
>> +	_fail "kernel bug detected, check dmesg for more infomation."
>> +fi
>
> A kernel bug in a process with an open file descriptor will cause
> the filesystem to be unmountable. It will hang the test, require a
> reboot. Hence there's no point in checking dmesg for a bug message
> as it will be noticed by the test failing to complete.
>

Got it.

>> +status=0
>> +exit
>> diff --git a/tests/generic/039.out b/tests/generic/039.out
>> new file mode 100644
>> index 0000000..0cacac7
>> --- /dev/null
>> +++ b/tests/generic/039.out
>> @@ -0,0 +1 @@
>> +QA output created by 039
>
> The test needs to echo something to indicate that an empty golden
> output file is expected. "Silence is golden" is the usual phrase
> here....
>

Got it.

>>  036 auto aio rw stress
>>  037 metadata auto quick
>>  038 auto stress
>> +039 auto metadata rw
>
> With the addition of $LOAD_FACTOR, this can be added to the stress
> group as well.
>

Got it.

Thanks for your suggestion!

Regards,
Xing Gu

> Cheers,
>
> Dave.
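For readers following the thread, the review points about the race loop (one xfs_io process per iteration, discarded output, no dmesg check, a "Silence is golden" marker) could be combined roughly as below. This is a sketch under stated assumptions, not the version that was eventually committed: the function name run_race is hypothetical, the inner loops use their own counter j for readability, XFS_IO_PROG falls back to plain xfs_io, and the target file lives in a temporary directory instead of $SCRATCH_MNT so the snippet runs outside fstests. On a filesystem without collapse range support, or without xfs_io installed, the commands simply fail quietly.

```shell
#!/bin/bash
# Sketch only, not the committed test.
# Assumption: outside fstests, fall back to plain "xfs_io".
XFS_IO_PROG=${XFS_IO_PROG:-xfs_io}

# run_race OUTER INNER TESTFILE (hypothetical helper name)
# Races truncate/fcollapse against truncate-to-zero on one file.
run_race()
{
	local outer=$1 inner=$2 testfile=$3
	local i j

	for ((i = 1; i <= outer; i++)); do
		for ((j = 1; j <= inner; j++)); do
			# one xfs_io process runs both commands; output is discarded
			$XFS_IO_PROG -f -c 'truncate 100k' \
				-c 'fcollapse 0 16k' \
				"$testfile" > /dev/null 2>&1
		done &
		for ((j = 1; j <= inner; j++)); do
			$XFS_IO_PROG -f -c 'truncate 0' "$testfile" > /dev/null 2>&1
		done &
	done
	# wait for every background loop before returning
	wait
}

# Assumption: a temporary directory stands in for $SCRATCH_MNT.
tmpdir=$(mktemp -d)
run_race 2 3 "$tmpdir/file.race"
rm -rf "$tmpdir"

# No dmesg check: per the review, a kernel BUG would hang the test anyway.
# The loop prints nothing, so the golden output needs a marker line.
echo "Silence is golden"
```

In the real test the two loop counts would come from the CPU-scaled OUTER_LOOPS and INNER_LOOPS values discussed in the review, and 039.out would gain the matching "Silence is golden" line.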
diff --git a/tests/generic/039 b/tests/generic/039
new file mode 100755
index 0000000..a09df43
--- /dev/null
+++ b/tests/generic/039
@@ -0,0 +1,75 @@
+#! /bin/bash
+# FS QA Test No. 039
+#
+# Test truncate/collapse range race.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2014 Fujitsu. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	rm -f $tmp.*
+}
+
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_os Linux
+_supported_fs generic
+_require_scratch
+_require_xfs_io_command "fcollapse"
+
+rm -f $seqres.full
+_scratch_mkfs >>$seqres.full 2>&1
+_scratch_mount
+
+old_bug=`dmesg | grep -c "kernel BUG"`
+
+testfile=$SCRATCH_MNT/file.$seq
+# fcollapse/truncate continuously and simultaneously a same file
+for ((i=1; i <= 100; i++)); do
+	for ((i=1; i <= 1000; i++)); do
+		$XFS_IO_PROG -f -c 'truncate 100k' $testfile 2>> $seqres.full
+		$XFS_IO_PROG -f -c 'fcollapse 0 16k' $testfile 2>> $seqres.full
+	done &
+	for ((i=1; i <= 1000; i++)); do
+		$XFS_IO_PROG -f -c 'truncate 0' $testfile 2>> $seqres.full
+	done &
+done
+
+wait
+
+new_bug=`dmesg | grep -c "kernel BUG"`
+if [ $new_bug -ne $old_bug ]; then
+	_fail "kernel bug detected, check dmesg for more infomation."
+fi
+
+status=0
+exit
diff --git a/tests/generic/039.out b/tests/generic/039.out
new file mode 100644
index 0000000..0cacac7
--- /dev/null
+++ b/tests/generic/039.out
@@ -0,0 +1 @@
+QA output created by 039
diff --git a/tests/generic/group b/tests/generic/group
index 1e89848..5a3d13a 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -41,6 +41,7 @@
 036 auto aio rw stress
 037 metadata auto quick
 038 auto stress
+039 auto metadata rw
 053 acl repair auto quick
 062 attr udf auto quick
 068 other auto freeze dangerous stress
This case tests truncate/collapse range race. If
the race occurs, it will trigger BUG_ON.

Signed-off-by: Xing Gu <gux.fnst@cn.fujitsu.com>
---
 tests/generic/039     | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/039.out |  1 +
 tests/generic/group   |  1 +
 3 files changed, 77 insertions(+)
 create mode 100755 tests/generic/039
 create mode 100644 tests/generic/039.out