Message ID | CA+EzBbB5t5bSZ0hB6Y1jkxTgOLOv809Jbbcx7w5eK_XS_FPG7w@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | fstest: CrashMonkey tests ported to xfstest |
On Mon, Oct 29, 2018 at 4:51 PM Jayashree Mohan <jayashree2912@gmail.com> wrote: > > Hi all, > > As discussed previously, I have ported a few CrashMonkey tests into > xfstest test-case for your review. I have written a patch for testing > simple "hard link" behaviour. This patch batches 37 test cases > generated by CrashMonkey into one xfstest test, all of them testing > hard-link behaviour. These 37 tests have been grouped based on the > similarities; there are comments in the patch to > describe what each group is testing for. On a high-level, we are > testing the creation of hard links between files in same directory and > files across two directories, while allowing fsync of either the files > involved, their parent directories, or unrelated sibling files. > > We aim to convert the other CrashMonkey tests in the same way, batched > by the type of file-system operation we test. This particular test > took 10 seconds to run on my VM (Ubuntu 16.04, 2GB RAM, single core) > on 100MB scratch partition. I have added per test case timer in this > patch, which shows each test case takes about 0.25 seconds (to run, > flakey_remount, test and cleanup). This test passes clean on ext4, xfs > and F2FS. It will show three failed cases as of kernel 4.16 on btrfs > (I think it's not yet patched on 4.19, so you should see the failure > on the newest kernel as well). > > Thinking more about it, none of the CrashMonkey test cases check for > resource exhaustion and hence we don't need more than 100MB of scratch > device. If the per-test external overhead due to fsck/scrub etc > depends on the scratch partition size, we can speed up the CrashMonkey > tests by allowing a new device like scratch(much smaller in size - > about 100-200MB). Additionally, our tests also do not need fsck to be > run at the end. Is there a way to skip performing fsck after each test > (if its something configurable in the test file) ? Just use _require_scratch_nocheck() instead of _require_scratch() For this type of tests, I think it's a good idea to let fsck run. Even if all of the links are persisted, the log/journal replay might have caused metadata inconsistencies in the fs for example - this was true for many cases I fixed over the years in btrfs. Even if fsck doesn't report any problem now, it's still good to run it, to help prevent future regressions. Plus this test creates a very small fs, it's not like fsck will take a significant time to run. So for all these reasons I would unmount and fsck after each test. The test looks good to me, only some coding style comments below. > > If this sort of patch seems fine to you, I can go ahead and port the > other CrashMonkey tests in the same way. After batching, I hope there > would be around 10-12 test files like the one attached here. > > Patch below: > --- > > diff --git a/tests/generic/517-link b/tests/generic/517-link > new file mode 100755 > index 0000000..791b6c0 > --- /dev/null > +++ b/tests/generic/517-link > @@ -0,0 +1,162 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. > +# > +# FS QA Test 517-link > +# > +# Test if we create a hard link to a file and persist either of the > files, all the names persist. > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! 
> +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + _cleanup_flakey > + cd / > + rm -f $tmp.* > +} > + > +init_start_time=$(date +%s.%N) > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. ./common/dmflakey > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > +_supported_fs generic > +_supported_os Linux > +_require_scratch > +_require_dm_target flakey > + > +# initialize scratch device > +_scratch_mkfs >>$seqres.full 2>&1 > +_require_metadata_journaling $SCRATCH_DEV > +_init_flakey > +_mount_flakey > +init_end_time=$(date +%s.%N) > +init_time_elapsed=$(echo "$init_end_time-$init_start_time" | bc) > +echo "Initialization time = $init_time_elapsed" >> $seqres.full > + > +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' > +test_num=0 > +total_time=0 > + > +cleanup() > +{ > + rm -rf $SCRATCH_MNT/* > + sync > +} > + > +# create a hard link $2 to file $1, and fsync $3, followed by power-cu > +test_link() > +{ > + start_time=$(date +%s.%N) > + test_num=$((test_num + 1)) > + sibling=0 > + before="" > + after="" Several of these are local variable, so prefix them with the 'local' keyword. > + echo -ne "\n=== Test $test_num : link $1 $2 " | tee -a $seqres.full You need to use the _filter_scratch filter here and then adjust the golden output (the .out file). Because if not, then the test will fail for anyone with a different mount path for the scratch device. Yours is /mnt/scratch but mine is /home/fdmanana/scratch_1 for example. > + > + # now execute the workload > + mkdir -p "${1%/*}" && mkdir -p "${2%/*}" && touch "$1" > + ln $1 $2 > + > + if [ -z "$3" ] > + then The fstests coding style is like: if [ whatever ]; then fi This applies to all if statements below. > + echo -ne " with sync ===\n" > + sync > + before=`stat "$stat_opt" $1` > + else > + echo " with fsync $3 ===" > + > + # If the file being persisted is a sibling, create it first > + if [ ! -f $3 ] > + then > + sibling=1 > + touch $3 > + fi > + > + $XFS_IO_PROG -c "fsync" $3 | _filter_xfs_io No need to use the filter. The fsync command, unlike pwrite for example, doesn't output rates and time. > + > + if [ $sibling -ne 1 ] > + then > + before=`stat "$stat_opt" $1` > + fi > + fi > + > + _flakey_drop_and_remount | tee -a $seqres.full > + > + if [ -f $1 ] > + then > + after=`stat "$stat_opt" $1` > + fi > + > + if [ "$before" != "$after" ] && [ $sibling -ne 1 ] > + then > + echo "Before: $before" | tee -a $seqres.full > + echo "After: $after" | tee -a $seqres.full > + fi > + > + cleanup > + end_time=$(date +%s.%N) > + time_elapsed=$(echo "$end_time-$start_time" | bc) > + echo " Elapsed time : $time_elapsed" >> $seqres.full > + total_time=$(echo "$total_time+$time_elapsed" | bc) Add one space before the + and another after, for readability and to conform with the code base. 
> + echo " Total time : $total_time" >> $seqres.full > +} > + > +# run the link test for different combinations > + > +test_start_time=$(date +%s.%N) > +# Group 1: Both files within root directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar; do > + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar > + > +# Group 2: Create hard link in a sub directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar > + > +# Group 3: Create hard link in parent directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar > + > +# Group 4: Both files within a directory other than root > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/A $SCRATCH_MNT/A/bar > $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar > + > +#Group 5: Exercise name reuse : Link file in sub-directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar > + > +#Group 6: Exercise name reuse : Link file in parent directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar > + > +test_end_time=$(date +%s.%N) > +test_time_elapsed=$(echo "$test_end_time-$test_start_time" | bc) > +echo "Test Elapsed time : $test_time_elapsed" >> $seqres.full > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out > new file mode 100644 > index 0000000..381b644 > --- /dev/null > +++ b/tests/generic/517-link.out > @@ -0,0 +1,75 @@ > +QA output created by 517-link > + > +=== Test 1 : link /mnt/scratch/foo /mnt/scratch/bar with fsync > /mnt/scratch === > + > +=== Test 2 : link /mnt/scratch/foo /mnt/scratch/bar with fsync > /mnt/scratch/foo === > + > +=== Test 3 : link /mnt/scratch/foo /mnt/scratch/bar with fsync > /mnt/scratch/bar === > + > +=== Test 4 : link /mnt/scratch/foo /mnt/scratch/bar with sync === > + > +=== Test 5 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch === > + > +=== Test 6 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/foo === > + > +=== Test 7 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/bar === > + > +=== Test 8 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A === > + > +=== Test 9 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 10 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 11 : link /mnt/scratch/foo /mnt/scratch/A/bar with sync === > + > +=== Test 12 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch === > + > +=== Test 13 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/foo === > + > +=== Test 14 : link /mnt/scratch/A/foo 
/mnt/scratch/bar with fsync > /mnt/scratch/bar === > + > +=== Test 15 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/A === > + > +=== Test 16 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 17 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 18 : link /mnt/scratch/A/foo /mnt/scratch/bar with sync === > + > +=== Test 19 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch === > + > +=== Test 20 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A === > + > +=== Test 21 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 22 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 23 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with sync === > + > +=== Test 24 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch === > + > +=== Test 25 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/foo === > + > +=== Test 26 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/bar === > + > +=== Test 27 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/A === > + > +=== Test 28 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 29 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 30 : link /mnt/scratch/bar /mnt/scratch/A/bar with sync === > + > +=== Test 31 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch === > + > +=== Test 32 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/foo === > + > +=== Test 33 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/bar === > + > +=== Test 34 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/A === > + > +=== Test 35 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 36 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 37 : link /mnt/scratch/A/bar /mnt/scratch/bar with sync === > diff --git a/tests/generic/group b/tests/generic/group > index 47de978..dc04152 100644 > --- a/tests/generic/group > +++ b/tests/generic/group > @@ -519,3 +519,4 @@ > 514 auto quick clone > 515 auto quick clone > 516 auto quick dedupe clone > +517-link crash Should also be in the 'quick' and 'log' groups. Not sure if it's worth adding the new 'crash' group. The test isn't fundamentally different from other power failure tests. Thanks. > > > Thanks, > Jayashree Mohan
Hi Filipe, Thanks for the feedback on the patch. Will fix the coding style as you suggested. > For this type of tests, I think it's a good idea to let fsck run. > > Even if all of the links are persisted, the log/journal replay might > have caused metadata inconsistencies in the fs for example - this was > true for many cases I fixed over the years in btrfs. > Even if fsck doesn't report any problem now, it's still good to run > it, to help prevent future regressions. > > Plus this test creates a very small fs, it's not like fsck will take a > significant time to run. > So for all these reasons I would unmount and fsck after each test. Originally, there are 300 odd crash-consistency tests generated by CrashMonkey. To run fsck after each test, we will have to convert each of these tests into an equivalent xfstest test-case. We previously had a discussion about this, on the thread here ( https://www.spinics.net/lists/fstests/msg10718.html ) and the suggestion was to batch similar tests together to reduce the external per-test overhead due to scrub/fsck. Additionally, batching similar tests will result in around 15 new test cases that could be added to the 'quick group', as opposed to adding 300 new tests. However, if you still recommend that fsck be run after each test, I can submit patches for 300 individual test cases. Let me know which one is preferable. Thanks, Jayashree.
On Fri, Nov 02, 2018 at 03:39:51PM -0500, Jayashree Mohan wrote: > Hi Filipe, > > Thanks for the feedback on the patch. Will fix the coding style as you > suggested. > > > For this type of tests, I think it's a good idea to let fsck run. > > > > Even if all of the links are persisted, the log/journal replay might > > have caused metadata inconsistencies in the fs for example - this was > > true for many cases I fixed over the years in btrfs. > > Even if fsck doesn't report any problem now, it's still good to run > > it, to help prevent future regressions. > > > > Plus this test creates a very small fs, it's not like fsck will take a > > significant time to run. > > So for all these reasons I would unmount and fsck after each test. This looks reasonable to me. > > Originally, there are 300 odd crash-consistency tests generated by > CrashMonkey. To run fsck after each test, we will have to convert each > of these tests into an equivalent xfstest test-case. We previously had > a discussion about this, on the thread here ( > https://www.spinics.net/lists/fstests/msg10718.html ) and the > suggestion was to batch similar tests together to reduce the external > per-test overhead due to scrub/fsck. You could batch similar tests together but still do fsck after each sub-test by calling _check_scratch_fs manually, and call _require_scratch_nocheck to indicate there's no need to do fsck at the end of test. > Additionally, batching similar tests will result in around 15 new test > cases that could be added to the 'quick group', as opposed to adding > 300 new tests. I think we could batch similar tests and create relatively small fs (e.g. 256M, as btrfs defaults to mixed mode if fs < 256M, and btrfs folks wanted to test non-mixed mode by default) and run fsck after each sub-test first, then see how long the tests take. Thanks, Eryu > > However, if you still recommend that fsck be run after each test, I > can submit patches for 300 individual test cases. Let me know which > one is preferable. > > Thanks, > Jayashree.
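The pattern Eryu describes (keep the sub-tests batched, declare _require_scratch_nocheck so the harness skips its final fsck, and call _check_scratch_fs by hand after each sub-test) would look roughly like this minimal sketch; helper names are the standard fstests/dm-flakey ones and the workload itself is elided:

_require_scratch_nocheck
_require_dm_target flakey

_scratch_mkfs >> $seqres.full 2>&1
_require_metadata_journaling $SCRATCH_DEV
_init_flakey

run_subtest()
{
	_mount_flakey
	# ... create the hard link, fsync the chosen file, then simulate
	# the power failure ...
	_flakey_drop_and_remount
	# ... verify the expected names and link counts survived ...
	_unmount_flakey
	# fsck the scratch fs sitting underneath dm-flakey after every sub-test
	_check_scratch_fs $FLAKEY_DEV || _fatal "fsck failed"
}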
On Mon, Oct 29, 2018 at 11:50:39AM -0500, Jayashree Mohan wrote: > Hi all, > > As discussed previously, I have ported a few CrashMonkey tests into > xfstest test-case for your review. I have written a patch for testing > simple "hard link" behaviour. This patch batches 37 test cases > generated by CrashMonkey into one xfstest test, all of them testing > hard-link behaviour. These 37 tests have been grouped based on the > similarities; there are comments in the patch to > describe what each group is testing for. On a high-level, we are > testing the creation of hard links between files in same directory and > files across two directories, while allowing fsync of either the files > involved, their parent directories, or unrelated sibling files. > > We aim to convert the other CrashMonkey tests in the same way, batched > by the type of file-system operation we test. This particular test > took 10 seconds to run on my VM (Ubuntu 16.04, 2GB RAM, single core) > on 100MB scratch partition. I have added per test case timer in this > patch, which shows each test case takes about 0.25 seconds (to run, > flakey_remount, test and cleanup). This test passes clean on ext4, xfs > and F2FS. It will show three failed cases as of kernel 4.16 on btrfs > (I think it's not yet patched on 4.19, so you should see the failure > on the newest kernel as well). > > Thinking more about it, none of the CrashMonkey test cases check for > resource exhaustion and hence we don't need more than 100MB of scratch > device. If the per-test external overhead due to fsck/scrub etc > depends on the scratch partition size, we can speed up the CrashMonkey > tests by allowing a new device like scratch(much smaller in size - > about 100-200MB). Additionally, our tests also do not need fsck to be > run at the end. Is there a way to skip performing fsck after each test > (if its something configurable in the test file) ? > > If this sort of patch seems fine to you, I can go ahead and port the > other CrashMonkey tests in the same way. After batching, I hope there > would be around 10-12 test files like the one attached here. > > Patch below: > --- > > diff --git a/tests/generic/517-link b/tests/generic/517-link > new file mode 100755 > index 0000000..791b6c0 > --- /dev/null > +++ b/tests/generic/517-link > @@ -0,0 +1,162 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. > +# > +# FS QA Test 517-link > +# > +# Test if we create a hard link to a file and persist either of the > files, all the names persist. Your mail agent seems to wrap the line and makes the patch unable to be applied. There're many other places in the patch. Could you please take a look? > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + _cleanup_flakey Please use single tab to do indention not 4 spaces. > + cd / > + rm -f $tmp.* > +} > + > +init_start_time=$(date +%s.%N) > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. 
./common/dmflakey > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > +_supported_fs generic > +_supported_os Linux > +_require_scratch > +_require_dm_target flakey > + > +# initialize scratch device > +_scratch_mkfs >>$seqres.full 2>&1 > +_require_metadata_journaling $SCRATCH_DEV > +_init_flakey > +_mount_flakey > +init_end_time=$(date +%s.%N) > +init_time_elapsed=$(echo "$init_end_time-$init_start_time" | bc) > +echo "Initialization time = $init_time_elapsed" >> $seqres.full > + > +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' > +test_num=0 > +total_time=0 > + > +cleanup() Hmm, this could be confusing people, as we already have a _cleanup() function. But I think it'd be better to re-create the fs for each test so every sub-test starts with a clean filesystem and there's no need to do cleanup after each sub-test. (Yeah, this may consume more test time.) Thanks, Eryu > +{ > + rm -rf $SCRATCH_MNT/* > + sync > +} > + > +# create a hard link $2 to file $1, and fsync $3, followed by power-cu > +test_link() > +{ > + start_time=$(date +%s.%N) > + test_num=$((test_num + 1)) > + sibling=0 > + before="" > + after="" > + echo -ne "\n=== Test $test_num : link $1 $2 " | tee -a $seqres.full > + > + # now execute the workload > + mkdir -p "${1%/*}" && mkdir -p "${2%/*}" && touch "$1" > + ln $1 $2 > + > + if [ -z "$3" ] > + then > + echo -ne " with sync ===\n" > + sync > + before=`stat "$stat_opt" $1` > + else > + echo " with fsync $3 ===" > + > + # If the file being persisted is a sibling, create it first > + if [ ! -f $3 ] > + then > + sibling=1 > + touch $3 > + fi > + > + $XFS_IO_PROG -c "fsync" $3 | _filter_xfs_io > + > + if [ $sibling -ne 1 ] > + then > + before=`stat "$stat_opt" $1` > + fi > + fi > + > + _flakey_drop_and_remount | tee -a $seqres.full > + > + if [ -f $1 ] > + then > + after=`stat "$stat_opt" $1` > + fi > + > + if [ "$before" != "$after" ] && [ $sibling -ne 1 ] > + then > + echo "Before: $before" | tee -a $seqres.full > + echo "After: $after" | tee -a $seqres.full > + fi > + > + cleanup > + end_time=$(date +%s.%N) > + time_elapsed=$(echo "$end_time-$start_time" | bc) > + echo " Elapsed time : $time_elapsed" >> $seqres.full > + total_time=$(echo "$total_time+$time_elapsed" | bc) > + echo " Total time : $total_time" >> $seqres.full > +} > + > +# run the link test for different combinations > + > +test_start_time=$(date +%s.%N) > +# Group 1: Both files within root directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar; do > + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar > + > +# Group 2: Create hard link in a sub directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar > + > +# Group 3: Create hard link in parent directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar > + > +# Group 4: Both files within a directory other than root > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/A $SCRATCH_MNT/A/bar > $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar > + > 
+#Group 5: Exercise name reuse : Link file in sub-directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar > + > +#Group 6: Exercise name reuse : Link file in parent directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar > $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar > + > +test_end_time=$(date +%s.%N) > +test_time_elapsed=$(echo "$test_end_time-$test_start_time" | bc) > +echo "Test Elapsed time : $test_time_elapsed" >> $seqres.full > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out > new file mode 100644 > index 0000000..381b644 > --- /dev/null > +++ b/tests/generic/517-link.out > @@ -0,0 +1,75 @@ > +QA output created by 517-link > + > +=== Test 1 : link /mnt/scratch/foo /mnt/scratch/bar with fsync > /mnt/scratch === > + > +=== Test 2 : link /mnt/scratch/foo /mnt/scratch/bar with fsync > /mnt/scratch/foo === > + > +=== Test 3 : link /mnt/scratch/foo /mnt/scratch/bar with fsync > /mnt/scratch/bar === > + > +=== Test 4 : link /mnt/scratch/foo /mnt/scratch/bar with sync === > + > +=== Test 5 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch === > + > +=== Test 6 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/foo === > + > +=== Test 7 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/bar === > + > +=== Test 8 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A === > + > +=== Test 9 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 10 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 11 : link /mnt/scratch/foo /mnt/scratch/A/bar with sync === > + > +=== Test 12 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch === > + > +=== Test 13 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/foo === > + > +=== Test 14 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/bar === > + > +=== Test 15 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/A === > + > +=== Test 16 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 17 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 18 : link /mnt/scratch/A/foo /mnt/scratch/bar with sync === > + > +=== Test 19 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch === > + > +=== Test 20 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A === > + > +=== Test 21 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 22 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 23 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with sync === > + > +=== Test 24 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch === > + > +=== Test 25 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/foo === > + > +=== Test 26 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/bar === > + > +=== Test 27 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/A === > + > +=== 
Test 28 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 29 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 30 : link /mnt/scratch/bar /mnt/scratch/A/bar with sync === > + > +=== Test 31 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch === > + > +=== Test 32 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/foo === > + > +=== Test 33 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/bar === > + > +=== Test 34 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/A === > + > +=== Test 35 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/A/bar === > + > +=== Test 36 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync > /mnt/scratch/A/foo === > + > +=== Test 37 : link /mnt/scratch/A/bar /mnt/scratch/bar with sync === > diff --git a/tests/generic/group b/tests/generic/group > index 47de978..dc04152 100644 > --- a/tests/generic/group > +++ b/tests/generic/group > @@ -519,3 +519,4 @@ > 514 auto quick clone > 515 auto quick clone > 516 auto quick dedupe clone > +517-link crash > > > Thanks, > Jayashree Mohan
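Eryu's last suggestion, re-creating the filesystem for every sub-test instead of cleaning it, would amount to a reset helper along these lines (a rough sketch using the standard dm-flakey helpers; Jayashree measures the extra mkfs cost of this approach in the next message):

_reset_fs()
{
	_unmount_flakey
	_cleanup_flakey
	# start every sub-test from a freshly made filesystem
	_scratch_mkfs >> $seqres.full 2>&1
	_init_flakey
	_mount_flakey
}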
Hi Eryu and Filipe, I have incorporated the coding style suggested, and renamed the cleanup function. However, creating a clean fs image after each sub test is resulting in about 10-15s of additional overhead overall. I instead clean up the working directory and unmount. The time for the tests varies between 12-15 seconds. Please find the patch below — diff --git a/tests/generic/517-link b/tests/generic/517-link new file mode 100755 index 0000000..ea5c5b7 --- /dev/null +++ b/tests/generic/517-link @@ -0,0 +1,164 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. +# +# FS QA Test 517-link +# +# Test case created by CrashMonkey +# +# Test if we create a hard link to a file and persist either of the files, all the names persist. +# +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + _cleanup_flakey + cd / + rm -f $tmp.* +} + +init_start_time=$(date +%s.%N) +# get standard environment, filters and checks +. ./common/rc +. ./common/filter +. ./common/dmflakey + +# remove previous $seqres.full before test +rm -f $seqres.full + +# real QA test starts here +_supported_fs generic +_supported_os Linux +_require_scratch_nocheck +_require_dm_target flakey + +# initialize scratch device +_scratch_mkfs >> $seqres.full 2>&1 +_require_metadata_journaling $SCRATCH_DEV +_init_flakey +init_end_time=$(date +%s.%N) +init_time_elapsed=$(echo "$init_end_time - $init_start_time" | bc) +echo "Initialization time = $init_time_elapsed" >> $seqres.full + +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' +test_num=0 +total_time=0 + +_clean_dir() +{ + _mount_flakey + rm -rf $SCRATCH_MNT/* + sync + _unmount_flakey +} + +# create a hard link $2 to file $1, and fsync $3, followed by power-cut +test_link() +{ + test_num=$((test_num + 1)) + local start_time=$(date +%s.%N) + local sibling=0 + local before="" + local after="" + echo -ne "\n=== Test $test_num : link $1 $2 " | _filter_scratch + _mount_flakey + + # now execute the workload + mkdir -p "${1%/*}" && mkdir -p "${2%/*}" && touch "$1" + ln $1 $2 + + if [ -z "$3" ]; then + echo -ne " with sync ===\n" + sync + before=`stat "$stat_opt" $1` + else + echo " with fsync $3 ===" | _filter_scratch + + # If the file being persisted is a sibling, create it first + if [ ! -f $3 ]; then + sibling=1 + touch $3 + fi + + $XFS_IO_PROG -c "fsync" $3 + + if [ $sibling -ne 1 ]; then + before=`stat "$stat_opt" $1` + fi + fi + + _flakey_drop_and_remount | tee -a $seqres.full + + if [ -f $1 ]; then + after=`stat "$stat_opt" $1` + fi + + if [ "$before" != "$after" ] && [ $sibling -ne 1 ]; then + echo "Before: $before" | tee -a $seqres.full + echo "After: $after" | tee -a $seqres.full + fi + + _unmount_flakey + _check_scratch_fs $FLAKEY_DEV + [ $? 
-ne 0 ] && _fatal "fsck failed" + end_time=$(date +%s.%N) + time_elapsed=$(echo "$end_time - $start_time" | bc) + echo " Elapsed time : $time_elapsed" >> $seqres.full + total_time=$(echo "$total_time + $time_elapsed" | bc) + echo " Total time : $total_time" >> $seqres.full + _clean_dir +} + +# run the link test for different combinations + +test_start_time=$(date +%s.%N) +# Group 1: Both files within root directory +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar; do + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar $file_name +done +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar + +# Group 2: Create hard link in a sub directory +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name +done +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar + +# Group 3: Create hard link in parent directory +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar $file_name +done +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar + +# Group 4: Both files within a directory other than root +for file_name in $SCRATCH_MNT $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar $file_name +done +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar + +#Group 5: Exercise name reuse : Link file in sub-directory +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do + test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar $file_name +done +test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar + +#Group 6: Exercise name reuse : Link file in parent directory +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do + test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar $file_name +done +test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar + +test_end_time=$(date +%s.%N) +test_time_elapsed=$(echo "$test_end_time-$test_start_time" | bc) +echo "Test Elapsed time : $test_time_elapsed" >> $seqres.full + +# success, all done +status=0 +exit diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out new file mode 100644 index 0000000..9555860 --- /dev/null +++ b/tests/generic/517-link.out @@ -0,0 +1,75 @@ +QA output created by 517-link + +=== Test 1 : link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT === + +=== Test 2 : link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === + +=== Test 3 : link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === + +=== Test 4 : link SCRATCH_MNT/foo SCRATCH_MNT/bar with sync === + +=== Test 5 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === + +=== Test 6 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/foo === + +=== Test 7 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/bar === + +=== Test 8 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === + +=== Test 9 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === + +=== Test 10 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === + +=== Test 11 : link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with sync === + +=== Test 12 : link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT === + +=== Test 13 : link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === + +=== Test 14 : link SCRATCH_MNT/A/foo 
SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === + +=== Test 15 : link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A === + +=== Test 16 : link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/bar === + +=== Test 17 : link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/foo === + +=== Test 18 : link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with sync === + +=== Test 19 : link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === + +=== Test 20 : link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === + +=== Test 21 : link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === + +=== Test 22 : link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === + +=== Test 23 : link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with sync === + +=== Test 24 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === + +=== Test 25 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/foo === + +=== Test 26 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/bar === + +=== Test 27 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === + +=== Test 28 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === + +=== Test 29 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === + +=== Test 30 : link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with sync === + +=== Test 31 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT === + +=== Test 32 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === + +=== Test 33 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === + +=== Test 34 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A === + +=== Test 35 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/bar === + +=== Test 36 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/foo === + +=== Test 37 : link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with sync === diff --git a/tests/generic/group b/tests/generic/group index 47de978..67e9108 100644 --- a/tests/generic/group +++ b/tests/generic/group @@ -519,3 +519,4 @@ 514 auto quick clone 515 auto quick clone 516 auto quick dedupe clone +517-link quick log > On Nov 4, 2018, at 10:38 AM, Eryu Guan <guaneryu@gmail.com> wrote: > > On Mon, Oct 29, 2018 at 11:50:39AM -0500, Jayashree Mohan wrote: >> Hi all, >> >> As discussed previously, I have ported a few CrashMonkey tests into >> xfstest test-case for your review. I have written a patch for testing >> simple "hard link" behaviour. This patch batches 37 test cases >> generated by CrashMonkey into one xfstest test, all of them testing >> hard-link behaviour. These 37 tests have been grouped based on the >> similarities; there are comments in the patch to >> describe what each group is testing for. On a high-level, we are >> testing the creation of hard links between files in same directory and >> files across two directories, while allowing fsync of either the files >> involved, their parent directories, or unrelated sibling files. >> >> We aim to convert the other CrashMonkey tests in the same way, batched >> by the type of file-system operation we test. This particular test >> took 10 seconds to run on my VM (Ubuntu 16.04, 2GB RAM, single core) >> on 100MB scratch partition. I have added per test case timer in this >> patch, which shows each test case takes about 0.25 seconds (to run, >> flakey_remount, test and cleanup). This test passes clean on ext4, xfs >> and F2FS. 
It will show three failed cases as of kernel 4.16 on btrfs >> (I think it's not yet patched on 4.19, so you should see the failure >> on the newest kernel as well). >> >> Thinking more about it, none of the CrashMonkey test cases check for >> resource exhaustion and hence we don't need more than 100MB of scratch >> device. If the per-test external overhead due to fsck/scrub etc >> depends on the scratch partition size, we can speed up the CrashMonkey >> tests by allowing a new device like scratch(much smaller in size - >> about 100-200MB). Additionally, our tests also do not need fsck to be >> run at the end. Is there a way to skip performing fsck after each test >> (if its something configurable in the test file) ? >> >> If this sort of patch seems fine to you, I can go ahead and port the >> other CrashMonkey tests in the same way. After batching, I hope there >> would be around 10-12 test files like the one attached here. >> >> Patch below: >> --- >> >> diff --git a/tests/generic/517-link b/tests/generic/517-link >> new file mode 100755 >> index 0000000..791b6c0 >> --- /dev/null >> +++ b/tests/generic/517-link >> @@ -0,0 +1,162 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. >> +# >> +# FS QA Test 517-link >> +# >> +# Test if we create a hard link to a file and persist either of the >> files, all the names persist. > > Your mail agent seems to wrap the line and makes the patch unable to be > applied. There're many other places in the patch. Could you please take > a look? > >> +# >> +seq=`basename $0` >> +seqres=$RESULT_DIR/$seq >> +echo "QA output created by $seq" >> + >> +here=`pwd` >> +tmp=/tmp/$$ >> +status=1 # failure is the default! >> +trap "_cleanup; exit \$status" 0 1 2 3 15 >> + >> +_cleanup() >> +{ >> + _cleanup_flakey > > Please use single tab to do indention not 4 spaces. > >> + cd / >> + rm -f $tmp.* >> +} >> + >> +init_start_time=$(date +%s.%N) >> +# get standard environment, filters and checks >> +. ./common/rc >> +. ./common/filter >> +. ./common/dmflakey >> + >> +# remove previous $seqres.full before test >> +rm -f $seqres.full >> + >> +# real QA test starts here >> +_supported_fs generic >> +_supported_os Linux >> +_require_scratch >> +_require_dm_target flakey >> + >> +# initialize scratch device >> +_scratch_mkfs >>$seqres.full 2>&1 >> +_require_metadata_journaling $SCRATCH_DEV >> +_init_flakey >> +_mount_flakey >> +init_end_time=$(date +%s.%N) >> +init_time_elapsed=$(echo "$init_end_time-$init_start_time" | bc) >> +echo "Initialization time = $init_time_elapsed" >> $seqres.full >> + >> +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' >> +test_num=0 >> +total_time=0 >> + >> +cleanup() > > Hmm, this could be confusing people, as we already have a _cleanup() > function. But I think it'd be better to re-create the fs for each test > so every sub-test starts with a clean filesystem and there's no need to > do cleanup after each sub-test. (Yeah, this may consume more test time.) 
> > Thanks, > Eryu > >> +{ >> + rm -rf $SCRATCH_MNT/* >> + sync >> +} >> + >> +# create a hard link $2 to file $1, and fsync $3, followed by power-cu >> +test_link() >> +{ >> + start_time=$(date +%s.%N) >> + test_num=$((test_num + 1)) >> + sibling=0 >> + before="" >> + after="" >> + echo -ne "\n=== Test $test_num : link $1 $2 " | tee -a $seqres.full >> + >> + # now execute the workload >> + mkdir -p "${1%/*}" && mkdir -p "${2%/*}" && touch "$1" >> + ln $1 $2 >> + >> + if [ -z "$3" ] >> + then >> + echo -ne " with sync ===\n" >> + sync >> + before=`stat "$stat_opt" $1` >> + else >> + echo " with fsync $3 ===" >> + >> + # If the file being persisted is a sibling, create it first >> + if [ ! -f $3 ] >> + then >> + sibling=1 >> + touch $3 >> + fi >> + >> + $XFS_IO_PROG -c "fsync" $3 | _filter_xfs_io >> + >> + if [ $sibling -ne 1 ] >> + then >> + before=`stat "$stat_opt" $1` >> + fi >> + fi >> + >> + _flakey_drop_and_remount | tee -a $seqres.full >> + >> + if [ -f $1 ] >> + then >> + after=`stat "$stat_opt" $1` >> + fi >> + >> + if [ "$before" != "$after" ] && [ $sibling -ne 1 ] >> + then >> + echo "Before: $before" | tee -a $seqres.full >> + echo "After: $after" | tee -a $seqres.full >> + fi >> + >> + cleanup >> + end_time=$(date +%s.%N) >> + time_elapsed=$(echo "$end_time-$start_time" | bc) >> + echo " Elapsed time : $time_elapsed" >> $seqres.full >> + total_time=$(echo "$total_time+$time_elapsed" | bc) >> + echo " Total time : $total_time" >> $seqres.full >> +} >> + >> +# run the link test for different combinations >> + >> +test_start_time=$(date +%s.%N) >> +# Group 1: Both files within root directory >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar; do >> + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar $file_name >> +done >> +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar >> + >> +# Group 2: Create hard link in a sub directory >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar >> $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do >> + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name >> +done >> +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar >> + >> +# Group 3: Create hard link in parent directory >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar >> $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do >> + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar $file_name >> +done >> +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar >> + >> +# Group 4: Both files within a directory other than root >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/A $SCRATCH_MNT/A/bar >> $SCRATCH_MNT/A/foo; do >> + test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar $file_name >> +done >> +test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar >> + >> +#Group 5: Exercise name reuse : Link file in sub-directory >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar >> $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do >> + test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar $file_name >> +done >> +test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar >> + >> +#Group 6: Exercise name reuse : Link file in parent directory >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar >> $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do >> + test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar $file_name >> +done >> +test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar >> + >> +test_end_time=$(date +%s.%N) >> +test_time_elapsed=$(echo "$test_end_time-$test_start_time" | bc) >> +echo "Test Elapsed time : $test_time_elapsed" >> $seqres.full >> + >> +# success, all done >> 
+status=0 >> +exit >> diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out >> new file mode 100644 >> index 0000000..381b644 >> --- /dev/null >> +++ b/tests/generic/517-link.out >> @@ -0,0 +1,75 @@ >> +QA output created by 517-link >> + >> +=== Test 1 : link /mnt/scratch/foo /mnt/scratch/bar with fsync >> /mnt/scratch === >> + >> +=== Test 2 : link /mnt/scratch/foo /mnt/scratch/bar with fsync >> /mnt/scratch/foo === >> + >> +=== Test 3 : link /mnt/scratch/foo /mnt/scratch/bar with fsync >> /mnt/scratch/bar === >> + >> +=== Test 4 : link /mnt/scratch/foo /mnt/scratch/bar with sync === >> + >> +=== Test 5 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch === >> + >> +=== Test 6 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/foo === >> + >> +=== Test 7 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/bar === >> + >> +=== Test 8 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/A === >> + >> +=== Test 9 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/A/bar === >> + >> +=== Test 10 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/A/foo === >> + >> +=== Test 11 : link /mnt/scratch/foo /mnt/scratch/A/bar with sync === >> + >> +=== Test 12 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync >> /mnt/scratch === >> + >> +=== Test 13 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync >> /mnt/scratch/foo === >> + >> +=== Test 14 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync >> /mnt/scratch/bar === >> + >> +=== Test 15 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync >> /mnt/scratch/A === >> + >> +=== Test 16 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync >> /mnt/scratch/A/bar === >> + >> +=== Test 17 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync >> /mnt/scratch/A/foo === >> + >> +=== Test 18 : link /mnt/scratch/A/foo /mnt/scratch/bar with sync === >> + >> +=== Test 19 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch === >> + >> +=== Test 20 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/A === >> + >> +=== Test 21 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/A/bar === >> + >> +=== Test 22 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync >> /mnt/scratch/A/foo === >> + >> +=== Test 23 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with sync === >> + >> +=== Test 24 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync >> /mnt/scratch === >> + >> +=== Test 25 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync >> /mnt/scratch/foo === >> + >> +=== Test 26 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync >> /mnt/scratch/bar === >> + >> +=== Test 27 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync >> /mnt/scratch/A === >> + >> +=== Test 28 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync >> /mnt/scratch/A/bar === >> + >> +=== Test 29 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync >> /mnt/scratch/A/foo === >> + >> +=== Test 30 : link /mnt/scratch/bar /mnt/scratch/A/bar with sync === >> + >> +=== Test 31 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync >> /mnt/scratch === >> + >> +=== Test 32 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync >> /mnt/scratch/foo === >> + >> +=== Test 33 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync >> /mnt/scratch/bar === >> + >> +=== Test 34 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync >> /mnt/scratch/A === >> + >> +=== Test 35 : link /mnt/scratch/A/bar /mnt/scratch/bar with 
fsync >> /mnt/scratch/A/bar === >> + >> +=== Test 36 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync >> /mnt/scratch/A/foo === >> + >> +=== Test 37 : link /mnt/scratch/A/bar /mnt/scratch/bar with sync === >> diff --git a/tests/generic/group b/tests/generic/group >> index 47de978..dc04152 100644 >> --- a/tests/generic/group >> +++ b/tests/generic/group >> @@ -519,3 +519,4 @@ >> 514 auto quick clone >> 515 auto quick clone >> 516 auto quick dedupe clone >> +517-link crash >> >> >> Thanks, >> Jayashree Mohan
On Sun, Nov 04, 2018 at 02:21:21PM -0600, Jayashree Mohan wrote: > Hi Eryu and Filipe, > > I have incorporated the coding style suggested, and renamed the cleanup function. However, creating a clean fs image after each sub test is resulting in about 10-15s of additional overhead overall. I instead clean up the working directory and unmount. The time for the tests varies between 12-15 seconds. > > Please find the patch below > > — > > diff --git a/tests/generic/517-link b/tests/generic/517-link > new file mode 100755 > index 0000000..ea5c5b7 > --- /dev/null > +++ b/tests/generic/517-link > @@ -0,0 +1,164 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. > +# > +# FS QA Test 517-link > +# > +# Test case created by CrashMonkey > +# > +# Test if we create a hard link to a file and persist either of the files, all the names persist. > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + _cleanup_flakey > + cd / > + rm -f $tmp.* > +} > + > +init_start_time=$(date +%s.%N) > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. ./common/dmflakey > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > +_supported_fs generic > +_supported_os Linux > +_require_scratch_nocheck > +_require_dm_target flakey > + > +# initialize scratch device > +_scratch_mkfs >> $seqres.full 2>&1 > +_require_metadata_journaling $SCRATCH_DEV > +_init_flakey > +init_end_time=$(date +%s.%N) > +init_time_elapsed=$(echo "$init_end_time - $init_start_time" | bc) > +echo "Initialization time = $init_time_elapsed" >> $seqres.full No timing inside the test, please. The harness times the test execution. > +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' > +test_num=0 Don't number tests. If we add an extra test, it will completely break the golden output. > +total_time=0 > + > +_clean_dir() > +{ > + _mount_flakey > + rm -rf $SCRATCH_MNT/* > + sync > + _unmount_flakey > +} Why not just _scratch_mkfs? > + > +# create a hard link $2 to file $1, and fsync $3, followed by power-cut > +test_link() > +{ > + test_num=$((test_num + 1)) > + local start_time=$(date +%s.%N) > + local sibling=0 > + local before="" > + local after="" > + echo -ne "\n=== Test $test_num : link $1 $2 " | _filter_scratch > + _mount_flakey > + Still whitespace damaged. > + # now execute the workload > + mkdir -p "${1%/*}" && mkdir -p "${2%/*}" && touch "$1" One line each. > + ln $1 $2 No checks for failure? > + > + if [ -z "$3" ]; then > + echo -ne " with sync ===\n" > + sync > + before=`stat "$stat_opt" $1` > + else > + echo " with fsync $3 ===" | _filter_scratch > + > + # If the file being persisted is a sibling, create it first > + if [ ! -f $3 ]; then > + sibling=1 > + touch $3 > + fi > + > + $XFS_IO_PROG -c "fsync" $3 > + > + if [ $sibling -ne 1 ]; then > + before=`stat "$stat_opt" $1` > + fi > + fi To tell the truth, I'd consider these two separate functions - test_link_sync() and test_link_fsync(). More below. 
> + > + _flakey_drop_and_remount | tee -a $seqres.full > + > + if [ -f $1 ]; then > + after=`stat "$stat_opt" $1` > + fi > + > + if [ "$before" != "$after" ] && [ $sibling -ne 1 ]; then > + echo "Before: $before" | tee -a $seqres.full > + echo "After: $after" | tee -a $seqres.full > + fi Why tee all the output to $seqres.full? Once the timing info is removed, it dones't contain anything extra that the normal output file... > + > + _unmount_flakey > + _check_scratch_fs $FLAKEY_DEV > + [ $? -ne 0 ] && _fatal "fsck failed" > + end_time=$(date +%s.%N) > + time_elapsed=$(echo "$end_time - $start_time" | bc) > + echo " Elapsed time : $time_elapsed" >> $seqres.full > + total_time=$(echo "$total_time + $time_elapsed" | bc) > + echo " Total time : $total_time" >> $seqres.full > + _clean_dir > +} > + > +# run the link test for different combinations > + > +test_start_time=$(date +%s.%N) > +# Group 1: Both files within root directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar; do > + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar $file_name > +done > +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar > + > +# Group 2: Create hard link in a sub directory > +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do > + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name > +done > +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar Lines too long. Basically, this is a two dimensional array of filenames. Make test_link_fsync() do: src=$SCRATCH_MNT/$1 dst=$SCRATCH_MNT/$2 fsync=$SCRATCH_MNT/$3 And define your fsync filenames and link destinations names like so (can't remember bash array notation, so paraphrasing): # group 1: Both files within root directory fnames[1]="./ foo bar" dnames[1]="foo bar" # group 2: Create hard link in a sub directory fnames[2]="./ foo bar A A/bar A/foo" dnames[2]="foo A/bar" #g3 fnames[3]="./ foo bar A A/bar A/foo" dnames[3]="A/foo bar" #g4 fnames[4]="./ A A/bar A/foo" dnames[4]="A/foo A/bar" #g5 fnames[5]="./ foo bar A A/bar A/foo" dnames[5]="bar A/bar" #g6 fnames[6]="./ foo bar A A/bar A/foo" dnames[6]="A/bar bar" for ((i=1; i<=6; i++)); do for f in $fnames[$i]; do test_link_fsync $dnames[$i] $f done test_link_sync $dnames[1] done And all this shouty, shouty noise goes away. Much easier to read, much simpler to maintain. Cheers, Dave.
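Dave notes he is paraphrasing the array notation; in actual bash the element expansion needs the ${array[index]} form, and the final sync call presumably wants $i rather than 1. A working version of his sketch, which is essentially the structure the follow-up patch adopts, could read:

# fsync targets (fnames) and link source/destination pairs (dnames) per group
# group 1: both files within the root directory
fnames[1]="./ foo bar"
dnames[1]="foo bar"
# group 2: create the hard link in a sub-directory
fnames[2]="./ foo bar A A/bar A/foo"
dnames[2]="foo A/bar"
# group 3: create the hard link in the parent directory
fnames[3]="./ foo bar A A/bar A/foo"
dnames[3]="A/foo bar"
# group 4: both files within a directory other than root
fnames[4]="./ A A/bar A/foo"
dnames[4]="A/foo A/bar"
# group 5: name reuse, link the file in a sub-directory
fnames[5]="./ foo bar A A/bar A/foo"
dnames[5]="bar A/bar"
# group 6: name reuse, link the file in the parent directory
fnames[6]="./ foo bar A A/bar A/foo"
dnames[6]="A/bar bar"

for ((i = 1; i <= 6; i++)); do
	for f in ${fnames[$i]}; do
		# dnames[i] holds "src dst"; the unquoted expansion splits it
		# into the first two arguments of test_link_fsync
		test_link_fsync ${dnames[$i]} $f
	done
	test_link_sync ${dnames[$i]}
done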
Hi Dave, Thanks for the comments. >> +init_end_time=$(date +%s.%N) >> +init_time_elapsed=$(echo "$init_end_time - $init_start_time" | bc) >> +echo "Initialization time = $init_time_elapsed" >> $seqres.full > > No timing inside the test, please. The harness times the test > execution. Sure, that was meant for the initial evaluation. I’ve removed it now. > >> +total_time=0 >> + >> +_clean_dir() >> +{ >> + _mount_flakey >> + rm -rf $SCRATCH_MNT/* >> + sync >> + _unmount_flakey >> +} > > Why not just _scratch_mkfs? I believe that to _scratch_mkfs, I must first _cleanup dm_flakey. If I replace the above snippet by _cleanup _scratch_mkfs _init_flakey The time taken for the test goes up by around 10 seconds (due to mkfs maybe). So I thought it was sufficient to remove the working directory. > > Why tee all the output to $seqres.full? Once the timing info is > removed, it dones't contain anything extra that the normal output > file… Yeah, it was required for my timing info. Removed it now. > >> +# Group 2: Create hard link in a sub directory >> +for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do >> + test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name >> +done >> +test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar > > Lines too long. > > Basically, this is a two dimensional array of filenames. > Make test_link_fsync() do: > > And all this shouty, shouty noise goes away. Much easier to read, > much simpler to maintain. Thanks! I think the code looks cleaner now. — diff --git a/tests/generic/517-link b/tests/generic/517-link new file mode 100755 index 0000000..7108300 --- /dev/null +++ b/tests/generic/517-link @@ -0,0 +1,176 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. +# +# FS QA Test 517-link +# +# Test case created by CrashMonkey +# +# Test if we create a hard link to a file and persist either of the files, all the names persist. +# +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + _cleanup_flakey + cd / + rm -f $tmp.* +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter +. ./common/dmflakey + +# remove previous $seqres.full before test +rm -f $seqres.full + +# real QA test starts here +_supported_fs generic +_supported_os Linux +_require_scratch_nocheck +_require_dm_target flakey + +# initialize scratch device +_scratch_mkfs >> $seqres.full 2>&1 +_require_metadata_journaling $SCRATCH_DEV +_init_flakey + +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' +before="" +after="" + +_clean_dir() +{ + _mount_flakey + rm -rf $SCRATCH_MNT/* + sync + _unmount_flakey +} + +_check_consistency() +{ + _flakey_drop_and_remount | tee -a $seqres.full + + if [ -f $1 ]; then + after=`stat "$stat_opt" $1` + fi + + if [ "$before" != "$after" ] && [ $2 -ne 1 ]; then + echo "Before: $before" + echo "After: $after" + fi + + _unmount_flakey + _check_scratch_fs $FLAKEY_DEV + [ $? 
-ne 0 ] && _fatal "fsck failed" +} + +# create a hard link $2 to file $1, and fsync $3, followed by power-cut +test_link_fsync() +{ + local sibling=0 + before="" + after="" + src=$SCRATCH_MNT/$1 + dest=$SCRATCH_MNT/$2 + + if [ "$3" == "./" ]; then + fsync=$SCRATCH_MNT + else + fsync=$SCRATCH_MNT/$3 + fi + + echo -ne "\n=== link $src $dest with fsync $fsync ===\n" | _filter_scratch + _mount_flakey + + # now execute the workload + mkdir -p "${src%/*}" + mkdir -p "${dest%/*}" + touch $src + ln $src $dest || _fail "Link creation failed" + + # If the file being persisted is a sibling, create it first + if [ ! -f $fsync ]; then + sibling=1 + touch $fsync + fi + + $XFS_IO_PROG -c "fsync" $fsync + + if [ $sibling -ne 1 ]; then + before=`stat "$stat_opt" $src` + fi + + _check_consistency $src $sibling + _clean_dir +} + +# create a hard link $2 to file $1, and sync, followed by power-cut +test_link_sync() +{ + src=$SCRATCH_MNT/$1 + dest=$SCRATCH_MNT/$2 + before="" + after="" + echo -ne "\n=== link $src $dest with sync ===\n" | _filter_scratch + _mount_flakey + + # now execute the workload + mkdir -p "${src%/*}" + mkdir -p "${dest%/*}" + touch $src + ln $src $dest || _fail "Link creation failed" + sync + before=`stat "$stat_opt" $src` + + _check_consistency $src 0 + _clean_dir +} + + +# Create different combinations to run the link test +# Group 0: Both files within root directory +file_names[0]="foo bar" +fsync_names[0]="./ foo bar" + +# Group 1: Create hard link in a sub directory +file_names[1]="foo A/bar" +fsync_names[1]="./ foo bar A A/bar A/foo" + +# Group 2: Create hard link in parent directory +file_names[2]="A/foo bar" +fsync_names[2]="./ foo bar A A/bar A/foo" + +# Group 3: Both files within a directory other than root +file_names[3]="A/foo A/bar" +fsync_names[3]="./ A A/bar A/foo" + +#Group 4: Exercise name reuse : Link file in sub-directory +file_names[4]="bar A/bar" +fsync_names[4]="./ foo bar A A/bar A/foo" + +#Group 5: Exercise name reuse : Link file in parent directory +file_names[5]="A/bar bar" +fsync_names[5]="./ foo bar A A/bar A/foo" + +for ((test_group=0; test_group<6; test_group++)); do + for file in ${fsync_names[$test_group]}; do + test_link_fsync ${file_names[$test_group]} $file + done + test_link_sync ${file_names[$test_group]} +done + +# success, all done +status=0 +exit diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out new file mode 100644 index 0000000..1f80b27 --- /dev/null +++ b/tests/generic/517-link.out @@ -0,0 +1,75 @@ +QA output created by 517-link + +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with sync === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/foo === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/bar === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === + +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with sync === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync 
SCRATCH_MNT/bar === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/bar === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/foo === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with sync === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === + +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with sync === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/foo === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/bar === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === + +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with sync === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/bar === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/foo === + +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with sync === diff --git a/tests/generic/group b/tests/generic/group index 47de978..67e9108 100644 --- a/tests/generic/group +++ b/tests/generic/group @@ -519,3 +519,4 @@ 514 auto quick clone 515 auto quick clone 516 auto quick dedupe clone +517-link quick log
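(For anyone following along at home, a single test like the one above is normally run on its own while iterating, assuming the usual local.config points at the test and scratch devices; the test name here matches the patch as posted:)

./check generic/517-link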
On Mon, Nov 05, 2018 at 02:16:57PM -0600, Jayashree Mohan wrote: > Hi Dave, > > Thanks for the comments. > > >> +init_end_time=$(date +%s.%N) > >> +init_time_elapsed=$(echo "$init_end_time - $init_start_time" | bc) > >> +echo "Initialization time = $init_time_elapsed" >> $seqres.full > > > > No timing inside the test, please. The harness times the test > > execution. > > Sure, that was meant for the initial evaluation. I’ve removed it now. Thanks! > > > > >> +total_time=0 > >> + > >> +_clean_dir() > >> +{ > >> + _mount_flakey > >> + rm -rf $SCRATCH_MNT/* > >> + sync > >> + _unmount_flakey > >> +} > > > > Why not just _scratch_mkfs? > > I believe that to _scratch_mkfs, I must first _cleanup dm_flakey. > If I replace the above snippet by > _cleanup > _scratch_mkfs > _init_flakey > > The time taken for the test goes up by around 10 seconds (due to > mkfs maybe). So I thought it was sufficient to remove the working > directory. Ok. Put that in a comment (comments are for explaining why, not how or what). :) Overall the new version looks a lot better. More comments below. > diff --git a/tests/generic/517-link b/tests/generic/517-link > new file mode 100755 > index 0000000..7108300 > --- /dev/null > +++ b/tests/generic/517-link Please just use "517" as the test name. At some point it was though that descriptive names for tests might be useful, but it's turned out that's not the case and we have just continued to number them. > @@ -0,0 +1,176 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved. > +# > +# FS QA Test 517-link > +# > +# Test case created by CrashMonkey > +# > +# Test if we create a hard link to a file and persist either of the files, all the names persist. > +# > +seq=`basename $0` > +seqres=$RESULT_DIR/$seq > +echo "QA output created by $seq" > + > +here=`pwd` > +tmp=/tmp/$$ > +status=1 # failure is the default! > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +_cleanup() > +{ > + _cleanup_flakey > + cd / > + rm -f $tmp.* > +} > + > +# get standard environment, filters and checks > +. ./common/rc > +. ./common/filter > +. ./common/dmflakey > + > +# remove previous $seqres.full before test > +rm -f $seqres.full > + > +# real QA test starts here > +_supported_fs generic > +_supported_os Linux > +_require_scratch_nocheck > +_require_dm_target flakey > + > +# initialize scratch device > +_scratch_mkfs >> $seqres.full 2>&1 > +_require_metadata_journaling $SCRATCH_DEV > +_init_flakey > + > +stat_opt='-c "blocks: %b size: %s inode: %i links: %h"' > +before="" > +after="" > + > +_clean_dir() > +{ > + _mount_flakey > + rm -rf $SCRATCH_MNT/* > + sync > + _unmount_flakey > +} > + > +_check_consistency() > +{ > + _flakey_drop_and_remount | tee -a $seqres.full > + Whitespace damage. > + if [ -f $1 ]; then > + after=`stat "$stat_opt" $1` > + fi > + > + if [ "$before" != "$after" ] && [ $2 -ne 1 ]; then > + echo "Before: $before" > + echo "After: $after" Whitespace. Add this to your .vimrc: " highlight whitespace damage highlight RedundantSpaces ctermbg=red guibg=red match RedundantSpaces /\s\+$\| \+\ze\t/ > + fi > + > + _unmount_flakey > + _check_scratch_fs $FLAKEY_DEV > + [ $? 
-ne 0 ] && _fatal "fsck failed" > +} > + > +# create a hard link $2 to file $1, and fsync $3, followed by power-cut > +test_link_fsync() > +{ > + local sibling=0 > + before="" > + after="" > + src=$SCRATCH_MNT/$1 > + dest=$SCRATCH_MNT/$2 > + Whitespace > + if [ "$3" == "./" ]; then > + fsync=$SCRATCH_MNT > + else > + fsync=$SCRATCH_MNT/$3 > + fi Why doesn't fsync of $SCRATCH_MNT/./ work? > + > + echo -ne "\n=== link $src $dest with fsync $fsync ===\n" | _filter_scratch > + _mount_flakey > + Whitespace > + # now execute the workload > + mkdir -p "${src%/*}" Please explain why this magic directory creation is needed - "now execute the workload" isn't going to remind me what this does in a couple of years time. > + mkdir -p "${dest%/*}" Whitespace > + touch $src > + ln $src $dest || _fail "Link creation failed" No need for _fail here. The error message will end up in the output and the test will fail the golden output match. That way it will still run all the other tests. > + Whitespace > + # If the file being persisted is a sibling, create it first > + if [ ! -f $fsync ]; then > + sibling=1 > + touch $fsync > + fi > + > + $XFS_IO_PROG -c "fsync" $fsync > + Whitespace > + if [ $sibling -ne 1 ]; then > + before=`stat "$stat_opt" $src` > + fi > + > + _check_consistency $src $sibling > + _clean_dir > +} > + > +# create a hard link $2 to file $1, and sync, followed by power-cut > +test_link_sync() > +{ > + src=$SCRATCH_MNT/$1 > + dest=$SCRATCH_MNT/$2 > + before="" > + after="" > + echo -ne "\n=== link $src $dest with sync ===\n" | _filter_scratch > + _mount_flakey > + Whitespace > + # now execute the workload > + mkdir -p "${src%/*}" > + mkdir -p "${dest%/*}" Whitespace > + touch $src > + ln $src $dest || _fail "Link creation failed" Whitespace > + sync > + before=`stat "$stat_opt" $src` > + Whitespace > + _check_consistency $src 0 > + _clean_dir > +} > + > + > +# Create different combinations to run the link test > +# Group 0: Both files within root directory > +file_names[0]="foo bar" > +fsync_names[0]="./ foo bar" > + > +# Group 1: Create hard link in a sub directory > +file_names[1]="foo A/bar" > +fsync_names[1]="./ foo bar A A/bar A/foo" > + > +# Group 2: Create hard link in parent directory > +file_names[2]="A/foo bar" > +fsync_names[2]="./ foo bar A A/bar A/foo" > + > +# Group 3: Both files within a directory other than root > +file_names[3]="A/foo A/bar" > +fsync_names[3]="./ A A/bar A/foo" > + > +#Group 4: Exercise name reuse : Link file in sub-directory > +file_names[4]="bar A/bar" > +fsync_names[4]="./ foo bar A A/bar A/foo" > + > +#Group 5: Exercise name reuse : Link file in parent directory > +file_names[5]="A/bar bar" > +fsync_names[5]="./ foo bar A A/bar A/foo" > + > +for ((test_group=0; test_group<6; test_group++)); do > + for file in ${fsync_names[$test_group]}; do > + test_link_fsync ${file_names[$test_group]} $file > + done > + test_link_sync ${file_names[$test_group]} > +done Whitespace > + > +# success, all done > +status=0 > +exit > diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out > new file mode 100644 > index 0000000..1f80b27 > --- /dev/null > +++ b/tests/generic/517-link.out > @@ -0,0 +1,75 @@ > +QA output created by 517-link > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/bar with sync === > + > +=== link SCRATCH_MNT/foo 
SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/foo === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/bar === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === > + > +=== link SCRATCH_MNT/foo SCRATCH_MNT/A/bar with sync === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/bar === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/foo === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/bar with sync === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === > + > +=== link SCRATCH_MNT/A/foo SCRATCH_MNT/A/bar with sync === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/foo === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/bar === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/bar === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with fsync SCRATCH_MNT/A/foo === > + > +=== link SCRATCH_MNT/bar SCRATCH_MNT/A/bar with sync === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/foo === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/bar === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/bar === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with fsync SCRATCH_MNT/A/foo === > + > +=== link SCRATCH_MNT/A/bar SCRATCH_MNT/bar with sync === > diff --git a/tests/generic/group b/tests/generic/group > index 47de978..67e9108 100644 > --- a/tests/generic/group > +++ b/tests/generic/group > @@ -519,3 +519,4 @@ > 514 auto quick clone > 515 auto quick clone > 516 auto quick dedupe clone > +517-link quick log > > >
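(A sketch of how a couple of the review points above might be folded back into the test - the comment wording is mine, based on Jayashree's earlier explanation, and this is not the posted patch:)

# Recreating the fs with _scratch_mkfs between subtests would need
# dm-flakey to be torn down and re-initialised each time, adding
# roughly 10 seconds to the test, so just empty the scratch dir.
_clean_dir()
{
	_mount_flakey
	rm -rf $SCRATCH_MNT/*
	sync
	_unmount_flakey
}

# In the workload itself, a failed ln already shows up in the output
# and breaks the golden output match, so no _fail is needed:
touch $src
ln $src $dest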
On Mon, Nov 05, 2018 at 02:16:57PM -0600, Jayashree Mohan wrote: > > I believe that to _scratch_mkfs, I must first _cleanup dm_flakey. If I replace the above snippet by > _cleanup > _scratch_mkfs > _init_flakey > > The time taken for the test goes up by around 10 seconds (due to mkfs maybe). So I thought it was sufficient to remove the working directory. Can you try adding _check_scratch_fs after each test case? Yes, it will increase the test time, but it will make it easier for a developer to figure out what might be going on. Also, how big does the file system have to be? I wonder if we can speed things up if a ramdisk is used as the backing device for dm-flakey. On the flip side, am I remembering correctly that the original technique used by your research paper used a custom I/O stack so you could find potential problems from operations getting lost after a power drop, no matter how unlikely - that is, anything that isn't guaranteed by the cache flush command? One argument for not using a ramdisk to speed things up is that it would make it much less likely that potential problems would be found. But I wonder, given that we're not using dm-flakey, how likely it is that we would notice regressions in the first place. For example, we know which patches were needed to fix the various problems found by your research. Suppose we revert those patches, or use a kernel that doesn't have those fixes. Will the xfstests script you've generated be able to trigger the failures with an unfixed kernel? Cheers, - Ted
On Tue, Nov 06, 2018 at 06:15:36PM -0500, Theodore Y. Ts'o wrote: > On Mon, Nov 05, 2018 at 02:16:57PM -0600, Jayashree Mohan wrote: > > > > I believe that to _scratch_mkfs, I must first _cleanup dm_flakey. If I replace the above snippet by > > _cleanup > > _scratch_mkfs > > _init_flakey > > > > The time taken for the test goes up by around 10 seconds (due to mkfs maybe). So I thought it was sufficient to remove the working directory. > > Can you try adding _check_scratch_fs after each test case? Yes, it _check_scratch_fs now runs xfs_scrub on XFS as well as xfs_repair, so it's actually quite expensive. The whole point of aggregating all these tests into one fstest is to avoid the overhead of running _check_scratch_fs after every single test that are /extremely unlikely/ to fail on existing filesystems. > Also, how big does the file system have to be? I wonder if we can > speed things up if a ramdisk is used as the backing device for > dm-flakey. I already run fstests on pmem devices for fast iteration of the entire test suite. The slowest part of the test suite for quick tests is still the filesystem checking between tests. So putting dm-flakey on a ramdisk doesn't help anything here. I also run fstests on memory constrained VMs, where there isn't ram to add a ramdisk for random stuff like this. Then all the dm-flakey tests either wont run or fail due to OOM-kills. Cheers, Dave.
> > Can you try adding _check_scratch_fs after each test case? Yes, it > > _check_scratch_fs now runs xfs_scrub on XFS as well as xfs_repair, > so it's actually quite expensive. > > The whole point of aggregating all these tests into one fstest is to > avoid the overhead of running _check_scratch_fs after every single > test that are /extremely unlikely/ to fail on existing filesystems. Filipe and Eryu suggest that we run _check_scratch_fs after each subtest. Quoting Filipe, > > For this type of tests, I think it's a good idea to let fsck run. > > Even if all of the links are persisted, the log/journal replay might > have caused metadata inconsistencies in the fs for example - this was > true for many cases I fixed over the years in btrfs. > Even if fsck doesn't report any problem now, it's still good to run > it, to help prevent future regressions. > > Plus this test creates a very small fs, it's not like fsck will take a > significant time to run. > So for all these reasons I would unmount and fsck after each test. For this reason, we currently _check_scratch_fs after each subtest in the _check_consistency method in my patch. + _unmount_flakey + _check_scratch_fs $FLAKEY_DEV + [ $? -ne 0 ] && _fatal "fsck failed" Running on a 200MB partition, addition of this check added only around 3-4 seconds of delay in total for this patch consisting of 37 tests. Currently this patch takes about 12-15 seconds to run to completion on my 200MB partition. Am I missing something - is this the check you are talking about? Thanks, Jayashree Mohan
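(In shell the explicit $? test can be folded into the call itself; a small sketch of the per-subtest check being discussed, assuming _fatal behaves as in the posted patch:)

_unmount_flakey
# abort the whole test immediately if the fs is inconsistent
_check_scratch_fs $FLAKEY_DEV || _fatal "fsck failed"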
Hi Ted, > > I believe that to _scratch_mkfs, I must first _cleanup dm_flakey. If I replace the above snippet by > > _cleanup > > _scratch_mkfs > > _init_flakey > > > > The time taken for the test goes up by around 10 seconds (due to mkfs maybe). So I thought it was sufficient to remove the working directory. > > Can you try adding _check_scratch_fs after each test case? Yes, it > will increase the test time, but it will make it easier for a > developer to figure out what might be going on. As per Filipe's and Eryu's suggestions, each sub test in the patch unmounts the device and tests for consistency using _check_scratch_fs. + _unmount_flakey + _check_scratch_fs $FLAKEY_DEV + [ $? -ne 0 ] && _fatal "fsck failed" This added about an additional 3-4 seconds of delay overall. I hope this is what you're suggesting. > Also, how big does the file system have to be? I wonder if we can > speed things up if a ramdisk is used as the backing device for > dm-flakey. The file system can be as small as 100MB. I would imagine that ramdisk results in speedup. > On the flip side, am I remembering correctly that the original > technique used by your research paper used a custom I/O stack so you > could find potential problems even in the operations getting lost > after a power drop, no matter how unlikely, but rather, for anything > that isn't guaranteed by the cache flush command? Are you talking about re-ordering of the block IOs? We don't use that feature for these tests - we only replay the block IOs in order, just like dm-flakey/ dm-logwrites would do. > One argument for not using a ramdisk to speed things up is that it > would make be much less likely that potential problems would be found. > But I wonder, given that we're not using dm-flakey, how likely that we > would notice regressions in the first place. To clarify, the patches I would be sending out, do not require CrashMonkey in the loop for testing. We only use dm-flakey and the in-order replay support it provides. > For example, given that we know which patches were needed to fix the > various problems found by your research. Suppose we revert those > patches, or use a kernel that doesn't have those fixes. Will the > xfstests script you've generated be able to trigger the failures with > an unfixed kernel? Yes, if you run these xftests on an unpatched kernel, you can reproduce the bugs our paper claims.
On Tue, Nov 06, 2018 at 05:50:28PM -0600, Jayashree Mohan wrote: > On Tue, Nov 6, 2018 at 5:40 PM Dave Chinner <david@fromorbit.com> wrote: > > > On Tue, Nov 06, 2018 at 06:15:36PM -0500, Theodore Y. Ts'o wrote: > > > On Mon, Nov 05, 2018 at 02:16:57PM -0600, Jayashree Mohan wrote: > > > > > > > > I believe that to _scratch_mkfs, I must first _cleanup dm_flakey. If I > > replace the above snippet by > > > > _cleanup > > > > _scratch_mkfs > > > > _init_flakey > > > > > > > > The time taken for the test goes up by around 10 seconds (due to mkfs > > maybe). So I thought it was sufficient to remove the working directory. > > > > > > Can you try adding _check_scratch_fs after each test case? Yes, it > > > > _check_scratch_fs now runs xfs_scrub on XFS as well as xfs_repair, > > so it's actually quite expensive. > > > > The whole point of aggregating all these tests into one fstest is to > > avoid the overhead of running _check_scratch_fs after every single > > test that are /extremely unlikely/ to fail on existing filesystems. > > > > Filipe and Eryu suggest that we run _check_scratch_fs after each subtest. > Quoting Filipe, These tests are highly unlikely to fail on existing filesystems. If they do fail, then the developer can narrow it down by modifying the test to the single test that fails and add _check_scratch_fs where necessary. Run time matters. Expensive, finegrained testing is only useful if there's a high probability of the test finding ongoing bugs. These tests have found bugs when they were first written, but the ongoing probability of finding more bugs is extremely low. Adding a huge amount of testing overhead for what appear to be very marginal returns is not a good tradeoff. > > Plus this test creates a very small fs, it's not like fsck will take a > > significant time to run. > > So for all these reasons I would unmount and fsck after each test. > > For this reason, we currently _check_scratch_fs after each subtest in the > _check_consistency method in my patch. > + _unmount_flakey > + _check_scratch_fs $FLAKEY_DEV > + [ $? -ne 0 ] && _fatal "fsck failed" > > Running on a 200MB partition, addition of this check added only around 3-4 > seconds of delay in total for this patch consisting of 37 tests. Currently > this patch takes about 12-15 seconds to run to completion on my 200MB > partition. What filesystem, and what about 20GB scratch partitions (which are common)? i.e. Checking cost is different on different filesystems, different capacity devices and even different userspace versions of the same filesystem utilities. It is most definitely not free, and in some cases can be prohibitively expensive. ---- I suspect we've lost sight of the fact that fstests was /primarily/ a filesystem developer test suite, not a distro regression test suite. If the test suite becomes too cumbersome and slow for developers to use effectively, then it will get used less during development and that's a *really, really bad outcome*. e.g. I don't use the auto group in my development workflow any more - it's too slow to get decent coverage of changes these days. I only use the auto group for integration testing now. A few years ago it would take about an hour to run the auto group on a couple of spindles. These days it takes closer to 5 hours on the same setup. Even on my really fast pmem setup it take a couple of hours to run the entire auto group. As I have 15 different configurations I need to run through integration testing and limited resources, runtime certainly matters. 
It even takes half an hour to run the quick group on my fast machine, which really isn't very quick anymore because of the sheer number of tests in the quick group. Half an hour is too slow for effective change feedback - feedback within 5 minutes is necessary, otherwise the developer will context switch to something else while waiting and lose all focus on what they were doing. This leads to highly inefficient developers. The quick group is now full of short, fine-grained, targeted regression tests. These are useful for distro release-to-release testing, but the chances that they find new bugs are extremely low. They really aren't that useful for developers who "fix the bug and move on" and will never see that test fail ever again. If one of these tests has a huge overhead or the sheer number of those tests creates substantial overhead, then that is not a particularly good use of the developer's time and resources. IOWs, there's a cost to run each test that the really important use case for fstests (i.e. the developers' test/dev cycle) cannot really afford to pay. i.e. developers need wide test coverage /at speed/, not deep, fine-grained, time consuming test coverage. To put it in other words, developers need tests focussed on finding bugs quickly, not regression tests that provide the core requirements of integration and release testing. The development testing phase is all about finding bugs fast and efficiently. There's been little done in recent times to make fstests faster and more efficient at finding new bugs for developers - the vast majority of new tests have been for specific bugs that have already been fixed. Even these crashmonkey tests are not "find new bugs" tests. The bugs they uncovered have been found and fixed, and so they fall into this same fine-grained integration regression test category. The only tests that I've seen discover new bugs recently are those that run fsx, fsstress or some other semi-randomised workloads that are combined with some other operation. These tests find the bugs that fine-grained, targeted regression tests will never uncover, and so in many cases running most of these integration/regression tests doesn't provide any value to the developer. The faster the tests find the bugs, the faster the developer fixes them. The more dev/test cycles a developer can do in a day, the more bugs they find and fix. So if we want fstests to remain useful to developers, then we have to pay more attention to runtime and how efficient the tests are at finding new bugs. More is not better. Perhaps we need to recategorise the tests into new groups. Perhaps we need to start working on reducing the huge amount of technical debt we now have in fstests. Perhaps we need to scale the fstests infrastructure to support thousands of tests efficiently. Perhaps we need to focus on tests that uncover new bugs (looking forwards) rather than regression tests (looking backwards). But, really, if we don't stop and think about what we are doing here, fstests will slowly drop out of the day-to-day developer workflow because it is becoming less useful for finding new filesystem bugs quickly... Cheers, Dave.
On Wed, Nov 07, 2018 at 01:09:22PM +1100, Dave Chinner wrote: > > Running on a 200MB partition, addition of this check added only around 3-4 > > seconds of delay in total for this patch consisting of 37 tests. Currently > > this patch takes about 12-15 seconds to run to completion on my 200MB > > partition. > > What filesystem, and what about 20GB scratch partitions (which are > common)? i.e. Checking cost is different on different filesystems, > different capacity devices and even different userspace versions of > the same filesystem utilities. It is most definitely not free, and > in some cases can be prohibitively expensive. For the CrashMonkey tests, one solution might be to force the use of a small file system on the scratch disk. (e.g., using _scratch_mkfs_sized). > I suspect we've lost sight of the fact that fstests was /primarily/h > a filesystem developer test suite, not a distro regression test > suite. If the test suite becomes too cumbersome and slow for > developers to use effectively, then it will get used less during > development and that's a *really, really bad outcome*. I agree with your concern. > It even takes half an hour to run the quick group on my fast > machine, which really isn't very quick anymore because of the sheer > number of tests in the quick group. Half an hour is too slow for > effective change feed back - feedback within 5 minutes is > necessary, otherwise the developer will context switch to somethingt > else while waiting and lose all focus on what they were doing. This > leads to highly inefficient developers. At Google we were willing to live with a 10 minute "fssmoke" subset, but admittedly, that's grown to 15-20 minutes in recent years. So trying to create a "smoke" group that is only 5 minutes SGTM. > The only tests that I've seen discover new bugs recently are those > that run fsx, fstress or some other semi-randomised workloads that > are combined with some other operation. These tests find the bugs > that fine-grained, targetted regression tests will never uncover, > and so in many cases running most of these integration/regression > tests doesn't provide any value to the developer. Yeah, what I used to do is assume that if the test run survives past generic/013 (which uses fsstress), I'd assume that it would pass the rest of the tests, and I would move on to reviewing the next commit. Unfortuantely we've added so many ext4 specific tests (which run in front of generic) that this trick no longer works. I haven't gotten annoyed enough to hack in some way to reorder the tests that get run so the highest value tests run first, and then sending a "90+% chance the commit is good, running the rest of the tests" message, but it has occurred to me.... > Perhaps we need to recategorise the tests into new groups. Agreed. Either we need to change what tests we leave in "quick", or we need to create a new group "smoke" where quick is an attribute of the group as a whole, not an attribute of each test in the "quick" group. > Perhaps we need to scale the fstests infrastructure to support > thousands of tests efficiently. On my "when I find the round tuit, or when I can get a GSOC or intern to work on it, whichever comes first" list is to enhance gce-xfstests so it can shard the tests for a particular fs configuration so they use a group of a VMs, instead of just using a separate VM for each config scenario (e.g., dax, blocksize < page size, bigalloc, ext3 compat, etc.) 
It might mean using ~100 VM's instead of the current 10 that I use, but if it means the tests complete in a tenth of the time, the total cost for doing a full integration test won't change by that much. The bigger problem is that people might have to ask permission to increase the GCE quotas from the defaults used on new accounts. For those people who are trying to run xfstests on bare metal, I'm not sure there's that much that can be done to improve things; did you have some ideas? Or were you assuming that step one would require buying many more physical test machines in your test cluster? (Maybe IBM will be willing to give you a much bigger test machine budget? :-) - Ted
Hi all, We understand the concern about testing times. To choose a middle ground, Ted's suggestion of using _scratch_mkfs_sized works best for CrashMonkey specific tests. These tests involve very few files and it suffices to have a 100MB file system. I tested the patch on ext4, xfs, btrfs and f2fs on a partition of this size. The overhead due to _check_scratch_fs after each sub test is in the range of 3-5 seconds for all these file systems. If this is tolerable, we can force a smaller file system size for all CrashMonkey tests. Does this sound reasonable to you? Thanks, Jayashree Mohan On Tue, Nov 6, 2018 at 10:04 PM Theodore Y. Ts'o <tytso@mit.edu> wrote: > > On Wed, Nov 07, 2018 at 01:09:22PM +1100, Dave Chinner wrote: > > > Running on a 200MB partition, addition of this check added only around 3-4 > > > seconds of delay in total for this patch consisting of 37 tests. Currently > > > this patch takes about 12-15 seconds to run to completion on my 200MB > > > partition. > > > > What filesystem, and what about 20GB scratch partitions (which are > > common)? i.e. Checking cost is different on different filesystems, > > different capacity devices and even different userspace versions of > > the same filesystem utilities. It is most definitely not free, and > > in some cases can be prohibitively expensive. > > For the CrashMonkey tests, one solution might be to force the use of a > small file system on the scratch disk. (e.g., using _scratch_mkfs_sized). > > > I suspect we've lost sight of the fact that fstests was /primarily/h > > a filesystem developer test suite, not a distro regression test > > suite. If the test suite becomes too cumbersome and slow for > > developers to use effectively, then it will get used less during > > development and that's a *really, really bad outcome*. > > I agree with your concern. > > > It even takes half an hour to run the quick group on my fast > > machine, which really isn't very quick anymore because of the sheer > > number of tests in the quick group. Half an hour is too slow for > > effective change feed back - feedback within 5 minutes is > > necessary, otherwise the developer will context switch to somethingt > > else while waiting and lose all focus on what they were doing. This > > leads to highly inefficient developers. > > At Google we were willing to live with a 10 minute "fssmoke" subset, > but admittedly, that's grown to 15-20 minutes in recent years. So > trying to create a "smoke" group that is only 5 minutes SGTM. > > > The only tests that I've seen discover new bugs recently are those > > that run fsx, fstress or some other semi-randomised workloads that > > are combined with some other operation. These tests find the bugs > > that fine-grained, targetted regression tests will never uncover, > > and so in many cases running most of these integration/regression > > tests doesn't provide any value to the developer. > > Yeah, what I used to do is assume that if the test run survives past > generic/013 (which uses fsstress), I'd assume that it would pass the > rest of the tests, and I would move on to reviewing the next commit. > Unfortuantely we've added so many ext4 specific tests (which run in > front of generic) that this trick no longer works. I haven't gotten > annoyed enough to hack in some way to reorder the tests that get run > so the highest value tests run first, and then sending a "90+% chance > the commit is good, running the rest of the tests" message, but it has > occurred to me.... 
> > > Perhaps we need to recategorise the tests into new groups. > > Agreed. Either we need to change what tests we leave in "quick", or > we need to create a new group "smoke" where quick is an attribute of > the group as a whole, not an attribute of each test in the "quick" > group. > > > Perhaps we need to scale the fstests infrastructure to support > > thousands of tests efficiently. > > On my "when I find the round tuit, or when I can get a GSOC or intern > to work on it, whichever comes first" list is to enhance gce-xfstests > so it can shard the tests for a particular fs configuration so they > use a group of a VMs, instead of just using a separate VM for each > config scenario (e.g., dax, blocksize < page size, bigalloc, ext3 > compat, etc.) > > It might mean using ~100 VM's instead of the current 10 that I use, > but if it means the tests complete in a tenth of the time, the total > cost for doing a full integration test won't change by that much. The > bigger problem is that people might have to ask permission to increase > the GCE quotas from the defaults used on new accounts. > > For those people who are trying to run xfstests on bare metal, I'm not > sure there's that much that can be done to improve things; did you > have some ideas? Or were you assuming that step one would require > buying many more physical test machines in your test cluster? > > (Maybe IBM will be willing to give you a much bigger test machine > budget? :-) > > - Ted
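(If the small-filesystem route is taken, the test preamble might look roughly like this - the 256MB figure is an assumption in the spirit of the 100-200MB sizes discussed above, not a value from the posted patch:)

# keep the scratch fs small so per-subtest fsck stays cheap
_require_scratch_nocheck
_require_dm_target flakey

_scratch_mkfs_sized $((256 * 1024 * 1024)) >> $seqres.full 2>&1
_require_metadata_journaling $SCRATCH_DEV
_init_flakey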
On Wed, Nov 07, 2018 at 07:41:50PM -0600, Jayashree Mohan wrote: > Hi all, > > We understand the concern about testing times. To choose a middle > ground, Ted's suggestion of using _scratch_mkfs_sized works best for > CrashMonkey specific tests. These tests involve very few files and it > suffices to have a 100MB file system. I tested the patch on ext4, xfs, > btrfs and f2fs on a partition of this size. The overhead due to > _check_scratch_fs after each sub test is in the range of 3-5 seconds 3-5 second per invocation of _check_scratch_fs? > for all these file systems. If this is tolerable, we can force a Not if your numbers that means 300 x 3-5 seconds. That's 15-25 minutes of extra runtime. Cheers, Dave.
On Wed, Nov 07, 2018 at 01:09:22PM +1100, Dave Chinner wrote: > To put it in other words, developers need tests focussed on finding > bugs quickly, not regression tests that provide the core > requirements of integration and release testing. The development > testing phase is all about finding bugs fast and efficiently. To emphasise my point about having tests and tools capable of finding new bugs, I noticed yesterday that fsstress and fsx didn't support copy_file_range, and fsx doesn't support clone/dedupe_file_range either. Darrick added them overnight. fsx as run by generic/263 takes *32* operations to find a data corruption with copy_file_range on XFS. Even changing it to do buffered IO instead of direct IO, it only takes ~600 operations to fail with a different data corruption. That's *at least* two previously unknown bugs exposed in under a second of runtime. That's the sort of tooling we need - we don't need hundreds of tests that are scripted reproducers of fixed problems, we need tools that exercise boundary conditions and corner cases in ways that are likely to expose incorrect behaviour. Tools that do these things quickly and in a reproducible manner are worth their weight in gold... IMO, Quality Engineering is not just about writing regression tests to keep out known bugs - its most important function is developing and refining new testing tools to find bugs that have escaped detection with existing testing methods and tools. If test engineers can find new bugs, software engineers can fix them. That's really the ultimate goal here - to find bugs and fix them before users are exposed to them... Cheers, Dave.
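(For context, the kind of randomised exerciser Dave is referring to is typically driven like this in fstests; the directory name, operation count, process count and seed here are illustrative, not taken from any particular test:)

# run a few thousand random fs operations across 4 processes, with a
# fixed seed so any failure can be reproduced
mkdir $SCRATCH_MNT/stress
$FSSTRESS_PROG -d $SCRATCH_MNT/stress -n 2000 -p 4 -s 1234 >> $seqres.full 2>&1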
> > We understand the concern about testing times. To choose a middle > > ground, Ted's suggestion of using _scratch_mkfs_sized works best for > > CrashMonkey specific tests. These tests involve very few files and it > > suffices to have a 100MB file system. I tested the patch on ext4, xfs, > > btrfs and f2fs on a partition of this size. The overhead due to > > _check_scratch_fs after each sub test is in the range of 3-5 seconds > > 3-5 second per invocation of _check_scratch_fs? No. This is the overall overhead for calling _check_scratch_fs after each of the 37 sub tests in my first patch. Without this check, it took 10 seconds to run. With this addition, it takes about 12-15 seconds to run all the 37 sub tests. > > for all these file systems. If this is tolerable, we can force a > > Not if your numbers that means 300 x 3-5 seconds. That's 15-25 > minutes of extra runtime. Going by the above calculation, the extra run time for the invocation of check after each sub-test should be around 30-40 seconds in total (for the set of 300 tests).
On Thu, Nov 8, 2018 at 3:40 AM Dave Chinner <david@fromorbit.com> wrote: > > On Wed, Nov 07, 2018 at 01:09:22PM +1100, Dave Chinner wrote: > > To put it in other words, developers need tests focussed on finding > > bugs quickly, not regression tests that provide the core > > requirements of integration and release testing. The development > > testing phase is all about finding fast and effciently. > > To emphasis my point of having tests and tools capable of finding > new bugs, I noticed yesterday that fstress and fsx didn't support > copy_file_range, and fsx doesn't support clone/dedupe_file_range > either. Darrick added them overnight. > > fsx as run by generic/263 takes *32* operations to find a data > corruption with copy_file_range on XFS. Even changing it to do > buffered IO instead of direct IO, it only takes ~600 operations to > fail with a different data corruption. > > That's *at least* two previously unknown bugs exposed in under a > second of runtime. > > That's the sort of tooling we need - we don't need hundreds of tests > that are scripted reproducers of fixed problems, we need tools that > exercise boundary conditions and corner cases in ways that are > likely to expose incorrect behaviour. Tools that do these things > quickly and in a reproducable manner are worth their weight in > gold... > > IMO, Quality Engineering is not just about writing regression tests > to keep out known bugs - it's most important function is developing > and refining new testing tools to find bugs that have escaped > detection with existing testing methods and tools. If test engineers > can find new bugs, software engineers can fix them. That's really > the ultimate goal here - to find bugs and fix them before users are > exposed to them... Dave, I think there is some confusion about what CrashMonkey does. I think you'll find its very close to what you want. Let me explain. CrashMonkey does exactly the kind of systematic testing that you want. Given a set of system calls, it generates tests for crash consistency for different workloads comprising of these system calls. It does this by testing each system call first, then each pair of system calls, and so on. Both the workload (which system calls to test) and the check (what should the file system look like after crash recovery) are automatically generated, without any human effort in the loop. CrashMonkey found 10 new bugs in btrfs and F2FS, so its not just a suite of regression tests. When we studied previous crash-consistency bugs reported and fixed in the kernel, we noticed most of them could be reproduced on a clean fs image of small size (100 MB). We found that the arguments to the system calls could also be constrained: we just needed to reuse a small set of file names or file ranges. We used this to automatically generate xfstests-style tests for each file system. We generated and tested a total of 3.3M workloads on a research cluster at UT Austin. We found that even testing a single system call revealed three new bugs (which have not all been patched yet). To systematically test single system calls, you need about 300 tests. This is what Jayashree is trying to add to fstests. Since there are very few crash-consistency tests currently in fstests, it might be good to add more coverage with these tests. The CrashMonkey github repo is available here: https://github.com/utsaslab/crashmonkey The link to the paper: https://www.cs.utexas.edu/~jaya/pdf/osdi18-B3.pdf Talk slides: https://www.cs.utexas.edu/~jaya/slides/osdi18-B3-slides.pdf
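(Purely as an illustration of the enumeration Vijay describes - single operations first, then pairs - and not CrashMonkey's actual code; the operation list and the emit_workload helper here are hypothetical:)

ops="creat link unlink rename fsync"
for op1 in $ops; do
	emit_workload "$op1"                # length-1 workloads
	for op2 in $ops; do
		emit_workload "$op1 $op2"   # length-2 workloads
	done
done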
On Thu, Nov 08, 2018 at 08:40:45PM +1100, Dave Chinner wrote: > IMO, Quality Engineering is not just about writing regression tests > to keep out known bugs - it's most important function is developing > and refining new testing tools to find bugs that have escaped > detection with existing testing methods and tools. If test engineers > can find new bugs, software engineers can fix them. That's really > the ultimate goal here - to find bugs and fix them before users are > exposed to them... How about creating a new group, "regression"? Regression tests are not useless, but we might want to reserve them for integration testing, periodic tests, and distro testing of "golden masters" where test run is not of essence. For smoke testing, what we want are general functional tests, and "moderate" stress tests. Regression tests are a distant third priority. So assigning regression tests to distinct group so we can filter them out might be another way of controlling the runtime of a smoke test config. - Ted
On Thu, Nov 08, 2018 at 09:35:56AM -0600, Vijaychidambaram Velayudhan Pillai wrote: > On Thu, Nov 8, 2018 at 3:40 AM Dave Chinner <david@fromorbit.com> wrote: > > > > On Wed, Nov 07, 2018 at 01:09:22PM +1100, Dave Chinner wrote: > > > To put it in other words, developers need tests focussed on finding > > > bugs quickly, not regression tests that provide the core > > > requirements of integration and release testing. The development > > > testing phase is all about finding fast and effciently. > > > > To emphasis my point of having tests and tools capable of finding > > new bugs, I noticed yesterday that fstress and fsx didn't support > > copy_file_range, and fsx doesn't support clone/dedupe_file_range > > either. Darrick added them overnight. > > > > fsx as run by generic/263 takes *32* operations to find a data > > corruption with copy_file_range on XFS. Even changing it to do > > buffered IO instead of direct IO, it only takes ~600 operations to > > fail with a different data corruption. > > > > That's *at least* two previously unknown bugs exposed in under a > > second of runtime. > > > > That's the sort of tooling we need - we don't need hundreds of tests > > that are scripted reproducers of fixed problems, we need tools that > > exercise boundary conditions and corner cases in ways that are > > likely to expose incorrect behaviour. Tools that do these things > > quickly and in a reproducable manner are worth their weight in > > gold... > > > > IMO, Quality Engineering is not just about writing regression tests > > to keep out known bugs - it's most important function is developing > > and refining new testing tools to find bugs that have escaped > > detection with existing testing methods and tools. If test engineers > > can find new bugs, software engineers can fix them. That's really > > the ultimate goal here - to find bugs and fix them before users are > > exposed to them... > > Dave, I think there is some confusion about what CrashMonkey does. I > think you'll find its very close to what you want. Let me explain. I'm pretty sure I know what crashmonkey does, and I know what is being proposed here /isn't crashmonkey/. > CrashMonkey does exactly the kind of systematic testing that you want. > Given a set of system calls, it generates tests for crash consistency > for different workloads comprising of these system calls. It does this > by testing each system call first, then each pair of system calls, and > so on. Both the workload (which system calls to test) and the check > (what should the file system look like after crash recovery) are > automatically generated, without any human effort in the loop. > > CrashMonkey found 10 new bugs in btrfs and F2FS, so its not just a > suite of regression tests. Sure, but that was during an initial exploritory phase of the project on two immature filesystems. Now that those initial bugs have been fixed, I'm not seeing new bugs being reported. So what is being proposed for fstests here? It's not an exploratory tool like crashmonkey, or arbitrary boundary case exerciser like fsstress and fsx. What is being proposed is a set of fixed scripts that walk a defined set of single operations with a known, fixed set of initial conditions and checks that each individual op they behave as they should in that environment. There's no variation here. The same tests with the same initial conditions is run /every time/. When run on the same code, they will exercise exactly the same code path. 
There is no variation at all, and so if there are no bugs in the code path they exercise, they will not find any new bugs. That's my point: unlike crashmonkey in workload generation mode or fsx/fsstress, the code they exercise does not vary. Yes, when first run on a new code base they exposed bugs, but now that those bugs are fixed, they don't find any new bugs. They can only detect bugs in changes to the fixed code paths they exercise. Ergo, they are regression tests. Don't get me wrong - we need both types of tests - but I care less about a huge swath of railroad regression tests than I do about tools that find new bugs over and over again without needing modification. > When we studied previous crash-consistency bugs reported and fixed in > the kernel, we noticed most of them could be reproduced on a clean fs > image of small size (100 MB). We found that the arguments to the > system calls could also be constrained: we just needed to reuse a > small set of file names or file ranges. We used this to automatically > generate xfstests-style tests for each file system. We generated and > tested a total of 3.3M workloads on a research cluster at UT Austin. Which is great, especially as you found bugs in that exploration. But exhaustive searches like this really are not practical for day to day development. Developers don't have their own personal clusters for testing their filesystem code. They might only have a laptop. This sort of massive exploratory regression testing is really the domain of product release managers and their QE department (think of the scale of testing that goes into a RHEL or SLES release). It's -their job- to find gnarly, weird regressions that are beyond the capability of individual developers to uncover. This isn't the sort of testing that is relevant to the day-to-day filesystem developer. This comes back to my point about fstests being a tool for developers as much as it is for distro QE departments. The balance is falling too far towards the "massive regression test suite" side and away from the "find new bugs really fast" focus we have historically had. Adding hundreds more tests that fall on the "massive regression test suite" side of the ledger just makes this imbalance worse. That's not something that crashmonkey can solve, but it's something we, as fstests users and developers, have to be very aware of when considering an addition of the size being proposed. > We found that even testing a single system call revealed three new > bugs (which have not all been patched yet). To systematically test > single system calls, you need about 300 tests. That's 300 tests per system call? I think that's underestimating the complexity of many syscalls (like open(), read(), etc) quite substantially. Indeed, open(O_TMPFILE) is going to make linkat() behave very differently, and there's a whole set of crash consistency problems when O_TMPFILE is used with linkat() that the proposed link behaviour tests do not cover. Maybe a better way to integrate this is to add a completely new tests/ subdirectory and push all the crash consistency tests into that directory. They don't get run by quick/auto, but instead by a specific group that runs that directory. The tests don't get intermingled with all the other generic tests, and you can set them up to run fsck as often as you want because they don't get in the way of existing testing. Over time we can move more of the generic crash consistency regression tests elsewhere in fstests (e.g.
all those fsync-on-btrfs-doesn't tests) over to that same subdir. I suspect we need to do more of this sort of "by type" breakup of the "generic" directory, too, because it has become hard to manage the 500-odd tests that are now in it.... Cheers, Dave.
On Thu, Nov 8, 2018 at 9:15 PM Dave Chinner <david@fromorbit.com> wrote: > Which is great, especailly as you found bugs in that exploration. > But exhaustive searches like this really are not practical for day > to day development. Developers don't ahve their own personal > clusters for testing their filesystem code. They might only have a > laptop. > > These sorts of massive exploratory regression testing are really the > domain of product release managers and their QE department (think of > the scale of testing that goes into a RHEL or SLES release). It's > -their job- to find gnarly, weird regressions that are beyond the > capability of individual developers to uncover. This isn't the sort > of testing that is relevant to the day-to-day filesystem developer. > > This comes back to my point about fstests being a tool for > developers as much as it is for distro QE departments. The balance > is falling too far towards the "massive regression test suite" side > and away from the "find new bugs really fast" focus we have > historically had. Adding hundreds more tests on that fall on the > "massive regression test suite" side of the ledger just makes this > imbalance worse. > > That's not something that crashmonkey can solve, but it's something > we, as fstests users and developers, have to be very aware of when > considering an addition of the size being proposed. > > > We found that even testing a single system call revealed three new > > bugs (which have not all been patched yet). To systematically test > > single system calls, you need about 300 tests. > > That's 300 tests per system call? I think that's underestimating > the complexity of many syscalls (like open(), read(), etc) quite > substantially. Indeed, open(O_TMPFILE) is going to make linkat() > behave very differently, and there's a whole set of crash > consistency problems when O_TMPFILE is used with linkat() that the > proposed link behaviour tests do not cover. We have workloads which explore the interaction between different system calls, which would capture effects like this. But thats a bigger set, and we are not attempting to add them to fstests at this point. > Maybe a better way to integrate this is to add a completely new > tests/ subdirectory and push all the crash consistency tests into > that directory. They don't get run by quick/auto, but instead by a > specific group that runs that directory. The tests don't get > intermingled with all the other generic tests, and you can set them > up to run fsck as often as you want because they don't get in the > way of existing testing. Over time we can more of the generic crash > consistency regression tests elsewhere in fstests (e.g. all those > fsync-on-btrfs-doesn't tests) over to that same subdir. I agree with not wanting developers to run 300 tests every time they run fstests. Perhaps a different "crash" group is what we want to do here? Then only developers who want to ensure crash consistency still holds can run those, knowing it will take a bit of time to run the set. We'll submit a patch with a new "crash" group and see what people think -- let me know if you disagree. I should note that although ext4 and xfs are super robust from a long history of development and testing, many other file systems in the Linux kernel are not, and I think the other file systems would definitely benefit from having the CrashMonkey regression tests. More generally, I'm not sure how to help/encourage file-system developers to run CrashMonkey. 
Note that CrashMonkey can be run for different amounts of time based on the computational budget of the developers. It is similar to fsstress in that regard. I think having fsstress in-kernel helps developers use it. Is adding the CrashMonkey tool (user-space code) into the Linux kernel the right move here?
How about creating a group "regress" for regression tests, and moving them into tests/regress for better categorization? And for tests that attempt to find new problems, we already have a group "fuzzers"; and perhaps there should be some standard guideline for how long a single invocation of a fuzz test should run before quitting. I'd say something in the 5 to 15 minute range? People can always run the fuzz test multiple times, and if the fuzz test can save state between runs, it shouldn't matter that it's broken up into multiple test invocations. - Ted
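(Mechanically, the grouping being discussed is just an extra word in the group file plus a filter on the check invocation; a rough sketch, with the group name following Ted's suggestion rather than any existing convention:)

# tests/generic/group
517 auto quick log regress

# run only the regression-style tests, or leave them out of a quick run
./check -g regress
./check -g quick -x regress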
On Fri, Nov 9, 2018 at 1:35 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:

> How about creating a group "regress" for regression tests, and moving them into tests/regress for better categorization?

"regress" group seems good, we'll send out a patch with that.

> And for tests that attempt to find new problems, we already have a group "fuzzers"; perhaps there should be some standard guideline for how long a single invocation of a fuzz test should run before quitting. I'd say something in the 5 to 15 minute range? People can always run the fuzz test multiple times, and if the fuzz test can save state between runs, it shouldn't matter that it's broken up into multiple test invocations.

CrashMonkey doesn't do random tests for every invocation, so its behavior is a bit different from a fuzzer. But we can definitely modify CrashMonkey to randomly pick workloads in the target state space and run for a pre-defined amount of time.
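For reference, whichever name is settled on, the check script already lets developers opt in or out of such a group, so default runs would not pay the cost; for example (assuming the group ends up being called "regress"):

    # Run only the crash-consistency regression tests:
    ./check -g regress

    # Run the usual auto group while skipping them:
    ./check -g auto -x regress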
diff --git a/tests/generic/517-link b/tests/generic/517-link
new file mode 100755
index 0000000..791b6c0
--- /dev/null
+++ b/tests/generic/517-link
@@ -0,0 +1,162 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2018 The University of Texas at Austin. All Rights Reserved.
+#
+# FS QA Test 517-link
+#
+# Test if we create a hard link to a file and persist either of the
+# files, all the names persist.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+
+init_start_time=$(date +%s.%N)
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+
+# initialize scratch device
+_scratch_mkfs >>$seqres.full 2>&1
+_require_metadata_journaling $SCRATCH_DEV
+_init_flakey
+_mount_flakey
+init_end_time=$(date +%s.%N)
+init_time_elapsed=$(echo "$init_end_time-$init_start_time" | bc)
+echo "Initialization time = $init_time_elapsed" >> $seqres.full
+
+stat_opt='-c "blocks: %b size: %s inode: %i links: %h"'
+test_num=0
+total_time=0
+
+cleanup()
+{
+	rm -rf $SCRATCH_MNT/*
+	sync
+}
+
+# create a hard link $2 to file $1, and fsync $3, followed by power-cut
+test_link()
+{
+	start_time=$(date +%s.%N)
+	test_num=$((test_num + 1))
+	sibling=0
+	before=""
+	after=""
+	echo -ne "\n=== Test $test_num : link $1 $2 " | tee -a $seqres.full
+
+	# now execute the workload
+	mkdir -p "${1%/*}" && mkdir -p "${2%/*}" && touch "$1"
+	ln $1 $2
+
+	if [ -z "$3" ]
+	then
+		echo -ne " with sync ===\n"
+		sync
+		before=`stat "$stat_opt" $1`
+	else
+		echo " with fsync $3 ==="
+
+		# If the file being persisted is a sibling, create it first
+		if [ ! -f $3 ]
+		then
+			sibling=1
+			touch $3
+		fi
+
+		$XFS_IO_PROG -c "fsync" $3 | _filter_xfs_io
+
+		if [ $sibling -ne 1 ]
+		then
+			before=`stat "$stat_opt" $1`
+		fi
+	fi
+
+	_flakey_drop_and_remount | tee -a $seqres.full
+
+	if [ -f $1 ]
+	then
+		after=`stat "$stat_opt" $1`
+	fi
+
+	if [ "$before" != "$after" ] && [ $sibling -ne 1 ]
+	then
+		echo "Before: $before" | tee -a $seqres.full
+		echo "After: $after" | tee -a $seqres.full
+	fi
+
+	cleanup
+	end_time=$(date +%s.%N)
+	time_elapsed=$(echo "$end_time-$start_time" | bc)
+	echo " Elapsed time : $time_elapsed" >> $seqres.full
+	total_time=$(echo "$total_time+$time_elapsed" | bc)
+	echo " Total time : $total_time" >> $seqres.full
+}
+
+# run the link test for different combinations
+
+test_start_time=$(date +%s.%N)
+# Group 1: Both files within root directory
+for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar; do
+	test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar $file_name
+done
+test_link $SCRATCH_MNT/foo $SCRATCH_MNT/bar
+
+# Group 2: Create hard link in a sub directory
+for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do
+	test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar $file_name
+done
+test_link $SCRATCH_MNT/foo $SCRATCH_MNT/A/bar
+
+# Group 3: Create hard link in parent directory
+for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do
+	test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar $file_name
+done
+test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/bar
+
+# Group 4: Both files within a directory other than root
+for file_name in $SCRATCH_MNT $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do
+	test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar $file_name
+done
+test_link $SCRATCH_MNT/A/foo $SCRATCH_MNT/A/bar
+
+#Group 5: Exercise name reuse : Link file in sub-directory
+for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do
+	test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar $file_name
+done
+test_link $SCRATCH_MNT/bar $SCRATCH_MNT/A/bar
+
+#Group 6: Exercise name reuse : Link file in parent directory
+for file_name in $SCRATCH_MNT $SCRATCH_MNT/foo $SCRATCH_MNT/bar $SCRATCH_MNT/A $SCRATCH_MNT/A/bar $SCRATCH_MNT/A/foo; do
+	test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar $file_name
+done
+test_link $SCRATCH_MNT/A/bar $SCRATCH_MNT/bar
+
+test_end_time=$(date +%s.%N)
+test_time_elapsed=$(echo "$test_end_time-$test_start_time" | bc)
+echo "Test Elapsed time : $test_time_elapsed" >> $seqres.full
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/517-link.out b/tests/generic/517-link.out
new file mode 100644
index 0000000..381b644
--- /dev/null
+++ b/tests/generic/517-link.out
@@ -0,0 +1,75 @@
+QA output created by 517-link
+
+=== Test 1 : link /mnt/scratch/foo /mnt/scratch/bar with fsync /mnt/scratch ===
+
+=== Test 2 : link /mnt/scratch/foo /mnt/scratch/bar with fsync /mnt/scratch/foo ===
+
+=== Test 3 : link /mnt/scratch/foo /mnt/scratch/bar with fsync /mnt/scratch/bar ===
+
+=== Test 4 : link /mnt/scratch/foo /mnt/scratch/bar with sync ===
+
+=== Test 5 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync /mnt/scratch ===
+
+=== Test 6 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync /mnt/scratch/foo ===
+
+=== Test 7 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync /mnt/scratch/bar ===
+
+=== Test 8 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync /mnt/scratch/A ===
+
+=== Test 9 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync /mnt/scratch/A/bar ===
+
+=== Test 10 : link /mnt/scratch/foo /mnt/scratch/A/bar with fsync /mnt/scratch/A/foo ===
+
+=== Test 11 : link /mnt/scratch/foo /mnt/scratch/A/bar with sync ===
+
+=== Test 12 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync /mnt/scratch ===
+
+=== Test 13 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync /mnt/scratch/foo ===
+
+=== Test 14 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync /mnt/scratch/bar ===
+
+=== Test 15 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync /mnt/scratch/A ===
+
+=== Test 16 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync /mnt/scratch/A/bar ===
+
+=== Test 17 : link /mnt/scratch/A/foo /mnt/scratch/bar with fsync /mnt/scratch/A/foo ===
+
+=== Test 18 : link /mnt/scratch/A/foo /mnt/scratch/bar with sync ===
+
+=== Test 19 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync /mnt/scratch ===
+
+=== Test 20 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync /mnt/scratch/A ===
+
+=== Test 21 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync /mnt/scratch/A/bar ===
+
+=== Test 22 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with fsync /mnt/scratch/A/foo ===
+
+=== Test 23 : link /mnt/scratch/A/foo /mnt/scratch/A/bar with sync ===
+
+=== Test 24 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync /mnt/scratch ===
+
+=== Test 25 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync /mnt/scratch/foo ===
+
+=== Test 26 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync /mnt/scratch/bar ===
+
+=== Test 27 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync /mnt/scratch/A ===
+
+=== Test 28 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync /mnt/scratch/A/bar ===
+
+=== Test 29 : link /mnt/scratch/bar /mnt/scratch/A/bar with fsync /mnt/scratch/A/foo ===
+
+=== Test 30 : link /mnt/scratch/bar /mnt/scratch/A/bar with sync ===
+
+=== Test 31 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync /mnt/scratch ===
+
+=== Test 32 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync /mnt/scratch/foo ===
+
+=== Test 33 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync /mnt/scratch/bar ===
+
+=== Test 34 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync /mnt/scratch/A ===
+
+=== Test 35 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync /mnt/scratch/A/bar ===
+
+=== Test 36 : link /mnt/scratch/A/bar /mnt/scratch/bar with fsync /mnt/scratch/A/foo ===
+
+=== Test 37 : link /mnt/scratch/A/bar /mnt/scratch/bar with sync ===
diff --git a/tests/generic/group b/tests/generic/group
index 47de978..dc04152 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -519,3 +519,4 @@
 514 auto quick clone
 515 auto quick clone
 516 auto quick dedupe clone
+517-link crash