Message ID | fe1cd52ce8954e5aee1fc0a4baf5c75ef7d2635a.1627590942.git.josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Series | fstests: generic/204: fail if the mkfs fails |
On Thu, Jul 29, 2021 at 04:35:53PM -0400, Josef Bacik wrote:
> My nightly fstests runs on my Raspberry Pi got stuck trying to run
> generic/204.  This boiled down to mkfs failing to make the scratch
> device that small with the subpage blocksize support, and thus trying to
> fill a 1tib drive with tiny files.  On one hand I'd like to make
> _scratch_mkfs failures automatically fail the test, but I worry about
> cases where a test may be checking for an option and need to do
> something different with failures, so for now simply fail if we can't
> make our tiny-fs in generic/204.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

FWIW, I'm carrying the following patch in my local xfstests tree:

commit cb8e8d44de5bbb1d6163dfeb2b77e8f003a564da
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Jul 3 01:29:21 2017 -0400

    common: kill the test if mkfs.ext4 in _scratch_mkfs_sized fails

    If the file system size specified by test is incompatible with the
    mkfs options used in the test configuration, make sure the test
    stops at that point instead of doing something undefined.

    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

diff --git a/common/rc b/common/rc
index 332e18b7..1dcad4a3 100644
--- a/common/rc
+++ b/common/rc
@@ -1034,7 +1034,9 @@ _scratch_mkfs_sized()
 	fi
 	;;
     ext2|ext3|ext4|ext4dev)
-	${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
+	echo "${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks"
+	${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks || \
+		_die "${MKFS_PROG}.$FSTYP failed!"
 	;;
     gfs2)
 	# mkfs.gfs2 doesn't automatically shrink journal files on small

I tried getting this upstream, but apparently there was pushback:
when the mkfs failed, it was considered a *feature* that the test
would run on some random scratch file system that was left over from
before.

I didn't appreciate wasting time trying to run down test failures
caused by some completely inappropriate file system being used for a
test after _scratch_mkfs_sized failed, so I've just been carrying the
patch locally....

					- Ted
On Thu, Jul 29, 2021 at 04:35:53PM -0400, Josef Bacik wrote:
> My nightly fstests runs on my Raspberry Pi got stuck trying to run
> generic/204.  This boiled down to mkfs failing to make the scratch
> device that small with the subpage blocksize support, and thus trying to
> fill a 1tib drive with tiny files.  On one hand I'd like to make

So the underlying disk is 1TB in size, and we ended up using this 1T
filesystem when _scratch_mkfs_sized failed?

But we have done _try_wipe_scratch_devs before each test to make sure we
don't use a previous scratch dev accidentally (just like in this case),
and the subsequent _scratch_mount will fail and fail the whole test. So
it's not clear to me what caused the failure you hit.

Thanks,
Eryu

> _scratch_mkfs failures automatically fail the test, but I worry about
> cases where a test may be checking for an option and need to do
> something different with failures, so for now simply fail if we can't
> make our tiny-fs in generic/204.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  tests/generic/204 | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tests/generic/204 b/tests/generic/204
> index a3dabb71..b5deb443 100755
> --- a/tests/generic/204
> +++ b/tests/generic/204
> @@ -35,7 +35,8 @@ _scratch_mkfs 2> /dev/null | _filter_mkfs 2> $tmp.mkfs > /dev/null
>  [ $FSTYP = "xfs" ] && MKFS_OPTIONS="$MKFS_OPTIONS -l size=16m -i maxpct=50"
>
>  SIZE=`expr 115 \* 1024 \* 1024`
> -_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw
> +_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw \
> +	|| _fail "mkfs failed"
>  cat $tmp.mkfs.raw | _filter_mkfs 2> $tmp.mkfs > /dev/null
>  _scratch_mount
>
> --
> 2.26.3
On Sun, Aug 01, 2021 at 08:42:02PM +0800, Eryu Guan wrote:
> On Thu, Jul 29, 2021 at 04:35:53PM -0400, Josef Bacik wrote:
> > My nightly fstests runs on my Raspberry Pi got stuck trying to run
> > generic/204.  This boiled down to mkfs failing to make the scratch
> > device that small with the subpage blocksize support, and thus trying to
> > fill a 1tib drive with tiny files.  On one hand I'd like to make
>
> So the underlying disk is 1TB in size, and we ended up using this 1T
> filesystem when _scratch_mkfs_sized failed?
>
> But we have done _try_wipe_scratch_devs before each test to make sure we
> don't use previous scratch dev accidentally (just like this case), and the
> subsequent _scratch_mount will fail and fail the whole test. So it's
> not clear to me what caused the failure you hit.

Ah, if the previous test _notrun'd, then the scratch dev didn't get
wiped. I think the patch "check: don't leave the scratch filesystem
mounted after _notrun" from Darrick should fix the bug for you. Would
you please confirm?

Thanks,
Eryu
On Sun, Aug 01, 2021 at 08:53:36PM +0800, Eryu Guan wrote:
> > So the underlying disk is 1TB in size, and we ended up using this 1T
> > filesystem when _scratch_mkfs_sized failed?
> >
> > But we have done _try_wipe_scratch_devs before each test to make sure we
> > don't use previous scratch dev accidentally (just like this case), and the
> > subsequent _scratch_mount will fail and fail the whole test. So it's
> > not clear to me what caused the failure you hit.

The call to _try_wipe_scratch_devs was added in 2019.  My commit to
add:

	|| _notrun "mkfs.${FSTYP} failed"

dates from 2017.  So the reason I was seeing the problem was that it
predates running wipefs between tests.

That being said, I've checked a recent test run, and the _notrun
hasn't triggered recently.  Looking at the git history, it looks like
a large number of tests had their arguments to _scratch_mkfs_sized
adjusted upwards to avoid failures when running with 64k block sizes
on powerpc.

Going back to generic/204, I see why Josef ran into issues, however,
even though we are running wipefs before each test.  In the case of
generic/204, it runs _scratch_mkfs to determine the blocksize, and
then it runs _scratch_mkfs_sized --- and if that fails, the file
system is left at the full size of the scratch device, and then
generic/204 takes a very long time.

So even if we can rely on wipefs causing the tests to fail, maybe we
should just add a check for mkfs failure to _scratch_mkfs_sized?  I
think that's a better fix than Josef's proposed patch to generic/204.

One benefit of adding the check to _scratch_mkfs_sized is we can
supply a clearer explanation of the failure, since the failure would
be "mkfs failed" as opposed to "mount: /vdc: wrong fs type, bad
option, bad superblock on /dev/vdc, missing codepage or helper
program, or other error."

It might also make sense to adjust the size passed to
_scratch_mkfs_sized in generic/204 to be something slightly larger,
since otherwise it's pretty much guaranteed that generic/204 will
start failing on PowerPC when testing with a 64k block size.

Cheers,

					- Ted
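
For illustration, the idea of checking for mkfs failure inside the
sized-mkfs helper can be sketched as below.  This is only a minimal
sketch, not the code in common/rc: sketch_fail and sketch_mkfs_sized
are hypothetical names, and sketch_fail returns non-zero rather than
aborting the test, so the snippet can be run on its own.

```shell
#!/bin/sh
# Hypothetical stand-ins for illustration; these are NOT the fstests
# helpers from common/rc.

# Report a failure reason on stderr and return non-zero.
# (Assumption: a real test harness would abort the test here.)
sketch_fail() {
    echo "FAIL: $*" >&2
    return 1
}

# Run a mkfs-like command with its arguments.  If it fails, say so
# immediately instead of letting a later mount step report a vaguer
# error.
sketch_mkfs_sized() {
    mkfs_cmd="$1"
    shift
    if ! "$mkfs_cmd" "$@"; then
        sketch_fail "mkfs failed: $mkfs_cmd $*"
        return 1
    fi
}
```

Calling `sketch_mkfs_sized true 120586240 4096` returns success, while
substituting `false` for the command prints the "mkfs failed" message
and returns non-zero, so the caller can stop before mounting anything.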
diff --git a/tests/generic/204 b/tests/generic/204
index a3dabb71..b5deb443 100755
--- a/tests/generic/204
+++ b/tests/generic/204
@@ -35,7 +35,8 @@ _scratch_mkfs 2> /dev/null | _filter_mkfs 2> $tmp.mkfs > /dev/null
 [ $FSTYP = "xfs" ] && MKFS_OPTIONS="$MKFS_OPTIONS -l size=16m -i maxpct=50"
 
 SIZE=`expr 115 \* 1024 \* 1024`
-_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw
+_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw \
+	|| _fail "mkfs failed"
 cat $tmp.mkfs.raw | _filter_mkfs 2> $tmp.mkfs > /dev/null
 _scratch_mount
 
My nightly fstests runs on my Raspberry Pi got stuck trying to run
generic/204.  This boiled down to mkfs failing to make the scratch
device that small with the subpage blocksize support, and thus trying to
fill a 1tib drive with tiny files.  On one hand I'd like to make
_scratch_mkfs failures automatically fail the test, but I worry about
cases where a test may be checking for an option and need to do
something different with failures, so for now simply fail if we can't
make our tiny-fs in generic/204.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 tests/generic/204 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)