fstests: generic/204: fail if the mkfs fails

Message ID fe1cd52ce8954e5aee1fc0a4baf5c75ef7d2635a.1627590942.git.josef@toxicpanda.com (mailing list archive)
State New, archived

Commit Message

Josef Bacik July 29, 2021, 8:35 p.m. UTC
My nightly fstests runs on my Raspberry Pi got stuck trying to run
generic/204.  This boiled down to mkfs failing to make the scratch
device that small with the subpage blocksize support, and thus trying to
fill a 1tib drive with tiny files.  On one hand I'd like to make
_scratch_mkfs failures automatically fail the test, but I worry about
cases where a test may be checking for an option and need to do
something different with failures, so for now simply fail if we can't
make our tiny-fs in generic/204.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 tests/generic/204 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Theodore Ts'o July 30, 2021, 3:52 a.m. UTC | #1
On Thu, Jul 29, 2021 at 04:35:53PM -0400, Josef Bacik wrote:
> My nightly fstests runs on my Raspberry Pi got stuck trying to run
> generic/204.  This boiled down to mkfs failing to make the scratch
> device that small with the subpage blocksize support, and thus trying to
> fill a 1tib drive with tiny files.  On one hand I'd like to make
> _scratch_mkfs failures automatically fail the test, but I worry about
> cases where a test may be checking for an option and need to do
> something different with failures, so for now simply fail if we can't
> make our tiny-fs in generic/204.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

FWIW, I'm carrying the following patch in my local xfstests tree:

commit cb8e8d44de5bbb1d6163dfeb2b77e8f003a564da
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Jul 3 01:29:21 2017 -0400

    common: kill the test if mkfs.ext4 in _scratch_mkfs_sized fails
    
    If the file system size specified by test is incompatible with the
    mkfs options used in the test configuration, make sure the test stops
    at that point instead of doing something undefined.
    
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

diff --git a/common/rc b/common/rc
index 332e18b7..1dcad4a3 100644
--- a/common/rc
+++ b/common/rc
@@ -1034,7 +1034,9 @@ _scratch_mkfs_sized()
 		fi
 		;;
 	ext2|ext3|ext4|ext4dev)
-		${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
+		echo "${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks"
+		${MKFS_PROG} -t $FSTYP -F $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks || \
+		    _die "${MKFS_PROG}.$FSTYP failed!"
 		;;
 	gfs2)
 		# mkfs.gfs2 doesn't automatically shrink journal files on small



I tried getting this upstream, but there was pushback: apparently it
was considered a *feature* that, when the mkfs failed, the test would
run on whatever scratch file system happened to be left over from
before.

I didn't appreciate wasting time trying to run down test failures
caused by some completely inappropriate file system being used for a
test after _scratch_mkfs_sized failed, so I've just been carrying the
patch locally....

						- Ted
Eryu Guan Aug. 1, 2021, 12:42 p.m. UTC | #2
On Thu, Jul 29, 2021 at 04:35:53PM -0400, Josef Bacik wrote:
> My nightly fstests runs on my Raspberry Pi got stuck trying to run
> generic/204.  This boiled down to mkfs failing to make the scratch
> device that small with the subpage blocksize support, and thus trying to
> fill a 1tib drive with tiny files.  On one hand I'd like to make

So the underlying disk is 1TB in size, and we ended up using this 1T
filesystem when _scratch_mkfs_sized failed?

But we have done _try_wipe_scratch_devs before each test to make sure we
don't use previous scratch dev accidentally (just like this case), and the
subsequent _scratch_mount will fail and fail the whole test. So it's
not clear to me what caused the failure you hit.

Thanks,
Eryu

> _scratch_mkfs failures automatically fail the test, but I worry about
> cases where a test may be checking for an option and need to do
> something different with failures, so for now simply fail if we can't
> make our tiny-fs in generic/204.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  tests/generic/204 | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/generic/204 b/tests/generic/204
> index a3dabb71..b5deb443 100755
> --- a/tests/generic/204
> +++ b/tests/generic/204
> @@ -35,7 +35,8 @@ _scratch_mkfs 2> /dev/null | _filter_mkfs 2> $tmp.mkfs > /dev/null
>  [ $FSTYP = "xfs" ] && MKFS_OPTIONS="$MKFS_OPTIONS -l size=16m -i maxpct=50"
>  
>  SIZE=`expr 115 \* 1024 \* 1024`
> -_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw
> +_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw \
> +	|| _fail "mkfs failed"
>  cat $tmp.mkfs.raw | _filter_mkfs 2> $tmp.mkfs > /dev/null
>  _scratch_mount
>  
> -- 
> 2.26.3
Eryu Guan Aug. 1, 2021, 12:53 p.m. UTC | #3
On Sun, Aug 01, 2021 at 08:42:02PM +0800, Eryu Guan wrote:
> On Thu, Jul 29, 2021 at 04:35:53PM -0400, Josef Bacik wrote:
> > My nightly fstests runs on my Raspberry Pi got stuck trying to run
> > generic/204.  This boiled down to mkfs failing to make the scratch
> > device that small with the subpage blocksize support, and thus trying to
> > fill a 1tib drive with tiny files.  On one hand I'd like to make
> 
> So the underlying disk is 1TB in size, and we ended up using this 1T
> filesystem when _scratch_mkfs_sized failed?
> 
> But we have done _try_wipe_scratch_devs before each test to make sure we
> don't use previous scratch dev accidentally (just like this case), and the
> subsequent _scratch_mount will fail and fail the whole test. So it's
> not clear to me what caused the failure you hit.

Ah, if the previous test _notrun'd, then the scratch dev didn't get
wiped. I think patch "check: don't leave the scratch filesystem mounted
after _notrun" from Darrick should fix the bug for you. Would you please
confirm?

Thanks,
Eryu
Theodore Ts'o Aug. 1, 2021, 3:57 p.m. UTC | #4
On Sun, Aug 01, 2021 at 08:53:36PM +0800, Eryu Guan wrote:
> > So the underlying disk is 1TB in size, and we ended up using this 1T
> > filesystem when _scratch_mkfs_sized failed?
> > 
> > But we have done _try_wipe_scratch_devs before each test to make sure we
> > don't use previous scratch dev accidentally (just like this case), and the
> > subsequent _scratch_mount will fail and fail the whole test. So it's
> > not clear to me what caused the failure you hit.

The call to _try_wipe_scratch_devs was added in 2019.  My commit to
add:

	|| _notrun "mkfs.${FSTYP} failed"

dates from 2017.  So the reason I was seeing the problem was because
it was before we started running wipefs between tests.

That being said, I've checked a recent test run, and the _notrun
hasn't triggered recently.  Looking at the git history, it looks like
a large number of tests had their arguments to _scratch_mkfs_sized
adjusted upwards to avoid failures when running with 64k block sizes
on powerpc.

Going back to generic/204, however, I see why Josef ran into issues
even though we are running wipefs before each test.  In the case of
generic/204, it runs _scratch_mkfs to determine the blocksize, and
then it runs _scratch_mkfs_sized --- and if that fails, the file system
is left at the full size of the scratch file system, and then
generic/204 takes a very long time.
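
As a toy illustration of that sequence (a sketch only, not code from
fstests): the temporary file and the helpers toy_mkfs and toy_mkfs_sized
below are hypothetical stand-ins for the scratch device, _scratch_mkfs
and _scratch_mkfs_sized.  It shows how an unchecked failure in the second
step leaves the full-size file system from the first step in place.

```shell
#!/bin/bash
# Toy model only: a temporary file stands in for the scratch device and
# records which "file system" was made on it last.
dev=$(mktemp)

# Hypothetical stand-in for the full-device mkfs run used to probe the
# block size.
toy_mkfs() { echo "full-device" > "$dev"; }

# Hypothetical stand-in for the sized mkfs: it rejects the request and
# writes nothing to the device.
toy_mkfs_sized() {
	echo "toy_mkfs_sized: cannot make a $1 byte file system" >&2
	return 1
}

toy_mkfs                                                    # step 1: full size
toy_mkfs_sized $((115 * 1024 * 1024)) 2> /dev/null || true  # step 2: failure ignored

state=$(cat "$dev")
rm -f "$dev"
echo "scratch device holds: $state"
```

In this model the device still holds the full-size file system after
step 2, which is the state the rest of the test would then try to fill.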

So even if we can rely on wipefs causing the tests to fail, maybe we
should just add a check for mkfs failure to _scratch_mkfs_sized?  I
think that's a better fix than Josef's proposed patch to generic/204.
One benefit of adding the check to _scratch_mkfs_sized is we can
supply a clearer explanation of the failure since the failure would be
"mkfs failed" as opposed to "mount: /vdc: wrong fs type, bad option,
bad superblock on /dev/vdc, missing codepage or helper program, or
other error."
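
A minimal sketch of that idea, under stated assumptions: fake_mkfs,
mkfs_sized_checked and the 256 MiB limit are all made up for the
illustration and are not the fstests helpers.  The point is only that
the wrapper names the mkfs failure itself and returns nonzero, so the
caller can stop before it ever reaches the mount.

```shell
#!/bin/bash
# Arbitrary lower limit, chosen only so that the example has a failure case.
min_size=$((256 * 1024 * 1024))

# Hypothetical stand-in for a real mkfs that refuses small sizes.
fake_mkfs() {
	if [ "$1" -lt "$min_size" ]; then
		echo "fake_mkfs: $1 bytes is too small" >&2
		return 1
	fi
	echo "fake_mkfs: made a $1 byte file system"
}

# Hypothetical wrapper: check the mkfs exit status at the point of failure
# and report it clearly instead of leaving it to a later mount error.
mkfs_sized_checked() {
	if ! fake_mkfs "$1"; then
		echo "mkfs failed" >&2
		return 1
	fi
}

# Usage: the caller stops as soon as the wrapper reports a failure.
mkfs_sized_checked $((115 * 1024 * 1024)) > /dev/null 2>&1 \
	|| echo "stopping before mount: mkfs failed"
```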

It might also make sense to adjust the size passed to
_scratch_mkfs_sized in generic/204 to be something slightly larger,
since otherwise it's pretty much guaranteed that generic/204 will
start failing on PowerPC when testing with a 64k block size.
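
For a sense of scale, the arithmetic behind that concern can be sketched
as follows.  The 4k comparison and the variable names are only for
illustration; this shows the block counts and makes no claim about the
minimum any particular mkfs accepts.

```shell
#!/bin/bash
# SIZE as set in generic/204: 115 MiB.
size=$((115 * 1024 * 1024))

# Number of file system blocks that size corresponds to.
blocks_4k=$((size / 4096))
blocks_64k=$((size / 65536))

echo "4k blocks:  $blocks_4k"     # 29440
echo "64k blocks: $blocks_64k"    # 1840
```

With a 64k block size the same 115 MiB leaves 16 times fewer blocks for
the file system to work with.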

Cheers,

						- Ted

Patch

diff --git a/tests/generic/204 b/tests/generic/204
index a3dabb71..b5deb443 100755
--- a/tests/generic/204
+++ b/tests/generic/204
@@ -35,7 +35,8 @@  _scratch_mkfs 2> /dev/null | _filter_mkfs 2> $tmp.mkfs > /dev/null
 [ $FSTYP = "xfs" ] && MKFS_OPTIONS="$MKFS_OPTIONS -l size=16m -i maxpct=50"
 
 SIZE=`expr 115 \* 1024 \* 1024`
-_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw
+_scratch_mkfs_sized $SIZE $dbsize 2> /dev/null > $tmp.mkfs.raw \
+	|| _fail "mkfs failed"
 cat $tmp.mkfs.raw | _filter_mkfs 2> $tmp.mkfs > /dev/null
 _scratch_mount