Message ID | 1418264718-31198-1-git-send-email-tytso@mit.edu (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Dec 10, 2014 at 09:25:18PM -0500, Theodore Ts'o wrote: > It's possible based on a race conditions (and possibly the version of > coreutils which supplies /usr/bin/yes) that commands of the form: > > yes | $MKFS_PROG ... > > will end up causing the following failure: What race conditions? > shared/298 16s ... [23:49:03] [23:49:19] - output mismatch (see /results/results-4k/shared/298.out.bad) > --- tests/shared/298.out 2014-10-31 10:13:04.000000000 -0400 > +++ /results/results-4k/shared/298.out.bad 2014-11-29 23:49:19.118138099 -0500 > @@ -1,4 +1,6 @@ > QA output created by 298 > +yes: standard output: Broken pipe > +yes: write error But thats indicative that something failed and you need to analyse the reason, yes? So AFAICT this is deisred behaviour, and hiding the error will actually hide other failures, right? > Generating garbage on loop...done. > Running fstrim...done. > Detecting interesting holes in image...done. > ... > (Run 'diff -u tests/shared/298.out /results/results-4k/shared/298.out.bad' to see the entire diff) > > The simplest way to fix this is to redirect the stderr of the yes > command to /dev/null. I'd highly recommend you add ext4 specific mkfs branches so you can use the non-interactive/force mkfs flags so that "yes |" is not necessary for your (and everyone elses ext4) test environments. Cheers, Dave.
On Thu, Dec 11, 2014 at 03:13:23PM +1100, Dave Chinner wrote: > On Wed, Dec 10, 2014 at 09:25:18PM -0500, Theodore Ts'o wrote: > > It's possible based on a race conditions (and possibly the version of > > coreutils which supplies /usr/bin/yes) that commands of the form: > > > > yes | $MKFS_PROG ... > > > > will end up causing the following failure: > > What race conditions? It looks like under some circumstances, mkfs reads from stdin and then exits quickly; in the mean time, since the pipe has read from, yes then tries to send more "y\n" to stdout, and then it gets an EPIPE error and complains. > But thats indicative that something failed and you need to analyse > the reason, yes? So AFAICT this is deisred behaviour, and hiding the > error will actually hide other failures, right? No, I don't think anything fails; it's perfectly OK that mkfs can read from stdin and then exit very quickly. It seems to happen primarily when we're running a test where mkfs can exit quickly enough to hit this race. > I'd highly recommend you add ext4 specific mkfs branches so you can > use the non-interactive/force mkfs flags so that "yes |" is not > necessary for your (and everyone elses ext4) test environments. Sure, I'd be happy to add a ext2/3/4 specific mkfs branch. Does any other file system's mkfs try to do interactive I/O? If so, then they could hit this too, so I think the "yes 2> /dev/null | ..." formulation is still a good thing to do. If not, maybe we should be removing the "yes | ..." construct entirely, since it's pretty ugly IMHO. Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 11, 2014 at 07:24:08AM -0500, Theodore Ts'o wrote: > On Thu, Dec 11, 2014 at 03:13:23PM +1100, Dave Chinner wrote: > > On Wed, Dec 10, 2014 at 09:25:18PM -0500, Theodore Ts'o wrote: > > > It's possible based on a race conditions (and possibly the version of > > > coreutils which supplies /usr/bin/yes) that commands of the form: > > > > > > yes | $MKFS_PROG ... > > > > > > will end up causing the following failure: > > > > What race conditions? > > It looks like under some circumstances, mkfs reads from stdin and then > exits quickly; in the mean time, since the pipe has read from, yes > then tries to send more "y\n" to stdout, and then it gets an EPIPE > error and complains. > > > But thats indicative that something failed and you need to analyse > > the reason, yes? So AFAICT this is deisred behaviour, and hiding the > > error will actually hide other failures, right? > > No, I don't think anything fails; it's perfectly OK that mkfs can read > from stdin and then exit very quickly. It seems to happen primarily > when we're running a test where mkfs can exit quickly enough to hit > this race. Ok, seems like a pretty strange corner case and more like a bug in yes than anything.... > > I'd highly recommend you add ext4 specific mkfs branches so you can > > use the non-interactive/force mkfs flags so that "yes |" is not > > necessary for your (and everyone elses ext4) test environments. > > Sure, I'd be happy to add a ext2/3/4 specific mkfs branch. Great! > Does any > other file system's mkfs try to do interactive I/O? If so, then they > could hit this too, so I think the "yes 2> /dev/null | ..." > formulation is still a good thing to do. If not, maybe we should be > removing the "yes | ..." construct entirely, since it's pretty ugly > IMHO. I don't know what other mkfs programs do. I agree that it's an ugly construct, but it seems to work for all the other non-specific mkfs programs around. If you create extN specific mkfs branches, then I'd just leave the default alone so we don't inadvertantly break some other filesystem that needs it... Cheers, Dave.
diff --git a/common/rc b/common/rc index d5e3aff..39a1432 100644 --- a/common/rc +++ b/common/rc @@ -518,7 +518,7 @@ _test_mkfs() $MKFS_BTRFS_PROG $MKFS_OPTIONS $* $TEST_DEV > /dev/null ;; *) - yes | $MKFS_PROG -t $FSTYP -- $MKFS_OPTIONS $* $TEST_DEV + yes 2>/dev/null | $MKFS_PROG -t $FSTYP -- $MKFS_OPTIONS $* $TEST_DEV ;; esac } @@ -536,7 +536,7 @@ _mkfs_dev() $MKFS_BTRFS_PROG $MKFS_OPTIONS $* 2>$tmp_dir.mkfserr 1>$tmp_dir.mkfsstd ;; *) - yes | $MKFS_PROG -t $FSTYP -- $MKFS_OPTIONS $* \ + yes 2>/dev/null | $MKFS_PROG -t $FSTYP -- $MKFS_OPTIONS $* \ 2>$tmp_dir.mkfserr 1>$tmp_dir.mkfsstd ;; esac @@ -586,7 +586,7 @@ _scratch_mkfs() # do nothing for tmpfs ;; *) - yes | $MKFS_PROG -t $FSTYP -- $MKFS_OPTIONS $* $SCRATCH_DEV + yes 2>/dev/null | $MKFS_PROG -t $FSTYP -- $MKFS_OPTIONS $* $SCRATCH_DEV ;; esac } @@ -655,7 +655,8 @@ _scratch_mkfs_sized() fi ;; ext2|ext3|ext4|ext4dev) - yes | ${MKFS_PROG}.$FSTYP $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks + yes 2>/dev/null | ${MKFS_PROG}.$FSTYP $MKFS_OPTIONS -b $blocksize \ + $SCRATCH_DEV $blocks ;; udf) $MKFS_UDF_PROG $MKFS_OPTIONS -b $blocksize $SCRATCH_DEV $blocks
It's possible based on a race conditions (and possibly the version of coreutils which supplies /usr/bin/yes) that commands of the form: yes | $MKFS_PROG ... will end up causing the following failure: shared/298 16s ... [23:49:03] [23:49:19] - output mismatch (see /results/results-4k/shared/298.out.bad) --- tests/shared/298.out 2014-10-31 10:13:04.000000000 -0400 +++ /results/results-4k/shared/298.out.bad 2014-11-29 23:49:19.118138099 -0500 @@ -1,4 +1,6 @@ QA output created by 298 +yes: standard output: Broken pipe +yes: write error Generating garbage on loop...done. Running fstrim...done. Detecting interesting holes in image...done. ... (Run 'diff -u tests/shared/298.out /results/results-4k/shared/298.out.bad' to see the entire diff) The simplest way to fix this is to redirect the stderr of the yes command to /dev/null. Signed-off-by: Theodore Ts'o <tytso@mit.edu> --- common/rc | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)