Message ID | 1434014273-1059-1-git-send-email-eguan@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Jun 11, 2015 at 05:17:53PM +0800, Eryu Guan wrote: > Test concurrent buffered I/O, DIO, AIO, mmap I/O and splice I/O on the > same files. > > Signed-off-by: Eryu Guan <eguan@redhat.com> > --- > > This fio job file has been proven to be potent, it triggers WARNINGs on ext4 > and xfs with 4.1-rc6 kernel. > > ext4: WARNING: at fs/ext4/inode.c:1328 > xfs: WARNING: CPU: 7 PID: 3090 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]() Ok, so that warning is expected on XFS - that's intentional, WARN_ONCE() output indicating a data coherency problem has occurred because of the because the application is mixing buffered/mmap IO with direct IO on the same file and direct Io has been unable to cleanly invalidate the cache. i.e. it's information to us developers explaining why the user is complaining about data corruption.... So this test is never going to pass on XFS unless you tell the test harness to ignore the dmesg output... > And it ever paniced kernel in mm code and hung xfs. The "hung XFS" case will probably be the pipe mutex inversion problem in the generic splice code. i.e. .splice_read -> xfs_file_splice_read -> IOLOCK_SHARED -> generic_file_splice_read -> splice_to_pipe -> pipe_lock() vs: iter_file_splice_write -> pipe_lock() -> vfs_iter_write -> xfs_file_write_iter -> xfs_file_buffered_aio_write -> IOLOCK_EXCL Can you confirm this? If so, there's not much we can actually do about this - the recent big splice rewrite replaced the pipe_lock/i_mutex inversion deadlock with a different pipe_lock inversion deadlock.... > diff --git a/tests/generic/group b/tests/generic/group > index 0c8964c..2e534a5 100644 > --- a/tests/generic/group > +++ b/tests/generic/group > @@ -92,6 +92,7 @@ > 087 perms auto quick > 088 perms auto quick > 089 metadata auto > +090 auto rw stress Hence I'm not sure "auto" is the correct group here. "dangerous" is more likely because it is exercising a problem we can't fix and will prevent the auto test group from making progress past this test. Cheers, Dave.
On Thu, Jun 18, 2015 at 08:15:25AM +1000, Dave Chinner wrote: > On Thu, Jun 11, 2015 at 05:17:53PM +0800, Eryu Guan wrote: > > Test concurrent buffered I/O, DIO, AIO, mmap I/O and splice I/O on the > > same files. > > > > Signed-off-by: Eryu Guan <eguan@redhat.com> > > --- > > > > This fio job file has been proven to be potent, it triggers WARNINGs on ext4 > > and xfs with 4.1-rc6 kernel. > > > > ext4: WARNING: at fs/ext4/inode.c:1328 > > xfs: WARNING: CPU: 7 PID: 3090 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]() > > Ok, so that warning is expected on XFS - that's intentional, > WARN_ONCE() output indicating a data coherency problem has occurred > because of the because the application is mixing buffered/mmap IO > with direct IO on the same file and direct Io has been unable to > cleanly invalidate the cache. i.e. it's information to us > developers explaining why the user is complaining about data > corruption.... > > So this test is never going to pass on XFS unless you tell the test > harness to ignore the dmesg output... I can send a v4 to disable dmesg check if FSTYP is xfs, but that will ignore any other WARNINGs/Oops too, which seems not ideal to me either. I'm fine to go either way here(disable the dmesg check or not). But I personally prefer changing the WARN_ON_ONCE to something like xfs_warn() or xfs_warn_ratelimited() to give out the warning. > > > And it ever paniced kernel in mm code and hung xfs. > > The "hung XFS" case will probably be the pipe mutex inversion > problem in the generic splice code. i.e. > > .splice_read -> xfs_file_splice_read -> IOLOCK_SHARED -> > generic_file_splice_read -> splice_to_pipe -> pipe_lock() > > vs: > > iter_file_splice_write -> pipe_lock() -> vfs_iter_write -> > xfs_file_write_iter -> xfs_file_buffered_aio_write -> IOLOCK_EXCL > > Can you confirm this? If so, there's not much we can actually do > about this - the recent big splice rewrite replaced the > pipe_lock/i_mutex inversion deadlock with a different pipe_lock > inversion deadlock.... Yes, XFS deadlocks in the splice code with RHEL7.1 kernel but doesn't deadlock with 4.1-rc[567] kernels (I only confirmed on these kernels just now), so ... > > > diff --git a/tests/generic/group b/tests/generic/group > > index 0c8964c..2e534a5 100644 > > --- a/tests/generic/group > > +++ b/tests/generic/group > > @@ -92,6 +92,7 @@ > > 087 perms auto quick > > 088 perms auto quick > > 089 metadata auto > > +090 auto rw stress > > Hence I'm not sure "auto" is the correct group here. "dangerous" is > more likely because it is exercising a problem we can't fix and will > prevent the auto test group from making progress past this test. I think the auto group should be fine here. Thanks, Eryu -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 18, 2015 at 11:04:31AM +0800, Eryu Guan wrote: > On Thu, Jun 18, 2015 at 08:15:25AM +1000, Dave Chinner wrote: > > On Thu, Jun 11, 2015 at 05:17:53PM +0800, Eryu Guan wrote: > > > Test concurrent buffered I/O, DIO, AIO, mmap I/O and splice I/O on the > > > same files. > > > > > > Signed-off-by: Eryu Guan <eguan@redhat.com> > > > --- > > > > > > This fio job file has been proven to be potent, it triggers WARNINGs on ext4 > > > and xfs with 4.1-rc6 kernel. > > > > > > ext4: WARNING: at fs/ext4/inode.c:1328 > > > xfs: WARNING: CPU: 7 PID: 3090 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]() > > > > Ok, so that warning is expected on XFS - that's intentional, > > WARN_ONCE() output indicating a data coherency problem has occurred > > because of the because the application is mixing buffered/mmap IO > > with direct IO on the same file and direct Io has been unable to > > cleanly invalidate the cache. i.e. it's information to us > > developers explaining why the user is complaining about data > > corruption.... > > > > So this test is never going to pass on XFS unless you tell the test > > harness to ignore the dmesg output... > > I can send a v4 to disable dmesg check if FSTYP is xfs, but that will > ignore any other WARNINGs/Oops too, which seems not ideal to me either. Such conditional output issues are dealt with by adding filters to the output.... > I'm fine to go either way here(disable the dmesg check or not). But I > personally prefer changing the WARN_ON_ONCE to something like xfs_warn() > or xfs_warn_ratelimited() to give out the warning. History tells us that such warnings get ignored and not reported, and we lose lots of hair before we find out that the bug reporter thought it "wasn't important" and so "didn't include it" in any of the bug reports. Data coherency problems are important enough that we WARN_ON_ONCE is justified - it's something we need to know about sooner rather than later, and it's something that application developers also need to be aware of. They won't notice an xfs warning in the logs, but they will notice abort() or some other syslog monitor telling them there's been a kernel warning.... > > > And it ever paniced kernel in mm code and hung xfs. > > > > The "hung XFS" case will probably be the pipe mutex inversion > > problem in the generic splice code. i.e. > > > > .splice_read -> xfs_file_splice_read -> IOLOCK_SHARED -> > > generic_file_splice_read -> splice_to_pipe -> pipe_lock() > > > > vs: > > > > iter_file_splice_write -> pipe_lock() -> vfs_iter_write -> > > xfs_file_write_iter -> xfs_file_buffered_aio_write -> IOLOCK_EXCL > > > > Can you confirm this? If so, there's not much we can actually do > > about this - the recent big splice rewrite replaced the > > pipe_lock/i_mutex inversion deadlock with a different pipe_lock > > inversion deadlock.... > > Yes, XFS deadlocks in the splice code with RHEL7.1 kernel but doesn't > deadlock with 4.1-rc[567] kernels (I only confirmed on these kernels > just now), so ... Oh, ok, so the current upstream is fine; RHEL7 has the pre-write_iter rewrite of the splice code, so the deadlock must be of the older variety. We can ignore that, then. > > > diff --git a/tests/generic/group b/tests/generic/group > > > index 0c8964c..2e534a5 100644 > > > --- a/tests/generic/group > > > +++ b/tests/generic/group > > > @@ -92,6 +92,7 @@ > > > 087 perms auto quick > > > 088 perms auto quick > > > 089 metadata auto > > > +090 auto rw stress > > > > Hence I'm not sure "auto" is the correct group here. "dangerous" is > > more likely because it is exercising a problem we can't fix and will > > prevent the auto test group from making progress past this test. > > I think the auto group should be fine here. If it doesn't fail on current upstream kernels, that will be fine. If it fails, and there is no likely resolution of the failure in the forseeable future, then it does not belong in the auto group. Cheers, Dave.
diff --git a/tests/generic/090 b/tests/generic/090 new file mode 100755 index 0000000..e7cca52 --- /dev/null +++ b/tests/generic/090 @@ -0,0 +1,122 @@ +#! /bin/bash +# FS QA Test generic/090 +# +# Concurrent mixed I/O (buffer I/O, aiodio, mmap, splice) on the same files +# +#----------------------------------------------------------------------- +# Copyright (c) 2015 Red Hat Inc. All Rights Reserved. +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#----------------------------------------------------------------------- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" + +here=`pwd` +tmp=/tmp/$$ +status=1 # failure is the default! +trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + cd / + rm -f $tmp.* +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter + +# real QA test starts here +_supported_fs generic +_supported_os Linux +_require_scratch + +iodepth=$((16 * LOAD_FACTOR)) +iodepth_batch=$((8 * LOAD_FACTOR)) +numjobs=$((5 * LOAD_FACTOR)) +fio_config=$tmp.fio +cat >$fio_config <<EOF +[global] +bs=8k +iodepth=$iodepth +iodepth_batch=$iodepth_batch +randrepeat=1 +size=1m +directory=$SCRATCH_MNT +numjobs=$numjobs +[job1] +ioengine=sync +bs=1k +direct=1 +rw=randread +filename=file1:file2 +[job2] +ioengine=libaio +rw=randwrite +direct=1 +filename=file1:file2 +[job3] +bs=1k +ioengine=posixaio +rw=randwrite +direct=1 +filename=file1:file2 +[job4] +ioengine=splice +direct=1 +rw=randwrite +filename=file1:file2 +[job5] +bs=1k +ioengine=sync +rw=randread +filename=file1:file2 +[job6] +ioengine=posixaio +rw=randwrite +filename=file1:file2 +[job7] +ioengine=splice +rw=randwrite +filename=file1:file2 +[job8] +ioengine=mmap +rw=randwrite +bs=1k +filename=file1:file2 +[job9] +ioengine=mmap +rw=randwrite +direct=1 +filename=file1:file2 +EOF +# with ioengine=mmap and direct=1, fio requires bs to be at least pagesize, +# which is a fio built-in var. +echo 'bs=$pagesize' >> $fio_config + +rm -f $seqres.full +_require_fio $fio_config +_scratch_mkfs >>$seqres.full 2>&1 +_scratch_mount + +echo "Silence is golden" +$FIO_PROG $fio_config >>$seqres.full 2>&1 + +# all done, expect no hang no oops no fs corruption, +# _check_dmesg and _check_filesystems will do the check work for us +status=0 +exit diff --git a/tests/generic/090.out b/tests/generic/090.out new file mode 100644 index 0000000..2b5100d --- /dev/null +++ b/tests/generic/090.out @@ -0,0 +1,2 @@ +QA output created by 090 +Silence is golden diff --git a/tests/generic/group b/tests/generic/group index 0c8964c..2e534a5 100644 --- a/tests/generic/group +++ b/tests/generic/group @@ -92,6 +92,7 @@ 087 perms auto quick 088 perms auto quick 089 metadata auto +090 auto rw stress 091 rw auto quick 092 auto quick prealloc 093 attr cap udf auto
Test concurrent buffered I/O, DIO, AIO, mmap I/O and splice I/O on the same files. Signed-off-by: Eryu Guan <eguan@redhat.com> --- This fio job file has been proven to be potent, it triggers WARNINGs on ext4 and xfs with 4.1-rc6 kernel. ext4: WARNING: at fs/ext4/inode.c:1328 xfs: WARNING: CPU: 7 PID: 3090 at fs/xfs/xfs_file.c:726 xfs_file_dio_aio_write+0x176/0x2a8 [xfs]() The ext4 issue should be fixed by Lukas's patch ext4: fix reservation release on invalidatepage for delalloc fs And it ever paniced kernel in mm code and hung xfs. I reduced the numjobs and iodepth to reduce the test time(~25s on my test host) and scale them by $LOAD_FACTOR. And it still could trigger the warning on ext4 and xfs with reduced workload. v2: - use mktemp to create tmp fio job file v3: - initialize scratch dev and mount it correctly before test - roll back to use /tmp/$$ as tmp tests/generic/090 | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ tests/generic/090.out | 2 + tests/generic/group | 1 + 3 files changed, 125 insertions(+) create mode 100755 tests/generic/090 create mode 100644 tests/generic/090.out