xfstests xfs fuzzers fail with DAX

Message ID 20160804024514.GA2906@xzhoul.usersys.redhat.com (mailing list archive)
State Not Applicable, archived

Commit Message

Murphy Zhou Aug. 4, 2016, 2:45 a.m. UTC
Hi,

A few xfs fuzzers in xfstests fail with the dax mount option but pass without it.
They are xfs/086, xfs/088, xfs/089 and xfs/091.

xfstests to commit 4470ad4c7e  (Jul 26)
kernel   to commit dd95069545  (Jul 24)

+ ./check xfs/091
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch

xfs/091	 104s
Ran: xfs/091
Passed all 1 tests

+ echo 'MOUNT_OPTIONS="-o dax"'
+ ./check xfs/091
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch

xfs/091 104s ...  - output mismatch (see /root/xfstests/results//xfs/091.out.bad)
    --- tests/xfs/091.out	2016-07-18 02:57:47.670000000 -0400
    +++ /root/xfstests/results//xfs/091.out.bad	2016-08-03 22:38:14.948000000 -0400
    @@ -6,6 +6,70 @@
     + corrupt image
     + mount image
     + modify files
    +pwrite64: Structure needs cleaning
    +pwrite64: Structure needs cleaning
    +pwrite64: Structure needs cleaning
    +pwrite64: Structure needs cleaning
    ...
    (Run 'diff -u tests/xfs/091.out /root/xfstests/results//xfs/091.out.bad'  to see the entire diff)
Ran: xfs/091
Failures: xfs/091
Failed 1 of 1 tests

# diff -u xfstests/tests/xfs/091.out /root/xfstests/results//xfs/091.out.bad

Comments

Eric Sandeen Aug. 4, 2016, 3 a.m. UTC | #1
On 8/3/16 9:45 PM, Xiong Zhou wrote:
> Hi,
> 
> A few xfs fuzzers in xfstests fail with dax mount option, pass without dax.
> They are xfs/086 xfs/088 xfs/089 xfs/091.
> 
> xfstests to commit 4470ad4c7e  (Jul 26)
> kernel   to commit dd95069545  (Jul 24)
> 
> + ./check xfs/091
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
> MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
> MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch
> 
> xfs/091	 104s
> Ran: xfs/091
> Passed all 1 tests
> 
> + echo 'MOUNT_OPTIONS="-o dax"'
> + ./check xfs/091
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
> MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
> MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch
> 
> xfs/091 104s ...  - output mismatch (see /root/xfstests/results//xfs/091.out.bad)
>     --- tests/xfs/091.out	2016-07-18 02:57:47.670000000 -0400
>     +++ /root/xfstests/results//xfs/091.out.bad	2016-08-03 22:38:14.948000000 -0400
>     @@ -6,6 +6,70 @@
>      + corrupt image
>      + mount image
>      + modify files
>     +pwrite64: Structure needs cleaning
>     +pwrite64: Structure needs cleaning
>     +pwrite64: Structure needs cleaning
>     +pwrite64: Structure needs cleaning

This means the filesystem has shut down, most likely, and more information about
the error is in dmesg.

Further, if the filesystem is corrupt, xfs_repair output would be interesting.

Can you provide that information?
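
For illustration, a minimal sketch of how that information could be gathered after a failing run. The function name and output file names are hypothetical and not part of xfstests; it assumes the scratch device has already been unmounted and that the caller has the privileges to read the kernel log.

```shell
# Hypothetical helper (not part of xfstests): save XFS-related kernel
# messages and a read-only xfs_repair report into a directory.
collect_xfs_failure_info() {
    dev=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    # Keep only kernel log lines that mention XFS; "|| true" tolerates no matches.
    dmesg | grep -i 'xfs' > "$outdir/dmesg-xfs.txt" || true
    # -n is no-modify mode: report problems without changing the filesystem.
    xfs_repair -n "$dev" > "$outdir/xfs_repair.txt" 2>&1 || true
}
```

It could be called as `collect_xfs_failure_info /dev/pmem1 /tmp/xfs-091-debug`, where the output directory is an arbitrary choice.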

This can probably be reproduced, but when reporting a bug it's always good to provide
as many details as you can.

Thanks,
-Eric
Dave Chinner Aug. 7, 2016, 11:11 p.m. UTC | #2
On Wed, Aug 03, 2016 at 10:00:09PM -0500, Eric Sandeen wrote:
> On 8/3/16 9:45 PM, Xiong Zhou wrote:
> > Hi,
> > 
> > A few xfs fuzzers in xfstests fail with dax mount option, pass without dax.
> > They are xfs/086 xfs/088 xfs/089 xfs/091.
> > 
> > xfstests to commit 4470ad4c7e  (Jul 26)
> > kernel   to commit dd95069545  (Jul 24)
> > 
> > + ./check xfs/091
> > FSTYP         -- xfs (non-debug)
> > PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
> > MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
> > MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch
> > 
> > xfs/091	 104s
> > Ran: xfs/091
> > Passed all 1 tests
> > 
> > + echo 'MOUNT_OPTIONS="-o dax"'
> > + ./check xfs/091
> > FSTYP         -- xfs (non-debug)
> > PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
> > MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
> > MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch
> > 
> > xfs/091 104s ...  - output mismatch (see /root/xfstests/results//xfs/091.out.bad)
> >     --- tests/xfs/091.out	2016-07-18 02:57:47.670000000 -0400
> >     +++ /root/xfstests/results//xfs/091.out.bad	2016-08-03 22:38:14.948000000 -0400
> >     @@ -6,6 +6,70 @@
> >      + corrupt image
> >      + mount image
> >      + modify files
> >     +pwrite64: Structure needs cleaning
> >     +pwrite64: Structure needs cleaning
> >     +pwrite64: Structure needs cleaning
> >     +pwrite64: Structure needs cleaning
> 
> This means the filesystem has shut down, most likely, and more information about
> the error is in dmesg.
>
> Further, if the filesystem is corrupt, xfs_repair output would be interesting.
> 
> Can you provide that information?
> 
> This can probably be reproduced, but when reporting a bug it's always good to provide
> as many details as you can.

What it indicates to me is that DAX detects inode/freespace metadata
related corruption sooner than non-DAX paths because we don't do
delayed allocation on DAX. i.e. we are doing direct allocation in
the syscall path, and errors that would have been detected in the
writeback path and not triggered until sync/unmount are now triggering
in the pwrite() syscall path.

Hence I think this is probably expected behaviour, and not a bug or
regression. We probably should just filter the pwrite errors out...
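
As a rough sketch of what such a filter might look like (the function name is hypothetical; this is not an existing xfstests filter), the test output could be piped through a sed expression that drops exactly the message seen in the report:

```shell
# Hypothetical filter (not an existing xfstests helper): drop the
# "Structure needs cleaning" (EUCLEAN) lines printed for failed pwrite calls,
# and pass every other line through unchanged.
filter_pwrite_corruption() {
    sed -e '/^pwrite64: Structure needs cleaning$/d'
}
```

With this, the golden output would stay the same for DAX and non-DAX runs, at the cost of no longer checking whether the write itself reported the corruption.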

Cheers,

Dave.
Dan Williams Aug. 30, 2016, 1:50 a.m. UTC | #3
[ Adding Darrick on the off chance that this triggers an "aha, of
course it does!" ]

Darrick, these corruption tests you added to xfstests last year all
fail the same way with DAX enabled.  They spew:

    "pwrite64: Structure needs cleaning"

...reports that are cleaned up by running without "-o dax".

Alternatively you could sit back and watch me try to figure it out,
that should be quite entertaining... as a start I'll try to pin down a
stack trace when the error is returned.


On Wed, Aug 3, 2016 at 7:45 PM, Xiong Zhou <xzhou@redhat.com> wrote:
> Hi,
>
> A few xfs fuzzers in xfstests fail with dax mount option, pass without dax.
> They are xfs/086 xfs/088 xfs/089 xfs/091.
>
> xfstests to commit 4470ad4c7e  (Jul 26)
> kernel   to commit dd95069545  (Jul 24)
>
> + ./check xfs/091
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
> MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
> MOUNT_OPTIONS -- -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch
>
> xfs/091  104s
> Ran: xfs/091
> Passed all 1 tests
>
> + echo 'MOUNT_OPTIONS="-o dax"'
> + ./check xfs/091
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 rhel73 4.7.0+
> MKFS_OPTIONS  -- -f -bsize=4096 /dev/pmem1
> MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem1 /daxsch
>
> xfs/091 104s ...  - output mismatch (see /root/xfstests/results//xfs/091.out.bad)
>     --- tests/xfs/091.out       2016-07-18 02:57:47.670000000 -0400
>     +++ /root/xfstests/results//xfs/091.out.bad 2016-08-03 22:38:14.948000000 -0400
>     @@ -6,6 +6,70 @@
>      + corrupt image
>      + mount image
>      + modify files
>     +pwrite64: Structure needs cleaning
>     +pwrite64: Structure needs cleaning
>     +pwrite64: Structure needs cleaning
>     +pwrite64: Structure needs cleaning
>     ...
>     (Run 'diff -u tests/xfs/091.out /root/xfstests/results//xfs/091.out.bad'  to see the entire diff)
> Ran: xfs/091
> Failures: xfs/091
> Failed 1 of 1 tests
>
> # diff -u xfstests/tests/xfs/091.out /root/xfstests/results//xfs/091.out.bad
> --- xfstests/tests/xfs/091.out  2016-07-18 02:57:47.670000000 -0400
> +++ /root/xfstests/results//xfs/091.out.bad     2016-08-03 22:38:14.948000000 -0400
> @@ -6,6 +6,70 @@
>  + corrupt image
>  + mount image
>  + modify files
> +pwrite64: Structure needs cleaning
> <snip 62 more same lines>
> +pwrite64: Structure needs cleaning
>  + repair fs
>  + mount image
>  + chattr -R -i
>
>
> Thanks,
> Xiong
Darrick J. Wong Aug. 30, 2016, 2:37 a.m. UTC | #4
On Mon, Aug 29, 2016 at 06:50:05PM -0700, Dan Williams wrote:
> [ Adding Darrick on the off chance that this triggers an "aha, of
> course it does!" ]

Aha!  Of course it does!!! :)

> Darrick these corruption tests you added to xfstests last year all
> fail the same way with DAX enabled.  They spew:
> 
>     "pwrite64: Structure needs cleaning"
> 
> ...reports that are cleaned up by running without "-o dax".

I think this happens because in non-dax mode, the pwrite is a buffered
write and so long as we can create a delalloc reservation, everything
is ok and nothing fails.  Whereas for dax we have to allocate the
blocks for the pwrite immediately, thereby triggering the cntbt
verifier error.

Proceeding from the assumption "DAX behaves a lot like DIO", all the
tests that rely on buffered mode semantics are going to choke if DAX
is turned on without them knowing about it.
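
One way a test could at least find out that DAX is in play, sketched under assumptions: the helper name is hypothetical, and it only recognizes the bare "dax" option as it appears in the MOUNT_OPTIONS shown in the report.

```shell
# Hypothetical helper (not part of xfstests): succeed if a mount option
# string such as "-o dax -o context=..." contains the bare "dax" option.
mount_opts_have_dax() {
    # Split on spaces and commas so each option sits on its own line,
    # then require a line that is exactly "dax".
    printf '%s\n' "$1" | tr ' ,' '\n\n' | grep -qx 'dax'
}
```

A test relying on buffered-write behaviour could call `mount_opts_have_dax "$MOUNT_OPTIONS"` and adjust its expectations, or decline to run, when it succeeds.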

> Alternatively you could sit back and watch me try to figure it out,
> that should be quite entertaining... as a start I'll try to pin down a
> stack trace when the error is returned.

As for how to fix this, probably the best option is to change line 98
to 'pwrite -W -S 0x62...' and update the output to include the
'structure needs cleaning' message.

Or get rid of the mount option and require explicitly turning on DAX
on a per-inode basis, which I think is where Dave is already going.

--D

Dan Williams Aug. 30, 2016, 2:53 p.m. UTC | #5
On Mon, Aug 29, 2016 at 7:37 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Mon, Aug 29, 2016 at 06:50:05PM -0700, Dan Williams wrote:
>> [ Adding Darrick on the off chance that this triggers an "aha, of
>> course it does!" ]
>
> Aha!  Of course it does!!! :)

Heh, thanks :).  And apologies to Dave for missing his earlier note
pointing out the delalloc failure; the linux-nvdimm list ate the response.

>
>> Darrick these corruption tests you added to xfstests last year all
>> fail the same way with DAX enabled.  They spew:
>>
>>     "pwrite64: Structure needs cleaning"
>>
>> ...reports that are cleaned up by running without "-o dax".
>
> I think this happens because in non-dax mode, the pwrite is a buffered
> write and so long as we can create a delalloc reservation, everything
> is ok and nothing fails.  Whereas for dax we have to allocate the
> blocks for the pwrite immediately, thereby triggering the cntbt
> verifier error.
>
> Proceeding from the assumption "DAX behaves a lot like DIO", all the
> tests that rely on buffered mode semantics are going to choke if DAX
> is turned on without them knowing about it.
>
>> Alternatively you could sit back and watch me try to figure it out,
>> that should be quite entertaining... as a start I'll try to pin down a
>> stack trace when the error is returned.
>
> As for how to fix this, probably the best option is to change line 98
> to 'pwrite -W -S 0x62...' and update the output to include the
> 'structure needs cleaning' message.

I'll give it a shot.

> Or get rid of the mount option and require explicitly turning on DAX
> on a per-inode basis, which I think is where Dave is already going.

Yes, I think we can't run away from the dax mount option fast enough.
The semantics are different, so an application / administrator needs
to explicitly opt-in to DAX semantics per-inode otherwise we are
guaranteed to cause surprises.

Patch

--- xfstests/tests/xfs/091.out	2016-07-18 02:57:47.670000000 -0400
+++ /root/xfstests/results//xfs/091.out.bad	2016-08-03 22:38:14.948000000 -0400
@@ -6,6 +6,70 @@ 
 + corrupt image
 + mount image
 + modify files
+pwrite64: Structure needs cleaning
<snip 62 more same lines>
+pwrite64: Structure needs cleaning
 + repair fs
 + mount image
 + chattr -R -i


Thanks,
Xiong
