Message ID | 20211021163959.1887011-1-bfoster@redhat.com (mailing list archive) |
---|---|
State | Deferred, archived |
Series | generic: test COW writeback failure when overlapping non-shared blocks |
On Thu, Oct 21, 2021 at 12:39:59PM -0400, Brian Foster wrote:
> Test that COW writeback that overlaps non-shared delalloc blocks
> does not leave around stale delalloc blocks on I/O failure. This
> triggers assert failures and free space accounting corruption on
> XFS.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> This test targets the problem addressed by the following patch in XFS:
>
> https://lore.kernel.org/linux-xfs/20211021163330.1886516-1-bfoster@redhat.com/
>
> Brian
>
>  tests/generic/651     | 53 +++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/651.out |  2 ++
>  2 files changed, 55 insertions(+)
>  create mode 100755 tests/generic/651
>  create mode 100644 tests/generic/651.out
>
> diff --git a/tests/generic/651 b/tests/generic/651
> new file mode 100755
> index 00000000..8d4e6728
> --- /dev/null
> +++ b/tests/generic/651
> @@ -0,0 +1,53 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2021 Red Hat, Inc. All Rights Reserved.
> +#
> +# FS QA Test 651
> +#
> +# Test that COW writeback that overlaps non-shared delalloc blocks does not
> +# leave around stale delalloc blocks on I/O failure. This triggers assert
> +# failures and free space accounting corruption on XFS.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick clone
> +
> +_cleanup()
> +{
> +	_cleanup_flakey
> +	cd /
> +	rm -r -f $tmp.*
> +}
> +
> +# Import common functions.
> +. ./common/reflink
> +. ./common/dmflakey
> +
> +# real QA test starts here
> +_supported_fs generic
> +_require_scratch_reflink
> +_require_flakey_with_error_writes

_require_cp_reflink

> +
> +_scratch_mkfs >> $seqres.full
> +_init_flakey
> +_mount_flakey
> +
> +# create two files that share a single block
> +$XFS_IO_PROG -fc "pwrite 4k 4k" $SCRATCH_MNT/file1 >> $seqres.full

Please use:

blksz=$(_get_file_block_size $SCRATCH_MNT)
$XFS_IO_PROG -fc "pwrite $blksz $blksz" $SCRATCH_MNT/file1 >> $seqres.full

So that this test will work properly on filesystems with bs > 4k.

> +cp --reflink $SCRATCH_MNT/file1 $SCRATCH_MNT/file2

Nit: This could be shortened to use the _cp_reflink helper, though it
doesn't really matter to me if you do.

> +# Perform a buffered write across the shared and non-shared blocks. On XFS, this
> +# creates a COW fork extent that covers the shared block as well as the just

Ah, the reason why there's a cow fork extent covering the delalloc
reservation is due to the default cow extent size hint, right? In that
case, you need:

_require_xfs_io_command "cowextsize"
$XFS_IO_PROG -c "cowextsize 0" $SCRATCH_MNT >> $seqres.full

to ensure that the speculative cow preallocation actually gets set up.
Otherwise, I think test won't reproduce the bug if the test config has
-d cowextsize=1 in the mkfs options.

> +# created non-shared delalloc block. Fail the writeback to verify that all
> +# delayed allocation is cleaned up properly.
> +_load_flakey_table $FLAKEY_ERROR_WRITES
> +$XFS_IO_PROG -c "pwrite 0 8k" -c fsync $SCRATCH_MNT/file2 >> $seqres.full

$((2 * blksz)), not 8k

Other than that, this looks reasonable to me. I'll go look at the fix
patch now. :)

--D

> +_load_flakey_table $FLAKEY_ALLOW_WRITES
> +
> +# Try a post-fail reflink and then unmount. Both of these are known to produce
> +# errors and/or assert failures on XFS if we trip over a stale delalloc block.
> +cp --reflink $SCRATCH_MNT/file2 $SCRATCH_MNT/file3
> +_unmount_flakey
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/651.out b/tests/generic/651.out
> new file mode 100644
> index 00000000..bd44c80c
> --- /dev/null
> +++ b/tests/generic/651.out
> @@ -0,0 +1,2 @@
> +QA output created by 651
> +fsync: Input/output error
> --
> 2.31.1

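The review above asks for block-size-relative offsets instead of the hardcoded 4k/8k. As a rough sketch of just that arithmetic, runnable outside fstests: `layout_for_blksz` is a hypothetical helper invented for illustration (it is not part of fstests), and in the real test `blksz` would come from `_get_file_block_size $SCRATCH_MNT` as suggested in the review.

```shell
#!/bin/bash
# Hypothetical helper (not an fstests function): print the xfs_io pwrite
# arguments the test would use for a given file block size, replacing the
# hardcoded "4k 4k" and "0 8k" from the original patch.
layout_for_blksz() {
	local blksz=$1

	# file1 gets one block written at offset blksz; this is the block that
	# becomes shared once file1 is reflinked to file2.
	echo "pwrite $blksz $blksz"

	# The buffered write that is made to fail covers block 0 (non-shared,
	# newly created delalloc) plus the shared block: two blocks from 0.
	echo "pwrite 0 $((2 * blksz))"
}

layout_for_blksz 4096
layout_for_blksz 65536
```

With a 4096-byte block size this yields the same ranges as the original patch; with larger block sizes the write still spans exactly one non-shared and one shared block.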
On Thu, Oct 21, 2021 at 11:40:05AM -0700, Darrick J. Wong wrote:
> On Thu, Oct 21, 2021 at 12:39:59PM -0400, Brian Foster wrote:
> > Test that COW writeback that overlaps non-shared delalloc blocks
> > does not leave around stale delalloc blocks on I/O failure. This
> > triggers assert failures and free space accounting corruption on
> > XFS.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >
> > This test targets the problem addressed by the following patch in XFS:
> >
> > https://lore.kernel.org/linux-xfs/20211021163330.1886516-1-bfoster@redhat.com/
> >
> > Brian
> >
> >  tests/generic/651     | 53 +++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/651.out |  2 ++
> >  2 files changed, 55 insertions(+)
> >  create mode 100755 tests/generic/651
> >  create mode 100644 tests/generic/651.out
> >
> > diff --git a/tests/generic/651 b/tests/generic/651
> > new file mode 100755
> > index 00000000..8d4e6728
> > --- /dev/null
> > +++ b/tests/generic/651
> > @@ -0,0 +1,53 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2021 Red Hat, Inc. All Rights Reserved.
> > +#
> > +# FS QA Test 651
> > +#
> > +# Test that COW writeback that overlaps non-shared delalloc blocks does not
> > +# leave around stale delalloc blocks on I/O failure. This triggers assert
> > +# failures and free space accounting corruption on XFS.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick clone
> > +
> > +_cleanup()
> > +{
> > +	_cleanup_flakey
> > +	cd /
> > +	rm -r -f $tmp.*
> > +}
> > +
> > +# Import common functions.
> > +. ./common/reflink
> > +. ./common/dmflakey
> > +
> > +# real QA test starts here
> > +_supported_fs generic
> > +_require_scratch_reflink
> > +_require_flakey_with_error_writes
>
> _require_cp_reflink
>
> > +
> > +_scratch_mkfs >> $seqres.full
> > +_init_flakey
> > +_mount_flakey
> > +
> > +# create two files that share a single block
> > +$XFS_IO_PROG -fc "pwrite 4k 4k" $SCRATCH_MNT/file1 >> $seqres.full
>
> Please use:
>
> blksz=$(_get_file_block_size $SCRATCH_MNT)
> $XFS_IO_PROG -fc "pwrite $blksz $blksz" $SCRATCH_MNT/file1 >> $seqres.full
>
> So that this test will work properly on filesystems with bs > 4k.
>

Yeah, I'll fix the various hardcoded sizes. Thanks.

> > +cp --reflink $SCRATCH_MNT/file1 $SCRATCH_MNT/file2
>
> Nit: This could be shortened to use the _cp_reflink helper, though it
> doesn't really matter to me if you do.
>

Didn't know we had it. I'll look into it.

> > +# Perform a buffered write across the shared and non-shared blocks. On XFS, this
> > +# creates a COW fork extent that covers the shared block as well as the just
>
> Ah, the reason why there's a cow fork extent covering the delalloc
> reservation is due to the default cow extent size hint, right? In that
> case, you need:
>

Yeah..

> _require_xfs_io_command "cowextsize"
> $XFS_IO_PROG -c "cowextsize 0" $SCRATCH_MNT >> $seqres.full
>
> to ensure that the speculative cow preallocation actually gets set up.
> Otherwise, I think test won't reproduce the bug if the test config has
> -d cowextsize=1 in the mkfs options.
>

.. but then we aren't susceptible to the problem, right?

I sometimes waffle on whether it's better for a test to create a
problematic situation and test it, or run on the configuration specified
by the user and test a particular scenario against that. Maybe the
former makes more sense in this very specific test case, but then I
suppose "cowextsize blksz*2" (or whatever large enough value) is
probably more robust than "cowextsize 0" (which I assume means "default"
and thus can change, right)?

Brian

> > +# created non-shared delalloc block. Fail the writeback to verify that all
> > +# delayed allocation is cleaned up properly.
> > +_load_flakey_table $FLAKEY_ERROR_WRITES
> > +$XFS_IO_PROG -c "pwrite 0 8k" -c fsync $SCRATCH_MNT/file2 >> $seqres.full
>
> $((2 * blksz)), not 8k
>
> Other than that, this looks reasonable to me. I'll go look at the fix
> patch now. :)
>
> --D
>
> > +_load_flakey_table $FLAKEY_ALLOW_WRITES
> > +
> > +# Try a post-fail reflink and then unmount. Both of these are known to produce
> > +# errors and/or assert failures on XFS if we trip over a stale delalloc block.
> > +cp --reflink $SCRATCH_MNT/file2 $SCRATCH_MNT/file3
> > +_unmount_flakey
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/651.out b/tests/generic/651.out
> > new file mode 100644
> > index 00000000..bd44c80c
> > --- /dev/null
> > +++ b/tests/generic/651.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 651
> > +fsync: Input/output error
> > --
> > 2.31.1

On Thu, Oct 21, 2021 at 03:09:49PM -0400, Brian Foster wrote:
> On Thu, Oct 21, 2021 at 11:40:05AM -0700, Darrick J. Wong wrote:
> > On Thu, Oct 21, 2021 at 12:39:59PM -0400, Brian Foster wrote:
> > > Test that COW writeback that overlaps non-shared delalloc blocks
> > > does not leave around stale delalloc blocks on I/O failure. This
> > > triggers assert failures and free space accounting corruption on
> > > XFS.
> > >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >
> > > This test targets the problem addressed by the following patch in XFS:
> > >
> > > https://lore.kernel.org/linux-xfs/20211021163330.1886516-1-bfoster@redhat.com/
> > >
> > > Brian
> > >
> > >  tests/generic/651     | 53 +++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/651.out |  2 ++
> > >  2 files changed, 55 insertions(+)
> > >  create mode 100755 tests/generic/651
> > >  create mode 100644 tests/generic/651.out
> > >
> > > diff --git a/tests/generic/651 b/tests/generic/651
> > > new file mode 100755
> > > index 00000000..8d4e6728
> > > --- /dev/null
> > > +++ b/tests/generic/651
> > > @@ -0,0 +1,53 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2021 Red Hat, Inc. All Rights Reserved.
> > > +#
> > > +# FS QA Test 651
> > > +#
> > > +# Test that COW writeback that overlaps non-shared delalloc blocks does not
> > > +# leave around stale delalloc blocks on I/O failure. This triggers assert
> > > +# failures and free space accounting corruption on XFS.
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest auto quick clone
> > > +
> > > +_cleanup()
> > > +{
> > > +	_cleanup_flakey
> > > +	cd /
> > > +	rm -r -f $tmp.*
> > > +}
> > > +
> > > +# Import common functions.
> > > +. ./common/reflink
> > > +. ./common/dmflakey
> > > +
> > > +# real QA test starts here
> > > +_supported_fs generic
> > > +_require_scratch_reflink
> > > +_require_flakey_with_error_writes
> >
> > _require_cp_reflink
> >
> > > +
> > > +_scratch_mkfs >> $seqres.full
> > > +_init_flakey
> > > +_mount_flakey
> > > +
> > > +# create two files that share a single block
> > > +$XFS_IO_PROG -fc "pwrite 4k 4k" $SCRATCH_MNT/file1 >> $seqres.full
> >
> > Please use:
> >
> > blksz=$(_get_file_block_size $SCRATCH_MNT)
> > $XFS_IO_PROG -fc "pwrite $blksz $blksz" $SCRATCH_MNT/file1 >> $seqres.full
> >
> > So that this test will work properly on filesystems with bs > 4k.
> >
>
> Yeah, I'll fix the various hardcoded sizes. Thanks.
>
> > > +cp --reflink $SCRATCH_MNT/file1 $SCRATCH_MNT/file2
> >
> > Nit: This could be shortened to use the _cp_reflink helper, though it
> > doesn't really matter to me if you do.
> >
>
> Didn't know we had it. I'll look into it.
>
> > > +# Perform a buffered write across the shared and non-shared blocks. On XFS, this
> > > +# creates a COW fork extent that covers the shared block as well as the just
> >
> > Ah, the reason why there's a cow fork extent covering the delalloc
> > reservation is due to the default cow extent size hint, right? In that
> > case, you need:
> >
>
> Yeah..
>
> > _require_xfs_io_command "cowextsize"
> > $XFS_IO_PROG -c "cowextsize 0" $SCRATCH_MNT >> $seqres.full
> >
> > to ensure that the speculative cow preallocation actually gets set up.
> > Otherwise, I think test won't reproduce the bug if the test config has
> > -d cowextsize=1 in the mkfs options.
> >
>
> .. but then we aren't susceptible to the problem, right?
>
> I sometimes waffle on whether it's better for a test to create a
> problematic situation and test it, or run on the configuration specified
> by the user and test a particular scenario against that. Maybe the
> former makes more sense in this very specific test case, but then I
> suppose "cowextsize blksz*2" (or whatever large enough value) is

Yes, blksz*2.

> probably more robust than "cowextsize 0" (which I assume means "default"
> and thus can change, right)?

Seeing as this is a reproducer, explicitly setting cowextsize seems
appropriate. Alternately, I suppose you could detect the one case where
it won't work (cowextsize == 1fsb) and only then change it.

--D

>
> Brian
>
> > > +# created non-shared delalloc block. Fail the writeback to verify that all
> > > +# delayed allocation is cleaned up properly.
> > > +_load_flakey_table $FLAKEY_ERROR_WRITES
> > > +$XFS_IO_PROG -c "pwrite 0 8k" -c fsync $SCRATCH_MNT/file2 >> $seqres.full
> >
> > $((2 * blksz)), not 8k
> >
> > Other than that, this looks reasonable to me. I'll go look at the fix
> > patch now. :)
> >
> > --D
> >
> > > +_load_flakey_table $FLAKEY_ALLOW_WRITES
> > > +
> > > +# Try a post-fail reflink and then unmount. Both of these are known to produce
> > > +# errors and/or assert failures on XFS if we trip over a stale delalloc block.
> > > +cp --reflink $SCRATCH_MNT/file2 $SCRATCH_MNT/file3
> > > +_unmount_flakey
> > > +
> > > +# success, all done
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/651.out b/tests/generic/651.out
> > > new file mode 100644
> > > index 00000000..bd44c80c
> > > --- /dev/null
> > > +++ b/tests/generic/651.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 651
> > > +fsync: Input/output error
> > > --
> > > 2.31.1

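The alternative floated at the end of this reply (only override the hint when it equals one filesystem block) could look roughly like the sketch below. `pick_cowextsize` is a hypothetical helper made up for illustration, not an fstests or xfs_io function; it assumes both values are given in bytes and that a hint of 0 means "use the default", as assumed in the thread. How the current hint is queried is deliberately left out.

```shell
#!/bin/bash
# Hypothetical helper (not part of fstests): choose the CoW extent size hint
# to apply before running the reproducer.
#   $1 = current hint in bytes (0 is assumed to mean "default")
#   $2 = file block size in bytes
pick_cowextsize() {
	local cur=$1 blksz=$2

	if [ "$cur" -eq "$blksz" ]; then
		# A one-block hint is the case the thread expects not to
		# reproduce the bug, so raise it to two blocks (the blksz*2
		# value agreed on above).
		echo $((2 * blksz))
	else
		# Any other value, including 0, is left untouched.
		echo "$cur"
	fi
}

pick_cowextsize 4096 4096
pick_cowextsize 0 4096
```

The simpler option, also endorsed in the reply, is to skip the detection and always set the hint to `$((2 * blksz))`, which keeps the reproducer independent of whatever mkfs options the test config supplies.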
diff --git a/tests/generic/651 b/tests/generic/651
new file mode 100755
index 00000000..8d4e6728
--- /dev/null
+++ b/tests/generic/651
@@ -0,0 +1,53 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2021 Red Hat, Inc. All Rights Reserved.
+#
+# FS QA Test 651
+#
+# Test that COW writeback that overlaps non-shared delalloc blocks does not
+# leave around stale delalloc blocks on I/O failure. This triggers assert
+# failures and free space accounting corruption on XFS.
+#
+. ./common/preamble
+_begin_fstest auto quick clone
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -r -f $tmp.*
+}
+
+# Import common functions.
+. ./common/reflink
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_require_scratch_reflink
+_require_flakey_with_error_writes
+
+_scratch_mkfs >> $seqres.full
+_init_flakey
+_mount_flakey
+
+# create two files that share a single block
+$XFS_IO_PROG -fc "pwrite 4k 4k" $SCRATCH_MNT/file1 >> $seqres.full
+cp --reflink $SCRATCH_MNT/file1 $SCRATCH_MNT/file2
+
+# Perform a buffered write across the shared and non-shared blocks. On XFS, this
+# creates a COW fork extent that covers the shared block as well as the just
+# created non-shared delalloc block. Fail the writeback to verify that all
+# delayed allocation is cleaned up properly.
+_load_flakey_table $FLAKEY_ERROR_WRITES
+$XFS_IO_PROG -c "pwrite 0 8k" -c fsync $SCRATCH_MNT/file2 >> $seqres.full
+_load_flakey_table $FLAKEY_ALLOW_WRITES
+
+# Try a post-fail reflink and then unmount. Both of these are known to produce
+# errors and/or assert failures on XFS if we trip over a stale delalloc block.
+cp --reflink $SCRATCH_MNT/file2 $SCRATCH_MNT/file3
+_unmount_flakey
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/651.out b/tests/generic/651.out
new file mode 100644
index 00000000..bd44c80c
--- /dev/null
+++ b/tests/generic/651.out
@@ -0,0 +1,2 @@
+QA output created by 651
+fsync: Input/output error
Test that COW writeback that overlaps non-shared delalloc blocks
does not leave around stale delalloc blocks on I/O failure. This
triggers assert failures and free space accounting corruption on
XFS.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

This test targets the problem addressed by the following patch in XFS:

https://lore.kernel.org/linux-xfs/20211021163330.1886516-1-bfoster@redhat.com/

Brian

 tests/generic/651     | 53 +++++++++++++++++++++++++++++++++++++++++++
 tests/generic/651.out |  2 ++
 2 files changed, 55 insertions(+)
 create mode 100755 tests/generic/651
 create mode 100644 tests/generic/651.out