Message ID | c68878cb99025b8c8465221205d5de9e40777b18.1709806478.git.anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fstests: test tempfsid with dm flakey device | expand |
On Thu, Mar 07, 2024 at 06:20:24PM +0530, Anand Jain wrote: > When a flakey device is configured, we have access to both the physical > device and the DM flaky device. Ensure that when the flakey device is > configured, the physical device mount fails. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > tests/btrfs/318 | 45 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/318.out | 3 +++ > 2 files changed, 48 insertions(+) > create mode 100755 tests/btrfs/318 > create mode 100644 tests/btrfs/318.out > > diff --git a/tests/btrfs/318 b/tests/btrfs/318 > new file mode 100755 > index 000000000000..015950fbd93c > --- /dev/null > +++ b/tests/btrfs/318 > @@ -0,0 +1,45 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2024 Oracle. All Rights Reserved. > +# > +# FS QA Test 318 > +# > +# Create multiple device nodes with the same device try mount > +# > +. ./common/preamble > +_begin_fstest auto volume tempfsid May I ask why it's a "tempfsid" related test? > + > +# Override the default cleanup function. > +_cleanup() > +{ > + umount $extra_mnt &> /dev/null > + rm -rf $extra_mnt &> /dev/null The "&> /dev/null" isn't needed, if you use "-f" for rm > + _unmount_flakey > + _cleanup_flakey > + cd / > + rm -r -f $tmp.* > +} > + > +# Import common functions. > +. ./common/filter > +. ./common/dmflakey > + > +# real QA test starts here > +_supported_fs btrfs > +_require_scratch > +_require_dm_target flakey > + > +_scratch_mkfs >> $seqres.full > +_init_flakey > + > +_mount_flakey > +extra_mnt=$TEST_DIR/extra_mnt _require_test ? > +rm -rf $extra_mnt > +mkdir -p $extra_mnt > +_mount $SCRATCH_DEV $extra_mnt 2>&1 | _filter_testdir_and_scratch Recommend calling "_filter_error_mount" too Thanks, Zorro > + > +_flakey_drop_and_remount > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/318.out b/tests/btrfs/318.out > new file mode 100644 > index 000000000000..5cdbea8c4b2a > --- /dev/null > +++ b/tests/btrfs/318.out > @@ -0,0 +1,3 @@ > +QA output created by 318 > +mount: TEST_DIR/extra_mnt: wrong fs type, bad option, bad superblock on SCRATCH_DEV, missing codepage or helper program, or other error. > + dmesg(1) may have more information after failed mount system call. > -- > 2.39.3 > >
On Thu, Mar 7, 2024 at 12:51 PM Anand Jain <anand.jain@oracle.com> wrote: > > When a flakey device is configured, we have access to both the physical > device and the DM flaky device. Ensure that when the flakey device is > configured, the physical device mount fails. Does it need to be flakey? Can't it be any other DM type? The way it's phrased, gives the idea that somehow this is all flakey specific. Just be clear and mention any DM on top of the device. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > tests/btrfs/318 | 45 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/318.out | 3 +++ > 2 files changed, 48 insertions(+) > create mode 100755 tests/btrfs/318 > create mode 100644 tests/btrfs/318.out > > diff --git a/tests/btrfs/318 b/tests/btrfs/318 > new file mode 100755 > index 000000000000..015950fbd93c > --- /dev/null > +++ b/tests/btrfs/318 > @@ -0,0 +1,45 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2024 Oracle. All Rights Reserved. > +# > +# FS QA Test 318 > +# > +# Create multiple device nodes with the same device try mount > +# > +. ./common/preamble > +_begin_fstest auto volume tempfsid Also 'quick' group. > + > +# Override the default cleanup function. > +_cleanup() > +{ > + umount $extra_mnt &> /dev/null > + rm -rf $extra_mnt &> /dev/null > + _unmount_flakey > + _cleanup_flakey > + cd / > + rm -r -f $tmp.* > +} > + > +# Import common functions. > +. ./common/filter > +. ./common/dmflakey > + > +# real QA test starts here > +_supported_fs btrfs > +_require_scratch > +_require_dm_target flakey > + > +_scratch_mkfs >> $seqres.full > +_init_flakey > + > +_mount_flakey > +extra_mnt=$TEST_DIR/extra_mnt > +rm -rf $extra_mnt > +mkdir -p $extra_mnt > +_mount $SCRATCH_DEV $extra_mnt 2>&1 | _filter_testdir_and_scratch > + > +_flakey_drop_and_remount Why? Please add a comment. Either this is a leftover from copy-paste from other tests, that exercise fsync and power failure, or the goal is to do an unmount followed by a mount and check that the mount succeeds. If it's the latter case, then it's confusing to use _flakey_drop_and_remount, because that drops writes, unmounts, allows writes again and then mounts - i.e. we don't want to simulate power loss by silently dropping writes. Just call _unmount_flakey and _mount_flakey and add a comment mentioning why we are doing it. Finally, a link to the report that motivated this, due to a bug in a patch that ended not being sent to Linus, would be useful: https://lore.kernel.org/linux-btrfs/CAL3q7H5wx5rKmSzGWP7mRqaSfAY88g=35N4OBrbJB61rK0mt2w@mail.gmail.com/ I'm also not convinced that we need this test, because the bug could be reliably reproduced by running all existing tests or just a subset of the tests as I reported there. Thanks. > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/318.out b/tests/btrfs/318.out > new file mode 100644 > index 000000000000..5cdbea8c4b2a > --- /dev/null > +++ b/tests/btrfs/318.out > @@ -0,0 +1,3 @@ > +QA output created by 318 > +mount: TEST_DIR/extra_mnt: wrong fs type, bad option, bad superblock on SCRATCH_DEV, missing codepage or helper program, or other error. > + dmesg(1) may have more information after failed mount system call. > -- > 2.39.3 > >
> I'm also not convinced that we need this test, because the bug could > be reliably reproduced by running all existing tests or just a subset > of the tests as I reported there. This test case was motivated by btrfs/159; and recent report; yep, most of the lines here are taken from btrfs/159. However, I wanted a test case in the tempfsid group which exercises multi-node-same-physical; If there are no objections, I am going to add the tempfsid group to btrfs/159 for now. Thanks, Anand
On Thu, Mar 07, 2024 at 06:20:24PM +0530, Anand Jain wrote: > When a flakey device is configured, we have access to both the physical > device and the DM flaky device. Ensure that when the flakey device is > configured, the physical device mount fails. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > tests/btrfs/318 | 45 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/318.out | 3 +++ > 2 files changed, 48 insertions(+) > create mode 100755 tests/btrfs/318 > create mode 100644 tests/btrfs/318.out > > diff --git a/tests/btrfs/318 b/tests/btrfs/318 > new file mode 100755 > index 000000000000..015950fbd93c > --- /dev/null > +++ b/tests/btrfs/318 > @@ -0,0 +1,45 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2024 Oracle. All Rights Reserved. > +# > +# FS QA Test 318 > +# > +# Create multiple device nodes with the same device try mount > +# > +. ./common/preamble > +_begin_fstest auto volume tempfsid > + > +# Override the default cleanup function. > +_cleanup() > +{ > + umount $extra_mnt &> /dev/null > + rm -rf $extra_mnt &> /dev/null > + _unmount_flakey > + _cleanup_flakey > + cd / > + rm -r -f $tmp.* > +} > + > +# Import common functions. > +. ./common/filter > +. ./common/dmflakey > + > +# real QA test starts here > +_supported_fs btrfs BTW, what cause it have to be a btrfs specific test case? I didn't any btrfs specific operations below, can you explain the reason? Thanks, Zorro > +_require_scratch > +_require_dm_target flakey > + > +_scratch_mkfs >> $seqres.full > +_init_flakey > + > +_mount_flakey > +extra_mnt=$TEST_DIR/extra_mnt > +rm -rf $extra_mnt > +mkdir -p $extra_mnt > +_mount $SCRATCH_DEV $extra_mnt 2>&1 | _filter_testdir_and_scratch > + > +_flakey_drop_and_remount > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/318.out b/tests/btrfs/318.out > new file mode 100644 > index 000000000000..5cdbea8c4b2a > --- /dev/null > +++ b/tests/btrfs/318.out > @@ -0,0 +1,3 @@ > +QA output created by 318 > +mount: TEST_DIR/extra_mnt: wrong fs type, bad option, bad superblock on SCRATCH_DEV, missing codepage or helper program, or other error. > + dmesg(1) may have more information after failed mount system call. > -- > 2.39.3 > >
On Thu, Mar 7, 2024 at 4:46 PM Anand Jain <anand.jain@oracle.com> wrote: > > > > I'm also not convinced that we need this test, because the bug could > > be reliably reproduced by running all existing tests or just a subset > > of the tests as I reported there. > > This test case was motivated by btrfs/159; and recent report; > yep, most of the lines here are taken from btrfs/159. Ok, so it's motivated by what I reported, triggered by btrfs/159 but only when other tests are run before it. > However, I wanted a test case in the tempfsid group which > exercises multi-node-same-physical; Sure. It's just that we could trigger the issue simply by running all existing tests, or just a subset as I reported before [1]. It was fully deterministic and David could reproduce it after the report too. What surprises me is that no one noticed this before and at some stage the faulty patch was about to be sent to Linus. > > If there are no objections, I am going to add the tempfsid group to > btrfs/159 for now. That is confusing, please don't. btrfs/159 doesn't test tempfsid, it existed for many years before the tempfsid feature (introduced in the 6.7 kernel), it just happens that it triggers the issue if other fstests are run before it. That would also be pointless because running btrfs/159 alone doesn't trigger the bug, so running "./check -g tempfsid" wouldn't cause 159 to fail. Between that and a new test case that is somewhat redundant, I'd rather have the new test case with proper documentation/comments. Thanks. [1] https://lore.kernel.org/linux-btrfs/CAL3q7H5wx5rKmSzGWP7mRqaSfAY88g=35N4OBrbJB61rK0mt2w@mail.gmail.com/ > > Thanks, Anand
On 3/8/24 18:00, Filipe Manana wrote: > On Thu, Mar 7, 2024 at 4:46 PM Anand Jain <anand.jain@oracle.com> wrote: >> >> >>> I'm also not convinced that we need this test, because the bug could >>> be reliably reproduced by running all existing tests or just a subset >>> of the tests as I reported there. >> >> This test case was motivated by btrfs/159; and recent report; >> yep, most of the lines here are taken from btrfs/159. > > Ok, so it's motivated by what I reported, triggered by btrfs/159 but > only when other tests are run before it. > >> However, I wanted a test case in the tempfsid group which >> exercises multi-node-same-physical; > > Sure. It's just that we could trigger the issue simply by running all > existing tests, or just a subset as I reported before [1]. > It was fully deterministic and David could reproduce it after the report too. > > What surprises me is that no one noticed this before and at some stage > the faulty patch was about to be sent to Linus. > >> >> If there are no objections, I am going to add the tempfsid group to >> btrfs/159 for now. > > That is confusing, please don't. > btrfs/159 doesn't test tempfsid, it existed for many years before the > tempfsid feature (introduced in the 6.7 kernel), > it just happens that it triggers the issue if other fstests are run before it. > > That would also be pointless because running btrfs/159 alone doesn't > trigger the bug, so running "./check -g tempfsid" wouldn't cause 159 > to fail. > > Between that and a new test case that is somewhat redundant, I'd > rather have the new test case with proper documentation/comments. > I have converted this test case into a generic one. Now, XFS, ext4, and Btrfs (after a patch) match the golden output. I'll be sending the btrfs kernel patch along with it. Thanks, Anand > Thanks. > > [1] https://lore.kernel.org/linux-btrfs/CAL3q7H5wx5rKmSzGWP7mRqaSfAY88g=35N4OBrbJB61rK0mt2w@mail.gmail.com/ > >> >> Thanks, Anand
On 3/8/24 00:50, Zorro Lang wrote: > On Thu, Mar 07, 2024 at 06:20:24PM +0530, Anand Jain wrote: >> When a flakey device is configured, we have access to both the physical >> device and the DM flaky device. Ensure that when the flakey device is >> configured, the physical device mount fails. >> >> Signed-off-by: Anand Jain <anand.jain@oracle.com> >> --- >> tests/btrfs/318 | 45 +++++++++++++++++++++++++++++++++++++++++++++ >> tests/btrfs/318.out | 3 +++ >> 2 files changed, 48 insertions(+) >> create mode 100755 tests/btrfs/318 >> create mode 100644 tests/btrfs/318.out >> >> diff --git a/tests/btrfs/318 b/tests/btrfs/318 >> new file mode 100755 >> index 000000000000..015950fbd93c >> --- /dev/null >> +++ b/tests/btrfs/318 >> @@ -0,0 +1,45 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (c) 2024 Oracle. All Rights Reserved. >> +# >> +# FS QA Test 318 >> +# >> +# Create multiple device nodes with the same device try mount >> +# >> +. ./common/preamble >> +_begin_fstest auto volume tempfsid >> + >> +# Override the default cleanup function. >> +_cleanup() >> +{ >> + umount $extra_mnt &> /dev/null >> + rm -rf $extra_mnt &> /dev/null >> + _unmount_flakey >> + _cleanup_flakey >> + cd / >> + rm -r -f $tmp.* >> +} >> + >> +# Import common functions. >> +. ./common/filter >> +. ./common/dmflakey >> + >> +# real QA test starts here >> +_supported_fs btrfs > > BTW, what cause it have to be a btrfs specific test case? I didn't any > btrfs specific operations below, can you explain the reason? > Right. Now I notice it can be made generic. I have converted this into a generic test case. I'll send it when the kernel patch is ready. For now, I have to withdraw this patch. Thanks for the suggestion. Anand
diff --git a/tests/btrfs/318 b/tests/btrfs/318 new file mode 100755 index 000000000000..015950fbd93c --- /dev/null +++ b/tests/btrfs/318 @@ -0,0 +1,45 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2024 Oracle. All Rights Reserved. +# +# FS QA Test 318 +# +# Create multiple device nodes with the same device try mount +# +. ./common/preamble +_begin_fstest auto volume tempfsid + +# Override the default cleanup function. +_cleanup() +{ + umount $extra_mnt &> /dev/null + rm -rf $extra_mnt &> /dev/null + _unmount_flakey + _cleanup_flakey + cd / + rm -r -f $tmp.* +} + +# Import common functions. +. ./common/filter +. ./common/dmflakey + +# real QA test starts here +_supported_fs btrfs +_require_scratch +_require_dm_target flakey + +_scratch_mkfs >> $seqres.full +_init_flakey + +_mount_flakey +extra_mnt=$TEST_DIR/extra_mnt +rm -rf $extra_mnt +mkdir -p $extra_mnt +_mount $SCRATCH_DEV $extra_mnt 2>&1 | _filter_testdir_and_scratch + +_flakey_drop_and_remount + +# success, all done +status=0 +exit diff --git a/tests/btrfs/318.out b/tests/btrfs/318.out new file mode 100644 index 000000000000..5cdbea8c4b2a --- /dev/null +++ b/tests/btrfs/318.out @@ -0,0 +1,3 @@ +QA output created by 318 +mount: TEST_DIR/extra_mnt: wrong fs type, bad option, bad superblock on SCRATCH_DEV, missing codepage or helper program, or other error. + dmesg(1) may have more information after failed mount system call.
When a flakey device is configured, we have access to both the physical device and the DM flaky device. Ensure that when the flakey device is configured, the physical device mount fails. Signed-off-by: Anand Jain <anand.jain@oracle.com> --- tests/btrfs/318 | 45 +++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/318.out | 3 +++ 2 files changed, 48 insertions(+) create mode 100755 tests/btrfs/318 create mode 100644 tests/btrfs/318.out