Message ID | 20241101050743.113687-1-wqu@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | btrfs: a new test case to verify mount behavior with background remounting | expand |
On Fri, Nov 1, 2024 at 5:08 AM Qu Wenruo <wqu@suse.com> wrote: > > [BUG] > When there is a process in the background remounting a btrfs, switching > between RO/RW, then another process try to mount another subvolume of > the same btrfs read-only, we can hit a race causing the RW mount to fail > with -EBUSY: > > [CAUSE] > During the btrfs mount, to support mounting different subvolumes with > different RO/RW flags, we have a small hack during the mount: > > Retry with matching RO flags if the initial mount fail with -EBUSY. > > The problem is, during that retry we do not hold any super block lock > (s_umount), this meanings there can be a remount process changing the RO > flags of the original fs super block. > > If so, we can have an EBUSY error during retry. > And this time we treat any failure as an error, without any retry and > cause the above EBUSY mount failure. > > [FIX] > The fix is already sent to the mailing list. > The fix is to allow btrfs to have different RO flag between super block > and mount point during mount, and if the RO flag mismatch, reconfigure > the fs to RW with s_umount hold, so that there will be no race. > > [TEST CASE] > The test case will create two processes: > > - Remounting an existing subvolume mount point > Switching between RO and RW > > - Mounting another subvolume RW > After a successful mount, unmount and retry. > > This is enough to trigger the -EBUSY error in less than 5 seconds. > To be extra safe, the test case will run for 10 seconds at least, and > follow TIME_FACTOR for extra loads. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > tests/btrfs/325 | 80 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/325.out | 2 ++ > 2 files changed, 82 insertions(+) > create mode 100755 tests/btrfs/325 > create mode 100644 tests/btrfs/325.out > > diff --git a/tests/btrfs/325 b/tests/btrfs/325 > new file mode 100755 > index 00000000..d0713b39 > --- /dev/null > +++ b/tests/btrfs/325 > @@ -0,0 +1,80 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 325 > +# > +# Test that mounting a subvolume read-write will success, with another > +# subvolume being remounted RO/RW at background > +# > +. ./common/preamble > +_begin_fstest auto quick mount remount > + > +_fixed_by_kernel_commit xxxxxxxxxxxx \ > + "btrfs: fix mount failure due to remount races" > + > +_cleanup() > +{ > + cd / > + umount "$subv1_mount" &> /dev/null > + umount "$subv2_mount" &> /dev/null $UMOUNT_PROG > + rm -rf -- "$subv1_mount" &> /dev/null > + rm -rf -- "$subv2_mount" &> /dev/null > +} > + > +# For the extra mount points > +_require_test > +_require_scratch > + > +_scratch_mkfs >> $seqres.full 2>&1 > +_scratch_mount > +$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/subvol1 >> $seqres.full > +$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/subvol2 >> $seqres.full > +_scratch_unmount > + > +subv1_mount="$TEST_DIR/subvol1_mount" > +subv2_mount="$TEST_DIR/subvol2_mount" > +mkdir -p "$subv1_mount" > +mkdir -p "$subv2_mount" > + > +# Subv1 remount workload, mount the subv1 and switching it between > +# RO and RW. > +_mount "$SCRATCH_DEV" "$subv1_mount" -o subvol=subvol1 > +while _mount -o remount,ro "$subv1_mount"; do > + mount -o remount,rw "$subv1_mount" _mount here too. > +done & > +subv1_pid=$! > + > +# Subv2 rw mount workload > +# For unpatched kernel, this will fail with -EBUSY. > +# > +# To maintain the per-subvolume RO/RW mount, if the existing btrfs is > +# mounted RO, unpatched btrfs will retry with its RO flag reverted, > +# then reconfigure the fs to RW. > +# > +# But such retry has no super blocl lock hold, thus if another remount blocl -> block > +# process has already remounted the fs RW, the attempt will fail and > +# return -EBUSY. > +# > +# Patched kernel will allow the superblock and mount point RO flags > +# to differ, and then hold the s_umount to reconfigure the superblock > +# to RW, preventing any possible race. > +while _mount "$SCRATCH_DEV" "$subv2_mount "-o subvol=subvol2; do > + umount "$subv2_mount"; $UMOUNT_PROG > +done & > +subv2_pid=$! We should have the _cleanup function kill and wait for $subv1_pid and $subv2_pid in case the test is interrupted, something like: kill $subv1_pid $subv2_pid &> /dev/null wait $subv1_pid $subv2_pid &> /dev/null > + > +sleep $(( 10 * $TIME_FACTOR )) > + > +kill $subv1_pid > +kill $subv2_pid Add a 'wait' here, to make sure the umounts below don't fail in case the subvolumes are mounted. > +umount "$subv1_mount" &> /dev/null > +umount "$subv2_mount" &> /dev/null $UMOUNT_PROG > +rm -rf -- "$subv1_mount" &> /dev/null > +rm -rf -- "$subv2_mount" &> /dev/null Also, why these "&> /dev/null" redirections? If we have the 'wait', the umounts should succeed and the rm -fr shouldn't fail - in fact it's useful to test that they don't fail, that the umounts were successful in case the subvolumes were mounted. Thanks. > + > +echo "Silence is golden" > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/325.out b/tests/btrfs/325.out > new file mode 100644 > index 00000000..cf13795c > --- /dev/null > +++ b/tests/btrfs/325.out > @@ -0,0 +1,2 @@ > +QA output created by 325 > +Silence is golden > -- > 2.46.0 > >
在 2024/11/1 19:28, Filipe Manana 写道: > On Fri, Nov 1, 2024 at 5:08 AM Qu Wenruo <wqu@suse.com> wrote: >> >> [BUG] >> When there is a process in the background remounting a btrfs, switching >> between RO/RW, then another process try to mount another subvolume of >> the same btrfs read-only, we can hit a race causing the RW mount to fail >> with -EBUSY: >> >> [CAUSE] >> During the btrfs mount, to support mounting different subvolumes with >> different RO/RW flags, we have a small hack during the mount: >> >> Retry with matching RO flags if the initial mount fail with -EBUSY. >> >> The problem is, during that retry we do not hold any super block lock >> (s_umount), this meanings there can be a remount process changing the RO >> flags of the original fs super block. >> >> If so, we can have an EBUSY error during retry. >> And this time we treat any failure as an error, without any retry and >> cause the above EBUSY mount failure. >> >> [FIX] >> The fix is already sent to the mailing list. >> The fix is to allow btrfs to have different RO flag between super block >> and mount point during mount, and if the RO flag mismatch, reconfigure >> the fs to RW with s_umount hold, so that there will be no race. >> >> [TEST CASE] >> The test case will create two processes: >> >> - Remounting an existing subvolume mount point >> Switching between RO and RW >> >> - Mounting another subvolume RW >> After a successful mount, unmount and retry. >> >> This is enough to trigger the -EBUSY error in less than 5 seconds. >> To be extra safe, the test case will run for 10 seconds at least, and >> follow TIME_FACTOR for extra loads. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> tests/btrfs/325 | 80 +++++++++++++++++++++++++++++++++++++++++++++ >> tests/btrfs/325.out | 2 ++ >> 2 files changed, 82 insertions(+) >> create mode 100755 tests/btrfs/325 >> create mode 100644 tests/btrfs/325.out >> >> diff --git a/tests/btrfs/325 b/tests/btrfs/325 >> new file mode 100755 >> index 00000000..d0713b39 >> --- /dev/null >> +++ b/tests/btrfs/325 >> @@ -0,0 +1,80 @@ >> +#! /bin/bash >> +# SPDX-License-Identifier: GPL-2.0 >> +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved. >> +# >> +# FS QA Test 325 >> +# >> +# Test that mounting a subvolume read-write will success, with another >> +# subvolume being remounted RO/RW at background >> +# >> +. ./common/preamble >> +_begin_fstest auto quick mount remount >> + >> +_fixed_by_kernel_commit xxxxxxxxxxxx \ >> + "btrfs: fix mount failure due to remount races" >> + >> +_cleanup() >> +{ >> + cd / >> + umount "$subv1_mount" &> /dev/null >> + umount "$subv2_mount" &> /dev/null > > $UMOUNT_PROG Well, I went direct "umount" because I did a quick search for "umount" and hits twice inside common/rc. I guess it's time to do a cleanup inside common/rc too. > >> + rm -rf -- "$subv1_mount" &> /dev/null >> + rm -rf -- "$subv2_mount" &> /dev/null >> +} >> + >> +# For the extra mount points >> +_require_test >> +_require_scratch >> + >> +_scratch_mkfs >> $seqres.full 2>&1 >> +_scratch_mount >> +$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/subvol1 >> $seqres.full >> +$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/subvol2 >> $seqres.full >> +_scratch_unmount >> + >> +subv1_mount="$TEST_DIR/subvol1_mount" >> +subv2_mount="$TEST_DIR/subvol2_mount" >> +mkdir -p "$subv1_mount" >> +mkdir -p "$subv2_mount" >> + >> +# Subv1 remount workload, mount the subv1 and switching it between >> +# RO and RW. >> +_mount "$SCRATCH_DEV" "$subv1_mount" -o subvol=subvol1 >> +while _mount -o remount,ro "$subv1_mount"; do >> + mount -o remount,rw "$subv1_mount" > > _mount here too. > >> +done & >> +subv1_pid=$! >> + >> +# Subv2 rw mount workload >> +# For unpatched kernel, this will fail with -EBUSY. >> +# >> +# To maintain the per-subvolume RO/RW mount, if the existing btrfs is >> +# mounted RO, unpatched btrfs will retry with its RO flag reverted, >> +# then reconfigure the fs to RW. >> +# >> +# But such retry has no super blocl lock hold, thus if another remount > > blocl -> block > >> +# process has already remounted the fs RW, the attempt will fail and >> +# return -EBUSY. >> +# >> +# Patched kernel will allow the superblock and mount point RO flags >> +# to differ, and then hold the s_umount to reconfigure the superblock >> +# to RW, preventing any possible race. >> +while _mount "$SCRATCH_DEV" "$subv2_mount "-o subvol=subvol2; do >> + umount "$subv2_mount"; > > $UMOUNT_PROG > >> +done & >> +subv2_pid=$! > > We should have the _cleanup function kill and wait for $subv1_pid and > $subv2_pid in case the test is interrupted, something like: > > kill $subv1_pid $subv2_pid &> /dev/null > wait $subv1_pid $subv2_pid &> /dev/null Forgot the wait part. Thanks a lot pointing this out. Thanks, Qu > >> + >> +sleep $(( 10 * $TIME_FACTOR )) >> + >> +kill $subv1_pid >> +kill $subv2_pid > > Add a 'wait' here, to make sure the umounts below don't fail in case > the subvolumes are mounted. > >> +umount "$subv1_mount" &> /dev/null >> +umount "$subv2_mount" &> /dev/null > > $UMOUNT_PROG > >> +rm -rf -- "$subv1_mount" &> /dev/null >> +rm -rf -- "$subv2_mount" &> /dev/null > > > Also, why these "&> /dev/null" redirections? > If we have the 'wait', the umounts should succeed and the rm -fr > shouldn't fail - in fact it's useful to test that they don't fail, > that the umounts were successful in case the subvolumes were mounted. > > Thanks. > >> + >> +echo "Silence is golden" >> + >> +# success, all done >> +status=0 >> +exit >> diff --git a/tests/btrfs/325.out b/tests/btrfs/325.out >> new file mode 100644 >> index 00000000..cf13795c >> --- /dev/null >> +++ b/tests/btrfs/325.out >> @@ -0,0 +1,2 @@ >> +QA output created by 325 >> +Silence is golden >> -- >> 2.46.0 >> >>
diff --git a/tests/btrfs/325 b/tests/btrfs/325 new file mode 100755 index 00000000..d0713b39 --- /dev/null +++ b/tests/btrfs/325 @@ -0,0 +1,80 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved. +# +# FS QA Test 325 +# +# Test that mounting a subvolume read-write will success, with another +# subvolume being remounted RO/RW at background +# +. ./common/preamble +_begin_fstest auto quick mount remount + +_fixed_by_kernel_commit xxxxxxxxxxxx \ + "btrfs: fix mount failure due to remount races" + +_cleanup() +{ + cd / + umount "$subv1_mount" &> /dev/null + umount "$subv2_mount" &> /dev/null + rm -rf -- "$subv1_mount" &> /dev/null + rm -rf -- "$subv2_mount" &> /dev/null +} + +# For the extra mount points +_require_test +_require_scratch + +_scratch_mkfs >> $seqres.full 2>&1 +_scratch_mount +$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/subvol1 >> $seqres.full +$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/subvol2 >> $seqres.full +_scratch_unmount + +subv1_mount="$TEST_DIR/subvol1_mount" +subv2_mount="$TEST_DIR/subvol2_mount" +mkdir -p "$subv1_mount" +mkdir -p "$subv2_mount" + +# Subv1 remount workload, mount the subv1 and switching it between +# RO and RW. +_mount "$SCRATCH_DEV" "$subv1_mount" -o subvol=subvol1 +while _mount -o remount,ro "$subv1_mount"; do + mount -o remount,rw "$subv1_mount" +done & +subv1_pid=$! + +# Subv2 rw mount workload +# For unpatched kernel, this will fail with -EBUSY. +# +# To maintain the per-subvolume RO/RW mount, if the existing btrfs is +# mounted RO, unpatched btrfs will retry with its RO flag reverted, +# then reconfigure the fs to RW. +# +# But such retry has no super blocl lock hold, thus if another remount +# process has already remounted the fs RW, the attempt will fail and +# return -EBUSY. +# +# Patched kernel will allow the superblock and mount point RO flags +# to differ, and then hold the s_umount to reconfigure the superblock +# to RW, preventing any possible race. +while _mount "$SCRATCH_DEV" "$subv2_mount "-o subvol=subvol2; do + umount "$subv2_mount"; +done & +subv2_pid=$! + +sleep $(( 10 * $TIME_FACTOR )) + +kill $subv1_pid +kill $subv2_pid +umount "$subv1_mount" &> /dev/null +umount "$subv2_mount" &> /dev/null +rm -rf -- "$subv1_mount" &> /dev/null +rm -rf -- "$subv2_mount" &> /dev/null + +echo "Silence is golden" + +# success, all done +status=0 +exit diff --git a/tests/btrfs/325.out b/tests/btrfs/325.out new file mode 100644 index 00000000..cf13795c --- /dev/null +++ b/tests/btrfs/325.out @@ -0,0 +1,2 @@ +QA output created by 325 +Silence is golden
[BUG] When there is a process in the background remounting a btrfs, switching between RO/RW, then another process try to mount another subvolume of the same btrfs read-only, we can hit a race causing the RW mount to fail with -EBUSY: [CAUSE] During the btrfs mount, to support mounting different subvolumes with different RO/RW flags, we have a small hack during the mount: Retry with matching RO flags if the initial mount fail with -EBUSY. The problem is, during that retry we do not hold any super block lock (s_umount), this meanings there can be a remount process changing the RO flags of the original fs super block. If so, we can have an EBUSY error during retry. And this time we treat any failure as an error, without any retry and cause the above EBUSY mount failure. [FIX] The fix is already sent to the mailing list. The fix is to allow btrfs to have different RO flag between super block and mount point during mount, and if the RO flag mismatch, reconfigure the fs to RW with s_umount hold, so that there will be no race. [TEST CASE] The test case will create two processes: - Remounting an existing subvolume mount point Switching between RO and RW - Mounting another subvolume RW After a successful mount, unmount and retry. This is enough to trigger the -EBUSY error in less than 5 seconds. To be extra safe, the test case will run for 10 seconds at least, and follow TIME_FACTOR for extra loads. Signed-off-by: Qu Wenruo <wqu@suse.com> --- tests/btrfs/325 | 80 +++++++++++++++++++++++++++++++++++++++++++++ tests/btrfs/325.out | 2 ++ 2 files changed, 82 insertions(+) create mode 100755 tests/btrfs/325 create mode 100644 tests/btrfs/325.out