btrfs: fix mount failure due to remount races

Message ID a68b1c835a227c844a5e106c9240f0c0215906c8.1729569894.git.wqu@suse.com (mailing list archive)
State New, archived
Series btrfs: fix mount failure due to remount races

Commit Message

Qu Wenruo Oct. 22, 2024, 4:08 a.m. UTC
[BUG]
The following reproducer can cause btrfs mount to fail:

  dev="/dev/test/scratch1"
  mnt1="/mnt/test"
  mnt2="/mnt/scratch"

  mkfs.btrfs -f $dev
  mount $dev $mnt1
  btrfs subvolume create $mnt1/subvol1
  btrfs subvolume create $mnt1/subvol2
  umount $mnt1

  mount $dev $mnt1 -o subvol=subvol1
  while mount -o remount,ro $mnt1; do mount -o remount,rw $mnt1; done &
  bg=$!

  while mount $dev $mnt2 -o subvol=subvol2; do umount $mnt2; done

  kill $bg
  wait
  umount -R $mnt1
  umount -R $mnt2

The script will fail with the following error:

 mount: /mnt/scratch: /dev/mapper/test-scratch1 already mounted on /mnt/test.
       dmesg(1) may have more information after failed mount system call.
 umount: /mnt/test: target is busy.
 umount: /mnt/scratch/: not mounted

And there is no kernel error message.

[CAUSE]
During the btrfs mount, to support mounting different subvolumes with
different RO/RW flags, we have a small hack during the mount:

  Retry with matching RO flags if the initial mount fails with -EBUSY.

The problem is that during that retry we do not hold any super block lock
(s_umount), which means a concurrent remount can change the RO flag of
the original fs super block.

If that happens, the retry itself fails with -EBUSY.
And this time we treat any failure as an error, without any further
retry, which causes the above EBUSY mount failure.

[FIX]
Since we are not holding any super block lock at all, there is no good
way to properly prevent a racing change of the super block's RO flag.

Thus fix the bug by retrying with an inverted read-only flag from the
previous attempt, until fc_mount() inside btrfs_reconfigure_for_mount()
either succeeds or returns a non-EBUSY error.

Since each retry uses an inverted RO flag from the previous attempt, we
will eventually win the race just by chance and can continue.

This also slightly changes the condition for whether we need to
reconfigure the fs, since it's possible that the successful attempt is
already using the correct RO flag.

Finally enhance the error message for btrfs_reconfigure_for_mount(), so
that btrfs will no longer silently error out during mount.

Fixes: f044b318675f ("btrfs: handle the ro->rw transition for mounting different subvolumes")
Reported-by: Enno Gotthold <egotthold@suse.com>
Reported-by: Fabian Vogt <fvogt@suse.com>
[ Special thanks for the reproducer and early analysis pointing to btrfs ]
Link: https://bugzilla.suse.com/show_bug.cgi?id=1231836
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/super.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

Patch

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index fa25db4aacf9..fe88cebb9dd1 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2002,28 +2002,47 @@  static struct vfsmount *btrfs_reconfigure_for_mount(struct fs_context *fc)
 {
 	struct vfsmount *mnt;
 	int ret;
-	const bool ro2rw = !(fc->sb_flags & SB_RDONLY);
+	const unsigned int old_sb_flags = fc->sb_flags;
 
+retry:
 	/*
 	 * We got an EBUSY because our SB_RDONLY flag didn't match the existing
 	 * super block, so invert our setting here and retry the mount so we
 	 * can get our vfsmount.
 	 */
-	if (ro2rw)
-		fc->sb_flags |= SB_RDONLY;
-	else
+	if (fc->sb_flags & SB_RDONLY)
 		fc->sb_flags &= ~SB_RDONLY;
-
+	else
+		fc->sb_flags |= SB_RDONLY;
 	mnt = fc_mount(fc);
-	if (IS_ERR(mnt))
+	/*
+	 * There is no super block lock to hold, thus we can have
+	 * another remount changing the RO/RW status.
+	 * So here we need to check if we got -EBUSY.
+	 * If we got one, retry with inverted RO flags again.
+	 */
+	if (IS_ERR(mnt) && PTR_ERR(mnt) == -EBUSY)
+		goto retry;
+	if (IS_ERR(mnt)) {
+		ret = PTR_ERR(mnt);
+		btrfs_err(NULL, "failed to mount during reconfigure: %d\n", ret);
+		return mnt;
+	}
+	if (!fc->oldapi)
 		return mnt;
 
-	if (!fc->oldapi || !ro2rw)
+	down_write(&mnt->mnt_sb->s_umount);
+	/*
+	 * The new mount is already matching our RO flags, or no need to
+	 * reconfigure to RW.
+	 */
+	if ((old_sb_flags & SB_RDONLY) == (mnt->mnt_sb->s_flags & SB_RDONLY) ||
+	    (old_sb_flags & SB_RDONLY)) {
+		up_write(&mnt->mnt_sb->s_umount);
 		return mnt;
-
+	}
 	/* We need to convert to rw, call reconfigure. */
 	fc->sb_flags &= ~SB_RDONLY;
-	down_write(&mnt->mnt_sb->s_umount);
 	ret = btrfs_reconfigure(fc);
 	up_write(&mnt->mnt_sb->s_umount);
 	if (ret) {