diff mbox

btrfs: s_bdev is not null after missing replace

Message ID 1460629450-2271-1-git-send-email-anand.jain@oracle.com (mailing list archive)
State Superseded
Headers show

Commit Message

Anand Jain April 14, 2016, 10:24 a.m. UTC
Yauhen reported in the ML that s_bdev is null at mount, and
s_bdev gets updated to some device when missing device is
replaced, as because bdev is null for missing device, things
gets matched up. Fix this by checking if s_bdev is set. I
didn't want to completely remove updating s_bdev because
the future multi device support at vfs layer may need it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
---
 fs/btrfs/dev-replace.c | 3 ++-
 fs/btrfs/volumes.c     | 6 ++++--
 2 files changed, 6 insertions(+), 3 deletions(-)

Comments

David Sterba May 3, 2016, 5:31 p.m. UTC | #1
On Thu, Apr 14, 2016 at 06:24:10PM +0800, Anand Jain wrote:
> Yauhen reported in the ML that s_bdev is null at mount, and
> s_bdev gets updated to some device when missing device is
> replaced, as because bdev is null for missing device, things
> gets matched up. Fix this by checking if s_bdev is set. I
> didn't want to completely remove updating s_bdev because
> the future multi device support at vfs layer may need it.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>

Do you have a testcase for that? As there are more patches touching the
device pointers I'd rather see some test coverage. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yauhen Kharuzhy May 3, 2016, 8:34 p.m. UTC | #2
On Tue, May 03, 2016 at 07:31:48PM +0200, David Sterba wrote:
> On Thu, Apr 14, 2016 at 06:24:10PM +0800, Anand Jain wrote:
> > Yauhen reported in the ML that s_bdev is null at mount, and
> > s_bdev gets updated to some device when missing device is
> > replaced, as because bdev is null for missing device, things
> > gets matched up. Fix this by checking if s_bdev is set. I
> > didn't want to completely remove updating s_bdev because
> > the future multi device support at vfs layer may need it.
> > 
> > Signed-off-by: Anand Jain <anand.jain@oracle.com>
> > Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
> 
> Do you have a testcase for that? As there are more patches touching the
> device pointers I'd rather see some test coverage. Thanks.

Testcase is ('global spare' patchset is needed for device closing
support):

1) create RAID, mount -> s_bdev is NULL
2) replace latest device by another -> s_bdev is bdev of new drive
3) remove drive added by replace
4) touch mountpount and do btrfs fi sync (device will be closed and
marked as missing here) -> s_bdev is invalid pointer to closed and freed
bdev
5) unmount FS -> oops

Something like:

mkfs.btrfs -d raid5 -m raid5 <dev1> <dev2>... <devN-1>
mount <dev1> /mnt
btrfs replace start -B <devN-1> <devN> /mnt
_devmgt_remove <devN> # _devmgt_remove is helper for detaching scsi device
touch /mnt
btrfs fi sync /mnt
umount /mnt


I will try to make xfstest scripts for this case and for other cases
reported by me.
Yauhen Kharuzhy May 3, 2016, 8:42 p.m. UTC | #3
On Tue, May 03, 2016 at 11:34:50PM +0300, Yauhen Kharuzhy wrote:
> On Tue, May 03, 2016 at 07:31:48PM +0200, David Sterba wrote:
> > On Thu, Apr 14, 2016 at 06:24:10PM +0800, Anand Jain wrote:
> > > Yauhen reported in the ML that s_bdev is null at mount, and
> > > s_bdev gets updated to some device when missing device is
> > > replaced, as because bdev is null for missing device, things
> > > gets matched up. Fix this by checking if s_bdev is set. I
> > > didn't want to completely remove updating s_bdev because
> > > the future multi device support at vfs layer may need it.
> > > 
> > > Signed-off-by: Anand Jain <anand.jain@oracle.com>
> > > Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
> > 
> > Do you have a testcase for that? As there are more patches touching the
> > device pointers I'd rather see some test coverage. Thanks.
> 
> Testcase is ('global spare' patchset is needed for device closing
> support):
> 
> 1) create RAID, mount -> s_bdev is NULL
sorry, for match old s_bdev (NULL) with replaced device:

1.1) remove latest drive and touch FS to make device missing -> device->bdev is NULL

> 2) replace latest device by another -> s_bdev is bdev of new drive
> 3) remove drive added by replace
> 4) touch mountpount and do btrfs fi sync (device will be closed and
> marked as missing here) -> s_bdev is invalid pointer to closed and freed
> bdev
> 5) unmount FS -> oops
> 
> Something like:
> 
> mkfs.btrfs -d raid5 -m raid5 <dev1> <dev2>... <devN-1>
> mount <dev1> /mnt
_devmgt_remove <devN-1>
touch /mnt
btrfs fi sync /mnt

btrfs replace start -B <missing devid> <devN> /mnt
> _devmgt_remove <devN> # _devmgt_remove is helper for detaching scsi device
> touch /mnt
> btrfs fi sync /mnt
> umount /mnt
> 
> 
> I will try to make xfstest scripts for this case and for other cases
> reported by me.
> 
> -- 
> Yauhen Kharuzhy
diff mbox

Patch

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index ddc4843604df..f8ff67609c18 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -569,7 +569,8 @@  static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	ASSERT(list_empty(&src_device->resized_list));
 	tgt_device->commit_total_bytes = src_device->commit_total_bytes;
 	tgt_device->commit_bytes_used = src_device->bytes_used;
-	if (fs_info->sb->s_bdev == src_device->bdev)
+	if (fs_info->sb->s_bdev &&
+		(fs_info->sb->s_bdev == src_device->bdev))
 		fs_info->sb->s_bdev = tgt_device->bdev;
 	if (fs_info->fs_devices->latest_bdev == src_device->bdev)
 		fs_info->fs_devices->latest_bdev = tgt_device->bdev;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c2a87fc127a7..06de1e9b7891 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1898,7 +1898,8 @@  int btrfs_rm_device(struct btrfs_root *root, char *device_path)
 
 	next_device = list_entry(root->fs_info->fs_devices->devices.next,
 				 struct btrfs_device, dev_list);
-	if (device->bdev == root->fs_info->sb->s_bdev)
+	if (root->fs_info->sb->s_bdev &&
+		(root->fs_info->sb->s_bdev == device->bdev))
 		root->fs_info->sb->s_bdev = next_device->bdev;
 	if (device->bdev == root->fs_info->fs_devices->latest_bdev)
 		root->fs_info->fs_devices->latest_bdev = next_device->bdev;
@@ -2084,7 +2085,8 @@  void btrfs_destroy_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 
 	next_device = list_entry(fs_info->fs_devices->devices.next,
 				 struct btrfs_device, dev_list);
-	if (tgtdev->bdev == fs_info->sb->s_bdev)
+	if (fs_info->sb->s_bdev &&
+		(tgtdev->bdev == fs_info->sb->s_bdev))
 		fs_info->sb->s_bdev = next_device->bdev;
 	if (tgtdev->bdev == fs_info->fs_devices->latest_bdev)
 		fs_info->fs_devices->latest_bdev = next_device->bdev;