
btrfs: clear BTRFS_DEV_STATE_MISSING bit in btrfs_close_one_device

Message ID 1633367733-14671-1-git-send-email-zhanglikernel@gmail.com (mailing list archive)
State New, archived

Commit Message

Li Zhang Oct. 4, 2021, 5:15 p.m. UTC
bug: https://github.com/kdave/btrfs-progs/issues/389

The previous patch did not fix the bug correctly:
https://lore.kernel.org/linux-btrfs/1632330390-29793-1-git-send-email-zhanglikernel@gmail.com
so here is a new one.

The cause of the error appears to be that fs_devices->missing_devices is
decremented without clearing the matching bit in device->dev_state.
Every time the filesystem is unmounted, close_ctree is called, which
eventually calls btrfs_close_one_device to close each device. That function
decrements fs_devices->missing_devices but does not clear the device's
BTRFS_DEV_STATE_MISSING bit. Worse, this causes an unsigned integer
underflow: fs_devices->missing_devices is decremented on every umount,
and once the value has reached 0 the next decrement wraps around.

I added a debug print to the read_one_dev function (not part of this patch)
to print the value of fs_devices->missing_devices.

[root@zllke test]# truncate -s 10g test1
[root@zllke test]# truncate -s 10g test2
[root@zllke test]# losetup /dev/loop1 test1
[root@zllke test]# losetup /dev/loop2 test2
[root@zllke test]# mkfs.btrfs -draid1 -mraid1 /dev/loop1 /dev/loop2 -f
[root@zllke test]# losetup -d /dev/loop2
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# dmesg
[  168.728888] loop1: detected capacity change from 0 to 20971520
[  168.751227] BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 1 transid 21 /dev/loop1 scanned by systemd-udevd (2311)
[  169.179102] loop2: detected capacity change from 0 to 20971520
[  169.198307] BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 2 transid 17 /dev/loop2 scanned by systemd-udevd (2313)
[  190.696579] BTRFS info (device loop1): flagging fs with big metadata feature
[  190.699445] BTRFS info (device loop1): allowing degraded mounts
[  190.701819] BTRFS info (device loop1): using free space tree
[  190.704126] BTRFS info (device loop1): has skinny extents
[  190.708890] BTRFS info (device loop1):  before clear_missing.00000000f706684d /dev/loop1 0
[  190.711958] BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
[  190.715370] BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
[  209.075744] BTRFS info (device loop1): flagging fs with big metadata feature
[  209.079106] BTRFS info (device loop1): allowing degraded mounts
[  209.082042] BTRFS info (device loop1): using free space tree
[  209.084791] BTRFS info (device loop1): has skinny extents
[  209.089172] BTRFS info (device loop1):  before clear_missing.00000000f706684d /dev/loop1 0
[  209.093074] BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
[  209.096848] BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 0
[  218.778031] BTRFS info (device loop1): flagging fs with big metadata feature
[  218.781504] BTRFS info (device loop1): allowing degraded mounts
[  218.784319] BTRFS info (device loop1): using free space tree
[  218.786902] BTRFS info (device loop1): has skinny extents
[  218.791190] BTRFS info (device loop1):  before clear_missing.00000000f706684d /dev/loop1 18446744073709551615
[  218.795532] BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
[  218.799320] BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 18446744073709551615

Once fs_devices->missing_devices has reached 0, the next umount wraps it
around to 18446744073709551615.

After applying this patch, fs_devices->missing_devices appears to be correct:

[root@zllke test]# truncate -s 10g test1
[root@zllke test]# truncate -s 10g test2
[root@zllke test]# losetup /dev/loop1 test1
[root@zllke test]# losetup /dev/loop2 test2
[root@zllke test]# mkfs.btrfs -draid1 -mraid1 /dev/loop1 /dev/loop2 -f
[root@zllke test]# losetup -d /dev/loop2
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# dmesg
[   80.647739] loop1: detected capacity change from 0 to 20971520
[   81.268113] loop2: detected capacity change from 0 to 20971520
[   90.694332] BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 1 transid 5 /dev/loop1 scanned by mkfs.btrfs (1863)
[   90.705180] BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 2 transid 5 /dev/loop2 scanned by mkfs.btrfs (1863)
[  104.935735] BTRFS info (device loop1): flagging fs with big metadata feature
[  104.939020] BTRFS info (device loop1): allowing degraded mounts
[  104.941637] BTRFS info (device loop1): disk space caching is enabled
[  104.944442] BTRFS info (device loop1): has skinny extents
[  104.948848] BTRFS info (device loop1):  before clear_missing.00000000975bd577 /dev/loop1 0
[  104.952365] BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
[  104.956220] BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
[  104.960602] BTRFS info (device loop1): checking UUID tree
[  157.888711] BTRFS info (device loop1): flagging fs with big metadata feature
[  157.892915] BTRFS info (device loop1): allowing degraded mounts
[  157.896333] BTRFS info (device loop1): disk space caching is enabled
[  157.899244] BTRFS info (device loop1): has skinny extents
[  157.905068] BTRFS info (device loop1):  before clear_missing.00000000975bd577 /dev/loop1 0
[  157.908981] BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
[  157.913540] BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
[  161.057615] BTRFS info (device loop1): flagging fs with big metadata feature
[  161.060874] BTRFS info (device loop1): allowing degraded mounts
[  161.063422] BTRFS info (device loop1): disk space caching is enabled
[  161.066179] BTRFS info (device loop1): has skinny extents
[  161.069997] BTRFS info (device loop1):  before clear_missing.00000000975bd577 /dev/loop1 0
[  161.073328] BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
[  161.077084] BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1

Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
---
 fs/btrfs/volumes.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

David Sterba Oct. 11, 2021, 3:17 p.m. UTC | #1
On Tue, Oct 05, 2021 at 01:15:33AM +0800, Li Zhang wrote:
> bug: https://github.com/kdave/btrfs-progs/issues/389
> 
> The previous patch does not fix the bug right:
> https://lore.kernel.org/linux-btrfs/1632330390-29793-1-git-send-email-zhanglikernel@gmail.com
> So I write a new one

This looks correct, dropping the bit when we decrease the missing device
counter. I've added the patch to for-next for now, thanks.

Li Zhang Oct. 12, 2021, 4:34 a.m. UTC | #2
Great, thanks!

David Sterba <dsterba@suse.cz> wrote on Mon, Oct 11, 2021 at 11:17 PM:
>
> On Tue, Oct 05, 2021 at 01:15:33AM +0800, Li Zhang wrote:
> > bug: https://github.com/kdave/btrfs-progs/issues/389
> >
> > The previous patch does not fix the bug right:
> > https://lore.kernel.org/linux-btrfs/1632330390-29793-1-git-send-email-zhanglikernel@gmail.com
> > So I write a new one
>
> This looks correct, dropping the bit when we decrease the missing device
> counter. I've added the patch to for-next for now, thanks.

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ec3b8a..56252cc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1122,8 +1122,10 @@  static void btrfs_close_one_device(struct btrfs_device *device)
 	if (device->devid == BTRFS_DEV_REPLACE_DEVID)
 		clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
 
-	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) {
+		clear_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state);
 		fs_devices->missing_devices--;
+	}
 
 	btrfs_close_bdev(device);
 	if (device->bdev) {