Message ID | 1633367733-14671-1-git-send-email-zhanglikernel@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | btrfs: clear BTRFS_DEV_STATE_MISSING bit in btrfs_close_one_device |
On Tue, Oct 05, 2021 at 01:15:33AM +0800, Li Zhang wrote:
> bug: https://github.com/kdave/btrfs-progs/issues/389
>
> The previous patch does not fix the bug right:
> https://lore.kernel.org/linux-btrfs/1632330390-29793-1-git-send-email-zhanglikernel@gmail.com
> So I write a new one

This looks correct, dropping the bit when we decrease the missing device
counter. I've added the patch to for-next for now, thanks.
Great, thanks!

David Sterba <dsterba@suse.cz> wrote on Mon, Oct 11, 2021 at 11:17 PM:
>
> On Tue, Oct 05, 2021 at 01:15:33AM +0800, Li Zhang wrote:
> > bug: https://github.com/kdave/btrfs-progs/issues/389
> >
> > The previous patch does not fix the bug right:
> > https://lore.kernel.org/linux-btrfs/1632330390-29793-1-git-send-email-zhanglikernel@gmail.com
> > So I write a new one
>
> This looks correct, dropping the bit when we decrease the missing device
> counter. I've added the patch to for-next for now, thanks.
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ec3b8a..56252cc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1122,8 +1122,10 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 	if (device->devid == BTRFS_DEV_REPLACE_DEVID)
 		clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
 
-	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) {
+		clear_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state);
 		fs_devices->missing_devices--;
+	}
 
 	btrfs_close_bdev(device);
 	if (device->bdev) {
bug: https://github.com/kdave/btrfs-progs/issues/389

The previous patch does not fix the bug correctly:
https://lore.kernel.org/linux-btrfs/1632330390-29793-1-git-send-email-zhanglikernel@gmail.com
So I wrote a new one.

The cause of the error appears to be that fs_devices->missing_devices is
decremented without clearing the BTRFS_DEV_STATE_MISSING bit in
device->dev_state.

Every time we umount the filesystem, close_ctree is called, which
eventually invokes btrfs_close_one_device to close the device. That
function only decrements fs_devices->missing_devices and does not clear
the device's BTRFS_DEV_STATE_MISSING bit.

Worse, this bug causes an integer wraparound, because
fs_devices->missing_devices is decremented on every umount. Once the
value reaches 0, the next decrement wraps it around.

I added a debug print to the read_one_dev function (not part of this
patch) to print the value of fs_devices->missing_devices.

[root@zllke test]# truncate -s 10g test1
[root@zllke test]# truncate -s 10g test2
[root@zllke test]# losetup /dev/loop1 test1
[root@zllke test]# losetup /dev/loop2 test2
[root@zllke test]# mkfs.btrfs -draid1 -mraid1 /dev/loop1 /dev/loop2 -f
[root@zllke test]# losetup -d /dev/loop2
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# dmesg
[ 168.728888] loop1: detected capacity change from 0 to 20971520
[ 168.751227] BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 1 transid 21 /dev/loop1 scanned by systemd-udevd (2311)
[ 169.179102] loop2: detected capacity change from 0 to 20971520
[ 169.198307] BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 2 transid 17 /dev/loop2 scanned by systemd-udevd (2313)
[ 190.696579] BTRFS info (device loop1): flagging fs with big metadata feature
[ 190.699445] BTRFS info (device loop1): allowing degraded mounts
[ 190.701819] BTRFS info (device loop1): using free space tree
[ 190.704126] BTRFS info (device loop1): has skinny extents
[ 190.708890] BTRFS info (device loop1): before clear_missing.00000000f706684d /dev/loop1 0
[ 190.711958] BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
[ 190.715370] BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
[ 209.075744] BTRFS info (device loop1): flagging fs with big metadata feature
[ 209.079106] BTRFS info (device loop1): allowing degraded mounts
[ 209.082042] BTRFS info (device loop1): using free space tree
[ 209.084791] BTRFS info (device loop1): has skinny extents
[ 209.089172] BTRFS info (device loop1): before clear_missing.00000000f706684d /dev/loop1 0
[ 209.093074] BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
[ 209.096848] BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 0
[ 218.778031] BTRFS info (device loop1): flagging fs with big metadata feature
[ 218.781504] BTRFS info (device loop1): allowing degraded mounts
[ 218.784319] BTRFS info (device loop1): using free space tree
[ 218.786902] BTRFS info (device loop1): has skinny extents
[ 218.791190] BTRFS info (device loop1): before clear_missing.00000000f706684d /dev/loop1 18446744073709551615
[ 218.795532] BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
[ 218.799320] BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 18446744073709551615

If fs_devices->missing_devices is 0, the next umount makes it
18446744073709551615.

After applying this patch, fs_devices->missing_devices appears to be
correct:

[root@zllke test]# truncate -s 10g test1
[root@zllke test]# truncate -s 10g test2
[root@zllke test]# losetup /dev/loop1 test1
[root@zllke test]# losetup /dev/loop2 test2
[root@zllke test]# mkfs.btrfs -draid1 -mraid1 /dev/loop1 /dev/loop2 -f
[root@zllke test]# losetup -d /dev/loop2
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# mount -o degraded /dev/loop1 /mnt/1
[root@zllke test]# umount /mnt/1
[root@zllke test]# dmesg
[ 80.647739] loop1: detected capacity change from 0 to 20971520
[ 81.268113] loop2: detected capacity change from 0 to 20971520
[ 90.694332] BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 1 transid 5 /dev/loop1 scanned by mkfs.btrfs (1863)
[ 90.705180] BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 2 transid 5 /dev/loop2 scanned by mkfs.btrfs (1863)
[ 104.935735] BTRFS info (device loop1): flagging fs with big metadata feature
[ 104.939020] BTRFS info (device loop1): allowing degraded mounts
[ 104.941637] BTRFS info (device loop1): disk space caching is enabled
[ 104.944442] BTRFS info (device loop1): has skinny extents
[ 104.948848] BTRFS info (device loop1): before clear_missing.00000000975bd577 /dev/loop1 0
[ 104.952365] BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
[ 104.956220] BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
[ 104.960602] BTRFS info (device loop1): checking UUID tree
[ 157.888711] BTRFS info (device loop1): flagging fs with big metadata feature
[ 157.892915] BTRFS info (device loop1): allowing degraded mounts
[ 157.896333] BTRFS info (device loop1): disk space caching is enabled
[ 157.899244] BTRFS info (device loop1): has skinny extents
[ 157.905068] BTRFS info (device loop1): before clear_missing.00000000975bd577 /dev/loop1 0
[ 157.908981] BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
[ 157.913540] BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
[ 161.057615] BTRFS info (device loop1): flagging fs with big metadata feature
[ 161.060874] BTRFS info (device loop1): allowing degraded mounts
[ 161.063422] BTRFS info (device loop1): disk space caching is enabled
[ 161.066179] BTRFS info (device loop1): has skinny extents
[ 161.069997] BTRFS info (device loop1): before clear_missing.00000000975bd577 /dev/loop1 0
[ 161.073328] BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
[ 161.077084] BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1

Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
---
 fs/btrfs/volumes.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
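
The counter behaviour described in the commit message can be reproduced in a small userspace model. The sketch below is not kernel code: the `model_*` names and structs are hypothetical stand-ins, and it assumes the mount path only increments the counter when the missing flag is not already set, which is consistent with the 1, 0, 18446744073709551615 sequence in the first dmesg output. The `clear_flag` parameter switches between the old behaviour and the patched one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for btrfs_fs_devices / btrfs_device. */
struct model_fs_devices {
	uint64_t missing_devices;	/* unsigned 64-bit, so 0 - 1 wraps around */
};

struct model_device {
	bool missing;			/* models BTRFS_DEV_STATE_MISSING */
	struct model_fs_devices *fs_devices;
};

/* Assumed mount-time path: count the device only if it is not flagged yet. */
static void model_mark_missing(struct model_device *dev)
{
	if (!dev->missing) {
		dev->missing = true;
		dev->fs_devices->missing_devices++;
	}
}

/*
 * Umount-time path modelled on btrfs_close_one_device.
 * clear_flag == false: old behaviour, decrement only.
 * clear_flag == true:  patched behaviour, clear the flag and decrement.
 */
static void model_close_device(struct model_device *dev, bool clear_flag)
{
	if (dev->missing) {
		if (clear_flag)
			dev->missing = false;
		dev->fs_devices->missing_devices--;
	}
}

/*
 * Run `cycles` degraded mount/umount cycles and return the counter value
 * observed during the last mount.
 */
static uint64_t model_run_cycles(int cycles, bool clear_flag)
{
	struct model_fs_devices fs = { .missing_devices = 0 };
	struct model_device dev = { .missing = false, .fs_devices = &fs };
	uint64_t seen = 0;

	for (int i = 0; i < cycles; i++) {
		model_mark_missing(&dev);
		seen = fs.missing_devices;
		model_close_device(&dev, clear_flag);
	}
	return seen;
}
```

In this model, `model_run_cycles(3, false)` returns UINT64_MAX (18446744073709551615), while `model_run_cycles(3, true)` returns 1 on every cycle, mirroring the two dmesg runs.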