Message ID | 3a6553bc8e7b4ea56f1ed0f1a3160fc1f7209df6.1605109916.git.johannes.thumshirn@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: hold device_list_mutex while accessing a btrfs_device's members | expand |
On 11/11/20 11:52 pm, Johannes Thumshirn wrote: > A struct btrfs_device's lifetime in device_list_add() is protected by the > device_list_mutex. So don't drop the device_list_mutex when printing a > duplicate device warning in device_list_add. > The only other thread which can free the %device is the userland initiated forget command. But both this (scan) and the forget threads are under %uuid_mutex. So %device is protected from freeing. Did we see any bug reproduced due to this? Thanks.
On 12/11/2020 04:09, Anand Jain wrote: > On 11/11/20 11:52 pm, Johannes Thumshirn wrote: >> A struct btrfs_device's lifetime in device_list_add() is protected by the >> device_list_mutex. So don't drop the device_list_mutex when printing a >> duplicate device warning in device_list_add. >> > > The only other thread which can free the %device is the userland > initiated forget command. But both this (scan) and the forget threads > are under %uuid_mutex. So %device is protected from freeing. > > Did we see any bug reproduced due to this? > > Thanks. > Yes and no, I've stumbled across this while trying to fix this syzbot report: https://github.com/btrfs/fstests/issues/29 It doesn't fix it though, but still it looks odd we're dropping the mutex and then access the device. Doesn't harm the other way round. I take it's more of a cosmetic fix than a bug fix.
On 12/11/20 3:24 pm, Johannes Thumshirn wrote: > On 12/11/2020 04:09, Anand Jain wrote: >> On 11/11/20 11:52 pm, Johannes Thumshirn wrote: >>> A struct btrfs_device's lifetime in device_list_add() is protected by the >>> device_list_mutex. So don't drop the device_list_mutex when printing a >>> duplicate device warning in device_list_add. >>> >> >> The only other thread which can free the %device is the userland >> initiated forget command. But both this (scan) and the forget threads >> are under %uuid_mutex. So %device is protected from freeing. >> >> Did we see any bug reproduced due to this? >> >> Thanks. >> > > Yes and no, I've stumbled across this while trying to fix this syzbot > report: https://github.com/btrfs/fstests/issues/29 Fix in the ML [1] can fix the above issue grossly. [1] https://lore.kernel.org/linux-btrfs/20200114060920.4527-2-anand.jain@oracle.com/T/ There is a scenario where the device->fs_info shall be NULL. Consider Thread-A writes device->bdev at 1w and device->fs_info at 2w. Thread-B reads device->bdev and device->fs_info at 3r, 4r, and 5r. There is a possible rw order as below when the uuid mutex is released. 1w, 3r, 4r, 5r, 2w And this shall lead to the scenario device->bdev != NULL and device->fs_info == NULL for the thread-B leading to the Warning. Thread-A Thread-B btrfs_mount_root() mutex_lock(&uuid_mutex); btrfs_open_devices() open_fs_devices() btrfs_open_one_device() device->bdev = bdev; <----1w mutex_unlock(&uuid_mutex); mutex_lock(&uuid_mutex); btrfs_scan_one_device() device_list_add() if (device->bdev) { <--4r btrfs_warn_in_rcu(device->fs_info, <--5r :: btrfs_info_in_rcu(device->fs_info, <---6r } mutex_unlock(&uuid_mutex); btrfs_fill_super() open_ctree() init_mount_fs_info() fs_info->sb = sb; <---2w init_tree_roots btrfs_read_roots() btrfs_init_devices_late() mutex_lock(&fs_devices->device_list_mutex); device->fs_info = fs_info <----3w mutex_unlock(&fs_devices->device_list_mutex); Also, I think there isn't a need to use the RCU here? (I may be wrong on the rcu part). > > It doesn't fix it though, but still it looks odd we're dropping the > mutex and then access the device. Doesn't harm the other way round. > > I take it's more of a cosmetic fix than a bug fix. > Thanks, Anand
On 12/11/2020 10:40, Anand Jain wrote: > > > On 12/11/20 3:24 pm, Johannes Thumshirn wrote: >> On 12/11/2020 04:09, Anand Jain wrote: >>> On 11/11/20 11:52 pm, Johannes Thumshirn wrote: >>>> A struct btrfs_device's lifetime in device_list_add() is protected by the >>>> device_list_mutex. So don't drop the device_list_mutex when printing a >>>> duplicate device warning in device_list_add. >>>> >>> >>> The only other thread which can free the %device is the userland >>> initiated forget command. But both this (scan) and the forget threads >>> are under %uuid_mutex. So %device is protected from freeing. >>> >>> Did we see any bug reproduced due to this? >>> >>> Thanks. >>> >> >> Yes and no, I've stumbled across this while trying to fix this syzbot > > >> report: https://github.com/btrfs/fstests/issues/29 > > > Fix in the ML [1] can fix the above issue grossly. > > [1] > > https://lore.kernel.org/linux-btrfs/20200114060920.4527-2-anand.jain@oracle.com/T/ > > There is a scenario where the device->fs_info shall be NULL. > > Consider Thread-A writes device->bdev at 1w and device->fs_info at 2w. > Thread-B reads device->bdev and device->fs_info at 3r, 4r, and 5r. > > There is a possible rw order as below when the uuid mutex is released. > > 1w, 3r, 4r, 5r, 2w > > And this shall lead to the scenario device->bdev != NULL and > device->fs_info == NULL for the thread-B leading to the Warning. device->fs_info is not necessarily NULL here unfortunately. Why aren't we simply doing this instead of the NO_FS_INFO dance: diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index bb1aa96e1233..4327c089183a 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -940,16 +940,16 @@ static noinline struct btrfs_device *device_list_add(const char *path, if (device->bdev != path_bdev) { bdput(path_bdev); mutex_unlock(&fs_devices->device_list_mutex); - btrfs_warn_in_rcu(device->fs_info, - "duplicate device %s devid %llu generation %llu scanned by %s (%d)", + pr_info( + "BTRFS: duplicate device %s devid %llu generation %llu scanned by %s (%d)", path, devid, found_transid, current->comm, task_pid_nr(current)); return ERR_PTR(-EEXIST); } bdput(path_bdev); - btrfs_info_in_rcu(device->fs_info, - "devid %llu device path %s changed to %s scanned by %s (%d)", + pr_info( + "BTRFS: devid %llu device path %s changed to %s scanned by %s (%d)", devid, rcu_str_deref(device->name), path, current->comm, task_pid_nr(current));
On 12.11.20 г. 11:54 ч., Johannes Thumshirn wrote: > On 12/11/2020 10:40, Anand Jain wrote: >> >> >> On 12/11/20 3:24 pm, Johannes Thumshirn wrote: >>> On 12/11/2020 04:09, Anand Jain wrote: >>>> On 11/11/20 11:52 pm, Johannes Thumshirn wrote: >>>>> A struct btrfs_device's lifetime in device_list_add() is protected by the >>>>> device_list_mutex. So don't drop the device_list_mutex when printing a >>>>> duplicate device warning in device_list_add. >>>>> >>>> >>>> The only other thread which can free the %device is the userland >>>> initiated forget command. But both this (scan) and the forget threads >>>> are under %uuid_mutex. So %device is protected from freeing. >>>> >>>> Did we see any bug reproduced due to this? >>>> >>>> Thanks. >>>> >>> >>> Yes and no, I've stumbled across this while trying to fix this syzbot >> >> >>> report: https://github.com/btrfs/fstests/issues/29 >> >> >> Fix in the ML [1] can fix the above issue grossly. >> >> [1] >> >> https://lore.kernel.org/linux-btrfs/20200114060920.4527-2-anand.jain@oracle.com/T/ >> >> There is a scenario where the device->fs_info shall be NULL. >> >> Consider Thread-A writes device->bdev at 1w and device->fs_info at 2w. >> Thread-B reads device->bdev and device->fs_info at 3r, 4r, and 5r. >> >> There is a possible rw order as below when the uuid mutex is released. >> >> 1w, 3r, 4r, 5r, 2w >> >> And this shall lead to the scenario device->bdev != NULL and >> device->fs_info == NULL for the thread-B leading to the Warning. > > device->fs_info is not necessarily NULL here unfortunately. > > Why aren't we simply doing this instead of the NO_FS_INFO dance: > Because the btrfs_* helpers also provide the fsid of the system for which an event happened and this becomes relevant when you have a multi-btrfs system.
On 12/11/2020 10:58, Nikolay Borisov wrote: > Because the btrfs_* helpers also provide the fsid of the system for > which an event happened and this becomes relevant when you have a > multi-btrfs system. But you won't get the fsid with NO_FS_INFO either: diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index a906315efd19..5bd8a889fed0 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -216,9 +216,17 @@ void __cold btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, . vaf.fmt = fmt; vaf.va = &args; - if (__ratelimit(ratelimit)) - printk("%sBTRFS %s (device %s): %pV\n", lvl, type, - fs_info ? fs_info->sb->s_id : "<unknown>", &vaf); + if (__ratelimit(ratelimit)) { + if (fs_info == NULL) + printk("%sBTRFS %s (device %s): %pV\n", lvl, type, + "<unknown>", &vaf); + else if (fs_info == NO_FS_INFO) + printk("%sBTRFS %s (device %s): %pV\n", lvl, type, + "...", &vaf); + else + printk("%sBTRFS %s (device %s): %pV\n", lvl, type, + fs_info->sb->s_id, &vaf); + } The equivalent to this in the context of device_list_add() would then be: pr_warn("BTRFS warning (device ...): duplicate device %s devid %llu generation %llu scanned by %s (%d)", path, devid, found_transid, current->comm, task_pid_nr(current));
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index c927dc597550..a653b778b49f 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -939,12 +939,12 @@ static noinline struct btrfs_device *device_list_add(const char *path, if (device->bdev != path_bdev) { bdput(path_bdev); - mutex_unlock(&fs_devices->device_list_mutex); btrfs_warn_in_rcu(device->fs_info, "duplicate device %s devid %llu generation %llu scanned by %s (%d)", path, devid, found_transid, current->comm, task_pid_nr(current)); + mutex_unlock(&fs_devices->device_list_mutex); return ERR_PTR(-EEXIST); } bdput(path_bdev);
A struct btrfs_device's lifetime in device_list_add() is protected by the device_list_mutex. So don't drop the device_list_mutex when printing a duplicate device warning in device_list_add. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> --- fs/btrfs/volumes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)