Message ID | 20200722080925.6802-2-nborisov@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Misc cleanups around device addition | expand |
On 22/07/2020 10:09, Nikolay Borisov wrote: > When adding a new device there's a mandatory check to see if a device is > being duplicated to the filesystem it's added to. Since this is a > read-only operations not necessary to take device_list_mutex and can simply > make do with an rcu-readlock. No semantics changes. Hmm what are the actual locking rules for the device list? For instance looking at add_missing_dev() in volumes.c addition to the list is unprotected (both from read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on. Thanks, Johannes
On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote: > On 22/07/2020 10:09, Nikolay Borisov wrote: >> When adding a new device there's a mandatory check to see if a device is >> being duplicated to the filesystem it's added to. Since this is a >> read-only operations not necessary to take device_list_mutex and can simply >> make do with an rcu-readlock. No semantics changes. > > Hmm what are the actual locking rules for the device list? For instance looking > at add_missing_dev() in volumes.c addition to the list is unprotected (both from > read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do > a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on. Bummer, the locking rules are supposed to be those documented atop volume.c, namely: * fs_devices::device_list_mutex (per-fs, with RCU) 18 * ------------------------------------------------ 17 * protects updates to fs_devices::devices, ie. adding and deleting 16 * 15 * simple list traversal with read-only actions can be done with RCU protection 14 * 13 * may be used to exclude some operations from running concurrently without any 12 * modifications to the list (see write_all_supers) However, your observations seem correct and rather annoying. Let me go and try and figure this out... > > Thanks, > Johannes >
On Wed, Jul 22, 2020 at 01:36:15PM +0300, Nikolay Borisov wrote: > > > On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote: > > On 22/07/2020 10:09, Nikolay Borisov wrote: > >> When adding a new device there's a mandatory check to see if a device is > >> being duplicated to the filesystem it's added to. Since this is a > >> read-only operations not necessary to take device_list_mutex and can simply > >> make do with an rcu-readlock. No semantics changes. > > > > Hmm what are the actual locking rules for the device list? For instance looking > > at add_missing_dev() in volumes.c addition to the list is unprotected (both from > > read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do > > a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on. > > Bummer, the locking rules are supposed to be those documented atop > volume.c, namely: > > * fs_devices::device_list_mutex (per-fs, with RCU) > > 18 * ------------------------------------------------ > > 17 * protects updates to fs_devices::devices, ie. adding and > deleting > 16 * > > 15 * simple list traversal with read-only actions can be done with > RCU protection > 14 * > > 13 * may be used to exclude some operations from running > concurrently without any > 12 * modifications to the list (see write_all_supers) > > > > However, your observations seem correct and rather annoying. Let me go > and try and figure this out... For device list it's important to know from which context is the list used. On unmoutned filesystems, the devices can be added by scanning ioctl. On mounted filesystem, before the mount is finished, the devices cannot be concurrently used (single-threaded context) and uuid_mutex is temporarily protecting the devices agains scan ioctl. On finished mount device list must be protected by device_list_mutex from the same filesystem changes (ioctls, superblock write). Device scanning is a no-op here, but still needs to use the uuid_mutex. Enter RCU. For performance reasons we don't want to take device_list_mutex for read-only operations like show_devname or lookups of device id, where it's not critical that the returned information is stale. The mentioned helpers are used by various functions and the context is not obvious or documented, but it should be clear once the caller chain is known. I can turn that into comments but please let me know if this is sufficient as explanation or if you need more.
On 22/07/2020 16:48, David Sterba wrote: > On Wed, Jul 22, 2020 at 01:36:15PM +0300, Nikolay Borisov wrote: >> >> >> On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote: >>> On 22/07/2020 10:09, Nikolay Borisov wrote: >>>> When adding a new device there's a mandatory check to see if a device is >>>> being duplicated to the filesystem it's added to. Since this is a >>>> read-only operations not necessary to take device_list_mutex and can simply >>>> make do with an rcu-readlock. No semantics changes. >>> >>> Hmm what are the actual locking rules for the device list? For instance looking >>> at add_missing_dev() in volumes.c addition to the list is unprotected (both from >>> read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do >>> a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on. >> >> Bummer, the locking rules are supposed to be those documented atop >> volume.c, namely: >> >> * fs_devices::device_list_mutex (per-fs, with RCU) >> >> 18 * ------------------------------------------------ >> >> 17 * protects updates to fs_devices::devices, ie. adding and >> deleting >> 16 * >> >> 15 * simple list traversal with read-only actions can be done with >> RCU protection >> 14 * >> >> 13 * may be used to exclude some operations from running >> concurrently without any >> 12 * modifications to the list (see write_all_supers) >> >> >> >> However, your observations seem correct and rather annoying. Let me go >> and try and figure this out... > > For device list it's important to know from which context is the list > used. > > On unmoutned filesystems, the devices can be added by scanning ioctl. > > On mounted filesystem, before the mount is finished, the devices cannot > be concurrently used (single-threaded context) and uuid_mutex is > temporarily protecting the devices agains scan ioctl. > > On finished mount device list must be protected by device_list_mutex > from the same filesystem changes (ioctls, superblock write). Device > scanning is a no-op here, but still needs to use the uuid_mutex. > > Enter RCU. For performance reasons we don't want to take > device_list_mutex for read-only operations like show_devname or lookups > of device id, where it's not critical that the returned information is > stale. OK I got confused by a) the mix of unlocked, device_list_mutex and uuid_mutex use and b) sometimes we use plain list_* operations and sometimes rcu list ones. > The mentioned helpers are used by various functions and the context is > not obvious or documented, but it should be clear once the caller chain > is known. > > I can turn that into comments but please let me know if this is > sufficient as explanation or if you need more. > This would be great yes.
On 22.07.20 г. 17:47 ч., David Sterba wrote: > On Wed, Jul 22, 2020 at 01:36:15PM +0300, Nikolay Borisov wrote: >> >> >> On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote: >>> On 22/07/2020 10:09, Nikolay Borisov wrote: >>>> When adding a new device there's a mandatory check to see if a device is >>>> being duplicated to the filesystem it's added to. Since this is a >>>> read-only operations not necessary to take device_list_mutex and can simply >>>> make do with an rcu-readlock. No semantics changes. >>> >>> Hmm what are the actual locking rules for the device list? For instance looking >>> at add_missing_dev() in volumes.c addition to the list is unprotected (both from >>> read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do >>> a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on. >> >> Bummer, the locking rules are supposed to be those documented atop >> volume.c, namely: >> >> * fs_devices::device_list_mutex (per-fs, with RCU) >> >> 18 * ------------------------------------------------ >> >> 17 * protects updates to fs_devices::devices, ie. adding and >> deleting >> 16 * >> >> 15 * simple list traversal with read-only actions can be done with >> RCU protection >> 14 * >> >> 13 * may be used to exclude some operations from running >> concurrently without any >> 12 * modifications to the list (see write_all_supers) >> >> >> >> However, your observations seem correct and rather annoying. Let me go >> and try and figure this out... > > For device list it's important to know from which context is the list > used. > > On unmoutned filesystems, the devices can be added by scanning ioctl. > > On mounted filesystem, before the mount is finished, the devices cannot > be concurrently used (single-threaded context) and uuid_mutex is > temporarily protecting the devices agains scan ioctl. > > On finished mount device list must be protected by device_list_mutex > from the same filesystem changes (ioctls, superblock write). Device > scanning is a no-op here, but still needs to use the uuid_mutex. > > Enter RCU. For performance reasons we don't want to take > device_list_mutex for read-only operations like show_devname or lookups > of device id, where it's not critical that the returned information is > stale. > > The mentioned helpers are used by various functions and the context is > not obvious or documented, but it should be clear once the caller chain > is known. > > I can turn that into comments but please let me know if this is > sufficient as explanation or if you need more. > Yes having those different contexts spelled out is really beneficial.
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9f338d9db51b..0795ab511f8b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2502,16 +2502,15 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path filemap_write_and_wait(bdev->bd_inode->i_mapping); - mutex_lock(&fs_devices->device_list_mutex); - list_for_each_entry(device, &fs_devices->devices, dev_list) { + rcu_read_lock(); + list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) { if (device->bdev == bdev) { ret = -EEXIST; - mutex_unlock( - &fs_devices->device_list_mutex); + rcu_read_unlock(); goto error; } } - mutex_unlock(&fs_devices->device_list_mutex); + rcu_read_unlock(); device = btrfs_alloc_device(fs_info, NULL, NULL); if (IS_ERR(device)) {
When adding a new device there's a mandatory check to see if a device is being duplicated to the filesystem it's added to. Since this is a read-only operations not necessary to take device_list_mutex and can simply make do with an rcu-readlock. No semantics changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> --- fs/btrfs/volumes.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)