From patchwork Fri Sep 9 23:03:38 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anand Jain
X-Patchwork-Id: 9324579
From: Anand Jain
To: linux-btrfs@vger.kernel.org
Cc: dsterba@suse.cz, idryomov@gmail.com
Subject: [PATCH v2] btrfs: fix a possible umount deadlock
Date: Sat, 10 Sep 2016 07:03:38 +0800
Message-Id: <20160909230338.1493-1-anand.jain@oracle.com>
In-Reply-To: <20160909083104.10383-1-anand.jain@oracle.com>
References: <20160909083104.10383-1-anand.jain@oracle.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

btrfs_show_devname() takes the device_list_mutex, and a call to
blkdev_put() can sometimes lead the VFS to call into that function.
So, for now, call blkdev_put() outside of device_list_mutex.

[ 983.284212] ======================================================
[ 983.290401] [ INFO: possible circular locking dependency detected ]
[ 983.296677] 4.8.0-rc5-ceph-00023-g1b39cec2 #1 Not tainted
[ 983.302081] -------------------------------------------------------
[ 983.308357] umount/21720 is trying to acquire lock:
[ 983.313243]  (&bdev->bd_mutex){+.+.+.}, at: [] blkdev_put+0x31/0x150
[ 983.321264]
[ 983.321264] but task is already holding lock:
[ 983.327101]  (&fs_devs->device_list_mutex){+.+...}, at: [] __btrfs_close_devices+0x46/0x200 [btrfs]
[ 983.337839]
[ 983.337839] which lock already depends on the new lock.
[ 983.337839]
[ 983.346024]
[ 983.346024] the existing dependency chain (in reverse order) is:
[ 983.353512]
-> #4 (&fs_devs->device_list_mutex){+.+...}:
[ 983.359096]        [] lock_acquire+0x1bc/0x1f0
[ 983.365143]        [] mutex_lock_nested+0x65/0x350
[ 983.371521]        [] btrfs_show_devname+0x36/0x1f0 [btrfs]
[ 983.378710]        [] show_vfsmnt+0x4e/0x150
[ 983.384593]        [] m_show+0x17/0x20
[ 983.389957]        [] seq_read+0x2b5/0x3b0
[ 983.395669]        [] __vfs_read+0x28/0x100
[ 983.401464]        [] vfs_read+0xab/0x150
[ 983.407080]        [] SyS_read+0x52/0xb0
[ 983.412609]        [] entry_SYSCALL_64_fastpath+0x23/0xc1
[ 983.419617]
-> #3 (namespace_sem){++++++}:
[ 983.424024]        [] lock_acquire+0x1bc/0x1f0
[ 983.430074]        [] down_write+0x49/0x80
[ 983.435785]        [] lock_mount+0x67/0x1c0
[ 983.441582]        [] do_add_mount+0x32/0xf0
[ 983.447458]        [] finish_automount+0x5a/0xc0
[ 983.453682]        [] follow_managed+0x1b3/0x2a0
[ 983.459912]        [] lookup_fast+0x300/0x350
[ 983.465875]        [] path_openat+0x3a7/0xaa0
[ 983.471846]        [] do_filp_open+0x85/0xe0
[ 983.477731]        [] do_sys_open+0x14c/0x1f0
[ 983.483702]        [] SyS_open+0x1e/0x20
[ 983.489240]        [] entry_SYSCALL_64_fastpath+0x23/0xc1
[ 983.496254]
-> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}:
[ 983.501798]        [] lock_acquire+0x1bc/0x1f0
[ 983.507855]        [] down_write+0x49/0x80
[ 983.513558]        [] start_creating+0x87/0x100
[ 983.519703]        [] debugfs_create_dir+0x17/0x100
[ 983.526195]        [] bdi_register+0x93/0x210
[ 983.532165]        [] bdi_register_owner+0x43/0x70
[ 983.538570]        [] device_add_disk+0x1fb/0x450
[ 983.544888]        [] loop_add+0x1e6/0x290
[ 983.550596]        [] loop_init+0x10b/0x14f
[ 983.556394]        [] do_one_initcall+0xa7/0x180
[ 983.562618]        [] kernel_init_freeable+0x1cc/0x266
[ 983.569370]        [] kernel_init+0xe/0x100
[ 983.575166]        [] ret_from_fork+0x1f/0x40
[ 983.581131]
-> #1 (loop_index_mutex){+.+.+.}:
[ 983.585801]        [] lock_acquire+0x1bc/0x1f0
[ 983.591858]        [] mutex_lock_nested+0x65/0x350
[ 983.598256]        [] lo_open+0x1f/0x60
[ 983.603704]        [] __blkdev_get+0x123/0x400
[ 983.609757]        [] blkdev_get+0x34a/0x350
[ 983.615639]        [] blkdev_open+0x64/0x80
[ 983.621428]        [] do_dentry_open+0x1c6/0x2d0
[ 983.627651]        [] vfs_open+0x69/0x80
[ 983.633181]        [] path_openat+0x834/0xaa0
[ 983.639152]        [] do_filp_open+0x85/0xe0
[ 983.645035]        [] do_sys_open+0x14c/0x1f0
[ 983.650999]        [] SyS_open+0x1e/0x20
[ 983.656535]        [] entry_SYSCALL_64_fastpath+0x23/0xc1
[ 983.663541]
-> #0 (&bdev->bd_mutex){+.+.+.}:
[ 983.668107]        [] __lock_acquire+0x1003/0x17b0
[ 983.674510]        [] lock_acquire+0x1bc/0x1f0
[ 983.680561]        [] mutex_lock_nested+0x65/0x350
[ 983.686967]        [] blkdev_put+0x31/0x150
[ 983.692761]        [] btrfs_close_bdev+0x4f/0x60 [btrfs]
[ 983.699699]        [] __btrfs_close_devices+0xcb/0x200 [btrfs]
[ 983.707178]        [] btrfs_close_devices+0x2b/0xa0 [btrfs]
[ 983.714380]        [] close_ctree+0x265/0x340 [btrfs]
[ 983.721061]        [] btrfs_put_super+0x19/0x20 [btrfs]
[ 983.727908]        [] generic_shutdown_super+0x6f/0x100
[ 983.734744]        [] kill_anon_super+0x16/0x30
[ 983.740888]        [] btrfs_kill_super+0x1e/0x130 [btrfs]
[ 983.747909]        [] deactivate_locked_super+0x49/0x80
[ 983.754745]        [] deactivate_super+0x5d/0x70
[ 983.760977]        [] cleanup_mnt+0x5c/0x80
[ 983.766773]        [] __cleanup_mnt+0x12/0x20
[ 983.772738]        [] task_work_run+0x7e/0xc0
[ 983.778708]        [] exit_to_usermode_loop+0x7e/0xb4
[ 983.785373]        [] syscall_return_slowpath+0xbb/0xd0
[ 983.792212]        [] entry_SYSCALL_64_fastpath+0xbf/0xc1
[ 983.799225]
[ 983.799225] other info that might help us debug this:
[ 983.799225]
[ 983.807291] Chain exists of: &bdev->bd_mutex --> namespace_sem --> &fs_devs->device_list_mutex
[ 983.816521]  Possible unsafe locking scenario:
[ 983.816521]
[ 983.822489]        CPU0                    CPU1
[ 983.827043]        ----                    ----
[ 983.831599]   lock(&fs_devs->device_list_mutex);
[ 983.836289]                                lock(namespace_sem);
[ 983.842268]                                lock(&fs_devs->device_list_mutex);
[ 983.849478]   lock(&bdev->bd_mutex);
[ 983.853127]
[ 983.853127]  *** DEADLOCK ***
[ 983.853127]
[ 983.859113] 3 locks held by umount/21720:
[ 983.863145]  #0:  (&type->s_umount_key#35){++++..}, at: [] deactivate_super+0x55/0x70
[ 983.872713]  #1:  (uuid_mutex){+.+.+.}, at: [] btrfs_close_devices+0x23/0xa0 [btrfs]
[ 983.882206]  #2:  (&fs_devs->device_list_mutex){+.+...}, at: [] __btrfs_close_devices+0x46/0x200 [btrfs]
[ 983.893422]
[ 983.893422] stack backtrace:
[ 983.897824] CPU: 6 PID: 21720 Comm: umount Not tainted 4.8.0-rc5-ceph-00023-g1b39cec2 #1
[ 983.905958] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
[ 983.913492]  0000000000000000 ffff8c8a53c17a38 ffffffff91429521 ffffffff9260f4f0
[ 983.921018]  ffffffff92642760 ffff8c8a53c17a88 ffffffff911b2b04 0000000000000050
[ 983.928542]  ffffffff9237d620 ffff8c8a5294aee0 ffff8c8a5294aeb8 ffff8c8a5294aee0
[ 983.936072] Call Trace:
[ 983.938545]  [] dump_stack+0x85/0xc4
[ 983.943715]  [] print_circular_bug+0x1fb/0x20c
[ 983.949748]  [] __lock_acquire+0x1003/0x17b0
[ 983.955613]  [] lock_acquire+0x1bc/0x1f0
[ 983.961123]  [] ? blkdev_put+0x31/0x150
[ 983.966550]  [] mutex_lock_nested+0x65/0x350
[ 983.972407]  [] ? blkdev_put+0x31/0x150
[ 983.977832]  [] blkdev_put+0x31/0x150
[ 983.983101]  [] btrfs_close_bdev+0x4f/0x60 [btrfs]
[ 983.989500]  [] __btrfs_close_devices+0xcb/0x200 [btrfs]
[ 983.996415]  [] btrfs_close_devices+0x2b/0xa0 [btrfs]
[ 984.003068]  [] close_ctree+0x265/0x340 [btrfs]
[ 984.009189]  [] ? evict_inodes+0x15e/0x170
[ 984.014881]  [] btrfs_put_super+0x19/0x20 [btrfs]
[ 984.021176]  [] generic_shutdown_super+0x6f/0x100
[ 984.027476]  [] kill_anon_super+0x16/0x30
[ 984.033082]  [] btrfs_kill_super+0x1e/0x130 [btrfs]
[ 984.039548]  [] deactivate_locked_super+0x49/0x80
[ 984.045839]  [] deactivate_super+0x5d/0x70
[ 984.051525]  [] cleanup_mnt+0x5c/0x80
[ 984.056774]  [] __cleanup_mnt+0x12/0x20
[ 984.062201]  [] task_work_run+0x7e/0xc0
[ 984.067625]  [] exit_to_usermode_loop+0x7e/0xb4
[ 984.073747]  [] syscall_return_slowpath+0xbb/0xd0
[ 984.080038]  [] entry_SYSCALL_64_fastpath+0xbf/0xc1

Reported-by: Ilya Dryomov
Signed-off-by: Anand Jain
---
v1->v2: Fix typo, remove static thanks Kdave.
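
Note for readers: the following is a minimal userspace sketch of the same
deferral pattern, not the kernel code. Entries are moved to a private pending
list while the list mutex is held, and the release step, which may take other
locks, runs only after the mutex has been dropped. All demo_* names, the
pthread mutex, and the lock_held/puts_under_lock fields are hypothetical
stand-ins chosen for illustration; only the ordering is meant to mirror what
the diff does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_device (not the kernel type). */
struct demo_device {
	int handle_open;             /* stands in for the open block device */
	struct demo_device *next;
};

/* Hypothetical stand-in for struct btrfs_fs_devices. */
struct demo_fs_devices {
	pthread_mutex_t list_mutex;  /* plays the role of device_list_mutex */
	struct demo_device *devices; /* singly linked list of devices */
	int lock_held;               /* demo-only: set while list_mutex is held */
	int puts_under_lock;         /* demo-only: releases done under the lock */
};

/* Stand-in for blkdev_put(): records whether it ran under the list mutex. */
void demo_put(struct demo_fs_devices *fs, struct demo_device *dev)
{
	if (fs->lock_held)
		fs->puts_under_lock++;
	dev->handle_open = 0;
}

/* Detach every device under the mutex, then release them outside of it. */
int demo_close_devices(struct demo_fs_devices *fs)
{
	struct demo_device *pending = NULL;
	struct demo_device *dev;
	int closed = 0;

	pthread_mutex_lock(&fs->list_mutex);
	fs->lock_held = 1;
	while ((dev = fs->devices) != NULL) {
		/* Only list manipulation happens while the mutex is held. */
		fs->devices = dev->next;
		dev->next = pending;
		pending = dev;
	}
	fs->lock_held = 0;
	pthread_mutex_unlock(&fs->list_mutex);

	/* The release step runs with the list mutex already dropped. */
	while ((dev = pending) != NULL) {
		pending = dev->next;
		dev->next = NULL;
		demo_put(fs, dev);
		closed++;
	}
	return closed;
}
```

With a two-entry list, demo_close_devices() returns 2 and puts_under_lock
stays at zero, which is the property the patch is after for blkdev_put() and
device_list_mutex.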
 fs/btrfs/volumes.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1ce584893d1b..66dc3eeca6d0 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -877,8 +877,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 	if (device->missing)
 		fs_devices->missing_devices--;

-	btrfs_close_bdev(device);
-
 	new_device = btrfs_alloc_device(NULL, &device->devid,
 					device->uuid);
 	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
@@ -892,13 +890,13 @@ static void btrfs_close_one_device(struct btrfs_device *device)

 	list_replace_rcu(&device->dev_list, &new_device->dev_list);
 	new_device->fs_devices = device->fs_devices;
-
-	call_rcu(&device->rcu, free_device);
 }

 static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 {
 	struct btrfs_device *device, *tmp;
+	LIST_HEAD(pending_put);
+	INIT_LIST_HEAD(&pending_put);

 	if (--fs_devices->opened > 0)
 		return 0;
@@ -906,9 +904,24 @@ static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list) {
 		btrfs_close_one_device(device);
+		list_add(&device->dev_list, &pending_put);
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);

+	/*
+	 * btrfs_show_devname() is using the device_list_mutex,
+	 * sometimes a call to blkdev_put() leads vfs calling
+	 * into this func. So do put outside of device_list_mutex,
+	 * as of now.
+	 */
+	while (!list_empty(&pending_put)) {
+		device = list_entry(pending_put.next,
+				    struct btrfs_device, dev_list);
+		list_del(&device->dev_list);
+		btrfs_close_bdev(device);
+		call_rcu(&device->rcu, free_device);
+	}
+
 	WARN_ON(fs_devices->open_devices);
 	WARN_ON(fs_devices->rw_devices);
 	fs_devices->opened = 0;