Message ID | 1477622906-17056-1-git-send-email-robbieko@synology.com (mailing list archive) |
---|---|
State | New, archived |
On Fri, Oct 28, 2016 at 3:48 AM, robbieko <robbieko@synology.com> wrote:
> From: Robbie Ko <robbieko@synology.com>
>
> We found a fsync deadlock in log_new_dir_dentries, because
> btrfs_search_forward get path lock, then call btrfs_iget will
> get another extent_buffer lock, maybe occur deadlock.

This still doesn't explain how the deadlock happens. For it to happen
it's necessary that before btrfs_iget() does a tree search, some other
task gets write locks on nodes and blocks waiting for the leaf locked by
btrfs_search_forward() to be unlocked, and that btrfs_iget() tries to
read lock those same nodes write locked by that other task.

It's important to have detailed information in the change logs.
I understand that English isn't your native tongue (it's not mine either,
and I'm far from mastering it), but that's not an excuse to not express
all the important information in detail (we can all live with grammar
errors and typos).

>
> Fix this by release path before call btrfs_iget, avoid deadlock occur.
>
> Example:
> Pid waiting: 32021->32020->32028->14431->14436->32021
>
> The following are their extent_buffer locked/waiting respectively:
> extent_buffer: start:207060992, len:16384
>     locker pid: 32020 read lock
>     wait pid: 32021 write lock
> extent_buffer: start:14730821632, len:16384
>     locker pid: 32028 read lock
>     wait pid: 32020 write lock
> extent_buffer: start:446503813120, len:16384
>     locker pid: 14431 write lock
>     wait pid: 32028 read lock
> extent_buffer: start:446503845888, len: 16384
>     locker pid: 14436 write lock
>     wait pid: 14431 write lock
> extent_buffer: start: 446504386560, len: 16384
>     locker pid: 32021 write lock
>     wait pid: 14436 write lock
>
> The following are their call trace respectively.
> [ 4077.478852] kworker/u24:10 D ffff88107fc90640 0 14431 2 0x00000000
> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 4077.494346] ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
> [ 4077.502629] ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4077.510915] ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
> [ 4077.519202] Call Trace:
> [ 4077.528752] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
> [ 4077.536049] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4077.542574] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4077.550171] [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
> [ 4077.558252] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
> [ 4077.566140] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
> [ 4077.573928] [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
> [ 4077.582399] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
> [ 4077.589896] [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
> [ 4077.599632] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4077.607134] [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
> [ 4077.615329] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
> [ 4077.622043] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
> [ 4077.628459] [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
> [ 4077.635759] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
> [ 4077.641404] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
> [ 4077.648696] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
> [ 4077.654926] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>
> [ 4078.358087] kworker/u24:15 D ffff88107fcd0640 0 14436 2 0x00000000
> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 4078.373574] ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
> [ 4078.381864] ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4078.390163] ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
> [ 4078.398466] Call Trace:
> [ 4078.408019] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
> [ 4078.415322] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4078.421844] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4078.429438] [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
> [ 4078.437518] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
> [ 4078.445404] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
> [ 4078.453194] [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
> [ 4078.461663] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
> [ 4078.469161] [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
> [ 4078.478893] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4078.486388] [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
> [ 4078.494561] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
> [ 4078.501278] [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
> [ 4078.508673] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
> [ 4078.515098] [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
> [ 4078.522396] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
> [ 4078.528032] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
> [ 4078.535325] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
> [ 4078.541552] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
>
> [ 4079.355824] user-space-program D ffff88107fd30640 0 32020 1 0x00000000
> [ 4079.363716] ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
> [ 4079.372003] ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4079.380294] ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
> [ 4079.388586] Call Trace:
> [ 4079.398134] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
> [ 4079.405431] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4079.411955] [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
> [ 4079.419644] [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
> [ 4079.427237] [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
> [ 4079.435041] [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
> [ 4079.443897] [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
> [ 4079.451975] [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
> [ 4079.458890] [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
> [ 4079.466292] [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
> [ 4079.474373] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
> [ 4079.482161] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
> [ 4079.489558] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
> [ 4079.495300] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
> [ 4079.501422] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> [ 4079.508334] user-space-program D ffff88107fc30640 0 32021 1 0x00000004
> [ 4079.516226] ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
> [ 4079.524513] ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
> [ 4079.532802] ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
> [ 4079.541092] Call Trace:
> [ 4079.550642] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
> [ 4079.557941] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4079.564463] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
> [ 4079.572058] [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
> [ 4079.580526] [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
> [ 4079.588701] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
> [ 4079.596196] [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
> [ 4079.603789] [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
> [ 4079.610994] [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
> [ 4079.618197] [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
> [ 4079.624625] [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
> [ 4079.630854] [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
> [ 4079.636889] [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
> [ 4079.644869] [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
> [ 4079.650805] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> [ 4080.410607] user-space-program D ffff88107fc70640 0 32028 12639 0x00000004
> [ 4080.418489] ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
> [ 4080.426778] ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
> [ 4080.435067] ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
> [ 4080.443353] Call Trace:
> [ 4080.452920] [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
> [ 4080.460703] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
> [ 4080.467225] [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
> [ 4080.475400] [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
> [ 4080.482994] [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
> [ 4080.491857] [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
> [ 4080.499353] [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
> [ 4080.505879] [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
> [ 4080.512696] [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
> [ 4080.520387] [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
> [ 4080.528469] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
> [ 4080.536258] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
> [ 4080.543657] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
> [ 4080.549399] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
> [ 4080.555534] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
>
> v2:
> * Improve the change log

Same as I mentioned before, versioning information does not belong in
the changelog, but rather after the "---" below. See
https://btrfs.wiki.kernel.org/index.php/Developer's_FAQ#Repeated_submissions
and examples from others that submit patches to this list.

I've picked the patch to my branch at
https://git.kernel.org/cgit/linux/kernel/git/fdmanana/linux.git/log/?h=for-chris-4.10
and reworded the changelog for clarity and completeness.
Thanks

>
> Signed-off-by: Robbie Ko <robbieko@synology.com>
> ---
>  fs/btrfs/tree-log.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index ef9c55b..9913084 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -5213,6 +5213,7 @@ process_leaf:
>  			if (di_key.type == BTRFS_ROOT_ITEM_KEY)
>  				continue;
>
> +			btrfs_release_path(path);
>  			di_inode = btrfs_iget(root->fs_info->sb, &di_key,
>  					      root, NULL);
>  			if (IS_ERR(di_inode)) {
> @@ -5222,13 +5223,12 @@ process_leaf:
>
>  			if (btrfs_inode_in_log(di_inode, trans->transid)) {
>  				iput(di_inode);
> -				continue;
> +				break;
>  			}
>
>  			ctx->log_new_dentries = false;
>  			if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK)
>  				log_mode = LOG_INODE_ALL;
> -			btrfs_release_path(path);
>  			ret = btrfs_log_inode(trans, root, di_inode,
>  					      log_mode, 0, LLONG_MAX, ctx);
>  			if (!ret &&
> --
> 1.9.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index ef9c55b..9913084 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5213,6 +5213,7 @@ process_leaf:
 			if (di_key.type == BTRFS_ROOT_ITEM_KEY)
 				continue;
 
+			btrfs_release_path(path);
 			di_inode = btrfs_iget(root->fs_info->sb, &di_key,
 					      root, NULL);
 			if (IS_ERR(di_inode)) {
@@ -5222,13 +5223,12 @@ process_leaf:
 
 			if (btrfs_inode_in_log(di_inode, trans->transid)) {
 				iput(di_inode);
-				continue;
+				break;
 			}
 
 			ctx->log_new_dentries = false;
 			if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK)
 				log_mode = LOG_INODE_ALL;
-			btrfs_release_path(path);
 			ret = btrfs_log_inode(trans, root, di_inode,
 					      log_mode, 0, LLONG_MAX, ctx);
 			if (!ret &&