Message ID | 1475834728-16198-1-git-send-email-robbieko@synology.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> wrote: > From: Robbie Ko <robbieko@synology.com> > > We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, > in log_new_dir_dentries, because btrfs_search_forward get path lock, then > call btrfs_iget will get another extent_buffer lock, maybe occur deadlock. What are those numbers? Are they inode numbers? If so you're suggesting a deadlock due to recursive logging of the same inode. However the trace below, and the code change, has nothing to do with that. It's just about btrfs_iget trying to do a search on a btree and attempting to read lock some node/leaf that already has a write lock acquired previously by the same task. Please be more clear on your change logs. > > we can release path before call btrfs_iget, avoid deadlock occur. > > some process call trace like below: > [ 4077.478852] kworker/u24:10 D ffff88107fc90640 0 14431 2 0x00000000 > [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] > [ 4077.494346] ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8 > [ 4077.502629] ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138 > [ 4077.510915] ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0 > [ 4077.519202] Call Trace: > [ 4077.528752] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs] > [ 4077.536049] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 > [ 4077.542574] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs] > [ 4077.550171] [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs] > [ 4077.558252] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs] > [ 4077.566140] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs] > [ 4077.573928] [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] > [ 4077.582399] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs] > [ 4077.589896] [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] > [ 4077.599632] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs] > [ 4077.607134] [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs] > [ 4077.615329] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 > [ 4077.622043] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 > [ 4077.628459] [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270 > [ 4077.635759] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 > [ 4077.641404] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 > [ 4077.648696] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 > [ 4077.654926] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 > > [ 4078.358087] kworker/u24:15 D ffff88107fcd0640 0 14436 2 0x00000000 > [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] > [ 4078.373574] ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8 > [ 4078.381864] ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138 > [ 4078.390163] ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0 > [ 4078.398466] Call Trace: > [ 4078.408019] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs] > [ 4078.415322] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 > [ 4078.421844] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs] > [ 4078.429438] [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs] > [ 4078.437518] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs] > [ 4078.445404] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs] > [ 4078.453194] [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] > [ 4078.461663] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs] > [ 4078.469161] [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] > [ 4078.478893] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs] > [ 4078.486388] [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs] > [ 4078.494561] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 > [ 4078.501278] [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40 > [ 4078.508673] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 > [ 4078.515098] [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270 > [ 4078.522396] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 > [ 4078.528032] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 > [ 4078.535325] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 > [ 4078.541552] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 > > [ 4079.355824] user-space-program D ffff88107fd30640 0 32020 1 0x00000000 > [ 4079.363716] ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8 > [ 4079.372003] ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138 > [ 4079.380294] ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0 > [ 4079.388586] Call Trace: > [ 4079.398134] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs] > [ 4079.405431] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 > [ 4079.411955] [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs] > [ 4079.419644] [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs] > [ 4079.427237] [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs] > [ 4079.435041] [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs] > [ 4079.443897] [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs] > [ 4079.451975] [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs] > [ 4079.458890] [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs] > [ 4079.466292] [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs] > [ 4079.474373] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs] > [ 4079.482161] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs] > [ 4079.489558] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 > [ 4079.495300] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 > [ 4079.501422] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b > > [ 4079.508334] user-space-program D ffff88107fc30640 0 32021 1 0x00000004 > [ 4079.516226] ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8 > [ 4079.524513] ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138 > [ 4079.532802] ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610 > [ 4079.541092] Call Trace: > [ 4079.550642] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs] > [ 4079.557941] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 > [ 4079.564463] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs] > [ 4079.572058] [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs] > [ 4079.580526] [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs] > [ 4079.588701] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs] > [ 4079.596196] [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs] > [ 4079.603789] [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs] > [ 4079.610994] [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs] > [ 4079.618197] [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680 > [ 4079.624625] [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160 > [ 4079.630854] [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90 > [ 4079.636889] [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160 > [ 4079.644869] [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570 > [ 4079.650805] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b > > [ 4080.410607] user-space-program D ffff88107fc70640 0 32028 12639 0x00000004 > [ 4080.418489] ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8 > [ 4080.426778] ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138 > [ 4080.435067] ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0 > [ 4080.443353] Call Trace: > [ 4080.452920] [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs] > [ 4080.460703] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 > [ 4080.467225] [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs] > [ 4080.475400] [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs] > [ 4080.482994] [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs] > [ 4080.491857] [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs] > [ 4080.499353] [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0 > [ 4080.505879] [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs] > [ 4080.512696] [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs] > [ 4080.520387] [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs] > [ 4080.528469] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs] > [ 4080.536258] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs] > [ 4080.543657] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 > [ 4080.549399] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 > [ 4080.555534] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b > > Signed-off-by: Robbie Ko <robbieko@synology.com> > --- > fs/btrfs/tree-log.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c > index ef9c55b..9913084 100644 > --- a/fs/btrfs/tree-log.c > +++ b/fs/btrfs/tree-log.c > @@ -5213,6 +5213,7 @@ process_leaf: > if (di_key.type == BTRFS_ROOT_ITEM_KEY) > continue; > > + btrfs_release_path(path); > di_inode = btrfs_iget(root->fs_info->sb, &di_key, > root, NULL); > if (IS_ERR(di_inode)) { > @@ -5222,13 +5223,12 @@ process_leaf: > > if (btrfs_inode_in_log(di_inode, trans->transid)) { > iput(di_inode); > - continue; > + break; Why the break instead of continue? This seems totally unrelated to the deadlock on a leaf/node and incorrect. thanks > } > > ctx->log_new_dentries = false; > if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK) > log_mode = LOG_INODE_ALL; > - btrfs_release_path(path); > ret = btrfs_log_inode(trans, root, di_inode, > log_mode, 0, LLONG_MAX, ctx); > if (!ret && > -- > 1.9.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Oct 7, 2016 at 11:43 AM, robbieko <robbieko@synology.com> wrote: > Hi Filipe, > > I am sorry, I express not clear enough. > This number is pid, and the above are their call trace respectively. And why did you replace the continue statement with a break statement? Also please avoid mixing inline replies with top posting, it just breaks the thread. thanks > > Thanks. > robbieko > > Filipe Manana <fdmanana@gmail.com> 於 2016-10-07 18:23 寫道: > > On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> wrote: >> From: Robbie Ko <robbieko@synology.com> >> >> We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, >> in log_new_dir_dentries, because btrfs_search_forward get path lock, then >> call btrfs_iget will get another extent_buffer lock, maybe occur deadlock. > > What are those numbers? Are they inode numbers? > If so you're suggesting a deadlock due to recursive logging of the same > inode. > However the trace below, and the code change, has nothing to do with that. > It's just about btrfs_iget trying to do a search on a btree and > attempting to read lock some node/leaf that already has a write lock > acquired previously by the same task. > > Please be more clear on your change logs. > > >> >> we can release path before call btrfs_iget, avoid deadlock occur. >> >> some process call trace like below: >> [ 4077.478852] kworker/u24:10 D ffff88107fc90640 0 14431 2 >> 0x00000000 >> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper >> [btrfs] >> [ 4077.494346] ffff880ffa56bad0 0000000000000046 0000000000009000 >> ffff880ffa56bfd8 >> [ 4077.502629] ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 >> ffff88101a5d6138 >> [ 4077.510915] ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 >> ffff881016ce21c0 >> [ 4077.519202] Call Trace: >> [ 4077.528752] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs] >> [ 4077.536049] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >> [ 4077.542574] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >> [btrfs] >> [ 4077.550171] [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 >> [btrfs] >> [ 4077.558252] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 >> [btrfs] >> [ 4077.566140] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 >> [btrfs] >> [ 4077.573928] [<ffffffffa06fd629>] ? >> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] >> [ 4077.582399] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 >> [btrfs] >> [ 4077.589896] [<ffffffffa06b4a64>] ? >> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] >> [ 4077.599632] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >> [btrfs] >> [ 4077.607134] [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 >> [btrfs] >> [ 4077.615329] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 >> [ 4077.622043] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 >> [ 4077.628459] [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270 >> [ 4077.635759] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 >> [ 4077.641404] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 >> [ 4077.648696] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 >> [ 4077.654926] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 >> >> [ 4078.358087] kworker/u24:15 D ffff88107fcd0640 0 14436 2 >> 0x00000000 >> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper >> [btrfs] >> [ 4078.373574] ffff880ffa57fad0 0000000000000046 0000000000009000 >> ffff880ffa57ffd8 >> [ 4078.381864] ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 >> ffff88101a5d6138 >> [ 4078.390163] ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 >> ffff88103004d0a0 >> [ 4078.398466] Call Trace: >> [ 4078.408019] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs] >> [ 4078.415322] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >> [ 4078.421844] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >> [btrfs] >> [ 4078.429438] [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 >> [btrfs] >> [ 4078.437518] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 >> [btrfs] >> [ 4078.445404] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 >> [btrfs] >> [ 4078.453194] [<ffffffffa06fd629>] ? >> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] >> [ 4078.461663] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 >> [btrfs] >> [ 4078.469161] [<ffffffffa06b4a64>] ? >> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] >> [ 4078.478893] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >> [btrfs] >> [ 4078.486388] [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 >> [btrfs] >> [ 4078.494561] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 >> [ 4078.501278] [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40 >> [ 4078.508673] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 >> [ 4078.515098] [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270 >> [ 4078.522396] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 >> [ 4078.528032] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 >> [ 4078.535325] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 >> [ 4078.541552] [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110 >> >> [ 4079.355824] user-space-program D ffff88107fd30640 0 32020 1 >> 0x00000000 >> [ 4079.363716] ffff880eae8eba10 0000000000000086 0000000000009000 >> ffff880eae8ebfd8 >> [ 4079.372003] ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 >> ffff88101a5d6138 >> [ 4079.380294] ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 >> ffff881016c162c0 >> [ 4079.388586] Call Trace: >> [ 4079.398134] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs] >> [ 4079.405431] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >> [ 4079.411955] [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 >> [btrfs] >> [ 4079.419644] [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 >> [btrfs] >> [ 4079.427237] [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 >> [btrfs] >> [ 4079.435041] [<ffffffffa0689b60>] ? >> generic_bin_search.constprop.38+0x80/0x190 [btrfs] >> [ 4079.443897] [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 >> [btrfs] >> [ 4079.451975] [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs] >> [ 4079.458890] [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs] >> [ 4079.466292] [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 >> [btrfs] >> [ 4079.474373] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 >> [btrfs] >> [ 4079.482161] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs] >> [ 4079.489558] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 >> [ 4079.495300] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 >> [ 4079.501422] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >> >> [ 4079.508334] user-space-program D ffff88107fc30640 0 32021 1 >> 0x00000004 >> [ 4079.516226] ffff880eae8efbf8 0000000000000086 0000000000009000 >> ffff880eae8effd8 >> [ 4079.524513] ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 >> ffff88101a5d6138 >> [ 4079.532802] ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 >> ffff881030279610 >> [ 4079.541092] Call Trace: >> [ 4079.550642] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs] >> [ 4079.557941] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >> [ 4079.564463] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >> [btrfs] >> [ 4079.572058] [<ffffffffa06bb7d8>] ? >> btrfs_truncate_inode_items+0x168/0xb90 [btrfs] >> [ 4079.580526] [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 >> [btrfs] >> [ 4079.588701] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >> [btrfs] >> [ 4079.596196] [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 >> [btrfs] >> [ 4079.603789] [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs] >> [ 4079.610994] [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs] >> [ 4079.618197] [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680 >> [ 4079.624625] [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160 >> [ 4079.630854] [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90 >> [ 4079.636889] [<ffffffff810f59fa>] ? >> do_sys_ftruncate.constprop.15+0x10a/0x160 >> [ 4079.644869] [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570 >> [ 4079.650805] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >> >> [ 4080.410607] user-space-program D ffff88107fc70640 0 32028 12639 >> 0x00000004 >> [ 4080.418489] ffff880eaeccbbe0 0000000000000086 0000000000009000 >> ffff880eaeccbfd8 >> [ 4080.426778] ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 >> ffff88101a5d6138 >> [ 4080.435067] ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 >> ffff880f317ef1e0 >> [ 4080.443353] Call Trace: >> [ 4080.452920] [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 >> [btrfs] >> [ 4080.460703] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >> [ 4080.467225] [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 >> [btrfs] >> [ 4080.475400] [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 >> [btrfs] >> [ 4080.482994] [<ffffffffa06b2df0>] ? >> btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs] >> [ 4080.491857] [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 >> [btrfs] >> [ 4080.499353] [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0 >> [ 4080.505879] [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs] >> [ 4080.512696] [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 >> [btrfs] >> [ 4080.520387] [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 >> [btrfs] >> [ 4080.528469] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 >> [btrfs] >> [ 4080.536258] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs] >> [ 4080.543657] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 >> [ 4080.549399] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 >> [ 4080.555534] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >> >> Signed-off-by: Robbie Ko <robbieko@synology.com> >> --- >> fs/btrfs/tree-log.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c >> index ef9c55b..9913084 100644 >> --- a/fs/btrfs/tree-log.c >> +++ b/fs/btrfs/tree-log.c >> @@ -5213,6 +5213,7 @@ process_leaf: >> if (di_key.type == BTRFS_ROOT_ITEM_KEY) >> continue; >> >> + btrfs_release_path(path); >> di_inode = btrfs_iget(root->fs_info->sb, &di_key, >> root, NULL); >> if (IS_ERR(di_inode)) { >> @@ -5222,13 +5223,12 @@ process_leaf: >> >> if (btrfs_inode_in_log(di_inode, trans->transid)) >> { >> iput(di_inode); >> - continue; >> + break; > > Why the break instead of continue? This seems totally unrelated to the > deadlock on a leaf/node and incorrect. > > thanks > >> } >> >> ctx->log_new_dentries = false; >> if (type == BTRFS_FT_DIR || type == >> BTRFS_FT_SYMLINK) >> log_mode = LOG_INODE_ALL; >> - btrfs_release_path(path); >> ret = btrfs_log_inode(trans, root, di_inode, >> log_mode, 0, LLONG_MAX, >> ctx); >> if (!ret && >> -- >> 1.9.1 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- > Filipe David Manana, > > "People will forget what you said, > people will forget what you did, > but people will never forget how you made them feel."
Hi Filipe: why did you replace the continue statement with a break statement: because we released ahead of the path, it can not continue to use, need to jump out, and then go to again. supplement: We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, the number id pid. extent_buffer: start:207060992, len:16384 locker pid: 32020 read lock wait pid: 32021 write lock extent_buffer: start:14730821632, len:16384 locker pid: 32028 read lock wait pid: 32020 write lock extent_buffer: start:446503813120, len:16384 locker pid: 14431 write lock wait pid: 32028 read lock extent_buffer: start:446503845888, len: 16384 locker pid: 14436 write lock wait pid: 14431 write lock extent_buffer: start: 446504386560, len: 16384 locker pid: 32021 write lock wait pid: 14436 write lock Thanks. Robbie Ko Filipe Manana 於 2016-10-07 18:46 寫到: > On Fri, Oct 7, 2016 at 11:43 AM, robbieko <robbieko@synology.com> > wrote: >> Hi Filipe, >> >> I am sorry, I express not clear enough. >> This number is pid, and the above are their call trace respectively. > > And why did you replace the continue statement with a break statement? > > Also please avoid mixing inline replies with top posting, it just > breaks the thread. > > thanks > >> >> Thanks. >> robbieko >> >> Filipe Manana <fdmanana@gmail.com> 於 2016-10-07 18:23 寫道: >> >> On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> >> wrote: >>> From: Robbie Ko <robbieko@synology.com> >>> >>> We found a fsync deadlock ie. >>> 32021->32020->32028->14431->14436->32021, >>> in log_new_dir_dentries, because btrfs_search_forward get path lock, >>> then >>> call btrfs_iget will get another extent_buffer lock, maybe occur >>> deadlock. >> >> What are those numbers? Are they inode numbers? >> If so you're suggesting a deadlock due to recursive logging of the >> same >> inode. >> However the trace below, and the code change, has nothing to do with >> that. >> It's just about btrfs_iget trying to do a search on a btree and >> attempting to read lock some node/leaf that already has a write lock >> acquired previously by the same task. >> >> Please be more clear on your change logs. >> >> >>> >>> we can release path before call btrfs_iget, avoid deadlock occur. >>> >>> some process call trace like below: >>> [ 4077.478852] kworker/u24:10 D ffff88107fc90640 0 14431 2 >>> 0x00000000 >>> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper >>> [btrfs] >>> [ 4077.494346] ffff880ffa56bad0 0000000000000046 0000000000009000 >>> ffff880ffa56bfd8 >>> [ 4077.502629] ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 >>> ffff88101a5d6138 >>> [ 4077.510915] ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 >>> ffff881016ce21c0 >>> [ 4077.519202] Call Trace: >>> [ 4077.528752] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 >>> [btrfs] >>> [ 4077.536049] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>> [ 4077.542574] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >>> [btrfs] >>> [ 4077.550171] [<ffffffffa06a5073>] ? >>> btrfs_lookup_file_extent+0x33/0x40 >>> [btrfs] >>> [ 4077.558252] [<ffffffffa06c600b>] ? >>> __btrfs_drop_extents+0x13b/0xdf0 >>> [btrfs] >>> [ 4077.566140] [<ffffffffa06fc9e2>] ? >>> add_delayed_data_ref+0xe2/0x150 >>> [btrfs] >>> [ 4077.573928] [<ffffffffa06fd629>] ? >>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] >>> [ 4077.582399] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 >>> [btrfs] >>> [ 4077.589896] [<ffffffffa06b4a64>] ? >>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] >>> [ 4077.599632] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >>> [btrfs] >>> [ 4077.607134] [<ffffffffa06bab57>] ? >>> btrfs_finish_ordered_io+0x2e7/0x600 >>> [btrfs] >>> [ 4077.615329] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 >>> [ 4077.622043] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 >>> [ 4077.628459] [<ffffffff8104d620>] ? >>> manage_workers.isra.26+0x270/0x270 >>> [ 4077.635759] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 >>> [ 4077.641404] [<ffffffff81052a60>] ? >>> kthread_create_on_node+0x110/0x110 >>> [ 4077.648696] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 >>> [ 4077.654926] [<ffffffff81052a60>] ? >>> kthread_create_on_node+0x110/0x110 >>> >>> [ 4078.358087] kworker/u24:15 D ffff88107fcd0640 0 14436 2 >>> 0x00000000 >>> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper >>> [btrfs] >>> [ 4078.373574] ffff880ffa57fad0 0000000000000046 0000000000009000 >>> ffff880ffa57ffd8 >>> [ 4078.381864] ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 >>> ffff88101a5d6138 >>> [ 4078.390163] ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 >>> ffff88103004d0a0 >>> [ 4078.398466] Call Trace: >>> [ 4078.408019] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 >>> [btrfs] >>> [ 4078.415322] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>> [ 4078.421844] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >>> [btrfs] >>> [ 4078.429438] [<ffffffffa06a5073>] ? >>> btrfs_lookup_file_extent+0x33/0x40 >>> [btrfs] >>> [ 4078.437518] [<ffffffffa06c600b>] ? >>> __btrfs_drop_extents+0x13b/0xdf0 >>> [btrfs] >>> [ 4078.445404] [<ffffffffa06fc9e2>] ? >>> add_delayed_data_ref+0xe2/0x150 >>> [btrfs] >>> [ 4078.453194] [<ffffffffa06fd629>] ? >>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] >>> [ 4078.461663] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 >>> [btrfs] >>> [ 4078.469161] [<ffffffffa06b4a64>] ? >>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] >>> [ 4078.478893] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >>> [btrfs] >>> [ 4078.486388] [<ffffffffa06bab57>] ? >>> btrfs_finish_ordered_io+0x2e7/0x600 >>> [btrfs] >>> [ 4078.494561] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 >>> [ 4078.501278] [<ffffffff8104a507>] ? >>> pwq_activate_delayed_work+0x27/0x40 >>> [ 4078.508673] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 >>> [ 4078.515098] [<ffffffff8104d620>] ? >>> manage_workers.isra.26+0x270/0x270 >>> [ 4078.522396] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 >>> [ 4078.528032] [<ffffffff81052a60>] ? >>> kthread_create_on_node+0x110/0x110 >>> [ 4078.535325] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 >>> [ 4078.541552] [<ffffffff81052a60>] ? >>> kthread_create_on_node+0x110/0x110 >>> >>> [ 4079.355824] user-space-program D ffff88107fd30640 0 32020 >>> 1 >>> 0x00000000 >>> [ 4079.363716] ffff880eae8eba10 0000000000000086 0000000000009000 >>> ffff880eae8ebfd8 >>> [ 4079.372003] ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 >>> ffff88101a5d6138 >>> [ 4079.380294] ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 >>> ffff881016c162c0 >>> [ 4079.388586] Call Trace: >>> [ 4079.398134] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 >>> [btrfs] >>> [ 4079.405431] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>> [ 4079.411955] [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 >>> [btrfs] >>> [ 4079.419644] [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 >>> [btrfs] >>> [ 4079.427237] [<ffffffffa06aba52>] ? >>> btrfs_buffer_uptodate+0x52/0x70 >>> [btrfs] >>> [ 4079.435041] [<ffffffffa0689b60>] ? >>> generic_bin_search.constprop.38+0x80/0x190 [btrfs] >>> [ 4079.443897] [<ffffffffa068ea44>] ? >>> btrfs_insert_empty_items+0x74/0xd0 >>> [btrfs] >>> [ 4079.451975] [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs] >>> [ 4079.458890] [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 >>> [btrfs] >>> [ 4079.466292] [<ffffffffa06f34a1>] ? >>> btrfs_log_inode_parent+0xc61/0xf30 >>> [btrfs] >>> [ 4079.474373] [<ffffffffa06f45a9>] ? >>> btrfs_log_dentry_safe+0x59/0x80 >>> [btrfs] >>> [ 4079.482161] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 >>> [btrfs] >>> [ 4079.489558] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 >>> [ 4079.495300] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 >>> [ 4079.501422] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >>> >>> [ 4079.508334] user-space-program D ffff88107fc30640 0 32021 >>> 1 >>> 0x00000004 >>> [ 4079.516226] ffff880eae8efbf8 0000000000000086 0000000000009000 >>> ffff880eae8effd8 >>> [ 4079.524513] ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 >>> ffff88101a5d6138 >>> [ 4079.532802] ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 >>> ffff881030279610 >>> [ 4079.541092] Call Trace: >>> [ 4079.550642] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 >>> [btrfs] >>> [ 4079.557941] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>> [ 4079.564463] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >>> [btrfs] >>> [ 4079.572058] [<ffffffffa06bb7d8>] ? >>> btrfs_truncate_inode_items+0x168/0xb90 [btrfs] >>> [ 4079.580526] [<ffffffffa06b04be>] ? >>> join_transaction.isra.15+0x1e/0x3a0 >>> [btrfs] >>> [ 4079.588701] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >>> [btrfs] >>> [ 4079.596196] [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 >>> [btrfs] >>> [ 4079.603789] [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 >>> [btrfs] >>> [ 4079.610994] [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 >>> [btrfs] >>> [ 4079.618197] [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680 >>> [ 4079.624625] [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160 >>> [ 4079.630854] [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90 >>> [ 4079.636889] [<ffffffff810f59fa>] ? >>> do_sys_ftruncate.constprop.15+0x10a/0x160 >>> [ 4079.644869] [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570 >>> [ 4079.650805] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >>> >>> [ 4080.410607] user-space-program D ffff88107fc70640 0 32028 >>> 12639 >>> 0x00000004 >>> [ 4080.418489] ffff880eaeccbbe0 0000000000000086 0000000000009000 >>> ffff880eaeccbfd8 >>> [ 4080.426778] ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 >>> ffff88101a5d6138 >>> [ 4080.435067] ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 >>> ffff880f317ef1e0 >>> [ 4080.443353] Call Trace: >>> [ 4080.452920] [<ffffffffa06ed15d>] ? >>> btrfs_tree_read_lock+0xdd/0x190 >>> [btrfs] >>> [ 4080.460703] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>> [ 4080.467225] [<ffffffffa06876bb>] ? >>> btrfs_read_lock_root_node+0x2b/0x40 >>> [btrfs] >>> [ 4080.475400] [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 >>> [btrfs] >>> [ 4080.482994] [<ffffffffa06b2df0>] ? >>> btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs] >>> [ 4080.491857] [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 >>> [btrfs] >>> [ 4080.499353] [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0 >>> [ 4080.505879] [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs] >>> [ 4080.512696] [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 >>> [btrfs] >>> [ 4080.520387] [<ffffffffa06f341f>] ? >>> btrfs_log_inode_parent+0xbdf/0xf30 >>> [btrfs] >>> [ 4080.528469] [<ffffffffa06f45a9>] ? >>> btrfs_log_dentry_safe+0x59/0x80 >>> [btrfs] >>> [ 4080.536258] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 >>> [btrfs] >>> [ 4080.543657] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 >>> [ 4080.549399] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 >>> [ 4080.555534] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >>> >>> Signed-off-by: Robbie Ko <robbieko@synology.com> >>> --- >>> fs/btrfs/tree-log.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c >>> index ef9c55b..9913084 100644 >>> --- a/fs/btrfs/tree-log.c >>> +++ b/fs/btrfs/tree-log.c >>> @@ -5213,6 +5213,7 @@ process_leaf: >>> if (di_key.type == BTRFS_ROOT_ITEM_KEY) >>> continue; >>> >>> + btrfs_release_path(path); >>> di_inode = btrfs_iget(root->fs_info->sb, >>> &di_key, >>> root, NULL); >>> if (IS_ERR(di_inode)) { >>> @@ -5222,13 +5223,12 @@ process_leaf: >>> >>> if (btrfs_inode_in_log(di_inode, >>> trans->transid)) >>> { >>> iput(di_inode); >>> - continue; >>> + break; >> >> Why the break instead of continue? This seems totally unrelated to the >> deadlock on a leaf/node and incorrect. >> >> thanks >> >>> } >>> >>> ctx->log_new_dentries = false; >>> if (type == BTRFS_FT_DIR || type == >>> BTRFS_FT_SYMLINK) >>> log_mode = LOG_INODE_ALL; >>> - btrfs_release_path(path); >>> ret = btrfs_log_inode(trans, root, di_inode, >>> log_mode, 0, LLONG_MAX, >>> ctx); >>> if (!ret && >>> -- >>> 1.9.1 >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe >>> linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> >> -- >> Filipe David Manana, >> >> "People will forget what you said, >> people will forget what you did, >> but people will never forget how you made them feel." -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 11, 2016 at 8:47 AM, robbieko <robbieko@synology.com> wrote: > Hi Filipe: > > why did you replace the continue statement with a break statement: > because we released ahead of the path, it can not continue to use, > need to jump out, and then go to again. Seeing the code and not only the patch's diff, I can see it now, thanks. Btw, please avoid top posting, it's not how you should reply to a thread, see [1]. [1] http://linux.sgms-centre.com/misc/netiquette.php > > supplement: > We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, the > number id pid. > extent_buffer: start:207060992, len:16384 > locker pid: 32020 read lock > wait pid: 32021 write lock > extent_buffer: start:14730821632, len:16384 > locker pid: 32028 read lock > wait pid: 32020 write lock > extent_buffer: start:446503813120, len:16384 > locker pid: 14431 write lock > wait pid: 32028 read lock > extent_buffer: start:446503845888, len: 16384 > locker pid: 14436 write lock > wait pid: 14431 write lock > extent_buffer: start: 446504386560, len: 16384 > locker pid: 32021 write lock > wait pid: 14436 write lock > > Thanks. > Robbie Ko > > Filipe Manana 於 2016-10-07 18:46 寫到: > >> On Fri, Oct 7, 2016 at 11:43 AM, robbieko <robbieko@synology.com> wrote: >>> >>> Hi Filipe, >>> >>> I am sorry, I express not clear enough. >>> This number is pid, and the above are their call trace respectively. >> >> >> And why did you replace the continue statement with a break statement? >> >> Also please avoid mixing inline replies with top posting, it just >> breaks the thread. >> >> thanks >> >>> >>> Thanks. >>> robbieko >>> >>> Filipe Manana <fdmanana@gmail.com> 於 2016-10-07 18:23 寫道: >>> >>> On Fri, Oct 7, 2016 at 11:05 AM, robbieko <robbieko@synology.com> wrote: >>>> >>>> From: Robbie Ko <robbieko@synology.com> >>>> >>>> We found a fsync deadlock ie. 32021->32020->32028->14431->14436->32021, >>>> in log_new_dir_dentries, because btrfs_search_forward get path lock, >>>> then >>>> call btrfs_iget will get another extent_buffer lock, maybe occur >>>> deadlock. >>> >>> >>> What are those numbers? Are they inode numbers? >>> If so you're suggesting a deadlock due to recursive logging of the same >>> inode. >>> However the trace below, and the code change, has nothing to do with >>> that. >>> It's just about btrfs_iget trying to do a search on a btree and >>> attempting to read lock some node/leaf that already has a write lock >>> acquired previously by the same task. >>> >>> Please be more clear on your change logs. >>> >>> >>>> >>>> we can release path before call btrfs_iget, avoid deadlock occur. >>>> >>>> some process call trace like below: >>>> [ 4077.478852] kworker/u24:10 D ffff88107fc90640 0 14431 2 >>>> 0x00000000 >>>> [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper >>>> [btrfs] >>>> [ 4077.494346] ffff880ffa56bad0 0000000000000046 0000000000009000 >>>> ffff880ffa56bfd8 >>>> [ 4077.502629] ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 >>>> ffff88101a5d6138 >>>> [ 4077.510915] ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 >>>> ffff881016ce21c0 >>>> [ 4077.519202] Call Trace: >>>> [ 4077.528752] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 >>>> [btrfs] >>>> [ 4077.536049] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>>> [ 4077.542574] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >>>> [btrfs] >>>> [ 4077.550171] [<ffffffffa06a5073>] ? >>>> btrfs_lookup_file_extent+0x33/0x40 >>>> [btrfs] >>>> [ 4077.558252] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 >>>> [btrfs] >>>> [ 4077.566140] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 >>>> [btrfs] >>>> [ 4077.573928] [<ffffffffa06fd629>] ? >>>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] >>>> [ 4077.582399] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 >>>> [btrfs] >>>> [ 4077.589896] [<ffffffffa06b4a64>] ? >>>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] >>>> [ 4077.599632] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >>>> [btrfs] >>>> [ 4077.607134] [<ffffffffa06bab57>] ? >>>> btrfs_finish_ordered_io+0x2e7/0x600 >>>> [btrfs] >>>> [ 4077.615329] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 >>>> [ 4077.622043] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 >>>> [ 4077.628459] [<ffffffff8104d620>] ? >>>> manage_workers.isra.26+0x270/0x270 >>>> [ 4077.635759] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 >>>> [ 4077.641404] [<ffffffff81052a60>] ? >>>> kthread_create_on_node+0x110/0x110 >>>> [ 4077.648696] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 >>>> [ 4077.654926] [<ffffffff81052a60>] ? >>>> kthread_create_on_node+0x110/0x110 >>>> >>>> [ 4078.358087] kworker/u24:15 D ffff88107fcd0640 0 14436 2 >>>> 0x00000000 >>>> [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper >>>> [btrfs] >>>> [ 4078.373574] ffff880ffa57fad0 0000000000000046 0000000000009000 >>>> ffff880ffa57ffd8 >>>> [ 4078.381864] ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 >>>> ffff88101a5d6138 >>>> [ 4078.390163] ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 >>>> ffff88103004d0a0 >>>> [ 4078.398466] Call Trace: >>>> [ 4078.408019] [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 >>>> [btrfs] >>>> [ 4078.415322] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>>> [ 4078.421844] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >>>> [btrfs] >>>> [ 4078.429438] [<ffffffffa06a5073>] ? >>>> btrfs_lookup_file_extent+0x33/0x40 >>>> [btrfs] >>>> [ 4078.437518] [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 >>>> [btrfs] >>>> [ 4078.445404] [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 >>>> [btrfs] >>>> [ 4078.453194] [<ffffffffa06fd629>] ? >>>> btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs] >>>> [ 4078.461663] [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 >>>> [btrfs] >>>> [ 4078.469161] [<ffffffffa06b4a64>] ? >>>> insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs] >>>> [ 4078.478893] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >>>> [btrfs] >>>> [ 4078.486388] [<ffffffffa06bab57>] ? >>>> btrfs_finish_ordered_io+0x2e7/0x600 >>>> [btrfs] >>>> [ 4078.494561] [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0 >>>> [ 4078.501278] [<ffffffff8104a507>] ? >>>> pwq_activate_delayed_work+0x27/0x40 >>>> [ 4078.508673] [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0 >>>> [ 4078.515098] [<ffffffff8104d620>] ? >>>> manage_workers.isra.26+0x270/0x270 >>>> [ 4078.522396] [<ffffffff81052b0f>] ? kthread+0xaf/0xc0 >>>> [ 4078.528032] [<ffffffff81052a60>] ? >>>> kthread_create_on_node+0x110/0x110 >>>> [ 4078.535325] [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90 >>>> [ 4078.541552] [<ffffffff81052a60>] ? >>>> kthread_create_on_node+0x110/0x110 >>>> >>>> [ 4079.355824] user-space-program D ffff88107fd30640 0 32020 1 >>>> 0x00000000 >>>> [ 4079.363716] ffff880eae8eba10 0000000000000086 0000000000009000 >>>> ffff880eae8ebfd8 >>>> [ 4079.372003] ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 >>>> ffff88101a5d6138 >>>> [ 4079.380294] ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 >>>> ffff881016c162c0 >>>> [ 4079.388586] Call Trace: >>>> [ 4079.398134] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 >>>> [btrfs] >>>> [ 4079.405431] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>>> [ 4079.411955] [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 >>>> [btrfs] >>>> [ 4079.419644] [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 >>>> [btrfs] >>>> [ 4079.427237] [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 >>>> [btrfs] >>>> [ 4079.435041] [<ffffffffa0689b60>] ? >>>> generic_bin_search.constprop.38+0x80/0x190 [btrfs] >>>> [ 4079.443897] [<ffffffffa068ea44>] ? >>>> btrfs_insert_empty_items+0x74/0xd0 >>>> [btrfs] >>>> [ 4079.451975] [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs] >>>> [ 4079.458890] [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 >>>> [btrfs] >>>> [ 4079.466292] [<ffffffffa06f34a1>] ? >>>> btrfs_log_inode_parent+0xc61/0xf30 >>>> [btrfs] >>>> [ 4079.474373] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 >>>> [btrfs] >>>> [ 4079.482161] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 >>>> [btrfs] >>>> [ 4079.489558] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 >>>> [ 4079.495300] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 >>>> [ 4079.501422] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >>>> >>>> [ 4079.508334] user-space-program D ffff88107fc30640 0 32021 1 >>>> 0x00000004 >>>> [ 4079.516226] ffff880eae8efbf8 0000000000000086 0000000000009000 >>>> ffff880eae8effd8 >>>> [ 4079.524513] ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 >>>> ffff88101a5d6138 >>>> [ 4079.532802] ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 >>>> ffff881030279610 >>>> [ 4079.541092] Call Trace: >>>> [ 4079.550642] [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 >>>> [btrfs] >>>> [ 4079.557941] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>>> [ 4079.564463] [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 >>>> [btrfs] >>>> [ 4079.572058] [<ffffffffa06bb7d8>] ? >>>> btrfs_truncate_inode_items+0x168/0xb90 [btrfs] >>>> [ 4079.580526] [<ffffffffa06b04be>] ? >>>> join_transaction.isra.15+0x1e/0x3a0 >>>> [btrfs] >>>> [ 4079.588701] [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 >>>> [btrfs] >>>> [ 4079.596196] [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 >>>> [btrfs] >>>> [ 4079.603789] [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs] >>>> [ 4079.610994] [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs] >>>> [ 4079.618197] [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680 >>>> [ 4079.624625] [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160 >>>> [ 4079.630854] [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90 >>>> [ 4079.636889] [<ffffffff810f59fa>] ? >>>> do_sys_ftruncate.constprop.15+0x10a/0x160 >>>> [ 4079.644869] [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570 >>>> [ 4079.650805] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >>>> >>>> [ 4080.410607] user-space-program D ffff88107fc70640 0 32028 12639 >>>> 0x00000004 >>>> [ 4080.418489] ffff880eaeccbbe0 0000000000000086 0000000000009000 >>>> ffff880eaeccbfd8 >>>> [ 4080.426778] ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 >>>> ffff88101a5d6138 >>>> [ 4080.435067] ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 >>>> ffff880f317ef1e0 >>>> [ 4080.443353] Call Trace: >>>> [ 4080.452920] [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 >>>> [btrfs] >>>> [ 4080.460703] [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30 >>>> [ 4080.467225] [<ffffffffa06876bb>] ? >>>> btrfs_read_lock_root_node+0x2b/0x40 >>>> [btrfs] >>>> [ 4080.475400] [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 >>>> [btrfs] >>>> [ 4080.482994] [<ffffffffa06b2df0>] ? >>>> btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs] >>>> [ 4080.491857] [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 >>>> [btrfs] >>>> [ 4080.499353] [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0 >>>> [ 4080.505879] [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs] >>>> [ 4080.512696] [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 >>>> [btrfs] >>>> [ 4080.520387] [<ffffffffa06f341f>] ? >>>> btrfs_log_inode_parent+0xbdf/0xf30 >>>> [btrfs] >>>> [ 4080.528469] [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 >>>> [btrfs] >>>> [ 4080.536258] [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 >>>> [btrfs] >>>> [ 4080.543657] [<ffffffff8112777c>] ? do_fsync+0x4c/0x80 >>>> [ 4080.549399] [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10 >>>> [ 4080.555534] [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b >>>> >>>> Signed-off-by: Robbie Ko <robbieko@synology.com> >>>> --- >>>> fs/btrfs/tree-log.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c >>>> index ef9c55b..9913084 100644 >>>> --- a/fs/btrfs/tree-log.c >>>> +++ b/fs/btrfs/tree-log.c >>>> @@ -5213,6 +5213,7 @@ process_leaf: >>>> if (di_key.type == BTRFS_ROOT_ITEM_KEY) >>>> continue; >>>> >>>> + btrfs_release_path(path); >>>> di_inode = btrfs_iget(root->fs_info->sb, >>>> &di_key, >>>> root, NULL); >>>> if (IS_ERR(di_inode)) { >>>> @@ -5222,13 +5223,12 @@ process_leaf: >>>> >>>> if (btrfs_inode_in_log(di_inode, >>>> trans->transid)) >>>> { >>>> iput(di_inode); >>>> - continue; >>>> + break; >>> >>> >>> Why the break instead of continue? This seems totally unrelated to the >>> deadlock on a leaf/node and incorrect. >>> >>> thanks >>> >>>> } >>>> >>>> ctx->log_new_dentries = false; >>>> if (type == BTRFS_FT_DIR || type == >>>> BTRFS_FT_SYMLINK) >>>> log_mode = LOG_INODE_ALL; >>>> - btrfs_release_path(path); >>>> ret = btrfs_log_inode(trans, root, di_inode, >>>> log_mode, 0, LLONG_MAX, >>>> ctx); >>>> if (!ret && >>>> -- >>>> 1.9.1 >>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" >>>> in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >>> >>> >>> -- >>> Filipe David Manana, >>> >>> "People will forget what you said, >>> people will forget what you did, >>> but people will never forget how you made them feel." > >
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index ef9c55b..9913084 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -5213,6 +5213,7 @@ process_leaf: if (di_key.type == BTRFS_ROOT_ITEM_KEY) continue; + btrfs_release_path(path); di_inode = btrfs_iget(root->fs_info->sb, &di_key, root, NULL); if (IS_ERR(di_inode)) { @@ -5222,13 +5223,12 @@ process_leaf: if (btrfs_inode_in_log(di_inode, trans->transid)) { iput(di_inode); - continue; + break; } ctx->log_new_dentries = false; if (type == BTRFS_FT_DIR || type == BTRFS_FT_SYMLINK) log_mode = LOG_INODE_ALL; - btrfs_release_path(path); ret = btrfs_log_inode(trans, root, di_inode, log_mode, 0, LLONG_MAX, ctx); if (!ret &&