Message ID | baa48c5a32ae079b218613cbdae175f2387cd745.1739948529.git.naohiro.aota@wdc.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | btrfs: zoned: fix extent unlock in cow_file_range() | expand |
在 2025/2/19 17:32, Naohiro Aota 写道: > Running generic/751 on the btrfs for-next often results in hung like > below. They are both stack by locking an extent. This suggests someone > forget to unlock an extent. > > INFO: task kworker/u128:1:12 blocked for more than 323 seconds. > Not tainted 6.13.0-BTRFS-ZNS+ #503 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:kworker/u128:1 state:D stack:0 pid:12 tgid:12 ppid:2 flags:0x00004000 > Workqueue: btrfs-fixup btrfs_work_helper [btrfs] > Call Trace: > <TASK> > __schedule+0x534/0xdd0 > schedule+0x39/0x140 > __lock_extent+0x31b/0x380 [btrfs] > ? __pfx_autoremove_wake_function+0x10/0x10 > btrfs_writepage_fixup_worker+0xf1/0x3a0 [btrfs] > btrfs_work_helper+0xff/0x480 [btrfs] > ? lock_release+0x178/0x2c0 > process_one_work+0x1ee/0x570 > ? srso_return_thunk+0x5/0x5f > worker_thread+0x1d1/0x3b0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0x10b/0x230 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x30/0x50 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > INFO: task kworker/u134:0:184 blocked for more than 323 seconds. > Not tainted 6.13.0-BTRFS-ZNS+ #503 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:kworker/u134:0 state:D stack:0 pid:184 tgid:184 ppid:2 flags:0x00004000 > Workqueue: writeback wb_workfn (flush-btrfs-4) > Call Trace: > <TASK> > __schedule+0x534/0xdd0 > schedule+0x39/0x140 > __lock_extent+0x31b/0x380 [btrfs] > ? __pfx_autoremove_wake_function+0x10/0x10 > find_lock_delalloc_range+0xdb/0x260 [btrfs] > writepage_delalloc+0x12f/0x500 [btrfs] > ? srso_return_thunk+0x5/0x5f > extent_write_cache_pages+0x232/0x840 [btrfs] > btrfs_writepages+0x72/0x130 [btrfs] > do_writepages+0xe7/0x260 > ? srso_return_thunk+0x5/0x5f > ? lock_acquire+0xd2/0x300 > ? srso_return_thunk+0x5/0x5f > ? find_held_lock+0x2b/0x80 > ? wbc_attach_and_unlock_inode.part.0+0x102/0x250 > ? wbc_attach_and_unlock_inode.part.0+0x102/0x250 > __writeback_single_inode+0x5c/0x4b0 > writeback_sb_inodes+0x22d/0x550 > __writeback_inodes_wb+0x4c/0xe0 > wb_writeback+0x2f6/0x3f0 > wb_workfn+0x32a/0x510 > process_one_work+0x1ee/0x570 > ? srso_return_thunk+0x5/0x5f > worker_thread+0x1d1/0x3b0 > ? __pfx_worker_thread+0x10/0x10 > kthread+0x10b/0x230 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x30/0x50 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > > This happens because we have another success path for the zoned mode. When > there is no active zone available, btrfs_reserve_extent() returns > -EAGAIN. In this case, we have two reactions. (1) If the given range is > never allocated, we can only wait for someone to finish a zone, so wait on > BTRFS_FS_NEED_ZONE_FINISH bit and retry afterward. (2) Or, if some > allocations are already done, we must bail out and let the caller to send > IOs for the allocation. This is because these IOs may be necessary to > finish a zone. > > The commit 06f364284794 ("btrfs: do proper folio cleanup when > cow_file_range() failed") moved the unlock code from the inside of the loop > to the outside. So, previously, the allocated extents are unlocked just > after the allocation and so before returning from the function. However, > they are no longer unlocked on the case (2) above. That caused the hung > issue. > > Fix the issue by modifying the 'end' to the end of the allocated > range. Then, we can exit the loop and the same unlock code can properly > handle the case. > > Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> > Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > CC: stable@vger.kernel.org > Fixes: 06f364284794 ("btrfs: do proper folio cleanup when cow_file_range() failed") Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks for catching this, at the time of crafting that series, I had a bad feeling that there would be some bugs related to zoned support, and unfortunately this turns out to be true... Thanks, Qu > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> > --- > fs/btrfs/inode.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 1512eb94b6e5..f80db81fc853 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -1378,8 +1378,14 @@ static noinline int cow_file_range(struct btrfs_inode *inode, > continue; > } > if (done_offset) { > - *done_offset = start - 1; > - return 0; > + /* > + * Move @end to the end of the processed range, > + * and exit the loop to unlock the processed > + * extents. > + */ > + end = start - 1; > + ret = 0; > + break; > } > ret = -ENOSPC; > }
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 1512eb94b6e5..f80db81fc853 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1378,8 +1378,14 @@ static noinline int cow_file_range(struct btrfs_inode *inode, continue; } if (done_offset) { - *done_offset = start - 1; - return 0; + /* + * Move @end to the end of the processed range, + * and exit the loop to unlock the processed + * extents. + */ + end = start - 1; + ret = 0; + break; } ret = -ENOSPC; }