[0/3] btrfs: zoned: fix zoned extent allocator

Message ID 20211207153549.2946602-1-naohiro.aota@wdc.com

Message

Naohiro Aota Dec. 7, 2021, 3:35 p.m. UTC
There have been several reports of hung_task on btrfs recently.

- https://github.com/naota/linux/issues/59
- https://lore.kernel.org/linux-btrfs/CAJCQCtR=jztS3P34U_iUNoBodExHcud44OQ8oe4Jn3TK=1yFNw@mail.gmail.com/T/

The stack trace of the hang looks like this:

[  739.991925][   T25] INFO: task kworker/u4:0:7 blocked for more than 122 seconds.
[  739.994821][   T25]       Not tainted 5.16.0-rc3+ #56
[  739.996676][   T25] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  739.999600][   T25] task:kworker/u4:0    state:D stack:    0 pid:    7 ppid:     2 flags:0x00004000
[  740.002656][   T25] Workqueue: writeback wb_workfn (flush-btrfs-5)
[  740.004894][   T25] Call Trace:
[  740.006113][   T25]  <TASK>
[  740.007143][   T25]  __schedule+0x9e3/0x23b0
[  740.008657][   T25]  ? io_schedule_timeout+0x190/0x190
[  740.010529][   T25]  ? blk_start_plug_nr_ios+0x270/0x270
[  740.012385][   T25]  ? _raw_spin_unlock_irq+0x28/0x50
[  740.014163][   T25]  schedule+0xed/0x280
[  740.015567][   T25]  io_schedule+0xfb/0x180
[  740.017026][   T25]  folio_wait_bit_common+0x386/0x840
[  740.018839][   T25]  ? delete_from_page_cache+0x220/0x220
[  740.020744][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.022645][   T25]  ? filemap_range_has_page+0x210/0x210
[  740.024588][   T25]  __folio_lock+0x17/0x20
[  740.026197][   T25]  extent_write_cache_pages+0x78c/0xba0 [btrfs]
[  740.028776][   T25]  ? __extent_writepage+0x980/0x980 [btrfs]
[  740.031026][   T25]  ? __kasan_check_read+0x11/0x20
[  740.032916][   T25]  ? __lock_acquire+0x1772/0x5a10
[  740.034611][   T25]  extent_writepages+0x1e8/0x3b0 [btrfs]
[  740.036636][   T25]  ? extent_write_locked_range+0x580/0x580 [btrfs]
[  740.038828][   T25]  ? lockdep_hardirqs_on_prepare+0x410/0x410
[  740.040929][   T25]  btrfs_writepages+0xe/0x10 [btrfs]
[  740.042879][   T25]  do_writepages+0x187/0x610
[  740.044239][   T25]  ? page_writeback_cpu_online+0x20/0x20
[  740.045810][   T25]  ? sched_clock+0x9/0x10
[  740.047015][   T25]  ? sched_clock_cpu+0x18/0x1b0
[  740.048341][   T25]  ? find_held_lock+0x3c/0x130
[  740.049649][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.051155][   T25]  ? lock_release+0x3fd/0xed0
[  740.052358][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.053775][   T25]  ? lock_is_held_type+0xe4/0x140
[  740.055059][   T25]  __writeback_single_inode+0xd7/0xa90
[  740.056399][   T25]  writeback_sb_inodes+0x4e8/0xff0
[  740.057693][   T25]  ? sync_inode_metadata+0xe0/0xe0
[  740.058918][   T25]  ? down_read_trylock+0x45/0x50
[  740.060003][   T25]  ? trylock_super+0x1b/0xc0
[  740.061050][   T25]  __writeback_inodes_wb+0xba/0x210
[  740.062123][   T25]  wb_writeback+0x5b3/0x8b0
[  740.063075][   T25]  ? __writeback_inodes_wb+0x210/0x210
[  740.064158][   T25]  ? __local_bh_enable_ip+0xaa/0x120
[  740.065196][   T25]  ? __local_bh_enable_ip+0xaa/0x120
[  740.066180][   T25]  ? trace_hardirqs_on+0x2b/0x130
[  740.067117][   T25]  ? wb_workfn+0x2cc/0xe80
[  740.067943][   T25]  ? get_nr_dirty_inodes+0x130/0x1c0
[  740.068909][   T25]  wb_workfn+0x6f5/0xe80
[  740.069674][   T25]  ? inode_wait_for_writeback+0x40/0x40
[  740.070841][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.071834][   T25]  ? lock_acquire+0x1c1/0x4f0
[  740.072630][   T25]  ? lock_release+0xed0/0xed0
[  740.073483][   T25]  ? lock_downgrade+0x7c0/0x7c0
[  740.074374][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.075342][   T25]  ? lock_is_held_type+0xe4/0x140
[  740.076236][   T25]  process_one_work+0x826/0x14e0
[  740.077045][   T25]  ? lock_is_held_type+0xe4/0x140
[  740.077921][   T25]  ? pwq_dec_nr_in_flight+0x250/0x250
[  740.078842][   T25]  ? lockdep_hardirqs_off+0x99/0xe0
[  740.079699][   T25]  worker_thread+0x59b/0x1050
[  740.080478][   T25]  ? process_one_work+0x14e0/0x14e0
[  740.081306][   T25]  kthread+0x38f/0x460
[  740.081955][   T25]  ? set_kthread_struct+0x110/0x110
[  740.082769][   T25]  ret_from_fork+0x22/0x30
[  740.083476][   T25]  </TASK>
[  740.083972][   T25] INFO: task aio-dio-write-v:1459 blocked for more than 122 seconds.
[  740.085202][   T25]       Not tainted 5.16.0-rc3+ #56
[  740.085970][   T25] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  740.087238][   T25] task:aio-dio-write-v state:D stack:    0 pid: 1459 ppid:   542 flags:0x00004000
[  740.088676][   T25] Call Trace:
[  740.089269][   T25]  <TASK>
[  740.089780][   T25]  __schedule+0x9e3/0x23b0
[  740.090644][   T25]  ? io_schedule_timeout+0x190/0x190
[  740.091614][   T25]  ? blk_start_plug_nr_ios+0x270/0x270
[  740.092649][   T25]  ? _raw_spin_unlock_irq+0x28/0x50
[  740.093531][   T25]  schedule+0xed/0x280
[  740.094323][   T25]  io_schedule+0xfb/0x180
[  740.095095][   T25]  folio_wait_bit_common+0x386/0x840
[  740.095868][   T25]  ? delete_from_page_cache+0x220/0x220
[  740.096730][   T25]  ? lock_is_held_type+0xe4/0x140
[  740.097432][   T25]  ? filemap_range_has_page+0x210/0x210
[  740.098252][   T25]  __filemap_get_folio+0x4d3/0x8f0
[  740.099016][   T25]  ? filemap_range_needs_writeback+0xb0/0xb0
[  740.099953][   T25]  ? lock_contended+0xdf0/0xdf0
[  740.100663][   T25]  pagecache_get_page+0x19/0xc0
[  740.101400][   T25]  prepare_pages+0x205/0x4c0 [btrfs]
[  740.102274][   T25]  btrfs_buffered_write+0x5e0/0x1060 [btrfs]
[  740.103402][   T25]  ? btrfs_dirty_pages+0x2c0/0x2c0 [btrfs]
[  740.104480][   T25]  ? __up_read+0x1a9/0x7b0
[  740.105260][   T25]  ? up_write+0x480/0x480
[  740.106024][   T25]  ? btrfs_file_llseek+0x600/0x600 [btrfs]
[  740.107157][   T25]  btrfs_file_write_iter+0x84e/0xfa0 [btrfs]
[  740.108239][   T25]  ? lock_downgrade+0x7c0/0x7c0
[  740.109091][   T25]  aio_write+0x314/0x820
[  740.109888][   T25]  ? cpumask_weight.constprop.0+0x40/0x40
[  740.110846][   T25]  ? kvm_sched_clock_read+0x18/0x40
[  740.111677][   T25]  ? sched_clock_cpu+0x18/0x1b0
[  740.112765][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.114113][   T25]  ? lock_release+0x3fd/0xed0
[  740.115264][   T25]  ? lock_downgrade+0x7c0/0x7c0
[  740.116505][   T25]  io_submit_one.constprop.0+0xba3/0x1a50
[  740.117894][   T25]  ? io_submit_one.constprop.0+0xba3/0x1a50
[  740.119180][   T25]  ? kvm_sched_clock_read+0x18/0x40
[  740.120036][   T25]  ? sched_clock+0x9/0x10
[  740.120764][   T25]  ? sched_clock_cpu+0x18/0x1b0
[  740.121619][   T25]  ? __x64_sys_io_getevents_time32+0x2a0/0x2a0
[  740.122628][   T25]  ? lock_release+0x3fd/0xed0
[  740.123522][   T25]  __x64_sys_io_submit+0x15d/0x2b0
[  740.124417][   T25]  ? __x64_sys_io_submit+0x15d/0x2b0
[  740.125363][   T25]  ? io_submit_one.constprop.0+0x1a50/0x1a50
[  740.126322][   T25]  ? __this_cpu_preempt_check+0x13/0x20
[  740.127243][   T25]  ? lock_is_held_type+0xe4/0x140
[  740.128071][   T25]  ? lockdep_hardirqs_on+0x7e/0x100
[  740.128915][   T25]  ? syscall_enter_from_user_mode+0x25/0x80
[  740.129872][   T25]  ? trace_hardirqs_on+0x2b/0x130
[  740.130694][   T25]  do_syscall_64+0x3b/0x90
[  740.131380][   T25]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  740.132351][   T25] RIP: 0033:0x7f5c03578679
[  740.133117][   T25] RSP: 002b:00007fffa4f030e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
[  740.134468][   T25] RAX: ffffffffffffffda RBX: 00007f5c034746c0 RCX: 00007f5c03578679
[  740.135749][   T25] RDX: 0000561f67b701d0 RSI: 0000000000000037 RDI: 00007f5c03676000
[  740.137151][   T25] RBP: 00007f5c03676000 R08: 0000000000000000 R09: 0000000000000000
[  740.138505][   T25] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000037
[  740.139831][   T25] R13: 0000000000000000 R14: 0000561f67b701d0 R15: 0000561f67b701d0
[  740.141263][   T25]  </TASK>
[  740.141793][   T25]
[  740.141793][   T25] Showing all locks held in the system:
[  740.143126][   T25] 3 locks held by kworker/u4:0/7:
[  740.143982][   T25]  #0: ffff888100d4d948 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x740/0x14e0
[  740.145599][   T25]  #1: ffffc9000007fda8 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x770/0x14e0
[  740.147521][   T25]  #2: ffff8881771760e8 (&type->s_umount_key#30){++++}-{3:3}, at: trylock_super+0x1b/0xc0
[  740.149084][   T25] 1 lock held by khungtaskd/25:
[  740.149916][   T25]  #0: ffffffff833cee00 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x5f/0x27f
[  740.151491][   T25] 1 lock held by aio-dio-write-v/1459:
[  740.152458][   T25]  #0: ffff888172999b10 (&sb->s_type->i_mutex_key#13){++++}-{3:3}, at: btrfs_inode_lock+0x3f/0x70 [btrfs]
[  740.154487][   T25]
[  740.154920][   T25] =============================================
[  740.154920][   T25]
[  740.156347][   T25] Kernel panic - not syncing: hung_task: blocked tasks
[  740.157548][   T25] CPU: 0 PID: 25 Comm: khungtaskd Not tainted 5.16.0-rc3+ #56
[  740.158826][   T25] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
[  740.160265][   T25] Call Trace:
[  740.160788][   T25]  <TASK>
[  740.161240][   T25]  dump_stack_lvl+0x49/0x5e
[  740.162045][   T25]  dump_stack+0x10/0x12
[  740.162642][   T25]  panic+0x1e2/0x478
[  740.163330][   T25]  ? __warn_printk+0xf3/0xf3
[  740.164041][   T25]  watchdog.cold+0x118/0x137
[  740.164741][   T25]  ? reset_hung_task_detector+0x30/0x30
[  740.165537][   T25]  kthread+0x38f/0x460
[  740.166202][   T25]  ? set_kthread_struct+0x110/0x110
[  740.167022][   T25]  ret_from_fork+0x22/0x30
[  740.167782][   T25]  </TASK>
[  740.168520][   T25] Kernel Offset: disabled
[  740.169285][   T25] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---

While debugging this issue, we found some faulty behaviors in the
zoned extent allocator. They are not the root cause of the hang, as
we see a similar report on regular (non-zoned) btrfs as well. But it
looks like the early -ENOSPC is, at least, making the hang happen
more often.

So, this series fixes the faulty behaviors of the zoned extent
allocator.

Patch 1 fixes the case where allocation fails in a dedicated block group.
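
As a rough illustration of the patch 1 idea, here is a hypothetical,
simplified user-space model (the names are illustrative, not the
actual kernel code): once an allocation from the dedicated block group
fails, the allocator forgets that group and retries instead of failing
early with -ENOSPC.

#include <stddef.h>

/* Hypothetical model of the patch 1 idea; not the kernel API. */
struct block_group {
	unsigned long long free_bytes;
};

struct alloc_ctx {
	struct block_group *dedicated_bg; /* e.g. the tree-log block group */
};

static int alloc_from_bg(struct block_group *bg, unsigned long long len)
{
	if (bg->free_bytes < len)
		return -1; /* stand-in for -ENOSPC */
	bg->free_bytes -= len;
	return 0;
}

/* Returns 0 on success, 1 when the caller should retry the loop. */
static int do_allocation(struct alloc_ctx *ctx, unsigned long long len)
{
	struct block_group *bg = ctx->dedicated_bg;

	if (bg) {
		if (alloc_from_bg(bg, len) == 0)
			return 0; /* allocated from the dedicated group */
		/*
		 * The fix: unset the failing dedicated block group so the
		 * retry can pick (or create) another one instead of
		 * returning an early -ENOSPC.
		 */
		ctx->dedicated_bg = NULL;
	}
	/* Scan the remaining block groups on the next pass. */
	return 1;
}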

Patches 2 and 3 fix the chunk allocation condition for the zoned
allocator so that it does not block a possible chunk allocation.
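
The idea behind patches 2 and 3 can be sketched as below (again a
hypothetical, simplified model; the hook and field names are
illustrative, while the real changes touch fs/btrfs/extent-tree.c and
fs/btrfs/zoned.c as the diffstat shows): the extent allocator asks a
per-policy hook whether allocating a new chunk makes sense, and the
zoned variant answers based on whether a zone can still be activated.

#include <stdbool.h>

/* Hypothetical sketch of the allocator hook from patches 2 and 3. */
enum extent_allocation_policy {
	ALLOC_POLICY_CLUSTERED, /* regular btrfs */
	ALLOC_POLICY_ZONED,
};

struct fs_model {
	enum extent_allocation_policy policy;
	int activatable_zones; /* zones that can still be activated */
};

static bool can_allocate_chunk(const struct fs_model *fs)
{
	switch (fs->policy) {
	case ALLOC_POLICY_CLUSTERED:
		/* Regular btrfs: allocating a new chunk is always fine. */
		return true;
	case ALLOC_POLICY_ZONED:
		/*
		 * Zoned btrfs: only allocate a new chunk when a zone can
		 * be activated for it, so the allocator keeps looping
		 * instead of failing with an early -ENOSPC.
		 */
		return fs->activatable_zones > 0;
	}
	return false;
}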

Naohiro Aota (3):
  btrfs: zoned: unset dedicated block group on allocation failure
  btrfs: add extent allocator hook to decide to allocate chunk or not
  btrfs: zoned: fix chunk allocation condition for zoned allocator

 fs/btrfs/extent-tree.c | 58 ++++++++++++++++++++++++++++++------------
 fs/btrfs/zoned.c       |  5 ++--
 fs/btrfs/zoned.h       |  5 ++--
 3 files changed, 46 insertions(+), 22 deletions(-)

Comments

David Sterba Dec. 8, 2021, 4:18 p.m. UTC | #1
On Wed, Dec 08, 2021 at 12:35:46AM +0900, Naohiro Aota wrote:
> There have been several reports of hung_task on btrfs recently.
> 
> - https://github.com/naota/linux/issues/59
> - https://lore.kernel.org/linux-btrfs/CAJCQCtR=jztS3P34U_iUNoBodExHcud44OQ8oe4Jn3TK=1yFNw@mail.gmail.com/T/
> 
> The stack trace of the hang looks like this:

(snip)

> While debugging this issue, we found some faulty behaviors in the
> zoned extent allocator. They are not the root cause of the hang, as
> we see a similar report on regular (non-zoned) btrfs as well. But it
> looks like the early -ENOSPC is, at least, making the hang happen
> more often.
> 
> So, this series fixes the faulty behaviors of the zoned extent
> allocator.
> 
> Patch 1 fixes the case where allocation fails in a dedicated block
> group.
> 
> Patches 2 and 3 fix the chunk allocation condition for the zoned
> allocator so that it does not block a possible chunk allocation.
> 
> Naohiro Aota (3):
>   btrfs: zoned: unset dedicated block group on allocation failure
>   btrfs: add extent allocator hook to decide to allocate chunk or not
>   btrfs: zoned: fix chunk allocation condition for zoned allocator

All seem to be relevant for 5.16-rc so I'll add them to misc-next now to
give it some testing, pull request next week. Thanks.
Shinichiro Kawasaki Dec. 29, 2021, 12:22 a.m. UTC | #2
On Dec 08, 2021 / 17:18, David Sterba wrote:
> On Wed, Dec 08, 2021 at 12:35:46AM +0900, Naohiro Aota wrote:
> > There have been several reports of hung_task on btrfs recently.
> > 
> > - https://github.com/naota/linux/issues/59
> > - https://lore.kernel.org/linux-btrfs/CAJCQCtR=jztS3P34U_iUNoBodExHcud44OQ8oe4Jn3TK=1yFNw@mail.gmail.com/T/

(snip)

> > Naohiro Aota (3):
> >   btrfs: zoned: unset dedicated block group on allocation failure
> >   btrfs: add extent allocator hook to decide to allocate chunk or not
> >   btrfs: zoned: fix chunk allocation condition for zoned allocator
> 
> All seem to be relevant for 5.16-rc so I'll add them to misc-next now to
> give it some testing, pull request next week. Thanks.

Hello David, thank you as always for your maintainership.

When I run my test set on a zoned btrfs configuration, I keep observing
the issue that Naohiro addressed with these three patches. The patches
are not yet merged into 5.16-rc7. Can I expect them to be merged for rc8?
David Sterba Jan. 3, 2022, 7:13 p.m. UTC | #3
On Wed, Dec 29, 2021 at 12:22:31AM +0000, Shinichiro Kawasaki wrote:
> On Dec 08, 2021 / 17:18, David Sterba wrote:
> > On Wed, Dec 08, 2021 at 12:35:46AM +0900, Naohiro Aota wrote:
> > > There have been several reports of hung_task on btrfs recently.
> > > 
> > > - https://github.com/naota/linux/issues/59
> > > - https://lore.kernel.org/linux-btrfs/CAJCQCtR=jztS3P34U_iUNoBodExHcud44OQ8oe4Jn3TK=1yFNw@mail.gmail.com/T/
> 
> (snip)
> 
> > > Naohiro Aota (3):
> > >   btrfs: zoned: unset dedicated block group on allocation failure
> > >   btrfs: add extent allocator hook to decide to allocate chunk or not
> > >   btrfs: zoned: fix chunk allocation condition for zoned allocator
> > 
> > All seem to be relevant for 5.16-rc so I'll add them to misc-next now to
> > give it some testing, pull request next week. Thanks.
> 
> Hello David, thank you as always for your maintainership.
> 
> When I run my test set on a zoned btrfs configuration, I keep observing
> the issue that Naohiro addressed with these three patches. The patches
> are not yet merged into 5.16-rc7. Can I expect them to be merged for rc8?

Sorry, I did not get to sending the pull request due to the holidays.
The timing of the 5.16 release next week is too close, so I'll tag the
patches for stable and they'll get into 5.16 later.
Shinichiro Kawasaki Jan. 5, 2022, 2 a.m. UTC | #4
On Jan 03, 2022 / 20:13, David Sterba wrote:
> On Wed, Dec 29, 2021 at 12:22:31AM +0000, Shinichiro Kawasaki wrote:
> > On Dec 08, 2021 / 17:18, David Sterba wrote:
> > > On Wed, Dec 08, 2021 at 12:35:46AM +0900, Naohiro Aota wrote:
> > > > There have been several reports of hung_task on btrfs recently.
> > > > 
> > > > - https://github.com/naota/linux/issues/59
> > > > - https://lore.kernel.org/linux-btrfs/CAJCQCtR=jztS3P34U_iUNoBodExHcud44OQ8oe4Jn3TK=1yFNw@mail.gmail.com/T/
> > 
> > (snip)
> > 
> > > > Naohiro Aota (3):
> > > >   btrfs: zoned: unset dedicated block group on allocation failure
> > > >   btrfs: add extent allocator hook to decide to allocate chunk or not
> > > >   btrfs: zoned: fix chunk allocation condition for zoned allocator
> > > 
> > > All seem to be relevant for 5.16-rc so I'll add them to misc-next now to
> > > give it some testing, pull request next week. Thanks.
> > 
> > Hello David, thank you as always for your maintainership.
> > 
> > When I run my test set on a zoned btrfs configuration, I keep observing
> > the issue that Naohiro addressed with these three patches. The patches
> > are not yet merged into 5.16-rc7. Can I expect them to be merged for rc8?
> 
> Sorry, I did not get to sending the pull request due to the holidays.
> The timing of the 5.16 release next week is too close, so I'll tag the
> patches for stable and they'll get into 5.16 later.

Sounds good. Thank you for taking care of this.