diff mbox series

[v2,2/8] btrfs: subpage: do not hold subpage spin lock when clearing folio writeback

Message ID 5d724635289a10a39164b767bdbee81e9e82e7d7.1740635497.git.wqu@suse.com (mailing list archive)
State New
Headers show
Series btrfs: make subpage handling be feature full | expand

Commit Message

Qu Wenruo Feb. 27, 2025, 5:54 a.m. UTC
[BUG]
When testing subpage block size btrfs (block size < page size), I hit
the following spin lock hang on x86_64, with the experimental 2K block
size support:

  <TASK>
  _raw_spin_lock_irq+0x2f/0x40
  wait_subpage_spinlock+0x69/0x80 [btrfs]
  btrfs_release_folio+0x46/0x70 [btrfs]
  folio_unmap_invalidate+0xcb/0x250
  folio_end_writeback+0x127/0x1b0
  btrfs_subpage_clear_writeback+0xef/0x140 [btrfs]
  end_bbio_data_write+0x13a/0x3c0 [btrfs]
  btrfs_bio_end_io+0x6f/0xc0 [btrfs]
  process_one_work+0x156/0x310
  worker_thread+0x252/0x390
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xef/0x250
  ? finish_task_switch.isra.0+0x8a/0x250
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x34/0x50
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>

[CAUSE]
It's a self deadlock with the following sequence:

 btrfs_subpage_clear_writeback()
 |- spin_lock_irqsave(&subpage->lock);
 |- folio_end_writeback()
    |- folio_end_dropbehind_write()
       |- folio_unmap_invalidate()
          |- btrfs_release_folio()
             |- wait_subpage_spinlock()
                |- spin_lock_irq(&subpage->lock);
                   !! DEADLOCK !!

We're trying to acquire the same spin lock already held by ourselves.

[FIX]
Move the folio_end_writeback() call out of the spin lock critical
section.

And since we no longer have all the bitmap operation and the writeback
flag clearing happening inside the critical section, we must do extra
checks to make sure only the last one clearing the writeback bitmap can
clear the folio writeback flag.

Fixes: 3470da3b7d87 ("btrfs: subpage: introduce helpers for writeback status")
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/subpage.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Filipe Manana Feb. 27, 2025, 12:42 p.m. UTC | #1
On Thu, Feb 27, 2025 at 5:56 AM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> When testing subpage block size btrfs (block size < page size), I hit
> the following spin lock hang on x86_64, with the experimental 2K block
> size support:
>
>   <TASK>
>   _raw_spin_lock_irq+0x2f/0x40
>   wait_subpage_spinlock+0x69/0x80 [btrfs]
>   btrfs_release_folio+0x46/0x70 [btrfs]
>   folio_unmap_invalidate+0xcb/0x250
>   folio_end_writeback+0x127/0x1b0
>   btrfs_subpage_clear_writeback+0xef/0x140 [btrfs]
>   end_bbio_data_write+0x13a/0x3c0 [btrfs]
>   btrfs_bio_end_io+0x6f/0xc0 [btrfs]
>   process_one_work+0x156/0x310
>   worker_thread+0x252/0x390
>   ? __pfx_worker_thread+0x10/0x10
>   kthread+0xef/0x250
>   ? finish_task_switch.isra.0+0x8a/0x250
>   ? __pfx_kthread+0x10/0x10
>   ret_from_fork+0x34/0x50
>   ? __pfx_kthread+0x10/0x10
>   ret_from_fork_asm+0x1a/0x30
>   </TASK>
>
> [CAUSE]
> It's a self deadlock with the following sequence:
>
>  btrfs_subpage_clear_writeback()
>  |- spin_lock_irqsave(&subpage->lock);
>  |- folio_end_writeback()
>     |- folio_end_dropbehind_write()
>        |- folio_unmap_invalidate()
>           |- btrfs_release_folio()
>              |- wait_subpage_spinlock()
>                 |- spin_lock_irq(&subpage->lock);
>                    !! DEADLOCK !!
>
> We're trying to acquire the same spin lock already held by ourselves.
>
> [FIX]
> Move the folio_end_writeback() call out of the spin lock critical
> section.
>
> And since we no longer have all the bitmap operation and the writeback
> flag clearing happening inside the critical section, we must do extra
> checks to make sure only the last one clearing the writeback bitmap can
> clear the folio writeback flag.
>
> Fixes: 3470da3b7d87 ("btrfs: subpage: introduce helpers for writeback status")
> Cc: stable@vger.kernel.org # 5.15+
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.

> ---
>  fs/btrfs/subpage.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
> index ebb40f506921..bedb5fac579b 100644
> --- a/fs/btrfs/subpage.c
> +++ b/fs/btrfs/subpage.c
> @@ -466,15 +466,21 @@ void btrfs_subpage_clear_writeback(const struct btrfs_fs_info *fs_info,
>         struct btrfs_subpage *subpage = folio_get_private(folio);
>         unsigned int start_bit = subpage_calc_start_bit(fs_info, folio,
>                                                         writeback, start, len);
> +       bool was_writeback;
> +       bool last = false;
>         unsigned long flags;
>
>         spin_lock_irqsave(&subpage->lock, flags);
> +       was_writeback = !subpage_test_bitmap_all_zero(fs_info, folio, writeback);
>         bitmap_clear(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
> -       if (subpage_test_bitmap_all_zero(fs_info, folio, writeback)) {
> +       if (subpage_test_bitmap_all_zero(fs_info, folio, writeback) &&
> +           was_writeback) {
>                 ASSERT(folio_test_writeback(folio));
> -               folio_end_writeback(folio);
> +               last = true;
>         }
>         spin_unlock_irqrestore(&subpage->lock, flags);
> +       if (last)
> +               folio_end_writeback(folio);
>  }
>
>  void btrfs_subpage_set_ordered(const struct btrfs_fs_info *fs_info,
> --
> 2.48.1
>
>
diff mbox series

Patch

diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
index ebb40f506921..bedb5fac579b 100644
--- a/fs/btrfs/subpage.c
+++ b/fs/btrfs/subpage.c
@@ -466,15 +466,21 @@  void btrfs_subpage_clear_writeback(const struct btrfs_fs_info *fs_info,
 	struct btrfs_subpage *subpage = folio_get_private(folio);
 	unsigned int start_bit = subpage_calc_start_bit(fs_info, folio,
 							writeback, start, len);
+	bool was_writeback;
+	bool last = false;
 	unsigned long flags;
 
 	spin_lock_irqsave(&subpage->lock, flags);
+	was_writeback = !subpage_test_bitmap_all_zero(fs_info, folio, writeback);
 	bitmap_clear(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
-	if (subpage_test_bitmap_all_zero(fs_info, folio, writeback)) {
+	if (subpage_test_bitmap_all_zero(fs_info, folio, writeback) &&
+	    was_writeback) {
 		ASSERT(folio_test_writeback(folio));
-		folio_end_writeback(folio);
+		last = true;
 	}
 	spin_unlock_irqrestore(&subpage->lock, flags);
+	if (last)
+		folio_end_writeback(folio);
 }
 
 void btrfs_subpage_set_ordered(const struct btrfs_fs_info *fs_info,