Message ID | 2efe2794ecbfbfab1545a9341d3a1fb7464dc048.1728315195.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: qgroup: run delayed iputs after ordered extent completion | expand |
在 2024/10/8 02:04, fdmanana@kernel.org 写道: > From: Filipe Manana <fdmanana@suse.com> > > When trying to flush qgroups in order to release space we run delayed > iputs in order to release space from recently deleted files (their link > counted reached zero), and then we start delalloc and wait for any > existing ordered extents to complete. > > However there's a time window here where we end up not doing the final > iput on a deleted file which could release necessary space: > > 1) An unlink operation starts; > > 2) During the unlink, or right before it completes, delalloc is flushed > and an ordered extent is created; > > 3) When the ordered extent is created, the inode's ref count is > incremented (with igrab() at alloc_ordered_extent()); > > 4) When the unlink finishes it doesn't drop the last reference on the > inode and so it doesn't trigger inode eviction to delete all of > the inode's items in its root and drop all references on its data > extents; > > 5) Another task enters try_flush_qgroup() to try to release space, > it runs all delayed iputs, but there's no delayed iput yet for that > deleted file because the ordered extent hasn't completed yet; > > 6) Then at try_flush_qgroup() we wait for the ordered extent to complete > and that results in adding a delayed iput at btrfs_put_ordered_extent() > when called from btrfs_finish_one_ordered(); > > 7) Adding the delayed iput results in waking the cleaner kthread if it's > not running already. However it may take some time for it to be > scheduled, or it may be running but busy running auto defrag, dropping > deleted snapshots or doing other work, so by the time we return from > try_flush_qgroup() the space for deleted file isn't released. > > Improve on this by running delayed iputs only after flushing delalloc > and waiting for ordered extent completion. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/qgroup.c | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > index f68a26390589..bbc54dd154ec 100644 > --- a/fs/btrfs/qgroup.c > +++ b/fs/btrfs/qgroup.c > @@ -4195,13 +4195,20 @@ static int try_flush_qgroup(struct btrfs_root *root) > return 0; > } > > - btrfs_run_delayed_iputs(root->fs_info); > - btrfs_wait_on_delayed_iputs(root->fs_info); > ret = btrfs_start_delalloc_snapshot(root, true); > if (ret < 0) > goto out; > btrfs_wait_ordered_extents(root, U64_MAX, NULL); > > + /* > + * After waiting for ordered extents run delayed iputs in order to free > + * space from unlinked files before committing the current transaction, > + * as ordered extents may have been holding the last reference of an > + * inode and they add a delayed iput when they complete. > + */ > + btrfs_run_delayed_iputs(root->fs_info); > + btrfs_wait_on_delayed_iputs(root->fs_info); > + > ret = btrfs_commit_current_transaction(root); > out: > clear_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state);
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index f68a26390589..bbc54dd154ec 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -4195,13 +4195,20 @@ static int try_flush_qgroup(struct btrfs_root *root) return 0; } - btrfs_run_delayed_iputs(root->fs_info); - btrfs_wait_on_delayed_iputs(root->fs_info); ret = btrfs_start_delalloc_snapshot(root, true); if (ret < 0) goto out; btrfs_wait_ordered_extents(root, U64_MAX, NULL); + /* + * After waiting for ordered extents run delayed iputs in order to free + * space from unlinked files before committing the current transaction, + * as ordered extents may have been holding the last reference of an + * inode and they add a delayed iput when they complete. + */ + btrfs_run_delayed_iputs(root->fs_info); + btrfs_wait_on_delayed_iputs(root->fs_info); + ret = btrfs_commit_current_transaction(root); out: clear_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state);