diff mbox series

[v5,06/18] btrfs: create qgroup earlier in snapshot creation

Message ID 4418d4544e16023fb0b7db6969b657b32cd25f93.1690495785.git.boris@bur.io (mailing list archive)
State New, archived
Headers show
Series btrfs: simple quotas | expand

Commit Message

Boris Burkov July 27, 2023, 10:12 p.m. UTC
Pull creating the qgroup earlier in the snapshot. This allows simple
quotas qgroups to see all the metadata writes related to the snapshot
being created and to be born with the root node accounted.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/qgroup.c      | 3 +++
 fs/btrfs/transaction.c | 6 ++++++
 2 files changed, 9 insertions(+)

Comments

Josef Bacik Aug. 21, 2023, 6:02 p.m. UTC | #1
On Thu, Jul 27, 2023 at 03:12:53PM -0700, Boris Burkov wrote:
> Pull creating the qgroup earlier in the snapshot. This allows simple
> quotas qgroups to see all the metadata writes related to the snapshot
> being created and to be born with the root node accounted.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
David Sterba Sept. 7, 2023, 11:41 a.m. UTC | #2
On Thu, Jul 27, 2023 at 03:12:53PM -0700, Boris Burkov wrote:
> Pull creating the qgroup earlier in the snapshot. This allows simple
> quotas qgroups to see all the metadata writes related to the snapshot
> being created and to be born with the root node accounted.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
>  fs/btrfs/qgroup.c      | 3 +++
>  fs/btrfs/transaction.c | 6 ++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 18f521716e8d..8e3a4ced3077 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1672,6 +1672,9 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
>  	struct btrfs_qgroup *qgroup;
>  	int ret = 0;
>  
> +	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_DISABLED)
> +		return 0;
> +
>  	mutex_lock(&fs_info->qgroup_ioctl_lock);
>  	if (!fs_info->quota_root) {
>  		ret = -ENOTCONN;
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 89ff15aa085f..25217888e897 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1722,6 +1722,12 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
>  	}
>  	btrfs_release_path(path);
>  
> +	ret = btrfs_create_qgroup(trans, objectid);
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);

This adds and error case to the middle of a transaction commit.
Snapshots are created in two parts, first is the ioctl adding the
structure and then commit actually creates that. So the first phase
preallocates what's needed (the root_item and path) and should do the
same with the qgroups as much as possible.

Also check all the things that btrfs_create_qgroup() does, searches the
qgroup tree, adds the new item, takes the qgroup_ioctl_lock mutex, and
adds the sysfs entry (that does allocations under GFP_KERNEL).
If you really need to create the qgroup like that then it needs much
more care.
Boris Burkov Sept. 8, 2023, 10:50 p.m. UTC | #3
On Thu, Sep 07, 2023 at 01:41:35PM +0200, David Sterba wrote:
> On Thu, Jul 27, 2023 at 03:12:53PM -0700, Boris Burkov wrote:
> > Pull creating the qgroup earlier in the snapshot. This allows simple
> > quotas qgroups to see all the metadata writes related to the snapshot
> > being created and to be born with the root node accounted.
> > 
> > Signed-off-by: Boris Burkov <boris@bur.io>
> > ---
> >  fs/btrfs/qgroup.c      | 3 +++
> >  fs/btrfs/transaction.c | 6 ++++++
> >  2 files changed, 9 insertions(+)
> > 
> > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > index 18f521716e8d..8e3a4ced3077 100644
> > --- a/fs/btrfs/qgroup.c
> > +++ b/fs/btrfs/qgroup.c
> > @@ -1672,6 +1672,9 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
> >  	struct btrfs_qgroup *qgroup;
> >  	int ret = 0;
> >  
> > +	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_DISABLED)
> > +		return 0;
> > +
> >  	mutex_lock(&fs_info->qgroup_ioctl_lock);
> >  	if (!fs_info->quota_root) {
> >  		ret = -ENOTCONN;
> > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> > index 89ff15aa085f..25217888e897 100644
> > --- a/fs/btrfs/transaction.c
> > +++ b/fs/btrfs/transaction.c
> > @@ -1722,6 +1722,12 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
> >  	}
> >  	btrfs_release_path(path);
> >  
> > +	ret = btrfs_create_qgroup(trans, objectid);
> > +	if (ret) {
> > +		btrfs_abort_transaction(trans, ret);
> 
> This adds and error case to the middle of a transaction commit.
> Snapshots are created in two parts, first is the ioctl adding the
> structure and then commit actually creates that. So the first phase
> preallocates what's needed (the root_item and path) and should do the
> same with the qgroups as much as possible.
> 
> Also check all the things that btrfs_create_qgroup() does, searches the
> qgroup tree, adds the new item, takes the qgroup_ioctl_lock mutex, and
> adds the sysfs entry (that does allocations under GFP_KERNEL).

I believe it does it with GFP_NOFS via allocating "prealloc". I might be
missing another allocation under the covers. That's covered below,
though.

> If you really need to create the qgroup like that then it needs much
> more care.

As I understand it, the way that the qgroup gets created currently is by
qgroup_account_snapshot which calls btrfs_qgroup_inherit in this same
function.

btrfs_create_qgroup consists of:
- lock qgroup_ioctl_lock
- do an rbtree lookup for the qgid
- do a NOFS "prealloc" allocation for the qgroup struct
- add the qgroup item
- add it to the rbtree
- add it to sysfs (using the above nofs prealloc)

With the exception of the qgroup_ioctl_lock, all those are in
btrfs_qgroup_inherit (and much more).

So that is all happening within create_pending_snapshot and thus the
commit critical section. It also does other work like backref walks,
and committing the roots.

Am I missing something important about the relative parts of
create_pending_snapshots where this work is happening? My intent was to
pull it up to before the run_delayed_refs in create_pending_snapshots
so that the new dir metadata item gets counted correctly. I think I may
have gotten delayed_refs and delayed_items confused and pulled it up
*too* far, and can probably stuff it earlier in that account function
or something.

Apologies if I am fundamentally misunderstanding something here.
diff mbox series

Patch

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 18f521716e8d..8e3a4ced3077 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1672,6 +1672,9 @@  int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid)
 	struct btrfs_qgroup *qgroup;
 	int ret = 0;
 
+	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_DISABLED)
+		return 0;
+
 	mutex_lock(&fs_info->qgroup_ioctl_lock);
 	if (!fs_info->quota_root) {
 		ret = -ENOTCONN;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 89ff15aa085f..25217888e897 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1722,6 +1722,12 @@  static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	}
 	btrfs_release_path(path);
 
+	ret = btrfs_create_qgroup(trans, objectid);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto fail;
+	}
+
 	/*
 	 * pull in the delayed directory update
 	 * and the delayed inode item