Message ID | 3ea480fc-46ad-4072-198d-1e8f49121be0@jp.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] btrfs: qgroup: Remove qgroup items along with subvolume deletion | expand |
On Mon, Aug 06, 2018 at 01:53:28PM +0900, Misono Tomohiro wrote: >When qgroup is on, subvolume deletion does not remove qgroup items >of the subvolume (qgroup info, limit, relation) from quota tree and >they need to get removed manually by "btrfs qgroup destroy". > >Since level 0 qgroup cannot be used/inherited by any other subvolume, >let's remove them automatically when subvolume is deleted >(to be precise, when the subvolume root is dropped). > >Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> >Reviewed-by: Qu Wenruo <wqu@suse.com> >Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> >--- >v2 -> v3: > Use root->root_key.objectid instead of root->objectid > Add Reviewed-by tag > >v1 -> v2: > Move call of btrfs_remove_qgroup() from btrfs_delete_subvolume() > to btrfs_snapshot_destroy() so that it will be called after the > subvolume root is really dropped > > fs/btrfs/extent-tree.c | 16 ++++++++++++---- > 1 file changed, 12 insertions(+), 4 deletions(-) > >diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >index 9e7b237b9547..48edf839ed2c 100644 >--- a/fs/btrfs/extent-tree.c >+++ b/fs/btrfs/extent-tree.c >@@ -8871,12 +8871,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root, > struct btrfs_root_item *root_item = &root->root_item; > struct walk_control *wc; > struct btrfs_key key; >+ u64 objectid = root->root_key.objectid; > int err = 0; > int ret; > int level; > bool root_dropped = false; > >- btrfs_debug(fs_info, "Drop subvolume %llu", root->objectid); >+ btrfs_debug(fs_info, "Drop subvolume %llu", objectid); > > path = btrfs_alloc_path(); > if (!path) { >@@ -9030,7 +9031,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, > goto out_end_trans; > } > >- if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) { >+ if (objectid != BTRFS_TREE_RELOC_OBJECTID) { > ret = btrfs_find_root(tree_root, &root->root_key, path, > NULL, NULL); > if (ret < 0) { >@@ -9043,8 +9044,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, > * > * The most common failure here is just -ENOENT. > */ >- btrfs_del_orphan_item(trans, tree_root, >- root->root_key.objectid); >+ btrfs_del_orphan_item(trans, tree_root, objectid); > } > } > >@@ -9056,6 +9056,14 @@ int btrfs_drop_snapshot(struct btrfs_root *root, > btrfs_put_fs_root(root); > } > root_dropped = true; >+ >+ /* Remove level-0 qgroup items since no other subvolume can use them */ >+ ret = btrfs_remove_qgroup(trans, objectid); >+ if (ret && ret != -EINVAL && ret != -ENOENT) { I'm sorry for missing the snapshot case. If it is a snapshot, then when we remove the relevant qgroup, we will not be able to perform quick_update_accounting(), and it will return 1. So we shouldn't abort the transaction when the return value = 1. btrfs_remove_qgroup -> __del_qgroup_relation -> quick_update_accounting << if qgroup->excl != qgroup->rfer; return 1
On 2018/08/08 1:23, Lu Fengqi wrote: > On Mon, Aug 06, 2018 at 01:53:28PM +0900, Misono Tomohiro wrote: >> When qgroup is on, subvolume deletion does not remove qgroup items >> of the subvolume (qgroup info, limit, relation) from quota tree and >> they need to get removed manually by "btrfs qgroup destroy". >> >> Since level 0 qgroup cannot be used/inherited by any other subvolume, >> let's remove them automatically when subvolume is deleted >> (to be precise, when the subvolume root is dropped). >> >> Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> >> Reviewed-by: Qu Wenruo <wqu@suse.com> >> Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> >> --- >> v2 -> v3: >> Use root->root_key.objectid instead of root->objectid >> Add Reviewed-by tag >> >> v1 -> v2: >> Move call of btrfs_remove_qgroup() from btrfs_delete_subvolume() >> to btrfs_snapshot_destroy() so that it will be called after the >> subvolume root is really dropped >> >> fs/btrfs/extent-tree.c | 16 ++++++++++++---- >> 1 file changed, 12 insertions(+), 4 deletions(-) >> >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >> index 9e7b237b9547..48edf839ed2c 100644 >> --- a/fs/btrfs/extent-tree.c >> +++ b/fs/btrfs/extent-tree.c >> @@ -8871,12 +8871,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >> struct btrfs_root_item *root_item = &root->root_item; >> struct walk_control *wc; >> struct btrfs_key key; >> + u64 objectid = root->root_key.objectid; >> int err = 0; >> int ret; >> int level; >> bool root_dropped = false; >> >> - btrfs_debug(fs_info, "Drop subvolume %llu", root->objectid); >> + btrfs_debug(fs_info, "Drop subvolume %llu", objectid); >> >> path = btrfs_alloc_path(); >> if (!path) { >> @@ -9030,7 +9031,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >> goto out_end_trans; >> } >> >> - if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) { >> + if (objectid != BTRFS_TREE_RELOC_OBJECTID) { >> ret = btrfs_find_root(tree_root, &root->root_key, path, >> NULL, NULL); >> if (ret < 0) { >> @@ -9043,8 +9044,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >> * >> * The most common failure here is just -ENOENT. >> */ >> - btrfs_del_orphan_item(trans, tree_root, >> - root->root_key.objectid); >> + btrfs_del_orphan_item(trans, tree_root, objectid); >> } >> } >> >> @@ -9056,6 +9056,14 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >> btrfs_put_fs_root(root); >> } >> root_dropped = true; >> + >> + /* Remove level-0 qgroup items since no other subvolume can use them */ >> + ret = btrfs_remove_qgroup(trans, objectid); >> + if (ret && ret != -EINVAL && ret != -ENOENT) { > > I'm sorry for missing the snapshot case. If it is a snapshot, then when > we remove the relevant qgroup, we will not be able to perform > quick_update_accounting(), and it will return 1. So we shouldn't abort > the transaction when the return value = 1. > > btrfs_remove_qgroup > -> __del_qgroup_relation > -> quick_update_accounting << if qgroup->excl != qgroup->rfer; return 1 > oh, thanks for pointing out, I missed that case... I wonder if there is a way to keep qgroup consistency without enforcing rescan. btw, I noticed that removing qgroup relation when excl != rfer causes some problem without this patch: (4.18-rc7) $ mkfs.btrfs -fq $DEV $ mount $DEV /mnt $ btrfs quota enable /mnt $ btrfs qgroup create 1/0 /mnt $ btrfs sub create /mnt/sub $ btrfs qgroup assign 0/257 1/0 /mnt $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000 $ btrfs sub snap /mnt/sub /mnt/snap $ dd if=/dev/urandom of=/mnt/sub/file2 bs=1k count=1000 $ btrfs fi sync /mnt $ btrfs qgroup show -pcre /mnt qgroupid rfer excl max_rfer max_excl parent child -------- ---- ---- -------- -------- ------ ----- 0/5 16.00KiB 16.00KiB none none --- --- 0/257 1.97MiB 1016.00KiB none none 1/0 --- 0/258 1016.00KiB 16.00KiB none none --- --- 1/0 1.97MiB 1.97MiB none none --- 0/257 ^^^^^^^ should be 1016Kib $ btrfs qgroup remove 0/257 1/0 /mnt WARNING: quotas may be inconsistent, rescan needed $ btrfs qgroup show -pcre /mnt WARNING: qgroup data inconsistent, rescan recommended qgroupid rfer excl max_rfer max_excl parent child -------- ---- ---- -------- -------- ------ ----- 0/5 16.00KiB 16.00KiB none none --- --- 0/258 1016.00KiB 1016.00KiB none none --- --- 1/0 1.97MiB 1.97MiB none none --- --- $ btrfs quota rescan -w /mnt $ btrfs qgroup show -pcre /mnt qgroupid rfer excl max_rfer max_excl parent child -------- ---- ---- -------- -------- ------ ----- 0/5 16.00KiB 16.00KiB none none --- --- 0/258 1016.00KiB 1016.00KiB none none --- --- 1/0 1.97MiB 1.97MiB none none --- --- ^^^^^^ ^^^^^^^ not changed, although inconsistency flag is cleared I will look at the case to better understand qgroup. Thanks, Misono -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018年08月08日 10:56, Misono Tomohiro wrote: > On 2018/08/08 1:23, Lu Fengqi wrote: >> On Mon, Aug 06, 2018 at 01:53:28PM +0900, Misono Tomohiro wrote: >>> When qgroup is on, subvolume deletion does not remove qgroup items >>> of the subvolume (qgroup info, limit, relation) from quota tree and >>> they need to get removed manually by "btrfs qgroup destroy". >>> >>> Since level 0 qgroup cannot be used/inherited by any other subvolume, >>> let's remove them automatically when subvolume is deleted >>> (to be precise, when the subvolume root is dropped). >>> >>> Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> >>> Reviewed-by: Qu Wenruo <wqu@suse.com> >>> Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> >>> --- >>> v2 -> v3: >>> Use root->root_key.objectid instead of root->objectid >>> Add Reviewed-by tag >>> >>> v1 -> v2: >>> Move call of btrfs_remove_qgroup() from btrfs_delete_subvolume() >>> to btrfs_snapshot_destroy() so that it will be called after the >>> subvolume root is really dropped >>> >>> fs/btrfs/extent-tree.c | 16 ++++++++++++---- >>> 1 file changed, 12 insertions(+), 4 deletions(-) >>> >>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >>> index 9e7b237b9547..48edf839ed2c 100644 >>> --- a/fs/btrfs/extent-tree.c >>> +++ b/fs/btrfs/extent-tree.c >>> @@ -8871,12 +8871,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >>> struct btrfs_root_item *root_item = &root->root_item; >>> struct walk_control *wc; >>> struct btrfs_key key; >>> + u64 objectid = root->root_key.objectid; >>> int err = 0; >>> int ret; >>> int level; >>> bool root_dropped = false; >>> >>> - btrfs_debug(fs_info, "Drop subvolume %llu", root->objectid); >>> + btrfs_debug(fs_info, "Drop subvolume %llu", objectid); >>> >>> path = btrfs_alloc_path(); >>> if (!path) { >>> @@ -9030,7 +9031,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >>> goto out_end_trans; >>> } >>> >>> - if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) { >>> + if (objectid != BTRFS_TREE_RELOC_OBJECTID) { >>> ret = btrfs_find_root(tree_root, &root->root_key, path, >>> NULL, NULL); >>> if (ret < 0) { >>> @@ -9043,8 +9044,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >>> * >>> * The most common failure here is just -ENOENT. >>> */ >>> - btrfs_del_orphan_item(trans, tree_root, >>> - root->root_key.objectid); >>> + btrfs_del_orphan_item(trans, tree_root, objectid); >>> } >>> } >>> >>> @@ -9056,6 +9056,14 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >>> btrfs_put_fs_root(root); >>> } >>> root_dropped = true; >>> + >>> + /* Remove level-0 qgroup items since no other subvolume can use them */ >>> + ret = btrfs_remove_qgroup(trans, objectid); >>> + if (ret && ret != -EINVAL && ret != -ENOENT) { >> >> I'm sorry for missing the snapshot case. If it is a snapshot, then when >> we remove the relevant qgroup, we will not be able to perform >> quick_update_accounting(), and it will return 1. So we shouldn't abort >> the transaction when the return value = 1. >> >> btrfs_remove_qgroup >> -> __del_qgroup_relation >> -> quick_update_accounting << if qgroup->excl != qgroup->rfer; return 1 >> > > oh, thanks for pointing out, I missed that case... > I wonder if there is a way to keep qgroup consistency without enforcing rescan. In theory, it's possible by just dirtying all extents of related level 0 qgroups. In practical, not really possible. As above method will hang current transaction almost forever. > > btw, I noticed that removing qgroup relation when excl != rfer causes > some problem without this patch: > > (4.18-rc7) > $ mkfs.btrfs -fq $DEV > $ mount $DEV /mnt > > $ btrfs quota enable /mnt > $ btrfs qgroup create 1/0 /mnt > $ btrfs sub create /mnt/sub > $ btrfs qgroup assign 0/257 1/0 /mnt > > $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000 > $ btrfs sub snap /mnt/sub /mnt/snap In fact, at this point we should already mark btrfs quota inconsistent. When creating qgroup 0/258 we're already doing quick hack for its number, but since 0/257 is already belong to 1/0, the quick update itself is already making quota inconsistent. (Yes, snapshot creation under multi-level qgroups is the biggest challenge for btrfs quota) > $ dd if=/dev/urandom of=/mnt/sub/file2 bs=1k count=1000 > > $ btrfs fi sync /mnt > $ btrfs qgroup show -pcre /mnt At this point, qgroups is already inconsistent. I'll try to fix it. Thanks, Qu > qgroupid rfer excl max_rfer max_excl parent child > -------- ---- ---- -------- -------- ------ ----- > 0/5 16.00KiB 16.00KiB none none --- --- > 0/257 1.97MiB 1016.00KiB none none 1/0 --- > 0/258 1016.00KiB 16.00KiB none none --- --- > 1/0 1.97MiB 1.97MiB none none --- 0/257 > ^^^^^^^ should be 1016Kib > > $ btrfs qgroup remove 0/257 1/0 /mnt > WARNING: quotas may be inconsistent, rescan needed > $ btrfs qgroup show -pcre /mnt > WARNING: qgroup data inconsistent, rescan recommended > qgroupid rfer excl max_rfer max_excl parent child > -------- ---- ---- -------- -------- ------ ----- > 0/5 16.00KiB 16.00KiB none none --- --- > 0/258 1016.00KiB 1016.00KiB none none --- --- > 1/0 1.97MiB 1.97MiB none none --- --- > > $ btrfs quota rescan -w /mnt > $ btrfs qgroup show -pcre /mnt > qgroupid rfer excl max_rfer max_excl parent child > -------- ---- ---- -------- -------- ------ ----- > 0/5 16.00KiB 16.00KiB none none --- --- > 0/258 1016.00KiB 1016.00KiB none none --- --- > 1/0 1.97MiB 1.97MiB none none --- --- > ^^^^^^ ^^^^^^^ not changed, although inconsistency flag is cleared > > I will look at the case to better understand qgroup. > > Thanks, > Misono > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 9e7b237b9547..48edf839ed2c 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -8871,12 +8871,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root, struct btrfs_root_item *root_item = &root->root_item; struct walk_control *wc; struct btrfs_key key; + u64 objectid = root->root_key.objectid; int err = 0; int ret; int level; bool root_dropped = false; - btrfs_debug(fs_info, "Drop subvolume %llu", root->objectid); + btrfs_debug(fs_info, "Drop subvolume %llu", objectid); path = btrfs_alloc_path(); if (!path) { @@ -9030,7 +9031,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, goto out_end_trans; } - if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) { + if (objectid != BTRFS_TREE_RELOC_OBJECTID) { ret = btrfs_find_root(tree_root, &root->root_key, path, NULL, NULL); if (ret < 0) { @@ -9043,8 +9044,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root, * * The most common failure here is just -ENOENT. */ - btrfs_del_orphan_item(trans, tree_root, - root->root_key.objectid); + btrfs_del_orphan_item(trans, tree_root, objectid); } } @@ -9056,6 +9056,14 @@ int btrfs_drop_snapshot(struct btrfs_root *root, btrfs_put_fs_root(root); } root_dropped = true; + + /* Remove level-0 qgroup items since no other subvolume can use them */ + ret = btrfs_remove_qgroup(trans, objectid); + if (ret && ret != -EINVAL && ret != -ENOENT) { + btrfs_abort_transaction(trans, ret); + err = ret; + } + out_end_trans: btrfs_end_transaction_throttle(trans); out_free: