From patchwork Wed Jul 5 23:36:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303047 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF002EB64DA for ; Wed, 5 Jul 2023 23:37:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231264AbjGEXhi (ORCPT ); Wed, 5 Jul 2023 19:37:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230305AbjGEXhh (ORCPT ); Wed, 5 Jul 2023 19:37:37 -0400 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3AF801989 for ; Wed, 5 Jul 2023 16:37:35 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id A82B15C01DF; Wed, 5 Jul 2023 19:37:34 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Wed, 05 Jul 2023 19:37:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bur.io; h=cc :content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm3; t=1688600254; x= 1688686654; bh=buIpC7iBRxM0wR5wWlfVZ/WeCsVSrQ6Zn0VgfCvVNg4=; b=W t++Fh4SYF57IYBhj+LaCL6OrfRRwhKM2yNzZjjoJIdEnnnLIBcnFQOc+Ftds6XzB wsZYogwq7roTjdtfFoYNyfz+0Amf91DjwPTZ/Qzk0QCk3ZClzBsCAFU36WE0y3kU G54ddOPm11XMGSRlNMvGNBxNQojaqCqB0ykfT/RF8gaa9dsCRQjP3pA03gWDnjGi nqVP/MO4RpQe0yLqs9mQq76roqvXxFJraTX1Hs3+CABkyx8WtUSAVboJxdmQLpCZ NFeDERz91HlmVgFhBeTtWKAM+zm+BtkaJ1It51tP5JdsAThZRBOzz9nRgYmScPBj ToaPLT1/6zpEARiXZudjw== DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:date:feedback-id:feedback-id:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm2; t=1688600254; x=1688686654; bh=b uIpC7iBRxM0wR5wWlfVZ/WeCsVSrQ6Zn0VgfCvVNg4=; b=Hu+eI0BhTPyJWLvb0 eIonbuqtOVjNuIhT7/5HK5KAbIRw335KGLp/BPVFmHARA3ndiWizQdI+a0dXNZQJ HZPGHWJx5NwGiWVQpGIN/eCMyFfesnP23u/6kfQXrJDNPKFx8QMXTaZWMwDyQtkl aEzmciBSJYU6mI1UzqLqqxjiOs4NGTVp/FEBAEHjNgixUs50bvQxAS3jN9INcYN+ DpH9TpLzKiMpV3lmgUJdqsaYSwtlSghNKYls9fMybjoVgyjk8K8/Gk7RNBFMmeNr VgFfUsmujWUFcyXYUPNi7t5keoVFCZ56u2roqCtGS5l6RHHtt23AxMi0Ynj8eBeZ IOLCA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrudekgddvgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeeuohhrihhsuceuuhhrkhhovhcuoegsohhrihhssegsuhhrrdhi oheqnecuggftrfgrthhtvghrnhepieeuffeuvdeiueejhfehiefgkeevudejjeejffevvd ehtddufeeihfekgeeuheelnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehm rghilhhfrhhomhepsghorhhishessghurhdrihho X-ME-Proxy: Feedback-ID: i083147f8:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 5 Jul 2023 19:37:34 -0400 (EDT) From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 1/8] btrfs-progs: document squotas Date: Wed, 5 Jul 2023 16:36:20 -0700 Message-ID: <8f44931fc285453f18956cc6601568816d7dcf69.1688599734.git.boris@bur.io> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Document the new options in btrfs quota and mkfs.btrfs. Also, add a section to the long form qgroups document about squotas. 
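As a reading aid for the squota section this patch adds to ch-quota-intro.rst: the accounting rule (an extent is charged to the subvolume that first allocated it, and stays charged until its last reference is dropped) can be sketched as a toy C model. Everything named toy_* below is hypothetical and exists only for illustration; it is not btrfs-progs or kernel code, and it ignores the 16KiB root nodes that appear in the documented example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_MIB (1024ULL * 1024ULL)
#define TOY_MAX_SUBVOLS 2
#define TOY_MAX_EXTENTS 8

/* Hypothetical toy model; none of these names exist in btrfs-progs. */
struct toy_extent {
	int owner;      /* subvolume that first allocated the extent */
	uint64_t bytes; /* size charged to the owner */
	unsigned refs;  /* subvolumes still referencing the extent */
};

struct toy_fs {
	uint64_t usage[TOY_MAX_SUBVOLS]; /* squota usage per subvolume */
	struct toy_extent extents[TOY_MAX_EXTENTS];
	int nr_extents;
};

/* Allocation charges the allocating subvolume; returns the extent index. */
static int toy_alloc(struct toy_fs *fs, int owner, uint64_t bytes)
{
	struct toy_extent *e;

	assert(fs->nr_extents < TOY_MAX_EXTENTS);
	e = &fs->extents[fs->nr_extents];
	e->owner = owner;
	e->bytes = bytes;
	e->refs = 1;
	fs->usage[owner] += bytes;
	return fs->nr_extents++;
}

/* A snapshot adds a reference but never changes the accounting. */
static void toy_add_ref(struct toy_fs *fs, int idx)
{
	fs->extents[idx].refs++;
}

/* Only dropping the last reference frees the extent and uncharges the owner. */
static void toy_drop_ref(struct toy_fs *fs, int idx)
{
	struct toy_extent *e = &fs->extents[idx];

	assert(e->refs > 0);
	if (--e->refs == 0)
		fs->usage[e->owner] -= e->bytes;
}

static void toy_print(const char *step, const struct toy_fs *fs)
{
	printf("%-28s 0/256: %4llu MiB  0/257: %4llu MiB\n", step,
	       (unsigned long long)(fs->usage[0] / TOY_MIB),
	       (unsigned long long)(fs->usage[1] / TOY_MIB));
}

/* Replays the documented steps; index 0 is subvolume 256, index 1 is 257. */
static void toy_run_scenario(struct toy_fs *fs)
{
	int a = toy_alloc(fs, 0, 512 * TOY_MIB); /* data later kept by 257 */
	int b = toy_alloc(fs, 0, 512 * TOY_MIB); /* data later CoWed in 257 */
	toy_print("rack up 1GiB in 256", fs);

	toy_add_ref(fs, a); /* snapshot 256 -> 257 shares both extents */
	toy_add_ref(fs, b);
	toy_print("snapshot 256 as 257", fs);

	toy_alloc(fs, 1, 512 * TOY_MIB); /* CoW: new extent charged to 257 */
	toy_drop_ref(fs, b);             /* 257 no longer references b */
	toy_print("CoW 512MiB in 257", fs);

	toy_drop_ref(fs, a); /* still referenced by 257: stays charged to 256 */
	toy_drop_ref(fs, b); /* last reference gone: 256 is uncharged */
	toy_print("delete everything in 256", fs);
}
```

Calling toy_run_scenario() on a zero-initialized struct toy_fs ends with 512MiB charged to each subvolume, which is the final squota row of the documented example. No step needs to know how many trees share an extent, only whether its last reference went away.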
Signed-off-by: Boris Burkov --- Documentation/btrfs-quota.rst | 7 +++- Documentation/ch-quota-intro.rst | 59 ++++++++++++++++++++++++++++++++ Documentation/mkfs.btrfs.rst | 6 ++++ 3 files changed, 71 insertions(+), 1 deletion(-) diff --git a/Documentation/btrfs-quota.rst b/Documentation/btrfs-quota.rst index 830e9059a..d5e08330e 100644 --- a/Documentation/btrfs-quota.rst +++ b/Documentation/btrfs-quota.rst @@ -47,9 +47,14 @@ SUBCOMMAND disable Disable subvolume quota support for a filesystem. -enable +enable [options] Enable subvolume quota support for a filesystem. + ``Options`` + + -s|--simple + use simple quotas (squotas) instead of qgroups + rescan [options] Trash all qgroup numbers and scan the metadata again with the current config. diff --git a/Documentation/ch-quota-intro.rst b/Documentation/ch-quota-intro.rst index 351772d10..a69e35c8a 100644 --- a/Documentation/ch-quota-intro.rst +++ b/Documentation/ch-quota-intro.rst @@ -194,3 +194,62 @@ but some snapshots for backup purposes are being created by the system. The user's snapshots should be accounted to the user, not the system. The solution is similar to the one from section 'Accounting snapshots to the user', but do not assign system snapshots to user's qgroup. + +Simple Quotas (squotas) +^^^^^^^^^^^^^^^^^^^^^^^ + +As detailed in this document, qgroups can handle many complex extent sharing +and unsharing scenarios while maintaining an accurate count of exclusive and +shared usage. However, this flexibility comes at a cost: many of the +computations are global, in the sense that we must count up the number of trees +referring to an extent after its references change. This can slow down +transaction commits and lead to unacceptable latencies, especially in cases +where snapshots scale up. + +To work around this limitation of qgroups, btrfs also supports a second set of +quota semantics: simple quotas or squotas. Squotas fully share the qgroups API +and hierarchical model, but do not track shared vs. 
exclusive usage. Instead, +they account each extent to the subvolume that first allocated it. With a bit +of new bookkeeping, this allows all accounting decisions to be local to the +allocation or freeing operation that deals with the extents themselves, and +fully avoids the complex and costly back-reference resolutions. + +``Example`` + +To illustrate the difference between squotas and qgroups, consider the following +basic example assuming a nodesize of 16KiB. + +1. create subvolume 256 +2. rack up 1GiB of data and metadata usage in 256 +3. snapshot 256, creating subvolume 257 +4. CoW 512MiB of the data and metadata in 257 +5. delete everything in 256 + +At each step, qgroups would have the following accounting: +1. 0/256: 16KiB excl 0 shared +2. 0/256: 1GiB excl 0 shared +3. 0/256: 0 excl 1GiB shared; 0/257: 0 excl 1GiB shared +4. 0/256: 512MiB excl 512MiB shared; 0/257: 512MiB excl 512MiB shared +5. 0/256: 16KiB excl 0 shared; 0/257: 1GiB excl 0 shared + +Whereas under squotas, the accounting would look like: +1. 0/256: 16KiB excl 16KiB shared +2. 0/256: 1GiB excl 1GiB shared +3. 0/256: 1GiB excl 1GiB shared; 0/257: 16KiB excl 16KiB shared +4. 0/256: 1GiB excl 1GiB shared; 0/257: 512MiB excl 512MiB shared +5. 0/256: 512MiB excl 512MiB shared; 0/257: 512MiB excl 512MiB shared + +Note that since the original snapshotted 512MiB are still referenced by 257, +they cannot be freed from 256, even after 256 is emptied, or even deleted. + +``Summary`` + +If you want some of the power and flexibility of quotas for tracking and limiting +subvolume usage, but want to avoid the performance penalty of accurately +tracking extent ownership lifecycles, then squotas can be a useful option. + +Furthermore, squotas are targeted at use cases where the original extent is +immutable, like image snapshotting for container startup, in which case we avoid +these awkward scenarios where a subvolume is empty or deleted but still has +significant extents accounted to it. 
However, as long as you are aware of the +accounting semantics, they can handle mutable original extents. diff --git a/Documentation/mkfs.btrfs.rst b/Documentation/mkfs.btrfs.rst index d1626f736..051e8fb1c 100644 --- a/Documentation/mkfs.btrfs.rst +++ b/Documentation/mkfs.btrfs.rst @@ -307,6 +307,12 @@ block-group-tree Enable the block group tree to greatly reduce mount time for large filesystems. +squota + (kernel support since X.X) + + Enable simple quota support (squotas). This is an alternative to qgroups with + a smaller performance impact but no notion of shared vs. exclusive usage. + .. _mkfs-section-profiles: BLOCK GROUPS, CHUNKS, RAID
From patchwork Wed Jul 5 23:36:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303048 From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 2/8] btrfs-progs: simple quotas kernel definitions Date: Wed, 5 Jul 2023 16:36:21 -0700 Message-ID: X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Copy over structs, accessors, and constants for simple quotas Signed-off-by: Boris Burkov Reviewed-by: Josef Bacik --- kernel-shared/accessors.h | 9 +++++++++ kernel-shared/ctree.h | 6 ++++-- kernel-shared/uapi/btrfs.h | 1 + kernel-shared/uapi/btrfs_tree.h | 12 ++++++++++++ 4 files changed, 26 insertions(+), 2 deletions(-) diff --git a/kernel-shared/accessors.h b/kernel-shared/accessors.h index 539c20d09..ab8c2d337 100644 --- a/kernel-shared/accessors.h +++ b/kernel-shared/accessors.h @@ -379,9 +379,13 @@ static inline u32 btrfs_extent_inline_ref_size(int type) if (type == BTRFS_EXTENT_DATA_REF_KEY) return sizeof(struct btrfs_extent_data_ref) + offsetof(struct btrfs_extent_inline_ref, offset); + if (type == BTRFS_EXTENT_OWNER_REF_KEY) + return sizeof(struct btrfs_extent_inline_ref); return 0; } +BTRFS_SETGET_FUNCS(extent_owner_ref_root_id, struct btrfs_extent_owner_ref, root_id, 64); + /* struct btrfs_node */ BTRFS_SETGET_FUNCS(key_blockptr, struct btrfs_key_ptr, blockptr, 64); BTRFS_SETGET_FUNCS(key_generation, struct btrfs_key_ptr, generation, 64); @@ -979,6 +983,9 @@ BTRFS_SETGET_FUNCS(qgroup_status_flags, struct btrfs_qgroup_status_item, flags, 64); BTRFS_SETGET_FUNCS(qgroup_status_rescan, struct btrfs_qgroup_status_item, rescan, 64); +BTRFS_SETGET_FUNCS(qgroup_status_enable_gen, struct btrfs_qgroup_status_item, + enable_gen, 64); + BTRFS_SETGET_STACK_FUNCS(stack_qgroup_status_generation, struct btrfs_qgroup_status_item, generation, 64); BTRFS_SETGET_STACK_FUNCS(stack_qgroup_status_version, @@ -987,6 +994,8 @@ BTRFS_SETGET_STACK_FUNCS(stack_qgroup_status_flags, struct 
btrfs_qgroup_status_item, flags, 64); BTRFS_SETGET_STACK_FUNCS(stack_qgroup_status_rescan, struct btrfs_qgroup_status_item, rescan, 64); +BTRFS_SETGET_STACK_FUNCS(stack_qgroup_status_enable_gen, + struct btrfs_qgroup_status_item, enable_gen, 64); /* btrfs_qgroup_info_item */ BTRFS_SETGET_FUNCS(qgroup_info_generation, struct btrfs_qgroup_info_item, diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h index 5d3392ae8..3b283b21e 100644 --- a/kernel-shared/ctree.h +++ b/kernel-shared/ctree.h @@ -102,7 +102,8 @@ static inline u32 __BTRFS_LEAF_DATA_SIZE(u32 nodesize) BTRFS_FEATURE_INCOMPAT_RAID1C34 | \ BTRFS_FEATURE_INCOMPAT_METADATA_UUID | \ BTRFS_FEATURE_INCOMPAT_ZONED | \ - BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2) + BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 | \ + BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA) #else #define BTRFS_FEATURE_INCOMPAT_SUPP \ (BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF | \ @@ -117,7 +118,8 @@ static inline u32 __BTRFS_LEAF_DATA_SIZE(u32 nodesize) BTRFS_FEATURE_INCOMPAT_NO_HOLES | \ BTRFS_FEATURE_INCOMPAT_RAID1C34 | \ BTRFS_FEATURE_INCOMPAT_METADATA_UUID | \ - BTRFS_FEATURE_INCOMPAT_ZONED) + BTRFS_FEATURE_INCOMPAT_ZONED | \ + BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA) #endif /* diff --git a/kernel-shared/uapi/btrfs.h b/kernel-shared/uapi/btrfs.h index 85b04f89a..d312b9f4f 100644 --- a/kernel-shared/uapi/btrfs.h +++ b/kernel-shared/uapi/btrfs.h @@ -356,6 +356,7 @@ _static_assert(sizeof(struct btrfs_ioctl_fs_info_args) == 1024); #define BTRFS_FEATURE_INCOMPAT_RAID1C34 (1ULL << 11) #define BTRFS_FEATURE_INCOMPAT_ZONED (1ULL << 12) #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 (1ULL << 13) +#define BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA (1ULL << 14) struct btrfs_ioctl_feature_flags { __u64 compat_flags; diff --git a/kernel-shared/uapi/btrfs_tree.h b/kernel-shared/uapi/btrfs_tree.h index ad555e705..a9fdbbb1e 100644 --- a/kernel-shared/uapi/btrfs_tree.h +++ b/kernel-shared/uapi/btrfs_tree.h @@ -227,6 +227,8 @@ #define BTRFS_SHARED_DATA_REF_KEY 184 +#define 
BTRFS_EXTENT_OWNER_REF_KEY 190 + /* * block groups give us hints into the extent allocation trees. Which * blocks are free etc etc @@ -783,6 +785,10 @@ struct btrfs_shared_data_ref { __le32 count; } __attribute__ ((__packed__)); +struct btrfs_extent_owner_ref { + __le64 root_id; +} __attribute__ ((__packed__)); + struct btrfs_extent_inline_ref { __u8 type; __le64 offset; @@ -1224,6 +1230,12 @@ struct btrfs_qgroup_status_item { * of the scan. It contains a logical address */ __le64 rescan; + + /* + * Used by simple quotas to ignore old extent deletions + * Present iff incompat flag SIMPLE_QUOTA is set + */ + __le64 enable_gen; } __attribute__ ((__packed__)); struct btrfs_qgroup_info_item {
From patchwork Wed Jul 5 23:36:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303049 From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 3/8] btrfs-progs: simple quotas dump commands Date: Wed, 5 Jul 2023 16:36:22 -0700 Message-ID: X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add support to btrfs inspect-internal dump-super and dump-tree for the new structures and feature flags introduced by simple quotas Signed-off-by: Boris Burkov Reviewed-by: Josef Bacik --- kernel-shared/print-tree.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/kernel-shared/print-tree.c b/kernel-shared/print-tree.c index 0f7f7b72f..7d4e77579 100644 --- a/kernel-shared/print-tree.c +++ b/kernel-shared/print-tree.c @@ -509,6 +509,10 @@ void print_extent_item(struct extent_buffer *eb, int slot, int metadata) (unsigned long long)offset, btrfs_shared_data_ref_count(eb, sref)); break; + case BTRFS_EXTENT_OWNER_REF_KEY: + printf("\t\textent owner root %llu\n", + (unsigned long long)offset); + break; default: return; } @@ -661,6 +665,7 @@ void print_key_type(FILE *stream, u64 objectid, u8 type) [BTRFS_EXTENT_DATA_REF_KEY] = "EXTENT_DATA_REF", [BTRFS_SHARED_DATA_REF_KEY] = "SHARED_DATA_REF", [BTRFS_EXTENT_REF_V0_KEY] = "EXTENT_REF_V0", + [BTRFS_EXTENT_OWNER_REF_KEY] = "EXTENT_OWNER_REF", [BTRFS_CSUM_ITEM_KEY] = "CSUM_ITEM", [BTRFS_EXTENT_CSUM_KEY] = "EXTENT_CSUM", [BTRFS_EXTENT_DATA_KEY] = "EXTENT_DATA", @@ -1042,6 +1047,17 @@ static void print_shared_data_ref(struct extent_buffer *eb, int slot) btrfs_shared_data_ref_count(eb, sref)); } +static void print_extent_owner_ref(struct extent_buffer *eb, int slot) +{ + struct btrfs_extent_owner_ref *oref; + u64 root_id; + + oref = btrfs_item_ptr(eb, slot, struct btrfs_extent_owner_ref); + root_id = btrfs_extent_owner_ref_root_id(eb, oref); + + printf("\t\textent owner root 
%llu\n", root_id); +} + static void print_free_space_info(struct extent_buffer *eb, int slot) { struct btrfs_free_space_info *free_info; @@ -1083,11 +1099,16 @@ static void print_qgroup_status(struct extent_buffer *eb, int slot) memset(flags_str, 0, sizeof(flags_str)); qgroup_flags_to_str(btrfs_qgroup_status_flags(eb, qg_status), flags_str); - printf("\t\tversion %llu generation %llu flags %s scan %llu\n", + printf("\t\tversion %llu generation %llu flags %s scan %llu", (unsigned long long)btrfs_qgroup_status_version(eb, qg_status), (unsigned long long)btrfs_qgroup_status_generation(eb, qg_status), flags_str, (unsigned long long)btrfs_qgroup_status_rescan(eb, qg_status)); + if (btrfs_fs_incompat(eb->fs_info, SIMPLE_QUOTA)) + printf(" enable_gen %llu\n", + (unsigned long long)btrfs_qgroup_status_enable_gen(eb, qg_status)); + else + printf("\n"); } static void print_qgroup_info(struct extent_buffer *eb, int slot) @@ -1407,6 +1428,9 @@ void btrfs_print_leaf(struct extent_buffer *eb, unsigned int mode) case BTRFS_SHARED_DATA_REF_KEY: print_shared_data_ref(eb, i); break; + case BTRFS_EXTENT_OWNER_REF_KEY: + print_extent_owner_ref(eb, i); + break; case BTRFS_EXTENT_REF_V0_KEY: printf("\t\textent ref v0 (deprecated)\n"); break; @@ -1708,6 +1732,7 @@ static struct readable_flag_entry incompat_flags_array[] = { DEF_INCOMPAT_FLAG_ENTRY(RAID1C34), DEF_INCOMPAT_FLAG_ENTRY(ZONED), DEF_INCOMPAT_FLAG_ENTRY(EXTENT_TREE_V2), + DEF_INCOMPAT_FLAG_ENTRY(SIMPLE_QUOTA), }; static const int incompat_flags_num = sizeof(incompat_flags_array) / sizeof(struct readable_flag_entry);
From patchwork Wed Jul 5 23:36:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303051 From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 4/8] btrfs-progs: simple quotas fsck Date: Wed, 5 Jul 2023 16:36:23 -0700 Message-ID: <929adaf2889519f82cb79db3077eef2d8938a247.1688599734.git.boris@bur.io> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add simple quotas checks to btrfs check. Like the kernel feature, these checks bypass most of the backref walking in the qgroups check. 
Instead, they enforce the invariant behind the design of simple quotas by scanning the extent tree and determining the owner of each extent: Data: reading the owner ref inline item Metadata: reading the tree block and reading its btrfs_header's owner This gives us the expected count from squotas which we check against the on-disk state of the qgroup items Signed-off-by: Boris Burkov --- check/main.c | 2 + check/qgroup-verify.c | 122 ++++++++++++++++++++++++++++++++++-------- 2 files changed, 102 insertions(+), 22 deletions(-) diff --git a/check/main.c b/check/main.c index 77bb50a0e..07f31fbe0 100644 --- a/check/main.c +++ b/check/main.c @@ -5667,6 +5667,8 @@ static int process_extent_item(struct btrfs_root *root, btrfs_shared_data_ref_count(eb, sref), gen, 0, num_bytes); break; + case BTRFS_EXTENT_OWNER_REF_KEY: + break; default: fprintf(stderr, "corrupt extent record: key [%llu,%u,%llu]\n", diff --git a/check/qgroup-verify.c b/check/qgroup-verify.c index 1a62009b8..0d079f3b7 100644 --- a/check/qgroup-verify.c +++ b/check/qgroup-verify.c @@ -85,6 +85,8 @@ static struct counts_tree { unsigned int num_groups; unsigned int rescan_running:1; unsigned int qgroup_inconsist:1; + unsigned int simple:1; + u64 enable_gen; u64 scan_progress; } counts = { .root = RB_ROOT }; @@ -341,14 +343,14 @@ static int find_parent_roots(struct ulist *roots, u64 parent) ref = find_ref_bytenr(parent); if (!ref) { error("bytenr ref not found for parent %llu", - (unsigned long long)parent); + (unsigned long long)parent); return -EIO; } node = &ref->bytenr_node; if (ref->bytenr != parent) { error("found bytenr ref does not match parent: %llu != %llu", - (unsigned long long)ref->bytenr, - (unsigned long long)parent); + (unsigned long long)ref->bytenr, + (unsigned long long)parent); return -EIO; } @@ -364,8 +366,8 @@ static int find_parent_roots(struct ulist *roots, u64 parent) prev = rb_entry(prev_node, struct ref, bytenr_node); if (prev->bytenr == parent) { error( - "unexpected: prev bytenr 
same as parent: %llu", - (unsigned long long)parent); + "unexpected: prev bytenr same as parent: %llu", + (unsigned long long)parent); return -EIO; } } @@ -717,9 +719,6 @@ static int travel_tree(struct btrfs_fs_info *info, struct btrfs_root *root, u64 new_bytenr; u64 new_num_bytes; -// printf("travel_tree: bytenr: %llu\tnum_bytes: %llu\tref_parent: %llu\n", -// bytenr, num_bytes, ref_parent); - eb = read_tree_block(info, bytenr, btrfs_root_id(root), 0, 0, NULL); if (!extent_buffer_uptodate(eb)) @@ -915,20 +914,24 @@ static int add_qgroup_relation(u64 memberid, u64 parentid) return 0; } -static void read_qgroup_status(struct extent_buffer *eb, int slot, - struct counts_tree *counts) +static void read_qgroup_status(struct btrfs_fs_info *info, + struct extent_buffer *eb, + int slot, struct counts_tree *counts) { struct btrfs_qgroup_status_item *status_item; u64 flags; status_item = btrfs_item_ptr(eb, slot, struct btrfs_qgroup_status_item); flags = btrfs_qgroup_status_flags(eb, status_item); + + if (counts->simple == 1) + counts->enable_gen = btrfs_qgroup_status_enable_gen(eb, status_item); /* * Since qgroup_inconsist/rescan_running is just one bit, * assign value directly won't work. 
*/ counts->qgroup_inconsist = !!(flags & - BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT); + BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT); counts->rescan_running = !!(flags & BTRFS_QGROUP_STATUS_FLAG_RESCAN); counts->scan_progress = btrfs_qgroup_status_rescan(eb, status_item); } @@ -948,6 +951,8 @@ static int load_quota_info(struct btrfs_fs_info *info) int i, nr; int search_relations = 0; + if (btrfs_fs_incompat(info, SIMPLE_QUOTA)) + counts.simple = 1; loop: /* * Do 2 passes, the first allocates group counts and reads status @@ -981,7 +986,7 @@ loop: if (ret) { errno = -ret; error( - "failed to add qgroup relation, member=%llu parent=%llu: %m", + "failed to add qgroup relation, member=%llu parent=%llu: %m", key.objectid, key.offset); goto out; } @@ -990,7 +995,7 @@ loop: } if (key.type == BTRFS_QGROUP_STATUS_KEY) { - read_qgroup_status(leaf, i, &counts); + read_qgroup_status(info, leaf, i, &counts); continue; } @@ -1038,6 +1043,51 @@ out: return ret; } +static int simple_quota_account_extent(struct btrfs_fs_info *info, + struct extent_buffer *leaf, + struct btrfs_key *key, + struct btrfs_extent_item *ei, + struct btrfs_extent_inline_ref *iref, + u64 bytenr, u64 num_bytes, int meta_item) +{ + u64 generation; + int type; + u64 root; + struct ulist *roots = ulist_alloc(0); + int ret; + struct extent_buffer *node_eb; + u64 extent_root; + + generation = btrfs_extent_generation(leaf, ei); + if (generation < counts.enable_gen) + return 0; + + type = btrfs_extent_inline_ref_type(leaf, iref); + if (!meta_item) { + if (type == BTRFS_EXTENT_OWNER_REF_KEY) { + struct btrfs_extent_owner_ref *oref = (struct btrfs_extent_owner_ref *)(&iref->offset); + root = btrfs_extent_owner_ref_root_id(leaf, oref); + } else { + return 0; + } + } else { + extent_root = btrfs_root_id(btrfs_extent_root(info, key->objectid)); + node_eb = read_tree_block(info, key->objectid, extent_root, 0, 0, NULL); + if (!extent_buffer_uptodate(node_eb)) + return -EIO; + root = btrfs_header_owner(node_eb); + 
free_extent_buffer(node_eb); + } + + if (!is_fstree(root)) + return 0; + + ulist_add(roots, root, 0, 0); + ret = account_one_extent(roots, bytenr, num_bytes); + ulist_free(roots); + return ret; +} + static int add_inline_refs(struct btrfs_fs_info *info, struct extent_buffer *ei_leaf, int slot, u64 bytenr, u64 num_bytes, int meta_item) @@ -1045,6 +1095,7 @@ static int add_inline_refs(struct btrfs_fs_info *info, struct btrfs_extent_item *ei; struct btrfs_extent_inline_ref *iref; struct btrfs_extent_data_ref *dref; + struct btrfs_key key; u64 flags, root_obj, offset, parent; u32 item_size = btrfs_item_size(ei_leaf, slot); int type; @@ -1052,6 +1103,7 @@ static int add_inline_refs(struct btrfs_fs_info *info, unsigned long ptr; ei = btrfs_item_ptr(ei_leaf, slot, struct btrfs_extent_item); + btrfs_item_key_to_cpu(ei_leaf, &key, slot); flags = btrfs_extent_flags(ei_leaf, ei); if (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK && !meta_item) { @@ -1062,6 +1114,15 @@ static int add_inline_refs(struct btrfs_fs_info *info, iref = (struct btrfs_extent_inline_ref *)(ei + 1); } + if (counts.simple) { + int ret = simple_quota_account_extent(info, ei_leaf, &key, ei, iref, + bytenr, num_bytes, meta_item); + + if (ret) + error("simple quota account extent error: %d", ret); + return ret; + } + ptr = (unsigned long)iref; end = (unsigned long)ei + item_size; while (ptr < end) { @@ -1083,6 +1144,7 @@ static int add_inline_refs(struct btrfs_fs_info *info, parent = offset; break; default: + error("unexpected iref type %d", type); return 1; } @@ -1212,13 +1274,19 @@ static int scan_extents(struct btrfs_fs_info *info, ret = add_inline_refs(info, leaf, i, bytenr, num_bytes, meta); if (ret) + { + error("add inline refs error: %d", ret); goto out; + } level = get_tree_block_level(&key, leaf, i); if (level) { if (alloc_tree_block(bytenr, num_bytes, level)) + { + error("enomem 1"); return ENOMEM; + } } continue; @@ -1241,7 +1309,10 @@ static int scan_extents(struct btrfs_fs_info *info, ret = 
add_keyed_ref(info, &key, leaf, i, bytenr, num_bytes); if (ret) + { + error("add keyed ref error: %d", ret); goto out; + } } ret = btrfs_next_leaf(root, &path); @@ -1330,10 +1401,10 @@ void report_qgroups(int all) if (!opt_check_repair && counts.rescan_running) { if (all) { printf( - "Qgroup rescan is running, a difference in qgroup counts is expected\n"); + "Qgroup rescan is running, a difference in qgroup counts is expected\n"); } else { printf( - "Qgroup rescan is running, qgroups will not be printed.\n"); + "Qgroup rescan is running, qgroups will not be printed.\n"); return; } } @@ -1342,7 +1413,7 @@ void report_qgroups(int all) */ if (counts.qgroup_inconsist && !counts.rescan_running) printf( -"Rescan hasn't been initialzied, a difference in qgroup accounting is expected\n"); + "Rescan hasn't been initialized, a difference in qgroup accounting is expected\n"); node = rb_first(&counts.root); while (node) { c = rb_entry(node, struct qgroup_count, rb_node); @@ -1445,6 +1516,12 @@ int qgroup_verify_all(struct btrfs_fs_info *info) goto out; } } + /* + * As in the kernel, simple qgroup accounting is done locally per extent, + * so we don't need to do all the logic resolving refs.
+ */ + if (counts.simple) + goto check; ret = map_implied_refs(info); if (ret) { @@ -1454,6 +1531,7 @@ int qgroup_verify_all(struct btrfs_fs_info *info) ret = account_all_refs(1, 0); +check: /* * Do the correctness check here, so for callers who don't want * verbose report can skip calling report_qgroups() @@ -1568,8 +1646,8 @@ static int repair_qgroup_info(struct btrfs_fs_info *info, if (!silent) printf("Repair qgroup %u/%llu\n", - btrfs_qgroup_level(count->qgroupid), - btrfs_qgroup_subvolid(count->qgroupid)); + btrfs_qgroup_level(count->qgroupid), + btrfs_qgroup_subvolid(count->qgroupid)); trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) @@ -1596,14 +1674,14 @@ static int repair_qgroup_info(struct btrfs_fs_info *info, trans->transid); btrfs_set_qgroup_info_rfer(path.nodes[0], info_item, - count->info.referenced); + count->info.referenced); btrfs_set_qgroup_info_rfer_cmpr(path.nodes[0], info_item, - count->info.referenced_compressed); + count->info.referenced_compressed); btrfs_set_qgroup_info_excl(path.nodes[0], info_item, - count->info.exclusive); + count->info.exclusive); btrfs_set_qgroup_info_excl_cmpr(path.nodes[0], info_item, - count->info.exclusive_compressed); + count->info.exclusive_compressed); btrfs_mark_buffer_dirty(path.nodes[0]); From patchwork Wed Jul 5 23:36:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303050 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31EB6EB64DD for ; Wed, 5 Jul 2023 23:37:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231608AbjGEXhn (ORCPT ); Wed, 5 Jul 2023 19:37:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44364 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231724AbjGEXhm (ORCPT ); Wed, 5 Jul 2023 19:37:42 -0400 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ECA5B1990 for ; Wed, 5 Jul 2023 16:37:41 -0700 (PDT) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.nyi.internal (Postfix) with ESMTP id 64E875C00B4; Wed, 5 Jul 2023 19:37:41 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Wed, 05 Jul 2023 19:37:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bur.io; h=cc :content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm3; t=1688600261; x= 1688686661; bh=vHVNKV1iQqpPa4eszV6eL6oG+RGHvuHd55kGwqH8imE=; b=d QeqnsFHzOzYbe6Q1vZ068FhnsgD6fZXlgPfrFMty+2Bm4WgH0m0TBP1Jmbl6zoEE rID6JWvbwjZ6dNyFRi4hzP7Tb4885AG3L5nZwar7i7wr8syyNMus4DEpDYQ/KQWQ PkwySSjTRsXk8xlNVgRlrv0D+FU0dqGYTryv5TRLpRGR07tvL0XVwM6LAw6iMDfV VPeMgnB1Rd14Y2MyS7qvaJWSM3Y5cVGlvtAtwFYsp2ju9qm3g/wAchzs9+prFCRs fYiHk8lAR1hJsLmLhdWqwlRAF1aUfc3bgC164K3GXAnknSZg1jIMENVV9rS28cRw Fsyqvaic2b62ABtje3gGg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:date:feedback-id:feedback-id:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm2; t=1688600261; x=1688686661; bh=v HVNKV1iQqpPa4eszV6eL6oG+RGHvuHd55kGwqH8imE=; b=OCOAm+Nj91+O4s4ZQ c6kWxBD4D6d2HVmKtlFxWGi4zmVPl9dmuH1y7Ng3A8lG4x3VUsWjzmIgOufOqt7L A2Nv3jYfTkVe+YMUG3sZnSegl0Kdi1RPHJXuRPjO9jB3sWOFQ5x3qEtOxXudLE/P 4hubNog8oerdaRCUXAK5mF2vlw47OtGOnsjFK8Fi5lRg9orGY9vUCmp0B4y0Axjp 74GXFjiyhLb9fGal0GSJkAV2EVBCmlcPwivyX1MRraar2yp0Tj66VEzYVC2EM52k 
6xxUoKffyigsnRL0nBD1vOwj7SCv0LeQSOoywK8sEpWRAXhP1MRF9aerFY+IvHCi hbiXw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrudekgddvgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeeuohhrihhsuceuuhhrkhhovhcuoegsohhrihhssegsuhhrrdhi oheqnecuggftrfgrthhtvghrnhepieeuffeuvdeiueejhfehiefgkeevudejjeejffevvd ehtddufeeihfekgeeuheelnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehm rghilhhfrhhomhepsghorhhishessghurhdrihho X-ME-Proxy: Feedback-ID: i083147f8:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 5 Jul 2023 19:37:40 -0400 (EDT) From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 5/8] btrfs-progs: simple quotas mkfs Date: Wed, 5 Jul 2023 16:36:24 -0700 Message-ID: <19ca469539472675b8cdb0d807e59cbd4e081fd4.1688599734.git.boris@bur.io> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add the ability to enable simple quotas from mkfs with '-O squota'. There is some complication around handling the enable generation while still counting the root node of the filesystem. To handle this, employ a hack: do a no-op write on the root node to bump its generation above the qgroup enable generation, so that it is counted properly.
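The generation rule behind this hack can be sketched in isolation. This is only an illustration, not code from this series: squota_counts_extent() is a hypothetical helper that mirrors the `generation < counts.enable_gen` test in simple_quota_account_extent(), and the generation numbers mentioned afterwards are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of btrfs-progs): an extent is charged to
 * a simple quota group only if its generation is not older than the
 * generation in which simple quotas were enabled.
 */
static bool squota_counts_extent(uint64_t extent_gen, uint64_t enable_gen)
{
	return extent_gen >= enable_gen;
}
```

Under this rule, with an assumed enable generation of 7, a root node last written in generation 6 would be skipped, while the same node rewritten by the no-op write in generation 8 would be counted.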
Signed-off-by: Boris Burkov Reviewed-by: Josef Bacik --- common/fsfeatures.c | 9 +++++++ mkfs/main.c | 63 ++++++++++++++++++++++++++++++++++++++++----- 2 files changed, 66 insertions(+), 6 deletions(-) diff --git a/common/fsfeatures.c b/common/fsfeatures.c index 00658fa51..584ecb5fc 100644 --- a/common/fsfeatures.c +++ b/common/fsfeatures.c @@ -108,6 +108,15 @@ static const struct btrfs_feature mkfs_features[] = { VERSION_NULL(default), .desc = "quota support (qgroups)" }, + { + .name = "squota", + .incompat_flag = BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA, + .sysfs_name = "squota", + VERSION_TO_STRING2(compat, 6,5), + VERSION_NULL(safe), + VERSION_NULL(default), + .desc = "squota support (simple qgroups)" + }, { .name = "extref", .incompat_flag = BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF, diff --git a/mkfs/main.c b/mkfs/main.c index 7acd39ec6..2f0b563a0 100644 --- a/mkfs/main.c +++ b/mkfs/main.c @@ -59,6 +59,8 @@ #include "mkfs/common.h" #include "mkfs/rootdir.h" +#include "libbtrfs/ctree.h" + struct mkfs_allocation { u64 data; u64 metadata; @@ -882,6 +884,39 @@ static int insert_qgroup_items(struct btrfs_trans_handle *trans, return ret; } +static int touch_root_subvol(struct btrfs_fs_info *fs_info) +{ + struct btrfs_trans_handle *trans; + struct btrfs_inode_item *inode_item; + struct btrfs_key key = { + .objectid = BTRFS_FIRST_FREE_OBJECTID, + .type = BTRFS_INODE_ITEM_KEY, + .offset = 0, + }; + struct extent_buffer *leaf; + int slot; + struct btrfs_path path; + int ret; + + trans = btrfs_start_transaction(fs_info->fs_root, 1); + btrfs_init_path(&path); + ret = btrfs_search_slot(trans, fs_info->fs_root, &key, &path, 0, 1); + if (ret) + goto fail; + leaf = path.nodes[0]; + slot = path.slots[0]; + btrfs_item_key_to_cpu(leaf, &key, slot); + inode_item = btrfs_item_ptr(leaf, slot, struct btrfs_inode_item); + btrfs_mark_buffer_dirty(leaf); + btrfs_commit_transaction(trans, fs_info->fs_root); + btrfs_release_path(&path); + return 0; +fail: + btrfs_abort_transaction(trans, ret); + 
btrfs_release_path(&path); + return ret; +} + static int setup_quota_root(struct btrfs_fs_info *fs_info) { struct btrfs_trans_handle *trans; @@ -890,8 +925,11 @@ static int setup_quota_root(struct btrfs_fs_info *fs_info) struct btrfs_path path; struct btrfs_key key; int qgroup_repaired = 0; + bool simple = btrfs_fs_incompat(fs_info, SIMPLE_QUOTA); + int flags; int ret; + /* One to modify tree root, one for quota root */ trans = btrfs_start_transaction(fs_info->tree_root, 2); if (IS_ERR(trans)) { @@ -921,13 +959,16 @@ static int setup_quota_root(struct btrfs_fs_info *fs_info) qsi = btrfs_item_ptr(path.nodes[0], path.slots[0], struct btrfs_qgroup_status_item); - btrfs_set_qgroup_status_generation(path.nodes[0], qsi, 0); + btrfs_set_qgroup_status_generation(path.nodes[0], qsi, trans->transid); btrfs_set_qgroup_status_rescan(path.nodes[0], qsi, 0); + flags = BTRFS_QGROUP_STATUS_FLAG_ON; + if (simple) + btrfs_set_qgroup_status_enable_gen(path.nodes[0], qsi, trans->transid); + else + flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT; - /* Mark current status info inconsistent, and fix it later */ - btrfs_set_qgroup_status_flags(path.nodes[0], qsi, - BTRFS_QGROUP_STATUS_FLAG_ON | - BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT); + btrfs_set_qgroup_status_version(path.nodes[0], qsi, 1); + btrfs_set_qgroup_status_flags(path.nodes[0], qsi, flags); btrfs_release_path(&path); /* Currently mkfs will only create one subvolume */ @@ -944,6 +985,15 @@ static int setup_quota_root(struct btrfs_fs_info *fs_info) return ret; } + /* Hack to count the default subvol metadata by dirtying it */ + if (simple) { + ret = touch_root_subvol(fs_info); + if (ret) { + error("failed to touch root dir for simple quota accounting %d (%m)", ret); + goto fail; + } + } + /* * Qgroup is setup but with wrong info, use qgroup-verify * infrastructure to repair them. 
(Just acts as offline rescan) @@ -1743,7 +1793,8 @@ raid_groups: } } - if (features.runtime_flags & BTRFS_FEATURE_RUNTIME_QUOTA) { + if (features.runtime_flags & BTRFS_FEATURE_RUNTIME_QUOTA || + features.incompat_flags & BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA) { ret = setup_quota_root(fs_info); if (ret < 0) { error("failed to initialize quota: %d (%m)", ret); From patchwork Wed Jul 5 23:36:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A08CEB64DA for ; Wed, 5 Jul 2023 23:37:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231791AbjGEXhp (ORCPT ); Wed, 5 Jul 2023 19:37:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231724AbjGEXho (ORCPT ); Wed, 5 Jul 2023 19:37:44 -0400 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8D4A512A for ; Wed, 5 Jul 2023 16:37:43 -0700 (PDT) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id 08C965C0056; Wed, 5 Jul 2023 19:37:43 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Wed, 05 Jul 2023 19:37:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bur.io; h=cc :content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm3; t=1688600263; x= 1688686663; bh=1xzdKvzy3sMHeu+eRDpJi09YRWXqrdn9Pq2QDJ7KYmg=; b=V 
MRZmK80X4qYUZGjYJUELn1929M8JkYKplN4s51rz/91jDkXtTlBFkMnSEAJFR2ev g3YbKo7JiCT2CY48OJoJQzGf1Ug7BNz6kbb0FOlBl7mzdSV4d21zpdRXAtmTXobt dJYsqLTNzQNxl/r9Lk9H0yQiGb/NKNLNfcdWxe5g5KZt19qSRMVWZIi2A+pZPU8z tWBFxp2KIdrJiF1v00keIkuCH/gFSoAqwCLu5E2mdyGn2ovTYb2mQHUGhO0FUcon Ah1BVTQVpF67ps+Lvjbex77j5AlJHJKLQ4RRWyuDjd/gtHFK57EkL46OYeNDXfza C3TiTmNoX5sGx1n8BnVNQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:date:feedback-id:feedback-id:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm2; t=1688600263; x=1688686663; bh=1 xzdKvzy3sMHeu+eRDpJi09YRWXqrdn9Pq2QDJ7KYmg=; b=mnQg41/1BMfEjqCDx +wlf86GNRW1RqX8RGXAmP2XyncT8SFmBz5xgMdFkhnOEvjbZNRfkxkl8lrSeQWHg RQAiFn/NB+DqtkclgtzzxaIKj0MuXSCPwaGs39q9Q2obcmU1FCh2h00hvm8lOvkh r/BohAUzFXeZ1MM3Bk9bfaUi36KsshV6djLMLZqHIXGIYzhRgak3Kw8rroGFUXPt Ptk+naSx3CBr+Z1cHsEXJv86he8kmFAtuksUG3QCTz1vJs6Sc2zYoW4OL8pdMCoL CpOYmo3FCpcq6FCFh/kQ79TJr9wKNgqzQHPQhJY7/2bJh2cOHJEHY0Rt0cNkEWIn S5CaQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrudekgddvfecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeeuohhrihhsuceuuhhrkhhovhcuoegsohhrihhssegsuhhrrdhi oheqnecuggftrfgrthhtvghrnhepieeuffeuvdeiueejhfehiefgkeevudejjeejffevvd ehtddufeeihfekgeeuheelnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehm rghilhhfrhhomhepsghorhhishessghurhdrihho X-ME-Proxy: Feedback-ID: i083147f8:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 5 Jul 2023 19:37:42 -0400 (EDT) From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 6/8] btrfs-progs: simple quotas btrfstune Date: Wed, 5 Jul 2023 16:36:25 -0700 Message-ID: 
<97a649f080eef409746d4a4cd59f4c27e0bbb287.1688599734.git.boris@bur.io> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add the ability to enable simple quotas on an existing file system at rest with btrfstune. This is similar to the functionality in mkfs, except it must also find all the roots for which it must create qgroups. Note that this *does not* retroactively compute usage for existing extents as that is impossible for data. This is consistent with the behavior of the live enable ioctl. Signed-off-by: Boris Burkov Reviewed-by: Josef Bacik --- Makefile | 2 +- tune/main.c | 13 +++- tune/quota.c | 169 +++++++++++++++++++++++++++++++++++++++++++++++++++ tune/tune.h | 3 + 4 files changed, 184 insertions(+), 3 deletions(-) create mode 100644 tune/quota.c diff --git a/Makefile b/Makefile index 86c73590d..74862ff32 100644 --- a/Makefile +++ b/Makefile @@ -257,7 +257,7 @@ convert_objects = convert/main.o convert/common.o convert/source-fs.o \ mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o image_objects = image/main.o image/sanitize.o tune_objects = tune/main.o tune/seeding.o tune/change-uuid.o tune/change-metadata-uuid.o \ - tune/convert-bgt.o tune/change-csum.o check/clear-cache.o + tune/convert-bgt.o tune/change-csum.o tune/quota.o check/clear-cache.o all_objects = $(objects) $(cmds_objects) $(libbtrfs_objects) $(convert_objects) \ $(mkfs_objects) $(image_objects) $(tune_objects) $(libbtrfsutil_objects) diff --git a/tune/main.c b/tune/main.c index e38c1f6d3..b694d0da0 100644 --- a/tune/main.c +++ b/tune/main.c @@ -103,6 +103,7 @@ static const char * const tune_usage[] = { OPTLINE("-x", "enable skinny metadata extent refs (mkfs: skinny-metadata)"), OPTLINE("-n", "enable no-holes feature (mkfs: no-holes, more efficient sparse file representation)"), OPTLINE("-S <0|1>", "set/unset seeding status of a device"), + OPTLINE("-q", "enable simple quotas on the file 
system. (mkfs: squota)"), OPTLINE("--convert-to-block-group-tree", "convert filesystem to track block groups in " "the separate block-group-tree instead of extent tree (sets the incompat bit)"), OPTLINE("--convert-from-block-group-tree", @@ -147,6 +148,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[]) char *new_fsid_str = NULL; int ret; u64 super_flags = 0; + int quota = 0; int fd = -1; btrfs_config_init(); @@ -169,7 +171,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[]) #endif { NULL, 0, NULL, 0 } }; - int c = getopt_long(argc, argv, "S:rxfuU:nmM:", long_options, NULL); + int c = getopt_long(argc, argv, "S:rxqfuU:nmM:", long_options, NULL); if (c < 0) break; @@ -184,6 +186,9 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[]) case 'x': super_flags |= BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA; break; + case 'q': + quota = 1; + break; case 'n': super_flags |= BTRFS_FEATURE_INCOMPAT_NO_HOLES; break; @@ -241,7 +246,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[]) } if (!super_flags && !seeding_flag && !(random_fsid || new_fsid_str) && !change_metadata_uuid && csum_type == -1 && !to_bg_tree && - !to_extent_tree && !to_fst) { + !to_extent_tree && !to_fst && !quota) { error("at least one option should be specified"); usage(&tune_cmd, 1); return 1; @@ -420,6 +425,10 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[]) total++; } + if (quota) { + ret = enable_quota(root->fs_info, true); + } + if (success == total) { ret = 0; } else { diff --git a/tune/quota.c b/tune/quota.c new file mode 100644 index 000000000..7de5ef827 --- /dev/null +++ b/tune/quota.c @@ -0,0 +1,169 @@ +#include + +#include "common/messages.h" +#include "kernel-shared/ctree.h" +#include "kernel-shared/disk-io.h" +#include "kernel-shared/transaction.h" +#include "kernel-shared/uapi/btrfs_tree.h" + +static int create_qgroup(struct btrfs_fs_info *fs_info, + struct btrfs_trans_handle *trans, + u64 qgroupid) +{ + struct btrfs_path path; + struct btrfs_root *quota_root = fs_info->quota_root; + struct 
btrfs_key key; + int ret; + + if (qgroupid >> BTRFS_QGROUP_LEVEL_SHIFT) { + error("qgroup level other than 0 is not supported yet"); + return -ENOTTY; + } + + key.objectid = 0; + key.type = BTRFS_QGROUP_INFO_KEY; + key.offset = qgroupid; + + btrfs_init_path(&path); + ret = btrfs_insert_empty_item(trans, quota_root, &path, &key, + sizeof(struct btrfs_qgroup_info_item)); + btrfs_release_path(&path); + if (ret < 0) + return ret; + + key.objectid = 0; + key.type = BTRFS_QGROUP_LIMIT_KEY; + key.offset = qgroupid; + ret = btrfs_insert_empty_item(trans, quota_root, &path, &key, + sizeof(struct btrfs_qgroup_limit_item)); + btrfs_release_path(&path); + + printf("created qgroup for %llu\n", qgroupid); + return ret; +} + +static int create_qgroups(struct btrfs_fs_info *fs_info, + struct btrfs_trans_handle *trans) +{ + struct btrfs_key key = { + .objectid = 0, + .type = BTRFS_ROOT_REF_KEY, + .offset = 0, + }; + struct btrfs_path path; + struct extent_buffer *leaf; + int slot; + struct btrfs_root *tree_root = fs_info->tree_root; + int ret; + + + ret = create_qgroup(fs_info, trans, BTRFS_FS_TREE_OBJECTID); + if (ret) + goto out; + + btrfs_init_path(&path); + ret = btrfs_search_slot_for_read(tree_root, &key, &path, 1, 0); + if (ret) + goto out; + + while (1) { + slot = path.slots[0]; + leaf = path.nodes[0]; + btrfs_item_key_to_cpu(leaf, &key, slot); + if (key.type == BTRFS_ROOT_REF_KEY) { + ret = create_qgroup(fs_info, trans, key.offset); + if (ret) + goto out; + } + ret = btrfs_next_item(tree_root, &path); + if (ret < 0) { + error("failed to advance to next item"); + goto out; + } + if (ret) + break; + } + +out: + btrfs_release_path(&path); + return ret; +} + +int enable_quota(struct btrfs_fs_info *fs_info, bool simple) +{ + struct btrfs_super_block *sb = fs_info->super_copy; + struct btrfs_trans_handle *trans; + int super_flags = btrfs_super_incompat_flags(sb); + struct btrfs_qgroup_status_item *qsi; + struct btrfs_root *quota_root; + struct btrfs_path path; + struct btrfs_key 
key; + int flags; + int ret; + + trans = btrfs_start_transaction(fs_info->tree_root, 2); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); + errno = -ret; + error_msg(ERROR_MSG_START_TRANS, "%m"); + return ret; + } + + ret = btrfs_create_root(trans, fs_info, BTRFS_QUOTA_TREE_OBJECTID); + if (ret < 0) { + error("failed to create quota root: %d (%m)", ret); + goto fail; + } + quota_root = fs_info->quota_root; + + /* Create the qgroup status item */ + key.objectid = 0; + key.type = BTRFS_QGROUP_STATUS_KEY; + key.offset = 0; + + btrfs_init_path(&path); + ret = btrfs_insert_empty_item(trans, quota_root, &path, &key, + sizeof(*qsi)); + if (ret < 0) { + error("failed to insert qgroup status item: %d (%m)", ret); + goto fail; + } + + qsi = btrfs_item_ptr(path.nodes[0], path.slots[0], + struct btrfs_qgroup_status_item); + btrfs_set_qgroup_status_generation(path.nodes[0], qsi, trans->transid); + btrfs_set_qgroup_status_rescan(path.nodes[0], qsi, 0); + flags = BTRFS_QGROUP_STATUS_FLAG_ON; + if (simple) + btrfs_set_qgroup_status_enable_gen(path.nodes[0], qsi, trans->transid); + else + flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT; + + btrfs_set_qgroup_status_version(path.nodes[0], qsi, 1); + btrfs_set_qgroup_status_flags(path.nodes[0], qsi, flags); + btrfs_release_path(&path); + + /* Create the qgroup items */ + ret = create_qgroups(fs_info, trans); + if (ret < 0) { + error("failed to create qgroup items for subvols %d (%m)", ret); + goto fail; + } + + /* Set squota incompat flag */ + if (simple) { + super_flags |= BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA; + btrfs_set_super_incompat_flags(sb, super_flags); + } + + ret = btrfs_commit_transaction(trans, fs_info->tree_root); + if (ret < 0) { + errno = -ret; + error_msg(ERROR_MSG_COMMIT_TRANS, "%m"); + return ret; + } + return ret; +fail: + btrfs_abort_transaction(trans, ret); + return ret; +} diff --git a/tune/tune.h b/tune/tune.h index 0ef249d89..cbf33b2e7 100644 --- a/tune/tune.h +++ b/tune/tune.h @@ -33,4 +33,7 @@ int 
convert_to_bg_tree(struct btrfs_fs_info *fs_info); int convert_to_extent_tree(struct btrfs_fs_info *fs_info); int btrfs_change_csum_type(struct btrfs_fs_info *fs_info, u16 new_csum_type); + +int enable_quota(struct btrfs_fs_info *fs_info, bool simple); + #endif From patchwork Wed Jul 5 23:36:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303052 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89A2AEB64DD for ; Wed, 5 Jul 2023 23:37:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231861AbjGEXhq (ORCPT ); Wed, 5 Jul 2023 19:37:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231794AbjGEXhq (ORCPT ); Wed, 5 Jul 2023 19:37:46 -0400 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4793812A for ; Wed, 5 Jul 2023 16:37:45 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id B52345C01DF; Wed, 5 Jul 2023 19:37:44 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Wed, 05 Jul 2023 19:37:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bur.io; h=cc :content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm3; t=1688600264; x= 1688686664; bh=0bgg29wXyH/HJ/8250BxoeU/AmHZbhrqeuGkDdQREj4=; b=E lZyt7rGc7Xj2OLql4ev/P5jptPd/COfvUZyU1LYun+pnSy1BcNXKrGL3qHig6MSX 
Pyw50p9pYHe3WWPxYI7KX3zbbhGueIV9M0VV3gAbou0U6PV6wjICYpykL9I4LfNU 8pSpBez+/Aav4vTcQYT+3EPSf6cM88zQvVmHLPXZTp0cEDT5te73CQZzlB6lV0iu 7rJM7XORi3tEdC4Jq2hSpX9xO7eOqaEoKlDCCNA+oh7ssxnfhRfoIAlxLyqAm/nO QPUxRrIilpzjPZXjRUtG2NOp5IzNtg/cXwUJlfOF/zucUrwWbxMNpFZ+KPXlxU3b 54f2i4ATQK3pEmwOSFzCg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:date:feedback-id:feedback-id:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm2; t=1688600264; x=1688686664; bh=0 bgg29wXyH/HJ/8250BxoeU/AmHZbhrqeuGkDdQREj4=; b=I1XePTkCV/CIIy4gK tXFoLd4MKT1hDGBIYBkuT+wOTtKrLiI5iztFcppvkNfGEBcCIWu7XJLEPgIkB5Ik 0D9ykqyw7lZcFHR3Pf3ZHLjLDAS51wgMj/Z/xqOLWUV14VBOJFBYoWy2RlxKS9hu V6XV1PIMkOLBTX86j0BVTKKI1jwe6mTeseTETmBLqRg8eeBz7pV+4VzI+rPNBKhl abvFsbe7XQ8s6kFLvDHmUtzz3bWZoCOGGX/lvXLOYHbg+CjDA2YauaRnO6Hm12RA VCeqf2X8G8fo4CVgD/nKd/TkPzyXUSN7lvNi5zXUdBS30p+LA5w3HLxloVGYJ4Xk 8t8cA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrudekgddvgecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeeuohhrihhsuceuuhhrkhhovhcuoegsohhrihhssegsuhhrrdhi oheqnecuggftrfgrthhtvghrnhepieeuffeuvdeiueejhfehiefgkeevudejjeejffevvd ehtddufeeihfekgeeuheelnecuvehluhhsthgvrhfuihiivgepvdenucfrrghrrghmpehm rghilhhfrhhomhepsghorhhishessghurhdrihho X-ME-Proxy: Feedback-ID: i083147f8:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 5 Jul 2023 19:37:44 -0400 (EDT) From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 7/8] btrfs-progs: simple quotas enable cmd Date: Wed, 5 Jul 2023 16:36:26 -0700 Message-ID: <83c39c42c2d04a3d6bf7fdb7f63a31d3def1223f.1688599734.git.boris@bur.io> X-Mailer: git-send-email 2.41.0 In-Reply-To: 
References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add a --simple flag to btrfs quota enable. If set, this enables simple quotas instead of full qgroups. This re-uses the deprecated 'status' field of the quota ioctl to avoid adding a new ioctl. Signed-off-by: Boris Burkov --- cmds/quota.c | 41 ++++++++++++++++++++++++++++++-------- kernel-shared/uapi/btrfs.h | 3 +++ 2 files changed, 36 insertions(+), 8 deletions(-) diff --git a/cmds/quota.c b/cmds/quota.c index cd874f9ed..31d36fcd8 100644 --- a/cmds/quota.c +++ b/cmds/quota.c @@ -34,19 +34,17 @@ static const char * const quota_cmd_group_usage[] = { NULL }; -static int quota_ctl(int cmd, int argc, char **argv) +static int quota_ctl(int cmd, char *path, bool simple) { int ret = 0; int fd; - char *path = argv[1]; struct btrfs_ioctl_quota_ctl_args args; DIR *dirstream = NULL; - if (check_argc_exact(argc, 2)) - return -1; - memset(&args, 0, sizeof(args)); args.cmd = cmd; + if (cmd == BTRFS_QUOTA_CTL_ENABLE && simple) + args.status = BTRFS_QUOTA_CTL_ENABLE_SIMPLE_QUOTA; fd = btrfs_open_dir(path, &dirstream, 1); if (fd < 0) @@ -67,16 +65,40 @@ static const char * const cmd_quota_enable_usage[] = { "Any data already present on the filesystem will not count towards", "the space usage numbers. 
It is recommended to enable quota for a", "filesystem before writing any data to it.", + "", + "-s|--simple simple qgroups account ownership by extent lifetime rather than backref walks", NULL }; static int cmd_quota_enable(const struct cmd_struct *cmd, int argc, char **argv) { int ret; + bool simple = false; - clean_args_no_options(cmd, argc, argv); + optind = 0; + while (1) { + static const struct option long_options[] = { + {"simple", no_argument, NULL, 's'}, + {NULL, 0, NULL, 0} + }; + int c; - ret = quota_ctl(BTRFS_QUOTA_CTL_ENABLE, argc, argv); + c = getopt_long(argc, argv, "s", long_options, NULL); + if (c < 0) + break; + + switch (c) { + case 's': + simple = true; + break; + default: + usage_unknown_option(cmd, argv); + } + } + if (check_argc_exact(argc - optind, 1)) + return -1; + + ret = quota_ctl(BTRFS_QUOTA_CTL_ENABLE, argv[optind], simple); if (ret < 0) usage(cmd, 1); @@ -97,7 +119,10 @@ static int cmd_quota_disable(const struct cmd_struct *cmd, clean_args_no_options(cmd, argc, argv); - ret = quota_ctl(BTRFS_QUOTA_CTL_DISABLE, argc, argv); + if (check_argc_exact(argc, 2)) + return -1; + + ret = quota_ctl(BTRFS_QUOTA_CTL_DISABLE, argv[1], false); if (ret < 0) usage(cmd, 1); diff --git a/kernel-shared/uapi/btrfs.h b/kernel-shared/uapi/btrfs.h index d312b9f4f..34c295fd6 100644 --- a/kernel-shared/uapi/btrfs.h +++ b/kernel-shared/uapi/btrfs.h @@ -786,9 +786,12 @@ struct btrfs_ioctl_get_dev_stats { }; _static_assert(sizeof(struct btrfs_ioctl_get_dev_stats) == 1032); +/* cmd values */ #define BTRFS_QUOTA_CTL_ENABLE 1 #define BTRFS_QUOTA_CTL_DISABLE 2 #define BTRFS_QUOTA_CTL_RESCAN__NOTUSED 3 +/* status values */ +#define BTRFS_QUOTA_CTL_ENABLE_SIMPLE_QUOTA (1UL) struct btrfs_ioctl_quota_ctl_args { __u64 cmd; __u64 status; From patchwork Wed Jul 5 23:36:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13303054 Return-Path: X-Spam-Checker-Version: 
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 8/8] btrfs-progs: tree-checker: handle owner ref items
Date: Wed, 5 Jul 2023 16:36:27 -0700
Message-ID: <45586f6889c79d6e1708b2b78ee9043b15cf6dff.1688599734.git.boris@bur.io>
X-Mailer: git-send-email 2.41.0
X-Mailing-List: linux-btrfs@vger.kernel.org

Add the new OWNER_REF inline items to the tree-checker extent item
checking code. We could somehow validate the root id for being a valid
fstree id, but just skipping it seems fine as well.

Signed-off-by: Boris Burkov
Reviewed-by: Josef Bacik
---
 kernel-shared/tree-checker.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel-shared/tree-checker.c b/kernel-shared/tree-checker.c
index 107975891..2f834cf33 100644
--- a/kernel-shared/tree-checker.c
+++ b/kernel-shared/tree-checker.c
@@ -1477,6 +1477,8 @@ static int check_extent_item(struct extent_buffer *leaf,
 			}
 			inline_refs += btrfs_shared_data_ref_count(leaf, sref);
 			break;
+		case BTRFS_EXTENT_OWNER_REF_KEY:
+			break;
 		default:
 			extent_err(leaf, slot, "unknown inline ref type: %u",
 				   inline_type);
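
For readers who want to see the effect of the hunk above in isolation, here is a minimal standalone sketch of the same switch shape. It is not btrfs-progs code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the enum values are illustrative rather than the on-disk key numbers, and the struct is a simplified stand-in for an inline ref. It only shows the idea that an owner ref is accepted without contributing to the reference count, while an unknown type is still rejected.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for inline ref type keys. The values are
 * illustrative only and are NOT the on-disk btrfs key numbers.
 */
enum sketch_ref_type {
	SKETCH_TREE_BLOCK_REF = 1,
	SKETCH_SHARED_DATA_REF = 2,
	SKETCH_OWNER_REF = 3,
};

/* Hypothetical, simplified inline ref: a type plus a reference count. */
struct sketch_inline_ref {
	int type;
	unsigned int count;
};

/*
 * Sum the references carried by the inline refs of one extent item.
 * Returns 0 and stores the sum in *total on success, or -1 as soon as
 * an unknown inline ref type is seen (*total is left untouched then).
 */
int sketch_count_inline_refs(const struct sketch_inline_ref *refs,
			     size_t nr, unsigned int *total)
{
	unsigned int sum = 0;
	size_t i;

	assert(total != NULL);

	for (i = 0; i < nr; i++) {
		switch (refs[i].type) {
		case SKETCH_TREE_BLOCK_REF:
			/* A tree block ref always stands for one reference. */
			sum += 1;
			break;
		case SKETCH_SHARED_DATA_REF:
			/* A shared data ref carries its own count. */
			sum += refs[i].count;
			break;
		case SKETCH_OWNER_REF:
			/* Known type that holds no reference: skip it. */
			break;
		default:
			/* Anything else is treated as corruption. */
			return -1;
		}
	}

	*total = sum;
	return 0;
}
```

Without the SKETCH_OWNER_REF case, an item carrying an owner ref would fall through to the default branch and be rejected as unknown, which is presumably the failure the patch avoids. As the commit message notes, the root id stored in the owner ref is deliberately not validated here.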