From patchwork Wed May 8 12:17:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658653 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6EB4183A12 for ; Wed, 8 May 2024 12:17:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170655; cv=none; b=kT/RXSSPoCP3CzcBjnLfMMKibaP0D2x0zJZgtY1iNuRVlFadxExdes6pAvejF71f1RBv00jkmdUjQf/nC/Xz45GUdU/h7Y+GB0pk4OcuOdmweJ11b09n676UffGP/Ak70/9LP7WizRaC4ScIy1HTi5xNyEamrsmqYSC2pQrv0yg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170655; c=relaxed/simple; bh=EyBRNjfnPcGuOPjs3FXXiJPJ6vgrVJSJOeHSgkwTIoY=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=FKAjfXEVu9tAUrabSUvc0QG6pF8URIjWs4C7weIRFYXcvBwcXaItcbSJ2kMJxbKJhbTy62rgVXoa9sxNJAztqBXOR16EQZ4wnc3CAbn+uctASs3wgdfACt+uqifnYtWVQZE+xRu9nn8Nz+sz9aXs8RvIZfjcAJ3UAVhPSkDL64g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=lBJgiROo; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="lBJgiROo" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7F51DC3277B for ; Wed, 8 May 2024 12:17:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170655; bh=EyBRNjfnPcGuOPjs3FXXiJPJ6vgrVJSJOeHSgkwTIoY=; h=From:To:Subject:Date:In-Reply-To:References:From; b=lBJgiROonmQT3wzKOafERMc7McDUSUQupzO0wAIR8XDW2u+/obUky+wQfeU5jAmla vNP8yLHcgAtisRM9y1cFgusGqWbaGYPwCmT+cl1Cg/vyBY6sFLUvxqNCEqT5eEFDIP NFm20HcTIQ/e5e0f+ARatLXU5j6clDb48/XhugY8JoJoePW7tPhSaYAIyj+Fb8mvWe CsN5yhxiWKXwXucU0kLAa1IqYey3G54q7J/WalnKUPvTe/5I8tn/DOBB8geaVo5l/l wT76T2u2/hb1q0IwmQwtVvWzW+ATWL4INY/7KdqrDPrS+6x5fqeKGooXEmRQLjLC9l r04rcfR3mKwyA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 1/8] btrfs: use an xarray to track open inodes in a root Date: Wed, 8 May 2024 13:17:24 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Currently we use a red black tree (rb-tree) to track the currently open inodes of a root (in struct btrfs_root::inode_tree). This however is not very efficient when the number of inodes is large since rb-trees are binary trees. For example for 100K open inodes, the tree has a depth of 17. Besides that, inserting into the tree requires navigating through it and pulling useless cache lines in the process since the red black tree nodes are embedded within the btrfs inode - on the other hand, by being embedded, it requires no extra memory allocations. We can improve this by using an xarray instead, which is efficient when indices are densely clustered (such as inode numbers), is more cache friendly and behaves like a resizable array, with a much better search and insertion complexity than a red black tree. This only has one small disadvantage which is that insertion will sometimes require allocating memory for the xarray - which may fail (not that often since it uses a kmem_cache) - but on the other hand we can reduce the btrfs inode structure size by 24 bytes (from 1064 down to 1040 bytes) after removing the embedded red black tree node, which after the next patches will allow to reduce the size of the structure to 1024 bytes, meaning we will be able to store 4 inodes per 4K page instead of 3 inodes. This change does a straighforward change to use an xarray, and results in a transaction abort if we can't allocate memory for the xarray when creating an inode - but the next patch changes things so that we don't need to abort. Running the following fs_mark test showed some improvements: $ cat test.sh #!/bin/bash DEV=/dev/nullb0 MNT=/mnt/nullb0 MOUNT_OPTIONS="-o ssd" FILES=100000 THREADS=$(nproc --all) echo "performance" | \ tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor mkfs.btrfs -f $DEV mount $MOUNT_OPTIONS $DEV $MNT OPTS="-S 0 -L 5 -n $FILES -s 0 -t $THREADS -k" for ((i = 1; i <= $THREADS; i++)); do OPTS="$OPTS -d $MNT/d$i" done fs_mark $OPTS umount $MNT Before this patch: FSUse% Count Size Files/sec App Overhead 10 1200000 0 92081.6 12505547 16 2400000 0 138222.6 13067072 23 3600000 0 148833.1 13290336 43 4800000 0 97864.7 13931248 53 6000000 0 85597.3 14384313 After this patch: FSUse% Count Size Files/sec App Overhead 10 1200000 0 93225.1 12571078 16 2400000 0 146720.3 12805007 23 3600000 0 160626.4 13073835 46 4800000 0 116286.2 13802927 53 6000000 0 90087.9 14754892 The test was run with a release kernel config (Debian's default config). Also capturing the insertion times into the rb tree and into the xarray, that is measuring the duration of the old function inode_tree_add() and the duration of the new btrfs_add_inode_to_root() function, gave the following results (in nanoseconds): Before this patch, inode_tree_add() execution times: Count: 5000000 Range: 0.000 - 5536887.000; Mean: 775.674; Median: 729.000; Stddev: 4820.961 Percentiles: 90th: 1015.000; 95th: 1139.000; 99th: 1397.000 0.000 - 7.816: 40 | 7.816 - 37.858: 209 | 37.858 - 170.278: 6059 | 170.278 - 753.961: 2754890 ##################################################### 753.961 - 3326.728: 2232312 ########################################### 3326.728 - 14667.018: 4366 | 14667.018 - 64652.943: 852 | 64652.943 - 284981.761: 550 | 284981.761 - 1256150.914: 221 | 1256150.914 - 5536887.000: 7 | After this patch, btrfs_add_inode_to_root() execution times: Count: 5000000 Range: 0.000 - 2900652.000; Mean: 272.148; Median: 241.000; Stddev: 2873.369 Percentiles: 90th: 342.000; 95th: 432.000; 99th: 572.000 0.000 - 7.264: 104 | 7.264 - 33.145: 352 | 33.145 - 140.081: 109606 # 140.081 - 581.930: 4840090 ##################################################### 581.930 - 2407.590: 43532 | 2407.590 - 9950.979: 2245 | 9950.979 - 41119.278: 514 | 41119.278 - 169902.616: 155 | 169902.616 - 702018.539: 47 | 702018.539 - 2900652.000: 9 | Average, percentiles, standard deviation, etc, are all much better. Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo --- fs/btrfs/btrfs_inode.h | 3 - fs/btrfs/ctree.h | 7 ++- fs/btrfs/disk-io.c | 6 +- fs/btrfs/inode.c | 128 ++++++++++++++++------------------------- 4 files changed, 58 insertions(+), 86 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 91c994b569f3..e577b9745884 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -155,9 +155,6 @@ struct btrfs_inode { */ struct list_head delalloc_inodes; - /* node for the red-black tree that links inodes in subvolume root */ - struct rb_node rb_node; - unsigned long runtime_flags; /* full 64 bit generation number, struct vfs_inode doesn't have a big diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index c03c58246033..aa2568f86dc9 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -222,8 +222,11 @@ struct btrfs_root { struct list_head root_list; spinlock_t inode_lock; - /* red-black tree that keeps track of in-memory inodes */ - struct rb_root inode_tree; + /* + * Xarray that keeps track of in-memory inodes, protected by the lock + * @inode_lock. + */ + struct xarray inodes; /* * Xarray that keeps track of delayed nodes of every inode, protected diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index a91a8056758a..ed40fe1db53e 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -662,7 +662,7 @@ static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info, root->free_objectid = 0; root->nr_delalloc_inodes = 0; root->nr_ordered_extents = 0; - root->inode_tree = RB_ROOT; + xa_init(&root->inodes); xa_init(&root->delayed_nodes); btrfs_init_root_block_rsv(root); @@ -1854,7 +1854,8 @@ void btrfs_put_root(struct btrfs_root *root) return; if (refcount_dec_and_test(&root->refs)) { - WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree)); + if (WARN_ON(!xa_empty(&root->inodes))) + xa_destroy(&root->inodes); WARN_ON(test_bit(BTRFS_ROOT_DEAD_RELOC_TREE, &root->state)); if (root->anon_dev) free_anon_bdev(root->anon_dev); @@ -1939,7 +1940,6 @@ static int btrfs_init_btree_inode(struct super_block *sb) inode->i_mapping->a_ops = &btree_aops; mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS); - RB_CLEAR_NODE(&BTRFS_I(inode)->rb_node); extent_io_tree_init(fs_info, &BTRFS_I(inode)->io_tree, IO_TREE_BTREE_INODE_IO); extent_map_tree_init(&BTRFS_I(inode)->extent_tree); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d0274324c75a..450fe1582f1d 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5493,58 +5493,51 @@ static int fixup_tree_root_location(struct btrfs_fs_info *fs_info, return err; } -static void inode_tree_add(struct btrfs_inode *inode) +static int btrfs_add_inode_to_root(struct btrfs_inode *inode) { struct btrfs_root *root = inode->root; - struct btrfs_inode *entry; - struct rb_node **p; - struct rb_node *parent; - struct rb_node *new = &inode->rb_node; - u64 ino = btrfs_ino(inode); + struct btrfs_inode *existing; + const u64 ino = btrfs_ino(inode); + int ret; if (inode_unhashed(&inode->vfs_inode)) - return; - parent = NULL; + return 0; + + ret = xa_reserve(&root->inodes, ino, GFP_NOFS); + if (ret) + return ret; + spin_lock(&root->inode_lock); - p = &root->inode_tree.rb_node; - while (*p) { - parent = *p; - entry = rb_entry(parent, struct btrfs_inode, rb_node); + existing = xa_store(&root->inodes, ino, inode, GFP_ATOMIC); + spin_unlock(&root->inode_lock); - if (ino < btrfs_ino(entry)) - p = &parent->rb_left; - else if (ino > btrfs_ino(entry)) - p = &parent->rb_right; - else { - WARN_ON(!(entry->vfs_inode.i_state & - (I_WILL_FREE | I_FREEING))); - rb_replace_node(parent, new, &root->inode_tree); - RB_CLEAR_NODE(parent); - spin_unlock(&root->inode_lock); - return; - } + if (xa_is_err(existing)) { + ret = xa_err(existing); + ASSERT(ret != -EINVAL); + ASSERT(ret != -ENOMEM); + return ret; + } else if (existing) { + WARN_ON(!(existing->vfs_inode.i_state & (I_WILL_FREE | I_FREEING))); } - rb_link_node(new, parent, p); - rb_insert_color(new, &root->inode_tree); - spin_unlock(&root->inode_lock); + + return 0; } -static void inode_tree_del(struct btrfs_inode *inode) +static void btrfs_del_inode_from_root(struct btrfs_inode *inode) { struct btrfs_root *root = inode->root; - int empty = 0; + struct btrfs_inode *entry; + bool empty = false; spin_lock(&root->inode_lock); - if (!RB_EMPTY_NODE(&inode->rb_node)) { - rb_erase(&inode->rb_node, &root->inode_tree); - RB_CLEAR_NODE(&inode->rb_node); - empty = RB_EMPTY_ROOT(&root->inode_tree); - } + entry = xa_erase(&root->inodes, btrfs_ino(inode)); + if (entry == inode) + empty = xa_empty(&root->inodes); spin_unlock(&root->inode_lock); if (empty && btrfs_root_refs(&root->root_item) == 0) { spin_lock(&root->inode_lock); - empty = RB_EMPTY_ROOT(&root->inode_tree); + empty = xa_empty(&root->inodes); spin_unlock(&root->inode_lock); if (empty) btrfs_add_dead_root(root); @@ -5613,8 +5606,13 @@ struct inode *btrfs_iget_path(struct super_block *s, u64 ino, ret = btrfs_read_locked_inode(inode, path); if (!ret) { - inode_tree_add(BTRFS_I(inode)); - unlock_new_inode(inode); + ret = btrfs_add_inode_to_root(BTRFS_I(inode)); + if (ret) { + iget_failed(inode); + inode = ERR_PTR(ret); + } else { + unlock_new_inode(inode); + } } else { iget_failed(inode); /* @@ -6426,7 +6424,11 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, } } - inode_tree_add(BTRFS_I(inode)); + ret = btrfs_add_inode_to_root(BTRFS_I(inode)); + if (ret) { + btrfs_abort_transaction(trans, ret); + goto discard; + } trace_btrfs_inode_new(inode); btrfs_set_inode_last_trans(trans, BTRFS_I(inode)); @@ -8466,7 +8468,6 @@ struct inode *btrfs_alloc_inode(struct super_block *sb) ei->ordered_tree_last = NULL; INIT_LIST_HEAD(&ei->delalloc_inodes); INIT_LIST_HEAD(&ei->delayed_iput); - RB_CLEAR_NODE(&ei->rb_node); init_rwsem(&ei->i_mmap_lock); return inode; @@ -8538,7 +8539,7 @@ void btrfs_destroy_inode(struct inode *vfs_inode) } } btrfs_qgroup_check_reserved_leak(inode); - inode_tree_del(inode); + btrfs_del_inode_from_root(inode); btrfs_drop_extent_map_range(inode, 0, (u64)-1, false); btrfs_inode_clear_file_extent_range(inode, 0, (u64)-1); btrfs_put_root(inode->root); @@ -10857,52 +10858,23 @@ void btrfs_assert_inode_range_clean(struct btrfs_inode *inode, u64 start, u64 en */ struct btrfs_inode *btrfs_find_first_inode(struct btrfs_root *root, u64 min_ino) { - struct rb_node *node; - struct rb_node *prev; struct btrfs_inode *inode; + unsigned long from = min_ino; spin_lock(&root->inode_lock); -again: - node = root->inode_tree.rb_node; - prev = NULL; - while (node) { - prev = node; - inode = rb_entry(node, struct btrfs_inode, rb_node); - if (min_ino < btrfs_ino(inode)) - node = node->rb_left; - else if (min_ino > btrfs_ino(inode)) - node = node->rb_right; - else + while (true) { + inode = xa_find(&root->inodes, &from, ULONG_MAX, XA_PRESENT); + if (!inode) + break; + if (igrab(&inode->vfs_inode)) break; - } - - if (!node) { - while (prev) { - inode = rb_entry(prev, struct btrfs_inode, rb_node); - if (min_ino <= btrfs_ino(inode)) { - node = prev; - break; - } - prev = rb_next(prev); - } - } - - while (node) { - inode = rb_entry(prev, struct btrfs_inode, rb_node); - if (igrab(&inode->vfs_inode)) { - spin_unlock(&root->inode_lock); - return inode; - } - - min_ino = btrfs_ino(inode) + 1; - if (cond_resched_lock(&root->inode_lock)) - goto again; - node = rb_next(node); + from = btrfs_ino(inode) + 1; + cond_resched_lock(&root->inode_lock); } spin_unlock(&root->inode_lock); - return NULL; + return inode; } static const struct inode_operations btrfs_dir_inode_operations = { From patchwork Wed May 8 12:17:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658654 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6B1BB83CBE for ; Wed, 8 May 2024 12:17:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170656; cv=none; b=BUb9cJTtMI76KB/fn6TL+h7O456E6TUZ3nHNfovjfb4DbQf6ofdvr5xF/d+8mpQqGc52O4Ik+Oz0sPvp5hzolBhBsm2ySXjaY+OD6bzPkOZAQGiVZVViucW7Udj0UwRjWvD3EbBG389p3HLx5N52GFwUwcElQjLLoqvf7lDJSHk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170656; c=relaxed/simple; bh=7k6Yg+N7w1RI3Eak9CJ7F8XP9uvtAz94b4glfhcsF+8=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ua24N7hJb8rZ1bjbsqodk/3L1tioKh+OaHFhe74RgZ/uXr2orKV4DHh0aB1/JTp1UDuIxLW7FLEAg1cUxCisgZCg8i6WCvbJneMbq2K+z5zC3/EiOjnPjFWDdUCyL8taBVexfelzZAsk6Od0bgl46KvC6UewHxN95x2ffZorAZg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=qipN3iFi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="qipN3iFi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 80D72C4AF17 for ; Wed, 8 May 2024 12:17:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170656; bh=7k6Yg+N7w1RI3Eak9CJ7F8XP9uvtAz94b4glfhcsF+8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=qipN3iFiFUfPfpKggTdB+2dcvxko70NLIyGUFrg2cZOa2oUBLH8/RI9yxz3Ws4eF2 tkDTCvBssbMLLf5izQm6zihenSxWueaF8SrTUoV6788nsBQfBSQq2/76y8dWnAWV5g Tr76ADIR4zJ4HdGcBvNN5PLbcSYPRJXZkc2HTWu0iuMy74/0ILCYEsdAR6iFhfnwev NUpceOVtnNN84LMk0JpJvPswGvTqfpp3K6MKwQdZGWYcDmnQI5Afph4YGySAkhSZoi k6hId0rgxZ+K6JnBSDHVRRaTn44JruAGSkz8p8k6MawV37n3QszCM5R3/Ux3nec//3 30zb53qUgmL4w== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 2/8] btrfs: preallocate inodes xarray entry to avoid transaction abort Date: Wed, 8 May 2024 13:17:25 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana When creating a new inode, at btrfs_create_new_inode(), one of the very last steps is to add the inode to the root's inodes xarray. This often requires allocating memory which may fail (even though xarrays have a dedicated kmem_cache which make it less likely to fail), and at that point we are forced to abort the current transaction (as some, but not all, of the inode metadata was added to its subvolume btree). To avoid a transaction abort, preallocate memory for the xarray early at btrfs_create_new_inode(), so that if we fail we don't need to abort the transaction and the insertion into the xarray is guaranteed to succeed. Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo --- fs/btrfs/inode.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 450fe1582f1d..85dbc19c2f6f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5493,7 +5493,7 @@ static int fixup_tree_root_location(struct btrfs_fs_info *fs_info, return err; } -static int btrfs_add_inode_to_root(struct btrfs_inode *inode) +static int btrfs_add_inode_to_root(struct btrfs_inode *inode, bool prealloc) { struct btrfs_root *root = inode->root; struct btrfs_inode *existing; @@ -5503,9 +5503,11 @@ static int btrfs_add_inode_to_root(struct btrfs_inode *inode) if (inode_unhashed(&inode->vfs_inode)) return 0; - ret = xa_reserve(&root->inodes, ino, GFP_NOFS); - if (ret) - return ret; + if (prealloc) { + ret = xa_reserve(&root->inodes, ino, GFP_NOFS); + if (ret) + return ret; + } spin_lock(&root->inode_lock); existing = xa_store(&root->inodes, ino, inode, GFP_ATOMIC); @@ -5606,7 +5608,7 @@ struct inode *btrfs_iget_path(struct super_block *s, u64 ino, ret = btrfs_read_locked_inode(inode, path); if (!ret) { - ret = btrfs_add_inode_to_root(BTRFS_I(inode)); + ret = btrfs_add_inode_to_root(BTRFS_I(inode), true); if (ret) { iget_failed(inode); inode = ERR_PTR(ret); @@ -6237,6 +6239,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, struct btrfs_item_batch batch; unsigned long ptr; int ret; + bool xa_reserved = false; path = btrfs_alloc_path(); if (!path) @@ -6251,6 +6254,11 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, goto out; inode->i_ino = objectid; + ret = xa_reserve(&root->inodes, objectid, GFP_NOFS); + if (ret) + goto out; + xa_reserved = true; + if (args->orphan) { /* * O_TMPFILE, set link count to 0, so that after this point, we @@ -6424,8 +6432,9 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, } } - ret = btrfs_add_inode_to_root(BTRFS_I(inode)); - if (ret) { + ret = btrfs_add_inode_to_root(BTRFS_I(inode), false); + if (WARN_ON(ret)) { + /* Shouldn't happen, we used xa_reserve() before. */ btrfs_abort_transaction(trans, ret); goto discard; } @@ -6456,6 +6465,9 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, ihold(inode); discard_new_inode(inode); out: + if (xa_reserved) + xa_release(&root->inodes, objectid); + btrfs_free_path(path); return ret; } From patchwork Wed May 8 12:17:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658655 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67A5F8405C for ; Wed, 8 May 2024 12:17:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170657; cv=none; b=V7H3sy3+tnxLZH9GB33oIBoz+vjRNqyhyDObIugG11fDfhHZKYu3otrA00H4ntbJ79ZhkDH5NqWCP7WJL9IX2XpOM7wlr4TKjtObZc/857HfJl2vn3LbC5cznjnkDRVwODFfM3RrKqQi5GPmlwWYv9fYd8Kd2LPx360Fv/wbFDM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170657; c=relaxed/simple; bh=iQNe+PvtKbO1JDkFvIuzXNqeKwpNZ8IqIusUBnFkdos=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=dJJjCipsCYJlm76A1LeYJFZjBj+KgY9k+HLaUpPPJSsGqLrENkUIov9LOtSeCwJBvf/aDNeSCsH1h0P7lVb9xG0WWKamfGgNuzoWJtpqdBssPWCuwjVTnaBwn0lqYhQE5zLFadFyXV0S24GamHhV9p/A0qc1DcvLkaiysu55uqQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=sE60Q8Ls; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="sE60Q8Ls" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81339C4AF18 for ; Wed, 8 May 2024 12:17:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170657; bh=iQNe+PvtKbO1JDkFvIuzXNqeKwpNZ8IqIusUBnFkdos=; h=From:To:Subject:Date:In-Reply-To:References:From; b=sE60Q8LsJuGGpyD1bHlkpWCbHc5gwXOxyuaBRabarJWQSP1lQMJPRTlZTbmY5UEcc ROhpmLcJxJWSPrSbOxwUMU4L3UEQMy6uucZgKx/cjV1wS9aDUA6IvaECDbXHLYUxB9 HmYm+WJQdtCzQDj11KeU6q8hDTqOmsitZv9nvcXkPuMxoXJa9gU8+GiUVA6U4DEgCw lEQic+jVRFoc5hiQ9k9FzZGtQW92fULQR7hj65LZ0kYMjqBJSCtBdGwUW5pBztXpJB nTpDhBA6b27n/dJpo1oRWML5lLWJkbz76YTRQVvIfQ7zIlEzP/EqIxCo548W0/n6YD uvu6PkH49KScg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 3/8] btrfs: reduce nesting and deduplicate error handling at btrfs_iget_path() Date: Wed, 8 May 2024 13:17:26 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Make btrfs_iget_path() simpler and easier to read by avoiding nesting of if-then-else statements and having an error label to do all the error handling instead of repeating it a couple times. Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo --- fs/btrfs/inode.c | 44 +++++++++++++++++++++----------------------- 1 file changed, 21 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 85dbc19c2f6f..8ea9fd4c2b66 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5598,37 +5598,35 @@ struct inode *btrfs_iget_path(struct super_block *s, u64 ino, struct btrfs_root *root, struct btrfs_path *path) { struct inode *inode; + int ret; inode = btrfs_iget_locked(s, ino, root); if (!inode) return ERR_PTR(-ENOMEM); - if (inode->i_state & I_NEW) { - int ret; + if (!(inode->i_state & I_NEW)) + return inode; - ret = btrfs_read_locked_inode(inode, path); - if (!ret) { - ret = btrfs_add_inode_to_root(BTRFS_I(inode), true); - if (ret) { - iget_failed(inode); - inode = ERR_PTR(ret); - } else { - unlock_new_inode(inode); - } - } else { - iget_failed(inode); - /* - * ret > 0 can come from btrfs_search_slot called by - * btrfs_read_locked_inode, this means the inode item - * was not found. - */ - if (ret > 0) - ret = -ENOENT; - inode = ERR_PTR(ret); - } - } + ret = btrfs_read_locked_inode(inode, path); + /* + * ret > 0 can come from btrfs_search_slot called by + * btrfs_read_locked_inode(), this means the inode item was not found. + */ + if (ret > 0) + ret = -ENOENT; + if (ret < 0) + goto error; + + ret = btrfs_add_inode_to_root(BTRFS_I(inode), true); + if (ret < 0) + goto error; + + unlock_new_inode(inode); return inode; +error: + iget_failed(inode); + return ERR_PTR(ret); } struct inode *btrfs_iget(struct super_block *s, u64 ino, struct btrfs_root *root) From patchwork Wed May 8 12:17:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658656 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3328684A38 for ; Wed, 8 May 2024 12:17:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170658; cv=none; b=OQz193bXMOS9qy1Sd1AS6IaaFD1I/9DM/dS0akD49Eb8PvoOqRHSoLXbTJZdrRWRU6UVo3F10hmge/eUqow5urVI9CqNNhIKZtkXpVrzMIWJcXBFG8SQZ6/vhXHqe1hBLaBgIiDFucTOyToNmX7PbTNynDeZwz9SHV4SPLyHmHQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170658; c=relaxed/simple; bh=/8CwFRoR2Um7c/d7Bya/ILPFkMCBDcQUigypf/cIeoQ=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=DsxQqqfr8d3sWMGnaGuI9G3ouhxW1H1kdGHqtDX7COOa8nTaivSs8eSoSYmYAtvwrN/gmWmyNB1o66T4pf6fiQMwj65cNHHLYHJFKh+EFpPGHarOKXFa54Bb/+n2cVI+ZYVcctVGxaRU3tSoP55loBCYm1hSk8BcbZFBYVDRLrg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=t6b/Y0iK; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="t6b/Y0iK" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 84293C113CC for ; Wed, 8 May 2024 12:17:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170658; bh=/8CwFRoR2Um7c/d7Bya/ILPFkMCBDcQUigypf/cIeoQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=t6b/Y0iKffK0lwsk6ynniOg5DIJh7m9oZjZhD5QlylhVg7Z4tb005ScrAkomsocI1 +dor4kyk6K1AssvE9nlBkzoEYYLxJxviJdIBSW934dGmCsD5cjYiwkx3iQssQHbKyp zbf9931i5YHZeG6Fmmh6Ax8m7lmgEIAyFfgizHcGaFa4HeVllh7G1AEg+KUZ8fgEEC mLp738w2BUzNt7yKKINvG1yRiqg58vhDR7ptFOGJQ8o4YGlBffyUPFOx8x9eDCbzIO DwqkUoQHhltn6oFDeVywpuRJNjWKzK5p2NaLWxYx9ffUyaF7T9UtG9DJt7I8nO2paC FtZs4SRh3p0eQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 4/8] btrfs: remove inode_lock from struct btrfs_root and use xarray locks Date: Wed, 8 May 2024 13:17:27 +0100 Message-Id: <51066985ea4e9ea16388854a1d48ee197f07219f.1715169723.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Currently we use the spinlock inode_lock from struct btrfs_root to serialize access to two different data structures: 1) The delayed inodes xarray (struct btrfs_root::delayed_nodes); 2) The inodes xarray (struct btrfs_root::inodes). Instead of using our own lock, we can use the spinlock that is part of the xarray implementation, by using the xa_lock() and xa_unlock() APIs and using the xarray APIs with the double underscore prefix that don't take the xarray locks and assume the caller is using xa_lock() and xa_unlock(). So remove the spinlock inode_lock from struct btrfs_root and use the corresponding xarray locks. This brings 2 benefits: 1) We reduce the size of struct btrfs_root, from 1336 bytes down to 1328 bytes on a 64 bits release kernel config; 2) We reduce lock contention by not using anymore the same lock for changing two different and unrelated xarrays. Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo --- fs/btrfs/ctree.h | 1 - fs/btrfs/delayed-inode.c | 24 +++++++++++------------- fs/btrfs/disk-io.c | 1 - fs/btrfs/inode.c | 18 ++++++++---------- 4 files changed, 19 insertions(+), 25 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index aa2568f86dc9..1004cb934b4a 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -221,7 +221,6 @@ struct btrfs_root { struct list_head root_list; - spinlock_t inode_lock; /* * Xarray that keeps track of in-memory inodes, protected by the lock * @inode_lock. diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 95a0497fa866..1373f474c9b6 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -77,14 +77,14 @@ static struct btrfs_delayed_node *btrfs_get_delayed_node( return node; } - spin_lock(&root->inode_lock); + xa_lock(&root->delayed_nodes); node = xa_load(&root->delayed_nodes, ino); if (node) { if (btrfs_inode->delayed_node) { refcount_inc(&node->refs); /* can be accessed */ BUG_ON(btrfs_inode->delayed_node != node); - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); return node; } @@ -111,10 +111,10 @@ static struct btrfs_delayed_node *btrfs_get_delayed_node( node = NULL; } - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); return node; } - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); return NULL; } @@ -148,21 +148,21 @@ static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node( kmem_cache_free(delayed_node_cache, node); return ERR_PTR(-ENOMEM); } - spin_lock(&root->inode_lock); + xa_lock(&root->delayed_nodes); ptr = xa_load(&root->delayed_nodes, ino); if (ptr) { /* Somebody inserted it, go back and read it. */ - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); kmem_cache_free(delayed_node_cache, node); node = NULL; goto again; } - ptr = xa_store(&root->delayed_nodes, ino, node, GFP_ATOMIC); + ptr = __xa_store(&root->delayed_nodes, ino, node, GFP_ATOMIC); ASSERT(xa_err(ptr) != -EINVAL); ASSERT(xa_err(ptr) != -ENOMEM); ASSERT(ptr == NULL); btrfs_inode->delayed_node = node; - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); return node; } @@ -275,14 +275,12 @@ static void __btrfs_release_delayed_node( if (refcount_dec_and_test(&delayed_node->refs)) { struct btrfs_root *root = delayed_node->root; - spin_lock(&root->inode_lock); /* * Once our refcount goes to zero, nobody is allowed to bump it * back up. We can delete it now. */ ASSERT(refcount_read(&delayed_node->refs) == 0); xa_erase(&root->delayed_nodes, delayed_node->inode_id); - spin_unlock(&root->inode_lock); kmem_cache_free(delayed_node_cache, delayed_node); } } @@ -2057,9 +2055,9 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root) struct btrfs_delayed_node *node; int count; - spin_lock(&root->inode_lock); + xa_lock(&root->delayed_nodes); if (xa_empty(&root->delayed_nodes)) { - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); return; } @@ -2076,7 +2074,7 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root *root) if (count >= ARRAY_SIZE(delayed_nodes)) break; } - spin_unlock(&root->inode_lock); + xa_unlock(&root->delayed_nodes); index++; for (int i = 0; i < count; i++) { diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index ed40fe1db53e..d20e400a9ce3 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -674,7 +674,6 @@ static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info, INIT_LIST_HEAD(&root->ordered_extents); INIT_LIST_HEAD(&root->ordered_root); INIT_LIST_HEAD(&root->reloc_dirty_list); - spin_lock_init(&root->inode_lock); spin_lock_init(&root->delalloc_lock); spin_lock_init(&root->ordered_extent_lock); spin_lock_init(&root->accounting_lock); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8ea9fd4c2b66..4fd41d6b377f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5509,9 +5509,7 @@ static int btrfs_add_inode_to_root(struct btrfs_inode *inode, bool prealloc) return ret; } - spin_lock(&root->inode_lock); existing = xa_store(&root->inodes, ino, inode, GFP_ATOMIC); - spin_unlock(&root->inode_lock); if (xa_is_err(existing)) { ret = xa_err(existing); @@ -5531,16 +5529,16 @@ static void btrfs_del_inode_from_root(struct btrfs_inode *inode) struct btrfs_inode *entry; bool empty = false; - spin_lock(&root->inode_lock); - entry = xa_erase(&root->inodes, btrfs_ino(inode)); + xa_lock(&root->inodes); + entry = __xa_erase(&root->inodes, btrfs_ino(inode)); if (entry == inode) empty = xa_empty(&root->inodes); - spin_unlock(&root->inode_lock); + xa_unlock(&root->inodes); if (empty && btrfs_root_refs(&root->root_item) == 0) { - spin_lock(&root->inode_lock); + xa_lock(&root->inodes); empty = xa_empty(&root->inodes); - spin_unlock(&root->inode_lock); + xa_unlock(&root->inodes); if (empty) btrfs_add_dead_root(root); } @@ -10871,7 +10869,7 @@ struct btrfs_inode *btrfs_find_first_inode(struct btrfs_root *root, u64 min_ino) struct btrfs_inode *inode; unsigned long from = min_ino; - spin_lock(&root->inode_lock); + xa_lock(&root->inodes); while (true) { inode = xa_find(&root->inodes, &from, ULONG_MAX, XA_PRESENT); if (!inode) @@ -10880,9 +10878,9 @@ struct btrfs_inode *btrfs_find_first_inode(struct btrfs_root *root, u64 min_ino) break; from = btrfs_ino(inode) + 1; - cond_resched_lock(&root->inode_lock); + cond_resched_lock(&root->inodes.xa_lock); } - spin_unlock(&root->inode_lock); + xa_unlock(&root->inodes); return inode; } From patchwork Wed May 8 12:17:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658657 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2FD7683CDC for ; Wed, 8 May 2024 12:17:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170659; cv=none; b=WcubOnY6ahujM2FEn2AP+O3WSZUzHuiAdRQcGSeLtePPkeefqfjOpveGxcIkdwif4zSusMAJmBc6Oq4F7Nxdcu8tSTMgBEReF0qKJ9Ja4EfhPb2cg1+O0/0o2vlRrgl1EX4Mht7y7fsBUM0PiqW8G4XAIA7cKM41D7xYHTfK9ck= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170659; c=relaxed/simple; bh=L1qTQoYBVjz1Zv4QybF5rZ/1skQYQ8tHpS2o2lnzfcQ=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hmZOENtWOW/4BJ4KSRUEdWj3DPMxUdj9PCbNATfPch5SXlfgB4tyFHDJfAJFn2sKbELpGhp7oWwOe3HjeEOUYGeZWBlEJJnjRv6ibaoeGbB/dEPlbU5Je14GDqbvuXlbQJj4GEvnmC5DbkXf3faeTiNktW8ZLDLuv+T0mjdWc6w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=bt1FgT1S; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="bt1FgT1S" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85885C3277B for ; Wed, 8 May 2024 12:17:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170659; bh=L1qTQoYBVjz1Zv4QybF5rZ/1skQYQ8tHpS2o2lnzfcQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=bt1FgT1S9tOwgv+oFtM6ocojFEeGnITxzu2y8hZPsIDj4F67IJU75hmXt35L8VXNp L17YRv6UL7MNrlKDRQ6RxUKjF1ctBLfOmh9/0s2Fbgb6GnksiTj6l3I1UZQnH2eDcs Tq1YUgwxdHH1ZLInJV58pe8GfKEPkA+Pbl57OORV2xVtZ0wx26RW8LDTxVB1xwzS7C BeR2wYh/Bfuo0zOqtkaEb0GP/pMZqBHvVGHEi6H36MgPueT+PmsMiGMRSkyUKBlnjj XySL6K+KLg47u+t0EcvGLeSO/S44arYWvYFYCZDnZCnemuMExUhchwUUVmnpTrvHQk 4s8/MBpkqunWw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 5/8] btrfs: unify index_cnt and csum_bytes from struct btrfs_inode Date: Wed, 8 May 2024 13:17:28 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana The index_cnt field of struct btrfs_inode is used only for two purposes: 1) To store the index for the next entry added to a directory; 2) For the data relocation inode to track the logical start address of the block group currently being relocated. For the relocation case we use index_cnt because it's not used for anything else in the relocation use case - we could have used other fields that are not used by relocation such as defrag_bytes, last_unlink_trans or last_reflink_trans for example (amongs others). Since the csum_bytes field is not used for directories, do the following changes: 1) Put index_cnt and csum_bytes in a union, and index_cnt is only initialized when the inode is a directory. The csum_bytes is only accessed in IO paths for regular files, so we're fine here; 2) Use the defrag_bytes field for relocation, since the data relocation inode is never used for defrag purposes. And to make the naming better, alias it to reloc_block_group_start by using a union. This reduces the size of struct btrfs_inode by 8 bytes in a release kernel, from 1040 bytes down to 1032 bytes. Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo --- fs/btrfs/btrfs_inode.h | 46 +++++++++++++++++++++++++--------------- fs/btrfs/delayed-inode.c | 3 ++- fs/btrfs/inode.c | 21 ++++++++++++------ fs/btrfs/relocation.c | 12 +++++------ fs/btrfs/tree-log.c | 3 ++- 5 files changed, 54 insertions(+), 31 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index e577b9745884..19bb3d057414 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -215,11 +215,20 @@ struct btrfs_inode { u64 last_dir_index_offset; }; - /* - * Total number of bytes pending defrag, used by stat to check whether - * it needs COW. Protected by 'lock'. - */ - u64 defrag_bytes; + union { + /* + * Total number of bytes pending defrag, used by stat to check whether + * it needs COW. Protected by 'lock'. + * Used by inodes other than the data relocation inode. + */ + u64 defrag_bytes; + + /* + * Logical address of the block group being relocated. + * Used only by the data relocation inode. + */ + u64 reloc_block_group_start; + }; /* * The size of the file stored in the metadata on disk. data=ordered @@ -228,12 +237,21 @@ struct btrfs_inode { */ u64 disk_i_size; - /* - * If this is a directory then index_cnt is the counter for the index - * number for new files that are created. For an empty directory, this - * must be initialized to BTRFS_DIR_START_INDEX. - */ - u64 index_cnt; + union { + /* + * If this is a directory then index_cnt is the counter for the + * index number for new files that are created. For an empty + * directory, this must be initialized to BTRFS_DIR_START_INDEX. + */ + u64 index_cnt; + + /* + * If this is not a directory, this is the number of bytes + * outstanding that are going to need csums. This is used in + * ENOSPC accounting. Protected by 'lock'. + */ + u64 csum_bytes; + }; /* Cache the directory index number to speed the dir/file remove */ u64 dir_index; @@ -256,12 +274,6 @@ struct btrfs_inode { */ u64 last_reflink_trans; - /* - * Number of bytes outstanding that are going to need csums. This is - * used in ENOSPC accounting. Protected by 'lock'. - */ - u64 csum_bytes; - /* Backwards incompatible flags, lower half of inode_item::flags */ u32 flags; /* Read-only compatibility flags, upper half of inode_item::flags */ diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 1373f474c9b6..e298a44eaf69 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -1914,7 +1914,8 @@ int btrfs_fill_inode(struct inode *inode, u32 *rdev) BTRFS_I(inode)->i_otime_nsec = btrfs_stack_timespec_nsec(&inode_item->otime); inode->i_generation = BTRFS_I(inode)->generation; - BTRFS_I(inode)->index_cnt = (u64)-1; + if (S_ISDIR(inode->i_mode)) + BTRFS_I(inode)->index_cnt = (u64)-1; mutex_unlock(&delayed_node->mutex); btrfs_release_delayed_node(delayed_node); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4fd41d6b377f..9b98aa65cc63 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3856,7 +3856,9 @@ static int btrfs_read_locked_inode(struct inode *inode, inode->i_rdev = 0; rdev = btrfs_inode_rdev(leaf, inode_item); - BTRFS_I(inode)->index_cnt = (u64)-1; + if (S_ISDIR(inode->i_mode)) + BTRFS_I(inode)->index_cnt = (u64)-1; + btrfs_inode_split_flags(btrfs_inode_flags(leaf, inode_item), &BTRFS_I(inode)->flags, &BTRFS_I(inode)->ro_flags); @@ -6268,8 +6270,10 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, if (ret) goto out; } - /* index_cnt is ignored for everything but a dir. */ - BTRFS_I(inode)->index_cnt = BTRFS_DIR_START_INDEX; + + if (S_ISDIR(inode->i_mode)) + BTRFS_I(inode)->index_cnt = BTRFS_DIR_START_INDEX; + BTRFS_I(inode)->generation = trans->transid; inode->i_generation = BTRFS_I(inode)->generation; @@ -8435,8 +8439,12 @@ struct inode *btrfs_alloc_inode(struct super_block *sb) ei->disk_i_size = 0; ei->flags = 0; ei->ro_flags = 0; + /* + * ->index_cnt will be propertly initialized later when creating a new + * inode (btrfs_create_new_inode()) or when reading an existing inode + * from disk (btrfs_read_locked_inode()). + */ ei->csum_bytes = 0; - ei->index_cnt = (u64)-1; ei->dir_index = 0; ei->last_unlink_trans = 0; ei->last_reflink_trans = 0; @@ -8511,9 +8519,10 @@ void btrfs_destroy_inode(struct inode *vfs_inode) if (!S_ISDIR(vfs_inode->i_mode)) { WARN_ON(inode->delalloc_bytes); WARN_ON(inode->new_delalloc_bytes); + WARN_ON(inode->csum_bytes); } - WARN_ON(inode->csum_bytes); - WARN_ON(inode->defrag_bytes); + if (!root || !btrfs_is_data_reloc_root(root)) + WARN_ON(inode->defrag_bytes); /* * This can happen where we create an inode, but somebody else also diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 8b24bb5a0aa1..9f35524b6664 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -962,7 +962,7 @@ static int get_new_location(struct inode *reloc_inode, u64 *new_bytenr, if (!path) return -ENOMEM; - bytenr -= BTRFS_I(reloc_inode)->index_cnt; + bytenr -= BTRFS_I(reloc_inode)->reloc_block_group_start; ret = btrfs_lookup_file_extent(NULL, root, path, btrfs_ino(BTRFS_I(reloc_inode)), bytenr, 0); if (ret < 0) @@ -2797,7 +2797,7 @@ static noinline_for_stack int prealloc_file_extent_cluster( u64 alloc_hint = 0; u64 start; u64 end; - u64 offset = inode->index_cnt; + u64 offset = inode->reloc_block_group_start; u64 num_bytes; int nr; int ret = 0; @@ -2951,7 +2951,7 @@ static int relocate_one_folio(struct inode *inode, struct file_ra_state *ra, int *cluster_nr, unsigned long index) { struct btrfs_fs_info *fs_info = inode_to_fs_info(inode); - u64 offset = BTRFS_I(inode)->index_cnt; + u64 offset = BTRFS_I(inode)->reloc_block_group_start; const unsigned long last_index = (cluster->end - offset) >> PAGE_SHIFT; gfp_t mask = btrfs_alloc_write_mask(inode->i_mapping); struct folio *folio; @@ -3086,7 +3086,7 @@ static int relocate_one_folio(struct inode *inode, struct file_ra_state *ra, static int relocate_file_extent_cluster(struct inode *inode, const struct file_extent_cluster *cluster) { - u64 offset = BTRFS_I(inode)->index_cnt; + u64 offset = BTRFS_I(inode)->reloc_block_group_start; unsigned long index; unsigned long last_index; struct file_ra_state *ra; @@ -3915,7 +3915,7 @@ static noinline_for_stack struct inode *create_reloc_inode( inode = NULL; goto out; } - BTRFS_I(inode)->index_cnt = group->start; + BTRFS_I(inode)->reloc_block_group_start = group->start; ret = btrfs_orphan_add(trans, BTRFS_I(inode)); out: @@ -4395,7 +4395,7 @@ int btrfs_reloc_clone_csums(struct btrfs_ordered_extent *ordered) { struct btrfs_inode *inode = BTRFS_I(ordered->inode); struct btrfs_fs_info *fs_info = inode->root->fs_info; - u64 disk_bytenr = ordered->file_offset + inode->index_cnt; + u64 disk_bytenr = ordered->file_offset + inode->reloc_block_group_start; struct btrfs_root *csum_root = btrfs_csum_root(fs_info, disk_bytenr); LIST_HEAD(list); int ret; diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 5146387b416b..0aee43466c52 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -1625,7 +1625,8 @@ static noinline int fixup_inode_link_count(struct btrfs_trans_handle *trans, if (ret) goto out; } - BTRFS_I(inode)->index_cnt = (u64)-1; + if (S_ISDIR(inode->i_mode)) + BTRFS_I(inode)->index_cnt = (u64)-1; if (inode->i_nlink == 0) { if (S_ISDIR(inode->i_mode)) { From patchwork Wed May 8 12:17:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658658 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F6FF83CDC for ; Wed, 8 May 2024 12:17:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170660; cv=none; b=ao0UlfESD5SX9f1OWxP3oQS5jtYKGol3Dg2yPV6NceW17rVyM9ejL/e9VdD28NGmyoKtZJVTFaTbi0RQMCcFOF/+h1iz953Og/Rxco/JtJDBaAMIJZ8XtJShqGpsnK0OITOy7kCCuZ2IIrvVy0rDv23oNh/934eRD3zB2uPfVGM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170660; c=relaxed/simple; bh=0+zVYEueYtzfOA++IVEtDCN9bL3D46fHqucgtWxKz94=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Kqz7UYLkClSH4ZXWGA/Apgj1jO46cW5v601NPQrC4sR/+PcvaD9gx+0+HVPiypigpp84W07jprNqGVIIqXDV55LAquxr/T/Sijrn3+LRmtWYF2n4SiPnKbFjRxLam5NkP9lQzOJ4XsvBcf0ge/YRbRYkhn2XFoyjgbH9/l+MB+k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=T9zS86vH; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="T9zS86vH" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 853E9C3277B for ; Wed, 8 May 2024 12:17:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170660; bh=0+zVYEueYtzfOA++IVEtDCN9bL3D46fHqucgtWxKz94=; h=From:To:Subject:Date:In-Reply-To:References:From; b=T9zS86vH/0tBoHp6CGTF1uY/mDG8TP0QHxzxIaAHTzHc3mDJRpA38mnWdCOYuaxCM sLTje+Y8bAn61zyCIJGH0Kdhq+1oW2+6VlsG9p3CXZlXV59Ql1WkAe5xXy8OSqYgix ogzq+rpsLAbt4njyhLlCEywUM9vfOkUH2rAf6mRf5h02GFGei/2JVyVPPkOcWlarwM 2s6juxECsBPEe9dKnAo7+6K5C8/dx/zsxaIifmqxi2+VrXONOx0MylE9joYhTniIsz Xif9tILXX6olkUnfCebtRgGmyodv9tQ6DQg1MBUv5iby9P8gYYZFUZp9nI99wg+9Uj 26n88RqHXU7oQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 6/8] btrfs: don't allocate file extent tree for non regular files Date: Wed, 8 May 2024 13:17:29 +0100 Message-Id: <13d914be0518dc3f4a8086f96275c3b8bf113d63.1715169723.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana When not using the NO_HOLES feature we always allocate an io tree for an inode's file_extent_tree. This is wasteful because that io tree is only used for regular files, so we allocate more memory than needed for inodes that represent directories or symlinks for example, or for inodes that correspond to free space inodes. So improve on this by allocating the io tree only for inodes of regular files that are not free space inodes. Signed-off-by: Filipe Manana Reviewed-by: Qu Wenruo --- fs/btrfs/file-item.c | 13 ++++++----- fs/btrfs/inode.c | 53 +++++++++++++++++++++++++++++--------------- 2 files changed, 42 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index bce95f871750..f3ed78e21fa4 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -45,13 +45,12 @@ */ void btrfs_inode_safe_disk_i_size_write(struct btrfs_inode *inode, u64 new_i_size) { - struct btrfs_fs_info *fs_info = inode->root->fs_info; u64 start, end, i_size; int ret; spin_lock(&inode->lock); i_size = new_i_size ?: i_size_read(&inode->vfs_inode); - if (btrfs_fs_incompat(fs_info, NO_HOLES)) { + if (!inode->file_extent_tree) { inode->disk_i_size = i_size; goto out_unlock; } @@ -84,13 +83,14 @@ void btrfs_inode_safe_disk_i_size_write(struct btrfs_inode *inode, u64 new_i_siz int btrfs_inode_set_file_extent_range(struct btrfs_inode *inode, u64 start, u64 len) { + if (!inode->file_extent_tree) + return 0; + if (len == 0) return 0; ASSERT(IS_ALIGNED(start + len, inode->root->fs_info->sectorsize)); - if (btrfs_fs_incompat(inode->root->fs_info, NO_HOLES)) - return 0; return set_extent_bit(inode->file_extent_tree, start, start + len - 1, EXTENT_DIRTY, NULL); } @@ -112,14 +112,15 @@ int btrfs_inode_set_file_extent_range(struct btrfs_inode *inode, u64 start, int btrfs_inode_clear_file_extent_range(struct btrfs_inode *inode, u64 start, u64 len) { + if (!inode->file_extent_tree) + return 0; + if (len == 0) return 0; ASSERT(IS_ALIGNED(start + len, inode->root->fs_info->sectorsize) || len == (u64)-1); - if (btrfs_fs_incompat(inode->root->fs_info, NO_HOLES)) - return 0; return clear_extent_bit(inode->file_extent_tree, start, start + len - 1, EXTENT_DIRTY, NULL); } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 9b98aa65cc63..175fd007f0ef 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3781,6 +3781,30 @@ static noinline int acls_after_inode_item(struct extent_buffer *leaf, return 1; } +static int btrfs_init_file_extent_tree(struct btrfs_inode *inode) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + + if (WARN_ON_ONCE(inode->file_extent_tree)) + return 0; + if (btrfs_fs_incompat(fs_info, NO_HOLES)) + return 0; + if (!S_ISREG(inode->vfs_inode.i_mode)) + return 0; + if (btrfs_is_free_space_inode(inode)) + return 0; + + inode->file_extent_tree = kmalloc(sizeof(struct extent_io_tree), GFP_KERNEL); + if (!inode->file_extent_tree) + return -ENOMEM; + + extent_io_tree_init(fs_info, inode->file_extent_tree, IO_TREE_INODE_FILE_EXTENT); + /* Lockdep class is set only for the file extent tree. */ + lockdep_set_class(&inode->file_extent_tree->lock, &file_extent_tree_class); + + return 0; +} + /* * read an inode from the btree into the in-memory inode */ @@ -3800,6 +3824,10 @@ static int btrfs_read_locked_inode(struct inode *inode, bool filled = false; int first_xattr_slot; + ret = btrfs_init_file_extent_tree(BTRFS_I(inode)); + if (ret) + return ret; + ret = btrfs_fill_inode(inode, &rdev); if (!ret) filled = true; @@ -6247,6 +6275,10 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, BTRFS_I(inode)->root = btrfs_grab_root(BTRFS_I(dir)->root); root = BTRFS_I(inode)->root; + ret = btrfs_init_file_extent_tree(BTRFS_I(inode)); + if (ret) + goto out; + ret = btrfs_get_free_objectid(root, &objectid); if (ret) goto out; @@ -8413,20 +8445,10 @@ struct inode *btrfs_alloc_inode(struct super_block *sb) struct btrfs_fs_info *fs_info = btrfs_sb(sb); struct btrfs_inode *ei; struct inode *inode; - struct extent_io_tree *file_extent_tree = NULL; - - /* Self tests may pass a NULL fs_info. */ - if (fs_info && !btrfs_fs_incompat(fs_info, NO_HOLES)) { - file_extent_tree = kmalloc(sizeof(struct extent_io_tree), GFP_KERNEL); - if (!file_extent_tree) - return NULL; - } ei = alloc_inode_sb(sb, btrfs_inode_cachep, GFP_KERNEL); - if (!ei) { - kfree(file_extent_tree); + if (!ei) return NULL; - } ei->root = NULL; ei->generation = 0; @@ -8471,13 +8493,8 @@ struct inode *btrfs_alloc_inode(struct super_block *sb) extent_io_tree_init(fs_info, &ei->io_tree, IO_TREE_INODE_IO); ei->io_tree.inode = ei; - ei->file_extent_tree = file_extent_tree; - if (file_extent_tree) { - extent_io_tree_init(fs_info, ei->file_extent_tree, - IO_TREE_INODE_FILE_EXTENT); - /* Lockdep class is set only for the file extent tree. */ - lockdep_set_class(&ei->file_extent_tree->lock, &file_extent_tree_class); - } + ei->file_extent_tree = NULL; + mutex_init(&ei->log_mutex); spin_lock_init(&ei->ordered_tree_lock); ei->ordered_tree = RB_ROOT; From patchwork Wed May 8 12:17:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658659 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7577B84D30 for ; Wed, 8 May 2024 12:17:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170661; cv=none; b=I4Xk3ck1BUyaGaQYisUZ+48LgAlT1pIRb0sYyQdo7MPmUFTInzUFAHFEsZ4p8CKa7ee+CtKh4AyyQtm65rSr1VRHE1U2Xlbj5imeEsUBCrspfu155r1ECQmZnIors1ZPciP6Y8HhAijfZ0NGrIEaWarWlB2Chvi/dEwTRUPVprI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170661; c=relaxed/simple; bh=cudikI2nGagDvyRXJ+P06wUiFodZhGA5WVJQezomopw=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=gkmdbQw2Y7ZyxwQabPIxn0S9a25oaVvEN0izehvJKxysQSbq/56pbdKgEcEU7MxBS69H4kh6zI+nmH7co52CqCLqOXq5Vb4dUE2kpot0COQOCzxMqU9CHfSl5RJ7AsrryG+bq+Ky5O8uetIQ53k/66zumkJrhXAcad9TaRb/P30= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=d2IcEYgO; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="d2IcEYgO" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 86AC5C113CC for ; Wed, 8 May 2024 12:17:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170661; bh=cudikI2nGagDvyRXJ+P06wUiFodZhGA5WVJQezomopw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=d2IcEYgOwO1LnuUOF8ZViRQ9Gb4EMYK7YQEk1gk4Mdgn/bveWEHAPRfILNLsBfAGy Es+hcwgHfqRJL66KMKc7QV26kteOBAc2rT9VIXm98ya9Iab4S7VebnT/frZPHTkuq9 NUgty3BpwgewvhZ9bKqDF5CCML55zasDZN2av18PPL4W6AZoVNXjnQLOexiMbbY49q b2ASW7niv9lvm+IDpGMshQSChJkV32ECBJ/cRJXBDR6Ak7QArdRK0WxKCy9QKtwHxz OsYgIJZXRLXhr/eu3XLqznbJcjtUB/JOjMn/Kz3y9w5z5XplmiwpUlcQzVur0W+Nay A48WUDWa7rL4g== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 7/8] btrfs: remove location key from struct btrfs_inode Date: Wed, 8 May 2024 13:17:30 +0100 Message-Id: <0d7da3157a163513ef75dc02832c4fcf80726cf9.1715169723.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Currently struct btrfs_inode has a key member, named "location", that is either: 1) The key of the inode's item. In this case the objectid is the number of the inode; 2) A key stored in a dir entry with a type of BTRFS_ROOT_ITEM_KEY, for the case where we have a root that is a snapshot of a subvolume that points to other subvolumes. In this case the objectid is the ID of a subvolume inside the snapshotted parent subvolume. The key is only used to lookup the inode item for the first case, while for the second it's never used since it corresponds to directory stubs created with new_simple_dir() and which are marked as dummy, so there's no actual inode item to ever update. In the second case we only check the key type at btrfs_ino() for 32 bits platforms and its objectid is only needed for unlink. Instead of using a key we can do fine with just the objectid, since we can generate the key whenever we need it having only the objectid, as in all use cases the type is always BTRFS_INODE_ITEM_KEY and the offset is always 0. So use only an objectid instead of a full key. This reduces the size of struct btrfs_inode from 1032 bytes down to 1024 bytes on a release kernel, meaning we can now have 4 inodes per 4K page instead of 3, using less pages for inodes. Signed-off-by: Filipe Manana --- fs/btrfs/btrfs_inode.h | 47 +++++++++++++++++++++++++++++++----- fs/btrfs/disk-io.c | 4 +-- fs/btrfs/export.c | 2 +- fs/btrfs/inode.c | 25 +++++++++---------- fs/btrfs/ioctl.c | 8 +++--- fs/btrfs/tests/btrfs-tests.c | 4 +-- fs/btrfs/tree-log.c | 6 +++-- 7 files changed, 63 insertions(+), 33 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 19bb3d057414..fa2f91396ae0 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -89,6 +89,29 @@ enum { BTRFS_INODE_FREE_SPACE_INODE, /* Set when there are no capabilities in XATTs for the inode. */ BTRFS_INODE_NO_CAP_XATTR, + /* + * Indicate this is a directory that points to a subvolume for which + * there is no root reference item. That's a case like the following: + * + * $ btrfs subvolume create /mnt/parent + * $ btrfs subvolume create /mnt/parent/child + * $ btrfs subvolume snapshot /mnt/parent /mnt/snap + * + * If subvolume "parent" is root 256, subvolume "child" is root 257 and + * snapshot "snap" is root 258, then there's no root reference item (key + * BTRFS_ROOT_REF_KEY in the root tree) for the subvolume "child" + * associated to root 258 (the snapshot) - there's only for the root + * of the "parent" subvolume (root 256). In the chunk root we have a + * (256 BTRFS_ROOT_REF_KEY 257) key but we don't have a + * (258 BTRFS_ROOT_REF_KEY 257) key - the sames goes for backrefs, we + * have a (257 BTRFS_ROOT_BACKREF_KEY 256) but we don't have a + * (257 BTRFS_ROOT_BACKREF_KEY 258) key. + * + * So when opening the "child" dentry from the snapshot's directory, + * we don't find a root ref item and we create a stub inode. This is + * done at new_simple_dir(), called from btrfs_lookup_dentry(). + */ + BTRFS_INODE_ROOT_STUB, }; /* in memory btrfs inode */ @@ -96,10 +119,15 @@ struct btrfs_inode { /* which subvolume this inode belongs to */ struct btrfs_root *root; - /* key used to find this inode on disk. This is used by the code - * to read in roots of subvolumes + /* + * This is either: + * + * 1) The objectid of the corresponding BTRFS_INODE_ITEM_KEY; + * + * 2) In case this a root stub inode (BTRFS_INODE_ROOT_STUB flag set), + * the ID of that root. */ - struct btrfs_key location; + u64 objectid; /* Cached value of inode property 'compression'. */ u8 prop_compress; @@ -330,10 +358,9 @@ static inline unsigned long btrfs_inode_hash(u64 objectid, */ static inline u64 btrfs_ino(const struct btrfs_inode *inode) { - u64 ino = inode->location.objectid; + u64 ino = inode->objectid; - /* type == BTRFS_ROOT_ITEM_KEY: subvol dir */ - if (inode->location.type == BTRFS_ROOT_ITEM_KEY) + if (test_bit(BTRFS_INODE_ROOT_STUB, &inode->runtime_flags)) ino = inode->vfs_inode.i_ino; return ino; } @@ -347,6 +374,14 @@ static inline u64 btrfs_ino(const struct btrfs_inode *inode) #endif +static inline void btrfs_get_inode_key(const struct btrfs_inode *inode, + struct btrfs_key *key) +{ + key->objectid = inode->objectid; + key->type = BTRFS_INODE_ITEM_KEY; + key->offset = 0; +} + static inline void btrfs_i_size_write(struct btrfs_inode *inode, u64 size) { i_size_write(&inode->vfs_inode, size); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index d20e400a9ce3..e3edaf510108 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1944,9 +1944,7 @@ static int btrfs_init_btree_inode(struct super_block *sb) extent_map_tree_init(&BTRFS_I(inode)->extent_tree); BTRFS_I(inode)->root = btrfs_grab_root(fs_info->tree_root); - BTRFS_I(inode)->location.objectid = BTRFS_BTREE_INODE_OBJECTID; - BTRFS_I(inode)->location.type = 0; - BTRFS_I(inode)->location.offset = 0; + BTRFS_I(inode)->objectid = BTRFS_BTREE_INODE_OBJECTID; set_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags); __insert_inode_hash(inode, hash); fs_info->btree_inode = inode; diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c index 9e81f89e76d8..5526e25ebb3f 100644 --- a/fs/btrfs/export.c +++ b/fs/btrfs/export.c @@ -40,7 +40,7 @@ static int btrfs_encode_fh(struct inode *inode, u32 *fh, int *max_len, if (parent) { u64 parent_root_id; - fid->parent_objectid = BTRFS_I(parent)->location.objectid; + fid->parent_objectid = BTRFS_I(parent)->objectid; fid->parent_gen = parent->i_generation; parent_root_id = btrfs_root_id(BTRFS_I(parent)->root); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 175fd007f0ef..44dc82ff96db 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3838,7 +3838,7 @@ static int btrfs_read_locked_inode(struct inode *inode, return -ENOMEM; } - memcpy(&location, &BTRFS_I(inode)->location, sizeof(location)); + btrfs_get_inode_key(BTRFS_I(inode), &location); ret = btrfs_lookup_inode(NULL, root, path, &location, 0); if (ret) { @@ -4068,13 +4068,15 @@ static noinline int btrfs_update_inode_item(struct btrfs_trans_handle *trans, struct btrfs_inode_item *inode_item; struct btrfs_path *path; struct extent_buffer *leaf; + struct btrfs_key key; int ret; path = btrfs_alloc_path(); if (!path) return -ENOMEM; - ret = btrfs_lookup_inode(trans, inode->root, path, &inode->location, 1); + btrfs_get_inode_key(inode, &key); + ret = btrfs_lookup_inode(trans, inode->root, path, &key, 1); if (ret) { if (ret > 0) ret = -ENOENT; @@ -4338,7 +4340,7 @@ static int btrfs_unlink_subvol(struct btrfs_trans_handle *trans, if (btrfs_ino(inode) == BTRFS_FIRST_FREE_OBJECTID) { objectid = btrfs_root_id(inode->root); } else if (btrfs_ino(inode) == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID) { - objectid = inode->location.objectid; + objectid = inode->objectid; } else { WARN_ON(1); fscrypt_free_filename(&fname); @@ -5580,9 +5582,7 @@ static int btrfs_init_locked_inode(struct inode *inode, void *p) struct btrfs_iget_args *args = p; inode->i_ino = args->ino; - BTRFS_I(inode)->location.objectid = args->ino; - BTRFS_I(inode)->location.type = BTRFS_INODE_ITEM_KEY; - BTRFS_I(inode)->location.offset = 0; + BTRFS_I(inode)->objectid = args->ino; BTRFS_I(inode)->root = btrfs_grab_root(args->root); if (args->root && args->root == args->root->fs_info->tree_root && @@ -5596,7 +5596,7 @@ static int btrfs_find_actor(struct inode *inode, void *opaque) { struct btrfs_iget_args *args = opaque; - return args->ino == BTRFS_I(inode)->location.objectid && + return args->ino == BTRFS_I(inode)->objectid && args->root == BTRFS_I(inode)->root; } @@ -5673,7 +5673,8 @@ static struct inode *new_simple_dir(struct inode *dir, return ERR_PTR(-ENOMEM); BTRFS_I(inode)->root = btrfs_grab_root(root); - memcpy(&BTRFS_I(inode)->location, key, sizeof(*key)); + BTRFS_I(inode)->objectid = key->objectid; + set_bit(BTRFS_INODE_ROOT_STUB, &BTRFS_I(inode)->runtime_flags); set_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags); inode->i_ino = BTRFS_EMPTY_SUBVOL_DIR_OBJECTID; @@ -6149,7 +6150,7 @@ static int btrfs_insert_inode_locked(struct inode *inode) { struct btrfs_iget_args args; - args.ino = BTRFS_I(inode)->location.objectid; + args.ino = BTRFS_I(inode)->objectid; args.root = BTRFS_I(inode)->root; return insert_inode_locked4(inode, @@ -6256,7 +6257,6 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, struct btrfs_fs_info *fs_info = inode_to_fs_info(dir); struct btrfs_root *root; struct btrfs_inode_item *inode_item; - struct btrfs_key *location; struct btrfs_path *path; u64 objectid; struct btrfs_inode_ref *ref; @@ -6332,10 +6332,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, BTRFS_INODE_NODATASUM; } - location = &BTRFS_I(inode)->location; - location->objectid = objectid; - location->offset = 0; - location->type = BTRFS_INODE_ITEM_KEY; + BTRFS_I(inode)->objectid = objectid; ret = btrfs_insert_inode_locked(inode); if (ret < 0) { diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index efd5d6e9589e..2de8f06523b9 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1918,7 +1918,7 @@ static int btrfs_search_path_in_tree_user(struct mnt_idmap *idmap, { struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; struct super_block *sb = inode->i_sb; - struct btrfs_key upper_limit = BTRFS_I(inode)->location; + u64 upper_limit = BTRFS_I(inode)->objectid; u64 treeid = btrfs_root_id(BTRFS_I(inode)->root); u64 dirid = args->dirid; unsigned long item_off; @@ -1944,7 +1944,7 @@ static int btrfs_search_path_in_tree_user(struct mnt_idmap *idmap, * If the bottom subvolume does not exist directly under upper_limit, * construct the path in from the bottom up. */ - if (dirid != upper_limit.objectid) { + if (dirid != upper_limit) { ptr = &args->path[BTRFS_INO_LOOKUP_USER_PATH_MAX - 1]; root = btrfs_get_fs_root(fs_info, treeid, true); @@ -2019,7 +2019,7 @@ static int btrfs_search_path_in_tree_user(struct mnt_idmap *idmap, goto out_put; } - if (key.offset == upper_limit.objectid) + if (key.offset == upper_limit) break; if (key.objectid == BTRFS_FIRST_FREE_OBJECTID) { ret = -EACCES; @@ -2140,7 +2140,7 @@ static int btrfs_ioctl_ino_lookup_user(struct file *file, void __user *argp) inode = file_inode(file); if (args->dirid == BTRFS_FIRST_FREE_OBJECTID && - BTRFS_I(inode)->location.objectid != BTRFS_FIRST_FREE_OBJECTID) { + BTRFS_I(inode)->objectid != BTRFS_FIRST_FREE_OBJECTID) { /* * The subvolume does not exist under fd with which this is * called diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c index dce0387ef155..b28a79935d8e 100644 --- a/fs/btrfs/tests/btrfs-tests.c +++ b/fs/btrfs/tests/btrfs-tests.c @@ -62,9 +62,7 @@ struct inode *btrfs_new_test_inode(void) inode->i_mode = S_IFREG; inode->i_ino = BTRFS_FIRST_FREE_OBJECTID; - BTRFS_I(inode)->location.type = BTRFS_INODE_ITEM_KEY; - BTRFS_I(inode)->location.objectid = BTRFS_FIRST_FREE_OBJECTID; - BTRFS_I(inode)->location.offset = 0; + BTRFS_I(inode)->objectid = BTRFS_FIRST_FREE_OBJECTID; inode_init_owner(&nop_mnt_idmap, inode, NULL, S_IFREG); return inode; diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 0aee43466c52..2e762b89d4a2 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -4235,8 +4235,10 @@ static int log_inode_item(struct btrfs_trans_handle *trans, struct btrfs_inode *inode, bool inode_item_dropped) { struct btrfs_inode_item *inode_item; + struct btrfs_key key; int ret; + btrfs_get_inode_key(inode, &key); /* * If we are doing a fast fsync and the inode was logged before in the * current transaction, then we know the inode was previously logged and @@ -4248,7 +4250,7 @@ static int log_inode_item(struct btrfs_trans_handle *trans, * already exists can also result in unnecessarily splitting a leaf. */ if (!inode_item_dropped && inode->logged_trans == trans->transid) { - ret = btrfs_search_slot(trans, log, &inode->location, path, 0, 1); + ret = btrfs_search_slot(trans, log, &key, path, 0, 1); ASSERT(ret <= 0); if (ret > 0) ret = -ENOENT; @@ -4262,7 +4264,7 @@ static int log_inode_item(struct btrfs_trans_handle *trans, * the inode, we set BTRFS_INODE_NEEDS_FULL_SYNC on its runtime * flags and set ->logged_trans to 0. */ - ret = btrfs_insert_empty_item(trans, log, path, &inode->location, + ret = btrfs_insert_empty_item(trans, log, path, &key, sizeof(*inode_item)); ASSERT(ret != -EEXIST); } From patchwork Wed May 8 12:17:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13658660 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7221884DE6 for ; Wed, 8 May 2024 12:17:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170662; cv=none; b=Z/xwGqFLQNP0/nllcXLwfdBqGDDOKiSUGmtQrB+r0xB7Enqwx08CCoLchCOZ5dvWk2ecdprHrHxfAugAJm/L0jzB4uvPgFjNa3LyDV/0KtRzbiWkQ9t9A79EWv8EDkzk4QNY+AXHLXZtjIFqHghYDEBONUsGEDzNWJegdyBUZ+M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1715170662; c=relaxed/simple; bh=6GLB4NgNVTgt1bkzFVVg+BmYBVGWHex/BDe0GPzqsRw=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=OGXlARUuvv/kUkOSQVh1Ntq7jQl/en1plVLypR76LKR1oA2rzUs4zuZ93dSU9lPp2Bdw1rmpR0dTwAptlDblWdOq8SSBUEJ+5yQSz66nZfINKOuoJBQZp+3pDv+Gr9rm2zosvPP+onhsdgAwvB0H7+aG7Hu2G/dN0YXfTHzr/iY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YW3a0GCb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YW3a0GCb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 87C3FC4AF17 for ; Wed, 8 May 2024 12:17:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1715170662; bh=6GLB4NgNVTgt1bkzFVVg+BmYBVGWHex/BDe0GPzqsRw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=YW3a0GCbAtuZESqXWshoA35W6IffU8PtbT69JTUNayGqxFkuEI2asB1FlFGOIKR1L 9V9RCDNI+wyXL/R4f8pWZBQWjm80m0kRsgH6UBNVXAUPXrm0FRTF+bBxesNiVHPoI5 +jAHt2LUOWXBDPHMC5puvh3lL8v4Ukm4BNgEc/vkclBcEDc8eNcrrlVT+zBunFpzD7 v7CZTxnqkEg6fHWF2BN4VsI2+/dYHU5LvhzOMHd7ioXT7NpT1LnJclIyKawVVuShV4 ztHTzyL6U9X+E2QG8Izgi0Mksm4BmIArgy4p8ZNs5ErEVOv7sAWlnCZBnvOmtxuqfB WsWRkrC5Hfs9w== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 8/8] btrfs: remove objectid from struct btrfs_inode on 64 bits platforms Date: Wed, 8 May 2024 13:17:31 +0100 Message-Id: <108152a5f5ca8023d76be4751764f52e9e81c0f2.1715169723.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana On 64 bits platforms we don't really need to have a dedicated member (the objectid field) for the inode's number since we store in the vfs inode's i_ino member, which is an unsigned long and this type is 64 bits wide on 64 bits platforms. We only need that field in case we are on a 32 bits platform because the unsigned long type is 32 bits wide on such platforms See commit 33345d01522f ("Btrfs: Always use 64bit inode number") regarding this 64/32 bits detail. The objectid field of struct btrfs_inode is also used to store the ID of a root for directories that are stubs for unreferenced roots. In such cases the inode is a directory and has the BTRFS_INODE_ROOT_STUB runtime flag set. So in order to reduce the size of btrfs_inode structure on 64 bits platforms we can remove the objectid member and use the vfs inode's i_ino member instead whenever we need to get the inode number. In case the inode is a root stub (BTRFS_INODE_ROOT_STUB set) we can use the member last_reflink_trans to store the ID of the unreferenced root, since such inode is a directory and reflinks can't be done against directories. So remove the objectid fields for 64 bits platforms and alias the last_reflink_trans field with a name of ref_root_id in a union. On a release kernel config, this reduces the size of struct btrfs_inode from 1024 bytes down to 1016 bytes, giving an 8 bytes slack space for future expansion without decreasing the number of inodes we can have per page (4 with a 4K page, 64 with a 64K page). Signed-off-by: Filipe Manana --- fs/btrfs/btrfs_inode.h | 50 ++++++++++++++++++++++++------------ fs/btrfs/disk-io.c | 3 +-- fs/btrfs/export.c | 2 +- fs/btrfs/inode.c | 17 +++++------- fs/btrfs/ioctl.c | 4 +-- fs/btrfs/tests/btrfs-tests.c | 3 +-- 6 files changed, 45 insertions(+), 34 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index fa2f91396ae0..4d9299789a03 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -119,15 +119,14 @@ struct btrfs_inode { /* which subvolume this inode belongs to */ struct btrfs_root *root; +#if BITS_PER_LONG == 32 /* - * This is either: - * - * 1) The objectid of the corresponding BTRFS_INODE_ITEM_KEY; - * - * 2) In case this a root stub inode (BTRFS_INODE_ROOT_STUB flag set), - * the ID of that root. + * The objectid of the corresponding BTRFS_INODE_ITEM_KEY. + * On 64 bits platforms we can get it from vfs_inode.i_ino, which is an + * unsigned long and therefore 64 bits on such platforms. */ u64 objectid; +#endif /* Cached value of inode property 'compression'. */ u8 prop_compress; @@ -291,16 +290,25 @@ struct btrfs_inode { */ u64 last_unlink_trans; - /* - * The id/generation of the last transaction where this inode was - * either the source or the destination of a clone/dedupe operation. - * Used when logging an inode to know if there are shared extents that - * need special care when logging checksum items, to avoid duplicate - * checksum items in a log (which can lead to a corruption where we end - * up with missing checksum ranges after log replay). - * Protected by the vfs inode lock. - */ - u64 last_reflink_trans; + union { + /* + * The id/generation of the last transaction where this inode + * was either the source or the destination of a clone/dedupe + * operation. Used when logging an inode to know if there are + * shared extents that need special care when logging checksum + * items, to avoid duplicate checksum items in a log (which can + * lead to a corruption where we end up with missing checksum + * ranges after log replay). Protected by the vfs inode lock. + * Used for regular files only. + */ + u64 last_reflink_trans; + + /* + * In case this a root stub inode (BTRFS_INODE_ROOT_STUB flag set), + * the ID of that root. + */ + u64 ref_root_id; + }; /* Backwards incompatible flags, lower half of inode_item::flags */ u32 flags; @@ -377,11 +385,19 @@ static inline u64 btrfs_ino(const struct btrfs_inode *inode) static inline void btrfs_get_inode_key(const struct btrfs_inode *inode, struct btrfs_key *key) { - key->objectid = inode->objectid; + key->objectid = btrfs_ino(inode); key->type = BTRFS_INODE_ITEM_KEY; key->offset = 0; } +static inline void btrfs_set_inode_number(struct btrfs_inode *inode, u64 ino) +{ +#if BITS_PER_LONG == 32 + inode->objectid = ino; +#endif + inode->vfs_inode.i_ino = ino; +} + static inline void btrfs_i_size_write(struct btrfs_inode *inode, u64 size) { i_size_write(&inode->vfs_inode, size); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e3edaf510108..e6bf895b3547 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1928,7 +1928,7 @@ static int btrfs_init_btree_inode(struct super_block *sb) if (!inode) return -ENOMEM; - inode->i_ino = BTRFS_BTREE_INODE_OBJECTID; + btrfs_set_inode_number(BTRFS_I(inode), BTRFS_BTREE_INODE_OBJECTID); set_nlink(inode, 1); /* * we set the i_size on the btree inode to the max possible int. @@ -1944,7 +1944,6 @@ static int btrfs_init_btree_inode(struct super_block *sb) extent_map_tree_init(&BTRFS_I(inode)->extent_tree); BTRFS_I(inode)->root = btrfs_grab_root(fs_info->tree_root); - BTRFS_I(inode)->objectid = BTRFS_BTREE_INODE_OBJECTID; set_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags); __insert_inode_hash(inode, hash); fs_info->btree_inode = inode; diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c index 5526e25ebb3f..5da56e21ff73 100644 --- a/fs/btrfs/export.c +++ b/fs/btrfs/export.c @@ -40,7 +40,7 @@ static int btrfs_encode_fh(struct inode *inode, u32 *fh, int *max_len, if (parent) { u64 parent_root_id; - fid->parent_objectid = BTRFS_I(parent)->objectid; + fid->parent_objectid = btrfs_ino(BTRFS_I(parent)); fid->parent_gen = parent->i_generation; parent_root_id = btrfs_root_id(BTRFS_I(parent)->root); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 44dc82ff96db..5a1014122088 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4340,7 +4340,7 @@ static int btrfs_unlink_subvol(struct btrfs_trans_handle *trans, if (btrfs_ino(inode) == BTRFS_FIRST_FREE_OBJECTID) { objectid = btrfs_root_id(inode->root); } else if (btrfs_ino(inode) == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID) { - objectid = inode->objectid; + objectid = inode->ref_root_id; } else { WARN_ON(1); fscrypt_free_filename(&fname); @@ -5581,8 +5581,7 @@ static int btrfs_init_locked_inode(struct inode *inode, void *p) { struct btrfs_iget_args *args = p; - inode->i_ino = args->ino; - BTRFS_I(inode)->objectid = args->ino; + btrfs_set_inode_number(BTRFS_I(inode), args->ino); BTRFS_I(inode)->root = btrfs_grab_root(args->root); if (args->root && args->root == args->root->fs_info->tree_root && @@ -5596,7 +5595,7 @@ static int btrfs_find_actor(struct inode *inode, void *opaque) { struct btrfs_iget_args *args = opaque; - return args->ino == BTRFS_I(inode)->objectid && + return args->ino == btrfs_ino(BTRFS_I(inode)) && args->root == BTRFS_I(inode)->root; } @@ -5673,11 +5672,11 @@ static struct inode *new_simple_dir(struct inode *dir, return ERR_PTR(-ENOMEM); BTRFS_I(inode)->root = btrfs_grab_root(root); - BTRFS_I(inode)->objectid = key->objectid; + BTRFS_I(inode)->ref_root_id = key->objectid; set_bit(BTRFS_INODE_ROOT_STUB, &BTRFS_I(inode)->runtime_flags); set_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags); - inode->i_ino = BTRFS_EMPTY_SUBVOL_DIR_OBJECTID; + btrfs_set_inode_number(BTRFS_I(inode), BTRFS_EMPTY_SUBVOL_DIR_OBJECTID); /* * We only need lookup, the rest is read-only and there's no inode * associated with the dentry @@ -6150,7 +6149,7 @@ static int btrfs_insert_inode_locked(struct inode *inode) { struct btrfs_iget_args args; - args.ino = BTRFS_I(inode)->objectid; + args.ino = btrfs_ino(BTRFS_I(inode)); args.root = BTRFS_I(inode)->root; return insert_inode_locked4(inode, @@ -6282,7 +6281,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, ret = btrfs_get_free_objectid(root, &objectid); if (ret) goto out; - inode->i_ino = objectid; + btrfs_set_inode_number(BTRFS_I(inode), objectid); ret = xa_reserve(&root->inodes, objectid, GFP_NOFS); if (ret) @@ -6332,8 +6331,6 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, BTRFS_INODE_NODATASUM; } - BTRFS_I(inode)->objectid = objectid; - ret = btrfs_insert_inode_locked(inode); if (ret < 0) { if (!args->orphan) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 2de8f06523b9..1d343cd2435b 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1918,7 +1918,7 @@ static int btrfs_search_path_in_tree_user(struct mnt_idmap *idmap, { struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; struct super_block *sb = inode->i_sb; - u64 upper_limit = BTRFS_I(inode)->objectid; + u64 upper_limit = btrfs_ino(BTRFS_I(inode)); u64 treeid = btrfs_root_id(BTRFS_I(inode)->root); u64 dirid = args->dirid; unsigned long item_off; @@ -2140,7 +2140,7 @@ static int btrfs_ioctl_ino_lookup_user(struct file *file, void __user *argp) inode = file_inode(file); if (args->dirid == BTRFS_FIRST_FREE_OBJECTID && - BTRFS_I(inode)->objectid != BTRFS_FIRST_FREE_OBJECTID) { + btrfs_ino(BTRFS_I(inode)) != BTRFS_FIRST_FREE_OBJECTID) { /* * The subvolume does not exist under fd with which this is * called diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c index b28a79935d8e..ce50847e1e01 100644 --- a/fs/btrfs/tests/btrfs-tests.c +++ b/fs/btrfs/tests/btrfs-tests.c @@ -61,8 +61,7 @@ struct inode *btrfs_new_test_inode(void) return NULL; inode->i_mode = S_IFREG; - inode->i_ino = BTRFS_FIRST_FREE_OBJECTID; - BTRFS_I(inode)->objectid = BTRFS_FIRST_FREE_OBJECTID; + btrfs_set_inode_number(BTRFS_I(inode), BTRFS_FIRST_FREE_OBJECTID); inode_init_owner(&nop_mnt_idmap, inode, NULL, S_IFREG); return inode;