Message ID | de086134d128aad13d16b2aabc72918d7ec7637e.1441309178.git.osandov@fb.com (mailing list archive) |
---|---|
State | Superseded |
On Thu, Sep 03, 2015 at 12:44:27PM -0700, Omar Sandoval wrote:
> Now we can finally hook up everything so we can actually use free space
> tree. On the first mount with the free_space_tree mount option, the free
> space tree will be created and the FREE_SPACE_TREE read-only compat bit
> will be set. Any time the filesystem is mounted from then on, we will
> use the free space tree.
>
> Having both the free space cache and free space trees enabled is
> nonsense, so we don't allow that to happen. Since mkfs sets the
> superblock cache generation to -1, this means that the filesystem will
> have to be mounted with nospace_cache,free_space_tree to create the free
> space trees on first mount. Once the FREE_SPACE_TREE bit is set, the
> cache generation is ignored when mounting. This is all a little more
> complicated than would be ideal, but at some point we can presumably
> make the free space tree the default and stop setting the cache
> generation in mkfs.

I have objections against introducing another option to do something
with the space cache. As you write, it does not make sense to have both
'space_cache' and 'free_space_tree' enabled, and I agree. The b-tree
approach is an "implementation detail", an improved version of space
caching.

Because of that I propose to do the following:

* use the space_cache mount option, and add a value denoting the
  implementation used, e.g. space_cache=btree or space_cache=v2 etc.

* keep space_cache for backward compatibility with the current
  implementation

* clear_cache should reset state for both

* nospace_cache prevents using either of the two versions of space cache

On the mkfs side, we can add a new incompat feature to the -O option that
will set the incompat bit in the superblock. Mounting such a filesystem
would use the v2 cache automatically.

I'd like to see the b-tree space cache become the default in the future;
until then it'll be a mkfs-time option or mount-time option.

For backward compatibility, mounting a free space v2 filesystem on an
older kernel can be done with the support of userspace tools: reset the
cache generation (as if clear_cache was used), drop all the
free-space-tree structures and unset the incompat bit. I think this kind
of fallback is desirable.

Other than that, I like the series and the improvements it's supposed to
bring.
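
As an illustration of the mount-option proposal above, the option strings could map onto a single cache-mode value roughly as follows. This is a minimal userspace sketch, not kernel code: the enum, the function name parse_cache_option, and the explicit space_cache=v1 spelling are assumptions made for the example. Only space_cache, space_cache=v2 and nospace_cache come from the message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for illustration; none of these come from the patch. */
enum cache_mode { CACHE_NONE, CACHE_V1, CACHE_V2, CACHE_INVALID };

/* Map a single mount option string to the cache implementation it selects. */
static enum cache_mode parse_cache_option(const char *opt)
{
	assert(opt != NULL);
	if (strcmp(opt, "nospace_cache") == 0)
		return CACHE_NONE;	/* disables both implementations */
	if (strcmp(opt, "space_cache") == 0 ||
	    strcmp(opt, "space_cache=v1") == 0)	/* "=v1" is an assumed spelling */
		return CACHE_V1;	/* bare option keeps the current behaviour */
	if (strcmp(opt, "space_cache=v2") == 0)
		return CACHE_V2;	/* b-tree based implementation */
	return CACHE_INVALID;
}
```

Keeping one option with a version value means existing fstab entries that say space_cache keep working unchanged, which is the backward-compatibility point made in the list above.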
On Wed, Sep 09, 2015 at 02:00:23PM +0200, David Sterba wrote:
> On Thu, Sep 03, 2015 at 12:44:27PM -0700, Omar Sandoval wrote:
> > Now we can finally hook up everything so we can actually use free space
> > tree. On the first mount with the free_space_tree mount option, the free
> > space tree will be created and the FREE_SPACE_TREE read-only compat bit
> > will be set. Any time the filesystem is mounted from then on, we will
> > use the free space tree.
> >
> > Having both the free space cache and free space trees enabled is
> > nonsense, so we don't allow that to happen. Since mkfs sets the
> > superblock cache generation to -1, this means that the filesystem will
> > have to be mounted with nospace_cache,free_space_tree to create the free
> > space trees on first mount. Once the FREE_SPACE_TREE bit is set, the
> > cache generation is ignored when mounting. This is all a little more
> > complicated than would be ideal, but at some point we can presumably
> > make the free space tree the default and stop setting the cache
> > generation in mkfs.
>
> I have objections against introducing another option to do something
> with the space cache. As you write, it does not make sense to have both
> 'space_cache' and 'free_space_tree' enabled, and I agree. The b-tree
> approach is an "implementation detail", an improved version of space
> caching.
>
> Because of that I propose to do the following:
>
> * use the space_cache mount option, and add a value denoting the
>   implementation used, e.g. space_cache=btree or space_cache=v2 etc.
>
> * keep space_cache for backward compatibility with the current
>   implementation
>
> * clear_cache should reset state for both
>
> * nospace_cache prevents using either of the two versions of space cache

Okay, I like the idea of calling this space_cache=v2 and allowing
clear_cache to clear the free space tree just in case.

However, the free space tree doesn't use a cache generation like the old
free space cache, so once it's created, we can't ever ignore it. For
nospace_cache, the best we could do would be to fail the mount (unless
clear_cache is also set). The other option would be to add something like
the cache generation for the free space tree, but I'd rather not do that,
since fixing an out-of-date free space tree is a little more involved
than with the old cache (at that point, we might as well clear the tree
and redo it all over again).

What do you think of that? Is failing on nospace_cache okay with you?

Thanks.

> On the mkfs side, we can add a new incompat feature to the -O option that
> will set the incompat bit in the superblock. Mounting such a filesystem
> would use the v2 cache automatically.
>
> I'd like to see the b-tree space cache become the default in the future;
> until then it'll be a mkfs-time option or mount-time option.
>
> For backward compatibility, mounting a free space v2 filesystem on an
> older kernel can be done with the support of userspace tools: reset the
> cache generation (as if clear_cache was used), drop all the
> free-space-tree structures and unset the incompat bit. I think this kind
> of fallback is desirable.
>
> Other than that, I like the series and the improvements it's supposed to
> bring.
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 05420991e101..3524fe065b72 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -531,7 +531,10 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_COMPAT_SUPP		0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_SET		0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_CLEAR		0ULL
-#define BTRFS_FEATURE_COMPAT_RO_SUPP		0ULL
+
+#define BTRFS_FEATURE_COMPAT_RO_SUPP			\
+	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE)
+
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET	0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR	0ULL
 
@@ -2203,6 +2206,7 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_CHECK_INTEGRITY_INCLUDING_EXTENT_DATA (1 << 21)
 #define BTRFS_MOUNT_PANIC_ON_FATAL_ERROR	(1 << 22)
 #define BTRFS_MOUNT_RESCAN_UUID_TREE	(1 << 23)
+#define BTRFS_MOUNT_FREE_SPACE_TREE	(1 << 24)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(8192)
@@ -3746,6 +3750,7 @@ static inline void free_fs_info(struct btrfs_fs_info *fs_info)
 	kfree(fs_info->csum_root);
 	kfree(fs_info->quota_root);
 	kfree(fs_info->uuid_root);
+	kfree(fs_info->free_space_root);
 	kfree(fs_info->super_copy);
 	kfree(fs_info->super_for_commit);
 	security_free_mnt_opts(&fs_info->security_opts);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f556c3732c2c..e88674c594da 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -42,6 +42,7 @@
 #include "locking.h"
 #include "tree-log.h"
 #include "free-space-cache.h"
+#include "free-space-tree.h"
 #include "inode-map.h"
 #include "check-integrity.h"
 #include "rcu-string.h"
@@ -1641,6 +1642,9 @@ struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
 	if (location->objectid == BTRFS_UUID_TREE_OBJECTID)
 		return fs_info->uuid_root ? fs_info->uuid_root :
 					    ERR_PTR(-ENOENT);
+	if (location->objectid == BTRFS_FREE_SPACE_TREE_OBJECTID)
+		return fs_info->free_space_root ? fs_info->free_space_root :
+						  ERR_PTR(-ENOENT);
 again:
 	root = btrfs_lookup_fs_root(fs_info, location->objectid);
 	if (root) {
@@ -2138,6 +2142,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, int chunk_root)
 	free_root_extent_buffers(info->uuid_root);
 	if (chunk_root)
 		free_root_extent_buffers(info->chunk_root);
+	free_root_extent_buffers(info->free_space_root);
 }
 
 void btrfs_free_fs_roots(struct btrfs_fs_info *fs_info)
@@ -2439,6 +2444,15 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info,
 		fs_info->uuid_root = root;
 	}
 
+	if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		location.objectid = BTRFS_FREE_SPACE_TREE_OBJECTID;
+		root = btrfs_read_tree_root(tree_root, &location);
+		if (IS_ERR(root))
+			return PTR_ERR(root);
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->free_space_root = root;
+	}
+
 	return 0;
 }
 
@@ -3063,6 +3077,18 @@ retry_root_backup:
 
 	btrfs_qgroup_rescan_resume(fs_info);
 
+	if (btrfs_test_opt(tree_root, FREE_SPACE_TREE) &&
+	    !btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE)) {
+		pr_info("BTRFS: creating free space tree\n");
+		ret = btrfs_create_free_space_tree(fs_info);
+		if (ret) {
+			pr_warn("BTRFS: failed to create free space tree %d\n",
+				ret);
+			close_ctree(tree_root);
+			return ret;
+		}
+	}
+
 	if (!fs_info->uuid_root) {
 		pr_info("BTRFS: creating UUID tree\n");
 		ret = btrfs_create_uuid_tree(fs_info);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b93f127c4bc8..d7705e4ed119 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -319,7 +319,7 @@ enum {
 	Opt_check_integrity_print_mask, Opt_fatal_errors, Opt_rescan_uuid_tree,
 	Opt_commit_interval, Opt_barrier, Opt_nodefrag, Opt_nodiscard,
 	Opt_noenospc_debug, Opt_noflushoncommit, Opt_acl, Opt_datacow,
-	Opt_datasum, Opt_treelog, Opt_noinode_cache,
+	Opt_datasum, Opt_treelog, Opt_noinode_cache, Opt_free_space_tree,
 	Opt_err,
 };
 
@@ -372,6 +372,7 @@ static match_table_t tokens = {
 	{Opt_rescan_uuid_tree, "rescan_uuid_tree"},
 	{Opt_fatal_errors, "fatal_errors=%s"},
 	{Opt_commit_interval, "commit=%d"},
+	{Opt_free_space_tree, "free_space_tree"},
 	{Opt_err, NULL},
 };
 
@@ -392,7 +393,9 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 	bool compress_force = false;
 
 	cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
-	if (cache_gen)
+	if (btrfs_fs_compat_ro(root->fs_info, FREE_SPACE_TREE))
+		btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
+	else if (cache_gen)
 		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
 
 	if (!options)
@@ -738,6 +741,10 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 				info->commit_interval = BTRFS_DEFAULT_COMMIT_INTERVAL;
 			}
 			break;
+		case Opt_free_space_tree:
+			btrfs_set_and_info(root, FREE_SPACE_TREE,
+					   "enabling free space tree");
+			break;
 		case Opt_err:
 			btrfs_info(root->fs_info, "unrecognized mount option '%s'", p);
 			ret = -EINVAL;
@@ -747,8 +754,16 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 		}
 	}
 out:
+	if (btrfs_test_opt(root, SPACE_CACHE) &&
+	    btrfs_test_opt(root, FREE_SPACE_TREE)) {
+		btrfs_err(root->fs_info,
+			  "cannot use both free space cache and free space tree");
+		ret = -EINVAL;
+	}
 	if (!ret && btrfs_test_opt(root, SPACE_CACHE))
 		btrfs_info(root->fs_info, "disk space caching is enabled");
+	if (!ret && btrfs_test_opt(root, FREE_SPACE_TREE))
+		btrfs_info(root->fs_info, "using free space tree");
 	kfree(orig);
 	return ret;
 }
@@ -1152,6 +1167,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 		seq_puts(seq, ",discard");
 	if (!(root->fs_info->sb->s_flags & MS_POSIXACL))
 		seq_puts(seq, ",noacl");
+	if (btrfs_test_opt(root, FREE_SPACE_TREE))
+		seq_puts(seq, ",free_space_tree");
 	if (btrfs_test_opt(root, SPACE_CACHE))
 		seq_puts(seq, ",space_cache");
 	else