From patchwork Thu Aug 3 15:43:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guilherme G. Piccoli"
X-Patchwork-Id: 13340268
From: "Guilherme G. Piccoli"
To: linux-btrfs@vger.kernel.org
Cc: clm@fb.com, josef@toxicpanda.com, dsterba@suse.com, linux-fsdevel@vger.kernel.org, kernel@gpiccoli.net, gpiccoli@igalia.com, kernel-dev@igalia.com, anand.jain@oracle.com, david@fromorbit.com, kreijack@libero.it, johns@valvesoftware.com, ludovico.denittis@collabora.com, quwenruo.btrfs@gmx.com, wqu@suse.com, vivek@collabora.com
Subject: [PATCH 1/3] btrfs-progs: Add the single-dev feature (to both mkfs/tune)
Date: Thu, 3 Aug 2023 12:43:39 -0300
Message-ID: <20230803154453.1488248-2-gpiccoli@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230803154453.1488248-1-gpiccoli@igalia.com>
References: <20230803154453.1488248-1-gpiccoli@igalia.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

The single-dev feature allows a device to be mounted even if its fsid
is already present on another device. In other words, this feature
disables RAID modes and metadata_uuid, allowing a single device per
filesystem. Its main goal is to allow filesystems with the same fsid to
be mounted at the same time on one system.

Introduce the feature in both mkfs (-O single-dev) and btrfstune (-s),
syncing the kernel-shared headers as well. The feature is a compat_ro
one; its minimum kernel version is set to v6.5.

Suggested-by: Qu Wenruo
Signed-off-by: Guilherme G. Piccoli
---

Hi folks, thanks in advance for reviews!

Note that I've added the feature to btrfstune as well, but I found docs
online saying this tool is deprecated, so I'm not sure that was the
proper approach.

Also, a design decision: I've skipped the btrfs_register_one_device()
call when mkfs is used with the single-dev feature. Otherwise mkfs
shows a (harmless) error and still succeeds, since scanning fails for
such devices by design of the feature. So I thought it was more
straightforward to skip the call itself.
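To illustrate that last point, here is a minimal standalone sketch (not
part of this patch) of the check mkfs performs before registering
devices. Only the flag value comes from the patch;
register_device_stub() and register_devices() are hypothetical
stand-ins for btrfs_register_one_device() and the loop at the end of
mkfs, and the path_is_block_device() test is left out for brevity:

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the patch: bit 4 of the compat_ro feature mask. */
#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV (1ULL << 4)

/* Hypothetical stand-in for btrfs_register_one_device(); only counts calls. */
static int registered;

static void register_device_stub(const char *path)
{
	(void)path;
	registered++;
}

/*
 * Hypothetical helper mirroring the mkfs loop: every device is registered
 * unless the single-dev compat_ro bit is set, in which case none is.
 * Returns the number of devices that were registered.
 */
static int register_devices(const char *const *paths, int count,
			    uint64_t compat_ro_flags)
{
	bool single_dev = !!(compat_ro_flags & BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV);
	int i;

	registered = 0;
	for (i = 0; i < count; i++) {
		if (!single_dev)
			register_device_stub(paths[i]);
	}

	return registered;
}
```

With the bit clear, every path is passed on for registration; with the
bit set, none is, which is what avoids the failing scan.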
Cheers,

Guilherme

 common/fsfeatures.c        |  7 ++++
 kernel-shared/ctree.h      |  3 +-
 kernel-shared/uapi/btrfs.h |  7 ++++
 mkfs/main.c                |  4 ++-
 tune/main.c                | 72 +++++++++++++++++++++++---------------
 5 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/common/fsfeatures.c b/common/fsfeatures.c
index 00658fa5159f..a320b7062b8c 100644
--- a/common/fsfeatures.c
+++ b/common/fsfeatures.c
@@ -160,6 +160,13 @@ static const struct btrfs_feature mkfs_features[] = {
 		VERSION_NULL(default),
 		.desc = "RAID1 with 3 or 4 copies"
 	},
+	{
+		.name = "single-dev",
+		.compat_ro_flag = BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV,
+		.sysfs_name = "single_dev",
+		VERSION_TO_STRING2(compat, 6,5),
+		.desc = "single device (allows same fsid mounting)"
+	},
 #ifdef BTRFS_ZONED
 	{
 		.name = "zoned",
diff --git a/kernel-shared/ctree.h b/kernel-shared/ctree.h
index 59533879b939..e3fd834aa6dd 100644
--- a/kernel-shared/ctree.h
+++ b/kernel-shared/ctree.h
@@ -86,7 +86,8 @@ static inline u32 __BTRFS_LEAF_DATA_SIZE(u32 nodesize)
 	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE | \
 	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
 	 BTRFS_FEATURE_COMPAT_RO_VERITY | \
-	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
+	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE | \
+	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)

 #if EXPERIMENTAL
 #define BTRFS_FEATURE_INCOMPAT_SUPP \
diff --git a/kernel-shared/uapi/btrfs.h b/kernel-shared/uapi/btrfs.h
index 85b04f89a2a9..2e0ee6ef6446 100644
--- a/kernel-shared/uapi/btrfs.h
+++ b/kernel-shared/uapi/btrfs.h
@@ -336,6 +336,13 @@ _static_assert(sizeof(struct btrfs_ioctl_fs_info_args) == 1024);
  */
 #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE (1ULL << 3)

+/*
+ * Single devices (as flagged by the corresponding compat_ro flag) only
+ * gets scanned during mount time; also, a random fsid is generated for
+ * them, in order to cope with same-fsid filesystem mounts.
+ */
+#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV (1ULL << 4)
+
 #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF (1ULL << 0)
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL (1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS (1ULL << 2)
diff --git a/mkfs/main.c b/mkfs/main.c
index 972ed1112ea6..429799932224 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1025,6 +1025,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	char *label = NULL;
 	int nr_global_roots = sysconf(_SC_NPROCESSORS_ONLN);
 	char *source_dir = NULL;
+	bool single_dev;

 	cpu_detect_flags();
 	hash_init_accel();
@@ -1218,6 +1219,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		usage(&mkfs_cmd, 1);

 	opt_zoned = !!(features.incompat_flags & BTRFS_FEATURE_INCOMPAT_ZONED);
+	single_dev = !!(features.compat_ro_flags & BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV);

 	if (source_dir && device_count > 1) {
 		error("the option -r is limited to a single device");
@@ -1815,7 +1817,7 @@ out:
 		device_count = argc - optind;
 		while (device_count-- > 0) {
 			file = argv[optind++];
-			if (path_is_block_device(file) == 1)
+			if (path_is_block_device(file) == 1 && !single_dev)
 				btrfs_register_one_device(file);
 		}
 	}
diff --git a/tune/main.c b/tune/main.c
index 0ca1e01282c9..95e55fcda44f 100644
--- a/tune/main.c
+++ b/tune/main.c
@@ -42,27 +42,31 @@
 #include "tune/tune.h"
 #include "check/clear-cache.h"

+#define SET_SUPER_FLAGS(type) \
+static int set_super_##type##_flags(struct btrfs_root *root, u64 flags) \
+{ \
+	struct btrfs_trans_handle *trans; \
+	struct btrfs_super_block *disk_super; \
+	u64 super_flags; \
+	int ret; \
+ \
+	disk_super = root->fs_info->super_copy; \
+	super_flags = btrfs_super_##type##_flags(disk_super); \
+	super_flags |= flags; \
+	trans = btrfs_start_transaction(root, 1); \
+	BUG_ON(IS_ERR(trans)); \
+	btrfs_set_super_##type##_flags(disk_super, super_flags); \
+	ret = btrfs_commit_transaction(trans, root); \
+ \
+	return ret; \
+}
+
+SET_SUPER_FLAGS(incompat)
+SET_SUPER_FLAGS(compat_ro)
+
 static char *device;
 static int force = 0;

-static int set_super_incompat_flags(struct btrfs_root *root, u64 flags)
-{
-	struct btrfs_trans_handle *trans;
-	struct btrfs_super_block *disk_super;
-	u64 super_flags;
-	int ret;
-
-	disk_super = root->fs_info->super_copy;
-	super_flags = btrfs_super_incompat_flags(disk_super);
-	super_flags |= flags;
-	trans = btrfs_start_transaction(root, 1);
-	BUG_ON(IS_ERR(trans));
-	btrfs_set_super_incompat_flags(disk_super, super_flags);
-	ret = btrfs_commit_transaction(trans, root);
-
-	return ret;
-}
-
 static int convert_to_fst(struct btrfs_fs_info *fs_info)
 {
 	int ret;
@@ -102,6 +106,7 @@ static const char * const tune_usage[] = {
 	OPTLINE("-r", "enable extended inode refs (mkfs: extref, for hardlink limits)"),
 	OPTLINE("-x", "enable skinny metadata extent refs (mkfs: skinny-metadata)"),
 	OPTLINE("-n", "enable no-holes feature (mkfs: no-holes, more efficient sparse file representation)"),
+	OPTLINE("-s", "enable single device feature (mkfs: single-dev, allows same fsid mounting)"),
 	OPTLINE("-S <0|1>", "set/unset seeding status of a device"),
 	OPTLINE("--convert-to-block-group-tree", "convert filesystem to track block groups in "
 			"the separate block-group-tree instead of extent tree (sets the incompat bit)"),
@@ -146,7 +151,8 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 	int csum_type = -1;
 	char *new_fsid_str = NULL;
 	int ret;
-	u64 super_flags = 0;
+	u64 compat_ro_flags = 0;
+	u64 incompat_flags = 0;
 	int fd = -1;

 	btrfs_config_init();
@@ -169,7 +175,7 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 #endif
 			{ NULL, 0, NULL, 0 }
 		};
-		int c = getopt_long(argc, argv, "S:rxfuU:nmM:", long_options, NULL);
+		int c = getopt_long(argc, argv, "S:rxfuU:nsmM:", long_options, NULL);

 		if (c < 0)
 			break;
@@ -179,13 +185,16 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 			seeding_value = arg_strtou64(optarg);
 			break;
 		case 'r':
-			super_flags |= BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF;
+			incompat_flags |= BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF;
 			break;
 		case 'x':
-			super_flags |= BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA;
+			incompat_flags |= BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA;
 			break;
 		case 'n':
-			super_flags |= BTRFS_FEATURE_INCOMPAT_NO_HOLES;
+			incompat_flags |= BTRFS_FEATURE_INCOMPAT_NO_HOLES;
+			break;
+		case 's':
+			compat_ro_flags |= BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
 			break;
 		case 'f':
 			force = 1;
@@ -239,9 +248,9 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		error("random fsid can't be used with specified fsid");
 		return 1;
 	}
-	if (!super_flags && !seeding_flag && !(random_fsid || new_fsid_str) &&
-	    !change_metadata_uuid && csum_type == -1 && !to_bg_tree &&
-	    !to_extent_tree && !to_fst) {
+	if (!compat_ro_flags && !incompat_flags && !seeding_flag &&
+	    !(random_fsid || new_fsid_str) && !change_metadata_uuid &&
+	    csum_type == -1 && !to_bg_tree && !to_extent_tree && !to_fst) {
 		error("at least one option should be specified");
 		usage(&tune_cmd, 1);
 		return 1;
@@ -363,8 +372,15 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		total++;
 	}

-	if (super_flags) {
-		ret = set_super_incompat_flags(root, super_flags);
+	if (incompat_flags) {
+		ret = set_super_incompat_flags(root, incompat_flags);
+		if (!ret)
+			success++;
+		total++;
+	}
+
+	if (compat_ro_flags) {
+		ret = set_super_compat_ro_flags(root, compat_ro_flags);
 		if (!ret)
 			success++;
 		total++;

From patchwork Thu Aug 3 15:43:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guilherme G. Piccoli"
X-Patchwork-Id: 13340269
From: "Guilherme G. Piccoli"
To: linux-btrfs@vger.kernel.org
Cc: clm@fb.com, josef@toxicpanda.com, dsterba@suse.com, linux-fsdevel@vger.kernel.org, kernel@gpiccoli.net, gpiccoli@igalia.com, kernel-dev@igalia.com, anand.jain@oracle.com, david@fromorbit.com, kreijack@libero.it, johns@valvesoftware.com, ludovico.denittis@collabora.com, quwenruo.btrfs@gmx.com, wqu@suse.com, vivek@collabora.com
Subject: [PATCH 2/3] btrfs: Introduce the single-dev feature
Date: Thu, 3 Aug 2023 12:43:40 -0300
Message-ID: <20230803154453.1488248-3-gpiccoli@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230803154453.1488248-1-gpiccoli@igalia.com>
References: <20230803154453.1488248-1-gpiccoli@igalia.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Btrfs doesn't currently support mounting two different devices holding
the same filesystem, because the fsid is used as a unique identifier in
the driver. Some other common filesystems, like ext4, do support this
case; one reason it is not trivial to support on btrfs is its
multi-device nature (native RAID, etc.).

Supporting same-fsid mounts allows btrfs to be used on A/B partitioned
devices, such as mobile phones or the Steam Deck. Without this support,
it's not safe for users to keep the same "image version" in both A and
B partitions, a setup that is quite common for development. As a big
bonus, it also allows filesystem integrity checks based on block
devices for RO devices (currently both are required to have different
fsids, which breaks the block device hash comparison).

Same-fsid mounting is added here through the "single-dev" filesystem
feature. When this feature is used, btrfs generates a random fsid for
the filesystem and leverages the long-standing metadata_uuid
infrastructure to enable this secondary virtual fsid, which requires
only a few non-invasive code changes and no new potential corner cases.

To avoid more code complexity and corner cases, and given the
single-device nature of this mechanism, the single-dev feature is not
allowed when the metadata_uuid flag is already present on the
filesystem, or when the device is in the fsid-change state. Device
removal/replace is also disabled for devices with the single-dev
feature.

Suggested-by: John Schoenick
Suggested-by: Qu Wenruo
Signed-off-by: Guilherme G. Piccoli
---
 fs/btrfs/disk-io.c         | 19 +++++++-
 fs/btrfs/fs.h              |  3 +-
 fs/btrfs/ioctl.c           | 18 +++++++
 fs/btrfs/super.c           |  8 ++--
 fs/btrfs/volumes.c         | 97 ++++++++++++++++++++++++++++++--------
 fs/btrfs/volumes.h         |  3 +-
 include/uapi/linux/btrfs.h |  7 +++
 7 files changed, 127 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 669b10355091..455fa4949c98 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -320,7 +320,7 @@ static bool check_tree_block_fsid(struct extent_buffer *eb)
 	/*
 	 * alloc_fs_devices() copies the fsid into metadata_uuid if the
 	 * metadata_uuid is unset in the superblock, including for a seed device.
-	 * So, we can use fs_devices->metadata_uuid.
+	 * So, we can use fs_devices->metadata_uuid; same for SINGLE_DEV devices.
 	 */
 	if (!memcmp(fsid, fs_info->fs_devices->metadata_uuid, BTRFS_FSID_SIZE))
 		return false;
@@ -2288,6 +2288,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
 {
 	u64 nodesize = btrfs_super_nodesize(sb);
 	u64 sectorsize = btrfs_super_sectorsize(sb);
+	u8 *fsid;
 	int ret = 0;

 	if (btrfs_super_magic(sb) != BTRFS_MAGIC) {
@@ -2368,7 +2369,21 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
 		ret = -EINVAL;
 	}

-	if (memcmp(fs_info->fs_devices->fsid, sb->fsid, BTRFS_FSID_SIZE)) {
+	/*
+	 * For SINGLE_DEV devices, btrfs creates a random fsid and makes
+	 * use of the metadata_uuid infrastructure in order to allow, for
+	 * example, two devices with same fsid getting mounted at the same
+	 * time. But notice no changes happen at the disk level, so the
+	 * random generated fsid is a driver abstraction, not to be written
+	 * in the disk. That's the reason we're required here to compare the
+	 * fsid with the metadata_uuid for such devices.
+	 */
+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV))
+		fsid = fs_info->fs_devices->metadata_uuid;
+	else
+		fsid = fs_info->fs_devices->fsid;
+
+	if (memcmp(fsid, sb->fsid, BTRFS_FSID_SIZE)) {
 		btrfs_err(fs_info,
 		"superblock fsid doesn't match fsid of fs_devices: %pU != %pU",
 			sb->fsid, fs_info->fs_devices->fsid);
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 203d2a267828..c6d124973361 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -200,7 +200,8 @@ enum {
 	(BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE | \
 	 BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE_VALID | \
 	 BTRFS_FEATURE_COMPAT_RO_VERITY | \
-	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE)
+	 BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE | \
+	 BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)

 #define BTRFS_FEATURE_COMPAT_RO_SAFE_SET 0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SAFE_CLEAR 0ULL
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a895d105464b..56703d87def9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2678,6 +2678,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;

+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+		btrfs_err(fs_info,
+			  "device removal is unsupported on SINGLE_DEV devices\n");
+		return -EINVAL;
+	}
+
 	vol_args = memdup_user(arg, sizeof(*vol_args));
 	if (IS_ERR(vol_args))
 		return PTR_ERR(vol_args);
@@ -2744,6 +2750,12 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;

+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+		btrfs_err(fs_info,
+			  "device removal is unsupported on SINGLE_DEV devices\n");
+		return -EINVAL;
+	}
+
 	vol_args = memdup_user(arg, sizeof(*vol_args));
 	if (IS_ERR(vol_args))
 		return PTR_ERR(vol_args);
@@ -3268,6 +3280,12 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;

+	if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) {
+		btrfs_err(fs_info,
+			  "device removal is unsupported on SINGLE_DEV devices\n");
+		return -EINVAL;
+	}
+
 	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
 		btrfs_err(fs_info, "device replace not supported on extent tree v2 yet");
 		return -EINVAL;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f1dd172d8d5b..ee87189b1ccd 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -883,7 +883,7 @@ static int btrfs_parse_device_options(const char *options, blk_mode_t flags)
 				error = -ENOMEM;
 				goto out;
 			}
-			device = btrfs_scan_one_device(device_name, flags);
+			device = btrfs_scan_one_device(device_name, flags, true);
 			kfree(device_name);
 			if (IS_ERR(device)) {
 				error = PTR_ERR(device);
@@ -1478,7 +1478,7 @@ static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 		goto error_fs_info;
 	}

-	device = btrfs_scan_one_device(device_name, mode);
+	device = btrfs_scan_one_device(device_name, mode, true);
 	if (IS_ERR(device)) {
 		mutex_unlock(&uuid_mutex);
 		error = PTR_ERR(device);
@@ -2190,7 +2190,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
+		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
 		ret = PTR_ERR_OR_ZERO(device);
 		mutex_unlock(&uuid_mutex);
 		break;
@@ -2204,7 +2204,7 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 		break;
 	case BTRFS_IOC_DEVICES_READY:
 		mutex_lock(&uuid_mutex);
-		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ);
+		device = btrfs_scan_one_device(vol->name, BLK_OPEN_READ, false);
 		if (IS_ERR(device)) {
 			mutex_unlock(&uuid_mutex);
 			ret = PTR_ERR(device);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 73753dae111a..433a490f2de8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -681,12 +681,14 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 	return -EINVAL;
 }

-static u8 *btrfs_sb_metadata_uuid_or_null(struct btrfs_super_block *sb)
+static u8 *btrfs_sb_metadata_uuid_single_dev(struct btrfs_super_block *sb,
+					     bool has_metadata_uuid,
+					     bool single_dev)
 {
-	bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
-				  BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
+	if (has_metadata_uuid || single_dev)
+		return sb->metadata_uuid;

-	return has_metadata_uuid ? sb->metadata_uuid : NULL;
+	return NULL;
 }

 u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)
@@ -775,8 +777,36 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(

 	return NULL;
 }
+
+static void prepare_virtual_fsid(struct btrfs_super_block *disk_super,
+				 const char *path)
+{
+	struct btrfs_fs_devices *fs_devices;
+	u8 vfsid[BTRFS_FSID_SIZE];
+	bool dup_fsid = true;
+
+	while (dup_fsid) {
+		dup_fsid = false;
+		generate_random_uuid(vfsid);
+
+		list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+			if (!memcmp(vfsid, fs_devices->fsid, BTRFS_FSID_SIZE) ||
+			    !memcmp(vfsid, fs_devices->metadata_uuid,
+				    BTRFS_FSID_SIZE))
+				dup_fsid = true;
+		}
+	}
+
+	memcpy(disk_super->metadata_uuid, disk_super->fsid, BTRFS_FSID_SIZE);
+	memcpy(disk_super->fsid, vfsid, BTRFS_FSID_SIZE);
+
+	pr_info("BTRFS: virtual fsid (%pU) set for SINGLE_DEV device %s (real fsid %pU)\n",
+		disk_super->fsid, path, disk_super->metadata_uuid);
+}
+
 /*
- * Add new device to list of registered devices
+ * Add new device to list of registered devices, or in case of a SINGLE_DEV
+ * device, also creates a virtual fsid to cope with same-fsid cases.
  *
  * Returns:
  * device pointer which was just added or updated when successful
@@ -784,7 +814,7 @@ static struct btrfs_fs_devices *find_fsid_reverted_metadata(
  */
 static noinline struct btrfs_device *device_list_add(const char *path,
 			   struct btrfs_super_block *disk_super,
-			   bool *new_device_added)
+			   bool *new_device_added, bool single_dev)
 {
 	struct btrfs_device *device;
 	struct btrfs_fs_devices *fs_devices = NULL;
@@ -805,23 +835,32 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 		return ERR_PTR(error);
 	}

-	if (fsid_change_in_progress) {
-		if (!has_metadata_uuid)
-			fs_devices = find_fsid_inprogress(disk_super);
-		else
-			fs_devices = find_fsid_changed(disk_super);
-	} else if (has_metadata_uuid) {
-		fs_devices = find_fsid_with_metadata_uuid(disk_super);
+	if (single_dev) {
+		if (has_metadata_uuid || fsid_change_in_progress) {
+			btrfs_err(NULL,
+		"SINGLE_DEV devices don't support the metadata_uuid feature\n");
+			return ERR_PTR(-EINVAL);
+		}
+		prepare_virtual_fsid(disk_super, path);
 	} else {
-		fs_devices = find_fsid_reverted_metadata(disk_super);
-		if (!fs_devices)
-			fs_devices = find_fsid(disk_super->fsid, NULL);
+		if (fsid_change_in_progress) {
+			if (!has_metadata_uuid)
+				fs_devices = find_fsid_inprogress(disk_super);
+			else
+				fs_devices = find_fsid_changed(disk_super);
+		} else if (has_metadata_uuid) {
+			fs_devices = find_fsid_with_metadata_uuid(disk_super);
+		} else {
+			fs_devices = find_fsid_reverted_metadata(disk_super);
+			if (!fs_devices)
+				fs_devices = find_fsid(disk_super->fsid, NULL);
+		}
 	}

-
 	if (!fs_devices) {
 		fs_devices = alloc_fs_devices(disk_super->fsid,
-				btrfs_sb_metadata_uuid_or_null(disk_super));
+				btrfs_sb_metadata_uuid_single_dev(disk_super,
+					has_metadata_uuid, single_dev));
 		if (IS_ERR(fs_devices))
 			return ERR_CAST(fs_devices);

@@ -1365,13 +1404,15 @@ int btrfs_forget_devices(dev_t devt)
  * and we are not allowed to call set_blocksize during the scan. The superblock
  * is read via pagecache
  */
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
+struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
+					   bool mounting)
 {
 	struct btrfs_super_block *disk_super;
 	bool new_device_added = false;
 	struct btrfs_device *device = NULL;
 	struct block_device *bdev;
 	u64 bytenr, bytenr_orig;
+	bool single_dev;
 	int ret;

 	lockdep_assert_held(&uuid_mutex);
@@ -1410,7 +1451,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags)
 		goto error_bdev_put;
 	}

-	device = device_list_add(path, disk_super, &new_device_added);
+	single_dev = btrfs_super_compat_ro_flags(disk_super) &
+			BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
+
+	if (!mounting && single_dev) {
+		pr_info("BTRFS: skipped non-mount scan on SINGLE_DEV device %s\n",
+			path);
+		btrfs_release_disk_super(disk_super);
+		return ERR_PTR(-EINVAL);
+	}
+
+	device = device_list_add(path, disk_super, &new_device_added, single_dev);
 	if (!IS_ERR(device) && new_device_added)
 		btrfs_free_stale_devices(device->devt, device);

@@ -2406,6 +2457,12 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,

 	args->devid = btrfs_stack_device_id(&disk_super->dev_item);
 	memcpy(args->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE);
+
+	/*
+	 * Note that SINGLE_DEV devices are not handled in a special way here;
+	 * device removal/replace is instead forbidden when such feature is
+	 * present, this note is for future users/readers of this function.
+	 */
 	if (btrfs_fs_incompat(fs_info, METADATA_UUID))
 		memcpy(args->fsid, disk_super->metadata_uuid, BTRFS_FSID_SIZE);
 	else
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 0f87057bb575..b9856c801567 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -611,7 +611,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
 void btrfs_mapping_tree_free(struct extent_map_tree *tree);
 int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		       blk_mode_t flags, void *holder);
-struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags);
+struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
+					   bool mounting);
 int btrfs_forget_devices(dev_t devt);
 void btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
 void btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices);
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index dbb8b96da50d..cb7a7cfe1ea9 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -313,6 +313,13 @@ struct btrfs_ioctl_fs_info_args {
  */
 #define BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE (1ULL << 3)

+/*
+ * Single devices (as flagged by the corresponding compat_ro flag) only
+ * gets scanned during mount time; also, a random fsid is generated for
+ * them, in order to cope with same-fsid filesystem mounts.
+ */
+#define BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV (1ULL << 4)
+
 #define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF (1ULL << 0)
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL (1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS (1ULL << 2)

From patchwork Thu Aug 3 15:43:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guilherme G. Piccoli"
X-Patchwork-Id: 13340270
From: "Guilherme G. Piccoli"
To: linux-btrfs@vger.kernel.org
Cc: clm@fb.com, josef@toxicpanda.com, dsterba@suse.com, linux-fsdevel@vger.kernel.org, kernel@gpiccoli.net, gpiccoli@igalia.com, kernel-dev@igalia.com, anand.jain@oracle.com, david@fromorbit.com, kreijack@libero.it, johns@valvesoftware.com, ludovico.denittis@collabora.com, quwenruo.btrfs@gmx.com, wqu@suse.com, vivek@collabora.com
Subject: [PATCH 3/3] btrfs: Add parameter to force devices behave as single-dev ones
Date: Thu, 3 Aug 2023 12:43:41 -0300
Message-ID: <20230803154453.1488248-4-gpiccoli@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230803154453.1488248-1-gpiccoli@igalia.com>
References: <20230803154453.1488248-1-gpiccoli@igalia.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Devices with the single-dev feature enabled in their superblock can be
mounted regardless of their fsid already being present in the system.
The goal of the feature is to have the device in a single-device mode
with no advanced features, like RAID; it is a compat_ro feature present
since kernel v6.5.

However, the feature comes in the form of a superblock flag, so devices
that don't have it set can't use it. The Steam Deck console aims to
have block-based updates of its RO rootfs and, given its A/B partition
scheme, both block devices must be identical for their hashes to match,
so it's not possible to compare two images if one has this feature set
in the superblock while the other does not. So if we end up with two
old images, we can't use the single-dev feature to mount both at the
same time; and if we set the flag in one of them to enable the feature,
we break the block-based hash comparison.

We propose here a module parameter that forces any given path (to a
device holding a btrfs filesystem) to behave as a single-dev device.
That would be useful for cases like the Steam Deck one, or for debug
purposes.
If the filesystem already has the compat_ro flag set in its superblock, the parameter is no-op. Signed-off-by: Guilherme G. Piccoli --- fs/btrfs/disk-io.c | 2 +- fs/btrfs/ioctl.c | 6 +++--- fs/btrfs/super.c | 5 +++++ fs/btrfs/super.h | 2 ++ fs/btrfs/volumes.c | 45 ++++++++++++++++++++++++++++++++++++++++++--- fs/btrfs/volumes.h | 2 ++ 6 files changed, 55 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 455fa4949c98..8df1defa1ede 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2378,7 +2378,7 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info, * in the disk. That's the reason we're required here to compare the * fsid with the metadata_uuid for such devices. */ - if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) + if (fs_info->fs_devices->single_dev) fsid = fs_info->fs_devices->metadata_uuid; else fsid = fs_info->fs_devices->fsid; diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 56703d87def9..4fc63e802b08 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2678,7 +2678,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) if (!capable(CAP_SYS_ADMIN)) return -EPERM; - if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) { + if (fs_info->fs_devices->single_dev) { btrfs_err(fs_info, "device removal is unsupported on SINGLE_DEV devices\n"); return -EINVAL; @@ -2750,7 +2750,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) if (!capable(CAP_SYS_ADMIN)) return -EPERM; - if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) { + if (fs_info->fs_devices->single_dev) { btrfs_err(fs_info, "device removal is unsupported on SINGLE_DEV devices\n"); return -EINVAL; @@ -3280,7 +3280,7 @@ static long btrfs_ioctl_dev_replace(struct btrfs_fs_info *fs_info, if (!capable(CAP_SYS_ADMIN)) return -EPERM; - if (btrfs_fs_compat_ro(fs_info, SINGLE_DEV)) { + if (fs_info->fs_devices->single_dev) { btrfs_err(fs_info, "device removal is unsupported on SINGLE_DEV devices\n"); return -EINVAL; diff --git 
a/fs/btrfs/super.c b/fs/btrfs/super.c
index ee87189b1ccd..3cfc9c63360f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -62,6 +62,11 @@
 #define CREATE_TRACE_POINTS
 #include
 
+char *force_single_dev;
+module_param(force_single_dev, charp, 0444);
+MODULE_PARM_DESC(force_single_dev,
+	"User list of devices to force acting as single-dev (comma separated)");
+
 static const struct super_operations btrfs_super_ops;
 
 /*
diff --git a/fs/btrfs/super.h b/fs/btrfs/super.h
index 8dbb909b364f..c855127600c8 100644
--- a/fs/btrfs/super.h
+++ b/fs/btrfs/super.h
@@ -3,6 +3,8 @@
 #ifndef BTRFS_SUPER_H
 #define BTRFS_SUPER_H
 
+extern char *force_single_dev;
+
 int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 			unsigned long new_flags);
 int btrfs_sync_fs(struct super_block *sb, int wait);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 433a490f2de8..06c5bad77bdf 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include "misc.h"
 #include "ctree.h"
 #include "extent_map.h"
@@ -865,6 +866,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 			return ERR_CAST(fs_devices);
 
 		fs_devices->fsid_change = fsid_change_in_progress;
+		fs_devices->single_dev = single_dev;
 
 		mutex_lock(&fs_devices->device_list_mutex);
 		list_add(&fs_devices->fs_list, &fs_uuids);
@@ -1399,6 +1401,45 @@ int btrfs_forget_devices(dev_t devt)
 	return ret;
 }
 
+/*
+ * SINGLE_DEV is a compat_ro feature, but we also have the force_single_dev
+ * module parameter in order to allow forcing a device to behave as single-dev,
+ * so old filesystems could also get mounted in a same-fsid mounting way.
+ */
+
+static bool is_single_dev(const char *path, struct btrfs_super_block *sb)
+{
+
+	if (btrfs_super_compat_ro_flags(sb) & BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV)
+		return true;
+
+	if (force_single_dev) {
+		char *p, *skip_devs, *orig;
+
+		skip_devs = kstrdup(force_single_dev, GFP_KERNEL);
+		if (!skip_devs) {
+			pr_err("BTRFS: couldn't parse force_single_dev parameter\n");
+			return false;
+		}
+
+		orig = skip_devs;
+		while ((p = strsep(&skip_devs, ",")) != NULL) {
+			if (!*p)
+				continue;
+
+			if (!strcmp(p, path)) {
+				pr_info(
+				"BTRFS: forcing device %s to be single-dev\n", path);
+				kfree(orig);
+				return true;
+			}
+		}
+		kfree(orig);
+	}
+
+	return false;
+}
+
 /*
  * Look for a btrfs signature on a device. This may be called out of the mount path
  * and we are not allowed to call set_blocksize during the scan. The superblock
@@ -1451,9 +1492,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 		goto error_bdev_put;
 	}
 
-	single_dev = btrfs_super_compat_ro_flags(disk_super) &
-			BTRFS_FEATURE_COMPAT_RO_SINGLE_DEV;
-
+	single_dev = is_single_dev(path, disk_super);
 	if (!mounting && single_dev) {
 		pr_info("BTRFS: skipped non-mount scan on SINGLE_DEV device %s\n",
 			path);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index b9856c801567..57a3969f101c 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -293,6 +293,8 @@ struct btrfs_fs_devices {
 	 */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
 
+	bool single_dev;
+
 	struct list_head fs_list;
 
 	/*
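
For readers who want to experiment with the list matching outside the kernel, the lookup that is_single_dev() performs on the force_single_dev string can be approximated in user space. This is only a rough sketch under stated assumptions: path_in_list() is a hypothetical name that does not appear in the patch, and strdup()/free() stand in for kstrdup()/kfree().

```c
#define _DEFAULT_SOURCE /* exposes strsep() and strdup() on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the list lookup in is_single_dev():
 * returns true when `path` is an exact entry of the comma-separated `list`.
 */
bool path_in_list(const char *list, const char *path)
{
	char *copy, *cursor, *entry;
	bool found = false;

	assert(path != NULL);

	if (!list)
		return false;

	/* strsep() modifies its input, so work on a private copy. */
	copy = strdup(list);
	if (!copy)
		return false;

	cursor = copy;
	while ((entry = strsep(&cursor, ",")) != NULL) {
		if (!*entry)	/* skip empty entries such as ",," */
			continue;
		if (!strcmp(entry, path)) {
			found = true;
			break;
		}
	}

	free(copy);
	return found;
}
```

With the list "/dev/sda1,,/dev/nvme0n1p4", this sketch returns true for /dev/sda1 and /dev/nvme0n1p4 but false for /dev/sda, since entries are compared as whole strings. In this sketch a different spelling of the same device (a symlink, for instance) would not match either.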