From patchwork Thu May 28 18:34:48 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Goffredo Baroncelli
X-Patchwork-Id: 11576487
From: Goffredo Baroncelli
To: linux-btrfs@vger.kernel.org
Cc: Michael, Hugo Mills, Martin Svec, Wang Yugui, Paul Jones,
 Adam Borowski, Zygo Blaxell, Goffredo
 Baroncelli
Subject: [PATCH 1/4] Add an ioctl to set/retrieve the device properties
Date: Thu, 28 May 2020 20:34:48 +0200
Message-Id: <20200528183451.16654-2-kreijack@libero.it>
X-Mailer: git-send-email 2.27.0.rc2
In-Reply-To: <20200528183451.16654-1-kreijack@libero.it>
References: <20200528183451.16654-1-kreijack@libero.it>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli
---
 fs/btrfs/ioctl.c           | 67 ++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c         |  2 +-
 fs/btrfs/volumes.h         |  2 ++
 include/uapi/linux/btrfs.h | 40 +++++++++++++++++++++++
 4 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 40b729dce91c..cba3fa942e2f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4724,6 +4724,71 @@ static int btrfs_ioctl_set_features(struct file *file, void __user *arg)
 	return ret;
 }
 
+static long btrfs_ioctl_dev_properties(struct file *file,
+				       void __user *argp)
+{
+	struct inode *inode = file_inode(file);
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct btrfs_ioctl_dev_properties dev_props;
+	struct btrfs_device *device;
+	struct btrfs_root *root = fs_info->chunk_root;
+	struct btrfs_trans_handle *trans;
+	int ret;
+	u64 prev_type;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (copy_from_user(&dev_props, argp, sizeof(dev_props)))
+		return -EFAULT;
+
+	device = btrfs_find_device(fs_info->fs_devices, dev_props.devid,
+				   NULL, NULL, false);
+	if (!device) {
+		btrfs_info(fs_info, "change_dev_properties: unable to find device %llu",
+			   dev_props.devid);
+		return -ENODEV;
+	}
+
+	if (dev_props.properties & BTRFS_DEV_PROPERTY_READ) {
+		u64 props = dev_props.properties;
+		memset(&dev_props, 0, sizeof(dev_props));
+		if (props & BTRFS_DEV_PROPERTY_TYPE) {
+			dev_props.properties = BTRFS_DEV_PROPERTY_TYPE;
+			dev_props.type = device->type;
+		}
+		if (copy_to_user(argp, &dev_props, sizeof(dev_props)))
+			return -EFAULT;
+		return 0;
+	}
+
+	/* it is possible to set only BTRFS_DEV_PROPERTY_TYPE for now */
+	if (dev_props.properties & ~(BTRFS_DEV_PROPERTY_TYPE))
+		return -EPERM;
+
+	trans = btrfs_start_transaction(root, 0);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	prev_type = device->type;
+	device->type = dev_props.type;
+	ret = btrfs_update_device(trans, device);
+
+	if (ret < 0) {
+		btrfs_abort_transaction(trans, ret);
+		btrfs_end_transaction(trans);
+		device->type = prev_type;
+		return ret;
+	}
+
+	ret = btrfs_commit_transaction(trans);
+	if (ret < 0)
+		device->type = prev_type;
+
+	return ret;
+
+}
+
 static int _btrfs_ioctl_send(struct file *file, void __user *argp, bool compat)
 {
 	struct btrfs_ioctl_send_args *arg;
@@ -4907,6 +4972,8 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_get_subvol_rootref(file, argp);
 	case BTRFS_IOC_INO_LOOKUP_USER:
 		return btrfs_ioctl_ino_lookup_user(file, argp);
+	case BTRFS_IOC_DEV_PROPERTIES:
+		return btrfs_ioctl_dev_properties(file, argp);
 	}
 
 	return -ENOTTY;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index be1e047a489e..5265f54c2931 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2710,7 +2710,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	return ret;
 }
 
-static noinline int btrfs_update_device(struct btrfs_trans_handle *trans,
+int btrfs_update_device(struct btrfs_trans_handle *trans,
 					struct btrfs_device *device)
 {
 	int ret;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index f067b5934c46..0ac5bf2b95e6 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -577,5 +577,7 @@ bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info,
 int btrfs_bg_type_to_factor(u64 flags);
 const char *btrfs_bg_type_to_raid_name(u64 flags);
 int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info);
+int btrfs_update_device(struct btrfs_trans_handle *trans,
+					struct btrfs_device *device);
 
 #endif
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index e6b6cb0f8bc6..bb096075677d 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -842,6 +842,44 @@ struct btrfs_ioctl_get_subvol_rootref_args {
 		__u8 align[7];
 };
 
+#define BTRFS_DEV_PROPERTY_TYPE		(1ULL << 0)
+#define BTRFS_DEV_PROPERTY_DEV_GROUP	(1ULL << 1)
+#define BTRFS_DEV_PROPERTY_SEEK_SPEED	(1ULL << 2)
+#define BTRFS_DEV_PROPERTY_BANDWIDTH	(1ULL << 3)
+#define BTRFS_DEV_PROPERTY_READ		(1ULL << 60)
+
+/*
+ * The ioctl BTRFS_IOC_DEV_PROPERTIES can read and write the device properties.
+ *
+ * The properties that the user wants to write have to be set
+ * in the 'properties' field using the BTRFS_DEV_PROPERTY_xxxx constants.
+ *
+ * If the ioctl is used to read the device properties, the bit
+ * BTRFS_DEV_PROPERTY_READ has to be set in the 'properties' field.
+ * In this case the properties that the user wants have to be set in the
+ * 'properties' field. The kernel doesn't return a property that was not
+ * requested; however, it may return a subset of the requested properties.
+ * The returned properties have the corresponding BTRFS_DEV_PROPERTY_xxxx
+ * flag set in the 'properties' field.
+ *
+ * As of 2020/05/11, the only property that can be read or written is the
+ * 'type' one.
+ */
+struct btrfs_ioctl_dev_properties {
+	__u64	devid;
+	__u64	properties;
+	__u64	type;
+	__u32	dev_group;
+	__u8	seek_speed;
+	__u8	bandwidth;
+
+	/*
+	 * for future expansion
+	 */
+	__u8	unused1[2];
+	__u64	unused2[4];
+};
+
 /* Error codes as returned by the kernel */
 enum btrfs_err_code {
 	BTRFS_ERROR_DEV_RAID1_MIN_NOT_MET = 1,
@@ -970,5 +1008,7 @@ enum btrfs_err_code {
 				struct btrfs_ioctl_ino_lookup_user_args)
 #define BTRFS_IOC_SNAP_DESTROY_V2 _IOW(BTRFS_IOCTL_MAGIC, 63, \
 				struct btrfs_ioctl_vol_args_v2)
+#define BTRFS_IOC_DEV_PROPERTIES _IOW(BTRFS_IOCTL_MAGIC, 64, \
+				struct btrfs_ioctl_dev_properties)
 
 #endif /* _UAPI_LINUX_BTRFS_H */

From patchwork Thu May 28 18:34:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Goffredo Baroncelli
X-Patchwork-Id: 11576481
From: Goffredo Baroncelli
To: linux-btrfs@vger.kernel.org
Cc: Michael, Hugo Mills, Martin Svec, Wang Yugui, Paul Jones,
 Adam Borowski, Zygo Blaxell, Goffredo Baroncelli
Subject: [PATCH 2/4] Add flags for dedicated metadata disks
Date: Thu, 28 May 2020 20:34:49 +0200
Message-Id: <20200528183451.16654-3-kreijack@libero.it>
X-Mailer: git-send-email 2.27.0.rc2
In-Reply-To: <20200528183451.16654-1-kreijack@libero.it>
References: <20200528183451.16654-1-kreijack@libero.it>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli
---
 include/uapi/linux/btrfs_tree.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 8e322e2c7e78..a45d09591db8 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -355,6 +355,9 @@ struct btrfs_key {
 	__u64 offset;
 } __attribute__ ((__packed__));
 
+/* dev_item.type */
+#define BTRFS_DEV_PREFERRED_METADATA	(1ULL << 0)
+
 struct btrfs_dev_item {
 	/* the internal btrfs device id */
 	__le64 devid;

From patchwork Thu May 28 18:34:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Goffredo Baroncelli
X-Patchwork-Id: 11576483
From: Goffredo Baroncelli
To: linux-btrfs@vger.kernel.org
Cc: Michael, Hugo Mills, Martin Svec, Wang Yugui, Paul Jones,
 Adam Borowski, Zygo Blaxell, Goffredo Baroncelli
Subject: [PATCH 3/4] Export dev_item.type in sysfs /sys/fs/btrfs//devinfo//type
Date: Thu, 28 May 2020 20:34:50 +0200
Message-Id: <20200528183451.16654-4-kreijack@libero.it>
X-Mailer: git-send-email 2.27.0.rc2
In-Reply-To: <20200528183451.16654-1-kreijack@libero.it>
References: <20200528183451.16654-1-kreijack@libero.it>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Goffredo Baroncelli

Signed-off-by: Goffredo Baroncelli
---
 fs/btrfs/sysfs.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index a39bff64ff24..c189fd7f9afd 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1244,11 +1244,22 @@ static ssize_t btrfs_devinfo_writeable_show(struct kobject *kobj,
 }
 BTRFS_ATTR(devid, writeable, btrfs_devinfo_writeable_show);
 
+static ssize_t btrfs_devinfo_type_show(struct kobject *kobj,
+				       struct kobj_attribute *a, char *buf)
+{
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", device->type);
+}
+BTRFS_ATTR(devid, type, btrfs_devinfo_type_show);
+
 static struct attribute *devid_attrs[] = {
 	BTRFS_ATTR_PTR(devid, in_fs_metadata),
 	BTRFS_ATTR_PTR(devid, missing),
 	BTRFS_ATTR_PTR(devid, replace_target),
 	BTRFS_ATTR_PTR(devid, writeable),
+	BTRFS_ATTR_PTR(devid, type),
 	NULL
 };
 ATTRIBUTE_GROUPS(devid);

From patchwork Thu May 28 18:34:51 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Goffredo Baroncelli
X-Patchwork-Id: 11576485
From: Goffredo Baroncelli
To:
 linux-btrfs@vger.kernel.org
Cc: Michael, Hugo Mills, Martin Svec, Wang Yugui, Paul Jones,
 Adam Borowski, Zygo Blaxell, Goffredo Baroncelli
Subject: [PATCH 4/4] btrfs: add preferred_metadata mode
Date: Thu, 28 May 2020 20:34:51 +0200
Message-Id: <20200528183451.16654-5-kreijack@libero.it>
X-Mailer: git-send-email 2.27.0.rc2
In-Reply-To: <20200528183451.16654-1-kreijack@libero.it>
References: <20200528183451.16654-1-kreijack@libero.it>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Goffredo Baroncelli

When this mode is enabled, the chunk allocation policy is modified as
follows:
- allocation of a metadata chunk: priority is given to preferred_metadata
  disks.
- allocation of a data chunk: priority is given to non preferred_metadata
  disks.

When a striped profile is involved (like RAID0,5,6), the logic
is a bit more complex. If there are enough disks, the data profiles
are stored on the non preferred_metadata disks, while the metadata
profiles are stored on the preferred_metadata disks.
If there are not enough disks of the wanted kind, the profile is
allocated on all the disks.

Example: assume that sda, sdb, sdc are SSDs marked as preferred_metadata,
and that sde, sdf are non preferred_metadata disks.
A raid6 data profile will be stored on sda, sdb, sdc, sde, sdf (sde
and sdf alone are not enough to host a raid6 profile).
A raid6 metadata profile will be stored on sda, sdb, sdc (these
are enough to host a raid6 profile).

To enable this mode pass -o preferred_metadata at mount time.

Signed-off-by: Goffredo Baroncelli
---
 fs/btrfs/ctree.h   |  1 +
 fs/btrfs/super.c   |  8 +++++
 fs/btrfs/volumes.c | 89 ++++++++++++++++++++++++++++++++++++++++++++--
 fs/btrfs/volumes.h |  1 +
 4 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 03ea7370aea7..779760fd27b1 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1239,6 +1239,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
 #define BTRFS_MOUNT_NOLOGREPLAY		(1 << 27)
 #define BTRFS_MOUNT_REF_VERIFY		(1 << 28)
 #define BTRFS_MOUNT_DISCARD_ASYNC	(1 << 29)
+#define BTRFS_MOUNT_PREFERRED_METADATA	(1 << 30)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(2048)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 438ecba26557..80700dc9dcf8 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -359,6 +359,7 @@ enum {
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
 	Opt_ref_verify,
 #endif
+	Opt_preferred_metadata,
 	Opt_err,
 };
 
@@ -430,6 +431,7 @@ static const match_table_t tokens = {
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
 	{Opt_ref_verify, "ref_verify"},
 #endif
+	{Opt_preferred_metadata, "preferred_metadata"},
 	{Opt_err, NULL},
 };
 
@@ -881,6 +883,10 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 			btrfs_set_opt(info->mount_opt, REF_VERIFY);
 			break;
 #endif
+		case Opt_preferred_metadata:
+			btrfs_set_and_info(info, PREFERRED_METADATA,
+					"enabling preferred_metadata");
+			break;
 		case Opt_err:
 			btrfs_err(info, "unrecognized mount option '%s'", p);
 			ret = -EINVAL;
@@ -1403,6 +1409,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 #endif
 	if (btrfs_test_opt(info, REF_VERIFY))
 		seq_puts(seq, ",ref_verify");
+	if (btrfs_test_opt(info, PREFERRED_METADATA))
+		seq_puts(seq, ",preferred_metadata");
 	seq_printf(seq, ",subvolid=%llu",
 		  BTRFS_I(d_inode(dentry))->root->root_key.objectid);
 	seq_puts(seq, ",subvol=");
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5265f54c2931..c68efb15e473 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4770,6 +4770,56 @@ static int btrfs_cmp_device_info(const void *a, const void *b)
 	return 0;
 }
 
+/*
+ * sort the devices in descending order by preferred_metadata,
+ * max_avail, total_avail
+ */
+static int btrfs_cmp_device_info_metadata(const void *a, const void *b)
+{
+	const struct btrfs_device_info *di_a = a;
+	const struct btrfs_device_info *di_b = b;
+
+	/* metadata -> preferred_metadata first */
+	if (di_a->preferred_metadata && !di_b->preferred_metadata)
+		return -1;
+	if (!di_a->preferred_metadata && di_b->preferred_metadata)
+		return 1;
+	if (di_a->max_avail > di_b->max_avail)
+		return -1;
+	if (di_a->max_avail < di_b->max_avail)
+		return 1;
+	if (di_a->total_avail > di_b->total_avail)
+		return -1;
+	if (di_a->total_avail < di_b->total_avail)
+		return 1;
+	return 0;
+}
+
+/*
+ * sort the devices in descending order by !preferred_metadata,
+ * max_avail, total_avail
+ */
+static int btrfs_cmp_device_info_data(const void *a, const void *b)
+{
+	const struct btrfs_device_info *di_a = a;
+	const struct btrfs_device_info *di_b = b;
+
+	/* data -> preferred_metadata last */
+	if (di_a->preferred_metadata && !di_b->preferred_metadata)
+		return 1;
+	if (!di_a->preferred_metadata && di_b->preferred_metadata)
+		return -1;
+	if (di_a->max_avail > di_b->max_avail)
+		return -1;
+	if (di_a->max_avail < di_b->max_avail)
+		return 1;
+	if (di_a->total_avail > di_b->total_avail)
+		return -1;
+	if (di_a->total_avail < di_b->total_avail)
+		return 1;
+	return 0;
+}
+
 static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type)
 {
 	if (!(type & BTRFS_BLOCK_GROUP_RAID56_MASK))
@@ -4885,6 +4935,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 	int ndevs = 0;
 	u64 max_avail;
 	u64 dev_offset;
+	int nr_preferred_metadata = 0;
 
 	/*
 	 * in the first pass through the devices list, we gather information
@@ -4937,15 +4988,49 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 		devices_info[ndevs].max_avail = max_avail;
 		devices_info[ndevs].total_avail = total_avail;
 		devices_info[ndevs].dev = device;
+		devices_info[ndevs].preferred_metadata = !!(device->type &
+			BTRFS_DEV_PREFERRED_METADATA);
+		if (devices_info[ndevs].preferred_metadata)
+			nr_preferred_metadata++;
 		++ndevs;
 	}
 	ctl->ndevs = ndevs;
 
+	BUG_ON(nr_preferred_metadata > ndevs);
 	/*
 	 * now sort the devices by hole size / available space
 	 */
-	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
-	     btrfs_cmp_device_info, NULL);
+	if (((ctl->type & BTRFS_BLOCK_GROUP_DATA) &&
+	     (ctl->type & BTRFS_BLOCK_GROUP_METADATA)) ||
+	    !btrfs_test_opt(info, PREFERRED_METADATA)) {
+		/* mixed bg or PREFERRED_METADATA not set */
+		sort(devices_info, ctl->ndevs, sizeof(struct btrfs_device_info),
+		     btrfs_cmp_device_info, NULL);
+	} else {
+		/*
+		 * if PREFERRED_METADATA is set, sort the devices considering
+		 * also the kind (preferred_metadata or not). Limit the
+		 * available devices to the ones of the same kind, to prevent
+		 * a striped profile, like raid5, from spreading to all kinds
+		 * of devices.
+		 * It is allowed to use different kinds of devices if the ones
+		 * of the same kind are not enough alone.
+		 */
+		if (ctl->type & BTRFS_BLOCK_GROUP_DATA) {
+			int nr_data = ctl->ndevs - nr_preferred_metadata;
+			sort(devices_info, ctl->ndevs,
+			     sizeof(struct btrfs_device_info),
+			     btrfs_cmp_device_info_data, NULL);
+			if (nr_data >= ctl->devs_min)
+				ctl->ndevs = nr_data;
+		} else { /* non data -> metadata and system */
+			sort(devices_info, ctl->ndevs,
+			     sizeof(struct btrfs_device_info),
+			     btrfs_cmp_device_info_metadata, NULL);
+			if (nr_preferred_metadata >= ctl->devs_min)
+				ctl->ndevs = nr_preferred_metadata;
+		}
+	}
 
 	return 0;
 }
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 0ac5bf2b95e6..d39c3b0e7569 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -347,6 +347,7 @@ struct btrfs_device_info {
 	u64 dev_offset;
 	u64 max_avail;
 	u64 total_avail;
+	unsigned int preferred_metadata:1;
 };
 
 struct btrfs_raid_attr {
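
For readers who want to experiment with the selection policy of patch 4/4 outside the kernel, the following is a minimal user-space sketch of it, not the kernel code itself. `struct dev_info`, `cmp_avail()`, `cmp_metadata()`, `cmp_data()` and `select_devices()` are hypothetical names invented for this illustration; `qsort()` stands in for the kernel's `sort()`, and the caller passes `devs_min` explicitly (3 for raid6 is an assumption about the profile's minimum device count).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct btrfs_device_info. */
struct dev_info {
	const char *name;
	bool preferred_metadata;
	uint64_t max_avail;
	uint64_t total_avail;
};

/* Tie-break shared by both orderings: more free space sorts first. */
static int cmp_avail(const struct dev_info *a, const struct dev_info *b)
{
	if (a->max_avail != b->max_avail)
		return a->max_avail > b->max_avail ? -1 : 1;
	if (a->total_avail != b->total_avail)
		return a->total_avail > b->total_avail ? -1 : 1;
	return 0;
}

/* Metadata chunks: preferred_metadata devices sort first. */
static int cmp_metadata(const void *pa, const void *pb)
{
	const struct dev_info *a = pa, *b = pb;

	if (a->preferred_metadata != b->preferred_metadata)
		return a->preferred_metadata ? -1 : 1;
	return cmp_avail(a, b);
}

/* Data chunks: preferred_metadata devices sort last. */
static int cmp_data(const void *pa, const void *pb)
{
	const struct dev_info *a = pa, *b = pb;

	if (a->preferred_metadata != b->preferred_metadata)
		return a->preferred_metadata ? 1 : -1;
	return cmp_avail(a, b);
}

/*
 * Sort devs[] for the requested chunk kind and return how many leading
 * entries may be used. The candidates are limited to devices of the
 * wanted kind only when those alone satisfy devs_min; otherwise every
 * device stays eligible.
 */
static size_t select_devices(struct dev_info *devs, size_t ndevs,
			     bool is_data, size_t devs_min)
{
	size_t nr_preferred = 0, nr_same_kind;

	for (size_t i = 0; i < ndevs; i++)
		nr_preferred += devs[i].preferred_metadata;

	qsort(devs, ndevs, sizeof(*devs), is_data ? cmp_data : cmp_metadata);
	nr_same_kind = is_data ? ndevs - nr_preferred : nr_preferred;
	return nr_same_kind >= devs_min ? nr_same_kind : ndevs;
}
```

With the five disks of the commit message example (sda, sdb, sdc preferred; sde, sdf not) and `devs_min` of 3, a data request keeps all five candidates, while a metadata request is limited to the three preferred disks. A data request with `devs_min` of 2 is limited to sde and sdf.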