From patchwork Thu Apr 28 15:02:15 2022
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12830862
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 1/4] btrfs: zoned: consolidate zone finish function
Date: Fri, 29 Apr 2022 00:02:15 +0900
Message-Id: <4d5e42d343318979a254f7dbdd96aa1c48908ed8.1651157034.git.naohiro.aota@wdc.com>

btrfs_zone_finish() and btrfs_zone_finish_endio() have similar code.
Introduce __btrfs_zone_finish() to consolidate them.
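As a rough illustration of the shape of that consolidation (a standalone userspace toy with made-up names; it is not the kernel code and skips all locking): one worker takes a "nowait" flag, the blocking caller waits for outstanding writers first, and the end_io caller must not sleep and therefore skips the waiting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Toy model of the consolidation pattern (illustrative only; the
 * struct and function names are made up, not the btrfs ones).
 */
struct toy_bg {
	bool active;	/* models block_group->zone_is_active */
	int reserved;	/* models block_group->reserved */
	int finished;	/* set when the zone would be finished */
};

static int toy_zone_finish(struct toy_bg *bg, bool nowait)
{
	if (!bg->active)
		return 0;		/* someone already deactivated it */
	if (bg->reserved)
		return -EAGAIN;		/* writers still hold reservations */
	if (!nowait) {
		/* Blocking path: would wait for reservations and
		 * ordered extents here before deactivating. */
	}
	bg->active = false;
	bg->finished = 1;		/* stands in for REQ_OP_ZONE_FINISH */
	return 0;
}

/* The two former entry points become thin wrappers. */
static int toy_finish_blocking(struct toy_bg *bg)
{
	return toy_zone_finish(bg, false);
}

static int toy_finish_endio(struct toy_bg *bg)
{
	return toy_zone_finish(bg, true);
}
```

The real patch follows the same shape: both callers funnel into __btrfs_zone_finish(), and only the blocking caller performs the wait-and-mark-read-only steps.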
Signed-off-by: Naohiro Aota
---
 fs/btrfs/zoned.c | 127 ++++++++++++++++++++---------------------------
 1 file changed, 54 insertions(+), 73 deletions(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 997a96d7a3d5..9cddafe78fb1 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1879,20 +1879,14 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 	return ret;
 }
 
-int btrfs_zone_finish(struct btrfs_block_group *block_group)
+static int __btrfs_zone_finish(struct btrfs_block_group *block_group, bool nowait)
 {
 	struct btrfs_fs_info *fs_info = block_group->fs_info;
 	struct map_lookup *map;
-	struct btrfs_device *device;
-	u64 physical;
+	bool need_zone_finish;
 	int ret = 0;
 	int i;
 
-	if (!btrfs_is_zoned(fs_info))
-		return 0;
-
-	map = block_group->physical_map;
-
 	spin_lock(&block_group->lock);
 	if (!block_group->zone_is_active) {
 		spin_unlock(&block_group->lock);
@@ -1906,36 +1900,42 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
 		spin_unlock(&block_group->lock);
 		return -EAGAIN;
 	}
-	spin_unlock(&block_group->lock);
 
-	ret = btrfs_inc_block_group_ro(block_group, false);
-	if (ret)
-		return ret;
+	if (!nowait) {
+		spin_unlock(&block_group->lock);
 
-	/* Ensure all writes in this block group finish */
-	btrfs_wait_block_group_reservations(block_group);
-	/* No need to wait for NOCOW writers. Zoned mode does not allow that. */
-	btrfs_wait_ordered_roots(fs_info, U64_MAX, block_group->start,
-				 block_group->length);
+		ret = btrfs_inc_block_group_ro(block_group, false);
+		if (ret)
+			return ret;
 
-	spin_lock(&block_group->lock);
+		/* Ensure all writes in this block group finish */
+		btrfs_wait_block_group_reservations(block_group);
+		/* No need to wait for NOCOW writers. Zoned mode does not allow that. */
+		btrfs_wait_ordered_roots(fs_info, U64_MAX, block_group->start,
+					 block_group->length);
 
-	/*
-	 * Bail out if someone already deactivated the block group, or
-	 * allocated space is left in the block group.
-	 */
-	if (!block_group->zone_is_active) {
-		spin_unlock(&block_group->lock);
-		btrfs_dec_block_group_ro(block_group);
-		return 0;
-	}
+		spin_lock(&block_group->lock);
 
-	if (block_group->reserved) {
-		spin_unlock(&block_group->lock);
-		btrfs_dec_block_group_ro(block_group);
-		return -EAGAIN;
+		/*
+		 * Bail out if someone already deactivated the block group, or
+		 * allocated space is left in the block group.
+		 */
+		if (!block_group->zone_is_active) {
+			spin_unlock(&block_group->lock);
+			btrfs_dec_block_group_ro(block_group);
+			return 0;
+		}
+
+		if (block_group->reserved) {
+			spin_unlock(&block_group->lock);
+			btrfs_dec_block_group_ro(block_group);
+			return -EAGAIN;
+		}
 	}
 
+	/* There is unwritten space left. Need to finish the underlying zones. */
+	need_zone_finish = (block_group->zone_capacity - block_group->alloc_offset) > 0;
+
 	block_group->zone_is_active = 0;
 	block_group->alloc_offset = block_group->zone_capacity;
 	block_group->free_space_ctl->free_space = 0;
@@ -1943,24 +1943,29 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
 	btrfs_clear_data_reloc_bg(block_group);
 	spin_unlock(&block_group->lock);
 
+	map = block_group->physical_map;
 	for (i = 0; i < map->num_stripes; i++) {
-		device = map->stripes[i].dev;
-		physical = map->stripes[i].physical;
+		struct btrfs_device *device = map->stripes[i].dev;
+		const u64 physical = map->stripes[i].physical;
 
 		if (device->zone_info->max_active_zones == 0)
 			continue;
 
-		ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
-				       physical >> SECTOR_SHIFT,
-				       device->zone_info->zone_size >> SECTOR_SHIFT,
-				       GFP_NOFS);
+		if (need_zone_finish) {
+			ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
+					       physical >> SECTOR_SHIFT,
+					       device->zone_info->zone_size >> SECTOR_SHIFT,
+					       GFP_NOFS);
 
-		if (ret)
-			return ret;
+			if (ret)
+				return ret;
+		}
 
 		btrfs_dev_clear_active_zone(device, physical);
 	}
-	btrfs_dec_block_group_ro(block_group);
+
+	if (!nowait)
+		btrfs_dec_block_group_ro(block_group);
 
 	spin_lock(&fs_info->zone_active_bgs_lock);
 	ASSERT(!list_empty(&block_group->active_bg_list));
@@ -1973,6 +1978,14 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
 	return 0;
 }
 
+int btrfs_zone_finish(struct btrfs_block_group *block_group)
+{
+	if (!btrfs_is_zoned(block_group->fs_info))
+		return 0;
+
+	return __btrfs_zone_finish(block_group, false);
+}
+
 bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 {
 	struct btrfs_fs_info *fs_info = fs_devices->fs_info;
@@ -2004,9 +2017,6 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 length)
 {
 	struct btrfs_block_group *block_group;
-	struct map_lookup *map;
-	struct btrfs_device *device;
-	u64 physical;
 
 	if (!btrfs_is_zoned(fs_info))
 		return;
@@ -2017,36 +2027,7 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
 	if (logical + length < block_group->start + block_group->zone_capacity)
 		goto out;
 
-	spin_lock(&block_group->lock);
-
-	if (!block_group->zone_is_active) {
-		spin_unlock(&block_group->lock);
-		goto out;
-	}
-
-	block_group->zone_is_active = 0;
-	/* We should have consumed all the free space */
-	ASSERT(block_group->alloc_offset == block_group->zone_capacity);
-	ASSERT(block_group->free_space_ctl->free_space == 0);
-	btrfs_clear_treelog_bg(block_group);
-	btrfs_clear_data_reloc_bg(block_group);
-	spin_unlock(&block_group->lock);
-
-	map = block_group->physical_map;
-	device = map->stripes[0].dev;
-	physical = map->stripes[0].physical;
-
-	if (!device->zone_info->max_active_zones)
-		goto out;
-
-	btrfs_dev_clear_active_zone(device, physical);
-
-	spin_lock(&fs_info->zone_active_bgs_lock);
-	ASSERT(!list_empty(&block_group->active_bg_list));
-	list_del_init(&block_group->active_bg_list);
-	spin_unlock(&fs_info->zone_active_bgs_lock);
-
-	btrfs_put_block_group(block_group);
+	__btrfs_zone_finish(block_group, true);
 
 out:
 	btrfs_put_block_group(block_group);

From patchwork Thu Apr 28 15:02:16 2022
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12830863
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 2/4] btrfs: zoned: finish BG when there are no more allocatable bytes left
Date: Fri, 29 Apr 2022 00:02:16 +0900
Message-Id: <42758829d8696a471a27f7aaeab5468f60b1565d.1651157034.git.naohiro.aota@wdc.com>

Currently, btrfs_zone_finish_endio() finishes a block group only when the
written region reaches the end of the block group. We can also finish the
block group when no more allocation is possible.
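The new condition reduces to simple arithmetic: the block group can be finished once the space left after this write is smaller than the minimum allocation unit (sectorsize for data block groups, nodesize for metadata; zoned btrfs has no mixed block groups). A minimal sketch of that check (hypothetical helper name; the constants below are chosen for the example, not taken from any real filesystem):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative check, mirroring the patch's condition
 *   logical + length + min_use > start + zone_capacity
 * i.e. finish the block group when not even one more minimum-sized
 * allocation would fit after the write that just completed.
 */
static bool can_finish_bg(uint64_t write_end, uint64_t bg_start,
			  uint64_t zone_capacity, uint64_t min_use)
{
	return write_end + min_use > bg_start + zone_capacity;
}
```

With a 64 KiB capacity and 4 KiB sectors, a write ending exactly one sector short of capacity does not finish the group (one more sector still fits), while any write ending past that point does.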
Cc: stable@vger.kernel.org # 5.16+
Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
Signed-off-by: Naohiro Aota
Reviewed-by: Pankaj Raghav
---
 fs/btrfs/zoned.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 9cddafe78fb1..0f6ca3587c3b 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2017,6 +2017,7 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 length)
 {
 	struct btrfs_block_group *block_group;
+	u64 min_use;
 
 	if (!btrfs_is_zoned(fs_info))
 		return;
@@ -2024,7 +2025,14 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
 	block_group = btrfs_lookup_block_group(fs_info, logical);
 	ASSERT(block_group);
 
-	if (logical + length < block_group->start + block_group->zone_capacity)
+	/* No MIXED BG on zoned btrfs. */
+	if (block_group->flags & BTRFS_BLOCK_GROUP_DATA)
+		min_use = fs_info->sectorsize;
+	else
+		min_use = fs_info->nodesize;
+
+	/* Bail out if we can allocate more data from this BG. */
+	if (logical + length + min_use <= block_group->start + block_group->zone_capacity)
 		goto out;
 
 	__btrfs_zone_finish(block_group, true);

From patchwork Thu Apr 28 15:02:17 2022
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12830865
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 3/4] btrfs: zoned: properly finish block group on metadata write
Date: Fri, 29 Apr 2022 00:02:17 +0900

Commit be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
introduced zone finishing code for both the data and metadata end_io
paths. However, the metadata side does not work as it should. First, it
compares a logical address (eb->start + eb->len) with an offset within a
block group (cache->zone_capacity) in submit_eb_page(). That essentially
disabled zone finishing on the metadata end_io path.

Furthermore, fixing the issue above revealed that we cannot call
btrfs_zone_finish_endio() in end_extent_buffer_writeback(): it calls
btrfs_lookup_block_group(), which requires taking a spin lock, and that
is not allowed in the end_io context.
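The standard way out of such a restriction is to push the blocking part to a worker context. A single-threaded userspace model of that defer-to-workqueue pattern (toy names, a plain array in place of a real workqueue; illustrative only):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the defer-to-workqueue pattern: the "end_io" side may not
 * block, so it only enqueues a work item; a worker later runs the
 * items in a context where blocking (waiting for writeback, taking
 * locks) is allowed.
 */
typedef void (*work_fn)(void *);

#define MAX_WORK 8

struct work_item {
	work_fn fn;
	void *arg;
};

static struct work_item work_queue[MAX_WORK];
static size_t queued;

/* "end_io" context: just record the work, never run it here. */
static void toy_queue_work(work_fn fn, void *arg)
{
	assert(queued < MAX_WORK);
	work_queue[queued].fn = fn;
	work_queue[queued].arg = arg;
	queued++;
}

/* Worker context: safe to block here. */
static void toy_run_worker(void)
{
	for (size_t i = 0; i < queued; i++)
		work_queue[i].fn(work_queue[i].arg);
	queued = 0;
}

static void toy_zone_finish_workfn(void *arg)
{
	int *done = arg;
	/* In the kernel this is where the worker would wait for the
	 * extent buffer writeback and then finish the zone. */
	*done = 1;
}
```

In the kernel the queueing side is INIT_WORK() plus queue_work() on system_unbound_wq, and the work function is free to sleep.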
This commit introduces btrfs_schedule_zone_finish_bg() to wait for the
extent buffer writeback and do the zone finish IO in a workqueue.

Also, drop EXTENT_BUFFER_ZONE_FINISH as it is no longer used.

Cc: stable@vger.kernel.org # 5.16+
Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
Signed-off-by: Naohiro Aota
---
 fs/btrfs/block-group.h |  2 ++
 fs/btrfs/extent_io.c   |  6 +-----
 fs/btrfs/extent_io.h   |  1 -
 fs/btrfs/zoned.c       | 34 ++++++++++++++++++++++++++++++++++
 fs/btrfs/zoned.h       |  5 +++++
 5 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index c9bf01dd10e8..3ac668ace50a 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -212,6 +212,8 @@ struct btrfs_block_group {
 	u64 meta_write_pointer;
 	struct map_lookup *physical_map;
 	struct list_head active_bg_list;
+	struct work_struct zone_finish_work;
+	struct extent_buffer *last_eb;
 };
 
 static inline u64 btrfs_block_group_end(struct btrfs_block_group *block_group)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f9d6dd310c42..4778067bc0fa 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4259,9 +4259,6 @@ void wait_on_extent_buffer_writeback(struct extent_buffer *eb)
 
 static void end_extent_buffer_writeback(struct extent_buffer *eb)
 {
-	if (test_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags))
-		btrfs_zone_finish_endio(eb->fs_info, eb->start, eb->len);
-
 	clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
 	smp_mb__after_atomic();
 	wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
@@ -4851,8 +4848,7 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
 		/*
 		 * Implies write in zoned mode. Mark the last eb in a block group.
 		 */
-		if (cache->seq_zone && eb->start + eb->len == cache->zone_capacity)
-			set_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags);
+		btrfs_schedule_zone_finish_bg(cache, eb);
 		btrfs_put_block_group(cache);
 	}
 	ret = write_one_eb(eb, wbc, epd);
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index b390ec79f9a8..89ebb7338d6f 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -32,7 +32,6 @@ enum {
 	/* write IO error */
 	EXTENT_BUFFER_WRITE_ERR,
 	EXTENT_BUFFER_NO_CHECK,
-	EXTENT_BUFFER_ZONE_FINISH,
 };
 
 /* these are flags for __process_pages_contig */
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 0f6ca3587c3b..afad085a589a 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2041,6 +2041,40 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
 	btrfs_put_block_group(block_group);
 }
 
+static void btrfs_zone_finish_endio_workfn(struct work_struct *work)
+{
+	struct btrfs_block_group *bg =
+		container_of(work, struct btrfs_block_group, zone_finish_work);
+
+	wait_on_extent_buffer_writeback(bg->last_eb);
+	free_extent_buffer(bg->last_eb);
+	btrfs_zone_finish_endio(bg->fs_info, bg->start, bg->length);
+	btrfs_put_block_group(bg);
+}
+
+void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+				   struct extent_buffer *eb)
+{
+	if (!bg->seq_zone ||
+	    eb->start + eb->len * 2 <= bg->start + bg->zone_capacity)
+		return;
+
+	if (WARN_ON(bg->zone_finish_work.func ==
+		    btrfs_zone_finish_endio_workfn)) {
+		btrfs_err(bg->fs_info,
+			  "double scheduling of BG %llu zone finishing",
+			  bg->start);
+		return;
+	}
+
+	/* For the work */
+	btrfs_get_block_group(bg);
+	atomic_inc(&eb->refs);
+	bg->last_eb = eb;
+	INIT_WORK(&bg->zone_finish_work, btrfs_zone_finish_endio_workfn);
+	queue_work(system_unbound_wq, &bg->zone_finish_work);
+}
+
 void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg)
 {
 	struct btrfs_fs_info *fs_info = bg->fs_info;
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index de923fc8449d..10f31d1c8b0c 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -72,6 +72,8 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group);
 bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags);
 void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 length);
+void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+				   struct extent_buffer *eb);
 void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
 void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
 bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info);
@@ -230,6 +232,9 @@ static inline bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices,
 static inline void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info,
 					   u64 logical, u64 length) { }
 
+static inline void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+						 struct extent_buffer *eb) { }
+
 static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
 
 static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }

From patchwork Thu Apr 28 15:02:18 2022
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12830864
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 4/4] btrfs: zoned: zone finish unused block group
Date: Fri, 29 Apr 2022 00:02:18 +0900
Message-Id: <1572dbb7eb23266429fe61dd3c764d70c116942c.1651157034.git.naohiro.aota@wdc.com>

While the active zones within an active block group are reset and their
active resource is released, the block group itself is kept in the active
block group list and marked as active. As a result, the list can contain
more than max_active_zones block groups. That is not fatal for the device,
since the zones are properly reset, but the inflated list is clearly
wrong. Also, an upcoming patch series, which deactivates an active block
group on demand, gets confused by the stale list.

So, fix the issue by finishing the unused block group once it gets
read-only, so that we can release the active resource in an early stage.

Cc: stable@vger.kernel.org # 5.16+
Fixes: be1a1d7a5d24 ("btrfs: zoned: finish fully written block group")
Signed-off-by: Naohiro Aota
---
 fs/btrfs/block-group.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 9739f3e8230a..ede389f2602d 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1385,6 +1385,14 @@ void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
 			goto next;
 		}
 
+		ret = btrfs_zone_finish(block_group);
+		if (ret < 0) {
+			btrfs_dec_block_group_ro(block_group);
+			if (ret == -EAGAIN)
+				ret = 0;
+			goto next;
+		}
+
 		/*
 		 * Want to do this before we do anything else so we can recover
 		 * properly if we fail to join the transaction.
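The error handling added above follows a common shape: a negative return from btrfs_zone_finish() undoes the read-only marking, but -EAGAIN only means "still busy, revisit this block group on a later pass" and is not propagated as a failure. A minimal sketch of that shape (hypothetical helper name, not kernel code):

```c
#include <assert.h>
#include <errno.h>

/*
 * Illustrative result handling: -EAGAIN is swallowed (the unused
 * block group will simply be retried by a later scan), while any
 * other error is propagated to the caller.
 */
static int handle_zone_finish_result(int ret)
{
	if (ret < 0) {
		/* here the caller also drops the ro marking */
		if (ret == -EAGAIN)
			return 0;	/* not fatal: try again later */
		return ret;		/* real error: propagate */
	}
	return 0;
}
```

This mirrors how the deletion loop treats a busy group: it is skipped for now rather than aborting the whole unused-block-group scan.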