From patchwork Tue Mar 28 05:19:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 13190501
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88C5EC761A6 for ; Tue, 28 Mar 2023 05:20:15 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231587AbjC1FUN (ORCPT ); Tue, 28 Mar 2023 01:20:13 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230514AbjC1FUK (ORCPT ); Tue, 28 Mar 2023 01:20:10 -0400
Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F31851BE8 for ; Mon, 27 Mar 2023 22:20:08 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=ZOXIf1zqB3w5elnjWJQrLgeGUSS8aYRV+ngK6an9JJQ=; b=eiRsgzT6/xZLxNyTRyGkfYCu2C Eo8kjIToJSqJwdzlA4i9bfsughS0Qqh/pIYnJNtrVXM/JL4KMPh7OPDwyLUv5NeSwjJFSs3f0th43 pp3AimW5CaVmeC1dyWzuzb7ao4zxFtgwAC2z5qwmfzYYqc20uZBXmEgB7lvtLODKurZmPXRd/0GF1 t51HssfHeyi9ka8fERVw3WMG8bkMDuZxGUmA3X9ssr79ECYadmLHnpSjJHYmtWechNQlVvB7jwvne iKeVJzGwMKWf8YnCZbs1hneYUCcVUvTSB4Nw78hZGWSYWN7uVUrPZXiH7owiUKo6jFEZhuJ9H0E5i d55tDcvg==;
Received: from [182.171.77.115] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1ph1kZ-00DATi-1P; Tue, 28 Mar 2023 05:20:03 +0000
From: Christoph Hellwig
To: Chris Mason, Josef Bacik, David Sterba
Cc: Boris Burkov, Johannes Thumshirn, Naohiro Aota,
	linux-btrfs@vger.kernel.org, Filipe Manana, Naohiro Aota,
	Johannes Thumshirn
Subject: [PATCH 01/11] btrfs: add function to create and return an ordered extent
Date: Tue, 28 Mar 2023 14:19:47 +0900
Message-Id: <20230328051957.1161316-2-hch@lst.de>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230328051957.1161316-1-hch@lst.de>
References: <20230328051957.1161316-1-hch@lst.de>
MIME-Version: 1.0
X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Boris Burkov

Currently, btrfs_add_ordered_extent allocates a new ordered extent, adds
it to the rb_tree, but doesn't return a referenced pointer to the
caller. There are cases where it is useful for the creator of a new
ordered_extent to hang on to such a pointer, so add a new function
btrfs_alloc_ordered_extent which is the same as
btrfs_add_ordered_extent, except it takes an additional reference count
and returns a pointer to the ordered_extent. Implement
btrfs_add_ordered_extent as btrfs_alloc_ordered_extent followed by
dropping the new reference and handling the IS_ERR case.

The type of flags in btrfs_alloc_ordered_extent and
btrfs_add_ordered_extent is changed from unsigned int to unsigned long
so it's unified with the other ordered extent functions.
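The reference counting described above can be illustrated with a small
userspace sketch. This is not the kernel code: the model_* names, the
one-slot "tree" and the simplified ERR_PTR helpers are assumptions made
up for illustration. It only shows the intended contract, namely that
the allocating variant returns with two references (tree plus caller)
and the adding variant drops the caller's reference again.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, stripped-down model of an ordered extent. */
struct model_ordered_extent {
	int refs;
};

/* Hypothetical stand-in for the inode's ordered tree: a single slot. */
static struct model_ordered_extent *model_tree_slot;

/* Drop one reference and free the object when the last one goes away. */
static void model_put(struct model_ordered_extent *oe)
{
	if (--oe->refs == 0)
		free(oe);
}

/* Models btrfs_alloc_ordered_extent(): one ref for the tree, one for the caller. */
static struct model_ordered_extent *model_alloc(int fail)
{
	struct model_ordered_extent *oe;

	if (fail)
		return ERR_PTR(-ENOMEM);
	oe = calloc(1, sizeof(*oe));
	if (!oe)
		return ERR_PTR(-ENOMEM);
	oe->refs = 1;		/* reference owned by the tree */
	model_tree_slot = oe;
	oe->refs++;		/* second reference for the returned pointer */
	return oe;
}

/* Models btrfs_add_ordered_extent(): same, but the caller ref is dropped. */
static int model_add(int fail)
{
	struct model_ordered_extent *oe = model_alloc(fail);

	if (IS_ERR(oe))
		return PTR_ERR(oe);
	model_put(oe);
	return 0;
}
```

A caller of model_alloc() must pair it with model_put() once it is done
with the pointer, whereas model_add() leaves only the tree's reference
behind and reports failure as a plain negative errno.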
Reviewed-by: Filipe Manana
Reviewed-by: Christoph Hellwig
Reviewed-by: Naohiro Aota
Signed-off-by: Boris Burkov
Signed-off-by: David Sterba
Signed-off-by: Christoph Hellwig
Tested-by: Johannes Thumshirn
---
 fs/btrfs/ordered-data.c | 46 +++++++++++++++++++++++++++++++++--------
 fs/btrfs/ordered-data.h |  5 +++++
 2 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 6c24b69e2d0a37..83a51c692406ab 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -160,14 +160,16 @@ static inline struct rb_node *tree_search(struct btrfs_ordered_inode_tree *tree,
  * @compress_type: Compression algorithm used for data.
  *
  * Most of these parameters correspond to &struct btrfs_file_extent_item. The
- * tree is given a single reference on the ordered extent that was inserted.
+ * tree is given a single reference on the ordered extent that was inserted, and
+ * the returned pointer is given a second reference.
  *
- * Return: 0 or -ENOMEM.
+ * Return: the new ordered extent or ERR_PTR(-ENOMEM).
  */
-int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
-			     u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
-			     u64 disk_num_bytes, u64 offset, unsigned flags,
-			     int compress_type)
+struct btrfs_ordered_extent *btrfs_alloc_ordered_extent(
+			struct btrfs_inode *inode, u64 file_offset,
+			u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
+			u64 disk_num_bytes, u64 offset, unsigned long flags,
+			int compress_type)
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
@@ -181,7 +183,7 @@ int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
 		/* For nocow write, we can release the qgroup rsv right now */
 		ret = btrfs_qgroup_free_data(inode, NULL, file_offset, num_bytes);
 		if (ret < 0)
-			return ret;
+			return ERR_PTR(ret);
 		ret = 0;
 	} else {
 		/*
@@ -190,11 +192,11 @@ int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
 		 */
 		ret = btrfs_qgroup_release_data(inode, file_offset, num_bytes);
 		if (ret < 0)
-			return ret;
+			return ERR_PTR(ret);
 	}
 	entry = kmem_cache_zalloc(btrfs_ordered_extent_cache, GFP_NOFS);
 	if (!entry)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	entry->file_offset = file_offset;
 	entry->num_bytes = num_bytes;
@@ -256,6 +258,32 @@ int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
 	btrfs_mod_outstanding_extents(inode, 1);
 	spin_unlock(&inode->lock);
 
+	/* One ref for the returned entry to match semantics of lookup. */
+	refcount_inc(&entry->refs);
+
+	return entry;
+}
+
+/*
+ * Add a new btrfs_ordered_extent for the range, but drop the reference instead
+ * of returning it to the caller.
+ */
+int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
+			     u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
+			     u64 disk_num_bytes, u64 offset, unsigned flags,
+			     int compress_type)
+{
+	struct btrfs_ordered_extent *ordered;
+
+	ordered = btrfs_alloc_ordered_extent(inode, file_offset, num_bytes,
+					     ram_bytes, disk_bytenr,
+					     disk_num_bytes, offset, flags,
+					     compress_type);
+
+	if (IS_ERR(ordered))
+		return PTR_ERR(ordered);
+	btrfs_put_ordered_extent(ordered);
+
 	return 0;
 }
 
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index eb40cb39f842e6..c00a5a3f060fa2 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -178,6 +178,11 @@ void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode,
 bool btrfs_dec_test_ordered_pending(struct btrfs_inode *inode,
 				    struct btrfs_ordered_extent **cached,
 				    u64 file_offset, u64 io_size);
+struct btrfs_ordered_extent *btrfs_alloc_ordered_extent(
+			struct btrfs_inode *inode, u64 file_offset,
+			u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
+			u64 disk_num_bytes, u64 offset, unsigned long flags,
+			int compress_type);
 int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
 			     u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
 			     u64 disk_num_bytes, u64 offset, unsigned flags,

From patchwork Tue Mar 28 05:19:48 2023
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 13190500
From: Christoph Hellwig
To: Chris Mason, Josef Bacik, David Sterba
Cc: Boris Burkov, Johannes Thumshirn, Naohiro Aota, linux-btrfs@vger.kernel.org, Naohiro Aota, Johannes Thumshirn
Subject: [PATCH 02/11] btrfs: pass flags as unsigned long to btrfs_add_ordered_extent
Date: Tue, 28 Mar 2023 14:19:48 +0900
Message-Id: <20230328051957.1161316-3-hch@lst.de>
In-Reply-To: <20230328051957.1161316-1-hch@lst.de>
References: <20230328051957.1161316-1-hch@lst.de>
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Boris Burkov

The ordered_extent flags are declared as unsigned long, so pass them as
such to btrfs_add_ordered_extent.
Signed-off-by: Boris Burkov
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig
Reviewed-by: Naohiro Aota
Tested-by: Johannes Thumshirn
---
 fs/btrfs/ordered-data.c | 2 +-
 fs/btrfs/ordered-data.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 83a51c692406ab..1848d0d1a9c41e 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -270,7 +270,7 @@ struct btrfs_ordered_extent *btrfs_alloc_ordered_extent(
  */
 int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
 			     u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
-			     u64 disk_num_bytes, u64 offset, unsigned flags,
+			     u64 disk_num_bytes, u64 offset, unsigned long flags,
 			     int compress_type)
 {
 	struct btrfs_ordered_extent *ordered;
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index c00a5a3f060fa2..18007f9c00add8 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -185,7 +185,7 @@ struct btrfs_ordered_extent *btrfs_alloc_ordered_extent(
 			int compress_type);
 int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset,
 			     u64 num_bytes, u64 ram_bytes, u64 disk_bytenr,
-			     u64 disk_num_bytes, u64 offset, unsigned flags,
+			     u64 disk_num_bytes, u64 offset, unsigned long flags,
 			     int compress_type);
 void btrfs_add_ordered_sum(struct btrfs_ordered_extent *entry,
 			   struct btrfs_ordered_sum *sum);

From patchwork Tue Mar 28 05:19:49 2023
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 13190502
From: Christoph Hellwig
To: Chris Mason, Josef Bacik, David Sterba
Cc: Boris Burkov, Johannes Thumshirn, Naohiro Aota, linux-btrfs@vger.kernel.org, Naohiro Aota, Johannes Thumshirn
Subject: [PATCH 03/11] btrfs: stash ordered extent in dio_data during iomap dio
Date: Tue, 28 Mar 2023 14:19:49 +0900
Message-Id: <20230328051957.1161316-4-hch@lst.de>
In-Reply-To: <20230328051957.1161316-1-hch@lst.de>
References: <20230328051957.1161316-1-hch@lst.de>
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Boris Burkov

While it is not feasible for an ordered extent to survive across the
calls btrfs_direct_write makes into __iomap_dio_rw, it is still helpful
to stash it on the dio_data in between creating it in iomap_begin and
finishing it in either end_io or iomap_end.

The specific use I have in mind is that we can check if a particular bio
is partial in submit_io without unconditionally looking up the ordered
extent. This is a preparatory patch for a later patch which does just
that.

Signed-off-by: Boris Burkov
Signed-off-by: Christoph Hellwig
Reviewed-by: Naohiro Aota
Tested-by: Johannes Thumshirn
---
 fs/btrfs/inode.c | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 865d56ff2ce150..1441fe89a208d9 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -81,6 +81,7 @@ struct btrfs_dio_data {
 	struct extent_changeset *data_reserved;
 	bool data_space_reserved;
 	bool nocow_done;
+	struct btrfs_ordered_extent *ordered;
 };
 
 struct btrfs_dio_private {
@@ -6965,6 +6966,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 }
 
 static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode,
+						  struct btrfs_dio_data *dio_data,
 						  const u64 start,
 						  const u64 len,
 						  const u64 orig_start,
@@ -6975,7 +6977,7 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode,
 						  const int type)
 {
 	struct extent_map *em = NULL;
-	int ret;
+	struct btrfs_ordered_extent *ordered;
 
 	if (type != BTRFS_ORDERED_NOCOW) {
 		em = create_io_em(inode, start, len, orig_start, block_start,
@@ -6985,18 +6987,21 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode,
 		if (IS_ERR(em))
 			goto out;
 	}
-	ret = btrfs_add_ordered_extent(inode, start, len, len, block_start,
-				       block_len, 0,
-				       (1 << type) |
-				       (1 << BTRFS_ORDERED_DIRECT),
-				       BTRFS_COMPRESS_NONE);
-	if (ret) {
+	ordered = btrfs_alloc_ordered_extent(inode, start, len, len,
+					     block_start, block_len, 0,
+					     (1 << type) |
+					     (1 << BTRFS_ORDERED_DIRECT),
+					     BTRFS_COMPRESS_NONE);
+	if (IS_ERR(ordered)) {
 		if (em) {
 			free_extent_map(em);
 			btrfs_drop_extent_map_range(inode, start,
 						    start + len - 1, false);
 		}
-		em = ERR_PTR(ret);
+		em = ERR_CAST(ordered);
+	} else {
+		ASSERT(!dio_data->ordered);
+		dio_data->ordered = ordered;
 	}
 
 out:
@@ -7004,6 +7009,7 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode,
 }
 
 static struct extent_map *btrfs_new_extent_direct(struct btrfs_inode *inode,
+						  struct btrfs_dio_data *dio_data,
 						  u64 start, u64 len)
 {
 	struct btrfs_root *root = inode->root;
@@ -7019,7 +7025,8 @@ static struct extent_map *btrfs_new_extent_direct(struct btrfs_inode *inode,
 	if (ret)
 		return ERR_PTR(ret);
 
-	em = btrfs_create_dio_extent(inode, start, ins.offset, start,
+	em = btrfs_create_dio_extent(inode, dio_data,
+				     start, ins.offset, start,
 				     ins.objectid, ins.offset, ins.offset,
 				     ins.offset, BTRFS_ORDERED_REGULAR);
 	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
@@ -7364,7 +7371,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
 		}
 		space_reserved = true;
 
-		em2 = btrfs_create_dio_extent(BTRFS_I(inode), start, len,
+		em2 = btrfs_create_dio_extent(BTRFS_I(inode), dio_data, start, len,
 					      orig_start, block_start,
 					      len, orig_block_len,
 					      ram_bytes, type);
@@ -7406,7 +7413,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
 			goto out;
 		space_reserved = true;
 
-		em = btrfs_new_extent_direct(BTRFS_I(inode), start, len);
+		em = btrfs_new_extent_direct(BTRFS_I(inode), dio_data, start, len);
 		if (IS_ERR(em)) {
 			ret = PTR_ERR(em);
 			goto out;
@@ -7712,6 +7719,10 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length,
 					       pos + length - 1, NULL);
 		ret = -ENOTBLK;
 	}
+	if (write) {
+		btrfs_put_ordered_extent(dio_data->ordered);
+		dio_data->ordered = NULL;
+	}
 
 	if (write)
 		extent_changeset_free(dio_data->data_reserved);
@@ -7773,7 +7784,7 @@ static const struct iomap_dio_ops btrfs_dio_ops = {
 
 ssize_t btrfs_dio_read(struct kiocb *iocb, struct iov_iter *iter, size_t done_before)
 {
-	struct btrfs_dio_data data;
+	struct btrfs_dio_data data = { 0 };
 
 	return iomap_dio_rw(iocb, iter, &btrfs_dio_iomap_ops, &btrfs_dio_ops,
 			    IOMAP_DIO_PARTIAL, &data, done_before);
@@ -7782,7 +7793,7 @@ ssize_t btrfs_dio_read(struct kiocb *iocb, struct iov_iter *iter, size_t done_be
 struct iomap_dio *btrfs_dio_write(struct kiocb *iocb, struct iov_iter *iter,
 				  size_t done_before)
 {
-	struct btrfs_dio_data data;
+	struct btrfs_dio_data data = { 0 };
 
 	return __iomap_dio_rw(iocb, iter, &btrfs_dio_iomap_ops, &btrfs_dio_ops,
 			    IOMAP_DIO_PARTIAL, &data, done_before);

From patchwork Tue Mar 28 05:19:50 2023
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 13190503
From: Christoph Hellwig
To: Chris Mason, Josef Bacik, David Sterba
Cc: Boris Burkov, Johannes Thumshirn, Naohiro Aota, linux-btrfs@vger.kernel.org, Johannes Thumshirn
Subject: [PATCH 04/11] btrfs: move ordered_extent internal sanity checks into btrfs_split_ordered_extent
Date: Tue, 28 Mar 2023 14:19:50 +0900
Message-Id: <20230328051957.1161316-5-hch@lst.de>
In-Reply-To: <20230328051957.1161316-1-hch@lst.de>
References: <20230328051957.1161316-1-hch@lst.de>
X-Mailing-List: linux-btrfs@vger.kernel.org

Move the three checks that are about ordered extent internal sanity
checking into btrfs_split_ordered_extent instead of doing them in the
higher level btrfs_extract_ordered_extent routine.
Signed-off-by: Christoph Hellwig
Tested-by: Johannes Thumshirn
---
 fs/btrfs/inode.c        | 18 ------------------
 fs/btrfs/ordered-data.c | 10 ++++++++++
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1441fe89a208d9..5013d1b0b00e29 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2646,18 +2646,6 @@ blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio)
 	if (ordered->disk_num_bytes == len)
 		goto out;
 
-	/* We cannot split once end_bio'd ordered extent */
-	if (WARN_ON_ONCE(ordered->bytes_left != ordered->disk_num_bytes)) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	/* We cannot split a compressed ordered extent */
-	if (WARN_ON_ONCE(ordered->disk_num_bytes != ordered->num_bytes)) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	ordered_end = ordered->disk_bytenr + ordered->disk_num_bytes;
 	/* bio must be in one ordered extent */
 	if (WARN_ON_ONCE(start < ordered->disk_bytenr || end > ordered_end)) {
@@ -2665,12 +2653,6 @@ blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio)
 		goto out;
 	}
 
-	/* Checksum list should be empty */
-	if (WARN_ON_ONCE(!list_empty(&ordered->list))) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	file_len = ordered->num_bytes;
 	pre = start - ordered->disk_bytenr;
 	post = ordered_end - end;
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 1848d0d1a9c41e..4b46406c0c8af5 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1149,6 +1149,16 @@ int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pre,
 
 	trace_btrfs_ordered_extent_split(BTRFS_I(inode), ordered);
 
+	/* We cannot split once end_bio'd ordered extent */
+	if (WARN_ON_ONCE(ordered->bytes_left != ordered->disk_num_bytes))
+		return -EINVAL;
+	/* We cannot split a compressed ordered extent */
+	if (WARN_ON_ONCE(ordered->disk_num_bytes != ordered->num_bytes))
+		return -EINVAL;
+	/* Checksum list should be empty */
+	if (WARN_ON_ONCE(!list_empty(&ordered->list)))
+		return -EINVAL;
+
 	spin_lock_irq(&tree->lock);
 	/* Remove from tree once */
 	node = &ordered->rb_node;

From patchwork Tue Mar 28 05:19:51 2023
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 13190504
From: Christoph Hellwig
To: Chris Mason, Josef
	Bacik, David Sterba
Cc: Boris Burkov, Johannes Thumshirn, Naohiro Aota, linux-btrfs@vger.kernel.org, Johannes Thumshirn
Subject: [PATCH 05/11] btrfs: simplify btrfs_extract_ordered_extent
Date: Tue, 28 Mar 2023 14:19:51 +0900
Message-Id: <20230328051957.1161316-6-hch@lst.de>
In-Reply-To: <20230328051957.1161316-1-hch@lst.de>
References: <20230328051957.1161316-1-hch@lst.de>
X-Mailing-List: linux-btrfs@vger.kernel.org

btrfs_extract_ordered_extent is always used to split an ordered_extent
and extent_map into two parts, so it doesn't need to deal with a three
way split.

Simplify it by only allowing for a single split point, and always split
out the beginning of the extent, as that is what we'll later need in
order to hold on to a reference to the original ordered_extent from
which the first part is split off for submission.

Signed-off-by: Christoph Hellwig
Tested-by: Johannes Thumshirn
---
 fs/btrfs/inode.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5013d1b0b00e29..428fa99711def1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2632,39 +2632,36 @@ blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio)
 	u64 len = bbio->bio.bi_iter.bi_size;
 	struct btrfs_inode *inode = bbio->inode;
 	struct btrfs_ordered_extent *ordered;
-	u64 file_len;
-	u64 end = start + len;
-	u64 ordered_end;
-	u64 pre, post;
+	u64 ordered_len;
 	int ret = 0;
 
 	ordered = btrfs_lookup_ordered_extent(inode, bbio->file_offset);
 	if (WARN_ON_ONCE(!ordered))
 		return BLK_STS_IOERR;
+	ordered_len = ordered->num_bytes;
 
-	/* No need to split */
-	if (ordered->disk_num_bytes == len)
+	/* Must always be called for the beginning of an ordered extent. */
+	if (WARN_ON_ONCE(start != ordered->disk_bytenr)) {
+		ret = -EINVAL;
 		goto out;
+	}
 
-	ordered_end = ordered->disk_bytenr + ordered->disk_num_bytes;
-	/* bio must be in one ordered extent */
-	if (WARN_ON_ONCE(start < ordered->disk_bytenr || end > ordered_end)) {
+	/* The bio must be entirely covered by the ordered extent */
+	if (WARN_ON_ONCE(len > ordered_len)) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	file_len = ordered->num_bytes;
-	pre = start - ordered->disk_bytenr;
-	post = ordered_end - end;
+	/* No need to split if the ordered extent covers the entire bio */
+	if (ordered->disk_num_bytes == len)
+		goto out;
 
-	ret = btrfs_split_ordered_extent(ordered, pre, post);
+	ret = btrfs_split_ordered_extent(ordered, len, 0);
 	if (ret)
 		goto out;
-	ret = split_zoned_em(inode, bbio->file_offset, file_len, pre, post);
-
+	ret = split_zoned_em(inode, bbio->file_offset, ordered_len, len, 0);
 out:
 	btrfs_put_ordered_extent(ordered);
-
 	return errno_to_blk_status(ret);
 }
 

From patchwork Tue Mar 28 05:19:52 2023
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 13190505
From: Christoph Hellwig
To: Chris Mason, Josef Bacik, David Sterba
Cc: Boris Burkov, Johannes Thumshirn, Naohiro Aota, linux-btrfs@vger.kernel.org, Naohiro Aota, Johannes Thumshirn
Subject: [PATCH 06/11] btrfs: simplify btrfs_split_ordered_extent
Date: Tue, 28 Mar 2023 14:19:52 +0900
Message-Id: <20230328051957.1161316-7-hch@lst.de>
In-Reply-To: <20230328051957.1161316-1-hch@lst.de>
References: <20230328051957.1161316-1-hch@lst.de>
X-Mailing-List: linux-btrfs@vger.kernel.org

btrfs_split_ordered_extent is only ever asked to split out the beginning
of an ordered_extent. Change it to only take a len to split out, and
switch it to allocate the new extent for the beginning, as that helps
with callers that want to keep a pointer to the ordered_extent that it
is stealing from.
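The intended outcome of the front split can be sketched in userspace as
follows. The model_extent struct and model_split_front() are
hypothetical names invented for this illustration, and the sketch
describes the result stated above (a new extent for the first len bytes,
the original shrunk to the tail) rather than transcribing
clone_ordered_extent. The length check matches the one added to
btrfs_split_ordered_extent by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the ordered extent fields touched by the split. */
struct model_extent {
	uint64_t file_offset;
	uint64_t disk_bytenr;
	uint64_t num_bytes;
	uint64_t disk_num_bytes;
	uint64_t bytes_left;
};

/*
 * Split the first @len bytes of @orig out into @front. @orig keeps its
 * identity and shrinks to the tail, mirroring the "+= len" / "-= len"
 * updates in the patch. Returns 0 on success or -EINVAL.
 */
static int model_split_front(struct model_extent *orig, uint64_t len,
			     struct model_extent *front)
{
	/* The split must fit and must not leave a zero-length original. */
	if (len >= orig->num_bytes)
		return -EINVAL;

	/* The new extent covers the beginning of the old range. */
	front->file_offset = orig->file_offset;
	front->disk_bytenr = orig->disk_bytenr;
	front->num_bytes = len;
	front->disk_num_bytes = len;
	front->bytes_left = len;

	/* The original extent now starts @len bytes later. */
	orig->file_offset += len;
	orig->disk_bytenr += len;
	orig->num_bytes -= len;
	orig->disk_num_bytes -= len;
	orig->bytes_left -= len;
	return 0;
}
```

Because the original object is the one that survives as the tail, a
caller that already holds a pointer to it can keep using that pointer
for the rest of the I/O after the first part has been submitted.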
Signed-off-by: Christoph Hellwig
Reviewed-by: Naohiro Aota
Tested-by: Johannes Thumshirn
---
 fs/btrfs/inode.c        |  8 +-------
 fs/btrfs/ordered-data.c | 31 +++++++++++++++----------------
 fs/btrfs/ordered-data.h |  3 +--
 3 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 428fa99711def1..5358187f37fe10 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2646,17 +2646,11 @@ blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio)
 		goto out;
 	}
 
-	/* The bio must be entirely covered by the ordered extent */
-	if (WARN_ON_ONCE(len > ordered_len)) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	/* No need to split if the ordered extent covers the entire bio */
 	if (ordered->disk_num_bytes == len)
 		goto out;
 
-	ret = btrfs_split_ordered_extent(ordered, len, 0);
+	ret = btrfs_split_ordered_extent(ordered, len);
 	if (ret)
 		goto out;
 	ret = split_zoned_em(inode, bbio->file_offset, ordered_len, len, 0);
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 4b46406c0c8af5..561531ca4e9ef2 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1138,17 +1138,22 @@ static int clone_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pos,
 					ordered->compress_type);
 }
 
-int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pre,
-				u64 post)
+/* split out a new ordered extent for this first @len bytes of @ordered */
+int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 len)
 {
 	struct inode *inode = ordered->inode;
 	struct btrfs_ordered_inode_tree *tree = &BTRFS_I(inode)->ordered_tree;
-	struct rb_node *node;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	int ret = 0;
+	struct rb_node *node;
 
 	trace_btrfs_ordered_extent_split(BTRFS_I(inode), ordered);
 
+	/*
+	 * The entire bio must be covered by the ordered extent, but we can't
+	 * reduce the original extent to a zero length either.
+ */ + if (WARN_ON_ONCE(len >= ordered->num_bytes)) + return -EINVAL; /* We cannot split once end_bio'd ordered extent */ if (WARN_ON_ONCE(ordered->bytes_left != ordered->disk_num_bytes)) return -EINVAL; @@ -1167,11 +1172,11 @@ int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pre, if (tree->last == node) tree->last = NULL; - ordered->file_offset += pre; - ordered->disk_bytenr += pre; - ordered->num_bytes -= (pre + post); - ordered->disk_num_bytes -= (pre + post); - ordered->bytes_left -= (pre + post); + ordered->file_offset += len; + ordered->disk_bytenr += len; + ordered->num_bytes -= len; + ordered->disk_num_bytes -= len; + ordered->bytes_left -= len; /* Re-insert the node */ node = tree_insert(&tree->tree, ordered->file_offset, &ordered->rb_node); @@ -1182,13 +1187,7 @@ int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pre, spin_unlock_irq(&tree->lock); - if (pre) - ret = clone_ordered_extent(ordered, 0, pre); - if (ret == 0 && post) - ret = clone_ordered_extent(ordered, pre + ordered->disk_num_bytes, - post); - - return ret; + return clone_ordered_extent(ordered, 0, len); } int __init ordered_data_init(void) diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h index 18007f9c00add8..f0f1138d23c331 100644 --- a/fs/btrfs/ordered-data.h +++ b/fs/btrfs/ordered-data.h @@ -212,8 +212,7 @@ void btrfs_lock_and_flush_ordered_range(struct btrfs_inode *inode, u64 start, struct extent_state **cached_state); bool btrfs_try_lock_ordered_range(struct btrfs_inode *inode, u64 start, u64 end, struct extent_state **cached_state); -int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pre, - u64 post); +int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 len); int __init ordered_data_init(void); void __cold ordered_data_exit(void); From patchwork Tue Mar 28 05:19:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 
Christoph Hellwig X-Patchwork-Id: 13190506 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 586EEC761A6 for ; Tue, 28 Mar 2023 05:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231432AbjC1FU0 (ORCPT ); Tue, 28 Mar 2023 01:20:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50856 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231904AbjC1FUX (ORCPT ); Tue, 28 Mar 2023 01:20:23 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57E2B26AA for ; Mon, 27 Mar 2023 22:20:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=yzRpRG9AsChjptOVsfo76GUb/4lWZN5TCvcmfm5L2Tc=; b=mFlp/NiNAFdTySZh5QFiHXouv/ JF42durreNq3KD5eY6xOw99tfu/SSebEgxAPSGbDhhL35zzvqoT35XVrJIhY7ILE+NWHc/y1OcFvZ U92jbMrU9PujipAsmGoGFE1eyAKt9sZGJc+BTv53p4/pamH1ZuI6F7qGpY7FqBV9Q6j1jaPwsKcJf VXbv9c375VKNW5XxqPMsF+BtHykchltp4ABTGANp3ZdKN04T5+Rp9+cklFtH6pQX9rpos6OzbSItS UUNCV6vAUycH0KHgFgbOrGGe79A3/CJ3c5h8JOTmiouCPB3gYXZW3NXZFT5Eo4VXAmAEU1yqPTS27 Obn9Jewg==; Received: from [182.171.77.115] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1ph1ko-00DAXK-0v; Tue, 28 Mar 2023 05:20:18 +0000 From: Christoph Hellwig To: Chris Mason , Josef Bacik , David Sterba Cc: Boris Burkov , Johannes Thumshirn , Naohiro Aota , linux-btrfs@vger.kernel.org, Johannes Thumshirn Subject: [PATCH 07/11] btrfs: fold btrfs_clone_ordered_extent into btrfs_split_ordered_extent Date: Tue, 28 Mar 2023 
14:19:53 +0900 Message-Id: <20230328051957.1161316-8-hch@lst.de> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230328051957.1161316-1-hch@lst.de> References: <20230328051957.1161316-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org btrfs_clone_ordered_extent is very specific to the usage in btrfs_split_ordered_extent. Now that only a single call to btrfs_clone_ordered_extent is left, just fold it into btrfs_split_ordered_extent to make the operation more clear. Signed-off-by: Christoph Hellwig Tested-by: Johannes Thumshirn --- fs/btrfs/ordered-data.c | 37 ++++++++++++++----------------------- 1 file changed, 14 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 561531ca4e9ef2..e1224a115707cc 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -1116,38 +1116,21 @@ bool btrfs_try_lock_ordered_range(struct btrfs_inode *inode, u64 start, u64 end, return false; } - -static int clone_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pos, - u64 len) -{ - struct inode *inode = ordered->inode; - struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; - u64 file_offset = ordered->file_offset + pos; - u64 disk_bytenr = ordered->disk_bytenr + pos; - unsigned long flags = ordered->flags & BTRFS_ORDERED_TYPE_FLAGS; - - /* - * The splitting extent is already counted and will be added again in - * btrfs_add_ordered_extent_*(). Subtract len to avoid double counting. 
- */ - percpu_counter_add_batch(&fs_info->ordered_bytes, -len, - fs_info->delalloc_batch); - WARN_ON_ONCE(flags & (1 << BTRFS_ORDERED_COMPRESSED)); - return btrfs_add_ordered_extent(BTRFS_I(inode), file_offset, len, len, - disk_bytenr, len, 0, flags, - ordered->compress_type); -} - /* split out a new ordered extent for this first @len bytes of @ordered */ int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 len) { struct inode *inode = ordered->inode; struct btrfs_ordered_inode_tree *tree = &BTRFS_I(inode)->ordered_tree; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + u64 file_offset = ordered->file_offset; + u64 disk_bytenr = ordered->disk_bytenr; + unsigned long flags = ordered->flags & BTRFS_ORDERED_TYPE_FLAGS; struct rb_node *node; trace_btrfs_ordered_extent_split(BTRFS_I(inode), ordered); + ASSERT(!(flags & (1 << BTRFS_ORDERED_COMPRESSED))); + /* * The entire bio must be covered by the ordered extent, but we can't * reduce the original extent to a zero length either. @@ -1187,7 +1170,15 @@ int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 len) spin_unlock_irq(&tree->lock); - return clone_ordered_extent(ordered, 0, len); + /* + * The splitting extent is already counted and will be added again in + * btrfs_add_ordered_extent(). Subtract len to avoid double counting. 
+ */ + percpu_counter_add_batch(&fs_info->ordered_bytes, -len, + fs_info->delalloc_batch); + return btrfs_add_ordered_extent(BTRFS_I(inode), file_offset, len, len, + disk_bytenr, len, 0, flags, + ordered->compress_type); } int __init ordered_data_init(void) From patchwork Tue Mar 28 05:19:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13190507 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01415C77B60 for ; Tue, 28 Mar 2023 05:20:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231956AbjC1FU2 (ORCPT ); Tue, 28 Mar 2023 01:20:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231925AbjC1FUY (ORCPT ); Tue, 28 Mar 2023 01:20:24 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06D7C268C for ; Mon, 27 Mar 2023 22:20:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=zMmhWWB2BDSJQ1mHbMDSmbEVhle7FXbjt6rYnwYBDO4=; b=bbWLJber0ici1l1dp8y/pyxeok Fe16htZbazhRKRvxX931ouP1Yk5ymF0s2XoZ8nBO7/4e7RM9db9XR3lKZq7nenFEE1zzGVrhuDJwz dYAIrFekDhio1ofFBKxyvCR+fKNEUEQ2grpQMI4lAB2tWKn2QP0x+qG/gpJmtByzqXj0cY93Bkbxp z5qR3tLxDyxpMEjnoFAs2ecVCaN3mdx+xbJ73zwWFhayrkBgmOZzC3laI6qVCwdUslXyOFbf+snet zy16BUoIsjMKabn5ioDMNVv0n3jkwMXczJDF9xYZFTPzwiadexaNKvgeF2VR+EaMjDlaGbZeCMUPO +czg7iJA==; Received: from [182.171.77.115] (helo=localhost) 
by bombadil.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1ph1kq-00DAXh-0w; Tue, 28 Mar 2023 05:20:20 +0000 From: Christoph Hellwig To: Chris Mason , Josef Bacik , David Sterba Cc: Boris Burkov , Johannes Thumshirn , Naohiro Aota , linux-btrfs@vger.kernel.org, Johannes Thumshirn Subject: [PATCH 08/11] btrfs: simplify split_zoned_em Date: Tue, 28 Mar 2023 14:19:54 +0900 Message-Id: <20230328051957.1161316-9-hch@lst.de> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230328051957.1161316-1-hch@lst.de> References: <20230328051957.1161316-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org split_zoned_em is only ever asked to split out the beginning of an extent map. Change it to only take a len to split out instead of a pre and post region. Also rename the function to split_extent_map as there is nothing zoned device specific about it. Note: this function should probably move to extent_map.c. Signed-off-by: Christoph Hellwig Tested-by: Johannes Thumshirn --- fs/btrfs/inode.c | 78 +++++++++++++++++------------------------------- 1 file changed, 27 insertions(+), 51 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 5358187f37fe10..f7110a314a5fab 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2512,37 +2512,32 @@ void btrfs_clear_delalloc_extent(struct btrfs_inode *inode, } /* - * Split an extent_map at [start, start + len] + * Split off the first pre bytes from the extent_map at [start, start + len] * * This function is intended to be used only for extract_ordered_extent(). 
*/ -static int split_zoned_em(struct btrfs_inode *inode, u64 start, u64 len, - u64 pre, u64 post) +static int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, + u64 pre) { struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; struct extent_map *split_pre = NULL; struct extent_map *split_mid = NULL; - struct extent_map *split_post = NULL; int ret = 0; unsigned long flags; - /* Sanity check */ - if (pre == 0 && post == 0) - return 0; + ASSERT(pre != 0); + ASSERT(pre < len); split_pre = alloc_extent_map(); - if (pre) - split_mid = alloc_extent_map(); - if (post) - split_post = alloc_extent_map(); - if (!split_pre || (pre && !split_mid) || (post && !split_post)) { + if (!split_pre) + return -ENOMEM; + split_mid = alloc_extent_map(); + if (!split_mid) { ret = -ENOMEM; - goto out; + goto out_free_pre; } - ASSERT(pre + post < len); - lock_extent(&inode->io_tree, start, start + len - 1, NULL); write_lock(&em_tree->lock); em = lookup_extent_mapping(em_tree, start, len); @@ -2563,7 +2558,7 @@ static int split_zoned_em(struct btrfs_inode *inode, u64 start, u64 len, /* First, replace the em with a new extent_map starting from * em->start */ split_pre->start = em->start; - split_pre->len = (pre ? 
pre : em->len - post); + split_pre->len = pre; split_pre->orig_start = split_pre->start; split_pre->block_start = em->block_start; split_pre->block_len = split_pre->len; @@ -2577,38 +2572,21 @@ static int split_zoned_em(struct btrfs_inode *inode, u64 start, u64 len, /* * Now we only have an extent_map at: - * [em->start, em->start + pre] if pre != 0 - * [em->start, em->start + em->len - post] if pre == 0 + * [em->start, em->start + pre] */ - if (pre) { - /* Insert the middle extent_map */ - split_mid->start = em->start + pre; - split_mid->len = em->len - pre - post; - split_mid->orig_start = split_mid->start; - split_mid->block_start = em->block_start + pre; - split_mid->block_len = split_mid->len; - split_mid->orig_block_len = split_mid->block_len; - split_mid->ram_bytes = split_mid->len; - split_mid->flags = flags; - split_mid->compress_type = em->compress_type; - split_mid->generation = em->generation; - add_extent_mapping(em_tree, split_mid, 1); - } - - if (post) { - split_post->start = em->start + em->len - post; - split_post->len = post; - split_post->orig_start = split_post->start; - split_post->block_start = em->block_start + em->len - post; - split_post->block_len = split_post->len; - split_post->orig_block_len = split_post->block_len; - split_post->ram_bytes = split_post->len; - split_post->flags = flags; - split_post->compress_type = em->compress_type; - split_post->generation = em->generation; - add_extent_mapping(em_tree, split_post, 1); - } + /* Insert the middle extent_map */ + split_mid->start = em->start + pre; + split_mid->len = em->len - pre; + split_mid->orig_start = split_mid->start; + split_mid->block_start = em->block_start + pre; + split_mid->block_len = split_mid->len; + split_mid->orig_block_len = split_mid->block_len; + split_mid->ram_bytes = split_mid->len; + split_mid->flags = flags; + split_mid->compress_type = em->compress_type; + split_mid->generation = em->generation; + add_extent_mapping(em_tree, split_mid, 1); /* Once for us */ 
free_extent_map(em); @@ -2618,11 +2596,9 @@ static int split_zoned_em(struct btrfs_inode *inode, u64 start, u64 len, out_unlock: write_unlock(&em_tree->lock); unlock_extent(&inode->io_tree, start, start + len - 1, NULL); -out: - free_extent_map(split_pre); free_extent_map(split_mid); - free_extent_map(split_post); - +out_free_pre: + free_extent_map(split_pre); return ret; } @@ -2653,7 +2629,7 @@ blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio) ret = btrfs_split_ordered_extent(ordered, len); if (ret) goto out; - ret = split_zoned_em(inode, bbio->file_offset, ordered_len, len, 0); + ret = split_extent_map(inode, bbio->file_offset, ordered_len, len); out: btrfs_put_ordered_extent(ordered); return errno_to_blk_status(ret); From patchwork Tue Mar 28 05:19:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13190508 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CEE5BC77B61 for ; Tue, 28 Mar 2023 05:20:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231979AbjC1FU3 (ORCPT ); Tue, 28 Mar 2023 01:20:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231487AbjC1FU1 (ORCPT ); Tue, 28 Mar 2023 01:20:27 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BA679268C for ; Mon, 27 Mar 2023 22:20:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender 
:Reply-To:Content-Type:Content-ID:Content-Description; bh=lN1Xtm8AJJ7+DMvDv8GI9eSh0AEDbPfOEjtYakm2azM=; b=4V49kfFVc3InjPdUSXxjl6d6XJ N/SGdlXE1OK7H4WTSu/gcX0LKQpVs8NRBWQyyb3TV/kfVNObCgD0J8liJUNT/bTJMfcc6LlQqL1TD TCkrC0GGn5YQfzmMBgwTkdGFPlIMslid6OZJPQMezPlcAjzPePHtvDvFBVIk3qS9lmOjmVWCgm0r9 nbKTpawdamNNu89Q/STFC6yTeK3RuABEDZNe73yT7zzmyzCaEYC3dgG6pp9AsxHKNzEY+sSM5uioI IdOtSkj0Y+jkr0lf2a61ynqTK5fppLzSgsxkRHT4V01bg8mk+IAg4G2ORIoXdGuuM2rCo5zuf2eVI KWvHazHA==; Received: from [182.171.77.115] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1ph1ks-00DAYY-1r; Tue, 28 Mar 2023 05:20:22 +0000 From: Christoph Hellwig To: Chris Mason , Josef Bacik , David Sterba Cc: Boris Burkov , Johannes Thumshirn , Naohiro Aota , linux-btrfs@vger.kernel.org, Johannes Thumshirn Subject: [PATCH 09/11] btrfs: pass an ordered_extent to btrfs_extract_ordered_extent Date: Tue, 28 Mar 2023 14:19:55 +0900 Message-Id: <20230328051957.1161316-10-hch@lst.de> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230328051957.1161316-1-hch@lst.de> References: <20230328051957.1161316-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To prepare for a new caller that already has the ordered_extent available, change btrfs_extract_ordered_extent to take an argument for it. Add a wrapper for the bio case that still has to do the lookup (for now). 
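The shape of this refactoring can be sketched in userspace as follows. This is only a model under assumptions: every name in it (model_ordered, model_lookup, model_put, model_extract, model_bio_extract, enum model_status) is hypothetical, and collapsing every error into a single status is a simplification of the errno to blk_status_t conversion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for an ordered extent. */
struct model_ordered {
	int refs;
	unsigned long start;
	unsigned long len;
};

/* Hypothetical status codes standing in for blk_status_t values. */
enum model_status { MODEL_STS_OK = 0, MODEL_STS_IOERR = 1 };

/* Lookup takes a reference that the caller must drop with model_put(). */
struct model_ordered *model_lookup(struct model_ordered *table, size_t n,
				   unsigned long offset)
{
	for (size_t i = 0; i < n; i++) {
		if (offset >= table[i].start &&
		    offset < table[i].start + table[i].len) {
			table[i].refs++;
			return &table[i];
		}
	}
	return NULL;
}

void model_put(struct model_ordered *ordered)
{
	ordered->refs--;
}

/* Core helper: the caller supplies the object; returns 0 or -errno. */
int model_extract(struct model_ordered *ordered, unsigned long bio_start)
{
	/* Must be called for the beginning of the extent. */
	if (bio_start != ordered->start)
		return -EINVAL;
	return 0;
}

/* Wrapper for callers that only have an offset: lookup, call, put, convert. */
enum model_status model_bio_extract(struct model_ordered *table, size_t n,
				    unsigned long offset)
{
	struct model_ordered *ordered = model_lookup(table, n, offset);
	int ret;

	if (!ordered)
		return MODEL_STS_IOERR;
	ret = model_extract(ordered, offset);
	model_put(ordered);
	return ret ? MODEL_STS_IOERR : MODEL_STS_OK;
}
```

Keeping the lookup and the reference drop in the wrapper means a caller that already holds a referenced pointer can call the core helper directly and skip the lookup.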
Signed-off-by: Christoph Hellwig Tested-by: Johannes Thumshirn --- fs/btrfs/bio.c | 16 +++++++++++++++- fs/btrfs/btrfs_inode.h | 3 ++- fs/btrfs/inode.c | 26 ++++++++------------------ 3 files changed, 25 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c index cf09c6271edbee..1bb6a45edc2354 100644 --- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -61,6 +61,20 @@ struct btrfs_bio *btrfs_bio_alloc(unsigned int nr_vecs, blk_opf_t opf, return bbio; } +static blk_status_t btrfs_bio_extract_ordered_extent(struct btrfs_bio *bbio) +{ + struct btrfs_ordered_extent *ordered; + int ret; + + ordered = btrfs_lookup_ordered_extent(bbio->inode, bbio->file_offset); + if (WARN_ON_ONCE(!ordered)) + return BLK_STS_IOERR; + ret = btrfs_extract_ordered_extent(bbio, ordered); + btrfs_put_ordered_extent(ordered); + + return errno_to_blk_status(ret); +} + static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info, struct btrfs_bio *orig_bbio, u64 map_length, bool use_append) @@ -653,7 +667,7 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num) if (use_append) { bio->bi_opf &= ~REQ_OP_WRITE; bio->bi_opf |= REQ_OP_ZONE_APPEND; - ret = btrfs_extract_ordered_extent(bbio); + ret = btrfs_bio_extract_ordered_extent(bbio); if (ret) goto fail_put_bio; } diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 9dc21622806ef4..bb498448066981 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -407,7 +407,8 @@ static inline void btrfs_inode_split_flags(u64 inode_item_flags, int btrfs_check_sector_csum(struct btrfs_fs_info *fs_info, struct page *page, u32 pgoff, u8 *csum, const u8 * const csum_expected); -blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio); +int btrfs_extract_ordered_extent(struct btrfs_bio *bbio, + struct btrfs_ordered_extent *ordered); bool btrfs_data_csum_ok(struct btrfs_bio *bbio, struct btrfs_device *dev, u32 bio_offset, struct bio_vec *bv); noinline int can_nocow_extent(struct inode 
*inode, u64 offset, u64 *len, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f7110a314a5fab..042018271baa37 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2602,37 +2602,27 @@ static int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, return ret; } -blk_status_t btrfs_extract_ordered_extent(struct btrfs_bio *bbio) +int btrfs_extract_ordered_extent(struct btrfs_bio *bbio, + struct btrfs_ordered_extent *ordered) { u64 start = (u64)bbio->bio.bi_iter.bi_sector << SECTOR_SHIFT; u64 len = bbio->bio.bi_iter.bi_size; struct btrfs_inode *inode = bbio->inode; - struct btrfs_ordered_extent *ordered; - u64 ordered_len; + u64 ordered_len = ordered->num_bytes; int ret = 0; - ordered = btrfs_lookup_ordered_extent(inode, bbio->file_offset); - if (WARN_ON_ONCE(!ordered)) - return BLK_STS_IOERR; - ordered_len = ordered->num_bytes; - /* Must always be called for the beginning of an ordered extent. */ - if (WARN_ON_ONCE(start != ordered->disk_bytenr)) { - ret = -EINVAL; - goto out; - } + if (WARN_ON_ONCE(start != ordered->disk_bytenr)) + return -EINVAL; /* No need to split if the ordered extent covers the entire bio */ if (ordered->disk_num_bytes == len) - goto out; + return 0; ret = btrfs_split_ordered_extent(ordered, len); if (ret) - goto out; - ret = split_extent_map(inode, bbio->file_offset, ordered_len, len); -out: - btrfs_put_ordered_extent(ordered); - return errno_to_blk_status(ret); + return ret; + return split_extent_map(inode, bbio->file_offset, ordered_len, len); } /* From patchwork Tue Mar 28 05:19:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13190509 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40C86C76196 for ; Tue, 28 Mar 2023 05:20:33 +0000 (UTC) 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231970AbjC1FUb (ORCPT ); Tue, 28 Mar 2023 01:20:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231976AbjC1FU3 (ORCPT ); Tue, 28 Mar 2023 01:20:29 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A54ED211F for ; Mon, 27 Mar 2023 22:20:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=6Ylr6w2t07ku0X5pX8I8Gr6hWRhaigyc66RGqOroY5o=; b=elMmR8I82xKSluw4YV7sJHBjgl yeIR0rIu9xv0eCeFf+EMGzjSX139ER6hqYB9RWeLxG5qJTVkCQqRkpJfc+uFDEs7NX7Qq/yZj1sp+ m/OCoXTuMaQMY6dsU4H1fuw05QZEcKiC8z2Ln7Kcxy2VUBbpR6JP0YaYCGZZknVe29lIIrlWUQ+ys EVFJWkFoe33I8ZjbEyvuroUd6R7WOdZwpC7ccjOgRQbSIsQ8aWg3hpHw+qvlGKbys7jrOwOfmG8g9 ENZm5bJdQiCtfrFVjJuph4o+mjIK0eeOwy08tz0jlqRet3PSpex1JAwdC17pLXjgcf0Abs1Ee720a rIkqRpCQ==; Received: from [182.171.77.115] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1ph1ku-00DAZK-1a; Tue, 28 Mar 2023 05:20:24 +0000 From: Christoph Hellwig To: Chris Mason , Josef Bacik , David Sterba Cc: Boris Burkov , Johannes Thumshirn , Naohiro Aota , linux-btrfs@vger.kernel.org, Johannes Thumshirn Subject: [PATCH 10/11] btrfs: don't split nocow extent_maps in btrfs_extract_ordered_extent Date: Tue, 28 Mar 2023 14:19:56 +0900 Message-Id: <20230328051957.1161316-11-hch@lst.de> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230328051957.1161316-1-hch@lst.de> References: <20230328051957.1161316-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. 
See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Boris Burkov Nocow writes just overwrite an existing extent map, which thus should not be split in btrfs_extract_ordered_extent. The nocow case can't currently happen as btrfs_extract_ordered_extent is only used on zoned devices that do not support nocow writes, but this will change soon. Signed-off-by: Boris Burkov [hch: split from a larger patch, wrote a commit log] Signed-off-by: Christoph Hellwig Tested-by: Johannes Thumshirn --- fs/btrfs/inode.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 042018271baa37..a791faabb2ec87 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2622,6 +2622,14 @@ int btrfs_extract_ordered_extent(struct btrfs_bio *bbio, ret = btrfs_split_ordered_extent(ordered, len); if (ret) return ret; + + /* + * Don't split the extent_map for nocow extents, as we're writing + * into a pre-existing one. 
+ */ + if (test_bit(BTRFS_ORDERED_NOCOW, &ordered->flags)) + return 0; + return split_extent_map(inode, bbio->file_offset, ordered_len, len); } From patchwork Tue Mar 28 05:19:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13190510 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB458C761A6 for ; Tue, 28 Mar 2023 05:20:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232037AbjC1FUd (ORCPT ); Tue, 28 Mar 2023 01:20:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51088 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231977AbjC1FUb (ORCPT ); Tue, 28 Mar 2023 01:20:31 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9750A1FEC for ; Mon, 27 Mar 2023 22:20:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=L3+G+vJ3ZQhdbB/dHZApbiHMf0+60irH3F6STThdQPA=; b=yQX/fAhwQ2GkJCbUw2Mjv5UR4v fxYmkg+7XZ2CpB1yLlSZuEPJi1tkeTNcwbRW1frURbUIIyLpg1YRiqLjPPghLQxmIxZzarmqrz13K wT1dPQFi7JGKnHZ9uSbxH8RVoltS6urHSoXBDrBd4c4smz4lN8aFH4Snl0y1LHt+5gJJ6LF/ZJ8DM CDqRebFO0ZI2Qh9BEOKg6R3XudA+ofpSwtJQ0BAU6JwV40SLSKGdaQF8Uho38LbxRAX2GT5Gp8xT+ JbIGAGLY7OgJPJub+ENWjNrlILhWKOUsDIcPn0KUn2bTb8GcaLi65+iRcDtI9dC8weaTARFTLDwvu 1GDWFFzA==; Received: from [182.171.77.115] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1ph1kw-00DAZX-2o; Tue, 28 Mar 2023 
05:20:27 +0000 From: Christoph Hellwig To: Chris Mason , Josef Bacik , David Sterba Cc: Boris Burkov , Johannes Thumshirn , Naohiro Aota , linux-btrfs@vger.kernel.org, Johannes Thumshirn Subject: [PATCH 11/11] btrfs: split partial dio bios before submit Date: Tue, 28 Mar 2023 14:19:57 +0900 Message-Id: <20230328051957.1161316-12-hch@lst.de> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230328051957.1161316-1-hch@lst.de> References: <20230328051957.1161316-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Boris Burkov If an application is doing direct io to a btrfs file and experiences a page fault reading from the write buffer, iomap will issue a partial bio, and allow the fs to keep going. However, there was a subtle bug in this codepath in the btrfs dio iomap implementation that led to the partial write ending up as a gap in the file's extents and to be read back as zeros. 
The sequence of events in a partial write, lightly summarized and trimmed
down for brevity is as follows:

  ====WRITING TASK====
  btrfs_direct_write
  __iomap_dio_write
  iomap_iter
    btrfs_dio_iomap_begin # create full ordered extent
  iomap_dio_bio_iter
    bio_iov_iter_get_pages # page fault; partial read
    submit_bio # partial bio
  iomap_iter
    btrfs_dio_iomap_end
      btrfs_mark_ordered_io_finished # sets BTRFS_ORDERED_IOERR;
                                     # submit to finish_ordered_fn wq
  fault_in_iov_iter_readable # btrfs_direct_write detects partial write
  __iomap_dio_write
  iomap_iter
    btrfs_dio_iomap_begin # create second partial ordered extent
  iomap_dio_bio_iter
    bio_iov_iter_get_pages # read all of remainder
    submit_bio # partial bio with all of remainder
  iomap_iter
    btrfs_dio_iomap_end # nothing exciting to do with ordered io

  ====DIO ENDIO====
  ==FIRST PARTIAL BIO==
  btrfs_dio_end_io
    btrfs_mark_ordered_io_finished # bytes_left > 0
                                   # don't submit to finish_ordered_fn wq
  ==SECOND PARTIAL BIO==
  btrfs_dio_end_io
    btrfs_mark_ordered_io_finished # bytes_left == 0
                                   # submit to finish_ordered_fn wq

  ====BTRFS FINISH ORDERED WQ====
  ==FIRST PARTIAL BIO==
  btrfs_finish_ordered_io # called by dio_iomap_end_io, sees
                          # BTRFS_ORDERED_IOERR, just drops the
                          # ordered_extent
  ==SECOND PARTIAL BIO==
  btrfs_finish_ordered_io # called by btrfs_dio_end_io, writes out file
                          # extents, csums, etc...

The essence of the problem is that while btrfs_direct_write and iomap
properly interact to submit all the correct bios, there is insufficient
logic in the btrfs dio functions (btrfs_dio_iomap_begin,
btrfs_dio_submit_io, btrfs_dio_end_io, and btrfs_dio_iomap_end) to ensure
that every bio is at least a part of a completed ordered_extent. And it is
completing an ordered_extent that results in crucial functionality like
writing out a file extent for the range.

More specifically, btrfs_dio_end_io treats the ordered extent as
unfinished but btrfs_dio_iomap_end sets BTRFS_ORDERED_IOERR on it.
Thus, the finish io work doesn't result in file extents, csums, etc...
In the aftermath, such a file behaves as though it has a hole in it,
instead of the purportedly written data.

We considered a few options for fixing the bug (apologies for any
incorrect summary of a proposal which I didn't implement and fully
understand):

1. treat the partial bio as if we had truncated the file, which would
   result in properly finishing it.
2. split the ordered extent when submitting a partial bio.
3. cache the ordered extent across calls to __iomap_dio_rw in
   iter->private, so that we could reuse it and correctly apply several
   bios to it.

I had trouble with 1, and it felt the most like a hack, so I tried 2
and 3. Since 3 has the benefit of also not creating an extra file extent,
and avoids an ordered extent lookup during bio submission, it felt like
the best option. However, that turned out to re-introduce a deadlock that
the existing code, which discards the ordered_extent between faults, was
meant to fix in the first place. (Link to an explanation of the deadlock
below)

Therefore, go with fix #2, which requires a bit more setup work but fixes
the corruption without introducing the deadlock, which is fundamentally
caused by the ordered extent existing when we attempt to fault in a range
that overlaps with it.

Put succinctly, what this patch does is: when we submit a dio bio, check
if it is partial against the ordered extent stored in dio_data, and if it
is, extract the ordered_extent that matches the bio exactly out of the
larger ordered_extent. Keep the remaining ordered_extent around in
dio_data for cancellation in iomap_end.

Thanks to Josef, Christoph, and Filipe for their help figuring out the
bug and the fix.
Fixes: 51bd9563b678 ("btrfs: fix deadlock due to page faults during direct IO reads and writes") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169947 Link: https://lore.kernel.org/linux-btrfs/aa1fb69e-b613-47aa-a99e-a0a2c9ed273f@app.fastmail.com/ Link: https://pastebin.com/3SDaH8C6 Link: https://lore.kernel.org/linux-btrfs/20230315195231.GW10580@twin.jikos.cz/T/#t Signed-off-by: Boris Burkov [hch: refactored the ordered_extent extraction] Signed-off-by: Christoph Hellwig Tested-by: Johannes Thumshirn --- fs/btrfs/inode.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index a791faabb2ec87..dd67a37ea0d5ca 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7716,6 +7716,24 @@ static void btrfs_dio_submit_io(const struct iomap_iter *iter, struct bio *bio, dip->bytes = bio->bi_iter.bi_size; dio_data->submitted += bio->bi_iter.bi_size; + + /* + * Check if we are doing a partial write. If we are, we need to split + * the ordered extent to match the submitted bio. Hang on to the + * remaining unfinishable ordered_extent in dio_data so that it can be + * cancelled in iomap_end to avoid a deadlock wherein faulting the + * remaining pages is blocked on the outstanding ordered extent. + */ + if (iter->flags & IOMAP_WRITE) { + int err; + + err = btrfs_extract_ordered_extent(bbio, dio_data->ordered); + if (err) { + btrfs_bio_end_io(bbio, errno_to_blk_status(err)); + return; + } + } + btrfs_submit_bio(bbio, 0); }
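
As a rough illustration of the behaviour this patch aims for, the submit and iomap_end interaction can be modeled in userspace as below. This is a sketch under assumptions, not kernel code: struct model_oe, struct model_dio_data, model_submit and model_iomap_end are hypothetical names, the ioerr flag stands in for BTRFS_ORDERED_IOERR, and a bio that covers the whole extent is modeled as leaving a zero-length remainder instead of skipping the split.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ordered extent. */
struct model_oe {
	uint64_t file_offset;
	uint64_t num_bytes;
	bool ioerr;		/* models BTRFS_ORDERED_IOERR */
};

/* Hypothetical per-write state, standing in for dio_data. */
struct model_dio_data {
	struct model_oe ordered;	/* extent created in iomap_begin */
	uint64_t submitted;
};

/*
 * Model of bio submission: carve an extent that matches the bio exactly
 * out of the front of dio->ordered and return it in @bio_oe. The tail
 * that was never submitted stays in dio->ordered.
 */
int model_submit(struct model_dio_data *dio, uint64_t bio_len,
		 struct model_oe *bio_oe)
{
	if (bio_len == 0 || bio_len > dio->ordered.num_bytes)
		return -EINVAL;

	bio_oe->file_offset = dio->ordered.file_offset;
	bio_oe->num_bytes = bio_len;
	bio_oe->ioerr = false;

	dio->ordered.file_offset += bio_len;
	dio->ordered.num_bytes -= bio_len;
	dio->submitted += bio_len;
	return 0;
}

/*
 * Model of iomap_end: cancel only the remainder that was never submitted.
 * Returns the number of bytes cancelled.
 */
uint64_t model_iomap_end(struct model_dio_data *dio)
{
	uint64_t cancelled = dio->ordered.num_bytes;

	if (cancelled)
		dio->ordered.ioerr = true;
	return cancelled;
}
```

In this model a 16384-byte write interrupted after 4096 bytes ends up with a 4096-byte extent that completes normally and a 12288-byte remainder that is cancelled, so the error flag never lands on the extent that backs the submitted bio.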