diff mbox series

[v2,2/5] btrfs-progs: mkfs: use proper zoned compatible write for bgt feature

Message ID 69d093d141f7e3ad02f4353d69458805fa8883bc.1712095635.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs-progs: zoned devices support for bgt feature | expand

Commit Message

Qu Wenruo April 2, 2024, 10:07 p.m. UTC
[BUG]
There is a bug report that mkfs.btrfs can not specify block-group-tree
feature along with zoned devices:

  # mkfs.btrfs /dev/nullb0 -O block-group-tree,zoned
  btrfs-progs v6.7.1
  See https://btrfs.readthedocs.io for more information.

  Resetting device zones /dev/nullb0 (40 zones) ...
  NOTE: several default settings have changed in version 5.15, please make sure
        this does not affect your deployments:
        - DUP for metadata (-m dup)
        - enabled no-holes (-O no-holes)
        - enabled free-space-tree (-R free-space-tree)

  ERROR: error during mkfs: Invalid argument

[CAUSE]
During mkfs, we need to write all the 7 or 8 tree blocks into the
metadata zone, and since it's zoned device, we need to fulfill all the
requirement for zoned writes, including:

- All writes must be in sequential bytenr
- Buffer must be aligned to sector size

The sequential bytenr requirement is already met by the mkfs design, but
the second requirement on memory alignment is never met for metadata, as
we put the contents of a leaf in extent_buffer::data[], which is after a
lot of small members.

Thus metadata IO buffer would never be aligned to sector size (normally
4K).
And we require btrfs_pwrite() and btrfs_pread() to handle the memory
alignment for us.

However in create_block_group_tree() we didn't use btrfs_pwrite(), but
plain pwrite() call directly, which would lead to -EINVAL error due to
memory alignment problem.

[FIX]
Just call btrfs_pwrite() instead of the plain pwrite() in
create_block_group_tree().

Issue: #765
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 mkfs/common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff mbox series

Patch

diff --git a/mkfs/common.c b/mkfs/common.c
index 5e56b33dda6d..3c48a6c120e7 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -249,8 +249,8 @@  static int create_block_group_tree(int fd, struct btrfs_mkfs_config *cfg,
 	btrfs_set_header_nritems(buf, 1);
 	csum_tree_block_size(buf, btrfs_csum_type_size(cfg->csum_type), 0,
 			     cfg->csum_type);
-	ret = pwrite(fd, buf->data, cfg->nodesize,
-		     cfg->blocks[MKFS_BLOCK_GROUP_TREE]);
+	ret = btrfs_pwrite(fd, buf->data, cfg->nodesize,
+			   cfg->blocks[MKFS_BLOCK_GROUP_TREE], cfg->zone_size);
 	if (ret != cfg->nodesize)
 		return ret < 0 ? -errno : -EIO;
 	return 0;