Message ID | 5780c450b3b5a642773bf3981bcfd49d1a6080b0.1712614770.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: extent-map: use disk_bytenr/offset to replace block_start/block_len/orig_start | expand |
On Mon, Apr 8, 2024 at 11:34 PM Qu Wenruo <wqu@suse.com> wrote: > > The structure can_nocow_file_extent_args is utilized to provide the > needed info for a NOCOW writes. > > However some of its members are pretty confusing. > For example, @disk_bytenr is not btrfs_file_extent_item::disk_bytenr, > but with extra offset, thus it works more like extent_map::block_start. > > This patch would: > > - Rename members directly fetched from btrfs_file_extent_item > The new name would have "orig_" prefix, with the same member name from > btrfs_file_extent_item. > > - For the old @disk_bytenr, rename it to @block_start > As it's directly passed into create_io_em() as @block_start. So I find these new names more confusing actually. So the existing names reflect fields from struct btrfs_file_extent_item, because NOCOW checks are always done against the range of a file extent item, therefore the existing naming. Sometimes it may be against the whole range of the extent item, sometimes only a part of it, in which case disk_bytenr is incremented by offsets. This is the same logic with struct btrfs_ordered_extent: for a NOCOW write disk_bytenr may either match the disk_bytenr of an existing file extent item or it's adjusted by some offset in case it covers only part of the extent item. So currently we are both consistent with btrfs_ordered_extent besides the fact the NOCOW checks are done against a file extent item. I particularly find block_start not intuitive - block? Is it a block number? What's the size of the block? Etc. disk_bytenr is a lot more clear - it's a disk address in bytes. > > - Add extra comments explaining those members > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/inode.c | 51 ++++++++++++++++++++++++++++-------------------- > 1 file changed, 30 insertions(+), 21 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 2e0156943c7c..4d207c3b38d9 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -1847,11 +1847,20 @@ struct can_nocow_file_extent_args { > */ > bool free_path; > > - /* Output fields. Only set when can_nocow_file_extent() returns 1. */ > + /* > + * Output fields. Only set when can_nocow_file_extent() returns 1. > + * > + * @block_start: The bytenr of the new nocow write should be at. > + * @orig_disk_bytenr: The original data extent's disk_bytenr. This orig_disk_bytenr field is not defined anywhere in this patch. Thanks. > + * @orig_disk_num_bytes:The original data extent's disk_num_bytes. > + * @orig_offset: The original offset inside the old data extent. > + * Caller should calculate their own > + * btrfs_file_extent_item::offset base on this. > + */ > > - u64 disk_bytenr; > - u64 disk_num_bytes; > - u64 extent_offset; > + u64 block_start; > + u64 orig_disk_num_bytes; > + u64 orig_offset; > /* Number of bytes that can be written to in NOCOW mode. */ > u64 num_bytes; > }; > @@ -1887,9 +1896,9 @@ static int can_nocow_file_extent(struct btrfs_path *path, > goto out; > > /* Can't access these fields unless we know it's not an inline extent. */ > - args->disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi); > - args->disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); > - args->extent_offset = btrfs_file_extent_offset(leaf, fi); > + args->block_start = btrfs_file_extent_disk_bytenr(leaf, fi); > + args->orig_disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); > + args->orig_offset = btrfs_file_extent_offset(leaf, fi); > > if (!(inode->flags & BTRFS_INODE_NODATACOW) && > extent_type == BTRFS_FILE_EXTENT_REG) > @@ -1906,7 +1915,7 @@ static int can_nocow_file_extent(struct btrfs_path *path, > goto out; > > /* An explicit hole, must COW. */ > - if (args->disk_bytenr == 0) > + if (args->block_start == 0) > goto out; > > /* Compressed/encrypted/encoded extents must be COWed. */ > @@ -1925,8 +1934,8 @@ static int can_nocow_file_extent(struct btrfs_path *path, > btrfs_release_path(path); > > ret = btrfs_cross_ref_exist(root, btrfs_ino(inode), > - key->offset - args->extent_offset, > - args->disk_bytenr, args->strict, path); > + key->offset - args->orig_offset, > + args->block_start, args->strict, path); > WARN_ON_ONCE(ret > 0 && is_freespace_inode); > if (ret != 0) > goto out; > @@ -1947,15 +1956,15 @@ static int can_nocow_file_extent(struct btrfs_path *path, > atomic_read(&root->snapshot_force_cow)) > goto out; > > - args->disk_bytenr += args->extent_offset; > - args->disk_bytenr += args->start - key->offset; > + args->block_start += args->orig_offset; > + args->block_start += args->start - key->offset; > args->num_bytes = min(args->end + 1, extent_end) - args->start; > > /* > * Force COW if csums exist in the range. This ensures that csums for a > * given extent are either valid or do not exist. > */ > - ret = csum_exist_in_range(root->fs_info, args->disk_bytenr, args->num_bytes, > + ret = csum_exist_in_range(root->fs_info, args->block_start, args->num_bytes, > nowait); > WARN_ON_ONCE(ret > 0 && is_freespace_inode); > if (ret != 0) > @@ -2112,7 +2121,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, > goto must_cow; > > ret = 0; > - nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr); > + nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.block_start); > if (!nocow_bg) { > must_cow: > /* > @@ -2151,14 +2160,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, > nocow_end = cur_offset + nocow_args.num_bytes - 1; > is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; > if (is_prealloc) { > - u64 orig_start = found_key.offset - nocow_args.extent_offset; > + u64 orig_start = found_key.offset - nocow_args.orig_offset; > struct extent_map *em; > > em = create_io_em(inode, cur_offset, nocow_args.num_bytes, > orig_start, > - nocow_args.disk_bytenr, /* block_start */ > + nocow_args.block_start, /* block_start */ > nocow_args.num_bytes, /* block_len */ > - nocow_args.disk_num_bytes, /* orig_block_len */ > + nocow_args.orig_disk_num_bytes, /* orig_block_len */ > ram_bytes, BTRFS_COMPRESS_NONE, > BTRFS_ORDERED_PREALLOC); > if (IS_ERR(em)) { > @@ -2171,7 +2180,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, > > ordered = btrfs_alloc_ordered_extent(inode, cur_offset, > nocow_args.num_bytes, nocow_args.num_bytes, > - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, > + nocow_args.block_start, nocow_args.num_bytes, 0, > is_prealloc > ? (1 << BTRFS_ORDERED_PREALLOC) > : (1 << BTRFS_ORDERED_NOCOW), > @@ -7189,7 +7198,7 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, > } > > ret = 0; > - if (btrfs_extent_readonly(fs_info, nocow_args.disk_bytenr)) > + if (btrfs_extent_readonly(fs_info, nocow_args.block_start)) > goto out; > > if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW) && > @@ -7206,9 +7215,9 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, > } > > if (orig_start) > - *orig_start = key.offset - nocow_args.extent_offset; > + *orig_start = key.offset - nocow_args.orig_offset; > if (orig_block_len) > - *orig_block_len = nocow_args.disk_num_bytes; > + *orig_block_len = nocow_args.orig_disk_num_bytes; > > *len = nocow_args.num_bytes; > ret = 1; > -- > 2.44.0 > >
在 2024/4/12 00:16, Filipe Manana 写道: > On Mon, Apr 8, 2024 at 11:34 PM Qu Wenruo <wqu@suse.com> wrote: >> >> The structure can_nocow_file_extent_args is utilized to provide the >> needed info for a NOCOW writes. >> >> However some of its members are pretty confusing. >> For example, @disk_bytenr is not btrfs_file_extent_item::disk_bytenr, >> but with extra offset, thus it works more like extent_map::block_start. >> >> This patch would: >> >> - Rename members directly fetched from btrfs_file_extent_item >> The new name would have "orig_" prefix, with the same member name from >> btrfs_file_extent_item. >> >> - For the old @disk_bytenr, rename it to @block_start >> As it's directly passed into create_io_em() as @block_start. > > So I find these new names more confusing actually. > > So the existing names reflect fields from struct > btrfs_file_extent_item, because NOCOW checks are always done against > the range of a file extent item, therefore the existing naming. It's true for @extent_offset, but @disk_bytenr is not the case. It's calculated by file_extent_item::disk_bytenr + file_extent_item::offset. That's why I find the old @disk_bytenr very confusing (and caused several crashes in my sanity checks). > > Sometimes it may be against the whole range of the extent item, > sometimes only a part of it, in which case disk_bytenr is incremented > by offsets. > > This is the same logic with struct btrfs_ordered_extent: for a NOCOW > write disk_bytenr may either match the disk_bytenr of an existing file > extent item or it's adjusted by some offset in case it covers only > part of the extent item. The NOCOW ordered extent would skip the file extent map updates, that's why it doesn't really need an super accurate disk_bytenr/disk_num_bytes to match data extents. > > So currently we are both consistent with btrfs_ordered_extent besides > the fact the NOCOW checks are done against a file extent item. > > I particularly find block_start not intuitive - block? Is it a block > number? What's the size of the block? Etc. > disk_bytenr is a lot more clear - it's a disk address in bytes. Well, the new @block_start matches the old extent_map::block_start. I have to say, we do not have a solid definition on "disk_bytenr" in the first place. Should it always match ondisk file_extent_item::disk_bytenr, or should it act like "block_start" of the old extent_map? And if we have separate definitions, one to always match file extent item disk_bytenr, and one to match the real IO start bytenr, what should be their names? I hope we can get a good naming to solve the confusion, any good idea? Thanks, Qu > >> >> - Add extra comments explaining those members >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/inode.c | 51 ++++++++++++++++++++++++++++-------------------- >> 1 file changed, 30 insertions(+), 21 deletions(-) >> >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c >> index 2e0156943c7c..4d207c3b38d9 100644 >> --- a/fs/btrfs/inode.c >> +++ b/fs/btrfs/inode.c >> @@ -1847,11 +1847,20 @@ struct can_nocow_file_extent_args { >> */ >> bool free_path; >> >> - /* Output fields. Only set when can_nocow_file_extent() returns 1. */ >> + /* >> + * Output fields. Only set when can_nocow_file_extent() returns 1. >> + * >> + * @block_start: The bytenr of the new nocow write should be at. >> + * @orig_disk_bytenr: The original data extent's disk_bytenr. > > This orig_disk_bytenr field is not defined anywhere in this patch. > > Thanks. > >> + * @orig_disk_num_bytes:The original data extent's disk_num_bytes. >> + * @orig_offset: The original offset inside the old data extent. >> + * Caller should calculate their own >> + * btrfs_file_extent_item::offset base on this. >> + */ >> >> - u64 disk_bytenr; >> - u64 disk_num_bytes; >> - u64 extent_offset; >> + u64 block_start; >> + u64 orig_disk_num_bytes; >> + u64 orig_offset; >> /* Number of bytes that can be written to in NOCOW mode. */ >> u64 num_bytes; >> }; >> @@ -1887,9 +1896,9 @@ static int can_nocow_file_extent(struct btrfs_path *path, >> goto out; >> >> /* Can't access these fields unless we know it's not an inline extent. */ >> - args->disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi); >> - args->disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); >> - args->extent_offset = btrfs_file_extent_offset(leaf, fi); >> + args->block_start = btrfs_file_extent_disk_bytenr(leaf, fi); >> + args->orig_disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); >> + args->orig_offset = btrfs_file_extent_offset(leaf, fi); >> >> if (!(inode->flags & BTRFS_INODE_NODATACOW) && >> extent_type == BTRFS_FILE_EXTENT_REG) >> @@ -1906,7 +1915,7 @@ static int can_nocow_file_extent(struct btrfs_path *path, >> goto out; >> >> /* An explicit hole, must COW. */ >> - if (args->disk_bytenr == 0) >> + if (args->block_start == 0) >> goto out; >> >> /* Compressed/encrypted/encoded extents must be COWed. */ >> @@ -1925,8 +1934,8 @@ static int can_nocow_file_extent(struct btrfs_path *path, >> btrfs_release_path(path); >> >> ret = btrfs_cross_ref_exist(root, btrfs_ino(inode), >> - key->offset - args->extent_offset, >> - args->disk_bytenr, args->strict, path); >> + key->offset - args->orig_offset, >> + args->block_start, args->strict, path); >> WARN_ON_ONCE(ret > 0 && is_freespace_inode); >> if (ret != 0) >> goto out; >> @@ -1947,15 +1956,15 @@ static int can_nocow_file_extent(struct btrfs_path *path, >> atomic_read(&root->snapshot_force_cow)) >> goto out; >> >> - args->disk_bytenr += args->extent_offset; >> - args->disk_bytenr += args->start - key->offset; >> + args->block_start += args->orig_offset; >> + args->block_start += args->start - key->offset; >> args->num_bytes = min(args->end + 1, extent_end) - args->start; >> >> /* >> * Force COW if csums exist in the range. This ensures that csums for a >> * given extent are either valid or do not exist. >> */ >> - ret = csum_exist_in_range(root->fs_info, args->disk_bytenr, args->num_bytes, >> + ret = csum_exist_in_range(root->fs_info, args->block_start, args->num_bytes, >> nowait); >> WARN_ON_ONCE(ret > 0 && is_freespace_inode); >> if (ret != 0) >> @@ -2112,7 +2121,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, >> goto must_cow; >> >> ret = 0; >> - nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr); >> + nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.block_start); >> if (!nocow_bg) { >> must_cow: >> /* >> @@ -2151,14 +2160,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, >> nocow_end = cur_offset + nocow_args.num_bytes - 1; >> is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; >> if (is_prealloc) { >> - u64 orig_start = found_key.offset - nocow_args.extent_offset; >> + u64 orig_start = found_key.offset - nocow_args.orig_offset; >> struct extent_map *em; >> >> em = create_io_em(inode, cur_offset, nocow_args.num_bytes, >> orig_start, >> - nocow_args.disk_bytenr, /* block_start */ >> + nocow_args.block_start, /* block_start */ >> nocow_args.num_bytes, /* block_len */ >> - nocow_args.disk_num_bytes, /* orig_block_len */ >> + nocow_args.orig_disk_num_bytes, /* orig_block_len */ >> ram_bytes, BTRFS_COMPRESS_NONE, >> BTRFS_ORDERED_PREALLOC); >> if (IS_ERR(em)) { >> @@ -2171,7 +2180,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, >> >> ordered = btrfs_alloc_ordered_extent(inode, cur_offset, >> nocow_args.num_bytes, nocow_args.num_bytes, >> - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, >> + nocow_args.block_start, nocow_args.num_bytes, 0, >> is_prealloc >> ? (1 << BTRFS_ORDERED_PREALLOC) >> : (1 << BTRFS_ORDERED_NOCOW), >> @@ -7189,7 +7198,7 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, >> } >> >> ret = 0; >> - if (btrfs_extent_readonly(fs_info, nocow_args.disk_bytenr)) >> + if (btrfs_extent_readonly(fs_info, nocow_args.block_start)) >> goto out; >> >> if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW) && >> @@ -7206,9 +7215,9 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, >> } >> >> if (orig_start) >> - *orig_start = key.offset - nocow_args.extent_offset; >> + *orig_start = key.offset - nocow_args.orig_offset; >> if (orig_block_len) >> - *orig_block_len = nocow_args.disk_num_bytes; >> + *orig_block_len = nocow_args.orig_disk_num_bytes; >> >> *len = nocow_args.num_bytes; >> ret = 1; >> -- >> 2.44.0 >> >> >
On Thu, Apr 11, 2024 at 11:03 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: > > > > 在 2024/4/12 00:16, Filipe Manana 写道: > > On Mon, Apr 8, 2024 at 11:34 PM Qu Wenruo <wqu@suse.com> wrote: > >> > >> The structure can_nocow_file_extent_args is utilized to provide the > >> needed info for a NOCOW writes. > >> > >> However some of its members are pretty confusing. > >> For example, @disk_bytenr is not btrfs_file_extent_item::disk_bytenr, > >> but with extra offset, thus it works more like extent_map::block_start. > >> > >> This patch would: > >> > >> - Rename members directly fetched from btrfs_file_extent_item > >> The new name would have "orig_" prefix, with the same member name from > >> btrfs_file_extent_item. > >> > >> - For the old @disk_bytenr, rename it to @block_start > >> As it's directly passed into create_io_em() as @block_start. > > > > So I find these new names more confusing actually. > > > > So the existing names reflect fields from struct > > btrfs_file_extent_item, because NOCOW checks are always done against > > the range of a file extent item, therefore the existing naming. > > It's true for @extent_offset, but @disk_bytenr is not the case. > > It's calculated by file_extent_item::disk_bytenr + file_extent_item::offset. > > That's why I find the old @disk_bytenr very confusing (and caused > several crashes in my sanity checks). > > > > > Sometimes it may be against the whole range of the extent item, > > sometimes only a part of it, in which case disk_bytenr is incremented > > by offsets. > > > > This is the same logic with struct btrfs_ordered_extent: for a NOCOW > > write disk_bytenr may either match the disk_bytenr of an existing file > > extent item or it's adjusted by some offset in case it covers only > > part of the extent item. > > The NOCOW ordered extent would skip the file extent map updates, that's > why it doesn't really need an super accurate disk_bytenr/disk_num_bytes > to match data extents. > > > > > So currently we are both consistent with btrfs_ordered_extent besides > > the fact the NOCOW checks are done against a file extent item. > > > > I particularly find block_start not intuitive - block? Is it a block > > number? What's the size of the block? Etc. > > disk_bytenr is a lot more clear - it's a disk address in bytes. > > Well, the new @block_start matches the old extent_map::block_start. So it becomes a single exception, different from everywhere else. Doesn't seem like a good thing in general. > > I have to say, we do not have a solid definition on "disk_bytenr" in the > first place. Well I find the name clear, it is a disk location measured by a byte address. block_start is not so clear for anyone not familiar with btrfs' internals, it makes me think of a block number and wonder what's the block size, etc. > > Should it always match ondisk file_extent_item::disk_bytenr, or should > it act like "block_start" of the old extent_map? It's always about a range of a file extent item, be it the whole range or just a part of it. I don't see why it's confusing to use disk_bytenr, etc. I find it more confusing to use something else, or at least what's being proposed in this patch. > > And if we have separate definitions, one to always match file extent > item disk_bytenr, and one to match the real IO start bytenr, what should > be their names? > > I hope we can get a good naming to solve the confusion, any good idea? For me the current naming is fine and I don't find it confusing... So, I'm not sure what to tell you. > > Thanks, > Qu > > > >> > >> - Add extra comments explaining those members > >> > >> Signed-off-by: Qu Wenruo <wqu@suse.com> > >> --- > >> fs/btrfs/inode.c | 51 ++++++++++++++++++++++++++++-------------------- > >> 1 file changed, 30 insertions(+), 21 deletions(-) > >> > >> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > >> index 2e0156943c7c..4d207c3b38d9 100644 > >> --- a/fs/btrfs/inode.c > >> +++ b/fs/btrfs/inode.c > >> @@ -1847,11 +1847,20 @@ struct can_nocow_file_extent_args { > >> */ > >> bool free_path; > >> > >> - /* Output fields. Only set when can_nocow_file_extent() returns 1. */ > >> + /* > >> + * Output fields. Only set when can_nocow_file_extent() returns 1. > >> + * > >> + * @block_start: The bytenr of the new nocow write should be at. > >> + * @orig_disk_bytenr: The original data extent's disk_bytenr. > > > > This orig_disk_bytenr field is not defined anywhere in this patch. > > > > Thanks. > > > >> + * @orig_disk_num_bytes:The original data extent's disk_num_bytes. > >> + * @orig_offset: The original offset inside the old data extent. > >> + * Caller should calculate their own > >> + * btrfs_file_extent_item::offset base on this. > >> + */ > >> > >> - u64 disk_bytenr; > >> - u64 disk_num_bytes; > >> - u64 extent_offset; > >> + u64 block_start; > >> + u64 orig_disk_num_bytes; > >> + u64 orig_offset; > >> /* Number of bytes that can be written to in NOCOW mode. */ > >> u64 num_bytes; > >> }; > >> @@ -1887,9 +1896,9 @@ static int can_nocow_file_extent(struct btrfs_path *path, > >> goto out; > >> > >> /* Can't access these fields unless we know it's not an inline extent. */ > >> - args->disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi); > >> - args->disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); > >> - args->extent_offset = btrfs_file_extent_offset(leaf, fi); > >> + args->block_start = btrfs_file_extent_disk_bytenr(leaf, fi); > >> + args->orig_disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); > >> + args->orig_offset = btrfs_file_extent_offset(leaf, fi); > >> > >> if (!(inode->flags & BTRFS_INODE_NODATACOW) && > >> extent_type == BTRFS_FILE_EXTENT_REG) > >> @@ -1906,7 +1915,7 @@ static int can_nocow_file_extent(struct btrfs_path *path, > >> goto out; > >> > >> /* An explicit hole, must COW. */ > >> - if (args->disk_bytenr == 0) > >> + if (args->block_start == 0) > >> goto out; > >> > >> /* Compressed/encrypted/encoded extents must be COWed. */ > >> @@ -1925,8 +1934,8 @@ static int can_nocow_file_extent(struct btrfs_path *path, > >> btrfs_release_path(path); > >> > >> ret = btrfs_cross_ref_exist(root, btrfs_ino(inode), > >> - key->offset - args->extent_offset, > >> - args->disk_bytenr, args->strict, path); > >> + key->offset - args->orig_offset, > >> + args->block_start, args->strict, path); > >> WARN_ON_ONCE(ret > 0 && is_freespace_inode); > >> if (ret != 0) > >> goto out; > >> @@ -1947,15 +1956,15 @@ static int can_nocow_file_extent(struct btrfs_path *path, > >> atomic_read(&root->snapshot_force_cow)) > >> goto out; > >> > >> - args->disk_bytenr += args->extent_offset; > >> - args->disk_bytenr += args->start - key->offset; > >> + args->block_start += args->orig_offset; > >> + args->block_start += args->start - key->offset; > >> args->num_bytes = min(args->end + 1, extent_end) - args->start; > >> > >> /* > >> * Force COW if csums exist in the range. This ensures that csums for a > >> * given extent are either valid or do not exist. > >> */ > >> - ret = csum_exist_in_range(root->fs_info, args->disk_bytenr, args->num_bytes, > >> + ret = csum_exist_in_range(root->fs_info, args->block_start, args->num_bytes, > >> nowait); > >> WARN_ON_ONCE(ret > 0 && is_freespace_inode); > >> if (ret != 0) > >> @@ -2112,7 +2121,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, > >> goto must_cow; > >> > >> ret = 0; > >> - nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr); > >> + nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.block_start); > >> if (!nocow_bg) { > >> must_cow: > >> /* > >> @@ -2151,14 +2160,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, > >> nocow_end = cur_offset + nocow_args.num_bytes - 1; > >> is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; > >> if (is_prealloc) { > >> - u64 orig_start = found_key.offset - nocow_args.extent_offset; > >> + u64 orig_start = found_key.offset - nocow_args.orig_offset; > >> struct extent_map *em; > >> > >> em = create_io_em(inode, cur_offset, nocow_args.num_bytes, > >> orig_start, > >> - nocow_args.disk_bytenr, /* block_start */ > >> + nocow_args.block_start, /* block_start */ > >> nocow_args.num_bytes, /* block_len */ > >> - nocow_args.disk_num_bytes, /* orig_block_len */ > >> + nocow_args.orig_disk_num_bytes, /* orig_block_len */ > >> ram_bytes, BTRFS_COMPRESS_NONE, > >> BTRFS_ORDERED_PREALLOC); > >> if (IS_ERR(em)) { > >> @@ -2171,7 +2180,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, > >> > >> ordered = btrfs_alloc_ordered_extent(inode, cur_offset, > >> nocow_args.num_bytes, nocow_args.num_bytes, > >> - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, > >> + nocow_args.block_start, nocow_args.num_bytes, 0, > >> is_prealloc > >> ? (1 << BTRFS_ORDERED_PREALLOC) > >> : (1 << BTRFS_ORDERED_NOCOW), > >> @@ -7189,7 +7198,7 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, > >> } > >> > >> ret = 0; > >> - if (btrfs_extent_readonly(fs_info, nocow_args.disk_bytenr)) > >> + if (btrfs_extent_readonly(fs_info, nocow_args.block_start)) > >> goto out; > >> > >> if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW) && > >> @@ -7206,9 +7215,9 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, > >> } > >> > >> if (orig_start) > >> - *orig_start = key.offset - nocow_args.extent_offset; > >> + *orig_start = key.offset - nocow_args.orig_offset; > >> if (orig_block_len) > >> - *orig_block_len = nocow_args.disk_num_bytes; > >> + *orig_block_len = nocow_args.orig_disk_num_bytes; > >> > >> *len = nocow_args.num_bytes; > >> ret = 1; > >> -- > >> 2.44.0 > >> > >> > >
在 2024/4/12 22:51, Filipe Manana 写道: > On Thu, Apr 11, 2024 at 11:03 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote: [...] >> >> Well, the new @block_start matches the old extent_map::block_start. > > So it becomes a single exception, different from everywhere else. > Doesn't seem like a good thing in general. OK, I can get rid of the @block_start name. > >> >> I have to say, we do not have a solid definition on "disk_bytenr" in the >> first place. > > Well I find the name clear, it is a disk location measured by a byte address. > block_start is not so clear for anyone not familiar with btrfs' > internals, it makes me think of a block number and wonder what's the > block size, etc. > >> >> Should it always match ondisk file_extent_item::disk_bytenr, or should >> it act like "block_start" of the old extent_map? > > It's always about a range of a file extent item, be it the whole range > or just a part of it. > I don't see why it's confusing to use disk_bytenr, etc. > I find it more confusing to use something else, or at least what's > being proposed in this patch. Well, IMHO since we take the name @disk_bytenr from btrfs file extent item, and btrfs file extent uses @disk_bytenr to uniquely locate a data extent, then we should also follow it to use @disk_bytenr for the same purpose. So that every time we see the name @disk_bytenr, we know it can be used to locate a data extent, without any need for weird offset calculation. That's why I'm strongly against adding any offset into @disk_bytenr. And I believe that's the biggest difference in our points of view. Although in this particular case, I can use some extra prefixs like "orig_" or "fe_" (for file extent), so that those members can be later directly passed to create_io_em() without extra offset calculation. Would that be a acceptable trade-off? Another solution would be just drop this patch, and do extra calulation resulting something like this: create_io_em(..., disk_bytenr - whatever_offset, /* disk_bytenr */ offset - whatever_offset, /* offset */ PREALLOC, ...); At least that does not sound sane to me, and can be bug prune. You won't believe how many different crashes I hit just due to the weird disk_bytenr calculation here, and that's the biggest reason I have Thanks, Qu > >> >> And if we have separate definitions, one to always match file extent >> item disk_bytenr, and one to match the real IO start bytenr, what should >> be their names? >> >> I hope we can get a good naming to solve the confusion, any good idea? > > For me the current naming is fine and I don't find it confusing... So, > I'm not sure what to tell you. > >> >> Thanks, >> Qu >>> >>>> >>>> - Add extra comments explaining those members >>>> >>>> Signed-off-by: Qu Wenruo <wqu@suse.com> >>>> --- >>>> fs/btrfs/inode.c | 51 ++++++++++++++++++++++++++++-------------------- >>>> 1 file changed, 30 insertions(+), 21 deletions(-) >>>> >>>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c >>>> index 2e0156943c7c..4d207c3b38d9 100644 >>>> --- a/fs/btrfs/inode.c >>>> +++ b/fs/btrfs/inode.c >>>> @@ -1847,11 +1847,20 @@ struct can_nocow_file_extent_args { >>>> */ >>>> bool free_path; >>>> >>>> - /* Output fields. Only set when can_nocow_file_extent() returns 1. */ >>>> + /* >>>> + * Output fields. Only set when can_nocow_file_extent() returns 1. >>>> + * >>>> + * @block_start: The bytenr of the new nocow write should be at. >>>> + * @orig_disk_bytenr: The original data extent's disk_bytenr. >>> >>> This orig_disk_bytenr field is not defined anywhere in this patch. >>> >>> Thanks. >>> >>>> + * @orig_disk_num_bytes:The original data extent's disk_num_bytes. >>>> + * @orig_offset: The original offset inside the old data extent. >>>> + * Caller should calculate their own >>>> + * btrfs_file_extent_item::offset base on this. >>>> + */ >>>> >>>> - u64 disk_bytenr; >>>> - u64 disk_num_bytes; >>>> - u64 extent_offset; >>>> + u64 block_start; >>>> + u64 orig_disk_num_bytes; >>>> + u64 orig_offset; >>>> /* Number of bytes that can be written to in NOCOW mode. */ >>>> u64 num_bytes; >>>> }; >>>> @@ -1887,9 +1896,9 @@ static int can_nocow_file_extent(struct btrfs_path *path, >>>> goto out; >>>> >>>> /* Can't access these fields unless we know it's not an inline extent. */ >>>> - args->disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi); >>>> - args->disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); >>>> - args->extent_offset = btrfs_file_extent_offset(leaf, fi); >>>> + args->block_start = btrfs_file_extent_disk_bytenr(leaf, fi); >>>> + args->orig_disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); >>>> + args->orig_offset = btrfs_file_extent_offset(leaf, fi); >>>> >>>> if (!(inode->flags & BTRFS_INODE_NODATACOW) && >>>> extent_type == BTRFS_FILE_EXTENT_REG) >>>> @@ -1906,7 +1915,7 @@ static int can_nocow_file_extent(struct btrfs_path *path, >>>> goto out; >>>> >>>> /* An explicit hole, must COW. */ >>>> - if (args->disk_bytenr == 0) >>>> + if (args->block_start == 0) >>>> goto out; >>>> >>>> /* Compressed/encrypted/encoded extents must be COWed. */ >>>> @@ -1925,8 +1934,8 @@ static int can_nocow_file_extent(struct btrfs_path *path, >>>> btrfs_release_path(path); >>>> >>>> ret = btrfs_cross_ref_exist(root, btrfs_ino(inode), >>>> - key->offset - args->extent_offset, >>>> - args->disk_bytenr, args->strict, path); >>>> + key->offset - args->orig_offset, >>>> + args->block_start, args->strict, path); >>>> WARN_ON_ONCE(ret > 0 && is_freespace_inode); >>>> if (ret != 0) >>>> goto out; >>>> @@ -1947,15 +1956,15 @@ static int can_nocow_file_extent(struct btrfs_path *path, >>>> atomic_read(&root->snapshot_force_cow)) >>>> goto out; >>>> >>>> - args->disk_bytenr += args->extent_offset; >>>> - args->disk_bytenr += args->start - key->offset; >>>> + args->block_start += args->orig_offset; >>>> + args->block_start += args->start - key->offset; >>>> args->num_bytes = min(args->end + 1, extent_end) - args->start; >>>> >>>> /* >>>> * Force COW if csums exist in the range. This ensures that csums for a >>>> * given extent are either valid or do not exist. >>>> */ >>>> - ret = csum_exist_in_range(root->fs_info, args->disk_bytenr, args->num_bytes, >>>> + ret = csum_exist_in_range(root->fs_info, args->block_start, args->num_bytes, >>>> nowait); >>>> WARN_ON_ONCE(ret > 0 && is_freespace_inode); >>>> if (ret != 0) >>>> @@ -2112,7 +2121,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, >>>> goto must_cow; >>>> >>>> ret = 0; >>>> - nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr); >>>> + nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.block_start); >>>> if (!nocow_bg) { >>>> must_cow: >>>> /* >>>> @@ -2151,14 +2160,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, >>>> nocow_end = cur_offset + nocow_args.num_bytes - 1; >>>> is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; >>>> if (is_prealloc) { >>>> - u64 orig_start = found_key.offset - nocow_args.extent_offset; >>>> + u64 orig_start = found_key.offset - nocow_args.orig_offset; >>>> struct extent_map *em; >>>> >>>> em = create_io_em(inode, cur_offset, nocow_args.num_bytes, >>>> orig_start, >>>> - nocow_args.disk_bytenr, /* block_start */ >>>> + nocow_args.block_start, /* block_start */ >>>> nocow_args.num_bytes, /* block_len */ >>>> - nocow_args.disk_num_bytes, /* orig_block_len */ >>>> + nocow_args.orig_disk_num_bytes, /* orig_block_len */ >>>> ram_bytes, BTRFS_COMPRESS_NONE, >>>> BTRFS_ORDERED_PREALLOC); >>>> if (IS_ERR(em)) { >>>> @@ -2171,7 +2180,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, >>>> >>>> ordered = btrfs_alloc_ordered_extent(inode, cur_offset, >>>> nocow_args.num_bytes, nocow_args.num_bytes, >>>> - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, >>>> + nocow_args.block_start, nocow_args.num_bytes, 0, >>>> is_prealloc >>>> ? (1 << BTRFS_ORDERED_PREALLOC) >>>> : (1 << BTRFS_ORDERED_NOCOW), >>>> @@ -7189,7 +7198,7 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, >>>> } >>>> >>>> ret = 0; >>>> - if (btrfs_extent_readonly(fs_info, nocow_args.disk_bytenr)) >>>> + if (btrfs_extent_readonly(fs_info, nocow_args.block_start)) >>>> goto out; >>>> >>>> if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW) && >>>> @@ -7206,9 +7215,9 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, >>>> } >>>> >>>> if (orig_start) >>>> - *orig_start = key.offset - nocow_args.extent_offset; >>>> + *orig_start = key.offset - nocow_args.orig_offset; >>>> if (orig_block_len) >>>> - *orig_block_len = nocow_args.disk_num_bytes; >>>> + *orig_block_len = nocow_args.orig_disk_num_bytes; >>>> >>>> *len = nocow_args.num_bytes; >>>> ret = 1; >>>> -- >>>> 2.44.0 >>>> >>>> >>> >
在 2024/4/13 07:30, Qu Wenruo 写道: > > > 在 2024/4/12 22:51, Filipe Manana 写道: >> On Thu, Apr 11, 2024 at 11:03 PM Qu Wenruo <quwenruo.btrfs@gmx.com> >> wrote: > [...] >>> >>> Well, the new @block_start matches the old extent_map::block_start. >> >> So it becomes a single exception, different from everywhere else. >> Doesn't seem like a good thing in general. > > OK, I can get rid of the @block_start name. > >> >>> >>> I have to say, we do not have a solid definition on "disk_bytenr" in the >>> first place. >> >> Well I find the name clear, it is a disk location measured by a byte >> address. >> block_start is not so clear for anyone not familiar with btrfs' >> internals, it makes me think of a block number and wonder what's the >> block size, etc. >> >>> >>> Should it always match ondisk file_extent_item::disk_bytenr, or should >>> it act like "block_start" of the old extent_map? >> >> It's always about a range of a file extent item, be it the whole range >> or just a part of it. >> I don't see why it's confusing to use disk_bytenr, etc. >> I find it more confusing to use something else, or at least what's >> being proposed in this patch. > > Well, IMHO since we take the name @disk_bytenr from btrfs file extent > item, and btrfs file extent uses @disk_bytenr to uniquely locate a data > extent, then we should also follow it to use @disk_bytenr for the same > purpose. > > So that every time we see the name @disk_bytenr, we know it can be used > to locate a data extent, without any need for weird offset calculation. > > That's why I'm strongly against adding any offset into @disk_bytenr. > And I believe that's the biggest difference in our points of view. > > Although in this particular case, I can use some extra prefixs like > "orig_" or "fe_" (for file extent), so that those members can be later > directly passed to create_io_em() without extra offset calculation. > > Would that be a acceptable trade-off? > > > Another solution would be just drop this patch, and do extra calulation > resulting something like this: > > create_io_em(..., > disk_bytenr - whatever_offset, /* disk_bytenr */ > offset - whatever_offset, /* offset */ > PREALLOC, ...); > > At least that does not sound sane to me, and can be bug prune. > You won't believe how many different crashes I hit just due to the weird > disk_bytenr calculation here, and that's the biggest reason I have the extra sanity checks. > > Thanks, > Qu >
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 2e0156943c7c..4d207c3b38d9 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1847,11 +1847,20 @@ struct can_nocow_file_extent_args { */ bool free_path; - /* Output fields. Only set when can_nocow_file_extent() returns 1. */ + /* + * Output fields. Only set when can_nocow_file_extent() returns 1. + * + * @block_start: The bytenr of the new nocow write should be at. + * @orig_disk_bytenr: The original data extent's disk_bytenr. + * @orig_disk_num_bytes:The original data extent's disk_num_bytes. + * @orig_offset: The original offset inside the old data extent. + * Caller should calculate their own + * btrfs_file_extent_item::offset base on this. + */ - u64 disk_bytenr; - u64 disk_num_bytes; - u64 extent_offset; + u64 block_start; + u64 orig_disk_num_bytes; + u64 orig_offset; /* Number of bytes that can be written to in NOCOW mode. */ u64 num_bytes; }; @@ -1887,9 +1896,9 @@ static int can_nocow_file_extent(struct btrfs_path *path, goto out; /* Can't access these fields unless we know it's not an inline extent. */ - args->disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, fi); - args->disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); - args->extent_offset = btrfs_file_extent_offset(leaf, fi); + args->block_start = btrfs_file_extent_disk_bytenr(leaf, fi); + args->orig_disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); + args->orig_offset = btrfs_file_extent_offset(leaf, fi); if (!(inode->flags & BTRFS_INODE_NODATACOW) && extent_type == BTRFS_FILE_EXTENT_REG) @@ -1906,7 +1915,7 @@ static int can_nocow_file_extent(struct btrfs_path *path, goto out; /* An explicit hole, must COW. */ - if (args->disk_bytenr == 0) + if (args->block_start == 0) goto out; /* Compressed/encrypted/encoded extents must be COWed. */ @@ -1925,8 +1934,8 @@ static int can_nocow_file_extent(struct btrfs_path *path, btrfs_release_path(path); ret = btrfs_cross_ref_exist(root, btrfs_ino(inode), - key->offset - args->extent_offset, - args->disk_bytenr, args->strict, path); + key->offset - args->orig_offset, + args->block_start, args->strict, path); WARN_ON_ONCE(ret > 0 && is_freespace_inode); if (ret != 0) goto out; @@ -1947,15 +1956,15 @@ static int can_nocow_file_extent(struct btrfs_path *path, atomic_read(&root->snapshot_force_cow)) goto out; - args->disk_bytenr += args->extent_offset; - args->disk_bytenr += args->start - key->offset; + args->block_start += args->orig_offset; + args->block_start += args->start - key->offset; args->num_bytes = min(args->end + 1, extent_end) - args->start; /* * Force COW if csums exist in the range. This ensures that csums for a * given extent are either valid or do not exist. */ - ret = csum_exist_in_range(root->fs_info, args->disk_bytenr, args->num_bytes, + ret = csum_exist_in_range(root->fs_info, args->block_start, args->num_bytes, nowait); WARN_ON_ONCE(ret > 0 && is_freespace_inode); if (ret != 0) @@ -2112,7 +2121,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, goto must_cow; ret = 0; - nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.disk_bytenr); + nocow_bg = btrfs_inc_nocow_writers(fs_info, nocow_args.block_start); if (!nocow_bg) { must_cow: /* @@ -2151,14 +2160,14 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, nocow_end = cur_offset + nocow_args.num_bytes - 1; is_prealloc = extent_type == BTRFS_FILE_EXTENT_PREALLOC; if (is_prealloc) { - u64 orig_start = found_key.offset - nocow_args.extent_offset; + u64 orig_start = found_key.offset - nocow_args.orig_offset; struct extent_map *em; em = create_io_em(inode, cur_offset, nocow_args.num_bytes, orig_start, - nocow_args.disk_bytenr, /* block_start */ + nocow_args.block_start, /* block_start */ nocow_args.num_bytes, /* block_len */ - nocow_args.disk_num_bytes, /* orig_block_len */ + nocow_args.orig_disk_num_bytes, /* orig_block_len */ ram_bytes, BTRFS_COMPRESS_NONE, BTRFS_ORDERED_PREALLOC); if (IS_ERR(em)) { @@ -2171,7 +2180,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, ordered = btrfs_alloc_ordered_extent(inode, cur_offset, nocow_args.num_bytes, nocow_args.num_bytes, - nocow_args.disk_bytenr, nocow_args.num_bytes, 0, + nocow_args.block_start, nocow_args.num_bytes, 0, is_prealloc ? (1 << BTRFS_ORDERED_PREALLOC) : (1 << BTRFS_ORDERED_NOCOW), @@ -7189,7 +7198,7 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, } ret = 0; - if (btrfs_extent_readonly(fs_info, nocow_args.disk_bytenr)) + if (btrfs_extent_readonly(fs_info, nocow_args.block_start)) goto out; if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW) && @@ -7206,9 +7215,9 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, } if (orig_start) - *orig_start = key.offset - nocow_args.extent_offset; + *orig_start = key.offset - nocow_args.orig_offset; if (orig_block_len) - *orig_block_len = nocow_args.disk_num_bytes; + *orig_block_len = nocow_args.orig_disk_num_bytes; *len = nocow_args.num_bytes; ret = 1;
The structure can_nocow_file_extent_args is utilized to provide the needed info for a NOCOW writes. However some of its members are pretty confusing. For example, @disk_bytenr is not btrfs_file_extent_item::disk_bytenr, but with extra offset, thus it works more like extent_map::block_start. This patch would: - Rename members directly fetched from btrfs_file_extent_item The new name would have "orig_" prefix, with the same member name from btrfs_file_extent_item. - For the old @disk_bytenr, rename it to @block_start As it's directly passed into create_io_em() as @block_start. - Add extra comments explaining those members Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/inode.c | 51 ++++++++++++++++++++++++++++-------------------- 1 file changed, 30 insertions(+), 21 deletions(-)