Message ID | 20240820143434.25332-1-jth@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: stripe-tree: correctly truncate stripe extents on delete | expand |
On Tue, Aug 20, 2024 at 04:34:33PM +0200, Johannes Thumshirn wrote: > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > In our CI system, we're seeing the following ASSERT()ion to trigger when > running RAID stripe-tree tests on non-zoned devices: > > assertion failed: found_start >= start && found_end <= end, in fs/btrfs/raid-stripe-tree.c:64 > > This ASSERT()ion triggers, because for the initial design of RAID stripe-tree, > I had the "one ordered-extent equals one bio" rule of zoned btrfs in mind. > > But for a RAID stripe-tree based system, that is not hosted on a zoned > storage device, but on a regular device this rule doesn't apply. > > So in case the range we want to delete starts in the middle of the > previous item, grab the item and "truncate" it's length. That is, subtract > the deleted portion from the key's offset. > > In case the range to delete ends in the middle of an item, we have to > adjust both the item's key as well as the stripe extents. > > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > --- > fs/btrfs/raid-stripe-tree.c | 50 ++++++++++++++++++++++++++++++++++++- > 1 file changed, 49 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c > index 4c859b550f6c..c8365d14271f 100644 > --- a/fs/btrfs/raid-stripe-tree.c > +++ b/fs/btrfs/raid-stripe-tree.c > @@ -61,7 +61,55 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le > trace_btrfs_raid_extent_delete(fs_info, start, end, > found_start, found_end); > > - ASSERT(found_start >= start && found_end <= end); > + if (found_start < start) { > + struct btrfs_key prev; > + u64 diff = start - found_start; > + > + ret = btrfs_previous_item(stripe_root, path, start, > + BTRFS_RAID_STRIPE_KEY); This is only safe if we're not path->slots[0] == 0, otherwise we'll do btrfs_prev_leaf(), which doesn't modify anything, adn then we'll be in trouble. If this is safe then a comment indicating why we expect this to only back up one slot, and maybe an ASSERT(path->slots[0] > 0); before the btrfs_previous_item to make sure we don't screw this up later. Thanks, Josef
diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 4c859b550f6c..c8365d14271f 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -61,7 +61,55 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le trace_btrfs_raid_extent_delete(fs_info, start, end, found_start, found_end); - ASSERT(found_start >= start && found_end <= end); + if (found_start < start) { + struct btrfs_key prev; + u64 diff = start - found_start; + + ret = btrfs_previous_item(stripe_root, path, start, + BTRFS_RAID_STRIPE_KEY); + leaf = path->nodes[0]; + slot = path->slots[0]; + btrfs_item_key_to_cpu(leaf, &prev, slot); + prev.offset -= diff; + + btrfs_mark_buffer_dirty(trans, leaf); + + start += diff; + length -= diff; + + btrfs_release_path(path); + continue; + } + + if (end < found_end && found_end - end < key.offset) { + struct btrfs_stripe_extent *stripe_extent; + u64 diff = key.offset - length; + int num_stripes; + + num_stripes = btrfs_num_raid_stripes( + btrfs_item_size(leaf, slot)); + stripe_extent = btrfs_item_ptr( + leaf, slot, struct btrfs_stripe_extent); + + for (int i = 0; i < num_stripes; i++) { + struct btrfs_raid_stride *stride = + &stripe_extent->strides[i]; + u64 physical = btrfs_raid_stride_physical( + leaf, stride); + + physical += diff; + btrfs_set_raid_stride_physical(leaf, stride, + physical); + } + + key.objectid += diff; + key.offset -= diff; + + btrfs_mark_buffer_dirty(trans, leaf); + btrfs_release_path(path); + break; + } + ret = btrfs_del_item(trans, stripe_root, path); if (ret) break;