[0/4] Fix incorrect splitting logic in btrfs_drop_extent_map_range

Message ID cover.1692305624.git.josef@toxicpanda.com (mailing list archive)
Series Fix incorrect splitting logic in btrfs_drop_extent_map_range

Message

Josef Bacik Aug. 17, 2023, 8:57 p.m. UTC
We have been hitting a fair number of warnings in btrfs_drop_extent_map_range
and in unpin_extent_map in production.  Upon investigation I discovered we were
splitting improperly when we call btrfs_drop_extent_map_range with skip_pinned.
This results in invalid extent_maps in the inode's io_tree, which in turn wreaks
all sorts of havoc, mostly in the form of these WARN_ON()'s.  This took me a
while to spot so I have a bunch of self-tests that test various functionality of
btrfs_drop_extent_map_range and btrfs_add_extent_mapping, with one test that
actually exercises the bug.

This has been broken for a while, and thankfully is only triggered in certain
cases with relocation on.  Our environment uses auto relocation heavily which is
why we hit this reliably, but the incident rate is still relatively low.  The
bug was introduced over 10 years ago, but the backport could probably be limited
to the most recent kernels, basically anything after Filipe's cleanup of this
code.  Thanks,

Josef

Josef Bacik (4):
  btrfs: fix incorrect splitting in btrfs_drop_extent_map_range
  btrfs: add extent_map tests for dropping with odd layouts
  btrfs: add a self test for btrfs_add_extent_mapping
  btrfs: test invalid splitting when skipping pinned drop extent_map

 fs/btrfs/extent_map.c             |   6 +-
 fs/btrfs/tests/extent-map-tests.c | 414 ++++++++++++++++++++++++++++++
 2 files changed, 416 insertions(+), 4 deletions(-)

Comments

David Sterba Aug. 17, 2023, 11:52 p.m. UTC | #1
On Thu, Aug 17, 2023 at 04:57:29PM -0400, Josef Bacik wrote:
> We have been hitting a fair number of warnings in btrfs_drop_extent_map_range
> and in unpin_extent_map in production.  Upon investigation I discovered we were
> splitting improperly when we call btrfs_drop_extent_map_range with skip_pinned.
> This results in invalid extent_maps in the inode's io_tree, which in turn wreaks
> all sorts of havoc, mostly in the form of these WARN_ON()'s.  This took me a
> while to spot so I have a bunch of self-tests that test various functionality of
> btrfs_drop_extent_map_range and btrfs_add_extent_mapping, with one test that
> actually exercises the bug.
> 
> This has been broken for a while, and thankfully is only triggered in certain
> cases with relocation on.  Our environment uses auto relocation heavily which is
> why we hit this reliably, but the incident rate is still relatively low.  The
> bug was introduced over 10 years ago, but the backport could probably be limited
> to the most recent kernels, basically anything after Filipe's cleanup of this
> code.  Thanks,
> 
> Josef
> 
> Josef Bacik (4):
>   btrfs: fix incorrect splitting in btrfs_drop_extent_map_range
>   btrfs: add extent_map tests for dropping with odd layouts
>   btrfs: add a self test for btrfs_add_extent_mapping
>   btrfs: test invalid splitting when skipping pinned drop extent_map

Nice, we have a new record holder, thanks for tracking it down and for
adding the tests. I'll add the patches to misc-next but review is still open.