Message ID | f6e36de0cc45247c30c645764f3ffe4f6a487007.1712621026.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range() |
On Tue, Apr 9, 2024 at 1:06 AM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> During my extent_map cleanup/refactor, with extra sanity checks,
> extent-map-tests::test_case_7() would not pass the checks.
>
> The problem is, after btrfs_drop_extent_map_range(), the resulted
> extent_map has a @block_start way too large.
> Meanwhile my btrfs_file_extent_item based members are returning a
> correct @disk_bytenr/@offset combination.
>
> The extent map layout looks like this:
>
>      0        16K    32K       48K
>      | PINNED |      | Regular |
>
> The regular em at [32K, 48K) also has 32K @block_start.
>
> Then drop range [0, 36K), which should shrink the regular one to be
> [36K, 48K).
> However the @block_start is incorrect, we expect 32K + 4K, but got 52K.
>
> [CAUSE]
> Inside btrfs_drop_extent_map_range() function, if we hit an extent_map
> that covers the target range but is still beyond it, we need to split
> that extent map into half:
>
>         |<-- drop range -->|
>                  |<----- existing extent_map --->|
>
> And if the extent map is not compressed, we need to forward
> extent_map::block_start by the difference between the end of drop range
> and the extent map start.
>
> However in that particular case, the difference is calculated using
> (start + len - em->start).
>
> The problem is @start can be modified if the drop range covers any
> pinned extent.
>
> This leads to wrong calculation, and would be caught by my later
> extent_map sanity checks, which checks the em::block_start against
> btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset.
>
> This is a regression caused by commit c962098ca4af ("btrfs: fix
> incorrect splitting in btrfs_drop_extent_map_range"), which removed the
> @len update for pinned extents.
>
> [FIX]
> Fix it by avoiding using @start completely, and use @end - em->start
> instead, which @end is exclusive bytenr number.
>
> And update the test case to verify the @block_start to prevent such
> problem from happening.
>
> Thankfully this is not going to lead to any data corruption, as IO path
> does not utilize btrfs_drop_extent_map_range() with @skip_pinned set.
>
> So this fix is only here for the sake of consistency/correctness.
>
> CC: stable@vger.kernel.org # 6.5+
> Fixes: c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Remove the mention of possible corruption
>   Thankfully this bug does not affect IO path thus it's fine.
>
> - Explain why c962098ca4af is the cause
> ---
>  fs/btrfs/extent_map.c             | 2 +-
>  fs/btrfs/tests/extent-map-tests.c | 6 +++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
> index 471654cb65b0..955ce300e5a1 100644
> --- a/fs/btrfs/extent_map.c
> +++ b/fs/btrfs/extent_map.c
> @@ -799,7 +799,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end,
>  					split->block_len = em->block_len;
>  					split->orig_start = em->orig_start;
>  				} else {
> -					const u64 diff = start + len - em->start;
> +					const u64 diff = end - em->start;
>
>  					split->block_len = split->len;
>  					split->block_start += diff;
> diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
> index 253cce7ffecf..80e71c5cb7ab 100644
> --- a/fs/btrfs/tests/extent-map-tests.c
> +++ b/fs/btrfs/tests/extent-map-tests.c
> @@ -818,7 +818,6 @@ static int test_case_7(struct btrfs_fs_info *fs_info)
>  		test_err("em->len is %llu, expected 16K", em->len);
>  		goto out;
>  	}
> -
>  	free_extent_map(em);

As pointed out before, please avoid such accidental and unrelated
changes like this.

With that fixed:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

>
>  	read_lock(&em_tree->lock);
> @@ -847,6 +846,11 @@ static int test_case_7(struct btrfs_fs_info *fs_info)
>  		goto out;
>  	}
>
> +	if (em->block_start != SZ_32K + SZ_4K) {
> +		test_err("em->block_start is %llu, expected 36K", em->block_start);
> +		goto out;
> +	}
> +
>  	free_extent_map(em);
>
>  	read_lock(&em_tree->lock);
> --
> 2.44.0
>
>
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 471654cb65b0..955ce300e5a1 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -799,7 +799,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end,
 					split->block_len = em->block_len;
 					split->orig_start = em->orig_start;
 				} else {
-					const u64 diff = start + len - em->start;
+					const u64 diff = end - em->start;
 
 					split->block_len = split->len;
 					split->block_start += diff;
diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
index 253cce7ffecf..80e71c5cb7ab 100644
--- a/fs/btrfs/tests/extent-map-tests.c
+++ b/fs/btrfs/tests/extent-map-tests.c
@@ -818,7 +818,6 @@ static int test_case_7(struct btrfs_fs_info *fs_info)
 		test_err("em->len is %llu, expected 16K", em->len);
 		goto out;
 	}
-
 	free_extent_map(em);
 
 	read_lock(&em_tree->lock);
@@ -847,6 +846,11 @@ static int test_case_7(struct btrfs_fs_info *fs_info)
 		goto out;
 	}
 
+	if (em->block_start != SZ_32K + SZ_4K) {
+		test_err("em->block_start is %llu, expected 36K", em->block_start);
+		goto out;
+	}
+
 	free_extent_map(em);
 
 	read_lock(&em_tree->lock);
[BUG]
During my extent_map cleanup/refactor, with extra sanity checks,
extent-map-tests::test_case_7() would not pass the checks.

The problem is, after btrfs_drop_extent_map_range(), the resulted
extent_map has a @block_start way too large.
Meanwhile my btrfs_file_extent_item based members are returning a
correct @disk_bytenr/@offset combination.

The extent map layout looks like this:

     0        16K    32K       48K
     | PINNED |      | Regular |

The regular em at [32K, 48K) also has 32K @block_start.

Then drop range [0, 36K), which should shrink the regular one to be
[36K, 48K).
However the @block_start is incorrect, we expect 32K + 4K, but got 52K.

[CAUSE]
Inside btrfs_drop_extent_map_range() function, if we hit an extent_map
that covers the target range but is still beyond it, we need to split
that extent map into half:

        |<-- drop range -->|
                 |<----- existing extent_map --->|

And if the extent map is not compressed, we need to forward
extent_map::block_start by the difference between the end of drop range
and the extent map start.

However in that particular case, the difference is calculated using
(start + len - em->start).

The problem is @start can be modified if the drop range covers any
pinned extent.

This leads to wrong calculation, and would be caught by my later
extent_map sanity checks, which checks the em::block_start against
btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset.

This is a regression caused by commit c962098ca4af ("btrfs: fix
incorrect splitting in btrfs_drop_extent_map_range"), which removed the
@len update for pinned extents.

[FIX]
Fix it by avoiding using @start completely, and use @end - em->start
instead, which @end is exclusive bytenr number.

And update the test case to verify the @block_start to prevent such
problem from happening.

Thankfully this is not going to lead to any data corruption, as IO path
does not utilize btrfs_drop_extent_map_range() with @skip_pinned set.

So this fix is only here for the sake of consistency/correctness.

CC: stable@vger.kernel.org # 6.5+
Fixes: c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Remove the mention of possible corruption
  Thankfully this bug does not affect IO path thus it's fine.

- Explain why c962098ca4af is the cause
---
 fs/btrfs/extent_map.c             | 2 +-
 fs/btrfs/tests/extent-map-tests.c | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)
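The arithmetic in the [BUG] and [CAUSE] sections can be checked outside the kernel with a small userspace model. This is only a sketch, not the kernel code: struct demo_em, demo_block_start_old() and demo_block_start_new() are hypothetical names invented for illustration, the SZ_* macros are redefined locally, and the whole loop in btrfs_drop_extent_map_range() is reduced to the single assumption that skipping a pinned extent map advances @start to that extent map's end while @len keeps its initial value.

```c
#include <stdint.h>

#define SZ_4K  (4ULL * 1024)
#define SZ_16K (16ULL * 1024)
#define SZ_32K (32ULL * 1024)

/* Hypothetical stand-in for the few extent_map fields that matter here. */
struct demo_em {
	uint64_t start;       /* file offset where the extent map begins */
	uint64_t len;         /* length in bytes */
	uint64_t block_start; /* on-disk bytenr backing @start */
};

/*
 * Old calculation. @start and @end describe the drop range, with @end
 * exclusive. @skipped_pinned_end is the exclusive end of a pinned extent
 * map skipped earlier in the loop, or 0 when nothing was skipped.
 */
static inline uint64_t demo_block_start_old(uint64_t start, uint64_t end,
					    uint64_t skipped_pinned_end,
					    const struct demo_em *em)
{
	const uint64_t len = end - start; /* fixed when the call begins */

	if (skipped_pinned_end)
		start = skipped_pinned_end; /* @start moves, @len does not */

	return em->block_start + (start + len - em->start);
}

/* New calculation: depends only on @end, which is never modified. */
static inline uint64_t demo_block_start_new(uint64_t end,
					    const struct demo_em *em)
{
	return em->block_start + (end - em->start);
}
```

With the test_case_7() layout (pinned [0, 16K), regular [32K, 48K) with 32K @block_start, drop range [0, 36K)), demo_block_start_old(0, SZ_32K + SZ_4K, SZ_16K, &em) gives 32K + (16K + 36K - 32K) = 52K, while demo_block_start_new(SZ_32K + SZ_4K, &em) gives 32K + 4K = 36K. When no pinned extent map is skipped, both helpers return the same value, which is consistent with the bug only showing up with @skip_pinned set.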