Message ID | 76ba30ab42dc77209f9cc1274b06619c7d474303.1729294376.git.boris@bur.io (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] btrfs: make dropped merge extent_map immutable | expand |
在 2024/10/19 10:04, Boris Burkov 写道: > In debugging some corrupt squashfs files, we observed symptoms of > corrupt page cache pages but correct on-disk contents. Further > investigation revealed that the exact symptom was a correct page > followed by an incorrect, duplicate, page. This got us thinking about > extent maps. > > commit ac05ca913e9f ("Btrfs: fix race between using extent maps and merging them") > enforces a reference count on the primary `em` extent_map being merged, > as that one gets modified. > > However, since, > commit 3d2ac9922465 ("btrfs: introduce new members for extent_map") > both 'em' and 'merge' get modified, which started modifying 'merge' > and thus introduced the same race. > > We were able to reproduce this by looping the affected squashfs workload > in parallel on a bunch of separate btrfs-es while also dropping caches. > We are still working on a simple enough reproducer to make into an fstest. > > The simplest fix is to stop modifying 'merge', which is not essential, > as it is dropped immediately after the merge. This behavior is simply > a consequence of the order of the two extent maps being important in > computing the new values. Modify merge_ondisk_extents to take prev and > next by const* and also take a third merged parameter that it puts the > results in. Note that this introduces the rather odd behavior of passing > 'em' to merge_ondisk_extents as a const * and as a regular ptr. > > Fixes: 3d2ac9922465 ("btrfs: introduce new members for extent_map") > Signed-off-by: Omar Sandoval <osandov@fb.com> > Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Just some small nitpicks inlined below, which can be done at merge time. > --- > Changelog: > v2: > - make 'merge' immutable instead of refcounting it > --- > fs/btrfs/extent_map.c | 20 ++++++++------------ > 1 file changed, 8 insertions(+), 12 deletions(-) > > diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c > index d58d6fc40da1..a8290724475a 100644 > --- a/fs/btrfs/extent_map.c > +++ b/fs/btrfs/extent_map.c > @@ -252,7 +252,8 @@ static bool mergeable_maps(const struct extent_map *prev, const struct extent_ma > * @prev and @next will be both updated to point to the new merged range. This comment needs to be updated. Thanks, Qu > * Thus one of them should be removed by the caller. > */ > -static void merge_ondisk_extents(struct extent_map *prev, struct extent_map *next) > +static void merge_ondisk_extents(struct extent_map const *prev, struct extent_map const *next, > + struct extent_map *merged) > { > u64 new_disk_bytenr; > u64 new_disk_num_bytes; > @@ -287,15 +288,10 @@ static void merge_ondisk_extents(struct extent_map *prev, struct extent_map *nex > new_disk_bytenr; > new_offset = prev->disk_bytenr + prev->offset - new_disk_bytenr; > > - prev->disk_bytenr = new_disk_bytenr; > - prev->disk_num_bytes = new_disk_num_bytes; > - prev->ram_bytes = new_disk_num_bytes; > - prev->offset = new_offset; > - > - next->disk_bytenr = new_disk_bytenr; > - next->disk_num_bytes = new_disk_num_bytes; > - next->ram_bytes = new_disk_num_bytes; > - next->offset = new_offset; > + merged->disk_bytenr = new_disk_bytenr; > + merged->disk_num_bytes = new_disk_num_bytes; > + merged->ram_bytes = new_disk_num_bytes; > + merged->offset = new_offset; > } > > static void dump_extent_map(struct btrfs_fs_info *fs_info, const char *prefix, > @@ -363,7 +359,7 @@ static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) > em->generation = max(em->generation, merge->generation); > > if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE) > - merge_ondisk_extents(merge, em); > + merge_ondisk_extents(merge, em, em); > em->flags |= EXTENT_FLAG_MERGED; > > validate_extent_map(fs_info, em); > @@ -378,7 +374,7 @@ static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) > if (rb && can_merge_extent_map(merge) && mergeable_maps(em, merge)) { > em->len += merge->len; > if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE) > - merge_ondisk_extents(em, merge); > + merge_ondisk_extents(em, merge, em); > validate_extent_map(fs_info, em); > em->generation = max(em->generation, merge->generation); > em->flags |= EXTENT_FLAG_MERGED;
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index d58d6fc40da1..a8290724475a 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -252,7 +252,8 @@ static bool mergeable_maps(const struct extent_map *prev, const struct extent_ma * @prev and @next will be both updated to point to the new merged range. * Thus one of them should be removed by the caller. */ -static void merge_ondisk_extents(struct extent_map *prev, struct extent_map *next) +static void merge_ondisk_extents(struct extent_map const *prev, struct extent_map const *next, + struct extent_map *merged) { u64 new_disk_bytenr; u64 new_disk_num_bytes; @@ -287,15 +288,10 @@ static void merge_ondisk_extents(struct extent_map *prev, struct extent_map *nex new_disk_bytenr; new_offset = prev->disk_bytenr + prev->offset - new_disk_bytenr; - prev->disk_bytenr = new_disk_bytenr; - prev->disk_num_bytes = new_disk_num_bytes; - prev->ram_bytes = new_disk_num_bytes; - prev->offset = new_offset; - - next->disk_bytenr = new_disk_bytenr; - next->disk_num_bytes = new_disk_num_bytes; - next->ram_bytes = new_disk_num_bytes; - next->offset = new_offset; + merged->disk_bytenr = new_disk_bytenr; + merged->disk_num_bytes = new_disk_num_bytes; + merged->ram_bytes = new_disk_num_bytes; + merged->offset = new_offset; } static void dump_extent_map(struct btrfs_fs_info *fs_info, const char *prefix, @@ -363,7 +359,7 @@ static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) em->generation = max(em->generation, merge->generation); if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE) - merge_ondisk_extents(merge, em); + merge_ondisk_extents(merge, em, em); em->flags |= EXTENT_FLAG_MERGED; validate_extent_map(fs_info, em); @@ -378,7 +374,7 @@ static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) if (rb && can_merge_extent_map(merge) && mergeable_maps(em, merge)) { em->len += merge->len; if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE) - merge_ondisk_extents(em, merge); + merge_ondisk_extents(em, merge, em); validate_extent_map(fs_info, em); em->generation = max(em->generation, merge->generation); em->flags |= EXTENT_FLAG_MERGED;