Message ID | cover.1723096922.git.wqu@suse.com (mailing list archive)
---|---
Series | btrfs: reduce extent map lookup overhead for data write
On Thu, Aug 8, 2024 at 7:06 AM Qu Wenruo <wqu@suse.com> wrote:
>
> Unlike the data read path, which uses a cached extent map to reduce overhead,
> the data write path always does the extent map lookup, no matter what.
>
> So this patchset improves the situation by:
>
> - Moving em_cached into bio_ctrl
>   Since their lifespans are the same, it's a perfect match.
>
> - Making the data write path use bio_ctrl::em_cached
>
> Unfortunately, since my last relocation I no longer have any dedicated
> storage attached to my VMs (my laptop only has one NVMe slot, and my main
> workhorse aarch64 board only has one NVMe attached as well).
>
> So no benchmarks yet, and any extra benchmarks would be very much appreciated.

You don't need to test with dedicated NVMe or any dedicated hardware. Use a
VM and a null block device. Make the test scenario a 1M write into a file
that has a few thousand extent maps loaded in the inode's io tree, then
measure the total time taken by submit_one_sector() calls with a bpftrace
script (see the sketch after this message).

> Qu Wenruo (2):
>   btrfs: introduce extent_map::em_cached member

The subject doesn't match the change; it should be btrfs_bio_ctrl::em_cached.
The name is also odd, "cached_em" would be more correct, but given that it's
inside a context structure, just "em" would be fine.

>   btrfs: utilize cached extent map for data writeback
>
>  fs/btrfs/extent_io.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> --
> 2.45.2
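The measurement suggested above can be done with a short bpftrace program. The sketch below is an assumption about how such a script might look, not something posted in the thread: it attaches a kprobe/kretprobe pair to submit_one_sector() (which must not be inlined and must be visible in kallsyms for this to work) and reports the call count, total time, and a per-call latency histogram.

```
#!/usr/bin/env bpftrace
// Time spent in submit_one_sector() during the test write.
// Hypothetical sketch: if submit_one_sector() is inlined or not visible in
// kallsyms, tracing btrfs_get_extent() gives a similar view of lookup cost.

kprobe:submit_one_sector
{
	@start[tid] = nsecs;
}

kretprobe:submit_one_sector
/@start[tid]/
{
	$delta = nsecs - @start[tid];
	@total_ns = sum($delta);   // total time across all calls
	@calls = count();          // number of calls observed
	@lat_ns = hist($delta);    // per-call latency distribution
	delete(@start[tid]);
}

END
{
	// The maps (@total_ns, @calls, @lat_ns) are printed automatically on exit.
	clear(@start);
}
```

One plausible way to drive it, again as an assumption rather than a recipe from the thread: modprobe null_blk, run mkfs.btrfs on and mount /dev/nullb0, create a heavily fragmented file (for example with many small synced writes) and read it back so a few thousand extent maps are loaded, then run the script while issuing the 1M write with a tool such as xfs_io.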
On Thu, Aug 08, 2024 at 03:35:58PM +0930, Qu Wenruo wrote:
> Unlike the data read path, which uses a cached extent map to reduce overhead,
> the data write path always does the extent map lookup, no matter what.
>
> So this patchset improves the situation by:
>
> - Moving em_cached into bio_ctrl
>   Since their lifespans are the same, it's a perfect match.
>
> - Making the data write path use bio_ctrl::em_cached
>
> Unfortunately, since my last relocation I no longer have any dedicated
> storage attached to my VMs (my laptop only has one NVMe slot, and my main
> workhorse aarch64 board only has one NVMe attached as well).
>
> So no benchmarks yet, and any extra benchmarks would be very much appreciated.
>
> Qu Wenruo (2):
>   btrfs: introduce extent_map::em_cached member
>   btrfs: utilize cached extent map for data writeback

This looks like a good optimization. We're approaching code freeze (next
week), so it would be good to get it merged by then. As it's an optimization,
we can also postpone it if you're busy with other things.
On 2024/8/30 03:30, David Sterba wrote:
> On Thu, Aug 08, 2024 at 03:35:58PM +0930, Qu Wenruo wrote:
>> Unlike the data read path, which uses a cached extent map to reduce overhead,
>> the data write path always does the extent map lookup, no matter what.
>>
>> So this patchset improves the situation by:
>>
>> - Moving em_cached into bio_ctrl
>>   Since their lifespans are the same, it's a perfect match.
>>
>> - Making the data write path use bio_ctrl::em_cached
>>
>> Unfortunately, since my last relocation I no longer have any dedicated
>> storage attached to my VMs (my laptop only has one NVMe slot, and my main
>> workhorse aarch64 board only has one NVMe attached as well).
>>
>> So no benchmarks yet, and any extra benchmarks would be very much appreciated.
>>
>> Qu Wenruo (2):
>>   btrfs: introduce extent_map::em_cached member
>>   btrfs: utilize cached extent map for data writeback
>
> This looks like a good optimization. We're approaching code freeze (next
> week), so it would be good to get it merged by then. As it's an optimization,
> we can also postpone it if you're busy with other things.

I'd like to postpone this one.

The biggest problem is that this was not working the last time I
micro-benchmarked the whole thing. It turns out that with the new code the
latency to look up an extent map is in fact larger than with the existing
code (a sketch of how such a lookup-latency comparison can be measured
follows this message). Maybe it's related to the extent map shrinker doing
too good a job.

I'll refresh this series when the em shrinker gets stable and I can
re-benchmark to see whether it's still worthwhile.

Thanks,
Qu
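For the lookup-latency comparison mentioned above, a before/after histogram of btrfs_get_extent() calls is one way to quantify a regression. This is an assumed sketch, not a script from the thread; btrfs_get_extent() is used here as the lookup entry point, and lookup_extent_mapping() could be traced the same way.

```
#!/usr/bin/env bpftrace
// Latency histogram for extent map lookups, for comparing two kernels.

kprobe:btrfs_get_extent
{
	@start[tid] = nsecs;
}

kretprobe:btrfs_get_extent
/@start[tid]/
{
	// One histogram bucket per power-of-two latency range, in nanoseconds.
	@lookup_ns = hist(nsecs - @start[tid]);
	delete(@start[tid]);
}
```

Run it once on the unpatched kernel and once on the patched one under the same workload; a shift of the histogram toward larger buckets on the patched kernel would support the observation above, while correlating it with extent map shrinker activity would need separate tracing.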