Message ID | 20230503152441.1141019-14-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/21] btrfs: mark extent_buffer_under_io static | expand |
On Wed, May 03, 2023 at 05:24:33PM +0200, Christoph Hellwig wrote: > The btrfs_bio_ctrl machinery is overkill for writing extent_buffers > as we always operate on PAGE SIZE chunks (or one smaller one for the > subpage case) that are contigous and are guaranteed to fit into a > single bio. Replace it with open coded btrfs_bio_alloc, __bio_add_page > and btrfs_submit_bio calls. submit_extent_page hasn't been open coded completely, there's still call to wbc_account_cgroup_owner() but it's probably skipped for other reasons as this is metadata inode.
On Tue, May 23, 2023 at 08:45:41PM +0200, David Sterba wrote: > On Wed, May 03, 2023 at 05:24:33PM +0200, Christoph Hellwig wrote: > > The btrfs_bio_ctrl machinery is overkill for writing extent_buffers > > as we always operate on PAGE SIZE chunks (or one smaller one for the > > subpage case) that are contigous and are guaranteed to fit into a > > single bio. Replace it with open coded btrfs_bio_alloc, __bio_add_page > > and btrfs_submit_bio calls. > > submit_extent_page hasn't been open coded completely, there's still call > to wbc_account_cgroup_owner() but it's probably skipped for other > reasons as this is metadata inode. Hmm, I was under the impression that we never did cgroup accounting for metadata. While that is true for the rest of the kernel, I fear I might have changed semantics for btrfs here (for better or worse). Let me dig into the code again, but I suspect I'll need to add wbc_account_cgroup_owner and wbc_init_bio back to not change semantics vs the old code.
On Wed, May 24, 2023 at 07:50:03AM +0200, Christoph Hellwig wrote: > On Tue, May 23, 2023 at 08:45:41PM +0200, David Sterba wrote: > > On Wed, May 03, 2023 at 05:24:33PM +0200, Christoph Hellwig wrote: > > > The btrfs_bio_ctrl machinery is overkill for writing extent_buffers > > > as we always operate on PAGE SIZE chunks (or one smaller one for the > > > subpage case) that are contigous and are guaranteed to fit into a > > > single bio. Replace it with open coded btrfs_bio_alloc, __bio_add_page > > > and btrfs_submit_bio calls. > > > > submit_extent_page hasn't been open coded completely, there's still call > > to wbc_account_cgroup_owner() but it's probably skipped for other > > reasons as this is metadata inode. > > Hmm, I was under the impression that we never did cgroup accounting for > metadata. While that is true for the rest of the kernel, I fear I might > have changed semantics for btrfs here (for better or worse). Let me > dig into the code again, but I suspect I'll need to add > wbc_account_cgroup_owner and wbc_init_bio back to not change semantics > vs the old code. Until we know for sure I'd rather keep the call, so something like that (untested): diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index fcd0563f7a15..a6480fa0366c 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1833,6 +1833,8 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb, wbc->nr_to_write--; } __bio_add_page(&bbio->bio, p, eb->len, eb->start - page_offset(p)); + if (bio_ctrl->wbc) + wbc_account_cgroup_owner(bio_ctrl->wbc, p, len); unlock_page(p); } else { for (int i = 0; i < num_extent_pages(eb); i++) { @@ -1842,6 +1844,8 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb, clear_page_dirty_for_io(p); set_page_writeback(p); __bio_add_page(&bbio->bio, p, PAGE_SIZE, 0); + if (bio_ctrl->wbc) + wbc_account_cgroup_owner(bio_ctrl->wbc, p, len); wbc->nr_to_write--; unlock_page(p); } @@ -4090,9 +4094,15 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num, if (eb->fs_info->nodesize < PAGE_SIZE) { __bio_add_page(&bbio->bio, eb->pages[0], eb->len, eb->start - page_offset(eb->pages[0])); + if (bio_ctrl->wbc) + wbc_account_cgroup_owner(bio_ctrl->wbc, eb->pages[0], len); } else { - for (i = 0; i < num_pages; i++) + for (i = 0; i < num_pages; i++) { __bio_add_page(&bbio->bio, eb->pages[i], PAGE_SIZE, 0); + if (bio_ctrl->wbc) + wbc_account_cgroup_owner(bio_ctrl->wbc, + eb->pages[i], len); + } } btrfs_submit_bio(bbio, mirror_num); --- I can fold it to the patches.
On Fri, May 26, 2023 at 03:15:49PM +0200, David Sterba wrote: > Until we know for sure I'd rather keep the call, so something like that > (untested): Yes. I have something similar, but I didn't manage to get around to testing either as I'm at a conference right now. Comments below: > +++ b/fs/btrfs/extent_io.c > @@ -1833,6 +1833,8 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb, > wbc->nr_to_write--; > } > __bio_add_page(&bbio->bio, p, eb->len, eb->start - page_offset(p)); > + if (bio_ctrl->wbc) > + wbc_account_cgroup_owner(bio_ctrl->wbc, p, len); wbc is always non-NULL in write_one_eb, no need for a conditional here or in the other branch. > @@ -4090,9 +4094,15 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num, > if (eb->fs_info->nodesize < PAGE_SIZE) { > __bio_add_page(&bbio->bio, eb->pages[0], eb->len, > eb->start - page_offset(eb->pages[0])); > + if (bio_ctrl->wbc) > + wbc_account_cgroup_owner(bio_ctrl->wbc, eb->pages[0], len); .. and the read side never has one. For the write side we'll also need calls to bio_set_dev and wbc_init_bio. I plan take care of all of this including testing over the weekend.
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 76636e7c21b02f..68cdc6bed60c19 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -121,9 +121,6 @@ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl) /* Caller should ensure the bio has at least some range added */ ASSERT(bbio->bio.bi_iter.bi_size); - if (!is_data_inode(&bbio->inode->vfs_inode)) - bbio->bio.bi_opf |= REQ_META; - if (btrfs_op(&bbio->bio) == BTRFS_MAP_READ && bio_ctrl->compress_type != BTRFS_COMPRESS_NONE) btrfs_submit_compressed_read(bbio); @@ -1899,11 +1896,7 @@ static void write_one_subpage_eb(struct extent_buffer *eb, struct btrfs_fs_info *fs_info = eb->fs_info; struct page *page = eb->pages[0]; bool no_dirty_ebs = false; - struct btrfs_bio_ctrl bio_ctrl = { - .wbc = wbc, - .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc), - .end_io_func = end_bio_subpage_eb_writepage, - }; + struct btrfs_bio *bbio; prepare_eb_write(eb); @@ -1917,10 +1910,16 @@ static void write_one_subpage_eb(struct extent_buffer *eb, if (no_dirty_ebs) clear_page_dirty_for_io(page); - submit_extent_page(&bio_ctrl, eb->start, page, eb->len, - eb->start - page_offset(page)); + bbio = btrfs_bio_alloc(INLINE_EXTENT_BUFFER_PAGES, + REQ_OP_WRITE | REQ_META | wbc_to_write_flags(wbc), + eb->fs_info, end_bio_subpage_eb_writepage, NULL); + bbio->bio.bi_iter.bi_sector = eb->start >> SECTOR_SHIFT; + bbio->inode = BTRFS_I(eb->fs_info->btree_inode); + bbio->file_offset = eb->start; + __bio_add_page(&bbio->bio, page, eb->len, eb->start - page_offset(page)); unlock_page(page); - submit_one_bio(&bio_ctrl); + btrfs_submit_bio(bbio, 0); + /* * Submission finished without problem, if no range of the page is * dirty anymore, we have submitted a page. Update nr_written in wbc. @@ -1932,16 +1931,19 @@ static void write_one_subpage_eb(struct extent_buffer *eb, static noinline_for_stack void write_one_eb(struct extent_buffer *eb, struct writeback_control *wbc) { - u64 disk_bytenr = eb->start; + struct btrfs_bio *bbio; int i, num_pages; - struct btrfs_bio_ctrl bio_ctrl = { - .wbc = wbc, - .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc), - .end_io_func = end_bio_extent_buffer_writepage, - }; prepare_eb_write(eb); + bbio = btrfs_bio_alloc(INLINE_EXTENT_BUFFER_PAGES, + REQ_OP_WRITE | REQ_META | wbc_to_write_flags(wbc), + eb->fs_info, end_bio_extent_buffer_writepage, + NULL); + bbio->bio.bi_iter.bi_sector = eb->start >> SECTOR_SHIFT; + bbio->inode = BTRFS_I(eb->fs_info->btree_inode); + bbio->file_offset = eb->start; + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { struct page *p = eb->pages[i]; @@ -1949,12 +1951,11 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb, lock_page(p); clear_page_dirty_for_io(p); set_page_writeback(p); - submit_extent_page(&bio_ctrl, disk_bytenr, p, PAGE_SIZE, 0); - disk_bytenr += PAGE_SIZE; + __bio_add_page(&bbio->bio, p, PAGE_SIZE, 0); wbc->nr_to_write--; unlock_page(p); } - submit_one_bio(&bio_ctrl); + btrfs_submit_bio(bbio, 0); } /*