Message ID | 20200406232440.4027-3-chaitanya.kulkarni@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | block: add bio based read-write helper | expand |
On 2020/04/07 8:25, Chaitanya Kulkarni wrote: > The existing routine xfs_rw_bdev has block layer bio-based code which > maps the data buffer allocated with kmalloc or vmalloc to the READ/WRITE > bios. Use a block layer helper from the previous patch to avoid code > duplication. > > Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> > --- > fs/xfs/xfs_bio_io.c | 47 ++------------------------------------------- > 1 file changed, 2 insertions(+), 45 deletions(-) > > diff --git a/fs/xfs/xfs_bio_io.c b/fs/xfs/xfs_bio_io.c > index e2148f2d5d6b..b04c398fb99c 100644 > --- a/fs/xfs/xfs_bio_io.c > +++ b/fs/xfs/xfs_bio_io.c > @@ -4,11 +4,6 @@ > */ > #include "xfs.h" > > -static inline unsigned int bio_max_vecs(unsigned int count) > -{ > - return min_t(unsigned, howmany(count, PAGE_SIZE), BIO_MAX_PAGES); > -} > - > int > xfs_rw_bdev( > struct block_device *bdev, > @@ -18,44 +13,6 @@ xfs_rw_bdev( > unsigned int op) > > { > - unsigned int is_vmalloc = is_vmalloc_addr(data); > - unsigned int left = count; > - int error; > - struct bio *bio; > - > - if (is_vmalloc && op == REQ_OP_WRITE) > - flush_kernel_vmap_range(data, count); > - > - bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); > - bio_set_dev(bio, bdev); > - bio->bi_iter.bi_sector = sector; > - bio->bi_opf = op | REQ_META | REQ_SYNC; > - > - do { > - struct page *page = kmem_to_page(data); > - unsigned int off = offset_in_page(data); > - unsigned int len = min_t(unsigned, left, PAGE_SIZE - off); > - > - while (bio_add_page(bio, page, len, off) != len) { > - struct bio *prev = bio; > - > - bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); > - bio_copy_dev(bio, prev); > - bio->bi_iter.bi_sector = bio_end_sector(prev); > - bio->bi_opf = prev->bi_opf; > - bio_chain(prev, bio); > - > - submit_bio(prev); > - } > - > - data += len; > - left -= len; > - } while (left > 0); Your helper could use a similar loop structure. This is very easy to read. > - > - error = submit_bio_wait(bio); > - bio_put(bio); > - > - if (is_vmalloc && op == REQ_OP_READ) > - invalidate_kernel_vmap_range(data, count); > - return error; > + return blkdev_issue_rw(bdev, data, sector, count, op, REQ_META, > + GFP_KERNEL); > } >
On 04/06/2020 05:32 PM, Damien Le Moal wrote: >> - >> >- do { >> >- struct page *page = kmem_to_page(data); >> >- unsigned int off = offset_in_page(data); >> >- unsigned int len = min_t(unsigned, left, PAGE_SIZE - off); >> >- >> >- while (bio_add_page(bio, page, len, off) != len) { >> >- struct bio *prev = bio; >> >- >> >- bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); >> >- bio_copy_dev(bio, prev); >> >- bio->bi_iter.bi_sector = bio_end_sector(prev); >> >- bio->bi_opf = prev->bi_opf; >> >- bio_chain(prev, bio); >> >- >> >- submit_bio(prev); >> >- } >> >- >> >- data += len; >> >- left -= len; >> >- } while (left > 0); > Your helper could use a similar loop structure. This is very easy to read. > If I understand correctly this pattern is used since it is not a part of block layer. The helpers in blk-lib.c are not accessible so this :- 1. Adds an extra helper bio_max_vecs(). 2. Adds a new macro howmany which is XFS specific and doesn't follow usual block layer macros naming. 3. Open codes bio chaining. 4. Uses two bio variables for chaining. 5. Doesn't allow to anchor bio which is needed in async I/O scenario. All above breaks the existing pattern and code reuse in blk-lib.c, since blk-lib.c:- 1. Already provides blk_next_bio() why repeat the bio allocation and bio chaining code ? 2. Instead of adding a new helper bio_max_vecs() why not use existing __blkdev_sectors_to_bio_pages() ? 3. Why use two bios when it can be done with one bio with the helpers in blk-lib.c ? 4. Allows async version.
On Tue, Apr 07, 2020 at 08:06:35PM +0000, Chaitanya Kulkarni wrote: > On 04/06/2020 05:32 PM, Damien Le Moal wrote: > >> - > >> >- do { > >> >- struct page *page = kmem_to_page(data); > >> >- unsigned int off = offset_in_page(data); > >> >- unsigned int len = min_t(unsigned, left, PAGE_SIZE - off); > >> >- > >> >- while (bio_add_page(bio, page, len, off) != len) { > >> >- struct bio *prev = bio; > >> >- > >> >- bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); > >> >- bio_copy_dev(bio, prev); > >> >- bio->bi_iter.bi_sector = bio_end_sector(prev); > >> >- bio->bi_opf = prev->bi_opf; > >> >- bio_chain(prev, bio); > >> >- > >> >- submit_bio(prev); > >> >- } > >> >- > >> >- data += len; > >> >- left -= len; > >> >- } while (left > 0); > > Your helper could use a similar loop structure. This is very easy to read. > > > If I understand correctly this pattern is used since it is not a part of > block layer. It's because it was simple and easy to understandi, not because of the fact it is outside the core block layer. Us XFS folks are simple people who like simple things that are easy to understand because there is so much of XFS that is so horrifically complex that we want to implement simple stuff once and just not have to care about it again. > The helpers in blk-lib.c are not accessible so this :- So export the helpers? > All above breaks the existing pattern and code reuse in blk-lib.c, since > blk-lib.c:- > 1. Already provides blk_next_bio() why repeat the bio allocation > and bio chaining code ? So export the helper? > 2. Instead of adding a new helper bio_max_vecs() why not use existing > __blkdev_sectors_to_bio_pages() ? That's not an improvement. The XFS code is _much_ easier to read and understand. > 3. Why use two bios when it can be done with one bio with the helpers > in blk-lib.c ? That helper is blk_next_bio(), which hides the second bio inside it. So you aren't actually getting rid of the need for two bio pointers. > 4. Allows async version. Which is not used by XFS, so just adds complexity to this XFS path for no good reason. Seriously, this 20 lines of code in XFS turns into 50-60 lines of code in the block layer to do the same thing. How is that an improvement? Cheers, Dave.
(+cc Christoph) On 04/07/2020 04:28 PM, Dave Chinner wrote: > On Tue, Apr 07, 2020 at 08:06:35PM +0000, Chaitanya Kulkarni wrote: >> On 04/06/2020 05:32 PM, Damien Le Moal wrote: >>>> - >>>>> - do { >>>>> - struct page *page = kmem_to_page(data); >>>>> - unsigned int off = offset_in_page(data); >>>>> - unsigned int len = min_t(unsigned, left, PAGE_SIZE - off); >>>>> - >>>>> - while (bio_add_page(bio, page, len, off) != len) { >>>>> - struct bio *prev = bio; >>>>> - >>>>> - bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); >>>>> - bio_copy_dev(bio, prev); >>>>> - bio->bi_iter.bi_sector = bio_end_sector(prev); >>>>> - bio->bi_opf = prev->bi_opf; >>>>> - bio_chain(prev, bio); >>>>> - >>>>> - submit_bio(prev); >>>>> - } >>>>> - >>>>> - data += len; >>>>> - left -= len; >>>>> - } while (left > 0); >>> Your helper could use a similar loop structure. This is very easy to read. >>> >> If I understand correctly this pattern is used since it is not a part of >> block layer. > > It's because it was simple and easy to understandi, not because of > the fact it is outside the core block layer. > > Us XFS folks are simple people who like simple things that are easy to > understand because there is so much of XFS that is so horrifically > complex that we want to implement simple stuff once and just not > have to care about it again. > Agree, new code should not make things complicated, in fact initial version had existing code pattern see [1]. Before answering the questions I think I should share few considerations I went through based on the code and requirement :- 1. The new helper is needed since XFS and rnbd is doing the same thing, i.e. mapping bios to buffer. The helper should be a part of the block layer library function, as most of the library APIs are present there, which are used by the file systems/network drivers. e.g. blkdev_issue_zeroout(). Summery :- Something like blkdev_issue_zeroout() is a good candidate since it is used in file-systems and network drivers. 2. Figure out the common code which can be moved to the library and identify what different APIs are needed for the functionality to save code duplication :- A. Synchronous version is needed for XFS. (without cond_resched() I need to fix this in next version). B. Asynchronous version is needed for network drivers rnbd as mentioned in the cover letter. Summery :- Add sync and async version. 3. Whenever a new function is added to the library we should not invent new patterns and follow the same style which is present in the library code i.e. in blk-lib.c, unless new API can't fit into current pattern. This is needed since block layer maintainers might complain about new design pattern. Summary :- See if new API can be designed identical to blkdev_issue_zeroout() and __blkdev_issue_zero_pages(). 4. Look at the existing user(s) (XFS) of the library (blk-lib.c) API(zeroout) and how it is using existing API if any. For XFS, which uses existing library function :- "fs/xfs/xfs_bmap_util.c:65: blkdev_issue_zeroout(target->bt_bdev" Summery :- Since the library(blk-lib.c) API(zeroout) is used in the user(XFS) that means it is acceptable to use similar code pattern. 5. Design a new API in the library based on the existing code. That :- A. Matches the code pattern which is present in the library, already used in the exiting user and reuse it. B. Make sure to have identical APIs parameter to keep the consistency whenever it is possible. (including APIs prototype parameters name etc). Summary :- Add blkdev_issue_rw() -> blkdev_issue_zeroout() and __blkdev_issue_rw() -> __blkdev_issue_zero_pages(). Regarding the number of lines, this is not aimed at having less number of lines in XFS only than existing xfs_rw_bdev() (although it does that), but since it is a library API it will save overall lines of the code (e.g. XFS 60 and rnbd ~50 and future uses) in the users like fs and network drivers similar to existing zeroout API. Please correct me if I'm wrong in any the above considerations, given that I'm not a XFS developer I'd like to make sure new API pattern is acceptable to both XFS developers and the block layer maintainers. I'm not stuck on using this pattern and I'll be happy to re-use the xfs_rw_bdev() pattern if block layer maintainers are okay with it. I'll wait for them to reply. >> The helpers in blk-lib.c are not accessible so this :- > > So export the helpers? > >> All above breaks the existing pattern and code reuse in blk-lib.c, since >> blk-lib.c:- >> 1. Already provides blk_next_bio() why repeat the bio allocation >> and bio chaining code ? > > So export the helper? > >> 2. Instead of adding a new helper bio_max_vecs() why not use existing >> __blkdev_sectors_to_bio_pages() ? > > That's not an improvement. The XFS code is _much_ easier to read > and understand. > >> 3. Why use two bios when it can be done with one bio with the helpers >> in blk-lib.c ? > > That helper is blk_next_bio(), which hides the second bio inside it. > So you aren't actually getting rid of the need for two bio pointers. > >> 4. Allows async version. > > Which is not used by XFS, so just adds complexity to this XFS path > for no good reason. > > Seriously, this 20 lines of code in XFS turns into 50-60 lines > of code in the block layer to do the same thing. How is that an > improvement? > > Cheers, > > Dave. > [1] Initial draft (for reference only) :- struct bio *__bio_execute_rw(struct block_device *bd, char *buf, sector_t sect, unsigned count, unsigned op, unsigned opf, gfp_t gfp_mask) { bool vm = is_vmalloc_addr(buf) ? true : false; struct request_queue *q = bd->bd_queue; unsigned int left = count; struct page *page; struct bio *new_bio; blk_qc_t cookie; bool do_poll; new_bio = bio_alloc(gfp_mask, bio_max_vecs(left)); bio_set_dev(new_bio, bd); new_bio->bi_iter.bi_sector = sect; new_bio->bi_opf = op_opf | REQ_SYNC; do { unsigned int offset = offset_in_page(buf); unsigned int len = min_t(unsigned, left, PAGE_SIZE - offset); page = vm ? vmalloc_to_page(buf) : virt_to_page(buf); while (bio_add_page(new_bio, page, len, offset) != len) { struct bio *curr_bio = new_bio; new_bio = bio_alloc(gfp_mask, bio_max_vecs(left)); bio_copy_dev(new_bio, curr_bio); new_bio->bi_iter.bi_sector = bio_end_sector(curr_bio); new_bio->bi_opf = curr_bio->bi_opf; bio_chain(curr_bio, new_bio); cookie = submit_bio(curr_bio); } buf += len; left -= len; } while (left > 0); return new_bio; } EXPORT_SYMBOL(__bio_execute_rw); int bio_execute_rw_sync(struct block_device *bd, char *buf, sector_t sect, unsigned count, unsigned op, unsigned opf, gfp_t gfp_mask) { unsigned int is_vmalloc = is_vmalloc_addr(buf); struct bio *bio; int error; if (is_vmalloc && op == REQ_OP_WRITE) flush_kernel_vmap_range(buf, count); bio = __bio_execute_rw(bd, sect, count, buf, op, opf, mask); error = submit_bio_wait(bio); bio_put(bio); if (is_vmalloc && op == REQ_OP_READ) invalidate_kernel_vmap_range(buf, count); return error; } EXPORT_SYMBOL_GPL(bio_execute_rw_sync);
diff --git a/fs/xfs/xfs_bio_io.c b/fs/xfs/xfs_bio_io.c index e2148f2d5d6b..b04c398fb99c 100644 --- a/fs/xfs/xfs_bio_io.c +++ b/fs/xfs/xfs_bio_io.c @@ -4,11 +4,6 @@ */ #include "xfs.h" -static inline unsigned int bio_max_vecs(unsigned int count) -{ - return min_t(unsigned, howmany(count, PAGE_SIZE), BIO_MAX_PAGES); -} - int xfs_rw_bdev( struct block_device *bdev, @@ -18,44 +13,6 @@ xfs_rw_bdev( unsigned int op) { - unsigned int is_vmalloc = is_vmalloc_addr(data); - unsigned int left = count; - int error; - struct bio *bio; - - if (is_vmalloc && op == REQ_OP_WRITE) - flush_kernel_vmap_range(data, count); - - bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); - bio_set_dev(bio, bdev); - bio->bi_iter.bi_sector = sector; - bio->bi_opf = op | REQ_META | REQ_SYNC; - - do { - struct page *page = kmem_to_page(data); - unsigned int off = offset_in_page(data); - unsigned int len = min_t(unsigned, left, PAGE_SIZE - off); - - while (bio_add_page(bio, page, len, off) != len) { - struct bio *prev = bio; - - bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left)); - bio_copy_dev(bio, prev); - bio->bi_iter.bi_sector = bio_end_sector(prev); - bio->bi_opf = prev->bi_opf; - bio_chain(prev, bio); - - submit_bio(prev); - } - - data += len; - left -= len; - } while (left > 0); - - error = submit_bio_wait(bio); - bio_put(bio); - - if (is_vmalloc && op == REQ_OP_READ) - invalidate_kernel_vmap_range(data, count); - return error; + return blkdev_issue_rw(bdev, data, sector, count, op, REQ_META, + GFP_KERNEL); }
The existing routine xfs_rw_bdev has block layer bio-based code which maps the data buffer allocated with kmalloc or vmalloc to the READ/WRITE bios. Use a block layer helper from the previous patch to avoid code duplication. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> --- fs/xfs/xfs_bio_io.c | 47 ++------------------------------------------- 1 file changed, 2 insertions(+), 45 deletions(-)