diff mbox series

[2/2] block: store bdev->bd_disk->fops->submit_bio state in bdev

Message ID 20230414134848.91563-3-axboe@kernel.dk (mailing list archive)
State New, archived
Series Optimize block_device utilization

Commit Message

Jens Axboe April 14, 2023, 1:48 p.m. UTC
We have a long chain of memory dereferencing just to check whether or not
this disk has a special submit_bio helper. As that's not necessarily
the common case, add a bd_submit_bio state in the bdev to avoid
traversing this memory dependency chain if we don't need to.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bdev.c              | 1 +
 block/blk-core.c          | 8 ++++----
 block/genhd.c             | 4 ++++
 include/linux/blk_types.h | 1 +
 4 files changed, 10 insertions(+), 4 deletions(-)

Comments

Damien Le Moal April 15, 2023, 2:43 a.m. UTC | #1
On 4/14/23 22:48, Jens Axboe wrote:
> We have a long chain of memory dereferencing just to check whether or not
> this disk has a special submit_bio helper. As that's not necessarily
> the common case, add a bd_submit_bio state in the bdev to avoid
> traversing this memory dependency chain if we don't need to.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  block/bdev.c              | 1 +
>  block/blk-core.c          | 8 ++++----
>  block/genhd.c             | 4 ++++
>  include/linux/blk_types.h | 1 +
>  4 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 1795c7d4b99e..31a5d25b2b44 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -419,6 +419,7 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
>  	bdev->bd_inode = inode;
>  	bdev->bd_queue = disk->queue;
>  	bdev->bd_stats = alloc_percpu(struct disk_stats);
> +	bdev->bd_submit_bio = 0;

"= false;" would be better to match bd_submit_bio type.

[...]

> diff --git a/block/genhd.c b/block/genhd.c
> index 02d9cfb9e077..07736c5db988 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -420,6 +420,10 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
>  	 */
>  	elevator_init_mq(disk->queue);
>  
> +	/* Mark bdev as having a submit_bio, if needed */
> +	if (disk->fops->submit_bio)
> +		disk->part0->bd_submit_bio = 1;

"= true;" would be better to match the type.

Note that this could also be:

disk->part0->bd_submit_bio = disk->fops->submit_bio;

thus removing the if.
Jens Axboe April 15, 2023, 3:41 a.m. UTC | #2
On 4/14/23 8:43 PM, Damien Le Moal wrote:
> On 4/14/23 22:48, Jens Axboe wrote:
>> We have a long chain of memory dereferencing just to check whether or not
>> this disk has a special submit_bio helper. As that's not necessarily
>> the common case, add a bd_submit_bio state in the bdev to avoid
>> traversing this memory dependency chain if we don't need to.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  block/bdev.c              | 1 +
>>  block/blk-core.c          | 8 ++++----
>>  block/genhd.c             | 4 ++++
>>  include/linux/blk_types.h | 1 +
>>  4 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/bdev.c b/block/bdev.c
>> index 1795c7d4b99e..31a5d25b2b44 100644
>> --- a/block/bdev.c
>> +++ b/block/bdev.c
>> @@ -419,6 +419,7 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
>>  	bdev->bd_inode = inode;
>>  	bdev->bd_queue = disk->queue;
>>  	bdev->bd_stats = alloc_percpu(struct disk_stats);
>> +	bdev->bd_submit_bio = 0;
> 
> "= false;" would be better to match bd_submit_bio type.

Done

>> diff --git a/block/genhd.c b/block/genhd.c
>> index 02d9cfb9e077..07736c5db988 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -420,6 +420,10 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
>>  	 */
>>  	elevator_init_mq(disk->queue);
>>  
>> +	/* Mark bdev as having a submit_bio, if needed */
>> +	if (disk->fops->submit_bio)
>> +		disk->part0->bd_submit_bio = 1;
> 
> "= true;" would be better to match the type.
> 
> Note that this could also be:
> 
> disk->part0->bd_submit_bio = disk->fops->submit_bio;
> 
> thus removing the if.

I made it:

disk->part0->bd_submit_bio = disk->fops->submit_bio != NULL;

instead to make it explicit, I don't think that assignment would be
happy otherwise.
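
For reference, a minimal user-space sketch of the two assignment forms discussed above. It assumes a C99 `bool` (that is, `_Bool`), for which converting a non-null pointer yields true, so both forms should produce the same value and the explicit comparison mainly documents intent. The names `submit_fn`, `dummy_submit`, `has_hook_implicit`, `has_hook_explicit` and `check_forms_agree` are hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fops->submit_bio hook type. */
typedef void (*submit_fn)(void *bio);

/* Hypothetical hook implementation, used only to obtain a non-NULL pointer. */
static inline void dummy_submit(void *bio)
{
	(void)bio;
}

/* Implicit form: a pointer converted to _Bool is true when it is non-null. */
static inline bool has_hook_implicit(submit_fn fn)
{
	bool b = fn;

	return b;
}

/* Explicit form: compare against NULL, as in the reply above. */
static inline bool has_hook_explicit(submit_fn fn)
{
	return fn != NULL;
}

/* Sanity check: both forms agree for the given hook. */
static inline void check_forms_agree(submit_fn fn)
{
	assert(has_hook_implicit(fn) == has_hook_explicit(fn));
}
```

Calling `check_forms_agree(dummy_submit)` and `check_forms_agree(NULL)` exercises both the set and the unset case.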
Christoph Hellwig April 16, 2023, 5:53 a.m. UTC | #3
On Fri, Apr 14, 2023 at 07:48:48AM -0600, Jens Axboe wrote:
> We have a long chain of memory dereferencing just to check whether or not
> this disk has a special submit_bio helper. As that's not necessarily
> the common case, add a bd_submit_bio state in the bdev to avoid
> traversing this memory dependency chain if we don't need to.

Do you have any numbers on how this helps?

> +	bdev->bd_submit_bio = 0;

bd_submit_bio sounds like a function call, so I'd name this
bd_has_submit_bio.

But maybe it might make more sense to just add a bit that this is
a blk-mq backed device into bd_state as that might be handy in other
places as well?
Jens Axboe April 16, 2023, 6:59 p.m. UTC | #4
On 4/15/23 11:53 PM, Christoph Hellwig wrote:
> On Fri, Apr 14, 2023 at 07:48:48AM -0600, Jens Axboe wrote:
>> We have a long chain of memory dereferencing just to check whether or not
>> this disk has a special submit_bio helper. As that's not necessarily
>> the common case, add a bd_submit_bio state in the bdev to avoid
>> traversing this memory dependency chain if we don't need to.
> 
> Do you have any numbers on how this helps?

I didn't run any numbers, but seems obvious to me that we don't want
to pull in 3 layers deep of pointer indirections when we can avoid
it.
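
To illustrate the indirection being avoided, here is a rough user-space sketch with mock structures. It is not the kernel code: the `mock_*` types, the `has_submit_bio` field and the helper names are made up for illustration, and it says nothing about measured performance. It only shows that the check goes from following bdev, disk and fops to reading a flag stored in the bdev itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct bio;

/* Mock of the driver ops table; submit_bio stays NULL for blk-mq devices. */
struct mock_fops {
	void (*submit_bio)(struct bio *bio);
};

struct mock_disk {
	const struct mock_fops *fops;
};

struct mock_bdev {
	struct mock_disk *bd_disk;
	bool has_submit_bio;	/* hypothetical cached copy of the check */
};

struct bio {
	struct mock_bdev *bi_bdev;
};

/* Hypothetical bio-based driver hook; does nothing in this sketch. */
static inline void mock_submit_bio(struct bio *bio)
{
	(void)bio;
}

/* Before: follows bi_bdev, bd_disk and fops before reading submit_bio. */
static inline bool needs_submit_bio_chain(const struct bio *bio)
{
	return bio->bi_bdev->bd_disk->fops->submit_bio != NULL;
}

/* After: follows bi_bdev only and reads the cached flag. */
static inline bool needs_submit_bio_cached(const struct bio *bio)
{
	return bio->bi_bdev->has_submit_bio;
}

/* Fill in the cached flag once, at disk registration time. */
static inline void mock_add_disk(struct mock_bdev *bdev)
{
	assert(bdev && bdev->bd_disk && bdev->bd_disk->fops);
	bdev->has_submit_bio = bdev->bd_disk->fops->submit_bio != NULL;
}
```

The flag is only valid as long as the ops table does not change after registration, which this sketch assumes.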

>> +	bdev->bd_submit_bio = 0;
> 
> bd_submit_bio sounds like a function call, so I'd name this
> bd_has_submit_bio.

Good point, I'll rename it.

> But maybe it might make more sense to just add a bit that this is
> a blk-mq backed device into bd_state as that might be handy in other
> places as well?

I'd rather just do that if needed.

Patch

diff --git a/block/bdev.c b/block/bdev.c
index 1795c7d4b99e..31a5d25b2b44 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -419,6 +419,7 @@  struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 	bdev->bd_inode = inode;
 	bdev->bd_queue = disk->queue;
 	bdev->bd_stats = alloc_percpu(struct disk_stats);
+	bdev->bd_submit_bio = 0;
 	if (!bdev->bd_stats) {
 		iput(inode);
 		return NULL;
diff --git a/block/blk-core.c b/block/blk-core.c
index 269765d16cfd..ae7953539dc0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -587,14 +587,14 @@  static inline blk_status_t blk_check_zone_append(struct request_queue *q,
 
 static void __submit_bio(struct bio *bio)
 {
-	struct gendisk *disk = bio->bi_bdev->bd_disk;
-
 	if (unlikely(!blk_crypto_bio_prep(&bio)))
 		return;
 
-	if (!disk->fops->submit_bio) {
+	if (!bio->bi_bdev->bd_submit_bio) {
 		blk_mq_submit_bio(bio);
 	} else if (likely(bio_queue_enter(bio) == 0)) {
+		struct gendisk *disk = bio->bi_bdev->bd_disk;
+
 		disk->fops->submit_bio(bio);
 		blk_queue_exit(disk->queue);
 	}
@@ -698,7 +698,7 @@  void submit_bio_noacct_nocheck(struct bio *bio)
 	 */
 	if (current->bio_list)
 		bio_list_add(&current->bio_list[0], bio);
-	else if (!bio->bi_bdev->bd_disk->fops->submit_bio)
+	else if (!bio->bi_bdev->bd_submit_bio)
 		__submit_bio_noacct_mq(bio);
 	else
 		__submit_bio_noacct(bio);
diff --git a/block/genhd.c b/block/genhd.c
index 02d9cfb9e077..07736c5db988 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -420,6 +420,10 @@  int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
 	 */
 	elevator_init_mq(disk->queue);
 
+	/* Mark bdev as having a submit_bio, if needed */
+	if (disk->fops->submit_bio)
+		disk->part0->bd_submit_bio = 1;
+
 	/*
 	 * If the driver provides an explicit major number it also must provide
 	 * the number of minors numbers supported, and those will be used to
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d68d6e951fad..c08e1c08b7ba 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -47,6 +47,7 @@  struct block_device {
 	bool			bd_read_only;	/* read-only policy */
 	u8			bd_partno;
 	bool			bd_write_holder;
+	bool			bd_submit_bio;
 	dev_t			bd_dev;
 	atomic_t		bd_openers;
 	spinlock_t		bd_size_lock; /* for bd_inode->i_size updates */