diff mbox series

[2/4] block: inline fast path of driver tag allocation

Message ID 20211013164937.985367-3-axboe@kernel.dk (mailing list archive)
State New, archived
Series Various block optimizations

Commit Message

Jens Axboe Oct. 13, 2021, 4:49 p.m. UTC
If we don't use an IO scheduler or have shared tags, then we don't need
to call into this external function at all. This saves ~2% for such
a setup.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c |  8 +++-----
 block/blk-mq.h | 15 ++++++++++++++-
 2 files changed, 17 insertions(+), 6 deletions(-)

Comments

Christoph Hellwig Oct. 13, 2021, 5:22 p.m. UTC | #1
On Wed, Oct 13, 2021 at 10:49:35AM -0600, Jens Axboe wrote:
> If we don't use an IO scheduler or have shared tags, then we don't need
> to call into this external function at all. This saves ~2% for such
> a setup.

Hmm.  What happens if you just throw an inline tag onto
blk_mq_get_driver_tag?  All the high performance callers should be
in blk-mq.c anyway.  If that isn't enough maybe something like the
version below?

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 38e6651d8b94c..ba9af26d5209d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1126,18 +1126,23 @@ static bool __blk_mq_get_driver_tag(struct request *rq)
 	return true;
 }
 
-bool blk_mq_get_driver_tag(struct request *rq)
+static void blk_mq_inc_active_requests(struct request *rq)
+{
+	if (!(rq->rq_flags & RQF_MQ_INFLIGHT)) {
+		rq->rq_flags |= RQF_MQ_INFLIGHT;
+		__blk_mq_inc_active_requests(rq->mq_hctx);
+	}
+}
+
+inline bool blk_mq_get_driver_tag(struct request *rq)
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
 	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_get_driver_tag(rq))
 		return false;
 
-	if ((hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) &&
-			!(rq->rq_flags & RQF_MQ_INFLIGHT)) {
-		rq->rq_flags |= RQF_MQ_INFLIGHT;
-		__blk_mq_inc_active_requests(hctx);
-	}
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		blk_mq_inc_active_requests(rq);
 	hctx->tags->rqs[rq->tag] = rq;
 	return true;
 }
Jens Axboe Oct. 13, 2021, 5:46 p.m. UTC | #2
On 10/13/21 11:22 AM, Christoph Hellwig wrote:
> On Wed, Oct 13, 2021 at 10:49:35AM -0600, Jens Axboe wrote:
>> If we don't use an IO scheduler or have shared tags, then we don't need
>> to call into this external function at all. This saves ~2% for such
>> a setup.
> 
> Hmm.  What happens if you just throw an inline tag onto
> blk_mq_get_driver_tag?

I'd be surprised if that's any different than my patch in terms of
performance, the fast path would be about the same. I don't feel
strongly about it, can do that instead.
Christoph Hellwig Oct. 13, 2021, 5:57 p.m. UTC | #3
On Wed, Oct 13, 2021 at 11:46:04AM -0600, Jens Axboe wrote:
> On 10/13/21 11:22 AM, Christoph Hellwig wrote:
> > On Wed, Oct 13, 2021 at 10:49:35AM -0600, Jens Axboe wrote:
> >> If we don't use an IO scheduler or have shared tags, then we don't need
> >> to call into this external function at all. This saves ~2% for such
> >> a setup.
> > 
> > Hmm.  What happens if you just throw an inline tag onto
> > blk_mq_get_driver_tag?
> 
> I'd be surprised if that's any different than my patch in terms of
> performance, the fast path would be about the same. I don't feel
> strongly about it, can do that instead.

I find the double indirection in your patch a bit confusing.  Not a big
deal if it is actually required, but if we can avoid that I'd prefer
not to add the extra indirection.
Jens Axboe Oct. 13, 2021, 6:07 p.m. UTC | #4
On 10/13/21 11:57 AM, Christoph Hellwig wrote:
> On Wed, Oct 13, 2021 at 11:46:04AM -0600, Jens Axboe wrote:
>> On 10/13/21 11:22 AM, Christoph Hellwig wrote:
>>> On Wed, Oct 13, 2021 at 10:49:35AM -0600, Jens Axboe wrote:
>>>> If we don't use an IO scheduler or have shared tags, then we don't need
>>>> to call into this external function at all. This saves ~2% for such
>>>> a setup.
>>>
>>> Hmm.  What happens if you just throw an inline tag onto
>>> blk_mq_get_driver_tag?
>>
>> I'd be surprised if that's any different than my patch in terms of
>> performance, the fast path would be about the same. I don't feel
>> strongly about it, can do that instead.
> 
> I find the double indirection in your patch a bit confusing.  Not a big
> deal if it is actually required, but if we can avoid that I'd prefer
> not to add the extra indirection.

Tested the variants, and the inline helper below does seem to be the best one...

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 46a91e5fabc5..fe3e926c20a9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1135,7 +1135,7 @@  static inline unsigned int queued_to_index(unsigned int queued)
 	return min(BLK_MQ_MAX_DISPATCH_ORDER - 1, ilog2(queued) + 1);
 }
 
-static bool __blk_mq_get_driver_tag(struct request *rq)
+static bool __blk_mq_alloc_driver_tag(struct request *rq)
 {
 	struct sbitmap_queue *bt = &rq->mq_hctx->tags->bitmap_tags;
 	unsigned int tag_offset = rq->mq_hctx->tags->nr_reserved_tags;
@@ -1159,11 +1159,9 @@  static bool __blk_mq_get_driver_tag(struct request *rq)
 	return true;
 }
 
-bool blk_mq_get_driver_tag(struct request *rq)
+bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq)
 {
-	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
-
-	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_get_driver_tag(rq))
+	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq))
 		return false;
 
 	if ((hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) &&
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 8be447995106..ceed0a001c76 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -264,7 +264,20 @@  static inline void blk_mq_put_driver_tag(struct request *rq)
 	__blk_mq_put_driver_tag(rq->mq_hctx, rq);
 }
 
-bool blk_mq_get_driver_tag(struct request *rq);
+bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq);
+
+static inline bool blk_mq_get_driver_tag(struct request *rq)
+{
+	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
+
+	if (rq->tag != BLK_MQ_NO_TAG &&
+	    !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
+		hctx->tags->rqs[rq->tag] = rq;
+		return true;
+	}
+
+	return __blk_mq_get_driver_tag(hctx, rq);
+}
 
 static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
 {