diff mbox series

[3/3] blk-mq: fix wait condition for tagset wait completed check

Message ID 20250128-nvme-misc-fixes-v1-3-40c586581171@kernel.org (mailing list archive)
State New
Headers show
Series misc nvme related fixes | expand

Commit Message

Daniel Wagner Jan. 28, 2025, 4:34 p.m. UTC
blk_mq_tagset_count_completed_reqs returns the number of completed
requests. The only user of this function is
blk_mq_tagset_wait_completed_request which wants to know how many
request are not yet completed. Thus return the number of in flight
requests and terminate the wait loop when there is no inflight request.

Fixes: f9934a80f91d ("blk-mq: introduce blk_mq_tagset_wait_completed_request()")
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
 block/blk-mq-tag.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Christoph Hellwig Jan. 29, 2025, 6:07 a.m. UTC | #1
On Tue, Jan 28, 2025 at 05:34:48PM +0100, Daniel Wagner wrote:
> blk_mq_tagset_count_completed_reqs returns the number of completed
> requests. The only user of this function is
> blk_mq_tagset_wait_completed_request which wants to know how many
> request are not yet completed. Thus return the number of in flight
> requests and terminate the wait loop when there is no inflight request.

This looks sensible, but you should probably send it directly to
Jens instead of tagging it onto a nvme series.

The code could also really use a comment on what we're counting and why.
Nilay Shroff Jan. 29, 2025, 9:54 a.m. UTC | #2
On 1/28/25 10:04 PM, Daniel Wagner wrote:
> blk_mq_tagset_count_completed_reqs returns the number of completed
> requests. The only user of this function is
> blk_mq_tagset_wait_completed_request which wants to know how many
> request are not yet completed. Thus return the number of in flight
> requests and terminate the wait loop when there is no inflight request.
> 
> Fixes: f9934a80f91d ("blk-mq: introduce blk_mq_tagset_wait_completed_request()")
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> ---
>  block/blk-mq-tag.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index b9f417d980b46d54b74dec8adcb5b04e6a78635c..3ce46afb65e3c3de9f11ca440bf0f335f21d0998 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -450,11 +450,11 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
>  }
>  EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
>  
> -static bool blk_mq_tagset_count_completed_rqs(struct request *rq, void *data)
> +static bool blk_mq_tagset_count_inflight_rqs(struct request *rq, void *data)
>  {
>  	unsigned *count = data;
>  
> -	if (blk_mq_request_completed(rq))
> +	if (blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
>  		(*count)++;
>  	return true;
>  }
> @@ -472,7 +472,7 @@ void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset)
>  		unsigned count = 0;
>  
>  		blk_mq_tagset_busy_iter(tagset,
> -				blk_mq_tagset_count_completed_rqs, &count);
> +				blk_mq_tagset_count_inflight_rqs, &count);
>  		if (!count)
>  			break;
>  		msleep(5);
> 
I see that blk_mq_tagset_wait_completed_request() is called from nvme_cancel_tagset()
and nvme_cancel_admin_tagset(). And it seems to me that the intention here's to wait 
until each completed requests are freed (or change its state to MQ_RQ_IDLE). 

Looking at code, the nvme_cancel_xxx() first invokes blk_mq_tagset_busy_iter() which 
iterates through tagset and cancels each in-flight request and marks the request state
to MQ_RQ_COMPLETE. Next in blk_mq_tagset_wait_completed_request(), we wait for each
completed request state changed to anything but MQ_RQ_COMPLETE. The next state of the 
request would be naturally MQ_RQ_IDLE once that request is freed. So in blk_mq_tagset_
wait_completed_request(), essentially we wait for request state change from MQ_RQ_COMPLETE
to MQ_RQ_IDLE.

So now if the proposal is that blk_mq_tagset_wait_completed_request() has to wait only 
if there's any in-flight request then it seems this function would never need to wait 
and looks redundant because req->state would never be MQ_RQ_IN_FLIGHT as those would 
have been already changed to MQ_RQ_COMPLETE when nvme_cancel_xxx() invokes blk_mq_tagset_
busy_iter(ctrl->tagset, nvme_cancel_request, ctrl).

Having said that, I am not sure what was the real intention here, in nvme_cancel_xxx(), 
do we really need to wait only until in-flight requests are completed or synchronize with 
request's completion callback (i.e. wait until all completed requests are freed)? 

Thanks,
--Nilay
diff mbox series

Patch

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index b9f417d980b46d54b74dec8adcb5b04e6a78635c..3ce46afb65e3c3de9f11ca440bf0f335f21d0998 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -450,11 +450,11 @@  void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 
-static bool blk_mq_tagset_count_completed_rqs(struct request *rq, void *data)
+static bool blk_mq_tagset_count_inflight_rqs(struct request *rq, void *data)
 {
 	unsigned *count = data;
 
-	if (blk_mq_request_completed(rq))
+	if (blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
 		(*count)++;
 	return true;
 }
@@ -472,7 +472,7 @@  void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset)
 		unsigned count = 0;
 
 		blk_mq_tagset_busy_iter(tagset,
-				blk_mq_tagset_count_completed_rqs, &count);
+				blk_mq_tagset_count_inflight_rqs, &count);
 		if (!count)
 			break;
 		msleep(5);