
[V2,7/8] nvme: use blk_mq_queue_tag_inflight_iter

Message ID 1553492318-1810-8-git-send-email-jianchao.w.wang@oracle.com (mailing list archive)
State New, archived
Series: blk-mq: use static_rqs to iterate busy tags

Commit Message

jianchao.wang March 25, 2019, 5:38 a.m. UTC
blk_mq_tagset_inflight_iter is not safe in that it could return stale
requests from tags->rqs[]. Use blk_mq_queue_tag_inflight_iter here
instead. A new helper interface, nvme_iterate_inflight_rqs, is
introduced to iterate all of the namespaces under a ctrl.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/nvme/host/core.c   | 12 ++++++++++++
 drivers/nvme/host/fc.c     | 10 +++++-----
 drivers/nvme/host/nvme.h   |  2 ++
 drivers/nvme/host/pci.c    |  5 +++--
 drivers/nvme/host/rdma.c   |  4 ++--
 drivers/nvme/host/tcp.c    |  5 +++--
 drivers/nvme/target/loop.c |  4 ++--
 7 files changed, 29 insertions(+), 13 deletions(-)

Comments

Keith Busch March 25, 2019, 1:49 p.m. UTC | #1
On Mon, Mar 25, 2019 at 01:38:37PM +0800, Jianchao Wang wrote:
> blk_mq_tagset_inflight_iter is not safe that it could get stale request
> in tags->rqs[]. Use blk_mq_queue_tag_inflight_iter here. A new helper
> interface nvme_iterate_inflight_rqs is introduced to iterate
> all of the ns under a ctrl.

Nak, NVMe only iterates tags when new requests can't enter, allocated
requests can't dispatch, and dispatched commands can't complete. So
it is perfectly safe to iterate if the driver takes reasonable steps
beforehand. Further, for M tags and N namespaces, we complete teardown
in O(M) time, but this makes it O(M*N) without gaining anything.
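
For reference, the per-namespace walk behind that O(M*N) figure is the helper the
patch adds in core.c (quoted below with an extra comment); blk_mq_tagset_busy_iter
walks the shared tagset once, i.e. O(M):

void nvme_iterate_inflight_rqs(struct nvme_ctrl *ctrl,
		busy_iter_fn *fn, void *data)
{
	struct nvme_ns *ns;

	/* one full tag-space walk per namespace queue: O(M) work, done N times */
	down_read(&ctrl->namespaces_rwsem);
	list_for_each_entry(ns, &ctrl->namespaces, list)
		blk_mq_queue_tag_inflight_iter(ns->queue, fn, data);
	up_read(&ctrl->namespaces_rwsem);
}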
jianchao.wang March 26, 2019, 1:17 a.m. UTC | #2
Hi Keith

On 3/25/19 9:49 PM, Keith Busch wrote:
> On Mon, Mar 25, 2019 at 01:38:37PM +0800, Jianchao Wang wrote:
>> blk_mq_tagset_inflight_iter is not safe that it could get stale request
>> in tags->rqs[]. Use blk_mq_queue_tag_inflight_iter here. A new helper
>> interface nvme_iterate_inflight_rqs is introduced to iterate
>> all of the ns under a ctrl.
> 
> Nak, NVMe only iterates tags when new requests can't enter, allocated
> requests can't dispatch, and dispatched commands can't complete. So
> it is perfectly safe to iterate if the driver takes reasonable steps
> beforehand.

nvme_dev_disable just quiesces and freezes the request_queues, it does not drain the queue enters.
So someone could still escape the queue-freeze check and try to allocate a
request.
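
A minimal sketch of why a started freeze alone does not keep submitters out,
paraphrased from block/blk-mq.c and drivers/nvme/host/core.c of that era (treat
the exact bodies as assumptions):

void blk_freeze_queue_start(struct request_queue *q)
{
	/*
	 * Marks the queue as freezing but returns immediately; it does not
	 * wait for q_usage_counter to drain, so a task that already passed
	 * blk_queue_enter() can still reach blk_mq_get_request().
	 */
	if (atomic_inc_return(&q->mq_freeze_depth) == 1) {
		percpu_ref_kill(&q->q_usage_counter);
		blk_mq_run_hw_queues(q, false);
	}
}

void nvme_start_freeze(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;

	down_read(&ctrl->namespaces_rwsem);
	list_for_each_entry(ns, &ctrl->namespaces, list)
		blk_freeze_queue_start(ns->queue);	/* no freeze_queue_wait here */
	up_read(&ctrl->namespaces_rwsem);
}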

> Further, for M tags and N namespaces, we complete teardown
> in O(M) time, but this makes in O(M*N) without gaining anything.
> 

Yes, it is indeed inefficient.

Thanks
Jianchao
Ming Lei March 26, 2019, 2:41 a.m. UTC | #3
On Tue, Mar 26, 2019 at 9:18 AM jianchao.wang
<jianchao.w.wang@oracle.com> wrote:
>
> Hi Keith
>
> On 3/25/19 9:49 PM, Keith Busch wrote:
> > On Mon, Mar 25, 2019 at 01:38:37PM +0800, Jianchao Wang wrote:
> >> blk_mq_tagset_inflight_iter is not safe that it could get stale request
> >> in tags->rqs[]. Use blk_mq_queue_tag_inflight_iter here. A new helper
> >> interface nvme_iterate_inflight_rqs is introduced to iterate
> >> all of the ns under a ctrl.
> >
> > Nak, NVMe only iterates tags when new requests can't enter, allocated
> > requests can't dispatch, and dispatched commands can't complete. So
> > it is perfectly safe to iterate if the driver takes reasonable steps
> > beforehand.
>
> nvme_dev_disable just quiesce and freeze the request_queue, but not drain the enters.
> So there still could be someone escapes the queue freeze checking and tries to allocate
> request.

rq->state is just IDLE for these allocated requests, so there
shouldn't be an issue in NVMe's case.
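
The state check being relied on here is roughly the following (simplified from
blk-mq of that era; the precise call sites in the iterator path are assumptions):

enum mq_rq_state {
	MQ_RQ_IDLE		= 0,
	MQ_RQ_IN_FLIGHT		= 1,
	MQ_RQ_COMPLETE		= 2,
};

static inline enum mq_rq_state blk_mq_rq_state(struct request *rq)
{
	return READ_ONCE(rq->state);
}

bool blk_mq_request_started(struct request *rq)
{
	/* a freshly allocated, not-yet-dispatched request is still IDLE,
	 * so the busy/inflight iterators skip it */
	return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
}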

Thanks,
Ming
jianchao.wang March 26, 2019, 3:05 a.m. UTC | #4
On 3/26/19 10:41 AM, Ming Lei wrote:
> On Tue, Mar 26, 2019 at 9:18 AM jianchao.wang
> <jianchao.w.wang@oracle.com> wrote:
>>
>> Hi Keith
>>
>> On 3/25/19 9:49 PM, Keith Busch wrote:
>>> On Mon, Mar 25, 2019 at 01:38:37PM +0800, Jianchao Wang wrote:
>>>> blk_mq_tagset_inflight_iter is not safe that it could get stale request
>>>> in tags->rqs[]. Use blk_mq_queue_tag_inflight_iter here. A new helper
>>>> interface nvme_iterate_inflight_rqs is introduced to iterate
>>>> all of the ns under a ctrl.
>>>
>>> Nak, NVMe only iterates tags when new requests can't enter, allocated
>>> requests can't dispatch, and dispatched commands can't complete. So
>>> it is perfectly safe to iterate if the driver takes reasonable steps
>>> beforehand.
>>
>> nvme_dev_disable just quiesce and freeze the request_queue, but not drain the enters.
>> So there still could be someone escapes the queue freeze checking and tries to allocate
>> request.
> 
> The rq->state is just IDLE for these allocated request, so there
> shouldn't be issue
> in NVMe's case.

What if there used to be an io scheduler that left some stale requests in its sched tags?
Or nr_hw_queues was decreased, leaving the hctx->fq->flush_rq behind?

The stale request could be something freed and reused by others, and its
state field could happen to be overwritten to a non-zero value...

Thanks
Jianchao
Keith Busch March 26, 2019, 11:57 p.m. UTC | #5
On Mon, Mar 25, 2019 at 08:05:53PM -0700, jianchao.wang wrote:
> What if there used to be a io scheduler and leave some stale requests of sched tags ?
> Or the nr_hw_queues was decreased and leave the hctx->fq->flush_rq ?

Requests internally queued in scheduler or block layer are not eligible
for the nvme driver's iterator callback. We only use it to reclaim
dispatched requests that the target can't return, which only applies to
requests that must have a valid rq->tag value from hctx->tags.
 
> The stable request could be some tings freed and used
> by others and the state field happen to be overwritten to non-zero...

I am not sure I follow what this means. At least for nvme, every queue
sharing the same tagset is quiesced and frozen, so there should be no
request state in flux at the time we iterate.
jianchao.wang March 27, 2019, 2:03 a.m. UTC | #6
Hi Keith

On 3/27/19 7:57 AM, Keith Busch wrote:
> On Mon, Mar 25, 2019 at 08:05:53PM -0700, jianchao.wang wrote:
>> What if there used to be a io scheduler and leave some stale requests of sched tags ?
>> Or the nr_hw_queues was decreased and leave the hctx->fq->flush_rq ?
> 
> Requests internally queued in scheduler or block layer are not eligible
> for the nvme driver's iterator callback. We only use it to reclaim
> dispatched requests that the target can't return, which only applies to
> requests that must have a valid rq->tag value from hctx->tags.
>  
>> The stable request could be some tings freed and used
>> by others and the state field happen to be overwritten to non-zero...
> 
> I am not sure I follow what this means. At least for nvme, every queue
> sharing the same tagset is quiesced and frozen, there should be no
> request state in flux at the time we iterate.
> 

In nvme_dev_disable, when we try to reclaim the in-flight requests with blk_mq_tagset_busy_iter,
the request_queues are quiesced but only start-frozen.
We only _drain_ the in-flight requests for the _shutdown_ case, when the controller is not dead.
For the reset case, someone could still escape the queue-freeze check, enter
blk_mq_make_request and try to allocate a tag, and then we may get:

generic_make_request        nvme_dev_disable
 -> blk_queue_enter              
                              -> nvme_start_freeze (just start freeze, no drain)
                              -> nvme_stop_queues
 -> blk_mq_make_request
  -> blk_mq_get_request       -> blk_mq_tagset_busy_iter
     -> blk_mq_get_tag
                                -> bt_tags_for_each
                                   -> bt_tags_iter
                                       -> rq = tags->rqs[] ---> [1]
     -> blk_mq_rq_ctx_init
       -> data->hctx->tags->rqs[rq->tag] = rq;

The rq found at position [1] could be a stale request that has been freed due to:
1. a hctx->fq.flush_rq of a dead request_queue that shares the same tagset
2. a removed io scheduler's sched request

This stale request may since have been reused by others, its request->state changed to a non-zero
value; it then passes the blk_mq_request_started check and gets handled by nvme_cancel_request.
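
The read at [1] happens in the tagset iterator, roughly as follows (a simplified
paraphrase of bt_tags_iter(); the exact code of that era is an assumption):

static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
{
	struct bt_tags_iter_data *iter_data = data;
	struct blk_mq_tags *tags = iter_data->tags;
	bool reserved = iter_data->reserved;
	struct request *rq;

	if (!reserved)
		bitnr += tags->nr_reserved_tags;

	/* [1]: this slot may still hold a pointer left behind by a freed
	 * flush_rq or by a removed scheduler's static request */
	rq = tags->rqs[bitnr];
	if (rq && blk_mq_request_started(rq))
		return iter_data->fn(rq, iter_data->data, reserved);

	return true;
}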

Thanks
Jianchao
Keith Busch March 27, 2019, 2:15 a.m. UTC | #7
On Wed, Mar 27, 2019 at 10:03:26AM +0800, jianchao.wang wrote:
> Hi Keith
> 
> On 3/27/19 7:57 AM, Keith Busch wrote:
> > On Mon, Mar 25, 2019 at 08:05:53PM -0700, jianchao.wang wrote:
> >> What if there used to be a io scheduler and leave some stale requests of sched tags ?
> >> Or the nr_hw_queues was decreased and leave the hctx->fq->flush_rq ?
> > 
> > Requests internally queued in scheduler or block layer are not eligible
> > for the nvme driver's iterator callback. We only use it to reclaim
> > dispatched requests that the target can't return, which only applies to
> > requests that must have a valid rq->tag value from hctx->tags.
> >  
> >> The stable request could be some tings freed and used
> >> by others and the state field happen to be overwritten to non-zero...
> > 
> > I am not sure I follow what this means. At least for nvme, every queue
> > sharing the same tagset is quiesced and frozen, there should be no
> > request state in flux at the time we iterate.
> > 
> 
> In nvme_dev_disable, when we try to reclaim the in-flight requests with blk_mq_tagset_busy_iter,
> the request_queues are quiesced but just start-freeze.
> We will try to _drain_ the in-flight requests for the _shutdown_ case when controller is not dead.
> For the reset case, there still could be someone escapes the checking of queue freezing and enters
> blk_mq_make_request and tries to allocate tag, then we may get,
> 
> generic_make_request        nvme_dev_disable
>  -> blk_queue_enter              
>                               -> nvme_start_freeze (just start freeze, no drain)
>                               -> nvme_stop_queues
>  -> blk_mq_make_request
>   - > blk_mq_get_request      -> blk_mq_tagset_busy_iter
>      -> blk_mq_get_tag
>                                 -> bt_tags_for_each
>                                    -> bt_tags_iter
>                                        -> rq = tags->rqs[] ---> [1]
>      -> blk_mq_rq_ctx_init
>        -> data->hctx->tags->rqs[rq->tag] = rq;
> 
> The rq got on position [1] could be a stale request that has been freed due to,
> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset
> 2. a removed io scheduler's sched request
> 
> And this stale request may have been used by others and the request->state is changed to a non-zero
> value and passes the checking of blk_mq_request_started and then it will be handled by nvme_cancel_request.

How is that request state going to be anything other than IDLE? A freed
request state is IDLE, and continues to be IDLE until dispatched. But
dispatch is blocked for the entire tagset, so request states can't be
started during an nvme reset.
jianchao.wang March 27, 2019, 2:27 a.m. UTC | #8
On 3/27/19 10:15 AM, Keith Busch wrote:
> On Wed, Mar 27, 2019 at 10:03:26AM +0800, jianchao.wang wrote:
>> Hi Keith
>>
>> On 3/27/19 7:57 AM, Keith Busch wrote:
>>> On Mon, Mar 25, 2019 at 08:05:53PM -0700, jianchao.wang wrote:
>>>> What if there used to be a io scheduler and leave some stale requests of sched tags ?
>>>> Or the nr_hw_queues was decreased and leave the hctx->fq->flush_rq ?
>>>
>>> Requests internally queued in scheduler or block layer are not eligible
>>> for the nvme driver's iterator callback. We only use it to reclaim
>>> dispatched requests that the target can't return, which only applies to
>>> requests that must have a valid rq->tag value from hctx->tags.
>>>  
>>>> The stable request could be some tings freed and used
>>>> by others and the state field happen to be overwritten to non-zero...
>>>
>>> I am not sure I follow what this means. At least for nvme, every queue
>>> sharing the same tagset is quiesced and frozen, there should be no
>>> request state in flux at the time we iterate.
>>>
>>
>> In nvme_dev_disable, when we try to reclaim the in-flight requests with blk_mq_tagset_busy_iter,
>> the request_queues are quiesced but just start-freeze.
>> We will try to _drain_ the in-flight requests for the _shutdown_ case when controller is not dead.
>> For the reset case, there still could be someone escapes the checking of queue freezing and enters
>> blk_mq_make_request and tries to allocate tag, then we may get,
>>
>> generic_make_request        nvme_dev_disable
>>  -> blk_queue_enter              
>>                               -> nvme_start_freeze (just start freeze, no drain)
>>                               -> nvme_stop_queues
>>  -> blk_mq_make_request
>>   - > blk_mq_get_request      -> blk_mq_tagset_busy_iter
>>      -> blk_mq_get_tag
>>                                 -> bt_tags_for_each
>>                                    -> bt_tags_iter
>>                                        -> rq = tags->rqs[] ---> [1]
>>      -> blk_mq_rq_ctx_init
>>        -> data->hctx->tags->rqs[rq->tag] = rq;
>>
>> The rq got on position [1] could be a stale request that has been freed due to,
>> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset
>> 2. a removed io scheduler's sched request
>>
>> And this stale request may have been used by others and the request->state is changed to a non-zero
>> value and passes the checking of blk_mq_request_started and then it will be handled by nvme_cancel_request.
> 
> How is that request state going to be anyting other than IDLE? A freed
> request state is IDLE, and continues to be IDLE until dispatched. But
> dispatch is blocked for the entire tagset, so request states can't be
> started during an nvme reset.
> 

As in the comment above, the stale request may be something that has been freed in one of the following cases:
1. a hctx->fq.flush_rq of a dead request_queue that shares the same tagset
2. a removed io scheduler's sched request
This freed request could then be reallocated by others, which may change the request->state field.

Thanks
Jianchao
Keith Busch March 27, 2019, 2:33 a.m. UTC | #9
On Wed, Mar 27, 2019 at 10:27:57AM +0800, jianchao.wang wrote:
> As the comment above, the stable request maybe something that has been freed due to following case,
> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset
> 2. a removed io scheduler's sched request
> and this freed request could be allocated by others which may change the field of request->state.

You're not explaining how that request->state is changed. I understand the
request can be reallocated, but what is changing its state?
jianchao.wang March 27, 2019, 2:45 a.m. UTC | #10
Hi Keith

On 3/27/19 10:33 AM, Keith Busch wrote:
> On Wed, Mar 27, 2019 at 10:27:57AM +0800, jianchao.wang wrote:
>> As the comment above, the stable request maybe something that has been freed due to following case,
>> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset
>> 2. a removed io scheduler's sched request
>> and this freed request could be allocated by others which may change the field of request->state.
> 
> You're not explaing how that request->state is changed. I understand the
> request can be reallocated, but what is changing its state?
> 

Sorry for my bad description, which led to the misunderstanding.
The _free_ below means:

1. a hctx->fq.flush_rq of a dead request_queue that shares the same tagset
   The whole request_queue is cleaned up and freed, so the hctx->fq.flush_rq is freed back to a slab.

2. a removed io scheduler's sched request
   The io scheduler is detached and all of its structures are freed, including the pages where the sched
   requests are located.

So the pointers in tags->rqs[] may point to memory that is no longer used as a block layer request.
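
In code terms, the two teardown paths meant here are roughly the following
(simplified from block/blk-flush.c and the blk-mq scheduler teardown of that era;
the exact call chains are assumptions). Note that neither path clears the
corresponding tags->rqs[] slot.

/* 1. request_queue teardown: the per-hctx flush request is kfree'd */
void blk_free_flush_queue(struct blk_flush_queue *fq)
{
	if (!fq)
		return;
	kfree(fq->flush_rq);
	kfree(fq);
}

/* 2. elevator teardown: the pages backing the sched static_rqs are freed */
void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
		     unsigned int hctx_idx)
{
	struct page *page;

	/* per-request exit callbacks elided */
	while (!list_empty(&tags->page_list)) {
		page = list_first_entry(&tags->page_list, struct page, lru);
		list_del_init(&page->lru);
		__free_pages(page, page->private);
	}
}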


Thanks
Jianchao
Keith Busch March 27, 2019, 6:51 a.m. UTC | #11
On Wed, Mar 27, 2019 at 10:45:33AM +0800, jianchao.wang wrote:
> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset
>    The whole request_queue is cleaned up and freed, so the hctx->fq.flush is freed back to a slab
>
> 2. a removed io scheduler's sched request
>    The io scheduled is detached and all of the structures are freed, including the pages where sched
>    requests locates.
> 
> So the pointers in tags->rqs[] may point to memory that is not used as a blk layer request.

Oh, free as in kfree'd, not blk_mq_free_request. So it's a read-after-
free that you're concerned about, not that anyone explicitly changed a
request->state.

We at least can't free the flush_queue until the queue is frozen. If the
queue is frozen, we've completed the special fq->flush_rq, whose end_io
restores tags->rqs[tag] to the fq->orig_rq from the static_rqs,
so nvme's iterator couldn't see the fq->flush_rq address if it's invalid.
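
The restore described above is roughly this fragment of flush_end_io()
(paraphrased from block/blk-flush.c of that era; details are assumptions), and it
only applies to the !q->elevator branch:

	if (!q->elevator) {
		/* the flush_rq borrowed the original request's driver tag;
		 * put the original request back into tags->rqs[] so an
		 * iterator never sees a stale flush_rq pointer there */
		blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);
		flush_rq->tag = -1;
	} else {
		/* with an io scheduler the flush_rq holds its own driver tag,
		 * so there is no original request to restore */
		blk_mq_put_driver_tag_hctx(hctx, flush_rq);
		flush_rq->internal_tag = -1;
	}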

The sched_tags concern, though, appears theoretically possible.
jianchao.wang March 27, 2019, 7:18 a.m. UTC | #12
Hi Keith

On 3/27/19 2:51 PM, Keith Busch wrote:
> On Wed, Mar 27, 2019 at 10:45:33AM +0800, jianchao.wang wrote:
>> 1. a hctx->fq.flush_rq of dead request_queue that shares the same tagset
>>    The whole request_queue is cleaned up and freed, so the hctx->fq.flush is freed back to a slab
>>
>> 2. a removed io scheduler's sched request
>>    The io scheduled is detached and all of the structures are freed, including the pages where sched
>>    requests locates.
>>
>> So the pointers in tags->rqs[] may point to memory that is not used as a blk layer request.
> 
> Oh, free as in kfree'd, not blk_mq_free_request. So it's a read-after-
> free that you're concerned about, not that anyone explicitly changed a
> request->state.

Yes ;)

> 
> We at least can't free the flush_queue until the queue is frozen. If the
> queue is frozen, we've completed the special fq->flush_rq where its end_io
> replaces tags->rqs[tag] back to the fq->orig_rq from the static_rqs,
> so nvme's iterator couldn't see the fq->flush_rq address if it's invalid.
> 

This is true for the non-io-scheduler case, in which the flush_rq steals the driver tag.
But for the io-scheduler case, the flush_rq acquires a driver tag itself.


> The sched_tags concern, though, appears theoretically possible.
> 

Thanks
Jianchao

Patch

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4706019..d6c53fe 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3874,6 +3874,18 @@  void nvme_start_queues(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_start_queues);
 
+void nvme_iterate_inflight_rqs(struct nvme_ctrl *ctrl,
+		busy_iter_fn *fn, void *data)
+{
+	struct nvme_ns *ns;
+
+	down_read(&ctrl->namespaces_rwsem);
+	list_for_each_entry(ns, &ctrl->namespaces, list)
+		blk_mq_queue_tag_inflight_iter(ns->queue, fn, data);
+	up_read(&ctrl->namespaces_rwsem);
+}
+EXPORT_SYMBOL_GPL(nvme_iterate_inflight_rqs);
+
 int __init nvme_core_init(void)
 {
 	int result = -ENOMEM;
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index f3b9d91..667da72 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2367,7 +2367,7 @@  nvme_fc_complete_rq(struct request *rq)
 /*
  * This routine is used by the transport when it needs to find active
  * io on a queue that is to be terminated. The transport uses
- * blk_mq_tagset_busy_itr() to find the busy requests, which then invoke
+ * blk_mq_queue_tag_inflight_iter() to find the busy requests, which then invoke
  * this routine to kill them on a 1 by 1 basis.
  *
  * As FC allocates FC exchange for each io, the transport must contact
@@ -2740,7 +2740,7 @@  nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 	 * If io queues are present, stop them and terminate all outstanding
 	 * ios on them. As FC allocates FC exchange for each io, the
 	 * transport must contact the LLDD to terminate the exchange,
-	 * thus releasing the FC exchange. We use blk_mq_tagset_busy_itr()
+	 * thus releasing the FC exchange. We use blk_mq_queue_tag_inflight_iter
 	 * to tell us what io's are busy and invoke a transport routine
 	 * to kill them with the LLDD.  After terminating the exchange
 	 * the LLDD will call the transport's normal io done path, but it
@@ -2750,7 +2750,7 @@  nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 	 */
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
-		blk_mq_tagset_busy_iter(&ctrl->tag_set,
+		nvme_iterate_inflight_rqs(&ctrl->ctrl,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 	}
 
@@ -2768,11 +2768,11 @@  nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 
 	/*
 	 * clean up the admin queue. Same thing as above.
-	 * use blk_mq_tagset_busy_itr() and the transport routine to
+	 * use blk_mq_queue_tag_inflight_iter() and the transport routine to
 	 * terminate the exchanges.
 	 */
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
-	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
+	blk_mq_queue_tag_inflight_iter(ctrl->ctrl.admin_q,
 				nvme_fc_terminate_exchange, &ctrl->ctrl);
 
 	/* kill the aens as they are a separate path */
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 527d645..4c6bc803 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -445,6 +445,8 @@  void nvme_unfreeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze_timeout(struct nvme_ctrl *ctrl, long timeout);
 void nvme_start_freeze(struct nvme_ctrl *ctrl);
+void nvme_iterate_inflight_rqs(struct nvme_ctrl *ctrl,
+		busy_iter_fn *fn, void *data);
 
 #define NVME_QID_ANY -1
 struct request *nvme_alloc_request(struct request_queue *q,
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index a90cf5d..96faa36 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2430,8 +2430,9 @@  static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	nvme_suspend_queue(&dev->queues[0]);
 	nvme_pci_disable(dev);
 
-	blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
-	blk_mq_tagset_busy_iter(&dev->admin_tagset, nvme_cancel_request, &dev->ctrl);
+	nvme_iterate_inflight_rqs(&dev->ctrl, nvme_cancel_request, &dev->ctrl);
+	blk_mq_queue_tag_inflight_iter(dev->ctrl.admin_q,
+			nvme_cancel_request, &dev->ctrl);
 
 	/*
 	 * The driver will not be starting up queues again if shutting down so
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 11a5eca..5660200 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -914,7 +914,7 @@  static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl,
 {
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
 	nvme_rdma_stop_queue(&ctrl->queues[0]);
-	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, nvme_cancel_request,
+	blk_mq_queue_tag_inflight_iter(ctrl->ctrl.admin_q, nvme_cancel_request,
 			&ctrl->ctrl);
 	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
 	nvme_rdma_destroy_admin_queue(ctrl, remove);
@@ -926,7 +926,7 @@  static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
 		nvme_rdma_stop_io_queues(ctrl);
-		blk_mq_tagset_busy_iter(&ctrl->tag_set, nvme_cancel_request,
+		nvme_iterate_inflight_rqs(&ctrl->ctrl, nvme_cancel_request,
 				&ctrl->ctrl);
 		if (remove)
 			nvme_start_queues(&ctrl->ctrl);
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index e7e0888..4c825dc 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1710,7 +1710,8 @@  static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl,
 {
 	blk_mq_quiesce_queue(ctrl->admin_q);
 	nvme_tcp_stop_queue(ctrl, 0);
-	blk_mq_tagset_busy_iter(ctrl->admin_tagset, nvme_cancel_request, ctrl);
+	blk_mq_queue_tag_inflight_iter(ctrl->admin_q,
+			nvme_cancel_request, ctrl);
 	blk_mq_unquiesce_queue(ctrl->admin_q);
 	nvme_tcp_destroy_admin_queue(ctrl, remove);
 }
@@ -1722,7 +1723,7 @@  static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 		return;
 	nvme_stop_queues(ctrl);
 	nvme_tcp_stop_io_queues(ctrl);
-	blk_mq_tagset_busy_iter(ctrl->tagset, nvme_cancel_request, ctrl);
+	nvme_iterate_inflight_rqs(ctrl, nvme_cancel_request, ctrl);
 	if (remove)
 		nvme_start_queues(ctrl);
 	nvme_tcp_destroy_io_queues(ctrl, remove);
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index b9f623a..50d7288 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -421,7 +421,7 @@  static void nvme_loop_shutdown_ctrl(struct nvme_loop_ctrl *ctrl)
 {
 	if (ctrl->ctrl.queue_count > 1) {
 		nvme_stop_queues(&ctrl->ctrl);
-		blk_mq_tagset_busy_iter(&ctrl->tag_set,
+		nvme_iterate_inflight_rqs(&ctrl->ctrl,
 					nvme_cancel_request, &ctrl->ctrl);
 		nvme_loop_destroy_io_queues(ctrl);
 	}
@@ -430,7 +430,7 @@  static void nvme_loop_shutdown_ctrl(struct nvme_loop_ctrl *ctrl)
 		nvme_shutdown_ctrl(&ctrl->ctrl);
 
 	blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
-	blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
+	blk_mq_queue_tag_inflight_iter(ctrl->ctrl.admin_q,
 				nvme_cancel_request, &ctrl->ctrl);
 	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
 	nvme_loop_destroy_admin_queue(ctrl);