Message ID | 20170407181654.27836-7-bart.vanassche@sandisk.com (mailing list archive)
---|---
State | New, archived |
On Fri, Apr 07 2017 at 2:16pm -0400, Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> While running the srp-test software I noticed that request
> processing stalls sporadically at the beginning of a test, namely
> when mkfs is run against a dm-mpath device. Every time that
> happened the following command was sufficient to resume request
> processing:
>
> echo run >/sys/kernel/debug/block/dm-0/state
>
> This patch avoids such request processing stalls. The
> test I ran is as follows:
>
> while srp-test/run_tests -d -r 30 -t 02-mq; do :; done
>
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: dm-devel@redhat.com
> ---
>  drivers/md/dm-rq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 6886bf160fb2..d19af1d21f4c 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -755,6 +755,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
>  		/* Undo dm_start_request() before requeuing */
>  		rq_end_stats(md, rq);
>  		rq_completed(md, rq_data_dir(rq), false);
> +		blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
>  		return BLK_MQ_RQ_QUEUE_BUSY;
>  	}
>
> --
> 2.12.0

I really appreciate your hard work Bart, but this looks like a cheap hack. I'm clearly too late to stop this from going in (given Jens got it merged for -rc6), but: this has no place in dm-mq (or any blk-mq driver). If it is needed it should be elevated to blk-mq core to trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is returned from blk_mq_ops' .queue_rq.

If this dm-mq specific commit is justified, the case certainly isn't spelled out in the commit header.
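For concreteness, a minimal sketch of what elevating this into the blk-mq core could look like (hypothetical code, not an actual upstream change; rq, list, bd and hctx stand for the usual dispatch-loop locals, and the 100 ms value is simply carried over from the dm-rq patch):

	/*
	 * Hypothetical sketch: handle a BUSY return once, in the core
	 * dispatch loop, instead of in every driver's .queue_rq().
	 */
	ret = q->mq_ops->queue_rq(hctx, &bd);
	if (ret == BLK_MQ_RQ_QUEUE_BUSY) {
		/* put the request back on the local dispatch list */
		list_add(&rq->queuelist, list);
		/*
		 * Rerun the queue after a grace period so that forward
		 * progress no longer depends on a completion arriving
		 * to restart the queue.
		 */
		blk_mq_delay_run_hw_queue(hctx, 100 /* ms */);
	}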
On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote:
> This has no place in dm-mq (or any blk-mq
> driver). If it is needed it should be elevated to blk-mq core to
> trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is
> returned from blk_mq_ops' .queue_rq.

Hello Mike,

If the blk-mq core had to figure out whether or not a queue is no longer busy without any cooperation from the blk-mq driver, all it could do is attempt to rerun that queue from time to time. But at which intervals should the blk-mq core check whether or not a queue is still busy? Would it be possible to choose intervals that work well for all block drivers? Consider e.g. the dm-mpath driver: multipath_busy() returns true as long as path initialization is in progress, and path initialization can take a long time. The (indirect) call to blk_mq_run_queue() from pg_init_done() resumes request processing immediately after path initialization has finished. Sorry, but I don't think it is possible to invent an algorithm for the blk-mq core that guarantees both that a queue is rerun as soon as it is no longer busy and that no CPU cycles are wasted on checking over and over whether a queue is still busy.

Bart.
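The driver-side alternative Bart describes boils down to the following sketch (hypothetical function name; the real logic lives in drivers/md/dm-mpath.c): the driver knows exactly when its busy condition clears and reruns the queue itself at that moment.

	#include <linux/blk-mq.h>

	/*
	 * Sketch only: rerun the queue from the driver as soon as the
	 * driver-private busy condition (here: path initialization)
	 * has been resolved, with no polling by the block layer core.
	 */
	static void example_pg_init_done(struct request_queue *q)
	{
		/* path initialization finished; resume dispatching */
		blk_mq_run_hw_queues(q, true /* async */);
	}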
On Tue, Apr 11 2017 at 12:26pm -0400, Bart Van Assche <Bart.VanAssche@sandisk.com> wrote:

> On Tue, 2017-04-11 at 12:09 -0400, Mike Snitzer wrote:
> > This has no place in dm-mq (or any blk-mq
> > driver). If it is needed it should be elevated to blk-mq core to
> > trigger blk_mq_delay_run_hw_queue() when BLK_MQ_RQ_QUEUE_BUSY is
> > returned from blk_mq_ops' .queue_rq.
>
> Hello Mike,
>
> If the blk-mq core had to figure out whether or not a queue is no
> longer busy without any cooperation from the blk-mq driver, all it
> could do is attempt to rerun that queue from time to time. But at
> which intervals should the blk-mq core check whether or not a queue
> is still busy? Would it be possible to choose intervals that work
> well for all block drivers? Consider e.g. the dm-mpath driver:
> multipath_busy() returns true as long as path initialization is in
> progress, and path initialization can take a long time. The
> (indirect) call to blk_mq_run_queue() from pg_init_done() resumes
> request processing immediately after path initialization has
> finished. Sorry, but I don't think it is possible to invent an
> algorithm for the blk-mq core that guarantees both that a queue is
> rerun as soon as it is no longer busy and that no CPU cycles are
> wasted on checking over and over whether a queue is still busy.

Sorry, but that isn't a very strong argument for what you've done. I do appreciate your point that the two BLK_MQ_RQ_QUEUE_BUSY returns in dm_mq_queue_rq() are not equal, but that could easily be conveyed using a new return value.

Anyway, the point is that no blk-mq driver should need to worry about whether its request will get resubmitted (and the associated hw queue rerun) if it returns BLK_MQ_RQ_QUEUE_BUSY. Your change is a means to an end, but it solves the problem in a very hackish way. Other drivers will very likely be caught out by this blk-mq quirk in the future.
On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote:
> Other drivers will very likely be caught out by
> this blk-mq quirk in the future.

Hello Mike,

Are you aware that the requirement that blk-mq drivers rerun the queue after returning BLK_MQ_RQ_QUEUE_BUSY is shared with traditional block drivers? From dm_old_request_fn():

	if (... || (ti->type->busy && ti->type->busy(ti))) {
		blk_delay_queue(q, 10);
		return;
	}

Bart.
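Expanded into a self-contained sketch of a (hypothetical) legacy driver, the pattern Bart quotes looks roughly like this; example_busy() and the dispatch step are placeholders:

	#include <linux/blkdev.h>

	static bool example_busy(void);	/* hypothetical busy predicate */

	/*
	 * Legacy .request_fn busy pattern: when no progress can be made,
	 * ask the block layer to rerun the queue after a delay, exactly
	 * as dm_old_request_fn() does.
	 */
	static void example_request_fn(struct request_queue *q)
	{
		struct request *rq;

		while ((rq = blk_peek_request(q)) != NULL) {
			if (example_busy()) {
				blk_delay_queue(q, 10);	/* rerun in 10 ms */
				return;
			}
			blk_start_request(rq);
			/* ... dispatch rq to the hardware ... */
		}
	}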
On Tue, Apr 11 2017 at 1:51pm -0400, Bart Van Assche <Bart.VanAssche@sandisk.com> wrote:

> On Tue, 2017-04-11 at 13:47 -0400, Mike Snitzer wrote:
> > Other drivers will very likely be caught out by
> > this blk-mq quirk in the future.
>
> Hello Mike,
>
> Are you aware that the requirement that blk-mq drivers rerun the
> queue after returning BLK_MQ_RQ_QUEUE_BUSY is shared with
> traditional block drivers? From dm_old_request_fn():
>
> 	if (... || (ti->type->busy && ti->type->busy(ti))) {
> 		blk_delay_queue(q, 10);
> 		return;
> 	}

No, and pointing to DM code that does something similar in the old .request_fn case to justify why blk-mq requires the same is pretty specious. Rather than working so hard to use DM code against me, your argument should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well established pattern"

I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that is only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns tree-wide.

Could be there are some others, but hardly a well-established pattern.
On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> Rather than working so hard to use DM code against me, your argument
> should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> established pattern"
>
> I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that is
> only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> tree-wide.
>
> Could be there are some others, but hardly a well-established pattern.

Hello Mike,

Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue()) before returning "busy" and restart the queue after the busy condition has been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and xen-blkfront. However, this approach is appropriate neither for the dm-mq core nor for the scsi core since both drivers already use the "stopped" state for a purpose other than tracking whether or not a hardware queue is busy. Hence the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls in these last two drivers to rerun a hardware queue after the busy state has been cleared.

Bart.
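For reference, the stop/start pattern Bart describes can be sketched as follows (hypothetical driver; device_is_full() is a placeholder). Note that it only works if the driver owns the "stopped" state exclusively, which is exactly what dm-mq and the scsi core do not:

	#include <linux/blk-mq.h>

	static bool device_is_full(void);	/* hypothetical */

	/* Sketch of the virtio_blk/xen-blkfront-style busy handling. */
	static int example_queue_rq(struct blk_mq_hw_ctx *hctx,
				    const struct blk_mq_queue_data *bd)
	{
		if (device_is_full()) {
			blk_mq_stop_hw_queue(hctx);	/* park the queue */
			return BLK_MQ_RQ_QUEUE_BUSY;
		}
		/* ... submit bd->rq to the hardware ... */
		return BLK_MQ_RQ_QUEUE_OK;
	}

	/* Completion path, once resources have been freed again: */
	static void example_complete(struct request_queue *q)
	{
		blk_mq_start_stopped_hw_queues(q, true /* async */);
	}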
On Tue, Apr 11, 2017 at 06:18:36PM +0000, Bart Van Assche wrote:
> On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > Rather than working so hard to use DM code against me, your argument
> > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > established pattern"
> >
> > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that is
> > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > tree-wide.
> >
> > Could be there are some others, but hardly a well-established pattern.
>
> Hello Mike,
>
> Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> before returning "busy" and restart the queue after the busy condition has
> been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
> xen-blkfront. However, this approach is appropriate neither for the dm-mq
> core nor for the scsi core since both drivers already use the "stopped"
> state for a purpose other than tracking whether or not a hardware queue is
> busy. Hence the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls
> in these last two drivers to rerun a hardware queue after the busy state
> has been cleared.

But it looks like this patch just reruns the hw queue after 100 ms, which isn't necessarily after the busy state has been cleared, right?

Actually, if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq will buffer this request into hctx->dispatch and run the hw queue again, so my first impression is that blk_mq_delay_run_hw_queue() shouldn't be needed in this situation. Or maybe Bart has more background on this usage; if so, it would be better to document it in a comment?

Thanks,
Ming
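For context, the buffering Ming refers to is the BUSY handling in the 4.11-era blk_mq_dispatch_rq_list(), which looks roughly like this (trimmed and simplified, not a verbatim quote):

	/* inside the dispatch loop, after ->queue_rq() has returned: */
	if (ret == BLK_MQ_RQ_QUEUE_BUSY) {
		/* keep the request on the local list and stop dispatching */
		list_add(&rq->queuelist, list);
		__blk_mq_requeue_request(rq);
		break;
	}

	/* after the loop: park any leftovers on hctx->dispatch */
	if (!list_empty(list)) {
		spin_lock(&hctx->lock);
		list_splice_init(list, &hctx->dispatch);
		spin_unlock(&hctx->lock);
		/* ... conditional rerun, quoted by Bart below ... */
	}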
On Wed, 2017-04-12 at 11:42 +0800, Ming Lei wrote:
> On Tue, Apr 11, 2017 at 06:18:36PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > > Rather than working so hard to use DM code against me, your argument
> > > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > > established pattern"
> > >
> > > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that is
> > > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > > tree-wide.
> > >
> > > Could be there are some others, but hardly a well-established pattern.
> >
> > Hello Mike,
> >
> > Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> > .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> > before returning "busy" and restart the queue after the busy condition has
> > been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
> > xen-blkfront. However, this approach is appropriate neither for the dm-mq
> > core nor for the scsi core since both drivers already use the "stopped"
> > state for a purpose other than tracking whether or not a hardware queue is
> > busy. Hence the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls
> > in these last two drivers to rerun a hardware queue after the busy state
> > has been cleared.
>
> But it looks like this patch just reruns the hw queue after 100 ms, which
> isn't necessarily after the busy state has been cleared, right?

Hello Ming,

That patch can be considered a first step that can be refined further, namely by modifying the dm-rq code such that dm-rq queues are only rerun after the busy condition has been cleared. The patch at the start of this thread is easier to review and easier to test than any patch that would only rerun dm-rq queues after the busy condition has been cleared.

> Actually, if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq
> will buffer this request into hctx->dispatch and run the hw queue again,
> so my first impression is that blk_mq_delay_run_hw_queue() shouldn't be
> needed in this situation.

If the blk-mq core would always rerun a hardware queue when a block driver returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core to be busy with polling a hardware queue until the "busy" condition has been cleared. One can easily see that that's not what the blk-mq core does. From blk_mq_sched_dispatch_requests():

	if (!list_empty(&rq_list)) {
		blk_mq_sched_mark_restart_hctx(hctx);
		did_work = blk_mq_dispatch_rq_list(q, &rq_list);
	}

From the end of blk_mq_dispatch_rq_list():

	if (!list_empty(list)) {
		[ ... ]
		if (!blk_mq_sched_needs_restart(hctx) &&
		    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
			blk_mq_run_hw_queue(hctx, true);
	}

In other words, the BLK_MQ_S_SCHED_RESTART flag is set before the dispatch list is examined, and the dispatch list is only rerun after a block driver returned BLK_MQ_RQ_QUEUE_BUSY if that flag got cleared by a concurrent blk_mq_sched_restart_hctx() call while blk_mq_dispatch_rq_list() was in progress.

Bart.
On Wed, Apr 12, 2017 at 06:38:07PM +0000, Bart Van Assche wrote:
> On Wed, 2017-04-12 at 11:42 +0800, Ming Lei wrote:
> > On Tue, Apr 11, 2017 at 06:18:36PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > > > Rather than working so hard to use DM code against me, your argument
> > > > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > > > established pattern"
> > > >
> > > > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does. But that is
> > > > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > > > tree-wide.
> > > >
> > > > Could be there are some others, but hardly a well-established pattern.
> > >
> > > Hello Mike,
> > >
> > > Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> > > .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> > > before returning "busy" and restart the queue after the busy condition has
> > > been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
> > > xen-blkfront. However, this approach is appropriate neither for the dm-mq
> > > core nor for the scsi core since both drivers already use the "stopped"
> > > state for a purpose other than tracking whether or not a hardware queue is
> > > busy. Hence the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls
> > > in these last two drivers to rerun a hardware queue after the busy state
> > > has been cleared.
> >
> > But it looks like this patch just reruns the hw queue after 100 ms, which
> > isn't necessarily after the busy state has been cleared, right?
>
> Hello Ming,
>
> That patch can be considered a first step that can be refined further, namely
> by modifying the dm-rq code such that dm-rq queues are only rerun after
> the busy condition has been cleared. The patch at the start of this thread is
> easier to review and easier to test than any patch that would only rerun dm-rq
> queues after the busy condition has been cleared.

OK, got it. It would have been better to add a comment about this change, since rerunning the queue after 100 ms is actually a workaround instead of a final solution.

> > Actually, if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq
> > will buffer this request into hctx->dispatch and run the hw queue again,
> > so my first impression is that blk_mq_delay_run_hw_queue() shouldn't be
> > needed in this situation.
>
> If the blk-mq core would always rerun a hardware queue when a block driver
> returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core

It won't cause 100% CPU utilization since we restart the queue in the completion path, and at that time at least one tag is available, so progress can be made.

> to be busy with polling a hardware queue until the "busy" condition has been
> cleared. One can easily see that that's not what the blk-mq core does. From
> blk_mq_sched_dispatch_requests():
>
> 	if (!list_empty(&rq_list)) {
> 		blk_mq_sched_mark_restart_hctx(hctx);
> 		did_work = blk_mq_dispatch_rq_list(q, &rq_list);
> 	}
>
> From the end of blk_mq_dispatch_rq_list():
>
> 	if (!list_empty(list)) {
> 		[ ... ]
> 		if (!blk_mq_sched_needs_restart(hctx) &&
> 		    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> 			blk_mq_run_hw_queue(hctx, true);
> 	}

That is exactly what I meant: blk-mq already provides this mechanism to rerun the queue automatically in case of BLK_MQ_RQ_QUEUE_BUSY. If the mechanism doesn't work well, we need to fix it in the core, so why burden drivers with working around it?
>
> In other words, the BLK_MQ_S_SCHED_RESTART flag is set before the dispatch
> list is examined, and the dispatch list is only rerun after a block driver
> returned BLK_MQ_RQ_QUEUE_BUSY if that flag got cleared by a concurrent
> blk_mq_sched_restart_hctx() call while blk_mq_dispatch_rq_list() was in
> progress.

Yes, the queue is rerun either from the completion path when BLK_MQ_S_SCHED_RESTART is set, or directly after .queue_rq() returns _BUSY when the flag has been cleared in the meantime from the completion path. So in theory we can make sure the queue will be run again whenever _BUSY happens; then what is the root cause that forces us to add blk_mq_delay_run_hw_queue(hctx, 100) in dm's .queue_rq()?

Thanks,
Ming
On 04/12/17 19:20, Ming Lei wrote:
> On Wed, Apr 12, 2017 at 06:38:07PM +0000, Bart Van Assche wrote:
>> If the blk-mq core would always rerun a hardware queue when a block driver
>> returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core
>
> It won't cause 100% CPU utilization since we restart the queue in the
> completion path, and at that time at least one tag is available, so
> progress can be made.

Hello Ming,

Sorry, but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY then it is likely that calling .queue_rq() again after only a few microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you don't believe me: change

	if (!blk_mq_sched_needs_restart(hctx) &&
	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
		blk_mq_run_hw_queue(hctx, true);

into

	blk_mq_run_hw_queue(hctx, true);

trigger a busy condition for either a SCSI LLD or a dm-rq driver, run top, and you will see that the CPU usage of a kworker thread jumps to 100%.

Bart.
On Thu, Apr 13, 2017 at 09:59:57AM -0700, Bart Van Assche wrote:
> On 04/12/17 19:20, Ming Lei wrote:
> > On Wed, Apr 12, 2017 at 06:38:07PM +0000, Bart Van Assche wrote:
> > > If the blk-mq core would always rerun a hardware queue when a block driver
> > > returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core
> >
> > It won't cause 100% CPU utilization since we restart the queue in the
> > completion path, and at that time at least one tag is available, so
> > progress can be made.
>
> Hello Ming,
>
> Sorry, but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
> then it is likely that calling .queue_rq() again after only a few
> microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
> don't believe me: change
>
> 	if (!blk_mq_sched_needs_restart(hctx) &&
> 	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> 		blk_mq_run_hw_queue(hctx, true);
>
> into
>
> 	blk_mq_run_hw_queue(hctx, true);
>
> trigger a busy

Yes, that can be true, but I mean it is still OK to run the queue again with

	if (!blk_mq_sched_needs_restart(hctx) &&
	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
		blk_mq_run_hw_queue(hctx, true);

and to restart the queue in __blk_mq_finish_request() when BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(). Both are in the current blk-mq implementation.

Then why do we need blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in dm?

Thanks,
Ming
On Fri, 2017-04-14 at 09:13 +0800, Ming Lei wrote:
> On Thu, Apr 13, 2017 at 09:59:57AM -0700, Bart Van Assche wrote:
> > On 04/12/17 19:20, Ming Lei wrote:
> > > On Wed, Apr 12, 2017 at 06:38:07PM +0000, Bart Van Assche wrote:
> > > > If the blk-mq core would always rerun a hardware queue when a block driver
> > > > returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core
> > >
> > > It won't cause 100% CPU utilization since we restart the queue in the
> > > completion path, and at that time at least one tag is available, so
> > > progress can be made.
> >
> > Hello Ming,
> >
> > Sorry, but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
> > then it is likely that calling .queue_rq() again after only a few
> > microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
> > don't believe me: change
> >
> > 	if (!blk_mq_sched_needs_restart(hctx) &&
> > 	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> > 		blk_mq_run_hw_queue(hctx, true);
> >
> > into
> >
> > 	blk_mq_run_hw_queue(hctx, true);
> >
> > trigger a busy
>
> Yes, that can be true, but I mean it is still OK to run the queue again
> with
>
> 	if (!blk_mq_sched_needs_restart(hctx) &&
> 	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> 		blk_mq_run_hw_queue(hctx, true);
>
> and to restart the queue in __blk_mq_finish_request() when
> BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(). Both are in the
> current blk-mq implementation.
>
> Then why do we need blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in dm?

Because if dm_mq_queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY there is no guarantee that __blk_mq_finish_request() will be called later on for the same queue. dm_mq_queue_rq() can e.g. return BLK_MQ_RQ_QUEUE_BUSY while no dm requests are in progress, because the SCSI error handler is active for all underlying paths. See also scsi_lld_busy() and scsi_host_in_recovery().

Bart.
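Annotated against the patched dm-rq busy path, Bart's scenario reads as follows (the code lines match the patch at the top of this thread; the comment is explanatory and not part of upstream):

	/* in dm_mq_queue_rq(), when the target reports "busy": */

	/* Undo dm_start_request() before requeuing */
	rq_end_stats(md, rq);
	rq_completed(md, rq_data_dir(rq), false);
	/*
	 * Safety net: the queue may be "busy" with zero dm requests in
	 * flight, e.g. while the SCSI error handler is active on every
	 * underlying path. In that case no completion will ever call
	 * __blk_mq_finish_request() to restart this queue, so schedule
	 * a rerun here.
	 */
	blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
	return BLK_MQ_RQ_QUEUE_BUSY;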
On Fri, Apr 14, 2017 at 05:12:50PM +0000, Bart Van Assche wrote:
> On Fri, 2017-04-14 at 09:13 +0800, Ming Lei wrote:
> > On Thu, Apr 13, 2017 at 09:59:57AM -0700, Bart Van Assche wrote:
> > > On 04/12/17 19:20, Ming Lei wrote:
> > > > On Wed, Apr 12, 2017 at 06:38:07PM +0000, Bart Van Assche wrote:
> > > > > If the blk-mq core would always rerun a hardware queue when a block driver
> > > > > returns BLK_MQ_RQ_QUEUE_BUSY then that would cause 100% of a single CPU core
> > > >
> > > > It won't cause 100% CPU utilization since we restart the queue in the
> > > > completion path, and at that time at least one tag is available, so
> > > > progress can be made.
> > >
> > > Hello Ming,
> > >
> > > Sorry, but you are wrong. If .queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY
> > > then it is likely that calling .queue_rq() again after only a few
> > > microseconds will cause it to return BLK_MQ_RQ_QUEUE_BUSY again. If you
> > > don't believe me: change
> > >
> > > 	if (!blk_mq_sched_needs_restart(hctx) &&
> > > 	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> > > 		blk_mq_run_hw_queue(hctx, true);
> > >
> > > into
> > >
> > > 	blk_mq_run_hw_queue(hctx, true);
> > >
> > > trigger a busy
> >
> > Yes, that can be true, but I mean it is still OK to run the queue again
> > with
> >
> > 	if (!blk_mq_sched_needs_restart(hctx) &&
> > 	    !test_bit(BLK_MQ_S_TAG_WAITING, &hctx->state))
> > 		blk_mq_run_hw_queue(hctx, true);
> >
> > and to restart the queue in __blk_mq_finish_request() when
> > BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(). Both are in the
> > current blk-mq implementation.
> >
> > Then why do we need blk_mq_delay_run_hw_queue(hctx, 100/*ms*/) in dm?
>
> Because if dm_mq_queue_rq() returns BLK_MQ_RQ_QUEUE_BUSY there is no
> guarantee that __blk_mq_finish_request() will be called later on for the
> same queue. dm_mq_queue_rq() can e.g. return BLK_MQ_RQ_QUEUE_BUSY while no
> dm requests are in progress, because the SCSI error handler is active for
> all underlying paths. See also scsi_lld_busy() and scsi_host_in_recovery().

OK, thanks Bart for the explanation. This looks like a very interesting BLK_MQ_RQ_QUEUE_BUSY case that isn't caused by too many pending I/Os; I will study this case further.

Thanks,
Ming
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 6886bf160fb2..d19af1d21f4c 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -755,6 +755,7 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		/* Undo dm_start_request() before requeuing */
 		rq_end_stats(md, rq);
 		rq_completed(md, rq_data_dir(rq), false);
+		blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
 		return BLK_MQ_RQ_QUEUE_BUSY;
 	}
While running the srp-test software I noticed that request processing stalls sporadically at the beginning of a test, namely when mkfs is run against a dm-mpath device. Every time that happened the following command was sufficient to resume request processing:

echo run >/sys/kernel/debug/block/dm-0/state

This patch avoids such request processing stalls. The test I ran is as follows:

while srp-test/run_tests -d -r 30 -t 02-mq; do :; done

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
---
 drivers/md/dm-rq.c | 1 +
 1 file changed, 1 insertion(+)