diff mbox

SCSI: delay run queue if device is blocked in scsi_dev_queue_ready()

Message ID 20171202163150.1273-1-ming.lei@redhat.com (mailing list archive)
State Superseded
Headers show

Commit Message

Ming Lei Dec. 2, 2017, 4:31 p.m. UTC
Before commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget
for blk-mq"), we run queue after 3ms if device is blocked and queue is
idle, which is done in handling BLK_STS_RESOURCE. After commit 0df21c86bdbf
is introduced, queue won't be run any more under this situation.

IO hang is observed when timeout happened, and this patch fixes the IO
hang issue by running queue after delay in scsi_dev_queue_ready, just like
non-mq.

This issue can be triggered by the following script:

	#!/bin/sh
	rmmod scsi_debug
	modprobe scsi_debug max_queue=1

	DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`

	DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`

	echo "using scsi device $DEVICE"
	echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
	echo starting loop $i
	echo "temporary write through" >$DISK_DIR/cache_type
	echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
	echo none > /sys/block/$DEVICE/queue/scheduler
	dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
	sleep 5
	echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
	wait
	echo "SUCCESS"

Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Johannes Thumshirn Dec. 4, 2017, 8:19 a.m. UTC | #1
Hi Ming,

Ming Lei <ming.lei@redhat.com> writes:
> This issue can be triggered by the following script:
>
> 	#!/bin/sh
> 	rmmod scsi_debug
> 	modprobe scsi_debug max_queue=1
>
> 	DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
>
> 	DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
>
> 	echo "using scsi device $DEVICE"
> 	echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> 	echo starting loop $i
> 	echo "temporary write through" >$DISK_DIR/cache_type
> 	echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> 	echo none > /sys/block/$DEVICE/queue/scheduler
> 	dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> 	sleep 5
> 	echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> 	wait
> 	echo "SUCCESS"

Can you please submit a test-case for blktest as well, given you have a
nice reproducer?

Thanks,
        Johannes
Ming Lei Dec. 4, 2017, 8:34 a.m. UTC | #2
On Mon, Dec 04, 2017 at 09:19:33AM +0100, Johannes Thumshirn wrote:
> 
> Hi Ming,
> 
> Ming Lei <ming.lei@redhat.com> writes:
> > This issue can be triggered by the following script:
> >
> > 	#!/bin/sh
> > 	rmmod scsi_debug
> > 	modprobe scsi_debug max_queue=1
> >
> > 	DEVICE=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename`
> >
> > 	DISK_DIR=`ls -d /sys/block/$DEVICE/device/scsi_disk/*`
> >
> > 	echo "using scsi device $DEVICE"
> > 	echo "-1" >/sys/bus/pseudo/drivers/scsi_debug/every_nth
> > 	echo starting loop $i
> > 	echo "temporary write through" >$DISK_DIR/cache_type
> > 	echo "128" >/sys/bus/pseudo/drivers/scsi_debug/opts
> > 	echo none > /sys/block/$DEVICE/queue/scheduler
> > 	dd if=/dev/$DEVICE of=/dev/null bs=1M iflag=direct count=1 &
> > 	sleep 5
> > 	echo "0" >/sys/bus/pseudo/drivers/scsi_debug/opts
> > 	wait
> > 	echo "SUCCESS"
> 
> Can you please submit a test-case for blktest as well, given you have a
> nice reproducer?

Hi Johannes,

I am happy to do that, but recently I am very busy, so it may be done
a bit late by me.

But anyone should reproduce the issue 100% with V4.15-rc kernel by just
running the above script, not any specific hardware is required at all,
so that means anyone can make a patch for blktest to test block/SCSI
timeout if he/she is interested in doing that.

Thanks,
Ming
Johannes Thumshirn Dec. 4, 2017, 8:44 a.m. UTC | #3
Ming Lei <ming.lei@redhat.com> writes:


> I am happy to do that, but recently I am very busy, so it may be done
> a bit late by me.
>
> But anyone should reproduce the issue 100% with V4.15-rc kernel by just
> running the above script, not any specific hardware is required at all,
> so that means anyone can make a patch for blktest to test block/SCSI
> timeout if he/she is interested in doing that.

OK, let me see if I can spent some time on this the next days.

Byte,
        Johannes
Ming Lei Dec. 4, 2017, 8:58 a.m. UTC | #4
On Mon, Dec 04, 2017 at 09:44:55AM +0100, Johannes Thumshirn wrote:
> Ming Lei <ming.lei@redhat.com> writes:
> 
> 
> > I am happy to do that, but recently I am very busy, so it may be done
> > a bit late by me.
> >
> > But anyone should reproduce the issue 100% with V4.15-rc kernel by just
> > running the above script, not any specific hardware is required at all,
> > so that means anyone can make a patch for blktest to test block/SCSI
> > timeout if he/she is interested in doing that.
> 
> OK, let me see if I can spent some time on this the next days.

That is great, thanks!
Bart Van Assche Dec. 4, 2017, 3:09 p.m. UTC | #5
On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")


It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
issues introduced by that commit for kernel version v4.15 ...

Bart.
Ming Lei Dec. 4, 2017, 10:45 p.m. UTC | #6
On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> 
> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
> issues introduced by that commit for kernel version v4.15 ...

What are all issues in v4.15-rc? Up to now, it is the only issue reported,
and can be fixed by this simple patch, which one can be thought as cleanup
too.
Bart Van Assche Dec. 4, 2017, 10:49 p.m. UTC | #7
On Tue, 2017-12-05 at 06:45 +0800, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:

> > On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:

> > > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")

> > 

> > It might be safer to revert commit 0df21c86bdbf instead of trying to fix all

> > issues introduced by that commit for kernel version v4.15 ...

> 

> What are all issues in v4.15-rc? Up to now, it is the only issue reported,

> and can be fixed by this simple patch, which one can be thought as cleanup

> too.


The three issues I described in the commit message of the patch that is available at:
https://marc.info/?l=linux-block&m=151240866010572.

Bart.
Holger Hoffstätte Dec. 4, 2017, 11:48 p.m. UTC | #8
On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:

> On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
>> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
>> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
>> 
>> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
>> issues introduced by that commit for kernel version v4.15 ...
> 
> What are all issues in v4.15-rc? Up to now, it is the only issue reported,
> and can be fixed by this simple patch, which one can be thought as cleanup
> too.

Even with this patch I've encountered at least one hang that
seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
the hang in question was on a rotating disk. It could be solved by activating
a different scheduler on the hanging device; all hanging sync/df processes got
unstuck and all was fine again, which leads me to believe that there is at least
one more rare condition where delaying requests (as done in the budget patch)
leads to a hang.

This happened with mq-deadline which I was testing specifically to avoid
any BFQ-related side effects.

I didn't do anything specific to trigger the hang and have not been able
to reproduce it during regular usage.

-h
Ming Lei Dec. 5, 2017, 5:16 a.m. UTC | #9
On Mon, Dec 04, 2017 at 11:48:07PM +0000, Holger Hoffstätte wrote:
> On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:
> 
> > On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
> >> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> >> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> >> 
> >> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
> >> issues introduced by that commit for kernel version v4.15 ...
> > 
> > What are all issues in v4.15-rc? Up to now, it is the only issue reported,
> > and can be fixed by this simple patch, which one can be thought as cleanup
> > too.
> 
> Even with this patch I've encountered at least one hang that
> seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
> the hang in question was on a rotating disk. It could be solved by activating
> a different scheduler on the hanging device; all hanging sync/df processes got
> unstuck and all was fine again, which leads me to believe that there is at least
> one more rare condition where delaying requests (as done in the budget patch)
> leads to a hang.
> 
> This happened with mq-deadline which I was testing specifically to avoid
> any BFQ-related side effects.

OK, this looks a new report.

Without any log, we can't make any progress, and even we can't guess
what the issue is related with.

Could you post your dmesg log(include the hang process stack trace)? And
dump the debugfs log by the following script when this hang happens?

	http://people.redhat.com/minlei/tests/tools/dump-blk-info

BTW, you just need to pass the disk name to the script, such as: /dev/sda.
Ming Lei Dec. 5, 2017, 6:56 a.m. UTC | #10
On Tue, Dec 05, 2017 at 01:16:24PM +0800, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 11:48:07PM +0000, Holger Hoffstätte wrote:
> > On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:
> > 
> > > On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
> > >> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
> > >> > Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
> > >> 
> > >> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
> > >> issues introduced by that commit for kernel version v4.15 ...
> > > 
> > > What are all issues in v4.15-rc? Up to now, it is the only issue reported,
> > > and can be fixed by this simple patch, which one can be thought as cleanup
> > > too.
> > 
> > Even with this patch I've encountered at least one hang that
> > seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
> > the hang in question was on a rotating disk. It could be solved by activating
> > a different scheduler on the hanging device; all hanging sync/df processes got
> > unstuck and all was fine again, which leads me to believe that there is at least
> > one more rare condition where delaying requests (as done in the budget patch)
> > leads to a hang.
> > 
> > This happened with mq-deadline which I was testing specifically to avoid
> > any BFQ-related side effects.
> 
> OK, this looks a new report.
> 
> Without any log, we can't make any progress, and even we can't guess
> what the issue is related with.
> 
> Could you post your dmesg log(include the hang process stack trace)? And
> dump the debugfs log by the following script when this hang happens?
> 
> 	http://people.redhat.com/minlei/tests/tools/dump-blk-info
> 
> BTW, you just need to pass the disk name to the script, such as: /dev/sda.

Thinking of the issue further, this patch only covers case of
scsi_set_blocked(), but don't consider the case in which .get_budget()
is called inside blk_mq_dispatch_rq_list() for request coming from
hctx->dispatch_list.

If .get_budget() is called in both blk_mq_do_dispatch_sched() and
blk_mq_do_dispatch_ctx(), we don't need to run queue if the queue
is idle. But if it is called from blk_mq_dispatch_rq_list() for request
coming from hctx->dispatch_list, we have to run queue if queue is
idle, as before.

So please ignore this patch, and will submit V2 for cover both cases.

Thanks,
Ming
Holger Hoffstätte Dec. 5, 2017, 11:26 a.m. UTC | #11
On 12/05/17 06:16, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 11:48:07PM +0000, Holger Hoffstätte wrote:
>> On Tue, 05 Dec 2017 06:45:08 +0800, Ming Lei wrote:
>>
>>> On Mon, Dec 04, 2017 at 03:09:20PM +0000, Bart Van Assche wrote:
>>>> On Sun, 2017-12-03 at 00:31 +0800, Ming Lei wrote:
>>>>> Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
>>>>
>>>> It might be safer to revert commit 0df21c86bdbf instead of trying to fix all
>>>> issues introduced by that commit for kernel version v4.15 ...
>>>
>>> What are all issues in v4.15-rc? Up to now, it is the only issue reported,
>>> and can be fixed by this simple patch, which one can be thought as cleanup
>>> too.
>>
>> Even with this patch I've encountered at least one hang that
>> seemed related. I'm using most of block/scsi-4.15 on top of 4.14 and
>> the hang in question was on a rotating disk. It could be solved by activating
>> a different scheduler on the hanging device; all hanging sync/df processes got
>> unstuck and all was fine again, which leads me to believe that there is at least
>> one more rare condition where delaying requests (as done in the budget patch)
>> leads to a hang.
>>
>> This happened with mq-deadline which I was testing specifically to avoid
>> any BFQ-related side effects.
> 
> OK, this looks a new report.
> 
> Without any log, we can't make any progress, and even we can't guess
> what the issue is related with.

Considering that you just had an idea about a corner case and posted v2
of the patch, it's safe to say that we actually can...which is why I
described the situation exactly the way I did. :)

I did try to get stack traces but all the hanging processes were
simply stuck on the device mutex (deep inside btrfs), so nothing too
helpful.

> Could you post your dmesg log(include the hang process stack trace)? And
> dump the debugfs log by the following script when this hang happens?
> 
> 	http://people.redhat.com/minlei/tests/tools/dump-blk-info
> 
> BTW, you just need to pass the disk name to the script, such as: /dev/sda.

Thanks for the script. I'm now running with the new patch and will see what
happens.

cheers,
Holger
diff mbox

Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index cf9f36a1113f..9aada86055d3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1387,6 +1387,15 @@  static void scsi_unprep_fn(struct request_queue *q, struct request *req)
 	scsi_uninit_cmd(blk_mq_rq_to_pdu(req));
 }
 
+static void scsi_mq_delay_queue(struct request_queue *q, unsigned long msecs)
+{
+	struct blk_mq_hw_ctx *hctx;
+	int i;
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		blk_mq_delay_run_hw_queue(hctx, msecs);
+}
+
 /*
  * scsi_dev_queue_ready: if we can send requests to sdev, return 1 else
  * return 0.
@@ -1407,11 +1416,10 @@  static inline int scsi_dev_queue_ready(struct request_queue *q,
 		 * unblock after device_blocked iterates to zero
 		 */
 		if (atomic_dec_return(&sdev->device_blocked) > 0) {
-			/*
-			 * For the MQ case we take care of this in the caller.
-			 */
 			if (!q->mq_ops)
 				blk_delay_queue(q, SCSI_QUEUE_DELAY);
+			else
+				scsi_mq_delay_queue(q, SCSI_QUEUE_DELAY);
 			goto out_dec;
 		}
 		SCSI_LOG_MLQUEUE(3, sdev_printk(KERN_INFO, sdev,