Message ID | 20171204173032.16330-1-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote: > Commit 0df21c86bdbf introduced several bugs: > * A SCSI queue stall for queue depths > 1, addressed by commit > 88022d7201e9 ("blk-mq: don't handle failure in .get_budget") This one is committed already. > * A systematic lockup for SCSI queues with queue depth 1. The > following test reproduces that bug systematically: > - Change the SRP initiator such that SCSI target queue depth is > limited to 1. > - Run the following command: > srp-test/run_tests -f xfs -d -e none -r 60 -t 01 > See also "[PATCH 4/7] blk-mq: Avoid that request processing > stalls when sharing tags" > (https://marc.info/?l=linux-block&m=151208695316857). Note: > reverting commit 0df21c86bdbf also fixes a sporadic SCSI request > queue lockup while inserting a blk_mq_sched_mark_restart_hctx() > before all blk_mq_dispatch_rq_list() calls only fixes the > systematic lockup for queue depth 1. You are the only reproducer, and you don't want to provide any kernel log about this issue, so how can we help you fix your issue? You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")', but you don't mention any issue about that commit, and your patch is actually nothing to do with commit b347689ffbca, and seems your work style is just try and guess. Also both Jens and I have run tests on null_blk and scsi_debug by setting queue_depth as one, and we all can't see IO hang with current blk-mq. > * A scsi_debug lockup - see also "[PATCH] SCSI: delay run queue if > device is blocked in scsi_dev_queue_ready()" > (https://marc.info/?l=linux-block&m=151223233407154). This issue is clearly explained in theory, and can be reproduced/verified by scsi_debug, so why can't we apply it to fix the issue? And the fix is simply and can be thought as cleanup too, since the handling for this case becomes same with non-mq path now. > > I think the above means that it is too risky to try to fix all bugs > introduced by commit 0df21c86bdbf before kernel v4.15 is released. > Hence revert that commit. What is the risk? > > Fixes: commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq") > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Hannes Reinecke <hare@suse.com> > Cc: Johannes Thumshirn <jthumshirn@suse.de> > Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> > Cc: Martin K. Petersen <martin.petersen@oracle.com> > Cc: linux-scsi@vger.kernel.org This commit fixes one important SCSI_MQ performance issue, we can't simply revert it just because of one un-confirmed report from you only(without any kernel log provided). So Nak.
On Tue, 2017-12-05 at 06:42 +0800, Ming Lei wrote: > On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote: > > * A systematic lockup for SCSI queues with queue depth 1. The > > following test reproduces that bug systematically: > > - Change the SRP initiator such that SCSI target queue depth is > > limited to 1. > > - Run the following command: > > srp-test/run_tests -f xfs -d -e none -r 60 -t 01 > > See also "[PATCH 4/7] blk-mq: Avoid that request processing > > stalls when sharing tags" > > (https://marc.info/?l=linux-block&m=151208695316857). Note: > > reverting commit 0df21c86bdbf also fixes a sporadic SCSI request > > queue lockup while inserting a blk_mq_sched_mark_restart_hctx() > > before all blk_mq_dispatch_rq_list() calls only fixes the > > systematic lockup for queue depth 1. > > You are the only reproducer [ ... ] That's not correct. I'm pretty sure if you try to reproduce this that you will see the same hang I ran into. Does this mean that you have not yet tried to reproduce the hang I reported? > You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched: > improve dispatching from sw queue")', but you don't mention any issue > about that commit. That's not correct either. From the commit message "A systematic lockup for SCSI queues with queue depth 1." > > I think the above means that it is too risky to try to fix all bugs > > introduced by commit 0df21c86bdbf before kernel v4.15 is released. > > Hence revert that commit. > > What is the risk? That more bugs were introduced by commit 0df21c86bdbf than the ones that have been discovered so far. Bart.
On Mon, Dec 04, 2017 at 10:48:18PM +0000, Bart Van Assche wrote: > On Tue, 2017-12-05 at 06:42 +0800, Ming Lei wrote: > > On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote: > > > * A systematic lockup for SCSI queues with queue depth 1. The > > > following test reproduces that bug systematically: > > > - Change the SRP initiator such that SCSI target queue depth is > > > limited to 1. > > > - Run the following command: > > > srp-test/run_tests -f xfs -d -e none -r 60 -t 01 > > > See also "[PATCH 4/7] blk-mq: Avoid that request processing > > > stalls when sharing tags" > > > (https://marc.info/?l=linux-block&m=151208695316857). Note: > > > reverting commit 0df21c86bdbf also fixes a sporadic SCSI request > > > queue lockup while inserting a blk_mq_sched_mark_restart_hctx() > > > before all blk_mq_dispatch_rq_list() calls only fixes the > > > systematic lockup for queue depth 1. > > > > You are the only reproducer [ ... ] > > That's not correct. I'm pretty sure if you try to reproduce this that > you will see the same hang I ran into. Does this mean that you have not > yet tried to reproduce the hang I reported? Do you mean every kernel developer has to own one SRP/IB hardware? I don't have your hardware to reproduce that, and I don't think most of guys have that. Otherwise, there should have be such similar reports from others, not from only you. More importantly I don't understand why you can't share the kernel log/debugfs log when IO hang happens? Without any kernel log, how can we confirm that it is a valid report? > > > You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched: > > improve dispatching from sw queue")', but you don't mention any issue > > about that commit. > > That's not correct either. From the commit message "A systematic lockup > for SCSI queues with queue depth 1." I mean you mentioned your patch can fix 'commit b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")', but you never point where the commit b347689ffbca is wrong, how your patch fixes the mistake of that commit. > > > > I think the above means that it is too risky to try to fix all bugs > > > introduced by commit 0df21c86bdbf before kernel v4.15 is released. > > > Hence revert that commit. > > > > What is the risk? > > That more bugs were introduced by commit 0df21c86bdbf than the ones that > have been discovered so far. If you don't provide any log, I have to ignore your report simply. So there is only one real issue which can be addressed easily by the following patch: https://marc.info/?l=linux-scsi&m=151223234607157&w=2
On Tue, 2017-12-05 at 07:01 +0800, Ming Lei wrote: > On Mon, Dec 04, 2017 at 10:48:18PM +0000, Bart Van Assche wrote: > > On Tue, 2017-12-05 at 06:42 +0800, Ming Lei wrote: > > > On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote: > > > > * A systematic lockup for SCSI queues with queue depth 1. The > > > > following test reproduces that bug systematically: > > > > - Change the SRP initiator such that SCSI target queue depth is > > > > limited to 1. > > > > - Run the following command: > > > > srp-test/run_tests -f xfs -d -e none -r 60 -t 01 > > > > See also "[PATCH 4/7] blk-mq: Avoid that request processing > > > > stalls when sharing tags" > > > > (https://marc.info/?l=linux-block&m=151208695316857). Note: > > > > reverting commit 0df21c86bdbf also fixes a sporadic SCSI request > > > > queue lockup while inserting a blk_mq_sched_mark_restart_hctx() > > > > before all blk_mq_dispatch_rq_list() calls only fixes the > > > > systematic lockup for queue depth 1. > > > > > > You are the only reproducer [ ... ] > > > > That's not correct. I'm pretty sure if you try to reproduce this that > > you will see the same hang I ran into. Does this mean that you have not > > yet tried to reproduce the hang I reported? > > Do you mean every kernel developer has to own one SRP/IB hardware? When I have the time I will make it possible to run this test on any system equipped with at least one Ethernet port. But the fact that the test I mentioned requires IB hardware should not prevent you from running this test since you have run this test software before. > I don't have your hardware to reproduce that, That's not true. Your employer definitely owns IB hardware. E.g. the following message shows that you have run the srp-test yourself on IB hardware only four weeks ago: https://www.spinics.net/lists/linux-block/msg19511.html > Otherwise, there should have be such similar reports from others, not from > only you. That's not correct either. How long was it ago that kernel v4.15-rc1 was released? One week? How many SRP users do you think have tried to trigger a queue full condition with that kernel version? > More importantly I don't understand why you can't share the kernel > log/debugfs log when IO hang happens? > > Without any kernel log, how can we confirm that it is a valid report? It's really unfortunate that you are focussing on denying that the bug I reported exists instead of trying to fix the bugs introduced by commit b347689ffbca. BTW, I have more than enough experience to decide myself what a valid report is and what not. I can easily send you several MB of kernel logs. The reason I have not yet done this is because I'm 99.9% sure that these won't help to root cause the reported issue. But I can tell you what I learned from analyzing the information under /sys/kernel/debug/block: every time a hang occurred I noticed that no requests were "busy", that two requests occurred in rq_lists and one request occurred in a hctx dispatch list. This is enough to conclude that a queue run was missing. And I think that the patch at the start of this e-mail thread not only shows that the root cause is in the block layer but also that this bug was introduced by commit b347689ffbca. > > > You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched: > > > improve dispatching from sw queue")', but you don't mention any issue > > > about that commit. > > > > That's not correct either. From the commit message "A systematic lockup > > for SCSI queues with queue depth 1." > > I mean you mentioned your patch can fix 'commit b347689ffbca > ("blk-mq-sched: improve dispatching from sw queue")', but you never > point where the commit b347689ffbca is wrong, how your patch fixes > the mistake of that commit. You should know that it is not required to perform a root cause analysis before posting a revert. Having performed a bisect is sufficient. BTW, it seems like you forgot that last Friday I explained to you that there is an obvious bug in commit b347689ffbca, namely that a blk_mq_sched_mark_restart_hctx() call is missing in blk_mq_sched_dispatch_requests() before the blk_mq_do_dispatch_ctx() call. See also https://marc.info/?l=linux-block&m=151215794224401. Bart.
On Mon, Dec 04, 2017 at 11:32:27PM +0000, Bart Van Assche wrote: > On Tue, 2017-12-05 at 07:01 +0800, Ming Lei wrote: > > On Mon, Dec 04, 2017 at 10:48:18PM +0000, Bart Van Assche wrote: > > > On Tue, 2017-12-05 at 06:42 +0800, Ming Lei wrote: > > > > On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote: > > > > > * A systematic lockup for SCSI queues with queue depth 1. The > > > > > following test reproduces that bug systematically: > > > > > - Change the SRP initiator such that SCSI target queue depth is > > > > > limited to 1. > > > > > - Run the following command: > > > > > srp-test/run_tests -f xfs -d -e none -r 60 -t 01 > > > > > See also "[PATCH 4/7] blk-mq: Avoid that request processing > > > > > stalls when sharing tags" > > > > > (https://marc.info/?l=linux-block&m=151208695316857). Note: > > > > > reverting commit 0df21c86bdbf also fixes a sporadic SCSI request > > > > > queue lockup while inserting a blk_mq_sched_mark_restart_hctx() > > > > > before all blk_mq_dispatch_rq_list() calls only fixes the > > > > > systematic lockup for queue depth 1. > > > > > > > > You are the only reproducer [ ... ] > > > > > > That's not correct. I'm pretty sure if you try to reproduce this that > > > you will see the same hang I ran into. Does this mean that you have not > > > yet tried to reproduce the hang I reported? > > > > Do you mean every kernel developer has to own one SRP/IB hardware? > > When I have the time I will make it possible to run this test on any system > equipped with at least one Ethernet port. But the fact that the test I > mentioned requires IB hardware should not prevent you from running this test > since you have run this test software before. > > > I don't have your hardware to reproduce that, > > That's not true. Your employer definitely owns IB hardware. E.g. the > following message shows that you have run the srp-test yourself on IB hardware > only four weeks ago: > > https://www.spinics.net/lists/linux-block/msg19511.html The hardware belongs to Laurence, at that time I can borrow from him, and now I am not sure if it is available. > > > Otherwise, there should have be such similar reports from others, not from > > only you. > > That's not correct either. How long was it ago that kernel v4.15-rc1 was > released? One week? How many SRP users do you think have tried to trigger a > queue full condition with that kernel version? OK, we can wait for further reporters to provide kernel log if you don't want. > > > More importantly I don't understand why you can't share the kernel > > log/debugfs log when IO hang happens? > > > > Without any kernel log, how can we confirm that it is a valid report? > > It's really unfortunate that you are focussing on denying that the bug I > reported exists instead of trying to fix the bugs introduced by commit If you look at bug reports in kenrel mail list, you will see most of reports includes some kind of log, that is a common practice to report issue with log attached. It can save us much time for talking in mails. > b347689ffbca. BTW, I have more than enough experience to decide myself what > a valid report is and what not. I can easily send you several MB of kernel As I mentioned, only dmesg with hang trace and debugfs log should be enough, both can't be so big, right? > logs. The reason I have not yet done this is because I'm 99.9% sure that > these won't help to root cause the reported issue. But I can tell you what That is your opinion, most of times, I can find some clue from debugfs about hang issue, then I can try to add trace just in some possible places for narrowing down the issue. > I learned from analyzing the information under /sys/kernel/debug/block: > every time a hang occurred I noticed that no requests were "busy", that two > requests occurred in rq_lists and one request occurred in a hctx dispatch Then what do other attributes show? Like queue/hctx state? The following script can get all this info easily: http://people.redhat.com/minlei/tests/tools/dump-blk-info Also it is a bit odd to see request in hctx->dispatch now, and it can only happen now when scsi_target_queue_ready() returns false, so I guess you apply some change on target->can_queue(such as setting it as 1 in srp/ib code manually)? Please reply, if yes, I will try to see if I can reproduce it with this kind of change on scsi_debug. > list. This is enough to conclude that a queue run was missing. And I think In this case, seems it isn't related with both commit b347689ff and 0df21c86bdbf, since both don't change RESTART for hctx->dispatch, and shouldn't affect run queue. > that the patch at the start of this e-mail thread not only shows that the > root cause is in the block layer but also that this bug was introduced by > commit b347689ffbca. > > > > > You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched: > > > > improve dispatching from sw queue")', but you don't mention any issue > > > > about that commit. > > > > > > That's not correct either. From the commit message "A systematic lockup > > > for SCSI queues with queue depth 1." > > > > I mean you mentioned your patch can fix 'commit b347689ffbca > > ("blk-mq-sched: improve dispatching from sw queue")', but you never > > point where the commit b347689ffbca is wrong, how your patch fixes > > the mistake of that commit. > > You should know that it is not required to perform a root cause analysis > before posting a revert. Having performed a bisect is sufficient. I don't think your issue can't be fixed before releasing V4.15 if enough log are provided, and reverting can cause performance regression for all SCSI_MQ users. Also more importantly you may revert a wrong commit, because sometimes some commits may make some issue happen easily, not the direct reason of the issue. > > BTW, it seems like you forgot that last Friday I explained to you that there > is an obvious bug in commit b347689ffbca, namely that a blk_mq_sched_mark_restart_hctx() > call is missing in blk_mq_sched_dispatch_requests() before the blk_mq_do_dispatch_ctx() > call. See also https://marc.info/?l=linux-block&m=151215794224401. No, I have explained to you that your change isn't necessary, see: https://marc.info/?l=linux-block&m=151217500929341&w=2
On Tue, 2017-12-05 at 08:20 +0800, Ming Lei wrote: > Also it is a bit odd to see request in hctx->dispatch now, and it can only > happen now when scsi_target_queue_ready() returns false, so I guess you apply > some change on target->can_queue(such as setting it as 1 in srp/ib code > manually)? Yes, but that had already been mentioned. From the e-mail at the start of this e-mail thread: "Change the SRP initiator such that SCSI target queue depth is limited to 1." The changes I made in the SRP initiator are the same as those described in the following message from about one month ago: https://www.spinics.net/lists/linux-scsi/msg114720.html. Bart.
On Tue, Dec 05, 2017 at 12:29:59AM +0000, Bart Van Assche wrote: > On Tue, 2017-12-05 at 08:20 +0800, Ming Lei wrote: > > Also it is a bit odd to see request in hctx->dispatch now, and it can only > > happen now when scsi_target_queue_ready() returns false, so I guess you apply > > some change on target->can_queue(such as setting it as 1 in srp/ib code > > manually)? > > Yes, but that had already been mentioned. From the e-mail at the start of > this e-mail thread: "Change the SRP initiator such that SCSI target queue > depth is limited to 1." The changes I made in the SRP initiator are the same > as those described in the following message from about one month ago: > https://www.spinics.net/lists/linux-scsi/msg114720.html. OK, got it. Then no reason to revert commit(0df21c86bdbf scsi: implement .get_budget an .put_budget for blk-mq) for one issue which may never happen in reality since this reproducer need out-of-tree patch. I don't mean it isn't a issue, but I don't think it has top priority for reverting commit 0df21c86bdbf. Especially there isn't proof shown that 0df21c86bdbf causes this issue since this commit won't change run queue for requests in hctx->dispatch_list. I's like to take a look if someone'd like to cooperate, such as providing kernel log, test debug patch, and kind of things. Or when I get this hardware to reproduce. -- Ming
On Tue, 2017-12-05 at 09:04 +0800, Ming Lei wrote: > Then no reason to revert commit(0df21c86bdbf scsi: implement .get_budget an > .put_budget for blk-mq) for one issue which may never happen in reality since > this reproducer need out-of-tree patch. Sorry but I disagree completely. You seem to overlook that there may be other circumstances that trigger the same lockup, e.g. a SCSI queue full condition. Bart.
On Tue, Dec 05, 2017 at 01:13:43AM +0000, Bart Van Assche wrote: > On Tue, 2017-12-05 at 09:04 +0800, Ming Lei wrote: > > Then no reason to revert commit(0df21c86bdbf scsi: implement .get_budget an > > .put_budget for blk-mq) for one issue which may never happen in reality since > > this reproducer need out-of-tree patch. > > Sorry but I disagree completely. You seem to overlook that there may be other > circumstances that trigger the same lockup, e.g. a SCSI queue full condition. If the scsi_dev_queue_ready() returns false, .get_budget() catches that and never add request to hctx->dispatch. And scsi_host_queue_ready() always returns true, since we respect per-host queue depth by blk_mq_get_driver_tag() before calling .queue_rq(). Or if I miss other cases, please point it out.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 84bd2b16d216..a7e7966f1477 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1976,9 +1976,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, struct scsi_device *sdev = q->queuedata; struct Scsi_Host *shost = sdev->host; struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req); - blk_status_t ret; + blk_status_t ret = BLK_STS_RESOURCE; int reason; + if (!scsi_mq_get_budget(hctx)) + goto out; ret = prep_to_mq(scsi_prep_state_check(sdev, req)); if (ret != BLK_STS_OK) goto out_put_budget; @@ -2022,6 +2024,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, atomic_dec(&scsi_target(sdev)->target_busy); out_put_budget: scsi_mq_put_budget(hctx); +out: switch (ret) { case BLK_STS_OK: break; @@ -2225,8 +2228,6 @@ struct request_queue *scsi_old_alloc_queue(struct scsi_device *sdev) } static const struct blk_mq_ops scsi_mq_ops = { - .get_budget = scsi_mq_get_budget, - .put_budget = scsi_mq_put_budget, .queue_rq = scsi_queue_rq, .complete = scsi_softirq_done, .timeout = scsi_timeout,
Commit 0df21c86bdbf introduced several bugs: * A SCSI queue stall for queue depths > 1, addressed by commit 88022d7201e9 ("blk-mq: don't handle failure in .get_budget") * A systematic lockup for SCSI queues with queue depth 1. The following test reproduces that bug systematically: - Change the SRP initiator such that SCSI target queue depth is limited to 1. - Run the following command: srp-test/run_tests -f xfs -d -e none -r 60 -t 01 See also "[PATCH 4/7] blk-mq: Avoid that request processing stalls when sharing tags" (https://marc.info/?l=linux-block&m=151208695316857). Note: reverting commit 0df21c86bdbf also fixes a sporadic SCSI request queue lockup while inserting a blk_mq_sched_mark_restart_hctx() before all blk_mq_dispatch_rq_list() calls only fixes the systematic lockup for queue depth 1. * A scsi_debug lockup - see also "[PATCH] SCSI: delay run queue if device is blocked in scsi_dev_queue_ready()" (https://marc.info/?l=linux-block&m=151223233407154). I think the above means that it is too risky to try to fix all bugs introduced by commit 0df21c86bdbf before kernel v4.15 is released. Hence revert that commit. Fixes: commit 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org --- drivers/scsi/scsi_lib.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)