Message ID | aa1c23e1-1beb-a26d-3e77-fe78aa281771@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 01/24/2017 08:54 AM, Hannes Reinecke wrote: > Hi Jens, > > I'm trying to debug a queue stall with your blk-mq-sched branch; with my > latest mpt3sas patches fio stops basically directly after starting a > sequential read :-( > > I've debugged things and came up with the attached patch; we need to > restart waiters with blk_mq_tag_idle() after completing a tag. > We're already calling blk_mq_tag_busy() when fetching a tag, so I think > calling blk_mq_tag_idle() is required when retiring a tag. I'll take a look at this. It sounds like all your grief is related to shared tag maps, which I don't have anything that uses here. I'll see if we are leaking it, you should be able to check that by reading the 'active' file in the sysfs directory.
On 01/24/2017 08:54 AM, Hannes Reinecke wrote: > Hi Jens, > > I'm trying to debug a queue stall with your blk-mq-sched branch; with my > latest mpt3sas patches fio stops basically directly after starting a > sequential read :-( > > I've debugged things and came up with the attached patch; we need to > restart waiters with blk_mq_tag_idle() after completing a tag. > We're already calling blk_mq_tag_busy() when fetching a tag, so I think > calling blk_mq_tag_idle() is required when retiring a tag. The patch isn't correct, the whole point of the un-idling is that it ISN'T happening for every request completion. Otherwise you throw away scalability. So a queue will go into active mode on the first request, and idle when it's been idle for a bit. The active count is used to divide up the tags. So I'm assuming we're missing a queue run somewhere when we fail getting a driver tag. The latter should only happen if the target has IO in flight already, and the restart marking should take care of it. Obviously there's a case where that is not true, since you are seeing stalls. > However, even with the attached patch I'm seeing some queue stalls; > looks like they're related to the 'stonewall' statement in fio. I think you are heading down the wrong path. Your patch will cause the symptoms to be a bit different, but you'll still run into cases where we fail giving out the tag and then stall.
On 01/24/2017 05:03 PM, Jens Axboe wrote: > On 01/24/2017 08:54 AM, Hannes Reinecke wrote: >> Hi Jens, >> >> I'm trying to debug a queue stall with your blk-mq-sched branch; with my >> latest mpt3sas patches fio stops basically directly after starting a >> sequential read :-( >> >> I've debugged things and came up with the attached patch; we need to >> restart waiters with blk_mq_tag_idle() after completing a tag. >> We're already calling blk_mq_tag_busy() when fetching a tag, so I think >> calling blk_mq_tag_idle() is required when retiring a tag. > > I'll take a look at this. It sounds like all your grief is related to > shared tag maps, which I don't have anything that uses here. I'll see > if we are leaking it, you should be able to check that by reading the > 'active' file in the sysfs directory. > Ah. That'll explain it. Basically _all_ my HBAs I'm testing with have shared tag maps :-( Cheers, Hannes
On 01/24/2017 05:09 PM, Jens Axboe wrote: > On 01/24/2017 08:54 AM, Hannes Reinecke wrote: >> Hi Jens, >> >> I'm trying to debug a queue stall with your blk-mq-sched branch; with my >> latest mpt3sas patches fio stops basically directly after starting a >> sequential read :-( >> >> I've debugged things and came up with the attached patch; we need to >> restart waiters with blk_mq_tag_idle() after completing a tag. >> We're already calling blk_mq_tag_busy() when fetching a tag, so I think >> calling blk_mq_tag_idle() is required when retiring a tag. > > The patch isn't correct, the whole point of the un-idling is that it > ISN'T happening for every request completion. Otherwise you throw > away scalability. So a queue will go into active mode on the first > request, and idle when it's been idle for a bit. The active count > is used to divide up the tags. > > So I'm assuming we're missing a queue run somewhere when we fail > getting a driver tag. The latter should only happen if the target > has IO in flight already, and the restart marking should take care > of it. Obviously there's a case where that is not true, since you > are seeing stalls. > But what is the point in the 'blk_mq_tag_busy()' thingie then? When will it be reset? The only instances I've seen is that it'll be getting reset during resize and teardown ... hence my patch. >> However, even with the attached patch I'm seeing some queue stalls; >> looks like they're related to the 'stonewall' statement in fio. > > I think you are heading down the wrong path. Your patch will cause > the symptoms to be a bit different, but you'll still run into cases > where we fail giving out the tag and then stall. > Hehe. How did you know that? That's indeed what I'm seeing. Oh well, back to the drawing board... Cheers, Hannes
From 82b15ff40d71aed318f9946881825f9f03ef8f48 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke <hare@suse.de> Date: Tue, 24 Jan 2017 14:43:09 +0100 Subject: [PATCH] block-mq: fixup queue stall __blk_mq_alloc_request() calls blk_mq_tag_busy(), which might result in the queue to become blocked. So we need to call blk_mq_tag_idle() once the tag is finished to wakeup all waiters on the queue. Patch is relative to the blk-mq-sched branch Signed-off-by: Hannes Reinecke <hare@suse.com> --- block/blk-mq.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 739a292..d52bcb1 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -333,10 +333,12 @@ void __blk_mq_finish_request(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, { const int sched_tag = rq->internal_tag; struct request_queue *q = rq->q; + bool unbusy = false; - if (rq->rq_flags & RQF_MQ_INFLIGHT) + if (rq->rq_flags & RQF_MQ_INFLIGHT) { atomic_dec(&hctx->nr_active); - + unbusy = true; + } wbt_done(q->rq_wb, &rq->issue_stat); rq->rq_flags = 0; @@ -346,6 +348,9 @@ void __blk_mq_finish_request(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag); if (sched_tag != -1) blk_mq_sched_completed_request(hctx, rq); + if (unbusy) + blk_mq_tag_idle(hctx); + blk_queue_exit(q); } -- 1.8.5.6