From patchwork Sat Sep 30 10:27:19 2017
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9979375
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig,
 Mike Snitzer, dm-devel@redhat.com
Cc: Bart Van Assche, Laurence Oberman, Paolo Valente,
 Oleksandr Natalenko, Tom Nguyen, linux-kernel@vger.kernel.org,
 linux-scsi@vger.kernel.org, Omar Sandoval, Ming Lei
Subject: [PATCH V5 6/7] blk-mq-sched: improve dispatching from sw queue
Date: Sat, 30 Sep 2017 18:27:19 +0800
Message-Id: <20170930102720.30219-7-ming.lei@redhat.com>
In-Reply-To: <20170930102720.30219-1-ming.lei@redhat.com>
References: <20170930102720.30219-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

SCSI devices use a host-wide tagset, and the shared driver tag space
is often quite big. Meanwhile, there is also a queue depth for each
LUN (.cmd_per_lun), which is often small.

So lots of requests may stay in the sw queue, and we always flush all
of those belonging to the same hw queue and dispatch them all to the
driver. Unfortunately, this easily makes the queue busy because of the
small per-LUN queue depth. Once these requests are flushed out, they
have to stay in hctx->dispatch, where no bio can be merged into them,
and sequential I/O performance is hurt.

This patch improves dispatching from the sw queue when there is a
per-request-queue queue depth by taking requests one by one from the
sw queue, just as an I/O scheduler does.

Reviewed-by: Omar Sandoval
Reviewed-by: Bart Van Assche
Tested-by: Oleksandr Natalenko
Tested-by: Tom Nguyen
Tested-by: Paolo Valente
Signed-off-by: Ming Lei
---
 block/blk-mq-sched.c   | 53 ++++++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/blk-mq.h |  2 ++
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 538f363f39ca..3ba112d9dc15 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -105,6 +105,42 @@ static void blk_mq_do_dispatch_sched(struct request_queue *q,
 	} while (blk_mq_dispatch_rq_list(q, &rq_list));
 }
 
+static struct blk_mq_ctx *blk_mq_next_ctx(struct blk_mq_hw_ctx *hctx,
+					  struct blk_mq_ctx *ctx)
+{
+	unsigned idx = ctx->index_hw;
+
+	if (++idx == hctx->nr_ctx)
+		idx = 0;
+
+	return hctx->ctxs[idx];
+}
+
+static void blk_mq_do_dispatch_ctx(struct request_queue *q,
+				   struct blk_mq_hw_ctx *hctx)
+{
+	LIST_HEAD(rq_list);
+	struct blk_mq_ctx *ctx = READ_ONCE(hctx->dispatch_from);
+	bool dispatched = false;	/* read below even if nothing is dequeued */
+
+	do {
+		struct request *rq;
+
+		rq = blk_mq_dequeue_from_ctx(hctx, ctx);
+		if (!rq)
+			break;
+		list_add(&rq->queuelist, &rq_list);
+
+		/* round robin for fair dispatch */
+		ctx = blk_mq_next_ctx(hctx, rq->mq_ctx);
+
+		dispatched = blk_mq_dispatch_rq_list(q, &rq_list);
+	} while (dispatched);
+
+	if (!dispatched)
+		WRITE_ONCE(hctx->dispatch_from, ctx);
+}
+
 void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
@@ -142,18 +178,31 @@ void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 	if (!list_empty(&rq_list)) {
 		blk_mq_sched_mark_restart_hctx(hctx);
 		do_sched_dispatch = blk_mq_dispatch_rq_list(q, &rq_list);
-	} else if (!has_sched_dispatch) {
+	} else if (!has_sched_dispatch && !q->queue_depth) {
+		/*
+		 * If there is no per-request_queue depth, we
+		 * flush all requests in this hw queue; otherwise
+		 * we pick up requests one by one from the sw queue
+		 * to avoid disrupting I/O merging when dispatch
+		 * runs out of resources, which a per-request_queue
+		 * queue depth can easily trigger.
+		 */
 		blk_mq_flush_busy_ctxs(hctx, &rq_list);
 		blk_mq_dispatch_rq_list(q, &rq_list);
 	}
 
+	if (!do_sched_dispatch)
+		return;
+
 	/*
 	 * We want to dispatch from the scheduler if there was nothing
 	 * on the dispatch list or we were able to dispatch from the
 	 * dispatch list.
 	 */
-	if (do_sched_dispatch && has_sched_dispatch)
+	if (has_sched_dispatch)
 		blk_mq_do_dispatch_sched(q, e, hctx);
+	else
+		blk_mq_do_dispatch_ctx(q, hctx);
 }
 
 bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 2747469cedaf..fccabe00fb55 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -30,6 +30,8 @@ struct blk_mq_hw_ctx {
 
 	struct sbitmap		ctx_map;
 
+	struct blk_mq_ctx	*dispatch_from;
+
 	struct blk_mq_ctx	**ctxs;
 	unsigned int		nr_ctx;
 
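
For illustration only, here is a minimal user-space sketch of the
behaviour change described in the commit message. It is a toy model and
not kernel code: struct toy_hctx, NR_CTX, and every toy_* function are
hypothetical names, each sw queue is reduced to a counter, and "budget"
stands in for the free slots left by the per-request-queue queue depth.
Only the wrap-around step mirrors blk_mq_next_ctx() from the patch.

```c
#include <assert.h>

#define NR_CTX 4	/* hypothetical number of sw queues mapped to one hw queue */

/* Toy model of a hw queue: each sw queue only counts its pending requests. */
struct toy_hctx {
	unsigned int pending[NR_CTX];	/* requests waiting in each sw queue */
	unsigned int dispatch_from;	/* sw queue to try first on the next run */
	unsigned int budget;		/* free device slots (models queue_depth) */
	unsigned int stranded;		/* requests parked on the dispatch list */
};

/* Same wrap-around step as blk_mq_next_ctx() in the patch. */
static unsigned int toy_next_ctx(unsigned int idx)
{
	if (++idx == NR_CTX)
		idx = 0;
	return idx;
}

/* Old behaviour: flush every sw queue, then dispatch until the device is busy. */
static unsigned int toy_flush_all(struct toy_hctx *h)
{
	unsigned int total = 0, sent, i;

	for (i = 0; i < NR_CTX; i++) {
		total += h->pending[i];
		h->pending[i] = 0;
	}
	sent = total < h->budget ? total : h->budget;
	h->budget -= sent;
	h->stranded += total - sent;	/* the rest can no longer be merged into */
	return sent;
}

/* New behaviour: dequeue one request at a time, round robin over sw queues. */
static unsigned int toy_one_by_one(struct toy_hctx *h)
{
	unsigned int sent = 0, idx = h->dispatch_from, tries;

	assert(idx < NR_CTX);
	for (;;) {
		/* Find the next non-empty sw queue, starting at idx. */
		for (tries = 0; tries < NR_CTX && !h->pending[idx]; tries++)
			idx = toy_next_ctx(idx);
		if (!h->pending[idx])
			break;			/* nothing left to dequeue */

		h->pending[idx]--;		/* take a single request */
		idx = toy_next_ctx(idx);	/* round robin for fair dispatch */

		if (!h->budget) {		/* device busy: dispatch fails */
			h->stranded++;		/* only this request is parked */
			h->dispatch_from = idx;	/* resume from here next time */
			break;
		}
		h->budget--;
		sent++;
	}
	return sent;
}
```

With four sw queues holding three requests each and a budget of two,
toy_flush_all() sends two requests and parks the other ten on the
dispatch list, while toy_one_by_one() also sends two but parks only one,
leaving nine requests in the sw queues where they remain candidates for
merging in this model.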