From patchwork Tue Dec 21 12:31:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kashyap Desai
X-Patchwork-Id: 12689593
From: Kashyap Desai
To: axboe@kernel.dk
Cc: Kashyap Desai, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, john.garry@huawei.com, ming.lei@redhat.com,
 sathya.prakash@broadcom.com
Subject: [PATCH RFT] blk-mq: optimize queue tag busy iter for shared_tags
Date: Tue, 21 Dec 2021 18:01:57 +0530
Message-Id: <20211221123157.14052-1-kashyap.desai@broadcom.com>
X-Mailer: git-send-email 2.18.1
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

In [0], CPU usage for blk_mq_queue_tag_busy_iter() was optimized, but
blk_mq_queue_tag_busy_iter() is still called periodically from the context
below. The block layer timer uses this context to find potentially expired
commands (per request queue), which requires a tag iteration almost every
5 seconds (defined by BLK_MAX_TIMEOUT) for each request queue.

kthread
  worker_thread
    process_one_work
      blk_mq_timeout_work
        blk_mq_queue_tag_busy_iter
          bt_iter
            blk_mq_find_and_get_req
              _raw_spin_lock_irqsave
                native_queued_spin_lock_slowpath

This patch avoids the extra tag iterations in the shared_tags case. A
single iteration over the shared tags gives the iterate function the
expected results.

Setup - AMD64 Gen-4.0 Server.
64 Virtual Drives created using 16 NVMe drives + mpi3mr driver (in
shared_tags mode)

Test command -
fio 64.fio --rw=randread --bs=4K --iodepth=32 --numjobs=2
--ioscheduler=mq-deadline --disk_util=0

Without this patch on 5.16.0-rc5, the mpi3mr driver in shared_tags mode
reaches 4.0M IOPs, versus the expected ~6.0M.

Snippet of perf top

    25.42%  [kernel]  [k] native_queued_spin_lock_slowpath
     3.95%  [kernel]  [k] cpupri_set
     2.05%  [kernel]  [k] __blk_mq_get_driver_tag
     1.67%  [kernel]  [k] __rcu_read_unlock
     1.63%  [kernel]  [k] check_preemption_disabled

After applying this patch on 5.16.0-rc5, the mpi3mr driver in shared_tags
mode reaches up to 5.8M IOPs.

Snippet of perf top

     7.95%  [kernel]  [k] native_queued_spin_lock_slowpath
     5.61%  [kernel]  [k] cpupri_set
     2.98%  [kernel]  [k] acpi_processor_ffh_cstate_enter
     2.49%  [kernel]  [k] read_tsc
     2.15%  [kernel]  [k] check_preemption_disabled

[0] https://lore.kernel.org/all/9b092ca49e9b5415772cd950a3c12584@mail.gmail.com/

Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: john.garry@huawei.com
Cc: ming.lei@redhat.com
Cc: sathya.prakash@broadcom.com
Signed-off-by: Kashyap Desai
---
 block/blk-mq-tag.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 995336abee33..3e0a8e79f966 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -253,7 +253,8 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
 	if (!rq)
 		return true;
 
-	if (rq->q == hctx->queue && rq->mq_hctx == hctx)
+	if (rq->q == hctx->queue && (rq->mq_hctx == hctx ||
+				blk_mq_is_shared_tags(hctx->flags)))
 		ret = iter_data->fn(hctx, rq, iter_data->data, reserved);
 	blk_mq_put_rq_ref(rq);
 	return ret;
@@ -484,6 +485,14 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
 		if (tags->nr_reserved_tags)
 			bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
 		bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
+
+		/* In case of shared bitmap if shared_tags is allocated, it is not required
+		 * to iterate all the hctx. Looping one hctx is good enough.
+		 */
+		if (blk_mq_is_shared_tags(hctx->flags)) {
+			blk_queue_exit(q);
+			return;
+		}
 	}
 	blk_queue_exit(q);
 }
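
As a rough userspace illustration (not kernel code) of why one pass can be enough when every hardware queue shares a single tag space, the sketch below models one request queue's shared tags as a plain array. All names here (toy_rq, toy_scan, toy_scan_per_hctx, toy_scan_once) are hypothetical and exist only for this example. It assumes that each busy tag costs one request lookup, which stands in for the lock acquisition seen under blk_mq_find_and_get_req in the trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a request: owning hardware queue and in-flight state. */
struct toy_rq {
	int hctx;	/* index of the hardware queue that issued the request */
	bool busy;	/* true if the tag is currently allocated */
};

/* Cost and result of one timeout scan over the shared tag space. */
struct toy_scan {
	unsigned int lookups;	/* request lookups, each modelling a lock grab */
	unsigned int visited;	/* requests handed to the iterate callback */
};

/*
 * Unpatched behaviour (modelled): walk the shared tags once per hardware
 * queue and keep only the requests owned by that queue.
 */
struct toy_scan toy_scan_per_hctx(const struct toy_rq *tags, size_t nr_tags,
				  int nr_hctx)
{
	struct toy_scan s = { 0, 0 };

	assert(tags != NULL);
	for (int h = 0; h < nr_hctx; h++) {
		for (size_t t = 0; t < nr_tags; t++) {
			if (!tags[t].busy)
				continue;	/* clear bit: no lookup needed */
			s.lookups++;
			if (tags[t].hctx == h)
				s.visited++;
		}
	}
	return s;
}

/*
 * Patched behaviour (modelled): walk the shared tags once, accept every
 * busy request regardless of its owning queue, then stop.
 */
struct toy_scan toy_scan_once(const struct toy_rq *tags, size_t nr_tags)
{
	struct toy_scan s = { 0, 0 };

	assert(tags != NULL);
	for (size_t t = 0; t < nr_tags; t++) {
		if (!tags[t].busy)
			continue;
		s.lookups++;
		s.visited++;
	}
	return s;
}
```

In this model both scans hand the same set of requests to the callback, while the per-queue scan performs one lookup per busy tag for every hardware queue. The real iterator also filters on rq->q, which the sketch leaves out by modelling a single request queue.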