From patchwork Sat Jan 13 11:05:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Paolo Valente X-Patchwork-Id: 10162181 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D5D83600CA for ; Sat, 13 Jan 2018 11:06:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C3728287BE for ; Sat, 13 Jan 2018 11:06:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B7E0728A9A; Sat, 13 Jan 2018 11:06:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3929628AAD for ; Sat, 13 Jan 2018 11:06:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934221AbeAMLFl (ORCPT ); Sat, 13 Jan 2018 06:05:41 -0500 Received: from mail-wr0-f193.google.com ([209.85.128.193]:38205 "EHLO mail-wr0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933524AbeAMLFg (ORCPT ); Sat, 13 Jan 2018 06:05:36 -0500 Received: by mail-wr0-f193.google.com with SMTP id x1so3265803wrb.5 for ; Sat, 13 Jan 2018 03:05:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tPmlQPbD1RJPmkRgIqkRKPwGMCuKL5Tn9KHHaG7Kr+M=; b=PXV6aMaGwmnXTifnRl+ndI/48AiyG+8kfZ3whLZsyB5R21vh4WDHJrRcfDGd6yhb++ ORS+gVNrk4+CC5QXWh9BE135vKyhQrF67GYSkR5H5kblH7tSAzzNGN0kKsjIprU8Zxz9 qfpDQY+CvD/25dsMBEVN3w00NahmqSWLerhw8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tPmlQPbD1RJPmkRgIqkRKPwGMCuKL5Tn9KHHaG7Kr+M=; b=Clv9A8LfODPLEMkbHfNgs+b+AK5POEPDM/JQrgbApNEZ34qpT9rozA+mx91ImeLPl/ HKFlUVGi/eDgBgGHAoPCSRzyKkavqvr0AtafkF1lhtB9vdZJ7K1zMuHRAoqSMANi0MfQ badeRnAIf+N3Dt4CkkDVE+KWH8LQrJ/BEujd3JotT9fspvjuQcDJAkGzTXcyUR9jJm3X hxMEPwSWSDhNKpR9PinhIdoeUgMna9D2/hdKOb5hUjrKT5363eot3OnMiV6nx4y5g7lk 3BdbeZSFrFTfp8UDHGAbYckZ/re6ImbokqD45M5I/Y6O8Iox6QY3emU+8/ZGmlFeyxVm VVDg== X-Gm-Message-State: AKwxytdrYQvXV2AKt72N5CzFZ+LXU0WZyv2VgEd8uE/aiew0/OhZzgCj 73DjLXU+CFIbV+cQMfft5WRfmA== X-Google-Smtp-Source: ACJfBosnVPz0TthGzh3N/FimPjuGcQg0e/qM8pxAAWlkzyq41ml2ytHVVHpFwpvrMLgv6TKkKpeuQw== X-Received: by 10.223.166.41 with SMTP id k38mr6996442wrc.242.1515841534797; Sat, 13 Jan 2018 03:05:34 -0800 (PST) Received: from localhost.localdomain (146-241-36-39.dyn.eolo.it. [146.241.36.39]) by smtp.gmail.com with ESMTPSA id o18sm20747276wrg.59.2018.01.13.03.05.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 13 Jan 2018 03:05:33 -0800 (PST) From: Paolo Valente To: Jens Axboe Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org, bfq-iosched@googlegroups.com, oleksandr@natalenko.name, Paolo Valente Subject: [PATCH BUGFIX/IMPROVEMENT 1/2] block, bfq: limit tags for writes and async I/O Date: Sat, 13 Jan 2018 12:05:17 +0100 Message-Id: <20180113110518.2519-2-paolo.valente@linaro.org> X-Mailer: git-send-email 2.15.1 In-Reply-To: <20180113110518.2519-1-paolo.valente@linaro.org> References: <20180113110518.2519-1-paolo.valente@linaro.org> MIME-Version: 1.0 Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Asynchronous I/O can easily starve synchronous I/O (both sync reads and sync writes), by consuming all request tags. Similarly, storms of synchronous writes, such as those that sync(2) may trigger, can starve synchronous reads. In their turn, these two problems may also cause BFQ to loose control on latency for interactive and soft real-time applications. For example, on a PLEXTOR PX-256M5S SSD, LibreOffice Writer takes 0.6 seconds to start if the device is idle, but it takes more than 45 seconds (!) if there are sequential writes in the background. This commit addresses this issue by limiting the maximum percentage of tags that asynchronous I/O requests and synchronous write requests can consume. In particular, this commit grants a higher threshold to synchronous writes, to prevent the latter from being starved by asynchronous I/O. According to the above test, LibreOffice Writer now starts in about 1.2 seconds on average, regardless of the background workload, and apart from some rare outlier. To check this improvement, run, e.g., sudo ./comm_startup_lat.sh bfq 5 5 seq 10 "lowriter --terminate_after_init" for the comm_startup_lat benchmark in the S suite [1]. [1] https://github.com/Algodev-github/S Tested-by: Oleksandr Natalenko Tested-by: Holger Hoffstätte Signed-off-by: Paolo Valente --- block/bfq-iosched.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++ block/bfq-iosched.h | 12 +++++++++ 2 files changed, 89 insertions(+) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 1caeecad7af1..527bd2ccda51 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -417,6 +417,82 @@ static struct request *bfq_choose_req(struct bfq_data *bfqd, } } +/* + * See the comments on bfq_limit_depth for the purpose of + * the depths set in the function. + */ +static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt) +{ + bfqd->sb_shift = bt->sb.shift; + + /* + * In-word depths if no bfq_queue is being weight-raised: + * leaving 25% of tags only for sync reads. + * + * In next formulas, right-shift the value + * (1U<sb_shift), instead of computing directly + * (1U<<(bfqd->sb_shift - something)), to be robust against + * any possible value of bfqd->sb_shift, without having to + * limit 'something'. + */ + /* no more than 50% of tags for async I/O */ + bfqd->word_depths[0][0] = max((1U<sb_shift)>>1, 1U); + /* + * no more than 75% of tags for sync writes (25% extra tags + * w.r.t. async I/O, to prevent async I/O from starving sync + * writes) + */ + bfqd->word_depths[0][1] = max(((1U<sb_shift) * 3)>>2, 1U); + + /* + * In-word depths in case some bfq_queue is being weight- + * raised: leaving ~63% of tags for sync reads. This is the + * highest percentage for which, in our tests, application + * start-up times didn't suffer from any regression due to tag + * shortage. + */ + /* no more than ~18% of tags for async I/O */ + bfqd->word_depths[1][0] = max(((1U<sb_shift) * 3)>>4, 1U); + /* no more than ~37% of tags for sync writes (~20% extra tags) */ + bfqd->word_depths[1][1] = max(((1U<sb_shift) * 6)>>4, 1U); +} + +/* + * Async I/O can easily starve sync I/O (both sync reads and sync + * writes), by consuming all tags. Similarly, storms of sync writes, + * such as those that sync(2) may trigger, can starve sync reads. + * Limit depths of async I/O and sync writes so as to counter both + * problems. + */ +static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data) +{ + struct blk_mq_tags *tags = blk_mq_tags_from_data(data); + struct bfq_data *bfqd = data->q->elevator->elevator_data; + struct sbitmap_queue *bt; + + if (op_is_sync(op) && !op_is_write(op)) + return; + + if (data->flags & BLK_MQ_REQ_RESERVED) { + if (unlikely(!tags->nr_reserved_tags)) { + WARN_ON_ONCE(1); + return; + } + bt = &tags->breserved_tags; + } else + bt = &tags->bitmap_tags; + + if (unlikely(bfqd->sb_shift != bt->sb.shift)) + bfq_update_depths(bfqd, bt); + + data->shallow_depth = + bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)]; + + bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u", + __func__, bfqd->wr_busy_queues, op_is_sync(op), + data->shallow_depth); +} + static struct bfq_queue * bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root, sector_t sector, struct rb_node **ret_parent, @@ -5267,6 +5343,7 @@ static struct elv_fs_entry bfq_attrs[] = { static struct elevator_type iosched_bfq_mq = { .ops.mq = { + .limit_depth = bfq_limit_depth, .prepare_request = bfq_prepare_request, .finish_request = bfq_finish_request, .exit_icq = bfq_exit_icq, diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index 5d47b58d5fc8..fcd941008127 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -629,6 +629,18 @@ struct bfq_data { struct bfq_io_cq *bio_bic; /* bfqq associated with the task issuing current bio for merging */ struct bfq_queue *bio_bfqq; + + /* + * Cached sbitmap shift, used to compute depth limits in + * bfq_update_depths. + */ + unsigned int sb_shift; + + /* + * Depth limits used in bfq_limit_depth (see comments on the + * function) + */ + unsigned int word_depths[2][2]; }; enum bfqq_state_flags {