From patchwork Tue May  9 10:54:23 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente <paolo.valente@linaro.org>
X-Patchwork-Id: 9717625
From: Paolo Valente <paolo.valente@linaro.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	ulf.hansson@linaro.org, linus.walleij@linaro.org,
	broonie@kernel.org, Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH BUGFIX] block, bfq: stress that low_latency must be off to get max throughput
Date: Tue,  9 May 2017 12:54:23 +0200
Message-Id: <20170509105423.3672-1-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.10.0

The introduction of the BFQ and Kyber I/O schedulers has triggered a
new wave of I/O benchmarks. Unfortunately, comments and discussions on
these benchmarks confirm that there is still little awareness that it
is very hard to achieve low latency and high throughput at the same
time. In particular, virtually all benchmarks measure throughput, or
throughput-related figures of merit, yet they run BFQ in its default
configuration, which is geared toward low latency instead. This is
evidently a sign that the BFQ documentation is still too unclear on
this important aspect. This commit addresses the issue by stressing
that the BFQ configuration must be changed, and how easily this can be
done, if the only goal is maximum throughput.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 Documentation/block/bfq-iosched.txt | 17 ++++++++++++++++-
 block/bfq-iosched.c                 |  5 +++++
 2 files changed, 21 insertions(+), 1 deletion(-)
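
For reference, the tuning that the new text recommends can be sketched
as the shell function below. This is only an illustration, not part of
the patch: it assumes that bfq is the active scheduler for the device
and that its tunables are exposed under
/sys/block/<device>/queue/iosched/. The function name
bfq_max_throughput, the SYSFS_ROOT override (useful for dry runs
against a scratch directory) and the "noidle" argument are all
hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch: tune BFQ on one device for maximum throughput.
# Hypothetical helper, not part of the kernel tree.

bfq_max_throughput() {
    dev="$1"                        # block device name, e.g. sda
    root="${SYSFS_ROOT:-/sys}"      # assumption: override only for dry runs
    iosched="$root/block/$dev/queue/iosched"

    # The low_latency file is present only while bfq is the scheduler
    # in use for this device.
    if [ ! -e "$iosched/low_latency" ]; then
        echo "bfq tunables not found under $iosched" >&2
        return 1
    fi

    # Switch off all low-latency heuristics for this device.
    echo 0 > "$iosched/low_latency" || return 1

    # Optionally disable idling too. On a non-rotational device this may
    # be needed for the highest throughput, at the cost of giving up any
    # strong guarantee on fairness and low latency.
    if [ "${2:-}" = "noidle" ]; then
        echo 0 > "$iosched/slice_idle" || return 1
    fi
    return 0
}
```

Run as root, "bfq_max_throughput sda" would switch off only the
low-latency heuristics for sda (a placeholder device name), while
"bfq_max_throughput sda noidle" would also set slice_idle to 0.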

diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt
index 1b87df6..05e2822 100644
--- a/Documentation/block/bfq-iosched.txt
+++ b/Documentation/block/bfq-iosched.txt
@@ -11,6 +11,13 @@ controllers), BFQ's main features are:
   groups (switching back to time distribution when needed to keep
   throughput high).
 
+In its default configuration, BFQ favors low latency over high
+throughput. So, when needed for achieving a lower latency, BFQ builds
+schedules that may lead to a lower throughput. If your main or only
+goal, for a given device, is to achieve the maximum-possible
+throughput at all times, then switch off all low-latency heuristics
+for that device, by setting low_latency to 0. Full details in Section 3.
+
 On average CPUs, the current version of BFQ can handle devices
 performing at most ~30K IOPS; at most ~50 KIOPS on faster CPUs. As a
 reference, 30-50 KIOPS correspond to very high bandwidths with
@@ -375,11 +382,19 @@ default, low latency mode is enabled. If enabled, interactive and soft
 real-time applications are privileged and experience a lower latency,
 as explained in more detail in the description of how BFQ works.
 
-DO NOT enable this mode if you need full control on bandwidth
+DISABLE this mode if you need full control over bandwidth
 distribution. In fact, if it is enabled, then BFQ automatically
 increases the bandwidth share of privileged applications, as the main
 means to guarantee a lower latency to them.
 
+In addition, as already highlighted at the beginning of this document,
+DISABLE this mode if your only goal is to achieve a high throughput.
+In fact, privileging the I/O of some application over the rest may
+entail a lower throughput. To achieve the highest-possible throughput
+on a non-rotational device, setting slice_idle to 0 may be needed too
+(at the cost of giving up any strong guarantee on fairness and low
+latency).
+
 timeout_sync
 ------------
 
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index bd8499e..08ce450 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -56,6 +56,11 @@
  * rotational or flash-based devices, and to get the job done quickly
  * for applications consisting in many I/O-bound processes.
  *
+ * NOTE: if the main or only goal, with a given device, is to achieve
+ * the maximum-possible throughput at all times, then switch off
+ * all low-latency heuristics for that device, by setting low_latency
+ * to 0.
+ *
  * BFQ is described in [1], where also a reference to the initial, more
  * theoretical paper on BFQ can be found. The interested reader can find
  * in the latter paper full details on the main algorithm, as well as