[V13,REBASED,0/8] block, bfq: extend bfq to support multi-actuator drives

Message ID 20230103145503.71712-1-paolo.valente@linaro.org (mailing list archive)

Message

Paolo Valente Jan. 3, 2023, 2:54 p.m. UTC
Hi,
rebased V13 [2].

Here is the whole description of this patch series again.  This
extension addresses the following issue. Single-LUN multi-actuator
SCSI drives, as well as all multi-actuator SATA drives, appear as a
single device to the I/O subsystem [1].  Yet they address commands to
different actuators internally, as a function of Logical Block
Addresses (LBAs). A given sector is reachable by only one of the
actuators. For example, Seagate’s Serial Advanced Technology
Attachment (SATA) version contains two actuators and maps the lower
half of the SATA LBA space to the lower actuator and the upper half to
the upper actuator.
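
The LBA-to-actuator mapping described above can be sketched as follows. This is only an illustration under stated assumptions, not code from this series: `struct actuator_range` and `actuator_for_sector()` are hypothetical names, and the sketch assumes each actuator serves one contiguous sector range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one actuator's LBA range (not a kernel type). */
struct actuator_range {
	uint64_t first_sector;	/* first sector served by this actuator */
	uint64_t nr_sectors;	/* number of sectors in the range */
};

/*
 * Return the index of the actuator whose range contains 'sector',
 * or -1 if no range contains it.
 */
static int actuator_for_sector(const struct actuator_range *ranges,
			       size_t nr_ranges, uint64_t sector)
{
	assert(ranges != NULL);

	for (size_t i = 0; i < nr_ranges; i++) {
		uint64_t start = ranges[i].first_sector;

		/* Use a subtraction so that start + nr_sectors cannot overflow. */
		if (sector >= start && sector - start < ranges[i].nr_sectors)
			return (int)i;
	}
	return -1;
}
```

For a two-actuator drive that splits its LBA space in half, the table would hold two entries of equal size; every sector in the lower half maps to index 0 and every sector in the upper half to index 1.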

Evidently, to fully utilize actuators, no actuator must be left idle
or underutilized while there is pending I/O for it. To reach this
goal, the block layer must somehow control the load of each actuator
individually. This series enriches BFQ with such a per-actuator
control, as a first step. Then it also adds a simple mechanism for
guaranteeing that actuators with pending I/O are never left idle.
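
As a rough illustration of what per-actuator load control could look like, the sketch below picks the least-loaded actuator that still has pending I/O. It is not the policy implemented by this series: `struct actuator_state`, `pick_actuator_to_inject()` and the `load_threshold` parameter are hypothetical, introduced only to make the idea concrete.

```c
#include <stddef.h>

/* Hypothetical per-actuator counters (not BFQ's actual bookkeeping). */
struct actuator_state {
	unsigned int in_flight;	/* requests dispatched but not yet completed */
	unsigned int pending;	/* requests queued but not yet dispatched */
};

/*
 * Return the index of the actuator to serve next: among actuators that
 * have pending I/O and fewer than 'load_threshold' requests in flight,
 * pick the one with the fewest requests in flight (lowest index wins
 * ties). Return -1 if no actuator qualifies.
 */
static int pick_actuator_to_inject(const struct actuator_state *act,
				   size_t nr_actuators,
				   unsigned int load_threshold)
{
	int best = -1;

	for (size_t i = 0; i < nr_actuators; i++) {
		/* Skip actuators with nothing to do or already loaded enough. */
		if (act[i].pending == 0 || act[i].in_flight >= load_threshold)
			continue;

		if (best < 0 || act[i].in_flight < act[best].in_flight)
			best = (int)i;
	}
	return best;
}
```

Choosing the minimum rather than the first qualifying actuator is one simple way to spread extra dispatches across actuators instead of always favoring the same one.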

See [1] for a more detailed overview of the problem and of the
solutions implemented in this patch series. There you will also find
some preliminary performance results.

Thanks,
Paolo

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/
[2] https://lore.kernel.org/lkml/20221229203707.68458-8-paolo.valente@linaro.org/t/

Davide Zini (3):
  block, bfq: split also async bfq_queues on a per-actuator basis
  block, bfq: inject I/O to underutilized actuators
  block, bfq: balance I/O injection among underutilized actuators

Federico Gavioli (1):
  block, bfq: retrieve independent access ranges from request queue

Paolo Valente (4):
  block, bfq: split sync bfq_queues on a per-actuator basis
  block, bfq: forbid stable merging of queues associated with different
    actuators
  block, bfq: move io_cq-persistent bfqq data into a dedicated struct
  block, bfq: turn bfqq_data into an array in bfq_io_cq

 block/bfq-cgroup.c  |  93 +++----
 block/bfq-iosched.c | 587 ++++++++++++++++++++++++++++++--------------
 block/bfq-iosched.h | 142 ++++++++---
 block/bfq-wf2q.c    |   2 +-
 4 files changed, 565 insertions(+), 259 deletions(-)

--
2.20.1

Comments

Jens Axboe Jan. 4, 2023, 6:05 p.m. UTC | #1
On Tue, 03 Jan 2023 15:54:55 +0100, Paolo Valente wrote:
> rebased V13 [2].
> 
> Here is the whole description of this patch series again.  This
> extension addresses the following issue. Single-LUN multi-actuator
> SCSI drives, as well as all multi-actuator SATA drives appear as a
> single device to the I/O subsystem [1].  Yet they address commands to
> different actuators internally, as a function of Logical Block
> Addressing (LBAs). A given sector is reachable by only one of the
> actuators. For example, Seagate’s Serial Advanced Technology
> Attachment (SATA) version contains two actuators and maps the lower
> half of the SATA LBA space to the lower actuator and the upper half to
> the upper actuator.
> 
> [...]

Applied, thanks!

[1/8] block, bfq: split sync bfq_queues on a per-actuator basis
      commit: abc653033297fb39c097f9e18cc4ab42a5c00a23
[2/8] block, bfq: forbid stable merging of queues associated with different actuators
      commit: d591f14a59ed700caff6db734ecf558387d38f35
[3/8] block, bfq: move io_cq-persistent bfqq data into a dedicated struct
      commit: d85fed150b4efadf01ea3d12ba78285f6720f583
[4/8] block, bfq: turn bfqq_data into an array in bfq_io_cq
      commit: 7cf744815a3cd94591b0227f3c63f533f3402a47
[5/8] block, bfq: split also async bfq_queues on a per-actuator basis
      commit: 8249909fe789d7dc50f6749bbdf440d69ac46ac1
[6/8] block, bfq: retrieve independent access ranges from request queue
      commit: b3d9aece342834ef3840b55a99a11dc82b1f96cc
[7/8] block, bfq: inject I/O to underutilized actuators
      commit: 3f40467eb5ec1e4f383daff7f93c7494e7881fee
[8/8] block, bfq: balance I/O injection among underutilized actuators
      commit: dd9b66eb9ed5c0e58098c336cb8e6329590564be

Best regards,

Jan Kara Jan. 16, 2023, 1:03 p.m. UTC | #2
Hi Paolo!

On Tue 03-01-23 15:54:55, Paolo Valente wrote:
> Here is the whole description of this patch series again.  This
> extension addresses the following issue. Single-LUN multi-actuator
> SCSI drives, as well as all multi-actuator SATA drives appear as a
> single device to the I/O subsystem [1].  Yet they address commands to
> different actuators internally, as a function of Logical Block
> Addressing (LBAs). A given sector is reachable by only one of the
> actuators. For example, Seagate’s Serial Advanced Technology
> Attachment (SATA) version contains two actuators and maps the lower
> half of the SATA LBA space to the lower actuator and the upper half to
> the upper actuator.
> 
> Evidently, to fully utilize actuators, no actuator must be left idle
> or underutilized while there is pending I/O for it. To reach this
> goal, the block layer must somehow control the load of each actuator
> individually. This series enriches BFQ with such a per-actuator
> control, as a first step. Then it also adds a simple mechanism for
> guaranteeing that actuators with pending I/O are never left idle.
> 
> See [1] for a more detailed overview of the problem and of the
> solutions implemented in this patch series. There you will also find
> some preliminary performance results.

Sorry, I didn't find time to look into this earlier. I've just had a
high-level look into the patches and I have one question: Did you consider
a solution where you'd basically duplicate all of the scheduling for each
actuator (thus making them effectively independent devices from the point
of view of BFQ)? At first glance, that looks like a somewhat simpler
solution than splitting all the BFQ queues and implementing a special
injection mechanism for other actuators, and it might lead to better
utilization of the actuators. OTOH the latency and QoS for tasks using
multiple actuators would be probably worse because it would be basically
determined by the busiest of the actuators. So I'm asking mostly out of
curiosity :)

								Honza

Paolo Valente Jan. 16, 2023, 6:09 p.m. UTC | #3
> On 16 Jan 2023, at 14:03, Jan Kara <jack@suse.cz> wrote:
> 
> Hi Paolo!
> 
> On Tue 03-01-23 15:54:55, Paolo Valente wrote:
>> Here is the whole description of this patch series again.  This
>> extension addresses the following issue. Single-LUN multi-actuator
>> SCSI drives, as well as all multi-actuator SATA drives appear as a
>> single device to the I/O subsystem [1].  Yet they address commands to
>> different actuators internally, as a function of Logical Block
>> Addressing (LBAs). A given sector is reachable by only one of the
>> actuators. For example, Seagate’s Serial Advanced Technology
>> Attachment (SATA) version contains two actuators and maps the lower
>> half of the SATA LBA space to the lower actuator and the upper half to
>> the upper actuator.
>> 
>> Evidently, to fully utilize actuators, no actuator must be left idle
>> or underutilized while there is pending I/O for it. To reach this
>> goal, the block layer must somehow control the load of each actuator
>> individually. This series enriches BFQ with such a per-actuator
>> control, as a first step. Then it also adds a simple mechanism for
>> guaranteeing that actuators with pending I/O are never left idle.
>> 
>> See [1] for a more detailed overview of the problem and of the
>> solutions implemented in this patch series. There you will also find
>> some preliminary performance results.
> 
> Sorry, I didn't find time to look into this earlier. I've just had a
> high-level look into the patches and I have one question: Did you consider
> a solution where you'd basically duplicate all of the scheduling for each
> actuator (thus making them effectively independent devices from the point
> of view of BFQ)?

Yep, I did.

> At first glance, that looks like a somewhat simpler
> solution than splitting all the BFQ queues and implementing a special
> injection mechanism for other actuators, and it might lead to better
> utilization of the actuators. OTOH the latency and QoS for tasks using
> multiple actuators would be probably worse because it would be basically
> determined by the busiest of the actuators.

Exactly, that's why I had to keep all queues in the same bucket.

Thanks for both asking and answering! :)

Jokes apart, thank you a lot for having a look at this contribution,
Paolo

> So I'm asking mostly out of
> curiosity :)
> 
> 								Honza
> 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR