[V4,0/2] mmc: hsq: dynamically adjust hsq_depth to improve performance

Message ID 20230919074707.25517-1-wenchao.chen@unisoc.com (mailing list archive)

Message

Wenchao Chen Sept. 19, 2023, 7:47 a.m. UTC
Change in v4:
- Remove "struct hsq_slot *slot" and simplify the code.
- In general, need_change has to be greater than or equal to 2 to allow the threshold
  to be adjusted.

Change in v3:
- Use "mrq->data->blksz * mrq->data->blocks == 4096" for 4K.
- Add explanation for "HSQ_PERFORMANCE_DEPTH".

Change in v2:
- Support for dynamic adjustment of hsq_depth.

Test
=====
I ran each case 3 times and report the average speed.
Ran 'fio' to evaluate the performance:
1.Fixed hsq_depth
1) Sequential write:
Speed: 168 164 165
Average speed: 165.67MB/S

2) Sequential read:
Speed: 326 326 326
Average speed: 326MB/S

3) Random write:
Speed: 82.6 83 83
Average speed: 82.87MB/S

4) Random read:
Speed: 48.2 48.3 47.6
Average speed: 48.03MB/S

2.Dynamic hsq_depth
1) Sequential write:
Speed: 167 166 166
Average speed: 166.33MB/S

2) Sequential read:
Speed: 327 326 326
Average speed: 326.3MB/S

3) Random write:
Speed: 86.1 86.2 87.7
Average speed: 86.67MB/S

4) Random read:
Speed: 48.1 48 48
Average speed: 48.03MB/S

Based on the above data, dynamic hsq_depth improves 4K random write performance
by about 4.6% (82.87MB/S -> 86.67MB/S). The other cases are essentially unchanged.
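
For reference, the averages and the 4.6% figure can be recomputed with a few
lines of C. This is only a sketch of the arithmetic; the helper names mean()
and gain_percent() are made up for illustration and are not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples; n must be greater than zero. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change from base to val, expressed in percent. */
static double gain_percent(double base, double val)
{
	return (val - base) / base * 100.0;
}
```

Feeding in the 4K random write samples (82.6, 83, 83 for fixed and
86.1, 86.2, 87.7 for dynamic) gives means of about 82.87 and 86.67, and
gain_percent() returns about 4.6.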

In addition, we tested 8K and 16K.
1.Fixed hsq_depth
1) Random write(bs=8K):
Speed: 116 114 115
Average speed: 115MB/S

2) Random read(bs=8K):
Speed: 83 83 82.5
Average speed: 82.8MB/S

3) Random write(bs=16K):
Speed: 141 142 141
Average speed: 141.3MB/S

4) Random read(bs=16K):
Speed: 132 132 132
Average speed: 132MB/S

2.Dynamic hsq_depth(mrq->data->blksz * mrq->data->blocks == 8192 or 16384)
1) Random write(bs=8K):
Speed: 115 115 115
Average speed: 115MB/S

2) Random read(bs=8K):
Speed: 82.7 82.9 82.8
Average speed: 82.8MB/S

3) Random write(bs=16K):
Speed: 143 141 141
Average speed: 141.7MB/S

4) Random read(bs=16K):
Speed: 132 132 132
Average speed: 132MB/S

Increasing hsq_depth did not improve 8K and 16K random read/write performance.
To keep latency low, we therefore increase hsq_depth dynamically only for 4K random writes.
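
The selection rule described in the changelog can be modelled in user space
roughly as follows. This is a simplified sketch, not the code from the patch:
struct req, pick_depth() and the two depth values are assumptions made for
illustration. Only the "blksz * blocks == 4096" check and the
"need_change >= 2" condition come from the notes above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only; see mmc_hsq.h for the real ones. */
#define NORMAL_DEPTH       2
#define PERFORMANCE_DEPTH  5

/* Hypothetical stand-in for a request sitting in a queue slot. */
struct req {
	bool   queued;    /* slot currently holds a request */
	bool   is_write;  /* data direction is write */
	size_t bytes;     /* blksz * blocks */
};

/*
 * Pick the deeper queue only when at least two queued requests are
 * 4K writes; otherwise fall back to the normal depth.
 */
static unsigned int pick_depth(const struct req *slots, size_t nslots)
{
	unsigned int need_change = 0;

	assert(slots != NULL || nslots == 0);
	for (size_t i = 0; i < nslots; i++) {
		const struct req *r = &slots[i];

		if (r->queued && r->is_write && r->bytes == 4096 &&
		    ++need_change >= 2)
			return PERFORMANCE_DEPTH;
	}
	return NORMAL_DEPTH;
}
```

A single 4K write, or a mix of reads and larger writes, leaves the depth at
the normal value, so workloads that do not benefit do not pay extra latency.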

Test cmd
=========
1)write: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=write -bs=512K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
2)read: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=read -bs=512K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
3)randwrite: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randwrite -bs=4K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
4)randread: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randread -bs=4K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
5)randwrite: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randwrite -bs=8K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
6)randread: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randread -bs=8K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
7)randwrite: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randwrite -bs=16K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64
8)randread: fio -filename=/dev/mmcblk0p72 -direct=1 -rw=randread -bs=16K -size=512M -group_reporting -name=test -numjobs=8 -thread -iodepth=64

Wenchao Chen (2):
  mmc: queue: replace immediate with hsq->depth
  mmc: hsq: dynamic adjustment of hsq->depth

 drivers/mmc/core/queue.c   |  6 +-----
 drivers/mmc/host/mmc_hsq.c | 22 ++++++++++++++++++++++
 drivers/mmc/host/mmc_hsq.h | 11 +++++++++++
 include/linux/mmc/host.h   |  1 +
 4 files changed, 35 insertions(+), 5 deletions(-)

Comments

Wenchao Chen Sept. 26, 2023, 10:54 a.m. UTC | #1
A gentle ping.

Ulf Hansson Sept. 26, 2023, 3:03 p.m. UTC | #2
The series has been applied for next! I took the liberty of making some updates
to the commit messages and a small adjustment of the code in patch2.
Please have a look and yell at me if there is something that looks
odd!

Thanks and kind regards
Uffe