
[v4,0/2] block/dm: support bio polling

Message ID 20220304212623.34016-1-snitzer@redhat.com

Message

Mike Snitzer March 4, 2022, 9:26 p.m. UTC
Hi,

I've rebased Ming's latest [1] on top of dm-5.18 [2] (which is based on
for-5.18/block). The end result is available in the dm-5.18-biopoll branch [3].

These changes add bio polling support to DM.  Tested with linear and
striped DM targets.

The IOPS improvement was ~5% on my bare-metal system with a single Intel
Optane NVMe device (555K IOPS with hipri=1 vs 525K with hipri=0).

Ming has seen bigger improvements while testing within a VM:
 dm-linear: hipri=1 vs hipri=0: 15-20% IOPS improvement
 dm-stripe: hipri=1 vs hipri=0: ~30% IOPS improvement

I'd like to merge these changes via the DM tree when the 5.18 merge
window opens.  The first block patch, which adds ->poll_bio to
block_device_operations, will need review so that I can take it
through the DM tree.  The reason for going through the DM tree is that
there have been some fairly extensive changes queued in dm-5.18 that
build on for-5.18/block, so I think it is easiest to just add the block
dependency via the DM tree, since DM is the first consumer of ->poll_bio.
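
To give reviewers a feel for the shape of that block patch without
opening it, the hook amounts to roughly the following (a simplified
sketch of the idea, not the exact diff; please see patch 1 for the real
thing):

  /* include/linux/blkdev.h: new optional method, only meaningful for
   * bio-based drivers that want to support polled IO. */
  struct block_device_operations {
          void (*submit_bio)(struct bio *bio);
          int (*poll_bio)(struct bio *bio, struct io_comp_batch *iob,
                          unsigned int flags);
          /* ... existing methods ... */
  };

  /* block/blk-core.c: bio_poll() routes polling to the driver's
   * ->poll_bio() when the queue is not blk-mq, i.e. the device is
   * bio-based. */
  if (queue_is_mq(q))
          ret = blk_mq_poll(q, cookie, iob, flags);
  else if (disk->fops->poll_bio)
          ret = disk->fops->poll_bio(bio, iob, flags);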

FYI, Ming does have another DM patch [4] that looks to avoid using
hlist, but I only just saw it.  bio_split() _is_ involved (see
dm_split_and_process_bio()), so I'm not exactly sure where he is going
with that change.  But that is a DM-implementation detail that we'll
sort out.  The big thing is that we need approval for the first block
patch to go to Linus with the DM tree ;)

Thanks,
Mike

[1] https://github.com/ming1/linux/commits/my_v5.18-dm-bio-poll
[2] https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.18
[3] https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.18-biopoll
[4] https://github.com/ming1/linux/commit/c107c30e15041ac1ce672f56809961406e2a3e52

Ming Lei (2):
  block: add ->poll_bio to block_device_operations
  dm: support bio polling

 block/blk-core.c       |  12 +++-
 block/genhd.c          |   2 +
 drivers/md/dm-core.h   |   2 +
 drivers/md/dm-table.c  |  27 +++++++++
 drivers/md/dm.c        | 150 ++++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/blkdev.h |   2 +
 6 files changed, 189 insertions(+), 6 deletions(-)

Comments

Ming Lei March 5, 2022, 1:43 a.m. UTC | #1
On Fri, Mar 04, 2022 at 04:26:21PM -0500, Mike Snitzer wrote:
> Hi,
> 
> I've rebased Ming's latest [1] on top of dm-5.18 [2] (which is based on
> for-5.18/block). The end result is available in the dm-5.18-biopoll branch [3].
> 
> These changes add bio polling support to DM.  Tested with linear and
> striped DM targets.
> 
> The IOPS improvement was ~5% on my bare-metal system with a single Intel
> Optane NVMe device (555K IOPS with hipri=1 vs 525K with hipri=0).
> 
> Ming has seen bigger improvements while testing within a VM:
>  dm-linear: hipri=1 vs hipri=0: 15-20% IOPS improvement
>  dm-stripe: hipri=1 vs hipri=0: ~30% IOPS improvement
> 
> I'd like to merge these changes via the DM tree when the 5.18 merge
> window opens.  The first block patch, which adds ->poll_bio to
> block_device_operations, will need review so that I can take it
> through the DM tree.  The reason for going through the DM tree is that
> there have been some fairly extensive changes queued in dm-5.18 that
> build on for-5.18/block, so I think it is easiest to just add the block
> dependency via the DM tree, since DM is the first consumer of ->poll_bio.
> 
> FYI, Ming does have another DM patch [4] that looks to avoid using
> hlist, but I only just saw it.  bio_split() _is_ involved (see
> dm_split_and_process_bio()), so I'm not exactly sure where he is going
> with that change.

io_uring (polling) workloads usually care about latency, so large IO
requests aren't usually involved, I guess.  That means bio_split() is
seldom called from dm_split_and_process_bio(); for example, if 4k random
IO is run on dm-linear or dm-stripe via io_uring, bio_split() won't be
hit at all.

A singly linked list is enough here, and more efficient than an hlist;
it just needs a little care when deleting an element from the list,
since the Linux kernel doesn't have a generic singly linked list
implementation.
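
Something like the following is what I have in mind for the delete path
(a rough sketch only; the struct/field names are made up to illustrate
the point, they aren't taken from the patch):

  /* Walk a hand-rolled singly linked list with a pointer-to-pointer so
   * removing the head needs no special casing.  'next' is assumed to be
   * a field added to struct dm_io just for this list. */
  static void dm_io_list_del(struct dm_io **head, struct dm_io *io)
  {
          struct dm_io **pprev;

          for (pprev = head; *pprev; pprev = &(*pprev)->next) {
                  if (*pprev == io) {
                          *pprev = io->next;
                          io->next = NULL;
                          break;
                  }
          }
  }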

> But that is a DM-implementation detail that we'll
> sort out.

Yeah, that patch also needs more testing.


Thanks, 
Ming
Mike Snitzer March 5, 2022, 2:14 a.m. UTC | #2
On Fri, Mar 04 2022 at  8:43P -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Fri, Mar 04, 2022 at 04:26:21PM -0500, Mike Snitzer wrote:
> > Hi,
> > 
> > I've rebased Ming's latest [1] on top of dm-5.18 [2] (which is based on
> > for-5.18/block). The end result is available in the dm-5.18-biopoll branch [3].
> > 
> > These changes add bio polling support to DM.  Tested with linear and
> > striped DM targets.
> > 
> > The IOPS improvement was ~5% on my bare-metal system with a single Intel
> > Optane NVMe device (555K IOPS with hipri=1 vs 525K with hipri=0).
> > 
> > Ming has seen bigger improvements while testing within a VM:
> >  dm-linear: hipri=1 vs hipri=0: 15-20% IOPS improvement
> >  dm-stripe: hipri=1 vs hipri=0: ~30% IOPS improvement
> > 
> > I'd like to merge these changes via the DM tree when the 5.18 merge
> > window opens.  The first block patch, which adds ->poll_bio to
> > block_device_operations, will need review so that I can take it
> > through the DM tree.  The reason for going through the DM tree is that
> > there have been some fairly extensive changes queued in dm-5.18 that
> > build on for-5.18/block, so I think it is easiest to just add the block
> > dependency via the DM tree, since DM is the first consumer of ->poll_bio.
> > 
> > FYI, Ming does have another DM patch [4] that looks to avoid using
> > hlist, but I only just saw it.  bio_split() _is_ involved (see
> > dm_split_and_process_bio()), so I'm not exactly sure where he is going
> > with that change.
> 
> io_uring (polling) workloads usually care about latency, so large IO
> requests aren't usually involved, I guess.  That means bio_split() is
> seldom called from dm_split_and_process_bio(); for example, if 4k random
> IO is run on dm-linear or dm-stripe via io_uring, bio_split() won't be
> hit at all.
> 
> A singly linked list is enough here, and more efficient than an hlist;
> it just needs a little care when deleting an element from the list,
> since the Linux kernel doesn't have a generic singly linked list
> implementation.

OK, makes sense, thanks for clarifying.  But yeah, it's a bit fiddly for sure.

> > But that is a DM-implementation detail that we'll
> > sort out.
> 
> Yeah, that patch also needs more testing.

Yeap, sounds good.