[0/4] io_uring: add IORING_OP_READ[WRITE]_SPLICE_BUF

Message ID 20230210153212.733006-1-ming.lei@redhat.com (mailing list archive)

Message

Ming Lei Feb. 10, 2023, 3:32 p.m. UTC
Hello,

Add two OPs whose buffer is retrieved via kernel splice, to support
fuse/ublk zero copy.

The 1st patch enhances direct pipe & splice for moving pages in kernel,
so that the two added OPs can't be misused and a potential security
hole is avoided.

The 2nd patch allows the caller of splice_direct_to_actor() to ignore
signals if the actor won't block and completes in bounded time.

The 3rd patch adds the two OPs.

The 4th patch implements ublk's ->splice_read() for supporting
zero copy.

ublksrv (userspace):

https://github.com/ming1/ubdsrv/commits/io_uring_splice_buf
    
So far, only the loop and null targets implement zero copy in the above branch:
    
	ublk add -t loop -f $file -z
	ublk add -t none -z

Basic FS/IO functionality is verified: mount, kernel building & fio
work fine, and large-block IO (BS: 64k/512k) performance improves
noticeably.
 
Any comments are welcome!

Ming Lei (4):
  fs/splice: enhance direct pipe & splice for moving pages in kernel
  fs/splice: allow to ignore signal in __splice_from_pipe
  io_uring: add IORING_OP_READ[WRITE]_SPLICE_BUF
  ublk_drv: support splice based read/write zero copy

 drivers/block/ublk_drv.c      | 169 +++++++++++++++++++++++++++++++--
 fs/splice.c                   |  19 +++-
 include/linux/pipe_fs_i.h     |  10 ++
 include/linux/splice.h        |  23 +++++
 include/uapi/linux/io_uring.h |   2 +
 include/uapi/linux/ublk_cmd.h |  31 +++++-
 io_uring/opdef.c              |  37 ++++++++
 io_uring/rw.c                 | 174 +++++++++++++++++++++++++++++++++-
 io_uring/rw.h                 |   1 +
 9 files changed, 456 insertions(+), 10 deletions(-)

Comments

Jens Axboe Feb. 10, 2023, 9:54 p.m. UTC | #1
On 2/10/23 8:32 AM, Ming Lei wrote:
> Hello,
> 
> Add two OPs which buffer is retrieved via kernel splice for supporting
> fuse/ublk zero copy.
> 
> The 1st patch enhances direct pipe & splice for moving pages in kernel,
> so that the two added OPs won't be misused, and avoid potential security
> hole.
> 
> The 2nd patch allows splice_direct_to_actor() caller to ignore signal
> if the actor won't block and can be done in bound time.
> 
> The 3rd patch add the two OPs.
> 
> The 4th patch implements ublk's ->splice_read() for supporting
> zero copy.
> 
> ublksrv(userspace):
> 
> https://github.com/ming1/ubdsrv/commits/io_uring_splice_buf
>     
> So far, only loop/null target implements zero copy in above branch:
>     
> 	ublk add -t loop -f $file -z
> 	ublk add -t none -z
> 
> Basic FS/IO function is verified, mount/kernel building & fio
> works fine, and big chunk IO(BS: 64k/512k) performance gets improved
> obviously.

Do you have any performance numbers? Also curious on liburing regression
tests, would be nice to see as it helps with review.

Caveat - haven't looked into this in detail just yet.

Jens Axboe Feb. 10, 2023, 10:19 p.m. UTC | #2
On 2/10/23 2:54 PM, Jens Axboe wrote:
> On 2/10/23 8:32 AM, Ming Lei wrote:
>> Hello,
>>
>> Add two OPs which buffer is retrieved via kernel splice for supporting
>> fuse/ublk zero copy.
>>
>> The 1st patch enhances direct pipe & splice for moving pages in kernel,
>> so that the two added OPs won't be misused, and avoid potential security
>> hole.
>>
>> The 2nd patch allows splice_direct_to_actor() caller to ignore signal
>> if the actor won't block and can be done in bound time.
>>
>> The 3rd patch add the two OPs.
>>
>> The 4th patch implements ublk's ->splice_read() for supporting
>> zero copy.
>>
>> ublksrv(userspace):
>>
>> https://github.com/ming1/ubdsrv/commits/io_uring_splice_buf
>>     
>> So far, only loop/null target implements zero copy in above branch:
>>     
>> 	ublk add -t loop -f $file -z
>> 	ublk add -t none -z
>>
>> Basic FS/IO function is verified, mount/kernel building & fio
>> works fine, and big chunk IO(BS: 64k/512k) performance gets improved
>> obviously.
> 
> Do you have any performance numbers? Also curious on liburing regression
> tests, would be nice to see as it helps with review.
> 
> Caveat - haven't looked into this in detail just yet.

Also see the recent splice/whatever discussion, might be something
that's relevant here, particularly if we can avoid splice:

https://lore.kernel.org/io-uring/0cfd9f02-dea7-90e2-e932-c8129b6013c7@samba.org/

It's long...

Ming Lei Feb. 11, 2023, 5:13 a.m. UTC | #3
On Fri, Feb 10, 2023 at 02:54:29PM -0700, Jens Axboe wrote:
> On 2/10/23 8:32 AM, Ming Lei wrote:
> > Hello,
> > 
> > Add two OPs which buffer is retrieved via kernel splice for supporting
> > fuse/ublk zero copy.
> > 
> > The 1st patch enhances direct pipe & splice for moving pages in kernel,
> > so that the two added OPs won't be misused, and avoid potential security
> > hole.
> > 
> > The 2nd patch allows splice_direct_to_actor() caller to ignore signal
> > if the actor won't block and can be done in bound time.
> > 
> > The 3rd patch add the two OPs.
> > 
> > The 4th patch implements ublk's ->splice_read() for supporting
> > zero copy.
> > 
> > ublksrv(userspace):
> > 
> > https://github.com/ming1/ubdsrv/commits/io_uring_splice_buf
> >     
> > So far, only loop/null target implements zero copy in above branch:
> >     
> > 	ublk add -t loop -f $file -z
> > 	ublk add -t none -z
> > 
> > Basic FS/IO function is verified, mount/kernel building & fio
> > works fine, and big chunk IO(BS: 64k/512k) performance gets improved
> > obviously.
> 
> Do you have any performance numbers?

A simple test on ublk-loop over an image file on btrfs shows an
improvement of 100% to 200%.

> Also curious on liburing regression
> tests, would be nice to see as it helps with review.

It isn't easy, since so far it requires a ublk device; the operation
looks like a read into, or a write from, a device buffer.

Thanks,
Ming

Jens Axboe Feb. 11, 2023, 3:45 p.m. UTC | #4
On 2/10/23 10:13 PM, Ming Lei wrote:
> On Fri, Feb 10, 2023 at 02:54:29PM -0700, Jens Axboe wrote:
>> On 2/10/23 8:32 AM, Ming Lei wrote:
>>> Hello,
>>>
>>> Add two OPs which buffer is retrieved via kernel splice for supporting
>>> fuse/ublk zero copy.
>>>
>>> The 1st patch enhances direct pipe & splice for moving pages in kernel,
>>> so that the two added OPs won't be misused, and avoid potential security
>>> hole.
>>>
>>> The 2nd patch allows splice_direct_to_actor() caller to ignore signal
>>> if the actor won't block and can be done in bound time.
>>>
>>> The 3rd patch add the two OPs.
>>>
>>> The 4th patch implements ublk's ->splice_read() for supporting
>>> zero copy.
>>>
>>> ublksrv(userspace):
>>>
>>> https://github.com/ming1/ubdsrv/commits/io_uring_splice_buf
>>>     
>>> So far, only loop/null target implements zero copy in above branch:
>>>     
>>> 	ublk add -t loop -f $file -z
>>> 	ublk add -t none -z
>>>
>>> Basic FS/IO function is verified, mount/kernel building & fio
>>> works fine, and big chunk IO(BS: 64k/512k) performance gets improved
>>> obviously.
>>
>> Do you have any performance numbers?
> 
> Simple test on ublk-loop over image in btrfs shows the improvement is
> 100% ~ 200%.

That is pretty tasty...

>> Also curious on liburing regression
>> tests, would be nice to see as it helps with review.
> 
> It isn't easy since it requires ublk device so far, it looks like
> read to/write from one device buffer.

It can't be tested without ublk itself? Surely the two newly added ops can
have separate test cases?

Stefan Hajnoczi Feb. 14, 2023, 4:36 p.m. UTC | #5
On Fri, Feb 10, 2023 at 11:32:08PM +0800, Ming Lei wrote:
> Hello,
> 
> Add two OPs which buffer is retrieved via kernel splice for supporting
> fuse/ublk zero copy.
> 
> The 1st patch enhances direct pipe & splice for moving pages in kernel,
> so that the two added OPs won't be misused, and avoid potential security
> hole.
> 
> The 2nd patch allows splice_direct_to_actor() caller to ignore signal
> if the actor won't block and can be done in bound time.
> 
> The 3rd patch add the two OPs.
> 
> The 4th patch implements ublk's ->splice_read() for supporting
> zero copy.
> 
> ublksrv(userspace):
> 
> https://github.com/ming1/ubdsrv/commits/io_uring_splice_buf
>     
> So far, only loop/null target implements zero copy in above branch:
>     
> 	ublk add -t loop -f $file -z
> 	ublk add -t none -z
> 
> Basic FS/IO function is verified, mount/kernel building & fio
> works fine, and big chunk IO(BS: 64k/512k) performance gets improved
> obviously.
>  
> Any comment is welcome!

I'm not familiar enough with the splice implementation to review these
patches, but the performance numbers you posted look great. This could
be very nice for ublk and FUSE servers!

Thanks,
Stefan

> Ming Lei (4):
>   fs/splice: enhance direct pipe & splice for moving pages in kernel
>   fs/splice: allow to ignore signal in __splice_from_pipe
>   io_uring: add IORING_OP_READ[WRITE]_SPLICE_BUF
>   ublk_drv: support splice based read/write zero copy
> 
>  drivers/block/ublk_drv.c      | 169 +++++++++++++++++++++++++++++++--
>  fs/splice.c                   |  19 +++-
>  include/linux/pipe_fs_i.h     |  10 ++
>  include/linux/splice.h        |  23 +++++
>  include/uapi/linux/io_uring.h |   2 +
>  include/uapi/linux/ublk_cmd.h |  31 +++++-
>  io_uring/opdef.c              |  37 ++++++++
>  io_uring/rw.c                 | 174 +++++++++++++++++++++++++++++++++-
>  io_uring/rw.h                 |   1 +
>  9 files changed, 456 insertions(+), 10 deletions(-)
> 
> -- 
> 2.31.1
>