[RFC,v3,0/3] introduce io_uring_cmd_import_fixed_vec

Message ID 20250315172319.16770-1-sidong.yang@furiosa.ai (mailing list archive)

Message

Sidong Yang March 15, 2025, 5:23 p.m. UTC
This patch series introduces io_uring_cmd_import_fixed_vec(). With this function,
multiple fixed buffers can be used in a uring cmd. It is the vectored version
of io_uring_cmd_import_fixed(). The series also includes a user of the new
API: encoded read/write in btrfs via uring cmd.

The benchmark showed a performance improvement of approximately 10 percent.
The benchmark code is in
https://github.com/SidongYang/btrfs-encoded-io-test/blob/main/main.c

./main -l
Elapsed time: 0.598997 seconds
./main -l -f
Elapsed time: 0.540332 seconds
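
For reference, the "approximately 10 percent" figure can be checked from the two timings above. The snippet below is only a sanity check of the arithmetic; the helper name relative_improvement is made up for illustration and is not part of the benchmark code.

```python
# Timings copied from the two runs quoted above.
baseline = 0.598997  # ./main -l
fixed = 0.540332     # ./main -l -f


def relative_improvement(before: float, after: float) -> float:
    """Return the fraction of the baseline time saved by the faster run."""
    return (before - after) / before


# Prints the saving as a percentage with one decimal place.
print(f"{relative_improvement(baseline, fixed):.1%}")
```

This works out to roughly 9.8 percent for this single pair of runs.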

v2:
 - don't export iou_vec, use bvec for btrfs
 - use io_is_compat for checking compat
 - reduce allocation/free for import fixed vec

v3:
 - add iou_vec cache in io_uring_cmd and use it
 - also support fixed buffers for encoded write

Sidong Yang (3):
  io-uring/cmd: add iou_vec field for io_uring_cmd
  io-uring/cmd: introduce io_uring_cmd_import_fixed_vec
  btrfs: ioctl: introduce btrfs_uring_import_iovec()

 fs/btrfs/ioctl.c             | 32 +++++++++++++++++++++--------
 include/linux/io_uring/cmd.h | 15 ++++++++++++++
 io_uring/io_uring.c          |  2 +-
 io_uring/opdef.c             |  1 +
 io_uring/uring_cmd.c         | 39 ++++++++++++++++++++++++++++++++++++
 io_uring/uring_cmd.h         |  3 +++
 6 files changed, 83 insertions(+), 9 deletions(-)

Comments

Pavel Begunkov March 16, 2025, 7:22 a.m. UTC | #1
On 3/15/25 17:23, Sidong Yang wrote:
> This patche series introduce io_uring_cmd_import_vec. With this function,
> Multiple fixed buffer could be used in uring cmd. It's vectored version
> for io_uring_cmd_import_fixed(). Also this patch series includes a usage
> for new api for encoded read/write in btrfs by using uring cmd.
> 
> There was approximately 10 percent of performance improvements through benchmark.
> The benchmark code is in
> https://github.com/SidongYang/btrfs-encoded-io-test/blob/main/main.c
> 
> ./main -l
> Elapsed time: 0.598997 seconds
> ./main -l -f
> Elapsed time: 0.540332 seconds

It's probably precise, but it's usually hard to judge
performance from such short runs. Mark, do we have some benchmark
for the io_uring cmd?

Mark Harmstone March 17, 2025, 10:32 a.m. UTC | #2
On 16/3/25 07:22, Pavel Begunkov wrote:
> > 
> On 3/15/25 17:23, Sidong Yang wrote:
>> This patche series introduce io_uring_cmd_import_vec. With this function,
>> Multiple fixed buffer could be used in uring cmd. It's vectored version
>> for io_uring_cmd_import_fixed(). Also this patch series includes a usage
>> for new api for encoded read/write in btrfs by using uring cmd.
>>
>> There was approximately 10 percent of performance improvements through 
>> benchmark.
>> The benchmark code is in
>> https://github.com/SidongYang/btrfs-encoded-io-test/blob/main/main.c
>> ./main -l
>> Elapsed time: 0.598997 seconds
>> ./main -l -f
>> Elapsed time: 0.540332 seconds
> 
> It's probably precise, but it's usually hard to judge about
> performance from such short runs. Mark, do we have some benchmark
> for the io_uring cmd?

Unfortunately not. My plan was to plug it in to btrfs-receive and to use 
that as a benchmark, but it turned out that the limiting factor there 
was the dentry locking.

Sidong, Pavel is right - your figures would be more useful if you ran it 
1,000 times or so.
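
One way to collect figures over many runs is sketched below. This is a hypothetical harness, not part of the series or of the benchmark repository: it assumes the benchmark binary prints a line of the form "Elapsed time: N seconds" as in the output quoted above, and the function names (parse_elapsed, run_many, summarize) are invented for illustration.

```python
import re
import statistics
import subprocess

# Matches the "Elapsed time: 0.598997 seconds" line printed by the benchmark.
ELAPSED_RE = re.compile(r"Elapsed time:\s*([0-9.]+)\s*seconds")


def parse_elapsed(output: str) -> float:
    """Extract the elapsed time in seconds from one run's output."""
    match = ELAPSED_RE.search(output)
    if match is None:
        raise ValueError("no 'Elapsed time' line in output")
    return float(match.group(1))


def run_many(cmd: list[str], runs: int) -> list[float]:
    """Run cmd the given number of times and collect the reported timings."""
    times = []
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        times.append(parse_elapsed(result.stdout))
    return times


def summarize(times: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of the timings."""
    return statistics.mean(times), statistics.stdev(times)


# Example use (assumes ./main exists in the current directory):
#   mean, stdev = summarize(run_many(["./main", "-l", "-f"], 1000))
#   print(f"mean {mean:.6f} s, stdev {stdev:.6f} s")
```

Reporting the standard deviation alongside the mean makes it easier to tell whether a difference between the two modes is larger than the run-to-run noise.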

Mark

Sidong Yang March 17, 2025, 1:56 p.m. UTC | #3
On Mon, Mar 17, 2025 at 10:32:02AM +0000, Mark Harmstone wrote:
> On 16/3/25 07:22, Pavel Begunkov wrote:
> > > 
> > On 3/15/25 17:23, Sidong Yang wrote:
> >> This patche series introduce io_uring_cmd_import_vec. With this function,
> >> Multiple fixed buffer could be used in uring cmd. It's vectored version
> >> for io_uring_cmd_import_fixed(). Also this patch series includes a usage
> >> for new api for encoded read/write in btrfs by using uring cmd.
> >>
> >> There was approximately 10 percent of performance improvements through 
> >> benchmark.
> >> The benchmark code is in
> >> https://github.com/SidongYang/btrfs-encoded-io-test/blob/main/main.c
> >> ./main -l
> >> Elapsed time: 0.598997 seconds
> >> ./main -l -f
> >> Elapsed time: 0.540332 seconds
> > 
> > It's probably precise, but it's usually hard to judge about
> > performance from such short runs. Mark, do we have some benchmark
> > for the io_uring cmd?
> 
> Unfortunately not. My plan was to plug it in to btrfs-receive and to use 
> that as a benchmark, but it turned out that the limiting factor there 
> was the dentry locking.
> 
> Sidong, Pavel is right - your figures would be more useful if you ran it 
> 1,000 times or so.

Yes, figures from a large number of repetitions would be more useful.

Thanks,
Sidong

> 
> Mark