Message ID | cover.1579649589.git.asml.silence@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | splice(2) support for io_uring | expand |
On 1/21/20 5:05 PM, Pavel Begunkov wrote: > It works well for basic cases, but there is still work to be done. E.g. > it misses @hash_reg_file checks for the second (output) file. Anyway, > there are some questions I want to discuss: > > - why sqe->len is __u32? Splice uses size_t, and I think it's better > to have something wider (e.g. u64) for fututre use. That's the story > behind added sqe->splice_len. IO operations in Linux generally are INT_MAX, so the u32 is plenty big. That's why I chose it. For this specifically, if you look at splice: if (unlikely(len > MAX_RW_COUNT)) len = MAX_RW_COUNT; so anything larger is truncated anyway. > - it requires 2 fds, and it's painful. Currently file managing is done > by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm > thinking to make each opcode function handle file grabbing/putting > themself with some helpers, as it's done in the patch for splice's > out-file. > 1. Opcode handler knows, whether it have/needs a file, and thus > doesn't need extra checks done in common path. > 2. It will be more consistent with splice. > Objections? Ideas? Sounds reasonable to me, but always easier to judge in patch form :-) > - do we need offset pointers with fallback to file->f_pos? Or is it > enough to have offset value. Jens, I remember you added the first > option somewhere, could you tell the reasoning? I recently added support for -1/cur position, which splice also uses. So you should be fine with that.
On 22/01/2020 04:55, Jens Axboe wrote: > On 1/21/20 5:05 PM, Pavel Begunkov wrote: >> It works well for basic cases, but there is still work to be done. E.g. >> it misses @hash_reg_file checks for the second (output) file. Anyway, >> there are some questions I want to discuss: >> >> - why sqe->len is __u32? Splice uses size_t, and I think it's better >> to have something wider (e.g. u64) for fututre use. That's the story >> behind added sqe->splice_len. > > IO operations in Linux generally are INT_MAX, so the u32 is plenty big. > That's why I chose it. For this specifically, if you look at splice: > > if (unlikely(len > MAX_RW_COUNT)) > len = MAX_RW_COUNT; > > so anything larger is truncated anyway. Yeah, I saw this one, but that was rather an argument for the future. It's pretty easy to transfer more than 4GB with sg list, but that would be the case for splice. > >> - it requires 2 fds, and it's painful. Currently file managing is done >> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm >> thinking to make each opcode function handle file grabbing/putting >> themself with some helpers, as it's done in the patch for splice's >> out-file. >> 1. Opcode handler knows, whether it have/needs a file, and thus >> doesn't need extra checks done in common path. >> 2. It will be more consistent with splice. >> Objections? Ideas? > > Sounds reasonable to me, but always easier to judge in patch form :-) > >> - do we need offset pointers with fallback to file->f_pos? Or is it >> enough to have offset value. Jens, I remember you added the first >> option somewhere, could you tell the reasoning? > > I recently added support for -1/cur position, which splice also uses. So > you should be fine with that. > I always have been thinking about it as a legacy from old days, and one of the problems of posix. It's not hard to count it in the userspace especially in C++ or high-level languages, and is just another obstacle for having a performant API. So, I'd rather get rid of it here. But is there any reasons against?
On 1/21/20 8:11 PM, Pavel Begunkov wrote: > On 22/01/2020 04:55, Jens Axboe wrote: >> On 1/21/20 5:05 PM, Pavel Begunkov wrote: >>> It works well for basic cases, but there is still work to be done. E.g. >>> it misses @hash_reg_file checks for the second (output) file. Anyway, >>> there are some questions I want to discuss: >>> >>> - why sqe->len is __u32? Splice uses size_t, and I think it's better >>> to have something wider (e.g. u64) for fututre use. That's the story >>> behind added sqe->splice_len. >> >> IO operations in Linux generally are INT_MAX, so the u32 is plenty big. >> That's why I chose it. For this specifically, if you look at splice: >> >> if (unlikely(len > MAX_RW_COUNT)) >> len = MAX_RW_COUNT; >> >> so anything larger is truncated anyway. > > Yeah, I saw this one, but that was rather an argument for the future. > It's pretty easy to transfer more than 4GB with sg list, but that > would be the case for splice. I don't see this changing, ever, basically. And probably not a big deal, if you want to do more than 2GB worth of IO, you simply splice them over multiple commands. At those sizes, the overhead there is negligible. >>> - it requires 2 fds, and it's painful. Currently file managing is done >>> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm >>> thinking to make each opcode function handle file grabbing/putting >>> themself with some helpers, as it's done in the patch for splice's >>> out-file. >>> 1. Opcode handler knows, whether it have/needs a file, and thus >>> doesn't need extra checks done in common path. >>> 2. It will be more consistent with splice. >>> Objections? Ideas? >> >> Sounds reasonable to me, but always easier to judge in patch form :-) >> >>> - do we need offset pointers with fallback to file->f_pos? Or is it >>> enough to have offset value. Jens, I remember you added the first >>> option somewhere, could you tell the reasoning? >> >> I recently added support for -1/cur position, which splice also uses. So >> you should be fine with that. >> > > I always have been thinking about it as a legacy from old days, and > one of the problems of posix. It's not hard to count it in the > userspace especially in C++ or high-level languages, and is just > another obstacle for having a performant API. So, I'd rather get rid > of it here. But is there any reasons against? It's not always trivial to do in libraries, or programming languages even. That's why it exists. I would not expect anyone to use it outside of that.