Message ID | cover.1629559905.git.asml.silence@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | open/accept directly into io_uring fixed file table | expand |
On 8/21/21 9:52 AM, Pavel Begunkov wrote: > Add an optional feature to open/accept directly into io_uring's fixed > file table bypassing the normal file table. Same behaviour if as the > snippet below, but in one operation: > > sqe = prep_[open,accept](...); > cqe = submit_and_wait(sqe); > io_uring_register_files_update(uring_idx, (fd = cqe->res)); > close((fd = cqe->res)); > > The idea in pretty old, and was brough up and implemented a year ago > by Josh Triplett, though haven't sought the light for some reasons. > > The behaviour is controlled by setting sqe->file_index, where 0 implies > the old behaviour. If non-zero value is specified, then it will behave > as described and place the file into a fixed file slot > sqe->file_index - 1. A file table should be already created, the slot > should be valid and empty, otherwise the operation will fail. > > we can't use IOSQE_FIXED_FILE to switch between modes, because accept > takes a file, and it already uses the flag with a different meaning. > > since RFC: > - added attribution > - updated descriptions > - rebased > > since v1: > - EBADF if slot is already used (Josh Triplett) > - alias index with splice_fd_in (Josh Triplett) > - fix a bound check bug With the prep series, this looks good to me now. Josh, what do you think? And we need the net folks to sign off on the first patch, of course.
On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote: > On 8/21/21 9:52 AM, Pavel Begunkov wrote: > > Add an optional feature to open/accept directly into io_uring's fixed > > file table bypassing the normal file table. Same behaviour if as the > > snippet below, but in one operation: > > > > sqe = prep_[open,accept](...); > > cqe = submit_and_wait(sqe); > > io_uring_register_files_update(uring_idx, (fd = cqe->res)); > > close((fd = cqe->res)); > > > > The idea in pretty old, and was brough up and implemented a year ago > > by Josh Triplett, though haven't sought the light for some reasons. > > > > The behaviour is controlled by setting sqe->file_index, where 0 implies > > the old behaviour. If non-zero value is specified, then it will behave > > as described and place the file into a fixed file slot > > sqe->file_index - 1. A file table should be already created, the slot > > should be valid and empty, otherwise the operation will fail. > > > > we can't use IOSQE_FIXED_FILE to switch between modes, because accept > > takes a file, and it already uses the flag with a different meaning. > > > > since RFC: > > - added attribution > > - updated descriptions > > - rebased > > > > since v1: > > - EBADF if slot is already used (Josh Triplett) > > - alias index with splice_fd_in (Josh Triplett) > > - fix a bound check bug > > With the prep series, this looks good to me now. Josh, what do you > think? I would still like to see this using a union with the `nofile` field in io_open and io_accept, rather than overloading the 16-bit buf_index field. That would avoid truncating to 16 bits, and make less work for expansion to more than 16 bits of fixed file indexes. (I'd also like that to actually use a union, rather than overloading the meaning of buf_index/nofile.) I personally still feel that using non-zero to signify index-plus-one is both error-prone and not as future-compatible. I think we could do better with no additional overhead. But I think the final call on that interface is up to you, Jens. Do you think it'd be worth spending a flag bit or using a different opcode, to get a cleaner interface? If you don't, then I'd be fine with seeing this go in with just the io_open and io_accept change. - Josh Triplett
On 8/23/21 1:13 PM, Josh Triplett wrote: > On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote: >> On 8/21/21 9:52 AM, Pavel Begunkov wrote: >>> Add an optional feature to open/accept directly into io_uring's fixed >>> file table bypassing the normal file table. Same behaviour if as the >>> snippet below, but in one operation: >>> >>> sqe = prep_[open,accept](...); >>> cqe = submit_and_wait(sqe); >>> io_uring_register_files_update(uring_idx, (fd = cqe->res)); >>> close((fd = cqe->res)); >>> >>> The idea in pretty old, and was brough up and implemented a year ago >>> by Josh Triplett, though haven't sought the light for some reasons. >>> >>> The behaviour is controlled by setting sqe->file_index, where 0 implies >>> the old behaviour. If non-zero value is specified, then it will behave >>> as described and place the file into a fixed file slot >>> sqe->file_index - 1. A file table should be already created, the slot >>> should be valid and empty, otherwise the operation will fail. >>> >>> we can't use IOSQE_FIXED_FILE to switch between modes, because accept >>> takes a file, and it already uses the flag with a different meaning. >>> >>> since RFC: >>> - added attribution >>> - updated descriptions >>> - rebased >>> >>> since v1: >>> - EBADF if slot is already used (Josh Triplett) >>> - alias index with splice_fd_in (Josh Triplett) >>> - fix a bound check bug >> >> With the prep series, this looks good to me now. Josh, what do you >> think? > > I would still like to see this using a union with the `nofile` field in > io_open and io_accept, rather than overloading the 16-bit buf_index > field. That would avoid truncating to 16 bits, and make less work for > expansion to more than 16 bits of fixed file indexes. > > (I'd also like that to actually use a union, rather than overloading the > meaning of buf_index/nofile.) Agree, and in fact there's room in the open and accept command parts, so we can just make it a separate entry there instead of using ->buf_index. Then just pass in the index to io_install_fixed_file() instead of having it pull it from req->buf_index. > I personally still feel that using non-zero to signify index-plus-one is > both error-prone and not as future-compatible. I think we could do > better with no additional overhead. But I think the final call on that > interface is up to you, Jens. Do you think it'd be worth spending a flag > bit or using a different opcode, to get a cleaner interface? If you > don't, then I'd be fine with seeing this go in with just the io_open and > io_accept change. I'd be inclined to go the extra opcode route instead, as the flag only really would make sense to requests that instantiate file descriptors. For this particular case, we'd need 3 new opcodes for openat/openat2/accept, which is probably a worthwhile expenditure. Pavel, what do you think? Switch to using a different opcode for the new requests, and just grab some space in io_open and io_accept for the fd and pass it in to install. I do think that'd end up being less hackish and easier to grok for a user.
On 8/23/21 8:40 PM, Jens Axboe wrote: > On 8/23/21 1:13 PM, Josh Triplett wrote: >> On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote: >>> On 8/21/21 9:52 AM, Pavel Begunkov wrote: >>>> Add an optional feature to open/accept directly into io_uring's fixed >>>> file table bypassing the normal file table. Same behaviour if as the >>>> snippet below, but in one operation: >>>> >>>> sqe = prep_[open,accept](...); >>>> cqe = submit_and_wait(sqe); >>>> io_uring_register_files_update(uring_idx, (fd = cqe->res)); >>>> close((fd = cqe->res)); >>>> >>>> The idea in pretty old, and was brough up and implemented a year ago >>>> by Josh Triplett, though haven't sought the light for some reasons. >>>> >>>> The behaviour is controlled by setting sqe->file_index, where 0 implies >>>> the old behaviour. If non-zero value is specified, then it will behave >>>> as described and place the file into a fixed file slot >>>> sqe->file_index - 1. A file table should be already created, the slot >>>> should be valid and empty, otherwise the operation will fail. >>>> >>>> we can't use IOSQE_FIXED_FILE to switch between modes, because accept >>>> takes a file, and it already uses the flag with a different meaning. >>>> >>>> since RFC: >>>> - added attribution >>>> - updated descriptions >>>> - rebased >>>> >>>> since v1: >>>> - EBADF if slot is already used (Josh Triplett) >>>> - alias index with splice_fd_in (Josh Triplett) >>>> - fix a bound check bug >>> >>> With the prep series, this looks good to me now. Josh, what do you >>> think? >> >> I would still like to see this using a union with the `nofile` field in >> io_open and io_accept, rather than overloading the 16-bit buf_index >> field. That would avoid truncating to 16 bits, and make less work for >> expansion to more than 16 bits of fixed file indexes. >> >> (I'd also like that to actually use a union, rather than overloading the >> meaning of buf_index/nofile.) > > Agree, and in fact there's room in the open and accept command parts, so > we can just make it a separate entry there instead of using ->buf_index. > Then just pass in the index to io_install_fixed_file() instead of having > it pull it from req->buf_index. That's internal details, can be expanded at wish in the future, if we'd ever need larger tables. ->buf_index already holds indexes to different resources just fine. Aliasing with nofile would rather be ugly, so the only option, as you mentioned, is to grab some space from open/accept structs, but don't see why we'd want it when there is a more convenient alternative. >> I personally still feel that using non-zero to signify index-plus-one is >> both error-prone and not as future-compatible. I think we could do >> better with no additional overhead. But I think the final call on that >> interface is up to you, Jens. Do you think it'd be worth spending a flag >> bit or using a different opcode, to get a cleaner interface? If you >> don't, then I'd be fine with seeing this go in with just the io_open and >> io_accept change. > > I'd be inclined to go the extra opcode route instead, as the flag only > really would make sense to requests that instantiate file descriptors. > For this particular case, we'd need 3 new opcodes for > openat/openat2/accept, which is probably a worthwhile expenditure. > > Pavel, what do you think? Switch to using a different opcode for the new > requests, and just grab some space in io_open and io_accept for the fd > and pass it in to install. I don't get it, why it's even called hackish? How that's anyhow better? To me the feature looks like a natural extension to the operations, just like a read can be tuned with flags, so and creating new opcodes seems a bit ugly, unnecessary taking space from opcodes and adding duplication (even if both versions call the same handler). First, why it's not future-compatible? It's a serious argument, but I don't see where it came from. Do I miss something? It's u32 now, and so will easily cover all indexes. SQE fields should always be zeroed, that's a rule, liburing follows it, and there would have been already lots of problems for users not honoring it. And there will be a helper hiding all the index conversions for convenience. void io_uring_prep_open_direct(sqe, index, ...) { io_uring_prep_open(sqe, ...); sqe->file_index = index + 1; }
On 8/24/21 3:48 AM, Pavel Begunkov wrote: > On 8/23/21 8:40 PM, Jens Axboe wrote: >> On 8/23/21 1:13 PM, Josh Triplett wrote: >>> On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote: >>>> On 8/21/21 9:52 AM, Pavel Begunkov wrote: >>>>> Add an optional feature to open/accept directly into io_uring's fixed >>>>> file table bypassing the normal file table. Same behaviour if as the >>>>> snippet below, but in one operation: >>>>> >>>>> sqe = prep_[open,accept](...); >>>>> cqe = submit_and_wait(sqe); >>>>> io_uring_register_files_update(uring_idx, (fd = cqe->res)); >>>>> close((fd = cqe->res)); >>>>> >>>>> The idea in pretty old, and was brough up and implemented a year ago >>>>> by Josh Triplett, though haven't sought the light for some reasons. >>>>> >>>>> The behaviour is controlled by setting sqe->file_index, where 0 implies >>>>> the old behaviour. If non-zero value is specified, then it will behave >>>>> as described and place the file into a fixed file slot >>>>> sqe->file_index - 1. A file table should be already created, the slot >>>>> should be valid and empty, otherwise the operation will fail. >>>>> >>>>> we can't use IOSQE_FIXED_FILE to switch between modes, because accept >>>>> takes a file, and it already uses the flag with a different meaning. >>>>> >>>>> since RFC: >>>>> - added attribution >>>>> - updated descriptions >>>>> - rebased >>>>> >>>>> since v1: >>>>> - EBADF if slot is already used (Josh Triplett) >>>>> - alias index with splice_fd_in (Josh Triplett) >>>>> - fix a bound check bug >>>> >>>> With the prep series, this looks good to me now. Josh, what do you >>>> think? >>> >>> I would still like to see this using a union with the `nofile` field in >>> io_open and io_accept, rather than overloading the 16-bit buf_index >>> field. That would avoid truncating to 16 bits, and make less work for >>> expansion to more than 16 bits of fixed file indexes. >>> >>> (I'd also like that to actually use a union, rather than overloading the >>> meaning of buf_index/nofile.) >> >> Agree, and in fact there's room in the open and accept command parts, so >> we can just make it a separate entry there instead of using ->buf_index. >> Then just pass in the index to io_install_fixed_file() instead of having >> it pull it from req->buf_index. > > That's internal details, can be expanded at wish in the future, if we'd > ever need larger tables. ->buf_index already holds indexes to different > resources just fine. Sure it's internal and can always be changed, doesn't change the fact that it's a bit iffy that it's used differently in different spots. As it costs us nothing to simply add a 'fixed_file' u32 for io_accept and io_open, I really think that should be done instead. > Aliasing with nofile would rather be ugly, so the only option, as you > mentioned, is to grab some space from open/accept structs, but don't see > why we'd want it when there is a more convenient alternative. Because it's a lot more readable and less error prone imho. Agree on the union, we don't have to resort to that. >>> I personally still feel that using non-zero to signify index-plus-one is >>> both error-prone and not as future-compatible. I think we could do >>> better with no additional overhead. But I think the final call on that >>> interface is up to you, Jens. Do you think it'd be worth spending a flag >>> bit or using a different opcode, to get a cleaner interface? If you >>> don't, then I'd be fine with seeing this go in with just the io_open and >>> io_accept change. >> >> I'd be inclined to go the extra opcode route instead, as the flag only >> really would make sense to requests that instantiate file descriptors. >> For this particular case, we'd need 3 new opcodes for >> openat/openat2/accept, which is probably a worthwhile expenditure. >> >> Pavel, what do you think? Switch to using a different opcode for the new >> requests, and just grab some space in io_open and io_accept for the fd >> and pass it in to install. > > I don't get it, why it's even called hackish? How that's anyhow better? > To me the feature looks like a natural extension to the operations, just > like a read can be tuned with flags, so and creating new opcodes seems > a bit ugly, unnecessary taking space from opcodes and adding duplication > (even if both versions call the same handler). I agree that it's a natural extension, the problem is that we have to do unnatural things (somewhat) to make it work. I'm fine with using the union for the splice_fd_in to pass it in, I don't think it's a big deal. I do wish that IORING_OP_CLOSE would work with them, though. I think we should to that as a followup patch. It's a bit odd to be able to open a file with IORING_OP_OPENAT and not being able to close it with IORING_OP_CLOSE. For the latter, we should just give it fixed file support, which would be pretty trivial. > First, why it's not future-compatible? It's a serious argument, but I > don't see where it came from. Do I miss something? > > It's u32 now, and so will easily cover all indexes. SQE fields should > always be zeroed, that's a rule, liburing follows it, and there would > have been already lots of problems for users not honoring it. And there > will be a helper hiding all the index conversions for convenience. > > void io_uring_prep_open_direct(sqe, index, ...) > { > io_uring_prep_open(sqe, ...); > sqe->file_index = index + 1; > } Let's keep it the way that it is, but I do want to see the buf_index thing go away and just req->open.fixed_file or whatever being used for open and accept. We should fold that in.
On 8/24/21 3:02 PM, Jens Axboe wrote: > On 8/24/21 3:48 AM, Pavel Begunkov wrote: >> On 8/23/21 8:40 PM, Jens Axboe wrote: >>> On 8/23/21 1:13 PM, Josh Triplett wrote: >>>> On Sat, Aug 21, 2021 at 08:18:12PM -0600, Jens Axboe wrote: >>>>> On 8/21/21 9:52 AM, Pavel Begunkov wrote: >>>>>> Add an optional feature to open/accept directly into io_uring's fixed >>>>>> file table bypassing the normal file table. Same behaviour if as the >>>>>> snippet below, but in one operation: >>>>>> >>>>>> sqe = prep_[open,accept](...); >>>>>> cqe = submit_and_wait(sqe); >>>>>> io_uring_register_files_update(uring_idx, (fd = cqe->res)); >>>>>> close((fd = cqe->res)); >>>>>> >>>>>> The idea in pretty old, and was brough up and implemented a year ago >>>>>> by Josh Triplett, though haven't sought the light for some reasons. >>>>>> >>>>>> The behaviour is controlled by setting sqe->file_index, where 0 implies >>>>>> the old behaviour. If non-zero value is specified, then it will behave >>>>>> as described and place the file into a fixed file slot >>>>>> sqe->file_index - 1. A file table should be already created, the slot >>>>>> should be valid and empty, otherwise the operation will fail. >>>>>> >>>>>> we can't use IOSQE_FIXED_FILE to switch between modes, because accept >>>>>> takes a file, and it already uses the flag with a different meaning. >>>>>> >>>>>> since RFC: >>>>>> - added attribution >>>>>> - updated descriptions >>>>>> - rebased >>>>>> >>>>>> since v1: >>>>>> - EBADF if slot is already used (Josh Triplett) >>>>>> - alias index with splice_fd_in (Josh Triplett) >>>>>> - fix a bound check bug >>>>> >>>>> With the prep series, this looks good to me now. Josh, what do you >>>>> think? >>>> >>>> I would still like to see this using a union with the `nofile` field in >>>> io_open and io_accept, rather than overloading the 16-bit buf_index >>>> field. That would avoid truncating to 16 bits, and make less work for >>>> expansion to more than 16 bits of fixed file indexes. >>>> >>>> (I'd also like that to actually use a union, rather than overloading the >>>> meaning of buf_index/nofile.) >>> >>> Agree, and in fact there's room in the open and accept command parts, so >>> we can just make it a separate entry there instead of using ->buf_index. >>> Then just pass in the index to io_install_fixed_file() instead of having >>> it pull it from req->buf_index. >> >> That's internal details, can be expanded at wish in the future, if we'd >> ever need larger tables. ->buf_index already holds indexes to different >> resources just fine. > > Sure it's internal and can always be changed, doesn't change the fact > that it's a bit iffy that it's used differently in different spots. As > it costs us nothing to simply add a 'fixed_file' u32 for io_accept and > io_open, I really think that should be done instead. > >> Aliasing with nofile would rather be ugly, so the only option, as you >> mentioned, is to grab some space from open/accept structs, but don't see >> why we'd want it when there is a more convenient alternative. > > Because it's a lot more readable and less error prone imho. Agree on the > union, we don't have to resort to that. Ok, I don't have a strong opinion on that. Will resend >>>> I personally still feel that using non-zero to signify index-plus-one is >>>> both error-prone and not as future-compatible. I think we could do >>>> better with no additional overhead. But I think the final call on that >>>> interface is up to you, Jens. Do you think it'd be worth spending a flag >>>> bit or using a different opcode, to get a cleaner interface? If you >>>> don't, then I'd be fine with seeing this go in with just the io_open and >>>> io_accept change. >>> >>> I'd be inclined to go the extra opcode route instead, as the flag only >>> really would make sense to requests that instantiate file descriptors. >>> For this particular case, we'd need 3 new opcodes for >>> openat/openat2/accept, which is probably a worthwhile expenditure. >>> >>> Pavel, what do you think? Switch to using a different opcode for the new >>> requests, and just grab some space in io_open and io_accept for the fd >>> and pass it in to install. >> >> I don't get it, why it's even called hackish? How that's anyhow better? >> To me the feature looks like a natural extension to the operations, just >> like a read can be tuned with flags, so and creating new opcodes seems >> a bit ugly, unnecessary taking space from opcodes and adding duplication >> (even if both versions call the same handler). > > I agree that it's a natural extension, the problem is that we have to do > unnatural things (somewhat) to make it work. I'm fine with using the > union for the splice_fd_in to pass it in, I don't think it's a big deal. > > I do wish that IORING_OP_CLOSE would work with them, though. I think we > should to that as a followup patch. It's a bit odd to be able to open a > file with IORING_OP_OPENAT and not being able to close it with > IORING_OP_CLOSE. For the latter, we should just give it fixed file > support, which would be pretty trivial. > >> First, why it's not future-compatible? It's a serious argument, but I >> don't see where it came from. Do I miss something? >> >> It's u32 now, and so will easily cover all indexes. SQE fields should >> always be zeroed, that's a rule, liburing follows it, and there would >> have been already lots of problems for users not honoring it. And there >> will be a helper hiding all the index conversions for convenience. >> >> void io_uring_prep_open_direct(sqe, index, ...) >> { >> io_uring_prep_open(sqe, ...); >> sqe->file_index = index + 1; >> } > > Let's keep it the way that it is, but I do want to see the buf_index > thing go away and just req->open.fixed_file or whatever being used for > open and accept. We should fold that in.