Message ID | 20241022204708.1025470-3-axboe@kernel.dk (mailing list archive)
---|---
State | New
Series | Add support for registered waits
On Tue, Oct 22, 2024 at 02:39:03PM -0600, Jens Axboe wrote:
> In scenarios where a high frequency of wait events are seen, the copy
> of the struct io_uring_getevents_arg is quite noticeable in the
> profiles in terms of time spent. It can be seen as up to 3.5-4.5%.
> Rewrite the copy-in logic, saving about 0.5% of the time.

I'm surprised it's faster to retrieve field by field instead of a
straight copy. But you can't argue with the results!

> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  io_uring/io_uring.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 8952453ea807..612e7d66f845 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3239,6 +3239,7 @@ static int io_validate_ext_arg(unsigned flags, const void __user *argp, size_t a
>  static int io_get_ext_arg(unsigned flags, const void __user *argp,
> 			   struct ext_arg *ext_arg)
>  {
> +	const struct io_uring_getevents_arg __user *uarg = argp;
> 	struct io_uring_getevents_arg arg;
>
> 	/*
> @@ -3256,8 +3257,13 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
> 	 */
> 	if (ext_arg->argsz != sizeof(arg))
> 		return -EINVAL;
> -	if (copy_from_user(&arg, argp, sizeof(arg)))
> +	if (!user_access_begin(uarg, sizeof(*uarg)))
> 		return -EFAULT;
> +	unsafe_get_user(arg.sigmask, &uarg->sigmask, uaccess_end);
> +	unsafe_get_user(arg.min_wait_usec, &uarg->min_wait_usec, uaccess_end);
> +	unsafe_get_user(arg.ts, &uarg->ts, uaccess_end);
> +	unsafe_get_user(arg.sigmask_sz, &uarg->sigmask_sz, uaccess_end);
> +	user_access_end();
> 	ext_arg->min_time = arg.min_wait_usec * NSEC_PER_USEC;
> 	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
> 	ext_arg->argsz = arg.sigmask_sz;
> @@ -3267,6 +3273,9 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
> 		ext_arg->ts_set = true;
> 	}
> 	return 0;
> +uaccess_end:
> +	user_access_end();
> +	return -EFAULT;
>  }

I was looking for the 'goto' for this label and discovered it's in the
macro. Neat.

Reviewed-by: Keith Busch <kbusch@kernel.org>
On 10/22/24 4:40 PM, Keith Busch wrote:
> On Tue, Oct 22, 2024 at 02:39:03PM -0600, Jens Axboe wrote:
>> In scenarios where a high frequency of wait events are seen, the copy
>> of the struct io_uring_getevents_arg is quite noticeable in the
>> profiles in terms of time spent. It can be seen as up to 3.5-4.5%.
>> Rewrite the copy-in logic, saving about 0.5% of the time.
>
> I'm surprised it's faster to retrieve field by field instead of a
> straight copy. But you can't argue with the results!

It's a pretty common setup - mostly when copying separate entities. But
works here too!

> I was looking for the 'goto' for this label and discovered it's in the
> macro. Neat.
>
> Reviewed-by: Keith Busch <kbusch@kernel.org>

Thanks, will add.
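[Editor's note: for readers who haven't met the pattern Keith and Jens are discussing, below is a minimal sketch of it. The struct and function names are invented for illustration; only user_access_begin(), unsafe_get_user() and user_access_end() are the real uaccess helpers. The goto that targets the error label is emitted inside the unsafe_get_user() macro, which is why it never appears in the calling code.]

```c
#include <linux/types.h>
#include <linux/uaccess.h>

/* Illustrative only, not the io_uring code: copy two fields from a
 * user pointer inside a single user-access window, bailing to the
 * "efault" label if any access faults. */
struct karg {
	u64 a;
	u32 b;
};

static int copy_in_fields(const struct karg __user *uarg, struct karg *out)
{
	if (!user_access_begin(uarg, sizeof(*uarg)))
		return -EFAULT;
	/* the goto to "efault" lives inside the unsafe_get_user() macro */
	unsafe_get_user(out->a, &uarg->a, efault);
	unsafe_get_user(out->b, &uarg->b, efault);
	user_access_end();
	return 0;
efault:
	user_access_end();
	return -EFAULT;
}
```

The win measured in the patch presumably comes from doing the access check and fault handling once for the whole window instead of going through the generic copy_from_user() machinery for a small fixed-size struct; that reading is an inference from the thread, not something stated in the patch itself.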
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 8952453ea807..612e7d66f845 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3239,6 +3239,7 @@ static int io_validate_ext_arg(unsigned flags, const void __user *argp, size_t a
 static int io_get_ext_arg(unsigned flags, const void __user *argp,
			   struct ext_arg *ext_arg)
 {
+	const struct io_uring_getevents_arg __user *uarg = argp;
	struct io_uring_getevents_arg arg;

	/*
@@ -3256,8 +3257,13 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
	 */
	if (ext_arg->argsz != sizeof(arg))
		return -EINVAL;
-	if (copy_from_user(&arg, argp, sizeof(arg)))
+	if (!user_access_begin(uarg, sizeof(*uarg)))
		return -EFAULT;
+	unsafe_get_user(arg.sigmask, &uarg->sigmask, uaccess_end);
+	unsafe_get_user(arg.min_wait_usec, &uarg->min_wait_usec, uaccess_end);
+	unsafe_get_user(arg.ts, &uarg->ts, uaccess_end);
+	unsafe_get_user(arg.sigmask_sz, &uarg->sigmask_sz, uaccess_end);
+	user_access_end();
	ext_arg->min_time = arg.min_wait_usec * NSEC_PER_USEC;
	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
	ext_arg->argsz = arg.sigmask_sz;
@@ -3267,6 +3273,9 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
		ext_arg->ts_set = true;
	}
	return 0;
+uaccess_end:
+	user_access_end();
+	return -EFAULT;
 }

 SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
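[Editor's note: for context, the field-by-field reads above correspond to the uapi extended-argument struct. The sketch below is reproduced from memory of include/uapi/linux/io_uring.h around the time of this series and should be treated as illustrative; the argsz == sizeof(arg) check in io_get_ext_arg() is what pins userspace to exactly this layout.]

```c
/* Sketch of the extended-argument struct whose fields are read above;
 * see include/uapi/linux/io_uring.h for the authoritative definition. */
struct io_uring_getevents_arg {
	__u64	sigmask;	/* user pointer to a signal mask, cast to u64 */
	__u32	sigmask_sz;	/* size of that mask */
	__u32	min_wait_usec;	/* minimum wait in usec, converted to nsec above */
	__u64	ts;		/* user pointer to struct __kernel_timespec, or 0 */
};
```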
In scenarios where a high frequency of wait events are seen, the copy
of the struct io_uring_getevents_arg is quite noticeable in the
profiles in terms of time spent. It can be seen as up to 3.5-4.5%.
Rewrite the copy-in logic, saving about 0.5% of the time.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
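[Editor's note: the wait path profiled here is the one entered from userspace via io_uring_enter(2) with IORING_ENTER_EXT_ARG, which makes the kernel parse a struct io_uring_getevents_arg on every wait. Below is a rough, liburing-free sketch of such a call; ring setup and error handling are omitted, and the helper name is made up for illustration.]

```c
#include <linux/io_uring.h>
#include <linux/time_types.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: wait for "want" completions on an already
 * set up ring, optionally bounded by a timeout. */
static int wait_cqes(int ring_fd, unsigned int want, struct __kernel_timespec *ts)
{
	struct io_uring_getevents_arg arg;

	memset(&arg, 0, sizeof(arg));
	arg.ts = (uint64_t)(uintptr_t)ts;	/* 0 means no timeout */

	/* argsz must be exactly sizeof(arg); see the check in io_get_ext_arg() */
	return syscall(__NR_io_uring_enter, ring_fd, 0, want,
		       IORING_ENTER_GETEVENTS | IORING_ENTER_EXT_ARG,
		       &arg, sizeof(arg));
}
```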