| Message ID | 281fc79d98b5d91fe4778c5137a17a2ab4693e5c.1665088876.git.asml.silence@gmail.com |
|---|---|
| State | New |
| Series | [1/1] io_uring: optimise locking for local tw with submit_wait |
On 10/6/22 2:42 PM, Pavel Begunkov wrote:
> Running local task_work requires taking uring_lock, for submit + wait we
> can try to run them right after submit while we still hold the lock and
> save one lock/unlock pair. The optimisation was implemented in the first
> local tw patches but got dropped for simplicity.
>
> Suggested-by: Dylan Yudaken <dylany@fb.com>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  io_uring/io_uring.c | 12 ++++++++++--
>  io_uring/io_uring.h |  7 +++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 355fc1f3083d..b092473eca1d 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>  			mutex_unlock(&ctx->uring_lock);
>  			goto out;
>  		}
> -		if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll)
> -			goto iopoll_locked;
> +		if (flags & IORING_ENTER_GETEVENTS) {
> +			if (ctx->syscall_iopoll)
> +				goto iopoll_locked;
> +			/*
> +			 * Ignore errors, we'll soon call io_cqring_wait() and
> +			 * it should handle ownership problems if any.
> +			 */
> +			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
> +				(void)io_run_local_work_locked(ctx);
> +		}
>  		mutex_unlock(&ctx->uring_lock);
>  	}
>
> diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
> index e733d31f31d2..8504bc1f3839 100644
> --- a/io_uring/io_uring.h
> +++ b/io_uring/io_uring.h
> @@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
>  	return ret;
>  }
>
> +static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
> +{
> +	if (llist_empty(&ctx->work_llist))
> +		return 0;
> +	return __io_run_local_work(ctx, true);
> +}

Do you have pending patches that also use this? If not, maybe we
should just keep it in io_uring.c? If you do, then this looks fine
to me rather than needing to shuffle it later.
On 10/6/22 21:59, Jens Axboe wrote:
> On 10/6/22 2:42 PM, Pavel Begunkov wrote:
>> Running local task_work requires taking uring_lock, for submit + wait we
>> can try to run them right after submit while we still hold the lock and
>> save one lock/unlock pair. The optimisation was implemented in the first
>> local tw patches but got dropped for simplicity.
>>
>> [...]
>>
>> +static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
>> +{
>> +	if (llist_empty(&ctx->work_llist))
>> +		return 0;
>> +	return __io_run_local_work(ctx, true);
>> +}
>
> Do you have pending patches that also use this? If not, maybe we
> should just keep it in io_uring.c? If you do, then this looks fine
> to me rather than needing to shuffle it later.

No, I don't. I'd argue it's better as a helper because at least it
hides the always confusing bool argument, and we'd also need to replace
a similar one in io_iopoll_check(). And we can stick must_hold there
for even more clarity. But ultimately I don't care much.
On 10/6/22 3:09 PM, Pavel Begunkov wrote:
> On 10/6/22 21:59, Jens Axboe wrote:
>> On 10/6/22 2:42 PM, Pavel Begunkov wrote:
>>> Running local task_work requires taking uring_lock, for submit + wait we
>>> can try to run them right after submit while we still hold the lock and
>>> save one lock/unlock pair. The optimisation was implemented in the first
>>> local tw patches but got dropped for simplicity.
>>>
>>> [...]
>>
>> Do you have pending patches that also use this? If not, maybe we
>> should just keep it in io_uring.c? If you do, then this looks fine
>> to me rather than needing to shuffle it later.
>
> No, I don't. I'd argue it's better as a helper because at least it
> hides the always confusing bool argument, and we'd also need to replace
> a similar one in io_iopoll_check(). And we can stick must_hold there
> for even more clarity. But ultimately I don't care much.

I really don't feel that strongly about it either, let's just keep it
the way it is.
On Thu, 6 Oct 2022 21:42:33 +0100, Pavel Begunkov wrote:
> Running local task_work requires taking uring_lock, for submit + wait we
> can try to run them right after submit while we still hold the lock and
> save one lock/unlock pair. The optimisation was implemented in the first
> local tw patches but got dropped for simplicity.
>
> [...]

Applied, thanks!

[1/1] io_uring: optimise locking for local tw with submit_wait
      commit: a2b61c4d8fcb005007bae5b2f007d43cba89baa1

Best regards,
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 355fc1f3083d..b092473eca1d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			mutex_unlock(&ctx->uring_lock);
 			goto out;
 		}
-		if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll)
-			goto iopoll_locked;
+		if (flags & IORING_ENTER_GETEVENTS) {
+			if (ctx->syscall_iopoll)
+				goto iopoll_locked;
+			/*
+			 * Ignore errors, we'll soon call io_cqring_wait() and
+			 * it should handle ownership problems if any.
+			 */
+			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+				(void)io_run_local_work_locked(ctx);
+		}
 		mutex_unlock(&ctx->uring_lock);
 	}

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index e733d31f31d2..8504bc1f3839 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
 	return ret;
 }

+static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
+{
+	if (llist_empty(&ctx->work_llist))
+		return 0;
+	return __io_run_local_work(ctx, true);
+}
+
 static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
 {
 	if (!*locked) {
Running local task_work requires taking uring_lock, for submit + wait we
can try to run them right after submit while we still hold the lock and
save one lock/unlock pair. The optimisation was implemented in the first
local tw patches but got dropped for simplicity.

Suggested-by: Dylan Yudaken <dylany@fb.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c | 12 ++++++++++--
 io_uring/io_uring.h |  7 +++++++
 2 files changed, 17 insertions(+), 2 deletions(-)