| Message ID | cover.1568413210.git.asml.silence@gmail.com (mailing list archive) |
|---|---|
| Series | Optimise io_uring completion waiting |
On 9/13/19 4:28 PM, Pavel Begunkov (Silence) wrote:
> From: Pavel Begunkov <asml.silence@gmail.com>
>
> There could be a lot of overhead within generic wait_event_*() used for
> waiting for large number of completions. The patchset removes much of
> it by using custom wait event (wait_threshold).
>
> Synthetic test showed ~40% performance boost. (see patch 2)

Nifty, from an io_uring perspective, I like this a lot.

The core changes needed to support it look fine as well. I'll await
Peter/Ingo's comments on it.
It solves much of the problem, though there is still the overhead of
traversing a wait queue + indirect calls for checking. I've been
thinking to either

1. create n wait queues and bucket waiters. E.g. log2(min_events)
bucketing would remove at least half of such calls for arbitrary
min_events, and all of them if min_events is a power of 2.

2. or dig deeper and add a custom wake_up with perhaps a sorted
wait_queue. As I see it, that's pretty bulky and over-engineered, but
maybe somebody knows an easier way?

Anyway, I don't have performance numbers for that, so I don't know
whether this would be justified.

On 14/09/2019 03:31, Jens Axboe wrote:
> On 9/13/19 4:28 PM, Pavel Begunkov (Silence) wrote:
>> From: Pavel Begunkov <asml.silence@gmail.com>
>>
>> There could be a lot of overhead within generic wait_event_*() used for
>> waiting for large number of completions. The patchset removes much of
>> it by using custom wait event (wait_threshold).
>>
>> Synthetic test showed ~40% performance boost. (see patch 2)
>
> Nifty, from an io_uring perspective, I like this a lot.
>
> The core changes needed to support it look fine as well. I'll await
> Peter/Ingo's comments on it.
From: Pavel Begunkov <asml.silence@gmail.com>

There could be a lot of overhead within generic wait_event_*() used for
waiting for a large number of completions. The patchset removes much of
it by using a custom wait event (wait_threshold).

A synthetic test showed a ~40% performance boost (see patch 2).

Pavel Begunkov (2):
  sched/wait: Add wait_threshold
  io_uring: Optimise cq waiting with wait_threshold

 fs/io_uring.c                  | 21 ++++++-----
 include/linux/wait_threshold.h | 64 ++++++++++++++++++++++++++++++++++
 kernel/sched/Makefile          |  2 +-
 kernel/sched/wait_threshold.c  | 26 ++++++++++++++
 4 files changed, 103 insertions(+), 10 deletions(-)
 create mode 100644 include/linux/wait_threshold.h
 create mode 100644 kernel/sched/wait_threshold.c