Message ID | 20210405231025.33829-3-dave@stgolabs.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fs/epoll: restore user-visible behavior upon event ready | expand |
On Mon, 5 Apr 2021 16:10:25 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote: > 339ddb53d373 (fs/epoll: remove unnecessary wakeups of nested epoll) changed > the userspace visible behavior of exclusive waiters blocked on a common > epoll descriptor upon a single event becoming ready. Previously, all tasks > doing epoll_wait would awake, and now only one is awoken, potentially causing > missed wakeups on applications that rely on this behavior, such as Apache Qpid. > > While the aforementioned commit aims at having only a wakeup single path in > ep_poll_callback (with the exceptions of epoll_ctl cases), we need to restore > the wakeup in what was the old ep_scan_ready_list() such that the next thread > can be awoken, in a cascading style, after the waker's corresponding ep_send_events(). > Tricky. 339ddb53d373 was merged in December 2019. So do we backport this fix? Could any userspace code be depending upon the post-339ddb53d373 behaviour? Or do we just leave the post-339ddb53d373 code as-is? Presumably the issue is very rarely encountered, and changeing it back has its own risks.
On Mon, 05 Apr 2021, Andrew Morton wrote: >Tricky. 339ddb53d373 was merged in December 2019. So do we backport >this fix? Could any userspace code be depending upon the >post-339ddb53d373 behavior? As with previous trouble caused by this commit, I vote for restoring the behavior backporting the fix, basically the equivalent of adding (which was my intention): Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") > >Or do we just leave the post-339ddb53d373 code as-is? Presumably the >issue is very rarely encountered, and changeing it back has its own >risks. While I also consider this scenario rare (normally new ready events can become ready and trigger new wakeups), I'm seeing reports in real applications of task hangs due to this change of semantics. Alternatively, users can update their code to timeout in such scenarios, but it is ultimately the kernel's fault. Furthermore it hasn't really been all _that_ long since the commit was merged, so I don't think it merits a change in behavior. As for the risks of restoring the behavior, afaict this only fixed a double wakeup in an obscure nested epoll scenario, so I'm not too worried there sacrificing performance for functionality. That said, there are fixes, for example 65759097d80 (epoll: call final ep_events_available() check under the lock) that would perhaps be rendered unnecessary. Thanks, Davidlohr
On Mon, 5 Apr 2021 20:22:26 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote: > On Mon, 05 Apr 2021, Andrew Morton wrote: > > >Tricky. 339ddb53d373 was merged in December 2019. So do we backport > >this fix? Could any userspace code be depending upon the > >post-339ddb53d373 behavior? > > As with previous trouble caused by this commit, I vote for restoring the behavior > backporting the fix, basically the equivalent of adding (which was my intention): > > Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") OK, I added the Fixes: line and the cc:stable line.
diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 73138ea68342..1e596e1d0bba 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -657,6 +657,12 @@ static void ep_done_scan(struct eventpoll *ep, */ list_splice(txlist, &ep->rdllist); __pm_relax(ep->ws); + + if (!list_empty(&ep->rdllist)) { + if (waitqueue_active(&ep->wq)) + wake_up(&ep->wq); + } + write_unlock_irq(&ep->lock); }
339ddb53d373 (fs/epoll: remove unnecessary wakeups of nested epoll) changed the userspace visible behavior of exclusive waiters blocked on a common epoll descriptor upon a single event becoming ready. Previously, all tasks doing epoll_wait would awake, and now only one is awoken, potentially causing missed wakeups on applications that rely on this behavior, such as Apache Qpid. While the aforementioned commit aims at having only a wakeup single path in ep_poll_callback (with the exceptions of epoll_ctl cases), we need to restore the wakeup in what was the old ep_scan_ready_list() such that the next thread can be awoken, in a cascading style, after the waker's corresponding ep_send_events(). Signed-off-by: Davidlohr Bueso <dbueso@suse.de> --- fs/eventpoll.c | 6 ++++++ 1 file changed, 6 insertions(+)