Message ID | 20240426080548.8203-1-xuewen.yan@unisoc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] epoll: Add synchronous wakeup support for ep_poll_callback | expand |
adding Eric On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote: > Now, the epoll only use wake_up() interface to wake up task. > However, sometimes, there are epoll users which want to use > the synchronous wakeup flag to hint the scheduler, such as > Android binder driver. > So add a wake_up_sync() define, and use the wake_up_sync() > when the sync is true in ep_poll_callback(). > > Co-developed-by: Jing Xia <jing.xia@unisoc.com> > Signed-off-by: Jing Xia <jing.xia@unisoc.com> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> > --- > fs/eventpoll.c | 5 ++++- > include/linux/wait.h | 1 + > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/fs/eventpoll.c b/fs/eventpoll.c > index 882b89edc52a..9b815e0a1ac5 100644 > --- a/fs/eventpoll.c > +++ b/fs/eventpoll.c > @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v > break; > } > } > - wake_up(&ep->wq); > + if (sync) > + wake_up_sync(&ep->wq); > + else > + wake_up(&ep->wq); > } > if (waitqueue_active(&ep->poll_wait)) > pwake++; > diff --git a/include/linux/wait.h b/include/linux/wait.h > index 8aa3372f21a0..2b322a9b88a2 100644 > --- a/include/linux/wait.h > +++ b/include/linux/wait.h > @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head); > #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) > #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) > #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) > +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL) > > #define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) > #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL) > -- > 2.25.1 >
We've also observed this issue on ChromeOS, it seems like it might long-standing epoll bug as it diverges from the behavior of poll. Any chance a maintainer can take a look? Thanks Brian On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote: > Now, the epoll only use wake_up() interface to wake up task. > However, sometimes, there are epoll users which want to use > the synchronous wakeup flag to hint the scheduler, such as > Android binder driver. > So add a wake_up_sync() define, and use the wake_up_sync() > when the sync is true in ep_poll_callback(). > > Co-developed-by: Jing Xia <jing.xia@unisoc.com> > Signed-off-by: Jing Xia <jing.xia@unisoc.com> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> > --- > fs/eventpoll.c | 5 ++++- > include/linux/wait.h | 1 + > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/fs/eventpoll.c b/fs/eventpoll.c > index 882b89edc52a..9b815e0a1ac5 100644 > --- a/fs/eventpoll.c > +++ b/fs/eventpoll.c > @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v > break; > } > } > - wake_up(&ep->wq); > + if (sync) > + wake_up_sync(&ep->wq); > + else > + wake_up(&ep->wq); > } > if (waitqueue_active(&ep->poll_wait)) > pwake++; > diff --git a/include/linux/wait.h b/include/linux/wait.h > index 8aa3372f21a0..2b322a9b88a2 100644 > --- a/include/linux/wait.h > +++ b/include/linux/wait.h > @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head); > #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) > #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) > #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) > +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL) > > #define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) > #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL) > -- > 2.25.1 >
I think this patch really needs help with the commit message, something like: wait_queue_func_t accepts 4 arguments (struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key); In the case of poll and select the wait queue function is pollwake in fs/select.c, this wake function passes the third argument flags as the sync parameter to the default_wake_function defined in kernel/sched/core.c. This argument is passed along to try_to_wake_up which continues to pass down the wake flags to select_task_rq and finally in the case of CFS select_task_rq_fair. In select_task_rq_fair the sync flag is passed down to the wake_affine_* functions in kernel/sched/fair.c which accept and honor the sync flag. Epoll however when reciving the WF_SYNC flag completely drops it on the floor, the wakeup function used by epoll is defined in fs/eventpoll.c, ep_poll_callback. This callback receives a sync flag just like pollwake; however, it never does anything with it. Ultimately it wakes up the waiting task directly using wake_up. This shows that there seems to be a divergence between poll/select and epoll regarding honoring sync wakeups. I have tested this patch through self tests and numerous runs of the perf benchmarks for epoll. All tests past and I did not see any observable performance changes in epoll_wait. Reviewed-by: Brian Geffon <bgeffon@google.com> Tested-by: Brian Geffon <bgeffon@google.com> Reported-by: Benoit Lize <lizeb@google.com> On Thu, Sep 19, 2024 at 8:36 AM Brian Geffon <bgeffon@google.com> wrote: > > We've also observed this issue on ChromeOS, it seems like it might long-standing epoll bug as it diverges from the behavior of poll. Any chance a maintainer can take a look? > > Thanks > Brian > > On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote: > > Now, the epoll only use wake_up() interface to wake up task. > > However, sometimes, there are epoll users which want to use > > the synchronous wakeup flag to hint the scheduler, such as > > Android binder driver. > > So add a wake_up_sync() define, and use the wake_up_sync() > > when the sync is true in ep_poll_callback(). > > > > Co-developed-by: Jing Xia <jing.xia@unisoc.com> > > Signed-off-by: Jing Xia <jing.xia@unisoc.com> > > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> > > --- > > fs/eventpoll.c | 5 ++++- > > include/linux/wait.h | 1 + > > 2 files changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/fs/eventpoll.c b/fs/eventpoll.c > > index 882b89edc52a..9b815e0a1ac5 100644 > > --- a/fs/eventpoll.c > > +++ b/fs/eventpoll.c > > @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v > > break; > > } > > } > > - wake_up(&ep->wq); > > + if (sync) > > + wake_up_sync(&ep->wq); > > + else > > + wake_up(&ep->wq); > > } > > if (waitqueue_active(&ep->poll_wait)) > > pwake++; > > diff --git a/include/linux/wait.h b/include/linux/wait.h > > index 8aa3372f21a0..2b322a9b88a2 100644 > > --- a/include/linux/wait.h > > +++ b/include/linux/wait.h > > @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head); > > #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) > > #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) > > #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) > > +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL) > > > > #define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) > > #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL) > > -- > > 2.25.1 > > >
On Wed, Sep 25, 2024 at 02:38:02PM -0400, Brian Geffon wrote: > I think this patch really needs help with the commit message, something like: > > wait_queue_func_t accepts 4 arguments (struct wait_queue_entry > *wq_entry, unsigned mode, int flags, void *key); > > In the case of poll and select the wait queue function is pollwake in > fs/select.c, this wake function passes > the third argument flags as the sync parameter to the > default_wake_function defined in kernel/sched/core.c. This > argument is passed along to try_to_wake_up which continues to pass > down the wake flags to select_task_rq and finally > in the case of CFS select_task_rq_fair. In select_task_rq_fair the > sync flag is passed down to the wake_affine_* functions > in kernel/sched/fair.c which accept and honor the sync flag. > > Epoll however when reciving the WF_SYNC flag completely drops it on > the floor, the wakeup function used > by epoll is defined in fs/eventpoll.c, ep_poll_callback. This callback > receives a sync flag just like pollwake; > however, it never does anything with it. Ultimately it wakes up the > waiting task directly using wake_up. > > This shows that there seems to be a divergence between poll/select and > epoll regarding honoring sync wakeups. > > I have tested this patch through self tests and numerous runs of the > perf benchmarks for epoll. All tests past and > I did not see any observable performance changes in epoll_wait. > > Reviewed-by: Brian Geffon <bgeffon@google.com> > Tested-by: Brian Geffon <bgeffon@google.com> > Reported-by: Benoit Lize <lizeb@google.com> Friendly ping on this. Would someone mind taking a look and picking this up? > > > On Thu, Sep 19, 2024 at 8:36 AM Brian Geffon <bgeffon@google.com> wrote: > > > > We've also observed this issue on ChromeOS, it seems like it might long-standing epoll bug as it diverges from the behavior of poll. Any chance a maintainer can take a look? > > > > Thanks > > Brian > > > > On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote: > > > Now, the epoll only use wake_up() interface to wake up task. > > > However, sometimes, there are epoll users which want to use > > > the synchronous wakeup flag to hint the scheduler, such as > > > Android binder driver. > > > So add a wake_up_sync() define, and use the wake_up_sync() > > > when the sync is true in ep_poll_callback(). > > > > > > Co-developed-by: Jing Xia <jing.xia@unisoc.com> > > > Signed-off-by: Jing Xia <jing.xia@unisoc.com> > > > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> > > > --- > > > fs/eventpoll.c | 5 ++++- > > > include/linux/wait.h | 1 + > > > 2 files changed, 5 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/eventpoll.c b/fs/eventpoll.c > > > index 882b89edc52a..9b815e0a1ac5 100644 > > > --- a/fs/eventpoll.c > > > +++ b/fs/eventpoll.c > > > @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v > > > break; > > > } > > > } > > > - wake_up(&ep->wq); > > > + if (sync) > > > + wake_up_sync(&ep->wq); > > > + else > > > + wake_up(&ep->wq); > > > } > > > if (waitqueue_active(&ep->poll_wait)) > > > pwake++; > > > diff --git a/include/linux/wait.h b/include/linux/wait.h > > > index 8aa3372f21a0..2b322a9b88a2 100644 > > > --- a/include/linux/wait.h > > > +++ b/include/linux/wait.h > > > @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head); > > > #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) > > > #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) > > > #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) > > > +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL) > > > > > > #define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) > > > #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL) > > > -- > > > 2.25.1 > > > > >
On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote: > Now, the epoll only use wake_up() interface to wake up task. > However, sometimes, there are epoll users which want to use > the synchronous wakeup flag to hint the scheduler, such as > Android binder driver. > So add a wake_up_sync() define, and use the wake_up_sync() > when the sync is true in ep_poll_callback(). > > [...] Applied to the vfs.misc branch of the vfs/vfs.git tree. Patches in the vfs.misc branch should appear in linux-next soon. Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it. It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated. Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch. tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git branch: vfs.misc [1/1] epoll: Add synchronous wakeup support for ep_poll_callback https://git.kernel.org/vfs/vfs/c/2ce0e17660a7
On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote: > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote: > > Now, the epoll only use wake_up() interface to wake up task. > > However, sometimes, there are epoll users which want to use > > the synchronous wakeup flag to hint the scheduler, such as > > Android binder driver. > > So add a wake_up_sync() define, and use the wake_up_sync() > > when the sync is true in ep_poll_callback(). > > > > [...] > > Applied to the vfs.misc branch of the vfs/vfs.git tree. > Patches in the vfs.misc branch should appear in linux-next soon. > > Please report any outstanding bugs that were missed during review in a > new review to the original patch series allowing us to drop it. > > It's encouraged to provide Acked-bys and Reviewed-bys even though the > patch has now been applied. If possible patch trailers will be updated. > > Note that commit hashes shown below are subject to change due to rebase, > trailer updates or similar. If in doubt, please check the listed branch. > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git > branch: vfs.misc This is a bug that's been present for all of time, so I think we should: Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org I sent a patch which adds a benchmark for nonblocking pipes using epoll: https://lore.kernel.org/lkml/20241016190009.866615-1-bgeffon@google.com/ Using this new benchmark I get the following results without this fix and with this fix: $ tools/perf/perf bench sched pipe -n # Running 'sched/pipe' benchmark: # Executed 1000000 pipe operations between two processes Total time: 12.194 [sec] 12.194376 usecs/op 82005 ops/sec $ tools/perf/perf bench sched pipe -n # Running 'sched/pipe' benchmark: # Executed 1000000 pipe operations between two processes Total time: 9.229 [sec] 9.229738 usecs/op 108345 ops/sec > > [1/1] epoll: Add synchronous wakeup support for ep_poll_callback > https://git.kernel.org/vfs/vfs/c/2ce0e17660a7
On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote: > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote: > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote: > > > Now, the epoll only use wake_up() interface to wake up task. > > > However, sometimes, there are epoll users which want to use > > > the synchronous wakeup flag to hint the scheduler, such as > > > Android binder driver. > > > So add a wake_up_sync() define, and use the wake_up_sync() > > > when the sync is true in ep_poll_callback(). > > > > > > [...] > > > > Applied to the vfs.misc branch of the vfs/vfs.git tree. > > Patches in the vfs.misc branch should appear in linux-next soon. > > > > Please report any outstanding bugs that were missed during review in a > > new review to the original patch series allowing us to drop it. > > > > It's encouraged to provide Acked-bys and Reviewed-bys even though the > > patch has now been applied. If possible patch trailers will be updated. > > > > Note that commit hashes shown below are subject to change due to rebase, > > trailer updates or similar. If in doubt, please check the listed branch. > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git > > branch: vfs.misc > > This is a bug that's been present for all of time, so I think we should: > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Cc: stable@vger.kernel.org This is in as 900bbaae ("epoll: Add synchronous wakeup support for ep_poll_callback"). How do maintainers feel about: Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org > > I sent a patch which adds a benchmark for nonblocking pipes using epoll: > https://lore.kernel.org/lkml/20241016190009.866615-1-bgeffon@google.com/ > > Using this new benchmark I get the following results without this fix > and with this fix: > > $ tools/perf/perf bench sched pipe -n > # Running 'sched/pipe' benchmark: > # Executed 1000000 pipe operations between two processes > > Total time: 12.194 [sec] > > 12.194376 usecs/op > 82005 ops/sec > > > $ tools/perf/perf bench sched pipe -n > # Running 'sched/pipe' benchmark: > # Executed 1000000 pipe operations between two processes > > Total time: 9.229 [sec] > > 9.229738 usecs/op > 108345 ops/sec > > > > > [1/1] epoll: Add synchronous wakeup support for ep_poll_callback > > https://git.kernel.org/vfs/vfs/c/2ce0e17660a7 Thanks, Brian
On Thu, Dec 12, 2024 at 12:14 PM Brian Geffon <bgeffon@google.com> wrote: > > On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote: > > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote: > > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote: > > > > Now, the epoll only use wake_up() interface to wake up task. > > > > However, sometimes, there are epoll users which want to use > > > > the synchronous wakeup flag to hint the scheduler, such as > > > > Android binder driver. > > > > So add a wake_up_sync() define, and use the wake_up_sync() > > > > when the sync is true in ep_poll_callback(). > > > > > > > > [...] > > > > > > Applied to the vfs.misc branch of the vfs/vfs.git tree. > > > Patches in the vfs.misc branch should appear in linux-next soon. > > > > > > Please report any outstanding bugs that were missed during review in a > > > new review to the original patch series allowing us to drop it. > > > > > > It's encouraged to provide Acked-bys and Reviewed-bys even though the > > > patch has now been applied. If possible patch trailers will be updated. > > > > > > Note that commit hashes shown below are subject to change due to rebase, > > > trailer updates or similar. If in doubt, please check the listed branch. > > > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git > > > branch: vfs.misc > > > > This is a bug that's been present for all of time, so I think we should: > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > Cc: stable@vger.kernel.org > > This is in as 900bbaae ("epoll: Add synchronous wakeup support for > ep_poll_callback"). How do maintainers feel about: > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Cc: stable@vger.kernel.org Dear stable maintainers, this fixes a bug goes all the way back and beyond Linux 2.6.12-rc2. Can you please add this commit to the stable releases? commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream. > > > > > I sent a patch which adds a benchmark for nonblocking pipes using epoll: > > https://lore.kernel.org/lkml/20241016190009.866615-1-bgeffon@google.com/ > > > > Using this new benchmark I get the following results without this fix > > and with this fix: > > > > $ tools/perf/perf bench sched pipe -n > > # Running 'sched/pipe' benchmark: > > # Executed 1000000 pipe operations between two processes > > > > Total time: 12.194 [sec] > > > > 12.194376 usecs/op > > 82005 ops/sec > > > > > > $ tools/perf/perf bench sched pipe -n > > # Running 'sched/pipe' benchmark: > > # Executed 1000000 pipe operations between two processes > > > > Total time: 9.229 [sec] > > > > 9.229738 usecs/op > > 108345 ops/sec > > > > > > > > [1/1] epoll: Add synchronous wakeup support for ep_poll_callback > > > https://git.kernel.org/vfs/vfs/c/2ce0e17660a7 > > Thanks, > Brian > Thanks, Brian
On Tue, Dec 17, 2024 at 09:30:51AM -0500, Brian Geffon wrote: > On Thu, Dec 12, 2024 at 12:14 PM Brian Geffon <bgeffon@google.com> wrote: > > > > On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote: > > > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote: > > > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote: > > > > > Now, the epoll only use wake_up() interface to wake up task. > > > > > However, sometimes, there are epoll users which want to use > > > > > the synchronous wakeup flag to hint the scheduler, such as > > > > > Android binder driver. > > > > > So add a wake_up_sync() define, and use the wake_up_sync() > > > > > when the sync is true in ep_poll_callback(). > > > > > > > > > > [...] > > > > > > > > Applied to the vfs.misc branch of the vfs/vfs.git tree. > > > > Patches in the vfs.misc branch should appear in linux-next soon. > > > > > > > > Please report any outstanding bugs that were missed during review in a > > > > new review to the original patch series allowing us to drop it. > > > > > > > > It's encouraged to provide Acked-bys and Reviewed-bys even though the > > > > patch has now been applied. If possible patch trailers will be updated. > > > > > > > > Note that commit hashes shown below are subject to change due to rebase, > > > > trailer updates or similar. If in doubt, please check the listed branch. > > > > > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git > > > > branch: vfs.misc > > > > > > This is a bug that's been present for all of time, so I think we should: > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > Cc: stable@vger.kernel.org > > > > This is in as 900bbaae ("epoll: Add synchronous wakeup support for > > ep_poll_callback"). How do maintainers feel about: > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > Cc: stable@vger.kernel.org > > Dear stable maintainers, this fixes a bug goes all the way back and > beyond Linux 2.6.12-rc2. Can you please add this commit to the stable > releases? > > commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream. How is this a bugfix? It looks like it is just a new feature being added to epoll, what bug does it "fix"? confused, greg k-h
On Tue, Dec 17, 2024 at 10:34 AM Greg KH <gregkh@linuxfoundation.org> wrote: > > On Tue, Dec 17, 2024 at 09:30:51AM -0500, Brian Geffon wrote: > > On Thu, Dec 12, 2024 at 12:14 PM Brian Geffon <bgeffon@google.com> wrote: > > > > > > On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote: > > > > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote: > > > > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote: > > > > > > Now, the epoll only use wake_up() interface to wake up task. > > > > > > However, sometimes, there are epoll users which want to use > > > > > > the synchronous wakeup flag to hint the scheduler, such as > > > > > > Android binder driver. > > > > > > So add a wake_up_sync() define, and use the wake_up_sync() > > > > > > when the sync is true in ep_poll_callback(). > > > > > > > > > > > > [...] > > > > > > > > > > Applied to the vfs.misc branch of the vfs/vfs.git tree. > > > > > Patches in the vfs.misc branch should appear in linux-next soon. > > > > > > > > > > Please report any outstanding bugs that were missed during review in a > > > > > new review to the original patch series allowing us to drop it. > > > > > > > > > > It's encouraged to provide Acked-bys and Reviewed-bys even though the > > > > > patch has now been applied. If possible patch trailers will be updated. > > > > > > > > > > Note that commit hashes shown below are subject to change due to rebase, > > > > > trailer updates or similar. If in doubt, please check the listed branch. > > > > > > > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git > > > > > branch: vfs.misc > > > > > > > > This is a bug that's been present for all of time, so I think we should: > > > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > > Cc: stable@vger.kernel.org > > > > > > This is in as 900bbaae ("epoll: Add synchronous wakeup support for > > > ep_poll_callback"). How do maintainers feel about: > > > > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > > > Cc: stable@vger.kernel.org > > > > Dear stable maintainers, this fixes a bug goes all the way back and > > beyond Linux 2.6.12-rc2. Can you please add this commit to the stable > > releases? > > > > commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream. > Hi Greg, > How is this a bugfix? It looks like it is just a new feature being > added to epoll, what bug does it "fix"? The bug this fixes is that epoll today completely ignores the WF_SYNC flag. The end result is the same, the wakee is woken, but the kernel has several places where a synchronous wakeup is expected and with epoll this isn't honored. However, it is honored with poll, select, recvmsg, etc. Ultimately, epoll is inconsistent with other polling methods and this inconsistency can lead to unexpected and surprising scheduling properties based on the mechanism used in userspace. For example, f467a6a664 ("pipe: fix and clarify pipe read wakeup logic") highlighted the importance of sync wakeups for pipes, should GNU make start using epoll we would see the same regression that resulted in this fix from Linus. > > confused, > > greg k-h Thanks, Brian
diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 882b89edc52a..9b815e0a1ac5 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v break; } } - wake_up(&ep->wq); + if (sync) + wake_up_sync(&ep->wq); + else + wake_up(&ep->wq); } if (waitqueue_active(&ep->poll_wait)) pwake++; diff --git a/include/linux/wait.h b/include/linux/wait.h index 8aa3372f21a0..2b322a9b88a2 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head); #define wake_up_all(x) __wake_up(x, TASK_NORMAL, 0, NULL) #define wake_up_locked(x) __wake_up_locked((x), TASK_NORMAL, 1) #define wake_up_all_locked(x) __wake_up_locked((x), TASK_NORMAL, 0) +#define wake_up_sync(x) __wake_up_sync(x, TASK_NORMAL) #define wake_up_interruptible(x) __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL) #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)