diff mbox series

[RFC] epoll: Add synchronous wakeup support for ep_poll_callback

Message ID 20240426080548.8203-1-xuewen.yan@unisoc.com (mailing list archive)
State New, archived
Headers show
Series [RFC] epoll: Add synchronous wakeup support for ep_poll_callback | expand

Commit Message

Xuewen Yan April 26, 2024, 8:05 a.m. UTC
Now, the epoll only use wake_up() interface to wake up task.
However, sometimes, there are epoll users which want to use
the synchronous wakeup flag to hint the scheduler, such as
Android binder driver.
So add a wake_up_sync() define, and use the wake_up_sync()
when the sync is true in ep_poll_callback().

Co-developed-by: Jing Xia <jing.xia@unisoc.com>
Signed-off-by: Jing Xia <jing.xia@unisoc.com>
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 fs/eventpoll.c       | 5 ++++-
 include/linux/wait.h | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Christian Brauner April 26, 2024, 8:31 a.m. UTC | #1
adding Eric

On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote:
> Now, the epoll only use wake_up() interface to wake up task.
> However, sometimes, there are epoll users which want to use
> the synchronous wakeup flag to hint the scheduler, such as
> Android binder driver.
> So add a wake_up_sync() define, and use the wake_up_sync()
> when the sync is true in ep_poll_callback().
> 
> Co-developed-by: Jing Xia <jing.xia@unisoc.com>
> Signed-off-by: Jing Xia <jing.xia@unisoc.com>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  fs/eventpoll.c       | 5 ++++-
>  include/linux/wait.h | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 882b89edc52a..9b815e0a1ac5 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
>  				break;
>  			}
>  		}
> -		wake_up(&ep->wq);
> +		if (sync)
> +			wake_up_sync(&ep->wq);
> +		else
> +			wake_up(&ep->wq);
>  	}
>  	if (waitqueue_active(&ep->poll_wait))
>  		pwake++;
> diff --git a/include/linux/wait.h b/include/linux/wait.h
> index 8aa3372f21a0..2b322a9b88a2 100644
> --- a/include/linux/wait.h
> +++ b/include/linux/wait.h
> @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head);
>  #define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
>  #define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL, 1)
>  #define wake_up_all_locked(x)		__wake_up_locked((x), TASK_NORMAL, 0)
> +#define wake_up_sync(x)			__wake_up_sync(x, TASK_NORMAL)
>  
>  #define wake_up_interruptible(x)	__wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
>  #define wake_up_interruptible_nr(x, nr)	__wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
> -- 
> 2.25.1
>
Brian Geffon Sept. 19, 2024, 12:36 p.m. UTC | #2
We've also observed this issue on ChromeOS, it seems like it might long-standing epoll bug as it diverges from the behavior of poll. Any chance a maintainer can take a look?

Thanks
Brian

On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote:
> Now, the epoll only use wake_up() interface to wake up task.
> However, sometimes, there are epoll users which want to use
> the synchronous wakeup flag to hint the scheduler, such as
> Android binder driver.
> So add a wake_up_sync() define, and use the wake_up_sync()
> when the sync is true in ep_poll_callback().
> 
> Co-developed-by: Jing Xia <jing.xia@unisoc.com>
> Signed-off-by: Jing Xia <jing.xia@unisoc.com>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  fs/eventpoll.c       | 5 ++++-
>  include/linux/wait.h | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 882b89edc52a..9b815e0a1ac5 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
>  				break;
>  			}
>  		}
> -		wake_up(&ep->wq);
> +		if (sync)
> +			wake_up_sync(&ep->wq);
> +		else
> +			wake_up(&ep->wq);
>  	}
>  	if (waitqueue_active(&ep->poll_wait))
>  		pwake++;
> diff --git a/include/linux/wait.h b/include/linux/wait.h
> index 8aa3372f21a0..2b322a9b88a2 100644
> --- a/include/linux/wait.h
> +++ b/include/linux/wait.h
> @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head);
>  #define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
>  #define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL, 1)
>  #define wake_up_all_locked(x)		__wake_up_locked((x), TASK_NORMAL, 0)
> +#define wake_up_sync(x)			__wake_up_sync(x, TASK_NORMAL)
>  
>  #define wake_up_interruptible(x)	__wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
>  #define wake_up_interruptible_nr(x, nr)	__wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
> -- 
> 2.25.1
>
Brian Geffon Sept. 25, 2024, 6:38 p.m. UTC | #3
I think this patch really needs help with the commit message, something like:

wait_queue_func_t accepts 4 arguments (struct wait_queue_entry
*wq_entry, unsigned mode, int flags, void *key);

In the case of poll and select the wait queue function is pollwake in
fs/select.c, this wake function passes
the third argument flags as the sync parameter to the
default_wake_function defined in kernel/sched/core.c. This
argument is passed along to try_to_wake_up which continues to pass
down the wake flags to select_task_rq and finally
in the case of CFS select_task_rq_fair. In select_task_rq_fair the
sync flag is passed down to the wake_affine_* functions
in kernel/sched/fair.c which accept and honor the sync flag.

Epoll however when reciving the WF_SYNC flag completely drops it on
the floor, the wakeup function used
by epoll is defined in fs/eventpoll.c, ep_poll_callback. This callback
receives a sync flag just like pollwake;
however, it never does anything with it. Ultimately it wakes up the
waiting task directly using wake_up.

This shows that there seems to be a divergence between poll/select and
epoll regarding honoring sync wakeups.

I have tested this patch through self tests and numerous runs of the
perf benchmarks for epoll. All tests past and
I did not see any observable performance changes in epoll_wait.

Reviewed-by: Brian Geffon <bgeffon@google.com>
Tested-by: Brian Geffon <bgeffon@google.com>
Reported-by: Benoit Lize <lizeb@google.com>


On Thu, Sep 19, 2024 at 8:36 AM Brian Geffon <bgeffon@google.com> wrote:
>
> We've also observed this issue on ChromeOS, it seems like it might long-standing epoll bug as it diverges from the behavior of poll. Any chance a maintainer can take a look?
>
> Thanks
> Brian
>
> On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote:
> > Now, the epoll only use wake_up() interface to wake up task.
> > However, sometimes, there are epoll users which want to use
> > the synchronous wakeup flag to hint the scheduler, such as
> > Android binder driver.
> > So add a wake_up_sync() define, and use the wake_up_sync()
> > when the sync is true in ep_poll_callback().
> >
> > Co-developed-by: Jing Xia <jing.xia@unisoc.com>
> > Signed-off-by: Jing Xia <jing.xia@unisoc.com>
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > ---
> >  fs/eventpoll.c       | 5 ++++-
> >  include/linux/wait.h | 1 +
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > index 882b89edc52a..9b815e0a1ac5 100644
> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
> >                               break;
> >                       }
> >               }
> > -             wake_up(&ep->wq);
> > +             if (sync)
> > +                     wake_up_sync(&ep->wq);
> > +             else
> > +                     wake_up(&ep->wq);
> >       }
> >       if (waitqueue_active(&ep->poll_wait))
> >               pwake++;
> > diff --git a/include/linux/wait.h b/include/linux/wait.h
> > index 8aa3372f21a0..2b322a9b88a2 100644
> > --- a/include/linux/wait.h
> > +++ b/include/linux/wait.h
> > @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head);
> >  #define wake_up_all(x)                       __wake_up(x, TASK_NORMAL, 0, NULL)
> >  #define wake_up_locked(x)            __wake_up_locked((x), TASK_NORMAL, 1)
> >  #define wake_up_all_locked(x)                __wake_up_locked((x), TASK_NORMAL, 0)
> > +#define wake_up_sync(x)                      __wake_up_sync(x, TASK_NORMAL)
> >
> >  #define wake_up_interruptible(x)     __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
> >  #define wake_up_interruptible_nr(x, nr)      __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
> > --
> > 2.25.1
> >
>
Brian Geffon Oct. 15, 2024, 8:46 p.m. UTC | #4
On Wed, Sep 25, 2024 at 02:38:02PM -0400, Brian Geffon wrote:
> I think this patch really needs help with the commit message, something like:
> 
> wait_queue_func_t accepts 4 arguments (struct wait_queue_entry
> *wq_entry, unsigned mode, int flags, void *key);
> 
> In the case of poll and select the wait queue function is pollwake in
> fs/select.c, this wake function passes
> the third argument flags as the sync parameter to the
> default_wake_function defined in kernel/sched/core.c. This
> argument is passed along to try_to_wake_up which continues to pass
> down the wake flags to select_task_rq and finally
> in the case of CFS select_task_rq_fair. In select_task_rq_fair the
> sync flag is passed down to the wake_affine_* functions
> in kernel/sched/fair.c which accept and honor the sync flag.
> 
> Epoll however when reciving the WF_SYNC flag completely drops it on
> the floor, the wakeup function used
> by epoll is defined in fs/eventpoll.c, ep_poll_callback. This callback
> receives a sync flag just like pollwake;
> however, it never does anything with it. Ultimately it wakes up the
> waiting task directly using wake_up.
> 
> This shows that there seems to be a divergence between poll/select and
> epoll regarding honoring sync wakeups.
> 
> I have tested this patch through self tests and numerous runs of the
> perf benchmarks for epoll. All tests past and
> I did not see any observable performance changes in epoll_wait.
> 
> Reviewed-by: Brian Geffon <bgeffon@google.com>
> Tested-by: Brian Geffon <bgeffon@google.com>
> Reported-by: Benoit Lize <lizeb@google.com>

Friendly ping on this. Would someone mind taking a look and picking this
up?

> 
> 
> On Thu, Sep 19, 2024 at 8:36 AM Brian Geffon <bgeffon@google.com> wrote:
> >
> > We've also observed this issue on ChromeOS, it seems like it might long-standing epoll bug as it diverges from the behavior of poll. Any chance a maintainer can take a look?
> >
> > Thanks
> > Brian
> >
> > On Fri, Apr 26, 2024 at 04:05:48PM +0800, Xuewen Yan wrote:
> > > Now, the epoll only use wake_up() interface to wake up task.
> > > However, sometimes, there are epoll users which want to use
> > > the synchronous wakeup flag to hint the scheduler, such as
> > > Android binder driver.
> > > So add a wake_up_sync() define, and use the wake_up_sync()
> > > when the sync is true in ep_poll_callback().
> > >
> > > Co-developed-by: Jing Xia <jing.xia@unisoc.com>
> > > Signed-off-by: Jing Xia <jing.xia@unisoc.com>
> > > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > > ---
> > >  fs/eventpoll.c       | 5 ++++-
> > >  include/linux/wait.h | 1 +
> > >  2 files changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > > index 882b89edc52a..9b815e0a1ac5 100644
> > > --- a/fs/eventpoll.c
> > > +++ b/fs/eventpoll.c
> > > @@ -1336,7 +1336,10 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
> > >                               break;
> > >                       }
> > >               }
> > > -             wake_up(&ep->wq);
> > > +             if (sync)
> > > +                     wake_up_sync(&ep->wq);
> > > +             else
> > > +                     wake_up(&ep->wq);
> > >       }
> > >       if (waitqueue_active(&ep->poll_wait))
> > >               pwake++;
> > > diff --git a/include/linux/wait.h b/include/linux/wait.h
> > > index 8aa3372f21a0..2b322a9b88a2 100644
> > > --- a/include/linux/wait.h
> > > +++ b/include/linux/wait.h
> > > @@ -221,6 +221,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head);
> > >  #define wake_up_all(x)                       __wake_up(x, TASK_NORMAL, 0, NULL)
> > >  #define wake_up_locked(x)            __wake_up_locked((x), TASK_NORMAL, 1)
> > >  #define wake_up_all_locked(x)                __wake_up_locked((x), TASK_NORMAL, 0)
> > > +#define wake_up_sync(x)                      __wake_up_sync(x, TASK_NORMAL)
> > >
> > >  #define wake_up_interruptible(x)     __wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
> > >  #define wake_up_interruptible_nr(x, nr)      __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)
> > > --
> > > 2.25.1
> > >
> >
Christian Brauner Oct. 16, 2024, 1:10 p.m. UTC | #5
On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote:
> Now, the epoll only use wake_up() interface to wake up task.
> However, sometimes, there are epoll users which want to use
> the synchronous wakeup flag to hint the scheduler, such as
> Android binder driver.
> So add a wake_up_sync() define, and use the wake_up_sync()
> when the sync is true in ep_poll_callback().
> 
> [...]

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/1] epoll: Add synchronous wakeup support for ep_poll_callback
      https://git.kernel.org/vfs/vfs/c/2ce0e17660a7
Brian Geffon Oct. 16, 2024, 7:05 p.m. UTC | #6
On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote:
> On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote:
> > Now, the epoll only use wake_up() interface to wake up task.
> > However, sometimes, there are epoll users which want to use
> > the synchronous wakeup flag to hint the scheduler, such as
> > Android binder driver.
> > So add a wake_up_sync() define, and use the wake_up_sync()
> > when the sync is true in ep_poll_callback().
> > 
> > [...]
> 
> Applied to the vfs.misc branch of the vfs/vfs.git tree.
> Patches in the vfs.misc branch should appear in linux-next soon.
> 
> Please report any outstanding bugs that were missed during review in a
> new review to the original patch series allowing us to drop it.
> 
> It's encouraged to provide Acked-bys and Reviewed-bys even though the
> patch has now been applied. If possible patch trailers will be updated.
> 
> Note that commit hashes shown below are subject to change due to rebase,
> trailer updates or similar. If in doubt, please check the listed branch.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> branch: vfs.misc

This is a bug that's been present for all of time, so I think we should:

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") 
Cc: stable@vger.kernel.org

I sent a patch which adds a benchmark for nonblocking pipes using epoll:
https://lore.kernel.org/lkml/20241016190009.866615-1-bgeffon@google.com/

Using this new benchmark I get the following results without this fix
and with this fix:

$ tools/perf/perf bench sched pipe -n
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

     Total time: 12.194 [sec]

      12.194376 usecs/op
          82005 ops/sec


$ tools/perf/perf bench sched pipe -n
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

     Total time: 9.229 [sec]

       9.229738 usecs/op
         108345 ops/sec

> 
> [1/1] epoll: Add synchronous wakeup support for ep_poll_callback
>       https://git.kernel.org/vfs/vfs/c/2ce0e17660a7
Brian Geffon Dec. 12, 2024, 5:14 p.m. UTC | #7
On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote:
> On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote:
> > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote:
> > > Now, the epoll only use wake_up() interface to wake up task.
> > > However, sometimes, there are epoll users which want to use
> > > the synchronous wakeup flag to hint the scheduler, such as
> > > Android binder driver.
> > > So add a wake_up_sync() define, and use the wake_up_sync()
> > > when the sync is true in ep_poll_callback().
> > > 
> > > [...]
> > 
> > Applied to the vfs.misc branch of the vfs/vfs.git tree.
> > Patches in the vfs.misc branch should appear in linux-next soon.
> > 
> > Please report any outstanding bugs that were missed during review in a
> > new review to the original patch series allowing us to drop it.
> > 
> > It's encouraged to provide Acked-bys and Reviewed-bys even though the
> > patch has now been applied. If possible patch trailers will be updated.
> > 
> > Note that commit hashes shown below are subject to change due to rebase,
> > trailer updates or similar. If in doubt, please check the listed branch.
> > 
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > branch: vfs.misc
> 
> This is a bug that's been present for all of time, so I think we should:
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") 
> Cc: stable@vger.kernel.org

This is in as 900bbaae ("epoll: Add synchronous wakeup support for
ep_poll_callback"). How do maintainers feel about:

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org

> 
> I sent a patch which adds a benchmark for nonblocking pipes using epoll:
> https://lore.kernel.org/lkml/20241016190009.866615-1-bgeffon@google.com/
> 
> Using this new benchmark I get the following results without this fix
> and with this fix:
> 
> $ tools/perf/perf bench sched pipe -n
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
> 
>      Total time: 12.194 [sec]
> 
>       12.194376 usecs/op
>           82005 ops/sec
> 
> 
> $ tools/perf/perf bench sched pipe -n
> # Running 'sched/pipe' benchmark:
> # Executed 1000000 pipe operations between two processes
> 
>      Total time: 9.229 [sec]
> 
>        9.229738 usecs/op
>          108345 ops/sec
> 
> > 
> > [1/1] epoll: Add synchronous wakeup support for ep_poll_callback
> >       https://git.kernel.org/vfs/vfs/c/2ce0e17660a7

Thanks,
Brian
Brian Geffon Dec. 17, 2024, 2:30 p.m. UTC | #8
On Thu, Dec 12, 2024 at 12:14 PM Brian Geffon <bgeffon@google.com> wrote:
>
> On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote:
> > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote:
> > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote:
> > > > Now, the epoll only use wake_up() interface to wake up task.
> > > > However, sometimes, there are epoll users which want to use
> > > > the synchronous wakeup flag to hint the scheduler, such as
> > > > Android binder driver.
> > > > So add a wake_up_sync() define, and use the wake_up_sync()
> > > > when the sync is true in ep_poll_callback().
> > > >
> > > > [...]
> > >
> > > Applied to the vfs.misc branch of the vfs/vfs.git tree.
> > > Patches in the vfs.misc branch should appear in linux-next soon.
> > >
> > > Please report any outstanding bugs that were missed during review in a
> > > new review to the original patch series allowing us to drop it.
> > >
> > > It's encouraged to provide Acked-bys and Reviewed-bys even though the
> > > patch has now been applied. If possible patch trailers will be updated.
> > >
> > > Note that commit hashes shown below are subject to change due to rebase,
> > > trailer updates or similar. If in doubt, please check the listed branch.
> > >
> > > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > > branch: vfs.misc
> >
> > This is a bug that's been present for all of time, so I think we should:
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Cc: stable@vger.kernel.org
>
> This is in as 900bbaae ("epoll: Add synchronous wakeup support for
> ep_poll_callback"). How do maintainers feel about:
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: stable@vger.kernel.org

Dear stable maintainers, this fixes a bug goes all the way back and
beyond Linux 2.6.12-rc2. Can you please add this commit to the stable
releases?

commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream.

>
> >
> > I sent a patch which adds a benchmark for nonblocking pipes using epoll:
> > https://lore.kernel.org/lkml/20241016190009.866615-1-bgeffon@google.com/
> >
> > Using this new benchmark I get the following results without this fix
> > and with this fix:
> >
> > $ tools/perf/perf bench sched pipe -n
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> >      Total time: 12.194 [sec]
> >
> >       12.194376 usecs/op
> >           82005 ops/sec
> >
> >
> > $ tools/perf/perf bench sched pipe -n
> > # Running 'sched/pipe' benchmark:
> > # Executed 1000000 pipe operations between two processes
> >
> >      Total time: 9.229 [sec]
> >
> >        9.229738 usecs/op
> >          108345 ops/sec
> >
> > >
> > > [1/1] epoll: Add synchronous wakeup support for ep_poll_callback
> > >       https://git.kernel.org/vfs/vfs/c/2ce0e17660a7
>
> Thanks,
> Brian
>

 Thanks,
 Brian
Greg KH Dec. 17, 2024, 3:34 p.m. UTC | #9
On Tue, Dec 17, 2024 at 09:30:51AM -0500, Brian Geffon wrote:
> On Thu, Dec 12, 2024 at 12:14 PM Brian Geffon <bgeffon@google.com> wrote:
> >
> > On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote:
> > > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote:
> > > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote:
> > > > > Now, the epoll only use wake_up() interface to wake up task.
> > > > > However, sometimes, there are epoll users which want to use
> > > > > the synchronous wakeup flag to hint the scheduler, such as
> > > > > Android binder driver.
> > > > > So add a wake_up_sync() define, and use the wake_up_sync()
> > > > > when the sync is true in ep_poll_callback().
> > > > >
> > > > > [...]
> > > >
> > > > Applied to the vfs.misc branch of the vfs/vfs.git tree.
> > > > Patches in the vfs.misc branch should appear in linux-next soon.
> > > >
> > > > Please report any outstanding bugs that were missed during review in a
> > > > new review to the original patch series allowing us to drop it.
> > > >
> > > > It's encouraged to provide Acked-bys and Reviewed-bys even though the
> > > > patch has now been applied. If possible patch trailers will be updated.
> > > >
> > > > Note that commit hashes shown below are subject to change due to rebase,
> > > > trailer updates or similar. If in doubt, please check the listed branch.
> > > >
> > > > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > > > branch: vfs.misc
> > >
> > > This is a bug that's been present for all of time, so I think we should:
> > >
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Cc: stable@vger.kernel.org
> >
> > This is in as 900bbaae ("epoll: Add synchronous wakeup support for
> > ep_poll_callback"). How do maintainers feel about:
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Cc: stable@vger.kernel.org
> 
> Dear stable maintainers, this fixes a bug goes all the way back and
> beyond Linux 2.6.12-rc2. Can you please add this commit to the stable
> releases?
> 
> commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream.

How is this a bugfix?  It looks like it is just a new feature being
added to epoll, what bug does it "fix"?

confused,

greg k-h
Brian Geffon Dec. 17, 2024, 4:31 p.m. UTC | #10
On Tue, Dec 17, 2024 at 10:34 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Dec 17, 2024 at 09:30:51AM -0500, Brian Geffon wrote:
> > On Thu, Dec 12, 2024 at 12:14 PM Brian Geffon <bgeffon@google.com> wrote:
> > >
> > > On Wed, Oct 16, 2024 at 03:05:38PM -0400, Brian Geffon wrote:
> > > > On Wed, Oct 16, 2024 at 03:10:34PM +0200, Christian Brauner wrote:
> > > > > On Fri, 26 Apr 2024 16:05:48 +0800, Xuewen Yan wrote:
> > > > > > Now, the epoll only use wake_up() interface to wake up task.
> > > > > > However, sometimes, there are epoll users which want to use
> > > > > > the synchronous wakeup flag to hint the scheduler, such as
> > > > > > Android binder driver.
> > > > > > So add a wake_up_sync() define, and use the wake_up_sync()
> > > > > > when the sync is true in ep_poll_callback().
> > > > > >
> > > > > > [...]
> > > > >
> > > > > Applied to the vfs.misc branch of the vfs/vfs.git tree.
> > > > > Patches in the vfs.misc branch should appear in linux-next soon.
> > > > >
> > > > > Please report any outstanding bugs that were missed during review in a
> > > > > new review to the original patch series allowing us to drop it.
> > > > >
> > > > > It's encouraged to provide Acked-bys and Reviewed-bys even though the
> > > > > patch has now been applied. If possible patch trailers will be updated.
> > > > >
> > > > > Note that commit hashes shown below are subject to change due to rebase,
> > > > > trailer updates or similar. If in doubt, please check the listed branch.
> > > > >
> > > > > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > > > > branch: vfs.misc
> > > >
> > > > This is a bug that's been present for all of time, so I think we should:
> > > >
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Cc: stable@vger.kernel.org
> > >
> > > This is in as 900bbaae ("epoll: Add synchronous wakeup support for
> > > ep_poll_callback"). How do maintainers feel about:
> > >
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Cc: stable@vger.kernel.org
> >
> > Dear stable maintainers, this fixes a bug goes all the way back and
> > beyond Linux 2.6.12-rc2. Can you please add this commit to the stable
> > releases?
> >
> > commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream.
>

Hi Greg,

> How is this a bugfix?  It looks like it is just a new feature being
> added to epoll, what bug does it "fix"?

The bug this fixes is that epoll today completely ignores the WF_SYNC
flag. The end result is the same, the wakee is woken, but the kernel has
several places where a synchronous wakeup is expected and with epoll
this isn't honored. However, it is honored with poll, select, recvmsg, etc.
Ultimately, epoll is inconsistent with other polling methods and this
inconsistency can lead to unexpected and surprising scheduling properties
based on the mechanism used in userspace.

For example, f467a6a664 ("pipe: fix and clarify pipe read wakeup logic")
highlighted the importance of sync wakeups for pipes, should GNU make
start using epoll we would see the same regression that resulted in this
fix from Linus.

>
> confused,
>
> greg k-h

Thanks,
Brian
diff mbox series

Patch

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 882b89edc52a..9b815e0a1ac5 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1336,7 +1336,10 @@  static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 				break;
 			}
 		}
-		wake_up(&ep->wq);
+		if (sync)
+			wake_up_sync(&ep->wq);
+		else
+			wake_up(&ep->wq);
 	}
 	if (waitqueue_active(&ep->poll_wait))
 		pwake++;
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 8aa3372f21a0..2b322a9b88a2 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -221,6 +221,7 @@  void __wake_up_pollfree(struct wait_queue_head *wq_head);
 #define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
 #define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL, 1)
 #define wake_up_all_locked(x)		__wake_up_locked((x), TASK_NORMAL, 0)
+#define wake_up_sync(x)			__wake_up_sync(x, TASK_NORMAL)
 
 #define wake_up_interruptible(x)	__wake_up(x, TASK_INTERRUPTIBLE, 1, NULL)
 #define wake_up_interruptible_nr(x, nr)	__wake_up(x, TASK_INTERRUPTIBLE, nr, NULL)