| Message ID | 20200210094123.389854-2-rpenyaev@suse.de (mailing list archive) |
| --- | --- |
| State | New, archived |
| Series | [v2,1/3] epoll: fix possible lost wakeup on epoll_ctl() path |
On 2/10/20 4:41 AM, Roman Penyaev wrote:
> Now ep->lock is responsible for wqueue serialization, thus if ep->lock
> is taken on write path, wake_up_locked() can be invoked.
>
> Though, read path is different. Since concurrent cpus can enter the
> wake up function it needs to be internally serialized, thus wake_up()
> variant is used which implies internal spin lock.
>
> Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
> Cc: Max Neunhoeffer <max@arangodb.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io>
> Cc: Davidlohr Bueso <dbueso@suse.de>
> Cc: Jason Baron <jbaron@akamai.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> Nothing interesting in v2:
>    changed the comment a bit
>
>  fs/eventpoll.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index eee3c92a9ebf..6e218234bd4a 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1173,7 +1173,7 @@ static inline bool chain_epi_lockless(struct epitem *epi)
>   * Another thing worth to mention is that ep_poll_callback() can be called
>   * concurrently for the same @epi from different CPUs if poll table was inited
>   * with several wait queues entries. Plural wakeup from different CPUs of a
> - * single wait queue is serialized by wq.lock, but the case when multiple wait
> + * single wait queue is serialized by ep->lock, but the case when multiple wait
>   * queues are used should be detected accordingly. This is detected using
>   * cmpxchg() operation.
>   */
> @@ -1248,6 +1248,12 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
>  				break;
>  			}
>  		}
> +		/*
> +		 * Since here we have the read lock (ep->lock) taken, plural
> +		 * wakeup from different CPUs can occur, thus we call wake_up()
> +		 * variant which implies its own lock on wqueue. All other paths
> +		 * take write lock.
> +		 */
>  		wake_up(&ep->wq);
>  	}
>  	if (waitqueue_active(&ep->poll_wait))
> @@ -1551,7 +1557,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
>
>  		/* Notify waiting tasks that events are available */
>  		if (waitqueue_active(&ep->wq))
> -			wake_up(&ep->wq);
> +			wake_up_locked(&ep->wq);

So I think this will now hit the 'lockdep_assert_held()' in
__wake_up_common()? I agree that its correct, but I think it will
confuse lockdep here...

Thanks,

-Jason
On 2020-02-10 19:16, Jason Baron wrote:
> On 2/10/20 4:41 AM, Roman Penyaev wrote:
>> Now ep->lock is responsible for wqueue serialization, thus if ep->lock
>> is taken on write path, wake_up_locked() can be invoked.
>>
>> Though, read path is different. Since concurrent cpus can enter the
>> wake up function it needs to be internally serialized, thus wake_up()
>> variant is used which implies internal spin lock.
>>
>> Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
>> Cc: Max Neunhoeffer <max@arangodb.com>
>> Cc: Jakub Kicinski <kuba@kernel.org>
>> Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io>
>> Cc: Davidlohr Bueso <dbueso@suse.de>
>> Cc: Jason Baron <jbaron@akamai.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: linux-fsdevel@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> ---
>> Nothing interesting in v2:
>>    changed the comment a bit
>>
>>  fs/eventpoll.c | 12 +++++++++---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>> index eee3c92a9ebf..6e218234bd4a 100644
>> --- a/fs/eventpoll.c
>> +++ b/fs/eventpoll.c
>> @@ -1173,7 +1173,7 @@ static inline bool chain_epi_lockless(struct epitem *epi)
>>   * Another thing worth to mention is that ep_poll_callback() can be called
>>   * concurrently for the same @epi from different CPUs if poll table was inited
>>   * with several wait queues entries. Plural wakeup from different CPUs of a
>> - * single wait queue is serialized by wq.lock, but the case when multiple wait
>> + * single wait queue is serialized by ep->lock, but the case when multiple wait
>>   * queues are used should be detected accordingly. This is detected using
>>   * cmpxchg() operation.
>>   */
>> @@ -1248,6 +1248,12 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
>>  				break;
>>  			}
>>  		}
>> +		/*
>> +		 * Since here we have the read lock (ep->lock) taken, plural
>> +		 * wakeup from different CPUs can occur, thus we call wake_up()
>> +		 * variant which implies its own lock on wqueue. All other paths
>> +		 * take write lock.
>> +		 */
>>  		wake_up(&ep->wq);
>>  	}
>>  	if (waitqueue_active(&ep->poll_wait))
>> @@ -1551,7 +1557,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
>>
>>  		/* Notify waiting tasks that events are available */
>>  		if (waitqueue_active(&ep->wq))
>> -			wake_up(&ep->wq);
>> +			wake_up_locked(&ep->wq);
>
> So I think this will now hit the 'lockdep_assert_held()' in
> __wake_up_common()? I agree that its correct, but I think it will
> confuse lockdep here...

Argh! True. And I do not see any neat way to shut up lockdep here
(Calling lock_acquire() manually seems not an option for such minor
thing).

Then this optimization is not needed, patch is cancelled. Thanks for
noting that.

--
Roman
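For reference, the check Jason points at sits at the top of __wake_up_common() in kernel/sched/wait.c. The excerpt below is an abridged sketch from kernels of roughly that era (the exact code may differ); it shows why the proposed wake_up_locked() calls in ep_insert()/ep_modify() would trip lockdep when only ep->lock, rather than ep->wq.lock, is held:

```c
/* Abridged sketch of kernel/sched/wait.c (circa v5.x); not the exact source. */
static int __wake_up_common(struct wait_queue_head *wq_head, unsigned int mode,
			    int nr_exclusive, int wake_flags, void *key,
			    wait_queue_entry_t *bookmark)
{
	/*
	 * The *_locked() wakeup variants assume the wait queue's own
	 * spinlock is held.  With CONFIG_LOCKDEP this assertion fires when
	 * the caller holds only ep->lock (an rwlock_t) instead of
	 * ep->wq.lock, even though the wakeup itself is serialized.
	 */
	lockdep_assert_held(&wq_head->lock);

	/* ... walk wq_head->head and call each entry's ->func() ... */
	return nr_exclusive;
}
```

So the optimization is logically safe under ep->lock held for write, but lockdep has no way to know that, which is why the patch was withdrawn.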
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index eee3c92a9ebf..6e218234bd4a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1173,7 +1173,7 @@ static inline bool chain_epi_lockless(struct epitem *epi)
  * Another thing worth to mention is that ep_poll_callback() can be called
  * concurrently for the same @epi from different CPUs if poll table was inited
  * with several wait queues entries. Plural wakeup from different CPUs of a
- * single wait queue is serialized by wq.lock, but the case when multiple wait
+ * single wait queue is serialized by ep->lock, but the case when multiple wait
  * queues are used should be detected accordingly. This is detected using
  * cmpxchg() operation.
  */
@@ -1248,6 +1248,12 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 				break;
 			}
 		}
+		/*
+		 * Since here we have the read lock (ep->lock) taken, plural
+		 * wakeup from different CPUs can occur, thus we call wake_up()
+		 * variant which implies its own lock on wqueue. All other paths
+		 * take write lock.
+		 */
 		wake_up(&ep->wq);
 	}
 	if (waitqueue_active(&ep->poll_wait))
@@ -1551,7 +1557,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
-			wake_up(&ep->wq);
+			wake_up_locked(&ep->wq);
 		if (waitqueue_active(&ep->poll_wait))
 			pwake++;
 	}
@@ -1657,7 +1663,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
-			wake_up(&ep->wq);
+			wake_up_locked(&ep->wq);
 		if (waitqueue_active(&ep->poll_wait))
 			pwake++;
 	}
Now ep->lock is responsible for wqueue serialization, thus if ep->lock
is taken on write path, wake_up_locked() can be invoked.

Though, read path is different. Since concurrent cpus can enter the
wake up function it needs to be internally serialized, thus wake_up()
variant is used which implies internal spin lock.

Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Max Neunhoeffer <max@arangodb.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Nothing interesting in v2:
   changed the comment a bit

 fs/eventpoll.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
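To make the serialization model the commit message describes concrete, here is a minimal sketch. It is not the real fs/eventpoll.c code; the struct, function names, and elided bodies are illustrative assumptions, and, per the discussion above, the wake_up_locked() optimization was ultimately dropped because of the lockdep assertion:

```c
#include <linux/wait.h>
#include <linux/spinlock.h>

/*
 * Minimal stand-in for the two fields of fs/eventpoll.c's private
 * struct eventpoll that matter here (illustrative, not the real layout).
 */
struct eventpoll_sketch {
	rwlock_t lock;			/* "ep->lock" in the patch */
	wait_queue_head_t wq;		/* waiters blocked in epoll_wait() */
};

/*
 * Write path (epoll_ctl() insert/modify): ep->lock is taken exclusively,
 * so no other waker can run concurrently and the locked variant suffices.
 */
static void write_path_sketch(struct eventpoll_sketch *ep)
{
	write_lock_irq(&ep->lock);
	/* ... update the interest list / ready list ... */
	if (waitqueue_active(&ep->wq))
		wake_up_locked(&ep->wq);	/* the variant the patch proposed */
	write_unlock_irq(&ep->lock);
}

/*
 * Read path (ep_poll_callback()): several CPUs may hold ep->lock for read
 * at once, so the wakeup must serialize on the wait queue's own spinlock.
 */
static void read_path_sketch(struct eventpoll_sketch *ep)
{
	unsigned long flags;

	read_lock_irqsave(&ep->lock, flags);
	/* ... queue the ready epitem ... */
	if (waitqueue_active(&ep->wq))
		wake_up(&ep->wq);		/* takes ep->wq.lock internally */
	read_unlock_irqrestore(&ep->lock, flags);
}
```

The subtlety Jason raised is that wake_up_locked() asserts that ep->wq.lock itself is held, not merely that wakers are mutually excluded by some other lock, which is why the write-path optimization was withdrawn.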