Message ID | 20240104162510.72773-1-urezki@gmail.com (mailing list archive) |
---|---|
Series | Reduce synchronize_rcu() latency(v4) |
On Thu, Jan 04, 2024 at 05:25:06PM +0100, Uladzislau Rezki (Sony) wrote:
> This is v4 of a series that aims to improve the synchronize_rcu() call.
> To be more specific, it is about reducing the waiting time (especially
> in the worst cases) of a caller that blocks until a grace period has
> elapsed.
>
> In general, this series separates synchronize_rcu() callers from other
> callbacks. We keep a dedicated, independent queue, so its processing
> starts as soon as a grace period is over, with no need to wait until
> other callbacks are processed one by one. Please note that the number
> of callbacks can be 10K, 20K, 60K and so on. That is why this series
> maintains a separate track for this call, which blocks a context.

And before I forget (again), a possible follow-on to this work is to
reduce cond_synchronize_rcu() and cond_synchronize_rcu_full() latency.
Right now, these wait for a full additional grace period (and maybe
more) when the required grace period has not elapsed. In contrast,
this work might enable waiting only for the needed portion of a grace
period to elapse.

Thanx, Paul

> v3 -> v4:
> - Squash patches;
> - Add more description;
> - Fix comments based on v3 feedback.
>
> v3: https://lore.kernel.org/lkml/cd45b0b5-f86b-43fb-a5f3-47d340cd4f9f@paulmck-laptop/T/
> v2: https://lore.kernel.org/all/20231030131254.488186-1-urezki@gmail.com/T/
> v1: https://lore.kernel.org/lkml/20231025140915.590390-1-urezki@gmail.com/T/
>
> Neeraj Upadhyay (1):
>   rcu: Improve handling of synchronize_rcu() users
>
> Uladzislau Rezki (Sony) (3):
>   rcu: Reduce synchronize_rcu() latency
>   rcu: Add a trace event for synchronize_rcu_normal()
>   rcu: Support direct wake-up of synchronize_rcu() users
>
>  .../admin-guide/kernel-parameters.txt         |  14 +
>  include/trace/events/rcu.h                    |  27 ++
>  kernel/rcu/Kconfig.debug                      |  12 +
>  kernel/rcu/tree.c                             | 361 +++++++++++++++++-
>  kernel/rcu/tree.h                             |  19 +
>  kernel/rcu/tree_exp.h                         |   2 +-
>  6 files changed, 433 insertions(+), 2 deletions(-)
>
> --
> 2.39.2

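To illustrate the queueing argument in the cover letter, here is a toy user-space model in C. It is not kernel code and does not reflect how the series is implemented: every name in it (toy_cb, toy_stats, toy_shared_delay(), toy_dedicated_delay()) is hypothetical, and it assumes, as a simplification, that the wake-up of a blocked caller is just one more callback and that a dedicated waiter queue is drained before the ordinary callbacks. It only counts how many ordinary callbacks run before the waiter is released in each arrangement.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy user-space model, NOT kernel code. Every name here is hypothetical. */

typedef void (*toy_cb_fn)(void *arg);

struct toy_cb {
	toy_cb_fn fn;
	void *arg;
};

struct toy_stats {
	size_t invoked;         /* callbacks invoked so far */
	size_t invoked_at_wake; /* value of 'invoked' when the waiter woke */
	int woken;              /* set once the blocked caller is released */
};

/* Stands in for an ordinary callback, e.g. one that frees an object. */
static void toy_ordinary(void *arg)
{
	(void)arg;
}

/* Stands in for the callback that wakes a blocked synchronize_rcu() caller. */
static void toy_wake_waiter(void *arg)
{
	struct toy_stats *st = arg;

	st->woken = 1;
	st->invoked_at_wake = st->invoked;
}

/* Invoke callbacks one by one, in queue order. */
static void toy_run(const struct toy_cb *cbs, size_t n, struct toy_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		cbs[i].fn(cbs[i].arg);
		st->invoked++;
	}
}

/*
 * Shared queue: the wake-up sits behind n_ordinary other callbacks.
 * Returns how many callbacks ran before the waiter woke,
 * or (size_t)-1 if the allocation failed.
 */
size_t toy_shared_delay(size_t n_ordinary)
{
	struct toy_stats st = { 0 };
	struct toy_cb *cbs = malloc((n_ordinary + 1) * sizeof(*cbs));

	if (!cbs)
		return (size_t)-1;
	for (size_t i = 0; i < n_ordinary; i++)
		cbs[i] = (struct toy_cb){ toy_ordinary, NULL };
	cbs[n_ordinary] = (struct toy_cb){ toy_wake_waiter, &st };
	toy_run(cbs, n_ordinary + 1, &st);
	free(cbs);
	return st.invoked_at_wake;
}

/*
 * Dedicated queue: waiters live in their own list, which this model
 * drains first once the (simulated) grace period is over.
 */
size_t toy_dedicated_delay(size_t n_ordinary)
{
	struct toy_stats st = { 0 };
	struct toy_cb waiter = { toy_wake_waiter, &st };
	struct toy_cb *cbs = malloc(n_ordinary * sizeof(*cbs));

	if (n_ordinary && !cbs)
		return (size_t)-1;
	for (size_t i = 0; i < n_ordinary; i++)
		cbs[i] = (struct toy_cb){ toy_ordinary, NULL };
	toy_run(&waiter, 1, &st);       /* waiters are released first */
	toy_run(cbs, n_ordinary, &st);  /* ordinary callbacks follow */
	free(cbs);
	return st.invoked_at_wake;
}
```

In this model, the shared-queue delay grows with the number of callbacks queued ahead of the waiter, while the dedicated-queue delay does not depend on it, which is the effect the cover letter describes for backlogs of 10K to 60K callbacks.
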
On Fri, Jan 26, 2024 at 11:07:18PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 04, 2024 at 05:25:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > This is v4 of a series that aims to improve the synchronize_rcu() call.
> > To be more specific, it is about reducing the waiting time (especially
> > in the worst cases) of a caller that blocks until a grace period has
> > elapsed.
> >
> > In general, this series separates synchronize_rcu() callers from other
> > callbacks. We keep a dedicated, independent queue, so its processing
> > starts as soon as a grace period is over, with no need to wait until
> > other callbacks are processed one by one. Please note that the number
> > of callbacks can be 10K, 20K, 60K and so on. That is why this series
> > maintains a separate track for this call, which blocks a context.
>
> And before I forget (again), a possible follow-on to this work is to
> reduce cond_synchronize_rcu() and cond_synchronize_rcu_full() latency.
> Right now, these wait for a full additional grace period (and maybe
> more) when the required grace period has not elapsed. In contrast,
> this work might enable waiting only for the needed portion of a grace
> period to elapse.
>
Thanks. I see it. We probably also need to move the "sync"-related
functionality out of tree.c into a sync.c file, or something with a
similar name. IMO.

Thanks!

--
Uladzislau Rezki

On Mon, Jan 29, 2024 at 05:23:01PM +0100, Uladzislau Rezki wrote:
> On Fri, Jan 26, 2024 at 11:07:18PM -0800, Paul E. McKenney wrote:
> > On Thu, Jan 04, 2024 at 05:25:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > > This is v4 of a series that aims to improve the synchronize_rcu() call.
> > > To be more specific, it is about reducing the waiting time (especially
> > > in the worst cases) of a caller that blocks until a grace period has
> > > elapsed.
> > >
> > > In general, this series separates synchronize_rcu() callers from other
> > > callbacks. We keep a dedicated, independent queue, so its processing
> > > starts as soon as a grace period is over, with no need to wait until
> > > other callbacks are processed one by one. Please note that the number
> > > of callbacks can be 10K, 20K, 60K and so on. That is why this series
> > > maintains a separate track for this call, which blocks a context.
> >
> > And before I forget (again), a possible follow-on to this work is to
> > reduce cond_synchronize_rcu() and cond_synchronize_rcu_full() latency.
> > Right now, these wait for a full additional grace period (and maybe
> > more) when the required grace period has not elapsed. In contrast,
> > this work might enable waiting only for the needed portion of a grace
> > period to elapse.
> >
> Thanks. I see it. We probably also need to move the "sync"-related
> functionality out of tree.c into a sync.c file, or something with a
> similar name. IMO.

I would prioritize moving the kfree_rcu() code out of tree.c quite
a ways over moving out the synchronous-wait code. ;-)

Thanx, Paul

On Mon, Jan 29, 2024 at 11:43:43AM -0800, Paul E. McKenney wrote:
> On Mon, Jan 29, 2024 at 05:23:01PM +0100, Uladzislau Rezki wrote:
> > On Fri, Jan 26, 2024 at 11:07:18PM -0800, Paul E. McKenney wrote:
> > > On Thu, Jan 04, 2024 at 05:25:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > > > This is v4 of a series that aims to improve the synchronize_rcu() call.
> > > > To be more specific, it is about reducing the waiting time (especially
> > > > in the worst cases) of a caller that blocks until a grace period has
> > > > elapsed.
> > > >
> > > > In general, this series separates synchronize_rcu() callers from other
> > > > callbacks. We keep a dedicated, independent queue, so its processing
> > > > starts as soon as a grace period is over, with no need to wait until
> > > > other callbacks are processed one by one. Please note that the number
> > > > of callbacks can be 10K, 20K, 60K and so on. That is why this series
> > > > maintains a separate track for this call, which blocks a context.
> > >
> > > And before I forget (again), a possible follow-on to this work is to
> > > reduce cond_synchronize_rcu() and cond_synchronize_rcu_full() latency.
> > > Right now, these wait for a full additional grace period (and maybe
> > > more) when the required grace period has not elapsed. In contrast,
> > > this work might enable waiting only for the needed portion of a grace
> > > period to elapse.
> > >
> > Thanks. I see it. We probably also need to move the "sync"-related
> > functionality out of tree.c into a sync.c file, or something with a
> > similar name. IMO.
>
> I would prioritize moving the kfree_rcu() code out of tree.c quite
> a ways over moving out the synchronous-wait code. ;-)
>
Indeed. But I am not talking about priority :)

--
Uladzislau Rezki