Message ID | 20230329160203.191380-4-frederic@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | rcu/nocb: Shrinker related boring fixes |
On Wed, Mar 29, 2023 at 06:02:02PM +0200, Frederic Weisbecker wrote:
> The ->lazy_len is only checked locklessly. Recheck again under the
> ->nocb_lock to avoid spending more time on flushing/waking if not
> necessary. The ->lazy_len can still increment concurrently (from 1 to
> infinity) but under the ->nocb_lock we at least know for sure if there
> are lazy callbacks at all (->lazy_len > 0).
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/rcu/tree_nocb.h | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index c321fce2af8e..dfa9c10d6727 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1358,12 +1358,20 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  		if (!rcu_rdp_is_offloaded(rdp))
>  			continue;
>
> +		if (!READ_ONCE(rdp->lazy_len))
> +			continue;

Do you depend on the ordering of the above read of ->lazy_len against
anything in the following, aside from the re-read of ->lazy_len?  (Same
variable, both READ_ONCE() or stronger, so you do get that ordering.)

If you do need that ordering, the above READ_ONCE() needs to instead
be smp_load_acquire() or similar.  If you don't need that ordering,
what you have is good.

> +		rcu_nocb_lock_irqsave(rdp, flags);
> +		/*
> +		 * Recheck under the nocb lock. Since we are not holding the bypass
> +		 * lock we may still race with increments from the enqueuer but still
> +		 * we know for sure if there is at least one lazy callback.
> +		 */
>  		_count = READ_ONCE(rdp->lazy_len);
> -
> -		if (_count == 0)
> +		if (!_count) {
> +			rcu_nocb_unlock_irqrestore(rdp, flags);
>  			continue;
> -
> -		rcu_nocb_lock_irqsave(rdp, flags);
> +		}
>  		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
>  		rcu_nocb_unlock_irqrestore(rdp, flags);
>  		wake_nocb_gp(rdp, false);
> --
> 2.34.1
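To make the distinction Paul is drawing concrete, here is a minimal userspace analogue using C11 atomics, where memory_order_relaxed plays the role of READ_ONCE() and memory_order_acquire that of smp_load_acquire(). All names here are hypothetical; this is an illustration, not the kernel code:

	#include <stdatomic.h>
	#include <stdbool.h>

	static atomic_long lazy_len;	/* stands in for rdp->lazy_len */

	static bool need_flush(void)
	{
		/*
		 * Relaxed load, the analogue of READ_ONCE(). Enough when the
		 * only later access that must be ordered after it is a re-read
		 * of the *same* variable: per-location coherence already orders
		 * two reads of one location on one thread.
		 */
		if (atomic_load_explicit(&lazy_len, memory_order_relaxed) == 0)
			return false;

		/*
		 * Had the code past this point read *other* shared state and
		 * required it to be ordered after this check, the load above
		 * would have to become memory_order_acquire, the analogue of
		 * smp_load_acquire().
		 */
		return true;
	}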
On Wed, Mar 29, 2023 at 01:54:20PM -0700, Paul E. McKenney wrote:
> On Wed, Mar 29, 2023 at 06:02:02PM +0200, Frederic Weisbecker wrote:
> > The ->lazy_len is only checked locklessly. Recheck again under the
> > ->nocb_lock to avoid spending more time on flushing/waking if not
> > necessary. The ->lazy_len can still increment concurrently (from 1 to
> > infinity) but under the ->nocb_lock we at least know for sure if there
> > are lazy callbacks at all (->lazy_len > 0).
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tree_nocb.h | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index c321fce2af8e..dfa9c10d6727 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -1358,12 +1358,20 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >  		if (!rcu_rdp_is_offloaded(rdp))
> >  			continue;
> >
> > +		if (!READ_ONCE(rdp->lazy_len))
> > +			continue;
>
> Do you depend on the ordering of the above read of ->lazy_len against
> anything in the following, aside from the re-read of ->lazy_len?  (Same
> variable, both READ_ONCE() or stronger, so you do get that ordering.)
>
> If you do need that ordering, the above READ_ONCE() needs to instead
> be smp_load_acquire() or similar.  If you don't need that ordering,
> what you have is good.

No ordering dependency intended here. The early ->lazy_len read is really just
an optimization here to avoid locking if it *seems* there is nothing to do with
this rdp. But what follows doesn't depend on that read.

Thanks.
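The shape Frederic describes is the familiar optimistic-check-then-lock pattern. A self-contained userspace sketch, with hypothetical names and a pthread mutex standing in for the nocb lock:

	#include <pthread.h>
	#include <stdatomic.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static atomic_long pending;	/* stands in for rdp->lazy_len */

	static void scan_one(void)
	{
		/* Lockless fast path: purely an optimization, can only skip work. */
		if (atomic_load_explicit(&pending, memory_order_relaxed) == 0)
			return;

		pthread_mutex_lock(&lock);
		/*
		 * Recheck under the lock. Concurrent enqueuers may still bump
		 * the count, but a nonzero value observed here reliably means
		 * at least one item exists, justifying the expensive work.
		 */
		if (atomic_load_explicit(&pending, memory_order_relaxed) != 0) {
			/* ... do the flush/wake work here ... */
		}
		pthread_mutex_unlock(&lock);
	}

A false zero on the fast path merely defers the work to a later scan, and a false nonzero costs only one lock acquisition, which the locked recheck then bounds.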
On Wed, Mar 29, 2023 at 11:22:45PM +0200, Frederic Weisbecker wrote:
> On Wed, Mar 29, 2023 at 01:54:20PM -0700, Paul E. McKenney wrote:
> > On Wed, Mar 29, 2023 at 06:02:02PM +0200, Frederic Weisbecker wrote:
> > > The ->lazy_len is only checked locklessly. Recheck again under the
> > > ->nocb_lock to avoid spending more time on flushing/waking if not
> > > necessary. The ->lazy_len can still increment concurrently (from 1 to
> > > infinity) but under the ->nocb_lock we at least know for sure if there
> > > are lazy callbacks at all (->lazy_len > 0).
> > >
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > ---
> > >  kernel/rcu/tree_nocb.h | 16 ++++++++++++----
> > >  1 file changed, 12 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > > index c321fce2af8e..dfa9c10d6727 100644
> > > --- a/kernel/rcu/tree_nocb.h
> > > +++ b/kernel/rcu/tree_nocb.h
> > > @@ -1358,12 +1358,20 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> > >  		if (!rcu_rdp_is_offloaded(rdp))
> > >  			continue;
> > >
> > > +		if (!READ_ONCE(rdp->lazy_len))
> > > +			continue;
> >
> > Do you depend on the ordering of the above read of ->lazy_len against
> > anything in the following, aside from the re-read of ->lazy_len?  (Same
> > variable, both READ_ONCE() or stronger, so you do get that ordering.)
> >
> > If you do need that ordering, the above READ_ONCE() needs to instead
> > be smp_load_acquire() or similar.  If you don't need that ordering,
> > what you have is good.
>
> No ordering dependency intended here. The early ->lazy_len read is really just
> an optimization here to avoid locking if it *seems* there is nothing to do with
> this rdp. But what follows doesn't depend on that read.

Full steam ahead with READ_ONCE(), then!  ;-)

							Thanx, Paul
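Paul's parenthetical, that two READ_ONCE() loads of the same variable are ordered, can be checked mechanically. A sketch of a litmus test for the Linux-kernel memory model (runnable with herd7 against linux-kernel.cat; the name and layout are illustrative), where the "exists" outcome, seeing the new value and then the old one, is forbidden by read-read coherence:

	C lazy-len-coherence

	{}

	P0(int *lazy_len)
	{
		int r0;
		int r1;

		r0 = READ_ONCE(*lazy_len);
		r1 = READ_ONCE(*lazy_len);
	}

	P1(int *lazy_len)
	{
		WRITE_ONCE(*lazy_len, 1);
	}

	exists (0:r0=1 /\ 0:r1=0)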
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index c321fce2af8e..dfa9c10d6727 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1358,12 +1358,20 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 		if (!rcu_rdp_is_offloaded(rdp))
 			continue;
 
+		if (!READ_ONCE(rdp->lazy_len))
+			continue;
+
+		rcu_nocb_lock_irqsave(rdp, flags);
+		/*
+		 * Recheck under the nocb lock. Since we are not holding the bypass
+		 * lock we may still race with increments from the enqueuer but still
+		 * we know for sure if there is at least one lazy callback.
+		 */
 		_count = READ_ONCE(rdp->lazy_len);
-
-		if (_count == 0)
+		if (!_count) {
+			rcu_nocb_unlock_irqrestore(rdp, flags);
 			continue;
-
-		rcu_nocb_lock_irqsave(rdp, flags);
+		}
 		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
 		wake_nocb_gp(rdp, false);
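Resolving the diff markers, the loop body of lazy_rcu_shrink_scan() reads as follows with this patch applied (only the lines visible in the hunk; the enclosing per-CPU loop is not part of the quoted context):

	if (!rcu_rdp_is_offloaded(rdp))
		continue;

	if (!READ_ONCE(rdp->lazy_len))
		continue;

	rcu_nocb_lock_irqsave(rdp, flags);
	/*
	 * Recheck under the nocb lock. Since we are not holding the bypass
	 * lock we may still race with increments from the enqueuer but still
	 * we know for sure if there is at least one lazy callback.
	 */
	_count = READ_ONCE(rdp->lazy_len);
	if (!_count) {
		rcu_nocb_unlock_irqrestore(rdp, flags);
		continue;
	}
	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
	rcu_nocb_unlock_irqrestore(rdp, flags);
	wake_nocb_gp(rdp, false);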
The ->lazy_len is only checked locklessly. Recheck again under the
->nocb_lock to avoid spending more time on flushing/waking if not
necessary. The ->lazy_len can still increment concurrently (from 1 to
infinity) but under the ->nocb_lock we at least know for sure if there
are lazy callbacks at all (->lazy_len > 0).

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_nocb.h | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)