Message ID | 20240604222652.2370998-3-paulmck@kernel.org (mailing list archive)
---|---
State | Accepted
Commit | dec56ca5f1c3448a04e2366d38487dd5c23d5205
Series | Grace-period memory-barrier adjustments for v6.11
On Wed, Jun 5, 2024 at 3:58 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> From: Frederic Weisbecker <frederic@kernel.org>
>
> When the grace period kthread checks the extended quiescent state
> counter of a CPU, full ordering is necessary to ensure that either:
>
> * If the GP kthread observes the remote target in an extended quiescent
>   state, then that target must observe all accesses prior to the current
>   grace period, including the current grace period sequence number, once
>   it exits that extended quiescent state.
>
> or:
>
> * If the GP kthread observes the remote target NOT in an extended
>   quiescent state, then the target further entering in an extended
>   quiescent state must observe all accesses prior to the current
>   grace period, including the current grace period sequence number, once
>   it enters that extended quiescent state.
>
> This ordering is enforced through a full memory barrier placed right
> before taking the first EQS snapshot. However this is superfluous
> because the snapshot is taken while holding the target's rnp lock which
> provides the necessary ordering through its chain of
> smp_mb__after_unlock_lock().
>
> Remove the needless explicit barrier before the snapshot and put a
> comment about the implicit barrier newly relied upon here.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>  kernel/rcu/tree_exp.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index 8a1d9c8bd9f74..bec24ea6777e8 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -357,7 +357,13 @@ static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
>                     !(rnp->qsmaskinitnext & mask)) {
>                         mask_ofl_test |= mask;
>                 } else {
> -                       snap = rcu_dynticks_snap(cpu);
> +                       /*
> +                        * Full ordering against accesses prior current GP and
> +                        * also against current GP sequence number is enforced
> +                        * by current rnp locking with chained
> +                        * smp_mb__after_unlock_lock().

Again, worth mentioning the chaining sites sync_exp_reset_tree() and
this function?

Thanks
Neeraj

> +                        */
> +                       snap = ct_dynticks_cpu_acquire(cpu);
>                         if (rcu_dynticks_in_eqs(snap))
>                                 mask_ofl_test |= mask;
>                         else
> --
> 2.40.1
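The either/or requirement in the quoted changelog can be modeled in miniature. What follows is a hedged userspace sketch in C11 atomics; the names (eqs_counter, gp_seq, snapshot_sees_eqs) and the even-value-means-EQS encoding are invented for illustration and are not the kernel's actual context-tracking API:

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Invented stand-ins: an even counter value means the CPU is in
     * an extended quiescent state (EQS). */
    static atomic_int eqs_counter;   /* starts even: CPU begins in EQS */
    static atomic_int gp_seq;        /* grace-period sequence number   */

    /* Remote CPU side: every EQS transition is a fully ordered RMW
     * (C11 atomic_fetch_add defaults to seq_cst), mirroring the fully
     * ordered atomic the kernel uses on its counter. */
    static void eqs_exit(void)  { atomic_fetch_add(&eqs_counter, 1); }
    static void eqs_enter(void) { atomic_fetch_add(&eqs_counter, 1); }

    /* GP kthread side: publish the new GP number, then snapshot the
     * counter.  The seq_cst fence models the full ordering that, per
     * this patch, the rnp lock chain provides instead of smp_mb().
     * Either the snapshot sees an even value (CPU in EQS) and the
     * CPU's fully ordered eqs_exit() then guarantees it observes the
     * new gp_seq, or the CPU is active and its next eqs_enter()
     * guarantees the same on entry. */
    static bool snapshot_sees_eqs(void)
    {
            atomic_fetch_add(&gp_seq, 1);               /* start new GP */
            atomic_thread_fence(memory_order_seq_cst);  /* full barrier */
            return (atomic_load(&eqs_counter) & 1) == 0;
    }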
On Wed, Jun 12, 2024 at 02:14:14PM +0530, Neeraj upadhyay wrote:
> On Wed, Jun 5, 2024 at 3:58 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > From: Frederic Weisbecker <frederic@kernel.org>

[ . . . ]

> > +                       /*
> > +                        * Full ordering against accesses prior current GP and
> > +                        * also against current GP sequence number is enforced
> > +                        * by current rnp locking with chained
> > +                        * smp_mb__after_unlock_lock().
>
> Again, worth mentioning the chaining sites sync_exp_reset_tree() and
> this function?

It might well be in both cases.  Could you and Frederic propose
agreed-upon appropriate changes (including the null change, if
appropriate)?

							Thanx, Paul
On Wed, Jun 12, 2024 at 02:14:14PM +0530, Neeraj upadhyay wrote:
> On Wed, Jun 5, 2024 at 3:58 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > From: Frederic Weisbecker <frederic@kernel.org>

[ . . . ]

> > +                       /*
> > +                        * Full ordering against accesses prior current GP and
> > +                        * also against current GP sequence number is enforced
> > +                        * by current rnp locking with chained
> > +                        * smp_mb__after_unlock_lock().
>
> Again, worth mentioning the chaining sites sync_exp_reset_tree() and
> this function?

How about this?

	/*
	 * Full ordering against accesses prior current GP and also against
	 * current GP sequence number is enforced by rcu_seq_start() implicit
	 * barrier, relayed by kworkers locking and even further by
	 * smp_mb__after_unlock_lock() barriers chained all the way throughout
	 * the rnp locking tree since sync_exp_reset_tree() and up to the current
	 * leaf rnp locking.
	 */

Thanks.
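The proposed comment leans on the full barrier inside rcu_seq_start(). For reference, here is a paraphrased sketch of that helper (based on kernel/rcu/rcu.h; the exact body may differ between kernel versions):

    static inline void rcu_seq_start(unsigned long *sp)
    {
            WRITE_ONCE(*sp, *sp + 1);
            smp_mb(); /* Order GP-number update before later GP work. */
            WARN_ON_ONCE(rcu_seq_state(*sp) != 1);
    }

The smp_mb() here orders the grace-period sequence-number update before everything the expedited machinery does afterward, which is the starting link of the chain the comment describes.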
On Wed, Jun 26, 2024 at 7:58 PM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Wed, Jun 12, 2024 at 02:14:14PM +0530, Neeraj upadhyay wrote:

[ . . . ]

> > Again, worth mentioning the chaining sites sync_exp_reset_tree() and
> > this function?
>
> How about this?
>

Looks good to me, thanks!

- Neeraj

> /*
>  * Full ordering against accesses prior current GP and also against
>  * current GP sequence number is enforced by rcu_seq_start() implicit
>  * barrier, relayed by kworkers locking and even further by
>  * smp_mb__after_unlock_lock() barriers chained all the way throughout
>  * the rnp locking tree since sync_exp_reset_tree() and up to the current
>  * leaf rnp locking.
>  */
>
> Thanks.
On Wed, Jun 26, 2024 at 10:49:58PM +0530, Neeraj upadhyay wrote:
> On Wed, Jun 26, 2024 at 7:58 PM Frederic Weisbecker <frederic@kernel.org> wrote:

[ . . . ]

> > How about this?
>
> Looks good to me, thanks!

And similar to the previous one, a last minute edition:

	/*
	 * Full ordering between remote CPU's post idle accesses
	 * and updater's accesses prior to current GP (and also
	 * the started GP sequence number) is enforced by
	 * rcu_seq_start() implicit barrier, relayed by kworkers
	 * locking and even further by smp_mb__after_unlock_lock()
	 * barriers chained all the way throughout the rnp locking
	 * tree since sync_exp_reset_tree() and up to the current
	 * leaf rnp locking.
	 *
	 * Ordering between remote CPU's pre idle accesses and
	 * post grace period updater's accesses is enforced by the
	 * below acquire semantic.
	 */

Still ok?

Thanks.
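The "chained smp_mb__after_unlock_lock()" half of this comment relies on RCU's rcu_node locking wrappers, which follow every lock acquisition with that barrier so an unlock on one CPU followed by a lock on another behaves as a full barrier. Here is a paraphrased sketch of the pattern (the real macros live in kernel/rcu/rcu.h and also handle lockdep and irq-flag variants):

    /* smp_mb__after_unlock_lock() is a no-op on architectures where
     * unlock+lock already implies a full barrier, and a full smp_mb()
     * on those (such as powerpc) where it does not. */
    #define raw_spin_lock_rcu_node(p)                                \
    do {                                                             \
            raw_spin_lock(&ACCESS_PRIVATE(p, lock));                 \
            smp_mb__after_unlock_lock();                             \
    } while (0)

Since sync_exp_reset_tree() takes every rnp lock through such wrappers, and __sync_rcu_exp_select_node_cpus() takes the leaf rnp lock the same way before the snapshot, accesses prior to the grace period are fully ordered before ct_dynticks_cpu_acquire().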
On Thu, Jun 27, 2024 at 3:42 AM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Wed, Jun 26, 2024 at 10:49:58PM +0530, Neeraj upadhyay wrote:

[ . . . ]

> And similar to the previous one, a last minute edition:
>
> /*
>  * Full ordering between remote CPU's post idle accesses
>  * and updater's accesses prior to current GP (and also
>  * the started GP sequence number) is enforced by
>  * rcu_seq_start() implicit barrier, relayed by kworkers
>  * locking and even further by smp_mb__after_unlock_lock()
>  * barriers chained all the way throughout the rnp locking
>  * tree since sync_exp_reset_tree() and up to the current
>  * leaf rnp locking.
>  *
>  * Ordering between remote CPU's pre idle accesses and
>  * post grace period updater's accesses is enforced by the
>  * below acquire semantic.
>  */
>
> Still ok?
>

Yes, looks good, thanks.

Thanks
Neeraj

> Thanks.
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 8a1d9c8bd9f74..bec24ea6777e8 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -357,7 +357,13 @@ static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
 		    !(rnp->qsmaskinitnext & mask)) {
 			mask_ofl_test |= mask;
 		} else {
-			snap = rcu_dynticks_snap(cpu);
+			/*
+			 * Full ordering against accesses prior current GP and
+			 * also against current GP sequence number is enforced
+			 * by current rnp locking with chained
+			 * smp_mb__after_unlock_lock().
+			 */
+			snap = ct_dynticks_cpu_acquire(cpu);
 			if (rcu_dynticks_in_eqs(snap))
 				mask_ofl_test |= mask;
 			else
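The acquire half of the final agreed-upon comment pairs the snapshot with the remote CPU's ordered EQS entry. Below is a minimal, self-contained userspace analogue in C11 atomics; all names here are invented for the sketch, and the kernel uses a fully ordered atomic rather than a bare release store on the CPU side:

    #include <stdatomic.h>
    #include <stdbool.h>

    static atomic_int cpu_in_eqs;   /* nonzero once the CPU enters EQS */
    static int pre_idle_data;       /* the CPU's pre-idle plain access */

    /* Remote CPU: pre-idle accesses, then an EQS entry whose release
     * ordering publishes those accesses to any acquire reader that
     * observes the new state. */
    static void cpu_enter_eqs(void)
    {
            pre_idle_data = 42;
            atomic_store_explicit(&cpu_in_eqs, 1, memory_order_release);
    }

    /* GP kthread: an acquire snapshot standing in for
     * ct_dynticks_cpu_acquire().  Observing the CPU in EQS guarantees
     * visibility of all of its pre-idle accesses, which is the
     * "below acquire semantic" case in the final comment. */
    static bool updater_sees_pre_idle_accesses(void)
    {
            if (atomic_load_explicit(&cpu_in_eqs, memory_order_acquire))
                    return pre_idle_data == 42;   /* guaranteed to hold */
            return false;                         /* CPU not yet in EQS */
    }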