diff mbox series

[1/6] rcu: Remove full ordering on second EQS snapshot

Message ID 20240604222652.2370998-1-paulmck@kernel.org (mailing list archive)
State Accepted
Commit da979d0162fc60a1d6490aa0d6e8d0ac737353fc
Headers show
Series Grace-period memory-barrier adjustments for v6.11 | expand

Commit Message

Paul E. McKenney June 4, 2024, 10:26 p.m. UTC
From: Frederic Weisbecker <frederic@kernel.org>

When the grace period kthread checks the extended quiescent state
counter of a CPU, full ordering is necessary to ensure that either:

* If the GP kthread observes the remote target in an extended quiescent
  state, then that target must observe all accesses prior to the current
  grace period, including the current grace period sequence number, once
  it exits that extended quiescent state. Also, the GP kthread must
  observe all accesses performed by the target prior to its entering
  that EQS.

or:

* If the GP kthread observes the remote target NOT in an extended
  quiescent state, then when the target later enters an extended
  quiescent state, it must observe all accesses prior to the current
  grace period, including the current grace period sequence number, once
  it enters that extended quiescent state. Also, when the GP kthread
  later observes that EQS, it must observe all accesses performed by
  the target prior to its entering that EQS.

This ordering is explicitly enforced on both the first EQS snapshot
and the second one through the combination of a preceding full barrier
followed by an acquire read. However, the second snapshot's full memory
barrier is redundant, as it is not needed to enforce the above
guarantees:

    GP kthread                  Remote target
    ----                        -----
    // Access prior GP
    WRITE_ONCE(A, 1)
    // first snapshot
    smp_mb()
    x = smp_load_acquire(EQS)
                               // Access prior GP
                               WRITE_ONCE(B, 1)
                               // EQS enter
                               // implied full barrier by atomic_add_return()
                               atomic_add_return(RCU_DYNTICKS_IDX, EQS)
                               // implied full barrier by atomic_add_return()
                               READ_ONCE(A)
    // second snapshot
    y = smp_load_acquire(EQS)
    z = READ_ONCE(B)

If the GP kthread above fails to observe the remote target in EQS
(x not in EQS), the remote target will observe A == 1 after it
subsequently enters EQS. The second snapshot taken by the GP kthread
then only needs to be an acquire read in order to observe z == 1.

Therefore remove the needless full memory barrier on the second snapshot.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Frederic Weisbecker June 5, 2024, 12:21 p.m. UTC | #1
Le Tue, Jun 04, 2024 at 03:26:47PM -0700, Paul E. McKenney a écrit :
> [...]
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 28c7031711a3f..f07b8bff4621b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -321,7 +321,7 @@ static bool rcu_dynticks_in_eqs(int snap)
>   */
>  static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
>  {
> -	return snap != rcu_dynticks_snap(rdp->cpu);
> +	return snap != ct_dynticks_cpu_acquire(rdp->cpu);

I guess I'm going to add a comment here to elaborate on the fact that
it relies on the ordering enforced before the first snapshot. Would
you prefer a delta patch or an updated patch?

Thanks.

>  }
>  
Paul E. McKenney June 5, 2024, 6:44 p.m. UTC | #2
On Wed, Jun 05, 2024 at 02:21:13PM +0200, Frederic Weisbecker wrote:
> Le Tue, Jun 04, 2024 at 03:26:47PM -0700, Paul E. McKenney a écrit :
> > [...]
> 
> I guess I'm going to add a comment here to elaborate on the fact
> it relies on the ordering enforced before the first snapshot. Would
> you prefer a delta patch or an updated patch?

Either works, just tell me which you are doing when you submit the patch.
Either way, I will arrange for there to be a single combined commit.

							Thanx, Paul
Frederic Weisbecker June 26, 2024, 3:03 p.m. UTC | #3
Le Wed, Jun 05, 2024 at 11:44:42AM -0700, Paul E. McKenney a écrit :
> On Wed, Jun 05, 2024 at 02:21:13PM +0200, Frederic Weisbecker wrote:
> > Le Tue, Jun 04, 2024 at 03:26:47PM -0700, Paul E. McKenney a écrit :
> > > [...]
> > 
> > I guess I'm going to add a comment here to elaborate on the fact
> > it relies on the ordering enforced before the first snapshot. Would
> > you prefer a delta patch or an updated patch?
> 
> Either works, just tell me which you are doing when you submit the patch.
> Either way, I will arrange for there to be a single combined commit.

Ok, before I resend, how does the following comment look?

/*
 * The first failing snapshot is already ordered against the accesses
 * performed by the remote CPU after it exiting idle.
 *
 * The second snapshot therefore only needs to order against accesses
 * performed by the remote CPU prior it entering idle and therefore can
 * solely on acquire semantics.
 */

Thanks.

Paul E. McKenney June 26, 2024, 3:32 p.m. UTC | #4
On Wed, Jun 26, 2024 at 05:03:02PM +0200, Frederic Weisbecker wrote:
> Le Wed, Jun 05, 2024 at 11:44:42AM -0700, Paul E. McKenney a écrit :
> > On Wed, Jun 05, 2024 at 02:21:13PM +0200, Frederic Weisbecker wrote:
> > > Le Tue, Jun 04, 2024 at 03:26:47PM -0700, Paul E. McKenney a écrit :
> > > > [...]
> > > 
> > > I guess I'm going to add a comment here to elaborate on the fact
> > > it relies on the ordering enforced before the first snapshot. Would
> > > you prefer a delta patch or an updated patch?
> > 
> > Either works, just tell me which you are doing when you submit the patch.
> > Either way, I will arrange for there to be a single combined commit.
> 
> Ok, before I resend, how does the following comment look?
> 
> /*
>  * The first failing snapshot is already ordered against the accesses
>  * performed by the remote CPU after it exiting idle.

s/exiting/exits/

>  * The second snapshot therefore only needs to order against accesses
>  * performed by the remote CPU prior it entering idle and therefore can
>  * solely on acquire semantics.
>  */

s/prior it entering/prior to entering/
s/solely/rely solely/

Other than those nits, looks good to me!  

								Thanx, Paul
Frederic Weisbecker June 26, 2024, 3:44 p.m. UTC | #5
Le Wed, Jun 26, 2024 at 08:32:45AM -0700, Paul E. McKenney a écrit :
> On Wed, Jun 26, 2024 at 05:03:02PM +0200, Frederic Weisbecker wrote:
> > Le Wed, Jun 05, 2024 at 11:44:42AM -0700, Paul E. McKenney a écrit :
> > > On Wed, Jun 05, 2024 at 02:21:13PM +0200, Frederic Weisbecker wrote:
> > > > Le Tue, Jun 04, 2024 at 03:26:47PM -0700, Paul E. McKenney a écrit :
> > > > > From: Frederic Weisbecker <frederic@kernel.org>
> > > > > 
> > > > > When the grace period kthread checks the extended quiescent state
> > > > > counter of a CPU, full ordering is necessary to ensure that either:
> > > > > 
> > > > > * If the GP kthread observes the remote target in an extended quiescent
> > > > >   state, then that target must observe all accesses prior to the current
> > > > >   grace period, including the current grace period sequence number, once
> > > > >   it exits that extended quiescent state. Also the GP kthread must
> > > > >   observe all accesses performed by the target prior it entering in
> > > > >   EQS.
> > > > > 
> > > > > or:
> > > > > 
> > > > > * If the GP kthread observes the remote target NOT in an extended
> > > > >   quiescent state, then the target further entering in an extended
> > > > >   quiescent state must observe all accesses prior to the current
> > > > >   grace period, including the current grace period sequence number, once
> > > > >   it enters that extended quiescent state. Also the GP kthread later
> > > > >   observing that EQS must also observe all accesses performed by the
> > > > >   target prior it entering in EQS.
> > > > > 
> > > > > This ordering is explicitly performed both on the first EQS snapshot
> > > > > and on the second one as well through the combination of a preceding
> > > > > full barrier followed by an acquire read. However the second snapshot's
> > > > > full memory barrier is redundant and not needed to enforce the above
> > > > > guarantees:
> > > > > 
> > > > >     GP kthread                  Remote target
> > > > >     ----                        -----
> > > > >     // Access prior GP
> > > > >     WRITE_ONCE(A, 1)
> > > > >     // first snapshot
> > > > >     smp_mb()
> > > > >     x = smp_load_acquire(EQS)
> > > > >                                // Access prior GP
> > > > >                                WRITE_ONCE(B, 1)
> > > > >                                // EQS enter
> > > > >                                // implied full barrier by atomic_add_return()
> > > > >                                atomic_add_return(RCU_DYNTICKS_IDX, EQS)
> > > > >                                // implied full barrier by atomic_add_return()
> > > > >                                READ_ONCE(A)
> > > > >     // second snapshot
> > > > >     y = smp_load_acquire(EQS)
> > > > >     z = READ_ONCE(B)
> > > > > 
> > > > > If the GP kthread above fails to observe the remote target in EQS
> > > > > (x not in EQS), the remote target will observe A == 1 after further
> > > > > entering in EQS. Then the second snapshot taken by the GP kthread only
> > > > > need to be an acquire read in order to observe z == 1.
> > > > > 
> > > > > Therefore remove the needless full memory barrier on second snapshot.
> > > > > 
> > > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > ---
> > > > >  kernel/rcu/tree.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index 28c7031711a3f..f07b8bff4621b 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -321,7 +321,7 @@ static bool rcu_dynticks_in_eqs(int snap)
> > > > >   */
> > > > >  static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
> > > > >  {
> > > > > -	return snap != rcu_dynticks_snap(rdp->cpu);
> > > > > +	return snap != ct_dynticks_cpu_acquire(rdp->cpu);
> > > > 
> > > > I guess I'm going to add a comment here to elaborate on the fact
> > > > it relies on the ordering enforced before the first snapshot. Would
> > > > you prefer a delta patch or an updated patch?
> > > 
> > > Either works, just tell me which you are doing when you submit the patch.
> > > Either way, I will arrange for there to be a single combined commit.
> > 
> > Ok, before I resend, how does the following comment look?
> > 
> > /*
> >  * The first failing snapshot is already ordered against the accesses
> >  * performed by the remote CPU after it exiting idle.
> 
> s/exiting/exits/
> 
> >  * The second snapshot therefore only needs to order against accesses
> >  * performed by the remote CPU prior it entering idle and therefore can
> >  * solely on acquire semantics.
> >  */
> 
> s/prior it entering/prior to entering/
> s/solely/rely solely/
> 
> Other than those nits, looks good to me!

Thanks a lot!

I'll resend with these changes.


Patch

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 28c7031711a3f..f07b8bff4621b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -321,7 +321,7 @@  static bool rcu_dynticks_in_eqs(int snap)
  */
 static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
 {
-	return snap != rcu_dynticks_snap(rdp->cpu);
+	return snap != ct_dynticks_cpu_acquire(rdp->cpu);
 }
 
 /*