
rcutorture: Avoid problematic critical section nesting on RT

Message ID 20210817144018.nqssoq475vitrqlv@linutronix.de (mailing list archive)
State New, archived
Series rcutorture: Avoid problematic critical section nesting on RT

Commit Message

Sebastian Andrzej Siewior Aug. 17, 2021, 2:40 p.m. UTC
From: Scott Wood <swood@redhat.com>

rcutorture was generating some nesting scenarios that are not
reasonable.  Constrain the state selection to avoid them.

Example:

1. rcu_read_lock()
2. local_irq_disable()
3. rcu_read_unlock()
4. local_irq_enable()

If the thread is preempted between steps 1 and 2,
rcu_read_unlock_special.b.blocked will be set, but it won't be
acted on in step 3 because IRQs are disabled.  Thus, reporting of the
quiescent state will be delayed beyond the local_irq_enable().
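
As an illustration only (a hypothetical reader, not part of the patch),
the four steps correspond to:

    static void example_reader(void)
    {
        rcu_read_lock();        /* 1: may be preempted after this */
        local_irq_disable();    /* 2 */
        rcu_read_unlock();      /* 3: handling of the set
                                 * rcu_read_unlock_special.b.blocked
                                 * is deferred since IRQs are off */
        local_irq_enable();     /* 4: QS report delayed past here */
    }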

For now, these scenarios will continue to be tested on non-PREEMPT_RT
kernels, until debug checks are added to ensure that they are not
happening elsewhere.

Signed-off-by: Scott Wood <swood@redhat.com>
[valentin.schneider@arm.com: Don't disable BH in atomic context]
[bigeasy: remove 'preempt_disable(); local_bh_disable(); preempt_enable();
 local_bh_enable();' from the examples because this works on RT now. ]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
I folded Valentin's bits.
I removed the unbalanced preempt_disable()/migrate_disable() part from
the description because it is now supported by the migrate-disable
implementation. I didn't find it used explicitly in the code/patch
except as part of local_bh_disable().


 kernel/rcu/rcutorture.c |   94 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 80 insertions(+), 14 deletions(-)
---

Comments

Paul E. McKenney Aug. 18, 2021, 10:46 p.m. UTC | #1
On Tue, Aug 17, 2021 at 04:40:18PM +0200, Sebastian Andrzej Siewior wrote:
> From: Scott Wood <swood@redhat.com>
> 
> rcutorture was generating some nesting scenarios that are not
> reasonable.  Constrain the state selection to avoid them.
> 
> Example:
> 
> 1. rcu_read_lock()
> 2. local_irq_disable()
> 3. rcu_read_unlock()
> 4. local_irq_enable()
> 
> If the thread is preempted between steps 1 and 2,
> rcu_read_unlock_special.b.blocked will be set, but it won't be
> acted on in step 3 because IRQs are disabled.  Thus, reporting of the
> quiescent state will be delayed beyond the local_irq_enable().
> 
> For now, these scenarios will continue to be tested on non-PREEMPT_RT
> kernels, until debug checks are added to ensure that they are not
> happening elsewhere.
> 
> Signed-off-by: Scott Wood <swood@redhat.com>
> [valentin.schneider@arm.com: Don't disable BH in atomic context]
> [bigeasy: remove 'preempt_disable(); local_bh_disable(); preempt_enable();
>  local_bh_enable();' from the examples because this works on RT now. ]
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

This looks close to being ready for mainline, actually.

One comment below.

							Thanx, Paul

> ---
> I folded Valentin's bits.
> I removed the unbalanced preempt_disable()/migrate_disable() part from
> the description because it is now supported by the migrate-disable
> implementation. I didn't find it used explicitly in the code/patch
> except as part of local_bh_disable().
> 
> 
>  kernel/rcu/rcutorture.c |   94 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 80 insertions(+), 14 deletions(-)
> ---
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -61,10 +61,13 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck
>  #define RCUTORTURE_RDR_RBH	 0x08	/*  ... rcu_read_lock_bh(). */
>  #define RCUTORTURE_RDR_SCHED	 0x10	/*  ... rcu_read_lock_sched(). */
>  #define RCUTORTURE_RDR_RCU	 0x20	/*  ... entering another RCU reader. */
> -#define RCUTORTURE_RDR_NBITS	 6	/* Number of bits defined above. */
> +#define RCUTORTURE_RDR_ATOM_BH	 0x40	/*  ... disabling bh while atomic */
> +#define RCUTORTURE_RDR_ATOM_RBH	 0x80	/*  ... RBH while atomic */
> +#define RCUTORTURE_RDR_NBITS	 8	/* Number of bits defined above. */
>  #define RCUTORTURE_MAX_EXTEND	 \
>  	(RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \
> -	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED)
> +	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED | \
> +	 RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)
>  #define RCUTORTURE_RDR_MAX_LOOPS 0x7	/* Maximum reader extensions. */
>  					/* Must be power of two minus one. */
>  #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3)
> @@ -1429,31 +1432,53 @@ static void rcutorture_one_extend(int *r
>  	WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1);
>  	rtrsp->rt_readstate = newstate;
>  
> -	/* First, put new protection in place to avoid critical-section gap. */
> +	/*
> +	 * First, put new protection in place to avoid critical-section gap.
> +	 * Disable preemption around the ATOM disables to ensure that
> +	 * in_atomic() is true.
> +	 */
>  	if (statesnew & RCUTORTURE_RDR_BH)
>  		local_bh_disable();
> +	if (statesnew & RCUTORTURE_RDR_RBH)
> +		rcu_read_lock_bh();
>  	if (statesnew & RCUTORTURE_RDR_IRQ)
>  		local_irq_disable();
>  	if (statesnew & RCUTORTURE_RDR_PREEMPT)
>  		preempt_disable();
> -	if (statesnew & RCUTORTURE_RDR_RBH)
> -		rcu_read_lock_bh();
>  	if (statesnew & RCUTORTURE_RDR_SCHED)
>  		rcu_read_lock_sched();
> +	preempt_disable();
> +	if (statesnew & RCUTORTURE_RDR_ATOM_BH)
> +		local_bh_disable();
> +	if (statesnew & RCUTORTURE_RDR_ATOM_RBH)
> +		rcu_read_lock_bh();
> +	preempt_enable();
>  	if (statesnew & RCUTORTURE_RDR_RCU)
>  		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
>  
> -	/* Next, remove old protection, irq first due to bh conflict. */
> +	/*
> +	 * Next, remove old protection, in decreasing order of strength
> +	 * to avoid unlock paths that aren't safe in the stronger
> +	 * context.  Disable preemption around the ATOM enables in
> +	 * case the context was only atomic due to IRQ disabling.
> +	 */
> +	preempt_disable();
>  	if (statesold & RCUTORTURE_RDR_IRQ)
>  		local_irq_enable();
> -	if (statesold & RCUTORTURE_RDR_BH)
> +	if (statesold & RCUTORTURE_RDR_ATOM_BH)
>  		local_bh_enable();
> +	if (statesold & RCUTORTURE_RDR_ATOM_RBH)
> +		rcu_read_unlock_bh();
> +	preempt_enable();

The addition of preempt_enable() here prevents rcutorture from covering
an important part of the mainline RCU state space, namely when an RCU
read-side section ends with just local_irq_enable().  This situation
is a challenge for RCU because it must indirectly detect the end of the
critical section.
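
To illustrate (hypothetical sketches, not from the patch): rcutorture
could previously end a reader segment with IRQ enabling alone,

    static void segment_old(void)
    {
        local_irq_disable();
        /* ... reader section ... */
        local_irq_enable();     /* RCU must sense the end indirectly */
    }

whereas with the unconditional preempt_disable()/preempt_enable() pair
the segment now always ends in preempt_enable():

    static void segment_new(void)
    {
        preempt_disable();
        local_irq_disable();
        /* ... reader section ... */
        local_irq_enable();
        preempt_enable();       /* end of the section is now explicit */
    }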

Would it work for RT if the preempt_enable() and preempt_disable()
were executed only if either RT on the one hand or statesold has the
RCUTORTURE_RDR_ATOM_BH or RCUTORTURE_RDR_ATOM_RBH bit set on the other?
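
Something like the following, perhaps (an untested sketch of that
suggestion, not a tested change):

    if (IS_ENABLED(CONFIG_PREEMPT_RT) ||
        (statesold & (RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)))
            preempt_disable();
    /* ... the IRQ/ATOM_BH/ATOM_RBH enables as in the patch ... */
    if (IS_ENABLED(CONFIG_PREEMPT_RT) ||
        (statesold & (RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)))
            preempt_enable();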

>  	if (statesold & RCUTORTURE_RDR_PREEMPT)
>  		preempt_enable();
> -	if (statesold & RCUTORTURE_RDR_RBH)
> -		rcu_read_unlock_bh();
>  	if (statesold & RCUTORTURE_RDR_SCHED)
>  		rcu_read_unlock_sched();
> +	if (statesold & RCUTORTURE_RDR_BH)
> +		local_bh_enable();
> +	if (statesold & RCUTORTURE_RDR_RBH)
> +		rcu_read_unlock_bh();
> +
>  	if (statesold & RCUTORTURE_RDR_RCU) {
>  		bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);
>  
> @@ -1496,6 +1521,12 @@ rcutorture_extend_mask(int oldmask, stru
>  	int mask = rcutorture_extend_mask_max();
>  	unsigned long randmask1 = torture_random(trsp) >> 8;
>  	unsigned long randmask2 = randmask1 >> 3;
> +	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
> +	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
> +	unsigned long nonatomic_bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> +	unsigned long atomic_bhs = RCUTORTURE_RDR_ATOM_BH |
> +				   RCUTORTURE_RDR_ATOM_RBH;
> +	unsigned long tmp;
>  
>  	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
>  	/* Mostly only one bit (need preemption!), sometimes lots of bits. */
> @@ -1503,11 +1534,46 @@ rcutorture_extend_mask(int oldmask, stru
>  		mask = mask & randmask2;
>  	else
>  		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
> -	/* Can't enable bh w/irq disabled. */
> -	if ((mask & RCUTORTURE_RDR_IRQ) &&
> -	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
> -	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
> -		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> +
> +	/*
> +	 * Can't enable bh w/irq disabled.
> +	 */
> +	tmp = atomic_bhs | nonatomic_bhs;
> +	if (mask & RCUTORTURE_RDR_IRQ)
> +		mask |= oldmask & tmp;

This is more straightforward than my original, good!

> +
> +	/*
> +	 * Ideally these sequences would be detected in debug builds
> +	 * (regardless of RT), but until then don't stop testing
> +	 * them on non-RT.
> +	 */
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +		/*
> +		 * Can't disable bh in atomic context if bh was already
> +		 * disabled by another task on the same CPU. Instead of
> +		 * attempting to track this, just avoid disabling bh in atomic
> +		 * context.
> +		 */
> +		mask &= ~atomic_bhs;

At some point, we will need to test disabling bh in atomic context,
correct?  Or am I missing something here?

> +		/*
> +		 * Can't release the outermost rcu lock in an irq disabled
> +		 * section without preemption also being disabled, if irqs
> +		 * had ever been enabled during this RCU critical section
> +		 * (could leak a special flag and delay reporting the qs).
> +		 */
> +		if ((oldmask & RCUTORTURE_RDR_RCU) &&
> +		    (mask & RCUTORTURE_RDR_IRQ) &&
> +		    !(mask & preempts))
> +			mask |= RCUTORTURE_RDR_RCU;
> +
> +		/* Can't modify non-atomic bh in atomic context */
> +		tmp = nonatomic_bhs;
> +		if (oldmask & preempts_irq)
> +			mask &= ~tmp;
> +		if ((oldmask | mask) & preempts_irq)
> +			mask |= oldmask & tmp;
> +	}
> +
>  	return mask ?: RCUTORTURE_RDR_RCU;
>  }
>
Sebastian Andrzej Siewior Aug. 19, 2021, 3:35 p.m. UTC | #2
On 2021-08-18 15:46:51 [-0700], Paul E. McKenney wrote:
…
> > +	/*
> > +	 * Ideally these sequences would be detected in debug builds
> > +	 * (regardless of RT), but until then don't stop testing
> > +	 * them on non-RT.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > +		/*
> > +		 * Can't disable bh in atomic context if bh was already
> > +		 * disabled by another task on the same CPU. Instead of
> > +		 * attempting to track this, just avoid disabling bh in atomic
> > +		 * context.
> > +		 */
> > +		mask &= ~atomic_bhs;
> 
> At some point, we will need to test disabling bh in atomic context,
> correct?  Or am I missing something here?

Ideally there is no disabling of bh in atomic context (on RT). Having it
breaks some fundamental rules of how softirq handling and the bh-related
synchronisation are implemented. Given that the softirq handler is
invoked in thread context, that preemption is not disabled as part of
spin_lock() or rcu_read_lock(), and that interrupts are in general not
disabled in interrupt handlers or by spin_lock_irq(), there is close to
zero chance of disabling bh in atomic context on RT.

In reality there is (of course) something that needs to disable bh in
atomic context, and it happens only during boot-up (or from idle, unless
I'm mistaken).
It is required that the bh disable and its later enable happen in the
same kind of context; that is, if bh has been disabled in preemptible
context it must not be enabled in atomic context (and vice versa).

The bottom line is that there must not be a local_bh_disable() in atomic
context if another (preempted) task already did a local_bh_disable() on
the same CPU, like in the following scenario on one CPU:

TASK A                      TASK B
 local_bh_disable();
 … preempted
                             preempt_disable();
                             local_bh_disable();

This breaks the synchronisation that is otherwise provided by
local_bh_disable(). Without the preempt_disable(), TASK B would block
(and wait) until TASK A completes its BH section. In atomic context that
blocking is not possible, and therefore it is not allowed.
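
As a code sketch of the forbidden pattern (illustrative only; the
function name is hypothetical):

    static void task_b(void)
    {
        preempt_disable();      /* now atomic */
        local_bh_disable();     /* must not block on the bh lock, so it
                                 * would enter the bh-protected section
                                 * while the preempted TASK A is still
                                 * inside it */
        /* ... */
        local_bh_enable();
        preempt_enable();
    }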

Sebastian
Sebastian Andrzej Siewior Aug. 19, 2021, 3:39 p.m. UTC | #3
On 2021-08-18 15:46:51 [-0700], Paul E. McKenney wrote:
> > ---
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -61,10 +61,13 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck
…
> > -	/* Next, remove old protection, irq first due to bh conflict. */
> > +	/*
> > +	 * Next, remove old protection, in decreasing order of strength
> > +	 * to avoid unlock paths that aren't safe in the stronger
> > +	 * context.  Disable preemption around the ATOM enables in
> > +	 * case the context was only atomic due to IRQ disabling.
> > +	 */
> > +	preempt_disable();
> >  	if (statesold & RCUTORTURE_RDR_IRQ)
> >  		local_irq_enable();
> > -	if (statesold & RCUTORTURE_RDR_BH)
> > +	if (statesold & RCUTORTURE_RDR_ATOM_BH)
> >  		local_bh_enable();
> > +	if (statesold & RCUTORTURE_RDR_ATOM_RBH)
> > +		rcu_read_unlock_bh();
> > +	preempt_enable();
> 
> The addition of preempt_enable() here prevents rcutorture from covering
> an important part of the mainline RCU state space, namely when an RCU
> read-side section ends with just local_irq_enable().  This situation
> is a challenge for RCU because it must indirectly detect the end of the
> critical section.
> 
> Would it work for RT if the preempt_enable() and preempt_disable()
> were executed only if either RT on the one hand or statesold has the
> RCUTORTURE_RDR_ATOM_BH or RCUTORTURE_RDR_ATOM_RBH bit set on the other?

Now that I stared at it some more (and it stared briefly back at me) I
couldn't explain why we need this or that piece of the patch, so I came
up with the following, which I can explain:

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 40ef5417d9545..5c8b31b7eff03 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1432,28 +1432,34 @@ static void rcutorture_one_extend(int *readstate, int newstate,
 	/* First, put new protection in place to avoid critical-section gap. */
 	if (statesnew & RCUTORTURE_RDR_BH)
 		local_bh_disable();
+	if (statesnew & RCUTORTURE_RDR_RBH)
+		rcu_read_lock_bh();
 	if (statesnew & RCUTORTURE_RDR_IRQ)
 		local_irq_disable();
 	if (statesnew & RCUTORTURE_RDR_PREEMPT)
 		preempt_disable();
-	if (statesnew & RCUTORTURE_RDR_RBH)
-		rcu_read_lock_bh();
 	if (statesnew & RCUTORTURE_RDR_SCHED)
 		rcu_read_lock_sched();
 	if (statesnew & RCUTORTURE_RDR_RCU)
 		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
 
-	/* Next, remove old protection, irq first due to bh conflict. */
+	/*
+	 * Next, remove old protection, in decreasing order of strength
+	 * to avoid unlock paths that aren't safe in the stronger
+	 * context. Namely: BH can not be enabled with disabled interrupts.
+	 * Additionally PREEMPT_RT requires that BH is enabled in preemptible
+	 * context.
+	 */
 	if (statesold & RCUTORTURE_RDR_IRQ)
 		local_irq_enable();
-	if (statesold & RCUTORTURE_RDR_BH)
-		local_bh_enable();
 	if (statesold & RCUTORTURE_RDR_PREEMPT)
 		preempt_enable();
-	if (statesold & RCUTORTURE_RDR_RBH)
-		rcu_read_unlock_bh();
 	if (statesold & RCUTORTURE_RDR_SCHED)
 		rcu_read_unlock_sched();
+	if (statesold & RCUTORTURE_RDR_BH)
+		local_bh_enable();
+	if (statesold & RCUTORTURE_RDR_RBH)
+		rcu_read_unlock_bh();
 	if (statesold & RCUTORTURE_RDR_RCU) {
 		bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);
 
@@ -1496,6 +1502,9 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
 	int mask = rcutorture_extend_mask_max();
 	unsigned long randmask1 = torture_random(trsp) >> 8;
 	unsigned long randmask2 = randmask1 >> 3;
+	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
+	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
+	unsigned long bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
 
 	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
 	/* Mostly only one bit (need preemption!), sometimes lots of bits. */
@@ -1503,11 +1512,37 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
 		mask = mask & randmask2;
 	else
 		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
-	/* Can't enable bh w/irq disabled. */
-	if ((mask & RCUTORTURE_RDR_IRQ) &&
-	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
-	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
-		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
+
+	/*
+	 * Can't enable bh w/irq disabled.
+	 */
+	if (mask & RCUTORTURE_RDR_IRQ)
+		mask |= oldmask & bhs;
+
+	/*
+	 * Ideally these sequences would be detected in debug builds
+	 * (regardless of RT), but until then don't stop testing
+	 * them on non-RT.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		/*
+		 * Can't release the outermost rcu lock in an irq disabled
+		 * section without preemption also being disabled, if irqs
+		 * had ever been enabled during this RCU critical section
+		 * (could leak a special flag and delay reporting the qs).
+		 */
+		if ((oldmask & RCUTORTURE_RDR_RCU) &&
+		    (mask & RCUTORTURE_RDR_IRQ) &&
+		    !(mask & preempts))
+			mask |= RCUTORTURE_RDR_RCU;
+
+		/* Can't modify bh in atomic context */
+		if (oldmask & preempts_irq)
+			mask &= ~bhs;
+		if ((oldmask | mask) & preempts_irq)
+			mask |= oldmask & bhs;
+	}
+
 	return mask ?: RCUTORTURE_RDR_RCU;
 }
Sebastian Andrzej Siewior Aug. 19, 2021, 3:47 p.m. UTC | #4
On 2021-08-19 17:39:29 [+0200], To Paul E. McKenney wrote:
> up with following which I can explain:
> 
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 40ef5417d9545..5c8b31b7eff03 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -1432,28 +1432,34 @@ static void rcutorture_one_extend(int *readstate, int newstate,
>  	/* First, put new protection in place to avoid critical-section gap. */
>  	if (statesnew & RCUTORTURE_RDR_BH)
>  		local_bh_disable();
> +	if (statesnew & RCUTORTURE_RDR_RBH)
> +		rcu_read_lock_bh();
>  	if (statesnew & RCUTORTURE_RDR_IRQ)
>  		local_irq_disable();
>  	if (statesnew & RCUTORTURE_RDR_PREEMPT)
>  		preempt_disable();
> -	if (statesnew & RCUTORTURE_RDR_RBH)
> -		rcu_read_lock_bh();
>  	if (statesnew & RCUTORTURE_RDR_SCHED)
>  		rcu_read_lock_sched();
>  	if (statesnew & RCUTORTURE_RDR_RCU)
>  		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;

So the ordering in the enable and disable parts regarding BH is
important: first BH, then preemption or IRQ.
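
In other words (an illustrative sketch, not from the patch):

    static void ordered_nesting(void)
    {
        local_bh_disable();     /* taken while preemptible */
        preempt_disable();
        /* ... */
        preempt_enable();
        local_bh_enable();      /* released while preemptible */
    }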

> -	/* Next, remove old protection, irq first due to bh conflict. */
> +	/*
> +	 * Next, remove old protection, in decreasing order of strength
> +	 * to avoid unlock paths that aren't safe in the stronger
> +	 * context. Namely: BH can not be enabled with disabled interrupts.
> +	 * Additionally PREEMPT_RT requires that BH is enabled in preemptible
> +	 * context.
> +	 */
>  	if (statesold & RCUTORTURE_RDR_IRQ)
>  		local_irq_enable();
> -	if (statesold & RCUTORTURE_RDR_BH)
> -		local_bh_enable();
>  	if (statesold & RCUTORTURE_RDR_PREEMPT)
>  		preempt_enable();
> -	if (statesold & RCUTORTURE_RDR_RBH)
> -		rcu_read_unlock_bh();
>  	if (statesold & RCUTORTURE_RDR_SCHED)
>  		rcu_read_unlock_sched();
> +	if (statesold & RCUTORTURE_RDR_BH)
> +		local_bh_enable();
> +	if (statesold & RCUTORTURE_RDR_RBH)
> +		rcu_read_unlock_bh();
>  	if (statesold & RCUTORTURE_RDR_RCU) {
>  		bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);

The same applies in the unlock part, so that BH is unlocked in
preemptible context.
Now if you need bh lock/unlock in atomic context (either with disabled
IRQs or disabled preemption), then I would dig out the atomic-bh part
again and make it !RT only, without the preempt_disable() section
around it that you complained about.

> @@ -1496,6 +1502,9 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
>  	int mask = rcutorture_extend_mask_max();
>  	unsigned long randmask1 = torture_random(trsp) >> 8;
>  	unsigned long randmask2 = randmask1 >> 3;
> +	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
> +	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
> +	unsigned long bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
>  
>  	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
>  	/* Mostly only one bit (need preemption!), sometimes lots of bits. */
> @@ -1503,11 +1512,37 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
>  		mask = mask & randmask2;
>  	else
>  		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
> -	/* Can't enable bh w/irq disabled. */
> -	if ((mask & RCUTORTURE_RDR_IRQ) &&
> -	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
> -	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
> -		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> +
> +	/*
> +	 * Can't enable bh w/irq disabled.
> +	 */
> +	if (mask & RCUTORTURE_RDR_IRQ)
> +		mask |= oldmask & bhs;
> +
> +	/*
> +	 * Ideally these sequences would be detected in debug builds
> +	 * (regardless of RT), but until then don't stop testing
> +	 * them on non-RT.
> +	 */
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +		/*
> +		 * Can't release the outermost rcu lock in an irq disabled
> +		 * section without preemption also being disabled, if irqs
> +		 * had ever been enabled during this RCU critical section
> +		 * (could leak a special flag and delay reporting the qs).
> +		 */
> +		if ((oldmask & RCUTORTURE_RDR_RCU) &&
> +		    (mask & RCUTORTURE_RDR_IRQ) &&
> +		    !(mask & preempts))
> +			mask |= RCUTORTURE_RDR_RCU;

This piece above, I don't understand. I had it running for a while and
it didn't explode. Let me try TREE01 for 30min without that piece.

> +		/* Can't modify bh in atomic context */
> +		if (oldmask & preempts_irq)
> +			mask &= ~bhs;
> +		if ((oldmask | mask) & preempts_irq)
> +			mask |= oldmask & bhs;

And this is needed because we can't lock/unlock bh while atomic.
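
A worked example (hypothetical masks; the bit values are copied from
rcutorture.c, with the RCUTORTURE_ prefix shortened here):

    #include <stdio.h>

    #define RDR_BH      0x01
    #define RDR_IRQ     0x02
    #define RDR_PREEMPT 0x04
    #define RDR_RBH     0x08
    #define RDR_SCHED   0x10

    int main(void)
    {
        unsigned long preempts_irq = RDR_PREEMPT | RDR_SCHED | RDR_IRQ;
        unsigned long bhs = RDR_BH | RDR_RBH;
        unsigned long oldmask = RDR_PREEMPT | RDR_BH; /* hypothetical */
        unsigned long mask = RDR_RBH;   /* randomly chosen new state */

        if (oldmask & preempts_irq)
            mask &= ~bhs;               /* can't take bh while atomic */
        if ((oldmask | mask) & preempts_irq)
            mask |= oldmask & bhs;      /* can't drop bh while atomic */

        /* Prints 0x1: RBH was dropped, the already-held BH is kept. */
        printf("mask = %#lx\n", mask);
        return 0;
    }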

> +	}
> +
>  	return mask ?: RCUTORTURE_RDR_RCU;
>  }
>  

Sebastian
Paul E. McKenney Aug. 19, 2021, 6:20 p.m. UTC | #5
On Thu, Aug 19, 2021 at 05:47:08PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-08-19 17:39:29 [+0200], To Paul E. McKenney wrote:
> > up with following which I can explain:
> > 
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index 40ef5417d9545..5c8b31b7eff03 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -1432,28 +1432,34 @@ static void rcutorture_one_extend(int *readstate, int newstate,
> >  	/* First, put new protection in place to avoid critical-section gap. */
> >  	if (statesnew & RCUTORTURE_RDR_BH)
> >  		local_bh_disable();
> > +	if (statesnew & RCUTORTURE_RDR_RBH)
> > +		rcu_read_lock_bh();
> >  	if (statesnew & RCUTORTURE_RDR_IRQ)
> >  		local_irq_disable();
> >  	if (statesnew & RCUTORTURE_RDR_PREEMPT)
> >  		preempt_disable();
> > -	if (statesnew & RCUTORTURE_RDR_RBH)
> > -		rcu_read_lock_bh();
> >  	if (statesnew & RCUTORTURE_RDR_SCHED)
> >  		rcu_read_lock_sched();
> >  	if (statesnew & RCUTORTURE_RDR_RCU)
> >  		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
> 
> So the ordering in the enable and disable part regarding BH is
> important. First BH, then preemption or IRQ.
> 
> > -	/* Next, remove old protection, irq first due to bh conflict. */
> > +	/*
> > +	 * Next, remove old protection, in decreasing order of strength
> > +	 * to avoid unlock paths that aren't safe in the stronger
> > +	 * context. Namely: BH can not be enabled with disabled interrupts.
> > +	 * Additionally PREEMPT_RT requires that BH is enabled in preemptible
> > +	 * context.
> > +	 */
> >  	if (statesold & RCUTORTURE_RDR_IRQ)
> >  		local_irq_enable();
> > -	if (statesold & RCUTORTURE_RDR_BH)
> > -		local_bh_enable();
> >  	if (statesold & RCUTORTURE_RDR_PREEMPT)
> >  		preempt_enable();
> > -	if (statesold & RCUTORTURE_RDR_RBH)
> > -		rcu_read_unlock_bh();
> >  	if (statesold & RCUTORTURE_RDR_SCHED)
> >  		rcu_read_unlock_sched();
> > +	if (statesold & RCUTORTURE_RDR_BH)
> > +		local_bh_enable();
> > +	if (statesold & RCUTORTURE_RDR_RBH)
> > +		rcu_read_unlock_bh();
> >  	if (statesold & RCUTORTURE_RDR_RCU) {
> >  		bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);
> 
> The same in the unlock part so that BH is unlocked in preemptible
> context.
> Now if you need bh lock/unlock in atomic context (either with disabled
> IRQs or preemption) then I would dig out the atomic-bh part again and
> make !RT only without the preempt_disable() section around about which
> one you did complain.
> 
> > @@ -1496,6 +1502,9 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
> >  	int mask = rcutorture_extend_mask_max();
> >  	unsigned long randmask1 = torture_random(trsp) >> 8;
> >  	unsigned long randmask2 = randmask1 >> 3;
> > +	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
> > +	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
> > +	unsigned long bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> >  
> >  	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
> >  	/* Mostly only one bit (need preemption!), sometimes lots of bits. */
> > @@ -1503,11 +1512,37 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
> >  		mask = mask & randmask2;
> >  	else
> >  		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
> > -	/* Can't enable bh w/irq disabled. */
> > -	if ((mask & RCUTORTURE_RDR_IRQ) &&
> > -	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
> > -	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
> > -		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> > +
> > +	/*
> > +	 * Can't enable bh w/irq disabled.
> > +	 */
> > +	if (mask & RCUTORTURE_RDR_IRQ)
> > +		mask |= oldmask & bhs;
> > +
> > +	/*
> > +	 * Ideally these sequences would be detected in debug builds
> > +	 * (regardless of RT), but until then don't stop testing
> > +	 * them on non-RT.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > +		/*
> > +		 * Can't release the outermost rcu lock in an irq disabled
> > +		 * section without preemption also being disabled, if irqs
> > +		 * had ever been enabled during this RCU critical section
> > +		 * (could leak a special flag and delay reporting the qs).
> > +		 */
> > +		if ((oldmask & RCUTORTURE_RDR_RCU) &&
> > +		    (mask & RCUTORTURE_RDR_IRQ) &&
> > +		    !(mask & preempts))
> > +			mask |= RCUTORTURE_RDR_RCU;
> 
> This piece above, I don't understand. I had it running for a while and
> it didn't explode. Let me try TREE01 for 30min without that piece.

This might be historical.  There was a time when interrupts being
disabled across rcu_read_unlock() meant that preemption had to have
been disabled across the entire RCU read-side critical section.

I am not seeing a purpose for it now, but I could easily be missing
something, especially given my tenuous grasp of RT.

Either way, looking forward to the next version!

							Thanx, Paul

> > +		/* Can't modify bh in atomic context */
> > +		if (oldmask & preempts_irq)
> > +			mask &= ~bhs;
> > +		if ((oldmask | mask) & preempts_irq)
> > +			mask |= oldmask & bhs;
> 
> And this is needed because we can't lock/unlock bh while atomic.
> 
> > +	}
> > +
> >  	return mask ?: RCUTORTURE_RDR_RCU;
> >  }
> >  
> 
> Sebastian
Sebastian Andrzej Siewior Aug. 19, 2021, 6:45 p.m. UTC | #6
On 2021-08-19 11:20:35 [-0700], Paul E. McKenney wrote:
> > This piece above, I don't understand. I had it running for a while and
> > it didn't explode. Let me try TREE01 for 30min without that piece.
> 
> This might be historical.  There was a time when interrupts being
> disabled across rcu_read_unlock() meant that preemption had to have
> been disabled across the entire RCU read-side critical section.
> 
> I am not seeing a purpose for it now, but I could easily be missing
> something, especially given my tenuous grasp of RT.

Okay. So the 30min test didn't trigger any warnings…

> Either way, looking forward to the next version!

Good. So if you liked what you have seen, I'm going to resubmit the
above as a proper patch.
Thanks!

> 							Thanx, Paul

Sebastian
Scott Wood Aug. 20, 2021, 3:23 a.m. UTC | #7
On Tue, 2021-08-17 at 16:40 +0200, Sebastian Andrzej Siewior wrote:
> [bigeasy: remove 'preempt_disable(); local_bh_disable(); preempt_enable();
>  local_bh_enable();' from the examples because this works on RT now. ]

Does it actually work?  If preemption is disabled during local_bh_disable,
softirq_ctrl.lock won't be taken.  If you then get preempted between the
preempt_enable() and the local_bh_enable(), and another task tries to do
local_bh_disable(), won't it successfully get softirq_ctrl.lock, add to
softirq_ctrl.cnt, and proceed right into the critical section?

Or am I missing something?
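
A sketch of the interleaving in question (softirq_ctrl is internal to
kernel/softirq.c on RT; the function name is hypothetical, shown only
for exposition):

    static void task_a(void)
    {
        preempt_disable();
        local_bh_disable();     /* atomic: softirq_ctrl.lock not taken */
        preempt_enable();
        /* <- if preempted here, another task's local_bh_disable() can
         *    take softirq_ctrl.lock, add to softirq_ctrl.cnt, and enter
         *    the bh-protected section while this task is still in it */
        local_bh_enable();
    }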

-Scott
Scott Wood Aug. 20, 2021, 4:11 a.m. UTC | #8
On Thu, 2021-08-19 at 11:20 -0700, Paul E. McKenney wrote:
> On Thu, Aug 19, 2021 at 05:47:08PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2021-08-19 17:39:29 [+0200], To Paul E. McKenney wrote:
> > > +	/*
> > > +	 * Ideally these sequences would be detected in debug builds
> > > +	 * (regardless of RT), but until then don't stop testing
> > > +	 * them on non-RT.
> > > +	 */
> > > +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > > +		/*
> > > +		 * Can't release the outermost rcu lock in an irq disabled
> > > +		 * section without preemption also being disabled, if irqs
> > > +		 * had ever been enabled during this RCU critical section
> > > +		 * (could leak a special flag and delay reporting the qs).
> > > +		 */
> > > +		if ((oldmask & RCUTORTURE_RDR_RCU) &&
> > > +		    (mask & RCUTORTURE_RDR_IRQ) &&
> > > +		    !(mask & preempts))
> > > +			mask |= RCUTORTURE_RDR_RCU;
> > 
> > This piece above, I don't understand. I had it running for a while and
> > it didn't explode. Let me try TREE01 for 30min without that piece.
> 
> This might be historical.  There was a time when interrupts being
> disabled across rcu_read_unlock() meant that preemption had to have
> been disabled across the entire RCU read-side critical section.
> 
> I am not seeing a purpose for it now, but I could easily be missing
> something, especially given my tenuous grasp of RT.

Yeah, I think this was to deal with not having the irq work stuff in RT
at the time.

-Scott
Sebastian Andrzej Siewior Aug. 20, 2021, 6:54 a.m. UTC | #9
On 2021-08-19 22:23:37 [-0500], Scott Wood wrote:
> On Tue, 2021-08-17 at 16:40 +0200, Sebastian Andrzej Siewior wrote:
> > [bigeasy: remove 'preempt_disable(); local_bh_disable(); preempt_enable();
> >  local_bh_enable();' from the examples because this works on RT now. ]
> 
> Does it actually work?  If preemption is disabled during local_bh_disable,
> softirq_ctrl.lock won't be taken.  If you then get preempted between the
> preempt_enable() and the local_bh_enable(), and another task tries to do
> local_bh_disable(), won't it successfully get softirq_ctrl.lock, add to
> softirq_ctrl.cnt, and proceed right into the critical section?
> 
> Or am I missing something?

No, I mixed it up with migrate_disable/enable. I corrected it while
redoing it yesterday.

> -Scott

Sebastian
Sebastian Andrzej Siewior Aug. 20, 2021, 7:11 a.m. UTC | #10
On 2021-08-19 23:11:12 [-0500], Scott Wood wrote:
> On Thu, 2021-08-19 at 11:20 -0700, Paul E. McKenney wrote:
> > On Thu, Aug 19, 2021 at 05:47:08PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2021-08-19 17:39:29 [+0200], To Paul E. McKenney wrote:
> > > > +	/*
> > > > +	 * Ideally these sequences would be detected in debug builds
> > > > +	 * (regardless of RT), but until then don't stop testing
> > > > +	 * them on non-RT.
> > > > +	 */
> > > > +	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > > > +		/*
> > > > +		 * Can't release the outermost rcu lock in an irq disabled
> > > > +		 * section without preemption also being disabled, if irqs
> > > > +		 * had ever been enabled during this RCU critical section
> > > > +		 * (could leak a special flag and delay reporting the qs).
> > > > +		 */
> > > > +		if ((oldmask & RCUTORTURE_RDR_RCU) &&
> > > > +		    (mask & RCUTORTURE_RDR_IRQ) &&
> > > > +		    !(mask & preempts))
> > > > +			mask |= RCUTORTURE_RDR_RCU;
> > > 
> > > This piece above, I don't understand. I had it running for a while and
> > > it didn't explode. Let me try TREE01 for 30min without that piece.
> > 
> > This might be historical.  There was a time when interrupts being
> > disabled across rcu_read_unlock() meant that preemption had to have
> > been disabled across the entire RCU read-side critical section.
> > 
> > I am not seeing a purpose for it now, but I could easily be missing
> > something, especially given my tenuous grasp of RT.
> 
> Yeah, I think this was to deal with not having the irq work stuff in RT
> at the time.

Good. Thank you for the confirmation.
I ran (without the hunk above) 2x 6h of TREE01 and 4x 6h of TREE06 and
it looked good.

> -Scott

Sebastian

Patch

--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -61,10 +61,13 @@  MODULE_AUTHOR("Paul E. McKenney <paulmck
 #define RCUTORTURE_RDR_RBH	 0x08	/*  ... rcu_read_lock_bh(). */
 #define RCUTORTURE_RDR_SCHED	 0x10	/*  ... rcu_read_lock_sched(). */
 #define RCUTORTURE_RDR_RCU	 0x20	/*  ... entering another RCU reader. */
-#define RCUTORTURE_RDR_NBITS	 6	/* Number of bits defined above. */
+#define RCUTORTURE_RDR_ATOM_BH	 0x40	/*  ... disabling bh while atomic */
+#define RCUTORTURE_RDR_ATOM_RBH	 0x80	/*  ... RBH while atomic */
+#define RCUTORTURE_RDR_NBITS	 8	/* Number of bits defined above. */
 #define RCUTORTURE_MAX_EXTEND	 \
 	(RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \
-	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED)
+	 RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED | \
+	 RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)
 #define RCUTORTURE_RDR_MAX_LOOPS 0x7	/* Maximum reader extensions. */
 					/* Must be power of two minus one. */
 #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3)
@@ -1429,31 +1432,53 @@  static void rcutorture_one_extend(int *r
 	WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1);
 	rtrsp->rt_readstate = newstate;
 
-	/* First, put new protection in place to avoid critical-section gap. */
+	/*
+	 * First, put new protection in place to avoid critical-section gap.
+	 * Disable preemption around the ATOM disables to ensure that
+	 * in_atomic() is true.
+	 */
 	if (statesnew & RCUTORTURE_RDR_BH)
 		local_bh_disable();
+	if (statesnew & RCUTORTURE_RDR_RBH)
+		rcu_read_lock_bh();
 	if (statesnew & RCUTORTURE_RDR_IRQ)
 		local_irq_disable();
 	if (statesnew & RCUTORTURE_RDR_PREEMPT)
 		preempt_disable();
-	if (statesnew & RCUTORTURE_RDR_RBH)
-		rcu_read_lock_bh();
 	if (statesnew & RCUTORTURE_RDR_SCHED)
 		rcu_read_lock_sched();
+	preempt_disable();
+	if (statesnew & RCUTORTURE_RDR_ATOM_BH)
+		local_bh_disable();
+	if (statesnew & RCUTORTURE_RDR_ATOM_RBH)
+		rcu_read_lock_bh();
+	preempt_enable();
 	if (statesnew & RCUTORTURE_RDR_RCU)
 		idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
 
-	/* Next, remove old protection, irq first due to bh conflict. */
+	/*
+	 * Next, remove old protection, in decreasing order of strength
+	 * to avoid unlock paths that aren't safe in the stronger
+	 * context.  Disable preemption around the ATOM enables in
+	 * case the context was only atomic due to IRQ disabling.
+	 */
+	preempt_disable();
 	if (statesold & RCUTORTURE_RDR_IRQ)
 		local_irq_enable();
-	if (statesold & RCUTORTURE_RDR_BH)
+	if (statesold & RCUTORTURE_RDR_ATOM_BH)
 		local_bh_enable();
+	if (statesold & RCUTORTURE_RDR_ATOM_RBH)
+		rcu_read_unlock_bh();
+	preempt_enable();
 	if (statesold & RCUTORTURE_RDR_PREEMPT)
 		preempt_enable();
-	if (statesold & RCUTORTURE_RDR_RBH)
-		rcu_read_unlock_bh();
 	if (statesold & RCUTORTURE_RDR_SCHED)
 		rcu_read_unlock_sched();
+	if (statesold & RCUTORTURE_RDR_BH)
+		local_bh_enable();
+	if (statesold & RCUTORTURE_RDR_RBH)
+		rcu_read_unlock_bh();
+
 	if (statesold & RCUTORTURE_RDR_RCU) {
 		bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);
 
@@ -1496,6 +1521,12 @@  rcutorture_extend_mask(int oldmask, stru
 	int mask = rcutorture_extend_mask_max();
 	unsigned long randmask1 = torture_random(trsp) >> 8;
 	unsigned long randmask2 = randmask1 >> 3;
+	unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
+	unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
+	unsigned long nonatomic_bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
+	unsigned long atomic_bhs = RCUTORTURE_RDR_ATOM_BH |
+				   RCUTORTURE_RDR_ATOM_RBH;
+	unsigned long tmp;
 
 	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
 	/* Mostly only one bit (need preemption!), sometimes lots of bits. */
@@ -1503,11 +1534,46 @@  rcutorture_extend_mask(int oldmask, stru
 		mask = mask & randmask2;
 	else
 		mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
-	/* Can't enable bh w/irq disabled. */
-	if ((mask & RCUTORTURE_RDR_IRQ) &&
-	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
-	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
-		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
+
+	/*
+	 * Can't enable bh w/irq disabled.
+	 */
+	tmp = atomic_bhs | nonatomic_bhs;
+	if (mask & RCUTORTURE_RDR_IRQ)
+		mask |= oldmask & tmp;
+
+	/*
+	 * Ideally these sequences would be detected in debug builds
+	 * (regardless of RT), but until then don't stop testing
+	 * them on non-RT.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		/*
+		 * Can't disable bh in atomic context if bh was already
+		 * disabled by another task on the same CPU. Instead of
+		 * attempting to track this, just avoid disabling bh in atomic
+		 * context.
+		 */
+		mask &= ~atomic_bhs;
+		/*
+		 * Can't release the outermost rcu lock in an irq disabled
+		 * section without preemption also being disabled, if irqs
+		 * had ever been enabled during this RCU critical section
+		 * (could leak a special flag and delay reporting the qs).
+		 */
+		if ((oldmask & RCUTORTURE_RDR_RCU) &&
+		    (mask & RCUTORTURE_RDR_IRQ) &&
+		    !(mask & preempts))
+			mask |= RCUTORTURE_RDR_RCU;
+
+		/* Can't modify non-atomic bh in atomic context */
+		tmp = nonatomic_bhs;
+		if (oldmask & preempts_irq)
+			mask &= ~tmp;
+		if ((oldmask | mask) & preempts_irq)
+			mask |= oldmask & tmp;
+	}
+
 	return mask ?: RCUTORTURE_RDR_RCU;
 }