Message ID | 20230320032422.4010801-1-qiang1.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] rcutorture: Convert schedule_timeout_uninterruptible() to mdelay() in rcu_torture_stall() |
Hi Qiang,

> From: Zqiang <qiang1.zhang@intel.com>
> Sent: Monday, March 20, 2023 11:24 AM
> To: paulmck@kernel.org; frederic@kernel.org; joel@joelfernandes.org
> Cc: rcu@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH v2] rcutorture: Convert schedule_timeout_uninterruptible()
> to mdelay() in rcu_torture_stall()
>
> For kernels built with enable PREEMPT_NONE and

s/enable/enabling/

> CONFIG_DEBUG_ATOMIC_SLEEP, running the RCU stall tests.

s/running/run

>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> The above calltrace occurs in the local_irq_disable/enable() critical section
> call schedule_timeout(), and invoke schedule_timeout() also implies a
> quiescent state, of course it also fails to trigger RCU stall, this commit
> therefore use mdelay() instead of schedule_timeout() to trigger RCU stall.

Tweak the commit description above to fix some grammar errors:

  The above call trace occurred in the local_irq_disable/enable() critical
  section when calling schedule_timeout() from rcu_torture_stall(). Invoking
  schedule_timeout() also implies a quiescent state, of course, it also fails
  to trigger RCU stall. This commit, therefore, uses mdelay() instead of
  schedule_timeout() to trigger the RCU stall.

> Suggested-by: Joel Fernandes <joel@joelfernandes.org>
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

I didn't reproduce the call trace after applying your patch.
So, with the above minor fixes, then

Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>

Thanks
-Qiuxu

> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index d06c2da04c34..a08a72bef5f1 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
>  #ifdef CONFIG_PREEMPTION
>  				preempt_schedule();
>  #else
> -				schedule_timeout_uninterruptible(HZ);
> +				mdelay(jiffies_to_msecs(HZ));
>  #endif
>  			} else if (stall_no_softlockup) {
>  				touch_softlockup_watchdog();
> --
> 2.25.1

On Mon, Mar 20, 2023 at 11:24:22AM +0800, Zqiang wrote:
> For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> running the RCU stall tests.
>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> The above calltrace occurs in the local_irq_disable/enable() critical
> section call schedule_timeout(), and invoke schedule_timeout() also
> implies a quiescent state, of course it also fails to trigger RCU stall,
> this commit therefore use mdelay() instead of schedule_timeout() to
> trigger RCU stall.
>
> Suggested-by: Joel Fernandes <joel@joelfernandes.org>
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index d06c2da04c34..a08a72bef5f1 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)

Right here there is:

			if (stall_cpu_block) {

In other words, the rcutorture.stall_cpu_block module parameter says to
block, even if it is a bad thing to do. The point of this is to verify
the error messages that are supposed to be printed on the console when
this happens.

>  #ifdef CONFIG_PREEMPTION
>  				preempt_schedule();
>  #else
> -				schedule_timeout_uninterruptible(HZ);
> +				mdelay(jiffies_to_msecs(HZ));

So this really needs to stay schedule_timeout_uninterruptible(HZ).

So should there be a change to kernel-parameters.txt to make it
more clear that this is intended behavior?

							Thanx, Paul

>  #endif
>  			} else if (stall_no_softlockup) {
>  				touch_softlockup_watchdog();
> --
> 2.25.1
>

> For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> running the RCU stall tests.
>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> The above calltrace occurs in the local_irq_disable/enable() critical
> section call schedule_timeout(), and invoke schedule_timeout() also
> implies a quiescent state, of course it also fails to trigger RCU stall,
> this commit therefore use mdelay() instead of schedule_timeout() to
> trigger RCU stall.
>
> Suggested-by: Joel Fernandes <joel@joelfernandes.org>
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
> kernel/rcu/rcutorture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index d06c2da04c34..a08a72bef5f1 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
>
>Right here there is:
>
>			if (stall_cpu_block) {
>
>In other words, the rcutorture.stall_cpu_block module parameter says to
>block, even if it is a bad thing to do. The point of this is to verify
>the error messages that are supposed to be printed on the console when
>this happens.
>
>  #ifdef CONFIG_PREEMPTION
>  				preempt_schedule();
>  #else
> -				schedule_timeout_uninterruptible(HZ);
> +				mdelay(jiffies_to_msecs(HZ));
>
>So this really needs to stay schedule_timeout_uninterruptible(HZ).

But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent state,
this will not cause an RCU stall to occur, and still in the RCU read critical section(PREEMPT_COUNT=y).

It didn't happen RCU stall when I tested with the following parameters for
rcutorture.stall_cpu=30
rcutorture.stall_no_softlockup=1
rcutorture.stall_cpu_irqsoff=1
rcutorture.stall_cpu_block=1

Thanks
Zqiang

>
>So should there be a change to kernel-parameters.txt to make it
>more clear that this is intended behavior?
>
>							Thanx, Paul
>
>  #endif
>  			} else if (stall_no_softlockup) {
>  				touch_softlockup_watchdog();
> --
> 2.25.1
>

On Mon, Mar 20, 2023 at 11:05:17PM +0000, Zhang, Qiang1 wrote:
> > For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> > running the RCU stall tests.
> >
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> >
> > [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [ 10.841073] rcu_torture_stall start on CPU 3.
> > [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [ 10.841108] Call Trace:
> > [ 10.841110] <TASK>
> > [ 10.841112] dump_stack_lvl+0x64/0xb0
> > [ 10.841118] dump_stack+0x10/0x20
> > [ 10.841121] __schedule_bug+0x8b/0xb0
> > [ 10.841126] __schedule+0x2172/0x2940
> > [ 10.841157] schedule+0x9b/0x150
> > [ 10.841160] schedule_timeout+0x2e8/0x4f0
> > [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> > [ 10.841195] rcu_torture_stall+0x2e8/0x300
> > [ 10.841199] kthread+0x175/0x1a0
> > [ 10.841206] ret_from_fork+0x2c/0x50
> >
> > The above calltrace occurs in the local_irq_disable/enable() critical
> > section call schedule_timeout(), and invoke schedule_timeout() also
> > implies a quiescent state, of course it also fails to trigger RCU stall,
> > this commit therefore use mdelay() instead of schedule_timeout() to
> > trigger RCU stall.
> >
> > Suggested-by: Joel Fernandes <joel@joelfernandes.org>
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > ---
> > kernel/rcu/rcutorture.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> >			if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do. The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> >  #ifdef CONFIG_PREEMPTION
> >  				preempt_schedule();
> >  #else
> > -				schedule_timeout_uninterruptible(HZ);
> > +				mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
>
> But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> this will not cause an RCU stall to occur, and still in the RCU read critical section(PREEMPT_COUNT=y).
>
> It didn't happen RCU stall when I tested with the following parameters for
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1

Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
kernel, you should not use rcutorture.stall_cpu_block=1.

In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces
the grace period to be stalled on a task rather than a CPU, exercising
a different part of the RCU CPU stall warning code.

In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
forces the CPU to go through a quiescent state, as you say. It can
also cause lockdep and scheduling-while-atomic complaints, depending on
exactly what type of RCU reader is in effect.

So these are test-the-diagnostics parameters. The mdelay() instead
makes rcutorture.stall_cpu_block=1 do the same thing as does
rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?

							Thanx, Paul

> Thanks
> Zqiang
>
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?
> >
> >							Thanx, Paul
> >
> >  #endif
> >  			} else if (stall_no_softlockup) {
> >  				touch_softlockup_watchdog();
> > --
> > 2.25.1
> >

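The three cases Paul describes above can be written down as a small userspace C model. This is only an illustrative sketch, not kernel code: the enum names, `pick_step()`, `expected_outcome()`, and the `use_mdelay_patch` flag are all hypothetical, and the outcomes simply encode the behavior stated in the message (plus the usual busy-loop stall when stall_cpu_block=0).

```c
#include <assert.h>
#include <stdbool.h>

/* What the stalling kthread does on each loop pass (hypothetical names). */
enum stall_step {
	STEP_SPIN,              /* busy-wait without yielding the CPU */
	STEP_PREEMPT_SCHEDULE,  /* models preempt_schedule() */
	STEP_SLEEP,             /* models schedule_timeout_uninterruptible(HZ) */
};

/* What the grace period ends up waiting on, per the discussion above. */
enum stall_outcome {
	STALL_ON_CPU,   /* RCU CPU stall warning attributed to a CPU */
	STALL_ON_TASK,  /* grace period stalled on a blocked task */
	NO_STALL,       /* CPU passes through a quiescent state */
};

/*
 * Model of the branch in rcu_torture_stall().  use_mdelay_patch models
 * the proposed change, in which the CONFIG_PREEMPTION=n sleep becomes a
 * busy-wait (mdelay() does not sleep).
 */
static enum stall_step pick_step(bool config_preemption, bool stall_cpu_block,
				 bool use_mdelay_patch)
{
	if (!stall_cpu_block)
		return STEP_SPIN;
	if (config_preemption)
		return STEP_PREEMPT_SCHEDULE;
	return use_mdelay_patch ? STEP_SPIN : STEP_SLEEP;
}

static enum stall_outcome expected_outcome(bool config_preemption,
					   bool stall_cpu_block,
					   bool use_mdelay_patch)
{
	switch (pick_step(config_preemption, stall_cpu_block, use_mdelay_patch)) {
	case STEP_SPIN:
		return STALL_ON_CPU;
	case STEP_PREEMPT_SCHEDULE:
		return STALL_ON_TASK;
	case STEP_SLEEP:
		return NO_STALL;
	}
	assert(0 && "unreachable");
	return NO_STALL;
}
```

In this model, `expected_outcome(false, true, true)` equals `expected_outcome(false, false, false)`, which is the point of Paul's closing question: with the mdelay() change, stall_cpu_block=1 would no longer be distinguishable from stall_cpu_block=0 on CONFIG_PREEMPTION=n kernels.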
> > For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> > running the RCU stall tests.
> >
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> >
> > [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [ 10.841073] rcu_torture_stall start on CPU 3.
> > [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [ 10.841108] Call Trace:
> > [ 10.841110] <TASK>
> > [ 10.841112] dump_stack_lvl+0x64/0xb0
> > [ 10.841118] dump_stack+0x10/0x20
> > [ 10.841121] __schedule_bug+0x8b/0xb0
> > [ 10.841126] __schedule+0x2172/0x2940
> > [ 10.841157] schedule+0x9b/0x150
> > [ 10.841160] schedule_timeout+0x2e8/0x4f0
> > [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> > [ 10.841195] rcu_torture_stall+0x2e8/0x300
> > [ 10.841199] kthread+0x175/0x1a0
> > [ 10.841206] ret_from_fork+0x2c/0x50
> >
> > The above calltrace occurs in the local_irq_disable/enable() critical
> > section call schedule_timeout(), and invoke schedule_timeout() also
> > implies a quiescent state, of course it also fails to trigger RCU stall,
> > this commit therefore use mdelay() instead of schedule_timeout() to
> > trigger RCU stall.
> >
> > Suggested-by: Joel Fernandes <joel@joelfernandes.org>
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > ---
> > kernel/rcu/rcutorture.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> >			if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do. The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> >  #ifdef CONFIG_PREEMPTION
> >  				preempt_schedule();
> >  #else
> > -				schedule_timeout_uninterruptible(HZ);
> > +				mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
>
> But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> this will not cause an RCU stall to occur, and still in the RCU read critical section(PREEMPT_COUNT=y).
>
> It didn't happen RCU stall when I tested with the following parameters for
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1
>
>Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
>kernel, you should not use rcutorture.stall_cpu_block=1.
>
>In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces
>the grace period to be stalled on a task rather than a CPU, exercising
>a different part of the RCU CPU stall warning code.
>
>In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
>forces the CPU to go through a quiescent state, as you say. It can
>also cause lockdep and scheduling-while-atomic complaints, depending on
>exactly what type of RCU reader is in effect.
>
>So these are test-the-diagnostics parameters. The mdelay() instead
>makes rcutorture.stall_cpu_block=1 do the same thing as does
>rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?

Yes, maybe we can expand the description of stall_cpu_block in kernel-parameters.txt.

>
>							Thanx, Paul
>
> Thanks
> Zqiang
>
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?

Agree

Thanks
Zqiang

> >
> >							Thanx, Paul
> >
> >  #endif
> >  			} else if (stall_no_softlockup) {
> >  				touch_softlockup_watchdog();
> > --
> > 2.25.1
> >

> From: Paul E. McKenney <paulmck@kernel.org>
> [...]
> > But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent
> > state, this will not cause an RCU stall to occur, and still in the RCU read
> > critical section(PREEMPT_COUNT=y).
> >
> > It didn't happen RCU stall when I tested with the following parameters
> > for
> > rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1
> > rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1
>
> Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
> kernel, you should not use rcutorture.stall_cpu_block=1.
>

Verified. If rcutorture.stall_cpu_block=0, it can trigger the expected
RCU CPU stall for either torture_type=srcu or torture_type=rcu.

> In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces the
> grace period to be stalled on a task rather than a CPU, exercising a different
> part of the RCU CPU stall warning code.
>
> In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
> forces the CPU to go through a quiescent state, as you say. It can also cause
> lockdep and scheduling-while-atomic complaints, depending on exactly what
> type of RCU reader is in effect.
>

Verified. If rcutorture.stall_cpu_block=1:
  There were lockdep and scheduling-while-atomic complaints for torture_type=rcu.
  No lockdep or scheduling-while-atomic complaints for torture_type=srcu.

> So these are test-the-diagnostics parameters. The mdelay() instead makes
> rcutorture.stall_cpu_block=1 do the same thing as does
> rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?

Good to know that these are test-the-diagnostics parameters and their
expected behaviors. ;-)

Thanks!
-Qiuxu

> Thanx, Paul

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index d06c2da04c34..a08a72bef5f1 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
 #ifdef CONFIG_PREEMPTION
 				preempt_schedule();
 #else
-				schedule_timeout_uninterruptible(HZ);
+				mdelay(jiffies_to_msecs(HZ));
 #endif
 			} else if (stall_no_softlockup) {
 				touch_softlockup_watchdog();

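As a side note on the replacement expression, HZ jiffies is one second by definition, so `jiffies_to_msecs(HZ)` should come out to roughly 1000 ms whatever the tick rate. The sketch below checks that arithmetic in userspace. It is a simplified stand-in, not the kernel's implementation: `model_jiffies_to_msecs()` and `one_second_delay_ms()` are hypothetical names, the tick rate is passed as a parameter rather than being a build-time constant, and the kernel's own rounding for tick rates that do not divide 1000 is not modeled.

```c
#include <assert.h>

#define MSEC_PER_SEC 1000UL

/*
 * Simplified userspace model of a jiffies-to-milliseconds conversion.
 * hz is the tick rate (ticks per second); j is a tick count.
 */
static unsigned long model_jiffies_to_msecs(unsigned long j, unsigned long hz)
{
	assert(hz > 0);
	return j * MSEC_PER_SEC / hz;
}

/* Duration, in ms, of the delay the patch requests: HZ ticks' worth. */
static unsigned long one_second_delay_ms(unsigned long hz)
{
	return model_jiffies_to_msecs(hz, hz);
}
```

So the patch keeps the one-second duration of the original `schedule_timeout_uninterruptible(HZ)` call; what changes is that the second is spent busy-waiting instead of sleeping.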
For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
running the RCU stall tests.

runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
rcutorture.stall_cpu_block=1" -d

[ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
[ 10.841073] rcu_torture_stall start on CPU 3.
[ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
....
[ 10.841108] Call Trace:
[ 10.841110] <TASK>
[ 10.841112] dump_stack_lvl+0x64/0xb0
[ 10.841118] dump_stack+0x10/0x20
[ 10.841121] __schedule_bug+0x8b/0xb0
[ 10.841126] __schedule+0x2172/0x2940
[ 10.841157] schedule+0x9b/0x150
[ 10.841160] schedule_timeout+0x2e8/0x4f0
[ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
[ 10.841195] rcu_torture_stall+0x2e8/0x300
[ 10.841199] kthread+0x175/0x1a0
[ 10.841206] ret_from_fork+0x2c/0x50

The above calltrace occurs in the local_irq_disable/enable() critical
section call schedule_timeout(), and invoke schedule_timeout() also
implies a quiescent state, of course it also fails to trigger RCU stall,
this commit therefore use mdelay() instead of schedule_timeout() to
trigger RCU stall.

Suggested-by: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)