Message ID | 20230721075716.857-2-thunder.leizhen@huaweicloud.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | rcu: Simplify check_cpu_stall() | expand |
On Fri, Jul 21, 2023 at 03:57:15PM +0800, thunder.leizhen@huaweicloud.com wrote: > From: Zhen Lei <thunder.leizhen@huawei.com> > > j = jiffies; > js = READ_ONCE(rcu_state.jiffies_stall); (1) > if (ULONG_CMP_LT(j, js)) (2) > return; > > if (cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) (3) > didstall = true; > > if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) { (4) > jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; > WRITE_ONCE(rcu_state.jiffies_stall, jn); > } > > For ease of description, the pseudo code is extracted as above. First, > assume that only one CPU is operating, the condition 'didstall' is true > means that (3) succeeds. That is, the value of rcu_state.jiffies_stall > must be 'jn'. > > Then, assuming that another CPU is also operating at the same time, there > are two cases: > 1. That CPU sees the updated result at (1), it returns at (2). > 2. That CPU does not see the updated result at (1), it fails at (3) and > cmpxchg returns jn. So that, didstall cannot be true. The history behind this one is that there are races in which the stall can end in the midst of check_cpu_stall(). For example, when the activity of producing the stall warning breaks things loose. And yes, long ago, I figured that if things had been static for so many seconds, they were unlikely to change, and thus omitted any and all synchronization. The Linux kernel taught me better. ;-) Thanx, Paul > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> > --- > kernel/rcu/tree_stall.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h > index cc884cd49e026a3..371713f3f7d15d9 100644 > --- a/kernel/rcu/tree_stall.h > +++ b/kernel/rcu/tree_stall.h > @@ -794,7 +794,7 @@ static void check_cpu_stall(struct rcu_data *rdp) > rcu_ftrace_dump(DUMP_ALL); > didstall = true; > } > - if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) { > + if (didstall) { > jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; > WRITE_ONCE(rcu_state.jiffies_stall, jn); > } > -- > 2.25.1 >
On 2023/7/22 6:37, Paul E. McKenney wrote: > On Fri, Jul 21, 2023 at 03:57:15PM +0800, thunder.leizhen@huaweicloud.com wrote: >> From: Zhen Lei <thunder.leizhen@huawei.com> >> >> j = jiffies; >> js = READ_ONCE(rcu_state.jiffies_stall); (1) >> if (ULONG_CMP_LT(j, js)) (2) >> return; >> >> if (cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) (3) >> didstall = true; >> >> if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) { (4) >> jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; >> WRITE_ONCE(rcu_state.jiffies_stall, jn); >> } >> >> For ease of description, the pseudo code is extracted as above. First, >> assume that only one CPU is operating, the condition 'didstall' is true >> means that (3) succeeds. That is, the value of rcu_state.jiffies_stall >> must be 'jn'. >> >> Then, assuming that another CPU is also operating at the same time, there >> are two cases: >> 1. That CPU sees the updated result at (1), it returns at (2). >> 2. That CPU does not see the updated result at (1), it fails at (3) and >> cmpxchg returns jn. So that, didstall cannot be true. > > The history behind this one is that there are races in which the stall > can end in the midst of check_cpu_stall(). For example, when the activity > of producing the stall warning breaks things loose. > > And yes, long ago, I figured that if things had been static for so > many seconds, they were unlikely to change, and thus omitted any and > all synchronization. The Linux kernel taught me better. ;-) Okay, got it, thank you > > Thanx, Paul > >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >> --- >> kernel/rcu/tree_stall.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h >> index cc884cd49e026a3..371713f3f7d15d9 100644 >> --- a/kernel/rcu/tree_stall.h >> +++ b/kernel/rcu/tree_stall.h >> @@ -794,7 +794,7 @@ static void check_cpu_stall(struct rcu_data *rdp) >> rcu_ftrace_dump(DUMP_ALL); >> didstall = true; >> } >> - if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) { >> + if (didstall) { >> jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; >> WRITE_ONCE(rcu_state.jiffies_stall, jn); >> } >> -- >> 2.25.1 >> > . >
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h index cc884cd49e026a3..371713f3f7d15d9 100644 --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -794,7 +794,7 @@ static void check_cpu_stall(struct rcu_data *rdp) rcu_ftrace_dump(DUMP_ALL); didstall = true; } - if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) { + if (didstall) { jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; WRITE_ONCE(rcu_state.jiffies_stall, jn); }