Message ID | 20220722005213.3511188-1-qiang1.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | rcu: Only check tasks blocked on leaf rcu_nodes for PREEMPT_RCU | expand |
On Fri, Jul 22, 2022 at 08:52:13AM +0800, Zqiang wrote: > In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read > critical section is preempted, the current task are only queued > leaf rcu_node blkd list, for single-node rcu tree, the root node > is also leaf node. > > This commit add rcu_is_leaf_node() to filter out checks for non-leaf > rcu_node. > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> First, thank you for digging into this! > --- > kernel/rcu/tree_plugin.h | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > index b2219577fbe2..a9df11ec65af 100644 > --- a/kernel/rcu/tree_plugin.h > +++ b/kernel/rcu/tree_plugin.h > @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) > > RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n"); > raw_lockdep_assert_held_rcu_node(rnp); > + if (!rcu_is_leaf_node(rnp)) > + goto end; > if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) > dump_blkd_tasks(rnp, 10); > if (rcu_preempt_has_tasks(rnp) && So you are adding a simple check for all rcu_node structures that removes two simple checks for non-leaf rcu_node structures. Assuming that the costs of all these checks is roughly the same (but please feel free to actually measure them on real hardware), what what fraction of the rcu_node structures must be non-leaf for this change to be a win? Then for what configurations using default fanouts is this fraction exceeded? The default fanout is 16 from non-leaf to leaf and 64 from non-leaf to non-leaf (32 for non-leaf to non-leaf on 32-bit systems). Hey, you wanted some algebra practice anyway. ;-) > @@ -703,6 +705,7 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) > trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"), > rnp->gp_seq, t->pid); > } > +end: > WARN_ON_ONCE(rnp->qsmask); > } > > @@ -1178,7 +1181,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags) > */ > static void rcu_preempt_boost_start_gp(struct rcu_node *rnp) > { > - rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES; > + if (rcu_is_leaf_node(rnp)) > + rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES; And here you are adding a check on all nodes to eliminate an addition and a store on non-leaf nodes. Same questions as above, with the same invitation to actually measure the costs of these operations. Thanx, Paul > } > > /* > -- > 2.25.1 >
On Fri, Jul 22, 2022 at 08:52:13AM +0800, Zqiang wrote: > In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read > critical section is preempted, the current task are only queued > leaf rcu_node blkd list, for single-node rcu tree, the root node > is also leaf node. > > This commit add rcu_is_leaf_node() to filter out checks for non-leaf > rcu_node. > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > >First, thank you for digging into this! > > --- > kernel/rcu/tree_plugin.h | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > index b2219577fbe2..a9df11ec65af 100644 > --- a/kernel/rcu/tree_plugin.h > +++ b/kernel/rcu/tree_plugin.h > @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) > > RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n"); > raw_lockdep_assert_held_rcu_node(rnp); > + if (!rcu_is_leaf_node(rnp)) > + goto end; > if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) > dump_blkd_tasks(rnp, 10); > if (rcu_preempt_has_tasks(rnp) && >So you are adding a simple check for all rcu_node structures >that removes two simple checks for non-leaf rcu_node structures. > >Assuming that the costs of all these checks is roughly the same (but >please feel free to actually measure them on real hardware), what what >fraction of the rcu_node structures must be non-leaf for this change to >be a win? Then for what configurations using default fanouts is this >fraction exceeded? > >The default fanout is 16 from non-leaf to leaf and 64 from non-leaf >to non-leaf (32 for non-leaf to non-leaf on 32-bit systems). Thanks for reminding, I only considered the logic in the code, not the actual real usage scenarios. I will do some tests on the actual hardware.
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index b2219577fbe2..a9df11ec65af 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n"); raw_lockdep_assert_held_rcu_node(rnp); + if (!rcu_is_leaf_node(rnp)) + goto end; if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) dump_blkd_tasks(rnp, 10); if (rcu_preempt_has_tasks(rnp) && @@ -703,6 +705,7 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"), rnp->gp_seq, t->pid); } +end: WARN_ON_ONCE(rnp->qsmask); } @@ -1178,7 +1181,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags) */ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp) { - rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES; + if (rcu_is_leaf_node(rnp)) + rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES; } /*
In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read critical section is preempted, the current task are only queued leaf rcu_node blkd list, for single-node rcu tree, the root node is also leaf node. This commit add rcu_is_leaf_node() to filter out checks for non-leaf rcu_node. Signed-off-by: Zqiang <qiang1.zhang@intel.com> --- kernel/rcu/tree_plugin.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)