diff mbox series

rcu: Only check tasks blocked on leaf rcu_nodes for PREEMPT_RCU

Message ID 20220722005213.3511188-1-qiang1.zhang@intel.com (mailing list archive)
State New, archived
Headers show
Series rcu: Only check tasks blocked on leaf rcu_nodes for PREEMPT_RCU | expand

Commit Message

Zqiang July 22, 2022, 12:52 a.m. UTC
In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read
critical section is preempted, the current task are only queued
leaf rcu_node blkd list, for single-node rcu tree, the root node
is also leaf node.

This commit add rcu_is_leaf_node() to filter out checks for non-leaf
rcu_node.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/tree_plugin.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Paul E. McKenney July 22, 2022, 1:16 a.m. UTC | #1
On Fri, Jul 22, 2022 at 08:52:13AM +0800, Zqiang wrote:
> In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read
> critical section is preempted, the current task are only queued
> leaf rcu_node blkd list, for single-node rcu tree, the root node
> is also leaf node.
> 
> This commit add rcu_is_leaf_node() to filter out checks for non-leaf
> rcu_node.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

First, thank you for digging into this!

> ---
>  kernel/rcu/tree_plugin.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index b2219577fbe2..a9df11ec65af 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>  
>  	RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
>  	raw_lockdep_assert_held_rcu_node(rnp);
> +	if (!rcu_is_leaf_node(rnp))
> +		goto end;
>  	if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
>  		dump_blkd_tasks(rnp, 10);
>  	if (rcu_preempt_has_tasks(rnp) &&

So you are adding a simple check for all rcu_node structures
that removes two simple checks for non-leaf rcu_node structures.

Assuming that the costs of all these checks is roughly the same (but
please feel free to actually measure them on real hardware), what what
fraction of the rcu_node structures must be non-leaf for this change to
be a win?  Then for what configurations using default fanouts is this
fraction exceeded?

The default fanout is 16 from non-leaf to leaf and 64 from non-leaf
to non-leaf (32 for non-leaf to non-leaf on 32-bit systems).

Hey, you wanted some algebra practice anyway.  ;-)

> @@ -703,6 +705,7 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>  		trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"),
>  						rnp->gp_seq, t->pid);
>  	}
> +end:
>  	WARN_ON_ONCE(rnp->qsmask);
>  }
>  
> @@ -1178,7 +1181,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
>   */
>  static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
>  {
> -	rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES;
> +	if (rcu_is_leaf_node(rnp))
> +		rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES;

And here you are adding a check on all nodes to eliminate an addition
and a store on non-leaf nodes.  Same questions as above, with the same
invitation to actually measure the costs of these operations.

						Thanx, Paul

>  }
>  
>  /*
> -- 
> 2.25.1
>
Zqiang July 22, 2022, 1:45 a.m. UTC | #2
On Fri, Jul 22, 2022 at 08:52:13AM +0800, Zqiang wrote:
> In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read
> critical section is preempted, the current task are only queued
> leaf rcu_node blkd list, for single-node rcu tree, the root node
> is also leaf node.
> 
> This commit add rcu_is_leaf_node() to filter out checks for non-leaf
> rcu_node.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>
>First, thank you for digging into this!
>
> ---
>  kernel/rcu/tree_plugin.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index b2219577fbe2..a9df11ec65af 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>  
>  	RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
>  	raw_lockdep_assert_held_rcu_node(rnp);
> +	if (!rcu_is_leaf_node(rnp))
> +		goto end;
>  	if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
>  		dump_blkd_tasks(rnp, 10);
>  	if (rcu_preempt_has_tasks(rnp) &&

>So you are adding a simple check for all rcu_node structures
>that removes two simple checks for non-leaf rcu_node structures.
>
>Assuming that the costs of all these checks is roughly the same (but
>please feel free to actually measure them on real hardware), what what
>fraction of the rcu_node structures must be non-leaf for this change to
>be a win?  Then for what configurations using default fanouts is this
>fraction exceeded?
>
>The default fanout is 16 from non-leaf to leaf and 64 from non-leaf
>to non-leaf (32 for non-leaf to non-leaf on 32-bit systems).

Thanks for reminding, I only considered the logic in the code, 
not the actual real usage scenarios. I will do some tests on the
actual hardware. 
diff mbox series

Patch

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index b2219577fbe2..a9df11ec65af 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -693,6 +693,8 @@  static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
 
 	RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
 	raw_lockdep_assert_held_rcu_node(rnp);
+	if (!rcu_is_leaf_node(rnp))
+		goto end;
 	if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
 		dump_blkd_tasks(rnp, 10);
 	if (rcu_preempt_has_tasks(rnp) &&
@@ -703,6 +705,7 @@  static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
 		trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"),
 						rnp->gp_seq, t->pid);
 	}
+end:
 	WARN_ON_ONCE(rnp->qsmask);
 }
 
@@ -1178,7 +1181,8 @@  static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
  */
 static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
 {
-	rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES;
+	if (rcu_is_leaf_node(rnp))
+		rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES;
 }
 
 /*