Message ID | 20240217012745.3446231-3-boqun.feng@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 4f86ec909904869fc8b122e8471dd1798b24df7f |
Series | RCU tasks fixes for v6.9 |
On Fri, Feb 16, 2024 at 05:27:37PM -0800, Boqun Feng wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> Holding a mutex across synchronize_rcu_tasks() and acquiring
> that same mutex in code called from do_exit() after its call to
> exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop()
> results in deadlock.  This is by design, because tasks that are far
> enough into do_exit() are no longer present on the tasks list, making
> it a bit difficult for RCU Tasks to find them, let alone wait on them
> to do a voluntary context switch.  However, such deadlocks are becoming
> more frequent.  In addition, lockdep currently does not detect such
> deadlocks and they can be difficult to reproduce.
>
> In addition, if a task voluntarily context switches during that time
> (for example, if it blocks acquiring a mutex), then this task is in an
> RCU Tasks quiescent state.  And with some adjustments, RCU Tasks could
> just as well take advantage of that fact.
>
> This commit therefore adds the data structures that will be needed
> to rely on these quiescent states and to eliminate these deadlocks.
>
> Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/
>
> Reported-by: Chen Zhongjin <chenzhongjin@huawei.com>
> Reported-by: Yang Jihong <yangjihong1@huawei.com>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Tested-by: Yang Jihong <yangjihong1@huawei.com>
> Tested-by: Chen Zhongjin <chenzhongjin@huawei.com>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

On Thu, Feb 22, 2024 at 05:54:48PM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 16, 2024 at 05:27:37PM -0800, Boqun Feng wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > Holding a mutex across synchronize_rcu_tasks() and acquiring
> > that same mutex in code called from do_exit() after its call to
> > exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop()
> > results in deadlock.  This is by design, because tasks that are far
> > enough into do_exit() are no longer present on the tasks list, making
> > it a bit difficult for RCU Tasks to find them, let alone wait on them
> > to do a voluntary context switch.  However, such deadlocks are becoming
> > more frequent.  In addition, lockdep currently does not detect such
> > deadlocks and they can be difficult to reproduce.
> >
> > In addition, if a task voluntarily context switches during that time
> > (for example, if it blocks acquiring a mutex), then this task is in an
> > RCU Tasks quiescent state.  And with some adjustments, RCU Tasks could
> > just as well take advantage of that fact.
> >
> > This commit therefore adds the data structures that will be needed
> > to rely on these quiescent states and to eliminate these deadlocks.
> >
> > Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/
> >
> > Reported-by: Chen Zhongjin <chenzhongjin@huawei.com>
> > Reported-by: Yang Jihong <yangjihong1@huawei.com>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Tested-by: Yang Jihong <yangjihong1@huawei.com>
> > Tested-by: Chen Zhongjin <chenzhongjin@huawei.com>
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thank you, I have recorded your three review tags.

							Thanx, Paul

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ffe8f618ab86..5eeebed2dd9b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -858,6 +858,8 @@ struct task_struct {
 	u8				rcu_tasks_idx;
 	int				rcu_tasks_idle_cpu;
 	struct list_head		rcu_tasks_holdout_list;
+	int				rcu_tasks_exit_cpu;
+	struct list_head		rcu_tasks_exit_list;
 #endif /* #ifdef CONFIG_TASKS_RCU */
 
 #ifdef CONFIG_TASKS_TRACE_RCU
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 732ad5b39946..b7d5f2757053 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -32,6 +32,7 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
  * @rtp_irq_work: IRQ work queue for deferred wakeups.
  * @barrier_q_head: RCU callback for barrier operation.
  * @rtp_blkd_tasks: List of tasks blocked as readers.
+ * @rtp_exit_list: List of tasks in the latter portion of do_exit().
  * @cpu: CPU number corresponding to this entry.
  * @rtpp: Pointer to the rcu_tasks structure.
  */
@@ -46,6 +47,7 @@ struct rcu_tasks_percpu {
 	struct irq_work rtp_irq_work;
 	struct rcu_head barrier_q_head;
 	struct list_head rtp_blkd_tasks;
+	struct list_head rtp_exit_list;
 	int cpu;
 	struct rcu_tasks *rtpp;
 };
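
Editorial note: the patch above only adds fields and does not yet use them. As a rough userspace sketch of how a per-CPU exit list of this shape could be maintained, the model below links a task onto a per-CPU list when it enters the late part of exit and unlinks it afterwards. Everything except the three field names taken from the diff (rcu_tasks_exit_cpu, rcu_tasks_exit_list, rtp_exit_list) is hypothetical: the model_* functions, NR_CPUS_MODEL, and the simplified list helpers are stand-ins and not kernel APIs. The model also omits all locking, which real kernel code would need.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this model */

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void model_list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

void model_list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

void model_list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	model_list_init(entry);
}

/* Only the two task fields added by the patch are modelled. */
struct task_model {
	int rcu_tasks_exit_cpu;
	struct list_head rcu_tasks_exit_list;
};

/* Only the per-CPU field added by the patch is modelled. */
struct percpu_model {
	struct list_head rtp_exit_list;
};

struct percpu_model model_percpu[NR_CPUS_MODEL];

void model_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		model_list_init(&model_percpu[cpu].rtp_exit_list);
}

/* Assumption: runs roughly where exit_tasks_rcu_start() is called. */
void model_exit_start(struct task_model *t, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	t->rcu_tasks_exit_cpu = cpu;	/* remember which list holds the task */
	model_list_add(&t->rcu_tasks_exit_list, &model_percpu[cpu].rtp_exit_list);
}

/* Assumption: runs roughly where exit_tasks_rcu_stop() is called. */
void model_exit_stop(struct task_model *t)
{
	model_list_del_init(&t->rcu_tasks_exit_list);
}

/* A grace-period scan could walk every per-CPU list to find exiting tasks. */
int model_count_exiting(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		struct list_head *head = &model_percpu[cpu].rtp_exit_list;

		for (struct list_head *p = head->next; p != head; p = p->next)
			n++;
	}
	return n;
}
```

Recording the CPU number in the task is what lets the removal side find the right per-CPU structure (and, in real code, the right lock) even if the task has migrated since it was added. Whether the follow-up patches in this series work exactly this way is not shown in this message.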