[v2,0/6] RCU tasks fixes for v6.9

Message ID 20240217012745.3446231-1-boqun.feng@gmail.com (mailing list archive)

Message

Boqun Feng Feb. 17, 2024, 1:27 a.m. UTC
Hi,

This series contains the RCU tasks fixes for v6.9. You can also find
the series at:

	git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git rcu-tasks.2024.02.14a

Changes since v1:

*	Update with Paul's rework on "Eliminate deadlocks involving
	do_exit() and RCU tasks"

The detailed list of changes:

Paul E. McKenney (6):
  rcu-tasks: Repair RCU Tasks Trace quiescence check
  rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks
  rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks
  rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks
  rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks
  rcu-tasks: Maintain real-time response in rcu_tasks_postscan()

 include/linux/rcupdate.h |   4 +-
 include/linux/sched.h    |   2 +
 init/init_task.c         |   1 +
 kernel/fork.c            |   1 +
 kernel/rcu/tasks.h       | 110 ++++++++++++++++++++++++++++++---------
 5 files changed, 90 insertions(+), 28 deletions(-)

Comments

Frederic Weisbecker Feb. 22, 2024, 4:52 p.m. UTC | #1
On Fri, Feb 16, 2024 at 05:27:35PM -0800, Boqun Feng wrote:
> Hi,
> 
> This series contains the fixes of RCU tasks for v6.9. You can also find
> the series at:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git rcu-tasks.2024.02.14a
> 
> Changes since v1:
> 
> *	Update with Paul's rework on "Eliminate deadlocks involving
> 	do_exit() and RCU task"
> 
> The detailed list of changes:
> 
> Paul E. McKenney (6):
>   rcu-tasks: Repair RCU Tasks Trace quiescence check
>   rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks
>   rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks
>   rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks
>   rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks

Food for later thought and further improvement: would it make sense to
call exit_rcu_tasks_start() on fork() instead and rely solely on
each CPU's rtp_exit_list instead of the tasklist?

Thanks.

>   rcu-tasks: Maintain real-time response in rcu_tasks_postscan()
> 
>  include/linux/rcupdate.h |   4 +-
>  include/linux/sched.h    |   2 +
>  init/init_task.c         |   1 +
>  kernel/fork.c            |   1 +
>  kernel/rcu/tasks.h       | 110 ++++++++++++++++++++++++++++++---------
>  5 files changed, 90 insertions(+), 28 deletions(-)
> 
> -- 
> 2.43.0
> 
>
Paul E. McKenney Feb. 22, 2024, 10:09 p.m. UTC | #2
On Thu, Feb 22, 2024 at 05:52:23PM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 16, 2024 at 05:27:35PM -0800, Boqun Feng wrote:
> > Hi,
> > 
> > This series contains the fixes of RCU tasks for v6.9. You can also find
> > the series at:
> > 
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git rcu-tasks.2024.02.14a
> > 
> > Changes since v1:
> > 
> > *	Update with Paul's rework on "Eliminate deadlocks involving
> > 	do_exit() and RCU task"
> > 
> > The detailed list of changes:
> > 
> > Paul E. McKenney (6):
> >   rcu-tasks: Repair RCU Tasks Trace quiescence check
> >   rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks
> >   rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks
> >   rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks
> >   rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks
> 
> Food for later thought and further improvement: would it make sense to
> call exit_rcu_tasks_start() on fork() instead and rely solely on
> each CPU's rtp_exit_list instead of the tasklist?

It might well.

One big advantage of doing that is the ability to incrementally traverse
the tasks.  But is there some good way of doing that to the full task
lists?  If so, everyone could benefit.

							Thanx, Paul

> Thanks.
> 
> >   rcu-tasks: Maintain real-time response in rcu_tasks_postscan()
> > 
> >  include/linux/rcupdate.h |   4 +-
> >  include/linux/sched.h    |   2 +
> >  init/init_task.c         |   1 +
> >  kernel/fork.c            |   1 +
> >  kernel/rcu/tasks.h       | 110 ++++++++++++++++++++++++++++++---------
> >  5 files changed, 90 insertions(+), 28 deletions(-)
> > 
> > -- 
> > 2.43.0
> > 
> > 
>
Frederic Weisbecker Feb. 23, 2024, 12:25 p.m. UTC | #3
On Thu, Feb 22, 2024 at 02:09:17PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 22, 2024 at 05:52:23PM +0100, Frederic Weisbecker wrote:
> > On Fri, Feb 16, 2024 at 05:27:35PM -0800, Boqun Feng wrote:
> > > Hi,
> > > 
> > > This series contains the fixes of RCU tasks for v6.9. You can also find
> > > the series at:
> > > 
> > > 	git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git rcu-tasks.2024.02.14a
> > > 
> > > Changes since v1:
> > > 
> > > *	Update with Paul's rework on "Eliminate deadlocks involving
> > > 	do_exit() and RCU task"
> > > 
> > > The detailed list of changes:
> > > 
> > > Paul E. McKenney (6):
> > >   rcu-tasks: Repair RCU Tasks Trace quiescence check
> > >   rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks
> > >   rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks
> > >   rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks
> > >   rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks
> > 
> > Food for later thought and further improvement: would it make sense to
> > call exit_rcu_tasks_start() on fork() instead and rely solely on
> > each CPU's rtp_exit_list instead of the tasklist?
> 
> It might well.
> 
> One big advantage of doing that is the ability to incrementally traverse
> the tasks.  But is there some good way of doing that to the full task
> lists?  If so, everyone could benefit.

What do you mean by incrementally? Do you mean being able to cond_resched()
in the middle of the task iteration? Yeah, not sure that's possible...

Thanks.

> 
> 							Thanx, Paul
> 
> > Thanks.
> > 
> > >   rcu-tasks: Maintain real-time response in rcu_tasks_postscan()
> > > 
> > >  include/linux/rcupdate.h |   4 +-
> > >  include/linux/sched.h    |   2 +
> > >  init/init_task.c         |   1 +
> > >  kernel/fork.c            |   1 +
> > >  kernel/rcu/tasks.h       | 110 ++++++++++++++++++++++++++++++---------
> > >  5 files changed, 90 insertions(+), 28 deletions(-)
> > > 
> > > -- 
> > > 2.43.0
> > > 
> > > 
> >
Paul E. McKenney Feb. 24, 2024, 12:43 a.m. UTC | #4
On Fri, Feb 23, 2024 at 01:25:06PM +0100, Frederic Weisbecker wrote:
> On Thu, Feb 22, 2024 at 02:09:17PM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 22, 2024 at 05:52:23PM +0100, Frederic Weisbecker wrote:
> > > On Fri, Feb 16, 2024 at 05:27:35PM -0800, Boqun Feng wrote:
> > > > Hi,
> > > > 
> > > > This series contains the fixes of RCU tasks for v6.9. You can also find
> > > > the series at:
> > > > 
> > > > 	git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git rcu-tasks.2024.02.14a
> > > > 
> > > > Changes since v1:
> > > > 
> > > > *	Update with Paul's rework on "Eliminate deadlocks involving
> > > > 	do_exit() and RCU task"
> > > > 
> > > > The detailed list of changes:
> > > > 
> > > > Paul E. McKenney (6):
> > > >   rcu-tasks: Repair RCU Tasks Trace quiescence check
> > > >   rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks
> > > >   rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks
> > > >   rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks
> > > >   rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks
> > > 
> > > Food for later thought and further improvement: would it make sense to
> > > call exit_rcu_tasks_start() on fork() instead and rely solely on
> > > each CPU's rtp_exit_list instead of the tasklist?
> > 
> > It might well.
> > 
> > One big advantage of doing that is the ability to incrementally traverse
> > the tasks.  But is there some good way of doing that to the full task
> > lists?  If so, everyone could benefit.
> 
> What do you mean by incrementally? You mean being able to cond_resched()
> in the middle of the tasks iteration? Yeah not sure that's possible...

I do indeed mean doing cond_resched() mid-stream.

One way to make this happen would be to do something like this:

struct task_struct_anchor {
	struct list_head tsa_list;
	struct list_head tsa_adjust_list;
	atomic_t tsa_ref;  // Or use an appropriate API.
	bool tsa_is_anchor;
};

Each task structure would contain one of these, though there are a
number of ways to conserve space if needed.

These anchors would be placed perhaps every 1,000 tasks or so.  When a
traversal encountered one, it could atomic_inc_not_zero() the reference
count, and if that succeeded, exit the RCU read-side critical section and
do a cond_resched().  It could then enter a new RCU read-side critical
section, drop the reference, and continue.
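
As a rough userspace illustration of the pause step described above (not
kernel code), the sketch below models it with C11 atomics.  The names
model_rcu_read_lock(), model_rcu_read_unlock(), model_cond_resched(),
ref_inc_not_zero() and pause_at_anchor() are all hypothetical stand-ins
invented for this sketch; only the ordering of the steps is taken from
the text.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel primitives (assumptions, not real APIs). */
static int rcu_depth;		/* Nesting depth of the model read-side section. */
static int resched_count;	/* Number of times the model "rescheduled". */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }
static void model_cond_resched(void)    { resched_count++; }

/* Increment *ref unless it is zero, in the spirit of atomic_inc_not_zero(). */
static bool ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Called when a traversal reaches an anchor.  Returns true if the
 * traversal paused here, false if the anchor is on its way out and the
 * traversal should simply keep going.
 */
static bool pause_at_anchor(atomic_int *ref)
{
	assert(rcu_depth > 0);		/* Caller must be inside a read-side section. */
	if (!ref_inc_not_zero(ref))
		return false;
	model_rcu_read_unlock();	/* The reference now pins the anchor. */
	model_cond_resched();
	model_rcu_read_lock();		/* New read-side critical section. */
	atomic_fetch_sub(ref, 1);	/* Drop the reference and continue. */
	return true;
}
```

The point of taking the reference before leaving the read-side critical
section is that the anchor cannot be freed while the traversal is away,
so the traversal still has a valid place to resume from.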

A traversal might container_of() its way from ->tsa_list to the
task_struct_anchor structure, then if ->tsa_is_anchor is false,
container_of() its way to the enclosing task structure.
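
The two-step container_of() walk can be modeled in userspace roughly as
follows.  This is only a sketch: struct model_task is a hypothetical
stand-in for the task structure, the anchor structure is cut down to the
two fields the walk needs, and container_of(), list_head and
list_add_tail() are simplified local copies rather than the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct list_head { struct list_head *next, *prev; };

/* Reduced form of the structure above: only the fields used here. */
struct task_struct_anchor {
	struct list_head tsa_list;
	bool tsa_is_anchor;
};

/* Hypothetical stand-in for the task structure embedding the anchor struct. */
struct model_task {
	int pid;
	struct task_struct_anchor tsa;
};

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Sum the pids of real tasks on the list, skipping standalone anchors. */
static int sum_task_pids(struct list_head *head)
{
	int sum = 0;

	for (struct list_head *p = head->next; p != head; p = p->next) {
		struct task_struct_anchor *tsa =
			container_of(p, struct task_struct_anchor, tsa_list);

		if (tsa->tsa_is_anchor)
			continue;	/* Standalone anchor: no enclosing task. */
		sum += container_of(tsa, struct model_task, tsa)->pid;
	}
	return sum;
}

/* Build the list task, anchor, task and walk it. */
static int traversal_demo(void)
{
	struct list_head head = { &head, &head };
	struct model_task t1 = { .pid = 10 }, t2 = { .pid = 20 };
	struct task_struct_anchor anchor = { .tsa_is_anchor = true };
	int sum;

	list_add_tail(&t1.tsa.tsa_list, &head);
	list_add_tail(&anchor.tsa_list, &head);
	list_add_tail(&t2.tsa.tsa_list, &head);
	sum = sum_task_pids(&head);
	assert(sum == 30);	/* 10 + 20: the anchor contributes nothing. */
	return sum;
}
```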

How to maintain proper spacing of the anchors?

One way is to make the traversals do the checking.  If the space between a
pair of anchors was too large or too small, it could add the first of the
pair to a list to be adjusted.  This list could periodically be processed,
perhaps with more urgency if a huge gap had opened up.
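
A minimal sketch of such a traversal-side spacing check, again in
userspace and under stated assumptions: a flat array stands in for the
task list, the needs_adjust flag stands in for being queued on
->tsa_adjust_list, and the gap bounds are parameters because the text
gives only "every 1,000 tasks or so" as a target.  The names
model_entry and check_anchor_spacing() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_entry {
	bool is_anchor;		/* Mirrors ->tsa_is_anchor. */
	bool needs_adjust;	/* Stand-in for being on the adjust list. */
};

/*
 * Walk the entries, counting tasks between consecutive anchors.  When a
 * gap falls outside [min_gap, max_gap], flag the first anchor of the
 * pair unless it is already flagged.  Returns the number of anchors
 * newly flagged by this pass.
 */
static size_t check_anchor_spacing(struct model_entry *e, size_t n,
				   size_t min_gap, size_t max_gap)
{
	struct model_entry *prev = NULL;	/* Most recent anchor seen. */
	size_t gap = 0, flagged = 0;

	assert(min_gap <= max_gap);
	for (size_t i = 0; i < n; i++) {
		if (!e[i].is_anchor) {
			gap++;
			continue;
		}
		if (prev && (gap < min_gap || gap > max_gap) &&
		    !prev->needs_adjust) {
			prev->needs_adjust = true;
			flagged++;
		}
		prev = &e[i];
		gap = 0;
	}
	return flagged;
}
```

The check only records which anchors need attention; the actual adding
or removing of anchors is left to whatever processes the list later,
which keeps the traversal itself cheap.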

Freeing an anchor requires decrementing the reference count, waiting for
it to go to zero, removing the anchor, waiting for a grace period (perhaps
asynchronously), and only then freeing the anchor.
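
The removal sequence above could be sketched in userspace like this.
The struct model_anchor type and the add_anchor(), free_anchor() and
model_synchronize_rcu() functions are hypothetical names for this
illustration, the grace-period wait is only counted rather than
performed, and the sketch assumes that the anchor's reference count
starts at one (the list's own reference).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct model_anchor {
	struct model_anchor *next, *prev;
	atomic_int ref;		/* Assumed to start at 1 for the list itself. */
};

/* Hypothetical stand-in for waiting for a grace period: counts calls. */
static int grace_periods;
static void model_synchronize_rcu(void) { grace_periods++; }

/* Allocate an anchor and link it in right after head. */
static struct model_anchor *add_anchor(struct model_anchor *head)
{
	struct model_anchor *a = malloc(sizeof(*a));

	if (!a)
		return NULL;
	atomic_init(&a->ref, 1);
	a->next = head->next;
	a->prev = head;
	head->next->prev = a;
	head->next = a;
	return a;
}

static void free_anchor(struct model_anchor *a)
{
	assert(atomic_load(&a->ref) >= 1);

	/* 1. Drop the list's reference. */
	atomic_fetch_sub(&a->ref, 1);

	/* 2. Wait for paused traversals to drop theirs. */
	while (atomic_load(&a->ref) != 0)
		;	/* A real version would sleep rather than spin. */

	/*
	 * 3. Unlink.  a->next is left intact so that a traversal still
	 *    standing on the anchor can move forward.
	 */
	a->prev->next = a->next;
	a->next->prev = a->prev;

	/* 4. Wait for a grace period so no traversal can still see it. */
	model_synchronize_rcu();

	/* 5. Only now is it safe to free. */
	free(a);
}
```

Once the count has reached zero, an inc-not-zero style acquisition can
no longer succeed, so no new traversal can pause on the anchor while it
is being unlinked.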

Anchors cannot be moved, only added or removed.

So it is possible.  But is it reasonable?  ;-)

							Thanx, Paul

> > > Thanks.
> > > 
> > > >   rcu-tasks: Maintain real-time response in rcu_tasks_postscan()
> > > > 
> > > >  include/linux/rcupdate.h |   4 +-
> > > >  include/linux/sched.h    |   2 +
> > > >  init/init_task.c         |   1 +
> > > >  kernel/fork.c            |   1 +
> > > >  kernel/rcu/tasks.h       | 110 ++++++++++++++++++++++++++++++---------
> > > >  5 files changed, 90 insertions(+), 28 deletions(-)
> > > > 
> > > > -- 
> > > > 2.43.0
> > > > 
> > > > 
> > >
Frederic Weisbecker Feb. 26, 2024, 1:56 p.m. UTC | #5
On Fri, Feb 23, 2024 at 04:43:04PM -0800, Paul E. McKenney wrote:
> I do indeed mean doing cond_resched() mid-stream.
> 
> One way to make this happen would be to do something like this:
> 
> struct task_struct_anchor {
> 	struct list_head tsa_list;
> 	struct list_head tsa_adjust_list;
> 	atomic_t tsa_ref;  // Or use an appropriate API.
> 	bool tsa_is_anchor;
> };
> 
> Each task structure would contain one of these, though there are a
> number of ways to conserve space if needed.
> 
> These anchors would be placed perhaps every 1,000 tasks or so.  When a
> traversal encountered one, it could atomic_inc_not_zero() the reference
> count, and if that succeeded, exit the RCU read-side critical section and
> do a cond_resched().  It could then enter a new RCU read-side critical
> section, drop the reference, and continue.
> 
> A traversal might container_of() its way from ->tsa_list to the
> task_struct_anchor structure, then if ->tsa_is_anchor is false,
> container_of() its way to the enclosing task structure.
> 
> How to maintain proper spacing of the anchors?
> 
> One way is to make the traversals do the checking.  If the space between a
> pair of anchors was too large or too small, it could add the first of the
> pair to a list to be adjusted.  This list could periodically be processed,
> perhaps with more urgency if a huge gap had opened up.
> 
> Freeing an anchor requires decrementing the reference count, waiting for
> it to go to zero, removing the anchor, waiting for a grace period (perhaps
> asynchronously), and only then freeing the anchor.
> 
> Anchors cannot be moved, only added or removed.
> 
> So it is possible.  But is it reasonable?  ;-)

Wow! And this will need to be done both for process leaders (p->tasks)
and for threads (p->thread_node) :-)
Paul E. McKenney Feb. 26, 2024, 2:37 p.m. UTC | #6
On Mon, Feb 26, 2024 at 02:56:06PM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 23, 2024 at 04:43:04PM -0800, Paul E. McKenney wrote:
> > I do indeed mean doing cond_resched() mid-stream.
> > 
> > One way to make this happen would be to do something like this:
> > 
> > struct task_struct_anchor {
> > 	struct list_head tsa_list;
> > 	struct list_head tsa_adjust_list;
> > 	atomic_t tsa_ref;  // Or use an appropriate API.
> > 	bool tsa_is_anchor;
> > };
> > 
> > Each task structure would contain one of these, though there are a
> > number of ways to conserve space if needed.
> > 
> > These anchors would be placed perhaps every 1,000 tasks or so.  When a
> > traversal encountered one, it could atomic_inc_not_zero() the reference
> > count, and if that succeeded, exit the RCU read-side critical section and
> > do a cond_resched().  It could then enter a new RCU read-side critical
> > section, drop the reference, and continue.
> > 
> > A traversal might container_of() its way from ->tsa_list to the
> > task_struct_anchor structure, then if ->tsa_is_anchor is false,
> > container_of() its way to the enclosing task structure.
> > 
> > How to maintain proper spacing of the anchors?
> > 
> > One way is to make the traversals do the checking.  If the space between a
> > pair of anchors was too large or too small, it could add the first of the
> > pair to a list to be adjusted.  This list could periodically be processed,
> > perhaps with more urgency if a huge gap had opened up.
> > 
> > Freeing an anchor requires decrementing the reference count, waiting for
> > it to go to zero, removing the anchor, waiting for a grace period (perhaps
> > asynchronously), and only then freeing the anchor.
> > 
> > Anchors cannot be moved, only added or removed.
> > 
> > So it is possible.  But is it reasonable?  ;-)
> 
> Wow! And this will need to be done both for process leaders (p->tasks)
> and for threads (p->thread_node) :-)

True enough!  Your point being?  ;-)

							Thanx, Paul