diff mbox series

[v2,4/8] rculist: add list_bidir_{del,prev}_rcu()

Message ID 20241212-work-mount-rbtree-lockless-v2-4-4fe6cef02534@kernel.org (mailing list archive)
State New
Headers show
Series fs: lockless mntns lookup | expand

Commit Message

Christian Brauner Dec. 12, 2024, 11:56 a.m. UTC
Currently there is no primite for retrieving the previous list member.
To do this we need a new deletion primite that doesn't poison the prev
pointer and a corresponding retrieval helper. Note that it is not valid
to ues both list_del_rcu() and list_bidir_del_rcu() on the same list.

Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 include/linux/rculist.h | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

Comments

Paul E. McKenney Dec. 12, 2024, 3:18 p.m. UTC | #1
On Thu, Dec 12, 2024 at 12:56:03PM +0100, Christian Brauner wrote:
> Currently there is no primite for retrieving the previous list member.

s/primite/primitive/g

To my surprise, there is an English word "primite".  According to Merriam
Webster, this is "the anterior member of a pair of gregarines in syzygy".
I fervently hope not to have much opportunity to use this word, especially
in reference to myself.  But I cannot escape the suspicion that Merriam
Webster might be engaging in a little trolling.  ;-)

> To do this we need a new deletion primite that doesn't poison the prev
> pointer and a corresponding retrieval helper. Note that it is not valid
> to ues both list_del_rcu() and list_bidir_del_rcu() on the same list.
> 
> Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good!  I have a few suggestions below, mostly grammar nits.

							Thanx, Paul

> ---
>  include/linux/rculist.h | 43 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 14dfa6008467e803d57f98cfa0275569f1c6a181..c81f9e5a789928ae6825c89325396d638b3e48c5 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -30,6 +30,14 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
>   * way, we must not access it directly
>   */
>  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
> +/*
> + * Return the ->prev pointer of a list_head in an rcu safe way. Don't
> + * access it directly.
> + *
> + * In order to use list_bidir_prev_rcu() deletions must only be done via
> + * list_bidir_del() to avoid poisoning the ->prev pointer.

This should be list_bidir_del_rcu(), right?  If so, I suggest wording
this as follows or similar:

 * Any list traversed with list_bidir_prev_rcu() must never use
 * list_del_rcu().  Doing so will poison the ->prev pointer that
 * list_bidir_prev_rcu() relies on, which will result in segfaults.
 * To prevent these segfaults, use list_bidir_del_rcu() instead
 * of list_del_rcu().

> + */
> +#define list_bidir_prev_rcu(list) (*((struct list_head __rcu **)(&(list)->prev)))

We need a rcu_dereference() in there somewhere, otherwise the compiler
might ruin your day.

Huh.  You (quite reasonably) copy-pasta'd list_next_rcu().  So the
restriction is the same, the caller must use rcu_dereference.  Unless,
like seq_list_next_rcu(), you are never dereferencing it.  That said,
I am not so sure about the callers of unloaded_tainted_modules_seq_next()
and rxrpc_call_seq_next(), which inherit the same restriction.

If those two are used properly with rcu_dereference(), we have empirical
evidence indicating that things might be OK.  Otherwise, both need at
least an upgrade of their header comments.  ;-)

>  /**
>   * list_tail_rcu - returns the prev pointer of the head of the list
> @@ -158,6 +166,41 @@ static inline void list_del_rcu(struct list_head *entry)
>  	entry->prev = LIST_POISON2;
>  }
>  
> +/**
> + * list_bidir_del_rcu - deletes entry from list without re-initialization
> + * @entry: the element to delete from the list.
> + *
> + * In contrat to list_del_rcu() doesn't poison the previous pointer thus

Looks good, but while I am here, I might as well nitpick...

"In constrast".

> + * allowing to go backwards via list_prev_bidir_rcu().

"allowing backwards traversal via"

> + * Note: list_empty() on entry does not return true after this,
> + * the entry is in an undefined state. It is useful for RCU based

"because the entry is in a special undefined state that permits
RCU-based lockfree reverse traversal."

> + * lockfree traversal.

At which point, you don't need this paragraph break.

> + * In particular, it means that we can not poison the forward

"this means that ... forward and backwards"

> + * pointers that may still be used for walking the list.
> + *
> + * The caller must take whatever precautions are necessary
> + * (such as holding appropriate locks) to avoid racing
> + * with another list-mutation primitive, such as list_bidir_del_rcu()
> + * or list_add_rcu(), running on this same list.
> + * However, it is perfectly legal to run concurrently with
> + * the _rcu list-traversal primitives, such as
> + * list_for_each_entry_rcu().
> + *
> + * Noe that the it is not allowed to use list_del_rcu() and

"Note that list_del_rcu() and list_bidir_del_rcu() must not be used on
the same list at the same time."

If you want to leave off the "at the same time", I am good.  One could
argue that we should not call attention to the possibility of adding
this sort of complexity.  Let them need it badly first.  ;-)

> + * list_bidir_del_rcu() on the same list.
> + *
> + * Note that the caller is not permitted to immediately free
> + * the newly deleted entry.  Instead, either synchronize_rcu()
> + * or call_rcu() must be used to defer freeing until an RCU
> + * grace period has elapsed.
> + */
> +static inline void list_bidir_del_rcu(struct list_head *entry)
> +{
> +	__list_del_entry(entry);
> +}
> +
>  /**
>   * hlist_del_init_rcu - deletes entry from hash list with re-initialization
>   * @n: the element to delete from the hash list.
> 
> -- 
> 2.45.2
>
Christian Brauner Dec. 12, 2024, 5:40 p.m. UTC | #2
On Thu, Dec 12, 2024 at 07:18:44AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 12, 2024 at 12:56:03PM +0100, Christian Brauner wrote:
> > Currently there is no primite for retrieving the previous list member.
> 
> s/primite/primitive/g
> 
> To my surprise, there is an English word "primite".  According to Merriam
> Webster, this is "the anterior member of a pair of gregarines in syzygy".
> I fervently hope not to have much opportunity to use this word, especially
> in reference to myself.  But I cannot escape the suspicion that Merriam
> Webster might be engaging in a little trolling.  ;-)

:)

> 
> > To do this we need a new deletion primite that doesn't poison the prev
> > pointer and a corresponding retrieval helper. Note that it is not valid
> > to ues both list_del_rcu() and list_bidir_del_rcu() on the same list.
> > 
> > Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> Looks good!  I have a few suggestions below, mostly grammar nits.

Yes to all.

> 
> 							Thanx, Paul
> 
> > ---
> >  include/linux/rculist.h | 43 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 43 insertions(+)
> > 
> > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > index 14dfa6008467e803d57f98cfa0275569f1c6a181..c81f9e5a789928ae6825c89325396d638b3e48c5 100644
> > --- a/include/linux/rculist.h
> > +++ b/include/linux/rculist.h
> > @@ -30,6 +30,14 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
> >   * way, we must not access it directly
> >   */
> >  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
> > +/*
> > + * Return the ->prev pointer of a list_head in an rcu safe way. Don't
> > + * access it directly.
> > + *
> > + * In order to use list_bidir_prev_rcu() deletions must only be done via
> > + * list_bidir_del() to avoid poisoning the ->prev pointer.
> 
> This should be list_bidir_del_rcu(), right?  If so, I suggest wording
> this as follows or similar:
> 
>  * Any list traversed with list_bidir_prev_rcu() must never use
>  * list_del_rcu().  Doing so will poison the ->prev pointer that
>  * list_bidir_prev_rcu() relies on, which will result in segfaults.
>  * To prevent these segfaults, use list_bidir_del_rcu() instead
>  * of list_del_rcu().
> 
> > + */
> > +#define list_bidir_prev_rcu(list) (*((struct list_head __rcu **)(&(list)->prev)))
> 
> We need a rcu_dereference() in there somewhere, otherwise the compiler
> might ruin your day.
> 
> Huh.  You (quite reasonably) copy-pasta'd list_next_rcu().  So the

I expect the caller to do the rcu_dereference() just like with
list_next_rcu():

At first I made it include a rcu_dereference. But then I realized that
the api becomes rather wonky because list_next_rcu() would have to be
called with rcu_dereference() (most caller's do) and
list_bidir_prev_rcu() wouldn't.

That seemed very confusing to me so I just kept it aligned with
list_next_rcu(). Now, the correct follow-up cleanup would obviously be
to port both helpers to include the rcu_dereference() and thereby simply
all callers. For the few (mostly inside the header itself) we could just
add a __list_next_rcu() thing that doesn't include the
rcu_dereference().

> restriction is the same, the caller must use rcu_dereference.  Unless,
> like seq_list_next_rcu(), you are never dereferencing it.  That said,
> I am not so sure about the callers of unloaded_tainted_modules_seq_next()
> and rxrpc_call_seq_next(), which inherit the same restriction.
> 
> If those two are used properly with rcu_dereference(), we have empirical
> evidence indicating that things might be OK.  Otherwise, both need at
> least an upgrade of their header comments.  ;-)
> 
> >  /**
> >   * list_tail_rcu - returns the prev pointer of the head of the list
> > @@ -158,6 +166,41 @@ static inline void list_del_rcu(struct list_head *entry)
> >  	entry->prev = LIST_POISON2;
> >  }
> >  
> > +/**
> > + * list_bidir_del_rcu - deletes entry from list without re-initialization
> > + * @entry: the element to delete from the list.
> > + *
> > + * In contrat to list_del_rcu() doesn't poison the previous pointer thus
> 
> Looks good, but while I am here, I might as well nitpick...
> 
> "In constrast".
> 
> > + * allowing to go backwards via list_prev_bidir_rcu().
> 
> "allowing backwards traversal via"
> 
> > + * Note: list_empty() on entry does not return true after this,
> > + * the entry is in an undefined state. It is useful for RCU based
> 
> "because the entry is in a special undefined state that permits
> RCU-based lockfree reverse traversal."
> 
> > + * lockfree traversal.
> 
> At which point, you don't need this paragraph break.
> 
> > + * In particular, it means that we can not poison the forward
> 
> "this means that ... forward and backwards"
> 
> > + * pointers that may still be used for walking the list.
> > + *
> > + * The caller must take whatever precautions are necessary
> > + * (such as holding appropriate locks) to avoid racing
> > + * with another list-mutation primitive, such as list_bidir_del_rcu()
> > + * or list_add_rcu(), running on this same list.
> > + * However, it is perfectly legal to run concurrently with
> > + * the _rcu list-traversal primitives, such as
> > + * list_for_each_entry_rcu().
> > + *
> > + * Noe that the it is not allowed to use list_del_rcu() and
> 
> "Note that list_del_rcu() and list_bidir_del_rcu() must not be used on
> the same list at the same time."
> 
> If you want to leave off the "at the same time", I am good.  One could
> argue that we should not call attention to the possibility of adding
> this sort of complexity.  Let them need it badly first.  ;-)
> 
> > + * list_bidir_del_rcu() on the same list.
> > + *
> > + * Note that the caller is not permitted to immediately free
> > + * the newly deleted entry.  Instead, either synchronize_rcu()
> > + * or call_rcu() must be used to defer freeing until an RCU
> > + * grace period has elapsed.
> > + */
> > +static inline void list_bidir_del_rcu(struct list_head *entry)
> > +{
> > +	__list_del_entry(entry);
> > +}
> > +
> >  /**
> >   * hlist_del_init_rcu - deletes entry from hash list with re-initialization
> >   * @n: the element to delete from the hash list.
> > 
> > -- 
> > 2.45.2
> >
Paul E. McKenney Dec. 12, 2024, 6:27 p.m. UTC | #3
On Thu, Dec 12, 2024 at 06:40:48PM +0100, Christian Brauner wrote:
> On Thu, Dec 12, 2024 at 07:18:44AM -0800, Paul E. McKenney wrote:
> > On Thu, Dec 12, 2024 at 12:56:03PM +0100, Christian Brauner wrote:
> > > Currently there is no primite for retrieving the previous list member.
> > 
> > s/primite/primitive/g
> > 
> > To my surprise, there is an English word "primite".  According to Merriam
> > Webster, this is "the anterior member of a pair of gregarines in syzygy".
> > I fervently hope not to have much opportunity to use this word, especially
> > in reference to myself.  But I cannot escape the suspicion that Merriam
> > Webster might be engaging in a little trolling.  ;-)
> 
> :)
> 
> > 
> > > To do this we need a new deletion primite that doesn't poison the prev
> > > pointer and a corresponding retrieval helper. Note that it is not valid
> > > to ues both list_del_rcu() and list_bidir_del_rcu() on the same list.
> > > 
> > > Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > 
> > Looks good!  I have a few suggestions below, mostly grammar nits.
> 
> Yes to all.
> 
> > 
> > 							Thanx, Paul
> > 
> > > ---
> > >  include/linux/rculist.h | 43 +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 43 insertions(+)
> > > 
> > > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > > index 14dfa6008467e803d57f98cfa0275569f1c6a181..c81f9e5a789928ae6825c89325396d638b3e48c5 100644
> > > --- a/include/linux/rculist.h
> > > +++ b/include/linux/rculist.h
> > > @@ -30,6 +30,14 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
> > >   * way, we must not access it directly
> > >   */
> > >  #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
> > > +/*
> > > + * Return the ->prev pointer of a list_head in an rcu safe way. Don't
> > > + * access it directly.
> > > + *
> > > + * In order to use list_bidir_prev_rcu() deletions must only be done via
> > > + * list_bidir_del() to avoid poisoning the ->prev pointer.
> > 
> > This should be list_bidir_del_rcu(), right?  If so, I suggest wording
> > this as follows or similar:
> > 
> >  * Any list traversed with list_bidir_prev_rcu() must never use
> >  * list_del_rcu().  Doing so will poison the ->prev pointer that
> >  * list_bidir_prev_rcu() relies on, which will result in segfaults.
> >  * To prevent these segfaults, use list_bidir_del_rcu() instead
> >  * of list_del_rcu().
> > 
> > > + */
> > > +#define list_bidir_prev_rcu(list) (*((struct list_head __rcu **)(&(list)->prev)))
> > 
> > We need a rcu_dereference() in there somewhere, otherwise the compiler
> > might ruin your day.
> > 
> > Huh.  You (quite reasonably) copy-pasta'd list_next_rcu().  So the
> 
> I expect the caller to do the rcu_dereference() just like with
> list_next_rcu():
> 
> At first I made it include a rcu_dereference. But then I realized that
> the api becomes rather wonky because list_next_rcu() would have to be
> called with rcu_dereference() (most caller's do) and
> list_bidir_prev_rcu() wouldn't.
> 
> That seemed very confusing to me so I just kept it aligned with
> list_next_rcu(). Now, the correct follow-up cleanup would obviously be
> to port both helpers to include the rcu_dereference() and thereby simply
> all callers. For the few (mostly inside the header itself) we could just
> add a __list_next_rcu() thing that doesn't include the
> rcu_dereference().

I am not sure what others think, but that sound better than status quo
to me.  And it is not like rcu_dereference() is costly, other than on
DEC Alpha, so we likely don't even need that __list_next_rcu().  Of
course, the opinion of actual hardware overrides my own, as always.

							Thanx, Paul

> > restriction is the same, the caller must use rcu_dereference.  Unless,
> > like seq_list_next_rcu(), you are never dereferencing it.  That said,
> > I am not so sure about the callers of unloaded_tainted_modules_seq_next()
> > and rxrpc_call_seq_next(), which inherit the same restriction.
> > 
> > If those two are used properly with rcu_dereference(), we have empirical
> > evidence indicating that things might be OK.  Otherwise, both need at
> > least an upgrade of their header comments.  ;-)
> > 
> > >  /**
> > >   * list_tail_rcu - returns the prev pointer of the head of the list
> > > @@ -158,6 +166,41 @@ static inline void list_del_rcu(struct list_head *entry)
> > >  	entry->prev = LIST_POISON2;
> > >  }
> > >  
> > > +/**
> > > + * list_bidir_del_rcu - deletes entry from list without re-initialization
> > > + * @entry: the element to delete from the list.
> > > + *
> > > + * In contrat to list_del_rcu() doesn't poison the previous pointer thus
> > 
> > Looks good, but while I am here, I might as well nitpick...
> > 
> > "In constrast".
> > 
> > > + * allowing to go backwards via list_prev_bidir_rcu().
> > 
> > "allowing backwards traversal via"
> > 
> > > + * Note: list_empty() on entry does not return true after this,
> > > + * the entry is in an undefined state. It is useful for RCU based
> > 
> > "because the entry is in a special undefined state that permits
> > RCU-based lockfree reverse traversal."
> > 
> > > + * lockfree traversal.
> > 
> > At which point, you don't need this paragraph break.
> > 
> > > + * In particular, it means that we can not poison the forward
> > 
> > "this means that ... forward and backwards"
> > 
> > > + * pointers that may still be used for walking the list.
> > > + *
> > > + * The caller must take whatever precautions are necessary
> > > + * (such as holding appropriate locks) to avoid racing
> > > + * with another list-mutation primitive, such as list_bidir_del_rcu()
> > > + * or list_add_rcu(), running on this same list.
> > > + * However, it is perfectly legal to run concurrently with
> > > + * the _rcu list-traversal primitives, such as
> > > + * list_for_each_entry_rcu().
> > > + *
> > > + * Noe that the it is not allowed to use list_del_rcu() and
> > 
> > "Note that list_del_rcu() and list_bidir_del_rcu() must not be used on
> > the same list at the same time."
> > 
> > If you want to leave off the "at the same time", I am good.  One could
> > argue that we should not call attention to the possibility of adding
> > this sort of complexity.  Let them need it badly first.  ;-)
> > 
> > > + * list_bidir_del_rcu() on the same list.
> > > + *
> > > + * Note that the caller is not permitted to immediately free
> > > + * the newly deleted entry.  Instead, either synchronize_rcu()
> > > + * or call_rcu() must be used to defer freeing until an RCU
> > > + * grace period has elapsed.
> > > + */
> > > +static inline void list_bidir_del_rcu(struct list_head *entry)
> > > +{
> > > +	__list_del_entry(entry);
> > > +}
> > > +
> > >  /**
> > >   * hlist_del_init_rcu - deletes entry from hash list with re-initialization
> > >   * @n: the element to delete from the hash list.
> > > 
> > > -- 
> > > 2.45.2
> > >
diff mbox series

Patch

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 14dfa6008467e803d57f98cfa0275569f1c6a181..c81f9e5a789928ae6825c89325396d638b3e48c5 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -30,6 +30,14 @@  static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
  * way, we must not access it directly
  */
 #define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
+/*
+ * Return the ->prev pointer of a list_head in an rcu safe way. Don't
+ * access it directly.
+ *
+ * In order to use list_bidir_prev_rcu() deletions must only be done via
+ * list_bidir_del() to avoid poisoning the ->prev pointer.
+ */
+#define list_bidir_prev_rcu(list) (*((struct list_head __rcu **)(&(list)->prev)))
 
 /**
  * list_tail_rcu - returns the prev pointer of the head of the list
@@ -158,6 +166,41 @@  static inline void list_del_rcu(struct list_head *entry)
 	entry->prev = LIST_POISON2;
 }
 
+/**
+ * list_bidir_del_rcu - deletes entry from list without re-initialization
+ * @entry: the element to delete from the list.
+ *
+ * In contrat to list_del_rcu() doesn't poison the previous pointer thus
+ * allowing to go backwards via list_prev_bidir_rcu().
+ *
+ * Note: list_empty() on entry does not return true after this,
+ * the entry is in an undefined state. It is useful for RCU based
+ * lockfree traversal.
+ *
+ * In particular, it means that we can not poison the forward
+ * pointers that may still be used for walking the list.
+ *
+ * The caller must take whatever precautions are necessary
+ * (such as holding appropriate locks) to avoid racing
+ * with another list-mutation primitive, such as list_bidir_del_rcu()
+ * or list_add_rcu(), running on this same list.
+ * However, it is perfectly legal to run concurrently with
+ * the _rcu list-traversal primitives, such as
+ * list_for_each_entry_rcu().
+ *
+ * Noe that the it is not allowed to use list_del_rcu() and
+ * list_bidir_del_rcu() on the same list.
+ *
+ * Note that the caller is not permitted to immediately free
+ * the newly deleted entry.  Instead, either synchronize_rcu()
+ * or call_rcu() must be used to defer freeing until an RCU
+ * grace period has elapsed.
+ */
+static inline void list_bidir_del_rcu(struct list_head *entry)
+{
+	__list_del_entry(entry);
+}
+
 /**
  * hlist_del_init_rcu - deletes entry from hash list with re-initialization
  * @n: the element to delete from the hash list.