| Message ID | 20230706102849.437687-3-mmpgouride@gmail.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | rcu: Fix rculist_nulls and doc |
On Thu, Jul 06, 2023 at 10:28:49AM +0000, Alan Huang wrote:
> The objects are allocated with SLAB_TYPESAFE_BY_RCU, and there is
> n->next = first within hlist_add_head_rcu() before rcu_assign_pointer(),
> which modifies obj->obj_node.next. There may be readers holding the
> reference of obj in lockless_lookup, and when updater modifies ->next,
> readers can see the change immediately because of SLAB_TYPESAFE_BY_RCU.
>
> There are two memory ordering required in the insertion algorithm,
> we need to make sure obj->key is updated before obj->obj_node.next
> and obj->refcnt, atomic_set_release is not enough to provide the
> required memory barrier.
>
> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
> ---
>  Documentation/RCU/rculist_nulls.rst | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
> index 21e40fcc08de..fa3729dc7e74 100644
> --- a/Documentation/RCU/rculist_nulls.rst
> +++ b/Documentation/RCU/rculist_nulls.rst
> @@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb())
>    {
>      struct hlist_node *node, *next;
>      for (pos = rcu_dereference((head)->first);
> -         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
> +         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&

This one looks good, though the READ_ONCE() becoming rcu_dereference()
would allow the smp_rmb() to be dropped, correct?

>           ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
>           pos = rcu_dereference(next))
>        if (obj->key == key)
> @@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain.
>    obj = kmem_cache_alloc(...);
>    lock_chain(); // typically a spin_lock()
>    obj->key = key;
> -  atomic_set_release(&obj->refcnt, 1); // key before refcnt
> +  /*
> +   * We need to make sure obj->key is updated before obj->obj_node.next
> +   * and obj->refcnt.
> +   */
> +  smp_wmb();
> +  atomic_set(&obj->refcnt, 1);

...but what is smp_wmb() doing that the combination of atomic_set_release()
and hlist_add_head_rcu() was not already doing?  What am I missing?

							Thanx, Paul

>    hlist_add_head_rcu(&obj->obj_node, list);
>    unlock_chain(); // typically a spin_unlock()
>
> --
> 2.34.1
>
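To see where the race window opens, it helps to have the shape of hlist_add_head_rcu() in front of you. The sketch below is paraphrased from include/linux/rculist.h and is only illustrative (exact annotations such as WRITE_ONCE() on ->pprev vary across kernel versions). The point Alan makes in the reply that follows is that the plain "n->next = first" store happens before the rcu_assign_pointer(), so the release ordering of the publication does not help a reader that already holds a reference to the object via SLAB_TYPESAFE_BY_RCU:

  static inline void hlist_add_head_rcu(struct hlist_node *n,
                                        struct hlist_head *h)
  {
          struct hlist_node *first = h->first;

          n->next = first;        /* plain store; a reader already holding a
                                   * reference to the type-safe object can
                                   * observe it before the list head below
                                   * is published */
          n->pprev = &h->first;
          rcu_assign_pointer(hlist_first_rcu(h), n);  /* publishes n on the list */
          if (first)
                  first->pprev = &n->next;
  }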
> On Jul 11, 2023, at 03:11, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Jul 06, 2023 at 10:28:49AM +0000, Alan Huang wrote:
>> The objects are allocated with SLAB_TYPESAFE_BY_RCU, and there is
>> n->next = first within hlist_add_head_rcu() before rcu_assign_pointer(),
>> which modifies obj->obj_node.next. There may be readers holding the
>> reference of obj in lockless_lookup, and when updater modifies ->next,
>> readers can see the change immediately because of SLAB_TYPESAFE_BY_RCU.
>>
>> There are two memory ordering required in the insertion algorithm,
>> we need to make sure obj->key is updated before obj->obj_node.next
>> and obj->refcnt, atomic_set_release is not enough to provide the
>> required memory barrier.
>>
>> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
>> ---
>> Documentation/RCU/rculist_nulls.rst | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
>> index 21e40fcc08de..fa3729dc7e74 100644
>> --- a/Documentation/RCU/rculist_nulls.rst
>> +++ b/Documentation/RCU/rculist_nulls.rst
>> @@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb())
>>    {
>>      struct hlist_node *node, *next;
>>      for (pos = rcu_dereference((head)->first);
>> -         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
>> +         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&
>
> This one looks good, though the READ_ONCE() becoming rcu_dereference()
> would allow the smp_rmb() to be dropped, correct?

The pattern here is:

 reader                                updater

                                       // pos->next is also obj->obj_node.next
 READ_ONCE(pos->next);                 WRITE_ONCE(obj->key, key);
 smp_rmb();                            smp_wmb();
                                       // this is n->next = first; within hlist_add_head_rcu
 READ_ONCE(obj->key);                  WRITE_ONCE(obj->obj_node.next, h->first);

The point here is that the objects are allocated with SLAB_TYPESAFE_BY_RCU,
and there is n->next = first; within hlist_add_head_rcu, so the modification
is visible to readers immediately (before the invocation of rcu_assign_pointer).

Therefore, the smp_rmb() is necessary to ensure that we won't get the new
value of ->next and the old ->key. (If we get the new ->next and the old
->key, we cannot detect movement of the object.)

But in this patch, I forgot to add READ_ONCE to obj->key, will send v2.

>
>> ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
>> pos = rcu_dereference(next))
>> if (obj->key == key)
>> @@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain.
>> obj = kmem_cache_alloc(...);
>> lock_chain(); // typically a spin_lock()
>> obj->key = key;
>> - atomic_set_release(&obj->refcnt, 1); // key before refcnt
>> + /*
>> + * We need to make sure obj->key is updated before obj->obj_node.next
>> + * and obj->refcnt.
>> + */
>> + smp_wmb();
>> + atomic_set(&obj->refcnt, 1);
>
> ...but what is smp_wmb() doing that the combination of atomic_set_release()
> and hlist_add_head_rcu() was not already doing?  What am I missing?

Like above.

>
> 						Thanx, Paul
>
>> hlist_add_head_rcu(&obj->obj_node, list);
>> unlock_chain(); // typically a spin_unlock()
>>
>> --
>> 2.34.1
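The two-column pattern above is the standard message-passing idiom, so it can also be checked mechanically. The following is only a rough illustration, closely modeled on the MP litmus tests shipped in tools/memory-model/litmus-tests, with hypothetical variables key and next standing in for obj->key and obj->obj_node.next; with both barriers present, herd7 should report the "new ->next, old ->key" outcome as never occurring:

  C rculist-nulls-key-vs-next

  (*
   * P0 is the updater: the first write stands in for obj->key = key,
   * the second for the n->next = first store in hlist_add_head_rcu().
   * P1 is the lockless reader: it reads ->next, then ->key.
   * The exists clause asks whether the reader can see the new ->next
   * together with the old ->key.
   *)

  {}

  P0(int *key, int *next)
  {
          WRITE_ONCE(*key, 1);
          smp_wmb();
          WRITE_ONCE(*next, 1);
  }

  P1(int *key, int *next)
  {
          int r0;
          int r1;

          r0 = READ_ONCE(*next);
          smp_rmb();
          r1 = READ_ONCE(*key);
  }

  exists (1:r0=1 /\ 1:r1=0)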
diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
index 21e40fcc08de..fa3729dc7e74 100644
--- a/Documentation/RCU/rculist_nulls.rst
+++ b/Documentation/RCU/rculist_nulls.rst
@@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb())
   {
     struct hlist_node *node, *next;
     for (pos = rcu_dereference((head)->first);
-         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
+         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&
          ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
          pos = rcu_dereference(next))
       if (obj->key == key)
@@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain.
   obj = kmem_cache_alloc(...);
   lock_chain(); // typically a spin_lock()
   obj->key = key;
-  atomic_set_release(&obj->refcnt, 1); // key before refcnt
+  /*
+   * We need to make sure obj->key is updated before obj->obj_node.next
+   * and obj->refcnt.
+   */
+  smp_wmb();
+  atomic_set(&obj->refcnt, 1);
   hlist_add_head_rcu(&obj->obj_node, list);
   unlock_chain(); // typically a spin_unlock()
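For readability, here is how the insert sequence in rculist_nulls.rst would read with the two hunks above applied. This is a sketch assembled from the diff, not the verbatim .rst text, and the remark about barrier pairing in the comment is editorial:

  obj = kmem_cache_alloc(...);
  lock_chain(); // typically a spin_lock()
  obj->key = key;
  /*
   * We need to make sure obj->key is updated before obj->obj_node.next
   * and obj->refcnt; this smp_wmb() pairs with the smp_rmb() in
   * lockless_lookup() above.
   */
  smp_wmb();
  atomic_set(&obj->refcnt, 1);
  hlist_add_head_rcu(&obj->obj_node, list);
  unlock_chain(); // typically a spin_unlock()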
The objects are allocated with SLAB_TYPESAFE_BY_RCU, and there is
n->next = first within hlist_add_head_rcu() before rcu_assign_pointer(),
which modifies obj->obj_node.next. There may be readers holding a
reference to obj in lockless_lookup, and when the updater modifies ->next,
readers can see the change immediately because of SLAB_TYPESAFE_BY_RCU.

There are two memory orderings required in the insertion algorithm:
we need to make sure obj->key is updated before obj->obj_node.next
and obj->refcnt, and atomic_set_release() is not enough to provide the
required memory barrier.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
---
 Documentation/RCU/rculist_nulls.rst | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)