Message ID | 20211103050241.61293-11-rpearsonhpe@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | Correct race conditions in rdma_rxe | expand |
On Wed, Nov 03, 2021 at 12:02:39AM -0500, Bob Pearson wrote: > Currently rxe_add_ref() calls kref_get() which increments the > reference count even if the object has already been released. > Change this to refcount_inc_not_zero() which will only succeed > if the current ref count is larger than zero. This change exposes > some reference counting errors which will be fixed in the following > patches. > > Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> > drivers/infiniband/sw/rxe/rxe_pool.h | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h > index 6cd2366d5407..46f2abc359f3 100644 > +++ b/drivers/infiniband/sw/rxe/rxe_pool.h > @@ -7,6 +7,8 @@ > #ifndef RXE_POOL_H > #define RXE_POOL_H > > +#include <linux/refcount.h> > + > enum rxe_pool_flags { > RXE_POOL_AUTO_INDEX = BIT(1), > RXE_POOL_EXT_INDEX = BIT(2), > @@ -70,9 +72,15 @@ int __rxe_pool_add(struct rxe_pool *pool, struct rxe_pool_elem *elem); > > void *rxe_pool_get_index(struct rxe_pool *pool, unsigned long index); > > -#define rxe_add_ref(obj) kref_get(&(obj)->elem.ref_cnt) > +static inline int __rxe_add_ref(struct rxe_pool_elem *elem) > +{ > + return refcount_inc_not_zero(&elem->ref_cnt.refcount); > +} Don't reach inside a kref to touch the refcount. kref_get_unless_zero() is the api for krefs I'm not sure how any of this works, ie rxe_create_qp() stuffs a core allocated object into the pool, and various places do refcounting on it But then rxe_destroy_qp doesn't seem to do anything with the refcounts, it just blindly lets things go to kfree in the core. Seems really confused.. Jason
On 11/19/21 11:45, Jason Gunthorpe wrote: > On Wed, Nov 03, 2021 at 12:02:39AM -0500, Bob Pearson wrote: >> Currently rxe_add_ref() calls kref_get() which increments the >> reference count even if the object has already been released. >> Change this to refcount_inc_not_zero() which will only succeed >> if the current ref count is larger than zero. This change exposes >> some reference counting errors which will be fixed in the following >> patches. >> >> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> >> drivers/infiniband/sw/rxe/rxe_pool.h | 10 +++++++++- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h >> index 6cd2366d5407..46f2abc359f3 100644 >> +++ b/drivers/infiniband/sw/rxe/rxe_pool.h >> @@ -7,6 +7,8 @@ >> #ifndef RXE_POOL_H >> #define RXE_POOL_H >> >> +#include <linux/refcount.h> >> + >> enum rxe_pool_flags { >> RXE_POOL_AUTO_INDEX = BIT(1), >> RXE_POOL_EXT_INDEX = BIT(2), >> @@ -70,9 +72,15 @@ int __rxe_pool_add(struct rxe_pool *pool, struct rxe_pool_elem *elem); >> >> void *rxe_pool_get_index(struct rxe_pool *pool, unsigned long index); >> >> -#define rxe_add_ref(obj) kref_get(&(obj)->elem.ref_cnt) >> +static inline int __rxe_add_ref(struct rxe_pool_elem *elem) >> +{ >> + return refcount_inc_not_zero(&elem->ref_cnt.refcount); >> +} > > Don't reach inside a kref to touch the > refcount. kref_get_unless_zero() is the api for krefs Yes. Thanks for pointing this out. I had started using the refcount APIs instead of the kref APIs for a different one that is not supported by kref and then quit using it. This is a cleaner solution. > > I'm not sure how any of this works, ie rxe_create_qp() stuffs a core > allocated object into the pool, and various places do refcounting on > it I have been having a long running discussion with Leon about rxe's use of ref counting. His opinion is that it isn't required at all and that rdma-core is doing enough. I think that we either have to push the problem onto the apps (e.g. return EAGAIN or EBUSY) or let the driver sleep if there are currently active packets being processed with a mutex or semaphore. Additionally we need to cleanup some bad code that makes things not work as mentioned. > > But then rxe_destroy_qp doesn't seem to do anything with the > refcounts, it just blindly lets things go to kfree in the core. > > Seems really confused.. > > Jason >
On Tue, Nov 30, 2021 at 02:05:03PM -0600, Bob Pearson wrote: > let the driver sleep if there are currently active packets being processed with a mutex > or semaphore. Additionally we need to cleanup some bad code that makes things not work as > mentioned. Drivers have to sleep in their destroy to fully fence any activity related to the destroying object. It can't keep operating the object after destroy returns, that is not our semantic Jason
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h index 6cd2366d5407..46f2abc359f3 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.h +++ b/drivers/infiniband/sw/rxe/rxe_pool.h @@ -7,6 +7,8 @@ #ifndef RXE_POOL_H #define RXE_POOL_H +#include <linux/refcount.h> + enum rxe_pool_flags { RXE_POOL_AUTO_INDEX = BIT(1), RXE_POOL_EXT_INDEX = BIT(2), @@ -70,9 +72,15 @@ int __rxe_pool_add(struct rxe_pool *pool, struct rxe_pool_elem *elem); void *rxe_pool_get_index(struct rxe_pool *pool, unsigned long index); -#define rxe_add_ref(obj) kref_get(&(obj)->elem.ref_cnt) +static inline int __rxe_add_ref(struct rxe_pool_elem *elem) +{ + return refcount_inc_not_zero(&elem->ref_cnt.refcount); +} + +#define rxe_add_ref(obj) __rxe_add_ref(&(obj)->elem) void rxe_elem_release(struct kref *kref); + #define rxe_drop_ref(obj) kref_put(&(obj)->elem.ref_cnt, rxe_elem_release) #endif /* RXE_POOL_H */
Currently rxe_add_ref() calls kref_get() which increments the reference count even if the object has already been released. Change this to refcount_inc_not_zero() which will only succeed if the current ref count is larger than zero. This change exposes some reference counting errors which will be fixed in the following patches. Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> --- drivers/infiniband/sw/rxe/rxe_pool.h | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)