Message ID | 20220208211644.123457-9-rpearsonhpe@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Move two object pools to rxe_mcast.c | expand |
On Tue, Feb 08, 2022 at 03:16:42PM -0600, Bob Pearson wrote: > Well behaved applications will free all memory allocated by multicast > but programs which do not clean up properly can leave behind allocated > memory when the rxe driver is unloaded. This patch walks the red-black > tree holding multicast group elements and then walks the list of attached > qp's freeing the mca's and finally the mcg's. How does this happen? the ib core ensures that all uobjects are destroyed, so if something is still in the rb tree here it means that an earlier uobject destruction leaked it Jason
On 2/11/22 12:43, Jason Gunthorpe wrote: > On Tue, Feb 08, 2022 at 03:16:42PM -0600, Bob Pearson wrote: >> Well behaved applications will free all memory allocated by multicast >> but programs which do not clean up properly can leave behind allocated >> memory when the rxe driver is unloaded. This patch walks the red-black >> tree holding multicast group elements and then walks the list of attached >> qp's freeing the mca's and finally the mcg's. > > How does this happen? the ib core ensures that all uobjects are > destroyed, so if something is still in the rb tree here it means that > an earlier uobject destruction leaked it > > Jason The mc_grp and mc_elem objects are not rdma-core uobjects. So their memory is allocated by the rxe driver. They get created by ib_attach_mcast and destroyed by ib_detach_mcast. If an application crashes without calling a matching ib_detach_mcast for each attachment the driver would have leaked the memory. This patch fixes that. Bob
On Fri, Feb 11, 2022 at 01:36:06PM -0600, Bob Pearson wrote: > On 2/11/22 12:43, Jason Gunthorpe wrote: > > On Tue, Feb 08, 2022 at 03:16:42PM -0600, Bob Pearson wrote: > >> Well behaved applications will free all memory allocated by multicast > >> but programs which do not clean up properly can leave behind allocated > >> memory when the rxe driver is unloaded. This patch walks the red-black > >> tree holding multicast group elements and then walks the list of attached > >> qp's freeing the mca's and finally the mcg's. > > > > How does this happen? the ib core ensures that all uobjects are > > destroyed, so if something is still in the rb tree here it means that > > an earlier uobject destruction leaked it > > > > Jason > > The mc_grp and mc_elem objects are not rdma-core uobjects. So their memory > is allocated by the rxe driver. They get created by ib_attach_mcast and destroyed > by ib_detach_mcast. If an application crashes without calling a matching > ib_detach_mcast for each attachment the driver would have leaked the memory. > This patch fixes that. The mcast attachment is affiliated with a QP, when all the QPs are destroyed, which are rdma-core objects, then all attachments should be free'd as well. That should have happened by the time things get here I would expect a simple WARN_ON that the rb tree is empty in destroy to catch any bugs. Jason
On 2/11/22 13:56, Jason Gunthorpe wrote: > On Fri, Feb 11, 2022 at 01:36:06PM -0600, Bob Pearson wrote: >> On 2/11/22 12:43, Jason Gunthorpe wrote: >>> On Tue, Feb 08, 2022 at 03:16:42PM -0600, Bob Pearson wrote: >>>> Well behaved applications will free all memory allocated by multicast >>>> but programs which do not clean up properly can leave behind allocated >>>> memory when the rxe driver is unloaded. This patch walks the red-black >>>> tree holding multicast group elements and then walks the list of attached >>>> qp's freeing the mca's and finally the mcg's. >>> >>> How does this happen? the ib core ensures that all uobjects are >>> destroyed, so if something is still in the rb tree here it means that >>> an earlier uobject destruction leaked it >>> >>> Jason >> >> The mc_grp and mc_elem objects are not rdma-core uobjects. So their memory >> is allocated by the rxe driver. They get created by ib_attach_mcast and destroyed >> by ib_detach_mcast. If an application crashes without calling a matching >> ib_detach_mcast for each attachment the driver would have leaked the memory. >> This patch fixes that. > > The mcast attachment is affiliated with a QP, when all the QPs are > destroyed, which are rdma-core objects, then all attachments should be > free'd as well. That should have happened by the time things get here > > I would expect a simple WARN_ON that the rb tree is empty in destroy > to catch any bugs. > > Jason I wrote a simple user test case that calls ibv_attach_mcast() and then sleeps. If I type ^C the qp is detached as you predicted so someone is counting them somewhere. It isn't obvious in rdma-core. Not sure what you are suggesting above. The detach code already checks to see if the attachment is *not* present. The only time it is clear that there shouldn't be any more attachments present is when the device is closed which is what the cleanup code is doing. I can add a warning if there was anything there but it will likely never get called since someone above me is actively cleaning them up from failed apps. Or we can just drop this patch and assume it isn't needed. Bob
On Fri, Feb 11, 2022 at 06:37:50PM -0600, Bob Pearson wrote: > > The mcast attachment is affiliated with a QP, when all the QPs are > > destroyed, which are rdma-core objects, then all attachments should be > > free'd as well. That should have happened by the time things get here > > > > I would expect a simple WARN_ON that the rb tree is empty in destroy > > to catch any bugs. > > > > Jason > > I wrote a simple user test case that calls ibv_attach_mcast() and > then sleeps. If I type ^C the qp is detached as you predicted so > someone is counting them somewhere. It isn't obvious in rdma-core. QP destroy is done inside the kernel in all the uverbs stuff, detatch from multicast during QP destroy should have been inside RXE's qp destroy function. > never get called since someone above me is actively cleaning them up > from failed apps. Or we can just drop this patch and assume it isn't > needed. I mean drop this patch and just put a WARN_ON to assert the rb tree is empty before destroying the memory that holds it, just in case someone adds a bug someday. Jason
diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c index 3520eb2db685..603b0156f889 100644 --- a/drivers/infiniband/sw/rxe/rxe.c +++ b/drivers/infiniband/sw/rxe/rxe.c @@ -29,6 +29,8 @@ void rxe_dealloc(struct ib_device *ib_dev) rxe_pool_cleanup(&rxe->mr_pool); rxe_pool_cleanup(&rxe->mw_pool); + rxe_cleanup_mcast(rxe); + if (rxe->tfm) crypto_free_shash(rxe->tfm); } diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index 409efeecd581..0bc1b7e2877c 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -44,6 +44,7 @@ struct rxe_mcg *rxe_lookup_mcg(struct rxe_dev *rxe, union ib_gid *mgid); int rxe_attach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid); int rxe_detach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid); void rxe_cleanup_mcg(struct kref *kref); +void rxe_cleanup_mcast(struct rxe_dev *rxe); /* rxe_mmap.c */ struct rxe_mmap_info { diff --git a/drivers/infiniband/sw/rxe/rxe_mcast.c b/drivers/infiniband/sw/rxe/rxe_mcast.c index 07c218788c59..846147878607 100644 --- a/drivers/infiniband/sw/rxe/rxe_mcast.c +++ b/drivers/infiniband/sw/rxe/rxe_mcast.c @@ -382,3 +382,34 @@ int rxe_detach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid) return rxe_detach_mcg(rxe, qp, mgid); } + +/** + * rxe_cleanup_mcast - cleanup all resources held by mcast + * @rxe: rxe object + * + * Called when rxe device is unloaded. Walk red-black tree to + * find all mcg's and then walk mcg->qp_list to find all mca's and + * free them. These should have been freed already if apps are + * well behaved. + */ +void rxe_cleanup_mcast(struct rxe_dev *rxe) +{ + struct rb_root *root = &rxe->mcg_tree; + struct rb_node *node, *next; + struct rxe_mcg *mcg; + struct rxe_mca *mca, *tmp; + + for (node = rb_first(root); node; node = next) { + next = rb_next(node); + mcg = rb_entry(node, typeof(*mcg), node); + + spin_lock_bh(&rxe->mcg_lock); + list_for_each_entry_safe(mca, tmp, &mcg->qp_list, qp_list) + kfree(mca); + + __rxe_remove_mcg(mcg); + spin_unlock_bh(&rxe->mcg_lock); + + kfree(mcg); + } +}
Well behaved applications will free all memory allocated by multicast but programs which do not clean up properly can leave behind allocated memory when the rxe driver is unloaded. This patch walks the red-black tree holding multicast group elements and then walks the list of attached qp's freeing the mca's and finally the mcg's. Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> --- drivers/infiniband/sw/rxe/rxe.c | 2 ++ drivers/infiniband/sw/rxe/rxe_loc.h | 1 + drivers/infiniband/sw/rxe/rxe_mcast.c | 31 +++++++++++++++++++++++++++ 3 files changed, 34 insertions(+)