diff mbox series

[rdma-next,2/4] RDMA/restrack: Release MR restrack when delete

Message ID 703db18e8d4ef628691fb93980a709be673e62e3.1667810736.git.leonro@nvidia.com (mailing list archive)
State Accepted
Headers show
Series Various core fixes | expand

Commit Message

Leon Romanovsky Nov. 7, 2022, 8:51 a.m. UTC
From: Mark Zhang <markzhang@nvidia.com>

The MR restrack also needs to be released when delete it, otherwise it
cause memory leak as the task struct won't be released.

Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/restrack.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Jason Gunthorpe Nov. 9, 2022, 7:08 p.m. UTC | #1
On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote:
> From: Mark Zhang <markzhang@nvidia.com>
> 
> The MR restrack also needs to be released when delete it, otherwise it
> cause memory leak as the task struct won't be released.
> 
> Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/core/restrack.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> index 1f935d9f6178..01a499a8b88d 100644
> --- a/drivers/infiniband/core/restrack.c
> +++ b/drivers/infiniband/core/restrack.c
> @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
>  	rt = &dev->res[res->type];
>  
>  	old = xa_erase(&rt->xa, res->id);
> -	if (res->type == RDMA_RESTRACK_MR)
> -		return;

This needs more explanation, there was some good reason we needed to
avoid the wait_for_completion() for the driver allocated objects, but I
can't remember it anymore.

You added this code in the v2 of the original series, maybe it had
something to do with mlx4?

Jason
Leon Romanovsky Nov. 10, 2022, 9:35 a.m. UTC | #2
On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote:
> > From: Mark Zhang <markzhang@nvidia.com>
> > 
> > The MR restrack also needs to be released when delete it, otherwise it
> > cause memory leak as the task struct won't be released.
> > 
> > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  drivers/infiniband/core/restrack.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > index 1f935d9f6178..01a499a8b88d 100644
> > --- a/drivers/infiniband/core/restrack.c
> > +++ b/drivers/infiniband/core/restrack.c
> > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> >  	rt = &dev->res[res->type];
> >  
> >  	old = xa_erase(&rt->xa, res->id);
> > -	if (res->type == RDMA_RESTRACK_MR)
> > -		return;
> 
> This needs more explanation, there was some good reason we needed to
> avoid the wait_for_completion() for the driver allocated objects, but I
> can't remember it anymore.
> 
> You added this code in the v2 of the original series, maybe it had
> something to do with mlx4?

I failed to remember either, but if you want even more magic in your life,
see this hilarious thread:
https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/

I kept this patch in my queue ~1 month and didn't get any complains.

Thanks

> 
> Jason
Jason Gunthorpe Nov. 10, 2022, 7:29 p.m. UTC | #3
On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote:
> On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote:
> > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote:
> > > From: Mark Zhang <markzhang@nvidia.com>
> > > 
> > > The MR restrack also needs to be released when delete it, otherwise it
> > > cause memory leak as the task struct won't be released.
> > > 
> > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > > Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > ---
> > >  drivers/infiniband/core/restrack.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > > index 1f935d9f6178..01a499a8b88d 100644
> > > --- a/drivers/infiniband/core/restrack.c
> > > +++ b/drivers/infiniband/core/restrack.c
> > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> > >  	rt = &dev->res[res->type];
> > >  
> > >  	old = xa_erase(&rt->xa, res->id);
> > > -	if (res->type == RDMA_RESTRACK_MR)
> > > -		return;
> > 
> > This needs more explanation, there was some good reason we needed to
> > avoid the wait_for_completion() for the driver allocated objects, but I
> > can't remember it anymore.
> > 
> > You added this code in the v2 of the original series, maybe it had
> > something to do with mlx4?
> 
> I failed to remember either, but if you want even more magic in your life,
> see this hilarious thread:
> https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/

Oh, that clears it up

The issue is that dereg can fail for MR:

	rdma_restrack_del(&mr->res);
	ret = mr->device->ops.dereg_mr(mr, udata);
	if (!ret) {
		atomic_dec(&pd->usecnt);

Because the driver management of the object puts it in the wrong
order.

The above if is necessary because if we trigger this failure path
without it, then the next attempt to free the MR will trigger a
WARN_ON.

So, no, this can't just be deleted, and the suggestions I gave in that
prior thread still look like they hold up. As would converting the mr
into core ownership. We are very close now, the mlx5 cache mess is
almost cleaned up enough to do it. Perhaps after Michael's series

Jason
Leon Romanovsky Nov. 10, 2022, 7:39 p.m. UTC | #4
On Thu, Nov 10, 2022 at 03:29:41PM -0400, Jason Gunthorpe wrote:
> On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote:
> > On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote:
> > > > From: Mark Zhang <markzhang@nvidia.com>
> > > > 
> > > > The MR restrack also needs to be released when delete it, otherwise it
> > > > cause memory leak as the task struct won't be released.
> > > > 
> > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > > > Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> > > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > ---
> > > >  drivers/infiniband/core/restrack.c | 2 --
> > > >  1 file changed, 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > > > index 1f935d9f6178..01a499a8b88d 100644
> > > > --- a/drivers/infiniband/core/restrack.c
> > > > +++ b/drivers/infiniband/core/restrack.c
> > > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> > > >  	rt = &dev->res[res->type];
> > > >  
> > > >  	old = xa_erase(&rt->xa, res->id);
> > > > -	if (res->type == RDMA_RESTRACK_MR)
> > > > -		return;
> > > 
> > > This needs more explanation, there was some good reason we needed to
> > > avoid the wait_for_completion() for the driver allocated objects, but I
> > > can't remember it anymore.
> > > 
> > > You added this code in the v2 of the original series, maybe it had
> > > something to do with mlx4?
> > 
> > I failed to remember either, but if you want even more magic in your life,
> > see this hilarious thread:
> > https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/
> 
> Oh, that clears it up
> 
> The issue is that dereg can fail for MR:
> 
> 	rdma_restrack_del(&mr->res);
> 	ret = mr->device->ops.dereg_mr(mr, udata);
> 	if (!ret) {
> 		atomic_dec(&pd->usecnt);
> 
> Because the driver management of the object puts it in the wrong
> order.
> 
> The above if is necessary because if we trigger this failure path
> without it, then the next attempt to free the MR will trigger a
> WARN_ON.

Not really, after first entry to rdma_restrack_del(), we will set
res->valid to false. Any subsequent calls to rdma_restrack_del() will
do nothing.

  322 void rdma_restrack_del(struct rdma_restrack_entry *res)
  323 {
  324         struct rdma_restrack_entry *old;
  325         struct rdma_restrack_root *rt;
  326         struct ib_device *dev;
  327
  328         if (!res->valid) {
  329                 if (res->task) {
  330                         put_task_struct(res->task);
  331                         res->task = NULL;
  332                 }
  333                 return; <------- exit
  334         }

Thanks
Jason Gunthorpe Nov. 10, 2022, 7:47 p.m. UTC | #5
On Thu, Nov 10, 2022 at 09:39:51PM +0200, Leon Romanovsky wrote:
> On Thu, Nov 10, 2022 at 03:29:41PM -0400, Jason Gunthorpe wrote:
> > On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote:
> > > On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote:
> > > > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote:
> > > > > From: Mark Zhang <markzhang@nvidia.com>
> > > > > 
> > > > > The MR restrack also needs to be released when delete it, otherwise it
> > > > > cause memory leak as the task struct won't be released.
> > > > > 
> > > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > > > > Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> > > > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
> > > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > > ---
> > > > >  drivers/infiniband/core/restrack.c | 2 --
> > > > >  1 file changed, 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > > > > index 1f935d9f6178..01a499a8b88d 100644
> > > > > --- a/drivers/infiniband/core/restrack.c
> > > > > +++ b/drivers/infiniband/core/restrack.c
> > > > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> > > > >  	rt = &dev->res[res->type];
> > > > >  
> > > > >  	old = xa_erase(&rt->xa, res->id);
> > > > > -	if (res->type == RDMA_RESTRACK_MR)
> > > > > -		return;
> > > > 
> > > > This needs more explanation, there was some good reason we needed to
> > > > avoid the wait_for_completion() for the driver allocated objects, but I
> > > > can't remember it anymore.
> > > > 
> > > > You added this code in the v2 of the original series, maybe it had
> > > > something to do with mlx4?
> > > 
> > > I failed to remember either, but if you want even more magic in your life,
> > > see this hilarious thread:
> > > https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/
> > 
> > Oh, that clears it up
> > 
> > The issue is that dereg can fail for MR:
> > 
> > 	rdma_restrack_del(&mr->res);
> > 	ret = mr->device->ops.dereg_mr(mr, udata);
> > 	if (!ret) {
> > 		atomic_dec(&pd->usecnt);
> > 
> > Because the driver management of the object puts it in the wrong
> > order.
> > 
> > The above if is necessary because if we trigger this failure path
> > without it, then the next attempt to free the MR will trigger a
> > WARN_ON.
> 
> Not really, after first entry to rdma_restrack_del(), we will set
> res->valid to false. Any subsequent calls to rdma_restrack_del() will
> do nothing.

Uh, that does seem OK

So I'm back to I don't know why this if was put in.

Jason
diff mbox series

Patch

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index 1f935d9f6178..01a499a8b88d 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -343,8 +343,6 @@  void rdma_restrack_del(struct rdma_restrack_entry *res)
 	rt = &dev->res[res->type];
 
 	old = xa_erase(&rt->xa, res->id);
-	if (res->type == RDMA_RESTRACK_MR)
-		return;
 	WARN_ON(old != res);
 
 out: