Message ID | 703db18e8d4ef628691fb93980a709be673e62e3.1667810736.git.leonro@nvidia.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | Various core fixes | expand |
On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote: > From: Mark Zhang <markzhang@nvidia.com> > > The MR restrack also needs to be released when delete it, otherwise it > cause memory leak as the task struct won't be released. > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects") > Signed-off-by: Mark Zhang <markzhang@nvidia.com> > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com> > --- > drivers/infiniband/core/restrack.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c > index 1f935d9f6178..01a499a8b88d 100644 > --- a/drivers/infiniband/core/restrack.c > +++ b/drivers/infiniband/core/restrack.c > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > rt = &dev->res[res->type]; > > old = xa_erase(&rt->xa, res->id); > - if (res->type == RDMA_RESTRACK_MR) > - return; This needs more explanation, there was some good reason we needed to avoid the wait_for_completion() for the driver allocated objects, but I can't remember it anymore. You added this code in the v2 of the original series, maybe it had something to do with mlx4? Jason
On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote: > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote: > > From: Mark Zhang <markzhang@nvidia.com> > > > > The MR restrack also needs to be released when delete it, otherwise it > > cause memory leak as the task struct won't be released. > > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects") > > Signed-off-by: Mark Zhang <markzhang@nvidia.com> > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com> > > --- > > drivers/infiniband/core/restrack.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c > > index 1f935d9f6178..01a499a8b88d 100644 > > --- a/drivers/infiniband/core/restrack.c > > +++ b/drivers/infiniband/core/restrack.c > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > > rt = &dev->res[res->type]; > > > > old = xa_erase(&rt->xa, res->id); > > - if (res->type == RDMA_RESTRACK_MR) > > - return; > > This needs more explanation, there was some good reason we needed to > avoid the wait_for_completion() for the driver allocated objects, but I > can't remember it anymore. > > You added this code in the v2 of the original series, maybe it had > something to do with mlx4? I failed to remember either, but if you want even more magic in your life, see this hilarious thread: https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/ I kept this patch in my queue ~1 month and didn't get any complains. Thanks > > Jason
On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote: > On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote: > > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote: > > > From: Mark Zhang <markzhang@nvidia.com> > > > > > > The MR restrack also needs to be released when delete it, otherwise it > > > cause memory leak as the task struct won't be released. > > > > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects") > > > Signed-off-by: Mark Zhang <markzhang@nvidia.com> > > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com> > > > --- > > > drivers/infiniband/core/restrack.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c > > > index 1f935d9f6178..01a499a8b88d 100644 > > > --- a/drivers/infiniband/core/restrack.c > > > +++ b/drivers/infiniband/core/restrack.c > > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > > > rt = &dev->res[res->type]; > > > > > > old = xa_erase(&rt->xa, res->id); > > > - if (res->type == RDMA_RESTRACK_MR) > > > - return; > > > > This needs more explanation, there was some good reason we needed to > > avoid the wait_for_completion() for the driver allocated objects, but I > > can't remember it anymore. > > > > You added this code in the v2 of the original series, maybe it had > > something to do with mlx4? > > I failed to remember either, but if you want even more magic in your life, > see this hilarious thread: > https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/ Oh, that clears it up The issue is that dereg can fail for MR: rdma_restrack_del(&mr->res); ret = mr->device->ops.dereg_mr(mr, udata); if (!ret) { atomic_dec(&pd->usecnt); Because the driver management of the object puts it in the wrong order. The above if is necessary because if we trigger this failure path without it, then the next attempt to free the MR will trigger a WARN_ON. So, no, this can't just be deleted, and the suggestions I gave in that prior thread still look like they hold up. As would converting the mr into core ownership. We are very close now, the mlx5 cache mess is almost cleaned up enough to do it. Perhaps after Michael's series Jason
On Thu, Nov 10, 2022 at 03:29:41PM -0400, Jason Gunthorpe wrote: > On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote: > > On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote: > > > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote: > > > > From: Mark Zhang <markzhang@nvidia.com> > > > > > > > > The MR restrack also needs to be released when delete it, otherwise it > > > > cause memory leak as the task struct won't be released. > > > > > > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects") > > > > Signed-off-by: Mark Zhang <markzhang@nvidia.com> > > > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> > > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com> > > > > --- > > > > drivers/infiniband/core/restrack.c | 2 -- > > > > 1 file changed, 2 deletions(-) > > > > > > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c > > > > index 1f935d9f6178..01a499a8b88d 100644 > > > > --- a/drivers/infiniband/core/restrack.c > > > > +++ b/drivers/infiniband/core/restrack.c > > > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > > > > rt = &dev->res[res->type]; > > > > > > > > old = xa_erase(&rt->xa, res->id); > > > > - if (res->type == RDMA_RESTRACK_MR) > > > > - return; > > > > > > This needs more explanation, there was some good reason we needed to > > > avoid the wait_for_completion() for the driver allocated objects, but I > > > can't remember it anymore. > > > > > > You added this code in the v2 of the original series, maybe it had > > > something to do with mlx4? > > > > I failed to remember either, but if you want even more magic in your life, > > see this hilarious thread: > > https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/ > > Oh, that clears it up > > The issue is that dereg can fail for MR: > > rdma_restrack_del(&mr->res); > ret = mr->device->ops.dereg_mr(mr, udata); > if (!ret) { > atomic_dec(&pd->usecnt); > > Because the driver management of the object puts it in the wrong > order. > > The above if is necessary because if we trigger this failure path > without it, then the next attempt to free the MR will trigger a > WARN_ON. Not really, after first entry to rdma_restrack_del(), we will set res->valid to false. Any subsequent calls to rdma_restrack_del() will do nothing. 322 void rdma_restrack_del(struct rdma_restrack_entry *res) 323 { 324 struct rdma_restrack_entry *old; 325 struct rdma_restrack_root *rt; 326 struct ib_device *dev; 327 328 if (!res->valid) { 329 if (res->task) { 330 put_task_struct(res->task); 331 res->task = NULL; 332 } 333 return; <------- exit 334 } Thanks
On Thu, Nov 10, 2022 at 09:39:51PM +0200, Leon Romanovsky wrote: > On Thu, Nov 10, 2022 at 03:29:41PM -0400, Jason Gunthorpe wrote: > > On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote: > > > On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote: > > > > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote: > > > > > From: Mark Zhang <markzhang@nvidia.com> > > > > > > > > > > The MR restrack also needs to be released when delete it, otherwise it > > > > > cause memory leak as the task struct won't be released. > > > > > > > > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects") > > > > > Signed-off-by: Mark Zhang <markzhang@nvidia.com> > > > > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> > > > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com> > > > > > --- > > > > > drivers/infiniband/core/restrack.c | 2 -- > > > > > 1 file changed, 2 deletions(-) > > > > > > > > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c > > > > > index 1f935d9f6178..01a499a8b88d 100644 > > > > > --- a/drivers/infiniband/core/restrack.c > > > > > +++ b/drivers/infiniband/core/restrack.c > > > > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) > > > > > rt = &dev->res[res->type]; > > > > > > > > > > old = xa_erase(&rt->xa, res->id); > > > > > - if (res->type == RDMA_RESTRACK_MR) > > > > > - return; > > > > > > > > This needs more explanation, there was some good reason we needed to > > > > avoid the wait_for_completion() for the driver allocated objects, but I > > > > can't remember it anymore. > > > > > > > > You added this code in the v2 of the original series, maybe it had > > > > something to do with mlx4? > > > > > > I failed to remember either, but if you want even more magic in your life, > > > see this hilarious thread: > > > https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/ > > > > Oh, that clears it up > > > > The issue is that dereg can fail for MR: > > > > rdma_restrack_del(&mr->res); > > ret = mr->device->ops.dereg_mr(mr, udata); > > if (!ret) { > > atomic_dec(&pd->usecnt); > > > > Because the driver management of the object puts it in the wrong > > order. > > > > The above if is necessary because if we trigger this failure path > > without it, then the next attempt to free the MR will trigger a > > WARN_ON. > > Not really, after first entry to rdma_restrack_del(), we will set > res->valid to false. Any subsequent calls to rdma_restrack_del() will > do nothing. Uh, that does seem OK So I'm back to I don't know why this if was put in. Jason
diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c index 1f935d9f6178..01a499a8b88d 100644 --- a/drivers/infiniband/core/restrack.c +++ b/drivers/infiniband/core/restrack.c @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res) rt = &dev->res[res->type]; old = xa_erase(&rt->xa, res->id); - if (res->type == RDMA_RESTRACK_MR) - return; WARN_ON(old != res); out: