diff mbox series

[STABLE,5.10] RDMA/restrack: Release MR/QP restrack when delete

Message ID 20241116175748571awvOCFyR9lCLwe61IhOXL@zte.com.cn (mailing list archive)
State New
Series [STABLE,5.10] RDMA/restrack: Release MR/QP restrack when delete

Commit Message

jiang.kun2@zte.com.cn Nov. 16, 2024, 9:57 a.m. UTC
From: tuqiang <tu.qiang35@zte.com.cn>

The MR/QP restrack entry also needs to be released when the object is
deleted; otherwise memory leaks because the task_struct is never released.

This problem has been fixed upstream by commit dac153f2802d
("RDMA/restrack: Release MR restrack when delete"), but it still exists in
the linux-5.10.y branch.

Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
Signed-off-by: tuqiang <tu.qiang35@zte.com.cn>
Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn>
Cc: stable@vger.kernel.org
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/core/restrack.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Greg KH Nov. 16, 2024, 10:25 a.m. UTC | #1
On Sat, Nov 16, 2024 at 05:57:48PM +0800, jiang.kun2@zte.com.cn wrote:
> From: tuqiang <tu.qiang35@zte.com.cn>
> 
> The MR/QP restrack entry also needs to be released when the object is
> deleted; otherwise memory leaks because the task_struct is never released.
> 
> This problem has been fixed upstream by commit dac153f2802d
> ("RDMA/restrack: Release MR restrack when delete"), but it still exists in
> the linux-5.10.y branch.

Why don't we just take the correct fix?  Why is this needed instead?

thanks,

greg k-h
tuqiang Nov. 20, 2024, 3:04 p.m. UTC | #2
>
>On Sat, Nov 16, 2024 at 05:57:48PM +0800, jiang.kun2@zte.com.cn wrote:
>> From: tuqiang <tu.qiang35@zte.com.cn>
>> 
>> The MR/QP restrack entry also needs to be released when the object is
>> deleted; otherwise memory leaks because the task_struct is never released.
>> 
>> This problem has been fixed upstream by commit dac153f2802d
>> ("RDMA/restrack: Release MR restrack when delete"), but it still exists in
>> the linux-5.10.y branch.
>
>Why don't we just take the correct fix?  Why is this needed instead?

1. Reply: Why don't we just take the correct fix?
=================================================
The upstream commit cannot be cherry-picked directly onto the linux-5.10
branch because the surrounding code context differs.
Commit 514aee660df4 ("RDMA: Globally allocate and release QP memory") resolved
the resource release issue for QP, but the MR issue remains unresolved.


2. Reply: Why is this needed instead?
=====================================
When a user creates an MR or QP through the corresponding uverbs commands, a
reference is taken on the caller's task_struct (see the stacks below).
However, when the object is destroyed, rdma_restrack_del() returns early for
these two types, so that reference is never released.

Stack:
0xffffffffb70df1d0 : get_task_struct+0x0/0x50 [kernel]
0xffffffffc5b3a42c : rdma_restrack_attach_task.isra.6+0x2c/0x50 [ib_core]
0xffffffffc748fd54 : ib_uverbs_reg_mr+0x194/0x260 [ib_uverbs]
0xffffffffc749a049 : ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb9/0x110 [ib_uverbs]
0xffffffffc7496a1f : ib_uverbs_run_method+0x6ff/0x7b0 [ib_uverbs]
0xffffffffc7496c65 : ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
0xffffffffc7496ec3 : ib_uverbs_ioctl+0x93/0xe0 [ib_uverbs]
0xffffffffb736bbe9 : __x64_sys_ioctl+0x89/0xc0 [kernel]
0xffffffffb7a62a10 : do_syscall_64+0x30/0x40 [kernel]

0xffffffffb70df1d0 : get_task_struct+0x0/0x50 [kernel]
0xffffffffc5b3a42c : rdma_restrack_attach_task.isra.6+0x2c/0x50 [ib_core]
0xffffffffc749bfea : ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0xaba/0xb40 [ib_uverbs]
0xffffffffc7496a1f : ib_uverbs_run_method+0x6ff/0x7b0 [ib_uverbs]
0xffffffffc7496c65 : ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
0xffffffffc7496ec3 : ib_uverbs_ioctl+0x93/0xe0 [ib_uverbs]
0xffffffffb736bbe9 : __x64_sys_ioctl+0x89/0xc0 [kernel]
0xffffffffb7a62a10 : do_syscall_64+0x30/0x40 [kernel]
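
To make the imbalance concrete, here is a minimal userspace sketch of the
reference counting described above. It is not kernel code: the types and
helpers (struct task, struct restrack_entry, restrack_attach_task,
restrack_del, leaked_refs) are hypothetical stand-ins that only model the
get/put pairing, and the skip_mr_qp flag is an illustrative switch that
reproduces the early return removed by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum restrack_type { RESTRACK_PD, RESTRACK_MR, RESTRACK_QP };

struct task {
	int usage;		/* models the task_struct reference count */
};

struct restrack_entry {
	enum restrack_type type;
	struct task *task;	/* owner recorded at creation time */
	bool valid;
};

static void get_task(struct task *t)
{
	t->usage++;
}

static void put_task(struct task *t)
{
	assert(t->usage > 0);
	t->usage--;
}

/* Models rdma_restrack_attach_task(): pin the creating task. */
static void restrack_attach_task(struct restrack_entry *res, struct task *t)
{
	get_task(t);
	res->task = t;
	res->valid = true;
}

/*
 * Models the tail of rdma_restrack_del(). With skip_mr_qp set, the
 * function returns early for MR/QP, as the code did before this patch.
 */
static void restrack_del(struct restrack_entry *res, bool skip_mr_qp)
{
	if (skip_mr_qp &&
	    (res->type == RESTRACK_MR || res->type == RESTRACK_QP))
		return;		/* reference taken at attach is never dropped */

	res->valid = false;
	if (res->task) {
		put_task(res->task);
		res->task = NULL;
	}
}

/* Create and delete one entry; return how many task references leaked. */
static int leaked_refs(enum restrack_type type, bool skip_mr_qp)
{
	struct task t = { .usage = 1 };	/* the task's own reference */
	struct restrack_entry res = { .type = type };

	restrack_attach_task(&res, &t);
	restrack_del(&res, skip_mr_qp);
	return t.usage - 1;
}
```

In this model, leaked_refs(RESTRACK_MR, true) and leaked_refs(RESTRACK_QP, true)
each leave one reference outstanding, while calling with skip_mr_qp set to
false balances every get with a put.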

>
>thanks,
>
>greg k-h

Patch

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index bbbbec5b1593..d5a69c4a1891 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -326,8 +326,6 @@  void rdma_restrack_del(struct rdma_restrack_entry *res)
 	rt = &dev->res[res->type];

 	old = xa_erase(&rt->xa, res->id);
-	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
-		return;
 	WARN_ON(old != res);
 	res->valid = false;