
[v2,for-rc] RDMA/srpt: Fix UAF when srpt_add_one() failed

Message ID 20240801123253.2908831-1-huangjunxian6@hisilicon.com (mailing list archive)
State New, archived
Series [v2,for-rc] RDMA/srpt: Fix UAF when srpt_add_one() failed

Commit Message

Junxian Huang Aug. 1, 2024, 12:32 p.m. UTC
Currently cancel_work_sync() is not called when srpt_refresh_port()
fails in srpt_add_one(). As a result, sdev may be freed while the
previously queued sport->work is still running, leading to a UAF as
shown in the log below:

[  T880] ib_srpt MAD registration failed for hns_1-1.
[  T880] ib_srpt srpt_add_one(hns_1) failed.
[  T376] Unable to handle kernel paging request at virtual address 0000000000010008
...
[  T376] Workqueue: events srpt_refresh_port_work [ib_srpt]
...
[  T376] Call trace:
[  T376]  srpt_refresh_port+0x94/0x264 [ib_srpt]
[  T376]  srpt_refresh_port_work+0x1c/0x2c [ib_srpt]
[  T376]  process_one_work+0x1d8/0x4cc
[  T376]  worker_thread+0x158/0x410
[  T376]  kthread+0x108/0x13c
[  T376]  ret_from_fork+0x10/0x18

Add cancel_work_sync() to the error path to fix this UAF.
In addition, swap the order of INIT_WORK() and srpt_refresh_port()
in srpt_add_one(), so that when srpt_refresh_port() fails, there
is no need to cancel the work in this iteration.

Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Bart Van Assche Aug. 1, 2024, 4:20 p.m. UTC | #1
On 8/1/24 5:32 AM, Junxian Huang wrote:
> Besides, exchange the order of INIT_WORK() and srpt_refresh_port()
> in srpt_add_one(), so that when srpt_refresh_port() failed, there
> is no need to cancel the work in this iteration.

The above description is wrong. There is no need to cancel work after
INIT_WORK() has been called if the work has never been queued. Hence,
moving the INIT_WORK() call is not necessary.
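
To illustrate the point above, here is a minimal userspace sketch, not kernel code. All model_* names are hypothetical stand-ins for the workqueue API, and the model only tracks the "pending" state of a work item. It assumes, as in the kernel, that cancelling a work item reports whether it was pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct work_struct. */
struct model_work {
	void (*func)(struct model_work *work);
	bool pending;	/* true only between queueing and cancel */
};

/* Models INIT_WORK(): sets the callback, leaves the item idle. */
static void model_init_work(struct model_work *w,
			    void (*fn)(struct model_work *))
{
	w->func = fn;
	w->pending = false;
}

/* Models queue_work()/schedule_work(): false if already pending. */
static bool model_queue_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models cancel_work_sync(): returns whether the item was pending. */
static bool model_cancel_work_sync(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Placeholder callback; it is never run in this model. */
static void model_noop(struct model_work *w)
{
	(void)w;
}

/* Initialized but never queued: cancelling finds nothing to cancel. */
static bool model_cancel_unqueued(void)
{
	struct model_work w;

	model_init_work(&w, model_noop);
	assert(w.func != NULL);
	return model_cancel_work_sync(&w);
}
```

In this model, model_cancel_unqueued() finds nothing pending, which is why the position of the initialization call relative to the failing step does not matter as long as nothing has queued the work yet.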

> @@ -3220,7 +3221,6 @@ static int srpt_add_one(struct ib_device *device)
>   		sport->port_attrib.srp_max_rsp_size = DEFAULT_MAX_RSP_SIZE;
>   		sport->port_attrib.srp_sq_size = DEF_SRPT_SQ_SIZE;
>   		sport->port_attrib.use_srq = false;
> -		INIT_WORK(&sport->work, srpt_refresh_port_work);
>   
>   		ret = srpt_refresh_port(sport);
>   		if (ret) {
> @@ -3229,6 +3229,8 @@ static int srpt_add_one(struct ib_device *device)
>   			i--;
>   			goto err_port;
>   		}
> +
> +		INIT_WORK(&sport->work, srpt_refresh_port_work);
>   	}

I don't think that this change is necessary.

Bart.
Junxian Huang Aug. 2, 2024, 2:09 a.m. UTC | #2
On 2024/8/2 0:20, Bart Van Assche wrote:
> On 8/1/24 5:32 AM, Junxian Huang wrote:
>> Besides, exchange the order of INIT_WORK() and srpt_refresh_port()
>> in srpt_add_one(), so that when srpt_refresh_port() failed, there
>> is no need to cancel the work in this iteration.
> 
> The above description is wrong. There is no need to cancel work after
> INIT_WORK() has been called if the work has never been queued. Hence,
> moving the INIT_WORK() call is not necessary.
> 

Well, prompted by your comment, I looked into the code again and I think
this whole patch may not be necessary.

I encountered this problem in the 5.10 kernel, where ib_register_event_handler()
was called before the for-loop. That bug has already been fixed in current
mainline, where the work won't be queued until the whole for-loop has finished.
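
A rough userspace sketch of the ordering difference described above, under simplifying assumptions: all model_* names are hypothetical, a port event is assumed to arrive during every loop iteration, and "pending" is just a flag rather than a real workqueue entry.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PORTS 2

/* Hypothetical stand-in for the per-device state. */
struct model_dev {
	bool handler_registered;
	bool work_pending[MODEL_PORTS];
};

/* Models an asynchronous port event: work is queued only if a
 * handler has been registered. */
static void model_port_event(struct model_dev *d, int port)
{
	if (d->handler_registered)
		d->work_pending[port] = true;
}

/*
 * Models the per-port setup loop failing at port fail_at. Returns the
 * number of work items still pending when the error path would free
 * the device without cancelling them.
 */
static int model_add_one(bool register_before_loop, int fail_at)
{
	struct model_dev d = { 0 };
	int i, pending = 0;

	if (register_before_loop)
		d.handler_registered = true;	/* order described for 5.10 */

	for (i = 0; i < MODEL_PORTS; i++) {
		model_port_event(&d, i);	/* an event races with setup */
		if (i == fail_at)
			goto err;		/* models a failing refresh */
	}

	d.handler_registered = true;	/* order described for mainline */
	return 0;

err:
	for (i = 0; i < MODEL_PORTS; i++)
		pending += d.work_pending[i];
	return pending;
}
```

With the handler registered before the loop, model_add_one(true, 1) leaves work pending on the error path. With registration after the loop, model_add_one(false, 1) leaves none, which matches the observation that the failure path has nothing to cancel.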

Thanks,
Junxian

>> @@ -3220,7 +3221,6 @@ static int srpt_add_one(struct ib_device *device)
>>           sport->port_attrib.srp_max_rsp_size = DEFAULT_MAX_RSP_SIZE;
>>           sport->port_attrib.srp_sq_size = DEF_SRPT_SQ_SIZE;
>>           sport->port_attrib.use_srq = false;
>> -        INIT_WORK(&sport->work, srpt_refresh_port_work);
>>
>>           ret = srpt_refresh_port(sport);
>>           if (ret) {
>> @@ -3229,6 +3229,8 @@ static int srpt_add_one(struct ib_device *device)
>>               i--;
>>               goto err_port;
>>           }
>> +
>> +        INIT_WORK(&sport->work, srpt_refresh_port_work);
>>       }
> 
> I don't think that this change is necessary.
> 
> Bart.
>

Patch

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 9632afbd727b..7def231da21a 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -648,6 +648,7 @@  static void srpt_unregister_mad_agent(struct srpt_device *sdev, int port_cnt)
 			ib_unregister_mad_agent(sport->mad_agent);
 			sport->mad_agent = NULL;
 		}
+		cancel_work_sync(&sport->work);
 	}
 }
 
@@ -3220,7 +3221,6 @@  static int srpt_add_one(struct ib_device *device)
 		sport->port_attrib.srp_max_rsp_size = DEFAULT_MAX_RSP_SIZE;
 		sport->port_attrib.srp_sq_size = DEF_SRPT_SQ_SIZE;
 		sport->port_attrib.use_srq = false;
-		INIT_WORK(&sport->work, srpt_refresh_port_work);
 
 		ret = srpt_refresh_port(sport);
 		if (ret) {
@@ -3229,6 +3229,8 @@  static int srpt_add_one(struct ib_device *device)
 			i--;
 			goto err_port;
 		}
+
+		INIT_WORK(&sport->work, srpt_refresh_port_work);
 	}
 
 	ib_register_event_handler(&sdev->event_handler);
@@ -3264,13 +3266,9 @@  static void srpt_remove_one(struct ib_device *device, void *client_data)
 	struct srpt_device *sdev = client_data;
 	int i;
 
-	srpt_unregister_mad_agent(sdev, sdev->device->phys_port_cnt);
-
 	ib_unregister_event_handler(&sdev->event_handler);
 
-	/* Cancel any work queued by the just unregistered IB event handler. */
-	for (i = 0; i < sdev->device->phys_port_cnt; i++)
-		cancel_work_sync(&sdev->port[i].work);
+	srpt_unregister_mad_agent(sdev, sdev->device->phys_port_cnt);
 
 	if (sdev->cm_id)
 		ib_destroy_cm_id(sdev->cm_id);