diff mbox series

[PATCHv2,1/1] RDMA/rxe: Fix BUG: KASAN: null-ptr-deref in rxe_qp_do_cleanup

Message ID 20220705225414.315478-1-yanjun.zhu@linux.dev (mailing list archive)
State Accepted
Headers show
Series [PATCHv2,1/1] RDMA/rxe: Fix BUG: KASAN: null-ptr-deref in rxe_qp_do_cleanup | expand

Commit Message

Zhu Yanjun July 5, 2022, 10:54 p.m. UTC
From: Zhu Yanjun <yanjun.zhu@linux.dev>

The function rxe_create_qp calls rxe_qp_from_init. If some error
occurs, the error handler of function rxe_qp_from_init will set
both scq and rcq to NULL.

Then rxe_create_qp calls rxe_put to handle qp. In the end,
rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
accesses scq and rcq before checking them. This will cause 
null-ptr-deref error.

The call graph is as below:

rxe_create_qp {
  ...
  rxe_qp_from_init {
    ...
  err1:
    ...
    qp->rcq = NULL;  <---rcq is set to NULL
    qp->scq = NULL;  <---scq is set to NULL
    ...
  }
        
qp_init:
  rxe_put{
    ...
    rxe_qp_do_cleanup {
      ...
      atomic_dec(&qp->scq->num_wq); <--- scq is accessed
      ...
      atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
    }
}

Fixes: 4703b4f0d94a ("RDMA/rxe: Enforce IBA C11-17")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
V1->V2: Describe the error flows.
---
 drivers/infiniband/sw/rxe/rxe_qp.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Haris Iqbal July 5, 2022, 10:35 a.m. UTC | #1
On Tue, Jul 5, 2022 at 8:28 AM <yanjun.zhu@linux.dev> wrote:
>
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>
> The function rxe_create_qp calls rxe_qp_from_init. If some error
> occurs, the error handler of function rxe_qp_from_init will set
> both scq and rcq to NULL.
>
> Then rxe_create_qp calls rxe_put to handle qp. In the end,
> rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
> accesses scq and rcq before checking them. This will cause
> null-ptr-deref error.
>
> The call graph is as below:
>
> rxe_create_qp {
>   ...
>   rxe_qp_from_init {
>     ...
>   err1:
>     ...
>     qp->rcq = NULL;  <---rcq is set to NULL
>     qp->scq = NULL;  <---scq is set to NULL
>     ...
>   }
>
> qp_init:
>   rxe_put{
>     ...
>     rxe_qp_do_cleanup {
>       ...
>       atomic_dec(&qp->scq->num_wq); <--- scq is accessed
>       ...
>       atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
>     }
> }
>
> Fixes: 4703b4f0d94a ("RDMA/rxe: Enforce IBA C11-17")
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> ---
> V1->V2: Describe the error flows.
> ---
>  drivers/infiniband/sw/rxe/rxe_qp.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> index 22e9b85344c3..b79e1b43454e 100644
> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> @@ -804,13 +804,15 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
>         if (qp->rq.queue)
>                 rxe_queue_cleanup(qp->rq.queue);
>
> -       atomic_dec(&qp->scq->num_wq);
> -       if (qp->scq)
> +       if (qp->scq) {
> +               atomic_dec(&qp->scq->num_wq);
>                 rxe_put(qp->scq);
> +       }
>
> -       atomic_dec(&qp->rcq->num_wq);
> -       if (qp->rcq)
> +       if (qp->rcq) {
> +               atomic_dec(&qp->rcq->num_wq);
>                 rxe_put(qp->rcq);
> +       }
>
>         if (qp->pd)
>                 rxe_put(qp->pd);
> --
> 2.34.1

Looks good.

Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>

Should the check for "rxe_cleanup_task(&qp->comp.task);" also happen
in this commit? or would it be covered in a different one?

>
Zhu Yanjun July 6, 2022, 1:10 p.m. UTC | #2
在 2022/7/5 18:35, Haris Iqbal 写道:
> On Tue, Jul 5, 2022 at 8:28 AM <yanjun.zhu@linux.dev> wrote:
>>
>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>
>> The function rxe_create_qp calls rxe_qp_from_init. If some error
>> occurs, the error handler of function rxe_qp_from_init will set
>> both scq and rcq to NULL.
>>
>> Then rxe_create_qp calls rxe_put to handle qp. In the end,
>> rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
>> accesses scq and rcq before checking them. This will cause
>> null-ptr-deref error.
>>
>> The call graph is as below:
>>
>> rxe_create_qp {
>>    ...
>>    rxe_qp_from_init {
>>      ...
>>    err1:
>>      ...
>>      qp->rcq = NULL;  <---rcq is set to NULL
>>      qp->scq = NULL;  <---scq is set to NULL
>>      ...
>>    }
>>
>> qp_init:
>>    rxe_put{
>>      ...
>>      rxe_qp_do_cleanup {
>>        ...
>>        atomic_dec(&qp->scq->num_wq); <--- scq is accessed
>>        ...
>>        atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
>>      }
>> }
>>
>> Fixes: 4703b4f0d94a ("RDMA/rxe: Enforce IBA C11-17")
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>> ---
>> V1->V2: Describe the error flows.
>> ---
>>   drivers/infiniband/sw/rxe/rxe_qp.c | 10 ++++++----
>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
>> index 22e9b85344c3..b79e1b43454e 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
>> @@ -804,13 +804,15 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
>>          if (qp->rq.queue)
>>                  rxe_queue_cleanup(qp->rq.queue);
>>
>> -       atomic_dec(&qp->scq->num_wq);
>> -       if (qp->scq)
>> +       if (qp->scq) {
>> +               atomic_dec(&qp->scq->num_wq);
>>                  rxe_put(qp->scq);
>> +       }
>>
>> -       atomic_dec(&qp->rcq->num_wq);
>> -       if (qp->rcq)
>> +       if (qp->rcq) {
>> +               atomic_dec(&qp->rcq->num_wq);
>>                  rxe_put(qp->rcq);
>> +       }
>>
>>          if (qp->pd)
>>                  rxe_put(qp->pd);
>> --
>> 2.34.1
> 
> Looks good.
> 
> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> 
> Should the check for "rxe_cleanup_task(&qp->comp.task);" also happen
> in this commit? or would it be covered in a different one?

It is in a different commit. I will send it out very soon.

Zhu Yanjun

> 
>>
Haris Iqbal July 6, 2022, 1:11 p.m. UTC | #3
On Wed, Jul 6, 2022 at 3:10 PM Yanjun Zhu <yanjun.zhu@linux.dev> wrote:
>
> 在 2022/7/5 18:35, Haris Iqbal 写道:
> > On Tue, Jul 5, 2022 at 8:28 AM <yanjun.zhu@linux.dev> wrote:
> >>
> >> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> >>
> >> The function rxe_create_qp calls rxe_qp_from_init. If some error
> >> occurs, the error handler of function rxe_qp_from_init will set
> >> both scq and rcq to NULL.
> >>
> >> Then rxe_create_qp calls rxe_put to handle qp. In the end,
> >> rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
> >> accesses scq and rcq before checking them. This will cause
> >> null-ptr-deref error.
> >>
> >> The call graph is as below:
> >>
> >> rxe_create_qp {
> >>    ...
> >>    rxe_qp_from_init {
> >>      ...
> >>    err1:
> >>      ...
> >>      qp->rcq = NULL;  <---rcq is set to NULL
> >>      qp->scq = NULL;  <---scq is set to NULL
> >>      ...
> >>    }
> >>
> >> qp_init:
> >>    rxe_put{
> >>      ...
> >>      rxe_qp_do_cleanup {
> >>        ...
> >>        atomic_dec(&qp->scq->num_wq); <--- scq is accessed
> >>        ...
> >>        atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
> >>      }
> >> }
> >>
> >> Fixes: 4703b4f0d94a ("RDMA/rxe: Enforce IBA C11-17")
> >> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> >> ---
> >> V1->V2: Describe the error flows.
> >> ---
> >>   drivers/infiniband/sw/rxe/rxe_qp.c | 10 ++++++----
> >>   1 file changed, 6 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> >> index 22e9b85344c3..b79e1b43454e 100644
> >> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> >> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> >> @@ -804,13 +804,15 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
> >>          if (qp->rq.queue)
> >>                  rxe_queue_cleanup(qp->rq.queue);
> >>
> >> -       atomic_dec(&qp->scq->num_wq);
> >> -       if (qp->scq)
> >> +       if (qp->scq) {
> >> +               atomic_dec(&qp->scq->num_wq);
> >>                  rxe_put(qp->scq);
> >> +       }
> >>
> >> -       atomic_dec(&qp->rcq->num_wq);
> >> -       if (qp->rcq)
> >> +       if (qp->rcq) {
> >> +               atomic_dec(&qp->rcq->num_wq);
> >>                  rxe_put(qp->rcq);
> >> +       }
> >>
> >>          if (qp->pd)
> >>                  rxe_put(qp->pd);
> >> --
> >> 2.34.1
> >
> > Looks good.
> >
> > Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> >
> > Should the check for "rxe_cleanup_task(&qp->comp.task);" also happen
> > in this commit? or would it be covered in a different one?
>
> It is in a different commit. I will send it out very soon.

Okay. Thank you!

>
> Zhu Yanjun
>
> >
> >>
>
Bob Pearson July 14, 2022, 4:57 p.m. UTC | #4
On 7/6/22 08:11, Haris Iqbal wrote:
> On Wed, Jul 6, 2022 at 3:10 PM Yanjun Zhu <yanjun.zhu@linux.dev> wrote:
>>
>> 在 2022/7/5 18:35, Haris Iqbal 写道:
>>> On Tue, Jul 5, 2022 at 8:28 AM <yanjun.zhu@linux.dev> wrote:
>>>>
>>>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>>>
>>>> The function rxe_create_qp calls rxe_qp_from_init. If some error
>>>> occurs, the error handler of function rxe_qp_from_init will set
>>>> both scq and rcq to NULL.
>>>>
>>>> Then rxe_create_qp calls rxe_put to handle qp. In the end,
>>>> rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
>>>> accesses scq and rcq before checking them. This will cause
>>>> null-ptr-deref error.
>>>>
>>>> The call graph is as below:
>>>>
>>>> rxe_create_qp {
>>>>    ...
>>>>    rxe_qp_from_init {
>>>>      ...
>>>>    err1:
>>>>      ...
>>>>      qp->rcq = NULL;  <---rcq is set to NULL
>>>>      qp->scq = NULL;  <---scq is set to NULL
>>>>      ...
>>>>    }
>>>>
>>>> qp_init:
>>>>    rxe_put{
>>>>      ...
>>>>      rxe_qp_do_cleanup {
>>>>        ...
>>>>        atomic_dec(&qp->scq->num_wq); <--- scq is accessed
>>>>        ...
>>>>        atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
>>>>      }
>>>> }
>>>>
>>>> Fixes: 4703b4f0d94a ("RDMA/rxe: Enforce IBA C11-17")
>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>>>> ---
>>>> V1->V2: Describe the error flows.
>>>> ---
>>>>   drivers/infiniband/sw/rxe/rxe_qp.c | 10 ++++++----
>>>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
>>>> index 22e9b85344c3..b79e1b43454e 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
>>>> @@ -804,13 +804,15 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
>>>>          if (qp->rq.queue)
>>>>                  rxe_queue_cleanup(qp->rq.queue);
>>>>
>>>> -       atomic_dec(&qp->scq->num_wq);
>>>> -       if (qp->scq)
>>>> +       if (qp->scq) {
>>>> +               atomic_dec(&qp->scq->num_wq);
>>>>                  rxe_put(qp->scq);
>>>> +       }
>>>>
>>>> -       atomic_dec(&qp->rcq->num_wq);
>>>> -       if (qp->rcq)
>>>> +       if (qp->rcq) {
>>>> +               atomic_dec(&qp->rcq->num_wq);
>>>>                  rxe_put(qp->rcq);
>>>> +       }
>>>>
>>>>          if (qp->pd)
>>>>                  rxe_put(qp->pd);
>>>> --
>>>> 2.34.1
>>>
>>> Looks good.
>>>
>>> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
>>>
>>> Should the check for "rxe_cleanup_task(&qp->comp.task);" also happen
>>> in this commit? or would it be covered in a different one?
>>
>> It is in a different commit. I will send it out very soon.
> 
> Okay. Thank you!
> 
>>
>> Zhu Yanjun
>>
>>>
>>>>
>>

Agreed.

Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Leon Romanovsky July 18, 2022, 11:33 a.m. UTC | #5
On Tue, Jul 05, 2022 at 06:54:14PM -0400, yanjun.zhu@linux.dev wrote:
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> The function rxe_create_qp calls rxe_qp_from_init. If some error
> occurs, the error handler of function rxe_qp_from_init will set
> both scq and rcq to NULL.
> 
> Then rxe_create_qp calls rxe_put to handle qp. In the end,
> rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly
> accesses scq and rcq before checking them. This will cause 
> null-ptr-deref error.
> 
> The call graph is as below:
> 
> rxe_create_qp {
>   ...
>   rxe_qp_from_init {
>     ...
>   err1:
>     ...
>     qp->rcq = NULL;  <---rcq is set to NULL
>     qp->scq = NULL;  <---scq is set to NULL
>     ...
>   }
>         
> qp_init:
>   rxe_put{
>     ...
>     rxe_qp_do_cleanup {
>       ...
>       atomic_dec(&qp->scq->num_wq); <--- scq is accessed
>       ...
>       atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed
>     }
> }
> 
> Fixes: 4703b4f0d94a ("RDMA/rxe: Enforce IBA C11-17")
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> ---
> V1->V2: Describe the error flows.
> ---
>  drivers/infiniband/sw/rxe/rxe_qp.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 

Thanks, applied.
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 22e9b85344c3..b79e1b43454e 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -804,13 +804,15 @@  static void rxe_qp_do_cleanup(struct work_struct *work)
 	if (qp->rq.queue)
 		rxe_queue_cleanup(qp->rq.queue);
 
-	atomic_dec(&qp->scq->num_wq);
-	if (qp->scq)
+	if (qp->scq) {
+		atomic_dec(&qp->scq->num_wq);
 		rxe_put(qp->scq);
+	}
 
-	atomic_dec(&qp->rcq->num_wq);
-	if (qp->rcq)
+	if (qp->rcq) {
+		atomic_dec(&qp->rcq->num_wq);
 		rxe_put(qp->rcq);
+	}
 
 	if (qp->pd)
 		rxe_put(qp->pd);