diff mbox

drivers/scsi/qla2xxx: fix double free bug after firmware timeout

Message ID 151603716189.28707.13105191089826357709.stgit@rabbit.intern.cm-ag (mailing list archive)
State Changes Requested
Headers show

Commit Message

Max Kellermann Jan. 15, 2018, 5:26 p.m. UTC
When the qla2xxx firmware is unavailable, eventually
qla2x00_sp_timeout() is reached, which calls the timeout function and
frees the srb_t instance.

The timeout function always resolves to qla2x00_async_iocb_timeout(),
which invokes another callback function called "done".  All of these
qla2x00_*_sp_done() callbacks also free the srb_t instance; after
returning to qla2x00_sp_timeout(), it is freed again.

The fix is to remove the "sp->free(sp)" call from qla2x00_sp_timeout()
and add it to those code paths in qla2x00_async_iocb_timeout() which
do not already free the object.

This is how it looks like with KASAN:

 BUG: KASAN: use-after-free in qla2x00_sp_timeout+0x228/0x250
 Read of size 8 at addr ffff88278147a590 by task swapper/2/0

 Allocated by task 1502:
  save_stack+0x33/0xa0
  kasan_kmalloc+0xa0/0xd0
  kmem_cache_alloc+0xb8/0x1c0
  mempool_alloc+0xd6/0x260
  qla24xx_async_gnl+0x3c5/0x1100

 Freed by task 0:
  save_stack+0x33/0xa0
  kasan_slab_free+0x72/0xc0
  kmem_cache_free+0x75/0x200
  qla24xx_async_gnl_sp_done+0x556/0x9e0
  qla2x00_async_iocb_timeout+0x1c7/0x420
  qla2x00_sp_timeout+0x16d/0x250
  call_timer_fn+0x36/0x200

 The buggy address belongs to the object at ffff88278147a440
  which belongs to the cache qla2xxx_srbs of size 344
 The buggy address is located 336 bytes inside of
  344-byte region [ffff88278147a440, ffff88278147a598)

Signed-off-by: Max Kellermann <mk@cm4all.com>
---
 drivers/scsi/qla2xxx/qla_init.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Madhani, Himanshu Jan. 15, 2018, 7:58 p.m. UTC | #1
Hi Max, 

> On Jan 15, 2018, at 9:26 AM, Max Kellermann <mk@cm4all.com> wrote:
> 
> When the qla2xxx firmware is unavailable, eventually
> qla2x00_sp_timeout() is reached, which calls the timeout function and
> frees the srb_t instance.
> 
> The timeout function always resolves to qla2x00_async_iocb_timeout(),
> which invokes another callback function called "done".  All of these
> qla2x00_*_sp_done() callbacks also free the srb_t instance; after
> returning to qla2x00_sp_timeout(), it is freed again.
> 
> The fix is to remove the "sp->free(sp)" call from qla2x00_sp_timeout()
> and add it to those code paths in qla2x00_async_iocb_timeout() which
> do not already free the object.
> 
> This is how it looks like with KASAN:
> 
> BUG: KASAN: use-after-free in qla2x00_sp_timeout+0x228/0x250
> Read of size 8 at addr ffff88278147a590 by task swapper/2/0
> 
> Allocated by task 1502:
>  save_stack+0x33/0xa0
>  kasan_kmalloc+0xa0/0xd0
>  kmem_cache_alloc+0xb8/0x1c0
>  mempool_alloc+0xd6/0x260
>  qla24xx_async_gnl+0x3c5/0x1100
> 
> Freed by task 0:
>  save_stack+0x33/0xa0
>  kasan_slab_free+0x72/0xc0
>  kmem_cache_free+0x75/0x200
>  qla24xx_async_gnl_sp_done+0x556/0x9e0
>  qla2x00_async_iocb_timeout+0x1c7/0x420
>  qla2x00_sp_timeout+0x16d/0x250
>  call_timer_fn+0x36/0x200
> 
> The buggy address belongs to the object at ffff88278147a440
>  which belongs to the cache qla2xxx_srbs of size 344
> The buggy address is located 336 bytes inside of
>  344-byte region [ffff88278147a440, ffff88278147a598)
> 
> Signed-off-by: Max Kellermann <mk@cm4all.com>
> ---
> drivers/scsi/qla2xxx/qla_init.c |    3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
> index b5b48ddca962..801890564e00 100644
> --- a/drivers/scsi/qla2xxx/qla_init.c
> +++ b/drivers/scsi/qla2xxx/qla_init.c
> @@ -58,7 +58,6 @@ qla2x00_sp_timeout(unsigned long __data)
> 	req->outstanding_cmds[sp->handle] = NULL;
> 	iocb = &sp->u.iocb_cmd;
> 	iocb->timeout(sp);
> -	sp->free(sp);
> 	spin_unlock_irqrestore(&vha->hw->hardware_lock, flags);
> }
> 
> @@ -121,9 +120,11 @@ qla2x00_async_iocb_timeout(void *data)
> 		ea.data[1] = lio->u.logio.data[1];
> 		ea.sp = sp;
> 		qla24xx_handle_plogi_done_event(fcport->vha, &ea);
> +		sp->free(sp);
> 		break;
> 	case SRB_LOGOUT_CMD:
> 		qlt_logo_completion_handler(fcport, QLA_FUNCTION_TIMEOUT);
> +		sp->free(sp);
> 		break;
> 	case SRB_CT_PTHRU_CMD:
> 	case SRB_MB_IOCB:
> 

We have patch to prevent this double free in 4.16/scsi-queue already. 

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.16/scsi-queue&id=045d6ea200af794ba15515984cff63787a7fc3c0

Thanks,
- Himanshu
Max Kellermann Jan. 15, 2018, 8:37 p.m. UTC | #2
On 2018/01/15 20:58, "Madhani, Himanshu" <Himanshu.Madhani@cavium.com> wrote:
> We have patch to prevent this double free in 4.16/scsi-queue
> already.

No, let me repeat: this is a different bug!

Your bug is about the free call after waiting for completion
synchronously in qla24xx_els_dcmd_iocb(), after it was already freed
by qla2x00_sp_timeout().

My bug is about free in qla2x00_*_sp_done() and again in
qla2x00_sp_timeout().  My patch description describes exactly that.

And you know what?  My patch fixes both bugs.  It is superior to the
one that was merged 4 weeks later, isn't it?

You NACKed my patch 5 weeks ago, and I explained to you that you were
talking about a different bug, but you never replied to that.

Max
Madhani, Himanshu Jan. 15, 2018, 9:03 p.m. UTC | #3
Hi Max, 

> On Jan 15, 2018, at 12:37 PM, Max Kellermann <mk@cm4all.com> wrote:

> 

> On 2018/01/15 20:58, "Madhani, Himanshu" <Himanshu.Madhani@cavium.com> wrote:

>> We have patch to prevent this double free in 4.16/scsi-queue

>> already.

> 

> No, let me repeat: this is a different bug!

> 

> Your bug is about the free call after waiting for completion

> synchronously in qla24xx_els_dcmd_iocb(), after it was already freed

> by qla2x00_sp_timeout().

> 

> My bug is about free in qla2x00_*_sp_done() and again in

> qla2x00_sp_timeout().  My patch description describes exactly that.

> 

> And you know what?  My patch fixes both bugs.  It is superior to the

> one that was merged 4 weeks later, isn't it?

> 

> You NACKed my patch 5 weeks ago, and I explained to you that you were

> talking about a different bug, but you never replied to that.

> 

> Max


Thanks for refreshing my memory. Sorry about not getting back to you earlier. 
I was out for couple weeks so it got dropped out of my bucket for response.

I do agree there is double free issue here. I’ll discuss this internally with folks who has 
deeper history of this code and will get back to you and see which approach we want to
take. 

Thanks,
- Himanshu
Madhani, Himanshu Jan. 18, 2018, 1:02 a.m. UTC | #4
Max,

> On Jan 15, 2018, at 12:37 PM, Max Kellermann <mk@cm4all.com> wrote:

> 

> On 2018/01/15 20:58, "Madhani, Himanshu" <Himanshu.Madhani@cavium.com> wrote:

>> We have patch to prevent this double free in 4.16/scsi-queue

>> already.

> 

> No, let me repeat: this is a different bug!

> 

> Your bug is about the free call after waiting for completion

> synchronously in qla24xx_els_dcmd_iocb(), after it was already freed

> by qla2x00_sp_timeout().

> 

> My bug is about free in qla2x00_*_sp_done() and again in

> qla2x00_sp_timeout().  My patch description describes exactly that.

> 

> And you know what?  My patch fixes both bugs.  It is superior to the

> one that was merged 4 weeks later, isn't it?

> 

> You NACKed my patch 5 weeks ago, and I explained to you that you were

> talking about a different bug, but you never replied to that.

> 

> Max


Just to update you, we are working on fixing this internally and will post
patch soon. 

I’ll add you are Reporter and please review and let us know your comments.

Thanks,
- Himanshu
diff mbox

Patch

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index b5b48ddca962..801890564e00 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -58,7 +58,6 @@  qla2x00_sp_timeout(unsigned long __data)
 	req->outstanding_cmds[sp->handle] = NULL;
 	iocb = &sp->u.iocb_cmd;
 	iocb->timeout(sp);
-	sp->free(sp);
 	spin_unlock_irqrestore(&vha->hw->hardware_lock, flags);
 }
 
@@ -121,9 +120,11 @@  qla2x00_async_iocb_timeout(void *data)
 		ea.data[1] = lio->u.logio.data[1];
 		ea.sp = sp;
 		qla24xx_handle_plogi_done_event(fcport->vha, &ea);
+		sp->free(sp);
 		break;
 	case SRB_LOGOUT_CMD:
 		qlt_logo_completion_handler(fcport, QLA_FUNCTION_TIMEOUT);
+		sp->free(sp);
 		break;
 	case SRB_CT_PTHRU_CMD:
 	case SRB_MB_IOCB: