diff mbox series

[v2,06/11] qla2xxx: complete command early within lock

Message ID 20240710171057.35066-7-njavali@marvell.com (mailing list archive)
State Accepted
Headers show
Series qla2xxx misc. bug fixes | expand

Commit Message

Nilesh Javali July 10, 2024, 5:10 p.m. UTC
From: Shreyas Deodhar <sdeodhar@marvell.com>

A crash was observed while performing NPIV and FW reset,

 BUG: kernel NULL pointer dereference, address: 000000000000001c
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 1 PREEMPT_RT SMP NOPTI
 RIP: 0010:dma_direct_unmap_sg+0x51/0x1e0
 RSP: 0018:ffffc90026f47b88 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000021 RCX: 0000000000000002
 RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff8881041130d0
 RBP: ffff8881041130d0 R08: 0000000000000000 R09: 0000000000000034
 R10: ffffc90026f47c48 R11: 0000000000000031 R12: 0000000000000000
 R13: 0000000000000000 R14: ffff8881565e4a20 R15: 0000000000000000
 FS: 00007f4c69ed3d00(0000) GS:ffff889faac80000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000001c CR3: 0000000288a50002 CR4: 00000000007706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
 <TASK>
 ? __die_body+0x1a/0x60
 ? page_fault_oops+0x16f/0x4a0
 ? do_user_addr_fault+0x174/0x7f0
 ? exc_page_fault+0x69/0x1a0
 ? asm_exc_page_fault+0x22/0x30
 ? dma_direct_unmap_sg+0x51/0x1e0
 ? preempt_count_sub+0x96/0xe0
 qla2xxx_qpair_sp_free_dma+0x29f/0x3b0 [qla2xxx]
 qla2xxx_qpair_sp_compl+0x60/0x80 [qla2xxx]
 __qla2x00_abort_all_cmds+0xa2/0x450 [qla2xxx]

The command completion was done early while aborting the
commands in driver unload path but outside lock to
avoid the WARN_ON condition of performing dma_free_attr
within the lock. However this caused race condition
while command completion via multiple paths causing system
crash.

Hence complete the command early in unload path
but within the lock to avoid race condition.

Fixes: 0367076b0817 ("scsi: qla2xxx: Perform lockless command completion in abort path")
Cc: stable@vger.kernel.org
Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
---
 drivers/scsi/qla2xxx/qla_os.c | 5 -----
 1 file changed, 5 deletions(-)

Comments

Himanshu Madhani July 10, 2024, 7:43 p.m. UTC | #1
On 7/10/24 10:10 AM, Nilesh Javali wrote:
> From: Shreyas Deodhar <sdeodhar@marvell.com>
> 
> A crash was observed while performing NPIV and FW reset,
> 
>   BUG: kernel NULL pointer dereference, address: 000000000000001c
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 1 PREEMPT_RT SMP NOPTI
>   RIP: 0010:dma_direct_unmap_sg+0x51/0x1e0
>   RSP: 0018:ffffc90026f47b88 EFLAGS: 00010246
>   RAX: 0000000000000000 RBX: 0000000000000021 RCX: 0000000000000002
>   RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff8881041130d0
>   RBP: ffff8881041130d0 R08: 0000000000000000 R09: 0000000000000034
>   R10: ffffc90026f47c48 R11: 0000000000000031 R12: 0000000000000000
>   R13: 0000000000000000 R14: ffff8881565e4a20 R15: 0000000000000000
>   FS: 00007f4c69ed3d00(0000) GS:ffff889faac80000(0000) knlGS:0000000000000000
>   CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 000000000000001c CR3: 0000000288a50002 CR4: 00000000007706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   PKRU: 55555554
>   Call Trace:
>   <TASK>
>   ? __die_body+0x1a/0x60
>   ? page_fault_oops+0x16f/0x4a0
>   ? do_user_addr_fault+0x174/0x7f0
>   ? exc_page_fault+0x69/0x1a0
>   ? asm_exc_page_fault+0x22/0x30
>   ? dma_direct_unmap_sg+0x51/0x1e0
>   ? preempt_count_sub+0x96/0xe0
>   qla2xxx_qpair_sp_free_dma+0x29f/0x3b0 [qla2xxx]
>   qla2xxx_qpair_sp_compl+0x60/0x80 [qla2xxx]
>   __qla2x00_abort_all_cmds+0xa2/0x450 [qla2xxx]
> 
> The command completion was done early while aborting the
> commands in driver unload path but outside lock to
> avoid the WARN_ON condition of performing dma_free_attr
> within the lock. However this caused race condition
> while command completion via multiple paths causing system
> crash.
> 
> Hence complete the command early in unload path
> but within the lock to avoid race condition.
> 
> Fixes: 0367076b0817 ("scsi: qla2xxx: Perform lockless command completion in abort path")
> Cc: stable@vger.kernel.org
> Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
> Signed-off-by: Nilesh Javali <njavali@marvell.com>
> ---
>   drivers/scsi/qla2xxx/qla_os.c | 5 -----
>   1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index 70f43eab3f01..e4056cddb727 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -1875,14 +1875,9 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
>   	for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
>   		sp = req->outstanding_cmds[cnt];
>   		if (sp) {
> -			/*
> -			 * perform lockless completion during driver unload
> -			 */
>   			if (qla2x00_chip_is_down(vha)) {
>   				req->outstanding_cmds[cnt] = NULL;
> -				spin_unlock_irqrestore(qp->qp_lock_ptr, flags);
>   				sp->done(sp, res);
> -				spin_lock_irqsave(qp->qp_lock_ptr, flags);
>   				continue;
>   			}
>   

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
diff mbox series

Patch

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 70f43eab3f01..e4056cddb727 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1875,14 +1875,9 @@  __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
 	for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
 		sp = req->outstanding_cmds[cnt];
 		if (sp) {
-			/*
-			 * perform lockless completion during driver unload
-			 */
 			if (qla2x00_chip_is_down(vha)) {
 				req->outstanding_cmds[cnt] = NULL;
-				spin_unlock_irqrestore(qp->qp_lock_ptr, flags);
 				sp->done(sp, res);
-				spin_lock_irqsave(qp->qp_lock_ptr, flags);
 				continue;
 			}