From patchwork Fri Jun 21 16:50:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Himanshu Madhani X-Patchwork-Id: 11010417 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 28E9F13AF for ; Fri, 21 Jun 2019 16:50:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0879428837 for ; Fri, 21 Jun 2019 16:50:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EAFEE288B9; Fri, 21 Jun 2019 16:50:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6035C28837 for ; Fri, 21 Jun 2019 16:50:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726192AbfFUQuc (ORCPT ); Fri, 21 Jun 2019 12:50:32 -0400 Received: from mx0a-0016f401.pphosted.com ([67.231.148.174]:12932 "EHLO mx0b-0016f401.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726017AbfFUQuc (ORCPT ); Fri, 21 Jun 2019 12:50:32 -0400 Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5LGnMdH007666; Fri, 21 Jun 2019 09:50:29 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0818; bh=VzIHvVHplbtxWvCXNv67xB2KpLL3aCCRBBaCHDxU3+U=; b=ysxIiyLUy6jHWm88EftTP9jQXXgs+JIrNPeYUQ+atQGQxAenzRb+lJ4BUyZnPSU5b6bH 
zoPjXMUIg2aajEByrJa9yzUIlM7Uc2N1QLYzq4ruttP9vufrrn0tN09GEMH9yXC403rL sNTURfNH6QRtrv43SSExPOzw/Mh13taP/YBxZtKH+gcR8sCJAXJ5HEdazO5mEafgDaMU At7zAngSXs68j0bghp2w4+WBKPbIHAcrI7mscG5T+lxs7m5MRzqLbv0bN7D6h5oQU96J sHzcPXKnTKP6c+CX+c3UKUXB5NVsbnhPE2/3+46UtoRwNk4sWqnpuMAxm5WSbjuquSfp mQ== Received: from sc-exch04.marvell.com ([199.233.58.184]) by mx0a-0016f401.pphosted.com with ESMTP id 2t8vu2hgx4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 21 Jun 2019 09:50:28 -0700 Received: from SC-EXCH03.marvell.com (10.93.176.83) by SC-EXCH04.marvell.com (10.93.176.84) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Fri, 21 Jun 2019 09:50:27 -0700 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH03.marvell.com (10.93.176.83) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Fri, 21 Jun 2019 09:50:27 -0700 Received: from dut1171.mv.qlogic.com (unknown [10.112.88.18]) by maili.marvell.com (Postfix) with ESMTP id A1FC33F7040; Fri, 21 Jun 2019 09:50:27 -0700 (PDT) Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1]) by dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x5LGoRYw023913; Fri, 21 Jun 2019 09:50:27 -0700 Received: (from root@localhost) by dut1171.mv.qlogic.com (8.14.7/8.14.7/Submit) id x5LGoRG8023912; Fri, 21 Jun 2019 09:50:27 -0700 From: Himanshu Madhani To: , CC: , Subject: [PATCH v3 1/3] qla2xxx: Fix kernel crash after disconnecting NVMe devices Date: Fri, 21 Jun 2019 09:50:22 -0700 Message-ID: <20190621165024.23874-2-hmadhani@marvell.com> X-Mailer: git-send-email 2.12.0 In-Reply-To: <20190621165024.23874-1-hmadhani@marvell.com> References: <20190621165024.23874-1-hmadhani@marvell.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-21_12:,, signatures=0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Arun Easi BUG: unable to handle kernel NULL 
pointer dereference at (null) IP: [] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx] PGD 800000084cf41067 PUD 84d288067 PMD 0 Oops: 0000 [#1] SMP Call Trace: [] process_one_work+0x17f/0x440 [] worker_thread+0x126/0x3c0 [] ? manage_workers.isra.26+0x2a0/0x2a0 [] kthread+0xd1/0xe0 [] ? insert_kthread_work+0x40/0x40 [] ret_from_fork_nospec_begin+0x21/0x21 [] ? insert_kthread_work+0x40/0x40 RIP [] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx] The crash is due to a bad entry in the nvme_rport_list. This list is not protected, and when a remoteport_delete callback is called, the driver traverses the list and crashes. In fact, the list could be removed and the driver could traverse the main fcport list instead. The fix does exactly that. Signed-off-by: Arun Easi Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_def.h | 1 - drivers/scsi/qla2xxx/qla_nvme.c | 37 ++++++++++--------------------------- drivers/scsi/qla2xxx/qla_nvme.h | 1 - drivers/scsi/qla2xxx/qla_os.c | 1 - 4 files changed, 10 insertions(+), 30 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 1a4095c56eee..602ed24bb806 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -4376,7 +4376,6 @@ typedef struct scsi_qla_host { struct nvme_fc_local_port *nvme_local_port; struct completion nvme_del_done; - struct list_head nvme_rport_list; uint16_t fcoe_vlan_id; uint16_t fcoe_fcf_idx; diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 22e3fba28e51..b43c62758cec 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -74,7 +74,6 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport) rport = fcport->nvme_remote_port->private; rport->fcport = fcport; - list_add_tail(&rport->list, &vha->nvme_rport_list); fcport->nvme_flag |= NVME_FLAG_REGISTERED; return 0; @@ -542,19 +541,12 @@ static void qla_nvme_localport_delete(struct nvme_fc_local_port 
*lport) static void qla_nvme_remoteport_delete(struct nvme_fc_remote_port *rport) { fc_port_t *fcport; - struct qla_nvme_rport *qla_rport = rport->private, *trport; + struct qla_nvme_rport *qla_rport = rport->private; fcport = qla_rport->fcport; fcport->nvme_remote_port = NULL; fcport->nvme_flag &= ~NVME_FLAG_REGISTERED; - list_for_each_entry_safe(qla_rport, trport, - &fcport->vha->nvme_rport_list, list) { - if (qla_rport->fcport == fcport) { - list_del(&qla_rport->list); - break; - } - } complete(&fcport->nvme_del_done); if (!test_bit(UNLOADING, &fcport->vha->dpc_flags)) { @@ -590,7 +582,7 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work) { struct fc_port *fcport = container_of(work, struct fc_port, nvme_del_work); - struct qla_nvme_rport *qla_rport, *trport; + int ret; if (!IS_ENABLED(CONFIG_NVME_FC)) return; @@ -598,23 +590,14 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work) ql_log(ql_log_warn, NULL, 0x2112, "%s: unregister remoteport on %p\n",__func__, fcport); - list_for_each_entry_safe(qla_rport, trport, - &fcport->vha->nvme_rport_list, list) { - if (qla_rport->fcport == fcport) { - ql_log(ql_log_info, fcport->vha, 0x2113, - "%s: fcport=%p\n", __func__, fcport); - nvme_fc_set_remoteport_devloss - (fcport->nvme_remote_port, 0); - init_completion(&fcport->nvme_del_done); - if (nvme_fc_unregister_remoteport - (fcport->nvme_remote_port)) - ql_log(ql_log_info, fcport->vha, 0x2114, - "%s: Failed to unregister nvme_remote_port\n", - __func__); - wait_for_completion(&fcport->nvme_del_done); - break; - } - } + nvme_fc_set_remoteport_devloss(fcport->nvme_remote_port, 0); + init_completion(&fcport->nvme_del_done); + ret = nvme_fc_unregister_remoteport(fcport->nvme_remote_port); + if (ret) + ql_log(ql_log_info, fcport->vha, 0x2114, + "%s: Failed to unregister nvme_remote_port (%d)\n", + __func__, ret); + wait_for_completion(&fcport->nvme_del_done); } void qla_nvme_delete(struct scsi_qla_host *vha) diff --git 
a/drivers/scsi/qla2xxx/qla_nvme.h b/drivers/scsi/qla2xxx/qla_nvme.h index d3b8a6440113..2d088add7011 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.h +++ b/drivers/scsi/qla2xxx/qla_nvme.h @@ -37,7 +37,6 @@ struct nvme_private { }; struct qla_nvme_rport { - struct list_head list; struct fc_port *fcport; }; diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 00fee5bf4de1..ae93ae2b6090 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -4789,7 +4789,6 @@ struct scsi_qla_host *qla2x00_create_host(struct scsi_host_template *sht, INIT_LIST_HEAD(&vha->plogi_ack_list); INIT_LIST_HEAD(&vha->qp_list); INIT_LIST_HEAD(&vha->gnl.fcports); - INIT_LIST_HEAD(&vha->nvme_rport_list); INIT_LIST_HEAD(&vha->gpnid_list); INIT_WORK(&vha->iocb_work, qla2x00_iocb_work_fn); From patchwork Fri Jun 21 16:50:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Himanshu Madhani X-Patchwork-Id: 11010421 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2B0FF14B6 for ; Fri, 21 Jun 2019 16:50:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A66928837 for ; Fri, 21 Jun 2019 16:50:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0EDEF288B9; Fri, 21 Jun 2019 16:50:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9FB6B28837 for ; Fri, 21 Jun 2019 16:50:37 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726200AbfFUQuh (ORCPT ); Fri, 21 Jun 2019 12:50:37 -0400 Received: from mx0a-0016f401.pphosted.com ([67.231.148.174]:50552 "EHLO mx0b-0016f401.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726194AbfFUQug (ORCPT ); Fri, 21 Jun 2019 12:50:36 -0400 Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5LGnLOF007652; Fri, 21 Jun 2019 09:50:34 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0818; bh=0pjYj5AMIqSTKHSgnex+H5/d27t+2SYjdfcWAeDJxI0=; b=VrFNIfSCCmCwJT9DdZoLtAOgiXygtCZ9bnbnNEmr6iXySif4FeitpryCIHR+5FECMJ6k PYbsQlAX0MqUioCu35L4WLpmux2ercizfzxZ4wwyC1EY/lFvBLnmXxcvFSQXHmhco3BS 81iUT64Swpfok9EK6ornKzBcN+pYvLdF9l40w/CcZEr2Q9/UC5BD69azAdhoOpboU9me yPKFSD1sCP7CmQCyQXYhDq24cU3dpkNZXCj6I8AkOy9NLVcSwZFpPlDydvPA0HXUGZz7 r5B+UImo8Ke8OTv4rshMrsQ01AI5Ps6j+6PCOK2GVMvTC2Ls4iVA8Xw4rU9SxYHORMWU uQ== Received: from sc-exch03.marvell.com ([199.233.58.183]) by mx0a-0016f401.pphosted.com with ESMTP id 2t8vu2hgwv-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 21 Jun 2019 09:50:34 -0700 Received: from SC-EXCH01.marvell.com (10.93.176.81) by SC-EXCH03.marvell.com (10.93.176.83) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Fri, 21 Jun 2019 09:50:31 -0700 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Fri, 21 Jun 2019 09:50:31 -0700 Received: from dut1171.mv.qlogic.com (unknown [10.112.88.18]) by maili.marvell.com (Postfix) with ESMTP id D15983F7040; Fri, 21 Jun 2019 09:50:30 -0700 (PDT) Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1]) by dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x5LGoULN023917; Fri, 21 Jun 2019 09:50:30 -0700 
Received: (from root@localhost) by dut1171.mv.qlogic.com (8.14.7/8.14.7/Submit) id x5LGoUd8023916; Fri, 21 Jun 2019 09:50:30 -0700 From: Himanshu Madhani To: , CC: , Subject: [PATCH v3 2/3] qla2xxx: on session delete return nvme cmd Date: Fri, 21 Jun 2019 09:50:23 -0700 Message-ID: <20190621165024.23874-3-hmadhani@marvell.com> X-Mailer: git-send-email 2.12.0 In-Reply-To: <20190621165024.23874-1-hmadhani@marvell.com> References: <20190621165024.23874-1-hmadhani@marvell.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-21_12:,, signatures=0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Quinn Tran - on session delete or chip reset, reject all NVME commands. - on NVME command submission error, free srb resource. Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_nvme.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index b43c62758cec..8b3cb0fd307e 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -239,8 +239,16 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, struct qla_hw_data *ha; srb_t *sp; + + if (!fcport || (fcport && fcport->deleted)) + return rval; + vha = fcport->vha; ha = vha->hw; + + if (!ha->flags.fw_started) + return rval; + /* Alloc SRB structure */ sp = qla2x00_get_sp(vha, fcport, GFP_ATOMIC); if (!sp) @@ -272,6 +280,7 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, "qla2x00_start_sp failed = %d\n", rval); atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); + sp->free(sp); return rval; } @@ -486,11 +495,11 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, fcport = qla_rport->fcport; - vha = fcport->vha; - - if (test_bit(ABORT_ISP_ACTIVE, &vha->dpc_flags)) + 
if (!qpair || !fcport || (qpair && !qpair->fw_started) || + (fcport && fcport->deleted)) return rval; + vha = fcport->vha; /* * If we know the dev is going away while the transport is still sending * IO's return busy back to stall the IO Q. This happens when the @@ -523,6 +532,7 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, "qla2x00_start_nvme_mq failed = %d\n", rval); atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); + sp->free(sp); } return rval; @@ -549,14 +559,13 @@ static void qla_nvme_remoteport_delete(struct nvme_fc_remote_port *rport) complete(&fcport->nvme_del_done); - if (!test_bit(UNLOADING, &fcport->vha->dpc_flags)) { - INIT_WORK(&fcport->free_work, qlt_free_session_done); - schedule_work(&fcport->free_work); - } + INIT_WORK(&fcport->free_work, qlt_free_session_done); + schedule_work(&fcport->free_work); fcport->nvme_flag &= ~NVME_FLAG_DELETING; ql_log(ql_log_info, fcport->vha, 0x2110, - "remoteport_delete of %p completed.\n", fcport); + "remoteport_delete of %p %8phN completed.\n", + fcport, fcport->port_name); } static struct nvme_fc_port_template qla_nvme_fc_transport = { @@ -588,7 +597,8 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work) return; ql_log(ql_log_warn, NULL, 0x2112, - "%s: unregister remoteport on %p\n",__func__, fcport); + "%s: unregister remoteport on %p %8phN\n", + __func__, fcport, fcport->port_name); nvme_fc_set_remoteport_devloss(fcport->nvme_remote_port, 0); init_completion(&fcport->nvme_del_done); From patchwork Fri Jun 21 16:50:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Himanshu Madhani X-Patchwork-Id: 11010423 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C64BA14B6 for ; Fri, 21 Jun 2019 16:50:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) 
by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B47A528837 for ; Fri, 21 Jun 2019 16:50:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A8C7C288B9; Fri, 21 Jun 2019 16:50:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C4BD4288B6 for ; Fri, 21 Jun 2019 16:50:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726194AbfFUQun (ORCPT ); Fri, 21 Jun 2019 12:50:43 -0400 Received: from mx0b-0016f401.pphosted.com ([67.231.156.173]:51452 "EHLO mx0b-0016f401.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726203AbfFUQun (ORCPT ); Fri, 21 Jun 2019 12:50:43 -0400 Received: from pps.filterd (m0045851.ppops.net [127.0.0.1]) by mx0b-0016f401.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x5LGf4F6004801; Fri, 21 Jun 2019 09:50:38 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0818; bh=pHFz50stXxGndvfGTMGo5WYdaMEciMBZLOe+8+O1cjI=; b=MtalnFsC40cjEZmGx7YUfkW9SXXrH014MQFrRDqd6TmxqGzTqpAbOjFRQi2ekz2xu86h 9dyQ2M7sZ4G0ILA6HwUAJ7/zSMMekZJEZqh+H8A0Q8TwXzoQD37Ioeneg6psW/pG4KeP V23/7qfKZQKz8eRqkZIulQFE+UWG1Oxb2GiDEeb2bUHSrPJGpSNiTzYbYTAKhWLKTq5m lv+ngRwCjUS9bDIX88s9I6rbmH9Qn8dei0cQsjvATHsFS8gxHub6tOYjgYpi1oBME9D1 3+8k00vVfJgknTMvRiRfS/ATA6CejMXi2E5ZAxxnIANU/8R0zQrJCloH+1LG2ABEy7H2 yA== Received: from sc-exch01.marvell.com ([199.233.58.181]) by mx0b-0016f401.pphosted.com with ESMTP id 2t8yp20tss-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT); Fri, 21 Jun 2019 09:50:37 -0700 
Received: from SC-EXCH03.marvell.com (10.93.176.83) by SC-EXCH01.marvell.com (10.93.176.81) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Fri, 21 Jun 2019 09:50:34 -0700 Received: from maili.marvell.com (10.93.176.43) by SC-EXCH03.marvell.com (10.93.176.83) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Fri, 21 Jun 2019 09:50:34 -0700 Received: from dut1171.mv.qlogic.com (unknown [10.112.88.18]) by maili.marvell.com (Postfix) with ESMTP id 14AAA3F703F; Fri, 21 Jun 2019 09:50:34 -0700 (PDT) Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1]) by dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x5LGoXX0023921; Fri, 21 Jun 2019 09:50:33 -0700 Received: (from root@localhost) by dut1171.mv.qlogic.com (8.14.7/8.14.7/Submit) id x5LGoXWj023920; Fri, 21 Jun 2019 09:50:33 -0700 From: Himanshu Madhani To: , CC: , Subject: [PATCH v3 3/3] qla2xxx: Fix NVME cmd and LS cmd timeout race condition Date: Fri, 21 Jun 2019 09:50:24 -0700 Message-ID: <20190621165024.23874-4-hmadhani@marvell.com> X-Mailer: git-send-email 2.12.0 In-Reply-To: <20190621165024.23874-1-hmadhani@marvell.com> References: <20190621165024.23874-1-hmadhani@marvell.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-21_11:,, signatures=0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Quinn Tran This patch uses kref to protect access between the fcp_abort path and the NVMe command and LS command completion paths. The stack trace below shows the abort path accessing stale memory (nvme_private->sp). When the command kref reaches 0, the nvme_private and srb resources are disconnected from each other. Any subsequent NVMe abort request will not be able to reference the original srb. 
[ 5631.003998] BUG: unable to handle kernel paging request at 00000010000005d8 [ 5631.004016] IP: [] qla_nvme_abort_work+0x22/0x100 [qla2xxx] [ 5631.004086] Workqueue: events qla_nvme_abort_work [qla2xxx] [ 5631.004097] RIP: 0010:[] [] qla_nvme_abort_work+0x22/0x100 [qla2xxx] [ 5631.004109] Call Trace: [ 5631.004115] [] ? pwq_dec_nr_in_flight+0x64/0xb0 [ 5631.004117] [] process_one_work+0x17f/0x440 [ 5631.004120] [] worker_thread+0x126/0x3c0 Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_def.h | 3 + drivers/scsi/qla2xxx/qla_nvme.c | 163 ++++++++++++++++++++++++++++------------ drivers/scsi/qla2xxx/qla_nvme.h | 1 + 3 files changed, 117 insertions(+), 50 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 602ed24bb806..c0d1b0715541 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -532,6 +532,8 @@ typedef struct srb { uint8_t cmd_type; uint8_t pad[3]; atomic_t ref_count; + struct kref cmd_kref; /* need to migrate ref_count over to this */ + void *priv; wait_queue_head_t nvme_ls_waitq; struct fc_port *fcport; struct scsi_qla_host *vha; @@ -554,6 +556,7 @@ typedef struct srb { } u; void (*done)(void *, int); void (*free)(void *); + void (*put_fn)(struct kref *kref); } srb_t; #define GET_CMD_SP(sp) (sp->u.scmd.cmd) diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 8b3cb0fd307e..316aea085e6e 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -123,53 +123,91 @@ static int qla_nvme_alloc_queue(struct nvme_fc_local_port *lport, return 0; } +static void qla_nvme_release_fcp_cmd_kref(struct kref *kref) +{ + struct srb *sp = container_of(kref, struct srb, cmd_kref); + struct nvme_private *priv = (struct nvme_private *)sp->priv; + struct nvmefc_fcp_req *fd; + struct srb_iocb *nvme; + unsigned long flags; + + if (!priv) + goto out; + + nvme = &sp->u.iocb_cmd; + fd = nvme->u.nvme.desc; + + 
spin_lock_irqsave(&priv->cmd_lock, flags); + priv->sp = NULL; + sp->priv = NULL; + if (priv->comp_status == QLA_SUCCESS) { + fd->rcv_rsplen = nvme->u.nvme.rsp_pyld_len; + } else { + fd->rcv_rsplen = 0; + fd->transferred_length = 0; + } + fd->status = 0; + spin_unlock_irqrestore(&priv->cmd_lock, flags); + + fd->done(fd); +out: + qla2xxx_rel_qpair_sp(sp->qpair, sp); +} + +static void qla_nvme_release_ls_cmd_kref(struct kref *kref) +{ + struct srb *sp = container_of(kref, struct srb, cmd_kref); + struct nvme_private *priv = (struct nvme_private *)sp->priv; + struct nvmefc_ls_req *fd; + unsigned long flags; + + if (!priv) + goto out; + + spin_lock_irqsave(&priv->cmd_lock, flags); + priv->sp = NULL; + sp->priv = NULL; + spin_unlock_irqrestore(&priv->cmd_lock, flags); + + fd = priv->fd; + fd->done(fd, priv->comp_status); +out: + qla2x00_rel_sp(sp); +} + +static void qla_nvme_ls_complete(struct work_struct *work) +{ + struct nvme_private *priv = + container_of(work, struct nvme_private, ls_work); + + kref_put(&priv->sp->cmd_kref, qla_nvme_release_ls_cmd_kref); +} + static void qla_nvme_sp_ls_done(void *ptr, int res) { srb_t *sp = ptr; - struct srb_iocb *nvme; - struct nvmefc_ls_req *fd; struct nvme_private *priv; - if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0)) + if (WARN_ON_ONCE(kref_read(&sp->cmd_kref) == 0)) return; - atomic_dec(&sp->ref_count); - if (res) res = -EINVAL; - nvme = &sp->u.iocb_cmd; - fd = nvme->u.nvme.desc; - priv = fd->private; + priv = (struct nvme_private *)sp->priv; priv->comp_status = res; + INIT_WORK(&priv->ls_work, qla_nvme_ls_complete); schedule_work(&priv->ls_work); - /* work schedule doesn't need the sp */ - qla2x00_rel_sp(sp); } +/* it assumed that QPair lock is held. 
*/ static void qla_nvme_sp_done(void *ptr, int res) { srb_t *sp = ptr; - struct srb_iocb *nvme; - struct nvmefc_fcp_req *fd; + struct nvme_private *priv = (struct nvme_private *)sp->priv; - nvme = &sp->u.iocb_cmd; - fd = nvme->u.nvme.desc; - - if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0)) - return; - - atomic_dec(&sp->ref_count); - - if (res == QLA_SUCCESS) { - fd->rcv_rsplen = nvme->u.nvme.rsp_pyld_len; - } else { - fd->rcv_rsplen = 0; - fd->transferred_length = 0; - } - fd->status = 0; - fd->done(fd); - qla2xxx_rel_qpair_sp(sp->qpair, sp); + priv->comp_status = res; + kref_put(&sp->cmd_kref, qla_nvme_release_fcp_cmd_kref); return; } @@ -188,44 +226,50 @@ static void qla_nvme_abort_work(struct work_struct *work) __func__, sp, sp->handle, fcport, fcport->deleted); if (!ha->flags.fw_started && (fcport && fcport->deleted)) - return; + goto out; if (ha->flags.host_shutting_down) { ql_log(ql_log_info, sp->fcport->vha, 0xffff, "%s Calling done on sp: %p, type: 0x%x, sp->ref_count: 0x%x\n", __func__, sp, sp->type, atomic_read(&sp->ref_count)); sp->done(sp, 0); - return; + goto out; } - if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0)) - return; - rval = ha->isp_ops->abort_command(sp); ql_dbg(ql_dbg_io, fcport->vha, 0x212b, "%s: %s command for sp=%p, handle=%x on fcport=%p rval=%x\n", __func__, (rval != QLA_SUCCESS) ? "Failed to abort" : "Aborted", sp, sp->handle, fcport, rval); + +out: + /* kref_get was done before work was schedule. 
*/ + kref_put(&sp->cmd_kref, sp->put_fn); } static void qla_nvme_ls_abort(struct nvme_fc_local_port *lport, struct nvme_fc_remote_port *rport, struct nvmefc_ls_req *fd) { struct nvme_private *priv = fd->private; + unsigned long flags; + + spin_lock_irqsave(&priv->cmd_lock, flags); + if (!priv->sp) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + + if (!kref_get_unless_zero(&priv->sp->cmd_kref)) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + spin_unlock_irqrestore(&priv->cmd_lock, flags); INIT_WORK(&priv->abort_work, qla_nvme_abort_work); schedule_work(&priv->abort_work); } -static void qla_nvme_ls_complete(struct work_struct *work) -{ - struct nvme_private *priv = - container_of(work, struct nvme_private, ls_work); - struct nvmefc_ls_req *fd = priv->fd; - - fd->done(fd, priv->comp_status); -} static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, struct nvme_fc_remote_port *rport, struct nvmefc_ls_req *fd) @@ -257,11 +301,13 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, sp->type = SRB_NVME_LS; sp->name = "nvme_ls"; sp->done = qla_nvme_sp_ls_done; - atomic_set(&sp->ref_count, 1); - nvme = &sp->u.iocb_cmd; + sp->put_fn = qla_nvme_release_ls_cmd_kref; + sp->priv = (void *)priv; priv->sp = sp; + kref_init(&sp->cmd_kref); + spin_lock_init(&priv->cmd_lock); + nvme = &sp->u.iocb_cmd; priv->fd = fd; - INIT_WORK(&priv->ls_work, qla_nvme_ls_complete); nvme->u.nvme.desc = fd; nvme->u.nvme.dir = 0; nvme->u.nvme.dl = 0; @@ -278,9 +324,10 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport, if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x700e, "qla2x00_start_sp failed = %d\n", rval); - atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); - sp->free(sp); + sp->priv = NULL; + priv->sp = NULL; + qla2x00_rel_sp(sp); return rval; } @@ -292,6 +339,18 @@ static void qla_nvme_fcp_abort(struct nvme_fc_local_port *lport, struct nvmefc_fcp_req *fd) { struct nvme_private *priv = fd->private; + unsigned long 
flags; + + spin_lock_irqsave(&priv->cmd_lock, flags); + if (!priv->sp) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + if (!kref_get_unless_zero(&priv->sp->cmd_kref)) { + spin_unlock_irqrestore(&priv->cmd_lock, flags); + return; + } + spin_unlock_irqrestore(&priv->cmd_lock, flags); INIT_WORK(&priv->abort_work, qla_nvme_abort_work); schedule_work(&priv->abort_work); @@ -515,12 +574,15 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, if (!sp) return -EBUSY; - atomic_set(&sp->ref_count, 1); init_waitqueue_head(&sp->nvme_ls_waitq); + kref_init(&sp->cmd_kref); + spin_lock_init(&priv->cmd_lock); + sp->priv = (void *)priv; priv->sp = sp; sp->type = SRB_NVME_CMD; sp->name = "nvme_cmd"; sp->done = qla_nvme_sp_done; + sp->put_fn = qla_nvme_release_fcp_cmd_kref; sp->qpair = qpair; sp->vha = vha; nvme = &sp->u.iocb_cmd; @@ -530,9 +592,10 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x212d, "qla2x00_start_nvme_mq failed = %d\n", rval); - atomic_dec(&sp->ref_count); wake_up(&sp->nvme_ls_waitq); - sp->free(sp); + sp->priv = NULL; + priv->sp = NULL; + qla2xxx_rel_qpair_sp(sp->qpair, sp); } return rval; diff --git a/drivers/scsi/qla2xxx/qla_nvme.h b/drivers/scsi/qla2xxx/qla_nvme.h index 2d088add7011..67bb4a2a3742 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.h +++ b/drivers/scsi/qla2xxx/qla_nvme.h @@ -34,6 +34,7 @@ struct nvme_private { struct work_struct ls_work; struct work_struct abort_work; int comp_status; + spinlock_t cmd_lock; }; struct qla_nvme_rport {
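
Note on patch 3/3: the "get a reference only if the command is still alive" idea from its commit message can be modelled in userspace C. This is a minimal sketch under stated assumptions, not the driver code. The names struct ref, struct cmd, struct priv, ref_get_unless_zero(), ref_put(), cmd_release() and abort_try_get() are hypothetical stand-ins for struct kref, srb, nvme_private and the abort callbacks. The model is single-threaded, so the priv->cmd_lock spinlock that the patch takes around the priv->sp check is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kref. */
struct ref {
	atomic_int count;
};

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only while the object is still alive (count > 0). */
static bool ref_get_unless_zero(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; call release() when the last one goes away. */
static bool ref_put(struct ref *r, void (*release)(struct ref *))
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);	/* catch reference underflow */
	if (old == 1) {
		release(r);
		return true;
	}
	return false;
}

struct priv;

/* Stand-in for srb: ref must stay the first member (see cmd_release). */
struct cmd {
	struct ref ref;
	struct priv *priv;
	int released;
};

/* Stand-in for nvme_private: holds the back-pointer the abort path uses. */
struct priv {
	struct cmd *cmd;
};

/* Release callback: disconnect cmd and priv from each other. */
static void cmd_release(struct ref *r)
{
	struct cmd *c = (struct cmd *)r;	/* valid: ref is the first member */

	if (c->priv) {
		c->priv->cmd = NULL;
		c->priv = NULL;
	}
	c->released = 1;
}

/* Abort path: proceed only if a reference could be taken. */
static bool abort_try_get(struct priv *p)
{
	if (!p->cmd)
		return false;
	return ref_get_unless_zero(&p->cmd->ref);
}
```

In this model an abort that arrives before completion gets its own reference and must drop it with ref_put() when its work is done, which corresponds to the kref_put() at the "out:" label of qla_nvme_abort_work. An abort that arrives after the last put finds priv->cmd already cleared and returns without touching the command.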