From patchwork Fri Jun 14 22:10:18 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Himanshu Madhani
X-Patchwork-Id: 10996839
From: Himanshu Madhani
To: ,
CC: ,
Subject: [PATCH 1/3] qla2xxx: Fix kernel crash after disconnecting NVMe devices
Date: Fri, 14 Jun 2019 15:10:18 -0700
Message-ID: <20190614221020.19173-2-hmadhani@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20190614221020.19173-1-hmadhani@marvell.com>
References: <20190614221020.19173-1-hmadhani@marvell.com>
X-Mailing-List: linux-scsi@vger.kernel.org

From: Arun Easi

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx]
PGD 800000084cf41067 PUD 84d288067 PMD 0
Oops: 0000 [#1] SMP
Call Trace:
 [] process_one_work+0x17f/0x440
 [] worker_thread+0x126/0x3c0
 [] ? manage_workers.isra.26+0x2a0/0x2a0
 [] kthread+0xd1/0xe0
 [] ? insert_kthread_work+0x40/0x40
 [] ret_from_fork_nospec_begin+0x21/0x21
 [] ? insert_kthread_work+0x40/0x40
RIP [] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx]

The crash is due to a bad entry in the nvme_rport_list. This list is not
protected, and when a remoteport_delete callback is called, driver
traverses the list and crashes.

Actually, the list could be removed and driver could traverse the main
fcport list instead. Fix does exactly that.

Signed-off-by: Arun Easi
Signed-off-by: Himanshu Madhani
---
 drivers/scsi/qla2xxx/qla_def.h  |  1 -
 drivers/scsi/qla2xxx/qla_nvme.c | 52 ++++++++++++++++++++---------------------
 drivers/scsi/qla2xxx/qla_nvme.h |  1 -
 drivers/scsi/qla2xxx/qla_os.c   |  1 -
 4 files changed, 25 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 1a4095c56eee..602ed24bb806 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -4376,7 +4376,6 @@ typedef struct scsi_qla_host {
 
 	struct nvme_fc_local_port *nvme_local_port;
 	struct completion nvme_del_done;
-	struct list_head nvme_rport_list;
 
 	uint16_t fcoe_vlan_id;
 	uint16_t fcoe_fcf_idx;
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 22e3fba28e51..99220a3cf734 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -14,6 +14,18 @@ static struct nvme_fc_port_template qla_nvme_fc_transport;
 
 static void qla_nvme_unregister_remote_port(struct work_struct *);
 
+static inline
+int qla_is_active_nvme_fcport(struct fc_port *fcport)
+{
+	return fcport->nvme_flag & NVME_FLAG_REGISTERED;
+}
+
+#define qla_list_for_each_nvme_fcport(_fcport, _vha)		\
+{								\
+	list_for_each_entry(_fcport, &_vha->vp_fcports, list)	\
+		if (qla_is_active_nvme_fcport(_fcport))		\
+}
+
 int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
 {
 	struct qla_nvme_rport *rport;
@@ -74,7 +86,6 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
 
 	rport = fcport->nvme_remote_port->private;
 	rport->fcport = fcport;
-	list_add_tail(&rport->list, &vha->nvme_rport_list);
 
 	fcport->nvme_flag |= NVME_FLAG_REGISTERED;
 	return 0;
@@ -542,19 +553,12 @@ static void qla_nvme_localport_delete(struct nvme_fc_local_port *lport)
 static void qla_nvme_remoteport_delete(struct nvme_fc_remote_port *rport)
 {
 	fc_port_t *fcport;
-	struct qla_nvme_rport *qla_rport = rport->private, *trport;
+	struct qla_nvme_rport *qla_rport = rport->private;
 
 	fcport = qla_rport->fcport;
 	fcport->nvme_remote_port = NULL;
 	fcport->nvme_flag &= ~NVME_FLAG_REGISTERED;
 
-	list_for_each_entry_safe(qla_rport, trport,
-	    &fcport->vha->nvme_rport_list, list) {
-		if (qla_rport->fcport == fcport) {
-			list_del(&qla_rport->list);
-			break;
-		}
-	}
 	complete(&fcport->nvme_del_done);
 
 	if (!test_bit(UNLOADING, &fcport->vha->dpc_flags)) {
@@ -590,31 +594,25 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work)
 {
 	struct fc_port *fcport = container_of(work, struct fc_port,
 	    nvme_del_work);
-	struct qla_nvme_rport *qla_rport, *trport;
+	int ret;
 
 	if (!IS_ENABLED(CONFIG_NVME_FC))
 		return;
 
+	if (!qla_is_active_nvme_fcport(fcport))
+		return;
+
 	ql_log(ql_log_warn, NULL, 0x2112,
 	    "%s: unregister remoteport on %p\n",__func__, fcport);
 
-	list_for_each_entry_safe(qla_rport, trport,
-	    &fcport->vha->nvme_rport_list, list) {
-		if (qla_rport->fcport == fcport) {
-			ql_log(ql_log_info, fcport->vha, 0x2113,
-			    "%s: fcport=%p\n", __func__, fcport);
-			nvme_fc_set_remoteport_devloss
-			    (fcport->nvme_remote_port, 0);
-			init_completion(&fcport->nvme_del_done);
-			if (nvme_fc_unregister_remoteport
-			    (fcport->nvme_remote_port))
-				ql_log(ql_log_info, fcport->vha, 0x2114,
-				    "%s: Failed to unregister nvme_remote_port\n",
-				    __func__);
-			wait_for_completion(&fcport->nvme_del_done);
-			break;
-		}
-	}
+	nvme_fc_set_remoteport_devloss(fcport->nvme_remote_port, 0);
+	init_completion(&fcport->nvme_del_done);
+	ret = nvme_fc_unregister_remoteport(fcport->nvme_remote_port);
+	if (ret)
+		ql_log(ql_log_info, fcport->vha, 0x2114,
+			"%s: Failed to unregister nvme_remote_port (%d)\n",
+			__func__, ret);
+	wait_for_completion(&fcport->nvme_del_done);
 }
 
 void qla_nvme_delete(struct scsi_qla_host *vha)
diff --git a/drivers/scsi/qla2xxx/qla_nvme.h b/drivers/scsi/qla2xxx/qla_nvme.h
index d3b8a6440113..2d088add7011 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.h
+++ b/drivers/scsi/qla2xxx/qla_nvme.h
@@ -37,7 +37,6 @@ struct nvme_private {
 };
 
 struct qla_nvme_rport {
-	struct list_head list;
 	struct fc_port *fcport;
 };
 
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 00fee5bf4de1..ae93ae2b6090 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4789,7 +4789,6 @@ struct scsi_qla_host *qla2x00_create_host(struct scsi_host_template *sht,
 	INIT_LIST_HEAD(&vha->plogi_ack_list);
 	INIT_LIST_HEAD(&vha->qp_list);
 	INIT_LIST_HEAD(&vha->gnl.fcports);
-	INIT_LIST_HEAD(&vha->nvme_rport_list);
 	INIT_LIST_HEAD(&vha->gpnid_list);
 	INIT_WORK(&vha->iocb_work, qla2x00_iocb_work_fn);
 

From patchwork Fri Jun 14 22:10:19 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Himanshu Madhani
X-Patchwork-Id: 10996841
From: Himanshu Madhani
To: ,
CC: ,
Subject: [PATCH 2/3] qla2xxx: on session delete return nvme cmd
Date: Fri, 14 Jun 2019 15:10:19 -0700
Message-ID: <20190614221020.19173-3-hmadhani@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20190614221020.19173-1-hmadhani@marvell.com>
References: <20190614221020.19173-1-hmadhani@marvell.com>
X-Mailing-List: linux-scsi@vger.kernel.org

From: Quinn Tran

- on session delete or chip reset, reject all NVME commands.
- on NVME command submission error, free srb resource.

Signed-off-by: Quinn Tran
Signed-off-by: Himanshu Madhani
Reviewed-by: Ewan D. Milne
---
 drivers/scsi/qla2xxx/qla_nvme.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 99220a3cf734..ead10e1a81fc 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -253,6 +253,10 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport,
 
 	vha = fcport->vha;
 	ha = vha->hw;
+
+	if (!ha->flags.fw_started || (fcport && fcport->deleted))
+		return rval;
+
 	/* Alloc SRB structure */
 	sp = qla2x00_get_sp(vha, fcport, GFP_ATOMIC);
 	if (!sp)
@@ -284,6 +288,7 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport,
 		    "qla2x00_start_sp failed = %d\n", rval);
 		atomic_dec(&sp->ref_count);
 		wake_up(&sp->nvme_ls_waitq);
+		sp->free(sp);
 		return rval;
 	}
 
@@ -500,7 +505,7 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport,
 
 	vha = fcport->vha;
 
-	if (test_bit(ABORT_ISP_ACTIVE, &vha->dpc_flags))
+	if ((qpair && !qpair->fw_started) || (fcport && fcport->deleted))
 		return rval;
 
 	/*
@@ -535,6 +540,7 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport,
 		    "qla2x00_start_nvme_mq failed = %d\n", rval);
 		atomic_dec(&sp->ref_count);
 		wake_up(&sp->nvme_ls_waitq);
+		sp->free(sp);
 	}
 
 	return rval;
@@ -561,14 +567,13 @@ static void qla_nvme_remoteport_delete(struct nvme_fc_remote_port *rport)
 
 	complete(&fcport->nvme_del_done);
 
-	if (!test_bit(UNLOADING, &fcport->vha->dpc_flags)) {
-		INIT_WORK(&fcport->free_work, qlt_free_session_done);
-		schedule_work(&fcport->free_work);
-	}
+	INIT_WORK(&fcport->free_work, qlt_free_session_done);
+	schedule_work(&fcport->free_work);
 
 	fcport->nvme_flag &= ~NVME_FLAG_DELETING;
 	ql_log(ql_log_info, fcport->vha, 0x2110,
-	    "remoteport_delete of %p completed.\n", fcport);
+	    "remoteport_delete of %p %8phN completed.\n",
+	    fcport, fcport->port_name);
 }
 
 static struct nvme_fc_port_template qla_nvme_fc_transport = {
@@ -603,7 +608,8 @@ static void qla_nvme_unregister_remote_port(struct work_struct *work)
 		return;
 
 	ql_log(ql_log_warn, NULL, 0x2112,
-	    "%s: unregister remoteport on %p\n",__func__, fcport);
+	    "%s: unregister remoteport on %p %8phN\n",
+	    __func__, fcport, fcport->port_name);
 
 	nvme_fc_set_remoteport_devloss(fcport->nvme_remote_port, 0);
 	init_completion(&fcport->nvme_del_done);

From patchwork Fri Jun 14 22:10:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Himanshu Madhani
X-Patchwork-Id: 10996843
From: Himanshu Madhani
To: ,
CC: ,
Subject: [PATCH 3/3] qla2xxx: Fix NVME cmd and LS cmd timeout race condition
Date: Fri, 14 Jun 2019 15:10:20 -0700
Message-ID: <20190614221020.19173-4-hmadhani@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20190614221020.19173-1-hmadhani@marvell.com>
References: <20190614221020.19173-1-hmadhani@marvell.com>
X-Mailing-List: linux-scsi@vger.kernel.org

From: Quinn Tran

This patch uses kref to protect access between fcp_abort path and nvme
command and LS command completion path. Stack trace below shows the abort
path is accessing stale memory (nvme_private->sp).

When command kref reaches 0, nvme_private & srb resource will be
disconnected from each other. Any subsequent nvme abort request will not
be able to reference the original srb.

[ 5631.003998] BUG: unable to handle kernel paging request at 00000010000005d8
[ 5631.004016] IP: [] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004086] Workqueue: events qla_nvme_abort_work [qla2xxx]
[ 5631.004097] RIP: 0010:[] [] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004109] Call Trace:
[ 5631.004115] [] ? pwq_dec_nr_in_flight+0x64/0xb0
[ 5631.004117] [] process_one_work+0x17f/0x440
[ 5631.004120] [] worker_thread+0x126/0x3c0

Signed-off-by: Quinn Tran
Signed-off-by: Himanshu Madhani
---
 drivers/scsi/qla2xxx/qla_def.h  |   2 +
 drivers/scsi/qla2xxx/qla_nvme.c | 164 ++++++++++++++++++++++++++++------------
 drivers/scsi/qla2xxx/qla_nvme.h |   1 +
 3 files changed, 117 insertions(+), 50 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 602ed24bb806..85a27ee5d647 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -532,6 +532,8 @@ typedef struct srb {
 	uint8_t cmd_type;
 	uint8_t pad[3];
 	atomic_t ref_count;
+	struct kref cmd_kref;	/* need to migrate ref_count over to this */
+	void *priv;
 	wait_queue_head_t nvme_ls_waitq;
 	struct fc_port *fcport;
 	struct scsi_qla_host *vha;
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index ead10e1a81fc..b56dcab9d265 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -135,53 +135,91 @@ static int qla_nvme_alloc_queue(struct nvme_fc_local_port *lport,
 	return 0;
 }
 
+static void qla_nvme_release_fcp_cmd_kref(struct kref *kref)
+{
+	struct srb *sp = container_of(kref, struct srb, cmd_kref);
+	struct nvme_private *priv = (struct nvme_private *)sp->priv;
+	struct nvmefc_fcp_req *fd;
+	struct srb_iocb *nvme;
+	unsigned long flags;
+
+	if (!priv)
+		goto out;
+
+	nvme = &sp->u.iocb_cmd;
+	fd = nvme->u.nvme.desc;
+
+	spin_lock_irqsave(&priv->cmd_lock, flags);
+	priv->sp = NULL;
+	sp->priv = NULL;
+	if (priv->comp_status == QLA_SUCCESS) {
+		fd->rcv_rsplen = nvme->u.nvme.rsp_pyld_len;
+	} else {
+		fd->rcv_rsplen = 0;
+		fd->transferred_length = 0;
+	}
+	fd->status = 0;
+	spin_unlock_irqrestore(&priv->cmd_lock, flags);
+
+	fd->done(fd);
+out:
+	qla2xxx_rel_qpair_sp(sp->qpair, sp);
+}
+
+static void qla_nvme_release_ls_cmd_kref(struct kref *kref)
+{
+	struct srb *sp = container_of(kref, struct srb, cmd_kref);
+	struct nvme_private *priv = (struct nvme_private *)sp->priv;
+	struct nvmefc_ls_req *fd;
+	unsigned long flags;
+
+	if (!priv)
+		goto out;
+
+	spin_lock_irqsave(&priv->cmd_lock, flags);
+	priv->sp = NULL;
+	sp->priv = NULL;
+	spin_unlock_irqrestore(&priv->cmd_lock, flags);
+
+	fd = priv->fd;
+	fd->done(fd, priv->comp_status);
+out:
+	qla2x00_rel_sp(sp);
+}
+
+static void qla_nvme_ls_complete(struct work_struct *work)
+{
+	struct nvme_private *priv =
+		container_of(work, struct nvme_private, ls_work);
+
+	kref_put(&priv->sp->cmd_kref, qla_nvme_release_ls_cmd_kref);
+}
+
 static void qla_nvme_sp_ls_done(void *ptr, int res)
 {
 	srb_t *sp = ptr;
-	struct srb_iocb *nvme;
-	struct nvmefc_ls_req *fd;
 	struct nvme_private *priv;
 
-	if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0))
+	if (WARN_ON_ONCE(kref_read(&sp->cmd_kref) == 0))
 		return;
 
-	atomic_dec(&sp->ref_count);
-
 	if (res)
 		res = -EINVAL;
 
-	nvme = &sp->u.iocb_cmd;
-	fd = nvme->u.nvme.desc;
-	priv = fd->private;
+	priv = (struct nvme_private *)sp->priv;
 	priv->comp_status = res;
+	INIT_WORK(&priv->ls_work, qla_nvme_ls_complete);
 	schedule_work(&priv->ls_work);
-	/* work schedule doesn't need the sp */
-	qla2x00_rel_sp(sp);
 }
 
+/* It is assumed that the QPair lock is held. */
 static void qla_nvme_sp_done(void *ptr, int res)
 {
 	srb_t *sp = ptr;
-	struct srb_iocb *nvme;
-	struct nvmefc_fcp_req *fd;
+	struct nvme_private *priv = (struct nvme_private *)sp->priv;
 
-	nvme = &sp->u.iocb_cmd;
-	fd = nvme->u.nvme.desc;
-
-	if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0))
-		return;
-
-	atomic_dec(&sp->ref_count);
-
-	if (res == QLA_SUCCESS) {
-		fd->rcv_rsplen = nvme->u.nvme.rsp_pyld_len;
-	} else {
-		fd->rcv_rsplen = 0;
-		fd->transferred_length = 0;
-	}
-	fd->status = 0;
-	fd->done(fd);
-	qla2xxx_rel_qpair_sp(sp->qpair, sp);
+	priv->comp_status = res;
+	kref_put(&sp->cmd_kref, qla_nvme_release_fcp_cmd_kref);
 
 	return;
 }
@@ -200,44 +238,53 @@ static void qla_nvme_abort_work(struct work_struct *work)
 	    __func__, sp, sp->handle, fcport, fcport->deleted);
 
 	if (!ha->flags.fw_started && (fcport && fcport->deleted))
-		return;
+		goto out;
 
 	if (ha->flags.host_shutting_down) {
 		ql_log(ql_log_info, sp->fcport->vha, 0xffff,
 		    "%s Calling done on sp: %p, type: 0x%x, sp->ref_count: 0x%x\n",
 		    __func__, sp, sp->type, atomic_read(&sp->ref_count));
 		sp->done(sp, 0);
-		return;
+		goto out;
 	}
 
-	if (WARN_ON_ONCE(atomic_read(&sp->ref_count) == 0))
-		return;
-
 	rval = ha->isp_ops->abort_command(sp);
 
 	ql_dbg(ql_dbg_io, fcport->vha, 0x212b,
 	    "%s: %s command for sp=%p, handle=%x on fcport=%p rval=%x\n",
 	    __func__, (rval != QLA_SUCCESS) ? "Failed to abort" : "Aborted",
 	    sp, sp->handle, fcport, rval);
+
+out:
+	/* kref_get was done before the work was scheduled. */
+	if (sp->type == SRB_NVME_CMD)
+		kref_put(&sp->cmd_kref, qla_nvme_release_fcp_cmd_kref);
+	else if (sp->type == SRB_NVME_LS)
+		kref_put(&sp->cmd_kref, qla_nvme_release_ls_cmd_kref);
 }
 
 static void qla_nvme_ls_abort(struct nvme_fc_local_port *lport,
     struct nvme_fc_remote_port *rport, struct nvmefc_ls_req *fd)
 {
 	struct nvme_private *priv = fd->private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&priv->cmd_lock, flags);
+	if (!priv->sp) {
+		spin_unlock_irqrestore(&priv->cmd_lock, flags);
+		return;
+	}
+
+	if (!kref_get_unless_zero(&priv->sp->cmd_kref)) {
+		spin_unlock_irqrestore(&priv->cmd_lock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&priv->cmd_lock, flags);
 
 	INIT_WORK(&priv->abort_work, qla_nvme_abort_work);
 	schedule_work(&priv->abort_work);
 }
 
-static void qla_nvme_ls_complete(struct work_struct *work)
-{
-	struct nvme_private *priv =
-	    container_of(work, struct nvme_private, ls_work);
-	struct nvmefc_ls_req *fd = priv->fd;
-
-	fd->done(fd, priv->comp_status);
-}
 
 static int qla_nvme_ls_req(struct nvme_fc_local_port *lport,
     struct nvme_fc_remote_port *rport, struct nvmefc_ls_req *fd)
@@ -265,11 +312,12 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport,
 	sp->type = SRB_NVME_LS;
 	sp->name = "nvme_ls";
 	sp->done = qla_nvme_sp_ls_done;
-	atomic_set(&sp->ref_count, 1);
-	nvme = &sp->u.iocb_cmd;
+	sp->priv = (void *)priv;
 	priv->sp = sp;
+	kref_init(&sp->cmd_kref);
+	spin_lock_init(&priv->cmd_lock);
+	nvme = &sp->u.iocb_cmd;
 	priv->fd = fd;
-	INIT_WORK(&priv->ls_work, qla_nvme_ls_complete);
 	nvme->u.nvme.desc = fd;
 	nvme->u.nvme.dir = 0;
 	nvme->u.nvme.dl = 0;
@@ -286,9 +334,10 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport,
 	if (rval != QLA_SUCCESS) {
 		ql_log(ql_log_warn, vha, 0x700e,
 		    "qla2x00_start_sp failed = %d\n", rval);
-		atomic_dec(&sp->ref_count);
 		wake_up(&sp->nvme_ls_waitq);
-		sp->free(sp);
+		sp->priv = NULL;
+		priv->sp = NULL;
+		qla2x00_rel_sp(sp);
 		return rval;
 	}
 
@@ -300,6 +349,18 @@ static void qla_nvme_fcp_abort(struct nvme_fc_local_port *lport,
     struct nvmefc_fcp_req *fd)
 {
 	struct nvme_private *priv = fd->private;
+	unsigned long flags;
+
+	spin_lock_irqsave(&priv->cmd_lock, flags);
+	if (!priv->sp) {
+		spin_unlock_irqrestore(&priv->cmd_lock, flags);
+		return;
+	}
+	if (!kref_get_unless_zero(&priv->sp->cmd_kref)) {
+		spin_unlock_irqrestore(&priv->cmd_lock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&priv->cmd_lock, flags);
 
 	INIT_WORK(&priv->abort_work, qla_nvme_abort_work);
 	schedule_work(&priv->abort_work);
@@ -523,8 +584,10 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport,
 	if (!sp)
 		return -EBUSY;
 
-	atomic_set(&sp->ref_count, 1);
 	init_waitqueue_head(&sp->nvme_ls_waitq);
+	kref_init(&sp->cmd_kref);
+	spin_lock_init(&priv->cmd_lock);
+	sp->priv = (void *)priv;
 	priv->sp = sp;
 	sp->type = SRB_NVME_CMD;
 	sp->name = "nvme_cmd";
@@ -538,9 +601,10 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport,
 	if (rval != QLA_SUCCESS) {
 		ql_log(ql_log_warn, vha, 0x212d,
 		    "qla2x00_start_nvme_mq failed = %d\n", rval);
-		atomic_dec(&sp->ref_count);
 		wake_up(&sp->nvme_ls_waitq);
-		sp->free(sp);
+		sp->priv = NULL;
+		priv->sp = NULL;
+		qla2xxx_rel_qpair_sp(sp->qpair, sp);
 	}
 
 	return rval;
diff --git a/drivers/scsi/qla2xxx/qla_nvme.h b/drivers/scsi/qla2xxx/qla_nvme.h
index 2d088add7011..67bb4a2a3742 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.h
+++ b/drivers/scsi/qla2xxx/qla_nvme.h
@@ -34,6 +34,7 @@ struct nvme_private {
 	struct work_struct ls_work;
 	struct work_struct abort_work;
 	int comp_status;
+	spinlock_t cmd_lock;
 };
 
 struct qla_nvme_rport {
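
Note on patch 3/3 (illustration only, not part of the series): the abort
versus completion protection in that patch pairs a reference count with a
lock around the priv->sp back-pointer. The sketch below is a userspace model
of that pattern under stated assumptions: every demo_* name is hypothetical,
a C11 atomic_int stands in for struct kref, and an atomic_flag spin loop
stands in for spinlock_t. It is not the driver's code and does not model
interrupts or workqueues.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* All demo_* names are hypothetical stand-ins, not qla2xxx symbols. */
struct demo_cmd;

struct demo_priv {
	atomic_flag lock;	/* stands in for priv->cmd_lock */
	struct demo_cmd *cmd;	/* stands in for priv->sp */
	int done_calls;		/* completions reported to the upper layer */
};

struct demo_cmd {
	atomic_int refs;	/* stands in for sp->cmd_kref */
	struct demo_priv *priv;	/* stands in for sp->priv */
};

static void demo_lock(struct demo_priv *priv)
{
	while (atomic_flag_test_and_set(&priv->lock))
		;	/* spin until the flag is released */
}

static void demo_unlock(struct demo_priv *priv)
{
	atomic_flag_clear(&priv->lock);
}

/* Link both objects and start with one reference, as kref_init() would. */
static void demo_cmd_init(struct demo_cmd *cmd, struct demo_priv *priv)
{
	atomic_flag_clear(&priv->lock);
	priv->cmd = cmd;
	priv->done_calls = 0;
	cmd->priv = priv;
	atomic_init(&cmd->refs, 1);
}

/* Take a reference only while the count is non-zero. */
static bool demo_get_unless_zero(struct demo_cmd *cmd)
{
	int old = atomic_load(&cmd->refs);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&cmd->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Runs once, when the last reference goes away. */
static void demo_release(struct demo_cmd *cmd)
{
	struct demo_priv *priv = cmd->priv;

	if (!priv)
		return;

	/* Sever both back-pointers under the lock so abort sees NULL. */
	demo_lock(priv);
	priv->cmd = NULL;
	cmd->priv = NULL;
	demo_unlock(priv);

	priv->done_calls++;	/* report completion outside the lock */
}

static void demo_put(struct demo_cmd *cmd)
{
	int old = atomic_fetch_sub(&cmd->refs, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	if (old == 1)
		demo_release(cmd);
}

/* Abort entry: return the command with an extra reference, or NULL. */
static struct demo_cmd *demo_abort_pin(struct demo_priv *priv)
{
	struct demo_cmd *cmd;

	demo_lock(priv);
	cmd = priv->cmd;
	if (cmd && !demo_get_unless_zero(cmd))
		cmd = NULL;
	demo_unlock(priv);
	return cmd;
}
```

In this model a caller would run demo_cmd_init(), let the abort path call
demo_abort_pin(), and have both the completion path and the abort path call
demo_put(). The completion is reported only when the last of them drops its
reference, and a later demo_abort_pin() returns NULL because the
back-pointer has already been cleared.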