From patchwork Tue Mar 26 07:38:38 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurav Kashyap <skashyap@marvell.com>
X-Patchwork-Id: 10870513
Return-Path: <linux-scsi-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
 [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6C2415AC
	for <patchwork-linux-scsi@patchwork.kernel.org>;
 Tue, 26 Mar 2019 07:39:49 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AB8EE2907D
	for <patchwork-linux-scsi@patchwork.kernel.org>;
 Tue, 26 Mar 2019 07:39:49 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 9FA9629089; Tue, 26 Mar 2019 07:39:49 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 269752907D
	for <patchwork-linux-scsi@patchwork.kernel.org>;
 Tue, 26 Mar 2019 07:39:48 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1731101AbfCZHjr (ORCPT
        <rfc822;patchwork-linux-scsi@patchwork.kernel.org>);
        Tue, 26 Mar 2019 03:39:47 -0400
Received: from mail-eopbgr730088.outbound.protection.outlook.com
 ([40.107.73.88]:64425
        "EHLO NAM05-DM3-obe.outbound.protection.outlook.com"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1726266AbfCZHjr (ORCPT <rfc822;linux-scsi@vger.kernel.org>);
        Tue, 26 Mar 2019 03:39:47 -0400
Received: from BYAPR07CA0022.namprd07.prod.outlook.com (2603:10b6:a02:bc::35)
 by SN2PR07MB2544.namprd07.prod.outlook.com (2603:10b6:804:7::10) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1730.16; Tue, 26 Mar
 2019 07:39:37 +0000
Received: from DM3NAM05FT018.eop-nam05.prod.protection.outlook.com
 (2a01:111:f400:7e51::201) by BYAPR07CA0022.outlook.office365.com
 (2603:10b6:a02:bc::35) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1730.18 via Frontend
 Transport; Tue, 26 Mar 2019 07:39:37 +0000
Authentication-Results: spf=fail (sender IP is 199.233.58.38)
 smtp.mailfrom=marvell.com; vger.kernel.org; dkim=none (message not signed)
 header.d=none;vger.kernel.org; dmarc=fail action=none
 header.from=marvell.com;
Received-SPF: Fail (protection.outlook.com: domain of marvell.com does not
 designate 199.233.58.38 as permitted sender) receiver=protection.outlook.com;
 client-ip=199.233.58.38; helo=CAEXCH02.caveonetworks.com;
Received: from CAEXCH02.caveonetworks.com (199.233.58.38) by
 DM3NAM05FT018.mail.protection.outlook.com (10.152.98.127) with Microsoft SMTP
 Server (version=TLS1_0, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA) id
 15.20.1750.4 via Frontend Transport; Tue, 26 Mar 2019 07:39:36 +0000
Received: from dut1171.mv.qlogic.com (10.112.88.18) by
 CAEXCH02.caveonetworks.com (10.67.98.110) with Microsoft SMTP Server (TLS) id
 14.2.347.0; Tue, 26 Mar 2019 00:39:18 -0700
Received: from dut1171.mv.qlogic.com (localhost [127.0.0.1])    by
 dut1171.mv.qlogic.com (8.14.7/8.14.7) with ESMTP id x2Q7dIs4026851;
    Tue, 26
 Mar 2019 00:39:18 -0700
Received: (from root@localhost) by dut1171.mv.qlogic.com
 (8.14.7/8.14.7/Submit) id x2Q7dIg4026850;
      Tue, 26 Mar 2019 00:39:18 -0700
From: Saurav Kashyap <skashyap@marvell.com>
To: <martin.petersen@oracle.com>
CC: <QLogic-Storage-Upstream@cavium.com>, <linux-scsi@vger.kernel.org>
Subject: [PATCH v2 06/26] qedf: Modify abort and tmf handler to handle edge
 condition and flush.
Date: Tue, 26 Mar 2019 00:38:38 -0700
Message-ID: <20190326073858.26792-7-skashyap@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20190326073858.26792-1-skashyap@marvell.com>
References: <20190326073858.26792-1-skashyap@marvell.com>
MIME-Version: 1.0
X-EOPAttributedMessage: 0
X-Matching-Connectors: 
 131980595774418794;(abac79dc-c90b-41ba-8033-08d666125e47);(abac79dc-c90b-41ba-8033-08d666125e47)
X-Forefront-Antispam-Report: 
 CIP:199.233.58.38;IPV:CAL;CTRY:US;EFV:NLI;SFV:NSPM;SFS:(10009020)(346002)(376002)(136003)(396003)(39860400002)(2980300002)(1109001)(1110001)(339900001)(189003)(199004)(105606002)(2351001)(42186006)(68736007)(106466001)(336012)(86362001)(356004)(6666004)(1076003)(80596001)(498600001)(30864003)(87636003)(26826003)(85426001)(53936002)(69596002)(6862004)(36906005)(2906002)(36756003)(2616005)(446003)(11346002)(4326008)(5660300002)(97736004)(50226002)(48376002)(81156014)(50466002)(305945005)(8676002)(26005)(51416003)(8936002)(81166006)(476003)(54906003)(126002)(16586007)(486006)(316002)(76176011)(47776003)(14444005);DIR:OUT;SFP:1101;SCL:1;SRVR:SN2PR07MB2544;H:CAEXCH02.caveonetworks.com;FPR:;SPF:Fail;LANG:en;PTR:InfoDomainNonexistent;MX:1;A:1;
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: f2191358-2ef6-4211-04da-08d6b1be32d7
X-Microsoft-Antispam: 
 BCL:0;PCL:0;RULEID:(2390118)(7020095)(5600127)(711020)(4605104)(2017052603328);SRVR:SN2PR07MB2544;
X-MS-TrafficTypeDiagnostic: SN2PR07MB2544:
X-Microsoft-Antispam-PRVS: 
 <SN2PR07MB25445734199E859EC81323CED25F0@SN2PR07MB2544.namprd07.prod.outlook.com>
X-Forefront-PRVS: 09888BC01D
X-Microsoft-Antispam-Message-Info: 
 m/YIWbw5gDLMwsr+qqD2KM5Sif68EqEaH3szWFolVqsd40PmWd2+TThzEYDF31XBwSKD6qIWkLTh1dyEGot5S5LBdgdvdU94E1gkx1rqnwKcpJvK1s/AQ/zpZ8avyuIOFWj7/8R2bPowwJYUi0Ezb4r6+R+u6ZQw8jG5/KjpmCxdig3LyBXRIJXRD60txUACM6XyhMYGHwRtd+nDrNmBrSUvScdY58Xugt4wv2PL1VNjZu3RbVGacy3NmHA5jx/z521UM0Y++b/c/wSwIM6S3IPJBuGyTAo5O4t1tXJt8nn61V7BO2renDZPLkQwFsXfzNCFMGDUZVT1JORtQf6PgrFDmqTjAXxxl54RhRIDrGk4ao8vRNUpYpdg1UhDNEF36XqlLuW2RjIxc1JgQupnoWwdRRkQPoSppo0rDLEDa2c=
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Mar 2019 07:39:36.9604
 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 
 f2191358-2ef6-4211-04da-08d6b1be32d7
X-MS-Exchange-CrossTenant-Id: 5afe0b00-7697-4969-b663-5eab37d5f47e
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: 
 TenantId=5afe0b00-7697-4969-b663-5eab37d5f47e;Ip=[199.233.58.38];Helo=[CAEXCH02.caveonetworks.com]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN2PR07MB2544
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-scsi.vger.kernel.org>
X-Mailing-List: linux-scsi@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

An I/Os can be any state when flush is called, it can be in abort,
Waiting for abort, RRQ send and waiting or TMF send.
- HZ can be different on different architecture, correctly set
  abort timeout value.
- Flush can complete the I/Os prematurely, handle refcount
  for aborted I/Os and for which RRQ is pending.
- Differentiate LUN/TARGET reset, as cleanup needs to be
  send to firmware accordingly.
- Add flush mutex to sync cleanup call from abort and flush routine.
- Clear abort/outstanding bit on timeout.

Signed-off-by: Shyam Sundar <shyam.sundar@marvell.com>
Signed-off-by: Chad Dupuis <cdupuis@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
---
 drivers/scsi/qedf/qedf.h      |  10 ++-
 drivers/scsi/qedf/qedf_els.c  |  31 +++++++-
 drivers/scsi/qedf/qedf_io.c   | 178 +++++++++++++++++++++++++++++++++++-------
 drivers/scsi/qedf/qedf_main.c | 124 +++++++++++++++++++++--------
 4 files changed, 276 insertions(+), 67 deletions(-)

diff --git a/drivers/scsi/qedf/qedf.h b/drivers/scsi/qedf/qedf.h
index 787cc12..9e5e183 100644
--- a/drivers/scsi/qedf/qedf.h
+++ b/drivers/scsi/qedf/qedf.h
@@ -49,8 +49,8 @@
 	sizeof(struct fc_frame_header))
 #define QEDF_MAX_NPIV		64
 #define QEDF_TM_TIMEOUT		10
-#define QEDF_ABORT_TIMEOUT	10
-#define QEDF_CLEANUP_TIMEOUT	10
+#define QEDF_ABORT_TIMEOUT	(10 * 1000)
+#define QEDF_CLEANUP_TIMEOUT	1
 #define QEDF_MAX_CDB_LEN	16
 
 #define UPSTREAM_REMOVE		1
@@ -82,6 +82,7 @@ struct qedf_els_cb_arg {
 };
 
 enum qedf_ioreq_event {
+	QEDF_IOREQ_EV_NONE,
 	QEDF_IOREQ_EV_ABORT_SUCCESS,
 	QEDF_IOREQ_EV_ABORT_FAILED,
 	QEDF_IOREQ_EV_SEND_RRQ,
@@ -182,7 +183,10 @@ struct qedf_rport {
 #define QEDF_RPORT_SESSION_READY 1
 #define QEDF_RPORT_UPLOADING_CONNECTION	2
 #define QEDF_RPORT_IN_RESET 3
+#define QEDF_RPORT_IN_LUN_RESET 4
+#define QEDF_RPORT_IN_TARGET_RESET 5
 	unsigned long flags;
+	int lun_reset_lun;
 	unsigned long retry_delay_timestamp;
 	struct fc_rport *rport;
 	struct fc_rport_priv *rdata;
@@ -395,6 +399,8 @@ struct qedf_ctx {
 	u8 target_resets;
 	u8 task_set_fulls;
 	u8 busy;
+	/* Used for flush routine */
+	struct mutex flush_mutex;
 };
 
 struct io_bdt {
diff --git a/drivers/scsi/qedf/qedf_els.c b/drivers/scsi/qedf/qedf_els.c
index a60819b..ac2bfc2 100644
--- a/drivers/scsi/qedf/qedf_els.c
+++ b/drivers/scsi/qedf/qedf_els.c
@@ -201,8 +201,12 @@ static void qedf_rrq_compl(struct qedf_els_cb_arg *cb_arg)
 		   " orig xid = 0x%x, rrq_xid = 0x%x, refcount=%d\n",
 		   orig_io_req, orig_io_req->xid, rrq_req->xid, refcount);
 
-	/* This should return the aborted io_req to the command pool */
-	if (orig_io_req)
+	/*
+	 * This should return the aborted io_req to the command pool. Note that
+	 * we need to check the refcound in case the original request was
+	 * flushed but we get a completion on this xid.
+	 */
+	if (orig_io_req && refcount > 0)
 		kref_put(&orig_io_req->refcount, qedf_release_cmd);
 
 out_free:
@@ -229,6 +233,7 @@ int qedf_send_rrq(struct qedf_ioreq *aborted_io_req)
 	uint32_t sid;
 	uint32_t r_a_tov;
 	int rc;
+	int refcount;
 
 	if (!aborted_io_req) {
 		QEDF_ERR(NULL, "abort_io_req is NULL.\n");
@@ -237,6 +242,15 @@ int qedf_send_rrq(struct qedf_ioreq *aborted_io_req)
 
 	fcport = aborted_io_req->fcport;
 
+	if (!fcport) {
+		refcount = kref_read(&aborted_io_req->refcount);
+		QEDF_ERR(NULL,
+			 "RRQ work was queued prior to a flush xid=0x%x, refcount=%d.\n",
+			 aborted_io_req->xid, refcount);
+		kref_put(&aborted_io_req->refcount, qedf_release_cmd);
+		return -EINVAL;
+	}
+
 	/* Check that fcport is still offloaded */
 	if (!test_bit(QEDF_RPORT_SESSION_READY, &fcport->flags)) {
 		QEDF_ERR(NULL, "fcport is no longer offloaded.\n");
@@ -249,6 +263,19 @@ int qedf_send_rrq(struct qedf_ioreq *aborted_io_req)
 	}
 
 	qedf = fcport->qedf;
+
+	/*
+	 * Sanity check that we can send a RRQ to make sure that refcount isn't
+	 * 0
+	 */
+	refcount = kref_read(&aborted_io_req->refcount);
+	if (refcount != 1) {
+		QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_ELS,
+			  "refcount for xid=%x io_req=%p refcount=%d is not 1.\n",
+			  aborted_io_req->xid, aborted_io_req, refcount);
+		return -EINVAL;
+	}
+
 	lport = qedf->lport;
 	sid = fcport->sid;
 	r_a_tov = lport->r_a_tov;
diff --git a/drivers/scsi/qedf/qedf_io.c b/drivers/scsi/qedf/qedf_io.c
index 1d810d7..c22dbb3 100644
--- a/drivers/scsi/qedf/qedf_io.c
+++ b/drivers/scsi/qedf/qedf_io.c
@@ -43,8 +43,9 @@ static void qedf_cmd_timeout(struct work_struct *work)
 	switch (io_req->cmd_type) {
 	case QEDF_ABTS:
 		if (qedf == NULL) {
-			QEDF_INFO(NULL, QEDF_LOG_IO, "qedf is NULL for xid=0x%x.\n",
-			    io_req->xid);
+			QEDF_INFO(NULL, QEDF_LOG_IO,
+				  "qedf is NULL for ABTS xid=0x%x.\n",
+				  io_req->xid);
 			return;
 		}
 
@@ -61,6 +62,9 @@ static void qedf_cmd_timeout(struct work_struct *work)
 		 */
 		kref_put(&io_req->refcount, qedf_release_cmd);
 
+		/* Clear in abort bit now that we're done with the command */
+		clear_bit(QEDF_CMD_IN_ABORT, &io_req->flags);
+
 		/*
 		 * Now that the original I/O and the ABTS are complete see
 		 * if we need to reconnect to the target.
@@ -68,6 +72,15 @@ static void qedf_cmd_timeout(struct work_struct *work)
 		qedf_restart_rport(fcport);
 		break;
 	case QEDF_ELS:
+		if (!qedf) {
+			QEDF_INFO(NULL, QEDF_LOG_IO,
+				  "qedf is NULL for ELS xid=0x%x.\n",
+				  io_req->xid);
+			return;
+		}
+		/* ELS request no longer outstanding since it timed out */
+		clear_bit(QEDF_CMD_OUTSTANDING, &io_req->flags);
+
 		kref_get(&io_req->refcount);
 		/*
 		 * Don't attempt to clean an ELS timeout as any subseqeunt
@@ -1137,6 +1150,19 @@ void qedf_scsi_completion(struct qedf_ctx *qedf, struct fcoe_cqe *cqe,
 
 	fcport = io_req->fcport;
 
+	/*
+	 * When flush is active, let the cmds be completed from the cleanup
+	 * context
+	 */
+	if (test_bit(QEDF_RPORT_IN_TARGET_RESET, &fcport->flags) ||
+	    (test_bit(QEDF_RPORT_IN_LUN_RESET, &fcport->flags) &&
+	     sc_cmd->device->lun == (u64)fcport->lun_reset_lun)) {
+		QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_IO,
+			  "Dropping good completion xid=0x%x as fcport is flushing",
+			  io_req->xid);
+		return;
+	}
+
 	qedf_parse_fcp_rsp(io_req, fcp_rsp);
 
 	qedf_unmap_sg_list(qedf, io_req);
@@ -1720,15 +1746,23 @@ int qedf_initiate_abts(struct qedf_ioreq *io_req, bool return_scsi_cmd_on_abts)
 	unsigned long flags;
 	struct fcoe_wqe *sqe;
 	u16 sqe_idx;
+	int refcount = 0;
 
 	/* Sanity check qedf_rport before dereferencing any pointers */
 	if (!test_bit(QEDF_RPORT_SESSION_READY, &fcport->flags)) {
 		QEDF_ERR(NULL, "tgt not offloaded\n");
 		rc = 1;
-		goto abts_err;
+		goto out;
 	}
 
 	rdata = fcport->rdata;
+
+	if (!rdata || !kref_get_unless_zero(&rdata->kref)) {
+		QEDF_ERR(&qedf->dbg_ctx, "stale rport\n");
+		rc = 1;
+		goto out;
+	}
+
 	r_a_tov = rdata->r_a_tov;
 	qedf = fcport->qedf;
 	lport = qedf->lport;
@@ -1736,20 +1770,20 @@ int qedf_initiate_abts(struct qedf_ioreq *io_req, bool return_scsi_cmd_on_abts)
 	if (lport->state != LPORT_ST_READY || !(lport->link_up)) {
 		QEDF_ERR(&(qedf->dbg_ctx), "link is not ready\n");
 		rc = 1;
-		goto abts_err;
+		goto out;
 	}
 
 	if (atomic_read(&qedf->link_down_tmo_valid) > 0) {
 		QEDF_ERR(&(qedf->dbg_ctx), "link_down_tmo active.\n");
 		rc = 1;
-		goto abts_err;
+		goto out;
 	}
 
 	/* Ensure room on SQ */
 	if (!atomic_read(&fcport->free_sqes)) {
 		QEDF_ERR(&(qedf->dbg_ctx), "No SQ entries available\n");
 		rc = 1;
-		goto abts_err;
+		goto out;
 	}
 
 	if (test_bit(QEDF_RPORT_UPLOADING_CONNECTION, &fcport->flags)) {
@@ -1774,18 +1808,17 @@ int qedf_initiate_abts(struct qedf_ioreq *io_req, bool return_scsi_cmd_on_abts)
 	qedf->control_requests++;
 	qedf->packet_aborts++;
 
-	/* Set the return CPU to be the same as the request one */
-	io_req->cpu = smp_processor_id();
-
 	/* Set the command type to abort */
 	io_req->cmd_type = QEDF_ABTS;
 	io_req->return_scsi_cmd_on_abts = return_scsi_cmd_on_abts;
 
 	set_bit(QEDF_CMD_IN_ABORT, &io_req->flags);
-	QEDF_INFO(&(qedf->dbg_ctx), QEDF_LOG_SCSI_TM, "ABTS io_req xid = "
-		   "0x%x\n", xid);
+	refcount = kref_read(&io_req->refcount);
+	QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_SCSI_TM,
+		  "ABTS io_req xid = 0x%x refcount=%d\n",
+		  xid, refcount);
 
-	qedf_cmd_timer_set(qedf, io_req, QEDF_ABORT_TIMEOUT * HZ);
+	qedf_cmd_timer_set(qedf, io_req, QEDF_ABORT_TIMEOUT);
 
 	spin_lock_irqsave(&fcport->rport_lock, flags);
 
@@ -1799,13 +1832,6 @@ int qedf_initiate_abts(struct qedf_ioreq *io_req, bool return_scsi_cmd_on_abts)
 
 	spin_unlock_irqrestore(&fcport->rport_lock, flags);
 
-	return rc;
-abts_err:
-	/*
-	 * If the ABTS task fails to queue then we need to cleanup the
-	 * task at the firmware.
-	 */
-	qedf_initiate_cleanup(io_req, return_scsi_cmd_on_abts);
 out:
 	return rc;
 }
@@ -1815,25 +1841,59 @@ void qedf_process_abts_compl(struct qedf_ctx *qedf, struct fcoe_cqe *cqe,
 {
 	uint32_t r_ctl;
 	uint16_t xid;
+	int rc;
+	struct qedf_rport *fcport = io_req->fcport;
 
 	QEDF_INFO(&(qedf->dbg_ctx), QEDF_LOG_SCSI_TM, "Entered with xid = "
 		   "0x%x cmd_type = %d\n", io_req->xid, io_req->cmd_type);
 
-	cancel_delayed_work(&io_req->timeout_work);
-
 	xid = io_req->xid;
 	r_ctl = cqe->cqe_info.abts_info.r_ctl;
 
+	/* This was added at a point when we were scheduling abts_compl &
+	 * cleanup_compl on different CPUs and there was a possibility of
+	 * the io_req to be freed from the other context before we got here.
+	 */
+	if (!fcport) {
+		QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_IO,
+			  "Dropping ABTS completion xid=0x%x as fcport is NULL",
+			  io_req->xid);
+		return;
+	}
+
+	/*
+	 * When flush is active, let the cmds be completed from the cleanup
+	 * context
+	 */
+	if (test_bit(QEDF_RPORT_IN_TARGET_RESET, &fcport->flags) ||
+	    test_bit(QEDF_RPORT_IN_LUN_RESET, &fcport->flags)) {
+		QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_IO,
+			  "Dropping ABTS completion xid=0x%x as fcport is flushing",
+			  io_req->xid);
+		return;
+	}
+
+	if (!cancel_delayed_work(&io_req->timeout_work)) {
+		QEDF_ERR(&qedf->dbg_ctx,
+			 "Wasn't able to cancel abts timeout work.\n");
+	}
+
 	switch (r_ctl) {
 	case FC_RCTL_BA_ACC:
 		QEDF_INFO(&(qedf->dbg_ctx), QEDF_LOG_SCSI_TM,
 		    "ABTS response - ACC Send RRQ after R_A_TOV\n");
 		io_req->event = QEDF_IOREQ_EV_ABORT_SUCCESS;
+		rc = kref_get_unless_zero(&io_req->refcount);
+		if (!rc) {
+			QEDF_INFO(&qedf->dbg_ctx, QEDF_LOG_SCSI_TM,
+				  "kref is already zero so ABTS was already completed or flushed xid=0x%x.\n",
+				  io_req->xid);
+			return;
+		}
 		/*
 		 * Dont release this cmd yet. It will be relesed
 		 * after we get RRQ response
 		 */
-		kref_get(&io_req->refcount);
 		queue_delayed_work(qedf->dpc_wq, &io_req->rrq_work,
 		    msecs_to_jiffies(qedf->lport->r_a_tov));
 		break;
@@ -2106,6 +2166,7 @@ static int qedf_execute_tmf(struct qedf_rport *fcport, struct scsi_cmnd *sc_cmd,
 	int rc = 0;
 	uint16_t xid;
 	int tmo = 0;
+	int lun = 0;
 	unsigned long flags;
 	struct fcoe_wqe *sqe;
 	u16 sqe_idx;
@@ -2115,6 +2176,7 @@ static int qedf_execute_tmf(struct qedf_rport *fcport, struct scsi_cmnd *sc_cmd,
 		return FAILED;
 	}
 
+	lun = (int)sc_cmd->device->lun;
 	if (!test_bit(QEDF_RPORT_SESSION_READY, &fcport->flags)) {
 		QEDF_ERR(&(qedf->dbg_ctx), "fcport not offloaded\n");
 		rc = FAILED;
@@ -2141,7 +2203,7 @@ static int qedf_execute_tmf(struct qedf_rport *fcport, struct scsi_cmnd *sc_cmd,
 	io_req->fcport = fcport;
 	io_req->cmd_type = QEDF_TASK_MGMT_CMD;
 
-	/* Set the return CPU to be the same as the request one */
+	/* Record which cpu this request is associated with */
 	io_req->cpu = smp_processor_id();
 
 	/* Set TM flags */
@@ -2150,7 +2212,7 @@ static int qedf_execute_tmf(struct qedf_rport *fcport, struct scsi_cmnd *sc_cmd,
 	io_req->tm_flags = tm_flags;
 
 	/* Default is to return a SCSI command when an error occurs */
-	io_req->return_scsi_cmd_on_abts = true;
+	io_req->return_scsi_cmd_on_abts = false;
 
 	/* Obtain exchange id */
 	xid = io_req->xid;
@@ -2174,12 +2236,16 @@ static int qedf_execute_tmf(struct qedf_rport *fcport, struct scsi_cmnd *sc_cmd,
 
 	spin_unlock_irqrestore(&fcport->rport_lock, flags);
 
+	set_bit(QEDF_CMD_OUTSTANDING, &io_req->flags);
 	tmo = wait_for_completion_timeout(&io_req->tm_done,
 	    QEDF_TM_TIMEOUT * HZ);
 
 	if (!tmo) {
 		rc = FAILED;
 		QEDF_ERR(&(qedf->dbg_ctx), "wait for tm_cmpl timeout!\n");
+		/* Clear outstanding bit since command timed out */
+		clear_bit(QEDF_CMD_OUTSTANDING, &io_req->flags);
+		io_req->sc_cmd = NULL;
 	} else {
 		/* Check TMF response code */
 		if (io_req->fcp_rsp_code == 0)
@@ -2187,14 +2253,25 @@ static int qedf_execute_tmf(struct qedf_rport *fcport, struct scsi_cmnd *sc_cmd,
 		else
 			rc = FAILED;
 	}
+	/*
+	 * Double check that fcport has not gone into an uploading state before
+	 * executing the command flush for the LUN/target.
+	 */
+	if (test_bit(QEDF_RPORT_UPLOADING_CONNECTION, &fcport->flags)) {
+		QEDF_ERR(&qedf->dbg_ctx,
+			 "fcport is uploading, not executing flush.\n");
+		goto no_flush;
+	}
+	/* We do not need this io_req any more */
+	kref_put(&io_req->refcount, qedf_release_cmd);
+
 
 	if (tm_flags == FCP_TMF_LUN_RESET)
-		qedf_flush_active_ios(fcport, (int)sc_cmd->device->lun);
+		qedf_flush_active_ios(fcport, lun);
 	else
 		qedf_flush_active_ios(fcport, -1);
 
-	kref_put(&io_req->refcount, qedf_release_cmd);
-
+no_flush:
 	if (rc != SUCCESS) {
 		QEDF_ERR(&(qedf->dbg_ctx), "task mgmt command failed...\n");
 		rc = FAILED;
@@ -2215,22 +2292,57 @@ int qedf_initiate_tmf(struct scsi_cmnd *sc_cmd, u8 tm_flags)
 	struct fc_lport *lport;
 	int rc = SUCCESS;
 	int rval;
+	struct qedf_ioreq *io_req = NULL;
+	int ref_cnt = 0;
+	struct fc_rport_priv *rdata = fcport->rdata;
 
-	rval = fc_remote_port_chkready(rport);
+	QEDF_ERR(NULL,
+		 "tm_flags 0x%x sc_cmd %p op = 0x%02x target_id = 0x%x lun=%d\n",
+		 tm_flags, sc_cmd, sc_cmd->cmnd[0], rport->scsi_target_id,
+		 (int)sc_cmd->device->lun);
 
+	if (!rdata || !kref_get_unless_zero(&rdata->kref)) {
+		QEDF_ERR(NULL, "stale rport\n");
+		return FAILED;
+	}
+
+	QEDF_ERR(NULL, "portid=%06x tm_flags =%s\n", rdata->ids.port_id,
+		 (tm_flags == FCP_TMF_TGT_RESET) ? "TARGET RESET" :
+		 "LUN RESET");
+
+	if (sc_cmd->SCp.ptr) {
+		io_req = (struct qedf_ioreq *)sc_cmd->SCp.ptr;
+		ref_cnt = kref_read(&io_req->refcount);
+		QEDF_ERR(NULL,
+			 "orig io_req = %p xid = 0x%x ref_cnt = %d.\n",
+			 io_req, io_req->xid, ref_cnt);
+	}
+
+	rval = fc_remote_port_chkready(rport);
 	if (rval) {
 		QEDF_ERR(NULL, "device_reset rport not ready\n");
 		rc = FAILED;
 		goto tmf_err;
 	}
 
-	if (fcport == NULL) {
+	rc = fc_block_scsi_eh(sc_cmd);
+	if (rc)
+		return rc;
+
+	if (!fcport) {
 		QEDF_ERR(NULL, "device_reset: rport is NULL\n");
 		rc = FAILED;
 		goto tmf_err;
 	}
 
 	qedf = fcport->qedf;
+
+	if (!qedf) {
+		QEDF_ERR(NULL, "qedf is NULL.\n");
+		rc = FAILED;
+		goto tmf_err;
+	}
+
 	lport = qedf->lport;
 
 	if (test_bit(QEDF_UNLOADING, &qedf->flags) ||
@@ -2245,6 +2357,12 @@ int qedf_initiate_tmf(struct scsi_cmnd *sc_cmd, u8 tm_flags)
 		goto tmf_err;
 	}
 
+	if (test_bit(QEDF_RPORT_UPLOADING_CONNECTION, &fcport->flags)) {
+		QEDF_ERR(&qedf->dbg_ctx, "fcport is uploading.\n");
+		rc = FAILED;
+		goto tmf_err;
+	}
+
 	rc = qedf_execute_tmf(fcport, sc_cmd, tm_flags);
 
 tmf_err:
@@ -2256,6 +2374,8 @@ void qedf_process_tmf_compl(struct qedf_ctx *qedf, struct fcoe_cqe *cqe,
 {
 	struct fcoe_cqe_rsp_info *fcp_rsp;
 
+	clear_bit(QEDF_CMD_OUTSTANDING, &io_req->flags);
+
 	fcp_rsp = &cqe->cqe_info.rsp_info;
 	qedf_parse_fcp_rsp(io_req, fcp_rsp);
 
diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c
index 8affe0e..3fd8107 100644
--- a/drivers/scsi/qedf/qedf_main.c
+++ b/drivers/scsi/qedf/qedf_main.c
@@ -615,50 +615,113 @@ static u32 qedf_get_login_failures(void *cookie)
 static int qedf_eh_abort(struct scsi_cmnd *sc_cmd)
 {
 	struct fc_rport *rport = starget_to_rport(scsi_target(sc_cmd->device));
-	struct fc_rport_libfc_priv *rp = rport->dd_data;
-	struct qedf_rport *fcport;
 	struct fc_lport *lport;
 	struct qedf_ctx *qedf;
 	struct qedf_ioreq *io_req;
+	struct fc_rport_libfc_priv *rp = rport->dd_data;
+	struct fc_rport_priv *rdata;
+	struct qedf_rport *fcport = NULL;
 	int rc = FAILED;
+	int wait_count = 100;
+	int refcount = 0;
 	int rval;
-
-	if (fc_remote_port_chkready(rport)) {
-		QEDF_ERR(NULL, "rport not ready\n");
-		goto out;
-	}
+	int got_ref = 0;
 
 	lport = shost_priv(sc_cmd->device->host);
 	qedf = (struct qedf_ctx *)lport_priv(lport);
 
-	if ((lport->state != LPORT_ST_READY) || !(lport->link_up)) {
-		QEDF_ERR(&(qedf->dbg_ctx), "link not ready.\n");
+	/* rport and tgt are allocated together, so tgt should be non-NULL */
+	fcport = (struct qedf_rport *)&rp[1];
+	rdata = fcport->rdata;
+	if (!rdata || !kref_get_unless_zero(&rdata->kref)) {
+		QEDF_ERR(&qedf->dbg_ctx, "stale rport, sc_cmd=%p\n", sc_cmd);
+		rc = 1;
 		goto out;
 	}
 
-	fcport = (struct qedf_rport *)&rp[1];
 
 	io_req = (struct qedf_ioreq *)sc_cmd->SCp.ptr;
 	if (!io_req) {
-		QEDF_ERR(&(qedf->dbg_ctx), "io_req is NULL.\n");
+		QEDF_ERR(&qedf->dbg_ctx,
+			 "sc_cmd not queued with lld, sc_cmd=%p op=0x%02x, port_id=%06x\n",
+			 sc_cmd, sc_cmd->cmnd[0],
+			 rdata->ids.port_id);
 		rc = SUCCESS;
-		goto out;
+		goto drop_rdata_kref;
 	}
 
-	QEDF_ERR(&(qedf->dbg_ctx), "Aborting io_req sc_cmd=%p xid=0x%x "
-		  "fp_idx=%d.\n", sc_cmd, io_req->xid, io_req->fp_idx);
+	rval = kref_get_unless_zero(&io_req->refcount);	/* ID: 005 */
+	if (rval)
+		got_ref = 1;
+
+	/* If we got a valid io_req, confirm it belongs to this sc_cmd. */
+	if (!rval || io_req->sc_cmd != sc_cmd) {
+		QEDF_ERR(&qedf->dbg_ctx,
+			 "Freed/Incorrect io_req, io_req->sc_cmd=%p, sc_cmd=%p, port_id=%06x, bailing out.\n",
+			 io_req->sc_cmd, sc_cmd, rdata->ids.port_id);
+
+		goto drop_rdata_kref;
+	}
+
+	if (fc_remote_port_chkready(rport)) {
+		refcount = kref_read(&io_req->refcount);
+		QEDF_ERR(&qedf->dbg_ctx,
+			 "rport not ready, io_req=%p, xid=0x%x sc_cmd=%p op=0x%02x, refcount=%d, port_id=%06x\n",
+			 io_req, io_req->xid, sc_cmd, sc_cmd->cmnd[0],
+			 refcount, rdata->ids.port_id);
+
+		goto drop_rdata_kref;
+	}
+
+	rc = fc_block_scsi_eh(sc_cmd);
+	if (rc)
+		goto drop_rdata_kref;
+
+	if (test_bit(QEDF_RPORT_UPLOADING_CONNECTION, &fcport->flags)) {
+		QEDF_ERR(&qedf->dbg_ctx,
+			 "Connection uploading, xid=0x%x., port_id=%06x\n",
+			 io_req->xid, rdata->ids.port_id);
+		while (io_req->sc_cmd && (wait_count != 0)) {
+			msleep(100);
+			wait_count--;
+		}
+		if (wait_count) {
+			QEDF_ERR(&qedf->dbg_ctx, "ABTS succeeded\n");
+			rc = SUCCESS;
+		} else {
+			QEDF_ERR(&qedf->dbg_ctx, "ABTS failed\n");
+			rc = FAILED;
+		}
+		goto drop_rdata_kref;
+	}
+
+	if (lport->state != LPORT_ST_READY || !(lport->link_up)) {
+		QEDF_ERR(&qedf->dbg_ctx, "link not ready.\n");
+		goto drop_rdata_kref;
+	}
+
+	QEDF_ERR(&qedf->dbg_ctx,
+		 "Aborting io_req=%p sc_cmd=%p xid=0x%x fp_idx=%d, port_id=%06x.\n",
+		 io_req, sc_cmd, io_req->xid, io_req->fp_idx,
+		 rdata->ids.port_id);
 
 	if (qedf->stop_io_on_error) {
 		qedf_stop_all_io(qedf);
 		rc = SUCCESS;
-		goto out;
+		goto drop_rdata_kref;
 	}
 
 	init_completion(&io_req->abts_done);
 	rval = qedf_initiate_abts(io_req, true);
 	if (rval) {
 		QEDF_ERR(&(qedf->dbg_ctx), "Failed to queue ABTS.\n");
-		goto out;
+		/*
+		 * If we fail to queue the ABTS then return this command to
+		 * the SCSI layer as it will own and free the xid
+		 */
+		rc = SUCCESS;
+		qedf_scsi_done(qedf, io_req, DID_ERROR);
+		goto drop_rdata_kref;
 	}
 
 	wait_for_completion(&io_req->abts_done);
@@ -684,19 +747,27 @@ static int qedf_eh_abort(struct scsi_cmnd *sc_cmd)
 		QEDF_ERR(&(qedf->dbg_ctx), "ABTS failed, xid=0x%x.\n",
 			  io_req->xid);
 
+drop_rdata_kref:
+	kref_put(&rdata->kref, fc_rport_destroy);
 out:
+	if (got_ref)
+		kref_put(&io_req->refcount, qedf_release_cmd);
 	return rc;
 }
 
 static int qedf_eh_target_reset(struct scsi_cmnd *sc_cmd)
 {
-	QEDF_ERR(NULL, "TARGET RESET Issued...");
+	QEDF_ERR(NULL, "%d:0:%d:%lld: TARGET RESET Issued...",
+		 sc_cmd->device->host->host_no, sc_cmd->device->id,
+		 sc_cmd->device->lun);
 	return qedf_initiate_tmf(sc_cmd, FCP_TMF_TGT_RESET);
 }
 
 static int qedf_eh_device_reset(struct scsi_cmnd *sc_cmd)
 {
-	QEDF_ERR(NULL, "LUN RESET Issued...\n");
+	QEDF_ERR(NULL, "%d:0:%d:%lld: LUN RESET Issued... ",
+		 sc_cmd->device->host->host_no, sc_cmd->device->id,
+		 sc_cmd->device->lun);
 	return qedf_initiate_tmf(sc_cmd, FCP_TMF_LUN_RESET);
 }
 
@@ -740,22 +811,6 @@ static int qedf_eh_host_reset(struct scsi_cmnd *sc_cmd)
 {
 	struct fc_lport *lport;
 	struct qedf_ctx *qedf;
-	struct fc_rport *rport = starget_to_rport(scsi_target(sc_cmd->device));
-	struct fc_rport_libfc_priv *rp = rport->dd_data;
-	struct qedf_rport *fcport = (struct qedf_rport *)&rp[1];
-	int rval;
-
-	rval = fc_remote_port_chkready(rport);
-
-	if (rval) {
-		QEDF_ERR(NULL, "device_reset rport not ready\n");
-		return FAILED;
-	}
-
-	if (fcport == NULL) {
-		QEDF_ERR(NULL, "device_reset: rport is NULL\n");
-		return FAILED;
-	}
 
 	lport = shost_priv(sc_cmd->device->host);
 	qedf = lport_priv(lport);
@@ -3002,6 +3057,7 @@ static int __qedf_probe(struct pci_dev *pdev, int mode)
 		pci_set_drvdata(pdev, qedf);
 		init_completion(&qedf->fipvlan_compl);
 		mutex_init(&qedf->stats_mutex);
+		mutex_init(&qedf->flush_mutex);
 
 		QEDF_INFO(&(qedf->dbg_ctx), QEDF_LOG_INFO,
 		   "QLogic FastLinQ FCoE Module qedf %s, "