From patchwork Thu Jun 27 10:17:34 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sumit Saxena
X-Patchwork-Id: 13714121
From: Sumit Saxena
To: martin.petersen@oracle.com, helgaas@kernel.org,
	sathya.prakash@broadcom.com, sumit.saxena@broadcom.com,
	chandrakanth.patil@broadcom.com, ranjan.kumar@broadcom.com,
	prayas.patel@broadcom.com
Cc: linux-scsi@vger.kernel.org, linux-pci@vger.kernel.org
Subject: [PATCH v5 2/3] mpi3mr: Prevent PCI writes from driver during PCI error recovery
Date: Thu, 27 Jun 2024 15:47:34 +0530
Message-Id: <20240627101735.18286-3-sumit.saxena@broadcom.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240627101735.18286-1-sumit.saxena@broadcom.com>
References: <20240627101735.18286-1-sumit.saxena@broadcom.com>
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Prevent interaction with the hardware while error recovery is in progress.

Signed-off-by: Sathya Prakash
Signed-off-by: Ranjan Kumar
Signed-off-by: Sumit Saxena
---
 drivers/scsi/mpi3mr/mpi3mr.h           |  1 +
 drivers/scsi/mpi3mr/mpi3mr_app.c       | 10 ++++--
 drivers/scsi/mpi3mr/mpi3mr_fw.c        | 22 +++++++++---
 drivers/scsi/mpi3mr/mpi3mr_os.c        | 49 +++++++++++++++++++++++---
 drivers/scsi/mpi3mr/mpi3mr_transport.c | 39 +++++++++++++++++---
 5 files changed, 104 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/mpi3mr/mpi3mr.h b/drivers/scsi/mpi3mr/mpi3mr.h
index 2b1d5645ba9b..e99bb8ec428c 100644
--- a/drivers/scsi/mpi3mr/mpi3mr.h
+++ b/drivers/scsi/mpi3mr/mpi3mr.h
@@ -519,6 +519,7 @@ struct mpi3mr_throttle_group_info {
 
 /* HBA port flags */
 #define MPI3MR_HBA_PORT_FLAG_DIRTY	0x01
+#define MPI3MR_HBA_PORT_FLAG_NEW	0x02
 
 /* IOCTL data transfer sge*/
 #define MPI3MR_NUM_IOCTL_SGE		256
diff --git a/drivers/scsi/mpi3mr/mpi3mr_app.c b/drivers/scsi/mpi3mr/mpi3mr_app.c
index f73f265c7921..c369f58fc93a 100644
--- a/drivers/scsi/mpi3mr/mpi3mr_app.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_app.c
@@ -846,7 +846,7 @@ static int mpi3mr_bsg_pel_abort(struct mpi3mr_ioc *mrioc)
 		dprint_bsg_err(mrioc, "%s: reset in progress\n", __func__);
 		return -1;
 	}
-	if (mrioc->stop_bsgs) {
+	if (mrioc->stop_bsgs || mrioc->block_on_pci_err) {
 		dprint_bsg_err(mrioc, "%s: bsgs are blocked\n", __func__);
 		return -1;
 	}
@@ -1492,6 +1492,9 @@ static long mpi3mr_bsg_adp_reset(struct mpi3mr_ioc *mrioc,
 		goto out;
 	}
 
+	if (mrioc->unrecoverable || mrioc->block_on_pci_err)
+		return -EINVAL;
+
 	sg_copy_to_buffer(job->request_payload.sg_list,
 			  job->request_payload.sg_cnt,
 			  &adpreset, sizeof(adpreset));
@@ -2575,7 +2578,7 @@ static long mpi3mr_bsg_process_mpt_cmds(struct bsg_job *job)
 		mutex_unlock(&mrioc->bsg_cmds.mutex);
 		goto out;
 	}
-	if (mrioc->stop_bsgs) {
+	if (mrioc->stop_bsgs || mrioc->block_on_pci_err) {
 		dprint_bsg_err(mrioc, "%s: bsgs are blocked\n", __func__);
 		rval = -EAGAIN;
 		mutex_unlock(&mrioc->bsg_cmds.mutex);
@@ -3108,7 +3111,8 @@ adp_state_show(struct device *dev, struct device_attribute *attr,
 	ioc_state = mpi3mr_get_iocstate(mrioc);
 	if (ioc_state == MRIOC_STATE_UNRECOVERABLE)
 		adp_state = MPI3MR_BSG_ADPSTATE_UNRECOVERABLE;
-	else if ((mrioc->reset_in_progress) || (mrioc->stop_bsgs))
+	else if (mrioc->reset_in_progress || mrioc->stop_bsgs ||
+		 mrioc->block_on_pci_err)
 		adp_state = MPI3MR_BSG_ADPSTATE_IN_RESET;
 	else if (ioc_state == MRIOC_STATE_FAULT)
 		adp_state = MPI3MR_BSG_ADPSTATE_FAULT;
diff --git a/drivers/scsi/mpi3mr/mpi3mr_fw.c b/drivers/scsi/mpi3mr/mpi3mr_fw.c
index 458c856dda4b..c196dc14ad20 100644
--- a/drivers/scsi/mpi3mr/mpi3mr_fw.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c
@@ -608,7 +608,7 @@ int mpi3mr_blk_mq_poll(struct Scsi_Host *shost, unsigned int queue_num)
 	mrioc = (struct mpi3mr_ioc *)shost->hostdata;
 
 	if ((mrioc->reset_in_progress || mrioc->prepare_for_reset ||
-	    mrioc->unrecoverable))
+	    mrioc->unrecoverable || mrioc->pci_err_recovery))
 		return 0;
 
 	num_entries = mpi3mr_process_op_reply_q(mrioc,
@@ -1693,6 +1693,12 @@ int mpi3mr_admin_request_post(struct mpi3mr_ioc *mrioc, void *admin_req,
 		retval = -EAGAIN;
 		goto out;
 	}
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "admin request queue submission failed due to pci error recovery in progress\n");
+		retval = -EAGAIN;
+		goto out;
+	}
+
 	areq_entry = (u8 *)mrioc->admin_req_base +
 	    (areq_pi * MPI3MR_ADMIN_REQ_FRAME_SZ);
 	memset(areq_entry, 0, MPI3MR_ADMIN_REQ_FRAME_SZ);
@@ -2363,6 +2369,11 @@ int mpi3mr_op_request_post(struct mpi3mr_ioc *mrioc,
 		retval = -EAGAIN;
 		goto out;
 	}
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "operational request queue submission failed due to pci error recovery in progress\n");
+		retval = -EAGAIN;
+		goto out;
+	}
 
 	segment_base_addr = segments[pi / op_req_q->segment_qd].segment;
 	req_entry = (u8 *)segment_base_addr +
@@ -2627,7 +2638,7 @@ static void mpi3mr_watchdog_work(struct work_struct *work)
 	union mpi3mr_trigger_data trigger_data;
 	u16 reset_reason = MPI3MR_RESET_FROM_FAULT_WATCH;
 
-	if (mrioc->reset_in_progress)
+	if (mrioc->reset_in_progress || mrioc->pci_err_recovery)
 		return;
 
 	if (!mrioc->unrecoverable && !pci_device_is_present(mrioc->pdev)) {
@@ -4268,7 +4279,7 @@ int mpi3mr_reinit_ioc(struct mpi3mr_ioc *mrioc, u8 is_resume)
 		goto out_failed_noretry;
 	}
 
-	if (is_resume) {
+	if (is_resume || mrioc->block_on_pci_err) {
 		dprint_reset(mrioc, "setting up single ISR\n");
 		retval = mpi3mr_setup_isr(mrioc, 1);
 		if (retval) {
@@ -4319,7 +4330,7 @@ int mpi3mr_reinit_ioc(struct mpi3mr_ioc *mrioc, u8 is_resume)
 		goto out_failed;
 	}
 
-	if (is_resume) {
+	if (is_resume || mrioc->block_on_pci_err) {
 		dprint_reset(mrioc, "setting up multiple ISR\n");
 		retval = mpi3mr_setup_isr(mrioc, 0);
 		if (retval) {
@@ -4807,7 +4818,8 @@ void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc)
 
 	ioc_state = mpi3mr_get_iocstate(mrioc);
 
-	if ((!mrioc->unrecoverable) && (!mrioc->reset_in_progress) &&
+	if (!mrioc->unrecoverable && !mrioc->reset_in_progress &&
+	    !mrioc->pci_err_recovery &&
 	    (ioc_state == MRIOC_STATE_READY)) {
 		if (mpi3mr_issue_and_process_mur(mrioc,
 		    MPI3MR_RESET_FROM_CTLR_CLEANUP))
diff --git a/drivers/scsi/mpi3mr/mpi3mr_os.c b/drivers/scsi/mpi3mr/mpi3mr_os.c
index 0986b362e5f0..69b14918de59 100644
--- a/drivers/scsi/mpi3mr/mpi3mr_os.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_os.c
@@ -956,7 +956,7 @@ static int mpi3mr_report_tgtdev_to_host(struct mpi3mr_ioc *mrioc,
 	int retval = 0;
 	struct mpi3mr_tgt_dev *tgtdev;
 
-	if (mrioc->reset_in_progress)
+	if (mrioc->reset_in_progress || mrioc->pci_err_recovery)
 		return -1;
 
 	tgtdev = mpi3mr_get_tgtdev_by_perst_id(mrioc, perst_id);
@@ -2007,6 +2007,7 @@ static void mpi3mr_fwevt_bh(struct mpi3mr_ioc *mrioc,
 	struct mpi3_device_page0 *dev_pg0 = NULL;
 	u16 perst_id, handle, dev_info;
 	struct mpi3_device0_sas_sata_format *sasinf = NULL;
+	unsigned int timeout;
 
 	mpi3mr_fwevt_del_from_list(mrioc, fwevt);
 	mrioc->current_event = fwevt;
@@ -2097,8 +2098,18 @@ static void mpi3mr_fwevt_bh(struct mpi3mr_ioc *mrioc,
 	}
 	case MPI3_EVENT_WAIT_FOR_DEVICES_TO_REFRESH:
 	{
-		while (mrioc->device_refresh_on)
+		timeout = MPI3MR_RESET_TIMEOUT * 2;
+		while ((mrioc->device_refresh_on || mrioc->block_on_pci_err) &&
+		    !mrioc->unrecoverable && !mrioc->pci_err_recovery) {
 			msleep(500);
+			if (!timeout--) {
+				mrioc->unrecoverable = 1;
+				break;
+			}
+		}
+
+		if (mrioc->unrecoverable || mrioc->pci_err_recovery)
+			break;
 
 		dprint_event_bh(mrioc,
 		    "scan for non responding and newly added devices after soft reset started\n");
@@ -3796,6 +3807,13 @@ int mpi3mr_issue_tm(struct mpi3mr_ioc *mrioc, u8 tm_type,
 		mutex_unlock(&drv_cmd->mutex);
 		goto out;
 	}
+	if (mrioc->block_on_pci_err) {
+		retval = -1;
+		dprint_tm(mrioc, "sending task management failed due to\n"
+				"pci error recovery in progress\n");
+		mutex_unlock(&drv_cmd->mutex);
+		goto out;
+	}
 
 	drv_cmd->state = MPI3MR_CMD_PENDING;
 	drv_cmd->is_waiting = 1;
@@ -4181,6 +4199,7 @@ static int mpi3mr_eh_bus_reset(struct scsi_cmnd *scmd)
 	struct mpi3mr_sdev_priv_data *sdev_priv_data;
 	u8 dev_type = MPI3_DEVICE_DEVFORM_VD;
 	int retval = FAILED;
+	unsigned int timeout = MPI3MR_RESET_TIMEOUT;
 
 	sdev_priv_data = scmd->device->hostdata;
 	if (sdev_priv_data && sdev_priv_data->tgt_priv_data) {
@@ -4191,12 +4210,24 @@ static int mpi3mr_eh_bus_reset(struct scsi_cmnd *scmd)
 	if (dev_type == MPI3_DEVICE_DEVFORM_VD) {
 		mpi3mr_wait_for_host_io(mrioc,
 			MPI3MR_RAID_ERRREC_RESET_TIMEOUT);
-		if (!mpi3mr_get_fw_pending_ios(mrioc))
+		if (!mpi3mr_get_fw_pending_ios(mrioc)) {
+			while (mrioc->reset_in_progress ||
+			       mrioc->prepare_for_reset ||
+			       mrioc->block_on_pci_err) {
+				ssleep(1);
+				if (!timeout--) {
+					retval = FAILED;
+					goto out;
+				}
+			}
 			retval = SUCCESS;
+			goto out;
+		}
 	}
 	if (retval == FAILED)
 		mpi3mr_print_pending_host_io(mrioc);
 
+out:
 	sdev_printk(KERN_INFO, scmd->device,
 		"Bus reset is %s for scmd(%p)\n",
 		((retval == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
@@ -4879,7 +4910,8 @@ static int mpi3mr_qcmd(struct Scsi_Host *shost,
 		goto out;
 	}
 
-	if (mrioc->reset_in_progress) {
+	if (mrioc->reset_in_progress || mrioc->prepare_for_reset
+	    || mrioc->block_on_pci_err) {
 		retval = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	}
@@ -5362,7 +5394,14 @@ static void mpi3mr_remove(struct pci_dev *pdev)
 	while (mrioc->reset_in_progress || mrioc->is_driver_loading)
 		ssleep(1);
 
-	if (!pci_device_is_present(mrioc->pdev)) {
+	if (mrioc->block_on_pci_err) {
+		mrioc->block_on_pci_err = false;
+		scsi_unblock_requests(shost);
+		mrioc->unrecoverable = 1;
+	}
+
+	if (!pci_device_is_present(mrioc->pdev) ||
+	    mrioc->pci_err_recovery) {
 		mrioc->unrecoverable = 1;
 		mpi3mr_flush_cmds_for_unrecovered_controller(mrioc);
 	}
diff --git a/drivers/scsi/mpi3mr/mpi3mr_transport.c b/drivers/scsi/mpi3mr/mpi3mr_transport.c
index 329cc6ec3b58..8612780f6e9e 100644
--- a/drivers/scsi/mpi3mr/mpi3mr_transport.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_transport.c
@@ -151,6 +151,11 @@ static int mpi3mr_report_manufacture(struct mpi3mr_ioc *mrioc,
 		return -EFAULT;
 	}
 
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "%s: pci error recovery in progress!\n", __func__);
+		return -EFAULT;
+	}
+
 	data_out_sz = sizeof(struct rep_manu_request);
 	data_in_sz = sizeof(struct rep_manu_reply);
 	data_out = dma_alloc_coherent(&mrioc->pdev->dev,
@@ -790,6 +795,12 @@ static int mpi3mr_set_identify(struct mpi3mr_ioc *mrioc, u16 handle,
 		return -EFAULT;
 	}
 
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "%s: pci error recovery in progress!\n",
+		    __func__);
+		return -EFAULT;
+	}
+
 	if ((mpi3mr_cfg_get_dev_pg0(mrioc, &ioc_status, &device_pg0,
 	    sizeof(device_pg0), MPI3_DEVICE_PGAD_FORM_HANDLE, handle))) {
 		ioc_err(mrioc, "%s: device page0 read failed\n", __func__);
@@ -1007,6 +1018,9 @@ mpi3mr_alloc_hba_port(struct mpi3mr_ioc *mrioc, u16 port_id)
 	hba_port->port_id = port_id;
 	ioc_info(mrioc, "hba_port entry: %p, port: %d is added to hba_port list\n",
 	    hba_port, hba_port->port_id);
+	if (mrioc->reset_in_progress ||
+	    mrioc->pci_err_recovery)
+		hba_port->flags = MPI3MR_HBA_PORT_FLAG_NEW;
 	list_add_tail(&hba_port->list, &mrioc->hba_port_table_list);
 	return hba_port;
 }
@@ -1055,7 +1069,7 @@ void mpi3mr_update_links(struct mpi3mr_ioc *mrioc,
 	struct mpi3mr_sas_node *mr_sas_node;
 	struct mpi3mr_sas_phy *mr_sas_phy;
 
-	if (mrioc->reset_in_progress)
+	if (mrioc->reset_in_progress || mrioc->pci_err_recovery)
 		return;
 
 	spin_lock_irqsave(&mrioc->sas_node_lock, flags);
@@ -1978,7 +1992,7 @@ int mpi3mr_expander_add(struct mpi3mr_ioc *mrioc, u16 handle)
 	if (!handle)
 		return -1;
 
-	if (mrioc->reset_in_progress)
+	if (mrioc->reset_in_progress || mrioc->pci_err_recovery)
 		return -1;
 
 	if ((mpi3mr_cfg_get_sas_exp_pg0(mrioc, &ioc_status, &expander_pg0,
@@ -2184,7 +2198,7 @@ void mpi3mr_expander_node_remove(struct mpi3mr_ioc *mrioc,
 	/* remove sibling ports attached to this expander */
 	list_for_each_entry_safe(mr_sas_port, next,
 	   &sas_expander->sas_port_list, port_list) {
-		if (mrioc->reset_in_progress)
+		if (mrioc->reset_in_progress || mrioc->pci_err_recovery)
 			return;
 		if (mr_sas_port->remote_identify.device_type ==
 		    SAS_END_DEVICE)
@@ -2234,7 +2248,7 @@ void mpi3mr_expander_remove(struct mpi3mr_ioc *mrioc, u64 sas_address,
 	struct mpi3mr_sas_node *sas_expander;
 	unsigned long flags;
 
-	if (mrioc->reset_in_progress)
+	if (mrioc->reset_in_progress || mrioc->pci_err_recovery)
 		return;
 
 	if (!hba_port)
@@ -2545,6 +2559,11 @@ static int mpi3mr_get_expander_phy_error_log(struct mpi3mr_ioc *mrioc,
 		return -EFAULT;
 	}
 
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "%s: pci error recovery in progress!\n", __func__);
+		return -EFAULT;
+	}
+
 	data_out_sz = sizeof(struct phy_error_log_request);
 	data_in_sz = sizeof(struct phy_error_log_reply);
 	sz = data_out_sz + data_in_sz;
@@ -2804,6 +2823,12 @@ mpi3mr_expander_phy_control(struct mpi3mr_ioc *mrioc,
 		return -EFAULT;
 	}
 
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "%s: pci error recovery in progress!\n",
+		    __func__);
+		return -EFAULT;
+	}
+
 	data_out_sz = sizeof(struct phy_control_request);
 	data_in_sz = sizeof(struct phy_control_reply);
 	sz = data_out_sz + data_in_sz;
@@ -3227,6 +3252,12 @@ mpi3mr_transport_smp_handler(struct bsg_job *job, struct Scsi_Host *shost,
 		goto out;
 	}
 
+	if (mrioc->pci_err_recovery) {
+		ioc_err(mrioc, "%s: pci error recovery in progress!\n", __func__);
+		rc = -EFAULT;
+		goto out;
+	}
+
 	rc = mpi3mr_map_smp_buffer(&mrioc->pdev->dev, &job->request_payload,
 	    &dma_addr_out, &dma_len_out, &addr_out);
 	if (rc)
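
Note on the pattern the hunks above repeat: each one adds an early exit that tests a flag before the driver would touch the controller. The standalone sketch below models only that check, in user space. `struct toy_ioc`, `toy_request_post()`, `toy_bsg_submit()` and the `doorbell_writes` counter are hypothetical names invented for illustration and are not part of the driver; only the two flag names and the -EAGAIN return come from the diff. This patch does not show where the flags are set, so the sketch leaves that to the caller.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct mpi3mr_ioc flags tested above. */
struct toy_ioc {
	bool pci_err_recovery;        /* tested on paths that would reach hardware */
	bool block_on_pci_err;        /* tested on bsg, task management and queuecommand paths */
	unsigned int doorbell_writes; /* counts simulated register writes */
};

/* Same shape as the check added to the request post paths: bail out with
 * -EAGAIN before anything is written to the (simulated) device. */
static int toy_request_post(struct toy_ioc *ioc)
{
	if (ioc->pci_err_recovery)
		return -EAGAIN;

	ioc->doorbell_writes++;
	return 0;
}

/* Same shape as the "stop_bsgs || block_on_pci_err" test in the bsg path:
 * the request is rejected one layer above the post routine. */
static int toy_bsg_submit(struct toy_ioc *ioc)
{
	if (ioc->block_on_pci_err)
		return -EAGAIN;

	return toy_request_post(ioc);
}
```

With both flags clear, `toy_bsg_submit()` reaches `toy_request_post()` and the counter goes up by one; with either flag set, the call returns -EAGAIN and the counter is left alone, which is the property the commit message asks for.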