From patchwork Fri Aug 31 04:34:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suganath Prabu S X-Patchwork-Id: 10583107 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 75436920 for ; Fri, 31 Aug 2018 04:35:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6564A2C18F for ; Fri, 31 Aug 2018 04:35:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 597DE2C1D8; Fri, 31 Aug 2018 04:35:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 78F6C2C18F for ; Fri, 31 Aug 2018 04:35:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727207AbeHaIkq (ORCPT ); Fri, 31 Aug 2018 04:40:46 -0400 Received: from mail-wm0-f66.google.com ([74.125.82.66]:51156 "EHLO mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726660AbeHaIkq (ORCPT ); Fri, 31 Aug 2018 04:40:46 -0400 Received: by mail-wm0-f66.google.com with SMTP id s12-v6so3884588wmc.0 for ; Thu, 30 Aug 2018 21:35:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=d+ciiSHeZp3R3Bt8isCsC2o69svRUrpIgsGj3GYAPgY=; b=O6MNkW3eq4xAmC4r/oCvWCa9SRrj5TGA8CeqAtnJxFbzYAxT8KOvFAHoIXYodPBbV2 Z7XuanMD0HFEBGV3gnR7cIVQNL5Fxdmr+shgFIQs60+b5QSlz2aEOOznJlfv3VQFzf9f P8IhFLDuQdSHDPT9gQHVgM64hzmybqiSBSXBU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=d+ciiSHeZp3R3Bt8isCsC2o69svRUrpIgsGj3GYAPgY=; b=WP+FEUfuk4Ut1gdog6REJzL1AvPIGggCthcomNH9tpS92LRBGS25niKNLq2Cd/iLSC 9zWVKK90mow3FvlWtDhKLe3Y5a9Rx/YkJoqQtwhiooynEHV2h/gghXUzqFdpTGyz3vRG wT88PRG3qIMRyNRtWJ+stRB2PLjbGQeCkcEI3mF/MaFYGLLAJA2TkHSnrsqhKOO07ZYv 01/ByES5zj0mLKmA1shYplp6R7PhFm09KwYcDrAjSAsbF7/QyCgZuRrTBYw+KyVcZF6m wCKhWUeP8WtrJQInUmkGFPa2H8gxd2xmV1JJ1VB9epEG6mz/jYRH5r9MSkK+cbxasoMB BQ3g== X-Gm-Message-State: APzg51DKT7mThEt/99a1VPry6JcdD4YOdH9yO2TmZlDVeLJPgXiNTWEK T4x9PIrJrjH2voFdfqHyE1Eu910F4pA= X-Google-Smtp-Source: ANB0VdYxQ/RWadfLNVCn7ZT6Bp+ofci0me5gwoUiWJxPqQqO8wkd4TiQExjosr6PEk1ZiTTXDyomPw== X-Received: by 2002:a1c:2787:: with SMTP id n129-v6mr3461281wmn.101.1535690110068; Thu, 30 Aug 2018 21:35:10 -0700 (PDT) Received: from localhost.localdomain.localdomain ([192.19.234.250]) by smtp.gmail.com with ESMTPSA id h15-v6sm2261735wmb.21.2018.08.30.21.35.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 30 Aug 2018 21:35:09 -0700 (PDT) From: Suganath Prabu S To: linux-scsi@vger.kernel.org Cc: Sathya.Prakash@broadcom.com, sreekanth.reddy@broadcom.com, Suganath Prabu S Subject: [Patch v1 2/7] mpt3sas: Add HBA hot plug watchdog thread. Date: Fri, 31 Aug 2018 00:34:36 -0400 Message-Id: <1535690081-16116-3-git-send-email-suganath-prabu.subramani@broadcom.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1535690081-16116-1-git-send-email-suganath-prabu.subramani@broadcom.com> References: <1535690081-16116-1-git-send-email-suganath-prabu.subramani@broadcom.com> Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP During driver load create a hba hot unplug watchdog thread "_base_hba_hot_unplug_work". This will poll whether HBA device is unplugged or not by reading IOC's vendor field in IOC's PCI configuration space for every one second. If hot unplug is detected, it terminates all the outstanding IOs and hence kernels's PCIe hotplug module (i.e. pciehp) will clear the instances of the hot unplugged PCI device. Below functions starts and stops the watchdog. mpt3sas_base_start_hba_unplug_watchdog mpt3sas_base_stop_hba_unplug_watchdog Watchdog thread starts immediately once IOC becomes operational. Signed-off-by: Suganath Prabu S --- drivers/scsi/mpt3sas/mpt3sas_base.c | 92 +++++++++++++++++++++++++++++++++++- drivers/scsi/mpt3sas/mpt3sas_base.h | 6 +++ drivers/scsi/mpt3sas/mpt3sas_scsih.c | 7 +++ 3 files changed, 104 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 8b33670..6c8a30f 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -69,6 +69,7 @@ static MPT_CALLBACK mpt_callbacks[MPT_MAX_CALLBACKS]; #define FAULT_POLLING_INTERVAL 1000 /* in milliseconds */ +#define HBA_HOTUNPLUG_POLLING_INTERVAL 1000 /* in milliseconds */ /* maximum controller queue depth */ #define MAX_HBA_QUEUE_DEPTH 30000 @@ -672,6 +673,46 @@ _base_fault_reset_work(struct work_struct *work) spin_unlock_irqrestore(&ioc->ioc_reset_in_progress_lock, flags); } +static void +_base_hba_hot_unplug_work(struct work_struct *work) +{ + struct MPT3SAS_ADAPTER *ioc = + container_of(work, struct MPT3SAS_ADAPTER, + hba_hot_unplug_work.work); + unsigned long flags; + + spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags); + if (ioc->shost_recovery || ioc->pci_error_recovery) + goto rearm_timer; + + if (mpt3sas_base_pci_device_is_unplugged(ioc)) { + if (ioc->remove_host) { + pr_err(MPT3SAS_FMT + "The IOC seems hot unplugged and the driver is " + "waiting for pciehp module to remove the PCIe " + "device instance associated with IOC!!!\n", + ioc->name); + goto rearm_timer; + } + + /* Set remove_host flag here, since kernel will invoke driver's + * .remove() callback function one after the other for all hot + * un-plugged devices, so it may take some time to call + * .remove() function for subsequent hot un-plugged + * PCI devices. + */ + ioc->remove_host = 1; + } + +rearm_timer: + if (ioc->hba_hot_unplug_work_q) + queue_delayed_work(ioc->hba_hot_unplug_work_q, + &ioc->hba_hot_unplug_work, + msecs_to_jiffies(HBA_HOTUNPLUG_POLLING_INTERVAL)); + spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags); +} + + /** * mpt3sas_base_start_watchdog - start the fault_reset_work_q * @ioc: per adapter object @@ -730,6 +771,54 @@ mpt3sas_base_stop_watchdog(struct MPT3SAS_ADAPTER *ioc) } } +void +mpt3sas_base_start_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc) +{ + unsigned long flags; + + if (ioc->hba_hot_unplug_work_q) + return; + + /* Initialize hba hot unplug polling */ + INIT_DELAYED_WORK(&ioc->hba_hot_unplug_work, + _base_hba_hot_unplug_work); + snprintf(ioc->hba_hot_unplug_work_q_name, + sizeof(ioc->hba_hot_unplug_work_q_name), "poll_%s%d_hba_unplug", + ioc->driver_name, ioc->id); + ioc->hba_hot_unplug_work_q = + create_singlethread_workqueue(ioc->hba_hot_unplug_work_q_name); + if (!ioc->hba_hot_unplug_work_q) { + pr_err(MPT3SAS_FMT "%s: failed (line=%d)\n", + ioc->name, __func__, __LINE__); + return; + } + + spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags); + if (ioc->hba_hot_unplug_work_q) + queue_delayed_work(ioc->hba_hot_unplug_work_q, + &ioc->hba_hot_unplug_work, + msecs_to_jiffies(HBA_HOTUNPLUG_POLLING_INTERVAL)); + spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags); +} + +void +mpt3sas_base_stop_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc) +{ + unsigned long flags; + struct workqueue_struct *wq; + + spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags); + wq = ioc->hba_hot_unplug_work_q; + ioc->hba_hot_unplug_work_q = NULL; + spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags); + + if (wq) { + if (!cancel_delayed_work_sync(&ioc->hba_hot_unplug_work)) + flush_workqueue(wq); + destroy_workqueue(wq); + } +} + /** * mpt3sas_base_fault_info - verbose translation of firmware FAULT code * @ioc: per adapter object @@ -6458,7 +6547,7 @@ _base_make_ioc_operational(struct MPT3SAS_ADAPTER *ioc) } skip_init_reply_post_host_index: - + mpt3sas_base_start_hba_unplug_watchdog(ioc); _base_unmask_interrupts(ioc); if (ioc->hba_mpi_version_belonged != MPI2_VERSION) { @@ -6789,6 +6878,7 @@ mpt3sas_base_detach(struct MPT3SAS_ADAPTER *ioc) __func__)); mpt3sas_base_stop_watchdog(ioc); + mpt3sas_base_stop_hba_unplug_watchdog(ioc); mpt3sas_base_free_resources(ioc); _base_release_memory_pools(ioc); mpt3sas_free_enclosure_list(ioc); diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 8ee3ba7..4186bc9 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -1140,8 +1140,11 @@ struct MPT3SAS_ADAPTER { /* fw fault handler */ char fault_reset_work_q_name[20]; + char hba_hot_unplug_work_q_name[20]; struct workqueue_struct *fault_reset_work_q; + struct workqueue_struct *hba_hot_unplug_work_q; struct delayed_work fault_reset_work; + struct delayed_work hba_hot_unplug_work; /* fw event handler */ char firmware_event_name[20]; @@ -1158,6 +1161,7 @@ struct MPT3SAS_ADAPTER { struct mutex reset_in_progress_mutex; spinlock_t ioc_reset_in_progress_lock; + spinlock_t hba_hot_unplug_lock; u8 ioc_link_reset_in_progress; u8 ignore_loginfos; @@ -1482,6 +1486,8 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc); u8 mpt3sas_base_check_cmd_timeout(struct MPT3SAS_ADAPTER *ioc, u8 status, void *mpi_request, int sz); +void mpt3sas_base_start_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc); +void mpt3sas_base_stop_hba_unplug_watchdog(struct MPT3SAS_ADAPTER *ioc); /* scsih shared API */ struct scsi_cmnd *mpt3sas_scsih_scsi_lookup_get(struct MPT3SAS_ADAPTER *ioc, u16 smid); diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index eeee9da..7e0c4ec 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -9828,9 +9828,11 @@ static void scsih_remove(struct pci_dev *pdev) ioc->remove_host = 1; mpt3sas_wait_for_commands_to_complete(ioc); + spin_lock_irqsave(&ioc->hba_hot_unplug_lock, flags); _scsih_flush_running_cmds(ioc); _scsih_fw_event_cleanup_queue(ioc); + spin_unlock_irqrestore(&ioc->hba_hot_unplug_lock, flags); spin_lock_irqsave(&ioc->fw_event_lock, flags); wq = ioc->firmware_event_thread; @@ -10724,6 +10726,7 @@ scsih_suspend(struct pci_dev *pdev, pm_message_t state) pci_power_t device_state; mpt3sas_base_stop_watchdog(ioc); + mpt3sas_base_stop_hba_unplug_watchdog(ioc); flush_scheduled_work(); scsi_block_requests(shost); device_state = pci_choose_state(pdev, state); @@ -10766,6 +10769,7 @@ scsih_resume(struct pci_dev *pdev) mpt3sas_base_hard_reset_handler(ioc, SOFT_RESET); scsi_unblock_requests(shost); mpt3sas_base_start_watchdog(ioc); + mpt3sas_base_start_hba_unplug_watchdog(ioc); return 0; } #endif /* CONFIG_PM */ @@ -10796,12 +10800,14 @@ scsih_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t state) ioc->pci_error_recovery = 1; scsi_block_requests(ioc->shost); mpt3sas_base_stop_watchdog(ioc); + mpt3sas_base_stop_hba_unplug_watchdog(ioc); mpt3sas_base_free_resources(ioc); return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: /* Permanent error, prepare for device removal */ ioc->pci_error_recovery = 1; mpt3sas_base_stop_watchdog(ioc); + mpt3sas_base_stop_hba_unplug_watchdog(ioc); _scsih_flush_running_cmds(ioc); return PCI_ERS_RESULT_DISCONNECT; } @@ -10862,6 +10868,7 @@ scsih_pci_resume(struct pci_dev *pdev) pci_cleanup_aer_uncorrect_error_status(pdev); mpt3sas_base_start_watchdog(ioc); + mpt3sas_base_start_hba_unplug_watchdog(ioc); scsi_unblock_requests(ioc->shost); }