From patchwork Mon Jul 23 22:24:22 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Williamson
X-Patchwork-Id: 10541245
X-Patchwork-Delegate: bhelgaas@google.com
Subject: [PATCH 1/2] PCI: Export pcie_has_flr()
From: Alex Williamson
To: linux-pci@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org
Date: Mon, 23 Jul 2018 16:24:22 -0600
Message-ID: <20180723222422.4371.68674.stgit@gimli.home>
In-Reply-To: <20180723221533.4371.90064.stgit@gimli.home>
References: <20180723221533.4371.90064.stgit@gimli.home>
User-Agent: StGit/0.18-102-gdf9f

pcie_flr() suggests calling pcie_has_flr() first to ensure that PCIe FLR
support is present. pcie_flr() is exported while pcie_has_flr() is not.
Export pcie_has_flr() as well.

Signed-off-by: Alex Williamson
---
 drivers/pci/pci.c   |    3 ++-
 include/linux/pci.h |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2bec76c9d9a7..52fe2d72a99c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4071,7 +4071,7 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
  * Returns true if the device advertises support for PCIe function level
  * resets.
  */
-static bool pcie_has_flr(struct pci_dev *dev)
+bool pcie_has_flr(struct pci_dev *dev)
 {
 	u32 cap;

@@ -4081,6 +4081,7 @@ static bool pcie_has_flr(struct pci_dev *dev)
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
 	return cap & PCI_EXP_DEVCAP_FLR;
 }
+EXPORT_SYMBOL_GPL(pcie_has_flr);

 /**
  * pcie_flr - initiate a PCIe function level reset
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 04c7ea6ed67b..bbe030d7814f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1092,6 +1092,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 			     enum pci_bus_speed *speed,
 			     enum pcie_link_width *width);
 void pcie_print_link_status(struct pci_dev *dev);
+bool pcie_has_flr(struct pci_dev *dev);
 int pcie_flr(struct pci_dev *dev);
 int __pci_reset_function_locked(struct pci_dev *dev);
 int pci_reset_function(struct pci_dev *dev);

From patchwork Mon Jul 23 22:24:31 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Williamson
X-Patchwork-Id: 10541249
X-Patchwork-Delegate: bhelgaas@google.com
Subject: [PATCH 2/2] PCI: NVMe device specific reset quirk
From: Alex Williamson
To: linux-pci@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org
Date: Mon, 23 Jul 2018 16:24:31 -0600
Message-ID: <20180723222431.4371.25962.stgit@gimli.home>
In-Reply-To: <20180723221533.4371.90064.stgit@gimli.home>
References: <20180723221533.4371.90064.stgit@gimli.home>
User-Agent: StGit/0.18-102-gdf9f

Take advantage of NVMe devices using a standard interface to quiesce the
controller prior to reset, including device specific delays before and
after that reset. This resolves several NVMe device assignment scenarios
with two different vendors.
The Intel DC P3700 controller has been shown to only work as a VM boot
device on the initial VM startup, failing after reset or reboot, and
also fails to initialize after hot-plug into a VM. Adding a delay after
FLR resolves these cases.

The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR
with the PCI config space reading back as -1. A reproducible instance of
this behavior is resolved by clearing the enable bit in the configuration
register and waiting for the ready status to clear (disabling the NVMe
controller) prior to FLR.

As all NVMe devices make use of this standard interface and the NVMe
specification also requires PCIe FLR support, we can apply this quirk to
all devices with matching class code.

Signed-off-by: Alex Williamson
---
 drivers/pci/quirks.c |  112 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e72c8742aafa..83853562f220 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 #include /* isa_dma_bridge_buggy */
 #include "pci.h"

@@ -3669,6 +3670,116 @@ static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
 #define PCI_DEVICE_ID_INTEL_IVB_M_VGA      0x0156
 #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA     0x0166

+/* NVMe controller needs delay before testing ready status */
+#define NVME_QUIRK_CHK_RDY_DELAY	(1 << 0)
+/* NVMe controller needs post-FLR delay */
+#define NVME_QUIRK_POST_FLR_DELAY	(1 << 1)
+
+static const struct pci_device_id nvme_reset_tbl[] = {
+	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */
+	  .driver_data = NVME_QUIRK_CHK_RDY_DELAY, },
+	{ PCI_DEVICE(0x1c58, 0x0003),   /* HGST adapter */
+	  .driver_data = NVME_QUIRK_CHK_RDY_DELAY, },
+	{ PCI_DEVICE(0x1c58, 0x0023),   /* WDC SN200 adapter */
+	  .driver_data = NVME_QUIRK_CHK_RDY_DELAY, },
+	{ PCI_DEVICE(0x1c5f, 0x0540),   /* Memblaze Pblaze4 adapter */
+	  .driver_data = NVME_QUIRK_CHK_RDY_DELAY, },
+	{ PCI_DEVICE(0x144d, 0xa821),   /* Samsung PM1725 */
+	  .driver_data = NVME_QUIRK_CHK_RDY_DELAY, },
+	{ PCI_DEVICE(0x144d, 0xa822),   /* Samsung PM1725a */
+	  .driver_data = NVME_QUIRK_CHK_RDY_DELAY, },
+	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x0953),   /* Intel DC P3700 */
+	  .driver_data = NVME_QUIRK_POST_FLR_DELAY, },
+	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
+	{ 0 }
+};
+
+/*
+ * The NVMe specification requires that controllers support PCIe FLR, but
+ * some Samsung SM961/PM961 controllers fail to recover after FLR (-1
+ * config space) unless the device is quiesced prior to FLR. Do this for
+ * all NVMe devices by disabling the controller before reset. Some Intel
+ * controllers also require an additional post-FLR delay or else attempts
+ * to re-enable will time out; do that here as well with a heuristically
+ * determined delay value. Also maintain the delay between disabling and
+ * checking ready status as used by the native NVMe driver.
+ */
+static int reset_nvme(struct pci_dev *dev, int probe)
+{
+	const struct pci_device_id *id;
+	void __iomem *bar;
+	u16 cmd;
+	u32 cfg;
+
+	id = pci_match_id(nvme_reset_tbl, dev);
+	if (!id || !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg));
+	if (!bar)
+		return -ENOTTY;
+
+	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+	pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY);
+
+	cfg = readl(bar + NVME_REG_CC);
+
+	/* Disable controller if enabled */
+	if (cfg & NVME_CC_ENABLE) {
+		u64 cap = readq(bar + NVME_REG_CAP);
+		unsigned long timeout;
+
+		/*
+		 * Per nvme_disable_ctrl() skip shutdown notification as it
+		 * could complete commands to the admin queue. We only intend
+		 * to quiesce the device before reset.
+		 */
+		cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE);
+
+		writel(cfg, bar + NVME_REG_CC);
+
+		/* A heuristic value, matches NVME_QUIRK_DELAY_AMOUNT */
+		if (id->driver_data & NVME_QUIRK_CHK_RDY_DELAY)
+			msleep(2300);
+
+		/* Cap register provides max timeout in 500ms increments */
+		timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
+
+		for (;;) {
+			u32 status = readl(bar + NVME_REG_CSTS);
+
+			/* Ready status becomes zero on disable complete */
+			if (!(status & NVME_CSTS_RDY))
+				break;
+
+			msleep(100);
+
+			if (time_after(jiffies, timeout)) {
+				pci_warn(dev, "Timeout waiting for NVMe ready status to clear after disable\n");
+				break;
+			}
+		}
+	}
+
+	pci_iounmap(dev, bar);
+
+	/*
+	 * We could use the optional NVM Subsystem Reset here, hardware
+	 * supporting this is simply unavailable at the time of this code
+	 * to validate in comparison to PCIe FLR. NVMe spec dictates that
+	 * NVMe devices shall implement PCIe FLR.
+	 */
+	pcie_flr(dev);
+
+	if (id->driver_data & NVME_QUIRK_POST_FLR_DELAY)
+		msleep(250); /* Heuristic based on Intel DC P3700 */
+
+	return 0;
+}
+
 static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
 		 reset_intel_82599_sfp_virtfn },
@@ -3678,6 +3789,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 		reset_ivb_igd },
 	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
 		reset_chelsio_generic_dev },
+	{ PCI_ANY_ID, PCI_ANY_ID, reset_nvme },
 	{ 0 }
 };