From patchwork Sat Jul 28 09:13:24 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sinan Kaya
X-Patchwork-Id: 10548095
X-Patchwork-Delegate: bhelgaas@google.com
From: Sinan Kaya
To: linux-pci@vger.kernel.org
Cc: Sinan Kaya, Bjorn Helgaas, Mika Westerberg, Keith Busch, Oza Pawandeep, Markus Elfring, Lukas Wunner, Kees Cook
Subject: [PATCH v6 1/1] PCI: pciehp: Ignore link events when there is a fatal error pending
Date: Sat, 28 Jul 2018 02:13:24 -0700
Message-Id: <20180728091324.154795-2-okaya@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180728091324.154795-1-okaya@kernel.org>
References: <20180728091324.154795-1-okaya@kernel.org>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pci@vger.kernel.org

The hotplug driver needs to return gracefully when a link down event
happens while a fatal error is pending. Fatal errors need to be serviced
by the AER/DPC drivers. Today the hotplug driver observes link events
while the AER/DPC drivers are performing link recovery, which confuses
the hotplug state machine.

1. Check whether a fatal error is pending in the Device Status register
   of the PCI Express capability on the root port.
2. Bail out of the hotplug routine if this is the case.
3. Otherwise, keep the existing behavior.

If a fatal error is pending and a fatal error service such as DPC or AER
is running, it is the responsibility of the fatal error service to
recover the link.

Signed-off-by: Sinan Kaya
---
 drivers/pci/hotplug/pciehp_hpc.c | 13 ++++++++----
 drivers/pci/pci.h                |  1 +
 drivers/pci/pcie/err.c           | 35 ++++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 718b6073afad..776566ab7583 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -612,10 +612,15 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
 	 * and cause the wrong event to queue.
 	 */
 	if (events & PCI_EXP_SLTSTA_DLLSC) {
-		ctrl_info(ctrl, "Slot(%s): Link %s\n", slot_name(slot),
-			  link ? "Up" : "Down");
-		pciehp_queue_interrupt_event(slot, link ? INT_LINK_UP :
-					     INT_LINK_DOWN);
+		if (pci_fatal_error_pending(pdev, PCI_ERR_UNC_SURPDN))
+			ctrl_info(ctrl, "Slot(%s): Ignoring Link %s event due to detected Fatal Error\n",
+				  slot_name(slot), link ? "Up" : "Down");
+		else {
+			ctrl_info(ctrl, "Slot(%s): Link %s\n", slot_name(slot),
+				  link ? "Up" : "Down");
+			pciehp_queue_interrupt_event(slot, link ? INT_LINK_UP :
+						     INT_LINK_DOWN);
+		}
 	} else if (events & PCI_EXP_SLTSTA_PDC) {
 		present = !!(status & PCI_EXP_SLTSTA_PDS);
 		ctrl_info(ctrl, "Slot(%s): Card %spresent\n", slot_name(slot),
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 882f1f9596df..7494b2c0c5ff 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -360,6 +360,7 @@ void pci_enable_acs(struct pci_dev *dev);
 /* PCI error reporting and recovery */
 void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service);
 void pcie_do_nonfatal_recovery(struct pci_dev *dev);
+bool pci_fatal_error_pending(struct pci_dev *pdev, u32 usr_mask);
 
 bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 #ifdef CONFIG_PCIEASPM
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f7ce0cb0b0b7..316b2d2750b9 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -386,3 +386,38 @@ void pcie_do_nonfatal_recovery(struct pci_dev *dev)
 	/* TODO: Should kernel panic here? */
 	pci_info(dev, "AER: Device recovery failed\n");
 }
+
+bool pci_fatal_error_pending(struct pci_dev *pdev, u32 usr_mask)
+{
+	u16 err_status = 0;
+	u32 status, mask;
+	int rc;
+
+	if (!pci_is_pcie(pdev))
+		return false;
+
+	rc = pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &err_status);
+	if (rc)
+		return false;
+
+	if (!(err_status & PCI_EXP_DEVSTA_FED))
+		return false;
+
+	if (!pdev->aer_cap)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS,
+				   &status);
+	if (rc)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_MASK,
+				   &mask);
+	if (rc)
+		return false;
+
+	status &= mask;
+	status &= ~usr_mask;
+
+	return !!status;
+}
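
For anyone who wants to walk through the decision logic without a kernel
tree, the check above can be modelled roughly as a small userspace sketch.
This is only an illustration, not part of the patch: struct mock_port and
the mock_* functions are hypothetical names that stand in for struct pci_dev
and the config-space reads, and the failed-read paths are left out. The
three bit definitions carry the values from include/uapi/linux/pci_regs.h.
The sketch reproduces the "status &= mask" step exactly as posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_DEVSTA_FED	0x0004		/* Fatal Error Detected */
#define PCI_ERR_UNC_DLP		0x00000010	/* Data Link Protocol */
#define PCI_ERR_UNC_SURPDN	0x00000020	/* Surprise Down */

/*
 * Hypothetical snapshot of the registers the helper reads.
 * This is not a kernel type; it only exists for this sketch.
 */
struct mock_port {
	bool is_pcie;		/* stands in for pci_is_pcie() */
	bool has_aer;		/* stands in for pdev->aer_cap != 0 */
	uint16_t devsta;	/* Device Status register */
	uint32_t uncor_status;	/* AER Uncorrectable Error Status */
	uint32_t uncor_mask;	/* AER Uncorrectable Error Mask */
};

/* Userspace model of pci_fatal_error_pending() as posted in this patch. */
bool mock_fatal_error_pending(const struct mock_port *port, uint32_t usr_mask)
{
	uint32_t status;

	assert(port != NULL);

	if (!port->is_pcie)
		return false;

	if (!(port->devsta & PCI_EXP_DEVSTA_FED))
		return false;

	if (!port->has_aer)
		return false;

	status = port->uncor_status;
	status &= port->uncor_mask;	/* same AND with the mask register as the patch */
	status &= ~usr_mask;		/* drop the bits the caller asked to ignore */

	return status != 0;
}

/*
 * Models the DLLSC branch in pciehp_isr(): returns true when the link
 * event would be queued, false when it would be ignored.
 */
bool mock_should_queue_link_event(const struct mock_port *port)
{
	return !mock_fatal_error_pending(port, PCI_ERR_UNC_SURPDN);
}
```

With both bits set in the mask register value, a port reporting only
PCI_ERR_UNC_SURPDN still gets its link event queued, because the caller
passes that bit as usr_mask, while a port reporting PCI_ERR_UNC_DLP has
the event ignored.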