From patchwork Sun Aug  5 22:51:36 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sinan Kaya <okaya@kernel.org>
X-Patchwork-Id: 10556225
X-Patchwork-Delegate: bhelgaas@google.com
Return-Path: <linux-pci-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
 [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC47014E2
	for <patchwork-linux-pci@patchwork.kernel.org>;
 Sun,  5 Aug 2018 22:52:00 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9C050292F1
	for <patchwork-linux-pci@patchwork.kernel.org>;
 Sun,  5 Aug 2018 22:52:00 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 8F2A229329; Sun,  5 Aug 2018 22:52:00 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham
	version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1C1E4292F1
	for <patchwork-linux-pci@patchwork.kernel.org>;
 Sun,  5 Aug 2018 22:52:00 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1726680AbeHFA6M (ORCPT
        <rfc822;patchwork-linux-pci@patchwork.kernel.org>);
        Sun, 5 Aug 2018 20:58:12 -0400
Received: from mail.kernel.org ([198.145.29.99]:60590 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1726613AbeHFA6M (ORCPT <rfc822;linux-pci@vger.kernel.org>);
        Sun, 5 Aug 2018 20:58:12 -0400
Received: from localhost.localdomain (cpe-174-109-247-98.nc.res.rr.com
 [174.109.247.98])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by mail.kernel.org (Postfix) with ESMTPSA id 82DC0218C1;
        Sun,  5 Aug 2018 22:51:57 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
        s=default; t=1533509518;
        bh=h8PKL8SelS+xkDcS+yiT/Gh9nWsD7yK01oF+67GkX1k=;
        h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
        b=CVbo+20d+8iBTt9mx1oTkDMNaQieVSFJFnVMAYWe5mZaz0rY7t4W5STyW7cnXQiAe
         Jd7/eKXdAUVqBfiuFt6HfApga6y8s4WCnJe1dN8pDYuyiuw1tJbZmuUEwtixdbwhPP
         oW710uxVF4gSXkJX7w2/GPHtN6SCQ6VjMje4dJ7o=
From: Sinan Kaya <okaya@kernel.org>
To: linux-pci@vger.kernel.org
Cc: Sinan Kaya <okaya@kernel.org>, Bjorn Helgaas <bhelgaas@google.com>,
        Lukas Wunner <lukas@wunner.de>,
        Mika Westerberg <mika.westerberg@linux.intel.com>,
        "Gustavo A. R. Silva" <gustavo@embeddedor.com>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Oza Pawandeep <poza@codeaurora.org>,
        Keith Busch <keith.busch@intel.com>
Subject: [PATCH v7 1/1] PCI: pciehp: Ignore link events when there is a fatal
 error pending
Date: Sun,  5 Aug 2018 15:51:36 -0700
Message-Id: <20180805225136.5800-2-okaya@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180805225136.5800-1-okaya@kernel.org>
References: <20180805225136.5800-1-okaya@kernel.org>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-pci.vger.kernel.org>
X-Mailing-List: linux-pci@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

AER/DPC reset is known as warm-resets. HP link recovery is known as
cold-reset via power-off and power-on command to the PCI slot.

In the middle of a warm-reset operation (AER/DPC), we are:

1. turning off the slow power. Slot power needs to be kept on in order
for recovery to succeed.

2. performing a cold reset causing Fatal Error recovery to fail.

If link goes down due to a DPC event, it should be recovered by DPC
status trigger. Injecting a cold reset in the middle can cause a HW
lockup as it is an undefined behavior.

Similarly, If link goes down due to an AER secondary bus reset issue,
it should be recovered by HW. Injecting a cold reset in the middle of a
secondary bus reset can cause a HW lockup as it is an undefined behavior.

1. HP ISR observes link down interrupt.
2. HP ISR checks that there is a fatal error pending, it doesn't touch
the link.
3. HP ISR waits until link recovery happens.
4. HP ISR calls the read vendor id function.
5. If all fails, try the cold-reset approach.

If fatal error is pending and a fatal error service such as DPC or AER
is running, it is the responsibility of the fatal error service to
recover the link.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/hotplug/pciehp_ctrl.c | 19 ++++++++++
 drivers/pci/pci.h                 |  2 ++
 drivers/pci/pcie/err.c            | 60 +++++++++++++++++++++++++++++++
 3 files changed, 81 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index da7c72372ffc..ba8dd51a3f0f 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -222,9 +222,28 @@ void pciehp_handle_disable_request(struct slot *slot)
 void pciehp_handle_presence_or_link_change(struct slot *slot, u32 events)
 {
 	struct controller *ctrl = slot->ctrl;
+	struct pci_dev *pdev = ctrl->pcie->port;
 	bool link_active;
 	u8 present;
 
+	/* If a fatal error is pending, wait for AER or DPC to handle it. */
+	if (pcie_fatal_error_pending(pdev, PCI_ERR_UNC_SURPDN)) {
+		bool recovered;
+
+		recovered = pcie_wait_fatal_error_clear(pdev,
+						PCI_ERR_UNC_SURPDN);
+
+		/* If the fatal error is gone and the link is up, return */
+		if (recovered && pcie_wait_for_link(pdev, true)) {
+			ctrl_info(ctrl, "Slot(%s): Ignoring Link event due to successful fatal error recovery\n",
+				  slot_name(slot));
+			return;
+		}
+
+		ctrl_info(ctrl, "Slot(%s): Fatal error recovery failed for Link event, tryig hotplug reset\n",
+			  slot_name(slot));
+	}
+
 	/*
 	 * If the slot is on and presence or link has changed, turn it off.
 	 * Even if it's occupied again, we cannot assume the card is the same.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a07f3f..2ebb05338368 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -356,6 +356,8 @@ void pci_enable_acs(struct pci_dev *dev);
 /* PCI error reporting and recovery */
 void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service);
 void pcie_do_nonfatal_recovery(struct pci_dev *dev);
+bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask);
+bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask);
 
 bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 #ifdef CONFIG_PCIEASPM
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f7ce0cb0b0b7..8ea012dbf063 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -16,6 +16,7 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/aer.h>
+#include <linux/delay.h>
 #include "portdrv.h"
 #include "../pci.h"
 
@@ -386,3 +387,62 @@ void pcie_do_nonfatal_recovery(struct pci_dev *dev)
 	/* TODO: Should kernel panic here? */
 	pci_info(dev, "AER: Device recovery failed\n");
 }
+
+bool pcie_fatal_error_pending(struct pci_dev *pdev, u32 usrmask)
+{
+	u16 err_status = 0;
+	u32 status, mask, severity;
+	int rc;
+
+	if (!pci_is_pcie(pdev))
+		return false;
+
+	rc = pcie_capability_read_word(pdev, PCI_EXP_DEVSTA, &err_status);
+	if (rc)
+		return false;
+
+	if (!(err_status & PCI_EXP_DEVSTA_FED))
+		return false;
+
+	if (!pdev->aer_cap)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS,
+				   &status);
+	if (rc)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_MASK,
+				   &mask);
+	if (rc)
+		return false;
+
+	rc = pci_read_config_dword(pdev, pdev->aer_cap + PCI_ERR_UNCOR_SEVER,
+				   &severity);
+	if (rc)
+		return false;
+
+	status &= mask;
+	status &= ~usrmask;
+	status &= severity;
+
+	return !!status;
+}
+
+bool pcie_wait_fatal_error_clear(struct pci_dev *pdev, u32 usrmask)
+{
+	int timeout = 1000;
+	bool ret;
+
+	for (;;) {
+		ret = pcie_fatal_error_pending(pdev, usrmask);
+		if (ret == false)
+			return true;
+		if (timeout <= 0)
+			break;
+		msleep(20);
+		timeout -= 20;
+	}
+
+	return false;
+}