From patchwork Fri Sep 25 02:34:22 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhao, Haifeng"
X-Patchwork-Id: 11798673
X-Patchwork-Delegate: bhelgaas@google.com
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 37277618
	for ; Fri, 25 Sep 2020 02:40:51 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 1E9E820888
	for ; Fri, 25 Sep 2020 02:40:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726694AbgIYCku (ORCPT ); Thu, 24 Sep 2020 22:40:50 -0400
Received: from mga17.intel.com ([192.55.52.151]:60724 "EHLO mga17.intel.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1726448AbgIYCku (ORCPT );
	Thu, 24 Sep 2020 22:40:50 -0400
IronPort-SDR: iYjanzg5yvkkTDMkg2EMsOFOBOpOEyQ/vxriGv2afsM7qrKlfcAmEsoKVk0Tc3e9eRam+Cms2s
 PQuxDUxz24Cw==
X-IronPort-AV: E=McAfee;i="6000,8403,9754"; a="141431845"
X-IronPort-AV: E=Sophos;i="5.77,300,1596524400"; d="scan'208";a="141431845"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
	by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
	24 Sep 2020 19:35:58 -0700
IronPort-SDR: edQLed7J+7Et0h2PLwHieV0s0QzFTiRjxkPWU1Y571lz/hcfhCCNF4dB5mgNHCb3klmchnXBSo
 yLaHmH6kYifg==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.77,300,1596524400"; d="scan'208";a="512515330"
Received: from shskylake.sh.intel.com ([10.239.48.137])
	by fmsmga005.fm.intel.com with ESMTP; 24 Sep 2020 19:35:55 -0700
From: Ethan Zhao
To: bhelgaas@google.com, oohall@gmail.com, ruscur@russell.cc, lukas@wunner.de,
	andriy.shevchenko@linux.intel.com, stuart.w.hayes@gmail.com,
	mr.nuke.me@gmail.com, mika.westerberg@linux.intel.com
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	pei.p.jia@intel.com, Ethan Zhao
Subject: [PATCH 4/5] PCI: only return true when dev io state is really changed
Date: Thu, 24 Sep 2020 22:34:22 -0400
Message-Id: <20200925023423.42675-5-haifeng.zhao@intel.com>
X-Mailer: git-send-email 2.18.4
In-Reply-To: <20200925023423.42675-1-haifeng.zhao@intel.com>
References: <20200925023423.42675-1-haifeng.zhao@intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

When an uncorrectable error happens, the AER and DPC driver interrupt
handlers are likely to call

  pcie_do_recovery()->pci_walk_bus()->report_frozen_detected()

with pci_channel_io_frozen at the same time. If pci_dev_set_io_state()
returns true even when the original state is already
pci_channel_io_frozen, the AER or DPC handler re-enters the error
detection and recovery procedure one after another. The result is that
the AER and DPC recovery flows get mixed.

So simplify pci_dev_set_io_state() to return true only when
dev->error_state is actually changed.

Signed-off-by: Ethan Zhao
Tested-by: Wen jin
Tested-by: Shanshan Zhang
Reviewed-by: Alexandru Gagniuc
---
 drivers/pci/pci.h | 31 +++----------------------------
 1 file changed, 3 insertions(+), 28 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fa12f7cbc1a0..d420bb977f3b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -362,35 +362,10 @@ static inline bool pci_dev_set_io_state(struct pci_dev *dev,
 	bool changed = false;
 
 	device_lock_assert(&dev->dev);
-	switch (new) {
-	case pci_channel_io_perm_failure:
-		switch (dev->error_state) {
-		case pci_channel_io_frozen:
-		case pci_channel_io_normal:
-		case pci_channel_io_perm_failure:
-			changed = true;
-			break;
-		}
-		break;
-	case pci_channel_io_frozen:
-		switch (dev->error_state) {
-		case pci_channel_io_frozen:
-		case pci_channel_io_normal:
-			changed = true;
-			break;
-		}
-		break;
-	case pci_channel_io_normal:
-		switch (dev->error_state) {
-		case pci_channel_io_frozen:
-		case pci_channel_io_normal:
-			changed = true;
-			break;
-		}
-		break;
-	}
-	if (changed)
+	if (dev->error_state != new) {
 		dev->error_state = new;
+		changed = true;
+	}
 	return changed;
 }
 

From patchwork Fri Sep 25 02:34:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhao, Haifeng"
X-Patchwork-Id: 11798675
X-Patchwork-Delegate: bhelgaas@google.com
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 71A636CB
	for ; Fri, 25 Sep 2020 02:40:55 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 5EC5C20EDD
	for ; Fri, 25 Sep 2020 02:40:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726808AbgIYCku (ORCPT ); Thu, 24 Sep 2020 22:40:50 -0400
Received: from mga17.intel.com ([192.55.52.151]:60724 "EHLO mga17.intel.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1726676AbgIYCku (ORCPT );
	Thu, 24 Sep 2020 22:40:50 -0400
IronPort-SDR: gGhkLgHy7JwBe9rH1gVdYeuhZRWP9RzTPzH1850yXJTsGLwmuphn+C6xGBksPaMR3DEOo1pKxf
 W0gJdkpOyrMA==
X-IronPort-AV: E=McAfee;i="6000,8403,9754"; a="141431851"
X-IronPort-AV: E=Sophos;i="5.77,300,1596524400"; d="scan'208";a="141431851"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
	by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
	24 Sep 2020 19:36:01 -0700
IronPort-SDR: ufkFF0YDpCuSDDBlyTqSVBAAm7yWx6S9ZOhe0VJvulU99hgn3Gmvwk1LGM9R8HGeeRxsn3QZ2R
 2H5ENX9XXrDg==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.77,300,1596524400"; d="scan'208";a="512515344"
Received: from shskylake.sh.intel.com ([10.239.48.137])
	by fmsmga005.fm.intel.com with ESMTP; 24 Sep 2020 19:35:58 -0700
From: Ethan Zhao
To: bhelgaas@google.com, oohall@gmail.com, ruscur@russell.cc, lukas@wunner.de,
	andriy.shevchenko@linux.intel.com, stuart.w.hayes@gmail.com,
	mr.nuke.me@gmail.com, mika.westerberg@linux.intel.com
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	pei.p.jia@intel.com, Ethan Zhao
Subject: [PATCH 5/5] PCI/ERR: don't mix io state not changed and no driver together
Date: Thu, 24 Sep 2020 22:34:23 -0400
Message-Id: <20200925023423.42675-6-haifeng.zhao@intel.com>
X-Mailer: git-send-email 2.18.4
In-Reply-To: <20200925023423.42675-1-haifeng.zhao@intel.com>
References: <20200925023423.42675-1-haifeng.zhao@intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

When we see 'can't recover (no error_detected callback)' on the console,
the actual reason may be that the io state was not changed by
pci_dev_set_io_state() rather than a missing driver callback, which is
confusing. Fix it by handling the two cases separately.

Signed-off-by: Ethan Zhao
Tested-by: Wen jin
Tested-by: Shanshan Zhang
---
 drivers/pci/pcie/err.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index e35c4480c86b..d85f27c90c26 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -55,8 +55,10 @@ static int report_error_detected(struct pci_dev *dev,
 	if (!pci_dev_get(dev))
 		return 0;
 	device_lock(&dev->dev);
-	if (!pci_dev_set_io_state(dev, state) ||
-		!dev->driver ||
+	if (!pci_dev_set_io_state(dev, state)) {
+		pci_dbg(dev, "Device might already being in error handling ...\n");
+		vote = PCI_ERS_RESULT_NONE;
+	} else if (!dev->driver ||
 		!dev->driver->err_handler ||
 		!dev->driver->err_handler->error_detected) {
 		/*