From patchwork Thu Aug 28 21:55:25 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rajat Jain
X-Patchwork-Id: 4808061
X-Patchwork-Delegate: bhelgaas@google.com
Message-ID: <53FFA54D.9000907@gmail.com>
Date: Thu, 28 Aug 2014 14:55:25 -0700
From: Rajat Jain
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130803 Thunderbird/17.0.8
To: Bjorn Helgaas, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
CC: rajatjain@juniper.net, groeck@juniper.net
Subject: [PATCH] pci/probe: Enable CRS for Intel Haswell root ports
Sender: linux-pci-owner@vger.kernel.org
X-Mailing-List: linux-pci@vger.kernel.org

The PCIe root port of the Intel Haswell CPU endlessly retries
configuration cycles if an endpoint responds with CRS (Configuration
Request Retry Status) and the "CRS Software Visibility" flag is not set
at the root port. This results in a CPU hang when the kernel tries to
enumerate a device that responds with CRS.

Please note that this root port behavior (endless retries) is still
compliant with the PCIe spec, which leaves the number of retries up to
the implementation when the "CRS Software Visibility" flag is not
enabled and the root port receives a CRS. (Intel has chosen to retry
indefinitely.)

Ref1: https://www.pcisig.com/specifications/pciexpress/ECN_CRS_Software_Visibility_No27.pdf
Ref2: PCIe spec V3.0, pg119, pg127 for "Configuration Request Retry Status"

The following CPUs are affected:
http://ark.intel.com/products/codename/42174/Haswell#@All

Thus we need to enable the CRS visibility flag for such root ports.

Commit ad7edfe04908 ("[PCI] Do not enable CRS Software Visibility by
default") suggests maintaining a whitelist of the systems for which CRS
should be enabled. This patch does that.

Note: Looking at the spec and reading about CRS, IMHO "CRS visibility"
looks like a good thing that should always be enabled on the root ports
that support it. Maybe we should always enable it if supported and
maintain a blacklist of devices on which it should be disabled (because
of known issues).

How I stumbled upon this and tested the fix:

Root port: PCI bridge: Intel Corporation Device 2f02 (rev 01)

I have a PCIe endpoint (a PLX 8713 NT bridge) that keeps responding
with CRS for a long time when the kernel tries to enumerate it, to
indicate that the device is not yet ready. This is because it needs
some configuration over I2C in order to complete its reset sequence.
This results in a CPU hang during enumeration.

I used this setup to fix and test this issue. After enabling the CRS
visibility flag at the root port, I see that the CPU moves on as
expected, printing the following (instead of freezing):

pci 0000:30:00.0 id reading try 50 times with interval 20 ms to get ffff0001

Signed-off-by: Rajat Jain
Signed-off-by: Rajat Jain
Signed-off-by: Guenter Roeck
---
Hi Bjorn / folks,

I had also sought suggestions on how this patch should be modelled.
Please find a suggested alternative here:

https://lkml.org/lkml/2014/8/1/186

Please let me know your thoughts.

Thanks,

Rajat

 drivers/pci/probe.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index e3cf8a2..909ca75 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -740,6 +740,32 @@ struct pci_bus *pci_add_new_bus(struct pci_bus *parent, struct pci_dev *dev,
 }
 EXPORT_SYMBOL(pci_add_new_bus);
 
+static const struct pci_device_id crs_whitelist[] = {
+	{ PCI_VDEVICE(INTEL, 0x2f00), },
+	{ PCI_VDEVICE(INTEL, 0x2f01), },
+	{ PCI_VDEVICE(INTEL, 0x2f02), },
+	{ PCI_VDEVICE(INTEL, 0x2f03), },
+	{ PCI_VDEVICE(INTEL, 0x2f04), },
+	{ PCI_VDEVICE(INTEL, 0x2f05), },
+	{ PCI_VDEVICE(INTEL, 0x2f06), },
+	{ PCI_VDEVICE(INTEL, 0x2f07), },
+	{ PCI_VDEVICE(INTEL, 0x2f08), },
+	{ PCI_VDEVICE(INTEL, 0x2f09), },
+	{ PCI_VDEVICE(INTEL, 0x2f0a), },
+	{ PCI_VDEVICE(INTEL, 0x2f0b), },
+	{ },
+};
+
+static void pci_enable_crs(struct pci_dev *dev)
+{
+	/* Enable CRS Software visibility only for whitelisted systems */
+	if (pci_is_pcie(dev) &&
+	    pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT &&
+	    pci_match_id(crs_whitelist, dev))
+		pcie_capability_set_word(dev, PCI_EXP_RTCTL,
+					 PCI_EXP_RTCTL_CRSSVE);
+}
+
 /*
  * If it's a bridge, configure it and scan the bus behind it.
  * For CardBus bridges, we don't scan behind as the devices will
@@ -787,6 +813,8 @@ int pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max, int pass)
 		pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
 				      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
 
+	pci_enable_crs(dev);
+
 	if ((secondary || subordinate) && !pcibios_assign_all_busses() &&
 	    !is_cardbus && !broken) {
 		unsigned int cmax;
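
For illustration only (not part of the patch): once CRS Software
Visibility is enabled, a Vendor ID read that completes with CRS returns
0x0001 in the Vendor ID field, so software can poll with a bounded
timeout instead of stalling in hardware retries. The sketch below is a
userspace simulation of that idea, not the kernel's enumeration code.
struct mock_dev, mock_read_id() and poll_for_id() are hypothetical
names, and the ID 0x871310b5 is only an example value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* With CRS Software Visibility enabled, a Vendor ID read that completes
 * with CRS returns 0x0001 in the Vendor ID field. */
#define CRS_VENDOR_ID 0x0001u

/* Hypothetical mock device: answers with CRS for a fixed number of reads. */
struct mock_dev {
	int crs_reads_left;	/* reads that still complete with CRS */
	uint32_t real_id;	/* (device ID << 16) | vendor ID once ready */
};

/* Stand-in for a 32-bit config read at offset 0 (Vendor ID / Device ID). */
static uint32_t mock_read_id(struct mock_dev *d)
{
	if (d->crs_reads_left > 0) {
		d->crs_reads_left--;
		return 0xffff0001u;	/* same value as in the log line above */
	}
	return d->real_id;
}

/*
 * Poll until the device stops signalling CRS or the delay budget runs out.
 * Returns the number of reads issued on success, or -1 on timeout.
 * The delay doubles after each CRS reply; a real caller would sleep for
 * delay_ms at that point, this sketch only accounts for the time.
 */
static int poll_for_id(struct mock_dev *d, unsigned int timeout_ms,
		       uint32_t *id)
{
	unsigned int delay_ms = 1, waited_ms = 0;
	int reads = 0;

	assert(d != NULL && id != NULL);

	for (;;) {
		*id = mock_read_id(d);
		reads++;
		if ((*id & 0xffffu) != CRS_VENDOR_ID)
			return reads;	/* device is ready */
		if (waited_ms + delay_ms > timeout_ms)
			return -1;	/* give up instead of hanging */
		waited_ms += delay_ms;
		delay_ms *= 2;
	}
}
```

A caller would initialise a struct mock_dev, call poll_for_id() with a
timeout in milliseconds, and treat -1 as "device not ready" rather than
blocking the CPU.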