From patchwork Mon Apr 30 13:36:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Gilles Buloz X-Patchwork-Id: 10371795 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 26D326038F for ; Mon, 30 Apr 2018 13:37:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1230728B05 for ; Mon, 30 Apr 2018 13:37:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0293328B09; Mon, 30 Apr 2018 13:37:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 316E828B02 for ; Mon, 30 Apr 2018 13:37:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Type: MIME-Version:In-Reply-To:References:Message-ID:Date:Subject:To:From:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=0tDl2MD22TP/3Q+xVxrVJWtZzE/j19nBC50swvK09Pw=; b=mj7uj98+2QCzEAnMdmgNNru78 QV+PTHVViZEtc++O8dwC1oVOFxoB2hL4HtDSdYbol1pgxjb0/Mh/1VijdhTc9Y7q9donjnT+FOyW7 k6dOM8PQ7DyGiTEZEvCoBORUkDGfot1i2IrxFmQeFm24VZbTXR31KYL+tAa3iXK5Rb4DbhE/b4hyN XW64tOxUNdWsyrHb0ezYEhkR8RbqLOy4SVqv9OXCcgQ1RcjAU3y8+CC8ooF7DLO4oqu/abxJjUFgu tpMegc2qJ0tQFa1M4Iw0o44pRHC1s7UAIDGrqybKnsru574hV+JZtm+aRjusMJfMBIa2IbfKnpaIj i3Y86CKrw==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1fD8zX-0003Yn-JL; Mon, 30 Apr 2018 13:37:19 +0000 Received: from eu-smtp-delivery-185.mimecast.com ([207.82.80.185]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1fD8zS-0003Xq-7W for linux-arm-kernel@lists.infradead.org; Mon, 30 Apr 2018 13:37:16 +0000 Received: from SDEMUCHB01.kontron.local (host-212-18-14-254.customer.m-online.net [212.18.14.254]) (Using TLS) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-60-90WBWJaVO3qXppzuc3KsAQ-1; Mon, 30 Apr 2018 14:36:55 +0100 Received: from SDEMUCMB01.kontron.local ([fe80::c99d:e06e:28fc:b20]) by SDEMUCHB01.kontron.local ([::1]) with mapi id 14.03.0123.003; Mon, 30 Apr 2018 15:36:54 +0200 From: Gilles Buloz To: Bjorn Helgaas Subject: Re: LS1043A : "synchronous abort" at boot due to PCI config read Thread-Topic: LS1043A : "synchronous abort" at boot due to PCI config read Thread-Index: AQHT3gPjlj6/yW5VNkGytbg+ZL5f/qQUaTmAgABKlYCABC4VAIAAUSYA Date: Mon, 30 Apr 2018 13:36:53 +0000 Message-ID: <5AE71BF4.2010200@kontron.com> References: <5AD0E995.3090802@kontron.com> <5AE317AB.4020404@kontron.com> <20180427165627.GA8199@bhelgaas-glaptop.roam.corp.google.com> <5AE6D7E2.9030506@kontron.com> In-Reply-To: <5AE6D7E2.9030506@kontron.com> Accept-Language: en-US, de-DE Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: user-agent: Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.1.1 x-originating-ip: [172.20.161.89] MIME-Version: 1.0 X-MC-Unique: 90WBWJaVO3qXppzuc3KsAQ-1 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20180430_063714_686866_BAF8D7D3 X-CRM114-Status: GOOD ( 40.00 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Bjorn Helgaas , linux-pci , "Minghuan.Lian@nxp.com" , "linux-arm-kernel@lists.infradead.org" , Ard Biesheuvel Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP Le 30/04/2018 10:46, Gilles BULOZ a écrit : > Le 27/04/2018 18:56, Bjorn Helgaas a écrit : >> On Fri, Apr 27, 2018 at 12:29:32PM +0000, Gilles Buloz wrote: >>> Le 27/04/2018 10:43, Ard Biesheuvel a écrit : >>>> (add Bjorn and linux-pci) >>>> >>>> On 13 April 2018 at 19:32, Gilles Buloz wrote: >>>>> Dear developers, >>>>> >>>>> I currently have two functional workarounds for this issue but >>>>> would like to know which one you would recommend, if any :-) I'm >>>>> using a LS1043A CPU (NXP QorIQ Layerscape) and get a "synchronous >>>>> external abort" when booting because of a PCI config read during >>>>> PCI scan. >>>>> >>>>> I'm using a custom hardware (based on LS1043ARDB) having a >>>>> PEX8112 PCIe-to-PCI bridge connected to the LS1043A to have a PCI >>>>> slot for legacy devices. This bridge only supports PCI-Compatible >>>>> config accesses (offset 0x00-0xFF). >> I would guess the PEX8112 itself has 4K of config space, but it only >> forwards 256 bytes of config space to the conventional PCI secondary >> bus. >> >>>>> On this PCI slot I connect a PCI module made of a PCI-to-PCIe >>>>> bridge plus PCIe devices behind. >>>>> >>>>> The problem occurs when the kernel probes the PCIe devices : as >>>>> they are PCIe devices, the kernel does a PCI config read access >>>>> at offset 0x100 to check if "PCIe extended capability registers" >>>>> are accessible (see drivers/pci/probe.c, function >>>>> pci_cfg_space_size_ext()). Unfortunately the PEX8112 PCIe-to-PCI >>>>> bridge that is in the path reports an error to the CPU for this >>>>> access, and it seems there's no way to disable that on this >>>>> bridge. >>>>> >>>>> The first workaround I found was to patch >>>>> drivers/pci/host/pci-layerscape.c to have PCIE_ABSERR_SETTING set >>>>> to 0x9400 instead of 0x9401 (for PCIE_ABSERR register) to disable >>>>> error reporting. This only impacts an NXP part of the Linux >>>>> kernel code, but I'm not sure this is a good idea (however it >>>>> seems to be like that on Intel platforms where even MEM accesses >>>>> to a no-device address return FF without any error). >>>>> >>>>> I've also tried another workaround that works : patch >>>>> drivers/pci/probe.c to use bus_flags to remember if a bus is >>>>> behind a bridge without extended address capability, to avoid PCi >>>>> config read accesses at offset 0x100 in pci_cfg_space_size() / >>>>> pci_cfg_space_size_ext(). But this patch impacts the generic PCI >>>>> probe method of Linux. >>>>> >>>>> Any Idea to properly handle that issue ? >>>>> >>>> This seems like a rather unusual configuration, but I guess that >>>> if the first bridge/switch advertises its inability to support >>>> extended config space accesses, we should not be performing them >>>> on any of its subordinate buses. How does the PEX8112 advertise >>>> this limitation? >>>> >>>> That said, I wonder if it is reasonable in the first place to >>>> expect that a PCIe device works as expected passing through a >>>> legacy PCI layer like that. >>>> >>> The PEX8112 PCIe-to-PCI bridge has capability PCI_CAP_ID_EXP, but >>> has no PCI_CAP_ID_PCIX capability. As I understand the lack of >>> PCI_CAP_ID_PCIX is advertising this limitation on the PCI side (no >>> support for PCI config offset >=0x100). >> Sounds right to me. >> >>> Also I guess in the case of a bridge having PCI_CAP_ID_PCIX, this >>> limitation would be advertised by the lack of PCI_X_STATUS_266MHZ >>> and PCI_X_STATUS_533MHZ (as done in drivers/pci/probe.c at >>> pci_cfg_space_size()) >> Also sounds right. Per the PCI-X spec, checking for PCI_X_STATUS_266MHZ >> should be enough, but it shouldn't hurt to check for either >> PCI_X_STATUS_266MHZ or PCI_X_STATUS_533MHZ. >> >>> I'm currently using the attached patch (for kernel 4.1.35-rt41 from >>> NXP Yocto BSP). It uses bus_flags to remember if a bus is behind a >>> bridge without extended address capability to avoid PCi config >>> accesses at offset >= 0x100. Thanks to this patch I now have a >>> functional system with functional PCI/PCIe devices. >> The patch seems like it's looking at the right things, but I don't >> want to build it into pci_scan_bridge_extend() because that function >> is much too complicated already. >> >> I'd rather build it into pci_cfg_space_size() or >> pci_cfg_space_size_ext() somehow. Maybe something along these lines? >> This doesn't account for the case of a PCIe-to-PCI-X Mode 2 bridge; in >> that case, I think all 4K would be accessible on the PCI-X side. >> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >> index ac91b6fd0bcd..d8b091f0bcd1 100644 >> --- a/drivers/pci/probe.c >> +++ b/drivers/pci/probe.c >> @@ -1367,7 +1367,7 @@ static bool pci_ext_cfg_is_aliased(struct pci_dev *dev) >> * pci_cfg_space_size - Get the configuration space size of the PCI device >> * @dev: PCI device >> * >> - * Regular PCI devices have 256 bytes, but PCI-X 2 and PCI Express devices >> + * Regular PCI devices have 256 bytes, but PCI-X Mode 2 and PCI Express devices >> * have 4096 bytes. Even if the device is capable, that doesn't mean we can >> * access it. Maybe we don't have a way to generate extended config space >> * accesses, or the device is behind a reverse Express bridge. So we try >> @@ -1376,9 +1376,14 @@ static bool pci_ext_cfg_is_aliased(struct pci_dev *dev) >> */ >> static int pci_cfg_space_size_ext(struct pci_dev *dev) >> { >> + struct pci_dev *bridge = pci_upstream_bridge(dev); >> u32 status; >> int pos = PCI_CFG_SPACE_SIZE; >> + if (bridge && pci_is_pcie(bridge) && >> + pci_pcie_type(bridge) == PCI_EXP_TYPE_PCI_BRIDGE) >> + return PCI_CFG_SPACE_SIZE; >> + >> if (pci_read_config_dword(dev, pos, &status) != PCIBIOS_SUCCESSFUL) >> return PCI_CFG_SPACE_SIZE; >> if (status == 0xffffffff || pci_ext_cfg_is_aliased(dev)) >> >>> --- include/linux/pci.h.orig 2018-03-26 16:51:18.050000000 +0000 >>> +++ include/linux/pci.h 2018-03-26 16:51:27.660000000 +0000 >>> @@ -193,6 +193,7 @@ >>> enum pci_bus_flags { >>> PCI_BUS_FLAGS_NO_MSI = (__force pci_bus_flags_t) 1, >>> PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2, >>> + PCI_BUS_FLAGS_COMPAT_CFG_SPACE = (__force pci_bus_flags_t) 4, >>> }; >>> /* These values come from the PCI Express Spec */ >>> --- drivers/pci/probe.c.orig 2018-01-22 09:29:52.000000000 +0000 >>> +++ drivers/pci/probe.c 2018-03-26 16:54:30.830000000 +0000 >>> @@ -827,6 +827,28 @@ >>> child->primary = primary; >>> pci_bus_insert_busn_res(child, secondary, subordinate); >>> child->bridge_ctl = bctl; >>> + >>> + { >>> + int pos; >>> + u32 status; >>> + bool pci_compat_cfg_space = false; >>> + >>> + if (!pci_is_pcie(dev) || (pci_pcie_type(dev) == PCI_EXP_TYPE_PCIE_BRIDGE) || (pci_pcie_type(dev) == >>> PCI_EXP_TYPE_PCI_BRIDGE)) { >>> + /* for PCI/PCI bridges, or PCIe/PCI bridge in forward or reverse mode, we have to check for PCI-X >>> capabilities */ >>> + pos = pci_find_capability(dev, PCI_CAP_ID_PCIX); >>> + if (pos) { >>> + pci_read_config_dword(dev, pos + PCI_X_STATUS, &status); >>> + if (!(status & (PCI_X_STATUS_266MHZ | PCI_X_STATUS_533MHZ))) >>> + pci_compat_cfg_space = true; >>> + } else { >>> + pci_compat_cfg_space = true; >>> + } >>> + if (pci_compat_cfg_space) { >>> + dev_info(&dev->dev, "[%04x:%04x] Child bus limited to PCI-Compatible config space\n", dev->vendor, >>> dev->device); >>> + child->bus_flags |= PCI_BUS_FLAGS_COMPAT_CFG_SPACE; >>> + } >>> + } >>> + } >>> } >>> cmax = pci_scan_child_bus(child); >>> @@ -1098,6 +1120,11 @@ >>> goto fail; >>> } >>> + if (dev->bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE) { >>> + dev_info(&dev->dev, "[%04x:%04x] PCI-Compatible config space only due to parent bus(es)\n", dev->vendor, dev->device); >>> + return PCI_CFG_SPACE_SIZE; >>> + } >>> + >>> return pci_cfg_space_size_ext(dev); >>> fail: > Bjorn, > If I'm right about your proposed patch to pci_cfg_space_size_ext(), *bridge is pointing to the upper device of device *dev being > checked. I understand the purpose, but I think this fails for my config that is : > > LS1043 PCIe root -> PEX8112 PCIe-to-PCI bridge -> PMC slot connector -> PCI-to-PCIe bridge -> PCIe switch (4 ports) -> 4 PCIe > devices (one on each port) > > because : > - when pci_cfg_space_size_ext() is run on the 4 PCIe devices, *bridge is the PCIe switch which is not matching > PCI_EXP_TYPE_PCI_BRIDGE. In this case *bridge should also be checked for the parent bus of the PCIe switch, and so on. > - when pci_cfg_space_size_ext() is run for the PCI-to-PCIe bridge, *bridge is the PEX8112 that is also not matching > PCI_EXP_TYPE_PCI_BRIDGE but PCI_EXP_TYPE_PCIE_BRIDGE. This leads to a config access at offset 0x100 to the PCI-to-PCIe bridge, so > a crash (because of the PEX8112) > > I think setting a bit in bus_flags when creating a child bus is very efficient because once set it is automatically inherited by > all child buses and then the only thing that pci_cfg_space_size() has to do for each device is to check for this bit. Also this > PCI_BUS_FLAGS_COMPAT_CFG_SPACE flag is actually a bus property that is compliant with the purpose of bus_flags. > > I agree that pci_scan_bridge_extend() is already too complicated, so would you be okay to only add one line to it : > pci_bus_set_compat_cfg_space(child); > and put all the code I suggested in this new function pci_bus_set_compat_cfg_space() ? (also supporting PCI-X Mode 2 devices) > Improvement : this function can return immediately if the child bus has already inherited the flag from its parent. > I mean something like the attached patch I tested this morning... Sorry, this is for old kernel 4.1.35 but just to clarify what I propose (also applies to 4.16.6 by changing value of PCI_BUS_FLAGS_COMPAT_CFG_SPACE in pci.h to 8). --- include/linux/pci.h.orig 2018-03-26 16:51:18.050000000 +0000 +++ include/linux/pci.h 2018-04-30 09:50:57.660000000 +0000 @@ -193,6 +193,7 @@ enum pci_bus_flags { PCI_BUS_FLAGS_NO_MSI = (__force pci_bus_flags_t) 1, PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2, + PCI_BUS_FLAGS_COMPAT_CFG_SPACE = (__force pci_bus_flags_t) 4, }; /* These values come from the PCI Express Spec */ --- drivers/pci/probe.c.orig 2018-01-22 09:29:52.000000000 +0000 +++ drivers/pci/probe.c 2018-04-30 13:29:50.600000000 +0000 @@ -754,6 +754,35 @@ PCI_EXP_RTCTL_CRSSVE); } +static void pci_bus_check_compat_cfg_space(struct pci_bus *bus) +{ + struct pci_dev *dev = bus->self; + bool pci_compat_cfg_space = false; + int pos; + u32 status; + + if (bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE) + return; + + if (!pci_is_pcie(dev) || /* PCI/PCI bridge */ + (pci_pcie_type(dev) == PCI_EXP_TYPE_PCIE_BRIDGE) || /* PCIe/PCI bridge in forward mode */ + (pci_pcie_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE)) { /* PCIe/PCI bridge in reverse mode */ + pos = pci_find_capability(dev, PCI_CAP_ID_PCIX); + if (pos) { + pci_read_config_dword(dev, pos + PCI_X_STATUS, &status); + if (!(status & (PCI_X_STATUS_266MHZ | PCI_X_STATUS_533MHZ))) + pci_compat_cfg_space = true; + } else { + pci_compat_cfg_space = true; + } + if (pci_compat_cfg_space) { + dev_info(&dev->dev, "bus %02x limited to PCI-Compatible config space\n", + bus->number); + bus->bus_flags |= PCI_BUS_FLAGS_COMPAT_CFG_SPACE; + } + } +} + /* * If it's a bridge, configure it and scan the bus behind it. * For CardBus bridges, we don't scan behind as the devices will @@ -827,6 +856,7 @@ child->primary = primary; pci_bus_insert_busn_res(child, secondary, subordinate); child->bridge_ctl = bctl; + pci_bus_check_compat_cfg_space(child); } cmax = pci_scan_child_bus(child); @@ -1084,6 +1114,9 @@ u32 status; u16 class; + if (dev->bus->bus_flags & PCI_BUS_FLAGS_COMPAT_CFG_SPACE) + return PCI_CFG_SPACE_SIZE; + class = dev->class >> 8; if (class == PCI_CLASS_BRIDGE_HOST) return pci_cfg_space_size_ext(dev);