Message ID | 20161102223552.14776-2-jthumshirn@suse.de (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Hi Johannes, On Wed, Nov 02, 2016 at 04:35:52PM -0600, Johannes Thumshirn wrote: > The Read Completion Boundary (RCB) bit must only be set on a device or > endpoint if it is set on the root complex. > > Certain BIOSes erroneously set the RCB Bit in their ACPI _HPX Tables > even if it is not set on the root port. This is a violation to the PCIe > Specification and is known to bring some Mellanox Connect-X 3 HCAs into > a state where they can't map their firmware and go into error recovery. > > BIOS Information > Vendor: IBM > Version: -[A8E120CUS-1.30]- > Release Date: 08/22/2016 This seems like a pretty serious problem (sounds like maybe the HCA is completely useless?) Can you point us at a bugzilla or other problem report? It's nice to have details of what this looks like to a user, so people who trip over this problem have a little more chance of finding the solution. 7a1562d4f2d0 ("PCI: Apply _HPX Link Control settings to all devices with a link") appeared in v3.18, so it's probably not a *new* problem, so my guess is that this is v4.10 material. > From PCI Express Base Specification 1.1, > section 2.3.1.1. Data Return for Read Requests: > The Read Completion Boundary (RCB) parameter determines the naturally > aligned address boundaries on which a Read Request may be serviced with > multiple Completions > o For a Root Complex, RCB is 64 bytes or 128 bytes > o This value is reported through a configuration register > (see Section 7.8) > Note: Bridges and Endpoints may implement a corresponding command > bit which may be set by system software to indicate the RCB value > for the Root Complex, allowing the Bridge/Endpoint to optimize its > behavior when the Root Complex’s RCB is 128 bytes. > o For all other system elements, RCB is 128 bytes > > Table 7-16: Link Control Register: > Configuration software must only Set this bit if the Root Port > Upstream from the Endpoint or Bridge reports an RCB value of > 128 bytes (a value of 1b in the Read Completion Boundary bit). > Default value of this bit is 0b. > > Functions that do not implement this feature must hardwire the > bit to 0b. > > Before commit 7a1562d4f: > > 41:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3] > > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ > > ExtSynch+ ClockPM- AutWidDis- BWInt- AutBWInt- > > > > 40:02.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 2a (rev 07) (prog-if 00 [Normal decode]) > > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ > > ExtSynch+ ClockPM- AutWidDis- BWInt- AutBWInt- > > After: > > 40:02.0 PCI bridge: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 PCI Express Root Port 2a (rev 07) (prog-if 00 [Normal decode]) > > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ > > ExtSynch+ ClockPM- AutWidDis- BWInt- AutBWInt- > > > > 41:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3] > > LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk+ > > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- > > Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > Reviewed-by: Hannes Reinecke <hare@suse.com> > --- > drivers/pci/probe.c | 29 +++++++++++++++++++++++++++-- > 1 file changed, 27 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index ab00267..0a4ab9c 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1439,6 +1439,19 @@ static void program_hpp_type1(struct pci_dev *dev, struct hpp_type1 *hpp) > dev_warn(&dev->dev, "PCI-X settings not supported\n"); > } > > +static bool pcie_get_root_rcb(struct pci_dev *dev) > +{ > + struct pci_dev *rp = pcie_find_root_port(dev); > + u16 lnkctl; > + > + if (!rp) > + return false; > + > + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); > + > + return lnkctl & PCI_EXP_LNKCTL_RCB; > +} > + > static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) > { > int pos; > @@ -1468,9 +1481,21 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) > ~hpp->pci_exp_devctl_and, hpp->pci_exp_devctl_or); > > /* Initialize Link Control Register */ > - if (pcie_cap_has_lnkctl(dev)) > + if (pcie_cap_has_lnkctl(dev)) { > + bool rrcb; > + u16 clear; > + u16 set; > + > + rrcb = pcie_get_root_rcb(dev); > + > + clear = ~hpp->pci_exp_lnkctl_and; > + set = hpp->pci_exp_lnkctl_or; > + if (!rrcb) > + set &= ~PCI_EXP_LNKCTL_RCB; > + > pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL, > - ~hpp->pci_exp_lnkctl_and, hpp->pci_exp_lnkctl_or); > + clear, set); > + } > > /* Find Advanced Error Reporting Enhanced Capability */ > pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR); > -- > 2.10.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 09, 2016 at 11:11:40AM -0600, Bjorn Helgaas wrote: > Hi Johannes, > > On Wed, Nov 02, 2016 at 04:35:52PM -0600, Johannes Thumshirn wrote: > > The Read Completion Boundary (RCB) bit must only be set on a device or > > endpoint if it is set on the root complex. > > > > Certain BIOSes erroneously set the RCB Bit in their ACPI _HPX Tables > > even if it is not set on the root port. This is a violation to the PCIe > > Specification and is known to bring some Mellanox Connect-X 3 HCAs into > > a state where they can't map their firmware and go into error recovery. > > > > BIOS Information > > Vendor: IBM > > Version: -[A8E120CUS-1.30]- > > Release Date: 08/22/2016 > > This seems like a pretty serious problem (sounds like maybe the HCA is > completely useless?) Correct. > > Can you point us at a bugzilla or other problem report? It's nice to > have details of what this looks like to a user, so people who trip > over this problem have a little more chance of finding the solution. As we already said, our bugzilla entry for this is not accessible from the outside, but I know Red Hat does have a bugzilla entry for the same issue as well. Maybe this is reachable from the outside (adding Don for this, as I know he has worked on this problem as well). > > 7a1562d4f2d0 ("PCI: Apply _HPX Link Control settings to all devices > with a link") appeared in v3.18, so it's probably not a *new* problem, > so my guess is that this is v4.10 material. Yes 4.10 sounds good to me. I personally think, this problem hasn't materialized yet, as this is the kind of hardware you run on a rather /stable/ kernel either you built on your own or get from an enterprise distribution and until recently these kernels haven't been updated to something newer than 3.18. Thanks, Johannes
On 11/14/2016 06:56 AM, Johannes Thumshirn wrote: > On Wed, Nov 09, 2016 at 11:11:40AM -0600, Bjorn Helgaas wrote: >> Hi Johannes, >> >> On Wed, Nov 02, 2016 at 04:35:52PM -0600, Johannes Thumshirn wrote: >>> The Read Completion Boundary (RCB) bit must only be set on a device or >>> endpoint if it is set on the root complex. >>> >>> Certain BIOSes erroneously set the RCB Bit in their ACPI _HPX Tables >>> even if it is not set on the root port. This is a violation to the PCIe >>> Specification and is known to bring some Mellanox Connect-X 3 HCAs into >>> a state where they can't map their firmware and go into error recovery. >>> >>> BIOS Information >>> Vendor: IBM >>> Version: -[A8E120CUS-1.30]- >>> Release Date: 08/22/2016 >> >> This seems like a pretty serious problem (sounds like maybe the HCA is >> completely useless?) > > Correct. > >> >> Can you point us at a bugzilla or other problem report? It's nice to >> have details of what this looks like to a user, so people who trip >> over this problem have a little more chance of finding the solution. > > As we already said, our bugzilla entry for this is not accessible from the > outside, but I know Red Hat does have a bugzilla entry for the same issue as > well. Maybe this is reachable from the outside (adding Don for this, as I know > he has worked on this problem as well). > RHEL bz's are not accessible from the outside. I suggest capturing the content of the RH bz issue and creating a k.o. bz with the information. >> >> 7a1562d4f2d0 ("PCI: Apply _HPX Link Control settings to all devices >> with a link") appeared in v3.18, so it's probably not a *new* problem, >> so my guess is that this is v4.10 material. > > Yes 4.10 sounds good to me. I personally think, this problem hasn't > materialized yet, as this is the kind of hardware you run on a rather /stable/ > kernel either you built on your own or get from an enterprise distribution and > until recently these kernels haven't been updated to something newer than > 3.18. > > Thanks, > Johannes > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 14, 2016 at 11:16:51AM -0500, Don Dutile wrote: > On 11/14/2016 06:56 AM, Johannes Thumshirn wrote: > > On Wed, Nov 09, 2016 at 11:11:40AM -0600, Bjorn Helgaas wrote: > > > Hi Johannes, > > > > > > On Wed, Nov 02, 2016 at 04:35:52PM -0600, Johannes Thumshirn wrote: > > > > The Read Completion Boundary (RCB) bit must only be set on a device or > > > > endpoint if it is set on the root complex. > > > > > > > > Certain BIOSes erroneously set the RCB Bit in their ACPI _HPX Tables > > > > even if it is not set on the root port. This is a violation to the PCIe > > > > Specification and is known to bring some Mellanox Connect-X 3 HCAs into > > > > a state where they can't map their firmware and go into error recovery. > > > > > > > > BIOS Information > > > > Vendor: IBM > > > > Version: -[A8E120CUS-1.30]- > > > > Release Date: 08/22/2016 > > > > > > This seems like a pretty serious problem (sounds like maybe the HCA is > > > completely useless?) > > > > Correct. > > > > > > > > Can you point us at a bugzilla or other problem report? It's nice to > > > have details of what this looks like to a user, so people who trip > > > over this problem have a little more chance of finding the solution. > > > > As we already said, our bugzilla entry for this is not accessible from the > > outside, but I know Red Hat does have a bugzilla entry for the same issue as > > well. Maybe this is reachable from the outside (adding Don for this, as I know > > he has worked on this problem as well). > > > RHEL bz's are not accessible from the outside. > I suggest capturing the content of the RH bz issue and creating a k.o. bz > with the information. I've created https://bugzilla.kernel.org/show_bug.cgi?id=187781 to track this on b.k.o. Feel free to add any information you have. @Bjorn anything else I can provide in order to get the fix applied? Byte, Johannes
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ab00267..0a4ab9c 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1439,6 +1439,19 @@ static void program_hpp_type1(struct pci_dev *dev, struct hpp_type1 *hpp) dev_warn(&dev->dev, "PCI-X settings not supported\n"); } +static bool pcie_get_root_rcb(struct pci_dev *dev) +{ + struct pci_dev *rp = pcie_find_root_port(dev); + u16 lnkctl; + + if (!rp) + return false; + + pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl); + + return lnkctl & PCI_EXP_LNKCTL_RCB; +} + static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) { int pos; @@ -1468,9 +1481,21 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) ~hpp->pci_exp_devctl_and, hpp->pci_exp_devctl_or); /* Initialize Link Control Register */ - if (pcie_cap_has_lnkctl(dev)) + if (pcie_cap_has_lnkctl(dev)) { + bool rrcb; + u16 clear; + u16 set; + + rrcb = pcie_get_root_rcb(dev); + + clear = ~hpp->pci_exp_lnkctl_and; + set = hpp->pci_exp_lnkctl_or; + if (!rrcb) + set &= ~PCI_EXP_LNKCTL_RCB; + pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL, - ~hpp->pci_exp_lnkctl_and, hpp->pci_exp_lnkctl_or); + clear, set); + } /* Find Advanced Error Reporting Enhanced Capability */ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);