Message ID | 20161116181158.GB26600@bhelgaas-glaptop.roam.corp.google.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On Wed, Nov 16, 2016 at 12:11:58PM -0600, Bjorn Helgaas wrote:
> Hi Johannes,
>
> On Wed, Nov 02, 2016 at 04:35:52PM -0600, Johannes Thumshirn wrote:
> > The Read Completion Boundary (RCB) bit must only be set on a device or
> > endpoint if it is set on the root complex.
>
> I propose the following slightly modified patch. The interesting
> difference is that your patch only touches the _HPX "OR" mask, so it
> refrains from *setting* RCB in some cases, but it never actually
> *clears* it. The only time we clear RCB is when the _HPX "AND" mask
> has RCB == 0.
>
> My intent below is that we completely ignore the _HPX RCB bits, and we
> set an Endpoint's RCB if and only if the Root Port's RCB is set.
>
> I made an ugly ASCII table to think about the cases:
>
>       Root   EP      _HPX  _HPX    Final Endpoint RCB state
>       Port   (init)  AND   OR      (curr)  (yours)  (mine)
>   0)  0      0       0     0       0       0        0
>   1)  0      0       0     1       1       0        0
>   2)  0      0       1     0       0       0        0
>   3)  0      0       1     1       1       0        0
>   4)  0      1       0     0       0       0        0
>   5)  0      1       0     1       1       0        0
>   6)  0      1       1     0       1       1        0
>   7)  0      1       1     1       1       1        0
>   8)  1      0       0     0       0       0        1
>   9)  1      0       0     1       1       1        1
>   A)  1      0       1     0       0       0        1
>   B)  1      0       1     1       1       1        1
>   C)  1      1       0     0       0       0        1
>   D)  1      1       0     1       1       1        1
>   E)  1      1       1     0       1       1        1
>   F)  1      1       1     1       1       1        1
>
> Cases 0-7 should all result in the Endpoint RCB being zero because the
> Root Port RCB is zero. Case 1 is the bug you're fixing. Cases 3 & 5
> are similar hypothetical bugs your patch also fixes.
>
> Cases 6 & 7, where firmware left the Endpoint RCB set and _HPX didn't
> tell us to clear it, are hypothetical firmware bugs that your patch
> wouldn't fix.
>
> In cases 8, A, and C, we currently leave the Endpoint RCB cleared,
> either because firmware left it clear and _HPX didn't tell us to set
> it (8 and A), or because firmware set it but _HPX told us to clear it
> (C).
>
> One could argue that 8, A, and C should stay as they currently are, as
> a way for _HPX to work around hardware bugs, e.g., a Root Port that
> advertises a 128-byte RCB but doesn't actually support it. I didn't
> bother with that and set the Endpoint's RCB to 128 in all cases when
> the Root Port claims to support it.
>
> It'd be great if you could test this and comment.

I've lost access to the machines, but I'll try to delegate it to someone who
has access.

>
> If you get a chance, collect the /proc/iomem contents, too. That's
> not for this bug; it's because I'm curious about the
>
>   ERST: Can not request [mem 0xb928b000-0xb928cbff] for ERST
>
> problem in your dmesg log.

I'll ask for this as well.

Byte,
	Johannes
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ab00267..8854161 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1439,6 +1439,21 @@ static void program_hpp_type1(struct pci_dev *dev, struct hpp_type1 *hpp)
 		dev_warn(&dev->dev, "PCI-X settings not supported\n");
 }
 
+static bool pcie_root_rcb_set(struct pci_dev *dev)
+{
+	struct pci_dev *rp = pcie_find_root_port(dev);
+	u16 lnkctl;
+
+	if (!rp)
+		return false;
+
+	pcie_capability_read_word(rp, PCI_EXP_LNKCTL, &lnkctl);
+	if (lnkctl & PCI_EXP_LNKCTL_RCB)
+		return true;
+
+	return false;
+}
+
 static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp)
 {
 	int pos;
@@ -1468,9 +1483,19 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp)
 			~hpp->pci_exp_devctl_and, hpp->pci_exp_devctl_or);
 
 	/* Initialize Link Control Register */
-	if (pcie_cap_has_lnkctl(dev))
+	if (pcie_cap_has_lnkctl(dev)) {
+
+		/*
+		 * If the Root Port supports Read Completion Boundary of
+		 * 128, set RCB to 128. Otherwise, clear it.
+		 */
+		hpp->pci_exp_lnkctl_and |= PCI_EXP_LNKCTL_RCB;
+		if (pcie_root_rcb_set(dev))
+			hpp->pci_exp_lnkctl_or |= PCI_EXP_LNKCTL_RCB;
+
 		pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL,
 			~hpp->pci_exp_lnkctl_and, hpp->pci_exp_lnkctl_or);
+	}
 
 	/* Find Advanced Error Reporting Enhanced Capability */
 	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);