Message ID | 1477490014-115917-1-git-send-email-jthumshirn@suse.de (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On 10/26/2016 03:53 PM, Johannes Thumshirn wrote: > The Read Completion Boundary bit must only be set on a device or endpoint if > it is set on the upstream bridge. > > Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > drivers/pci/probe.c | 26 ++++++++++++++++++++++++-- > 1 file changed, 24 insertions(+), 2 deletions(-) > This fixes a regression where the mlx4 driver would not load (and, in fact, crash) on certain Ivybridge servers. Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes
Hi Johannes, On Wed, Oct 26, 2016 at 03:53:34PM +0200, Johannes Thumshirn wrote: > The Read Completion Boundary bit must only be set on a device or endpoint if > it is set on the upstream bridge. > > Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") Can you please include a spec citation and a pointer to the bug report? > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > --- > drivers/pci/probe.c | 26 ++++++++++++++++++++++++-- > 1 file changed, 24 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index ab00267..5007b96 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1439,6 +1439,16 @@ static void program_hpp_type1(struct pci_dev *dev, struct hpp_type1 *hpp) > dev_warn(&dev->dev, "PCI-X settings not supported\n"); > } > > +static bool pcie_get_upstream_rcb(struct pci_dev *dev) > +{ > + struct pci_dev *bridge = pci_upstream_bridge(dev); > + u16 lnkctl; > + > + pcie_capability_read_word(bridge, PCI_EXP_LNKCTL, &lnkctl); > + > + return lnkctl & PCI_EXP_LNKCTL_RCB; > +} > + > static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) > { > int pos; > @@ -1468,9 +1478,21 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) > ~hpp->pci_exp_devctl_and, hpp->pci_exp_devctl_or); > > /* Initialize Link Control Register */ > - if (pcie_cap_has_lnkctl(dev)) > + if (pcie_cap_has_lnkctl(dev)) { > + bool us_rcb; > + u16 clear; > + u16 set; > + > + us_rcb = pcie_get_upstream_rcb(dev); > + > + clear = ~hpp->pci_exp_lnkctl_and; > + set = hpp->pci_exp_lnkctl_or; > + if (!us_rcb) > + set &= ~PCI_EXP_LNKCTL_RCB; > + > pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL, > - ~hpp->pci_exp_lnkctl_and, hpp->pci_exp_lnkctl_or); > + clear, set); > + } > > /* Find Advanced Error Reporting Enhanced Capability */ > pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR); > -- > 1.8.5.6 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/26/2016 09:43 PM, Bjorn Helgaas wrote: > Hi Johannes, > > On Wed, Oct 26, 2016 at 03:53:34PM +0200, Johannes Thumshirn wrote: >> The Read Completion Boundary bit must only be set on a device or endpoint if >> it is set on the upstream bridge. >> >> Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") > > Can you please include a spec citation and a pointer to the bug report? > PCI Express Base Specification 1.1, section 2.3.1.1. Data Return for Read Requests: The Read Completion Boundary (RCB) parameter determines the naturally aligned address boundaries on which a Read Request may be serviced with multiple Completions o For a Root Complex, RCB is 64 bytes or 128 bytes o This value is reported through a configuration register (see Section 7.8) Note: Bridges and Endpoints may implement a corresponding command bit which may be set by system software to indicate the RCB value for the Root Complex, allowing the Bridge/Endpoint to optimize its behavior when the Root Complex’s RCB is 128 bytes. o For all other system elements, RCB is 128 bytes In this particular case the _HPX method was causing the RCB for all PCI devices to be set to 128 bytes, while the root bridge remained at 64 bytes. While this is arguably a BIOS bug, earlier linux version (ie without the mentioned patch) were running fine, so this is actually a regression. Cheers, Hannes
On Thu, Oct 27, 2016 at 07:42:27AM +0200, Hannes Reinecke wrote: > On 10/26/2016 09:43 PM, Bjorn Helgaas wrote: > > Hi Johannes, > > > > On Wed, Oct 26, 2016 at 03:53:34PM +0200, Johannes Thumshirn wrote: > >> The Read Completion Boundary bit must only be set on a device or endpoint if > >> it is set on the upstream bridge. > >> > >> Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") > > > > Can you please include a spec citation and a pointer to the bug report? > > > PCI Express Base Specification 1.1, > section 2.3.1.1. Data Return for Read Requests: > > The Read Completion Boundary (RCB) parameter determines the naturally > aligned address boundaries on which a Read Request may be serviced with > multiple Completions > o For a Root Complex, RCB is 64 bytes or 128 bytes > o This value is reported through a configuration register > (see Section 7.8) > Note: Bridges and Endpoints may implement a corresponding command > bit which may be set by system software to indicate the RCB value > for the Root Complex, allowing the Bridge/Endpoint to optimize its > behavior when the Root Complex’s RCB is 128 bytes. > o For all other system elements, RCB is 128 bytes > > In this particular case the _HPX method was causing the RCB for all PCI > devices to be set to 128 bytes, while the root bridge remained at 64 bytes. > While this is arguably a BIOS bug, earlier linux version (ie without the > mentioned patch) were running fine, so this is actually a regression. Thanks! I can fold this into the changelog. I assume you didn't mention a bugzilla or similar URL because this was found internally? I'd still like a clue about what this issue looks like to a user, because that helps connect future problem reports with this fix. And I suppose that since 7a1562d4f2d0 appeared in v3.18, we maybe should consider marking the fix for stable? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/27/2016 01:51 PM, Bjorn Helgaas wrote: > On Thu, Oct 27, 2016 at 07:42:27AM +0200, Hannes Reinecke wrote: >> On 10/26/2016 09:43 PM, Bjorn Helgaas wrote: >>> Hi Johannes, >>> >>> On Wed, Oct 26, 2016 at 03:53:34PM +0200, Johannes Thumshirn wrote: >>>> The Read Completion Boundary bit must only be set on a device or endpoint if >>>> it is set on the upstream bridge. >>>> >>>> Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") >>> >>> Can you please include a spec citation and a pointer to the bug report? >>> >> PCI Express Base Specification 1.1, >> section 2.3.1.1. Data Return for Read Requests: >> >> The Read Completion Boundary (RCB) parameter determines the naturally >> aligned address boundaries on which a Read Request may be serviced with >> multiple Completions >> o For a Root Complex, RCB is 64 bytes or 128 bytes >> o This value is reported through a configuration register >> (see Section 7.8) >> Note: Bridges and Endpoints may implement a corresponding command >> bit which may be set by system software to indicate the RCB value >> for the Root Complex, allowing the Bridge/Endpoint to optimize its >> behavior when the Root Complex’s RCB is 128 bytes. >> o For all other system elements, RCB is 128 bytes >> >> In this particular case the _HPX method was causing the RCB for all PCI >> devices to be set to 128 bytes, while the root bridge remained at 64 bytes. >> While this is arguably a BIOS bug, earlier linux version (ie without the >> mentioned patch) were running fine, so this is actually a regression. > > Thanks! I can fold this into the changelog. > > I assume you didn't mention a bugzilla or similar URL because this was > found internally? I'd still like a clue about what this issue looks > like to a user, because that helps connect future problem reports with > this fix. > We do have a bugzilla report, but as this is a) on the SUSE internal bugzilla and b) a customer issue I didn't include a reference here. However, the symptoms are: [ 8.648872] mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014) [ 8.648889] mlx4_core: Initializing 0000:41:00.0 [ 10.068642] mlx4_core 0000:41:00.0: command 0xfff failed: fw status = 0x1 [ 10.068645] mlx4_core 0000:41:00.0: MAP_FA command failed, aborting [ 10.068659] mlx4_core 0000:41:00.0: Failed to start FW, aborting [ 10.068661] mlx4_core 0000:41:00.0: Failed to init fw, aborting. [ 11.071536] mlx4_core: probe of 0000:41:00.0 failed with error -5 > And I suppose that since 7a1562d4f2d0 appeared in v3.18, we maybe > should consider marking the fix for stable? > Yes, please. Cheers, Hannes
On Thu, Oct 27, 2016 at 06:51:52AM -0500, Bjorn Helgaas wrote: > On Thu, Oct 27, 2016 at 07:42:27AM +0200, Hannes Reinecke wrote: > > On 10/26/2016 09:43 PM, Bjorn Helgaas wrote: > > > Hi Johannes, > > > > > > On Wed, Oct 26, 2016 at 03:53:34PM +0200, Johannes Thumshirn wrote: > > >> The Read Completion Boundary bit must only be set on a device or endpoint if > > >> it is set on the upstream bridge. > > >> > > >> Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") > > > > > > Can you please include a spec citation and a pointer to the bug report? > > > > > PCI Express Base Specification 1.1, > > section 2.3.1.1. Data Return for Read Requests: > > > > The Read Completion Boundary (RCB) parameter determines the naturally > > aligned address boundaries on which a Read Request may be serviced with > > multiple Completions > > o For a Root Complex, RCB is 64 bytes or 128 bytes > > o This value is reported through a configuration register > > (see Section 7.8) > > Note: Bridges and Endpoints may implement a corresponding command > > bit which may be set by system software to indicate the RCB value > > for the Root Complex, allowing the Bridge/Endpoint to optimize its > > behavior when the Root Complex’s RCB is 128 bytes. > > o For all other system elements, RCB is 128 bytes > > > > In this particular case the _HPX method was causing the RCB for all PCI > > devices to be set to 128 bytes, while the root bridge remained at 64 bytes. > > While this is arguably a BIOS bug, earlier linux version (ie without the > > mentioned patch) were running fine, so this is actually a regression. > > Thanks! I can fold this into the changelog. > > I assume you didn't mention a bugzilla or similar URL because this was > found internally? I'd still like a clue about what this issue looks > like to a user, because that helps connect future problem reports with > this fix. Sharing the bugzilla number won't be of any help here, as it's not accessible. The user visible issue was, that mlx4_core.ko wasn't able to issue the MLX4_CMD_MAP_FA command to it's firmware resulting in the adapter undergoing it's error recovery and ultimately hitting a BUG_ON() (which is subject to another patch). Even without asserting it won't load and hence not provide any network service. We haven't seen this behaviour on other drivers and it is likely a BIOS error (the wrong setting of PCI_EXP_LNKCTL_RCB) but that wasn't an issue before 7a1562d4f2d0 hence we do have a regression to kernels < 3.18. > > And I suppose that since 7a1562d4f2d0 appeared in v3.18, we maybe > should consider marking the fix for stable? Yes probably, but on the other hand we haven't heard of any other cases than this one (mlx4 NIC/HCA with a particular IBM IvyBridge Server, the next generation didn't expose the bug). Btw: Did you see I've sent a v2 as Alex Graf pointed out there might be a possible NULL pointer dereference when accessing the "bridge" pointer in pcie_get_upstream_rcb()? Thanks, Johannes
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ab00267..5007b96 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1439,6 +1439,16 @@ static void program_hpp_type1(struct pci_dev *dev, struct hpp_type1 *hpp) dev_warn(&dev->dev, "PCI-X settings not supported\n"); } +static bool pcie_get_upstream_rcb(struct pci_dev *dev) +{ + struct pci_dev *bridge = pci_upstream_bridge(dev); + u16 lnkctl; + + pcie_capability_read_word(bridge, PCI_EXP_LNKCTL, &lnkctl); + + return lnkctl & PCI_EXP_LNKCTL_RCB; +} + static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) { int pos; @@ -1468,9 +1478,21 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp) ~hpp->pci_exp_devctl_and, hpp->pci_exp_devctl_or); /* Initialize Link Control Register */ - if (pcie_cap_has_lnkctl(dev)) + if (pcie_cap_has_lnkctl(dev)) { + bool us_rcb; + u16 clear; + u16 set; + + us_rcb = pcie_get_upstream_rcb(dev); + + clear = ~hpp->pci_exp_lnkctl_and; + set = hpp->pci_exp_lnkctl_or; + if (!us_rcb) + set &= ~PCI_EXP_LNKCTL_RCB; + pcie_capability_clear_and_set_word(dev, PCI_EXP_LNKCTL, - ~hpp->pci_exp_lnkctl_and, hpp->pci_exp_lnkctl_or); + clear, set); + } /* Find Advanced Error Reporting Enhanced Capability */ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
The Read Completion Boundary bit must only be set on a device or endpoint if it is set on the upstream bridge. Fixes: 7a1562d4f ("PCI: Apply _HPX Link Control settings to all devices with a link") Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> --- drivers/pci/probe.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-)