Message ID | 1449074998-9664-1-git-send-email-okaya@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Hi Sinan, On Wed, Dec 02, 2015 at 11:49:56AM -0500, Sinan Kaya wrote: > A PCIe card behind a switch is unable to report its errors when SERR# > forwarding is not enabled on the PCIe# switch's secondary interface > according to the spec. This patch enables SERR# forwarding when the PCI > header type is bridge. > > Signed-off-by: Sinan Kaya <okaya@codeaurora.org> I'm sitting on this for the moment because if you have _HPP, it seems like that should be enough to get SERR# forwarding turned on, and if it's not, I'd like to understand why. So no hurry, but I'm waiting on your investigation. Bjorn > --- > drivers/pci/pcie/aer/aerdrv_core.c | 32 ++++++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c > index 9803e3d..f248c17 100644 > --- a/drivers/pci/pcie/aer/aerdrv_core.c > +++ b/drivers/pci/pcie/aer/aerdrv_core.c > @@ -37,21 +37,53 @@ module_param(nosourceid, bool, 0); > > int pci_enable_pcie_error_reporting(struct pci_dev *dev) > { > + u8 header_type; > + > if (pcie_aer_get_firmware_first(dev)) > return -EIO; > > if (!pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR)) > return -EIO; > > + pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type); > + > + /* needs to be a bridge/switch */ > + if (header_type == PCI_HEADER_TYPE_BRIDGE) { > + u16 control; > + > + /* > + * A switch will not forward ERR_ messages coming from an > + * endpoint if SERR# forwarding is not enabled. > + * AER driver is checking the errors at the root only. > + */ > + pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &control); > + control |= PCI_BRIDGE_CTL_SERR; > + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, control); > + } > + > return pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS); > } > EXPORT_SYMBOL_GPL(pci_enable_pcie_error_reporting); > > int pci_disable_pcie_error_reporting(struct pci_dev *dev) > { > + u8 header_type; > + > if (pcie_aer_get_firmware_first(dev)) > return -EIO; > > + pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type); > + > + /* needs to be a bridge/switch */ > + if (header_type == PCI_HEADER_TYPE_BRIDGE) { > + u16 control; > + > + /* clear serr forwarding */ > + pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &control); > + control &= ~PCI_BRIDGE_CTL_SERR; > + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, control); > + } > + > return pcie_capability_clear_word(dev, PCI_EXP_DEVCTL, > PCI_EXP_AER_FLAGS); > } > -- > Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/4/2015 4:06 PM, Bjorn Helgaas wrote: > I'm sitting on this for the moment because if you have _HPP, it seems > like that should be enough to get SERR# forwarding turned on, and if > it's not, I'd like to understand why. So no hurry, but I'm waiting on > your investigation. > > Bjorn OK. I'll find out. I did not get a chance to debug it yet. My guess is that I wrote this patch before enabling the hotplug support. Therefore, I was not aware of any _HPP implementation in the kernel.
Hi Bjorn, On 12/4/2015 4:06 PM, Bjorn Helgaas wrote: > I'm sitting on this for the moment because if you have _HPP, it seems > like that should be enough to get SERR# forwarding turned on, and if > it's not, I'd like to understand why. So no hurry, but I'm waiting on > your investigation. > > Bjorn Here is the overall summary after my investigation. It looks like the kernel covers the hotplug use case. This patch is needed for systems without hotplug support and when the firmware is not setting up the SERR. after boot /# dmesg | grep hpp [ 3.115227] pci 0004:01:00.0: program_hpp_type0:1376 [ 3.128870] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.149597] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 3.191601] pci 0004:02:08.0: program_hpp_type0:1376 [ 3.191611] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.206630] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 3.253760] pci 0004:03:00.0: program_hpp_type0:1376 [ 3.267335] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.288046] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 3.355296] pci 0004:04:00.0: program_hpp_type0:1376 [ 3.355306] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.370334] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR / # lspci 00:00.0 Class 0604: 17cb:0400 01:00.0 Class 0604: 10b5:8732 02:08.0 Class 0604: 10b5:8732 03:00.0 Class 0604: 10b5:8732 04:00.0 Class 0604: 10b5:8732 / # Without hpp in ACPI table, SERR is not enabled. /# dmesg | grep type0 /# Power up with HPP after boot. [ 3.129325]_pci_0004:01:00.0:_program_hpp_type0:1376 [ 3.143286] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.164016] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 3.206019] pci 0004:02:08.0: program_hpp_type0:1376 [ 3.206028] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.220609] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 3.267783] pci 0004:03:00.0: program_hpp_type0:1376 [ 3.281420] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.302197] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 3.369684] pci 0004:04:00.0: program_hpp_type0:1376 [ 3.369694] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 3.384080] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR hotplug eject hotplug insert [ 98.338131] pci 0004:01:00.0: program_hpp_type0:1376 [ 98.351813] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 98.373147] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 98.452051] pci 0004:02:08.0: program_hpp_type0:1376 [ 98.465772] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 98.487142] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 98.597579] pci 0004:03:00.0: program_hpp_type0:1376 [ 98.611290] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 98.632181] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR [ 98.736153] pci 0004:04:00.0: program_hpp_type0:1376 [ 98.750437] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= PCI_COMMAND_SERR [ 98.771202] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= PCI_BRIDGE_CTL_SERR / #
On Thu, Dec 10, 2015 at 03:28:35PM -0500, Sinan Kaya wrote: > Hi Bjorn, > > On 12/4/2015 4:06 PM, Bjorn Helgaas wrote: > > I'm sitting on this for the moment because if you have _HPP, it seems > > like that should be enough to get SERR# forwarding turned on, and if > > it's not, I'd like to understand why. So no hurry, but I'm waiting on > > your investigation. > > > > Bjorn > > Here is the overall summary after my investigation. > > It looks like the kernel covers the hotplug use case. This patch is > needed for systems without hotplug support and when the firmware is not > setting up the SERR. Here's how I understand your results: Firmware _HPP or Devices Hot-added Hot-added enables _HPX sets present at root ports endpoints SERR# SERR# boot work work work -------- --------- ---------- ---------- --------- no no no (1) no (2) no (4) no yes yes yes yes yes no yes no (3) no (5) yes yes yes yes yes Your patch fixes cases 1-3 above, but I don't think it fixes cases 4 or 5. The difference is that in cases 2 and 3, when we hot-add a root port, the AER driver binds to the root port and (with your patch) enables SERR for anything below it. But in cases 4 and 5, the root port is already there, the AER driver has already bound to it. The AER driver tried to enable SERR for the hierarchy below the root port, but there was nothing there. Now we add the endpoint, and the AER driver isn't involved, so I don't think anything will enable SERR for the new endpoint. I think the best way to fix all the cases would be to do something in in pci_configure_device(). Then we could drop the AER bus walk in set_downstream_devices_error_reporting(). A bus walk like that is always an issue for hotplug. In principle, we should be able to just enable PCI_COMMAND_SERR and PCI_BRIDGE_CTL_SERR for everything, and then errors would get forwarded to the root port, and if/when the AER driver claimed the root port, it would start collecting them. But I'm a little leery of doing it unconditionally because there are a lot of platform- and driver-specific uses of those bits, and I'm afraid of breaking something. It might be possible, but it'll take some care to do it safely. > after boot > > /# dmesg | grep hpp > > [ 3.115227] pci 0004:01:00.0: program_hpp_type0:1376 > [ 3.128870] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.149597] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 3.191601] pci 0004:02:08.0: program_hpp_type0:1376 > [ 3.191611] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.206630] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 3.253760] pci 0004:03:00.0: program_hpp_type0:1376 > [ 3.267335] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.288046] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 3.355296] pci 0004:04:00.0: program_hpp_type0:1376 > [ 3.355306] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.370334] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > > / # lspci > 00:00.0 Class 0604: 17cb:0400 > 01:00.0 Class 0604: 10b5:8732 > 02:08.0 Class 0604: 10b5:8732 > 03:00.0 Class 0604: 10b5:8732 > 04:00.0 Class 0604: 10b5:8732 > / # > > > Without hpp in ACPI table, SERR is not enabled. > > /# dmesg | grep type0 > /# > > Power up with HPP after boot. > > [ 3.129325]_pci_0004:01:00.0:_program_hpp_type0:1376 > [ 3.143286] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.164016] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 3.206019] pci 0004:02:08.0: program_hpp_type0:1376 > [ 3.206028] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.220609] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 3.267783] pci 0004:03:00.0: program_hpp_type0:1376 > [ 3.281420] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.302197] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 3.369684] pci 0004:04:00.0: program_hpp_type0:1376 > [ 3.369694] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 3.384080] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > > hotplug eject > > hotplug insert > > [ 98.338131] pci 0004:01:00.0: program_hpp_type0:1376 > [ 98.351813] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 98.373147] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 98.452051] pci 0004:02:08.0: program_hpp_type0:1376 > [ 98.465772] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 98.487142] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 98.597579] pci 0004:03:00.0: program_hpp_type0:1376 > [ 98.611290] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 98.632181] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > [ 98.736153] pci 0004:04:00.0: program_hpp_type0:1376 > [ 98.750437] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= > PCI_COMMAND_SERR > [ 98.771202] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= > PCI_BRIDGE_CTL_SERR > / # > > > > -- > Sinan Kaya > Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a > Linux Foundation Collaborative Project > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/10/2015 5:37 PM, Bjorn Helgaas wrote: > On Thu, Dec 10, 2015 at 03:28:35PM -0500, Sinan Kaya wrote: >> Hi Bjorn, >> >> On 12/4/2015 4:06 PM, Bjorn Helgaas wrote: >>> I'm sitting on this for the moment because if you have _HPP, it seems >>> like that should be enough to get SERR# forwarding turned on, and if >>> it's not, I'd like to understand why. So no hurry, but I'm waiting on >>> your investigation. >>> >>> Bjorn >> >> Here is the overall summary after my investigation. >> >> It looks like the kernel covers the hotplug use case. This patch is >> needed for systems without hotplug support and when the firmware is not >> setting up the SERR. > > Here's how I understand your results: > > Firmware _HPP or Devices Hot-added Hot-added > enables _HPX sets present at root ports endpoints > SERR# SERR# boot work work work > -------- --------- ---------- ---------- --------- > no no no (1) no (2) no (4) > no yes yes yes yes > yes no yes no (3) no (5) > yes yes yes yes yes > > Your patch fixes cases 1-3 above, but I don't think it fixes cases 4 > or 5. > I think so. I don't supported 4 and 5 cases on our platform. But, I could write up a patch if somebody can test it on such a platform. > The difference is that in cases 2 and 3, when we hot-add a root port, > the AER driver binds to the root port and (with your patch) enables > SERR for anything below it. > > But in cases 4 and 5, the root port is already there, the AER driver > has already bound to it. The AER driver tried to enable SERR for the > hierarchy below the root port, but there was nothing there. Now we > add the endpoint, and the AER driver isn't involved, so I don't think > anything will enable SERR for the new endpoint. > > I think the best way to fix all the cases would be to do something in > in pci_configure_device(). Then we could drop the AER bus walk in > set_downstream_devices_error_reporting(). A bus walk like that is > always an issue for hotplug. > Let me read some code. > In principle, we should be able to just enable PCI_COMMAND_SERR and > PCI_BRIDGE_CTL_SERR for everything, and then errors would get > forwarded to the root port, and if/when the AER driver claimed the > root port, it would start collecting them. > > But I'm a little leery of doing it unconditionally because there are a > lot of platform- and driver-specific uses of those bits, and I'm > afraid of breaking something. It might be possible, but it'll take > some care to do it safely. Sure, when we were talking about ECRC the other day; you said we could enable it on platforms post 2000 using some SMBIOS API. We could go the same route here. > >> after boot >> >> /# dmesg | grep hpp >> >> [ 3.115227] pci 0004:01:00.0: program_hpp_type0:1376 >> [ 3.128870] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.149597] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 3.191601] pci 0004:02:08.0: program_hpp_type0:1376 >> [ 3.191611] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.206630] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 3.253760] pci 0004:03:00.0: program_hpp_type0:1376 >> [ 3.267335] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.288046] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 3.355296] pci 0004:04:00.0: program_hpp_type0:1376 >> [ 3.355306] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.370334] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> >> / # lspci >> 00:00.0 Class 0604: 17cb:0400 >> 01:00.0 Class 0604: 10b5:8732 >> 02:08.0 Class 0604: 10b5:8732 >> 03:00.0 Class 0604: 10b5:8732 >> 04:00.0 Class 0604: 10b5:8732 >> / # >> >> >> Without hpp in ACPI table, SERR is not enabled. >> >> /# dmesg | grep type0 >> /# >> >> Power up with HPP after boot. >> >> [ 3.129325]_pci_0004:01:00.0:_program_hpp_type0:1376 >> [ 3.143286] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.164016] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 3.206019] pci 0004:02:08.0: program_hpp_type0:1376 >> [ 3.206028] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.220609] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 3.267783] pci 0004:03:00.0: program_hpp_type0:1376 >> [ 3.281420] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.302197] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 3.369684] pci 0004:04:00.0: program_hpp_type0:1376 >> [ 3.369694] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 3.384080] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> >> hotplug eject >> >> hotplug insert >> >> [ 98.338131] pci 0004:01:00.0: program_hpp_type0:1376 >> [ 98.351813] pci 0004:01:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 98.373147] pci 0004:01:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 98.452051] pci 0004:02:08.0: program_hpp_type0:1376 >> [ 98.465772] pci 0004:02:08.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 98.487142] pci 0004:02:08.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 98.597579] pci 0004:03:00.0: program_hpp_type0:1376 >> [ 98.611290] pci 0004:03:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 98.632181] pci 0004:03:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> [ 98.736153] pci 0004:04:00.0: program_hpp_type0:1376 >> [ 98.750437] pci 0004:04:00.0: program_hpp_type0:1378 pci_cmd |= >> PCI_COMMAND_SERR >> [ 98.771202] pci 0004:04:00.0: program_hpp_type0:1391 pci_bctl |= >> PCI_BRIDGE_CTL_SERR >> / # >> >> >> >> -- >> Sinan Kaya >> Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a >> Linux Foundation Collaborative Project >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
On 12/11/2015 6:30 PM, Sinan Kaya wrote: >> I think the best way to fix all the cases would be to do something in >> > in pci_configure_device(). Then we could drop the AER bus walk in >> > set_downstream_devices_error_reporting(). A bus walk like that is >> > always an issue for hotplug. >> > > Let me read some code. > OK, If I understand it right; pci_configure_device is where program_hpp_type0 called. You also want to enable AER in this function. Move the contents of set_device_error_reporting into pci_configure_device like this below ? ... + int type = pci_pcie_type(dev); pci_configure_mps(dev); + if ((type == PCI_EXP_TYPE_ROOT_PORT) || + (type == PCI_EXP_TYPE_UPSTREAM) || + (type == PCI_EXP_TYPE_DOWNSTREAM)) { + pci_enable_pcie_error_reporting(dev); + } + pcie_set_ecrc_checking(dev);
Hi Sinan, Sorry for the delay in responding; I was on vacation when you sent this, and I missed it when I returned. On Mon, Dec 14, 2015 at 01:22:39PM -0500, Sinan Kaya wrote: > On 12/11/2015 6:30 PM, Sinan Kaya wrote: > >> I think the best way to fix all the cases would be to do something in > >> > in pci_configure_device(). Then we could drop the AER bus walk in > >> > set_downstream_devices_error_reporting(). A bus walk like that is > >> > always an issue for hotplug. > >> > > > Let me read some code. > > OK, If I understand it right; pci_configure_device is where > program_hpp_type0 called. You also want to enable AER in this function. > > Move the contents of set_device_error_reporting into > pci_configure_device like this below ? > > ... > + int type = pci_pcie_type(dev); > > pci_configure_mps(dev); > > + if ((type == PCI_EXP_TYPE_ROOT_PORT) || > + (type == PCI_EXP_TYPE_UPSTREAM) || > + (type == PCI_EXP_TYPE_DOWNSTREAM)) { > + pci_enable_pcie_error_reporting(dev); > + } > > + pcie_set_ecrc_checking(dev); Yep, that's the sort of thing I'm thinking. I think there are some subtleties to consider. _HPP/_HPX can twiddle some of the same bits. Should we allow _HPP to clear a bit that pci_enable_pcie_error_reporting() would set? What about the reverse? There are a ridiculous number of places that call pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR). We should do it once and cache the result in struct pci_dev. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/28/2015 5:29 PM, Bjorn Helgaas wrote: > Hi Sinan, > > Sorry for the delay in responding; I was on vacation when you sent > this, and I missed it when I returned. > > On Mon, Dec 14, 2015 at 01:22:39PM -0500, Sinan Kaya wrote: >> On 12/11/2015 6:30 PM, Sinan Kaya wrote: >>>> I think the best way to fix all the cases would be to do something in >>>>> in pci_configure_device(). Then we could drop the AER bus walk in >>>>> set_downstream_devices_error_reporting(). A bus walk like that is >>>>> always an issue for hotplug. >>>>> >>> Let me read some code. >> >> OK, If I understand it right; pci_configure_device is where >> program_hpp_type0 called. You also want to enable AER in this function. >> >> Move the contents of set_device_error_reporting into >> pci_configure_device like this below ? >> >> ... >> + int type = pci_pcie_type(dev); >> >> pci_configure_mps(dev); >> >> + if ((type == PCI_EXP_TYPE_ROOT_PORT) || >> + (type == PCI_EXP_TYPE_UPSTREAM) || >> + (type == PCI_EXP_TYPE_DOWNSTREAM)) { >> + pci_enable_pcie_error_reporting(dev); >> + } >> >> + pcie_set_ecrc_checking(dev); > > Yep, that's the sort of thing I'm thinking. > > I think there are some subtleties to consider. > > _HPP/_HPX can twiddle some of the same bits. Should we allow _HPP to > clear a bit that pci_enable_pcie_error_reporting() would set? What > about the reverse? > > There are a ridiculous number of places that call > pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR). We should > do it once and cache the result in struct pci_dev. > > Bjorn > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Thanks for the heads up. I had another pending question to you on another patch. I just sent a ping. I'll start working on this patch after the holidays.
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c index 9803e3d..f248c17 100644 --- a/drivers/pci/pcie/aer/aerdrv_core.c +++ b/drivers/pci/pcie/aer/aerdrv_core.c @@ -37,21 +37,53 @@ module_param(nosourceid, bool, 0); int pci_enable_pcie_error_reporting(struct pci_dev *dev) { + u8 header_type; + if (pcie_aer_get_firmware_first(dev)) return -EIO; if (!pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR)) return -EIO; + pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type); + + /* needs to be a bridge/switch */ + if (header_type == PCI_HEADER_TYPE_BRIDGE) { + u16 control; + + /* + * A switch will not forward ERR_ messages coming from an + * endpoint if SERR# forwarding is not enabled. + * AER driver is checking the errors at the root only. + */ + pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &control); + control |= PCI_BRIDGE_CTL_SERR; + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, control); + } + return pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS); } EXPORT_SYMBOL_GPL(pci_enable_pcie_error_reporting); int pci_disable_pcie_error_reporting(struct pci_dev *dev) { + u8 header_type; + if (pcie_aer_get_firmware_first(dev)) return -EIO; + pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type); + + /* needs to be a bridge/switch */ + if (header_type == PCI_HEADER_TYPE_BRIDGE) { + u16 control; + + /* clear serr forwarding */ + pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &control); + control &= ~PCI_BRIDGE_CTL_SERR; + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, control); + } + return pcie_capability_clear_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS); }
A PCIe card behind a switch is unable to report its errors when SERR# forwarding is not enabled on the PCIe# switch's secondary interface according to the spec. This patch enables SERR# forwarding when the PCI header type is bridge. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> --- drivers/pci/pcie/aer/aerdrv_core.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)