Message ID | 153989603841.78375.436243401553964151.stgit@bhelgaas-glaptop.roam.corp.google.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [v4] PCI/AER: Enable error reporting for all ports | expand |
On Thu, Oct 18, 2018 at 03:53:58PM -0500, Bjorn Helgaas wrote: > Change the AER service driver so it binds to *all* PCIe Ports, including > Switch Upstream and Downstream Ports. Enable AER error reporting for all > these Ports, but not for any children. I'm looking at this again and think enabling/disabling error reporting for ports is the responsibility of the port driver, not the AER service. The following should do the same as this patch, but without making AER driver handle non-root ports. The report enabling/disabling functions are already stubbed for '!CONFIG_PCIE_AER' and have checks for aer_cap and firmware first. A real patch for this could even make this remove all the aer specific error report enabling, so it'd be a net-loss in code lines. :) --- diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c index 0acca3596807..f129a33c8303 100644 --- a/drivers/pci/pcie/portdrv_pci.c +++ b/drivers/pci/pcie/portdrv_pci.c @@ -122,12 +122,13 @@ static int pcie_portdrv_probe(struct pci_dev *dev, pm_runtime_put_autosuspend(&dev->dev); pm_runtime_allow(&dev->dev); } - + pci_enable_pcie_error_reporting(dev); return 0; } static void pcie_portdrv_remove(struct pci_dev *dev) { + pci_disable_pcie_error_reporting(dev); if (pci_bridge_d3_possible(dev)) { pm_runtime_forbid(&dev->dev); pm_runtime_get_noresume(&dev->dev); --
On Thu, Oct 18, 2018 at 05:03:13PM -0600, Keith Busch wrote: > On Thu, Oct 18, 2018 at 03:53:58PM -0500, Bjorn Helgaas wrote: > > Change the AER service driver so it binds to *all* PCIe Ports, > > including Switch Upstream and Downstream Ports. Enable AER error > > reporting for all these Ports, but not for any children. > > I'm looking at this again and think enabling/disabling error > reporting for ports is the responsibility of the port driver, not > the AER service. That's an interesting idea. Can you expand on this a little more? Why is it the responsibility of the port driver? Do you think pci_enable_pcie_error_reporting() shouldn't be part of the AER service because it updates the Device Control register, which is in the PCIe Capability, not the AER Capability? What about pci_aer_clear_device_status(), which clears Device Status, which is also in the PCIe Capability? > The following should do the same as this patch, but without making > AER driver handle non-root ports. The report enabling/disabling > functions are already stubbed for '!CONFIG_PCIE_AER' and have checks > for aer_cap and firmware first. If we thought we should enable error reporting *always*, regardless of whether the AER service is enabled, this would make perfect sense to me, and I might suggest doing it in an even more generic place like pci_configure_device() or pci_init_capabilities(). But that doesn't seem like where you're headed. It seems like you still only want error reporting enabled when CONFIG_PCIEAR=y. If that's the case, it seems like doing it in portdrv only obfuscates the connection with AER. When CONFIG_PCIEAER is unset, the portdrv code *looks* like it's doing something but it's really not because of the #ifdef magic. > A real patch for this could even make this remove all the aer > specific error report enabling, so it'd be a net-loss in code lines. > :) > > --- > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c > index 0acca3596807..f129a33c8303 100644 > --- a/drivers/pci/pcie/portdrv_pci.c > +++ b/drivers/pci/pcie/portdrv_pci.c > @@ -122,12 +122,13 @@ static int pcie_portdrv_probe(struct pci_dev *dev, > pm_runtime_put_autosuspend(&dev->dev); > pm_runtime_allow(&dev->dev); > } > - > + pci_enable_pcie_error_reporting(dev); > return 0; > } > > static void pcie_portdrv_remove(struct pci_dev *dev) > { > + pci_disable_pcie_error_reporting(dev); > if (pci_bridge_d3_possible(dev)) { > pm_runtime_forbid(&dev->dev); > pm_runtime_get_noresume(&dev->dev); > --
On Thu, Oct 18, 2018 at 08:11:35PM -0500, Bjorn Helgaas wrote: > On Thu, Oct 18, 2018 at 05:03:13PM -0600, Keith Busch wrote: > > On Thu, Oct 18, 2018 at 03:53:58PM -0500, Bjorn Helgaas wrote: > > > Change the AER service driver so it binds to *all* PCIe Ports, > > > including Switch Upstream and Downstream Ports. Enable AER error > > > reporting for all these Ports, but not for any children. > > > > I'm looking at this again and think enabling/disabling error > > reporting for ports is the responsibility of the port driver, not > > the AER service. > > That's an interesting idea. Can you expand on this a little more? > Why is it the responsibility of the port driver? > > Do you think pci_enable_pcie_error_reporting() shouldn't be part of > the AER service because it updates the Device Control register, which > is in the PCIe Capability, not the AER Capability? > > What about pci_aer_clear_device_status(), which clears Device Status, > which is also in the PCIe Capability? I was comparing how other end device driver's enable this, and they all call pci_enable_pcie_error_reporting() somewhere along their pci_driver->probe path. With that in mind, are ports special compared to end devices for this particular feature? > > The following should do the same as this patch, but without making > > AER driver handle non-root ports. The report enabling/disabling > > functions are already stubbed for '!CONFIG_PCIE_AER' and have checks > > for aer_cap and firmware first. > > If we thought we should enable error reporting *always*, regardless of > whether the AER service is enabled, this would make perfect sense to > me, and I might suggest doing it in an even more generic place like > pci_configure_device() or pci_init_capabilities(). There are unfortunately still pci_driver instances that don't implement the err_handler callbacks, and may cause problems if we enable error reporting in the device when its driver isn't capable of reacting to them. If it wasn't for that, I think it would make more sense to move this responsibility from drivers to the pci core. > But that doesn't seem like where you're headed. It seems like you > still only want error reporting enabled when CONFIG_PCIEAR=y. If > that's the case, it seems like doing it in portdrv only obfuscates the > connection with AER. When CONFIG_PCIEAER is unset, the portdrv code > *looks* like it's doing something but it's really not because of the > #ifdef magic. Right, but that's no different than every other Linux pci_driver. The component that provides the pci_driver.err_handler should be responsible for requesting to enable device error reporting, and that's provided by the port driver, not AER.
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index a90a9194ac4a..f4cafb2ee7da 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -1315,12 +1315,6 @@ static void aer_enable_rootport(struct aer_rpc *rpc) pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, ®32); pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32); - /* - * Enable error reporting for the root port device and downstream port - * devices. - */ - set_downstream_devices_error_reporting(pdev, true); - /* Enable Root Port's interrupt in response to error messages */ pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_COMMAND, ®32); reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; @@ -1364,9 +1358,12 @@ static void aer_disable_rootport(struct aer_rpc *rpc) */ static void aer_remove(struct pcie_device *dev) { - struct aer_rpc *rpc = get_service_data(dev); + struct aer_rpc *rpc; - aer_disable_rootport(rpc); + if (pci_pcie_type(dev->port) == PCI_EXP_TYPE_ROOT_PORT) { + rpc = get_service_data(dev); + aer_disable_rootport(rpc); + } } /** @@ -1377,10 +1374,17 @@ static void aer_remove(struct pcie_device *dev) */ static int aer_probe(struct pcie_device *dev) { + struct pci_dev *pdev = dev->port; + int type = pci_pcie_type(pdev); int status; struct aer_rpc *rpc; struct device *device = &dev->device; + if (type == PCI_EXP_TYPE_UPSTREAM || type == PCI_EXP_TYPE_DOWNSTREAM) { + pci_enable_pcie_error_reporting(pdev); + return 0; + } + rpc = devm_kzalloc(device, sizeof(struct aer_rpc), GFP_KERNEL); if (!rpc) { dev_printk(KERN_DEBUG, device, "alloc AER rpc failed\n"); @@ -1398,52 +1402,56 @@ static int aer_probe(struct pcie_device *dev) } aer_enable_rootport(rpc); + pci_enable_pcie_error_reporting(pdev); dev_info(device, "AER enabled with IRQ %d\n", dev->irq); return 0; } /** - * aer_root_reset - reset link on Root Port - * @dev: pointer to Root Port's pci_dev data structure + * aer_reset_link - reset link + * @dev: pointer to pci_dev data structure * - * Invoked by Port Bus driver when performing link reset at Root Port. + * Invoked by Port Bus driver when performing link reset. */ -static pci_ers_result_t aer_root_reset(struct pci_dev *dev) +static pci_ers_result_t aer_reset_link(struct pci_dev *dev) { u32 reg32; - int pos; + int root = (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT); + int pos = dev->aer_cap; int rc; - pos = dev->aer_cap; - - /* Disable Root's interrupt in response to error messages */ - pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, ®32); - reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; - pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32); + if (root) { + /* Disable Root's interrupt in response to error messages */ + pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, ®32); + reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK; + pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32); + } rc = pci_bus_error_reset(dev); - pci_printk(KERN_DEBUG, dev, "Root Port link has been reset\n"); + pci_printk(KERN_DEBUG, dev, "downstream link has been reset\n"); - /* Clear Root Error Status */ - pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, ®32); - pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, reg32); + if (root) { + /* Clear Root Error Status */ + pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, ®32); + pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, reg32); - /* Enable Root Port's interrupt in response to error messages */ - pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, ®32); - reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; - pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32); + /* Enable Root Port's interrupt in response to error messages */ + pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, ®32); + reg32 |= ROOT_PORT_INTR_ON_MESG_MASK; + pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32); + } return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED; } static struct pcie_port_service_driver aerdriver = { .name = "aer", - .port_type = PCI_EXP_TYPE_ROOT_PORT, + .port_type = PCIE_ANY_PORT, .service = PCIE_PORT_SERVICE_AER, .probe = aer_probe, .remove = aer_remove, - .reset_link = aer_root_reset, + .reset_link = aer_reset_link, }; /**