Message ID | 20241008221657.1130181-16-terry.bowman@amd.com (mailing list archive) |
---|---|
State | Superseded |
Series | Enable CXL PCIe port protocol error handling and logging |
On Tue, 8 Oct 2024 17:16:57 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> The AER service drivers and CXL drivers are updated to handle PCIe
> port protocol errors. But, the PCIe AER correctable and uncorrectable
> internal errors are mask disabled for the PCIe port devices.
>
> Enable the AER internal errors for CXL PCIe port devices.
>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>

A while back I thought we had a discussion about just enabling these
for all devices and seeing if anyone screamed?

I'd love to do that rather than carefully enabling them for CXL devices
only ;)

If not, this looks fine to me.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/pci.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 4706113d2582..1d84a7022c4d 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -908,6 +908,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_port_err_detected, CXL);
>
>  void cxl_uport_init_aer(struct cxl_port *port)
>  {
> +	struct pci_dev *pdev = to_pci_dev(port->uport_dev);
>  	/* uport may have more than 1 downstream EP. Check if already mapped. */
>  	if (port->uport_regs.ras) {
>  		dev_warn(&port->dev, "RAS is already mapped\n");
> @@ -920,12 +921,14 @@ void cxl_uport_init_aer(struct cxl_port *port)
>  		dev_err(&port->dev, "Failed to map RAS capability.\n");
>  		return;
>  	}
> +	pci_aer_unmask_internal_errors(pdev);
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_uport_init_aer, CXL);
>
>  void cxl_dport_init_aer(struct cxl_dport *dport)
>  {
>  	struct device *dport_dev = dport->dport_dev;
> +	struct pci_dev *pdev = to_pci_dev(dport_dev);
>
>  	if (dport->rch) {
>  		struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev);
> @@ -949,6 +952,7 @@ void cxl_dport_init_aer(struct cxl_dport *dport)
>  		dev_err(dport_dev, "Failed to map RAS capability.\n");
>  		return;
>  	}
> +	pci_aer_unmask_internal_errors(pdev);
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_dport_init_aer, CXL);
>
Hi Jonathan,

On 10/16/24 12:21, Jonathan Cameron wrote:
> On Tue, 8 Oct 2024 17:16:57 -0500
> Terry Bowman <terry.bowman@amd.com> wrote:
>
>> The AER service drivers and CXL drivers are updated to handle PCIe
>> port protocol errors. But, the PCIe AER correctable and uncorrectable
>> internal errors are mask disabled for the PCIe port devices.
>>
>> Enable the AER internal errors for CXL PCIe port devices.
>>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>
> A while back I thought we had a discussion about just enabling these
> for all devices and seeing if anyone screamed?
>
> I'd love to do that rather than carefully enabling them for CXL devices
> only ;)
>
> If not, this looks fine to me.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>

These last 2 patches will be removed for v2. This is not necessary.
Internal AER errors for root ports and RCECs handling are already
enabled by the AER driver.

Regards,
Terry
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 4706113d2582..1d84a7022c4d 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -908,6 +908,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_port_err_detected, CXL);

 void cxl_uport_init_aer(struct cxl_port *port)
 {
+	struct pci_dev *pdev = to_pci_dev(port->uport_dev);
 	/* uport may have more than 1 downstream EP. Check if already mapped. */
 	if (port->uport_regs.ras) {
 		dev_warn(&port->dev, "RAS is already mapped\n");
@@ -920,12 +921,14 @@ void cxl_uport_init_aer(struct cxl_port *port)
 		dev_err(&port->dev, "Failed to map RAS capability.\n");
 		return;
 	}
+	pci_aer_unmask_internal_errors(pdev);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_uport_init_aer, CXL);

 void cxl_dport_init_aer(struct cxl_dport *dport)
 {
 	struct device *dport_dev = dport->dport_dev;
+	struct pci_dev *pdev = to_pci_dev(dport_dev);

 	if (dport->rch) {
 		struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev);
@@ -949,6 +952,7 @@ void cxl_dport_init_aer(struct cxl_dport *dport)
 		dev_err(dport_dev, "Failed to map RAS capability.\n");
 		return;
 	}
+	pci_aer_unmask_internal_errors(pdev);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dport_init_aer, CXL);
The AER service drivers and CXL drivers are updated to handle PCIe
port protocol errors. But, the PCIe AER correctable and uncorrectable
internal errors are mask disabled for the PCIe port devices.

Enable the AER internal errors for CXL PCIe port devices.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
---
 drivers/cxl/core/pci.c | 4 ++++
 1 file changed, 4 insertions(+)
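
For readers unfamiliar with what "unmasking" the internal errors means at the register level, here is a minimal userspace sketch. It is not the kernel implementation: `struct sim_aer_regs` and `sim_unmask_internal_errors()` are hypothetical names that model the two AER mask registers as plain integers. The bit positions are assumed from the PCIe AER capability layout (Uncorrectable Internal Error Mask at bit 22, Corrected Internal Error Mask at bit 14). A set mask bit suppresses reporting, so enabling an error means clearing its bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions from the PCIe AER capability layout. */
#define SIM_ERR_UNC_INTN     0x00400000u /* Uncorrectable Internal Error, bit 22 */
#define SIM_ERR_COR_INTERNAL 0x00004000u /* Corrected Internal Error, bit 14 */

/* Hypothetical stand-in for a port's two AER mask registers. */
struct sim_aer_regs {
	uint32_t uncor_mask; /* Uncorrectable Error Mask register */
	uint32_t cor_mask;   /* Correctable Error Mask register */
};

/*
 * Clear only the internal-error mask bits so those errors get reported.
 * All other mask bits are left exactly as they were.
 */
static inline void sim_unmask_internal_errors(struct sim_aer_regs *regs)
{
	assert(regs != NULL);

	regs->uncor_mask &= ~SIM_ERR_UNC_INTN;
	regs->cor_mask &= ~SIM_ERR_COR_INTERNAL;
}
```

In the real driver the same read-modify-write would go through PCI config space accessors on the device's AER capability rather than a struct in memory. The sketch only shows that the operation touches one bit per register.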