Message ID | 74cee7d2-dffa-0559-d529-5c86023161e3@codeaurora.org (mailing list archive) |
---|---|
State | Not Applicable, archived |
On Thu, Nov 09, 2017 at 10:14:35AM -0500, Tyler Baicar wrote:
> On 11/9/2017 4:46 AM, Borislav Petkov wrote:
> > On Wed, Nov 08, 2017 at 12:13:12PM -0700, Tyler Baicar wrote:
> > > Currently the GHES code only calls into the AER driver for
> > > recoverable type errors. This is incorrect because errors of
> > > other severities do not get logged by the AER driver and do not
> > > get exposed to user space via the AER trace event. So, call
> > > into the AER driver for PCIe errors regardless of the severity
> > >
> > > Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
> > > ---
> > > drivers/acpi/apei/ghes.c | 8 +++-----
> > > 1 file changed, 3 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > > index 839c3d5..bb65fa6 100644
> > > --- a/drivers/acpi/apei/ghes.c
> > > +++ b/drivers/acpi/apei/ghes.c
> > > @@ -458,14 +458,12 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
> > > #endif
> > > }
> > Where did the explanatory comment go?
> >
> > +/*
> > + * PCIe AER errors need to be sent to the AER driver for reporting and
> > + * recovery. The GHES severities map to the following AER severities and
> > + * require the following handling:
> > + *
> > + * GHES_SEV_CORRECTABLE -> AER_CORRECTABLE
> > + *     These need to be reported by the AER driver but no recovery is
> > + *     necessary.
> > + * GHES_SEV_RECOVERABLE -> AER_NONFATAL
> > + * GHES_SEV_RECOVERABLE && CPER_SEC_RESET -> AER_FATAL
> > + *     These both need to be reported and recovered from by the AER driver.
> > + * GHES_SEV_PANIC does not make it to this handling since the kernel must
> > + *     panic.
> > + */
> >
> > <--- ???
> Updated patch including the comment:

When you decide to do the reckless thing of pasting a patch into thunderbird
on *windoze*, first send it to yourself only and try applying it.
Because I see this:

[boris@pd: ~/kernel/linux> test-apply.sh /tmp/tbaicar.02
checking file drivers/acpi/apei/ghes.c
patch: **** malformed patch at line 64: @@ -519,7 +531,7 @@ static void ghes_do_proc(struct ghes *ghes,

Not good.

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 839c3d5..15dbf65 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -458,14 +458,26 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
 #endif
 }
 
-static void ghes_handle_aer(struct acpi_hest_generic_data *gdata, int sev, int sec_sev)
+/*
+ * PCIe AER errors need to be sent to the AER driver for reporting and
+ * recovery. The GHES severities map to the following AER severities and
+ * require the following handling:
+ *
+ * GHES_SEV_CORRECTABLE -> AER_CORRECTABLE
+ *     These need to be reported by the AER driver but no recovery is
+ *     necessary.
+ * GHES_SEV_RECOVERABLE -> AER_NONFATAL
+ * GHES_SEV_RECOVERABLE && CPER_SEC_RESET -> AER_FATAL
+ *     These both need to be reported and recovered from by the AER driver.
+ * GHES_SEV_PANIC does not make it to this handling since the kernel must
+ *     panic.
+ */
+static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
 {
 #ifdef CONFIG_ACPI_APEI_PCIEAER
 	struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
 
-	if (sev == GHES_SEV_RECOVERABLE &&
-	    sec_sev == GHES_SEV_RECOVERABLE &&
-	    pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
+	if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
 	    pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
 		unsigned int devfn;
 		int aer_severity;
@@ -519,7 +531,7 @@ static void ghes_do_proc(struct ghes *ghes,
 			ghes_handle_memory_failure(gdata, sev);
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
-			ghes_handle_aer(gdata, sev, sec_sev);
+			ghes_handle_aer(gdata);
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
 			struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
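
For readers following along, the severity mapping described in the comment can be sketched as a small standalone C function. This is only an illustration, not the kernel code: the demo_* enums, demo_map_severity() and demo_needs_recovery() are hypothetical names made up here, their numeric values are arbitrary, and reset_flag merely stands in for the CPER_SEC_RESET condition mentioned in the comment.

```c
/*
 * Illustrative stand-ins only. The real GHES and AER severity constants
 * are defined in kernel headers and their values may differ.
 */
enum demo_ghes_sev {
	DEMO_GHES_SEV_NO,
	DEMO_GHES_SEV_CORRECTABLE,
	DEMO_GHES_SEV_RECOVERABLE,
	DEMO_GHES_SEV_PANIC
};

enum demo_aer_sev {
	DEMO_AER_NONE,		/* nothing to hand to the AER driver */
	DEMO_AER_CORRECTABLE,
	DEMO_AER_NONFATAL,
	DEMO_AER_FATAL
};

/*
 * Map a GHES-style severity to an AER-style severity, following the
 * comment in the patch. reset_flag is a hypothetical parameter standing
 * in for the CPER_SEC_RESET condition.
 */
enum demo_aer_sev demo_map_severity(enum demo_ghes_sev sev, int reset_flag)
{
	switch (sev) {
	case DEMO_GHES_SEV_CORRECTABLE:
		/* Reported by the AER driver, no recovery necessary. */
		return DEMO_AER_CORRECTABLE;
	case DEMO_GHES_SEV_RECOVERABLE:
		/* Reported and recovered; a reset request makes it fatal. */
		return reset_flag ? DEMO_AER_FATAL : DEMO_AER_NONFATAL;
	default:
		/* Panic severity never reaches this handling. */
		return DEMO_AER_NONE;
	}
}

/* Returns 1 when the mapped severity calls for recovery, 0 otherwise. */
int demo_needs_recovery(enum demo_aer_sev aer)
{
	return aer == DEMO_AER_NONFATAL || aer == DEMO_AER_FATAL;
}
```

With these stand-ins, demo_map_severity(DEMO_GHES_SEV_RECOVERABLE, 1) yields DEMO_AER_FATAL, while a correctable error maps to DEMO_AER_CORRECTABLE and needs no recovery.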