diff mbox series

[1/2] ACPI: extlog: Trace CPER Non-standard Section Body

Message ID 20240527144356.246220-2-fabio.m.de.francesco@linux.intel.com (mailing list archive)
State New
Delegated to: Bjorn Helgaas
Headers show
Series Make ELOG log and trace consistently with GHES | expand

Commit Message

Fabio M. De Francesco May 27, 2024, 2:43 p.m. UTC
In extlog_print(), trace "Non-standard Section Body" reported by firmware
to the OS via Common Platform Error Record (CPER) (UEFI v2.10 Appendix N
2.3) to add further debug information and so to make ELOG log
consistently with ghes_do_proc() (GHES).

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/acpi/acpi_extlog.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Dan Williams Aug. 6, 2024, 7:21 p.m. UTC | #1
Fabio M. De Francesco wrote:
> In extlog_print(), trace "Non-standard Section Body" reported by firmware
> to the OS via Common Platform Error Record (CPER) (UEFI v2.10 Appendix N
> 2.3) to add further debug information and so to make ELOG log
> consistently with ghes_do_proc() (GHES).

I think this description could be clearer, how about:

---

ghes_do_proc() has a catch-all for unknown or unhandled CPER formats
(UEFI v2.10 Appendix N 2.3), extlog_print() does not. This gap was
noticed by a RAS test that injected CXL protocol errors which were
notified to extlog_print() via the IOMCA (I/O Machine Check
Architecture) mechanism. Bring parity to the extlog_print() path by
including a similar trace_non_standard_event().

---

> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/acpi/acpi_extlog.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index f055609d4b64..e025ae390737 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -179,6 +179,12 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>  			if (gdata->error_data_length >= sizeof(*mem))
>  				trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
>  						       (u8)gdata->error_severity);
> +		} else {
> +			void *err = acpi_hest_get_payload(gdata);
> +
> +			trace_non_standard_event(sec_type, fru_id, fru_text,
> +						 gdata->error_severity, err,
> +						 gdata->error_data_length);
>  		}

...with the above changelog update the code change looks good to me, you
can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Bjorn Helgaas Aug. 6, 2024, 9:07 p.m. UTC | #2
On Mon, May 27, 2024 at 04:43:40PM +0200, Fabio M. De Francesco wrote:
> In extlog_print(), trace "Non-standard Section Body" reported by firmware
> to the OS via Common Platform Error Record (CPER) (UEFI v2.10 Appendix N
> 2.3) to add further debug information and so to make ELOG log
> consistently with ghes_do_proc() (GHES).
> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/acpi/acpi_extlog.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index f055609d4b64..e025ae390737 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -179,6 +179,12 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>  			if (gdata->error_data_length >= sizeof(*mem))
>  				trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
>  						       (u8)gdata->error_severity);
> +		} else {
> +			void *err = acpi_hest_get_payload(gdata);
> +
> +			trace_non_standard_event(sec_type, fru_id, fru_text,
> +						 gdata->error_severity, err,
> +						 gdata->error_data_length);

Kudos for making these two paths more similar.

Not specific to *this* patch, but it's annoying to try to find
tracepoint implementations.  I guess it's
TRACE_EVENT(non_standard_event, ...) in include/ras/ras_event.h.

This has the same prototype as log_non_standard_event(), so
could extlog_print() be made a little bit more like ghes_do_proc() by
using log_non_standard_event() instead of trace_non_standard_event()
directly?

Bjorn
diff mbox series

Patch

diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index f055609d4b64..e025ae390737 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -179,6 +179,12 @@  static int extlog_print(struct notifier_block *nb, unsigned long val,
 			if (gdata->error_data_length >= sizeof(*mem))
 				trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
 						       (u8)gdata->error_severity);
+		} else {
+			void *err = acpi_hest_get_payload(gdata);
+
+			trace_non_standard_event(sec_type, fru_id, fru_text,
+						 gdata->error_severity, err,
+						 gdata->error_data_length);
 		}
 	}