[0/4] efi/cxl-cper: Report CXL CPER events through tracing

Message ID 20240228-cxl-cper3-v1-0-6aa3f1343c6c@intel.com

Ira Weiny Feb. 29, 2024, 7:13 a.m. UTC
CXL Component Events, as defined by UEFI 2.10 Section N.2.14, wrap a
payload that is mostly a CXL event in an EFI Common Platform Error
Record (CPER).  If a device is configured for firmware-first, CXL event
records are not sent directly to the host.

The CXL sub-system is the only place that has device physical address
(DPA) to host physical address (HPA) translation information.  It also
already has event decoding/tracing.  Such translations are very useful
for users to determine which system issues may correspond to specific
hardware events.

The restructuring of the event data structures in 6.8 made sharing the
data between CPER/event logs more efficient.  Now re-wire the sending of
CPER records to the CXL sub-system.

In addition, provide a default RAS event in case the CXL module is not
loaded (i.e., no callback is registered).
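
The registration-with-fallback idea can be illustrated with a small
user-space sketch.  This is not the code from the series: every name
below (struct cper_event, register_cper_callback(), handle_cper_event(),
and so on) is hypothetical, and the locking that a real kernel
implementation needs around the callback pointer is omitted.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a CXL CPER event record. */
struct cper_event {
	int type;
	unsigned long long dpa;
};

typedef void (*cper_callback_t)(const struct cper_event *ev);

/* NULL while no consumer (the "CXL module" in this sketch) is registered. */
cper_callback_t registered_cb;

/* Counters so the behaviour of the sketch can be observed. */
int default_trace_count;
int cxl_trace_count;

/* Register a consumer; only one callback may be registered at a time. */
int register_cper_callback(cper_callback_t cb)
{
	if (registered_cb)
		return -1;
	registered_cb = cb;
	return 0;
}

/* Unregister; fails unless cb is the callback currently registered. */
int unregister_cper_callback(cper_callback_t cb)
{
	if (registered_cb != cb)
		return -1;
	registered_cb = NULL;
	return 0;
}

/* Fallback path: a generic event with no CXL-specific decoding. */
void default_ras_trace(const struct cper_event *ev)
{
	default_trace_count++;
	printf("default RAS event: type=%d dpa=0x%llx\n", ev->type, ev->dpa);
}

/* Stand-in for the richer tracing done by the registered consumer. */
void cxl_trace(const struct cper_event *ev)
{
	cxl_trace_count++;
	printf("CXL event: type=%d dpa=0x%llx\n", ev->type, ev->dpa);
}

/* Dispatch: use the registered callback if present, else the default. */
void handle_cper_event(const struct cper_event *ev)
{
	if (registered_cb)
		registered_cb(ev);
	else
		default_ras_trace(ev);
}
```

With no callback registered, handle_cper_event() takes the default
path; after register_cper_callback(cxl_trace) succeeds, the same event
is routed to the registered consumer instead.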

Series status/background
========================

Smita and Jonathan have been a great help with this series.  Once again
thank you.

Unfortunately, with all the churn surrounding the bug that Dan
Carpenter found, the maintainers were forced to revert this work.

Therefore, this is a whole new series based on what is in 6.8.

Testing
=======

I've hacked up a quick debugfs patch to facilitate easier testing.[1]

With this I have verified that the bug Dan Carpenter found is fixed.
However, the tp_printk bug Jonathan found remains.  The taking of the
device lock in the callback is required and the tp_printk issue is
unlikely to be fixed.  Fortunately, tp_printk is not widely used so it
is anticipated this will not be an issue.

No other locking issues were found with this test and locking debug
turned on.

[1] https://github.com/weiny2/linux-kernel/commit/6c540a23cb1194d67a9dcfefb702774a99afc3b1

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
Ira Weiny (4):
      cxl/event: Add missing include files
      acpi/ghes: Process CXL Component Events
      cxl/pci: Register for and process CPER events
      ras/events: Trace CXL CPER events even without the CXL stack loaded

 drivers/acpi/apei/ghes.c  | 130 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/pci.c         |  69 +++++++++++++++++++++++-
 include/linux/cxl-event.h |  21 ++++++++
 include/ras/ras_event.h   |  90 ++++++++++++++++++++++++++++++++
 4 files changed, 309 insertions(+), 1 deletion(-)
---
base-commit: daeacfa75d08954e1a5b71c36a8fbfcdd0b3fec9
change-id: 20240220-cxl-cper3-30e55279f936

Best regards,