Message ID | 20240214200709.777166-4-Benjamin.Cheatham@amd.com (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Series | cxl, EINJ: Update EINJ for CXL error types |
On Wed, 14 Feb 2024, Ben Cheatham wrote:

>Update EINJ kernel document to include how to inject CXL protocol error
>types, build the kernel to include CXL error types, and give an example
>injection.
>
>Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>

Would vote for folding into 2/3, but otherwise looks good with a minor
suggestion.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

>---
> .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
>diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>index d6b61d22f525..f179adf7b61c 100644
>--- a/Documentation/firmware-guide/acpi/apei/einj.rst
>+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>@@ -181,6 +181,25 @@ You should see something like this in dmesg::
>   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
>   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
>
>+CXL error types are supported from ACPI 6.5 onwards. These error types
    ^ and target a CXL Port

>+are not available in the legacy interface at /sys/kernel/debug/apei/einj,
>+and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
>+called "einj_type" that is analogous to available_error_type under debug/cxl.
>+There is also a "einj_inject" file in each $dport_dev directory under debug/cxl
>+that will inject a given error into the dport represented by $dport_dev.
>+For example, to inject a CXL.mem protocol correctable error into
>+$dport_dev=pci0000:0c::
>+
>+    # cd /sys/kernel/debug/cxl/
>+    # cat einj_type # See which error can be injected
>+    0x00008000 CXL.mem Protocol Correctable
>+    0x00010000 CXL.mem Protocol Uncorrectable non-fatal
>+    0x00020000 CXL.mem Protocol Uncorrectable fatal
>+    # cd 0000:e0:01.1 # Navigate to dport to inject into
>+    # echo 0x8000 > einj_inject # Inject error
>+
>+To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
>+
> Special notes for injection into SGX enclaves:
>
> There may be a separate BIOS setup option to enable SGX injection.
>--
>2.34.1
>
Thanks for taking a look David!

On 2/20/24 1:02 PM, Davidlohr Bueso wrote:
> On Wed, 14 Feb 2024, Ben Cheatham wrote:
>
>> Update EINJ kernel document to include how to inject CXL protocol error
>> types, build the kernel to include CXL error types, and give an example
>> injection.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
>
> Would vote for folding into 2/3, but otherwise looks good with a minor
> suggestion.
>

I would, but I think 2/3 is already pretty large and this is more
digestible to me. I've also reworked a large portion of that patch for
v13 so it's probably better to keep it smaller.

> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
>
>> ---
>> .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++
>> 1 file changed, 19 insertions(+)
>>
>> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>> index d6b61d22f525..f179adf7b61c 100644
>> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
>> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>> @@ -181,6 +181,25 @@ You should see something like this in dmesg::
>>    [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
>>    [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
>>
>> +CXL error types are supported from ACPI 6.5 onwards. These error types
>     ^ and target a CXL Port
>

Will add.

Thanks,
Ben

>> +are not available in the legacy interface at /sys/kernel/debug/apei/einj,
>> +and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
>> +called "einj_type" that is analogous to available_error_type under debug/cxl.
>> +There is also a "einj_inject" file in each $dport_dev directory under debug/cxl
>> +that will inject a given error into the dport represented by $dport_dev.
>> +For example, to inject a CXL.mem protocol correctable error into
>> +$dport_dev=pci0000:0c::
>> +
>> +    # cd /sys/kernel/debug/cxl/
>> +    # cat einj_type # See which error can be injected
>> +    0x00008000 CXL.mem Protocol Correctable
>> +    0x00010000 CXL.mem Protocol Uncorrectable non-fatal
>> +    0x00020000 CXL.mem Protocol Uncorrectable fatal
>> +    # cd 0000:e0:01.1 # Navigate to dport to inject into
>> +    # echo 0x8000 > einj_inject # Inject error
>> +
>> +To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
>> +
>> Special notes for injection into SGX enclaves:
>>
>> There may be a separate BIOS setup option to enable SGX injection.
>> --
>> 2.34.1
>>
diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
index d6b61d22f525..f179adf7b61c 100644
--- a/Documentation/firmware-guide/acpi/apei/einj.rst
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -181,6 +181,25 @@ You should see something like this in dmesg::
   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
 
+CXL error types are supported from ACPI 6.5 onwards. These error types
+are not available in the legacy interface at /sys/kernel/debug/apei/einj,
+and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
+called "einj_type" that is analogous to available_error_type under debug/cxl.
+There is also a "einj_inject" file in each $dport_dev directory under debug/cxl
+that will inject a given error into the dport represented by $dport_dev.
+For example, to inject a CXL.mem protocol correctable error into
+$dport_dev=pci0000:0c::
+
+    # cd /sys/kernel/debug/cxl/
+    # cat einj_type # See which error can be injected
+    0x00008000 CXL.mem Protocol Correctable
+    0x00010000 CXL.mem Protocol Uncorrectable non-fatal
+    0x00020000 CXL.mem Protocol Uncorrectable fatal
+    # cd 0000:e0:01.1 # Navigate to dport to inject into
+    # echo 0x8000 > einj_inject # Inject error
+
+To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
+
 Special notes for injection into SGX enclaves:
 
 There may be a separate BIOS setup option to enable SGX injection.
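
The manual steps in the documented example (check einj_type, change into the dport directory, write the type to einj_inject) could be wrapped in a small shell helper. The following is only a sketch under stated assumptions: the function name cxl_einj_inject and its optional third argument are hypothetical and not part of the patch, and it assumes that each line of einj_type begins with a hex value, as in the example output above. The file names and the default path are the ones given in the patch text.

```shell
# Hypothetical helper, not part of the patch.
# Usage: cxl_einj_inject <dport_dev> <error_type> [cxl_debugfs_dir]
cxl_einj_inject() {
    dport=$1
    etype=$2
    base=${3:-/sys/kernel/debug/cxl}   # default location named in the patch

    # Only accept hex values such as 0x8000.
    case $etype in
        0x[0-9a-fA-F]*) ;;
        *) echo "error type must be a hex value like 0x8000" >&2; return 1 ;;
    esac

    # Both files are described in the patch; they are absent when the
    # kernel was built without CONFIG_CXL_EINJ.
    if [ ! -r "$base/einj_type" ]; then
        echo "einj_type not found under $base" >&2
        return 1
    fi
    if [ ! -e "$base/$dport/einj_inject" ]; then
        echo "no einj_inject file for dport $dport" >&2
        return 1
    fi

    # Compare numerically so that 0x8000 matches the listed 0x00008000.
    want=$(($etype))
    found=0
    while read -r value rest; do
        case $value in
            0x*) [ "$(($value))" -eq "$want" ] && found=1 ;;
        esac
    done < "$base/einj_type"

    if [ "$found" -ne 1 ]; then
        echo "error type $etype is not listed in einj_type" >&2
        return 1
    fi

    # Same write as the documented "echo 0x8000 > einj_inject" step.
    echo "$etype" > "$base/$dport/einj_inject"
}
```

With debugfs mounted and run as root, `cxl_einj_inject 0000:e0:01.1 0x8000` would correspond to the last two commands of the example. The optional directory argument exists only so the helper can be exercised against a scratch directory instead of real hardware.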