[v5,3/3] ACPI, APEI, EINJ: Update EINJ documentation

Message ID 20230925200127.504256-4-Benjamin.Cheatham@amd.com
State Superseded, archived
Series CXL, ACPI, APEI, EINJ: Update EINJ for CXL 1.1 error types

Commit Message

Ben Cheatham Sept. 25, 2023, 8:01 p.m. UTC
Update EINJ documentation to include CXL errors in available_error_types
table and usage of the types.

Also fix a formatting error in the param4 file description that caused
the description to be on the same line as the bullet point.

Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 .../firmware-guide/acpi/apei/einj.rst         | 25 ++++++++++++++++---
 1 file changed, 21 insertions(+), 4 deletions(-)
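
The values this patch adds to the available_error_types table are single-bit masks, so the file can be checked for CXL support before an injection is attempted. What follows is a minimal Python sketch, assuming the file lists one "value description" pair per line, as the documentation describes. The names `EINJ_CXL_TYPES`, `parse_available_error_types` and `cxl_types_available` are hypothetical helpers for illustration, not part of the kernel or of this patch.

```python
# CXL error type bits, copied from the table rows added by this patch.
EINJ_CXL_TYPES = {
    0x00001000: "CXL.cache Protocol Correctable",
    0x00002000: "CXL.cache Protocol Uncorrectable non-fatal",
    0x00004000: "CXL.cache Protocol Uncorrectable fatal",
    0x00008000: "CXL.mem Protocol Correctable",
    0x00010000: "CXL.mem Protocol Uncorrectable non-fatal",
    0x00020000: "CXL.mem Protocol Uncorrectable fatal",
}


def parse_available_error_types(text):
    """Parse 'value<whitespace>description' lines into {value: description}.

    Assumption: each line starts with a hexadecimal value, as in the
    documented table. Lines that do not start with one are skipped.
    """
    types = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue
        try:
            value = int(parts[0], 16)
        except ValueError:
            continue
        types[value] = parts[1] if len(parts) > 1 else ""
    return types


def cxl_types_available(available):
    """Return the sorted CXL error type values found in a parsed table."""
    return sorted(value for value in available if value in EINJ_CXL_TYPES)
```

On a real system the input would be the text read from available_error_types under the EINJ debugfs directory. An empty result from `cxl_types_available` suggests that the platform firmware does not advertise any CXL error type.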

Comments

Jonathan Cameron Sept. 26, 2023, 11:05 a.m. UTC | #1
On Mon, 25 Sep 2023 15:01:27 -0500
Ben Cheatham <Benjamin.Cheatham@amd.com> wrote:

> Update EINJ documentation to include CXL errors in available_error_types
> table and usage of the types.
> 
> Also fix a formatting error in the param4 file description that caused
> the description to be on the same line as the bullet point.
> 
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
A trivial comment inline.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  .../firmware-guide/acpi/apei/einj.rst         | 25 ++++++++++++++++---
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
> index d6b61d22f525..c6f28118c48b 100644
> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
> @@ -32,6 +32,9 @@ configuration::
>    CONFIG_ACPI_APEI
>    CONFIG_ACPI_APEI_EINJ
>  
> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
> +
>  The EINJ user interface is in <debugfs mount point>/apei/einj.
>  
>  The following files belong to it:
> @@ -40,9 +43,9 @@ The following files belong to it:
>  
>    This file shows which error types are supported:
>  
> -  ================  ===================================
> +  ================  =========================================
>    Error Type Value	Error Description
> -  ================  ===================================
> +  ================  =========================================
>    0x00000001        Processor Correctable
>    0x00000002        Processor Uncorrectable non-fatal
>    0x00000004        Processor Uncorrectable fatal
> @@ -55,7 +58,13 @@ The following files belong to it:
>    0x00000200        Platform Correctable
>    0x00000400        Platform Uncorrectable non-fatal
>    0x00000800        Platform Uncorrectable fatal
> -  ================  ===================================
> +  0x00001000        CXL.cache Protocol Correctable
> +  0x00002000        CXL.cache Protocol Uncorrectable non-fatal
> +  0x00004000        CXL.cache Protocol Uncorrectable fatal
> +  0x00008000        CXL.mem Protocol Correctable
> +  0x00010000        CXL.mem Protocol Uncorrectable non-fatal
> +  0x00020000        CXL.mem Protocol Uncorrectable fatal
> +  ================  =========================================
>  
>    The format of the file contents are as above, except present are only
>    the available error types.
> @@ -106,6 +115,7 @@ The following files belong to it:
>    Used when the 0x1 bit is set in "flags" to specify the APIC id
>  
>  - param4
> +

Unrelated change. Probably reasonable, but it should really be a separate patch.

>    Used when the 0x4 bit is set in "flags" to specify target PCIe device
>  
>  - notrigger
> @@ -159,6 +169,13 @@ and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
>  documentation for details (and expect changes to this API if vendors
>  creativity in using this feature expands beyond our expectations).
>  
> +CXL error types are supported from ACPI 6.5 onwards. To use these error
> +types you need the MMIO address of a CXL 1.1 downstream port. You can
> +find the address of dportY in /sys/bus/cxl/devices/portX/dportY/cxl_rcrb_addr
> +(it's possible that the dport is under the CXL root, in that case the
> +path would be /sys/bus/cxl/devices/rootX/dportY/cxl_rcrb_addr).
> +From there, write the address to param1 and continue as you would for a
> +memory error type.
>  
>  An error injection example::
>  
> @@ -201,4 +218,4 @@ The following sequence can be used:
>    7) Read from the virtual address. This will trigger the error
>  
>  For more information about EINJ, please refer to ACPI specification
> -version 4.0, section 17.5 and ACPI 5.0, section 18.6.
> +version 4.0, section 17.5 and ACPI 6.5, section 18.6.
Ben Cheatham Sept. 26, 2023, 4 p.m. UTC | #2
On 9/26/23 6:05 AM, Jonathan Cameron wrote:
> On Mon, 25 Sep 2023 15:01:27 -0500
> Ben Cheatham <Benjamin.Cheatham@amd.com> wrote:
> 
>> Update EINJ documentation to include CXL errors in available_error_types
>> table and usage of the types.
>>
>> Also fix a formatting error in the param4 file description that caused
>> the description to be on the same line as the bullet point.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> A trivial comment inline.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
>> ---
>>  .../firmware-guide/acpi/apei/einj.rst         | 25 ++++++++++++++++---
>>  1 file changed, 21 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>> index d6b61d22f525..c6f28118c48b 100644
>> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
>> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>> @@ -32,6 +32,9 @@ configuration::
>>    CONFIG_ACPI_APEI
>>    CONFIG_ACPI_APEI_EINJ
>>  
>> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
>> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
>> +
>>  The EINJ user interface is in <debugfs mount point>/apei/einj.
>>  
>>  The following files belong to it:
>> @@ -40,9 +43,9 @@ The following files belong to it:
>>  
>>    This file shows which error types are supported:
>>  
>> -  ================  ===================================
>> +  ================  =========================================
>>    Error Type Value	Error Description
>> -  ================  ===================================
>> +  ================  =========================================
>>    0x00000001        Processor Correctable
>>    0x00000002        Processor Uncorrectable non-fatal
>>    0x00000004        Processor Uncorrectable fatal
>> @@ -55,7 +58,13 @@ The following files belong to it:
>>    0x00000200        Platform Correctable
>>    0x00000400        Platform Uncorrectable non-fatal
>>    0x00000800        Platform Uncorrectable fatal
>> -  ================  ===================================
>> +  0x00001000        CXL.cache Protocol Correctable
>> +  0x00002000        CXL.cache Protocol Uncorrectable non-fatal
>> +  0x00004000        CXL.cache Protocol Uncorrectable fatal
>> +  0x00008000        CXL.mem Protocol Correctable
>> +  0x00010000        CXL.mem Protocol Uncorrectable non-fatal
>> +  0x00020000        CXL.mem Protocol Uncorrectable fatal
>> +  ================  =========================================
>>  
>>    The format of the file contents are as above, except present are only
>>    the available error types.
>> @@ -106,6 +115,7 @@ The following files belong to it:
>>    Used when the 0x1 bit is set in "flags" to specify the APIC id
>>  
>>  - param4
>> +
> 
> Unrelated change. Probably reasonable, but it should really be a separate patch.
> 

I'll take that out.

Thanks,
Ben

>>    Used when the 0x4 bit is set in "flags" to specify target PCIe device
>>  
>>  - notrigger
>> @@ -159,6 +169,13 @@ and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
>>  documentation for details (and expect changes to this API if vendors
>>  creativity in using this feature expands beyond our expectations).
>>  
>> +CXL error types are supported from ACPI 6.5 onwards. To use these error
>> +types you need the MMIO address of a CXL 1.1 downstream port. You can
>> +find the address of dportY in /sys/bus/cxl/devices/portX/dportY/cxl_rcrb_addr
>> +(it's possible that the dport is under the CXL root, in that case the
>> +path would be /sys/bus/cxl/devices/rootX/dportY/cxl_rcrb_addr).
>> +From there, write the address to param1 and continue as you would for a
>> +memory error type.
>>  
>>  An error injection example::
>>  
>> @@ -201,4 +218,4 @@ The following sequence can be used:
>>    7) Read from the virtual address. This will trigger the error
>>  
>>  For more information about EINJ, please refer to ACPI specification
>> -version 4.0, section 17.5 and ACPI 5.0, section 18.6.
>> +version 4.0, section 17.5 and ACPI 6.5, section 18.6.
>
Bjorn Helgaas Sept. 26, 2023, 8:24 p.m. UTC | #3
On Mon, Sep 25, 2023 at 03:01:27PM -0500, Ben Cheatham wrote:
> Update EINJ documentation to include CXL errors in available_error_types
> table and usage of the types.
> 
> Also fix a formatting error in the param4 file description that caused
> the description to be on the same line as the bullet point.
> 
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> ---
>  .../firmware-guide/acpi/apei/einj.rst         | 25 ++++++++++++++++---
>  1 file changed, 21 insertions(+), 4 deletions(-)

I always feel like the documentation update should be in the same
patch as the new functionality so it's easy to match up with the code
and keep things together when backporting.  But I know that sentiment
is not universal and maybe there's good reason to keep them separate.

> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
> index d6b61d22f525..c6f28118c48b 100644
> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
> @@ -32,6 +32,9 @@ configuration::
>    CONFIG_ACPI_APEI
>    CONFIG_ACPI_APEI_EINJ
>  
> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
> ...
Ben Cheatham Sept. 27, 2023, 3:31 p.m. UTC | #4
On 9/26/23 3:24 PM, Bjorn Helgaas wrote:
> On Mon, Sep 25, 2023 at 03:01:27PM -0500, Ben Cheatham wrote:
>> Update EINJ documentation to include CXL errors in available_error_types
>> table and usage of the types.
>>
>> Also fix a formatting error in the param4 file description that caused
>> the description to be on the same line as the bullet point.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
>> ---
>>  .../firmware-guide/acpi/apei/einj.rst         | 25 ++++++++++++++++---
>>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> I always feel like the documentation update should be in the same
> patch as the new functionality so it's easy to match up with the code
> and keep things together when backporting.  But I know that sentiment
> is not universal and maybe there's good reason to keep them separate.
> 

I put it into a separate patch since the documentation change was substantial,
but if it gets shorter in v6 I'll put it into the previous patch.

Thanks,
Ben

>> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>> index d6b61d22f525..c6f28118c48b 100644
>> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
>> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>> @@ -32,6 +32,9 @@ configuration::
>>    CONFIG_ACPI_APEI
>>    CONFIG_ACPI_APEI_EINJ
>>  
>> +To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
>> +value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
>> ...
Patch

diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
index d6b61d22f525..c6f28118c48b 100644
--- a/Documentation/firmware-guide/acpi/apei/einj.rst
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -32,6 +32,9 @@  configuration::
   CONFIG_ACPI_APEI
   CONFIG_ACPI_APEI_EINJ
 
+To use CXL error types ``CONFIG_CXL_ACPI`` needs to be set to the same
+value as ``CONFIG_ACPI_APEI_EINJ`` (either "y" or "m").
+
 The EINJ user interface is in <debugfs mount point>/apei/einj.
 
 The following files belong to it:
@@ -40,9 +43,9 @@  The following files belong to it:
 
   This file shows which error types are supported:
 
-  ================  ===================================
+  ================  =========================================
   Error Type Value	Error Description
-  ================  ===================================
+  ================  =========================================
   0x00000001        Processor Correctable
   0x00000002        Processor Uncorrectable non-fatal
   0x00000004        Processor Uncorrectable fatal
@@ -55,7 +58,13 @@  The following files belong to it:
   0x00000200        Platform Correctable
   0x00000400        Platform Uncorrectable non-fatal
   0x00000800        Platform Uncorrectable fatal
-  ================  ===================================
+  0x00001000        CXL.cache Protocol Correctable
+  0x00002000        CXL.cache Protocol Uncorrectable non-fatal
+  0x00004000        CXL.cache Protocol Uncorrectable fatal
+  0x00008000        CXL.mem Protocol Correctable
+  0x00010000        CXL.mem Protocol Uncorrectable non-fatal
+  0x00020000        CXL.mem Protocol Uncorrectable fatal
+  ================  =========================================
 
   The format of the file contents are as above, except present are only
   the available error types.
@@ -106,6 +115,7 @@  The following files belong to it:
   Used when the 0x1 bit is set in "flags" to specify the APIC id
 
 - param4
+
   Used when the 0x4 bit is set in "flags" to specify target PCIe device
 
 - notrigger
@@ -159,6 +169,13 @@  and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor
 documentation for details (and expect changes to this API if vendors
 creativity in using this feature expands beyond our expectations).
 
+CXL error types are supported from ACPI 6.5 onwards. To use these error
+types you need the MMIO address of a CXL 1.1 downstream port. You can
+find the address of dportY in /sys/bus/cxl/devices/portX/dportY/cxl_rcrb_addr
+(it's possible that the dport is under the CXL root, in that case the
+path would be /sys/bus/cxl/devices/rootX/dportY/cxl_rcrb_addr).
+From there, write the address to param1 and continue as you would for a
+memory error type.
 
 An error injection example::
 
@@ -201,4 +218,4 @@  The following sequence can be used:
   7) Read from the virtual address. This will trigger the error
 
 For more information about EINJ, please refer to ACPI specification
-version 4.0, section 17.5 and ACPI 5.0, section 18.6.
+version 4.0, section 17.5 and ACPI 6.5, section 18.6.
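
The CXL usage paragraph added in this patch (read the dport's cxl_rcrb_addr, write it to param1, then proceed as for a memory error) can be sketched in Python as below. This is an illustration under stated assumptions, not the author's procedure. It assumes that debugfs is mounted at /sys/kernel/debug, that cxl_rcrb_addr reads back as a hexadecimal string, and that no param2 or flags setup is needed; the patch text does not say whether it is. All function names are hypothetical.

```python
from pathlib import Path

# Assumption: debugfs is mounted at /sys/kernel/debug.
EINJ_DIR = Path("/sys/kernel/debug/apei/einj")


def parse_rcrb_addr(text):
    """Convert the contents of cxl_rcrb_addr to an integer.

    Assumption: the attribute holds a hexadecimal number, with or
    without a leading 0x.
    """
    return int(text.strip(), 16)


def read_rcrb_addr(dport_dir):
    """Read the RCRB MMIO address of a dport, for example from
    /sys/bus/cxl/devices/portX/dportY (X and Y are placeholders)."""
    return parse_rcrb_addr((Path(dport_dir) / "cxl_rcrb_addr").read_text())


def plan_cxl_injection(rcrb_addr, error_type, einj_dir=EINJ_DIR):
    """Return the ordered (file, value) writes for one CXL injection."""
    einj_dir = Path(einj_dir)
    return [
        (einj_dir / "param1", f"{rcrb_addr:#x}"),       # dport MMIO address
        (einj_dir / "error_type", f"{error_type:#x}"),  # e.g. 0x8000
        (einj_dir / "error_inject", "1"),               # trigger injection
    ]


def apply_plan(plan):
    """Perform the writes. Needs root and injects a real error."""
    for path, value in plan:
        path.write_text(value + "\n")
```

Keeping the plan separate from `apply_plan` lets the sequence be printed and reviewed before anything is written to the EINJ files, which seems prudent for an interface that injects real hardware errors.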