Message ID | 1633626876-12115-2-git-send-email-eric.devolder@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Series | acpi: Error Record Serialization Table, ERST, support for QEMU |
On Thu, 7 Oct 2021, Eric DeVolder wrote:

> Information on the implementation of the ACPI ERST support.
>
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>

Acked-by: Ani Sinha <ani@anisinha.ca>

diff --git a/docs/specs/acpi_erst.rst b/docs/specs/acpi_erst.rst
new file mode 100644
index 0000000..a8a9d22
--- /dev/null
+++ b/docs/specs/acpi_erst.rst
@@ -0,0 +1,200 @@
+ACPI ERST DEVICE
+================
+
+The ACPI ERST device is utilized to support the ACPI Error Record
+Serialization Table, ERST, functionality. This feature is designed for
+storing error records in persistent storage for future reference
+and/or debugging.
+
+The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
+(APEI)", and specifically subsection "Error Serialization", outlines a
+method for storing error records into persistent storage.
+
+The format of error records is described in the UEFI specification[2],
+in Appendix N "Common Platform Error Record".
+
+While the ACPI specification allows for an NVRAM "mode" (see
+GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
+directly exposed for direct access by the OS/guest, this device
+implements the non-NVRAM "mode". This non-NVRAM "mode" is what is
+implemented by most BIOS (since flash memory requires programming
+operations in order to update its contents). Furthermore, as of the
+time of this writing, Linux only supports the non-NVRAM "mode".
+
+
+Background/Motivation
+---------------------
+
+Linux uses the persistent storage filesystem, pstore, to record
+information (e.g. dmesg tail) upon panics and shutdowns. Pstore is
+independent of, and runs before, kdump. In certain scenarios (i.e.
+hosts/guests with root filesystems on NFS/iSCSI where networking
+software and/or hardware fails, and thus kdump fails), pstore may
+contain information available for post-mortem debugging.
+
+Two common storage backends for the pstore filesystem are ACPI ERST
+and UEFI. Most BIOS implement ACPI ERST. UEFI is not utilized in all
+guests. With QEMU supporting ACPI ERST, it becomes a viable pstore
+storage backend for virtual machines (as it is now for bare metal
+machines).
+
+Enabling support for ACPI ERST facilitates a consistent method to
+capture kernel panic information in a wide range of guests: from
+resource-constrained microvms to very large guests, and in particular,
+in direct-boot environments (which would lack UEFI run-time services).
+
+Note that Microsoft Windows also utilizes the ACPI ERST for certain
+crash information, if available[3].
+
+
+Configuration|Usage
+-------------------
+
+To use ACPI ERST, a memory-backend-file object and acpi-erst device
+can be created, for example:
+
+    qemu ...
+    -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,size=0x10000,share=on \
+    -device acpi-erst,memdev=erstnvram
+
+For proper operation, the ACPI ERST device needs a memory-backend-file
+object with the following parameters:
+
+ - id: The id of the memory-backend-file object is used to associate
+   this memory with the acpi-erst device.
+ - size: The size of the ACPI ERST backing storage. This parameter is
+   required.
+ - mem-path: The location of the ACPI ERST backing storage file. This
+   parameter is also required.
+ - share: The share=on parameter is required so that updates to the
+   ERST backing store are written to the file.
+
+and ERST device:
+
+ - memdev: Is the object id of the memory-backend-file.
+ - record_size: Specifies the size of the records (or slots) in the
+   backend storage. Must be a power of two value greater than or
+   equal to 4096 (PAGE_SIZE).
+
+
+PCI Interface
+-------------
+
+The ERST device is a PCI device with two BARs, one for accessing the
+programming registers, and the other for accessing the record exchange
+buffer.
+
+BAR0 contains the programming interface consisting of ACTION and VALUE
+64-bit registers. All ERST actions/operations/side effects happen on
+the write to the ACTION, by design. Any data needed by the action must
+be placed into VALUE prior to writing ACTION. Reading the VALUE
+simply returns the register contents, which can be updated by a
+previous ACTION.
+
+BAR1 contains the 8KiB record exchange buffer, which is the
+implemented maximum record size.
+
+
+Backend Storage Format
+----------------------
+
+The backend storage is divided into fixed size "slots", 8KiB in
+length, with each slot storing a single record. Not all slots need to
+be occupied, and they need not be occupied in a contiguous fashion.
+The ability to clear/erase specific records allows for the formation
+of unoccupied slots.
+
+Slot 0 contains a backend storage header that identifies the contents
+as ERST and also facilitates efficient access to the records.
+Depending upon the size of the backend storage, additional slots will
+be designated to be a part of the slot 0 header. For example, at 8KiB,
+the slot 0 header can accommodate 1021 records. Thus a storage size
+of 8MiB (8KiB * 1024) requires an additional slot for use by the
+header. In this scenario, slot 0 and slot 1 form the backend storage
+header, and records can be stored starting at slot 2.
+
+Below is an example layout of the backend storage format (for storage
+size less than 8MiB). The size of the storage is a multiple of 8KiB,
+and contains N number of slots to store records. The example below
+shows two records (in CPER format) in the backend storage, while the
+remaining slots are empty/available.
+
+::
+
+ Slot   Record
+        <------------------ 8KiB -------------------->
+        +--------------------------------------------+
+    0   | storage header                             |
+        +--------------------------------------------+
+    1   | empty/available                            |
+        +--------------------------------------------+
+    2   | CPER                                       |
+        +--------------------------------------------+
+    3   | CPER                                       |
+        +--------------------------------------------+
+  ...   |                                            |
+        +--------------------------------------------+
+    N   | empty/available                            |
+        +--------------------------------------------+
+
+The storage header consists of some basic information and an array
+of CPER record_id's to efficiently access records in the backend
+storage.
+
+All fields in the header are stored in little endian format.
+
+::
+
+    +--------------------------------------------+
+    | magic                                      | 0x0000
+    +--------------------------------------------+
+    | record_offset       | record_size          | 0x0008
+    +--------------------------------------------+
+    | record_count        | reserved | version   | 0x0010
+    +--------------------------------------------+
+    | record_id[0]                               | 0x0018
+    +--------------------------------------------+
+    | record_id[1]                               | 0x0020
+    +--------------------------------------------+
+    | record_id[...]                             |
+    +--------------------------------------------+
+    | record_id[N]                               | 0x1FF8
+    +--------------------------------------------+
+
+The 'magic' field contains the value 0x524F545354535245.
+
+The 'record_size' field contains the value 0x2000, 8KiB.
+
+The 'record_offset' field points to the first record_id in the array,
+0x0018.
+
+The 'version' field contains 0x0100, the first version.
+
+The 'record_count' field contains the number of valid records in the
+backend storage.
+
+The 'record_id' array fields are the 64-bit record identifiers of the
+CPER record in the corresponding slot. Stated differently, the
+location of a CPER record_id in the record_id[] array provides the
+slot index for the corresponding record in the backend storage.
+
+Note that, for example, with a backend storage less than 8MiB, slot 0
+contains the header, so the record_id[0] will never contain a valid
+CPER record_id. Instead slot 1 is the first available slot and thus
+record_id[1] may contain a CPER.
+
+A 'record_id' of all 0s or all 1s indicates an invalid record (i.e. the
+slot is available).
+
+
+References
+----------
+
+[1] "Advanced Configuration and Power Interface Specification",
+    version 4.0, June 2009.
+
+[2] "Unified Extensible Firmware Interface Specification",
+    version 2.1, October 2008.
+
+[3] "Windows Hardware Error Architecture", specifically
+    "Error Record Persistence Mechanism".
Information on the implementation of the ACPI ERST support.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 docs/specs/acpi_erst.rst | 200 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 200 insertions(+)
 create mode 100644 docs/specs/acpi_erst.rst
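
As a reading aid for the "Backend Storage Format" section of the patch, the following is a minimal Python sketch of how a backing file laid out that way could be inspected offline. It is not QEMU code. The patch states the field offsets and that fields are little endian, but not the individual field widths, so the sketch assumes a 64-bit magic, 32-bit record_size and record_offset, 16-bit version and reserved, and a 32-bit record_count, with each diagram row read low-order field on the right. The names `header_slots`, `parse_header` and `build_example` are hypothetical helpers introduced only for illustration.

```python
import struct

# Value of the 'magic' field as given in the patch.
ERST_MAGIC = 0x524F545354535245

# ASSUMPTION: field widths and in-word ordering are inferred from the
# diagram offsets (0x0000, 0x0008, 0x0010, 0x0018), not stated in the patch.
# Order: magic, record_size, record_offset, version, reserved, record_count.
HEADER_FMT = "<QIIHHI"
HEADER_LEN = struct.calcsize(HEADER_FMT)  # fixed part ends where record_id[0] begins

INVALID_IDS = (0, 0xFFFFFFFFFFFFFFFF)  # all 0s or all 1s: slot is available


def header_slots(storage_size, record_size=0x2000):
    """Leading slots consumed by the header: fixed fields plus one
    64-bit record_id per slot in the storage, rounded up to whole slots."""
    total_slots = storage_size // record_size
    needed = HEADER_LEN + 8 * total_slots
    return -(-needed // record_size)  # ceiling division


def parse_header(blob):
    """Decode the header and return the occupied slots and their record ids."""
    (magic, record_size, record_offset,
     version, _reserved, record_count) = struct.unpack_from(HEADER_FMT, blob, 0)
    if magic != ERST_MAGIC:
        raise ValueError("magic mismatch: not an ERST backing store")
    total_slots = len(blob) // record_size
    ids = struct.unpack_from(f"<{total_slots}Q", blob, record_offset)
    # The position in record_id[] is the slot index of the record.
    occupied = {slot: rid for slot, rid in enumerate(ids)
                if rid not in INVALID_IDS}
    return {
        "record_size": record_size,
        "version": version,
        "record_count": record_count,
        "occupied": occupied,
    }


def build_example():
    """Hypothetical 64KiB store with records in slots 2 and 3,
    mirroring the example layout in the patch."""
    blob = bytearray(0x10000)
    struct.pack_into(HEADER_FMT, blob, 0,
                     ERST_MAGIC, 0x2000, 0x18, 0x0100, 0, 2)
    struct.pack_into("<Q", blob, 0x18 + 8 * 2, 0x1111)  # made-up record id
    struct.pack_into("<Q", blob, 0x18 + 8 * 3, 0x2222)  # made-up record id
    return bytes(blob)


if __name__ == "__main__":
    print(header_slots(8 * 1024 * 1024))  # slots needed by the header at 8MiB
    print(parse_header(build_example())["occupied"])
```

Under these assumptions, `header_slots` reproduces the arithmetic in the patch text: one 8KiB slot holds the fixed fields plus 1021 record ids, so an 8MiB store with 1024 slots needs a second header slot.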