mbox series

[v3,0/7] acpi: Error Record Serialization Table, ERST, support for QEMU

Message ID 1622225659-16847-1-git-send-email-eric.devolder@oracle.com (mailing list archive)
Headers show
Series acpi: Error Record Serialization Table, ERST, support for QEMU | expand

Message

Eric DeVolder May 28, 2021, 6:14 p.m. UTC
NOTE: Also, I wanted to push this v3 for review while alerting
that I will be on holiday through June 8 (possibly a few days
later).

NOTE: The patches are arranged such that each can be applied
in order and not break the build (except the 0001 patch). Igor
has hinted at changing this, but I'm unsure how else to
re-arrange these patches accordingly.

NOTE: With the conversion to TYPE_MEMORY_BACKEND_FILE, live
migration to a completely different host does not behave 
properly (it loses the ERST contents because the file is not
present on the new host). This still needs to be worked out.
Other than live migration, this patchset fully works.

This patchset introduces support for the ACPI Error Record
Serialization Table, ERST.

Linux uses the persistent storage filesystem, pstore, to record
information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
independent of, and runs before, kdump.  In certain scenarios (ie.
hosts/guests with root filesystems on NFS/iSCSI where networking
software and/or hardware fails), pstore may contain the only
information available for post-mortem debugging.

Two common storage backends for the pstore filesystem are ACPI ERST
and UEFI. Most BIOS implement ACPI ERST; however, ACPI ERST is not
currently supported in QEMU, and UEFI is not utilized in all guests.
By implementing ACPI ERST within QEMU, then the ACPI ERST becomes a
viable pstore storage backend for virtual machines (as it is now for
bare metal machines).

Enabling support for ACPI ERST facilitates a consistent method to
capture kernel panic information in a wide range of guests: from
resource-constrained microvms to very large guests, and in
particular, in direct-boot environments (which would lack UEFI
run-time services).

Note that Microsoft Windows also utilizes the ACPI ERST for certain
crash information, if available.

The ACPI ERST persistent storage is contained within a single backing
file, with a default size of 64KiB. The size and filename of the
backing file can be obtained from QEMU parameters.

The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
(APEI)", and specifically subsection "Error Serialization", outlines
a method for storing error records into persistent storage.

[1] "Advanced Configuration and Power Interface Specification",
    version 6.2, May 2017.
    https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf

[2] "Unified Extensible Firmware Interface Specification",
    version 2.8, March 2019.
    https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf

Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>

---
v3: 28may2021
 - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
   internal array with explicit file operations, per Igor.
 - Changed the way the qdev and base address are handled, allowing
   ERST to be disabled at run-time. Also aligns better with other
   existing code.

v2: 8feb2021
 - Added qtest/smoke test per Paolo Bonzini
 - Split patch into smaller chunks, per Igo Mammedov
 - Did away with use of ACPI packed structures, per Igo Mammedov

v1: 26oct2020
 - initial post

---
 hw/acpi/erst.c         | 909 +++++++++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build    |   1 +
 hw/i386/acpi-build.c   |   4 +
 include/hw/acpi/erst.h |  97 ++++++
 4 files changed, 1011 insertions(+)
 create mode 100644 hw/acpi/erst.c
 create mode 100644 include/hw/acpi/erst.h


Eric DeVolder (7):
  ACPI ERST: bios-tables-test.c steps 1 and 2
  ACPI ERST: header file for ERST
  ACPI ERST: support for ACPI ERST feature
  ACPI ERST: include ERST feature in build of ACPI support
  ACPI ERST: create ERST device for pc/x86 machines.
  ACPI ERST: qtest for ERST
  ACPI ERST: step 6 of bios-tables-test.c

 hw/acpi/erst.c               | 902 +++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build          |   1 +
 hw/i386/acpi-build.c         |   7 +
 hw/i386/pc.c                 |  31 ++
 include/hw/acpi/erst.h       |  82 ++++
 include/hw/i386/pc.h         |   1 +
 tests/data/acpi/microvm/ERST |   0
 tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
 tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
 tests/qtest/erst-test.c      | 106 +++++
 tests/qtest/meson.build      |   2 +
 11 files changed, 1132 insertions(+)
 create mode 100644 hw/acpi/erst.c
 create mode 100644 include/hw/acpi/erst.h
 create mode 100644 tests/data/acpi/microvm/ERST
 create mode 100644 tests/data/acpi/pc/ERST
 create mode 100644 tests/data/acpi/q35/ERST
 create mode 100644 tests/qtest/erst-test.c

Comments

Igor Mammedov June 7, 2021, 12:49 p.m. UTC | #1
On Fri, 28 May 2021 14:14:12 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> NOTE: Also, I wanted to push this v3 for review while alerting
> that I will be on holiday through June 8 (possibly a few days
> later).
this version addressed only the way the host storage is accessed
(and even that is only partially and needs more work to put into it)
The rest of the comments on v2 are still not addressed.

> NOTE: The patches are arranged such that each can be applied
> in order and not break the build (except the 0001 patch). Igor
> has hinted at changing this, but I'm unsure how else to
> re-arrange these patches accordingly.
as minimum, see suggestion for splitting #4 in 5/7

> NOTE: With the conversion to TYPE_MEMORY_BACKEND_FILE, live
> migration to a completely different host does not behave 
> properly (it loses the ERST contents because the file is not
> present on the new host). This still needs to be worked out.
> Other than live migration, this patchset fully works.

see: vmstate_register_ram_global()

> This patchset introduces support for the ACPI Error Record
> Serialization Table, ERST.
> 
> Linux uses the persistent storage filesystem, pstore, to record
> information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
> independent of, and runs before, kdump.  In certain scenarios (ie.
> hosts/guests with root filesystems on NFS/iSCSI where networking
> software and/or hardware fails), pstore may contain the only
> information available for post-mortem debugging.
> 
> Two common storage backends for the pstore filesystem are ACPI ERST
> and UEFI. Most BIOS implement ACPI ERST; however, ACPI ERST is not
> currently supported in QEMU, and UEFI is not utilized in all guests.
> By implementing ACPI ERST within QEMU, then the ACPI ERST becomes a
> viable pstore storage backend for virtual machines (as it is now for
> bare metal machines).
> 
> Enabling support for ACPI ERST facilitates a consistent method to
> capture kernel panic information in a wide range of guests: from
> resource-constrained microvms to very large guests, and in
> particular, in direct-boot environments (which would lack UEFI
> run-time services).
> 
> Note that Microsoft Windows also utilizes the ACPI ERST for certain
> crash information, if available.
> 
> The ACPI ERST persistent storage is contained within a single backing
> file, with a default size of 64KiB. The size and filename of the
> backing file can be obtained from QEMU parameters.
> 
> The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
> (APEI)", and specifically subsection "Error Serialization", outlines
> a method for storing error records into persistent storage.
> 
> [1] "Advanced Configuration and Power Interface Specification",
>     version 6.2, May 2017.
>     https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
> 
> [2] "Unified Extensible Firmware Interface Specification",
>     version 2.8, March 2019.
>     https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf
> 
> Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> 
> ---
> v3: 28may2021
>  - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
>    internal array with explicit file operations, per Igor.
good start but it's not complete yet.

>  - Changed the way the qdev and base address are handled, allowing
>    ERST to be disabled at run-time. Also aligns better with other
>    existing code.
it aligns with ancient code template and the way it used to plumb
into board (it's fine for pre-existing devices but not for new ones
(unless there is no other way )).
v2 had suggestions how to proceed (you asked some questions back then,
but result is not reflected in this series, which still has the old
code as it was in v2).

 
> v2: 8feb2021
>  - Added qtest/smoke test per Paolo Bonzini
>  - Split patch into smaller chunks, per Igo Mammedov
>  - Did away with use of ACPI packed structures, per Igo Mammedov
> 
> v1: 26oct2020
>  - initial post
> 
> ---
>  hw/acpi/erst.c         | 909 +++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/acpi/meson.build    |   1 +
>  hw/i386/acpi-build.c   |   4 +
>  include/hw/acpi/erst.h |  97 ++++++
>  4 files changed, 1011 insertions(+)
>  create mode 100644 hw/acpi/erst.c
>  create mode 100644 include/hw/acpi/erst.h
> 
> 
> Eric DeVolder (7):
>   ACPI ERST: bios-tables-test.c steps 1 and 2
>   ACPI ERST: header file for ERST
>   ACPI ERST: support for ACPI ERST feature
>   ACPI ERST: include ERST feature in build of ACPI support
>   ACPI ERST: create ERST device for pc/x86 machines.
>   ACPI ERST: qtest for ERST
>   ACPI ERST: step 6 of bios-tables-test.c
> 
>  hw/acpi/erst.c               | 902 +++++++++++++++++++++++++++++++++++++++++++
>  hw/acpi/meson.build          |   1 +
>  hw/i386/acpi-build.c         |   7 +
>  hw/i386/pc.c                 |  31 ++
>  include/hw/acpi/erst.h       |  82 ++++
>  include/hw/i386/pc.h         |   1 +
>  tests/data/acpi/microvm/ERST |   0
>  tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
>  tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
>  tests/qtest/erst-test.c      | 106 +++++
>  tests/qtest/meson.build      |   2 +
>  11 files changed, 1132 insertions(+)
>  create mode 100644 hw/acpi/erst.c
>  create mode 100644 include/hw/acpi/erst.h
>  create mode 100644 tests/data/acpi/microvm/ERST
>  create mode 100644 tests/data/acpi/pc/ERST
>  create mode 100644 tests/data/acpi/q35/ERST
>  create mode 100644 tests/qtest/erst-test.c
>
Eric DeVolder June 7, 2021, 9:03 p.m. UTC | #2
Igor,
Thanks for the information/feedback. I am working to implement all your suggestions; from my perspective, there were two big changes requested, and the use of hostmem-file was the first, and the conversion to PCI the second. V3 was the hostmem-file, and hopefully all changes then in v4.
Regards,
eric
Igor Mammedov June 14, 2021, 8:14 a.m. UTC | #3
On Mon, 7 Jun 2021 21:03:06 +0000
Eric DeVolder <eric.devolder@oracle.com> wrote:

> Igor,
> Thanks for the information/feedback. I am working to implement all your suggestions; from my perspective, there were two big changes requested, and the use of hostmem-file was the first, and the conversion to PCI the second. V3 was the hostmem-file, and hopefully all changes then in v4.

if series is work in progress and not ready for merging,
one should use RFC instead of PATCH tag


> Regards,
> eric
> 
> ________________________________
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Monday, June 7, 2021 7:49 AM
> To: Eric DeVolder <eric.devolder@oracle.com>
> Cc: qemu-devel@nongnu.org <qemu-devel@nongnu.org>; mst@redhat.com <mst@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Subject: Re: [PATCH v3 0/7] acpi: Error Record Serialization Table, ERST, support for QEMU
> 
> On Fri, 28 May 2021 14:14:12 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
> > NOTE: Also, I wanted to push this v3 for review while alerting
> > that I will be on holiday through June 8 (possibly a few days
> > later).  
> this version addressed only the way the host storage is accessed
> (and even that is only partially and needs more work to put into it)
> The rest of the comments on v2 are still not addressed.
> 
> > NOTE: The patches are arranged such that each can be applied
> > in order and not break the build (except the 0001 patch). Igor
> > has hinted at changing this, but I'm unsure how else to
> > re-arrange these patches accordingly.  
> as minimum, see suggestion for splitting #4 in 5/7
> 
> > NOTE: With the conversion to TYPE_MEMORY_BACKEND_FILE, live
> > migration to a completely different host does not behave
> > properly (it loses the ERST contents because the file is not
> > present on the new host). This still needs to be worked out.
> > Other than live migration, this patchset fully works.  
> 
> see: vmstate_register_ram_global()
> 
> > This patchset introduces support for the ACPI Error Record
> > Serialization Table, ERST.
> >
> > Linux uses the persistent storage filesystem, pstore, to record
> > information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
> > independent of, and runs before, kdump.  In certain scenarios (ie.
> > hosts/guests with root filesystems on NFS/iSCSI where networking
> > software and/or hardware fails), pstore may contain the only
> > information available for post-mortem debugging.
> >
> > Two common storage backends for the pstore filesystem are ACPI ERST
> > and UEFI. Most BIOS implement ACPI ERST; however, ACPI ERST is not
> > currently supported in QEMU, and UEFI is not utilized in all guests.
> > By implementing ACPI ERST within QEMU, then the ACPI ERST becomes a
> > viable pstore storage backend for virtual machines (as it is now for
> > bare metal machines).
> >
> > Enabling support for ACPI ERST facilitates a consistent method to
> > capture kernel panic information in a wide range of guests: from
> > resource-constrained microvms to very large guests, and in
> > particular, in direct-boot environments (which would lack UEFI
> > run-time services).
> >
> > Note that Microsoft Windows also utilizes the ACPI ERST for certain
> > crash information, if available.
> >
> > The ACPI ERST persistent storage is contained within a single backing
> > file, with a default size of 64KiB. The size and filename of the
> > backing file can be obtained from QEMU parameters.
> >
> > The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
> > (APEI)", and specifically subsection "Error Serialization", outlines
> > a method for storing error records into persistent storage.
> >
> > [1] "Advanced Configuration and Power Interface Specification",
> >     version 6.2, May 2017.
> >     https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
> >
> > [2] "Unified Extensible Firmware Interface Specification",
> >     version 2.8, March 2019.
> >     https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf
> >
> > Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
> > Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >
> > ---
> > v3: 28may2021
> >  - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
> >    internal array with explicit file operations, per Igor.  
> good start but it's not complete yet.
> 
> >  - Changed the way the qdev and base address are handled, allowing
> >    ERST to be disabled at run-time. Also aligns better with other
> >    existing code.  
> it aligns with ancient code template and the way it used to plumb
> into board (it's fine for pre-existing devices but not for new ones
> (unless there is no other way )).
> v2 had suggestions how to proceed (you asked some questions back then,
> but result is not reflected in this series, which still has the old
> code as it was in v2).
> 
> 
> > v2: 8feb2021
> >  - Added qtest/smoke test per Paolo Bonzini
> >  - Split patch into smaller chunks, per Igo Mammedov
> >  - Did away with use of ACPI packed structures, per Igo Mammedov
> >
> > v1: 26oct2020
> >  - initial post
> >
> > ---
> >  hw/acpi/erst.c         | 909 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/acpi/meson.build    |   1 +
> >  hw/i386/acpi-build.c   |   4 +
> >  include/hw/acpi/erst.h |  97 ++++++
> >  4 files changed, 1011 insertions(+)
> >  create mode 100644 hw/acpi/erst.c
> >  create mode 100644 include/hw/acpi/erst.h
> >
> >
> > Eric DeVolder (7):
> >   ACPI ERST: bios-tables-test.c steps 1 and 2
> >   ACPI ERST: header file for ERST
> >   ACPI ERST: support for ACPI ERST feature
> >   ACPI ERST: include ERST feature in build of ACPI support
> >   ACPI ERST: create ERST device for pc/x86 machines.
> >   ACPI ERST: qtest for ERST
> >   ACPI ERST: step 6 of bios-tables-test.c
> >
> >  hw/acpi/erst.c               | 902 +++++++++++++++++++++++++++++++++++++++++++
> >  hw/acpi/meson.build          |   1 +
> >  hw/i386/acpi-build.c         |   7 +
> >  hw/i386/pc.c                 |  31 ++
> >  include/hw/acpi/erst.h       |  82 ++++
> >  include/hw/i386/pc.h         |   1 +
> >  tests/data/acpi/microvm/ERST |   0
> >  tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
> >  tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
> >  tests/qtest/erst-test.c      | 106 +++++
> >  tests/qtest/meson.build      |   2 +
> >  11 files changed, 1132 insertions(+)
> >  create mode 100644 hw/acpi/erst.c
> >  create mode 100644 include/hw/acpi/erst.h
> >  create mode 100644 tests/data/acpi/microvm/ERST
> >  create mode 100644 tests/data/acpi/pc/ERST
> >  create mode 100644 tests/data/acpi/q35/ERST
> >  create mode 100644 tests/qtest/erst-test.c
> >  
>
Eric DeVolder June 14, 2021, 12:45 p.m. UTC | #4
Igor,
Thanks, I sent out V4 Friday; I believe it has all the changes in it that you have requested.
Regards,
eric