[0/4] target/arm: Improvement on memory error handling

Message ID 20250214041635.608012-1-gshan@redhat.com (mailing list archive)

Gavin Shan Feb. 14, 2025, 4:16 a.m. UTC
Currently, there is only one CPER buffer (entry), meaning only one
memory error can be reported at a time. In extreme cases, multiple memory
errors can be raised on different vCPUs. For example, a single memory error
on a 64KB host page can result in 16 memory errors on 4KB guest pages.
Unfortunately, the virtual machine is simply aborted when multiple memory
errors occur concurrently, as the following call trace shows. A SEA
exception is injected into the guest so that the CPER buffer can be claimed
if the error is successfully pushed by acpi_ghes_memory_errors().
Otherwise, abort() is triggered to crash the virtual machine.

  kvm_vcpu_thread_fn
    kvm_cpu_exec
      kvm_arch_on_sigbus_vcpu
        kvm_cpu_synchronize_state
        acpi_ghes_memory_errors         (a)
        kvm_inject_arm_sea | abort

It's arguable whether the virtual machine should crash in this case. The
better behaviour would be to retry pushing the memory error, keeping the
virtual machine alive so that the administrator has a chance to step in,
for example to dump important data if they are lucky. This series adds one
more parameter to acpi_ghes_memory_errors() so that pushing the memory
error is retried until it succeeds.
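
The retry idea described above can be illustrated with a small,
self-contained C sketch. It is not the code from this series: the
struct cper_slot type and the functions push_memory_error(), guest_ack()
and push_with_retry() are hypothetical names used only to model a single
buffer that stays busy until it is acknowledged. The sketch also bounds
the number of attempts, which is an assumption made here to keep the
example finite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the single CPER buffer; not QEMU's real layout. */
struct cper_slot {
    bool     pending;   /* true until the guest acknowledges the record */
    uint64_t paddr;     /* faulting address stored in the record */
};

/* Returns 0 on success, -1 if the previous record is still unacknowledged. */
static int push_memory_error(struct cper_slot *slot, uint64_t paddr)
{
    if (slot->pending) {
        return -1;
    }
    slot->pending = true;
    slot->paddr = paddr;
    return 0;
}

/* Stand-in for the guest acknowledging the record it has consumed. */
static void guest_ack(struct cper_slot *slot)
{
    slot->pending = false;
}

/*
 * Retry the push instead of giving up on the first failure.
 * wait_for_guest (may be NULL) models giving the guest time to acknowledge.
 * Returns the attempt number that succeeded, or -1 if all attempts failed.
 */
static int push_with_retry(struct cper_slot *slot, uint64_t paddr,
                           int max_attempts,
                           void (*wait_for_guest)(struct cper_slot *))
{
    assert(max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (push_memory_error(slot, paddr) == 0) {
            return attempt;
        }
        if (wait_for_guest) {
            wait_for_guest(slot);
        }
    }
    return -1;
}
```

With an empty slot the first attempt succeeds. With a busy slot and
guest_ack passed as the wait callback, the second attempt succeeds,
whereas a caller that never lets the guest acknowledge runs out of
attempts.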

Gavin Shan (4):
  acpi/ghes: Make ghes_record_cper_errors() static
  acpi/ghes: Use error_report() in ghes_record_cper_errors()
  acpi/ghes: Allow retry to write CPER errors
  target/arm: Retry pushing CPER error if necessary

 hw/acpi/ghes-stub.c    |  3 ++-
 hw/acpi/ghes.c         | 45 +++++++++++++++++++++---------------------
 include/hw/acpi/ghes.h |  5 ++---
 target/arm/kvm.c       | 31 +++++++++++++++++++++++------
 4 files changed, 51 insertions(+), 33 deletions(-)

Comments

Jonathan Cameron Feb. 14, 2025, 9:53 a.m. UTC | #1
On Fri, 14 Feb 2025 14:16:31 +1000
Gavin Shan <gshan@redhat.com> wrote:

> Currently, there is only one CPER buffer (entry), meaning only one
> memory error can be reported. In extreme case, multiple memory errors
> can be raised on different vCPUs. For example, a singile memory error
> on a 64KB page of the host can results in 16 memory errors to 4KB
> pages of the guest. Unfortunately, the virtual machine is simply aborted
> by multiple concurrent memory errors, as the following call trace shows.
> A SEA exception is injected to the guest so that the CPER buffer can
> be claimed if the error is successfully pushed by acpi_ghes_memory_errors(),
> Otherwise, abort() is triggered to crash the virtual machine.
> 
>   kvm_vcpu_thread_fn
>     kvm_cpu_exec
>       kvm_arch_on_sigbus_vcpu
>         kvm_cpu_synchronize_state
>         acpi_ghes_memory_errors         (a)
>         kvm_inject_arm_sea | abort
> 
> It's arguably to crash the virtual machine in this case. The better
> behaviour would be to retry on pushing the memory errors, to keep the
> virtual machine alive so that the administrator has chance to chime
> in, for example to dump the important data with luck. This series
> adds one more parameter to acpi_ghes_memory_errors() so that it will
> be tried to push the memory error until it succeeds.

Hi Gavin,

+CC Mauro given:
https://lore.kernel.org/all/cover.1738345063.git.mchehab+huawei@kernel.org/

is more or less reviewed, subject to some requested patch reordering.
Whilst I haven't checked, a clash with this series seems likely
(it might just be some fuzz).

Jonathan



Jonathan Cameron Feb. 14, 2025, 10:12 a.m. UTC | #2
On Fri, 14 Feb 2025 14:16:31 +1000
Gavin Shan <gshan@redhat.com> wrote:

> Currently, there is only one CPER buffer (entry), meaning only one
> memory error can be reported. In extreme case, multiple memory errors
> can be raised on different vCPUs. For example, a singile memory error
> on a 64KB page of the host can results in 16 memory errors to 4KB
> pages of the guest. Unfortunately, the virtual machine is simply aborted
> by multiple concurrent memory errors, as the following call trace shows.
> A SEA exception is injected to the guest so that the CPER buffer can
> be claimed if the error is successfully pushed by acpi_ghes_memory_errors(),
> Otherwise, abort() is triggered to crash the virtual machine.
> 
>   kvm_vcpu_thread_fn
>     kvm_cpu_exec
>       kvm_arch_on_sigbus_vcpu
>         kvm_cpu_synchronize_state
>         acpi_ghes_memory_errors         (a)
>         kvm_inject_arm_sea | abort
> 
> It's arguably to crash the virtual machine in this case. The better
> behaviour would be to retry on pushing the memory errors, to keep the
> virtual machine alive so that the administrator has chance to chime
> in, for example to dump the important data with luck. This series
> adds one more parameter to acpi_ghes_memory_errors() so that it will
> be tried to push the memory error until it succeeds.
Hi Gavin,

If the ultimate aim is to support multiple memory errors, why not
just do that?  It's been a while since I looked at how that works, but
the spec definitely allows it, I think by just queuing up the errors
and updating the Error Status Address as each one is handled.
I think that's what the GHESv2 ack is all about: it prevents the
RAS firmware from updating the error record until it is acknowledged,
at which point the RAS firmware can report the next one.
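
As a rough illustration of the queue-and-acknowledge flow suggested
here, the following C sketch keeps pending errors in a small FIFO and
exposes the next one only after the current one has been acknowledged.
Everything in it is an assumption for illustration: struct err_queue,
ERR_QUEUE_DEPTH, report_error() and acknowledge() are hypothetical names,
and the sketch does not model the actual GHESv2 registers or QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_QUEUE_DEPTH 16      /* arbitrary depth chosen for illustration */

struct err_queue {
    uint64_t addr[ERR_QUEUE_DEPTH]; /* queued fault addresses (ring buffer) */
    size_t   head;                  /* index of the oldest queued error */
    size_t   count;                 /* queued errors not yet exposed */
    bool     in_flight;             /* a record is exposed, awaiting ack */
    uint64_t exposed;               /* address held by the exposed record */
};

/* Expose the oldest queued error if no record is awaiting acknowledgement. */
static bool expose_next(struct err_queue *q)
{
    if (q->in_flight || q->count == 0) {
        return false;
    }
    q->exposed = q->addr[q->head];
    q->head = (q->head + 1) % ERR_QUEUE_DEPTH;
    q->count--;
    q->in_flight = true;
    return true;
}

/* Queue a new error; returns false only if the queue is full. */
static bool report_error(struct err_queue *q, uint64_t paddr)
{
    if (q->count == ERR_QUEUE_DEPTH) {
        return false;
    }
    q->addr[(q->head + q->count) % ERR_QUEUE_DEPTH] = paddr;
    q->count++;
    expose_next(q);     /* no-op while an earlier record is unacknowledged */
    return true;
}

/* Acknowledge the exposed record; returns true if another one was exposed. */
static bool acknowledge(struct err_queue *q)
{
    assert(q->in_flight);
    q->in_flight = false;
    return expose_next(q);
}
```

In this model a second error reported before the first is acknowledged
simply waits in the queue instead of failing, and becomes visible as soon
as acknowledge() is called.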

Or... Given the use case above of a 64KiB host page and 4KiB guest
pages, can we inject a single error record with multiple CPER entries
and just handle it all in one go?

Set the Error record header -> section count to 16 and provide
16 Memory Error Sections or equivalent.
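
The figure of 16 is simply the host page size divided by the guest page
size. A minimal C sketch of that arithmetic follows; the function name
guest_pages_in_host_page() is hypothetical, and the sketch assumes, for
simplicity, that the faulting address and the guest page addresses live
in the same address space (translation between host and guest addresses
is not modelled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Fill out[] with the base address of every guest-sized page inside the
 * host-sized page that contains fault_addr. Returns the number of
 * addresses written, or 0 if out[] is too small to hold them all.
 */
static size_t guest_pages_in_host_page(uint64_t fault_addr,
                                       uint64_t host_page_size,
                                       uint64_t guest_page_size,
                                       uint64_t *out, size_t out_len)
{
    assert(is_pow2(host_page_size) && is_pow2(guest_page_size));
    assert(guest_page_size <= host_page_size);

    uint64_t base = fault_addr & ~(host_page_size - 1);
    size_t count = (size_t)(host_page_size / guest_page_size);

    if (count > out_len) {
        return 0;
    }
    for (size_t i = 0; i < count; i++) {
        out[i] = base + (uint64_t)i * guest_page_size;
    }
    return count;
}
```

For a 64KiB host page and 4KiB guest pages this yields 16 addresses, one
per candidate Memory Error Section; with equal page sizes it yields one.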

Doesn't help with multiple errors in unrelated memory addresses but
maybe removes one problem case.

I've not checked that all the information makes it to the right places,
however, or that we don't end up with a deadlock when multiple vCPUs
are involved.

If doing the more significant surgery this would involve, I'd
love to see Mauro's series land first as it cleans up a lot of
how HEST is handled etc.

Jonathan

Mauro Carvalho Chehab Feb. 14, 2025, 12:59 p.m. UTC | #3
Em Fri, 14 Feb 2025 14:16:31 +1000
Gavin Shan <gshan@redhat.com> escreveu:

> Currently, there is only one CPER buffer (entry), meaning only one
> memory error can be reported. In extreme case, multiple memory errors
> can be raised on different vCPUs. For example, a singile memory error
> on a 64KB page of the host can results in 16 memory errors to 4KB
> pages of the guest. 

There is already a patchset allowing multiple CPER entries; it has
been floating around since last year:

	https://lore.kernel.org/qemu-devel/cover.1738345063.git.mchehab+huawei@kernel.org/

I guess it is almost ready to be merged, needing just some
nitpick changes to satisfy the ACPI maintainers. That changeset already
adds a second CPER entry for GED, and makes it easy to add more as
needed.

> Unfortunately, the virtual machine is simply aborted
> by multiple concurrent memory errors, as the following call trace shows.
> A SEA exception is injected to the guest so that the CPER buffer can
> be claimed if the error is successfully pushed by acpi_ghes_memory_errors(),
> Otherwise, abort() is triggered to crash the virtual machine.
> 
>   kvm_vcpu_thread_fn
>     kvm_cpu_exec
>       kvm_arch_on_sigbus_vcpu
>         kvm_cpu_synchronize_state
>         acpi_ghes_memory_errors         (a)
>         kvm_inject_arm_sea | abort
> 
> It's arguably to crash the virtual machine in this case. The better
> behaviour would be to retry on pushing the memory errors, to keep the
> virtual machine alive so that the administrator has chance to chime
> in, for example to dump the important data with luck. This series
> adds one more parameter to acpi_ghes_memory_errors() so that it will
> be tried to push the memory error until it succeeds.

Having a retry buffer might be interesting for some types of errors,
like error-injected and corrected errors. Yet, it doesn't sound right 
to buffer uncorrected errors that would affect the virtual machine.

Thanks,
Mauro
Gavin Shan Feb. 17, 2025, 12:29 a.m. UTC | #4
On 2/14/25 7:53 PM, Jonathan Cameron wrote:
> On Fri, 14 Feb 2025 14:16:31 +1000
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> Currently, there is only one CPER buffer (entry), meaning only one
>> memory error can be reported. In extreme case, multiple memory errors
>> can be raised on different vCPUs. For example, a singile memory error
>> on a 64KB page of the host can results in 16 memory errors to 4KB
>> pages of the guest. Unfortunately, the virtual machine is simply aborted
>> by multiple concurrent memory errors, as the following call trace shows.
>> A SEA exception is injected to the guest so that the CPER buffer can
>> be claimed if the error is successfully pushed by acpi_ghes_memory_errors(),
>> Otherwise, abort() is triggered to crash the virtual machine.
>>
>>    kvm_vcpu_thread_fn
>>      kvm_cpu_exec
>>        kvm_arch_on_sigbus_vcpu
>>          kvm_cpu_synchronize_state
>>          acpi_ghes_memory_errors         (a)
>>          kvm_inject_arm_sea | abort
>>
>> It's arguably to crash the virtual machine in this case. The better
>> behaviour would be to retry on pushing the memory errors, to keep the
>> virtual machine alive so that the administrator has chance to chime
>> in, for example to dump the important data with luck. This series
>> adds one more parameter to acpi_ghes_memory_errors() so that it will
>> be tried to push the memory error until it succeeds.
> 
> Hi Gavin,
> 
> +CC Mauro given:
> https://lore.kernel.org/all/cover.1738345063.git.mchehab+huawei@kernel.org/
> 
> is more or less reviewed subject to some requested patch reordering and
> whilst I haven't checked, seems unlikely that there won't be a
> clash with this series (might just be some fuzz)
> 

Jonathan, thanks for the pointer. I hadn't noticed that there are pending
acpi/hest changes. They clash with the changes in this series, so I will
take a close look.

Thanks,
Gavin

Gavin Shan Feb. 17, 2025, 3:49 a.m. UTC | #5
On 2/14/25 8:12 PM, Jonathan Cameron wrote:
> On Fri, 14 Feb 2025 14:16:31 +1000
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> Currently, there is only one CPER buffer (entry), meaning only one
>> memory error can be reported. In extreme case, multiple memory errors
>> can be raised on different vCPUs. For example, a singile memory error
>> on a 64KB page of the host can results in 16 memory errors to 4KB
>> pages of the guest. Unfortunately, the virtual machine is simply aborted
>> by multiple concurrent memory errors, as the following call trace shows.
>> A SEA exception is injected to the guest so that the CPER buffer can
>> be claimed if the error is successfully pushed by acpi_ghes_memory_errors(),
>> Otherwise, abort() is triggered to crash the virtual machine.
>>
>>    kvm_vcpu_thread_fn
>>      kvm_cpu_exec
>>        kvm_arch_on_sigbus_vcpu
>>          kvm_cpu_synchronize_state
>>          acpi_ghes_memory_errors         (a)
>>          kvm_inject_arm_sea | abort
>>
>> It's arguably to crash the virtual machine in this case. The better
>> behaviour would be to retry on pushing the memory errors, to keep the
>> virtual machine alive so that the administrator has chance to chime
>> in, for example to dump the important data with luck. This series
>> adds one more parameter to acpi_ghes_memory_errors() so that it will
>> be tried to push the memory error until it succeeds.
> Hi Gavin,
> 
> If the ultimate aim is to support multiple memory errors why not
> just do that?  Been a while since I look at how that works, but
> the spec definitely allows it.  I think by just queuing up the errors
> and updating the Error Status Address as each one is handled.
> I think that's what GHESv2 ack is all about as it prevents the
> RAS firmware updating the error record until it is acknowledged
> at which point the RAS firmware can report the next one.
> 
> Or... Given the usecase above of a 64KiB host page and 4KiB guest
> can we inject a single error record with multiple CPER entries and
> just handle it all in one go?
> 
> Set the Error record header -> section count to 16 and provide
> 16 Memory Error Sections or equivalent.
> 
> Doesn't help with multiple errors in unrelated memory addresses but
> maybe removes one problem case.
> 
> I've not checked all the information makes it to the right places
> however or that we don't end up with a deadlock when multiple vCPU
> involved.
> 
> If doing the more significant surgery this would involve, I'd
> love to see Mauro's series land first as it cleans up a lot of
> how HEST is handled etc.
> 

Jonathan, thanks for the review and comments. It's just an example that one problematic
64k host page can affect 16 4k guest pages. The errors aren't raised at the same
time, because QEMU receives the SIGBUS signal only when the corresponding 4k guest
page is accessed. If all those errors are queued up and delivered at once, the
question is when the queued errors should be delivered.

Besides, the problematic 64k host page affecting 16 4k guest pages is only an example.
When the host and guest have the same page size (e.g. 4KB), it's still possible that two
problematic pages are detected by SIGBUS signals. It's also possible that one CPER error
is being delivered but has not been acknowledged when a follow-up CPER error is raised
for delivery. In this case, abort() is triggered as well. So the problem isn't specific
to the 64k host page size + 4k guest page size combination.

Thanks,
Gavin
Gavin Shan Feb. 17, 2025, 3:58 a.m. UTC | #6
On 2/14/25 10:59 PM, Mauro Carvalho Chehab wrote:
> Em Fri, 14 Feb 2025 14:16:31 +1000
> Gavin Shan <gshan@redhat.com> escreveu:
> 
>> Currently, there is only one CPER buffer (entry), meaning only one
>> memory error can be reported. In extreme case, multiple memory errors
>> can be raised on different vCPUs. For example, a singile memory error
>> on a 64KB page of the host can results in 16 memory errors to 4KB
>> pages of the guest.
> 
> There is already a patchset allowing to have multiple CPER entries
> floating around since last year:
> 
> 	https://lore.kernel.org/qemu-devel/cover.1738345063.git.mchehab+huawei@kernel.org/
> 
> I guess it is almost ready for being merged, needing just some
> nitpick changes to satisfy ACPI maintainers. Such changeset already
> adds a second CPER entry for GED, and allows to easily add more as
> needed.
> 

Thanks for the link, Mauro. As I explained to Jonathan, the bottleneck
isn't the number of CPER entries (single or multiple). The bottleneck
is actually the acknowledgment mechanism. With that mechanism, a single
CPER buffer, which could contain multiple entries, can be delivered
and acknowledged in one go. I don't see that your series changes anything
in this regard, unless I'm missing something.

>> In extreme case, multiple memory errors
>> can be raised on different vCPUs. For example, a singile memory error
>> on a 64KB page of the host can results in 16 memory errors to 4KB
>> pages of the guest.
> 
>> Unfortunately, the virtual machine is simply aborted
>> by multiple concurrent memory errors, as the following call trace shows.
>> A SEA exception is injected to the guest so that the CPER buffer can
>> be claimed if the error is successfully pushed by acpi_ghes_memory_errors(),
>> Otherwise, abort() is triggered to crash the virtual machine.
>>
>>    kvm_vcpu_thread_fn
>>      kvm_cpu_exec
>>        kvm_arch_on_sigbus_vcpu
>>          kvm_cpu_synchronize_state
>>          acpi_ghes_memory_errors         (a)
>>          kvm_inject_arm_sea | abort
>>
>> It's arguably to crash the virtual machine in this case. The better
>> behaviour would be to retry on pushing the memory errors, to keep the
>> virtual machine alive so that the administrator has chance to chime
>> in, for example to dump the important data with luck. This series
>> adds one more parameter to acpi_ghes_memory_errors() so that it will
>> be tried to push the memory error until it succeeds.
> 
> Having a retry buffer might be interesting for some types of errors,
> like error-injected and corrected errors. Yet, it doesn't sound right
> to buffer uncorrected errors that would affect the virtual machine.
> 

The question is how an uncorrected error can be delivered if a previous
corrected error is still being delivered and has not been acknowledged yet.
With the acknowledgement mechanism, all errors have equal priority when
they're delivered, correct?

Thanks,
Gavin