diff mbox series

[RESEND,RFC,v2,2/2] target/arm: Support NMI injection

Message ID 20200205110541.37811-3-gshan@redhat.com (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Gavin Shan Feb. 5, 2020, 11:05 a.m. UTC
This supports QMP/HMP "nmi" command by injecting SError interrupt to
guest, which is expected to crash with that. Currently, It's supported
on two CPU models: "host" and "max".

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/arm/virt.c          | 18 ++++++++++++++++
 target/arm/cpu-qom.h   |  1 +
 target/arm/cpu.c       | 48 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu64.c     | 25 ++++++++++++++++++----
 target/arm/internals.h |  8 +++++++
 5 files changed, 96 insertions(+), 4 deletions(-)

Comments

Peter Maydell Feb. 13, 2020, 11:11 a.m. UTC | #1
On Wed, 5 Feb 2020 at 11:06, Gavin Shan <gshan@redhat.com> wrote:
>
> This supports QMP/HMP "nmi" command by injecting SError interrupt to
> guest, which is expected to crash with that. Currently, It's supported
> on two CPU models: "host" and "max".
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  hw/arm/virt.c          | 18 ++++++++++++++++
>  target/arm/cpu-qom.h   |  1 +
>  target/arm/cpu.c       | 48 ++++++++++++++++++++++++++++++++++++++++++
>  target/arm/cpu64.c     | 25 ++++++++++++++++++----
>  target/arm/internals.h |  8 +++++++
>  5 files changed, 96 insertions(+), 4 deletions(-)

A few quick general notes:

(1) as I mentioned on the cover letter, the mechanism
for injecting an SError/async external abort into the
CPU should be a qemu_irq line, like FIQ/IRQ, not a
special-purpose method on the CPU object.

(2) for function naming, there's a dividing line between:
 * code that implements the (unfortunately x86-centric)
   monitor command named "nmi"; these functions can have
   names with 'nmi' in them
 * code which implements the actual mechanism of 'deliver
   an SError to the CPU'; these functions should not
   have 'nmi' in the name or mention nmi, because nmi is
   not a concept in the Arm architecture

(3) Before we expose 'nmi' to users as something that
delivers an SError, we need to think about the interactions
with RAS, because currently we also use SError to say
"there was an error in the host memory you're using",
and we might in future want to use SError for proper
emulated RAS. We don't want to paint ourselves into a
corner by grabbing SError exclusively for 'nmi'.

thanks
-- PMM
Gavin Shan Feb. 14, 2020, 6:23 a.m. UTC | #2
On 2/13/20 10:11 PM, Peter Maydell wrote:
> On Wed, 5 Feb 2020 at 11:06, Gavin Shan <gshan@redhat.com> wrote:
>>
>> This supports QMP/HMP "nmi" command by injecting SError interrupt to
>> guest, which is expected to crash with that. Currently, It's supported
>> on two CPU models: "host" and "max".
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/arm/virt.c          | 18 ++++++++++++++++
>>   target/arm/cpu-qom.h   |  1 +
>>   target/arm/cpu.c       | 48 ++++++++++++++++++++++++++++++++++++++++++
>>   target/arm/cpu64.c     | 25 ++++++++++++++++++----
>>   target/arm/internals.h |  8 +++++++
>>   5 files changed, 96 insertions(+), 4 deletions(-)
> 
> A few quick general notes:
> 
> (1) as I mentioned on the cover letter, the mechanism
> for injecting an SError/async external abort into the
> CPU should be a qemu_irq line, like FIQ/IRQ, not a
> special-purpose method on the CPU object.
> 
> (2) for function naming, there's a dividing line between:
>   * code that implements the (unfortunately x86-centric)
>     monitor command named "nmi"; these functions can have
>     names with 'nmi' in them
>   * code which implements the actual mechanism of 'deliver
>     an SError to the CPU'; these functions should not
>     have 'nmi' in the name or mention nmi, because nmi is
>     not a concept in the Arm architecture
> 
> (3) Before we expose 'nmi' to users as something that
> delivers an SError, we need to think about the interactions
> with RAS, because currently we also use SError to say
> "there was an error in the host memory you're using",
> and we might in future want to use SError for proper
> emulated RAS. We don't want to paint ourselves into a
> corner by grabbing SError exclusively for 'nmi'.
> 

Hi Peter, Thanks for the nice details. I just posted v3 to address
(1) and (2).

For (3), I'm not sure. It seems we need some registers to record the
details on why the SError is raised. For ARMv8 with RAS extension is
supported, VSESR_EL2 can be used. Otherwise, we probably need some
space in ESR_EL1. For aarch32, DFSR might be the alternative. With
more details about the cause, the "NMI" and other errors can be
classified.

May I ask if the SError is going to be triggered by simulated device
or real one, or both in future? If it's corresponding to host memory
error, it seems it should be triggered by a real hardware. In this case,
the error should be intercepted and then delivered to guest. I need
more details how the SError will be used for RAS.

Thanks,
Gavin
diff mbox series

Patch

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f788fe27d6..883861a28a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -71,6 +71,7 @@ 
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/nmi.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -2026,10 +2027,25 @@  static int virt_kvm_type(MachineState *ms, const char *type_str)
     return requested_pa_size > 40 ? requested_pa_size : 0;
 }
 
+static void virt_nmi(NMIState *n, int cpu_index, Error **errp)
+{
+    ARMCPUClass *acc;
+    CPUState *cpu = first_cpu;
+
+    acc = ARM_CPU_GET_CLASS(OBJECT(cpu));
+    if (!acc->inject_nmi) {
+        error_setg(errp, "NMI injection not supported");
+        return;
+    }
+
+    acc->inject_nmi(cpu, errp);
+}
+
 static void virt_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
     HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+    NMIClass *nc = NMI_CLASS(oc);
 
     mc->init = machvirt_init;
     /* Start with max_cpus set to 512, which is the maximum supported by KVM.
@@ -2058,6 +2074,7 @@  static void virt_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug_request = virt_machine_device_unplug_request_cb;
     mc->numa_mem_supported = true;
     mc->auto_enable_numa_with_memhp = true;
+    nc->nmi_monitor_handler = virt_nmi;
 }
 
 static void virt_instance_init(Object *obj)
@@ -2141,6 +2158,7 @@  static const TypeInfo virt_machine_info = {
     .instance_init = virt_instance_init,
     .interfaces = (InterfaceInfo[]) {
          { TYPE_HOTPLUG_HANDLER },
+         { TYPE_NMI },
          { }
     },
 };
diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 7f5b244bde..853f0f3ca5 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -52,6 +52,7 @@  typedef struct ARMCPUClass {
     const ARMCPUInfo *info;
     DeviceRealize parent_realize;
     void (*parent_reset)(CPUState *cpu);
+    void (*inject_nmi)(CPUState *cpu, Error **errp);
 } ARMCPUClass;
 
 typedef struct ARMCPU ARMCPU;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 24fd77437b..f0ef315f3b 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -475,6 +475,44 @@  bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
     return ret;
 }
 
+#ifdef CONFIG_KVM
+static void do_inject_nmi_kvm(CPUState *cpu, run_on_cpu_data data)
+{
+    struct kvm_vcpu_events events;
+    Error **errp = data.host_ptr;
+    int ret;
+
+    if (!kvm_has_vcpu_events()) {
+        error_setg(errp, "vCPU events not supported");
+        return;
+    }
+
+    memset(&events, 0, sizeof(events));
+    events.exception.serror_pending = 1;
+    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
+    if (ret) {
+        error_setg(errp, "Error to inject vCPU events");
+    }
+}
+#endif
+
+/* The guest is expected to crash once receiving the SError interrupt */
+static void do_inject_nmi(CPUState *cpu, run_on_cpu_data data)
+{
+    cpu_synchronize_state(cpu);
+
+#ifdef CONFIG_KVM
+    return do_inject_nmi_kvm(cpu, data);
+#endif
+
+    cpu_interrupt(cpu, CPU_INTERRUPT_SERROR);
+}
+
+void arm_cpu_inject_nmi(CPUState *cpu, Error **errp)
+{
+    async_run_on_cpu(cpu, do_inject_nmi, RUN_ON_CPU_HOST_PTR(errp));
+}
+
 #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
 static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 {
@@ -2759,10 +2797,20 @@  static void arm_host_initfn(Object *obj)
     arm_cpu_post_init(obj);
 }
 
+#ifdef TARGET_AARCH64
+static void arm_host_class_init(ObjectClass *oc, void *data)
+{
+    ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+
+    acc->inject_nmi = arm_cpu_inject_nmi;
+}
+#endif /* TARGET_AARCH64 */
+
 static const TypeInfo host_arm_cpu_type_info = {
     .name = TYPE_ARM_HOST_CPU,
 #ifdef TARGET_AARCH64
     .parent = TYPE_AARCH64_CPU,
+    .class_init = arm_host_class_init,
 #else
     .parent = TYPE_ARM_CPU,
 #endif
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 2d97bf45e1..66e746f66b 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -21,6 +21,7 @@ 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "cpu.h"
+#include "internals.h"
 #include "qemu/module.h"
 #if !defined(CONFIG_USER_ONLY)
 #include "hw/loader.h"
@@ -713,6 +714,14 @@  static void aarch64_max_initfn(Object *obj)
                         cpu_max_set_sve_max_vq, NULL, NULL, &error_fatal);
 }
 
+static void aarch64_max_class_init(ObjectClass *oc, void *data)
+{
+    ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+
+    acc->inject_nmi = arm_cpu_inject_nmi;
+    acc->info = data;
+}
+
 struct ARMCPUInfo {
     const char *name;
     void (*initfn)(Object *obj);
@@ -720,10 +729,18 @@  struct ARMCPUInfo {
 };
 
 static const ARMCPUInfo aarch64_cpus[] = {
-    { .name = "cortex-a57",         .initfn = aarch64_a57_initfn },
-    { .name = "cortex-a53",         .initfn = aarch64_a53_initfn },
-    { .name = "cortex-a72",         .initfn = aarch64_a72_initfn },
-    { .name = "max",                .initfn = aarch64_max_initfn },
+    { .name = "cortex-a57",
+      .class_init = NULL,
+      .initfn = aarch64_a57_initfn },
+    { .name = "cortex-a53",
+      .class_init = NULL,
+      .initfn = aarch64_a53_initfn },
+    { .name = "cortex-a72",
+      .class_init = NULL,
+      .initfn = aarch64_a72_initfn },
+    { .name = "max",
+      .class_init = aarch64_max_class_init,
+      .initfn = aarch64_max_initfn },
     { .name = NULL }
 };
 
diff --git a/target/arm/internals.h b/target/arm/internals.h
index f5313dd3d4..6a922fbc98 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -949,6 +949,14 @@  void arm_cpu_update_virq(ARMCPU *cpu);
  */
 void arm_cpu_update_vfiq(ARMCPU *cpu);
 
+/**
+ * arm_cpu_inject_nmi: Inject SError exception
+ *
+ * Inject SError exception to trigger kernel panic inside the guest. That's
+ * the simulated behavior as to other architectures like x86.
+ */
+void arm_cpu_inject_nmi(CPUState *cpu, Error **errp);
+
 /**
  * arm_mmu_idx_el:
  * @env: The cpu environment