diff mbox series

[XEN] tools/misc: xen-hvmcrash: Inject #DF instead of overwriting RIP

Message ID 27f4397093d92b53f89d625d682bd4b7145b65d8.1717426439.git.matthew.barnes@cloud.com (mailing list archive)
State Superseded
Headers show
Series [XEN] tools/misc: xen-hvmcrash: Inject #DF instead of overwriting RIP | expand

Commit Message

Matthew Barnes June 3, 2024, 2:59 p.m. UTC
xen-hvmcrash would previously save records, overwrite the instruction
pointer with a bogus value, and then restore them to crash a domain
just enough to cause the guest OS to memdump.

This approach is found to be unreliable when tested on a guest running
Windows 10 x64, with some executions doing nothing at all.

Another approach would be to trigger NMIs. This approach is found to be
unreliable when tested on Linux (Ubuntu 22.04), as Linux will ignore
NMIs if it is not configured to handle such.

Injecting a double fault abort to all vCPUs is found to be more
reliable at crashing and invoking memdumps from Windows and Linux
domains.

This patch modifies the xen-hvmcrash tool to inject #DF to all vCPUs
belonging to the specified domain, instead of overwriting RIP.

Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com>
---
 tools/misc/xen-hvmcrash.c | 77 +++++++--------------------------------
 1 file changed, 13 insertions(+), 64 deletions(-)

Comments

Anthony PERARD June 21, 2024, 3:38 p.m. UTC | #1
On Mon, Jun 03, 2024 at 03:59:18PM +0100, Matthew Barnes wrote:
> diff --git a/tools/misc/xen-hvmcrash.c b/tools/misc/xen-hvmcrash.c
> index 1d058fa40a47..8ef1beb388f8 100644
> --- a/tools/misc/xen-hvmcrash.c
> +++ b/tools/misc/xen-hvmcrash.c
> @@ -38,22 +38,21 @@
>  #include <sys/stat.h>
>  #include <arpa/inet.h>
>
> +#define XC_WANT_COMPAT_DEVICEMODEL_API
>  #include <xenctrl.h>
>  #include <xen/xen.h>
>  #include <xen/domctl.h>
>  #include <xen/hvm/save.h>

There's lots of headers that aren't used by the new codes and can be
removed. (They were probably way too many headers included when this
utility was introduced.)

> +    for (vcpu_id = 0; vcpu_id <= dominfo.max_vcpu_id; vcpu_id++) {
> +        printf("Injecting #DF to vcpu ID #%d...\n", vcpu_id);
> +        ret = xc_hvm_inject_trap(xch, domid, vcpu_id,
> +                                X86_ABORT_DF,

In the definition of xendevicemodel_inject_event(), the comment say to
look at "enum x86_event_type" for possible event "type", but there's no
"#DF" type, can we add this new one there before using it? (It's going
to be something like X86_EVENTTYPE_*)

> +                                XEN_DMOP_EVENT_hw_exc, 0,
> +                                NULL, NULL);

The new code doesn't build, "NULL" aren't integers.

> +        if (ret < 0) {
> +            fprintf(stderr, "Could not inject #DF to vcpu ID #%d\n", vcpu_id);
> +            perror("xc_hvm_inject_trap");

Are you meant to print two error lines when there's an error? You can
also use strerror() to convert an "errno" to a string.

Also, perror() might print an error from fprintf() if that last one
failed.

Are you meant to keep inject into other vcpus even if one have failed?
Should `xen-hvmcrash` return success when it failed to inject the double
fault to all vcpus?

Thanks,
Matthew Barnes June 25, 2024, 3:51 p.m. UTC | #2
On Fri, Jun 21, 2024 at 03:38:36PM +0000, Anthony PERARD wrote:
> On Mon, Jun 03, 2024 at 03:59:18PM +0100, Matthew Barnes wrote:
> > diff --git a/tools/misc/xen-hvmcrash.c b/tools/misc/xen-hvmcrash.c
> > index 1d058fa40a47..8ef1beb388f8 100644
> > --- a/tools/misc/xen-hvmcrash.c
> > +++ b/tools/misc/xen-hvmcrash.c
> > @@ -38,22 +38,21 @@
> >  #include <sys/stat.h>
> >  #include <arpa/inet.h>
> >
> > +#define XC_WANT_COMPAT_DEVICEMODEL_API
> >  #include <xenctrl.h>
> >  #include <xen/xen.h>
> >  #include <xen/domctl.h>
> >  #include <xen/hvm/save.h>
>
> There's lots of headers that aren't used by the new codes and can be
> removed. (They were probably way too many headers included when this
> utility was introduced.)

I will remove the unnecessary headers in patch v2.

> > +    for (vcpu_id = 0; vcpu_id <= dominfo.max_vcpu_id; vcpu_id++) {
> > +        printf("Injecting #DF to vcpu ID #%d...\n", vcpu_id);
> > +        ret = xc_hvm_inject_trap(xch, domid, vcpu_id,
> > +                                X86_ABORT_DF,
>
> In the definition of xendevicemodel_inject_event(), the comment say to
> look at "enum x86_event_type" for possible event "type", but there's no
> "#DF" type, can we add this new one there before using it? (It's going
> to be something like X86_EVENTTYPE_*)

To my understanding, the event types enum refer to the kind of interrupt
being handled. In this case, it is a hardware exception, which already
exists as an entry in the enum definition.

The `vector` parameter describes the kind of exception. I just found
that exception vector macros are defined in `x86-defns.h`, so I will
include that and instead use `X86_EXC_DF` in patch v2.

The only other usage of `xc_hvm_inject_trap` is in `xen-access.c`, which
defines all the required vectors as macros at the top of the source file.
Hence, I did the same in `xen-hvmcrash.c` for patch v1.

Would it be a good idea to rewrite `xen-access.c` to use `x86-defns.h`
as well, in a later patch?

> > +                                XEN_DMOP_EVENT_hw_exc, 0,
> > +                                NULL, NULL);
>
> The new code doesn't build, "NULL" aren't integers.
>
> > +        if (ret < 0) {
> > +            fprintf(stderr, "Could not inject #DF to vcpu ID #%d\n",
vcpu_id);
> > +            perror("xc_hvm_inject_trap");
>
> Are you meant to print two error lines when there's an error? You can
> also use strerror() to convert an "errno" to a string.

I will use strerror and one print call in patch v2.

> Are you meant to keep inject into other vcpus even if one have failed?

Yes; xen-hvmcrash doesn't have to inject *all* vcpus to cause it to
crash.

> Should `xen-hvmcrash` return success when it failed to inject the double
> fault to all vcpus?

I will make xen-hvmcrash yield an error if no vcpus could be injected in
patch v2.

Matt
Andrew Cooper June 25, 2024, 9:02 p.m. UTC | #3
On 03/06/2024 3:59 pm, Matthew Barnes wrote:
> xen-hvmcrash would previously save records, overwrite the instruction
> pointer with a bogus value, and then restore them to crash a domain
> just enough to cause the guest OS to memdump.
>
> This approach is found to be unreliable when tested on a guest running
> Windows 10 x64, with some executions doing nothing at all.
>
> Another approach would be to trigger NMIs. This approach is found to be
> unreliable when tested on Linux (Ubuntu 22.04), as Linux will ignore
> NMIs if it is not configured to handle such.
>
> Injecting a double fault abort to all vCPUs is found to be more
> reliable at crashing and invoking memdumps from Windows and Linux
> domains.

Why every CPU?

We never did that before, and I don't see any it ought to be necessary
now either.



> diff --git a/tools/misc/xen-hvmcrash.c b/tools/misc/xen-hvmcrash.c
> index 1d058fa40a47..8ef1beb388f8 100644
> --- a/tools/misc/xen-hvmcrash.c
> +++ b/tools/misc/xen-hvmcrash.c
> @@ -38,22 +38,21 @@
>  #include <sys/stat.h>
>  #include <arpa/inet.h>
>  
> +#define XC_WANT_COMPAT_DEVICEMODEL_API

Please don't introduce this.  We want to purge it from the codebase, not
propagate it.

You want to open and use a libxendevicemodel handle.  (Sadly you also
need a xenctrl handle too, until we sort out the userspace ABIs).

>  #include <xenctrl.h>
>  #include <xen/xen.h>
>  #include <xen/domctl.h>
>  #include <xen/hvm/save.h>
>  
> +#define X86_ABORT_DF 8

#include <xen/asm/x86-defns.h>

and use X86_EXC_DF.

~Andrew
Matthew Barnes June 26, 2024, 2:12 p.m. UTC | #4
On Tue, Jun 25, 2024 at 10:02:42PM +0100, Andrew Cooper wrote:
> On 03/06/2024 3:59 pm, Matthew Barnes wrote:
> > xen-hvmcrash would previously save records, overwrite the instruction
> > pointer with a bogus value, and then restore them to crash a domain
> > just enough to cause the guest OS to memdump.
> >
> > This approach is found to be unreliable when tested on a guest running
> > Windows 10 x64, with some executions doing nothing at all.
> >
> > Another approach would be to trigger NMIs. This approach is found to be
> > unreliable when tested on Linux (Ubuntu 22.04), as Linux will ignore
> > NMIs if it is not configured to handle such.
> >
> > Injecting a double fault abort to all vCPUs is found to be more
> > reliable at crashing and invoking memdumps from Windows and Linux
> > domains.
> 
> Why every CPU?
> 
> We never did that before, and I don't see any it ought to be necessary
> now either.

We do: at the moment, xen-hvmcrash iterates through
hvm_save_descriptors after pausing the domain, overwriting the EIP/RIP of
each cpu it finds.

Is there a reason not to inject #DF into each domain vCPU? Wouldn't that
crash the domain more reliably?

> > diff --git a/tools/misc/xen-hvmcrash.c b/tools/misc/xen-hvmcrash.c
> > index 1d058fa40a47..8ef1beb388f8 100644
> > --- a/tools/misc/xen-hvmcrash.c
> > +++ b/tools/misc/xen-hvmcrash.c
> > @@ -38,22 +38,21 @@
> >  #include <sys/stat.h>
> >  #include <arpa/inet.h>
> >  
> > +#define XC_WANT_COMPAT_DEVICEMODEL_API
> 
> Please don't introduce this.  We want to purge it from the codebase, not
> propagate it.
> 
> You want to open and use a libxendevicemodel handle.  (Sadly you also
> need a xenctrl handle too, until we sort out the userspace ABIs).
> 
> >  #include <xenctrl.h>
> >  #include <xen/xen.h>
> >  #include <xen/domctl.h>
> >  #include <xen/hvm/save.h>
> >  
> > +#define X86_ABORT_DF 8
> 
> #include <xen/asm/x86-defns.h>
> 
> and use X86_EXC_DF.

Understood: this will be reflected in patch v2.

Matt
diff mbox series

Patch

diff --git a/tools/misc/xen-hvmcrash.c b/tools/misc/xen-hvmcrash.c
index 1d058fa40a47..8ef1beb388f8 100644
--- a/tools/misc/xen-hvmcrash.c
+++ b/tools/misc/xen-hvmcrash.c
@@ -38,22 +38,21 @@ 
 #include <sys/stat.h>
 #include <arpa/inet.h>
 
+#define XC_WANT_COMPAT_DEVICEMODEL_API
 #include <xenctrl.h>
 #include <xen/xen.h>
 #include <xen/domctl.h>
 #include <xen/hvm/save.h>
 
+#define X86_ABORT_DF 8
+
 int
 main(int argc, char **argv)
 {
     int domid;
     xc_interface *xch;
     xc_domaininfo_t dominfo;
-    int ret;
-    uint32_t len;
-    uint8_t *buf;
-    uint32_t off;
-    struct hvm_save_descriptor *descriptor;
+    int vcpu_id, ret;
 
     if (argc != 2 || !argv[1] || (domid = atoi(argv[1])) < 0) {
         fprintf(stderr, "usage: %s <domid>\n", argv[0]);
@@ -77,66 +76,16 @@  main(int argc, char **argv)
         exit(1);
     }
 
-    ret = xc_domain_pause(xch, domid);
-    if (ret < 0) {
-        perror("xc_domain_pause");
-        exit(-1);
-    }
-
-    /*
-     * Calling with zero buffer length should return the buffer length
-     * required.
-     */
-    ret = xc_domain_hvm_getcontext(xch, domid, 0, 0);
-    if (ret < 0) {
-        perror("xc_domain_hvm_getcontext");
-        exit(1);
-    }
-    
-    len = ret;
-    buf = malloc(len);
-    if (buf == NULL) {
-        perror("malloc");
-        exit(1);
-    }
-
-    ret = xc_domain_hvm_getcontext(xch, domid, buf, len);
-    if (ret < 0) {
-        perror("xc_domain_hvm_getcontext");
-        exit(1);
-    }
-
-    off = 0;
-
-    while (off < len) {
-        descriptor = (struct hvm_save_descriptor *)(buf + off);
-
-        off += sizeof (struct hvm_save_descriptor);
-
-        if (descriptor->typecode == HVM_SAVE_CODE(CPU)) {
-            HVM_SAVE_TYPE(CPU) *cpu;
-
-            /* Overwrite EIP/RIP with some recognisable but bogus value */
-            cpu = (HVM_SAVE_TYPE(CPU) *)(buf + off);
-            printf("CPU[%d]: RIP = %" PRIx64 "\n", descriptor->instance, cpu->rip);
-            cpu->rip = 0xf001;
-        } else if (descriptor->typecode == HVM_SAVE_CODE(END)) {
-            break;
+    for (vcpu_id = 0; vcpu_id <= dominfo.max_vcpu_id; vcpu_id++) {
+        printf("Injecting #DF to vcpu ID #%d...\n", vcpu_id);
+        ret = xc_hvm_inject_trap(xch, domid, vcpu_id,
+                                X86_ABORT_DF,
+                                XEN_DMOP_EVENT_hw_exc, 0,
+                                NULL, NULL);
+        if (ret < 0) {
+            fprintf(stderr, "Could not inject #DF to vcpu ID #%d\n", vcpu_id);
+            perror("xc_hvm_inject_trap");
         }
-
-        off += descriptor->length;
-    }
-
-    ret = xc_domain_hvm_setcontext(xch, domid, buf, len);
-    if (ret < 0) {
-        perror("xc_domain_hvm_setcontext");
-        exit(1);
-    }
-
-    ret = xc_domain_unpause(xch, domid);
-    if (ret < 0) {
-        perror("xc_domain_unpause");
-        exit(1);
     }
 
     return 0;