diff mbox series

[2/3] tools/xenalyze: Ignore HVM_EMUL events harder

Message ID 20240426143231.4007671-3-george.dunlap@cloud.com (mailing list archive)
State New
Headers show
Series Further trace improvements | expand

Commit Message

George Dunlap April 26, 2024, 2:32 p.m. UTC
To unify certain common sanity checks, checks are done very early in
processing based only on the top-level type.

Unfortunately, when TRC_HVM_EMUL was introduced, it broke some of the
assumptions about how the top-level types worked.  Namely, traces of
this type will show up outside of HVM contexts: in idle domains and in
PV domains.

Make an explicit exception for TRC_HVM_EMUL types in a number of places:

 - Pass the record info pointer to toplevel_assert_check, so that it
   can exclude TRC_HVM_EMUL records from idle and vcpu data_mode
   checks

 - Don't attempt to set the vcpu data_type in hvm_process for
   TRC_HVM_EMUL records.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Anthony Perard <anthony.perard@cloud.com>
CC: Olaf Hering <olaf@aepfle.de>
---
 tools/xentrace/xenalyze.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

Comments

Andrew Cooper April 26, 2024, 3:06 p.m. UTC | #1
On 26/04/2024 3:32 pm, George Dunlap wrote:
> To unify certain common sanity checks, checks are done very early in
> processing based only on the top-level type.
>
> Unfortunately, when TRC_HVM_EMUL was introduced, it broke some of the
> assumptions about how the top-level types worked.  Namely, traces of
> this type will show up outside of HVM contexts: in idle domains and in
> PV domains.
>
> Make an explicit exception for TRC_HVM_EMUL types in a number of places:
>
>  - Pass the record info pointer to toplevel_assert_check, so that it
>    can exclude TRC_HVM_EMUL records from idle and vcpu data_mode
>    checks
>
>  - Don't attempt to set the vcpu data_type in hvm_process for
>    TRC_HVM_EMUL records.
>
> Signed-off-by: George Dunlap <george.dunlap@cloud.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Although I'm tempted to say that if records of this type show up outside
of HVM context, then it's misnamed or we've got an error in Xen.  Any
view on which?

~Andrew
George Dunlap April 26, 2024, 3:13 p.m. UTC | #2
On Fri, Apr 26, 2024 at 4:06 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 26/04/2024 3:32 pm, George Dunlap wrote:
> > To unify certain common sanity checks, checks are done very early in
> > processing based only on the top-level type.
> >
> > Unfortunately, when TRC_HVM_EMUL was introduced, it broke some of the
> > assumptions about how the top-level types worked.  Namely, traces of
> > this type will show up outside of HVM contexts: in idle domains and in
> > PV domains.
> >
> > Make an explicit exception for TRC_HVM_EMUL types in a number of places:
> >
> >  - Pass the record info pointer to toplevel_assert_check, so that it
> >    can exclude TRC_HVM_EMUL records from idle and vcpu data_mode
> >    checks
> >
> >  - Don't attempt to set the vcpu data_type in hvm_process for
> >    TRC_HVM_EMUL records.
> >
> > Signed-off-by: George Dunlap <george.dunlap@cloud.com>
>
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> Although I'm tempted to say that if records of this type show up outside
> of HVM context, then it's misnamed or we've got an error in Xen.  Any
> view on which?

I didn't add them; they seem to be about doing emulation of devices on
behalf of an HVM domain.  But since some of these are things like
timers and such, the actual code ends up being run in random contexts;
probably interrupt contexts (which of course can occur when a non-HVM
guest is running).

One example of an event that showed up this way was
TRC_HVM_EMUL_RTC_STOP_TIMER, which is traced from
xen/arch/x86/hvm/rtc.c:rtc_pf_callback().  I didn't trace back how it
came to be called while a PV guest was running, but it was pretty
obvious that it was legit.

It would certainly make the xenalyze code cleaner to make a separate
top-level tracing category for them; but they certainly do seem to be
directly related to HVM, so doing so would seem to be putting the cart
before the horse, as they say (although I could be convinced
otherwise).

 -George
diff mbox series

Patch

diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index ceb07229d1..8fb45dec76 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -21,6 +21,7 @@ 
 #define _XOPEN_SOURCE 600
 #include <stdio.h>
 #include <stdlib.h>
+#include <stdbool.h>
 #include <argp.h>
 #include <inttypes.h>
 #include <limits.h>
@@ -5305,8 +5306,11 @@  void hvm_process(struct pcpu_info *p)
 
     assert(p->current);
 
-    if(vcpu_set_data_type(p->current, VCPU_DATA_HVM))
-        return;
+    /* HVM_EMUL types show up in all contexts */
+    if(ri->evt.sub != 0x4) {
+        if(vcpu_set_data_type(p->current, VCPU_DATA_HVM))
+            return;
+    }
 
     switch ( ri->evt.sub ) {
     case 2: /* HVM_HANDLER */
@@ -9447,9 +9451,10 @@  static struct tl_assert_mask tl_assert_checks[TOPLEVEL_MAX] = {
 /* There are a lot of common assumptions for the various processing
  * routines.  Check them all in one place, doing something else if
  * they don't pass. */
-int toplevel_assert_check(int toplevel, struct pcpu_info *p)
+int toplevel_assert_check(int toplevel, struct record_info *ri, struct pcpu_info *p)
 {
     struct tl_assert_mask mask;
+    bool is_hvm_emul = (toplevel == TOPLEVEL_HVM) && (ri->evt.sub == 0x4);
 
     mask = tl_assert_checks[toplevel];
 
@@ -9459,7 +9464,7 @@  int toplevel_assert_check(int toplevel, struct pcpu_info *p)
         goto fail;
     }
 
-    if( mask.not_idle_domain )
+    if( mask.not_idle_domain && !is_hvm_emul)
     {
         /* Can't do this check w/o first doing above check */
         assert(mask.p_current);
@@ -9478,7 +9483,8 @@  int toplevel_assert_check(int toplevel, struct pcpu_info *p)
         v = p->current;
 
         if ( ! (v->data_type == VCPU_DATA_NONE
-                || v->data_type == mask.vcpu_data_mode) )
+                || v->data_type == mask.vcpu_data_mode
+                || is_hvm_emul) )
         {
             /* This may happen for track_dirty_vram, which causes a SHADOW_WRMAP_BF trace f/ dom0 */
             fprintf(warn, "WARNING: Unexpected vcpu data type for d%dv%d on proc %d! Expected %d got %d. Not processing\n",
@@ -9525,7 +9531,7 @@  void process_record(struct pcpu_info *p) {
         return;
 
     /* Unify toplevel assertions */
-    if ( toplevel_assert_check(toplevel, p) )
+    if ( toplevel_assert_check(toplevel, ri, p) )
     {
         switch(toplevel) {
         case TRC_GEN_MAIN: