diff mbox series

[v5,1/6] x86/debugger: Remove debugger_trap_entry()

Message ID 20220420141307.24153-2-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series Clean up common/arch split for debugger.h | expand

Commit Message

Andrew Cooper April 20, 2022, 2:13 p.m. UTC
From: Bobby Eshleman <bobby.eshleman@gmail.com>

debugger_trap_entry() is unrelated to the other contents of debugger.h.  It is
a no-op for everything other than #DB/#BP, and for those it invokes guest
debugging (CONFIG_GDBSX) not host debugging (CONFIG_CRASH_DEBUG).

Furthermore, the description of how to use debugger_trap_entry() is at best,
stale.  It is not called from all exception paths, and because the developer
is forced to modify Xen to perform debugging, editing debugger_trap_entry() is
not the way one would efficiently go about diagnosing the problem.

Simplify everything by expanding debugger_trap_entry() into its two non-empty
locations, fixing bugs with their positioning (vs early exceptions and curr
not being safe to deference) and for #DB, deferring the pause until the
changes in %dr6 are saved to v->arch.dr6 so the debugger can actually see
which condition triggered.

Signed-off-by: Bobby Eshleman <bobby.eshleman@gmail.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

v5:
 * Remove dead logic.  Move GDBSX changes into a later patch.
 * Rewrite commmit message.
---
 xen/arch/x86/include/asm/debugger.h | 42 ++-----------------------------------
 xen/arch/x86/traps.c                | 34 +++++++++++++-----------------
 2 files changed, 16 insertions(+), 60 deletions(-)

Comments

Jan Beulich April 21, 2022, 1:02 p.m. UTC | #1
On 20.04.2022 16:13, Andrew Cooper wrote:
> From: Bobby Eshleman <bobby.eshleman@gmail.com>
> 
> debugger_trap_entry() is unrelated to the other contents of debugger.h.  It is
> a no-op for everything other than #DB/#BP, and for those it invokes guest
> debugging (CONFIG_GDBSX) not host debugging (CONFIG_CRASH_DEBUG).
> 
> Furthermore, the description of how to use debugger_trap_entry() is at best,
> stale.  It is not called from all exception paths,

But on almost all (before this change) - the exception looks to be
#NM.

> and because the developer
> is forced to modify Xen to perform debugging, editing debugger_trap_entry() is
> not the way one would efficiently go about diagnosing the problem.

Shouldn't it be the remote end to request which exceptions it wants
to be notified of? If so, removing the hook invocation isn't very
helpful.

Jan
Andrew Cooper April 21, 2022, 5:23 p.m. UTC | #2
On 21/04/2022 14:02, Jan Beulich wrote:
> On 20.04.2022 16:13, Andrew Cooper wrote:
>> From: Bobby Eshleman <bobby.eshleman@gmail.com>
>>
>> debugger_trap_entry() is unrelated to the other contents of debugger.h.  It is
>> a no-op for everything other than #DB/#BP, and for those it invokes guest
>> debugging (CONFIG_GDBSX) not host debugging (CONFIG_CRASH_DEBUG).
>>
>> Furthermore, the description of how to use debugger_trap_entry() is at best,
>> stale.  It is not called from all exception paths,
> But on almost all (before this change) - the exception looks to be
> #NM.
>
>> and because the developer
>> is forced to modify Xen to perform debugging, editing debugger_trap_entry() is
>> not the way one would efficiently go about diagnosing the problem.
> Shouldn't it be the remote end to request which exceptions it wants
> to be notified of? If so, removing the hook invocation isn't very
> helpful.

That's not part of the gdb remote protocol.

In normal conditions, gdb gets to see see anything which manifests as a
signal.  It does not get to see anything which is resolved by the kernel
behind the scenes.  #NM you've already identified, and most #PF's would
count too.  Back in the 32bit days, Xen-induced #GP/#SS's for non-4G
segments would count too.

But in addition to filtering Xen's idea of "fixing up behind the
scenes", you also need to Xen to understand when to skip notifications
based on what a PV guest kernel can fix up, and this is getting even
further out of gdb's comfort zone.

debugger_trap_entry() is empty (WRT gdbstub) specifically because it's
description is nonsense in any practical debugging scenario.  And lets
not start on the fact that the lack of ability to invoke
pv_event_inject() means that any fault from userspace will livelock
under debugging.

Deleting it is absolutely the right way forward, because a theoretical
future with someone wiring this up would have to start again from scratch.

Not that qualifies as a good reason in isolation, do_trap() contains
unreachable logic because the compiler can't figure out that #DB/#BP are
handled via alternative paths, and the gdbsx logic is dead.

~Andrew
Jan Beulich April 22, 2022, 7:27 a.m. UTC | #3
On 21.04.2022 19:23, Andrew Cooper wrote:
> On 21/04/2022 14:02, Jan Beulich wrote:
>> On 20.04.2022 16:13, Andrew Cooper wrote:
>>> From: Bobby Eshleman <bobby.eshleman@gmail.com>
>>>
>>> debugger_trap_entry() is unrelated to the other contents of debugger.h.  It is
>>> a no-op for everything other than #DB/#BP, and for those it invokes guest
>>> debugging (CONFIG_GDBSX) not host debugging (CONFIG_CRASH_DEBUG).
>>>
>>> Furthermore, the description of how to use debugger_trap_entry() is at best,
>>> stale.  It is not called from all exception paths,
>> But on almost all (before this change) - the exception looks to be
>> #NM.
>>
>>> and because the developer
>>> is forced to modify Xen to perform debugging, editing debugger_trap_entry() is
>>> not the way one would efficiently go about diagnosing the problem.
>> Shouldn't it be the remote end to request which exceptions it wants
>> to be notified of? If so, removing the hook invocation isn't very
>> helpful.
> 
> That's not part of the gdb remote protocol.
> 
> In normal conditions, gdb gets to see see anything which manifests as a
> signal.  It does not get to see anything which is resolved by the kernel
> behind the scenes.  #NM you've already identified, and most #PF's would
> count too.  Back in the 32bit days, Xen-induced #GP/#SS's for non-4G
> segments would count too.
> 
> But in addition to filtering Xen's idea of "fixing up behind the
> scenes", you also need to Xen to understand when to skip notifications
> based on what a PV guest kernel can fix up, and this is getting even
> further out of gdb's comfort zone.
> 
> debugger_trap_entry() is empty (WRT gdbstub) specifically because it's
> description is nonsense in any practical debugging scenario.  And lets
> not start on the fact that the lack of ability to invoke
> pv_event_inject() means that any fault from userspace will livelock
> under debugging.
> 
> Deleting it is absolutely the right way forward, because a theoretical
> future with someone wiring this up would have to start again from scratch.
> 
> Not that qualifies as a good reason in isolation, do_trap() contains
> unreachable logic because the compiler can't figure out that #DB/#BP are
> handled via alternative paths, and the gdbsx logic is dead.

The patch description could certainly do with expanding some along these
lines.

Acked-by: Jan Beulich <jbeulich@suse.com>

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/include/asm/debugger.h b/xen/arch/x86/include/asm/debugger.h
index 221bcde13796..e83b346a21d1 100644
--- a/xen/arch/x86/include/asm/debugger.h
+++ b/xen/arch/x86/include/asm/debugger.h
@@ -5,19 +5,12 @@ 
  * 
  * Each debugger should define two functions here:
  * 
- * 1. debugger_trap_entry(): 
- *  Called at start of any synchronous fault or trap, before any other work
- *  is done. The idea is that if your debugger deliberately caused the trap
- *  (e.g. to implement breakpoints or data watchpoints) then you can take
- *  appropriate action and return a non-zero value to cause early exit from
- *  the trap function.
- * 
- * 2. debugger_trap_fatal():
+ * debugger_trap_fatal():
  *  Called when Xen is about to give up and crash. Typically you will use this
  *  hook to drop into a debug session. It can also be used to hook off
  *  deliberately caused traps (which you then handle and return non-zero).
  *
- * 3. debugger_trap_immediate():
+ * debugger_trap_immediate():
  *  Called if we want to drop into a debugger now.  This is essentially the
  *  same as debugger_trap_fatal, except that we use the current register state
  *  rather than the state which was in effect when we took the trap.
@@ -49,31 +42,6 @@  static inline bool debugger_trap_fatal(
 /* Int3 is a trivial way to gather cpu_user_regs context. */
 #define debugger_trap_immediate() __asm__ __volatile__ ( "int3" );
 
-static inline bool debugger_trap_entry(
-    unsigned int vector, struct cpu_user_regs *regs)
-{
-    /*
-     * This function is called before any checks are made.  Amongst other
-     * things, be aware that during early boot, current is not a safe pointer
-     * to follow.
-     */
-    struct vcpu *v = current;
-
-    if ( vector != TRAP_int3 && vector != TRAP_debug )
-        return false;
-
-    if ( guest_mode(regs) && guest_kernel_mode(v, regs) &&
-         v->domain->debugger_attached  )
-    {
-        if ( vector != TRAP_debug ) /* domain pause is good enough */
-            current->arch.gdbsx_vcpu_event = vector;
-        domain_pause_for_debugger();
-        return true;
-    }
-
-    return false;
-}
-
 #else
 
 static inline bool debugger_trap_fatal(
@@ -84,12 +52,6 @@  static inline bool debugger_trap_fatal(
 
 #define debugger_trap_immediate() ((void)0)
 
-static inline bool debugger_trap_entry(
-    unsigned int vector, struct cpu_user_regs *regs)
-{
-    return false;
-}
-
 #endif
 
 #ifdef CONFIG_GDBSX
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 4c38f6c01539..84cd038dc38b 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -946,9 +946,6 @@  void do_trap(struct cpu_user_regs *regs)
     if ( regs->error_code & X86_XEC_EXT )
         goto hardware_trap;
 
-    if ( debugger_trap_entry(trapnr, regs) )
-        return;
-
     ASSERT(trapnr < 32);
 
     if ( guest_mode(regs) )
@@ -1177,9 +1174,6 @@  void do_invalid_op(struct cpu_user_regs *regs)
     int id = -1, lineno;
     const struct virtual_region *region;
 
-    if ( debugger_trap_entry(TRAP_invalid_op, regs) )
-        return;
-
     if ( likely(guest_mode(regs)) )
     {
         if ( pv_emulate_invalid_op(regs) )
@@ -1284,8 +1278,7 @@  void do_invalid_op(struct cpu_user_regs *regs)
 
 void do_int3(struct cpu_user_regs *regs)
 {
-    if ( debugger_trap_entry(TRAP_int3, regs) )
-        return;
+    struct vcpu *curr = current;
 
     if ( !guest_mode(regs) )
     {
@@ -1299,6 +1292,13 @@  void do_int3(struct cpu_user_regs *regs)
         return;
     }
 
+    if ( guest_kernel_mode(curr, regs) && curr->domain->debugger_attached )
+    {
+        curr->arch.gdbsx_vcpu_event = TRAP_int3;
+        domain_pause_for_debugger();
+        return;
+    }
+
     pv_inject_hw_exception(TRAP_int3, X86_EVENT_NO_EC);
 }
 
@@ -1575,9 +1575,6 @@  void do_page_fault(struct cpu_user_regs *regs)
     /* fixup_page_fault() might change regs->error_code, so cache it here. */
     error_code = regs->error_code;
 
-    if ( debugger_trap_entry(TRAP_page_fault, regs) )
-        return;
-
     perfc_incr(page_faults);
 
     /* Any shadow stack access fault is a bug in Xen. */
@@ -1676,9 +1673,6 @@  void do_general_protection(struct cpu_user_regs *regs)
     struct vcpu *v = current;
 #endif
 
-    if ( debugger_trap_entry(TRAP_gp_fault, regs) )
-        return;
-
     if ( regs->error_code & X86_XEC_EXT )
         goto hardware_gp;
 
@@ -1971,9 +1965,6 @@  void do_debug(struct cpu_user_regs *regs)
     /* Stash dr6 as early as possible. */
     dr6 = read_debugreg(6);
 
-    if ( debugger_trap_entry(TRAP_debug, regs) )
-        return;
-
     /*
      * At the time of writing (March 2018), on the subject of %dr6:
      *
@@ -2082,6 +2073,12 @@  void do_debug(struct cpu_user_regs *regs)
     v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT);
     v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT);
 
+    if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
+    {
+        domain_pause_for_debugger();
+        return;
+    }
+
     pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
@@ -2097,9 +2094,6 @@  void do_entry_CP(struct cpu_user_regs *regs)
     const char *err = "??";
     unsigned int ec = regs->error_code;
 
-    if ( debugger_trap_entry(X86_EXC_CP, regs) )
-        return;
-
     /* Decode ec if possible */
     if ( ec < ARRAY_SIZE(errors) && errors[ec][0] )
         err = errors[ec];