diff mbox series

[03/16] x86/traps: Factor out exception_fixup() and make printing consistent

Message ID 20200501225838.9866-4-andrew.cooper3@citrix.com (mailing list archive)
State Superseded
Headers show
Series x86: Support for CET Supervisor Shadow Stacks | expand

Commit Message

Andrew Cooper May 1, 2020, 10:58 p.m. UTC
UD faults never had any diagnostics printed, and the others were inconsistent.

Don't use dprintk() because identifying traps.c is actively unhelpful in the
message, as it is the location of the fixup, not the fault.  Use the new
vec_name() infrastructure, rather than leaving raw numbers for the log.

  (XEN) Running stub recovery selftests...
  (XEN) Fixup #UD[0000]: ffff82d07fffd040 [ffff82d07fffd040] -> ffff82d0403ac9d6
  (XEN) Fixup #GP[0000]: ffff82d07fffd041 [ffff82d07fffd041] -> ffff82d0403ac9d6
  (XEN) Fixup #SS[0000]: ffff82d07fffd040 [ffff82d07fffd040] -> ffff82d0403ac9d6
  (XEN) Fixup #BP[0000]: ffff82d07fffd041 [ffff82d07fffd041] -> ffff82d0403ac9d6

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/traps.c | 68 ++++++++++++++++++++++++----------------------------
 1 file changed, 31 insertions(+), 37 deletions(-)

Comments

Jan Beulich May 4, 2020, 1:20 p.m. UTC | #1
On 02.05.2020 00:58, Andrew Cooper wrote:
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -774,10 +774,27 @@ static void do_reserved_trap(struct cpu_user_regs *regs)
>            trapnr, vec_name(trapnr), regs->error_code);
>  }
>  
> +static bool exception_fixup(struct cpu_user_regs *regs, bool print)
> +{
> +    unsigned long fixup = search_exception_table(regs);
> +
> +    if ( unlikely(fixup == 0) )
> +        return false;
> +
> +    /* Can currently be triggered by guests.  Make sure we ratelimit. */
> +    if ( IS_ENABLED(CONFIG_DEBUG) && print )

I didn't think we consider dprintk()-s a possible security issue.
Why would we consider so a printk() hidden behind
IS_ENABLED(CONFIG_DEBUG)? IOW I think one of XENLOG_GUEST and
IS_ENABLED(CONFIG_DEBUG) wants dropping.

> @@ -1466,12 +1468,11 @@ void do_page_fault(struct cpu_user_regs *regs)
>          if ( pf_type != real_fault )
>              return;
>  
> -        if ( likely((fixup = search_exception_table(regs)) != 0) )
> +        if ( likely(exception_fixup(regs, false)) )
>          {
>              perfc_incr(copy_user_faults);
>              if ( unlikely(regs->error_code & PFEC_reserved_bit) )
>                  reserved_bit_page_fault(addr, regs);
> -            regs->rip = fixup;

I'm afraid this modification can't validly be pulled ahead -
show_execution_state(), as called from reserved_bit_page_fault(),
wants to / should print the original RIP, not the fixed up one.

Jan
Andrew Cooper May 11, 2020, 3:14 p.m. UTC | #2
On 04/05/2020 14:20, Jan Beulich wrote:
> On 02.05.2020 00:58, Andrew Cooper wrote:
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -774,10 +774,27 @@ static void do_reserved_trap(struct cpu_user_regs *regs)
>>            trapnr, vec_name(trapnr), regs->error_code);
>>  }
>>  
>> +static bool exception_fixup(struct cpu_user_regs *regs, bool print)
>> +{
>> +    unsigned long fixup = search_exception_table(regs);
>> +
>> +    if ( unlikely(fixup == 0) )
>> +        return false;
>> +
>> +    /* Can currently be triggered by guests.  Make sure we ratelimit. */
>> +    if ( IS_ENABLED(CONFIG_DEBUG) && print )
> I didn't think we consider dprintk()-s a possible security issue.
> Why would we consider so a printk() hidden behind
> IS_ENABLED(CONFIG_DEBUG)? IOW I think one of XENLOG_GUEST and
> IS_ENABLED(CONFIG_DEBUG) wants dropping.

Who said anything about a security issue?

I'm deliberately not using dprintk() for the reasons explained in the
commit message, so IS_ENABLED(CONFIG_DEBUG) is to cover that.

XENLOG_GUEST is because everything (other than the boot path) hitting
this caused directly by a guest action, and needs rate-limiting.  This
was not consistent before.

>
>> @@ -1466,12 +1468,11 @@ void do_page_fault(struct cpu_user_regs *regs)
>>          if ( pf_type != real_fault )
>>              return;
>>  
>> -        if ( likely((fixup = search_exception_table(regs)) != 0) )
>> +        if ( likely(exception_fixup(regs, false)) )
>>          {
>>              perfc_incr(copy_user_faults);
>>              if ( unlikely(regs->error_code & PFEC_reserved_bit) )
>>                  reserved_bit_page_fault(addr, regs);
>> -            regs->rip = fixup;
> I'm afraid this modification can't validly be pulled ahead -
> show_execution_state(), as called from reserved_bit_page_fault(),
> wants to / should print the original RIP, not the fixed up one.

This path is bogus to begin with.  Any RSVD pagefault (other than the
Shadow MMIO fastpath, handled earlier) is a bug in Xen and should be
fatal rather than just a warning on extable-tagged instructions.

Amongst other things, it would consistent an L1TF-vulnerable gadget. 
The MMIO fastpath is only safe-ish because it also inverts the upper
address bits.

~Andrew
Jan Beulich May 12, 2020, 1:05 p.m. UTC | #3
On 11.05.2020 17:14, Andrew Cooper wrote:
> On 04/05/2020 14:20, Jan Beulich wrote:
>> On 02.05.2020 00:58, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -774,10 +774,27 @@ static void do_reserved_trap(struct cpu_user_regs *regs)
>>>            trapnr, vec_name(trapnr), regs->error_code);
>>>  }
>>>  
>>> +static bool exception_fixup(struct cpu_user_regs *regs, bool print)
>>> +{
>>> +    unsigned long fixup = search_exception_table(regs);
>>> +
>>> +    if ( unlikely(fixup == 0) )
>>> +        return false;
>>> +
>>> +    /* Can currently be triggered by guests.  Make sure we ratelimit. */
>>> +    if ( IS_ENABLED(CONFIG_DEBUG) && print )
>> I didn't think we consider dprintk()-s a possible security issue.
>> Why would we consider so a printk() hidden behind
>> IS_ENABLED(CONFIG_DEBUG)? IOW I think one of XENLOG_GUEST and
>> IS_ENABLED(CONFIG_DEBUG) wants dropping.
> 
> Who said anything about a security issue?

The need to rate limit is (among other aspects) to prevent a
(logspam) security issue, isn't it?

> I'm deliberately not using dprintk() for the reasons explained in the
> commit message, so IS_ENABLED(CONFIG_DEBUG) is to cover that.

Which is fine, and which I understood.

> XENLOG_GUEST is because everything (other than the boot path) hitting
> this caused directly by a guest action, and needs rate-limiting.  This
> was not consistent before.

But why do we need both CONFIG_DEBUG and XENLOG_GUEST? In a debug
build I think we'd better know of such events under the control
of the general log level setting, not under that of guest specific
messages.

>>> @@ -1466,12 +1468,11 @@ void do_page_fault(struct cpu_user_regs *regs)
>>>          if ( pf_type != real_fault )
>>>              return;
>>>  
>>> -        if ( likely((fixup = search_exception_table(regs)) != 0) )
>>> +        if ( likely(exception_fixup(regs, false)) )
>>>          {
>>>              perfc_incr(copy_user_faults);
>>>              if ( unlikely(regs->error_code & PFEC_reserved_bit) )
>>>                  reserved_bit_page_fault(addr, regs);
>>> -            regs->rip = fixup;
>> I'm afraid this modification can't validly be pulled ahead -
>> show_execution_state(), as called from reserved_bit_page_fault(),
>> wants to / should print the original RIP, not the fixed up one.
> 
> This path is bogus to begin with.  Any RSVD pagefault (other than the
> Shadow MMIO fastpath, handled earlier) is a bug in Xen and should be
> fatal rather than just a warning on extable-tagged instructions.

I could see reasons to convert to a fatal exception (without looking
up fixups), but then in a separate change with suitable justification.
I'd like to point out though that this would convert a logspam kind
of DoS to a Xen crash one, in case such a PTE would indeed be created
anywhere by mistake.

However, it is not helpful to the diagnosis of such a problem if the
logged address points at the fixup code rather than the faulting one.

Jan
Andrew Cooper May 26, 2020, 6:06 p.m. UTC | #4
On 12/05/2020 14:05, Jan Beulich wrote:
> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
>
> On 11.05.2020 17:14, Andrew Cooper wrote:
>> On 04/05/2020 14:20, Jan Beulich wrote:
>>> On 02.05.2020 00:58, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/traps.c
>>>> +++ b/xen/arch/x86/traps.c
>>>> @@ -774,10 +774,27 @@ static void do_reserved_trap(struct cpu_user_regs *regs)
>>>>            trapnr, vec_name(trapnr), regs->error_code);
>>>>  }
>>>>  
>>>> +static bool exception_fixup(struct cpu_user_regs *regs, bool print)
>>>> +{
>>>> +    unsigned long fixup = search_exception_table(regs);
>>>> +
>>>> +    if ( unlikely(fixup == 0) )
>>>> +        return false;
>>>> +
>>>> +    /* Can currently be triggered by guests.  Make sure we ratelimit. */
>>>> +    if ( IS_ENABLED(CONFIG_DEBUG) && print )
>>> I didn't think we consider dprintk()-s a possible security issue.
>>> Why would we consider so a printk() hidden behind
>>> IS_ENABLED(CONFIG_DEBUG)? IOW I think one of XENLOG_GUEST and
>>> IS_ENABLED(CONFIG_DEBUG) wants dropping.
>> Who said anything about a security issue?
> The need to rate limit is (among other aspects) to prevent a
> (logspam) security issue, isn't it?

Rate limiting (from a security aspect) is a stopgap solution to relieve
incidental pressure on the various global spinlocks involved.

It specifically does not prevent a guest from trivially filling the
console ring with junk, or for that junk to be written to
/var/log/xen/hypervisor.log at an alarming rate, both of which are
issues in production setups, but not security issues.

Technical solutions to these problems do exist, such as deleting the
offending printk(), or maintaining per-guest console rings, but both
come with downsides in terms of usability, which similarly impacts
production setups.


What ratelimiting even in debug builds gets you is a quick spate of
printks() (e.g. any new sshd connection on an AMD system where the
MSR_VIRT_SPEC_CTRL patch is still uncommitted, and the default WRMSR
behaviour breaking wrmsr_safe() logic in Linux) not wasting an
unreasonable quantity of space in the console ring.

~Andrew
Jan Beulich May 27, 2020, 7:01 a.m. UTC | #5
On 26.05.2020 20:06, Andrew Cooper wrote:
> On 12/05/2020 14:05, Jan Beulich wrote:
>> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
>>
>> On 11.05.2020 17:14, Andrew Cooper wrote:
>>> On 04/05/2020 14:20, Jan Beulich wrote:
>>>> On 02.05.2020 00:58, Andrew Cooper wrote:
>>>>> --- a/xen/arch/x86/traps.c
>>>>> +++ b/xen/arch/x86/traps.c
>>>>> @@ -774,10 +774,27 @@ static void do_reserved_trap(struct cpu_user_regs *regs)
>>>>>            trapnr, vec_name(trapnr), regs->error_code);
>>>>>  }
>>>>>  
>>>>> +static bool exception_fixup(struct cpu_user_regs *regs, bool print)
>>>>> +{
>>>>> +    unsigned long fixup = search_exception_table(regs);
>>>>> +
>>>>> +    if ( unlikely(fixup == 0) )
>>>>> +        return false;
>>>>> +
>>>>> +    /* Can currently be triggered by guests.  Make sure we ratelimit. */
>>>>> +    if ( IS_ENABLED(CONFIG_DEBUG) && print )
>>>> I didn't think we consider dprintk()-s a possible security issue.
>>>> Why would we consider so a printk() hidden behind
>>>> IS_ENABLED(CONFIG_DEBUG)? IOW I think one of XENLOG_GUEST and
>>>> IS_ENABLED(CONFIG_DEBUG) wants dropping.
>>> Who said anything about a security issue?
>> The need to rate limit is (among other aspects) to prevent a
>> (logspam) security issue, isn't it?
> 
> Rate limiting (from a security aspect) is a stopgap solution to relieve
> incidental pressure on the various global spinlocks involved.
> 
> It specifically does not prevent a guest from trivially filling the
> console ring with junk, or for that junk to be written to
> /var/log/xen/hypervisor.log at an alarming rate, both of which are
> issues in production setups, but not security issues.

IOW you assert that e.g. XSA-141 should not have been issued?

> Technical solutions to these problems do exist, such as deleting the
> offending printk(), or maintaining per-guest console rings, but both
> come with downsides in terms of usability, which similarly impacts
> production setups.
> 
> 
> What ratelimiting even in debug builds gets you is a quick spate of
> printks() (e.g. any new sshd connection on an AMD system where the
> MSR_VIRT_SPEC_CTRL patch is still uncommitted, and the default WRMSR
> behaviour breaking wrmsr_safe() logic in Linux) not wasting an
> unreasonable quantity of space in the console ring.

Hmm, okay, I can accept this perspective. Since the other comment I
gave has been taken care of by re-arrangements in a separate patch:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e73f07f28a..737ab036d2 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -774,10 +774,27 @@  static void do_reserved_trap(struct cpu_user_regs *regs)
           trapnr, vec_name(trapnr), regs->error_code);
 }
 
+static bool exception_fixup(struct cpu_user_regs *regs, bool print)
+{
+    unsigned long fixup = search_exception_table(regs);
+
+    if ( unlikely(fixup == 0) )
+        return false;
+
+    /* Can currently be triggered by guests.  Make sure we ratelimit. */
+    if ( IS_ENABLED(CONFIG_DEBUG) && print )
+        printk(XENLOG_GUEST XENLOG_WARNING "Fixup #%s[%04x]: %p [%ps] -> %p\n",
+               vec_name(regs->entry_vector), regs->error_code,
+               _p(regs->rip), _p(regs->rip), _p(fixup));
+
+    regs->rip = fixup;
+
+    return true;
+}
+
 static void do_trap(struct cpu_user_regs *regs)
 {
     unsigned int trapnr = regs->entry_vector;
-    unsigned long fixup;
 
     if ( regs->error_code & X86_XEC_EXT )
         goto hardware_trap;
@@ -795,13 +812,8 @@  static void do_trap(struct cpu_user_regs *regs)
         return;
     }
 
-    if ( likely((fixup = search_exception_table(regs)) != 0) )
-    {
-        dprintk(XENLOG_ERR, "Trap %u: %p [%ps] -> %p\n",
-                trapnr, _p(regs->rip), _p(regs->rip), _p(fixup));
-        regs->rip = fixup;
+    if ( likely(exception_fixup(regs, true)) )
         return;
-    }
 
  hardware_trap:
     if ( debugger_trap_fatal(trapnr, regs) )
@@ -1109,11 +1121,8 @@  void do_invalid_op(struct cpu_user_regs *regs)
     }
 
  die:
-    if ( (fixup = search_exception_table(regs)) != 0 )
-    {
-        regs->rip = fixup;
+    if ( likely(exception_fixup(regs, true)) )
         return;
-    }
 
     if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
         return;
@@ -1129,15 +1138,8 @@  void do_int3(struct cpu_user_regs *regs)
 
     if ( !guest_mode(regs) )
     {
-        unsigned long fixup;
-
-        if ( (fixup = search_exception_table(regs)) != 0 )
-        {
-            dprintk(XENLOG_DEBUG, "Trap %u: %p [%ps] -> %p\n",
-                    TRAP_int3, _p(regs->rip), _p(regs->rip), _p(fixup));
-            regs->rip = fixup;
+        if ( likely(exception_fixup(regs, true)) )
             return;
-        }
 
         if ( !debugger_trap_fatal(TRAP_int3, regs) )
             printk(XENLOG_DEBUG "Hit embedded breakpoint at %p [%ps]\n",
@@ -1435,7 +1437,7 @@  static int fixup_page_fault(unsigned long addr, struct cpu_user_regs *regs)
  */
 void do_page_fault(struct cpu_user_regs *regs)
 {
-    unsigned long addr, fixup;
+    unsigned long addr;
     unsigned int error_code;
 
     addr = read_cr2();
@@ -1466,12 +1468,11 @@  void do_page_fault(struct cpu_user_regs *regs)
         if ( pf_type != real_fault )
             return;
 
-        if ( likely((fixup = search_exception_table(regs)) != 0) )
+        if ( likely(exception_fixup(regs, false)) )
         {
             perfc_incr(copy_user_faults);
             if ( unlikely(regs->error_code & PFEC_reserved_bit) )
                 reserved_bit_page_fault(addr, regs);
-            regs->rip = fixup;
             return;
         }
 
@@ -1529,7 +1530,6 @@  void do_general_protection(struct cpu_user_regs *regs)
 #ifdef CONFIG_PV
     struct vcpu *v = current;
 #endif
-    unsigned long fixup;
 
     if ( debugger_trap_entry(TRAP_gp_fault, regs) )
         return;
@@ -1596,13 +1596,8 @@  void do_general_protection(struct cpu_user_regs *regs)
 
  gp_in_kernel:
 
-    if ( likely((fixup = search_exception_table(regs)) != 0) )
-    {
-        dprintk(XENLOG_INFO, "GPF (%04x): %p [%ps] -> %p\n",
-                regs->error_code, _p(regs->rip), _p(regs->rip), _p(fixup));
-        regs->rip = fixup;
+    if ( likely(exception_fixup(regs, true)) )
         return;
-    }
 
  hardware_gp:
     if ( debugger_trap_fatal(TRAP_gp_fault, regs) )
@@ -1761,18 +1756,17 @@  void do_device_not_available(struct cpu_user_regs *regs)
 
     if ( !guest_mode(regs) )
     {
-        unsigned long fixup = search_exception_table(regs);
-
-        gprintk(XENLOG_ERR, "#NM: %p [%ps] -> %p\n",
-                _p(regs->rip), _p(regs->rip), _p(fixup));
         /*
          * We shouldn't be able to reach here, but for release builds have
          * the recovery logic in place nevertheless.
          */
-        ASSERT_UNREACHABLE();
-        BUG_ON(!fixup);
-        regs->rip = fixup;
-        return;
+        if ( exception_fixup(regs, true) )
+        {
+            ASSERT_UNREACHABLE();
+            return;
+        }
+
+        fatal_trap(regs, false);
     }
 
 #ifdef CONFIG_PV