Message ID | 20230915203628.837732-5-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/pv: #DB vs %dr6 fixes, part 2 | expand |
On 9/16/23 05:36, Andrew Cooper wrote: > @@ -658,7 +660,7 @@ static int cf_check rep_ins( > > ++*reps; > > - if ( poc->bpmatch || hypercall_preempt_check() ) > + if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() ) > break; > > /* x86_emulate() clips the repetition count to ensure we don't wrap. */ (snip) > @@ -726,7 +729,7 @@ static int cf_check rep_outs( > > ++*reps; > > - if ( poc->bpmatch || hypercall_preempt_check() ) > + if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() ) > break; > > /* x86_emulate() clips the repetition count to ensure we don't wrap. */ These two hunks look like a behavioral change in singlestep mode. This is actually a fix, assuming the emulator previously did not handle 'rep {in,out}s' in singlestep mode correctly, since it now checks for PENDING_DBG.BS in addition to PENDING_DBG.B[0-4]. If this is the case, (at least) this part of the patch looks like a stable candidate. You might want to edit the commit message to reflect that. (Ideally all the HWBP handling should be part of the emulator logic, but I don't see an easy way to generalize the PV-specific logic. It could be its own patch anyway.)
On 16/09/2023 8:36 am, Jinoh Kang wrote: > On 9/16/23 05:36, Andrew Cooper wrote: >> @@ -658,7 +660,7 @@ static int cf_check rep_ins( >> >> ++*reps; >> >> - if ( poc->bpmatch || hypercall_preempt_check() ) >> + if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() ) >> break; >> >> /* x86_emulate() clips the repetition count to ensure we don't wrap. */ > (snip) > >> @@ -726,7 +729,7 @@ static int cf_check rep_outs( >> >> ++*reps; >> >> - if ( poc->bpmatch || hypercall_preempt_check() ) >> + if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() ) >> break; >> >> /* x86_emulate() clips the repetition count to ensure we don't wrap. */ > These two hunks look like a behavioral change in singlestep mode. > > This is actually a fix, assuming the emulator previously did not handle > 'rep {in,out}s' in singlestep mode correctly, since it now checks for > PENDING_DBG.BS in addition to PENDING_DBG.B[0-4]. The emulator should handle this correctly already. I had been meaning to test this, but hadn't so far - guess I should fix that. x86_emulate.c line 511 in get_rep_prefix() sets max_reps to 1 if SingleStepping is active. This in turn causes the emulator to use the io_{read,write}() hook rather than the rep hook. This is important, because singlestepping becoming pending is normally evaluated at the end of the instruction. i.e. in this example it wouldn't show up in pending_dbg (yet). What definitely is broken here is the recognition of a data breakpoint on the memory operand of the INS/OUTS instruction, but it's broken everywhere else for PV guest emulation too, so needs to go on the TODO list. > If this is the case, (at least) this part of the patch looks like a stable > candidate. You might want to edit the commit message to reflect that. We're going to try and get all the %dr6 handling issues sorted, then decide whether to backport the lot or not. It will entirely depend on how invasive the fixes end up being, but I hope they'll be ok to backport. > (Ideally all the HWBP handling should be part of the emulator logic, but > I don't see an easy way to generalize the PV-specific logic. It could > be its own patch anyway.) The emulator does have HWBP handling for HVM guests, because that's architectural behaviour to look in the TSS. PV guests are the odd-ones-out with non-standard behaviour. ~Andrew
On 9/16/23 23:00, Andrew Cooper wrote: > On 16/09/2023 8:36 am, Jinoh Kang wrote: (snip) >> These two hunks look like a behavioral change in singlestep mode. >> >> This is actually a fix, assuming the emulator previously did not handle >> 'rep {in,out}s' in singlestep mode correctly, since it now checks for >> PENDING_DBG.BS in addition to PENDING_DBG.B[0-4]. > > The emulator should handle this correctly already. I had been meaning > to test this, but hadn't so far - guess I should fix that. > > x86_emulate.c line 511 in get_rep_prefix() sets max_reps to 1 if > SingleStepping is active. Thanks for informing. Although that EFLAGS.TF check in the macro now makes me--almost reflexively--imagine all sort of creative pathological cases, like "mov ss, ax; rep ins"... > > This in turn causes the emulator to use the io_{read,write}() hook > rather than the rep hook. Right. (Frankly that part of code has too many branches to be readable. Also the "presumably most efficient" part of the comment hints at perf optimization sans any profiling attempts...) > > This is important, because singlestepping becoming pending is normally > evaluated at the end of the instruction. i.e. in this example it > wouldn't show up in pending_dbg (yet). > > What definitely is broken here is the recognition of a data breakpoint > on the memory operand of the INS/OUTS instruction, but it's broken > everywhere else for PV guest emulation too, so needs to go on the TODO list. (Another thing definitely broken here is the recognition of I/O breakpoints post the first iteration. Maybe it would be beneficial to do differential testing between the {read,write}_io slowpath and rep_{in,out}s fastpath.) > >> If this is the case, (at least) this part of the patch looks like a stable >> candidate. You might want to edit the commit message to reflect that. > > We're going to try and get all the %dr6 handling issues sorted, then > decide whether to backport the lot or not. It will entirely depend on > how invasive the fixes end up being, but I hope they'll be ok to backport. > >> (Ideally all the HWBP handling should be part of the emulator logic, but >> I don't see an easy way to generalize the PV-specific logic. It could >> be its own patch anyway.) > > The emulator does have HWBP handling for HVM guests, because that's > architectural behaviour to look in the TSS. I was under such impression since I didn't immediately notice I/O breakpoint handling in ins/outs path; maybe I haven't looked into it deeper... > > PV guests are the odd-ones-out with non-standard behaviour. > > ~Andrew
On 16/09/2023 3:42 pm, Jinoh Kang wrote: > On 9/16/23 23:00, Andrew Cooper wrote: >> On 16/09/2023 8:36 am, Jinoh Kang wrote: > (snip) > >>> These two hunks look like a behavioral change in singlestep mode. >>> >>> This is actually a fix, assuming the emulator previously did not handle >>> 'rep {in,out}s' in singlestep mode correctly, since it now checks for >>> PENDING_DBG.BS in addition to PENDING_DBG.B[0-4]. >> The emulator should handle this correctly already. I had been meaning >> to test this, but hadn't so far - guess I should fix that. >> >> x86_emulate.c line 511 in get_rep_prefix() sets max_reps to 1 if >> SingleStepping is active. > Thanks for informing. Although that EFLAGS.TF check in the macro now > makes me--almost reflexively--imagine all sort of creative pathological > cases, like "mov ss, ax; rep ins"... Yeah isn't x86 fun... That's why it's important to accumulate in pending_dbg, which does hold the breakpoint recognition across point where the blocked-by-movss state inhibits the generation of #DB, and causes it to generate #DB on the subsequent instruction boundary. >> This in turn causes the emulator to use the io_{read,write}() hook >> rather than the rep hook. > Right. (Frankly that part of code has too many branches to be readable. > Also the "presumably most efficient" part of the comment hints at perf > optimization sans any profiling attempts...) Yeah it's not the greatest code. I cant remember the exact history there, but IIRC we used to handle the rep looping entirely in x86_emulate() and there were no rep hooks. There are two cases where the perf hit did get noticed. REP OUTSB to the qemu/bochs/Xen debug port, and Windows which does a REP MOVSB across the various bits of PCI MMIO. >> This is important, because singlestepping becoming pending is normally >> evaluated at the end of the instruction. i.e. in this example it >> wouldn't show up in pending_dbg (yet). >> >> What definitely is broken here is the recognition of a data breakpoint >> on the memory operand of the INS/OUTS instruction, but it's broken >> everywhere else for PV guest emulation too, so needs to go on the TODO list. > (Another thing definitely broken here is the recognition of I/O breakpoints > post the first iteration. Maybe it would be beneficial to do differential > testing between the {read,write}_io slowpath and rep_{in,out}s fastpath.) PV, or HVM guest? IO breakpoint recognition is missing generally in HVM guests. I opened https://gitlab.com/xen-project/xen/-/work_items/171 yesterday for it, because it's not 30s work to fix. But, IO breakpoints are invariant of REP. The IO port doesn't change, so the breakpoint(s) either match on every iteration, or not at all. (Of course, this doesn't mean that Xen is handling the recognition correctly.) In this patch, there is IO breakpoint recognition on all 4 of the PV IO port hooks, so I think it ought to work properly, whether it's the first iteration or not. Obviously if not, I've got another bug to figure out... It's worth saying that patch 2 does fix a bug (an X86EMUL_OKAY/DONE confusion), but I'm pretty sure it was only a singlestep on admin-ok PV IO ports which would be skipped, not the breakpoints. ~Andrew
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c index 0d9f84f458ba..2f73ed0682e1 100644 --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -34,7 +34,6 @@ struct priv_op_ctxt { unsigned long base, limit; } cs; char *io_emul_stub; - unsigned int bpmatch; }; /* I/O emulation helpers. Use non-standard calling conventions. */ @@ -367,7 +366,8 @@ static int cf_check read_io( if ( !guest_io_okay(port, bytes, curr, ctxt->regs) ) return X86EMUL_UNHANDLEABLE; - poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes); + poc->ctxt.retire.pending_dbg |= + check_guest_io_breakpoint(curr, port, bytes); if ( admin_io_okay(port, bytes, currd) ) { @@ -472,7 +472,8 @@ static int cf_check write_io( if ( !guest_io_okay(port, bytes, curr, ctxt->regs) ) return X86EMUL_UNHANDLEABLE; - poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes); + poc->ctxt.retire.pending_dbg |= + check_guest_io_breakpoint(curr, port, bytes); if ( admin_io_okay(port, bytes, currd) ) { @@ -636,7 +637,8 @@ static int cf_check rep_ins( return X86EMUL_EXCEPTION; } - poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep); + poc->ctxt.retire.pending_dbg |= + check_guest_io_breakpoint(curr, port, bytes_per_rep); while ( *reps < goal ) { @@ -658,7 +660,7 @@ static int cf_check rep_ins( ++*reps; - if ( poc->bpmatch || hypercall_preempt_check() ) + if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() ) break; /* x86_emulate() clips the repetition count to ensure we don't wrap. */ @@ -703,7 +705,8 @@ static int cf_check rep_outs( return X86EMUL_EXCEPTION; } - poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep); + poc->ctxt.retire.pending_dbg |= + check_guest_io_breakpoint(curr, port, bytes_per_rep); while ( *reps < goal ) { @@ -726,7 +729,7 @@ static int cf_check rep_outs( ++*reps; - if ( poc->bpmatch || hypercall_preempt_check() ) + if ( poc->ctxt.retire.pending_dbg || hypercall_preempt_check() ) break; /* x86_emulate() clips the repetition count to ensure we don't wrap. */ @@ -1360,12 +1363,9 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs) case X86EMUL_OKAY: ASSERT(!curr->arch.pv.trap_bounce.flags); - if ( ctxt.ctxt.retire.singlestep ) - ctxt.bpmatch |= DR_STEP; - - if ( ctxt.bpmatch ) + if ( ctxt.ctxt.retire.pending_dbg ) { - curr->arch.dr6 |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE; + curr->arch.dr6 |= ctxt.ctxt.retire.pending_dbg | DR_STATUS_RESERVED_ONE; pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); }