Message ID | 57583D2B02000078000F31CE@prv-mh.provo.novell.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 08/06/16 14:43, Jan Beulich wrote: > Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() > already does (and that hvm_virtual_to_linear_addr() doesn't alter it). > > At once increase the length passed to hvm_virtual_to_linear_addr() by > one: There definitely needs to be at least one more opcode byte, and we > can avoid missing a wraparound case this way. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> I looked into this when you suggested it, but it latches the wrong eip in the emulation state, and you will end up re-emulating the ud2a instruction, rather than the following instruction. I would be tempted just to leave this code in its current condition. ~Andrew
>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: > On 08/06/16 14:43, Jan Beulich wrote: >> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >> >> At once increase the length passed to hvm_virtual_to_linear_addr() by >> one: There definitely needs to be at least one more opcode byte, and we >> can avoid missing a wraparound case this way. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > I looked into this when you suggested it, but it latches the wrong eip > in the emulation state, and you will end up re-emulating the ud2a > instruction, rather than the following instruction. Where is there any latching of eip? All hvm_emulate_prepare() does is storing the regs pointer. Jan
On 09/06/16 13:31, Jan Beulich wrote: >>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >> On 08/06/16 14:43, Jan Beulich wrote: >>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>> >>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>> one: There definitely needs to be at least one more opcode byte, and we >>> can avoid missing a wraparound case this way. >>> >>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> I looked into this when you suggested it, but it latches the wrong eip >> in the emulation state, and you will end up re-emulating the ud2a >> instruction, rather than the following instruction. > Where is there any latching of eip? All hvm_emulate_prepare() does > is storing the regs pointer. Oh - so it does. I clearly looked over it too quickly. What wraparound issue are you referring to? Adding 1 will cause incorrect behaviour when the emulation prefix ends at the segment limit. ~Andrew
>>> On 09.06.16 at 16:06, <andrew.cooper3@citrix.com> wrote: > On 09/06/16 13:31, Jan Beulich wrote: >>>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >>> On 08/06/16 14:43, Jan Beulich wrote: >>>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>>> >>>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>>> one: There definitely needs to be at least one more opcode byte, and we >>>> can avoid missing a wraparound case this way. >>>> >>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>> I looked into this when you suggested it, but it latches the wrong eip >>> in the emulation state, and you will end up re-emulating the ud2a >>> instruction, rather than the following instruction. >> Where is there any latching of eip? All hvm_emulate_prepare() does >> is storing the regs pointer. > > Oh - so it does. I clearly looked over it too quickly. > > What wraparound issue are you referring to? Adding 1 will cause > incorrect behaviour when the emulation prefix ends at the segment limit. I don't think so: The prefix together with the actual instruction encoding should be viewed as a single instruction, and it crossing the segment limit should #GP. It wrapping at the prefix/encoding boundary is the case that I'm specifically referring to (this case should also #GP, but wouldn't without this adjustment). Jan
On 09/06/16 15:13, Jan Beulich wrote: >>>> On 09.06.16 at 16:06, <andrew.cooper3@citrix.com> wrote: >> On 09/06/16 13:31, Jan Beulich wrote: >>>>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >>>> On 08/06/16 14:43, Jan Beulich wrote: >>>>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>>>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>>>> >>>>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>>>> one: There definitely needs to be at least one more opcode byte, and we >>>>> can avoid missing a wraparound case this way. >>>>> >>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>> I looked into this when you suggested it, but it latches the wrong eip >>>> in the emulation state, and you will end up re-emulating the ud2a >>>> instruction, rather than the following instruction. >>> Where is there any latching of eip? All hvm_emulate_prepare() does >>> is storing the regs pointer. >> Oh - so it does. I clearly looked over it too quickly. >> >> What wraparound issue are you referring to? Adding 1 will cause >> incorrect behaviour when the emulation prefix ends at the segment limit. > I don't think so: The prefix together with the actual instruction > encoding should be viewed as a single instruction, and it crossing > the segment limit should #GP. It wrapping at the prefix/encoding > boundary is the case that I'm specifically referring to (this case > should also #GP, but wouldn't without this adjustment). But the force emulation prefix specifically doesn't behave like other prefixes. It doesn't count towards the 15 byte instruction limit, and if the emulated instruction does fault, we want the fault pointing at the emulated instruction, not the force prefix. We should avoid making any link. ~Andrew
>>> On 09.06.16 at 16:27, <andrew.cooper3@citrix.com> wrote: > On 09/06/16 15:13, Jan Beulich wrote: >>>>> On 09.06.16 at 16:06, <andrew.cooper3@citrix.com> wrote: >>> On 09/06/16 13:31, Jan Beulich wrote: >>>>>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >>>>> On 08/06/16 14:43, Jan Beulich wrote: >>>>>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>>>>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>>>>> >>>>>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>>>>> one: There definitely needs to be at least one more opcode byte, and we >>>>>> can avoid missing a wraparound case this way. >>>>>> >>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>> I looked into this when you suggested it, but it latches the wrong eip >>>>> in the emulation state, and you will end up re-emulating the ud2a >>>>> instruction, rather than the following instruction. >>>> Where is there any latching of eip? All hvm_emulate_prepare() does >>>> is storing the regs pointer. >>> Oh - so it does. I clearly looked over it too quickly. >>> >>> What wraparound issue are you referring to? Adding 1 will cause >>> incorrect behaviour when the emulation prefix ends at the segment limit. >> I don't think so: The prefix together with the actual instruction >> encoding should be viewed as a single instruction, and it crossing >> the segment limit should #GP. It wrapping at the prefix/encoding >> boundary is the case that I'm specifically referring to (this case >> should also #GP, but wouldn't without this adjustment). > > But the force emulation prefix specifically doesn't behave like other > prefixes. > > It doesn't count towards the 15 byte instruction limit, and if the > emulated instruction does fault, we want the fault pointing at the > emulated instruction, not the force prefix. We should avoid making any > link. Well, are you saying placing such a prefix right below the boundary of a flat segment is _expected_ to get the instruction at address 0 emulated? I don't think I could buy that. The patch makes no other connection between prefix and actual insn. And #GP because of such a boundary condition should imo point at the prefix; only all faults associated with the actual insn should point there. Jan
>>> On 09.06.16 at 17:05, <JBeulich@suse.com> wrote: >>>> On 09.06.16 at 16:27, <andrew.cooper3@citrix.com> wrote: >> On 09/06/16 15:13, Jan Beulich wrote: >>>>>> On 09.06.16 at 16:06, <andrew.cooper3@citrix.com> wrote: >>>> On 09/06/16 13:31, Jan Beulich wrote: >>>>>>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >>>>>> On 08/06/16 14:43, Jan Beulich wrote: >>>>>>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>>>>>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>>>>>> >>>>>>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>>>>>> one: There definitely needs to be at least one more opcode byte, and we >>>>>>> can avoid missing a wraparound case this way. >>>>>>> >>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>>> I looked into this when you suggested it, but it latches the wrong eip >>>>>> in the emulation state, and you will end up re-emulating the ud2a >>>>>> instruction, rather than the following instruction. >>>>> Where is there any latching of eip? All hvm_emulate_prepare() does >>>>> is storing the regs pointer. >>>> Oh - so it does. I clearly looked over it too quickly. >>>> >>>> What wraparound issue are you referring to? Adding 1 will cause >>>> incorrect behaviour when the emulation prefix ends at the segment limit. >>> I don't think so: The prefix together with the actual instruction >>> encoding should be viewed as a single instruction, and it crossing >>> the segment limit should #GP. It wrapping at the prefix/encoding >>> boundary is the case that I'm specifically referring to (this case >>> should also #GP, but wouldn't without this adjustment). >> >> But the force emulation prefix specifically doesn't behave like other >> prefixes. >> >> It doesn't count towards the 15 byte instruction limit, and if the >> emulated instruction does fault, we want the fault pointing at the >> emulated instruction, not the force prefix. We should avoid making any >> link. > > Well, are you saying placing such a prefix right below the boundary > of a flat segment is _expected_ to get the instruction at address 0 > emulated? I don't think I could buy that. The patch makes no other > connection between prefix and actual insn. And #GP because of > such a boundary condition should imo point at the prefix; only all > faults associated with the actual insn should point there. Ping? Jan
On 09/06/16 16:05, Jan Beulich wrote: >>>> On 09.06.16 at 16:27, <andrew.cooper3@citrix.com> wrote: >> On 09/06/16 15:13, Jan Beulich wrote: >>>>>> On 09.06.16 at 16:06, <andrew.cooper3@citrix.com> wrote: >>>> On 09/06/16 13:31, Jan Beulich wrote: >>>>>>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >>>>>> On 08/06/16 14:43, Jan Beulich wrote: >>>>>>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>>>>>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>>>>>> >>>>>>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>>>>>> one: There definitely needs to be at least one more opcode byte, and we >>>>>>> can avoid missing a wraparound case this way. >>>>>>> >>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>>> I looked into this when you suggested it, but it latches the wrong eip >>>>>> in the emulation state, and you will end up re-emulating the ud2a >>>>>> instruction, rather than the following instruction. >>>>> Where is there any latching of eip? All hvm_emulate_prepare() does >>>>> is storing the regs pointer. >>>> Oh - so it does. I clearly looked over it too quickly. >>>> >>>> What wraparound issue are you referring to? Adding 1 will cause >>>> incorrect behaviour when the emulation prefix ends at the segment limit. >>> I don't think so: The prefix together with the actual instruction >>> encoding should be viewed as a single instruction, and it crossing >>> the segment limit should #GP. It wrapping at the prefix/encoding >>> boundary is the case that I'm specifically referring to (this case >>> should also #GP, but wouldn't without this adjustment). >> But the force emulation prefix specifically doesn't behave like other >> prefixes. >> >> It doesn't count towards the 15 byte instruction limit, and if the >> emulated instruction does fault, we want the fault pointing at the >> emulated instruction, not the force prefix. We should avoid making any >> link. > Well, are you saying placing such a prefix right below the boundary > of a flat segment is _expected_ to get the instruction at address 0 > emulated? I don't think I could buy that. The patch makes no other > connection between prefix and actual insn. And #GP because of > such a boundary condition should imo point at the prefix; only all > faults associated with the actual insn should point there. Ok. That sounds reasonable. Would it be possible to add a small comment to the code? Even with the commit message, I was confused as to the nature of the +1. Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> On 17.06.16 at 11:37, <andrew.cooper3@citrix.com> wrote: > On 09/06/16 16:05, Jan Beulich wrote: >>>>> On 09.06.16 at 16:27, <andrew.cooper3@citrix.com> wrote: >>> On 09/06/16 15:13, Jan Beulich wrote: >>>>>>> On 09.06.16 at 16:06, <andrew.cooper3@citrix.com> wrote: >>>>> On 09/06/16 13:31, Jan Beulich wrote: >>>>>>>>> On 09.06.16 at 13:34, <andrew.cooper3@citrix.com> wrote: >>>>>>> On 08/06/16 14:43, Jan Beulich wrote: >>>>>>>> Don't fetch CS explicitly, leverage the fact that hvm_emulate_prepare() >>>>>>>> already does (and that hvm_virtual_to_linear_addr() doesn't alter it). >>>>>>>> >>>>>>>> At once increase the length passed to hvm_virtual_to_linear_addr() by >>>>>>>> one: There definitely needs to be at least one more opcode byte, and we >>>>>>>> can avoid missing a wraparound case this way. >>>>>>>> >>>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>>>> I looked into this when you suggested it, but it latches the wrong eip >>>>>>> in the emulation state, and you will end up re-emulating the ud2a >>>>>>> instruction, rather than the following instruction. >>>>>> Where is there any latching of eip? All hvm_emulate_prepare() does >>>>>> is storing the regs pointer. >>>>> Oh - so it does. I clearly looked over it too quickly. >>>>> >>>>> What wraparound issue are you referring to? Adding 1 will cause >>>>> incorrect behaviour when the emulation prefix ends at the segment limit. >>>> I don't think so: The prefix together with the actual instruction >>>> encoding should be viewed as a single instruction, and it crossing >>>> the segment limit should #GP. It wrapping at the prefix/encoding >>>> boundary is the case that I'm specifically referring to (this case >>>> should also #GP, but wouldn't without this adjustment). >>> But the force emulation prefix specifically doesn't behave like other >>> prefixes. >>> >>> It doesn't count towards the 15 byte instruction limit, and if the >>> emulated instruction does fault, we want the fault pointing at the >>> emulated instruction, not the force prefix. We should avoid making any >>> link. >> Well, are you saying placing such a prefix right below the boundary >> of a flat segment is _expected_ to get the instruction at address 0 >> emulated? I don't think I could buy that. The patch makes no other >> connection between prefix and actual insn. And #GP because of >> such a boundary condition should imo point at the prefix; only all >> faults associated with the actual insn should point there. > > Ok. That sounds reasonable. Would it be possible to add a small > comment to the code? Even with the commit message, I was confused as to > the nature of the +1. + /* + * Note that in the call below we pass 1 more than the signature + * size, to guard against the overall code sequence wrapping between + * "prefix" and actual instruction. There's necessarily at least one + * actual instruction byte required, so this won't cause failure on + * legitimate uses. + */ > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Thanks, Jan
--- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3834,19 +3834,20 @@ void hvm_ud_intercept(struct cpu_user_re { struct hvm_emulate_ctxt ctxt; + hvm_emulate_prepare(&ctxt, regs); + if ( opt_hvm_fep ) { struct vcpu *cur = current; - struct segment_register cs; + const struct segment_register *cs = &ctxt.seg_reg[x86_seg_cs]; unsigned long addr; char sig[5]; /* ud2; .ascii "xen" */ - hvm_get_segment_register(cur, x86_seg_cs, &cs); - if ( hvm_virtual_to_linear_addr(x86_seg_cs, &cs, regs->eip, - sizeof(sig), hvm_access_insn_fetch, + if ( hvm_virtual_to_linear_addr(x86_seg_cs, cs, regs->eip, + sizeof(sig) + 1, hvm_access_insn_fetch, (hvm_long_mode_enabled(cur) && - cs.attr.fields.l) ? 64 : - cs.attr.fields.db ? 32 : 16, &addr) && + cs->attr.fields.l) ? 64 : + cs->attr.fields.db ? 32 : 16, &addr) && (hvm_fetch_from_guest_virt_nofault(sig, addr, sizeof(sig), 0) == HVMCOPY_okay) && (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) ) @@ -3856,8 +3857,6 @@ void hvm_ud_intercept(struct cpu_user_re } } - hvm_emulate_prepare(&ctxt, regs); - switch ( hvm_emulate_one(&ctxt) ) { case X86EMUL_UNHANDLEABLE: