Message ID: 1454588852-5389-1-git-send-email-rcojocaru@bitdefender.com (mailing list archive)
State: New, archived
On 04/02/16 12:27, Razvan Cojocaru wrote:
> Currently, after receiving a vm_event reply requesting emulation,
> the actual emulation is triggered in p2m_mem_access_check(),
> which means that we're waiting for the page fault to occur again
> before emulating.

Presumably this means that we re-enter the guest and exit immediately
for (hopefully) the same violation?

> Aside from the performance impact, this
> complicates the code since between hvm_do_resume() and the second
> page fault it is possible that the latter becomes a completely
> new page fault - hence checking that EIP and the GPA match with
> the ones in the original page fault.

Presumably this occurs when we injected an event on the vmentry?

> If they don't, duplicate
> EPT fault vm_events will occur, of which a monitoring application
> needs to be aware.
> This patch makes struct arch_vm_event smaller (since we no longer
> need to track eip and gpa), removes the checking code from
> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>
> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
> ---
>  xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
>  xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
>  xen/include/asm-x86/vm_event.h |  2 --
>  3 files changed, 17 insertions(+), 36 deletions(-)

Gotta love that diffstat!

The logic makes sense, so Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
for the x86-related nature, but it would be nice to have a review from Tamas
for the vm_event side of things.
On 02/04/2016 02:36 PM, Andrew Cooper wrote:
> On 04/02/16 12:27, Razvan Cojocaru wrote:
>> Currently, after receiving a vm_event reply requesting emulation,
>> the actual emulation is triggered in p2m_mem_access_check(),
>> which means that we're waiting for the page fault to occur again
>> before emulating.
>
> Presumably this means that we re-enter the guest and exit immediately
> for (hopefully) the same violation?

Yes, something along those lines: the original page fault occurs, a
vm_event is sent to the application, which replies with the EMULATE
flag set (but does not lift the page restrictions). Now we're hoping
that the same instruction that caused the first page fault runs again,
which triggers a new page fault (thus a new call of
p2m_mem_access_check()). But in p2m_mem_access_check() we check to see
if emulate flags are set for the current VCPU, and if they are, we
check that the instruction is really the one that caused the original
page fault, and if it is (i.e. both EIP and GPA match), we emulate.

The ideal case is the one where the second page fault is caused by the
same instruction hitting the same page, and that happens most of the
time, but it unfortunately does not happen all of the time.

So when the second page fault is _not_ caused by the same instruction,
we just reset the emulate flags and carry on with regular processing,
which means that a new vm_event will be sent out about this new page
fault. But even though the application has requested that the
instruction that triggered the last page fault be emulated, it wasn't
(as a design limitation). So now, when / if the old instruction hits
the page again, it will be received by the monitoring application as a
new hit, not still the old, unemulated one.

There are safeguards possible for this in the monitoring application,
but they too have limitations, and it is ultimately less efficient and
more error-prone than the alternative hopefully is.

>> Aside from the performance impact, this
>> complicates the code since between hvm_do_resume() and the second
>> page fault it is possible that the latter becomes a completely
>> new page fault - hence checking that EIP and the GPA match with
>> the ones in the original page fault.
>
> Presumably this occurs when we injected an event on the vmentry?

Any asynchronous-type cause of a page fault will do; hopefully I've
been able to explain somewhat above.

>> If they don't, duplicate
>> EPT fault vm_events will occur, of which a monitoring application
>> needs to be aware.
>> This patch makes struct arch_vm_event smaller (since we no longer
>> need to track eip and gpa), removes the checking code from
>> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>>
>> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> ---
>>  xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
>>  xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
>>  xen/include/asm-x86/vm_event.h |  2 --
>>  3 files changed, 17 insertions(+), 36 deletions(-)
>
> Gotta love that diffstat!
>
> The logic makes sense, so Acked-by: Andrew Cooper
> <andrew.cooper3@citrix.com> for the x86-related nature, but it would be
> nice to have a review from Tamas for the vm_event side of things.

Thanks! Of course. Hopefully Tamas will like this, based on a
conversation we had a few days ago here.

Thanks,
Razvan
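For readers following along, the pre-patch flow described above can be modelled roughly as the standalone sketch below. It uses simplified, made-up types and names (vm_event_state, mem_access_check) rather than the actual Xen code quoted in the patch; it only illustrates the record-then-match-on-refault logic and the duplicate-event corner case.

/*
 * Simplified model of the pre-patch behaviour: emulation only happens on
 * the *second* fault, and only if it matches the first one.  Standalone
 * sketch with invented types, not the real Xen code.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct vm_event_state {
    uint32_t emulate_flags;   /* set from the monitor's vm_event reply */
    unsigned long gpa, eip;   /* state recorded at the first fault */
};

/* Called on every EPT violation (stand-in for p2m_mem_access_check()). */
static bool mem_access_check(struct vm_event_state *ve,
                             unsigned long gpa, unsigned long eip)
{
    /* The previous reply no longer matches the current fault: drop it
     * and notify the monitor again -> the "duplicate vm_event" case. */
    if ( ve->emulate_flags && (ve->gpa != gpa || ve->eip != eip) )
    {
        ve->emulate_flags = 0;
        ve->gpa = gpa;
        ve->eip = eip;
        printf("mismatch: sending a duplicate vm_event\n");
        return false;
    }

    /* Second fault matches the first: emulate now, send no new event. */
    if ( ve->emulate_flags )
    {
        printf("second fault matches: emulating at eip=%#lx\n", eip);
        ve->emulate_flags = 0;
        return true;
    }

    /* First fault: record the state and notify the monitor. */
    ve->gpa = gpa;
    ve->eip = eip;
    printf("first fault: sending vm_event, waiting for a reply\n");
    return false;
}

int main(void)
{
    struct vm_event_state ve = { 0 };

    mem_access_check(&ve, 0x1000, 0x400100);  /* first fault, monitor notified */
    ve.emulate_flags = 1;                     /* monitor replied with EMULATE */

    /* Happy path: the same instruction re-faults and gets emulated. */
    mem_access_check(&ve, 0x1000, 0x400100);

    /* Unlucky path: an unrelated fault arrives first -> duplicate event. */
    ve.emulate_flags = 1;
    mem_access_check(&ve, 0x2000, 0x400200);
    return 0;
}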
On 02/04/2016 02:52 PM, Razvan Cojocaru wrote:
> On 02/04/2016 02:36 PM, Andrew Cooper wrote:
>> On 04/02/16 12:27, Razvan Cojocaru wrote:
>>> Currently, after receiving a vm_event reply requesting emulation,
>>> the actual emulation is triggered in p2m_mem_access_check(),
>>> which means that we're waiting for the page fault to occur again
>>> before emulating.
>>
>> Presumably this means that we re-enter the guest and exit immediately
>> for (hopefully) the same violation?
>
> Yes, something along those lines: the original page fault occurs, a
> vm_event is sent to the application, which replies with the EMULATE
> flag set (but does not lift the page restrictions). Now we're hoping
> that the same instruction that caused the first page fault runs again,
> which triggers a new page fault (thus a new call of
> p2m_mem_access_check()). But in p2m_mem_access_check() we check to see
> if emulate flags are set for the current VCPU, and if they are, we
> check that the instruction is really the one that caused the original
> page fault, and if it is (i.e. both EIP and GPA match), we emulate.
>
> The ideal case is the one where the second page fault is caused by the
> same instruction hitting the same page, and that happens most of the
> time, but it unfortunately does not happen all of the time.
>
> So when the second page fault is _not_ caused by the same instruction,
> we just reset the emulate flags and carry on with regular processing,
> which means that a new vm_event will be sent out about this new page
> fault. But even though the application has requested that the
> instruction that triggered the last page fault be emulated, it wasn't
> (as a design limitation). So now, when / if the old instruction hits
> the page again, it will be received by the monitoring application as a
> new hit, not still the old, unemulated one.
>
> There are safeguards possible for this in the monitoring application,
> but they too have limitations, and it is ultimately less efficient and
> more error-prone than the alternative hopefully is.
>
>>> Aside from the performance impact, this
>>> complicates the code since between hvm_do_resume() and the second
>>> page fault it is possible that the latter becomes a completely
>>> new page fault - hence checking that EIP and the GPA match with
>>> the ones in the original page fault.
>>
>> Presumably this occurs when we injected an event on the vmentry?
>
> Any asynchronous-type cause of a page fault will do; hopefully I've
> been able to explain somewhat above.
>
>>> If they don't, duplicate
>>> EPT fault vm_events will occur, of which a monitoring application
>>> needs to be aware.
>>> This patch makes struct arch_vm_event smaller (since we no longer
>>> need to track eip and gpa), removes the checking code from
>>> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>>>
>>> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>>> ---
>>>  xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
>>>  xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
>>>  xen/include/asm-x86/vm_event.h |  2 --
>>>  3 files changed, 17 insertions(+), 36 deletions(-)
>>
>> Gotta love that diffstat!
>>
>> The logic makes sense, so Acked-by: Andrew Cooper
>> <andrew.cooper3@citrix.com> for the x86-related nature, but it would be
>> nice to have a review from Tamas for the vm_event side of things.
>
> Thanks! Of course. Hopefully Tamas will like this, based on a
> conversation we had a few days ago here.
I've changed Tamas' address to the @novetta one; I think he might not be
using the older @zentific one.
> >> The logic makes sense, so Acked-by: Andrew Cooper
> >> <andrew.cooper3@citrix.com> for the x86-related nature, but it would be
> >> nice to have a review from Tamas for the vm_event side of things.
> >
> > Thanks! Of course. Hopefully Tamas will like this, based on a
> > conversation we had a few days ago here.
>

Yes, it's a step in the right direction!

>
> I've changed Tamas' address to the @novetta one; I think he might not be
> using the older @zentific one.
>

Thanks, either of these emails is actually fine, but for Xen-related things
tamas@tklengyel.com (as in the MAINTAINERS file) works best, as that's where
I have the proper filtering rules set up to avoid getting mails like this
lost in the inbox.

Tamas
On Thu, Feb 4, 2016 at 5:27 AM, Razvan Cojocaru <rcojocaru@bitdefender.com> wrote:
> Currently, after receiving a vm_event reply requesting emulation,
> the actual emulation is triggered in p2m_mem_access_check(),
> which means that we're waiting for the page fault to occur again
> before emulating. Aside from the performance impact, this
> complicates the code since between hvm_do_resume() and the second
> page fault it is possible that the latter becomes a completely
> new page fault - hence checking that EIP and the GPA match with
> the ones in the original page fault. If they don't, duplicate
> EPT fault vm_events will occur, of which a monitoring application
> needs to be aware.
> This patch makes struct arch_vm_event smaller (since we no longer
> need to track eip and gpa), removes the checking code from
> p2m_mem_access_check(), and moves the emulation in hvm_do_resume().
>
> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>

Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 35ec6c9..930d0e3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -552,6 +552,23 @@ void hvm_do_resume(struct vcpu *v)
     {
         struct monitor_write_data *w = &v->arch.vm_event->write_data;
 
+        if ( v->arch.vm_event->emulate_flags )
+        {
+            enum emul_kind kind = EMUL_KIND_NORMAL;
+
+            if ( v->arch.vm_event->emulate_flags &
+                 VM_EVENT_FLAG_SET_EMUL_READ_DATA )
+                kind = EMUL_KIND_SET_CONTEXT;
+            else if ( v->arch.vm_event->emulate_flags &
+                      VM_EVENT_FLAG_EMULATE_NOWRITE )
+                kind = EMUL_KIND_NOWRITE;
+
+            hvm_mem_access_emulate_one(kind, TRAP_invalid_op,
+                                       HVM_DELIVER_NO_ERROR_CODE);
+
+            v->arch.vm_event->emulate_flags = 0;
+        }
+
         if ( w->do_write.msr )
         {
             hvm_msr_write_intercept(w->msr, w->value, 0);
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index a45ee35..47e7fad 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1639,7 +1639,6 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     p2m_access_t p2ma;
     vm_event_request_t *req;
     int rc;
-    unsigned long eip = guest_cpu_user_regs()->eip;
 
     if ( altp2m_active(d) )
         p2m = p2m_get_altp2m(v);
@@ -1698,39 +1697,6 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         }
     }
 
-    /* The previous vm_event reply does not match the current state. */
-    if ( unlikely(v->arch.vm_event) &&
-         (v->arch.vm_event->gpa != gpa || v->arch.vm_event->eip != eip) )
-    {
-        /* Don't emulate the current instruction, send a new vm_event. */
-        v->arch.vm_event->emulate_flags = 0;
-
-        /*
-         * Make sure to mark the current state to match it again against
-         * the new vm_event about to be sent.
-         */
-        v->arch.vm_event->gpa = gpa;
-        v->arch.vm_event->eip = eip;
-    }
-
-    if ( unlikely(v->arch.vm_event) && v->arch.vm_event->emulate_flags )
-    {
-        enum emul_kind kind = EMUL_KIND_NORMAL;
-
-        if ( v->arch.vm_event->emulate_flags &
-             VM_EVENT_FLAG_SET_EMUL_READ_DATA )
-            kind = EMUL_KIND_SET_CONTEXT;
-        else if ( v->arch.vm_event->emulate_flags &
-                  VM_EVENT_FLAG_EMULATE_NOWRITE )
-            kind = EMUL_KIND_NOWRITE;
-
-        hvm_mem_access_emulate_one(kind, TRAP_invalid_op,
-                                   HVM_DELIVER_NO_ERROR_CODE);
-
-        v->arch.vm_event->emulate_flags = 0;
-        return 1;
-    }
-
     *req_ptr = NULL;
     req = xzalloc(vm_event_request_t);
     if ( req )
diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
index 5aff834..fff8326 100644
--- a/xen/include/asm-x86/vm_event.h
+++ b/xen/include/asm-x86/vm_event.h
@@ -28,8 +28,6 @@
  */
 struct arch_vm_event {
     uint32_t emulate_flags;
-    unsigned long gpa;
-    unsigned long eip;
     struct vm_event_emul_read_data emul_read_data;
     struct monitor_write_data write_data;
 };
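For context on how the new hvm_do_resume() path is reached from the monitoring side: the application replies to the mem_access vm_event with the EMULATE flag set, and with this patch the emulation is performed before the vCPU re-enters the guest, rather than on a second fault. Below is a rough sketch of such a reply; it assumes the public vm_event interface of this period (vm_event_request_t/vm_event_response_t and the VM_EVENT_FLAG_* bits), so the exact field and constant names should be checked against xen/include/public/vm_event.h rather than taken from here.

#include <string.h>
#include <xen/vm_event.h>   /* assumed location of the public vm_event ABI */

static void reply_with_emulation(vm_event_response_t *rsp,
                                 const vm_event_request_t *req,
                                 int suppress_writes)
{
    memset(rsp, 0, sizeof(*rsp));
    rsp->version = VM_EVENT_INTERFACE_VERSION;
    rsp->vcpu_id = req->vcpu_id;
    rsp->reason  = req->reason;

    /*
     * Echo the paused flag so the vCPU is unpaused on resume, and ask Xen
     * to emulate the faulting instruction instead of relaxing the EPT
     * permissions.  With this patch the emulation happens in
     * hvm_do_resume(), before the vCPU re-enters the guest, so there is
     * no second fault and no EIP/GPA bookkeeping.
     */
    rsp->flags = (req->flags & VM_EVENT_FLAG_VCPU_PAUSED) |
                 VM_EVENT_FLAG_EMULATE;
    if ( suppress_writes )
        rsp->flags |= VM_EVENT_FLAG_EMULATE_NOWRITE;
}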
Currently, after receiving a vm_event reply requesting emulation,
the actual emulation is triggered in p2m_mem_access_check(),
which means that we're waiting for the page fault to occur again
before emulating. Aside from the performance impact, this
complicates the code since between hvm_do_resume() and the second
page fault it is possible that the latter becomes a completely
new page fault - hence checking that EIP and the GPA match with
the ones in the original page fault. If they don't, duplicate
EPT fault vm_events will occur, of which a monitoring application
needs to be aware.
This patch makes struct arch_vm_event smaller (since we no longer
need to track eip and gpa), removes the checking code from
p2m_mem_access_check(), and moves the emulation in hvm_do_resume().

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
---
 xen/arch/x86/hvm/hvm.c         | 17 +++++++++++++++++
 xen/arch/x86/mm/p2m.c          | 34 ----------------------------------
 xen/include/asm-x86/vm_event.h |  2 --
 3 files changed, 17 insertions(+), 36 deletions(-)