Message ID | 1467312289-3054-1-git-send-email-czuzu@bitdefender.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 06/30/16 21:44, Corneliu ZUZU wrote: > After trapping a control-register write vm-event and -until- deciding if that > write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual write, > there cannot and should not be another trapped control-register write event. > That is, currently -only one- of the fields of monitor_write_data.do_write can > be true at any given moment and therefore it would be more appropriate to > replace those fields with an enum value. > > Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com> > --- > xen/arch/x86/hvm/hvm.c | 18 +++++++++++------- > xen/arch/x86/monitor.c | 37 ++++++++++++++++++------------------- > xen/arch/x86/vm_event.c | 25 ++----------------------- > xen/include/asm-x86/domain.h | 20 ++++++++++---------- > 4 files changed, 41 insertions(+), 59 deletions(-) I'm not sure why neither Tamas' or my email address are in the CC list (I would have assumed either monitor.c or vm_event.c would trigger a response from get_maintainters.pl). In any case: Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Thanks, Razvan
On 7/1/2016 9:53 AM, Razvan Cojocaru wrote: > On 06/30/16 21:44, Corneliu ZUZU wrote: >> After trapping a control-register write vm-event and -until- deciding if that >> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual write, >> there cannot and should not be another trapped control-register write event. >> That is, currently -only one- of the fields of monitor_write_data.do_write can >> be true at any given moment and therefore it would be more appropriate to >> replace those fields with an enum value. >> >> Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com> >> --- >> xen/arch/x86/hvm/hvm.c | 18 +++++++++++------- >> xen/arch/x86/monitor.c | 37 ++++++++++++++++++------------------- >> xen/arch/x86/vm_event.c | 25 ++----------------------- >> xen/include/asm-x86/domain.h | 20 ++++++++++---------- >> 4 files changed, 41 insertions(+), 59 deletions(-) > I'm not sure why neither Tamas' or my email address are in the CC list > (I would have assumed either monitor.c or vm_event.c would trigger a > response from get_maintainters.pl). In any case: > > Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com> > > > Thanks, > Razvan > I only now notice, it seems there's a problem with MAINTAINERS, the 2 files are listed: F: xen/*/vm_event.c F: xen/*/monitor.c there and the listing should have been: F: xen/arch/*/vm_event.c F: xen/arch/*/monitor.c Will submit a separate patch ASAP with that adjustment. Corneliu.
>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: > After trapping a control-register write vm-event and -until- deciding if that > write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual > write, > there cannot and should not be another trapped control-register write event. Is that true even for the case where full register state gets updated for a vCPU? Is that updating-all-context case of no interest to a monitoring application, now and forever? > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -259,19 +259,19 @@ struct pv_domain > struct cpuidmasks *cpuidmasks; > }; > > -struct monitor_write_data { > - struct { > - unsigned int msr : 1; > - unsigned int cr0 : 1; > - unsigned int cr3 : 1; > - unsigned int cr4 : 1; > - } do_write; > +enum monitor_write_status > +{ > + MWS_NOWRITE = 0, Please omit the "= 0" here - MWS_NOWRITE will be zero even without that. Jan
On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >> After trapping a control-register write vm-event and -until- deciding if that >> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >> write, >> there cannot and should not be another trapped control-register write event. > Is that true even for the case where full register state gets updated > for a vCPU? AFAIK, the full register state cannot be updated _at once_, that is: after each trapped register update monitor_write_data must _always_ be handled _before reentering the vCPU_. > Is that updating-all-context case of no interest to a > monitoring application, now and forever? As I said above, I'm don't see how such a case would (ever) be possible. >> --- a/xen/include/asm-x86/domain.h >> +++ b/xen/include/asm-x86/domain.h >> @@ -259,19 +259,19 @@ struct pv_domain >> struct cpuidmasks *cpuidmasks; >> }; >> >> -struct monitor_write_data { >> - struct { >> - unsigned int msr : 1; >> - unsigned int cr0 : 1; >> - unsigned int cr3 : 1; >> - unsigned int cr4 : 1; >> - } do_write; >> +enum monitor_write_status >> +{ >> + MWS_NOWRITE = 0, > Please omit the "= 0" here - MWS_NOWRITE will be zero even > without that. > > Jan > > Ack. Thanks, Corneliu.
>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: > On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>> After trapping a control-register write vm-event and -until- deciding if that >>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>> write, >>> there cannot and should not be another trapped control-register write event. >> Is that true even for the case where full register state gets updated >> for a vCPU? > > AFAIK, the full register state cannot be updated _at once_, that is: > after each trapped register update monitor_write_data must _always_ be > handled _before reentering the vCPU_. I'm thinking about operations like VCPUOP_initialise here. Of course operations on current can't possibly update more than one register at a time. Jan
On 7/4/2016 4:07 PM, Jan Beulich wrote: >>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: >> On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>>> After trapping a control-register write vm-event and -until- deciding if that >>>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>>> write, >>>> there cannot and should not be another trapped control-register write event. >>> Is that true even for the case where full register state gets updated >>> for a vCPU? >> AFAIK, the full register state cannot be updated _at once_, that is: >> after each trapped register update monitor_write_data must _always_ be >> handled _before reentering the vCPU_. > I'm thinking about operations like VCPUOP_initialise here. Of course > operations on current can't possibly update more than one register > at a time. > > Jan Yes but those register update operations happen outside the vm-event subsystem, i.e. in those cases the registers get updated directly, not by setting bits in monitor_write_data. Only hypervisor-trapped register updates (e.g. hvm_set_cr0 w/ may_defer parameter = 1) which are deferred because of hvm_event_crX returning true use monitor_write_data. Corneliu.
>>> On 04.07.16 at 15:21, <czuzu@bitdefender.com> wrote: > On 7/4/2016 4:07 PM, Jan Beulich wrote: >>>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: >>> On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>>>> After trapping a control-register write vm-event and -until- deciding if that >>>>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>>>> write, >>>>> there cannot and should not be another trapped control-register write event. >>>> Is that true even for the case where full register state gets updated >>>> for a vCPU? >>> AFAIK, the full register state cannot be updated _at once_, that is: >>> after each trapped register update monitor_write_data must _always_ be >>> handled _before reentering the vCPU_. >> I'm thinking about operations like VCPUOP_initialise here. Of course >> operations on current can't possibly update more than one register >> at a time. > > Yes but those register update operations happen outside the vm-event > subsystem, i.e. in those cases the registers get updated directly, not > by setting bits in monitor_write_data. > Only hypervisor-trapped register updates (e.g. hvm_set_cr0 w/ may_defer > parameter = 1) which are deferred because of hvm_event_crX returning > true use monitor_write_data. That's why I had added "Is that updating-all-context case of no interest to a monitoring application, now and forever?" After all I gave VCPUOP_initialise as an example because the guest itself can invoke it, and so I had assumed this to be of interest to a monitoring app. Jan
On 07/04/16 17:08, Jan Beulich wrote: >>>> On 04.07.16 at 15:21, <czuzu@bitdefender.com> wrote: >> On 7/4/2016 4:07 PM, Jan Beulich wrote: >>>>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: >>>> On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>>>>> After trapping a control-register write vm-event and -until- deciding if that >>>>>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>>>>> write, >>>>>> there cannot and should not be another trapped control-register write event. >>>>> Is that true even for the case where full register state gets updated >>>>> for a vCPU? >>>> AFAIK, the full register state cannot be updated _at once_, that is: >>>> after each trapped register update monitor_write_data must _always_ be >>>> handled _before reentering the vCPU_. >>> I'm thinking about operations like VCPUOP_initialise here. Of course >>> operations on current can't possibly update more than one register >>> at a time. >> >> Yes but those register update operations happen outside the vm-event >> subsystem, i.e. in those cases the registers get updated directly, not >> by setting bits in monitor_write_data. >> Only hypervisor-trapped register updates (e.g. hvm_set_cr0 w/ may_defer >> parameter = 1) which are deferred because of hvm_event_crX returning >> true use monitor_write_data. > > That's why I had added "Is that updating-all-context case of no > interest to a monitoring application, now and forever?" After all I > gave VCPUOP_initialise as an example because the guest itself > can invoke it, and so I had assumed this to be of interest to a > monitoring app. Well, the change does simplify the code for now but I can't bet on "forever" - maybe Tamas has some ideas here also? It may be that the prudent thing to do is leave the code without the enum change. Thanks, Razvan
On 7/4/2016 5:08 PM, Jan Beulich wrote: >>>> On 04.07.16 at 15:21, <czuzu@bitdefender.com> wrote: >> On 7/4/2016 4:07 PM, Jan Beulich wrote: >>>>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: >>>> On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>>>>> After trapping a control-register write vm-event and -until- deciding if that >>>>>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>>>>> write, >>>>>> there cannot and should not be another trapped control-register write event. >>>>> Is that true even for the case where full register state gets updated >>>>> for a vCPU? >>>> AFAIK, the full register state cannot be updated _at once_, that is: >>>> after each trapped register update monitor_write_data must _always_ be >>>> handled _before reentering the vCPU_. >>> I'm thinking about operations like VCPUOP_initialise here. Of course >>> operations on current can't possibly update more than one register >>> at a time. >> Yes but those register update operations happen outside the vm-event >> subsystem, i.e. in those cases the registers get updated directly, not >> by setting bits in monitor_write_data. >> Only hypervisor-trapped register updates (e.g. hvm_set_cr0 w/ may_defer >> parameter = 1) which are deferred because of hvm_event_crX returning >> true use monitor_write_data. > That's why I had added "Is that updating-all-context case of no > interest to a monitoring application, now and forever?" After all I > gave VCPUOP_initialise as an example because the guest itself > can invoke it, and so I had assumed this to be of interest to a > monitoring app. > > Jan > > Hmm, didn't have in mind the fact that the guest can do those kind of operations, I suppose that applies to PV guests, right? (in which case this scenario would be triggered by an in-guest PV driver, right?) CR-write monitor vm-events currently only trap _machine instructions_ which cause CR writes, but in the future I can see why it might be useful for a monitoring app to trap _any kind of guest behavior_ which would cause that. Good point, maybe Razvan and Tamas can chime in on this. Thanks, Corneliu.
>>> On 04.07.16 at 16:24, <czuzu@bitdefender.com> wrote: > On 7/4/2016 5:08 PM, Jan Beulich wrote: >>>>> On 04.07.16 at 15:21, <czuzu@bitdefender.com> wrote: >>> On 7/4/2016 4:07 PM, Jan Beulich wrote: >>>>>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: >>>>> On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>>>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>>>>>> After trapping a control-register write vm-event and -until- deciding if that >>>>>>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>>>>>> write, >>>>>>> there cannot and should not be another trapped control-register write event. >>>>>> Is that true even for the case where full register state gets updated >>>>>> for a vCPU? >>>>> AFAIK, the full register state cannot be updated _at once_, that is: >>>>> after each trapped register update monitor_write_data must _always_ be >>>>> handled _before reentering the vCPU_. >>>> I'm thinking about operations like VCPUOP_initialise here. Of course >>>> operations on current can't possibly update more than one register >>>> at a time. >>> Yes but those register update operations happen outside the vm-event >>> subsystem, i.e. in those cases the registers get updated directly, not >>> by setting bits in monitor_write_data. >>> Only hypervisor-trapped register updates (e.g. hvm_set_cr0 w/ may_defer >>> parameter = 1) which are deferred because of hvm_event_crX returning >>> true use monitor_write_data. >> That's why I had added "Is that updating-all-context case of no >> interest to a monitoring application, now and forever?" After all I >> gave VCPUOP_initialise as an example because the guest itself >> can invoke it, and so I had assumed this to be of interest to a >> monitoring app. > > Hmm, didn't have in mind the fact that the guest can do those kind of > operations, I suppose that applies to PV guests, right? (in which case > this scenario would be triggered by an in-guest PV driver, right?) A driver wouldn't normally do such things, unless the OS is really highly modularized (far beyond what Linux allows). But yes, this is a PV operation, but equally applies to PVHv2 (see commit 192df6f912 ["x86: allow HVM guests to use hypercalls to bring up vCPUs"]) and by extension/generalization HVM. Jan
On 7/4/2016 6:03 PM, Jan Beulich wrote: >>>> On 04.07.16 at 16:24, <czuzu@bitdefender.com> wrote: >> On 7/4/2016 5:08 PM, Jan Beulich wrote: >>>>>> On 04.07.16 at 15:21, <czuzu@bitdefender.com> wrote: >>>> On 7/4/2016 4:07 PM, Jan Beulich wrote: >>>>>>>> On 04.07.16 at 14:47, <czuzu@bitdefender.com> wrote: >>>>>> On 7/4/2016 3:37 PM, Jan Beulich wrote: >>>>>>>>>> On 30.06.16 at 20:44, <czuzu@bitdefender.com> wrote: >>>>>>>> After trapping a control-register write vm-event and -until- deciding if that >>>>>>>> write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual >>>>>>>> write, >>>>>>>> there cannot and should not be another trapped control-register write event. >>>>>>> Is that true even for the case where full register state gets updated >>>>>>> for a vCPU? >>>>>> AFAIK, the full register state cannot be updated _at once_, that is: >>>>>> after each trapped register update monitor_write_data must _always_ be >>>>>> handled _before reentering the vCPU_. >>>>> I'm thinking about operations like VCPUOP_initialise here. Of course >>>>> operations on current can't possibly update more than one register >>>>> at a time. >>>> Yes but those register update operations happen outside the vm-event >>>> subsystem, i.e. in those cases the registers get updated directly, not >>>> by setting bits in monitor_write_data. >>>> Only hypervisor-trapped register updates (e.g. hvm_set_cr0 w/ may_defer >>>> parameter = 1) which are deferred because of hvm_event_crX returning >>>> true use monitor_write_data. >>> That's why I had added "Is that updating-all-context case of no >>> interest to a monitoring application, now and forever?" After all I >>> gave VCPUOP_initialise as an example because the guest itself >>> can invoke it, and so I had assumed this to be of interest to a >>> monitoring app. >> Hmm, didn't have in mind the fact that the guest can do those kind of >> operations, I suppose that applies to PV guests, right? (in which case >> this scenario would be triggered by an in-guest PV driver, right?) > A driver wouldn't normally do such things, unless the OS is really > highly modularized (far beyond what Linux allows). But yes, this > is a PV operation, but equally applies to PVHv2 (see commit > 192df6f912 ["x86: allow HVM guests to use hypercalls to bring up > vCPUs"]) and by extension/generalization HVM. > > Jan > > Right, then I agree w/ making this a bitfield as it was (if Tamas has nothing to object). And now it also makes sense to _not make this an enum_ on ARM too, thanks for pointing this out. Corneliu.
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 5481a6e..884ae40 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2186,8 +2186,9 @@ int hvm_set_cr0(unsigned long value, bool_t may_defer) * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - v->arch.vm_event->write_data.do_write.cr0 = 1; - v->arch.vm_event->write_data.cr0 = value; + ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); + v->arch.vm_event->write_data.status = MWS_CR0; + v->arch.vm_event->write_data.value = value; return X86EMUL_OKAY; } @@ -2291,8 +2292,9 @@ int hvm_set_cr3(unsigned long value, bool_t may_defer) * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - v->arch.vm_event->write_data.do_write.cr3 = 1; - v->arch.vm_event->write_data.cr3 = value; + ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); + v->arch.vm_event->write_data.status = MWS_CR3; + v->arch.vm_event->write_data.value = value; return X86EMUL_OKAY; } @@ -2374,8 +2376,9 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer) * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - v->arch.vm_event->write_data.do_write.cr4 = 1; - v->arch.vm_event->write_data.cr4 = value; + ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); + v->arch.vm_event->write_data.status = MWS_CR4; + v->arch.vm_event->write_data.value = value; return X86EMUL_OKAY; } @@ -3756,7 +3759,8 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content, * The actual write will occur in arch_monitor_write_data(), if * permitted. */ - v->arch.vm_event->write_data.do_write.msr = 1; + ASSERT(MWS_NOWRITE == v->arch.vm_event->write_data.status); + v->arch.vm_event->write_data.status = MWS_MSR; v->arch.vm_event->write_data.msr = msr; v->arch.vm_event->write_data.value = msr_content; diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c index 90e4856..5c8d4da 100644 --- a/xen/arch/x86/monitor.c +++ b/xen/arch/x86/monitor.c @@ -53,29 +53,28 @@ void arch_monitor_write_data(struct vcpu *v) w = &v->arch.vm_event->write_data; - if ( w->do_write.msr ) - { - hvm_msr_write_intercept(w->msr, w->value, 0); - w->do_write.msr = 0; - } - - if ( w->do_write.cr0 ) - { - hvm_set_cr0(w->cr0, 0); - w->do_write.cr0 = 0; - } + if ( likely(MWS_NOWRITE == w->status) ) + return; - if ( w->do_write.cr4 ) + switch ( w->status ) { - hvm_set_cr4(w->cr4, 0); - w->do_write.cr4 = 0; + case MWS_MSR: + hvm_msr_write_intercept(w->msr, w->value, 0); + break; + case MWS_CR0: + hvm_set_cr0(w->value, 0); + break; + case MWS_CR3: + hvm_set_cr3(w->value, 0); + break; + case MWS_CR4: + hvm_set_cr4(w->value, 0); + break; + default: + break; } - if ( w->do_write.cr3 ) - { - hvm_set_cr3(w->cr3, 0); - w->do_write.cr3 = 0; - } + w->status = MWS_NOWRITE; } static unsigned long *monitor_bitmap_for_msr(const struct domain *d, u32 *msr) diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c index 80f84d6..825da48 100644 --- a/xen/arch/x86/vm_event.c +++ b/xen/arch/x86/vm_event.c @@ -73,34 +73,13 @@ void vm_event_register_write_resume(struct vcpu *v, vm_event_response_t *rsp) { if ( rsp->flags & VM_EVENT_FLAG_DENY ) { - struct monitor_write_data *w = &v->arch.vm_event->write_data; - - ASSERT(w); + ASSERT(v->arch.vm_event); /* deny flag requires the vCPU to be paused */ if ( !atomic_read(&v->vm_event_pause_count) ) return; - switch ( rsp->reason ) - { - case VM_EVENT_REASON_MOV_TO_MSR: - w->do_write.msr = 0; - break; - case VM_EVENT_REASON_WRITE_CTRLREG: - switch ( rsp->u.write_ctrlreg.index ) - { - case VM_EVENT_X86_CR0: - w->do_write.cr0 = 0; - break; - case VM_EVENT_X86_CR3: - w->do_write.cr3 = 0; - break; - case VM_EVENT_X86_CR4: - w->do_write.cr4 = 0; - break; - } - break; - } + v->arch.vm_event->write_data.status = MWS_NOWRITE; } } diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index 7c27f9e..a22ee6b 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -259,19 +259,19 @@ struct pv_domain struct cpuidmasks *cpuidmasks; }; -struct monitor_write_data { - struct { - unsigned int msr : 1; - unsigned int cr0 : 1; - unsigned int cr3 : 1; - unsigned int cr4 : 1; - } do_write; +enum monitor_write_status +{ + MWS_NOWRITE = 0, + MWS_MSR, + MWS_CR0, + MWS_CR3, + MWS_CR4, +}; +struct monitor_write_data { + enum monitor_write_status status; uint32_t msr; uint64_t value; - uint64_t cr0; - uint64_t cr3; - uint64_t cr4; }; struct arch_domain
After trapping a control-register write vm-event and -until- deciding if that write is to be permitted or not (VM_EVENT_FLAG_DENY) and doing the actual write, there cannot and should not be another trapped control-register write event. That is, currently -only one- of the fields of monitor_write_data.do_write can be true at any given moment and therefore it would be more appropriate to replace those fields with an enum value. Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com> --- xen/arch/x86/hvm/hvm.c | 18 +++++++++++------- xen/arch/x86/monitor.c | 37 ++++++++++++++++++------------------- xen/arch/x86/vm_event.c | 25 ++----------------------- xen/include/asm-x86/domain.h | 20 ++++++++++---------- 4 files changed, 41 insertions(+), 59 deletions(-)