Message ID | 6430868f-6eb8-5c07-3a79-6f94101ca600@bitdefender.com (mailing list archive)
---|---
State | New, archived
>>> On 07.11.16 at 15:34, <rcojocaru@bitdefender.com> wrote:
> My proposal was simply something along the lines of:
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 704fd64..58f5ae4 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -535,8 +535,22 @@ void hvm_do_resume(struct vcpu *v)
>      /* Inject pending hw/sw trap */
>      if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
>      {
> -        hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
> +        unsigned int success = 0;
> +
> +        /* Check for already pending interrupts (races). */
> +        if ( !hvm_event_pending(v) )
> +        {
> +            hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
> +            success = 1;
> +        }
> +
>          v->arch.hvm_vcpu.inject_trap.vector = -1;
> +
> +        hvm_monitor_injection_result(v->arch.hvm_vcpu.inject_trap.vector,
> +                                     v->arch.hvm_vcpu.inject_trap.type,
> +                                     v->arch.hvm_vcpu.inject_trap.error_code,
> +                                     v->arch.hvm_vcpu.inject_trap.cr2,
> +                                     success);
>      }
>  }

But you realize that injection isn't really VM event related; it's an independent interface. Hence my "too special cased" complaint. What if the event I did propose were a one-shot one, which you enable right before (or after) doing your event injection (it could also be an injection flag)? You'd then see whether the next event the vCPU gets is the one you tried to inject.

Jan
On 11/07/2016 04:59 PM, Jan Beulich wrote:
>>>> On 07.11.16 at 15:34, <rcojocaru@bitdefender.com> wrote:
>> My proposal was simply something along the lines of:
>>
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 704fd64..58f5ae4 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -535,8 +535,22 @@ void hvm_do_resume(struct vcpu *v)
>>      /* Inject pending hw/sw trap */
>>      if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
>>      {
>> -        hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
>> +        unsigned int success = 0;
>> +
>> +        /* Check for already pending interrupts (races). */
>> +        if ( !hvm_event_pending(v) )
>> +        {
>> +            hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
>> +            success = 1;
>> +        }
>> +
>>          v->arch.hvm_vcpu.inject_trap.vector = -1;
>> +
>> +        hvm_monitor_injection_result(v->arch.hvm_vcpu.inject_trap.vector,
>> +                                     v->arch.hvm_vcpu.inject_trap.type,
>> +                                     v->arch.hvm_vcpu.inject_trap.error_code,
>> +                                     v->arch.hvm_vcpu.inject_trap.cr2,
>> +                                     success);
>>      }
>>  }
>
> But you realize that injection isn't really VM event related; it's an
> independent interface. Hence my "too special cased" complaint.
> What if the event I did propose were a one-shot one, which
> you enable right before (or after) doing your event injection (it
> could also be an injection flag)? You'd then see whether the
> next event the vCPU gets is the one you tried to inject.

Not only do I realize that, but the irony is that that was my initial reply to Tamas' suggestion. :) Unfortunately, after looking carefully at all the alternatives, this one has turned out to be the simplest and most effective: since it's not possible to know whether the injection has succeeded once xc_hvm_inject_trap() returns, the only way to find out is for the application to be notified asynchronously when the hypervisor knows, and the simplest way to do that is via vm_event.

The one-shot vm_event does sound reasonable.
I could set a per-vCPU flag, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and fire a single event from hvm_inject_trap() if it's set (then unset it); the flag would be set via an xc_monitor_next_interrupt() call in libxc.

If nobody objects, I'll test that and see how it goes.

Thanks,
Razvan
>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
> The one-shot vm_event does sound reasonable. I could set a flag
> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
> fire a single event from hvm_inject_trap() if it's set (then unset it) -
> the flag would be set via an xc_monitor_next_interrupt() call in libxc.

Doing this in hvm_inject_trap() would not cover all cases, afaict. I'd suggest doing this from hvm_do_resume(), _after_ the (conditional) call to hvm_inject_trap(), if there is _any_ event pending.

Jan
On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
>> The one-shot vm_event does sound reasonable. I could set a flag
>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>
> Doing this in hvm_inject_trap() would not cover all cases, afaict.
> I'd suggest doing this from hvm_do_resume() _after_ the
> (conditional) call to hvm_inject_trap(), if there is _any_ event
> pending.

But that would only cover the hypercall-injected traps. The condition in hvm_do_resume() is "if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )", and inject_trap.vector seems to only ever be set by the hypercall:

5876     case HVMOP_inject_trap:
5877     {
5878         xen_hvm_inject_trap_t tr;
5879         struct domain *d;
5880         struct vcpu *v;
5881
5882         if ( copy_from_guest(&tr, arg, 1 ) )
5883             return -EFAULT;
5884
5885         rc = rcu_lock_remote_domain_by_id(tr.domid, &d);
5886         if ( rc != 0 )
5887             return rc;
5888
5889         rc = -EINVAL;
5890         if ( !is_hvm_domain(d) )
5891             goto injtrap_fail;
5892
5893         rc = xsm_hvm_control(XSM_DM_PRIV, d, op);
5894         if ( rc )
5895             goto injtrap_fail;
5896
5897         rc = -ENOENT;
5898         if ( tr.vcpuid >= d->max_vcpus || (v = d->vcpu[tr.vcpuid]) == NULL )
5899             goto injtrap_fail;
5900
5901         if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
5902             rc = -EBUSY;
5903         else
5904         {
5905             v->arch.hvm_vcpu.inject_trap.vector = tr.vector;
5906             v->arch.hvm_vcpu.inject_trap.type = tr.type;
5907             v->arch.hvm_vcpu.inject_trap.error_code = tr.error_code;
5908             v->arch.hvm_vcpu.inject_trap.insn_len = tr.insn_len;
5909             v->arch.hvm_vcpu.inject_trap.cr2 = tr.cr2;
5910             rc = 0;
5911         }
5912
5913     injtrap_fail:
5914         rcu_unlock_domain(d);
5915         break;
5916     }

So if the next interrupt is not caused by the hypercall, we'll never get another event. Am I reading the code wrong?

Thanks,
Razvan
>>> On 07.11.16 at 18:01, <rcojocaru@bitdefender.com> wrote:
> On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
>>> The one-shot vm_event does sound reasonable. I could set a flag
>>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>>
>> Doing this in hvm_inject_trap() would not cover all cases, afaict.
>> I'd suggest doing this from hvm_do_resume() _after_ the
>> (conditional) call to hvm_inject_trap(), if there is _any_ event
>> pending.
>
> But that would only cover the hypercall-injected traps. The condition in
> hvm_do_resume() is "if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )",
> and inject_trap.vector seems to only ever be set by the hypercall:
> [...]
> So if the next interrupt is not caused by the hypercall, we'll never get
> another event. Am I reading the code wrong?

No, maybe I expressed myself ambiguously: I meant to say that the event should be delivered from hvm_do_resume(), but _outside_ the conditional guarding the call to hvm_inject_trap(). Otherwise things would have been worse than when doing it inside hvm_inject_trap().

Jan
On 11/08/2016 10:15 AM, Jan Beulich wrote:
>>>> On 07.11.16 at 18:01, <rcojocaru@bitdefender.com> wrote:
>> On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
>>>> The one-shot vm_event does sound reasonable. I could set a flag
>>>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>>>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>>>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>>>
>>> Doing this in hvm_inject_trap() would not cover all cases, afaict.
>>> I'd suggest doing this from hvm_do_resume() _after_ the
>>> (conditional) call to hvm_inject_trap(), if there is _any_ event
>>> pending.
>>
>> But that would only cover the hypercall-injected traps. The condition in
>> hvm_do_resume() is "if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )",
>> and inject_trap.vector seems to only ever be set by the hypercall:
>> [...]
>> So if the next interrupt is not caused by the hypercall, we'll never get
>> another event. Am I reading the code wrong?
>
> No, maybe I expressed myself ambiguously: I meant to say that the
> event should be delivered from hvm_do_resume(), but _outside_ the
> conditional guarding the call to hvm_inject_trap(). Otherwise things
> would have been worse than when doing it inside hvm_inject_trap().

Right, definitely, I'll do that.

Thanks,
Razvan
On 11/08/2016 10:15 AM, Jan Beulich wrote:
>>>> On 07.11.16 at 18:01, <rcojocaru@bitdefender.com> wrote:
>> On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
>>>> The one-shot vm_event does sound reasonable. I could set a flag
>>>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>>>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>>>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>>>
>>> Doing this in hvm_inject_trap() would not cover all cases, afaict.
>>> I'd suggest doing this from hvm_do_resume() _after_ the
>>> (conditional) call to hvm_inject_trap(), if there is _any_ event
>>> pending.
>>
>> But that would only cover the hypercall-injected traps. The condition in
>> hvm_do_resume() is "if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )",
>> and inject_trap.vector seems to only ever be set by the hypercall:
>> [...]
>> So if the next interrupt is not caused by the hypercall, we'll never get
>> another event. Am I reading the code wrong?
>
> No, maybe I expressed myself ambiguously: I meant to say that the
> event should be delivered from hvm_do_resume(), but _outside_ the
> conditional guarding the call to hvm_inject_trap(). Otherwise things
> would have been worse than when doing it inside hvm_inject_trap().

While working on this patch, I've had a new idea, which would require fewer changes and fix the problem more elegantly, if validated. Looking at vmx_idtv_reinject(), the real problem seems to be that VM_ENTRY_INTR_INFO is written to directly:

3229 static void vmx_idtv_reinject(unsigned long idtv_info)
3230 {
3231
3232     /* Event delivery caused this intercept? Queue for redelivery. */
3233     if ( unlikely(idtv_info & INTR_INFO_VALID_MASK) )
3234     {
3235         if ( hvm_event_needs_reinjection(MASK_EXTR(idtv_info,
3236                                                    INTR_INFO_INTR_TYPE_MASK),
3237                                          idtv_info & INTR_INFO_VECTOR_MASK) )
3238         {
3239             /* See SDM 3B 25.7.1.1 and .2 for info about masking resvd bits. */
3240             __vmwrite(VM_ENTRY_INTR_INFO,
3241                       idtv_info & ~INTR_INFO_RESVD_BITS_MASK);
3242             if ( idtv_info & INTR_INFO_DELIVER_CODE_MASK )
3243             {
3244                 unsigned long ec;
3245
3246                 __vmread(IDT_VECTORING_ERROR_CODE, &ec);
3247                 __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, ec);
3248             }
3249         }
3250
3251         /*
3252          * Clear NMI-blocking interruptibility info if an NMI delivery faulted.
3253          * Re-delivery will re-set it (see SDM 3B 25.7.1.2).
3254          */
3255         if ( cpu_has_vmx_vnmi &&
3256              ((idtv_info & INTR_INFO_INTR_TYPE_MASK) ==
3257               MASK_INSR(X86_EVENTTYPE_NMI, INTR_INFO_INTR_TYPE_MASK)) )
3258         {
3259             unsigned long intr_info;
3260
3261             __vmread(GUEST_INTERRUPTIBILITY_INFO, &intr_info);
3262             __vmwrite(GUEST_INTERRUPTIBILITY_INFO,
3263                       intr_info & ~VMX_INTR_SHADOW_NMI);
3264         }
3265     }
3266 }

whereas the hypercall looks only at v->arch.hvm_vcpu.inject_trap.vector. Then we notice that the hypercall _fails_immediately_ with -EBUSY if v->arch.hvm_vcpu.inject_trap.vector is already set:

5922     case HVMOP_inject_trap:
5923     {
5924         xen_hvm_inject_trap_t tr;
5925         struct domain *d;
5926         struct vcpu *v;
5927
5928         if ( copy_from_guest(&tr, arg, 1 ) )
5929             return -EFAULT;
5930
5931         rc = rcu_lock_remote_domain_by_id(tr.domid, &d);
5932         if ( rc != 0 )
5933             return rc;
5934
5935         rc = -EINVAL;
5936         if ( !is_hvm_domain(d) )
5937             goto injtrap_fail;
5938
5939         rc = xsm_hvm_control(XSM_DM_PRIV, d, op);
5940         if ( rc )
5941             goto injtrap_fail;
5942
5943         rc = -ENOENT;
5944         if ( tr.vcpuid >= d->max_vcpus || (v = d->vcpu[tr.vcpuid]) == NULL )
5945             goto injtrap_fail;
5946
5947         if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
5948             rc = -EBUSY;
5949         else
5950         {
5951             v->arch.hvm_vcpu.inject_trap.vector = tr.vector;
5952             v->arch.hvm_vcpu.inject_trap.type = tr.type;
5953             v->arch.hvm_vcpu.inject_trap.error_code = tr.error_code;
5954             v->arch.hvm_vcpu.inject_trap.insn_len = tr.insn_len;
5955             v->arch.hvm_vcpu.inject_trap.cr2 = tr.cr2;
5956             rc = 0;
5957         }
5958
5959     injtrap_fail:
5960         rcu_unlock_domain(d);
5961         break;
5962     }

The conclusion: instead of __vmwrite(VM_ENTRY_INTR_INFO, ...); __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, ...) in vmx_idtv_reinject(), simply do what the hypercall would have done, i.e. write inject_trap.vector, inject_trap.type, etc. This way, the hypercall will fail immediately if there's a pending interrupt set on exit. Is this too good to be true? :)

Thanks,
Razvan
On 11/08/2016 11:19 AM, Razvan Cojocaru wrote:
> On 11/08/2016 10:15 AM, Jan Beulich wrote:
>>>>> On 07.11.16 at 18:01, <rcojocaru@bitdefender.com> wrote:
>>> On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>>>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
>>>>> The one-shot vm_event does sound reasonable. I could set a flag
>>>>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>>>>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>>>>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>>>>
>>>> Doing this in hvm_inject_trap() would not cover all cases, afaict.
>>>> I'd suggest doing this from hvm_do_resume() _after_ the
>>>> (conditional) call to hvm_inject_trap(), if there is _any_ event
>>>> pending.
>>>
>>> But that would only cover the hypercall-injected traps.
>>> [...]
>>> So if the next interrupt is not caused by the hypercall, we'll never get
>>> another event. Am I reading the code wrong?
>>
>> No, maybe I expressed myself ambiguously: I meant to say that the
>> event should be delivered from hvm_do_resume(), but _outside_ the
>> conditional guarding the call to hvm_inject_trap(). Otherwise things
>> would have been worse than when doing it inside hvm_inject_trap().
>
> While working on this patch, I've had a new idea, which would require
> fewer changes and fix the problem more elegantly, if validated.
> Looking at vmx_idtv_reinject(), the real problem seems to be that
> VM_ENTRY_INTR_INFO is written to directly:
> [...]
> whereas the hypercall looks only at v->arch.hvm_vcpu.inject_trap.vector.
> Then we notice that the hypercall _fails_immediately_ with -EBUSY if
> v->arch.hvm_vcpu.inject_trap.vector is already set:
> [...]
> 5947         if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
> 5948             rc = -EBUSY;

Actually, the fix should be even simpler than that: we can add " || hvm_event_pending(v)" to this if.

Objections?

Thanks,
Razvan
On 11/08/2016 11:39 AM, Razvan Cojocaru wrote:
> On 11/08/2016 11:19 AM, Razvan Cojocaru wrote:
>> On 11/08/2016 10:15 AM, Jan Beulich wrote:
>>>>>> On 07.11.16 at 18:01, <rcojocaru@bitdefender.com> wrote:
>>>> On 11/07/2016 06:10 PM, Jan Beulich wrote:
>>>>>>>> On 07.11.16 at 16:24, <rcojocaru@bitdefender.com> wrote:
>>>>>> The one-shot vm_event does sound reasonable. I could set a flag
>>>>>> per-VCPU, basically similar to v->arch.hvm_vcpu.inject_trap.vector, and
>>>>>> fire a single event from hvm_inject_trap() if it's set (then unset it) -
>>>>>> the flag would be set via an xc_monitor_next_interrupt() call in libxc.
>>>>>
>>>>> Doing this in hvm_inject_trap() would not cover all cases, afaict.
>>>>> I'd suggest doing this from hvm_do_resume() _after_ the
>>>>> (conditional) call to hvm_inject_trap(), if there is _any_ event
>>>>> pending.
>>>>
>>>> But that would only cover the hypercall-injected traps.
>>>> [...]
>>>> So if the next interrupt is not caused by the hypercall, we'll never get
>>>> another event. Am I reading the code wrong?
>>>
>>> No, maybe I expressed myself ambiguously: I meant to say that the
>>> event should be delivered from hvm_do_resume(), but _outside_ the
>>> conditional guarding the call to hvm_inject_trap(). Otherwise things
>>> would have been worse than when doing it inside hvm_inject_trap().
>>
>> While working on this patch, I've had a new idea, which would require
>> fewer changes and fix the problem more elegantly, if validated.
>> Looking at vmx_idtv_reinject(), the real problem seems to be that
>> VM_ENTRY_INTR_INFO is written to directly:
>> [...]
>> whereas the hypercall looks only at v->arch.hvm_vcpu.inject_trap.vector.
>> Then we notice that the hypercall _fails_immediately_ with -EBUSY if
>> v->arch.hvm_vcpu.inject_trap.vector is already set:
>> [...]
>> 5947         if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
>> 5948             rc = -EBUSY;
>
> Actually, the fix should be even simpler than that: we can add
> " || hvm_event_pending(v)" to this if.
>
> Objections?

Never mind, vmx_event_pending() has a fatal ASSERT that v == current:

1789 static int vmx_event_pending(struct vcpu *v)
1790 {
1791     unsigned long intr_info;
1792
1793     ASSERT(v == current);
1794     __vmread(VM_ENTRY_INTR_INFO, &intr_info);
1795
1796     return intr_info & INTR_INFO_VALID_MASK;
1797 }

The inject_trap.vector solution seems to be the only plausible one. Sorry for the spam.

Thanks,
Razvan
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 704fd64..58f5ae4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -535,8 +535,22 @@ void hvm_do_resume(struct vcpu *v)
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
     {
-        hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
+        unsigned int success = 0;
+
+        /* Check for already pending interrupts (races). */
+        if ( !hvm_event_pending(v) )
+        {
+            hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
+            success = 1;
+        }
+
         v->arch.hvm_vcpu.inject_trap.vector = -1;
+
+        hvm_monitor_injection_result(v->arch.hvm_vcpu.inject_trap.vector,
+                                     v->arch.hvm_vcpu.inject_trap.type,
+                                     v->arch.hvm_vcpu.inject_trap.error_code,
+                                     v->arch.hvm_vcpu.inject_trap.cr2,
+                                     success);
     }
 }