Message ID | jpgd2dgfp5s.fsf@redhat.com (mailing list archive)
---|---
State | New, archived
On 08/07/2014 06:30, Bandan Das wrote:
> With commit b6b8a1451fc40412c57d1, which introduced
> vmx_check_nested_events, checks for injectable interrupts happen
> at different points in time for L1 and L2, which could potentially
> cause a race. The regression occurs because KVM_REQ_EVENT is always
> set when nested_run_pending is set, even if there's no pending
> interrupt. Consequently, there is a small window in which
> check_nested_events returns without exiting to L1, but an interrupt
> comes through soon after and incorrectly gets injected to L2 by
> inject_pending_event. Fix this by also checking for nested events
> when a check for an injectable interrupt returns true.
>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 73537ec..56327a6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5907,6 +5907,19 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>  			kvm_x86_ops->set_nmi(vcpu);
>  		}
>  	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
> +		/*
> +		 * TODO/FIXME: We are calling check_nested_events again
> +		 * here to avoid a race condition. We should really be
> +		 * setting KVM_REQ_EVENT only on certain events
> +		 * and not unconditionally.
> +		 * See https://lkml.org/lkml/2014/7/2/60 for discussion
> +		 * about this proposal and current concerns
> +		 */
> +		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
> +			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
> +			if (r != 0)
> +				return r;
> +		}
>  		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
>  			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
>  					    false);

I think this should be done for NMI as well.

Jan, what do you think? Can you run Jailhouse through this patch?

Paolo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2014-07-08 07:50, Paolo Bonzini wrote:
> I think this should be done for NMI as well.

I don't think arch.nmi_pending can flip asynchronously, only in the
context of the VCPU thread - in contrast to pending IRQ states.

> Jan, what do you think? Can you run Jailhouse through this patch?

Jailhouse seems fine with it, and it resolves the lockup of nested KVM
here as well.

Jan
On 08/07/2014 08:56, Jan Kiszka wrote:
> I don't think arch.nmi_pending can flip asynchronously, only in the
> context of the VCPU thread - in contrast to pending IRQ states.

Right, only nmi_queued is changed from other threads. /me should really
look at the code instead of going from memory.

>> Jan, what do you think? Can you run Jailhouse through this patch?
>
> Jailhouse seems fine with it, and it resolves the lockup of nested KVM
> here as well.

Thinking more about it, I think this is the right fix. Not setting
KVM_REQ_EVENT in some cases can be an optimization, but it's not
necessary. There are definitely other cases in which KVM_REQ_EVENT is
set even though no event is pending---most notably during emulation of
invalid guest state.

Paolo
On Tue, Jul 08, 2014 at 10:00:35AM +0200, Paolo Bonzini wrote:
> Thinking more about it, I think this is the right fix. Not setting
> KVM_REQ_EVENT in some cases can be an optimization, but it's not
> necessary.

Anyway,

Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>
```diff
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 73537ec..56327a6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5907,6 +5907,19 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
 			kvm_x86_ops->set_nmi(vcpu);
 		}
 	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
+		/*
+		 * TODO/FIXME: We are calling check_nested_events again
+		 * here to avoid a race condition. We should really be
+		 * setting KVM_REQ_EVENT only on certain events
+		 * and not unconditionally.
+		 * See https://lkml.org/lkml/2014/7/2/60 for discussion
+		 * about this proposal and current concerns
+		 */
+		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
+			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
+			if (r != 0)
+				return r;
+		}
 		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
 			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
 					    false);
```
With commit b6b8a1451fc40412c57d1, which introduced
vmx_check_nested_events, checks for injectable interrupts happen at
different points in time for L1 and L2, which could potentially cause a
race. The regression occurs because KVM_REQ_EVENT is always set when
nested_run_pending is set, even if there's no pending interrupt.
Consequently, there is a small window in which check_nested_events
returns without exiting to L1, but an interrupt comes through soon after
and incorrectly gets injected to L2 by inject_pending_event. Fix this by
also checking for nested events when a check for an injectable interrupt
returns true.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/x86.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)