KVM: x86: Check for nested events if there is an injectable interrupt

Message ID jpgd2dgfp5s.fsf@redhat.com (mailing list archive)
State New, archived

Commit Message

Bandan Das July 8, 2014, 4:30 a.m. UTC
With commit b6b8a1451fc40412c57d1, which introduced
vmx_check_nested_events, checks for injectable interrupts happen
at different points in time for L1 and L2, which can cause a race.
The regression occurs because KVM_REQ_EVENT is always set when
nested_run_pending is set, even if there is no pending interrupt.
Consequently, there is a small window in which check_nested_events
returns without exiting to L1, but an interrupt arrives soon after
and is incorrectly injected into L2 by inject_pending_event.
Fix this by also checking for nested events when the check for an
injectable interrupt returns true.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/x86.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
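
To make the window concrete, below is a simplified sketch of the
inject_pending_event() flow this patch changes. The structure follows
the arch/x86/kvm/x86.c of this era, but the interleaving in the
comments is an illustration of the reported race, not a trace:

	static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
	{
		int r;

		/* ... reinjection of already-pending events elided ... */

		/* T0: L2 is active; give L1 a chance to intercept events. */
		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
			if (r != 0)
				return r;	/* vmexit to L1 instead */
		}

		/* ... NMI branch elided ... */

		/*
		 * T1: an interrupt raised between T0 and here is now
		 * visible. Pre-patch, it is queued straight into L2 even
		 * though L1 may want to intercept it; the fix repeats the
		 * check_nested_events() call at this point.
		 */
		if (kvm_cpu_has_injectable_intr(vcpu) &&
		    kvm_x86_ops->interrupt_allowed(vcpu))
			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
					    false);
		return 0;
	}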

Comments

Paolo Bonzini July 8, 2014, 5:50 a.m. UTC | #1
On 08/07/2014 06:30, Bandan Das wrote:
>
> With commit b6b8a1451fc40412c57d1, which introduced
> vmx_check_nested_events, checks for injectable interrupts happen
> at different points in time for L1 and L2, which can cause a race.
> The regression occurs because KVM_REQ_EVENT is always set when
> nested_run_pending is set, even if there is no pending interrupt.
> Consequently, there is a small window in which check_nested_events
> returns without exiting to L1, but an interrupt arrives soon after
> and is incorrectly injected into L2 by inject_pending_event.
> Fix this by also checking for nested events when the check for an
> injectable interrupt returns true.
>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 73537ec..56327a6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5907,6 +5907,19 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>  			kvm_x86_ops->set_nmi(vcpu);
>  		}
>  	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
> +		/*
> +		 * TODO/FIXME: We are calling check_nested_events again
> +		 * here to avoid a race condition. We should really be
> +		 * setting KVM_REQ_EVENT only on certain events
> +		 * and not unconditionally.
> +		 * See https://lkml.org/lkml/2014/7/2/60 for discussion
> +		 * about this proposal and current concerns
> +		 */
> +		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
> +			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
> +			if (r != 0)
> +				return r;
> +		}
>  		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
>  			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
>  					    false);
>

I think this should be done for NMI as well.

Jan, what do you think?  Can you run Jailhouse through this patch?

Paolo
Jan Kiszka July 8, 2014, 6:56 a.m. UTC | #2
On 2014-07-08 07:50, Paolo Bonzini wrote:
> On 08/07/2014 06:30, Bandan Das wrote:
>>
>> With commit b6b8a1451fc40412c57d1, which introduced
>> vmx_check_nested_events, checks for injectable interrupts happen
>> at different points in time for L1 and L2, which can cause a race.
>> The regression occurs because KVM_REQ_EVENT is always set when
>> nested_run_pending is set, even if there is no pending interrupt.
>> Consequently, there is a small window in which check_nested_events
>> returns without exiting to L1, but an interrupt arrives soon after
>> and is incorrectly injected into L2 by inject_pending_event.
>> Fix this by also checking for nested events when the check for an
>> injectable interrupt returns true.
>>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  arch/x86/kvm/x86.c | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 73537ec..56327a6 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5907,6 +5907,19 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>>              kvm_x86_ops->set_nmi(vcpu);
>>          }
>>      } else if (kvm_cpu_has_injectable_intr(vcpu)) {
>> +        /*
>> +         * TODO/FIXME: We are calling check_nested_events again
>> +         * here to avoid a race condition. We should really be
>> +         * setting KVM_REQ_EVENT only on certain events
>> +         * and not unconditionally.
>> +         * See https://lkml.org/lkml/2014/7/2/60 for discussion
>> +         * about this proposal and current concerns
>> +         */
>> +        if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
>> +            r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
>> +            if (r != 0)
>> +                return r;
>> +        }
>>          if (kvm_x86_ops->interrupt_allowed(vcpu)) {
>>              kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
>>                          false);
>>
> 
> I think this should be done for NMI as well.

I don't think arch.nmi_pending can flip asynchronously, only in the
context of the VCPU thread - in contrast to pending IRQ states.

> 
> Jan, what do you think?  Can you run Jailhouse through this patch?

Jailhouse seems fine with it, and it resolves the lockup of nested KVM
here as well.

Jan
Paolo Bonzini July 8, 2014, 8 a.m. UTC | #3
On 08/07/2014 08:56, Jan Kiszka wrote:
> I don't think arch.nmi_pending can flip asynchronously, only in the
> context of the VCPU thread - in contrast to pending IRQ states.

Right, only nmi_queued is changed from other threads.  /me should really 
look at the code instead of going from memory.
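
For reference, a sketch of process_nmi() from the x86.c of this era
(reconstructed from memory, so treat the details as approximate) shows
the split Jan describes: other threads only bump nmi_queued, and only
the vCPU thread folds it into nmi_pending:

	static void process_nmi(struct kvm_vcpu *vcpu)
	{
		unsigned limit = 2;

		/*
		 * x86 can latch at most one NMI while another executes;
		 * allow only one if an NMI is already being injected or
		 * NMIs are masked.
		 */
		if (kvm_x86_ops->get_nmi_mask(vcpu) || vcpu->arch.nmi_injected)
			limit = 1;

		/*
		 * Runs only on the vCPU thread: drain nmi_queued into
		 * nmi_pending, so nmi_pending never flips asynchronously.
		 */
		vcpu->arch.nmi_pending += atomic_xchg(&vcpu->arch.nmi_queued, 0);
		vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit);

		kvm_make_request(KVM_REQ_EVENT, vcpu);
	}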

>> Jan, what do you think?  Can you run Jailhouse through this patch?
>
> Jailhouse seems fine with it, and it resolves the lockup of nested KVM
> here as well.

Thinking more about it, I think this is the right fix.  Not setting 
KVM_REQ_EVENT in some cases can be an optimization, but it's not 
necessary.  Definitely there are other cases in which KVM_REQ_EVENT is 
set even though no event is pending---most notably during emulation of 
invalid guest state.
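
As a reminder of why a spurious KVM_REQ_EVENT is harmless,
vcpu_enter_guest() consumes it roughly as follows (simplified, details
hedged): the request only triggers a re-evaluation, and injection still
depends on an event actually being pending:

	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
		/*
		 * inject_pending_event() injects nothing unless an event
		 * is actually pending, so an extra pass here only costs
		 * a little time.
		 */
		if (inject_pending_event(vcpu, req_int_win) != 0)
			req_immediate_exit = true;
		else {
			/* open NMI/IRQ windows if something is waiting */
			if (vcpu->arch.nmi_pending)
				kvm_x86_ops->enable_nmi_window(vcpu);
			if (kvm_cpu_has_injectable_intr(vcpu) || req_int_win)
				kvm_x86_ops->enable_irq_window(vcpu);
		}
	}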

Paolo
Wanpeng Li July 8, 2014, 9:26 a.m. UTC | #4
On Tue, Jul 08, 2014 at 10:00:35AM +0200, Paolo Bonzini wrote:
>On 08/07/2014 08:56, Jan Kiszka wrote:
>>I don't think arch.nmi_pending can flip asynchronously, only in the
>>context of the VCPU thread - in contrast to pending IRQ states.
>
>Right, only nmi_queued is changed from other threads.  /me should
>really look at the code instead of going from memory.
>
>>>Jan, what do you think?  Can you run Jailhouse through this patch?
>>
>>Jailhouse seems fine with it, and it resolves the lockup of nested KVM
>>here as well.
>
>Thinking more about it, I think this is the right fix.  Not setting
>KVM_REQ_EVENT in some cases can be an optimization, but it's not
>necessary.  Definitely there are other cases in which KVM_REQ_EVENT
>is set even though no event is pending---most notably during
>emulation of invalid guest state.

Anyway, 

Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>

>
>Paolo

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 73537ec..56327a6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5907,6 +5907,19 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
 			kvm_x86_ops->set_nmi(vcpu);
 		}
 	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
+		/*
+		 * TODO/FIXME: We are calling check_nested_events again
+		 * here to avoid a race condition. We should really be
+		 * setting KVM_REQ_EVENT only on certain events
+		 * and not unconditionally.
+		 * See https://lkml.org/lkml/2014/7/2/60 for discussion
+		 * about this proposal and current concerns
+		 */
+		if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events) {
+			r = kvm_x86_ops->check_nested_events(vcpu, req_int_win);
+			if (r != 0)
+				return r;
+		}
 		if (kvm_x86_ops->interrupt_allowed(vcpu)) {
 			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
 					    false);