[RFC] Further hack request_interrupt_window handling to work around kvm_cpu_has_interrupt() nesting breakage

Message ID 62918f65ec78f8990278a6a0db0567968fa23e49.camel@infradead.org (mailing list archive)
State New, archived

Commit Message

David Woodhouse Nov. 12, 2020, 1:03 p.m. UTC
In kvm_cpu_has_interrupt() we see the following FIXME:

	/*
	 * FIXME: interrupt.injected represents an interrupt that it's
	 * side-effects have already been applied (e.g. bit from IRR
	 * already moved to ISR). Therefore, it is incorrect to rely
	 * on interrupt.injected to know if there is a pending
	 * interrupt in the user-mode LAPIC.
	 * This leads to nVMX/nSVM not be able to distinguish
	 * if it should exit from L2 to L1 on EXTERNAL_INTERRUPT on
	 * pending interrupt or should re-inject an injected
	 * interrupt.
	 */

I'm using nested VMX for testing while I add split-irqchip support to
my VMM, and I see the vCPU lock up when attempting to deliver an interrupt.

What seems to happen is that request_interrupt_window is set, causing
an immediate vmexit because an IRQ *can* be delivered. But then
kvm_vcpu_ready_for_interrupt_injection() returns false, because
kvm_cpu_has_interrupt() is true.

Because that returns false, the kernel just continues looping in
vcpu_run(), constantly vmexiting and going right back in.

This utterly naïve hack makes my L2 guest boot properly, by not
enabling the irq window when we were going to ignore the exit anyway.
Is there a better fix?
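
To make the loop described above concrete, here is a toy userspace model
of it. This is only a sketch under stated assumptions: struct toy_vcpu and
every toy_*() helper are hypothetical stand-ins invented for illustration,
not KVM code, and the readiness check is reduced to just the two conditions
discussed in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical types and helpers, not KVM code. */
struct toy_vcpu {
	bool request_interrupt_window; /* userspace asked for an IRQ window */
	bool has_interrupt;            /* stands in for kvm_cpu_has_interrupt() */
	bool guest_irqs_enabled;       /* an IRQ *can* be delivered right now */
};

/* Simplified stand-in for kvm_vcpu_ready_for_interrupt_injection(). */
static bool toy_ready_for_injection(const struct toy_vcpu *v)
{
	return v->guest_irqs_enabled && !v->has_interrupt;
}

/*
 * One pass through guest entry. Returns true if the modelled hardware
 * exits immediately because the requested interrupt window is already open.
 */
static bool toy_enter_guest(const struct toy_vcpu *v, bool apply_hack)
{
	bool req_int_win = v->request_interrupt_window;

	/* The hack: skip the window if the exit would be ignored anyway. */
	if (apply_hack && v->has_interrupt)
		req_int_win = false;

	return req_int_win && v->guest_irqs_enabled;
}

/*
 * Count the immediate exits that are ignored (neither running the guest
 * nor returning to userspace), giving up after max_iters passes.
 */
static int toy_run(const struct toy_vcpu *v, bool apply_hack, int max_iters)
{
	int wasted = 0;

	assert(max_iters >= 0);
	for (int i = 0; i < max_iters; i++) {
		if (!toy_enter_guest(v, apply_hack))
			break;		/* guest really runs, no spurious exit */
		if (toy_ready_for_injection(v))
			break;		/* exit to userspace so it can inject */
		wasted++;		/* exit ignored, go straight back in */
	}
	return wasted;
}
```

In this model, a vCPU with all three flags set burns every iteration it is
given when apply_hack is false, and none when it is true. With
has_interrupt clear, the first window exit goes straight to userspace
either way.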

I must also confess I'm working on a slightly older kernel in L1, and
have forward-ported this patch to a more recent tree without actually
testing it, because from inspection it looks like exactly the same issue
still exists there.
Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 397f599b20e5..e23f0c8b4a16 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8830,7 +8830,10 @@  static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		}
 
 		inject_pending_event(vcpu, &req_immediate_exit);
-		if (req_int_win)
+		/* Don't enable the interrupt window for userspace if
+		 * kvm_cpu_has_interrupt() is set and we'd never actually
+		 * exit with ready_for_interrupt_window set anyway. */
+		if (req_int_win && !kvm_cpu_has_interrupt(vcpu))
 			kvm_x86_ops.enable_irq_window(vcpu);
 
 		if (kvm_lapic_enabled(vcpu)) {