[v2,0/7] KVM: nVMX: Fix IPIv vs. nested posted interrupts

Message ID 20240906043413.1049633-1-seanjc@google.com (mailing list archive)
State New, archived

Commit Message

Sean Christopherson Sept. 6, 2024, 4:34 a.m. UTC
Fix a bug where KVM injects L2's nested posted interrupt into L1 as a
nested VM-Exit instead of triggering PI processing.  The actual bug is
technically a generic nested posted interrupts problem, but due to the
way that KVM handles interrupt delivery, the issue is mostly limited to
IPI virtualization being enabled.

Found by the nested posted interrupt KUT (kvm-unit-tests) test on SPR
(Sapphire Rapids).

If it weren't for an annoying TOCTOU bug waiting to happen, the fix would
be quite simple, e.g. it's really just:

v2:
 - Split kvm_get_apic_interrupt() into has+ack to avoid marking the IRQ as
   in-service in vmcs02 instead of vmcs01. [Nathan]
 - Gather reviews, but only for the patches that didn't meaningfully change
   (all two of them). [Chao]
 - Drop Cc: stable@ from all patches.  For real world hypervisors, this is
   unlikely to cause functional issues, only loss of IPI virtualization
   performance due to the unnecessary VM-Exit.  Whereas, as evidenced by my
   screwup in v1, this code is plenty subtle enough to introduce bugs.
 - Drop the patch to store nested.posted_intr_nv as an int, as there is no need
   to explicitly match -1 (as a signed int) in this approach.
 - Add a patch to assert vcpu->mutex is held when getting vmcs12, as I was
   "this" close to yanking out nested.posted_intr_nv, until I realized that
   accessing a different vCPU's vmcs12 in the IPI path is unsafe.
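
The has+ack split described in the first item can be illustrated with a toy
model.  This is only a sketch under stated assumptions, not KVM's code: the
toy_* names, the struct layout and the bitmap helpers are all hypothetical,
and priority handling (PPR/TPR) is ignored.  The point it shows is that the
"has" query has no side effects, so a caller can inspect the pending vector
and decide where the IRQ belongs before anything is marked in-service.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VECTORS 256

/* Hypothetical toy APIC: one bit per vector, requested (IRR) and in-service (ISR). */
struct toy_apic {
	uint64_t irr[TOY_NR_VECTORS / 64];
	uint64_t isr[TOY_NR_VECTORS / 64];
};

static inline void toy_set_bit(uint64_t *map, int vec)
{
	map[vec / 64] |= 1ULL << (vec % 64);
}

static inline void toy_clear_bit(uint64_t *map, int vec)
{
	map[vec / 64] &= ~(1ULL << (vec % 64));
}

static inline bool toy_test_bit(const uint64_t *map, int vec)
{
	return (map[vec / 64] >> (vec % 64)) & 1;
}

/* Record a requested interrupt in the IRR. */
static inline void toy_request_irq(struct toy_apic *apic, int vec)
{
	assert(vec >= 0 && vec < TOY_NR_VECTORS);
	toy_set_bit(apic->irr, vec);
}

/* "has" phase: return the highest pending vector, or -1.  No side effects. */
static inline int toy_has_interrupt(const struct toy_apic *apic)
{
	for (int vec = TOY_NR_VECTORS - 1; vec >= 0; vec--) {
		if (toy_test_bit(apic->irr, vec))
			return vec;
	}
	return -1;
}

/* "ack" phase: move the vector from IRR to ISR, i.e. mark it in-service. */
static inline void toy_ack_interrupt(struct toy_apic *apic, int vec)
{
	assert(vec >= 0 && vec < TOY_NR_VECTORS);
	toy_clear_bit(apic->irr, vec);
	toy_set_bit(apic->isr, vec);
}
```

A caller in this model would call toy_has_interrupt() first, compare the
result against whatever vector it cares about, and only then call
toy_ack_interrupt() once it knows which context should own the in-service
state.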

v1: https://lore.kernel.org/all/20240720000138.3027780-1-seanjc@google.com

Sean Christopherson (7):
  KVM: x86: Move "ack" phase of local APIC IRQ delivery to separate API
  KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection
    site
  KVM: nVMX: Suppress external interrupt VM-Exit injection if there's no
    IRQ
  KVM: nVMX: Detect nested posted interrupt NV at nested VM-Exit
    injection
  KVM: x86: Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt()
  KVM: nVMX: Explicitly invalidate posted_intr_nv if PI is disabled at
    VM-Enter
  KVM: nVMX: Assert that vcpu->mutex is held when accessing secondary
    VMCSes

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/irq.c              | 10 ++++--
 arch/x86/kvm/lapic.c            |  9 +++---
 arch/x86/kvm/lapic.h            |  2 +-
 arch/x86/kvm/vmx/nested.c       | 57 ++++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/nested.h       |  6 ++++
 arch/x86/kvm/vmx/vmx.c          |  7 ++++
 7 files changed, 72 insertions(+), 20 deletions(-)


base-commit: 332d2c1d713e232e163386c35a3ba0c1b90df83f

Comments

Sean Christopherson Sept. 10, 2024, 4:56 a.m. UTC | #1
On Thu, 05 Sep 2024 21:34:06 -0700, Sean Christopherson wrote:
> Fix a bug where KVM injects L2's nested posted interrupt into L1 as a
> nested VM-Exit instead of triggering PI processing.  The actual bug is
> technically a generic nested posted interrupts problem, but due to the
> way that KVM handles interrupt delivery, the issue is mostly limited to
> IPI virtualization being enabled.
> 
> Found by the nested posted interrupt KUT test on SPR.
> 
> [...]

Trying this again, hopefully with less awful testing this time...

Applied to kvm-x86 vmx.

[1/7] KVM: x86: Move "ack" phase of local APIC IRQ delivery to separate API
      https://github.com/kvm-x86/linux/commit/a194a3a13ce0
[2/7] KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection site
      https://github.com/kvm-x86/linux/commit/363010e1dd0e
[3/7] KVM: nVMX: Suppress external interrupt VM-Exit injection if there's no IRQ
      https://github.com/kvm-x86/linux/commit/8c23670f2b00
[4/7] KVM: nVMX: Detect nested posted interrupt NV at nested VM-Exit injection
      https://github.com/kvm-x86/linux/commit/6e0b456547f4
[5/7] KVM: x86: Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt()
      https://github.com/kvm-x86/linux/commit/aa9477966aab
[6/7] KVM: nVMX: Explicitly invalidate posted_intr_nv if PI is disabled at VM-Enter
      https://github.com/kvm-x86/linux/commit/1ed0f119c5ff
[7/7] KVM: nVMX: Assert that vcpu->mutex is held when accessing secondary VMCSes
      https://github.com/kvm-x86/linux/commit/3dde46a21aa7

--
https://github.com/kvm-x86/linux/tree/next

Nathan Chancellor Sept. 10, 2024, 4:22 p.m. UTC | #2
On Mon, Sep 09, 2024 at 09:56:42PM -0700, Sean Christopherson wrote:
> On Thu, 05 Sep 2024 21:34:06 -0700, Sean Christopherson wrote:
> > Fix a bug where KVM injects L2's nested posted interrupt into L1 as a
> > nested VM-Exit instead of triggering PI processing.  The actual bug is
> > technically a generic nested posted interrupts problem, but due to the
> > way that KVM handles interrupt delivery, the issue is mostly limited to
> > IPI virtualization being enabled.
> > 
> > Found by the nested posted interrupt KUT test on SPR.
> > 
> > [...]
> 
> Trying this again, hopefully with less awful testing this time...

I meant to reply yesterday but I guess I lost track of time. This passed
my testing on all my machines, so it is not as bad as last time :)

Cheers,
Nathan

Sean Christopherson Sept. 10, 2024, 5:43 p.m. UTC | #3
On Tue, Sep 10, 2024, Nathan Chancellor wrote:
> On Mon, Sep 09, 2024 at 09:56:42PM -0700, Sean Christopherson wrote:
> > On Thu, 05 Sep 2024 21:34:06 -0700, Sean Christopherson wrote:
> > > Fix a bug where KVM injects L2's nested posted interrupt into L1 as a
> > > nested VM-Exit instead of triggering PI processing.  The actual bug is
> > > technically a generic nested posted interrupts problem, but due to the
> > > way that KVM handles interrupt delivery, the issue is mostly limited to
> > > IPI virtualization being enabled.
> > > 
> > > Found by the nested posted interrupt KUT test on SPR.
> > > 
> > > [...]
> > 
> > Trying this again, hopefully with less awful testing this time...
> 
> I meant to reply yesterday but I guess I lost track of time. This passed
> my testing on all my machines, so it is not as bad as last time :)

Mission Suck Less Accomplished!

Thanks much!

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f7dde74ff565..b07805daedf5 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4288,6 +4288,15 @@  static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
                        return -EBUSY;
                if (!nested_exit_on_intr(vcpu))
                        goto no_vmexit;
+
+               if (nested_cpu_has_posted_intr(get_vmcs12(vcpu)) &&
+                   kvm_apic_has_interrupt(vcpu) == vmx->nested.posted_intr_nv) {
+                       vmx->nested.pi_pending = true;
+                       kvm_apic_clear_irr(vcpu, vmx->nested.posted_intr_nv);
+                       goto no_vmexit;
+               }
+
                nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
                return 0;
        }
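
The decision made by the hunk above can be modelled in isolation.  The
following is a minimal single-threaded sketch, not the kernel code: the
toy_* names and the struct are hypothetical, pending_vector stands in for
the result of kvm_apic_has_interrupt(), and clearing it stands in for
kvm_apic_clear_irr().  The early return for "no pending IRQ" is an
assumption loosely based on the title of patch 3.  Because it is
single-threaded, it does not model the TOCTOU problem mentioned in the
cover letter.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_action {
	TOY_NO_VMEXIT,
	TOY_VMEXIT_EXTERNAL_INTERRUPT,
};

/* Hypothetical stand-in for the nested state consulted by the check. */
struct toy_nested {
	bool posted_intr_enabled;	/* L1 enabled posted interrupts for L2 */
	int posted_intr_nv;		/* posted interrupt notification vector */
	bool pi_pending;		/* posted interrupt processing is needed */
	int pending_vector;		/* highest pending IRQ, -1 if none */
};

static inline enum toy_action toy_check_external_interrupt(struct toy_nested *n)
{
	assert(n);

	/* Assumption: with no pending IRQ there is nothing to inject. */
	if (n->pending_vector < 0)
		return TOY_NO_VMEXIT;

	/*
	 * If the pending IRQ is the notification vector, request posted
	 * interrupt processing and consume the IRQ instead of exiting to L1.
	 */
	if (n->posted_intr_enabled && n->pending_vector == n->posted_intr_nv) {
		n->pi_pending = true;
		n->pending_vector = -1;
		return TOY_NO_VMEXIT;
	}

	/* Any other IRQ is reflected to L1 as an external interrupt exit. */
	return TOY_VMEXIT_EXTERNAL_INTERRUPT;
}
```

In this model, a pending vector equal to posted_intr_nv sets pi_pending and
yields TOY_NO_VMEXIT, while any other pending vector (or the same vector
with posted interrupts disabled) yields TOY_VMEXIT_EXTERNAL_INTERRUPT.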