From patchwork Wed Dec 8 17:10:43 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Har'El X-Patchwork-Id: 391242 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id oB8HAtK9020359 for ; Wed, 8 Dec 2010 17:10:55 GMT Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754291Ab0LHRKr (ORCPT ); Wed, 8 Dec 2010 12:10:47 -0500 Received: from mtagate1.uk.ibm.com ([194.196.100.161]:38434 "EHLO mtagate1.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752611Ab0LHRKq (ORCPT ); Wed, 8 Dec 2010 12:10:46 -0500 Received: from d06nrmr1507.portsmouth.uk.ibm.com (d06nrmr1507.portsmouth.uk.ibm.com [9.149.38.233]) by mtagate1.uk.ibm.com (8.13.1/8.13.1) with ESMTP id oB8HAjjZ021133 for ; Wed, 8 Dec 2010 17:10:45 GMT Received: from d06av08.portsmouth.uk.ibm.com (d06av08.portsmouth.uk.ibm.com [9.149.37.249]) by d06nrmr1507.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id oB8HAlEi2830498 for ; Wed, 8 Dec 2010 17:10:47 GMT Received: from d06av08.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av08.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id oB8HAjiR012043 for ; Wed, 8 Dec 2010 17:10:45 GMT Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.217]) by d06av08.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id oB8HAiPn012040 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 8 Dec 2010 17:10:44 GMT Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1]) by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id oB8HAhHf008783; Wed, 8 Dec 2010 19:10:44 +0200 Received: (from nyh@localhost) by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id oB8HAhMJ008781; Wed, 8 Dec 2010 19:10:43 +0200 Date: Wed, 8 Dec 2010 19:10:43 +0200 Message-Id: <201012081710.oB8HAhMJ008781@rice.haifa.ibm.com> X-Authentication-Warning: rice.haifa.ibm.com: nyh set sender to "Nadav Har'El" using -f Cc: gleb@redhat.com, avi@redhat.com To: kvm@vger.kernel.org From: "Nadav Har'El" References: <1291827596-nyh@il.ibm.com> Subject: [PATCH 21/28] nVMX: Correct handling of interrupt injection Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.3 (demeter1.kernel.org [140.211.167.41]); Wed, 08 Dec 2010 17:10:55 +0000 (UTC) --- .before/arch/x86/kvm/vmx.c 2010-12-08 18:56:51.000000000 +0200 +++ .after/arch/x86/kvm/vmx.c 2010-12-08 18:56:51.000000000 +0200 @@ -3462,9 +3462,25 @@ out: return ret; } +/* + * In nested virtualization, check if L1 asked to exit on external interrupts. + * For most existing hypervisors, this will always return true. + */ +static bool nested_exit_on_intr(struct kvm_vcpu *vcpu) +{ + return get_vmcs12_fields(vcpu)->pin_based_vm_exec_control & + PIN_BASED_EXT_INTR_MASK; +} + static void enable_irq_window(struct kvm_vcpu *vcpu) { u32 cpu_based_vm_exec_control; + if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) + /* We can get here when nested_run_pending caused + * vmx_interrupt_allowed() to return false. In this case, do + * nothing - the interrupt will be injected later. + */ + return; cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL); cpu_based_vm_exec_control |= CPU_BASED_VIRTUAL_INTR_PENDING; @@ -3578,6 +3594,13 @@ static void vmx_set_nmi_mask(struct kvm_ static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu) { + if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) { + if (to_vmx(vcpu)->nested.nested_run_pending) + return 0; + nested_vmx_vmexit(vcpu, true); + /* fall through to normal code, but now in L1, not L2 */ + } + return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) && !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)); @@ -5126,6 +5149,14 @@ static int vmx_handle_exit(struct kvm_vc if (enable_ept && is_paging(vcpu)) vcpu->arch.cr3 = vmcs_readl(GUEST_CR3); + /* + * the KVM_REQ_EVENT optimization bit is only on for one entry, and if + * we did not inject a still-pending event to L1 now because of + * nested_run_pending, we need to re-enable this bit. + */ + if (vmx->nested.nested_run_pending) + kvm_make_request(KVM_REQ_EVENT, vcpu); + if (exit_reason == EXIT_REASON_VMLAUNCH || exit_reason == EXIT_REASON_VMRESUME) vmx->nested.nested_run_pending = 1; @@ -5317,6 +5348,8 @@ static void vmx_complete_interrupts(stru static void vmx_cancel_injection(struct kvm_vcpu *vcpu) { + if (is_guest_mode(vcpu)) + return; __vmx_complete_interrupts(to_vmx(vcpu), vmcs_read32(VM_ENTRY_INTR_INFO_FIELD), VM_ENTRY_INSTRUCTION_LEN,