From patchwork Wed Feb 20 13:01:47 2013
X-Patchwork-Submitter: Jan Kiszka
X-Patchwork-Id: 2167401
Message-ID: <5124C93B.50902@siemens.com>
Date: Wed, 20 Feb 2013 14:01:47 +0100
From: Jan Kiszka
To: Gleb Natapov, Marcelo Tosatti
CC: kvm, "Nadav Har'El", "Nakajima, Jun"
Subject: [PATCH] KVM: nVMX: Rework event injection and recovery

This aligns VMX more with SVM regarding event injection and recovery for
nested guests. The changes allow injecting interrupts directly from L0
into L2.

One difference from SVM is that we always transfer the pending event
injection into the architectural state of the VCPU and then drop it from
there if it turns out that we left L2 to enter L1.

VMX and SVM are now identical in how they recover event injections from
unperformed vmlaunch/vmresume: We detect that VM_ENTRY_INTR_INFO_FIELD
still contains a valid event and, if so, transfer its content into L1's
idt_vectoring_info_field.

To avoid incorrectly leaking into the architectural VCPU state an event
that L1 wants to inject, we skip cancellation on nested run.

Signed-off-by: Jan Kiszka
---
Survived moderate testing here and (currently) makes sense to me, but
please review very carefully. I wouldn't be surprised if I'm still
missing some subtle corner case.
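As a reading aid only (not part of the patch): the following stand-alone
toy program restates the recovery rule from the changelog with made-up
struct and helper names that merely mirror the vmcs12/vcpu fields touched
by the real diff further down. It is an illustrative sketch, not KVM code.

/*
 * Toy model only -- NOT kernel code.  On an emulated L2->L1 exit, drop
 * the event from the architectural VCPU state; if the (simulated)
 * VM_ENTRY_INTR_INFO_FIELD still holds a valid event because
 * vmlaunch/vmresume was never performed, hand it back to L1 via
 * idt_vectoring_info_field.
 */
#include <stdbool.h>
#include <stdio.h>

#define INTR_INFO_VALID_MASK (1u << 31)

struct model_vmcs12 {
        unsigned int vm_entry_intr_info_field;
        unsigned int idt_vectoring_info_field;
};

struct model_vcpu {
        bool nmi_injected;
        bool exception_pending;
        bool interrupt_pending;
        unsigned int hw_vm_entry_intr_info;     /* stands in for the VMCS field */
};

static void model_prepare_vmcs12(struct model_vcpu *vcpu,
                                 struct model_vmcs12 *vmcs12)
{
        /* drop what was transferred into the architectural state for L0 */
        vcpu->nmi_injected = false;
        vcpu->exception_pending = false;
        vcpu->interrupt_pending = false;

        if ((vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK) &&
            (vcpu->hw_vm_entry_intr_info & INTR_INFO_VALID_MASK)) {
                /* vmlaunch/vmresume was not performed: return event to L1 */
                vmcs12->idt_vectoring_info_field =
                        vmcs12->vm_entry_intr_info_field;
                vcpu->hw_vm_entry_intr_info = 0;
        }
        vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
}

int main(void)
{
        struct model_vcpu vcpu = {
                .hw_vm_entry_intr_info = INTR_INFO_VALID_MASK | 0x20,
        };
        struct model_vmcs12 vmcs12 = {
                .vm_entry_intr_info_field = INTR_INFO_VALID_MASK | 0x20,
        };

        model_prepare_vmcs12(&vcpu, &vmcs12);
        printf("idt_vectoring_info_field recovered for L1: %#x\n",
               vmcs12.idt_vectoring_info_field);
        return 0;
}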
 arch/x86/kvm/vmx.c | 57 +++++++++++++++++++++++----------------------------
 1 files changed, 26 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dd3a8a0..7d2fbd2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6489,8 +6489,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
 
 static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 {
-	if (is_guest_mode(&vmx->vcpu))
-		return;
 	__vmx_complete_interrupts(vmx, vmx->idt_vectoring_info,
 				  VM_EXIT_INSTRUCTION_LEN,
 				  IDT_VECTORING_ERROR_CODE);
@@ -6498,7 +6496,7 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 
 static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu))
+	if (to_vmx(vcpu)->nested.nested_run_pending)
 		return;
 	__vmx_complete_interrupts(to_vmx(vcpu),
 				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
@@ -6531,21 +6529,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned long debugctlmsr;
 
-	if (is_guest_mode(vcpu) && !vmx->nested.nested_run_pending) {
-		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		if (vmcs12->idt_vectoring_info_field &
-				VECTORING_INFO_VALID_MASK) {
-			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-				vmcs12->idt_vectoring_info_field);
-			vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
-				vmcs12->vm_exit_instruction_len);
-			if (vmcs12->idt_vectoring_info_field &
-					VECTORING_INFO_DELIVER_CODE_MASK)
-				vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
-					vmcs12->idt_vectoring_error_code);
-		}
-	}
-
 	/* Record the guest's net vcpu time for enforced NMI injections. */
 	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
 		vmx->entry_time = ktime_get();
@@ -6704,17 +6687,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	if (is_guest_mode(vcpu)) {
-		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		vmcs12->idt_vectoring_info_field = vmx->idt_vectoring_info;
-		if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
-			vmcs12->idt_vectoring_error_code =
-				vmcs_read32(IDT_VECTORING_ERROR_CODE);
-			vmcs12->vm_exit_instruction_len =
-				vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
-		}
-	}
-
 	vmx->loaded_vmcs->launched = 1;
 
 	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
@@ -7403,9 +7375,32 @@ void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
 	vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
 
-	/* clear vm-entry fields which are to be cleared on exit */
-	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
+	/* drop what we picked up for L0 via vmx_complete_interrupts */
+	vcpu->arch.nmi_injected = false;
+	kvm_clear_exception_queue(vcpu);
+	kvm_clear_interrupt_queue(vcpu);
+
+	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) &&
+	    vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK) {
+		/*
+		 * Preserve the event that was supposed to be injected
+		 * by emulating it would have been returned in
+		 * IDT_VECTORING_INFO_FIELD.
+		 */
+		if (vmcs_read32(VM_ENTRY_INTR_INFO_FIELD) &
+		    INTR_INFO_VALID_MASK) {
+			vmcs12->idt_vectoring_info_field =
+				vmcs12->vm_entry_intr_info_field;
+			vmcs12->idt_vectoring_error_code =
+				vmcs12->vm_entry_exception_error_code;
+			vmcs12->vm_exit_instruction_len =
+				vmcs12->vm_entry_instruction_len;
+			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
+		}
+
+		/* clear vm-entry fields which are to be cleared on exit */
 		vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
+	}
 }
 
 /*