From patchwork Mon Nov 28 04:18:54 2016
X-Patchwork-Submitter: Kyle Huey
X-Patchwork-Id: 9449077
From: Kyle Huey
To: Paolo Bonzini, Radim Krčmář, Thomas Gleixner, Ingo Molnar,
    "H. Peter Anvin", x86@kernel.org, Joerg Roedel
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/5] KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12
Date: Sun, 27 Nov 2016 20:18:54 -0800
Message-Id: <20161128041856.11420-4-khuey@kylehuey.com>
X-Mailer: git-send-email 2.10.2
In-Reply-To: <20161128041856.11420-1-khuey@kylehuey.com>
References: <20161128041856.11420-1-khuey@kylehuey.com>
X-Mailing-List: kvm@vger.kernel.org

We can't return both the pass/fail boolean for the vmcs and the upcoming
continue/exit-to-userspace boolean for skip_emulated_instruction out of
nested_vmx_check_vmcs12, so move skip_emulated_instruction out of it instead.

Additionally, VMLAUNCH/VMRESUME only trigger singlestep exceptions when they
advance the IP to the following instruction, not when they a) succeed, b) fail
MSR validation, or c) throw an exception. Add a
skip_emulated_instruction_no_trap variant that will not check RFLAGS.TF.

Signed-off-by: Kyle Huey
---
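Not part of the patch: a minimal, standalone model of the calling convention
this change establishes. The struct, its fields, and the helpers below are
simplified stand-ins rather than the real KVM symbols; the sketch only
illustrates that nested_vmx_check_vmcs12() now reports pass/fail and that each
caller decides whether to skip the emulated instruction on failure.

#include <stdio.h>

struct vcpu {
        long long current_vmptr;        /* -1 means no current vmcs12 is loaded */
        unsigned long rip;
};

/* Stand-in for the real helper: advance RIP past the emulated instruction. */
static void skip_emulated_instruction(struct vcpu *vcpu)
{
        vcpu->rip += 3;                 /* instruction length is illustrative only */
}

/* After this patch: reports pass/fail and no longer skips the instruction. */
static int nested_vmx_check_vmcs12(struct vcpu *vcpu)
{
        return vcpu->current_vmptr != -1;
}

/* Caller pattern now used by handle_vmread()/handle_vmwrite(). */
static int handle_vmread(struct vcpu *vcpu)
{
        if (!nested_vmx_check_vmcs12(vcpu)) {
                /* The caller, not the check helper, performs the skip. */
                skip_emulated_instruction(vcpu);
                return 1;
        }
        /* ... decode the instruction and read the vmcs12 field ... */
        skip_emulated_instruction(vcpu);
        return 1;
}

int main(void)
{
        struct vcpu v = { .current_vmptr = -1, .rip = 0x1000 };

        handle_vmread(&v);
        printf("rip after a failed VMREAD: %#lx\n", v.rip);
        return 0;
}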

 arch/x86/kvm/vmx.c | 60 +++++++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f2f9cf5..9cc4c41 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2455,28 +2455,33 @@ static void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
 		interruptibility |= GUEST_INTR_STATE_MOV_SS;
 	else if (mask & KVM_X86_SHADOW_INT_STI)
 		interruptibility |= GUEST_INTR_STATE_STI;
 
 	if ((interruptibility != interruptibility_old))
 		vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, interruptibility);
 }
 
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static void skip_emulated_instruction_no_trap(struct kvm_vcpu *vcpu)
 {
 	unsigned long rip;
 
 	rip = kvm_rip_read(vcpu);
 	rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
 	kvm_rip_write(vcpu, rip);
 
 	/* skipping an emulated instruction also counts */
 	vmx_set_interrupt_shadow(vcpu, 0);
 }
 
+static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+	skip_emulated_instruction_no_trap(vcpu);
+}
+
 /*
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
  */
 static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 
@@ -7319,33 +7324,36 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
  * VMX instructions which assume a current vmcs12 (i.e., that VMPTRLD was
  * used before) all generate the same failure when it is missing.
  */
 static int nested_vmx_check_vmcs12(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	if (vmx->nested.current_vmptr == -1ull) {
 		nested_vmx_failInvalid(vcpu);
-		skip_emulated_instruction(vcpu);
 		return 0;
 	}
 	return 1;
 }
 
 static int handle_vmread(struct kvm_vcpu *vcpu)
 {
 	unsigned long field;
 	u64 field_value;
 	unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
 	u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
 	gva_t gva = 0;
 
-	if (!nested_vmx_check_permission(vcpu) ||
-	    !nested_vmx_check_vmcs12(vcpu))
+	if (!nested_vmx_check_permission(vcpu))
+		return 1;
+
+	if (!nested_vmx_check_vmcs12(vcpu)) {
+		skip_emulated_instruction(vcpu);
 		return 1;
+	}
 
 	/* Decode instruction info and find the field to read */
 	field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
 	/* Read the field, zero-extended to a u64 field_value */
 	if (vmcs12_read_any(vcpu, field, &field_value) < 0) {
 		nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
 		skip_emulated_instruction(vcpu);
 		return 1;
@@ -7383,19 +7391,23 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
 	 * mode, and eventually we need to write that into a field of several
 	 * possible lengths. The code below first zero-extends the value to 64
 	 * bit (field_value), and then copies only the appropriate number of
 	 * bits into the vmcs12 field.
 	 */
 	u64 field_value = 0;
 	struct x86_exception e;
 
-	if (!nested_vmx_check_permission(vcpu) ||
-	    !nested_vmx_check_vmcs12(vcpu))
+	if (!nested_vmx_check_permission(vcpu))
+		return 1;
+
+	if (!nested_vmx_check_vmcs12(vcpu)) {
+		skip_emulated_instruction(vcpu);
 		return 1;
+	}
 
 	if (vmx_instruction_info & (1u << 10))
 		field_value = kvm_register_readl(vcpu,
 			(((vmx_instruction_info) >> 3) & 0xf));
 	else {
 		if (get_vmx_mem_address(vcpu, exit_qualification,
 				vmx_instruction_info, false, &gva))
 			return 1;
@@ -10041,21 +10053,22 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 {
 	struct vmcs12 *vmcs12;
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	int cpu;
 	struct loaded_vmcs *vmcs02;
 	bool ia32e;
 	u32 msr_entry_idx;
 
-	if (!nested_vmx_check_permission(vcpu) ||
-	    !nested_vmx_check_vmcs12(vcpu))
+	if (!nested_vmx_check_permission(vcpu))
 		return 1;
 
-	skip_emulated_instruction(vcpu);
+	if (!nested_vmx_check_vmcs12(vcpu))
+		goto out;
+
 	vmcs12 = get_vmcs12(vcpu);
 
 	if (enable_shadow_vmcs)
 		copy_shadow_to_vmcs12(vmx);
 
 	/*
 	 * The nested entry process starts with enforcing various prerequisites
 	 * on vmcs12 as required by the Intel SDM, and act appropriately when
@@ -10065,43 +10078,43 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	 * To speed up the normal (success) code path, we should avoid checking
 	 * for misconfigurations which will anyway be caught by the processor
 	 * when using the merged vmcs02.
 	 */
 	if (vmcs12->launch_state == launch) {
 		nested_vmx_failValid(vcpu,
 			launch ? VMXERR_VMLAUNCH_NONCLEAR_VMCS
 			       : VMXERR_VMRESUME_NONLAUNCHED_VMCS);
-		return 1;
+		goto out;
 	}
 
 	if (vmcs12->guest_activity_state != GUEST_ACTIVITY_ACTIVE &&
 	    vmcs12->guest_activity_state != GUEST_ACTIVITY_HLT) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (!nested_get_vmcs12_pages(vcpu, vmcs12)) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (nested_vmx_check_msr_bitmap_controls(vcpu, vmcs12)) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (nested_vmx_check_apicv_controls(vcpu, vmcs12)) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (nested_vmx_check_msr_switch_controls(vcpu, vmcs12)) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (!vmx_control_verify(vmcs12->cpu_based_vm_exec_control,
 				vmx->nested.nested_vmx_true_procbased_ctls_low,
 				vmx->nested.nested_vmx_procbased_ctls_high) ||
 	    !vmx_control_verify(vmcs12->secondary_vm_exec_control,
 				vmx->nested.nested_vmx_secondary_ctls_low,
 				vmx->nested.nested_vmx_secondary_ctls_high) ||
@@ -10111,36 +10124,36 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	    !vmx_control_verify(vmcs12->vm_exit_controls,
 				vmx->nested.nested_vmx_true_exit_ctls_low,
 				vmx->nested.nested_vmx_exit_ctls_high) ||
 	    !vmx_control_verify(vmcs12->vm_entry_controls,
 				vmx->nested.nested_vmx_true_entry_ctls_low,
 				vmx->nested.nested_vmx_entry_ctls_high))
 	{
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (((vmcs12->host_cr0 & VMXON_CR0_ALWAYSON) != VMXON_CR0_ALWAYSON) ||
 	    ((vmcs12->host_cr4 & VMXON_CR4_ALWAYSON) != VMXON_CR4_ALWAYSON)) {
 		nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
-		return 1;
+		goto out;
 	}
 
 	if (!nested_cr0_valid(vcpu, vmcs12->guest_cr0) ||
 	    ((vmcs12->guest_cr4 & VMXON_CR4_ALWAYSON) != VMXON_CR4_ALWAYSON)) {
 		nested_vmx_entry_failure(vcpu, vmcs12,
 			EXIT_REASON_INVALID_STATE, ENTRY_FAIL_DEFAULT);
-		return 1;
+		goto out;
 	}
 
 	if (vmcs12->vmcs_link_pointer != -1ull) {
 		nested_vmx_entry_failure(vcpu, vmcs12,
 			EXIT_REASON_INVALID_STATE, ENTRY_FAIL_VMCS_LINK_PTR);
-		return 1;
+		goto out;
 	}
 
 	/*
 	 * If the load IA32_EFER VM-entry control is 1, the following checks
 	 * are performed on the field for the IA32_EFER MSR:
 	 * - Bits reserved in the IA32_EFER MSR must be 0.
 	 * - Bit 10 (corresponding to IA32_EFER.LMA) must equal the value of
 	 *   the IA-32e mode guest VM-exit control. It must also be identical
@@ -10150,17 +10163,17 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER) {
 		ia32e = (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) != 0;
 		if (!kvm_valid_efer(vcpu, vmcs12->guest_ia32_efer) ||
 		    ia32e != !!(vmcs12->guest_ia32_efer & EFER_LMA) ||
 		    ((vmcs12->guest_cr0 & X86_CR0_PG) &&
 		     ia32e != !!(vmcs12->guest_ia32_efer & EFER_LME))) {
 			nested_vmx_entry_failure(vcpu, vmcs12,
 				EXIT_REASON_INVALID_STATE, ENTRY_FAIL_DEFAULT);
-			return 1;
+			goto out;
 		}
 	}
 
 	/*
 	 * If the load IA32_EFER VM-exit control is 1, bits reserved in the
 	 * IA32_EFER MSR must be 0 in the field for that register. In addition,
 	 * the values of the LMA and LME bits in the field must each be that of
 	 * the host address-space size VM-exit control.
@@ -10168,29 +10181,30 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER) {
 		ia32e = (vmcs12->vm_exit_controls &
 			 VM_EXIT_HOST_ADDR_SPACE_SIZE) != 0;
 		if (!kvm_valid_efer(vcpu, vmcs12->host_ia32_efer) ||
 		    ia32e != !!(vmcs12->host_ia32_efer & EFER_LMA) ||
 		    ia32e != !!(vmcs12->host_ia32_efer & EFER_LME)) {
 			nested_vmx_entry_failure(vcpu, vmcs12,
 				EXIT_REASON_INVALID_STATE, ENTRY_FAIL_DEFAULT);
-			return 1;
+			goto out;
 		}
 	}
 
 	/*
 	 * We're finally done with prerequisite checking, and can start with
 	 * the nested entry.
 	 */
 
 	vmcs02 = nested_get_current_vmcs02(vmx);
 	if (!vmcs02)
 		return -ENOMEM;
 
+	skip_emulated_instruction_no_trap(vcpu);
 	enter_guest_mode(vcpu);
 
 	if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
 		vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
 
 	cpu = get_cpu();
 	vmx->loaded_vmcs = vmcs02;
 	vmx_vcpu_put(vcpu);
@@ -10222,16 +10236,20 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 	/*
 	 * Note no nested_vmx_succeed or nested_vmx_fail here. At this point
 	 * we are no longer running L1, and VMLAUNCH/VMRESUME has not yet
 	 * returned as far as L1 is concerned. It will only return (and set
 	 * the success flag) when L2 exits (see nested_vmx_vmexit()).
 	 */
 	return 1;
+
+out:
+	skip_emulated_instruction(vcpu);
+	return 1;
 }
 
 /*
  * On a nested exit from L2 to L1, vmcs12.guest_cr0 might not be up-to-date
  * because L2 may have changed some cr0 bits directly (CRO_GUEST_HOST_MASK).
  * This function returns the new value we should put in vmcs12.guest_cr0.
  * It's not enough to just return the vmcs02 GUEST_CR0. Rather,
  * 1. Bits that neither L0 nor L1 trapped, were set directly by L2 and are now