From patchwork Mon Nov 28 04:18:55 2016
X-Patchwork-Submitter: Kyle Huey
X-Patchwork-Id: 9449081
From: Kyle Huey
To: Paolo Bonzini, Radim Krčmář, Thomas Gleixner, Ingo Molnar,
    "H. Peter Anvin", x86@kernel.org, Joerg Roedel
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/5] KVM: x86: Add a return value to skip_emulated_instruction and propagate it.
Date: Sun, 27 Nov 2016 20:18:55 -0800
Message-Id: <20161128041856.11420-5-khuey@kylehuey.com>
X-Mailer: git-send-email 2.10.2
In-Reply-To: <20161128041856.11420-1-khuey@kylehuey.com>
References: <20161128041856.11420-1-khuey@kylehuey.com>

Return 1 (for the moment) from skip_emulated_instruction and propagate it
through. This is straightforward, except for:

- ICEBP, which is already inside a trap, so avoid triggering another trap.

- Instructions that can trigger exits to userspace, such as the IO insns,
  MOVs to CR8, and HALT. If skip_emulated_instruction does trigger a
  KVM_GUESTDBG_SINGLESTEP exit, and the handling code for
  IN/OUT/MOV CR8/HALT also triggers an exit to userspace, the latter will
  take precedence. The singlestep will be triggered again on the next
  instruction, which is the current behavior.

- XSAVES/XRSTORS, which should never be called.

- Task switch instructions, which would require additional handling (e.g.
  the task switch bit) and are instead modified to use the no_trap variant.

Signed-off-by: Kyle Huey
---
 arch/x86/include/asm/kvm_host.h |   4 +-
 arch/x86/kvm/cpuid.c            |   3 +-
 arch/x86/kvm/svm.c              |  16 ++--
 arch/x86/kvm/vmx.c              | 175 ++++++++++++++++------------------
 arch/x86/kvm/x86.c              |  22 +++--
 5 files changed, 92 insertions(+), 128 deletions(-)
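For reviewers skimming the conversion, here is the shape of the change in
condensed form. This is an illustrative sketch only, abridged from the
handle_invlpg and handle_io hunks in the vmx.c diff below; it adds no code
beyond what the patch itself contains.

/* Before: skipping the instruction can never force a userspace exit. */
static int handle_invlpg(struct kvm_vcpu *vcpu)
{
	unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);

	kvm_mmu_invlpg(vcpu, exit_qualification);
	skip_emulated_instruction(vcpu);	/* returned void */
	return 1;
}

/*
 * After: propagate skip_emulated_instruction's return value (1 means
 * re-enter the guest, 0 means exit to userspace, which a later patch in
 * this series uses for KVM_GUESTDBG_SINGLESTEP's KVM_EXIT_DEBUG).
 */
static int handle_invlpg(struct kvm_vcpu *vcpu)
{
	unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);

	kvm_mmu_invlpg(vcpu, exit_qualification);
	return skip_emulated_instruction(vcpu);
}

/*
 * After, for handlers that may themselves exit to userspace (IO, MOV to
 * CR8, HALT): the handler's own exit takes precedence, because a 0 from
 * either side makes the combined result 0. This is the squashing that
 * the TODO comments in the diff refer to.
 */
	ret = skip_emulated_instruction(vcpu);
	return kvm_fast_pio_out(vcpu, size, port) && ret;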
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 80bad5c..f846610 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -919,17 +919,17 @@ struct kvm_x86_ops {
 	u32 (*get_pkru)(struct kvm_vcpu *vcpu);
 	void (*fpu_activate)(struct kvm_vcpu *vcpu);
 	void (*fpu_deactivate)(struct kvm_vcpu *vcpu);
 	void (*tlb_flush)(struct kvm_vcpu *vcpu);
 	void (*run)(struct kvm_vcpu *vcpu);
 	int (*handle_exit)(struct kvm_vcpu *vcpu);
-	void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
 	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
 				unsigned char *hypercall_addr);
 	void (*set_irq)(struct kvm_vcpu *vcpu);
 	void (*set_nmi)(struct kvm_vcpu *vcpu);
 	void (*queue_exception)(struct kvm_vcpu *vcpu, unsigned nr,
 				bool has_error_code, u32 error_code,
@@ -1363,17 +1363,17 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
 				     struct kvm_async_pf *work);
 void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
 				 struct kvm_async_pf *work);
 void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
 			       struct kvm_async_pf *work);
 bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
 extern bool kvm_find_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
-void kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err);
+int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err);
 int kvm_is_in_guest(void);
 int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size);
 int x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size);
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu);
 bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 07cc629..d733ac2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -885,12 +885,11 @@ int kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
 	eax = kvm_register_read(vcpu, VCPU_REGS_RAX);
 	ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
 	kvm_cpuid(vcpu, &eax, &ebx, &ecx, &edx);
 	kvm_register_write(vcpu, VCPU_REGS_RAX, eax);
 	kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
 	kvm_register_write(vcpu, VCPU_REGS_RCX, ecx);
 	kvm_register_write(vcpu, VCPU_REGS_RDX, edx);
-	kvm_x86_ops->skip_emulated_instruction(vcpu);
-	return 1;
+	return kvm_x86_ops->skip_emulated_instruction(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 5bdffcd..ed28be0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -603,37 +603,38 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
 	if (mask == 0)
 		svm->vmcb->control.int_state &= ~SVM_INTERRUPT_SHADOW_MASK;
 	else
 		svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
 }
 
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	if (svm->vmcb->control.next_rip != 0) {
 		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
 		svm->next_rip = svm->vmcb->control.next_rip;
 	}
 
 	if (!svm->next_rip) {
 		if (emulate_instruction(vcpu, EMULTYPE_SKIP) != EMULATE_DONE)
 			printk(KERN_DEBUG "%s: NOP\n", __func__);
-		return;
+		return 1;
 	}
 
 	if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
 		printk(KERN_ERR "%s: ip 0x%lx next 0x%llx\n", __func__,
 		       kvm_rip_read(vcpu), svm->next_rip);
 
 	kvm_rip_write(vcpu, svm->next_rip);
 	svm_set_interrupt_shadow(vcpu, 0);
+	return 1;
 }
 
 static void svm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 				bool has_error_code, u32 error_code,
 				bool reinject)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -3146,18 +3147,17 @@ static int skinit_interception(struct vcpu_svm *svm)
 	trace_kvm_skinit(svm->vmcb->save.rip,
 			 kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
 
 	kvm_queue_exception(&svm->vcpu, UD_VECTOR);
 	return 1;
 }
 
 static int wbinvd_interception(struct vcpu_svm *svm)
 {
-	kvm_emulate_wbinvd(&svm->vcpu);
-	return 1;
+	return kvm_emulate_wbinvd(&svm->vcpu);
 }
 
 static int xsetbv_interception(struct vcpu_svm *svm)
 {
 	u64 new_bv = kvm_read_edx_eax(&svm->vcpu);
 	u32 index = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
 
 	if (kvm_set_xcr(&svm->vcpu, index, new_bv) == 0) {
@@ -3270,19 +3270,17 @@ static int emulate_on_interception(struct vcpu_svm *svm)
 
 static int rdpmc_interception(struct vcpu_svm *svm)
 {
 	int err;
 
 	if (!static_cpu_has(X86_FEATURE_NRIPS))
 		return emulate_on_interception(svm);
 
 	err = kvm_rdpmc(&svm->vcpu);
-	kvm_complete_insn_gp(&svm->vcpu, err);
-
-	return 1;
+	return kvm_complete_insn_gp(&svm->vcpu, err);
 }
 
 static bool check_selective_cr0_intercepted(struct vcpu_svm *svm,
 					    unsigned long val)
 {
 	unsigned long cr0 = svm->vcpu.arch.cr0;
 	bool ret = false;
 	u64 intercept;
@@ -3369,19 +3367,17 @@ static int cr_interception(struct vcpu_svm *svm)
 			break;
 		default:
 			WARN(1, "unhandled read from CR%d", cr);
 			kvm_queue_exception(&svm->vcpu, UD_VECTOR);
 			return 1;
 		}
 		kvm_register_write(&svm->vcpu, reg, val);
 	}
-	kvm_complete_insn_gp(&svm->vcpu, err);
-
-	return 1;
+	return kvm_complete_insn_gp(&svm->vcpu, err);
 }
 
 static int dr_interception(struct vcpu_svm *svm)
 {
 	int reg, dr;
 	unsigned long val;
 
 	if (svm->vcpu.guest_debug == 0) {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9cc4c41..f404aef 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2467,19 +2467,20 @@ static void skip_emulated_instruction_no_trap(struct kvm_vcpu *vcpu)
 	rip = kvm_rip_read(vcpu);
 	rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
 	kvm_rip_write(vcpu, rip);
 
 	/* skipping an emulated instruction also counts */
 	vmx_set_interrupt_shadow(vcpu, 0);
 }
 
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static int skip_emulated_instruction(struct kvm_vcpu *vcpu) { skip_emulated_instruction_no_trap(vcpu); + return 1; } /* * KVM wants to inject page-faults which it got to the guest. This function * checks whether in a nested guest, we need to inject them to L1 or L2. */ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr) { @@ -5511,17 +5512,17 @@ static int handle_exception(struct kvm_vcpu *vcpu) return 1; case DB_VECTOR: dr6 = vmcs_readl(EXIT_QUALIFICATION); if (!(vcpu->guest_debug & (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))) { vcpu->arch.dr6 &= ~15; vcpu->arch.dr6 |= dr6 | DR6_RTM; if (!(dr6 & ~DR6_RESERVED)) /* icebp */ - skip_emulated_instruction(vcpu); + skip_emulated_instruction_no_trap(vcpu); kvm_queue_exception(vcpu, DB_VECTOR); return 1; } kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1; kvm_run->debug.arch.dr7 = vmcs_readl(GUEST_DR7); /* fall through */ case BP_VECTOR: @@ -5556,33 +5557,38 @@ static int handle_triple_fault(struct kvm_vcpu *vcpu) { vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN; return 0; } static int handle_io(struct kvm_vcpu *vcpu) { unsigned long exit_qualification; - int size, in, string; + int size, in, string, ret; unsigned port; exit_qualification = vmcs_readl(EXIT_QUALIFICATION); string = (exit_qualification & 16) != 0; in = (exit_qualification & 8) != 0; ++vcpu->stat.io_exits; if (string || in) return emulate_instruction(vcpu, 0) == EMULATE_DONE; port = exit_qualification >> 16; size = (exit_qualification & 7) + 1; - skip_emulated_instruction(vcpu); - return kvm_fast_pio_out(vcpu, size, port); + ret = skip_emulated_instruction(vcpu); + + /* + * TODO: we might be squashing a KVM_GUESTDBG_SINGLESTEP-triggered + * KVM_EXIT_DEBUG here. + */ + return kvm_fast_pio_out(vcpu, size, port) && ret; } static void vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall) { /* * Patch in the VMCALL instruction: */ @@ -5670,80 +5676,79 @@ static void handle_clts(struct kvm_vcpu *vcpu) } static int handle_cr(struct kvm_vcpu *vcpu) { unsigned long exit_qualification, val; int cr; int reg; int err; + int ret; exit_qualification = vmcs_readl(EXIT_QUALIFICATION); cr = exit_qualification & 15; reg = (exit_qualification >> 8) & 15; switch ((exit_qualification >> 4) & 3) { case 0: /* mov to cr */ val = kvm_register_readl(vcpu, reg); trace_kvm_cr_write(cr, val); switch (cr) { case 0: err = handle_set_cr0(vcpu, val); - kvm_complete_insn_gp(vcpu, err); - return 1; + return kvm_complete_insn_gp(vcpu, err); case 3: err = kvm_set_cr3(vcpu, val); - kvm_complete_insn_gp(vcpu, err); - return 1; + return kvm_complete_insn_gp(vcpu, err); case 4: err = handle_set_cr4(vcpu, val); - kvm_complete_insn_gp(vcpu, err); - return 1; + return kvm_complete_insn_gp(vcpu, err); case 8: { u8 cr8_prev = kvm_get_cr8(vcpu); u8 cr8 = (u8)val; err = kvm_set_cr8(vcpu, cr8); - kvm_complete_insn_gp(vcpu, err); + ret = kvm_complete_insn_gp(vcpu, err); if (lapic_in_kernel(vcpu)) - return 1; + return ret; if (cr8_prev <= cr8) - return 1; + return ret; + /* + * TODO: we might be squashing a + * KVM_GUESTDBG_SINGLESTEP-triggered + * KVM_EXIT_DEBUG here. 
+ */ vcpu->run->exit_reason = KVM_EXIT_SET_TPR; return 0; } } break; case 2: /* clts */ handle_clts(vcpu); trace_kvm_cr_write(0, kvm_read_cr0(vcpu)); vmx_fpu_activate(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); case 1: /*mov from cr*/ switch (cr) { case 3: val = kvm_read_cr3(vcpu); kvm_register_write(vcpu, reg, val); trace_kvm_cr_read(cr, val); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); case 8: val = kvm_get_cr8(vcpu); kvm_register_write(vcpu, reg, val); trace_kvm_cr_read(cr, val); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } break; case 3: /* lmsw */ val = (exit_qualification >> LMSW_SOURCE_DATA_SHIFT) & 0x0f; trace_kvm_cr_write(0, (kvm_read_cr0(vcpu) & ~0xful) | val); kvm_lmsw(vcpu, val); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); default: break; } vcpu->run->exit_reason = 0; vcpu_unimpl(vcpu, "unhandled control register: op %d cr %d\n", (int)(exit_qualification >> 4) & 3, cr); return 0; } @@ -5804,18 +5809,17 @@ static int handle_dr(struct kvm_vcpu *vcpu) if (kvm_get_dr(vcpu, dr, &val)) return 1; kvm_register_write(vcpu, reg, val); } else if (kvm_set_dr(vcpu, dr, kvm_register_readl(vcpu, reg))) return 1; - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static u64 vmx_get_dr6(struct kvm_vcpu *vcpu) { return vcpu->arch.dr6; } static void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val) @@ -5858,18 +5862,17 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu) return 1; } trace_kvm_msr_read(ecx, msr_info.data); /* FIXME: handling of bits 32:63 of rax, rdx */ vcpu->arch.regs[VCPU_REGS_RAX] = msr_info.data & -1u; vcpu->arch.regs[VCPU_REGS_RDX] = (msr_info.data >> 32) & -1u; - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_wrmsr(struct kvm_vcpu *vcpu) { struct msr_data msr; u32 ecx = vcpu->arch.regs[VCPU_REGS_RCX]; u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u) | ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32); @@ -5879,18 +5882,17 @@ static int handle_wrmsr(struct kvm_vcpu *vcpu) msr.host_initiated = false; if (kvm_set_msr(vcpu, &msr) != 0) { trace_kvm_msr_write_ex(ecx, data); kvm_inject_gp(vcpu, 0); return 1; } trace_kvm_msr_write(ecx, data); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_tpr_below_threshold(struct kvm_vcpu *vcpu) { kvm_make_request(KVM_REQ_EVENT, vcpu); return 1; } @@ -5924,43 +5926,39 @@ static int handle_invd(struct kvm_vcpu *vcpu) return emulate_instruction(vcpu, 0) == EMULATE_DONE; } static int handle_invlpg(struct kvm_vcpu *vcpu) { unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); kvm_mmu_invlpg(vcpu, exit_qualification); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_rdpmc(struct kvm_vcpu *vcpu) { int err; err = kvm_rdpmc(vcpu); - kvm_complete_insn_gp(vcpu, err); - - return 1; + return kvm_complete_insn_gp(vcpu, err); } static int handle_wbinvd(struct kvm_vcpu *vcpu) { - kvm_emulate_wbinvd(vcpu); - return 1; + return kvm_emulate_wbinvd(vcpu); } static int handle_xsetbv(struct kvm_vcpu *vcpu) { u64 new_bv = kvm_read_edx_eax(vcpu); u32 index = kvm_register_read(vcpu, VCPU_REGS_RCX); if (kvm_set_xcr(vcpu, index, new_bv) == 0) - skip_emulated_instruction(vcpu); + return skip_emulated_instruction(vcpu); return 1; } static int 
handle_xsaves(struct kvm_vcpu *vcpu) { skip_emulated_instruction(vcpu); WARN(1, "this should never happen\n"); return 1; @@ -5984,18 +5982,17 @@ static int handle_apic_access(struct kvm_vcpu *vcpu) /* * Sane guest uses MOV to write EOI, with written value * not cared. So make a short-circuit here by avoiding * heavy instruction emulation. */ if ((access_type == TYPE_LINEAR_APIC_INST_WRITE) && (offset == APIC_EOI)) { kvm_lapic_set_eoi(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } } return emulate_instruction(vcpu, 0) == EMULATE_DONE; } static int handle_apic_eoi_induced(struct kvm_vcpu *vcpu) { unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); @@ -6057,17 +6054,17 @@ static int handle_task_switch(struct kvm_vcpu *vcpu) break; } } tss_selector = exit_qualification; if (!idt_v || (type != INTR_TYPE_HARD_EXCEPTION && type != INTR_TYPE_EXT_INTR && type != INTR_TYPE_NMI_INTR)) - skip_emulated_instruction(vcpu); + skip_emulated_instruction_no_trap(vcpu); if (kvm_task_switch(vcpu, tss_selector, type == INTR_TYPE_SOFT_INTR ? idt_index : -1, reason, has_error_code, error_code) == EMULATE_FAIL) { vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION; vcpu->run->internal.ndata = 0; return 0; @@ -6134,18 +6131,17 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu) static int handle_ept_misconfig(struct kvm_vcpu *vcpu) { int ret; gpa_t gpa; gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); if (!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) { trace_kvm_fast_mmio(gpa); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } ret = handle_mmio_page_fault(vcpu, gpa, true); if (likely(ret == RET_MMIO_PF_EMULATE)) return x86_emulate_instruction(vcpu, gpa, 0, NULL, 0) == EMULATE_DONE; if (unlikely(ret == RET_MMIO_PF_INVALID)) @@ -6508,25 +6504,22 @@ static __exit void hardware_unsetup(void) * exiting, so only get here on cpu with PAUSE-Loop-Exiting. 
*/ static int handle_pause(struct kvm_vcpu *vcpu) { if (ple_gap) grow_ple_window(vcpu); kvm_vcpu_on_spin(vcpu); - skip_emulated_instruction(vcpu); - - return 1; + return skip_emulated_instruction(vcpu); } static int handle_nop(struct kvm_vcpu *vcpu) { - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_mwait(struct kvm_vcpu *vcpu) { printk_once(KERN_WARNING "kvm: MWAIT instruction emulated as NOP!\n"); return handle_nop(vcpu); } @@ -6823,59 +6816,53 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu *vcpu, int exit_reason, * * Note - IA32_VMX_BASIC[48] will never be 1 * for the nested case; * which replaces physical address width with 32 * */ if (!PAGE_ALIGNED(vmptr) || (vmptr >> maxphyaddr)) { nested_vmx_failInvalid(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } page = nested_get_page(vcpu, vmptr); if (page == NULL || *(u32 *)kmap(page) != VMCS12_REVISION) { nested_vmx_failInvalid(vcpu); kunmap(page); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } kunmap(page); vmx->nested.vmxon_ptr = vmptr; break; case EXIT_REASON_VMCLEAR: if (!PAGE_ALIGNED(vmptr) || (vmptr >> maxphyaddr)) { nested_vmx_failValid(vcpu, VMXERR_VMCLEAR_INVALID_ADDRESS); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } if (vmptr == vmx->nested.vmxon_ptr) { nested_vmx_failValid(vcpu, VMXERR_VMCLEAR_VMXON_POINTER); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } break; case EXIT_REASON_VMPTRLD: if (!PAGE_ALIGNED(vmptr) || (vmptr >> maxphyaddr)) { nested_vmx_failValid(vcpu, VMXERR_VMPTRLD_INVALID_ADDRESS); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } if (vmptr == vmx->nested.vmxon_ptr) { nested_vmx_failValid(vcpu, VMXERR_VMCLEAR_VMXON_POINTER); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } break; default: return 1; /* shouldn't happen */ } if (vmpointer) *vmpointer = vmptr; @@ -6921,18 +6908,17 @@ static int handle_vmon(struct kvm_vcpu *vcpu) return 1; } if (nested_vmx_check_vmptr(vcpu, EXIT_REASON_VMON, NULL)) return 1; if (vmx->nested.vmxon) { nested_vmx_failValid(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } if ((vmx->msr_ia32_feature_control & VMXON_NEEDED_FEATURES) != VMXON_NEEDED_FEATURES) { kvm_inject_gp(vcpu, 0); return 1; } @@ -6963,18 +6949,17 @@ static int handle_vmon(struct kvm_vcpu *vcpu) hrtimer_init(&vmx->nested.preemption_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); vmx->nested.preemption_timer.function = vmx_preemption_timer_fn; vmx->nested.vmxon = true; nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); out_shadow_vmcs: kfree(vmx->nested.cached_vmcs12); out_cached_vmcs12: free_page((unsigned long)vmx->nested.msr_bitmap); out_msr_bitmap: @@ -7084,18 +7069,17 @@ static void free_nested(struct vcpu_vmx *vmx) /* Emulate the VMXOFF instruction */ static int handle_vmoff(struct kvm_vcpu *vcpu) { if (!nested_vmx_check_permission(vcpu)) return 1; free_nested(to_vmx(vcpu)); nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* Emulate the VMCLEAR instruction */ static int handle_vmclear(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); gpa_t vmptr; struct vmcs12 
*vmcs12; @@ -7125,18 +7109,17 @@ static int handle_vmclear(struct kvm_vcpu *vcpu) vmcs12 = kmap(page); vmcs12->launch_state = 0; kunmap(page); nested_release_page(page); nested_free_vmcs02(vmx, vmptr); nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch); /* Emulate the VMLAUNCH instruction */ static int handle_vmlaunch(struct kvm_vcpu *vcpu) { return nested_vmx_run(vcpu, true); @@ -7340,28 +7323,25 @@ static int handle_vmread(struct kvm_vcpu *vcpu) u64 field_value; unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); gva_t gva = 0; if (!nested_vmx_check_permission(vcpu)) return 1; - if (!nested_vmx_check_vmcs12(vcpu)) { - skip_emulated_instruction(vcpu); - return 1; - } + if (!nested_vmx_check_vmcs12(vcpu)) + return skip_emulated_instruction(vcpu); /* Decode instruction info and find the field to read */ field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf)); /* Read the field, zero-extended to a u64 field_value */ if (vmcs12_read_any(vcpu, field, &field_value) < 0) { nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* * Now copy part of this value to register or memory, as requested. * Note that the number of bits actually copied is 32 or 64 depending * on the guest's mode (32 or 64 bit), not on the given field's length. */ if (vmx_instruction_info & (1u << 10)) { kvm_register_writel(vcpu, (((vmx_instruction_info) >> 3) & 0xf), @@ -7371,18 +7351,17 @@ static int handle_vmread(struct kvm_vcpu *vcpu) vmx_instruction_info, true, &gva)) return 1; /* _system ok, as nested_vmx_check_permission verified cpl=0 */ kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, gva, &field_value, (is_long_mode(vcpu) ? 8 : 4), NULL); } nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_vmwrite(struct kvm_vcpu *vcpu) { unsigned long field; gva_t gva; unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); @@ -7394,20 +7373,18 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu) * bits into the vmcs12 field. 
*/ u64 field_value = 0; struct x86_exception e; if (!nested_vmx_check_permission(vcpu)) return 1; - if (!nested_vmx_check_vmcs12(vcpu)) { - skip_emulated_instruction(vcpu); - return 1; - } + if (!nested_vmx_check_vmcs12(vcpu)) + return skip_emulated_instruction(vcpu); if (vmx_instruction_info & (1u << 10)) field_value = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 3) & 0xf)); else { if (get_vmx_mem_address(vcpu, exit_qualification, vmx_instruction_info, false, &gva)) return 1; @@ -7418,29 +7395,26 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu) } } field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf)); if (vmcs_field_readonly(field)) { nested_vmx_failValid(vcpu, VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } if (vmcs12_write_any(vcpu, field, field_value) < 0) { nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* Emulate the VMPTRLD instruction */ static int handle_vmptrld(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); gpa_t vmptr; @@ -7451,27 +7425,25 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu) return 1; if (vmx->nested.current_vmptr != vmptr) { struct vmcs12 *new_vmcs12; struct page *page; page = nested_get_page(vcpu, vmptr); if (page == NULL) { nested_vmx_failInvalid(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } new_vmcs12 = kmap(page); if (new_vmcs12->revision_id != VMCS12_REVISION) { kunmap(page); nested_release_page_clean(page); nested_vmx_failValid(vcpu, VMXERR_VMPTRLD_INCORRECT_VMCS_REVISION_ID); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } nested_release_vmcs12(vmx); vmx->nested.current_vmptr = vmptr; vmx->nested.current_vmcs12 = new_vmcs12; vmx->nested.current_vmcs12_page = page; /* * Load VMCS12 from guest memory since it is not already @@ -7485,18 +7457,17 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu) SECONDARY_EXEC_SHADOW_VMCS); vmcs_write64(VMCS_LINK_POINTER, __pa(vmx->vmcs01.shadow_vmcs)); vmx->nested.sync_shadow_vmcs = true; } } nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* Emulate the VMPTRST instruction */ static int handle_vmptrst(struct kvm_vcpu *vcpu) { unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); gva_t vmcs_gva; @@ -7511,18 +7482,17 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu) /* ok to use *_system, as nested_vmx_check_permission verified cpl=0 */ if (kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, vmcs_gva, (void *)&to_vmx(vcpu)->nested.current_vmptr, sizeof(u64), &e)) { kvm_inject_page_fault(vcpu, &e); return 1; } nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* Emulate the INVEPT instruction */ static int handle_invept(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); u32 vmx_instruction_info, types; unsigned long type; @@ -7550,18 +7520,17 @@ static int handle_invept(struct kvm_vcpu *vcpu) vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); type = kvm_register_readl(vcpu, (vmx_instruction_info >> 28) & 0xf); types = (vmx->nested.nested_vmx_ept_caps >> 
VMX_EPT_EXTENT_SHIFT) & 6; if (type >= 32 || !(types & (1 << type))) { nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* According to the Intel VMX instruction reference, the memory * operand is read even if it isn't needed (e.g., for type==global) */ if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION), vmx_instruction_info, false, &gva)) return 1; @@ -7582,18 +7551,17 @@ static int handle_invept(struct kvm_vcpu *vcpu) kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); nested_vmx_succeed(vcpu); break; default: BUG_ON(1); break; } - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_invvpid(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); u32 vmx_instruction_info; unsigned long type, types; gva_t gva; @@ -7614,18 +7582,17 @@ static int handle_invvpid(struct kvm_vcpu *vcpu) type = kvm_register_readl(vcpu, (vmx_instruction_info >> 28) & 0xf); types = (vmx->nested.nested_vmx_vpid_caps & VMX_VPID_EXTENT_SUPPORTED_MASK) >> 8; if (type >= 32 || !(types & (1 << type))) { nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* according to the intel vmx instruction reference, the memory * operand is read even if it isn't needed (e.g., for type==global) */ if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION), vmx_instruction_info, false, &gva)) return 1; @@ -7637,33 +7604,30 @@ static int handle_invvpid(struct kvm_vcpu *vcpu) switch (type) { case VMX_VPID_EXTENT_INDIVIDUAL_ADDR: case VMX_VPID_EXTENT_SINGLE_CONTEXT: case VMX_VPID_EXTENT_SINGLE_NON_GLOBAL: if (!vpid) { nested_vmx_failValid(vcpu, VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } break; case VMX_VPID_EXTENT_ALL_CONTEXT: break; default: WARN_ON_ONCE(1); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } __vmx_flush_tlb(vcpu, vmx->nested.vpid02); nested_vmx_succeed(vcpu); - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } static int handle_pml_full(struct kvm_vcpu *vcpu) { unsigned long exit_qualification; trace_kvm_pml_full(vcpu->vcpu_id); @@ -10238,18 +10202,17 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) * Note no nested_vmx_succeed or nested_vmx_fail here. At this point * we are no longer running L1, and VMLAUNCH/VMRESUME has not yet * returned as far as L1 is concerned. It will only return (and set * the success flag) when L2 exits (see nested_vmx_vmexit()). */ return 1; out: - skip_emulated_instruction(vcpu); - return 1; + return skip_emulated_instruction(vcpu); } /* * On a nested exit from L2 to L1, vmcs12.guest_cr0 might not be up-to-date * because L2 may have changed some cr0 bits directly (CRO_GUEST_HOST_MASK). * This function returns the new value we should put in vmcs12.guest_cr0. * It's not enough to just return the vmcs02 GUEST_CR0. Rather, * 1. 
Bits that neither L0 nor L1 trapped, were set directly by L2 and are now diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ec59301..fbcacf8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -420,22 +420,24 @@ void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) EXPORT_SYMBOL_GPL(kvm_queue_exception); void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr) { kvm_multiple_exception(vcpu, nr, false, 0, true); } EXPORT_SYMBOL_GPL(kvm_requeue_exception); -void kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err) +int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err) { if (err) kvm_inject_gp(vcpu, 0); else - kvm_x86_ops->skip_emulated_instruction(vcpu); + return kvm_x86_ops->skip_emulated_instruction(vcpu); + + return 1; } EXPORT_SYMBOL_GPL(kvm_complete_insn_gp); void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault) { ++vcpu->stat.pf_guest; vcpu->arch.cr2 = fault->address; kvm_queue_exception_e(vcpu, PF_VECTOR, fault->error_code); @@ -4808,18 +4810,18 @@ static int kvm_emulate_wbinvd_noskip(struct kvm_vcpu *vcpu) cpumask_clear(vcpu->arch.wbinvd_dirty_mask); } else wbinvd(); return X86EMUL_CONTINUE; } int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu) { - kvm_x86_ops->skip_emulated_instruction(vcpu); - return kvm_emulate_wbinvd_noskip(vcpu); + kvm_emulate_wbinvd_noskip(vcpu); + return kvm_x86_ops->skip_emulated_instruction(vcpu); } EXPORT_SYMBOL_GPL(kvm_emulate_wbinvd); static void emulator_wbinvd(struct x86_emulate_ctxt *ctxt) { kvm_emulate_wbinvd_noskip(emul_to_vcpu(ctxt)); @@ -6002,18 +6004,22 @@ int kvm_vcpu_halt(struct kvm_vcpu *vcpu) vcpu->run->exit_reason = KVM_EXIT_HLT; return 0; } } EXPORT_SYMBOL_GPL(kvm_vcpu_halt); int kvm_emulate_halt(struct kvm_vcpu *vcpu) { - kvm_x86_ops->skip_emulated_instruction(vcpu); - return kvm_vcpu_halt(vcpu); + int ret = kvm_x86_ops->skip_emulated_instruction(vcpu); + /* + * TODO: we might be squashing a GUESTDBG_SINGLESTEP-triggered + * KVM_EXIT_DEBUG here. + */ + return kvm_vcpu_halt(vcpu) && ret; } EXPORT_SYMBOL_GPL(kvm_emulate_halt); /* * kvm_pv_kick_cpu_op: Kick a vcpu. * * @apicid - apicid of vcpu to be kicked. */ @@ -6034,19 +6040,19 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu) { vcpu->arch.apicv_active = false; kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu); } int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) { unsigned long nr, a0, a1, a2, a3, ret; - int op_64_bit, r = 1; + int op_64_bit, r; - kvm_x86_ops->skip_emulated_instruction(vcpu); + r = kvm_x86_ops->skip_emulated_instruction(vcpu); if (kvm_hv_hypercall_enabled(vcpu->kvm)) return kvm_hv_hypercall(vcpu); nr = kvm_register_read(vcpu, VCPU_REGS_RAX); a0 = kvm_register_read(vcpu, VCPU_REGS_RBX); a1 = kvm_register_read(vcpu, VCPU_REGS_RCX); a2 = kvm_register_read(vcpu, VCPU_REGS_RDX);