From patchwork Wed Jun 16 11:14:12 2010
X-Patchwork-Submitter: Nadav Har'El
X-Patchwork-Id: 106462
Date: Wed, 16 Jun 2010 14:14:12 +0300
From: "Nadav Har'El"
To: Avi Kivity
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH 3/24] Implement VMXON and VMXOFF
Message-ID: <20100616111412.GA11896@fermat.math.technion.ac.il>
References: <1276431753-nyh@il.ibm.com> <201006131224.o5DCO63N012897@rice.haifa.ibm.com> <4C15E690.3000707@redhat.com>
In-Reply-To: <4C15E690.3000707@redhat.com>
Hebrew-Date: 4 Tammuz 5770
--- .before/arch/x86/kvm/vmx.c	2010-06-16 13:20:19.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2010-06-16 13:20:19.000000000 +0300
@@ -125,6 +125,17 @@ struct shared_msr_entry {
 	u64 mask;
 };
 
+/*
+ * The nested_vmx structure is part of vcpu_vmx, and holds information we need
+ * for correct emulation of VMX (i.e., nested VMX) on this vcpu. For example,
+ * the current VMCS set by L1, a list of the VMCSs used to run the active
+ * L2 guests on the hardware, and more.
+ */
+struct nested_vmx {
+	/* Has the level1 guest done vmxon? */
+	bool vmxon;
+};
+
 struct vcpu_vmx {
 	struct kvm_vcpu vcpu;
 	struct list_head local_vcpus_link;
@@ -176,6 +187,9 @@ struct vcpu_vmx {
 	u32 exit_reason;
 
 	bool rdtscp_enabled;
+
+	/* Support for guest hypervisors (nested VMX) */
+	struct nested_vmx nested;
 };
 
 static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
@@ -3361,6 +3375,90 @@ static int handle_vmx_insn(struct kvm_vc
 	return 1;
 }
 
+/*
+ * Emulate the VMXON instruction.
+ * Currently, we just remember that VMX is active, and do not save or even
+ * inspect the argument to VMXON (the so-called "VMXON pointer") because we
+ * do not currently need to store anything in that guest-allocated memory
+ * region. Consequently, VMCLEAR and VMPTRLD also do not verify that their
+ * argument is different from the VMXON pointer (which the spec says they do).
+ */
+static int handle_vmon(struct kvm_vcpu *vcpu)
+{
+	struct kvm_segment cs;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	/* The Intel VMX Instruction Reference lists a bunch of bits that
+	 * are prerequisite to running VMXON, most notably CR4.VMXE must be
+	 * set to 1. Otherwise, we should fail with #UD.
	 * We test these now:
	 */
+	if (!nested ||
+	    !kvm_read_cr4_bits(vcpu, X86_CR4_VMXE) ||
+	    !kvm_read_cr0_bits(vcpu, X86_CR0_PE) ||
+	    (vmx_get_rflags(vcpu) & X86_EFLAGS_VM)) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 1;
+	}
+
+	vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
+	if (is_long_mode(vcpu) && !cs.l) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 1;
+	}
+
+	if (vmx_get_cpl(vcpu)) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	vmx->nested.vmxon = true;
+
+	skip_emulated_instruction(vcpu);
+	return 1;
+}
+
+/*
+ * Intel's VMX Instruction Reference specifies a common set of prerequisites
+ * for running VMX instructions (except VMXON, whose prerequisites are
+ * slightly different). It also specifies what exception to inject otherwise.
+ */
+static int nested_vmx_check_permission(struct kvm_vcpu *vcpu)
+{
+	struct kvm_segment cs;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx->nested.vmxon) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 0;
+	}
+
+	vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
+	if ((vmx_get_rflags(vcpu) & X86_EFLAGS_VM) ||
+	    (is_long_mode(vcpu) && !cs.l)) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return 0;
+	}
+
+	if (vmx_get_cpl(vcpu)) {
+		kvm_inject_gp(vcpu, 0);
+		return 0;
+	}
+
+	return 1;
+}
+
+/* Emulate the VMXOFF instruction */
+static int handle_vmoff(struct kvm_vcpu *vcpu)
+{
+	if (!nested_vmx_check_permission(vcpu))
+		return 1;
+
+	to_vmx(vcpu)->nested.vmxon = false;
+
+	skip_emulated_instruction(vcpu);
+	return 1;
+}
+
 static int handle_invlpg(struct kvm_vcpu *vcpu)
 {
 	unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
@@ -3650,8 +3748,8 @@ static int (*kvm_vmx_exit_handlers[])(st
 	[EXIT_REASON_VMREAD]                  = handle_vmx_insn,
 	[EXIT_REASON_VMRESUME]                = handle_vmx_insn,
 	[EXIT_REASON_VMWRITE]                 = handle_vmx_insn,
-	[EXIT_REASON_VMOFF]                   = handle_vmx_insn,
-	[EXIT_REASON_VMON]                    = handle_vmx_insn,
+	[EXIT_REASON_VMOFF]                   = handle_vmoff,
+	[EXIT_REASON_VMON]                    = handle_vmon,
 	[EXIT_REASON_TPR_BELOW_THRESHOLD]     = handle_tpr_below_threshold,
 	[EXIT_REASON_APIC_ACCESS]             = handle_apic_access,
 	[EXIT_REASON_WBINVD]                  = handle_wbinvd,
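
For reference, the fault ordering that handle_vmon() implements (first #UD for missing
architectural prerequisites, then #GP(0) for CPL > 0) can be sketched as a stand-alone
predicate. This is only an illustrative model: the vcpu_state struct and its field names
below are made up for the sketch, not the real kvm_vcpu layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE    (1ULL << 0)
#define X86_CR4_VMXE  (1ULL << 13)
#define X86_EFLAGS_VM (1ULL << 17)

/* Hypothetical, simplified stand-in for the guest state that
 * handle_vmon() consults via kvm_read_cr*_bits()/vmx_get_*(). */
struct vcpu_state {
	uint64_t cr0, cr4, rflags;
	int cpl;         /* current privilege level, 0..3 */
	bool long_mode;  /* EFER.LMA set */
	bool cs_l;       /* CS.L: 64-bit code segment */
};

enum vmx_fault { VMX_OK, VMX_FAULT_UD, VMX_FAULT_GP };

/* Mirrors the check order in handle_vmon(): architectural
 * prerequisites fault with #UD, insufficient privilege with #GP(0). */
enum vmx_fault check_vmxon(const struct vcpu_state *v)
{
	if (!(v->cr4 & X86_CR4_VMXE) ||     /* CR4.VMXE must be 1 */
	    !(v->cr0 & X86_CR0_PE) ||       /* must be in protected mode */
	    (v->rflags & X86_EFLAGS_VM))    /* not in virtual-8086 mode */
		return VMX_FAULT_UD;        /* kvm_queue_exception(UD_VECTOR) */
	if (v->long_mode && !v->cs_l)       /* no compat mode in long mode */
		return VMX_FAULT_UD;
	if (v->cpl != 0)                    /* ring 0 only */
		return VMX_FAULT_GP;        /* kvm_inject_gp(vcpu, 0) */
	return VMX_OK;                      /* proceed: nested.vmxon = true */
}
```

The same skeleton, minus the CR0/CR4 tests and plus a vmxon-state test, gives
nested_vmx_check_permission() for the remaining VMX instructions.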