From patchwork Mon May 16 19:53:08 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Har'El
X-Patchwork-Id: 789592
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p4GJriuZ030980
	for ; Mon, 16 May 2011 19:53:45 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755398Ab1EPTxN (ORCPT );
	Mon, 16 May 2011 15:53:13 -0400
Received: from mtagate5.uk.ibm.com ([194.196.100.165]:58926 "EHLO
	mtagate5.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755397Ab1EPTxL (ORCPT );
	Mon, 16 May 2011 15:53:11 -0400
Received: from d06nrmr1806.portsmouth.uk.ibm.com
	(d06nrmr1806.portsmouth.uk.ibm.com [9.149.39.193])
	by mtagate5.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p4GJrAD4016055
	for ; Mon, 16 May 2011 19:53:10 GMT
Received: from d06av06.portsmouth.uk.ibm.com
	(d06av06.portsmouth.uk.ibm.com [9.149.37.217])
	by d06nrmr1806.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0)
	with ESMTP id p4GJrAj32342936
	for ; Mon, 16 May 2011 20:53:10 +0100
Received: from d06av06.portsmouth.uk.ibm.com (loopback [127.0.0.1])
	by d06av06.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout)
	with ESMTP id p4GJrAXh023866
	for ; Mon, 16 May 2011 13:53:10 -0600
Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.217])
	by d06av06.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin)
	with ESMTP id p4GJr93Y023863
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 16 May 2011 13:53:09 -0600
Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1])
	by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id p4GJr8h1001860;
	Mon, 16 May 2011 22:53:09 +0300
Received: (from nyh@localhost)
	by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id p4GJr8Jo001858;
	Mon, 16 May 2011 22:53:08 +0300
Date: Mon, 16 May 2011 22:53:08 +0300
Message-Id: <201105161953.p4GJr8Jo001858@rice.haifa.ibm.com>
X-Authentication-Warning: rice.haifa.ibm.com: nyh set sender to "Nadav Har'El" using -f
Cc: gleb@redhat.com, avi@redhat.com
To: kvm@vger.kernel.org
From: "Nadav Har'El"
References: <1305575004-nyh@il.ibm.com>
Subject: [PATCH 18/31] nVMX: Implement VMLAUNCH and VMRESUME
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]);
	Mon, 16 May 2011 19:53:45 +0000 (UTC)

Implement the VMLAUNCH and VMRESUME instructions, allowing a guest
hypervisor to run its own guests.

This patch does not include some of the necessary validity checks on
vmcs12 fields before the entry. These will appear in a separate patch
below.

Signed-off-by: Nadav Har'El
---
 arch/x86/kvm/vmx.c |   84 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 82 insertions(+), 2 deletions(-)

--- .before/arch/x86/kvm/vmx.c	2011-05-16 22:36:49.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-05-16 22:36:49.000000000 +0300
@@ -347,6 +347,9 @@ struct nested_vmx {
 	/* vmcs02_list cache of VMCSs recently used to run L2 guests */
 	struct list_head vmcs02_pool;
 	int vmcs02_num;
+
+	/* Saving the VMCS that we used for running L1 */
+	struct saved_vmcs saved_vmcs01;
 	u64 vmcs01_tsc_offset;
 	/*
 	 * Guest pages referred to in vmcs02 with host-physical pointers, so
@@ -4668,6 +4671,8 @@ static void nested_free_all_saved_vmcss(
 		kfree(item);
 	}
 	vmx->nested.vmcs02_num = 0;
+	if (is_guest_mode(&vmx->vcpu))
+		nested_free_saved_vmcs(vmx, &vmx->nested.saved_vmcs01);
 }
 
 /* Get a vmcs02 for the current vmcs12. */
@@ -4959,6 +4964,21 @@ static int handle_vmclear(struct kvm_vcp
 	return 1;
 }
 
+static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch);
+
+/* Emulate the VMLAUNCH instruction */
+static int handle_vmlaunch(struct kvm_vcpu *vcpu)
+{
+	return nested_vmx_run(vcpu, true);
+}
+
+/* Emulate the VMRESUME instruction */
+static int handle_vmresume(struct kvm_vcpu *vcpu)
+{
+
+	return nested_vmx_run(vcpu, false);
+}
+
 enum vmcs_field_type {
 	VMCS_FIELD_TYPE_U16 = 0,
 	VMCS_FIELD_TYPE_U64 = 1,
@@ -5239,11 +5259,11 @@ static int (*kvm_vmx_exit_handlers[])(st
 	[EXIT_REASON_INVLPG]                  = handle_invlpg,
 	[EXIT_REASON_VMCALL]                  = handle_vmcall,
 	[EXIT_REASON_VMCLEAR]                 = handle_vmclear,
-	[EXIT_REASON_VMLAUNCH]                = handle_vmx_insn,
+	[EXIT_REASON_VMLAUNCH]                = handle_vmlaunch,
 	[EXIT_REASON_VMPTRLD]                 = handle_vmptrld,
 	[EXIT_REASON_VMPTRST]                 = handle_vmptrst,
 	[EXIT_REASON_VMREAD]                  = handle_vmread,
-	[EXIT_REASON_VMRESUME]                = handle_vmx_insn,
+	[EXIT_REASON_VMRESUME]                = handle_vmresume,
 	[EXIT_REASON_VMWRITE]                 = handle_vmwrite,
 	[EXIT_REASON_VMOFF]                   = handle_vmoff,
 	[EXIT_REASON_VMON]                    = handle_vmon,
@@ -6129,6 +6149,66 @@ static void nested_maintain_per_cpu_list
 	}
 }
 
+/*
+ * nested_vmx_run() handles a nested entry, i.e., a VMLAUNCH or VMRESUME on L1
+ * for running an L2 nested guest.
+ */
+static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
+{
+	struct vmcs12 *vmcs12;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	int cpu;
+	struct saved_vmcs *saved_vmcs02;
+
+	if (!nested_vmx_check_permission(vcpu))
+		return 1;
+	skip_emulated_instruction(vcpu);
+
+	vmcs12 = get_vmcs12(vcpu);
+
+	enter_guest_mode(vcpu);
+
+	vmx->nested.vmcs01_tsc_offset = vmcs_read64(TSC_OFFSET);
+
+	/*
+	 * Switch from L1's VMCS (vmcs01), to L2's VMCS (vmcs02). Remember
+	 * vmcs01, on which CPU it was last loaded, and whether it was launched
+	 * (we need all these values next time we will use L1). Then recall
+	 * these values from the last time vmcs02 was used.
+	 */
+	saved_vmcs02 = nested_get_current_vmcs02(vmx);
+	if (!saved_vmcs02)
+		return -ENOMEM;
+
+	cpu = get_cpu();
+	vmx->nested.saved_vmcs01.vmcs = vmx->vmcs;
+	vmx->nested.saved_vmcs01.cpu = vcpu->cpu;
+	vmx->nested.saved_vmcs01.launched = vmx->launched;
+	vmx->vmcs = saved_vmcs02->vmcs;
+	vcpu->cpu = saved_vmcs02->cpu;
+	vmx->launched = saved_vmcs02->launched;
+
+	nested_maintain_per_cpu_lists(vmx,
+		saved_vmcs02, &vmx->nested.saved_vmcs01);
+
+	vmx_vcpu_put(vcpu);
+	vmx_vcpu_load(vcpu, cpu);
+	vcpu->cpu = cpu;
+	put_cpu();
+
+	vmcs12->launch_state = 1;
+
+	prepare_vmcs02(vcpu, vmcs12);
+
+	/*
+	 * Note no nested_vmx_succeed or nested_vmx_fail here. At this point
+	 * we are no longer running L1, and VMLAUNCH/VMRESUME has not yet
+	 * returned as far as L1 is concerned. It will only return (and set
+	 * the success flag) when L2 exits (see nested_vmx_vmexit()).
+	 */
+	return 1;
+}
+
 static int vmx_check_intercept(struct kvm_vcpu *vcpu,
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage)
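
Illustration only, not part of the patch: the vmcs01 -> vmcs02 bookkeeping in
nested_vmx_run() can be modelled in user space roughly as below. This is a
minimal sketch under stated assumptions. The types saved_vmcs_model and
vcpu_model and the function model_nested_entry() are hypothetical stand-ins
invented for this note, an integer id replaces the struct vmcs pointer, and
the vmx_vcpu_put()/vmx_vcpu_load() pair is reduced to recording the CPU the
vCPU ends up on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's struct saved_vmcs. */
struct saved_vmcs_model {
	int vmcs_id;	/* replaces the struct vmcs pointer */
	int cpu;	/* CPU the VMCS was last loaded on, -1 if none */
	bool launched;	/* whether this VMCS has already been launched */
};

/* Hypothetical stand-in for the vcpu/vmx fields the patch touches. */
struct vcpu_model {
	int vmcs_id;
	int cpu;
	bool launched;
	bool guest_mode;
	struct saved_vmcs_model saved_vmcs01;
};

/*
 * Model of the switch from L1's VMCS to L2's VMCS: remember the current
 * (vmcs01) values in saved_vmcs01, then adopt the values recorded for
 * vmcs02. The real code first copies saved_vmcs02->cpu into vcpu->cpu and
 * then reloads the vCPU on the current CPU; only the final CPU is kept here.
 */
static void model_nested_entry(struct vcpu_model *v,
			       const struct saved_vmcs_model *vmcs02,
			       int this_cpu)
{
	assert(v != NULL && vmcs02 != NULL);
	assert(!v->guest_mode);	/* the entry is issued by L1 */

	v->guest_mode = true;

	/* Save what is needed the next time L1 runs. */
	v->saved_vmcs01.vmcs_id = v->vmcs_id;
	v->saved_vmcs01.cpu = v->cpu;
	v->saved_vmcs01.launched = v->launched;

	/* Recall the values from the last time vmcs02 was used. */
	v->vmcs_id = vmcs02->vmcs_id;
	v->launched = vmcs02->launched;
	v->cpu = this_cpu;
}
```

A caller would fill a vcpu_model with the L1 values, pass the record of the
vmcs02 it wants to run, and afterwards find the L1 values preserved in
saved_vmcs01, where an exit path could restore them.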