From patchwork Sun Oct 17 10:06:08 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Har'El
X-Patchwork-Id: 259751
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id o9HA6KS7014844
	for ; Sun, 17 Oct 2010 10:06:20 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932325Ab0JQKGR (ORCPT ); Sun, 17 Oct 2010 06:06:17 -0400
Received: from mtagate1.de.ibm.com ([195.212.17.161]:34024 "EHLO
	mtagate1.de.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932322Ab0JQKGQ (ORCPT );
	Sun, 17 Oct 2010 06:06:16 -0400
Received: from d12nrmr1607.megacenter.de.ibm.com
	(d12nrmr1607.megacenter.de.ibm.com [9.149.167.49])
	by mtagate1.de.ibm.com (8.13.1/8.13.1) with ESMTP id o9HA6BtV031833
	for ; Sun, 17 Oct 2010 10:06:11 GMT
Received: from d12av01.megacenter.de.ibm.com (d12av01.megacenter.de.ibm.com
	[9.149.165.212])
	by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v10.0) with
	ESMTP id o9HA6Bcb4116594
	for ; Sun, 17 Oct 2010 12:06:11 +0200
Received: from d12av01.megacenter.de.ibm.com (loopback [127.0.0.1])
	by d12av01.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP
	id o9HA6Afl005777
	for ; Sun, 17 Oct 2010 12:06:10 +0200
Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.112])
	by d12av01.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP
	id o9HA69AG005770
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 17 Oct 2010 12:06:10 +0200
Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1])
	by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id o9HA6899029326;
	Sun, 17 Oct 2010 12:06:08 +0200
Received: (from nyh@localhost)
	by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id o9HA68SE029324;
	Sun, 17 Oct 2010 12:06:08 +0200
Date: Sun, 17 Oct 2010 12:06:08 +0200
Message-Id: <201010171006.o9HA68SE029324@rice.haifa.ibm.com>
X-Authentication-Warning:
	rice.haifa.ibm.com: nyh set sender to "Nadav Har'El" using -f
Cc: gleb@redhat.com, avi@redhat.com
To: kvm@vger.kernel.org
From: "Nadav Har'El"
References: <1287309814-nyh@il.ibm.com>
Subject: [PATCH 05/27] nVMX: Introduce vmcs12: a VMCS structure for L1
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.3 (demeter1.kernel.org [140.211.167.41]);
	Sun, 17 Oct 2010 10:06:20 +0000 (UTC)

--- .before/arch/x86/kvm/vmx.c	2010-10-17 11:52:00.000000000 +0200
+++ .after/arch/x86/kvm/vmx.c	2010-10-17 11:52:00.000000000 +0200
@@ -128,6 +128,34 @@ struct shared_msr_entry {
 };
 
 /*
+ * struct vmcs12 describes the state that our guest hypervisor (L1) keeps for a
+ * single nested guest (L2), hence the name vmcs12. Any VMX implementation has
+ * a VMCS structure, and vmcs12 is our emulated VMX's VMCS. This structure is
+ * stored in guest memory specified by VMPTRLD, but is opaque to the guest,
+ * which must access it using VMREAD/VMWRITE/VMCLEAR instructions. More
+ * than one of these structures may exist, if L1 runs multiple L2 guests.
+ * nested_vmx_run() will use the data here to build a vmcs02: a VMCS for the
+ * underlying hardware which will be used to run L2.
+ * This structure is packed in order to preserve the binary content after live
+ * migration. If there are changes in the content or layout, VMCS12_REVISION
+ * must be changed.
+ */
+struct __packed vmcs12 {
+	/* According to the Intel spec, a VMCS region must start with the
+	 * following two fields. Then follow implementation-specific data.
+	 */
+	u32 revision_id;
+	u32 abort;
+};
+
+/*
+ * VMCS12_REVISION is an arbitrary id that should be changed if the content or
+ * layout of struct vmcs12 is changed. MSR_IA32_VMX_BASIC returns this id, and
+ * VMPTRLD verifies that the VMCS region that L1 is loading contains this id.
+ */
+#define VMCS12_REVISION 0x11e57ed0
+
+/*
  * The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu. For example,
  * the current VMCS set by L1, a list of the VMCSs used to run the active
@@ -136,6 +164,12 @@ struct shared_msr_entry {
 struct nested_vmx {
 	/* Has the level1 guest done vmxon? */
 	bool vmxon;
+
+	/* The guest-physical address of the current VMCS L1 keeps for L2 */
+	gpa_t current_vmptr;
+	/* The host-usable pointer to the above */
+	struct page *current_vmcs12_page;
+	struct vmcs12 *current_vmcs12;
 };
 
 struct vcpu_vmx {
@@ -195,6 +229,26 @@ static inline struct vcpu_vmx *to_vmx(st
 	return container_of(vcpu, struct vcpu_vmx, vcpu);
 }
 
+static struct page *nested_get_page(struct kvm_vcpu *vcpu, gpa_t addr)
+{
+	struct page *page = gfn_to_page(vcpu->kvm, addr >> PAGE_SHIFT);
+	if (is_error_page(page)) {
+		kvm_release_page_clean(page);
+		return NULL;
+	}
+	return page;
+}
+
+static void nested_release_page(struct page *page)
+{
+	kvm_release_page_dirty(page);
+}
+
+static void nested_release_page_clean(struct page *page)
+{
+	kvm_release_page_clean(page);
+}
+
 static int init_rmode(struct kvm *kvm);
 static u64 construct_eptp(unsigned long root_hpa);
 static void kvm_cpu_vmxon(u64 addr);
@@ -3467,6 +3521,11 @@ static int handle_vmoff(struct kvm_vcpu
 
 	to_vmx(vcpu)->nested.vmxon = false;
 
+	if(to_vmx(vcpu)->nested.current_vmptr != -1ull){
+		kunmap(to_vmx(vcpu)->nested.current_vmcs12_page);
+		nested_release_page(to_vmx(vcpu)->nested.current_vmcs12_page);
+	}
+
 	skip_emulated_instruction(vcpu);
 	return 1;
 }
@@ -4170,6 +4229,10 @@ static void vmx_free_vcpu(struct kvm_vcp
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
 	free_vpid(vmx);
+	if (vmx->nested.vmxon && to_vmx(vcpu)->nested.current_vmptr != -1ull){
+		kunmap(to_vmx(vcpu)->nested.current_vmcs12_page);
+		nested_release_page(to_vmx(vcpu)->nested.current_vmcs12_page);
+	}
 	vmx_free_vmcs(vcpu);
 	kfree(vmx->guest_msrs);
 	kvm_vcpu_uninit(vcpu);
@@ -4236,6 +4299,9 @@ static struct kvm_vcpu *vmx_create_vcpu(
 			goto free_vmcs;
 	}
 
+	vmx->nested.current_vmptr = -1ull;
+	vmx->nested.current_vmcs12 = NULL;
+
 	return &vmx->vcpu;
 
 free_vmcs: