From patchwork Wed May 25 20:03:55 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Har'El X-Patchwork-Id: 817182 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p4PK42DU030717 for ; Wed, 25 May 2011 20:04:02 GMT Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753269Ab1EYUD7 (ORCPT ); Wed, 25 May 2011 16:03:59 -0400 Received: from mtagate6.uk.ibm.com ([194.196.100.166]:36342 "EHLO mtagate6.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752276Ab1EYUD6 (ORCPT ); Wed, 25 May 2011 16:03:58 -0400 Received: from d06nrmr1707.portsmouth.uk.ibm.com (d06nrmr1707.portsmouth.uk.ibm.com [9.149.39.225]) by mtagate6.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p4PK3vkA005946 for ; Wed, 25 May 2011 20:03:57 GMT Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by d06nrmr1707.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p4PK3vwb1863922 for ; Wed, 25 May 2011 21:03:57 +0100 Received: from d06av01.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p4PK3vAU007353 for ; Wed, 25 May 2011 14:03:57 -0600 Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.217]) by d06av01.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id p4PK3u0w007342 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 25 May 2011 14:03:56 -0600 Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1]) by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id p4PK3two011132; Wed, 25 May 2011 23:03:55 +0300 Received: (from nyh@localhost) by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id p4PK3tFE011130; Wed, 25 May 2011 23:03:55 +0300 Date: Wed, 25 May 2011 23:03:55 +0300 Message-Id: <201105252003.p4PK3tFE011130@rice.haifa.ibm.com> X-Authentication-Warning: rice.haifa.ibm.com: nyh set sender to "Nadav Har'El" using -f Cc: avi@redhat.com To: kvm@vger.kernel.org From: "Nadav Har'El" References: <1306353651-nyh@il.ibm.com> Subject: [PATCH 05/31] nVMX: Introduce vmcs12: a VMCS structure for L1 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]); Wed, 25 May 2011 20:04:02 +0000 (UTC) An implementation of VMX needs to define a VMCS structure. This structure is kept in guest memory, but is opaque to the guest (who can only read or write it with VMX instructions). This patch starts to define the VMCS structure which our nested VMX implementation will present to L1. We call it "vmcs12", as it is the VMCS that L1 keeps for its L2 guest. We will add more content to this structure in later patches. This patch also adds the notion (as required by the VMX spec) of L1's "current VMCS", and finally includes utility functions for mapping the guest-allocated VMCSs in host memory. Signed-off-by: Nadav Har'El --- arch/x86/kvm/vmx.c | 75 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --- .before/arch/x86/kvm/vmx.c 2011-05-25 22:40:50.000000000 +0300 +++ .after/arch/x86/kvm/vmx.c 2011-05-25 22:40:50.000000000 +0300 @@ -145,12 +145,53 @@ struct shared_msr_entry { }; /* + * struct vmcs12 describes the state that our guest hypervisor (L1) keeps for a + * single nested guest (L2), hence the name vmcs12. Any VMX implementation has + * a VMCS structure, and vmcs12 is our emulated VMX's VMCS. This structure is + * stored in guest memory specified by VMPTRLD, but is opaque to the guest, + * which must access it using VMREAD/VMWRITE/VMCLEAR instructions. + * More than one of these structures may exist, if L1 runs multiple L2 guests. + * nested_vmx_run() will use the data here to build a vmcs02: a VMCS for the + * underlying hardware which will be used to run L2. + * This structure is packed to ensure that its layout is identical across + * machines (necessary for live migration). + * If there are changes in this struct, VMCS12_REVISION must be changed. + */ +struct __packed vmcs12 { + /* According to the Intel spec, a VMCS region must start with the + * following two fields. Then follow implementation-specific data. + */ + u32 revision_id; + u32 abort; +}; + +/* + * VMCS12_REVISION is an arbitrary id that should be changed if the content or + * layout of struct vmcs12 is changed. MSR_IA32_VMX_BASIC returns this id, and + * VMPTRLD verifies that the VMCS region that L1 is loading contains this id. + */ +#define VMCS12_REVISION 0x11e57ed0 + +/* + * VMCS12_SIZE is the number of bytes L1 should allocate for the VMXON region + * and any VMCS region. Although only sizeof(struct vmcs12) are used by the + * current implementation, 4K are reserved to avoid future complications. + */ +#define VMCS12_SIZE 0x1000 + +/* * The nested_vmx structure is part of vcpu_vmx, and holds information we need * for correct emulation of VMX (i.e., nested VMX) on this vcpu. */ struct nested_vmx { /* Has the level1 guest done vmxon? */ bool vmxon; + + /* The guest-physical address of the current VMCS L1 keeps for L2 */ + gpa_t current_vmptr; + /* The host-usable pointer to the above */ + struct page *current_vmcs12_page; + struct vmcs12 *current_vmcs12; }; struct vcpu_vmx { @@ -231,6 +272,31 @@ static inline struct vcpu_vmx *to_vmx(st return container_of(vcpu, struct vcpu_vmx, vcpu); } +static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu) +{ + return to_vmx(vcpu)->nested.current_vmcs12; +} + +static struct page *nested_get_page(struct kvm_vcpu *vcpu, gpa_t addr) +{ + struct page *page = gfn_to_page(vcpu->kvm, addr >> PAGE_SHIFT); + if (is_error_page(page)) { + kvm_release_page_clean(page); + return NULL; + } + return page; +} + +static void nested_release_page(struct page *page) +{ + kvm_release_page_dirty(page); +} + +static void nested_release_page_clean(struct page *page) +{ + kvm_release_page_clean(page); +} + static u64 construct_eptp(unsigned long root_hpa); static void kvm_cpu_vmxon(u64 addr); static void kvm_cpu_vmxoff(void); @@ -4038,6 +4104,12 @@ static void free_nested(struct vcpu_vmx if (!vmx->nested.vmxon) return; vmx->nested.vmxon = false; + if (vmx->nested.current_vmptr != -1ull) { + kunmap(vmx->nested.current_vmcs12_page); + nested_release_page(vmx->nested.current_vmcs12_page); + vmx->nested.current_vmptr = -1ull; + vmx->nested.current_vmcs12 = NULL; + } } /* Emulate the VMXOFF instruction */ @@ -4542,6 +4614,9 @@ static struct kvm_vcpu *vmx_create_vcpu( goto free_vmcs; } + vmx->nested.current_vmptr = -1ull; + vmx->nested.current_vmcs12 = NULL; + return &vmx->vcpu; free_vmcs: