From patchwork Sun Jun 13 12:25:07 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Har'El
X-Patchwork-Id: 105784
Date: Sun, 13 Jun 2010 15:25:07 +0300
Message-Id: <201006131225.o5DCP79H012922@rice.haifa.ibm.com>
From: "Nadav Har'El"
To: avi@redhat.com
Cc: kvm@vger.kernel.org
References: <1276431753-nyh@il.ibm.com>
Subject: [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

--- .before/arch/x86/kvm/vmx.c	2010-06-13 15:01:28.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2010-06-13 15:01:28.000000000 +0300
@@ -117,6 +117,29 @@ struct shared_msr_entry {
 	u64 mask;
 };
 
+#define VMCS12_REVISION 0x11e57ed0
+
+/*
+ * struct vmcs12 describes the state that our guest hypervisor (L1) keeps for a
+ * single nested guest (L2), hence the name vmcs12. Any VMX implementation has
+ * a VMCS structure (which is opaque to the guest), and vmcs12 is our emulated
+ * VMX's VMCS. This structure is stored in guest memory specified by VMPTRLD,
+ * and accessed by the guest using VMREAD/VMWRITE/VMCLEAR instructions. More
+ * than one of these structures may exist, if L1 runs multiple L2 guests.
+ * nested_vmx_run() will use the data here to build a VMCS for the underlying
+ * hardware which will be used to run L2.
+ * This structure is packed in order to preserve the binary content after live
+ * migration. If there are changes in the content or layout, VMCS12_REVISION
+ * must be changed.
+ */
+struct __attribute__ ((__packed__)) vmcs12 {
+	/* According to the Intel spec, a VMCS region must start with the
+	 * following two fields. Then follow implementation-specific data.
+	 */
+	u32 revision_id;
+	u32 abort;
+};
+
 /* The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu. For example,
  * the current VMCS set by L1, a list of the VMCSs used to run the active
@@ -125,6 +148,11 @@ struct shared_msr_entry {
 struct nested_vmx {
 	/* Has the level-1 guest done vmxon? */
 	bool vmxon;
+
+	/* The guest-physical address of the current VMCS L1 keeps for L2 */
+	gpa_t current_vmptr;
+	/* The host-usable pointer to the above. Set by nested_map_current() */
+	struct vmcs12 *current_l2_page;
 };
 
 struct vcpu_vmx {
@@ -188,6 +216,61 @@ static inline struct vcpu_vmx *to_vmx(st
 	return container_of(vcpu, struct vcpu_vmx, vcpu);
 }
 
+static struct page *nested_get_page(struct kvm_vcpu *vcpu, u64 vmcs_addr)
+{
+	struct page *vmcs_page =
+		gfn_to_page(vcpu->kvm, vmcs_addr >> PAGE_SHIFT);
+
+	if (is_error_page(vmcs_page)) {
+		printk(KERN_ERR "%s error allocating page 0x%llx\n",
+		       __func__, vmcs_addr);
+		kvm_release_page_clean(vmcs_page);
+		return NULL;
+	}
+	return vmcs_page;
+}
+
+static int nested_map_current(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct page *vmcs_page =
+		nested_get_page(vcpu, vmx->nested.current_vmptr);
+
+	if (vmcs_page == NULL) {
+		printk(KERN_INFO "%s: failure in nested_get_page\n", __func__);
+		return 0;
+	}
+
+	if (vmx->nested.current_l2_page) {
+		printk(KERN_INFO "Shadow vmcs already mapped\n");
+		BUG_ON(1);
+		return 0;
+	}
+
+	vmx->nested.current_l2_page = kmap_atomic(vmcs_page, KM_USER0);
+	return 1;
+}
+
+static void nested_unmap_current(struct kvm_vcpu *vcpu)
+{
+	struct page *page;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx->nested.current_l2_page) {
+		printk(KERN_INFO "Shadow vmcs already unmapped\n");
+		BUG_ON(1);
+		return;
+	}
+
+	page = kmap_atomic_to_page(vmx->nested.current_l2_page);
+
+	kunmap_atomic(vmx->nested.current_l2_page, KM_USER0);
+
+	kvm_release_page_dirty(page);
+
+	vmx->nested.current_l2_page = NULL;
+}
+
 static int init_rmode(struct kvm *kvm);
 static u64 construct_eptp(unsigned long root_hpa);
 static void kvm_cpu_vmxon(u64 addr);
@@ -4186,6 +4269,9 @@ static struct kvm_vcpu *vmx_create_vcpu(
 		goto free_vmcs;
 	}
 
+	vmx->nested.current_vmptr = -1ull;
+	vmx->nested.current_l2_page = NULL;
+
 	return &vmx->vcpu;
 
 free_vmcs:
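
For context on how these helpers are meant to fit together, here is a minimal
sketch of the intended calling pattern: record the guest-physical address in
nested.current_vmptr, map the vmcs12 with nested_map_current(), check its
revision_id against VMCS12_REVISION, and drop the mapping with
nested_unmap_current(). This function is hypothetical and not part of this
patch; its name and the way it reports failure are assumptions made purely for
illustration, while the helpers, fields and VMCS12_REVISION are taken from the
diff above.

	/*
	 * Illustrative sketch only -- not part of this patch. Shows the
	 * map/check/unmap discipline around the current vmcs12.
	 */
	static int example_load_vmcs12(struct kvm_vcpu *vcpu, gpa_t vmptr)
	{
		struct vcpu_vmx *vmx = to_vmx(vcpu);
		u32 revision;

		vmx->nested.current_vmptr = vmptr;

		/* Map the guest page holding the vmcs12 into host memory */
		if (!nested_map_current(vcpu))
			return 0;

		/* Copy out what we need before unmapping */
		revision = vmx->nested.current_l2_page->revision_id;

		/*
		 * The mapping comes from kmap_atomic(), so it must be
		 * released before anything that might sleep.
		 */
		nested_unmap_current(vcpu);

		if (revision != VMCS12_REVISION) {
			/* Reject a VMCS whose layout we do not understand */
			vmx->nested.current_vmptr = -1ull;
			return 0;
		}
		return 1;
	}

Because nested_map_current() uses kmap_atomic(), every access to
current_l2_page has to be bracketed by a map/unmap pair with nothing that can
sleep in between; that is why the sketch copies revision_id out and unmaps
before deciding whether to accept the VMCS.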