From patchwork Sun May  8 08:17:15 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Har'El <nyh@il.ibm.com>
X-Patchwork-Id: 765102
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p488HOxA028804
	for <patchwork-kvm@patchwork.kernel.org>; Sun, 8 May 2011 08:17:24 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751679Ab1EHIRV (ORCPT
	<rfc822;patchwork-kvm@patchwork.kernel.org>);
	Sun, 8 May 2011 04:17:21 -0400
Received: from mtagate5.uk.ibm.com ([194.196.100.165]:54899 "EHLO
	mtagate5.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751361Ab1EHIRT (ORCPT <rfc822; kvm@vger.kernel.org>);
	Sun, 8 May 2011 04:17:19 -0400
Received: from d06nrmr1707.portsmouth.uk.ibm.com
	(d06nrmr1707.portsmouth.uk.ibm.com [9.149.39.225])
	by mtagate5.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p488HIcb005234
	for <kvm@vger.kernel.org>; Sun, 8 May 2011 08:17:18 GMT
Received: from d06av08.portsmouth.uk.ibm.com (d06av08.portsmouth.uk.ibm.com
	[9.149.37.249])
	by d06nrmr1707.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with
	ESMTP id p488IXXN2494552
	for <kvm@vger.kernel.org>; Sun, 8 May 2011 09:18:33 +0100
Received: from d06av08.portsmouth.uk.ibm.com (loopback [127.0.0.1])
	by d06av08.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with
	ESMTP id p488HHcp010881
	for <kvm@vger.kernel.org>; Sun, 8 May 2011 09:17:17 +0100
Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.217])
	by d06av08.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with
	ESMTP id p488HGcF010878
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 8 May 2011 09:17:17 +0100
Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1])
	by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id p488HGBa017914;
	Sun, 8 May 2011 11:17:16 +0300
Received: (from nyh@localhost)
	by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id p488HFmF017912;
	Sun, 8 May 2011 11:17:15 +0300
Date: Sun, 8 May 2011 11:17:15 +0300
Message-Id: <201105080817.p488HFmF017912@rice.haifa.ibm.com>
X-Authentication-Warning: rice.haifa.ibm.com: nyh set sender to "Nadav
	Har'El" <nyh@il.ibm.com> using -f
Cc: gleb@redhat.com, avi@redhat.com
To: kvm@vger.kernel.org
From: "Nadav Har'El" <nyh@il.ibm.com>
References: <1304842511-nyh@il.ibm.com>
Subject: [PATCH 04/30] nVMX: Introduce vmcs12: a VMCS structure for L1
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]);
	Sun, 08 May 2011 08:17:41 +0000 (UTC)

An implementation of VMX needs to define a VMCS structure. This structure
is kept in guest memory, but is opaque to the guest, which can only read or
write it using VMX instructions.

This patch starts to define the VMCS structure which our nested VMX
implementation will present to L1. We call it "vmcs12", as it is the VMCS
that L1 keeps for its L2 guest. We will add more content to this structure
in later patches.

This patch also adds the notion (as required by the VMX spec) of L1's "current
VMCS", and includes utility functions for mapping the guest-allocated VMCSs
into host memory.

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
---
 arch/x86/kvm/vmx.c |   75 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
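
Note for reviewers (not part of the patch): the bookkeeping introduced here can
be sketched in plain userspace C, roughly as below. This is only an
illustration under stated assumptions. struct nested_state, NO_CURRENT_VMPTR
and the nested_state_*() helpers are hypothetical names that do not appear in
the patch; the revision id, the 4K size, the two leading fields and the -1ull
"no current VMCS" value are taken from the patch. Guest memory is modelled as
an ordinary buffer, so nothing here pins or maps pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the patch. */
#define VMCS12_REVISION 0x11e57ed0
#define VMCS12_SIZE 0x1000

/* Userspace stand-in for the packed struct vmcs12 in the patch. */
struct vmcs12 {
	uint32_t revision_id;
	uint32_t abort;
} __attribute__((packed));

/* Packing fixes the layout; the struct must fit in the 4K region. */
_Static_assert(offsetof(struct vmcs12, revision_id) == 0, "revision_id first");
_Static_assert(offsetof(struct vmcs12, abort) == 4, "abort second");
_Static_assert(sizeof(struct vmcs12) <= VMCS12_SIZE, "must fit in region");

/* Hypothetical name; the patch writes -1ull directly. */
#define NO_CURRENT_VMPTR (~0ull)

/* Hypothetical, simplified stand-in for struct nested_vmx. */
struct nested_state {
	uint64_t current_vmptr;		/* guest-physical address */
	struct vmcs12 *current_vmcs12;	/* host-usable pointer */
};

/* Mirrors the initialisation added to vmx_create_vcpu(). */
static void nested_state_init(struct nested_state *n)
{
	n->current_vmptr = NO_CURRENT_VMPTR;
	n->current_vmcs12 = NULL;
}

/*
 * Hypothetical load step: accept a region as the current vmcs12 only if
 * it starts with the expected revision id.
 */
static bool nested_state_load(struct nested_state *n, uint64_t gpa,
			      void *region)
{
	struct vmcs12 hdr;

	assert(region != NULL);
	memcpy(&hdr, region, sizeof(hdr));
	if (hdr.revision_id != VMCS12_REVISION)
		return false;
	n->current_vmptr = gpa;
	n->current_vmcs12 = region;
	return true;
}

/* Mirrors the reset in free_nested(), minus the page unmap/release. */
static void nested_state_clear(struct nested_state *n)
{
	if (n->current_vmptr != NO_CURRENT_VMPTR) {
		n->current_vmptr = NO_CURRENT_VMPTR;
		n->current_vmcs12 = NULL;
	}
}
```

In this sketch a zero-filled region is rejected by nested_state_load() and is
accepted once its first 32-bit word holds VMCS12_REVISION. Using -1ull rather
than 0 as the "none" value keeps guest-physical address 0 usable.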

--- .before/arch/x86/kvm/vmx.c	2011-05-08 10:43:18.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-05-08 10:43:18.000000000 +0300
@@ -131,12 +131,53 @@ struct shared_msr_entry {
 };
 
 /*
+ * struct vmcs12 describes the state that our guest hypervisor (L1) keeps for a
+ * single nested guest (L2), hence the name vmcs12. Any VMX implementation has
+ * a VMCS structure, and vmcs12 is our emulated VMX's VMCS. This structure is
+ * stored in guest memory specified by VMPTRLD, but is opaque to the guest,
+ * which must access it using VMREAD/VMWRITE/VMCLEAR instructions.
+ * More than one of these structures may exist, if L1 runs multiple L2 guests.
+ * nested_vmx_run() will use the data here to build a vmcs02: a VMCS for the
+ * underlying hardware which will be used to run L2.
+ * This structure is packed to ensure that its layout is identical across
+ * machines (necessary for live migration).
+ * If there are changes in this struct, VMCS12_REVISION must be changed.
+ */
+struct __packed vmcs12 {
+	/* According to the Intel spec, a VMCS region must start with the
+	 * following two fields. Then follow implementation-specific data.
+	 */
+	u32 revision_id;
+	u32 abort;
+};
+
+/*
+ * VMCS12_REVISION is an arbitrary id that should be changed if the content or
+ * layout of struct vmcs12 is changed. MSR_IA32_VMX_BASIC returns this id, and
+ * VMPTRLD verifies that the VMCS region that L1 is loading contains this id.
+ */
+#define VMCS12_REVISION 0x11e57ed0
+
+/*
+ * VMCS12_SIZE is the number of bytes L1 should allocate for the VMXON region
+ * and any VMCS region. Although only sizeof(struct vmcs12) bytes are used by
+ * the current implementation, 4K is reserved to avoid future complications.
+ */
+#define VMCS12_SIZE 0x1000
+
+/*
  * The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu.
  */
 struct nested_vmx {
 	/* Has the level1 guest done vmxon? */
 	bool vmxon;
+
+	/* The guest-physical address of the current VMCS L1 keeps for L2 */
+	gpa_t current_vmptr;
+	/* The page holding the above, and a host-usable pointer to it */
+	struct page *current_vmcs12_page;
+	struct vmcs12 *current_vmcs12;
 };
 
 struct vcpu_vmx {
@@ -212,6 +253,31 @@ static inline struct vcpu_vmx *to_vmx(st
 	return container_of(vcpu, struct vcpu_vmx, vcpu);
 }
 
+static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
+{
+	return to_vmx(vcpu)->nested.current_vmcs12;
+}
+
+static struct page *nested_get_page(struct kvm_vcpu *vcpu, gpa_t addr)
+{
+	struct page *page = gfn_to_page(vcpu->kvm, addr >> PAGE_SHIFT);
+	if (is_error_page(page)) {
+		kvm_release_page_clean(page);
+		return NULL;
+	}
+	return page;
+}
+
+static void nested_release_page(struct page *page)
+{
+	kvm_release_page_dirty(page);
+}
+
+static void nested_release_page_clean(struct page *page)
+{
+	kvm_release_page_clean(page);
+}
+
 static u64 construct_eptp(unsigned long root_hpa);
 static void kvm_cpu_vmxon(u64 addr);
 static void kvm_cpu_vmxoff(void);
@@ -3995,6 +4061,12 @@ static void free_nested(struct vcpu_vmx 
 	if (!vmx->nested.vmxon)
 		return;
 	vmx->nested.vmxon = false;
+	if (vmx->nested.current_vmptr != -1ull) {
+		kunmap(vmx->nested.current_vmcs12_page);
+		nested_release_page(vmx->nested.current_vmcs12_page);
+		vmx->nested.current_vmptr = -1ull;
+		vmx->nested.current_vmcs12 = NULL;
+	}
 }
 
 /* Emulate the VMXOFF instruction */
@@ -4518,6 +4590,9 @@ static struct kvm_vcpu *vmx_create_vcpu(
 			goto free_vmcs;
 	}
 
+	vmx->nested.current_vmptr = -1ull;
+	vmx->nested.current_vmcs12 = NULL;
+
 	return &vmx->vcpu;
 
 free_vmcs: