From patchwork Mon May 16 19:57:45 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Har'El <nyh@il.ibm.com>
X-Patchwork-Id: 789672
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter2.kernel.org (8.14.4/8.14.3) with ESMTP id p4GJvqel015575
	for <patchwork-kvm@patchwork.kernel.org>;
	Mon, 16 May 2011 19:57:52 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755421Ab1EPT5t (ORCPT
	<rfc822;patchwork-kvm@patchwork.kernel.org>);
	Mon, 16 May 2011 15:57:49 -0400
Received: from mtagate6.uk.ibm.com ([194.196.100.166]:47473 "EHLO
	mtagate6.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755413Ab1EPT5t (ORCPT <rfc822; kvm@vger.kernel.org>);
	Mon, 16 May 2011 15:57:49 -0400
Received: from d06nrmr1507.portsmouth.uk.ibm.com
	(d06nrmr1507.portsmouth.uk.ibm.com [9.149.38.233])
	by mtagate6.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p4GJvlED012102
	for <kvm@vger.kernel.org>; Mon, 16 May 2011 19:57:47 GMT
Received: from d06av08.portsmouth.uk.ibm.com (d06av08.portsmouth.uk.ibm.com
	[9.149.37.249])
	by d06nrmr1507.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with
	ESMTP id p4GJvltK2519182
	for <kvm@vger.kernel.org>; Mon, 16 May 2011 20:57:47 +0100
Received: from d06av08.portsmouth.uk.ibm.com (loopback [127.0.0.1])
	by d06av08.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with
	ESMTP id p4GJvlll025295
	for <kvm@vger.kernel.org>; Mon, 16 May 2011 20:57:47 +0100
Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.217])
	by d06av08.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with
	ESMTP id p4GJvkjm025286
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 16 May 2011 20:57:46 +0100
Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1])
	by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id p4GJvjgs002043;
	Mon, 16 May 2011 22:57:46 +0300
Received: (from nyh@localhost)
	by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id p4GJvjUR002041;
	Mon, 16 May 2011 22:57:45 +0300
Date: Mon, 16 May 2011 22:57:45 +0300
Message-Id: <201105161957.p4GJvjUR002041@rice.haifa.ibm.com>
X-Authentication-Warning: rice.haifa.ibm.com: nyh set sender to "Nadav
	Har'El" <nyh@il.ibm.com> using -f
Cc: gleb@redhat.com, avi@redhat.com
To: kvm@vger.kernel.org
From: "Nadav Har'El" <nyh@il.ibm.com>
References: <1305575004-nyh@il.ibm.com>
Subject: [PATCH 27/31] nVMX: Further fixes for lazy FPU loading
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.6 (demeter2.kernel.org [140.211.167.43]);
	Mon, 16 May 2011 19:57:52 +0000 (UTC)

KVM's "Lazy FPU loading" means that sometimes L0 needs to set CR0.TS, even
if a guest didn't set it. Moreover, L0 must also trap CR0.TS changes and
NM exceptions, even if we have a guest hypervisor (L1) who didn't want these
traps. And of course, conversely: If L1 wanted to trap these events, we
must let it, even if L0 is not interested in them.

This patch fixes some existing KVM code (in update_exception_bitmap(),
vmx_fpu_activate(), vmx_fpu_deactivate()) to do the correct merging of L0's
and L1's needs. Note that handle_cr() was already fixed in the above patch,
and that new code in introduced in previous patches already handles CR0
correctly (see prepare_vmcs02(), prepare_vmcs12(), and nested_vmx_vmexit()).

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
---
 arch/x86/kvm/vmx.c |   31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--- .before/arch/x86/kvm/vmx.c	2011-05-16 22:36:50.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-05-16 22:36:50.000000000 +0300
@@ -1179,6 +1179,15 @@ static void update_exception_bitmap(stru
 		eb &= ~(1u << PF_VECTOR); /* bypass_guest_pf = 0 */
 	if (vcpu->fpu_active)
 		eb &= ~(1u << NM_VECTOR);
+
+	/* When we are running a nested L2 guest and L1 specified for it a
+	 * certain exception bitmap, we must trap the same exceptions and pass
+	 * them to L1. When running L2, we will only handle the exceptions
+	 * specified above if L1 did not want them.
+	 */
+	if (is_guest_mode(vcpu))
+		eb |= get_vmcs12(vcpu)->exception_bitmap;
+
 	vmcs_write32(EXCEPTION_BITMAP, eb);
 }
 
@@ -1471,6 +1480,9 @@ static void vmx_fpu_activate(struct kvm_
 	vmcs_writel(GUEST_CR0, cr0);
 	update_exception_bitmap(vcpu);
 	vcpu->arch.cr0_guest_owned_bits = X86_CR0_TS;
+	if (is_guest_mode(vcpu))
+		vcpu->arch.cr0_guest_owned_bits &=
+			~get_vmcs12(vcpu)->cr0_guest_host_mask;
 	vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
 }
 
@@ -1494,12 +1506,29 @@ static inline unsigned long guest_readab
 
 static void vmx_fpu_deactivate(struct kvm_vcpu *vcpu)
 {
+	/* Note that there is no vcpu->fpu_active = 0 here. The caller must
+	 * set this *before* calling this function.
+	 */
 	vmx_decache_cr0_guest_bits(vcpu);
 	vmcs_set_bits(GUEST_CR0, X86_CR0_TS | X86_CR0_MP);
 	update_exception_bitmap(vcpu);
 	vcpu->arch.cr0_guest_owned_bits = 0;
 	vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
-	vmcs_writel(CR0_READ_SHADOW, vcpu->arch.cr0);
+	if (is_guest_mode(vcpu)) {
+		/*
+		 * L1's specified read shadow might not contain the TS bit,
+		 * so now that we turned on shadowing of this bit, we need to
+		 * set this bit of the shadow. Like in nested_vmx_run we need
+		 * guest_readable_cr0(vmcs12), but vmcs12->guest_cr0 is not
+		 * yet up-to-date here because we just decached cr0.TS (and
+		 * we'll only update vmcs12->guest_cr0 on nested exit).
+		 */
+		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+		vmcs12->guest_cr0 = (vmcs12->guest_cr0 & ~X86_CR0_TS) |
+			(vcpu->arch.cr0 & X86_CR0_TS);
+		vmcs_writel(CR0_READ_SHADOW, guest_readable_cr0(vmcs12));
+	} else
+		vmcs_writel(CR0_READ_SHADOW, vcpu->arch.cr0);
 }
 
 static unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu)