From patchwork Wed May 25 20:15:08 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nadav Har'El X-Patchwork-Id: 817412 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter2.kernel.org (8.14.4/8.14.3) with ESMTP id p4PKFHTh027919 for ; Wed, 25 May 2011 20:15:17 GMT Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755084Ab1EYUPN (ORCPT ); Wed, 25 May 2011 16:15:13 -0400 Received: from mtagate3.uk.ibm.com ([194.196.100.163]:37413 "EHLO mtagate3.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755053Ab1EYUPM (ORCPT ); Wed, 25 May 2011 16:15:12 -0400 Received: from d06nrmr1307.portsmouth.uk.ibm.com (d06nrmr1307.portsmouth.uk.ibm.com [9.149.38.129]) by mtagate3.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p4PKFBrx017082 for ; Wed, 25 May 2011 20:15:11 GMT Received: from d06av07.portsmouth.uk.ibm.com (d06av07.portsmouth.uk.ibm.com [9.149.37.248]) by d06nrmr1307.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p4PKFAH52240538 for ; Wed, 25 May 2011 21:15:10 +0100 Received: from d06av07.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av07.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p4PKFAFI024097 for ; Wed, 25 May 2011 14:15:10 -0600 Received: from rice.haifa.ibm.com (rice.haifa.ibm.com [9.148.8.217]) by d06av07.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id p4PKF9oa024078 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 25 May 2011 14:15:10 -0600 Received: from rice.haifa.ibm.com (lnx-nyh.haifa.ibm.com [127.0.0.1]) by rice.haifa.ibm.com (8.14.4/8.14.4) with ESMTP id p4PKF90R011365; Wed, 25 May 2011 23:15:09 +0300 Received: (from nyh@localhost) by rice.haifa.ibm.com (8.14.4/8.14.4/Submit) id p4PKF8xd011363; Wed, 25 May 2011 23:15:08 +0300 Date: Wed, 25 May 2011 23:15:08 +0300 Message-Id: <201105252015.p4PKF8xd011363@rice.haifa.ibm.com> X-Authentication-Warning: rice.haifa.ibm.com: nyh set sender to "Nadav Har'El" using -f Cc: avi@redhat.com To: kvm@vger.kernel.org From: "Nadav Har'El" References: <1306353651-nyh@il.ibm.com> Subject: [PATCH 27/31] nVMX: Further fixes for lazy FPU loading Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.6 (demeter2.kernel.org [140.211.167.43]); Wed, 25 May 2011 20:15:17 +0000 (UTC) KVM's "Lazy FPU loading" means that sometimes L0 needs to set CR0.TS, even if a guest didn't set it. Moreover, L0 must also trap CR0.TS changes and NM exceptions, even if we have a guest hypervisor (L1) who didn't want these traps. And of course, conversely: If L1 wanted to trap these events, we must let it, even if L0 is not interested in them. This patch fixes some existing KVM code (in update_exception_bitmap(), vmx_fpu_activate(), vmx_fpu_deactivate()) to do the correct merging of L0's and L1's needs. Note that handle_cr() was already fixed in the above patch, and that new code in introduced in previous patches already handles CR0 correctly (see prepare_vmcs02(), prepare_vmcs12(), and nested_vmx_vmexit()). Signed-off-by: Nadav Har'El --- arch/x86/kvm/vmx.c | 31 ++++++++++++++++++++++++++++++- 1 file changed, 30 insertions(+), 1 deletion(-) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --- .before/arch/x86/kvm/vmx.c 2011-05-25 22:41:11.000000000 +0300 +++ .after/arch/x86/kvm/vmx.c 2011-05-25 22:41:11.000000000 +0300 @@ -1179,6 +1179,15 @@ static void update_exception_bitmap(stru eb &= ~(1u << PF_VECTOR); /* bypass_guest_pf = 0 */ if (vcpu->fpu_active) eb &= ~(1u << NM_VECTOR); + + /* When we are running a nested L2 guest and L1 specified for it a + * certain exception bitmap, we must trap the same exceptions and pass + * them to L1. When running L2, we will only handle the exceptions + * specified above if L1 did not want them. + */ + if (is_guest_mode(vcpu)) + eb |= get_vmcs12(vcpu)->exception_bitmap; + vmcs_write32(EXCEPTION_BITMAP, eb); } @@ -1473,6 +1482,9 @@ static void vmx_fpu_activate(struct kvm_ vmcs_writel(GUEST_CR0, cr0); update_exception_bitmap(vcpu); vcpu->arch.cr0_guest_owned_bits = X86_CR0_TS; + if (is_guest_mode(vcpu)) + vcpu->arch.cr0_guest_owned_bits &= + ~get_vmcs12(vcpu)->cr0_guest_host_mask; vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits); } @@ -1496,12 +1508,29 @@ static inline unsigned long nested_read_ static void vmx_fpu_deactivate(struct kvm_vcpu *vcpu) { + /* Note that there is no vcpu->fpu_active = 0 here. The caller must + * set this *before* calling this function. + */ vmx_decache_cr0_guest_bits(vcpu); vmcs_set_bits(GUEST_CR0, X86_CR0_TS | X86_CR0_MP); update_exception_bitmap(vcpu); vcpu->arch.cr0_guest_owned_bits = 0; vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits); - vmcs_writel(CR0_READ_SHADOW, vcpu->arch.cr0); + if (is_guest_mode(vcpu)) { + /* + * L1's specified read shadow might not contain the TS bit, + * so now that we turned on shadowing of this bit, we need to + * set this bit of the shadow. Like in nested_vmx_run we need + * nested_read_cr0(vmcs12), but vmcs12->guest_cr0 is not yet + * up-to-date here because we just decached cr0.TS (and we'll + * only update vmcs12->guest_cr0 on nested exit). + */ + struct vmcs12 *vmcs12 = get_vmcs12(vcpu); + vmcs12->guest_cr0 = (vmcs12->guest_cr0 & ~X86_CR0_TS) | + (vcpu->arch.cr0 & X86_CR0_TS); + vmcs_writel(CR0_READ_SHADOW, nested_read_cr0(vmcs12)); + } else + vmcs_writel(CR0_READ_SHADOW, vcpu->arch.cr0); } static unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu)