From patchwork Mon May 6 07:04:25 2013
X-Patchwork-Submitter: "Nakajima, Jun"
X-Patchwork-Id: 2522881
From: Jun Nakajima
To: kvm@vger.kernel.org
Subject: [PATCH v2 06/13] nEPT: Fix cr3 handling in nested exit and entry
Date: Mon, 6 May 2013 00:04:25 -0700
Message-Id: <1367823872-25895-6-git-send-email-jun.nakajima@intel.com>
In-Reply-To: <1367823872-25895-5-git-send-email-jun.nakajima@intel.com>

The existing code for handling cr3 and related VMCS fields during nested
exit and entry wasn't correct in all cases:

If L2 is allowed to control cr3 (and this is indeed the case in nested
EPT), during nested exit we must copy the modified cr3 from vmcs02 to
vmcs12, and we forgot to do so. This patch adds this copy.

If L0 isn't controlling cr3 when running L2 (i.e., L0 is using EPT), and
whoever does control cr3 (L1 or L2) is using PAE, the processor might have
saved PDPTEs and we should also save them in vmcs12 (and restore later).
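A note on when the PDPTEs matter: the processor only loads and saves
GUEST_PDPTR0..3 when the guest in question runs with PAE paging, i.e.
CR0.PG=1, CR4.PAE=1 and EFER.LMA=0. The patch below saves/restores the
fields whenever enable_ept is set, which is harmless in the other modes;
the following is only a minimal sketch of the architectural condition,
assuming the is_paging(), is_pae() and is_long_mode() helpers from
arch/x86/kvm/x86.h (the function name here is hypothetical, for
illustration only, and is not part of this patch):

static bool guest_uses_pae_paging(struct kvm_vcpu *vcpu)
{
	/* PAE paging: paging enabled, CR4.PAE set, not in long mode */
	return is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu);
}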
Signed-off-by: Nadav Har'El
Signed-off-by: Jun Nakajima
Signed-off-by: Xinhao Xu
---
 arch/x86/kvm/vmx.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8fdcacf..d797d3e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7163,10 +7163,26 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmx_set_cr4(vcpu, vmcs12->guest_cr4);
 	vmcs_writel(CR4_READ_SHADOW, nested_read_cr4(vmcs12));
 
-	/* shadow page tables on either EPT or shadow page tables */
+	/*
+	 * Note that kvm_set_cr3() and kvm_mmu_reset_context() will do the
+	 * right thing, and set GUEST_CR3 and/or EPT_POINTER in all supported
+	 * settings: 1. shadow page tables on shadow page tables, 2. shadow
+	 * page tables on EPT, 3. EPT on EPT.
+	 */
 	kvm_set_cr3(vcpu, vmcs12->guest_cr3);
 	kvm_mmu_reset_context(vcpu);
 
+	/*
+	 * Additionally, except when L0 is using shadow page tables, L1 or
+	 * L2 control guest_cr3 for L2, so they may also have saved PDPTEs
+	 */
+	if (enable_ept) {
+		vmcs_write64(GUEST_PDPTR0, vmcs12->guest_pdptr0);
+		vmcs_write64(GUEST_PDPTR1, vmcs12->guest_pdptr1);
+		vmcs_write64(GUEST_PDPTR2, vmcs12->guest_pdptr2);
+		vmcs_write64(GUEST_PDPTR3, vmcs12->guest_pdptr3);
+	}
+
 	kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->guest_rsp);
 	kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->guest_rip);
 }
@@ -7398,6 +7414,25 @@ void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->guest_pending_dbg_exceptions =
 		vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
 
+	/*
+	 * In some cases (usually, nested EPT), L2 is allowed to change its
+	 * own CR3 without exiting. If it has changed it, we must keep it.
+	 * Of course, if L0 is using shadow page tables, GUEST_CR3 was defined
+	 * by L0, not L1 or L2, so we mustn't unconditionally copy it to vmcs12.
+	 */
+	if (enable_ept)
+		vmcs12->guest_cr3 = vmcs_read64(GUEST_CR3);
+	/*
+	 * Additionally, except when L0 is using shadow page tables, L1 or
+	 * L2 control guest_cr3 for L2, so save their PDPTEs
+	 */
+	if (enable_ept) {
+		vmcs12->guest_pdptr0 = vmcs_read64(GUEST_PDPTR0);
+		vmcs12->guest_pdptr1 = vmcs_read64(GUEST_PDPTR1);
+		vmcs12->guest_pdptr2 = vmcs_read64(GUEST_PDPTR2);
+		vmcs12->guest_pdptr3 = vmcs_read64(GUEST_PDPTR3);
+	}
+
 	/* TODO: These cannot have changed unless we have MSR bitmaps and
 	 * the relevant bit asks not to trap the change */
 	vmcs12->guest_ia32_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);