From patchwork Fri Jan 15 21:39:58 2016
X-Patchwork-Submitter: Konrad Rzeszutek Wilk
X-Patchwork-Id: 8044691
Date: Fri, 15 Jan 2016 16:39:58 -0500
From: Konrad Rzeszutek Wilk
To: Jan Beulich
Cc: andrew.cooper3@citrix.com, kevin.tian@intel.com, wim.coekaerts@oracle.com, jun.nakajima@intel.com, xen-devel
Subject: Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6
Message-ID: <20160115213958.GA16118@char.us.oracle.com>
References: <20160112033844.GB15551@char.us.oracle.com> <5694D3CB02000078000C5D00@prv-mh.provo.novell.com>
In-Reply-To: <5694D3CB02000078000C5D00@prv-mh.provo.novell.com>

On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
> >>> On 12.01.16 at 04:38, wrote:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) ----[ Xen-4.6.0 x86_64 debug=y Tainted: C ]----
> > (XEN) CPU: 39
> > (XEN) RIP: e008:[] virtual_vmentry+0x487/0xac9
> > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor (d1v3)
> > (XEN) rax: 0000000000000000 rbx: ffff83007786c000 rcx: 0000000000000000
> > (XEN) rdx: 0000000000000e00 rsi: 000fffffffffffff rdi: ffff83407f81e010
> > (XEN) rbp: ffff834008a47ea8 rsp: ffff834008a47e38 r8: 0000000000000000
> > (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> > (XEN) r12: 0000000000000000 r13: ffff82c000341000 r14: ffff834008a47f18
> > (XEN) r15: ffff83407f7c4000 cr0: 0000000080050033 cr4: 00000000001526e0
> > (XEN) cr3: 000000407fb22000 cr2: 0000000000000000
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> > (XEN) Xen stack trace from rsp=ffff834008a47e38:
> > (XEN) ffff834008a47e68 ffff82d0801d2cde ffff834008a47e68 0000000000000d00
> > (XEN) 0000000000000000 0000000000000000 ffff834008a47e88 00000004801cc30e
> > (XEN) ffff83007786c000 ffff83007786c000 ffff834008a40000 0000000000000000
> > (XEN) ffff834008a47f18 0000000000000000 ffff834008a47f08 ffff82d0801edf94
> > (XEN) ffff834008a47ef8 0000000000000000 ffff834008f62000 ffff834008a47f18
> > (XEN) 000000ae8c99eb8d ffff83007786c000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 ffff82d0801ee2ab
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 00000000078bfbff 0000000000000000 0000000000000000 0000beef0000beef
> > (XEN) fffffffffc4b3440 000000bf0000beef 0000000000040046 fffffffffc607f00
> > (XEN) 000000000000beef 000000000000beef 000000000000beef 000000000000beef
> > (XEN) 000000000000beef 0000000000000027 ffff83007786c000 0000006f88716300
> > (XEN) 0000000000000000
> > (XEN) Xen call trace:
> > (XEN) [] virtual_vmentry+0x487/0xac9
> > (XEN) [] nvmx_switch_guest+0x8ff/0x915
> > (XEN) [] vmx_asm_vmexit_handler+0x4b/0xc0
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 39:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) ****************************************
> > (XEN)
> >
> > ..and then to my surprise the hypervisor stopped hitting this.
>
> Since we can (I hope) pretty much exclude a paging type, the
> ASSERT() must have triggered because of vapic_pg being NULL.
> That might be verifiable without extra printk()s, just by checking
> the disassembly (assuming the value sits in a register). In which
> case vapic_gpfn would be of interest too.

The vapic_gpfn is 0xffffffffffff. To be exact:

nvmx_update_virtual_apic_address:vCPU0 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe

Based on this:

Got me:
(XEN) stdvga.c:151:d1v0 leaving stdvga mode
(XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
(XEN) stdvga.c:520:d1v0 leaving caching mode
(XEN) vvmx.c:2491:d1v0 Unknown nested vmexit reason 80000021.
(XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
(XEN) ************* VMCS Area **************
(XEN) *** Guest State ***
(XEN) CR0: actual=0x0000000000000030, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
(XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
(XEN) CR3 = 0x00000000800ed000
(XEN) RSP = 0x0000000000000000 (0x0000000000000000) RIP = 0x0000000000000000 (0x0000000000000000)
(XEN) RFLAGS=0x00000002 (0x00000002) DR7 = 0x0000000000000400
(XEN) Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
(XEN) sel attr limit base
(XEN) CS: 0000 00000 00000000 0000000000000000
(XEN) DS: 0000 00000 00000000 0000000000000000
(XEN) SS: 0000 00000 00000000 0000000000000000
(XEN) ES: 0000 00000 00000000 0000000000000000
(XEN) FS: 0000 00000 00000000 0000000000000000
(XEN) GS: 0000 00000 00000000 0000000000000000
(XEN) GDTR: 00000000 0000000000000000
(XEN) LDTR: 0000 00000 00000000 0000000000000000
(XEN) IDTR: 00000000 0000000000000000
(XEN) TR: 0000 00000 00000000 0000000000000000
(XEN) EFER = 0x0000000000000800 PAT = 0x0000000000000000
(XEN) PreemptionTimer = 0x00000000 SM Base = 0x00000000
(XEN) DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000
(XEN) Interruptibility = 00000000 ActivityState = 00000000
(XEN) *** Host State ***
(XEN) RIP = 0xffff82d0801ee3a0 (vmx_asm_vmexit_handler) RSP = 0xffff8340077d7f90
(XEN) CS=e008 SS=0000 DS=0000 ES=0000 FS=0000 GS=0000 TR=e040
(XEN) FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff8340077dfc00
(XEN) GDTBase=ffff8340077d0000 IDTBase=ffff8340077dc000
(XEN) CR0=0000000080050033 CR3=000000400076c000 CR4=00000000001526e0
(XEN) Sysenter RSP=ffff8340077d7fc0 CS:RIP=e008:ffff82d080238870
(XEN) EFER = 0x0000000000000000 PAT = 0x0000050100070406
(XEN) *** Control State ***
(XEN) PinBased=0000003f CPUBased=b5b9effe SecondaryExec=000054eb
(XEN) EntryControls=000011fb ExitControls=001fefff
(XEN) ExceptionBitmap=00062042 PFECmask=00000000 PFECmatch=ffffffff
(XEN) VMEntry: intr_info=00000000 errcode=00000000 ilen=00000000
(XEN) VMExit: intr_info=00000000 errcode=00000000 ilen=00000006
(XEN) reason=80000021 qualification=0000000000000000
(XEN) IDTVectoring: info=00000000 errcode=00000000
(XEN) TSC Offset = 0xfffd34adb2c3a149
(XEN) TPR Threshold = 0x00 PostedIntrVec = 0x00
(XEN) EPT pointer = 0x000000400079a01e EPTP index = 0x0000
(XEN) PLE Gap=00000080 Window=00001000
(XEN) Virtual processor ID = 0x004e VMfunc controls = 0000000000000000
(XEN) **************************************
(XEN) domain_crash called from vmx.c:2729
(XEN) Domain 1 (vcpu#0) crashed on cpu#21:
(XEN) ----[ Xen-4.6.0 x86_64 debug=y Tainted: C ]----
(XEN) CPU: 21
(XEN) RIP: 0000:[<0000000000000000>]
(XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d1v0)
(XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: 00000000078bfbff rsi: 0000000000000000 rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000000010 cr4: 0000000000000000
(XEN) cr3: 00000000800ed000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0000
..

>
> Anyway, the writing of the respective VMCS field to zero in the
> alternative worries me a little: Aren't we risking MFN zero to be
> wrongly accessed due to this?
>
> Furthermore, nvmx_update_apic_access_address() having a
> similar ASSERT() seems entirely wrong: The APIC access
> page doesn't really need to match up with any actual page
> belonging to the guest - a guest could choose to point this
> into no-where (note that we've been at least considering this
> option recently for our own purposes, in the context of
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02191.html).
>
> > Instead I started getting an even more bizarre crash:

Ignore this part please.

.. snip..
> this doesn't match the call stack. Something's pretty fishy here.

Yes. The hypervisor was modified alongside me and I hadn't connected the dots...

>
> Jan

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..8a0abfc 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
 
         vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
         vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
-        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
+        if ( !vapic_pg ) {
+            printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
+                   __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
+                   __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
+                   __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
+                   ctrl);
+        }
+        ASSERT(vapic_pg);
+        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
         __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
         put_page(vapic_pg);
     }

>
> What looks odd to me is the connection between
> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
> virtual APIC page: Wouldn't this rather need to depend on
> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
> nvmx_update_apic_access_address()?

Could be. I added a read of the secondary control:

nvmx_update_virtual_apic_address:vCPU2 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0

So trying your recommendation:

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..d291c91 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
     struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
     u32 ctrl;
 
-    ctrl = __n2_exec_control(v);
-    if ( ctrl & CPU_BASED_TPR_SHADOW )
+    ctrl = __n2_secondary_exec_control(v);
+    if ( ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES )
     {
         p2m_type_t p2mt;
         unsigned long vapic_gpfn;