From patchwork Thu Feb 16 08:29:31 2017
X-Patchwork-Submitter: Sergey Dyasli
X-Patchwork-Id: 9576635
From: Sergey Dyasli
To: "JBeulich@suse.com"
Date: Thu, 16 Feb 2017 08:29:31 +0000
Message-ID: <1487233771.3032.1.camel@citrix.com>
In-Reply-To: <58A47EB5020000780013A442@prv-mh.provo.novell.com>
Cc: Sergey Dyasli, Kevin Tian, "Kevin.Mayer@gdata.de", Andrew Cooper,
 Anshul Makkar, "jun.nakajima@intel.com", "xen-devel@lists.xenproject.org"
Subject: Re: [Xen-devel] [PATCH 1/2] VMX: fix VMCS race on context-switch paths

On Wed, 2017-02-15 at 08:15 -0700, Jan Beulich wrote:
> > > > On 15.02.17 at 15:55, wrote:
> > 
> > On Wed, 2017-02-15 at 06:20 -0700, Jan Beulich wrote:
> > > > > > On 15.02.17 at 11:27, wrote:
> > > > 
> > > > This is what I'm getting during the original test case (32 VMs reboot):
> > > > 
> > > > (XEN) [ 1407.789329] Watchdog timer detects that CPU12 is stuck!
> > > > (XEN) [ 1407.795726] ----[ Xen-4.6.1-xs-local  x86_64  debug=n  Not tainted ]----
> > > 
> > > Hmm, this was with a non-debug build, so the ASSERT() in
> > > vmx_vmcs_reload() was a no-op, yet it would have been useful
> > > to know whether active_cpu was -1 when getting stuck here.
> > > Btw - there was no nested virt in the picture in your try, was
> > > there?
> > 
> > No nested virt is involved in the test case.
> > 
> > Is it worth giving your patch another try with removing ctxt_switch_same(),
> > since we figured out that vmx_do_resume() will reload the VMCS either way?
> 
> Yes, but that's the cosmetic part, whereas ...
> 
> > And I will also update vmx_vmcs_reload() from your last email.
> 
> ... this looks to be the actual bug fix. If you agree with my
> reasoning of removing the loop altogether, you may want to go
> with that version instead of adding the conditional.

After extensive overnight testing, it seems safe to assume that the patch
below fixes the PML issue.

I agree about removing the spinning, since vmx_vmcs_enter/exit are
synchronized with the scheduler by the schedule_lock. But it costs nothing
to check, so I added a debug message to the loop. Needless to say, it was
never printed.

My patch for vmx_vmcs_exit() is obviously a half measure, because it doesn't
protect against the VMCS being cleared by an external IPI while current is
idle. I'm not sure such a situation is possible, but nothing prevents it.
This clearly makes your approach superior, and I think you should submit v2
for proper review.

---
Thanks,
Sergey

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 88db7ee..07e8527 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -551,6 +551,33 @@ static void vmx_load_vmcs(struct vcpu *v)
     local_irq_restore(flags);
 }
 
+void vmx_vmcs_reload(struct vcpu *v)
+{
+    /*
+     * As we're running with interrupts disabled, we can't acquire
+     * v->arch.hvm_vmx.vmcs_lock here. However, with interrupts disabled
+     * the VMCS can't be taken away from us anymore if we still own it.
+     */
+    ASSERT(!local_irq_is_enabled());
+    if ( v->arch.hvm_vmx.vmcs == this_cpu(current_vmcs) )
+        return;
+    ASSERT(!this_cpu(current_vmcs));
+
+    if ( v->arch.hvm_vmx.active_cpu != smp_processor_id() )
+    {
+        /*
+         * Wait for the remote side to be done with the VMCS before loading
+         * it here.
+         */
+        while ( v->arch.hvm_vmx.active_cpu != -1 ) {
+            printk("DS: v->arch.hvm_vmx.active_cpu == %d\n",
+                    v->arch.hvm_vmx.active_cpu);
+            cpu_relax();
+        }
+    }
+    vmx_load_vmcs(v);
+}
+
 int vmx_cpu_up_prepare(unsigned int cpu)
 {
     /*
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 8cafec2..ccf433f 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -734,6 +734,18 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
     if ( unlikely(!this_cpu(vmxon)) )
         return;
 
+    if ( !v->is_running )
+    {
+        /*
+         * When this vCPU isn't marked as running anymore, a remote pCPU's
+         * attempt to pause us (from vmx_vmcs_enter()) won't have a reason
+         * to spin in vcpu_sleep_sync(), and hence that pCPU might have taken
+         * away the VMCS from us. As we're running with interrupts disabled,
+         * we also can't call vmx_vmcs_enter().
+         */
+        vmx_vmcs_reload(v);
+    }
+
     vmx_fpu_leave(v);
     vmx_save_guest_msrs(v);
     vmx_restore_host_msrs();
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 5974cce..2bf8829 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -157,6 +157,7 @@ void vmx_destroy_vmcs(struct vcpu *v);
 void vmx_vmcs_enter(struct vcpu *v);
 bool_t __must_check vmx_vmcs_try_enter(struct vcpu *v);
 void vmx_vmcs_exit(struct vcpu *v);
+void vmx_vmcs_reload(struct vcpu *v);
 
 #define CPU_BASED_VIRTUAL_INTR_PENDING        0x00000004
 #define CPU_BASED_USE_TSC_OFFSETING           0x00000008
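
P.S. For completeness, dropping the spinning entirely (as suggested above)
would presumably reduce vmx_vmcs_reload() to something like the sketch below.
This is only an illustration of that loop-free variant under the
schedule_lock reasoning, not the actual v2:

/* Illustrative sketch only: vmx_vmcs_reload() with the wait loop removed. */
void vmx_vmcs_reload(struct vcpu *v)
{
    /*
     * Interrupts are off, so the VMCS can't be taken away from us once we
     * own it; and per the schedule_lock argument, no remote pCPU should
     * still have it active by the time we get here.
     */
    ASSERT(!local_irq_is_enabled());

    if ( v->arch.hvm_vmx.vmcs == this_cpu(current_vmcs) )
        return;
    ASSERT(!this_cpu(current_vmcs));

    vmx_load_vmcs(v);
}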