From patchwork Thu May 4 09:00:17 2017
X-Patchwork-Submitter: Razvan Cojocaru <rcojocaru@bitdefender.com>
X-Patchwork-Id: 9711025
From: Razvan Cojocaru <rcojocaru@bitdefender.com>
To: xen-devel@lists.xen.org
Cc: andrew.cooper3@citrix.com, tamas@tklengyel.com,
    Razvan Cojocaru <rcojocaru@bitdefender.com>, jbeulich@suse.com
Date: Thu, 4 May 2017 12:00:17 +0300
Message-Id: <1493888417-20803-3-git-send-email-rcojocaru@bitdefender.com>
In-Reply-To: <1493888417-20803-1-git-send-email-rcojocaru@bitdefender.com>
References: <1493888417-20803-1-git-send-email-rcojocaru@bitdefender.com>
Subject: [Xen-devel] [PATCH V3 2/2] x86/vm_event: fix race between __context_switch() and vm_event_resume()

In some cases the introspection agent can reply to a vm_event faster
than vmx_vmexit_handler() can complete, at which point it is not safe
for vm_event_set_registers() to modify v->arch.user_regs. In the test
scenario, we were stepping over an INT3 breakpoint by setting
RIP += 1. The quick reply tended to complete before the vCPU that
triggered the introspection event had properly paused and been
descheduled. If the reply arrives before __context_switch() runs,
__context_switch() clobbers it by overwriting v->arch.user_regs from
the stack. If we don't pass through __context_switch() at all (because
we switched to the idle vCPU), the reply data is not picked up when
switching straight back to the original vCPU.

This patch ensures that vm_event_resume() only sets per-vCPU data,
with the actual setting of the registers deferred to hvm_do_resume(),
which runs on the target vCPU itself (similar to the model already
used to control setting of CRs and MSRs).

The patch additionally removes the now-unnecessary
sync_vcpu_execstate(v) call from vm_event_resume(), and with it the
associated broadcast TLB flush (read: a performance improvement).

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>

---
Changes since V2:
 - Corrected commit message.
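To make the deferred-write model described above concrete, here is a
minimal standalone sketch of the pattern (this is not Xen code and not
part of the patch; names such as toy_vcpu, stage_gprs and
apply_staged_gprs are made up): the reply path only stages the
requested GPR values in per-vCPU storage, and the values are applied
to the live register frame later, on the vCPU's own resume path.

/*
 * Toy illustration of the staged-register model (not Xen code).
 * stage_gprs() stands in for vm_event_set_registers() after the patch,
 * apply_staged_gprs() for hvm_vm_event_set_registers().
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct toy_gprs {
    uint64_t rip, rax;          /* trimmed down to two registers */
};

struct toy_vcpu {
    struct toy_gprs live;       /* stands in for v->arch.user_regs */
    struct toy_gprs staged;     /* stands in for v->arch.vm_event->gprs */
    bool set_gprs;              /* stands in for v->arch.vm_event->set_gprs */
};

/* Reply path (monitor context): never touches the live frame directly. */
static void stage_gprs(struct toy_vcpu *v, const struct toy_gprs *reply)
{
    v->staged = *reply;
    v->set_gprs = true;
}

/* Resume path: runs on the target vCPU itself, cf. hvm_do_resume(). */
static void apply_staged_gprs(struct toy_vcpu *v)
{
    if ( v->set_gprs )
    {
        v->live = v->staged;
        v->set_gprs = false;
    }
}

int main(void)
{
    struct toy_vcpu v = { .live = { .rip = 0x1000, .rax = 0 } };
    struct toy_gprs reply = { .rip = 0x1001, .rax = 0 }; /* "RIP += 1" */

    stage_gprs(&v, &reply);     /* a fast reply arriving early is harmless */
    /* ... a context switch here can no longer clobber the reply ... */
    apply_staged_gprs(&v);      /* applied only once the vCPU resumes */

    printf("rip = 0x%llx\n", (unsigned long long)v.live.rip);
    return 0;
}

The point of the split is that only the vCPU itself ever writes its
live register frame, so a racing __context_switch() can at worst delay
the update rather than lose it.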
---
 xen/arch/x86/hvm/vm_event.c    | 35 +++++++++++++++++++++++++++++++++++
 xen/arch/x86/vm_event.c        | 22 ++--------------------
 xen/common/vm_event.c          | 17 ++++++++++-------
 xen/include/asm-x86/vm_event.h |  2 ++
 4 files changed, 49 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/vm_event.c b/xen/arch/x86/hvm/vm_event.c
index 1934556..8eeb210 100644
--- a/xen/arch/x86/hvm/vm_event.c
+++ b/xen/arch/x86/hvm/vm_event.c
@@ -25,6 +25,39 @@
 #include
 #include

+static void hvm_vm_event_set_registers(const struct vcpu *v)
+{
+    ASSERT(v == current);
+
+    if ( unlikely(v->arch.vm_event->set_gprs) )
+    {
+        struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+        regs->rax = v->arch.vm_event->gprs.rax;
+        regs->rbx = v->arch.vm_event->gprs.rbx;
+        regs->rcx = v->arch.vm_event->gprs.rcx;
+        regs->rdx = v->arch.vm_event->gprs.rdx;
+        regs->rsp = v->arch.vm_event->gprs.rsp;
+        regs->rbp = v->arch.vm_event->gprs.rbp;
+        regs->rsi = v->arch.vm_event->gprs.rsi;
+        regs->rdi = v->arch.vm_event->gprs.rdi;
+
+        regs->r8 = v->arch.vm_event->gprs.r8;
+        regs->r9 = v->arch.vm_event->gprs.r9;
+        regs->r10 = v->arch.vm_event->gprs.r10;
+        regs->r11 = v->arch.vm_event->gprs.r11;
+        regs->r12 = v->arch.vm_event->gprs.r12;
+        regs->r13 = v->arch.vm_event->gprs.r13;
+        regs->r14 = v->arch.vm_event->gprs.r14;
+        regs->r15 = v->arch.vm_event->gprs.r15;
+
+        regs->rflags = v->arch.vm_event->gprs.rflags;
+        regs->rip = v->arch.vm_event->gprs.rip;
+
+        v->arch.vm_event->set_gprs = false;
+    }
+}
+
 void hvm_vm_event_do_resume(struct vcpu *v)
 {
     struct monitor_write_data *w;
@@ -32,6 +65,8 @@ void hvm_vm_event_do_resume(struct vcpu *v)
     if ( likely(!v->arch.vm_event) )
         return;

+    hvm_vm_event_set_registers(v);
+
     w = &v->arch.vm_event->write_data;

     if ( unlikely(v->arch.vm_event->emulate_flags) )
diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
index 502ccc2..a6ea42c 100644
--- a/xen/arch/x86/vm_event.c
+++ b/xen/arch/x86/vm_event.c
@@ -113,26 +113,8 @@ void vm_event_set_registers(struct vcpu *v, vm_event_response_t *rsp)
 {
     ASSERT(atomic_read(&v->vm_event_pause_count));

-    v->arch.user_regs.rax = rsp->data.regs.x86.rax;
-    v->arch.user_regs.rbx = rsp->data.regs.x86.rbx;
-    v->arch.user_regs.rcx = rsp->data.regs.x86.rcx;
-    v->arch.user_regs.rdx = rsp->data.regs.x86.rdx;
-    v->arch.user_regs.rsp = rsp->data.regs.x86.rsp;
-    v->arch.user_regs.rbp = rsp->data.regs.x86.rbp;
-    v->arch.user_regs.rsi = rsp->data.regs.x86.rsi;
-    v->arch.user_regs.rdi = rsp->data.regs.x86.rdi;
-
-    v->arch.user_regs.r8 = rsp->data.regs.x86.r8;
-    v->arch.user_regs.r9 = rsp->data.regs.x86.r9;
-    v->arch.user_regs.r10 = rsp->data.regs.x86.r10;
-    v->arch.user_regs.r11 = rsp->data.regs.x86.r11;
-    v->arch.user_regs.r12 = rsp->data.regs.x86.r12;
-    v->arch.user_regs.r13 = rsp->data.regs.x86.r13;
-    v->arch.user_regs.r14 = rsp->data.regs.x86.r14;
-    v->arch.user_regs.r15 = rsp->data.regs.x86.r15;
-
-    v->arch.user_regs.rflags = rsp->data.regs.x86.rflags;
-    v->arch.user_regs.rip = rsp->data.regs.x86.rip;
+    v->arch.vm_event->gprs = rsp->data.regs.x86;
+    v->arch.vm_event->set_gprs = true;
 }

 void vm_event_monitor_next_interrupt(struct vcpu *v)
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 0fe9a53..9291db6 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -357,6 +357,16 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 {
     vm_event_response_t rsp;

+    /*
+     * vm_event_resume() runs in either XEN_DOMCTL_VM_EVENT_OP_*, or
+     * EVTCHN_send context from the introspection consumer. Both contexts
+     * are guaranteed not to be the subject of vm_event responses.
+     * While we could ASSERT(v != current) for each VCPU in d in the loop
+     * below, this covers the case where we would need to iterate over all
+     * of them more succinctly.
+     */
+    ASSERT(d != current->domain);
+
     /* Pull all responses off the ring. */
     while ( vm_event_get_response(d, ved, &rsp) )
     {
@@ -375,13 +385,6 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
         v = d->vcpu[rsp.vcpu_id];

         /*
-         * Make sure the vCPU state has been synchronized for the custom
-         * handlers.
-         */
-        if ( atomic_read(&v->vm_event_pause_count) )
-            sync_vcpu_execstate(v);
-
-        /*
          * In some cases the response type needs extra handling, so here
          * we call the appropriate handlers.
          */
diff --git a/xen/include/asm-x86/vm_event.h b/xen/include/asm-x86/vm_event.h
index ca73f99..39e73c8 100644
--- a/xen/include/asm-x86/vm_event.h
+++ b/xen/include/asm-x86/vm_event.h
@@ -32,6 +32,8 @@ struct arch_vm_event {
         struct vm_event_emul_insn_data insn;
     } emul;
     struct monitor_write_data write_data;
+    struct vm_event_regs_x86 gprs;
+    bool set_gprs;
 };

 int vm_event_init_domain(struct domain *d);
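For context on the consumer side mentioned in the commit message
(stepping over an INT3 by replying with RIP += 1), the fragment below
sketches how a monitoring application might build such a reply. It is
not part of the patch: the ring/event-channel helpers get_request(),
put_response() and notify_monitor_ring() are hypothetical placeholders,
and the request/response fields follow Xen's public vm_event interface
(xen/include/public/vm_event.h) to the best of my knowledge; treat it
as a sketch rather than buildable tool code.

/* Hedged sketch of an introspection consumer's reply path (not Xen code). */
#include <string.h>
#include <xen/vm_event.h>   /* public vm_event interface; the include path
                               depends on how the tool is built */

/* Hypothetical ring helpers supplied by the monitoring application. */
extern void get_request(vm_event_request_t *req);
extern void put_response(const vm_event_response_t *rsp);
extern void notify_monitor_ring(void);

static void reply_step_over_int3(void)
{
    vm_event_request_t req;
    vm_event_response_t rsp;

    get_request(&req);

    memset(&rsp, 0, sizeof(rsp));
    rsp.version = VM_EVENT_INTERFACE_VERSION;
    rsp.vcpu_id = req.vcpu_id;
    rsp.reason  = req.reason;
    /* Echo the pause flag so the vCPU gets unpaused on resume. */
    rsp.flags   = req.flags & VM_EVENT_FLAG_VCPU_PAUSED;

    /* Step over the 1-byte INT3: reply with the same GPRs, RIP += 1. */
    rsp.data.regs.x86 = req.data.regs.x86;
    rsp.data.regs.x86.rip += 1;
    rsp.flags |= VM_EVENT_FLAG_SET_REGISTERS;

    put_response(&rsp);
    notify_monitor_ring();

    /*
     * With this patch, even if the reply above races with the vCPU still
     * being descheduled, the new RIP is staged in v->arch.vm_event->gprs
     * and applied in hvm_do_resume(), so __context_switch() can no longer
     * clobber it.
     */
}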