From patchwork Tue Jul 12 09:02:09 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Zhang <yu.c.zhang@linux.intel.com>
X-Patchwork-Id: 9224843
From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: xen-devel@lists.xen.org
Date: Tue, 12 Jul 2016 17:02:09 +0800
Message-Id: <1468314129-28465-5-git-send-email-yu.c.zhang@linux.intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1468314129-28465-1-git-send-email-yu.c.zhang@linux.intel.com>
References: <1468314129-28465-1-git-send-email-yu.c.zhang@linux.intel.com>
Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper, Paul Durrant,
 zhiyuan.lv@intel.com, Jan Beulich
Subject: [Xen-devel] [PATCH v5 4/4] x86/ioreq server: Reset outstanding
 p2m_ioreq_server entries when an ioreq server unmaps.

This patch resets p2m_ioreq_server entries back to p2m_ram_rw after an
ioreq server has unmapped. The resync is done both asynchronously, with
the current p2m_change_entry_type_global() interface, and synchronously,
by iterating over the p2m table. The synchronous reset is necessary
because we need to guarantee the p2m table is clean before another ioreq
server is mapped. And since sweeping the p2m table can be time consuming,
it is done with hypercall continuation. The asynchronous approach is also
taken so that p2m_ioreq_server entries can be reset whenever the HVM
domain is scheduled between hypercall continuations.

This patch also disallows live migration while there are still
outstanding p2m_ioreq_server entries. The core reason is that our current
implementation of p2m_change_entry_type_global() cannot tell the state of
p2m_ioreq_server entries (it cannot decide whether an entry is to be
emulated or to be resynced).

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Paul Durrant
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: George Dunlap
Cc: Jun Nakajima
Cc: Kevin Tian
---
 xen/arch/x86/hvm/hvm.c    | 52 ++++++++++++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/hap/hap.c |  9 ++++++++
 xen/arch/x86/mm/p2m-ept.c |  6 +++++-
 xen/arch/x86/mm/p2m-pt.c  | 10 +++++++--
 xen/arch/x86/mm/p2m.c     |  3 +++
 xen/include/asm-x86/p2m.h |  5 ++++-
 6 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4d98cc6..e57c8b9 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5485,6 +5485,7 @@ static int hvmop_set_mem_type(
     {
         unsigned long pfn = a.first_pfn + start_iter;
         p2m_type_t t;
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
 
         get_gfn_unshare(d, pfn, &t);
         if ( p2m_is_paging(t) )
@@ -5512,6 +5513,12 @@ static int hvmop_set_mem_type(
         if ( rc )
             goto out;
 
+        if ( t == p2m_ram_rw && memtype[a.hvmmem_type] == p2m_ioreq_server )
+            p2m->ioreq.entry_count++;
+
+        if ( t == p2m_ioreq_server && memtype[a.hvmmem_type] == p2m_ram_rw )
+            p2m->ioreq.entry_count--;
+
         /* Check for continuation if it's not the last interation */
         if ( a.nr > ++start_iter && !(start_iter & HVMOP_op_mask) &&
              hypercall_preempt_check() )
@@ -5530,11 +5537,13 @@ static int hvmop_set_mem_type(
 }
 
 static int hvmop_map_mem_type_to_ioreq_server(
-    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
+    unsigned long *iter)
 {
     xen_hvm_map_mem_type_to_ioreq_server_t op;
     struct domain *d;
     int rc;
+    unsigned long gfn = *iter;
 
     if ( copy_from_guest(&op, uop, 1) )
         return -EFAULT;
@@ -5559,7 +5568,42 @@ static int hvmop_map_mem_type_to_ioreq_server(
     if ( rc != 0 )
         goto out;
 
-    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+    if ( gfn == 0 || op.flags != 0 )
+        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+    /*
+     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
+     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
+     */
+    if ( op.flags == 0 && rc == 0 )
+    {
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        while ( gfn <= p2m->max_mapped_pfn )
+        {
+            p2m_type_t t;
+
+            if ( p2m->ioreq.entry_count == 0 )
+                break;
+
+            get_gfn_unshare(d, gfn, &t);
+
+            if ( (t == p2m_ioreq_server) &&
+                 !(p2m_change_type_one(d, gfn, t, p2m_ram_rw)) )
+                p2m->ioreq.entry_count--;
+
+            put_gfn(d, gfn);
+
+            /* Check for continuation if it's not the last iteration. */
+            if ( ++gfn <= p2m->max_mapped_pfn &&
+                 !(gfn & HVMOP_op_mask) &&
+                 hypercall_preempt_check() )
+            {
+                rc = -ERESTART;
+                goto out;
+            }
+        }
+    }
 
  out:
     rcu_unlock_domain(d);
@@ -5578,6 +5622,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     case HVMOP_modified_memory:
     case HVMOP_set_mem_type:
+    case HVMOP_map_mem_type_to_ioreq_server:
         mask = HVMOP_op_mask;
         break;
     }
@@ -5607,7 +5652,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case HVMOP_map_mem_type_to_ioreq_server:
         rc = hvmop_map_mem_type_to_ioreq_server(
-            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t),
+            &start_iter);
         break;
 
     case HVMOP_set_ioreq_server_state:
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 9c2cd49..0442b1d 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -190,6 +190,15 @@ out:
  */
 static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
 {
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    /*
+     * Refuse to turn on global log-dirty mode if
+     * there's outstanding p2m_ioreq_server pages.
+     */
+    if ( log_global && p2m->ioreq.entry_count > 0 )
+        return -EBUSY;
+
     /* turn on PG_log_dirty bit in paging mode */
     paging_lock(d);
     d->arch.paging.mode |= PG_log_dirty;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 5f06d40..5d4d804 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -545,6 +545,9 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                     e.ipat = ipat;
                     if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
                     {
+                         if ( e.sa_p2mt == p2m_ioreq_server )
+                             p2m->ioreq.entry_count--;
+
                          e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
                                      ? p2m_ram_logdirty : p2m_ram_rw;
                          ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
@@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
     if ( is_epte_valid(ept_entry) )
     {
         if ( (recalc || ept_entry->recalc) &&
-             p2m_is_changeable(ept_entry->sa_p2mt) )
+             p2m_is_changeable(ept_entry->sa_p2mt) &&
+             (ept_entry->sa_p2mt != p2m_ioreq_server) )
             *t = p2m_is_logdirty_range(p2m, gfn, gfn) ?
                  p2m_ram_logdirty : p2m_ram_rw;
         else
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 6209e7b..7bebfd1 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -439,11 +439,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
          needs_recalc(l1, *pent) )
     {
         l1_pgentry_t e = *pent;
+        p2m_type_t p2mt_old;
 
         if ( !valid_recalc(l1, e) )
             P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
                       p2m->domain->domain_id, gfn, level);
-        if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
+        p2mt_old = p2m_flags_to_type(l1e_get_flags(e));
+        if ( p2m_is_changeable(p2mt_old) )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
@@ -463,6 +465,10 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
                     mfn &= ~(_PAGE_PSE_PAT >> PAGE_SHIFT);
                     flags |= _PAGE_PSE;
                 }
+
+            if ( p2mt_old == p2m_ioreq_server )
+                p2m->ioreq.entry_count--;
+
             e = l1e_from_pfn(mfn, flags);
             p2m_add_iommu_flags(&e, level,
                                 (p2mt == p2m_ram_rw)
@@ -729,7 +735,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
                                      struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !recalc || !p2m_is_changeable(t) )
+    if ( !recalc || !p2m_is_changeable(t) || (t == p2m_ioreq_server) )
         return t;
     return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                 : p2m_ram_rw;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 5567181..e1c3e31 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -313,6 +313,9 @@ int p2m_set_ioreq_server(struct domain *d,
 
         p2m->ioreq.server = NULL;
         p2m->ioreq.flags = 0;
+
+        if ( p2m->ioreq.entry_count > 0 )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
     }
     else
     {
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0950a91..8e5b4f5 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -120,7 +120,8 @@ typedef unsigned int p2m_query_t;
 
 /* Types that can be subject to bulk transitions. */
 #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
-                              | p2m_to_mask(p2m_ram_logdirty) )
+                              | p2m_to_mask(p2m_ram_logdirty) \
+                              | p2m_to_mask(p2m_ioreq_server))
 
 #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
 
@@ -352,6 +353,8 @@ struct p2m_domain {
 
 #define P2M_IOREQ_HANDLE_WRITE_ACCESS XEN_HVMOP_IOREQ_MEM_ACCESS_WRITE
 #define P2M_IOREQ_HANDLE_READ_ACCESS XEN_HVMOP_IOREQ_MEM_ACCESS_READ
+
+        unsigned int entry_count;
     } ioreq;
 };
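
For reference, the sweep added to hvmop_map_mem_type_to_ioreq_server() follows
the usual preemptible-iteration pattern: walk the p2m from a saved gfn
iterator, stop early once ioreq.entry_count reaches zero, and give up the CPU
on an HVMOP_op_mask boundary so the hypercall is continued later. The
stand-alone sketch below models that pattern in plain C. It is illustrative
only; every name in it (fake_p2m, sweep_ioreq_entries, OP_MASK, RESTART) is
invented for the example and does not come from the Xen tree.

/*
 * Minimal userspace model of the preemptible sweep: reset outstanding
 * "ioreq_server" entries back to "ram_rw" in bounded chunks, returning
 * RESTART so the caller can resume from the saved iterator, the way the
 * hypercall continuation resumes from start_iter.
 */
#include <stdio.h>

enum p2m_type { p2m_ram_rw, p2m_ram_logdirty, p2m_ioreq_server };

#define P2M_SIZE  64u   /* toy p2m table size */
#define OP_MASK   7u    /* mimic HVMOP_op_mask: preempt every 8 gfns */
#define RESTART  -1     /* mimic -ERESTART */

struct fake_p2m {
    enum p2m_type type[P2M_SIZE];
    unsigned int entry_count;    /* outstanding p2m_ioreq_server entries */
};

/* Sweep from *iter; return RESTART when preempted, 0 when the sweep is done. */
static int sweep_ioreq_entries(struct fake_p2m *p2m, unsigned long *iter)
{
    unsigned long gfn = *iter;

    while ( gfn < P2M_SIZE )
    {
        if ( p2m->entry_count == 0 )
            break;                        /* nothing left to reset */

        if ( p2m->type[gfn] == p2m_ioreq_server )
        {
            p2m->type[gfn] = p2m_ram_rw;  /* reset the entry */
            p2m->entry_count--;
        }

        /* "Continuation": stop on a boundary so the caller can resume later. */
        if ( ++gfn < P2M_SIZE && !(gfn & OP_MASK) )
        {
            *iter = gfn;
            return RESTART;
        }
    }

    *iter = gfn;
    return 0;
}

int main(void)
{
    struct fake_p2m p2m = { .entry_count = 0 };
    unsigned long iter = 0;
    int rc, passes = 0;

    /* Mark a few scattered gfns as p2m_ioreq_server. */
    for ( unsigned long gfn = 3; gfn < P2M_SIZE; gfn += 10 )
    {
        p2m.type[gfn] = p2m_ioreq_server;
        p2m.entry_count++;
    }

    /* Keep re-invoking the sweep until it no longer asks to be restarted. */
    do {
        rc = sweep_ioreq_entries(&p2m, &iter);
        passes++;
    } while ( rc == RESTART );

    printf("sweep finished after %d passes, %u entries outstanding\n",
           passes, p2m.entry_count);
    return 0;
}

The real code additionally holds the gfn reference across the check via
get_gfn_unshare()/put_gfn() and only yields when hypercall_preempt_check()
reports pending work; those details are omitted from the sketch.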