From patchwork Tue Mar 21 02:52:52 2017
X-Patchwork-Submitter: Yu Zhang
X-Patchwork-Id: 9635885
From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: xen-devel@lists.xen.org
Date: Tue, 21 Mar 2017 10:52:52 +0800
Message-Id: <1490064773-26751-5-git-send-email-yu.c.zhang@linux.intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1490064773-26751-1-git-send-email-yu.c.zhang@linux.intel.com>
References: <1490064773-26751-1-git-send-email-yu.c.zhang@linux.intel.com>
Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper, Paul Durrant,
    zhiyuan.lv@intel.com, Jan Beulich
Subject: [Xen-devel] [PATCH v9 4/5] x86/ioreq server: Asynchronously reset
 outstanding p2m_ioreq_server entries.

After an ioreq server has unmapped, the remaining p2m_ioreq_server
entries need to be reset back to p2m_ram_rw. This patch does this
asynchronously with the current p2m_change_entry_type_global()
interface.

This patch also disallows live migration while there are still
outstanding p2m_ioreq_server entries. The core reason is that the
current implementation of p2m_change_entry_type_global() cannot tell
the state of p2m_ioreq_server entries (it cannot decide whether an
entry is still to be emulated or is to be resynced).

Note: a new field, entry_count, is introduced in struct p2m_domain to
record the number of p2m_ioreq_server p2m page table entries. One
property of these entries is that they only point to 4K-sized page
frames, because all p2m_ioreq_server entries originate from p2m_ram_rw
ones in p2m_change_type_one(); we therefore do not need to worry about
counting 2M/1G-sized pages.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Reviewed-by: Paul Durrant
---
Cc: Paul Durrant
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: George Dunlap
Cc: Jun Nakajima
Cc: Kevin Tian

changes in v4:
  - According to comments from Jan: use ASSERT() instead of an 'if'
    condition in p2m_change_type_one().
  - According to comments from Jan: mention in the commit message that
    p2m_ioreq_server entries are all based on 4K-sized pages.

changes in v3:
  - Move the synchronous resetting logic into patch 5.
  - According to comments from Jan: introduce p2m_check_changeable()
    to clarify the p2m type change code.
  - According to comments from George: take locks in the same order to
    avoid deadlock; call p2m_change_entry_type_global() after the unmap
    of the ioreq server is finished.

changes in v2:
  - Move the calculation of the ioreq server page entry_count into
    p2m_change_type_one() so that we do not need a separate lock.
    Note: entry_count is also calculated in resolve_misconfig()/
    do_recalc(); fortunately, callers of both routines already hold
    the p2m lock.
  - Simplify the logic in hvmop_set_mem_type().
  - Introduce p2m_finish_type_change() to walk the p2m table and do
    the p2m reset.
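As an illustrative aside (not part of the patch): the entry_count
bookkeeping described above can be summarized by the minimal,
self-contained C sketch below. The toy_* names and the simplified type
enum are hypothetical stand-ins, not Xen interfaces; the sketch only
models the invariant that the counter rises on the p2m_ram_rw ->
p2m_ioreq_server transition in p2m_change_type_one() and falls when an
outstanding entry is changed or lazily recalculated back to p2m_ram_rw.

/*
 * Toy model of the entry_count accounting (illustration only, not Xen code).
 */
#include <assert.h>
#include <stdio.h>

typedef enum { p2m_ram_rw, p2m_ram_logdirty, p2m_ioreq_server } toy_p2m_type_t;

struct toy_p2m {
    long ioreq_entry_count;            /* mirrors p2m->ioreq.entry_count */
};

/* Models the counting added to p2m_change_type_one(). */
static void toy_change_type_one(struct toy_p2m *p2m,
                                toy_p2m_type_t ot, toy_p2m_type_t nt)
{
    if ( nt == p2m_ioreq_server )
    {
        assert(ot == p2m_ram_rw);      /* only 4K p2m_ram_rw pages are converted */
        p2m->ioreq_entry_count++;
    }
    else if ( ot == p2m_ioreq_server && nt == p2m_ram_rw )
    {
        p2m->ioreq_entry_count--;      /* explicit reset of one entry */
        assert(p2m->ioreq_entry_count >= 0);
    }
}

/* Models the lazy decrement done in resolve_misconfig()/do_recalc(). */
static void toy_recalc_entry(struct toy_p2m *p2m, toy_p2m_type_t *t)
{
    if ( *t == p2m_ioreq_server )
    {
        p2m->ioreq_entry_count--;
        assert(p2m->ioreq_entry_count >= 0);
        *t = p2m_ram_rw;               /* outstanding entry reset to p2m_ram_rw */
    }
}

int main(void)
{
    struct toy_p2m p2m = { 0 };
    toy_p2m_type_t entry = p2m_ram_rw;

    toy_change_type_one(&p2m, entry, p2m_ioreq_server);   /* map for emulation */
    entry = p2m_ioreq_server;
    printf("outstanding: %ld\n", p2m.ioreq_entry_count);  /* 1 */

    toy_recalc_entry(&p2m, &entry);                       /* asynchronous reset */
    printf("outstanding: %ld\n", p2m.ioreq_entry_count);  /* 0 */
    return 0;
}

The sketch prints an outstanding count of 1 after the type change and 0
after the lazy recalc, mirroring the asynchronous reset the patch
triggers via p2m_change_entry_type_global().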
---
 xen/arch/x86/hvm/ioreq.c  |  8 ++++++++
 xen/arch/x86/mm/hap/hap.c |  9 +++++++++
 xen/arch/x86/mm/p2m-ept.c |  8 +++++++-
 xen/arch/x86/mm/p2m-pt.c  | 13 +++++++++++--
 xen/arch/x86/mm/p2m.c     | 20 ++++++++++++++++++++
 xen/include/asm-x86/p2m.h |  9 ++++++++-
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 746799f..102c6c2 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -949,6 +949,14 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
 
     spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
+    if ( rc == 0 && flags == 0 )
+    {
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        if ( read_atomic(&p2m->ioreq.entry_count) )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
+    }
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index a57b385..6ec950a 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -187,6 +187,15 @@ out:
  */
 static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
 {
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    /*
+     * Refuse to turn on global log-dirty mode if
+     * there's outstanding p2m_ioreq_server pages.
+     */
+    if ( log_global && read_atomic(&p2m->ioreq.entry_count) )
+        return -EBUSY;
+
     /* turn on PG_log_dirty bit in paging mode */
     paging_lock(d);
     d->arch.paging.mode |= PG_log_dirty;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index cc1eb21..1df3d09 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -544,6 +544,12 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                 e.ipat = ipat;
                 if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
                 {
+                     if ( e.sa_p2mt == p2m_ioreq_server )
+                     {
+                         p2m->ioreq.entry_count--;
+                         ASSERT(p2m->ioreq.entry_count >= 0);
+                     }
+
                      e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
                                  ? p2m_ram_logdirty : p2m_ram_rw;
                      ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
@@ -965,7 +971,7 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
     if ( is_epte_valid(ept_entry) )
     {
         if ( (recalc || ept_entry->recalc) &&
-             p2m_is_changeable(ept_entry->sa_p2mt) )
+             p2m_check_changeable(ept_entry->sa_p2mt) )
             *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                       : p2m_ram_rw;
         else
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index f6c45ec..169de75 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -436,11 +436,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
          needs_recalc(l1, *pent) )
     {
         l1_pgentry_t e = *pent;
+        p2m_type_t p2mt_old;
 
         if ( !valid_recalc(l1, e) )
             P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
                       p2m->domain->domain_id, gfn, level);
-        if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
+        p2mt_old = p2m_flags_to_type(l1e_get_flags(e));
+        if ( p2m_is_changeable(p2mt_old) )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
@@ -460,6 +462,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
                 mfn &= ~(_PAGE_PSE_PAT >> PAGE_SHIFT);
                 flags |= _PAGE_PSE;
             }
+
+            if ( p2mt_old == p2m_ioreq_server )
+            {
+                p2m->ioreq.entry_count--;
+                ASSERT(p2m->ioreq.entry_count >= 0);
+            }
+
             e = l1e_from_pfn(mfn, flags);
             p2m_add_iommu_flags(&e, level,
                                 (p2mt == p2m_ram_rw)
@@ -729,7 +738,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
                                      struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !recalc || !p2m_is_changeable(t) )
+    if ( !recalc || !p2m_check_changeable(t) )
         return t;
     return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                 : p2m_ram_rw;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index dd4e477..e3e54f1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -954,6 +954,26 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
                          p2m->default_access)
          : -EBUSY;
 
+    if ( !rc )
+    {
+        switch ( nt )
+        {
+        case p2m_ram_rw:
+            if ( ot == p2m_ioreq_server )
+            {
+                p2m->ioreq.entry_count--;
+                ASSERT(p2m->ioreq.entry_count >= 0);
+            }
+            break;
+        case p2m_ioreq_server:
+            ASSERT(ot == p2m_ram_rw);
+            p2m->ioreq.entry_count++;
+            break;
+        default:
+            break;
+        }
+    }
+
     gfn_unlock(p2m, gfn, 0);
 
     return rc;
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 3786680..395f125 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -120,7 +120,10 @@ typedef unsigned int p2m_query_t;
 
 /* Types that can be subject to bulk transitions. */
 #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
-                              | p2m_to_mask(p2m_ram_logdirty) )
+                              | p2m_to_mask(p2m_ram_logdirty) \
+                              | p2m_to_mask(p2m_ioreq_server) )
+
+#define P2M_IOREQ_TYPES (p2m_to_mask(p2m_ioreq_server))
 
 #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
 
@@ -157,6 +160,7 @@ typedef unsigned int p2m_query_t;
 #define p2m_is_readonly(_t) (p2m_to_mask(_t) & P2M_RO_TYPES)
 #define p2m_is_discard_write(_t) (p2m_to_mask(_t) & P2M_DISCARD_WRITE_TYPES)
 #define p2m_is_changeable(_t) (p2m_to_mask(_t) & P2M_CHANGEABLE_TYPES)
+#define p2m_is_ioreq(_t) (p2m_to_mask(_t) & P2M_IOREQ_TYPES)
 #define p2m_is_pod(_t) (p2m_to_mask(_t) & P2M_POD_TYPES)
 #define p2m_is_grant(_t) (p2m_to_mask(_t) & P2M_GRANT_TYPES)
 /* Grant types are *not* considered valid, because they can be
@@ -178,6 +182,8 @@ typedef unsigned int p2m_query_t;
 
 #define p2m_allows_invalid_mfn(t) (p2m_to_mask(t) & P2M_INVALID_MFN_TYPES)
 
+#define p2m_check_changeable(t) (p2m_is_changeable(t) && !p2m_is_ioreq(t))
+
 typedef enum {
     p2m_host,
     p2m_nested,
@@ -349,6 +355,7 @@ struct p2m_domain {
          * are to be emulated by an ioreq server.
          */
         unsigned int flags;
+        long entry_count;
     } ioreq;
 };
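A short note on the p2m_is_changeable()/p2m_check_changeable() split
used above (illustration only, not part of the patch): p2m_ioreq_server
is added to P2M_CHANGEABLE_TYPES so that the bulk recalc machinery will
visit such entries during the global type change, but the lookup-time
recalculation paths (ept_get_entry() and recalc_type()) must not report
a still-outstanding p2m_ioreq_server entry as logdirty/ram_rw before it
has actually been recalculated, so they use p2m_check_changeable(),
which excludes the ioreq type. The self-contained snippet below simply
restates the two predicates; the enum values are toy stand-ins for the
real p2m_type_t.

#include <stdio.h>

/* Toy stand-ins for the relevant p2m types (illustration only). */
typedef enum { p2m_ram_rw, p2m_ram_logdirty, p2m_ioreq_server } toy_type_t;

#define p2m_to_mask(t)          (1UL << (t))
#define P2M_CHANGEABLE_TYPES    (p2m_to_mask(p2m_ram_rw) | \
                                 p2m_to_mask(p2m_ram_logdirty) | \
                                 p2m_to_mask(p2m_ioreq_server))
#define P2M_IOREQ_TYPES         (p2m_to_mask(p2m_ioreq_server))

#define p2m_is_changeable(t)    (p2m_to_mask(t) & P2M_CHANGEABLE_TYPES)
#define p2m_is_ioreq(t)         (p2m_to_mask(t) & P2M_IOREQ_TYPES)
/* Lookup paths: changeable, but not a still-outstanding ioreq_server entry. */
#define p2m_check_changeable(t) (p2m_is_changeable(t) && !p2m_is_ioreq(t))

int main(void)
{
    toy_type_t t = p2m_ioreq_server;

    /* Visited by the bulk recalc, but not translated on the fly at lookup. */
    printf("is_changeable=%d check_changeable=%d\n",
           !!p2m_is_changeable(t), !!p2m_check_changeable(t));   /* prints 1 0 */
    return 0;
}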