From patchwork Mon Sep 11 04:38:02 2017
X-Patchwork-Submitter: Haozhong Zhang
X-Patchwork-Id: 9946565
From: Haozhong Zhang
To: xen-devel@lists.xen.org
Date: Mon, 11 Sep 2017 12:38:02 +0800
Message-Id: <20170911043820.14617-22-haozhong.zhang@intel.com>
In-Reply-To: <20170911043820.14617-1-haozhong.zhang@intel.com>
References: <20170911043820.14617-1-haozhong.zhang@intel.com>
Cc: Haozhong Zhang, Wei Liu, Andrew Cooper, Ian Jackson, Jan Beulich, Chao Peng, Dan Williams
Subject: [Xen-devel] [RFC XEN PATCH v3 21/39] xen/pmem: support setup PMEM region for guest data usage

Allow the command XEN_SYSCTL_nvdimm_pmem_setup of hypercall
XEN_SYSCTL_nvdimm_op to set up a PMEM region for guest data usage.
After the setup, that PMEM region can be mapped into the guest
address space.

Signed-off-by: Haozhong Zhang
---
Cc: Ian Jackson
Cc: Wei Liu
Cc: Andrew Cooper
Cc: Jan Beulich
---
 tools/libxc/include/xenctrl.h |  22 ++++++++
 tools/libxc/xc_misc.c         |  17 ++++++
 xen/common/pmem.c             | 118 +++++++++++++++++++++++++++++++++++++++++-
 xen/include/public/sysctl.h   |   3 +-
 4 files changed, 157 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 7c5707fe11..41e5e3408c 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2621,6 +2621,28 @@ int xc_nvdimm_pmem_get_regions(xc_interface *xch, uint8_t type,
 int xc_nvdimm_pmem_setup_mgmt(xc_interface *xch,
                               unsigned long smfn, unsigned long emfn);
 
+/*
+ * Set up the specified PMEM pages for guest data usage. On success,
+ * these PMEM pages can be mapped to a guest and used as the backend
+ * of vNDIMM devices.
+ *
+ * Parameters:
+ *  xch:        xc interface handle
+ *  smfn, emfn: the start and end MFN of the PMEM region
+ *  mgmt_smfn,
+ *  mgmt_emfn:  the start and end MFN of the PMEM region that is
+ *              used to manage this PMEM region. It must lie within
+ *              one of the regions added by
+ *              xc_nvdimm_pmem_setup_mgmt() calls, and must not
+ *              overlap with @smfn - @emfn.
+ *
+ * Return:
+ *  On success, return 0. Otherwise, return a non-zero error code.
+ */
+int xc_nvdimm_pmem_setup_data(xc_interface *xch,
+                              unsigned long smfn, unsigned long emfn,
+                              unsigned long mgmt_smfn, unsigned long mgmt_emfn);
+
 /* Compat shims */
 #include "xenctrl_compat.h"
 
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 3ad254f5ae..ef2e9e0656 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -1019,6 +1019,23 @@ int xc_nvdimm_pmem_setup_mgmt(xc_interface *xch,
     return rc;
 }
 
+int xc_nvdimm_pmem_setup_data(xc_interface *xch,
+                              unsigned long smfn, unsigned long emfn,
+                              unsigned long mgmt_smfn, unsigned long mgmt_emfn)
+{
+    DECLARE_SYSCTL;
+    int rc;
+
+    xc_nvdimm_pmem_setup_common(&sysctl, smfn, emfn, mgmt_smfn, mgmt_emfn);
+    sysctl.u.nvdimm.u.pmem_setup.type = PMEM_REGION_TYPE_DATA;
+
+    rc = do_sysctl(xch, &sysctl);
+    if ( rc && sysctl.u.nvdimm.err )
+        rc = -sysctl.u.nvdimm.err;
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/common/pmem.c b/xen/common/pmem.c
index dcd8160407..6891ed7a47 100644
--- a/xen/common/pmem.c
+++ b/xen/common/pmem.c
@@ -34,16 +34,26 @@ static unsigned int nr_raw_regions;
 /*
  * All PMEM regions reserved for management purpose are linked to this
  * list. All of them must be covered by one or multiple PMEM regions
- * in list pmem_raw_regions.
+ * in list pmem_raw_regions, and must not appear in list pmem_data_regions.
  */
 static LIST_HEAD(pmem_mgmt_regions);
 static DEFINE_SPINLOCK(pmem_mgmt_lock);
 static unsigned int nr_mgmt_regions;
 
+/*
+ * All PMEM regions that can be mapped to guests are linked to this
+ * list. All of them must be covered by one or multiple PMEM regions
+ * in list pmem_raw_regions, and must not appear in list pmem_mgmt_regions.
+ */
+static LIST_HEAD(pmem_data_regions);
+static DEFINE_SPINLOCK(pmem_data_lock);
+static unsigned int nr_data_regions;
+
 struct pmem {
     struct list_head link; /* link to one of PMEM region list */
     unsigned long smfn;    /* start MFN of the PMEM region */
     unsigned long emfn;    /* end MFN of the PMEM region */
+    spinlock_t lock;
 
     union {
         struct {
@@ -53,6 +63,11 @@ struct pmem {
         struct {
             unsigned long used; /* # of used pages in MGMT PMEM region */
         } mgmt;
+
+        struct {
+            unsigned long mgmt_smfn; /* start MFN of management region */
+            unsigned long mgmt_emfn; /* end MFN of management region */
+        } data;
     } u;
 };
 
@@ -111,6 +126,7 @@ static int pmem_list_add(struct list_head *list,
     }
     new_pmem->smfn = smfn;
     new_pmem->emfn = emfn;
+    spin_lock_init(&new_pmem->lock);
     list_add(&new_pmem->link, cur);
 
  out:
@@ -261,9 +277,16 @@ static int pmem_get_regions(xen_sysctl_nvdimm_pmem_regions_t *regions)
 
 static bool check_mgmt_size(unsigned long mgmt_mfns, unsigned long total_mfns)
 {
-    return mgmt_mfns >=
+    unsigned long required =
         ((sizeof(struct page_info) * total_mfns) >> PAGE_SHIFT) +
         ((sizeof(*machine_to_phys_mapping) * total_mfns) >> PAGE_SHIFT);
+
+    if ( required > mgmt_mfns )
+        printk(XENLOG_DEBUG "PMEM: insufficient management pages, "
+               "0x%lx pages required, 0x%lx pages available\n",
+               required, mgmt_mfns);
+
+    return mgmt_mfns >= required;
 }
 
 static bool check_address_and_pxm(unsigned long smfn, unsigned long emfn,
@@ -341,6 +364,93 @@ static int pmem_setup_mgmt(unsigned long smfn, unsigned long emfn)
     return rc;
 }
 
+static struct pmem *find_mgmt_region(unsigned long smfn, unsigned long emfn)
+{
+    struct list_head *cur;
+
+    ASSERT(spin_is_locked(&pmem_mgmt_lock));
+
+    list_for_each(cur, &pmem_mgmt_regions)
+    {
+        struct pmem *mgmt = list_entry(cur, struct pmem, link);
+
+        if ( smfn >= mgmt->smfn && emfn <= mgmt->emfn )
+            return mgmt;
+    }
+
+    return NULL;
+}
+
+static int pmem_setup_data(unsigned long smfn, unsigned long emfn,
+                           unsigned long mgmt_smfn, unsigned long mgmt_emfn)
+{
+    struct pmem *data, *mgmt = NULL;
+    unsigned long used_mgmt_mfns;
+    unsigned int pxm;
+    int rc;
+
+    if ( smfn == mfn_x(INVALID_MFN) || emfn == mfn_x(INVALID_MFN) ||
+         smfn >= emfn )
+        return -EINVAL;
+
+    /*
+     * Require the PMEM region to lie in a single proximity domain, to
+     * avoid error recovery across multiple pmem_arch_setup() calls,
+     * which cannot be reverted.
+     */
+    if ( !check_address_and_pxm(smfn, emfn, &pxm) )
+        return -EINVAL;
+
+    if ( mgmt_smfn == mfn_x(INVALID_MFN) || mgmt_emfn == mfn_x(INVALID_MFN) ||
+         mgmt_smfn >= mgmt_emfn )
+        return -EINVAL;
+
+    spin_lock(&pmem_mgmt_lock);
+    mgmt = find_mgmt_region(mgmt_smfn, mgmt_emfn);
+    if ( !mgmt )
+    {
+        spin_unlock(&pmem_mgmt_lock);
+        return -ENXIO;
+    }
+    spin_unlock(&pmem_mgmt_lock);
+
+    spin_lock(&mgmt->lock);
+
+    if ( mgmt_smfn < mgmt->smfn + mgmt->u.mgmt.used ||
+         !check_mgmt_size(mgmt_emfn - mgmt_smfn, emfn - smfn) )
+    {
+        spin_unlock(&mgmt->lock);
+        return -ENOSPC;
+    }
+
+    spin_lock(&pmem_data_lock);
+
+    rc = pmem_list_add(&pmem_data_regions, smfn, emfn, &data);
+    if ( rc )
+        goto out;
+    data->u.data.mgmt_smfn = data->u.data.mgmt_emfn = mfn_x(INVALID_MFN);
+
+    rc = pmem_arch_setup(smfn, emfn, pxm,
+                         mgmt_smfn, mgmt_emfn, &used_mgmt_mfns);
+    if ( rc )
+    {
+        pmem_list_del(data);
+        goto out;
+    }
+
+    mgmt->u.mgmt.used = mgmt_smfn - mgmt->smfn + used_mgmt_mfns;
+    data->u.data.mgmt_smfn = mgmt_smfn;
+    data->u.data.mgmt_emfn = mgmt->smfn + mgmt->u.mgmt.used;
+
+    nr_data_regions++;
+
+ out:
+    spin_unlock(&pmem_data_lock);
+    spin_unlock(&mgmt->lock);
+
+    return rc;
+}
+
 static int pmem_setup(unsigned long smfn, unsigned long emfn,
                       unsigned long mgmt_smfn, unsigned long mgmt_emfn,
                       unsigned int type)
@@ -360,6 +470,10 @@ static int pmem_setup(unsigned long smfn, unsigned long emfn,
 
         break;
 
+    case PMEM_REGION_TYPE_DATA:
+        rc = pmem_setup_data(smfn, emfn, mgmt_smfn, mgmt_emfn);
+        break;
+
     default:
         rc = -EINVAL;
     }
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index f825716446..d7c12f23fb 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -1121,6 +1121,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_set_parameter_t);
 /* Types of PMEM regions */
 #define PMEM_REGION_TYPE_RAW        0 /* PMEM regions detected by Xen */
 #define PMEM_REGION_TYPE_MGMT       1 /* PMEM regions for management usage */
+#define PMEM_REGION_TYPE_DATA       2 /* PMEM regions for guest data */
 
 /* PMEM_REGION_TYPE_RAW */
 struct xen_sysctl_nvdimm_pmem_raw_region {
@@ -1176,7 +1177,7 @@ struct xen_sysctl_nvdimm_pmem_setup {
                         /* above PMEM region. If the above PMEM region is */
                         /* a management region, mgmt_{s,e}mfn is required */
                         /* to be identical to {s,e}mfn. */
-    uint8_t  type;      /* Only PMEM_REGION_TYPE_MGMT is supported now */
+    uint8_t  type;      /* Must be one of PMEM_REGION_TYPE_{MGMT, DATA} */
 };
 typedef struct xen_sysctl_nvdimm_pmem_setup xen_sysctl_nvdimm_pmem_setup_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_nvdimm_pmem_setup_t);
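
For reference, the management-space requirement that check_mgmt_size()
enforces can be sketched in standalone C. This is only an illustration,
not part of the patch: the sketch_* names are hypothetical, and the
sizes below (4 KiB pages, a 32-byte struct page_info, an 8-byte M2P
entry) are assumed stand-ins for the values the hypervisor build would
supply. Like the patch, both terms round down when shifted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the patch. */
#define SKETCH_PAGE_SHIFT      12  /* 4 KiB pages */
#define SKETCH_PAGE_INFO_SIZE  32  /* stand-in for sizeof(struct page_info) */
#define SKETCH_M2P_ENTRY_SIZE  8   /* stand-in for sizeof(*machine_to_phys_mapping) */

/*
 * Pages needed to hold the frame-table and M2P entries that describe
 * total_mfns pages of data PMEM (each term rounds down, as in the patch).
 */
static uint64_t sketch_required_mgmt_pages(uint64_t total_mfns)
{
    return ((SKETCH_PAGE_INFO_SIZE * total_mfns) >> SKETCH_PAGE_SHIFT) +
           ((SKETCH_M2P_ENTRY_SIZE * total_mfns) >> SKETCH_PAGE_SHIFT);
}

/* True if mgmt_mfns management pages suffice for total_mfns data pages. */
static bool sketch_check_mgmt_size(uint64_t mgmt_mfns, uint64_t total_mfns)
{
    return mgmt_mfns >= sketch_required_mgmt_pages(total_mfns);
}
```

Under these assumed sizes, a 1 GiB data region (0x40000 pages) needs
2048 + 512 = 2560 management pages, i.e. 10 MiB, so a caller of
xc_nvdimm_pmem_setup_data() would have to pass a mgmt_smfn - mgmt_emfn
range at least that large.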