From patchwork Wed Aug 3 17:27:50 2016
X-Patchwork-Submitter: David Vrabel
X-Patchwork-Id: 9261663
From: David Vrabel
Date: Wed, 3 Aug 2016 18:27:50 +0100
Message-ID: <1470245271-31109-2-git-send-email-david.vrabel@citrix.com>
In-Reply-To: <1470245271-31109-1-git-send-email-david.vrabel@citrix.com>
References: <1470245271-31109-1-git-send-email-david.vrabel@citrix.com>
X-Mailer: git-send-email 2.1.4
Cc: Stefano Stabellini, Ian Jackson, David Vrabel, Ian Campbell
Subject: [Xen-devel] [PATCHv1 1/2] libxencall/linux: use LOCK/UNLOCK ioctls for hypercall buffers
List-Id: Xen developer discussion

Using just mlock'd buffers for hypercalls is not sufficient as these
are still subject to compaction and page migration.  Use the new
IOCTL_PRIVCMD_HCALL_BUF_LOCK and IOCTL_PRIVCMD_HCALL_BUF_UNLOCK ioctls
provided by the privcmd driver to prevent this.

Since not all kernels support these ioctls, don't repeatedly try these
ioctls if they are unsupported.

MAP_LOCKED is still used as this places the pages on the unevictable
list, avoiding the need for the VM subsystem to scan them.

madvise(.., MADV_DONTFORK) is still required since we still need to
prevent children getting CoW mappings of the hypercall buffers.

Signed-off-by: David Vrabel
---
 tools/include/xen-sys/Linux/privcmd.h | 37 +++++++++++++++++++++++++++
 tools/libs/call/linux.c               | 47 ++++++++++++++++++++++++++++++++---
 2 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/tools/include/xen-sys/Linux/privcmd.h b/tools/include/xen-sys/Linux/privcmd.h
index e4e666a..4afb399 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -75,6 +75,11 @@ typedef struct privcmd_mmapbatch_v2 {
     int __user *err;  /* array of error codes */
 } privcmd_mmapbatch_v2_t;
 
+struct privcmd_hcall_buf {
+    void *start;
+    size_t len;
+};
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -89,4 +94,36 @@ typedef struct privcmd_mmapbatch_v2 {
 #define IOCTL_PRIVCMD_MMAPBATCH_V2 \
     _IOC(_IOC_NONE, 'P', 4, sizeof(privcmd_mmapbatch_v2_t))
 
+/*
+ * @cmd: IOCTL_PRIVCMD_HCALL_BUF_LOCK
+ * @arg: struct privcmd_hcall_buf *
+ * Return: 0 on success. On an error, -1 is returned and errno is set
+ *         to EINVAL, ENOMEM, or EFAULT.
+ *
+ * Locks a memory buffer so it may be used in a hypercall.  This is
+ * similar to mlock(2) but also prevents compaction/page migration.
+ *
+ * The buffers may have any alignment and size and may overlap other
+ * buffers.
+ *
+ * Locked buffers are unlocked with IOCTL_PRIVCMD_HCALL_BUF_UNLOCK or
+ * by closing the file handle.
+ */
+#define IOCTL_PRIVCMD_HCALL_BUF_LOCK \
+    _IOC(_IOC_NONE, 'P', 5, sizeof(struct privcmd_hcall_buf))
+
+/*
+ * @cmd: IOCTL_PRIVCMD_HCALL_BUF_UNLOCK
+ * @arg: struct privcmd_hcall_buf *
+ * Return: Always 0.
+ *
+ * Unlocks a memory buffer previously locked with
+ * IOCTL_PRIVCMD_HCALL_BUF_LOCK.
+ *
+ * It is not possible to partially unlock a buffer.  i.e., the
+ * LOCK/UNLOCK must be exactly paired.
+ */
+#define IOCTL_PRIVCMD_HCALL_BUF_UNLOCK \
+    _IOC(_IOC_NONE, 'P', 6, sizeof(struct privcmd_hcall_buf))
+
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/tools/libs/call/linux.c b/tools/libs/call/linux.c
index e8e0311..54ddd23 100644
--- a/tools/libs/call/linux.c
+++ b/tools/libs/call/linux.c
@@ -68,6 +68,8 @@ int osdep_hypercall(xencall_handle *xcall, privcmd_hypercall_t *hypercall)
     return ioctl(xcall->fd, IOCTL_PRIVCMD_HYPERCALL, hypercall);
 }
 
+static int have_hbuf_lock = 1;
+
 void *osdep_alloc_pages(xencall_handle *xcall, size_t npages)
 {
     size_t size = npages * PAGE_SIZE;
@@ -84,7 +86,7 @@ void *osdep_alloc_pages(xencall_handle *xcall, size_t npages)
 
     /* Do not copy the VMA to child process on fork. Avoid the page being COW
         on hypercall. */
-    rc = madvise(p, npages * PAGE_SIZE, MADV_DONTFORK);
+    rc = madvise(p, size, MADV_DONTFORK);
     if ( rc < 0 )
     {
         PERROR("alloc_pages: madvise failed");
@@ -103,6 +105,33 @@ void *osdep_alloc_pages(xencall_handle *xcall, size_t npages)
         *c = 0;
     }
 
+    if ( have_hbuf_lock )
+    {
+        struct privcmd_hcall_buf hbuf;
+
+        hbuf.start = p;
+        hbuf.len = size;
+
+        rc = ioctl(xcall->fd, IOCTL_PRIVCMD_HCALL_BUF_LOCK, &hbuf);
+        if ( rc < 0 )
+        {
+            /*
+             * Older drivers return EINVAL if the ioctl was not
+             * supported.
+             */
+            if ( errno == ENOTTY || errno == EINVAL )
+            {
+                have_hbuf_lock = 0;
+                errno = 0;
+            }
+            else
+            {
+                PERROR("alloc_pages: lock failed");
+                goto out;
+            }
+        }
+    }
+
     return p;
 
 out:
@@ -114,11 +143,23 @@ out:
 
 void osdep_free_pages(xencall_handle *xcall, void *ptr, size_t npages)
 {
+    size_t size = npages * PAGE_SIZE;
     int saved_errno = errno;
+
+    if ( have_hbuf_lock )
+    {
+        struct privcmd_hcall_buf hbuf;
+
+        hbuf.start = ptr;
+        hbuf.len = size;
+
+        ioctl(xcall->fd, IOCTL_PRIVCMD_HCALL_BUF_UNLOCK, &hbuf);
+    }
+
     /* Recover the VMA flags. Maybe it's not necessary */
-    madvise(ptr, npages * PAGE_SIZE, MADV_DOFORK);
+    madvise(ptr, size, MADV_DOFORK);
 
-    munmap(ptr, npages * PAGE_SIZE);
+    munmap(ptr, size);
     /* We MUST propagate the hypercall errno, not unmap call's. */
     errno = saved_errno;
 }
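
For anyone following the fallback logic in osdep_alloc_pages(), here is a
minimal standalone sketch of the probe-once pattern. It is not part of the
patch. The ioctl is replaced by a hypothetical function pointer (lock_fn) so
the sketch runs without a privcmd device, and the names hcall_buf, try_lock,
old_kernel_lock and nomem_lock are invented for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same layout as the struct privcmd_hcall_buf added by the patch. */
struct hcall_buf {
    void *start;
    size_t len;
};

/* Hypothetical stand-in for ioctl(fd, IOCTL_PRIVCMD_HCALL_BUF_LOCK, &buf). */
typedef int (*lock_fn)(const struct hcall_buf *buf);

int have_lock = 1;   /* assume support until the driver says otherwise */
int lock_calls;      /* how many times a stand-in was actually invoked */

/* Simulates a kernel whose privcmd driver does not know the ioctl. */
int old_kernel_lock(const struct hcall_buf *buf)
{
    (void)buf;
    lock_calls++;
    errno = ENOTTY;
    return -1;
}

/* Simulates a supported ioctl that fails for a genuine reason. */
int nomem_lock(const struct hcall_buf *buf)
{
    (void)buf;
    lock_calls++;
    errno = ENOMEM;
    return -1;
}

/*
 * Returns 0 if the buffer may be used (locked, or locking unsupported),
 * -1 with errno set on a genuine failure.
 */
int try_lock(lock_fn do_lock, void *p, size_t len)
{
    struct hcall_buf buf = { p, len };

    assert(do_lock != NULL);

    if ( !have_lock )
        return 0;               /* already known to be unsupported */

    if ( do_lock(&buf) == 0 )
        return 0;

    if ( errno == ENOTTY || errno == EINVAL )
    {
        have_lock = 0;          /* remember, so we never retry */
        errno = 0;
        return 0;
    }

    return -1;                  /* real error: caller must clean up */
}
```

With the old-kernel stand-in, the first call clears have_lock and later calls
skip the lock attempt entirely. An ENOMEM failure is still reported to the
caller, which matches the "goto out" path in the patch.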