From patchwork Sat Jul 17 02:59:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Collingbourne
X-Patchwork-Id: 12383157
Date: Fri, 16 Jul 2021 19:59:51 -0700
Message-Id: <20210717025951.3946505-1-pcc@google.com>
X-Mailer: git-send-email 2.32.0.402.g57bb445576-goog
Subject: [PATCH] refpage_create.2: Document refpage_create(2)
From: Peter Collingbourne
To: John Hubbard, Matthew Wilcox, "Kirill A . Shutemov", Andrew Morton, Catalin Marinas, Evgenii Stepanov, Michael Kerrisk, Alejandro Colomar
Cc: Peter Collingbourne, Jann Horn, Linux ARM, linux-mm@kvack.org, kernel test robot, Linux API, linux-doc@vger.kernel.org, linux-man@vger.kernel.org

---
The syscall has not landed in the kernel yet. Therefore, as usual, the patch should not be taken yet and I've used 5.x as the introducing kernel version for now.

 man2/refpage_create.2 | 167 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 167 insertions(+)
 create mode 100644 man2/refpage_create.2

diff --git a/man2/refpage_create.2 b/man2/refpage_create.2
new file mode 100644
index 000000000..c0b928b92
--- /dev/null
+++ b/man2/refpage_create.2
@@ -0,0 +1,167 @@
+.\" Copyright (C) 2021 Google LLC
+.\" Author: Peter Collingbourne
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH REFPAGE_CREATE 2 2021-07-16 "Linux" "Linux Programmer's Manual"
+.SH NAME
+refpage_create \- create a reference page file descriptor
+.SH SYNOPSIS
+.nf
+.BR "#include "
+.PP
+.BI "int syscall(SYS_refpage_create, void *" content ", unsigned int " size ,
+.BI "            unsigned long " flags ");"
+.fi
+.PP
+.IR Note :
+glibc provides no wrapper for
+.BR refpage_create (),
+necessitating the use of
+.BR syscall (2).
+.SH DESCRIPTION
+The
+.BR refpage_create ()
+system call is used to create a file descriptor
+that conceptually refers to a read-only file
+whose contents are an infinite repetition of
+.I size
+bytes of data read from the
+.I content
+argument to the system call,
+and which may be mapped into memory with
+.BR mmap (2).
+The file descriptor is created as if by passing
+.BR O_RDONLY | O_CLOEXEC
+to
+.BR open (2).
+.PP
+In reality, any read-only pages in the mapping are backed
+by a so-called reference page,
+whose contents are specified using the arguments to
+.BR refpage_create ().
+.PP
+The reference page will consist of repetitions of
+.I size
+bytes read
+from
+.IR content ,
+as many as are required to fill the page. The
+.I size
+argument must be a power of two less than or equal to the page size, and the
+.I content
+argument must have at least
+.I size
+alignment. The behavior is as if a copy of this data
+is made while servicing the system call;
+any updates to the data after the system call has returned
+will not be reflected in the reference page.
+.PP
+If the architecture specifies that metadata may be associated
+with memory addresses, that metadata, if present, is copied
+into the reference page along with the data itself,
+but only if the size argument is at least as large
+as the granularity of the metadata.
+For example, with the ARMv8.5 Memory Tagging Extension,
+the memory tags are copied, but only if the size is greater than
+or equal to the architecturally specified tag granule size of 16 bytes.
+.PP
+Writable private mappings trigger specific copy-on-write behavior
+when a page in the mapping is written to.
+The behavior is as if the reference page is copied,
+but the kernel may use a more efficient technique such as
+.BR memset (3)
+to produce the copy if the
+.I size
+argument originally used to create the reference page file descriptor
+is sufficiently small.
+For this reason it is recommended to specify as small a
+.I size
+argument as possible
+in order to activate any such optimizations implemented in the kernel.
+.PP
+The advantage of using this system call
+over creating normal anonymous mappings
+and manually initializing the pages from userspace
+is that it is more efficient.
+If it is not known that all of the pages in the mapping
+will be faulted (for example, if the system call is used
+by a general-purpose memory allocator
+where the behavior of the client program is unknown),
+letting the pages be prepared on fault only if needed
+is more efficient from both a performance
+and memory consumption perspective.
+Even if all of the pages would end up being faulted,
+it would still be more efficient
+to have the kernel initialize the pages with the required contents once
+than to have the kernel zero-initialize them on fault
+and then have userspace initialize them again with different contents.
+.SH EXAMPLES
+The following program creates a 128 KiB memory mapping
+preinitialized with the pattern byte 0xAA
+and verifies that the contents of the mapping are correct.
+.PP
+.EX
+#include <stdio.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+int main(void) {
+    unsigned char pattern = 0xaa;
+    unsigned long mmap_size = 131072;
+
+    int fd = syscall(SYS_refpage_create, &pattern, 1, 0);
+    if (fd < 0) {
+        perror("refpage_create");
+        return 1;
+    }
+    unsigned char *p = mmap(0, mmap_size, PROT_READ | PROT_WRITE,
+                            MAP_PRIVATE, fd, 0);
+    if (p == MAP_FAILED) {
+        perror("mmap");
+        return 1;
+    }
+    for (unsigned i = 0; i != mmap_size; ++i) {
+        if (p[i] != pattern) {
+            fprintf(stderr, "refpage failed contents check @ %u: "
+                            "0x%x != 0x%x\en",
+                    i, p[i], pattern);
+            return 1;
+        }
+    }
+}
+.EE
+.SH NOTES
+Reading from a reference page file descriptor, e.g. with
+.BR read (2),
+is not supported, nor would this be particularly useful.
+.SH VERSIONS
+This system call first appeared in Linux 5.x.
+.SH CONFORMING TO
+The
+.BR refpage_create ()
+system call is Linux-specific.
+.SH SEE ALSO
+.BR mmap (2),
+.BR open (2).