Message ID: 20200814213310.42170-1-pcc@google.com (mailing list archive)
State: New, archived
Series: [v3] mm: introduce reference pages
On 8/14/20 2:33 PM, Peter Collingbourne wrote:
> Introduce a new syscall, refpage_create, which returns a file
> descriptor which may be mapped using mmap. Such a mapping is similar

Hi,

For new syscalls, I think we need to put linux-api on CC, at the very least. Adding them now. This would likely need man page support as well. I'll put linux-doc on Cc, too.

> to an anonymous mapping, but instead of clean pages being backed by the zero page, they are instead backed by a so-called reference page, whose contents are specified using an argument to refpage_create. Loads from the mapping will load directly from the reference page, and initial stores to the mapping will copy-on-write from the reference page.
>
> Reference pages are useful in circumstances where anonymous mappings combined with manual stores to memory would impose undesirable costs, either in terms of performance or RSS. Use cases are focused on heap allocators and include:
>
> - Pattern initialization for the heap. This is where malloc(3) gives you memory whose contents are filled with a non-zero pattern byte, in order to help detect and mitigate bugs involving use of uninitialized memory. Typically this is implemented by having the allocator memset the allocation with the pattern byte before returning it to the user, but for large allocations this can result in a significant increase in RSS, especially for allocations that are used sparsely. Even for dense allocations there is a needless impact to startup performance when it may be better to amortize it throughout the program. By creating allocations using a reference page filled with the pattern byte, we can avoid these costs.
>
> - Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5 feature which allows for memory to be tagged in order to detect certain kinds of memory errors with low overhead. In order to set up an allocation to allow memory errors to be detected, the entire allocation needs to have the same tag. The issue here is similar to pattern initialization in the sense that large tagged allocations will be expensive if the tagging is done up front. The idea is that the allocator would create reference pages with each of the possible memory tags, and use those reference pages for the large allocations.

That is good information, and it belongs in a man page, and/or Documentation/.

>
> In order to measure the performance and RSS impact of reference pages, a version of this patch backported to kernel version 4.14 was tested on a Pixel 4 together with a modified [2] version of the Scudo allocator that uses reference pages to implement pattern initialization. A PDFium test program was used to collect the measurements like so:
>
> $ wget https://static.docs.arm.com/ddi0487/fb/DDI0487F_b_armv8_arm.pdf
> $ /system/bin/time -v ./pdfium_test --pages=1-100 DDI0487F_b_armv8_arm.pdf
>
> and the median of 100 runs was taken with three variants of the allocator:
>
> - "anon" is the baseline (no pattern init)
> - "memset" is with pattern init of allocator pages implemented by initializing anonymous pages with memset
> - "refpage" is with pattern init of allocator pages implemented by creating reference pages
>
> All three variants are measured using the patch that I linked. "anon" is without the patch, "refpage" is with the patch and "memset" is with a previous version of the patch [3] with "#if 0" in place of "#if 1" in linux.cpp. The measurements are as follows:
>
>           Real time (s)  Max RSS (KiB)
> anon      2.237081       107088
> memset    2.252241       112180
> refpage   2.243786       107128
>
> We can see that RSS for refpage is almost the same as anon, and real time overhead is 44% that of memset.
>

Are some of the numbers stale, maybe? Try as I might, I cannot combine anything above to come up with 44%. :)

> As an alternative to introducing this syscall, I considered using userfaultfd to implement reference pages. However, after having taken a detailed look at the interface, it does not seem suitable to be used in the context of a general purpose allocator. For example, UFFD_FEATURE_FORK support would be required in order to correctly support fork(2) in a process that uses the allocator (although POSIX does not guarantee support for allocating after fork, many allocators including Scudo support it, and nothing stops the forked process from page faulting pre-existing allocations after forking anyway), but UFFD_FEATURE_FORK has been restricted to root by commit 3c1c24d91ffd ("userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK"), making it unsuitable for use in an allocator. Furthermore, even if the interface issues are resolved, I suspect (but have not measured) that the cost of the multiple context switches between kernel and userspace would be too high to be used in an allocator anyway.

That whole blurb is good for a cover letter, and perhaps an "alternatives considered" section in Documentation/. However, it should be omitted from the patch commit description, IMHO.

...

> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 467302056e17..a1dc07ff914a 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -175,6 +175,13 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
>
>  	if (haddr < vma->vm_start || haddr + HPAGE_PMD_SIZE > vma->vm_end)
>  		return false;
> +
> +	/*
> +	 * Transparent hugepages not currently supported for anonymous VMAs with
> +	 * reference pages
> +	 */
> +	if (unlikely(vma->vm_private_data))

This should use a helper function, such as is_reference_page_vma(). Because the assumption that "vma->vm_private_data means a reference page vma" is much too fragile. More below.

> +		return false;
>  	return true;
>  }
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index e7602a3bcef1..ac375e398690 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3122,5 +3122,15 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
>
>  extern int sysctl_nr_trim_pages;
>
> +static inline int is_zero_or_refpage_pfn(struct vm_area_struct *vma,
> +					 unsigned long pfn)
> +{
> +	if (is_zero_pfn(pfn))
> +		return true;
> +	if (unlikely(!vma->vm_ops && vma->vm_private_data))
> +		return pfn == page_to_pfn((struct page *)vma->vm_private_data);

As foreshadowed above, this needs a helper function. And the criteria for deciding that it's a reference page needs to be more robust than just "no vm_ops, vm_private_data is set, and it matches my page". Needs some more decisive information.

Maybe setting vm_ops to some new "refpage" ops would be the way to go, for that.

...

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 5053439be6ab..6e9246d09e95 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2841,8 +2841,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
>  	pmd_t *pmdp;
>  	pte_t *ptep;
>
> -	/* Only allow populating anonymous memory */
> -	if (!vma_is_anonymous(vma))
> +	/* Only allow populating anonymous memory without a reference page */
> +	if (!vma_is_anonymous(vma) || vma->vm_private_data)

Same thing here: helper function, instead of open-coding the assumption about what makes a refpage vma.

...

> +SYSCALL_DEFINE2(refpage_create, const void *__user, content, unsigned long,
> +		flags)
> +{
> +	unsigned long content_addr = (unsigned long)content;
> +	struct page *userpage, *refpage;
> +	int fd;
> +
> +	if (flags != 0)
> +		return -EINVAL;
> +
> +	refpage = alloc_page(GFP_KERNEL);
> +	if (!refpage)
> +		return -ENOMEM;
> +
> +	if ((content_addr & (PAGE_SIZE - 1)) != 0 ||
> +	    get_user_pages(content_addr, 1, 0, &userpage, 0) != 1) {
> +		put_page(refpage);
> +		return -EFAULT;
> +	}
> +
> +	copy_highpage(refpage, userpage);
> +	put_page(userpage);
> +
> +	fd = anon_inode_getfd("[refpage]", &refpage_file_operations, refpage,
> +			      O_RDONLY | O_CLOEXEC);

Seems like the flags argument should have an influence on these flags, rather than hard-coding O_CLOEXEC, right?

thanks,
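[For illustration only, a sketch of the kind of helper being requested here; the helper itself is not in this patch, and the sketch assumes refpage_file_operations is made visible outside mm/refpage.c, which the patch does not do:]

/*
 * Hypothetical sketch: one decisive test, in one place, instead of
 * open-coding "no vm_ops and vm_private_data is set" at each call site.
 */
static inline bool is_reference_page_vma(struct vm_area_struct *vma)
{
	return vma->vm_file &&
	       vma->vm_file->f_op == &refpage_file_operations;
}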
On Mon, Aug 17, 2020 at 07:31:39PM -0700, John Hubbard wrote:
> >           Real time (s)  Max RSS (KiB)
> > anon      2.237081       107088
> > memset    2.252241       112180
> > refpage   2.243786       107128
> >
> > We can see that RSS for refpage is almost the same as anon, and real time overhead is 44% that of memset.
> >
>
> Are some of the numbers stale, maybe? Try as I might, I cannot combine anything above to come up with 44%. :)

You're not trying hard enough ;-)

(2.252241 - 2.237081) / 2.237081 = .00677668801442594166
(2.243786 - 2.237081) / 2.237081 = .00299720930981041812
.00299720930981041812 / .00677668801442594166 = .44228232189973614648

tadaa!

As I said last time this was posted, I'm just not excited by this. We go from having a 0.68% time overhead down to an 0.30% overhead, which just doesn't move the needle for me. Maybe there's a better benchmark than this to show benefits from this patchset.
[I started writing a reply before I saw what Matthew said, so I decided to finish going through this patch...]

On Fri, Aug 14, 2020 at 11:33 PM Peter Collingbourne <pcc@google.com> wrote:
> Introduce a new syscall, refpage_create, which returns a file descriptor which may be mapped using mmap. Such a mapping is similar to an anonymous mapping, but instead of clean pages being backed by the zero page, they are instead backed by a so-called reference page, whose contents are specified using an argument to refpage_create. Loads from the mapping will load directly from the reference page, and initial stores to the mapping will copy-on-write from the reference page.
[...]
> - Pattern initialization for the heap. This is where malloc(3) gives you memory whose contents are filled with a non-zero pattern byte, in order to help detect and mitigate bugs involving use of uninitialized memory. Typically this is implemented by having the allocator memset the allocation with the pattern byte before returning it to the user, but for large allocations this can result in a significant increase in RSS, especially for allocations that are used sparsely. Even for dense allocations there is a needless impact to startup performance when it may be better to amortize it throughout the program. By creating allocations using a reference page filled with the pattern byte, we can avoid these costs.
>
> - Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5 feature which allows for memory to be tagged in order to detect certain kinds of memory errors with low overhead. In order to set up an allocation to allow memory errors to be detected, the entire allocation needs to have the same tag. The issue here is similar to pattern initialization in the sense that large tagged allocations will be expensive if the tagging is done up front. The idea is that the allocator would create reference pages with each of the possible memory tags, and use those reference pages for the large allocations.

This means that you'll end up with one VMA per large heap object, instead of being able to put them all into one big VMA, right?

> In order to measure the performance and RSS impact of reference pages, a version of this patch backported to kernel version 4.14 was tested on a Pixel 4 together with a modified [2] version of the Scudo allocator that uses reference pages to implement pattern initialization. A PDFium test program was used to collect the measurements like so:
>
> $ wget https://static.docs.arm.com/ddi0487/fb/DDI0487F_b_armv8_arm.pdf
> $ /system/bin/time -v ./pdfium_test --pages=1-100 DDI0487F_b_armv8_arm.pdf
>
> and the median of 100 runs was taken with three variants of the allocator:
>
> - "anon" is the baseline (no pattern init)
> - "memset" is with pattern init of allocator pages implemented by initializing anonymous pages with memset

For the memory tagging usecase, this would use something like the STZ2G instruction, which is specialized for zeroing and re-tagging memory at high speed, right? Would STZ2G be expected to be faster than a current memset() implementation? I don't know much about how the hardware for this stuff works, but I'm guessing that STZ2G _miiiiight_ be optimized to reduce the amount of data transmitted over the memory bus, or something like that?

Also, for that memset() test, did you do that on a fresh VMA (meaning the memset() will constantly take page faults AFAIK), or did you do it on memory that had been written to before (which should AFAIK be a bit faster)?

> - "refpage" is with pattern init of allocator pages implemented by creating reference pages
[...]
> As an alternative to introducing this syscall, I considered using userfaultfd to implement reference pages. However, after having taken a detailed look at the interface, it does not seem suitable to be used in the context of a general purpose allocator. For example, UFFD_FEATURE_FORK support would be required in order to correctly support fork(2) in a process that uses the allocator (although POSIX does not guarantee support for allocating after fork, many allocators including Scudo support it, and nothing stops the forked process from page faulting pre-existing allocations after forking anyway), but UFFD_FEATURE_FORK has been restricted to root by commit 3c1c24d91ffd ("userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK"), making it unsuitable for use in an allocator.

That part should be fairly easy to fix by hooking an ioctl command up to the ->read handler.

[...]
> @@ -3347,11 +3348,16 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	if (unlikely(pmd_trans_unstable(vmf->pmd)))
>  		return 0;
>
> -	/* Use the zero-page for reads */
> +	/* Use the zero-page, or reference page if set, for reads */
>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
>  			!mm_forbids_zeropage(vma->vm_mm)) {
> -		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
> -						vma->vm_page_prot));
> +		unsigned long pfn;
> +
> +		if (unlikely(refpage))
> +			pfn = page_to_pfn(refpage);
> +		else
> +			pfn = my_zero_pfn(vmf->address);
> +		entry = pte_mkspecial(pfn_pte(pfn, vma->vm_page_prot));

If someone maps this thing with MAP_SHARED and PROT_READ|PROT_WRITE, will this create a writable special PTE, or am I missing something?

[...]
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 5053439be6ab..6e9246d09e95 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2841,8 +2841,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
>  	pmd_t *pmdp;
>  	pte_t *ptep;
>
> -	/* Only allow populating anonymous memory */
> -	if (!vma_is_anonymous(vma))
> +	/* Only allow populating anonymous memory without a reference page */
> +	if (!vma_is_anonymous(vma) || vma->vm_private_data)
>  		goto abort;
>
>  	pgdp = pgd_offset(mm, addr);
> diff --git a/mm/refpage.c b/mm/refpage.c
[...]
> +static int refpage_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +	vma_set_anonymous(vma);

I wonder whether it would make more sense to have your own vm_operations_struct and handle faults through its hooks instead of messing around in the generic code for this.

> +	vma->vm_private_data = vma->vm_file->private_data;
> +	return 0;
> +}
[...]
> +SYSCALL_DEFINE2(refpage_create, const void *__user, content, unsigned long,
> +		flags)
> +{
> +	unsigned long content_addr = (unsigned long)content;
> +	struct page *userpage, *refpage;
> +	int fd;
> +
> +	if (flags != 0)
> +		return -EINVAL;
> +
> +	refpage = alloc_page(GFP_KERNEL);

GFP_USER, maybe?

> +	if (!refpage)
> +		return -ENOMEM;
>
> +	if ((content_addr & (PAGE_SIZE - 1)) != 0 ||
> +	    get_user_pages(content_addr, 1, 0, &userpage, 0) != 1) {
> +		put_page(refpage);
> +		return -EFAULT;
> +	}
> +
> +	copy_highpage(refpage, userpage);
> +	put_page(userpage);

Why not this instead?

	if (copy_from_user(page_address(refpage), content, PAGE_SIZE))
		goto out_put_page;

If that is because copy_highpage() is going to include some magic memory-tag-copying thing or so, this needs a comment.

> +	fd = anon_inode_getfd("[refpage]", &refpage_file_operations, refpage,
> +			      O_RDONLY | O_CLOEXEC);
> +	if (fd < 0)
> +		put_page(refpage);
> +
> +	return fd;
> +}
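[For concreteness, Jann's copy_from_user() suggestion slotted into the syscall body would look roughly like the sketch below; the merged error path is illustrative, not from the patch:]

	/*
	 * Reject a misaligned pointer and copy the contents directly,
	 * instead of pinning the user page with get_user_pages() and
	 * then calling copy_highpage(). page_address() is valid here
	 * because the page was allocated with GFP_KERNEL.
	 */
	if ((content_addr & (PAGE_SIZE - 1)) != 0 ||
	    copy_from_user(page_address(refpage), content, PAGE_SIZE)) {
		put_page(refpage);
		return -EFAULT;
	}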
On 8/17/20 8:00 PM, Matthew Wilcox wrote:
> On Mon, Aug 17, 2020 at 07:31:39PM -0700, John Hubbard wrote:
>>>           Real time (s)  Max RSS (KiB)
>>> anon      2.237081       107088
>>> memset    2.252241       112180
>>> refpage   2.243786       107128
>>>
>>> We can see that RSS for refpage is almost the same as anon, and real time overhead is 44% that of memset.
>>>
>>
>> Are some of the numbers stale, maybe? Try as I might, I cannot combine anything above to come up with 44%. :)
>
> You're not trying hard enough ;-)
>
> (2.252241 - 2.237081) / 2.237081 = .00677668801442594166
> (2.243786 - 2.237081) / 2.237081 = .00299720930981041812
> .00299720930981041812 / .00677668801442594166 = .44228232189973614648
>
> tadaa!

haha, OK then! :) Next time I may try harder, but on the other hand my interpretation of the results is still "this is a small effect", even if there is a way to make it sound large by comparing the 3rd significant digits of the results...

>
> As I said last time this was posted, I'm just not excited by this. We go from having a 0.68% time overhead down to an 0.30% overhead, which just doesn't move the needle for me. Maybe there's a better benchmark than this to show benefits from this patchset.
>

Yes, I wonder if there is an artificial workload that just uses refpages really extensively, maybe we can get some good solid improvements shown with that? Otherwise, it seems like we've just learned that memset is actually pretty good in this case. :)

thanks,
[Apologies for the delay in getting back to you; other work ended up taking priority and now I'm back to looking at this.]

On Tue, Aug 18, 2020 at 11:25 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 8/17/20 8:00 PM, Matthew Wilcox wrote:
> > On Mon, Aug 17, 2020 at 07:31:39PM -0700, John Hubbard wrote:
> >>>           Real time (s)  Max RSS (KiB)
> >>> anon      2.237081       107088
> >>> memset    2.252241       112180
> >>> refpage   2.243786       107128
> >>>
> >>> We can see that RSS for refpage is almost the same as anon, and real time overhead is 44% that of memset.
> >>>
> >>
> >> Are some of the numbers stale, maybe? Try as I might, I cannot combine anything above to come up with 44%. :)
> >
> > You're not trying hard enough ;-)
> >
> > (2.252241 - 2.237081) / 2.237081 = .00677668801442594166
> > (2.243786 - 2.237081) / 2.237081 = .00299720930981041812
> > .00299720930981041812 / .00677668801442594166 = .44228232189973614648
> >
> > tadaa!
>
> haha, OK then! :) Next time I may try harder, but on the other hand my interpretation of the results is still "this is a small effect", even if there is a way to make it sound large by comparing the 3rd significant digits of the results...
>
> >
> > As I said last time this was posted, I'm just not excited by this. We go from having a 0.68% time overhead down to an 0.30% overhead, which just doesn't move the needle for me. Maybe there's a better benchmark than this to show benefits from this patchset.
> >

Remember that this is a "realistic" benchmark, so it's doing plenty of other work besides faulting pages. So I don't think we should expect to see a massive improvement here.

I ran the pdfium benchmark again but I couldn't see the same improvements that I got last time. This seems to be because pdfium has since switched to its own allocator, bypassing the system allocator. I think the gains should be larger with the memset optimization that I've implemented, but I'm still in the process of finding a suitable realistic benchmark that uses the system allocator.

But I would find a 0.4% perf improvement convincing enough, personally, given that the workload is realistic. Consider a certain large company which spends $billions annually on data centers. In that environment a 0.4% performance improvement on realistic workloads can translate to $millions of savings. And that's not taking into account the memory savings which are important both in mobile environments and in data centers.

> Yes, I wonder if there is an artificial workload that just uses refpages really extensively, maybe we can get some good solid improvements shown with that? Otherwise, it seems like we've just learned that memset is actually pretty good in this case. :)

Yes, it's possible to see the performance improvement here more clearly with a microbenchmark. I've updated the commit message in v4 to include a microbenchmark program and some performance numbers from it.

Peter
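[A rough sketch of the shape such a microbenchmark might take, for readers following along; the actual program is in the v4 commit message, and the syscall number below is the one this patch assigns on most architectures:]

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#ifndef __NR_refpage_create
#define __NR_refpage_create 440	/* number assigned by this patch */
#endif

int main(void)
{
	size_t page = sysconf(_SC_PAGESIZE);
	size_t len = (size_t)1 << 30;	/* 1 GiB mapping, touched sparsely */
	unsigned char *pattern = aligned_alloc(page, page);
	struct timespec t0, t1;
	unsigned char *p;
	size_t i;
	int fd;

	memset(pattern, 0xaa, page);	/* content must be page-aligned */
	fd = syscall(__NR_refpage_create, pattern, 0UL);
	if (fd < 0)
		return 1;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < len; i += page)	/* read faults only: map the refpage */
		if (p[i] != 0xaa)
			return 1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.6f s\n", (t1.tv_sec - t0.tv_sec) +
			   (t1.tv_nsec - t0.tv_nsec) / 1e9);
	return 0;
}

[Touching one byte per page keeps the run dominated by fault handling, which is exactly where reference pages differ from memset-based pattern init.]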
On Mon, Aug 17, 2020 at 8:43 PM Jann Horn <jannh@google.com> wrote:
>
> [I started writing a reply before I saw what Matthew said, so I decided to finish going through this patch...]
>
> On Fri, Aug 14, 2020 at 11:33 PM Peter Collingbourne <pcc@google.com> wrote:
> > Introduce a new syscall, refpage_create, which returns a file descriptor which may be mapped using mmap. Such a mapping is similar to an anonymous mapping, but instead of clean pages being backed by the zero page, they are instead backed by a so-called reference page, whose contents are specified using an argument to refpage_create. Loads from the mapping will load directly from the reference page, and initial stores to the mapping will copy-on-write from the reference page.
> [...]
> > - Pattern initialization for the heap. This is where malloc(3) gives you memory whose contents are filled with a non-zero pattern byte, in order to help detect and mitigate bugs involving use of uninitialized memory. Typically this is implemented by having the allocator memset the allocation with the pattern byte before returning it to the user, but for large allocations this can result in a significant increase in RSS, especially for allocations that are used sparsely. Even for dense allocations there is a needless impact to startup performance when it may be better to amortize it throughout the program. By creating allocations using a reference page filled with the pattern byte, we can avoid these costs.
> >
> > - Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5 feature which allows for memory to be tagged in order to detect certain kinds of memory errors with low overhead. In order to set up an allocation to allow memory errors to be detected, the entire allocation needs to have the same tag. The issue here is similar to pattern initialization in the sense that large tagged allocations will be expensive if the tagging is done up front. The idea is that the allocator would create reference pages with each of the possible memory tags, and use those reference pages for the large allocations.
>
> This means that you'll end up with one VMA per large heap object, instead of being able to put them all into one big VMA, right?

Yes, although in Scudo we create guard pages around each large allocation in order to catch OOB accesses and these correspond to their own VMAs, so we already unavoidably have 2-3 VMAs per allocation and a switch to reference pages wouldn't change anything.

> > In order to measure the performance and RSS impact of reference pages, a version of this patch backported to kernel version 4.14 was tested on a Pixel 4 together with a modified [2] version of the Scudo allocator that uses reference pages to implement pattern initialization. A PDFium test program was used to collect the measurements like so:
> >
> > $ wget https://static.docs.arm.com/ddi0487/fb/DDI0487F_b_armv8_arm.pdf
> > $ /system/bin/time -v ./pdfium_test --pages=1-100 DDI0487F_b_armv8_arm.pdf
> >
> > and the median of 100 runs was taken with three variants of the allocator:
> >
> > - "anon" is the baseline (no pattern init)
> > - "memset" is with pattern init of allocator pages implemented by initializing anonymous pages with memset
>
> For the memory tagging usecase, this would use something like the STZ2G instruction, which is specialized for zeroing and re-tagging memory at high speed, right? Would STZ2G be expected to be faster than a current memset() implementation? I don't know much about how the hardware for this stuff works, but I'm guessing that STZ2G _miiiiight_ be optimized to reduce the amount of data transmitted over the memory bus, or something like that?

It's actually the DC GZVA instruction that is expected to be fastest for this use case, since it operates on a cache line at a time. We've switched to using that to clear PROT_MTE pages in the kernel. In v4 of this patch, I have developed optimizations that check whether the reference page is a uniformly tagged zero page and if that is the case, DC GZVA is used to reset it. The same technique is also used for pattern initialization (i.e. memset to a pattern byte if the page is uniform) which also helps with performance.

> Also, for that memset() test, did you do that on a fresh VMA (meaning the memset() will constantly take page faults AFAIK), or did you do it on memory that had been written to before (which should AFAIK be a bit faster)?

I believe that it was using the default allocator settings, which for Scudo implies aggressive use of munmap/MADV_DONTNEED and using a fresh VMA for any new allocation. This is the best tradeoff for memory usage but not necessarily for performance, however it doesn't necessarily mean that we don't care at all about performance in this case.

> > - "refpage" is with pattern init of allocator pages implemented by creating reference pages
> [...]
> > As an alternative to introducing this syscall, I considered using userfaultfd to implement reference pages. However, after having taken a detailed look at the interface, it does not seem suitable to be used in the context of a general purpose allocator. For example, UFFD_FEATURE_FORK support would be required in order to correctly support fork(2) in a process that uses the allocator (although POSIX does not guarantee support for allocating after fork, many allocators including Scudo support it, and nothing stops the forked process from page faulting pre-existing allocations after forking anyway), but UFFD_FEATURE_FORK has been restricted to root by commit 3c1c24d91ffd ("userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK"), making it unsuitable for use in an allocator.
>
> That part should be fairly easy to fix by hooking an ioctl command up to the ->read handler.
>
> [...]
> > @@ -3347,11 +3348,16 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >  	if (unlikely(pmd_trans_unstable(vmf->pmd)))
> >  		return 0;
> >
> > -	/* Use the zero-page for reads */
> > +	/* Use the zero-page, or reference page if set, for reads */
> >  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
> >  			!mm_forbids_zeropage(vma->vm_mm)) {
> > -		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
> > -						vma->vm_page_prot));
> > +		unsigned long pfn;
> > +
> > +		if (unlikely(refpage))
> > +			pfn = page_to_pfn(refpage);
> > +		else
> > +			pfn = my_zero_pfn(vmf->address);
> > +		entry = pte_mkspecial(pfn_pte(pfn, vma->vm_page_prot));
>
> If someone maps this thing with MAP_SHARED and PROT_READ|PROT_WRITE, will this create a writable special PTE, or am I missing something?

It looks like we will return early here:

	/* File mapping without ->vm_ops ? */
	if (vma->vm_flags & VM_SHARED)
		return VM_FAULT_SIGBUS;

This also seems like it would be necessary to avoid letting the zero page be mapped read-write.

> [...]
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 5053439be6ab..6e9246d09e95 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -2841,8 +2841,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
> >  	pmd_t *pmdp;
> >  	pte_t *ptep;
> >
> > -	/* Only allow populating anonymous memory */
> > -	if (!vma_is_anonymous(vma))
> > +	/* Only allow populating anonymous memory without a reference page */
> > +	if (!vma_is_anonymous(vma) || vma->vm_private_data)
> >  		goto abort;
> >
> >  	pgdp = pgd_offset(mm, addr);
> > diff --git a/mm/refpage.c b/mm/refpage.c
> [...]
> > +static int refpage_mmap(struct file *file, struct vm_area_struct *vma)
> > +{
> > +	vma_set_anonymous(vma);
>
> I wonder whether it would make more sense to have your own vm_operations_struct and handle faults through its hooks instead of messing around in the generic code for this.

I considered it, but this wouldn't be compatible with the vm_ops interface as currently exposed. As I mentioned in an earlier email:

> I considered having reference page mappings continue to provide a custom vm_ops, but this would require changes to the interface to preserve the specialness of the reference page. For example, vm_normal_page() would need to know to return null for the reference page in order to prevent _mapcount from overflowing, which could probably be done by adding a new interface to vm_ops, but that seemed more complicated than changing the anonymous page code path.

Of course, if we were to refactor anonymous mappings in the future so that they are implemented via vm_ops, it seems like it would be more appropriate to have this be implemented in the same way.

> > +	vma->vm_private_data = vma->vm_file->private_data;
> > +	return 0;
> > +}
> [...]
> > +SYSCALL_DEFINE2(refpage_create, const void *__user, content, unsigned long,
> > +		flags)
> > +{
> > +	unsigned long content_addr = (unsigned long)content;
> > +	struct page *userpage, *refpage;
> > +	int fd;
> > +
> > +	if (flags != 0)
> > +		return -EINVAL;
> > +
> > +	refpage = alloc_page(GFP_KERNEL);
>
> GFP_USER, maybe?

I would say that the page we're allocating here is owned by the kernel, even though it's directly accessible (read-only) to userspace. In this regard, it's maybe similar to the zero page.

> > +	if (!refpage)
> > +		return -ENOMEM;
>
> > +	if ((content_addr & (PAGE_SIZE - 1)) != 0 ||
> > +	    get_user_pages(content_addr, 1, 0, &userpage, 0) != 1) {
> > +		put_page(refpage);
> > +		return -EFAULT;
> > +	}
> > +
> > +	copy_highpage(refpage, userpage);
> > +	put_page(userpage);
>
> Why not this instead?
>
>	if (copy_from_user(page_address(refpage), content, PAGE_SIZE))
>		goto out_put_page;
>
> If that is because copy_highpage() is going to include some magic memory-tag-copying thing or so, this needs a comment.

Yes, with MTE we will need to zero the tags on the page so that it can be used in PROT_MTE mappings even if the original page was not PROT_MTE. In v4, I ended up making this an arch hook so that all of the arch-specific stuff can go there.

Peter
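[A sketch of the uniformity check Peter describes for v4; the function name is hypothetical and the real v4 implementation may differ:]

/*
 * Hypothetical sketch: if every byte of the reference page is identical,
 * a fast clearing path can be substituted for the copy, e.g. memset()
 * with the pattern byte, or DC GZVA on arm64 when the page is a
 * zero-tagged zero page.
 */
static bool refpage_is_uniform(struct page *refpage, u8 *pattern)
{
	const u8 *kaddr = page_address(refpage);

	*pattern = kaddr[0];
	return !memchr_inv(kaddr, *pattern, PAGE_SIZE);
}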
[Apologies for the delay in getting back to you; other work ended up taking priority and now I'm back to looking at this.]

On Mon, Aug 17, 2020 at 7:31 PM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 8/14/20 2:33 PM, Peter Collingbourne wrote:
> > Introduce a new syscall, refpage_create, which returns a file descriptor which may be mapped using mmap. Such a mapping is similar
>
> Hi,
>
> For new syscalls, I think we need to put linux-api on CC, at the very least. Adding them now. This would likely need man page support as well. I'll put linux-doc on Cc, too.

Thanks.

> > to an anonymous mapping, but instead of clean pages being backed by the zero page, they are instead backed by a so-called reference page, whose contents are specified using an argument to refpage_create. Loads from the mapping will load directly from the reference page, and initial stores to the mapping will copy-on-write from the reference page.
> >
> > Reference pages are useful in circumstances where anonymous mappings combined with manual stores to memory would impose undesirable costs, either in terms of performance or RSS. Use cases are focused on heap allocators and include:
> >
> > - Pattern initialization for the heap. This is where malloc(3) gives you memory whose contents are filled with a non-zero pattern byte, in order to help detect and mitigate bugs involving use of uninitialized memory. Typically this is implemented by having the allocator memset the allocation with the pattern byte before returning it to the user, but for large allocations this can result in a significant increase in RSS, especially for allocations that are used sparsely. Even for dense allocations there is a needless impact to startup performance when it may be better to amortize it throughout the program. By creating allocations using a reference page filled with the pattern byte, we can avoid these costs.
> >
> > - Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5 feature which allows for memory to be tagged in order to detect certain kinds of memory errors with low overhead. In order to set up an allocation to allow memory errors to be detected, the entire allocation needs to have the same tag. The issue here is similar to pattern initialization in the sense that large tagged allocations will be expensive if the tagging is done up front. The idea is that the allocator would create reference pages with each of the possible memory tags, and use those reference pages for the large allocations.
>
> That is good information, and it belongs in a man page, and/or Documentation/.

I plan to write a man page for refpage_create(2) once this is closer to landing.

> >
> > In order to measure the performance and RSS impact of reference pages, a version of this patch backported to kernel version 4.14 was tested on a Pixel 4 together with a modified [2] version of the Scudo allocator that uses reference pages to implement pattern initialization. A PDFium test program was used to collect the measurements like so:
> >
> > $ wget https://static.docs.arm.com/ddi0487/fb/DDI0487F_b_armv8_arm.pdf
> > $ /system/bin/time -v ./pdfium_test --pages=1-100 DDI0487F_b_armv8_arm.pdf
> >
> > and the median of 100 runs was taken with three variants of the allocator:
> >
> > - "anon" is the baseline (no pattern init)
> > - "memset" is with pattern init of allocator pages implemented by initializing anonymous pages with memset
> > - "refpage" is with pattern init of allocator pages implemented by creating reference pages
> >
> > All three variants are measured using the patch that I linked. "anon" is without the patch, "refpage" is with the patch and "memset" is with a previous version of the patch [3] with "#if 0" in place of "#if 1" in linux.cpp. The measurements are as follows:
> >
> >           Real time (s)  Max RSS (KiB)
> > anon      2.237081       107088
> > memset    2.252241       112180
> > refpage   2.243786       107128
> >
> > We can see that RSS for refpage is almost the same as anon, and real time overhead is 44% that of memset.
> >
>
> Are some of the numbers stale, maybe? Try as I might, I cannot combine anything above to come up with 44%. :)
>
> > As an alternative to introducing this syscall, I considered using userfaultfd to implement reference pages. However, after having taken a detailed look at the interface, it does not seem suitable to be used in the context of a general purpose allocator. For example, UFFD_FEATURE_FORK support would be required in order to correctly support fork(2) in a process that uses the allocator (although POSIX does not guarantee support for allocating after fork, many allocators including Scudo support it, and nothing stops the forked process from page faulting pre-existing allocations after forking anyway), but UFFD_FEATURE_FORK has been restricted to root by commit 3c1c24d91ffd ("userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK"), making it unsuitable for use in an allocator. Furthermore, even if the interface issues are resolved, I suspect (but have not measured) that the cost of the multiple context switches between kernel and userspace would be too high to be used in an allocator anyway.
>
> That whole blurb is good for a cover letter, and perhaps an "alternatives considered" section in Documentation/. However, it should be omitted from the patch commit description, IMHO.

Okay, I moved it to the notes section of the commit message.

> ...
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > index 467302056e17..a1dc07ff914a 100644
> > --- a/include/linux/huge_mm.h
> > +++ b/include/linux/huge_mm.h
> > @@ -175,6 +175,13 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
> >
> >  	if (haddr < vma->vm_start || haddr + HPAGE_PMD_SIZE > vma->vm_end)
> >  		return false;
> > +
> > +	/*
> > +	 * Transparent hugepages not currently supported for anonymous VMAs with
> > +	 * reference pages
> > +	 */
> > +	if (unlikely(vma->vm_private_data))
>
> This should use a helper function, such as is_reference_page_vma(). Because the assumption that "vma->vm_private_data means a reference page vma" is much too fragile. More below.

That makes sense. In v4 I've introduced a helper function.

> > +		return false;
> >  	return true;
> >  }
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index e7602a3bcef1..ac375e398690 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3122,5 +3122,15 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
> >
> >  extern int sysctl_nr_trim_pages;
> >
> > +static inline int is_zero_or_refpage_pfn(struct vm_area_struct *vma,
> > +					 unsigned long pfn)
> > +{
> > +	if (is_zero_pfn(pfn))
> > +		return true;
> > +	if (unlikely(!vma->vm_ops && vma->vm_private_data))
> > +		return pfn == page_to_pfn((struct page *)vma->vm_private_data);
>
> As foreshadowed above, this needs a helper function. And the criteria for deciding that it's a reference page needs to be more robust than just "no vm_ops, vm_private_data is set, and it matches my page". Needs some more decisive information.
>
> Maybe setting vm_ops to some new "refpage" ops would be the way to go, for that.

As I mentioned in my reply to Jann, we can't set vm_ops without introducing some unwanted behavior as a result of following the non-anonymous VMA code path. What I ended up doing instead was to check whether vm_file->f_op refers to the refpage file_operations struct.

It might be nice to introduce a VM_REFPAGE flag to make this check more efficient, but this would first require extending vm_flags to 64 bits on 32-bit platforms since we're out of bits in vm_flags. From looking around it looks like many people have attempted this over the years; it looks like the most recent attempt is from this month:

https://www.spinics.net/lists/kernel/msg3961408.html

Let's see if it actually happens this time.

> ...
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 5053439be6ab..6e9246d09e95 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -2841,8 +2841,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
> >  	pmd_t *pmdp;
> >  	pte_t *ptep;
> >
> > -	/* Only allow populating anonymous memory */
> > -	if (!vma_is_anonymous(vma))
> > +	/* Only allow populating anonymous memory without a reference page */
> > +	if (!vma_is_anonymous(vma) || vma->vm_private_data)
>
> Same thing here: helper function, instead of open-coding the assumption about what makes a refpage vma.

Done.

> ...
> > +SYSCALL_DEFINE2(refpage_create, const void *__user, content, unsigned long,
> > +		flags)
> > +{
> > +	unsigned long content_addr = (unsigned long)content;
> > +	struct page *userpage, *refpage;
> > +	int fd;
> > +
> > +	if (flags != 0)
> > +		return -EINVAL;
> > +
> > +	refpage = alloc_page(GFP_KERNEL);
> > +	if (!refpage)
> > +		return -ENOMEM;
> > +
> > +	if ((content_addr & (PAGE_SIZE - 1)) != 0 ||
> > +	    get_user_pages(content_addr, 1, 0, &userpage, 0) != 1) {
> > +		put_page(refpage);
> > +		return -EFAULT;
> > +	}
> > +
> > +	copy_highpage(refpage, userpage);
> > +	put_page(userpage);
> > +
> > +	fd = anon_inode_getfd("[refpage]", &refpage_file_operations, refpage,
> > +			      O_RDONLY | O_CLOEXEC);
>
> Seems like the flags argument should have an influence on these flags, rather than hard-coding O_CLOEXEC, right?

I couldn't see a use case for having one of these FDs without O_CLOEXEC. If someone really wants a non-CLOEXEC refpage FD, they can use fcntl to clear the CLOEXEC bit. I only added the flags argument to support future extension as described in:

https://www.kernel.org/doc/html/v5.12/process/adding-syscalls.html#designing-the-api-planning-for-extension

Peter
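[For illustration, the fcntl() dance Peter mentions, for a caller that does want a refpage fd to survive exec; the direct syscall(2) invocation stands in for a hypothetical libc wrapper:]

	/* refpage_create always returns an O_CLOEXEC fd; opt out explicitly. */
	int fd = syscall(__NR_refpage_create, pattern, 0UL);
	int fl = fcntl(fd, F_GETFD);

	if (fl != -1)
		fcntl(fd, F_SETFD, fl & ~FD_CLOEXEC);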
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index a28fb211881d..efbdbceba085 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
 547	common	openat2			sys_openat2
 548	common	pidfd_getfd		sys_pidfd_getfd
 549	common	faccessat2		sys_faccessat2
+550	common	refpage_create		sys_refpage_create
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 7e8ee4adf269..68f0a0822ed6 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..b3b2019f8d16 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls	(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		440
+#define __NR_compat_syscalls		441
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 17e81bd9a2d3..18ff5382341c 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_refpage_create 440
+__SYSCALL(__NR_refpage_create, sys_refpage_create)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index ced9c83e47c9..dd58ddc63d92 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 1a4822de7292..fe9c2ffcbf63 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index a3f4be8e7238..d8ef9318ac7f 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -445,3 +445,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 6b4ee92e3aed..8970f55475c4 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -378,3 +378,4 @@
 437	n32	openat2			sys_openat2
 438	n32	pidfd_getfd		sys_pidfd_getfd
 439	n32	faccessat2		sys_faccessat2
+440	n32	refpage_create		sys_refpage_create
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 391acbf425a0..894645fc00a2 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -354,3 +354,4 @@
 437	n64	openat2			sys_openat2
 438	n64	pidfd_getfd		sys_pidfd_getfd
 439	n64	faccessat2		sys_faccessat2
+440	n64	refpage_create		sys_refpage_create
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 5727c5187508..43957e224dbf 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -427,3 +427,4 @@
 437	o32	openat2			sys_openat2
 438	o32	pidfd_getfd		sys_pidfd_getfd
 439	o32	faccessat2		sys_faccessat2
+440	o32	refpage_create		sys_refpage_create
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 292baabefade..d6d8d7c5e60a 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -437,3 +437,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index be9f74546068..a73e79116f43 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -529,3 +529,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index f1fda4375526..5ffa2aef5781 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -442,3 +442,4 @@
 437	common	openat2		sys_openat2		sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2	sys_faccessat2		sys_faccessat2
+440	common	refpage_create	sys_refpage_create	sys_refpage_create
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 96848db9659e..5e3d7f569603 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -442,3 +442,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 46024e80ee86..8b21deb46ef5 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -485,3 +485,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index e31a75262c9c..c614da77e1a0 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -444,3 +444,4 @@
 437	i386	openat2			sys_openat2
 438	i386	pidfd_getfd		sys_pidfd_getfd
 439	i386	faccessat2		sys_faccessat2
+440	i386	refpage_create		sys_refpage_create
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 9d82078c949a..7f7ab6bab41e 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -361,6 +361,7 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index d216ccba42f7..a086512e8f06 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -410,3 +410,4 @@
 437	common	openat2			sys_openat2
 438	common	pidfd_getfd		sys_pidfd_getfd
 439	common	faccessat2		sys_faccessat2
+440	common	refpage_create		sys_refpage_create
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 467302056e17..a1dc07ff914a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -175,6 +175,13 @@ static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
 
 	if (haddr < vma->vm_start || haddr + HPAGE_PMD_SIZE > vma->vm_end)
 		return false;
+
+	/*
+	 * Transparent hugepages not currently supported for anonymous VMAs with
+	 * reference pages
+	 */
+	if (unlikely(vma->vm_private_data))
+		return false;
 	return true;
 }
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e7602a3bcef1..ac375e398690 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3122,5 +3122,15 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
 
 extern int sysctl_nr_trim_pages;
 
+static inline int is_zero_or_refpage_pfn(struct vm_area_struct *vma,
+					 unsigned long pfn)
+{
+	if (is_zero_pfn(pfn))
+		return true;
+	if (unlikely(!vma->vm_ops && vma->vm_private_data))
+		return pfn == page_to_pfn((struct page *)vma->vm_private_data);
+	return false;
+}
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index dc2b827c81e5..7ee15611729e 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -831,6 +831,9 @@ asmlinkage long sys_mremap(unsigned long addr,
 			   unsigned long old_len, unsigned long new_len,
 			   unsigned long flags, unsigned long new_addr);
 
+/* mm/refpage.c */
+asmlinkage long sys_refpage_create(const void __user *content, unsigned long flags);
+
 /* security/keys/keyctl.c */
 asmlinkage long sys_add_key(const char __user *_type,
 			    const char __user *_description,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 995b36c2ea7d..26d99bd30e1e 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -859,9 +859,11 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_refpage_create 440
+__SYSCALL(__NR_refpage_create, sys_refpage_create)
 
 #undef __NR_syscalls
-#define __NR_syscalls 440
+#define __NR_syscalls 441
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 3b69a560a7ac..01af430d31da 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -291,6 +291,7 @@ COND_SYSCALL(migrate_pages);
 COND_SYSCALL_COMPAT(migrate_pages);
 COND_SYSCALL(move_pages);
 COND_SYSCALL_COMPAT(move_pages);
+COND_SYSCALL(refpage_create);
 
 COND_SYSCALL(perf_event_open);
 COND_SYSCALL(accept4);
diff --git a/mm/Makefile b/mm/Makefile
index d5649f1c12c0..b2cc6f66d4e7 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -35,10 +35,10 @@ CFLAGS_init-mm.o += $(call cc-disable-warning, override-init)
 CFLAGS_init-mm.o += $(call cc-disable-warning, initializer-overrides)
 
 mmu-y			:= nommu.o
-mmu-$(CONFIG_MMU)	:= highmem.o memory.o mincore.o \
+mmu-$(CONFIG_MMU)	:= highmem.o ioremap.o memory.o mincore.o \
 			   mlock.o mmap.o mmu_gather.o mprotect.o mremap.o \
 			   msync.o page_vma_mapped.o pagewalk.o \
-			   pgtable-generic.o rmap.o vmalloc.o ioremap.o
+			   pgtable-generic.o refpage.o rmap.o vmalloc.o
 
 ifdef CONFIG_CROSS_MEMORY_ATTACH
diff --git a/mm/gup.c b/mm/gup.c
index 39e58df6925d..5b4c3e3c86b9 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -463,7 +463,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
 			goto out;
 		}
 
-		if (is_zero_pfn(pte_pfn(pte))) {
+		if (is_zero_or_refpage_pfn(vma, pte_pfn(pte))) {
 			page = pte_page(pte);
 		} else {
 			ret = follow_pfn_pte(vma, address, ptep, flags);
diff --git a/mm/memory.c b/mm/memory.c
index 228efaca75d3..3289fceae9ca 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -602,7 +602,7 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			return vma->vm_ops->find_special_page(vma, addr);
 		if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
 			return NULL;
-		if (is_zero_pfn(pfn))
+		if (is_zero_or_refpage_pfn(vma, pfn))
 			return NULL;
 		if (pte_devmap(pte))
 			return NULL;
@@ -628,7 +628,7 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 		}
 	}
 
-	if (is_zero_pfn(pfn))
+	if (is_zero_or_refpage_pfn(vma, pfn))
 		return NULL;
 
 check_pfn:
@@ -1880,7 +1880,7 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
 		return true;
 	if (pfn_t_special(pfn))
 		return true;
-	if (is_zero_pfn(pfn_t_to_pfn(pfn)))
+	if (is_zero_or_refpage_pfn(vma, pfn_t_to_pfn(pfn)))
 		return true;
 	return false;
 }
@@ -3322,6 +3322,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
+	struct page *refpage = vma->vm_private_data;
 	struct page *page;
 	vm_fault_t ret = 0;
 	pte_t entry;
@@ -3347,11 +3348,16 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (unlikely(pmd_trans_unstable(vmf->pmd)))
 		return 0;
 
-	/* Use the zero-page for reads */
+	/* Use the zero-page, or reference page if set, for reads */
 	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
 			!mm_forbids_zeropage(vma->vm_mm)) {
-		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
-						vma->vm_page_prot));
+		unsigned long pfn;
+
+		if (unlikely(refpage))
+			pfn = page_to_pfn(refpage);
+		else
+			pfn = my_zero_pfn(vmf->address);
+		entry = pte_mkspecial(pfn_pte(pfn, vma->vm_page_prot));
 		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 				vmf->address, &vmf->ptl);
 		if (!pte_none(*vmf->pte)) {
@@ -3372,9 +3378,17 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	/* Allocate our own private page. */
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
-	if (!page)
-		goto oom;
+
+	if (unlikely(refpage)) {
+		page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
+		if (!page)
+			goto oom;
+		copy_user_highpage(page, refpage, vmf->address, vma);
+	} else {
+		page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
+		if (!page)
+			goto oom;
+	}
 
 	if (mem_cgroup_charge(page, vma->vm_mm, GFP_KERNEL))
 		goto oom_free_page;
diff --git a/mm/migrate.c b/mm/migrate.c
index 5053439be6ab..6e9246d09e95 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2841,8 +2841,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	pmd_t *pmdp;
 	pte_t *ptep;
 
-	/* Only allow populating anonymous memory */
-	if (!vma_is_anonymous(vma))
+	/* Only allow populating anonymous memory without a reference page */
+	if (!vma_is_anonymous(vma) || vma->vm_private_data)
 		goto abort;
 
 	pgdp = pgd_offset(mm, addr);
diff --git a/mm/refpage.c b/mm/refpage.c
new file mode 100644
index 000000000000..c5fc66a38a51
--- /dev/null
+++ b/mm/refpage.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/anon_inodes.h>
+#include <linux/fs_context.h>
+#include <linux/highmem.h>
+#include <linux/mount.h>
+#include <linux/syscalls.h>
+
+static int refpage_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	vma_set_anonymous(vma);
+	vma->vm_private_data = vma->vm_file->private_data;
+	return 0;
+}
+
+static int refpage_release(struct inode *inode, struct file *file)
+{
+	put_page(file->private_data);
+	return 0;
+}
+
+static const struct file_operations refpage_file_operations = {
+	.mmap = refpage_mmap,
+	.release = refpage_release,
+};
+
+SYSCALL_DEFINE2(refpage_create, const void *__user, content, unsigned long,
+		flags)
+{
+	unsigned long content_addr = (unsigned long)content;
+	struct page *userpage, *refpage;
+	int fd;
+
+	if (flags != 0)
+		return -EINVAL;
+
+	refpage = alloc_page(GFP_KERNEL);
+	if (!refpage)
+		return -ENOMEM;
+
+	if ((content_addr & (PAGE_SIZE - 1)) != 0 ||
+	    get_user_pages(content_addr, 1, 0, &userpage, 0) != 1) {
+		put_page(refpage);
+		return -EFAULT;
+	}
+
+	copy_highpage(refpage, userpage);
+	put_page(userpage);
+
+	fd = anon_inode_getfd("[refpage]", &refpage_file_operations, refpage,
+			      O_RDONLY | O_CLOEXEC);
+	if (fd < 0)
+		put_page(refpage);
+
+	return fd;
+}
Introduce a new syscall, refpage_create, which returns a file descriptor which may be mapped using mmap. Such a mapping is similar to an anonymous mapping, but instead of clean pages being backed by the zero page, they are instead backed by a so-called reference page, whose contents are specified using an argument to refpage_create. Loads from the mapping will load directly from the reference page, and initial stores to the mapping will copy-on-write from the reference page.

Reference pages are useful in circumstances where anonymous mappings combined with manual stores to memory would impose undesirable costs, either in terms of performance or RSS. Use cases are focused on heap allocators and include:

- Pattern initialization for the heap. This is where malloc(3) gives you memory whose contents are filled with a non-zero pattern byte, in order to help detect and mitigate bugs involving use of uninitialized memory. Typically this is implemented by having the allocator memset the allocation with the pattern byte before returning it to the user, but for large allocations this can result in a significant increase in RSS, especially for allocations that are used sparsely. Even for dense allocations there is a needless impact to startup performance when it may be better to amortize it throughout the program. By creating allocations using a reference page filled with the pattern byte, we can avoid these costs.

- Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5 feature which allows for memory to be tagged in order to detect certain kinds of memory errors with low overhead. In order to set up an allocation to allow memory errors to be detected, the entire allocation needs to have the same tag. The issue here is similar to pattern initialization in the sense that large tagged allocations will be expensive if the tagging is done up front. The idea is that the allocator would create reference pages with each of the possible memory tags, and use those reference pages for the large allocations.

In order to measure the performance and RSS impact of reference pages, a version of this patch backported to kernel version 4.14 was tested on a Pixel 4 together with a modified [2] version of the Scudo allocator that uses reference pages to implement pattern initialization. A PDFium test program was used to collect the measurements like so:

$ wget https://static.docs.arm.com/ddi0487/fb/DDI0487F_b_armv8_arm.pdf
$ /system/bin/time -v ./pdfium_test --pages=1-100 DDI0487F_b_armv8_arm.pdf

and the median of 100 runs was taken with three variants of the allocator:

- "anon" is the baseline (no pattern init)
- "memset" is with pattern init of allocator pages implemented by initializing anonymous pages with memset
- "refpage" is with pattern init of allocator pages implemented by creating reference pages

All three variants are measured using the patch that I linked. "anon" is without the patch, "refpage" is with the patch and "memset" is with a previous version of the patch [3] with "#if 0" in place of "#if 1" in linux.cpp. The measurements are as follows:

          Real time (s)  Max RSS (KiB)
anon      2.237081       107088
memset    2.252241       112180
refpage   2.243786       107128

We can see that RSS for refpage is almost the same as anon, and real time overhead is 44% that of memset.

As an alternative to introducing this syscall, I considered using userfaultfd to implement reference pages. However, after having taken a detailed look at the interface, it does not seem suitable to be used in the context of a general purpose allocator. For example, UFFD_FEATURE_FORK support would be required in order to correctly support fork(2) in a process that uses the allocator (although POSIX does not guarantee support for allocating after fork, many allocators including Scudo support it, and nothing stops the forked process from page faulting pre-existing allocations after forking anyway), but UFFD_FEATURE_FORK has been restricted to root by commit 3c1c24d91ffd ("userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK"), making it unsuitable for use in an allocator. Furthermore, even if the interface issues are resolved, I suspect (but have not measured) that the cost of the multiple context switches between kernel and userspace would be too high to be used in an allocator anyway.

[1] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
[2] https://github.com/pcc/llvm-project/commit/4871b739f86a631537d1725847a27ac148a392a0
[3] https://github.com/pcc/llvm-project/commit/a05f88aaebc7daf262d6885444d9845052026f4b

Signed-off-by: Peter Collingbourne <pcc@google.com>
Reported-by: kernel test robot <lkp@intel.com>
---
v3:
- Fix build errors reported by kernel test robot

v2:
- Switch to an approach of adding a new syscall instead of modifying mmap(2)
- Move ownership of the reference page to the struct file to avoid refcount overflows

 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/arm64/include/asm/unistd.h             |  2 +-
 arch/arm64/include/asm/unistd32.h           |  2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 include/linux/huge_mm.h                     |  7 +++
 include/linux/mm.h                          | 10 ++++
 include/linux/syscalls.h                    |  3 ++
 include/uapi/asm-generic/unistd.h           |  4 +-
 kernel/sys_ni.c                             |  1 +
 mm/Makefile                                 |  4 +-
 mm/gup.c                                    |  2 +-
 mm/memory.c                                 | 32 ++++++++----
 mm/migrate.c                                |  4 +-
 mm/refpage.c                                | 56 +++++++++++++++++++++
 28 files changed, 127 insertions(+), 16 deletions(-)
 create mode 100644 mm/refpage.c
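[A minimal userspace usage sketch of the interface described above, assuming the syscall number this patch assigns; error handling abbreviated:]

#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_refpage_create
#define __NR_refpage_create 440	/* number assigned by this patch */
#endif

int main(void)
{
	size_t page = sysconf(_SC_PAGESIZE);

	/* The content buffer must be page-aligned, per the syscall. */
	unsigned char *pattern = aligned_alloc(page, page);
	memset(pattern, 0xaa, page);

	int fd = syscall(__NR_refpage_create, pattern, 0UL);
	if (fd < 0)
		return 1;

	/* Clean pages in the mapping read back 0xaa; the first write to
	 * a page copies-on-write from the reference page. */
	unsigned char *p = mmap(NULL, 16 * page, PROT_READ | PROT_WRITE,
				MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	unsigned char before = p[0];	/* 0xaa, read from the refpage */
	p[0] = 0x55;			/* COW: now backed by a private page */

	munmap(p, 16 * page);
	close(fd);
	return before == 0xaa ? 0 : 1;
}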