From patchwork Mon Jan 27 05:30:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Brian Geffon
X-Patchwork-Id: 11352105
Date: Sun, 26 Jan 2020 21:30:56 -0800
In-Reply-To: <20200123014627.71720-1-bgeffon@google.com>
Message-Id: <20200127053056.213679-1-bgeffon@google.com>
References: <20200123014627.71720-1-bgeffon@google.com>
X-Mailer: git-send-email 2.25.0.341.g760bfbb309-goog
Subject: [PATCH v3] mm: Add MREMAP_DONTUNMAP to mremap().
From: Brian Geffon
To: Andrew Morton
Cc: "Michael S . Tsirkin", Brian Geffon, Arnd Bergmann,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-api@vger.kernel.org, Andy Lutomirski, Andrea Arcangeli,
 Sonny Rao, Minchan Kim, Joel Fernandes, Yu Zhao, Jesse Barnes,
 Nathan Chancellor
Sender: owner-linux-mm@kvack.org
Precedence: bulk

When remapping an anonymous, private mapping, if MREMAP_DONTUNMAP is
set, the source mapping will not be removed. Instead it will be
cleared as if a brand new anonymous, private mapping had been created
atomically as part of the mremap() call. If a userfaultfd was watching
the source, it will continue to watch the new mapping.

For a mapping that is shared or not anonymous, MREMAP_DONTUNMAP will
cause the mremap() call to fail. MREMAP_DONTUNMAP requires that
MREMAP_FIXED is also used. The final result is two equally sized VMAs
where the destination contains the PTEs of the source.

We hope to use this in Chrome OS where, combined with userfaultfd, it
would let us write an anonymous mapping to disk without having to STOP
the process or worry about VMA permission changes.

This feature also has a use case in Android. Lokesh Gidra has said:
"As part of using userfaultfd for GC, We'll have to move the physical
pages of the java heap to a separate location. For this purpose mremap
will be used. Without the MREMAP_DONTUNMAP flag, when I mremap the java
heap, its virtual mapping will be removed as well. Therefore, we'll
require performing mmap immediately after. This is not only time
consuming but also opens a time window where a native thread may call
mmap and reserve the java heap's address range for its own usage. This
flag solves the problem."

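To illustrate the semantics described above, here is a minimal userspace
sketch; it is not part of the patch. The helper name remap_keep_source()
is hypothetical, and the fallback #define assumes the value 4 that this
patch assigns to MREMAP_DONTUNMAP. The destination is reserved with a
PROT_NONE mapping first so that MREMAP_FIXED has a target, and old_len
equals new_len so both VMAs end up the same size.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: value taken from this patch; older libc headers lack it. */
#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4
#endif

/*
 * Hypothetical helper: move the pages backing [src, src + len) to a
 * freshly reserved address while leaving src mapped. Returns the new
 * address, or NULL on failure.
 */
void *remap_keep_source(void *src, size_t len)
{
	/* Reserve a destination range for MREMAP_FIXED to replace. */
	void *dst = mmap(NULL, len, PROT_NONE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dst == MAP_FAILED)
		return NULL;

	/* Same old and new length: the source is kept, not resized. */
	void *ret = mremap(src, len, len,
			   MREMAP_MAYMOVE | MREMAP_FIXED | MREMAP_DONTUNMAP,
			   dst);
	if (ret == MAP_FAILED) {
		/* For example EINVAL on a kernel without the flag. */
		munmap(dst, len);
		return NULL;
	}
	return ret;
}
```

On a kernel that supports the flag, the returned address should hold the
old contents, while a later read of src should fault in zero-filled
pages. On other kernels the helper returns NULL and src is untouched.
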
Signed-off-by: Brian Geffon
---
 include/uapi/linux/mman.h |  5 +++--
 mm/mremap.c               | 38 +++++++++++++++++++++++++++++++-------
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
index fc1a64c3447b..923cc162609c 100644
--- a/include/uapi/linux/mman.h
+++ b/include/uapi/linux/mman.h
@@ -5,8 +5,9 @@
 #include
 #include
 
-#define MREMAP_MAYMOVE	1
-#define MREMAP_FIXED	2
+#define MREMAP_MAYMOVE		1
+#define MREMAP_FIXED		2
+#define MREMAP_DONTUNMAP	4
 
 #define OVERCOMMIT_GUESS		0
 #define OVERCOMMIT_ALWAYS		1
diff --git a/mm/mremap.c b/mm/mremap.c
index 122938dcec15..1d164e5fdff0 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -318,8 +318,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 static unsigned long move_vma(struct vm_area_struct *vma,
 		unsigned long old_addr, unsigned long old_len,
 		unsigned long new_len, unsigned long new_addr,
-		bool *locked, struct vm_userfaultfd_ctx *uf,
-		struct list_head *uf_unmap)
+		bool *locked, unsigned long flags,
+		struct vm_userfaultfd_ctx *uf, struct list_head *uf_unmap)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *new_vma;
@@ -408,6 +408,13 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	if (unlikely(vma->vm_flags & VM_PFNMAP))
 		untrack_pfn_moved(vma);
 
+	if (unlikely(!err && (flags & MREMAP_DONTUNMAP))) {
+		if (vm_flags & VM_ACCOUNT)
+			vma->vm_flags |= VM_ACCOUNT;
+
+		goto out;
+	}
+
 	if (do_munmap(mm, old_addr, old_len, uf_unmap) < 0) {
 		/* OOM: unable to split vma, just get accounts right */
 		vm_unacct_memory(excess >> PAGE_SHIFT);
@@ -422,6 +429,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 			vma->vm_next->vm_flags |= VM_ACCOUNT;
 	}
 
+out:
 	if (vm_flags & VM_LOCKED) {
 		mm->locked_vm += new_len >> PAGE_SHIFT;
 		*locked = true;
@@ -497,7 +505,7 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr,
 
 static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 		unsigned long new_addr, unsigned long new_len, bool *locked,
-		struct vm_userfaultfd_ctx *uf,
+		unsigned long flags, struct vm_userfaultfd_ctx *uf,
 		struct list_head *uf_unmap_early,
 		struct list_head *uf_unmap)
 {
@@ -551,6 +559,17 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 		goto out;
 	}
 
+	/*
+	 * MREMAP_DONTUNMAP expands by old_len + (new_len - old_len), we will
+	 * check that we can expand by old_len and vma_to_resize will handle
+	 * the vma growing.
+	 */
+	if (unlikely(flags & MREMAP_DONTUNMAP && !may_expand_vm(mm,
+				vma->vm_flags, old_len >> PAGE_SHIFT))) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
 	map_flags = MAP_FIXED;
 	if (vma->vm_flags & VM_MAYSHARE)
 		map_flags |= MAP_SHARED;
@@ -561,7 +580,7 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 	if (IS_ERR_VALUE(ret))
 		goto out1;
 
-	ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, uf,
+	ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, flags, uf,
 		       uf_unmap);
 	if (!(offset_in_page(ret)))
 		goto out;
@@ -609,12 +628,16 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	addr = untagged_addr(addr);
 	new_addr = untagged_addr(new_addr);
 
-	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
+	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP)) {
 		return ret;
+	}
 
 	if (flags & MREMAP_FIXED && !(flags & MREMAP_MAYMOVE))
 		return ret;
 
+	if (flags & MREMAP_DONTUNMAP && !(flags & MREMAP_FIXED))
+		return ret;
+
 	if (offset_in_page(addr))
 		return ret;
 
@@ -634,7 +657,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 
 	if (flags & MREMAP_FIXED) {
 		ret = mremap_to(addr, old_len, new_addr, new_len,
-				&locked, &uf, &uf_unmap_early, &uf_unmap);
+				&locked, flags, &uf, &uf_unmap_early,
+				&uf_unmap);
 		goto out;
 	}
 
@@ -712,7 +736,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 		}
 
 		ret = move_vma(vma, addr, old_len, new_len, new_addr,
-			       &locked, &uf, &uf_unmap);
+			       &locked, flags, &uf, &uf_unmap);
 	}
 out:
 	if (offset_in_page(ret)) {
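
The flag checks that the SYSCALL_DEFINE5(mremap, ...) hunk adds can be
restated as a small standalone function, which makes the accepted
combinations easier to see. This is only a sketch: check_mremap_flags()
is a hypothetical name, not a kernel function, and it assumes that ret
holds -EINVAL at the point of those early returns, which the diff
context does not show.

```c
#include <errno.h>

/* Values as defined in include/uapi/linux/mman.h by this patch. */
#ifndef MREMAP_MAYMOVE
#define MREMAP_MAYMOVE   1
#endif
#ifndef MREMAP_FIXED
#define MREMAP_FIXED     2
#endif
#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4
#endif

/*
 * Hypothetical helper mirroring the three flag checks in the patched
 * syscall entry. Returns 0 if the combination passes those checks,
 * -EINVAL (assumed value of ret) otherwise.
 */
int check_mremap_flags(unsigned long flags)
{
	const unsigned long known =
		MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP;

	/* Reject any bit outside the three known flags. */
	if (flags & ~known)
		return -EINVAL;

	/* MREMAP_FIXED is only valid together with MREMAP_MAYMOVE. */
	if ((flags & MREMAP_FIXED) && !(flags & MREMAP_MAYMOVE))
		return -EINVAL;

	/* New in this patch: MREMAP_DONTUNMAP requires MREMAP_FIXED. */
	if ((flags & MREMAP_DONTUNMAP) && !(flags & MREMAP_FIXED))
		return -EINVAL;

	return 0;
}
```

Taken together, the checks mean that in this version of the patch the
only accepted combination containing MREMAP_DONTUNMAP is
MREMAP_MAYMOVE | MREMAP_FIXED | MREMAP_DONTUNMAP.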