From: David Herrmann
To: linux-kernel@vger.kernel.org
Subject: [PATCH 3/6] shm: add memfd_create() syscall
Date: Wed, 19 Mar 2014 20:06:48 +0100
Message-Id: <1395256011-2423-4-git-send-email-dh.herrmann@gmail.com>
In-Reply-To: <1395256011-2423-1-git-send-email-dh.herrmann@gmail.com>
References: <1395256011-2423-1-git-send-email-dh.herrmann@gmail.com>
Cc: Matthew Wilcox, Ryan Lortie, Hugh Dickins, Johannes Weiner,
 Kay Sievers, dri-devel@lists.freedesktop.org, Daniel Mack,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Karol Lewandowski,
 Lennart Poettering, Greg Kroah-Hartman, Tejun Heo,
 "Michael Kerrisk (man-pages)", Andrew Morton, Linus Torvalds,
 Alexander Viro
X-Patchwork-Submitter: David Herrmann
X-Patchwork-Id: 3860381

memfd_create() is similar to mmap(MAP_ANON), but returns a
file-descriptor that you can pass to mmap(). It explicitly allows
sealing and avoids any connection to user-visible mount-points. It is
therefore not subject to quotas on mounted file-systems and can be used
like malloc()'ed memory that is additionally reachable through a
file-descriptor.

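For illustration only, here is a minimal userspace sketch of the usage pattern described above: create the file, size it, map it, and read the data back through the descriptor. It assumes the two-argument memfd_create(name, flags) wrapper found in glibc 2.27 and later (Linux 3.17 and later), which is not the three-argument form proposed in this patch, so the size is set with ftruncate() instead. The helper name memfd_roundtrip() and its 256-byte limit are hypothetical and exist only for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous memory file, size it, write
 * `len` bytes through a shared mapping and read them back through the
 * descriptor. Returns 0 on success and -1 on any failure.
 *
 * Assumption: uses the two-argument glibc wrapper, not the
 * (name, size, flags) form from this patch.
 */
int memfd_roundtrip(const char *name, const char *data, size_t len)
{
	char buf[256];
	struct stat st;
	void *map;
	int fd, ret = -1;

	assert(name != NULL && data != NULL);
	if (len == 0 || len > sizeof(buf))
		return -1;

	fd = memfd_create(name, MFD_CLOEXEC);
	if (fd < 0)
		return -1;

	/* The descriptor refers to the shmem file itself, so ftruncate() works. */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	/* fstat() should report a regular file of the requested size. */
	if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode) ||
	    st.st_size != (off_t)len)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;
	memcpy(map, data, len);

	/* Data written via the mapping is visible through pread(). */
	if (pread(fd, buf, len, 0) == (ssize_t)len &&
	    memcmp(buf, data, len) == 0)
		ret = 0;

	munmap(map, len);
out_close:
	close(fd);
	return ret;
}
```

A caller would invoke memfd_roundtrip("example", "hello", 5) and check for a return value of 0. No tmpfs mount-point is involved at any step.
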
memfd_create() does not create a front-FD, but instead returns the raw
shmem file, so calls like ftruncate() can be used. Calls like fstat()
also return proper information and mark the file as a regular file.
Sealing is explicitly supported on memfds.

Compared to O_TMPFILE, it does not require a tmpfs mount-point and is
not subject to quotas and the like.

Signed-off-by: David Herrmann
---
 arch/x86/syscalls/syscall_32.tbl |  1 +
 arch/x86/syscalls/syscall_64.tbl |  1 +
 include/linux/syscalls.h         |  1 +
 include/uapi/linux/memfd.h       |  9 ++++++
 kernel/sys_ni.c                  |  1 +
 mm/shmem.c                       | 67 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 80 insertions(+)
 create mode 100644 include/uapi/linux/memfd.h

diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index 96bc506..c943b8a 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -359,3 +359,4 @@
 350	i386	finit_module		sys_finit_module
 351	i386	sched_setattr		sys_sched_setattr
 352	i386	sched_getattr		sys_sched_getattr
+353	i386	memfd_create		sys_memfd_create
diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index a12bddc..e9d56a8 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -322,6 +322,7 @@
 313	common	finit_module		sys_finit_module
 314	common	sched_setattr		sys_sched_setattr
 315	common	sched_getattr		sys_sched_getattr
+316	common	memfd_create		sys_memfd_create
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a747a77..124b838 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -791,6 +791,7 @@ asmlinkage long sys_timerfd_settime(int ufd, int flags,
 asmlinkage long sys_timerfd_gettime(int ufd, struct itimerspec __user *otmr);
 asmlinkage long sys_eventfd(unsigned int count);
 asmlinkage long sys_eventfd2(unsigned int count, int flags);
+asmlinkage long sys_memfd_create(const char *uname_ptr, u64 size, u64 flags);
 asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
 asmlinkage long sys_old_readdir(unsigned int, struct old_linux_dirent __user *, unsigned int);
 asmlinkage long sys_pselect6(int, fd_set __user *, fd_set __user *,
diff --git a/include/uapi/linux/memfd.h b/include/uapi/linux/memfd.h
new file mode 100644
index 0000000..d74cc89
--- /dev/null
+++ b/include/uapi/linux/memfd.h
@@ -0,0 +1,9 @@
+#ifndef _UAPI_LINUX_MEMFD_H
+#define _UAPI_LINUX_MEMFD_H
+
+#include
+
+/* flags for memfd_create(2) */
+#define MFD_CLOEXEC 0x0001
+
+#endif /* _UAPI_LINUX_MEMFD_H */
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 7078052..53e05af 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -193,6 +193,7 @@ cond_syscall(compat_sys_timerfd_settime);
 cond_syscall(compat_sys_timerfd_gettime);
 cond_syscall(sys_eventfd);
 cond_syscall(sys_eventfd2);
+cond_syscall(sys_memfd_create);
 
 /* performance counters: */
 cond_syscall(sys_perf_event_open);
diff --git a/mm/shmem.c b/mm/shmem.c
index 44d7f3b..48feb42 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -66,7 +66,9 @@ static struct vfsmount *shm_mnt;
 #include
 #include
 #include
+#include
 #include
+#include
 #include
 #include
@@ -3039,6 +3041,71 @@ out4:
 	return error;
 }
 
+/* maximum length of memfd names */
+#define MFD_MAX_NAMELEN 256
+
+SYSCALL_DEFINE3(memfd_create,
+		const char*, uname,
+		u64, size,
+		u64, flags)
+{
+	struct file *shm;
+	char *name;
+	int fd, r;
+	long len;
+
+	if (flags & ~(u64)MFD_CLOEXEC)
+		return -EINVAL;
+	if ((u64)(loff_t)size != size || (loff_t)size < 0)
+		return -EINVAL;
+
+	/* length includes terminating zero */
+	len = strnlen_user(uname, MFD_MAX_NAMELEN);
+	if (len <= 0)
+		return -EFAULT;
+	else if (len > MFD_MAX_NAMELEN)
+		return -EINVAL;
+
+	name = kmalloc(len + 6, GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+
+	strcpy(name, "memfd:");
+	if (copy_from_user(&name[6], uname, len)) {
+		r = -EFAULT;
+		goto err_name;
+	}
+
+	/* terminating-zero may have changed after strnlen_user() returned */
+	if (name[len + 6 - 1]) {
+		r = -EFAULT;
+		goto err_name;
+	}
+
+	fd = get_unused_fd_flags((flags & MFD_CLOEXEC) ? O_CLOEXEC : 0);
+	if (fd < 0) {
+		r = fd;
+		goto err_name;
+	}
+
+	shm = shmem_file_setup(name, size, 0);
+	if (IS_ERR(shm)) {
+		r = PTR_ERR(shm);
+		goto err_fd;
+	}
+	shm->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE;
+
+	fd_install(fd, shm);
+	kfree(name);
+	return fd;
+
+err_fd:
+	put_unused_fd(fd);
+err_name:
+	kfree(name);
+	return r;
+}
+
 #else /* !CONFIG_SHMEM */
 
 /*