From patchwork Sat Sep 29 10:34:51 2018
X-Patchwork-Submitter: Aleksa Sarai
X-Patchwork-Id: 10620713
From: Aleksa Sarai
To: Jeff Layton, "J. Bruce Fields", Al Viro, Arnd Bergmann, Shuah Khan
Cc: David Howells, Andy Lutomirski, Christian Brauner, Eric Biederman,
 Tycho Andersen, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-kselftest@vger.kernel.org, dev@opencontainers.org,
 containers@lists.linux-foundation.org, Aleksa Sarai
Subject: [PATCH 1/3] namei: implement O_BENEATH-style AT_* flags
Date: Sat, 29 Sep 2018 20:34:51 +1000
Message-Id: <20180929103453.12025-2-cyphar@cyphar.com>
In-Reply-To: <20180929103453.12025-1-cyphar@cyphar.com>
References: <20180929103453.12025-1-cyphar@cyphar.com>

Add the following flags for path resolution. The primary justification
for these flags is to allow programs to be far stricter about how path
resolution handles symlinks, mountpoint crossings, and paths that escape
the dirfd (through an absolute path or ".." shenanigans).

This is of particular concern to container runtimes that want to be very
careful about malicious root filesystems that a container's init might
have screwed around with (and there is no real way to protect against
this in userspace if you consider potential races against a malicious
container's init).

 * AT_BENEATH: Disallow ".." or absolute paths (either in the path or
   found during symlink resolution) from escaping the starting point of
   name resolution, though ".." is permitted in cases like "foo/../bar".
   Relative symlinks are still allowed (as long as they don't escape the
   starting point).

 * AT_XDEV: Disallow mount-point crossing (both *down* into one, or *up*
   from one). The primary "scoping" use is to block resolution that
   crosses a bind-mount, which has a similar property to a symlink (in
   the way that it allows for escape from the starting-point). Since it
   is not possible to differentiate bind-mounts from other mounts, all
   mount-point crossings are blocked. However, since bind-mounting
   requires privileges (in ways symlinks don't), this has been split
   from LOOKUP_BENEATH. The naming is based on "find -xdev" (though
   find(1) doesn't walk upwards, the semantics seem obvious).

 * AT_NO_PROCLINKS: Disallow ->get_link "symlink" jumping. This is a
   very specific restriction, and it exists because /proc/$pid/fd/...
   "symlinks" allow for access outside nd->root and pose a risk to
   container runtimes that don't want to be tricked into accessing a
   host path (but do want to allow no-funny-business symlink
   resolution).

 * AT_NO_SYMLINKS: Disallow symlink jumping *of any kind*. Implies
   AT_NO_PROCLINKS (obviously).

The AT_NO_*LINKS flags return -ELOOP if path resolution would violate
their requirement, while the others all return -EXDEV. Currently these
are only enabled for the stat(2) family and the openat(2) family (the
latter has its own brand of O_* flags with the same semantics). Ideally
these flags would be supported by all *at(2) syscalls, but this will
require adding flags arguments to many of them (and will be done in a
separate patchset).

Cc: Andy Lutomirski
Cc: Eric Biederman
Cc: Christian Brauner
Signed-off-by: Aleksa Sarai
---
 fs/fcntl.c                       |  2 +-
 fs/namei.c                       | 61 ++++++++++++++++++++++++++------
 fs/open.c                        |  8 +++++
 fs/stat.c                        | 13 +++++--
 include/linux/fcntl.h            |  3 +-
 include/linux/namei.h            |  7 ++++
 include/uapi/asm-generic/fcntl.h | 17 +++++++++
 include/uapi/linux/fcntl.h       |  8 +++++
 8 files changed, 104 insertions(+), 15 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 4137d96534a6..e343618736f7 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -1031,7 +1031,7 @@ static int __init fcntl_init(void)
 	 * Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY
 	 * is defined as O_NONBLOCK on some platforms and not on others.
 	 */
-	BUILD_BUG_ON(21 - 1 /* for O_RDONLY being 0 */ !=
+	BUILD_BUG_ON(25 - 1 /* for O_RDONLY being 0 */ !=
 		HWEIGHT32(
 			(VALID_OPEN_FLAGS & ~(O_NONBLOCK | O_NDELAY)) |
 			__FMODE_EXEC | __FMODE_NONOTIFY));
diff --git a/fs/namei.c b/fs/namei.c
index fb913148d4d1..757dd783771c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -859,6 +859,8 @@ static int nd_jump_root(struct nameidata *nd)
 		path_get(&nd->path);
 		nd->inode = nd->path.dentry->d_inode;
 	}
+	if (unlikely(nd->flags & LOOKUP_BENEATH))
+		return -EXDEV;
 	nd->flags |= LOOKUP_JUMPED;
 	return 0;
 }
@@ -1083,14 +1085,19 @@ const char *get_link(struct nameidata *nd)
 		} else {
 			res = get(dentry, inode, &last->done);
 		}
+		/* If we just jumped it was because of a procfs-style link. */
+		if (unlikely(nd->flags & LOOKUP_JUMPED) &&
+		    unlikely(nd->flags & LOOKUP_NO_PROCLINKS))
+			return ERR_PTR(-ELOOP);
 		if (IS_ERR_OR_NULL(res))
 			return res;
 	}
 	if (*res == '/') {
 		if (!nd->root.mnt)
 			set_root(nd);
-		if (unlikely(nd_jump_root(nd)))
-			return ERR_PTR(-ECHILD);
+		error = nd_jump_root(nd);
+		if (unlikely(error))
+			return ERR_PTR(error);
 		while (unlikely(*++res == '/'))
 			;
 	}
@@ -1271,12 +1278,16 @@ static int follow_managed(struct path *path, struct nameidata *nd)
 		break;
 	}
 
-	if (need_mntput && path->mnt == mnt)
-		mntput(path->mnt);
+	if (need_mntput) {
+		if (path->mnt == mnt)
+			mntput(path->mnt);
+		if (unlikely(nd->flags & LOOKUP_XDEV))
+			ret = -EXDEV;
+		else
+			nd->flags |= LOOKUP_JUMPED;
+	}
 	if (ret == -EISDIR || !ret)
 		ret = 1;
-	if (need_mntput)
-		nd->flags |= LOOKUP_JUMPED;
 	if (unlikely(ret < 0))
 		path_put_conditional(path, nd);
 	return ret;
@@ -1333,6 +1344,8 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path,
 		mounted = __lookup_mnt(path->mnt, path->dentry);
 		if (!mounted)
 			break;
+		if (unlikely(nd->flags & LOOKUP_XDEV))
+			return false;
 		path->mnt = &mounted->mnt;
 		path->dentry = mounted->mnt.mnt_root;
 		nd->flags |= LOOKUP_JUMPED;
@@ -1353,8 +1366,11 @@ static int follow_dotdot_rcu(struct nameidata *nd)
 	struct inode *inode = nd->inode;
 
 	while (1) {
-		if (path_equal(&nd->path, &nd->root))
+		if (path_equal(&nd->path, &nd->root)) {
+			if (unlikely(nd->flags & LOOKUP_BENEATH))
+				return -EXDEV;
 			break;
+		}
 		if (nd->path.dentry != nd->path.mnt->mnt_root) {
 			struct dentry *old = nd->path.dentry;
 			struct dentry *parent = old->d_parent;
@@ -1379,6 +1395,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
 				return -ECHILD;
 			if (&mparent->mnt == nd->path.mnt)
 				break;
+			if (unlikely(nd->flags & LOOKUP_XDEV))
+				return -EXDEV;
 			/* we know that mountpoint was pinned */
 			nd->path.dentry = mountpoint;
 			nd->path.mnt = &mparent->mnt;
@@ -1481,8 +1499,11 @@ static int path_parent_directory(struct path *path)
 static int follow_dotdot(struct nameidata *nd)
 {
 	while(1) {
-		if (path_equal(&nd->path, &nd->root))
+		if (path_equal(&nd->path, &nd->root)) {
+			if (unlikely(nd->flags & LOOKUP_BENEATH))
+				return -EXDEV;
 			break;
+		}
 		if (nd->path.dentry != nd->path.mnt->mnt_root) {
 			int ret = path_parent_directory(&nd->path);
 			if (ret)
@@ -1491,6 +1512,8 @@ static int follow_dotdot(struct nameidata *nd)
 		}
 		if (!follow_up(&nd->path))
 			break;
+		if (unlikely(nd->flags & LOOKUP_XDEV))
+			return -EXDEV;
 	}
 	follow_mount(&nd->path);
 	nd->inode = nd->path.dentry->d_inode;
@@ -1720,6 +1743,8 @@ static int pick_link(struct nameidata *nd, struct path *link,
 {
 	int error;
 	struct saved *last;
+	if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS))
+		return -ELOOP;
 	if (unlikely(nd->total_link_count++ >= MAXSYMLINKS)) {
 		path_to_nameidata(link, nd);
 		return -ELOOP;
@@ -2175,6 +2200,8 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 
 	if (!*s)
 		flags &= ~LOOKUP_RCU;
+	if (flags & LOOKUP_NO_SYMLINKS)
+		flags |= LOOKUP_NO_PROCLINKS;
 	if (flags & LOOKUP_RCU)
 		rcu_read_lock();
 
@@ -2204,10 +2231,12 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 
 	nd->m_seq = read_seqbegin(&mount_lock);
 	if (*s == '/') {
+		int error;
 		set_root(nd);
-		if (likely(!nd_jump_root(nd)))
-			return s;
-		return ERR_PTR(-ECHILD);
+		error = nd_jump_root(nd);
+		if (unlikely(error))
+			s = ERR_PTR(error);
+		return s;
 	} else if (nd->dfd == AT_FDCWD) {
 		if (flags & LOOKUP_RCU) {
 			struct fs_struct *fs = current->fs;
@@ -2223,6 +2252,11 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 			get_fs_pwd(current->fs, &nd->path);
 			nd->inode = nd->path.dentry->d_inode;
 		}
+		if (unlikely(flags & LOOKUP_BENEATH)) {
+			nd->root = nd->path;
+			if (!(flags & LOOKUP_RCU))
+				path_get(&nd->root);
+		}
 		return s;
 	} else {
 		/* Caller must check execute permissions on the starting path component */
@@ -2247,6 +2281,11 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 			path_get(&nd->path);
 			nd->inode = nd->path.dentry->d_inode;
 		}
+		if (unlikely(flags & LOOKUP_BENEATH)) {
+			nd->root = nd->path;
+			if (!(flags & LOOKUP_RCU))
+				path_get(&nd->root);
+		}
 		fdput(f);
 		return s;
 	}
diff --git a/fs/open.c b/fs/open.c
index 0285ce7dbd51..80f5f566a5ff 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -988,6 +988,14 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
 		lookup_flags |= LOOKUP_DIRECTORY;
 	if (!(flags & O_NOFOLLOW))
 		lookup_flags |= LOOKUP_FOLLOW;
+	if (flags & O_BENEATH)
+		lookup_flags |= LOOKUP_BENEATH;
+	if (flags & O_XDEV)
+		lookup_flags |= LOOKUP_XDEV;
+	if (flags & O_NOPROCLINKS)
+		lookup_flags |= LOOKUP_NO_PROCLINKS;
+	if (flags & O_NOSYMLINKS)
+		lookup_flags |= LOOKUP_NO_SYMLINKS;
 	op->lookup_flags = lookup_flags;
 	return 0;
 }
diff --git a/fs/stat.c b/fs/stat.c
index f8e6fb2c3657..791e61b916ae 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -170,8 +170,9 @@ int vfs_statx(int dfd, const char __user *filename, int flags,
 	int error = -EINVAL;
 	unsigned int lookup_flags = LOOKUP_FOLLOW | LOOKUP_AUTOMOUNT;
 
-	if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT |
-		       AT_EMPTY_PATH | KSTAT_QUERY_FLAGS)) != 0)
+	if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT | AT_EMPTY_PATH |
+		      KSTAT_QUERY_FLAGS | AT_BENEATH | AT_XDEV |
+		      AT_NO_PROCLINKS | AT_NO_SYMLINKS))
 		return -EINVAL;
 
 	if (flags & AT_SYMLINK_NOFOLLOW)
@@ -180,6 +181,14 @@ int vfs_statx(int dfd, const char __user *filename, int flags,
 		lookup_flags &= ~LOOKUP_AUTOMOUNT;
 	if (flags & AT_EMPTY_PATH)
 		lookup_flags |= LOOKUP_EMPTY;
+	if (flags & AT_BENEATH)
+		lookup_flags |= LOOKUP_BENEATH;
+	if (flags & AT_XDEV)
+		lookup_flags |= LOOKUP_XDEV;
+	if (flags & AT_NO_PROCLINKS)
+		lookup_flags |= LOOKUP_NO_PROCLINKS;
+	if (flags & AT_NO_SYMLINKS)
+		lookup_flags |= LOOKUP_NO_SYMLINKS;
 
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index 27dc7a60693e..ad5bba4b5b12 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -9,7 +9,8 @@
 	(O_RDONLY | O_WRONLY | O_RDWR | O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC | \
 	 O_APPEND | O_NDELAY | O_NONBLOCK | O_NDELAY | __O_SYNC | O_DSYNC | \
 	 FASYNC | O_DIRECT | O_LARGEFILE | O_DIRECTORY | O_NOFOLLOW | \
-	 O_NOATIME | O_CLOEXEC | O_PATH | __O_TMPFILE)
+	 O_NOATIME | O_CLOEXEC | O_PATH | __O_TMPFILE | O_BENEATH | O_XDEV | \
+	 O_NOPROCLINKS | O_NOSYMLINKS)
 
 #ifndef force_o_largefile
 #define force_o_largefile() (BITS_PER_LONG != 32)
diff --git a/include/linux/namei.h b/include/linux/namei.h
index a78606e8e3df..5ff7f3362d1b 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -47,6 +47,13 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 #define LOOKUP_EMPTY		0x4000
 #define LOOKUP_DOWN		0x8000
 
+/* Scoping flags for lookup. */
+#define LOOKUP_BENEATH		0x010000 /* No escaping from starting point. */
+#define LOOKUP_XDEV		0x020000 /* No mountpoint crossing. */
+#define LOOKUP_NO_PROCLINKS	0x040000 /* No /proc/$pid/fd/ "symlink" crossing. */
+#define LOOKUP_NO_SYMLINKS	0x080000 /* No symlink crossing *at all*.
+					    Implies LOOKUP_NO_PROCLINKS. */
+
 extern int path_pts(struct path *path);
 
 extern int user_path_at_empty(int, const char __user *, unsigned, struct path *, int *empty);
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 9dc0bf0c5a6e..c2bf5983e46a 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -97,6 +97,23 @@
 #define O_NDELAY	O_NONBLOCK
 #endif
 
+/*
+ * These are identical to their AT_* counterparts (which affect the entirety
+ * of path resolution).
+ */
+#ifndef O_BENEATH
+#define O_BENEATH	00040000000 /* *Not* the same as capsicum's O_BENEATH! */
+#endif
+#ifndef O_XDEV
+#define O_XDEV		00100000000
+#endif
+#ifndef O_NOPROCLINKS
+#define O_NOPROCLINKS	00200000000
+#endif
+#ifndef O_NOSYMLINKS
+#define O_NOSYMLINKS	01000000000
+#endif
+
 #define F_DUPFD		0	/* dup */
 #define F_GETFD		1	/* get close_on_exec */
 #define F_SETFD		2	/* set/clear close_on_exec */
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 594b85f7cb86..551a9e2166a8 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -92,5 +92,13 @@
 
 #define AT_RECURSIVE		0x8000	/* Apply to the entire subtree */
 
+/* Flags which affect path *resolution*, not just last-component handling. */
+#define AT_BENEATH		0x10000	/* No absolute paths or ".." escaping
+					   (in-path or through symlinks) */
+#define AT_XDEV			0x20000	/* No mountpoint crossing. */
+#define AT_NO_PROCLINKS		0x40000	/* No /proc/$pid/fd/... "symlinks". */
+#define AT_NO_SYMLINKS		0x80000	/* No symlinks *at all*.
+					   Implies AT_NO_PROCLINKS. */
+
 
 #endif /* _UAPI_LINUX_FCNTL_H */

From patchwork Sat Sep 29 13:15:33 2018
X-Patchwork-Submitter: Aleksa Sarai
X-Patchwork-Id: 10620811
From: Aleksa Sarai
To: Jeff Layton, "J. Bruce Fields", Al Viro, Arnd Bergmann, Shuah Khan
Cc: David Howells, Andy Lutomirski, Christian Brauner, Eric Biederman,
 Tycho Andersen, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-kselftest@vger.kernel.org, dev@opencontainers.org,
 containers@lists.linux-foundation.org, Aleksa Sarai
Subject: [PATCH 2/3] namei: implement AT_THIS_ROOT chroot-like path resolution
Date: Sat, 29 Sep 2018 23:15:33 +1000
Message-Id: <20180929131534.24472-1-cyphar@cyphar.com>
X-Mailer: git-send-email 2.19.0
In-Reply-To: <20180929103453.12025-1-cyphar@cyphar.com>
References: <20180929103453.12025-1-cyphar@cyphar.com>

The primary motivation for this flag is container runtimes, which have
to interact with malicious root filesystems in the host namespaces. One
of the first requirements for a container runtime to be secure against a
malicious rootfs is that it correctly scopes symlinks (that is, they
should be scoped as though the runtime were chroot(2)ed into the
container's rootfs) and ".."-style paths. The already-existing AT_XDEV
and AT_NO_PROCLINKS help defend against other potential attacks in a
malicious rootfs scenario.

Currently most container runtimes try to do this resolution in
userspace[1], which opens up many potential race conditions. In
addition, the "obvious" alternative (actually performing a
{ch,pivot_}root(2)) requires a fork+exec, which is *very* costly if
necessary for every filesystem operation involving a container.

The most significant change in semantics with AT_THIS_ROOT is that
*at(2) syscalls no longer ignore the dirfd completely when given an
absolute pathname (if LOOKUP_CHROOT is specified).
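
As a rough illustration of the two scoping behaviours discussed in this
series, the sketch below models them purely lexically in userspace. It
is not the kernel implementation: `scope_mode`, `SCOPE_BENEATH`,
`SCOPE_THIS_ROOT` and `scoped_walk()` are hypothetical names invented
for this example, and symlinks and mounts (which the real walk has to
handle) are deliberately ignored.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical names for this sketch; they are not part of the patch. */
enum scope_mode { SCOPE_BENEATH, SCOPE_THIS_ROOT };

/*
 * Walk "path" lexically, tracking how many components deep we are below
 * the starting directory (depth 0 is the dirfd itself). Symlinks and
 * mounts are ignored here.
 *
 * Returns the final depth (>= 0), or -EXDEV if the path would escape the
 * starting point under SCOPE_BENEATH.
 */
static int scoped_walk(const char *path, enum scope_mode mode)
{
	const char *p = path;
	int depth = 0;

	assert(path != NULL);

	/* AT_BENEATH-like: absolute paths are rejected outright. */
	if (*p == '/' && mode == SCOPE_BENEATH)
		return -EXDEV;
	/* AT_THIS_ROOT-like: a leading "/" simply means the dirfd itself. */

	while (*p) {
		size_t len;

		while (*p == '/')
			p++;
		len = strcspn(p, "/");
		if (len == 0)
			break;

		if (len == 2 && p[0] == '.' && p[1] == '.') {
			if (depth > 0)
				depth--;
			else if (mode == SCOPE_BENEATH)
				return -EXDEV;
			/* chroot-like: ".." at the root stays at the root */
		} else if (!(len == 1 && p[0] == '.')) {
			depth++;
		}
		p += len;
	}
	return depth;
}
```

Under these assumptions, `scoped_walk("foo/../bar", SCOPE_BENEATH)`
succeeds because the ".." never climbs above the starting point, while
`scoped_walk("/etc/passwd", SCOPE_THIS_ROOT)` is resolved relative to
the dirfd instead of being rejected.
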
The reasoning behind scoping absolute paths to dirfd is that
AT_THIS_ROOT necessarily has to chroot-scope symlinks with absolute
paths to dirfd, and so doing the same for the base path seems to be the
most consistent behaviour (and also avoids foot-gunning users who want
to chroot-scope paths that might be absolute).

Currently this is only enabled for the stat(2) and openat(2) families
(the latter has its own flag, O_THISROOT, with the same semantics).
Ideally this flag would be supported by all *at(2) syscalls, but this
will require adding flags arguments to many of them (and will be done in
a separate patchset).

[1]: https://github.com/cyphar/filepath-securejoin

Cc: Eric Biederman
Cc: Christian Brauner
Signed-off-by: Aleksa Sarai
Signed-off-by: Aleksa Sarai
---
 fs/fcntl.c                       |   2 +-
 fs/namei.c                       | 121 +++++++++++++++++--------------
 fs/open.c                        |   2 +
 fs/stat.c                        |   4 +-
 include/linux/fcntl.h            |   2 +-
 include/linux/namei.h            |   1 +
 include/uapi/asm-generic/fcntl.h |   3 +
 include/uapi/linux/fcntl.h       |   2 +
 8 files changed, 81 insertions(+), 56 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index e343618736f7..4c36c5b9fdb9 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -1031,7 +1031,7 @@ static int __init fcntl_init(void)
 	 * Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY
 	 * is defined as O_NONBLOCK on some platforms and not on others.
 	 */
-	BUILD_BUG_ON(25 - 1 /* for O_RDONLY being 0 */ !=
+	BUILD_BUG_ON(26 - 1 /* for O_RDONLY being 0 */ !=
 		HWEIGHT32(
 			(VALID_OPEN_FLAGS & ~(O_NONBLOCK | O_NDELAY)) |
 			__FMODE_EXEC | __FMODE_NONOTIFY));
diff --git a/fs/namei.c b/fs/namei.c
index 757dd783771c..1b984f0dbbb4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2193,9 +2193,64 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 	}
 }
 
+/*
+ * Configure nd->path based on the nd->dfd. This is only used as part of
+ * path_init().
+ */
+static inline int dirfd_path_init(struct nameidata *nd)
+{
+	if (nd->dfd == AT_FDCWD) {
+		if (nd->flags & LOOKUP_RCU) {
+			struct fs_struct *fs = current->fs;
+			unsigned seq;
+
+			do {
+				seq = read_seqcount_begin(&fs->seq);
+				nd->path = fs->pwd;
+				nd->inode = nd->path.dentry->d_inode;
+				nd->seq = __read_seqcount_begin(&nd->path.dentry->d_seq);
+			} while (read_seqcount_retry(&fs->seq, seq));
+		} else {
+			get_fs_pwd(current->fs, &nd->path);
+			nd->inode = nd->path.dentry->d_inode;
+		}
+	} else {
+		/* Caller must check execute permissions on the starting path component */
+		struct fd f = fdget_raw(nd->dfd);
+		struct dentry *dentry;
+
+		if (!f.file)
+			return -EBADF;
+
+		dentry = f.file->f_path.dentry;
+
+		if (*nd->name->name && unlikely(!d_can_lookup(dentry))) {
+			fdput(f);
+			return -ENOTDIR;
+		}
+
+		nd->path = f.file->f_path;
+		if (nd->flags & LOOKUP_RCU) {
+			nd->inode = nd->path.dentry->d_inode;
+			nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
+		} else {
+			path_get(&nd->path);
+			nd->inode = nd->path.dentry->d_inode;
+		}
+		fdput(f);
+	}
+	if (unlikely(nd->flags & (LOOKUP_CHROOT | LOOKUP_BENEATH))) {
+		nd->root = nd->path;
+		if (!(nd->flags & LOOKUP_RCU))
+			path_get(&nd->root);
+	}
+	return 0;
+}
+
 /* must be paired with terminate_walk() */
 static const char *path_init(struct nameidata *nd, unsigned flags)
 {
+	int error;
 	const char *s = nd->name->name;
 
 	if (!*s)
@@ -2230,65 +2285,25 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 	nd->path.dentry = NULL;
 
 	nd->m_seq = read_seqbegin(&mount_lock);
+	if (unlikely(flags & LOOKUP_CHROOT)) {
+		error = dirfd_path_init(nd);
+		if (unlikely(error))
+			return ERR_PTR(error);
+	}
 	if (*s == '/') {
-		int error;
-		set_root(nd);
+		if (likely(!nd->root.mnt))
+			set_root(nd);
 		error = nd_jump_root(nd);
 		if (unlikely(error))
 			s = ERR_PTR(error);
 		return s;
-	} else if (nd->dfd == AT_FDCWD) {
-		if (flags & LOOKUP_RCU) {
-			struct fs_struct *fs = current->fs;
-			unsigned seq;
-
-			do {
-				seq = read_seqcount_begin(&fs->seq);
-				nd->path = fs->pwd;
-				nd->inode = nd->path.dentry->d_inode;
-				nd->seq = __read_seqcount_begin(&nd->path.dentry->d_seq);
-			} while (read_seqcount_retry(&fs->seq, seq));
-		} else {
-			get_fs_pwd(current->fs, &nd->path);
-			nd->inode = nd->path.dentry->d_inode;
-		}
-		if (unlikely(flags & LOOKUP_BENEATH)) {
-			nd->root = nd->path;
-			if (!(flags & LOOKUP_RCU))
-				path_get(&nd->root);
-		}
-		return s;
-	} else {
-		/* Caller must check execute permissions on the starting path component */
-		struct fd f = fdget_raw(nd->dfd);
-		struct dentry *dentry;
-
-		if (!f.file)
-			return ERR_PTR(-EBADF);
-
-		dentry = f.file->f_path.dentry;
-
-		if (*s && unlikely(!d_can_lookup(dentry))) {
-			fdput(f);
-			return ERR_PTR(-ENOTDIR);
-		}
-
-		nd->path = f.file->f_path;
-		if (flags & LOOKUP_RCU) {
-			nd->inode = nd->path.dentry->d_inode;
-			nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
-		} else {
-			path_get(&nd->path);
-			nd->inode = nd->path.dentry->d_inode;
-		}
-		if (unlikely(flags & LOOKUP_BENEATH)) {
-			nd->root = nd->path;
-			if (!(flags & LOOKUP_RCU))
-				path_get(&nd->root);
-		}
-		fdput(f);
-		return s;
 	}
+	if (likely(!nd->path.mnt)) {
+		error = dirfd_path_init(nd);
+		if (unlikely(error))
+			return ERR_PTR(error);
+	}
+	return s;
 }
 
 static const char *trailing_symlink(struct nameidata *nd)
diff --git a/fs/open.c b/fs/open.c
index 80f5f566a5ff..81d148f626cd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -996,6 +996,8 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
 		lookup_flags |= LOOKUP_NO_PROCLINKS;
 	if (flags & O_NOSYMLINKS)
 		lookup_flags |= LOOKUP_NO_SYMLINKS;
+	if (flags & O_THISROOT)
+		lookup_flags |= LOOKUP_CHROOT;
 	op->lookup_flags = lookup_flags;
 	return 0;
 }
diff --git a/fs/stat.c b/fs/stat.c
index 791e61b916ae..e8366e4812c3 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -172,7 +172,7 @@ int vfs_statx(int dfd, const char __user *filename, int flags,
 
 	if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT | AT_EMPTY_PATH |
 		      KSTAT_QUERY_FLAGS | AT_BENEATH | AT_XDEV |
-		      AT_NO_PROCLINKS | AT_NO_SYMLINKS))
+		      AT_NO_PROCLINKS | AT_NO_SYMLINKS | AT_THIS_ROOT))
 		return -EINVAL;
 
 	if (flags & AT_SYMLINK_NOFOLLOW)
@@ -189,6 +189,8 @@ int vfs_statx(int dfd, const char __user *filename, int flags,
 		lookup_flags |= LOOKUP_NO_PROCLINKS;
 	if (flags & AT_NO_SYMLINKS)
 		lookup_flags |= LOOKUP_NO_SYMLINKS;
+	if (flags & AT_THIS_ROOT)
+		lookup_flags |= LOOKUP_CHROOT;
 
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index ad5bba4b5b12..95480cd4c09d 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -10,7 +10,7 @@
 	 O_APPEND | O_NDELAY | O_NONBLOCK | O_NDELAY | __O_SYNC | O_DSYNC | \
 	 FASYNC | O_DIRECT | O_LARGEFILE | O_DIRECTORY | O_NOFOLLOW | \
 	 O_NOATIME | O_CLOEXEC | O_PATH | __O_TMPFILE | O_BENEATH | O_XDEV | \
-	 O_NOPROCLINKS | O_NOSYMLINKS)
+	 O_NOPROCLINKS | O_NOSYMLINKS | O_THISROOT)
 
 #ifndef force_o_largefile
 #define force_o_largefile() (BITS_PER_LONG != 32)
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 5ff7f3362d1b..7ec9e2d84649 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -53,6 +53,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 #define LOOKUP_NO_PROCLINKS	0x040000 /* No /proc/$pid/fd/ "symlink" crossing. */
 #define LOOKUP_NO_SYMLINKS	0x080000 /* No symlink crossing *at all*.
 					    Implies LOOKUP_NO_PROCLINKS. */
+#define LOOKUP_CHROOT		0x100000 /* Treat dirfd as %current->fs->root. */
 
 extern int path_pts(struct path *path);
 
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index c2bf5983e46a..11206b0e927c 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -113,6 +113,9 @@
 #ifndef O_NOSYMLINKS
 #define O_NOSYMLINKS	01000000000
 #endif
+#ifndef O_THISROOT
+#define O_THISROOT	02000000000
+#endif
 
 #define F_DUPFD		0	/* dup */
 #define F_GETFD		1	/* get close_on_exec */
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 551a9e2166a8..ea978457b68f 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -99,6 +99,8 @@
 #define AT_NO_PROCLINKS		0x40000	/* No /proc/$pid/fd/... "symlinks". */
 #define AT_NO_SYMLINKS		0x80000	/* No symlinks *at all*.
 					   Implies AT_NO_PROCLINKS. */
+#define AT_THIS_ROOT		0x100000 /* Path resolution acts as though
+					   it is chroot-ed into dirfd. */
 
 
 #endif /* _UAPI_LINUX_FCNTL_H */

From patchwork Sat Sep 29 13:15:34 2018
X-Patchwork-Submitter: Aleksa Sarai
X-Patchwork-Id: 10620815
From: Aleksa Sarai
To: Jeff Layton, "J. Bruce Fields", Al Viro, Arnd Bergmann, Shuah Khan
Cc: David Howells, Andy Lutomirski, Christian Brauner, Eric Biederman,
 Tycho Andersen, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-kselftest@vger.kernel.org, dev@opencontainers.org,
 containers@lists.linux-foundation.org, Aleksa Sarai
Subject: [PATCH 3/3] selftests: vfs: add AT_* path resolution tests
Date: Sat, 29 Sep 2018 23:15:34 +1000
Message-Id: <20180929131534.24472-2-cyphar@cyphar.com>
X-Mailer: git-send-email 2.19.0
In-Reply-To: <20180929131534.24472-1-cyphar@cyphar.com>
References: <20180929103453.12025-1-cyphar@cyphar.com>
 <20180929131534.24472-1-cyphar@cyphar.com>

With the addition of so many new scoping flags, it's necessary to have
some sort of validation that they really work. There were no vfs
self-tests in the past, so this also includes a basic framework that
future VFS tests can use.
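
To make concrete what such validation might assert, here is a small
sketch of a lookup table pairing each flag name with the errno that the
first patch's description says a violation returns (-ELOOP for the
AT_NO_*LINKS flags, -EXDEV for the others). The `expected` struct, the
`EXPECTED` table and `expected_errno()` are hypothetical and are not
taken from vfs_helper.c, which is not shown here; only the flag names
follow at_flags.h. AT_THIS_ROOT is left out because it rescopes paths
rather than reporting a violation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table for illustration; not taken from vfs_helper.c. */
struct expected {
	const char *name;	/* flag name, as spelled in at_flags.h */
	int violation;		/* errno a test would expect on a violation */
};

static const struct expected EXPECTED[] = {
	{ "beneath",      EXDEV },
	{ "xdev",         EXDEV },
	{ "no_proclinks", ELOOP },
	{ "no_symlinks",  ELOOP },
	{ NULL, 0 },		/* terminate */
};

/* Return the errno expected for a violation of "name", or -1 if unknown. */
static int expected_errno(const char *name)
{
	const struct expected *e;

	assert(name != NULL);

	for (e = EXPECTED; e->name; e++)
		if (strcmp(e->name, name) == 0)
			return e->violation;
	return -1;
}
```

A test driver could then compare the errno from a failing statx(2) or
openat(2) call against `expected_errno("beneath")` and so on, keeping
the expectations in one place.
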
Signed-off-by: Aleksa Sarai
---
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/vfs/.gitignore        |   1 +
 tools/testing/selftests/vfs/Makefile          |  13 ++
 tools/testing/selftests/vfs/at_flags.h        |  40 +++++
 tools/testing/selftests/vfs/common.sh         |  37 +++++
 .../selftests/vfs/tests/0001_at_beneath.sh    |  72 ++++++++
 .../selftests/vfs/tests/0002_at_xdev.sh       |  54 ++++++
 .../vfs/tests/0003_at_no_proclinks.sh         |  50 ++++++
 .../vfs/tests/0004_at_no_symlinks.sh          |  49 ++++++
 .../selftests/vfs/tests/0005_at_this_root.sh  |  66 ++++++++
 tools/testing/selftests/vfs/vfs_helper.c      | 154 ++++++++++++++++++
 11 files changed, 537 insertions(+)
 create mode 100644 tools/testing/selftests/vfs/.gitignore
 create mode 100644 tools/testing/selftests/vfs/Makefile
 create mode 100644 tools/testing/selftests/vfs/at_flags.h
 create mode 100644 tools/testing/selftests/vfs/common.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0001_at_beneath.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0002_at_xdev.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0005_at_this_root.sh
 create mode 100644 tools/testing/selftests/vfs/vfs_helper.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index f1fe492c8e17..6f814e49071f 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -43,6 +43,7 @@ ifneq (1, $(quicktest))
 TARGETS += timers
 endif
 TARGETS += user
+TARGETS += vfs
 TARGETS += vm
 TARGETS += x86
 TARGETS += zram
diff --git a/tools/testing/selftests/vfs/.gitignore b/tools/testing/selftests/vfs/.gitignore
new file mode 100644
index 000000000000..c57ebcba14c0
--- /dev/null
+++ b/tools/testing/selftests/vfs/.gitignore
@@ -0,0 +1 @@
+/vfs_helper
diff --git a/tools/testing/selftests/vfs/Makefile b/tools/testing/selftests/vfs/Makefile
new file mode 100644
index 000000000000..8ca3cef43dc3
--- /dev/null
+++ b/tools/testing/selftests/vfs/Makefile
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+# Makefile for vfs selftests.
+CFLAGS = -Wall \
+         -O2 \
+         -I../../../../usr/include/
+
+TEST_PROGS := $(wildcard tests/*.sh)
+TEST_GEN_FILES := vfs_helper
+
+include ../lib.mk
diff --git a/tools/testing/selftests/vfs/at_flags.h b/tools/testing/selftests/vfs/at_flags.h
new file mode 100644
index 000000000000..a8ca8f689753
--- /dev/null
+++ b/tools/testing/selftests/vfs/at_flags.h
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Author: Aleksa Sarai
+ * Copyright (C) 2018 SUSE LLC.
+ */
+
+#ifndef __AT_FLAGS_H__
+#define __AT_FLAGS_H__
+
+/* These come from */
+#ifndef O_BENEATH
+# define O_BENEATH	00040000000
+# define O_XDEV		00100000000
+# define O_NOPROCLINKS	00200000000
+# define O_NOSYMLINKS	01000000000
+# define O_THISROOT	02000000000
+#endif
+#ifndef AT_BENEATH
+# define AT_BENEATH		0x8000
+# define AT_XDEV		0x10000
+# define AT_NO_PROCLINKS	0x20000
+# define AT_NO_SYMLINKS		0x40000
+# define AT_THIS_ROOT		0x80000
+#endif
+
+struct flag {
+	const char *name;
+	unsigned int at_flag, open_flag;
+};
+
+struct flag AT_FLAGS[] = {
+	{ .name = "beneath", .at_flag = AT_BENEATH, .open_flag = O_BENEATH },
+	{ .name = "xdev", .at_flag = AT_XDEV, .open_flag = O_XDEV },
+	{ .name = "no_proclinks", .at_flag = AT_NO_PROCLINKS, .open_flag = O_NOPROCLINKS },
+	{ .name = "no_symlinks", .at_flag = AT_NO_SYMLINKS, .open_flag = O_NOSYMLINKS },
+	{ .name = "this_root", .at_flag = AT_THIS_ROOT, .open_flag = O_THISROOT },
+	{ 0 }, /* terminate */
+};
+
+#endif /* !defined(__AT_FLAGS_H__) */
diff --git a/tools/testing/selftests/vfs/common.sh b/tools/testing/selftests/vfs/common.sh
new file mode 100644
index 000000000000..82ac8ad2a5a5
--- /dev/null
+++ b/tools/testing/selftests/vfs/common.sh
@@ -0,0 +1,37 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+set -e -o pipefail
+
+tmpdir="$(mktemp -d --tmpdir vfs_test.XXXXXX)"
+trap 'rm -rf "$tmpdir"' EXIT
+
+root="$tmpdir/root"
+mkdir -p "$root"
+
+function fail() {
+	echo "# not ok" "$@"
+	exit 1
+}
+
+ksft_skip=4
+function skip() {
+	echo "# skip" "$@"
+	exit "$ksft_skip"
+}
+
+function run() {
+	local old_flags="$-"
+	set +eET
+	output="$("$@" 2>&1)"
+	status="$?"
+	set "-$old_flags"
+}
+
+testrootdir="$(readlink -f "$(dirname "$BASH_SOURCE")")"
+function vfs_helper() {
+	run "$testrootdir/vfs_helper" "$@"
+}
+vfs_ops=( "open" "stat" "lstat" )
diff --git a/tools/testing/selftests/vfs/tests/0001_at_beneath.sh b/tools/testing/selftests/vfs/tests/0001_at_beneath.sh
new file mode 100755
index 000000000000..9a03b0953032
--- /dev/null
+++ b/tools/testing/selftests/vfs/tests/0001_at_beneath.sh
@@ -0,0 +1,72 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+sourcedir="$(readlink -f "$(dirname "$BASH_SOURCE")")"
+source "$sourcedir/../common.sh"
+
+touch "$root/inside"
+ln -s / "$root/rootlink"
+ln -s .. "$root/dotdot"
+ln -s "/../../../../../../$root" "$root/badlink"
+
+mkdir -p "$root/subdir"
+ln -s ../inside "$root/subdir/dotdotinside"
+ln -s ../subdir "$root/subdir/dotdotsubdir"
+ln -s subdir "$root/subdirlink"
+ln -s ../subdirlink/../../inside "$root/subdir/complexlink"
+
+for op in "${vfs_ops[@]}"
+do
+	vfs_helper -o "$op" -F beneath -d "$root" ..
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/].."
+
+	vfs_helper -o "$op" -F beneath -d "$root" ../root
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]../root"
+
+	vfs_helper -o "$op" -F beneath -d "$root" dotdot/root
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]dotdot(=..)/root"
+
+	vfs_helper -o "$op" -F beneath -d "$root" "$root"
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]/root"
+
+	vfs_helper -o "$op" -F beneath -d "$root" rootlink
+	if [[ "$op" == "lstat" ]]
+	then
+		[[ "$status" = 0 ]] || fail "$op beneath [/root/]rootlink(=/)"
+	else
+		[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]rootlink(=/)"
+	fi
+
+	vfs_helper -o "$op" -F beneath -d "$root" rootlink/
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]rootlink(=/)/"
+
+	vfs_helper -o "$op" -F beneath -d "$root" "rootlink/$root"
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]rootlink(=/)/root"
+
+	vfs_helper -o "$op" -F beneath -d "$root" badlink
+	if [[ "$op" == "lstat" ]]
+	then
+		[[ "$status" = 0 ]] || fail "$op beneath [/root/]badlink(=/../.../root)"
+	else
+		[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]badlink(=/../.../root)"
+	fi
+
+	vfs_helper -o "$op" -F beneath -d "$root" subdir/../inside
+	[[ "$status" -eq 0 ]] || fail "$op beneath [/root/]subdir/../inside"
+
+	vfs_helper -o "$op" -F beneath -d "$root" subdir/dotdotinside
+	[[ "$status" -eq 0 ]] || fail "$op beneath [/root/]subdir/dotdotinside(=../inside)"
+
+	vfs_helper -o "$op" -F beneath -d "$root" subdir/dotdotsubdir/
+	[[ "$status" -eq 0 ]] || fail "$op beneath [/root/]subdir/dotdotsubdir(=../subdir)/"
+
+	vfs_helper -o "$op" -F beneath -d "$root" subdir/complexlink
+	if [[ "$op" == "lstat" ]]
+	then
+		[[ "$status" = 0 ]] || fail "$op beneath [/root/]subdir/complexlink(=../subdirlink/../../inside)"
+	else
+		[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op beneath [/root/]subdir/complexlink(=../subdirlink/../../inside)"
+	fi
+done
diff --git a/tools/testing/selftests/vfs/tests/0002_at_xdev.sh b/tools/testing/selftests/vfs/tests/0002_at_xdev.sh
new file mode 100755
index 000000000000..06be58a8ffe7
--- /dev/null
+++ b/tools/testing/selftests/vfs/tests/0002_at_xdev.sh
@@ -0,0 +1,54 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+sourcedir="$(readlink -f "$(dirname "$BASH_SOURCE")")"
+source "$sourcedir/../common.sh"
+
+( mountpoint -q "/tmp" ) || skip "/tmp is not a mountpoint"
+
+touch /tmp/foo
+
+ln -s /tmp "$root/link_tmp"
+
+for op in "${vfs_ops[@]}"
+do
+	vfs_helper -o "$op" -F xdev -d / tmp/
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/]tmp/"
+
+	vfs_helper -o "$op" -F xdev -d / tmp/foo
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/]tmp/foo"
+
+	vfs_helper -o "$op" -F xdev -d "$root" /tmp
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/root/]/tmp"
+
+	vfs_helper -o "$op" -F xdev -d "$root" /tmp/
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/root/]/tmp/"
+
+	vfs_helper -o "$op" -F xdev -d "$root" /tmp/foo
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/root/]/tmp/foo"
+
+	vfs_helper -o "$op" -F xdev -d /tmp foo
+	[[ "$status" = 0 ]] || fail "$op xdev [/tmp/]foo"
+
+	vfs_helper -o "$op" -F xdev -d /tmp ..
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/tmp/].."
+
+	vfs_helper -o "$op" -F xdev -d /tmp ../
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/tmp/]../"
+
+	vfs_helper -o "$op" -F xdev -d /tmp ../tmp
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/tmp/]../tmp"
+
+	vfs_helper -o "$op" -F xdev -d "$root" link_tmp
+	if [[ "$op" == "lstat" ]]
+	then
+		[[ "$status" = 0 ]] || fail "$op xdev [/root/]link_tmp(=/tmp)"
+	else
+		[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/root/]link_tmp(=/tmp)"
+	fi
+
+	vfs_helper -o "$op" -F xdev -d "$root" link_tmp/
+	[[ "$(errno "$status")" =~ "EXDEV "* ]] || fail "$op xdev [/root/]link_tmp(=/tmp)/"
+done
diff --git a/tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh b/tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh
new file mode 100755
index 000000000000..41d9655a1e46
--- /dev/null
+++ b/tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh
@@ -0,0 +1,50 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+sourcedir="$(readlink -f "$(dirname "$BASH_SOURCE")")"
+source "$sourcedir/../common.sh"
+
+[ -e "/proc/$$/cwd" ] || skip "/proc/$$/cwd doesn't exist"
+
+ln -s / "$root/testlink"
+
+for op in "${vfs_ops[@]}"
+do
+	for flags in {no_proclinks,no_symlinks,"no_proclinks,no_symlinks"}
+	do
+		vfs_helper -o "$op" -F "$flags" "/proc/$$/stat"
+		[[ "$status" = 0 ]] || fail "$op $flags /proc/$$/stat"
+
+		vfs_helper -o "$op" -F "$flags" "/proc/$$/cwd"
+		if [[ "$op" == "lstat" ]]
+		then
+			[[ "$status" = 0 ]] || fail "$op $flags /proc/$$/cwd"
+		else
+			[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op $flags /proc/$$/cwd"
+		fi
+
+		vfs_helper -o "$op" -F "$flags" -d "$root" "testlink/"
+		if [[ "$flags" == "no_proclinks" ]]
+		then
+			[[ "$status" = 0 ]] || fail "$op $flags [/root/]testlink/"
+		else
+			[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op $flags [/root/]testlink/"
+		fi
+
+		vfs_helper -o "$op" -F "$flags" "/proc/$$/cwd/"
+		[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op $flags /proc/$$/cwd/"
+
+		vfs_helper -o "$op" -F "$flags" "/proc/$$/cwd/$BASH_SOURCE"
+		[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op $flags /proc/$$/cwd/$BASH_SOURCE"
+
+		vfs_helper -o "$op" -F "$flags" -d "/proc/self" cwd
+		if [[ "$op" == "lstat" ]]
+		then
+			[[ "$status" = 0 ]] || fail "$op $flags [/proc/self/]cwd"
+		else
+			[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op $flags [/proc/self/]cwd"
+		fi
+	done
+done
diff --git a/tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh b/tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh
new file mode 100755
index 000000000000..f7ec7e37f06a
--- /dev/null
+++ b/tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh
@@ -0,0 +1,49 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+sourcedir="$(readlink -f "$(dirname "$BASH_SOURCE")")"
+source "$sourcedir/../common.sh"
+
+mkdir -p "$root/dir"
+touch "$root/foo"
+
+ln -s . "$root/link_dot"
+ln -s .. "$root/link_dotdot"
+ln -s foo "$root/link_foo"
+
+for op in "${vfs_ops[@]}"
+do
+	vfs_helper -o "$op" -F no_symlinks -d "$root" foo
+	[[ "$status" = 0 ]] || fail "$op no_symlinks [/root/]foo"
+
+	vfs_helper -o "$op" -F no_symlinks -d "$root" ../root/foo
+	[[ "$status" = 0 ]] || fail "$op no_symlinks [/root/]../root/foo"
+
+	vfs_helper -o "$op" -F no_symlinks -d "$root" link_foo
+	if [[ "$op" == "lstat" ]]
+	then
+		[[ "$status" = 0 ]] || fail "$op no_symlinks [/root/]link_foo(=foo)"
+	else
+		[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op no_symlinks [/root/]link_foo(=foo)"
+	fi
+
+	vfs_helper -o "$op" -F no_proclinks -d "$root" link_foo
+	[[ "$status" = 0 ]] || fail "$op no_proclinks [/root/]link_foo(=foo)"
+
+	vfs_helper -o "$op" -F no_symlinks -d "$root" link_dotdot/
+	[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op no_symlinks [/root/]link_dotdot(=..)/"
+
+	vfs_helper -o "$op" -F no_proclinks -d "$root" link_dotdot/
+	[[ "$status" = 0 ]] || fail "$op no_proclinks [/root/]link_dotdot(=..)/"
+
+	vfs_helper -o "$op" -F no_symlinks -d "$root" link_dot/dir
+	[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op no_symlinks [/root/]link_dot(=.)/dir"
+
+	vfs_helper -o "$op" -F no_proclinks -d "$root" link_dot/dir
+	[[ "$status" = 0 ]] || fail "$op no_proclinks [/root/]link_dot(=.)/dir"
+
+	vfs_helper -o "$op" -F no_symlinks -d "$root" ../root/link_dot/link_dotdot/root/dir
+	[[ "$(errno "$status")" =~ "ELOOP "* ]] || fail "$op no_symlinks [/root/]../root/link_dot(=.)/link_dotdot(=..)/root/dir"
+done
diff --git a/tools/testing/selftests/vfs/tests/0005_at_this_root.sh b/tools/testing/selftests/vfs/tests/0005_at_this_root.sh
new file mode 100755
index 000000000000..aba23c28a7b7
--- /dev/null
+++ b/tools/testing/selftests/vfs/tests/0005_at_this_root.sh
@@ -0,0 +1,66 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Author: Aleksa Sarai
+# Copyright (C) 2018 SUSE LLC.
+
+sourcedir="$(readlink -f "$(dirname "$BASH_SOURCE")")"
+source "$sourcedir/../common.sh"
+
+mkdir -p "$root/var" "$root/etc" "$root/usr/bin" "$root/usr/local/bin"
+ln -s bash "$root/usr/bin/sh"
+ln -s ../../bin/bash "$root/usr/local/bin/bash"
+ln -s /bin/sh "$root/usr/local/bin/sh"
+ln -s ../bin3 "$root/var/bin"
+ln -s /usr/bin "$root/bin"
+ln -s /usr/local/bin "$root/bin4"
+ln -s ../../../../../../../../../bin "$root/bin2"
+ln -s /../../../../../../../../../bin "$root/bin3"
+touch "$root/etc/passwd" "$root/usr/bin/bash"
+
+# How should each path be mapped to a host path, in the form
+# 'path:hostpath[:hostpath_trailing]'. Everything is assumed to be ${root}
+# prefixed.
+host_mappings=(
+	# Basic paths.
+	"..:."
+	"/:."
+	"/../../../../../../:."
+	"../var/../../../../../etc/passwd:etc/passwd"
+	"/var/../../../../../etc/passwd:etc/passwd"
+	"/../../../../../../var/../../../../../etc/passwd:etc/passwd"
+	"etc/passwd:etc/passwd"
+	"/etc/passwd:etc/passwd"
+
+	# Basic symlink paths.
+	"/bin/bash:usr/bin/bash"
+	"/bin/sh:usr/bin/bash:usr/bin/sh"
+	"/bin2/bash:usr/bin/bash"
+	"/bin2/sh:usr/bin/bash:usr/bin/sh"
+	"/bin3/sh:usr/bin/bash:usr/bin/sh"
+	"/bin3/bash:usr/bin/bash"
+
+	# More complicated symlink paths.
+	"/bin4/../../local/bin/bash:usr/bin/bash:usr/local/bin/bash"
+	"/bin4/../../local/bin/sh:usr/bin/bash:usr/local/bin/sh"
+	"/bin4/../../../../../../../../../../usr/local/bin/bash:usr/bin/bash:usr/local/bin/bash"
+	"/bin4/../../../../../../../../../../usr/local/bin/sh:usr/bin/bash:usr/local/bin/sh"
+	"/bin/../../bin4/../../local/bin/bash:usr/bin/bash:usr/local/bin/bash"
+	"/bin/../../bin4/../../local/bin/sh:usr/bin/bash:usr/local/bin/sh"
+)
+
+for op in "${vfs_ops[@]}"
+do
+	for mapping in "${host_mappings[@]}"
+	do
+		IFS=":" read path hostpath hostpath_trailing <<< "$mapping"
+		[[ "$hostpath_trailing" ]] || export hostpath_trailing="$hostpath"
+		[[ "$op" == "lstat" ]] && export hostpath="$hostpath_trailing"
+
+		# Compare with and without this_root...
+		vfs_helper -o "$op" -d "$root" "$hostpath"
+		old_status="$status" old_output="$output"
+		vfs_helper -o "$op" -F this_root -d "$root" "$path"
+		[[ "$status" = "$old_status" ]] || fail "$op this_root $path=$status neq $old_status"
+		[[ "$output" == "$old_output" ]] || fail "$op this_root $path=$output neq $old_output"
+	done
+done
diff --git a/tools/testing/selftests/vfs/vfs_helper.c b/tools/testing/selftests/vfs/vfs_helper.c
new file mode 100644
index 000000000000..d67ec74a3fca
--- /dev/null
+++ b/tools/testing/selftests/vfs/vfs_helper.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Author: Aleksa Sarai
+ * Copyright (C) 2018 SUSE LLC.
+ */
+
+#define _GNU_SOURCE
+
+#include <errno.h>
+#include <fcntl.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/stat.h>
+
+#include "at_flags.h"
+#include "../kselftest.h"
+
+#define bail(...) \
+	do { \
+		fprintf(stderr, __VA_ARGS__); \
+		fputs("\n", stderr); \
+		exit(1); \
+	} while (0)
+
+extern char *__progname;
+#define usage() \
+	bail("usage: %s -o {open|stat|lstat} [-d <dir>] " \
+	     "[-F <flag>[,<flag>...]] <path>", __progname)
+
+static unsigned int parse_at_flags(char *opts)
+{
+	char *opt, *saveptr = NULL;
+	unsigned int flags = 0;
+
+	opt = strtok_r(opts, ",", &saveptr);
+	do {
+		unsigned int found = 0;
+
+		if (!opt || !*opt) /* strtok_r() returns NULL for an empty list */
+			continue;
+		for (struct flag *flag = AT_FLAGS; flag->name != NULL; flag++) {
+			if (!strcmp(opt, flag->name))
+				found |= flag->at_flag;
+		}
+		if (!found)
+			bail("unknown openat(2) flag: %s", opt);
+		flags |= found;
+	} while ((opt = strtok_r(NULL, ",", &saveptr)) != NULL);
+
+	return flags;
+}
+
+int stat_wrapper(int dirfd, const char *pathname, unsigned int flags)
+{
+	struct stat st = {0};
+	int err;
+
+	err = fstatat(dirfd, pathname, &st, flags);
+	if (err < 0)
+		return err;
+
+	printf("%lu:%lu\n", (unsigned long)st.st_dev, (unsigned long)st.st_ino);
+	return 0;
+}
+
+int lstat_wrapper(int dirfd, const char *pathname, unsigned int flags)
+{
+	return stat_wrapper(dirfd, pathname, flags | AT_SYMLINK_NOFOLLOW);
+}
+
+int openat_wrapper(int dirfd, const char *pathname, unsigned int flags)
+{
+	int fd;
+	char *fdpath = NULL, fullpath[PATH_MAX] = {0};
+
+	fd = openat(dirfd, pathname, flags);
+	if (fd < 0)
+		return fd;
+
+	/* Print the fully-qualified path using /proc/pid/fd/... */
+	if (asprintf(&fdpath, "/proc/self/fd/%d", fd) < 0)
+		bail("asprintf /proc/self/fd/%d: %m", fd);
+	if (readlink(fdpath, fullpath, PATH_MAX - 1) < 0) /* keep the trailing NUL */
+		bail("readlink %s: %m", fdpath);
+	puts(fullpath);
+	return fd;
+}
+
+int main(int argc, char **argv)
+{
+	int opt, ret, dirfd;
+	unsigned int flags = 0;
+	char *opstr = NULL, *dir_path = NULL, *path = NULL;
+	int (*opfunc)(int dirfd, const char *pathname, unsigned int flags);
+
+	while ((opt = getopt(argc, argv, "o:d:F:")) != -1) {
+		switch (opt) {
+		case 'o':
+			opstr = optarg;
+			break;
+		case 'd':
+			dir_path = optarg;
+			break;
+		case 'F':
+			flags |= parse_at_flags(optarg);
+			break;
+		default:
+			usage();
+		}
+	}
+
+	argc -= optind;
+	argv += optind;
+
+	if (argc != 1)
+		usage();
+	path = argv[0];
+
+	if (!opstr)
+		usage();
+	else if (!strcmp(opstr, "stat"))
+		opfunc = stat_wrapper;
+	else if (!strcmp(opstr, "lstat"))
+		opfunc = lstat_wrapper;
+	else if (!strcmp(opstr, "open"))
+		opfunc = openat_wrapper;
+	else
+		usage();
+
+	if (opfunc == openat_wrapper) {
+		unsigned int open_flags = 0;
+
+		for (struct flag *flag = AT_FLAGS; flag->name != NULL; flag++) {
+			if (flags & flag->at_flag)
+				open_flags |= flag->open_flag;
+		}
+		flags = open_flags;
+	}
+
+	dirfd = AT_FDCWD;
+	if (dir_path) {
+		dirfd = open(dir_path, O_PATH|O_DIRECTORY);
+		if (dirfd < 0)
+			bail("cannot open dir_path: %m");
+	}
+
+	ret = opfunc(dirfd, path, flags);
+	if (ret < 0)
+		ret = -errno;
+	return (ret < 0) ? -ret : 0;
+}