Message ID | 20210316170135.226381-2-mic@digikod.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Unprivileged chroot | expand |
On Tue, Mar 16, 2021 at 06:01:35PM +0100, Mickaël Salaün wrote: > From: Mickaël Salaün <mic@linux.microsoft.com> > > Being able to easily change root directories enables to ease some > development workflow and can be used as a tool to strengthen > unprivileged security sandboxes. chroot(2) is not an access-control > mechanism per se, but it can be used to limit the absolute view of the > filesystem, and then limit ways to access data and kernel interfaces > (e.g. /proc, /sys, /dev, etc.). > > Users may not wish to expose namespace complexity to potentially > malicious processes, or limit their use because of limited resources. > The chroot feature is much more simple (and limited) than the mount > namespace, but can still be useful. As for containers, users of > chroot(2) should take care of file descriptors or data accessible by > other means (e.g. current working directory, leaked FDs, passed FDs, > devices, mount points, etc.). There is a lot of literature that discuss > the limitations of chroot, and users of this feature should be aware of > the multiple ways to bypass it. Using chroot(2) for security purposes > can make sense if it is combined with other features (e.g. dedicated > user, seccomp, LSM access-controls, etc.). > > One could argue that chroot(2) is useless without a properly populated > root hierarchy (i.e. without /dev and /proc). However, there are > multiple use cases that don't require the chrooting process to create > file hierarchies with special files nor mount points, e.g.: > * A process sandboxing itself, once all its libraries are loaded, may > not need files other than regular files, or even no file at all. > * Some pre-populated root hierarchies could be used to chroot into, > provided for instance by development environments or tailored > distributions. > * Processes executed in a chroot may not require access to these special > files (e.g. with minimal runtimes, or by emulating some special files > with a LD_PRELOADed library or seccomp). > > Unprivileged chroot is especially interesting for userspace developers > wishing to harden their applications. For instance, chroot(2) and Yama > enable to build a capability-based security (i.e. remove filesystem > ambient accesses) by calling chroot/chdir with an empty directory and > accessing data through dedicated file descriptors obtained with > openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. > > Allowing a task to change its own root directory is not a threat to the > system if we can prevent confused deputy attacks, which could be > performed through execution of SUID-like binaries. This can be > prevented if the calling task sets PR_SET_NO_NEW_PRIVS on itself with > prctl(2). To only affect this task, its filesystem information must not > be shared with other tasks, which can be achieved by not passing > CLONE_FS to clone(2). A similar no_new_privs check is already used by > seccomp to avoid the same kind of security issues. Furthermore, because > of its security use and to avoid giving a new way for attackers to get > out of a chroot (e.g. using /proc/<pid>/root, or chroot/chdir), an > unprivileged chroot is only allowed if the calling process is not > already chrooted. This limitation is the same as for creating user > namespaces. > > This change may not impact systems relying on other permission models > than POSIX capabilities (e.g. Tomoyo). Being able to use chroot(2) on > such systems may require to update their security policies. > > Only the chroot system call is relaxed with this no_new_privs check; the > init_chroot() helper doesn't require such change. > > Allowing unprivileged users to use chroot(2) is one of the initial > objectives of no_new_privs: > https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html > This patch is a follow-up of a previous one sent by Andy Lutomirski: > https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.luto@amacapital.net/ > > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Andy Lutomirski <luto@amacapital.net> > Cc: Christian Brauner <christian.brauner@ubuntu.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: David Howells <dhowells@redhat.com> > Cc: Dominik Brodowski <linux@dominikbrodowski.net> > Cc: Eric W. Biederman <ebiederm@xmission.com> > Cc: James Morris <jmorris@namei.org> > Cc: John Johansen <john.johansen@canonical.com> > Cc: Kees Cook <keescook@chromium.org> > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > Cc: Serge Hallyn <serge@hallyn.com> > Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> > Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Thanks for the updates! I find this version much easier to read. :) Reviewed-by: Kees Cook <keescook@chromium.org>
On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@digikod.net> wrote: > One could argue that chroot(2) is useless without a properly populated > root hierarchy (i.e. without /dev and /proc). However, there are > multiple use cases that don't require the chrooting process to create > file hierarchies with special files nor mount points, e.g.: > * A process sandboxing itself, once all its libraries are loaded, may > not need files other than regular files, or even no file at all. > * Some pre-populated root hierarchies could be used to chroot into, > provided for instance by development environments or tailored > distributions. > * Processes executed in a chroot may not require access to these special > files (e.g. with minimal runtimes, or by emulating some special files > with a LD_PRELOADed library or seccomp). > > Unprivileged chroot is especially interesting for userspace developers > wishing to harden their applications. For instance, chroot(2) and Yama > enable to build a capability-based security (i.e. remove filesystem > ambient accesses) by calling chroot/chdir with an empty directory and > accessing data through dedicated file descriptors obtained with > openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. I don't entirely understand. Are you writing this with the assumption that a future change will make it possible to set these RESOLVE flags process-wide, or something like that? As long as that doesn't exist, I think that to make this safe, you'd have to do something like the following - let a child process set up a new mount namespace for you, and then chroot() into that namespace's root: struct shared_data { int root_fd; }; int helper_fn(void *args) { struct shared_data *shared = args; mount("none", "/tmp", "tmpfs", MS_NOSUID|MS_NODEV, ""); mkdir("/tmp/old_root", 0700); pivot_root("/tmp", "/tmp/old_root"); umount("/tmp/old_root", ""); shared->root_fd = open("/", O_PATH); } void setup_chroot() { struct shared_data shared = {}; prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); clone(helper_fn, my_stack, CLONE_VFORK|CLONE_VM|CLONE_FILES|CLONE_NEWUSER|CLONE_NEWNS|SIGCHLD, NULL); fchdir(shared.root_fd); chroot("."); } [...] > diff --git a/fs/open.c b/fs/open.c [...] > +static inline int current_chroot_allowed(void) > +{ > + /* > + * Changing the root directory for the calling task (and its future > + * children) requires that this task has CAP_SYS_CHROOT in its > + * namespace, or be running with no_new_privs and not sharing its > + * fs_struct and not escaping its current root (cf. create_user_ns()). > + * As for seccomp, checking no_new_privs avoids scenarios where > + * unprivileged tasks can affect the behavior of privileged children. > + */ > + if (task_no_new_privs(current) && current->fs->users == 1 && this read of current->fs->users should be using READ_ONCE() > + !current_chrooted()) > + return 0; > + if (ns_capable(current_user_ns(), CAP_SYS_CHROOT)) > + return 0; > + return -EPERM; > +} [...] Overall I think this change is a good idea.
On Tue, Mar 16, 2021 at 08:04:09PM +0100, Jann Horn wrote: > On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@digikod.net> wrote: > > One could argue that chroot(2) is useless without a properly populated > > root hierarchy (i.e. without /dev and /proc). However, there are > > multiple use cases that don't require the chrooting process to create > > file hierarchies with special files nor mount points, e.g.: > > * A process sandboxing itself, once all its libraries are loaded, may > > not need files other than regular files, or even no file at all. > > * Some pre-populated root hierarchies could be used to chroot into, > > provided for instance by development environments or tailored > > distributions. > > * Processes executed in a chroot may not require access to these special > > files (e.g. with minimal runtimes, or by emulating some special files > > with a LD_PRELOADed library or seccomp). > > > > Unprivileged chroot is especially interesting for userspace developers > > wishing to harden their applications. For instance, chroot(2) and Yama > > enable to build a capability-based security (i.e. remove filesystem > > ambient accesses) by calling chroot/chdir with an empty directory and > > accessing data through dedicated file descriptors obtained with > > openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. > > I don't entirely understand. Are you writing this with the assumption > that a future change will make it possible to set these RESOLVE flags > process-wide, or something like that? I thought it meant "open all out-of-chroot dirs as fds using RESOLVE_... flags then chroot". As in, there's no way to then escape "up" for the old opens, and the new opens stay in the chroot. > [...] > > diff --git a/fs/open.c b/fs/open.c > [...] > > +static inline int current_chroot_allowed(void) > > +{ > > + /* > > + * Changing the root directory for the calling task (and its future > > + * children) requires that this task has CAP_SYS_CHROOT in its > > + * namespace, or be running with no_new_privs and not sharing its > > + * fs_struct and not escaping its current root (cf. create_user_ns()). > > + * As for seccomp, checking no_new_privs avoids scenarios where > > + * unprivileged tasks can affect the behavior of privileged children. > > + */ > > + if (task_no_new_privs(current) && current->fs->users == 1 && > > this read of current->fs->users should be using READ_ONCE() Ah yeah, good call. I should remember this when I think "can this race?" :P
On 16/03/2021 20:24, Kees Cook wrote: > On Tue, Mar 16, 2021 at 08:04:09PM +0100, Jann Horn wrote: >> On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@digikod.net> wrote: >>> One could argue that chroot(2) is useless without a properly populated >>> root hierarchy (i.e. without /dev and /proc). However, there are >>> multiple use cases that don't require the chrooting process to create >>> file hierarchies with special files nor mount points, e.g.: >>> * A process sandboxing itself, once all its libraries are loaded, may >>> not need files other than regular files, or even no file at all. >>> * Some pre-populated root hierarchies could be used to chroot into, >>> provided for instance by development environments or tailored >>> distributions. >>> * Processes executed in a chroot may not require access to these special >>> files (e.g. with minimal runtimes, or by emulating some special files >>> with a LD_PRELOADed library or seccomp). >>> >>> Unprivileged chroot is especially interesting for userspace developers >>> wishing to harden their applications. For instance, chroot(2) and Yama >>> enable to build a capability-based security (i.e. remove filesystem >>> ambient accesses) by calling chroot/chdir with an empty directory and >>> accessing data through dedicated file descriptors obtained with >>> openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. >> >> I don't entirely understand. Are you writing this with the assumption >> that a future change will make it possible to set these RESOLVE flags >> process-wide, or something like that? > > I thought it meant "open all out-of-chroot dirs as fds using RESOLVE_... > flags then chroot". As in, there's no way to then escape "up" for the > old opens, and the new opens stay in the chroot. Yes, that was the idea. > >> [...] >>> diff --git a/fs/open.c b/fs/open.c >> [...] >>> +static inline int current_chroot_allowed(void) >>> +{ >>> + /* >>> + * Changing the root directory for the calling task (and its future >>> + * children) requires that this task has CAP_SYS_CHROOT in its >>> + * namespace, or be running with no_new_privs and not sharing its >>> + * fs_struct and not escaping its current root (cf. create_user_ns()). >>> + * As for seccomp, checking no_new_privs avoids scenarios where >>> + * unprivileged tasks can affect the behavior of privileged children. >>> + */ >>> + if (task_no_new_privs(current) && current->fs->users == 1 && >> >> this read of current->fs->users should be using READ_ONCE() > > Ah yeah, good call. I should remember this when I think "can this race?" > :P >
On 16/03/2021 20:04, Jann Horn wrote: > On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@digikod.net> wrote: >> One could argue that chroot(2) is useless without a properly populated >> root hierarchy (i.e. without /dev and /proc). However, there are >> multiple use cases that don't require the chrooting process to create >> file hierarchies with special files nor mount points, e.g.: >> * A process sandboxing itself, once all its libraries are loaded, may >> not need files other than regular files, or even no file at all. >> * Some pre-populated root hierarchies could be used to chroot into, >> provided for instance by development environments or tailored >> distributions. >> * Processes executed in a chroot may not require access to these special >> files (e.g. with minimal runtimes, or by emulating some special files >> with a LD_PRELOADed library or seccomp). >> >> Unprivileged chroot is especially interesting for userspace developers >> wishing to harden their applications. For instance, chroot(2) and Yama >> enable to build a capability-based security (i.e. remove filesystem >> ambient accesses) by calling chroot/chdir with an empty directory and >> accessing data through dedicated file descriptors obtained with >> openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. > > I don't entirely understand. Are you writing this with the assumption > that a future change will make it possible to set these RESOLVE flags > process-wide, or something like that? No, this scenario is for applications willing to sandbox themselves and only use the FDs to access legitimate data. > > > As long as that doesn't exist, I think that to make this safe, you'd > have to do something like the following - let a child process set up a > new mount namespace for you, and then chroot() into that namespace's > root: > > struct shared_data { > int root_fd; > }; > int helper_fn(void *args) { > struct shared_data *shared = args; > mount("none", "/tmp", "tmpfs", MS_NOSUID|MS_NODEV, ""); > mkdir("/tmp/old_root", 0700); > pivot_root("/tmp", "/tmp/old_root"); > umount("/tmp/old_root", ""); > shared->root_fd = open("/", O_PATH); > } > void setup_chroot() { > struct shared_data shared = {}; > prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); > clone(helper_fn, my_stack, > CLONE_VFORK|CLONE_VM|CLONE_FILES|CLONE_NEWUSER|CLONE_NEWNS|SIGCHLD, > NULL); > fchdir(shared.root_fd); > chroot("."); > } What about this? chdir("/proc/self/fdinfo"); chroot("."); close(all unnecessary FDs); > > [...] >> diff --git a/fs/open.c b/fs/open.c > [...] >> +static inline int current_chroot_allowed(void) >> +{ >> + /* >> + * Changing the root directory for the calling task (and its future >> + * children) requires that this task has CAP_SYS_CHROOT in its >> + * namespace, or be running with no_new_privs and not sharing its >> + * fs_struct and not escaping its current root (cf. create_user_ns()). >> + * As for seccomp, checking no_new_privs avoids scenarios where >> + * unprivileged tasks can affect the behavior of privileged children. >> + */ >> + if (task_no_new_privs(current) && current->fs->users == 1 && > > this read of current->fs->users should be using READ_ONCE() Right, I'll fix this. > >> + !current_chrooted()) >> + return 0; >> + if (ns_capable(current_user_ns(), CAP_SYS_CHROOT)) >> + return 0; >> + return -EPERM; >> +} > [...] > > Overall I think this change is a good idea. >
On Tue, Mar 16, 2021 at 8:26 PM Mickaël Salaün <mic@digikod.net> wrote: > On 16/03/2021 20:04, Jann Horn wrote: > > On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@digikod.net> wrote: > >> One could argue that chroot(2) is useless without a properly populated > >> root hierarchy (i.e. without /dev and /proc). However, there are > >> multiple use cases that don't require the chrooting process to create > >> file hierarchies with special files nor mount points, e.g.: > >> * A process sandboxing itself, once all its libraries are loaded, may > >> not need files other than regular files, or even no file at all. > >> * Some pre-populated root hierarchies could be used to chroot into, > >> provided for instance by development environments or tailored > >> distributions. > >> * Processes executed in a chroot may not require access to these special > >> files (e.g. with minimal runtimes, or by emulating some special files > >> with a LD_PRELOADed library or seccomp). > >> > >> Unprivileged chroot is especially interesting for userspace developers > >> wishing to harden their applications. For instance, chroot(2) and Yama > >> enable to build a capability-based security (i.e. remove filesystem > >> ambient accesses) by calling chroot/chdir with an empty directory and > >> accessing data through dedicated file descriptors obtained with > >> openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. > > > > I don't entirely understand. Are you writing this with the assumption > > that a future change will make it possible to set these RESOLVE flags > > process-wide, or something like that? > > No, this scenario is for applications willing to sandbox themselves and > only use the FDs to access legitimate data. But if you're chrooted to /proc/self/fdinfo and have an fd to some directory - let's say /home/user/Downloads - there is nothing that ensures that you only use that fd with RESOLVE_BENEATH, right? If the application is compromised, it can do something like openat(fd, "../.bashrc", O_RDWR), right? Or am I missing something? > > As long as that doesn't exist, I think that to make this safe, you'd > > have to do something like the following - let a child process set up a > > new mount namespace for you, and then chroot() into that namespace's > > root: > > > > struct shared_data { > > int root_fd; > > }; > > int helper_fn(void *args) { > > struct shared_data *shared = args; > > mount("none", "/tmp", "tmpfs", MS_NOSUID|MS_NODEV, ""); > > mkdir("/tmp/old_root", 0700); > > pivot_root("/tmp", "/tmp/old_root"); > > umount("/tmp/old_root", ""); > > shared->root_fd = open("/", O_PATH); > > } > > void setup_chroot() { > > struct shared_data shared = {}; > > prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); > > clone(helper_fn, my_stack, > > CLONE_VFORK|CLONE_VM|CLONE_FILES|CLONE_NEWUSER|CLONE_NEWNS|SIGCHLD, > > NULL); > > fchdir(shared.root_fd); > > chroot("."); > > } > > What about this? > chdir("/proc/self/fdinfo"); > chroot("."); > close(all unnecessary FDs); That breaks down if you can e.g. get a unix domain socket connected to a process in a different chroot, right? Isn't that a bit too fragile?
On 16/03/2021 20:31, Jann Horn wrote: > On Tue, Mar 16, 2021 at 8:26 PM Mickaël Salaün <mic@digikod.net> wrote: >> On 16/03/2021 20:04, Jann Horn wrote: >>> On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@digikod.net> wrote: >>>> One could argue that chroot(2) is useless without a properly populated >>>> root hierarchy (i.e. without /dev and /proc). However, there are >>>> multiple use cases that don't require the chrooting process to create >>>> file hierarchies with special files nor mount points, e.g.: >>>> * A process sandboxing itself, once all its libraries are loaded, may >>>> not need files other than regular files, or even no file at all. >>>> * Some pre-populated root hierarchies could be used to chroot into, >>>> provided for instance by development environments or tailored >>>> distributions. >>>> * Processes executed in a chroot may not require access to these special >>>> files (e.g. with minimal runtimes, or by emulating some special files >>>> with a LD_PRELOADed library or seccomp). >>>> >>>> Unprivileged chroot is especially interesting for userspace developers >>>> wishing to harden their applications. For instance, chroot(2) and Yama >>>> enable to build a capability-based security (i.e. remove filesystem >>>> ambient accesses) by calling chroot/chdir with an empty directory and >>>> accessing data through dedicated file descriptors obtained with >>>> openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS. >>> >>> I don't entirely understand. Are you writing this with the assumption >>> that a future change will make it possible to set these RESOLVE flags >>> process-wide, or something like that? >> >> No, this scenario is for applications willing to sandbox themselves and >> only use the FDs to access legitimate data. > > But if you're chrooted to /proc/self/fdinfo and have an fd to some > directory - let's say /home/user/Downloads - there is nothing that > ensures that you only use that fd with RESOLVE_BENEATH, right? If the > application is compromised, it can do something like openat(fd, > "../.bashrc", O_RDWR), right? Or am I missing something? You're totally right, I was mistaken, this simple use case doesn't work without a broker. Perhaps when seccomp will be able to check referenced structs, or with a new FD limitation… > >>> As long as that doesn't exist, I think that to make this safe, you'd >>> have to do something like the following - let a child process set up a >>> new mount namespace for you, and then chroot() into that namespace's >>> root: >>> >>> struct shared_data { >>> int root_fd; >>> }; >>> int helper_fn(void *args) { >>> struct shared_data *shared = args; >>> mount("none", "/tmp", "tmpfs", MS_NOSUID|MS_NODEV, ""); >>> mkdir("/tmp/old_root", 0700); >>> pivot_root("/tmp", "/tmp/old_root"); >>> umount("/tmp/old_root", ""); >>> shared->root_fd = open("/", O_PATH); >>> } >>> void setup_chroot() { >>> struct shared_data shared = {}; >>> prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); >>> clone(helper_fn, my_stack, >>> CLONE_VFORK|CLONE_VM|CLONE_FILES|CLONE_NEWUSER|CLONE_NEWNS|SIGCHLD, >>> NULL); >>> fchdir(shared.root_fd); >>> chroot("."); >>> } >> >> What about this? >> chdir("/proc/self/fdinfo"); >> chroot("."); >> close(all unnecessary FDs); > > That breaks down if you can e.g. get a unix domain socket connected to > a process in a different chroot, right? Isn't that a bit too fragile? This relies on other (trusted) components, and yes it is fragile if the process communicates with a service able send FDs.
diff --git a/fs/open.c b/fs/open.c index e53af13b5835..da46eb28a3a6 100644 --- a/fs/open.c +++ b/fs/open.c @@ -532,6 +532,24 @@ SYSCALL_DEFINE1(fchdir, unsigned int, fd) return error; } +static inline int current_chroot_allowed(void) +{ + /* + * Changing the root directory for the calling task (and its future + * children) requires that this task has CAP_SYS_CHROOT in its + * namespace, or be running with no_new_privs and not sharing its + * fs_struct and not escaping its current root (cf. create_user_ns()). + * As for seccomp, checking no_new_privs avoids scenarios where + * unprivileged tasks can affect the behavior of privileged children. + */ + if (task_no_new_privs(current) && current->fs->users == 1 && + !current_chrooted()) + return 0; + if (ns_capable(current_user_ns(), CAP_SYS_CHROOT)) + return 0; + return -EPERM; +} + SYSCALL_DEFINE1(chroot, const char __user *, filename) { struct path path; @@ -546,9 +564,10 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename) if (error) goto dput_and_out; - error = -EPERM; - if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT)) + error = current_chroot_allowed(); + if (error) goto dput_and_out; + error = security_path_chroot(&path); if (error) goto dput_and_out;