Message ID | 20160702172035.19568-7-ebiederm@xmission.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sat 02-07-16 12:20:31, Eric W. Biederman wrote: > It is expected that filesystems can not represent uids and gids from > outside of their user namespace. Keep things simple by not even > trying to create filesystem nodes with non-sense uids and gids. > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> So if we have sb->s_user_ns that doesn't map UID and GID 0, root cannot directly create files in this filesystem. EOVERFLOW error will at least hint us where the problem is but still I'm suspecting this is going to create hard to debug configuration issues... I'm not sure if we can do anything about this but I wanted to point it out. Honza
Jan Kara <jack@suse.cz> writes: > On Sat 02-07-16 12:20:31, Eric W. Biederman wrote: >> It is expected that filesystems can not represent uids and gids from >> outside of their user namespace. Keep things simple by not even >> trying to create filesystem nodes with non-sense uids and gids. >> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > > So if we have sb->s_user_ns that doesn't map UID and GID 0, root cannot > directly create files in this filesystem. EOVERFLOW error will at least > hint us where the problem is but still I'm suspecting this is going to > create hard to debug configuration issues... I'm not sure if we can do > anything about this but I wanted to point it out. A reasonable point. A couple of details. - If there is no uid or gid 0 inside of a user namespace there is no root user in that namespace so this is a moot point. - If we are talking about uid 0 in the initial user namespace the scenario you are worried about can occur, but you have to work at it to get into that situation. To create a superblock has s_user_ns != &init_user_ns something like the following has to occur. setuid(1000); unshare(CLONE_NEWUSER); unshare(CLONE_NEWNS); mount(....); At which point the oridinary root user can not see the filesystem with s_user_ns != &init_user_ns as it is located in another mount namespace. Which means root has to use setns to get into that mount namespace, so there are going to be larger hints than -EOVERFLOW. At the same time if there are better error codes that make it clearer what is happening I am all ears. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue 05-07-16 09:55:18, Eric W. Biederman wrote: > Jan Kara <jack@suse.cz> writes: > > > On Sat 02-07-16 12:20:31, Eric W. Biederman wrote: > >> It is expected that filesystems can not represent uids and gids from > >> outside of their user namespace. Keep things simple by not even > >> trying to create filesystem nodes with non-sense uids and gids. > >> > >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > > > > So if we have sb->s_user_ns that doesn't map UID and GID 0, root cannot > > directly create files in this filesystem. EOVERFLOW error will at least > > hint us where the problem is but still I'm suspecting this is going to > > create hard to debug configuration issues... I'm not sure if we can do > > anything about this but I wanted to point it out. > > A reasonable point. > > A couple of details. > > - If there is no uid or gid 0 inside of a user namespace there is > no root user in that namespace so this is a moot point. > > - If we are talking about uid 0 in the initial user namespace the > scenario you are worried about can occur, but you have to work at it > to get into that situation. > > To create a superblock has s_user_ns != &init_user_ns something like > the following has to occur. > > setuid(1000); > unshare(CLONE_NEWUSER); > unshare(CLONE_NEWNS); > mount(....); > > At which point the oridinary root user can not see the filesystem > with s_user_ns != &init_user_ns as it is located in another mount > namespace. > > Which means root has to use setns to get into that mount namespace, > so there are going to be larger hints than -EOVERFLOW. OK, I see. Thanks for explanation. The inability of the admin (UID 0 in init_user_ns) to easily see and change any filesystem mounted in the system makes me somewhat nervous ;). But I guess the complexity is the price for the flexibility... Honza
Jan Kara <jack@suse.cz> writes: > On Tue 05-07-16 09:55:18, Eric W. Biederman wrote: >> To create a superblock has s_user_ns != &init_user_ns something like >> the following has to occur. >> >> setuid(1000); >> unshare(CLONE_NEWUSER); >> unshare(CLONE_NEWNS); >> mount(....); >> >> At which point the oridinary root user can not see the filesystem >> with s_user_ns != &init_user_ns as it is located in another mount >> namespace. >> >> Which means root has to use setns to get into that mount namespace, >> so there are going to be larger hints than -EOVERFLOW. > > OK, I see. Thanks for explanation. The inability of the admin (UID 0 in > init_user_ns) to easily see and change any filesystem mounted in the system > makes me somewhat nervous ;). But I guess the complexity is the price for > the flexibility... Certainly the price of not allowing unprivileged users to confuse existing setuid 0 and setcap executables. There are days I really wish I could do things the plan9 way and throw out any legacy baggage that makes things hard to implement. As I recall plan9 threw out setuid, setgid and only supported the 9p filesystem to make this all simple and almost safe. Wonderfully and horribly I have to keep a lot more real world code usable in this framework. All of that said with the existence of CRIU I expect we will ultimately have all of the interfaces needed for adminstration applications to see everything that is going on. Unfortunately sometimes those interfaces and tools lag behind the rest of the work. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/namei.c b/fs/namei.c index 840201c4c290..629823f19a6a 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2814,16 +2814,22 @@ static int may_delete(struct inode *dir, struct dentry *victim, bool isdir) * 1. We can't do it if child already exists (open has special treatment for * this case, but since we are inlined it's OK) * 2. We can't do it if dir is read-only (done in permission()) - * 3. We should have write and exec permissions on dir - * 4. We can't do it if dir is immutable (done in permission()) + * 3. We can't do it if the fs can't represent the fsuid or fsgid. + * 4. We should have write and exec permissions on dir + * 5. We can't do it if dir is immutable (done in permission()) */ static inline int may_create(struct inode *dir, struct dentry *child) { + struct user_namespace *s_user_ns; audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE); if (child->d_inode) return -EEXIST; if (IS_DEADDIR(dir)) return -ENOENT; + s_user_ns = dir->i_sb->s_user_ns; + if (!kuid_has_mapping(s_user_ns, current_fsuid()) || + !kgid_has_mapping(s_user_ns, current_fsgid())) + return -EOVERFLOW; return inode_permission(dir, MAY_WRITE | MAY_EXEC); }
It is expected that filesystems can not represent uids and gids from outside of their user namespace. Keep things simple by not even trying to create filesystem nodes with non-sense uids and gids. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- fs/namei.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)