diff mbox

[review,07/11] vfs: Don't create inodes with a uid or gid unknown to the vfs

Message ID 20160702172035.19568-7-ebiederm@xmission.com (mailing list archive)
State New, archived
Headers show

Commit Message

Eric W. Biederman July 2, 2016, 5:20 p.m. UTC
It is expected that filesystems can not represent uids and gids from
outside of their user namespace.  Keep things simple by not even
trying to create filesystem nodes with non-sense uids and gids.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/namei.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Jan Kara July 4, 2016, 7:59 a.m. UTC | #1
On Sat 02-07-16 12:20:31, Eric W. Biederman wrote:
> It is expected that filesystems can not represent uids and gids from
> outside of their user namespace.  Keep things simple by not even
> trying to create filesystem nodes with non-sense uids and gids.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

So if we have sb->s_user_ns that doesn't map UID and GID 0, root cannot
directly create files in this filesystem. EOVERFLOW error will at least
hint us where the problem is but still I'm suspecting this is going to
create hard to debug configuration issues... I'm not sure if we can do
anything about this but I wanted to point it out.

								Honza
Eric W. Biederman July 5, 2016, 2:55 p.m. UTC | #2
Jan Kara <jack@suse.cz> writes:

> On Sat 02-07-16 12:20:31, Eric W. Biederman wrote:
>> It is expected that filesystems can not represent uids and gids from
>> outside of their user namespace.  Keep things simple by not even
>> trying to create filesystem nodes with non-sense uids and gids.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> So if we have sb->s_user_ns that doesn't map UID and GID 0, root cannot
> directly create files in this filesystem. EOVERFLOW error will at least
> hint us where the problem is but still I'm suspecting this is going to
> create hard to debug configuration issues... I'm not sure if we can do
> anything about this but I wanted to point it out.

A reasonable point.

A couple of details.

- If there is no uid or gid 0 inside of a user namespace there is
  no root user in that namespace so this is a moot point.

- If we are talking about uid 0 in the initial user namespace the
  scenario you are worried about can occur, but you have to work at it
  to get into that situation.

To create a superblock has s_user_ns != &init_user_ns something like
the following has to occur.

setuid(1000);
unshare(CLONE_NEWUSER);
unshare(CLONE_NEWNS);
mount(....);

At which point the oridinary root user can not see the filesystem
with s_user_ns != &init_user_ns as it is located in another mount
namespace.

Which means root has to use setns to get into that mount namespace,
so there are going to be larger hints than -EOVERFLOW.

At the same time if there are better error codes that make it clearer
what is happening I am all ears.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara July 6, 2016, 9:07 a.m. UTC | #3
On Tue 05-07-16 09:55:18, Eric W. Biederman wrote:
> Jan Kara <jack@suse.cz> writes:
> 
> > On Sat 02-07-16 12:20:31, Eric W. Biederman wrote:
> >> It is expected that filesystems can not represent uids and gids from
> >> outside of their user namespace.  Keep things simple by not even
> >> trying to create filesystem nodes with non-sense uids and gids.
> >> 
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >
> > So if we have sb->s_user_ns that doesn't map UID and GID 0, root cannot
> > directly create files in this filesystem. EOVERFLOW error will at least
> > hint us where the problem is but still I'm suspecting this is going to
> > create hard to debug configuration issues... I'm not sure if we can do
> > anything about this but I wanted to point it out.
> 
> A reasonable point.
> 
> A couple of details.
> 
> - If there is no uid or gid 0 inside of a user namespace there is
>   no root user in that namespace so this is a moot point.
> 
> - If we are talking about uid 0 in the initial user namespace the
>   scenario you are worried about can occur, but you have to work at it
>   to get into that situation.
> 
> To create a superblock has s_user_ns != &init_user_ns something like
> the following has to occur.
> 
> setuid(1000);
> unshare(CLONE_NEWUSER);
> unshare(CLONE_NEWNS);
> mount(....);
> 
> At which point the oridinary root user can not see the filesystem
> with s_user_ns != &init_user_ns as it is located in another mount
> namespace.
> 
> Which means root has to use setns to get into that mount namespace,
> so there are going to be larger hints than -EOVERFLOW.

OK, I see. Thanks for explanation. The inability of the admin (UID 0 in
init_user_ns) to easily see and change any filesystem mounted in the system
makes me somewhat nervous ;). But I guess the complexity is the price for
the flexibility...

								Honza
Eric W. Biederman July 6, 2016, 3:37 p.m. UTC | #4
Jan Kara <jack@suse.cz> writes:

> On Tue 05-07-16 09:55:18, Eric W. Biederman wrote:

>> To create a superblock has s_user_ns != &init_user_ns something like
>> the following has to occur.
>> 
>> setuid(1000);
>> unshare(CLONE_NEWUSER);
>> unshare(CLONE_NEWNS);
>> mount(....);
>> 
>> At which point the oridinary root user can not see the filesystem
>> with s_user_ns != &init_user_ns as it is located in another mount
>> namespace.
>> 
>> Which means root has to use setns to get into that mount namespace,
>> so there are going to be larger hints than -EOVERFLOW.
>
> OK, I see. Thanks for explanation. The inability of the admin (UID 0 in
> init_user_ns) to easily see and change any filesystem mounted in the system
> makes me somewhat nervous ;). But I guess the complexity is the price for
> the flexibility...

Certainly the price of not allowing unprivileged users to confuse
existing setuid 0 and setcap executables.

There are days I really wish I could do things the plan9 way and throw
out any legacy baggage that makes things hard to implement.  As I recall
plan9 threw out setuid, setgid and only supported the 9p filesystem to
make this all simple and almost safe.

Wonderfully and horribly I have to keep a lot more real world code
usable in this framework.

All of that said with the existence of CRIU I expect we will ultimately
have all of the interfaces needed for adminstration applications to see
everything that is going on.  Unfortunately sometimes those interfaces
and tools lag behind the rest of the work.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/namei.c b/fs/namei.c
index 840201c4c290..629823f19a6a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2814,16 +2814,22 @@  static int may_delete(struct inode *dir, struct dentry *victim, bool isdir)
  *  1. We can't do it if child already exists (open has special treatment for
  *     this case, but since we are inlined it's OK)
  *  2. We can't do it if dir is read-only (done in permission())
- *  3. We should have write and exec permissions on dir
- *  4. We can't do it if dir is immutable (done in permission())
+ *  3. We can't do it if the fs can't represent the fsuid or fsgid.
+ *  4. We should have write and exec permissions on dir
+ *  5. We can't do it if dir is immutable (done in permission())
  */
 static inline int may_create(struct inode *dir, struct dentry *child)
 {
+	struct user_namespace *s_user_ns;
 	audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
 	if (child->d_inode)
 		return -EEXIST;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
+	s_user_ns = dir->i_sb->s_user_ns;
+	if (!kuid_has_mapping(s_user_ns, current_fsuid()) ||
+	    !kgid_has_mapping(s_user_ns, current_fsgid()))
+		return -EOVERFLOW;
 	return inode_permission(dir, MAY_WRITE | MAY_EXEC);
 }