Message ID | 20211216054323.1707384-11-stefanb@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ima: Namespace IMA with audit support in IMA-ns | expand |
On Thu, Dec 16, 2021 at 12:43:19AM -0500, Stefan Berger wrote: > From: Stefan Berger <stefanb@linux.ibm.com> > > Extend 'securityfs' for support of IMA namespacing so that each > IMA (user) namespace can have its own front-end for showing the currently > active policy, the measurement list, number of violations and so on. > > Drop the addition dentry reference to enable simple cleanup of dentries > upon umount. > > Prevent mounting of an instance of securityfs in another user namespace > than it belongs to. Also, prevent accesses to directories when another > user namespace is active than the one that the instance of securityfs > belongs to. > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> > --- > security/inode.c | 37 ++++++++++++++++++++++++++++++++++--- > 1 file changed, 34 insertions(+), 3 deletions(-) > > diff --git a/security/inode.c b/security/inode.c > index fee01ff4d831..a0d9f086e3d5 100644 > --- a/security/inode.c > +++ b/security/inode.c > @@ -26,6 +26,29 @@ > static struct vfsmount *init_securityfs_mount; > static int init_securityfs_mount_count; > > +static int securityfs_permission(struct user_namespace *mnt_userns, > + struct inode *inode, int mask) > +{ > + int err; > + > + err = generic_permission(&init_user_ns, inode, mask); > + if (!err) { > + if (inode->i_sb->s_user_ns != current_user_ns()) > + err = -EACCES; I really think the correct semantics is to grant all callers access whose user namespace is the same as or an ancestor of the securityfs userns. It's weird to deny access to callers who are located in an ancestor userns. For example, a privileged process on the host should be allowed to setns to the userns of an unprivileged container and inspect its securityfs instance. We're mostly interested to block such as scenarios where two sibling unprivileged containers are created in the initial userns and an fd proxy or something funnels a file descriptor from one sibling container to the another one and the receiving sibling container can use readdir() or openat() on this fd. (I'm not even convinced that this is actually a problem but stricter semantics at the beginning can't hurt. We can always relax this later.)
On Thu, Dec 16, 2021 at 02:40:27PM +0100, Christian Brauner wrote: > On Thu, Dec 16, 2021 at 12:43:19AM -0500, Stefan Berger wrote: > > From: Stefan Berger <stefanb@linux.ibm.com> > > > > Extend 'securityfs' for support of IMA namespacing so that each > > IMA (user) namespace can have its own front-end for showing the currently > > active policy, the measurement list, number of violations and so on. > > > > Drop the addition dentry reference to enable simple cleanup of dentries > > upon umount. > > > > Prevent mounting of an instance of securityfs in another user namespace > > than it belongs to. Also, prevent accesses to directories when another > > user namespace is active than the one that the instance of securityfs > > belongs to. > > > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> > > --- > > security/inode.c | 37 ++++++++++++++++++++++++++++++++++--- > > 1 file changed, 34 insertions(+), 3 deletions(-) > > > > diff --git a/security/inode.c b/security/inode.c > > index fee01ff4d831..a0d9f086e3d5 100644 > > --- a/security/inode.c > > +++ b/security/inode.c > > @@ -26,6 +26,29 @@ > > static struct vfsmount *init_securityfs_mount; > > static int init_securityfs_mount_count; > > > > +static int securityfs_permission(struct user_namespace *mnt_userns, > > + struct inode *inode, int mask) > > +{ > > + int err; > > + > > + err = generic_permission(&init_user_ns, inode, mask); > > + if (!err) { > > + if (inode->i_sb->s_user_ns != current_user_ns()) > > + err = -EACCES; > > I really think the correct semantics is to grant all callers access > whose user namespace is the same as or an ancestor of the securityfs > userns. It's weird to deny access to callers who are located in an > ancestor userns. > > For example, a privileged process on the host should be allowed to setns > to the userns of an unprivileged container and inspect its securityfs s/userns/mntns/ > instance. > > We're mostly interested to block such as scenarios where two sibling > unprivileged containers are created in the initial userns and an fd > proxy or something funnels a file descriptor from one sibling container > to the another one and the receiving sibling container can use readdir() > or openat() on this fd. (I'm not even convinced that this is actually a > problem but stricter semantics at the beginning can't hurt. We can > always relax this later.) >
On 12/16/21 08:40, Christian Brauner wrote: > On Thu, Dec 16, 2021 at 12:43:19AM -0500, Stefan Berger wrote: >> From: Stefan Berger <stefanb@linux.ibm.com> >> >> Extend 'securityfs' for support of IMA namespacing so that each >> IMA (user) namespace can have its own front-end for showing the currently >> active policy, the measurement list, number of violations and so on. >> >> Drop the addition dentry reference to enable simple cleanup of dentries >> upon umount. >> >> Prevent mounting of an instance of securityfs in another user namespace >> than it belongs to. Also, prevent accesses to directories when another >> user namespace is active than the one that the instance of securityfs >> belongs to. >> >> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> >> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> >> --- >> security/inode.c | 37 ++++++++++++++++++++++++++++++++++--- >> 1 file changed, 34 insertions(+), 3 deletions(-) >> >> diff --git a/security/inode.c b/security/inode.c >> index fee01ff4d831..a0d9f086e3d5 100644 >> --- a/security/inode.c >> +++ b/security/inode.c >> @@ -26,6 +26,29 @@ >> static struct vfsmount *init_securityfs_mount; >> static int init_securityfs_mount_count; >> >> +static int securityfs_permission(struct user_namespace *mnt_userns, >> + struct inode *inode, int mask) >> +{ >> + int err; >> + >> + err = generic_permission(&init_user_ns, inode, mask); >> + if (!err) { >> + if (inode->i_sb->s_user_ns != current_user_ns()) >> + err = -EACCES; > I really think the correct semantics is to grant all callers access > whose user namespace is the same as or an ancestor of the securityfs > userns. It's weird to deny access to callers who are located in an > ancestor userns. Ok, will be using current_in_userns() or the more explicit in_userns() for the check. > > For example, a privileged process on the host should be allowed to setns > to the userns of an unprivileged container and inspect its securityfs > instance. > > We're mostly interested to block such as scenarios where two sibling > unprivileged containers are created in the initial userns and an fd > proxy or something funnels a file descriptor from one sibling container > to the another one and the receiving sibling container can use readdir() > or openat() on this fd. (I'm not even convinced that this is actually a > problem but stricter semantics at the beginning can't hurt. We can > always relax this later.)
diff --git a/security/inode.c b/security/inode.c index fee01ff4d831..a0d9f086e3d5 100644 --- a/security/inode.c +++ b/security/inode.c @@ -26,6 +26,29 @@ static struct vfsmount *init_securityfs_mount; static int init_securityfs_mount_count; +static int securityfs_permission(struct user_namespace *mnt_userns, + struct inode *inode, int mask) +{ + int err; + + err = generic_permission(&init_user_ns, inode, mask); + if (!err) { + if (inode->i_sb->s_user_ns != current_user_ns()) + err = -EACCES; + } + + return err; +} + +const struct inode_operations securityfs_dir_inode_operations = { + .permission = securityfs_permission, + .lookup = simple_lookup, +}; + +const struct inode_operations securityfs_file_inode_operations = { + .permission = securityfs_permission, +}; + static void securityfs_free_inode(struct inode *inode) { if (S_ISLNK(inode->i_mode)) @@ -41,20 +64,25 @@ static const struct super_operations securityfs_super_operations = { static int securityfs_fill_super(struct super_block *sb, struct fs_context *fc) { static const struct tree_descr files[] = {{""}}; + struct user_namespace *ns = fc->user_ns; int error; + if (WARN_ON(ns != current_user_ns())) + return -EINVAL; + error = simple_fill_super(sb, SECURITYFS_MAGIC, files); if (error) return error; sb->s_op = &securityfs_super_operations; + sb->s_root->d_inode->i_op = &securityfs_dir_inode_operations; return 0; } static int securityfs_get_tree(struct fs_context *fc) { - return get_tree_single(fc, securityfs_fill_super); + return get_tree_keyed(fc, securityfs_fill_super, fc->user_ns); } static const struct fs_context_operations securityfs_context_ops = { @@ -72,6 +100,7 @@ static struct file_system_type fs_type = { .name = "securityfs", .init_fs_context = securityfs_init_fs_context, .kill_sb = kill_litter_super, + .fs_flags = FS_USERNS_MOUNT, }; /** @@ -157,7 +186,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode); inode->i_private = data; if (S_ISDIR(mode)) { - inode->i_op = &simple_dir_inode_operations; + inode->i_op = &securityfs_dir_inode_operations; inode->i_fop = &simple_dir_operations; inc_nlink(inode); inc_nlink(dir); @@ -165,10 +194,10 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, inode->i_op = iops ? iops : &simple_symlink_inode_operations; inode->i_link = data; } else { + inode->i_op = &securityfs_file_inode_operations; inode->i_fop = fops; } d_instantiate(dentry, inode); - dget(dentry); inode_unlock(dir); return dentry; @@ -316,10 +345,12 @@ void securityfs_remove(struct dentry *dentry) dir = d_inode(dentry->d_parent); inode_lock(dir); if (simple_positive(dentry)) { + dget(dentry); if (d_is_dir(dentry)) simple_rmdir(dir, dentry); else simple_unlink(dir, dentry); + d_delete(dentry); dput(dentry); } inode_unlock(dir);