Message ID | 20150728204009.GF83521@ubuntu-hedt (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 7/28/2015 1:40 PM, Seth Forshee wrote: > On Wed, Jul 22, 2015 at 05:05:17PM -0700, Casey Schaufler wrote: >>> This is what I currently think you want for user ns mounts: >>> >>> 1. smk_root and smk_default are assigned the label of the backing >>> device. >>> 2. s_root is assigned the transmute property. >>> 3. For existing files: >>> a. Files with the same label as the backing device are accessible. >>> b. Files with any other label are not accessible. >> That's right. Accept correct data, reject anything that's not right. >> >>> If this is right, there are a couple lingering questions in my mind. >>> >>> First, what happens with files created in directories with the same >>> label as the backing device but without the transmute property set? The >>> inode for the new file will initially be labeled with smk_of_current(), >>> but then during d_instantiate it will get smk_default and thus end up >>> with the label we want. So that seems okay. >> Yes. >> >>> The second is whether files with the SMACK64EXEC attribute is still a >>> problem. It seems it is, for files with the same label as the backing >>> store at least. I think we can simply skip the code that reads out this >>> xattr and sets smk_task for user ns mounts, or else skip assigning the >>> label to the new task in bprm_set_creds. The latter seems more >>> consistent with the approach you've suggested for dealing with labels >>> from disk. >> Yes, I think that skipping the smk_fetch(XATTR_NAME_SMACKEXEC, ...) in >> smack_d_instantiate for unprivileged mounts would do the trick. >> >>> So I guess all of that seems okay, though perhaps a bit restrictive >>> given that the user who mounted the filesystem already has full access >>> to the backing store. >> In truth, there is no reason to expect that the "user" who did the >> mount will ever have a Smack label that differs from the label of >> the backing store. If what we've got here seems restrictive, it's >> because you've got access from someone other than the "user". >> >>> Please let me know whether or not this matches up with what you are >>> thinking, then I can procede with the implementation. >> My current mindset is that, if you're going to allow unprivileged >> mounts of user defined backing stores, this is as safe as we can >> make it. > All right, I've got a patch which I think does this, and I've managed to > do some testing to confirm that it behaves like I expect. How does this > look? > > What's missing is getting the label from the block device inode; as > Stephen discovered the inode that I thought we could get the label from > turned out to be the wrong one. Afaict we would need a new hook in order > to do that, so for now I'm using the label of the proccess calling > mount. That will be OK if the mount processing checks for write access to the backing store. I haven't looked to see if it does. If it doesn't the problems should be pretty obvious. > > --- > > diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c > index a143328f75eb..8e631a66b03c 100644 > --- a/security/smack/smack_lsm.c > +++ b/security/smack/smack_lsm.c > @@ -662,6 +662,8 @@ static int smack_sb_kern_mount(struct super_block *sb, int flags, void *data) > skp = smk_of_current(); > sp->smk_root = skp; > sp->smk_default = skp; > + if (sb_in_userns(sb)) > + transmute = 1; > } > /* > * Initialize the root inode. > @@ -1023,6 +1025,12 @@ static int smack_inode_permission(struct inode *inode, int mask) > if (mask == 0) > return 0; > > + if (sb_in_userns(inode->i_sb)) { > + struct superblock_smack *sbsp = inode->i_sb->s_security; > + if (smk_of_inode(inode) != sbsp->smk_root) > + return -EACCES; > + } > + > /* May be droppable after audit */ > if (no_block) > return -ECHILD; > @@ -3220,14 +3228,16 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode) > if (rc >= 0) > transflag = SMK_INODE_TRANSMUTE; > } > - /* > - * Don't let the exec or mmap label be "*" or "@". > - */ > - skp = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp); > - if (IS_ERR(skp) || skp == &smack_known_star || > - skp == &smack_known_web) > - skp = NULL; > - isp->smk_task = skp; > + if (!sb_in_userns(inode->i_sb)) { > + /* > + * Don't let the exec or mmap label be "*" or "@". > + */ > + skp = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp); > + if (IS_ERR(skp) || skp == &smack_known_star || > + skp == &smack_known_web) > + skp = NULL; > + isp->smk_task = skp; > + } > > skp = smk_fetch(XATTR_NAME_SMACKMMAP, inode, dp); > if (IS_ERR(skp) || skp == &smack_known_star || > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Casey Schaufler <casey@schaufler-ca.com> writes: > On 7/28/2015 1:40 PM, Seth Forshee wrote: >> On Wed, Jul 22, 2015 at 05:05:17PM -0700, Casey Schaufler wrote: >>>> This is what I currently think you want for user ns mounts: >>>> >>>> 1. smk_root and smk_default are assigned the label of the backing >>>> device. >>>> 2. s_root is assigned the transmute property. >>>> 3. For existing files: >>>> a. Files with the same label as the backing device are accessible. >>>> b. Files with any other label are not accessible. >>> That's right. Accept correct data, reject anything that's not right. >>> >>>> If this is right, there are a couple lingering questions in my mind. >>>> >>>> First, what happens with files created in directories with the same >>>> label as the backing device but without the transmute property set? The >>>> inode for the new file will initially be labeled with smk_of_current(), >>>> but then during d_instantiate it will get smk_default and thus end up >>>> with the label we want. So that seems okay. >>> Yes. >>> >>>> The second is whether files with the SMACK64EXEC attribute is still a >>>> problem. It seems it is, for files with the same label as the backing >>>> store at least. I think we can simply skip the code that reads out this >>>> xattr and sets smk_task for user ns mounts, or else skip assigning the >>>> label to the new task in bprm_set_creds. The latter seems more >>>> consistent with the approach you've suggested for dealing with labels >>>> from disk. >>> Yes, I think that skipping the smk_fetch(XATTR_NAME_SMACKEXEC, ...) in >>> smack_d_instantiate for unprivileged mounts would do the trick. >>> >>>> So I guess all of that seems okay, though perhaps a bit restrictive >>>> given that the user who mounted the filesystem already has full access >>>> to the backing store. >>> In truth, there is no reason to expect that the "user" who did the >>> mount will ever have a Smack label that differs from the label of >>> the backing store. If what we've got here seems restrictive, it's >>> because you've got access from someone other than the "user". >>> >>>> Please let me know whether or not this matches up with what you are >>>> thinking, then I can procede with the implementation. >>> My current mindset is that, if you're going to allow unprivileged >>> mounts of user defined backing stores, this is as safe as we can >>> make it. >> All right, I've got a patch which I think does this, and I've managed to >> do some testing to confirm that it behaves like I expect. How does this >> look? >> >> What's missing is getting the label from the block device inode; as >> Stephen discovered the inode that I thought we could get the label from >> turned out to be the wrong one. Afaict we would need a new hook in order >> to do that, so for now I'm using the label of the proccess calling >> mount. > > That will be OK if the mount processing checks for write access to > the backing store. I haven't looked to see if it does. If it doesn't > the problems should be pretty obvious. do_new_mount vfs_kern_mount mount_fs ... mount_bdev blkdev_get_by_path(...,FMODE_READ| FMODE_WRITE | FMODE_EXCL,...) lookup_bdev kern_path filename_lookup path_lookupat lookup_last walk_component blkdev_get(...,mode,...) __blkdev_get(...,mode,...) devcgroup_inode_permission(bdev->bd_inode, perm) *scratches my head* It looks like we don't actually check the permissions on the block device. Tomoyo has a hack for it. nfsd does something. There is devcgroup silliness. But overall it looks like we depend on capable(CAP_SYS_ADMIN). Seth I do believe we have found another area of the vfs we will need to short up before allowing unprivileged mounts of block device based filesystems. It looks like there are enough hacks someone with a clue coming through and making the code make more sense seems like a good idea anyway. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 30, 2015 at 12:05:27PM -0500, Eric W. Biederman wrote: > Casey Schaufler <casey@schaufler-ca.com> writes: > > > On 7/28/2015 1:40 PM, Seth Forshee wrote: > >> On Wed, Jul 22, 2015 at 05:05:17PM -0700, Casey Schaufler wrote: > >>>> This is what I currently think you want for user ns mounts: > >>>> > >>>> 1. smk_root and smk_default are assigned the label of the backing > >>>> device. > >>>> 2. s_root is assigned the transmute property. > >>>> 3. For existing files: > >>>> a. Files with the same label as the backing device are accessible. > >>>> b. Files with any other label are not accessible. > >>> That's right. Accept correct data, reject anything that's not right. > >>> > >>>> If this is right, there are a couple lingering questions in my mind. > >>>> > >>>> First, what happens with files created in directories with the same > >>>> label as the backing device but without the transmute property set? The > >>>> inode for the new file will initially be labeled with smk_of_current(), > >>>> but then during d_instantiate it will get smk_default and thus end up > >>>> with the label we want. So that seems okay. > >>> Yes. > >>> > >>>> The second is whether files with the SMACK64EXEC attribute is still a > >>>> problem. It seems it is, for files with the same label as the backing > >>>> store at least. I think we can simply skip the code that reads out this > >>>> xattr and sets smk_task for user ns mounts, or else skip assigning the > >>>> label to the new task in bprm_set_creds. The latter seems more > >>>> consistent with the approach you've suggested for dealing with labels > >>>> from disk. > >>> Yes, I think that skipping the smk_fetch(XATTR_NAME_SMACKEXEC, ...) in > >>> smack_d_instantiate for unprivileged mounts would do the trick. > >>> > >>>> So I guess all of that seems okay, though perhaps a bit restrictive > >>>> given that the user who mounted the filesystem already has full access > >>>> to the backing store. > >>> In truth, there is no reason to expect that the "user" who did the > >>> mount will ever have a Smack label that differs from the label of > >>> the backing store. If what we've got here seems restrictive, it's > >>> because you've got access from someone other than the "user". > >>> > >>>> Please let me know whether or not this matches up with what you are > >>>> thinking, then I can procede with the implementation. > >>> My current mindset is that, if you're going to allow unprivileged > >>> mounts of user defined backing stores, this is as safe as we can > >>> make it. > >> All right, I've got a patch which I think does this, and I've managed to > >> do some testing to confirm that it behaves like I expect. How does this > >> look? > >> > >> What's missing is getting the label from the block device inode; as > >> Stephen discovered the inode that I thought we could get the label from > >> turned out to be the wrong one. Afaict we would need a new hook in order > >> to do that, so for now I'm using the label of the proccess calling > >> mount. > > > > That will be OK if the mount processing checks for write access to > > the backing store. I haven't looked to see if it does. If it doesn't > > the problems should be pretty obvious. > > > do_new_mount > vfs_kern_mount > mount_fs > ... > mount_bdev > blkdev_get_by_path(...,FMODE_READ| FMODE_WRITE | FMODE_EXCL,...) > lookup_bdev > kern_path > filename_lookup > path_lookupat > lookup_last > walk_component > blkdev_get(...,mode,...) > __blkdev_get(...,mode,...) > devcgroup_inode_permission(bdev->bd_inode, perm) > > *scratches my head* > > It looks like we don't actually check the permissions on the block > device. Tomoyo has a hack for it. nfsd does something. There is > devcgroup silliness. > > But overall it looks like we depend on capable(CAP_SYS_ADMIN). > > Seth I do believe we have found another area of the vfs we will need to > short up before allowing unprivileged mounts of block device based > filesystems. > > It looks like there are enough hacks someone with a clue coming through > and making the code make more sense seems like a good idea anyway. Yep, I just came to the same conclusion myself, and I also verified the behavior emperically. That's definitely a problem. I'll get to work on fixing that. Seth -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Seth Forshee <seth.forshee@canonical.com> writes: > On Thu, Jul 30, 2015 at 12:05:27PM -0500, Eric W. Biederman wrote: >> Casey Schaufler <casey@schaufler-ca.com> writes: >> >> > On 7/28/2015 1:40 PM, Seth Forshee wrote: >> >> On Wed, Jul 22, 2015 at 05:05:17PM -0700, Casey Schaufler wrote: >> >>>> This is what I currently think you want for user ns mounts: >> >>>> >> >>>> 1. smk_root and smk_default are assigned the label of the backing >> >>>> device. >> >>>> 2. s_root is assigned the transmute property. >> >>>> 3. For existing files: >> >>>> a. Files with the same label as the backing device are accessible. >> >>>> b. Files with any other label are not accessible. >> >>> That's right. Accept correct data, reject anything that's not right. >> >>> >> >>>> If this is right, there are a couple lingering questions in my mind. >> >>>> >> >>>> First, what happens with files created in directories with the same >> >>>> label as the backing device but without the transmute property set? The >> >>>> inode for the new file will initially be labeled with smk_of_current(), >> >>>> but then during d_instantiate it will get smk_default and thus end up >> >>>> with the label we want. So that seems okay. >> >>> Yes. >> >>> >> >>>> The second is whether files with the SMACK64EXEC attribute is still a >> >>>> problem. It seems it is, for files with the same label as the backing >> >>>> store at least. I think we can simply skip the code that reads out this >> >>>> xattr and sets smk_task for user ns mounts, or else skip assigning the >> >>>> label to the new task in bprm_set_creds. The latter seems more >> >>>> consistent with the approach you've suggested for dealing with labels >> >>>> from disk. >> >>> Yes, I think that skipping the smk_fetch(XATTR_NAME_SMACKEXEC, ...) in >> >>> smack_d_instantiate for unprivileged mounts would do the trick. >> >>> >> >>>> So I guess all of that seems okay, though perhaps a bit restrictive >> >>>> given that the user who mounted the filesystem already has full access >> >>>> to the backing store. >> >>> In truth, there is no reason to expect that the "user" who did the >> >>> mount will ever have a Smack label that differs from the label of >> >>> the backing store. If what we've got here seems restrictive, it's >> >>> because you've got access from someone other than the "user". >> >>> >> >>>> Please let me know whether or not this matches up with what you are >> >>>> thinking, then I can procede with the implementation. >> >>> My current mindset is that, if you're going to allow unprivileged >> >>> mounts of user defined backing stores, this is as safe as we can >> >>> make it. >> >> All right, I've got a patch which I think does this, and I've managed to >> >> do some testing to confirm that it behaves like I expect. How does this >> >> look? >> >> >> >> What's missing is getting the label from the block device inode; as >> >> Stephen discovered the inode that I thought we could get the label from >> >> turned out to be the wrong one. Afaict we would need a new hook in order >> >> to do that, so for now I'm using the label of the proccess calling >> >> mount. >> > >> > That will be OK if the mount processing checks for write access to >> > the backing store. I haven't looked to see if it does. If it doesn't >> > the problems should be pretty obvious. >> >> >> do_new_mount >> vfs_kern_mount >> mount_fs >> ... >> mount_bdev >> blkdev_get_by_path(...,FMODE_READ| FMODE_WRITE | FMODE_EXCL,...) >> lookup_bdev >> kern_path >> filename_lookup >> path_lookupat >> lookup_last >> walk_component >> blkdev_get(...,mode,...) >> __blkdev_get(...,mode,...) >> devcgroup_inode_permission(bdev->bd_inode, perm) >> >> *scratches my head* >> >> It looks like we don't actually check the permissions on the block >> device. Tomoyo has a hack for it. nfsd does something. There is >> devcgroup silliness. >> >> But overall it looks like we depend on capable(CAP_SYS_ADMIN). >> >> Seth I do believe we have found another area of the vfs we will need to >> short up before allowing unprivileged mounts of block device based >> filesystems. >> >> It looks like there are enough hacks someone with a clue coming through >> and making the code make more sense seems like a good idea anyway. > > Yep, I just came to the same conclusion myself, and I also verified the > behavior emperically. That's definitely a problem. I'll get to work on > fixing that. At a quick glance it looks like lookup_bdev, and most of it's callers need to be modified to do potentially do the additional permission checking. I expect we could move the devcgroup checks into whatever new checks we wind up adding. Fun, fun fun. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index a143328f75eb..8e631a66b03c 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -662,6 +662,8 @@ static int smack_sb_kern_mount(struct super_block *sb, int flags, void *data) skp = smk_of_current(); sp->smk_root = skp; sp->smk_default = skp; + if (sb_in_userns(sb)) + transmute = 1; } /* * Initialize the root inode. @@ -1023,6 +1025,12 @@ static int smack_inode_permission(struct inode *inode, int mask) if (mask == 0) return 0; + if (sb_in_userns(inode->i_sb)) { + struct superblock_smack *sbsp = inode->i_sb->s_security; + if (smk_of_inode(inode) != sbsp->smk_root) + return -EACCES; + } + /* May be droppable after audit */ if (no_block) return -ECHILD; @@ -3220,14 +3228,16 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode) if (rc >= 0) transflag = SMK_INODE_TRANSMUTE; } - /* - * Don't let the exec or mmap label be "*" or "@". - */ - skp = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp); - if (IS_ERR(skp) || skp == &smack_known_star || - skp == &smack_known_web) - skp = NULL; - isp->smk_task = skp; + if (!sb_in_userns(inode->i_sb)) { + /* + * Don't let the exec or mmap label be "*" or "@". + */ + skp = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp); + if (IS_ERR(skp) || skp == &smack_known_star || + skp == &smack_known_web) + skp = NULL; + isp->smk_task = skp; + } skp = smk_fetch(XATTR_NAME_SMACKMMAP, inode, dp); if (IS_ERR(skp) || skp == &smack_known_star ||