Message ID | 20190109162830.8309-1-omosnace@redhat.com (mailing list archive) |
---|---|
Series | Allow initializing the kernfs node's secctx based on its parent |
On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
> Changes in v2:
> - add docstring for the new hook in union security_list_options
> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>   implemented
> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>
> This series adds a new security hook that allows to initialize the security
> context of kernfs properly, taking into account the parent context. Kernfs
> nodes require special handling here, since they are not bound to specific
> inodes/superblocks, but instead represent the backing tree structure that
> is used to build the VFS tree when the kernfs tree is mounted.
>
> The kernfs nodes initially do not store any security context and rely on
> the LSM to assign some default context to inodes created over them.

This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
and expected filesystem behavior?

> Kernfs
> inodes, however, allow setting an explicit context via the *setxattr(2)
> syscalls, in which case the context is stored inside the kernfs node's
> metadata.
>
> SELinux (and possibly other LSMs) initialize the context of newly created
> FS objects based on the parent object's context (usually the child inherits
> the parent's context, unless the policy dictates otherwise).

An LSM might use information about the parent other than the "context".
Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
to determine whether the Smack label of the new object should be taken
from the parent or the process. Passing the "context" of the parent is
insufficient for Smack.

> This is done
> by hooking the creation of the new inode corresponding to the newly created
> file/directory via security_inode_init_security() (most filesystems always
> create a fresh inode when a new FS object is created). However, kernfs nodes
> can be created "behind the scenes" while the filesystem is not mounted
> anywhere and thus no inodes exist.
>
> Therefore, to allow maintaining similar behavior for kernfs nodes, a new LSM
> hook is needed, which would allow initializing the kernfs node's security
> context based on the context stored in the parent's node (if any).
>
> The main motivation for this change is that the userspace users of cgroupfs
> (which is built on kernfs) expect the usual security context inheritance
> to work under SELinux (see [1] and [2]). This functionality is required for
> better confinement of containers under SELinux.
>
> The first patch adds the new LSM hook; the second patch implements the hook
> in SELinux; and the third patch modifies kernfs to use the new hook to
> initialize the security context of kernfs nodes whenever its parent node
> has a non-default context set.
>
> Note: the patches are based on current selinux/next [3], but they seem to
> apply cleanly on top of v5.0-rc1 as well.
>
> Testing:
> - passed SELinux testsuite on Fedora 29 (x86_64) when applied on top of
>   current Rawhide kernel (5.0.0-0.rc1.git0.1) [4]
> - passed the reproducer from the last patch
>
> [1] https://github.com/SELinuxProject/selinux-kernel/issues/39
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1553803
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git/log/?h=selinux-pr-20181224
> [4] https://copr.fedorainfracloud.org/coprs/omos/kernel-testing/build/842855/
>
> Ondrej Mosnacek (3):
>   LSM: Add new hook for generic node initialization
>   selinux: Implement the object_init_security hook
>   kernfs: Initialize security of newly created nodes
>
>  fs/kernfs/dir.c             | 49 ++++++++++++++++++++++++++++++++++---
>  fs/kernfs/inode.c           |  9 +++----
>  fs/kernfs/kernfs-internal.h |  4 +++
>  include/linux/lsm_hooks.h   | 30 +++++++++++++++++++++++
>  include/linux/security.h    | 14 +++++++++++
>  security/security.c         | 10 ++++++++
>  security/selinux/hooks.c    | 41 +++++++++++++++++++++++++++++++
>  7 files changed, 149 insertions(+), 8 deletions(-)
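The inheritance behavior the cover letter asks for can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from the series: `struct node`, `node_init_security()` and `node_create()` are hypothetical names, and the model always copies the parent's context, whereas a real policy may dictate otherwise. It does mirror two points from the text: nodes exist without any inode, and `*ctx` / `*ctxlen` are reset to NULL / 0 when there is nothing to inherit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernfs node: tree metadata only, no inode. */
struct node {
    struct node *parent;
    char *secctx;      /* NULL: no explicit context stored */
};

/*
 * Hypothetical hook: compute the child's context from the parent's.
 * As in the v2 changelog, *ctx and *ctxlen are reset first, so a caller
 * sees NULL/0 when there is nothing to inherit.
 */
static int node_init_security(const char *parent_ctx, char **ctx, size_t *ctxlen)
{
    assert(ctx != NULL && ctxlen != NULL);
    *ctx = NULL;
    *ctxlen = 0;
    if (parent_ctx == NULL)
        return 0;

    size_t len = strlen(parent_ctx) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, parent_ctx, len);
    *ctx = copy;
    *ctxlen = len;
    return 0;
}

/* Create a node under parent (NULL for a root), inheriting its context. */
static struct node *node_create(struct node *parent)
{
    struct node *n = calloc(1, sizeof(*n));
    size_t len;

    if (n == NULL)
        return NULL;
    n->parent = parent;
    if (parent != NULL &&
        node_init_security(parent->secctx, &n->secctx, &len) != 0) {
        free(n);
        return NULL;
    }
    return n;
}
```

In this model a child created under an unlabeled parent stores no context at all and would fall back to whatever default the LSM assigns later, while a child created under an explicitly labeled parent gets its own copy of that label.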
On 1/9/19 12:19 PM, Casey Schaufler wrote:
> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>> The kernfs nodes initially do not store any security context and rely on
>> the LSM to assign some default context to inodes created over them.
>
> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
> and expected filesystem behavior?

sysfs / kernfs didn't support xattrs at all when we first added support for
setting security contexts to it, so originally all sysfs / kernfs inodes had
a single security context, and we only required separate storage for the
inodes that were explicitly labeled by userspace.

Later kernfs grew support for trusted.* xattrs using simple_xattrs but the
existing security.* support was left mostly unchanged.

>> SELinux (and possibly other LSMs) initialize the context of newly created
>> FS objects based on the parent object's context (usually the child inherits
>> the parent's context, unless the policy dictates otherwise).
>
> An LSM might use information about the parent other than the "context".
> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
> to determine whether the Smack label of the new object should be taken
> from the parent or the process. Passing the "context" of the parent is
> insufficient for Smack.

IIUC, this would involve switching the handling of security.* xattrs in
kernfs over to use simple_xattrs too (so that we can store multiple such
attributes), and then pass the entire simple_xattrs list or at least
anything with a security.* prefix when initializing a new node or
refreshing an existing inode. Then the security module could extract any
security.* attributes of interest for use in determining the label of new
inodes and in refreshing the label of an inode.
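The idea of handing the module a list of the parent's security.* attributes, and letting it pick out the ones it cares about, can be sketched as follows. This is a simplified userspace illustration, not Smack's implementation: `struct xattr_entry`, `xattr_find()` and `pick_label()` are hypothetical names, and the decision rule is reduced to what the thread states (take the label from the parent when the parent carries the transmute attribute, otherwise from the process). Real Smack applies further conditions that are left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat view of the parent's xattrs handed to the hook. */
struct xattr_entry {
    const char *name;
    const char *value;
};

/* Return the value stored under name, or NULL if the list lacks it. */
static const char *xattr_find(const struct xattr_entry *list, size_t n,
                              const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return list[i].value;
    return NULL;
}

/*
 * Simplified transmute-style decision: use the parent's label only if the
 * parent has both a label and the transmute attribute set to "TRUE";
 * otherwise fall back to the creating task's label.
 */
static const char *pick_label(const struct xattr_entry *parent, size_t n,
                              const char *task_label)
{
    const char *transmute = xattr_find(parent, n, "security.SMACK64TRANSMUTE");
    const char *plabel = xattr_find(parent, n, "security.SMACK64");

    assert(task_label != NULL);
    if (transmute != NULL && plabel != NULL && strcmp(transmute, "TRUE") == 0)
        return plabel;
    return task_label;
}
```

A module interested in a single attribute would call `xattr_find()` once and ignore the rest of the list, which is why passing the list does not force every module to understand every attribute.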
On 1/9/2019 12:37 PM, Stephen Smalley wrote:
> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>> and expected filesystem behavior?
>
> sysfs / kernfs didn't support xattrs at all when we first added support for
> setting security contexts to it, so originally all sysfs / kernfs inodes had
> a single security context, and we only required separate storage for the
> inodes that were explicitly labeled by userspace.
>
> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the
> existing security.* support was left mostly unchanged.

OK, so as I said, this seems like a bug in kernfs.

>> An LSM might use information about the parent other than the "context".
>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>> to determine whether the Smack label of the new object should be taken
>> from the parent or the process. Passing the "context" of the parent is
>> insufficient for Smack.
>
> IIUC, this would involve switching the handling of security.* xattrs in
> kernfs over to use simple_xattrs too (so that we can store multiple such
> attributes), and then pass the entire simple_xattrs list or at least
> anything with a security.* prefix when initializing a new node or
> refreshing an existing inode. Then the security module could extract any
> security.* attributes of interest for use in determining the label of new
> inodes and in refreshing the label of an inode.

Right. But I'll point out that there is nothing to prevent an
LSM from using inode information outside of the xattrs (e.g. uids)
to determine the security state it wants to give a new object.

I suggest that the better solution would be for kernfs to
use inodes like a real filesystem. Every special case like this
results in special cases like this special hook. It's hard
enough to keep track of the general case in the Linux kernel.
On 1/9/19 5:03 PM, Casey Schaufler wrote:
> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>> IIUC, this would involve switching the handling of security.* xattrs in
>> kernfs over to use simple_xattrs too (so that we can store multiple such
>> attributes), and then pass the entire simple_xattrs list or at least
>> anything with a security.* prefix when initializing a new node or
>> refreshing an existing inode. Then the security module could extract any
>> security.* attributes of interest for use in determining the label of new
>> inodes and in refreshing the label of an inode.
>
> Right. But I'll point out that there is nothing to prevent an
> LSM from using inode information outside of the xattrs (e.g. uids)
> to determine the security state it wants to give a new object.

If that's a real concern, the hook could pass the ia_iattr structure in
addition to the simple_xattrs list and the security module could use any
inode attributes it likes in making the decision. Effectively it would be
passing the entire kernfs_iattrs structure, but probably not directly since
that definition is presently private to kernfs.

> I suggest that the better solution would be for kernfs to
> use inodes like a real filesystem. Every special case like this
> results in special cases like this special hook. It's hard
> enough to keep track of the general case in the Linux kernel.

Feel free to propose an implementation if you like, but doing a complete
rewrite of kernfs internals seems a bit out of scope.
Resending after email configuration repair.

On 1/10/2019 6:15 AM, Stephen Smalley wrote:
> On 1/9/19 5:03 PM, Casey Schaufler wrote:
>> Right. But I'll point out that there is nothing to prevent an
>> LSM from using inode information outside of the xattrs (e.g. uids)
>> to determine the security state it wants to give a new object.
>
> If that's a real concern, the hook could pass the ia_iattr structure in
> addition to the simple_xattrs list and the security module could use any
> inode attributes it likes in making the decision. Effectively it would be
> passing the entire kernfs_iattrs structure, but probably not directly since
> that definition is presently private to kernfs.

Yes, it's a real concern. And no, just passing all of the kernfs internal data
out in j-random formats does not pass muster. Al Viro was commenting the other
day on how bad the LSM infrastructure interfaces are. The original proposal here
is already big, cluttered and inadequate. Adding more to it to make up for its
shortcomings should be sending up red flags.

I've been wallowing in the LSM infrastructure for the past seven years.
Interfaces like this one, that propagate the idiosyncrasies of both
the caller (kernfs) and one potential callee (SELinux) are much too
common. I understand that there is a problem that needs a solution.
This isn't it.

>> I suggest that the better solution would be for kernfs to
>> use inodes like a real filesystem. Every special case like this
>> results in special cases like this special hook. It's hard
>> enough to keep track of the general case in the Linux kernel.
>
> Feel free to propose an implementation if you like, but doing a complete
> rewrite of kernfs internals seems a bit out of scope.

If this issue points out a serious problem with the kernfs implementation
then I would expect that addressing the problem at its source would be in
everyone's best interest. Did anyone even look at the possibility? If I
said that I would do that, how long would you be willing to wait for it?
On 1/10/19 12:54 PM, Casey Schaufler wrote:
> On 1/10/2019 6:15 AM, Stephen Smalley wrote:
>> If that's a real concern, the hook could pass the ia_iattr structure in
>> addition to the simple_xattrs list and the security module could use any
>> inode attributes it likes in making the decision. Effectively it would be
>> passing the entire kernfs_iattrs structure, but probably not directly since
>> that definition is presently private to kernfs.
>
> Yes, it's a real concern. And no, just passing all of the kernfs internal data
> out in j-random formats does not pass muster. Al Viro was commenting the other
> day on how bad the LSM infrastructure interfaces are. The original proposal here
> is already big, cluttered and inadequate. Adding more to it to make up for its
> shortcomings should be sending up red flags

I don't quite see how the original patch set or hook can be called big and
cluttered. Switching the handling of security xattrs in kernfs to use
simple_xattrs (a natural and seemingly straightforward cleanup) and passing
the entire simple_xattrs list to the hook interface would allow you to
support SMACK64TRANSMUTE, which was the one actual inadequacy you
identified. You claim that someone might need/want the parent uid/gid too,
but there are no in-tree security modules that do so nor any submitted
AFAIK, and if that situation arises, all we need to do to support it is to
add the iattrs. Obviously they can all be wrapped up in some larger
structure if desired. At that point the security modules would have access
to all of the inode attributes supported by kernfs.

> I've been wallowing in the LSM infrastructure for the past seven years.
> Interfaces like this one, that propagate the idiosyncrasies of both
> the caller (kernfs) and one potential callee (SELinux) are much too
> common. I understand that there is a problem that needs a solution.
> This isn't it.

The solution sketched above should be capable of supporting the needs of
any current security modules and of being easily extended for others if the
need arises.

>> Feel free to propose an implementation if you like, but doing a complete
>> rewrite of kernfs internals seems a bit out of scope.
>
> If this issue points out a serious problem with the kernfs implementation
> then I would expect that addressing the problem at its source would be in
> everyone's best interest. Did anyone even look at the possibility? If I
> said that I would do that, how long would you be willing to wait for it?

I don't know that it points to a serious problem with kernfs. But I'll let
you convince the kernfs maintainers of that. Meanwhile, we have a proposed
solution that solves the problem for all in-tree security modules. I see no
reason to hold that up. Don't over-design.
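The suggestion of wrapping the parent's inode attributes and xattrs "in some larger structure" might look roughly like the userspace sketch below. Every name in it (`struct parent_attrs`, `parent_xattr()`, `xattr_based_label()`, `owner_based_label()`) is hypothetical and chosen for illustration; it is not the kernfs_iattrs layout nor any proposed kernel interface. The owner-based module is invented purely to show the extension point, since the thread notes that no in-tree module uses the parent's uid/gid this way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct attr_pair {
    const char *name;
    const char *value;
};

/* Hypothetical bundle of everything a module might consult on the parent. */
struct parent_attrs {
    unsigned int uid;
    unsigned int gid;
    const struct attr_pair *xattrs;
    size_t nr_xattrs;
};

/* Look up one xattr by full name; returns NULL if absent. */
static const char *parent_xattr(const struct parent_attrs *pa, const char *name)
{
    assert(pa != NULL && name != NULL);
    for (size_t i = 0; i < pa->nr_xattrs; i++)
        if (strcmp(pa->xattrs[i].name, name) == 0)
            return pa->xattrs[i].value;
    return NULL;
}

/* Module that consults a single xattr and otherwise uses a default. */
static const char *xattr_based_label(const struct parent_attrs *pa,
                                     const char *dflt)
{
    const char *ctx = parent_xattr(pa, "security.selinux");
    return ctx != NULL ? ctx : dflt;
}

/* Purely hypothetical module that keys off the parent's owner instead. */
static const char *owner_based_label(const struct parent_attrs *pa)
{
    assert(pa != NULL);
    return pa->uid == 0 ? "root_owned" : "user_owned";
}
```

Both modules receive the same bundle and each reads only the fields it needs, so adding a field later would not change existing callers in this model.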
On Thu, Jan 10, 2019 at 2:36 PM Stephen Smalley <sds@tycho.nsa.gov> wrote: > On 1/10/19 12:54 PM, Casey Schaufler wrote: > > Resending after email configuration repair. > > > > On 1/10/2019 6:15 AM, Stephen Smalley wrote: > >> On 1/9/19 5:03 PM, Casey Schaufler wrote: > >>> On 1/9/2019 12:37 PM, Stephen Smalley wrote: > >>>> On 1/9/19 12:19 PM, Casey Schaufler wrote: > >>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote: > >>>>>> Changes in v2: > >>>>>> - add docstring for the new hook in union security_list_options > >>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not > >>>>>> implemented > >>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ > >>>>>> > >>>>>> This series adds a new security hook that allows to initialize the security > >>>>>> context of kernfs properly, taking into account the parent context. Kernfs > >>>>>> nodes require special handling here, since they are not bound to specific > >>>>>> inodes/superblocks, but instead represent the backing tree structure that > >>>>>> is used to build the VFS tree when the kernfs tree is mounted. > >>>>>> > >>>>>> The kernfs nodes initially do not store any security context and rely on > >>>>>> the LSM to assign some default context to inodes created over them. > >>>>> > >>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual > >>>>> and expected filesystem behavior? > >>>> > >>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace. > >>>> > >>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged. > >>> > >>> OK, so as I said, this seems like a bug in kernfs. 
> >>> > >>>> > >>>>> > >>>>>> Kernfs > >>>>>> inodes, however, allow setting an explicit context via the *setxattr(2) > >>>>>> syscalls, in which case the context is stored inside the kernfs node's > >>>>>> metadata. > >>>>>> > >>>>>> SELinux (and possibly other LSMs) initialize the context of newly created > >>>>>> FS objects based on the parent object's context (usually the child inherits > >>>>>> the parent's context, unless the policy dictates otherwise). > >>>>> > >>>>> An LSM might use information about the parent other than the "context". > >>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent > >>>>> to determine whether the Smack label of the new object should be taken > >>>>> from the parent or the process. Passing the "context" of the parent is > >>>>> insufficient for Smack. > >>>> > >>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode. Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode. > >>> > >>> Right. But I'll point out that there is nothing to prevent an > >>> LSM from using inode information outside of the xattrs (e.g. uids) > >>> to determine the security state it wants to give a new object. > >> > >> If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision. Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs. > > > > Yes, it's a real concern. 
And no, just passing all of the kernfs internal data > > out in j-random formats does not pass muster. Al Viro was commenting the other > > day on how bad the LSM infrastructure interfaces are. The original proposal here > > is already big, cluttered and inadequate. Adding more to it to make up for its > > shortcomings should be sending up red flags > > I don't quite see how the original patch set or hook can be called big > and cluttered. Switching the handling of security xattrs in kernfs to > use simple_xattrs (a natural and seemingly straightforward cleanup) and > passing the entire simple_xattrs list to the hook interface would allow > you to support SMACK64TRANSMUTE, which was the one actual inadequacy you > identified. You claim that someone might need/want the parent uid/gid > too, but there are no in-tree security modules that do so nor any > submitted AFAIK, and if that situation arises, all we need to do to > support it is to add the iattrs. Obviously they can all be wrapped up > in some larger structure if desired. At that point the security modules > would have access to all of the inode attributes supported by kernfs. I'm with Stephen on this; if Ondrej changes it over to simple_xattrs as described above so that Smack would have what it needs, I don't see why we should hold off on this. Everything we are talking about is a kernel internal issue, we can change it as needed to take into account new LSMs or new functionality in existing LSMs. Ondrej, a gentle reminder that it would be nice to have a simple selinux-testsuite test to make sure we are labeling kernfs-based/cgroup files correctly.
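To make the point about passing the whole security.* xattr list concrete, here is a minimal userspace sketch in plain C. It is not kernel code and not the proposed hook: `struct xattr_entry`, `xattr_lookup()` and `smack_like_child_label()` are hypothetical names, and the Smack-like rule is deliberately simplified to what was described above (a transmute attribute on the parent decides whether the new object takes the parent's label or the creating task's label). The attribute names and the "TRUE" value are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a node's xattr list. */
struct xattr_entry {
	const char *name;
	const char *value;
};

/* Return the value stored under `name`, or NULL if it is absent. */
static const char *xattr_lookup(const struct xattr_entry *xattrs, size_t n,
				const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(xattrs[i].name, name) == 0)
			return xattrs[i].value;
	return NULL;
}

/*
 * Simplified Smack-like rule (an assumption, not the real Smack logic):
 * take the parent's label when the parent carries a transmute attribute,
 * otherwise fall back to the creating task's label.
 */
static const char *smack_like_child_label(const struct xattr_entry *parent,
					  size_t n, const char *task_label)
{
	const char *transmute, *label;

	assert(task_label != NULL);
	transmute = xattr_lookup(parent, n, "security.SMACK64TRANSMUTE");
	label = xattr_lookup(parent, n, "security.SMACK64");
	if (transmute && strcmp(transmute, "TRUE") == 0 && label)
		return label;
	return task_label;
}
```

A module that only cares about a single context would look up just its own attribute in the same list, which is why handing over the list rather than one context string covers both cases.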
On 1/10/2019 11:37 AM, Stephen Smalley wrote: > On 1/10/19 12:54 PM, Casey Schaufler wrote: >> >> Resending after email configuration repair. >> >> On 1/10/2019 6:15 AM, Stephen Smalley wrote: >>> On 1/9/19 5:03 PM, Casey Schaufler wrote: >>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote: >>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote: >>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote: >>>>>>> Changes in v2: >>>>>>> - add docstring for the new hook in union security_list_options >>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not >>>>>>> implemented >>>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ >>>>>>> >>>>>>> This series adds a new security hook that allows to initialize the security >>>>>>> context of kernfs properly, taking into account the parent context. Kernfs >>>>>>> nodes require special handling here, since they are not bound to specific >>>>>>> inodes/superblocks, but instead represent the backing tree structure that >>>>>>> is used to build the VFS tree when the kernfs tree is mounted. >>>>>>> >>>>>>> The kernfs nodes initially do not store any security context and rely on >>>>>>> the LSM to assign some default context to inodes created over them. >>>>>> >>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual >>>>>> and expected filesystem behavior? >>>>> >>>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace. >>>>> >>>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged. >>>> >>>> OK, so as I said, this seems like a bug in kernfs. 
>>>> >>>>> >>>>>> >>>>>>> Kernfs >>>>>>> inodes, however, allow setting an explicit context via the *setxattr(2) >>>>>>> syscalls, in which case the context is stored inside the kernfs node's >>>>>>> metadata. >>>>>>> >>>>>>> SELinux (and possibly other LSMs) initialize the context of newly created >>>>>>> FS objects based on the parent object's context (usually the child inherits >>>>>>> the parent's context, unless the policy dictates otherwise). >>>>>> >>>>>> An LSM might use information about the parent other than the "context". >>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent >>>>>> to determine whether the Smack label of the new object should be taken >>>>>> from the parent or the process. Passing the "context" of the parent is >>>>>> insufficient for Smack. >>>>> >>>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode. Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode. >>>> >>>> Right. But I'll point out that there is nothing to prevent an >>>> LSM from using inode information outside of the xattrs (e.g. uids) >>>> to determine the security state it wants to give a new object. >>> >>> If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision. Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs. >> >> Yes, it's a real concern. And no, just passing all of the kernfs internal data >> out in j-random formats does not pass muster. 
Al Viro was commenting the other >> day on how bad the LSM infrastructure interfaces are. The original proposal here >> is already big, cluttered and inadequate. Adding more to it to make up for its >> shortcomings should be sending up red flags > I don't quite see how the original patch set or hook can be called big and cluttered. Switching the handling of security xattrs in kernfs to use simple_xattrs (a natural and seemingly straightforward cleanup) and passing the entire simple_xattrs list to the hook interface would allow you to support SMACK64TRANSMUTE, which was the one actual inadequacy you identified. You claim that someone might need/want the parent uid/gid too, but there are no in-tree security modules that do so nor any submitted AFAIK, and if that situation arises, all we need to do to support it is to add the iattrs. Obviously they can all be wrapped up in some larger structure if desired. At that point the security modules would have access to all of the inode attributes supported by kernfs. We already have a structure to wrap all this in. It's an inode. But, as you point out, there are no in tree LSMs that use anything beyond the xattrs. So the change you're suggesting is arguably sufficient, and considerably easier. > >> I've been wallowing in the LSM infrastructure for the past seven years. >> Interfaces like this one, that propagate the idiosyncrasies of both >> the caller (kernfs) and one potential callee (SELinux) are much too >> common. I understand that there is a problem that needs a solution. >> This isn't it. > The solution sketched above should be capable of supporting the needs of any current security modules and of being easily extended for others if the need arises. It, like security_inode_init(), is going to be challenging to get right in the stacking environment. But, that's my problem. > >> >> >>>> I suggest that the better solution would be for kernfs to >>>> use inodes like a real filesystem. 
Every special case like this >>>> results in special cases like this special hook. It's hard >>>> enough to keep track of the general case in the Linux kernel. >>> >>> Feel free to propose an implementation if you like, but doing a complete rewrite of kernfs internals seems a bit out of scope. >> >> If this issue points out a serious problem with the kernfs implementation >> then I would expect that addressing the problem at its source would be in >> everyone's best interest. Did anyone even look at the possibility? If I >> said that I would do that, how long would you be willing to wait for it? > > I don't know that it points to a serious problem with kernfs. But I'll let you convince the kernfs maintainers of that. Meanwhile, we have a proposed solution that solves the problem for all in-tree security modules. I see no reason to hold that up. Don't over-design. I think the patch presented was hasty. It clearly didn't account for any LSM but SELinux. I understand why. Adding an LSM interface needs to account for the entire security sub-system.
On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler <casey@schaufler-ca.com> wrote: > Resending after email configuration repair. > > On 1/10/2019 6:15 AM, Stephen Smalley wrote: > > On 1/9/19 5:03 PM, Casey Schaufler wrote: > >> On 1/9/2019 12:37 PM, Stephen Smalley wrote: > >>> On 1/9/19 12:19 PM, Casey Schaufler wrote: > >>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote: > >>>>> Changes in v2: > >>>>> - add docstring for the new hook in union security_list_options > >>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not > >>>>> implemented > >>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ > >>>>> > >>>>> This series adds a new security hook that allows to initialize the security > >>>>> context of kernfs properly, taking into account the parent context. Kernfs > >>>>> nodes require special handling here, since they are not bound to specific > >>>>> inodes/superblocks, but instead represent the backing tree structure that > >>>>> is used to build the VFS tree when the kernfs tree is mounted. > >>>>> > >>>>> The kernfs nodes initially do not store any security context and rely on > >>>>> the LSM to assign some default context to inodes created over them. > >>>> > >>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual > >>>> and expected filesystem behavior? > >>> > >>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace. > >>> > >>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged. > >> > >> OK, so as I said, this seems like a bug in kernfs. 
> >> > >>> > >>>> > >>>>> Kernfs > >>>>> inodes, however, allow setting an explicit context via the *setxattr(2) > >>>>> syscalls, in which case the context is stored inside the kernfs node's > >>>>> metadata. > >>>>> > >>>>> SELinux (and possibly other LSMs) initialize the context of newly created > >>>>> FS objects based on the parent object's context (usually the child inherits > >>>>> the parent's context, unless the policy dictates otherwise). > >>>> > >>>> An LSM might use information about the parent other than the "context". > >>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent > >>>> to determine whether the Smack label of the new object should be taken > >>>> from the parent or the process. Passing the "context" of the parent is > >>>> insufficient for Smack. > >>> > >>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode. Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode. I actually had a patch to do just that at one point, because I thought for a while that it would be required in order to call security_inode_init_security() (which I had tried to somehow force into the kernfs node creation at some point), but then I realized it is not actually needed (although it would make things a bit nicer) and put it away... I will try to dig it out and reuse it here. > >> > >> Right. But I'll point out that there is nothing to prevent an > >> LSM from using inode information outside of the xattrs (e.g. uids) > >> to determine the security state it wants to give a new object.
> > > > If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision. Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs. > > Yes, it's a real concern. And no, just passing all of the kernfs internal data > out in j-random formats does not pass muster. Al Viro was commenting the other > day on how bad the LSM infrastructure interfaces are. The original proposal here > is already big, cluttered and inadequate. Adding more to it to make up for its > shortcomings should be sending up red flags. I understand the concern about cluttering up things, but I just don't see any nicer solution right now... > > I've been wallowing in the LSM infrastructure for the past seven years. > Interfaces like this one, that propagate the idiosyncrasies of both > the caller (kernfs) and one potential callee (SELinux) are much too > common. I understand that there is a problem that needs a solution. > This isn't it. > > > >> I suggest that the better solution would be for kernfs to > >> use inodes like a real filesystem. Every special case like this > >> results in special cases like this special hook. It's hard > >> enough to keep track of the general case in the Linux kernel. > > > > Feel free to propose an implementation if you like, but doing a complete rewrite of kernfs internals seems a bit out of scope. > > If this issue points out a serious problem with the kernfs implementation > then I would expect that addressing the problem at its source would be in > everyone's best interest. Did anyone even look at the possibility? If I > said that I would do that, how long would you be willing to wait for it? 
Granted, the "inodeless" abstractions in kernfs have perhaps gone too far, but I believe that trying to undo them would just shift the complexity into kernfs and its users... IMHO this solution (with the changes proposed by Stephen) is not overly invasive and does not make a potential future rework of kernfs and its handling by LSMs much more difficult than it would be now. I'd prefer to apply an imperfect but noninvasive solution to a practical problem now and leave extensive refactoring as a separate task/discussion. (Anyway, I don't want to rush this. I'll keep sending patches and hopefully we'll eventually converge on some solution acceptable to everyone.) -- Ondrej Mosnacek <omosnace at redhat dot com> Associate Software Engineer, Security Technologies Red Hat, Inc.
On Fri, Jan 11, 2019 at 3:44 AM Paul Moore <paul@paul-moore.com> wrote: > [...] > > Ondrej, a gentle reminder that it would be nice to have a simple > selinux-testsuite test to make sure we are labeling > kernfs-based/cgroup files correctly. OK, I'll see if I can adapt the reproducer from the last patch into the testsuite. -- Ondrej Mosnacek <omosnace at redhat dot com> Associate Software Engineer, Security Technologies Red Hat, Inc.
On Mon, Jan 14, 2019 at 10:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: > On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler <casey@schaufler-ca.com> wrote: > > Resending after email configuration repair. > > > > On 1/10/2019 6:15 AM, Stephen Smalley wrote: > > > On 1/9/19 5:03 PM, Casey Schaufler wrote: > > >> On 1/9/2019 12:37 PM, Stephen Smalley wrote: > > >>> On 1/9/19 12:19 PM, Casey Schaufler wrote: > > >>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote: > > >>>>> Changes in v2: > > >>>>> - add docstring for the new hook in union security_list_options > > >>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not > > >>>>> implemented > > >>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ > > >>>>> > > >>>>> This series adds a new security hook that allows to initialize the security > > >>>>> context of kernfs properly, taking into account the parent context. Kernfs > > >>>>> nodes require special handling here, since they are not bound to specific > > >>>>> inodes/superblocks, but instead represent the backing tree structure that > > >>>>> is used to build the VFS tree when the kernfs tree is mounted. > > >>>>> > > >>>>> The kernfs nodes initially do not store any security context and rely on > > >>>>> the LSM to assign some default context to inodes created over them. > > >>>> > > >>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual > > >>>> and expected filesystem behavior? > > >>> > > >>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace. > > >>> > > >>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged. > > >> > > >> OK, so as I said, this seems like a bug in kernfs. 
> > >> > > >>> > > >>>> > > >>>>> Kernfs > > >>>>> inodes, however, allow setting an explicit context via the *setxattr(2) > > >>>>> syscalls, in which case the context is stored inside the kernfs node's > > >>>>> metadata. > > >>>>> > > >>>>> SELinux (and possibly other LSMs) initialize the context of newly created > > >>>>> FS objects based on the parent object's context (usually the child inherits > > >>>>> the parent's context, unless the policy dictates otherwise). > > >>>> > > >>>> An LSM might use information about the parent other than the "context". > > >>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent > > >>>> to determine whether the Smack label of the new object should be taken > > >>>> from the parent or the process. Passing the "context" of the parent is > > >>>> insufficient for Smack. > > >>> > > >>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode. Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode. > > I actually had a patch to do just that at one point because I thought > for a while that it would be required to call > security_inode_init_security() (which I had tried to somehow force > into the kernfs node creation at some point), but then I realized it > is not actually needed (although would make thing a bit nicer) and put > it away... I will try to dig it out and reuse here. Okay, now that I tried to do this with full xattr support I ran into a problem. 
Along with converting kernfs to use simple_xattrs for security attributes, I removed the call to security_inode_notifysecctx() from kernfs_refresh_inode(), as it no longer makes sense (kernfs doesn't know which attribute contains the context; the LSM should now be able to pull it out via vfs_getxattr()). However, SELinux now doesn't set the right security context in the selinux_d_instantiate() hook, because the policy tells it to use genfs, not xattr. So... I'm not sure how to fix this. Setting fs_use_xattr for cgroupfs in the policy won't work, because then all nodes will be unlabeled_t by default. Maybe we could patch the genfs case in inode_doinit_with_dentry() to try fetching the xattr first? I'm not very confident about touching that part of the code, so I would welcome some advice here. This is the code I have so far, in case it helps: https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v8 Thanks,
On 1/22/19 3:49 AM, Ondrej Mosnacek wrote: > On Mon, Jan 14, 2019 at 10:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote: >> On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler <casey@schaufler-ca.com> wrote: >>> Resending after email configuration repair. >>> >>> On 1/10/2019 6:15 AM, Stephen Smalley wrote: >>>> On 1/9/19 5:03 PM, Casey Schaufler wrote: >>>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote: >>>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote: >>>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote: >>>>>>>> Changes in v2: >>>>>>>> - add docstring for the new hook in union security_list_options >>>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not >>>>>>>> implemented >>>>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ >>>>>>>> >>>>>>>> This series adds a new security hook that allows to initialize the security >>>>>>>> context of kernfs properly, taking into account the parent context. Kernfs >>>>>>>> nodes require special handling here, since they are not bound to specific >>>>>>>> inodes/superblocks, but instead represent the backing tree structure that >>>>>>>> is used to build the VFS tree when the kernfs tree is mounted. >>>>>>>> >>>>>>>> The kernfs nodes initially do not store any security context and rely on >>>>>>>> the LSM to assign some default context to inodes created over them. >>>>>>> >>>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual >>>>>>> and expected filesystem behavior? >>>>>> >>>>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace. >>>>>> >>>>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged. 
>>>>> >>>>> OK, so as I said, this seems like a bug in kernfs. >>>>> >>>>>> >>>>>>> >>>>>>>> Kernfs >>>>>>>> inodes, however, allow setting an explicit context via the *setxattr(2) >>>>>>>> syscalls, in which case the context is stored inside the kernfs node's >>>>>>>> metadata. >>>>>>>> >>>>>>>> SELinux (and possibly other LSMs) initialize the context of newly created >>>>>>>> FS objects based on the parent object's context (usually the child inherits >>>>>>>> the parent's context, unless the policy dictates otherwise). >>>>>>> >>>>>>> An LSM might use information about the parent other than the "context". >>>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent >>>>>>> to determine whether the Smack label of the new object should be taken >>>>>>> from the parent or the process. Passing the "context" of the parent is >>>>>>> insufficient for Smack. >>>>>> >>>>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode. Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode. >> >> I actually had a patch to do just that at one point because I thought >> for a while that it would be required to call >> security_inode_init_security() (which I had tried to somehow force >> into the kernfs node creation at some point), but then I realized it >> is not actually needed (although would make thing a bit nicer) and put >> it away... I will try to dig it out and reuse here. > > Okay, now that I tried to do this with full xattr support I ran into a > problem. 
Along with converting kernfs to use simple_xattrs for > security attributes, I removed the call to > security_inode_notifysecctx() from kernfs_refresh_inode(), as it no > longer makes sense (kernfs doesn't know which attribute contains the > context; the LSM should now be able to pull it out via > vfs_getxattr()). However, SELinux now doesn't set the right security > context in the selinux_d_instantiate() hook, because the policy tells > it to use genfs, not xattr. > > So... I'm not sure how to fix this. Setting fs_use_xattr for cgroupfs > in the policy won't work, because then all nodes will be unlabeled_t > by default. Maybe we could patch the genfs case in > inode_doinit_with_dentry() to try fetching the xattr first? I'm not > very confident about touching that part of the code, so I would > welcome some advice here. > > This is the code I have so far, in case it helps: > https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v8 I would have kept security_inode_notifysecctx(), or an equivalent that passes all of the xattrs, to push the security attributes to the security module. Blindly calling __vfs_getxattr() on genfs could be a problem; IIRC, doing so on fuse filesystems can create a deadlock during mount. Or at least that was the issue with switching fuse to fs_use_xattr in the past.
On 1/22/19 9:17 AM, Stephen Smalley wrote: > On 1/22/19 3:49 AM, Ondrej Mosnacek wrote: >> On Mon, Jan 14, 2019 at 10:01 AM Ondrej Mosnacek <omosnace@redhat.com> >> wrote: >>> On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler >>> <casey@schaufler-ca.com> wrote: >>>> Resending after email configuration repair. >>>> >>>> On 1/10/2019 6:15 AM, Stephen Smalley wrote: >>>>> On 1/9/19 5:03 PM, Casey Schaufler wrote: >>>>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote: >>>>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote: >>>>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote: >>>>>>>>> Changes in v2: >>>>>>>>> - add docstring for the new hook in union security_list_options >>>>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not >>>>>>>>> implemented >>>>>>>>> v1: >>>>>>>>> https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ >>>>>>>>> >>>>>>>>> >>>>>>>>> This series adds a new security hook that allows to initialize >>>>>>>>> the security >>>>>>>>> context of kernfs properly, taking into account the parent >>>>>>>>> context. Kernfs >>>>>>>>> nodes require special handling here, since they are not bound >>>>>>>>> to specific >>>>>>>>> inodes/superblocks, but instead represent the backing tree >>>>>>>>> structure that >>>>>>>>> is used to build the VFS tree when the kernfs tree is mounted. >>>>>>>>> >>>>>>>>> The kernfs nodes initially do not store any security context >>>>>>>>> and rely on >>>>>>>>> the LSM to assign some default context to inodes created over >>>>>>>>> them. >>>>>>>> >>>>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to >>>>>>>> the usual >>>>>>>> and expected filesystem behavior? >>>>>>> >>>>>>> sysfs / kernfs didn't support xattrs at all when we first added >>>>>>> support for setting security contexts to it, so originally all >>>>>>> sysfs / kernfs inodes had a single security context, and we only >>>>>>> required separate storage for the inodes that were explicitly >>>>>>> labeled by userspace. 
>>>>>>> >>>>>>> Later kernfs grew support for trusted.* xattrs using >>>>>>> simple_xattrs but the existing security.* support was left mostly >>>>>>> unchanged. >>>>>> >>>>>> OK, so as I said, this seems like a bug in kernfs. >>>>>> >>>>>>> >>>>>>>> >>>>>>>>> Kernfs >>>>>>>>> inodes, however, allow setting an explicit context via the >>>>>>>>> *setxattr(2) >>>>>>>>> syscalls, in which case the context is stored inside the kernfs >>>>>>>>> node's >>>>>>>>> metadata. >>>>>>>>> >>>>>>>>> SELinux (and possibly other LSMs) initialize the context of >>>>>>>>> newly created >>>>>>>>> FS objects based on the parent object's context (usually the >>>>>>>>> child inherits >>>>>>>>> the parent's context, unless the policy dictates otherwise). >>>>>>>> >>>>>>>> An LSM might use information about the parent other than the >>>>>>>> "context". >>>>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the >>>>>>>> parent >>>>>>>> to determine whether the Smack label of the new object should be >>>>>>>> taken >>>>>>>> from the parent or the process. Passing the "context" of the >>>>>>>> parent is >>>>>>>> insufficient for Smack. >>>>>>> >>>>>>> IIUC, this would involve switching the handling of security.* >>>>>>> xattrs in kernfs over to use simple_xattrs too (so that we can >>>>>>> store multiple such attributes), and then pass the entire >>>>>>> simple_xattrs list or at least anything with a security.* prefix >>>>>>> when initializing a new node or refreshing an existing inode. >>>>>>> Then the security module could extract any security.* attributes >>>>>>> of interest for use in determining the label of new inodes and in >>>>>>> refreshing the label of an inode. 
>>> >>> I actually had a patch to do just that at one point because I thought >>> for a while that it would be required to call >>> security_inode_init_security() (which I had tried to somehow force >>> into the kernfs node creation at some point), but then I realized it >>> is not actually needed (although would make thing a bit nicer) and put >>> it away... I will try to dig it out and reuse here. >> >> Okay, now that I tried to do this with full xattr support I ran into a >> problem. Along with converting kernfs to use simple_xattrs for >> security attributes, I removed the call to >> security_inode_notifysecctx() from kernfs_refresh_inode(), as it no >> longer makes sense (kernfs doesn't know which attribute contains the >> context; the LSM should now be able to pull it out via >> vfs_getxattr()). However, SELinux now doesn't set the right security >> context in the selinux_d_instantiate() hook, because the policy tells >> it to use genfs, not xattr. >> >> So... I'm not sure how to fix this. Setting fs_use_xattr for cgroupfs >> in the policy won't work, because then all nodes will be unlabeled_t >> by default. Maybe we could patch the genfs case in >> inode_doinit_with_dentry() to try fetching the xattr first? I'm not >> very confident about touching that part of the code, so I would >> welcome some advice here. >> >> This is the code I have so far, in case it helps: >> https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v8 >> > > I would have left security_inode_notifysecctx() or an equivalent that > passes all of the xattrs to push the security attributes to the security > module. > > Blindly calling __vfs_getxattr() on genfs could be a problem; IIRC, > doing so on fuse filesytems can create a deadlock during mount. Or at > least that was the issue with switching fuse to fs_use_xattr in the past. 
See commits 4d546f81717d253ab67643bf072c6d8821a9249c, 102aefdda4d8275ce7d7100bc16c88c74272b260, 089be43e403a78cd6889cde2fba164fefe9dfd89, 811f3799279e567aa354c649ce22688d949ac7a9 and https://bugzilla.redhat.com/show_bug.cgi?id=1256635#c34 for some prior work and discussions in this area.
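The trade-off discussed above (prefer an explicitly stored label over the genfs default, but do not blindly read xattrs on every genfs-labeled filesystem) can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the SELinux implementation: `struct fs_model`, `struct node_model`, the `xattr_lookup_safe` opt-in flag and `resolve_label()` are all hypothetical, and the label strings used with it are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a filesystem type as seen by the labeling code. */
struct fs_model {
	const char *genfs_label;  /* default label the policy assigns (genfs) */
	bool xattr_lookup_safe;   /* opt-in: reading the stored label cannot block */
};

/* Hypothetical model of a node; stored_label is NULL unless set explicitly. */
struct node_model {
	const struct fs_model *fs;
	const char *stored_label;
};

/*
 * Prefer an explicitly stored label, but only on filesystems that opted in;
 * every other filesystem keeps the plain genfs default and is never queried.
 */
static const char *resolve_label(const struct node_model *node)
{
	assert(node != NULL && node->fs != NULL);
	if (node->fs->xattr_lookup_safe && node->stored_label)
		return node->stored_label;
	return node->fs->genfs_label;
}
```

With this shape, a kernfs-backed filesystem could opt in while fuse stays on the genfs-only path, which sidesteps the mount-time deadlock concern at the cost of one extra per-filesystem flag.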