Message ID | 20210316173226.2220046-1-omosnace@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Paul Moore |
Headers | show |
Series | xfs: use has_capability_noaudit() instead of capable() where appropriate | expand |
On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote: > In cases when a negative result of a capability check doesn't lead to an > immediate, user-visible error, only a subtle difference in behavior, it > is better to use has_capability_noaudit(current, ...), so that LSMs > (e.g. SELinux) don't generate a denial record in the audit log each time > the capability status is queried. This patch should cover all such cases > in fs/xfs/. Is this something new? I only see 4 calls to has_capability_noaudit() in 5.12-rc3... Also, has_capability_noaudit() is an awful name. capable_noaudit() would actually be self explanatory to anyone who is used to doing capability checks via capable(), ns_capable(), ns_capable_noaudit(), inode_owner_or_capable(), capable_wrt_inode_uidgid(), etc... Please fix the name of this function to be consistent with the existing capability APIs before propagating it further into the kernel. > Note that I kept the capable(CAP_FSETID) checks, since these will only > be executed if the user explicitly tries to set the SUID/SGID bit, and > it likely makes sense to log such attempts even if the syscall doesn't > fail and just ignores the bits. So how on earth are we supposed to maintain this code correctly? These are undocumented rules that seemingly are applied to random subsystems and to seemingly random capable() calls in those subsystems. ANd you don't even document it in this code where there are other capable(...) checks that will generate audit records... How are we supposed to know when an audit record should be emitted or not by some unknown LSM when we do a capability check? Capabilities are already an awful nightmare maze of similar but slightly different capability checks, and this doesn't improve the situation at all. Please make this easier to get right iand maintain correctly (an absolute, non-negotiable requirement for anything security related) before you spray yet another poorly documented capability function into the wider kernel. > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> > --- > fs/xfs/xfs_fsmap.c | 4 ++-- > fs/xfs/xfs_ioctl.c | 5 ++++- > fs/xfs/xfs_iops.c | 6 ++++-- > fs/xfs/xfs_xattr.c | 2 +- > 4 files changed, 11 insertions(+), 6 deletions(-) > > diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c > index 9ce5e7d5bf8f..14672e7ee535 100644 > --- a/fs/xfs/xfs_fsmap.c > +++ b/fs/xfs/xfs_fsmap.c > @@ -842,8 +842,8 @@ xfs_getfsmap( > !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1])) > return -EINVAL; > > - use_rmap = capable(CAP_SYS_ADMIN) && > - xfs_sb_version_hasrmapbt(&mp->m_sb); > + use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) && > + has_capability_noaudit(current, CAP_SYS_ADMIN); > head->fmh_entries = 0; > > /* Set up our device handlers. */ > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c > index 3fbd98f61ea5..3cfc1a25069c 100644 > --- a/fs/xfs/xfs_ioctl.c > +++ b/fs/xfs/xfs_ioctl.c > @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr( > > if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) && > ip->i_d.di_projid != fa->fsx_projid) { > + int flags = has_capability_noaudit(current, CAP_FOWNER) ? > + XFS_QMOPT_FORCE_RES : 0; > + > code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp, > - capable(CAP_FOWNER) ? XFS_QMOPT_FORCE_RES : 0); > + flags); > if (code) /* out of quota */ > goto error_trans_cancel; > } You missed a capable() call here - see the call to xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in xfs_ioctl_setattr_get_trans(). > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > index 67c8dc9de8aa..abbb417c4fbd 100644 > --- a/fs/xfs/xfs_iops.c > +++ b/fs/xfs/xfs_iops.c > @@ -729,10 +729,12 @@ xfs_setattr_nonsize( > if (XFS_IS_QUOTA_RUNNING(mp) && > ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) || > (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) { > + int flags = has_capability_noaudit(current, CAP_FOWNER) ? > + XFS_QMOPT_FORCE_RES : 0; > + > ASSERT(tp); > error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp, > - NULL, capable(CAP_FOWNER) ? > - XFS_QMOPT_FORCE_RES : 0); > + NULL, flags); > if (error) /* out of quota */ > goto out_cancel; > } You missed a capable() call here - see the call to xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in this function. I think this demonstrates just how fragile and hard to maintain the approach being taken here is. > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c > index bca48b308c02..a99d19c2c11f 100644 > --- a/fs/xfs/xfs_xattr.c > +++ b/fs/xfs/xfs_xattr.c > @@ -164,7 +164,7 @@ xfs_xattr_put_listent( > * Only show root namespace entries if we are actually allowed to > * see them. > */ > - if (!capable(CAP_SYS_ADMIN)) > + if (!has_capability_noaudit(current, CAP_SYS_ADMIN)) > return; > > prefix = XATTR_TRUSTED_PREFIX; This one should absolutely report a denial - someone has tried to read the trusted xattr namespace without permission to do so. That's exactly the sort of thing I'd want to see in an audit log - just because we just elide the xattrs rather than return an error doesn't mean we should not leave an audit trail from the attempted access of kernel trusted attributes... Cheers, Dave.
On Tue, Mar 16, 2021 at 10:09 PM Dave Chinner <david@fromorbit.com> wrote: > On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote: > > In cases when a negative result of a capability check doesn't lead to an > > immediate, user-visible error, only a subtle difference in behavior, it > > is better to use has_capability_noaudit(current, ...), so that LSMs > > (e.g. SELinux) don't generate a denial record in the audit log each time > > the capability status is queried. This patch should cover all such cases > > in fs/xfs/. > > Is this something new? I only see 4 calls to > has_capability_noaudit() in 5.12-rc3... I don't know all the history, but I don't think it's new. It's just that few people really are aware of the difference and no one from the LSM/SELinux cared enough to maintain proper use across the kernel... > > Also, has_capability_noaudit() is an awful name. capable_noaudit() > would actually be self explanatory to anyone who is used to doing > capability checks via capable(), ns_capable(), ns_capable_noaudit(), > inode_owner_or_capable(), capable_wrt_inode_uidgid(), etc... > > Please fix the name of this function to be consistent with the > existing capability APIs before propagating it further into the > kernel. That's a fair point - I should take this opportunity to add the missing function and add some documentation... I'll make sure to do better in v2. > > > Note that I kept the capable(CAP_FSETID) checks, since these will only > > be executed if the user explicitly tries to set the SUID/SGID bit, and > > it likely makes sense to log such attempts even if the syscall doesn't > > fail and just ignores the bits. > > So how on earth are we supposed to maintain this code correctly? > These are undocumented rules that seemingly are applied to random > subsystems and to seemingly random capable() calls in those > subsystems. ANd you don't even document it in this code where there > are other capable(...) checks that will generate audit records... > > How are we supposed to know when an audit record should be emitted > or not by some unknown LSM when we do a capability check? > Capabilities are already an awful nightmare maze of similar but > slightly different capability checks, and this doesn't improve the > situation at all. > > Please make this easier to get right iand maintain correctly (an > absolute, non-negotiable requirement for anything security related) > before you spray yet another poorly documented capability function > into the wider kernel. Again, you're right that I shouldn't have taken the lazy path :) > > > Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> > > --- > > fs/xfs/xfs_fsmap.c | 4 ++-- > > fs/xfs/xfs_ioctl.c | 5 ++++- > > fs/xfs/xfs_iops.c | 6 ++++-- > > fs/xfs/xfs_xattr.c | 2 +- > > 4 files changed, 11 insertions(+), 6 deletions(-) > > > > diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c > > index 9ce5e7d5bf8f..14672e7ee535 100644 > > --- a/fs/xfs/xfs_fsmap.c > > +++ b/fs/xfs/xfs_fsmap.c > > @@ -842,8 +842,8 @@ xfs_getfsmap( > > !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1])) > > return -EINVAL; > > > > - use_rmap = capable(CAP_SYS_ADMIN) && > > - xfs_sb_version_hasrmapbt(&mp->m_sb); > > + use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) && > > + has_capability_noaudit(current, CAP_SYS_ADMIN); > > head->fmh_entries = 0; > > > > /* Set up our device handlers. */ > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c > > index 3fbd98f61ea5..3cfc1a25069c 100644 > > --- a/fs/xfs/xfs_ioctl.c > > +++ b/fs/xfs/xfs_ioctl.c > > @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr( > > > > if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) && > > ip->i_d.di_projid != fa->fsx_projid) { > > + int flags = has_capability_noaudit(current, CAP_FOWNER) ? > > + XFS_QMOPT_FORCE_RES : 0; > > + > > code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp, > > - capable(CAP_FOWNER) ? XFS_QMOPT_FORCE_RES : 0); > > + flags); > > if (code) /* out of quota */ > > goto error_trans_cancel; > > } > > You missed a capable() call here - see the call to > xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in > xfs_ioctl_setattr_get_trans(). Ah, I mistakenly based the path against an old tree. Sorry, I'll redo it against xfs/for-next... > > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > > index 67c8dc9de8aa..abbb417c4fbd 100644 > > --- a/fs/xfs/xfs_iops.c > > +++ b/fs/xfs/xfs_iops.c > > @@ -729,10 +729,12 @@ xfs_setattr_nonsize( > > if (XFS_IS_QUOTA_RUNNING(mp) && > > ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) || > > (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) { > > + int flags = has_capability_noaudit(current, CAP_FOWNER) ? > > + XFS_QMOPT_FORCE_RES : 0; > > + > > ASSERT(tp); > > error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp, > > - NULL, capable(CAP_FOWNER) ? > > - XFS_QMOPT_FORCE_RES : 0); > > + NULL, flags); > > if (error) /* out of quota */ > > goto out_cancel; > > } > > You missed a capable() call here - see the call to > xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in this > function. > > I think this demonstrates just how fragile and hard to maintain the > approach being taken here is. > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c > > index bca48b308c02..a99d19c2c11f 100644 > > --- a/fs/xfs/xfs_xattr.c > > +++ b/fs/xfs/xfs_xattr.c > > @@ -164,7 +164,7 @@ xfs_xattr_put_listent( > > * Only show root namespace entries if we are actually allowed to > > * see them. > > */ > > - if (!capable(CAP_SYS_ADMIN)) > > + if (!has_capability_noaudit(current, CAP_SYS_ADMIN)) > > return; > > > > prefix = XATTR_TRUSTED_PREFIX; > > This one should absolutely report a denial - someone has tried to > read the trusted xattr namespace without permission to do so. That's > exactly the sort of thing I'd want to see in an audit log - just > because we just elide the xattrs rather than return an error doesn't > mean we should not leave an audit trail from the attempted access > of kernel trusted attributes... I'm not sure about that... without CAP_SYS_ADMIN the caller would still get the ACL xattrs, no? IIUC, it's a filter to not show restricted xattrs to unprivileged users via listxattr(2)**, where the user is not saying "give me the trusted xattrs", just "give me whatever I'm allowed to see", so logging the denial wouldn't make much sense - the user may not even care about trusted xattrs when doing the syscall (and in 99% of cases a user without CAP_SYS_ADMIN really won't). (**) But I don't understand how exactly that function is used and what the XFS_ATTR_ROOT flag means, so I may be getting it wrong... -- Ondrej Mosnacek Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.
On Thu, Mar 18, 2021 at 10:51:29AM +0100, Ondrej Mosnacek wrote: > On Tue, Mar 16, 2021 at 10:09 PM Dave Chinner <david@fromorbit.com> wrote: > > On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote: > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c > > > index bca48b308c02..a99d19c2c11f 100644 > > > --- a/fs/xfs/xfs_xattr.c > > > +++ b/fs/xfs/xfs_xattr.c > > > @@ -164,7 +164,7 @@ xfs_xattr_put_listent( > > > * Only show root namespace entries if we are actually allowed to > > > * see them. > > > */ > > > - if (!capable(CAP_SYS_ADMIN)) > > > + if (!has_capability_noaudit(current, CAP_SYS_ADMIN)) > > > return; > > > > > > prefix = XATTR_TRUSTED_PREFIX; > > > > This one should absolutely report a denial - someone has tried to > > read the trusted xattr namespace without permission to do so. That's > > exactly the sort of thing I'd want to see in an audit log - just > > because we just elide the xattrs rather than return an error doesn't > > mean we should not leave an audit trail from the attempted access > > of kernel trusted attributes... > > I'm not sure about that... without CAP_SYS_ADMIN the caller would > still get the ACL xattrs, no? Looks like it, but I have no idea if that's even correct behaviour or not - access to posix ACL is supposed to be controlled by the VFS. > IIUC, it's a filter to not show > restricted xattrs to unprivileged users via listxattr(2)**, where the > user is not saying "give me the trusted xattrs", just "give me > whatever I'm allowed to see", so logging the denial wouldn't make much > sense - the user may not even care about trusted xattrs when doing the > syscall (and in 99% of cases a user without CAP_SYS_ADMIN really > won't). Ok, I keep forgetting that only XFS has attr_list(3) interfaces (which predate Linux VFS/syscall xattr support, IIRC) and that interface requires userspace to pass ATTR_ROOT to retrieve trusted namespace xattrs... For while it's a silent content filter for one user interface, it's an explicit request from another user interface. Make of that what you will.... > (**) But I don't understand how exactly that function is used and what > the XFS_ATTR_ROOT flag means, so I may be getting it wrong... It's the on-disk format flag that says the xattr is in the "root" namespace (i.e. requires "root" permissions to access). XFS came from Irix which had different xattr namespaces for system, security, etc, hence the stuff is stored on XFS is named somewhat differently compared to native linux filesystems.... Cheers, Dave.
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 9ce5e7d5bf8f..14672e7ee535 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -842,8 +842,8 @@ xfs_getfsmap( !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1])) return -EINVAL; - use_rmap = capable(CAP_SYS_ADMIN) && - xfs_sb_version_hasrmapbt(&mp->m_sb); + use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) && + has_capability_noaudit(current, CAP_SYS_ADMIN); head->fmh_entries = 0; /* Set up our device handlers. */ diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 3fbd98f61ea5..3cfc1a25069c 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr( if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) && ip->i_d.di_projid != fa->fsx_projid) { + int flags = has_capability_noaudit(current, CAP_FOWNER) ? + XFS_QMOPT_FORCE_RES : 0; + code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp, - capable(CAP_FOWNER) ? XFS_QMOPT_FORCE_RES : 0); + flags); if (code) /* out of quota */ goto error_trans_cancel; } diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 67c8dc9de8aa..abbb417c4fbd 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -729,10 +729,12 @@ xfs_setattr_nonsize( if (XFS_IS_QUOTA_RUNNING(mp) && ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) || (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) { + int flags = has_capability_noaudit(current, CAP_FOWNER) ? + XFS_QMOPT_FORCE_RES : 0; + ASSERT(tp); error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp, - NULL, capable(CAP_FOWNER) ? - XFS_QMOPT_FORCE_RES : 0); + NULL, flags); if (error) /* out of quota */ goto out_cancel; } diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c index bca48b308c02..a99d19c2c11f 100644 --- a/fs/xfs/xfs_xattr.c +++ b/fs/xfs/xfs_xattr.c @@ -164,7 +164,7 @@ xfs_xattr_put_listent( * Only show root namespace entries if we are actually allowed to * see them. */ - if (!capable(CAP_SYS_ADMIN)) + if (!has_capability_noaudit(current, CAP_SYS_ADMIN)) return; prefix = XATTR_TRUSTED_PREFIX;
In cases when a negative result of a capability check doesn't lead to an immediate, user-visible error, only a subtle difference in behavior, it is better to use has_capability_noaudit(current, ...), so that LSMs (e.g. SELinux) don't generate a denial record in the audit log each time the capability status is queried. This patch should cover all such cases in fs/xfs/. Note that I kept the capable(CAP_FSETID) checks, since these will only be executed if the user explicitly tries to set the SUID/SGID bit, and it likely makes sense to log such attempts even if the syscall doesn't fail and just ignores the bits. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> --- fs/xfs/xfs_fsmap.c | 4 ++-- fs/xfs/xfs_ioctl.c | 5 ++++- fs/xfs/xfs_iops.c | 6 ++++-- fs/xfs/xfs_xattr.c | 2 +- 4 files changed, 11 insertions(+), 6 deletions(-)