Message ID | fa801180-0229-4ea7-b8eb-eb162935d348@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | xfs: allow SECURE namespace xattrs to use reserved pool | expand |
On Fri, Jul 19, 2024 at 05:48:53PM -0500, Eric Sandeen wrote: > xfs_attr_sethash(args); > > - return xfs_attr_set(args, op, args->attr_filter & XFS_ATTR_ROOT); > + rsvd = args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_SECURE); > + return xfs_attr_set(args, op, rsvd); This looks fine, although I'd probably do without the extra local variable. More importantly though, please write a comment documenting why we are dipping into the reserved pool here. We should have had that since the beginning, but this is a better time than never.
On 7/22/24 9:41 AM, Christoph Hellwig wrote: > On Fri, Jul 19, 2024 at 05:48:53PM -0500, Eric Sandeen wrote: >> xfs_attr_sethash(args); >> >> - return xfs_attr_set(args, op, args->attr_filter & XFS_ATTR_ROOT); >> + rsvd = args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_SECURE); >> + return xfs_attr_set(args, op, rsvd); > > This looks fine, although I'd probably do without the extra local > variable. More importantly though, please write a comment documenting > why we are dipping into the reserved pool here. We should have had that > since the beginning, but this is a better time than never. > > Ok, I thought the local var was a little prettier but *shrug* can do it either way. To be honest I'm not sure why it was done for ROOT; dchinnner mentioned something about DMAPI requirements, long ago... It seems reasonable, and it's been there forever but also not obviously required, AFAICT. What would your explanation be? ;) Thanks, -Eric
On Mon, Jul 22, 2024 at 10:05:03AM -0500, Eric Sandeen wrote: > Ok, I thought the local var was a little prettier but *shrug* can do it > either way. > > To be honest I'm not sure why it was done for ROOT; dchinnner mentioned > something about DMAPI requirements, long ago... > > It seems reasonable, and it's been there forever but also not obviously > required, AFAICT. > > What would your explanation be? ;) Based on your explanation it's probably ACLs for the same reason it applies to the security attributes.
On 7/22/24 10:05 AM, Eric Sandeen wrote: > On 7/22/24 9:41 AM, Christoph Hellwig wrote: >> On Fri, Jul 19, 2024 at 05:48:53PM -0500, Eric Sandeen wrote: >>> xfs_attr_sethash(args); >>> >>> - return xfs_attr_set(args, op, args->attr_filter & XFS_ATTR_ROOT); >>> + rsvd = args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_SECURE); >>> + return xfs_attr_set(args, op, rsvd); >> This looks fine, although I'd probably do without the extra local >> variable. More importantly though, please write a comment documenting >> why we are dipping into the reserved pool here. We should have had that >> since the beginning, but this is a better time than never. >> >> > Ok, I thought the local var was a little prettier but *shrug* can do it > either way. > > To be honest I'm not sure why it was done for ROOT; dchinnner mentioned > something about DMAPI requirements, long ago... The older Data Mover Framework (DMF v6) kept an extended attribute that denoted the file status (online/offline/partial online) and some region information (which was never used). Yeah DMF uses Data Management API (DMAPI) as the hooks to move data in when offline. > > It seems reasonable, and it's been there forever but also not obviously > required, AFAICT. > > What would your explanation be? ;) > > Thanks, > -Eric >
On Mon, Jul 22, 2024 at 10:05:03AM -0500, Eric Sandeen wrote: > On 7/22/24 9:41 AM, Christoph Hellwig wrote: > > On Fri, Jul 19, 2024 at 05:48:53PM -0500, Eric Sandeen wrote: > >> xfs_attr_sethash(args); > >> > >> - return xfs_attr_set(args, op, args->attr_filter & XFS_ATTR_ROOT); > >> + rsvd = args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_SECURE); > >> + return xfs_attr_set(args, op, rsvd); > > > > This looks fine, although I'd probably do without the extra local > > variable. More importantly though, please write a comment documenting > > why we are dipping into the reserved pool here. We should have had that > > since the beginning, but this is a better time than never. > > > > > > Ok, I thought the local var was a little prettier but *shrug* can do it > either way. > > To be honest I'm not sure why it was done for ROOT; dchinnner mentioned > something about DMAPI requirements, long ago... Because the xattrs created with inode allocation are not atomic (which they could be now because parent pointers added the infrastructure to add xattrs atomically in the create transaction), stuff like ACLs, security xattrs and, historically, DMAPI xattrs could fail to be created when the inode was allocated. For DMAPI/DMF, this was a big issue if the xattr creation got ENOSPC or the system crashed between inode creation (i.e the DMAPI CREATE notification being processed by DMF) and the xattr being written on the newly allocated inode. This would leave leave "untracked" inodes in the filesystem, and the only way DMF could discover inodes lacking in DMAPI xattrs was to run a full filesystem DMAPI-bulkstat scan to synchronise the filesystem state with the DMF database held in userspace. When you're tracking hundreds of millions to billions of inodes, being forced to do a full fs inode scan after crashes or ENOSPC before everything works properly again is, well, kinda annoying. Similar issues afflicted Trix (Trusted Irix) where security xattrs (such as ACLs) went missing on crash or ENOSPC. On Irix, they were stored in the XFS_ATTR_ROOT namespace, and the use of reserved block space for XFS_ATTR_ROOT was introduced in 1997 on Irix. commit 32d7e9a0d0fbca91a3d036c8518a87e10abfafb3 Author: gnuss <gnuss> Date: Fri Dec 19 19:35:42 1997 +0000 pv: 553766 rv: lord@cray.com Add reserved flag param to routines in block allocation call sequence This commit contains just the addition of XFS_TRANS_RESERVE for XFS_ATTR_ROOT xattrs. Nothing else used it - this was specically a fix for ACL/DMAPI xattr creation at ENOSPC.... However, ACL support on linux, and hence XFS_ATTR_SECURE, didn't exist until 2004: commit af80e14283d9475582dfb2d91395b674b9827fa8 Author: Nathan Scott <nathans@sgi.com> Date: Thu Jan 29 03:56:41 2004 +0000 Add the security extended attributes namespace. This added the XFS_ATTR_SECURE namespace because ACLs are in a different xattr namespace in Linux (i.e. TRUSTED -> XFS_ATTR_ROOT, SECURITY -> XFS_ATTR_SECURE), but the xattr set/change code never added the XFS_ATTR_SECURE flag to the XFS_TRANS_RESERVE case. It wasn't until 2007 that we started to use the reserve block pool for other ENOSPC avoidance cases (like indirect delalloc BMBT block reservation exhaustion in writeback) here: commit bdebc6a4aca2ac056b8174f5b6a3bf27b28f6a5d Author: Dave Chinner <dgc@sgi.com> Date: Fri Jun 8 16:03:59 2007 +0000 Prevent ENOSPC from aborting transactions that need to succeed So, essentially, for the first 10 years of it's life, XFS_TRANS_RESERVE was used supposed to be to prevent ENOSPC at inode creation for security and trusted xattrs.... > It seems reasonable, and it's been there forever but also not > obviously required, AFAICT. In hindsight, it looks to me like this was an oversight made back in 2004 when XFS_ATTR_SECURE was added to linux for security xattrs. As Christoph says: "it should have been there since the beginning". -Dave.
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c index ab3d22f662f2..e59193609003 100644 --- a/fs/xfs/xfs_xattr.c +++ b/fs/xfs/xfs_xattr.c @@ -82,6 +82,7 @@ xfs_attr_change( { struct xfs_mount *mp = args->dp->i_mount; int error; + bool rsvd; if (xfs_is_shutdown(mp)) return -EIO; @@ -110,7 +111,8 @@ xfs_attr_change( args->whichfork = XFS_ATTR_FORK; xfs_attr_sethash(args); - return xfs_attr_set(args, op, args->attr_filter & XFS_ATTR_ROOT); + rsvd = args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_SECURE); + return xfs_attr_set(args, op, rsvd); }
We got a report from the podman folks that selinux relabels that happen as part of their process were returning ENOSPC when the filesystem is completely full. This is because xattr changes reserve about 15 blocks for the worst case, but the common case is for selinux contexts to be the sole, in-inode xattr and consume no blocks. We already allow reserved space consumption for XFS_ATTR_ROOT for things such as ACLs, and selinux / SECURE attributes are not so very different, so allow them to use the reserved space as well. Signed-off-by: Eric Sandeen <sandeen@redhat.com> ---