Message ID | 155552787330.20411.11893581890744963309.stgit@magnolia (mailing list archive) |
---|---|
State | New, archived |
Series | vfs: make immutable files actually immutable |
On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> The chattr manpage has this to say about immutable files:
>
> "A file with the 'i' attribute cannot be modified: it cannot be deleted
> or renamed, no link can be created to this file, most of the file's
> metadata can not be modified, and the file can not be opened in write
> mode."
>
> Once the flag is set, it is enforced for quite a few file operations,
> such as fallocate, fpunch, fzero, rm, touch, open, etc. However, we
> don't check for immutability when doing a write(), a PROT_WRITE mmap(),
> a truncate(), or a write to a previously established mmap.
>
> If a program has an open write fd to a file that the administrator
> subsequently marks immutable, the program still can change the file
> contents. Weird!
>
> The ability to write to an immutable file does not follow the manpage
> promise that immutable files cannot be modified. Worse yet it's
> inconsistent with the behavior of other syscalls which don't allow
> modifications of immutable files.
>
> Therefore, add the necessary checks to make the write, mmap, and
> truncate behavior consistent with what the manpage says and consistent
> with other syscalls on filesystems which support IMMUTABLE.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

This mostly seems reasonable to me. I assume you'll want some mm acks. A
couple notes..

>  fs/attr.c    |   13 ++++++-------
>  mm/filemap.c |    3 +++
>  mm/memory.c  |    3 +++
>  mm/mmap.c    |    8 ++++++--
>  4 files changed, 18 insertions(+), 9 deletions(-)
>
>
> diff --git a/fs/attr.c b/fs/attr.c
> index d22e8187477f..1fcfdcc5b367 100644
> --- a/fs/attr.c
> +++ b/fs/attr.c
> @@ -233,19 +233,18 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
>
>  	WARN_ON_ONCE(!inode_is_locked(inode));
>
> -	if (ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) {
> -		if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
> -			return -EPERM;
> -	}
> +	if (IS_IMMUTABLE(inode))
> +		return -EPERM;
> +
> +	if ((ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) &&
> +	    IS_APPEND(inode))
> +		return -EPERM;
>
>  	/*
>  	 * If utimes(2) and friends are called with times == NULL (or both
>  	 * times are UTIME_NOW), then we need to check for write permission
>  	 */
>  	if (ia_valid & ATTR_TOUCH) {
> -		if (IS_IMMUTABLE(inode))
> -			return -EPERM;
> -
>  		if (!inode_owner_or_capable(inode)) {
>  			error = inode_permission(inode, MAY_WRITE);
>  			if (error)
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d78f577baef2..9fed698f4c63 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3033,6 +3033,9 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
>  	loff_t count;
>  	int ret;
>
> +	if (IS_IMMUTABLE(inode))
> +		return -EPERM;
> +
>  	if (!iov_iter_count(from))
>  		return 0;
>
> diff --git a/mm/memory.c b/mm/memory.c
> index ab650c21bccd..dfd5eba278d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
>
>  	vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
>
> +	if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
> +		return VM_FAULT_SIGBUS;
> +

I take it this depends on cleaning already dirty pages when the
immutable bit is set. That appears to be done later in the series, but I
notice it occurs at the filesystem level (presumably due to the ioctl).
That of course is fine, but it makes me wonder a bit whether we should
have a generic helper for each fs to call that does the requisite
writeback and dio wait (similar to generic_remap_file_range_prep() for
example). Thoughts?

>  	ret = vmf->vma->vm_ops->page_mkwrite(vmf);
>  	/* Restore original flags so that caller is not surprised */
>  	vmf->flags = old_flags;
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 41eb48d9b527..697a101bda59 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1481,8 +1481,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  		case MAP_SHARED_VALIDATE:
>  			if (flags & ~flags_mask)
>  				return -EOPNOTSUPP;
> -			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
> -				return -EACCES;
> +			if (prot & PROT_WRITE) {
> +				if (!(file->f_mode & FMODE_WRITE))
> +					return -EACCES;
> +				if (IS_IMMUTABLE(file_inode(file)))
> +					return -EPERM;
> +			}

We haven't done anything to clean up writeable mappings on marking the
inode immutable, right? It seems a little strange that we can have some
writeable mappings hang around while we can't create new ones, but
perhaps it doesn't matter if the write fault behavior is the same.

Brian

>
>  			/*
>  			 * Make sure we don't allow writing to an append-only
>

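To make the fs/attr.c hunk quoted above easier to reason about, here is a minimal user-space sketch of the attribute check before and after the patch. It is a model under stated assumptions, not kernel code: struct model_inode, the M_ATTR_* bit values, and both function names are hypothetical, and the ownership and permission checks that follow in notify_change() are left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative bits only; the kernel defines its own ATTR_* values. */
#define M_ATTR_MODE      (1u << 0)
#define M_ATTR_UID       (1u << 1)
#define M_ATTR_GID       (1u << 2)
#define M_ATTR_SIZE      (1u << 3)
#define M_ATTR_TIMES_SET (1u << 4)
#define M_ATTR_TOUCH     (1u << 5)

#define M_ATTR_GUARDED \
	(M_ATTR_MODE | M_ATTR_UID | M_ATTR_GID | M_ATTR_TIMES_SET)

/* Hypothetical stand-in for the two inode flags the hunk tests. */
struct model_inode {
	bool immutable;
	bool append;
};

/* Old logic: immutable only blocks the guarded attributes and touch. */
int check_before(const struct model_inode *inode, unsigned int ia_valid)
{
	if (ia_valid & M_ATTR_GUARDED) {
		if (inode->immutable || inode->append)
			return -EPERM;
	}
	if (ia_valid & M_ATTR_TOUCH) {
		if (inode->immutable)
			return -EPERM;
	}
	return 0;
}

/* Patched logic: immutable blocks every attribute change up front. */
int check_after(const struct model_inode *inode, unsigned int ia_valid)
{
	if (inode->immutable)
		return -EPERM;
	if ((ia_valid & M_ATTR_GUARDED) && inode->append)
		return -EPERM;
	return 0;
}
```

In this model, a size change (M_ATTR_SIZE, standing in for truncation) on an immutable inode passes check_before() and is refused by check_after(), while an append-only inode gets the same answer from both.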
On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> The chattr manpage has this to say about immutable files:
>
> "A file with the 'i' attribute cannot be modified: it cannot be deleted
> or renamed, no link can be created to this file, most of the file's
> metadata can not be modified, and the file can not be opened in write
> mode."
>
> Once the flag is set, it is enforced for quite a few file operations,
> such as fallocate, fpunch, fzero, rm, touch, open, etc. However, we
> don't check for immutability when doing a write(), a PROT_WRITE mmap(),
> a truncate(), or a write to a previously established mmap.
>
> If a program has an open write fd to a file that the administrator
> subsequently marks immutable, the program still can change the file
> contents. Weird!
>
> The ability to write to an immutable file does not follow the manpage
> promise that immutable files cannot be modified. Worse yet it's
> inconsistent with the behavior of other syscalls which don't allow
> modifications of immutable files.
>
> Therefore, add the necessary checks to make the write, mmap, and
> truncate behavior consistent with what the manpage says and consistent
> with other syscalls on filesystems which support IMMUTABLE.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, looks good. I'm going to take this patch through the ext4 tree
since it doesn't have any dependencies on the rest of the patch series.

- Ted

On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index ab650c21bccd..dfd5eba278d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
>
>  	vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
>
> +	if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
> +		return VM_FAULT_SIGBUS;
> +
>  	ret = vmf->vma->vm_ops->page_mkwrite(vmf);
>  	/* Restore original flags so that caller is not surprised */
>  	vmf->flags = old_flags;

Shouldn't this check be moved before the modification of vmf->flags?
It looks like do_page_mkwrite() isn't supposed to be returning with
vmf->flags modified, lest "the caller gets surprised".

- Ted

On Sun, Jun 09, 2019 at 09:51:45PM -0400, Theodore Ts'o wrote:
> On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index ab650c21bccd..dfd5eba278d6 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
> >
> >  	vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
> >
> > +	if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
> > +		return VM_FAULT_SIGBUS;
> > +
> >  	ret = vmf->vma->vm_ops->page_mkwrite(vmf);
> >  	/* Restore original flags so that caller is not surprised */
> >  	vmf->flags = old_flags;
>
> Shouldn't this check be moved before the modification of vmf->flags?
> It looks like do_page_mkwrite() isn't supposed to be returning with
> vmf->flags modified, lest "the caller gets surprised".

Yeah, I think that was a merge error during a rebase... :(

Er ... if you're still planning to take this patch through your tree,
can you move it to above the "vmf->flags = FAULT_FLAG_WRITE..." ?

--D

> - Ted

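The ordering problem discussed above can be shown with a small user-space model. This is only a sketch under stated assumptions: struct model_fault, the M_* constants, and all three functions are hypothetical stand-ins for struct vm_fault, the kernel's fault flags, and do_page_mkwrite(); only the order of the immutable check relative to the flags assignment is meant to match the thread.

```c
#include <stdbool.h>

/* Illustrative values; not the kernel's definitions. */
#define M_FAULT_FLAG_WRITE   0x01u
#define M_FAULT_FLAG_MKWRITE 0x02u
#define M_VM_FAULT_SIGBUS    0x02u

/* Hypothetical stand-in for the parts of struct vm_fault used here. */
struct model_fault {
	unsigned int flags;
	bool immutable;	/* models IS_IMMUTABLE() on the backing inode */
};

/* Stand-in for vm_ops->page_mkwrite(); always succeeds in this model. */
static unsigned int model_page_mkwrite(struct model_fault *f)
{
	(void)f;
	return 0;
}

/* As posted: the early return happens after flags were overwritten. */
unsigned int mkwrite_as_posted(struct model_fault *f)
{
	unsigned int old_flags = f->flags;
	unsigned int ret;

	f->flags = M_FAULT_FLAG_WRITE | M_FAULT_FLAG_MKWRITE;

	if (f->immutable)
		return M_VM_FAULT_SIGBUS;	/* caller sees modified flags */

	ret = model_page_mkwrite(f);
	f->flags = old_flags;
	return ret;
}

/* As requested in review: check first, so flags are never left changed. */
unsigned int mkwrite_check_first(struct model_fault *f)
{
	unsigned int old_flags = f->flags;
	unsigned int ret;

	if (f->immutable)
		return M_VM_FAULT_SIGBUS;

	f->flags = M_FAULT_FLAG_WRITE | M_FAULT_FLAG_MKWRITE;
	ret = model_page_mkwrite(f);
	f->flags = old_flags;
	return ret;
}
```

Both variants return the same fault code for an immutable file; the difference is that only mkwrite_check_first() hands the caller back its original flags on that path.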
On Sun, Jun 09, 2019 at 09:41:44PM -0700, Darrick J. Wong wrote:
> On Sun, Jun 09, 2019 at 09:51:45PM -0400, Theodore Ts'o wrote:
> > On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
>
> > Shouldn't this check be moved before the modification of vmf->flags?
> > It looks like do_page_mkwrite() isn't supposed to be returning with
> > vmf->flags modified, lest "the caller gets surprised".
>
> Yeah, I think that was a merge error during a rebase... :(
>
> Er ... if you're still planning to take this patch through your tree,
> can you move it to above the "vmf->flags = FAULT_FLAG_WRITE..." ?

I was planning on only taking 8/8 through the ext4 tree. I also added
a patch which filtered writes, truncates, and page_mkwrites (but not
mmap) for immutable files at the ext4 level.

I *could* take this patch through the mm/fs tree, but I wasn't sure
what your plans were for the rest of the patch series, and it seemed
like it hadn't gotten much review/attention from other fs or mm folks
(well, I guess Brian Foster weighed in).

What do you think?

- Ted

On Mon, Jun 10, 2019 at 09:14:17AM -0400, Theodore Ts'o wrote:
> On Sun, Jun 09, 2019 at 09:41:44PM -0700, Darrick J. Wong wrote:
> > On Sun, Jun 09, 2019 at 09:51:45PM -0400, Theodore Ts'o wrote:
> > > On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> >
> > > Shouldn't this check be moved before the modification of vmf->flags?
> > > It looks like do_page_mkwrite() isn't supposed to be returning with
> > > vmf->flags modified, lest "the caller gets surprised".
> >
> > Yeah, I think that was a merge error during a rebase... :(
> >
> > Er ... if you're still planning to take this patch through your tree,
> > can you move it to above the "vmf->flags = FAULT_FLAG_WRITE..." ?
>
> I was planning on only taking 8/8 through the ext4 tree. I also added
> a patch which filtered writes, truncates, and page_mkwrites (but not
> mmap) for immutable files at the ext4 level.

*Oh*. I saw your reply attached to the 1/8 patch and thought that was
the one you were taking. I was sort of surprised, tbh. :)

> I *could* take this patch through the mm/fs tree, but I wasn't sure
> what your plans were for the rest of the patch series, and it seemed
> like it hadn't gotten much review/attention from other fs or mm folks
> (well, I guess Brian Foster weighed in).

> What do you think?

Not sure. The comments attached to the LWN story were sort of nasty,
and now that a couple of people said "Oh, well, Debian documented the
inconsistent behavior so just let it be" I haven't felt like
resurrecting the series for 5.3.

I do want to clean up the parameter validation for the VFS SETFLAGS and
FSSETXATTR ioctls though... eh, maybe I'll just send out the series as
it stands now. I'm still maintaining it, so all that work might as well
go somewhere.

--D

>
> - Ted
>
>
>

On Mon, Jun 10, 2019 at 09:09:34AM -0700, Darrick J. Wong wrote:
> > I was planning on only taking 8/8 through the ext4 tree. I also added
> > a patch which filtered writes, truncates, and page_mkwrites (but not
> > mmap) for immutable files at the ext4 level.
>
> *Oh*. I saw your reply attached to the 1/8 patch and thought that was
> the one you were taking. I was sort of surprised, tbh. :)

Sorry, my bad. I mis-replied to the wrong e-mail message :-)

> > I *could* take this patch through the mm/fs tree, but I wasn't sure
> > what your plans were for the rest of the patch series, and it seemed
> > like it hadn't gotten much review/attention from other fs or mm folks
> > (well, I guess Brian Foster weighed in).
>
> > What do you think?
>
> Not sure. The comments attached to the LWN story were sort of nasty,
> and now that a couple of people said "Oh, well, Debian documented the
> inconsistent behavior so just let it be" I haven't felt like
> resurrecting the series for 5.3.

Ah, I had missed the LWN article. <Looks>

Yeah, it's the same set of issues that we had discussed when this
first came up. We can go round and round on this one; It's true that
root can now cause random programs which have a file mmap'ed for
writing to seg fault, but root has a million ways of killing and
otherwise harming running application programs, and it's unlikely
files get marked for immutable all that often. We just have to pick
one way of doing things, and let it be same across all the file
systems.

My understanding was that XFS had chosen to make the inode immutable
as soon as the flag is set (as opposed to forbidding new fd's to be
opened which were writeable), and I was OK moving ext4 to that common
interpretation of the immutable bit, even though it would be a change
to ext4.

And then when I saw that Amir had included a patch that would cause
test failures unless that patch series was applied, it seemed that we
had all thought that the change was a done deal. Perhaps we should
have had a more explicit discussion when the test was sent for review,
but I had assumed it was exclusively a copy_file_range set of tests,
so I didn't realize it was going to cause ext4 failures.

- Ted

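The two interpretations discussed above (immutable takes effect immediately versus only blocking new writable opens) can be told apart from user space with a small probe. The following is a hedged sketch, not a test from this series: probe_immutable_write(), set_immutable(), and enum probe_result are hypothetical names, while FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, and FS_IMMUTABLE_FL come from <linux/fs.h>. Setting the flag normally needs CAP_LINUX_IMMUTABLE and filesystem support, so the probe reports PROBE_SKIPPED when it cannot set it.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>

/* Hypothetical result codes for this sketch. */
enum probe_result {
	PROBE_SKIPPED = 0,	/* could not set the immutable flag */
	PROBE_WRITE_ALLOWED,	/* write() on the old fd still worked */
	PROBE_WRITE_DENIED,	/* write() on the old fd failed with EPERM */
	PROBE_ERROR,		/* anything else */
};

/* Set or clear FS_IMMUTABLE_FL on an open file; returns 0 or -1. */
static int set_immutable(int fd, int on)
{
	int flags;

	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
		return -1;
	if (on)
		flags |= FS_IMMUTABLE_FL;
	else
		flags &= ~FS_IMMUTABLE_FL;
	return ioctl(fd, FS_IOC_SETFLAGS, &flags);
}

/*
 * Open a fresh file for writing, mark it immutable, then try to write
 * through the fd that was opened before the flag was set.
 */
enum probe_result probe_immutable_write(const char *path)
{
	enum probe_result res;
	ssize_t n;
	int fd;

	fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return PROBE_ERROR;

	if (set_immutable(fd, 1) < 0) {
		res = PROBE_SKIPPED;
	} else {
		n = write(fd, "x", 1);
		if (n == 1)
			res = PROBE_WRITE_ALLOWED;
		else if (n < 0 && errno == EPERM)
			res = PROBE_WRITE_DENIED;
		else
			res = PROBE_ERROR;
		/* Clear the flag again so the file can be removed. */
		set_immutable(fd, 0);
	}

	close(fd);
	unlink(path);
	return res;
}
```

Under the semantics proposed in this thread one would expect PROBE_WRITE_DENIED; a kernel that only refuses new writable opens would be expected to give PROBE_WRITE_ALLOWED. Which result a particular kernel and filesystem produce is not something this sketch asserts.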
On Mon, Jun 10, 2019 at 04:41:54PM -0400, Theodore Ts'o wrote:
> On Mon, Jun 10, 2019 at 09:09:34AM -0700, Darrick J. Wong wrote:
> > > I was planning on only taking 8/8 through the ext4 tree. I also added
> > > a patch which filtered writes, truncates, and page_mkwrites (but not
> > > mmap) for immutable files at the ext4 level.
> >
> > *Oh*. I saw your reply attached to the 1/8 patch and thought that was
> > the one you were taking. I was sort of surprised, tbh. :)
>
> Sorry, my bad. I mis-replied to the wrong e-mail message :-)
>
> > > I *could* take this patch through the mm/fs tree, but I wasn't sure
> > > what your plans were for the rest of the patch series, and it seemed
> > > like it hadn't gotten much review/attention from other fs or mm folks
> > > (well, I guess Brian Foster weighed in).
> >
> > > What do you think?
> >
> > Not sure. The comments attached to the LWN story were sort of nasty,
> > and now that a couple of people said "Oh, well, Debian documented the
> > inconsistent behavior so just let it be" I haven't felt like
> > resurrecting the series for 5.3.
>
> Ah, I had missed the LWN article. <Looks>
>
> Yeah, it's the same set of issues that we had discussed when this
> first came up. We can go round and round on this one; It's true that
> root can now cause random programs which have a file mmap'ed for
> writing to seg fault, but root has a million ways of killing and
> otherwise harming running application programs, and it's unlikely
> files get marked for immutable all that often. We just have to pick
> one way of doing things, and let it be same across all the file
> systems.
>
> My understanding was that XFS had chosen to make the inode immutable
> as soon as the flag is set (as opposed to forbidding new fd's to be
> opened which were writeable), and I was OK moving ext4 to that common
> interpretation of the immutable bit, even though it would be a change
> to ext4.

<nod> It started as "just do this to xfs" and has now become a vfs level
change...

> And then when I saw that Amir had included a patch that would cause
> test failures unless that patch series was applied, it seemed that we
> had all thought that the change was a done deal. Perhaps we should
> have had a more explicit discussion when the test was sent for review,
> but I had assumed it was exclusively a copy_file_range set of tests,
> so I didn't realize it was going to cause ext4 failures.

And here we see the inconsistent behavior causing developer confusion. :)

I think Amir's c_f_r tests just check the existing behavior (of just
c_f_r) that you can't (most of the time) copy into a file that you
opened for write but that the administrator has since marked immutable.

/That/ behavior in turn came from the original implementation that would
try reflink which would fail on the immutable destination check and then
fail the whole call ... I think?

--D

> - Ted

On Mon, Jun 10, 2019 at 04:41:54PM -0400, Theodore Ts'o wrote:
> On Mon, Jun 10, 2019 at 09:09:34AM -0700, Darrick J. Wong wrote:
> > > I was planning on only taking 8/8 through the ext4 tree. I also added
> > > a patch which filtered writes, truncates, and page_mkwrites (but not
> > > mmap) for immutable files at the ext4 level.
> >
> > *Oh*. I saw your reply attached to the 1/8 patch and thought that was
> > the one you were taking. I was sort of surprised, tbh. :)
>
> Sorry, my bad. I mis-replied to the wrong e-mail message :-)

Also ... after flailing around with the v2 series I decided that it
would be much less work to refactor all the current implementations to
call a common parameter-checking function, which will hopefully make the
behavior of SETFLAGS and FSSETXATTR more consistent across filesystems.

That makes the immutable series much less code and fewer patches, but
also means that the 8/8 patch isn't needed anymore.

I'm about to send both out.

--D

> > > I *could* take this patch through the mm/fs tree, but I wasn't sure
> > > what your plans were for the rest of the patch series, and it seemed
> > > like it hadn't gotten much review/attention from other fs or mm folks
> > > (well, I guess Brian Foster weighed in).
> >
> > > What do you think?
> >
> > Not sure. The comments attached to the LWN story were sort of nasty,
> > and now that a couple of people said "Oh, well, Debian documented the
> > inconsistent behavior so just let it be" I haven't felt like
> > resurrecting the series for 5.3.
>
> Ah, I had missed the LWN article. <Looks>
>
> Yeah, it's the same set of issues that we had discussed when this
> first came up. We can go round and round on this one; It's true that
> root can now cause random programs which have a file mmap'ed for
> writing to seg fault, but root has a million ways of killing and
> otherwise harming running application programs, and it's unlikely
> files get marked for immutable all that often. We just have to pick
> one way of doing things, and let it be same across all the file
> systems.
>
> My understanding was that XFS had chosen to make the inode immutable
> as soon as the flag is set (as opposed to forbidding new fd's to be
> opened which were writeable), and I was OK moving ext4 to that common
> interpretation of the immutable bit, even though it would be a change
> to ext4.
>
> And then when I saw that Amir had included a patch that would cause
> test failures unless that patch series was applied, it seemed that we
> had all thought that the change was a done deal. Perhaps we should
> have had a more explicit discussion when the test was sent for review,
> but I had assumed it was exclusively a copy_file_range set of tests,
> so I didn't realize it was going to cause ext4 failures.
>
> - Ted

diff --git a/fs/attr.c b/fs/attr.c
index d22e8187477f..1fcfdcc5b367 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -233,19 +233,18 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
 
 	WARN_ON_ONCE(!inode_is_locked(inode));
 
-	if (ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) {
-		if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
-			return -EPERM;
-	}
+	if (IS_IMMUTABLE(inode))
+		return -EPERM;
+
+	if ((ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) &&
+	    IS_APPEND(inode))
+		return -EPERM;
 
 	/*
 	 * If utimes(2) and friends are called with times == NULL (or both
 	 * times are UTIME_NOW), then we need to check for write permission
 	 */
 	if (ia_valid & ATTR_TOUCH) {
-		if (IS_IMMUTABLE(inode))
-			return -EPERM;
-
 		if (!inode_owner_or_capable(inode)) {
 			error = inode_permission(inode, MAY_WRITE);
 			if (error)
diff --git a/mm/filemap.c b/mm/filemap.c
index d78f577baef2..9fed698f4c63 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3033,6 +3033,9 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
 	loff_t count;
 	int ret;
 
+	if (IS_IMMUTABLE(inode))
+		return -EPERM;
+
 	if (!iov_iter_count(from))
 		return 0;
 
diff --git a/mm/memory.c b/mm/memory.c
index ab650c21bccd..dfd5eba278d6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
 
 	vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
 
+	if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
+		return VM_FAULT_SIGBUS;
+
 	ret = vmf->vma->vm_ops->page_mkwrite(vmf);
 	/* Restore original flags so that caller is not surprised */
 	vmf->flags = old_flags;
diff --git a/mm/mmap.c b/mm/mmap.c
index 41eb48d9b527..697a101bda59 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1481,8 +1481,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		case MAP_SHARED_VALIDATE:
 			if (flags & ~flags_mask)
 				return -EOPNOTSUPP;
-			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
-				return -EACCES;
+			if (prot & PROT_WRITE) {
+				if (!(file->f_mode & FMODE_WRITE))
+					return -EACCES;
+				if (IS_IMMUTABLE(file_inode(file)))
+					return -EPERM;
+			}
 
 			/*
 			 * Make sure we don't allow writing to an append-only