Message ID | 20151208232818.GA29887@www.outflux.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 2015/12/09 8:28, Kees Cook wrote: > Changing the bits requires holding inode->i_mutex, so it cannot be done > during the page fault (due to mmap_sem being held during the fault). We > could do this during vm_mmap_pgoff, but that would need coverage in > mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem > again. > > Instead, detect the need to clear the bits during the page fault, and > actually remove the bits during final fput. Since the file was open for > writing, it wouldn't have been possible to execute it yet. Did you check that inode->i_mutex is held when final fput() is called? Did you check a case where the file is copied between mmap() and final fput() (i.e. open() for write, mmap() for write, sleep forever waiting for the file owner to copy the content and attributes of the modified file)? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue 08-12-15 15:28:18, Kees Cook wrote: > Normally, when a user can modify a file that has setuid or setgid bits, > those bits are cleared when they are not the file owner or a member > of the group. This is enforced when using write and truncate but not > when writing to a shared mmap on the file. This could allow the file > writer to gain privileges by changing a binary without losing the > setuid/setgid/caps bits. > > Changing the bits requires holding inode->i_mutex, so it cannot be done > during the page fault (due to mmap_sem being held during the fault). We > could do this during vm_mmap_pgoff, but that would need coverage in > mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem > again. > > Instead, detect the need to clear the bits during the page fault, and > actually remove the bits during final fput. Since the file was open for > writing, it wouldn't have been possible to execute it yet. > > Signed-off-by: Kees Cook <keescook@chromium.org> > --- > Here's another way? I wonder which of these will actually work. I > wish we could reject writes if file_remove_privs() fails. Yeah, the fact that we cannot do anything with file_remove_privs() failure is rather unfortunate. So open for writing may be the best choice for file_remove_privs() in the end? It's not perfect but it looks like the least problematic solution. Frankly writeable files that have SUID / SGID bits set are IMHO problems on its own, with IMA attrs which are handled by file_remove_privs() as well things may be somewhat different. > diff --git a/fs/file_table.c b/fs/file_table.c > index ad17e05ebf95..abb537ef4344 100644 > --- a/fs/file_table.c > +++ b/fs/file_table.c > @@ -191,6 +191,14 @@ static void __fput(struct file *file) > > might_sleep(); > > + /* > + * XXX: While avoiding mmap_sem, we've already been written to. > + * We must ignore the return value, since we can't reject the > + * write. > + */ > + if (unlikely(file->f_remove_privs)) > + file_remove_privs(file); > + You're missing i_mutex locking again ;). Honza
On Wed, Dec 9, 2015 at 4:49 AM, Jan Kara <jack@suse.cz> wrote: > On Tue 08-12-15 15:28:18, Kees Cook wrote: >> Normally, when a user can modify a file that has setuid or setgid bits, >> those bits are cleared when they are not the file owner or a member >> of the group. This is enforced when using write and truncate but not >> when writing to a shared mmap on the file. This could allow the file >> writer to gain privileges by changing a binary without losing the >> setuid/setgid/caps bits. >> >> Changing the bits requires holding inode->i_mutex, so it cannot be done >> during the page fault (due to mmap_sem being held during the fault). We >> could do this during vm_mmap_pgoff, but that would need coverage in >> mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem >> again. >> >> Instead, detect the need to clear the bits during the page fault, and >> actually remove the bits during final fput. Since the file was open for >> writing, it wouldn't have been possible to execute it yet. >> >> Signed-off-by: Kees Cook <keescook@chromium.org> >> --- >> Here's another way? I wonder which of these will actually work. I >> wish we could reject writes if file_remove_privs() fails. > > Yeah, the fact that we cannot do anything with file_remove_privs() failure > is rather unfortunate. So open for writing may be the best choice for > file_remove_privs() in the end? It's not perfect but it looks like the > least problematic solution. Yeah, back to just the open itself. I can't even delay this to the mmap. :( I will do a v5. :) -Kees > > Frankly writeable files that have SUID / SGID bits set are IMHO problems on > its own, with IMA attrs which are handled by file_remove_privs() as well > things may be somewhat different. > >> diff --git a/fs/file_table.c b/fs/file_table.c >> index ad17e05ebf95..abb537ef4344 100644 >> --- a/fs/file_table.c >> +++ b/fs/file_table.c >> @@ -191,6 +191,14 @@ static void __fput(struct file *file) >> >> might_sleep(); >> >> + /* >> + * XXX: While avoiding mmap_sem, we've already been written to. >> + * We must ignore the return value, since we can't reject the >> + * write. >> + */ >> + if (unlikely(file->f_remove_privs)) >> + file_remove_privs(file); >> + > > You're missing i_mutex locking again ;). > > Honza > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR
diff --git a/fs/file_table.c b/fs/file_table.c index ad17e05ebf95..abb537ef4344 100644 --- a/fs/file_table.c +++ b/fs/file_table.c @@ -191,6 +191,14 @@ static void __fput(struct file *file) might_sleep(); + /* + * XXX: While avoiding mmap_sem, we've already been written to. + * We must ignore the return value, since we can't reject the + * write. + */ + if (unlikely(file->f_remove_privs)) + file_remove_privs(file); + fsnotify_close(file); /* * The function eventpoll_release() should be the first called diff --git a/include/linux/fs.h b/include/linux/fs.h index 3aa514254161..409bd7047e7e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -872,6 +872,7 @@ struct file { struct list_head f_tfile_llink; #endif /* #ifdef CONFIG_EPOLL */ struct address_space *f_mapping; + bool f_remove_privs; } __attribute__((aligned(4))); /* lest something weird decides that 2 is OK */ struct file_handle { diff --git a/mm/memory.c b/mm/memory.c index c387430f06c3..08a77e0cf65f 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2036,6 +2036,7 @@ static inline int wp_page_reuse(struct mm_struct *mm, if (!page_mkwrite) file_update_time(vma->vm_file); + vma->vm_file->f_remove_privs = true; } return VM_FAULT_WRITE;
Normally, when a user can modify a file that has setuid or setgid bits, those bits are cleared when they are not the file owner or a member of the group. This is enforced when using write and truncate but not when writing to a shared mmap on the file. This could allow the file writer to gain privileges by changing a binary without losing the setuid/setgid/caps bits. Changing the bits requires holding inode->i_mutex, so it cannot be done during the page fault (due to mmap_sem being held during the fault). We could do this during vm_mmap_pgoff, but that would need coverage in mprotect as well, but to check for MAP_SHARED, we'd need to hold mmap_sem again. Instead, detect the need to clear the bits during the page fault, and actually remove the bits during final fput. Since the file was open for writing, it wouldn't have been possible to execute it yet. Signed-off-by: Kees Cook <keescook@chromium.org> --- Here's another way? I wonder which of these will actually work. I wish we could reject writes if file_remove_privs() fails. v4: - delay removal instead of still needing mmap_sem for mprotect, yalin v3: - move outside of mmap_sem for real now, fengguang - check return code of file_remove_privs, akpm v2: - move to mmap from fault handler, jack --- fs/file_table.c | 8 ++++++++ include/linux/fs.h | 1 + mm/memory.c | 1 + 3 files changed, 10 insertions(+)