Message ID | 20241018161415.3845146-1-roberto.sassu@huaweicloud.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] mm: Split critical region in remap_file_pages() and invoke LSMs in between | expand |
* Roberto Sassu <roberto.sassu@huaweicloud.com> [241018 12:15]: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in > remap_file_pages()") fixed a security issue, it added an LSM check when > trying to remap file pages, so that LSMs have the opportunity to evaluate > such action like for other memory operations such as mmap() and mprotect(). > > However, that commit called security_mmap_file() inside the mmap_lock lock, > while the other calls do it before taking the lock, after commit > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem"). > > This caused lock inversion issue with IMA which was taking the mmap_lock > and i_mutex lock in the opposite way when the remap_file_pages() system > call was called. > > Solve the issue by splitting the critical region in remap_file_pages() in > two regions: the first takes a read lock of mmap_lock, retrieves the VMA > and the file descriptor associated, and calculates the 'prot' and 'flags' > variables; the second takes a write lock on mmap_lock, checks that the VMA > flags and the VMA file descriptor are the same as the ones obtained in the > first critical region (otherwise the system call fails), and calls > do_mmap(). > > In between, after releasing the read lock and before taking the write lock, > call security_mmap_file(), and solve the lock inversion issue. > > Cc: stable@vger.kernel.org # v6.12-rcx > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") > Reported-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/ > Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> > Reviewed-by: Jann Horn <jannh@google.com> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Tested-by: Roberto Sassu <roberto.sassu@huawei.com> > Tested-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> You really need to be in that list of people :) Please add your own signed-off-by. Besides the missing SOB, it looks good. Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > --- > mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++-------------- > 1 file changed, 52 insertions(+), 17 deletions(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 9c0fb43064b5..f731dd69e162 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > unsigned long populate = 0; > unsigned long ret = -EINVAL; > struct file *file; > + vm_flags_t vm_flags; > > pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n", > current->comm, current->pid); > @@ -1656,12 +1657,60 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > if (pgoff + (size >> PAGE_SHIFT) < pgoff) > return ret; > > - if (mmap_write_lock_killable(mm)) > + if (mmap_read_lock_killable(mm)) > return -EINTR; > > + /* > + * Look up VMA under read lock first so we can perform the security > + * without holding locks (which can be problematic). We reacquire a > + * write lock later and check nothing changed underneath us. > + */ > vma = vma_lookup(mm, start); > > - if (!vma || !(vma->vm_flags & VM_SHARED)) > + if (!vma || !(vma->vm_flags & VM_SHARED)) { > + mmap_read_unlock(mm); > + return -EINVAL; > + } > + > + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; > + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; > + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; > + > + flags &= MAP_NONBLOCK; > + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; > + if (vma->vm_flags & VM_LOCKED) > + flags |= MAP_LOCKED; > + > + /* Save vm_flags used to calculate prot and flags, and recheck later. */ > + vm_flags = vma->vm_flags; > + file = get_file(vma->vm_file); > + > + mmap_read_unlock(mm); > + > + /* Call outside mmap_lock to be consistent with other callers. */ > + ret = security_mmap_file(file, prot, flags); > + if (ret) { > + fput(file); > + return ret; > + } > + > + ret = -EINVAL; > + > + /* OK security check passed, take write lock + let it rip. */ > + if (mmap_write_lock_killable(mm)) { > + fput(file); > + return -EINTR; > + } > + > + vma = vma_lookup(mm, start); > + > + if (!vma) > + goto out; > + > + /* Make sure things didn't change under us. */ > + if (vma->vm_flags != vm_flags) > + goto out; > + if (vma->vm_file != file) > goto out; > > if (start + size > vma->vm_end) { > @@ -1689,25 +1738,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > goto out; > } > > - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; > - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; > - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; > - > - flags &= MAP_NONBLOCK; > - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; > - if (vma->vm_flags & VM_LOCKED) > - flags |= MAP_LOCKED; > - > - file = get_file(vma->vm_file); > - ret = security_mmap_file(vma->vm_file, prot, flags); > - if (ret) > - goto out_fput; > ret = do_mmap(vma->vm_file, start, size, > prot, flags, 0, pgoff, &populate, NULL); > -out_fput: > - fput(file); > out: > mmap_write_unlock(mm); > + fput(file); > if (populate) > mm_populate(ret, populate); > if (!IS_ERR_VALUE(ret)) > -- > 2.34.1 >
On Fri Oct 18, 2024 at 7:14 PM EEST, Roberto Sassu wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in > remap_file_pages()") fixed a security issue, it added an LSM check when > trying to remap file pages, so that LSMs have the opportunity to evaluate > such action like for other memory operations such as mmap() and mprotect(). > > However, that commit called security_mmap_file() inside the mmap_lock lock, > while the other calls do it before taking the lock, after commit > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem"). > > This caused lock inversion issue with IMA which was taking the mmap_lock > and i_mutex lock in the opposite way when the remap_file_pages() system > call was called. > > Solve the issue by splitting the critical region in remap_file_pages() in > two regions: the first takes a read lock of mmap_lock, retrieves the VMA > and the file descriptor associated, and calculates the 'prot' and 'flags' > variables; the second takes a write lock on mmap_lock, checks that the VMA > flags and the VMA file descriptor are the same as the ones obtained in the > first critical region (otherwise the system call fails), and calls > do_mmap(). > > In between, after releasing the read lock and before taking the write lock, > call security_mmap_file(), and solve the lock inversion issue. > > Cc: stable@vger.kernel.org # v6.12-rcx > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") > Reported-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/ > Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> > Reviewed-by: Jann Horn <jannh@google.com> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Tested-by: Roberto Sassu <roberto.sassu@huawei.com> > Tested-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > --- > mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++-------------- > 1 file changed, 52 insertions(+), 17 deletions(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 9c0fb43064b5..f731dd69e162 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > unsigned long populate = 0; > unsigned long ret = -EINVAL; > struct file *file; > + vm_flags_t vm_flags; > > pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n", > current->comm, current->pid); > @@ -1656,12 +1657,60 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > if (pgoff + (size >> PAGE_SHIFT) < pgoff) > return ret; > > - if (mmap_write_lock_killable(mm)) > + if (mmap_read_lock_killable(mm)) > return -EINTR; > > + /* > + * Look up VMA under read lock first so we can perform the security > + * without holding locks (which can be problematic). We reacquire a > + * write lock later and check nothing changed underneath us. > + */ > vma = vma_lookup(mm, start); > > - if (!vma || !(vma->vm_flags & VM_SHARED)) > + if (!vma || !(vma->vm_flags & VM_SHARED)) { > + mmap_read_unlock(mm); > + return -EINVAL; > + } > + > + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; > + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; > + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; Not an actual review comment but we don't have a conversion macro and/or inline for this, do we (and opposite direction)? > + > + flags &= MAP_NONBLOCK; > + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; > + if (vma->vm_flags & VM_LOCKED) > + flags |= MAP_LOCKED; > + > + /* Save vm_flags used to calculate prot and flags, and recheck later. */ > + vm_flags = vma->vm_flags; > + file = get_file(vma->vm_file); > + > + mmap_read_unlock(mm); > + > + /* Call outside mmap_lock to be consistent with other callers. */ > + ret = security_mmap_file(file, prot, flags); > + if (ret) { > + fput(file); > + return ret; > + } > + > + ret = -EINVAL; > + > + /* OK security check passed, take write lock + let it rip. */ > + if (mmap_write_lock_killable(mm)) { > + fput(file); > + return -EINTR; > + } > + > + vma = vma_lookup(mm, start); > + > + if (!vma) > + goto out; > + > + /* Make sure things didn't change under us. */ > + if (vma->vm_flags != vm_flags) > + goto out; > + if (vma->vm_file != file) > goto out; > > if (start + size > vma->vm_end) { > @@ -1689,25 +1738,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > goto out; > } > > - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; > - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; > - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; > - > - flags &= MAP_NONBLOCK; > - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; > - if (vma->vm_flags & VM_LOCKED) > - flags |= MAP_LOCKED; > - > - file = get_file(vma->vm_file); > - ret = security_mmap_file(vma->vm_file, prot, flags); > - if (ret) > - goto out_fput; > ret = do_mmap(vma->vm_file, start, size, > prot, flags, 0, pgoff, &populate, NULL); > -out_fput: > - fput(file); > out: > mmap_write_unlock(mm); > + fput(file); > if (populate) > mm_populate(ret, populate); > if (!IS_ERR_VALUE(ret)) BR, Jarkko
On Fri, Oct 18, 2024 at 12:15 PM Roberto Sassu <roberto.sassu@huaweicloud.com> wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in > remap_file_pages()") fixed a security issue, it added an LSM check when > trying to remap file pages, so that LSMs have the opportunity to evaluate > such action like for other memory operations such as mmap() and mprotect(). > > However, that commit called security_mmap_file() inside the mmap_lock lock, > while the other calls do it before taking the lock, after commit > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem"). > > This caused lock inversion issue with IMA which was taking the mmap_lock > and i_mutex lock in the opposite way when the remap_file_pages() system > call was called. > > Solve the issue by splitting the critical region in remap_file_pages() in > two regions: the first takes a read lock of mmap_lock, retrieves the VMA > and the file descriptor associated, and calculates the 'prot' and 'flags' > variables; the second takes a write lock on mmap_lock, checks that the VMA > flags and the VMA file descriptor are the same as the ones obtained in the > first critical region (otherwise the system call fails), and calls > do_mmap(). > > In between, after releasing the read lock and before taking the write lock, > call security_mmap_file(), and solve the lock inversion issue. > > Cc: stable@vger.kernel.org # v6.12-rcx > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") > Reported-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/ > Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> > Reviewed-by: Jann Horn <jannh@google.com> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Tested-by: Roberto Sassu <roberto.sassu@huawei.com> > Tested-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > --- > mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++-------------- > 1 file changed, 52 insertions(+), 17 deletions(-) Thanks for working on this Roberto, Kirill, and everyone else who had a hand in reviewing and testing. Reviewed-by: Paul Moore <paul@paul-moore.com> Andrew, I see you're pulling this into the MM/hotfixes-unstable branch, do you also plan to send this up to Linus soon/next-week? If so, great, if not let me know and I can send it up via the LSM tree. We need to get clarity around Roberto's sign-off, but I think that is more of an administrative mistake rather than an intentional omission :)
On Sat, 2024-10-19 at 11:34 -0400, Paul Moore wrote: > On Fri, Oct 18, 2024 at 12:15 PM Roberto Sassu > <roberto.sassu@huaweicloud.com> wrote: > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> > > > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in > > remap_file_pages()") fixed a security issue, it added an LSM check when > > trying to remap file pages, so that LSMs have the opportunity to evaluate > > such action like for other memory operations such as mmap() and mprotect(). > > > > However, that commit called security_mmap_file() inside the mmap_lock lock, > > while the other calls do it before taking the lock, after commit > > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem"). > > > > This caused lock inversion issue with IMA which was taking the mmap_lock > > and i_mutex lock in the opposite way when the remap_file_pages() system > > call was called. > > > > Solve the issue by splitting the critical region in remap_file_pages() in > > two regions: the first takes a read lock of mmap_lock, retrieves the VMA > > and the file descriptor associated, and calculates the 'prot' and 'flags' > > variables; the second takes a write lock on mmap_lock, checks that the VMA > > flags and the VMA file descriptor are the same as the ones obtained in the > > first critical region (otherwise the system call fails), and calls > > do_mmap(). > > > > In between, after releasing the read lock and before taking the write lock, > > call security_mmap_file(), and solve the lock inversion issue. > > > > Cc: stable@vger.kernel.org # v6.12-rcx > > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") > > Reported-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@google.com/ > > Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> > > Reviewed-by: Jann Horn <jannh@google.com> > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > Tested-by: Roberto Sassu <roberto.sassu@huawei.com> > > Tested-by: syzbot+1cd571a672400ef3a930@syzkaller.appspotmail.com > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > --- > > mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++-------------- > > 1 file changed, 52 insertions(+), 17 deletions(-) > > Thanks for working on this Roberto, Kirill, and everyone else who had > a hand in reviewing and testing. Welcome! > Reviewed-by: Paul Moore <paul@paul-moore.com> > > Andrew, I see you're pulling this into the MM/hotfixes-unstable > branch, do you also plan to send this up to Linus soon/next-week? If > so, great, if not let me know and I can send it up via the LSM tree. > > We need to get clarity around Roberto's sign-off, but I think that is > more of an administrative mistake rather than an intentional omission > :) Ops, I just thought that I would not need to add it, since I'm not the author of the patch. Please add my: Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Roberto
On Sat, 19 Oct 2024 11:34:08 -0400 Paul Moore <paul@paul-moore.com> wrote: > > mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++-------------- > > 1 file changed, 52 insertions(+), 17 deletions(-) > > Thanks for working on this Roberto, Kirill, and everyone else who had > a hand in reviewing and testing. > > Reviewed-by: Paul Moore <paul@paul-moore.com> > > Andrew, I see you're pulling this into the MM/hotfixes-unstable > branch, do you also plan to send this up to Linus soon/next-week? If > so, great, if not let me know and I can send it up via the LSM tree. In the normal course of things I'd send it upstream next week, but I can include it in this week's batch if we know that -next testing is hurting from it?
On Mon, Oct 21, 2024 at 11:06 PM Andrew Morton <akpm@linux-foundation.org> wrote: > On Sat, 19 Oct 2024 11:34:08 -0400 Paul Moore <paul@paul-moore.com> wrote: > > > > mm/mmap.c | 69 +++++++++++++++++++++++++++++++++++++++++-------------- > > > 1 file changed, 52 insertions(+), 17 deletions(-) > > > > Thanks for working on this Roberto, Kirill, and everyone else who had > > a hand in reviewing and testing. > > > > Reviewed-by: Paul Moore <paul@paul-moore.com> > > > > Andrew, I see you're pulling this into the MM/hotfixes-unstable > > branch, do you also plan to send this up to Linus soon/next-week? If > > so, great, if not let me know and I can send it up via the LSM tree. > > In the normal course of things I'd send it upstream next week ... That sounds good to me, I just wanted to make sure there was a path forward to get into Linus' tree. > ... but I > can include it in this week's batch if we know that -next testing is > hurting from it? I don't believe so, I think this was just a syzbot gotcha. From what I understand remap_file_pages(2) isn't used much anymore.
diff --git a/mm/mmap.c b/mm/mmap.c index 9c0fb43064b5..f731dd69e162 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, unsigned long populate = 0; unsigned long ret = -EINVAL; struct file *file; + vm_flags_t vm_flags; pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n", current->comm, current->pid); @@ -1656,12 +1657,60 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, if (pgoff + (size >> PAGE_SHIFT) < pgoff) return ret; - if (mmap_write_lock_killable(mm)) + if (mmap_read_lock_killable(mm)) return -EINTR; + /* + * Look up VMA under read lock first so we can perform the security + * without holding locks (which can be problematic). We reacquire a + * write lock later and check nothing changed underneath us. + */ vma = vma_lookup(mm, start); - if (!vma || !(vma->vm_flags & VM_SHARED)) + if (!vma || !(vma->vm_flags & VM_SHARED)) { + mmap_read_unlock(mm); + return -EINVAL; + } + + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; + + flags &= MAP_NONBLOCK; + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; + if (vma->vm_flags & VM_LOCKED) + flags |= MAP_LOCKED; + + /* Save vm_flags used to calculate prot and flags, and recheck later. */ + vm_flags = vma->vm_flags; + file = get_file(vma->vm_file); + + mmap_read_unlock(mm); + + /* Call outside mmap_lock to be consistent with other callers. */ + ret = security_mmap_file(file, prot, flags); + if (ret) { + fput(file); + return ret; + } + + ret = -EINVAL; + + /* OK security check passed, take write lock + let it rip. */ + if (mmap_write_lock_killable(mm)) { + fput(file); + return -EINTR; + } + + vma = vma_lookup(mm, start); + + if (!vma) + goto out; + + /* Make sure things didn't change under us. */ + if (vma->vm_flags != vm_flags) + goto out; + if (vma->vm_file != file) goto out; if (start + size > vma->vm_end) { @@ -1689,25 +1738,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, goto out; } - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; - - flags &= MAP_NONBLOCK; - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; - if (vma->vm_flags & VM_LOCKED) - flags |= MAP_LOCKED; - - file = get_file(vma->vm_file); - ret = security_mmap_file(vma->vm_file, prot, flags); - if (ret) - goto out_fput; ret = do_mmap(vma->vm_file, start, size, prot, flags, 0, pgoff, &populate, NULL); -out_fput: - fput(file); out: mmap_write_unlock(mm); + fput(file); if (populate) mm_populate(ret, populate); if (!IS_ERR_VALUE(ret))