Message ID | 20230515130553.2311248-4-jeffxu@chromium.org (mailing list archive) |
---|---|
State | New |
Series | Memory Mapping (VMA) protection using PKU - set 1 |
On Mon, May 15, 2023 at 01:05:49PM +0000, jeffxu@chromium.org wrote:
> From: Jeff Xu <jeffxu@google.com>
>
> This patch enables PKEY_ENFORCE_API for the mprotect and
> mprotect_pkey syscalls.

All callers are from userspace -- this change looks like a no-op?

-Kees

On Tue, May 16, 2023 at 1:07 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, May 15, 2023 at 01:05:49PM +0000, jeffxu@chromium.org wrote:
> > From: Jeff Xu <jeffxu@google.com>
> >
> > This patch enables PKEY_ENFORCE_API for the mprotect and
> > mprotect_pkey syscalls.
>
> All callers are from userspace -- this change looks like a no-op?
>
Yes. All callers are from user space now.
I am thinking about the future when someone adds a caller in kernel
code and may miss the check.
This is also consistent with munmap and other syscalls I plan to change.
There are comments on do_mprotect_pkey() to describe how this flag is used.

> -Kees
>
> --
> Kees Cook

On 5/15/23 06:05, jeffxu@chromium.org wrote:
>  /*
>   * pkey==-1 when doing a legacy mprotect()
> + * syscall==true if this is called by syscall from userspace.
> + * Note: this is always true for now, added as a reminder in case that
> + * do_mprotect_pkey is called directly by kernel in the future.
> + * Also it is consistent with __do_munmap().
>   */
>  static int do_mprotect_pkey(unsigned long start, size_t len,
> -		unsigned long prot, int pkey)
> +		unsigned long prot, int pkey, bool syscall)
>  {

The 'syscall' seems kinda silly (and a bit confusing). It's easy to
check if the caller is a kthread or has a current->mm==NULL. If you
*really* want a warning, I'd check for those rather than plumb a
apparently unused argument in here.

BTW, this warning is one of those things that will probably cause some
amount of angst. I'd move it to the end of the series or just axe it
completely.

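The alternative suggested in the reply above (detecting a kernel-thread caller or a missing mm instead of passing a flag) can be sketched in userspace as follows. This is only an illustrative model, not kernel code: `struct mock_task`, `struct mock_mm`, and `task_has_user_context()` are hypothetical names introduced here, standing in for `current`, its `mm`, and whatever helper such a check would live in. Only the `PF_KTHREAD` flag value is taken from the kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's PF_KTHREAD flag ("I am a kernel thread"). */
#define PF_KTHREAD 0x00200000u

/* Hypothetical stand-in for the kernel's mm_struct. */
struct mock_mm {
	int unused;
};

/* Hypothetical stand-in for the two task_struct fields of interest. */
struct mock_task {
	unsigned int flags;   /* models task_struct::flags */
	struct mock_mm *mm;   /* models task_struct::mm, NULL without a user address space */
};

/*
 * Returns true when the task looks like an ordinary userspace caller:
 * it is not a kernel thread and it owns an address space. A warning
 * would only make sense when this returns false.
 */
bool task_has_user_context(const struct mock_task *t)
{
	if (t->flags & PF_KTHREAD)
		return false;
	return t->mm != NULL;
}
```

With a check of this shape, the callee can decide for itself whether it was reached from a userspace context, so no extra argument has to be threaded through every call site.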
On Tue, May 16, 2023 at 4:19 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 5/15/23 06:05, jeffxu@chromium.org wrote:
> >  /*
> >   * pkey==-1 when doing a legacy mprotect()
> > + * syscall==true if this is called by syscall from userspace.
> > + * Note: this is always true for now, added as a reminder in case that
> > + * do_mprotect_pkey is called directly by kernel in the future.
> > + * Also it is consistent with __do_munmap().
> >   */
> >  static int do_mprotect_pkey(unsigned long start, size_t len,
> > -		unsigned long prot, int pkey)
> > +		unsigned long prot, int pkey, bool syscall)
> >  {
>
> The 'syscall' seems kinda silly (and a bit confusing). It's easy to
> check if the caller is a kthread or has a current->mm==NULL. If you
> *really* want a warning, I'd check for those rather than plumb a
> apparently unused argument in here.
>
> BTW, this warning is one of those things that will probably cause some
> amount of angst. I'd move it to the end of the series or just axe it
> completely.

Agreed. syscall is not a good name here.
The intention is to check this at the system call entry point
For example, munmap can get called inside mremap(), but by that time
mremap() should already check that all the memory is writeable.

I will remove "syscall" from do_mprotect_pkey signature, it seems it caused
more confusion than helpful. I will keep the comments/note in place to remind
future developer.

On Tue, May 16, 2023 at 4:37 PM Jeff Xu <jeffxu@google.com> wrote:
>
> On Tue, May 16, 2023 at 4:19 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > On 5/15/23 06:05, jeffxu@chromium.org wrote:
> > >  /*
> > >   * pkey==-1 when doing a legacy mprotect()
> > > + * syscall==true if this is called by syscall from userspace.
> > > + * Note: this is always true for now, added as a reminder in case that
> > > + * do_mprotect_pkey is called directly by kernel in the future.
> > > + * Also it is consistent with __do_munmap().
> > >   */
> > >  static int do_mprotect_pkey(unsigned long start, size_t len,
> > > -		unsigned long prot, int pkey)
> > > +		unsigned long prot, int pkey, bool syscall)
> > >  {
> >
> > The 'syscall' seems kinda silly (and a bit confusing). It's easy to
> > check if the caller is a kthread or has a current->mm==NULL. If you
> > *really* want a warning, I'd check for those rather than plumb a
> > apparently unused argument in here.
> >
> > BTW, this warning is one of those things that will probably cause some
> > amount of angst. I'd move it to the end of the series or just axe it
> > completely.
>
Okay, I will move the logging part to the end of the series.

> Agreed. syscall is not a good name here.
> The intention is to check this at the system call entry point
> For example, munmap can get called inside mremap(), but by that time
> mremap() should already check that all the memory is writeable.
>
> I will remove "syscall" from do_mprotect_pkey signature, it seems it caused
> more confusion than helpful. I will keep the comments/note in place to remind
> future developer.

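The "check at the system call entry point" idea from the reply above can be illustrated with a small userspace model. Everything here is a hypothetical sketch: `struct mock_region` and the `mock_*` functions are invented for illustration and do not correspond to real kernel functions or to the series' actual implementation. The point is only the layering: entry points perform the permission check once, and the internal helper they share does not repeat it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one mapping. */
struct mock_region {
	bool enforced;          /* stands in for PKEY_ENFORCE_API protection */
	bool caller_has_access; /* whether the calling thread's pkey rights allow changes */
	bool mapped;
};

/* Entry-point check: deny changes to enforced memory without access. */
int mock_check_enforce(const struct mock_region *r)
{
	if (r->enforced && !r->caller_has_access)
		return -EACCES;
	return 0;
}

/* Internal helper: performs the operation, never re-checks permissions. */
int mock_do_munmap(struct mock_region *r)
{
	r->mapped = false;
	return 0;
}

/* Models the munmap entry point: check first, then delegate. */
int mock_sys_munmap(struct mock_region *r)
{
	int err = mock_check_enforce(r);

	if (err)
		return err;
	return mock_do_munmap(r);
}

/*
 * Models the mremap entry point: it checks once on entry, then reuses
 * the internal unmap helper for the old range without a second check.
 */
int mock_sys_mremap(struct mock_region *old)
{
	int err = mock_check_enforce(old);

	if (err)
		return err;
	/* (moving the mapping is omitted in this model) */
	return mock_do_munmap(old);
}
```

In this arrangement a denied `mock_sys_munmap()` leaves the region mapped and returns `-EACCES`, while `mock_sys_mremap()` on an accessible region unmaps the old range through the unchecked helper.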
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 8a68fdca8487..1378be50567d 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -727,9 +727,13 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 
 /*
  * pkey==-1 when doing a legacy mprotect()
+ * syscall==true if this is called by syscall from userspace.
+ * Note: this is always true for now, added as a reminder in case that
+ * do_mprotect_pkey is called directly by kernel in the future.
+ * Also it is consistent with __do_munmap().
  */
 static int do_mprotect_pkey(unsigned long start, size_t len,
-		unsigned long prot, int pkey)
+		unsigned long prot, int pkey, bool syscall)
 {
 	unsigned long nstart, end, tmp, reqprot;
 	struct vm_area_struct *vma, *prev;
@@ -794,6 +798,21 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 		}
 	}
 
+	/*
+	 * When called by syscall from userspace, check if the calling
+	 * thread has the PKEY permission to modify the memory mapping.
+	 */
+	if (syscall &&
+	    arch_check_pkey_enforce_api(current->mm, start, end) < 0) {
+		char comm[TASK_COMM_LEN];
+
+		pr_warn_ratelimited(
+			"munmap was denied on PKEY_ENFORCE_API memory, pid=%d '%s'\n",
+			task_pid_nr(current), get_task_comm(comm, current));
+		error = -EACCES;
+		goto out;
+	}
+
 	prev = vma_prev(&vmi);
 	if (start > vma->vm_start)
 		prev = vma;
@@ -878,7 +897,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
 		unsigned long, prot)
 {
-	return do_mprotect_pkey(start, len, prot, -1);
+	return do_mprotect_pkey(start, len, prot, -1, true);
 }
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
@@ -886,7 +905,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long, start, size_t, len,
 SYSCALL_DEFINE4(pkey_mprotect, unsigned long, start, size_t, len,
 		unsigned long, prot, int, pkey)
 {
-	return do_mprotect_pkey(start, len, prot, pkey);
+	return do_mprotect_pkey(start, len, prot, pkey, true);
 }
 
 SYSCALL_DEFINE2(pkey_alloc, unsigned long, flags, unsigned long, init_val)