Message ID | 20210108205857.1471269-1-surenb@google.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Paul Moore |
Series | [1/1] mm/madvise: replace ptrace attach requirement for process_madvise |
On Fri, Jan 08, 2021 at 12:58:57PM -0800, Suren Baghdasaryan wrote:
> process_madvise currently requires ptrace attach capability.
> PTRACE_MODE_ATTACH gives one process complete control over another
> process. It effectively removes the security boundary between the
> two processes (in one direction). Granting ptrace attach capability
> even to a system process is considered dangerous since it creates an
> attack surface. This severely limits the usage of this API.
> The operations process_madvise can perform do not affect the correctness
> of the operation of the target process; they only affect where the data
> is physically located (and therefore, how fast it can be accessed).
> What we want is the ability for one process to influence another process
> in order to optimize performance across the entire system while leaving
> the security boundary intact.
> Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ
> and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata
> and CAP_SYS_NICE for influencing process performance.
>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

It sounds logical to me.
If security folks don't see any concern and fix below,

Acked-by: Minchan Kim <minchan@kernel.org>

> @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  		goto release_task;
>  	}
>
> -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
>  	if (IS_ERR_OR_NULL(mm)) {
>  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
>  		goto release_task;
>  	}
>
> +	/*
> +	 * Require CAP_SYS_NICE for influencing process performance. Note that
> +	 * only non-destructive hints are currently supported.
> +	 */
> +	if (!capable(CAP_SYS_NICE)) {
> +		ret = -EPERM;
> +		goto release_task;

mmput?

> +	}
> +
>  	total_len = iov_iter_count(&iter);
>
>  	while (iov_iter_count(&iter)) {
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
>

On Fri, Jan 8, 2021 at 2:15 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Fri, Jan 08, 2021 at 12:58:57PM -0800, Suren Baghdasaryan wrote:
> > process_madvise currently requires ptrace attach capability.
> > PTRACE_MODE_ATTACH gives one process complete control over another
> > process. It effectively removes the security boundary between the
> > two processes (in one direction). Granting ptrace attach capability
> > even to a system process is considered dangerous since it creates an
> > attack surface. This severely limits the usage of this API.
> > The operations process_madvise can perform do not affect the correctness
> > of the operation of the target process; they only affect where the data
> > is physically located (and therefore, how fast it can be accessed).
> > What we want is the ability for one process to influence another process
> > in order to optimize performance across the entire system while leaving
> > the security boundary intact.
> > Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ
> > and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata
> > and CAP_SYS_NICE for influencing process performance.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
>
> It sounds logical to me.
> If security folks don't see any concern and fix below,
>
> Acked-by: Minchan Kim <minchan@kernel.org>
>
> > @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> >  		goto release_task;
> >  	}
> >
> > -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> > +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> > +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> >  	if (IS_ERR_OR_NULL(mm)) {
> >  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> >  		goto release_task;
> >  	}
> >
> > +	/*
> > +	 * Require CAP_SYS_NICE for influencing process performance. Note that
> > +	 * only non-destructive hints are currently supported.
> > +	 */
> > +	if (!capable(CAP_SYS_NICE)) {
> > +		ret = -EPERM;
> > +		goto release_task;
>
> mmput?

Ouch! Thanks for pointing it out! Will include in the next respin.

>
> > +	}
> > +
> >  	total_len = iov_iter_count(&iter);
> >
> >  	while (iov_iter_count(&iter)) {
> > --
> > 2.30.0.284.gd98b1dd5eaa7-goog
> >

On Fri, 8 Jan 2021, Suren Baghdasaryan wrote:

> > > @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> > >  		goto release_task;
> > >  	}
> > >
> > > -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> > > +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> > > +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> > >  	if (IS_ERR_OR_NULL(mm)) {
> > >  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> > >  		goto release_task;
> > >  	}
> > >
> > > +	/*
> > > +	 * Require CAP_SYS_NICE for influencing process performance. Note that
> > > +	 * only non-destructive hints are currently supported.
> > > +	 */
> > > +	if (!capable(CAP_SYS_NICE)) {
> > > +		ret = -EPERM;
> > > +		goto release_task;
> >
> > mmput?
>
> Ouch! Thanks for pointing it out! Will include in the next respin.
>

With the fix, feel free to add:

Acked-by: David Rientjes <rientjes@google.com>

Thanks Suren!

On Fri, Jan 8, 2021 at 5:02 PM David Rientjes <rientjes@google.com> wrote:
>
> On Fri, 8 Jan 2021, Suren Baghdasaryan wrote:
>
> > > > @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> > > >  		goto release_task;
> > > >  	}
> > > >
> > > > -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> > > > +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> > > > +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> > > >  	if (IS_ERR_OR_NULL(mm)) {
> > > >  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> > > >  		goto release_task;
> > > >  	}
> > > >
> > > > +	/*
> > > > +	 * Require CAP_SYS_NICE for influencing process performance. Note that
> > > > +	 * only non-destructive hints are currently supported.
> > > > +	 */
> > > > +	if (!capable(CAP_SYS_NICE)) {
> > > > +		ret = -EPERM;
> > > > +		goto release_task;
> > >
> > > mmput?
> >
> > Ouch! Thanks for pointing it out! Will include in the next respin.
> >
>
> With the fix, feel free to add:
>
> Acked-by: David Rientjes <rientjes@google.com>

Thanks! Will post a new version with the fix on Monday.

>
> Thanks Suren!

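The "mmput?" remark above concerns the unwind order on the new error path: by the time the capability check runs, mm_access() has already succeeded, so jumping straight to release_task leaves the mm reference held. The following is a minimal userspace sketch of that pattern, not the kernel code. The struct fake_ref counters, the fixed flag, and the release_mm label are all hypothetical names introduced for illustration; the thread does not show how the respin actually names its labels.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted kernel object (task or mm). */
struct fake_ref {
	int count;
};

static void ref_get(struct fake_ref *r)
{
	r->count++;
}

static void ref_put(struct fake_ref *r)
{
	assert(r->count > 0);	/* catch an unbalanced put */
	r->count--;
}

/*
 * Mirrors the shape of the patched syscall: take a task reference, take an
 * mm reference, then run the capability check. With fixed == false the
 * failing check jumps to release_task as in the posted patch, so the mm
 * reference is never dropped. With fixed == true it unwinds through
 * release_mm first (label name assumed for this sketch).
 */
static int advise_sketch(struct fake_ref *task, struct fake_ref *mm,
			 bool has_cap_sys_nice, bool fixed)
{
	int ret = 0;

	ref_get(task);	/* stands in for looking up the task */
	ref_get(mm);	/* stands in for mm_access() succeeding */

	if (!has_cap_sys_nice) {
		ret = -EPERM;
		if (fixed)
			goto release_mm;
		goto release_task;	/* posted behaviour: mm stays referenced */
	}

	/* ... the per-iovec advice loop would run here ... */

release_mm:
	ref_put(mm);
release_task:
	ref_put(task);
	return ret;
}
```

Calling advise_sketch() without the capability and with fixed set to false leaves mm.count at 1 after the call, which is the leak being pointed out; with fixed set to true both counters return to 0.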
* Suren Baghdasaryan:

> diff --git a/mm/madvise.c b/mm/madvise.c
> index 6a660858784b..c2d600386902 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  		goto release_task;
>  	}
>
> -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
>  	if (IS_ERR_OR_NULL(mm)) {
>  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
>  		goto release_task;
>  	}

Shouldn't this depend on the requested behavior? Several operations
directly result in observable changes, and go beyond performance tuning.

Thanks,
Florian

On Mon, Jan 11, 2021 at 2:20 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Suren Baghdasaryan:
>
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 6a660858784b..c2d600386902 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> >  		goto release_task;
> >  	}
> >
> > -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> > +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> > +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> >  	if (IS_ERR_OR_NULL(mm)) {
> >  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> >  		goto release_task;
> >  	}
>
> Shouldn't this depend on the requested behavior? Several operations
> directly result in observable changes, and go beyond performance tuning.

Thanks for the comment Florian.
process_madvise supports only MADV_COLD and MADV_PAGEOUT hints which
are both non-destructive (see process_madvise_behavior_valid()
function). Maybe you meant something else by "observable changes", if
so please clarify.
Thanks,
Suren.

>
> Thanks,
> Florian

On Mon, Jan 11, 2021 at 9:05 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Mon, Jan 11, 2021 at 2:20 AM Florian Weimer <fweimer@redhat.com> wrote:
> >
> > * Suren Baghdasaryan:
> >
> > > diff --git a/mm/madvise.c b/mm/madvise.c
> > > index 6a660858784b..c2d600386902 100644
> > > --- a/mm/madvise.c
> > > +++ b/mm/madvise.c
> > > @@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> > >  		goto release_task;
> > >  	}
> > >
> > > -	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
> > > +	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
> > > +	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
> > >  	if (IS_ERR_OR_NULL(mm)) {
> > >  		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
> > >  		goto release_task;
> > >  	}
> >
> > Shouldn't this depend on the requested behavior? Several operations
> > directly result in observable changes, and go beyond performance tuning.
>
> Thanks for the comment Florian.
> process_madvise supports only MADV_COLD and MADV_PAGEOUT hints which
> are both non-destructive (see process_madvise_behavior_valid()
> function). Maybe you meant something else by "observable changes", if
> so please clarify.
> Thanks,
> Suren.
>

V2 with Minchan's fix is posted at:
https://lore.kernel.org/lkml/20210111170622.2613577-1-surenb@google.com/T/#u

> >
> > Thanks,
> > Florian

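Suren's answer to Florian rests on the hint allowlist: only MADV_COLD and MADV_PAGEOUT are accepted for a remote process, so destructive hints never reach the new permission check. A rough userspace sketch of such an allowlist follows. It is modelled on the description of process_madvise_behavior_valid() given in the thread, not copied from the kernel; remote_hint_allowed() is a hypothetical name, and the fallback constant values are the generic Linux ones, supplied here only in case older libc headers lack them.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers (assumed generic Linux values). */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif
#ifndef MADV_DONTNEED
#define MADV_DONTNEED 4
#endif

/* The two permitted hints must be distinct for the switch below to compile. */
static_assert(MADV_COLD != MADV_PAGEOUT, "hint values must differ");

/*
 * Hypothetical allowlist: accept only the two non-destructive hints named in
 * the thread and reject everything else, including hints such as
 * MADV_DONTNEED that discard page contents.
 */
static bool remote_hint_allowed(int behavior)
{
	switch (behavior) {
	case MADV_COLD:
	case MADV_PAGEOUT:
		return true;
	default:
		return false;
	}
}
```

With a check of this shape in front of the permission test, the weaker PTRACE_MODE_READ plus CAP_SYS_NICE requirement only ever gates hints that move pages, not hints that change what the target reads back.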
diff --git a/mm/madvise.c b/mm/madvise.c
index 6a660858784b..c2d600386902 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1197,12 +1197,22 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		goto release_task;
 	}
 
-	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
+	/* Require PTRACE_MODE_READ to avoid leaking ASLR metadata. */
+	mm = mm_access(task, PTRACE_MODE_READ_FSCREDS);
 	if (IS_ERR_OR_NULL(mm)) {
 		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
 		goto release_task;
 	}
 
+	/*
+	 * Require CAP_SYS_NICE for influencing process performance. Note that
+	 * only non-destructive hints are currently supported.
+	 */
+	if (!capable(CAP_SYS_NICE)) {
+		ret = -EPERM;
+		goto release_task;
+	}
+
 	total_len = iov_iter_count(&iter);
 
 	while (iov_iter_count(&iter)) {

process_madvise currently requires ptrace attach capability.
PTRACE_MODE_ATTACH gives one process complete control over another
process. It effectively removes the security boundary between the
two processes (in one direction). Granting ptrace attach capability
even to a system process is considered dangerous since it creates an
attack surface. This severely limits the usage of this API.
The operations process_madvise can perform do not affect the correctness
of the operation of the target process; they only affect where the data
is physically located (and therefore, how fast it can be accessed).
What we want is the ability for one process to influence another process
in order to optimize performance across the entire system while leaving
the security boundary intact.
Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ
and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata
and CAP_SYS_NICE for influencing process performance.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/madvise.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
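
The change in policy described by the commit message can be summarised as a small decision function: before, a single attach-level check; after, a read-level check followed by a capability check. The sketch below models that in userspace under stated assumptions. struct caller_rights, check_before() and check_after() are hypothetical names, and -EACCES as the result of a failed ptrace access check is an assumption of this sketch; only the -EPERM for a missing CAP_SYS_NICE appears in the posted diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what the caller may do to the target process. */
struct caller_rights {
	bool ptrace_read;	/* would pass a PTRACE_MODE_READ check */
	bool ptrace_attach;	/* would pass a PTRACE_MODE_ATTACH check */
	bool cap_sys_nice;	/* holds CAP_SYS_NICE */
};

/* Before the patch: only the attach-level check (error code assumed). */
static int check_before(const struct caller_rights *c)
{
	assert(c != NULL);
	return c->ptrace_attach ? 0 : -EACCES;
}

/* After the patch: read-level check first, then the capability. */
static int check_after(const struct caller_rights *c)
{
	assert(c != NULL);
	if (!c->ptrace_read)
		return -EACCES;	/* assumed error code for the ptrace check */
	if (!c->cap_sys_nice)
		return -EPERM;	/* matches the posted diff */
	return 0;
}
```

A system daemon that holds CAP_SYS_NICE and read-level access but no attach-level access is rejected by check_before() and accepted by check_after(), which is the use case the commit message argues for. Conversely, a caller with attach-level access but no CAP_SYS_NICE passes the old check and fails the new one.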