diff mbox series

perf: Require CAP_KILL if sigtrap is requested

Message ID 20210630093709.3612997-1-elver@google.com (mailing list archive)
State New, archived
Headers show
Series perf: Require CAP_KILL if sigtrap is requested | expand

Commit Message

Marco Elver June 30, 2021, 9:37 a.m. UTC
If perf_event_open() is called with another task as target and
perf_event_attr::sigtrap is set, and the target task's user does not
match the calling user, also require the CAP_KILL capability.

Otherwise, with the CAP_PERFMON capability alone it would be possible
for a user to send SIGTRAP signals via perf events to another user's
tasks. This could potentially result in those tasks being terminated if
they cannot handle SIGTRAP signals.

Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
---
 include/linux/capability.h |  5 +++++
 kernel/events/core.c       | 13 ++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

Comments

Ondrej Mosnacek June 30, 2021, 11:13 a.m. UTC | #1
On Wed, Jun 30, 2021 at 11:38 AM Marco Elver <elver@google.com> wrote:
> If perf_event_open() is called with another task as target and
> perf_event_attr::sigtrap is set, and the target task's user does not
> match the calling user, also require the CAP_KILL capability.
>
> Otherwise, with the CAP_PERFMON capability alone it would be possible
> for a user to send SIGTRAP signals via perf events to another user's
> tasks. This could potentially result in those tasks being terminated if
> they cannot handle SIGTRAP signals.
>
> Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  include/linux/capability.h |  5 +++++
>  kernel/events/core.c       | 13 ++++++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 65efb74c3585..1c6be4743dbe 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -264,6 +264,11 @@ static inline bool bpf_capable(void)
>         return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
>  }
>
> +static inline bool kill_capable(void)
> +{
> +       return capable(CAP_KILL) || capable(CAP_SYS_ADMIN);

Is it really necessary to fall back to CAP_SYS_ADMIN here? CAP_PERFMON
and CAP_BPF have been split off from CAP_SYS_ADMIN recently, so they
have it for backwards compatibility. You are adding a new restriction
for a very specific action, so I don't think the fallback is needed.

> +}
> +
>  static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
>  {
>         return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index fe88d6eea3c2..1ab4bc867531 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -12152,10 +12152,21 @@ SYSCALL_DEFINE5(perf_event_open,
>         }
>
>         if (task) {
> +               bool is_capable;
> +
>                 err = down_read_interruptible(&task->signal->exec_update_lock);
>                 if (err)
>                         goto err_file;
>
> +               is_capable = perfmon_capable();
> +               if (attr.sigtrap) {
> +                       /*
> +                        * perf_event_attr::sigtrap sends signals to the other
> +                        * task. Require the current task to have CAP_KILL.
> +                        */
> +                       is_capable &= kill_capable();

Is it necessary to do all this dance just to call perfmon_capable()
first? Couldn't this be simply:

err = -EPERM;
if (attr.sigtrap && !capable(CAP_KILL))
        goto err_cred;

Also, looking at kill_ok_by_cred() in kernel/signal.c, would it
perhaps be more appropriate to do
ns_capable(__task_cred(task)->user_ns, CAP_KILL) instead? (There might
also need to be some careful locking around getting the target task's
creds - I'm not sure...)

> +               }
> +
>                 /*
>                  * Preserve ptrace permission check for backwards compatibility.
>                  *
> @@ -12165,7 +12176,7 @@ SYSCALL_DEFINE5(perf_event_open,
>                  * perf_event_exit_task() that could imply).
>                  */
>                 err = -EACCES;

BTW, shouldn't this (and several other such cases in this file...)
actually be EPERM, as is the norm for capability checks?

> -               if (!perfmon_capable() && !ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
> +               if (!is_capable && !ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
>                         goto err_cred;
>         }
>
> --
> 2.32.0.93.g670b81a890-goog
>
Marco Elver June 30, 2021, 12:42 p.m. UTC | #2
On Wed, Jun 30, 2021 at 01:13PM +0200, Ondrej Mosnacek wrote:
> On Wed, Jun 30, 2021 at 11:38 AM Marco Elver <elver@google.com> wrote:
[...]
> > +static inline bool kill_capable(void)
> > +{
> > +       return capable(CAP_KILL) || capable(CAP_SYS_ADMIN);
> 
> Is it really necessary to fall back to CAP_SYS_ADMIN here? CAP_PERFMON
> and CAP_BPF have been split off from CAP_SYS_ADMIN recently, so they
> have it for backwards compatibility. You are adding a new restriction
> for a very specific action, so I don't think the fallback is needed.

That means someone having CAP_SYS_ADMIN, but not CAP_KILL, can't perform
the desired action. Is this what you'd like?

If so, I'll just remove the wrapper, and call capable(CAP_KILL)
directly.

> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index fe88d6eea3c2..1ab4bc867531 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -12152,10 +12152,21 @@ SYSCALL_DEFINE5(perf_event_open,
> >         }
> >
> >         if (task) {
> > +               bool is_capable;
> > +
> >                 err = down_read_interruptible(&task->signal->exec_update_lock);
> >                 if (err)
> >                         goto err_file;
> >
> > +               is_capable = perfmon_capable();
> > +               if (attr.sigtrap) {
> > +                       /*
> > +                        * perf_event_attr::sigtrap sends signals to the other
> > +                        * task. Require the current task to have CAP_KILL.
> > +                        */
> > +                       is_capable &= kill_capable();
> 
> Is it necessary to do all this dance just to call perfmon_capable()
> first? Couldn't this be simply:
> 
> err = -EPERM;
> if (attr.sigtrap && !capable(CAP_KILL))
>         goto err_cred;

Not so much about perfmon_capable() but about the ptrace_may_access()
check. The condition here is supposed to be:

	want CAP_PERFMON and (CAP_KILL if sigtrap)
		OR
        want ptrace access (which includes a check for same thread-group and uid)

If we did what you propose, then the ptrace check is effectively ignored
if attr.sigtrap, and that's not what we want.

There are lots of other ways of writing the same thing, but it should
also remain readable and sticking it all into the same condition is not
readable.

> Also, looking at kill_ok_by_cred() in kernel/signal.c, would it
> perhaps be more appropriate to do
> ns_capable(__task_cred(task)->user_ns, CAP_KILL) instead? (There might
> also need to be some careful locking around getting the target task's
> creds - I'm not sure...)
 
That might make sense. AFAIK, the locking is already in place via
exec_update_lock. Let me investigate.

> > +               }
> > +
> >                 /*
> >                  * Preserve ptrace permission check for backwards compatibility.
> >                  *
> > @@ -12165,7 +12176,7 @@ SYSCALL_DEFINE5(perf_event_open,
> >                  * perf_event_exit_task() that could imply).
> >                  */
> >                 err = -EACCES;
> 
> BTW, shouldn't this (and several other such cases in this file...)
> actually be EPERM, as is the norm for capability checks?

I'm not a perf maintainer, so I can't give you a definitive answer.
But, this would change the ABI, so I don't think it's realistic to
request this change at this point unfortunately.

Thanks.
Ondrej Mosnacek July 1, 2021, 7:56 a.m. UTC | #3
On Wed, Jun 30, 2021 at 2:43 PM Marco Elver <elver@google.com> wrote:
> On Wed, Jun 30, 2021 at 01:13PM +0200, Ondrej Mosnacek wrote:
> > On Wed, Jun 30, 2021 at 11:38 AM Marco Elver <elver@google.com> wrote:
> [...]
> > > +static inline bool kill_capable(void)
> > > +{
> > > +       return capable(CAP_KILL) || capable(CAP_SYS_ADMIN);
> >
> > Is it really necessary to fall back to CAP_SYS_ADMIN here? CAP_PERFMON
> > and CAP_BPF have been split off from CAP_SYS_ADMIN recently, so they
> > have it for backwards compatibility. You are adding a new restriction
> > for a very specific action, so I don't think the fallback is needed.
>
> That means someone having CAP_SYS_ADMIN, but not CAP_KILL, can't perform
> the desired action. Is this what you'd like?

AFAIK, such user wouldn't be allowed to directly send a signal to a
different process either. So I think it makes more sense to be
consistent with the existing/main CAP_KILL usage rather than with the
CAP_PERFMON usage (which has its own reason to have that fallback).

I'm not the authority on capabilities nor the perf subsystem, it just
didn't seem quite right to me so I wanted to raise the concern.
Hopefully someone wiser than me will speak up if I talk nonsense :)

> If so, I'll just remove the wrapper, and call capable(CAP_KILL)
> directly.
>
> > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > index fe88d6eea3c2..1ab4bc867531 100644
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -12152,10 +12152,21 @@ SYSCALL_DEFINE5(perf_event_open,
> > >         }
> > >
> > >         if (task) {
> > > +               bool is_capable;
> > > +
> > >                 err = down_read_interruptible(&task->signal->exec_update_lock);
> > >                 if (err)
> > >                         goto err_file;
> > >
> > > +               is_capable = perfmon_capable();
> > > +               if (attr.sigtrap) {
> > > +                       /*
> > > +                        * perf_event_attr::sigtrap sends signals to the other
> > > +                        * task. Require the current task to have CAP_KILL.
> > > +                        */
> > > +                       is_capable &= kill_capable();
> >
> > Is it necessary to do all this dance just to call perfmon_capable()
> > first? Couldn't this be simply:
> >
> > err = -EPERM;
> > if (attr.sigtrap && !capable(CAP_KILL))
> >         goto err_cred;
>
> Not so much about perfmon_capable() but about the ptrace_may_access()
> check. The condition here is supposed to be:
>
>         want CAP_PERFMON and (CAP_KILL if sigtrap)
>                 OR
>         want ptrace access (which includes a check for same thread-group and uid)
>
> If we did what you propose, then the ptrace check is effectively ignored
> if attr.sigtrap, and that's not what we want.
>
> There are lots of other ways of writing the same thing, but it should
> also remain readable and sticking it all into the same condition is not
> readable.

Ah, I see, I missed that semantic difference... So ptrace_may_access()
implies that the process doesn't need CAP_KILL to send a signal to the
task, that makes sense.

In that case I'm fine with this part as it is.

> > Also, looking at kill_ok_by_cred() in kernel/signal.c, would it
> > perhaps be more appropriate to do
> > ns_capable(__task_cred(task)->user_ns, CAP_KILL) instead? (There might
> > also need to be some careful locking around getting the target task's
> > creds - I'm not sure...)
>
> That might make sense. AFAIK, the locking is already in place via
> exec_update_lock. Let me investigate.
>
> > > +               }
> > > +
> > >                 /*
> > >                  * Preserve ptrace permission check for backwards compatibility.
> > >                  *
> > > @@ -12165,7 +12176,7 @@ SYSCALL_DEFINE5(perf_event_open,
> > >                  * perf_event_exit_task() that could imply).
> > >                  */
> > >                 err = -EACCES;
> >
> > BTW, shouldn't this (and several other such cases in this file...)
> > actually be EPERM, as is the norm for capability checks?
>
> I'm not a perf maintainer, so I can't give you a definitive answer.
> But, this would change the ABI, so I don't think it's realistic to
> request this change at this point unfortunately.

Indeed... I worry it will make troubleshooting SELinux/capability
errors more confusing, but I agree it would be a potentially risky
change to fix it :/

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.
diff mbox series

Patch

diff --git a/include/linux/capability.h b/include/linux/capability.h
index 65efb74c3585..1c6be4743dbe 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -264,6 +264,11 @@  static inline bool bpf_capable(void)
 	return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
 }
 
+static inline bool kill_capable(void)
+{
+	return capable(CAP_KILL) || capable(CAP_SYS_ADMIN);
+}
+
 static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
 {
 	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
diff --git a/kernel/events/core.c b/kernel/events/core.c
index fe88d6eea3c2..1ab4bc867531 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12152,10 +12152,21 @@  SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	if (task) {
+		bool is_capable;
+
 		err = down_read_interruptible(&task->signal->exec_update_lock);
 		if (err)
 			goto err_file;
 
+		is_capable = perfmon_capable();
+		if (attr.sigtrap) {
+			/*
+			 * perf_event_attr::sigtrap sends signals to the other
+			 * task. Require the current task to have CAP_KILL.
+			 */
+			is_capable &= kill_capable();
+		}
+
 		/*
 		 * Preserve ptrace permission check for backwards compatibility.
 		 *
@@ -12165,7 +12176,7 @@  SYSCALL_DEFINE5(perf_event_open,
 		 * perf_event_exit_task() that could imply).
 		 */
 		err = -EACCES;
-		if (!perfmon_capable() && !ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
+		if (!is_capable && !ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
 			goto err_cred;
 	}