pidfd: Stop taking cred_guard_mutex

Message ID	87wo7svy96.fsf_-_@x220.int.ebiederm.org (mailing list archive)
State	New, archived

Eric W. Biederman March 10, 2020, 6:52 p.m. UTC

During exec some file descriptors are closed and the files struct is
unshared.  But all of that can happen at other times and it has the
same protections during exec as at ordinary times.  So stop taking the
cred_guard_mutex as it is useless.

Furthermore the cred_guard_mutex is a bad idea because it is deadlock
prone, as it is held in several places while waiting possibly indefinitely
for userspace to do something.

Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/pid.c | 6 ------
 1 file changed, 6 deletions(-)

Christian, if you don't have any objections I will take this one through
my tree.

I tried to figure out why this code path takes the cred_guard_mutex and
the archive on lore.kernel.org was not helpful in finding that part of
the conversation.

Christian Brauner March 10, 2020, 7:15 p.m. UTC | #1

On Tue, Mar 10, 2020 at 01:52:05PM -0500, Eric W. Biederman wrote:
> 
> During exec some file descriptors are closed and the files struct is
> unshared.  But all of that can happen at other times and it has the
> same protections during exec as at ordinary times.  So stop taking the
> cred_guard_mutex as it is useless.
> 
> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> prone, as it is held in several places while waiting possibly indefinitely
> for userspace to do something.
> 
> Cc: Sargun Dhillon <sargun@sargun.me>
> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  kernel/pid.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> Christian if you don't have any objections I will take this one through
> my tree.

Sure.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

> 
> I tried to figure out why this code path takes the cred_guard_mutex and
> the archive on lore.kernel.org was not helpful in finding that part of
> the conversation.

Let me think a little harder and hopefully get back to you with a
sensible explanation.

Jann Horn March 10, 2020, 7:16 p.m. UTC | #2

On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> During exec some file descriptors are closed and the files struct is
> unshared.  But all of that can happen at other times and it has the
> same protections during exec as at ordinary times.  So stop taking the
> cred_guard_mutex as it is useless.
>
> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> prone, as it is held in several places while waiting possibly indefinitely
> for userspace to do something.

Please don't. Just use the new exec_update_mutex like everywhere else.

> Cc: Sargun Dhillon <sargun@sargun.me>
> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  kernel/pid.c | 6 ------
>  1 file changed, 6 deletions(-)
>
> Christian if you don't have any objections I will take this one through
> my tree.
>
> I tried to figure out why this code path takes the cred_guard_mutex and
> the archive on lore.kernel.org was not helpful in finding that part of
> the conversation.

That was my suggestion.

> diff --git a/kernel/pid.c b/kernel/pid.c
> index 60820e72634c..53646d5616d2 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -577,17 +577,11 @@ static struct file *__pidfd_fget(struct task_struct *task, int fd)
>         struct file *file;
>         int ret;
>
> -       ret = mutex_lock_killable(&task->signal->cred_guard_mutex);
> -       if (ret)
> -               return ERR_PTR(ret);
> -
>         if (ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS))
>                 file = fget_task(task, fd);
>         else
>                 file = ERR_PTR(-EPERM);
>
> -       mutex_unlock(&task->signal->cred_guard_mutex);
> -
>         return file ?: ERR_PTR(-EBADF);
>  }

If you make this change, then if this races with execution of a setuid
program that afterwards e.g. opens a unix domain socket, an attacker
will be able to steal that socket and inject messages into
communication with things like DBus. procfs currently has the same
race, and that still needs to be fixed, but at least procfs doesn't
let you open things like sockets because they don't have a working
->open handler, and it enforces the normal permission check for opening files.
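
[Editorial illustration, not part of the original message.] The race
described above can be modelled in plain userspace C. This is only a
minimal sketch under stated assumptions: model_task, model_setuid_exec,
model_may_access and model_getfd are hypothetical names rather than
kernel APIs, a boolean stands in for the mutex, and the exec attempt is
injected at the worst possible moment, between the permission check and
the descriptor fetch.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	bool exec_lock_held; /* stands in for the mutex held around exec */
	bool privileged;     /* true once a setuid exec has committed */
	int  fd_slot;        /* one descriptor slot holding a "file" id */
};

/*
 * Models a setuid exec that afterwards opens a sensitive socket.
 * Returns false when the exec would have to wait for the lock
 * (a real mutex would block instead of returning).
 */
static bool model_setuid_exec(struct model_task *t, int secret_file)
{
	if (t->exec_lock_held)
		return false;
	t->privileged = true;
	t->fd_slot = secret_file;
	return true;
}

/* Stand-in for the permission check: only unprivileged targets pass. */
static bool model_may_access(const struct model_task *t)
{
	return !t->privileged;
}

/*
 * Check, then fetch. When use_lock is true the lock is held across
 * both steps. Returns the file id, or -1 when access is denied.
 */
static int model_getfd(struct model_task *t, bool use_lock, int secret_file)
{
	int file = -1;

	if (use_lock)
		t->exec_lock_held = true;

	if (model_may_access(t)) {
		/* Worst-case interleaving: exec tries to run right here. */
		(void)model_setuid_exec(t, secret_file);
		file = t->fd_slot;
	}

	if (use_lock)
		t->exec_lock_held = false;

	return file;
}
```

With use_lock false the caller walks away with the descriptor that the
privileged program opened. With use_lock true the exec cannot commit
until both the check and the fetch are done, so the caller only ever
sees the pre-exec descriptor.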

Eric W. Biederman March 10, 2020, 7:27 p.m. UTC | #3

Jann Horn <jannh@google.com> writes:

> On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> During exec some file descriptors are closed and the files struct is
>> unshared.  But all of that can happen at other times and it has the
>> same protections during exec as at ordinary times.  So stop taking the
>> cred_guard_mutex as it is useless.
>>
>> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
>> prone, as it is held in several places while waiting possibly indefinitely
>> for userspace to do something.
>
> Please don't. Just use the new exec_update_mutex like everywhere else.
>
>> Cc: Sargun Dhillon <sargun@sargun.me>
>> Cc: Christian Brauner <christian.brauner@ubuntu.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>>  kernel/pid.c | 6 ------
>>  1 file changed, 6 deletions(-)
>>
>> Christian if you don't have any objections I will take this one through
>> my tree.
>>
>> I tried to figure out why this code path takes the cred_guard_mutex and
>> the archive on lore.kernel.org was not helpful in finding that part of
>> the conversation.
>
> That was my suggestion.
>
>> diff --git a/kernel/pid.c b/kernel/pid.c
>> index 60820e72634c..53646d5616d2 100644
>> --- a/kernel/pid.c
>> +++ b/kernel/pid.c
>> @@ -577,17 +577,11 @@ static struct file *__pidfd_fget(struct task_struct *task, int fd)
>>         struct file *file;
>>         int ret;
>>
>> -       ret = mutex_lock_killable(&task->signal->cred_guard_mutex);
>> -       if (ret)
>> -               return ERR_PTR(ret);
>> -
>>         if (ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS))
>>                 file = fget_task(task, fd);
>>         else
>>                 file = ERR_PTR(-EPERM);
>>
>> -       mutex_unlock(&task->signal->cred_guard_mutex);
>> -
>>         return file ?: ERR_PTR(-EBADF);
>>  }
>
> If you make this change, then if this races with execution of a setuid
> program that afterwards e.g. opens a unix domain socket, an attacker
> will be able to steal that socket and inject messages into
> communication with things like DBus. procfs currently has the same
> race, and that still needs to be fixed, but at least procfs doesn't
> let you open things like sockets because they don't have a working
> ->open handler, and it enforces the normal permission check for
> opening files.

It isn't only exec that can change credentials.  Do we need a lock for
changing credentials?

Wouldn't it be sufficient to simply test ptrace_may_access after
we get a copy of the file?

If we need a lock around credential change let's design and build that.
Having a mismatch between what a lock is designed to do, and what
people use it for can only result in other bugs as people get confused.

Eric

Jann Horn March 10, 2020, 8 p.m. UTC | #4

On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> Jann Horn <jannh@google.com> writes:
> > On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> During exec some file descriptors are closed and the files struct is
> >> unshared.  But all of that can happen at other times and it has the
> >> same protections during exec as at ordinary times.  So stop taking the
> >> cred_guard_mutex as it is useless.
> >>
> >> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> >> prone, as it is held in several places while waiting possibly indefinitely
> >> for userspace to do something.
> >
> > Please don't. Just use the new exec_update_mutex like everywhere else.
> >
> >> Cc: Sargun Dhillon <sargun@sargun.me>
> >> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> >> Cc: Arnd Bergmann <arnd@arndb.de>
> >> Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
> >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >> ---
> >>  kernel/pid.c | 6 ------
> >>  1 file changed, 6 deletions(-)
> >>
> >> Christian if you don't have any objections I will take this one through
> >> my tree.
> >>
> >> I tried to figure out why this code path takes the cred_guard_mutex and
> >> the archive on lore.kernel.org was not helpful in finding that part of
> >> the conversation.
> >
> > That was my suggestion.
> >
> >> diff --git a/kernel/pid.c b/kernel/pid.c
> >> index 60820e72634c..53646d5616d2 100644
> >> --- a/kernel/pid.c
> >> +++ b/kernel/pid.c
> >> @@ -577,17 +577,11 @@ static struct file *__pidfd_fget(struct task_struct *task, int fd)
> >>         struct file *file;
> >>         int ret;
> >>
> >> -       ret = mutex_lock_killable(&task->signal->cred_guard_mutex);
> >> -       if (ret)
> >> -               return ERR_PTR(ret);
> >> -
> >>         if (ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS))
> >>                 file = fget_task(task, fd);
> >>         else
> >>                 file = ERR_PTR(-EPERM);
> >>
> >> -       mutex_unlock(&task->signal->cred_guard_mutex);
> >> -
> >>         return file ?: ERR_PTR(-EBADF);
> >>  }
> >
> > If you make this change, then if this races with execution of a setuid
> > program that afterwards e.g. opens a unix domain socket, an attacker
> > will be able to steal that socket and inject messages into
> > communication with things like DBus. procfs currently has the same
> > race, and that still needs to be fixed, but at least procfs doesn't
> > let you open things like sockets because they don't have a working
> > ->open handler, and it enforces the normal permission check for
> > opening files.
>
> It isn't only exec that can change credentials.  Do we need a lock for
> changing credentials?

Hmm, I guess so? Normally, a task that's changing credentials becomes
nondumpable at the same time (and there are explicit memory barriers
in commit_creds() and __ptrace_may_access() to enforce the ordering
for this); so you normally don't see tasks becoming ptrace-accessible
via anything other than execve(). But I guess if someone opens a
root-only file, closes it, drops privileges, and then explicitly does
prctl(PR_SET_DUMPABLE, 1), we should probably protect that, too.

> Wouldn't it be sufficient to simply test ptrace_may_access after
> we get a copy of the file?

There are also setuid helpers that can, after having done privileged
stuff, drop privileges and call execve(); after that,
ptrace_may_access() succeeds again. In particular, polkit has a helper
that does this.

> If we need a lock around credential change let's design and build that.
> Having a mismatch between what a lock is designed to do, and what
> people use it for can only result in other bugs as people get confused.

Hmm... what benefits do we get from making it a separate lock? I guess
it would allow us to make it a per-task lock instead of a
signal_struct-wide one? That might be helpful...
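
[Editorial illustration, not part of the original message.] The
objection to checking ptrace_may_access() only after the file has been
fetched can be sketched as a small userspace model. Everything here is
hypothetical: helper_task, helper_may_access, helper_drop_and_exec and
getfd_check_after are illustrative names, and the "between" callback
simply lets the target run between the fetch and the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a setuid helper; not kernel code. */
struct helper_task {
	bool privileged; /* true while the helper still holds privileges */
	int  fd_slot;    /* a descriptor opened while privileged */
};

/* Stand-in for the permission check: only unprivileged targets pass. */
static bool helper_may_access(const struct helper_task *t)
{
	return !t->privileged;
}

/* The helper drops privileges and execs an unprivileged program. */
static void helper_drop_and_exec(struct helper_task *t)
{
	t->privileged = false;
	t->fd_slot = 0; /* the privileged descriptor is closed */
}

/*
 * Fetch first, check afterwards, with no lock held. "between" is
 * whatever the target does in the window separating the two steps.
 * Returns the fetched file id, or -1 when the late check fails.
 */
static int getfd_check_after(struct helper_task *t,
			     void (*between)(struct helper_task *))
{
	int file = t->fd_slot;

	if (between != NULL)
		between(t);

	return helper_may_access(t) ? file : -1;
}
```

In this model the late check passes as soon as the helper has dropped
privileges, even though the descriptor was fetched while the helper was
still privileged, which is the ordering problem raised above.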

Jann Horn March 10, 2020, 8:10 p.m. UTC | #5

On Tue, Mar 10, 2020 at 9:00 PM Jann Horn <jannh@google.com> wrote:
> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> > Jann Horn <jannh@google.com> writes:
> > > On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> > >> During exec some file descriptors are closed and the files struct is
> > >> unshared.  But all of that can happen at other times and it has the
> > >> same protections during exec as at ordinary times.  So stop taking the
> > >> cred_guard_mutex as it is useless.
> > >>
> > >> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> > >> prone, as it is held in several places while waiting possibly indefinitely
> > >> for userspace to do something.
[...]
> > > If you make this change, then if this races with execution of a setuid
> > > program that afterwards e.g. opens a unix domain socket, an attacker
> > > will be able to steal that socket and inject messages into
> > > communication with things like DBus. procfs currently has the same
> > > race, and that still needs to be fixed, but at least procfs doesn't
> > > let you open things like sockets because they don't have a working
> > > ->open handler, and it enforces the normal permission check for
> > > opening files.
> >
> > It isn't only exec that can change credentials.  Do we need a lock for
> > changing credentials?
[...]
> > If we need a lock around credential change let's design and build that.
> > Having a mismatch between what a lock is designed to do, and what
> > people use it for can only result in other bugs as people get confused.
>
> Hmm... what benefits do we get from making it a separate lock? I guess
> it would allow us to make it a per-task lock instead of a
> signal_struct-wide one? That might be helpful...

But actually, isn't the core purpose of the cred_guard_mutex to guard
against concurrent credential changes anyway? That's what almost
everyone uses it for, and it's in the name...

Bernd Edlinger March 10, 2020, 8:22 p.m. UTC | #6

On 3/10/20 9:10 PM, Jann Horn wrote:
> On Tue, Mar 10, 2020 at 9:00 PM Jann Horn <jannh@google.com> wrote:
>> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>> Jann Horn <jannh@google.com> writes:
>>>> On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>>> During exec some file descriptors are closed and the files struct is
>>>>> unshared.  But all of that can happen at other times and it has the
>>>>> same protections during exec as at ordinary times.  So stop taking the
>>>>> cred_guard_mutex as it is useless.
>>>>>
>>>>> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
>>>>> prone, as it is held in several places while waiting possibly indefinitely
>>>>> for userspace to do something.
> [...]
>>>> If you make this change, then if this races with execution of a setuid
>>>> program that afterwards e.g. opens a unix domain socket, an attacker
>>>> will be able to steal that socket and inject messages into
>>>> communication with things like DBus. procfs currently has the same
>>>> race, and that still needs to be fixed, but at least procfs doesn't
>>>> let you open things like sockets because they don't have a working
>>>> ->open handler, and it enforces the normal permission check for
>>>> opening files.
>>>
>>> It isn't only exec that can change credentials.  Do we need a lock for
>>> changing credentials?
> [...]
>>> If we need a lock around credential change let's design and build that.
>>> Having a mismatch between what a lock is designed to do, and what
>>> people use it for can only result in other bugs as people get confused.
>>
>> Hmm... what benefits do we get from making it a separate lock? I guess
>> it would allow us to make it a per-task lock instead of a
>> signal_struct-wide one? That might be helpful...
> 
> But actually, isn't the core purpose of the cred_guard_mutex to guard
> against concurrent credential changes anyway? That's what almost
> everyone uses it for, and it's in the name...
> 

The main raison d'etre of exec_update_mutex is to get a consistent
view of task->mm and task credentials.

The reason why you want the cred_guard_mutex is that some action
is changing the resulting credentials that the execve is about
to install, and that is the data flow in the opposite direction.


Bernd.

Eric W. Biederman March 10, 2020, 8:57 p.m. UTC | #7

Jann Horn <jannh@google.com> writes:

> On Tue, Mar 10, 2020 at 9:00 PM Jann Horn <jannh@google.com> wrote:
>> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> > Jann Horn <jannh@google.com> writes:
>> > > On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> > >> During exec some file descriptors are closed and the files struct is
>> > >> unshared.  But all of that can happen at other times and it has the
>> > >> same protections during exec as at ordinary times.  So stop taking the
>> > >> cred_guard_mutex as it is useless.
>> > >>
>> > >> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
>> > >> prone, as it is held in several places while waiting possibly indefinitely
>> > >> for userspace to do something.
> [...]
>> > > If you make this change, then if this races with execution of a setuid
>> > > program that afterwards e.g. opens a unix domain socket, an attacker
>> > > will be able to steal that socket and inject messages into
>> > > communication with things like DBus. procfs currently has the same
>> > > race, and that still needs to be fixed, but at least procfs doesn't
>> > > let you open things like sockets because they don't have a working
>> > > ->open handler, and it enforces the normal permission check for
>> > > opening files.
>> >
>> > It isn't only exec that can change credentials.  Do we need a lock for
>> > changing credentials?
> [...]
>> > If we need a lock around credential change let's design and build that.
>> > Having a mismatch between what a lock is designed to do, and what
>> > people use it for can only result in other bugs as people get confused.
>>
>> Hmm... what benefits do we get from making it a separate lock? I guess
>> it would allow us to make it a per-task lock instead of a
>> signal_struct-wide one? That might be helpful...
>
> But actually, isn't the core purpose of the cred_guard_mutex to guard
> against concurrent credential changes anyway? That's what almost
> everyone uses it for, and it's in the name...

Having been through all of the users: nope.

Maybe someone tried to repurpose it for that.  I haven't traced through
when it was renamed from cred_exec_mutex to cred_guard_mutex.

The original purpose was to make exec and ptrace deadlock.  But it
was seen as being there to allow safely calculating the new credentials
before the point of no return, because whether a process is ptraced or
not affects the new credential calculations.  Unfortunately offering that
guarantee fundamentally leads to deadlock.

So ptrace_attach and seccomp use the cred_guard_mutex to guarantee
a deadlock.

The common use is to take cred_guard_mutex to guard the window when
credentials and process details are out of sync in exec.  But there
is at least do_io_accounting that seems to have the same justification
as __pidfd_fget for holding it.

With effort I suspect we can replace exec_change_mutex with task_lock.
When we are guaranteed to be single threaded placing exec_change_mutex
in signal_struct doesn't really help us (except maybe in some races?).

The deep problem is no one really understands cred_guard_mutex so it is
a mess.  Code with poorly defined semantics is always wrong somewhere
for someone.  Which is part of why I am attacking this and having the
conversations to make certain I understand what is going on.

I see your point about commit_creds making a process undumpable.  So in
practice it really is only exec that changes creds in a way that
ptrace_may_access will allow the process to be inspected.

So I guess for now the practical non-regressing course is to change
everything to my exec_change_mutex, removing the deadlock.  Then we
figure out how to cleanly deal with the races inspecting a process with
changing credentials has.

Eric

Christian Brauner March 10, 2020, 9:29 p.m. UTC | #8

On Tue, Mar 10, 2020 at 03:57:35PM -0500, Eric W. Biederman wrote:
> Jann Horn <jannh@google.com> writes:
> 
> > On Tue, Mar 10, 2020 at 9:00 PM Jann Horn <jannh@google.com> wrote:
> >> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> > Jann Horn <jannh@google.com> writes:
> >> > > On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> > >> During exec some file descriptors are closed and the files struct is
> >> > >> unshared.  But all of that can happen at other times and it has the
> >> > >> same protections during exec as at ordinary times.  So stop taking the
> >> > >> cred_guard_mutex as it is useless.
> >> > >>
> >> > >> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> >> > >> prone, as it is held in several places while waiting possibly indefinitely
> >> > >> for userspace to do something.
> > [...]
> >> > > If you make this change, then if this races with execution of a setuid
> >> > > program that afterwards e.g. opens a unix domain socket, an attacker
> >> > > will be able to steal that socket and inject messages into
> >> > > communication with things like DBus. procfs currently has the same
> >> > > race, and that still needs to be fixed, but at least procfs doesn't
> >> > > let you open things like sockets because they don't have a working
> >> > > ->open handler, and it enforces the normal permission check for
> >> > > opening files.
> >> >
> >> > It isn't only exec that can change credentials.  Do we need a lock for
> >> > changing credentials?
> > [...]
> >> > If we need a lock around credential change let's design and build that.
> >> > Having a mismatch between what a lock is designed to do, and what
> >> > people use it for can only result in other bugs as people get confused.
> >>
> >> Hmm... what benefits do we get from making it a separate lock? I guess
> >> it would allow us to make it a per-task lock instead of a
> >> signal_struct-wide one? That might be helpful...
> >
> > But actually, isn't the core purpose of the cred_guard_mutex to guard
> > against concurrent credential changes anyway? That's what almost
> > everyone uses it for, and it's in the name...
> 
> Having been through all of the users: nope.
> 
> Maybe someone tried to repurpose it for that.  I haven't traced through
> when it was renamed from cred_exec_mutex to cred_guard_mutex.
> 
> The original purpose was to make exec and ptrace deadlock.  But it
> was seen as being there to allow safely calculating the new credentials
> before the point of no return, because whether a process is ptraced or
> not affects the new credential calculations.  Unfortunately offering that
> guarantee fundamentally leads to deadlock.
> 
> So ptrace_attach and seccomp use the cred_guard_mutex to guarantee
> a deadlock.
> 
> The common use is to take cred_guard_mutex to guard the window when
> credentials and process details are out of sync in exec.  But there
> is at least do_io_accounting that seems to have the same justification
> as __pidfd_fget for holding it.
> 
> With effort I suspect we can replace exec_change_mutex with task_lock.
> When we are guaranteed to be single threaded placing exec_change_mutex
> in signal_struct doesn't really help us (except maybe in some races?).
> 
> The deep problem is no one really understands cred_guard_mutex so it is
> a mess.  Code with poorly defined semantics is always wrong somewhere

This is a good point. When discussing patches sensitive to credential
changes cred_guard_mutex was always introduced as having the purpose to
guard against concurrent credential changes. And I'm pretty sure that
that's how most people have been using it for quite a long time. I mean,
it's at least the case for seccomp and proc and probably quite a few
more. So the problem seems to me that it has clear _intended_ semantics
that run into issues in all sorts of cases. So if cred_guard_mutex is
not that, then we seem to need to provide something that serves its
intended purpose.

Bernd Edlinger March 11, 2020, 6:11 a.m. UTC | #9

On 3/10/20 9:22 PM, Bernd Edlinger wrote:
> On 3/10/20 9:10 PM, Jann Horn wrote:
>> On Tue, Mar 10, 2020 at 9:00 PM Jann Horn <jannh@google.com> wrote:
>>> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>> Jann Horn <jannh@google.com> writes:
>>>>> On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>>>>> During exec some file descriptors are closed and the files struct is
>>>>>> unshared.  But all of that can happen at other times and it has the
>>>>>> same protections during exec as at ordinary times.  So stop taking the
>>>>>> cred_guard_mutex as it is useless.
>>>>>>
>>>>>> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
>>>>>> prone, as it is held in several places while waiting possibly indefinitely
>>>>>> for userspace to do something.
>> [...]
>>>>> If you make this change, then if this races with execution of a setuid
>>>>> program that afterwards e.g. opens a unix domain socket, an attacker
>>>>> will be able to steal that socket and inject messages into
>>>>> communication with things like DBus. procfs currently has the same
>>>>> race, and that still needs to be fixed, but at least procfs doesn't
>>>>> let you open things like sockets because they don't have a working
>>>>> ->open handler, and it enforces the normal permission check for
>>>>> opening files.
>>>>
>>>> It isn't only exec that can change credentials.  Do we need a lock for
>>>> changing credentials?
>> [...]
>>>> If we need a lock around credential change let's design and build that.
>>>> Having a mismatch between what a lock is designed to do, and what
>>>> people use it for can only result in other bugs as people get confused.
>>>
>>> Hmm... what benefits do we get from making it a separate lock? I guess
>>> it would allow us to make it a per-task lock instead of a
>>> signal_struct-wide one? That might be helpful...
>>
>> But actually, isn't the core purpose of the cred_guard_mutex to guard
>> against concurrent credential changes anyway? That's what almost
>> everyone uses it for, and it's in the name...
>>
> 
> The main raison d'etre of exec_update_mutex is to get a consistent
> view of task->mm and task credentials.
> 
> The reason why you want the cred_guard_mutex is that some action
> is changing the resulting credentials that the execve is about
> to install, and that is the data flow in the opposite direction.
> 

So in other words, you need the exec_update_mutex when you
access another thread's credentials and possibly the mmap at the
same time.

You need the cred_guard_mutex when you *change* the credentials
of another thread.  (Where you cannot be sure that the other thread
just started to execve something)

You need no mutex at all when you are just accessing or
even changing the credentials of the current thread.  (If another
thread is doing execve, your task will be killed, and whether
or not the credentials were changed does not matter any more)

> 
> Bernd.
>
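
The locking rules described above can be sketched as a small userspace
model. This is not kernel code: struct task_model and every function
name below are hypothetical, and pthread mutexes merely stand in for
the kernel's exec_update_mutex and cred_guard_mutex. It assumes, as a
simplification, that exec holds cred_guard_mutex while it computes the
new credentials and takes exec_update_mutex only around the point where
mm and credentials are installed together.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task; not a kernel structure. */
struct task_model {
	pthread_mutex_t exec_update_mutex; /* protects the (mm_id, uid) pair */
	pthread_mutex_t cred_guard_mutex;  /* protects inputs to the next exec */
	unsigned long mm_id;               /* stand-in for task->mm */
	unsigned int uid;                  /* stand-in for the credentials */
	int no_new_privs;                  /* input that changes what exec installs */
};

struct task_view {
	unsigned long mm_id;
	unsigned int uid;
};

void task_model_init(struct task_model *t, unsigned int uid, unsigned long mm_id)
{
	assert(t != NULL);
	pthread_mutex_init(&t->exec_update_mutex, NULL);
	pthread_mutex_init(&t->cred_guard_mutex, NULL);
	t->mm_id = mm_id;
	t->uid = uid;
	t->no_new_privs = 0;
}

/* Reader of another task: exec_update_mutex gives a consistent pair. */
struct task_view read_other_task(struct task_model *t)
{
	struct task_view v;

	pthread_mutex_lock(&t->exec_update_mutex);
	v.mm_id = t->mm_id;
	v.uid = t->uid;
	pthread_mutex_unlock(&t->exec_update_mutex);
	return v;
}

/* Changing what a later exec will install: cred_guard_mutex. */
void set_no_new_privs_on(struct task_model *t)
{
	pthread_mutex_lock(&t->cred_guard_mutex);
	t->no_new_privs = 1;
	pthread_mutex_unlock(&t->cred_guard_mutex);
}

/* Model of exec of a setuid binary owned by setuid_uid. */
void model_exec(struct task_model *t, unsigned int setuid_uid, unsigned long new_mm)
{
	unsigned int new_uid;

	pthread_mutex_lock(&t->cred_guard_mutex);
	/* Decide the resulting credentials; nobody may change the inputs now. */
	new_uid = t->no_new_privs ? t->uid : setuid_uid;

	pthread_mutex_lock(&t->exec_update_mutex);
	/* Install mm and credentials as one unit for readers. */
	t->mm_id = new_mm;
	t->uid = new_uid;
	pthread_mutex_unlock(&t->exec_update_mutex);

	pthread_mutex_unlock(&t->cred_guard_mutex);
}
```

In this model a reader can never observe the new mm_id paired with the
old uid, and a change to no_new_prives made through set_no_new_privs_on()
either lands before exec computes the new uid or waits until exec has
finished.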

Jann Horn March 11, 2020, 2:56 p.m. UTC | #10

On Wed, Mar 11, 2020 at 7:12 AM Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> On 3/10/20 9:22 PM, Bernd Edlinger wrote:
> > On 3/10/20 9:10 PM, Jann Horn wrote:
> >> On Tue, Mar 10, 2020 at 9:00 PM Jann Horn <jannh@google.com> wrote:
> >>> On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>>> Jann Horn <jannh@google.com> writes:
> >>>>> On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >>>>>> During exec some file descriptors are closed and the files struct is
> >>>>>> unshared.  But all of that can happen at other times and it has the
> >>>>>> same protections during exec as at ordinary times.  So stop taking the
> >>>>>> cred_guard_mutex as it is useless.
> >>>>>>
> >>>>>> Furthermore the cred_guard_mutex is a bad idea because it is deadlock
> >>>>>> prone, as it is held in several places while waiting possibly
> >>>>>> indefinitely for userspace to do something.
> >> [...]
> >>>>> If you make this change, then if this races with execution of a setuid
> >>>>> program that afterwards e.g. opens a unix domain socket, an attacker
> >>>>> will be able to steal that socket and inject messages into
> >>>>> communication with things like DBus. procfs currently has the same
> >>>>> race, and that still needs to be fixed, but at least procfs doesn't
> >>>>> let you open things like sockets because they don't have a working
> >>>>> ->open handler, and it enforces the normal permission check for
> >>>>> opening files.
> >>>>
> >>>> It isn't only exec that can change credentials.  Do we need a lock for
> >>>> changing credentials?
> >> [...]
> >>>> If we need a lock around credential change let's design and build that.
> >>>> Having a mismatch between what a lock is designed to do, and what
> >>>> people use it for can only result in other bugs as people get confused.
> >>>
> >>> Hmm... what benefits do we get from making it a separate lock? I guess
> >>> it would allow us to make it a per-task lock instead of a
> >>> signal_struct-wide one? That might be helpful...
> >>
> >> But actually, isn't the core purpose of the cred_guard_mutex to guard
> >> against concurrent credential changes anyway? That's what almost
> >> everyone uses it for, and it's in the name...
> >>
> >
> > The main raison d'etre of exec_update_mutex is to get a consistent
> > view of task->mm and task credentials.
> >
> > The reason why you want the cred_guard_mutex is that some action
> > is changing the resulting credentials that the execve is about
> > to install, and that is the data flow in the opposite direction.
> >
>
> So in other words, you need the exec_update_mutex when you
> access another thread's credentials and possibly the mmap at the
> same time.

Or the file descriptor table, or register state, ...

> You need no mutex at all when you are just accessing or
> even changing the credentials of the current thread.  (If another
> thread is doing execve, your task will be killed, and whether
> or not the credentials were changed does not matter any more)

Only if the only access checks you care about are those related to mm access.

Kees Cook March 11, 2020, 6:49 p.m. UTC | #11

On Tue, Mar 10, 2020 at 03:57:35PM -0500, Eric W. Biederman wrote:
> So ptrace_attach and seccomp use the cred_guard_mutex to guarantee
> a deadlock.

Well, that's the result, but seccomp uses it because it wants to
be certain that credentials and no_new_privs are changed together
"atomically".

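A minimal userspace sketch of what changing credentials and
no_new_privs together "atomically" means. It is loosely modelled on the
rule that installing a seccomp filter requires either no_new_privs or
CAP_SYS_ADMIN. Everything here is hypothetical: struct cred_model, the
function names, and the single pthread mutex standing in for
cred_guard_mutex are illustrations, not the kernel's implementation.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: one lock covers both fields so they move together. */
struct cred_model {
	pthread_mutex_t guard;   /* stand-in for cred_guard_mutex */
	bool privileged;         /* stand-in for holding CAP_SYS_ADMIN */
	bool no_new_privs;
};

void cred_model_init(struct cred_model *c, bool privileged, bool no_new_privs)
{
	pthread_mutex_init(&c->guard, NULL);
	c->privileged = privileged;
	c->no_new_privs = no_new_privs;
}

void cred_model_set_no_new_privs(struct cred_model *c)
{
	pthread_mutex_lock(&c->guard);
	c->no_new_privs = true;
	pthread_mutex_unlock(&c->guard);
}

/* Exec of a setuid binary: gains privilege only if no_new_privs is clear.
 * The check and the update happen under the same lock. */
void cred_model_exec_setuid(struct cred_model *c)
{
	pthread_mutex_lock(&c->guard);
	if (!c->no_new_privs)
		c->privileged = true;
	pthread_mutex_unlock(&c->guard);
}

/* A filter may be installed only if no_new_privs is set or the task is
 * privileged; both fields are read as one unit. */
bool cred_model_may_install_filter(struct cred_model *c)
{
	bool ok;

	pthread_mutex_lock(&c->guard);
	ok = c->no_new_privs || c->privileged;
	pthread_mutex_unlock(&c->guard);
	return ok;
}
```

Because every function takes the same lock, no caller can see
no_new_privs from before a change paired with privileged from after it.
The price is the one discussed in this thread: anything that sleeps
while holding that lock blocks everyone else who needs it.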