mbox series

[0/2] Infrastructure to allow fixing exec deadlocks

Message ID 87tv32cxmf.fsf_-_@x220.int.ebiederm.org (mailing list archive)
Headers show
Series Infrastructure to allow fixing exec deadlocks | expand

Message

Eric W. Biederman March 5, 2020, 9:14 p.m. UTC
Bernd, everyone

This is how I think the infrastructure change should look that makes way
for fixing this issue.

- Correct the point of no return.
- Add a new mutex to replace cred_guard_mutex

Then I think it is just going through the existing
users of cred_guard_mutex and fixing them to use the new one.

There really aren't that many users of cred_guard_mutex so we should be
able to get through the easy ones fairly quickly.  And anything that
isn't easy we can wait until we have a good fix.

The users of cred_guard_mutex that I saw were:
    fs/proc/base.c:
       proc_pid_attr_write
       do_io_accounting
       proc_pid_stack
       proc_pid_syscall
       proc_pid_personality
    
    perf_event_open
    mm_access
    kcmp
    pidfd_fget
    seccomp_set_mode_filter

Bernd does this make sense to you?  

I think we can fix the seccomp/no_new_privs issue with some careful
refactoring.  We can probably do the same for ptrace but that appears
to need a little lsm bug fixing.

My goal here is to allow us to fix the uncontroversial easy bits.  While
still allowing the difficult tricky bits to be fixed.

Eric W. Biederman (2):
      exec: Properly mark the point of no return
      exec: Add a exec_update_mutex to replace cred_guard_mutex

 fs/exec.c                    | 11 ++++++++---
 include/linux/binfmts.h      |  7 ++++++-
 include/linux/sched/signal.h |  9 ++++++++-
 kernel/fork.c                |  1 +
 4 files changed, 23 insertions(+), 5 deletions(-)

Eric

Comments

Bernd Edlinger March 5, 2020, 10:31 p.m. UTC | #1
On 3/5/20 10:14 PM, Eric W. Biederman wrote:
> 
> Bernd, everyone
> 
> This is how I think the infrastructure change should look that makes way
> for fixing this issue.
> 
> - Correct the point of no return.
> - Add a new mutex to replace cred_guard_mutex
> 
> Then I think it is just going through the existing
> users of cred_guard_mutex and fixing them to use the new one.
> 
> There really aren't that many users of cred_guard_mutex so we should be
> able to get through the easy ones fairly quickly.  And anything that
> isn't easy we can wait until we have a good fix.
> 
> The users of cred_guard_mutex that I saw were:
>     fs/proc/base.c:
>        proc_pid_attr_write
>        do_io_accounting
>        proc_pid_stack
>        proc_pid_syscall
>        proc_pid_personality
>     
>     perf_event_open
>     mm_access
>     kcmp
>     pidfd_fget
>     seccomp_set_mode_filter
> 
> Bernd does this make sense to you?  
> 
> I think we can fix the seccomp/no_new_privs issue with some careful
> refactoring.  We can probably do the same for ptrace but that appears
> to need a little lsm bug fixing.
> 

Yes, for most functions the proposed "exec_update_mutex" is fine,
but we will need a longer-time block for ptrace_attach, seccomp_set_mode_filter
and proc_pid_attr_write need to be blocked for the whole exec duration so
they need a second "mutex", with deadlock-detection as in my previous patch,
if I see that right.

Unfortunately only one of the two test cases can be fixed without the
second mutex, of course the mm_access is what cause the practical problem.

Currently for the unlimited user space delay, I have only the case of
a ptraced sibling thread on my radar, de_thread waits for the parent
to call wait in this case, that can literally take forever.
But I know that also PTRACE_CONT may be needed after a PTRACE_EVENT_EXIT.

Can you explain what else in the user space can go wrong to make an
unlimited delay in the execve?


Bernd.
Eric W. Biederman March 6, 2020, 5:06 a.m. UTC | #2
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> On 3/5/20 10:14 PM, Eric W. Biederman wrote:
>> 
>> Bernd, everyone
>> 
>> This is how I think the infrastructure change should look that makes way
>> for fixing this issue.
>> 
>> - Correct the point of no return.
>> - Add a new mutex to replace cred_guard_mutex
>> 
>> Then I think it is just going through the existing
>> users of cred_guard_mutex and fixing them to use the new one.
>> 
>> There really aren't that many users of cred_guard_mutex so we should be
>> able to get through the easy ones fairly quickly.  And anything that
>> isn't easy we can wait until we have a good fix.
>> 
>> The users of cred_guard_mutex that I saw were:
>>     fs/proc/base.c:
>>        proc_pid_attr_write
>>        do_io_accounting
>>        proc_pid_stack
>>        proc_pid_syscall
>>        proc_pid_personality
>>     
>>     perf_event_open
>>     mm_access
>>     kcmp
>>     pidfd_fget
>>     seccomp_set_mode_filter
>> 
>> Bernd does this make sense to you?  
>> 
>> I think we can fix the seccomp/no_new_privs issue with some careful
>> refactoring.  We can probably do the same for ptrace but that appears
>> to need a little lsm bug fixing.
>> 
>
> Yes, for most functions the proposed "exec_update_mutex" is fine,
> but we will need a longer-time block for ptrace_attach, seccomp_set_mode_filter
> and proc_pid_attr_write need to be blocked for the whole exec duration so
> they need a second "mutex", with deadlock-detection as in my previous patch,
> if I see that right.

So far I am leaving "cred_guard_mutex" as that second "mutex".  My sense
is that when all we have left are the hard cases we can take those
cases out in detail, examine them and see what really can be done.

> Unfortunately only one of the two test cases can be fixed without the
> second mutex, of course the mm_access is what cause the practical problem.

Fixing the practical problems are foremost on my agenda.
That and clearing away enough of the noise that we can really focus on
the hard problems when we begin to address them.

That way I am hoping we can really solve some of these issues and make
them go away.

> Currently for the unlimited user space delay, I have only the case of
> a ptraced sibling thread on my radar, de_thread waits for the parent
> to call wait in this case, that can literally take forever.
> But I know that also PTRACE_CONT may be needed after a PTRACE_EVENT_EXIT.
>
> Can you explain what else in the user space can go wrong to make an
> unlimited delay in the execve?

Triggering a page fault.  Depending on the backing store or possibly
with the use of userfaultfd that page fault can be delayed indefinitely
and pretty much be as bad as the ptrace case.

Eric