From: Jann Horn
To: Andrew Morton, linux-mm@kvack.org, Eric Biederman, Oleg Nesterov
Cc: linux-kernel@vger.kernel.org, Will Deacon, Kees Cook, Ingo Molnar
Subject: [RFC PATCH 1/6] ptrace: Keep mm around after exit_mm() for __ptrace_may_access()
Date: Fri, 16 Oct 2020 04:40:14 +0200
Message-Id: <20201016024019.1882062-2-jannh@google.com>
In-Reply-To: <20201016024019.1882062-1-jannh@google.com>
References: <20201016024019.1882062-1-jannh@google.com>
X-Patchwork-Submitter: Jann Horn
X-Patchwork-Id: 11840493

__ptrace_may_access() checks can happen on target tasks that
are in the middle of do_exit(), past exit_mm(). At that point, the ->mm
pointer has been NULLed out, and the mm_struct has been mmput().

Unfortunately, the mm_struct contains the dumpability and the user_ns in
which the task last went through execve(), and we need those for
__ptrace_may_access(). Currently, that problem is handled by failing open:
If the ->mm is gone, we assume that the task was dumpable. In some edge
cases, this could potentially expose access to things like
/proc/$pid/fd/$fd of originally non-dumpable processes.
(exit_files() comes after exit_mm(), so the file descriptor table is still
there when we've gone through exit_mm().)

One way to fix this would be to move mm->user_ns and the dumpability state
over into the task_struct. However, that gets quite ugly if we want to
preserve existing semantics because e.g. PR_SET_DUMPABLE and commit_creds()
would then have to scan through all tasks sharing the mm_struct and keep
them in sync manually - that'd be a bit error-prone and overcomplicated.

(Moving these things into the signal_struct is not an option because that
is kept across executions, and pre-execve co-threads will share the
signal_struct that is also used by the task that has gone through
execve().)

I believe that this patch may be the least bad option to fix this - keep
the mm_struct (but not process memory) around with an mmgrab() reference
from exit_mm() until the task goes away completely.

Note that this moves free_task() down in order to make mmdrop_async()
available without a forward declaration.

Cc: stable@vger.kernel.org
Fixes: bfedb589252c ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
Signed-off-by: Jann Horn
---
 include/linux/sched.h |  8 +++++++
 kernel/exit.c         |  2 ++
 kernel/fork.c         | 54 ++++++++++++++++++++++---------------------
 kernel/ptrace.c       | 10 ++++++++
 4 files changed, 48 insertions(+), 26 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index afe01e232935..55bec6ff5626 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -747,6 +747,14 @@ struct task_struct {
 
 	struct mm_struct		*mm;
 	struct mm_struct		*active_mm;
+	/*
+	 * When we exit and ->mm (the reference pinning ->mm's address space)
+	 * goes away, we stash a reference to the mm_struct itself (counted via
+	 * exit_mm->mm_count) in this member.
+	 * This allows us to continue using the mm_struct for security checks
+	 * and such even after the task has started exiting.
+	 */
+	struct mm_struct		*exit_mm;
 
 	/* Per-thread vma caching: */
 	struct vmacache			vmacache;
diff --git a/kernel/exit.c b/kernel/exit.c
index 733e80f334e7..97253ef33486 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -476,6 +476,8 @@ static void exit_mm(void)
 	/* more a memory barrier than a real lock */
 	task_lock(current);
 	current->mm = NULL;
+	mmgrab(mm); /* for current->exit_mm */
+	current->exit_mm = mm;
 	mmap_read_unlock(mm);
 	enter_lazy_tlb(mm, current);
 	task_unlock(current);
diff --git a/kernel/fork.c b/kernel/fork.c
index da8d360fb032..4942428a217c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -438,32 +438,6 @@ void put_task_stack(struct task_struct *tsk)
 }
 #endif
 
-void free_task(struct task_struct *tsk)
-{
-	scs_release(tsk);
-
-#ifndef CONFIG_THREAD_INFO_IN_TASK
-	/*
-	 * The task is finally done with both the stack and thread_info,
-	 * so free both.
-	 */
-	release_task_stack(tsk);
-#else
-	/*
-	 * If the task had a separate stack allocation, it should be gone
-	 * by now.
-	 */
-	WARN_ON_ONCE(refcount_read(&tsk->stack_refcount) != 0);
-#endif
-	rt_mutex_debug_task_free(tsk);
-	ftrace_graph_exit_task(tsk);
-	arch_release_task_struct(tsk);
-	if (tsk->flags & PF_KTHREAD)
-		free_kthread_struct(tsk);
-	free_task_struct(tsk);
-}
-EXPORT_SYMBOL(free_task);
-
 #ifdef CONFIG_MMU
 static __latent_entropy int dup_mmap(struct mm_struct *mm,
 					struct mm_struct *oldmm)
@@ -722,6 +696,34 @@ static inline void put_signal_struct(struct signal_struct *sig)
 		free_signal_struct(sig);
 }
 
+void free_task(struct task_struct *tsk)
+{
+	scs_release(tsk);
+
+#ifndef CONFIG_THREAD_INFO_IN_TASK
+	/*
+	 * The task is finally done with both the stack and thread_info,
+	 * so free both.
+	 */
+	release_task_stack(tsk);
+#else
+	/*
+	 * If the task had a separate stack allocation, it should be gone
+	 * by now.
+	 */
+	WARN_ON_ONCE(refcount_read(&tsk->stack_refcount) != 0);
+#endif
+	rt_mutex_debug_task_free(tsk);
+	ftrace_graph_exit_task(tsk);
+	arch_release_task_struct(tsk);
+	if (tsk->flags & PF_KTHREAD)
+		free_kthread_struct(tsk);
+	if (tsk->exit_mm)
+		mmdrop_async(tsk->exit_mm);
+	free_task_struct(tsk);
+}
+EXPORT_SYMBOL(free_task);
+
 void __put_task_struct(struct task_struct *tsk)
 {
 	WARN_ON(!tsk->exit_state);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 43d6179508d6..0aedc6cf5bdc 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -342,7 +342,17 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	 * Pairs with a write barrier in commit_creds().
 	 */
 	smp_rmb();
+	/*
+	 * Look up the target task's mm_struct. If it fails because the task is
+	 * exiting and has gone through exit_mm(), we can instead use ->exit_mm
+	 * as long as we only use members that are preserved by an mmgrab()
+	 * reference.
+	 * The only case in which both ->mm and ->exit_mm can be NULL should be
+	 * kernel threads.
+	 */
 	mm = task->mm;
+	if (!mm)
+		mm = task->exit_mm;
 	if (mm &&
 	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
 	     !ptrace_has_cap(cred, mm->user_ns, mode)))
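
For readers who want to see the reference-counting idea in isolation, here
is a minimal userspace sketch of what the patch relies on. It is a model
under stated assumptions, not kernel code: every model_* name is
hypothetical, the capability check is reduced to a boolean, and locking,
memory barriers and the lazy-TLB reference are left out. It only tries to
show that a struct-level reference taken before the last address-space
reference is dropped keeps the dumpability state readable until the task is
freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model values only; they mirror the kernel's names but are defined here. */
enum { SUID_DUMP_DISABLE = 0, SUID_DUMP_USER = 1 };

/* Hypothetical stand-in for mm_struct with its two reference counts. */
struct model_mm {
	int mm_users;		/* references pinning the address space */
	int mm_count;		/* references pinning the struct itself */
	int dumpable;		/* stands in for get_dumpable(mm) */
	bool mappings_live;	/* stands in for the process memory */
	bool freed;		/* set where the struct would be freed */
};

/* Hypothetical stand-in for the two task_struct members involved. */
struct model_task {
	struct model_mm *mm;
	struct model_mm *exit_mm;
};

void model_mm_init(struct model_mm *mm, int dumpable)
{
	mm->mm_users = 1;
	mm->mm_count = 1;	/* one struct reference held on behalf of all users */
	mm->dumpable = dumpable;
	mm->mappings_live = true;
	mm->freed = false;
}

void model_mmgrab(struct model_mm *mm)
{
	mm->mm_count++;
}

void model_mmdrop(struct model_mm *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0)
		mm->freed = true;
}

void model_mmput(struct model_mm *mm)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0) {
		mm->mappings_live = false;	/* address space is torn down */
		model_mmdrop(mm);		/* drop the users' struct reference */
	}
}

/* Models the patched exit_mm(): stash a struct reference, then drop ->mm. */
void model_exit_mm(struct model_task *t)
{
	struct model_mm *mm = t->mm;

	if (!mm)
		return;
	t->mm = NULL;
	model_mmgrab(mm);	/* for t->exit_mm */
	t->exit_mm = mm;
	model_mmput(mm);
}

/* Models the patched free_task(): release the stashed struct reference. */
void model_free_task(struct model_task *t)
{
	if (t->exit_mm) {
		model_mmdrop(t->exit_mm);
		t->exit_mm = NULL;
	}
}

/* Models only the dumpability part of the access check. */
bool model_may_access(const struct model_task *t, bool has_cap)
{
	const struct model_mm *mm = t->mm;

	if (!mm)
		mm = t->exit_mm;	/* fall back instead of failing open */
	if (mm && mm->dumpable != SUID_DUMP_USER && !has_cap)
		return false;
	return true;
}
```

In this model, a non-dumpable task that has gone through model_exit_mm()
is still refused by model_may_access() for a caller without the capability,
because the check reads the state through ->exit_mm. Without the fallback,
the mm pointer would be NULL and the check would pass.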