From patchwork Fri Oct 16 23:09:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jann Horn
X-Patchwork-Id: 11842441
From: Jann Horn
To: Andrew Morton, linux-mm@kvack.org, Eric Biederman, Oleg Nesterov
Cc: linux-kernel@vger.kernel.org, Will Deacon, Kees Cook, Ingo Molnar
Subject: [RFC PATCH resend 1/6] ptrace: Keep mm around after exit_mm() for __ptrace_may_access()
Date: Sat, 17 Oct 2020 01:09:10 +0200
Message-Id: <20201016230915.1972840-2-jannh@google.com>
X-Mailer: git-send-email 2.29.0.rc1.297.gfa9743e501-goog
In-Reply-To: <20201016230915.1972840-1-jannh@google.com>
References: <20201016230915.1972840-1-jannh@google.com>

__ptrace_may_access() checks can happen on target tasks that are in the
middle of do_exit(), past exit_mm().
At that point, the ->mm pointer has been NULLed out, and the mm_struct
has been mmput().

Unfortunately, the mm_struct contains the dumpability and the user_ns in
which the task last went through execve(), and we need those for
__ptrace_may_access(). Currently, that problem is handled by failing
open: If the ->mm is gone, we assume that the task was dumpable.

In some edge cases, this could potentially expose access to things like
/proc/$pid/fd/$fd of originally non-dumpable processes.
(exit_files() comes after exit_mm(), so the file descriptor table is
still there when we've gone through exit_mm().)

One way to fix this would be to move mm->user_ns and the dumpability
state over into the task_struct. However, that gets quite ugly if we
want to preserve existing semantics because e.g. PR_SET_DUMPABLE and
commit_creds() would then have to scan through all tasks sharing the
mm_struct and keep them in sync manually - that'd be a bit error-prone
and overcomplicated.

(Moving these things into the signal_struct is not an option because
that is kept across executions, and pre-execve co-threads will share the
signal_struct that is also used by the task that has gone through
execve().)

I believe that this patch may be the least bad option to fix this - keep
the mm_struct (but not process memory) around with an mmgrab() reference
from exit_mm() until the task goes away completely.

Note that this moves free_task() down in order to make mmdrop_async()
available without a forward declaration.
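
The description above depends on the difference between the reference that pins the address space (dropped by mmput()) and the reference that pins only the mm_struct itself (taken by mmgrab()). The following is a minimal user-space sketch of that two-counter scheme, not kernel code: struct toy_mm and the toy_* helpers are hypothetical names invented for illustration, and the boolean flags merely stand in for "mappings still exist" and "struct still allocated".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for mm_struct; hypothetical, NOT the kernel definition. */
struct toy_mm {
	int mm_users;		/* users of the address space (mmget/mmput) */
	int mm_count;		/* references to the struct itself (mmgrab/mmdrop) */
	bool mappings_alive;	/* models the process memory */
	bool struct_alive;	/* models the mm_struct allocation */
	int dumpable;		/* metadata a later permission check wants to read */
};

void toy_mm_init(struct toy_mm *mm, int dumpable)
{
	mm->mm_users = 1;
	mm->mm_count = 1;	/* all mm_users together hold one mm_count reference */
	mm->mappings_alive = true;
	mm->struct_alive = true;
	mm->dumpable = dumpable;
}

/* Models mmgrab(): pin the struct, but not the address space. */
void toy_mmgrab(struct toy_mm *mm)
{
	assert(mm->struct_alive);
	mm->mm_count++;
}

/* Models mmdrop(): the struct goes away with the last mm_count reference. */
void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->mm_count > 0);
	if (--mm->mm_count == 0)
		mm->struct_alive = false;
}

/* Models mmput(): the last user tears down the mappings, then drops the struct ref. */
void toy_mmput(struct toy_mm *mm)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0) {
		mm->mappings_alive = false;
		toy_mmdrop(mm);
	}
}
```

In this model, calling toy_mmgrab() before the final toy_mmput() leaves the mappings gone but the struct (and its dumpable field) readable until a matching toy_mmdrop(), which is the property the patch relies on for ->exit_mm.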
Cc: stable@vger.kernel.org
Fixes: bfedb589252c ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
Signed-off-by: Jann Horn
---
 include/linux/sched.h |  8 +++++++
 kernel/exit.c         |  2 ++
 kernel/fork.c         | 54 ++++++++++++++++++++++---------------------
 kernel/ptrace.c       | 10 ++++++++
 4 files changed, 48 insertions(+), 26 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index afe01e232935..55bec6ff5626 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -747,6 +747,14 @@ struct task_struct {
 
 	struct mm_struct *mm;
 	struct mm_struct *active_mm;
+	/*
+	 * When we exit and ->mm (the reference pinning ->mm's address space)
+	 * goes away, we stash a reference to the mm_struct itself (counted via
+	 * exit_mm->mm_count) in this member.
+	 * This allows us to continue using the mm_struct for security checks
+	 * and such even after the task has started exiting.
+	 */
+	struct mm_struct *exit_mm;
 
 	/* Per-thread vma caching: */
 	struct vmacache vmacache;
diff --git a/kernel/exit.c b/kernel/exit.c
index 733e80f334e7..97253ef33486 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -476,6 +476,8 @@ static void exit_mm(void)
 	/* more a memory barrier than a real lock */
 	task_lock(current);
 	current->mm = NULL;
+	mmgrab(mm); /* for current->exit_mm */
+	current->exit_mm = mm;
 	mmap_read_unlock(mm);
 	enter_lazy_tlb(mm, current);
 	task_unlock(current);
diff --git a/kernel/fork.c b/kernel/fork.c
index da8d360fb032..4942428a217c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -438,32 +438,6 @@ void put_task_stack(struct task_struct *tsk)
 }
 #endif
 
-void free_task(struct task_struct *tsk)
-{
-	scs_release(tsk);
-
-#ifndef CONFIG_THREAD_INFO_IN_TASK
-	/*
-	 * The task is finally done with both the stack and thread_info,
-	 * so free both.
-	 */
-	release_task_stack(tsk);
-#else
-	/*
-	 * If the task had a separate stack allocation, it should be gone
-	 * by now.
-	 */
-	WARN_ON_ONCE(refcount_read(&tsk->stack_refcount) != 0);
-#endif
-	rt_mutex_debug_task_free(tsk);
-	ftrace_graph_exit_task(tsk);
-	arch_release_task_struct(tsk);
-	if (tsk->flags & PF_KTHREAD)
-		free_kthread_struct(tsk);
-	free_task_struct(tsk);
-}
-EXPORT_SYMBOL(free_task);
-
 #ifdef CONFIG_MMU
 static __latent_entropy int dup_mmap(struct mm_struct *mm,
 					struct mm_struct *oldmm)
@@ -722,6 +696,34 @@ static inline void put_signal_struct(struct signal_struct *sig)
 		free_signal_struct(sig);
 }
 
+void free_task(struct task_struct *tsk)
+{
+	scs_release(tsk);
+
+#ifndef CONFIG_THREAD_INFO_IN_TASK
+	/*
+	 * The task is finally done with both the stack and thread_info,
+	 * so free both.
+	 */
+	release_task_stack(tsk);
+#else
+	/*
+	 * If the task had a separate stack allocation, it should be gone
+	 * by now.
+	 */
+	WARN_ON_ONCE(refcount_read(&tsk->stack_refcount) != 0);
+#endif
+	rt_mutex_debug_task_free(tsk);
+	ftrace_graph_exit_task(tsk);
+	arch_release_task_struct(tsk);
+	if (tsk->flags & PF_KTHREAD)
+		free_kthread_struct(tsk);
+	if (tsk->exit_mm)
+		mmdrop_async(tsk->exit_mm);
+	free_task_struct(tsk);
+}
+EXPORT_SYMBOL(free_task);
+
 void __put_task_struct(struct task_struct *tsk)
 {
 	WARN_ON(!tsk->exit_state);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 43d6179508d6..0aedc6cf5bdc 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -342,7 +342,17 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	 * Pairs with a write barrier in commit_creds().
 	 */
 	smp_rmb();
+	/*
+	 * Look up the target task's mm_struct. If it fails because the task is
+	 * exiting and has gone through exit_mm(), we can instead use ->exit_mm
+	 * as long as we only use members that are preserved by an mmgrab()
+	 * reference.
+	 * The only case in which both ->mm and ->exit_mm can be NULL should be
+	 * kernel threads.
+	 */
 	mm = task->mm;
+	if (!mm)
+		mm = task->exit_mm;
 	if (mm &&
 	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
 	     !ptrace_has_cap(cred, mm->user_ns, mode)))
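
For readers who want to see the behavioural change of the kernel/ptrace.c hunk in isolation, here is a rough user-space sketch contrasting the old fail-open lookup with the ->exit_mm fallback. It is an illustration under stated assumptions, not the kernel implementation: struct sketch_mm, struct sketch_task and both sketch_may_access_*() functions are hypothetical, the capability check is reduced to a boolean parameter, and only the value of SUID_DUMP_USER (1) is taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SUID_DUMP_USER 1	/* same value as the kernel's SUID_DUMP_USER */

/* Hypothetical, heavily reduced stand-ins for mm_struct and task_struct. */
struct sketch_mm {
	int dumpable;
};

struct sketch_task {
	struct sketch_mm *mm;		/* NULL once the task is past exit_mm() */
	struct sketch_mm *exit_mm;	/* stashed by exit_mm() in the patched kernel */
};

/* Shared dumpability test; has_cap models a successful ptrace_has_cap(). */
static bool sketch_check(const struct sketch_mm *mm, bool has_cap)
{
	if (mm && mm->dumpable != SKETCH_SUID_DUMP_USER && !has_cap)
		return false;	/* would be -EPERM in the kernel */
	return true;
}

/* Old behaviour: a NULL ->mm skips the check entirely (fails open). */
bool sketch_may_access_old(const struct sketch_task *t, bool has_cap)
{
	assert(t != NULL);
	return sketch_check(t->mm, has_cap);
}

/* Patched behaviour: fall back to ->exit_mm before deciding. */
bool sketch_may_access_new(const struct sketch_task *t, bool has_cap)
{
	const struct sketch_mm *mm;

	assert(t != NULL);
	mm = t->mm;
	if (!mm)
		mm = t->exit_mm;
	return sketch_check(mm, has_cap);
}
```

With a non-dumpable task that has already gone through exit_mm(), the old variant grants access while the new one denies it unless the caller holds the capability; a task with both pointers NULL (the kernel-thread case named in the comment) is treated the same by both.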