From patchwork Mon Aug 16 19:48:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439139 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6A5DC4320A for ; Mon, 16 Aug 2021 19:49:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B5BC260F4B for ; Mon, 16 Aug 2021 19:49:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229587AbhHPTuU (ORCPT ); Mon, 16 Aug 2021 15:50:20 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:20174 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231414AbhHPTuT (ORCPT ); Mon, 16 Aug 2021 15:50:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143387; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xeZs2Vbi3LPUYE5L2DmMtSeAlr7D2lLP2ZjOK1a2vac=; b=FqhFGGCFx6621JuYObxiXHXmR59J+pLvl2xLRdPVNy00SQ1wRmd7obh7BpqwlTLE0zZBCl IlKrbY8BW43036wKQwYX/Io1e83upZdrk94R1xXli3TEqbxRe15TTJiW+GMwWbZjPn0wOe iFye7rWvZA8p2KmtQ9j0EGJ2c9NuYRE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-468-CHcbLj9aNsi8F8QsqNZN7w-1; Mon, 16 Aug 2021 15:49:46 -0400 X-MC-Unique: CHcbLj9aNsi8F8QsqNZN7w-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AC5FA100A606; Mon, 16 Aug 2021 19:49:40 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id 874FF18017; Mon, 16 Aug 2021 19:49:05 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 1/7] binfmt: don't use MAP_DENYWRITE when loading shared libraries via uselib() Date: Mon, 16 Aug 2021 21:48:34 +0200 Message-Id: <20210816194840.42769-2-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org uselib() is the legacy systemcall for loading shared libraries. Nowadays, applications use dlopen() to load shared libraries, completely implemented in user space via mmap(). For example, glibc uses MAP_COPY to mmap shared libraries. While this maps to MAP_PRIVATE | MAP_DENYWRITE on Linux, Linux ignores any MAP_DENYWRITE specification from user space in mmap. With this change, all remaining in-tree users of MAP_DENYWRITE use it to map an executable. We will be able to open shared libraries loaded via uselib() writable, just as we already can via dlopen() from user space. This is one step into the direction of removing MAP_DENYWRITE from the kernel. This can be considered a minor user space visible change. Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- arch/x86/ia32/ia32_aout.c | 2 +- fs/binfmt_aout.c | 2 +- fs/binfmt_elf.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 5e5b9fc2747f..321d7b22ad2d 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -293,7 +293,7 @@ static int load_aout_library(struct file *file) /* Now use mmap to map the library into memory. */ error = vm_mmap(file, start_addr, ex.a_text + ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | MAP_32BIT, + MAP_FIXED | MAP_PRIVATE | MAP_32BIT, N_TXTOFF(ex)); retval = error; if (error != start_addr) diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index 145917f734fe..d29de971d3f3 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -309,7 +309,7 @@ static int load_aout_library(struct file *file) /* Now use mmap to map the library into memory. */ error = vm_mmap(file, start_addr, ex.a_text + ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE, + MAP_FIXED | MAP_PRIVATE; N_TXTOFF(ex)); retval = error; if (error != start_addr) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 439ed81e755a..6d2c79533631 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1384,7 +1384,7 @@ static int load_elf_library(struct file *file) (eppnt->p_filesz + ELF_PAGEOFFSET(eppnt->p_vaddr)), PROT_READ | PROT_WRITE | PROT_EXEC, - MAP_FIXED_NOREPLACE | MAP_PRIVATE | MAP_DENYWRITE, + MAP_FIXED_NOREPLACE | MAP_PRIVATE, (eppnt->p_offset - ELF_PAGEOFFSET(eppnt->p_vaddr))); if (error != ELF_PAGESTART(eppnt->p_vaddr)) From patchwork Mon Aug 16 19:48:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439141 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72826C4320E for ; Mon, 16 Aug 2021 19:50:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5396A60F58 for ; Mon, 16 Aug 2021 19:50:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231585AbhHPTuo (ORCPT ); Mon, 16 Aug 2021 15:50:44 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:22829 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231544AbhHPTul (ORCPT ); Mon, 16 Aug 2021 15:50:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143409; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OaQssic/mSdAfbGEFpAxDDR1LNO8rKXcgan69fY/97k=; b=UiTpUNARHLV92D7i8XI6opQ/5r9lBBWs/A97WMilxfks6EbqxueNRThlApox5gmIcacxSP QbYdmOnQDvHQi44joaN1ryDItPJ/2Ps8YlVmO9mdeIMa3PuP6sUULPpeJoxU1FR+sTseL2 wVL0zhnXw0wlhTJK1oNvvqutDvrpFno= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-231-LVUpcUlEOZmpnKyAYdPJHA-1; Mon, 16 Aug 2021 15:50:08 -0400 X-MC-Unique: LVUpcUlEOZmpnKyAYdPJHA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id BE341760C0; Mon, 16 Aug 2021 19:50:00 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1536118017; Mon, 16 Aug 2021 19:49:40 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 2/7] kernel/fork: factor out replacing the current MM exe_file Date: Mon, 16 Aug 2021 21:48:35 +0200 Message-Id: <20210816194840.42769-3-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Let's factor the main logic out into replace_mm_exe_file(), such that all mm->exe_file logic is contained in kernel/fork.c. While at it, perform some simple cleanups that are possible now that we're simplifying the individual functions. Acked-by: Christian Brauner Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- include/linux/mm.h | 1 + kernel/fork.c | 44 +++++++++++++++++++++++++++++++++++++++++--- kernel/sys.c | 33 +-------------------------------- 3 files changed, 43 insertions(+), 35 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7ca22e6e694a..48c6fa9ab792 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2581,6 +2581,7 @@ extern int mm_take_all_locks(struct mm_struct *mm); extern void mm_drop_all_locks(struct mm_struct *mm); extern void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file); +extern int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file); extern struct file *get_mm_exe_file(struct mm_struct *mm); extern struct file *get_task_exe_file(struct task_struct *task); diff --git a/kernel/fork.c b/kernel/fork.c index bc94b2cc5995..eedce5c77041 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1148,9 +1148,7 @@ void mmput_async(struct mm_struct *mm) * * Main users are mmput() and sys_execve(). Callers prevent concurrent * invocations: in mmput() nobody alive left, in execve task is single - * threaded. sys_prctl(PR_SET_MM_MAP/EXE_FILE) also needs to set the - * mm->exe_file, but does so without using set_mm_exe_file() in order - * to avoid the need for any locks. + * threaded. */ void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) { @@ -1170,6 +1168,46 @@ void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) fput(old_exe_file); } +/** + * replace_mm_exe_file - replace a reference to the mm's executable file + * + * This changes mm's executable file (shown as symlink /proc/[pid]/exe), + * dealing with concurrent invocation and without grabbing the mmap lock in + * write mode. + * + * Main user is sys_prctl(PR_SET_MM_MAP/EXE_FILE). + */ +int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) +{ + struct vm_area_struct *vma; + struct file *old_exe_file; + int ret = 0; + + /* Forbid mm->exe_file change if old file still mapped. */ + old_exe_file = get_mm_exe_file(mm); + if (old_exe_file) { + mmap_read_lock(mm); + for (vma = mm->mmap; vma && !ret; vma = vma->vm_next) { + if (!vma->vm_file) + continue; + if (path_equal(&vma->vm_file->f_path, + &old_exe_file->f_path)) + ret = -EBUSY; + } + mmap_read_unlock(mm); + fput(old_exe_file); + if (ret) + return ret; + } + + /* set the new file, lockless */ + get_file(new_exe_file); + old_exe_file = xchg(&mm->exe_file, new_exe_file); + if (old_exe_file) + fput(old_exe_file); + return 0; +} + /** * get_mm_exe_file - acquire a reference to the mm's executable file * diff --git a/kernel/sys.c b/kernel/sys.c index ef1a78f5d71c..30c12e54585a 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -1846,7 +1846,6 @@ SYSCALL_DEFINE1(umask, int, mask) static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd) { struct fd exe; - struct file *old_exe, *exe_file; struct inode *inode; int err; @@ -1869,40 +1868,10 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd) if (err) goto exit; - /* - * Forbid mm->exe_file change if old file still mapped. - */ - exe_file = get_mm_exe_file(mm); - err = -EBUSY; - if (exe_file) { - struct vm_area_struct *vma; - - mmap_read_lock(mm); - for (vma = mm->mmap; vma; vma = vma->vm_next) { - if (!vma->vm_file) - continue; - if (path_equal(&vma->vm_file->f_path, - &exe_file->f_path)) - goto exit_err; - } - - mmap_read_unlock(mm); - fput(exe_file); - } - - err = 0; - /* set the new file, lockless */ - get_file(exe.file); - old_exe = xchg(&mm->exe_file, exe.file); - if (old_exe) - fput(old_exe); + err = replace_mm_exe_file(mm, exe.file); exit: fdput(exe); return err; -exit_err: - mmap_read_unlock(mm); - fput(exe_file); - goto exit; } /* From patchwork Mon Aug 16 19:48:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439159 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60BB8C4338F for ; Mon, 16 Aug 2021 19:50:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4CB1560FC3 for ; Mon, 16 Aug 2021 19:50:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230515AbhHPTvI (ORCPT ); Mon, 16 Aug 2021 15:51:08 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:20811 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231609AbhHPTvF (ORCPT ); Mon, 16 Aug 2021 15:51:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143432; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kP9Ruk7HoVFrsIOE9R+legB98IpgGisc7R6rrwUCgRc=; b=KrafWbe74Z6kBMP4N1wq+I5FwjZSgNB2qAApxH8WsANrnp0jlC8zzt6bMaTjfUlUh//K9R B9NQdfDZb65Xw0Lj7LE93rpZeyAfndkkQRoiAJgw2W2dG1eaQW2vX3jxPl/pqumNkCezp1 LrDma+p206ppc0RFmtzsZ8GLLBuILrM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-453-VYoUJnKgNyGGcj_-48ogbw-1; Mon, 16 Aug 2021 15:50:31 -0400 X-MC-Unique: VYoUJnKgNyGGcj_-48ogbw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 59F97107ACF5; Mon, 16 Aug 2021 19:50:25 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id 264AE5C1D5; Mon, 16 Aug 2021 19:50:00 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 3/7] kernel/fork: always deny write access to current MM exe_file Date: Mon, 16 Aug 2021 21:48:36 +0200 Message-Id: <20210816194840.42769-4-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We want to remove VM_DENYWRITE only currently only used when mapping the executable during exec. During exec, we already deny_write_access() the executable, however, after exec completes the VMAs mapped with VM_DENYWRITE effectively keeps write access denied via deny_write_access(). Let's deny write access when setting or replacing the MM exe_file. With this change, we can remove VM_DENYWRITE for mapping executables. Make set_mm_exe_file() return an error in case deny_write_access() fails; note that this should never happen, because exec code does a deny_write_access() early and keeps write access denied when calling set_mm_exe_file. However, it makes the code easier to read and makes set_mm_exe_file() and replace_mm_exe_file() look more similar. This represents a minor user space visible change: sys_prctl(PR_SET_MM_MAP/EXE_FILE) can now fail if the file is already opened writable. Also, after sys_prctl(PR_SET_MM_MAP/EXE_FILE) the file cannot be opened writable. Note that we can already fail with -EACCES if the file doesn't have execute permissions. Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- fs/exec.c | 4 +++- include/linux/mm.h | 2 +- kernel/fork.c | 50 ++++++++++++++++++++++++++++++++++++++++------ 3 files changed, 48 insertions(+), 8 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 38f63451b928..9294049f5487 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1270,7 +1270,9 @@ int begin_new_exec(struct linux_binprm * bprm) * not visibile until then. This also enables the update * to be lockless. */ - set_mm_exe_file(bprm->mm, bprm->file); + retval = set_mm_exe_file(bprm->mm, bprm->file); + if (retval) + goto out; /* If the binary is not readable then enforce mm->dumpable=0 */ would_dump(bprm, bprm->file); diff --git a/include/linux/mm.h b/include/linux/mm.h index 48c6fa9ab792..56b1cd41db61 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2580,7 +2580,7 @@ static inline int check_data_rlimit(unsigned long rlim, extern int mm_take_all_locks(struct mm_struct *mm); extern void mm_drop_all_locks(struct mm_struct *mm); -extern void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file); +extern int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file); extern int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file); extern struct file *get_mm_exe_file(struct mm_struct *mm); extern struct file *get_task_exe_file(struct task_struct *task); diff --git a/kernel/fork.c b/kernel/fork.c index eedce5c77041..543541764865 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -470,6 +470,20 @@ void free_task(struct task_struct *tsk) } EXPORT_SYMBOL(free_task); +static void dup_mm_exe_file(struct mm_struct *mm, struct mm_struct *oldmm) +{ + struct file *exe_file; + + exe_file = get_mm_exe_file(oldmm); + RCU_INIT_POINTER(mm->exe_file, exe_file); + /* + * We depend on the oldmm having properly denied write access to the + * exe_file already. + */ + if (exe_file && deny_write_access(exe_file)) + pr_warn_once("deny_write_access() failed in %s\n", __func__); +} + #ifdef CONFIG_MMU static __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) @@ -493,7 +507,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, mmap_write_lock_nested(mm, SINGLE_DEPTH_NESTING); /* No ordering required: file already has been exposed. */ - RCU_INIT_POINTER(mm->exe_file, get_mm_exe_file(oldmm)); + dup_mm_exe_file(mm, oldmm); mm->total_vm = oldmm->total_vm; mm->data_vm = oldmm->data_vm; @@ -639,7 +653,7 @@ static inline void mm_free_pgd(struct mm_struct *mm) static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) { mmap_write_lock(oldmm); - RCU_INIT_POINTER(mm->exe_file, get_mm_exe_file(oldmm)); + dup_mm_exe_file(mm, oldmm); mmap_write_unlock(oldmm); return 0; } @@ -1149,8 +1163,10 @@ void mmput_async(struct mm_struct *mm) * Main users are mmput() and sys_execve(). Callers prevent concurrent * invocations: in mmput() nobody alive left, in execve task is single * threaded. + * + * Can only fail if new_exe_file != NULL. */ -void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) +int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) { struct file *old_exe_file; @@ -1161,11 +1177,21 @@ void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) */ old_exe_file = rcu_dereference_raw(mm->exe_file); - if (new_exe_file) + if (new_exe_file) { + /* + * We expect the caller (i.e., sys_execve) to already denied + * write access, so this is unlikely to fail. + */ + if (unlikely(deny_write_access(new_exe_file))) + return -EACCES; get_file(new_exe_file); + } rcu_assign_pointer(mm->exe_file, new_exe_file); - if (old_exe_file) + if (old_exe_file) { + allow_write_access(old_exe_file); fput(old_exe_file); + } + return 0; } /** @@ -1201,10 +1227,22 @@ int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) } /* set the new file, lockless */ + ret = deny_write_access(new_exe_file); + if (ret) + return -EACCES; get_file(new_exe_file); + old_exe_file = xchg(&mm->exe_file, new_exe_file); - if (old_exe_file) + if (old_exe_file) { + /* + * Don't race with dup_mmap() getting the file and disallowing + * write access while someone might open the file writable. + */ + mmap_read_lock(mm); + allow_write_access(old_exe_file); fput(old_exe_file); + mmap_read_unlock(mm); + } return 0; } From patchwork Mon Aug 16 19:48:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A180C43214 for ; Mon, 16 Aug 2021 19:50:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E651C6103A for ; Mon, 16 Aug 2021 19:50:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231640AbhHPTvZ (ORCPT ); Mon, 16 Aug 2021 15:51:25 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:57627 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231288AbhHPTvX (ORCPT ); Mon, 16 Aug 2021 15:51:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143451; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Gs3y/a2DCKk81G5bIWP/wpRKwhslZEVKrtw3Fq3vrSs=; b=Rv75YnjQJQlkH9OkCC3zNQ0wuSrzB+Apm3sEN7YxO4H+rwhqadDq1bnrlJh1hWx/EOYr4w 7sa2w9MbEiA9HvmJ5gcWuNX1M0WOI5s5YRdCqa8S4CKyzw34xFXSrWXj8+bC3uauB0aoQM YlHIYCLvxifklchcC3dWJWLTuMMTgug= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-416-gTboXFDsNZK8F-hieBIhqQ-1; Mon, 16 Aug 2021 15:50:50 -0400 X-MC-Unique: gTboXFDsNZK8F-hieBIhqQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 674C7100A605; Mon, 16 Aug 2021 19:50:45 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id B35DB5C1D5; Mon, 16 Aug 2021 19:50:25 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 4/7] binfmt: remove in-tree usage of MAP_DENYWRITE Date: Mon, 16 Aug 2021 21:48:37 +0200 Message-Id: <20210816194840.42769-5-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org At exec time when we mmap the new executable via MAP_DENYWRITE we have it opened via do_open_execat() and already deny_write_access()'ed the file successfully. Once exec completes, we allow_write_acces(); however, we set mm->exe_file in begin_new_exec() via set_mm_exe_file() and also deny_write_access() as long as mm->exe_file remains set. We'll effectively deny write access to our executable via mm->exe_file until mm->exe_file is changed -- when the process is removed, on new exec, or via sys_prctl(PR_SET_MM_MAP/EXE_FILE). Let's remove all usage of MAP_DENYWRITE, it's no longer necessary for mm->exe_file. In case of an elf interpreter, we'll now only deny write access to the file during exec. This is somewhat okay, because the interpreter behaves (and sometime is) a shared library; all shared libraries, especially the ones loaded directly in user space like via dlopen() won't ever be mapped via MAP_DENYWRITE, because we ignore that from user space completely; these shared libraries can always be modified while mapped and executed. Let's only special-case the main executable, denying write access while being executed by a process. This can be considered a minor user space visible change. While this is a cleanup, it also fixes part of a problem reported with VM_DENYWRITE on overlayfs, as VM_DENYWRITE is effectively unused with this patch and will be removed next: "Overlayfs did not honor positive i_writecount on realfile for VM_DENYWRITE mappings." [1] [1] https://lore.kernel.org/r/YNHXzBgzRrZu1MrD@miu.piliscsaba.redhat.com/ Reported-by: Chengguang Xu Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- arch/x86/ia32/ia32_aout.c | 6 ++---- fs/binfmt_aout.c | 5 ++--- fs/binfmt_elf.c | 4 ++-- fs/binfmt_elf_fdpic.c | 2 +- 4 files changed, 7 insertions(+), 10 deletions(-) diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 321d7b22ad2d..9bd15241fadb 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -202,8 +202,7 @@ static int load_aout_binary(struct linux_binprm *bprm) error = vm_mmap(bprm->file, N_TXTADDR(ex), ex.a_text, PROT_READ | PROT_EXEC, - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | - MAP_32BIT, + MAP_FIXED | MAP_PRIVATE | MAP_32BIT, fd_offset); if (error != N_TXTADDR(ex)) @@ -211,8 +210,7 @@ static int load_aout_binary(struct linux_binprm *bprm) error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE | - MAP_32BIT, + MAP_FIXED | MAP_PRIVATE | MAP_32BIT, fd_offset + ex.a_text); if (error != N_DATADDR(ex)) return error; diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index d29de971d3f3..a47496d0f123 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -221,8 +221,7 @@ static int load_aout_binary(struct linux_binprm * bprm) } error = vm_mmap(bprm->file, N_TXTADDR(ex), ex.a_text, - PROT_READ | PROT_EXEC, - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE, + PROT_READ | PROT_EXEC, MAP_FIXED | MAP_PRIVATE, fd_offset); if (error != N_TXTADDR(ex)) @@ -230,7 +229,7 @@ static int load_aout_binary(struct linux_binprm * bprm) error = vm_mmap(bprm->file, N_DATADDR(ex), ex.a_data, PROT_READ | PROT_WRITE | PROT_EXEC, - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE, + MAP_FIXED | MAP_PRIVATE, fd_offset + ex.a_text); if (error != N_DATADDR(ex)) return error; diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 6d2c79533631..69d900a8473d 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -622,7 +622,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex, eppnt = interp_elf_phdata; for (i = 0; i < interp_elf_ex->e_phnum; i++, eppnt++) { if (eppnt->p_type == PT_LOAD) { - int elf_type = MAP_PRIVATE | MAP_DENYWRITE; + int elf_type = MAP_PRIVATE; int elf_prot = make_prot(eppnt->p_flags, arch_state, true, true); unsigned long vaddr = 0; @@ -1070,7 +1070,7 @@ static int load_elf_binary(struct linux_binprm *bprm) elf_prot = make_prot(elf_ppnt->p_flags, &arch_state, !!interpreter, false); - elf_flags = MAP_PRIVATE | MAP_DENYWRITE; + elf_flags = MAP_PRIVATE; vaddr = elf_ppnt->p_vaddr; /* diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index cf4028487dcc..6d8fd6030cbb 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -1041,7 +1041,7 @@ static int elf_fdpic_map_file_by_direct_mmap(struct elf_fdpic_params *params, if (phdr->p_flags & PF_W) prot |= PROT_WRITE; if (phdr->p_flags & PF_X) prot |= PROT_EXEC; - flags = MAP_PRIVATE | MAP_DENYWRITE; + flags = MAP_PRIVATE; maddr = 0; switch (params->flags & ELF_FDPIC_FLAG_ARRANGEMENT) { From patchwork Mon Aug 16 19:48:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84152C4320A for ; Mon, 16 Aug 2021 19:51:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 67B7F61056 for ; Mon, 16 Aug 2021 19:51:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231670AbhHPTvq (ORCPT ); Mon, 16 Aug 2021 15:51:46 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:39015 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231646AbhHPTvp (ORCPT ); Mon, 16 Aug 2021 15:51:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143473; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hJpZHLDKvzGPtPhMPjr76EVS+xDwvDkriAy1eTHJ9n4=; b=JABHp3osTGE2qVhTKf1rR2tZpzN70PAcI7QDEeAwbIrYKkC/J2HbqAliE/6YuOOVYaA4hC MUcbYQx/52pCitfj2VPs3PuzDlD2B0NBv9N1gndaAp1z3g0Z6FdP84pshxfIDS3oXX52sk nMFkW6CfwIAshlmzvWeu6vFhKTdLIFw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-357-HOo_TwmXOV2DyJXtQCaRxw-1; Mon, 16 Aug 2021 15:51:11 -0400 X-MC-Unique: HOo_TwmXOV2DyJXtQCaRxw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 658AB107ACF5; Mon, 16 Aug 2021 19:51:06 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id C44C51803B; Mon, 16 Aug 2021 19:50:45 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 5/7] mm: remove VM_DENYWRITE Date: Mon, 16 Aug 2021 21:48:38 +0200 Message-Id: <20210816194840.42769-6-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org All in-tree users of MAP_DENYWRITE are gone. MAP_DENYWRITE cannot be set from user space, so all users are gone; let's remove it. Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- fs/proc/task_mmu.c | 1 - include/linux/mm.h | 1 - include/linux/mman.h | 1 - include/trace/events/mmflags.h | 1 - kernel/events/core.c | 2 -- kernel/fork.c | 3 --- lib/test_printf.c | 5 ++--- mm/mmap.c | 27 +++------------------------ 8 files changed, 5 insertions(+), 36 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index eb97468dfe4c..cf25be3e0321 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -619,7 +619,6 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) [ilog2(VM_MAYSHARE)] = "ms", [ilog2(VM_GROWSDOWN)] = "gd", [ilog2(VM_PFNMAP)] = "pf", - [ilog2(VM_DENYWRITE)] = "dw", [ilog2(VM_LOCKED)] = "lo", [ilog2(VM_IO)] = "io", [ilog2(VM_SEQ_READ)] = "sr", diff --git a/include/linux/mm.h b/include/linux/mm.h index 56b1cd41db61..257995f62e83 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -281,7 +281,6 @@ extern unsigned int kobjsize(const void *objp); #define VM_GROWSDOWN 0x00000100 /* general info on the segment */ #define VM_UFFD_MISSING 0x00000200 /* missing pages tracking */ #define VM_PFNMAP 0x00000400 /* Page-ranges managed without "struct page", just pure PFN */ -#define VM_DENYWRITE 0x00000800 /* ETXTBSY on write attempts.. */ #define VM_UFFD_WP 0x00001000 /* wrprotect pages tracking */ #define VM_LOCKED 0x00002000 diff --git a/include/linux/mman.h b/include/linux/mman.h index ebb09a964272..bd9aadda047b 100644 --- a/include/linux/mman.h +++ b/include/linux/mman.h @@ -153,7 +153,6 @@ static inline unsigned long calc_vm_flag_bits(unsigned long flags) { return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) | - _calc_vm_trans(flags, MAP_DENYWRITE, VM_DENYWRITE ) | _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | arch_calc_vm_flag_bits(flags); diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index 390270e00a1d..f44c3fb8da1a 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -163,7 +163,6 @@ IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison") {VM_UFFD_MISSING, "uffd_missing" }, \ IF_HAVE_UFFD_MINOR(VM_UFFD_MINOR, "uffd_minor" ) \ {VM_PFNMAP, "pfnmap" }, \ - {VM_DENYWRITE, "denywrite" }, \ {VM_UFFD_WP, "uffd_wp" }, \ {VM_LOCKED, "locked" }, \ {VM_IO, "io" }, \ diff --git a/kernel/events/core.c b/kernel/events/core.c index 1cb1f9b8392e..19767bb9933c 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8307,8 +8307,6 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event) else flags = MAP_PRIVATE; - if (vma->vm_flags & VM_DENYWRITE) - flags |= MAP_DENYWRITE; if (vma->vm_flags & VM_LOCKED) flags |= MAP_LOCKED; if (is_vm_hugetlb_page(vma)) diff --git a/kernel/fork.c b/kernel/fork.c index 543541764865..34d21642c44f 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -570,12 +570,9 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT); file = tmp->vm_file; if (file) { - struct inode *inode = file_inode(file); struct address_space *mapping = file->f_mapping; get_file(file); - if (tmp->vm_flags & VM_DENYWRITE) - put_write_access(inode); i_mmap_lock_write(mapping); if (tmp->vm_flags & VM_SHARED) mapping_allow_writable(mapping); diff --git a/lib/test_printf.c b/lib/test_printf.c index 8ac71aee46af..8a48b61c3763 100644 --- a/lib/test_printf.c +++ b/lib/test_printf.c @@ -675,9 +675,8 @@ flags(void) "uptodate|dirty|lru|active|swapbacked", cmp_buffer); - flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC - | VM_DENYWRITE; - test("read|exec|mayread|maywrite|mayexec|denywrite", "%pGv", &flags); + flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; + test("read|exec|mayread|maywrite|mayexec", "%pGv", &flags); gfp = GFP_TRANSHUGE; test("GFP_TRANSHUGE", "%pGg", &gfp); diff --git a/mm/mmap.c b/mm/mmap.c index ca54d36d203a..589dc1dc13db 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -148,8 +148,6 @@ void vma_set_page_prot(struct vm_area_struct *vma) static void __remove_shared_vm_struct(struct vm_area_struct *vma, struct file *file, struct address_space *mapping) { - if (vma->vm_flags & VM_DENYWRITE) - allow_write_access(file); if (vma->vm_flags & VM_SHARED) mapping_unmap_writable(mapping); @@ -666,8 +664,6 @@ static void __vma_link_file(struct vm_area_struct *vma) if (file) { struct address_space *mapping = file->f_mapping; - if (vma->vm_flags & VM_DENYWRITE) - put_write_access(file_inode(file)); if (vma->vm_flags & VM_SHARED) mapping_allow_writable(mapping); @@ -1788,22 +1784,12 @@ unsigned long mmap_region(struct file *file, unsigned long addr, vma->vm_pgoff = pgoff; if (file) { - if (vm_flags & VM_DENYWRITE) { - error = deny_write_access(file); - if (error) - goto free_vma; - } if (vm_flags & VM_SHARED) { error = mapping_map_writable(file->f_mapping); if (error) - goto allow_write_and_free_vma; + goto free_vma; } - /* ->mmap() can change vma->vm_file, but must guarantee that - * vma_link() below can deny write-access if VM_DENYWRITE is set - * and map writably if VM_SHARED is set. This usually means the - * new file must not have been exposed to user-space, yet. - */ vma->vm_file = get_file(file); error = call_mmap(file, vma); if (error) @@ -1860,13 +1846,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr, vma_link(mm, vma, prev, rb_link, rb_parent); /* Once vma denies write, undo our temporary denial count */ - if (file) { unmap_writable: - if (vm_flags & VM_SHARED) - mapping_unmap_writable(file->f_mapping); - if (vm_flags & VM_DENYWRITE) - allow_write_access(file); - } + if (file && vm_flags & VM_SHARED) + mapping_unmap_writable(file->f_mapping); file = vma->vm_file; out: perf_event_mmap(vma); @@ -1906,9 +1888,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr, charged = 0; if (vm_flags & VM_SHARED) mapping_unmap_writable(file->f_mapping); -allow_write_and_free_vma: - if (vm_flags & VM_DENYWRITE) - allow_write_access(file); free_vma: vm_area_free(vma); unacct_error: From patchwork Mon Aug 16 19:48:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC1B5C4320E for ; Mon, 16 Aug 2021 19:51:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D3B1660F55 for ; Mon, 16 Aug 2021 19:51:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231534AbhHPTwF (ORCPT ); Mon, 16 Aug 2021 15:52:05 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:58265 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231143AbhHPTwE (ORCPT ); Mon, 16 Aug 2021 15:52:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143492; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SpO66406P+aAOH9nMaFQuRCCfyFRAgAvhM8DGsbyqP0=; b=Qt66zyvNpdbC2gSla3lAyHJeT/DMjXNWujav+J0pH08E3YuVb6LEVkByvZVu4VUWF6wowJ tWKvuLABNX7lQg7/acM+j4tbFUTOKADOHmZJ6MOgn4GtaJq+RBnPrgJUU3hPF+f2vbEmzK LiapnX4eb/RdwLA2bzldN+fjs2D8ML8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-232-RLnww2ZlPDyuojNV_LRHyA-1; Mon, 16 Aug 2021 15:51:31 -0400 X-MC-Unique: RLnww2ZlPDyuojNV_LRHyA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1199A190A7A0; Mon, 16 Aug 2021 19:51:26 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id C242D5C1D5; Mon, 16 Aug 2021 19:51:06 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 6/7] mm: ignore MAP_DENYWRITE in ksys_mmap_pgoff() Date: Mon, 16 Aug 2021 21:48:39 +0200 Message-Id: <20210816194840.42769-7-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Let's also remove masking off MAP_DENYWRITE from ksys_mmap_pgoff(): the last in-tree occurrence of MAP_DENYWRITE is now in LEGACY_MAP_MASK, which accepts the flag e.g., for MAP_SHARED_VALIDATE; however, the flag is ignored throughout the kernel now. Add a comment to LEGACY_MAP_MASK stating that MAP_DENYWRITE is ignored. Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- include/linux/mman.h | 3 ++- mm/mmap.c | 2 -- mm/nommu.c | 2 -- 3 files changed, 2 insertions(+), 5 deletions(-) diff --git a/include/linux/mman.h b/include/linux/mman.h index bd9aadda047b..b66e91b8176c 100644 --- a/include/linux/mman.h +++ b/include/linux/mman.h @@ -32,7 +32,8 @@ * The historical set of flags that all mmap implementations implicitly * support when a ->mmap_validate() op is not provided in file_operations. * - * MAP_EXECUTABLE is completely ignored throughout the kernel. + * MAP_EXECUTABLE and MAP_DENYWRITE are completely ignored throughout the + * kernel. */ #define LEGACY_MAP_MASK (MAP_SHARED \ | MAP_PRIVATE \ diff --git a/mm/mmap.c b/mm/mmap.c index 589dc1dc13db..bf11fc6e8311 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1626,8 +1626,6 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len, return PTR_ERR(file); } - flags &= ~MAP_DENYWRITE; - retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff); out_fput: if (file) diff --git a/mm/nommu.c b/mm/nommu.c index 3a93d4054810..0987d131bdfc 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -1296,8 +1296,6 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len, goto out; } - flags &= ~MAP_DENYWRITE; - retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff); if (file) From patchwork Mon Aug 16 19:48:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12439167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 638ADC432BE for ; Mon, 16 Aug 2021 19:51:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4D2C961053 for ; Mon, 16 Aug 2021 19:51:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231730AbhHPTw0 (ORCPT ); Mon, 16 Aug 2021 15:52:26 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:26328 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231734AbhHPTwZ (ORCPT ); Mon, 16 Aug 2021 15:52:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629143513; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HXTa/VyECSSTBgPq2TU+rmvN898vFz2Xyhub0w4Arrg=; b=CY5Ha1Tc/PnKP+kYP1pnIq5FsNoNivZA3XN5Ac2iUj2+hq2gbq5Jr6ylMX9JO8+mAzvcqU FHjZTBJXwljTPoQmOtb36+kzW5WYEJppN8DNtMiJEd9QyzBeRSqweTAKBssgF/EqAyTZ8f Q7Yu+gTP0fq21uIuP6o5dxL1tFBVbYo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-101-CEYb4fPPN2aT7B97vW8omg-1; Mon, 16 Aug 2021 15:51:51 -0400 X-MC-Unique: CEYb4fPPN2aT7B97vW8omg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 17C951082921; Mon, 16 Aug 2021 19:51:46 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.192.85]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6FFAE5C1D5; Mon, 16 Aug 2021 19:51:26 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Linus Torvalds , Andrew Morton , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Alexander Viro , Alexey Dobriyan , Steven Rostedt , Peter Zijlstra , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Petr Mladek , Sergey Senozhatsky , Andy Shevchenko , Rasmus Villemoes , Kees Cook , "Eric W. Biederman" , Greg Ungerer , Geert Uytterhoeven , Mike Rapoport , Vlastimil Babka , Vincenzo Frascino , Chinwen Chang , Catalin Marinas , "Matthew Wilcox (Oracle)" , Huang Ying , Jann Horn , Feng Tang , Kevin Brodsky , Michael Ellerman , Shawn Anastasio , Steven Price , Nicholas Piggin , Christian Brauner , Jens Axboe , Gabriel Krisman Bertazi , Peter Xu , Suren Baghdasaryan , Shakeel Butt , Marco Elver , Daniel Jordan , Nicolas Viennot , Thomas Cedeno , Michal Hocko , Miklos Szeredi , Chengguang Xu , =?utf-8?q?Christian_K=C3=B6nig?= , Florian Weimer , David Laight , linux-unionfs@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v2 7/7] fs: update documentation of get_write_access() and friends Date: Mon, 16 Aug 2021 21:48:40 +0200 Message-Id: <20210816194840.42769-8-david@redhat.com> In-Reply-To: <20210816194840.42769-1-david@redhat.com> References: <20210816194840.42769-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org As VM_DENYWRITE does no longer exists, let's spring-clean the documentation of get_write_access() and friends. Acked-by: "Eric W. Biederman" Signed-off-by: David Hildenbrand --- include/linux/fs.h | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 640574294216..e0dc3e96ed72 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3055,15 +3055,20 @@ static inline void file_end_write(struct file *file) } /* + * This is used for regular files where some users -- especially the + * currently executed binary in a process, previously handled via + * VM_DENYWRITE -- cannot handle concurrent write (and maybe mmap + * read-write shared) accesses. + * * get_write_access() gets write permission for a file. * put_write_access() releases this write permission. - * This is used for regular files. - * We cannot support write (and maybe mmap read-write shared) accesses and - * MAP_DENYWRITE mmappings simultaneously. The i_writecount field of an inode - * can have the following values: - * 0: no writers, no VM_DENYWRITE mappings - * < 0: (-i_writecount) vm_area_structs with VM_DENYWRITE set exist - * > 0: (i_writecount) users are writing to the file. + * deny_write_access() denies write access to a file. + * allow_write_access() re-enables write access to a file. + * + * The i_writecount field of an inode can have the following values: + * 0: no write access, no denied write access + * < 0: (-i_writecount) users that denied write access to the file. + * > 0: (i_writecount) users that have write access to the file. * * Normally we operate on that counter with atomic_{inc,dec} and it's safe * except for the cases where we don't hold i_writecount yet. Then we need to