diff mbox series

[RFC,2/3] exec: Simplify unshare_files

Message ID 87y3c1qr1g.fsf_-_@xmission.com (mailing list archive)
State New, archived
Headers show
Series exec: Moving unshare_files_struct | expand

Commit Message

Eric W. Biederman Sept. 16, 2018, 5:40 p.m. UTC
Now that exec calls unshare_files after the point of no return there
is no reason to return displaced.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/coredump.c           |  5 +----
 fs/exec.c               |  5 +----
 include/linux/fdtable.h |  2 +-
 kernel/fork.c           | 12 ++++++------
 4 files changed, 9 insertions(+), 15 deletions(-)

Comments

Oleg Nesterov Sept. 17, 2018, 4:23 p.m. UTC | #1
absolutely off-topic question,

On 09/16, Eric W. Biederman wrote:
>
> @@ -747,11 +746,9 @@ void do_coredump(const siginfo_t *siginfo)
>  	}
>
>  	/* get us an unshared descriptor table; almost always a no-op */
> -	retval = unshare_files(&displaced);
> +	retval = unshare_files();

I fail to understand why do_coredump() needs unshare_files(). Could someone
explain?

And "almost always a no-op" above is not true, this is never a no-op in mt case;
other (killed) threads sleep in exit_mm() which is called before exit_files().

Oleg.
Eric W. Biederman Sept. 17, 2018, 8:26 p.m. UTC | #2
Oleg Nesterov <oleg@redhat.com> writes:

> absolutely off-topic question,
>
> On 09/16, Eric W. Biederman wrote:
>>
>> @@ -747,11 +746,9 @@ void do_coredump(const siginfo_t *siginfo)
>>  	}
>>
>>  	/* get us an unshared descriptor table; almost always a no-op */
>> -	retval = unshare_files(&displaced);
>> +	retval = unshare_files();
>
> I fail to understand why do_coredump() needs unshare_files(). Could someone
> explain?
>
> And "almost always a no-op" above is not true, this is never a no-op in mt case;
> other (killed) threads sleep in exit_mm() which is called before
> exit_files().

So I looked at the history and I have half an explanation.
179e037fc1370288188cb1f90b81156d75a3cb2d do_coredump(): make sure that descriptor table isn't shared

As far as I can tell this was Al Viro making certain that there were not
any races that had to be dealt with when accessing the file table during
execve.

Which gets to the heart of what we have to do before this set of changes
that we have been looking at can be merged.  We need to go through exec
and do_coredump if we wish to remove this call of unshare_files
and verify that everything is thread-safe, and using thread-safe idioms.

There is at least one place in exec where it is documented that the
access to files is not thread-safe in the comment.  I don't think any of
that is fundamentally hard but that work needs to be done for the rest
of this cleanup to be usable.

Eric
diff mbox series

Patch

diff --git a/fs/coredump.c b/fs/coredump.c
index 1e2c87acac9b..968ee5744bf9 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -546,7 +546,6 @@  void do_coredump(const siginfo_t *siginfo)
 	struct cred *cred;
 	int retval = 0;
 	int ispipe;
-	struct files_struct *displaced;
 	/* require nonrelative corefile path and be extra careful */
 	bool need_suid_safe = false;
 	bool core_dumped = false;
@@ -747,11 +746,9 @@  void do_coredump(const siginfo_t *siginfo)
 	}
 
 	/* get us an unshared descriptor table; almost always a no-op */
-	retval = unshare_files(&displaced);
+	retval = unshare_files();
 	if (retval)
 		goto close_fail;
-	if (displaced)
-		put_files_struct(displaced);
 	if (!dump_interrupted()) {
 		file_start_write(cprm.file);
 		core_dumped = binfmt->core_dump(&cprm);
diff --git a/fs/exec.c b/fs/exec.c
index 6f6167ec08eb..7eeffa7d98c6 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1252,7 +1252,6 @@  void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec)
  */
 int flush_old_exec(struct linux_binprm * bprm)
 {
-	struct files_struct *displaced;
 	int retval;
 
 	/*
@@ -1292,11 +1291,9 @@  int flush_old_exec(struct linux_binprm * bprm)
 	flush_thread();
 	current->personality &= ~bprm->per_clear;
 
-	retval = unshare_files(&displaced);
+	retval = unshare_files();
 	if (retval)
 		goto out;
-	if (displaced)
-		put_files_struct(displaced);
 
 	/*
 	 * We have to apply CLOEXEC before we change whether the process is
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 41615f38bcff..e65379336c4c 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -108,7 +108,7 @@  struct task_struct;
 struct files_struct *get_files_struct(struct task_struct *);
 void put_files_struct(struct files_struct *fs);
 void reset_files_struct(struct files_struct *);
-int unshare_files(struct files_struct **);
+int unshare_files(void);
 struct files_struct *dup_fd(struct files_struct *, int *) __latent_entropy;
 void do_close_on_exec(struct files_struct *);
 int iterate_fd(struct files_struct *, unsigned,
diff --git a/kernel/fork.c b/kernel/fork.c
index d896e9ca38b0..a06a609075eb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2564,21 +2564,21 @@  SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
  *	the exec layer of the kernel.
  */
 
-int unshare_files(struct files_struct **displaced)
+int unshare_files(void)
 {
 	struct task_struct *task = current;
-	struct files_struct *copy = NULL;
+	struct files_struct *files, *copy = NULL;
 	int error;
 
 	error = unshare_fd(CLONE_FILES, &copy);
-	if (error || !copy) {
-		*displaced = NULL;
+	if (error || !copy)
 		return error;
-	}
-	*displaced = task->files;
+
+	files = task->files;
 	task_lock(task);
 	task->files = copy;
 	task_unlock(task);
+	put_files_struct(files);
 	return 0;
 }