Message ID | 20200702164140.4468-10-ebiederm@xmission.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Make the user mode driver code a better citizen | expand |
On Thu, Jul 02, 2020 at 11:41:34AM -0500, Eric W. Biederman wrote: > Now that the last callser has been removed remove this code from exec. > > For anyone thinking of resurrecing do_execve_file please note that > the code was buggy in several fundamental ways. > > - It did not ensure the file it was passed was read-only and that > deny_write_access had been called on it. Which subtlely breaks > invaniants in exec. > > - The caller of do_execve_file was expected to hold and put a > reference to the file, but an extra reference for use by exec was > not taken so that when exec put it's reference to the file an > underflow occured on the file reference count. Maybe its my growing love with testing, but I'm going to have to partly blame here that we added a new API without any respective testing. Granted, I recall this this patch set could have used more wider review and a bit more patience... but just mentioning this so we try to avoid new api-without-testing with more reason in the future. But more importantly, *how* could we have caught this? Or how can we catch this sort of stuff better in the future? > - The point of the interface was so that a pathname did not need to > exist. Which breaks pathname based LSMs. Perhaps so but this fails to do justice of the LSM consideration done for the patch which added this during patch review [0], and I particularly recall I called out LSM folks to bring their ray guns out at this patch. It didn't get much attention. Let me recap a few points I think your commit log should somehow consider. You do as you please. Users of shmem_kernel_file_setup() spawned out of the desire to *avoid* LSMs since it didn't make sense in their case as their inodes are never exposed to userspace. Such is the case for ipc/shm.c and security/keys/big_key.c. Refer to commit c7277090927a5 ("security: shmem: implement kernel private shmem inodes") and then commit e1832f2923ec9 ("ipc: use private shmem or hugetlbfs inodes for shm segments"). And the umh module approach was doing: a) mapping data already extracted by the kernel somehow from a file somehow, presumably from /lib/modules/ path somewhere, but again this is not visible to umc.c, as it just gets called with: fork_usermode_blob(void *data, size_t len, struct umh_info *info) b) Creating the respective tmpfs file with shmem_kernel_file_setup() c) Populating the file created and stuffing it with our data passed d) Calling do_execve_file() on it. So, although I was hoping LSM folks would chime in for things I may have missed during my patch review, my recollection from the patch thread was that this becuase of a) it in theory could skip out on dealing with LSMs. [0] https://lkml.kernel.org/r/20180509022526.hertzfpvy7apz6ny@ast-mbp Luis
On Wed, Jul 08, 2020 at 06:35:25AM +0000, Luis Chamberlain wrote: > On Thu, Jul 02, 2020 at 11:41:34AM -0500, Eric W. Biederman wrote: > > Now that the last callser has been removed remove this code from exec. > > > > For anyone thinking of resurrecing do_execve_file please note that > > the code was buggy in several fundamental ways. > > > > - It did not ensure the file it was passed was read-only and that > > deny_write_access had been called on it. Which subtlely breaks > > invaniants in exec. > > > > - The caller of do_execve_file was expected to hold and put a > > reference to the file, but an extra reference for use by exec was > > not taken so that when exec put it's reference to the file an > > underflow occured on the file reference count. > > Maybe its my growing love with testing, but I'm going to have to partly > blame here that we added a new API without any respective testing. > Granted, I recall this this patch set could have used more wider review > and a bit more patience... but just mentioning this so we try to avoid > new api-without-testing with more reason in the future. > > But more importantly, *how* could we have caught this? Or how can we > catch this sort of stuff better in the future? Of all the issues you pointed out with do_execve_file(), since upon review the assumption *by design* was that LSMs/etc would pick up issues with the file *prior* to processing, I think that this file reference count issue comes to my attention as the more serious issue which I wish we could address *first* before this crusade. So I have to ask, has anyone *really tried* to give a crack at fixing this refcount issue in a smaller way first? Alexei? I'm not opposed to the removal of do_execve_file(), however if there is a reproducible crash / issue with the existing user, this sledge hammer seems a bit overkill for older kernels. Luis
Luis Chamberlain <mcgrof@kernel.org> writes: > On Wed, Jul 08, 2020 at 06:35:25AM +0000, Luis Chamberlain wrote: >> On Thu, Jul 02, 2020 at 11:41:34AM -0500, Eric W. Biederman wrote: >> > Now that the last callser has been removed remove this code from exec. >> > >> > For anyone thinking of resurrecing do_execve_file please note that >> > the code was buggy in several fundamental ways. >> > >> > - It did not ensure the file it was passed was read-only and that >> > deny_write_access had been called on it. Which subtlely breaks >> > invaniants in exec. >> > >> > - The caller of do_execve_file was expected to hold and put a >> > reference to the file, but an extra reference for use by exec was >> > not taken so that when exec put it's reference to the file an >> > underflow occured on the file reference count. >> >> Maybe its my growing love with testing, but I'm going to have to partly >> blame here that we added a new API without any respective testing. >> Granted, I recall this this patch set could have used more wider review >> and a bit more patience... but just mentioning this so we try to avoid >> new api-without-testing with more reason in the future. >> >> But more importantly, *how* could we have caught this? Or how can we >> catch this sort of stuff better in the future? > > Of all the issues you pointed out with do_execve_file(), since upon > review the assumption *by design* was that LSMs/etc would pick up issues > with the file *prior* to processing, I think that this file reference > count issue comes to my attention as the more serious issue which I > wish we could address *first* before this crusade. > > So I have to ask, has anyone *really tried* to give a crack at fixing > this refcount issue in a smaller way first? Alexei? > > I'm not opposed to the removal of do_execve_file(), however if there > is a reproducible crash / issue with the existing user, this sledge > hammer seems a bit overkill for older kernels. It does not matter for older kernels because there is exactly one user. That one user is just a place holder keeping the code alive until a real user comes along. For older kernels the solution is to just mark the bpfilter code broken in Kconfig and refuse to compile it. That is the trivial backportable fix if anyone wants one. Eric
On Wed, Jul 08, 2020 at 08:08:09AM -0500, Eric W. Biederman wrote: > Luis Chamberlain <mcgrof@kernel.org> writes: > > > On Wed, Jul 08, 2020 at 06:35:25AM +0000, Luis Chamberlain wrote: > >> On Thu, Jul 02, 2020 at 11:41:34AM -0500, Eric W. Biederman wrote: > >> > Now that the last callser has been removed remove this code from exec. > >> > > >> > For anyone thinking of resurrecing do_execve_file please note that > >> > the code was buggy in several fundamental ways. > >> > > >> > - It did not ensure the file it was passed was read-only and that > >> > deny_write_access had been called on it. Which subtlely breaks > >> > invaniants in exec. > >> > > >> > - The caller of do_execve_file was expected to hold and put a > >> > reference to the file, but an extra reference for use by exec was > >> > not taken so that when exec put it's reference to the file an > >> > underflow occured on the file reference count. > >> > >> Maybe its my growing love with testing, but I'm going to have to partly > >> blame here that we added a new API without any respective testing. > >> Granted, I recall this this patch set could have used more wider review > >> and a bit more patience... but just mentioning this so we try to avoid > >> new api-without-testing with more reason in the future. > >> > >> But more importantly, *how* could we have caught this? Or how can we > >> catch this sort of stuff better in the future? > > > > Of all the issues you pointed out with do_execve_file(), since upon > > review the assumption *by design* was that LSMs/etc would pick up issues > > with the file *prior* to processing, I think that this file reference > > count issue comes to my attention as the more serious issue which I > > wish we could address *first* before this crusade. > > > > So I have to ask, has anyone *really tried* to give a crack at fixing > > this refcount issue in a smaller way first? Alexei? > > > > I'm not opposed to the removal of do_execve_file(), however if there > > is a reproducible crash / issue with the existing user, this sledge > > hammer seems a bit overkill for older kernels. > > It does not matter for older kernels because there is exactly one user. > That one user is just a place holder keeping the code alive until a real > user comes along. > > For older kernels the solution is to just mark the bpfilter code broken > in Kconfig and refuse to compile it. That is the trivial backportable > fix if anyone wants one. This seals the deal for me, thanks! Carry on, but hey, please add yourself to MAINTAINERS too :) Luis
On Thu 2020-07-02 11:41:34, Eric W. Biederman wrote: > Now that the last callser has been removed remove this code from exec. Typo "caller". > For anyone thinking of resurrecing do_execve_file please note that resurrecting? > the code was buggy in several fundamental ways. > > - It did not ensure the file it was passed was read-only and that > deny_write_access had been called on it. Which subtlely breaks > invaniants in exec. subtly, invariants? Pavel
diff --git a/fs/exec.c b/fs/exec.c index e6e8a9a70327..23dfbb820626 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1818,13 +1818,14 @@ static int exec_binprm(struct linux_binprm *bprm) /* * sys_execve() executes a new program. */ -static int __do_execve_file(int fd, struct filename *filename, - struct user_arg_ptr argv, - struct user_arg_ptr envp, - int flags, struct file *file) +static int do_execveat_common(int fd, struct filename *filename, + struct user_arg_ptr argv, + struct user_arg_ptr envp, + int flags) { char *pathbuf = NULL; struct linux_binprm *bprm; + struct file *file; struct files_struct *displaced; int retval; @@ -1863,8 +1864,7 @@ static int __do_execve_file(int fd, struct filename *filename, check_unsafe_exec(bprm); current->in_execve = 1; - if (!file) - file = do_open_execat(fd, filename, flags); + file = do_open_execat(fd, filename, flags); retval = PTR_ERR(file); if (IS_ERR(file)) goto out_unmark; @@ -1872,9 +1872,7 @@ static int __do_execve_file(int fd, struct filename *filename, sched_exec(); bprm->file = file; - if (!filename) { - bprm->filename = "none"; - } else if (fd == AT_FDCWD || filename->name[0] == '/') { + if (fd == AT_FDCWD || filename->name[0] == '/') { bprm->filename = filename->name; } else { if (filename->name[0] == '\0') @@ -1935,8 +1933,7 @@ static int __do_execve_file(int fd, struct filename *filename, task_numa_free(current, false); free_bprm(bprm); kfree(pathbuf); - if (filename) - putname(filename); + putname(filename); if (displaced) put_files_struct(displaced); return retval; @@ -1967,27 +1964,10 @@ static int __do_execve_file(int fd, struct filename *filename, if (displaced) reset_files_struct(displaced); out_ret: - if (filename) - putname(filename); + putname(filename); return retval; } -static int do_execveat_common(int fd, struct filename *filename, - struct user_arg_ptr argv, - struct user_arg_ptr envp, - int flags) -{ - return __do_execve_file(fd, filename, argv, envp, flags, NULL); -} - -int do_execve_file(struct file *file, void *__argv, void *__envp) -{ - struct user_arg_ptr argv = { .ptr.native = __argv }; - struct user_arg_ptr envp = { .ptr.native = __envp }; - - return __do_execve_file(AT_FDCWD, NULL, argv, envp, 0, file); -} - int do_execve(struct filename *filename, const char __user *const __user *__argv, const char __user *const __user *__envp) diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index 4a20b7517dd0..7c27d7b57871 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -141,6 +141,5 @@ extern int do_execveat(int, struct filename *, const char __user * const __user *, const char __user * const __user *, int); -int do_execve_file(struct file *file, void *__argv, void *__envp); #endif /* _LINUX_BINFMTS_H */