Message ID | 20250320104922.1925198-1-mjguzik@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] fs: sort out stale commentary about races between fd alloc and dup2() | expand |
On Thu, Mar 20, 2025 at 11:49:22AM +0100, Mateusz Guzik wrote: > Userspace may be trying to dup2() over a fd which is allocated but not > yet populated. > > Commentary about it is split in 2 parts (and both warrant changes): > > 1. in dup2() > > It claims the issue is only relevant for shared descriptor tables which > is of no concern for POSIX (but then is POSIX of concern to anyone > today?), which I presume predates standarized threading. > > The comment also mentions the following systems: > - OpenBSD installing a larval file -- they moved away from it, file is > installed late and EBUSY is returned on conflict > - FreeBSD returning EBADF -- reworked to install the file early like > OpenBSD used to do > - NetBSD "deadlocks in amusing ways" -- their solution looks > Solaris-inspired (not a compliment) and I would not be particularly > surprised if it indeed deadlocked, in amusing ways or otherwise > > I don't believe mentioning any of these adds anything and the statement > about the issue not being POSIX-relevant is outdated. > > dup2 description in POSIX still does not mention the problem. > > 2. above fd_install() > > <quote> > > We need to detect this and fput() the struct file we are about to > > overwrite in this case. > > > > It should never happen - if we allow dup2() do it, _really_ bad things > > will follow. > </quote> > > I have difficulty parsing it. The first sentence would suggest > fd_install() tries to detect and recover from the race (it does not), > the next one claims the race needs to be dealt with (it is, by dup2()). > > Given that fd_install() does not suffer the burden, this patch removes > the above and instead expands on the race in dup2() commentary. > > Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> > --- > > This contains the new commentary from: > https://lore.kernel.org/linux-fsdevel/20250320102637.1924183-1-mjguzik@gmail.com/T/#u > > and obsoletes this guy hanging out in -next: > ommit ec052fae814d467d6aa7e591b4b24531b87e65ec This is already upstream as of v6.14-rc1. So please make it a diff on top. ;)
On Thu, Mar 20, 2025 at 2:58 PM Christian Brauner <brauner@kernel.org> wrote: > > On Thu, Mar 20, 2025 at 11:49:22AM +0100, Mateusz Guzik wrote: > > Userspace may be trying to dup2() over a fd which is allocated but not > > yet populated. > > > > Commentary about it is split in 2 parts (and both warrant changes): > > > > 1. in dup2() > > > > It claims the issue is only relevant for shared descriptor tables which > > is of no concern for POSIX (but then is POSIX of concern to anyone > > today?), which I presume predates standarized threading. > > > > The comment also mentions the following systems: > > - OpenBSD installing a larval file -- they moved away from it, file is > > installed late and EBUSY is returned on conflict > > - FreeBSD returning EBADF -- reworked to install the file early like > > OpenBSD used to do > > - NetBSD "deadlocks in amusing ways" -- their solution looks > > Solaris-inspired (not a compliment) and I would not be particularly > > surprised if it indeed deadlocked, in amusing ways or otherwise > > > > I don't believe mentioning any of these adds anything and the statement > > about the issue not being POSIX-relevant is outdated. > > > > dup2 description in POSIX still does not mention the problem. > > > > 2. above fd_install() > > > > <quote> > > > We need to detect this and fput() the struct file we are about to > > > overwrite in this case. > > > > > > It should never happen - if we allow dup2() do it, _really_ bad things > > > will follow. > > </quote> > > > > I have difficulty parsing it. The first sentence would suggest > > fd_install() tries to detect and recover from the race (it does not), > > the next one claims the race needs to be dealt with (it is, by dup2()). > > > > Given that fd_install() does not suffer the burden, this patch removes > > the above and instead expands on the race in dup2() commentary. > > > > Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> > > --- > > > > This contains the new commentary from: > > https://lore.kernel.org/linux-fsdevel/20250320102637.1924183-1-mjguzik@gmail.com/T/#u > > > > and obsoletes this guy hanging out in -next: > > ommit ec052fae814d467d6aa7e591b4b24531b87e65ec > > This is already upstream as of v6.14-rc1. So please make it a diff on > top. ;) oops. Well in that case the previously sent variant applies: https://lore.kernel.org/linux-fsdevel/20250320102637.1924183-1-mjguzik@gmail.com/T/#u Although I see the commit message would use a small tweak: > Given that fd_install() does not suffer the burden, this patch removes > the above and instead expands on the race in dup2() commentary instead. s/ instead././
diff --git a/fs/file.c b/fs/file.c index d0ecc3e59f2f..dc3f7e120e3e 100644 --- a/fs/file.c +++ b/fs/file.c @@ -626,22 +626,14 @@ void put_unused_fd(unsigned int fd) EXPORT_SYMBOL(put_unused_fd); -/* - * Install a file pointer in the fd array. - * - * The VFS is full of places where we drop the files lock between - * setting the open_fds bitmap and installing the file in the file - * array. At any such point, we are vulnerable to a dup2() race - * installing a file in the array before us. We need to detect this and - * fput() the struct file we are about to overwrite in this case. - * - * It should never happen - if we allow dup2() do it, _really_ bad things - * will follow. +/** + * fd_install - install a file pointer in the fd array + * @fd: file descriptor to install the file in + * @file: the file to install * * This consumes the "file" refcount, so callers should treat it * as if they had called fput(file). */ - void fd_install(unsigned int fd, struct file *file) { struct files_struct *files = current->files; @@ -1259,18 +1251,30 @@ __releases(&files->file_lock) struct fdtable *fdt; /* - * We need to detect attempts to do dup2() over allocated but still - * not finished descriptor. NB: OpenBSD avoids that at the price of - * extra work in their equivalent of fget() - they insert struct - * file immediately after grabbing descriptor, mark it larval if - * more work (e.g. actual opening) is needed and make sure that - * fget() treats larval files as absent. Potentially interesting, - * but while extra work in fget() is trivial, locking implications - * and amount of surgery on open()-related paths in VFS are not. - * FreeBSD fails with -EBADF in the same situation, NetBSD "solution" - * deadlocks in rather amusing ways, AFAICS. All of that is out of - * scope of POSIX or SUS, since neither considers shared descriptor - * tables and this condition does not arise without those. + * dup2() is expected to close the file installed in the target fd slot + * (if any). However, userspace hand-picking a fd may be racing against + * its own threads which happened to allocate it in open() et al but did + * not populate it yet. + * + * Broadly speaking we may be racing against the following: + * fd = get_unused_fd_flags(); // fd slot reserved, ->fd[fd] == NULL + * file = hard_work_goes_here(); + * fd_install(fd, file); // only now ->fd[fd] == file + * + * It is an invariant that a successfully allocated fd has a NULL entry + * in the array until the matching fd_install(). + * + * If we fit the window, we have the fd to populate, yet no target file + * to close. Trying to ignore it and install our new file would violate + * the invariant and make fd_install() overwrite our file. + * + * Things can be done(tm) to handle this. However, the issue does not + * concern legitimate programs and we only need to make sure the kernel + * does not trip over it. + * + * The simplest way out is to return an error if we find ourselves here. + * + * POSIX is silent on the issue, we return -EBUSY. */ fdt = files_fdtable(files); fd = array_index_nospec(fd, fdt->max_fds);