Message ID | 20200817220425.9389-10-ebiederm@xmission.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/17] exec: Move unshare_files to fix posix file locking during exec | expand |
On Mon, Aug 17, 2020 at 05:04:18PM -0500, Eric W. Biederman wrote: > When discussing[1] exec and posix file locks it was realized that none > of the callers of get_files_struct fundamentally needed to call > get_files_struct, and that by switching them to helper functions > instead it will both simplify their code and remove unnecessary > increments of files_struct.count. Those unnecessary increments can > result in exec unnecessarily unsharing files_struct which breaking > posix locks, and it can result in fget_light having to fallback to > fget reducing system performance. > > Using fnext_task simplifies proc_readfd_common, by moving the checking > for the maximum file descritor into the generic code, and by > remvoing the need for capturing and releasing a reference on > files_struct. Yecchhh... So you keep locking/unlocking the damn thing for each descriptor?
On Mon, Aug 17, 2020 at 8:46 PM Eric W. Biederman <ebiederm@xmission.com> wrote: > > I am definitely willing to look at it. Do we think there would be enough > traffic on task_lock from /proc/<pid>/fd access to make it work doing? not from /proc, but bpf iterator in kernel/bpf/task_iter.c that is being modified in the other patch is used to process 100+k tasks. The faster it can go through them the better. We don't see task spin_lock being a bottleneck though. To test it just do 'bpftool prog show'. It will use task iterator undercover.
diff --git a/fs/proc/fd.c b/fs/proc/fd.c index abfdcb21cc79..d9fee5390fd7 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -218,7 +218,6 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, instantiate_t instantiate) { struct task_struct *p = get_proc_task(file_inode(file)); - struct files_struct *files; unsigned int fd; if (!p) @@ -226,22 +225,18 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, if (!dir_emit_dots(file, ctx)) goto out; - files = get_files_struct(p); - if (!files) - goto out; rcu_read_lock(); - for (fd = ctx->pos - 2; - fd < files_fdtable(files)->max_fds; - fd++, ctx->pos++) { + for (fd = ctx->pos - 2;; fd++, ctx->pos++) { struct file *f; struct fd_data data; char name[10 + 1]; unsigned int len; - f = fcheck_files(files, fd); + f = fnext_task(p, &fd); + ctx->pos = fd; if (!f) - continue; + break; data.mode = f->f_mode; rcu_read_unlock(); data.fd = fd; @@ -250,13 +245,11 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, if (!proc_fill_cache(file, ctx, name, len, instantiate, p, &data)) - goto out_fd_loop; + goto out; cond_resched(); rcu_read_lock(); } rcu_read_unlock(); -out_fd_loop: - put_files_struct(files); out: put_task_struct(p); return 0;
When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Using fnext_task simplifies proc_readfd_common, by moving the checking for the maximum file descritor into the generic code, and by remvoing the need for capturing and releasing a reference on files_struct. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> --- fs/proc/fd.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-)