Message ID | AM6PR03MB50802EA81C89D22791CCF09099EC2@AM6PR03MB5080.eurprd03.prod.outlook.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc | expand |
On Mon, Jan 27, 2025 at 3:48 PM Juntong Deng <juntong.deng@outlook.com> wrote: > > This patch adds the open-coded iterator style process file iterator > kfuncs bpf_iter_task_file_{new,next,destroy} that iterates over all > files opened by the specified process. > > bpf_iter_task_file_next returns a pointer to bpf_iter_task_file_item, > which currently contains *task, *file, fd. This is an extensible > structure that enables compatibility with different versions > through CO-RE. > > The reference to struct file acquired by the previous > bpf_iter_task_file_next() is released in the next > bpf_iter_task_file_next(), and the last reference is released in the > last bpf_iter_task_file_next() that returns NULL. > > In the bpf_iter_task_file_destroy(), if the iterator does not iterate to > the end, then the last struct file reference is released at this time. > > Signed-off-by: Juntong Deng <juntong.deng@outlook.com> All patches look fine, but we need an ack from Christian for patches 3 and 4 to proceed.
On Mon, Jan 27, 2025 at 11:46:50PM +0000, Juntong Deng wrote: > This patch adds the open-coded iterator style process file iterator > kfuncs bpf_iter_task_file_{new,next,destroy} that iterates over all > files opened by the specified process. > > bpf_iter_task_file_next returns a pointer to bpf_iter_task_file_item, > which currently contains *task, *file, fd. This is an extensible > structure that enables compatibility with different versions > through CO-RE. > > The reference to struct file acquired by the previous > bpf_iter_task_file_next() is released in the next > bpf_iter_task_file_next(), and the last reference is released in the > last bpf_iter_task_file_next() that returns NULL. > > In the bpf_iter_task_file_destroy(), if the iterator does not iterate to > the end, then the last struct file reference is released at this time. > > Signed-off-by: Juntong Deng <juntong.deng@outlook.com> > --- I deeply dislike that this allows bpf programs to iterate through another tasks files more than what is already possible with the task_file_seq_* bpf api. This here means that bpf programs have access to all file types that exist in the kernel. From general simple filesystem files to pidfds, kvm fd, epoll fd, drm fds - anything you can think of. And then do arbitrary and ever expanding stuff with those files from the iterator with less restrictions (if I'm reading this right) than the task_file_seq_* iterator. Possibly even keeping that reference for a long time leading to weird EBUSY issues for filesystem shutdown and similar problems. This is a bad idea. Even in the kernel we only allow this type of iteration for procfs and procfs-like usage and there we hold references to files from another task for a very short time when we e.g., access /proc/<PID>/fd/. And you already have an iterator for that with task_file_seq_get_*() even if it is more work. I'm also not at all swayed by the fact that this is coming out of an effort to move CRIU into bpf just to make things easier. Not a selling point as we do have CRIU and I don't think we need to now put more CRIU related stuff into the kernel. So this will not get an ACK from me. I'm putting Al and Linus here as well as they might have opinions on this and might disagree with me. > kernel/bpf/helpers.c | 3 ++ > kernel/bpf/task_iter.c | 90 ++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 93 insertions(+) > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index f27ce162427a..359c5bbf4814 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -3157,6 +3157,9 @@ BTF_ID_FLAGS(func, bpf_iter_css_destroy, KF_ITER_DESTROY) > BTF_ID_FLAGS(func, bpf_iter_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS | KF_RCU_PROTECTED) > BTF_ID_FLAGS(func, bpf_iter_task_next, KF_ITER_NEXT | KF_RET_NULL) > BTF_ID_FLAGS(func, bpf_iter_task_destroy, KF_ITER_DESTROY) > +BTF_ID_FLAGS(func, bpf_iter_task_file_new, KF_ITER_NEW | KF_TRUSTED_ARGS) > +BTF_ID_FLAGS(func, bpf_iter_task_file_next, KF_ITER_NEXT | KF_RET_NULL) > +BTF_ID_FLAGS(func, bpf_iter_task_file_destroy, KF_ITER_DESTROY) > BTF_ID_FLAGS(func, bpf_dynptr_adjust) > BTF_ID_FLAGS(func, bpf_dynptr_is_null) > BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) > diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c > index 98d9b4c0daff..24a5af67e6c8 100644 > --- a/kernel/bpf/task_iter.c > +++ b/kernel/bpf/task_iter.c > @@ -1027,6 +1027,96 @@ __bpf_kfunc void bpf_iter_task_destroy(struct bpf_iter_task *it) > { > } > > +struct bpf_iter_task_file_item { > + struct task_struct *task; > + struct file *file; > + unsigned int fd; > +}; > + > +struct bpf_iter_task_file { > + __u64 __opaque[4]; > +} __aligned(8); > + > +struct bpf_iter_task_file_kern { > + struct bpf_iter_task_file_item item; > + unsigned int next_fd; > +} __aligned(8); > + > +/** > + * bpf_iter_task_file_new() - Initialize a new task file iterator for a task, > + * used to iterate over all files opened by a specified task > + * > + * @it: the new bpf_iter_task_file to be created > + * @task: a pointer pointing to the task to be iterated over > + */ > +__bpf_kfunc int bpf_iter_task_file_new(struct bpf_iter_task_file *it, struct task_struct *task) > +{ > + struct bpf_iter_task_file_kern *kit = (void *)it; > + struct bpf_iter_task_file_item *item = &kit->item; > + > + BUILD_BUG_ON(sizeof(struct bpf_iter_task_file_kern) > sizeof(struct bpf_iter_task_file)); > + BUILD_BUG_ON(__alignof__(struct bpf_iter_task_file_kern) != > + __alignof__(struct bpf_iter_task_file)); > + > + item->task = get_task_struct(task); > + item->file = NULL; > + item->fd = 0; > + kit->next_fd = 0; > + > + return 0; > +} > + > +/** > + * bpf_iter_task_file_next() - Get the next file in bpf_iter_task_file > + * > + * bpf_iter_task_file_next acquires a reference to the struct file. > + * > + * The reference to struct file acquired by the previous > + * bpf_iter_task_file_next() is released in the next bpf_iter_task_file_next(), > + * and the last reference is released in the last bpf_iter_task_file_next() > + * that returns NULL. > + * > + * @it: the bpf_iter_task_file to be checked > + * > + * @returns a pointer to bpf_iter_task_file_item > + */ > +__bpf_kfunc struct bpf_iter_task_file_item *bpf_iter_task_file_next(struct bpf_iter_task_file *it) > +{ > + struct bpf_iter_task_file_kern *kit = (void *)it; > + struct bpf_iter_task_file_item *item = &kit->item; > + > + if (item->file) > + fput(item->file); > + > + item->file = fget_task_next(item->task, &kit->next_fd); > + if (!item->file) > + return NULL; > + > + item->fd = kit->next_fd; > + kit->next_fd++; > + > + return item; > +} > + > +/** > + * bpf_iter_task_file_destroy() - Destroy a bpf_iter_task_file > + * > + * If the iterator does not iterate to the end, then the last > + * struct file reference is released at this time. > + * > + * @it: the bpf_iter_task_file to be destroyed > + */ > +__bpf_kfunc void bpf_iter_task_file_destroy(struct bpf_iter_task_file *it) > +{ > + struct bpf_iter_task_file_kern *kit = (void *)it; > + struct bpf_iter_task_file_item *item = &kit->item; > + > + if (item->file) > + fput(item->file); > + > + put_task_struct(item->task); > +} > + > __bpf_kfunc_end_defs(); > > DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); > -- > 2.39.5 >
On Thu, 30 Jan 2025 at 08:04, Christian Brauner <brauner@kernel.org> wrote: > > I deeply dislike that this allows bpf programs to iterate through > another tasks files more than what is already possible with the > task_file_seq_* bpf api. Ack. This needs to just die. There is no excuse for this, and no, CRIU is absolutely *not* that excuse. In fact, CRIU is a huge red flag, and has caused endless issues before. We should absolutely not add special BPF interfaces for CRIU. It only leads to pain and misery. Linus
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index f27ce162427a..359c5bbf4814 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -3157,6 +3157,9 @@ BTF_ID_FLAGS(func, bpf_iter_css_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_iter_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS | KF_RCU_PROTECTED) BTF_ID_FLAGS(func, bpf_iter_task_next, KF_ITER_NEXT | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_task_destroy, KF_ITER_DESTROY) +BTF_ID_FLAGS(func, bpf_iter_task_file_new, KF_ITER_NEW | KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_iter_task_file_next, KF_ITER_NEXT | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_iter_task_file_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_dynptr_adjust) BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index 98d9b4c0daff..24a5af67e6c8 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -1027,6 +1027,96 @@ __bpf_kfunc void bpf_iter_task_destroy(struct bpf_iter_task *it) { } +struct bpf_iter_task_file_item { + struct task_struct *task; + struct file *file; + unsigned int fd; +}; + +struct bpf_iter_task_file { + __u64 __opaque[4]; +} __aligned(8); + +struct bpf_iter_task_file_kern { + struct bpf_iter_task_file_item item; + unsigned int next_fd; +} __aligned(8); + +/** + * bpf_iter_task_file_new() - Initialize a new task file iterator for a task, + * used to iterate over all files opened by a specified task + * + * @it: the new bpf_iter_task_file to be created + * @task: a pointer pointing to the task to be iterated over + */ +__bpf_kfunc int bpf_iter_task_file_new(struct bpf_iter_task_file *it, struct task_struct *task) +{ + struct bpf_iter_task_file_kern *kit = (void *)it; + struct bpf_iter_task_file_item *item = &kit->item; + + BUILD_BUG_ON(sizeof(struct bpf_iter_task_file_kern) > sizeof(struct bpf_iter_task_file)); + BUILD_BUG_ON(__alignof__(struct bpf_iter_task_file_kern) != + __alignof__(struct bpf_iter_task_file)); + + item->task = get_task_struct(task); + item->file = NULL; + item->fd = 0; + kit->next_fd = 0; + + return 0; +} + +/** + * bpf_iter_task_file_next() - Get the next file in bpf_iter_task_file + * + * bpf_iter_task_file_next acquires a reference to the struct file. + * + * The reference to struct file acquired by the previous + * bpf_iter_task_file_next() is released in the next bpf_iter_task_file_next(), + * and the last reference is released in the last bpf_iter_task_file_next() + * that returns NULL. + * + * @it: the bpf_iter_task_file to be checked + * + * @returns a pointer to bpf_iter_task_file_item + */ +__bpf_kfunc struct bpf_iter_task_file_item *bpf_iter_task_file_next(struct bpf_iter_task_file *it) +{ + struct bpf_iter_task_file_kern *kit = (void *)it; + struct bpf_iter_task_file_item *item = &kit->item; + + if (item->file) + fput(item->file); + + item->file = fget_task_next(item->task, &kit->next_fd); + if (!item->file) + return NULL; + + item->fd = kit->next_fd; + kit->next_fd++; + + return item; +} + +/** + * bpf_iter_task_file_destroy() - Destroy a bpf_iter_task_file + * + * If the iterator does not iterate to the end, then the last + * struct file reference is released at this time. + * + * @it: the bpf_iter_task_file to be destroyed + */ +__bpf_kfunc void bpf_iter_task_file_destroy(struct bpf_iter_task_file *it) +{ + struct bpf_iter_task_file_kern *kit = (void *)it; + struct bpf_iter_task_file_item *item = &kit->item; + + if (item->file) + fput(item->file); + + put_task_struct(item->task); +} + __bpf_kfunc_end_defs(); DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work);
This patch adds the open-coded iterator style process file iterator kfuncs bpf_iter_task_file_{new,next,destroy} that iterates over all files opened by the specified process. bpf_iter_task_file_next returns a pointer to bpf_iter_task_file_item, which currently contains *task, *file, fd. This is an extensible structure that enables compatibility with different versions through CO-RE. The reference to struct file acquired by the previous bpf_iter_task_file_next() is released in the next bpf_iter_task_file_next(), and the last reference is released in the last bpf_iter_task_file_next() that returns NULL. In the bpf_iter_task_file_destroy(), if the iterator does not iterate to the end, then the last struct file reference is released at this time. Signed-off-by: Juntong Deng <juntong.deng@outlook.com> --- kernel/bpf/helpers.c | 3 ++ kernel/bpf/task_iter.c | 90 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 93 insertions(+)