Message ID | 20230525141813.TFZLWM4M@linutronix.de (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | [v2] bpf: Remove in_atomic() from bpf_link_put(). | expand |
On Thu, May 25, 2023 at 7:18 AM Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > > bpf_free_inode() is invoked as a RCU callback. Usually RCU callbacks are > invoked within softirq context. By setting rcutree.use_softirq=0 boot > option the RCU callbacks will be invoked in a per-CPU kthread with > bottom halves disabled which implies a RCU read section. > > On PREEMPT_RT the context remains fully preemptible. The RCU read > section however does not allow schedule() invocation. The latter happens > in mutex_lock() performed by bpf_trampoline_unlink_prog() originated > from bpf_link_put(). > > It was pointed out that the bpf_link_put() invocation should not be > delayed if originated from close(). > > Remove the context checks and use the workqueue unconditionally. For the > close() callback add bpf_link_put_direct() which invokes free function > directly. > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > > v1…v2: > - Add bpf_link_put_direct() to be used from bpf_link_release() as > suggested. Looks good to me, but it's not sufficient. See kernel/bpf/inode.c, we need to do bpf_link_put_direct() from bpf_put_any(), which should be safe as well because it all is either triggered from bpf() syscall or by unlink()'ing BPF FS file. For file deletion we have the same requirement to have deterministic release of bpf_link. > > On 2023-05-17 09:26:19 [-0700], Andrii Nakryiko wrote: > > Is struct file_operations.release() callback guaranteed to be called > > from user context? If yes, then perhaps the most straightforward way > > to guarantee synchronous bpf_link cleanup on close(link_fd) is to have > > a bpf_link_put() variant that will be only called from user-context > > and will just do bpf_link_free(link) directly, without checking > > in_atomic(). > > Yes. __fput() has a might_sleep() and it invokes > file_operations::release. > > kernel/bpf/syscall.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 14f39c1e573ee..85159428e5fee 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -2785,20 +2785,23 @@ void bpf_link_put(struct bpf_link *link) > if (!atomic64_dec_and_test(&link->refcnt)) > return; > > - if (in_atomic()) { > - INIT_WORK(&link->work, bpf_link_put_deferred); > - schedule_work(&link->work); > - } else { > - bpf_link_free(link); > - } > + INIT_WORK(&link->work, bpf_link_put_deferred); > + schedule_work(&link->work); > } > EXPORT_SYMBOL(bpf_link_put); > > +static void bpf_link_put_direct(struct bpf_link *link) > +{ > + if (!atomic64_dec_and_test(&link->refcnt)) > + return; > + bpf_link_free(link); > +} > + > static int bpf_link_release(struct inode *inode, struct file *filp) > { > struct bpf_link *link = filp->private_data; > > - bpf_link_put(link); > + bpf_link_put_direct(link); > return 0; > } > > -- > 2.40.1 >
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 14f39c1e573ee..85159428e5fee 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2785,20 +2785,23 @@ void bpf_link_put(struct bpf_link *link) if (!atomic64_dec_and_test(&link->refcnt)) return; - if (in_atomic()) { - INIT_WORK(&link->work, bpf_link_put_deferred); - schedule_work(&link->work); - } else { - bpf_link_free(link); - } + INIT_WORK(&link->work, bpf_link_put_deferred); + schedule_work(&link->work); } EXPORT_SYMBOL(bpf_link_put); +static void bpf_link_put_direct(struct bpf_link *link) +{ + if (!atomic64_dec_and_test(&link->refcnt)) + return; + bpf_link_free(link); +} + static int bpf_link_release(struct inode *inode, struct file *filp) { struct bpf_link *link = filp->private_data; - bpf_link_put(link); + bpf_link_put_direct(link); return 0; }