
[v3,bpf-next,2/8] bpf: Compute program stats for sleepable programs

Message ID 20210209194856.24269-3-alexei.starovoitov@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Series: bpf: Misc improvements

Checks

Context Check Description
netdev/cover_letter success
netdev/fixes_present success
netdev/patch_count success
netdev/tree_selection success Clearly marked for bpf-next
netdev/subject_prefix success
netdev/cc_maintainers warning 14 maintainers not CCed: yoshfuji@linux-ipv6.org hpa@zytor.com tglx@linutronix.de mingo@redhat.com bp@alien8.de yhs@fb.com x86@kernel.org kafai@fb.com netdev@vger.kernel.org ast@kernel.org songliubraving@fb.com john.fastabend@gmail.com kpsingh@kernel.org andrii@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 12175 this patch: 12175
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success
netdev/checkpatch warning CHECK: No space is necessary after a cast; WARNING: suspect code indent for conditional statements (8, 24)
netdev/build_allmodconfig_warn success Errors and warnings before: 12823 this patch: 12823
netdev/header_inline success
netdev/stable success Stable not CCed

Commit Message

Alexei Starovoitov Feb. 9, 2021, 7:48 p.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

In older non-RT kernels migrate_disable() was the same as preempt_disable().
Since commit 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT")
migrate_disable() is real and doesn't prevent sleeping.
Use it to efficiently compute execution stats for sleepable bpf programs.
migrate_disable() will also be used to enable per-cpu maps in sleepable programs
in future patches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
---
 arch/x86/net/bpf_jit_comp.c | 31 ++++++++++----------------
 include/linux/bpf.h         |  4 ++--
 kernel/bpf/trampoline.c     | 44 +++++++++++++++++++++++++------------
 3 files changed, 44 insertions(+), 35 deletions(-)
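
For readers skimming the diff, the following is an illustrative C equivalent of the call sequence the trampoline emits around a program after this patch. Only the __bpf_prog_enter*() / __bpf_prog_exit*() helpers are real (see the changes to kernel/bpf/trampoline.c below); the run_prog() wrapper itself is a made-up stand-in for the generated trampoline code.

/* Sketch only: what the JITed trampoline conceptually does after this
 * patch. run_prog() is hypothetical; the enter/exit helpers are real.
 */
static unsigned int run_prog(struct bpf_prog *p, void *ctx)
{
	unsigned int ret;
	u64 start;

	/* both variants now return the start time (0 when stats are off) */
	start = p->aux->sleepable ? __bpf_prog_enter_sleepable()
				  : __bpf_prog_enter();

	ret = p->bpf_func(ctx, p->insnsi);

	/* both variants now take (prog, start) and update the per-cpu stats */
	if (p->aux->sleepable)
		__bpf_prog_exit_sleepable(p, start);
	else
		__bpf_prog_exit(p, start);

	return ret;
}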

Comments

KP Singh Feb. 9, 2021, 10:47 p.m. UTC | #1
On Tue, Feb 9, 2021 at 10:01 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> In older non-RT kernels migrate_disable() was the same as preempt_disable().
> Since commit 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT")

nit: It would be nice to split out the bit that adds
migrate_disable/enable into a separate patch
just to make it more explicit.

> migrate_disable() is real and doesn't prevent sleeping.
> Use it to efficiently compute execution stats for sleepable bpf programs.
> migrate_disable() will also be used to enable per-cpu maps in sleepable programs
> in future patches.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>

Just the optional comment about splitting the migrate_enable / disable bit.

Acked-by: KP Singh <kpsingh@kernel.org>
Alexei Starovoitov Feb. 9, 2021, 11:11 p.m. UTC | #2
On 2/9/21 2:47 PM, KP Singh wrote:
> On Tue, Feb 9, 2021 at 10:01 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> From: Alexei Starovoitov <ast@kernel.org>
>>
>> In older non-RT kernels migrate_disable() was the same as preempt_disable().
>> Since commit 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT")
> 
> nit: It would be nice to split out the bit that adds
> migrate_disable/enable into a separate patch
> just to make it more explicit.

Not following. What is the point of splitting it?
Just adding it without using it for anything?
That's a bit weird.
How would it help anything?

>> migrate_disable() is real and doesn't prevent sleeping.
>> Use it to efficiently compute execution stats for sleepable bpf programs.
>> migrate_disable() will also be used to enable per-cpu maps in sleepable programs
>> in future patches.
>>
>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
> Just the optional comment about splitting the migrate_enable / disable bit.
> 
> Acked-by: KP Singh <kpsingh@kernel.org>
>
KP Singh Feb. 9, 2021, 11:17 p.m. UTC | #3
On Wed, Feb 10, 2021 at 12:11 AM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 2/9/21 2:47 PM, KP Singh wrote:
> > On Tue, Feb 9, 2021 at 10:01 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >>
> >> From: Alexei Starovoitov <ast@kernel.org>
> >>
> >> In older non-RT kernels migrate_disable() was the same as preempt_disable().
> >> Since commit 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT")
> >
> > nit: It would be nice to split out the bit that adds
> > migrate_disable/enable into a separate patch
> > just to make it more explicit.
>
> Not following. What is the point of splitting it?
> Just adding it without using it for anything?
> That's a bit weird.
> How would it help anything?

The reason I mentioned this is that you refer to it in the other patch:

https://lore.kernel.org/bpf/20210206170344.78399-1-alexei.starovoitov@gmail.com/T/#m24cdc785b71adc04ac665fe018956c4f25ca06ae

"Since sleepable programs are now executing under migrate_disable

the per-cpu maps are safe to use.
The map-in-map were ok to use in sleepable from the time sleepable
progs were introduced."

It's just a tiny bit easier to find the commit that added it. But not
a big deal if you think it's not useful to split.
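
For illustration, the kind of program the quoted follow-up enables is a sleepable one using a per-cpu map, e.g. the hypothetical sleepable LSM program below. The map and hook names are invented; only the pattern (an lsm.s program touching a BPF_MAP_TYPE_PERCPU_ARRAY, which is safe because the trampoline now runs it under migrate_disable()) comes from the series.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* hypothetical per-cpu counter map */
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
	__uint(max_entries, 1);
	__type(key, u32);
	__type(value, u64);
} open_cnt SEC(".maps");

SEC("lsm.s/file_open")		/* sleepable LSM hook */
int BPF_PROG(count_file_open, struct file *file)
{
	u32 key = 0;
	u64 *val;

	val = bpf_map_lookup_elem(&open_cnt, &key);
	if (val)
		/* this CPU's slot; stable because migration is disabled */
		(*val)++;
	return 0;
}

char LICENSE[] SEC("license") = "GPL";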

[...]

Patch

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index a3dc3bd154ac..d11b9bcebbea 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1742,15 +1742,12 @@  static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	u8 *prog = *pprog;
 	int cnt = 0;
 
-	if (p->aux->sleepable) {
-		if (emit_call(&prog, __bpf_prog_enter_sleepable, prog))
+	if (emit_call(&prog,
+		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
+		      __bpf_prog_enter, prog))
 			return -EINVAL;
-	} else {
-		if (emit_call(&prog, __bpf_prog_enter, prog))
-			return -EINVAL;
-		/* remember prog start time returned by __bpf_prog_enter */
-		emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
-	}
+	/* remember prog start time returned by __bpf_prog_enter */
+	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
 
 	/* arg1: lea rdi, [rbp - stack_size] */
 	EMIT4(0x48, 0x8D, 0x7D, -stack_size);
@@ -1770,18 +1767,14 @@  static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	if (mod_ret)
 		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
 
-	if (p->aux->sleepable) {
-		if (emit_call(&prog, __bpf_prog_exit_sleepable, prog))
+	/* arg1: mov rdi, progs[i] */
+	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
+	/* arg2: mov rsi, rbx <- start time in nsec */
+	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
+	if (emit_call(&prog,
+		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
+		      __bpf_prog_exit, prog))
 			return -EINVAL;
-	} else {
-		/* arg1: mov rdi, progs[i] */
-		emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32,
-			       (u32) (long) p);
-		/* arg2: mov rsi, rbx <- start time in nsec */
-		emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
-		if (emit_call(&prog, __bpf_prog_exit, prog))
-			return -EINVAL;
-	}
 
 	*pprog = prog;
 	return 0;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 026fa8873c5d..2fa48439ef31 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -563,8 +563,8 @@  int arch_prepare_bpf_trampoline(void *image, void *image_end,
 /* these two functions are called from generated trampoline */
 u64 notrace __bpf_prog_enter(void);
 void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);
-void notrace __bpf_prog_enter_sleepable(void);
-void notrace __bpf_prog_exit_sleepable(void);
+u64 notrace __bpf_prog_enter_sleepable(void);
+void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start);
 
 struct bpf_ksym {
 	unsigned long		 start;
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 5be3beeedd74..48eb021e1421 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -381,55 +381,71 @@  void bpf_trampoline_put(struct bpf_trampoline *tr)
 	mutex_unlock(&trampoline_mutex);
 }
 
+#define NO_START_TIME 0
+static u64 notrace bpf_prog_start_time(void)
+{
+	u64 start = NO_START_TIME;
+
+	if (static_branch_unlikely(&bpf_stats_enabled_key))
+		start = sched_clock();
+	return start;
+}
+
 /* The logic is similar to BPF_PROG_RUN, but with an explicit
  * rcu_read_lock() and migrate_disable() which are required
  * for the trampoline. The macro is split into
- * call _bpf_prog_enter
+ * call __bpf_prog_enter
  * call prog->bpf_func
  * call __bpf_prog_exit
  */
 u64 notrace __bpf_prog_enter(void)
 	__acquires(RCU)
 {
-	u64 start = 0;
-
 	rcu_read_lock();
 	migrate_disable();
-	if (static_branch_unlikely(&bpf_stats_enabled_key))
-		start = sched_clock();
-	return start;
+	return bpf_prog_start_time();
 }
 
-void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
-	__releases(RCU)
+static void notrace update_prog_stats(struct bpf_prog *prog,
+				      u64 start)
 {
 	struct bpf_prog_stats *stats;
 
 	if (static_branch_unlikely(&bpf_stats_enabled_key) &&
-	    /* static_key could be enabled in __bpf_prog_enter
-	     * and disabled in __bpf_prog_exit.
+	    /* static_key could be enabled in __bpf_prog_enter*
+	     * and disabled in __bpf_prog_exit*.
 	     * And vice versa.
-	     * Hence check that 'start' is not zero.
+	     * Hence check that 'start' is valid.
 	     */
-	    start) {
+	    start > NO_START_TIME) {
 		stats = this_cpu_ptr(prog->stats);
 		u64_stats_update_begin(&stats->syncp);
 		stats->cnt++;
 		stats->nsecs += sched_clock() - start;
 		u64_stats_update_end(&stats->syncp);
 	}
+}
+
+void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
+	__releases(RCU)
+{
+	update_prog_stats(prog, start);
 	migrate_enable();
 	rcu_read_unlock();
 }
 
-void notrace __bpf_prog_enter_sleepable(void)
+u64 notrace __bpf_prog_enter_sleepable(void)
 {
 	rcu_read_lock_trace();
+	migrate_disable();
 	might_fault();
+	return bpf_prog_start_time();
 }
 
-void notrace __bpf_prog_exit_sleepable(void)
+void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start)
 {
+	update_prog_stats(prog, start);
+	migrate_enable();
 	rcu_read_unlock_trace();
 }
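
For context, the stats accumulated in update_prog_stats() are what userspace reads back as run_cnt and run_time_ns in struct bpf_prog_info, and they are only collected while the bpf_stats_enabled_key static key (the kernel.bpf_stats_enabled sysctl) is on. A minimal userspace sketch, assuming libbpf and a prog_fd obtained elsewhere:

#include <stdio.h>
#include <bpf/bpf.h>

/* print the run count and cumulative runtime the trampoline collected */
static void print_prog_stats(int prog_fd)
{
	struct bpf_prog_info info = {};
	__u32 len = sizeof(info);

	if (bpf_obj_get_info_by_fd(prog_fd, &info, &len))
		return;
	printf("run_cnt=%llu run_time_ns=%llu\n",
	       (unsigned long long)info.run_cnt,
	       (unsigned long long)info.run_time_ns);
}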