
[bpf-next,v7,1/4] bpf: add bpf_get_cpu_cycles kfunc

Message ID 20241118185245.1065000-2-vadfed@meta.com (mailing list archive)
State Superseded
Delegated to: BPF
Series bpf: add cpu cycles kfuncs

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 202 this patch: 202
netdev/build_tools success Errors and warnings before: 0 (+0) this patch: 0 (+0)
netdev/cc_maintainers warning 12 maintainers not CCed: kpsingh@kernel.org dave.hansen@linux.intel.com hpa@zytor.com udknight@gmail.com jolsa@kernel.org song@kernel.org dsahern@kernel.org haoluo@google.com john.fastabend@gmail.com mingo@redhat.com netdev@vger.kernel.org sdf@fomichev.me
netdev/build_clang success Errors and warnings before: 252 this patch: 252
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 6969 this patch: 6969
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 90 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 17 this patch: 17
netdev/source_inline fail Was 0 now: 1
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18

Commit Message

Vadim Fedorenko Nov. 18, 2024, 6:52 p.m. UTC
New kfunc to return the ARCH-specific timecounter. On x86 the BPF JIT
converts it into an rdtsc_ordered() call. Other architectures will get
JIT implementations too, where supported. The fallback is
__arch_get_hw_counter().
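
A minimal BPF-side usage sketch (illustrative only, not part of this
patch; the __ksym extern declaration, the attach point and the
bpf_printk() reporting are assumptions, selftest-style):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    /* kfunc added by this patch, resolved as a kernel symbol */
    extern u64 bpf_get_cpu_cycles(void) __ksym;

    SEC("fentry/do_nanosleep")
    int BPF_PROG(measure_cycles)
    {
            u64 start, delta;

            start = bpf_get_cpu_cycles();
            /* ... code being measured would run here ... */
            delta = bpf_get_cpu_cycles() - start;

            bpf_printk("raw counter delta: %llu", delta);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";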

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
---
v6 -> v7:
* change boot_cpu_has() -> cpu_feature_enabled() (Borislav)
* change __arch_get_hw_counter() back to use constant clock mode
  to avoid linking issues with CONFIG_HYPERV_TIMER or
  CONFIG_PARAVIRT_CLOCK enabled on x86.
v5 -> v6:
* add comment about dropping S64_MAX manipulation in the jitted
  implementation of rdtsc_ordered (Alexei)
* add comment about using the 'lfence;rdtsc' variant (Alexei)
* change the check in fixup_kfunc_call() (Eduard)
* make __arch_get_hw_counter() call more aligned with vDSO
  implementation (Yonghong)
v4 -> v5:
* use if instead of ifdef with IS_ENABLED
v3 -> v4:
* change name of the helper to bpf_get_cpu_cycles (Andrii)
* Hide the helper behind CONFIG_GENERIC_GETTIMEOFDAY to avoid exposing
  it on architectures which do not have vDSO functions and data
* reduce the scope of the check for inlined functions in the verifier to
  only the 2 kfuncs which are actually inlined.
v2 -> v3:
* change name of the helper to bpf_get_cpu_cycles_counter to explicitly
  mention what counter it provides (Andrii)
* move kfunc definition to bpf.h to use it in JIT.
* introduce another kfunc to convert cycles into nanoseconds as a more
  meaningful time unit for the generic tracing use case (Andrii)
v1 -> v2:
* Fix incorrect function return value type to u64
* Introduce bpf_jit_inlines_kfunc_call() and use it in
  mark_fastcall_pattern_for_call() to avoid clobbering in case of
  running programs with no JIT (Eduard)
* Avoid rewriting instruction and check function pointer directly
  in JIT (Alexei)
* Change includes to fix compile issues on non x86 architectures
---
 arch/x86/net/bpf_jit_comp.c   | 39 +++++++++++++++++++++++++++++++++
 arch/x86/net/bpf_jit_comp32.c | 14 ++++++++++++
 include/linux/bpf.h           |  5 +++++
 include/linux/filter.h        |  1 +
 kernel/bpf/core.c             | 11 ++++++++++
 kernel/bpf/helpers.c          | 27 +++++++++++++++++++++++
 kernel/bpf/verifier.c         | 41 ++++++++++++++++++++++++++++++-----
 7 files changed, 132 insertions(+), 6 deletions(-)

Comments

Peter Zijlstra Nov. 19, 2024, 11:18 a.m. UTC | #1
On Mon, Nov 18, 2024 at 10:52:42AM -0800, Vadim Fedorenko wrote:
> @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>  			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
>  				int err;
>  
> +				if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
> +					if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
> +						EMIT3(0x0F, 0xAE, 0xE8);
> +					EMIT2(0x0F, 0x31);
> +					break;
> +				}

TSC != cycles. Naming is bad.
Vadim Fedorenko Nov. 19, 2024, 2:29 p.m. UTC | #2
On 19/11/2024 03:18, Peter Zijlstra wrote:
> On Mon, Nov 18, 2024 at 10:52:42AM -0800, Vadim Fedorenko wrote:
>> @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>>   			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
>>   				int err;
>>   
>> +				if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
>> +					if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
>> +						EMIT3(0x0F, 0xAE, 0xE8);
>> +					EMIT2(0x0F, 0x31);
>> +					break;
>> +				}
> 
> TSC != cycles. Naming is bad.

Any suggestions?

JIT for other architectures will come after this one is merged and some
of them will be using cycles, so it's not too far away from the truth.
Peter Zijlstra Nov. 19, 2024, 4:17 p.m. UTC | #3
On Tue, Nov 19, 2024 at 06:29:09AM -0800, Vadim Fedorenko wrote:
> On 19/11/2024 03:18, Peter Zijlstra wrote:
> > On Mon, Nov 18, 2024 at 10:52:42AM -0800, Vadim Fedorenko wrote:
> > > @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
> > >   			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> > >   				int err;
> > > +				if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
> > > +					if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
> > > +						EMIT3(0x0F, 0xAE, 0xE8);
> > > +					EMIT2(0x0F, 0x31);
> > > +					break;
> > > +				}
> > 
> > TSC != cycles. Naming is bad.
> 
> Any suggestions?
> 
> JIT for other architectures will come after this one is merged and some
> of them will be using cycles, so not too far away form the truth..

bpf_get_time_stamp() ?
bpf_get_counter() ?
Vadim Fedorenko Nov. 19, 2024, 6:03 p.m. UTC | #4
On 19/11/2024 08:17, Peter Zijlstra wrote:
> On Tue, Nov 19, 2024 at 06:29:09AM -0800, Vadim Fedorenko wrote:
>> On 19/11/2024 03:18, Peter Zijlstra wrote:
>>> On Mon, Nov 18, 2024 at 10:52:42AM -0800, Vadim Fedorenko wrote:
>>>> @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>>>>    			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
>>>>    				int err;
>>>> +				if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
>>>> +					if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
>>>> +						EMIT3(0x0F, 0xAE, 0xE8);
>>>> +					EMIT2(0x0F, 0x31);
>>>> +					break;
>>>> +				}
>>>
>>> TSC != cycles. Naming is bad.
>>
>> Any suggestions?
>>
>> JIT for other architectures will come after this one is merged and some
>> of them will be using cycles, so not too far away form the truth..
> 
> bpf_get_time_stamp() ?
> bpf_get_counter() ?

Well, we have already been around these names before [1].

[1] 
https://lore.kernel.org/bpf/CAEf4BzaBNNCYaf9a4oHsB2AzYyc6JCWXpHx6jk22Btv=UAgX4A@mail.gmail.com/

bpf_get_time_stamp() doesn't really explain that the actual timestamp
will be provided by CPU hardware.
bpf_get_counter() is again too general; it doesn't provide any
information about what type of counter will be returned. The more
specific name, bpf_get_cycles_counter(), was also discussed in v3
(accidentally, it didn't reach the mailing list). The feedback from
Andrii was:

   Bikeshedding time, but let's be consistently slightly verbose, but
   readable. Given we have bpf_get_cpu_cycles_counter (which maybe we
   should shorten to "bpf_get_cpu_cycles()"), we should call this
   something like "bpf_cpu_cycles_to_ns()".

It might make a bit more sense to name it bpf_get_cpu_counter(), but it
still looks too general.

Honestly, I'm not a fan of renaming the functions once again; I would
let Andrii vote on the naming.
Andrii Nakryiko Nov. 19, 2024, 7:16 p.m. UTC | #5
On Tue, Nov 19, 2024 at 10:03 AM Vadim Fedorenko
<vadim.fedorenko@linux.dev> wrote:
>
> On 19/11/2024 08:17, Peter Zijlstra wrote:
> > On Tue, Nov 19, 2024 at 06:29:09AM -0800, Vadim Fedorenko wrote:
> >> On 19/11/2024 03:18, Peter Zijlstra wrote:
> >>> On Mon, Nov 18, 2024 at 10:52:42AM -0800, Vadim Fedorenko wrote:
> >>>> @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
> >>>>                            if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> >>>>                                    int err;
> >>>> +                          if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
> >>>> +                                  if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
> >>>> +                                          EMIT3(0x0F, 0xAE, 0xE8);
> >>>> +                                  EMIT2(0x0F, 0x31);
> >>>> +                                  break;
> >>>> +                          }
> >>>
> >>> TSC != cycles. Naming is bad.
> >>
> >> Any suggestions?
> >>
> >> JIT for other architectures will come after this one is merged and some
> >> of them will be using cycles, so not too far away form the truth..
> >
> > bpf_get_time_stamp() ?
> > bpf_get_counter() ?
>
> Well, we have already been somewhere nearby these names [1].
>
> [1]
> https://lore.kernel.org/bpf/CAEf4BzaBNNCYaf9a4oHsB2AzYyc6JCWXpHx6jk22Btv=UAgX4A@mail.gmail.com/
>
> bpf_get_time_stamp() doesn't really explain that the actual timestamp
> will be provided by CPU hardware.
> bpf_get_counter() is again too general, doesn't provide any information
> about what type of counter will be returned. The more specific name,
> bpf_get_cycles_counter(), was also discussed in v3 (accidentally, it
> didn't reach mailing list). The quote of feedback from Andrii is:
>
>    Bikeshedding time, but let's be consistently slightly verbose, but
>    readable. Give nwe have bpf_get_cpu_cycles_counter (which maybe we
>    should shorten to "bpf_get_cpu_cycles()"), we should call this
>    something like "bpf_cpu_cycles_to_ns()".
>
> It might make a bit more sense to name it bpf_get_cpu_counter(), but it
> still looks too general.
>
> Honestly, I'm not a fan of renaming functions once again, I would let
> Andrii to vote for naming.

Let's go with bpf_get_cpu_time_counter() and bpf_cpu_time_counter_to_ns().
Vadim Fedorenko Nov. 19, 2024, 7:27 p.m. UTC | #6
On 19/11/2024 11:16, Andrii Nakryiko wrote:
> On Tue, Nov 19, 2024 at 10:03 AM Vadim Fedorenko
> <vadim.fedorenko@linux.dev> wrote:
>>
>> On 19/11/2024 08:17, Peter Zijlstra wrote:
>>> On Tue, Nov 19, 2024 at 06:29:09AM -0800, Vadim Fedorenko wrote:
>>>> On 19/11/2024 03:18, Peter Zijlstra wrote:
>>>>> On Mon, Nov 18, 2024 at 10:52:42AM -0800, Vadim Fedorenko wrote:
>>>>>> @@ -2094,6 +2094,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>>>>>>                             if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
>>>>>>                                     int err;
>>>>>> +                          if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
>>>>>> +                                  if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
>>>>>> +                                          EMIT3(0x0F, 0xAE, 0xE8);
>>>>>> +                                  EMIT2(0x0F, 0x31);
>>>>>> +                                  break;
>>>>>> +                          }
>>>>>
>>>>> TSC != cycles. Naming is bad.
>>>>
>>>> Any suggestions?
>>>>
>>>> JIT for other architectures will come after this one is merged and some
>>>> of them will be using cycles, so not too far away form the truth..
>>>
>>> bpf_get_time_stamp() ?
>>> bpf_get_counter() ?
>>
>> Well, we have already been somewhere nearby these names [1].
>>
>> [1]
>> https://lore.kernel.org/bpf/CAEf4BzaBNNCYaf9a4oHsB2AzYyc6JCWXpHx6jk22Btv=UAgX4A@mail.gmail.com/
>>
>> bpf_get_time_stamp() doesn't really explain that the actual timestamp
>> will be provided by CPU hardware.
>> bpf_get_counter() is again too general, doesn't provide any information
>> about what type of counter will be returned. The more specific name,
>> bpf_get_cycles_counter(), was also discussed in v3 (accidentally, it
>> didn't reach mailing list). The quote of feedback from Andrii is:
>>
>>     Bikeshedding time, but let's be consistently slightly verbose, but
>>     readable. Give nwe have bpf_get_cpu_cycles_counter (which maybe we
>>     should shorten to "bpf_get_cpu_cycles()"), we should call this
>>     something like "bpf_cpu_cycles_to_ns()".
>>
>> It might make a bit more sense to name it bpf_get_cpu_counter(), but it
>> still looks too general.
>>
>> Honestly, I'm not a fan of renaming functions once again, I would let
>> Andrii to vote for naming.
> 
> Let's go with bpf_get_cpu_time_counter() and bpf_cpu_time_counter_to_ns().

Ok, sure. @Peter are you OK with these names?
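
For reference, a minimal sketch of how the names agreed above could map
onto the declaration block this patch adds to include/linux/bpf.h
(illustrative only; the bpf_cpu_time_counter_to_ns() signature is an
assumption, and the conversion kfunc itself is introduced elsewhere in
the series, see the v2 -> v3 changelog):

    /* Inlined kfuncs */
    #if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
    u64 bpf_get_cpu_time_counter(void);
    /* assumed signature for the cycles-to-ns conversion kfunc */
    u64 bpf_cpu_time_counter_to_ns(u64 counter);
    #endif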

Patch

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index a43fc5af973d..5e0c16d8bba3 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2185,6 +2185,37 @@  st:			if (is_imm8(insn->off))
 		case BPF_JMP | BPF_CALL: {
 			u8 *ip = image + addrs[i - 1];
 
+			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
+			    imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
+				/* The default implementation of this kfunc uses
+				 * __arch_get_hw_counter() which is implemented as
+				 * `(u64)rdtsc_ordered() & S64_MAX`. We skip masking
+				 * part because we assume it's not needed in BPF
+				 * use case (two measurements close in time).
+				 * Original code for rdtsc_ordered() uses sequence:
+				 * 'rdtsc; nop; nop; nop' to patch it into
+				 * 'lfence; rdtsc' or 'rdtscp' depending on CPU features.
+				 * JIT uses 'lfence; rdtsc' variant because BPF program
+				 * doesn't care about cookie provided by rdtscp in RCX.
+				 * Save RDX because RDTSC will use EDX:EAX to return u64
+				 */
+				emit_mov_reg(&prog, true, AUX_REG, BPF_REG_3);
+				if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
+					EMIT_LFENCE();
+				EMIT2(0x0F, 0x31);
+
+				/* shl RDX, 32 */
+				maybe_emit_1mod(&prog, BPF_REG_3, true);
+				EMIT3(0xC1, add_1reg(0xE0, BPF_REG_3), 32);
+				/* or RAX, RDX */
+				maybe_emit_mod(&prog, BPF_REG_0, BPF_REG_3, true);
+				EMIT2(0x09, add_2reg(0xC0, BPF_REG_0, BPF_REG_3));
+				/* restore RDX from R11 */
+				emit_mov_reg(&prog, true, BPF_REG_3, AUX_REG);
+
+				break;
+			}
+
 			func = (u8 *) __bpf_call_base + imm32;
 			if (src_reg == BPF_PSEUDO_CALL && tail_call_reachable) {
 				LOAD_TAIL_CALL_CNT_PTR(stack_depth);
@@ -3791,3 +3822,11 @@  u64 bpf_arch_uaddress_limit(void)
 {
 	return 0;
 }
+
+/* x86-64 JIT can inline kfunc */
+bool bpf_jit_inlines_kfunc_call(s32 imm)
+{
+	if (imm == BPF_CALL_IMM(bpf_get_cpu_cycles))
+		return true;
+	return false;
+}
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index de0f9e5f9f73..11a5c41302a3 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2094,6 +2094,13 @@  static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
 				int err;
 
+				if (imm32 == BPF_CALL_IMM(bpf_get_cpu_cycles)) {
+					if (cpu_feature_enabled(X86_FEATURE_LFENCE_RDTSC))
+						EMIT3(0x0F, 0xAE, 0xE8);
+					EMIT2(0x0F, 0x31);
+					break;
+				}
+
 				err = emit_kfunc_call(bpf_prog,
 						      image + addrs[i],
 						      insn, &prog);
@@ -2621,3 +2628,10 @@  bool bpf_jit_supports_kfunc_call(void)
 {
 	return true;
 }
+
+bool bpf_jit_inlines_kfunc_call(s32 imm)
+{
+	if (imm == BPF_CALL_IMM(bpf_get_cpu_cycles))
+		return true;
+	return false;
+}
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 3ace0d6227e3..43a5207a1591 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3333,6 +3333,11 @@  void bpf_user_rnd_init_once(void);
 u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 u64 bpf_get_raw_cpu_id(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 
+/* Inlined kfuncs */
+#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
+u64 bpf_get_cpu_cycles(void);
+#endif
+
 #if defined(CONFIG_NET)
 bool bpf_sock_common_is_valid_access(int off, int size,
 				     enum bpf_access_type type,
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 3a21947f2fd4..9cf57233874f 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1111,6 +1111,7 @@  struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
 void bpf_jit_compile(struct bpf_prog *prog);
 bool bpf_jit_needs_zext(void);
 bool bpf_jit_inlines_helper_call(s32 imm);
+bool bpf_jit_inlines_kfunc_call(s32 imm);
 bool bpf_jit_supports_subprog_tailcalls(void);
 bool bpf_jit_supports_percpu_insn(void);
 bool bpf_jit_supports_kfunc_call(void);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 14d9288441f2..daa3ab458c8a 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2965,6 +2965,17 @@  bool __weak bpf_jit_inlines_helper_call(s32 imm)
 	return false;
 }
 
+/* Return true if the JIT inlines the call to the kfunc corresponding to
+ * the imm.
+ *
+ * The verifier will not patch the insn->imm for the call to the helper if
+ * this returns true.
+ */
+bool __weak bpf_jit_inlines_kfunc_call(s32 imm)
+{
+	return false;
+}
+
 /* Return TRUE if the JIT backend supports mixing bpf2bpf and tailcalls. */
 bool __weak bpf_jit_supports_subprog_tailcalls(void)
 {
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 751c150f9e1c..9f1a51bdb365 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -23,6 +23,10 @@ 
 #include <linux/btf_ids.h>
 #include <linux/bpf_mem_alloc.h>
 #include <linux/kasan.h>
+#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
+#include <vdso/datapage.h>
+#include <asm/vdso/vsyscall.h>
+#endif
 
 #include "../../lib/kstrtox.h"
 
@@ -3057,6 +3061,26 @@  __bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void __user
 	return ret + 1;
 }
 
+#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
+__bpf_kfunc u64 bpf_get_cpu_cycles(void)
+{
+	const struct vdso_data *vd = __arch_get_k_vdso_data();
+
+	vd = &vd[CS_RAW];
+
+	/* CS_RAW clock_mode translates to VDSO_CLOCKMODE_TSC on x86 and
+	 * to VDSO_CLOCKMODE_ARCHTIMER on aarch64/risc-v. We cannot use
+	 * vd->clock_mode directly because it brings possible access to
+	 * pages visible by user-space only via vDSO. But the constant value
+	 * of 1 is exactly what we need - it works for any architecture and
+	 * translates to reading of HW timecounter regardless of architecture.
+	 * We still have to provide vdso_data for some architectures to avoid
+	 * NULL pointer dereference.
+	 */
+	return __arch_get_hw_counter(1, vd);
+}
+#endif
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(generic_btf_ids)
@@ -3149,6 +3173,9 @@  BTF_ID_FLAGS(func, bpf_get_kmem_cache)
 BTF_ID_FLAGS(func, bpf_iter_kmem_cache_new, KF_ITER_NEW | KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
+#if IS_ENABLED(CONFIG_GENERIC_GETTIMEOFDAY)
+BTF_ID_FLAGS(func, bpf_get_cpu_cycles, KF_FASTCALL)
+#endif
 BTF_KFUNCS_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1c4ebb326785..dbfad4457bef 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16407,6 +16407,24 @@  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
 	}
 }
 
+/* True if fixup_kfunc_call() replaces calls to kfunc number 'imm',
+ * replacement patch is presumed to follow bpf_fastcall contract
+ * (see mark_fastcall_pattern_for_call() below).
+ */
+static bool verifier_inlines_kfunc_call(struct bpf_verifier_env *env, s32 imm)
+{
+	const struct bpf_kfunc_desc *desc = find_kfunc_desc(env->prog, imm, 0);
+
+	if (!env->prog->jit_requested)
+		return false;
+
+	if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
+	    desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast])
+		return true;
+
+	return false;
+}
+
 /* Same as helper_fastcall_clobber_mask() but for kfuncs, see comment above */
 static u32 kfunc_fastcall_clobber_mask(struct bpf_kfunc_call_arg_meta *meta)
 {
@@ -16534,7 +16552,10 @@  static void mark_fastcall_pattern_for_call(struct bpf_verifier_env *env,
 			return;
 
 		clobbered_regs_mask = kfunc_fastcall_clobber_mask(&meta);
-		can_be_inlined = is_fastcall_kfunc_call(&meta);
+		can_be_inlined = is_fastcall_kfunc_call(&meta) &&
+				 (verifier_inlines_kfunc_call(env, call->imm) ||
+				 (meta.btf == btf_vmlinux &&
+				  bpf_jit_inlines_kfunc_call(call->imm)));
 	}
 
 	if (clobbered_regs_mask == ALL_CALLER_SAVED_REGS)
@@ -20541,6 +20562,7 @@  static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			    struct bpf_insn *insn_buf, int insn_idx, int *cnt)
 {
 	const struct bpf_kfunc_desc *desc;
+	s32 imm = insn->imm;
 
 	if (!insn->imm) {
 		verbose(env, "invalid kernel function call not eliminated in verifier pass\n");
@@ -20564,7 +20586,18 @@  static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		insn->imm = BPF_CALL_IMM(desc->addr);
 	if (insn->off)
 		return 0;
-	if (desc->func_id == special_kfunc_list[KF_bpf_obj_new_impl] ||
+	if (verifier_inlines_kfunc_call(env, imm)) {
+		if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
+		    desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
+			insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1);
+			*cnt = 1;
+		} else {
+			verbose(env, "verifier internal error: kfunc id %d has no inline code\n",
+				desc->func_id);
+			return -EFAULT;
+		}
+
+	} else if (desc->func_id == special_kfunc_list[KF_bpf_obj_new_impl] ||
 	    desc->func_id == special_kfunc_list[KF_bpf_percpu_obj_new_impl]) {
 		struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
 		struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
@@ -20625,10 +20658,6 @@  static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 
 		__fixup_collection_insert_kfunc(&env->insn_aux_data[insn_idx], struct_meta_reg,
 						node_offset_reg, insn, insn_buf, cnt);
-	} else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
-		   desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
-		insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1);
-		*cnt = 1;
 	} else if (is_bpf_wq_set_callback_impl_kfunc(desc->func_id)) {
 		struct bpf_insn ld_addrs[2] = { BPF_LD_IMM64(BPF_REG_4, (long)env->prog->aux) };