From patchwork Mon Apr 14 08:36:46 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiri Olsa
X-Patchwork-Id: 14049949
From: Jiri Olsa
To: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Andrii Nakryiko
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, x86@kernel.org, Song Liu,
 Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt,
 Masami Hiramatsu, Alan Maguire
Subject: [PATCHv3 perf/core 1/2] uprobes/x86: Add support to emulate nop instructions
Date: Mon, 14 Apr 2025 10:36:46 +0200
Message-ID: <20250414083647.1234007-1-jolsa@kernel.org>
X-Mailer: git-send-email 2.49.0
Precedence: bulk
X-Mailing-List: linux-trace-kernel@vger.kernel.org

Add support to emulate all nop instructions as the original uprobe
instruction.

This change speeds up uprobes placed on top of any nop instruction and
is a preparation for the usdt probe optimization, which will be done on
top of the nop5 instruction.

With this change, a usdt probe on top of nop5 no longer takes a
performance hit compared to a usdt probe on top of the standard nop
instruction.
Suggested-by: Oleg Nesterov
Suggested-by: Andrii Nakryiko
Signed-off-by: Jiri Olsa
Reviewed-by: Oleg Nesterov
Acked-by: Andrii Nakryiko
---
v3 changes:
- use insn->length as index to x86_nops [Andrii]

 arch/x86/kernel/uprobes.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 9194695662b2..6d383839e839 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -840,6 +840,11 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 	insn_byte_t p;
 	int i;
 
+	/* x86_nops[insn->length]; same as jmp with .offs = 0 */
+	if (insn->length <= ASM_NOP_MAX &&
+	    !memcmp(insn->kaddr, x86_nops[insn->length], insn->length))
+		goto setup;
+
 	switch (opc1) {
 	case 0xeb:	/* jmp 8 */
 	case 0xe9:	/* jmp 32 */

From patchwork Mon Apr 14 08:36:47 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiri Olsa
X-Patchwork-Id: 14049950
From: Jiri Olsa
To: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Andrii Nakryiko
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, x86@kernel.org, Song Liu,
 Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt,
 Masami Hiramatsu, Alan Maguire
Subject: [PATCHv3 perf/core 2/2] selftests/bpf: Add 5-byte nop uprobe trigger bench
Date: Mon, 14 Apr 2025 10:36:47 +0200
Message-ID: <20250414083647.1234007-2-jolsa@kernel.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250414083647.1234007-1-jolsa@kernel.org>
References: <20250414083647.1234007-1-jolsa@kernel.org>
Precedence: bulk
X-Mailing-List: linux-trace-kernel@vger.kernel.org

Add 5-byte nop uprobe trigger bench (x86_64 specific) to measure
uprobes/uretprobes on top of nop5
instruction.

Signed-off-by: Jiri Olsa
Acked-by: Andrii Nakryiko
---
 tools/testing/selftests/bpf/bench.c                  | 12 ++++++
 .../selftests/bpf/benchs/bench_trigger.c             | 42 +++++++++++++++++++
 .../selftests/bpf/benchs/run_bench_uprobes.sh        |  2 +-
 3 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 1bd403a5ef7b..0fd8c9b0d38f 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -526,6 +526,12 @@ extern const struct bench bench_trig_uprobe_multi_push;
 extern const struct bench bench_trig_uretprobe_multi_push;
 extern const struct bench bench_trig_uprobe_multi_ret;
 extern const struct bench bench_trig_uretprobe_multi_ret;
+#ifdef __x86_64__
+extern const struct bench bench_trig_uprobe_nop5;
+extern const struct bench bench_trig_uretprobe_nop5;
+extern const struct bench bench_trig_uprobe_multi_nop5;
+extern const struct bench bench_trig_uretprobe_multi_nop5;
+#endif
 
 extern const struct bench bench_rb_libbpf;
 extern const struct bench bench_rb_custom;
@@ -586,6 +592,12 @@ static const struct bench *benchs[] = {
 	&bench_trig_uretprobe_multi_push,
 	&bench_trig_uprobe_multi_ret,
 	&bench_trig_uretprobe_multi_ret,
+#ifdef __x86_64__
+	&bench_trig_uprobe_nop5,
+	&bench_trig_uretprobe_nop5,
+	&bench_trig_uprobe_multi_nop5,
+	&bench_trig_uretprobe_multi_nop5,
+#endif
 	/* ringbuf/perfbuf benchmarks */
 	&bench_rb_libbpf,
 	&bench_rb_custom,
diff --git a/tools/testing/selftests/bpf/benchs/bench_trigger.c b/tools/testing/selftests/bpf/benchs/bench_trigger.c
index 32e9f194d449..82327657846e 100644
--- a/tools/testing/selftests/bpf/benchs/bench_trigger.c
+++ b/tools/testing/selftests/bpf/benchs/bench_trigger.c
@@ -333,6 +333,20 @@ static void *uprobe_producer_ret(void *input)
 	return NULL;
 }
 
+#ifdef __x86_64__
+__nocf_check __weak void uprobe_target_nop5(void)
+{
+	asm volatile (".byte 0x0f, 0x1f, 0x44, 0x00, 0x00");
+}
+
+static void *uprobe_producer_nop5(void *input)
+{
+	while (true)
+		uprobe_target_nop5();
+	return NULL;
+}
+#endif
+
 static void usetup(bool use_retprobe, bool use_multi, void *target_addr)
 {
 	size_t uprobe_offset;
@@ -448,6 +462,28 @@ static void uretprobe_multi_ret_setup(void)
 	usetup(true, true /* use_multi */, &uprobe_target_ret);
 }
 
+#ifdef __x86_64__
+static void uprobe_nop5_setup(void)
+{
+	usetup(false, false /* !use_multi */, &uprobe_target_nop5);
+}
+
+static void uretprobe_nop5_setup(void)
+{
+	usetup(true, false /* !use_multi */, &uprobe_target_nop5);
+}
+
+static void uprobe_multi_nop5_setup(void)
+{
+	usetup(false, true /* use_multi */, &uprobe_target_nop5);
+}
+
+static void uretprobe_multi_nop5_setup(void)
+{
+	usetup(true, true /* use_multi */, &uprobe_target_nop5);
+}
+#endif
+
 const struct bench bench_trig_syscall_count = {
 	.name = "trig-syscall-count",
 	.validate = trigger_validate,
@@ -506,3 +542,9 @@ BENCH_TRIG_USERMODE(uprobe_multi_ret, ret, "uprobe-multi-ret");
 BENCH_TRIG_USERMODE(uretprobe_multi_nop, nop, "uretprobe-multi-nop");
 BENCH_TRIG_USERMODE(uretprobe_multi_push, push, "uretprobe-multi-push");
 BENCH_TRIG_USERMODE(uretprobe_multi_ret, ret, "uretprobe-multi-ret");
+#ifdef __x86_64__
+BENCH_TRIG_USERMODE(uprobe_nop5, nop5, "uprobe-nop5");
+BENCH_TRIG_USERMODE(uretprobe_nop5, nop5, "uretprobe-nop5");
+BENCH_TRIG_USERMODE(uprobe_multi_nop5, nop5, "uprobe-multi-nop5");
+BENCH_TRIG_USERMODE(uretprobe_multi_nop5, nop5, "uretprobe-multi-nop5");
+#endif
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh b/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh
index af169f831f2f..03f55405484b 100755
--- a/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh
+++ b/tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh
@@ -2,7 +2,7 @@
 
 set -eufo pipefail
 
-for i in usermode-count syscall-count {uprobe,uretprobe}-{nop,push,ret}
+for i in usermode-count syscall-count {uprobe,uretprobe}-{nop,push,ret,nop5}
 do
 	summary=$(sudo ./bench -w2 -d5 -a trig-$i | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-)
 	printf "%-15s: %s\n" $i "$summary"
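
For readers who want to experiment with the check from patch 1/2 outside the kernel, the following is a minimal userspace sketch of the same idea: index a table of nop encodings by instruction length and memcmp the instruction bytes against that entry. It is not the kernel code. The names sketch_nops, sketch_is_nop and SKETCH_NOP_MAX are hypothetical, and the table is an assumption that lists only the commonly documented 1 to 8 byte x86-64 nop encodings; the kernel's own x86_nops table and ASM_NOP_MAX may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: this sketch only covers nops of 1..8 bytes. */
#define SKETCH_NOP_MAX 8

/*
 * Hypothetical stand-in for the kernel's x86_nops[] table: entry N holds
 * an N-byte nop encoding. Entry 0 is unused and never compared.
 */
static const unsigned char sketch_nops[SKETCH_NOP_MAX + 1][SKETCH_NOP_MAX] = {
	{ 0 },
	{ 0x90 },
	{ 0x66, 0x90 },
	{ 0x0f, 0x1f, 0x00 },
	{ 0x0f, 0x1f, 0x40, 0x00 },
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },	/* nop5, as in uprobe_target_nop5() */
	{ 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/*
 * Return true if the 'length' bytes at 'insn' match the table entry for
 * that length. Lengths outside 1..SKETCH_NOP_MAX never match.
 */
static bool sketch_is_nop(const unsigned char *insn, size_t length)
{
	assert(insn != NULL);

	if (length == 0 || length > SKETCH_NOP_MAX)
		return false;

	return memcmp(insn, sketch_nops[length], length) == 0;
}
```

Calling sketch_is_nop() on the five bytes emitted by uprobe_target_nop5() should report a match, while a two-byte jmp such as 0xeb 0x00 should not. Indexing by length keeps the check to a single memcmp, which appears to be the point of the v3 change note.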