
[bpf-next,v2] selftests/bpf: improve ringbuf benchmark output

Message ID 20230719201533.176702-1-awerner32@gmail.com (mailing list archive)
State New, archived
Delegated to: BPF
Series [bpf-next,v2] selftests/bpf: improve ringbuf benchmark output

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers warning 14 maintainers not CCed: daniel@iogearbox.net yhs@fb.com kpsingh@kernel.org martin.lau@linux.dev john.fastabend@gmail.com sdf@google.com shuah@kernel.org andrii@kernel.org song@kernel.org mykolal@fb.com houtao1@huawei.com linux-kselftest@vger.kernel.org jolsa@kernel.org haoluo@google.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success net selftest script(s) already in Makefile
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 9 this patch: 9
netdev/checkpatch warning WARNING: line length of 109 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on s390x with gcc

Commit Message

Andrew Werner July 19, 2023, 8:15 p.m. UTC
The ringbuf benchmarks print headers for each section of benchmarks.
The naming conventions lead a user of the benchmarks to some confusion.
This change is a cosmetic update to the output of that benchmark; no
changes were made to what the script actually executes.

The back-to-back explorations of sample rates for Perfbuf and Ringbuf
have been combined into a single section.

Some of the variables in the script were renamed for clarity; b is
always a benchmark name, r is a sampling rate, n is a number of
producers. Before the change, b was the only variable.
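
The naming convention can be tried in isolation with a minimal sketch. It is not the script itself: `header` and `summarize` below are hypothetical stand-ins that only print their arguments (the real helpers are defined outside this patch), `list_sample_rate_runs` is an illustrative name, and the rate list is shortened. The loop variables follow the patch, with `b` for the benchmark name and `r` for the sample rate.

```shell
# Hypothetical stand-ins for the real helpers, which this patch does not show.
header() {
	printf '%s\n' "$1"
	# Underline the title with one "=" per character.
	printf '%s\n' "$1" | sed 's/./=/g'
}

summarize() {
	# The real script receives benchmark output in $2; here it is only echoed.
	printf '%-20s %s\n' "$1" "$2"
}

# Outer loop over benchmark names (b), inner loop over sample rates (r).
list_sample_rate_runs() {
	header "Back-to-back producer, varying sample rate"
	for b in rb-custom pb-custom; do
		for r in 1 5 10; do
			summarize "$b-$r" "(would pass: --rb-sample-rate $r $b)"
		done
	done
}

list_sample_rate_runs
```

Nesting the loops this way yields one label per benchmark and rate pair under a single header, which is the shape of the combined section.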

After:
```
Parallel producer
=================
rb-libbpf            43.072 ± 0.165M/s (drops 0.940 ± 0.016M/s)
rb-custom            20.274 ± 0.442M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.480 ± 0.015M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.492 ± 0.023M/s (drops 0.000 ± 0.000M/s)

Parallel producer, sampled notifications
========================================
rb-libbpf            41.132 ± 0.113M/s (drops 0.000 ± 0.000M/s)
rb-custom            33.228 ± 0.086M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            22.498 ± 0.142M/s (drops 0.052 ± 0.171M/s)
pb-custom            22.399 ± 0.060M/s (drops 0.030 ± 0.100M/s)

Back-to-back producer
=====================
rb-libbpf            59.951 ± 0.712M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled    57.751 ± 4.694M/s (drops 0.000 ± 0.000M/s)
rb-custom            71.568 ± 12.584M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled    71.919 ± 7.540M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.961 ± 0.013M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled    22.339 ± 0.129M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.972 ± 0.009M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled    22.802 ± 0.374M/s (drops 0.000 ± 0.000M/s)

Back-to-back producer, varying sample rate
==========================================
rb-custom-1          1.529 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-custom-5          5.817 ± 1.945M/s (drops 0.000 ± 0.000M/s)
rb-custom-10         12.884 ± 0.032M/s (drops 0.000 ± 0.000M/s)
rb-custom-25         25.634 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-custom-50         39.970 ± 0.309M/s (drops 0.000 ± 0.000M/s)
rb-custom-100        51.868 ± 0.210M/s (drops 0.000 ± 0.000M/s)
rb-custom-250        69.466 ± 0.039M/s (drops 0.000 ± 0.000M/s)
rb-custom-500        76.370 ± 0.181M/s (drops 0.000 ± 0.000M/s)
rb-custom-1000       79.778 ± 0.248M/s (drops 0.000 ± 0.000M/s)
rb-custom-2000       82.952 ± 0.198M/s (drops 0.000 ± 0.000M/s)
rb-custom-3000       82.314 ± 0.155M/s (drops 0.000 ± 0.000M/s)
pb-custom-1          1.418 ± 0.004M/s (drops 0.000 ± 0.000M/s)
pb-custom-5          5.655 ± 0.066M/s (drops 0.000 ± 0.000M/s)
pb-custom-10         9.091 ± 0.109M/s (drops 0.000 ± 0.000M/s)
pb-custom-25         14.338 ± 0.144M/s (drops 0.000 ± 0.000M/s)
pb-custom-50         17.841 ± 0.318M/s (drops 0.000 ± 0.000M/s)
pb-custom-100        20.491 ± 0.099M/s (drops 0.000 ± 0.000M/s)
pb-custom-250        22.047 ± 0.270M/s (drops 0.000 ± 0.000M/s)
pb-custom-500        22.475 ± 0.676M/s (drops 0.000 ± 0.000M/s)
pb-custom-1000       23.013 ± 0.786M/s (drops 0.000 ± 0.000M/s)
pb-custom-2000       23.305 ± 0.182M/s (drops 0.000 ± 0.000M/s)
pb-custom-3000       23.855 ± 0.071M/s (drops 0.000 ± 0.000M/s)

Back-to-back producer, rb-custom reserve+commit vs output
=========================================================
reserve              76.244 ± 0.469M/s (drops 0.000 ± 0.000M/s)
output               64.707 ± 5.618M/s (drops 0.000 ± 0.000M/s)

Parallel producer, rb-custom reserve+commit vs output, sampled notifications
============================================================================
reserve-sampled      33.560 ± 0.024M/s (drops 0.000 ± 0.000M/s)
output-sampled       30.348 ± 0.313M/s (drops 0.000 ± 0.000M/s)

Concurrent producer (same CPU as consumer), low batch count
===========================================================
rb-libbpf            0.563 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-custom            0.571 ± 0.001M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            0.523 ± 0.001M/s (drops 0.000 ± 0.000M/s)
pb-custom            0.530 ± 0.004M/s (drops 0.000 ± 0.000M/s)

Multiple parallel producers (contention)
========================================
rb-libbpf nr_prod 1  44.711 ± 0.058M/s (drops 0.183 ± 0.012M/s)
rb-libbpf nr_prod 2  23.534 ± 0.069M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3  14.011 ± 0.023M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4  14.858 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8  6.184 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 4.719 ± 0.058M/s (drops 0.006 ± 0.021M/s)
rb-libbpf nr_prod 16 4.607 ± 0.055M/s (drops 0.010 ± 0.028M/s)
rb-libbpf nr_prod 20 5.001 ± 0.052M/s (drops 0.010 ± 0.025M/s)
rb-libbpf nr_prod 24 5.234 ± 0.114M/s (drops 0.006 ± 0.021M/s)
rb-libbpf nr_prod 28 5.021 ± 0.020M/s (drops 0.007 ± 0.014M/s)
rb-libbpf nr_prod 32 4.316 ± 0.142M/s (drops 0.614 ± 0.121M/s)
rb-libbpf nr_prod 36 4.353 ± 0.157M/s (drops 0.708 ± 0.126M/s)
rb-libbpf nr_prod 40 4.230 ± 0.058M/s (drops 0.775 ± 0.120M/s)
rb-libbpf nr_prod 44 4.212 ± 0.050M/s (drops 0.736 ± 0.084M/s)
rb-libbpf nr_prod 48 4.276 ± 0.057M/s (drops 0.784 ± 0.095M/s)
rb-libbpf nr_prod 52 4.222 ± 0.141M/s (drops 0.777 ± 0.172M/s)
```

Before:
```
Single-producer, parallel producer
==================================
rb-libbpf            43.366 ± 0.277M/s (drops 0.848 ± 0.027M/s)
rb-custom            17.831 ± 0.391M/s (drops 0.065 ± 0.216M/s)
pb-libbpf            1.494 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.521 ± 0.002M/s (drops 0.000 ± 0.000M/s)

Single-producer, parallel producer, sampled notification
========================================================
rb-libbpf            41.163 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-custom            33.364 ± 0.347M/s (drops 0.025 ± 0.082M/s)
pb-libbpf            21.039 ± 3.350M/s (drops 0.014 ± 0.036M/s)
pb-custom            22.570 ± 0.267M/s (drops 0.136 ± 0.319M/s)

Single-producer, back-to-back mode
==================================
rb-libbpf            60.671 ± 0.274M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled    59.229 ± 0.422M/s (drops 0.000 ± 0.000M/s)
rb-custom            77.296 ± 0.156M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled    71.147 ± 0.281M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.960 ± 0.007M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled    22.230 ± 0.115M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.969 ± 0.005M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled    22.883 ± 0.122M/s (drops 0.000 ± 0.000M/s)

Ringbuf back-to-back, effect of sample rate
===========================================
rb-sampled-1         1.507 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-sampled-5         7.095 ± 0.016M/s (drops 0.000 ± 0.000M/s)
rb-sampled-10        13.091 ± 0.046M/s (drops 0.000 ± 0.000M/s)
rb-sampled-25        26.259 ± 0.061M/s (drops 0.000 ± 0.000M/s)
rb-sampled-50        39.831 ± 0.122M/s (drops 0.000 ± 0.000M/s)
rb-sampled-100       51.536 ± 2.984M/s (drops 0.000 ± 0.000M/s)
rb-sampled-250       67.850 ± 1.267M/s (drops 0.000 ± 0.000M/s)
rb-sampled-500       75.257 ± 0.438M/s (drops 0.000 ± 0.000M/s)
rb-sampled-1000      74.939 ± 0.295M/s (drops 0.000 ± 0.000M/s)
rb-sampled-2000      81.481 ± 0.769M/s (drops 0.000 ± 0.000M/s)
rb-sampled-3000      82.637 ± 0.448M/s (drops 0.000 ± 0.000M/s)

Perfbuf back-to-back, effect of sample rate
===========================================
pb-sampled-1         1.408 ± 0.003M/s (drops 0.000 ± 0.000M/s)
pb-sampled-5         5.667 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-sampled-10        9.162 ± 0.026M/s (drops 0.000 ± 0.000M/s)
pb-sampled-25        14.389 ± 0.033M/s (drops 0.000 ± 0.000M/s)
pb-sampled-50        17.977 ± 0.049M/s (drops 0.000 ± 0.000M/s)
pb-sampled-100       20.541 ± 0.079M/s (drops 0.000 ± 0.000M/s)
pb-sampled-250       22.176 ± 0.523M/s (drops 0.000 ± 0.000M/s)
pb-sampled-500       23.121 ± 0.124M/s (drops 0.000 ± 0.000M/s)
pb-sampled-1000      22.415 ± 1.860M/s (drops 0.000 ± 0.000M/s)
pb-sampled-2000      23.333 ± 0.679M/s (drops 0.000 ± 0.000M/s)
pb-sampled-3000      23.032 ± 0.649M/s (drops 0.000 ± 0.000M/s)

Ringbuf back-to-back, reserve+commit vs output
==============================================
reserve              77.180 ± 0.304M/s (drops 0.000 ± 0.000M/s)
output               60.890 ± 7.685M/s (drops 0.000 ± 0.000M/s)

Ringbuf sampled, reserve+commit vs output
=========================================
reserve-sampled      30.724 ± 0.166M/s (drops 0.000 ± 0.000M/s)
output-sampled       30.261 ± 0.454M/s (drops 0.000 ± 0.000M/s)

Single-producer, consumer/producer competing on the same CPU, low batch count
=============================================================================
rb-libbpf            0.570 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-custom            0.569 ± 0.003M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            0.539 ± 0.002M/s (drops 0.000 ± 0.000M/s)
pb-custom            0.549 ± 0.003M/s (drops 0.000 ± 0.000M/s)

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1  44.359 ± 0.319M/s (drops 0.091 ± 0.027M/s)
rb-libbpf nr_prod 2  23.722 ± 0.024M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3  14.128 ± 0.011M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4  14.896 ± 0.020M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8  6.056 ± 0.061M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 4.612 ± 0.042M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16 4.684 ± 0.040M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20 5.007 ± 0.046M/s (drops 0.001 ± 0.004M/s)
rb-libbpf nr_prod 24 5.207 ± 0.093M/s (drops 0.006 ± 0.013M/s)
rb-libbpf nr_prod 28 4.951 ± 0.073M/s (drops 0.030 ± 0.069M/s)
rb-libbpf nr_prod 32 4.509 ± 0.069M/s (drops 0.582 ± 0.057M/s)
rb-libbpf nr_prod 36 4.361 ± 0.064M/s (drops 0.733 ± 0.126M/s)
rb-libbpf nr_prod 40 4.261 ± 0.049M/s (drops 0.713 ± 0.116M/s)
rb-libbpf nr_prod 44 4.150 ± 0.207M/s (drops 0.841 ± 0.191M/s)
rb-libbpf nr_prod 48 4.033 ± 0.064M/s (drops 1.009 ± 0.082M/s)
rb-libbpf nr_prod 52 4.025 ± 0.049M/s (drops 1.012 ± 0.069M/s)

```
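
Since the change is described as output-only, one way to sanity-check it is to pull the mean throughput for each label out of both listings and compare the labels that appear in each. The following is a sketch under the assumption that every result line has the shape `<label> <mean> ± <stddev>M/s (drops ...)` seen above; the function name `throughput` is hypothetical and not part of the selftests.

```shell
# Print "<label> <mean throughput in M/s>" for each result line read from stdin.
# Assumes lines shaped like: rb-libbpf  43.072 ± 0.165M/s (drops ...)
throughput() {
	awk '/M\/s \(drops/ {
		# Find the first "±" field; the mean is the field just before it.
		for (i = 2; i <= NF; i++) if ($i == "±") break
		if (i > NF) next
		# The label is every field before the mean (labels may contain spaces).
		label = $1
		for (j = 2; j < i - 1; j++) label = label " " $j
		print label, $(i - 1)
	}'
}
```

For example, `printf '%s\n' 'rb-custom  20.274 ± 0.442M/s (drops 0.000 ± 0.000M/s)' | throughput` prints `rb-custom 20.274`. Header and underline lines are skipped because they do not match the `M/s (drops` pattern.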

Signed-off-by: Andrew Werner <awerner32@gmail.com>
---
v1->v2:
 - Improved commit message
 - Added SOB
 - Reworked all section headers for uniformity

v1: https://lore.kernel.org/bpf/20230719014744.3480131-1-awerner32@gmail.com/
---
 .../bpf/benchs/run_bench_ringbufs.sh          | 30 +++++++++----------
 1 file changed, 14 insertions(+), 16 deletions(-)

Comments

Hou Tao July 21, 2023, 12:57 p.m. UTC | #1
On 7/20/2023 4:15 AM, Andrew Werner wrote:
> The ringbuf benchmarks print headers for each section of benchmarks.
> The naming conventions lead a user of the benchmarks to some confusion.
> This change is a cosmetic update to the output of that benchmark; no
> changes were made to what the script actually executes.
>
> The back-to-back explorations of sample rates for Perfbuf and Ringbuf
> have been combined into a single section.
>
> Some of the variables in the script were renamed for clarity; b is
> always a benchmark name, r is a sampling rate, n is a number of
> producers. Before the change, b was the only variable.
>
> After:
> ```
> Parallel producer
> =================
> rb-libbpf            43.072 ± 0.165M/s (drops 0.940 ± 0.016M/s)
> rb-custom            20.274 ± 0.442M/s (drops 0.000 ± 0.000M/s)
> pb-libbpf            1.480 ± 0.015M/s (drops 0.000 ± 0.000M/s)
> pb-custom            1.492 ± 0.023M/s (drops 0.000 ± 0.000M/s)
>
......
>
> ```
>
> Signed-off-by: Andrew Werner <awerner32@gmail.com>

Acked-by: Hou Tao <houtao1@huawei.com>

Patch

diff --git a/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh b/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
index 91e3567962ff..c495013c1d88 100755
--- a/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
+++ b/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
@@ -6,46 +6,44 @@  set -eufo pipefail
 
 RUN_RB_BENCH="$RUN_BENCH -c1"
 
-header "Single-producer, parallel producer"
+header "Parallel producer"
 for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
 	summarize $b "$($RUN_RB_BENCH $b)"
 done
 
-header "Single-producer, parallel producer, sampled notification"
+header "Parallel producer, sampled notifications"
 for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
 	summarize $b "$($RUN_RB_BENCH --rb-sampled $b)"
 done
 
-header "Single-producer, back-to-back mode"
+header "Back-to-back producer"
 for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
 	summarize $b "$($RUN_RB_BENCH --rb-b2b $b)"
 	summarize $b-sampled "$($RUN_RB_BENCH --rb-sampled --rb-b2b $b)"
 done
 
-header "Ringbuf back-to-back, effect of sample rate"
-for b in 1 5 10 25 50 100 250 500 1000 2000 3000; do
-	summarize "rb-sampled-$b" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $b --rb-sampled --rb-sample-rate $b rb-custom)"
-done
-header "Perfbuf back-to-back, effect of sample rate"
-for b in 1 5 10 25 50 100 250 500 1000 2000 3000; do
-	summarize "pb-sampled-$b" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $b --rb-sampled --rb-sample-rate $b pb-custom)"
+header "Back-to-back producer, varying sample rate"
+for b in rb-custom pb-custom; do
+	for r in 1 5 10 25 50 100 250 500 1000 2000 3000; do
+		summarize "$b-$r" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $r --rb-sampled --rb-sample-rate $r $b)"
+	done
 done
 
-header "Ringbuf back-to-back, reserve+commit vs output"
+header "Back-to-back producer, rb-custom reserve+commit vs output"
 summarize "reserve" "$($RUN_RB_BENCH --rb-b2b                 rb-custom)"
 summarize "output"  "$($RUN_RB_BENCH --rb-b2b --rb-use-output rb-custom)"
 
-header "Ringbuf sampled, reserve+commit vs output"
+header "Parallel producer, rb-custom reserve+commit vs output, sampled notifications"
 summarize "reserve-sampled" "$($RUN_RB_BENCH --rb-sampled                 rb-custom)"
 summarize "output-sampled"  "$($RUN_RB_BENCH --rb-sampled --rb-use-output rb-custom)"
 
-header "Single-producer, consumer/producer competing on the same CPU, low batch count"
+header "Concurrent producer (same CPU as consumer), low batch count"
 for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
 	summarize $b "$($RUN_RB_BENCH --rb-batch-cnt 1 --rb-sample-rate 1 --prod-affinity 0 --cons-affinity 0 $b)"
 done
 
-header "Ringbuf, multi-producer contention"
-for b in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
-	summarize "rb-libbpf nr_prod $b" "$($RUN_RB_BENCH -p$b --rb-batch-cnt 50 rb-libbpf)"
+header "Parallel producers (multiple, contention)"
+for n in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
+	summarize "rb-libbpf nr_prod $n" "$($RUN_RB_BENCH -p$n --rb-batch-cnt 50 rb-libbpf)"
 done