@@ -6,46 +6,44 @@ set -eufo pipefail
RUN_RB_BENCH="$RUN_BENCH -c1"
-header "Single-producer, parallel producer"
+header "Parallel producer"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH $b)"
done
-header "Single-producer, parallel producer, sampled notification"
+header "Parallel producer, sampled notifications"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH --rb-sampled $b)"
done
-header "Single-producer, back-to-back mode"
+header "Back-to-back producer"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH --rb-b2b $b)"
summarize $b-sampled "$($RUN_RB_BENCH --rb-sampled --rb-b2b $b)"
done
-header "Ringbuf back-to-back, effect of sample rate"
-for b in 1 5 10 25 50 100 250 500 1000 2000 3000; do
- summarize "rb-sampled-$b" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $b --rb-sampled --rb-sample-rate $b rb-custom)"
-done
-header "Perfbuf back-to-back, effect of sample rate"
-for b in 1 5 10 25 50 100 250 500 1000 2000 3000; do
- summarize "pb-sampled-$b" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $b --rb-sampled --rb-sample-rate $b pb-custom)"
+header "Back-to-back producer, varying sample rate"
+for b in rb-custom pb-custom; do
+ for r in 1 5 10 25 50 100 250 500 1000 2000 3000; do
+ summarize "$b-$r" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $r --rb-sampled --rb-sample-rate $r $b)"
+ done
done
-header "Ringbuf back-to-back, reserve+commit vs output"
+header "Back-to-back producer, rb-custom reserve+commit vs output"
summarize "reserve" "$($RUN_RB_BENCH --rb-b2b rb-custom)"
summarize "output" "$($RUN_RB_BENCH --rb-b2b --rb-use-output rb-custom)"
-header "Ringbuf sampled, reserve+commit vs output"
+header "Parallel producer, rb-custom reserve+commit vs output, sampled notifications"
summarize "reserve-sampled" "$($RUN_RB_BENCH --rb-sampled rb-custom)"
summarize "output-sampled" "$($RUN_RB_BENCH --rb-sampled --rb-use-output rb-custom)"
-header "Single-producer, consumer/producer competing on the same CPU, low batch count"
+header "Concurrent producer (same CPU as consumer), low batch count"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH --rb-batch-cnt 1 --rb-sample-rate 1 --prod-affinity 0 --cons-affinity 0 $b)"
done
-header "Ringbuf, multi-producer contention"
-for b in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
- summarize "rb-libbpf nr_prod $b" "$($RUN_RB_BENCH -p$b --rb-batch-cnt 50 rb-libbpf)"
+header "Parallel producers (multiple, contention)"
+for n in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
+ summarize "rb-libbpf nr_prod $n" "$($RUN_RB_BENCH -p$n --rb-batch-cnt 50 rb-libbpf)"
done
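The combined sample-rate section in the hunk above can be dry-run without the benchmark binary. This is a minimal sketch, not the real script: `header` and `summarize` here are hypothetical stand-ins for the helpers sourced from run_common.sh, `RUN_RB_BENCH` is replaced by an `echo` so the command line is printed instead of executed, the rate list is shortened, and `run_sample_rate_section` is a name invented for illustration.

```shell
#!/bin/bash
set -eufo pipefail

# Hypothetical stand-in: print a title underlined with '=' characters.
header() {
	printf '\n%s\n' "$1"
	printf '%s\n' "$1" | sed 's/./=/g'
}

# Hypothetical stand-in: print a label next to whatever the run produced.
summarize() {
	printf '%-20s %s\n' "$1" "$2"
}

# Assumption: echo the arguments rather than invoking the real benchmark.
RUN_RB_BENCH="echo bench -c1"

# Same loop shape as the patch: b is a benchmark name, r is a sample rate.
run_sample_rate_section() {
	header "Back-to-back producer, varying sample rate"
	for b in rb-custom pb-custom; do
		for r in 1 5 10; do
			summarize "$b-$r" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $r --rb-sampled --rb-sample-rate $r $b)"
		done
	done
}

run_sample_rate_section
```

Each printed row pairs the new `$b-$r` label with the flags that would be passed for that benchmark and rate, which makes it easy to check that the merged loop covers the same combinations as the two loops it replaces.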
The ringbuf benchmarks print a header for each section of benchmarks,
and the existing header names are confusing to users of the benchmarks.
This change is a cosmetic update to the output of that benchmark; no
changes were made to what the script actually executes.

The back-to-back exploration of sample rates for Perfbuf and Ringbuf has
been combined into a single section.

Some of the variables in the script were renamed for clarity: b is
always a benchmark name, r is a sampling rate, and n is a number of
producers. Before the change, b was the only variable.

After:
```
Parallel producer
=================
rb-libbpf            43.072 ± 0.165M/s (drops 0.940 ± 0.016M/s)
rb-custom            20.274 ± 0.442M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.480 ± 0.015M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.492 ± 0.023M/s (drops 0.000 ± 0.000M/s)

Parallel producer, sampled notifications
========================================
rb-libbpf            41.132 ± 0.113M/s (drops 0.000 ± 0.000M/s)
rb-custom            33.228 ± 0.086M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            22.498 ± 0.142M/s (drops 0.052 ± 0.171M/s)
pb-custom            22.399 ± 0.060M/s (drops 0.030 ± 0.100M/s)

Back-to-back producer
=====================
rb-libbpf            59.951 ± 0.712M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled    57.751 ± 4.694M/s (drops 0.000 ± 0.000M/s)
rb-custom            71.568 ± 12.584M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled    71.919 ± 7.540M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.961 ± 0.013M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled    22.339 ± 0.129M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.972 ± 0.009M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled    22.802 ± 0.374M/s (drops 0.000 ± 0.000M/s)

Back-to-back producer, varying sample rate
==========================================
rb-custom-1          1.529 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-custom-5          5.817 ± 1.945M/s (drops 0.000 ± 0.000M/s)
rb-custom-10         12.884 ± 0.032M/s (drops 0.000 ± 0.000M/s)
rb-custom-25         25.634 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-custom-50         39.970 ± 0.309M/s (drops 0.000 ± 0.000M/s)
rb-custom-100        51.868 ± 0.210M/s (drops 0.000 ± 0.000M/s)
rb-custom-250        69.466 ± 0.039M/s (drops 0.000 ± 0.000M/s)
rb-custom-500        76.370 ± 0.181M/s (drops 0.000 ± 0.000M/s)
rb-custom-1000       79.778 ± 0.248M/s (drops 0.000 ± 0.000M/s)
rb-custom-2000       82.952 ± 0.198M/s (drops 0.000 ± 0.000M/s)
rb-custom-3000       82.314 ± 0.155M/s (drops 0.000 ± 0.000M/s)
pb-custom-1          1.418 ± 0.004M/s (drops 0.000 ± 0.000M/s)
pb-custom-5          5.655 ± 0.066M/s (drops 0.000 ± 0.000M/s)
pb-custom-10         9.091 ± 0.109M/s (drops 0.000 ± 0.000M/s)
pb-custom-25         14.338 ± 0.144M/s (drops 0.000 ± 0.000M/s)
pb-custom-50         17.841 ± 0.318M/s (drops 0.000 ± 0.000M/s)
pb-custom-100        20.491 ± 0.099M/s (drops 0.000 ± 0.000M/s)
pb-custom-250        22.047 ± 0.270M/s (drops 0.000 ± 0.000M/s)
pb-custom-500        22.475 ± 0.676M/s (drops 0.000 ± 0.000M/s)
pb-custom-1000       23.013 ± 0.786M/s (drops 0.000 ± 0.000M/s)
pb-custom-2000       23.305 ± 0.182M/s (drops 0.000 ± 0.000M/s)
pb-custom-3000       23.855 ± 0.071M/s (drops 0.000 ± 0.000M/s)

Back-to-back producer, rb-custom reserve+commit vs output
=========================================================
reserve              76.244 ± 0.469M/s (drops 0.000 ± 0.000M/s)
output               64.707 ± 5.618M/s (drops 0.000 ± 0.000M/s)

Parallel producer, rb-custom reserve+commit vs output, sampled notifications
============================================================================
reserve-sampled      33.560 ± 0.024M/s (drops 0.000 ± 0.000M/s)
output-sampled       30.348 ± 0.313M/s (drops 0.000 ± 0.000M/s)

Concurrent producer (same CPU as consumer), low batch count
===========================================================
rb-libbpf            0.563 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-custom            0.571 ± 0.001M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            0.523 ± 0.001M/s (drops 0.000 ± 0.000M/s)
pb-custom            0.530 ± 0.004M/s (drops 0.000 ± 0.000M/s)

Multiple parallel producers (contention)
========================================
rb-libbpf nr_prod 1  44.711 ± 0.058M/s (drops 0.183 ± 0.012M/s)
rb-libbpf nr_prod 2  23.534 ± 0.069M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3  14.011 ± 0.023M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4  14.858 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8  6.184 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 4.719 ± 0.058M/s (drops 0.006 ± 0.021M/s)
rb-libbpf nr_prod 16 4.607 ± 0.055M/s (drops 0.010 ± 0.028M/s)
rb-libbpf nr_prod 20 5.001 ± 0.052M/s (drops 0.010 ± 0.025M/s)
rb-libbpf nr_prod 24 5.234 ± 0.114M/s (drops 0.006 ± 0.021M/s)
rb-libbpf nr_prod 28 5.021 ± 0.020M/s (drops 0.007 ± 0.014M/s)
rb-libbpf nr_prod 32 4.316 ± 0.142M/s (drops 0.614 ± 0.121M/s)
rb-libbpf nr_prod 36 4.353 ± 0.157M/s (drops 0.708 ± 0.126M/s)
rb-libbpf nr_prod 40 4.230 ± 0.058M/s (drops 0.775 ± 0.120M/s)
rb-libbpf nr_prod 44 4.212 ± 0.050M/s (drops 0.736 ± 0.084M/s)
rb-libbpf nr_prod 48 4.276 ± 0.057M/s (drops 0.784 ± 0.095M/s)
rb-libbpf nr_prod 52 4.222 ± 0.141M/s (drops 0.777 ± 0.172M/s)
```

Before:
```
Single-producer, parallel producer
==================================
rb-libbpf            43.366 ± 0.277M/s (drops 0.848 ± 0.027M/s)
rb-custom            17.831 ± 0.391M/s (drops 0.065 ± 0.216M/s)
pb-libbpf            1.494 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.521 ± 0.002M/s (drops 0.000 ± 0.000M/s)

Single-producer, parallel producer, sampled notification
========================================================
rb-libbpf            41.163 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-custom            33.364 ± 0.347M/s (drops 0.025 ± 0.082M/s)
pb-libbpf            21.039 ± 3.350M/s (drops 0.014 ± 0.036M/s)
pb-custom            22.570 ± 0.267M/s (drops 0.136 ± 0.319M/s)

Single-producer, back-to-back mode
==================================
rb-libbpf            60.671 ± 0.274M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled    59.229 ± 0.422M/s (drops 0.000 ± 0.000M/s)
rb-custom            77.296 ± 0.156M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled    71.147 ± 0.281M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            1.960 ± 0.007M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled    22.230 ± 0.115M/s (drops 0.000 ± 0.000M/s)
pb-custom            1.969 ± 0.005M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled    22.883 ± 0.122M/s (drops 0.000 ± 0.000M/s)

Ringbuf back-to-back, effect of sample rate
===========================================
rb-sampled-1         1.507 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-sampled-5         7.095 ± 0.016M/s (drops 0.000 ± 0.000M/s)
rb-sampled-10        13.091 ± 0.046M/s (drops 0.000 ± 0.000M/s)
rb-sampled-25        26.259 ± 0.061M/s (drops 0.000 ± 0.000M/s)
rb-sampled-50        39.831 ± 0.122M/s (drops 0.000 ± 0.000M/s)
rb-sampled-100       51.536 ± 2.984M/s (drops 0.000 ± 0.000M/s)
rb-sampled-250       67.850 ± 1.267M/s (drops 0.000 ± 0.000M/s)
rb-sampled-500       75.257 ± 0.438M/s (drops 0.000 ± 0.000M/s)
rb-sampled-1000      74.939 ± 0.295M/s (drops 0.000 ± 0.000M/s)
rb-sampled-2000      81.481 ± 0.769M/s (drops 0.000 ± 0.000M/s)
rb-sampled-3000      82.637 ± 0.448M/s (drops 0.000 ± 0.000M/s)

Perfbuf back-to-back, effect of sample rate
===========================================
pb-sampled-1         1.408 ± 0.003M/s (drops 0.000 ± 0.000M/s)
pb-sampled-5         5.667 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-sampled-10        9.162 ± 0.026M/s (drops 0.000 ± 0.000M/s)
pb-sampled-25        14.389 ± 0.033M/s (drops 0.000 ± 0.000M/s)
pb-sampled-50        17.977 ± 0.049M/s (drops 0.000 ± 0.000M/s)
pb-sampled-100       20.541 ± 0.079M/s (drops 0.000 ± 0.000M/s)
pb-sampled-250       22.176 ± 0.523M/s (drops 0.000 ± 0.000M/s)
pb-sampled-500       23.121 ± 0.124M/s (drops 0.000 ± 0.000M/s)
pb-sampled-1000      22.415 ± 1.860M/s (drops 0.000 ± 0.000M/s)
pb-sampled-2000      23.333 ± 0.679M/s (drops 0.000 ± 0.000M/s)
pb-sampled-3000      23.032 ± 0.649M/s (drops 0.000 ± 0.000M/s)

Ringbuf back-to-back, reserve+commit vs output
==============================================
reserve              77.180 ± 0.304M/s (drops 0.000 ± 0.000M/s)
output               60.890 ± 7.685M/s (drops 0.000 ± 0.000M/s)

Ringbuf sampled, reserve+commit vs output
=========================================
reserve-sampled      30.724 ± 0.166M/s (drops 0.000 ± 0.000M/s)
output-sampled       30.261 ± 0.454M/s (drops 0.000 ± 0.000M/s)

Single-producer, consumer/producer competing on the same CPU, low batch count
=============================================================================
rb-libbpf            0.570 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-custom            0.569 ± 0.003M/s (drops 0.000 ± 0.000M/s)
pb-libbpf            0.539 ± 0.002M/s (drops 0.000 ± 0.000M/s)
pb-custom            0.549 ± 0.003M/s (drops 0.000 ± 0.000M/s)

Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1  44.359 ± 0.319M/s (drops 0.091 ± 0.027M/s)
rb-libbpf nr_prod 2  23.722 ± 0.024M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3  14.128 ± 0.011M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4  14.896 ± 0.020M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8  6.056 ± 0.061M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 4.612 ± 0.042M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16 4.684 ± 0.040M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20 5.007 ± 0.046M/s (drops 0.001 ± 0.004M/s)
rb-libbpf nr_prod 24 5.207 ± 0.093M/s (drops 0.006 ± 0.013M/s)
rb-libbpf nr_prod 28 4.951 ± 0.073M/s (drops 0.030 ± 0.069M/s)
rb-libbpf nr_prod 32 4.509 ± 0.069M/s (drops 0.582 ± 0.057M/s)
rb-libbpf nr_prod 36 4.361 ± 0.064M/s (drops 0.733 ± 0.126M/s)
rb-libbpf nr_prod 40 4.261 ± 0.049M/s (drops 0.713 ± 0.116M/s)
rb-libbpf nr_prod 44 4.150 ± 0.207M/s (drops 0.841 ± 0.191M/s)
rb-libbpf nr_prod 48 4.033 ± 0.064M/s (drops 1.009 ± 0.082M/s)
rb-libbpf nr_prod 52 4.025 ± 0.049M/s (drops 1.012 ± 0.069M/s)
```

Signed-off-by: Andrew Werner <awerner32@gmail.com>
---
v1->v2:
 - Improved commit message
 - Added SOB
 - Reworked all section headers for uniformity

v1: https://lore.kernel.org/bpf/20230719014744.3480131-1-awerner32@gmail.com/
---
 .../bpf/benchs/run_bench_ringbufs.sh | 30 +++++++++----------
 1 file changed, 14 insertions(+), 16 deletions(-)