Message ID | 20220624215712.3050672-1-song@kernel.org (mailing list archive) |
---|---|
Series | bpf_prog_pack followup |
Oops, wrong address for x86@. CC x86@kernel.org

> On Jun 24, 2022, at 2:57 PM, Song Liu <song@kernel.org> wrote:
>
> This set is the second half of v4 [1].
>
> Changes v4 => v5:
> 1. Rebase and resolve conflicts due to module.c split.
> 2. Update experiment results (below).
>
> For our web service production benchmark, bpf_prog_pack on 4kB pages
> gives 0.5% to 0.7% more throughput than not using bpf_prog_pack.
> bpf_prog_pack on 2MB pages gives 0.6% to 0.9% more throughput than not
> using bpf_prog_pack. Note that 0.5% is a huge improvement for our fleet. I
> believe this is also significant for other companies with many thousands
> of servers.
>
> Update: Further experiments (suggested by Rick Edgecombe) showed that most
> of the benefit on the web service benchmark came from less direct map
> fragmentation. The experiment is as follows:
>
> Side A: 2MB bpf prog pack on a single 2MB page;
> Side B: 2MB bpf prog pack on 512x 4kB pages;
>
> The system only uses about 200kB for BPF programs, but 2MB is allocated
> for bpf_prog_pack (for both A and B). Therefore, direct map fragmentation
> caused by BPF programs is eliminated, and we are only measuring the
> performance difference of 1x 2MB page vs. ~50 4kB pages (we only use
> about 50 out of the 512 pages). For these two sides, the difference in
> system throughput is within the noise. I also measured iTLB-load-misses
> caused by BPF programs, which is ~300/s for case A, and ~1600/s for case B.
> The overall iTLB-load-misses rate is about 1.5M/s on these hosts. Therefore,
> we can clearly see that the 2MB page reduces iTLB misses, but the difference
> is not enough to have a visible impact on system throughput.
>
> Of course, the impact of iTLB misses will be more significant for systems
> with more BPF programs loaded.
>
> [1] https://lore.kernel.org/bpf/20220520235758.1858153-1-song@kernel.org/
>
> Song Liu (5):
>   module: introduce module_alloc_huge
>   bpf: use module_alloc_huge for bpf_prog_pack
>   vmalloc: WARN for set_vm_flush_reset_perms() on huge pages
>   vmalloc: introduce huge_vmalloc_supported
>   bpf: simplify select_bpf_prog_pack_size
>
>  arch/x86/kernel/module.c     | 21 +++++++++++++++++++++
>  include/linux/moduleloader.h |  5 +++++
>  include/linux/vmalloc.h      |  7 +++++++
>  kernel/bpf/core.c            | 25 ++++++++++---------------
>  kernel/module/main.c         |  8 ++++++++
>  mm/vmalloc.c                 |  5 +++++
>  6 files changed, 56 insertions(+), 15 deletions(-)
>
> --
> 2.30.2
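
The page counts and iTLB figures in the cover letter can be checked with a little arithmetic. The sketch below is plain userspace C, not kernel code; the macro and function names are hypothetical and exist only for illustration, and it assumes kB and MB are meant as powers of two. The miss rates are the approximate values reported in the experiment.

```c
#include <stdio.h>

/* Sizes taken from the cover letter, in bytes (assumed powers of two). */
#define PACK_SIZE  (2UL * 1024 * 1024) /* 2MB allocated for bpf_prog_pack */
#define SMALL_PAGE (4UL * 1024)        /* 4kB page */
#define BPF_IN_USE (200UL * 1024)      /* about 200kB of BPF programs */

/* Number of pages of size `page` needed to hold `bytes`, rounded up. */
static unsigned long pages_needed(unsigned long bytes, unsigned long page)
{
	return (bytes + page - 1) / page;
}

/* BPF-caused iTLB load misses as a percentage of all iTLB load misses. */
static double miss_share_pct(double bpf_misses, double total_misses)
{
	return 100.0 * bpf_misses / total_misses;
}

/* Print the derived figures for side A (1x 2MB page) and side B (4kB pages). */
static void print_report(FILE *out)
{
	fprintf(out, "4kB pages per 2MB pack:   %lu\n",
		pages_needed(PACK_SIZE, SMALL_PAGE));
	fprintf(out, "4kB pages used by 200kB:  %lu\n",
		pages_needed(BPF_IN_USE, SMALL_PAGE));
	fprintf(out, "side A share of misses:   %.3f%%\n",
		miss_share_pct(300.0, 1.5e6));
	fprintf(out, "side B share of misses:   %.3f%%\n",
		miss_share_pct(1600.0, 1.5e6));
}
```

Calling `print_report(stdout)` from a `main()` gives 512 pages per pack and 50 pages in use, matching the numbers quoted above. BPF programs account for roughly 0.02% of all iTLB load misses on side A and roughly 0.11% on side B, which is consistent with the observation that the throughput difference stays within the noise.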