[dwarves,v3,0/8] pahole: faster reproducible BTF encoding

Message ID: 20241221012245.243845-1-ihor.solodrai@pm.me

Ihor Solodrai Dec. 21, 2024, 1:22 a.m. UTC
This is v3 of the patchset, which aims to speed up parallel
reproducible BTF encoding.

In comparison to v2:
  - removed patch v2 03 adding pre_load_module hook
  - removed patch v2 05 making use of the hook
    - since we will have a single btf_encoder, there is no need to
      collect ELF tables before encoders are created
  - removed patch v2 07 adding btf_encoder_context
  - patch v3 04 is a rewritten patch v2 06
    - each btf_encoder now maintains its own list of function
      tables per ELF
  - patch v3 07 is an updated patch v2 10
    - dwarf_loader multithreading is adjusted in an attempt to
      minimize blocking on locks
  - new patch v3 08 increases the cu->obstack chunk size
  - new patch v3 09 cleans up global list of encoders in btf_encoder.c

Testing:
  - ./tests/tests pass on vmlinux built from bpf-next
  - bpftool dump of reproducible BTF is identical to v1.28

Sample perf runs on a 6.9 kernel with a production-like config, on a
machine with nproc=176:

This patchset:

    Performance counter stats for '/home/isolodrai/dwarves/build/pahole -J -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --lang_exclude=rust --btf_encode_detached=/dev/null .tmp_vmlinux.btf' (13 runs):

         17,911.11 msec cpu-clock                        #    4.412 CPUs utilized               ( +-  0.46% )

            4.0600 +- 0.0116 seconds time elapsed  ( +-  0.29% )


pahole/next (v1.28):

    Performance counter stats for '/home/isolodrai/dwarves/build/pahole -J -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --lang_exclude=rust --btf_encode_detached=/dev/null .tmp_vmlinux.btf' (13 runs):

         82,289.12 msec cpu-clock                        #   17.427 CPUs utilized               ( +-  0.54% )

            4.7219 +- 0.0270 seconds time elapsed  ( +-  0.57% )
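
For a quick comparison of the two runs, the relative reduction can be
computed directly from the figures quoted above. The helper below is
only an illustrative sketch (the function name is made up); with the
numbers from this message it gives roughly 14% less elapsed time and
roughly 78% less cpu-clock time for this patchset relative to
pahole/next.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fractional reduction when going from `before` to `after`.
 * For example, 0.25 means `after` is 25% smaller than `before`.
 */
static double relative_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before;
}

/* Print the comparison using the mean values reported by perf stat above. */
static void print_perf_comparison(FILE *out)
{
	const double elapsed_next = 4.7219;   /* seconds, pahole/next (v1.28) */
	const double elapsed_v3   = 4.0600;   /* seconds, this patchset */
	const double cpu_next     = 82289.12; /* msec cpu-clock, pahole/next */
	const double cpu_v3       = 17911.11; /* msec cpu-clock, this patchset */

	fprintf(out, "elapsed time reduction: %.1f%%\n",
		100.0 * relative_reduction(elapsed_next, elapsed_v3));
	fprintf(out, "cpu-clock reduction:    %.1f%%\n",
		100.0 * relative_reduction(cpu_next, cpu_v3));
}
```

Calling print_perf_comparison(stdout) from a main() prints both
percentages. The calculation uses only the reported means and ignores
the run-to-run variance shown in the perf output.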


v2: https://lore.kernel.org/dwarves/20241213223641.564002-1-ihor.solodrai@pm.me/
v1 RFC: https://lore.kernel.org/dwarves/20241128012341.4081072-1-ihor.solodrai@pm.me/

Alan Maguire (2):
  btf_encoder: simplify function encoding
  btf_encoder: separate elf function, saved function representations

Ihor Solodrai (6):
  btf_encoder: introduce elf_functions struct type
  btf_encoder: introduce elf_functions_list
  btf_encoder: remove skip_encoding_inconsistent_proto
  dwarf_loader: introduce cu->id
  dwarf_loader: multithreading with a job/worker model
  btf_encoder: clean up global encoders list

 btf_encoder.c               | 643 +++++++++++++++++++-----------------
 btf_encoder.h               |   7 +-
 btf_loader.c                |   2 +-
 ctf_loader.c                |   2 +-
 dwarf_loader.c              | 335 +++++++++++++------
 dwarves.c                   |  44 ---
 dwarves.h                   |  21 +-
 pahole.c                    | 230 ++-----------
 pdwtags.c                   |   3 +-
 pfunct.c                    |   3 +-
 tests/reproducible_build.sh |   5 +-
 11 files changed, 605 insertions(+), 690 deletions(-)