Message ID | 20210907202802.3675104-1-songliubraving@fb.com (mailing list archive) |
---|---|
Series | bpf: introduce bpf_get_branch_snapshot |
> On Sep 7, 2021, at 1:27 PM, Song Liu <songliubraving@fb.com> wrote:

Forgot to add changes:

Changes v5 => v6
1. Add local_irq_save/restore to intel_pmu_snapshot_branch_stack. (Peter)
2. Remove buf and size check in bpf_get_branch_snapshot, move the flags
   check to later in the function. (Peter, Andrii)
3. Revise comments for bpf_get_branch_snapshot in bpf.h (Andrii)

> Changes v4 => v5
> 1. Modify perf_snapshot_branch_stack_t to save some memcpy. (Andrii)
> 2. Minor fixes in selftests. (Andrii)
>
> Changes v3 => v4:
> 1. Do not reshuffle intel_pmu_disable_all(). Use some inline to save LBR
>    entries. (Peter)
> 2. Move static_call(perf_snapshot_branch_stack) to the helper. (Alexei)
> 3. Add argument flags to bpf_get_branch_snapshot. (Andrii)
> 4. Make MAX_BRANCH_SNAPSHOT an enum (Andrii). And rename it as
>    PERF_MAX_BRANCH_SNAPSHOT
> 5. Make bpf_get_branch_snapshot similar to bpf_read_branch_records.
>    (Andrii)
> 6. Move the test target function to bpf_testmod. Updated kallsyms_find_next
>    to work properly with modules. (Andrii)
>
> Changes v2 => v3:
> 1. Fix the use of static_call. (Peter)
> 2. Limit the use to perfmon version >= 2. (Peter)
> 3. Modify intel_pmu_snapshot_branch_stack() to use intel_pmu_disable_all
>    and intel_pmu_enable_all().
>
> Changes v1 => v2:
> 1. Rename the helper as bpf_get_branch_snapshot;
> 2. Fix/simplify the use of static_call;
> 3. Instead of percpu variables, let intel_pmu_snapshot_branch_stack output
>    branch records to an output argument of type perf_branch_snapshot.
>
> Branch stack can be very useful in understanding software events. For
> example, when a long function, e.g. sys_perf_event_open, returns an errno,
> it is not obvious why the function failed. Branch stack could provide very
> helpful information in this type of scenario.
>
> This set adds support to read branch stack with a new BPF helper
> bpf_get_branch_snapshot(). Currently, this is only supported on Intel
> systems. It is also possible to support the same feature for PowerPC.
>
> The hardware that records the branch stack is not stopped automatically on
> software events. Therefore, it is necessary to stop it in software soon.
> Otherwise, the hardware buffers/registers will be flushed. One of the key
> design considerations in this set is to minimize the number of branch record
> entries between when the event triggers and when the hardware recorder is
> stopped. Based on this goal, current design is different from the
> discussions in original RFC [1]:
> 1) Static call is used when supported, to save function pointer
>    dereference;
> 2) intel_pmu_lbr_disable_all is used instead of perf_pmu_disable(),
>    because the latter uses about 10 entries before stopping LBR.
>
> With current code, on Intel CPU, LBR is stopped after 10 branch entries
> after fexit triggers:
>
> ID: 0 from intel_pmu_lbr_disable_all+58 to intel_pmu_lbr_disable_all+93
> ID: 1 from intel_pmu_lbr_disable_all+54 to intel_pmu_lbr_disable_all+58
> ID: 2 from intel_pmu_snapshot_branch_stack+102 to intel_pmu_lbr_disable_all+0
> ID: 3 from bpf_get_branch_snapshot+18 to intel_pmu_snapshot_branch_stack+0
> ID: 4 from bpf_get_branch_snapshot+18 to bpf_get_branch_snapshot+0
> ID: 5 from __brk_limit+474918983 to bpf_get_branch_snapshot+0
> ID: 6 from __bpf_prog_enter+34 to __brk_limit+474918971
> ID: 7 from migrate_disable+60 to __bpf_prog_enter+9
> ID: 8 from __bpf_prog_enter+4 to migrate_disable+0
> ID: 9 from bpf_testmod_loop_test+20 to __bpf_prog_enter+0
> ID: 10 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
> ID: 11 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
> ID: 12 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
> ID: 13 from bpf_testmod_loop_test+20 to bpf_testmod_loop_test+13
> ...
>
> [1] https://lore.kernel.org/bpf/20210818012937.2522409-1-songliubraving@fb.com/
>
> Song Liu (3):
>   perf: enable branch record for software events
>   bpf: introduce helper bpf_get_branch_snapshot
>   selftests/bpf: add test for bpf_get_branch_snapshot
>
>  arch/x86/events/intel/core.c                  |  29 ++++-
>  arch/x86/events/intel/ds.c                    |   8 --
>  arch/x86/events/perf_event.h                  |  10 +-
>  include/linux/perf_event.h                    |  23 ++++
>  include/uapi/linux/bpf.h                      |  22 ++++
>  kernel/bpf/trampoline.c                       |   3 +-
>  kernel/events/core.c                          |   2 +
>  kernel/trace/bpf_trace.c                      |  30 ++++++
>  tools/include/uapi/linux/bpf.h                |  22 ++++
>  .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  19 +++-
>  .../selftests/bpf/prog_tests/core_reloc.c     |  14 +--
>  .../bpf/prog_tests/get_branch_snapshot.c      | 100 ++++++++++++++++++
>  .../selftests/bpf/prog_tests/module_attach.c  |  39 -------
>  .../selftests/bpf/progs/get_branch_snapshot.c |  40 +++++++
>  tools/testing/selftests/bpf/test_progs.c      |  39 +++++++
>  tools/testing/selftests/bpf/test_progs.h      |   2 +
>  tools/testing/selftests/bpf/trace_helpers.c   |  37 +++++++
>  tools/testing/selftests/bpf/trace_helpers.h   |   5 +
>  18 files changed, 378 insertions(+), 66 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/get_branch_snapshot.c
>  create mode 100644 tools/testing/selftests/bpf/progs/get_branch_snapshot.c
>
> --
> 2.30.2
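
The "stopped after 10 branch entries" figure quoted in the cover letter can be read off the listing mechanically. The following is only a user-space sketch of that counting, not code from the series: struct resolved_branch, count_wasted_entries() and sample_trace are hypothetical names, and the sketch assumes each address has already been resolved to the symbol that contains it (the offsets from the listing are dropped). Entry 0 is the newest record, so every entry before the first branch that stays inside the traced function was consumed by the tracing path itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one branch record after symbol lookup. */
struct resolved_branch {
	const char *from_sym;	/* symbol containing the branch source */
	const char *to_sym;	/* symbol containing the branch target */
};

/*
 * Count the records "wasted" between the event and the point where LBR
 * stops. Walk from the newest entry (index 0) towards older ones and stop
 * at the first branch whose source and target are both inside the traced
 * function. Returns n if no such entry exists.
 */
static size_t count_wasted_entries(const struct resolved_branch *entries,
				   size_t n, const char *target)
{
	assert(entries != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(entries[i].from_sym, target) == 0 &&
		    strcmp(entries[i].to_sym, target) == 0)
			return i;
	}
	return n;
}

/* Entries 0..13 from the listing in the cover letter, offsets omitted. */
static const struct resolved_branch sample_trace[] = {
	{ "intel_pmu_lbr_disable_all",       "intel_pmu_lbr_disable_all" },
	{ "intel_pmu_lbr_disable_all",       "intel_pmu_lbr_disable_all" },
	{ "intel_pmu_snapshot_branch_stack", "intel_pmu_lbr_disable_all" },
	{ "bpf_get_branch_snapshot",         "intel_pmu_snapshot_branch_stack" },
	{ "bpf_get_branch_snapshot",         "bpf_get_branch_snapshot" },
	{ "__brk_limit",                     "bpf_get_branch_snapshot" },
	{ "__bpf_prog_enter",                "__brk_limit" },
	{ "migrate_disable",                 "__bpf_prog_enter" },
	{ "__bpf_prog_enter",                "migrate_disable" },
	{ "bpf_testmod_loop_test",           "__bpf_prog_enter" },
	{ "bpf_testmod_loop_test",           "bpf_testmod_loop_test" },
	{ "bpf_testmod_loop_test",           "bpf_testmod_loop_test" },
	{ "bpf_testmod_loop_test",           "bpf_testmod_loop_test" },
	{ "bpf_testmod_loop_test",           "bpf_testmod_loop_test" },
};

#define SAMPLE_TRACE_LEN (sizeof(sample_trace) / sizeof(sample_trace[0]))
```

Calling count_wasted_entries(sample_trace, SAMPLE_TRACE_LEN, "bpf_testmod_loop_test") on this data yields 10, matching the number stated in the cover letter. Entries 7 and 8 in the listing are the call into and return from migrate_disable().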
On Tue, Sep 7, 2021 at 1:31 PM Song Liu <songliubraving@fb.com> wrote:
>
> > On Sep 7, 2021, at 1:27 PM, Song Liu <songliubraving@fb.com> wrote:
>
> Forgot to add changes:
>
> Changes v5 => v6
> 1. Add local_irq_save/restore to intel_pmu_snapshot_branch_stack.
>    (Peter)
> 2. Remove buf and size check in bpf_get_branch_snapshot, move the flags
>    check to later in the function. (Peter, Andrii)
> 3. Revise comments for bpf_get_branch_snapshot in bpf.h (Andrii)
>

Looks great, thanks! Looking forward to being able to use it. Please
consider following up with migrate_disable() inlining as well.

For the series:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

[...]
Hi Peter,

Do you have further comments/concerns on v6? If not, could you please reply
with your Reviewed-by or Acked-by?

Thanks,
Song

> On Sep 7, 2021, at 1:29 PM, Song Liu <songliubraving@fb.com> wrote:
>
>> On Sep 7, 2021, at 1:27 PM, Song Liu <songliubraving@fb.com> wrote:

[...]