[v1,0/4] perf arm-spe: Allow synthesizing of branch

Message ID 20241025143009.25419-1-graham.woodward@arm.com (mailing list archive)

Message

Graham Woodward Oct. 25, 2024, 2:30 p.m. UTC
Currently the --itrace=b will only show branch-misses but this change
allows perf to synthesize branches as well.

The change also incorporates the ability to display the target
addresses when specifying the addr field if the instruction is a branch.

Graham Woodward (4):
  perf arm-spe: Set sample.addr to target address for instruction sample
  perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
  perf arm-spe: Correctly set sample flags
  perf arm-spe: Update --itrace help text

 tools/perf/Documentation/itrace.txt       |  2 +-
 tools/perf/Documentation/perf-arm-spe.txt |  2 +-
 tools/perf/builtin-script.c               |  1 +
 tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
 tools/perf/util/auxtrace.h                |  3 +--
 tools/perf/util/event.h                   |  1 +
 6 files changed, 29 insertions(+), 11 deletions(-)

Comments

Leo Yan Oct. 25, 2024, 5:33 p.m. UTC | #1
On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> 
> Currently the --itrace=b will only show branch-misses but this change
> allows perf to synthesize branches as well.
> 
> The change also incorporates the ability to display the target
> addresses when specifying the addr field if the instruction is a branch.

Tested for this series:

  # perf record -e arm_spe_0/branch_filter=1,load_filter=1/u \
      -- ./false_sharing.exe 1

  # perf script --itrace=i10ib  -F,+addr,+flags
    false_sharing.e  880532 [005] 1852579.389533:          1                                    branch:   jmp                       ffff91beb224     ffff91beb220 __GI___tunables_init+0x40 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389538:          1                                    branch:   jmp                       ffff91bec318     ffff91bec314 _dl_next_ld_env_entry+0x24 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389620:          1                                    branch:   jmp                       ffff91be0f14     ffff91be0f10 _dl_new_object+0x168 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389802:          1                                    branch:   jmp                       ffff91be2cf0     ffff91be2cec _dl_map_object_deps+0x3f4 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389802:         10                              instructions:   jmp                       ffff91be2cf0     ffff91be2cec _dl_map_object_deps+0x3f4 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389824:          1                                    branch:   br miss                   ffff91bee4e4     ffff91bee4e0 strcmp+0xa0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389849:          1                                    branch:   jmp                       ffff91be1868     ffff91be1880 _dl_relocate_object+0x4a8 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389858:          1                                    branch:   jmp                       ffff91be1868     ffff91be1880 _dl_relocate_object+0x4a8 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389861:          1                                    branch:   jmp                       ffff91be1c20     ffff91be1bcc _dl_relocate_object+0x7f4 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389875:         10                              instructions:                                        0     ffff91bdfe38 _dl_lookup_symbol_x+0x58 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389876:          1                                    branch:   jmp                       ffff91bdf3a8     ffff91bdf434 do_lookup_x+0x114 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389879:          1                                    branch:   jmp                       ffff91be18ec     ffff91be18e8 _dl_relocate_object+0x510 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389886:          1                                    branch:   jmp                       ffff91bee440     ffff91bdf2dc check_match+0x154 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389890:          1                                    branch:   jmp                       ffff91bdfed4     ffff91bdfed0 _dl_lookup_symbol_x+0xf0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389893:         10                              instructions:                                        0     ffff91be1974 _dl_relocate_object+0x59c (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389894:          1                                    branch:   jmp                       ffff91bdf3f4     ffff91bdf3f0 do_lookup_x+0xd0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389906:          1                                    branch:   jmp                       ffff91bdfea4     ffff91bdfe90 _dl_lookup_symbol_x+0xb0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
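
A quick way to check the mix of synthesized samples in output like the
above is to tally the flags column per event. The following is only a
minimal sketch: it assumes the column layout shown in this message (comm,
pid, [cpu], time, period, event, optional flags, addr, ip, symbol, dso) and
a comm without spaces. The names tally_flags and HEADER are hypothetical
helpers, not part of perf.

```python
import re
from collections import Counter

# Assumed layout: comm pid [cpu] time: period event: [flags] addr ip sym (dso)
HEADER = re.compile(
    r"^\s*\S+\s+\d+\s+\[\d+\]\s+[\d.]+:\s+\d+\s+([\w-]+):\s+(.*)$"
)


def tally_flags(lines):
    """Count samples per (event, flags) pair in 'perf script' text output."""
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match is None:
            continue  # not a sample line in the assumed layout
        event, rest = match.groups()
        fields = rest.split()
        if len(fields) < 4:
            continue  # too short to hold addr, ip, symbol and dso
        # The last four fields are addr, ip, symbol+offset and (dso);
        # whatever precedes them is the flags string, which may be empty.
        flags = " ".join(fields[:-4]) or "(none)"
        counts[(event, flags)] += 1
    return counts


# Three lines copied from the output above (column padding shortened).
sample = [
    "false_sharing.e  880532 [005] 1852579.389533:  1  branch:   jmp   "
    "ffff91beb224  ffff91beb220 __GI___tunables_init+0x40 "
    "(/usr/lib/aarch64-linux-gnu/ld-2.31.so)",
    "false_sharing.e  880532 [005] 1852579.389824:  1  branch:   br miss   "
    "ffff91bee4e4  ffff91bee4e0 strcmp+0xa0 "
    "(/usr/lib/aarch64-linux-gnu/ld-2.31.so)",
    "false_sharing.e  880532 [005] 1852579.389875:  10  instructions:   "
    "0  ffff91bdfe38 _dl_lookup_symbol_x+0x58 "
    "(/usr/lib/aarch64-linux-gnu/ld-2.31.so)",
]

for (event, flags), count in sorted(tally_flags(sample).items()):
    print(f"{event:14s} {flags:10s} {count}")
```

To use it on a full trace, the same function can be fed the lines of the
perf script output instead of the short sample list.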

  # perf test "Check Arm SPE"
  114: Check Arm SPE trace data recording and synthesized samples      : Ok
  115: Check Arm SPE doesn't hang when there are forks                 : Ok

Tested-by: Leo Yan <leo.yan@arm.com>

James Clark Oct. 28, 2024, 8:34 a.m. UTC | #2
On 25/10/2024 3:30 pm, Graham Woodward wrote:
> Currently the --itrace=b will only show branch-misses but this change
> allows perf to synthesize branches as well.
> 
> The change also incorporates the ability to display the target
> addresses when specifying the addr field if the instruction is a branch.
> 
> Graham Woodward (4):
>    perf arm-spe: Set sample.addr to target address for instruction sample
>    perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>    perf arm-spe: Correctly set sample flags
>    perf arm-spe: Update --itrace help text
> 
>   tools/perf/Documentation/itrace.txt       |  2 +-
>   tools/perf/Documentation/perf-arm-spe.txt |  2 +-
>   tools/perf/builtin-script.c               |  1 +
>   tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
>   tools/perf/util/auxtrace.h                |  3 +--
>   tools/perf/util/event.h                   |  1 +
>   6 files changed, 29 insertions(+), 11 deletions(-)
> 

Don't forget to pick up the review tags from the previous versions. If
you use the b4 tool, it does this automatically:

Reviewed-by: James Clark <james.clark@linaro.org>

Namhyung Kim Oct. 28, 2024, 4:40 p.m. UTC | #3
Hello,

On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> Currently the --itrace=b will only show branch-misses but this change
> allows perf to synthesize branches as well.
> 
> The change also incorporates the ability to display the target
> addresses when specifying the addr field if the instruction is a branch.
> 
> Graham Woodward (4):
>   perf arm-spe: Set sample.addr to target address for instruction sample
>   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>   perf arm-spe: Correctly set sample flags
>   perf arm-spe: Update --itrace help text

It doesn't apply to perf-tools-next cleanly.  Can you please rebase?

Thanks,
Namhyung

Leo Yan Oct. 29, 2024, 5:03 p.m. UTC | #4
Hi Namhyung,

On Mon, Oct 28, 2024 at 09:40:21AM -0700, Namhyung Kim wrote:
> 
> Hello,
> 
> On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> > Currently the --itrace=b will only show branch-misses but this change
> > allows perf to synthesize branches as well.
> >
> > The change also incorporates the ability to display the target
> > addresses when specifying the addr field if the instruction is a branch.
> >
> > Graham Woodward (4):
> >   perf arm-spe: Set sample.addr to target address for instruction sample
> >   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
> >   perf arm-spe: Correctly set sample flags
> >   perf arm-spe: Update --itrace help text
> 
> It doesn't apply to perf-tools-next cleanly.  Can you please rebase?

I confirmed that this series applies cleanly on branch [1], whose latest
commit is 150dab31d560 ("perf disasm: Fix not cleaning up disasm_line in
symbol__disassemble_raw()"):

  [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
      branch: perf-tools-next

If you mean the branch:

  [2] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git
      branch: perf-tools

then it is missing some Arm SPE patches that have already been picked up
in repo [1].

Please advise on the right thing to do.

Thanks,
Leo

Namhyung Kim Oct. 29, 2024, 11:09 p.m. UTC | #5
Hi Leo,

On Tue, Oct 29, 2024 at 05:03:46PM +0000, Leo Yan wrote:
> Hi Namhyung,
> 
> On Mon, Oct 28, 2024 at 09:40:21AM -0700, Namhyung Kim wrote:
> > 
> > Hello,
> > 
> > On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> > > Currently the --itrace=b will only show branch-misses but this change
> > > allows perf to synthesize branches as well.
> > >
> > > The change also incorporates the ability to display the target
> > > addresses when specifying the addr field if the instruction is a branch.
> > >
> > > Graham Woodward (4):
> > >   perf arm-spe: Set sample.addr to target address for instruction sample
> > >   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
> > >   perf arm-spe: Correctly set sample flags
> > >   perf arm-spe: Update --itrace help text
> > 
> > It doesn't apply to perf-tools-next cleanly.  Can you please rebase?
> 
> I confirmed that this series applies cleanly on branch [1], whose latest
> commit is 150dab31d560 ("perf disasm: Fix not cleaning up disasm_line in
> symbol__disassemble_raw()"):
> 
>   [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
>       branch: perf-tools-next
> 
> If you mean the branch:
> 
>   [2] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git
>       branch: perf-tools
> 
> then it is missing some Arm SPE patches that have already been picked up
> in repo [1].
> 
> Please advise on the right thing to do.

Sorry, my bad.  It works OK.  I'll add it to the tmp.perf-tools-next branch
and run some tests.

Thanks,
Namhyung

Leo Yan Oct. 30, 2024, 9:33 a.m. UTC | #6
Hi Namhyung,

On Tue, Oct 29, 2024 at 04:09:46PM -0700, Namhyung Kim wrote:

[...]

> > Please advise on the right thing to do.
> 
> Sorry, my bad.  It works OK.  I'll add it to the tmp.perf-tools-next branch
> and run some tests.

Thanks for the confirmation!

Leo

Namhyung Kim Oct. 30, 2024, 9:30 p.m. UTC | #7
On Fri, 25 Oct 2024 15:30:05 +0100, Graham Woodward wrote:

> Currently the --itrace=b will only show branch-misses but this change
> allows perf to synthesize branches as well.
> 
> The change also incorporates the ability to display the target
> addresses when specifying the addr field if the instruction is a branch.
> 
> Graham Woodward (4):
>   perf arm-spe: Set sample.addr to target address for instruction sample
>   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>   perf arm-spe: Correctly set sample flags
>   perf arm-spe: Update --itrace help text
> 
> [...]

Applied to perf-tools-next, thanks!

Best regards,
Namhyung