[2/2] perf pmu: Improve CPU core PMU HW event list ordering

Message ID 1592214046-32385-3-git-send-email-john.garry@huawei.com (mailing list archive)
State New, archived
Series perf: Improve list for arm64

Commit Message

John Garry June 15, 2020, 9:40 a.m. UTC
For perf list, the CPU core PMU HW event ordering is such that not all
events may be listed adjacently - consider this example:

$ tools/perf/perf list

List of pre-defined events (to be used in -e):

  duration_time                                      [Tool event]

  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]
  bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
  cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
  cache-references OR cpu/cache-references/          [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
  cstate_core/c3-residency/                          [Kernel PMU event]
  cstate_core/c6-residency/                          [Kernel PMU event]
  cstate_core/c7-residency/                          [Kernel PMU event]
  cstate_pkg/c2-residency/                           [Kernel PMU event]
  cstate_pkg/c3-residency/                           [Kernel PMU event]
  cstate_pkg/c6-residency/                           [Kernel PMU event]
  cstate_pkg/c7-residency/                           [Kernel PMU event]
  cycles-ct OR cpu/cycles-ct/                        [Kernel PMU event]
  cycles-t OR cpu/cycles-t/                          [Kernel PMU event]
  el-abort OR cpu/el-abort/                          [Kernel PMU event]
  el-capacity OR cpu/el-capacity/                    [Kernel PMU event]

Notice in the above example how the cstate_core PMU events are mixed in
the middle of the CPU core events.

For my arm64 platform, all the uncore events get mixed in, making the list
very disorganised:

  page-faults OR faults                              [Software event]
  task-clock                                         [Software event]
  duration_time                                      [Tool event]
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]
  br_mis_pred OR armv8_pmuv3_0/br_mis_pred/          [Kernel PMU event]
  br_mis_pred_retired OR armv8_pmuv3_0/br_mis_pred_retired/ [Kernel PMU event]
  br_pred OR armv8_pmuv3_0/br_pred/                  [Kernel PMU event]
  br_retired OR armv8_pmuv3_0/br_retired/            [Kernel PMU event]
  br_return_retired OR armv8_pmuv3_0/br_return_retired/ [Kernel PMU event]
  bus_access OR armv8_pmuv3_0/bus_access/            [Kernel PMU event]
  bus_cycles OR armv8_pmuv3_0/bus_cycles/            [Kernel PMU event]
  cid_write_retired OR armv8_pmuv3_0/cid_write_retired/ [Kernel PMU event]
  cpu_cycles OR armv8_pmuv3_0/cpu_cycles/            [Kernel PMU event]
  dtlb_walk OR armv8_pmuv3_0/dtlb_walk/              [Kernel PMU event]
  exc_return OR armv8_pmuv3_0/exc_return/            [Kernel PMU event]
  exc_taken OR armv8_pmuv3_0/exc_taken/              [Kernel PMU event]
  hisi_sccl1_ddrc0/act_cmd/                          [Kernel PMU event]
  hisi_sccl1_ddrc0/flux_rcmd/                        [Kernel PMU event]
  hisi_sccl1_ddrc0/flux_rd/                          [Kernel PMU event]
  hisi_sccl1_ddrc0/flux_wcmd/                        [Kernel PMU event]
  hisi_sccl1_ddrc0/flux_wr/                          [Kernel PMU event]
  hisi_sccl1_ddrc0/pre_cmd/                          [Kernel PMU event]
  hisi_sccl1_ddrc0/rnk_chg/                          [Kernel PMU event]

...

  hisi_sccl7_l3c21/wr_hit_cpipe/                     [Kernel PMU event]
  hisi_sccl7_l3c21/wr_hit_spipe/                     [Kernel PMU event]
  hisi_sccl7_l3c21/wr_spipe/                         [Kernel PMU event]
  inst_retired OR armv8_pmuv3_0/inst_retired/        [Kernel PMU event]
  inst_spec OR armv8_pmuv3_0/inst_spec/              [Kernel PMU event]
  itlb_walk OR armv8_pmuv3_0/itlb_walk/              [Kernel PMU event]
  l1d_cache OR armv8_pmuv3_0/l1d_cache/              [Kernel PMU event]
  l1d_cache_refill OR armv8_pmuv3_0/l1d_cache_refill/ [Kernel PMU event]
  l1d_cache_wb OR armv8_pmuv3_0/l1d_cache_wb/        [Kernel PMU event]
  l1d_tlb OR armv8_pmuv3_0/l1d_tlb/                  [Kernel PMU event]
  l1d_tlb_refill OR armv8_pmuv3_0/l1d_tlb_refill/    [Kernel PMU event]

So the events are listed alphabetically. However, CPU core event listing has
been special since commit dc098b35b56f ("perf list: List kernel supplied event
aliases"), in that the alias and the full event name are both shown (in that
order). As such, the core events are sorted by alias and may end up scattered
among the events of other PMUs.

Improve this by grouping the CPU core events and ensure that they are
listed first for kernel PMU events. For the first example, above, this
now looks like:

  duration_time                                      [Tool event]

  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]
  bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
  cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
  cache-references OR cpu/cache-references/          [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
  cycles-ct OR cpu/cycles-ct/                        [Kernel PMU event]
  cycles-t OR cpu/cycles-t/                          [Kernel PMU event]
  el-abort OR cpu/el-abort/                          [Kernel PMU event]
  el-capacity OR cpu/el-capacity/                    [Kernel PMU event]
  el-commit OR cpu/el-commit/                        [Kernel PMU event]
  el-conflict OR cpu/el-conflict/                    [Kernel PMU event]
  el-start OR cpu/el-start/                          [Kernel PMU event]
  instructions OR cpu/instructions/                  [Kernel PMU event]
  mem-loads OR cpu/mem-loads/                        [Kernel PMU event]
  mem-stores OR cpu/mem-stores/                      [Kernel PMU event]
  ref-cycles OR cpu/ref-cycles/                      [Kernel PMU event]
  topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
  topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
  topdown-slots-issued OR cpu/topdown-slots-issued/  [Kernel PMU event]
  topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
  topdown-total-slots OR cpu/topdown-total-slots/    [Kernel PMU event]
  tx-abort OR cpu/tx-abort/                          [Kernel PMU event]
  tx-capacity OR cpu/tx-capacity/                    [Kernel PMU event]
  tx-commit OR cpu/tx-commit/                        [Kernel PMU event]
  tx-conflict OR cpu/tx-conflict/                    [Kernel PMU event]
  tx-start OR cpu/tx-start/                          [Kernel PMU event]
  cstate_core/c3-residency/                          [Kernel PMU event]
  cstate_core/c6-residency/                          [Kernel PMU event]
  cstate_core/c7-residency/                          [Kernel PMU event]
  cstate_pkg/c2-residency/                           [Kernel PMU event]
  cstate_pkg/c3-residency/                           [Kernel PMU event]
  cstate_pkg/c6-residency/                           [Kernel PMU event]
  cstate_pkg/c7-residency/                           [Kernel PMU event]
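
The grouping described above can be sketched in isolation as follows. This is
only an illustration under stated assumptions: struct event and sort_events()
are hypothetical, cut-down stand-ins (not perf's struct sevent or its sort
call), and the topic/description comparisons that cmp_sevent() performs before
reaching the name are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for perf's struct sevent. */
struct event {
	const char *name;
	int is_cpu;	/* 1 for a CPU core PMU event, 0 otherwise */
};

/* CPU core PMU events sort first; within a group, order by name. */
static int cmp_event(const void *a, const void *b)
{
	const struct event *as = a;
	const struct event *bs = b;

	if (as->is_cpu && !bs->is_cpu)
		return -1;
	if (!as->is_cpu && bs->is_cpu)
		return 1;

	return strcmp(as->name, bs->name);
}

/* Sort n events in place using the comparator above. */
static void sort_events(struct event *ev, size_t n)
{
	assert(ev != NULL || n == 0);
	qsort(ev, n, sizeof(*ev), cmp_event);
}
```

Given cpu-cycles, cstate_core/c3-residency/, branch-misses and cycles-t, with
only the cstate_core event having is_cpu == 0, sort_events() yields
branch-misses, cpu-cycles, cycles-t and then cstate_core/c3-residency/.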

Signed-off-by: John Garry <john.garry@huawei.com>
---
 tools/perf/util/pmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Namhyung Kim June 15, 2020, 2:36 p.m. UTC | #1
On Mon, Jun 15, 2020 at 6:45 PM John Garry <john.garry@huawei.com> wrote:
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index a375364537cd..e9a4a6961263 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -1400,6 +1400,7 @@ struct sevent {
>         char *pmu;
>         char *metric_expr;
>         char *metric_name;
> +       int is_cpu;
>  };
>
>  static int cmp_sevent(const void *a, const void *b)
> @@ -1416,6 +1417,12 @@ static int cmp_sevent(const void *a, const void *b)
>                 if (n)
>                         return n;
>         }
> +
> +       if (as->is_cpu && !bs->is_cpu)
> +               return -1;
> +       else if (!as->is_cpu && bs->is_cpu)
> +               return 1;
> +

This can be:

        if (as->is_cpu != bs->is_cpu)
                return bs->is_cpu - as->is_cpu;

Thanks
Namhyung


>         return strcmp(as->name, bs->name);
>  }
>
> @@ -1507,6 +1514,7 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
>                         aliases[j].pmu = pmu->name;
>                         aliases[j].metric_expr = alias->metric_expr;
>                         aliases[j].metric_name = alias->metric_name;
> +                       aliases[j].is_cpu = is_cpu;
>                         j++;
>                 }
>                 if (pmu->selectable &&
> --
> 2.26.2
>
John Garry June 15, 2020, 2:51 p.m. UTC | #2
On 15/06/2020 15:36, Namhyung Kim wrote:
>>   static int cmp_sevent(const void *a, const void *b)
>> @@ -1416,6 +1417,12 @@ static int cmp_sevent(const void *a, const void *b)
>>                  if (n)
>>                          return n;
>>          }
>> +
>> +       if (as->is_cpu && !bs->is_cpu)
>> +               return -1;
>> +       else if (!as->is_cpu && bs->is_cpu)
>> +               return 1;
>> +
> This can be:
> 
>          if (as->is_cpu != bs->is_cpu)
>                  return bs->is_cpu - as->is_cpu;
> 

That's more concise, but maybe not as readable at a glance. I don't mind 
though, so can change.

Thanks,
John
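
For reference, a minimal sketch comparing the two forms discussed above,
assuming is_cpu only ever holds 0 or 1. The helper names order_explicit(),
order_compact() and check_forms_agree() are made up for this illustration and
do not exist in perf; a return of 0 stands for "same group, fall through to
strcmp()".

```c
#include <assert.h>

/* Explicit form, as posted in the patch. */
static int order_explicit(int a_is_cpu, int b_is_cpu)
{
	if (a_is_cpu && !b_is_cpu)
		return -1;
	else if (!a_is_cpu && b_is_cpu)
		return 1;

	return 0;	/* same group: caller falls through to strcmp() */
}

/* Compact form from the review; relies on the flag being 0 or 1. */
static int order_compact(int a_is_cpu, int b_is_cpu)
{
	if (a_is_cpu != b_is_cpu)
		return b_is_cpu - a_is_cpu;

	return 0;	/* same group: caller falls through to strcmp() */
}

/* Check that both forms agree for every 0/1 combination. */
static void check_forms_agree(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			assert(order_explicit(a, b) == order_compact(a, b));
}
```

The two forms would only diverge if the flag could hold two different non-zero
values, in which case the explicit form treats both events as CPU core events
while the compact form returns their difference.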

Patch

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index a375364537cd..e9a4a6961263 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1400,6 +1400,7 @@  struct sevent {
 	char *pmu;
 	char *metric_expr;
 	char *metric_name;
+	int is_cpu;
 };
 
 static int cmp_sevent(const void *a, const void *b)
@@ -1416,6 +1417,12 @@  static int cmp_sevent(const void *a, const void *b)
 		if (n)
 			return n;
 	}
+
+	if (as->is_cpu && !bs->is_cpu)
+		return -1;
+	else if (!as->is_cpu && bs->is_cpu)
+		return 1;
+
 	return strcmp(as->name, bs->name);
 }
 
@@ -1507,6 +1514,7 @@  void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
 			aliases[j].pmu = pmu->name;
 			aliases[j].metric_expr = alias->metric_expr;
 			aliases[j].metric_name = alias->metric_name;
+			aliases[j].is_cpu = is_cpu;
 			j++;
 		}
 		if (pmu->selectable &&