diff mbox series

Issue of metrics for multiple uncore PMUs (was Re: [RFC PATCH v2 23/23] perf metricgroup: remove duped metric group events)

Message ID e3c4f253-e1ed-32f6-c252-e8657968fc42@huawei.com (mailing list archive)
State Not Applicable
Headers show
Series Issue of metrics for multiple uncore PMUs (was Re: [RFC PATCH v2 23/23] perf metricgroup: remove duped metric group events) | expand

Commit Message

John Garry Oct. 2, 2020, 11:57 a.m. UTC
On 07/05/2020 15:08, Ian Rogers wrote:

Hi Ian,

I was wondering if you ever tested commit 2440689d62e9 ("perf 
metricgroup: Remove duped metric group events") for when we have a 
metric which aliases multiple instances of the same uncore PMU in the 
system?

I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an 
issue where find_evsel_group() fails for the uncore metrics under the 
condition mentioned above.

Unfortunately I don't have an x86 machine to which this test applies. 
However, as an experiment, I added a test metric to my broadwell JSON:

]


And get this:

john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M 
test_metric_inc sleep 1
Using CPUID GenuineIntel-6-3D-4
metric expr unc_cbo_xsnp_response.miss_xcore * 
unc_cbo_xsnp_response.miss_eviction for test_metric_inc
found event unc_cbo_xsnp_response.miss_eviction
found event unc_cbo_xsnp_response.miss_xcore
adding 
{unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore * 
unc_cbo_xsnp_response.miss_eviction
task-clock: 688876 688876 688876
context-switches: 2 688876 688876
cpu-migrations: 0 688876 688876
page-faults: 69 688876 688876
cycles: 2101719 695690 695690
instructions: 1180534 695690 695690
branches: 249450 695690 695690
branch-misses: 10815 695690 695690

Performance counter stats for 'sleep 1':

              0.69 msec task-clock                #    0.001 CPUs 
utilized
                 2      context-switches          #    0.003 M/sec 

                 0      cpu-migrations            #    0.000 K/sec 

                69      page-faults               #    0.100 M/sec 

         2,101,719      cycles                    #    3.051 GHz 

         1,180,534      instructions              #    0.56  insn per 
cycle
           249,450      branches                  #  362.112 M/sec 

            10,815      branch-misses             #    4.34% of all 
branches

       1.001177693 seconds time elapsed

       0.001149000 seconds user
       0.000000000 seconds sys


john@localhost:~/linux/tools/perf>


Any idea what is going wrong here, before I have to dive in? The issue 
seems to be this named commit.

Thanks,
John

> A metric group contains multiple metrics. These metrics may use the same
> events. If metrics use separate events then it leads to more
> multiplexing and overall metric counts fail to sum to 100%.
> Modify how metrics are associated with events so that if the events in
> an earlier group satisfy the current metric, the same events are used.
> A record of used events is kept and at the end of processing unnecessary
> events are eliminated.
> 
> Before:

Comments

Ian Rogers Oct. 2, 2020, 8:46 p.m. UTC | #1
On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote:
>
> On 07/05/2020 15:08, Ian Rogers wrote:
>
> Hi Ian,
>
> I was wondering if you ever tested commit 2440689d62e9 ("perf
> metricgroup: Remove duped metric group events") for when we have a
> metric which aliases multiple instances of the same uncore PMU in the
> system?

Sorry for this, I hadn't tested such a metric and wasn't aware of how
the aliasing worked. I sent a fix for this issue here:
https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/
Could you see if this addresses the issue for you? I don't see the
change in Arnaldo's trees yet.

Thanks,
Ian

> I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an
> issue where find_evsel_group() fails for the uncore metrics under the
> condition mentioned above.
>
> Unfortunately I don't have an x86 machine to which this test applies.
> However, as an experiment, I added a test metric to my broadwell JSON:
>
> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> index 8cdc7c13dc2a..fc6d9adf996a 100644
> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> @@ -348,5 +348,11 @@
>          "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
>          "MetricGroup": "Power",
>          "MetricName": "C7_Pkg_Residency"
> +    },
> +    {
> +        "BriefDescription": "test metric",
> +        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE *
> UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
> +        "MetricGroup": "Test",
> +        "MetricName": "test_metric_inc"
>      }
> ]
>
>
> And get this:
>
> john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M
> test_metric_inc sleep 1
> Using CPUID GenuineIntel-6-3D-4
> metric expr unc_cbo_xsnp_response.miss_xcore *
> unc_cbo_xsnp_response.miss_eviction for test_metric_inc
> found event unc_cbo_xsnp_response.miss_eviction
> found event unc_cbo_xsnp_response.miss_xcore
> adding
> {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
> Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore *
> unc_cbo_xsnp_response.miss_eviction
> task-clock: 688876 688876 688876
> context-switches: 2 688876 688876
> cpu-migrations: 0 688876 688876
> page-faults: 69 688876 688876
> cycles: 2101719 695690 695690
> instructions: 1180534 695690 695690
> branches: 249450 695690 695690
> branch-misses: 10815 695690 695690
>
> Performance counter stats for 'sleep 1':
>
>               0.69 msec task-clock                #    0.001 CPUs
> utilized
>                  2      context-switches          #    0.003 M/sec
>
>                  0      cpu-migrations            #    0.000 K/sec
>
>                 69      page-faults               #    0.100 M/sec
>
>          2,101,719      cycles                    #    3.051 GHz
>
>          1,180,534      instructions              #    0.56  insn per
> cycle
>            249,450      branches                  #  362.112 M/sec
>
>             10,815      branch-misses             #    4.34% of all
> branches
>
>        1.001177693 seconds time elapsed
>
>        0.001149000 seconds user
>        0.000000000 seconds sys
>
>
> john@localhost:~/linux/tools/perf>
>
>
> Any idea what is going wrong here, before I have to dive in? The issue
> seems to be this named commit.
>
> Thanks,
> John
>
> > A metric group contains multiple metrics. These metrics may use the same
> > events. If metrics use separate events then it leads to more
> > multiplexing and overall metric counts fail to sum to 100%.
> > Modify how metrics are associated with events so that if the events in
> > an earlier group satisfy the current metric, the same events are used.
> > A record of used events is kept and at the end of processing unnecessary
> > events are eliminated.
> >
> > Before:
John Garry Oct. 5, 2020, 10:03 a.m. UTC | #2
On 02/10/2020 21:46, Ian Rogers wrote:
> On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote:
>>
>> On 07/05/2020 15:08, Ian Rogers wrote:
>>
>> Hi Ian,
>>
>> I was wondering if you ever tested commit 2440689d62e9 ("perf
>> metricgroup: Remove duped metric group events") for when we have a
>> metric which aliases multiple instances of the same uncore PMU in the
>> system?
> 
> Sorry for this, I hadn't tested such a metric and wasn't aware of how
> the aliasing worked. I sent a fix for this issue here:
> https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/
> Could you see if this addresses the issue for you? I don't see the
> change in Arnaldo's trees yet.

Unfortunately this does not seem to fix my issue.

So for that patch, you say you fix metric expression for DRAM_BW_Use, 
which is:

{
  "BriefDescription": "Average external Memory Bandwidth Use for reads 
and writes [GB / sec]",
  "MetricExpr": "( 64 * ( uncore_imc@cas_count_read@ + 
uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time",
  "MetricGroup": "Memory_BW",
"MetricName": "DRAM_BW_Use"
},

But this metric expression does not include any alias events; rather I 
think it is just cas_count_write + cas_count_read event count for PMU 
uncore_imc / duration_time.

When I say alias, I mean - as an example, we have event:

     {
         "BriefDescription": "write requests to memory controller. 
Derived from unc_m_cas_count.wr",
         "Counter": "0,1,2,3",
         "EventCode": "0x4",
         "EventName": "LLC_MISSES.MEM_WRITE",
         "PerPkg": "1",
         "ScaleUnit": "64Bytes",
         "UMask": "0xC",
         "Unit": "iMC"
     },

And then reference LLC_MISSES.MEM_WRITE in a metric expression:

"MetricExpr": "LLC_MISSES.MEM_WRITE / duration_time",

This is what seems to be broken for when the alias matches > 1 PMU.

Please check this.

Thanks,
John

> 
> Thanks,
> Ian
> 
>> I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an
>> issue where find_evsel_group() fails for the uncore metrics under the
>> condition mentioned above.
>>
>> Unfortunately I don't have an x86 machine to which this test applies.
>> However, as an experiment, I added a test metric to my broadwell JSON:
>>
>> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
>> b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
>> index 8cdc7c13dc2a..fc6d9adf996a 100644
>> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
>> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
>> @@ -348,5 +348,11 @@
>>           "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
>>           "MetricGroup": "Power",
>>           "MetricName": "C7_Pkg_Residency"
>> +    },
>> +    {
>> +        "BriefDescription": "test metric",
>> +        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE *
>> UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
>> +        "MetricGroup": "Test",
>> +        "MetricName": "test_metric_inc"
>>       }
>> ]
>>
>>
>> And get this:
>>
>> john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M
>> test_metric_inc sleep 1
>> Using CPUID GenuineIntel-6-3D-4
>> metric expr unc_cbo_xsnp_response.miss_xcore *
>> unc_cbo_xsnp_response.miss_eviction for test_metric_inc
>> found event unc_cbo_xsnp_response.miss_eviction
>> found event unc_cbo_xsnp_response.miss_xcore
>> adding
>> {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
>> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
>> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
>> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
>> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
>> Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore *
>> unc_cbo_xsnp_response.miss_eviction
>> task-clock: 688876 688876 688876
>> context-switches: 2 688876 688876
>> cpu-migrations: 0 688876 688876
>> page-faults: 69 688876 688876
>> cycles: 2101719 695690 695690
>> instructions: 1180534 695690 695690
>> branches: 249450 695690 695690
>> branch-misses: 10815 695690 695690
>>
>> Performance counter stats for 'sleep 1':
>>
>>                0.69 msec task-clock                #    0.001 CPUs
>> utilized
>>                   2      context-switches          #    0.003 M/sec
>>
>>                   0      cpu-migrations            #    0.000 K/sec
>>
>>                  69      page-faults               #    0.100 M/sec
>>
>>           2,101,719      cycles                    #    3.051 GHz
>>
>>           1,180,534      instructions              #    0.56  insn per
>> cycle
>>             249,450      branches                  #  362.112 M/sec
>>
>>              10,815      branch-misses             #    4.34% of all
>> branches
>>
>>         1.001177693 seconds time elapsed
>>
>>         0.001149000 seconds user
>>         0.000000000 seconds sys
>>
>>
>> john@localhost:~/linux/tools/perf>
>>
>>
>> Any idea what is going wrong here, before I have to dive in? The issue
>> seems to be this named commit.
>>
>> Thanks,
>> John
>>
>>> A metric group contains multiple metrics. These metrics may use the same
>>> events. If metrics use separate events then it leads to more
>>> multiplexing and overall metric counts fail to sum to 100%.
>>> Modify how metrics are associated with events so that if the events in
>>> an earlier group satisfy the current metric, the same events are used.
>>> A record of used events is kept and at the end of processing unnecessary
>>> events are eliminated.
>>>
>>> Before:
> .
>
Ian Rogers Oct. 5, 2020, 4:28 p.m. UTC | #3
On Mon, Oct 5, 2020 at 3:06 AM John Garry <john.garry@huawei.com> wrote:
>
> On 02/10/2020 21:46, Ian Rogers wrote:
> > On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote:
> >>
> >> On 07/05/2020 15:08, Ian Rogers wrote:
> >>
> >> Hi Ian,
> >>
> >> I was wondering if you ever tested commit 2440689d62e9 ("perf
> >> metricgroup: Remove duped metric group events") for when we have a
> >> metric which aliases multiple instances of the same uncore PMU in the
> >> system?
> >
> > Sorry for this, I hadn't tested such a metric and wasn't aware of how
> > the aliasing worked. I sent a fix for this issue here:
> > https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/
> > Could you see if this addresses the issue for you? I don't see the
> > change in Arnaldo's trees yet.
>
> Unfortunately this does not seem to fix my issue.
>
> So for that patch, you say you fix metric expression for DRAM_BW_Use,
> which is:
>
> {
>   "BriefDescription": "Average external Memory Bandwidth Use for reads
> and writes [GB / sec]",
>   "MetricExpr": "( 64 * ( uncore_imc@cas_count_read@ +
> uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time",
>   "MetricGroup": "Memory_BW",
> "MetricName": "DRAM_BW_Use"
> },
>
> But this metric expression does not include any alias events; rather I
> think it is just cas_count_write + cas_count_read event count for PMU
> uncore_imc / duration_time.
>
> When I say alias, I mean - as an example, we have event:
>
>      {
>          "BriefDescription": "write requests to memory controller.
> Derived from unc_m_cas_count.wr",
>          "Counter": "0,1,2,3",
>          "EventCode": "0x4",
>          "EventName": "LLC_MISSES.MEM_WRITE",
>          "PerPkg": "1",
>          "ScaleUnit": "64Bytes",
>          "UMask": "0xC",
>          "Unit": "iMC"
>      },
>
> And then reference LLC_MISSES.MEM_WRITE in a metric expression:
>
> "MetricExpr": "LLC_MISSES.MEM_WRITE / duration_time",
>
> This is what seems to be broken for when the alias matches > 1 PMU.
>
> Please check this.

Happy to check. Can you provide a reproduction? Looking on broadwell
this metric doesn't exist.

Thanks,
Ian

> Thanks,
> John
>
> >
> > Thanks,
> > Ian
> >
> >> I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an
> >> issue where find_evsel_group() fails for the uncore metrics under the
> >> condition mentioned above.
> >>
> >> Unfortunately I don't have an x86 machine to which this test applies.
> >> However, as an experiment, I added a test metric to my broadwell JSON:
> >>
> >> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> >> b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> >> index 8cdc7c13dc2a..fc6d9adf996a 100644
> >> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> >> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> >> @@ -348,5 +348,11 @@
> >>           "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
> >>           "MetricGroup": "Power",
> >>           "MetricName": "C7_Pkg_Residency"
> >> +    },
> >> +    {
> >> +        "BriefDescription": "test metric",
> >> +        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE *
> >> UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
> >> +        "MetricGroup": "Test",
> >> +        "MetricName": "test_metric_inc"
> >>       }
> >> ]
> >>
> >>
> >> And get this:
> >>
> >> john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M
> >> test_metric_inc sleep 1
> >> Using CPUID GenuineIntel-6-3D-4
> >> metric expr unc_cbo_xsnp_response.miss_xcore *
> >> unc_cbo_xsnp_response.miss_eviction for test_metric_inc
> >> found event unc_cbo_xsnp_response.miss_eviction
> >> found event unc_cbo_xsnp_response.miss_xcore
> >> adding
> >> {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
> >> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
> >> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
> >> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
> >> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
> >> Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore *
> >> unc_cbo_xsnp_response.miss_eviction
> >> task-clock: 688876 688876 688876
> >> context-switches: 2 688876 688876
> >> cpu-migrations: 0 688876 688876
> >> page-faults: 69 688876 688876
> >> cycles: 2101719 695690 695690
> >> instructions: 1180534 695690 695690
> >> branches: 249450 695690 695690
> >> branch-misses: 10815 695690 695690
> >>
> >> Performance counter stats for 'sleep 1':
> >>
> >>                0.69 msec task-clock                #    0.001 CPUs
> >> utilized
> >>                   2      context-switches          #    0.003 M/sec
> >>
> >>                   0      cpu-migrations            #    0.000 K/sec
> >>
> >>                  69      page-faults               #    0.100 M/sec
> >>
> >>           2,101,719      cycles                    #    3.051 GHz
> >>
> >>           1,180,534      instructions              #    0.56  insn per
> >> cycle
> >>             249,450      branches                  #  362.112 M/sec
> >>
> >>              10,815      branch-misses             #    4.34% of all
> >> branches
> >>
> >>         1.001177693 seconds time elapsed
> >>
> >>         0.001149000 seconds user
> >>         0.000000000 seconds sys
> >>
> >>
> >> john@localhost:~/linux/tools/perf>
> >>
> >>
> >> Any idea what is going wrong here, before I have to dive in? The issue
> >> seems to be this named commit.
> >>
> >> Thanks,
> >> John
> >>
> >>> A metric group contains multiple metrics. These metrics may use the same
> >>> events. If metrics use separate events then it leads to more
> >>> multiplexing and overall metric counts fail to sum to 100%.
> >>> Modify how metrics are associated with events so that if the events in
> >>> an earlier group satisfy the current metric, the same events are used.
> >>> A record of used events is kept and at the end of processing unnecessary
> >>> events are eliminated.
> >>>
> >>> Before:
> > .
> >
>
John Garry Oct. 5, 2020, 6:05 p.m. UTC | #4
On 05/10/2020 17:28, Ian Rogers wrote:
> On Mon, Oct 5, 2020 at 3:06 AM John Garry <john.garry@huawei.com> wrote:
>>
>> On 02/10/2020 21:46, Ian Rogers wrote:
>>> On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote:
>>>>
>>>> On 07/05/2020 15:08, Ian Rogers wrote:
>>>>
>>>> Hi Ian,
>>>>
>>>> I was wondering if you ever tested commit 2440689d62e9 ("perf
>>>> metricgroup: Remove duped metric group events") for when we have a
>>>> metric which aliases multiple instances of the same uncore PMU in the
>>>> system?
>>>
>>> Sorry for this, I hadn't tested such a metric and wasn't aware of how
>>> the aliasing worked. I sent a fix for this issue here:
>>> https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/
>>> Could you see if this addresses the issue for you? I don't see the
>>> change in Arnaldo's trees yet.
>>
>> Unfortunately this does not seem to fix my issue.
>>
>> So for that patch, you say you fix metric expression for DRAM_BW_Use,
>> which is:
>>
>> {
>>    "BriefDescription": "Average external Memory Bandwidth Use for reads
>> and writes [GB / sec]",
>>    "MetricExpr": "( 64 * ( uncore_imc@cas_count_read@ +
>> uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time",
>>    "MetricGroup": "Memory_BW",
>> "MetricName": "DRAM_BW_Use"
>> },
>>
>> But this metric expression does not include any alias events; rather I
>> think it is just cas_count_write + cas_count_read event count for PMU
>> uncore_imc / duration_time.
>>
>> When I say alias, I mean - as an example, we have event:
>>
>>       {
>>           "BriefDescription": "write requests to memory controller.
>> Derived from unc_m_cas_count.wr",
>>           "Counter": "0,1,2,3",
>>           "EventCode": "0x4",
>>           "EventName": "LLC_MISSES.MEM_WRITE",
>>           "PerPkg": "1",
>>           "ScaleUnit": "64Bytes",
>>           "UMask": "0xC",
>>           "Unit": "iMC"
>>       },
>>
>> And then reference LLC_MISSES.MEM_WRITE in a metric expression:
>>
>> "MetricExpr": "LLC_MISSES.MEM_WRITE / duration_time",
>>
>> This is what seems to be broken for when the alias matches > 1 PMU.
>>
>> Please check this.
> 
Hi Ian,

> Happy to check. 

So I am, but the code is a little complicated :)

> Can you provide a reproduction? Looking on broadwell
> this metric doesn't exist.

Right, I just added this test metric as my 2x x86 platform has no 
examples which I can find:

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json 
b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
index 8cdc7c13dc2a..fc6d9adf996a 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
@@ -348,5 +348,11 @@
         "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
         "MetricGroup": "Power",
         "MetricName": "C7_Pkg_Residency"
+    },
+    {
+        "BriefDescription": "test metric",
+        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * 
UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
+        "MetricGroup": "Test",
+        "MetricName": "test_metric_inc"
     }
]

I'll try to find a better mainline example, though, but I'm not hopeful ...

Thanks,
John
John Garry Oct. 6, 2020, 2:19 p.m. UTC | #5
On 05/10/2020 19:05, John Garry wrote:
>> Can you provide a reproduction? Looking on broadwell
>> this metric doesn't exist.
> 
> Right, I just added this test metric as my 2x x86 platform has no 
> examples which I can find:
> 
> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json 
> b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> index 8cdc7c13dc2a..fc6d9adf996a 100644
> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> @@ -348,5 +348,11 @@
>          "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
>          "MetricGroup": "Power",
>          "MetricName": "C7_Pkg_Residency"
> +    },
> +    {
> +        "BriefDescription": "test metric",
> +        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * 
> UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
> +        "MetricGroup": "Test",
> +        "MetricName": "test_metric_inc"
>      }
> ]
> 

It seems that the code in find_evsel_group() does not properly handle 
the scenario of event alias matching different PMUs (as I already said).

So I got it working on top of "perf metricgroup: Fix uncore metric 
expressions" with the following change:

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index d948a7f910cf..6293378c019c 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -213,7 +213,8 @@ static struct evsel *find_evsel_group(struct evlist 
*perf_evlist,
  		/* Ignore event if already used and merging is disabled. */
  		if (metric_no_merge && test_bit(ev->idx, evlist_used))
  			continue;
-		if (!has_constraint && ev->leader != current_leader) {
+		if (!has_constraint && (!current_leader || 
strcmp(current_leader->name, ev->leader->name))) {
  			/*
  			 * Start of a new group, discard the whole match and
  			 * start again.
@@ -279,7 +280,8 @@ static struct evsel *find_evsel_group(struct evlist 
*perf_evlist,
  			 * when then group is left.
  			 */
  			if (!has_constraint &&
-			    ev->leader != metric_events[i]->leader)
+			    strcmp(ev->leader->name, metric_events[i]->leader->name))
  				break;
  			if (!strcmp(metric_events[i]->name, ev->name)) {
  				set_bit(ev->idx, evlist_used);

which gives for my test metric:

./perf stat -v -M test_metric_inc sleep 1
Using CPUID GenuineIntel-6-3D-4
metric expr unc_cbo_xsnp_response.miss_xcore / 
unc_cbo_xsnp_response.miss_eviction for test_metric_inc
found event unc_cbo_xsnp_response.miss_eviction
found event unc_cbo_xsnp_response.miss_xcore
adding 
{unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
Control descriptor is not initialized
unc_cbo_xsnp_response.miss_eviction: 595175 1001021311 1001021311
unc_cbo_xsnp_response.miss_eviction: 592516 1001020037 1001020037
unc_cbo_xsnp_response.miss_xcore: 39139 1001021311 1001021311
unc_cbo_xsnp_response.miss_xcore: 38718 1001020037 1001020037

Performance counter stats for 'system wide':

         1,187,691      unc_cbo_xsnp_response.miss_eviction #     0.07 
test_metric_inc
            77,857      unc_cbo_xsnp_response.miss_xcore 


       1.001068918 seconds time elapsed

John
Ian Rogers Oct. 6, 2020, 2:42 p.m. UTC | #6
On Tue, Oct 6, 2020 at 7:22 AM John Garry <john.garry@huawei.com> wrote:
>
> On 05/10/2020 19:05, John Garry wrote:
> >> Can you provide a reproduction? Looking on broadwell
> >> this metric doesn't exist.
> >
> > Right, I just added this test metric as my 2x x86 platform has no
> > examples which I can find:
> >
> > diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> > b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> > index 8cdc7c13dc2a..fc6d9adf996a 100644
> > --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> > +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> > @@ -348,5 +348,11 @@
> >          "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
> >          "MetricGroup": "Power",
> >          "MetricName": "C7_Pkg_Residency"
> > +    },
> > +    {
> > +        "BriefDescription": "test metric",
> > +        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE *
> > UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
> > +        "MetricGroup": "Test",
> > +        "MetricName": "test_metric_inc"
> >      }
> > ]
> >
>
> It seems that the code in find_evsel_group() does not properly handle
> the scenario of event alias matching different PMUs (as I already said).
>
> So I got it working on top of "perf metricgroup: Fix uncore metric
> expressions" with the following change:
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index d948a7f910cf..6293378c019c 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -213,7 +213,8 @@ static struct evsel *find_evsel_group(struct evlist
> *perf_evlist,
>                 /* Ignore event if already used and merging is disabled. */
>                 if (metric_no_merge && test_bit(ev->idx, evlist_used))
>                         continue;
> -               if (!has_constraint && ev->leader != current_leader) {
> +               if (!has_constraint && (!current_leader ||
> strcmp(current_leader->name, ev->leader->name))) {
>                         /*
>                          * Start of a new group, discard the whole match and
>                          * start again.
> @@ -279,7 +280,8 @@ static struct evsel *find_evsel_group(struct evlist
> *perf_evlist,
>                          * when then group is left.
>                          */
>                         if (!has_constraint &&
> -                           ev->leader != metric_events[i]->leader)
> +                           strcmp(ev->leader->name, metric_events[i]->leader->name))
>                                 break;
>                         if (!strcmp(metric_events[i]->name, ev->name)) {
>                                 set_bit(ev->idx, evlist_used);
>
> which gives for my test metric:
>
> ./perf stat -v -M test_metric_inc sleep 1
> Using CPUID GenuineIntel-6-3D-4
> metric expr unc_cbo_xsnp_response.miss_xcore /
> unc_cbo_xsnp_response.miss_eviction for test_metric_inc
> found event unc_cbo_xsnp_response.miss_eviction
> found event unc_cbo_xsnp_response.miss_xcore
> adding
> {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
> Control descriptor is not initialized
> unc_cbo_xsnp_response.miss_eviction: 595175 1001021311 1001021311
> unc_cbo_xsnp_response.miss_eviction: 592516 1001020037 1001020037
> unc_cbo_xsnp_response.miss_xcore: 39139 1001021311 1001021311
> unc_cbo_xsnp_response.miss_xcore: 38718 1001020037 1001020037
>
> Performance counter stats for 'system wide':
>
>          1,187,691      unc_cbo_xsnp_response.miss_eviction #     0.07
> test_metric_inc
>             77,857      unc_cbo_xsnp_response.miss_xcore
>
>
>        1.001068918 seconds time elapsed
>
> John

Thanks John! I was able to repro the problem, let me investigate what
is happening here as it seems there may be something wrong with the
grouping logic.

Ian
diff mbox series

Patch

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json 
b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
index 8cdc7c13dc2a..fc6d9adf996a 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
@@ -348,5 +348,11 @@ 
         "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
         "MetricGroup": "Power",
         "MetricName": "C7_Pkg_Residency"
+    },
+    {
+        "BriefDescription": "test metric",
+        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * 
UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
+        "MetricGroup": "Test",
+        "MetricName": "test_metric_inc"
     }