Message ID | 1520034092-35275-2-git-send-email-agustinv@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
On Fri, Mar 02, 2018 at 06:41:30PM -0500, Agustin Vega-Frias wrote:

SNIP

>
> diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
> index 655ecff..a1a01b1 100644
> --- a/tools/perf/util/parse-events.l
> +++ b/tools/perf/util/parse-events.l
> @@ -175,7 +175,7 @@ bpf_source [^,{}]+\.c[a-zA-Z0-9._]*
>  num_dec		[0-9]+
>  num_hex		0x[a-fA-F0-9]+
>  num_raw_hex	[a-fA-F0-9]+
> -name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
> +name		[a-zA-Z_*?\[\]][a-zA-Z0-9_*?.\[\]]*
>  name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
>  drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
>  /* If you add a modifier you need to update check_modifier() */
> diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
> index e81a20e..c528469 100644
> --- a/tools/perf/util/parse-events.y
> +++ b/tools/perf/util/parse-events.y
> @@ -8,6 +8,7 @@
>
>  #define YYDEBUG 1
>
> +#include <fnmatch.h>
>  #include <linux/compiler.h>
>  #include <linux/list.h>
>  #include <linux/types.h>
> @@ -241,7 +242,7 @@ PE_NAME opt_event_config
>  			if (!strncmp(name, "uncore_", 7) &&
>  			    strncmp($1, "uncore_", 7))
>  				name += 7;
> -			if (!strncmp($1, name, strlen($1))) {
> +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {

could we now get rid of the strncmp in here and keep the
glob matching only?

I find it confusing now that following commands give me same results:

  - [root@krava perf]# ./perf stat -e 'cbox/clockticks/' --no-merge -a sleep 1

     Performance counter stats for 'system wide':

       <not supported>      uncore_cbox_1/clockticks/
   281,474,957,674,239      uncore_cbox_0/clockticks/

           1.000958335 seconds time elapsed

  - [root@krava perf]# ./perf stat -e '*cbox*/clockticks/' --no-merge -a sleep 1

     Performance counter stats for 'system wide':

       <not supported>      uncore_cbox_1/clockticks/
             5,427,337      uncore_cbox_0/clockticks/

           1.000962724 seconds time elapsed

  - [root@krava perf]# ./perf stat -e 'cbox*/clockticks/' --no-merge -a sleep 1

     Performance counter stats for 'system wide':

       <not supported>      uncore_cbox_1/clockticks/
   281,474,969,621,374      uncore_cbox_0/clockticks/

           1.001026179 seconds time elapsed

and this one fails:

  - [root@krava perf]# ./perf stat -e '*cbox/clockticks/' --no-merge -a sleep 1
    event syntax error: '*cbox/clockticks/'
                         \___ Cannot find PMU `*cbox'. Missing kernel support?
    Run 'perf list' for a list of valid events

     Usage: perf stat [<options>] [<command>]

        -e, --event <event>   event selector. use 'perf list' to list available events

despite the fact that it makes as much sense as the previous one:
    perf stat -e 'cbox*/clockticks/'

I'd think let's keep just the glob matching, so it's clear
what you use wildcards for.. thoughts?

thanks,
jirka
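The four results above follow from the combined test in the quoted hunk. Below is a minimal standalone sketch of that test, assuming the PMU names shown in the output; pmu_matches() is a hypothetical helper name used only for illustration (in the patch the check is inline in the grammar action, with $1 holding the user-supplied name).

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <string.h>

/*
 * Sketch of the matching rule from the quoted hunk.
 * pmu_matches() is a hypothetical name, not a function in perf.
 */
static bool pmu_matches(const char *pattern, const char *pmu_name)
{
	const char *name = pmu_name;

	/* Skip "uncore_" in the PMU name unless the user typed it too. */
	if (!strncmp(name, "uncore_", 7) && strncmp(pattern, "uncore_", 7))
		name += 7;

	/* Legacy prefix match, or glob match over the whole remaining name. */
	return !strncmp(pattern, name, strlen(pattern)) ||
	       !fnmatch(pattern, name, 0);
}
```

With this rule, 'cbox' matches uncore_cbox_0 through the prefix compare, '*cbox*' and 'cbox*' match through fnmatch, and '*cbox' matches neither way: it is not a literal prefix of "cbox_0", and as a glob it requires the name to end in "cbox".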
> > +#include <fnmatch.h>
> >  #include <linux/compiler.h>
> >  #include <linux/list.h>
> >  #include <linux/types.h>
> > @@ -241,7 +242,7 @@ PE_NAME opt_event_config
> >  			if (!strncmp(name, "uncore_", 7) &&
> >  			    strncmp($1, "uncore_", 7))
> >  				name += 7;
> > -			if (!strncmp($1, name, strlen($1))) {
> > +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
>
> could we now get rid of the strncmp in here and keep the
> glob matching only?

That would break existing command lines. Not a good idea.

-Andi
On Sun, Mar 04, 2018 at 09:12:45AM -0800, Andi Kleen wrote:
> > > +#include <fnmatch.h>
> > >  #include <linux/compiler.h>
> > >  #include <linux/list.h>
> > >  #include <linux/types.h>
> > > @@ -241,7 +242,7 @@ PE_NAME opt_event_config
> > >  			if (!strncmp(name, "uncore_", 7) &&
> > >  			    strncmp($1, "uncore_", 7))
> > >  				name += 7;
> > > -			if (!strncmp($1, name, strlen($1))) {
> > > +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
> >
> > could we now get rid of the strncmp in here and keep the
> > glob matching only?
>
> That would break existing command lines. Not a good idea.

I hoped that only you guys are using this and would rewrite your scripts ;-)

I had no idea there's fnmatch func before.. too bad, ok

jirka
On 2018-03-04 13:10, Jiri Olsa wrote:
> On Sun, Mar 04, 2018 at 09:12:45AM -0800, Andi Kleen wrote:
>> > > +#include <fnmatch.h>
>> > >  #include <linux/compiler.h>
>> > >  #include <linux/list.h>
>> > >  #include <linux/types.h>
>> > > @@ -241,7 +242,7 @@ PE_NAME opt_event_config
>> > >  			if (!strncmp(name, "uncore_", 7) &&
>> > >  			    strncmp($1, "uncore_", 7))
>> > >  				name += 7;
>> > > -			if (!strncmp($1, name, strlen($1))) {
>> > > +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
>> >
>> > could we now get rid of the strncmp in here and keep the
>> > glob matching only?
>>
>> That would break existing command lines. Not a good idea.
>
> I hoped that only you guys are using this and would rewrite your
> scripts ;-)
>
> I had no idea there's fnmatch func before.. too bad, ok
>
> jirka

An option to keep backward compatibility and consistency would be
to wrap the pattern/string passed in *'s, that way we can just use
fnmatch and have all the examples Jiri brought up work the same.
With that in place we can actually also drop the explicit ignoring
of the uncore_ prefix since the globbing would take care of that.

Thoughts?

Agustín
Agustin Vega-Frias <agustinv@codeaurora.org> writes:
>
> An option to keep backward compatibility and consistency would be
> to wrap the pattern/string passed in *'s, that way we can just use
> fnmatch and have all the examples Jiri brought up work the same.
> With that in place we can actually also drop the explicit ignoring
> of the uncore_ prefix since the globbing would take care of that.

Prepending with * would seem dangerous, could result in false matches.

But adding it at the end should be ok.

-Andi
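To make the concern about a leading '*' concrete, here is a small sketch that compares wrapping the user's string on both sides with appending a '*' only at the end. match_wrapped() is a hypothetical helper, and a PMU name such as "xyzimc_0" is invented purely to illustrate a false match; neither comes from the patch or from a real system.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: glob-match `name` against `pattern` after adding
 * "*" on the selected sides. Not part of perf; for illustration only.
 */
static bool match_wrapped(const char *pattern, const char *name,
			  bool star_before, bool star_after)
{
	char buf[256];
	int n = snprintf(buf, sizeof(buf), "%s%s%s",
			 star_before ? "*" : "", pattern,
			 star_after ? "*" : "");

	if (n < 0 || (size_t)n >= sizeof(buf))
		return false;	/* pattern too long for this sketch */

	return fnmatch(buf, name, 0) == 0;
}
```

Under these assumptions, "imc" wrapped as "*imc*" also selects the made-up "xyzimc_0", while "imc*" does not. Note that with only a trailing '*', "imc*" no longer matches "uncore_imc_0" by globbing alone, so the explicit handling of the uncore_ prefix would presumably still be needed in that variant.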
On Mon, Mar 05, 2018 at 10:08:18AM -0500, Agustin Vega-Frias wrote:
> On 2018-03-04 13:10, Jiri Olsa wrote:
> > On Sun, Mar 04, 2018 at 09:12:45AM -0800, Andi Kleen wrote:
> > > > > +#include <fnmatch.h>
> > > > >  #include <linux/compiler.h>
> > > > >  #include <linux/list.h>
> > > > >  #include <linux/types.h>
> > > > > @@ -241,7 +242,7 @@ PE_NAME opt_event_config
> > > > >  			if (!strncmp(name, "uncore_", 7) &&
> > > > >  			    strncmp($1, "uncore_", 7))
> > > > >  				name += 7;
> > > > > -			if (!strncmp($1, name, strlen($1))) {
> > > > > +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
> > > >
> > > > could we now get rid of the strncmp in here and keep the
> > > > glob matching only?
> > >
> > > That would break existing command lines. Not a good idea.
> >
> > I hoped that only you guys are using this and would rewrite your scripts
> > ;-)
> >
> > I had no idea there's fnmatch func before.. too bad, ok
> >
> > jirka
>
> An option to keep backward compatibility and consistency would be
> to wrap the pattern/string passed in *'s, that way we can just use
> fnmatch and have all the examples Jiri brought up work the same.
> With that in place we can actually also drop the explicit ignoring
> of the uncore_ prefix since the globbing would take care of that.

I don't mind the strcmp as such, I wanted to get rid of the wildcard
matching without using '*' ... but as Andi said it's been out
there and it's been a while, so let's keep it

but if there's a way to make it simpler, let's go for it

thanks,
jirka
On 2018-03-05 14:09, Jiri Olsa wrote:
> On Mon, Mar 05, 2018 at 10:08:18AM -0500, Agustin Vega-Frias wrote:
>> On 2018-03-04 13:10, Jiri Olsa wrote:
>> > On Sun, Mar 04, 2018 at 09:12:45AM -0800, Andi Kleen wrote:
>> > > > > +#include <fnmatch.h>
>> > > > >  #include <linux/compiler.h>
>> > > > >  #include <linux/list.h>
>> > > > >  #include <linux/types.h>
>> > > > > @@ -241,7 +242,7 @@ PE_NAME opt_event_config
>> > > > >  			if (!strncmp(name, "uncore_", 7) &&
>> > > > >  			    strncmp($1, "uncore_", 7))
>> > > > >  				name += 7;
>> > > > > -			if (!strncmp($1, name, strlen($1))) {
>> > > > > +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
>> > > >
>> > > > could we now get rid of the strncmp in here and keep the
>> > > > glob matching only?
>> > >
>> > > That would break existing command lines. Not a good idea.
>> >
>> > I hoped that only you guys are using this and would rewrite your scripts
>> > ;-)
>> >
>> > I had no idea there's fnmatch func before.. too bad, ok
>> >
>> > jirka
>>
>> An option to keep backward compatibility and consistency would be
>> to wrap the pattern/string passed in *'s, that way we can just use
>> fnmatch and have all the examples Jiri brought up work the same.
>> With that in place we can actually also drop the explicit ignoring
>> of the uncore_ prefix since the globbing would take care of that.
>
> I don't mind the strcmp as such, I wanted to get rid of the wildcard
> matching without using '*' ... but as Andi said it's been out
> there and it's been a while, so let's keep it
>
> but if there's a way to make it simpler, let's go for it
>
> thanks,
> jirka

Sounds good. I have a new version ready (see sample output below).
But I wanted to ping about the other two patches before submitting.
Any feedback on those?

Thanks,
Agustín

PS:
Sample output:

$ ./perf stat -a -e imc/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null

 Performance counter stats for 'system wide':

             2,613      uncore_imc_0/umask=0x3,event=0x4/
             2,736      uncore_imc_1/umask=0x3,event=0x4/
             2,671      uncore_imc_2/umask=0x3,event=0x4/
             2,508      uncore_imc_3/umask=0x3,event=0x4/
             2,439      uncore_imc_4/umask=0x3,event=0x4/
             2,465      uncore_imc_5/umask=0x3,event=0x4/

       0.004159243 seconds time elapsed

$ ./perf stat -a -e *imc/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null

 Performance counter stats for 'system wide':

             2,704      uncore_imc_0/umask=0x3,event=0x4/
             2,601      uncore_imc_1/umask=0x3,event=0x4/
             2,625      uncore_imc_2/umask=0x3,event=0x4/
             2,370      uncore_imc_3/umask=0x3,event=0x4/
             2,485      uncore_imc_4/umask=0x3,event=0x4/
             2,431      uncore_imc_5/umask=0x3,event=0x4/

       0.002716763 seconds time elapsed

$ ./perf stat -a -e imc*/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null

 Performance counter stats for 'system wide':

             1,294      uncore_imc_0/umask=0x3,event=0x4/
             1,303      uncore_imc_1/umask=0x3,event=0x4/
             1,242      uncore_imc_2/umask=0x3,event=0x4/
             1,125      uncore_imc_3/umask=0x3,event=0x4/
             1,137      uncore_imc_4/umask=0x3,event=0x4/
             1,159      uncore_imc_5/umask=0x3,event=0x4/

       0.002790441 seconds time elapsed

$ ./perf stat -a -e *imc*/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null

 Performance counter stats for 'system wide':

             1,524      uncore_imc_0/umask=0x3,event=0x4/
             1,508      uncore_imc_1/umask=0x3,event=0x4/
             1,501      uncore_imc_2/umask=0x3,event=0x4/
             1,405      uncore_imc_3/umask=0x3,event=0x4/
             1,427      uncore_imc_4/umask=0x3,event=0x4/
             1,450      uncore_imc_5/umask=0x3,event=0x4/

       0.002720907 seconds time elapsed
On Mon, Mar 05, 2018 at 03:10:43PM -0500, Agustin Vega-Frias wrote:
> On 2018-03-05 14:09, Jiri Olsa wrote:
> > On Mon, Mar 05, 2018 at 10:08:18AM -0500, Agustin Vega-Frias wrote:
> > > On 2018-03-04 13:10, Jiri Olsa wrote:
> > > > On Sun, Mar 04, 2018 at 09:12:45AM -0800, Andi Kleen wrote:
> > > > > > > +#include <fnmatch.h>
> > > > > > >  #include <linux/compiler.h>
> > > > > > >  #include <linux/list.h>
> > > > > > >  #include <linux/types.h>
> > > > > > > @@ -241,7 +242,7 @@ PE_NAME opt_event_config
> > > > > > >  			if (!strncmp(name, "uncore_", 7) &&
> > > > > > >  			    strncmp($1, "uncore_", 7))
> > > > > > >  				name += 7;
> > > > > > > -			if (!strncmp($1, name, strlen($1))) {
> > > > > > > +			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
> > > > > >
> > > > > > could we now get rid of the strncmp in here and keep the
> > > > > > glob matching only?
> > > > >
> > > > > That would break existing command lines. Not a good idea.
> > > >
> > > > I hoped that only you guys are using this and would rewrite your scripts
> > > > ;-)
> > > >
> > > > I had no idea there's fnmatch func before.. too bad, ok
> > > >
> > > > jirka
> > >
> > > An option to keep backward compatibility and consistency would be
> > > to wrap the pattern/string passed in *'s, that way we can just use
> > > fnmatch and have all the examples Jiri brought up work the same.
> > > With that in place we can actually also drop the explicit ignoring
> > > of the uncore_ prefix since the globbing would take care of that.
> >
> > I don't mind the strcmp as such, I wanted to get rid of the wildcard
> > matching without using '*' ... but as Andi said it's been out
> > there and it's been a while, so let's keep it
> >
> > but if there's a way to make it simpler, let's go for it
> >
> > thanks,
> > jirka
>
> Sounds good. I have a new version ready (see sample output below).
> But I wanted to ping about the other two patches before submitting.
> Any feedback on those?

the rest looks ok to me, so does the output below

thanks,
jirka

>
> Thanks,
> Agustín
>
> PS:
> Sample output:
>
> $ ./perf stat -a -e imc/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null
>
>  Performance counter stats for 'system wide':
>
>              2,613      uncore_imc_0/umask=0x3,event=0x4/
>              2,736      uncore_imc_1/umask=0x3,event=0x4/
>              2,671      uncore_imc_2/umask=0x3,event=0x4/
>              2,508      uncore_imc_3/umask=0x3,event=0x4/
>              2,439      uncore_imc_4/umask=0x3,event=0x4/
>              2,465      uncore_imc_5/umask=0x3,event=0x4/
>
>        0.004159243 seconds time elapsed
>
> $ ./perf stat -a -e *imc/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null
>
>  Performance counter stats for 'system wide':
>
>              2,704      uncore_imc_0/umask=0x3,event=0x4/
>              2,601      uncore_imc_1/umask=0x3,event=0x4/
>              2,625      uncore_imc_2/umask=0x3,event=0x4/
>              2,370      uncore_imc_3/umask=0x3,event=0x4/
>              2,485      uncore_imc_4/umask=0x3,event=0x4/
>              2,431      uncore_imc_5/umask=0x3,event=0x4/
>
>        0.002716763 seconds time elapsed
>
> $ ./perf stat -a -e imc*/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null
>
>  Performance counter stats for 'system wide':
>
>              1,294      uncore_imc_0/umask=0x3,event=0x4/
>              1,303      uncore_imc_1/umask=0x3,event=0x4/
>              1,242      uncore_imc_2/umask=0x3,event=0x4/
>              1,125      uncore_imc_3/umask=0x3,event=0x4/
>              1,137      uncore_imc_4/umask=0x3,event=0x4/
>              1,159      uncore_imc_5/umask=0x3,event=0x4/
>
>        0.002790441 seconds time elapsed
>
> $ ./perf stat -a -e *imc*/umask=0x3,event=0x4/ --no-merge ls -l > /dev/null
>
>  Performance counter stats for 'system wide':
>
>              1,524      uncore_imc_0/umask=0x3,event=0x4/
>              1,508      uncore_imc_1/umask=0x3,event=0x4/
>              1,501      uncore_imc_2/umask=0x3,event=0x4/
>              1,405      uncore_imc_3/umask=0x3,event=0x4/
>              1,427      uncore_imc_4/umask=0x3,event=0x4/
>              1,450      uncore_imc_5/umask=0x3,event=0x4/
>
>        0.002720907 seconds time elapsed
>
> --
> Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm
> Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux
> Foundation Collaborative Project.
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index e2a897a..2549c34 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -141,7 +141,13 @@ on the first memory controller on socket 0 of a Intel Xeon system
 
 Each memory controller has its own PMU. Measuring the complete system
 bandwidth would require specifying all imc PMUs (see perf list output),
-and adding the values together.
+and adding the values together. To simplify creation of multiple events,
+prefix and glob matching is supported in the PMU name, and the prefix
+'uncore_' is also ignored when performing the match. So the command above
+can be expanded to all memory controllers by using the syntaxes:
+
+  perf stat -C 0 -a imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
+  perf stat -C 0 -a *imc*/cas_count_read/,*imc*/cas_count_write/ -I 1000 ...
 
 This example measures the combined core power every second
 
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 823fce7..49983a7 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -49,6 +49,12 @@ report::
 	  parameters are defined by corresponding entries in
 	  /sys/bus/event_source/devices/<pmu>/format/*
 
+	Note that the last two syntaxes support prefix and glob matching in
+	the PMU name to simplify creation of events accross multiple instances
+	of the same type of PMU (e.g. memory controller PMU) in large systems.
+	Multiple PMU instances are typical for uncore PMUs, so the prefix
+	'uncore_' is also ignored when performing this match.
+
 -i::
 --no-inherit::
         child tasks do not inherit counters
@@ -246,6 +252,12 @@ taskset.
 --no-merge::
 Do not merge results from same PMUs.
 
+When multiple events are created from a single event alias, stat will,
+by default, aggregate the event counts and show the result in a single
+row. This option disables that behavior and shows the individual events
+and counts. Aliases are listed immediately after the Kernel PMU events
+by perf list.
+
 --smi-cost::
 Measure SMI cost if msr/aperf/ and msr/smi/ events are supported.
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 655ecff..a1a01b1 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -175,7 +175,7 @@ bpf_source [^,{}]+\.c[a-zA-Z0-9._]*
 num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]+
 num_raw_hex	[a-fA-F0-9]+
-name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
+name		[a-zA-Z_*?\[\]][a-zA-Z0-9_*?.\[\]]*
 name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
 /* If you add a modifier you need to update check_modifier() */
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index e81a20e..c528469 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -8,6 +8,7 @@
 
 #define YYDEBUG 1
 
+#include <fnmatch.h>
 #include <linux/compiler.h>
 #include <linux/list.h>
 #include <linux/types.h>
@@ -241,7 +242,7 @@ PE_NAME opt_event_config
 			if (!strncmp(name, "uncore_", 7) &&
 			    strncmp($1, "uncore_", 7))
 				name += 7;
-			if (!strncmp($1, name, strlen($1))) {
+			if (!strncmp($1, name, strlen($1)) || !fnmatch($1, name, 0)) {
 				if (parse_events_copy_term_list(orig_terms, &terms))
 					YYABORT;
 				if (!parse_events_add_pmu(_parse_state, list, pmu->name, terms))
Starting with v4.12, the event parsing code for dynamic pmu events already
supports prefix-based matching of multiple pmus when creating dynamic
events. E.g., in a system with the following dynamic pmus:

	mypmu_0
	mypmu_1
	mypmu_2
	mypmu_4

passing mypmu/<config>/ as an event spec will result in the creation
of the event in all of the pmus. This change expands this matching
through the use of fnmatch so glob-like expressions can be used to
create events in multiple pmus. E.g., in the system described above,
if a user only wants to create the event in mypmu_0 and mypmu_1,
mypmu_[01]/<config>/ can be passed.

Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
---
 tools/perf/Documentation/perf-list.txt |  8 +++++++-
 tools/perf/Documentation/perf-stat.txt | 12 ++++++++++++
 tools/perf/util/parse-events.l         |  2 +-
 tools/perf/util/parse-events.y         |  3 ++-
 4 files changed, 22 insertions(+), 3 deletions(-)

-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
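The mypmu example from the commit message can be checked with a short standalone sketch. It assumes the four PMU names listed there and applies the same "prefix match or fnmatch" test as the patch; count_selected() and example_pmus are hypothetical names introduced only for this illustration.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* PMU names taken from the commit message's example system. */
static const char *const example_pmus[] = {
	"mypmu_0", "mypmu_1", "mypmu_2", "mypmu_4",
};

/*
 * Hypothetical helper: count how many of the example PMUs a user-supplied
 * name selects, using prefix match OR glob match as in the patch.
 */
static size_t count_selected(const char *pattern)
{
	size_t i, hits = 0;

	for (i = 0; i < sizeof(example_pmus) / sizeof(example_pmus[0]); i++) {
		const char *name = example_pmus[i];

		if (!strncmp(pattern, name, strlen(pattern)) ||
		    !fnmatch(pattern, name, 0))
			hits++;
	}
	return hits;
}
```

In this sketch, "mypmu" selects all four PMUs through the prefix compare, while "mypmu_[01]" selects only mypmu_0 and mypmu_1 through the bracket expression, which is why the lexer change adds '[' and ']' to the characters allowed in a name.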