diff mbox series

[v5,3/4] perf record: Skip don't fail for events that don't open

Message ID 20250109222109.567031-4-irogers@google.com (mailing list archive)
State Not Applicable
Headers show
Series Prefer sysfs/JSON events also when no PMU is provided | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Ian Rogers Jan. 9, 2025, 10:21 p.m. UTC
Whilst for many tools it is an expected behavior that failure to open
a perf event is a failure, ARM decided to name PMU events the same as
legacy events and then failed to rename such events on a server uncore
SLC PMU. As perf's default behavior when no PMU is specified is to
open the event on all PMUs that advertise/"have" the event, this
yielded failures when trying to make the priority of legacy and
sysfs/json events uniform - something requested by RISC-V and ARM. A
legacy event user on ARM hardware may find their event opened on an
uncore PMU which for perf record will fail. Arnaldo suggested skipping
such events which this patch implements. Rather than have the skipping
conditional on running on ARM, the skipping is done on all
architectures as such a fundamental behavioral difference could lead
to problems with tools built/depending on perf.

An example of perf record failing to open events on x86 is:
```
$ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
Error:
Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
"dmesg | grep -i perf" may provide additional information.

Error:
Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
"dmesg | grep -i perf" may provide additional information.

Error:
Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
The LLC-prefetch-read event is not supported.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]

$ perf report --stats
Aggregated stats:
               TOTAL events:      17255
                MMAP events:        284  ( 1.6%)
                COMM events:       1961  (11.4%)
                EXIT events:          1  ( 0.0%)
                FORK events:       1960  (11.4%)
              SAMPLE events:         87  ( 0.5%)
               MMAP2 events:      12836  (74.4%)
             KSYMBOL events:         83  ( 0.5%)
           BPF_EVENT events:         36  ( 0.2%)
      FINISHED_ROUND events:          2  ( 0.0%)
            ID_INDEX events:          1  ( 0.0%)
          THREAD_MAP events:          1  ( 0.0%)
             CPU_MAP events:          1  ( 0.0%)
           TIME_CONV events:          1  ( 0.0%)
       FINISHED_INIT events:          1  ( 0.0%)
cycles stats:
              SAMPLE events:         87
```

If all events fail to open then the perf record will fail:
```
$ perf record -e LLC-prefetch-read true
Error:
Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
The LLC-prefetch-read event is not supported.
Error:
Failure to open any events for recording
```

As an evlist may have dummy events that open when all command line
events fail we ignore dummy events when detecting if at least some
events open. This still permits the dummy event on its own to be used
as a permission check:
```
$ perf record -e dummy true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.046 MB perf.data ]
```
but allows failure when a dummy event is implicilty inserted or when
there are insufficient permissions to open it:
```
$ perf record -e LLC-prefetch-read -a true
Error:
Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
The LLC-prefetch-read event is not supported.
Error:
Failure to open any events for recording
```

The issue with legacy events is that on RISC-V they want the driver to
not have mappings from legacy to non-legacy config encodings for each
vendor/model due to size, complexity and difficulty to update. It was
reported that on ARM Apple-M? CPUs the legacy mapping in the driver
was broken and the sysfs/json events should always take precedent,
however, it isn't clear this is still the case. It is the case that
without working around this issue a legacy event like cycles without a
PMU can encode differently than when specified with a PMU - the
non-PMU version favoring legacy encodings, the PMU one avoiding legacy
encodings.

The patch removes events and then adjusts the idx value for each
evsel. This is done so that the dense xyarrays used for file
descriptors, etc. don't contain broken entries. As event opening
happens relatively late in the record process, use of the idx value
before the open will have become corrupted, so it is expected there
are latent bugs hidden behind this change - the change is best
effort. As the only vendor that has broken event names is ARM, this
will principally effect ARM users. They will also experience warning
messages like those above because of the uncore PMU advertising legacy
event names.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
---
 tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 6 deletions(-)

Comments

Namhyung Kim Jan. 10, 2025, 1:25 a.m. UTC | #1
On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> Whilst for many tools it is an expected behavior that failure to open
> a perf event is a failure, ARM decided to name PMU events the same as
> legacy events and then failed to rename such events on a server uncore
> SLC PMU. As perf's default behavior when no PMU is specified is to
> open the event on all PMUs that advertise/"have" the event, this
> yielded failures when trying to make the priority of legacy and
> sysfs/json events uniform - something requested by RISC-V and ARM. A
> legacy event user on ARM hardware may find their event opened on an
> uncore PMU which for perf record will fail. Arnaldo suggested skipping
> such events which this patch implements. Rather than have the skipping
> conditional on running on ARM, the skipping is done on all
> architectures as such a fundamental behavioral difference could lead
> to problems with tools built/depending on perf.
> 
> An example of perf record failing to open events on x86 is:
> ```
> $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
> 
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
> 
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]

I'm afraid this can be too noisy.

> 
> $ perf report --stats
> Aggregated stats:
>                TOTAL events:      17255
>                 MMAP events:        284  ( 1.6%)
>                 COMM events:       1961  (11.4%)
>                 EXIT events:          1  ( 0.0%)
>                 FORK events:       1960  (11.4%)
>               SAMPLE events:         87  ( 0.5%)
>                MMAP2 events:      12836  (74.4%)
>              KSYMBOL events:         83  ( 0.5%)
>            BPF_EVENT events:         36  ( 0.2%)
>       FINISHED_ROUND events:          2  ( 0.0%)
>             ID_INDEX events:          1  ( 0.0%)
>           THREAD_MAP events:          1  ( 0.0%)
>              CPU_MAP events:          1  ( 0.0%)
>            TIME_CONV events:          1  ( 0.0%)
>        FINISHED_INIT events:          1  ( 0.0%)
> cycles stats:
>               SAMPLE events:         87
> ```
> 
> If all events fail to open then the perf record will fail:
> ```
> $ perf record -e LLC-prefetch-read true
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> Error:
> Failure to open any events for recording
> ```
> 
> As an evlist may have dummy events that open when all command line
> events fail we ignore dummy events when detecting if at least some
> events open. This still permits the dummy event on its own to be used
> as a permission check:
> ```
> $ perf record -e dummy true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.046 MB perf.data ]
> ```
> but allows failure when a dummy event is implicilty inserted or when
> there are insufficient permissions to open it:
> ```
> $ perf record -e LLC-prefetch-read -a true
> Error:
> Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> The LLC-prefetch-read event is not supported.
> Error:
> Failure to open any events for recording
> ```
> 
> The issue with legacy events is that on RISC-V they want the driver to
> not have mappings from legacy to non-legacy config encodings for each
> vendor/model due to size, complexity and difficulty to update. It was
> reported that on ARM Apple-M? CPUs the legacy mapping in the driver
> was broken and the sysfs/json events should always take precedent,
> however, it isn't clear this is still the case. It is the case that
> without working around this issue a legacy event like cycles without a
> PMU can encode differently than when specified with a PMU - the
> non-PMU version favoring legacy encodings, the PMU one avoiding legacy
> encodings.
> 
> The patch removes events and then adjusts the idx value for each
> evsel. This is done so that the dense xyarrays used for file
> descriptors, etc. don't contain broken entries. As event opening
> happens relatively late in the record process, use of the idx value
> before the open will have become corrupted, so it is expected there
> are latent bugs hidden behind this change - the change is best
> effort. As the only vendor that has broken event names is ARM, this
> will principally effect ARM users. They will also experience warning
> messages like those above because of the uncore PMU advertising legacy
> event names.
> 
> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> Signed-off-by: Ian Rogers <irogers@google.com>
> Tested-by: James Clark <james.clark@linaro.org>
> Tested-by: Leo Yan <leo.yan@arm.com>
> Tested-by: Atish Patra <atishp@rivosinc.com>
> ---
>  tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
>  1 file changed, 41 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 5db1aedf48df..c0b8249a3787 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -961,7 +961,6 @@ static int record__config_tracking_events(struct record *rec)
>  	 */
>  	if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
>  	    perf_pmus__num_core_pmus() > 1) {
> -
>  		/*
>  		 * User space tasks can migrate between CPUs, so when tracing
>  		 * selected CPUs, sideband for all CPUs is still needed.
> @@ -1366,6 +1365,7 @@ static int record__open(struct record *rec)
>  	struct perf_session *session = rec->session;
>  	struct record_opts *opts = &rec->opts;
>  	int rc = 0;
> +	bool skipped = false;
>  
>  	evlist__for_each_entry(evlist, pos) {
>  try_again:
> @@ -1381,15 +1381,50 @@ static int record__open(struct record *rec)
>  			        pos = evlist__reset_weak_group(evlist, pos, true);
>  				goto try_again;
>  			}
> -			rc = -errno;
>  			evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> -			ui__error("%s\n", msg);
> -			goto out;
> +			ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> +				  evsel__name(pos), evsel__pmu_name(pos), msg);

How about changing it to pr_debug() and add below ...


> +			pos->skippable = true;
> +			skipped = true;
> +		} else {
> +			pos->supported = true;
>  		}
> -
> -		pos->supported = true;
>  	}
>  
> +	if (skipped) {
> +		struct evsel *tmp;
> +		int idx = 0;
> +		bool evlist_empty = true;
> +
> +		/* Remove evsels that failed to open and update indices. */
> +		evlist__for_each_entry_safe(evlist, tmp, pos) {
> +			if (pos->skippable) {
> +				evlist__remove(evlist, pos);
> +				continue;
> +			}
> +
> +			/*
> +			 * Note, dummy events may be command line parsed or
> +			 * added by the tool. We care about supporting `perf
> +			 * record -e dummy` which may be used as a permission
> +			 * check. Dummy events that are added to the command
> +			 * line and opened along with other events that fail,
> +			 * will still fail as if the dummy events were tool
> +			 * added events for the sake of code simplicity.
> +			 */
> +			if (!evsel__is_dummy_event(pos))
> +				evlist_empty = false;
> +		}
> +		evlist__for_each_entry(evlist, pos) {
> +			pos->core.idx = idx++;
> +		}
> +		/* If list is empty then fail. */
> +		if (evlist_empty) {
> +			ui__error("Failure to open any events for recording.\n");
> +			rc = -1;
> +			goto out;
> +		}

... ?

		if (!verbose)
			ui__warning("Removed some unsupported events, use -v for details.\n");

Thanks,
Namhyung


> +	}
>  	if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
>  		pr_warning(
>  "WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
> -- 
> 2.47.1.613.gc27f4b7a9f-goog
>
Ian Rogers Jan. 10, 2025, 4:44 a.m. UTC | #2
On Thu, Jan 9, 2025 at 5:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> > Whilst for many tools it is an expected behavior that failure to open
> > a perf event is a failure, ARM decided to name PMU events the same as
> > legacy events and then failed to rename such events on a server uncore
> > SLC PMU. As perf's default behavior when no PMU is specified is to
> > open the event on all PMUs that advertise/"have" the event, this
> > yielded failures when trying to make the priority of legacy and
> > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > legacy event user on ARM hardware may find their event opened on an
> > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > such events which this patch implements. Rather than have the skipping
> > conditional on running on ARM, the skipping is done on all
> > architectures as such a fundamental behavioral difference could lead
> > to problems with tools built/depending on perf.
> >
> > An example of perf record failing to open events on x86 is:
> > ```
> > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > Error:
> > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > "dmesg | grep -i perf" may provide additional information.
> >
> > Error:
> > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > "dmesg | grep -i perf" may provide additional information.
> >
> > Error:
> > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > The LLC-prefetch-read event is not supported.
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
>
> I'm afraid this can be too noisy.

The intention is to be noisy:
1) it matches the existing behavior, anything else is potentially a regression;
2) it only happens if trying to record on a PMU/event that doesn't
support recording, something that is currently an error and so we're
not motivated to change the behavior as no-one should be using it;
3) for the wildcard case the only offender is ARM's SLC PMU and the
appropriate fix there has always been to make the CPU cycle's event
name match the bus_cycles event name by calling it cpu_cycles -
something that doesn't conflict with a core PMU event name, the thing
that has introduced all these problems, patches, long email exchanges,
unfixed inconsistencies, etc.. If the errors aren't noisy then there
is little motivation for the ARM SLC PMU's event name to be fixed.

Thanks,
Ian
Arnaldo Carvalho de Melo Jan. 10, 2025, 2:18 p.m. UTC | #3
Adding Linus to the CC list as he participated in this discussion in the
past, so a heads up about changes in this area that are being further
discussed.

On Thu, Jan 09, 2025 at 05:25:03PM -0800, Namhyung Kim wrote:
> On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> > Whilst for many tools it is an expected behavior that failure to open
> > a perf event is a failure, ARM decided to name PMU events the same as
> > legacy events and then failed to rename such events on a server uncore
> > SLC PMU. As perf's default behavior when no PMU is specified is to
> > open the event on all PMUs that advertise/"have" the event, this
> > yielded failures when trying to make the priority of legacy and
> > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > legacy event user on ARM hardware may find their event opened on an
> > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > such events which this patch implements. Rather than have the skipping
> > conditional on running on ARM, the skipping is done on all
> > architectures as such a fundamental behavioral difference could lead
> > to problems with tools built/depending on perf.
> > 
> > An example of perf record failing to open events on x86 is:
> > ```
> > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > Error:
> > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > "dmesg | grep -i perf" may provide additional information.
> > 
> > Error:
> > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > "dmesg | grep -i perf" may provide additional information.
> > 
> > Error:
> > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > The LLC-prefetch-read event is not supported.
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
> 
> I'm afraid this can be too noisy.

Agreed.
 
> > $ perf report --stats
> > Aggregated stats:
> >                TOTAL events:      17255
> >                 MMAP events:        284  ( 1.6%)
> >                 COMM events:       1961  (11.4%)
> >                 EXIT events:          1  ( 0.0%)
> >                 FORK events:       1960  (11.4%)
> >               SAMPLE events:         87  ( 0.5%)
> >                MMAP2 events:      12836  (74.4%)
> >              KSYMBOL events:         83  ( 0.5%)
> >            BPF_EVENT events:         36  ( 0.2%)
> >       FINISHED_ROUND events:          2  ( 0.0%)
> >             ID_INDEX events:          1  ( 0.0%)
> >           THREAD_MAP events:          1  ( 0.0%)
> >              CPU_MAP events:          1  ( 0.0%)
> >            TIME_CONV events:          1  ( 0.0%)
> >        FINISHED_INIT events:          1  ( 0.0%)
> > cycles stats:
> >               SAMPLE events:         87
> > ```
> > 
> > If all events fail to open then the perf record will fail:
> > ```
> > $ perf record -e LLC-prefetch-read true
> > Error:
> > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > The LLC-prefetch-read event is not supported.
> > Error:
> > Failure to open any events for recording
> > ```
> > 
> > As an evlist may have dummy events that open when all command line
> > events fail we ignore dummy events when detecting if at least some
> > events open. This still permits the dummy event on its own to be used
> > as a permission check:
> > ```
> > $ perf record -e dummy true
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.046 MB perf.data ]
> > ```
> > but allows failure when a dummy event is implicilty inserted or when
> > there are insufficient permissions to open it:
> > ```
> > $ perf record -e LLC-prefetch-read -a true
> > Error:
> > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > The LLC-prefetch-read event is not supported.
> > Error:
> > Failure to open any events for recording
> > ```
> > 
> > The issue with legacy events is that on RISC-V they want the driver to
> > not have mappings from legacy to non-legacy config encodings for each
> > vendor/model due to size, complexity and difficulty to update. It was
> > reported that on ARM Apple-M? CPUs the legacy mapping in the driver
> > was broken and the sysfs/json events should always take precedent,
> > however, it isn't clear this is still the case. It is the case that
> > without working around this issue a legacy event like cycles without a
> > PMU can encode differently than when specified with a PMU - the
> > non-PMU version favoring legacy encodings, the PMU one avoiding legacy
> > encodings.
> > 
> > The patch removes events and then adjusts the idx value for each
> > evsel. This is done so that the dense xyarrays used for file
> > descriptors, etc. don't contain broken entries. As event opening
> > happens relatively late in the record process, use of the idx value
> > before the open will have become corrupted, so it is expected there
> > are latent bugs hidden behind this change - the change is best
> > effort. As the only vendor that has broken event names is ARM, this
> > will principally effect ARM users. They will also experience warning
> > messages like those above because of the uncore PMU advertising legacy
> > event names.
> > 
> > Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > Tested-by: James Clark <james.clark@linaro.org>
> > Tested-by: Leo Yan <leo.yan@arm.com>
> > Tested-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
> >  1 file changed, 41 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 5db1aedf48df..c0b8249a3787 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -961,7 +961,6 @@ static int record__config_tracking_events(struct record *rec)
> >  	 */
> >  	if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
> >  	    perf_pmus__num_core_pmus() > 1) {
> > -
> >  		/*
> >  		 * User space tasks can migrate between CPUs, so when tracing
> >  		 * selected CPUs, sideband for all CPUs is still needed.
> > @@ -1366,6 +1365,7 @@ static int record__open(struct record *rec)
> >  	struct perf_session *session = rec->session;
> >  	struct record_opts *opts = &rec->opts;
> >  	int rc = 0;
> > +	bool skipped = false;
> >  
> >  	evlist__for_each_entry(evlist, pos) {
> >  try_again:
> > @@ -1381,15 +1381,50 @@ static int record__open(struct record *rec)
> >  			        pos = evlist__reset_weak_group(evlist, pos, true);
> >  				goto try_again;
> >  			}
> > -			rc = -errno;
> >  			evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> > -			ui__error("%s\n", msg);
> > -			goto out;
> > +			ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> > +				  evsel__name(pos), evsel__pmu_name(pos), msg);
 
> How about changing it to pr_debug() and add below ...

That sounds better.
 
> > +			pos->skippable = true;
> > +			skipped = true;
> > +		} else {
> > +			pos->supported = true;
> >  		}
> > -
> > -		pos->supported = true;
> >  	}
> >  
> > +	if (skipped) {
> > +		struct evsel *tmp;
> > +		int idx = 0;
> > +		bool evlist_empty = true;
> > +
> > +		/* Remove evsels that failed to open and update indices. */
> > +		evlist__for_each_entry_safe(evlist, tmp, pos) {
> > +			if (pos->skippable) {
> > +				evlist__remove(evlist, pos);
> > +				continue;
> > +			}
> > +
> > +			/*
> > +			 * Note, dummy events may be command line parsed or
> > +			 * added by the tool. We care about supporting `perf
> > +			 * record -e dummy` which may be used as a permission
> > +			 * check. Dummy events that are added to the command
> > +			 * line and opened along with other events that fail,
> > +			 * will still fail as if the dummy events were tool
> > +			 * added events for the sake of code simplicity.
> > +			 */
> > +			if (!evsel__is_dummy_event(pos))
> > +				evlist_empty = false;
> > +		}
> > +		evlist__for_each_entry(evlist, pos) {
> > +			pos->core.idx = idx++;
> > +		}
> > +		/* If list is empty then fail. */
> > +		if (evlist_empty) {
> > +			ui__error("Failure to open any events for recording.\n");
> > +			rc = -1;
> > +			goto out;
> > +		}

> ... ?

> 		if (!verbose)
> 			ui__warning("Removed some unsupported events, use -v for details.\n");

And even this one would be best left for cases where we can determine
that its a new situation, i.e. one that should work and not the ones we
know that will not work already and thus so far didn't alarm the user
into thinking something is wrong.

Having the ones we know will fail as pr_debug() seems enough, I'd say.

- Arnaldo
 
> Thanks,
> Namhyung
> 
> 
> > +	}
> >  	if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
> >  		pr_warning(
> >  "WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
> > -- 
> > 2.47.1.613.gc27f4b7a9f-goog
> >
Ian Rogers Jan. 10, 2025, 4:42 p.m. UTC | #4
On Fri, Jan 10, 2025 at 6:18 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Adding Linus to the CC list as he participated in this discussion in the
> past, so a heads up about changes in this area that are being further
> discussed.

Linus blocks my email so I'm not sure of the point.

> On Thu, Jan 09, 2025 at 05:25:03PM -0800, Namhyung Kim wrote:
> > On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> > > Whilst for many tools it is an expected behavior that failure to open
> > > a perf event is a failure, ARM decided to name PMU events the same as
> > > legacy events and then failed to rename such events on a server uncore
> > > SLC PMU. As perf's default behavior when no PMU is specified is to
> > > open the event on all PMUs that advertise/"have" the event, this
> > > yielded failures when trying to make the priority of legacy and
> > > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > > legacy event user on ARM hardware may find their event opened on an
> > > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > > such events which this patch implements. Rather than have the skipping
> > > conditional on running on ARM, the skipping is done on all
> > > architectures as such a fundamental behavioral difference could lead
> > > to problems with tools built/depending on perf.
> > >
> > > An example of perf record failing to open events on x86 is:
> > > ```
> > > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > > Error:
> > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > "dmesg | grep -i perf" may provide additional information.
> > >
> > > Error:
> > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > "dmesg | grep -i perf" may provide additional information.
> > >
> > > Error:
> > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > The LLC-prefetch-read event is not supported.
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
> >
> > I'm afraid this can be too noisy.
>
> Agreed.
>
> > > $ perf report --stats
> > > Aggregated stats:
> > >                TOTAL events:      17255
> > >                 MMAP events:        284  ( 1.6%)
> > >                 COMM events:       1961  (11.4%)
> > >                 EXIT events:          1  ( 0.0%)
> > >                 FORK events:       1960  (11.4%)
> > >               SAMPLE events:         87  ( 0.5%)
> > >                MMAP2 events:      12836  (74.4%)
> > >              KSYMBOL events:         83  ( 0.5%)
> > >            BPF_EVENT events:         36  ( 0.2%)
> > >       FINISHED_ROUND events:          2  ( 0.0%)
> > >             ID_INDEX events:          1  ( 0.0%)
> > >           THREAD_MAP events:          1  ( 0.0%)
> > >              CPU_MAP events:          1  ( 0.0%)
> > >            TIME_CONV events:          1  ( 0.0%)
> > >        FINISHED_INIT events:          1  ( 0.0%)
> > > cycles stats:
> > >               SAMPLE events:         87
> > > ```
> > >
> > > If all events fail to open then the perf record will fail:
> > > ```
> > > $ perf record -e LLC-prefetch-read true
> > > Error:
> > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > The LLC-prefetch-read event is not supported.
> > > Error:
> > > Failure to open any events for recording
> > > ```
> > >
> > > As an evlist may have dummy events that open when all command line
> > > events fail we ignore dummy events when detecting if at least some
> > > events open. This still permits the dummy event on its own to be used
> > > as a permission check:
> > > ```
> > > $ perf record -e dummy true
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.046 MB perf.data ]
> > > ```
> > > but allows failure when a dummy event is implicilty inserted or when
> > > there are insufficient permissions to open it:
> > > ```
> > > $ perf record -e LLC-prefetch-read -a true
> > > Error:
> > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > The LLC-prefetch-read event is not supported.
> > > Error:
> > > Failure to open any events for recording
> > > ```
> > >
> > > The issue with legacy events is that on RISC-V they want the driver to
> > > not have mappings from legacy to non-legacy config encodings for each
> > > vendor/model due to size, complexity and difficulty to update. It was
> > > reported that on ARM Apple-M? CPUs the legacy mapping in the driver
> > > was broken and the sysfs/json events should always take precedent,
> > > however, it isn't clear this is still the case. It is the case that
> > > without working around this issue a legacy event like cycles without a
> > > PMU can encode differently than when specified with a PMU - the
> > > non-PMU version favoring legacy encodings, the PMU one avoiding legacy
> > > encodings.
> > >
> > > The patch removes events and then adjusts the idx value for each
> > > evsel. This is done so that the dense xyarrays used for file
> > > descriptors, etc. don't contain broken entries. As event opening
> > > happens relatively late in the record process, use of the idx value
> > > before the open will have become corrupted, so it is expected there
> > > are latent bugs hidden behind this change - the change is best
> > > effort. As the only vendor that has broken event names is ARM, this
> > > will principally effect ARM users. They will also experience warning
> > > messages like those above because of the uncore PMU advertising legacy
> > > event names.
> > >
> > > Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > Tested-by: James Clark <james.clark@linaro.org>
> > > Tested-by: Leo Yan <leo.yan@arm.com>
> > > Tested-by: Atish Patra <atishp@rivosinc.com>
> > > ---
> > >  tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
> > >  1 file changed, 41 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > > index 5db1aedf48df..c0b8249a3787 100644
> > > --- a/tools/perf/builtin-record.c
> > > +++ b/tools/perf/builtin-record.c
> > > @@ -961,7 +961,6 @@ static int record__config_tracking_events(struct record *rec)
> > >      */
> > >     if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
> > >         perf_pmus__num_core_pmus() > 1) {
> > > -
> > >             /*
> > >              * User space tasks can migrate between CPUs, so when tracing
> > >              * selected CPUs, sideband for all CPUs is still needed.
> > > @@ -1366,6 +1365,7 @@ static int record__open(struct record *rec)
> > >     struct perf_session *session = rec->session;
> > >     struct record_opts *opts = &rec->opts;
> > >     int rc = 0;
> > > +   bool skipped = false;
> > >
> > >     evlist__for_each_entry(evlist, pos) {
> > >  try_again:
> > > @@ -1381,15 +1381,50 @@ static int record__open(struct record *rec)
> > >                             pos = evlist__reset_weak_group(evlist, pos, true);
> > >                             goto try_again;
> > >                     }
> > > -                   rc = -errno;
> > >                     evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> > > -                   ui__error("%s\n", msg);
> > > -                   goto out;
> > > +                   ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> > > +                             evsel__name(pos), evsel__pmu_name(pos), msg);
>
> > How about changing it to pr_debug() and add below ...
>
> That sounds better.
>
> > > +                   pos->skippable = true;
> > > +                   skipped = true;
> > > +           } else {
> > > +                   pos->supported = true;
> > >             }
> > > -
> > > -           pos->supported = true;
> > >     }
> > >
> > > +   if (skipped) {
> > > +           struct evsel *tmp;
> > > +           int idx = 0;
> > > +           bool evlist_empty = true;
> > > +
> > > +           /* Remove evsels that failed to open and update indices. */
> > > +           evlist__for_each_entry_safe(evlist, tmp, pos) {
> > > +                   if (pos->skippable) {
> > > +                           evlist__remove(evlist, pos);
> > > +                           continue;
> > > +                   }
> > > +
> > > +                   /*
> > > +                    * Note, dummy events may be command line parsed or
> > > +                    * added by the tool. We care about supporting `perf
> > > +                    * record -e dummy` which may be used as a permission
> > > +                    * check. Dummy events that are added to the command
> > > +                    * line and opened along with other events that fail,
> > > +                    * will still fail as if the dummy events were tool
> > > +                    * added events for the sake of code simplicity.
> > > +                    */
> > > +                   if (!evsel__is_dummy_event(pos))
> > > +                           evlist_empty = false;
> > > +           }
> > > +           evlist__for_each_entry(evlist, pos) {
> > > +                   pos->core.idx = idx++;
> > > +           }
> > > +           /* If list is empty then fail. */
> > > +           if (evlist_empty) {
> > > +                   ui__error("Failure to open any events for recording.\n");
> > > +                   rc = -1;
> > > +                   goto out;
> > > +           }
>
> > ... ?
>
> >               if (!verbose)
> >                       ui__warning("Removed some unsupported events, use -v for details.\n");
>
> And even this one would be best left for cases where we can determine
> that its a new situation, i.e. one that should work and not the ones we
> know that will not work already and thus so far didn't alarm the user
> into thinking something is wrong.
>
> Having the ones we know will fail as pr_debug() seems enough, I'd say.

This means that:
```
$ perf record -e data_read,LLC-prefetch-read -a sleep 0.1
```
will fail (as data_read is a memory controller event and the LLC
doesn't support sampling) with something like:
```
Error:
Failure to open any events for recording
```
Which feels a bit minimal. As I already mentioned, it is also a
behavior change and so has the potential to break scripts dependent on
the failure information.

A patch lowering the priority of error messages should be independent
of the 4 changes here. I'd be happy if someone follows this series
with a patch doing it.

Thanks,
Ian
Namhyung Kim Jan. 10, 2025, 6:55 p.m. UTC | #5
On Thu, Jan 09, 2025 at 08:44:38PM -0800, Ian Rogers wrote:
> On Thu, Jan 9, 2025 at 5:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> > > Whilst for many tools it is an expected behavior that failure to open
> > > a perf event is a failure, ARM decided to name PMU events the same as
> > > legacy events and then failed to rename such events on a server uncore
> > > SLC PMU. As perf's default behavior when no PMU is specified is to
> > > open the event on all PMUs that advertise/"have" the event, this
> > > yielded failures when trying to make the priority of legacy and
> > > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > > legacy event user on ARM hardware may find their event opened on an
> > > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > > such events which this patch implements. Rather than have the skipping
> > > conditional on running on ARM, the skipping is done on all
> > > architectures as such a fundamental behavioral difference could lead
> > > to problems with tools built/depending on perf.
> > >
> > > An example of perf record failing to open events on x86 is:
> > > ```
> > > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > > Error:
> > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > "dmesg | grep -i perf" may provide additional information.
> > >
> > > Error:
> > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > "dmesg | grep -i perf" may provide additional information.
> > >
> > > Error:
> > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > The LLC-prefetch-read event is not supported.
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
> >
> > I'm afraid this can be too noisy.
> 
> The intention is to be noisy:
> 1) it matches the existing behavior, anything else is potentially a regression;

Well.. I think you're changing the behavior. :)  Also currently it just
fails on the first event so it won't be too much noisy.

  $ perf record -e data_read,data_write,LLC-prefetch-read -a sleep 0.1
  event syntax error: 'data_read,data_write,LLC-prefetch-read'
                       \___ Bad event name
  
  Unable to find event on a PMU of 'data_read'
  Run 'perf list' for a list of valid events
  
   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]
  
      -e, --event <event>   event selector. use 'perf list' to list available events


> 2) it only happens if trying to record on a PMU/event that doesn't
> support recording, something that is currently an error and so we're
> not motivated to change the behavior as no-one should be using it;

It was caught by Linus, so we know at least one (very important) user.


> 3) for the wildcard case the only offender is ARM's SLC PMU and the
> appropriate fix there has always been to make the CPU cycle's event
> name match the bus_cycles event name by calling it cpu_cycles -
> something that doesn't conflict with a core PMU event name, the thing
> that has introduced all these problems, patches, long email exchanges,
> unfixed inconsistencies, etc.. If the errors aren't noisy then there
> is little motivation for the ARM SLC PMU's event name to be fixed.

I understand your concern but I'm not sure it's the best way to fix the
issue.

James, Leo, is there any chance you can rename the SLC cycles event in
the kernel driver?

Thanks,
Namhyung
Ian Rogers Jan. 10, 2025, 7:18 p.m. UTC | #6
On Fri, Jan 10, 2025 at 10:55 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Jan 09, 2025 at 08:44:38PM -0800, Ian Rogers wrote:
> > On Thu, Jan 9, 2025 at 5:25 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> > > > Whilst for many tools it is an expected behavior that failure to open
> > > > a perf event is a failure, ARM decided to name PMU events the same as
> > > > legacy events and then failed to rename such events on a server uncore
> > > > SLC PMU. As perf's default behavior when no PMU is specified is to
> > > > open the event on all PMUs that advertise/"have" the event, this
> > > > yielded failures when trying to make the priority of legacy and
> > > > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > > > legacy event user on ARM hardware may find their event opened on an
> > > > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > > > such events which this patch implements. Rather than have the skipping
> > > > conditional on running on ARM, the skipping is done on all
> > > > architectures as such a fundamental behavioral difference could lead
> > > > to problems with tools built/depending on perf.
> > > >
> > > > An example of perf record failing to open events on x86 is:
> > > > ```
> > > > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > > > Error:
> > > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > > "dmesg | grep -i perf" may provide additional information.
> > > >
> > > > Error:
> > > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > > "dmesg | grep -i perf" may provide additional information.
> > > >
> > > > Error:
> > > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > > The LLC-prefetch-read event is not supported.
> > > > [ perf record: Woken up 1 times to write data ]
> > > > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
> > >
> > > I'm afraid this can be too noisy.
> >
> > The intention is to be noisy:
> > 1) it matches the existing behavior, anything else is potentially a regression;
>
> Well.. I think you're changing the behavior. :)  Also currently it just
> fails on the first event so it won't be too much noisy.
>
>   $ perf record -e data_read,data_write,LLC-prefetch-read -a sleep 0.1
>   event syntax error: 'data_read,data_write,LLC-prefetch-read'
>                        \___ Bad event name
>
>   Unable to find event on a PMU of 'data_read'
>   Run 'perf list' for a list of valid events
>
>    Usage: perf record [<options>] [<command>]
>       or: perf record [<options>] -- <command> [<options>]
>
>       -e, --event <event>   event selector. use 'perf list' to list available events

Fwiw, this error is an event parsing error not an event opening error.
You need to select an uncore event, I was using data_read which exists
in the uncore_imc_free_running PMUs on Intel tigerlake. Here is the
existing error message:
```
$ perf record -e data_read -a true
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (data_read).
"dmesg | grep -i perf" may provide additional information.
```
and here it with the series:
```
$ perf record -e data_read -a true
Error:
Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0'
which will be removed.
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (data_read).
"dmesg | grep -i perf" may provide additional information.

Error:
Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1'
which will be removed.
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (data_read).
"dmesg | grep -i perf" may provide additional information.

Error:
Failure to open any events for recording.
```
and here is what it would be with pr_debug:
```
$ perf record -e data_read -a true
Error:
Failure to open any events for recording.
```
I believe this last output is worst because:
1) If not all events fail to open there is no error reported unless I
know to run with -v, which will also bring a bunch more noise with it,
2) I don't see the PMU / event name and "Invalid argument" indicating
what has gone wrong again unless I know to run with -v and get all the
verbose noise with that.

Yes it is noisy on 1 platform for 1 event due to an ARM PMU event name
bug that ARM should have long ago fixed. That should be fixed rather
than hiding errors and making users think they are recording samples
when silently they're not - or they need to search through verbose
output to try to find out if something broke.

> > 2) it only happens if trying to record on a PMU/event that doesn't
> > support recording, something that is currently an error and so we're
> > not motivated to change the behavior as no-one should be using it;
>
> It was caught by Linus, so we know at least one (very important) user.

If they care enough then specifying the PMU with the event will avoid
any warning and has always been a fix for this issue. It was the first
proposed workaround for Linus.

> > 3) for the wildcard case the only offender is ARM's SLC PMU and the
> > appropriate fix there has always been to make the CPU cycle's event
> > name match the bus_cycles event name by calling it cpu_cycles -
> > something that doesn't conflict with a core PMU event name, the thing
> > that has introduced all these problems, patches, long email exchanges,
> > unfixed inconsistencies, etc.. If the errors aren't noisy then there
> > is little motivation for the ARM SLC PMU's event name to be fixed.
>
> I understand your concern but I'm not sure it's the best way to fix the
> issue.

Right, I'm similarly concerned about hiding legitimate warning/error
messages because of 1 event on 1 PMU on 1 architecture because of how
perf gets driven by 1 user. Yes, when you break you can wade through
the verbose output but imo the verbose output was never intended to be
used in that way.

Thanks,
Ian
Namhyung Kim Jan. 10, 2025, 7:26 p.m. UTC | #7
On Fri, Jan 10, 2025 at 08:42:02AM -0800, Ian Rogers wrote:
> On Fri, Jan 10, 2025 at 6:18 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Adding Linus to the CC list as he participated in this discussion in the
> > past, so a heads up about changes in this area that are being further
> > discussed.
> 
> Linus blocks my email so I'm not sure of the point.

That's unfortunate, but he should be able to see others' reply.

> 
> > On Thu, Jan 09, 2025 at 05:25:03PM -0800, Namhyung Kim wrote:
> > > On Thu, Jan 09, 2025 at 02:21:08PM -0800, Ian Rogers wrote:
> > > > Whilst for many tools it is an expected behavior that failure to open
> > > > a perf event is a failure, ARM decided to name PMU events the same as
> > > > legacy events and then failed to rename such events on a server uncore
> > > > SLC PMU. As perf's default behavior when no PMU is specified is to
> > > > open the event on all PMUs that advertise/"have" the event, this
> > > > yielded failures when trying to make the priority of legacy and
> > > > sysfs/json events uniform - something requested by RISC-V and ARM. A
> > > > legacy event user on ARM hardware may find their event opened on an
> > > > uncore PMU which for perf record will fail. Arnaldo suggested skipping
> > > > such events which this patch implements. Rather than have the skipping
> > > > conditional on running on ARM, the skipping is done on all
> > > > architectures as such a fundamental behavioral difference could lead
> > > > to problems with tools built/depending on perf.
> > > >
> > > > An example of perf record failing to open events on x86 is:
> > > > ```
> > > > $ perf record -e data_read,cycles,LLC-prefetch-read -a sleep 0.1
> > > > Error:
> > > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0' which will be removed.
> > > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > > "dmesg | grep -i perf" may provide additional information.
> > > >
> > > > Error:
> > > > Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1' which will be removed.
> > > > The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (data_read).
> > > > "dmesg | grep -i perf" may provide additional information.
> > > >
> > > > Error:
> > > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > > The LLC-prefetch-read event is not supported.
> > > > [ perf record: Woken up 1 times to write data ]
> > > > [ perf record: Captured and wrote 2.188 MB perf.data (87 samples) ]
> > >
> > > I'm afraid this can be too noisy.
> >
> > Agreed.
> >
> > > > $ perf report --stats
> > > > Aggregated stats:
> > > >                TOTAL events:      17255
> > > >                 MMAP events:        284  ( 1.6%)
> > > >                 COMM events:       1961  (11.4%)
> > > >                 EXIT events:          1  ( 0.0%)
> > > >                 FORK events:       1960  (11.4%)
> > > >               SAMPLE events:         87  ( 0.5%)
> > > >                MMAP2 events:      12836  (74.4%)
> > > >              KSYMBOL events:         83  ( 0.5%)
> > > >            BPF_EVENT events:         36  ( 0.2%)
> > > >       FINISHED_ROUND events:          2  ( 0.0%)
> > > >             ID_INDEX events:          1  ( 0.0%)
> > > >           THREAD_MAP events:          1  ( 0.0%)
> > > >              CPU_MAP events:          1  ( 0.0%)
> > > >            TIME_CONV events:          1  ( 0.0%)
> > > >        FINISHED_INIT events:          1  ( 0.0%)
> > > > cycles stats:
> > > >               SAMPLE events:         87
> > > > ```
> > > >
> > > > If all events fail to open then the perf record will fail:
> > > > ```
> > > > $ perf record -e LLC-prefetch-read true
> > > > Error:
> > > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > > The LLC-prefetch-read event is not supported.
> > > > Error:
> > > > Failure to open any events for recording
> > > > ```
> > > >
> > > > As an evlist may have dummy events that open when all command line
> > > > events fail we ignore dummy events when detecting if at least some
> > > > events open. This still permits the dummy event on its own to be used
> > > > as a permission check:
> > > > ```
> > > > $ perf record -e dummy true
> > > > [ perf record: Woken up 1 times to write data ]
> > > > [ perf record: Captured and wrote 0.046 MB perf.data ]
> > > > ```
> > > > but allows failure when a dummy event is implicilty inserted or when
> > > > there are insufficient permissions to open it:
> > > > ```
> > > > $ perf record -e LLC-prefetch-read -a true
> > > > Error:
> > > > Failure to open event 'LLC-prefetch-read' on PMU 'cpu' which will be removed.
> > > > The LLC-prefetch-read event is not supported.
> > > > Error:
> > > > Failure to open any events for recording
> > > > ```
> > > >
> > > > The issue with legacy events is that on RISC-V they want the driver to
> > > > not have mappings from legacy to non-legacy config encodings for each
> > > > vendor/model due to size, complexity and difficulty to update. It was
> > > > reported that on ARM Apple-M? CPUs the legacy mapping in the driver
> > > > was broken and the sysfs/json events should always take precedent,
> > > > however, it isn't clear this is still the case. It is the case that
> > > > without working around this issue a legacy event like cycles without a
> > > > PMU can encode differently than when specified with a PMU - the
> > > > non-PMU version favoring legacy encodings, the PMU one avoiding legacy
> > > > encodings.
> > > >
> > > > The patch removes events and then adjusts the idx value for each
> > > > evsel. This is done so that the dense xyarrays used for file
> > > > descriptors, etc. don't contain broken entries. As event opening
> > > > happens relatively late in the record process, use of the idx value
> > > > before the open will have become corrupted, so it is expected there
> > > > are latent bugs hidden behind this change - the change is best
> > > > effort. As the only vendor that has broken event names is ARM, this
> > > > will principally effect ARM users. They will also experience warning
> > > > messages like those above because of the uncore PMU advertising legacy
> > > > event names.
> > > >
> > > > Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
> > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > Tested-by: James Clark <james.clark@linaro.org>
> > > > Tested-by: Leo Yan <leo.yan@arm.com>
> > > > Tested-by: Atish Patra <atishp@rivosinc.com>
> > > > ---
> > > >  tools/perf/builtin-record.c | 47 ++++++++++++++++++++++++++++++++-----
> > > >  1 file changed, 41 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > > > index 5db1aedf48df..c0b8249a3787 100644
> > > > --- a/tools/perf/builtin-record.c
> > > > +++ b/tools/perf/builtin-record.c
> > > > @@ -961,7 +961,6 @@ static int record__config_tracking_events(struct record *rec)
> > > >      */
> > > >     if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
> > > >         perf_pmus__num_core_pmus() > 1) {
> > > > -
> > > >             /*
> > > >              * User space tasks can migrate between CPUs, so when tracing
> > > >              * selected CPUs, sideband for all CPUs is still needed.
> > > > @@ -1366,6 +1365,7 @@ static int record__open(struct record *rec)
> > > >     struct perf_session *session = rec->session;
> > > >     struct record_opts *opts = &rec->opts;
> > > >     int rc = 0;
> > > > +   bool skipped = false;
> > > >
> > > >     evlist__for_each_entry(evlist, pos) {
> > > >  try_again:
> > > > @@ -1381,15 +1381,50 @@ static int record__open(struct record *rec)
> > > >                             pos = evlist__reset_weak_group(evlist, pos, true);
> > > >                             goto try_again;
> > > >                     }
> > > > -                   rc = -errno;
> > > >                     evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> > > > -                   ui__error("%s\n", msg);
> > > > -                   goto out;
> > > > +                   ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> > > > +                             evsel__name(pos), evsel__pmu_name(pos), msg);
> >
> > > How about changing it to pr_debug() and add below ...
> >
> > That sounds better.
> >
> > > > +                   pos->skippable = true;
> > > > +                   skipped = true;
> > > > +           } else {
> > > > +                   pos->supported = true;
> > > >             }
> > > > -
> > > > -           pos->supported = true;
> > > >     }
> > > >
> > > > +   if (skipped) {
> > > > +           struct evsel *tmp;
> > > > +           int idx = 0;
> > > > +           bool evlist_empty = true;
> > > > +
> > > > +           /* Remove evsels that failed to open and update indices. */
> > > > +           evlist__for_each_entry_safe(evlist, tmp, pos) {
> > > > +                   if (pos->skippable) {
> > > > +                           evlist__remove(evlist, pos);
> > > > +                           continue;
> > > > +                   }
> > > > +
> > > > +                   /*
> > > > +                    * Note, dummy events may be command line parsed or
> > > > +                    * added by the tool. We care about supporting `perf
> > > > +                    * record -e dummy` which may be used as a permission
> > > > +                    * check. Dummy events that are added to the command
> > > > +                    * line and opened along with other events that fail,
> > > > +                    * will still fail as if the dummy events were tool
> > > > +                    * added events for the sake of code simplicity.
> > > > +                    */
> > > > +                   if (!evsel__is_dummy_event(pos))
> > > > +                           evlist_empty = false;
> > > > +           }
> > > > +           evlist__for_each_entry(evlist, pos) {
> > > > +                   pos->core.idx = idx++;
> > > > +           }
> > > > +           /* If list is empty then fail. */
> > > > +           if (evlist_empty) {
> > > > +                   ui__error("Failure to open any events for recording.\n");
> > > > +                   rc = -1;
> > > > +                   goto out;
> > > > +           }
> >
> > > ... ?
> >
> > >               if (!verbose)
> > >                       ui__warning("Removed some unsupported events, use -v for details.\n");
> >
> > And even this one would be best left for cases where we can determine
> > that its a new situation, i.e. one that should work and not the ones we
> > know that will not work already and thus so far didn't alarm the user
> > into thinking something is wrong.
> >
> > Having the ones we know will fail as pr_debug() seems enough, I'd say.
> 
> This means that:
> ```
> $ perf record -e data_read,LLC-prefetch-read -a sleep 0.1
> ```
> will fail (as data_read is a memory controller event and the LLC
> doesn't support sampling) with something like:
> ```
> Error:
> Failure to open any events for recording
> ```
> Which feels a bit minimal. As I already mentioned, it is also a
> behavior change and so has the potential to break scripts dependent on
> the failure information.

I don't think it's about failure behavior, the concern is the error
messages.  It can take too much screen space when users give a long list
of invalid events.  And unfortunately the current error message for
checking dmesg is not very helpful.

Anyway you can add this line too: "Use -v to see the details."

> 
> A patch lowering the priority of error messages should be independent
> of the 4 changes here. I'd be happy if someone follows this series
> with a patch doing it.

I think the error behavior is a part of this change.

Thanks,
Namhyung
diff mbox series

Patch

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5db1aedf48df..c0b8249a3787 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -961,7 +961,6 @@  static int record__config_tracking_events(struct record *rec)
 	 */
 	if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
 	    perf_pmus__num_core_pmus() > 1) {
-
 		/*
 		 * User space tasks can migrate between CPUs, so when tracing
 		 * selected CPUs, sideband for all CPUs is still needed.
@@ -1366,6 +1365,7 @@  static int record__open(struct record *rec)
 	struct perf_session *session = rec->session;
 	struct record_opts *opts = &rec->opts;
 	int rc = 0;
+	bool skipped = false;
 
 	evlist__for_each_entry(evlist, pos) {
 try_again:
@@ -1381,15 +1381,50 @@  static int record__open(struct record *rec)
 			        pos = evlist__reset_weak_group(evlist, pos, true);
 				goto try_again;
 			}
-			rc = -errno;
 			evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
-			ui__error("%s\n", msg);
-			goto out;
+			ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
+				  evsel__name(pos), evsel__pmu_name(pos), msg);
+			pos->skippable = true;
+			skipped = true;
+		} else {
+			pos->supported = true;
 		}
-
-		pos->supported = true;
 	}
 
+	if (skipped) {
+		struct evsel *tmp;
+		int idx = 0;
+		bool evlist_empty = true;
+
+		/* Remove evsels that failed to open and update indices. */
+		evlist__for_each_entry_safe(evlist, tmp, pos) {
+			if (pos->skippable) {
+				evlist__remove(evlist, pos);
+				continue;
+			}
+
+			/*
+			 * Note, dummy events may be command line parsed or
+			 * added by the tool. We care about supporting `perf
+			 * record -e dummy` which may be used as a permission
+			 * check. Dummy events that are added to the command
+			 * line and opened along with other events that fail,
+			 * will still fail as if the dummy events were tool
+			 * added events for the sake of code simplicity.
+			 */
+			if (!evsel__is_dummy_event(pos))
+				evlist_empty = false;
+		}
+		evlist__for_each_entry(evlist, pos) {
+			pos->core.idx = idx++;
+		}
+		/* If list is empty then fail. */
+		if (evlist_empty) {
+			ui__error("Failure to open any events for recording.\n");
+			rc = -1;
+			goto out;
+		}
+	}
 	if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
 		pr_warning(
 "WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"