Message ID | 20241217115610.371755-6-james.clark@linaro.org (mailing list archive) |
---|---|
State | New |
Series | perf: arm_spe: Add format option for discard mode |
On Tue, Dec 17, 2024 at 3:56 AM James Clark <james.clark@linaro.org> wrote:
>
> Document the flag, hint what it's used for and give an example with
> other useful options to get minimal output.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  tools/perf/Documentation/perf-arm-spe.txt | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt
> index de2b0b479249..588eead438bc 100644
> --- a/tools/perf/Documentation/perf-arm-spe.txt
> +++ b/tools/perf/Documentation/perf-arm-spe.txt
> @@ -150,6 +150,7 @@ arm_spe/load_filter=1,min_latency=10/'
>    pct_enable=1 - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requires privilege
>    store_filter=1 - collect stores only (PMSFCR.ST)
>    ts_enable=1 - enable timestamping with value of generic timer (PMSCR.TS)
> +  discard=1 - enable SPE PMU events but don't collect sample data - see 'Discard mode' (PMBLIMITR.FM = DISCARD)
>
>  +++*+++ Latency is the total latency from the point at which sampling started on that instruction, rather
>  than only the execution latency.
> @@ -220,6 +221,16 @@ Common errors
>
>  Increase sampling interval (see above)
>
> +Discard mode
> +~~~~~~~~~~~~
> +
> +SPE PMU events can be used without the overhead of collecting sample data if
> +discard mode is supported (optional from Armv8.6). First run a system wide SPE
> +session (or on the core of interest) using options to minimize output. Then run
> +perf stat:
> +
> +  perf record -e arm_spe/discard/ -a -N -B --no-bpf-event -o - > /dev/null &
> +  perf stat -e SAMPLE_FEED_LD

Perhaps clarify this should be an ARM SPE event? It seems strange to
have one perf command affect a later one, the purpose of things like
event multiplexing is to hide the hardware limits.
I'd prefer if the last bit was like:

```
Then run perf stat with an SPE event on the same PMU:

  perf record -e arm_spe/discard/ -a -N -B --no-bpf-event -o - > /dev/null &
  perf stat -e arm_spe/SAMPLE_FEED_LD/
```

Thanks,
Ian
diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt
index de2b0b479249..588eead438bc 100644
--- a/tools/perf/Documentation/perf-arm-spe.txt
+++ b/tools/perf/Documentation/perf-arm-spe.txt
@@ -150,6 +150,7 @@ arm_spe/load_filter=1,min_latency=10/'
   pct_enable=1 - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requires privilege
   store_filter=1 - collect stores only (PMSFCR.ST)
   ts_enable=1 - enable timestamping with value of generic timer (PMSCR.TS)
+  discard=1 - enable SPE PMU events but don't collect sample data - see 'Discard mode' (PMBLIMITR.FM = DISCARD)
 
 +++*+++ Latency is the total latency from the point at which sampling started on that instruction, rather
 than only the execution latency.
@@ -220,6 +221,16 @@ Common errors
 
 Increase sampling interval (see above)
 
+Discard mode
+~~~~~~~~~~~~
+
+SPE PMU events can be used without the overhead of collecting sample data if
+discard mode is supported (optional from Armv8.6). First run a system wide SPE
+session (or on the core of interest) using options to minimize output. Then run
+perf stat:
+
+  perf record -e arm_spe/discard/ -a -N -B --no-bpf-event -o - > /dev/null &
+  perf stat -e SAMPLE_FEED_LD
 
 SEE ALSO
 --------
Document the flag, hint what it's used for and give an example with
other useful options to get minimal output.

Signed-off-by: James Clark <james.clark@linaro.org>
---
 tools/perf/Documentation/perf-arm-spe.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)
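
Since discard mode is optional from Armv8.6, it may help to check for it before starting the background session from the patch. The following is a minimal sketch, not part of the patch. It assumes the driver exposes the new option as a `format/discard` attribute under each `arm_spe_*` PMU directory in sysfs, and that the attribute is absent when the hardware lacks discard mode. Neither assumption is confirmed in this thread. The function name `spe_discard_pmus` and its optional sysfs-root argument are hypothetical, introduced only for illustration and testing.

```shell
#!/bin/sh
# Sketch: list the arm_spe PMUs that advertise a 'discard' format option.
# ASSUMPTION: the option shows up as
#   <root>/bus/event_source/devices/arm_spe_*/format/discard
# and is missing when discard mode is unsupported.
# The optional first argument overrides the sysfs root (default: /sys).
spe_discard_pmus() {
    root="${1:-/sys}"
    for fmt in "$root"/bus/event_source/devices/arm_spe_*/format/discard; do
        # An unmatched glob is left as a literal string; skip it.
        [ -e "$fmt" ] || continue
        # Strip the trailing /format/discard and print the PMU name.
        pmu_dir=${fmt%/format/discard}
        basename "$pmu_dir"
    done
}
```

If the function prints nothing, no SPE PMU on the system appears to offer the option under these assumptions, and the `perf record -e arm_spe/discard/` session from the patch would be expected to fail.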