[v2,0/7] perf arm-spe: Enable timestamp

Message ID 20210403072346.30430-1-leo.yan@linaro.org (mailing list archive)

Leo Yan April 3, 2021, 7:23 a.m. UTC
As we know, the timestamp is important for AUX trace; it's mainly used
to correlate perf events with AUX trace, which allows events to be
generated in time order.  There are several good examples of enabling
timestamps for AUX trace (like Intel-pt, Intel-bts, etc).

The conversion between TSC and kernel timestamp is now supported on
Arm64.  TSC is a naming convention from x86, but perf has reused it to
cover the Arm arch timer counter.

This patch set enables timestamps for Arm SPE trace.  It reads the TSC
parameters from the mmap page and stores them in the auxtrace info
structure; the TSC parameters are used to convert between the timer
counter and kernel time, and the result is applied to Arm SPE samples.
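
For readers unfamiliar with the conversion, here is a minimal sketch of
the arithmetic, assuming the three parameters exposed in the perf mmap
page (time_shift, time_mult, time_zero).  The struct and function names
below are made up for illustration and are not the ones used in this
series; the sketch also ignores counter wrap handling.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical container for the three TSC parameters.  The field
 * names follow the perf mmap page; the struct name is illustrative.
 */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Convert a raw timer counter value to kernel time in nanoseconds. */
static uint64_t counter_to_ns(uint64_t cyc, const struct tsc_conv *tc)
{
	/* Split the counter so the multiplication does not overflow early. */
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}

/* Inverse direction: kernel time in nanoseconds back to a counter value. */
static uint64_t ns_to_counter(uint64_t ns, const struct tsc_conv *tc)
{
	uint64_t t, quot, rem;

	assert(tc->time_mult != 0); /* a zero multiplier would divide by zero */

	t = ns - tc->time_zero;
	quot = t / tc->time_mult;
	rem = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

The first helper is the direction needed to stamp SPE samples with
kernel time; the second is the direction needed to compare a perf event
time against the trace's counter values.  Because of integer rounding,
the two are not exact inverses for every input.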

This patch set applies cleanly on the perf/core branch with:

  commit 6859bc0e78c6 ("perf stat: Improve readability of shadow stats")

This patch series has been tested on a Hisilicon D06 platform.

After:

  # perf script -F comm,time,cpu,pid,dso,ip,sym

              perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec ([kernel.kallsyms])
              perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680367:  ffffbd12539b03ec __arch_clear_user ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])

Changes from v1:
* Rebased patch series on the latest perf/core branch;
* Fixed the patch for dumping TSC parameters to support both the
  old and new auxtrace info formats.


Leo Yan (7):
  perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS
  perf arm-spe: Store TSC parameters in auxtrace info
  perf arm-spe: Dump TSC parameters
  perf arm-spe: Convert event kernel time to counter value
  perf arm-spe: Assign kernel time to synthesized event
  perf arm-spe: Bail out if the trace is later than perf event
  perf arm-spe: Don't wait for PERF_RECORD_EXIT event

 tools/perf/arch/arm64/util/arm-spe.c | 23 +++++++
 tools/perf/util/arm-spe.c            | 89 +++++++++++++++++++++++-----
 tools/perf/util/arm-spe.h            |  7 ++-
 3 files changed, 103 insertions(+), 16 deletions(-)

Comments

Al Grant April 6, 2021, 9:38 a.m. UTC | #1
> -----Original Message-----
> From: Leo Yan <leo.yan@linaro.org>
> Sent: 03 April 2021 08:24
> To: Arnaldo Carvalho de Melo <acme@kernel.org>; John Garry
> <john.garry@huawei.com>; Will Deacon <will@kernel.org>; Mathieu Poirier
> <mathieu.poirier@linaro.org>; James Clark <James.Clark@arm.com>; Al Grant
> <Al.Grant@arm.com>; Peter Zijlstra <peterz@infradead.org>; Ingo Molnar
> <mingo@redhat.com>; Mark Rutland <Mark.Rutland@arm.com>; Alexander
> Shishkin <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@redhat.com>;
> Namhyung Kim <namhyung@kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org
> Cc: leo.yan@linaro.org
> Subject: [PATCH v2 0/7] perf arm-spe: Enable timestamp
> 
> As we know, the timestamp is important for AUX trace; it's mainly used to
> correlate between perf events and AUX trace, allows to generate events with
> time ordered manner.  There have several good examples of enabling timestamp
> for AUX trace (like Intel-pt, Intel-bts, etc).
> 
> Since the conversion between TSC and kernel timestamp has been supported on
> Arm64, TSC is a naming convention from x86, but perf now has reused it to
> support Arm arch timer counter.
> 
> This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
> parameters from mmap page and stores into auxtrace info structure;

Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
capture the TSC parameters from the mmap page? If a generic mechanism
exists it would be better to use it, otherwise we'll have to do this again for
future trace formats.

perf_read_tsc_conversion and perf_event__synth_time_conv are currently
in arch/x86/util/tsc.c, but nothing in them is x86-specific and they could be
moved somewhere more generic.

Al


> the TSC
> parameters are used for conversion between timer counter and kernel time and
> which is applied for Arm SPE samples.
> 
> This patch set can be clearly applied on perf/core branch with:
> 
>   commit 6859bc0e78c6 ("perf stat: Improve readability of shadow stats")
> 
> Ths patch series has been tested on Hisilicon D06 platform.
> 
> After:
> 
>   # perf script -F comm,time,cpu,pid,dso,ip,sym
> 
>               perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec
> ([kernel.kallsyms])
>               perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec
> ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680367:  ffffbd12539b03ec
> __arch_clear_user ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
> 
> Changes from v1:
> * Rebased patch series on the latest perf/core branch;
> * Fixed the patch for dumping TSC parameters to support both the
>   older and new auxtrace info format.
> 
> 
> Leo Yan (7):
>   perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS
>   perf arm-spe: Store TSC parameters in auxtrace info
>   perf arm-spe: Dump TSC parameters
>   perf arm-spe: Convert event kernel time to counter value
>   perf arm-spe: Assign kernel time to synthesized event
>   perf arm-spe: Bail out if the trace is later than perf event
>   perf arm-spe: Don't wait for PERF_RECORD_EXIT event
> 
>  tools/perf/arch/arm64/util/arm-spe.c | 23 +++++++
>  tools/perf/util/arm-spe.c            | 89 +++++++++++++++++++++++-----
>  tools/perf/util/arm-spe.h            |  7 ++-
>  3 files changed, 103 insertions(+), 16 deletions(-)
> 
> --
> 2.25.1
Leo Yan April 7, 2021, 1:15 p.m. UTC | #2
Hi Al,

On Tue, Apr 06, 2021 at 09:38:32AM +0000, Al Grant wrote:

[...]

> > This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
> > parameters from mmap page and stores into auxtrace info structure;
> 
> Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
> capture the TSC parameters from the mmap page? If a generic mechanism
> exists it would be better to use it, otherwise we'll have to do this again for
> future trace formats.

Good point!  Actually, the "perf record" tool already synthesizes the
PERF_RECORD_TIME_CONV event.  This patch series follows the
implementation from Intel-PT, so the question is why the existing
implementations (like Intel-PT, Intel-BTS) don't directly use
PERF_RECORD_TIME_CONV for retrieving the TSC parameters.

I agree that using PERF_RECORD_TIME_CONV for the TSC parameters is
better than extending auxtrace info.  I will experiment with this.

> perf_read_tsc_conversion and perf_event__synth_time_conv are currently
> in arch/x86/util/tsc.c, but nothing in them is x86-specific and they could be
> moved somewhere more generic.

This is no longer true on the mainline kernel; these functions have
already been moved into the file util/tsc.c.

Thanks for the suggestions,
Leo
Adrian Hunter April 7, 2021, 1:28 p.m. UTC | #3
On 7/04/21 4:15 pm, Leo Yan wrote:
> Hi Al,
> 
> On Tue, Apr 06, 2021 at 09:38:32AM +0000, Al Grant wrote:
> 
> [...]
> 
>>> This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
>>> parameters from mmap page and stores into auxtrace info structure;
>>
>> Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
>> capture the TSC parameters from the mmap page? If a generic mechanism
>> exists it would be better to use it, otherwise we'll have to do this again for
>> future trace formats.
> 
> Good point!  Actually "perf record" tool has synthesized event
> PERF_RECORD_TIME_CONV.  This patch series is studying the
> implementation from Intel-PT, so the question is why the existed
> implementations (like Intel-PT, Intel-BTS) don't directly use
> PERF_RECORD_TIME_CONV for retriving TSC parameters.

PERF_RECORD_TIME_CONV was added later because the TSC information is
needed by jitdump.
Leo Yan April 7, 2021, 1:40 p.m. UTC | #4
On Wed, Apr 07, 2021 at 04:28:40PM +0300, Adrian Hunter wrote:
> On 7/04/21 4:15 pm, Leo Yan wrote:
> > Hi Al,
> > 
> > On Tue, Apr 06, 2021 at 09:38:32AM +0000, Al Grant wrote:
> > 
> > [...]
> > 
> >>> This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
> >>> parameters from mmap page and stores into auxtrace info structure;
> >>
> >> Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
> >> capture the TSC parameters from the mmap page? If a generic mechanism
> >> exists it would be better to use it, otherwise we'll have to do this again for
> >> future trace formats.
> > 
> > Good point!  Actually "perf record" tool has synthesized event
> > PERF_RECORD_TIME_CONV.  This patch series is studying the
> > implementation from Intel-PT, so the question is why the existed
> > implementations (like Intel-PT, Intel-BTS) don't directly use
> > PERF_RECORD_TIME_CONV for retriving TSC parameters.
> 
> PERF_RECORD_TIME_CONV was added later because the TSC information is
> needed by jitdump.

Thanks for the info, Adrian.

If so, it makes sense for Arm SPE to use PERF_RECORD_TIME_CONV for the
TSC parameters.  I will respin the patch series for this.

Leo
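
As a rough sketch of the direction agreed in this thread: the decoder
takes the TSC parameters from a time conversion record in the event
stream rather than from auxtrace info.  The struct layouts, the
REC_TIME_CONV value and the function name below are simplified
stand-ins invented for this example; they are not perf's actual
definitions, and only the three parameter names come from the mmap page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record header; not perf's real definition. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Placeholder record type value, used by this sketch only. */
#define REC_TIME_CONV 1u

/* Simplified time conversion record carrying the three TSC parameters. */
struct rec_time_conv {
	struct rec_header header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
};

/* Decoder-side copy of the parameters. */
struct tsc_params {
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	bool valid;
};

/*
 * Fill *tc from a record if it is a time conversion record that is
 * large enough to hold all three parameters.  Returns true on success
 * and leaves *tc untouched otherwise.
 */
static bool tsc_params_from_record(const struct rec_header *hdr,
				   struct tsc_params *tc)
{
	const struct rec_time_conv *ev;

	if (hdr->type != REC_TIME_CONV || hdr->size < sizeof(*ev))
		return false;

	ev = (const struct rec_time_conv *)hdr;
	tc->time_shift = ev->time_shift;
	tc->time_mult = ev->time_mult;
	tc->time_zero = ev->time_zero;
	tc->valid = true;
	return true;
}
```

The point of the design is that the parameters travel in one generic
record, so each trace format does not need its own copy of them in its
auxtrace info.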