
coresight: docs: Remove target sink from examples

Message ID 20241210144933.295798-1-james.clark@linaro.org (mailing list archive)
State New
Series coresight: docs: Remove target sink from examples

Commit Message

James Clark Dec. 10, 2024, 2:49 p.m. UTC
Previously the sink had to be specified, but now it auto selects one by
default. Including a sink in the examples causes issues when copy
pasting the command because it might not work if that sink isn't
present. Remove the sink from all the basic examples and create a new
section specifically about overriding the default one.

Make the text a bit more concise now that it's in the advanced section,
and likewise remove the old kernel advice.

Signed-off-by: James Clark <james.clark@linaro.org>
---
 Documentation/trace/coresight/coresight.rst   | 41 ++++++++-----------
 .../userspace-api/perf_ring_buffer.rst        |  4 +-
 2 files changed, 18 insertions(+), 27 deletions(-)
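
The two invocation forms this patch distinguishes (automatic sink selection
versus an explicit '@' sink) can be sketched with a small shell helper. This is
only an illustration: cs_etm_event is a hypothetical function name, not part of
perf, and tmc_etr0 is just the sink name used in the documentation example.

```shell
#!/bin/sh
# Build a cs_etm event string for 'perf record -e'.
# cs_etm_event is a hypothetical helper, not something perf provides.
#   $1: sink name as listed under the cs_etm sinks directory,
#       or an empty string to let the kernel pick a sink
#   $2: optional event modifier, e.g. 'u' for user space only
cs_etm_event() {
    sink="$1"
    modifier="$2"
    if [ -n "$sink" ]; then
        # Explicit sink: the '@' prefix marks the sink config option.
        printf 'cs_etm/@%s/%s\n' "$sink" "$modifier"
    else
        # No sink given: rely on the default selection.
        printf 'cs_etm//%s\n' "$modifier"
    fi
}

cs_etm_event "" u          # prints cs_etm//u
cs_etm_event tmc_etr0 u    # prints cs_etm/@tmc_etr0/u
```

On a target this could be used as
perf record -e "$(cs_etm_event "" u)" --per-thread program, which matches the
basic examples after this patch.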

Comments

Steve Clevenger Dec. 11, 2024, 6:01 p.m. UTC | #1
Hi James,

I thought I'd mention this issue with multicore self-hosted trace. The
perf command line syntax does not allow a sink "type" to be specified
(e.g. @tmc_etf or @tmc_etr). For multicore, it doesn't make sense to
specify a processor mapped sink as would be the case for single core
trace. A sink "type" should be allowed to avoid the auto select default.
In our case, the default is the ETF sink.

Thanks,
Steve C.
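
Until perf accepts a sink "type", one possible user-space workaround is to
resolve a type prefix to a concrete sink name from the sysfs sinks directory
and pass that name with '@'. The following is a minimal sketch under that
assumption; first_sink_of_type is a hypothetical helper, and matching on a name
prefix such as tmc_etr assumes the device naming shown in the documentation
example.

```shell
#!/bin/sh
# first_sink_of_type is a hypothetical helper, not a perf feature.
#   $1: sink name prefix to match, e.g. tmc_etr or tmc_etf
#   $2: sinks directory (defaults to the sysfs path named in the patch)
# Prints the first matching sink name; returns 1 if none matches.
first_sink_of_type() {
    type_prefix="$1"
    sinks_dir="${2:-/sys/bus/event_source/devices/cs_etm/sinks}"
    for entry in "$sinks_dir"/"$type_prefix"*; do
        # An unmatched glob stays a literal pattern, so check it exists.
        [ -e "$entry" ] || continue
        basename "$entry"
        return 0
    done
    return 1
}

# Example use on a target (not run here):
#   sink=$(first_sink_of_type tmc_etr) &&
#       perf record -e "cs_etm/@${sink}/u" --per-thread program
```

This only picks one named sink, so it does not solve the multicore case where
each core maps to a different sink of the same type.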

On 12/10/2024 6:49 AM, James Clark wrote:
> Previously the sink had to be specified, but now it auto selects one by
> default. Including a sink in the examples causes issues when copy
> pasting the command because it might not work if that sink isn't
> present. Remove the sink from all the basic examples and create a new
> section specifically about overriding the default one.
> 
> Make the text a bit more concise now that it's in the advanced section,
> and similarly for removing the old kernel advice.
> 
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  Documentation/trace/coresight/coresight.rst   | 41 ++++++++-----------
>  .../userspace-api/perf_ring_buffer.rst        |  4 +-
>  2 files changed, 18 insertions(+), 27 deletions(-)
> 
> diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst
> index d4f93d6a2d63..806699871b80 100644
> --- a/Documentation/trace/coresight/coresight.rst
> +++ b/Documentation/trace/coresight/coresight.rst
> @@ -462,44 +462,35 @@ queried by the perf command line tool:
>  
>  		cs_etm//                                    [Kernel PMU event]
>  
> -	linaro@linaro-nano:~$
> -
>  Regardless of the number of tracers available in a system (usually equal to the
>  amount of processor cores), the "cs_etm" PMU will be listed only once.
>  
>  A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
> -listed along with configuration options within forward slashes '/'.  Since a
> -Coresight system will typically have more than one sink, the name of the sink to
> -work with needs to be specified as an event option.
> -On newer kernels the available sinks are listed in sysFS under
> +provided along with configuration options within forward slashes '/' (see
> +`Config option formats`_).
> +
> +Advanced Perf framework usage
> +-----------------------------
> +
> +Sink selection
> +~~~~~~~~~~~~~~
> +
> +An appropriate sink will be selected automatically for use with Perf, but since
> +there will typically be more than one sink, the name of the sink to use may be
> +specified as a special config option prefixed with '@'.
> +
> +The available sinks are listed in sysFS under
>  ($SYSFS)/bus/event_source/devices/cs_etm/sinks/::
>  
>  	root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
>  	tmc_etf0  tmc_etr0  tpiu0
>  
> -On older kernels, this may need to be found from the list of coresight devices,
> -available under ($SYSFS)/bus/coresight/devices/::
> -
> -	root:~# ls /sys/bus/coresight/devices/
> -	 etm0     etm1     etm2         etm3  etm4      etm5      funnel0
> -	 funnel1  funnel2  replicator0  stm0  tmc_etf0  tmc_etr0  tpiu0
>  	root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program
>  
> -As mentioned above in section "Device Naming scheme", the names of the devices could
> -look different from what is used in the example above. One must use the device names
> -as it appears under the sysFS.
> -
> -The syntax within the forward slashes '/' is important.  The '@' character
> -tells the parser that a sink is about to be specified and that this is the sink
> -to use for the trace session.
> -
>  More information on the above and other example on how to use Coresight with
>  the perf tools can be found in the "HOWTO.md" file of the openCSD gitHub
>  repository [#third]_.
>  
> -Advanced perf framework usage
> ------------------------------
> -
>  AutoFDO analysis using the perf tools
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -508,7 +499,7 @@ perf can be used to record and analyze trace of programs.
>  Execution can be recorded using 'perf record' with the cs_etm event,
>  specifying the name of the sink to record to, e.g::
>  
> -    perf record -e cs_etm/@tmc_etr0/u --per-thread
> +    perf record -e cs_etm//u --per-thread
>  
>  The 'perf report' and 'perf script' commands can be used to analyze execution,
>  synthesizing instruction and branch events from the instruction trace.
> @@ -572,7 +563,7 @@ sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto
>  	Bubble sorting array of 30000 elements
>  	5910 ms
>  
> -	$ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
> +	$ perf record -e cs_etm//u --per-thread taskset -c 2 ./sort
>  	Bubble sorting array of 30000 elements
>  	12543 ms
>  	[ perf record: Woken up 35 times to write data ]
> diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst
> index bde9d8cbc106..dc71544532ce 100644
> --- a/Documentation/userspace-api/perf_ring_buffer.rst
> +++ b/Documentation/userspace-api/perf_ring_buffer.rst
> @@ -627,7 +627,7 @@ regular ring buffer.
>  AUX events and AUX trace data are two different things.  Let's see an
>  example::
>  
> -        perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2
> +        perf record -a -e cycles -e cs_etm// -- sleep 2
>  
>  The above command enables two events: one is the event *cycles* from PMU
>  and another is the AUX event *cs_etm* from Arm CoreSight, both are saved
> @@ -766,7 +766,7 @@ only record AUX trace data at a specific time point which users are
>  interested in.  E.g. below gives an example of how to take snapshots
>  with 1 second interval with Arm CoreSight::
>  
> -  perf record -e cs_etm/@tmc_etr0/u -S -a program &
> +  perf record -e cs_etm//u -S -a program &
>    PERFPID=$!
>    while true; do
>        kill -USR2 $PERFPID
James Clark Dec. 12, 2024, 3:27 p.m. UTC | #2
On 11/12/2024 6:01 pm, Steve Clevenger wrote:
> 
> Hi James,
> 
> I thought I'd mention this issue with multicore self-hosted trace. The
> perf command line syntax does not allow a sink "type" to be specified
> (e.g. @tmc_etf or @tmc_etr). For multicore, it doesn't make sense to
> specify a processor mapped sink as would be the case for single core
> trace. A sink "type" should be allowed to avoid the auto select default.
> In our case, the default is the ETF sink.
> 
> Thanks,
> Steve C.
> 

I'm sure it would be possible to add support for this, but I'm wondering 
if the real issue is that the default selection logic is wrong? Are you 
saying the default you get is ETF but you want ETR? And are there both
for each ETM? The default selection logic isn't easy to summarize, but it
should prefer ETR (sysmem) over ETF (link sink), see coresight_find_sink().

It's probably better to fix that rather than add a new sink selection 
feature. Maybe if you shared a diagram of your coresight architecture it 
would help.

Thanks
James
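
The stated preference (ETR before ETF) can be illustrated with a toy model in
shell. This is emphatically not the kernel's coresight_find_sink(): ranking by
name prefix and falling back to the first other sink are assumptions made
purely for this sketch, and preferred_sink is a hypothetical name.

```shell
#!/bin/sh
# preferred_sink: toy model only, NOT the kernel's selection logic.
# Takes sink names as arguments and prints one of them, preferring
# any tmc_etr* name, then any tmc_etf* name, then the first other name.
# Returns 1 when no names are given.
preferred_sink() {
    etf=""
    fallback=""
    for name in "$@"; do
        case "$name" in
            tmc_etr*) printf '%s\n' "$name"; return 0 ;;
            tmc_etf*) [ -n "$etf" ] || etf="$name" ;;
            *)        [ -n "$fallback" ] || fallback="$name" ;;
        esac
    done
    if [ -n "$etf" ]; then
        printf '%s\n' "$etf"
    elif [ -n "$fallback" ]; then
        printf '%s\n' "$fallback"
    else
        return 1
    fi
}

preferred_sink tmc_etf0 tmc_etr0 tpiu0    # prints tmc_etr0
```

With the sink list from the documentation example, the model picks tmc_etr0,
which is the outcome expected from the default selection in this discussion.
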
Steve Clevenger Dec. 12, 2024, 7:38 p.m. UTC | #3
On 12/12/2024 7:27 AM, James Clark wrote:
> 
> 
> On 11/12/2024 6:01 pm, Steve Clevenger wrote:
>>
>> Hi James,
>>
>> I thought I'd mention this issue with multicore self-hosted trace. The
>> perf command line syntax does not allow a sink "type" to be specified
>> (e.g. @tmc_etf or @tmc_etr). For multicore, it doesn't make sense to
>> specify a processor mapped sink as would be the case for single core
>> trace. A sink "type" should be allowed to avoid the auto select default.
>> In our case, the default is the ETF sink.
>>
>> Thanks,
>> Steve C.
>>
> 
> I'm sure it would be possible to add support for this, but I'm wondering
> if the real issue is that the default selection logic is wrong? Are you
> saying the default you get is ETF but you want ETR? And there is both
> for each ETM? The default selection logic isn't easy to summarize but it
> should prefer ETR (sysmem) over ETF (link sink), see coresight_find_sink().
> 
> It's probably better to fix that rather than add a new sink selection
> feature. Maybe if you shared a diagram of your coresight architecture it
> would help.
> 
> Thanks
> James

Hi James,

It appears the default sink selection is ETF for multicore trace. In any
case, for the Arm® CoreSight Base System Architecture STC Level
compliance, I need to be able to specify the sink type.

The Ampere CoreSight hierarchy is described to the ACPI as follows:


+-----------------+
|                 |
|       ETM       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|       ETF       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|       ETR       |
|                 |
+--------+--------+
         |
         |
+--------+--------+
|                 |
|      CATU       |
|                 |
+--------+--------+

Steve C.
James Clark Dec. 17, 2024, 5:17 p.m. UTC | #4
On 12/12/2024 7:38 pm, Steve Clevenger wrote:
> 
> 
> On 12/12/2024 7:27 AM, James Clark wrote:
>>
>>
>> On 11/12/2024 6:01 pm, Steve Clevenger wrote:
>>>
>>> Hi James,
>>>
>>> I thought I'd mention this issue with multicore self-hosted trace. The
>>> perf command line syntax does not allow a sink "type" to be specified
>>> (e.g. @tmc_etf or @tmc_etr). For multicore, it doesn't make sense to
>>> specify a processor mapped sink as would be the case for single core
>>> trace. A sink "type" should be allowed to avoid the auto select default.
>>> In our case, the default is the ETF sink.
>>>
>>> Thanks,
>>> Steve C.
>>>
>>
>> I'm sure it would be possible to add support for this, but I'm wondering
>> if the real issue is that the default selection logic is wrong? Are you
>> saying the default you get is ETF but you want ETR? And there is both
>> for each ETM? The default selection logic isn't easy to summarize but it
>> should prefer ETR (sysmem) over ETF (link sink), see coresight_find_sink().
>>
>> It's probably better to fix that rather than add a new sink selection
>> feature. Maybe if you shared a diagram of your coresight architecture it
>> would help.
>>
>> Thanks
>> James
> 
> Hi James,
> 
> It appears the default sink selection is ETF for multicore trace. In any
> case, for the Arm® CoreSight Base System Architecture STC Level
> compliance, I need to be able to specify the sink type.

Yep, it makes sense to add support for selecting it then. I will put
it on the list, but I'm not sure about the priority. I think looking into why
the default isn't working is more important for now.

> 
> The Ampere CoreSight hierarchy is described to the ACPI as follows:
> 
> 
> +-----------------+
> |                 |
> |       ETM       |
> |                 |
> +--------+--------+
>          |
>          |
> +--------+--------+
> |                 |
> |       ETF       |
> |                 |
> +--------+--------+
>          |
>          |
> +--------+--------+
> |                 |
> |       ETR       |
> |                 |
> +--------+--------+
>          |
>          |
> +--------+--------+
> |                 |
> |      CATU       |
> |                 |
> +--------+--------+
> 
> Steve C.
> 

I recreated this in the test here: 
https://lore.kernel.org/linux-kernel/20241217171132.834943-1-james.clark@linaro.org/T/#u

But it looks like it correctly selects ETR rather than ETF, so I'm not 
sure what the difference is between your setup and that. If you can have 
a look at that test and compare it that would be very helpful.

Thanks
James

Patch

diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst
index d4f93d6a2d63..806699871b80 100644
--- a/Documentation/trace/coresight/coresight.rst
+++ b/Documentation/trace/coresight/coresight.rst
@@ -462,44 +462,35 @@  queried by the perf command line tool:
 
 		cs_etm//                                    [Kernel PMU event]
 
-	linaro@linaro-nano:~$
-
 Regardless of the number of tracers available in a system (usually equal to the
 amount of processor cores), the "cs_etm" PMU will be listed only once.
 
 A Coresight PMU works the same way as any other PMU, i.e the name of the PMU is
-listed along with configuration options within forward slashes '/'.  Since a
-Coresight system will typically have more than one sink, the name of the sink to
-work with needs to be specified as an event option.
-On newer kernels the available sinks are listed in sysFS under
+provided along with configuration options within forward slashes '/' (see
+`Config option formats`_).
+
+Advanced Perf framework usage
+-----------------------------
+
+Sink selection
+~~~~~~~~~~~~~~
+
+An appropriate sink will be selected automatically for use with Perf, but since
+there will typically be more than one sink, the name of the sink to use may be
+specified as a special config option prefixed with '@'.
+
+The available sinks are listed in sysFS under
 ($SYSFS)/bus/event_source/devices/cs_etm/sinks/::
 
 	root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls
 	tmc_etf0  tmc_etr0  tpiu0
 
-On older kernels, this may need to be found from the list of coresight devices,
-available under ($SYSFS)/bus/coresight/devices/::
-
-	root:~# ls /sys/bus/coresight/devices/
-	 etm0     etm1     etm2         etm3  etm4      etm5      funnel0
-	 funnel1  funnel2  replicator0  stm0  tmc_etf0  tmc_etr0  tpiu0
 	root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program
 
-As mentioned above in section "Device Naming scheme", the names of the devices could
-look different from what is used in the example above. One must use the device names
-as it appears under the sysFS.
-
-The syntax within the forward slashes '/' is important.  The '@' character
-tells the parser that a sink is about to be specified and that this is the sink
-to use for the trace session.
-
 More information on the above and other example on how to use Coresight with
 the perf tools can be found in the "HOWTO.md" file of the openCSD gitHub
 repository [#third]_.
 
-Advanced perf framework usage
------------------------------
-
 AutoFDO analysis using the perf tools
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -508,7 +499,7 @@  perf can be used to record and analyze trace of programs.
 Execution can be recorded using 'perf record' with the cs_etm event,
 specifying the name of the sink to record to, e.g::
 
-    perf record -e cs_etm/@tmc_etr0/u --per-thread
+    perf record -e cs_etm//u --per-thread
 
 The 'perf report' and 'perf script' commands can be used to analyze execution,
 synthesizing instruction and branch events from the instruction trace.
@@ -572,7 +563,7 @@  sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tuto
 	Bubble sorting array of 30000 elements
 	5910 ms
 
-	$ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
+	$ perf record -e cs_etm//u --per-thread taskset -c 2 ./sort
 	Bubble sorting array of 30000 elements
 	12543 ms
 	[ perf record: Woken up 35 times to write data ]
diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst
index bde9d8cbc106..dc71544532ce 100644
--- a/Documentation/userspace-api/perf_ring_buffer.rst
+++ b/Documentation/userspace-api/perf_ring_buffer.rst
@@ -627,7 +627,7 @@  regular ring buffer.
 AUX events and AUX trace data are two different things.  Let's see an
 example::
 
-        perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2
+        perf record -a -e cycles -e cs_etm// -- sleep 2
 
 The above command enables two events: one is the event *cycles* from PMU
 and another is the AUX event *cs_etm* from Arm CoreSight, both are saved
@@ -766,7 +766,7 @@  only record AUX trace data at a specific time point which users are
 interested in.  E.g. below gives an example of how to take snapshots
 with 1 second interval with Arm CoreSight::
 
-  perf record -e cs_etm/@tmc_etr0/u -S -a program &
+  perf record -e cs_etm//u -S -a program &
   PERFPID=$!
   while true; do
       kill -USR2 $PERFPID