mbox series

[0/2] Add SCMI full message tracing

Message ID 20220623145533.2882688-1-cristian.marussi@arm.com (mailing list archive)
Headers show
Series Add SCMI full message tracing | expand

Message

Cristian Marussi June 23, 2022, 2:55 p.m. UTC
Hi,

after a few recent troubles handling too strictly out-of-spec replies from
SCMI servers deployed in the wild, I though it could have been useful to
have a basic way to dump at will the effective full payloads of successfuly
transmitted/received SCMI messages.

The existing SCMI traces already collect a bunch of information about SCMI
message exchanges but they do NOT keep any payload information: this is
certainly preferable most of the time since dumping full SCMI messages to
the trace buffer involves a full copy of the payload.

For this reason I added a new distinct trace_scmi_msg_dump with this series
in order to be able to selectively enable at will message dumping only when
required.

Only successfully transmitted and received (valid) xfers are dumped.

At first I was thinking about just dumping raw header and payload, but in
order to make the log a bit more human readable (without the need of
tooling to parse the log), I added some parsing/interpretation of the
header, so that the final result is something like:

root@deb-guest:~# echo 1 > /sys/kernel/debug/tracing/events/scmi/scmi_msg_dump/enable 
root@deb-guest:~# cat /sys/kernel/debug/tracing/trace_pipe 

(my annotations non on traces =>>)			 proto  type  cmd   seq  status    payload
							  |      |    |      |     |        |
 ++ cmd/reply						  |      |    |      |     |        |
   cat-224     [002] .....   147.755847: scmi_msg_dump: [0x15]:CMND:[0x06]:[0021]:[0]: 0000000000000000
<idle>-0       [000] d.h2.   147.758311: scmi_msg_dump: [0x15]:RESP:[0x06]:[0021]:[0]: 25000000000000000000000000000000


 ++ cmd/reply/delayed
   cat-223     [001] .....   122.384192: scmi_msg_dump: [0x15]:CMND:[0x06]:[0020]:[0]: 0100000001000000
<idle>-0       [000] d.h2.   122.405548: scmi_msg_dump: [0x15]:RESP:[0x06]:[0020]:[0]: 
<idle>-0       [000] d.h2.   122.425839: scmi_msg_dump: [0x15]:DLYD:[0x06]:[0020]:[0]: 01000000a05a320000000000efbeaddefecafeca


 ++ enable notif/notif
iio_generic_buf-233     [003] .....   522.399690: scmi_msg_dump: [0x15]:CMND:[0x0A]:[0024]:[0]: 0800000003000000
         <idle>-0       [000] dNh2.   522.417778: scmi_msg_dump: [0x15]:RESP:[0x0A]:[0024]:[0]: 00000000
         <idle>-0       [000] d.h2.   522.500945: scmi_msg_dump: [0x15]:NOTI:[0x01]:[0000]:[0]: 000000000800000008daffffffffffff0040d3a45d46fb1610daffffffffffff0040d3a45d46fb1618daffffffffffff0040d3a45d46fb16

Payload is dumped as it comes through byte-by-byte without any endianity
conversion to avoid further load on the system.

Any thoughs ?

Thanks,
Cristian

Cristian Marussi (2):
  include: trace: Add SCMI full message tracing
  firmware: arm_scmi: Use new SCMI full message tracing

 drivers/firmware/arm_scmi/driver.c | 21 ++++++++++++++++++++
 include/trace/events/scmi.h        | 31 ++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

Comments

Jim Quinlan June 27, 2022, 7:05 p.m. UTC | #1
On Thu, Jun 23, 2022 at 10:55 AM Cristian Marussi
<cristian.marussi@arm.com> wrote:
>
> Hi,
>
> after a few recent troubles handling too strictly out-of-spec replies from
> SCMI servers deployed in the wild, I though it could have been useful to
> have a basic way to dump at will the effective full payloads of successfuly
> transmitted/received SCMI messages.
>
> The existing SCMI traces already collect a bunch of information about SCMI
> message exchanges but they do NOT keep any payload information: this is
> certainly preferable most of the time since dumping full SCMI messages to
> the trace buffer involves a full copy of the payload.
>
> For this reason I added a new distinct trace_scmi_msg_dump with this series
> in order to be able to selectively enable at will message dumping only when
> required.
>
> Only successfully transmitted and received (valid) xfers are dumped.
>
> At first I was thinking about just dumping raw header and payload, but in
> order to make the log a bit more human readable (without the need of
> tooling to parse the log), I added some parsing/interpretation of the
> header, so that the final result is something like:
>
> root@deb-guest:~# echo 1 > /sys/kernel/debug/tracing/events/scmi/scmi_msg_dump/enable
> root@deb-guest:~# cat /sys/kernel/debug/tracing/trace_pipe
>
> (my annotations non on traces =>>)                       proto  type  cmd   seq  status    payload
>                                                           |      |    |      |     |        |
>  ++ cmd/reply                                             |      |    |      |     |        |
>    cat-224     [002] .....   147.755847: scmi_msg_dump: [0x15]:CMND:[0x06]:[0021]:[0]: 0000000000000000
> <idle>-0       [000] d.h2.   147.758311: scmi_msg_dump: [0x15]:RESP:[0x06]:[0021]:[0]: 25000000000000000000000000000000
>I'll probably still use a script, but
as you are keep
>
>  ++ cmd/reply/delayed
>    cat-223     [001] .....   122.384192: scmi_msg_dump: [0x15]:CMND:[0x06]:[0020]:[0]: 0100000001000000
> <idle>-0       [000] d.h2.   122.405548: scmi_msg_dump: [0x15]:RESP:[0x06]:[0020]:[0]:
> <idle>-0       [000] d.h2.   122.425839: scmi_msg_dump: [0x15]:DLYD:[0x06]:[0020]:[0]: 01000000a05a320000000000efbeaddefecafeca
>
>
>  ++ enable notif/notif
> iio_generic_buf-233     [003] .....   522.399690: scmi_msg_dump: [0x15]:CMND:[0x0A]:[0024]:[0]: 0800000003000000
>          <idle>-0       [000] dNh2.   522.417778: scmi_msg_dump: [0x15]:RESP:[0x0A]:[0024]:[0]: 00000000
>          <idle>-0       [000] d.h2.   522.500945: scmi_msg_dump: [0x15]:NOTI:[0x01]:[0000]:[0]: 000000000800000008daffffffffffff0040d3a45d46fb1610daffffffffffff0040d3a45d46fb1618daffffffffffff0040d3a45d46fb16
>
> Payload is dumped as it comes through byte-by-byte without any endianity
> conversion to avoid further load on the system.
>
> Any thoughs ?

Hi Christian,
I haven't tested this but the format looks good to me.  I have a
couple of minor suggestions
below.

As all numbers are hex, consider dropping the '0x'.  BTW,  the seq
value (eg [0020]) is hex but does not have '0x'.

Some TP_printk()s use this  style:

        scmi_msg_dump: pt=15 t=CMND cmd=06 seq=0021 s=0 pyld=12345678abcdef

which allows one to easily grep, for example, for all non-zero returns

        grep 's=[^0]'

This format is also easy to parse with a script:  strip the preamble,
split on whitespace, and then split on '=' to get the  [key, value]
pairs.

Thanks,
Jim Quinlan
Broadcom STB





>
> Thanks,
> Cristian
>
> Cristian Marussi (2):
>   include: trace: Add SCMI full message tracing
>   firmware: arm_scmi: Use new SCMI full message tracing
>
>  drivers/firmware/arm_scmi/driver.c | 21 ++++++++++++++++++++
>  include/trace/events/scmi.h        | 31 ++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+)
>
> --
> 2.32.0
>
Cristian Marussi June 29, 2022, 6:19 p.m. UTC | #2
On Mon, Jun 27, 2022 at 03:05:57PM -0400, Jim Quinlan wrote:
> On Thu, Jun 23, 2022 at 10:55 AM Cristian Marussi
> <cristian.marussi@arm.com> wrote:
> >
> > Hi,
> >
> > after a few recent troubles handling too strictly out-of-spec replies from
> > SCMI servers deployed in the wild, I though it could have been useful to
> > have a basic way to dump at will the effective full payloads of successfuly
> > transmitted/received SCMI messages.
> >
> > The existing SCMI traces already collect a bunch of information about SCMI
> > message exchanges but they do NOT keep any payload information: this is
> > certainly preferable most of the time since dumping full SCMI messages to
> > the trace buffer involves a full copy of the payload.
> >
> > For this reason I added a new distinct trace_scmi_msg_dump with this series
> > in order to be able to selectively enable at will message dumping only when
> > required.
> >
> > Only successfully transmitted and received (valid) xfers are dumped.
> >
> > At first I was thinking about just dumping raw header and payload, but in
> > order to make the log a bit more human readable (without the need of
> > tooling to parse the log), I added some parsing/interpretation of the
> > header, so that the final result is something like:
> >
> > root@deb-guest:~# echo 1 > /sys/kernel/debug/tracing/events/scmi/scmi_msg_dump/enable
> > root@deb-guest:~# cat /sys/kernel/debug/tracing/trace_pipe
> >
> > (my annotations non on traces =>>)                       proto  type  cmd   seq  status    payload
> >                                                           |      |    |      |     |        |
> >  ++ cmd/reply                                             |      |    |      |     |        |
> >    cat-224     [002] .....   147.755847: scmi_msg_dump: [0x15]:CMND:[0x06]:[0021]:[0]: 0000000000000000
> > <idle>-0       [000] d.h2.   147.758311: scmi_msg_dump: [0x15]:RESP:[0x06]:[0021]:[0]: 25000000000000000000000000000000
> >I'll probably still use a script, but
> as you are keep
> >
> >  ++ cmd/reply/delayed
> >    cat-223     [001] .....   122.384192: scmi_msg_dump: [0x15]:CMND:[0x06]:[0020]:[0]: 0100000001000000
> > <idle>-0       [000] d.h2.   122.405548: scmi_msg_dump: [0x15]:RESP:[0x06]:[0020]:[0]:
> > <idle>-0       [000] d.h2.   122.425839: scmi_msg_dump: [0x15]:DLYD:[0x06]:[0020]:[0]: 01000000a05a320000000000efbeaddefecafeca
> >
> >
> >  ++ enable notif/notif
> > iio_generic_buf-233     [003] .....   522.399690: scmi_msg_dump: [0x15]:CMND:[0x0A]:[0024]:[0]: 0800000003000000
> >          <idle>-0       [000] dNh2.   522.417778: scmi_msg_dump: [0x15]:RESP:[0x0A]:[0024]:[0]: 00000000
> >          <idle>-0       [000] d.h2.   522.500945: scmi_msg_dump: [0x15]:NOTI:[0x01]:[0000]:[0]: 000000000800000008daffffffffffff0040d3a45d46fb1610daffffffffffff0040d3a45d46fb1618daffffffffffff0040d3a45d46fb16
> >
> > Payload is dumped as it comes through byte-by-byte without any endianity
> > conversion to avoid further load on the system.
> >
> > Any thoughs ?
> 
> Hi Christian,
> I haven't tested this but the format looks good to me.  I have a
> couple of minor suggestions
> below.
> 

Hi Jim,

thanks for having a look.

> As all numbers are hex, consider dropping the '0x'.  BTW,  the seq
> value (eg [0020]) is hex but does not have '0x'.
> 

Yes indeed, good point.

> Some TP_printk()s use this  style:
> 
>         scmi_msg_dump: pt=15 t=CMND cmd=06 seq=0021 s=0 pyld=12345678abcdef
> 
> which allows one to easily grep, for example, for all non-zero returns
> 
>         grep 's=[^0]'
> 
> This format is also easy to parse with a script:  strip the preamble,
> split on whitespace, and then split on '=' to get the  [key, value]
> pairs.

In fact having a grep-friendly format is better.

Thanks for your feedback, I'll wait to see if someone else wants to
chime in and repost.

Thanks,
Cristian
Sudeep Holla June 30, 2022, 4:18 p.m. UTC | #3
On Wed, Jun 29, 2022 at 07:19:32PM +0100, Cristian Marussi wrote:
> On Mon, Jun 27, 2022 at 03:05:57PM -0400, Jim Quinlan wrote:

[...]

> > This format is also easy to parse with a script:  strip the preamble,
> > split on whitespace, and then split on '=' to get the  [key, value]
> > pairs.
> 
> In fact having a grep-friendly format is better.
> 
> Thanks for your feedback, I'll wait to see if someone else wants to
> chime in and repost.
> 

I agree with Jim's proposal, please update and send the patch.