Message ID | 20221019204626.3813043-1-cristian.marussi@arm.com (mailing list archive) |
---|---|
Headers | show |
Series | Introduce a unified API for SCMI Server testing | expand |
On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > Hi all, > > This series aims to introduce a new SCMI unified userspace interface meant > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > from the perspective of the OSPM agent (non-secure world only ...) > > It is proposed as a testing/development facility, it is NOT meant to be a > feature to use in production, but only enabled in Kconfig for test > deployments. > > Currently an SCMI Compliance Suite like the one at [1] can only work by > injecting SCMI messages at the SCMI transport layer using the mailbox test > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at > the related replies from the SCMI backend Server. > > This approach has a few drawbacks: > > - the SCMI Server under test MUST be reachable through a mailbox based > SCMI transport: any other SCMI Server placement is not possible (like in > a VM reachable via SCMI Virtio). In order to cover other placements in > the current scenario we should write some sort of test driver for each > and every existent SCMI transport and for any future additional transport > ...this clearly does not scale. > > - even in the mailbox case the userspace Compliance suite cannot simply > send and receive bare SCMI messages BUT it has to properly lay them out > into the shared memory exposed by the mailbox test driver as expected by > the transport definitions. In other words such a userspace test > application has to, not only use a proper transport driver for the system > at hand, but it also has to have a comprehensive knowledge of the > internals of the underlying transport in order to operate. > > - last but not least, the system under test has to be specifically > configured and built, in terms of Kconfig and DT, to perform such kind of > testing, it cannot be used for anything else, which is unfortunate for > CI/CD deployments. > > This series introduces a new SCMI Raw mode support feature that, when > configured and enabled exposes a new interface in debugfs through which: > > - a userspace application can inject bare SCMI binary messages into the > SCMI core stack; such messages will be routed by the SCMI regular kernel > stack to the backend Server using the currently configured transport > transparently: in other words you can test the SCMI server, no matter > where it is placed, as long as it is reachable from the currently > configured SCMI stack. > Same goes the other way around on the reading path: any SCMI server reply > can be read as a bare SCMI binary message from the same debugfs path. > > - as a direct consequence of this way of injecting bare messages in the > middle of the SCMI stack (instead of beneath it at the transport layer) > the user application has to handle only bare SCMI messages without having > to worry about the specific underlying transport internals that will be > taken care of by the SCMI core stack itself using its own machinery, > without duplicating such logic. > > - a system under test, once configured with SCMI Raw support enabled in > Kconfig, can be booted without any particular DT change. > > In V2 the runtime enable/disable switching capability has been removed > (for now) since still not deemed to be stable/reliable enough: as a > consequence when SCMI Raw support is compiled in, the regular SCMI stack > drivers are now inhibited permanently for that Kernel. > > In V4 it has been added the support for transports lacking a completion_irq > or configured forcibly in polled mode. > > A quick and trivial example from the shell...reading from a sensor > injecting a properly crafted packet in raw mode: > > # INJECT THE SENSOR_READING MESSAGE FOR SENSOR ID=1 (binary little endian) > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message > > # READING BACK THE REPLY... > root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4 > 0000000 00005406 00000000 00000335 00000000 > 0000020 > > while doing that, since Raw mode makes (partial) use of the regular SCMI > stack, you can observe the messages going through the SCMI stack with the > usual traces: > > bash-329 [000] ..... 14183.446808: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000000000000 > irq/35-mhu_db_l-81 [000] ..... 14183.447809: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=0 pyld=3503000000000000 > > > ..trying to read in async when the backend server does NOT supports asyncs: > > # AN ASYNC SENSOR READING REQUEST... > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x01\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message_async > > bash-329 [000] ..... 16415.938739: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000001000000 > irq/35-mhu_db_l-81 [000] ..... 16415.944129: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=-1 pyld= > > # RETURNS A STATUS -1 FROM THE SERVER NOT SUPPORTING IT > root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4 > 0000000 00005406 ffffffff > 0000010 > > Note that this was on a JUNO, BUT exactly the same steps can be used to > reach an SCMI Server living on a VM reachable via virtio as long as the > system under test if properly configured to work with a virtio transport. > > In a nutshell the exposed API is as follows: > > /sys/kernel/debug/scmi_raw/ > ├── errors > ├── message > ├── message_async > ├── notification > ├── reset > ├── transport_max_msg_size > ├── transport_rx_timeout_ms > └── transport_tx_max_msg > > where: > > - message*: used to send sync/async commands and read back immediate and > delayed responses (if any) > - errors: used to report timeout and unexpected replies > - reset: used to reset the SCMI Raw stack, flushing all queues from > received messages still pending to be read out (useful to be sure to > cleanup between test suite runs...) > - notification: used to read any notification being spit by the system > (if previously enabled by the user app) > - transport*: a bunch of configuration useful to setup the user > application expectations in terms of timeouts and message > characteristics. > > Each write corresponds to one command request and the replies or delayed > response are read back one message at time (receiving an EOF at each > message boundary). > > The user application running the test is in charge of handling timeouts > and properly choosing SCMI sequence numbers for the outgoing requests: note > that the same fixed number can be re-used (...though discouraged...) as > long as the suite does NOT expect to send multiple in-flight commands > concurrently. > > Since the SCMI core regular stack is partially used to deliver and collect > the messages, late replies after timeouts and any other sort of unexpected > message sent by the SCMI server platform back can be identified by the SCMI > core as usual and it will be reported under /errors for later analysis. > (a userspace test-app will have anyway properly detected the timeout on > /message* ...) > > All of the above has been roughly tested against a standard JUNO SCP SCMI > Server (mailbox trans) and an emulated SCMI Server living in a VM (virtio > trans) using a custom experimental version of the scmi-tests Compliance > suite patched to support Raw mode and posted at [2]. (still in development > ...merge requests are in progress...for now it is just a mean for me to > test the testing API ... O_o) > > The series is based on v6.1-rc1. > > Having said that (in such a concise and brief way :P) ... > > ...any feedback/comments are welcome ! Hi Cristian, I have tested your series with an optee message transport layer and been able to send raw messages to the scmi server PTA FWIW Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > > Thanks, > Cristian > > --- > V3 --> v4 > - rebased on v6.1-rc1 > - addedd missing support for 'polled' transports and transport lacking a > completion_irq (like smc/optee) > - removed a few inlines > - refactored SCMI Raw RX patch to make use more extensively of the regular > non-Raw RX path > - fix handling of O_NONBLOCK raw_mode read requests > > v2 --> v3 > - fixed some sparse warning on LE and __poll_t > - reworked and simplified deferred worker in charge of xfer delayed waiting > - allow for injection of DT-unknown protocols messages when in Raw mode > (needed for any kind of fuzzing...) > > v1 --> v2 > - added comments and debugfs docs > - added dedicated transport devices for channels initialization > - better channels handling in Raw mode > - removed runtime enable, moved to static compile time exclusion > of SCMI regular stack > > [1]: https://gitlab.arm.com/tests/scmi-tests > [2]: https://gitlab.arm.com/tests/scmi-tests/-/commits/raw_mode_support_devel/ > > > Cristian Marussi (11): > firmware: arm_scmi: Refactor xfer in-flight registration routines > firmware: arm_scmi: Simplify chan_available transport operation > firmware: arm_scmi: Use dedicated devices to initialize channels > firmware: arm_scmi: Refactor polling helpers > firmware: arm_scmi: Refactor scmi_wait_for_message_response > firmware: arm_scmi: Add xfer raw helpers > firmware: arm_scmi: Move errors defs and code to common.h > firmware: arm_scmi: Add raw transmission support > firmware: arm_scmi: Add debugfs ABI documentation for Raw mode > firmware: arm_scmi: Reject SCMI drivers while in Raw mode > firmware: arm_scmi: Call Raw mode hooks from the core stack > > Documentation/ABI/testing/debugfs-scmi-raw | 88 ++ > drivers/firmware/arm_scmi/Kconfig | 13 + > drivers/firmware/arm_scmi/Makefile | 1 + > drivers/firmware/arm_scmi/common.h | 72 +- > drivers/firmware/arm_scmi/driver.c | 521 +++++--- > drivers/firmware/arm_scmi/mailbox.c | 4 +- > drivers/firmware/arm_scmi/optee.c | 4 +- > drivers/firmware/arm_scmi/raw_mode.c | 1244 ++++++++++++++++++++ > drivers/firmware/arm_scmi/raw_mode.h | 29 + > drivers/firmware/arm_scmi/smc.c | 4 +- > drivers/firmware/arm_scmi/virtio.c | 2 +- > 11 files changed, 1827 insertions(+), 155 deletions(-) > create mode 100644 Documentation/ABI/testing/debugfs-scmi-raw > create mode 100644 drivers/firmware/arm_scmi/raw_mode.c > create mode 100644 drivers/firmware/arm_scmi/raw_mode.h > > -- > 2.34.1 >
Hi Christian, On 10/19/2022 1:46 PM, Cristian Marussi wrote: > Hi all, > > This series aims to introduce a new SCMI unified userspace interface meant > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > from the perspective of the OSPM agent (non-secure world only ...) > > It is proposed as a testing/development facility, it is NOT meant to be a > feature to use in production, but only enabled in Kconfig for test > deployments. > > Currently an SCMI Compliance Suite like the one at [1] can only work by > injecting SCMI messages at the SCMI transport layer using the mailbox test > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at > the related replies from the SCMI backend Server. I plan on giving this a try on our systems later today and will let you know the outcome. This is very useful for making sure the SCMI implementation is both correct and properly hardened.
On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote: > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > Hi all, > > Hi Vincent, > > This series aims to introduce a new SCMI unified userspace interface meant > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > from the perspective of the OSPM agent (non-secure world only ...) > > > > It is proposed as a testing/development facility, it is NOT meant to be a > > feature to use in production, but only enabled in Kconfig for test > > deployments. > > > > Currently an SCMI Compliance Suite like the one at [1] can only work by > > injecting SCMI messages at the SCMI transport layer using the mailbox test > > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at > > the related replies from the SCMI backend Server. > > > > This approach has a few drawbacks: > > > > - the SCMI Server under test MUST be reachable through a mailbox based > > SCMI transport: any other SCMI Server placement is not possible (like in > > a VM reachable via SCMI Virtio). In order to cover other placements in > > the current scenario we should write some sort of test driver for each > > and every existent SCMI transport and for any future additional transport > > ...this clearly does not scale. > > > > - even in the mailbox case the userspace Compliance suite cannot simply > > send and receive bare SCMI messages BUT it has to properly lay them out > > into the shared memory exposed by the mailbox test driver as expected by > > the transport definitions. In other words such a userspace test > > application has to, not only use a proper transport driver for the system > > at hand, but it also has to have a comprehensive knowledge of the > > internals of the underlying transport in order to operate. > > > > - last but not least, the system under test has to be specifically > > configured and built, in terms of Kconfig and DT, to perform such kind of > > testing, it cannot be used for anything else, which is unfortunate for > > CI/CD deployments. > > > > This series introduces a new SCMI Raw mode support feature that, when > > configured and enabled exposes a new interface in debugfs through which: > > > > - a userspace application can inject bare SCMI binary messages into the > > SCMI core stack; such messages will be routed by the SCMI regular kernel > > stack to the backend Server using the currently configured transport > > transparently: in other words you can test the SCMI server, no matter > > where it is placed, as long as it is reachable from the currently > > configured SCMI stack. > > Same goes the other way around on the reading path: any SCMI server reply > > can be read as a bare SCMI binary message from the same debugfs path. > > > > - as a direct consequence of this way of injecting bare messages in the > > middle of the SCMI stack (instead of beneath it at the transport layer) > > the user application has to handle only bare SCMI messages without having > > to worry about the specific underlying transport internals that will be > > taken care of by the SCMI core stack itself using its own machinery, > > without duplicating such logic. > > > > - a system under test, once configured with SCMI Raw support enabled in > > Kconfig, can be booted without any particular DT change. > > > > In V2 the runtime enable/disable switching capability has been removed > > (for now) since still not deemed to be stable/reliable enough: as a > > consequence when SCMI Raw support is compiled in, the regular SCMI stack > > drivers are now inhibited permanently for that Kernel. > > > > In V4 it has been added the support for transports lacking a completion_irq > > or configured forcibly in polled mode. > > > > A quick and trivial example from the shell...reading from a sensor > > injecting a properly crafted packet in raw mode: > > > > # INJECT THE SENSOR_READING MESSAGE FOR SENSOR ID=1 (binary little endian) > > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message > > > > # READING BACK THE REPLY... > > root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4 > > 0000000 00005406 00000000 00000335 00000000 > > 0000020 > > > > while doing that, since Raw mode makes (partial) use of the regular SCMI > > stack, you can observe the messages going through the SCMI stack with the > > usual traces: > > > > bash-329 [000] ..... 14183.446808: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000000000000 > > irq/35-mhu_db_l-81 [000] ..... 14183.447809: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=0 pyld=3503000000000000 > > > > > > ..trying to read in async when the backend server does NOT supports asyncs: > > > > # AN ASYNC SENSOR READING REQUEST... > > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x01\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message_async > > > > bash-329 [000] ..... 16415.938739: scmi_msg_dump: pt=15 t=CMND msg_id=06 seq=0000 s=0 pyld=0100000001000000 > > irq/35-mhu_db_l-81 [000] ..... 16415.944129: scmi_msg_dump: pt=15 t=RESP msg_id=06 seq=0000 s=-1 pyld= > > > > # RETURNS A STATUS -1 FROM THE SERVER NOT SUPPORTING IT > > root@deb-buster-arm64:~# cat /sys/kernel/debug/scmi_raw/message | od --endian=little -t x4 > > 0000000 00005406 ffffffff > > 0000010 > > > > Note that this was on a JUNO, BUT exactly the same steps can be used to > > reach an SCMI Server living on a VM reachable via virtio as long as the > > system under test if properly configured to work with a virtio transport. > > > > In a nutshell the exposed API is as follows: > > > > /sys/kernel/debug/scmi_raw/ > > ├── errors > > ├── message > > ├── message_async > > ├── notification > > ├── reset > > ├── transport_max_msg_size > > ├── transport_rx_timeout_ms > > └── transport_tx_max_msg > > > > where: > > > > - message*: used to send sync/async commands and read back immediate and > > delayed responses (if any) > > - errors: used to report timeout and unexpected replies > > - reset: used to reset the SCMI Raw stack, flushing all queues from > > received messages still pending to be read out (useful to be sure to > > cleanup between test suite runs...) > > - notification: used to read any notification being spit by the system > > (if previously enabled by the user app) > > - transport*: a bunch of configuration useful to setup the user > > application expectations in terms of timeouts and message > > characteristics. > > > > Each write corresponds to one command request and the replies or delayed > > response are read back one message at time (receiving an EOF at each > > message boundary). > > > > The user application running the test is in charge of handling timeouts > > and properly choosing SCMI sequence numbers for the outgoing requests: note > > that the same fixed number can be re-used (...though discouraged...) as > > long as the suite does NOT expect to send multiple in-flight commands > > concurrently. > > > > Since the SCMI core regular stack is partially used to deliver and collect > > the messages, late replies after timeouts and any other sort of unexpected > > message sent by the SCMI server platform back can be identified by the SCMI > > core as usual and it will be reported under /errors for later analysis. > > (a userspace test-app will have anyway properly detected the timeout on > > /message* ...) > > > > All of the above has been roughly tested against a standard JUNO SCP SCMI > > Server (mailbox trans) and an emulated SCMI Server living in a VM (virtio > > trans) using a custom experimental version of the scmi-tests Compliance > > suite patched to support Raw mode and posted at [2]. (still in development > > ...merge requests are in progress...for now it is just a mean for me to > > test the testing API ... O_o) > > > > The series is based on v6.1-rc1. > > > > Having said that (in such a concise and brief way :P) ... > > > > ...any feedback/comments are welcome ! > > Hi Cristian, > > I have tested your series with an optee message transport layer and > been able to send raw messages to the scmi server PTA > > FWIW > > Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > Thanks a lot for your test and feedback ! ... but I was going to reply to this saying that I spotted another issue with the current SCMI Raw implementation (NOT related to optee/smc) so that I'll post a V5 next week :P Anyway, the issue is much related to the debugfs root used by SCMI Raw, i.e.: /sys/kernel/debug/scmi_raw/ ..this works fine unless you run it on a system sporting multiple DT-defined server instances ...that is not officially supported but....ehm...a little bird told these system with multiple servers do exists :D In such a case the SCMI core stack is probed multiuple times and so it will try to register multiple debugfs Raw roots: there is no chanche to root the SCMI Raw entries at the same point clearly ... and then anyway there is the issue of recognizing which server is rooted where ... with the additional pain that a server CANNOT be recognized by querying...cause there is only one by teh spec with agentID ZERO ... in theory :D... So my tentaive solution for V5 would be: - change the Raw root debugfs as: /sys/kernel/debug/scmi_raw/0/... (first server defined) /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..) - expose the DT scmi-server root-node name of the server somewhere under that debugfs root like: ..../scmi_raw/0/transport_name -> 'scmi-mbx' ..../scmi_raw/1/transport_name -> 'scmi-virtio' so that if you know HOW you have configured your own system in the DT (maybe multiple servers with different kind of transports ?), you can easily select programmatically which one is which, and so decide which Raw debugfs fs to use... ... I plan to leave the SCMI ACS suite use by default the first system rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline option to choose an alternative path for SCMI Raw. Does all of this sound reasonable ? Thanks, Cristian
On Fri, Oct 28, 2022 at 07:44:32AM -0700, Florian Fainelli wrote: > Hi Christian, > > On 10/19/2022 1:46 PM, Cristian Marussi wrote: > > Hi all, > > Hi Florian, > > This series aims to introduce a new SCMI unified userspace interface meant > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > from the perspective of the OSPM agent (non-secure world only ...) > > > > It is proposed as a testing/development facility, it is NOT meant to be a > > feature to use in production, but only enabled in Kconfig for test > > deployments. > > > > Currently an SCMI Compliance Suite like the one at [1] can only work by > > injecting SCMI messages at the SCMI transport layer using the mailbox test > > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at > > the related replies from the SCMI backend Server. > > I plan on giving this a try on our systems later today and will let you know > the outcome. Great ! It would be much appreciated... > This is very useful for making sure the SCMI implementation is > both correct and properly hardened. ... that was the plan :P Note that the upstream SCMI ACS suite that I am using for stressing/testing this Raw thing is still WIP in term of supporting Raw mode injection (i.e. functional but ALL still to be merged)..but if you need I can give you pointers on how to use it....unless of course you have your suite or you just want to test using the shell as in the cover-letter examples... ... on my side I tried to fuzz me with a brutal 'dd bs=128 count=1 if=/dev/random of=<scmi_raw>/message' as a poor man fuzzying tool :D ... so I was thinking if it was meaningful to think about upstreaming some common tools for fuzzying or simply pre-building bare payloads (in proper endianity) to be injected with this SCMI raw thing... (I mean something useful that could live in tools/) ...any feedbacks/hints in these regards are welcome. Thanks, Cristian
On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <cristian.marussi@arm.com> wrote: > > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote: > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > Hi all, > > > > > Hi Vincent, > > > > This series aims to introduce a new SCMI unified userspace interface meant > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > > from the perspective of the OSPM agent (non-secure world only ...) > > > [ snip] > > Hi Cristian, > > > > I have tested your series with an optee message transport layer and > > been able to send raw messages to the scmi server PTA > > > > FWIW > > > > Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > Thanks a lot for your test and feedback ! > > ... but I was going to reply to this saying that I spotted another issue > with the current SCMI Raw implementation (NOT related to optee/smc) so > that I'll post a V5 next week :P > > Anyway, the issue is much related to the debugfs root used by SCMI Raw, > i.e.: > > /sys/kernel/debug/scmi_raw/ > > ..this works fine unless you run it on a system sporting multiple DT-defined > server instances ...that is not officially supported but....ehm...a little > bird told these system with multiple servers do exists :D ;-) > > In such a case the SCMI core stack is probed multiuple times and so it > will try to register multiple debugfs Raw roots: there is no chanche to > root the SCMI Raw entries at the same point clearly ... and then anyway > there is the issue of recognizing which server is rooted where ... with > the additional pain that a server CANNOT be recognized by querying...cause > there is only one by teh spec with agentID ZERO ... in theory :D... > > So my tentaive solution for V5 would be: > > - change the Raw root debugfs as: > > /sys/kernel/debug/scmi_raw/0/... (first server defined) > > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..) > > - expose the DT scmi-server root-node name of the server somewhere under > that debugfs root like: > > ..../scmi_raw/0/transport_name -> 'scmi-mbx' > > ..../scmi_raw/1/transport_name -> 'scmi-virtio' I was about to say that you would display the server name but that means that you have send a request to the server which probably defeats the purpose of the raw mode > > so that if you know HOW you have configured your own system in the DT > (maybe multiple servers with different kind of transports ?), you can > easily select programmatically which one is which, and so decide > which Raw debugfs fs to use... > > ... I plan to leave the SCMI ACS suite use by default the first system > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline > option to choose an alternative path for SCMI Raw. > > Does all of this sound reasonable ? Yes, adding an index looks good to me. As we are there, should we consider adding a per channel entry in the tree when there are several channels shared with the same server ? Vincent > > Thanks, > Cristian >
On Fri, Oct 28, 2022 at 06:18:52PM +0200, Vincent Guittot wrote: > On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote: > > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > > > Hi all, > > > > > > > > Hi Vincent, > > > > > > This series aims to introduce a new SCMI unified userspace interface meant > > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > > > from the perspective of the OSPM agent (non-secure world only ...) > > > > > > [ snip] > > > > Hi Cristian, > > > > > > I have tested your series with an optee message transport layer and > > > been able to send raw messages to the scmi server PTA > > > > > > FWIW > > > > > > Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > > > > Thanks a lot for your test and feedback ! > > > > ... but I was going to reply to this saying that I spotted another issue > > with the current SCMI Raw implementation (NOT related to optee/smc) so > > that I'll post a V5 next week :P > > > > Anyway, the issue is much related to the debugfs root used by SCMI Raw, > > i.e.: > > > > /sys/kernel/debug/scmi_raw/ > > > > ..this works fine unless you run it on a system sporting multiple DT-defined > > server instances ...that is not officially supported but....ehm...a little > > bird told these system with multiple servers do exists :D > > ;-) > > > > > In such a case the SCMI core stack is probed multiuple times and so it > > will try to register multiple debugfs Raw roots: there is no chanche to > > root the SCMI Raw entries at the same point clearly ... and then anyway > > there is the issue of recognizing which server is rooted where ... with > > the additional pain that a server CANNOT be recognized by querying...cause > > there is only one by teh spec with agentID ZERO ... in theory :D... > > > > So my tentaive solution for V5 would be: > > > > - change the Raw root debugfs as: > > > > /sys/kernel/debug/scmi_raw/0/... (first server defined) > > > > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..) > > > > - expose the DT scmi-server root-node name of the server somewhere under > > that debugfs root like: > > > > ..../scmi_raw/0/transport_name -> 'scmi-mbx' > > > > ..../scmi_raw/1/transport_name -> 'scmi-virtio' > > I was about to say that you would display the server name but that > means that you have send a request to the server which probably > defeats the purpose of the raw mode > > > > > so that if you know HOW you have configured your own system in the DT > > (maybe multiple servers with different kind of transports ?), you can > > easily select programmatically which one is which, and so decide > > which Raw debugfs fs to use... > > > > ... I plan to leave the SCMI ACS suite use by default the first system > > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline > > option to choose an alternative path for SCMI Raw. > > > > Does all of this sound reasonable ? > > Yes, adding an index looks good to me. Ok, I'll rework accordingly. > > As we are there, should we consider adding a per channel entry in the > tree when there are several channels shared with the same server ? > So, I was thinking about this and, even though, it seems not strictly needed for Compliance testing (as discussed offline) I do think that could be a sensible option to have as an additional mean to stress the server transport implementation (as you wish). Having said that, this week, I was reasoning about an alternative interface to do this, i.e. to avoid to add even more debugfs entries for this chosen-channel config or possibly in the future to ask for transport polling mode (if supported by the underlying transport) My idea (not thought fully through as of now eh..) would be as follows: since the current Raw implementation enforces a minimum size of 4 bytes for the injected message (more on this later down below in NOTE), I was thinking about using less-than-4-bytes-sized messages to sort of pre-configure the Raw stack. IOW, instead of having a number of new additional entries like ../message_ch10 ../message_ch13 ../message_poll we could design a sort of API (in the API :D) that defines how 3-bytes message payload are to be interpreted, so that in normal usage everything will go on as it is now, while if a 3-bytes message is injected by a specially crafted testcase, it would be used to configure the behaviour stack for the subsequent set of Raw transactions (i.e. for the currently opened fd...) like: - open message fd - send a configure message: | proto_chan_# | flags (polling..) | ------------------------------------------ 0 7 21 - send/receive your test messages - repeat or close (then the config will vanish...) This would mean adding some sort entry under scmi_raw to expose the configured available channels on the system though. [maybe the flags above could also include an async flag and avoid also to add the message_async entries...] I discarded the idea to run the above config process via IOCTLs since it seemed to me even more frowned upon to use IOCTLs on a debugfs entry :P...but I maybe wrong ah... All of this is still to be explored anyway, any thoughts ? or evident drawbacks ? (beside having to clearly define an API for these message config machinery) Anyway, whatever direction we'll choose (additional entries vs 3-bytes config msg), I would prefer to add this per-channel (or polling) capabilities with separate series to post on top of this in teh next cycle. ..too many words even this time :P Thanks, Cristian P.S: NOTE min_injected_msg_size: -------------------------------- Thinking about all of the above, at first, I was a bit dubious if instead I should not allow, in Raw mode, the injection of shorter than 4 bytes messages (i.e. shorter than a SCMI header) for the purpose of fuzzing: then I realized that even though I should allow the injection of smaller messages, the underlying transports, as they are defined, both sides (platform and agent) will anyway carry out a 4bytes transaction, it's just that all the other non-provided bytes will be zeroed in the memory layout; this is just how the transports itself (shmem or msg based) are designed to work both sides. (and again would be transport layer testing more than SCMI spec verification..) So at the end I thought this kind of less-than-4-bytes transmissions gave no benefit and I came up with the above trick to use such tiny message for configuration.
Hi Christian, On 10/19/2022 1:46 PM, Cristian Marussi wrote: [snip] > In V2 the runtime enable/disable switching capability has been removed > (for now) since still not deemed to be stable/reliable enough: as a > consequence when SCMI Raw support is compiled in, the regular SCMI stack > drivers are now inhibited permanently for that Kernel. For our platforms (ARCH_BRCMSTB) we would need to have the ability to start with the regular SCMI stack to satisfy if nothing else, all clock consumers otherwise it makes it fairly challenging for us to boot to a prompt as we purposely turn off all unnecessary peripherals to conserve power. We could introduce a "full on" mode to remove the clock provider dependency, but I suspect others on "real" silicon may suffer from the same short comings. Once user-space is reached, I suppose we could find a way to unbind from all SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a governor that won't do any active frequency switching etc. What do you think?
On Fri, Oct 28, 2022 at 07:38:25PM -0700, Florian Fainelli wrote: > Hi Christian, Hi Florian, > > On 10/19/2022 1:46 PM, Cristian Marussi wrote: > [snip] > > > In V2 the runtime enable/disable switching capability has been removed > > (for now) since still not deemed to be stable/reliable enough: as a > > consequence when SCMI Raw support is compiled in, the regular SCMI stack > > drivers are now inhibited permanently for that Kernel. > > For our platforms (ARCH_BRCMSTB) we would need to have the ability to start > with the regular SCMI stack to satisfy if nothing else, all clock consumers > otherwise it makes it fairly challenging for us to boot to a prompt as we > purposely turn off all unnecessary peripherals to conserve power. We could > introduce a "full on" mode to remove the clock provider dependency, but I > suspect others on "real" silicon may suffer from the same short comings. > Indeed in V1 of this series the Raw mode was dynamically switched on/off at runtime, so that you could have booted your system with a full working SCMI stack and then Raw mode could have been enabled/disabled via scmi_raw/enable entry, so causing the SCMI drivers to be unbound after boot when entering Raw mode. The idea, indeed, was initially to be able to boot a regular system, perform any kind of non-SCMI testing and then switch at will into Raw mode, perform your SCMI testing, and then back from the grave into normal mode when needed. (this way you could have deployed into CI one single image for all testing scenarios...) The valid objections/worries were around the stability/relliability of such a solution both when entering Raw mode and then coming back to normal use: i.e. not being sure to be able to safely unbind all and to safely bind back all the stack at the end. The full discussion about this is in this thread if you'd want to chime in with your point of view: https://lore.kernel.org/all/Yvvb6Y+lzuABT1fy@sirena.org.uk/ So we removed it, but the idea was not fullly abandoned, we could revive it with some variations, most probably binding this feature to a Kconfig option (default=N). Any feedback/idea from You in these regards is highly welcome. > Once user-space is reached, I suppose we could find a way to unbind from all > SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a > governor that won't do any active frequency switching etc. > > What do you think? Anyway, thinking about your scenario, maybe this dynamic-switch is NOT a good solution in your case, because that was an all-or-nothing switch that caused the full SCMI stack to be unbound, you could not selectively keep alive what you possibly need to stay on even after boot. I think that an alternative, maybe better, option in your case, since you are willing to manually fine-tune at runtime which parts of the SCMI has to be inhibited to avoid interferences while Raw-testing (via unbind/ unload or policy governors changes), a better option could be a 'full-coexistence' Raw mode solution. In such a COEX configuration you'll boot a normal system with all the SCMI drivers operational as configured in the DT, BUT with also the Raw mode initialized and ready to be used. In this scenario, basically, you'll have the normal message transactions, coming from the regular SCMI drivers, and the Raw transactions, injected from your test suite, that happily coexist side-by-side at the pure trasaction level: this does NOT mean that you won't suffer any interference at the protocol level so, as said, you'll have anyway to inhibit properly the SCMI drivers by hand to avoid false-positives in your results. (imagine testcase generating a series of Raw get/set/get transaction on a resource while the regular stack issue a set on the same res...notifications interferences are even worst...) Now, the GOOD_NEWS : is ... that this can be done already with an additional slim patch that has to be applied on top of this series, patch that I have not posted still since not sure of its utility, but that I am using heavily in my setup and which works fine for me (with really rare interferences on testing even without fine-tuning/disabling anything by hand..) I attached such patch at the end of this mail so that you can immediately be unblocked and experiment further with Raw mode as you planned. I'll cleaned it up and post it also to the next V5 at this point. Once that COEX is enabled, you should see something like this at boot: [ 1.824269] arm-scmi firmware:scmi: SCMI RAW Mode initialized. [ 1.830155] arm-scmi firmware:scmi: SCMI RAW Mode COEX enabled ! [ 1.836473] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled. [ 1.847481] arm-scmi firmware:scmi: SCMI Protocol v2.0 'arm:arm' Firmware version 0x20a0000 ... ... and then you can just use the scmi_raw entries as you wish. Any transaction, normal or raw, will be visible as usual in the SCMI traces (even though, currently, NOT distinguishable by type raw/normal) So...after this other too much long mail (:P)...let me know what you think about al of this (including the possibility of revive the runtime dynamic switch too...) Thanks, Cristian ---->8----- From 8613438d4171866088339e030959cb1de8e88c6a Mon Sep 17 00:00:00 2001 From: Cristian Marussi <cristian.marussi@arm.com> Date: Sun, 21 Aug 2022 19:09:39 +0100 Subject: [PATCH] [DEBUG] firmware: arm_scmi: Add SCMI Raw mode COEXISTENCE support When Raw mode support is configured in coexistence mode, normal SCMI drivers are allowed to register and work as usual with the SCMI core. Normal and raw SCMI message transactions will remain anyway segregated from each other, it is just that any SCMI test suite using the Raw mode access could report unreliable results due to possible interferences from the regular drivers access to shared SCMI resources. Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> --- drivers/firmware/arm_scmi/Kconfig | 10 ++++++++++ drivers/firmware/arm_scmi/driver.c | 21 +++++++++++++++------ drivers/firmware/arm_scmi/protocols.h | 2 ++ drivers/firmware/arm_scmi/raw_mode.c | 2 +- 4 files changed, 28 insertions(+), 7 deletions(-) diff --git a/drivers/firmware/arm_scmi/Kconfig b/drivers/firmware/arm_scmi/Kconfig index ab726a92ac2f..743f53fbe2f8 100644 --- a/drivers/firmware/arm_scmi/Kconfig +++ b/drivers/firmware/arm_scmi/Kconfig @@ -36,6 +36,16 @@ config ARM_SCMI_RAW_MODE_SUPPORT order to avoid unexpected interactions with the SCMI Raw message flow. If unsure say N. +config ARM_SCMI_RAW_MODE_SUPPORT_COEX + bool "Allow SCMI Raw mode coexistence with normal SCMI stack" + depends on ARM_SCMI_RAW_MODE_SUPPORT + help + Allow SCMI Raw transmission mode to coexist with normal SCMI stack. + + This will allow regular SCMI drivers to register with the core and + operate normally, thing which could make an SCMI test suite using the + SCMI Raw mode support unreliable. If unsure, say N. + config ARM_SCMI_HAVE_TRANSPORT bool help diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c index 32374fdba997..f0b06b6e8dc2 100644 --- a/drivers/firmware/arm_scmi/driver.c +++ b/drivers/firmware/arm_scmi/driver.c @@ -449,9 +449,14 @@ static struct scmi_xfer *scmi_xfer_get(const struct scmi_handle *handle, */ struct scmi_xfer *scmi_xfer_raw_get(const struct scmi_handle *handle) { + struct scmi_xfer *xfer; struct scmi_info *info = handle_to_scmi_info(handle); - return scmi_xfer_get(handle, &info->tx_minfo); + xfer = scmi_xfer_get(handle, &info->tx_minfo); + if (!IS_ERR(xfer)) + xfer->is_raw = true; + + return xfer; } /** @@ -531,6 +536,7 @@ void scmi_xfer_raw_put(const struct scmi_handle *handle, struct scmi_xfer *xfer) { struct scmi_info *info = handle_to_scmi_info(handle); + xfer->is_raw = false; return __scmi_xfer_put(&info->tx_minfo, xfer); } @@ -2401,7 +2407,8 @@ int scmi_protocol_device_request(const struct scmi_device_id *id_table) pr_debug("Requesting SCMI device (%s) for protocol %x\n", id_table->name, id_table->protocol_id); - if (IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT)) { + if (IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT) && + !IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT_COEX)) { pr_warn("SCMI Raw mode active. Rejecting '%s'/0x%02X\n", id_table->name, id_table->protocol_id); return -EINVAL; @@ -2634,11 +2641,13 @@ static int scmi_probe(struct platform_device *pdev) info->tx_minfo.max_msg); if (!IS_ERR(info->raw)) { dev_info(dev, "SCMI RAW Mode initialized.\n"); - return 0; + if (!IS_ENABLED(CONFIG_ARM_SCMI_RAW_MODE_SUPPORT_COEX)) + return 0; + dev_info(dev, "SCMI RAW Mode COEX enabled !\n"); + } else { + dev_err(dev, "Failed to initialize SCMI RAW Mode !\n"); + info->raw = NULL; } - - dev_err(dev, "Failed to initialize SCMI RAW Mode !\n"); - info->raw = NULL; } if (scmi_notification_init(handle)) diff --git a/drivers/firmware/arm_scmi/protocols.h b/drivers/firmware/arm_scmi/protocols.h index 2f3bf691db7c..70a48adcc320 100644 --- a/drivers/firmware/arm_scmi/protocols.h +++ b/drivers/firmware/arm_scmi/protocols.h @@ -88,6 +88,7 @@ struct scmi_msg_hdr { /** * struct scmi_xfer - Structure representing a message flow * + * @is_raw: Flag to state if this xfer has been generated by RAW mode * @transfer_id: Unique ID for debug & profiling purpose * @hdr: Transmit message header * @tx: Transmit message @@ -119,6 +120,7 @@ struct scmi_msg_hdr { * @priv: A pointer for transport private usage. */ struct scmi_xfer { + bool is_raw; int transfer_id; struct scmi_msg_hdr hdr; struct scmi_msg tx; diff --git a/drivers/firmware/arm_scmi/raw_mode.c b/drivers/firmware/arm_scmi/raw_mode.c index 3fdfc0564286..0edaeb405267 100644 --- a/drivers/firmware/arm_scmi/raw_mode.c +++ b/drivers/firmware/arm_scmi/raw_mode.c @@ -1154,7 +1154,7 @@ void scmi_raw_message_report(void *r, struct scmi_xfer *xfer, unsigned int idx) struct device *dev; struct scmi_raw_mode_info *raw = r; - if (!raw) + if (!raw || (idx == SCMI_RAW_REPLY_QUEUE && !xfer->is_raw)) return; dev = raw->handle->dev;
On Fri, 28 Oct 2022 at 18:58, Cristian Marussi <cristian.marussi@arm.com> wrote: > > On Fri, Oct 28, 2022 at 06:18:52PM +0200, Vincent Guittot wrote: > > On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote: > > > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > > > > > Hi all, > > > > > > > > > > > Hi Vincent, > > > > > > > > This series aims to introduce a new SCMI unified userspace interface meant > > > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > > > > from the perspective of the OSPM agent (non-secure world only ...) > > > > > > > > > [ snip] > > > > > > Hi Cristian, > > > > > > > > I have tested your series with an optee message transport layer and > > > > been able to send raw messages to the scmi server PTA > > > > > > > > FWIW > > > > > > > > Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > > > > > > > Thanks a lot for your test and feedback ! > > > > > > ... but I was going to reply to this saying that I spotted another issue > > > with the current SCMI Raw implementation (NOT related to optee/smc) so > > > that I'll post a V5 next week :P > > > > > > Anyway, the issue is much related to the debugfs root used by SCMI Raw, > > > i.e.: > > > > > > /sys/kernel/debug/scmi_raw/ > > > > > > ..this works fine unless you run it on a system sporting multiple DT-defined > > > server instances ...that is not officially supported but....ehm...a little > > > bird told these system with multiple servers do exists :D > > > > ;-) > > > > > > > > In such a case the SCMI core stack is probed multiuple times and so it > > > will try to register multiple debugfs Raw roots: there is no chanche to > > > root the SCMI Raw entries at the same point clearly ... and then anyway > > > there is the issue of recognizing which server is rooted where ... with > > > the additional pain that a server CANNOT be recognized by querying...cause > > > there is only one by teh spec with agentID ZERO ... in theory :D... > > > > > > So my tentaive solution for V5 would be: > > > > > > - change the Raw root debugfs as: > > > > > > /sys/kernel/debug/scmi_raw/0/... (first server defined) > > > > > > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..) > > > > > > - expose the DT scmi-server root-node name of the server somewhere under > > > that debugfs root like: > > > > > > ..../scmi_raw/0/transport_name -> 'scmi-mbx' > > > > > > ..../scmi_raw/1/transport_name -> 'scmi-virtio' > > > > I was about to say that you would display the server name but that > > means that you have send a request to the server which probably > > defeats the purpose of the raw mode > > > > > > > > so that if you know HOW you have configured your own system in the DT > > > (maybe multiple servers with different kind of transports ?), you can > > > easily select programmatically which one is which, and so decide > > > which Raw debugfs fs to use... > > > > > > ... I plan to leave the SCMI ACS suite use by default the first system > > > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline > > > option to choose an alternative path for SCMI Raw. > > > > > > Does all of this sound reasonable ? > > > > Yes, adding an index looks good to me. > > Ok, I'll rework accordingly. > > > > > As we are there, should we consider adding a per channel entry in the > > tree when there are several channels shared with the same server ? > > > > So, I was thinking about this and, even though, it seems not strictly > needed for Compliance testing (as discussed offline) I do think that > could be a sensible option to have as an additional mean to stress the > server transport implementation (as you wish). Thanks > > Having said that, this week, I was reasoning about an alternative > interface to do this, i.e. to avoid to add even more debugfs entries > for this chosen-channel config or possibly in the future to ask for > transport polling mode (if supported by the underlying transport) > > My idea (not thought fully through as of now eh..) would be as follows: > > since the current Raw implementation enforces a minimum size of 4 bytes > for the injected message (more on this later down below in NOTE), I was > thinking about using less-than-4-bytes-sized messages to sort of > pre-configure the Raw stack. > > IOW, instead of having a number of new additional entries like > > ../message_ch10 > ../message_ch13 > ../message_poll > > we could design a sort of API (in the API :D) that defines how > 3-bytes message payload are to be interpreted, so that in normal usage > everything will go on as it is now, while if a 3-bytes message is > injected by a specially crafted testcase, it would be used to configure > the behaviour stack for the subsequent set of Raw transactions > (i.e. for the currently opened fd...) like: > > - open message fd > > - send a configure message: > > | proto_chan_# | flags (polling..) | > ------------------------------------------ > 0 7 21 > > - send/receive your test messages > > - repeat or close (then the config will vanish...) > > This would mean adding some sort entry under scmi_raw to expose the > configured available channels on the system though. > > [maybe the flags above could also include an async flag and avoid > also to add the message_async entries...] > > I discarded the idea to run the above config process via IOCTLs since > it seemed to me even more frowned upon to use IOCTLs on a debugfs entry > :P...but I maybe wrong ah... > > All of this is still to be explored anyway, any thoughts ? or evident > drawbacks ? (beside having to clearly define an API for these message > config machinery) TBH, I'm not a fan of adding a protocol on top of the SCMI one. This interface aims to test the SCMI servers and their channels so we should focus on this and make it simple to use. IMHO, adding some special bytes before the real scmi message is prone to create complexity and error in the use of this debug interface. > > Anyway, whatever direction we'll choose (additional entries vs 3-bytes > config msg), I would prefer to add this per-channel (or polling) > capabilities with separate series to post on top of this in teh next > cycle. Ok > > ..too many words even this time :P Thanks Vincent > > Thanks, > Cristian > > > P.S: NOTE min_injected_msg_size: > -------------------------------- > Thinking about all of the above, at first, I was a bit dubious if > instead I should not allow, in Raw mode, the injection of shorter than > 4 bytes messages (i.e. shorter than a SCMI header) for the purpose of > fuzzing: then I realized that even though I should allow the injection > of smaller messages, the underlying transports, as they are defined, both > sides (platform and agent) will anyway carry out a 4bytes transaction, > it's just that all the other non-provided bytes will be zeroed in the > memory layout; this is just how the transports itself (shmem or msg > based) are designed to work both sides. (and again would be transport > layer testing more than SCMI spec verification..) > > So at the end I thought this kind of less-than-4-bytes transmissions > gave no benefit and I came up with the above trick to use such tiny > message for configuration. >
On Wed, Nov 02, 2022 at 09:54:50AM +0100, Vincent Guittot wrote: > On Fri, 28 Oct 2022 at 18:58, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > On Fri, Oct 28, 2022 at 06:18:52PM +0200, Vincent Guittot wrote: > > > On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > > > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote: > > > > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > Hi Vincent, > > > > > > > > > > This series aims to introduce a new SCMI unified userspace interface meant > > > > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > > > > > from the perspective of the OSPM agent (non-secure world only ...) > > > > > > > > > > > > [ snip] > > > > > > > > Hi Cristian, > > > > > > > > > > I have tested your series with an optee message transport layer and > > > > > been able to send raw messages to the scmi server PTA > > > > > > > > > > FWIW > > > > > > > > > > Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > > > > > > > > > > Thanks a lot for your test and feedback ! > > > > > > > > ... but I was going to reply to this saying that I spotted another issue > > > > with the current SCMI Raw implementation (NOT related to optee/smc) so > > > > that I'll post a V5 next week :P > > > > > > > > Anyway, the issue is much related to the debugfs root used by SCMI Raw, > > > > i.e.: > > > > > > > > /sys/kernel/debug/scmi_raw/ > > > > > > > > ..this works fine unless you run it on a system sporting multiple DT-defined > > > > server instances ...that is not officially supported but....ehm...a little > > > > bird told these system with multiple servers do exists :D > > > > > > ;-) > > > > > > > > > > > In such a case the SCMI core stack is probed multiuple times and so it > > > > will try to register multiple debugfs Raw roots: there is no chanche to > > > > root the SCMI Raw entries at the same point clearly ... and then anyway > > > > there is the issue of recognizing which server is rooted where ... with > > > > the additional pain that a server CANNOT be recognized by querying...cause > > > > there is only one by teh spec with agentID ZERO ... in theory :D... > > > > > > > > So my tentaive solution for V5 would be: > > > > > > > > - change the Raw root debugfs as: > > > > > > > > /sys/kernel/debug/scmi_raw/0/... (first server defined) > > > > > > > > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..) > > > > > > > > - expose the DT scmi-server root-node name of the server somewhere under > > > > that debugfs root like: > > > > > > > > ..../scmi_raw/0/transport_name -> 'scmi-mbx' > > > > > > > > ..../scmi_raw/1/transport_name -> 'scmi-virtio' > > > > > > I was about to say that you would display the server name but that > > > means that you have send a request to the server which probably > > > defeats the purpose of the raw mode > > > > > > > > > > > so that if you know HOW you have configured your own system in the DT > > > > (maybe multiple servers with different kind of transports ?), you can > > > > easily select programmatically which one is which, and so decide > > > > which Raw debugfs fs to use... > > > > > > > > ... I plan to leave the SCMI ACS suite use by default the first system > > > > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline > > > > option to choose an alternative path for SCMI Raw. > > > > > > > > Does all of this sound reasonable ? > > > > > > Yes, adding an index looks good to me. > > > > Ok, I'll rework accordingly. > > > > > > > > As we are there, should we consider adding a per channel entry in the > > > tree when there are several channels shared with the same server ? > > > > > > > So, I was thinking about this and, even though, it seems not strictly > > needed for Compliance testing (as discussed offline) I do think that > > could be a sensible option to have as an additional mean to stress the > > server transport implementation (as you wish). > > Thanks > > > > > Having said that, this week, I was reasoning about an alternative > > interface to do this, i.e. to avoid to add even more debugfs entries > > for this chosen-channel config or possibly in the future to ask for > > transport polling mode (if supported by the underlying transport) > > > > My idea (not thought fully through as of now eh..) would be as follows: > > > > since the current Raw implementation enforces a minimum size of 4 bytes > > for the injected message (more on this later down below in NOTE), I was > > thinking about using less-than-4-bytes-sized messages to sort of > > pre-configure the Raw stack. > > > > IOW, instead of having a number of new additional entries like > > > > ../message_ch10 > > ../message_ch13 > > ../message_poll > > > > we could design a sort of API (in the API :D) that defines how > > 3-bytes message payload are to be interpreted, so that in normal usage > > everything will go on as it is now, while if a 3-bytes message is > > injected by a specially crafted testcase, it would be used to configure > > the behaviour stack for the subsequent set of Raw transactions > > (i.e. for the currently opened fd...) like: > > > > - open message fd > > > > - send a configure message: > > > > | proto_chan_# | flags (polling..) | > > ------------------------------------------ > > 0 7 21 > > > > - send/receive your test messages > > > > - repeat or close (then the config will vanish...) > > > > This would mean adding some sort entry under scmi_raw to expose the > > configured available channels on the system though. > > > > [maybe the flags above could also include an async flag and avoid > > also to add the message_async entries...] > > > > I discarded the idea to run the above config process via IOCTLs since > > it seemed to me even more frowned upon to use IOCTLs on a debugfs entry > > :P...but I maybe wrong ah... > > > > All of this is still to be explored anyway, any thoughts ? or evident > > drawbacks ? (beside having to clearly define an API for these message > > config machinery) > > TBH, I'm not a fan of adding a protocol on top of the SCMI one. This > interface aims to test the SCMI servers and their channels so we > should focus on this and make it simple to use. IMHO, adding some > special bytes before the real scmi message is prone to create > complexity and error in the use of this debug interface. > Indeed, even if only for transport-related tests, the risk is to make more complicate to use the interface. Agreed, just wanted to have some feedback. I'll revert to some based on debugfs trying to minimize entries and improper usage...maybe something like grouping on per-channel subdirs when different channels are available. E.g. on a system with a dedicated PERF channel I could have ../scmi_raw/0/message <<< usual auto channel selection /message_async ... /chan_0x10/message <<< use default channel (base proto) /message_async .... /chan_0x13/message <<< use PERF channel /message_async .... The alternative would be to have a common single entry to configure the usage of the a single batch of message/message_async entries BUT that seems to me more prone to error, e.g. if you dont clear the special config at the end of your special testcase you could endup using custom channels conf also with any following regular test which expects to benefit from the automatic channel selection (which I still think should be default way of running these tests..) Thoughts ? Thanks for the feedback. Cristian
On Thu, 3 Nov 2022 at 10:21, Cristian Marussi <cristian.marussi@arm.com> wrote: > > On Wed, Nov 02, 2022 at 09:54:50AM +0100, Vincent Guittot wrote: > > On Fri, 28 Oct 2022 at 18:58, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > On Fri, Oct 28, 2022 at 06:18:52PM +0200, Vincent Guittot wrote: > > > > On Fri, 28 Oct 2022 at 17:04, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > > > > > On Fri, Oct 28, 2022 at 04:40:02PM +0200, Vincent Guittot wrote: > > > > > > On Wed, 19 Oct 2022 at 22:46, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > Hi Vincent, > > > > > > > > > > > > This series aims to introduce a new SCMI unified userspace interface meant > > > > > > > to ease testing an SCMI Server implementation for compliance, fuzzing etc., > > > > > > > from the perspective of the OSPM agent (non-secure world only ...) > > > > > > > > > > > > > > > [ snip] > > > > > > > > > > Hi Cristian, > > > > > > > > > > > > I have tested your series with an optee message transport layer and > > > > > > been able to send raw messages to the scmi server PTA > > > > > > > > > > > > FWIW > > > > > > > > > > > > Tested-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > > > > > > > > > > > > > Thanks a lot for your test and feedback ! > > > > > > > > > > ... but I was going to reply to this saying that I spotted another issue > > > > > with the current SCMI Raw implementation (NOT related to optee/smc) so > > > > > that I'll post a V5 next week :P > > > > > > > > > > Anyway, the issue is much related to the debugfs root used by SCMI Raw, > > > > > i.e.: > > > > > > > > > > /sys/kernel/debug/scmi_raw/ > > > > > > > > > > ..this works fine unless you run it on a system sporting multiple DT-defined > > > > > server instances ...that is not officially supported but....ehm...a little > > > > > bird told these system with multiple servers do exists :D > > > > > > > > ;-) > > > > > > > > > > > > > > In such a case the SCMI core stack is probed multiuple times and so it > > > > > will try to register multiple debugfs Raw roots: there is no chanche to > > > > > root the SCMI Raw entries at the same point clearly ... and then anyway > > > > > there is the issue of recognizing which server is rooted where ... with > > > > > the additional pain that a server CANNOT be recognized by querying...cause > > > > > there is only one by teh spec with agentID ZERO ... in theory :D... > > > > > > > > > > So my tentaive solution for V5 would be: > > > > > > > > > > - change the Raw root debugfs as: > > > > > > > > > > /sys/kernel/debug/scmi_raw/0/... (first server defined) > > > > > > > > > > /sys/kernel/debug/scmi_raw/1/... (possible additional server(s)..) > > > > > > > > > > - expose the DT scmi-server root-node name of the server somewhere under > > > > > that debugfs root like: > > > > > > > > > > ..../scmi_raw/0/transport_name -> 'scmi-mbx' > > > > > > > > > > ..../scmi_raw/1/transport_name -> 'scmi-virtio' > > > > > > > > I was about to say that you would display the server name but that > > > > means that you have send a request to the server which probably > > > > defeats the purpose of the raw mode > > > > > > > > > > > > > > so that if you know HOW you have configured your own system in the DT > > > > > (maybe multiple servers with different kind of transports ?), you can > > > > > easily select programmatically which one is which, and so decide > > > > > which Raw debugfs fs to use... > > > > > > > > > > ... I plan to leave the SCMI ACS suite use by default the first system > > > > > rooted at /sys/kernel/debug/scmi_raw/0/...maybe adding a commandline > > > > > option to choose an alternative path for SCMI Raw. > > > > > > > > > > Does all of this sound reasonable ? > > > > > > > > Yes, adding an index looks good to me. > > > > > > Ok, I'll rework accordingly. > > > > > > > > > > > As we are there, should we consider adding a per channel entry in the > > > > tree when there are several channels shared with the same server ? > > > > > > > > > > So, I was thinking about this and, even though, it seems not strictly > > > needed for Compliance testing (as discussed offline) I do think that > > > could be a sensible option to have as an additional mean to stress the > > > server transport implementation (as you wish). > > > > Thanks > > > > > > > > Having said that, this week, I was reasoning about an alternative > > > interface to do this, i.e. to avoid to add even more debugfs entries > > > for this chosen-channel config or possibly in the future to ask for > > > transport polling mode (if supported by the underlying transport) > > > > > > My idea (not thought fully through as of now eh..) would be as follows: > > > > > > since the current Raw implementation enforces a minimum size of 4 bytes > > > for the injected message (more on this later down below in NOTE), I was > > > thinking about using less-than-4-bytes-sized messages to sort of > > > pre-configure the Raw stack. > > > > > > IOW, instead of having a number of new additional entries like > > > > > > ../message_ch10 > > > ../message_ch13 > > > ../message_poll > > > > > > we could design a sort of API (in the API :D) that defines how > > > 3-bytes message payload are to be interpreted, so that in normal usage > > > everything will go on as it is now, while if a 3-bytes message is > > > injected by a specially crafted testcase, it would be used to configure > > > the behaviour stack for the subsequent set of Raw transactions > > > (i.e. for the currently opened fd...) like: > > > > > > - open message fd > > > > > > - send a configure message: > > > > > > | proto_chan_# | flags (polling..) | > > > ------------------------------------------ > > > 0 7 21 > > > > > > - send/receive your test messages > > > > > > - repeat or close (then the config will vanish...) > > > > > > This would mean adding some sort entry under scmi_raw to expose the > > > configured available channels on the system though. > > > > > > [maybe the flags above could also include an async flag and avoid > > > also to add the message_async entries...] > > > > > > I discarded the idea to run the above config process via IOCTLs since > > > it seemed to me even more frowned upon to use IOCTLs on a debugfs entry > > > :P...but I maybe wrong ah... > > > > > > All of this is still to be explored anyway, any thoughts ? or evident > > > drawbacks ? (beside having to clearly define an API for these message > > > config machinery) > > > > TBH, I'm not a fan of adding a protocol on top of the SCMI one. This > > interface aims to test the SCMI servers and their channels so we > > should focus on this and make it simple to use. IMHO, adding some > > special bytes before the real scmi message is prone to create > > complexity and error in the use of this debug interface. > > > > Indeed, even if only for transport-related tests, the risk is to make > more complicate to use the interface. > > Agreed, just wanted to have some feedback. I'll revert to some based on > debugfs trying to minimize entries and improper usage...maybe something > like grouping on per-channel subdirs when different channels are > available. > > E.g. on a system with a dedicated PERF channel I could have > > > ../scmi_raw/0/message <<< usual auto channel selection > /message_async > ... > > /chan_0x10/message <<< use default channel (base proto) > /message_async > .... > > /chan_0x13/message <<< use PERF channel > /message_async > .... The proposal above looks good for me Thanks > > The alternative would be to have a common single entry to configure the usage > of the a single batch of message/message_async entries BUT that seems to > me more prone to error, e.g. if you dont clear the special config at the > end of your special testcase you could endup using custom channels conf also > with any following regular test which expects to benefit from the > automatic channel selection (which I still think should be default way of > running these tests..) > > Thoughts ? > > Thanks for the feedback. > Cristian
On Fri, Oct 28, 2022 at 07:38:25PM -0700, Florian Fainelli wrote: > Hi Christian, > > On 10/19/2022 1:46 PM, Cristian Marussi wrote: > [snip] > > > In V2 the runtime enable/disable switching capability has been removed > > (for now) since still not deemed to be stable/reliable enough: as a > > consequence when SCMI Raw support is compiled in, the regular SCMI stack > > drivers are now inhibited permanently for that Kernel. > > For our platforms (ARCH_BRCMSTB) we would need to have the ability to start > with the regular SCMI stack to satisfy if nothing else, all clock consumers > otherwise it makes it fairly challenging for us to boot to a prompt as we > purposely turn off all unnecessary peripherals to conserve power. We could > introduce a "full on" mode to remove the clock provider dependency, but I > suspect others on "real" silicon may suffer from the same short comings. > Fair enough. But if we are doing SCMI firmware testing or conformance via the $subject proposed way, can these drivers survive if the userspace do a random or a torture test changing the clock configurations ? Not sure how to deal with that as the intention here is to do the testing from the user-space and anything can happen. How do we avoid bring the entire system down while doing this testing. Can we unbind all the drivers using scmi on your platform ? I guess no. Let me know. > Once user-space is reached, I suppose we could find a way to unbind from all > SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a > governor that won't do any active frequency switching etc. > > What do you think? Yes, Cristian always wanted to support that but I am the one trying to convince him not to unless there is a strong requirement for it. You seem to suggest that you have such a requirement, but that just opens loads of questions and how to we deal with that. Few of them are as stated above, I need to recall all the conversations I had with Cristian around that and why handling it may be bit complex. -- Regards, Sudeep
On Thu, Nov 03, 2022 at 11:21:47AM +0000, Sudeep Holla wrote: > On Fri, Oct 28, 2022 at 07:38:25PM -0700, Florian Fainelli wrote: > > Hi Christian, > > > > On 10/19/2022 1:46 PM, Cristian Marussi wrote: > > [snip] > > > > > In V2 the runtime enable/disable switching capability has been removed > > > (for now) since still not deemed to be stable/reliable enough: as a > > > consequence when SCMI Raw support is compiled in, the regular SCMI stack > > > drivers are now inhibited permanently for that Kernel. > > > > For our platforms (ARCH_BRCMSTB) we would need to have the ability to start > > with the regular SCMI stack to satisfy if nothing else, all clock consumers > > otherwise it makes it fairly challenging for us to boot to a prompt as we > > purposely turn off all unnecessary peripherals to conserve power. We could > > introduce a "full on" mode to remove the clock provider dependency, but I > > suspect others on "real" silicon may suffer from the same short comings. > > > > Fair enough. But if we are doing SCMI firmware testing or conformance via > the $subject proposed way, can these drivers survive if the userspace do > a random or a torture test changing the clock configurations ? Not sure > how to deal with that as the intention here is to do the testing from the > user-space and anything can happen. How do we avoid bring the entire system > down while doing this testing. Can we unbind all the drivers using scmi on > your platform ? I guess no. Let me know. > > > Once user-space is reached, I suppose we could find a way to unbind from all > > SCMI consumers, and/or ensure that runtime PM is disabled, cpufreq is in a > > governor that won't do any active frequency switching etc. > > > > What do you think? > > Yes, Cristian always wanted to support that but I am the one trying to > convince him not to unless there is a strong requirement for it. You seem > to suggest that you have such a requirement, but that just opens loads of > questions and how to we deal with that. Few of them are as stated above, I > need to recall all the conversations I had with Cristian around that and why > handling it may be bit complex. :D ... I really even more like the idea of enabling on demand full coexistence so that I completely delegate to the users to manually deal with possible interferences at runtime and drop any liabilities if someone shoots himself in the foot :P ... jokes apart I'll post today a V5 with a few fixes and and an optional coexistence mode so that Florian can experiment and see how much is feasible to operate in this way by manually unbinding/re-configuring SCMI behaviour at runtime before starting tests and not kill the system on something like ARCH_BRCMSTB platforms. Thanks, Cristian