Message ID | cover.1738228114.git.florent.tomasin@arm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Hi, I started to review it, but it's probably best to discuss it here. On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote: > Hi, > > This is a patch series covering the support for protected mode execution in > Mali Panthor CSF kernel driver. > > The Mali CSF GPUs come with the support for protected mode execution at the > HW level. This feature requires two main changes in the kernel driver: > > 1) Configure the GPU with a protected buffer. The system must provide a DMA > heap from which the driver can allocate a protected buffer. > It can be a carved-out memory or dynamically allocated protected memory region. > Some system includes a trusted FW which is in charge of the protected memory. > Since this problem is integration specific, the Mali Panthor CSF kernel > driver must import the protected memory from a device specific exporter. Why do you need a heap for it in the first place? My understanding of your series is that you have a carved out memory region somewhere, and you want to allocate from that carved out memory region your buffers. How is that any different from using a reserved-memory region, adding the reserved-memory property to the GPU device and doing all your allocation through the usual dma_alloc_* API? Or do you expect to have another dma-buf heap that would call into the firmware to perform the allocations? The semantics of the CMA heap allocations is a concern too. Another question is how would you expect something like a secure video-playback pipeline to operate with that kind of interface. Assuming you have a secure codec, you would likely get that protected buffer from the codec, right? So the most likely scenario would be to then import that dma-buf into the GPU driver, but not allocate the buffer from it. Overall, I think a better abstraction would be to have a heap indeed to allocate your protected buffers from, and then import them in the devices that need them. The responsibility would be on the userspace to do so, but it already kind of does with your design anyway. Maxime
Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit : > Hi, > > I started to review it, but it's probably best to discuss it here. > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote: > > Hi, > > > > This is a patch series covering the support for protected mode execution in > > Mali Panthor CSF kernel driver. > > > > The Mali CSF GPUs come with the support for protected mode execution at the > > HW level. This feature requires two main changes in the kernel driver: > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA > > heap from which the driver can allocate a protected buffer. > > It can be a carved-out memory or dynamically allocated protected memory region. > > Some system includes a trusted FW which is in charge of the protected memory. > > Since this problem is integration specific, the Mali Panthor CSF kernel > > driver must import the protected memory from a device specific exporter. > > Why do you need a heap for it in the first place? My understanding of > your series is that you have a carved out memory region somewhere, and > you want to allocate from that carved out memory region your buffers. > > How is that any different from using a reserved-memory region, adding > the reserved-memory property to the GPU device and doing all your > allocation through the usual dma_alloc_* API? How do you then multiplex this region so it can be shared between GPU/Camera/Display/Codec drivers and also userspace ? Also, how the secure memory is allocted / obtained is a process that can vary a lot between SoC, so implementation details assumption should not be coded in the driver. Nicolas > > Or do you expect to have another dma-buf heap that would call into the > firmware to perform the allocations? > > The semantics of the CMA heap allocations is a concern too. > > Another question is how would you expect something like a secure > video-playback pipeline to operate with that kind of interface. Assuming > you have a secure codec, you would likely get that protected buffer from > the codec, right? So the most likely scenario would be to then import > that dma-buf into the GPU driver, but not allocate the buffer from it. > > Overall, I think a better abstraction would be to have a heap indeed to > allocate your protected buffers from, and then import them in the > devices that need them. The responsibility would be on the userspace to > do so, but it already kind of does with your design anyway. > > Maxime
On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote: > Hi, > > This is a patch series covering the support for protected mode execution in > Mali Panthor CSF kernel driver. > > The Mali CSF GPUs come with the support for protected mode execution at the > HW level. This feature requires two main changes in the kernel driver: > > 1) Configure the GPU with a protected buffer. The system must provide a DMA > heap from which the driver can allocate a protected buffer. > It can be a carved-out memory or dynamically allocated protected memory region. > Some system includes a trusted FW which is in charge of the protected memory. > Since this problem is integration specific, the Mali Panthor CSF kernel > driver must import the protected memory from a device specific exporter. > > 2) Handle enter and exit of the GPU HW from normal to protected mode of execution. > FW sends a request for protected mode entry to the kernel driver. > The acknowledgment of that request is a scheduling decision. Effectively, > protected mode execution should not overrule normal mode of execution. > A fair distribution of execution time will guaranty the overall performance > of the device, including the UI (usually executing in normal mode), > will not regress when a protected mode job is submitted by an application. > > > Background > ---------- > > Current Mali Panthor CSF driver does not allow a user space application to > execute protected jobs on the GPU. This use case is quite common on end-user-device. > A user may want to watch a video or render content that is under a "Digital Right > Management" protection, or launch an application with user private data. > > 1) User-space: > > In order for an application to execute protected jobs on a Mali CSF GPU the > user space application must submit jobs to the GPU within a "protected regions" > (range of commands to execute in protected mode). > > Find here an example of a command buffer that contains protected commands: > > ``` > <--- Normal mode ---><--- Protected mode ---><--- Normal mode ---> > +-------------------------------------------------------------------------+ > | ... | CMD_0 | ... | CMD_N | PROT_REGION | CMD_N+1 | ... | CMD_N+M | ... | > +-------------------------------------------------------------------------+ > ``` > > The PROT_REGION command acts as a barrier to notify the HW of upcoming > protected jobs. It also defines the number of commands to execute in protected > mode. > > The Mesa definition of the opcode can be found here: > > https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/panfrost/lib/genxml/v10.xml?ref_type=heads#L763 Is there also something around that implements egl_ext_protected_context or similar in mesa? I think that's the minimal bar all the protected gpu workload kernel support patches cleared thus far, since usually getting the actual video code stuff published seems to be impossible. -Sima > > 2) Kernel-space: > > When loading the FW image, the Kernel driver must also load the data section of > CSF FW that comes from the protected memory, in order to allow FW to execute in > protected mode. > > Important: this memory is not owned by any process. It is a GPU device level > protected memory. > > In addition, when a CSG (group) is created, it must have a protected suspend buffer. > This memory is allocated within the kernel but bound to a specific CSG that belongs > to a process. The kernel owns this allocation and does not allow user space mapping. > The format of the data in this buffer is only known by the FW and does not need to > be shared with other entities. The purpose of this buffer is the same as the normal > suspend buffer but for protected mode. FW will use it to suspend the execution of > PROT_REGION before returning to normal mode of execution. > > > Design decisions > ---------------- > > The Mali Panthor CSF kernel driver will allocate protected DMA buffers > using a global protected DMA heap. The name of the heap can vary on > the system and is integration specific. Therefore, the kernel driver > will retrieve it using the DTB entry: "protected-heap-name". > > The Mali Panthor CSF kernel driver will handle enter/exit of protected > mode with a fair consideration of the job scheduling. > > If the system integrator does not provide a protected DMA heap, the driver > will not allow any protected mode execution. > > > Patch series > ------------ > > The series is based on: > > https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t > > [PATCHES 1-2]: > These patches were used for the development of the feature and are not > initially thought to land in the Linux kernel. They are mostly relevant > if someone wants to reproduce the environment of testing. > > Note: Please, raise interest if you think those patches have value in > the Linux kernel. > > * dt-bindings: dma: Add CMA Heap bindings > * cma-heap: Allow registration of custom cma heaps > > [PATCHES 3-4]: > These patches introduce the Mali Panthor CSF driver DTB binding to pass > the protected DMA Heap name and the handling of the protected DMA memory > allocations in the driver. > > Note: The registration of the protected DMA buffers is done via GEM prime. > The direct call to the registration function, may seems controversial and > I would appreciate feedback on that matter. > > * dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding > * drm/panthor: Add support for protected memory allocation in panthor > > [PATCH 5]: > This patch implements the logic to handle enter/exit of the GPU protected > mode in Panthor CSF driver. > > Note: to prevent scheduler priority inversion, only a single CSG is allowed > to execute while in protected mode. It must be the top priority one. > > * drm/panthor: Add support for entering and exiting protected mode > > Testing > ------- > > 1) Platform and development environment > > Any platform containing a Mali CSF type of GPU and a protected memory allocator > that is based on DMA Heap can be used. For example, it can be a physical platform > or a simulator such as Arm Total Compute FVPs platforms. Reference to the latter: > > https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms/Total%20Compute%20FVPs > > To ease the development of the feature, a carved-out protected memory heap was > defined and managed using a modified version of the CMA heap driver. > > 2) Protected job submission: > > A protected mode job can be created in Mesa following this approach: > > ``` > diff --git a/src/gallium/drivers/panfrost/pan_csf.c b/src/gallium/drivers/panfrost/pan_csf.c > index da6ce875f86..13d54abf5a1 100644 > --- a/src/gallium/drivers/panfrost/pan_csf.c > +++ b/src/gallium/drivers/panfrost/pan_csf.c > @@ -803,8 +803,25 @@ GENX(csf_emit_fragment_job)(struct panfrost_batch *batch, > } > } > > + if (protected_cmd) { > + /* Number of commands to execute in protected mode in bytes. > + * The run fragment and wait commands. */ > + unsigned const size = 2 * sizeof(u64); > + > + /* Wait for all previous commands to complete before evaluating > + * the protected commands. */ > + cs_wait_slots(b, SB_ALL_MASK, false); > + cs_prot_region(b, size); > + } > + > /* Run the fragment job and wait */ > cs_run_fragment(b, false, MALI_TILE_RENDER_ORDER_Z_ORDER, false); > + > + /* Wait for all protected commands to complete before evaluating > + * the normal mode commands. */ > + if (protected_cmd) > + cs_wait_slots(b, SB_ALL_MASK, false); > + > cs_wait_slot(b, 2, false); > > /* Gather freed heap chunks and add them to the heap context free list > ``` > > > Constraints > ----------- > > At the time of developing the feature, Linux kernel does not have a generic > way of implementing protected DMA heaps. The patch series relies on previous > work to expose the DMA heap API to the kernel drivers. > > The Mali CSF GPU requires device level allocated protected memory, which do > not belong to a process. The current Linux implementation of DMA heap only > allows a user space program to allocate from such heap. Having the ability > to allocate this memory at the kernel level via the DMA heap API would allow > support for protected mode on Mali CSF GPUs. > > > Conclusion > ---------- > > This patch series aims to initiate the discussion around handling of protected > mode in Mali CSG GPU and highlights constraints found during the development > of the feature. > > Some Mesa changes are required to construct a protected mode job in user space, > which can be submitted to the GPU. > > Some of the changes may seems controversial and we would appreciate getting > opinion from the community. > > > Regards, > > Florent Tomasin (5): > dt-bindings: dma: Add CMA Heap bindings > cma-heap: Allow registration of custom cma heaps > dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding > drm/panthor: Add support for protected memory allocation in panthor > drm/panthor: Add support for entering and exiting protected mode > > .../devicetree/bindings/dma/linux,cma.yml | 43 ++++++ > .../bindings/gpu/arm,mali-valhall-csf.yaml | 6 + > drivers/dma-buf/heaps/cma_heap.c | 120 +++++++++++------ > drivers/gpu/drm/panthor/Kconfig | 1 + > drivers/gpu/drm/panthor/panthor_device.c | 22 +++- > drivers/gpu/drm/panthor/panthor_device.h | 10 ++ > drivers/gpu/drm/panthor/panthor_fw.c | 46 ++++++- > drivers/gpu/drm/panthor/panthor_fw.h | 2 + > drivers/gpu/drm/panthor/panthor_gem.c | 49 ++++++- > drivers/gpu/drm/panthor/panthor_gem.h | 16 ++- > drivers/gpu/drm/panthor/panthor_heap.c | 2 + > drivers/gpu/drm/panthor/panthor_sched.c | 124 ++++++++++++++++-- > 12 files changed, 382 insertions(+), 59 deletions(-) > create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml > > -- > 2.34.1 >
Hi Nicolas, On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote: > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit : > > Hi, > > > > I started to review it, but it's probably best to discuss it here. > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote: > > > Hi, > > > > > > This is a patch series covering the support for protected mode execution in > > > Mali Panthor CSF kernel driver. > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the > > > HW level. This feature requires two main changes in the kernel driver: > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA > > > heap from which the driver can allocate a protected buffer. > > > It can be a carved-out memory or dynamically allocated protected memory region. > > > Some system includes a trusted FW which is in charge of the protected memory. > > > Since this problem is integration specific, the Mali Panthor CSF kernel > > > driver must import the protected memory from a device specific exporter. > > > > Why do you need a heap for it in the first place? My understanding of > > your series is that you have a carved out memory region somewhere, and > > you want to allocate from that carved out memory region your buffers. > > > > How is that any different from using a reserved-memory region, adding > > the reserved-memory property to the GPU device and doing all your > > allocation through the usual dma_alloc_* API? > > How do you then multiplex this region so it can be shared between > GPU/Camera/Display/Codec drivers and also userspace ? You could point all the devices to the same reserved memory region, and they would all allocate from there, including for their userspace-facing allocations. > Also, how the secure memory is allocted / obtained is a process that > can vary a lot between SoC, so implementation details assumption > should not be coded in the driver. But yeah, we agree there, it's also the point I was trying to make :) Maxime
Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit : > Hi Nicolas, > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote: > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit : > > > Hi, > > > > > > I started to review it, but it's probably best to discuss it here. > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote: > > > > Hi, > > > > > > > > This is a patch series covering the support for protected mode execution in > > > > Mali Panthor CSF kernel driver. > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the > > > > HW level. This feature requires two main changes in the kernel driver: > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA > > > > heap from which the driver can allocate a protected buffer. > > > > It can be a carved-out memory or dynamically allocated protected memory region. > > > > Some system includes a trusted FW which is in charge of the protected memory. > > > > Since this problem is integration specific, the Mali Panthor CSF kernel > > > > driver must import the protected memory from a device specific exporter. > > > > > > Why do you need a heap for it in the first place? My understanding of > > > your series is that you have a carved out memory region somewhere, and > > > you want to allocate from that carved out memory region your buffers. > > > > > > How is that any different from using a reserved-memory region, adding > > > the reserved-memory property to the GPU device and doing all your > > > allocation through the usual dma_alloc_* API? > > > > How do you then multiplex this region so it can be shared between > > GPU/Camera/Display/Codec drivers and also userspace ? > > You could point all the devices to the same reserved memory region, and > they would all allocate from there, including for their userspace-facing > allocations. I get that using memory region is somewhat more of an HW description, and aligned with what a DT is supposed to describe. One of the challenge is that Mediatek heap proposal endup calling into their TEE, meaning knowing the region is not that useful. You actually need the TEE APP guid and its IPC protocol. If we can dell drivers to use a head instead, we can abstract that SoC specific complexity. I believe each allocated addressed has to be mapped to a zone, and that can only be done in the secure application. I can imagine similar needs when the protection is done using some sort of a VM / hypervisor. Nicolas > > > Also, how the secure memory is allocted / obtained is a process that > > can vary a lot between SoC, so implementation details assumption > > should not be coded in the driver. > > But yeah, we agree there, it's also the point I was trying to make :) > > Maxime
diff --git a/src/gallium/drivers/panfrost/pan_csf.c b/src/gallium/drivers/panfrost/pan_csf.c index da6ce875f86..13d54abf5a1 100644 --- a/src/gallium/drivers/panfrost/pan_csf.c +++ b/src/gallium/drivers/panfrost/pan_csf.c @@ -803,8 +803,25 @@ GENX(csf_emit_fragment_job)(struct panfrost_batch *batch, } } + if (protected_cmd) { + /* Number of commands to execute in protected mode in bytes. + * The run fragment and wait commands. */ + unsigned const size = 2 * sizeof(u64); + + /* Wait for all previous commands to complete before evaluating + * the protected commands. */ + cs_wait_slots(b, SB_ALL_MASK, false); + cs_prot_region(b, size); + } + /* Run the fragment job and wait */ cs_run_fragment(b, false, MALI_TILE_RENDER_ORDER_Z_ORDER, false); + + /* Wait for all protected commands to complete before evaluating + * the normal mode commands. */ + if (protected_cmd) + cs_wait_slots(b, SB_ALL_MASK, false); + cs_wait_slot(b, 2, false); /* Gather freed heap chunks and add them to the heap context free list