diff mbox series

[v6,01/14] drm/panthor: Add uAPI

Message ID 20240229162230.2634044-2-boris.brezillon@collabora.com (mailing list archive)
State New, archived
Headers show
Series drm: Add a driver for CSF-based Mali GPUs | expand

Commit Message

Boris Brezillon Feb. 29, 2024, 4:22 p.m. UTC
Panthor follows the lead of other recently submitted drivers with
ioctls allowing us to support modern Vulkan features, like sparse memory
binding:

- Pretty standard GEM management ioctls (BO_CREATE and BO_MMAP_OFFSET),
  with the 'exclusive-VM' bit to speed-up BO reservation on job submission
- VM management ioctls (VM_CREATE, VM_DESTROY and VM_BIND). The VM_BIND
  ioctl is loosely based on the Xe model, and can handle both
  asynchronous and synchronous requests
- GPU execution context creation/destruction, tiler heap context creation
  and job submission. Those ioctls reflect how the hardware/scheduler
  works and are thus driver specific.

We also have a way to expose IO regions, such that the usermode driver
can directly access specific/well-isolate registers, like the
LATEST_FLUSH register used to implement cache-flush reduction.

This uAPI intentionally keeps usermode queues out of the scope, which
explains why doorbell registers and command stream ring-buffers are not
directly exposed to userspace.

v6:
- Add Maxime's and Heiko's acks

v5:
- Fix typo
- Add Liviu's R-b

v4:
- Add a VM_GET_STATE ioctl
- Fix doc
- Expose the CORE_FEATURES register so we can deal with variants in the
  UMD
- Add Steve's R-b

v3:
- Add the concept of sync-only VM operation
- Fix support for 32-bit userspace
- Rework drm_panthor_vm_create to pass the user VA size instead of
  the kernel VA size (suggested by Robin Murphy)
- Typo fixes
- Explicitly cast enums with top bit set to avoid compiler warnings in
  -pedantic mode.
- Drop property core_group_count as it can be easily calculated by the
  number of bits set in l2_present.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
---
 Documentation/gpu/driver-uapi.rst |   5 +
 include/uapi/drm/panthor_drm.h    | 945 ++++++++++++++++++++++++++++++
 2 files changed, 950 insertions(+)
 create mode 100644 include/uapi/drm/panthor_drm.h

Comments

Erik Faye-Lund Oct. 16, 2024, 1:16 p.m. UTC | #1
On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
> +/**
> + * enum drm_panthor_sync_op_flags - Synchronization operation flags.
> + */
> +enum drm_panthor_sync_op_flags {
> +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: Synchronization
> handle type mask. */
> +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> +
> +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> Synchronization object type. */
> +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> +
> +	/**
> +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> Timeline synchronization
> +	 * object type.
> +	 */
> +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
> +
> +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> +
> +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
> +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),

Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit signed
integer, so isn't this undefined behavior in C?

I'm asking, because Coverity complains about overflows when we assign
the value to drm_panthor_sync_op::flags in Mesa, and looking at it...
it seems to be right in complaining!

For reference, here's the Coverity issue (requires login,
unfortunately):
https://scan5.scan.coverity.com/#/project-view/59795/10037?selectedIssue=1605083
Boris Brezillon Oct. 16, 2024, 1:47 p.m. UTC | #2
On Wed, 16 Oct 2024 15:16:22 +0200
Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:

> On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
> > +/**
> > + * enum drm_panthor_sync_op_flags - Synchronization operation flags.
> > + */
> > +enum drm_panthor_sync_op_flags {
> > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: Synchronization
> > handle type mask. */
> > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > +
> > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > Synchronization object type. */
> > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > +
> > +	/**
> > +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > Timeline synchronization
> > +	 * object type.
> > +	 */
> > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
> > +
> > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > +
> > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
> > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),  
> 
> Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit signed
> integer, so isn't this undefined behavior in C?
> 
> I'm asking, because Coverity complains about overflows when we assign
> the value to drm_panthor_sync_op::flags in Mesa, and looking at it...
> it seems to be right in complaining!
> 
> For reference, here's the Coverity issue (requires login,
> unfortunately):
> https://scan5.scan.coverity.com/#/project-view/59795/10037?selectedIssue=1605083
> 

The cast was originally suggested here [1]. If we don't want to rely
on gcc/llvm implementation of this UB, I guess the only option is to
redefine those enums as #defines.

[1]https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
Erik Faye-Lund Oct. 16, 2024, 1:50 p.m. UTC | #3
On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:
> On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
> > +/**
> > + * enum drm_panthor_sync_op_flags - Synchronization operation
> > flags.
> > + */
> > +enum drm_panthor_sync_op_flags {
> > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: Synchronization
> > handle type mask. */
> > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > +
> > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > Synchronization object type. */
> > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > +
> > +	/**
> > +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > Timeline synchronization
> > +	 * object type.
> > +	 */
> > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
> > +
> > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > +
> > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
> > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),
> 
> Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit signed
> integer, so isn't this undefined behavior in C?
> 

Seems this was proposed here:
https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/

...that kinda sounds like bad advice to me.

Also, it's been pointed out to me elsewhere that this isn't
*technically speaking* undefined, it's "implementation defined". But as
far as kernel interfaces goes, that's pretty much the same; we can't
guarantee that the kernel and the user-space is using the same
implementation.

Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
unsigned integers":

"""
Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised
""""

I think a better approach be to use -1 << 31, which is well-defined.
But the problem then becomes assigning it into
drm_panthor_sync_op::flags in a well-defined way... Could we make the
field signed? That seems a bit bad as well...
Robin Murphy Oct. 16, 2024, 2:02 p.m. UTC | #4
On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:
> On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:
>> On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
>>> +/**
>>> + * enum drm_panthor_sync_op_flags - Synchronization operation
>>> flags.
>>> + */
>>> +enum drm_panthor_sync_op_flags {
>>> +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: Synchronization
>>> handle type mask. */
>>> +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
>>> +
>>> +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
>>> Synchronization object type. */
>>> +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
>>> +
>>> +	/**
>>> +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
>>> Timeline synchronization
>>> +	 * object type.
>>> +	 */
>>> +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
>>> +
>>> +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
>>> +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
>>> +
>>> +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
>>> +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),
>>
>> Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit signed
>> integer, so isn't this undefined behavior in C?
>>
> 
> Seems this was proposed here:
> https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
> 
> ...that kinda sounds like bad advice to me.
> 
> Also, it's been pointed out to me elsewhere that this isn't
> *technically speaking* undefined, it's "implementation defined". But as
> far as kernel interfaces goes, that's pretty much the same; we can't
> guarantee that the kernel and the user-space is using the same
> implementation.
> 
> Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
> unsigned integers":
> 
> """
> Otherwise, the new type is signed and the value cannot be represented
> in it; either the result is implementation-defined or an
> implementation-defined signal is raised
> """"
> 
> I think a better approach be to use -1 << 31, which is well-defined.
> But the problem then becomes assigning it into
> drm_panthor_sync_op::flags in a well-defined way... Could we make the
> field signed? That seems a bit bad as well...

Is that a problem? Signed->unsigned conversion is always well-defined 
(6.3.1.3 again), since it doesn't depend on how the signed type 
represents negatives.

Robin.
Erik Faye-Lund Oct. 16, 2024, 2:05 p.m. UTC | #5
On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:
> On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:
> > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:
> > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
> > > > +/**
> > > > + * enum drm_panthor_sync_op_flags - Synchronization operation
> > > > flags.
> > > > + */
> > > > +enum drm_panthor_sync_op_flags {
> > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
> > > > Synchronization
> > > > handle type mask. */
> > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > > > +
> > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > > > Synchronization object type. */
> > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > > > +
> > > > +	/**
> > > > +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > > > Timeline synchronization
> > > > +	 * object type.
> > > > +	 */
> > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
> > > > +
> > > > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > > > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > > > +
> > > > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
> > > > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),
> > > 
> > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit
> > > signed
> > > integer, so isn't this undefined behavior in C?
> > > 
> > 
> > Seems this was proposed here:
> > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
> > 
> > ...that kinda sounds like bad advice to me.
> > 
> > Also, it's been pointed out to me elsewhere that this isn't
> > *technically speaking* undefined, it's "implementation defined".
> > But as
> > far as kernel interfaces goes, that's pretty much the same; we
> > can't
> > guarantee that the kernel and the user-space is using the same
> > implementation.
> > 
> > Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
> > unsigned integers":
> > 
> > """
> > Otherwise, the new type is signed and the value cannot be
> > represented
> > in it; either the result is implementation-defined or an
> > implementation-defined signal is raised
> > """"
> > 
> > I think a better approach be to use -1 << 31, which is well-
> > defined.
> > But the problem then becomes assigning it into
> > drm_panthor_sync_op::flags in a well-defined way... Could we make
> > the
> > field signed? That seems a bit bad as well...
> 
> Is that a problem? Signed->unsigned conversion is always well-defined
> (6.3.1.3 again), since it doesn't depend on how the signed type 
> represents negatives.
> 
> Robin.

Ah, you're right. So that could fix the problem, indeed.
Erik Faye-Lund Oct. 16, 2024, 2:07 p.m. UTC | #6
On Wed, 2024-10-16 at 15:47 +0200, Boris Brezillon wrote:
> On Wed, 16 Oct 2024 15:16:22 +0200
> Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:
> 
> > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
> > > +/**
> > > + * enum drm_panthor_sync_op_flags - Synchronization operation
> > > flags.
> > > + */
> > > +enum drm_panthor_sync_op_flags {
> > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
> > > Synchronization
> > > handle type mask. */
> > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > > +
> > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > > Synchronization object type. */
> > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > > +
> > > +	/**
> > > +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > > Timeline synchronization
> > > +	 * object type.
> > > +	 */
> > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
> > > +
> > > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > > +
> > > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
> > > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),  
> > 
> > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit signed
> > integer, so isn't this undefined behavior in C?
> > 
> > I'm asking, because Coverity complains about overflows when we
> > assign
> > the value to drm_panthor_sync_op::flags in Mesa, and looking at
> > it...
> > it seems to be right in complaining!
> > 
> > For reference, here's the Coverity issue (requires login,
> > unfortunately):
> > https://scan5.scan.coverity.com/#/project-view/59795/10037?selectedIssue=1605083
> > 
> 
> The cast was originally suggested here [1]. If we don't want to rely
> on gcc/llvm implementation of this UB, I guess the only option is to
> redefine those enums as #defines.
> 
> [1]
> https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com
> /
> 

I think we could either do the defines, or use "-1 << 31", with a
comment to explain what's going on. Either works for me.
Boris Brezillon Oct. 16, 2024, 2:18 p.m. UTC | #7
On Wed, 16 Oct 2024 16:05:55 +0200
Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:

> On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:
> > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:  
> > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:  
> > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:  
> > > > > +/**
> > > > > + * enum drm_panthor_sync_op_flags - Synchronization operation
> > > > > flags.
> > > > > + */
> > > > > +enum drm_panthor_sync_op_flags {
> > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
> > > > > Synchronization
> > > > > handle type mask. */
> > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > > > > +
> > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > > > > Synchronization object type. */
> > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > > > > +
> > > > > +	/**
> > > > > +	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > > > > Timeline synchronization
> > > > > +	 * object type.
> > > > > +	 */
> > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
> > > > > +
> > > > > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > > > > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > > > > +
> > > > > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
> > > > > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),  
> > > > 
> > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit
> > > > signed
> > > > integer, so isn't this undefined behavior in C?
> > > >   
> > > 
> > > Seems this was proposed here:
> > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
> > > 
> > > ...that kinda sounds like bad advice to me.
> > > 
> > > Also, it's been pointed out to me elsewhere that this isn't
> > > *technically speaking* undefined, it's "implementation defined".
> > > But as
> > > far as kernel interfaces goes, that's pretty much the same; we
> > > can't
> > > guarantee that the kernel and the user-space is using the same
> > > implementation.
> > > 
> > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
> > > unsigned integers":
> > > 
> > > """
> > > Otherwise, the new type is signed and the value cannot be
> > > represented
> > > in it; either the result is implementation-defined or an
> > > implementation-defined signal is raised
> > > """"
> > > 
> > > I think a better approach be to use -1 << 31, which is well-
> > > defined.
> > > But the problem then becomes assigning it into
> > > drm_panthor_sync_op::flags in a well-defined way... Could we make
> > > the
> > > field signed? That seems a bit bad as well...  
> > 
> > Is that a problem? Signed->unsigned conversion is always well-defined
> > (6.3.1.3 again), since it doesn't depend on how the signed type 
> > represents negatives.
> > 
> > Robin.  
> 
> Ah, you're right. So that could fix the problem, indeed.

On the other hand, I hate the idea of having -1 << 31 to encode
bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when
we'll reach a value above 0x7, because then the negative value is hard
to map to its unsigned representation. If we really care about this
corner case, I'd rather go full-defines for flags and call it a day.
Erik Faye-Lund Oct. 17, 2024, 8:51 a.m. UTC | #8
On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote:
> On Wed, 16 Oct 2024 16:05:55 +0200
> Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:
> 
> > On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:
> > > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:  
> > > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:  
> > > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:  
> > > > > > +/**
> > > > > > + * enum drm_panthor_sync_op_flags - Synchronization
> > > > > > operation
> > > > > > flags.
> > > > > > + */
> > > > > > +enum drm_panthor_sync_op_flags {
> > > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
> > > > > > Synchronization
> > > > > > handle type mask. */
> > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > > > > > +
> > > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > > > > > Synchronization object type. */
> > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > > > > > +
> > > > > > +	/**
> > > > > > +	 *
> > > > > > @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > > > > > Timeline synchronization
> > > > > > +	 * object type.
> > > > > > +	 */
> > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ =
> > > > > > 1,
> > > > > > +
> > > > > > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > > > > > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > > > > > +
> > > > > > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation.
> > > > > > */
> > > > > > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),  
> > > > > 
> > > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit
> > > > > signed
> > > > > integer, so isn't this undefined behavior in C?
> > > > >   
> > > > 
> > > > Seems this was proposed here:
> > > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
> > > > 
> > > > ...that kinda sounds like bad advice to me.
> > > > 
> > > > Also, it's been pointed out to me elsewhere that this isn't
> > > > *technically speaking* undefined, it's "implementation
> > > > defined".
> > > > But as
> > > > far as kernel interfaces goes, that's pretty much the same; we
> > > > can't
> > > > guarantee that the kernel and the user-space is using the same
> > > > implementation.
> > > > 
> > > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
> > > > unsigned integers":
> > > > 
> > > > """
> > > > Otherwise, the new type is signed and the value cannot be
> > > > represented
> > > > in it; either the result is implementation-defined or an
> > > > implementation-defined signal is raised
> > > > """"
> > > > 
> > > > I think a better approach be to use -1 << 31, which is well-
> > > > defined.
> > > > But the problem then becomes assigning it into
> > > > drm_panthor_sync_op::flags in a well-defined way... Could we
> > > > make
> > > > the
> > > > field signed? That seems a bit bad as well...  
> > > 
> > > Is that a problem? Signed->unsigned conversion is always well-
> > > defined
> > > (6.3.1.3 again), since it doesn't depend on how the signed type 
> > > represents negatives.
> > > 
> > > Robin.  
> > 
> > Ah, you're right. So that could fix the problem, indeed.
> 
> On the other hand, I hate the idea of having -1 << 31 to encode
> bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when
> we'll reach a value above 0x7, because then the negative value is
> hard
> to map to its unsigned representation. If we really care about this
> corner case, I'd rather go full-defines for flags and call it a day.
> 

Yeah, I suppose it can get ugly for some other cases.

If we rule that out, I think there's only two options I can think of
left:

1. Using #defines instead, like Boris suggested
2. Using 64 bit signed enums (e.g "1ll << 31" instead)

Again, #2 here would be the smaller change. But I kinda think I lean
towards #1, because... These aren't really enumerators. They are flags.

...Yeah, sure. In C the practical difference isn't huge. But if we ever
wanted to support using these enums from C++ code, we'd need to add
overloaded operators, because C++ doesn't allow ORing together enums
out of the box.

I'm not saying I have any plans on using the uAPI from C++, just saying
that if we're going to tackle this, we might as well tackle it
completely...

Also, expanding the enum-type to 64 bits might have some additional
consequences, like needlessly needing more stack-space to pass values
around etc.

Thoughts? Surely there must be some precedence on using the top bit for
flags in the kernel, no?
Boris Brezillon Oct. 17, 2024, 10:08 a.m. UTC | #9
On Thu, 17 Oct 2024 10:51:32 +0200
Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:

> On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote:
> > On Wed, 16 Oct 2024 16:05:55 +0200
> > Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:
> >   
> > > On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:  
> > > > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:    
> > > > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:    
> > > > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:    
> > > > > > > +/**
> > > > > > > + * enum drm_panthor_sync_op_flags - Synchronization
> > > > > > > operation
> > > > > > > flags.
> > > > > > > + */
> > > > > > > +enum drm_panthor_sync_op_flags {
> > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
> > > > > > > Synchronization
> > > > > > > handle type mask. */
> > > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > > > > > > +
> > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > > > > > > Synchronization object type. */
> > > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > > > > > > +
> > > > > > > +	/**
> > > > > > > +	 *
> > > > > > > @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > > > > > > Timeline synchronization
> > > > > > > +	 * object type.
> > > > > > > +	 */
> > > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ =
> > > > > > > 1,
> > > > > > > +
> > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
> > > > > > > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > > > > > > +
> > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation.
> > > > > > > */
> > > > > > > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),    
> > > > > > 
> > > > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit
> > > > > > signed
> > > > > > integer, so isn't this undefined behavior in C?
> > > > > >     
> > > > > 
> > > > > Seems this was proposed here:
> > > > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
> > > > > 
> > > > > ...that kinda sounds like bad advice to me.
> > > > > 
> > > > > Also, it's been pointed out to me elsewhere that this isn't
> > > > > *technically speaking* undefined, it's "implementation
> > > > > defined".
> > > > > But as
> > > > > far as kernel interfaces goes, that's pretty much the same; we
> > > > > can't
> > > > > guarantee that the kernel and the user-space is using the same
> > > > > implementation.
> > > > > 
> > > > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
> > > > > unsigned integers":
> > > > > 
> > > > > """
> > > > > Otherwise, the new type is signed and the value cannot be
> > > > > represented
> > > > > in it; either the result is implementation-defined or an
> > > > > implementation-defined signal is raised
> > > > > """"
> > > > > 
> > > > > I think a better approach be to use -1 << 31, which is well-
> > > > > defined.
> > > > > But the problem then becomes assigning it into
> > > > > drm_panthor_sync_op::flags in a well-defined way... Could we
> > > > > make
> > > > > the
> > > > > field signed? That seems a bit bad as well...    
> > > > 
> > > > Is that a problem? Signed->unsigned conversion is always well-
> > > > defined
> > > > (6.3.1.3 again), since it doesn't depend on how the signed type 
> > > > represents negatives.
> > > > 
> > > > Robin.    
> > > 
> > > Ah, you're right. So that could fix the problem, indeed.  
> > 
> > On the other hand, I hate the idea of having -1 << 31 to encode
> > bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when
> > we'll reach a value above 0x7, because then the negative value is
> > hard
> > to map to its unsigned representation. If we really care about this
> > corner case, I'd rather go full-defines for flags and call it a day.
> >   
> 
> Yeah, I suppose it can get ugly for some other cases.
> 
> If we rule that out, I think there's only two options I can think of
> left:
> 
> 1. Using #defines instead, like Boris suggested
> 2. Using 64 bit signed enums (e.g "1ll << 31" instead)
> 
> Again, #2 here would be the smaller change. But I kinda think I lean
> towards #1, because... These aren't really enumerators. They are flags.
> 
> ...Yeah, sure. In C the practical difference isn't huge. But if we ever
> wanted to support using these enums from C++ code, we'd need to add
> overloaded operators, because C++ doesn't allow ORing together enums
> out of the box.
> 
> I'm not saying I have any plans on using the uAPI from C++, just saying
> that if we're going to tackle this, we might as well tackle it
> completely...
> 
> Also, expanding the enum-type to 64 bits might have some additional
> consequences, like needlessly needing more stack-space to pass values
> around etc.
> 
> Thoughts?

I'm leaning towards defines, because 64-bit enums are uncommon
(FWIW, 'git grep "1ll << 31" include/uapi' returns nothing).
Erik Faye-Lund Oct. 17, 2024, 10:15 a.m. UTC | #10
On Thu, 2024-10-17 at 12:08 +0200, Boris Brezillon wrote:
> On Thu, 17 Oct 2024 10:51:32 +0200
> Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:
> 
> > On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote:
> > > On Wed, 16 Oct 2024 16:05:55 +0200
> > > Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:
> > >   
> > > > On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:  
> > > > > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:    
> > > > > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:   
> > > > > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon
> > > > > > > wrote:    
> > > > > > > > +/**
> > > > > > > > + * enum drm_panthor_sync_op_flags - Synchronization
> > > > > > > > operation
> > > > > > > > flags.
> > > > > > > > + */
> > > > > > > > +enum drm_panthor_sync_op_flags {
> > > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
> > > > > > > > Synchronization
> > > > > > > > handle type mask. */
> > > > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
> > > > > > > > +
> > > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
> > > > > > > > Synchronization object type. */
> > > > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
> > > > > > > > +
> > > > > > > > +	/**
> > > > > > > > +	 *
> > > > > > > > @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
> > > > > > > > Timeline synchronization
> > > > > > > > +	 * object type.
> > > > > > > > +	 */
> > > > > > > > +	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCO
> > > > > > > > BJ =
> > > > > > > > 1,
> > > > > > > > +
> > > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation.
> > > > > > > > */
> > > > > > > > +	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
> > > > > > > > +
> > > > > > > > +	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal
> > > > > > > > operation.
> > > > > > > > */
> > > > > > > > +	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u <<
> > > > > > > > 31),    
> > > > > > > 
> > > > > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-
> > > > > > > bit
> > > > > > > signed
> > > > > > > integer, so isn't this undefined behavior in C?
> > > > > > >     
> > > > > > 
> > > > > > Seems this was proposed here:
> > > > > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
> > > > > > 
> > > > > > ...that kinda sounds like bad advice to me.
> > > > > > 
> > > > > > Also, it's been pointed out to me elsewhere that this isn't
> > > > > > *technically speaking* undefined, it's "implementation
> > > > > > defined".
> > > > > > But as
> > > > > > far as kernel interfaces goes, that's pretty much the same;
> > > > > > we
> > > > > > can't
> > > > > > guarantee that the kernel and the user-space is using the
> > > > > > same
> > > > > > implementation.
> > > > > > 
> > > > > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed
> > > > > > and
> > > > > > unsigned integers":
> > > > > > 
> > > > > > """
> > > > > > Otherwise, the new type is signed and the value cannot be
> > > > > > represented
> > > > > > in it; either the result is implementation-defined or an
> > > > > > implementation-defined signal is raised
> > > > > > """"
> > > > > > 
> > > > > > I think a better approach be to use -1 << 31, which is
> > > > > > well-
> > > > > > defined.
> > > > > > But the problem then becomes assigning it into
> > > > > > drm_panthor_sync_op::flags in a well-defined way... Could
> > > > > > we
> > > > > > make
> > > > > > the
> > > > > > field signed? That seems a bit bad as well...    
> > > > > 
> > > > > Is that a problem? Signed->unsigned conversion is always
> > > > > well-
> > > > > defined
> > > > > (6.3.1.3 again), since it doesn't depend on how the signed
> > > > > type 
> > > > > represents negatives.
> > > > > 
> > > > > Robin.    
> > > > 
> > > > Ah, you're right. So that could fix the problem, indeed.  
> > > 
> > > On the other hand, I hate the idea of having -1 << 31 to encode
> > > bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx
> > > when
> > > we'll reach a value above 0x7, because then the negative value is
> > > hard
> > > to map to its unsigned representation. If we really care about
> > > this
> > > corner case, I'd rather go full-defines for flags and call it a
> > > day.
> > >   
> > 
> > Yeah, I suppose it can get ugly for some other cases.
> > 
> > If we rule that out, I think there's only two options I can think
> > of
> > left:
> > 
> > 1. Using #defines instead, like Boris suggested
> > 2. Using 64 bit signed enums (e.g "1ll << 31" instead)
> > 
> > Again, #2 here would be the smaller change. But I kinda think I
> > lean
> > towards #1, because... These aren't really enumerators. They are
> > flags.
> > 
> > ...Yeah, sure. In C the practical difference isn't huge. But if we
> > ever
> > wanted to support using these enums from C++ code, we'd need to add
> > overloaded operators, because C++ doesn't allow ORing together
> > enums
> > out of the box.
> > 
> > I'm not saying I have any plans on using the uAPI from C++, just
> > saying
> > that if we're going to tackle this, we might as well tackle it
> > completely...
> > 
> > Also, expanding the enum-type to 64 bits might have some additional
> > consequences, like needlessly needing more stack-space to pass
> > values
> > around etc.
> > 
> > Thoughts?
> 
> I'm leaning towards defines, because 64-bit enums are uncommon
> (FWIW, 'git grep "1ll << 31" include/uapi' returns nothing).

Agreed. Doing some grepping, it seems we're the *only* ones using the
current cast-through-int trickery. And there's *one* other driver using
the 1 << 31 with an enum (G2D_BUF_USERPTR in exynos_drm.h), which is
what was supposedly a problem with the original code.

All other 1 << 31 variants in the uapi folder are defines.

I'll type up a patch when I have some time.
Mihail Atanassov Oct. 17, 2024, 6:34 p.m. UTC | #11
Hi Erik,

On 17/10/2024 09:51, Erik Faye-Lund wrote:
> On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote:
>> On Wed, 16 Oct 2024 16:05:55 +0200
>> Erik Faye-Lund <erik.faye-lund@collabora.com> wrote:
>>
>>> On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:
>>>> On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:
>>>>> On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:
>>>>>> On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:
>>>>>>> +/**
>>>>>>> + * enum drm_panthor_sync_op_flags - Synchronization
>>>>>>> operation
>>>>>>> flags.
>>>>>>> + */
>>>>>>> +enum drm_panthor_sync_op_flags {
>>>>>>> +       /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK:
>>>>>>> Synchronization
>>>>>>> handle type mask. */
>>>>>>> +       DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
>>>>>>> +
>>>>>>> +       /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ:
>>>>>>> Synchronization object type. */
>>>>>>> +       DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
>>>>>>> +
>>>>>>> +       /**
>>>>>>> +        *
>>>>>>> @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ:
>>>>>>> Timeline synchronization
>>>>>>> +        * object type.
>>>>>>> +        */
>>>>>>> +       DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ =
>>>>>>> 1,
>>>>>>> +
>>>>>>> +       /** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
>>>>>>> +       DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
>>>>>>> +
>>>>>>> +       /** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation.
>>>>>>> */
>>>>>>> +       DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),
>>>>>>
>>>>>> Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit
>>>>>> signed
>>>>>> integer, so isn't this undefined behavior in C?
>>>>>>
>>>>>
>>>>> Seems this was proposed here:
>>>>> https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1d23@arm.com/
>>>>>
>>>>> ...that kinda sounds like bad advice to me.
>>>>>
>>>>> Also, it's been pointed out to me elsewhere that this isn't
>>>>> *technically speaking* undefined, it's "implementation
>>>>> defined".
>>>>> But as
>>>>> far as kernel interfaces goes, that's pretty much the same; we
>>>>> can't
>>>>> guarantee that the kernel and the user-space is using the same
>>>>> implementation.
>>>>>
>>>>> Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
>>>>> unsigned integers":
>>>>>
>>>>> """
>>>>> Otherwise, the new type is signed and the value cannot be
>>>>> represented
>>>>> in it; either the result is implementation-defined or an
>>>>> implementation-defined signal is raised
>>>>> """"
>>>>>
>>>>> I think a better approach be to use -1 << 31, which is well-
>>>>> defined.
>>>>> But the problem then becomes assigning it into
>>>>> drm_panthor_sync_op::flags in a well-defined way... Could we
>>>>> make
>>>>> the
>>>>> field signed? That seems a bit bad as well...
>>>>
>>>> Is that a problem? Signed->unsigned conversion is always well-
>>>> defined
>>>> (6.3.1.3 again), since it doesn't depend on how the signed type
>>>> represents negatives.
>>>>
>>>> Robin.
>>>
>>> Ah, you're right. So that could fix the problem, indeed.
>>
>> On the other hand, I hate the idea of having -1 << 31 to encode
>> bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when
>> we'll reach a value above 0x7, because then the negative value is
>> hard
>> to map to its unsigned representation. If we really care about this
>> corner case, I'd rather go full-defines for flags and call it a day.
>>
>
> Yeah, I suppose it can get ugly for some other cases.
>
> If we rule that out, I think there's only two options I can think of
> left:
>
> 1. Using #defines instead, like Boris suggested
> 2. Using 64 bit signed enums (e.g "1ll << 31" instead)
>
> Again, #2 here would be the smaller change. But I kinda think I lean
> towards #1, because... These aren't really enumerators. They are flags.>
> ...Yeah, sure. In C the practical difference isn't huge. But if we ever
> wanted to support using these enums from C++ code, we'd need to add
> overloaded operators, because C++ doesn't allow ORing together enums
> out of the box.

That's only true for enum classes. Plain'ol enums' values can be ORed at
will (but you will need to `static_cast` them back to the enum type,
admittedly, because they auto-"promote" to int for the arithmetic op).

I've had to use uAPI from C++ and the most painless approach, once you
finish writing it, is to wrap the whole uAPI in C++ constructs anyway.
So I wouldn't consider that angle, personally.

>
> I'm not saying I have any plans on using the uAPI from C++, just saying
> that if we're going to tackle this, we might as well tackle it
> completely...
>
> Also, expanding the enum-type to 64 bits might have some additional
> consequences, like needlessly needing more stack-space to pass values
> around etc.
>
> Thoughts? Surely there must be some precedence on using the top bit for
> flags in the kernel, no?

--
Mihail Atanassov <mihail.atanassov@arm.com>

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
diff mbox series

Patch

diff --git a/Documentation/gpu/driver-uapi.rst b/Documentation/gpu/driver-uapi.rst
index e5070a0e95ab..971cdb4816fc 100644
--- a/Documentation/gpu/driver-uapi.rst
+++ b/Documentation/gpu/driver-uapi.rst
@@ -18,6 +18,11 @@  VM_BIND / EXEC uAPI
 
 .. kernel-doc:: include/uapi/drm/nouveau_drm.h
 
+drm/panthor uAPI
+================
+
+.. kernel-doc:: include/uapi/drm/panthor_drm.h
+
 drm/xe uAPI
 ===========
 
diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
new file mode 100644
index 000000000000..373df80f41ed
--- /dev/null
+++ b/include/uapi/drm/panthor_drm.h
@@ -0,0 +1,945 @@ 
+/* SPDX-License-Identifier: MIT */
+/* Copyright (C) 2023 Collabora ltd. */
+#ifndef _PANTHOR_DRM_H_
+#define _PANTHOR_DRM_H_
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/**
+ * DOC: Introduction
+ *
+ * This documentation describes the Panthor IOCTLs.
+ *
+ * Just a few generic rules about the data passed to the Panthor IOCTLs:
+ *
+ * - Structures must be aligned on 64-bit/8-byte. If the object is not
+ *   naturally aligned, a padding field must be added.
+ * - Fields must be explicitly aligned to their natural type alignment with
+ *   pad[0..N] fields.
+ * - All padding fields will be checked by the driver to make sure they are
+ *   zeroed.
+ * - Flags can be added, but not removed/replaced.
+ * - New fields can be added to the main structures (the structures
+ *   directly passed to the ioctl). Those fields can be added at the end of
+ *   the structure, or replace existing padding fields. Any new field being
+ *   added must preserve the behavior that existed before those fields were
+ *   added when a value of zero is passed.
+ * - New fields can be added to indirect objects (objects pointed by the
+ *   main structure), iff those objects are passed a size to reflect the
+ *   size known by the userspace driver (see drm_panthor_obj_array::stride
+ *   or drm_panthor_dev_query::size).
+ * - If the kernel driver is too old to know some fields, those will be
+ *   ignored if zero, and otherwise rejected (and so will be zero on output).
+ * - If userspace is too old to know some fields, those will be zeroed
+ *   (input) before the structure is parsed by the kernel driver.
+ * - Each new flag/field addition must come with a driver version update so
+ *   the userspace driver doesn't have to trial and error to know which
+ *   flags are supported.
+ * - Structures should not contain unions, as this would defeat the
+ *   extensibility of such structures.
+ * - IOCTLs can't be removed or replaced. New IOCTL IDs should be placed
+ *   at the end of the drm_panthor_ioctl_id enum.
+ */
+
+/**
+ * DOC: MMIO regions exposed to userspace.
+ *
+ * .. c:macro:: DRM_PANTHOR_USER_MMIO_OFFSET
+ *
+ * File offset for all MMIO regions being exposed to userspace. Don't use
+ * this value directly, use DRM_PANTHOR_USER_<name>_OFFSET values instead.
+ * pgoffset passed to mmap2() is an unsigned long, which forces us to use a
+ * different offset on 32-bit and 64-bit systems.
+ *
+ * .. c:macro:: DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET
+ *
+ * File offset for the LATEST_FLUSH_ID register. The Userspace driver controls
+ * GPU cache flushing through CS instructions, but the flush reduction
+ * mechanism requires a flush_id. This flush_id could be queried with an
+ * ioctl, but Arm provides a well-isolated register page containing only this
+ * read-only register, so let's expose this page through a static mmap offset
+ * and allow direct mapping of this MMIO region so we can avoid the
+ * user <-> kernel round-trip.
+ */
+#define DRM_PANTHOR_USER_MMIO_OFFSET_32BIT	(1ull << 43)
+#define DRM_PANTHOR_USER_MMIO_OFFSET_64BIT	(1ull << 56)
+#define DRM_PANTHOR_USER_MMIO_OFFSET		(sizeof(unsigned long) < 8 ? \
+						 DRM_PANTHOR_USER_MMIO_OFFSET_32BIT : \
+						 DRM_PANTHOR_USER_MMIO_OFFSET_64BIT)
+#define DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET	(DRM_PANTHOR_USER_MMIO_OFFSET | 0)
+
+/**
+ * DOC: IOCTL IDs
+ *
+ * enum drm_panthor_ioctl_id - IOCTL IDs
+ *
+ * Place new ioctls at the end, don't re-order, don't replace or remove entries.
+ *
+ * These IDs are not meant to be used directly. Use the DRM_IOCTL_PANTHOR_xxx
+ * definitions instead.
+ */
+enum drm_panthor_ioctl_id {
+	/** @DRM_PANTHOR_DEV_QUERY: Query device information. */
+	DRM_PANTHOR_DEV_QUERY = 0,
+
+	/** @DRM_PANTHOR_VM_CREATE: Create a VM. */
+	DRM_PANTHOR_VM_CREATE,
+
+	/** @DRM_PANTHOR_VM_DESTROY: Destroy a VM. */
+	DRM_PANTHOR_VM_DESTROY,
+
+	/** @DRM_PANTHOR_VM_BIND: Bind/unbind memory to a VM. */
+	DRM_PANTHOR_VM_BIND,
+
+	/** @DRM_PANTHOR_VM_GET_STATE: Get VM state. */
+	DRM_PANTHOR_VM_GET_STATE,
+
+	/** @DRM_PANTHOR_BO_CREATE: Create a buffer object. */
+	DRM_PANTHOR_BO_CREATE,
+
+	/**
+	 * @DRM_PANTHOR_BO_MMAP_OFFSET: Get the file offset to pass to
+	 * mmap to map a GEM object.
+	 */
+	DRM_PANTHOR_BO_MMAP_OFFSET,
+
+	/** @DRM_PANTHOR_GROUP_CREATE: Create a scheduling group. */
+	DRM_PANTHOR_GROUP_CREATE,
+
+	/** @DRM_PANTHOR_GROUP_DESTROY: Destroy a scheduling group. */
+	DRM_PANTHOR_GROUP_DESTROY,
+
+	/**
+	 * @DRM_PANTHOR_GROUP_SUBMIT: Submit jobs to queues belonging
+	 * to a specific scheduling group.
+	 */
+	DRM_PANTHOR_GROUP_SUBMIT,
+
+	/** @DRM_PANTHOR_GROUP_GET_STATE: Get the state of a scheduling group. */
+	DRM_PANTHOR_GROUP_GET_STATE,
+
+	/** @DRM_PANTHOR_TILER_HEAP_CREATE: Create a tiler heap. */
+	DRM_PANTHOR_TILER_HEAP_CREATE,
+
+	/** @DRM_PANTHOR_TILER_HEAP_DESTROY: Destroy a tiler heap. */
+	DRM_PANTHOR_TILER_HEAP_DESTROY,
+};
+
+/**
+ * DRM_IOCTL_PANTHOR() - Build a Panthor IOCTL number
+ * @__access: Access type. Must be R, W or RW.
+ * @__id: One of the DRM_PANTHOR_xxx id.
+ * @__type: Suffix of the type being passed to the IOCTL.
+ *
+ * Don't use this macro directly, use the DRM_IOCTL_PANTHOR_xxx
+ * values instead.
+ *
+ * Return: An IOCTL number to be passed to ioctl() from userspace.
+ */
+#define DRM_IOCTL_PANTHOR(__access, __id, __type) \
+	DRM_IO ## __access(DRM_COMMAND_BASE + DRM_PANTHOR_ ## __id, \
+			   struct drm_panthor_ ## __type)
+
+#define DRM_IOCTL_PANTHOR_DEV_QUERY \
+	DRM_IOCTL_PANTHOR(WR, DEV_QUERY, dev_query)
+#define DRM_IOCTL_PANTHOR_VM_CREATE \
+	DRM_IOCTL_PANTHOR(WR, VM_CREATE, vm_create)
+#define DRM_IOCTL_PANTHOR_VM_DESTROY \
+	DRM_IOCTL_PANTHOR(WR, VM_DESTROY, vm_destroy)
+#define DRM_IOCTL_PANTHOR_VM_BIND \
+	DRM_IOCTL_PANTHOR(WR, VM_BIND, vm_bind)
+#define DRM_IOCTL_PANTHOR_VM_GET_STATE \
+	DRM_IOCTL_PANTHOR(WR, VM_GET_STATE, vm_get_state)
+#define DRM_IOCTL_PANTHOR_BO_CREATE \
+	DRM_IOCTL_PANTHOR(WR, BO_CREATE, bo_create)
+#define DRM_IOCTL_PANTHOR_BO_MMAP_OFFSET \
+	DRM_IOCTL_PANTHOR(WR, BO_MMAP_OFFSET, bo_mmap_offset)
+#define DRM_IOCTL_PANTHOR_GROUP_CREATE \
+	DRM_IOCTL_PANTHOR(WR, GROUP_CREATE, group_create)
+#define DRM_IOCTL_PANTHOR_GROUP_DESTROY \
+	DRM_IOCTL_PANTHOR(WR, GROUP_DESTROY, group_destroy)
+#define DRM_IOCTL_PANTHOR_GROUP_SUBMIT \
+	DRM_IOCTL_PANTHOR(WR, GROUP_SUBMIT, group_submit)
+#define DRM_IOCTL_PANTHOR_GROUP_GET_STATE \
+	DRM_IOCTL_PANTHOR(WR, GROUP_GET_STATE, group_get_state)
+#define DRM_IOCTL_PANTHOR_TILER_HEAP_CREATE \
+	DRM_IOCTL_PANTHOR(WR, TILER_HEAP_CREATE, tiler_heap_create)
+#define DRM_IOCTL_PANTHOR_TILER_HEAP_DESTROY \
+	DRM_IOCTL_PANTHOR(WR, TILER_HEAP_DESTROY, tiler_heap_destroy)
+
+/**
+ * DOC: IOCTL arguments
+ */
+
+/**
+ * struct drm_panthor_obj_array - Object array.
+ *
+ * This object is used to pass an array of objects whose size is subject to changes in
+ * future versions of the driver. In order to support this mutability, we pass a stride
+ * describing the size of the object as known by userspace.
+ *
+ * You shouldn't fill drm_panthor_obj_array fields directly. You should instead use
+ * the DRM_PANTHOR_OBJ_ARRAY() macro that takes care of initializing the stride to
+ * the object size.
+ */
+struct drm_panthor_obj_array {
+	/** @stride: Stride of object struct. Used for versioning. */
+	__u32 stride;
+
+	/** @count: Number of objects in the array. */
+	__u32 count;
+
+	/** @array: User pointer to an array of objects. */
+	__u64 array;
+};
+
+/**
+ * DRM_PANTHOR_OBJ_ARRAY() - Initialize a drm_panthor_obj_array field.
+ * @cnt: Number of elements in the array.
+ * @ptr: Pointer to the array to pass to the kernel.
+ *
+ * Macro initializing a drm_panthor_obj_array based on the object size as known
+ * by userspace.
+ */
+#define DRM_PANTHOR_OBJ_ARRAY(cnt, ptr) \
+	{ .stride = sizeof((ptr)[0]), .count = (cnt), .array = (__u64)(uintptr_t)(ptr) }
+
+/**
+ * enum drm_panthor_sync_op_flags - Synchronization operation flags.
+ */
+enum drm_panthor_sync_op_flags {
+	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: Synchronization handle type mask. */
+	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
+
+	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ: Synchronization object type. */
+	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
+
+	/**
+	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ: Timeline synchronization
+	 * object type.
+	 */
+	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
+
+	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
+	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
+
+	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
+	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),
+};
+
+/**
+ * struct drm_panthor_sync_op - Synchronization operation.
+ */
+struct drm_panthor_sync_op {
+	/** @flags: Synchronization operation flags. Combination of DRM_PANTHOR_SYNC_OP values. */
+	__u32 flags;
+
+	/** @handle: Sync handle. */
+	__u32 handle;
+
+	/**
+	 * @timeline_value: MBZ if
+	 * (flags & DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK) !=
+	 * DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ.
+	 */
+	__u64 timeline_value;
+};
+
+/**
+ * enum drm_panthor_dev_query_type - Query type
+ *
+ * Place new types at the end, don't re-order, don't remove or replace.
+ */
+enum drm_panthor_dev_query_type {
+	/** @DRM_PANTHOR_DEV_QUERY_GPU_INFO: Query GPU information. */
+	DRM_PANTHOR_DEV_QUERY_GPU_INFO = 0,
+
+	/** @DRM_PANTHOR_DEV_QUERY_CSIF_INFO: Query command-stream interface information. */
+	DRM_PANTHOR_DEV_QUERY_CSIF_INFO,
+};
+
+/**
+ * struct drm_panthor_gpu_info - GPU information
+ *
+ * Structure grouping all queryable information relating to the GPU.
+ */
+struct drm_panthor_gpu_info {
+	/** @gpu_id : GPU ID. */
+	__u32 gpu_id;
+#define DRM_PANTHOR_ARCH_MAJOR(x)		((x) >> 28)
+#define DRM_PANTHOR_ARCH_MINOR(x)		(((x) >> 24) & 0xf)
+#define DRM_PANTHOR_ARCH_REV(x)			(((x) >> 20) & 0xf)
+#define DRM_PANTHOR_PRODUCT_MAJOR(x)		(((x) >> 16) & 0xf)
+#define DRM_PANTHOR_VERSION_MAJOR(x)		(((x) >> 12) & 0xf)
+#define DRM_PANTHOR_VERSION_MINOR(x)		(((x) >> 4) & 0xff)
+#define DRM_PANTHOR_VERSION_STATUS(x)		((x) & 0xf)
+
+	/** @gpu_rev: GPU revision. */
+	__u32 gpu_rev;
+
+	/** @csf_id: Command stream frontend ID. */
+	__u32 csf_id;
+#define DRM_PANTHOR_CSHW_MAJOR(x)		(((x) >> 26) & 0x3f)
+#define DRM_PANTHOR_CSHW_MINOR(x)		(((x) >> 20) & 0x3f)
+#define DRM_PANTHOR_CSHW_REV(x)			(((x) >> 16) & 0xf)
+#define DRM_PANTHOR_MCU_MAJOR(x)		(((x) >> 10) & 0x3f)
+#define DRM_PANTHOR_MCU_MINOR(x)		(((x) >> 4) & 0x3f)
+#define DRM_PANTHOR_MCU_REV(x)			((x) & 0xf)
+
+	/** @l2_features: L2-cache features. */
+	__u32 l2_features;
+
+	/** @tiler_features: Tiler features. */
+	__u32 tiler_features;
+
+	/** @mem_features: Memory features. */
+	__u32 mem_features;
+
+	/** @mmu_features: MMU features. */
+	__u32 mmu_features;
+#define DRM_PANTHOR_MMU_VA_BITS(x)		((x) & 0xff)
+
+	/** @thread_features: Thread features. */
+	__u32 thread_features;
+
+	/** @max_threads: Maximum number of threads. */
+	__u32 max_threads;
+
+	/** @thread_max_workgroup_size: Maximum workgroup size. */
+	__u32 thread_max_workgroup_size;
+
+	/**
+	 * @thread_max_barrier_size: Maximum number of threads that can wait
+	 * simultaneously on a barrier.
+	 */
+	__u32 thread_max_barrier_size;
+
+	/** @coherency_features: Coherency features. */
+	__u32 coherency_features;
+
+	/** @texture_features: Texture features. */
+	__u32 texture_features[4];
+
+	/** @as_present: Bitmask encoding the number of address-space exposed by the MMU. */
+	__u32 as_present;
+
+	/** @shader_present: Bitmask encoding the shader cores exposed by the GPU. */
+	__u64 shader_present;
+
+	/** @l2_present: Bitmask encoding the L2 caches exposed by the GPU. */
+	__u64 l2_present;
+
+	/** @tiler_present: Bitmask encoding the tiler units exposed by the GPU. */
+	__u64 tiler_present;
+
+	/* @core_features: Used to discriminate core variants when they exist. */
+	__u32 core_features;
+
+	/* @pad: MBZ. */
+	__u32 pad;
+};
+
+/**
+ * struct drm_panthor_csif_info - Command stream interface information
+ *
+ * Structure grouping all queryable information relating to the command stream interface.
+ */
+struct drm_panthor_csif_info {
+	/** @csg_slot_count: Number of command stream group slots exposed by the firmware. */
+	__u32 csg_slot_count;
+
+	/** @cs_slot_count: Number of command stream slots per group. */
+	__u32 cs_slot_count;
+
+	/** @cs_reg_count: Number of command stream registers. */
+	__u32 cs_reg_count;
+
+	/** @scoreboard_slot_count: Number of scoreboard slots. */
+	__u32 scoreboard_slot_count;
+
+	/**
+	 * @unpreserved_cs_reg_count: Number of command stream registers reserved by
+	 * the kernel driver to call a userspace command stream.
+	 *
+	 * All registers can be used by a userspace command stream, but the
+	 * [cs_slot_count - unpreserved_cs_reg_count .. cs_slot_count] registers are
+	 * used by the kernel when DRM_PANTHOR_IOCTL_GROUP_SUBMIT is called.
+	 */
+	__u32 unpreserved_cs_reg_count;
+
+	/**
+	 * @pad: Padding field, set to zero.
+	 */
+	__u32 pad;
+};
+
+/**
+ * struct drm_panthor_dev_query - Arguments passed to DRM_PANTHOR_IOCTL_DEV_QUERY
+ */
+struct drm_panthor_dev_query {
+	/** @type: the query type (see drm_panthor_dev_query_type). */
+	__u32 type;
+
+	/**
+	 * @size: size of the type being queried.
+	 *
+	 * If pointer is NULL, size is updated by the driver to provide the
+	 * output structure size. If pointer is not NULL, the driver will
+	 * only copy min(size, actual_structure_size) bytes to the pointer,
+	 * and update the size accordingly. This allows us to extend query
+	 * types without breaking userspace.
+	 */
+	__u32 size;
+
+	/**
+	 * @pointer: user pointer to a query type struct.
+	 *
+	 * Pointer can be NULL, in which case, nothing is copied, but the
+	 * actual structure size is returned. If not NULL, it must point to
+	 * a location that's large enough to hold size bytes.
+	 */
+	__u64 pointer;
+};
+
+/**
+ * struct drm_panthor_vm_create - Arguments passed to DRM_PANTHOR_IOCTL_VM_CREATE
+ */
+struct drm_panthor_vm_create {
+	/** @flags: VM flags, MBZ. */
+	__u32 flags;
+
+	/** @id: Returned VM ID. */
+	__u32 id;
+
+	/**
+	 * @user_va_range: Size of the VA space reserved for user objects.
+	 *
+	 * The kernel will pick the remaining space to map kernel-only objects to the
+	 * VM (heap chunks, heap context, ring buffers, kernel synchronization objects,
+	 * ...). If the space left for kernel objects is too small, kernel object
+	 * allocation will fail further down the road. One can use
+	 * drm_panthor_gpu_info::mmu_features to extract the total virtual address
+	 * range, and chose a user_va_range that leaves some space to the kernel.
+	 *
+	 * If user_va_range is zero, the kernel will pick a sensible value based on
+	 * TASK_SIZE and the virtual range supported by the GPU MMU (the kernel/user
+	 * split should leave enough VA space for userspace processes to support SVM,
+	 * while still allowing the kernel to map some amount of kernel objects in
+	 * the kernel VA range). The value chosen by the driver will be returned in
+	 * @user_va_range.
+	 *
+	 * User VA space always starts at 0x0, kernel VA space is always placed after
+	 * the user VA range.
+	 */
+	__u64 user_va_range;
+};
+
+/**
+ * struct drm_panthor_vm_destroy - Arguments passed to DRM_PANTHOR_IOCTL_VM_DESTROY
+ */
+struct drm_panthor_vm_destroy {
+	/** @id: ID of the VM to destroy. */
+	__u32 id;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+};
+
+/**
+ * enum drm_panthor_vm_bind_op_flags - VM bind operation flags
+ */
+enum drm_panthor_vm_bind_op_flags {
+	/**
+	 * @DRM_PANTHOR_VM_BIND_OP_MAP_READONLY: Map the memory read-only.
+	 *
+	 * Only valid with DRM_PANTHOR_VM_BIND_OP_TYPE_MAP.
+	 */
+	DRM_PANTHOR_VM_BIND_OP_MAP_READONLY = 1 << 0,
+
+	/**
+	 * @DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC: Map the memory not-executable.
+	 *
+	 * Only valid with DRM_PANTHOR_VM_BIND_OP_TYPE_MAP.
+	 */
+	DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC = 1 << 1,
+
+	/**
+	 * @DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED: Map the memory uncached.
+	 *
+	 * Only valid with DRM_PANTHOR_VM_BIND_OP_TYPE_MAP.
+	 */
+	DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED = 1 << 2,
+
+	/**
+	 * @DRM_PANTHOR_VM_BIND_OP_TYPE_MASK: Mask used to determine the type of operation.
+	 */
+	DRM_PANTHOR_VM_BIND_OP_TYPE_MASK = (int)(0xfu << 28),
+
+	/** @DRM_PANTHOR_VM_BIND_OP_TYPE_MAP: Map operation. */
+	DRM_PANTHOR_VM_BIND_OP_TYPE_MAP = 0 << 28,
+
+	/** @DRM_PANTHOR_VM_BIND_OP_TYPE_UNMAP: Unmap operation. */
+	DRM_PANTHOR_VM_BIND_OP_TYPE_UNMAP = 1 << 28,
+
+	/**
+	 * @DRM_PANTHOR_VM_BIND_OP_TYPE_SYNC_ONLY: No VM operation.
+	 *
+	 * Just serves as a synchronization point on a VM queue.
+	 *
+	 * Only valid if %DRM_PANTHOR_VM_BIND_ASYNC is set in drm_panthor_vm_bind::flags,
+	 * and drm_panthor_vm_bind_op::syncs contains at least one element.
+	 */
+	DRM_PANTHOR_VM_BIND_OP_TYPE_SYNC_ONLY = 2 << 28,
+};
+
+/**
+ * struct drm_panthor_vm_bind_op - VM bind operation
+ */
+struct drm_panthor_vm_bind_op {
+	/** @flags: Combination of drm_panthor_vm_bind_op_flags flags. */
+	__u32 flags;
+
+	/**
+	 * @bo_handle: Handle of the buffer object to map.
+	 * MBZ for unmap or sync-only operations.
+	 */
+	__u32 bo_handle;
+
+	/**
+	 * @bo_offset: Buffer object offset.
+	 * MBZ for unmap or sync-only operations.
+	 */
+	__u64 bo_offset;
+
+	/**
+	 * @va: Virtual address to map/unmap.
+	 * MBZ for sync-only operations.
+	 */
+	__u64 va;
+
+	/**
+	 * @size: Size to map/unmap.
+	 * MBZ for sync-only operations.
+	 */
+	__u64 size;
+
+	/**
+	 * @syncs: Array of struct drm_panthor_sync_op synchronization
+	 * operations.
+	 *
+	 * This array must be empty if %DRM_PANTHOR_VM_BIND_ASYNC is not set on
+	 * the drm_panthor_vm_bind object containing this VM bind operation.
+	 *
+	 * This array shall not be empty for sync-only operations.
+	 */
+	struct drm_panthor_obj_array syncs;
+
+};
+
+/**
+ * enum drm_panthor_vm_bind_flags - VM bind flags
+ */
+enum drm_panthor_vm_bind_flags {
+	/**
+	 * @DRM_PANTHOR_VM_BIND_ASYNC: VM bind operations are queued to the VM
+	 * queue instead of being executed synchronously.
+	 */
+	DRM_PANTHOR_VM_BIND_ASYNC = 1 << 0,
+};
+
+/**
+ * struct drm_panthor_vm_bind - Arguments passed to DRM_IOCTL_PANTHOR_VM_BIND
+ */
+struct drm_panthor_vm_bind {
+	/** @vm_id: VM targeted by the bind request. */
+	__u32 vm_id;
+
+	/** @flags: Combination of drm_panthor_vm_bind_flags flags. */
+	__u32 flags;
+
+	/** @ops: Array of struct drm_panthor_vm_bind_op bind operations. */
+	struct drm_panthor_obj_array ops;
+};
+
+/**
+ * enum drm_panthor_vm_state - VM states.
+ */
+enum drm_panthor_vm_state {
+	/**
+	 * @DRM_PANTHOR_VM_STATE_USABLE: VM is usable.
+	 *
+	 * New VM operations will be accepted on this VM.
+	 */
+	DRM_PANTHOR_VM_STATE_USABLE,
+
+	/**
+	 * @DRM_PANTHOR_VM_STATE_UNUSABLE: VM is unusable.
+	 *
+	 * Something put the VM in an unusable state (like an asynchronous
+	 * VM_BIND request failing for any reason).
+	 *
+	 * Once the VM is in this state, all new MAP operations will be
+	 * rejected, and any GPU job targeting this VM will fail.
+	 * UNMAP operations are still accepted.
+	 *
+	 * The only way to recover from an unusable VM is to create a new
+	 * VM, and destroy the old one.
+	 */
+	DRM_PANTHOR_VM_STATE_UNUSABLE,
+};
+
+/**
+ * struct drm_panthor_vm_get_state - Get VM state.
+ */
+struct drm_panthor_vm_get_state {
+	/** @vm_id: VM targeted by the get_state request. */
+	__u32 vm_id;
+
+	/**
+	 * @state: state returned by the driver.
+	 *
+	 * Must be one of the enum drm_panthor_vm_state values.
+	 */
+	__u32 state;
+};
+
+/**
+ * enum drm_panthor_bo_flags - Buffer object flags, passed at creation time.
+ */
+enum drm_panthor_bo_flags {
+	/** @DRM_PANTHOR_BO_NO_MMAP: The buffer object will never be CPU-mapped in userspace. */
+	DRM_PANTHOR_BO_NO_MMAP = (1 << 0),
+};
+
+/**
+ * struct drm_panthor_bo_create - Arguments passed to DRM_IOCTL_PANTHOR_BO_CREATE.
+ */
+struct drm_panthor_bo_create {
+	/**
+	 * @size: Requested size for the object
+	 *
+	 * The (page-aligned) allocated size for the object will be returned.
+	 */
+	__u64 size;
+
+	/**
+	 * @flags: Flags. Must be a combination of drm_panthor_bo_flags flags.
+	 */
+	__u32 flags;
+
+	/**
+	 * @exclusive_vm_id: Exclusive VM this buffer object will be mapped to.
+	 *
+	 * If not zero, the field must refer to a valid VM ID, and implies that:
+	 *  - the buffer object will only ever be bound to that VM
+	 *  - cannot be exported as a PRIME fd
+	 */
+	__u32 exclusive_vm_id;
+
+	/**
+	 * @handle: Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+};
+
+/**
+ * struct drm_panthor_bo_mmap_offset - Arguments passed to DRM_IOCTL_PANTHOR_BO_MMAP_OFFSET.
+ */
+struct drm_panthor_bo_mmap_offset {
+	/** @handle: Handle of the object we want an mmap offset for. */
+	__u32 handle;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+
+	/** @offset: The fake offset to use for subsequent mmap calls. */
+	__u64 offset;
+};
+
+/**
+ * struct drm_panthor_queue_create - Queue creation arguments.
+ */
+struct drm_panthor_queue_create {
+	/**
+	 * @priority: Defines the priority of queues inside a group. Goes from 0 to 15,
+	 * 15 being the highest priority.
+	 */
+	__u8 priority;
+
+	/** @pad: Padding fields, MBZ. */
+	__u8 pad[3];
+
+	/** @ringbuf_size: Size of the ring buffer to allocate to this queue. */
+	__u32 ringbuf_size;
+};
+
+/**
+ * enum drm_panthor_group_priority - Scheduling group priority
+ */
+enum drm_panthor_group_priority {
+	/** @PANTHOR_GROUP_PRIORITY_LOW: Low priority group. */
+	PANTHOR_GROUP_PRIORITY_LOW = 0,
+
+	/** @PANTHOR_GROUP_PRIORITY_MEDIUM: Medium priority group. */
+	PANTHOR_GROUP_PRIORITY_MEDIUM,
+
+	/** @PANTHOR_GROUP_PRIORITY_HIGH: High priority group. */
+	PANTHOR_GROUP_PRIORITY_HIGH,
+};
+
+/**
+ * struct drm_panthor_group_create - Arguments passed to DRM_IOCTL_PANTHOR_GROUP_CREATE
+ */
+struct drm_panthor_group_create {
+	/** @queues: Array of drm_panthor_queue_create elements. */
+	struct drm_panthor_obj_array queues;
+
+	/**
+	 * @max_compute_cores: Maximum number of cores that can be used by compute
+	 * jobs across CS queues bound to this group.
+	 *
+	 * Must be less or equal to the number of bits set in @compute_core_mask.
+	 */
+	__u8 max_compute_cores;
+
+	/**
+	 * @max_fragment_cores: Maximum number of cores that can be used by fragment
+	 * jobs across CS queues bound to this group.
+	 *
+	 * Must be less or equal to the number of bits set in @fragment_core_mask.
+	 */
+	__u8 max_fragment_cores;
+
+	/**
+	 * @max_tiler_cores: Maximum number of tilers that can be used by tiler jobs
+	 * across CS queues bound to this group.
+	 *
+	 * Must be less or equal to the number of bits set in @tiler_core_mask.
+	 */
+	__u8 max_tiler_cores;
+
+	/** @priority: Group priority (see enum drm_panthor_group_priority). */
+	__u8 priority;
+
+	/** @pad: Padding field, MBZ. */
+	__u32 pad;
+
+	/**
+	 * @compute_core_mask: Mask encoding cores that can be used for compute jobs.
+	 *
+	 * This field must have at least @max_compute_cores bits set.
+	 *
+	 * The bits set here should also be set in drm_panthor_gpu_info::shader_present.
+	 */
+	__u64 compute_core_mask;
+
+	/**
+	 * @fragment_core_mask: Mask encoding cores that can be used for fragment jobs.
+	 *
+	 * This field must have at least @max_fragment_cores bits set.
+	 *
+	 * The bits set here should also be set in drm_panthor_gpu_info::shader_present.
+	 */
+	__u64 fragment_core_mask;
+
+	/**
+	 * @tiler_core_mask: Mask encoding cores that can be used for tiler jobs.
+	 *
+	 * This field must have at least @max_tiler_cores bits set.
+	 *
+	 * The bits set here should also be set in drm_panthor_gpu_info::tiler_present.
+	 */
+	__u64 tiler_core_mask;
+
+	/**
+	 * @vm_id: VM ID to bind this group to.
+	 *
+	 * All submission to queues bound to this group will use this VM.
+	 */
+	__u32 vm_id;
+
+	/**
+	 * @group_handle: Returned group handle. Passed back when submitting jobs or
+	 * destroying a group.
+	 */
+	__u32 group_handle;
+};
+
+/**
+ * struct drm_panthor_group_destroy - Arguments passed to DRM_IOCTL_PANTHOR_GROUP_DESTROY
+ */
+struct drm_panthor_group_destroy {
+	/** @group_handle: Group to destroy */
+	__u32 group_handle;
+
+	/** @pad: Padding field, MBZ. */
+	__u32 pad;
+};
+
+/**
+ * struct drm_panthor_queue_submit - Job submission arguments.
+ *
+ * This is describing the userspace command stream to call from the kernel
+ * command stream ring-buffer. Queue submission is always part of a group
+ * submission, taking one or more jobs to submit to the underlying queues.
+ */
+struct drm_panthor_queue_submit {
+	/** @queue_index: Index of the queue inside a group. */
+	__u32 queue_index;
+
+	/**
+	 * @stream_size: Size of the command stream to execute.
+	 *
+	 * Must be 64-bit/8-byte aligned (the size of a CS instruction)
+	 *
+	 * Can be zero if stream_addr is zero too.
+	 */
+	__u32 stream_size;
+
+	/**
+	 * @stream_addr: GPU address of the command stream to execute.
+	 *
+	 * Must be aligned on 64-byte.
+	 *
+	 * Can be zero is stream_size is zero too.
+	 */
+	__u64 stream_addr;
+
+	/**
+	 * @latest_flush: FLUSH_ID read at the time the stream was built.
+	 *
+	 * This allows cache flush elimination for the automatic
+	 * flush+invalidate(all) done at submission time, which is needed to
+	 * ensure the GPU doesn't get garbage when reading the indirect command
+	 * stream buffers. If you want the cache flush to happen
+	 * unconditionally, pass a zero here.
+	 */
+	__u32 latest_flush;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+
+	/** @syncs: Array of struct drm_panthor_sync_op sync operations. */
+	struct drm_panthor_obj_array syncs;
+};
+
+/**
+ * struct drm_panthor_group_submit - Arguments passed to DRM_IOCTL_PANTHOR_GROUP_SUBMIT
+ */
+struct drm_panthor_group_submit {
+	/** @group_handle: Handle of the group to queue jobs to. */
+	__u32 group_handle;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+
+	/** @queue_submits: Array of drm_panthor_queue_submit objects. */
+	struct drm_panthor_obj_array queue_submits;
+};
+
+/**
+ * enum drm_panthor_group_state_flags - Group state flags
+ */
+enum drm_panthor_group_state_flags {
+	/**
+	 * @DRM_PANTHOR_GROUP_STATE_TIMEDOUT: Group had unfinished jobs.
+	 *
+	 * When a group ends up with this flag set, no jobs can be submitted to its queues.
+	 */
+	DRM_PANTHOR_GROUP_STATE_TIMEDOUT = 1 << 0,
+
+	/**
+	 * @DRM_PANTHOR_GROUP_STATE_FATAL_FAULT: Group had fatal faults.
+	 *
+	 * When a group ends up with this flag set, no jobs can be submitted to its queues.
+	 */
+	DRM_PANTHOR_GROUP_STATE_FATAL_FAULT = 1 << 1,
+};
+
+/**
+ * struct drm_panthor_group_get_state - Arguments passed to DRM_IOCTL_PANTHOR_GROUP_GET_STATE
+ *
+ * Used to query the state of a group and decide whether a new group should be created to
+ * replace it.
+ */
+struct drm_panthor_group_get_state {
+	/** @group_handle: Handle of the group to query state on */
+	__u32 group_handle;
+
+	/**
+	 * @state: Combination of DRM_PANTHOR_GROUP_STATE_* flags encoding the
+	 * group state.
+	 */
+	__u32 state;
+
+	/** @fatal_queues: Bitmask of queues that faced fatal faults. */
+	__u32 fatal_queues;
+
+	/** @pad: MBZ */
+	__u32 pad;
+};
+
+/**
+ * struct drm_panthor_tiler_heap_create - Arguments passed to DRM_IOCTL_PANTHOR_TILER_HEAP_CREATE
+ */
+struct drm_panthor_tiler_heap_create {
+	/** @vm_id: VM ID the tiler heap should be mapped to */
+	__u32 vm_id;
+
+	/** @initial_chunk_count: Initial number of chunks to allocate. */
+	__u32 initial_chunk_count;
+
+	/** @chunk_size: Chunk size. Must be a power of two at least 256KB large. */
+	__u32 chunk_size;
+
+	/** @max_chunks: Maximum number of chunks that can be allocated. */
+	__u32 max_chunks;
+
+	/**
+	 * @target_in_flight: Maximum number of in-flight render passes.
+	 *
+	 * If the heap has more than tiler jobs in-flight, the FW will wait for render
+	 * passes to finish before queuing new tiler jobs.
+	 */
+	__u32 target_in_flight;
+
+	/** @handle: Returned heap handle. Passed back to DESTROY_TILER_HEAP. */
+	__u32 handle;
+
+	/** @tiler_heap_ctx_gpu_va: Returned heap GPU virtual address returned */
+	__u64 tiler_heap_ctx_gpu_va;
+
+	/**
+	 * @first_heap_chunk_gpu_va: First heap chunk.
+	 *
+	 * The tiler heap is formed of heap chunks forming a single-link list. This
+	 * is the first element in the list.
+	 */
+	__u64 first_heap_chunk_gpu_va;
+};
+
+/**
+ * struct drm_panthor_tiler_heap_destroy - Arguments passed to DRM_IOCTL_PANTHOR_TILER_HEAP_DESTROY
+ */
+struct drm_panthor_tiler_heap_destroy {
+	/** @handle: Handle of the tiler heap to destroy */
+	__u32 handle;
+
+	/** @pad: Padding field, MBZ. */
+	__u32 pad;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _PANTHOR_DRM_H_ */