mbox series

[RFC,00/33] Add Support for Plane Color Pipeline

Message ID 20230829160422.1251087-1-uma.shankar@intel.com (mailing list archive)
Headers show
Series Add Support for Plane Color Pipeline | expand

Message

Shankar, Uma Aug. 29, 2023, 4:03 p.m. UTC
Introduction
============

Modern hardwares have various color processing capabilities both
at pre-blending and post-blending phases in the color pipeline.
The current drm implementation exposes only the post-blending
color hardware blocks. Support for pre-blending hardware is missing.
There are multiple use cases where pre-blending color hardware will
be useful:
	a) Linearization of input buffers encoded in various transfer
	   functions.
	b) Color Space conversion
	c) Tone mapping
	d) Frame buffer format conversion
	e) Non-linearization of buffer(apply transfer function)
	f) 3D Luts
	
and other miscellaneous color operations.

Hence, there is a need to expose the color capabilities of the hardware
to user-space. This will help userspace/middleware to use display
hardware for color processing and blending instead of doing it through
GPU shaders.


Work done so far and relevant references
========================================

Some implementation is done by Intel and AMD/Igalia to address the same.
Broad consensus is there that we need a generic API at drm core to suffice
the use case of various HW vendors. Below are the links capturing the
discussion so far.

Intel's Plane Color Implementation: https://patchwork.freedesktop.org/series/90825/
AMD's Plane Color Implementation: https://patchwork.freedesktop.org/series/116862/


Hackfest conclusions
====================

HDR/Color Hackfest was organised by Redhat to bring all the industry stakeholders
together and converge on a common uapi expectations. Participants from Intel, AMD,
Nvidia, Collabora, Redhat, Igalia and other prominent user-space developers and
maintainers.

Discussions happened on the uapi expectations, opens, nature of hardware of multiple
hardware vendors, challenges in generalizing the same and the path forward. Consensus
was made that drm core should implement descriptive APIs and not go with prescriptive
APIs. DRM core should just expose the hardware capabilities; enabling, customizing and
programming the same should be done by the user-space. Driver should just honor
the user space request without doing any operations internally.

Thanks to Simon Ser, for nicely documenting the design consensus and an UAPI RFC
which can be referred to here:

https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze_hD5nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1QWn488=@emersion.fr/


Design considerations
=====================

Following are the important aspects taken into account while designing the current RFC
proposal:

	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks only one can be used)
	2. Position of the HW block in the pipeline can be programmable
	3. LUTs can be one dimentional or three dimentional
	4. Number of LUT entries can vary across platforms
	5. Precision of LUT entries can vary across platforms
	6. Distribution of LUT entries may vary. e.g Mutli-segmented, Logarithmic,
	   Piece-Wise Linear(PWL) etc
	7. There can be parameterized/non-parameterized fixed function HW blocks.
	   e.g. Just a hardware bit, to convert from one color space to another.
	8. Custom non-standard HW implementation.
	9. Leaving scope for some vendor defined pescriptive implementation if required.
	10.Scope to handle any modification in hardware as technology evolves

The current proposal takes into account the above considerations while keeping the implementation
as generic as possible leaving scope for future additions or modifications.
 
This proposal is also in line to the details mentioned by Simon's RFC covering all
the aspects discussed in hackfest.


Outline of the implementation
============================

Each Color Hardware block will be represented by a data structure drm_color_op.
These color operations will form the building blocks of a color pipeline which
best represents the underlying Hardware. Color operations can be re-arranged,
substracted or added to create distinct color pipelines to accurately describe
the Hardware blocks present in the display engine.

In this proposal, a color pipeline is represented as an array of
struct drm_color_op. For individual color operation, we add blobs to advertise
the capability of the particular Hardware block.

This color pipeline is then packaged within a blob for the user space to
retrieve it.

To advertise the available color pipelines, an immutable ENUM property
"GET_COLOR_PIPELINE" is introduced. This enum property has blob id's as values.
With each blob id representing a distinct color pipeline based on underlying HW
capabilities and their respective combinations.

Once the user space decides on a color pipeline, it can set the pipeline and
the corresponding data for the hardware blocks within the pipeline with
the BLOB property "SET_COLOR_PIPELINE".

Refer to Documentation/gpu/rfc/plane_color_pipeline.rst added in the patch

IGT and test details
====================

A set of IGT tests is written to demonstrate the usage of the proposed UAPIs
along with some sanity validation.

Details of the IGT test can be found here:
https://patchwork.freedesktop.org/series/123018/

Opens
=====

a. To come up with a list of common HW blocks which can be defined generically by the DRM
   core in agreement with all the stakeholders
b. To enhance/finalize the data structure to define segmented LUTs generically.


Out of Scope
============

a. The coefficients for CTM and LUT value calculations are out of scope of the proposal.
b. The onus of programming the HW blocks and their values is on user-space. Driver will
   just provide the interface for the same.
c. In order to compute LUT values and coefficients, a helper library can be created in
   user-space. However, it is out of scope for the current proposal.

Acknowledgements and credits
============================

There are multiple contributors who have helped us to reach to this proposal. Special mention
to Ville Syrjala<ville.syrjala@linux.intel.com>, Pekka Paalanen<pekka.paalanen@collabora.com>,
Simon Ser<contact@emersion.fr>, Harry Wentland<harry.wentland@amd.com>,
Melissa Wen<mwen@igalia.com>, Jonas<jadahl@redhat.com>, Sebastian Wick<sebastian.wick@redhat.com>,
Bhanu<bhanuprakash.modem@intel.com> and Shashank<shashank.sharma@amd.com>.

Also, thanks to Carlos <csoriano@redhat.com> and the Redhat team for organizing the HDR hackfest.


UAPI dependency and Usermode development
========================================

The current KMS implementation requires a user space consumer for it to be accepted upstream.
Work is ongoing in weston and mutter community to get color management and HDR support implemented
in the respective stacks.

=================================================================================

We have tried to take care of all the scenarios and use-cases which possibly could exists in the
current proposal. Thanks to everyone who has contributed in all color management discussions so
far. Let's work together to improve the current proposal and get this thing implemented in
upstream linux. All the feedback and suggestions to enhance the design are welcome.

Regards,
Uma Shankar
Chaitanya Kumar Borah

Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Melissa Wen <mwen@igalia.com>
Cc: Jonas Ådahl <jadahl@redhat.com>
Cc: Sebastian Wick <sebastian.wick@redhat.com> 
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Alexander Goins <agoins@nvidia.com>

Chaitanya Kumar Borah (14):
  drm: Add color operation structure
  drm: Add plane get color pipeline property
  drm: Add helper to add color pipeline
  drm: Manage color blob states
  drm: Replace individual color blobs
  drm: Reset pipeline when user sends NULL blob
  drm: Reset plane color state on pipeline switch request
  drm/i915/color: Add HDR plane LUT range data to color pipeline
  drm/i915/color: Add SDR plane LUT range data to color pipeline
  drm/i915/color: Add color pipelines to plane
  drm/i915/color: Create and attach set color pipeline property
  drm/i915/color: Enable plane color features
  drm/i915/color: Add a dummy pipeline with 3D LUT
  drm/i915/color: Add example implementation for vendor specific color
    operation

Uma Shankar (19):
  drm/doc/rfc: Add RFC document for proposed Plane Color Pipeline
  drm: Add structures for setting color pipeline
  drm: Add set colorpipeline property
  drm: Add Enhanced Gamma LUT precision structure
  drm: Add color lut range structure
  drm: Add color information to plane state
  drm/i915/color: Add lut range for SDR planes
  drm/i915/color: Add lut range for HDR planes
  drm/i915/color: Add color pipeline for HDR planes
  drm/i915/color: Add color pipeline for SDR planes
  drm/i915/color: Add plane color callbacks
  drm/i915/color: Load plane color luts from atomic flip
  drm/i915/xelpd: Add plane color check to glk_plane_color_ctl
  drm/i915/xelpd: Add register definitions for Plane Degamma
  drm/i915/color: Add color functions for ADL
  drm/i915/color: Program Plane Pre-CSC Registers
  drm/i915/xelpd: Add register definitions for Plane Post CSC
  drm/i915/xelpd: Program Plane Post CSC Registers
  drm/i915/color: Enable Plane CSC

 .../gpu/rfc/plane_color_pipeline.rst          | 394 ++++++++++
 drivers/gpu/drm/drm_atomic_state_helper.c     |  21 +
 drivers/gpu/drm/drm_atomic_uapi.c             | 218 ++++++
 drivers/gpu/drm/drm_color_mgmt.c              | 130 ++++
 drivers/gpu/drm/i915/display/intel_color.c    | 684 +++++++++++++++++-
 drivers/gpu/drm/i915/display/intel_color.h    |   7 +-
 .../drm/i915/display/skl_universal_plane.c    |  21 +-
 drivers/gpu/drm/i915/i915_reg.h               | 124 ++++
 include/drm/drm_plane.h                       |  82 +++
 include/uapi/drm/drm_mode.h                   | 196 +++++
 include/uapi/drm/i915_drm.h                   |  25 +
 11 files changed, 1899 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/rfc/plane_color_pipeline.rst

Comments

Harry Wentland Aug. 29, 2023, 7:26 p.m. UTC | #1
+CC Naseer and Chris, FYI

See https://patchwork.freedesktop.org/series/123024/ for whole series.

On 2023-08-29 12:03, Uma Shankar wrote:
> Introduction
> ============
> 
> Modern hardwares have various color processing capabilities both
> at pre-blending and post-blending phases in the color pipeline.
> The current drm implementation exposes only the post-blending
> color hardware blocks. Support for pre-blending hardware is missing.
> There are multiple use cases where pre-blending color hardware will
> be useful:
> 	a) Linearization of input buffers encoded in various transfer
> 	   functions.
> 	b) Color Space conversion
> 	c) Tone mapping
> 	d) Frame buffer format conversion
> 	e) Non-linearization of buffer(apply transfer function)
> 	f) 3D Luts
> 	
> and other miscellaneous color operations.
> 
> Hence, there is a need to expose the color capabilities of the hardware
> to user-space. This will help userspace/middleware to use display
> hardware for color processing and blending instead of doing it through
> GPU shaders.
> 

Thanks, Uma, for sending this. I've been working on something similar
but you beat me to it. :)

> 
> Work done so far and relevant references
> ========================================
> 
> Some implementation is done by Intel and AMD/Igalia to address the same.
> Broad consensus is there that we need a generic API at drm core to suffice
> the use case of various HW vendors. Below are the links capturing the
> discussion so far.
> 
> Intel's Plane Color Implementation: https://patchwork.freedesktop.org/series/90825/
> AMD's Plane Color Implementation: https://patchwork.freedesktop.org/series/116862/
> 
> 
> Hackfest conclusions
> ====================
> 
> HDR/Color Hackfest was organised by Redhat to bring all the industry stakeholders
> together and converge on a common uapi expectations. Participants from Intel, AMD,
> Nvidia, Collabora, Redhat, Igalia and other prominent user-space developers and
> maintainers.
> 
> Discussions happened on the uapi expectations, opens, nature of hardware of multiple
> hardware vendors, challenges in generalizing the same and the path forward. Consensus
> was made that drm core should implement descriptive APIs and not go with prescriptive
> APIs. DRM core should just expose the hardware capabilities; enabling, customizing and
> programming the same should be done by the user-space. Driver should just honor
> the user space request without doing any operations internally.
> 
> Thanks to Simon Ser, for nicely documenting the design consensus and an UAPI RFC
> which can be referred to here:
> 
> https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze_hD5nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1QWn488=@emersion.fr/
> 
> 
> Design considerations
> =====================
> 
> Following are the important aspects taken into account while designing the current RFC
> proposal:
> 
> 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks only one can be used)
> 	2. Position of the HW block in the pipeline can be programmable
> 	3. LUTs can be one dimentional or three dimentional
> 	4. Number of LUT entries can vary across platforms
> 	5. Precision of LUT entries can vary across platforms
> 	6. Distribution of LUT entries may vary. e.g Mutli-segmented, Logarithmic,
> 	   Piece-Wise Linear(PWL) etc
> 	7. There can be parameterized/non-parameterized fixed function HW blocks.
> 	   e.g. Just a hardware bit, to convert from one color space to another.
> 	8. Custom non-standard HW implementation.
> 	9. Leaving scope for some vendor defined pescriptive implementation if required.
> 	10.Scope to handle any modification in hardware as technology evolves
> 
> The current proposal takes into account the above considerations while keeping the implementation
> as generic as possible leaving scope for future additions or modifications.
>   
> This proposal is also in line to the details mentioned by Simon's RFC covering all
> the aspects discussed in hackfest.
> 
> 
> Outline of the implementation
> ============================
> 
> Each Color Hardware block will be represented by a data structure drm_color_op.
> These color operations will form the building blocks of a color pipeline which
> best represents the underlying Hardware. Color operations can be re-arranged,
> substracted or added to create distinct color pipelines to accurately describe
> the Hardware blocks present in the display engine.

Who is doing the arranging of color operations? IMO a driver should 
define one or more respective pipelines that can be selected by 
userspace. This seems to be what you're talking about after (I haven't 
reviewed the whole thing yet). Might be best to drop this sentence or to 
add clarifications in order to avoid confusion.

> 
> In this proposal, a color pipeline is represented as an array of
> struct drm_color_op. For individual color operation, we add blobs to advertise
> the capability of the particular Hardware block.
> 
> This color pipeline is then packaged within a blob for the user space to
> retrieve it.
> 

Would it be better to expose the drm_color_ops directly, instead of 
packing a array of drm_color_ops into a blob and then giving that to 
userspace.

> To advertise the available color pipelines, an immutable ENUM property
> "GET_COLOR_PIPELINE" is introduced. This enum property has blob id's as values.
> With each blob id representing a distinct color pipeline based on underlying HW
> capabilities and their respective combinations.
> 
> Once the user space decides on a color pipeline, it can set the pipeline and
> the corresponding data for the hardware blocks within the pipeline with
> the BLOB property "SET_COLOR_PIPELINE".
> 

When I discussed this at the hackfest with Simon he proposed a new DRM 
object, (I've called it a drm_colorop) to represent a color operation. 
Each drm_colorop has a "NEXT" pointer to another drm_colorop, or NULL if 
its the last in the pipeline. You can then have a mutable enum property 
on the plane to discover and select a color pipeline.

This seems a bit more transparent than a blob. You can see my changes 
(unfortunately very WIP, don't look too closely at individual patches) at
https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/diffs

libdrm changes:
https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diffs

IGT changes:
https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_requests/1/diffs

I'll take time to review your whole series and will see whether we can 
somehow keep the best parts of each.

Curious to hear other opinions on the blob vs new DRM object question as 
well.

> Refer to Documentation/gpu/rfc/plane_color_pipeline.rst added in the patch
> 
> IGT and test details
> ====================
> 
> A set of IGT tests is written to demonstrate the usage of the proposed UAPIs
> along with some sanity validation.
> 
> Details of the IGT test can be found here:
> https://patchwork.freedesktop.org/series/123018/
> 
> Opens
> =====
> 
> a. To come up with a list of common HW blocks which can be defined generically by the DRM
>     core in agreement with all the stakeholders
> b. To enhance/finalize the data structure to define segmented LUTs generically.
> 

It would be good to add some basic support in VKMS. My work has been 
based on VKMS. Once we kinda settle on an approach I'll look at exposing 
the AMD private properties from Melissa through the API.

> 
> Out of Scope
> ============
> 
> a. The coefficients for CTM and LUT value calculations are out of scope of the proposal.
> b. The onus of programming the HW blocks and their values is on user-space. Driver will
>     just provide the interface for the same.
> c. In order to compute LUT values and coefficients, a helper library can be created in
>     user-space. However, it is out of scope for the current proposal.
> 
> Acknowledgements and credits
> ============================
> 
> There are multiple contributors who have helped us to reach to this proposal. Special mention
> to Ville Syrjala<ville.syrjala@linux.intel.com>, Pekka Paalanen<pekka.paalanen@collabora.com>,
> Simon Ser<contact@emersion.fr>, Harry Wentland<harry.wentland@amd.com>,
> Melissa Wen<mwen@igalia.com>, Jonas<jadahl@redhat.com>, Sebastian Wick<sebastian.wick@redhat.com>,
> Bhanu<bhanuprakash.modem@intel.com> and Shashank<shashank.sharma@amd.com>.
> 
> Also, thanks to Carlos <csoriano@redhat.com> and the Redhat team for organizing the HDR hackfest.
> 
> 
> UAPI dependency and Usermode development
> ========================================
> 
> The current KMS implementation requires a user space consumer for it to be accepted upstream.
> Work is ongoing in weston and mutter community to get color management and HDR support implemented
> in the respective stacks.
> 

If we can get AMD properties encoded using a Color Pipeline API we can 
probably use gamescope as the userspace vehicle.

I'm reviewing this in sequence, so there's a chance I'm missing context. 
Please bear with me if some of my comments are answered later in the series.

Again, thanks for sending this.

Harry

> =================================================================================
> 
> We have tried to take care of all the scenarios and use-cases which possibly could exists in the
> current proposal. Thanks to everyone who has contributed in all color management discussions so
> far. Let's work together to improve the current proposal and get this thing implemented in
> upstream linux. All the feedback and suggestions to enhance the design are welcome.
> 
> Regards,
> Uma Shankar
> Chaitanya Kumar Borah
> 
> Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Melissa Wen <mwen@igalia.com>
> Cc: Jonas Ådahl <jadahl@redhat.com>
> Cc: Sebastian Wick <sebastian.wick@redhat.com>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Alexander Goins <agoins@nvidia.com>
> 
> Chaitanya Kumar Borah (14):
>    drm: Add color operation structure
>    drm: Add plane get color pipeline property
>    drm: Add helper to add color pipeline
>    drm: Manage color blob states
>    drm: Replace individual color blobs
>    drm: Reset pipeline when user sends NULL blob
>    drm: Reset plane color state on pipeline switch request
>    drm/i915/color: Add HDR plane LUT range data to color pipeline
>    drm/i915/color: Add SDR plane LUT range data to color pipeline
>    drm/i915/color: Add color pipelines to plane
>    drm/i915/color: Create and attach set color pipeline property
>    drm/i915/color: Enable plane color features
>    drm/i915/color: Add a dummy pipeline with 3D LUT
>    drm/i915/color: Add example implementation for vendor specific color
>      operation
> 
> Uma Shankar (19):
>    drm/doc/rfc: Add RFC document for proposed Plane Color Pipeline
>    drm: Add structures for setting color pipeline
>    drm: Add set colorpipeline property
>    drm: Add Enhanced Gamma LUT precision structure
>    drm: Add color lut range structure
>    drm: Add color information to plane state
>    drm/i915/color: Add lut range for SDR planes
>    drm/i915/color: Add lut range for HDR planes
>    drm/i915/color: Add color pipeline for HDR planes
>    drm/i915/color: Add color pipeline for SDR planes
>    drm/i915/color: Add plane color callbacks
>    drm/i915/color: Load plane color luts from atomic flip
>    drm/i915/xelpd: Add plane color check to glk_plane_color_ctl
>    drm/i915/xelpd: Add register definitions for Plane Degamma
>    drm/i915/color: Add color functions for ADL
>    drm/i915/color: Program Plane Pre-CSC Registers
>    drm/i915/xelpd: Add register definitions for Plane Post CSC
>    drm/i915/xelpd: Program Plane Post CSC Registers
>    drm/i915/color: Enable Plane CSC
> 
>   .../gpu/rfc/plane_color_pipeline.rst          | 394 ++++++++++
>   drivers/gpu/drm/drm_atomic_state_helper.c     |  21 +
>   drivers/gpu/drm/drm_atomic_uapi.c             | 218 ++++++
>   drivers/gpu/drm/drm_color_mgmt.c              | 130 ++++
>   drivers/gpu/drm/i915/display/intel_color.c    | 684 +++++++++++++++++-
>   drivers/gpu/drm/i915/display/intel_color.h    |   7 +-
>   .../drm/i915/display/skl_universal_plane.c    |  21 +-
>   drivers/gpu/drm/i915/i915_reg.h               | 124 ++++
>   include/drm/drm_plane.h                       |  82 +++
>   include/uapi/drm/drm_mode.h                   | 196 +++++
>   include/uapi/drm/i915_drm.h                   |  25 +
>   11 files changed, 1899 insertions(+), 3 deletions(-)
>   create mode 100644 Documentation/gpu/rfc/plane_color_pipeline.rst
>
Shankar, Uma Aug. 30, 2023, 8:47 a.m. UTC | #2
> -----Original Message-----
> From: Harry Wentland <harry.wentland@amd.com>
> Sent: Wednesday, August 30, 2023 12:56 AM
> To: Shankar, Uma <uma.shankar@intel.com>; intel-gfx@lists.freedesktop.org; dri-
> devel@lists.freedesktop.org
> Cc: wayland-devel@lists.freedesktop.org; Ville Syrjala
> <ville.syrjala@linux.intel.com>; Pekka Paalanen <pekka.paalanen@collabora.com>;
> Simon Ser <contact@emersion.fr>; Melissa Wen <mwen@igalia.com>; Jonas Ådahl
> <jadahl@redhat.com>; Sebastian Wick <sebastian.wick@redhat.com>; Shashank
> Sharma <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> <quic_cbraga@quicinc.com>
> Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> 
> +CC Naseer and Chris, FYI
> 
> See https://patchwork.freedesktop.org/series/123024/ for whole series.
> 
> On 2023-08-29 12:03, Uma Shankar wrote:
> > Introduction
> > ============
> >
> > Modern hardwares have various color processing capabilities both at
> > pre-blending and post-blending phases in the color pipeline.
> > The current drm implementation exposes only the post-blending color
> > hardware blocks. Support for pre-blending hardware is missing.
> > There are multiple use cases where pre-blending color hardware will be
> > useful:
> > 	a) Linearization of input buffers encoded in various transfer
> > 	   functions.
> > 	b) Color Space conversion
> > 	c) Tone mapping
> > 	d) Frame buffer format conversion
> > 	e) Non-linearization of buffer(apply transfer function)
> > 	f) 3D Luts
> >
> > and other miscellaneous color operations.
> >
> > Hence, there is a need to expose the color capabilities of the
> > hardware to user-space. This will help userspace/middleware to use
> > display hardware for color processing and blending instead of doing it
> > through GPU shaders.
> >
> 
> Thanks, Uma, for sending this. I've been working on something similar but you beat
> me to it. :)

Thanks Harry for the useful feedback and overall collaboration on this so far.

> >
> > Work done so far and relevant references
> > ========================================
> >
> > Some implementation is done by Intel and AMD/Igalia to address the same.
> > Broad consensus is there that we need a generic API at drm core to
> > suffice the use case of various HW vendors. Below are the links
> > capturing the discussion so far.
> >
> > Intel's Plane Color Implementation:
> > https://patchwork.freedesktop.org/series/90825/
> > AMD's Plane Color Implementation:
> > https://patchwork.freedesktop.org/series/116862/
> >
> >
> > Hackfest conclusions
> > ====================
> >
> > HDR/Color Hackfest was organised by Redhat to bring all the industry
> > stakeholders together and converge on a common uapi expectations.
> > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia and
> > other prominent user-space developers and maintainers.
> >
> > Discussions happened on the uapi expectations, opens, nature of
> > hardware of multiple hardware vendors, challenges in generalizing the
> > same and the path forward. Consensus was made that drm core should
> > implement descriptive APIs and not go with prescriptive APIs. DRM core
> > should just expose the hardware capabilities; enabling, customizing
> > and programming the same should be done by the user-space. Driver should just
> honor the user space request without doing any operations internally.
> >
> > Thanks to Simon Ser, for nicely documenting the design consensus and
> > an UAPI RFC which can be referred to here:
> >
> > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze_hD5
> >
> nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1Q
> Wn48
> > 8=@emersion.fr/
> >
> >
> > Design considerations
> > =====================
> >
> > Following are the important aspects taken into account while designing
> > the current RFC
> > proposal:
> >
> > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks only one
> can be used)
> > 	2. Position of the HW block in the pipeline can be programmable
> > 	3. LUTs can be one dimentional or three dimentional
> > 	4. Number of LUT entries can vary across platforms
> > 	5. Precision of LUT entries can vary across platforms
> > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented, Logarithmic,
> > 	   Piece-Wise Linear(PWL) etc
> > 	7. There can be parameterized/non-parameterized fixed function HW
> blocks.
> > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > 	8. Custom non-standard HW implementation.
> > 	9. Leaving scope for some vendor defined pescriptive implementation if
> required.
> > 	10.Scope to handle any modification in hardware as technology evolves
> >
> > The current proposal takes into account the above considerations while
> > keeping the implementation as generic as possible leaving scope for future
> additions or modifications.
> >
> > This proposal is also in line to the details mentioned by Simon's RFC
> > covering all the aspects discussed in hackfest.
> >
> >
> > Outline of the implementation
> > ============================
> >
> > Each Color Hardware block will be represented by a data structure drm_color_op.
> > These color operations will form the building blocks of a color
> > pipeline which best represents the underlying Hardware. Color
> > operations can be re-arranged, substracted or added to create distinct
> > color pipelines to accurately describe the Hardware blocks present in the display
> engine.
> 
> Who is doing the arranging of color operations? IMO a driver should define one or
> more respective pipelines that can be selected by userspace. This seems to be what
> you're talking about after (I haven't reviewed the whole thing yet). Might be best to
> drop this sentence or to add clarifications in order to avoid confusion.

Yes it's the driver who will set the pipeline based on the underlying hardware arrangement
and possible combinations. There can be multiple pipelines possible if hardware can be
muxed or order can be re-arranged (all viable combinations should be defined as a pipeline in driver).
Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.

> >
> > In this proposal, a color pipeline is represented as an array of
> > struct drm_color_op. For individual color operation, we add blobs to
> > advertise the capability of the particular Hardware block.
> >
> > This color pipeline is then packaged within a blob for the user space
> > to retrieve it.
> >
> 
> Would it be better to expose the drm_color_ops directly, instead of packing a array
> of drm_color_ops into a blob and then giving that to userspace.

Advantage we see in packing in blobs is that interface will be cleaner. There will be just
one GET_COLOR_PIPELINE property to invoke by user and then its just parsing the data.
This way the entire underlying hardware blocks with pipeline will be available to user.
For a particular hardware block in pipeline, user can get the relevant details from blob
representing that particular block. We have created IGT tests (links mentioned in cover-letter)
to demonstrate how it can be done. This is just to clarify the idea.

Also since info is passed via blobs we have the flexibility to even define segmented LUTs and PWL
hardware blocks. Also we have left scope for custom private hardware blocks as well which driver
can work with respective HAL and get that implemented.

We can even define prescriptive operations as a private entry and enable it if a certain driver and HAL
agree.

> > To advertise the available color pipelines, an immutable ENUM property
> > "GET_COLOR_PIPELINE" is introduced. This enum property has blob id's as values.
> > With each blob id representing a distinct color pipeline based on
> > underlying HW capabilities and their respective combinations.
> >
> > Once the user space decides on a color pipeline, it can set the
> > pipeline and the corresponding data for the hardware blocks within the
> > pipeline with the BLOB property "SET_COLOR_PIPELINE".
> >
> 
> When I discussed this at the hackfest with Simon he proposed a new DRM object,
> (I've called it a drm_colorop) to represent a color operation.
> Each drm_colorop has a "NEXT" pointer to another drm_colorop, or NULL if its the
> last in the pipeline. You can then have a mutable enum property on the plane to
> discover and select a color pipeline.

Yes, the proposal is inspired by this idea. Sure, we can work together to enhance the design.
Personally I feel the one proposed in the current RFC will do the same thing envisioned by Simon
and you Harry. Management of the pipeline, addition, deletion and flexibility to represent
hardware is more with blobs with the relevant structures agreed in UAPI.

> This seems a bit more transparent than a blob. You can see my changes
> (unfortunately very WIP, don't look too closely at individual patches) at
> https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/diffs
>
> libdrm changes:
> https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diffs

Sure, will check the same.
 
> IGT changes:
> https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_requests/1/diffs
> 
> I'll take time to review your whole series and will see whether we can somehow
> keep the best parts of each.

Thanks and agree. Let's work together and get this implemented in DRM.

> Curious to hear other opinions on the blob vs new DRM object question as well.

Yeah, request community and stakeholders to share feedback and suggestions.
We will work on the feedback to improve the design.

> > Refer to Documentation/gpu/rfc/plane_color_pipeline.rst added in the
> > patch
> >
> > IGT and test details
> > ====================
> >
> > A set of IGT tests is written to demonstrate the usage of the proposed
> > UAPIs along with some sanity validation.
> >
> > Details of the IGT test can be found here:
> > https://patchwork.freedesktop.org/series/123018/
> >
> > Opens
> > =====
> >
> > a. To come up with a list of common HW blocks which can be defined generically
> by the DRM
> >     core in agreement with all the stakeholders b. To enhance/finalize
> > the data structure to define segmented LUTs generically.
> >
> 
> It would be good to add some basic support in VKMS. My work has been based on
> VKMS. Once we kinda settle on an approach I'll look at exposing the AMD private
> properties from Melissa through the API.

Yeah sure Harry.

> >
> > Out of Scope
> > ============
> >
> > a. The coefficients for CTM and LUT value calculations are out of scope of the
> proposal.
> > b. The onus of programming the HW blocks and their values is on user-space.
> Driver will
> >     just provide the interface for the same.
> > c. In order to compute LUT values and coefficients, a helper library can be created
> in
> >     user-space. However, it is out of scope for the current proposal.
> >
> > Acknowledgements and credits
> > ============================
> >
> > There are multiple contributors who have helped us to reach to this
> > proposal. Special mention to Ville
> > Syrjala<ville.syrjala@linux.intel.com>, Pekka
> > Paalanen<pekka.paalanen@collabora.com>,
> > Simon Ser<contact@emersion.fr>, Harry
> > Wentland<harry.wentland@amd.com>, Melissa Wen<mwen@igalia.com>,
> > Jonas<jadahl@redhat.com>, Sebastian Wick<sebastian.wick@redhat.com>,
> Bhanu<bhanuprakash.modem@intel.com> and
> Shashank<shashank.sharma@amd.com>.
> >
> > Also, thanks to Carlos <csoriano@redhat.com> and the Redhat team for
> organizing the HDR hackfest.
> >
> >
> > UAPI dependency and Usermode development
> > ========================================
> >
> > The current KMS implementation requires a user space consumer for it to be
> accepted upstream.
> > Work is ongoing in weston and mutter community to get color management
> > and HDR support implemented in the respective stacks.
> >
> 
> If we can get AMD properties encoded using a Color Pipeline API we can probably
> use gamescope as the userspace vehicle.

Yeah, nice.

> I'm reviewing this in sequence, so there's a chance I'm missing context.
> Please bear with me if some of my comments are answered later in the series.

No worries, really appreciate the feedback and support.

> Again, thanks for sending this.

Always welcome.

Regards,
Uma Shankar

> Harry
> 
> >
> ====================================================================
> ==
> > ===========
> >
> > We have tried to take care of all the scenarios and use-cases which
> > possibly could exists in the current proposal. Thanks to everyone who
> > has contributed in all color management discussions so far. Let's work
> > together to improve the current proposal and get this thing implemented in
> upstream linux. All the feedback and suggestions to enhance the design are
> welcome.
> >
> > Regards,
> > Uma Shankar
> > Chaitanya Kumar Borah
> >
> > Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> > Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
> > Cc: Simon Ser <contact@emersion.fr>
> > Cc: Harry Wentland <harry.wentland@amd.com>
> > Cc: Melissa Wen <mwen@igalia.com>
> > Cc: Jonas Ådahl <jadahl@redhat.com>
> > Cc: Sebastian Wick <sebastian.wick@redhat.com>
> > Cc: Shashank Sharma <shashank.sharma@amd.com>
> > Cc: Alexander Goins <agoins@nvidia.com>
> >
> > Chaitanya Kumar Borah (14):
> >    drm: Add color operation structure
> >    drm: Add plane get color pipeline property
> >    drm: Add helper to add color pipeline
> >    drm: Manage color blob states
> >    drm: Replace individual color blobs
> >    drm: Reset pipeline when user sends NULL blob
> >    drm: Reset plane color state on pipeline switch request
> >    drm/i915/color: Add HDR plane LUT range data to color pipeline
> >    drm/i915/color: Add SDR plane LUT range data to color pipeline
> >    drm/i915/color: Add color pipelines to plane
> >    drm/i915/color: Create and attach set color pipeline property
> >    drm/i915/color: Enable plane color features
> >    drm/i915/color: Add a dummy pipeline with 3D LUT
> >    drm/i915/color: Add example implementation for vendor specific color
> >      operation
> >
> > Uma Shankar (19):
> >    drm/doc/rfc: Add RFC document for proposed Plane Color Pipeline
> >    drm: Add structures for setting color pipeline
> >    drm: Add set colorpipeline property
> >    drm: Add Enhanced Gamma LUT precision structure
> >    drm: Add color lut range structure
> >    drm: Add color information to plane state
> >    drm/i915/color: Add lut range for SDR planes
> >    drm/i915/color: Add lut range for HDR planes
> >    drm/i915/color: Add color pipeline for HDR planes
> >    drm/i915/color: Add color pipeline for SDR planes
> >    drm/i915/color: Add plane color callbacks
> >    drm/i915/color: Load plane color luts from atomic flip
> >    drm/i915/xelpd: Add plane color check to glk_plane_color_ctl
> >    drm/i915/xelpd: Add register definitions for Plane Degamma
> >    drm/i915/color: Add color functions for ADL
> >    drm/i915/color: Program Plane Pre-CSC Registers
> >    drm/i915/xelpd: Add register definitions for Plane Post CSC
> >    drm/i915/xelpd: Program Plane Post CSC Registers
> >    drm/i915/color: Enable Plane CSC
> >
> >   .../gpu/rfc/plane_color_pipeline.rst          | 394 ++++++++++
> >   drivers/gpu/drm/drm_atomic_state_helper.c     |  21 +
> >   drivers/gpu/drm/drm_atomic_uapi.c             | 218 ++++++
> >   drivers/gpu/drm/drm_color_mgmt.c              | 130 ++++
> >   drivers/gpu/drm/i915/display/intel_color.c    | 684 +++++++++++++++++-
> >   drivers/gpu/drm/i915/display/intel_color.h    |   7 +-
> >   .../drm/i915/display/skl_universal_plane.c    |  21 +-
> >   drivers/gpu/drm/i915/i915_reg.h               | 124 ++++
> >   include/drm/drm_plane.h                       |  82 +++
> >   include/uapi/drm/drm_mode.h                   | 196 +++++
> >   include/uapi/drm/i915_drm.h                   |  25 +
> >   11 files changed, 1899 insertions(+), 3 deletions(-)
> >   create mode 100644 Documentation/gpu/rfc/plane_color_pipeline.rst
> >
Sebastian Wick Aug. 30, 2023, 9:15 p.m. UTC | #3
On Wed, Aug 30, 2023 at 08:47:37AM +0000, Shankar, Uma wrote:
> 
> 
> > -----Original Message-----
> > From: Harry Wentland <harry.wentland@amd.com>
> > Sent: Wednesday, August 30, 2023 12:56 AM
> > To: Shankar, Uma <uma.shankar@intel.com>; intel-gfx@lists.freedesktop.org; dri-
> > devel@lists.freedesktop.org
> > Cc: wayland-devel@lists.freedesktop.org; Ville Syrjala
> > <ville.syrjala@linux.intel.com>; Pekka Paalanen <pekka.paalanen@collabora.com>;
> > Simon Ser <contact@emersion.fr>; Melissa Wen <mwen@igalia.com>; Jonas Ådahl
> > <jadahl@redhat.com>; Sebastian Wick <sebastian.wick@redhat.com>; Shashank
> > Sharma <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > <quic_cbraga@quicinc.com>
> > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > 
> > +CC Naseer and Chris, FYI
> > 
> > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > 
> > On 2023-08-29 12:03, Uma Shankar wrote:
> > > Introduction
> > > ============
> > >
> > > Modern hardwares have various color processing capabilities both at
> > > pre-blending and post-blending phases in the color pipeline.
> > > The current drm implementation exposes only the post-blending color
> > > hardware blocks. Support for pre-blending hardware is missing.
> > > There are multiple use cases where pre-blending color hardware will be
> > > useful:
> > > 	a) Linearization of input buffers encoded in various transfer
> > > 	   functions.
> > > 	b) Color Space conversion
> > > 	c) Tone mapping
> > > 	d) Frame buffer format conversion
> > > 	e) Non-linearization of buffer(apply transfer function)
> > > 	f) 3D Luts
> > >
> > > and other miscellaneous color operations.
> > >
> > > Hence, there is a need to expose the color capabilities of the
> > > hardware to user-space. This will help userspace/middleware to use
> > > display hardware for color processing and blending instead of doing it
> > > through GPU shaders.
> > >
> > 
> > Thanks, Uma, for sending this. I've been working on something similar but you beat
> > me to it. :)
> 
> Thanks Harry for the useful feedback and overall collaboration on this so far.
> 
> > >
> > > Work done so far and relevant references
> > > ========================================
> > >
> > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > Broad consensus is there that we need a generic API at drm core to
> > > suffice the use case of various HW vendors. Below are the links
> > > capturing the discussion so far.
> > >
> > > Intel's Plane Color Implementation:
> > > https://patchwork.freedesktop.org/series/90825/
> > > AMD's Plane Color Implementation:
> > > https://patchwork.freedesktop.org/series/116862/
> > >
> > >
> > > Hackfest conclusions
> > > ====================
> > >
> > > HDR/Color Hackfest was organised by Redhat to bring all the industry
> > > stakeholders together and converge on a common uapi expectations.
> > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia and
> > > other prominent user-space developers and maintainers.
> > >
> > > Discussions happened on the uapi expectations, opens, nature of
> > > hardware of multiple hardware vendors, challenges in generalizing the
> > > same and the path forward. Consensus was made that drm core should
> > > implement descriptive APIs and not go with prescriptive APIs. DRM core
> > > should just expose the hardware capabilities; enabling, customizing
> > > and programming the same should be done by the user-space. Driver should just
> > honor the user space request without doing any operations internally.
> > >
> > > Thanks to Simon Ser, for nicely documenting the design consensus and
> > > an UAPI RFC which can be referred to here:
> > >
> > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze_hD5
> > >
> > nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1Q
> > Wn48
> > > 8=@emersion.fr/
> > >
> > >
> > > Design considerations
> > > =====================
> > >
> > > Following are the important aspects taken into account while designing
> > > the current RFC
> > > proposal:
> > >
> > > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks only one
> > can be used)
> > > 	2. Position of the HW block in the pipeline can be programmable
> > > 	3. LUTs can be one dimentional or three dimentional
> > > 	4. Number of LUT entries can vary across platforms
> > > 	5. Precision of LUT entries can vary across platforms
> > > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented, Logarithmic,
> > > 	   Piece-Wise Linear(PWL) etc
> > > 	7. There can be parameterized/non-parameterized fixed function HW
> > blocks.
> > > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > > 	8. Custom non-standard HW implementation.
> > > 	9. Leaving scope for some vendor defined pescriptive implementation if
> > required.
> > > 	10.Scope to handle any modification in hardware as technology evolves
> > >
> > > The current proposal takes into account the above considerations while
> > > keeping the implementation as generic as possible leaving scope for future
> > additions or modifications.
> > >
> > > This proposal is also in line to the details mentioned by Simon's RFC
> > > covering all the aspects discussed in hackfest.
> > >
> > >
> > > Outline of the implementation
> > > ============================
> > >
> > > Each Color Hardware block will be represented by a data structure drm_color_op.
> > > These color operations will form the building blocks of a color
> > > pipeline which best represents the underlying Hardware. Color
> > > operations can be re-arranged, substracted or added to create distinct
> > > color pipelines to accurately describe the Hardware blocks present in the display
> > engine.
> > 
> > Who is doing the arranging of color operations? IMO a driver should define one or
> > more respective pipelines that can be selected by userspace. This seems to be what
> > you're talking about after (I haven't reviewed the whole thing yet). Might be best to
> > drop this sentence or to add clarifications in order to avoid confusion.
> 
> Yes it's the driver who will set the pipeline based on the underlying hardware arrangement
> and possible combinations. There can be multiple pipelines possible if hardware can be
> muxed or order can be re-arranged (all viable combinations should be defined as a pipeline in driver).
> Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.
> 
> > >
> > > In this proposal, a color pipeline is represented as an array of
> > > struct drm_color_op. For individual color operation, we add blobs to
> > > advertise the capability of the particular Hardware block.
> > >
> > > This color pipeline is then packaged within a blob for the user space
> > > to retrieve it.
> > >
> > 
> > Would it be better to expose the drm_color_ops directly, instead of packing a array
> > of drm_color_ops into a blob and then giving that to userspace.
> 
> Advantage we see in packing in blobs is that interface will be cleaner. There will be just
> one GET_COLOR_PIPELINE property to invoke by user and then its just parsing the data.
> This way the entire underlying hardware blocks with pipeline will be available to user.

I don't see how parsing a blob is easier than requesting the color ops
from the kernel. User space is already equiped with getting KMS objects
and the igt test code from Harry shows that this is all pretty trivial
plumbing.

> For a particular hardware block in pipeline, user can get the relevant details from blob
> representing that particular block. We have created IGT tests (links mentioned in cover-letter)
> to demonstrate how it can be done. This is just to clarify the idea.

The blob is also not introspectable with the usual tools whereas
exposing them as properties would be.

It also would, like Pekka correctly noted, bring a whole bunch of issues
about compatibility and versioning that are well understood with objects
+ properties.

> Also since info is passed via blobs we have the flexibility to even define segmented LUTs and PWL
> hardware blocks. Also we have left scope for custom private hardware blocks as well which driver
> can work with respective HAL and get that implemented.

When color ops are real KMS objects they still can have properties which
are blobs that can store LUTs and other data.

And we should avoid private blocks at all cost. In fact, I don't think
the KMS rules have changed in that regard and it simply is not allowed.

> We can even define prescriptive operations as a private entry and enable it if a certain driver and HAL
> agree.
> 
> > > To advertise the available color pipelines, an immutable ENUM property
> > > "GET_COLOR_PIPELINE" is introduced. This enum property has blob id's as values.
> > > With each blob id representing a distinct color pipeline based on
> > > underlying HW capabilities and their respective combinations.
> > >
> > > Once the user space decides on a color pipeline, it can set the
> > > pipeline and the corresponding data for the hardware blocks within the
> > > pipeline with the BLOB property "SET_COLOR_PIPELINE".
> > >
> > 
> > When I discussed this at the hackfest with Simon he proposed a new DRM object,
> > (I've called it a drm_colorop) to represent a color operation.
> > Each drm_colorop has a "NEXT" pointer to another drm_colorop, or NULL if its the
> > last in the pipeline. You can then have a mutable enum property on the plane to
> > discover and select a color pipeline.
> 
> Yes, the proposal is inspired by this idea. Sure, we can work together to enhance the design.
> Personally I feel the one proposed in the current RFC will do the same thing envisioned by Simon
> and you Harry. Management of the pipeline, addition, deletion and flexibility to represent
> hardware is more with blobs with the relevant structures agreed in UAPI.
> 
> > This seems a bit more transparent than a blob. You can see my changes
> > (unfortunately very WIP, don't look too closely at individual patches) at
> > https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/diffs
> >
> > libdrm changes:
> > https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diffs
> 
> Sure, will check the same.
>  
> > IGT changes:
> > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_requests/1/diffs
> > 
> > I'll take time to review your whole series and will see whether we can somehow
> > keep the best parts of each.
> 
> Thanks and agree. Let's work together and get this implemented in DRM.
> 
> > Curious to hear other opinions on the blob vs new DRM object question as well.

I can see a few advantages with the blob approach.

User space can store the entire state in one blob and just assign a new
blob to change to another pipeline configuration.

However, I would argue that changing a lot of properties is already
common practice and works well. User space can deal with it and has
abstractions to make this managable.

There also is no need for a new KMS object kindbu from Harrys work so
far the new KMS object kind is surprisingly simple and works just as
expected, which is a good sign.

With the blobs one can store general information about the pipeline and
not only about the color ops themselves. So far it's not clear if we
actually need this, and if we do, we could probably use color ops at the
start of the pipeline which apply to all ops.

There are also a few drawbacks to the blob design. It's not very KMS-y
which makes working with it different than with the rest of KMS and this
mismatch just adds more mental burden. It probably also introduces a new
set of problems that we might not even be aware of. It also makes it
less introspectable as Pekka noted.

When I came up with the KMS Color Pipeline API, I also started with the
blob approach because of the flexibility in it but it does add a fair
amount of hidden complexity. At the hackfest Simon managed to convince
me of the idea of using a new KMS object kind for it and it seemed to
simplify a bunch of things. With Harrys work I would argue that it is
ineed that case. It's pretty simple and still seems to do everything it
needs to.

So, yeah, I'm favoring the color op approach and not the blob approach.

> 
> Yeah, request community and stakeholders to share feedback and suggestions.
> We will work on the feedback to improve the design.
> 
> > > Refer to Documentation/gpu/rfc/plane_color_pipeline.rst added in the
> > > patch
> > >
> > > IGT and test details
> > > ====================
> > >
> > > A set of IGT tests is written to demonstrate the usage of the proposed
> > > UAPIs along with some sanity validation.
> > >
> > > Details of the IGT test can be found here:
> > > https://patchwork.freedesktop.org/series/123018/
> > >
> > > Opens
> > > =====
> > >
> > > a. To come up with a list of common HW blocks which can be defined generically
> > by the DRM
> > >     core in agreement with all the stakeholders b. To enhance/finalize
> > > the data structure to define segmented LUTs generically.
> > >
> > 
> > It would be good to add some basic support in VKMS. My work has been based on
> > VKMS. Once we kinda settle on an approach I'll look at exposing the AMD private
> > properties from Melissa through the API.
> 
> Yeah sure Harry.

I think this is a great idea! Let's see if we can learn some lessons
from both patch series and then continue with the VKMS implementation
and igt tests for it before every vendor implements their view of how
this should all work.

> 
> > >
> > > Out of Scope
> > > ============
> > >
> > > a. The coefficients for CTM and LUT value calculations are out of scope of the
> > proposal.
> > > b. The onus of programming the HW blocks and their values is on user-space.
> > Driver will
> > >     just provide the interface for the same.
> > > c. In order to compute LUT values and coefficients, a helper library can be created
> > in
> > >     user-space. However, it is out of scope for the current proposal.
> > >
> > > Acknowledgements and credits
> > > ============================
> > >
> > > There are multiple contributors who have helped us to reach to this
> > > proposal. Special mention to Ville
> > > Syrjala<ville.syrjala@linux.intel.com>, Pekka
> > > Paalanen<pekka.paalanen@collabora.com>,
> > > Simon Ser<contact@emersion.fr>, Harry
> > > Wentland<harry.wentland@amd.com>, Melissa Wen<mwen@igalia.com>,
> > > Jonas<jadahl@redhat.com>, Sebastian Wick<sebastian.wick@redhat.com>,
> > Bhanu<bhanuprakash.modem@intel.com> and
> > Shashank<shashank.sharma@amd.com>.
> > >
> > > Also, thanks to Carlos <csoriano@redhat.com> and the Redhat team for
> > organizing the HDR hackfest.
> > >
> > >
> > > UAPI dependency and Usermode development
> > > ========================================
> > >
> > > The current KMS implementation requires a user space consumer for it to be
> > accepted upstream.
> > > Work is ongoing in weston and mutter community to get color management
> > > and HDR support implemented in the respective stacks.
> > >
> > 
> > If we can get AMD properties encoded using a Color Pipeline API we can probably
> > use gamescope as the userspace vehicle.
> 
> Yeah, nice.
> 
> > I'm reviewing this in sequence, so there's a chance I'm missing context.
> > Please bear with me if some of my comments are answered later in the series.
> 
> No worries, really appreciate the feedback and support.
> 
> > Again, thanks for sending this.
> 
> Always welcome.
> 
> Regards,
> Uma Shankar
> 
> > Harry
> > 
> > >
> > ====================================================================
> > ==
> > > ===========
> > >
> > > We have tried to take care of all the scenarios and use-cases which
> > > possibly could exists in the current proposal. Thanks to everyone who
> > > has contributed in all color management discussions so far. Let's work
> > > together to improve the current proposal and get this thing implemented in
> > upstream linux. All the feedback and suggestions to enhance the design are
> > welcome.
> > >
> > > Regards,
> > > Uma Shankar
> > > Chaitanya Kumar Borah
> > >
> > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> > > Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
> > > Cc: Simon Ser <contact@emersion.fr>
> > > Cc: Harry Wentland <harry.wentland@amd.com>
> > > Cc: Melissa Wen <mwen@igalia.com>
> > > Cc: Jonas Ådahl <jadahl@redhat.com>
> > > Cc: Sebastian Wick <sebastian.wick@redhat.com>
> > > Cc: Shashank Sharma <shashank.sharma@amd.com>
> > > Cc: Alexander Goins <agoins@nvidia.com>
> > >
> > > Chaitanya Kumar Borah (14):
> > >    drm: Add color operation structure
> > >    drm: Add plane get color pipeline property
> > >    drm: Add helper to add color pipeline
> > >    drm: Manage color blob states
> > >    drm: Replace individual color blobs
> > >    drm: Reset pipeline when user sends NULL blob
> > >    drm: Reset plane color state on pipeline switch request
> > >    drm/i915/color: Add HDR plane LUT range data to color pipeline
> > >    drm/i915/color: Add SDR plane LUT range data to color pipeline
> > >    drm/i915/color: Add color pipelines to plane
> > >    drm/i915/color: Create and attach set color pipeline property
> > >    drm/i915/color: Enable plane color features
> > >    drm/i915/color: Add a dummy pipeline with 3D LUT
> > >    drm/i915/color: Add example implementation for vendor specific color
> > >      operation
> > >
> > > Uma Shankar (19):
> > >    drm/doc/rfc: Add RFC document for proposed Plane Color Pipeline
> > >    drm: Add structures for setting color pipeline
> > >    drm: Add set colorpipeline property
> > >    drm: Add Enhanced Gamma LUT precision structure
> > >    drm: Add color lut range structure
> > >    drm: Add color information to plane state
> > >    drm/i915/color: Add lut range for SDR planes
> > >    drm/i915/color: Add lut range for HDR planes
> > >    drm/i915/color: Add color pipeline for HDR planes
> > >    drm/i915/color: Add color pipeline for SDR planes
> > >    drm/i915/color: Add plane color callbacks
> > >    drm/i915/color: Load plane color luts from atomic flip
> > >    drm/i915/xelpd: Add plane color check to glk_plane_color_ctl
> > >    drm/i915/xelpd: Add register definitions for Plane Degamma
> > >    drm/i915/color: Add color functions for ADL
> > >    drm/i915/color: Program Plane Pre-CSC Registers
> > >    drm/i915/xelpd: Add register definitions for Plane Post CSC
> > >    drm/i915/xelpd: Program Plane Post CSC Registers
> > >    drm/i915/color: Enable Plane CSC
> > >
> > >   .../gpu/rfc/plane_color_pipeline.rst          | 394 ++++++++++
> > >   drivers/gpu/drm/drm_atomic_state_helper.c     |  21 +
> > >   drivers/gpu/drm/drm_atomic_uapi.c             | 218 ++++++
> > >   drivers/gpu/drm/drm_color_mgmt.c              | 130 ++++
> > >   drivers/gpu/drm/i915/display/intel_color.c    | 684 +++++++++++++++++-
> > >   drivers/gpu/drm/i915/display/intel_color.h    |   7 +-
> > >   .../drm/i915/display/skl_universal_plane.c    |  21 +-
> > >   drivers/gpu/drm/i915/i915_reg.h               | 124 ++++
> > >   include/drm/drm_plane.h                       |  82 +++
> > >   include/uapi/drm/drm_mode.h                   | 196 +++++
> > >   include/uapi/drm/i915_drm.h                   |  25 +
> > >   11 files changed, 1899 insertions(+), 3 deletions(-)
> > >   create mode 100644 Documentation/gpu/rfc/plane_color_pipeline.rst
> > >
Shankar, Uma Sept. 4, 2023, 2:29 p.m. UTC | #4
> -----Original Message-----
> From: Sebastian Wick <sebastian.wick@redhat.com>
> Sent: Thursday, August 31, 2023 2:46 AM
> To: Shankar, Uma <uma.shankar@intel.com>
> Cc: Harry Wentland <harry.wentland@amd.com>; intel-
> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; wayland-
> devel@lists.freedesktop.org; Ville Syrjala <ville.syrjala@linux.intel.com>; Pekka
> Paalanen <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>; Shashank
> Sharma <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> <quic_cbraga@quicinc.com>
> Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> 
> On Wed, Aug 30, 2023 at 08:47:37AM +0000, Shankar, Uma wrote:
> >
> >
> > > -----Original Message-----
> > > From: Harry Wentland <harry.wentland@amd.com>
> > > Sent: Wednesday, August 30, 2023 12:56 AM
> > > To: Shankar, Uma <uma.shankar@intel.com>;
> > > intel-gfx@lists.freedesktop.org; dri- devel@lists.freedesktop.org
> > > Cc: wayland-devel@lists.freedesktop.org; Ville Syrjala
> > > <ville.syrjala@linux.intel.com>; Pekka Paalanen
> > > <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>;
> > > Sebastian Wick <sebastian.wick@redhat.com>; Shashank Sharma
> > > <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > > <quic_cbraga@quicinc.com>
> > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > >
> > > +CC Naseer and Chris, FYI
> > >
> > > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > >
> > > On 2023-08-29 12:03, Uma Shankar wrote:
> > > > Introduction
> > > > ============
> > > >
> > > > Modern hardwares have various color processing capabilities both
> > > > at pre-blending and post-blending phases in the color pipeline.
> > > > The current drm implementation exposes only the post-blending
> > > > color hardware blocks. Support for pre-blending hardware is missing.
> > > > There are multiple use cases where pre-blending color hardware
> > > > will be
> > > > useful:
> > > > 	a) Linearization of input buffers encoded in various transfer
> > > > 	   functions.
> > > > 	b) Color Space conversion
> > > > 	c) Tone mapping
> > > > 	d) Frame buffer format conversion
> > > > 	e) Non-linearization of buffer(apply transfer function)
> > > > 	f) 3D Luts
> > > >
> > > > and other miscellaneous color operations.
> > > >
> > > > Hence, there is a need to expose the color capabilities of the
> > > > hardware to user-space. This will help userspace/middleware to use
> > > > display hardware for color processing and blending instead of
> > > > doing it through GPU shaders.
> > > >
> > >
> > > Thanks, Uma, for sending this. I've been working on something
> > > similar but you beat me to it. :)
> >
> > Thanks Harry for the useful feedback and overall collaboration on this so far.
> >
> > > >
> > > > Work done so far and relevant references
> > > > ========================================
> > > >
> > > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > > Broad consensus is there that we need a generic API at drm core to
> > > > suffice the use case of various HW vendors. Below are the links
> > > > capturing the discussion so far.
> > > >
> > > > Intel's Plane Color Implementation:
> > > > https://patchwork.freedesktop.org/series/90825/
> > > > AMD's Plane Color Implementation:
> > > > https://patchwork.freedesktop.org/series/116862/
> > > >
> > > >
> > > > Hackfest conclusions
> > > > ====================
> > > >
> > > > HDR/Color Hackfest was organised by Redhat to bring all the
> > > > industry stakeholders together and converge on a common uapi
> expectations.
> > > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia
> > > > and other prominent user-space developers and maintainers.
> > > >
> > > > Discussions happened on the uapi expectations, opens, nature of
> > > > hardware of multiple hardware vendors, challenges in generalizing
> > > > the same and the path forward. Consensus was made that drm core
> > > > should implement descriptive APIs and not go with prescriptive
> > > > APIs. DRM core should just expose the hardware capabilities;
> > > > enabling, customizing and programming the same should be done by
> > > > the user-space. Driver should just
> > > honor the user space request without doing any operations internally.
> > > >
> > > > Thanks to Simon Ser, for nicely documenting the design consensus
> > > > and an UAPI RFC which can be referred to here:
> > > >
> > > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze
> > > > _hD5
> > > >
> > >
> nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1
> Q
> > > Wn48
> > > > 8=@emersion.fr/
> > > >
> > > >
> > > > Design considerations
> > > > =====================
> > > >
> > > > Following are the important aspects taken into account while
> > > > designing the current RFC
> > > > proposal:
> > > >
> > > > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks
> > > > only one
> > > can be used)
> > > > 	2. Position of the HW block in the pipeline can be programmable
> > > > 	3. LUTs can be one dimentional or three dimentional
> > > > 	4. Number of LUT entries can vary across platforms
> > > > 	5. Precision of LUT entries can vary across platforms
> > > > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented,
> Logarithmic,
> > > > 	   Piece-Wise Linear(PWL) etc
> > > > 	7. There can be parameterized/non-parameterized fixed function HW
> > > blocks.
> > > > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > > > 	8. Custom non-standard HW implementation.
> > > > 	9. Leaving scope for some vendor defined pescriptive
> > > > implementation if
> > > required.
> > > > 	10.Scope to handle any modification in hardware as technology
> > > > evolves
> > > >
> > > > The current proposal takes into account the above considerations
> > > > while keeping the implementation as generic as possible leaving
> > > > scope for future
> > > additions or modifications.
> > > >
> > > > This proposal is also in line to the details mentioned by Simon's
> > > > RFC covering all the aspects discussed in hackfest.
> > > >
> > > >
> > > > Outline of the implementation
> > > > ============================
> > > >
> > > > Each Color Hardware block will be represented by a data structure
> drm_color_op.
> > > > These color operations will form the building blocks of a color
> > > > pipeline which best represents the underlying Hardware. Color
> > > > operations can be re-arranged, substracted or added to create
> > > > distinct color pipelines to accurately describe the Hardware
> > > > blocks present in the display
> > > engine.
> > >
> > > Who is doing the arranging of color operations? IMO a driver should
> > > define one or more respective pipelines that can be selected by
> > > userspace. This seems to be what you're talking about after (I
> > > haven't reviewed the whole thing yet). Might be best to drop this sentence or
> to add clarifications in order to avoid confusion.
> >
> > Yes it's the driver who will set the pipeline based on the underlying
> > hardware arrangement and possible combinations. There can be multiple
> > pipelines possible if hardware can be muxed or order can be re-arranged (all
> viable combinations should be defined as a pipeline in driver).
> > Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.
> >
> > > >
> > > > In this proposal, a color pipeline is represented as an array of
> > > > struct drm_color_op. For individual color operation, we add blobs
> > > > to advertise the capability of the particular Hardware block.
> > > >
> > > > This color pipeline is then packaged within a blob for the user
> > > > space to retrieve it.
> > > >
> > >
> > > Would it be better to expose the drm_color_ops directly, instead of
> > > packing a array of drm_color_ops into a blob and then giving that to
> userspace.
> >
> > Advantage we see in packing in blobs is that interface will be
> > cleaner. There will be just one GET_COLOR_PIPELINE property to invoke by user
> and then its just parsing the data.
> > This way the entire underlying hardware blocks with pipeline will be available to
> user.
> 
> I don't see how parsing a blob is easier than requesting the color ops from the
> kernel. User space is already equiped with getting KMS objects and the igt test
> code from Harry shows that this is all pretty trivial plumbing.

There are multiple color operations possible with unique lut distribution and
capabilities. Also the order of hardware blocks and possibility of multiple pipelines.
Having all the information with one query and property and also be able to set the
same with just one property call using blobs is better than multiple calls to driver.
This can be useful in high refresh rate cases where not much time is there to program
the hardware state. Latency of multiple calls to driver can be avoided.

> > For a particular hardware block in pipeline, user can get the relevant
> > details from blob representing that particular block. We have created
> > IGT tests (links mentioned in cover-letter) to demonstrate how it can be done.
> This is just to clarify the idea.
> 
> The blob is also not introspectable with the usual tools whereas exposing them as
> properties would be.
> 
> It also would, like Pekka correctly noted, bring a whole bunch of issues about
> compatibility and versioning that are well understood with objects
> + properties.

The blob should be standardized in the UAPI and structures to parse them should be fixed.
With this compatibility issues can be prevented.

> > Also since info is passed via blobs we have the flexibility to even
> > define segmented LUTs and PWL hardware blocks. Also we have left scope
> > for custom private hardware blocks as well which driver can work with
> respective HAL and get that implemented.
> 
> When color ops are real KMS objects they still can have properties which are
> blobs that can store LUTs and other data.
> 
> And we should avoid private blocks at all cost. In fact, I don't think the KMS rules
> have changed in that regard and it simply is not allowed.

Private blocks are not standardized but are vendor specific. So generic userspace will
ignore these. However vendor and its respective HAL can make use of this field and leaves
a scope to cater to such hardware vendors need. This doesn't affect the expectation of the
standardized color operations which will be defined as enum in UAPI.

> > We can even define prescriptive operations as a private entry and
> > enable it if a certain driver and HAL agree.
> >
> > > > To advertise the available color pipelines, an immutable ENUM
> > > > property "GET_COLOR_PIPELINE" is introduced. This enum property has
> blob id's as values.
> > > > With each blob id representing a distinct color pipeline based on
> > > > underlying HW capabilities and their respective combinations.
> > > >
> > > > Once the user space decides on a color pipeline, it can set the
> > > > pipeline and the corresponding data for the hardware blocks within
> > > > the pipeline with the BLOB property "SET_COLOR_PIPELINE".
> > > >
> > >
> > > When I discussed this at the hackfest with Simon he proposed a new
> > > DRM object, (I've called it a drm_colorop) to represent a color operation.
> > > Each drm_colorop has a "NEXT" pointer to another drm_colorop, or
> > > NULL if its the last in the pipeline. You can then have a mutable
> > > enum property on the plane to discover and select a color pipeline.
> >
> > Yes, the proposal is inspired by this idea. Sure, we can work together to enhance
> the design.
> > Personally I feel the one proposed in the current RFC will do the same
> > thing envisioned by Simon and you Harry. Management of the pipeline,
> > addition, deletion and flexibility to represent hardware is more with blobs with
> the relevant structures agreed in UAPI.
> >
> > > This seems a bit more transparent than a blob. You can see my
> > > changes (unfortunately very WIP, don't look too closely at
> > > individual patches) at
> > > https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/di
> > > ffs
> > >
> > > libdrm changes:
> > > https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diff
> > > s
> >
> > Sure, will check the same.
> >
> > > IGT changes:
> > > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_reque
> > > sts/1/diffs
> > >
> > > I'll take time to review your whole series and will see whether we
> > > can somehow keep the best parts of each.
> >
> > Thanks and agree. Let's work together and get this implemented in DRM.
> >
> > > Curious to hear other opinions on the blob vs new DRM object question as
> well.
> 
> I can see a few advantages with the blob approach.
> 
> User space can store the entire state in one blob and just assign a new blob to
> change to another pipeline configuration.

Agree

> However, I would argue that changing a lot of properties is already common
> practice and works well. User space can deal with it and has abstractions to
> make this managable.

Blob gives a lot of flexibility and ability to define the hardware capabilities generically.
Lut distribution, number of segments, samples in each segment, precision of luts etc.
can be precisely defined and userspace will get a complete picture of the underlying
hardware and its capabilities. Also this is being done with just 2 properties. Leaving
scope for future addition of any standard color operation as well.

I feel blob approach once properly documented is a bit more flexible and scalable.
Maybe I am bit biased here 
Pekka Paalanen Sept. 5, 2023, 11:33 a.m. UTC | #5
On Mon, 4 Sep 2023 14:29:56 +0000
"Shankar, Uma" <uma.shankar@intel.com> wrote:

> > -----Original Message-----
> > From: Sebastian Wick <sebastian.wick@redhat.com>
> > Sent: Thursday, August 31, 2023 2:46 AM
> > To: Shankar, Uma <uma.shankar@intel.com>
> > Cc: Harry Wentland <harry.wentland@amd.com>; intel-
> > gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; wayland-
> > devel@lists.freedesktop.org; Ville Syrjala <ville.syrjala@linux.intel.com>; Pekka
> > Paalanen <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>; Shashank
> > Sharma <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > <quic_cbraga@quicinc.com>
> > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > 
> > On Wed, Aug 30, 2023 at 08:47:37AM +0000, Shankar, Uma wrote:  
> > >
> > >  
> > > > -----Original Message-----
> > > > From: Harry Wentland <harry.wentland@amd.com>
> > > > Sent: Wednesday, August 30, 2023 12:56 AM
> > > > To: Shankar, Uma <uma.shankar@intel.com>;
> > > > intel-gfx@lists.freedesktop.org; dri- devel@lists.freedesktop.org
> > > > Cc: wayland-devel@lists.freedesktop.org; Ville Syrjala
> > > > <ville.syrjala@linux.intel.com>; Pekka Paalanen
> > > > <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > > > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>;
> > > > Sebastian Wick <sebastian.wick@redhat.com>; Shashank Sharma
> > > > <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > > > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > > > <quic_cbraga@quicinc.com>
> > > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > > >
> > > > +CC Naseer and Chris, FYI
> > > >
> > > > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > > >
> > > > On 2023-08-29 12:03, Uma Shankar wrote:  
> > > > > Introduction
> > > > > ============
> > > > >
> > > > > Modern hardwares have various color processing capabilities both
> > > > > at pre-blending and post-blending phases in the color pipeline.
> > > > > The current drm implementation exposes only the post-blending
> > > > > color hardware blocks. Support for pre-blending hardware is missing.
> > > > > There are multiple use cases where pre-blending color hardware
> > > > > will be
> > > > > useful:
> > > > > 	a) Linearization of input buffers encoded in various transfer
> > > > > 	   functions.
> > > > > 	b) Color Space conversion
> > > > > 	c) Tone mapping
> > > > > 	d) Frame buffer format conversion
> > > > > 	e) Non-linearization of buffer(apply transfer function)
> > > > > 	f) 3D Luts
> > > > >
> > > > > and other miscellaneous color operations.
> > > > >
> > > > > Hence, there is a need to expose the color capabilities of the
> > > > > hardware to user-space. This will help userspace/middleware to use
> > > > > display hardware for color processing and blending instead of
> > > > > doing it through GPU shaders.
> > > > >  
> > > >
> > > > Thanks, Uma, for sending this. I've been working on something
> > > > similar but you beat me to it. :)  
> > >
> > > Thanks Harry for the useful feedback and overall collaboration on this so far.
> > >  
> > > > >
> > > > > Work done so far and relevant references
> > > > > ========================================
> > > > >
> > > > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > > > Broad consensus is there that we need a generic API at drm core to
> > > > > suffice the use case of various HW vendors. Below are the links
> > > > > capturing the discussion so far.
> > > > >
> > > > > Intel's Plane Color Implementation:
> > > > > https://patchwork.freedesktop.org/series/90825/
> > > > > AMD's Plane Color Implementation:
> > > > > https://patchwork.freedesktop.org/series/116862/
> > > > >
> > > > >
> > > > > Hackfest conclusions
> > > > > ====================
> > > > >
> > > > > HDR/Color Hackfest was organised by Redhat to bring all the
> > > > > industry stakeholders together and converge on a common uapi  
> > expectations.  
> > > > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia
> > > > > and other prominent user-space developers and maintainers.
> > > > >
> > > > > Discussions happened on the uapi expectations, opens, nature of
> > > > > hardware of multiple hardware vendors, challenges in generalizing
> > > > > the same and the path forward. Consensus was made that drm core
> > > > > should implement descriptive APIs and not go with prescriptive
> > > > > APIs. DRM core should just expose the hardware capabilities;
> > > > > enabling, customizing and programming the same should be done by
> > > > > the user-space. Driver should just  
> > > > honor the user space request without doing any operations internally.  
> > > > >
> > > > > Thanks to Simon Ser, for nicely documenting the design consensus
> > > > > and an UAPI RFC which can be referred to here:
> > > > >
> > > > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze
> > > > > _hD5
> > > > >  
> > > >  
> > nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1
> > Q  
> > > > Wn48  
> > > > > 8=@emersion.fr/
> > > > >
> > > > >
> > > > > Design considerations
> > > > > =====================
> > > > >
> > > > > Following are the important aspects taken into account while
> > > > > designing the current RFC
> > > > > proposal:
> > > > >
> > > > > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks
> > > > > only one  
> > > > can be used)  
> > > > > 	2. Position of the HW block in the pipeline can be programmable
> > > > > 	3. LUTs can be one dimentional or three dimentional
> > > > > 	4. Number of LUT entries can vary across platforms
> > > > > 	5. Precision of LUT entries can vary across platforms
> > > > > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented,  
> > Logarithmic,  
> > > > > 	   Piece-Wise Linear(PWL) etc
> > > > > 	7. There can be parameterized/non-parameterized fixed function HW  
> > > > blocks.  
> > > > > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > > > > 	8. Custom non-standard HW implementation.
> > > > > 	9. Leaving scope for some vendor defined pescriptive
> > > > > implementation if  
> > > > required.  
> > > > > 	10.Scope to handle any modification in hardware as technology
> > > > > evolves
> > > > >
> > > > > The current proposal takes into account the above considerations
> > > > > while keeping the implementation as generic as possible leaving
> > > > > scope for future  
> > > > additions or modifications.  
> > > > >
> > > > > This proposal is also in line to the details mentioned by Simon's
> > > > > RFC covering all the aspects discussed in hackfest.
> > > > >
> > > > >
> > > > > Outline of the implementation
> > > > > ============================
> > > > >
> > > > > Each Color Hardware block will be represented by a data structure  
> > drm_color_op.  
> > > > > These color operations will form the building blocks of a color
> > > > > pipeline which best represents the underlying Hardware. Color
> > > > > operations can be re-arranged, substracted or added to create
> > > > > distinct color pipelines to accurately describe the Hardware
> > > > > blocks present in the display  
> > > > engine.
> > > >
> > > > Who is doing the arranging of color operations? IMO a driver should
> > > > define one or more respective pipelines that can be selected by
> > > > userspace. This seems to be what you're talking about after (I
> > > > haven't reviewed the whole thing yet). Might be best to drop this sentence or  
> > to add clarifications in order to avoid confusion.  
> > >
> > > Yes it's the driver who will set the pipeline based on the underlying
> > > hardware arrangement and possible combinations. There can be multiple
> > > pipelines possible if hardware can be muxed or order can be re-arranged (all  
> > viable combinations should be defined as a pipeline in driver).  
> > > Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.
> > >  
> > > > >
> > > > > In this proposal, a color pipeline is represented as an array of
> > > > > struct drm_color_op. For individual color operation, we add blobs
> > > > > to advertise the capability of the particular Hardware block.
> > > > >
> > > > > This color pipeline is then packaged within a blob for the user
> > > > > space to retrieve it.
> > > > >  
> > > >
> > > > Would it be better to expose the drm_color_ops directly, instead of
> > > > packing a array of drm_color_ops into a blob and then giving that to  
> > userspace.  
> > >
> > > Advantage we see in packing in blobs is that interface will be
> > > cleaner. There will be just one GET_COLOR_PIPELINE property to invoke by user  
> > and then its just parsing the data.  
> > > This way the entire underlying hardware blocks with pipeline will be available to  
> > user.
> > 
> > I don't see how parsing a blob is easier than requesting the color ops from the
> > kernel. User space is already equiped with getting KMS objects and the igt test
> > code from Harry shows that this is all pretty trivial plumbing.  
> 
> There are multiple color operations possible with unique lut distribution and
> capabilities. Also the order of hardware blocks and possibility of multiple pipelines.
> Having all the information with one query and property and also be able to set the
> same with just one property call using blobs is better than multiple calls to driver.
> This can be useful in high refresh rate cases where not much time is there to program
> the hardware state. Latency of multiple calls to driver can be avoided.

Hi,

querying all that information only happens once, at KMS client start-up.

Setting up a color pipeline is already a single call: the atomic commit ioctl.

> 
> > > For a particular hardware block in pipeline, user can get the relevant
> > > details from blob representing that particular block. We have created
> > > IGT tests (links mentioned in cover-letter) to demonstrate how it can be done.  
> > This is just to clarify the idea.
> > 
> > The blob is also not introspectable with the usual tools whereas exposing them as
> > properties would be.
> > 
> > It also would, like Pekka correctly noted, bring a whole bunch of issues about
> > compatibility and versioning that are well understood with objects
> > + properties.  
> 
> The blob should be standardized in the UAPI and structures to parse them should be fixed.
> With this compatibility issues can be prevented.

I think that is short-sighted.

> > > Also since info is passed via blobs we have the flexibility to even
> > > define segmented LUTs and PWL hardware blocks. Also we have left scope
> > > for custom private hardware blocks as well which driver can work with  
> > respective HAL and get that implemented.
> > 
> > When color ops are real KMS objects they still can have properties which are
> > blobs that can store LUTs and other data.
> > 
> > And we should avoid private blocks at all cost. In fact, I don't think the KMS rules
> > have changed in that regard and it simply is not allowed.  
> 
> Private blocks are not standardized but are vendor specific. So generic userspace will
> ignore these. However vendor and its respective HAL can make use of this field and leaves
> a scope to cater to such hardware vendors need. This doesn't affect the expectation of the
> standardized color operations which will be defined as enum in UAPI.

Vendors can have and expose their own unique snowflake operations
without any "private" as well: pick an unused operation type code, and
document what it does. Advertise it in some pipelines.

Vendor HALs can make use of it, but it also allows generic userspace to
make use of it at will, and it allows other vendors to implement the
same and benefit from it without needing to patch every userspace.

Or does Intel not want other vendors to potentially make use of their
UAPIs?

> > > We can even define prescriptive operations as a private entry and
> > > enable it if a certain driver and HAL agree.
> > >  
> > > > > To advertise the available color pipelines, an immutable ENUM
> > > > > property "GET_COLOR_PIPELINE" is introduced. This enum property has  
> > blob id's as values.  
> > > > > With each blob id representing a distinct color pipeline based on
> > > > > underlying HW capabilities and their respective combinations.
> > > > >
> > > > > Once the user space decides on a color pipeline, it can set the
> > > > > pipeline and the corresponding data for the hardware blocks within
> > > > > the pipeline with the BLOB property "SET_COLOR_PIPELINE".
> > > > >  
> > > >
> > > > When I discussed this at the hackfest with Simon he proposed a new
> > > > DRM object, (I've called it a drm_colorop) to represent a color operation.
> > > > Each drm_colorop has a "NEXT" pointer to another drm_colorop, or
> > > > NULL if its the last in the pipeline. You can then have a mutable
> > > > enum property on the plane to discover and select a color pipeline.  
> > >
> > > Yes, the proposal is inspired by this idea. Sure, we can work together to enhance  
> > the design.  
> > > Personally I feel the one proposed in the current RFC will do the same
> > > thing envisioned by Simon and you Harry. Management of the pipeline,
> > > addition, deletion and flexibility to represent hardware is more with blobs with  
> > the relevant structures agreed in UAPI.  
> > >  
> > > > This seems a bit more transparent than a blob. You can see my
> > > > changes (unfortunately very WIP, don't look too closely at
> > > > individual patches) at
> > > > https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/di
> > > > ffs
> > > >
> > > > libdrm changes:
> > > > https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diff
> > > > s  
> > >
> > > Sure, will check the same.
> > >  
> > > > IGT changes:
> > > > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_reque
> > > > sts/1/diffs
> > > >
> > > > I'll take time to review your whole series and will see whether we
> > > > can somehow keep the best parts of each.  
> > >
> > > Thanks and agree. Let's work together and get this implemented in DRM.
> > >  
> > > > Curious to hear other opinions on the blob vs new DRM object question as  
> > well.
> > 
> > I can see a few advantages with the blob approach.
> > 
> > User space can store the entire state in one blob and just assign a new blob to
> > change to another pipeline configuration.  
> 
> Agree
> 
> > However, I would argue that changing a lot of properties is already common
> > practice and works well. User space can deal with it and has abstractions to
> > make this managable.  
> 
> Blob gives a lot of flexibility and ability to define the hardware capabilities generically.

The structure of the atomic commit ioctl and the KMS property system is
even more flexible.

> Lut distribution, number of segments, samples in each segment, precision of luts etc.
> can be precisely defined and userspace will get a complete picture of the underlying
> hardware and its capabilities. Also this is being done with just 2 properties. Leaving
> scope for future addition of any standard color operation as well.

The number of properties does not seem too useful to strictly minimise
over other aspects.


Thanks,
pq

> 
> I feel blob approach once properly documented is a bit more flexible and scalable.
> Maybe I am bit biased here 
Sebastian Wick Sept. 5, 2023, 12:33 p.m. UTC | #6
On Tue, Sep 05, 2023 at 02:33:26PM +0300, Pekka Paalanen wrote:
> On Mon, 4 Sep 2023 14:29:56 +0000
> "Shankar, Uma" <uma.shankar@intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: Sebastian Wick <sebastian.wick@redhat.com>
> > > Sent: Thursday, August 31, 2023 2:46 AM
> > > To: Shankar, Uma <uma.shankar@intel.com>
> > > Cc: Harry Wentland <harry.wentland@amd.com>; intel-
> > > gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; wayland-
> > > devel@lists.freedesktop.org; Ville Syrjala <ville.syrjala@linux.intel.com>; Pekka
> > > Paalanen <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>; Shashank
> > > Sharma <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > > <quic_cbraga@quicinc.com>
> > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > > 
> > > On Wed, Aug 30, 2023 at 08:47:37AM +0000, Shankar, Uma wrote:  
> > > >
> > > >  
> > > > > -----Original Message-----
> > > > > From: Harry Wentland <harry.wentland@amd.com>
> > > > > Sent: Wednesday, August 30, 2023 12:56 AM
> > > > > To: Shankar, Uma <uma.shankar@intel.com>;
> > > > > intel-gfx@lists.freedesktop.org; dri- devel@lists.freedesktop.org
> > > > > Cc: wayland-devel@lists.freedesktop.org; Ville Syrjala
> > > > > <ville.syrjala@linux.intel.com>; Pekka Paalanen
> > > > > <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > > > > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>;
> > > > > Sebastian Wick <sebastian.wick@redhat.com>; Shashank Sharma
> > > > > <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > > > > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > > > > <quic_cbraga@quicinc.com>
> > > > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > > > >
> > > > > +CC Naseer and Chris, FYI
> > > > >
> > > > > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > > > >
> > > > > On 2023-08-29 12:03, Uma Shankar wrote:  
> > > > > > Introduction
> > > > > > ============
> > > > > >
> > > > > > Modern hardwares have various color processing capabilities both
> > > > > > at pre-blending and post-blending phases in the color pipeline.
> > > > > > The current drm implementation exposes only the post-blending
> > > > > > color hardware blocks. Support for pre-blending hardware is missing.
> > > > > > There are multiple use cases where pre-blending color hardware
> > > > > > will be
> > > > > > useful:
> > > > > > 	a) Linearization of input buffers encoded in various transfer
> > > > > > 	   functions.
> > > > > > 	b) Color Space conversion
> > > > > > 	c) Tone mapping
> > > > > > 	d) Frame buffer format conversion
> > > > > > 	e) Non-linearization of buffer(apply transfer function)
> > > > > > 	f) 3D Luts
> > > > > >
> > > > > > and other miscellaneous color operations.
> > > > > >
> > > > > > Hence, there is a need to expose the color capabilities of the
> > > > > > hardware to user-space. This will help userspace/middleware to use
> > > > > > display hardware for color processing and blending instead of
> > > > > > doing it through GPU shaders.
> > > > > >  
> > > > >
> > > > > Thanks, Uma, for sending this. I've been working on something
> > > > > similar but you beat me to it. :)  
> > > >
> > > > Thanks Harry for the useful feedback and overall collaboration on this so far.
> > > >  
> > > > > >
> > > > > > Work done so far and relevant references
> > > > > > ========================================
> > > > > >
> > > > > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > > > > Broad consensus is there that we need a generic API at drm core to
> > > > > > suffice the use case of various HW vendors. Below are the links
> > > > > > capturing the discussion so far.
> > > > > >
> > > > > > Intel's Plane Color Implementation:
> > > > > > https://patchwork.freedesktop.org/series/90825/
> > > > > > AMD's Plane Color Implementation:
> > > > > > https://patchwork.freedesktop.org/series/116862/
> > > > > >
> > > > > >
> > > > > > Hackfest conclusions
> > > > > > ====================
> > > > > >
> > > > > > HDR/Color Hackfest was organised by Redhat to bring all the
> > > > > > industry stakeholders together and converge on a common uapi  
> > > expectations.  
> > > > > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia
> > > > > > and other prominent user-space developers and maintainers.
> > > > > >
> > > > > > Discussions happened on the uapi expectations, opens, nature of
> > > > > > hardware of multiple hardware vendors, challenges in generalizing
> > > > > > the same and the path forward. Consensus was made that drm core
> > > > > > should implement descriptive APIs and not go with prescriptive
> > > > > > APIs. DRM core should just expose the hardware capabilities;
> > > > > > enabling, customizing and programming the same should be done by
> > > > > > the user-space. Driver should just  
> > > > > honor the user space request without doing any operations internally.  
> > > > > >
> > > > > > Thanks to Simon Ser, for nicely documenting the design consensus
> > > > > > and an UAPI RFC which can be referred to here:
> > > > > >
> > > > > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze
> > > > > > _hD5
> > > > > >  
> > > > >  
> > > nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1
> > > Q  
> > > > > Wn48  
> > > > > > 8=@emersion.fr/
> > > > > >
> > > > > >
> > > > > > Design considerations
> > > > > > =====================
> > > > > >
> > > > > > Following are the important aspects taken into account while
> > > > > > designing the current RFC
> > > > > > proposal:
> > > > > >
> > > > > > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks
> > > > > > only one  
> > > > > can be used)  
> > > > > > 	2. Position of the HW block in the pipeline can be programmable
> > > > > > 	3. LUTs can be one dimentional or three dimentional
> > > > > > 	4. Number of LUT entries can vary across platforms
> > > > > > 	5. Precision of LUT entries can vary across platforms
> > > > > > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented,  
> > > Logarithmic,  
> > > > > > 	   Piece-Wise Linear(PWL) etc
> > > > > > 	7. There can be parameterized/non-parameterized fixed function HW  
> > > > > blocks.  
> > > > > > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > > > > > 	8. Custom non-standard HW implementation.
> > > > > > 	9. Leaving scope for some vendor defined pescriptive
> > > > > > implementation if  
> > > > > required.  
> > > > > > 	10.Scope to handle any modification in hardware as technology
> > > > > > evolves
> > > > > >
> > > > > > The current proposal takes into account the above considerations
> > > > > > while keeping the implementation as generic as possible leaving
> > > > > > scope for future  
> > > > > additions or modifications.  
> > > > > >
> > > > > > This proposal is also in line to the details mentioned by Simon's
> > > > > > RFC covering all the aspects discussed in hackfest.
> > > > > >
> > > > > >
> > > > > > Outline of the implementation
> > > > > > ============================
> > > > > >
> > > > > > Each Color Hardware block will be represented by a data structure  
> > > drm_color_op.  
> > > > > > These color operations will form the building blocks of a color
> > > > > > pipeline which best represents the underlying Hardware. Color
> > > > > > operations can be re-arranged, substracted or added to create
> > > > > > distinct color pipelines to accurately describe the Hardware
> > > > > > blocks present in the display  
> > > > > engine.
> > > > >
> > > > > Who is doing the arranging of color operations? IMO a driver should
> > > > > define one or more respective pipelines that can be selected by
> > > > > userspace. This seems to be what you're talking about after (I
> > > > > haven't reviewed the whole thing yet). Might be best to drop this sentence or  
> > > to add clarifications in order to avoid confusion.  
> > > >
> > > > Yes it's the driver who will set the pipeline based on the underlying
> > > > hardware arrangement and possible combinations. There can be multiple
> > > > pipelines possible if hardware can be muxed or order can be re-arranged (all  
> > > viable combinations should be defined as a pipeline in driver).  
> > > > Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.
> > > >  
> > > > > >
> > > > > > In this proposal, a color pipeline is represented as an array of
> > > > > > struct drm_color_op. For individual color operation, we add blobs
> > > > > > to advertise the capability of the particular Hardware block.
> > > > > >
> > > > > > This color pipeline is then packaged within a blob for the user
> > > > > > space to retrieve it.
> > > > > >  
> > > > >
> > > > > Would it be better to expose the drm_color_ops directly, instead of
> > > > > packing a array of drm_color_ops into a blob and then giving that to  
> > > userspace.  
> > > >
> > > > Advantage we see in packing in blobs is that interface will be
> > > > cleaner. There will be just one GET_COLOR_PIPELINE property to invoke by user  
> > > and then its just parsing the data.  
> > > > This way the entire underlying hardware blocks with pipeline will be available to  
> > > user.
> > > 
> > > I don't see how parsing a blob is easier than requesting the color ops from the
> > > kernel. User space is already equiped with getting KMS objects and the igt test
> > > code from Harry shows that this is all pretty trivial plumbing.  
> > 
> > There are multiple color operations possible with unique lut distribution and
> > capabilities. Also the order of hardware blocks and possibility of multiple pipelines.
> > Having all the information with one query and property and also be able to set the
> > same with just one property call using blobs is better than multiple calls to driver.
> > This can be useful in high refresh rate cases where not much time is there to program
> > the hardware state. Latency of multiple calls to driver can be avoided.
> 
> Hi,
> 
> querying all that information only happens once, at KMS client start-up.
> 
> Setting up a color pipeline is already a single call: the atomic commit ioctl.

Well, clients also issue a bunch of ioctl to set some properties to the
desired state and then you might run through a whole bunch of
configurations to find one that actually works, so there is a case to be
made that there are a lot of ioctls involved.

I just don't think this is an issue right now. Nobody has been
complaining about the ioctls being a limiting factor so why should we
optimize for this? Especially because it brings with it a bunch of
disadvantages.

Anyway, I agree with the sentiment here: this is not something we should
optimize for.

> 
> > 
> > > > For a particular hardware block in pipeline, user can get the relevant
> > > > details from blob representing that particular block. We have created
> > > > IGT tests (links mentioned in cover-letter) to demonstrate how it can be done.  
> > > This is just to clarify the idea.
> > > 
> > > The blob is also not introspectable with the usual tools whereas exposing them as
> > > properties would be.
> > > 
> > > It also would, like Pekka correctly noted, bring a whole bunch of issues about
> > > compatibility and versioning that are well understood with objects
> > > + properties.  
> > 
> > The blob should be standardized in the UAPI and structures to parse them should be fixed.
> > With this compatibility issues can be prevented.
> 
> I think that is short-sighted.
> 
> > > > Also since info is passed via blobs we have the flexibility to even
> > > > define segmented LUTs and PWL hardware blocks. Also we have left scope
> > > > for custom private hardware blocks as well which driver can work with  
> > > respective HAL and get that implemented.
> > > 
> > > When color ops are real KMS objects they still can have properties which are
> > > blobs that can store LUTs and other data.
> > > 
> > > And we should avoid private blocks at all cost. In fact, I don't think the KMS rules
> > > have changed in that regard and it simply is not allowed.  
> > 
> > Private blocks are not standardized but are vendor specific. So generic userspace will
> > ignore these. However vendor and its respective HAL can make use of this field and leaves
> > a scope to cater to such hardware vendors need. This doesn't affect the expectation of the
> > standardized color operations which will be defined as enum in UAPI.
> 
> Vendors can have and expose their own unique snowflake operations
> without any "private" as well: pick an unused operation type code, and
> document what it does. Advertise it in some pipelines.
> 
> Vendor HALs can make use of it, but it also allows generic userspace to
> make use of it at will, and it allows other vendors to implement the
> same and benefit from it without needing to patch every userspace.
> 
> Or does Intel not want other vendors to potentially make use of their
> UAPIs?
> 
> > > > We can even define prescriptive operations as a private entry and
> > > > enable it if a certain driver and HAL agree.
> > > >  
> > > > > > To advertise the available color pipelines, an immutable ENUM
> > > > > > property "GET_COLOR_PIPELINE" is introduced. This enum property has  
> > > blob id's as values.  
> > > > > > With each blob id representing a distinct color pipeline based on
> > > > > > underlying HW capabilities and their respective combinations.
> > > > > >
> > > > > > Once the user space decides on a color pipeline, it can set the
> > > > > > pipeline and the corresponding data for the hardware blocks within
> > > > > > the pipeline with the BLOB property "SET_COLOR_PIPELINE".
> > > > > >  
> > > > >
> > > > > When I discussed this at the hackfest with Simon he proposed a new
> > > > > DRM object, (I've called it a drm_colorop) to represent a color operation.
> > > > > Each drm_colorop has a "NEXT" pointer to another drm_colorop, or
> > > > > NULL if its the last in the pipeline. You can then have a mutable
> > > > > enum property on the plane to discover and select a color pipeline.  
> > > >
> > > > Yes, the proposal is inspired by this idea. Sure, we can work together to enhance  
> > > the design.  
> > > > Personally I feel the one proposed in the current RFC will do the same
> > > > thing envisioned by Simon and you Harry. Management of the pipeline,
> > > > addition, deletion and flexibility to represent hardware is more with blobs with  
> > > the relevant structures agreed in UAPI.  
> > > >  
> > > > > This seems a bit more transparent than a blob. You can see my
> > > > > changes (unfortunately very WIP, don't look too closely at
> > > > > individual patches) at
> > > > > https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/di
> > > > > ffs
> > > > >
> > > > > libdrm changes:
> > > > > https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diff
> > > > > s  
> > > >
> > > > Sure, will check the same.
> > > >  
> > > > > IGT changes:
> > > > > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_reque
> > > > > sts/1/diffs
> > > > >
> > > > > I'll take time to review your whole series and will see whether we
> > > > > can somehow keep the best parts of each.  
> > > >
> > > > Thanks and agree. Let's work together and get this implemented in DRM.
> > > >  
> > > > > Curious to hear other opinions on the blob vs new DRM object question as  
> > > well.
> > > 
> > > I can see a few advantages with the blob approach.
> > > 
> > > User space can store the entire state in one blob and just assign a new blob to
> > > change to another pipeline configuration.  
> > 
> > Agree
> > 
> > > However, I would argue that changing a lot of properties is already common
> > > practice and works well. User space can deal with it and has abstractions to
> > > make this managable.  
> > 
> > Blob gives a lot of flexibility and ability to define the hardware capabilities generically.
> 
> The structure of the atomic commit ioctl and the KMS property system is
> even more flexible.
> 
> > Lut distribution, number of segments, samples in each segment, precision of luts etc.
> > can be precisely defined and userspace will get a complete picture of the underlying
> > hardware and its capabilities. Also this is being done with just 2 properties. Leaving
> > scope for future addition of any standard color operation as well.
> 
> The number of properties does not seem too useful to strictly minimise
> over other aspects.
> 
> 
> Thanks,
> pq
> 
> > 
> > I feel blob approach once properly documented is a bit more flexible and scalable.
> > Maybe I am bit biased here 
Sebastian Wick Sept. 5, 2023, 12:57 p.m. UTC | #7
On Tue, Sep 05, 2023 at 02:33:04PM +0200, Sebastian Wick wrote:
> On Tue, Sep 05, 2023 at 02:33:26PM +0300, Pekka Paalanen wrote:
> > On Mon, 4 Sep 2023 14:29:56 +0000
> > "Shankar, Uma" <uma.shankar@intel.com> wrote:
> > 
> > > > -----Original Message-----
> > > > From: Sebastian Wick <sebastian.wick@redhat.com>
> > > > Sent: Thursday, August 31, 2023 2:46 AM
> > > > To: Shankar, Uma <uma.shankar@intel.com>
> > > > Cc: Harry Wentland <harry.wentland@amd.com>; intel-
> > > > gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; wayland-
> > > > devel@lists.freedesktop.org; Ville Syrjala <ville.syrjala@linux.intel.com>; Pekka
> > > > Paalanen <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > > > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>; Shashank
> > > > Sharma <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > > > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > > > <quic_cbraga@quicinc.com>
> > > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > > > 
> > > > On Wed, Aug 30, 2023 at 08:47:37AM +0000, Shankar, Uma wrote:  
> > > > >
> > > > >  
> > > > > > -----Original Message-----
> > > > > > From: Harry Wentland <harry.wentland@amd.com>
> > > > > > Sent: Wednesday, August 30, 2023 12:56 AM
> > > > > > To: Shankar, Uma <uma.shankar@intel.com>;
> > > > > > intel-gfx@lists.freedesktop.org; dri- devel@lists.freedesktop.org
> > > > > > Cc: wayland-devel@lists.freedesktop.org; Ville Syrjala
> > > > > > <ville.syrjala@linux.intel.com>; Pekka Paalanen
> > > > > > <pekka.paalanen@collabora.com>; Simon Ser <contact@emersion.fr>;
> > > > > > Melissa Wen <mwen@igalia.com>; Jonas Ådahl <jadahl@redhat.com>;
> > > > > > Sebastian Wick <sebastian.wick@redhat.com>; Shashank Sharma
> > > > > > <shashank.sharma@amd.com>; Alexander Goins <agoins@nvidia.com>;
> > > > > > Naseer Ahmed <quic_naseer@quicinc.com>; Christopher Braga
> > > > > > <quic_cbraga@quicinc.com>
> > > > > > Subject: Re: [RFC 00/33] Add Support for Plane Color Pipeline
> > > > > >
> > > > > > +CC Naseer and Chris, FYI
> > > > > >
> > > > > > See https://patchwork.freedesktop.org/series/123024/ for whole series.
> > > > > >
> > > > > > On 2023-08-29 12:03, Uma Shankar wrote:  
> > > > > > > Introduction
> > > > > > > ============
> > > > > > >
> > > > > > > Modern hardwares have various color processing capabilities both
> > > > > > > at pre-blending and post-blending phases in the color pipeline.
> > > > > > > The current drm implementation exposes only the post-blending
> > > > > > > color hardware blocks. Support for pre-blending hardware is missing.
> > > > > > > There are multiple use cases where pre-blending color hardware
> > > > > > > will be
> > > > > > > useful:
> > > > > > > 	a) Linearization of input buffers encoded in various transfer
> > > > > > > 	   functions.
> > > > > > > 	b) Color Space conversion
> > > > > > > 	c) Tone mapping
> > > > > > > 	d) Frame buffer format conversion
> > > > > > > 	e) Non-linearization of buffer(apply transfer function)
> > > > > > > 	f) 3D Luts
> > > > > > >
> > > > > > > and other miscellaneous color operations.
> > > > > > >
> > > > > > > Hence, there is a need to expose the color capabilities of the
> > > > > > > hardware to user-space. This will help userspace/middleware to use
> > > > > > > display hardware for color processing and blending instead of
> > > > > > > doing it through GPU shaders.
> > > > > > >  
> > > > > >
> > > > > > Thanks, Uma, for sending this. I've been working on something
> > > > > > similar but you beat me to it. :)  
> > > > >
> > > > > Thanks Harry for the useful feedback and overall collaboration on this so far.
> > > > >  
> > > > > > >
> > > > > > > Work done so far and relevant references
> > > > > > > ========================================
> > > > > > >
> > > > > > > Some implementation is done by Intel and AMD/Igalia to address the same.
> > > > > > > Broad consensus is there that we need a generic API at drm core to
> > > > > > > suffice the use case of various HW vendors. Below are the links
> > > > > > > capturing the discussion so far.
> > > > > > >
> > > > > > > Intel's Plane Color Implementation:
> > > > > > > https://patchwork.freedesktop.org/series/90825/
> > > > > > > AMD's Plane Color Implementation:
> > > > > > > https://patchwork.freedesktop.org/series/116862/
> > > > > > >
> > > > > > >
> > > > > > > Hackfest conclusions
> > > > > > > ====================
> > > > > > >
> > > > > > > HDR/Color Hackfest was organised by Redhat to bring all the
> > > > > > > industry stakeholders together and converge on a common uapi  
> > > > expectations.  
> > > > > > > Participants from Intel, AMD, Nvidia, Collabora, Redhat, Igalia
> > > > > > > and other prominent user-space developers and maintainers.
> > > > > > >
> > > > > > > Discussions happened on the uapi expectations, opens, nature of
> > > > > > > hardware of multiple hardware vendors, challenges in generalizing
> > > > > > > the same and the path forward. Consensus was made that drm core
> > > > > > > should implement descriptive APIs and not go with prescriptive
> > > > > > > APIs. DRM core should just expose the hardware capabilities;
> > > > > > > enabling, customizing and programming the same should be done by
> > > > > > > the user-space. Driver should just  
> > > > > > honor the user space request without doing any operations internally.  
> > > > > > >
> > > > > > > Thanks to Simon Ser, for nicely documenting the design consensus
> > > > > > > and an UAPI RFC which can be referred to here:
> > > > > > >
> > > > > > > https://lore.kernel.org/dri-devel/QMers3awXvNCQlyhWdTtsPwkp5ie9bze
> > > > > > > _hD5
> > > > > > >  
> > > > > >  
> > > > nAccFW7a_RXlWjYB7MoUW_8CKLT2bSQwIXVi5H6VULYIxCdgvryZoAoJnC5lZgyK1
> > > > Q  
> > > > > > Wn48  
> > > > > > > 8=@emersion.fr/
> > > > > > >
> > > > > > >
> > > > > > > Design considerations
> > > > > > > =====================
> > > > > > >
> > > > > > > Following are the important aspects taken into account while
> > > > > > > designing the current RFC
> > > > > > > proposal:
> > > > > > >
> > > > > > > 	1. Individual HW blocks can be muxed. (e.g. out of two HW blocks
> > > > > > > only one  
> > > > > > can be used)  
> > > > > > > 	2. Position of the HW block in the pipeline can be programmable
> > > > > > > 	3. LUTs can be one dimentional or three dimentional
> > > > > > > 	4. Number of LUT entries can vary across platforms
> > > > > > > 	5. Precision of LUT entries can vary across platforms
> > > > > > > 	6. Distribution of LUT entries may vary. e.g Mutli-segmented,  
> > > > Logarithmic,  
> > > > > > > 	   Piece-Wise Linear(PWL) etc
> > > > > > > 	7. There can be parameterized/non-parameterized fixed function HW  
> > > > > > blocks.  
> > > > > > > 	   e.g. Just a hardware bit, to convert from one color space to another.
> > > > > > > 	8. Custom non-standard HW implementation.
> > > > > > > 	9. Leaving scope for some vendor defined pescriptive
> > > > > > > implementation if  
> > > > > > required.  
> > > > > > > 	10.Scope to handle any modification in hardware as technology
> > > > > > > evolves
> > > > > > >
> > > > > > > The current proposal takes into account the above considerations
> > > > > > > while keeping the implementation as generic as possible leaving
> > > > > > > scope for future  
> > > > > > additions or modifications.  
> > > > > > >
> > > > > > > This proposal is also in line to the details mentioned by Simon's
> > > > > > > RFC covering all the aspects discussed in hackfest.
> > > > > > >
> > > > > > >
> > > > > > > Outline of the implementation
> > > > > > > ============================
> > > > > > >
> > > > > > > Each Color Hardware block will be represented by a data structure  
> > > > drm_color_op.  
> > > > > > > These color operations will form the building blocks of a color
> > > > > > > pipeline which best represents the underlying Hardware. Color
> > > > > > > operations can be re-arranged, substracted or added to create
> > > > > > > distinct color pipelines to accurately describe the Hardware
> > > > > > > blocks present in the display  
> > > > > > engine.
> > > > > >
> > > > > > Who is doing the arranging of color operations? IMO a driver should
> > > > > > define one or more respective pipelines that can be selected by
> > > > > > userspace. This seems to be what you're talking about after (I
> > > > > > haven't reviewed the whole thing yet). Might be best to drop this sentence or  
> > > > to add clarifications in order to avoid confusion.  
> > > > >
> > > > > Yes it's the driver who will set the pipeline based on the underlying
> > > > > hardware arrangement and possible combinations. There can be multiple
> > > > > pipelines possible if hardware can be muxed or order can be re-arranged (all  
> > > > viable combinations should be defined as a pipeline in driver).  
> > > > > Yeah, I will re-phrase this to help clarify it and avoid any ambiguity.
> > > > >  
> > > > > > >
> > > > > > > In this proposal, a color pipeline is represented as an array of
> > > > > > > struct drm_color_op. For individual color operation, we add blobs
> > > > > > > to advertise the capability of the particular Hardware block.
> > > > > > >
> > > > > > > This color pipeline is then packaged within a blob for the user
> > > > > > > space to retrieve it.
> > > > > > >  
> > > > > >
> > > > > > Would it be better to expose the drm_color_ops directly, instead of
> > > > > > packing a array of drm_color_ops into a blob and then giving that to  
> > > > userspace.  
> > > > >
> > > > > Advantage we see in packing in blobs is that interface will be
> > > > > cleaner. There will be just one GET_COLOR_PIPELINE property to invoke by user  
> > > > and then its just parsing the data.  
> > > > > This way the entire underlying hardware blocks with pipeline will be available to  
> > > > user.
> > > > 
> > > > I don't see how parsing a blob is easier than requesting the color ops from the
> > > > kernel. User space is already equiped with getting KMS objects and the igt test
> > > > code from Harry shows that this is all pretty trivial plumbing.  
> > > 
> > > There are multiple color operations possible with unique lut distribution and
> > > capabilities. Also the order of hardware blocks and possibility of multiple pipelines.
> > > Having all the information with one query and property and also be able to set the
> > > same with just one property call using blobs is better than multiple calls to driver.
> > > This can be useful in high refresh rate cases where not much time is there to program
> > > the hardware state. Latency of multiple calls to driver can be avoided.
> > 
> > Hi,
> > 
> > querying all that information only happens once, at KMS client start-up.
> > 
> > Setting up a color pipeline is already a single call: the atomic commit ioctl.
> 
> Well, clients also issue a bunch of ioctl to set some properties to the
> desired state and then you might run through a whole bunch of
> configurations to find one that actually works, so there is a case to be
> made that there are a lot of ioctls involved.

pq pointed at that for atomic this is actually really just a single
ioctl so really there is no advantage here at all.

> I just don't think this is an issue right now. Nobody has been
> complaining about the ioctls being a limiting factor so why should we
> optimize for this? Especially because it brings with it a bunch of
> disadvantages.
> 
> Anyway, I agree with the sentiment here: this is not something we should
> optimize for.
> 
> > 
> > > 
> > > > > For a particular hardware block in pipeline, user can get the relevant
> > > > > details from blob representing that particular block. We have created
> > > > > IGT tests (links mentioned in cover-letter) to demonstrate how it can be done.  
> > > > This is just to clarify the idea.
> > > > 
> > > > The blob is also not introspectable with the usual tools whereas exposing them as
> > > > properties would be.
> > > > 
> > > > It also would, like Pekka correctly noted, bring a whole bunch of issues about
> > > > compatibility and versioning that are well understood with objects
> > > > + properties.  
> > > 
> > > The blob should be standardized in the UAPI and structures to parse them should be fixed.
> > > With this compatibility issues can be prevented.
> > 
> > I think that is short-sighted.
> > 
> > > > > Also since info is passed via blobs we have the flexibility to even
> > > > > define segmented LUTs and PWL hardware blocks. Also we have left scope
> > > > > for custom private hardware blocks as well which driver can work with  
> > > > respective HAL and get that implemented.
> > > > 
> > > > When color ops are real KMS objects they still can have properties which are
> > > > blobs that can store LUTs and other data.
> > > > 
> > > > And we should avoid private blocks at all cost. In fact, I don't think the KMS rules
> > > > have changed in that regard and it simply is not allowed.  
> > > 
> > > Private blocks are not standardized but are vendor specific. So generic userspace will
> > > ignore these. However vendor and its respective HAL can make use of this field and leaves
> > > a scope to cater to such hardware vendors need. This doesn't affect the expectation of the
> > > standardized color operations which will be defined as enum in UAPI.
> > 
> > Vendors can have and expose their own unique snowflake operations
> > without any "private" as well: pick an unused operation type code, and
> > document what it does. Advertise it in some pipelines.
> > 
> > Vendor HALs can make use of it, but it also allows generic userspace to
> > make use of it at will, and it allows other vendors to implement the
> > same and benefit from it without needing to patch every userspace.
> > 
> > Or does Intel not want other vendors to potentially make use of their
> > UAPIs?
> > 
> > > > > We can even define prescriptive operations as a private entry and
> > > > > enable it if a certain driver and HAL agree.
> > > > >  
> > > > > > > To advertise the available color pipelines, an immutable ENUM
> > > > > > > property "GET_COLOR_PIPELINE" is introduced. This enum property has  
> > > > blob id's as values.  
> > > > > > > With each blob id representing a distinct color pipeline based on
> > > > > > > underlying HW capabilities and their respective combinations.
> > > > > > >
> > > > > > > Once the user space decides on a color pipeline, it can set the
> > > > > > > pipeline and the corresponding data for the hardware blocks within
> > > > > > > the pipeline with the BLOB property "SET_COLOR_PIPELINE".
> > > > > > >  
> > > > > >
> > > > > > When I discussed this at the hackfest with Simon he proposed a new
> > > > > > DRM object, (I've called it a drm_colorop) to represent a color operation.
> > > > > > Each drm_colorop has a "NEXT" pointer to another drm_colorop, or
> > > > > > NULL if its the last in the pipeline. You can then have a mutable
> > > > > > enum property on the plane to discover and select a color pipeline.  
> > > > >
> > > > > Yes, the proposal is inspired by this idea. Sure, we can work together to enhance  
> > > > the design.  
> > > > > Personally I feel the one proposed in the current RFC will do the same
> > > > > thing envisioned by Simon and you Harry. Management of the pipeline,
> > > > > addition, deletion and flexibility to represent hardware is more with blobs with  
> > > > the relevant structures agreed in UAPI.  
> > > > >  
> > > > > > This seems a bit more transparent than a blob. You can see my
> > > > > > changes (unfortunately very WIP, don't look too closely at
> > > > > > individual patches) at
> > > > > > https://gitlab.freedesktop.org/hwentland/linux/-/merge_requests/5/di
> > > > > > ffs
> > > > > >
> > > > > > libdrm changes:
> > > > > > https://gitlab.freedesktop.org/hwentland/drm/-/merge_requests/1/diff
> > > > > > s  
> > > > >
> > > > > Sure, will check the same.
> > > > >  
> > > > > > IGT changes:
> > > > > > https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/merge_reque
> > > > > > sts/1/diffs
> > > > > >
> > > > > > I'll take time to review your whole series and will see whether we
> > > > > > can somehow keep the best parts of each.  
> > > > >
> > > > > Thanks and agree. Let's work together and get this implemented in DRM.
> > > > >  
> > > > > > Curious to hear other opinions on the blob vs new DRM object question as  
> > > > well.
> > > > 
> > > > I can see a few advantages with the blob approach.
> > > > 
> > > > User space can store the entire state in one blob and just assign a new blob to
> > > > change to another pipeline configuration.  
> > > 
> > > Agree
> > > 
> > > > However, I would argue that changing a lot of properties is already common
> > > > practice and works well. User space can deal with it and has abstractions to
> > > > make this managable.  
> > > 
> > > Blob gives a lot of flexibility and ability to define the hardware capabilities generically.
> > 
> > The structure of the atomic commit ioctl and the KMS property system is
> > even more flexible.
> > 
> > > Lut distribution, number of segments, samples in each segment, precision of luts etc.
> > > can be precisely defined and userspace will get a complete picture of the underlying
> > > hardware and its capabilities. Also this is being done with just 2 properties. Leaving
> > > scope for future addition of any standard color operation as well.
> > 
> > The number of properties does not seem too useful to strictly minimise
> > over other aspects.
> > 
> > 
> > Thanks,
> > pq
> > 
> > > 
> > > I feel blob approach once properly documented is a bit more flexible and scalable.
> > > Maybe I am bit biased here