
[RFC,2/3] drm/tidss: Add support for display sharing

Message ID 20240116134142.2092483-3-devarsht@ti.com (mailing list archive)
State New, archived
Series Add display sharing support in tidss

Commit Message

Devarsh Thakkar Jan. 16, 2024, 1:41 p.m. UTC
The display subsystem present in the TI Keystone family of devices
supports sharing the display between multiple hosts, as it provides a
separate register space (common* region) for each host to program the
display controller and also a unique interrupt line for each host.

This adds support for display sharing by allowing partitioning of
resources either at the video port level or at the video plane level,
as described below:

1) Linux can completely own (i.e. have write access to) one or more
video ports along with the corresponding resources (viz. overlay
managers, video planes) used in the context of those video ports. Even
when Linux owns a video port, it can still share that video port with
a remote core which owns one or more of the video planes associated
with that video port.

2) Linux owns one or more video planes while the video port (along
with the corresponding overlay manager) associated with these planes
is owned and controlled by a remote core. Linux still has read-only
access to the associated video port and overlay managers so that it
can parse the settings made by the remote core.

For both cases, the resources used in the context of the processing
core running Linux, along with the ownership information, are exposed
by the user as part of the device-tree blob, and the driver uses an
updated feature list tailored for this shared mode accordingly. The
driver also auto-populates the matching overlay managers and output
types from the shared video port list provided in the device-tree
blob. In the dispc_features struct, drop the const specifier from the
output type (vp_bus_type) array as it needs to be updated dynamically
at run-time for shared mode.

For 2), where Linux only owns a set of video planes with the
corresponding video port and overlay manager controlled by a remote
core, a separate set of CRTC callbacks is used which just latches on
to the mode preset by the remote core, thus avoiding any
reconfiguration of the associated video ports, overlay managers and
clocks. For this case, it is also checked that the Linux-controlled
video planes don't exceed the screen size set by the remote core while
running the display. The display clocks and OLDI-related fields are
also not populated for this scenario, as the remote core owns those
resources.

For 1), where Linux owns only a set of video ports and the associated
planes, with the rest of the resources owned completely by remote
cores, only that set of resources is exposed to Linux and programmed
using the traditional CRTC helpers, and the rest of the video ports
and associated resources are removed from the feature list
accordingly.

Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
---
 drivers/gpu/drm/tidss/tidss_crtc.c  | 120 ++++++++++++-
 drivers/gpu/drm/tidss/tidss_dispc.c | 254 +++++++++++++++++++++++++---
 drivers/gpu/drm/tidss/tidss_dispc.h |   2 +-
 drivers/gpu/drm/tidss/tidss_drv.c   |  33 ++--
 drivers/gpu/drm/tidss/tidss_drv.h   |   6 +
 5 files changed, 375 insertions(+), 40 deletions(-)

Comments

Tomi Valkeinen Jan. 23, 2024, 8:26 a.m. UTC | #1
Hi,

On 16/01/2024 15:41, Devarsh Thakkar wrote:
> Display subsystem present in TI Keystone family of devices supports sharing
> of display between multiple hosts as it provides separate register space
> (common* region) for each host to programming display controller and also a
> unique interrupt line for each host.
> 
> This adds support for display sharing, by allowing partitioning of
> resources either at video port level or at video plane level as
> described below :
> 
> 1) Linux can own (i.e have write access) completely one or more of video
> ports along with corresponding resources (viz. overlay managers,
> video planes) used by Linux in context of those video ports.
> Even if Linux is owning
> these video ports it can still share this video port with a remote core
> which can own one or more video planes associated with this video port.
> 
> 2) Linux owns one or more of the video planes with video port
> (along with corresponding overlay manager) associated with these planes
> being owned and controlled by a remote core. Linux still has read-only
> access to the associated video port and overlay managers so that it can
> parse the settings made by remote core.
> 
> For both the cases, the resources used in context of processing core
> running Linux along with ownership information are exposed by user as
> part of device-tree blob and driver uses an updated feature list tailored
> for this shared mode accordingly. The driver also auto-populates
> matching overlay managers and output types from shared video
> port list provided in device-tree blob.
> In dispc_feature struct remove const access specfier for output_type
> array as it is required to be updated dynamically in run-time for shared
> mode.
> 
> For 2) where Linux is only owning a set of video planes with
> corresponding video port and overlay manager controlled by a remote
> core, separate set of CRTC callbacks are used which just latch on
> to the preset mode set by remote core, thus avoiding any reconfiguration
> of associated video ports, overlay managers and clocks.
> For this case, it is also checked that Linux controlled video planes
> don't exceed screen size set by remote core while running the display.
> Display clocks and OLDI related fields are also not
> populated for this scenario as remote core is owning those resources.
> 
> For 1), where Linux owns only a set of video port and associated
> planes with rest of resources owned completely by remote cores,
> only those set of resources are exposed to Linux and programmed using
> traditional CRTC helpers and rest of video ports and associated resources
> are removed from feature list accordingly.
> 
> Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
> ---
>   drivers/gpu/drm/tidss/tidss_crtc.c  | 120 ++++++++++++-
>   drivers/gpu/drm/tidss/tidss_dispc.c | 254 +++++++++++++++++++++++++---
>   drivers/gpu/drm/tidss/tidss_dispc.h |   2 +-
>   drivers/gpu/drm/tidss/tidss_drv.c   |  33 ++--
>   drivers/gpu/drm/tidss/tidss_drv.h   |   6 +
>   5 files changed, 375 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c b/drivers/gpu/drm/tidss/tidss_crtc.c
> index 5f838980c7a1..f6a877ff4c6c 100644
> --- a/drivers/gpu/drm/tidss/tidss_crtc.c
> +++ b/drivers/gpu/drm/tidss/tidss_crtc.c
> @@ -31,13 +31,19 @@ static void tidss_crtc_finish_page_flip(struct tidss_crtc *tcrtc)
>   	/*
>   	 * New settings are taken into use at VFP, and GO bit is cleared at
>   	 * the same time. This happens before the vertical blank interrupt.
> -	 * So there is a small change that the driver sets GO bit after VFP, but
> +	 * So there is a small chance that the driver sets GO bit after VFP, but
>   	 * before vblank, and we have to check for that case here.
> +	 *
> +	 * For a video port shared between Linux and remote core but owned by remote core,
> +	 * this is not required since Linux just attaches to mode that was preset by remote
> +	 * core with which display is being shared.
>   	 */
> -	busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport);
> -	if (busy) {
> -		spin_unlock_irqrestore(&ddev->event_lock, flags);
> -		return;
> +	if (!tidss->shared_mode || tidss->shared_mode_owned_vps[tcrtc->hw_videoport]) {

You test this in multiple places. I think it would be better to combine 
those, in one way or another. Either a helper function, or maybe invert 
the shared_mode_owned_vps, i.e. rather have something like "foreign_vps" 
(better name needed), so that when !shared_mode, the default of False in 
foreign_vps array will just work. Then the above test will be just if 
(!tidss->foreign_vps[tcrtc->hw_videoport])
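For illustration, a rough sketch of the helper-function variant
(hypothetical name, using only the fields added by this patch):

static bool tidss_vp_owned_by_linux(struct tidss_device *tidss,
				    u32 hw_videoport)
{
	/* When not in shared mode, Linux owns every VP. */
	return !tidss->shared_mode ||
	       tidss->shared_mode_owned_vps[hw_videoport];
}

so that the test above and the other occurrences reduce to
if (tidss_vp_owned_by_linux(tidss, tcrtc->hw_videoport)).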

> +		busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport);
> +		if (busy) {
> +			spin_unlock_irqrestore(&ddev->event_lock, flags);
> +			return;
> +		}
>   	}
>   
>   	event = tcrtc->event;
> @@ -208,6 +214,44 @@ static void tidss_crtc_atomic_flush(struct drm_crtc *crtc,
>   	spin_unlock_irqrestore(&ddev->event_lock, flags);
>   }
>   
> +static void tidss_shared_vp_crtc_atomic_flush(struct drm_crtc *crtc,
> +					      struct drm_atomic_state *state)
> +{
> +	struct tidss_crtc *tcrtc = to_tidss_crtc(crtc);
> +	struct drm_device *ddev = crtc->dev;
> +	unsigned long flags;
> +
> +	dev_dbg(ddev->dev,
> +		"%s: %s enabled %d, needs modeset %d, event %p\n", __func__,
> +		crtc->name, drm_atomic_crtc_needs_modeset(crtc->state),
> +		crtc->state->enable, crtc->state->event);
> +
> +	/* There is nothing to do if CRTC is not going to be enabled. */
> +	if (!crtc->state->enable)
> +		return;
> +	/*
> +	 * Flush CRTC changes with go bit only if new modeset is not
> +	 * coming, so CRTC is enabled trough out the commit.
> +	 */
> +	if (drm_atomic_crtc_needs_modeset(crtc->state))
> +		return;
> +
> +	/* We should have event if CRTC is enabled through out this commit. */
> +	if (WARN_ON(!crtc->state->event))
> +		return;
> +
> +	WARN_ON(drm_crtc_vblank_get(crtc) != 0);
> +
> +	spin_lock_irqsave(&ddev->event_lock, flags);
> +
> +	WARN_ON(tcrtc->event);
> +
> +	tcrtc->event = crtc->state->event;
> +	crtc->state->event = NULL;
> +
> +	spin_unlock_irqrestore(&ddev->event_lock, flags);
> +}
> +
>   static void tidss_crtc_atomic_enable(struct drm_crtc *crtc,
>   				     struct drm_atomic_state *state)
>   {
> @@ -253,6 +297,27 @@ static void tidss_crtc_atomic_enable(struct drm_crtc *crtc,
>   	spin_unlock_irqrestore(&ddev->event_lock, flags);
>   }
>   
> +static void tidss_shared_vp_crtc_atomic_enable(struct drm_crtc *crtc,
> +					       struct drm_atomic_state *state)
> +{
> +	struct drm_device *ddev = crtc->dev;
> +	unsigned long flags;
> +
> +	dev_dbg(ddev->dev, "%s, event %p\n", __func__, crtc->state->event);
> +
> +	/* Turn vertical blanking interrupt reporting on. */
> +	drm_crtc_vblank_on(crtc);
> +
> +	spin_lock_irqsave(&ddev->event_lock, flags);
> +
> +	if (crtc->state->event) {
> +		drm_crtc_send_vblank_event(crtc, crtc->state->event);
> +		crtc->state->event = NULL;
> +	}
> +
> +	spin_unlock_irqrestore(&ddev->event_lock, flags);
> +}
> +
>   static void tidss_crtc_atomic_disable(struct drm_crtc *crtc,
>   				      struct drm_atomic_state *state)
>   {
> @@ -288,6 +353,27 @@ static void tidss_crtc_atomic_disable(struct drm_crtc *crtc,
>   	tidss_runtime_put(tidss);
>   }
>   
> +static void tidss_shared_vp_crtc_atomic_disable(struct drm_crtc *crtc,
> +						struct drm_atomic_state *state)
> +{
> +	struct tidss_crtc *tcrtc = to_tidss_crtc(crtc);
> +	struct drm_device *ddev = crtc->dev;
> +	unsigned long flags;
> +
> +	dev_dbg(ddev->dev, "%s, event %p\n", __func__, crtc->state->event);
> +
> +	reinit_completion(&tcrtc->framedone_completion);
> +
> +	spin_lock_irqsave(&ddev->event_lock, flags);
> +	if (crtc->state->event) {
> +		drm_crtc_send_vblank_event(crtc, crtc->state->event);
> +		crtc->state->event = NULL;
> +	}
> +	spin_unlock_irqrestore(&ddev->event_lock, flags);
> +
> +	drm_crtc_vblank_off(crtc);
> +}
> +
>   static
>   enum drm_mode_status tidss_crtc_mode_valid(struct drm_crtc *crtc,
>   					   const struct drm_display_mode *mode)
> @@ -304,7 +390,14 @@ static const struct drm_crtc_helper_funcs tidss_crtc_helper_funcs = {
>   	.atomic_flush = tidss_crtc_atomic_flush,
>   	.atomic_enable = tidss_crtc_atomic_enable,
>   	.atomic_disable = tidss_crtc_atomic_disable,
> +	.mode_valid = tidss_crtc_mode_valid,
> +};
>   
> +static const struct drm_crtc_helper_funcs tidss_shared_vp_crtc_helper_funcs = {
> +	.atomic_check = tidss_crtc_atomic_check,
> +	.atomic_flush = tidss_shared_vp_crtc_atomic_flush,
> +	.atomic_enable = tidss_shared_vp_crtc_atomic_enable,
> +	.atomic_disable = tidss_shared_vp_crtc_atomic_disable,
>   	.mode_valid = tidss_crtc_mode_valid,
>   };
>   
> @@ -406,6 +499,10 @@ struct tidss_crtc *tidss_crtc_create(struct tidss_device *tidss,
>   	bool has_ctm = tidss->feat->vp_feat.color.has_ctm;
>   	int ret;
>   
> +	dev_dbg(tidss->dev, "%s: tidss->shared_mode: %d tidss->shared_mode_owned_vps[%d] = %d\n",
> +		__func__, tidss->shared_mode, hw_videoport,
> +		tidss->shared_mode_owned_vps[hw_videoport]);
> +
>   	tcrtc = kzalloc(sizeof(*tcrtc), GFP_KERNEL);
>   	if (!tcrtc)
>   		return ERR_PTR(-ENOMEM);
> @@ -422,8 +519,17 @@ struct tidss_crtc *tidss_crtc_create(struct tidss_device *tidss,
>   		return ERR_PTR(ret);
>   	}
>   
> -	drm_crtc_helper_add(crtc, &tidss_crtc_helper_funcs);
> -
> +	/* Use shared video port CRTC helpers which don't access associated
> +	 * video port and overlay register spaces when Linux is not owning the
> +	 * video port.
> +	 */
> +	if (tidss->shared_mode && !tidss->shared_mode_owned_vps[hw_videoport]) {
> +		drm_crtc_helper_add(crtc, &tidss_shared_vp_crtc_helper_funcs);
> +		dev_dbg(tidss->dev, "%s: vp%d is being shared with Linux\n", __func__,
> +			hw_videoport + 1);
> +	} else {
> +		drm_crtc_helper_add(crtc, &tidss_crtc_helper_funcs);
> +	}
>   	/*
>   	 * The dispc gamma functions adapt to what ever size we ask
>   	 * from it no matter what HW supports. X-server assumes 256
> diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
> index 1ad711f8d2a8..35a82198629f 100644
> --- a/drivers/gpu/drm/tidss/tidss_dispc.c
> +++ b/drivers/gpu/drm/tidss/tidss_dispc.c
> @@ -273,6 +273,18 @@ const struct dispc_features dispc_j721e_feats = {
>   	.vid_order = { 1, 3, 0, 2 },
>   };
>   
> +static const u16 tidss_am62_common1_regs[DISPC_COMMON_REG_TABLE_LEN] = {
> +	[DISPC_IRQ_EOI_OFF] =			0x24,
> +	[DISPC_IRQSTATUS_RAW_OFF] =		0x28,
> +	[DISPC_IRQSTATUS_OFF] =			0x2c,
> +	[DISPC_IRQENABLE_SET_OFF] =		0x30,
> +	[DISPC_IRQENABLE_CLR_OFF] =		0x40,
> +	[DISPC_VID_IRQENABLE_OFF] =		0x44,
> +	[DISPC_VID_IRQSTATUS_OFF] =		0x58,
> +	[DISPC_VP_IRQENABLE_OFF] =		0x70,
> +	[DISPC_VP_IRQSTATUS_OFF] =		0x7c,
> +};
> +
>   const struct dispc_features dispc_am625_feats = {
>   	.max_pclk_khz = {
>   		[DISPC_VP_DPI] = 165000,
> @@ -1228,6 +1240,22 @@ enum drm_mode_status dispc_vp_mode_valid(struct dispc_device *dispc,
>   
>   	max_pclk = dispc->feat->max_pclk_khz[bus_type];
>   
> +	/*
> +	 * For shared mode, with remote core driving the video port, make sure that Linux
> +	 * controlled primary plane doesn't exceed video port screen size set by remote core
> +	 */
> +	if (dispc->tidss->shared_mode && !dispc->tidss->shared_mode_owned_vps[hw_videoport]) {
> +		int vp_hdisplay = VP_REG_GET(dispc, hw_videoport, DISPC_VP_SIZE_SCREEN, 11, 0) + 1;
> +		int vp_vdisplay = VP_REG_GET(dispc, hw_videoport, DISPC_VP_SIZE_SCREEN, 27, 16) + 1;
> +
> +		if (mode->hdisplay > vp_hdisplay ||
> +		    mode->vdisplay > vp_vdisplay) {
> +			dev_err(dispc->dev, "%dx%d exceeds VP screen size %dx%d in shared mode\n",
> +				mode->hdisplay, mode->vdisplay, vp_hdisplay, vp_vdisplay);
> +			return MODE_BAD;
> +		}
> +	}
> +

I'm not sure I understand this. If the VP is owned by someone else, 
isn't there just a single mode that can be allowed (the one that that 
"someone else" has set up?). The comment also talks about primary 
planes, but this code is about the video mode, not planes.

So my understanding of a VP that's controlled by RTOS is that at boot 
time RTOS figures out a video mode (either the native mode of an LCD, or 
picks a mode from a monitor's EDID), and then that one video mode will 
be used, and never changed.

Linux's tidss can, at probe time probably, look at the VP configuration 
and figure out the video mode details, which can then be exposed to the 
userspace.

Changing video modes at runtime would probably require some kind of IPC.
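For reference, a rough sketch of such a probe-time read-back, reusing
the DISPC_VP_SIZE_SCREEN bitfields already used above (only the active
size is shown, and this is not part of the posted patch; the sync/porch
timings and pixel clock would have to be read back from the VP timing
registers in the same way):

static void dispc_shared_vp_read_size(struct dispc_device *dispc,
				      u32 hw_videoport,
				      struct drm_display_mode *mode)
{
	/* Read back what the remote core programmed into the VP. */
	mode->hdisplay = VP_REG_GET(dispc, hw_videoport,
				    DISPC_VP_SIZE_SCREEN, 11, 0) + 1;
	mode->vdisplay = VP_REG_GET(dispc, hw_videoport,
				    DISPC_VP_SIZE_SCREEN, 27, 16) + 1;
}

The resulting mode could then be exposed to userspace as the single
fixed mode for that connector.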

>   	if (WARN_ON(max_pclk == 0))
>   		return MODE_BAD;
>   
> @@ -1276,15 +1304,18 @@ enum drm_mode_status dispc_vp_mode_valid(struct dispc_device *dispc,
>   		return MODE_BAD_VVALUE;
>   
>   	if (dispc->memory_bandwidth_limit) {
> -		const unsigned int bpp = 4;
> -		u64 bandwidth;
> +		if (!dispc->tidss->shared_mode ||
> +		    dispc->tidss->shared_mode_owned_vps[hw_videoport]) {
> +			const unsigned int bpp = 4;
> +			u64 bandwidth;
>   
> -		bandwidth = 1000 * mode->clock;
> -		bandwidth = bandwidth * mode->hdisplay * mode->vdisplay * bpp;
> -		bandwidth = div_u64(bandwidth, mode->htotal * mode->vtotal);
> +			bandwidth = 1000 * mode->clock;
> +			bandwidth = bandwidth * mode->hdisplay * mode->vdisplay * bpp;
> +			bandwidth = div_u64(bandwidth, mode->htotal * mode->vtotal);
>   
> -		if (dispc->memory_bandwidth_limit < bandwidth)
> -			return MODE_BAD;
> +			if (dispc->memory_bandwidth_limit < bandwidth)
> +				return MODE_BAD;
> +		}
>   	}
>   
>   	return MODE_OK;
> @@ -2655,6 +2686,147 @@ static void dispc_vp_set_color_mgmt(struct dispc_device *dispc,
>   		dispc_k3_vp_set_ctm(dispc, hw_videoport, ctm);
>   }
>   
> +static int get_vp_idx_from_vp(const char *vp_name)
> +{
> +	u32 vp_idx;
> +
> +	if (!strcmp("vp1", vp_name))
> +		vp_idx = 0;
> +	else if (!strcmp("vp2", vp_name))
> +		vp_idx = 1;
> +	else if (!strcmp("vp3", vp_name))
> +		vp_idx = 2;
> +	else if (!strcmp("vp4", vp_name))
> +		vp_idx = 3;
> +	else
> +		return 0;
> +
> +	return vp_idx;
> +}

Any particular reason to use strings in the DT for all these? Using an
index number in the DT would simplify the driver. We also have lists of
vp names already in the driver.

I didn't go through the whole logic yet, but these two functions give me 
the feeling that this part could be better.
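As an example of reusing the existing name lists, a hypothetical
lookup (not in this patch) that maps a DT-provided VP name to its
hardware index via the SoC's unpruned feature list, replacing the
hardcoded "vp1".."vp4" comparisons:

static int dispc_vp_name_to_idx(const struct dispc_features *soc_feat,
				const char *vp_name)
{
	unsigned int i;

	/* soc_feat is the full per-SoC feature list, not the pruned copy. */
	for (i = 0; i < soc_feat->num_vps; i++)
		if (!strcmp(soc_feat->vp_name[i], vp_name))
			return i;

	return -EINVAL;
}

The same approach would work for the plane names, and plain index
numbers in the DT would remove the need for any lookup at all.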

> +
> +static const char *get_ovr_from_vp(const char *vp_name)
> +{
> +	const char *ovr_name = NULL;
> +
> +	if (!strcmp("vp1", vp_name))
> +		ovr_name = "ovr1";
> +	else if (!strcmp("vp2", vp_name))
> +		ovr_name = "ovr2";
> +	else if (!strcmp("vp3", vp_name))
> +		ovr_name = "ovr3";
> +	else if (!strcmp("vp4", vp_name))
> +		ovr_name = "ovr4";
> +	else
> +		return NULL;
> +
> +	return ovr_name;
> +}
> +
> +static void dispc_shared_mode_update_bus_type(struct dispc_features *shared_mode_feat,
> +					      struct dispc_device *dispc)
> +{
> +{
> +	u32 i, vp_idx;
> +	int num_vps = shared_mode_feat->num_vps;
> +	enum dispc_vp_bus_type vp_bus_type[TIDSS_MAX_PORTS];
> +
> +	memcpy((void *)vp_bus_type, (void *)shared_mode_feat->vp_bus_type,
> +	       sizeof(vp_bus_type));
> +	memset(shared_mode_feat->vp_bus_type, 0, sizeof(vp_bus_type));
> +
> +	for (i = 0; i < num_vps; i++) {
> +		/*
> +		 * Find corresponding vp bus type.
> +		 */
> +		vp_idx = get_vp_idx_from_vp(shared_mode_feat->vp_name[i]);
> +		shared_mode_feat->vp_bus_type[i] = vp_bus_type[vp_idx];
> +	}
> +}
> +
> +static int dispc_update_shared_mode_features(struct dispc_features *shared_mode_feat,
> +					     struct dispc_device *dispc)
> +{
> +	int r = 0, i = 0;
> +
> +	dev_dbg(dispc->dev, "Start updating dispc feature list for shared mode:\n");
> +
> +	/*
> +	 * Start with a shallow copy from existing features and prune the list
> +	 * as per what is actually made available to Linux
> +	 */
> +	memcpy((void *)shared_mode_feat, (void *)dispc->feat, sizeof(*shared_mode_feat));
> +	shared_mode_feat->num_vps = device_property_string_array_count(dispc->dev,
> +								       "ti,dss-shared-mode-vp");
> +	shared_mode_feat->num_planes = device_property_string_array_count(dispc->dev,
> +									  "ti,dss-shared-mode-planes");
> +
> +	r = device_property_read_string(dispc->dev, "ti,dss-shared-mode-common",
> +					(const char **)&shared_mode_feat->common);
> +	if (r) {
> +		dev_err(dispc->dev, "failed to read shared video port name: %d\n", r);
> +		return r;
> +	}
> +
> +	memset(shared_mode_feat->vid_name, 0, sizeof(shared_mode_feat->vid_name));
> +	r = device_property_read_string_array(dispc->dev, "ti,dss-shared-mode-planes",
> +					      shared_mode_feat->vid_name, TIDSS_MAX_PLANES);
> +	if (r < 0) {
> +		dev_err(dispc->dev, "failed to read client vid layer name: %d\n", r);
> +		return r;
> +	}
> +
> +	r = device_property_read_u32_array(dispc->dev, "ti,dss-shared-mode-vp-owned",
> +					   dispc->tidss->shared_mode_owned_vps,
> +					   shared_mode_feat->num_vps);
> +	if (r < 0) {
> +		dev_err(dispc->dev, "failed to read owned vp list: %d\n", r);
> +		return r;
> +	}
> +
> +	memset(shared_mode_feat->vp_name, 0, sizeof(shared_mode_feat->vp_name));
> +	r = device_property_read_string_array(dispc->dev, "ti,dss-shared-mode-vp",
> +					      shared_mode_feat->vp_name, TIDSS_MAX_PORTS);
> +	if (r < 0) {
> +		dev_err(dispc->dev, "failed to read shared video port name: %d\n", r);
> +		return r;
> +	}
> +
> +	memset(shared_mode_feat->vid_order, 0, sizeof(shared_mode_feat->vid_order));
> +	r = device_property_read_u32_array(dispc->dev, "ti,dss-shared-mode-plane-zorder",
> +					   shared_mode_feat->vid_order,
> +					   shared_mode_feat->num_planes);
> +	if (r < 0) {
> +		dev_err(dispc->dev, "failed to read vid_order array name: %d\n", r);
> +		return r;
> +	}
> +	memcpy((void *)shared_mode_feat->vpclk_name, (void *)shared_mode_feat->vp_name,
> +	       sizeof(shared_mode_feat->vpclk_name));
> +	memset(shared_mode_feat->ovr_name, 0, sizeof(shared_mode_feat->ovr_name));
> +
> +	for (i = 0; i < shared_mode_feat->num_vps; i++) {
> +		shared_mode_feat->ovr_name[i] = get_ovr_from_vp(shared_mode_feat->vp_name[i]);
> +		dev_dbg(dispc->dev, "vp[%d] = %s, ovr[%d] = %s vpclk[%d] = %s vp_owned[%d] = %d\n",
> +			i,  shared_mode_feat->vp_name[i], i, shared_mode_feat->ovr_name[i], i,
> +			shared_mode_feat->vpclk_name[i], i, dispc->tidss->shared_mode_owned_vps[i]);
> +	}
> +
> +	for (i = 0; i < shared_mode_feat->num_planes; i++) {
> +		if (!strncmp("vidl", shared_mode_feat->vid_name[i], 4))
> +			shared_mode_feat->vid_lite[i] = true;
> +		dev_dbg(dispc->dev, "vid[%d] = %s, vid_order[%d] = %u vid_lite[%d] = %u\n", i,
> +			shared_mode_feat->vid_name[i], i, shared_mode_feat->vid_order[i], i,
> +			shared_mode_feat->vid_lite[i]);
> +	}
> +
> +	if (!strcmp(shared_mode_feat->common, "common1"))
> +		shared_mode_feat->common_regs = tidss_am62_common1_regs;
> +
> +	dev_dbg(dispc->dev, "common : %s\n", shared_mode_feat->common);
> +	dispc_shared_mode_update_bus_type(shared_mode_feat, dispc);
> +	dev_dbg(dispc->dev, "Feature list updated for shared mode\n");
> +
> +	return 0;
> +}
> +
>   void dispc_vp_setup(struct dispc_device *dispc, u32 hw_videoport,
>   		    const struct drm_crtc_state *state, bool newmodeset)
>   {
> @@ -2662,6 +2834,16 @@ void dispc_vp_setup(struct dispc_device *dispc, u32 hw_videoport,
>   	dispc_vp_set_color_mgmt(dispc, hw_videoport, state, newmodeset);
>   }
>   
> +static bool dispc_owns_global_common_in_shared_mode(struct dispc_device *dispc)
> +{
> +	if ((!strcmp(dispc->feat->common, "common") ||
> +	     !strcmp(dispc->feat->common, "common_m")) &&
> +	     dispc->tidss->shared_mode)
> +		return true;
> +	else
> +		return false;
> +}
> +
>   int dispc_runtime_suspend(struct dispc_device *dispc)
>   {
>   	dev_dbg(dispc->dev, "suspend\n");
> @@ -2846,6 +3028,7 @@ int dispc_init(struct tidss_device *tidss)
>   	struct platform_device *pdev = to_platform_device(dev);
>   	struct dispc_device *dispc;
>   	const struct dispc_features *feat;
> +	struct dispc_features *shared_mode_feat;
>   	unsigned int i, num_fourccs;
>   	int r = 0;
>   
> @@ -2887,6 +3070,21 @@ int dispc_init(struct tidss_device *tidss)
>   
>   	dispc->num_fourccs = num_fourccs;
>   
> +	if (tidss->shared_mode) {
> +		dev_dbg(dev, "%s : DSS is being shared with remote core\n", __func__);
> +		shared_mode_feat = devm_kzalloc(dev, sizeof(*shared_mode_feat), GFP_KERNEL);
> +		if (!shared_mode_feat)
> +			return -ENOMEM;
> +
> +		r = dispc_update_shared_mode_features(shared_mode_feat, dispc);
> +		if (r)
> +			return r;
> +
> +		tidss->feat = (const struct dispc_features *)shared_mode_feat;
> +		feat = tidss->feat;
> +		dispc->feat = feat;
> +	}
> +
>   	dispc_common_regmap = dispc->feat->common_regs;
>   
>   	r = dispc_iomap_resource(pdev, dispc->feat->common,
> @@ -2933,25 +3131,37 @@ int dispc_init(struct tidss_device *tidss)
>   	}
>   
>   	if (feat->subrev == DISPC_AM65X) {
> -		r = dispc_init_am65x_oldi_io_ctrl(dev, dispc);
> -		if (r)
> -			return r;
> +		/*
> +		 * For shared mode, Initialize the OLDI IO control only if we own
> +		 * the OLDI Tx ports
> +		 */
> +		if (!tidss->shared_mode || tidss->shared_mode_own_oldi) {
> +			r = dispc_init_am65x_oldi_io_ctrl(dev, dispc);
> +			if (r)
> +				return r;
> +		}

I haven't studied this, but the shared_mode_own_oldi feels very SoC
specific. Will we need something similar on other SoCs, maybe for other
outputs too? Maybe this needs something more generic.

And it's not clear to me how to manage the whole pipeline. Let's say we 
have a board with DSS and an external bridge, like SiI9022, and a HDMI 
connector. I assume RTOS must control the SiI9022 and the connector 
(HPD, HDMI +5V line). Does that mean that the DT for Linux will be 
special in that case? Will we even have anything connected to the DSS 
port in the DT?

>   	}
>   
> -	dispc->fclk = devm_clk_get(dev, "fck");
> -	if (IS_ERR(dispc->fclk)) {
> -		dev_err(dev, "%s: Failed to get fclk: %ld\n",
> -			__func__, PTR_ERR(dispc->fclk));
> -		return PTR_ERR(dispc->fclk);
> -	}
> -	dev_dbg(dev, "DSS fclk %lu Hz\n", clk_get_rate(dispc->fclk));
> +	/*
> +	 * For shared mode, Initialize the hardware and clocking only if processing core running
> +	 * Linux has ownership of DSS global register space
> +	 */
> +	if (!tidss->shared_mode || dispc_owns_global_common_in_shared_mode(dispc)) {
> +		dispc->fclk = devm_clk_get(dev, "fck");
> +		if (IS_ERR(dispc->fclk)) {
> +			dev_err(dev, "%s: Failed to get fclk: %ld\n",
> +				__func__, PTR_ERR(dispc->fclk));
> +			return PTR_ERR(dispc->fclk);
> +		}
> +		dev_dbg(dev, "DSS fclk %lu Hz\n", clk_get_rate(dispc->fclk));
>   
> -	of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
> -			     &dispc->memory_bandwidth_limit);
> +		of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
> +				     &dispc->memory_bandwidth_limit);
>   
> -	r = dispc_init_hw(dispc);
> -	if (r)
> -		return r;
> +		r = dispc_init_hw(dispc);
> +		if (r)
> +			return r;
> +	}
>   
>   	tidss->dispc = dispc;
>   
> diff --git a/drivers/gpu/drm/tidss/tidss_dispc.h b/drivers/gpu/drm/tidss/tidss_dispc.h
> index 086327d51a90..368a39941b34 100644
> --- a/drivers/gpu/drm/tidss/tidss_dispc.h
> +++ b/drivers/gpu/drm/tidss/tidss_dispc.h
> @@ -80,7 +80,7 @@ struct dispc_features {
>   	const char *vp_name[TIDSS_MAX_PORTS]; /* Should match dt reg names */
>   	const char *ovr_name[TIDSS_MAX_PORTS]; /* Should match dt reg names */
>   	const char *vpclk_name[TIDSS_MAX_PORTS]; /* Should match dt clk names */
> -	const enum dispc_vp_bus_type vp_bus_type[TIDSS_MAX_PORTS];
> +	enum dispc_vp_bus_type vp_bus_type[TIDSS_MAX_PORTS];
>   	struct tidss_vp_feat vp_feat;
>   	u32 num_planes;
>   	const char *vid_name[TIDSS_MAX_PLANES]; /* Should match dt reg names */
> diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c
> index d15f836dca95..141481635578 100644
> --- a/drivers/gpu/drm/tidss/tidss_drv.c
> +++ b/drivers/gpu/drm/tidss/tidss_drv.c
> @@ -32,6 +32,10 @@ int tidss_runtime_get(struct tidss_device *tidss)
>   
>   	dev_dbg(tidss->dev, "%s\n", __func__);
>   
> +	/* No PM in display sharing mode */
> +	if (tidss->shared_mode)
> +		return 0;
> +

I'm not sure how to implement it, but I think it would be much nicer if 
the PM code in the driver stayed at least mostly the same. The driver 
can't suspend the hardware, but the meaning of the pm function calls 
would still make sense, i.e. pm_runtime_get() means that tidss wants the 
hardware to be enabled. If the HW already is enabled (as it would be,
since the RTOS controls it), nothing would happen.
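As a sketch of that idea (not the posted patch), the shared-mode check
could move into the dispc runtime callbacks instead, so that
tidss_runtime_get()/put() stay untouched:

int dispc_runtime_resume(struct dispc_device *dispc)
{
	/*
	 * In shared mode, when Linux does not own the global common
	 * region, the remote core keeps the DSS powered and clocked;
	 * report success without touching the hardware.
	 */
	if (dispc->tidss->shared_mode &&
	    !dispc_owns_global_common_in_shared_mode(dispc))
		return 0;

	/* ... existing resume sequence unchanged ... */
	return 0;
}

with a matching early return in dispc_runtime_suspend().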

>   	r = pm_runtime_resume_and_get(tidss->dev);
>   	WARN_ON(r < 0);
>   	return r;
> @@ -43,6 +47,9 @@ void tidss_runtime_put(struct tidss_device *tidss)
>   
>   	dev_dbg(tidss->dev, "%s\n", __func__);
>   
> +	if (tidss->shared_mode)
> +		return;
> +
>   	pm_runtime_mark_last_busy(tidss->dev);
>   
>   	r = pm_runtime_put_autosuspend(tidss->dev);
> @@ -140,21 +147,23 @@ static int tidss_probe(struct platform_device *pdev)
>   
>   	spin_lock_init(&tidss->wait_lock);
>   
> +	tidss->shared_mode = device_property_read_bool(dev, "ti,dss-shared-mode");
>   	ret = dispc_init(tidss);
>   	if (ret) {
>   		dev_err(dev, "failed to initialize dispc: %d\n", ret);
>   		return ret;
>   	}
>   
> -	pm_runtime_enable(dev);
> -
> -	pm_runtime_set_autosuspend_delay(dev, 1000);
> -	pm_runtime_use_autosuspend(dev);
> +	if (!tidss->shared_mode) {
> +		pm_runtime_enable(dev);
> +		pm_runtime_set_autosuspend_delay(dev, 1000);
> +		pm_runtime_use_autosuspend(dev);
>   
>   #ifndef CONFIG_PM
> -	/* If we don't have PM, we need to call resume manually */
> -	dispc_runtime_resume(tidss->dispc);
> +		/* If we don't have PM, we need to call resume manually */
> +		dispc_runtime_resume(tidss->dispc);
>   #endif
> +	}
>   
>   	ret = tidss_modeset_init(tidss);
>   	if (ret < 0) {
> @@ -196,6 +205,8 @@ static int tidss_probe(struct platform_device *pdev)
>   	tidss_irq_uninstall(ddev);
>   
>   err_runtime_suspend:
> +	if (tidss->shared_mode)
> +		return ret;
>   #ifndef CONFIG_PM
>   	dispc_runtime_suspend(tidss->dispc);
>   #endif
> @@ -219,12 +230,14 @@ static void tidss_remove(struct platform_device *pdev)
>   
>   	tidss_irq_uninstall(ddev);
>   
> +	if (!tidss->shared_mode) {
>   #ifndef CONFIG_PM
> -	/* If we don't have PM, we need to call suspend manually */
> -	dispc_runtime_suspend(tidss->dispc);
> +		/* If we don't have PM, we need to call suspend manually */
> +		dispc_runtime_suspend(tidss->dispc);
>   #endif
> -	pm_runtime_dont_use_autosuspend(dev);
> -	pm_runtime_disable(dev);
> +		pm_runtime_dont_use_autosuspend(dev);
> +		pm_runtime_disable(dev);
> +	}
>   
>   	/* devm allocated dispc goes away with the dev so mark it NULL */
>   	dispc_remove(tidss);
> diff --git a/drivers/gpu/drm/tidss/tidss_drv.h b/drivers/gpu/drm/tidss/tidss_drv.h
> index d7f27b0b0315..68d53c70651d 100644
> --- a/drivers/gpu/drm/tidss/tidss_drv.h
> +++ b/drivers/gpu/drm/tidss/tidss_drv.h
> @@ -31,6 +31,12 @@ struct tidss_device {
>   
>   	spinlock_t wait_lock;	/* protects the irq masks */
>   	dispc_irq_t irq_mask;	/* enabled irqs in addition to wait_list */
> +
> +	bool shared_mode; /* DSS resources shared between remote core and Linux */
> +
> +	/* 1: VP owned by Linux 0: VP is owned by remote and shared with Linux */
> +	u32 shared_mode_owned_vps[TIDSS_MAX_PORTS];

Boolean, I think?

> +	bool shared_mode_own_oldi; /* Linux needs to configure OLDI in shared mode */
>   };
>   
>   #define to_tidss(__dev) container_of(__dev, struct tidss_device, ddev)
Maxime Ripard Jan. 26, 2024, 12:15 p.m. UTC | #2
Hi,

Thanks a lot for working on that.

On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
> Display subsystem present in TI Keystone family of devices supports sharing
> of display between multiple hosts as it provides separate register space
> (common* region) for each host to programming display controller and also a
> unique interrupt line for each host.
> 
> This adds support for display sharing, by allowing partitioning of
> resources either at video port level or at video plane level as
> described below :
> 
> 1) Linux can own (i.e have write access) completely one or more of video
> ports along with corresponding resources (viz. overlay managers,
> video planes) used by Linux in context of those video ports.
> Even if Linux is owning
> these video ports it can still share this video port with a remote core
> which can own one or more video planes associated with this video port.
> 
> 2) Linux owns one or more of the video planes with video port
> (along with corresponding overlay manager) associated with these planes
> being owned and controlled by a remote core. Linux still has read-only
> access to the associated video port and overlay managers so that it can
> parse the settings made by remote core.

So, just to make sure we're on the same page. 1) means Linux drives the
whole display engine, but can lend planes to the R5? How does that work,
is Linux aware of the workload being there (plane size, format, etc) ?

And 2) would mean that the display engine is under the R5 control and
Linux only gets to fill the plane and let the firmware know of what it
wants?

If so, do we even need the tidss driver in the second case? We could
just write a fwkms driver of some sorts that could be used by multiple
implementations of the same "defer to firmware" logic.

> For both the cases, the resources used in context of processing core
> running Linux along with ownership information are exposed by user as
> part of device-tree blob and driver uses an updated feature list tailored
> for this shared mode accordingly. The driver also auto-populates
> matching overlay managers and output types from shared video
> port list provided in device-tree blob.
> In dispc_feature struct remove const access specfier for output_type
> array as it is required to be updated dynamically in run-time for shared
> mode.

I'm also not entirely sure that the device tree is the right path there.
Surely the firmware capabilities will evolve over time, while the device
tree won't. Is there some way to make it discoverable at probe time by
the driver?

Maxime
Devarsh Thakkar Feb. 8, 2024, 12:56 p.m. UTC | #3
Hi Maxime,

Thanks a lot for checking on this.

On 26/01/24 17:45, Maxime Ripard wrote:
> Hi,
> 
> Thanks a lot for working on that.
> 
> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>> Display subsystem present in TI Keystone family of devices supports sharing
>> of display between multiple hosts as it provides separate register space
>> (common* region) for each host to programming display controller and also a
>> unique interrupt line for each host.
>>
>> This adds support for display sharing, by allowing partitioning of
>> resources either at video port level or at video plane level as
>> described below :
>>
>> 1) Linux can own (i.e have write access) completely one or more of video
>> ports along with corresponding resources (viz. overlay managers,
>> video planes) used by Linux in context of those video ports.
>> Even if Linux is owning
>> these video ports it can still share this video port with a remote core
>> which can own one or more video planes associated with this video port.
>>
>> 2) Linux owns one or more of the video planes with video port
>> (along with corresponding overlay manager) associated with these planes
>> being owned and controlled by a remote core. Linux still has read-only
>> access to the associated video port and overlay managers so that it can
>> parse the settings made by remote core.
> 
> So, just to make sure we're on the same page. 1) means Linux drives the
> whole display engine, but can lend planes to the R5? How does that work,
> is Linux aware of the workload being there (plane size, format, etc) ?
> 

Well, there is no dynamic procedure for lending. The partitioning
scheme is decided and known beforehand, the remote core firmware is
built accordingly, and the device-tree overlay for Linux is likewise
updated with the partitioning information before bootup.

What happens here is that Linux knows this partitioning information
beforehand via device-tree properties and does not enumerate the plane
owned by the RTOS, but it enumerates the rest of the display components
and initializes the DSS. After that the user can load the DSS firmware
on the remote core, and that firmware only controls the plane it was
compiled to use.


> And 2) would mean that the display engine is under the R5 control and
> Linux only gets to fill the plane and let the firmware know of what it
> wants?
> 

Here too the partitioning information is pre-decided and the remote
core firmware and the device-tree overlay for Linux are updated
accordingly. But in this case, since the remote core firmware owns the
display (minus the plane owned by Linux), it is started during the
bootloader phase itself, where it initializes the DSS and starts
rendering using the plane it owns. Linux then just latches on to the
DSS without re-initializing it, with write access only to the plane
owned by Linux. You can refer to [1] for more details on this.


> If so, do we even need the tidss driver in the second case? We could
> just write a fwkms driver of some sorts that could be used by multiple
> implementations of the same "defer to firmware" logic.
> 

This feature of static partitioning of DSS resources is specific to
the DSS7 hardware (which is controlled by the tidss driver), which
provides a dedicated register space and interrupt line for each host
[0] so that multiple hosts can drive the display controller
simultaneously as per the desired static partitioning of resources. So
I don't think a separate driver is required here; tidss seems the
right place to support this, and with this device-tree approach
different resource partitioning schemes can be achieved as described
here [1]. This was also aligned with Tomi, where we discussed that
tidss is the right place to support this as we are simply leveraging
the DSS hardware's static partitioning capability.


>> For both the cases, the resources used in context of processing core
>> running Linux along with ownership information are exposed by user as
>> part of device-tree blob and driver uses an updated feature list tailored
>> for this shared mode accordingly. The driver also auto-populates
>> matching overlay managers and output types from shared video
>> port list provided in device-tree blob.
>> In dispc_feature struct remove const access specfier for output_type
>> array as it is required to be updated dynamically in run-time for shared
>> mode.
> 
> I'm also not entirely sure that the device tree is the right path there.
> Surely the firmware capabilities will evolve over time, while the device
> tree won't. Is there some way to make it discoverable at probe time by
> the driver?
> 

I think the main highlight of the sharing feature is the hardware
capability where each host gets a separate IRQ and register space to
program the display for its own display context independently, without
any sort of inter-processor communication. The current implementation
mirrors that by specifying the display context information in the form
of device-tree properties instead of relying on any inter-processor
communication or negotiation phase.

Having said that, for the second case Linux still has read access to
the display registers, which can be read to infer some of the display
configuration set by the RTOS, and we are in fact using that to infer
some of the context, e.g. the display mode set by the RTOS. But we
don't want to rely completely on reads of RTOS-programmed registers to
gather full information about the partitioning, or even to infer that
we are in sharing mode.

The remote core firmware is compiled with the partitioning information
known beforehand and the same information is fed to Linux in the form
of DT properties, which looks much cleaner, simpler and more reliable
to us.

But yes, we wanted to support firmwares with different partitioning
schemes/capabilities as you said, so the tidss-shared-mode device-tree
properties were designed to be adequate and flexible enough to describe
all the different partitioning schemes that can be supported; this is
also demonstrated with an example here [1].

[0]:
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/Display/DSS7.html#soc-hardware-description


[1]:
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/linux/How_to_Guides/Target/How_to_enable_display_sharing_between_remotecore_and_Linux.html


Regards
Devarsh


> Maxime
Maxime Ripard Feb. 13, 2024, 2:04 p.m. UTC | #4
Hi Devarsh,

On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
> Hi Maxime,
> 
> Thanks a lot for checking on this.
> 
> On 26/01/24 17:45, Maxime Ripard wrote:
> > Hi,
> > 
> > Thanks a lot for working on that.
> > 
> > On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
> >> Display subsystem present in TI Keystone family of devices supports sharing
> >> of display between multiple hosts as it provides separate register space
> >> (common* region) for each host to programming display controller and also a
> >> unique interrupt line for each host.
> >>
> >> This adds support for display sharing, by allowing partitioning of
> >> resources either at video port level or at video plane level as
> >> described below :
> >>
> >> 1) Linux can own (i.e have write access) completely one or more of video
> >> ports along with corresponding resources (viz. overlay managers,
> >> video planes) used by Linux in context of those video ports.
> >> Even if Linux is owning
> >> these video ports it can still share this video port with a remote core
> >> which can own one or more video planes associated with this video port.
> >>
> >> 2) Linux owns one or more of the video planes with video port
> >> (along with corresponding overlay manager) associated with these planes
> >> being owned and controlled by a remote core. Linux still has read-only
> >> access to the associated video port and overlay managers so that it can
> >> parse the settings made by remote core.
> > 
> > So, just to make sure we're on the same page. 1) means Linux drives the
> > whole display engine, but can lend planes to the R5? How does that work,
> > is Linux aware of the workload being there (plane size, format, etc) ?
> > 
> 
> Well, there is no dynamic procedure being followed for lending. The
> partitioning scheme is decided and known before hand, and the remote
> core firmware updated and compiled accordingly, and similarly the
> device-tree overlay for Linux is also updated with partitioning
> information before bootup.
> 
> What would happen here is that Linux will know before-hand this
> partitioning information via device-tree properties and won't enumerate
> the plane owned by RTOS, but it will enumerate the rest of the display
> components and initialize the DSS, after which user can load the DSS
> firmware on remote core and this firmware will only have control of
> plane as it was compiled with that configuration.

Right. If the RTOS is in control of a single plane, how is it expected
to deal with Linux shutting the CRTC down, or enforcing a configuration
that isn't compatible with what the RTOS expects (like a plane with a
higher zpos masking its plane)? What is the mechanism to reconcile it?

> > And 2) would mean that the display engine is under the R5 control and
> > Linux only gets to fill the plane and let the firmware know of what it
> > wants?
> > 
> 
> Here too the partitioning information is pre-decided and remote core
> firmware and device-tree overlay for Linux updated accordingly. But in
> this case as remote core firmware owns the display (minus the plane
> owned by Linux) it is started and initialized during the bootloader
> phase itself where it initializes the DSS and starts rendering using the
> plane owned by it and Linux just latches to the DSS without
> re-initializing it, with write access only to the plane that is owned by
> Linux. You can refer [1] for more details on this.
>
> > If so, do we even need the tidss driver in the second case? We could
> > just write a fwkms driver of some sorts that could be used by multiple
> > implementations of the same "defer to firmware" logic.
> > 
> 
> This feature of static partitioning of DSS resources is specific to DSS7
> hardware (which is controlled by tidss driver) which supports dedicated
> register space and interrupt line for each of the hosts [0], so that
> multiple hosts can drive the display controller simultaneously as  per
> the desired static partitioning of resources, and so I don't think a
> separate driver is required here and tidss seems the right place to
> support this, where using this device-tree approach different resource
> partitioning schemas can be achieved as described here [1]. This was
> also aligned with Tomi too where we discussed that tidss is the right
> place to support this as we are simply leveraging the DSS hardware
> capabilities of static partitioning here.

If the only thing tidss does in the "owned by RTOS" case is forwarding
KMS atomic states to the RTOS, then I'm still not sure why we need to
involve tidss at all.

It's not just about interrupts, it's also about how you arbitrate
between what Linux wants and what the RTOS wants. Like if the RTOS still
wants to output something but Linux wants to disable it, how do you
reconcile the two?

You have to have something that reconciles both, and typically for
firmware-based setup this will be the firmware's job.

That's very similar to what the RaspberryPi did with fkms, and I believe
that having a generic KMS-on-remoteproc driver when the firmware has
control over the display is the path forward.

> >> For both the cases, the resources used in context of processing core
> >> running Linux along with ownership information are exposed by user as
> >> part of device-tree blob and driver uses an updated feature list tailored
> >> for this shared mode accordingly. The driver also auto-populates
> >> matching overlay managers and output types from shared video
> >> port list provided in device-tree blob.
> >> In dispc_feature struct remove const access specfier for output_type
> >> array as it is required to be updated dynamically in run-time for shared
> >> mode.
> > 
> > I'm also not entirely sure that the device tree is the right path there.
> > Surely the firmware capabilities will evolve over time, while the device
> > tree won't. Is there some way to make it discoverable at probe time by
> > the driver?
>
> I think the main highlight of the sharing feature is the hardware
> capability where each host is provided separate irq and register space
> to program display for their display context independently

Wait, what do you mean by display context here?

Maxime
Devarsh Thakkar Feb. 14, 2024, 3:47 p.m. UTC | #5
Hi Maxime,

Thanks for the quick reply.

On 13/02/24 19:34, Maxime Ripard wrote:
> Hi Devarsh,
> 
> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
>> Hi Maxime,
>>
>> Thanks a lot for checking on this.
>>
>> On 26/01/24 17:45, Maxime Ripard wrote:
>>> Hi,
>>>
>>> Thanks a lot for working on that.
>>>
>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>>>> Display subsystem present in TI Keystone family of devices supports sharing
>>>> of display between multiple hosts as it provides separate register space
>>>> (common* region) for each host to programming display controller and also a
>>>> unique interrupt line for each host.
>>>>
>>>> This adds support for display sharing, by allowing partitioning of
>>>> resources either at video port level or at video plane level as
>>>> described below :
>>>>
>>>> 1) Linux can own (i.e have write access) completely one or more of video
>>>> ports along with corresponding resources (viz. overlay managers,
>>>> video planes) used by Linux in context of those video ports.
>>>> Even if Linux is owning
>>>> these video ports it can still share this video port with a remote core
>>>> which can own one or more video planes associated with this video port.
>>>>
>>>> 2) Linux owns one or more of the video planes with video port
>>>> (along with corresponding overlay manager) associated with these planes
>>>> being owned and controlled by a remote core. Linux still has read-only
>>>> access to the associated video port and overlay managers so that it can
>>>> parse the settings made by remote core.
>>>
>>> So, just to make sure we're on the same page. 1) means Linux drives the
>>> whole display engine, but can lend planes to the R5? How does that work,
>>> is Linux aware of the workload being there (plane size, format, etc) ?
>>>
>>
>> Well, there is no dynamic procedure being followed for lending. The
>> partitioning scheme is decided and known before hand, and the remote
>> core firmware updated and compiled accordingly, and similarly the
>> device-tree overlay for Linux is also updated with partitioning
>> information before bootup.
>>
>> What would happen here is that Linux will know before-hand this
>> partitioning information via device-tree properties and won't enumerate
>> the plane owned by RTOS, but it will enumerate the rest of the display
>> components and initialize the DSS, after which user can load the DSS
>> firmware on remote core and this firmware will only have control of
>> plane as it was compiled with that configuration.
> 
> Right. If the RTOS is in control of a single plane, how it is expected
> to deal with Linux shutting the CRTC down, or enforcing a configuration
> that isn't compatible with what the RTOS expects (like a plane with a
> higher zpos masking its plane), what is the mechanism to reconcile it?
> 

Just for the note, for this "RTOS controls a single plane" mode we don't have
a firmware available to test (right now we only support the example for "RTOS
controlling the display mode" as shared here [1]), hence this is not
validated, but the idea was to keep the dt-bindings generic enough to support
it in the future, and that's why I referred to it here.

Coming back to your questions, with the current scheme Linux (tidss) would be
expected to make sure that the CRTC being shared with the RTOS is never shut
down and that the RTOS plane never gets masked.
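For illustration, a minimal sketch (not part of this series) of how the
"never shut down a shared CRTC" expectation could be enforced in the driver
rather than left to userspace; the shared-mode fields follow the patch, the
check itself is hypothetical:

static int tidss_shared_vp_crtc_atomic_check(struct drm_crtc *crtc,
					     struct drm_atomic_state *state)
{
	struct drm_crtc_state *new_state =
		drm_atomic_get_new_crtc_state(state, crtc);
	struct tidss_crtc *tcrtc = to_tidss_crtc(crtc);
	struct tidss_device *tidss = to_tidss(crtc->dev);

	/* The remote core keeps scanning out from this VP, never turn it off. */
	if (tidss->shared_mode &&
	    !tidss->shared_mode_owned_vps[tcrtc->hw_videoport] &&
	    !new_state->active)
		return -EINVAL;

	return tidss_crtc_atomic_check(crtc, state);
}

Masking via zpos could be constrained similarly by limiting the zpos range of
the Linux-owned planes.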

I think an IPC-based scheme would mainly be needed for the case where a
single entity controls the display, e.g. where there is a single display
controller register space and a single IRQ but multiple planes, and you want
to divide these planes among different host processors. In that case you want
a single entity to act as the main entity in control of the DSS, with the
rest of the processors communicating with that main entity to request display
resources and plane updates, and the main entity programming the DSS on their
behalf.

But unlike the above, TI DSS7 is designed to support static partitioning of
display resources among multiple hosts, where each host can program the
display hardware independently using a separate register space and a separate
IRQ, without requiring any communication between the hosts. As this feature
is unique to TI DSS7, we want to support it in the tidss driver. The DSS
resource partitioning feature is described in detail here [2].

>>> And 2) would mean that the display engine is under the R5 control and
>>> Linux only gets to fill the plane and let the firmware know of what it
>>> wants?
>>>
>>
>> Here too the partitioning information is pre-decided and remote core
>> firmware and device-tree overlay for Linux updated accordingly. But in
>> this case as remote core firmware owns the display (minus the plane
>> owned by Linux) it is started and initialized during the bootloader
>> phase itself where it initializes the DSS and starts rendering using the
>> plane owned by it and Linux just latches to the DSS without
>> re-initializing it, with write access only to the plane that is owned by
>> Linux. You can refer [1] for more details on this.
>>
>>> If so, do we even need the tidss driver in the second case? We could
>>> just write a fwkms driver of some sorts that could be used by multiple
>>> implementations of the same "defer to firmware" logic.
>>>
>>
>> This feature of static partitioning of DSS resources is specific to DSS7
>> hardware (which is controlled by tidss driver) which supports dedicated
>> register space and interrupt line for each of the hosts [0], so that
>> multiple hosts can drive the display controller simultaneously as  per
>> the desired static partitioning of resources, and so I don't think a
>> separate driver is required here and tidss seems the right place to
>> support this, where using this device-tree approach different resource
>> partitioning schemas can be achieved as described here [1]. This was
>> also aligned with Tomi too where we discussed that tidss is the right
>> place to support this as we are simply leveraging the DSS hardware
>> capabilities of static partitioning here.
> 
> If the only thing tidss does in the "owned by RTOS" is forwarding KMS
> atomic states to the RTOS, then I'm still not sure why we need to
> involve tidss at all.
> 

I think maybe this is the point of misunderstanding. We are not forwarding
atomic states to the RTOS here. Linux (tidss) is in fact accessing the
display register space assigned to it (common1 assigned to Linux, common0
assigned to the RTOS) and also writing to the DSS plane registers for the
plane assigned to it (say VID assigned to Linux and VIDL assigned to the
RTOS).

> It's not just about interrupts, it's also about how your arbitrate
> between what Linux wants and what the RTOS wants. Like if the RTOS still
> wants to output something but Linux wants to disable it, how do you
> reconcile the two?
> 

The scheme involves static partitioning of display resources, which are
assigned at compile time to the RTOS and Linux. The RTOS firmware is compiled
with the specific ownership/display resources desired by the user and this
assignment stays intact.

If there is a more complex use-case which requires dynamic
assignment/arbitration of resources then I agree it would require some sort
of IPC scheme, but that is not what we target with this series. This series
simply supports the static partitioning feature (separate register space,
separate IRQ, firewalling support etc.) of the TI DSS hardware across
multiple hosts, and there are use-cases for which this scheme suffices.

> You have to have something that reconciles both, and typically for
> firmware-based setup this will be the firmware's job.
> 
> That's very similar to what the RaspberryPi did with fkms, and I believe
> that having a generic KMS-on-remoteproc driver when the firmware has
> control over the display is the path forward.
> 

The kms-on-remoteproc scheme is different and maybe more useful for those
processors that don't support this static partitioning hardware feature. On
the other hand, I believe there are still use-cases where this unique static
partitioning hardware feature of TI DSS suffices without any requirement for
IPC. And it makes the firmware simpler (and the job of the RTOS developer
easier too) as no IPC is required.

I am curious to understand the RPi display hardware and to take a look at
fkms and its firmware code though, if it is public?

>>>> For both the cases, the resources used in context of processing core
>>>> running Linux along with ownership information are exposed by user as
>>>> part of device-tree blob and driver uses an updated feature list tailored
>>>> for this shared mode accordingly. The driver also auto-populates
>>>> matching overlay managers and output types from shared video
>>>> port list provided in device-tree blob.
>>>> In dispc_feature struct remove const access specfier for output_type
>>>> array as it is required to be updated dynamically in run-time for shared
>>>> mode.
>>>
>>> I'm also not entirely sure that the device tree is the right path there.
>>> Surely the firmware capabilities will evolve over time, while the device
>>> tree won't. Is there some way to make it discoverable at probe time by
>>> the driver?
>>
>> I think the main highlight of the sharing feature is the hardware
>> capability where each host is provided separate irq and register space
>> to program display for their display context independently
> 
> Wait, what do you mean by display context here?
> 

By context I mean what that specific host wants to display. For e.g. if the
RTOS owns a plane, it can update its framebuffer and update the plane
registers independently.

[1]
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/system/Demo_User_Guides/Display_Cluster_User_Guide.html
[2] https://www.ti.com/lit/zip/spruj52 (Section 12.6.3.14 DISPC Resources
Sharing)

Regards
Devarsh

> Maxime
Maxime Ripard March 14, 2024, 2:34 p.m. UTC | #6
Hi,

On Wed, Feb 14, 2024 at 09:17:12PM +0530, Devarsh Thakkar wrote:
> On 13/02/24 19:34, Maxime Ripard wrote:
> > On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
> >> On 26/01/24 17:45, Maxime Ripard wrote:
> >>> Hi,
> >>>
> >>> Thanks a lot for working on that.
> >>>
> >>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
> >>>> Display subsystem present in TI Keystone family of devices supports sharing
> >>>> of display between multiple hosts as it provides separate register space
> >>>> (common* region) for each host to programming display controller and also a
> >>>> unique interrupt line for each host.
> >>>>
> >>>> This adds support for display sharing, by allowing partitioning of
> >>>> resources either at video port level or at video plane level as
> >>>> described below :
> >>>>
> >>>> 1) Linux can own (i.e have write access) completely one or more of video
> >>>> ports along with corresponding resources (viz. overlay managers,
> >>>> video planes) used by Linux in context of those video ports.
> >>>> Even if Linux is owning
> >>>> these video ports it can still share this video port with a remote core
> >>>> which can own one or more video planes associated with this video port.
> >>>>
> >>>> 2) Linux owns one or more of the video planes with video port
> >>>> (along with corresponding overlay manager) associated with these planes
> >>>> being owned and controlled by a remote core. Linux still has read-only
> >>>> access to the associated video port and overlay managers so that it can
> >>>> parse the settings made by remote core.
> >>>
> >>> So, just to make sure we're on the same page. 1) means Linux drives the
> >>> whole display engine, but can lend planes to the R5? How does that work,
> >>> is Linux aware of the workload being there (plane size, format, etc) ?
> >>>
> >>
> >> Well, there is no dynamic procedure being followed for lending. The
> >> partitioning scheme is decided and known before hand, and the remote
> >> core firmware updated and compiled accordingly, and similarly the
> >> device-tree overlay for Linux is also updated with partitioning
> >> information before bootup.
> >>
> >> What would happen here is that Linux will know before-hand this
> >> partitioning information via device-tree properties and won't enumerate
> >> the plane owned by RTOS, but it will enumerate the rest of the display
> >> components and initialize the DSS, after which user can load the DSS
> >> firmware on remote core and this firmware will only have control of
> >> plane as it was compiled with that configuration.
> > 
> > Right. If the RTOS is in control of a single plane, how it is expected
> > to deal with Linux shutting the CRTC down, or enforcing a configuration
> > that isn't compatible with what the RTOS expects (like a plane with a
> > higher zpos masking its plane), what is the mechanism to reconcile it?
> > 
> 
> Just for the note, for this "RTOS control single plane" mode, we don't have a
> firmware available to test (right now we are only supporting example for "RTOS
> controlling the display mode" as shared here [1]) and hence this is not
> validated but the idea was to keep dt-bindings generic enough to support them
> in future and that's why I referred to it here.
> 
> Coming back to your questions, with the current scheme the Linux (tidss) would
> be expected to make sure the CRTC being shared with RTOS is never shutdown and
> the RTOS plane should never gets masked.

I'm probably missing something then here, but if the Linux side of
things is expected to keep the current configuration and keep it active
for it to work, what use-case would it be useful for?

> I think the IPC based scheme would have been mainly needed for the case where
> you have a single entity controlling the display for e.g you have a single
> display controller register space and a single IRQ but you have multiple
> planes and say you want to divide these planes to different host processors.

And with, I assume, different OS on those host processors? Otherwise why
would we need to handle some planes at the firmware level?

> In that case you want a single entity to act as a main entity and be in
> control of DSS and rest of the processors communicate with the "main entity"
> to request display resources and plane updates and main entity also programs
> dss on their behalf.
> 
> But unlike above, TI DSS7 is designed to support static partitioning of
> display resources among multiple hosts, where each host can program the
> display hardware independently using separate register space and having a
> separate irq and without requirement of any communication between the hosts.
> Now as this feature is unique to TI DSS7 we want to support this feature in
> tidss driver. The DSS resource partitioning feature is described in detail
> here [2]

So, if I understand this properly, and in KMS terms, DSS7 can assign the
planes, CRTCs or encoders to a given VM or CPU, and you can segment the
hardware that way. It looks like a good way to split encoders between
VMs, but going back to the discussion about one plane being handled by
the firmware, I don't really see how it can work with something else
than splitting away the whole pipeline and having a VM claiming a CRTC
and encoder, and another VM claiming another pipeline.

Like, if they share either a CRTC or encoder, we will still go back to
the discussion about arbitration about who has the final word if the two
have conflicting requirements, or if it changes something the other
probably has to know about it.

> >>> And 2) would mean that the display engine is under the R5 control and
> >>> Linux only gets to fill the plane and let the firmware know of what it
> >>> wants?
> >>>
> >>
> >> Here too the partitioning information is pre-decided and remote core
> >> firmware and device-tree overlay for Linux updated accordingly. But in
> >> this case as remote core firmware owns the display (minus the plane
> >> owned by Linux) it is started and initialized during the bootloader
> >> phase itself where it initializes the DSS and starts rendering using the
> >> plane owned by it and Linux just latches to the DSS without
> >> re-initializing it, with write access only to the plane that is owned by
> >> Linux. You can refer [1] for more details on this.
> >>
> >>> If so, do we even need the tidss driver in the second case? We could
> >>> just write a fwkms driver of some sorts that could be used by multiple
> >>> implementations of the same "defer to firmware" logic.
> >>>
> >>
> >> This feature of static partitioning of DSS resources is specific to DSS7
> >> hardware (which is controlled by tidss driver) which supports dedicated
> >> register space and interrupt line for each of the hosts [0], so that
> >> multiple hosts can drive the display controller simultaneously as  per
> >> the desired static partitioning of resources, and so I don't think a
> >> separate driver is required here and tidss seems the right place to
> >> support this, where using this device-tree approach different resource
> >> partitioning schemas can be achieved as described here [1]. This was
> >> also aligned with Tomi too where we discussed that tidss is the right
> >> place to support this as we are simply leveraging the DSS hardware
> >> capabilities of static partitioning here.
> > 
> > If the only thing tidss does in the "owned by RTOS" is forwarding KMS
> > atomic states to the RTOS, then I'm still not sure why we need to
> > involve tidss at all.
>
> I think maybe here is the point of misunderstanding. We are not forwarding
> atomic states to RTOS here. Linux (tidss) is infact, accessing the display
> register space assigned to it (common1 assigned to Linux, commmon0 assigned to
> RTOS) and also writing to DSS plane registers for the plane assigned to it
> (say VID assigned to Linux and VIDL assigned to RTOS).
> 
> > It's not just about interrupts, it's also about how your arbitrate
> > between what Linux wants and what the RTOS wants. Like if the RTOS still
> > wants to output something but Linux wants to disable it, how do you
> > reconcile the two?
> > 
> 
> The scheme involves static partitioning of display resource which are assigned
> compile-time to RTOS and Linux. Here the RTOS firmware is compiled with
> specific ownership/display resources as desired by user and this assignment
> stays intact.
> 
> If there is a more complex use-case which requires dynamic
> assignment/arbitration of resources then I agree those require some sort of
> IPC scheme but this is not what we target with these series. This series is
> simply to support static partitioning feature (separate register space,
> separate irq, firewalling support etc) of TI DSS hardware across the multiple
> hosts and there are use-cases too for which this scheme suffices.

I think you're right and we have a misunderstanding. My initial
assumption was that it was to prevent the Linux side of things from
screwing up the output if it was to crash.

But it looks like it's not the main point of this series, so could you
share some use-cases you're trying to address?

Thanks!
Maxime
Devarsh Thakkar March 22, 2024, 3:29 p.m. UTC | #7
Hi Maxime,

On 14/03/24 20:04, Maxime Ripard wrote:
> Hi,
> 
> On Wed, Feb 14, 2024 at 09:17:12PM +0530, Devarsh Thakkar wrote:
>> On 13/02/24 19:34, Maxime Ripard wrote:
>>> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
>>>> On 26/01/24 17:45, Maxime Ripard wrote:
>>>>> Hi,
>>>>>
>>>>> Thanks a lot for working on that.
>>>>>
>>>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>>>>>> Display subsystem present in TI Keystone family of devices supports sharing
>>>>>> of display between multiple hosts as it provides separate register space
>>>>>> (common* region) for each host to programming display controller and also a
>>>>>> unique interrupt line for each host.
>>>>>>
>>>>>> This adds support for display sharing, by allowing partitioning of
>>>>>> resources either at video port level or at video plane level as
>>>>>> described below :
>>>>>>
>>>>>> 1) Linux can own (i.e have write access) completely one or more of video
>>>>>> ports along with corresponding resources (viz. overlay managers,
>>>>>> video planes) used by Linux in context of those video ports.
>>>>>> Even if Linux is owning
>>>>>> these video ports it can still share this video port with a remote core
>>>>>> which can own one or more video planes associated with this video port.
>>>>>>
>>>>>> 2) Linux owns one or more of the video planes with video port
>>>>>> (along with corresponding overlay manager) associated with these planes
>>>>>> being owned and controlled by a remote core. Linux still has read-only
>>>>>> access to the associated video port and overlay managers so that it can
>>>>>> parse the settings made by remote core.
>>>>>
>>>>> So, just to make sure we're on the same page. 1) means Linux drives the
>>>>> whole display engine, but can lend planes to the R5? How does that work,
>>>>> is Linux aware of the workload being there (plane size, format, etc) ?
>>>>>
>>>>
>>>> Well, there is no dynamic procedure being followed for lending. The
>>>> partitioning scheme is decided and known before hand, and the remote
>>>> core firmware updated and compiled accordingly, and similarly the
>>>> device-tree overlay for Linux is also updated with partitioning
>>>> information before bootup.
>>>>
>>>> What would happen here is that Linux will know before-hand this
>>>> partitioning information via device-tree properties and won't enumerate
>>>> the plane owned by RTOS, but it will enumerate the rest of the display
>>>> components and initialize the DSS, after which user can load the DSS
>>>> firmware on remote core and this firmware will only have control of
>>>> plane as it was compiled with that configuration.
>>>
>>> Right. If the RTOS is in control of a single plane, how it is expected
>>> to deal with Linux shutting the CRTC down, or enforcing a configuration
>>> that isn't compatible with what the RTOS expects (like a plane with a
>>> higher zpos masking its plane), what is the mechanism to reconcile it?
>>>
>>
>> Just for the note, for this "RTOS control single plane" mode, we don't have a
>> firmware available to test (right now we are only supporting example for "RTOS
>> controlling the display mode" as shared here [1]) and hence this is not
>> validated but the idea was to keep dt-bindings generic enough to support them
>> in future and that's why I referred to it here.
>>
>> Coming back to your questions, with the current scheme the Linux (tidss) would
>> be expected to make sure the CRTC being shared with RTOS is never shutdown and
>> the RTOS plane should never gets masked.
> 
> I'm probably missing something then here, but if the Linux side of
> things is expected to keep the current configuration and keep it active
> for it to work, what use-case would it be useful for?
> 

It's just one of the partitioning possibilities that I mentioned here:
Linux is in control of the DSS as a whole and the user wants the other
host (be it RTOS or any other core) to control a single plane. For e.g. it
could be Linux (with GPU rendering) displaying the graphics and the RTOS
overlaying a real-time clock or any other signs which need to be displayed
in real-time.
But more than the use-case, this is inspired by the fact that we want to
be flexible and support in the Linux driver whatever partitioning schemes
the hardware supports, and let the user decide on the partitioning scheme.

>> I think the IPC based scheme would have been mainly needed for the case where
>> you have a single entity controlling the display for e.g you have a single
>> display controller register space and a single IRQ but you have multiple
>> planes and say you want to divide these planes to different host processors.
> 
> And with, I assume, different OS on those host processors? Otherwise why
> would we need to handle some planes at the firmware level?
> 
>> In that case you want a single entity to act as a main entity and be in
>> control of DSS and rest of the processors communicate with the "main entity"
>> to request display resources and plane updates and main entity also programs
>> dss on their behalf.
>>
>> But unlike above, TI DSS7 is designed to support static partitioning of
>> display resources among multiple hosts, where each host can program the
>> display hardware independently using separate register space and having a
>> separate irq and without requirement of any communication between the hosts.
>> Now as this feature is unique to TI DSS7 we want to support this feature in
>> tidss driver. The DSS resource partitioning feature is described in detail
>> here [2]
> 
> So, if I understand this properly, and in KMS terms, DSS7 can assign the
> planes, CRTCs or encoders to a given VM or CPU, and you can segment the
> hardware that way. It looks like a good way to split encoders between
> VMs, but going back to the discussion about one plane being handled by
> the firmware, I don't really see how it can work with something else
> than splitting away the whole pipeline and having a VM claiming a CRTC
> and encoder, and another VM claiming another pipeline.
> 
> Like, if they share either a CRTC or encoder, we will still go back to
> the discussion about arbitration about who has the final word if the two
> have conflicting requirements, or if it changes something the other
> probably has to know about it.

There should not be any conflicting requirements, as this sharing scheme
is a static one, i.e. it is pre-negotiated/decided by the user before the
Linux kernel and the RTOS firmware are compiled. The resources are split
statically, the sharing scheme is communicated to Linux via device-tree,
the RTOS side firmware is configured and compiled accordingly, and this
scheme stays intact without any change after the device boots up. So, for
e.g., if Linux is assigned only one plane and the RTOS is the DSS master
controlling all other entities, then this scheme stays intact as long as
the device is up.

Also, there can only be a single DSS master (the one in control of the
global common0 register space, i.e. having access to the global DSS
registers and also in control of the DSS clock and power domains), and
whichever core is acting as DSS master will know which of the CRTCs and
encoders are shared with other hosts, so that it doesn't power them off
while other cores may still be using them.
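
As a rough illustration only (the struct and field names below are
invented for this sketch and don't match the RFC code), one host's static
share could be modelled like this:

#include <linux/io.h>
#include <linux/types.h>

struct dss_host_partition {
	void __iomem	*common;	/* this host's common* register region */
	int		irq;		/* this host's dedicated interrupt line */
	bool		dss_master;	/* true only for the core owning common0,
					 * the DSS clocks and the power domains */
	unsigned long	owned_vps;	/* video ports with write access */
	unsigned long	owned_planes;	/* video planes with write access */
	unsigned long	shared_vps;	/* VPs the master must keep running as
					 * another host still has planes on them */
};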

> 
>>>>> And 2) would mean that the display engine is under the R5 control and
>>>>> Linux only gets to fill the plane and let the firmware know of what it
>>>>> wants?
>>>>>
>>>>
>>>> Here too the partitioning information is pre-decided and remote core
>>>> firmware and device-tree overlay for Linux updated accordingly. But in
>>>> this case as remote core firmware owns the display (minus the plane
>>>> owned by Linux) it is started and initialized during the bootloader
>>>> phase itself where it initializes the DSS and starts rendering using the
>>>> plane owned by it and Linux just latches to the DSS without
>>>> re-initializing it, with write access only to the plane that is owned by
>>>> Linux. You can refer [1] for more details on this.
>>>>
>>>>> If so, do we even need the tidss driver in the second case? We could
>>>>> just write a fwkms driver of some sorts that could be used by multiple
>>>>> implementations of the same "defer to firmware" logic.
>>>>>
>>>>
>>>> This feature of static partitioning of DSS resources is specific to DSS7
>>>> hardware (which is controlled by tidss driver) which supports dedicated
>>>> register space and interrupt line for each of the hosts [0], so that
>>>> multiple hosts can drive the display controller simultaneously as  per
>>>> the desired static partitioning of resources, and so I don't think a
>>>> separate driver is required here and tidss seems the right place to
>>>> support this, where using this device-tree approach different resource
>>>> partitioning schemas can be achieved as described here [1]. This was
>>>> also aligned with Tomi too where we discussed that tidss is the right
>>>> place to support this as we are simply leveraging the DSS hardware
>>>> capabilities of static partitioning here.
>>>
>>> If the only thing tidss does in the "owned by RTOS" is forwarding KMS
>>> atomic states to the RTOS, then I'm still not sure why we need to
>>> involve tidss at all.
>>
>> I think maybe here is the point of misunderstanding. We are not forwarding
>> atomic states to RTOS here. Linux (tidss) is infact, accessing the display
>> register space assigned to it (common1 assigned to Linux, commmon0 assigned to
>> RTOS) and also writing to DSS plane registers for the plane assigned to it
>> (say VID assigned to Linux and VIDL assigned to RTOS).
>>
>>> It's not just about interrupts, it's also about how your arbitrate
>>> between what Linux wants and what the RTOS wants. Like if the RTOS still
>>> wants to output something but Linux wants to disable it, how do you
>>> reconcile the two?
>>>
>>
>> The scheme involves static partitioning of display resource which are assigned
>> compile-time to RTOS and Linux. Here the RTOS firmware is compiled with
>> specific ownership/display resources as desired by user and this assignment
>> stays intact.
>>
>> If there is a more complex use-case which requires dynamic
>> assignment/arbitration of resources then I agree those require some sort of
>> IPC scheme but this is not what we target with these series. This series is
>> simply to support static partitioning feature (separate register space,
>> separate irq, firewalling support etc) of TI DSS hardware across the multiple
>> hosts and there are use-cases too for which this scheme suffices.
> 
> I think you're right and we have a misunderstanding. My initial
> assumption was that it was to prevent the Linux side of sides from
> screwing up the output if it was to crash.
> 
> But it looks like it's not the main point of this series, so could you
> share some use-cases you're trying to address?
> 

The end use-case we have demonstrated right now with this series is a
proof-of-concept display cluster use-case where the RTOS boots early on
the MCU core (launched at the bootloader stage), initializes the display
(using the global common0 register space and irq) and starts displaying
safety tell-tales on one plane, and once Linux boots up on the application
processor, Linux (using the common1 register space and irq) controls the
other plane with GPU rendering using a Qt based application. And yes, we
also support the scenario where Linux crashes but the RTOS, being the DSS
master and in control of the DSS power, clock domain and global register
space, is not impacted by the crash.
This is demonstrated in this video [1] and the steps to reproduce it are
also documented here [2].

[1]: https://www.youtube.com/watch?v=WIcds6HGeMI&t=884s
[2]:
https://software-dl.ti.com/processor-sdk-linux/esd/AM62PX/09_01_00_08/exports/docs/system/Demo_User_Guides/Display_Cluster_User_Guide.html

Regards
Devarsh

> Thanks!
> Maxime
Javier Martinez Canillas May 15, 2024, 2:45 p.m. UTC | #8
Devarsh Thakkar <devarsht@ti.com> writes:

Hello Devarsh and Maxime,

> Hi Maxime,
>
> On 14/03/24 20:04, Maxime Ripard wrote:
>> Hi,
>> 
>> On Wed, Feb 14, 2024 at 09:17:12PM +0530, Devarsh Thakkar wrote:
>>> On 13/02/24 19:34, Maxime Ripard wrote:
>>>> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
>>>>> On 26/01/24 17:45, Maxime Ripard wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Thanks a lot for working on that.
>>>>>>
>>>>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
>>>>>>> Display subsystem present in TI Keystone family of devices supports sharing
>>>>>>> of display between multiple hosts as it provides separate register space
>>>>>>> (common* region) for each host to programming display controller and also a
>>>>>>> unique interrupt line for each host.
>>>>>>>
>>>>>>> This adds support for display sharing, by allowing partitioning of
>>>>>>> resources either at video port level or at video plane level as
>>>>>>> described below :
>>>>>>>
>>>>>>> 1) Linux can own (i.e have write access) completely one or more of video
>>>>>>> ports along with corresponding resources (viz. overlay managers,
>>>>>>> video planes) used by Linux in context of those video ports.
>>>>>>> Even if Linux is owning
>>>>>>> these video ports it can still share this video port with a remote core
>>>>>>> which can own one or more video planes associated with this video port.
>>>>>>>
>>>>>>> 2) Linux owns one or more of the video planes with video port
>>>>>>> (along with corresponding overlay manager) associated with these planes
>>>>>>> being owned and controlled by a remote core. Linux still has read-only
>>>>>>> access to the associated video port and overlay managers so that it can
>>>>>>> parse the settings made by remote core.
>>>>>>
>>>>>> So, just to make sure we're on the same page. 1) means Linux drives the
>>>>>> whole display engine, but can lend planes to the R5? How does that work,
>>>>>> is Linux aware of the workload being there (plane size, format, etc) ?
>>>>>>
>>>>>
>>>>> Well, there is no dynamic procedure being followed for lending. The
>>>>> partitioning scheme is decided and known before hand, and the remote
>>>>> core firmware updated and compiled accordingly, and similarly the
>>>>> device-tree overlay for Linux is also updated with partitioning
>>>>> information before bootup.
>>>>>
>>>>> What would happen here is that Linux will know before-hand this
>>>>> partitioning information via device-tree properties and won't enumerate
>>>>> the plane owned by RTOS, but it will enumerate the rest of the display
>>>>> components and initialize the DSS, after which user can load the DSS
>>>>> firmware on remote core and this firmware will only have control of
>>>>> plane as it was compiled with that configuration.
>>>>
>>>> Right. If the RTOS is in control of a single plane, how it is expected
>>>> to deal with Linux shutting the CRTC down, or enforcing a configuration
>>>> that isn't compatible with what the RTOS expects (like a plane with a
>>>> higher zpos masking its plane), what is the mechanism to reconcile it?
>>>>
>>>
>>> Just for the note, for this "RTOS control single plane" mode, we don't have a
>>> firmware available to test (right now we are only supporting example for "RTOS
>>> controlling the display mode" as shared here [1]) and hence this is not
>>> validated but the idea was to keep dt-bindings generic enough to support them
>>> in future and that's why I referred to it here.
>>>


If I understand you correctly, for now the only real use case is when the
RTOS owns / manages the complete display pipeline and Linux can only
own video planes.

The opposite is supported by the DSS hardware (thanks to its feature that
allows partitioning the register space and having multiple per-host IRQs)
but it's not a real use case yet. The reason why this case is added to the
DT binding is, as you said, for flexibility and to make the design
future-proof.

>>> separate irq
>>> Coming back to your questions, with the current scheme the Linux (tidss) would
>>> be expected to make sure the CRTC being shared with RTOS is never shutdown and
>>> the RTOS plane should never gets masked.
>> 
>> I'm probably missing something then here, but if the Linux side of
>> things is expected to keep the current configuration and keep it active
>> for it to work, what use-case would it be useful for?
>> 
>
> It's just one of the partitioning possibilities that I mentioned here, that
> Linux is in control of DSS as a whole and the user want the other host (be it
> RTOS or any other core) to control a single plane. For e.g it could be Linux
> (with GPU rendering) displaying the graphics and RTOS overlaying a real time
> clock or any other signs which need to be displayed in real-time.
> But more than the use-case this is inspired by the fact that we want to be
> flexible and support in the linux driver whatever partitioning scheme
> possibilities are there which are supported in hardware and we let user decide
> on the partitioning scheme.
>

A possible use case here could be if Linux is safer than the other host
owning a single plane, right? Then in that case the RTOS could fail but
the display pipeline won't be torn down.

That is, if your safety tell-tales were driven by Linux and the other OS
displayed the GPU-rendered Qt based application on another plane.

But as said, for now that's a theoretical use case since the one you
mentioned is the opposite.

[....]

>>>
>>>> It's not just about interrupts, it's also about how your arbitrate
>>>> between what Linux wants and what the RTOS wants. Like if the RTOS still
>>>> wants to output something but Linux wants to disable it, how do you
>>>> reconcile the two?
>>>>
>>>
>>> The scheme involves static partitioning of display resource which are assigned
>>> compile-time to RTOS and Linux. Here the RTOS firmware is compiled with
>>> specific ownership/display resources as desired by user and this assignment
>>> stays intact.
>>>
>>> If there is a more complex use-case which requires dynamic
>>> assignment/arbitration of resources then I agree those require some sort of
>>> IPC scheme but this is not what we target with these series. This series is
>>> simply to support static partitioning feature (separate register space,
>>> separate irq, firewalling support etc) of TI DSS hardware across the multiple
>>> hosts and there are use-cases too for which this scheme suffices.
>> 
>> I think you're right and we have a misunderstanding. My initial
>> assumption was that it was to prevent the Linux side of sides from
>> screwing up the output if it was to crash.
>> 
>> But it looks like it's not the main point of this series, so could you
>> share some use-cases you're trying to address?
>> 
>
> The end use-case we have demonstrated right now with this series is a
> proof-of-concept display cluster use-case where RTOS boots early on MCU core
> (launched at bootloader stage) and initializes the display (using the global
> common0 register space and irq) and starts displaying safety tell-tales on one
> plane, and once Linux boots up on application processor,
> Linux (using common1 register space and irq) controls the other plane with GPU
> rendering using a QT based application. And yes, we also support the scenario
> where Linux crashes but RTOS being the DSS master and in control of DSS power,
> clock domain and global register space is not impacted by the crash.

You mention 2 scenarios, but aren't they actually the same? Or did I
misunderstand?

In both cases the RTOS owns the display pipeline and Linux can just
display using a single plane.

That's why I agree with Maxime that a fwkms could be a simpler solution to
your use case instead of adding all this complexity to the DSS driver.
Yes, I understand the HW supports all this flexibility but there's no real
use case yet (you mentioned that you don't even have firmware for the case
where the R5F only owns a single plane).

The DT binding for a fwkms driver would be trivial, in fact maybe we might
even leverage simpledrm for this case and not require a new driver at all.
Daniel Vetter May 16, 2024, 12:51 p.m. UTC | #9
On Wed, May 15, 2024 at 04:45:09PM +0200, Javier Martinez Canillas wrote:
> Devarsh Thakkar <devarsht@ti.com> writes:
> 
> Hello Devarsh and Maxime,
> 
> > Hi Maxime,
> >
> > On 14/03/24 20:04, Maxime Ripard wrote:
> >> Hi,
> >> 
> >> On Wed, Feb 14, 2024 at 09:17:12PM +0530, Devarsh Thakkar wrote:
> >>> On 13/02/24 19:34, Maxime Ripard wrote:
> >>>> On Thu, Feb 08, 2024 at 06:26:17PM +0530, Devarsh Thakkar wrote:
> >>>>> On 26/01/24 17:45, Maxime Ripard wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Thanks a lot for working on that.
> >>>>>>
> >>>>>> On Tue, Jan 16, 2024 at 07:11:41PM +0530, Devarsh Thakkar wrote:
> >>>>>>> Display subsystem present in TI Keystone family of devices supports sharing
> >>>>>>> of display between multiple hosts as it provides separate register space
> >>>>>>> (common* region) for each host to programming display controller and also a
> >>>>>>> unique interrupt line for each host.
> >>>>>>>
> >>>>>>> This adds support for display sharing, by allowing partitioning of
> >>>>>>> resources either at video port level or at video plane level as
> >>>>>>> described below :
> >>>>>>>
> >>>>>>> 1) Linux can own (i.e have write access) completely one or more of video
> >>>>>>> ports along with corresponding resources (viz. overlay managers,
> >>>>>>> video planes) used by Linux in context of those video ports.
> >>>>>>> Even if Linux is owning
> >>>>>>> these video ports it can still share this video port with a remote core
> >>>>>>> which can own one or more video planes associated with this video port.
> >>>>>>>
> >>>>>>> 2) Linux owns one or more of the video planes with video port
> >>>>>>> (along with corresponding overlay manager) associated with these planes
> >>>>>>> being owned and controlled by a remote core. Linux still has read-only
> >>>>>>> access to the associated video port and overlay managers so that it can
> >>>>>>> parse the settings made by remote core.
> >>>>>>
> >>>>>> So, just to make sure we're on the same page. 1) means Linux drives the
> >>>>>> whole display engine, but can lend planes to the R5? How does that work,
> >>>>>> is Linux aware of the workload being there (plane size, format, etc) ?
> >>>>>>
> >>>>>
> >>>>> Well, there is no dynamic procedure being followed for lending. The
> >>>>> partitioning scheme is decided and known before hand, and the remote
> >>>>> core firmware updated and compiled accordingly, and similarly the
> >>>>> device-tree overlay for Linux is also updated with partitioning
> >>>>> information before bootup.
> >>>>>
> >>>>> What would happen here is that Linux will know before-hand this
> >>>>> partitioning information via device-tree properties and won't enumerate
> >>>>> the plane owned by RTOS, but it will enumerate the rest of the display
> >>>>> components and initialize the DSS, after which user can load the DSS
> >>>>> firmware on remote core and this firmware will only have control of
> >>>>> plane as it was compiled with that configuration.
> >>>>
> >>>> Right. If the RTOS is in control of a single plane, how it is expected
> >>>> to deal with Linux shutting the CRTC down, or enforcing a configuration
> >>>> that isn't compatible with what the RTOS expects (like a plane with a
> >>>> higher zpos masking its plane), what is the mechanism to reconcile it?
> >>>>
> >>>
> >>> Just for the note, for this "RTOS control single plane" mode, we don't have a
> >>> firmware available to test (right now we are only supporting example for "RTOS
> >>> controlling the display mode" as shared here [1]) and hence this is not
> >>> validated but the idea was to keep dt-bindings generic enough to support them
> >>> in future and that's why I referred to it here.
> >>>
> 
> 
> If I understand you correctly, for now the only real use case is when the
> the RTOS owns / manages the complete display pipeline and Linux can only
> own video planes.
> 
> The opposite is supported by the DSS hardware (thanks to its feature that
> allows partitioning the register space and having multiple per-host IRQs) 
> but it's not a real use case yet. The reason why this case is added to the
> DT binding is as you said for flexiblity and make the design future-proof.
> 
> >>> separate irq
> >>> Coming back to your questions, with the current scheme the Linux (tidss) would
> >>> be expected to make sure the CRTC being shared with RTOS is never shutdown and
> >>> the RTOS plane should never gets masked.
> >> 
> >> I'm probably missing something then here, but if the Linux side of
> >> things is expected to keep the current configuration and keep it active
> >> for it to work, what use-case would it be useful for?
> >> 
> >
> > It's just one of the partitioning possibilities that I mentioned here, that
> > Linux is in control of DSS as a whole and the user want the other host (be it
> > RTOS or any other core) to control a single plane. For e.g it could be Linux
> > (with GPU rendering) displaying the graphics and RTOS overlaying a real time
> > clock or any other signs which need to be displayed in real-time.
> > But more than the use-case this is inspired by the fact that we want to be
> > flexible and support in the linux driver whatever partitioning scheme
> > possibilities are there which are supported in hardware and we let user decide
> > on the partitioning scheme.
> >
> 
> A possible use case here could be if Linux is safer than the other host
> owning a single plane, right? Then in that case the RTOS could fail but
> the display pipeline won't be teared down.
> 
> That is, if your safety tell-tales would be driven by Linux and having
> other OS dislay the GPU-rendered QT based application on another plane.
> 
> But as said, for now that's a theorethical use case since the one you
> mentioned is the opposite.
> 
> [....]
> 
> >>>
> >>>> It's not just about interrupts, it's also about how your arbitrate
> >>>> between what Linux wants and what the RTOS wants. Like if the RTOS still
> >>>> wants to output something but Linux wants to disable it, how do you
> >>>> reconcile the two?
> >>>>
> >>>
> >>> The scheme involves static partitioning of display resource which are assigned
> >>> compile-time to RTOS and Linux. Here the RTOS firmware is compiled with
> >>> specific ownership/display resources as desired by user and this assignment
> >>> stays intact.
> >>>
> >>> If there is a more complex use-case which requires dynamic
> >>> assignment/arbitration of resources then I agree those require some sort of
> >>> IPC scheme but this is not what we target with these series. This series is
> >>> simply to support static partitioning feature (separate register space,
> >>> separate irq, firewalling support etc) of TI DSS hardware across the multiple
> >>> hosts and there are use-cases too for which this scheme suffices.
> >> 
> >> I think you're right and we have a misunderstanding. My initial
> >> assumption was that it was to prevent the Linux side of sides from
> >> screwing up the output if it was to crash.
> >> 
> >> But it looks like it's not the main point of this series, so could you
> >> share some use-cases you're trying to address?
> >> 
> >
> > The end use-case we have demonstrated right now with this series is a
> > proof-of-concept display cluster use-case where RTOS boots early on MCU core
> > (launched at bootloader stage) and initializes the display (using the global
> > common0 register space and irq) and starts displaying safety tell-tales on one
> > plane, and once Linux boots up on application processor,
> > Linux (using common1 register space and irq) controls the other plane with GPU
> > rendering using a QT based application. And yes, we also support the scenario
> > where Linux crashes but RTOS being the DSS master and in control of DSS power,
> > clock domain and global register space is not impacted by the crash.
> 
> You mention 2 scenarios but are actually the same? Or did I misunderstand?
> 
> In both cases the RTOS own the display pipeline and Linux can just display
> using a single plane.
> 
> That's why I think that agree with Maxime, that a fwkms could be a simpler
> solution to your use case instead of adding all this complexity to the DSS
> driver. Yes, I understand the HW supports all this flexibility but there's
> no real use case yet (you mentioned that don't even have firmware for this
> single plane owned by the RTOS in the R5F case).
> 
> The DT binding for a fwkms driver would be trivial, in fact maybe we might
> even leverage simpledrm for this case and not require a new driver at all.

I guess you can still do things like pageflipping and maybe use some of
the color/blending hardware? Maybe even have more than one plane
available? fwkms/simpledrm conceptually cannot really support pageflipping
even, so that's a much, much reduced feature set.

That all aside I do think we should limit the support to just the first
case, where linux gets a few pieces assigned to it and is not the DSS
master. From what I'm understanding you could assign an entire crtc with
planes and everything to linux, so this shouldn't really constrain
real-world usage?

At least until there's support in firmware for this it's all way too
theoretical, and I agree with Maxime and Javier that there's some serious
design questions about how this kind of static leasing should work with
drm sitting on top.
-Sima
Devarsh Thakkar May 29, 2024, 2:46 p.m. UTC | #10
Hi Javier, Maxime, Daniel,

Sorry for the delay. Please find response inline.

On 16/05/24 18:21, Daniel Vetter wrote:
> On Wed, May 15, 2024 at 04:45:09PM +0200, Javier Martinez Canillas wrote:
>> Devarsh Thakkar <devarsht@ti.com> writes:

[..]

>>
>> If I understand you correctly, for now the only real use case is when the
>> the RTOS owns / manages the complete display pipeline and Linux can only
>> own video planes.
>>

Not exactly. What I mean is that this is the default configuration/example
we intend to provide to customers as an out-of-box demo. But flexibility
is provided to customers to modify the display sharing configuration per
their use-case; for e.g. at the RTOS side, in place of device-tree, we
have a sysconfig menu [1] using which they can select the desired
configuration, and furthermore they can go ahead and edit the code too. So
the Linux driver is expected to be flexible enough to support the
different configurations supported by the HW.

I have a limited view of all the possible use-cases which customers may
try out with different configurations, but a few examples are shared
below, apart from the one discussed earlier:

[Examples]:
1) The customer is running Linux as the main OS but using an RTOS to
control some external peripherals like a temperature sensor, motion sensor
etc. In that case, if they also want to display the sensor data on the
same monitor, they can use the configuration where the RTOS uses a single
plane and Linux acts as the DSS master.

2) Another configuration could be where the RTOS wants to control one full
end-to-end pipeline going to one connector and Linux wants to control a
full end-to-end pipeline going to another connector; that can be supported
too using this scheme (as shared in this series).

3) Also, I think this device-tree based scheme could be leveraged in
virtualization too with a static partitioning based scheme using Xen,
where for e.g. we split the DSS resources between the host (DOM0) and the
guest (DOMU).

>> The opposite is supported by the DSS hardware (thanks to its feature that
>> allows partitioning the register space and having multiple per-host IRQs) 
>> but it's not a real use case yet. The reason why this case is added to the
>> DT binding is as you said for flexiblity and make the design future-proof.
>>

Not really; as explained above, we are documenting all the configurations
which the hardware supports as supported in software in the SDK, and
that's what we are aiming for upstream too.

[..]
>>>> I'm probably missing something then here, but if the Linux side of
>>>> things is expected to keep the current configuration and keep it active
>>>> for it to work, what use-case would it be useful for?
>>>>
>>>
>>> It's just one of the partitioning possibilities that I mentioned here, that
>>> Linux is in control of DSS as a whole and the user want the other host (be it
>>> RTOS or any other core) to control a single plane. For e.g it could be Linux
>>> (with GPU rendering) displaying the graphics and RTOS overlaying a real time
>>> clock or any other signs which need to be displayed in real-time.
>>> But more than the use-case this is inspired by the fact that we want to be
>>> flexible and support in the linux driver whatever partitioning scheme
>>> possibilities are there which are supported in hardware and we let user decide
>>> on the partitioning scheme.
>>>
>>
>> A possible use case here could be if Linux is safer than the other host
>> owning a single plane, right? Then in that case the RTOS could fail but
>> the display pipeline won't be teared down.
>>
>> That is, if your safety tell-tales would be driven by Linux and having
>> other OS dislay the GPU-rendered QT based application on another plane.
>>
>> But as said, for now that's a theorethical use case since the one you
>> mentioned is the opposite.
>>
>> [....]
>>

Yes, that could be a possible use-case too. We want to provide customers
the flexibility in their app design to select a different configuration,
and that is why all these configurations are supported in the driver.

[..]
>>>>> If there is a more complex use-case which requires dynamic
>>>>> assignment/arbitration of resources then I agree those require some sort of
>>>>> IPC scheme but this is not what we target with these series. This series is
>>>>> simply to support static partitioning feature (separate register space,
>>>>> separate irq, firewalling support etc) of TI DSS hardware across the multiple
>>>>> hosts and there are use-cases too for which this scheme suffices.
>>>>
>>>> I think you're right and we have a misunderstanding. My initial
>>>> assumption was that it was to prevent the Linux side of sides from
>>>> screwing up the output if it was to crash.
>>>>
>>>> But it looks like it's not the main point of this series, so could you
>>>> share some use-cases you're trying to address?
>>>>
>>>
>>> The end use-case we have demonstrated right now with this series is a
>>> proof-of-concept display cluster use-case where RTOS boots early on MCU core
>>> (launched at bootloader stage) and initializes the display (using the global
>>> common0 register space and irq) and starts displaying safety tell-tales on one
>>> plane, and once Linux boots up on application processor,
>>> Linux (using common1 register space and irq) controls the other plane with GPU
>>> rendering using a QT based application. And yes, we also support the scenario
>>> where Linux crashes but RTOS being the DSS master and in control of DSS power,
>>> clock domain and global register space is not impacted by the crash.
>>
>> You mention 2 scenarios but are actually the same? Or did I misunderstand?
>>

I have explained the other examples (where Linux is the master) above with tag
[Examples].

>> In both cases the RTOS own the display pipeline and Linux can just display
>> using a single plane.
>>
>> That's why I think that agree with Maxime, that a fwkms could be a simpler
>> solution to your use case instead of adding all this complexity to the DSS
>> driver. Yes, I understand the HW supports all this flexibility but there's
>> no real use case yet (you mentioned that don't even have firmware for this
>> single plane owned by the RTOS in the R5F case).
>>

I analyzed the fwkms driver [2] from the RPi repo (it seems it is not
upstream). Here the firmware is the entity owning the display register
space completely, and the fwkms driver simply acts as a translation layer
which converts the incoming DRM/KMS ioctls into IPC messages (having
unique message tags) sent over a mailbox; the firmware receives them and
programs the corresponding registers accordingly. So there is no display
sharing achieved here, as there is only a single display context coming
from Linux and no separate display context running at the firmware side.

But yes, in general for simple display controllers having a single
register space and a single IRQ, I think that a standard IPC based scheme
would be required for sharing the display between Linux and an RTOS,
where for e.g. the RTOS would be in full control of the display register
space and the interrupt line. Linux would send messages to the RTOS via
the IPC scheme (similar to RPi fwkms) to commit the framebuffer, and the
RTOS would send a frame-done message to Linux on receiving the vblank irq,
which would eventually call drm_crtc_handle_vblank. The IPC layer would
need to be standardized though (virtio_kms?).
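
As a very rough sketch of the Linux side of such a scheme (the message
layout and the DISP_MSG_* tags below are invented; only rpmsg_send() and
drm_crtc_handle_vblank() are existing kernel APIs):

#include <linux/rpmsg.h>
#include <drm/drm_crtc.h>
#include <drm/drm_vblank.h>

/* Invented message layout; a real protocol would need standardizing. */
struct disp_ipc_msg {
	u32 tag;	/* DISP_MSG_COMMIT from Linux, DISP_MSG_VSYNC from RTOS */
	u32 plane_id;
	u64 fb_paddr;	/* physical address of the framebuffer to latch */
};

#define DISP_MSG_COMMIT	1
#define DISP_MSG_VSYNC	2

/* Linux -> RTOS: ask the firmware to latch a new framebuffer on vblank. */
static int ipc_kms_commit_plane(struct rpmsg_device *rpdev, u32 plane_id,
				u64 fb_paddr)
{
	struct disp_ipc_msg msg = {
		.tag = DISP_MSG_COMMIT,
		.plane_id = plane_id,
		.fb_paddr = fb_paddr,
	};

	return rpmsg_send(rpdev->ept, &msg, sizeof(msg));
}

/* RTOS -> Linux: frame-done notification, complete the vblank on our side. */
static int ipc_kms_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len,
			    void *priv, u32 src)
{
	struct disp_ipc_msg *msg = data;
	struct drm_crtc *crtc = dev_get_drvdata(&rpdev->dev);

	if (len >= sizeof(*msg) && msg->tag == DISP_MSG_VSYNC)
		drm_crtc_handle_vblank(crtc);

	return 0;
}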


>> The DT binding for a fwkms driver would be trivial, in fact maybe we might
>> even leverage simpledrm for this case and not require a new driver at all.
> 
> I guess you can still do things like pageflipping and maybe use some of
> the color/blending hardware? Maybe even have more than one plane
> available? fwkms/simpledrm conceptually cannot really support pageflipping
> even, so that's a much, much reduced feature set.
>

Yes, exactly; fwkms and simpledrm are not related to display sharing.

> That all aside I do think we should limit the support to just the first
> case, where linux gets a few pieces assigned to it and is not the DSS
> master. From what I'm understanding you could assign entire crtc with
> planes and everything to linux, so this shouldn't really constraint
> real-world usage?
> 
> At least until there's support in firmware for this it's all way too
> theoretical, and I agree with Maxime and Javier that there's some serious
> design questions about how this kind of static leasing should work with
> drm sitting on top.

To summarize, as per my understanding there are two possible schemes for
display sharing:

1) IPC based scheme -> This would be useful for simple display controllers
having a single register space and a single irq line to one host.
2) Static partitioning scheme -> This requires some support in the display
controller (per-host irq etc.) as explained below.

I think each one has its own uniqueness and holds on its own, and both
solutions can be pursued independently; development for one scheme should
not block development for the other.

As the TI display controller supports robust partitioning of the DSS
processing blocks (pipelines, overlays, interrupts etc.) among multiple
processors, allowing parallel and independent control of the DSS resources
in an interference-free environment, the static partitioning scheme was
possible to implement, and it sufficed for the requirements of the display
sharing use-case without incurring the extra overhead and complexity of
IPC.

And if there is another display controller with a similar feature-set,
then the bindings for the static partitioning scheme (or at least some of
them) could be generalized:

	drm,shared-mode;
	drm,shared-mode-crtc = "..";
	drm,shared-mode-crtc-owned = "..";
	drm,shared-mode-planes = "..";

But I could not find any other display controller supporting the above
static partitioning feature, and hence I kept the bindings specific to TI.
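
Just to illustrate how such generic properties could be consumed by a
driver (the property names below are only the sketch from above, not an
accepted binding):

#include <linux/of.h>
#include <linux/printk.h>

static int parse_shared_mode(struct device_node *np)
{
	const char *crtc = NULL, *crtc_owned = NULL;
	int nplanes;

	if (!of_property_read_bool(np, "drm,shared-mode"))
		return 0;	/* not shared, this host owns the DSS exclusively */

	of_property_read_string(np, "drm,shared-mode-crtc", &crtc);
	of_property_read_string(np, "drm,shared-mode-crtc-owned", &crtc_owned);
	nplanes = of_property_count_strings(np, "drm,shared-mode-planes");

	pr_info("shared DSS: crtc %s, owned crtc %s, %d plane(s) for this host\n",
		crtc, crtc_owned, nplanes);

	return 0;
}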

I still feel the static partitioning approach adopted in this series
holds on its own, achieving the sharing without the extra
complexity/overhead of IPC by utilizing the above-mentioned HW features;
all it aims to do is add software support for the flexibility of sharing
the display resources.

Kindly let me know if we are aligned on this, or if you have any further
thoughts or queries.

[1]:
https://software-dl.ti.com/mcu-plus-sdk/esd/AM62PX/latest/exports/docs/api_guide_am62px/DRIVERS_DSS_PAGE.html#autotoc_md311
[2]:
https://github.com/raspberrypi/linux/blob/rpi-6.6.y/drivers/gpu/drm/vc4/vc4_firmware_kms.c

Regards
Devarsh
Maxime Ripard June 6, 2024, 5:20 p.m. UTC | #11
On Wed, May 29, 2024 at 08:16:46PM GMT, Devarsh Thakkar wrote:
> Hi Javier, Maxime, Daniel,
> 
> Sorry for the delay. Please find response inline.
> 
> On 16/05/24 18:21, Daniel Vetter wrote:
> > On Wed, May 15, 2024 at 04:45:09PM +0200, Javier Martinez Canillas wrote:
> >> Devarsh Thakkar <devarsht@ti.com> writes:
> 
> [..]
> 
> >>
> >> If I understand you correctly, for now the only real use case is when the
> >> the RTOS owns / manages the complete display pipeline and Linux can only
> >> own video planes.
> >>
> 
> Not exactly, What I mean is that this is the default configuration/example we
> intend to provide to customer as an out-of-box demo . But flexibility is
> provided to customer to modify the display sharing configuration per their
> use-case, for e.g at RTOS side in place of device-tree, we have a sysconfig
> menu [1] using which they can select the desired configuration, furthermore
> they can go ahead and edit the code too, so Linux driver is expected to be
> flexible to support different configurations as supported by the HW.

Everything you stated above can be achieved with the solutions we've
been suggesting too.

> I have a limited view of all possible use-cases which customer may try out
> with different configurations

I think that's a problem in itself. In order to properly reflect and
discuss a solution, we first need to have a problem. "Flexible
solutions" in practice never really work because you always end up not
flexible enough and still have to rework the whole thing. Except it's
now harder to do since you created something complicated first.

> few examples are shared below part from the one discussed earlier :
> 
> [Examples]:
> 1) Customer is running Linux as main OS but using RTOS to control some
> external peripherals like temperature sensor, motion sensor e.t.c. In that
> case if they want to display the sensor data too on the same monitor, then
> they can use the configuration where RTOS use single plane and Linux as the
> DSS master.

How would that work in the real world? If Linux is in charge of the
display, it can shut down the output whenever it wants to, including
shutting down the clocks or power domains, which would then probably make
the firmware crash?

A much more robust solution here would be to have either the firmware in
charge of the display and delegate a few planes, or provide some way for
Linux to access those sensors while being fully in charge of the
display.

> 2) Another configuration could be where RTOS want to control one full
> end-to-end pipeline going to one connector and Linux want to control full
> end-to-end pipeline going to another connector, that can be supported too
> using this scheme (as shared in this series).

That one is easy: you make the firmware adjust the DT when it boots
Linux and thus restrict the part of the display controller it can use.

It's mostly a DT binding issue.

> 3) Also I think, this device-tree based scheme could be leveraged in
> virtualization too with static partitioning based scheme using Xen for e.g. we
> split the DSS resources between host (DOM0) and the guest (DOMU).

I'm not sure you can leverage virtio at the firmware level, but
otherwise, it's pretty much what we've been suggesting the whole time:
make a generic interface between the firmware and Linux, and then write
a KMS driver for that interface.

Then the firmware can delegate planes to Linux, just like the hypervisor
would to a guest VM.

> >> The opposite is supported by the DSS hardware (thanks to its feature that
> >> allows partitioning the register space and having multiple per-host IRQs) 
> >> but it's not a real use case yet. The reason why this case is added to the
> >> DT binding is as you said for flexiblity and make the design future-proof.
>
> Not really, as explained above we are documenting all possible configurations
> which hardware supports as supported in software in the SDK and that's what we
> are aiming for upstream too.

Generally speaking, not all hardware configurations make sense to run
with in Linux, and there are tons of drivers where we support only a
limited set of capabilities precisely because some simply don't work
well with Linux. I'm glad you (and your organization) are that
committed to Linux, but I don't think having that as an absolute goal is
reasonable.

Let's work on usecases, and make the most of it instead. In your
examples, 2 and 3 make total sense to be supported in Linux. I'm not
sure 1) does.

> [..]>>>> I'm probably missing something then here, but if the Linux side of
> >>>> things is expected to keep the current configuration and keep it active
> >>>> for it to work, what use-case would it be useful for?
> >>>>
> >>>
> >>> It's just one of the partitioning possibilities that I mentioned here, that
> >>> Linux is in control of DSS as a whole and the user want the other host (be it
> >>> RTOS or any other core) to control a single plane. For e.g it could be Linux
> >>> (with GPU rendering) displaying the graphics and RTOS overlaying a real time
> >>> clock or any other signs which need to be displayed in real-time.
> >>> But more than the use-case this is inspired by the fact that we want to be
> >>> flexible and support in the linux driver whatever partitioning scheme
> >>> possibilities are there which are supported in hardware and we let user decide
> >>> on the partitioning scheme.
> >>>
> >>
> >> A possible use case here could be if Linux is safer than the other host
> >> owning a single plane, right? Then in that case the RTOS could fail but
> >> the display pipeline won't be teared down.
> >>
> >> That is, if your safety tell-tales would be driven by Linux and having
> >> other OS dislay the GPU-rendered QT based application on another plane.
> >>
> >> But as said, for now that's a theorethical use case since the one you
> >> mentioned is the opposite.
> >>
> >> [....]
> >>
> 
> Yes that could be a possible use-case too, we want to provide customer
> the flexibility in their app design to select different configuration,
> as we say all these configurations as supported in the driver.
> 
> [..]
> >>>>> If there is a more complex use-case which requires dynamic
> >>>>> assignment/arbitration of resources then I agree those require some sort of
> >>>>> IPC scheme but this is not what we target with these series. This series is
> >>>>> simply to support static partitioning feature (separate register space,
> >>>>> separate irq, firewalling support etc) of TI DSS hardware across the multiple
> >>>>> hosts and there are use-cases too for which this scheme suffices.
> >>>>
> >>>> I think you're right and we have a misunderstanding. My initial
> >>>> assumption was that it was to prevent the Linux side of sides from
> >>>> screwing up the output if it was to crash.
> >>>>
> >>>> But it looks like it's not the main point of this series, so could you
> >>>> share some use-cases you're trying to address?
> >>>>
> >>>
> >>> The end use-case we have demonstrated right now with this series is a
> >>> proof-of-concept display cluster use-case where RTOS boots early on MCU core
> >>> (launched at bootloader stage) and initializes the display (using the global
> >>> common0 register space and irq) and starts displaying safety tell-tales on one
> >>> plane, and once Linux boots up on application processor,
> >>> Linux (using common1 register space and irq) controls the other plane with GPU
> >>> rendering using a QT based application. And yes, we also support the scenario
> >>> where Linux crashes but RTOS being the DSS master and in control of DSS power,
> >>> clock domain and global register space is not impacted by the crash.
> >>
> >> You mention 2 scenarios but are actually the same? Or did I misunderstand?
> >>
> 
> I have explained the other examples (where Linux is the master) above with tag
> [Examples].
> 
> >> In both cases the RTOS own the display pipeline and Linux can just display
> >> using a single plane.
> >>
> >> That's why I think that agree with Maxime, that a fwkms could be a simpler
> >> solution to your use case instead of adding all this complexity to the DSS
> >> driver. Yes, I understand the HW supports all this flexibility but there's
> >> no real use case yet (you mentioned that don't even have firmware for this
> >> single plane owned by the RTOS in the R5F case).
> >>
> 
> I analyzed the fwkms driver [2] from the RPi repo (as it seems it is not
> upstream). Here the firmware is the entity owning the display register space
> completely, and the fwkms driver simply acts as a translation layer which
> converts the incoming DRM/KMS ioctls into IPC messages (having unique
> message tags) sent over a mailbox; the firmware receives these and programs the
> corresponding registers accordingly. So there is no display sharing achieved
> here, as there is only a single display context coming from Linux and no
> separate display context running on the firmware side.

That's somewhat correct, but also a simplification: while the driver and
firmware indeed never do any kind of segmentation (iirc), the protocol
between the firmware and the kernel leaves that door open so you could
very well do that with some work on the firmware.

And now that I think about it, I think the RPi firmware can in some
situations (like overheating) add an overlay on top of the current
output.

Also, we're not advocating for you to use the RPi driver as is: we
suggested to use a similar solution, but we can totally use a different
transport mechanism or protocol (and we probably should if we want to
make it somewhat generic).

> But yes, in general for simple display controllers having a single register
> space and a single IRQ, I think that a standard IPC-based scheme would be
> required for sharing the display between Linux and an RTOS, where e.g. the RTOS
> would be in full control of the display register space and the interrupt line.
> Linux would send messages to the RTOS via an IPC scheme (similar to RPi fwkms) to
> commit the framebuffer, and the RTOS would send a frame-done message to Linux on
> receiving the vblank IRQ, which would eventually call drm_crtc_handle_vblank.
> Although the IPC layer would need to be standardized (virtio_kms?)

rpmsg is another candidate, but yeah, that's what I had in mind too (I think?).
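
For illustration only (not part of this series), here is a minimal sketch of
what that handshake could look like over rpmsg, assuming a hypothetical
message layout and endpoint wiring: Linux forwards each flush as a commit
message, and the remote core answers with a frame-done message that Linux
turns into a vblank.

#include <linux/rpmsg.h>
#include <linux/types.h>
#include <drm/drm_crtc.h>
#include <drm/drm_vblank.h>

/* Hypothetical message IDs and layout, for illustration only. */
#define FWKMS_MSG_COMMIT	0x1	/* Linux -> remote: latch a new framebuffer */
#define FWKMS_MSG_FRAME_DONE	0x2	/* remote -> Linux: frame has been scanned out */

struct fwkms_msg {
	u32 type;
	u32 plane_id;
	u64 fb_paddr;			/* physical address of the new framebuffer */
	u32 width, height, pitch, fourcc;
} __packed;

/* Called from the atomic flush path instead of writing DSS registers. */
static int fwkms_commit_fb(struct rpmsg_endpoint *ept, u32 plane_id, u64 fb_paddr,
			   u32 width, u32 height, u32 pitch, u32 fourcc)
{
	struct fwkms_msg msg = {
		.type = FWKMS_MSG_COMMIT,
		.plane_id = plane_id,
		.fb_paddr = fb_paddr,
		.width = width, .height = height, .pitch = pitch, .fourcc = fourcc,
	};

	return rpmsg_send(ept, &msg, sizeof(msg));
}

/* rpmsg rx callback: the remote core reports frame-done in place of a vblank irq. */
static int fwkms_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len,
			  void *priv, u32 src)
{
	struct drm_crtc *crtc = priv;	/* stashed when the endpoint was created */
	const struct fwkms_msg *msg = data;

	if (len >= sizeof(*msg) && msg->type == FWKMS_MSG_FRAME_DONE)
		drm_crtc_handle_vblank(crtc);

	return 0;
}

The message format, commit granularity and error handling would all be part
of whatever protocol ends up being standardized.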

> >> The DT binding for a fwkms driver would be trivial; in fact we might
> >> even leverage simpledrm for this case and not require a new driver at all.
> > 
> > I guess you can still do things like pageflipping and maybe use some of
> > the color/blending hardware? Maybe even have more than one plane
> > available? fwkms/simpledrm conceptually cannot really support pageflipping
> > even, so that's a much, much reduced feature set.
> >
> 
> Yes exactly, fwkms and simpledrm are not related to display sharing.
> 
> > That all aside, I do think we should limit the support to just the first
> > case, where Linux gets a few pieces assigned to it and is not the DSS
> > master. From what I understand you could assign an entire CRTC with
> > planes and everything to Linux, so this shouldn't really constrain
> > real-world usage?
> > 
> > At least until there's support in firmware for this it's all way too
> > theoretical, and I agree with Maxime and Javier that there are some serious
> > design questions about how this kind of static leasing should work with
> > drm sitting on top.
> 
> To summarize, as per my understanding there are two possible schemes for
> display sharing:
> 
> 1) IPC-based scheme -> This would be useful for simple display controllers
> having a single register space and a single IRQ line exposed to one host.
> 2) Static partitioning scheme -> This requires some support in the display
> controller (per-host IRQ etc.) as explained below:
> 
> And I think each one has its own merits and stands on its own, and both
> solutions can be pursued independently; development for one scheme should
> not block development for the other.

Agreed.

> As the TI display controller supports robust partitioning of DSS processing blocks
> (pipelines, overlays, interrupts, etc.) among multiple processors, allowing
> parallel and independent control of the DSS resources in an interference-free
> environment, the static partitioning scheme was possible to implement and it
> sufficed for the display sharing use-case without incurring the extra
> overhead and complexity of IPC.
> 
> And if there is another display controller with a similar feature-set as above,
> then the bindings for the static partitioning scheme (or at least some of
> them) could be generalized:
> 
> 	drm,shared-mode;
> 	drm,shared-mode-crtc = "..";
> 	drm,shared-mode-crtc-owned = "..";
> 	drm,shared-mode-planes = "..";
> 
> But I could not find any other display controller supporting the above static
> partitioning feature, and hence I made the bindings specific to TI.
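
(For illustration, a rough sketch of how a driver could consume such
generalized properties with the standard device_property_*() helpers; the
"drm,shared-mode*" names just mirror the hypothetical list above and are not
an accepted binding.)

#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/property.h>

static int example_parse_shared_mode(struct device *dev)
{
	const char *crtcs[4], *planes[8];
	int num_crtcs, num_planes;

	/* Exclusive ownership: nothing to prune from the feature list. */
	if (!device_property_read_bool(dev, "drm,shared-mode"))
		return 0;

	num_crtcs = device_property_read_string_array(dev, "drm,shared-mode-crtc",
						      crtcs, ARRAY_SIZE(crtcs));
	if (num_crtcs < 0)
		return num_crtcs;

	num_planes = device_property_read_string_array(dev, "drm,shared-mode-planes",
						       planes, ARRAY_SIZE(planes));
	if (num_planes < 0)
		return num_planes;

	/* ...restrict the exposed CRTCs/planes to the named resources... */
	return 0;
}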

I think the binding would make more sense to me using child nodes to
describe the available planes, CRTCs, etc.

But that's something to discuss with the DT maintainers.

Maxime

Patch

diff --git a/drivers/gpu/drm/tidss/tidss_crtc.c b/drivers/gpu/drm/tidss/tidss_crtc.c
index 5f838980c7a1..f6a877ff4c6c 100644
--- a/drivers/gpu/drm/tidss/tidss_crtc.c
+++ b/drivers/gpu/drm/tidss/tidss_crtc.c
@@ -31,13 +31,19 @@  static void tidss_crtc_finish_page_flip(struct tidss_crtc *tcrtc)
 	/*
 	 * New settings are taken into use at VFP, and GO bit is cleared at
 	 * the same time. This happens before the vertical blank interrupt.
-	 * So there is a small change that the driver sets GO bit after VFP, but
+	 * So there is a small chance that the driver sets GO bit after VFP, but
 	 * before vblank, and we have to check for that case here.
+	 *
+	 * For a video port shared between Linux and remote core but owned by remote core,
+	 * this is not required since Linux just attaches to mode that was preset by remote
+	 * core with which display is being shared.
 	 */
-	busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport);
-	if (busy) {
-		spin_unlock_irqrestore(&ddev->event_lock, flags);
-		return;
+	if (!tidss->shared_mode || tidss->shared_mode_owned_vps[tcrtc->hw_videoport]) {
+		busy = dispc_vp_go_busy(tidss->dispc, tcrtc->hw_videoport);
+		if (busy) {
+			spin_unlock_irqrestore(&ddev->event_lock, flags);
+			return;
+		}
 	}
 
 	event = tcrtc->event;
@@ -208,6 +214,44 @@  static void tidss_crtc_atomic_flush(struct drm_crtc *crtc,
 	spin_unlock_irqrestore(&ddev->event_lock, flags);
 }
 
+static void tidss_shared_vp_crtc_atomic_flush(struct drm_crtc *crtc,
+					      struct drm_atomic_state *state)
+{
+	struct tidss_crtc *tcrtc = to_tidss_crtc(crtc);
+	struct drm_device *ddev = crtc->dev;
+	unsigned long flags;
+
+	dev_dbg(ddev->dev,
+		"%s: %s enabled %d, needs modeset %d, event %p\n", __func__,
+		crtc->name, crtc->state->enable,
+		drm_atomic_crtc_needs_modeset(crtc->state), crtc->state->event);
+
+	/* There is nothing to do if CRTC is not going to be enabled. */
+	if (!crtc->state->enable)
+		return;
+	/*
+	 * Latch the flip event only if a new modeset is not
+	 * coming, so the CRTC stays enabled throughout the commit.
+	 */
+	if (drm_atomic_crtc_needs_modeset(crtc->state))
+		return;
+
+	/* We should have an event if the CRTC is enabled throughout this commit. */
+	if (WARN_ON(!crtc->state->event))
+		return;
+
+	WARN_ON(drm_crtc_vblank_get(crtc) != 0);
+
+	spin_lock_irqsave(&ddev->event_lock, flags);
+
+	WARN_ON(tcrtc->event);
+
+	tcrtc->event = crtc->state->event;
+	crtc->state->event = NULL;
+
+	spin_unlock_irqrestore(&ddev->event_lock, flags);
+}
+
 static void tidss_crtc_atomic_enable(struct drm_crtc *crtc,
 				     struct drm_atomic_state *state)
 {
@@ -253,6 +297,27 @@  static void tidss_crtc_atomic_enable(struct drm_crtc *crtc,
 	spin_unlock_irqrestore(&ddev->event_lock, flags);
 }
 
+static void tidss_shared_vp_crtc_atomic_enable(struct drm_crtc *crtc,
+					       struct drm_atomic_state *state)
+{
+	struct drm_device *ddev = crtc->dev;
+	unsigned long flags;
+
+	dev_dbg(ddev->dev, "%s, event %p\n", __func__, crtc->state->event);
+
+	/* Turn vertical blanking interrupt reporting on. */
+	drm_crtc_vblank_on(crtc);
+
+	spin_lock_irqsave(&ddev->event_lock, flags);
+
+	if (crtc->state->event) {
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		crtc->state->event = NULL;
+	}
+
+	spin_unlock_irqrestore(&ddev->event_lock, flags);
+}
+
 static void tidss_crtc_atomic_disable(struct drm_crtc *crtc,
 				      struct drm_atomic_state *state)
 {
@@ -288,6 +353,27 @@  static void tidss_crtc_atomic_disable(struct drm_crtc *crtc,
 	tidss_runtime_put(tidss);
 }
 
+static void tidss_shared_vp_crtc_atomic_disable(struct drm_crtc *crtc,
+						struct drm_atomic_state *state)
+{
+	struct tidss_crtc *tcrtc = to_tidss_crtc(crtc);
+	struct drm_device *ddev = crtc->dev;
+	unsigned long flags;
+
+	dev_dbg(ddev->dev, "%s, event %p\n", __func__, crtc->state->event);
+
+	reinit_completion(&tcrtc->framedone_completion);
+
+	spin_lock_irqsave(&ddev->event_lock, flags);
+	if (crtc->state->event) {
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		crtc->state->event = NULL;
+	}
+	spin_unlock_irqrestore(&ddev->event_lock, flags);
+
+	drm_crtc_vblank_off(crtc);
+}
+
 static
 enum drm_mode_status tidss_crtc_mode_valid(struct drm_crtc *crtc,
 					   const struct drm_display_mode *mode)
@@ -304,7 +390,14 @@  static const struct drm_crtc_helper_funcs tidss_crtc_helper_funcs = {
 	.atomic_flush = tidss_crtc_atomic_flush,
 	.atomic_enable = tidss_crtc_atomic_enable,
 	.atomic_disable = tidss_crtc_atomic_disable,
+	.mode_valid = tidss_crtc_mode_valid,
+};
 
+static const struct drm_crtc_helper_funcs tidss_shared_vp_crtc_helper_funcs = {
+	.atomic_check = tidss_crtc_atomic_check,
+	.atomic_flush = tidss_shared_vp_crtc_atomic_flush,
+	.atomic_enable = tidss_shared_vp_crtc_atomic_enable,
+	.atomic_disable = tidss_shared_vp_crtc_atomic_disable,
 	.mode_valid = tidss_crtc_mode_valid,
 };
 
@@ -406,6 +499,10 @@  struct tidss_crtc *tidss_crtc_create(struct tidss_device *tidss,
 	bool has_ctm = tidss->feat->vp_feat.color.has_ctm;
 	int ret;
 
+	dev_dbg(tidss->dev, "%s: tidss->shared_mode: %d tidss->shared_mode_owned_vps[%d] = %d\n",
+		__func__, tidss->shared_mode, hw_videoport,
+		tidss->shared_mode_owned_vps[hw_videoport]);
+
 	tcrtc = kzalloc(sizeof(*tcrtc), GFP_KERNEL);
 	if (!tcrtc)
 		return ERR_PTR(-ENOMEM);
@@ -422,8 +519,17 @@  struct tidss_crtc *tidss_crtc_create(struct tidss_device *tidss,
 		return ERR_PTR(ret);
 	}
 
-	drm_crtc_helper_add(crtc, &tidss_crtc_helper_funcs);
-
+	/* Use the shared video port CRTC helpers, which don't access the
+	 * associated video port and overlay register spaces, when Linux
+	 * does not own the video port.
+	 */
+	if (tidss->shared_mode && !tidss->shared_mode_owned_vps[hw_videoport]) {
+		drm_crtc_helper_add(crtc, &tidss_shared_vp_crtc_helper_funcs);
+		dev_dbg(tidss->dev, "%s: vp%d is being shared with Linux\n", __func__,
+			hw_videoport + 1);
+	} else {
+		drm_crtc_helper_add(crtc, &tidss_crtc_helper_funcs);
+	}
 	/*
 	 * The dispc gamma functions adapt to what ever size we ask
 	 * from it no matter what HW supports. X-server assumes 256
diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
index 1ad711f8d2a8..35a82198629f 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -273,6 +273,18 @@  const struct dispc_features dispc_j721e_feats = {
 	.vid_order = { 1, 3, 0, 2 },
 };
 
+static const u16 tidss_am62_common1_regs[DISPC_COMMON_REG_TABLE_LEN] = {
+	[DISPC_IRQ_EOI_OFF] =			0x24,
+	[DISPC_IRQSTATUS_RAW_OFF] =		0x28,
+	[DISPC_IRQSTATUS_OFF] =			0x2c,
+	[DISPC_IRQENABLE_SET_OFF] =		0x30,
+	[DISPC_IRQENABLE_CLR_OFF] =		0x40,
+	[DISPC_VID_IRQENABLE_OFF] =		0x44,
+	[DISPC_VID_IRQSTATUS_OFF] =		0x58,
+	[DISPC_VP_IRQENABLE_OFF] =		0x70,
+	[DISPC_VP_IRQSTATUS_OFF] =		0x7c,
+};
+
 const struct dispc_features dispc_am625_feats = {
 	.max_pclk_khz = {
 		[DISPC_VP_DPI] = 165000,
@@ -1228,6 +1240,22 @@  enum drm_mode_status dispc_vp_mode_valid(struct dispc_device *dispc,
 
 	max_pclk = dispc->feat->max_pclk_khz[bus_type];
 
+	/*
+	 * For shared mode, with the remote core driving the video port, make sure that the
+	 * Linux-controlled primary plane doesn't exceed the screen size set by the remote core.
+	 */
+	if (dispc->tidss->shared_mode && !dispc->tidss->shared_mode_owned_vps[hw_videoport]) {
+		int vp_hdisplay = VP_REG_GET(dispc, hw_videoport, DISPC_VP_SIZE_SCREEN, 11, 0) + 1;
+		int vp_vdisplay = VP_REG_GET(dispc, hw_videoport, DISPC_VP_SIZE_SCREEN, 27, 16) + 1;
+
+		if (mode->hdisplay > vp_hdisplay ||
+		    mode->vdisplay > vp_vdisplay) {
+			dev_err(dispc->dev, "%dx%d exceeds VP screen size %dx%d in shared mode\n",
+				mode->hdisplay, mode->vdisplay, vp_hdisplay, vp_vdisplay);
+			return MODE_BAD;
+		}
+	}
+
 	if (WARN_ON(max_pclk == 0))
 		return MODE_BAD;
 
@@ -1276,15 +1304,18 @@  enum drm_mode_status dispc_vp_mode_valid(struct dispc_device *dispc,
 		return MODE_BAD_VVALUE;
 
 	if (dispc->memory_bandwidth_limit) {
-		const unsigned int bpp = 4;
-		u64 bandwidth;
+		if (!dispc->tidss->shared_mode ||
+		    dispc->tidss->shared_mode_owned_vps[hw_videoport]) {
+			const unsigned int bpp = 4;
+			u64 bandwidth;
 
-		bandwidth = 1000 * mode->clock;
-		bandwidth = bandwidth * mode->hdisplay * mode->vdisplay * bpp;
-		bandwidth = div_u64(bandwidth, mode->htotal * mode->vtotal);
+			bandwidth = 1000 * mode->clock;
+			bandwidth = bandwidth * mode->hdisplay * mode->vdisplay * bpp;
+			bandwidth = div_u64(bandwidth, mode->htotal * mode->vtotal);
 
-		if (dispc->memory_bandwidth_limit < bandwidth)
-			return MODE_BAD;
+			if (dispc->memory_bandwidth_limit < bandwidth)
+				return MODE_BAD;
+		}
 	}
 
 	return MODE_OK;
@@ -2655,6 +2686,147 @@  static void dispc_vp_set_color_mgmt(struct dispc_device *dispc,
 		dispc_k3_vp_set_ctm(dispc, hw_videoport, ctm);
 }
 
+static int get_vp_idx_from_vp(const char *vp_name)
+{
+	u32 vp_idx;
+
+	if (!strcmp("vp1", vp_name))
+		vp_idx = 0;
+	else if (!strcmp("vp2", vp_name))
+		vp_idx = 1;
+	else if (!strcmp("vp3", vp_name))
+		vp_idx = 2;
+	else if (!strcmp("vp4", vp_name))
+		vp_idx = 3;
+	else
+		return 0;
+
+	return vp_idx;
+}
+
+static const char *get_ovr_from_vp(const char *vp_name)
+{
+	const char *ovr_name = NULL;
+
+	if (!strcmp("vp1", vp_name))
+		ovr_name = "ovr1";
+	else if (!strcmp("vp2", vp_name))
+		ovr_name = "ovr2";
+	else if (!strcmp("vp3", vp_name))
+		ovr_name = "ovr3";
+	else if (!strcmp("vp4", vp_name))
+		ovr_name = "ovr4";
+	else
+		return NULL;
+
+	return ovr_name;
+}
+
+static void dispc_shared_mode_update_bus_type(struct dispc_features *shared_mode_feat,
+					      struct dispc_device *dispc)
+{
+	u32 i, vp_idx;
+	int num_vps = shared_mode_feat->num_vps;
+	enum dispc_vp_bus_type vp_bus_type[TIDSS_MAX_PORTS];
+
+	memcpy((void *)vp_bus_type, (void *)shared_mode_feat->vp_bus_type,
+	       sizeof(vp_bus_type));
+	memset(shared_mode_feat->vp_bus_type, 0, sizeof(vp_bus_type));
+
+	for (i = 0; i < num_vps; i++) {
+		/*
+		 * Find corresponding vp bus type.
+		 */
+		vp_idx = get_vp_idx_from_vp(shared_mode_feat->vp_name[i]);
+		shared_mode_feat->vp_bus_type[i] = vp_bus_type[vp_idx];
+	}
+}
+
+static int dispc_update_shared_mode_features(struct dispc_features *shared_mode_feat,
+					     struct dispc_device *dispc)
+{
+	int r = 0, i = 0;
+
+	dev_dbg(dispc->dev, "Start updating dispc feature list for shared mode:\n");
+
+	/*
+	 * Start with a shallow copy from existing features and prune the list
+	 * as per what is actually made available to Linux
+	 */
+	memcpy((void *)shared_mode_feat, (void *)dispc->feat, sizeof(*shared_mode_feat));
+	shared_mode_feat->num_vps = device_property_string_array_count(dispc->dev,
+								       "ti,dss-shared-mode-vp");
+	shared_mode_feat->num_planes = device_property_string_array_count(dispc->dev,
+									  "ti,dss-shared-mode-planes");
+
+	r = device_property_read_string(dispc->dev, "ti,dss-shared-mode-common",
+					(const char **)&shared_mode_feat->common);
+	if (r) {
+		dev_err(dispc->dev, "failed to read shared mode common region name: %d\n", r);
+		return r;
+	}
+
+	memset(shared_mode_feat->vid_name, 0, sizeof(shared_mode_feat->vid_name));
+	r = device_property_read_string_array(dispc->dev, "ti,dss-shared-mode-planes",
+					      shared_mode_feat->vid_name, TIDSS_MAX_PLANES);
+	if (r < 0) {
+		dev_err(dispc->dev, "failed to read client vid layer name: %d\n", r);
+		return r;
+	}
+
+	r = device_property_read_u32_array(dispc->dev, "ti,dss-shared-mode-vp-owned",
+					   dispc->tidss->shared_mode_owned_vps,
+					   shared_mode_feat->num_vps);
+	if (r < 0) {
+		dev_err(dispc->dev, "failed to read owned vp list: %d\n", r);
+		return r;
+	}
+
+	memset(shared_mode_feat->vp_name, 0, sizeof(shared_mode_feat->vp_name));
+	r = device_property_read_string_array(dispc->dev, "ti,dss-shared-mode-vp",
+					      shared_mode_feat->vp_name, TIDSS_MAX_PORTS);
+	if (r < 0) {
+		dev_err(dispc->dev, "failed to read shared video port name: %d\n", r);
+		return r;
+	}
+
+	memset(shared_mode_feat->vid_order, 0, sizeof(shared_mode_feat->vid_order));
+	r = device_property_read_u32_array(dispc->dev, "ti,dss-shared-mode-plane-zorder",
+					   shared_mode_feat->vid_order,
+					   shared_mode_feat->num_planes);
+	if (r < 0) {
+		dev_err(dispc->dev, "failed to read plane zorder array: %d\n", r);
+		return r;
+	}
+	memcpy((void *)shared_mode_feat->vpclk_name, (void *)shared_mode_feat->vp_name,
+	       sizeof(shared_mode_feat->vpclk_name));
+	memset(shared_mode_feat->ovr_name, 0, sizeof(shared_mode_feat->ovr_name));
+
+	for (i = 0; i < shared_mode_feat->num_vps; i++) {
+		shared_mode_feat->ovr_name[i] = get_ovr_from_vp(shared_mode_feat->vp_name[i]);
+		dev_dbg(dispc->dev, "vp[%d] = %s, ovr[%d] = %s vpclk[%d] = %s vp_owned[%d] = %d\n",
+			i,  shared_mode_feat->vp_name[i], i, shared_mode_feat->ovr_name[i], i,
+			shared_mode_feat->vpclk_name[i], i, dispc->tidss->shared_mode_owned_vps[i]);
+	}
+
+	for (i = 0; i < shared_mode_feat->num_planes; i++) {
+		if (!strncmp("vidl", shared_mode_feat->vid_name[i], 4))
+			shared_mode_feat->vid_lite[i] = true;
+		dev_dbg(dispc->dev, "vid[%d] = %s, vid_order[%d] = %u vid_lite[%d] = %u\n", i,
+			shared_mode_feat->vid_name[i], i, shared_mode_feat->vid_order[i], i,
+			shared_mode_feat->vid_lite[i]);
+	}
+
+	if (!strcmp(shared_mode_feat->common, "common1"))
+		shared_mode_feat->common_regs = tidss_am62_common1_regs;
+
+	dev_dbg(dispc->dev, "common : %s\n", shared_mode_feat->common);
+	dispc_shared_mode_update_bus_type(shared_mode_feat, dispc);
+	dev_dbg(dispc->dev, "Feature list updated for shared mode\n");
+
+	return 0;
+}
+
 void dispc_vp_setup(struct dispc_device *dispc, u32 hw_videoport,
 		    const struct drm_crtc_state *state, bool newmodeset)
 {
@@ -2662,6 +2834,16 @@  void dispc_vp_setup(struct dispc_device *dispc, u32 hw_videoport,
 	dispc_vp_set_color_mgmt(dispc, hw_videoport, state, newmodeset);
 }
 
+static bool dispc_owns_global_common_in_shared_mode(struct dispc_device *dispc)
+{
+	if ((!strcmp(dispc->feat->common, "common") ||
+	     !strcmp(dispc->feat->common, "common_m")) &&
+	     dispc->tidss->shared_mode)
+		return true;
+	else
+		return false;
+}
+
 int dispc_runtime_suspend(struct dispc_device *dispc)
 {
 	dev_dbg(dispc->dev, "suspend\n");
@@ -2846,6 +3028,7 @@  int dispc_init(struct tidss_device *tidss)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct dispc_device *dispc;
 	const struct dispc_features *feat;
+	struct dispc_features *shared_mode_feat;
 	unsigned int i, num_fourccs;
 	int r = 0;
 
@@ -2887,6 +3070,21 @@  int dispc_init(struct tidss_device *tidss)
 
 	dispc->num_fourccs = num_fourccs;
 
+	if (tidss->shared_mode) {
+		dev_dbg(dev, "%s : DSS is being shared with remote core\n", __func__);
+		shared_mode_feat = devm_kzalloc(dev, sizeof(*shared_mode_feat), GFP_KERNEL);
+		if (!shared_mode_feat)
+			return -ENOMEM;
+
+		r = dispc_update_shared_mode_features(shared_mode_feat, dispc);
+		if (r)
+			return r;
+
+		tidss->feat = (const struct dispc_features *)shared_mode_feat;
+		feat = tidss->feat;
+		dispc->feat = feat;
+	}
+
 	dispc_common_regmap = dispc->feat->common_regs;
 
 	r = dispc_iomap_resource(pdev, dispc->feat->common,
@@ -2933,25 +3131,37 @@  int dispc_init(struct tidss_device *tidss)
 	}
 
 	if (feat->subrev == DISPC_AM65X) {
-		r = dispc_init_am65x_oldi_io_ctrl(dev, dispc);
-		if (r)
-			return r;
+		/*
+		 * For shared mode, initialize the OLDI IO control only if we own
+		 * the OLDI Tx ports.
+		 */
+		if (!tidss->shared_mode || tidss->shared_mode_own_oldi) {
+			r = dispc_init_am65x_oldi_io_ctrl(dev, dispc);
+			if (r)
+				return r;
+		}
 	}
 
-	dispc->fclk = devm_clk_get(dev, "fck");
-	if (IS_ERR(dispc->fclk)) {
-		dev_err(dev, "%s: Failed to get fclk: %ld\n",
-			__func__, PTR_ERR(dispc->fclk));
-		return PTR_ERR(dispc->fclk);
-	}
-	dev_dbg(dev, "DSS fclk %lu Hz\n", clk_get_rate(dispc->fclk));
+	/*
+	 * For shared mode, initialize the hardware and clocking only if the processing
+	 * core running Linux has ownership of the DSS global register space.
+	 */
+	if (!tidss->shared_mode || dispc_owns_global_common_in_shared_mode(dispc)) {
+		dispc->fclk = devm_clk_get(dev, "fck");
+		if (IS_ERR(dispc->fclk)) {
+			dev_err(dev, "%s: Failed to get fclk: %ld\n",
+				__func__, PTR_ERR(dispc->fclk));
+			return PTR_ERR(dispc->fclk);
+		}
+		dev_dbg(dev, "DSS fclk %lu Hz\n", clk_get_rate(dispc->fclk));
 
-	of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
-			     &dispc->memory_bandwidth_limit);
+		of_property_read_u32(dispc->dev->of_node, "max-memory-bandwidth",
+				     &dispc->memory_bandwidth_limit);
 
-	r = dispc_init_hw(dispc);
-	if (r)
-		return r;
+		r = dispc_init_hw(dispc);
+		if (r)
+			return r;
+	}
 
 	tidss->dispc = dispc;
 
diff --git a/drivers/gpu/drm/tidss/tidss_dispc.h b/drivers/gpu/drm/tidss/tidss_dispc.h
index 086327d51a90..368a39941b34 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.h
+++ b/drivers/gpu/drm/tidss/tidss_dispc.h
@@ -80,7 +80,7 @@  struct dispc_features {
 	const char *vp_name[TIDSS_MAX_PORTS]; /* Should match dt reg names */
 	const char *ovr_name[TIDSS_MAX_PORTS]; /* Should match dt reg names */
 	const char *vpclk_name[TIDSS_MAX_PORTS]; /* Should match dt clk names */
-	const enum dispc_vp_bus_type vp_bus_type[TIDSS_MAX_PORTS];
+	enum dispc_vp_bus_type vp_bus_type[TIDSS_MAX_PORTS];
 	struct tidss_vp_feat vp_feat;
 	u32 num_planes;
 	const char *vid_name[TIDSS_MAX_PLANES]; /* Should match dt reg names */
diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c
index d15f836dca95..141481635578 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -32,6 +32,10 @@  int tidss_runtime_get(struct tidss_device *tidss)
 
 	dev_dbg(tidss->dev, "%s\n", __func__);
 
+	/* No PM in display sharing mode */
+	if (tidss->shared_mode)
+		return 0;
+
 	r = pm_runtime_resume_and_get(tidss->dev);
 	WARN_ON(r < 0);
 	return r;
@@ -43,6 +47,9 @@  void tidss_runtime_put(struct tidss_device *tidss)
 
 	dev_dbg(tidss->dev, "%s\n", __func__);
 
+	if (tidss->shared_mode)
+		return;
+
 	pm_runtime_mark_last_busy(tidss->dev);
 
 	r = pm_runtime_put_autosuspend(tidss->dev);
@@ -140,21 +147,23 @@  static int tidss_probe(struct platform_device *pdev)
 
 	spin_lock_init(&tidss->wait_lock);
 
+	tidss->shared_mode = device_property_read_bool(dev, "ti,dss-shared-mode");
 	ret = dispc_init(tidss);
 	if (ret) {
 		dev_err(dev, "failed to initialize dispc: %d\n", ret);
 		return ret;
 	}
 
-	pm_runtime_enable(dev);
-
-	pm_runtime_set_autosuspend_delay(dev, 1000);
-	pm_runtime_use_autosuspend(dev);
+	if (!tidss->shared_mode) {
+		pm_runtime_enable(dev);
+		pm_runtime_set_autosuspend_delay(dev, 1000);
+		pm_runtime_use_autosuspend(dev);
 
 #ifndef CONFIG_PM
-	/* If we don't have PM, we need to call resume manually */
-	dispc_runtime_resume(tidss->dispc);
+		/* If we don't have PM, we need to call resume manually */
+		dispc_runtime_resume(tidss->dispc);
 #endif
+	}
 
 	ret = tidss_modeset_init(tidss);
 	if (ret < 0) {
@@ -196,6 +205,8 @@  static int tidss_probe(struct platform_device *pdev)
 	tidss_irq_uninstall(ddev);
 
 err_runtime_suspend:
+	if (tidss->shared_mode)
+		return ret;
 #ifndef CONFIG_PM
 	dispc_runtime_suspend(tidss->dispc);
 #endif
@@ -219,12 +230,14 @@  static void tidss_remove(struct platform_device *pdev)
 
 	tidss_irq_uninstall(ddev);
 
+	if (!tidss->shared_mode) {
 #ifndef CONFIG_PM
-	/* If we don't have PM, we need to call suspend manually */
-	dispc_runtime_suspend(tidss->dispc);
+		/* If we don't have PM, we need to call suspend manually */
+		dispc_runtime_suspend(tidss->dispc);
 #endif
-	pm_runtime_dont_use_autosuspend(dev);
-	pm_runtime_disable(dev);
+		pm_runtime_dont_use_autosuspend(dev);
+		pm_runtime_disable(dev);
+	}
 
 	/* devm allocated dispc goes away with the dev so mark it NULL */
 	dispc_remove(tidss);
diff --git a/drivers/gpu/drm/tidss/tidss_drv.h b/drivers/gpu/drm/tidss/tidss_drv.h
index d7f27b0b0315..68d53c70651d 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.h
+++ b/drivers/gpu/drm/tidss/tidss_drv.h
@@ -31,6 +31,12 @@  struct tidss_device {
 
 	spinlock_t wait_lock;	/* protects the irq masks */
 	dispc_irq_t irq_mask;	/* enabled irqs in addition to wait_list */
+
+	bool shared_mode; /* DSS resources shared between remote core and Linux */
+
+	/* 1: VP owned by Linux, 0: VP owned by the remote core and shared with Linux */
+	u32 shared_mode_owned_vps[TIDSS_MAX_PORTS];
+	bool shared_mode_own_oldi; /* Linux needs to configure OLDI in shared mode */
 };
 
 #define to_tidss(__dev) container_of(__dev, struct tidss_device, ddev)