diff mbox

[2/2] drm/i915: Render decompression support for Gen9 and above

Message ID 1458319853-22631-2-git-send-email-vandana.kannan@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

vandana.kannan@intel.com March 18, 2016, 4:50 p.m. UTC
This patch includes enabling render decompression (RC) after checking all the
requirements (format, tiling, rotation etc.). Along with this, the WAs
mentioned in BSpec Workaround page have been implemented.

TODO:
1. Disable stereo 3D when render decomp is enabled (bit 7:6)
2. Render decompression must not be used in VTd pass-through mode
3. Program hashing select CHICKEN_MISC1 bit 15
4. For Gen10, add support for RGB 1010102
5. RC-FBC workaround
6. RC watermark calculation

The reason for using a plane property instead of fb modifier:-
In Android, OGL passes a render compressed buffer to hardware composer (HWC),
which would then request a flip on that buffer after checking if the target
can support render compressed buffer. For example, only planes 1 and 2 on
pipes 1 and 2 can support RC. In case the target cannot support it, HWC will
revert back to OGL requesting for uncompressed buffer.
Here,
if plane property is used, OGL would send back the buffer (same ID) after
decompression, marked uncompressed. If fb modifier was used, a different
version of the buffer would have to be maintained for different combinations -
in the simple case of render compressed vs uncompressed buffer, there would be
2 fbs with 2 IDs. So, in this case, OGL would give a reference to a fb with a
different ID.
To avoid the difficulty of keeping track of multiple fbs and the subsequent
complexity in debug, the architecture forum decided to go ahead with a plane
property for RC.

[Mayuresh] Added the plane check in skl_check_compression()

Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Cc: Smith, Gary K <gary.k.smith@intel.com>
Cc: Daniel Stone <daniel@fooishbar.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h           |   1 +
 drivers/gpu/drm/i915/i915_reg.h           |  22 ++++
 drivers/gpu/drm/i915/intel_atomic_plane.c |  24 +++-
 drivers/gpu/drm/i915/intel_display.c      | 206 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_drv.h          |  24 +++-
 drivers/gpu/drm/i915/intel_sprite.c       |  42 +++++-
 6 files changed, 305 insertions(+), 14 deletions(-)

Comments

Daniel Stone March 18, 2016, 5:05 p.m. UTC | #1
Hi Vandana,

On 18 March 2016 at 16:50, Vandana Kannan <vandana.kannan@intel.com> wrote:
> The reason for using a plane property instead of fb modifier:-
> In Android, OGL passes a render compressed buffer to hardware composer (HWC),
> which would then request a flip on that buffer after checking if the target
> can support render compressed buffer. For example, only planes 1 and 2 on
> pipes 1 and 2 can support RC. In case the target cannot support it, HWC will
> revert back to OGL requesting for uncompressed buffer.

I still don't believe we can support this in non-Android userspace, so
I don't have any other comment on this patch.

Cheers,
Daniel
Ville Syrjala March 18, 2016, 5:12 p.m. UTC | #2
On Fri, Mar 18, 2016 at 10:20:53PM +0530, Vandana Kannan wrote:
> This patch includes enabling render decompression (RC) after checking all the
> requirements (format, tiling, rotation etc.). Along with this, the WAs
> mentioned in BSpec Workaround page have been implemented.
> 
> TODO:
> 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
> 2. Render decompression must not be used in VTd pass-through mode
> 3. Program hashing select CHICKEN_MISC1 bit 15
> 4. For Gen10, add support for RGB 1010102
> 5. RC-FBC workaround
> 6. RC watermark calculation
> 
> The reason for using a plane property instead of fb modifier:-
> In Android, OGL passes a render compressed buffer to hardware composer (HWC),
> which would then request a flip on that buffer after checking if the target
> can support render compressed buffer. For example, only planes 1 and 2 on
> pipes 1 and 2 can support RC. In case the target cannot support it, HWC will
> revert back to OGL requesting for uncompressed buffer.
> Here,
> if plane property is used, OGL would send back the buffer (same ID) after
> decompression, marked uncompressed. If fb modifier was used, a different
> version of the buffer would have to be maintained for different combinations -
> in the simple case of render compressed vs uncompressed buffer, there would be
> 2 fbs with 2 IDs. So, in this case, OGL would give a reference to a fb with a
> different ID.
> To avoid the difficulty of keeping track of multiple fbs and the subsequent
> complexity in debug, the architecture forum decided to go ahead with a plane
> property for RC.
> 
> [Mayuresh] Added the plane check in skl_check_compression()
> 
> Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
> Cc: Smith, Gary K <gary.k.smith@intel.com>
> Cc: Daniel Stone <daniel@fooishbar.org>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_drv.h           |   1 +
>  drivers/gpu/drm/i915/i915_reg.h           |  22 ++++
>  drivers/gpu/drm/i915/intel_atomic_plane.c |  24 +++-
>  drivers/gpu/drm/i915/intel_display.c      | 206 +++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_drv.h          |  24 +++-
>  drivers/gpu/drm/i915/intel_sprite.c       |  42 +++++-
>  6 files changed, 305 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f37ac12..bb47ee1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1929,6 +1929,7 @@ struct drm_i915_private {
>  
>  	struct drm_property *broadcast_rgb_property;
>  	struct drm_property *force_audio_property;
> +	struct drm_property *render_comp_property;
>  
>  	/* hda/i915 audio component */
>  	struct i915_audio_component *audio_component;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 7dfc400..773c37f 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -5756,6 +5756,28 @@ enum skl_disp_power_wells {
>  			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
>  			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
>  
> +#define PLANE_AUX_DIST_1_A		0x701c0
> +#define PLANE_AUX_DIST_2_A		0x702c0
> +#define PLANE_AUX_DIST_1_B		0x711c0
> +#define PLANE_AUX_DIST_2_B		0x712c0
> +#define _PLANE_AUX_DIST_1(pipe) \
> +			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
> +#define _PLANE_AUX_DIST_2(pipe) \
> +			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
> +#define PLANE_AUX_DIST(pipe, plane)     \
> +		_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
> +
> +#define PLANE_AUX_OFFSET_1_A		0x701c4
> +#define PLANE_AUX_OFFSET_2_A		0x702c4
> +#define PLANE_AUX_OFFSET_1_B		0x711c4
> +#define PLANE_AUX_OFFSET_2_B		0x712c4
> +#define _PLANE_AUX_OFFSET_1(pipe)       \
> +		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
> +#define _PLANE_AUX_OFFSET_2(pipe)       \
> +		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
> +#define PLANE_AUX_OFFSET(pipe, plane)   \
> +		_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
> +
>  /* legacy palette */
>  #define _LGC_PALETTE_A           0x4a000
>  #define _LGC_PALETTE_B           0x4a800
> diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c
> index e0b851a..c431333 100644
> --- a/drivers/gpu/drm/i915/intel_atomic_plane.c
> +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
> @@ -230,8 +230,16 @@ intel_plane_atomic_get_property(struct drm_plane *plane,
>  				struct drm_property *property,
>  				uint64_t *val)
>  {
> -	DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
> -	return -EINVAL;
> +	struct drm_i915_private *dev_priv = state->plane->dev->dev_private;
> +	struct intel_plane_state *intel_state = to_intel_plane_state(state);
> +
> +	if (property == dev_priv->render_comp_property) {
> +		*val = intel_state->render_comp_enable;
> +	} else {
> +		DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
> +		return -EINVAL;
> +	}
> +	return 0;
>  }
>  
>  /**
> @@ -252,6 +260,14 @@ intel_plane_atomic_set_property(struct drm_plane *plane,
>  				struct drm_property *property,
>  				uint64_t val)
>  {
> -	DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
> -	return -EINVAL;
> +	struct drm_i915_private *dev_priv = state->plane->dev->dev_private;
> +	struct intel_plane_state *intel_state = to_intel_plane_state(state);
> +
> +	if (property == dev_priv->render_comp_property) {
> +		intel_state->render_comp_enable = val;
> +	} else {
> +		DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
> +		return -EINVAL;
> +	}
> +	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index b2fcbe6..676507d 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -117,6 +117,9 @@ static void ironlake_pfit_disable(struct intel_crtc *crtc, bool force);
>  static void ironlake_pfit_enable(struct intel_crtc *crtc);
>  static void intel_modeset_setup_hw_state(struct drm_device *dev);
>  static void intel_pre_disable_primary(struct drm_crtc *crtc);
> +static int skl_check_compression(struct drm_device *dev,
> +		struct intel_plane_state *plane_state,
> +		enum pipe pipe, int x, int y);
>  
>  typedef struct {
>  	int	min, max;
> @@ -3573,7 +3576,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	struct drm_framebuffer *fb = plane_state->base.fb;
>  	int pipe = intel_crtc->pipe;
>  	u32 plane_ctl;
> -	unsigned int rotation = plane_state->base.rotation;
> +	unsigned int rotation = plane_state->base.rotation, render_comp;
>  	u32 stride = skl_plane_stride(fb, 0, rotation);
>  	u32 surf_addr = plane_state->main.offset;
>  	int scaler_id = plane_state->scaler_id;
> @@ -3585,6 +3588,9 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	int dst_y = plane_state->dst.y1;
>  	int dst_w = drm_rect_width(&plane_state->dst);
>  	int dst_h = drm_rect_height(&plane_state->dst);
> +	unsigned long aux_dist = 0;
> +	u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0;
> +	u32 tile_row_adjustment = 0;
>  
>  	plane_ctl = PLANE_CTL_ENABLE |
>  		    PLANE_CTL_PIPE_GAMMA_ENABLE |
> @@ -3604,10 +3610,43 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
>  	intel_crtc->adjusted_x = src_x;
>  	intel_crtc->adjusted_y = src_y;
>  
> +	render_comp = plane_state->render_comp_enable;
> +	if (render_comp) {
> +		u32 tile_height = PAGE_SIZE /
> +			intel_fb_stride_alignment(dev_priv, fb->modifier[0],
> +					fb->pixel_format);
> +
> +		u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]);
> +		tile_row_adjustment = height_in_mem % tile_height;
> +		aux_dist = (fb->pitches[0] *
> +				(height_in_mem - tile_row_adjustment));
> +		aux_stride = skl_plane_stride(fb, 1, rotation);

Could you go read my fb offsets series [1] and see if there's something
there that doesn't agree with render compression?

[1] https://lists.freedesktop.org/archives/intel-gfx/2016-February/087660.html

> +		plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE;
> +	} else {
> +		plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE;
> +	}
> +
>  	I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl);
>  	I915_WRITE(PLANE_OFFSET(pipe, 0), (src_y << 16) | src_x);
>  	I915_WRITE(PLANE_STRIDE(pipe, 0), stride);
>  	I915_WRITE(PLANE_SIZE(pipe, 0), (src_h << 16) | src_w);
> +	I915_WRITE(PLANE_AUX_DIST(pipe, 0), aux_dist | aux_stride);
> +	I915_WRITE(PLANE_AUX_OFFSET(pipe, 0), aux_y_offset<<16 | aux_x_offset);
> +
> +	/*
> +	 * Per bspec, for SKL C and BXT A steppings, when the plane source pixel
> +	 * format is NV12, the CHICKEN_PIPESL register bit 22 must be set to 1.
> +	 * When render compression is enabled, bit 22 must be set to 0.
> +	 */
> +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
> +			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {

Does anyone really care about these anymore?

Also you really should fix your editor to wrap lines correctly.

> +		u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe));
> +		if (render_comp) {
> +			if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS)
> +				I915_WRITE(CHICKEN_PIPESL_1(pipe),
> +						temp & ~HSW_FBCQ_DIS);
> +		}
> +	}
>  
>  	if (scaler_id >= 0) {
>  		uint32_t ps_ctrl = 0;
> @@ -12333,6 +12372,17 @@ int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
>  			to_intel_plane_state(plane_state));
>  		if (ret)
>  			return ret;
> +
> +		if (fb && to_intel_plane_state(plane_state)->
> +				render_comp_enable) {
> +			ret = skl_check_compression(dev,
> +					to_intel_plane_state(plane_state),
> +					intel_crtc->pipe, crtc->x, crtc->y);
> +			if (ret) {
> +				DRM_DEBUG_KMS("Render compr checks failed\n");
> +				return ret;
> +			}
> +		}
>  	}
>  
>  	was_visible = old_plane_state->visible;
> @@ -12388,7 +12438,6 @@ int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
>  	case DRM_PLANE_TYPE_PRIMARY:
>  		intel_crtc->atomic.post_enable_primary = turn_on;
>  		intel_crtc->atomic.update_fbc = true;
> -
>  		break;
>  	case DRM_PLANE_TYPE_CURSOR:
>  		break;
> @@ -12516,6 +12565,41 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>  		}
>  	}
>  
> +	/*
> +	 * On SKL:*:C and BXT:*:A, there is a possible hang with NV12 format.
> +	 * WA: render decompression must not be enabled on any of the planes in
> +	 * that pipe.
> +	 */
> +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
> +			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {

More extinct animals

> +		struct drm_plane *p;
> +		bool rc_enabled = false, nv12_enabled = false;
> +
> +		drm_for_each_plane_mask(p, dev, crtc_state->plane_mask) {
> +			struct drm_plane_state *plane_state =
> +				drm_atomic_get_plane_state(state, p);
> +
> +			if (plane_state) {
> +				if (to_intel_plane_state(plane_state)->
> +						render_comp_enable)
> +					rc_enabled = true;
> +
> +				if (plane_state->fb &&
> +						plane_state->fb->pixel_format ==
> +						DRM_FORMAT_NV12)
> +					nv12_enabled = true;
> +			}
> +
> +		}
> +		if (rc_enabled && nv12_enabled) {
> +			DRM_DEBUG_KMS("RC should not be enabled "
> +					"on any plane of the "
> +					"pipe on which NV12 is "
> +					"enabled\n");
> +			return -EINVAL;
> +		}
> +	}
> +
>  	if (INTEL_INFO(dev)->gen >= 9) {
>  		if (mode_changed)
>  			ret = skl_update_scaler_crtc(pipe_config);
> @@ -14517,6 +14601,91 @@ skl_max_scale(struct intel_crtc *intel_crtc, struct intel_crtc_state *crtc_state
>  	return max_scale;
>  }
>  
> +static int skl_check_compression(struct drm_device *dev,
> +		struct intel_plane_state *plane_state,
> +		enum pipe pipe, int x, int y)
> +{
> +	struct drm_framebuffer *fb = plane_state->base.fb;
> +	struct drm_plane *plane = plane_state->base.plane;
> +	int x_offset;
> +	int src_w = drm_rect_width(&plane_state->src) >> 16;
> +
> +	if (!IS_SKYLAKE(dev) && !IS_BROXTON(dev)) {
> +		DRM_DEBUG_KMS("RC support on CNL+ needs to be revisited\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * TODO:
> +	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
> +	 * 2. Render decompression must not be used in VTd pass-through mode
> +	 * 3. Program hashing select CHICKEN_MISC1 bit 15
> +	 */
> +
> +	/*
> +	 * On SKL A and SKL B,
> +	 * Do not enable render decompression when the plane
> +	 * width is smaller than 32 pixels or greater than
> +	 * 2048 pixels
> +	 */
> +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) && ((src_w < 32) || (src_w > 2048))) {a

And more

> +		DRM_DEBUG_KMS("SKL-A, SKL-B RC: width > 2048 or < 32\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Conditions to satisfy before enabling render decomp.
> +	 * SKL+
> +	 * Pipe A & B, Planes 1 & 2
> +	 * RGB8888 Tile-Y format
> +	 * 0/180 rotation
> +	 */
> +	if (pipe == PIPE_C) {
> +		DRM_DEBUG_KMS("RC supported only on pipe A & B\n");
> +		return -EINVAL;
> +	}

Shouldn't have added the prop on pipe C at all then. Or the prop
should advertise only uncompressed.

> +
> +	if (!(plane->type == DRM_PLANE_TYPE_PRIMARY ||
> +				(plane->type == DRM_PLANE_TYPE_OVERLAY &&
> +				 to_intel_plane(plane)->plane == PLANE_A))) {

We should really have a better way to refer to specific planes. Not sure
of anything happened on that front.

> +		DRM_DEBUG_KMS("RC supported only on planes 1 & 2\n");
> +		return -EINVAL;
> +	}
> +
> +	if (intel_rotation_90_or_270(plane_state->base.rotation)) {
> +		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation\n");
> +		return -EINVAL;
> +	}
> +
> +	if ((fb->modifier[0] == DRM_FORMAT_MOD_NONE) ||
> +			(fb->modifier[0] == I915_FORMAT_MOD_X_TILED)) {

Better check for the specific things you do support I think.

> +		DRM_DEBUG_KMS("RC supported only with Y-tile\n");
> +		return -EINVAL;
> +	}
> +
> +	if ((fb->pixel_format != DRM_FORMAT_XBGR8888) &&
> +			(fb->pixel_format != DRM_FORMAT_ABGR8888)) {

Presumably we shouldn't have allowed such an fb to be even created.

> +		DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * For SKL & BXT,
> +	 * When the render compression is enabled with plane
> +	 * width greater than 3840 and horizontal panning,
> +	 * the stride programmed in the PLANE_STRIDE register
> +	 * must be multiple of 4.
> +	 */
> +	x_offset = x;
> +
> +	if (src_w > 3840 && x_offset != 0) {
> +		DRM_DEBUG_KMS("RC: width > 3840, horizontal panning\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
>  static int
>  intel_check_primary_plane(struct drm_plane *plane,
>  			  struct intel_crtc_state *crtc_state,
> @@ -14679,6 +14848,9 @@ static struct drm_plane *intel_primary_plane_create(struct drm_device *dev,
>  	if (INTEL_INFO(dev)->gen >= 4)
>  		intel_create_rotation_property(dev, primary);
>  
> +	if (INTEL_INFO(dev)->gen >= 9)
> +		intel_create_render_comp_property(dev, primary);
> +
>  	drm_plane_helper_add(&primary->base, &intel_plane_helper_funcs);
>  
>  	return &primary->base;
> @@ -14702,6 +14874,36 @@ void intel_create_rotation_property(struct drm_device *dev, struct intel_plane *
>  				plane->base.state->rotation);
>  }
>  
> +void intel_create_render_comp_property(struct drm_device *dev,
> +		struct intel_plane *plane)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	static const struct drm_prop_enum_list rc_status[] = {
> +		{ COMP_UNCOMPRESSED,   "Uncompressed/not capable" },
> +		{ COMP_RENDER,  "Only render decompression" },
> +	};
> +
> +	if (!dev_priv->render_comp_property) {
> +		dev_priv->render_comp_property =
> +			drm_property_create_bitmask(dev, 0,
> +					"render compression",
> +					rc_status, ARRAY_SIZE(rc_status),
> +					BIT(COMP_UNCOMPRESSED) |
> +					BIT(COMP_RENDER));

Why is it a bitmask? Are you expecting the user to set multiple bits?

So far it looks more like a yes/no type of thing than an enum even. Are
we expecting more possible values here?

Also seems like this really should have different set of supported
values per plane.

> +		if (!dev_priv->render_comp_property) {
> +			DRM_ERROR("RC: Failed to create property\n");
> +			return;
> +		}
> +	}
> +
> +	if (dev_priv->render_comp_property) {
> +		drm_object_attach_property(&plane->base.base,
> +				dev_priv->render_comp_property, 0);
> +	}
> +	dev->mode_config.allow_aux_plane = true;
> +}
> +
>  static int
>  intel_check_cursor_plane(struct drm_plane *plane,
>  			 struct intel_crtc_state *crtc_state,
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 68dd7ac..3158d9f 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -154,7 +154,7 @@ struct intel_framebuffer {
>  	struct drm_i915_gem_object *obj;
>  	struct intel_rotation_info rot_info;
>  
> -	/* for each plane in the normal GTT view */                             
> +	/* for each plane in the normal GTT view */
>  	struct {
>  		unsigned int x, y;
>  	} normal[2];
> @@ -313,6 +313,10 @@ struct intel_atomic_state {
>  	bool skip_intermediate_wm;
>  };
>  
> +/* render compression property bits */
> +#define COMP_UNCOMPRESSED          0
> +#define COMP_RENDER                1
> +
>  struct intel_plane_state {
>  	struct drm_plane_state base;
>  	struct drm_rect src;
> @@ -352,6 +356,9 @@ struct intel_plane_state {
>  
>  	/* async flip related structures */
>  	struct drm_i915_gem_request *wait_req;
> +
> +	/* Render compression */
> +	unsigned int render_comp_enable;
>  };
>  
>  struct intel_initial_plane_config {
> @@ -1135,11 +1142,11 @@ void i915_audio_component_cleanup(struct drm_i915_private *dev_priv);
>  
>  /* intel_display.c */
>  extern const struct drm_plane_funcs intel_plane_funcs;
> -unsigned int intel_fb_xy_to_linear(int x, int y,                               
> -		const struct intel_plane_state *state,       
> -		int plane);                                  
> -void intel_add_fb_offsets(int *x, int *y,                                      
> -		const struct intel_plane_state *state, int plane); 
> +unsigned int intel_fb_xy_to_linear(int x, int y,
> +		const struct intel_plane_state *state,
> +		int plane);
> +void intel_add_fb_offsets(int *x, int *y,
> +		const struct intel_plane_state *state, int plane);
>  unsigned int intel_rotation_info_size(const struct intel_rotation_info *rot_info);
>  bool intel_has_pending_fb_unpin(struct drm_device *dev);
>  void intel_mark_busy(struct drm_device *dev);
> @@ -1223,6 +1230,9 @@ intel_rotation_90_or_270(unsigned int rotation)
>  void intel_create_rotation_property(struct drm_device *dev,
>  					struct intel_plane *plane);
>  
> +void intel_create_render_comp_property(struct drm_device *dev,
> +		struct intel_plane *plane);
> +
>  /* shared dpll functions */
>  struct intel_shared_dpll *intel_crtc_to_shared_dpll(struct intel_crtc *crtc);
>  void assert_shared_dpll(struct drm_i915_private *dev_priv,
> @@ -1252,7 +1262,7 @@ void assert_fdi_rx_pll(struct drm_i915_private *dev_priv,
>  void assert_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, bool state);
>  #define assert_pipe_enabled(d, p) assert_pipe(d, p, true)
>  #define assert_pipe_disabled(d, p) assert_pipe(d, p, false)
> -u32 intel_compute_tile_offset(int *x, int *y,                                  
> +u32 intel_compute_tile_offset(int *x, int *y,
>  		const struct intel_plane_state *state, int plane);
>  void intel_prepare_reset(struct drm_device *dev);
>  void intel_finish_reset(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 75bf01d..bb6bbde 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -191,7 +191,7 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	u32 plane_ctl;
>  	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
>  	u32 surf_addr = plane_state->main.offset;
> -	unsigned int rotation = plane_state->base.rotation;
> +	unsigned int rotation = plane_state->base.rotation, render_comp;
>  	u32 stride = skl_plane_stride(fb, 0, rotation);
>  	int crtc_x = plane_state->dst.x1;
>  	int crtc_y = plane_state->dst.y1;
> @@ -203,6 +203,9 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	uint32_t src_h = drm_rect_height(&plane_state->src) >> 16;
>  	const struct intel_scaler *scaler =
>  		&crtc_state->scaler_state.scalers[plane_state->scaler_id];
> +	unsigned long aux_dist = 0;
> +	u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0;
> +	u32 tile_row_adjustment = 0;
>  
>  	plane_ctl = PLANE_CTL_ENABLE |
>  		PLANE_CTL_PIPE_GAMMA_ENABLE |
> @@ -230,9 +233,43 @@ skl_update_plane(struct drm_plane *drm_plane,
>  	crtc_w--;
>  	crtc_h--;
>  
> +	render_comp = plane_state->render_comp_enable;
> +	if (render_comp) {
> +		u32 tile_height = PAGE_SIZE /
> +			intel_fb_stride_alignment(dev_priv, fb->modifier[0],
> +					fb->pixel_format);
> +
> +		u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]);
> +		tile_row_adjustment = height_in_mem % tile_height;
> +		aux_dist = (fb->pitches[0] *
> +				(height_in_mem - tile_row_adjustment));
> +		aux_stride = skl_plane_stride(fb, 1, rotation);
> +		plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE;
> +	} else {
> +		plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE;
> +	}
> +
>  	I915_WRITE(PLANE_OFFSET(pipe, plane), (y << 16) | x);
>  	I915_WRITE(PLANE_STRIDE(pipe, plane), stride);
>  	I915_WRITE(PLANE_SIZE(pipe, plane), (src_h << 16) | src_w);
> +	I915_WRITE(PLANE_AUX_DIST(pipe, plane), aux_dist | aux_stride);
> +	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane),
> +			aux_y_offset<<16 | aux_x_offset);
> +
> +	/*
> +	 * Per bspec, for SKL C and BXT A steppings, when the plane source pixel
> +	 * format is NV12, the CHICKEN_PIPESL register bit 22 must be set to 1.
> +	 * When render compression is enabled, bit 22 must be set to 0.
> +	 */
> +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
> +			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
> +		u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe));
> +		if (render_comp) {
> +			if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS)
> +				I915_WRITE(CHICKEN_PIPESL_1(pipe),
> +						temp & ~HSW_FBCQ_DIS);
> +		}
> +	}
>  
>  	/* program plane scaler */
>  	if (plane_state->scaler_id >= 0) {
> @@ -1092,6 +1129,9 @@ intel_plane_init(struct drm_device *dev, enum pipe pipe, int plane)
>  
>  	intel_create_rotation_property(dev, intel_plane);
>  
> +	if (INTEL_INFO(dev)->gen >= 9)
> +		intel_create_render_comp_property(dev, intel_plane);
> +
>  	drm_plane_helper_add(&intel_plane->base, &intel_plane_helper_funcs);
>  
>  out:
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Smith, Gary K March 21, 2016, 12:20 p.m. UTC | #3
Hi,

What are the objections to this change?

Thanks
Gary

-----Original Message-----
From: Daniel Stone [mailto:daniel@fooishbar.org] 

Sent: Friday, March 18, 2016 5:05 PM
To: Kannan, Vandana <vandana.kannan@intel.com>
Cc: intel-gfx <intel-gfx@lists.freedesktop.org>; Smith, Gary K <gary.k.smith@intel.com>; Daniel Vetter <daniel@ffwll.ch>
Subject: Re: [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

Hi Vandana,

On 18 March 2016 at 16:50, Vandana Kannan <vandana.kannan@intel.com> wrote:
> The reason for using a plane property instead of fb modifier:- In 

> Android, OGL passes a render compressed buffer to hardware composer 

> (HWC), which would then request a flip on that buffer after checking 

> if the target can support render compressed buffer. For example, only 

> planes 1 and 2 on pipes 1 and 2 can support RC. In case the target 

> cannot support it, HWC will revert back to OGL requesting for uncompressed buffer.


I still don't believe we can support this in non-Android userspace, so I don't have any other comment on this patch.

Cheers,
Daniel
---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
Daniel Stone March 22, 2016, 1:30 p.m. UTC | #4
Hi Gary,

On 21 March 2016 at 12:20, Smith, Gary K <gary.k.smith@intel.com> wrote:
> What are the objections to this change?

Exactly the same as the last time we discussed it: that you're abusing
plane properties to contain fundamental properties of the buffer you
want to scan out, i.e. that which naturally belongs in the framebuffer
object rather than a separate plane property. And doing it in a way
which is prone to failure when userspace is unaware of these
properties too.

That being said, I hadn't noticed that this had moved to being a
purely Intel-private property, rather than touching Android core at
all, so if you can get it past the Intel DRM maintainers, then fine. I
was primarily watching out for core DRM.

Cheers,
Daniel
Daniel Stone March 22, 2016, 1:36 p.m. UTC | #5
On 22 March 2016 at 13:30, Daniel Stone <daniel@fooishbar.org> wrote:
> Exactly the same as the last time we discussed it

I should add that I understand your previous objection that creating
framebuffers on the fly is not performant enough, and you object to
the effort of managing 100 rather than 50 framebuffers upfront (though
honestly, if you get to 50 framebuffers you're already having to do
some clever setup/management anyway). But in the last thread, Daniel
Vetter asked for some performance numbers to bear out your objection
that framebuffer creation is too costly, leading to getting it fixed
if this is in fact the case (since other userspace relies on it being
fast), but this performance analysis never arrived.

I'd still be interested in seeing that.

Cheers,
Daniel
Smith, Gary K March 22, 2016, 2:59 p.m. UTC | #6
I thought we had concluded previously that this change was acceptable, which is why I'm surprised that it's come up yet again.

The performance cost of creating two fb objects per buffer is irrelevant - it will be a one shot hit on buffer creation, and from that point forward it's just selecting which of the two fb's to use at any point in time. Given that I'm told that the memory cost kernel side per fb is small, the number isn't a big deal either. Hence, I'm not sure why you were expecting a performance analysis.

The two things I object to are:

1) Having to tell the kernel that there is no aux buffer on an allocation that actually has an aux buffer in order to indicate that at this point in time the buffer is uncompressed. This is completely non intuitive from a caller perspective - especially as it's called an aux buffer, not a compression buffer.

2) Having to use a fb object to manage dynamic state. The fb for a particular buffer will change over the course of a frame. Any debug for a frame at entry to the HWC will have a different fb to the debug at frame exit which makes it hard to track.

Thanks
Gary


-----Original Message-----
From: Daniel Stone [mailto:daniel@fooishbar.org] 

Sent: Tuesday, March 22, 2016 1:37 PM
To: Smith, Gary K <gary.k.smith@intel.com>
Cc: Kannan, Vandana <vandana.kannan@intel.com>; intel-gfx <intel-gfx@lists.freedesktop.org>; Daniel Vetter <daniel@ffwll.ch>
Subject: Re: [PATCH 2/2] drm/i915: Render decompression support for Gen9 and above

On 22 March 2016 at 13:30, Daniel Stone <daniel@fooishbar.org> wrote:
> Exactly the same as the last time we discussed it


I should add that I understand your previous objection that creating framebuffers on the fly is not performant enough, and you object to the effort of managing 100 rather than 50 framebuffers upfront (though honestly, if you get to 50 framebuffers you're already having to do some clever setup/management anyway). But in the last thread, Daniel Vetter asked for some performance numbers to bear out your objection that framebuffer creation is too costly, leading to getting it fixed if this is in fact the case (since other userspace relies on it being fast), but this performance analysis never arrived.

I'd still be interested in seeing that.

Cheers,
Daniel
---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
vandana.kannan@intel.com April 25, 2016, 5 a.m. UTC | #7
> -----Original Message-----
> From: Ville Syrjälä [mailto:ville.syrjala@linux.intel.com]
> Sent: Friday, March 18, 2016 10:42 PM
> To: Kannan, Vandana <vandana.kannan@intel.com>
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 2/2] drm/i915: Render decompression
> support for Gen9 and above

[...]
 
> > @@ -3573,7 +3576,7 @@ static void
> skylake_update_primary_plane(struct drm_plane *plane,
> >  	struct drm_framebuffer *fb = plane_state->base.fb;
> >  	int pipe = intel_crtc->pipe;
> >  	u32 plane_ctl;
> > -	unsigned int rotation = plane_state->base.rotation;
> > +	unsigned int rotation = plane_state->base.rotation, render_comp;
> >  	u32 stride = skl_plane_stride(fb, 0, rotation);
> >  	u32 surf_addr = plane_state->main.offset;
> >  	int scaler_id = plane_state->scaler_id; @@ -3585,6 +3588,9 @@
> static
> > void skylake_update_primary_plane(struct drm_plane *plane,
> >  	int dst_y = plane_state->dst.y1;
> >  	int dst_w = drm_rect_width(&plane_state->dst);
> >  	int dst_h = drm_rect_height(&plane_state->dst);
> > +	unsigned long aux_dist = 0;
> > +	u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0;
> > +	u32 tile_row_adjustment = 0;
> >
> >  	plane_ctl = PLANE_CTL_ENABLE |
> >  		    PLANE_CTL_PIPE_GAMMA_ENABLE |
> > @@ -3604,10 +3610,43 @@ static void
> skylake_update_primary_plane(struct drm_plane *plane,
> >  	intel_crtc->adjusted_x = src_x;
> >  	intel_crtc->adjusted_y = src_y;
> >
> > +	render_comp = plane_state->render_comp_enable;
> > +	if (render_comp) {
> > +		u32 tile_height = PAGE_SIZE /
> > +			intel_fb_stride_alignment(dev_priv, fb->modifier[0],
> > +					fb->pixel_format);
> > +
> > +		u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]);
> > +		tile_row_adjustment = height_in_mem % tile_height;
> > +		aux_dist = (fb->pitches[0] *
> > +				(height_in_mem - tile_row_adjustment));
> > +		aux_stride = skl_plane_stride(fb, 1, rotation);
> 
> Could you go read my fb offsets series [1] and see if there's something
> there that doesn't agree with render compression?
> 
> [1] https://lists.freedesktop.org/archives/intel-gfx/2016-
> February/087660.html
> 
[Vandana] 
I went through that..
intel_fb_pitch() return the fb->pitches[plane]. So for aux plane in the case of RC and UV plane in the case of NV12,
plane value should be 1.
I dint see any other dependency on first read.

> > +		plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE;
> > +	} else {
> > +		plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE;
> > +	}
> > +
> >  	I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl);
> >  	I915_WRITE(PLANE_OFFSET(pipe, 0), (src_y << 16) | src_x);
> >  	I915_WRITE(PLANE_STRIDE(pipe, 0), stride);
> >  	I915_WRITE(PLANE_SIZE(pipe, 0), (src_h << 16) | src_w);
> > +	I915_WRITE(PLANE_AUX_DIST(pipe, 0), aux_dist | aux_stride);
> > +	I915_WRITE(PLANE_AUX_OFFSET(pipe, 0), aux_y_offset<<16 |
> > +aux_x_offset);
> > +
> > +	/*
> > +	 * Per bspec, for SKL C and BXT A steppings, when the plane source
> pixel
> > +	 * format is NV12, the CHICKEN_PIPESL register bit 22 must be set
> to 1.
> > +	 * When render compression is enabled, bit 22 must be set to 0.
> > +	 */
> > +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
> > +			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
> 
> Does anyone really care about these anymore?
> 
[Vandana] It was required internally.. If all users will have the latest stepping, then this can be removed.

> Also you really should fix your editor to wrap lines correctly.
> 
> > +		u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe));
> > +		if (render_comp) {
> > +			if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS)
> > +				I915_WRITE(CHICKEN_PIPESL_1(pipe),
> > +						temp & ~HSW_FBCQ_DIS);
> > +		}
> > +	}
> >
> >  	if (scaler_id >= 0) {
> >  		uint32_t ps_ctrl = 0;
> > @@ -12333,6 +12372,17 @@ int intel_plane_atomic_calc_changes(struct
> drm_crtc_state *crtc_state,
> >  			to_intel_plane_state(plane_state));
> >  		if (ret)
> >  			return ret;
> > +
> > +		if (fb && to_intel_plane_state(plane_state)->
> > +				render_comp_enable) {
> > +			ret = skl_check_compression(dev,
> > +					to_intel_plane_state(plane_state),
> > +					intel_crtc->pipe, crtc->x, crtc->y);
> > +			if (ret) {
> > +				DRM_DEBUG_KMS("Render compr checks
> failed\n");
> > +				return ret;
> > +			}
> > +		}
> >  	}
> >
> >  	was_visible = old_plane_state->visible; @@ -12388,7 +12438,6 @@
> int
> > intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
> >  	case DRM_PLANE_TYPE_PRIMARY:
> >  		intel_crtc->atomic.post_enable_primary = turn_on;
> >  		intel_crtc->atomic.update_fbc = true;
> > -
> >  		break;
> >  	case DRM_PLANE_TYPE_CURSOR:
> >  		break;
> > @@ -12516,6 +12565,41 @@ static int intel_crtc_atomic_check(struct
> drm_crtc *crtc,
> >  		}
> >  	}
> >
> > +	/*
> > +	 * On SKL:*:C and BXT:*:A, there is a possible hang with NV12
> format.
> > +	 * WA: render decompression must not be enabled on any of the
> planes in
> > +	 * that pipe.
> > +	 */
> > +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
> > +			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
> 
> More extinct animals
> 
> > +		struct drm_plane *p;
> > +		bool rc_enabled = false, nv12_enabled = false;
> > +
> > +		drm_for_each_plane_mask(p, dev, crtc_state->plane_mask) {
> > +			struct drm_plane_state *plane_state =
> > +				drm_atomic_get_plane_state(state, p);
> > +
> > +			if (plane_state) {
> > +				if (to_intel_plane_state(plane_state)->
> > +						render_comp_enable)
> > +					rc_enabled = true;
> > +
> > +				if (plane_state->fb &&
> > +						plane_state->fb->pixel_format
> ==
> > +						DRM_FORMAT_NV12)
> > +					nv12_enabled = true;
> > +			}
> > +
> > +		}
> > +		if (rc_enabled && nv12_enabled) {
> > +			DRM_DEBUG_KMS("RC should not be enabled "
> > +					"on any plane of the "
> > +					"pipe on which NV12 is "
> > +					"enabled\n");
> > +			return -EINVAL;
> > +		}
> > +	}
> > +
> >  	if (INTEL_INFO(dev)->gen >= 9) {
> >  		if (mode_changed)
> >  			ret = skl_update_scaler_crtc(pipe_config);
> > @@ -14517,6 +14601,91 @@ skl_max_scale(struct intel_crtc *intel_crtc,
> struct intel_crtc_state *crtc_state
> >  	return max_scale;
> >  }
> >
> > +static int skl_check_compression(struct drm_device *dev,
> > +		struct intel_plane_state *plane_state,
> > +		enum pipe pipe, int x, int y)
> > +{
> > +	struct drm_framebuffer *fb = plane_state->base.fb;
> > +	struct drm_plane *plane = plane_state->base.plane;
> > +	int x_offset;
> > +	int src_w = drm_rect_width(&plane_state->src) >> 16;
> > +
> > +	if (!IS_SKYLAKE(dev) && !IS_BROXTON(dev)) {
> > +		DRM_DEBUG_KMS("RC support on CNL+ needs to be
> revisited\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	/*
> > +	 * TODO:
> > +	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
> > +	 * 2. Render decompression must not be used in VTd pass-through
> mode
> > +	 * 3. Program hashing select CHICKEN_MISC1 bit 15
> > +	 */
> > +
> > +	/*
> > +	 * On SKL A and SKL B,
> > +	 * Do not enable render decompression when the plane
> > +	 * width is smaller than 32 pixels or greater than
> > +	 * 2048 pixels
> > +	 */
> > +	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) && ((src_w < 32) || (src_w >
> > +2048))) {a
> 
> And more
> 
> > +		DRM_DEBUG_KMS("SKL-A, SKL-B RC: width > 2048 or <
> 32\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	/*
> > +	 * Conditions to satisfy before enabling render decomp.
> > +	 * SKL+
> > +	 * Pipe A & B, Planes 1 & 2
> > +	 * RGB8888 Tile-Y format
> > +	 * 0/180 rotation
> > +	 */
> > +	if (pipe == PIPE_C) {
> > +		DRM_DEBUG_KMS("RC supported only on pipe A & B\n");
> > +		return -EINVAL;
> > +	}
> 
> Shouldn't have added the prop on pipe C at all then. Or the prop should
> advertise only uncompressed.
> 
[Vandana] I can make the change to set supported values based on plane/pipe and accordingly create the property.
> > +
> > +	if (!(plane->type == DRM_PLANE_TYPE_PRIMARY ||
> > +				(plane->type == DRM_PLANE_TYPE_OVERLAY
> &&
> > +				 to_intel_plane(plane)->plane == PLANE_A)))
> {
> 
> We should really have a better way to refer to specific planes. Not sure of
> anything happened on that front.
> 
> > +		DRM_DEBUG_KMS("RC supported only on planes 1 & 2\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (intel_rotation_90_or_270(plane_state->base.rotation)) {
> > +		DRM_DEBUG_KMS("RC support only with 0/180 degree
> rotation\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if ((fb->modifier[0] == DRM_FORMAT_MOD_NONE) ||
> > +			(fb->modifier[0] == I915_FORMAT_MOD_X_TILED)) {
> 
> Better check for the specific things you do support I think.
> 
> > +		DRM_DEBUG_KMS("RC supported only with Y-tile\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if ((fb->pixel_format != DRM_FORMAT_XBGR8888) &&
> > +			(fb->pixel_format != DRM_FORMAT_ABGR8888)) {
> 
> Presumably we shouldn't have allowed such an fb to be even created.
> 
> > +		DRM_DEBUG_KMS("RC supported only with RGB8888
> formats\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	/*
> > +	 * For SKL & BXT,
> > +	 * When the render compression is enabled with plane
> > +	 * width greater than 3840 and horizontal panning,
> > +	 * the stride programmed in the PLANE_STRIDE register
> > +	 * must be multiple of 4.
> > +	 */
> > +	x_offset = x;
> > +
> > +	if (src_w > 3840 && x_offset != 0) {
> > +		DRM_DEBUG_KMS("RC: width > 3840, horizontal
> panning\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static int
> >  intel_check_primary_plane(struct drm_plane *plane,
> >  			  struct intel_crtc_state *crtc_state, @@ -14679,6
> +14848,9 @@
> > static struct drm_plane *intel_primary_plane_create(struct drm_device
> *dev,
> >  	if (INTEL_INFO(dev)->gen >= 4)
> >  		intel_create_rotation_property(dev, primary);
> >
> > +	if (INTEL_INFO(dev)->gen >= 9)
> > +		intel_create_render_comp_property(dev, primary);
> > +
> >  	drm_plane_helper_add(&primary->base,
> &intel_plane_helper_funcs);
> >
> >  	return &primary->base;
> > @@ -14702,6 +14874,36 @@ void intel_create_rotation_property(struct
> drm_device *dev, struct intel_plane *
> >  				plane->base.state->rotation);
> >  }
> >
> > +void intel_create_render_comp_property(struct drm_device *dev,
> > +		struct intel_plane *plane)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +
> > +	static const struct drm_prop_enum_list rc_status[] = {
> > +		{ COMP_UNCOMPRESSED,   "Uncompressed/not capable" },
> > +		{ COMP_RENDER,  "Only render decompression" },
> > +	};
> > +
> > +	if (!dev_priv->render_comp_property) {
> > +		dev_priv->render_comp_property =
> > +			drm_property_create_bitmask(dev, 0,
> > +					"render compression",
> > +					rc_status, ARRAY_SIZE(rc_status),
> > +					BIT(COMP_UNCOMPRESSED) |
> > +					BIT(COMP_RENDER));
> 
> Why is it a bitmask? Are you expecting the user to set multiple bits?
> 
> So far it looks more like a yes/no type of thing than an enum even. Are we
> expecting more possible values here?
> 
[Vandana] Yes, there are 2 other values that will come up in future.
"
	{ DRM_RC_RENDER_AND_CLEAR, "Render & clear decompression" },
	{ DRM_RC_PLANAR, "Planar compression capable (for future)" },
"
> Also seems like this really should have different set of supported values
> per plane.
> 
[Vandana] Ok, will make this change.
> > +		if (!dev_priv->render_comp_property) {
> > +			DRM_ERROR("RC: Failed to create property\n");
> > +			return;
> > +		}
> > +	}

[...]

> >  	/* program plane scaler */
> >  	if (plane_state->scaler_id >= 0) {
> > @@ -1092,6 +1129,9 @@ intel_plane_init(struct drm_device *dev, enum
> > pipe pipe, int plane)
> >
> >  	intel_create_rotation_property(dev, intel_plane);
> >
> > +	if (INTEL_INFO(dev)->gen >= 9)
> > +		intel_create_render_comp_property(dev, intel_plane);
> > +
> >  	drm_plane_helper_add(&intel_plane->base,
> &intel_plane_helper_funcs);
> >
> >  out:
> > --
> > 1.9.1
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> --
> Ville Syrjälä
> Intel OTC
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f37ac12..bb47ee1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1929,6 +1929,7 @@  struct drm_i915_private {
 
 	struct drm_property *broadcast_rgb_property;
 	struct drm_property *force_audio_property;
+	struct drm_property *render_comp_property;
 
 	/* hda/i915 audio component */
 	struct i915_audio_component *audio_component;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7dfc400..773c37f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5756,6 +5756,28 @@  enum skl_disp_power_wells {
 			_ID(id, _PS_ECC_STAT_1A, _PS_ECC_STAT_2A),   \
 			_ID(id, _PS_ECC_STAT_1B, _PS_ECC_STAT_2B))
 
+#define PLANE_AUX_DIST_1_A		0x701c0
+#define PLANE_AUX_DIST_2_A		0x702c0
+#define PLANE_AUX_DIST_1_B		0x711c0
+#define PLANE_AUX_DIST_2_B		0x712c0
+#define _PLANE_AUX_DIST_1(pipe) \
+			_PIPE(pipe, PLANE_AUX_DIST_1_A, PLANE_AUX_DIST_1_B)
+#define _PLANE_AUX_DIST_2(pipe) \
+			_PIPE(pipe, PLANE_AUX_DIST_2_A, PLANE_AUX_DIST_2_B)
+#define PLANE_AUX_DIST(pipe, plane)     \
+		_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
+
+#define PLANE_AUX_OFFSET_1_A		0x701c4
+#define PLANE_AUX_OFFSET_2_A		0x702c4
+#define PLANE_AUX_OFFSET_1_B		0x711c4
+#define PLANE_AUX_OFFSET_2_B		0x712c4
+#define _PLANE_AUX_OFFSET_1(pipe)       \
+		_PIPE(pipe, PLANE_AUX_OFFSET_1_A, PLANE_AUX_OFFSET_1_B)
+#define _PLANE_AUX_OFFSET_2(pipe)       \
+		_PIPE(pipe, PLANE_AUX_OFFSET_2_A, PLANE_AUX_OFFSET_2_B)
+#define PLANE_AUX_OFFSET(pipe, plane)   \
+		_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
+
 /* legacy palette */
 #define _LGC_PALETTE_A           0x4a000
 #define _LGC_PALETTE_B           0x4a800
diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c
index e0b851a..c431333 100644
--- a/drivers/gpu/drm/i915/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
@@ -230,8 +230,16 @@  intel_plane_atomic_get_property(struct drm_plane *plane,
 				struct drm_property *property,
 				uint64_t *val)
 {
-	DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
-	return -EINVAL;
+	struct drm_i915_private *dev_priv = state->plane->dev->dev_private;
+	struct intel_plane_state *intel_state = to_intel_plane_state(state);
+
+	if (property == dev_priv->render_comp_property) {
+		*val = intel_state->render_comp_enable;
+	} else {
+		DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
+		return -EINVAL;
+	}
+	return 0;
 }
 
 /**
@@ -252,6 +260,14 @@  intel_plane_atomic_set_property(struct drm_plane *plane,
 				struct drm_property *property,
 				uint64_t val)
 {
-	DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
-	return -EINVAL;
+	struct drm_i915_private *dev_priv = state->plane->dev->dev_private;
+	struct intel_plane_state *intel_state = to_intel_plane_state(state);
+
+	if (property == dev_priv->render_comp_property) {
+		intel_state->render_comp_enable = val;
+	} else {
+		DRM_DEBUG_KMS("Unknown plane property '%s'\n", property->name);
+		return -EINVAL;
+	}
+	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b2fcbe6..676507d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -117,6 +117,9 @@  static void ironlake_pfit_disable(struct intel_crtc *crtc, bool force);
 static void ironlake_pfit_enable(struct intel_crtc *crtc);
 static void intel_modeset_setup_hw_state(struct drm_device *dev);
 static void intel_pre_disable_primary(struct drm_crtc *crtc);
+static int skl_check_compression(struct drm_device *dev,
+		struct intel_plane_state *plane_state,
+		enum pipe pipe, int x, int y);
 
 typedef struct {
 	int	min, max;
@@ -3573,7 +3576,7 @@  static void skylake_update_primary_plane(struct drm_plane *plane,
 	struct drm_framebuffer *fb = plane_state->base.fb;
 	int pipe = intel_crtc->pipe;
 	u32 plane_ctl;
-	unsigned int rotation = plane_state->base.rotation;
+	unsigned int rotation = plane_state->base.rotation, render_comp;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
 	u32 surf_addr = plane_state->main.offset;
 	int scaler_id = plane_state->scaler_id;
@@ -3585,6 +3588,9 @@  static void skylake_update_primary_plane(struct drm_plane *plane,
 	int dst_y = plane_state->dst.y1;
 	int dst_w = drm_rect_width(&plane_state->dst);
 	int dst_h = drm_rect_height(&plane_state->dst);
+	unsigned long aux_dist = 0;
+	u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0;
+	u32 tile_row_adjustment = 0;
 
 	plane_ctl = PLANE_CTL_ENABLE |
 		    PLANE_CTL_PIPE_GAMMA_ENABLE |
@@ -3604,10 +3610,43 @@  static void skylake_update_primary_plane(struct drm_plane *plane,
 	intel_crtc->adjusted_x = src_x;
 	intel_crtc->adjusted_y = src_y;
 
+	render_comp = plane_state->render_comp_enable;
+	if (render_comp) {
+		u32 tile_height = PAGE_SIZE /
+			intel_fb_stride_alignment(dev_priv, fb->modifier[0],
+					fb->pixel_format);
+
+		u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]);
+		tile_row_adjustment = height_in_mem % tile_height;
+		aux_dist = (fb->pitches[0] *
+				(height_in_mem - tile_row_adjustment));
+		aux_stride = skl_plane_stride(fb, 1, rotation);
+		plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE;
+	} else {
+		plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE;
+	}
+
 	I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl);
 	I915_WRITE(PLANE_OFFSET(pipe, 0), (src_y << 16) | src_x);
 	I915_WRITE(PLANE_STRIDE(pipe, 0), stride);
 	I915_WRITE(PLANE_SIZE(pipe, 0), (src_h << 16) | src_w);
+	I915_WRITE(PLANE_AUX_DIST(pipe, 0), aux_dist | aux_stride);
+	I915_WRITE(PLANE_AUX_OFFSET(pipe, 0), aux_y_offset<<16 | aux_x_offset);
+
+	/*
+	 * Per bspec, for SKL C and BXT A steppings, when the plane source pixel
+	 * format is NV12, the CHICKEN_PIPESL register bit 22 must be set to 1.
+	 * When render compression is enabled, bit 22 must be set to 0.
+	 */
+	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
+			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+		u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe));
+		if (render_comp) {
+			if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS)
+				I915_WRITE(CHICKEN_PIPESL_1(pipe),
+						temp & ~HSW_FBCQ_DIS);
+		}
+	}
 
 	if (scaler_id >= 0) {
 		uint32_t ps_ctrl = 0;
@@ -12333,6 +12372,17 @@  int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 			to_intel_plane_state(plane_state));
 		if (ret)
 			return ret;
+
+		if (fb && to_intel_plane_state(plane_state)->
+				render_comp_enable) {
+			ret = skl_check_compression(dev,
+					to_intel_plane_state(plane_state),
+					intel_crtc->pipe, crtc->x, crtc->y);
+			if (ret) {
+				DRM_DEBUG_KMS("Render compr checks failed\n");
+				return ret;
+			}
+		}
 	}
 
 	was_visible = old_plane_state->visible;
@@ -12388,7 +12438,6 @@  int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 	case DRM_PLANE_TYPE_PRIMARY:
 		intel_crtc->atomic.post_enable_primary = turn_on;
 		intel_crtc->atomic.update_fbc = true;
-
 		break;
 	case DRM_PLANE_TYPE_CURSOR:
 		break;
@@ -12516,6 +12565,41 @@  static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 		}
 	}
 
+	/*
+	 * On SKL:*:C and BXT:*:A, there is a possible hang with NV12 format.
+	 * WA: render decompression must not be enabled on any of the planes in
+	 * that pipe.
+	 */
+	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
+			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+		struct drm_plane *p;
+		bool rc_enabled = false, nv12_enabled = false;
+
+		drm_for_each_plane_mask(p, dev, crtc_state->plane_mask) {
+			struct drm_plane_state *plane_state =
+				drm_atomic_get_plane_state(state, p);
+
+			if (plane_state) {
+				if (to_intel_plane_state(plane_state)->
+						render_comp_enable)
+					rc_enabled = true;
+
+				if (plane_state->fb &&
+						plane_state->fb->pixel_format ==
+						DRM_FORMAT_NV12)
+					nv12_enabled = true;
+			}
+
+		}
+		if (rc_enabled && nv12_enabled) {
+			DRM_DEBUG_KMS("RC should not be enabled "
+					"on any plane of the "
+					"pipe on which NV12 is "
+					"enabled\n");
+			return -EINVAL;
+		}
+	}
+
 	if (INTEL_INFO(dev)->gen >= 9) {
 		if (mode_changed)
 			ret = skl_update_scaler_crtc(pipe_config);
@@ -14517,6 +14601,91 @@  skl_max_scale(struct intel_crtc *intel_crtc, struct intel_crtc_state *crtc_state
 	return max_scale;
 }
 
+static int skl_check_compression(struct drm_device *dev,
+		struct intel_plane_state *plane_state,
+		enum pipe pipe, int x, int y)
+{
+	struct drm_framebuffer *fb = plane_state->base.fb;
+	struct drm_plane *plane = plane_state->base.plane;
+	int x_offset;
+	int src_w = drm_rect_width(&plane_state->src) >> 16;
+
+	if (!IS_SKYLAKE(dev) && !IS_BROXTON(dev)) {
+		DRM_DEBUG_KMS("RC support on CNL+ needs to be revisited\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * TODO:
+	 * 1. Disable stereo 3D when render decomp is enabled (bit 7:6)
+	 * 2. Render decompression must not be used in VTd pass-through mode
+	 * 3. Program hashing select CHICKEN_MISC1 bit 15
+	 */
+
+	/*
+	 * On SKL A and SKL B,
+	 * Do not enable render decompression when the plane
+	 * width is smaller than 32 pixels or greater than
+	 * 2048 pixels
+	 */
+	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) && ((src_w < 32) || (src_w > 2048))) {
+		DRM_DEBUG_KMS("SKL-A, SKL-B RC: width > 2048 or < 32\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Conditions to satisfy before enabling render decomp.
+	 * SKL+
+	 * Pipe A & B, Planes 1 & 2
+	 * RGB8888 Tile-Y format
+	 * 0/180 rotation
+	 */
+	if (pipe == PIPE_C) {
+		DRM_DEBUG_KMS("RC supported only on pipe A & B\n");
+		return -EINVAL;
+	}
+
+	if (!(plane->type == DRM_PLANE_TYPE_PRIMARY ||
+				(plane->type == DRM_PLANE_TYPE_OVERLAY &&
+				 to_intel_plane(plane)->plane == PLANE_A))) {
+		DRM_DEBUG_KMS("RC supported only on planes 1 & 2\n");
+		return -EINVAL;
+	}
+
+	if (intel_rotation_90_or_270(plane_state->base.rotation)) {
+		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation\n");
+		return -EINVAL;
+	}
+
+	if ((fb->modifier[0] == DRM_FORMAT_MOD_NONE) ||
+			(fb->modifier[0] == I915_FORMAT_MOD_X_TILED)) {
+		DRM_DEBUG_KMS("RC supported only with Y-tile\n");
+		return -EINVAL;
+	}
+
+	if ((fb->pixel_format != DRM_FORMAT_XBGR8888) &&
+			(fb->pixel_format != DRM_FORMAT_ABGR8888)) {
+		DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * For SKL & BXT,
+	 * When the render compression is enabled with plane
+	 * width greater than 3840 and horizontal panning,
+	 * the stride programmed in the PLANE_STRIDE register
+	 * must be multiple of 4.
+	 */
+	x_offset = x;
+
+	if (src_w > 3840 && x_offset != 0) {
+		DRM_DEBUG_KMS("RC: width > 3840, horizontal panning\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int
 intel_check_primary_plane(struct drm_plane *plane,
 			  struct intel_crtc_state *crtc_state,
@@ -14679,6 +14848,9 @@  static struct drm_plane *intel_primary_plane_create(struct drm_device *dev,
 	if (INTEL_INFO(dev)->gen >= 4)
 		intel_create_rotation_property(dev, primary);
 
+	if (INTEL_INFO(dev)->gen >= 9)
+		intel_create_render_comp_property(dev, primary);
+
 	drm_plane_helper_add(&primary->base, &intel_plane_helper_funcs);
 
 	return &primary->base;
@@ -14702,6 +14874,36 @@  void intel_create_rotation_property(struct drm_device *dev, struct intel_plane *
 				plane->base.state->rotation);
 }
 
+void intel_create_render_comp_property(struct drm_device *dev,
+		struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	static const struct drm_prop_enum_list rc_status[] = {
+		{ COMP_UNCOMPRESSED,   "Uncompressed/not capable" },
+		{ COMP_RENDER,  "Only render decompression" },
+	};
+
+	if (!dev_priv->render_comp_property) {
+		dev_priv->render_comp_property =
+			drm_property_create_bitmask(dev, 0,
+					"render compression",
+					rc_status, ARRAY_SIZE(rc_status),
+					BIT(COMP_UNCOMPRESSED) |
+					BIT(COMP_RENDER));
+		if (!dev_priv->render_comp_property) {
+			DRM_ERROR("RC: Failed to create property\n");
+			return;
+		}
+	}
+
+	if (dev_priv->render_comp_property) {
+		drm_object_attach_property(&plane->base.base,
+				dev_priv->render_comp_property, 0);
+	}
+	dev->mode_config.allow_aux_plane = true;
+}
+
 static int
 intel_check_cursor_plane(struct drm_plane *plane,
 			 struct intel_crtc_state *crtc_state,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 68dd7ac..3158d9f 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -154,7 +154,7 @@  struct intel_framebuffer {
 	struct drm_i915_gem_object *obj;
 	struct intel_rotation_info rot_info;
 
-	/* for each plane in the normal GTT view */                             
+	/* for each plane in the normal GTT view */
 	struct {
 		unsigned int x, y;
 	} normal[2];
@@ -313,6 +313,10 @@  struct intel_atomic_state {
 	bool skip_intermediate_wm;
 };
 
+/* render compression property bits */
+#define COMP_UNCOMPRESSED          0
+#define COMP_RENDER                1
+
 struct intel_plane_state {
 	struct drm_plane_state base;
 	struct drm_rect src;
@@ -352,6 +356,9 @@  struct intel_plane_state {
 
 	/* async flip related structures */
 	struct drm_i915_gem_request *wait_req;
+
+	/* Render compression */
+	unsigned int render_comp_enable;
 };
 
 struct intel_initial_plane_config {
@@ -1135,11 +1142,11 @@  void i915_audio_component_cleanup(struct drm_i915_private *dev_priv);
 
 /* intel_display.c */
 extern const struct drm_plane_funcs intel_plane_funcs;
-unsigned int intel_fb_xy_to_linear(int x, int y,                               
-		const struct intel_plane_state *state,       
-		int plane);                                  
-void intel_add_fb_offsets(int *x, int *y,                                      
-		const struct intel_plane_state *state, int plane); 
+unsigned int intel_fb_xy_to_linear(int x, int y,
+		const struct intel_plane_state *state,
+		int plane);
+void intel_add_fb_offsets(int *x, int *y,
+		const struct intel_plane_state *state, int plane);
 unsigned int intel_rotation_info_size(const struct intel_rotation_info *rot_info);
 bool intel_has_pending_fb_unpin(struct drm_device *dev);
 void intel_mark_busy(struct drm_device *dev);
@@ -1223,6 +1230,9 @@  intel_rotation_90_or_270(unsigned int rotation)
 void intel_create_rotation_property(struct drm_device *dev,
 					struct intel_plane *plane);
 
+void intel_create_render_comp_property(struct drm_device *dev,
+		struct intel_plane *plane);
+
 /* shared dpll functions */
 struct intel_shared_dpll *intel_crtc_to_shared_dpll(struct intel_crtc *crtc);
 void assert_shared_dpll(struct drm_i915_private *dev_priv,
@@ -1252,7 +1262,7 @@  void assert_fdi_rx_pll(struct drm_i915_private *dev_priv,
 void assert_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, bool state);
 #define assert_pipe_enabled(d, p) assert_pipe(d, p, true)
 #define assert_pipe_disabled(d, p) assert_pipe(d, p, false)
-u32 intel_compute_tile_offset(int *x, int *y,                                  
+u32 intel_compute_tile_offset(int *x, int *y,
 		const struct intel_plane_state *state, int plane);
 void intel_prepare_reset(struct drm_device *dev);
 void intel_finish_reset(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 75bf01d..bb6bbde 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -191,7 +191,7 @@  skl_update_plane(struct drm_plane *drm_plane,
 	u32 plane_ctl;
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	u32 surf_addr = plane_state->main.offset;
-	unsigned int rotation = plane_state->base.rotation;
+	unsigned int rotation = plane_state->base.rotation, render_comp;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
 	int crtc_x = plane_state->dst.x1;
 	int crtc_y = plane_state->dst.y1;
@@ -203,6 +203,9 @@  skl_update_plane(struct drm_plane *drm_plane,
 	uint32_t src_h = drm_rect_height(&plane_state->src) >> 16;
 	const struct intel_scaler *scaler =
 		&crtc_state->scaler_state.scalers[plane_state->scaler_id];
+	unsigned long aux_dist = 0;
+	u32 aux_x_offset = 0, aux_y_offset = 0, aux_stride = 0;
+	u32 tile_row_adjustment = 0;
 
 	plane_ctl = PLANE_CTL_ENABLE |
 		PLANE_CTL_PIPE_GAMMA_ENABLE |
@@ -230,9 +233,43 @@  skl_update_plane(struct drm_plane *drm_plane,
 	crtc_w--;
 	crtc_h--;
 
+	render_comp = plane_state->render_comp_enable;
+	if (render_comp) {
+		u32 tile_height = PAGE_SIZE /
+			intel_fb_stride_alignment(dev_priv, fb->modifier[0],
+					fb->pixel_format);
+
+		u32 height_in_mem = (fb->offsets[1]/fb->pitches[0]);
+		tile_row_adjustment = height_in_mem % tile_height;
+		aux_dist = (fb->pitches[0] *
+				(height_in_mem - tile_row_adjustment));
+		aux_stride = skl_plane_stride(fb, 1, rotation);
+		plane_ctl |= PLANE_CTL_DECOMPRESSION_ENABLE;
+	} else {
+		plane_ctl &= ~PLANE_CTL_DECOMPRESSION_ENABLE;
+	}
+
 	I915_WRITE(PLANE_OFFSET(pipe, plane), (y << 16) | x);
 	I915_WRITE(PLANE_STRIDE(pipe, plane), stride);
 	I915_WRITE(PLANE_SIZE(pipe, plane), (src_h << 16) | src_w);
+	I915_WRITE(PLANE_AUX_DIST(pipe, plane), aux_dist | aux_stride);
+	I915_WRITE(PLANE_AUX_OFFSET(pipe, plane),
+			aux_y_offset<<16 | aux_x_offset);
+
+	/*
+	 * Per bspec, for SKL C and BXT A steppings, when the plane source pixel
+	 * format is NV12, the CHICKEN_PIPESL register bit 22 must be set to 1.
+	 * When render compression is enabled, bit 22 must be set to 0.
+	 */
+	if (IS_SKL_REVID(dev, 0, SKL_REVID_C0) ||
+			IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+		u32 temp = I915_READ(CHICKEN_PIPESL_1(pipe));
+		if (render_comp) {
+			if ((temp & HSW_FBCQ_DIS) == HSW_FBCQ_DIS)
+				I915_WRITE(CHICKEN_PIPESL_1(pipe),
+						temp & ~HSW_FBCQ_DIS);
+		}
+	}
 
 	/* program plane scaler */
 	if (plane_state->scaler_id >= 0) {
@@ -1092,6 +1129,9 @@  intel_plane_init(struct drm_device *dev, enum pipe pipe, int plane)
 
 	intel_create_rotation_property(dev, intel_plane);
 
+	if (INTEL_INFO(dev)->gen >= 9)
+		intel_create_render_comp_property(dev, intel_plane);
+
 	drm_plane_helper_add(&intel_plane->base, &intel_plane_helper_funcs);
 
 out: