[v4] drm/i915/skl: New ddb allocation algorithm

Message ID 20160913121551.11925-1-mahesh1.kumar@intel.com (mailing list archive)
State New, archived

Commit Message

Kumar, Mahesh Sept. 13, 2016, 12:15 p.m. UTC
From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch implements a new DDB allocation algorithm, as suggested by the
HW team. The algorithm takes care of the scenario where we allocate less
DDB to planes with a lower relative pixel rate, even though they require
more DDB to work.
It also takes care of enabling the same watermark level for each plane,
for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Changes since v2:
 - Fix the for loop condition to enable WM

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 129 +++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 62 deletions(-)
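
The level selection described in the commit message can be illustrated with a small standalone sketch. This is not the driver code: `pick_max_level`, `plane_need`, `NUM_PLANES` and `NUM_LEVELS` are hypothetical names, and the watermark table is assumed to be a flat array indexed by `level * NUM_PLANES + plane`. The idea, as far as the description goes, is to walk from the highest level down and stop at the first level whose summed per-plane requirement fits into the pipe's DDB allocation.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PLANES 3 /* hypothetical: non-cursor planes on one pipe */
#define NUM_LEVELS 8 /* hypothetical: watermark levels 0..7 */

/* Blocks a plane needs at one level, never below its minimum allocation. */
static uint16_t plane_need(uint16_t wm_blocks, uint16_t min_blocks)
{
	return wm_blocks > min_blocks ? wm_blocks : min_blocks;
}

/*
 * Return the highest level whose total requirement fits into alloc_size
 * and store that total in *total_out. Return -1 if even level 0 does not
 * fit, in which case *total_out is left untouched.
 */
static int pick_max_level(const uint16_t *wm, const uint16_t *min_blocks,
			  uint16_t alloc_size, uint32_t *total_out)
{
	assert(wm && min_blocks && total_out);

	for (int level = NUM_LEVELS - 1; level >= 0; level--) {
		uint32_t total = 0;

		for (int i = 0; i < NUM_PLANES; i++)
			total += plane_need(wm[level * NUM_PLANES + i],
					    min_blocks[i]);

		if (total <= alloc_size) {
			*total_out = total;
			return level;
		}
	}
	return -1;
}
```

Because every plane is evaluated at the same candidate level, all planes end up with the same maximum enabled level, which appears to be the power-saving property the commit message refers to.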

Comments

Maarten Lankhorst Sept. 13, 2016, 12:40 p.m. UTC | #1
On 13-09-16 at 14:15, Kumar, Mahesh wrote:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>
> This patch implements a new DDB allocation algorithm, as suggested by the
> HW team. The algorithm takes care of the scenario where we allocate less
> DDB to planes with a lower relative pixel rate, even though they require
> more DDB to work.
> It also takes care of enabling the same watermark level for each plane,
> for efficient power saving.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
>
> Changes since v2:
>  - Fix the for loop condition to enable WM
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
I'm still getting underrun issues when running the entire patch series against kms_atomic_transition and kms_plane.
Can you confirm?

~Maarten
Kumar, Mahesh Sept. 14, 2016, 12:36 p.m. UTC | #2
Hi,
There was an issue with the transition WM: it was getting enabled and
causing FIFO underruns.
I fixed the condition and then tested kms_plane; I am no longer getting any
underruns.
Please let me know if you see any other issues.

Regards,
-Mahesh

Maarten Lankhorst Sept. 19, 2016, 8:27 a.m. UTC | #3
Hey,

On 14-09-16 at 14:36, Mahesh Kumar wrote:
> Hi,
> There was an issue with the transition WM: it was getting enabled and causing FIFO underruns.
> I fixed the condition and then tested kms_plane; I am no longer getting any underruns.
> Please let me know if you see any other issues.

kms_cursor_legacy.cursorA-vs-flipA-atomic-transitions-varying-size is broken by this patch.

It's easy to overlook, since cursorA-vs-flipB of the same test was already broken,
but this is definitely something new introduced by this patch.

~Maarten
Maarten Lankhorst Sept. 19, 2016, 9:55 a.m. UTC | #4
On 14-09-16 at 14:36, Mahesh Kumar wrote:
> Hi,
> There was an issue with the transition WM: it was getting enabled and causing FIFO underruns.
> I fixed the condition and then tested kms_plane; I am no longer getting any underruns.
> Please let me know if you see any other issues.
>
> Regards,
> -Mahesh
>
> On Tuesday 13 September 2016 06:10 PM, Maarten Lankhorst wrote:
>> On 13-09-16 at 14:15, Kumar, Mahesh wrote:
>>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>>
>>> This patch implements a new DDB allocation algorithm, as suggested by the
>>> HW team. The algorithm takes care of the scenario where we allocate less
>>> DDB to planes with a lower relative pixel rate, even though they require
>>> more DDB to work.
>>> It also takes care of enabling the same watermark level for each plane,
>>> for efficient power saving.
>>>
>>> Changes since v1:
>>>   - Rebase on top of Paulo's patch series
>>>
>>> Changes since v2:
>>>   - Fix the for loop condition to enable WM
>>>
>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> I'm still getting underrun issues when running the entire patch series against kms_atomic_transition and kms_plane.
>> Can you confirm?
>>
>> ~Maarten
>
Found it..

During the test run:
# cat /sys/kernel/debug/dri/0/i915_ddb_info 
                  Start     End    Size
Pipe A
  Plane1              0       0       0
  Plane2             30     890     860
  Cursor            860     892      32

I'm pretty sure the start value here is bogus, and the Plane2 WMs end up overlapping with the cursor.

~Maarten
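
The overlap in the dump above can be checked mechanically. The sketch below is only an illustration: `struct ddb_range` and both helpers are hypothetical names, and `end` is assumed to be exclusive, which matches Size = End - Start in the debugfs output (890 - 30 = 860).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one DDB entry; "end" is treated as exclusive. */
struct ddb_range {
	uint16_t start;
	uint16_t end;
};

/* Number of blocks covered by a half-open range [start, end). */
static uint16_t ddb_range_size(const struct ddb_range *r)
{
	assert(r->end >= r->start);
	return r->end - r->start;
}

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool ddb_ranges_overlap(const struct ddb_range *a,
			       const struct ddb_range *b)
{
	/* An empty range (for example a disabled plane) overlaps nothing. */
	if (ddb_range_size(a) == 0 || ddb_range_size(b) == 0)
		return false;

	return a->start < b->end && b->start < a->end;
}
```

With the values from the dump, Plane2 (30..890) and Cursor (860..892) are reported as overlapping, while a Plane2 range of 0..860 with the same size would not collide with the cursor.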
Kumar, Mahesh Sept. 21, 2016, 1:03 p.m. UTC | #5
Hi Maarten,

Thanks for pointing out the issue.

Not only the start value, the end value is also incorrect. I have found the
root cause for both.
However, the end value issue seems to have always been there: the end value
should be one less (because the start block should also be counted).
I will fix both and upload the reworked patches.

-Mahesh
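
One way to read the remark about the end value is as the usual inclusive versus exclusive range arithmetic. The helpers below are hypothetical and only show that arithmetic; they do not claim to describe what the register or the driver actually expects.

```c
#include <assert.h>
#include <stdint.h>

/* Last block actually used by an allocation of "size" blocks at "start". */
static uint16_t ddb_last_block(uint16_t start, uint16_t size)
{
	assert(size > 0);
	return (uint16_t)(start + size - 1);
}

/* Block count of an inclusive range [first, last]; the first block counts. */
static uint16_t ddb_inclusive_size(uint16_t first, uint16_t last)
{
	assert(last >= first);
	return (uint16_t)(last - first + 1);
}
```

For the cursor entry in the dump (start 860, size 32), the last block in use under this reading would be 891 rather than 892.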



Patch
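
As a reading aid for the skl_allocate_pipe_ddb() changes in the diff that follows, here is a simplified standalone sketch of the two per-plane steps: giving each plane its base requirement plus a data-rate-weighted share of the leftover blocks, and then enabling only the watermark levels that are usable. `plane_alloc`, `enable_levels` and `MAX_LEVELS` are hypothetical names; the limit of 31 lines and the "zero blocks means unusable" rule are taken from the conditions in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LEVELS 8 /* hypothetical number of watermark levels */

/* Base requirement plus this plane's proportional share of the leftover. */
static uint16_t plane_alloc(uint16_t base_blocks, uint16_t leftover,
			    uint32_t data_rate, uint32_t total_data_rate)
{
	assert(total_data_rate > 0 && data_rate <= total_data_rate);

	/* 64-bit intermediate so leftover * data_rate cannot overflow. */
	uint64_t share = (uint64_t)leftover * data_rate / total_data_rate;

	return (uint16_t)(base_blocks + share);
}

/* Enable every level up to max_level that has a usable result. */
static void enable_levels(bool en[MAX_LEVELS],
			  const uint16_t res_b[MAX_LEVELS],
			  const uint8_t res_l[MAX_LEVELS], int max_level)
{
	for (int i = MAX_LEVELS - 1; i >= 0; i--)
		en[i] = !(i > max_level || res_l[i] > 31 || res_b[i] == 0);
}
```

Integer division rounds each share down, so the sum of all shares can be smaller than the leftover; the sketch makes no attempt to hand out the remainder.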

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3370fc8..c52cc57 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3348,6 +3348,7 @@  skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
+		      struct skl_pipe_wm *pipe_wm,
 		      struct skl_ddb_allocation *ddb /* out */)
 {
 	struct drm_atomic_state *state = cstate->base.state;
@@ -3363,8 +3364,11 @@  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	uint16_t *minimum = cstate->wm.skl.minimum_blocks;
 	uint16_t *y_minimum = cstate->wm.skl.minimum_y_blocks;
 	unsigned int total_data_rate;
+	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb = 0;
 	int num_active;
-	int id, i;
+	int max_level, level;
+	int id, i, ret = 0;
 
 	if (WARN_ON(!state))
 		return 0;
@@ -3380,6 +3384,7 @@  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	alloc_size = skl_ddb_entry_size(alloc);
 	if (alloc_size == 0) {
 		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+		memset(ddb->y_plane[pipe], 0, sizeof(ddb->y_plane[pipe]));
 		return 0;
 	}
 
@@ -3413,19 +3418,42 @@  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	}
 
 	for (i = 0; i < PLANE_CURSOR; i++) {
-		alloc_size -= minimum[i];
-		alloc_size -= y_minimum[i];
+		total_min_blocks += minimum[i];
+		total_min_blocks += y_minimum[i];
 	}
 
-	/*
-	 * 2. Distribute the remaining space in proportion to the amount of
-	 * data each plane needs to fetch from memory.
-	 *
-	 * FIXME: we may not allocate every single block here.
-	 */
+	for (level = ilk_wm_max_level(dev); level >= 0; level--) {
+		total_level_ddb = 0;
+		for (i = 0; i < PLANE_CURSOR; i++) {
+			/*
+			 * TODO: Calculate watermark values for both the Y and
+			 * UV planes in the NV12 case and use both values for
+			 * the DDB calculation. NV12 is disabled as of now, so
+			 * only the single plane value is used here.
+			 */
+			uint16_t min = minimum[i] + y_minimum[i];
+			uint16_t plane_level_ddb_wm =
+				max(pipe_wm->wm[level].plane_res_b[i], min);
+			total_level_ddb += plane_level_ddb_wm;
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations\n");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		ret = -EINVAL;
+		goto exit;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	total_data_rate = skl_get_total_relative_data_rate(cstate);
 	if (total_data_rate == 0)
-		return 0;
+		goto exit;
 
 	start = alloc->start;
 	for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
@@ -3440,7 +3468,8 @@  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[id];
+		plane_blocks = max(pipe_wm->wm[max_level].plane_res_b[id],
+					minimum[id]);
 		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
 					total_data_rate);
 
@@ -3454,6 +3483,8 @@  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we calculate watermark values for both the Y and
+		 * UV planes, consider them for the initial allowed WM blocks.
 		 */
 		y_data_rate = cstate->wm.skl.plane_y_data_rate[id];
 
@@ -3467,9 +3498,22 @@  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		}
 
 		start += y_plane_blocks;
+
+		/*
+		 * Now enable all levels in WM structure which can be enabled
+		 * using current DDB allocation
+		 */
+		for (i = ilk_wm_max_level(dev); i >= 0; i--) {
+			if (i > max_level || pipe_wm->wm[i].plane_res_l[id] > 31
+					|| pipe_wm->wm[i].plane_res_b[id] == 0)
+				pipe_wm->wm[i].plane_en[id] = false;
+			else
+				pipe_wm->wm[i].plane_en[id] = true;
+		}
 	}
 
-	return 0;
+exit:
+	return ret;
 }
 
 static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
@@ -3540,7 +3584,6 @@  static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				struct skl_pipe_wm *pipe_wm)
 {
@@ -3559,12 +3602,9 @@  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
-	bool *enabled = &result->plane_en[id];
 
-	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
-		*enabled = false;
+	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
-	}
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
@@ -3649,54 +3689,27 @@  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("Plane %d.%d: blocks required = %u/%u, lines required = %u/31\n",
-				      to_intel_crtc(cstate->base.crtc)->pipe,
-				      skl_wm_plane_id(to_intel_plane(pstate->plane)),
-				      res_blocks, ddb_allocation, res_lines);
-
-			return -EINVAL;
-		}
-	}
-
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
 		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
 	struct drm_plane *plane;
 	struct intel_plane *intel_plane;
 	struct intel_plane_state *intel_pstate;
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
-		int i = skl_wm_plane_id(intel_plane);
-
 		plane = &intel_plane->base;
 		intel_pstate = NULL;
 		if (state)
@@ -3722,12 +3735,9 @@  skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 
 		WARN_ON(!intel_pstate->base.fb);
 
-		ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][i]);
-
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   pipe_wm);
 		if (ret)
@@ -3784,11 +3794,15 @@  static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	memset(pipe_wm, 0, sizeof(*pipe_wm));
 
 	for (level = 0; level <= max_level; level++) {
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+		ret = skl_compute_wm_level(dev_priv, cstate,
 					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
+	ret = skl_allocate_pipe_ddb(cstate, pipe_wm, ddb);
+	if (ret)
+		return ret;
+
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
 	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
@@ -3976,13 +3990,12 @@  pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_pipes(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;
 
@@ -4035,10 +4048,6 @@  skl_compute_ddb(struct drm_atomic_state *state)
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
 
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
 		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
 		if (ret)
 			return ret;
@@ -4092,19 +4101,15 @@  skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
 
 	/*
 	 * Calculate WM's for all pipes that are part of this transaction.
-	 * Note that the DDB allocation above may have added more CRTC's that
-	 * weren't otherwise being modified (and set bits in dirty_pipes) if
-	 * pipe allocations had to change.
-	 *
-	 * FIXME:  Now that we're doing this in the atomic check phase, we
-	 * should allow skl_update_pipe_wm() to return failure in cases where
-	 * no suitable watermark values can be found.
+	 * Note that affected pipe calculation above may have added more
+	 * CRTC's that weren't otherwise being modified (and set bits in
+	 * dirty_pipes) if pipe allocations had to change.
 	 */
 	for_each_crtc_in_state(state, crtc, cstate, i) {
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);