Message ID | 20170515083437.31394-11-mahesh1.kumar@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, May 15, 2017 at 02:04:35PM +0530, Mahesh Kumar wrote: > From: "Kumar, Mahesh" <mahesh1.kumar@intel.com> > > This patch make changes to use linetime latency if allocated > DDB size during plane watermark calculation is not available, > This is required to implement new DDB allocation algorithm. > > In New Algorithm DDB is allocated based on WM values, because of which > number of DDB blocks will not be available during WM calculation, > So this "linetime latency" is suggested by SV/HW team to be used during > switch-case for WM blocks selection. (Just realized I never actually sent my review of this patch on the previous series review; sorry for the delay.) I'm not sure if the term "linetime" is completely self-explanatory. You might want to explicitly clarify that it's the time it takes to fill a single row at a given clock rate. So "linetime latency" here is defined as "line time in microseconds = pipe horizontal total number of pixels / pixel rate in MHz." > > Changes since v1: > - Rebase on top of Paulo's patch series > Changes since v2: > - Fix if-else condition (pointed by Maarten) > Changes since v3: > - Use common function for timetime_us calculation (Paulo) > - rebase on drm-tip > Changes since v4: > - Use consistent name for fixed_point operation > > Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 7 +++++++ > drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++--------- > 2 files changed, 40 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index a1858c3eb33a..84052990e695 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -115,6 +115,13 @@ typedef struct { > fp; \ > }) > > +static inline bool is_fixed16_zero(uint_fixed_16_16_t val) > +{ > + if (val.val == 0) > + return true; > + return false; > +} > + > static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val) > { > uint_fixed_16_16_t fp; > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 0f1d9f672e83..d73369c2c2d9 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -4197,6 +4197,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate, > return ret; > } > > +static uint_fixed_16_16_t > +skl_get_linetime_us(struct intel_crtc_state *cstate) We use the same definition of linetime for all platforms that use the concept of 'linetime' (hsw+). So maybe the skl_ prefix here isn't really appropriate? For consistency, should we also call this new function when calculating the HSW/BDW watermarks in hsw_compute_linetime_wm()? Or consolidate skl_compute_linetime_wm() with hsw_compute_linetime_wm()? Potentially something we could do as a follow-up patch if we don't want to deal with it while getting this series in. > +{ > + uint32_t pixel_rate; > + uint32_t crtc_htotal; > + uint_fixed_16_16_t linetime_us; > + > + if (!cstate->base.active) > + return u32_to_fixed_16_16(0); > + > + pixel_rate = cstate->pixel_rate; > + > + if (WARN_ON(pixel_rate == 0)) > + return u32_to_fixed_16_16(0); > + > + crtc_htotal = cstate->base.adjusted_mode.crtc_htotal; > + linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate); > + > + return linetime_us; > +} > + > static uint32_t > skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate, > const struct intel_plane_state *pstate) > @@ -4331,12 +4352,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv, > if (y_tiled) { > selected_result = max_fixed_16_16(method2, y_tile_minimum); > } else { > + uint32_t linetime_us; > + > + linetime_us = fixed_16_16_to_u32_round_up( > + skl_get_linetime_us(cstate)); > if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) && > (plane_bytes_per_line / 512 < 1)) > selected_result = method2; > - else if ((ddb_allocation / > + else if ((ddb_allocation && ddb_allocation / > fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1) > selected_result = min_fixed_16_16(method1, method2); > + else if (latency >= linetime_us) > + selected_result = method2; Based on the commit message, I'm a little bit confused. The plan is to switch to the new algorithm where we calculate watermarks first and then partition DDB second. So at the end, ddb_allocation should never be available here, right? In that case, can we just remove it as a parameter to this function and drop your first 'else if' branch here? That would make us take the new decision logic immediately and give us an earlier idea whether it really behaves the way we expect it to, rather than waiting until we've made a bunch of other changes too. It's also not clear to me why with the old algorithm, the DDB allocation test being true meant "use the lower of the two watermark methods" whereas the linetime test means "aways use method2." Is there any more explanation available for how/why the linetime latency plays into the watermark method decision? The new decision isn't reflected in the bspec yet (that I can see), so it's kind of hard to review that we're definitely doing the right thing. Matt > else > selected_result = method1; > } > @@ -4424,19 +4451,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate) > { > struct drm_atomic_state *state = cstate->base.state; > struct drm_i915_private *dev_priv = to_i915(state->dev); > - uint32_t pixel_rate; > + uint_fixed_16_16_t linetime_us; > uint32_t linetime_wm; > > - if (!cstate->base.active) > - return 0; > + linetime_us = skl_get_linetime_us(cstate); > > - pixel_rate = cstate->pixel_rate; > - > - if (WARN_ON(pixel_rate == 0)) > + if (is_fixed16_zero(linetime_us)) > return 0; > > - linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal * > - 1000, pixel_rate); > + linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8, > + linetime_us)); > > /* Display WA #1135: bxt. */ > if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled) > -- > 2.11.0 >
Hi, On Tuesday 16 May 2017 04:08 AM, Matt Roper wrote: > On Mon, May 15, 2017 at 02:04:35PM +0530, Mahesh Kumar wrote: >> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com> >> >> This patch make changes to use linetime latency if allocated >> DDB size during plane watermark calculation is not available, >> This is required to implement new DDB allocation algorithm. >> >> In New Algorithm DDB is allocated based on WM values, because of which >> number of DDB blocks will not be available during WM calculation, >> So this "linetime latency" is suggested by SV/HW team to be used during >> switch-case for WM blocks selection. > (Just realized I never actually sent my review of this patch on the > previous series review; sorry for the delay.) > > I'm not sure if the term "linetime" is completely self-explanatory. You > might want to explicitly clarify that it's the time it takes to fill a > single row at a given clock rate. So "linetime latency" here is defined > as "line time in microseconds = pipe horizontal total number of pixels / > pixel rate in MHz." Will update commit message. > >> Changes since v1: >> - Rebase on top of Paulo's patch series >> Changes since v2: >> - Fix if-else condition (pointed by Maarten) >> Changes since v3: >> - Use common function for timetime_us calculation (Paulo) >> - rebase on drm-tip >> Changes since v4: >> - Use consistent name for fixed_point operation >> >> Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com> >> --- >> drivers/gpu/drm/i915/i915_drv.h | 7 +++++++ >> drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++--------- >> 2 files changed, 40 insertions(+), 9 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h >> index a1858c3eb33a..84052990e695 100644 >> --- a/drivers/gpu/drm/i915/i915_drv.h >> +++ b/drivers/gpu/drm/i915/i915_drv.h >> @@ -115,6 +115,13 @@ typedef struct { >> fp; \ >> }) >> >> +static inline bool is_fixed16_zero(uint_fixed_16_16_t val) >> +{ >> + if (val.val == 0) >> + return true; >> + return false; >> +} >> + >> static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val) >> { >> uint_fixed_16_16_t fp; >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c >> index 0f1d9f672e83..d73369c2c2d9 100644 >> --- a/drivers/gpu/drm/i915/intel_pm.c >> +++ b/drivers/gpu/drm/i915/intel_pm.c >> @@ -4197,6 +4197,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate, >> return ret; >> } >> >> +static uint_fixed_16_16_t >> +skl_get_linetime_us(struct intel_crtc_state *cstate) > We use the same definition of linetime for all platforms that use the > concept of 'linetime' (hsw+). So maybe the skl_ prefix here isn't > really appropriate? will make it intel_get_linetime_us. > For consistency, should we also call this new function when calculating > the HSW/BDW watermarks in hsw_compute_linetime_wm()? Or consolidate > skl_compute_linetime_wm() with hsw_compute_linetime_wm()? Potentially > something we could do as a follow-up patch if we don't want to deal with > it while getting this series in. sounds good :), will create a new patch to use intel_get_linetime_us in hsw_compute_linetime_wm. hsw & skl require bunch of other calculations/WA in linetime_wm, so will keep them separate function only. > >> +{ >> + uint32_t pixel_rate; >> + uint32_t crtc_htotal; >> + uint_fixed_16_16_t linetime_us; >> + >> + if (!cstate->base.active) >> + return u32_to_fixed_16_16(0); >> + >> + pixel_rate = cstate->pixel_rate; >> + >> + if (WARN_ON(pixel_rate == 0)) >> + return u32_to_fixed_16_16(0); >> + >> + crtc_htotal = cstate->base.adjusted_mode.crtc_htotal; >> + linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate); >> + >> + return linetime_us; >> +} >> + >> static uint32_t >> skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate, >> const struct intel_plane_state *pstate) >> @@ -4331,12 +4352,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv, >> if (y_tiled) { >> selected_result = max_fixed_16_16(method2, y_tile_minimum); >> } else { >> + uint32_t linetime_us; >> + >> + linetime_us = fixed_16_16_to_u32_round_up( >> + skl_get_linetime_us(cstate)); >> if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) && >> (plane_bytes_per_line / 512 < 1)) >> selected_result = method2; >> - else if ((ddb_allocation / >> + else if ((ddb_allocation && ddb_allocation / >> fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1) >> selected_result = min_fixed_16_16(method1, method2); >> + else if (latency >= linetime_us) >> + selected_result = method2; > Based on the commit message, I'm a little bit confused. The plan is to > switch to the new algorithm where we calculate watermarks first and then > partition DDB second. So at the end, ddb_allocation should never be > available here, right? In that case, can we just remove it as a > parameter to this function and drop your first 'else if' branch here? > That would make us take the new decision logic immediately and give us > an earlier idea whether it really behaves the way we expect it to, > rather than waiting until we've made a bunch of other changes too. right, We will eventually remove first "else if" check in next patch itself :) I kept it in this patch due to earlier comment by Paulo not to remove (ddb_allocation / plane blocks per line) check until patch which removes the ddb_allocation is merged/submitted. https://patchwork.freedesktop.org/patch/115537/ > > It's also not clear to me why with the old algorithm, the DDB allocation > test being true meant "use the lower of the two watermark methods" > whereas the linetime test means "aways use method2." Is there any more > explanation available for how/why the linetime latency plays into the > watermark method decision? The new decision isn't reflected in the > bspec yet (that I can see), so it's kind of hard to review that we're > definitely doing the right thing. Thanks alot for catching this, I revisited the BSpec & it says to use minimum of method1, method2. BTW these changes are already included in internal Bspec, After new version release it should be part of open BSpec. I don't know about release process & timeline though :P -Mahesh > > > Matt > >> else >> selected_result = method1; >> } >> @@ -4424,19 +4451,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate) >> { >> struct drm_atomic_state *state = cstate->base.state; >> struct drm_i915_private *dev_priv = to_i915(state->dev); >> - uint32_t pixel_rate; >> + uint_fixed_16_16_t linetime_us; >> uint32_t linetime_wm; >> >> - if (!cstate->base.active) >> - return 0; >> + linetime_us = skl_get_linetime_us(cstate); >> >> - pixel_rate = cstate->pixel_rate; >> - >> - if (WARN_ON(pixel_rate == 0)) >> + if (is_fixed16_zero(linetime_us)) >> return 0; >> >> - linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal * >> - 1000, pixel_rate); >> + linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8, >> + linetime_us)); >> >> /* Display WA #1135: bxt. */ >> if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled) >> -- >> 2.11.0 >>
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a1858c3eb33a..84052990e695 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -115,6 +115,13 @@ typedef struct { fp; \ }) +static inline bool is_fixed16_zero(uint_fixed_16_16_t val) +{ + if (val.val == 0) + return true; + return false; +} + static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val) { uint_fixed_16_16_t fp; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 0f1d9f672e83..d73369c2c2d9 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4197,6 +4197,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate, return ret; } +static uint_fixed_16_16_t +skl_get_linetime_us(struct intel_crtc_state *cstate) +{ + uint32_t pixel_rate; + uint32_t crtc_htotal; + uint_fixed_16_16_t linetime_us; + + if (!cstate->base.active) + return u32_to_fixed_16_16(0); + + pixel_rate = cstate->pixel_rate; + + if (WARN_ON(pixel_rate == 0)) + return u32_to_fixed_16_16(0); + + crtc_htotal = cstate->base.adjusted_mode.crtc_htotal; + linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate); + + return linetime_us; +} + static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate, const struct intel_plane_state *pstate) @@ -4331,12 +4352,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv, if (y_tiled) { selected_result = max_fixed_16_16(method2, y_tile_minimum); } else { + uint32_t linetime_us; + + linetime_us = fixed_16_16_to_u32_round_up( + skl_get_linetime_us(cstate)); if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) && (plane_bytes_per_line / 512 < 1)) selected_result = method2; - else if ((ddb_allocation / + else if ((ddb_allocation && ddb_allocation / fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1) selected_result = min_fixed_16_16(method1, method2); + else if (latency >= linetime_us) + selected_result = method2; else selected_result = method1; } @@ -4424,19 +4451,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate) { struct drm_atomic_state *state = cstate->base.state; struct drm_i915_private *dev_priv = to_i915(state->dev); - uint32_t pixel_rate; + uint_fixed_16_16_t linetime_us; uint32_t linetime_wm; - if (!cstate->base.active) - return 0; + linetime_us = skl_get_linetime_us(cstate); - pixel_rate = cstate->pixel_rate; - - if (WARN_ON(pixel_rate == 0)) + if (is_fixed16_zero(linetime_us)) return 0; - linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal * - 1000, pixel_rate); + linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8, + linetime_us)); /* Display WA #1135: bxt. */ if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)