Message ID | 20220324181347.129378-2-jose.souza@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] drm/i915/display/psr: Set partial frame enable when forcing full frame fetch | expand |
Hello Jose, See my comments below. On Thu, 2022-03-24 at 11:13 -0700, José Roberto de Souza wrote: > Instead of exit PSR when a frontbuffer invalidation happens, we can > enable the PSR2 selective fetch continuous full frame, that will keep > the panel updated like PSR was disabled but without keeping PSR > active. with keeping PSR active? I don't think it's like PSR was disabled. New full frame is updated only via atomic commit. Having PSR disabled new full frame is updated all the time as PSR wasn't existing at all. > > So as soon as the frontbuffer flush happens we can disable the > continuous full frame and start to do selective fetches much quicker > than the path that would enable PSR, that will wait a few frames > to actually activate PSR. > > Also this approach has proven to fix some glitches found in > Alderlake-P > when there are a lot of invalidations happening together with page > flips. > > Some may ask why it is writing to CURSURFLIVE(), it is because > that is the way that hardware team provided us to poke display to > handle PSR updates, and it is being used since display 9. Generic comments: Current logic is to disable psr2 in invalidate callback and start sending fullframe updates on every vblank period. This is done until flush callback where psr2 is re-enabled. Intention is to update possible frontbuffer writes between invalidate/flush instantly. Now you are changing the logic to update one full frame when frontbuffer write starts (_psr_invalidate_handle) and another one when it stops (_psr_flush_handle) without disabling psr at all. Have I understood your patch correctly? Propably we wont notice this change as we have these invalidate/flush calls scattered around in the code. Also parallel atomic commits are triggering updates. In theory we could observe latency in updates between invalidate/flush? Do we care? What do you think? Do we need to send update in invalidate at all? Isn't that usually called before doing any frontbuffer writing? I.e. we would be sending frame that is already in RFB? > > Cc: Khaled Almahallawy <khaled.almahallawy@intel.com> > Cc: Shawn C Lee <shawn.c.lee@intel.com> > Cc: Jouni Högander <jouni.hogander@intel.com> > Cc: Mika Kahola <mika.kahola@intel.com> > Signed-off-by: José Roberto de Souza <jose.souza@intel.com> > --- > drivers/gpu/drm/i915/display/intel_psr.c | 109 ++++++++++++++++++++- > -- > 1 file changed, 95 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/display/intel_psr.c > b/drivers/gpu/drm/i915/display/intel_psr.c > index d87b357806c91..f7b7b374374b1 100644 > --- a/drivers/gpu/drm/i915/display/intel_psr.c > +++ b/drivers/gpu/drm/i915/display/intel_psr.c > @@ -1450,6 +1450,22 @@ static inline u32 > man_trk_ctl_partial_frame_bit_get(struct drm_i915_private *dev > PSR2_MAN_TRK_CTL_SF_PARTIAL_FRAME_UPDATE; > } > > +static inline u32 man_trk_ctl_continuos_full_frame(struct > drm_i915_private *dev_priv) > +{ > + return IS_ALDERLAKE_P(dev_priv) ? > + ADLP_PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME : > + PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME; > +} > + > +static inline u32 man_trk_ctl_su_region_start_end_mask(struct > drm_i915_private *dev_priv) > +{ > + if (IS_ALDERLAKE_P(dev_priv)) > + return ADLP_PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK > | > + ADLP_PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; > + return PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK | > + PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; > +} > + > static void psr_force_hw_tracking_exit(struct intel_dp *intel_dp) > { > struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > @@ -1546,8 +1562,9 @@ void intel_psr2_program_trans_man_trk_ctl(const > struct intel_crtc_state *crtc_st > if (!crtc_state->enable_psr2_sel_fetch) > return; > > - intel_de_write(dev_priv, PSR2_MAN_TRK_CTL(crtc_state- > >cpu_transcoder), > - crtc_state->psr2_man_track_ctl); > + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(crtc_state- > >cpu_transcoder), > + man_trk_ctl_su_region_start_end_mask(dev_priv), > + crtc_state->psr2_man_track_ctl); Should we actually now consider taking psr->lock here? > } > > static void psr2_man_trk_ctl_calc(struct intel_crtc_state > *crtc_state, > @@ -2127,6 +2144,26 @@ static void intel_psr_work(struct work_struct > *work) > mutex_unlock(&intel_dp->psr.lock); > } > > +static void _psr_invalidate_handle(struct intel_dp *intel_dp, > + unsigned int > prev_busy_frontbuffer_bits) > +{ > + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > + > + if (intel_dp->psr.psr2_sel_fetch_enabled) { > + u32 val = man_trk_ctl_continuos_full_frame(dev_priv) | > + man_trk_ctl_partial_frame_bit_get(dev_priv); > + > + /* continuos full frame is already enabled */ > + if (prev_busy_frontbuffer_bits) > + return; Should we still trigger the update using CURSURFLIVE? Or do we need that at all in the first place? > + > + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(intel_dp- > >psr.transcoder), 0, val); > + intel_de_write(dev_priv, CURSURFLIVE(intel_dp- > >psr.pipe), 0); So these two register writes here are triggering one full frame update. and leaving full frame update bit set so that coming updates are also full frame. Did I understood it correctly? As invalidate is called before frontbuffer is writen: Isn't it actually re-updating same frame which is already supposed to be in panel RFB? > + } else { > + intel_psr_exit(intel_dp); > + } > +} > + > /** > * intel_psr_invalidate - Invalidade PSR > * @dev_priv: i915 device > @@ -2151,6 +2188,7 @@ void intel_psr_invalidate(struct > drm_i915_private *dev_priv, > for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { > unsigned int pipe_frontbuffer_bits = frontbuffer_bits; > struct intel_dp *intel_dp = enc_to_intel_dp(encoder); > + unsigned int prev_busy_frontbuffer_bits; > > mutex_lock(&intel_dp->psr.lock); > if (!intel_dp->psr.enabled) { > @@ -2158,12 +2196,13 @@ void intel_psr_invalidate(struct > drm_i915_private *dev_priv, > continue; > } > > + prev_busy_frontbuffer_bits = intel_dp- > >psr.busy_frontbuffer_bits; > pipe_frontbuffer_bits &= > INTEL_FRONTBUFFER_ALL_MASK(intel_dp->psr.pipe); > intel_dp->psr.busy_frontbuffer_bits |= > pipe_frontbuffer_bits; > > if (pipe_frontbuffer_bits) > - intel_psr_exit(intel_dp); > + _psr_invalidate_handle(intel_dp, > prev_busy_frontbuffer_bits); > > mutex_unlock(&intel_dp->psr.lock); > } > @@ -2195,6 +2234,49 @@ tgl_dc3co_flush_locked(struct intel_dp > *intel_dp, unsigned int frontbuffer_bits, > intel_dp->psr.dc3co_exit_delay); > } > > +static void _psr_flush_handle(struct intel_dp *intel_dp, > + unsigned int prev_busy_frontbuffer_bits) > +{ > + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > + > + if (intel_dp->psr.psr2_sel_fetch_enabled) { > + /* is continuos full frame enabled? */ > + if (prev_busy_frontbuffer_bits) { > + /* it is, can we turn it off? */ As you are using this prev_busy_frontbuffer_bits only to check if "continuos full frame" is enabled and nothing else: Maybe you could just name it as it is e.g. bool cff_enabled or bool continuous_full_frame or...same comment in _psr_invalidate_handle. This would allow you to drop couple of comments. > + if (intel_dp->psr.busy_frontbuffer_bits == 0) { > + u32 clear = > man_trk_ctl_continuos_full_frame(dev_priv); > + u32 set = > man_trk_ctl_single_full_frame_bit_get(dev_priv) | > + man_trk_ctl_partial_frame_bit > _get(dev_priv); > + > + /* > + * turn continuos full frame off and do > a single > + * full frame > + */ > + intel_de_rmw(dev_priv, > + PSR2_MAN_TRK_CTL(intel_dp- > >psr.transcoder), > + clear, set); > + intel_de_write(dev_priv, > CURSURFLIVE(intel_dp->psr.pipe), 0); > + } > + } else { > + /* > + * continuos full frame is disabled, only a > single full > + * frame is required > + */ > + psr_force_hw_tracking_exit(intel_dp); > + } > + } else { > + /* > + * if prev_busy_frontbuffer_bits is set, it means that > PSR is > + * inactive > + */ > + if (prev_busy_frontbuffer_bits == 0) > + psr_force_hw_tracking_exit(intel_dp); > + > + if (!intel_dp->psr.active && !intel_dp- > >psr.busy_frontbuffer_bits) > + schedule_work(&intel_dp->psr.work); > + } > +} > + > /** > * intel_psr_flush - Flush PSR > * @dev_priv: i915 device > @@ -2216,6 +2298,7 @@ void intel_psr_flush(struct drm_i915_private > *dev_priv, > for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { > unsigned int pipe_frontbuffer_bits = frontbuffer_bits; > struct intel_dp *intel_dp = enc_to_intel_dp(encoder); > + unsigned int prev_busy_frontbuffer_bits; > > mutex_lock(&intel_dp->psr.lock); > if (!intel_dp->psr.enabled) { > @@ -2223,6 +2306,7 @@ void intel_psr_flush(struct drm_i915_private > *dev_priv, > continue; > } > > + prev_busy_frontbuffer_bits = intel_dp- > >psr.busy_frontbuffer_bits; > pipe_frontbuffer_bits &= > INTEL_FRONTBUFFER_ALL_MASK(intel_dp->psr.pipe); > intel_dp->psr.busy_frontbuffer_bits &= > ~pipe_frontbuffer_bits; > @@ -2232,25 +2316,22 @@ void intel_psr_flush(struct drm_i915_private > *dev_priv, > * we have to ensure that the PSR is not activated > until > * intel_psr_resume() is called. > */ > - if (intel_dp->psr.paused) { > - mutex_unlock(&intel_dp->psr.lock); > - continue; > - } > + if (intel_dp->psr.paused) > + goto exit; > > if (origin == ORIGIN_FLIP || > (origin == ORIGIN_CURSOR_UPDATE && > !intel_dp->psr.psr2_sel_fetch_enabled)) { > tgl_dc3co_flush_locked(intel_dp, > frontbuffer_bits, origin); > - mutex_unlock(&intel_dp->psr.lock); > - continue; > + goto exit; > } > > - /* By definition flush = invalidate + flush */ > - if (pipe_frontbuffer_bits) > - psr_force_hw_tracking_exit(intel_dp); > + if (pipe_frontbuffer_bits == 0) > + goto exit; > > - if (!intel_dp->psr.active && !intel_dp- > >psr.busy_frontbuffer_bits) > - schedule_work(&intel_dp->psr.work); > + /* By definition flush = invalidate + flush */ > + _psr_flush_handle(intel_dp, > prev_busy_frontbuffer_bits); > +exit: I think you should name it as unlock. > mutex_unlock(&intel_dp->psr.lock); > } > } BR, Jouni Högander
On Fri, 2022-03-25 at 14:21 +0000, Hogander, Jouni wrote: > Hello Jose, > > See my comments below. > > On Thu, 2022-03-24 at 11:13 -0700, José Roberto de Souza wrote: > > Instead of exit PSR when a frontbuffer invalidation happens, we can > > enable the PSR2 selective fetch continuous full frame, that will keep > > the panel updated like PSR was disabled but without keeping PSR > > active. > > with keeping PSR active? I don't think it's like PSR was disabled. New > full frame is updated only via atomic commit. Having PSR disabled new > full frame is updated all the time as PSR wasn't existing at all. > > > > > So as soon as the frontbuffer flush happens we can disable the > > continuous full frame and start to do selective fetches much quicker > > than the path that would enable PSR, that will wait a few frames > > to actually activate PSR. > > > > Also this approach has proven to fix some glitches found in > > Alderlake-P > > when there are a lot of invalidations happening together with page > > flips. > > > > Some may ask why it is writing to CURSURFLIVE(), it is because > > that is the way that hardware team provided us to poke display to > > handle PSR updates, and it is being used since display 9. > > Generic comments: > > Current logic is to disable psr2 in invalidate callback and start > sending fullframe updates on every vblank period. This is done until > flush callback where psr2 is re-enabled. Intention is to update > possible frontbuffer writes between invalidate/flush instantly. > > Now you are changing the logic to update one full frame when It is not enabling the one full frame, it is enabling the continuous full frame so at every vblank panel will be updated until this bit cleared. > frontbuffer write starts (_psr_invalidate_handle) and another one when > it stops (_psr_flush_handle) without disabling psr at all. Have I > understood your patch correctly? > > Propably we wont notice this change as we have these invalidate/flush > calls scattered around in the code. Also parallel atomic commits are > triggering updates. In theory we could observe latency in updates > between invalidate/flush? Do we care? What do you think? > > Do we need to send update in invalidate at all? Isn't that usually > called before doing any frontbuffer writing? I.e. we would be sending > frame that is already in RFB? > > > > > Cc: Khaled Almahallawy <khaled.almahallawy@intel.com> > > Cc: Shawn C Lee <shawn.c.lee@intel.com> > > Cc: Jouni Högander <jouni.hogander@intel.com> > > Cc: Mika Kahola <mika.kahola@intel.com> > > Signed-off-by: José Roberto de Souza <jose.souza@intel.com> > > --- > > drivers/gpu/drm/i915/display/intel_psr.c | 109 ++++++++++++++++++++- > > -- > > 1 file changed, 95 insertions(+), 14 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/display/intel_psr.c > > b/drivers/gpu/drm/i915/display/intel_psr.c > > index d87b357806c91..f7b7b374374b1 100644 > > --- a/drivers/gpu/drm/i915/display/intel_psr.c > > +++ b/drivers/gpu/drm/i915/display/intel_psr.c > > @@ -1450,6 +1450,22 @@ static inline u32 > > man_trk_ctl_partial_frame_bit_get(struct drm_i915_private *dev > > PSR2_MAN_TRK_CTL_SF_PARTIAL_FRAME_UPDATE; > > } > > > > +static inline u32 man_trk_ctl_continuos_full_frame(struct > > drm_i915_private *dev_priv) > > +{ > > + return IS_ALDERLAKE_P(dev_priv) ? > > + ADLP_PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME : > > + PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME; > > +} > > + > > +static inline u32 man_trk_ctl_su_region_start_end_mask(struct > > drm_i915_private *dev_priv) > > +{ > > + if (IS_ALDERLAKE_P(dev_priv)) > > + return ADLP_PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK > > > > > + ADLP_PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; > > + return PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK | > > + PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; > > +} > > + > > static void psr_force_hw_tracking_exit(struct intel_dp *intel_dp) > > { > > struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > > @@ -1546,8 +1562,9 @@ void intel_psr2_program_trans_man_trk_ctl(const > > struct intel_crtc_state *crtc_st > > if (!crtc_state->enable_psr2_sel_fetch) > > return; > > > > - intel_de_write(dev_priv, PSR2_MAN_TRK_CTL(crtc_state- > > > cpu_transcoder), > > - crtc_state->psr2_man_track_ctl); > > + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(crtc_state- > > > cpu_transcoder), > > + man_trk_ctl_su_region_start_end_mask(dev_priv), > > + crtc_state->psr2_man_track_ctl); > > Should we actually now consider taking psr->lock here? I don't think we need. mmio writes are syncronized, this will never set continuous full frame and will only clear su region start and end. Also this function is called from a time sensitive section if we spend too much time here it will cause vblank evasion warnings. > > > } > > > > static void psr2_man_trk_ctl_calc(struct intel_crtc_state > > *crtc_state, > > @@ -2127,6 +2144,26 @@ static void intel_psr_work(struct work_struct > > *work) > > mutex_unlock(&intel_dp->psr.lock); > > } > > > > +static void _psr_invalidate_handle(struct intel_dp *intel_dp, > > + unsigned int > > prev_busy_frontbuffer_bits) > > +{ > > + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > > + > > + if (intel_dp->psr.psr2_sel_fetch_enabled) { > > + u32 val = man_trk_ctl_continuos_full_frame(dev_priv) | > > + man_trk_ctl_partial_frame_bit_get(dev_priv); > > + > > + /* continuos full frame is already enabled */ > > + if (prev_busy_frontbuffer_bits) > > + return; > > Should we still trigger the update using CURSURFLIVE? Or do we need > that at all in the first place? *continuos full frame* > > > + > > + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(intel_dp- > > > psr.transcoder), 0, val); > > + intel_de_write(dev_priv, CURSURFLIVE(intel_dp- > > > psr.pipe), 0); > > So these two register writes here are triggering one full frame update. > and leaving full frame update bit set so that coming updates are also > full frame. Did I understood it correctly? *continuos full frame* > > As invalidate is called before frontbuffer is writen: Isn't it actually > re-updating same frame which is already supposed to be in panel RFB? > > > + } else { > > + intel_psr_exit(intel_dp); > > + } > > +} > > + > > /** > > * intel_psr_invalidate - Invalidade PSR > > * @dev_priv: i915 device > > @@ -2151,6 +2188,7 @@ void intel_psr_invalidate(struct > > drm_i915_private *dev_priv, > > for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { > > unsigned int pipe_frontbuffer_bits = frontbuffer_bits; > > struct intel_dp *intel_dp = enc_to_intel_dp(encoder); > > + unsigned int prev_busy_frontbuffer_bits; > > > > mutex_lock(&intel_dp->psr.lock); > > if (!intel_dp->psr.enabled) { > > @@ -2158,12 +2196,13 @@ void intel_psr_invalidate(struct > > drm_i915_private *dev_priv, > > continue; > > } > > > > + prev_busy_frontbuffer_bits = intel_dp- > > > psr.busy_frontbuffer_bits; > > pipe_frontbuffer_bits &= > > INTEL_FRONTBUFFER_ALL_MASK(intel_dp->psr.pipe); > > intel_dp->psr.busy_frontbuffer_bits |= > > pipe_frontbuffer_bits; > > > > if (pipe_frontbuffer_bits) > > - intel_psr_exit(intel_dp); > > + _psr_invalidate_handle(intel_dp, > > prev_busy_frontbuffer_bits); > > > > mutex_unlock(&intel_dp->psr.lock); > > } > > @@ -2195,6 +2234,49 @@ tgl_dc3co_flush_locked(struct intel_dp > > *intel_dp, unsigned int frontbuffer_bits, > > intel_dp->psr.dc3co_exit_delay); > > } > > > > +static void _psr_flush_handle(struct intel_dp *intel_dp, > > + unsigned int prev_busy_frontbuffer_bits) > > +{ > > + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > > + > > + if (intel_dp->psr.psr2_sel_fetch_enabled) { > > + /* is continuos full frame enabled? */ > > + if (prev_busy_frontbuffer_bits) { > > + /* it is, can we turn it off? */ > > As you are using this prev_busy_frontbuffer_bits only to check if > "continuos full frame" is enabled and nothing else: Maybe you could > just name it as it is e.g. bool cff_enabled or bool > continuous_full_frame or...same comment in _psr_invalidate_handle. This > would allow you to drop couple of comments. prev_busy_frontbuffer_bits or a bool anyway works for me, if you think is easier to understand I can change to bool. > > > + if (intel_dp->psr.busy_frontbuffer_bits == 0) { > > + u32 clear = > > man_trk_ctl_continuos_full_frame(dev_priv); > > + u32 set = > > man_trk_ctl_single_full_frame_bit_get(dev_priv) | > > + man_trk_ctl_partial_frame_bit > > _get(dev_priv); > > > > + > > + /* > > + * turn continuos full frame off and do > > a single > > + * full frame > > + */ > > + intel_de_rmw(dev_priv, > > + PSR2_MAN_TRK_CTL(intel_dp- > > > psr.transcoder), > > + clear, set); > > + intel_de_write(dev_priv, > > CURSURFLIVE(intel_dp->psr.pipe), 0); > > + } > > + } else { > > + /* > > + * continuos full frame is disabled, only a > > single full > > + * frame is required > > + */ > > + psr_force_hw_tracking_exit(intel_dp); > > + } > > + } else { > > + /* > > + * if prev_busy_frontbuffer_bits is set, it means that > > PSR is > > + * inactive > > + */ > > + if (prev_busy_frontbuffer_bits == 0) > > + psr_force_hw_tracking_exit(intel_dp); > > + > > + if (!intel_dp->psr.active && !intel_dp- > > > psr.busy_frontbuffer_bits) > > + schedule_work(&intel_dp->psr.work); > > + } > > +} > > + > > /** > > * intel_psr_flush - Flush PSR > > * @dev_priv: i915 device > > @@ -2216,6 +2298,7 @@ void intel_psr_flush(struct drm_i915_private > > *dev_priv, > > for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { > > unsigned int pipe_frontbuffer_bits = frontbuffer_bits; > > struct intel_dp *intel_dp = enc_to_intel_dp(encoder); > > + unsigned int prev_busy_frontbuffer_bits; > > > > mutex_lock(&intel_dp->psr.lock); > > if (!intel_dp->psr.enabled) { > > @@ -2223,6 +2306,7 @@ void intel_psr_flush(struct drm_i915_private > > *dev_priv, > > continue; > > } > > > > + prev_busy_frontbuffer_bits = intel_dp- > > > psr.busy_frontbuffer_bits; > > pipe_frontbuffer_bits &= > > INTEL_FRONTBUFFER_ALL_MASK(intel_dp->psr.pipe); > > intel_dp->psr.busy_frontbuffer_bits &= > > ~pipe_frontbuffer_bits; > > @@ -2232,25 +2316,22 @@ void intel_psr_flush(struct drm_i915_private > > *dev_priv, > > * we have to ensure that the PSR is not activated > > until > > * intel_psr_resume() is called. > > */ > > - if (intel_dp->psr.paused) { > > - mutex_unlock(&intel_dp->psr.lock); > > - continue; > > - } > > + if (intel_dp->psr.paused) > > + goto exit; > > > > if (origin == ORIGIN_FLIP || > > (origin == ORIGIN_CURSOR_UPDATE && > > !intel_dp->psr.psr2_sel_fetch_enabled)) { > > tgl_dc3co_flush_locked(intel_dp, > > frontbuffer_bits, origin); > > - mutex_unlock(&intel_dp->psr.lock); > > - continue; > > + goto exit; > > } > > > > - /* By definition flush = invalidate + flush */ > > - if (pipe_frontbuffer_bits) > > - psr_force_hw_tracking_exit(intel_dp); > > + if (pipe_frontbuffer_bits == 0) > > + goto exit; > > > > - if (!intel_dp->psr.active && !intel_dp- > > > psr.busy_frontbuffer_bits) > > - schedule_work(&intel_dp->psr.work); > > + /* By definition flush = invalidate + flush */ > > + _psr_flush_handle(intel_dp, > > prev_busy_frontbuffer_bits); > > +exit: > > I think you should name it as unlock. > > > mutex_unlock(&intel_dp->psr.lock); > > } > > } > > BR, > > Jouni Högander >
Hello Jose, Thank you for your response. I'm sorry for the delay on my side. See my comment below. On Fri, 2022-03-25 at 14:46 +0000, Souza, Jose wrote: > On Fri, 2022-03-25 at 14:21 +0000, Hogander, Jouni wrote: > > Hello Jose, > > > > See my comments below. > > > > On Thu, 2022-03-24 at 11:13 -0700, José Roberto de Souza wrote: > > > Instead of exit PSR when a frontbuffer invalidation happens, we > > > can > > > enable the PSR2 selective fetch continuous full frame, that will > > > keep > > > the panel updated like PSR was disabled but without keeping PSR > > > active. > > > > with keeping PSR active? I don't think it's like PSR was disabled. > > New > > full frame is updated only via atomic commit. Having PSR disabled > > new > > full frame is updated all the time as PSR wasn't existing at all. > > > > > So as soon as the frontbuffer flush happens we can disable the > > > continuous full frame and start to do selective fetches much > > > quicker > > > than the path that would enable PSR, that will wait a few frames > > > to actually activate PSR. > > > > > > Also this approach has proven to fix some glitches found in > > > Alderlake-P > > > when there are a lot of invalidations happening together with > > > page > > > flips. > > > > > > Some may ask why it is writing to CURSURFLIVE(), it is because > > > that is the way that hardware team provided us to poke display to > > > handle PSR updates, and it is being used since display 9. > > > > Generic comments: > > > > Current logic is to disable psr2 in invalidate callback and start > > sending fullframe updates on every vblank period. This is done > > until > > flush callback where psr2 is re-enabled. Intention is to update > > possible frontbuffer writes between invalidate/flush instantly. > > > > Now you are changing the logic to update one full frame when > > It is not enabling the one full frame, it is enabling the continuous > full frame so at every vblank panel will be updated until this bit > cleared. There is comment in bspec stating otherwise (55229). I think it is important to clarify so that we really understand what this WA is doing and to avoid hiding the issue just for this specific case. Currently I'm suspecting that "SF Continuous Full Frame" just configures next selective update to full frame and then write to CURSURFLIVE triggers flip which is sending that full frame (only one frame). Updates between invalidate and flush are just depending if there are flips in between. After getting this clarified we can properly review this WA patch. > > > frontbuffer write starts (_psr_invalidate_handle) and another one > > when > > it stops (_psr_flush_handle) without disabling psr at all. Have I > > understood your patch correctly? > > > > Propably we wont notice this change as we have these > > invalidate/flush > > calls scattered around in the code. Also parallel atomic commits > > are > > triggering updates. In theory we could observe latency in updates > > between invalidate/flush? Do we care? What do you think? > > > > Do we need to send update in invalidate at all? Isn't that usually > > called before doing any frontbuffer writing? I.e. we would be > > sending > > frame that is already in RFB? > > > > > Cc: Khaled Almahallawy <khaled.almahallawy@intel.com> > > > Cc: Shawn C Lee <shawn.c.lee@intel.com> > > > Cc: Jouni Högander <jouni.hogander@intel.com> > > > Cc: Mika Kahola <mika.kahola@intel.com> > > > Signed-off-by: José Roberto de Souza <jose.souza@intel.com> > > > --- > > > drivers/gpu/drm/i915/display/intel_psr.c | 109 > > > ++++++++++++++++++++- > > > -- > > > 1 file changed, 95 insertions(+), 14 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_psr.c > > > b/drivers/gpu/drm/i915/display/intel_psr.c > > > index d87b357806c91..f7b7b374374b1 100644 > > > --- a/drivers/gpu/drm/i915/display/intel_psr.c > > > +++ b/drivers/gpu/drm/i915/display/intel_psr.c > > > @@ -1450,6 +1450,22 @@ static inline u32 > > > man_trk_ctl_partial_frame_bit_get(struct drm_i915_private *dev > > > PSR2_MAN_TRK_CTL_SF_PARTIAL_FRAME_UPDATE; > > > } > > > > > > +static inline u32 man_trk_ctl_continuos_full_frame(struct > > > drm_i915_private *dev_priv) > > > +{ > > > + return IS_ALDERLAKE_P(dev_priv) ? > > > + ADLP_PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME : > > > + PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME; > > > +} > > > + > > > +static inline u32 man_trk_ctl_su_region_start_end_mask(struct > > > drm_i915_private *dev_priv) > > > +{ > > > + if (IS_ALDERLAKE_P(dev_priv)) > > > + return > > > ADLP_PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK > > > + ADLP_PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MAS > > > K; > > > + return PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK | > > > + PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; > > > +} > > > + > > > static void psr_force_hw_tracking_exit(struct intel_dp > > > *intel_dp) > > > { > > > struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > > > @@ -1546,8 +1562,9 @@ void > > > intel_psr2_program_trans_man_trk_ctl(const > > > struct intel_crtc_state *crtc_st > > > if (!crtc_state->enable_psr2_sel_fetch) > > > return; > > > > > > - intel_de_write(dev_priv, PSR2_MAN_TRK_CTL(crtc_state- > > > > cpu_transcoder), > > > - crtc_state->psr2_man_track_ctl); > > > + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(crtc_state- > > > > cpu_transcoder), > > > + man_trk_ctl_su_region_start_end_mask(dev_priv) > > > , > > > + crtc_state->psr2_man_track_ctl); > > > > Should we actually now consider taking psr->lock here? > > I don't think we need. mmio writes are syncronized, this will never > set continuous full frame and will only clear su region start and > end. > Also this function is called from a time sensitive section if we > spend too much time here it will cause vblank evasion warnings. > Ok, how about hitting flush/invalidate in between read/write in intel_de_rmw? Is that possible? > > > } > > > > > > static void psr2_man_trk_ctl_calc(struct intel_crtc_state > > > *crtc_state, > > > @@ -2127,6 +2144,26 @@ static void intel_psr_work(struct > > > work_struct > > > *work) > > > mutex_unlock(&intel_dp->psr.lock); > > > } > > > > > > +static void _psr_invalidate_handle(struct intel_dp *intel_dp, > > > + unsigned int > > > prev_busy_frontbuffer_bits) > > > +{ > > > + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > > > + > > > + if (intel_dp->psr.psr2_sel_fetch_enabled) { > > > + u32 val = > > > man_trk_ctl_continuos_full_frame(dev_priv) | > > > + man_trk_ctl_partial_frame_bit_get(dev_pri > > > v); > > > + > > > + /* continuos full frame is already enabled */ > > > + if (prev_busy_frontbuffer_bits) > > > + return; > > > > Should we still trigger the update using CURSURFLIVE? Or do we need > > that at all in the first place? > > *continuos full frame* > > > > + > > > + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(intel_dp- > > > > psr.transcoder), 0, val); > > > + intel_de_write(dev_priv, CURSURFLIVE(intel_dp- > > > > psr.pipe), 0); > > > > So these two register writes here are triggering one full frame > > update. > > and leaving full frame update bit set so that coming updates are > > also > > full frame. Did I understood it correctly? > > *continuos full frame* > > > As invalidate is called before frontbuffer is writen: Isn't it > > actually > > re-updating same frame which is already supposed to be in panel > > RFB? > > > > > + } else { > > > + intel_psr_exit(intel_dp); > > > + } > > > +} > > > + > > > /** > > > * intel_psr_invalidate - Invalidade PSR > > > * @dev_priv: i915 device > > > @@ -2151,6 +2188,7 @@ void intel_psr_invalidate(struct > > > drm_i915_private *dev_priv, > > > for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { > > > unsigned int pipe_frontbuffer_bits = > > > frontbuffer_bits; > > > struct intel_dp *intel_dp = > > > enc_to_intel_dp(encoder); > > > + unsigned int prev_busy_frontbuffer_bits; > > > > > > mutex_lock(&intel_dp->psr.lock); > > > if (!intel_dp->psr.enabled) { > > > @@ -2158,12 +2196,13 @@ void intel_psr_invalidate(struct > > > drm_i915_private *dev_priv, > > > continue; > > > } > > > > > > + prev_busy_frontbuffer_bits = intel_dp- > > > > psr.busy_frontbuffer_bits; > > > pipe_frontbuffer_bits &= > > > INTEL_FRONTBUFFER_ALL_MASK(intel_dp- > > > >psr.pipe); > > > intel_dp->psr.busy_frontbuffer_bits |= > > > pipe_frontbuffer_bits; > > > > > > if (pipe_frontbuffer_bits) > > > - intel_psr_exit(intel_dp); > > > + _psr_invalidate_handle(intel_dp, > > > prev_busy_frontbuffer_bits); > > > > > > mutex_unlock(&intel_dp->psr.lock); > > > } > > > @@ -2195,6 +2234,49 @@ tgl_dc3co_flush_locked(struct intel_dp > > > *intel_dp, unsigned int frontbuffer_bits, > > > intel_dp->psr.dc3co_exit_delay); > > > } > > > > > > +static void _psr_flush_handle(struct intel_dp *intel_dp, > > > + unsigned int > > > prev_busy_frontbuffer_bits) > > > +{ > > > + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); > > > + > > > + if (intel_dp->psr.psr2_sel_fetch_enabled) { > > > + /* is continuos full frame enabled? */ > > > + if (prev_busy_frontbuffer_bits) { > > > + /* it is, can we turn it off? */ > > > > As you are using this prev_busy_frontbuffer_bits only to check if > > "continuos full frame" is enabled and nothing else: Maybe you could > > just name it as it is e.g. bool cff_enabled or bool > > continuous_full_frame or...same comment in _psr_invalidate_handle. > > This > > would allow you to drop couple of comments. > > prev_busy_frontbuffer_bits or a bool anyway works for me, if you > think is easier to understand I can change to bool. To my opinion it actually helps understanding the code. This was just idea. I'm not insisting to change if you think current one is better. > > > > + if (intel_dp->psr.busy_frontbuffer_bits == > > > 0) { > > > + u32 clear = > > > man_trk_ctl_continuos_full_frame(dev_priv); > > > + u32 set = > > > man_trk_ctl_single_full_frame_bit_get(dev_priv) | > > > + man_trk_ctl_partial_frame > > > _bit > > > _get(dev_priv); > > > + > > > + /* > > > + * turn continuos full frame off > > > and do > > > a single > > > + * full frame > > > + */ > > > + intel_de_rmw(dev_priv, > > > + PSR2_MAN_TRK_CTL(intel > > > _dp- > > > > psr.transcoder), > > > + clear, set); > > > + intel_de_write(dev_priv, > > > CURSURFLIVE(intel_dp->psr.pipe), 0); > > > + } > > > + } else { > > > + /* > > > + * continuos full frame is disabled, only a > > > single full > > > + * frame is required > > > + */ > > > + psr_force_hw_tracking_exit(intel_dp); > > > + } > > > + } else { > > > + /* > > > + * if prev_busy_frontbuffer_bits is set, it means > > > that > > > PSR is > > > + * inactive > > > + */ > > > + if (prev_busy_frontbuffer_bits == 0) > > > + psr_force_hw_tracking_exit(intel_dp); > > > + > > > + if (!intel_dp->psr.active && !intel_dp- > > > > psr.busy_frontbuffer_bits) > > > + schedule_work(&intel_dp->psr.work); > > > + } > > > +} > > > + > > > /** > > > * intel_psr_flush - Flush PSR > > > * @dev_priv: i915 device > > > @@ -2216,6 +2298,7 @@ void intel_psr_flush(struct > > > drm_i915_private > > > *dev_priv, > > > for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { > > > unsigned int pipe_frontbuffer_bits = > > > frontbuffer_bits; > > > struct intel_dp *intel_dp = > > > enc_to_intel_dp(encoder); > > > + unsigned int prev_busy_frontbuffer_bits; > > > > > > mutex_lock(&intel_dp->psr.lock); > > > if (!intel_dp->psr.enabled) { > > > @@ -2223,6 +2306,7 @@ void intel_psr_flush(struct > > > drm_i915_private > > > *dev_priv, > > > continue; > > > } > > > > > > + prev_busy_frontbuffer_bits = intel_dp- > > > > psr.busy_frontbuffer_bits; > > > pipe_frontbuffer_bits &= > > > INTEL_FRONTBUFFER_ALL_MASK(intel_dp- > > > >psr.pipe); > > > intel_dp->psr.busy_frontbuffer_bits &= > > > ~pipe_frontbuffer_bits; > > > @@ -2232,25 +2316,22 @@ void intel_psr_flush(struct > > > drm_i915_private > > > *dev_priv, > > > * we have to ensure that the PSR is not activated > > > until > > > * intel_psr_resume() is called. > > > */ > > > - if (intel_dp->psr.paused) { > > > - mutex_unlock(&intel_dp->psr.lock); > > > - continue; > > > - } > > > + if (intel_dp->psr.paused) > > > + goto exit; > > > > > > if (origin == ORIGIN_FLIP || > > > (origin == ORIGIN_CURSOR_UPDATE && > > > !intel_dp->psr.psr2_sel_fetch_enabled)) { > > > tgl_dc3co_flush_locked(intel_dp, > > > frontbuffer_bits, origin); > > > - mutex_unlock(&intel_dp->psr.lock); > > > - continue; > > > + goto exit; > > > } > > > > > > - /* By definition flush = invalidate + flush */ > > > - if (pipe_frontbuffer_bits) > > > - psr_force_hw_tracking_exit(intel_dp); > > > + if (pipe_frontbuffer_bits == 0) > > > + goto exit; > > > > > > - if (!intel_dp->psr.active && !intel_dp- > > > > psr.busy_frontbuffer_bits) > > > - schedule_work(&intel_dp->psr.work); > > > + /* By definition flush = invalidate + flush */ > > > + _psr_flush_handle(intel_dp, > > > prev_busy_frontbuffer_bits); > > > +exit: > > > > I think you should name it as unlock. > > > > > mutex_unlock(&intel_dp->psr.lock); > > > } > > > } > > > > BR, > > > > Jouni Högander > >
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index d87b357806c91..f7b7b374374b1 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -1450,6 +1450,22 @@ static inline u32 man_trk_ctl_partial_frame_bit_get(struct drm_i915_private *dev PSR2_MAN_TRK_CTL_SF_PARTIAL_FRAME_UPDATE; } +static inline u32 man_trk_ctl_continuos_full_frame(struct drm_i915_private *dev_priv) +{ + return IS_ALDERLAKE_P(dev_priv) ? + ADLP_PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME : + PSR2_MAN_TRK_CTL_SF_CONTINUOS_FULL_FRAME; +} + +static inline u32 man_trk_ctl_su_region_start_end_mask(struct drm_i915_private *dev_priv) +{ + if (IS_ALDERLAKE_P(dev_priv)) + return ADLP_PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK | + ADLP_PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; + return PSR2_MAN_TRK_CTL_SU_REGION_START_ADDR_MASK | + PSR2_MAN_TRK_CTL_SU_REGION_END_ADDR_MASK; +} + static void psr_force_hw_tracking_exit(struct intel_dp *intel_dp) { struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); @@ -1546,8 +1562,9 @@ void intel_psr2_program_trans_man_trk_ctl(const struct intel_crtc_state *crtc_st if (!crtc_state->enable_psr2_sel_fetch) return; - intel_de_write(dev_priv, PSR2_MAN_TRK_CTL(crtc_state->cpu_transcoder), - crtc_state->psr2_man_track_ctl); + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(crtc_state->cpu_transcoder), + man_trk_ctl_su_region_start_end_mask(dev_priv), + crtc_state->psr2_man_track_ctl); } static void psr2_man_trk_ctl_calc(struct intel_crtc_state *crtc_state, @@ -2127,6 +2144,26 @@ static void intel_psr_work(struct work_struct *work) mutex_unlock(&intel_dp->psr.lock); } +static void _psr_invalidate_handle(struct intel_dp *intel_dp, + unsigned int prev_busy_frontbuffer_bits) +{ + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); + + if (intel_dp->psr.psr2_sel_fetch_enabled) { + u32 val = man_trk_ctl_continuos_full_frame(dev_priv) | + man_trk_ctl_partial_frame_bit_get(dev_priv); + + /* continuos full frame is already enabled */ + if (prev_busy_frontbuffer_bits) + return; + + intel_de_rmw(dev_priv, PSR2_MAN_TRK_CTL(intel_dp->psr.transcoder), 0, val); + intel_de_write(dev_priv, CURSURFLIVE(intel_dp->psr.pipe), 0); + } else { + intel_psr_exit(intel_dp); + } +} + /** * intel_psr_invalidate - Invalidade PSR * @dev_priv: i915 device @@ -2151,6 +2188,7 @@ void intel_psr_invalidate(struct drm_i915_private *dev_priv, for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { unsigned int pipe_frontbuffer_bits = frontbuffer_bits; struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + unsigned int prev_busy_frontbuffer_bits; mutex_lock(&intel_dp->psr.lock); if (!intel_dp->psr.enabled) { @@ -2158,12 +2196,13 @@ void intel_psr_invalidate(struct drm_i915_private *dev_priv, continue; } + prev_busy_frontbuffer_bits = intel_dp->psr.busy_frontbuffer_bits; pipe_frontbuffer_bits &= INTEL_FRONTBUFFER_ALL_MASK(intel_dp->psr.pipe); intel_dp->psr.busy_frontbuffer_bits |= pipe_frontbuffer_bits; if (pipe_frontbuffer_bits) - intel_psr_exit(intel_dp); + _psr_invalidate_handle(intel_dp, prev_busy_frontbuffer_bits); mutex_unlock(&intel_dp->psr.lock); } @@ -2195,6 +2234,49 @@ tgl_dc3co_flush_locked(struct intel_dp *intel_dp, unsigned int frontbuffer_bits, intel_dp->psr.dc3co_exit_delay); } +static void _psr_flush_handle(struct intel_dp *intel_dp, + unsigned int prev_busy_frontbuffer_bits) +{ + struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); + + if (intel_dp->psr.psr2_sel_fetch_enabled) { + /* is continuos full frame enabled? */ + if (prev_busy_frontbuffer_bits) { + /* it is, can we turn it off? */ + if (intel_dp->psr.busy_frontbuffer_bits == 0) { + u32 clear = man_trk_ctl_continuos_full_frame(dev_priv); + u32 set = man_trk_ctl_single_full_frame_bit_get(dev_priv) | + man_trk_ctl_partial_frame_bit_get(dev_priv); + + /* + * turn continuos full frame off and do a single + * full frame + */ + intel_de_rmw(dev_priv, + PSR2_MAN_TRK_CTL(intel_dp->psr.transcoder), + clear, set); + intel_de_write(dev_priv, CURSURFLIVE(intel_dp->psr.pipe), 0); + } + } else { + /* + * continuos full frame is disabled, only a single full + * frame is required + */ + psr_force_hw_tracking_exit(intel_dp); + } + } else { + /* + * if prev_busy_frontbuffer_bits is set, it means that PSR is + * inactive + */ + if (prev_busy_frontbuffer_bits == 0) + psr_force_hw_tracking_exit(intel_dp); + + if (!intel_dp->psr.active && !intel_dp->psr.busy_frontbuffer_bits) + schedule_work(&intel_dp->psr.work); + } +} + /** * intel_psr_flush - Flush PSR * @dev_priv: i915 device @@ -2216,6 +2298,7 @@ void intel_psr_flush(struct drm_i915_private *dev_priv, for_each_intel_encoder_with_psr(&dev_priv->drm, encoder) { unsigned int pipe_frontbuffer_bits = frontbuffer_bits; struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + unsigned int prev_busy_frontbuffer_bits; mutex_lock(&intel_dp->psr.lock); if (!intel_dp->psr.enabled) { @@ -2223,6 +2306,7 @@ void intel_psr_flush(struct drm_i915_private *dev_priv, continue; } + prev_busy_frontbuffer_bits = intel_dp->psr.busy_frontbuffer_bits; pipe_frontbuffer_bits &= INTEL_FRONTBUFFER_ALL_MASK(intel_dp->psr.pipe); intel_dp->psr.busy_frontbuffer_bits &= ~pipe_frontbuffer_bits; @@ -2232,25 +2316,22 @@ void intel_psr_flush(struct drm_i915_private *dev_priv, * we have to ensure that the PSR is not activated until * intel_psr_resume() is called. */ - if (intel_dp->psr.paused) { - mutex_unlock(&intel_dp->psr.lock); - continue; - } + if (intel_dp->psr.paused) + goto exit; if (origin == ORIGIN_FLIP || (origin == ORIGIN_CURSOR_UPDATE && !intel_dp->psr.psr2_sel_fetch_enabled)) { tgl_dc3co_flush_locked(intel_dp, frontbuffer_bits, origin); - mutex_unlock(&intel_dp->psr.lock); - continue; + goto exit; } - /* By definition flush = invalidate + flush */ - if (pipe_frontbuffer_bits) - psr_force_hw_tracking_exit(intel_dp); + if (pipe_frontbuffer_bits == 0) + goto exit; - if (!intel_dp->psr.active && !intel_dp->psr.busy_frontbuffer_bits) - schedule_work(&intel_dp->psr.work); + /* By definition flush = invalidate + flush */ + _psr_flush_handle(intel_dp, prev_busy_frontbuffer_bits); +exit: mutex_unlock(&intel_dp->psr.lock); } }
Instead of exit PSR when a frontbuffer invalidation happens, we can enable the PSR2 selective fetch continuous full frame, that will keep the panel updated like PSR was disabled but without keeping PSR active. So as soon as the frontbuffer flush happens we can disable the continuous full frame and start to do selective fetches much quicker than the path that would enable PSR, that will wait a few frames to actually activate PSR. Also this approach has proven to fix some glitches found in Alderlake-P when there are a lot of invalidations happening together with page flips. Some may ask why it is writing to CURSURFLIVE(), it is because that is the way that hardware team provided us to poke display to handle PSR updates, and it is being used since display 9. Cc: Khaled Almahallawy <khaled.almahallawy@intel.com> Cc: Shawn C Lee <shawn.c.lee@intel.com> Cc: Jouni Högander <jouni.hogander@intel.com> Cc: Mika Kahola <mika.kahola@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> --- drivers/gpu/drm/i915/display/intel_psr.c | 109 ++++++++++++++++++++--- 1 file changed, 95 insertions(+), 14 deletions(-)