Message ID | 1471300581-10905-1-git-send-email-paulo.r.zanoni@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, 2016-08-15 at 19:36 -0300, Paulo Zanoni wrote: > Ever since I started working on FBC I was already aware that FBC can > really amplify the FIFO underrun symptoms. On systems where FIFO > underruns were harmless error messages, enabling FBC would cause the > underruns to give black screens. > > We recently tried to enable FBC on Haswell and got reports of a system > that would hang after some hours of uptime, and the first bad commit > was the one that enabled FBC. We also observed that this system had > FIFO underrun error messages on its dmesg. Although we don't have any > evidence that fixing the underruns would solve the bug and make FBC > work properly on this machine, IMHO it's better if we minimize the > amount of possible problems by just giving up FBC whenever we detect > an underrun. > > v2: New version, different implementation and commit message. > v3: Clarify the fact that we run from an IRQ handler (Chris). > > Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> > Cc: Lyude <cpaul@redhat.com> > Cc: stevenhoneyman@gmail.com <stevenhoneyman@gmail.com> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 3 ++ > drivers/gpu/drm/i915/intel_drv.h | 1 + > drivers/gpu/drm/i915/intel_fbc.c | 61 ++++++++++++++++++++++++++++++ > drivers/gpu/drm/i915/intel_fifo_underrun.c | 2 + > 4 files changed, 67 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 35caa9b..baa9c66 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -941,6 +941,9 @@ struct intel_fbc { > bool enabled; > bool active; > > + bool underrun_detected; > + struct work_struct underrun_work; > + > struct intel_fbc_state_cache { > struct { > unsigned int mode_flags; > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h > index 1c700b0..50c1eb0 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -1494,6 +1494,7 @@ void intel_fbc_invalidate(struct drm_i915_private *dev_priv, > void intel_fbc_flush(struct drm_i915_private *dev_priv, > unsigned int frontbuffer_bits, enum fb_op_origin origin); > void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv); > +void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private *dev_priv); > > /* intel_hdmi.c */ > void intel_hdmi_init(struct drm_device *dev, i915_reg_t hdmi_reg, enum port port); > diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c > index e122052..950aed5 100644 > --- a/drivers/gpu/drm/i915/intel_fbc.c > +++ b/drivers/gpu/drm/i915/intel_fbc.c > @@ -750,6 +750,13 @@ static bool intel_fbc_can_activate(struct intel_crtc *crtc) > struct intel_fbc *fbc = &dev_priv->fbc; > struct intel_fbc_state_cache *cache = &fbc->state_cache; > > + /* We don't need to use a state cache here since this information is > + * global for every CRTC. */ > + if (fbc->underrun_detected) { > + fbc->no_fbc_reason = "underrun detected"; > + return false; > + } > + > if (!cache->plane.visible) { > fbc->no_fbc_reason = "primary plane not visible"; > return false; > @@ -1193,6 +1200,59 @@ void intel_fbc_global_disable(struct drm_i915_private *dev_priv) > cancel_work_sync(&fbc->work.work); > } > > +static void intel_fbc_underrun_work_fn(struct work_struct *work) > +{ > + struct drm_i915_private *dev_priv = > + container_of(work, struct drm_i915_private, fbc.underrun_work); > + struct intel_fbc *fbc = &dev_priv->fbc; > + > + mutex_lock(&fbc->lock); > + > + /* Maybe we were scheduled twice. */ > + if (fbc->underrun_detected) > + goto out; If the underrun happens very early after boot (likely during modeset), the user might end up seeing just FBC enable and disable messages in dmesg without knowing it (or why it) was deactivated. For e.g., [ 1855.552489] [drm:intel_fbc_enable] Enabling FBC on pipe A [ 1856.854239] [drm:__intel_fbc_disable] Disabling FBC on pipe A [ 1857.753383] [drm:intel_fbc_alloc_cfb] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS. [ 1857.753384] [drm:intel_fbc_alloc_cfb] reserved 10368000 bytes of contiguous stolen space for FBC, threshold: 4 [ 1857.753384] [drm:intel_fbc_enable] Enabling FBC on pipe A I find this noise misleading because FBC is not going to be activated at all. > + > + DRM_DEBUG_KMS("Disabling FBC due to FIFO underrun.\n"); Print out the pipe here? Both the enabling and disabling debug messages clarify the pipe. It may be useful to provide this information here too. > + fbc->underrun_detected = true; > + > + intel_fbc_deactivate(dev_priv); > +out: > + mutex_unlock(&fbc->lock); > +} > + > +/** > + * intel_fbc_handle_fifo_underrun_irq - disable FBC when we get a FIFO underrun > + * @dev_priv: i915 device instance > + * > + * Without FBC, most underruns are harmless and don't really cause too many > + * problems, except for an annoying message on dmesg. With FBC, underruns can > + * become black screens or even worse, especially when paired with bad > + * watermarks. So in order for us to be on the safe side, completely disable FBC > + * in case we ever detect a FIFO underrun on any pipe. An underrun on any pipe > + * already suggests that watermarks may be bad, so try to be as safe as > + * possible. > + * > + * This function is called from the IRQ handler. > + */ > +void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private *dev_priv) > +{ > + struct intel_fbc *fbc = &dev_priv->fbc; > + > + if (!fbc_supported(dev_priv)) > + return; > + > + /* There's no guarantee that underrun_detected won't be set to true > + * right after this check and before the work is scheduled, but that's > + * not a problem since we'll check it again under the work function > + * while FBC is locked. This check here is just to prevent us from > + * unnecessarily scheduling the work, and it relies on the fact that we > + * never switch underrun_detect back to false after it's true. */ > + if (fbc->underrun_detected) > + return; > + > + schedule_work(&fbc->underrun_work); > +} > + > /** > * intel_fbc_init_pipe_state - initialize FBC's CRTC visibility tracking > * @dev_priv: i915 device instance > @@ -1264,6 +1324,7 @@ void intel_fbc_init(struct drm_i915_private *dev_priv) > enum pipe pipe; > > INIT_WORK(&fbc->work.work, intel_fbc_work_fn); > + INIT_WORK(&fbc->underrun_work, intel_fbc_underrun_work_fn); > mutex_init(&fbc->lock); > fbc->enabled = false; > fbc->active = false; > diff --git a/drivers/gpu/drm/i915/intel_fifo_underrun.c b/drivers/gpu/drm/i915/intel_fifo_underrun.c > index 2aa7440..ebb4fed 100644 > --- a/drivers/gpu/drm/i915/intel_fifo_underrun.c > +++ b/drivers/gpu/drm/i915/intel_fifo_underrun.c > @@ -372,6 +372,8 @@ void intel_cpu_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv, > if (intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false)) > DRM_ERROR("CPU pipe %c FIFO underrun\n", > pipe_name(pipe)); > + > + intel_fbc_handle_fifo_underrun_irq(dev_priv); > } > > /**
Em Sex, 2016-09-02 às 05:58 +0000, Pandiyan, Dhinakaran escreveu: > On Mon, 2016-08-15 at 19:36 -0300, Paulo Zanoni wrote: > > > > Ever since I started working on FBC I was already aware that FBC > > can > > really amplify the FIFO underrun symptoms. On systems where FIFO > > underruns were harmless error messages, enabling FBC would cause > > the > > underruns to give black screens. > > > > We recently tried to enable FBC on Haswell and got reports of a > > system > > that would hang after some hours of uptime, and the first bad > > commit > > was the one that enabled FBC. We also observed that this system had > > FIFO underrun error messages on its dmesg. Although we don't have > > any > > evidence that fixing the underruns would solve the bug and make FBC > > work properly on this machine, IMHO it's better if we minimize the > > amount of possible problems by just giving up FBC whenever we > > detect > > an underrun. > > > > v2: New version, different implementation and commit message. > > v3: Clarify the fact that we run from an IRQ handler (Chris). > > > > Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> > > Cc: Lyude <cpaul@redhat.com> > > Cc: stevenhoneyman@gmail.com <stevenhoneyman@gmail.com> > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> > > --- > > drivers/gpu/drm/i915/i915_drv.h | 3 ++ > > drivers/gpu/drm/i915/intel_drv.h | 1 + > > drivers/gpu/drm/i915/intel_fbc.c | 61 > > ++++++++++++++++++++++++++++++ > > drivers/gpu/drm/i915/intel_fifo_underrun.c | 2 + > > 4 files changed, 67 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h > > b/drivers/gpu/drm/i915/i915_drv.h > > index 35caa9b..baa9c66 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -941,6 +941,9 @@ struct intel_fbc { > > bool enabled; > > bool active; > > > > + bool underrun_detected; > > + struct work_struct underrun_work; > > + > > struct intel_fbc_state_cache { > > struct { > > unsigned int mode_flags; > > diff --git a/drivers/gpu/drm/i915/intel_drv.h > > b/drivers/gpu/drm/i915/intel_drv.h > > index 1c700b0..50c1eb0 100644 > > --- a/drivers/gpu/drm/i915/intel_drv.h > > +++ b/drivers/gpu/drm/i915/intel_drv.h > > @@ -1494,6 +1494,7 @@ void intel_fbc_invalidate(struct > > drm_i915_private *dev_priv, > > void intel_fbc_flush(struct drm_i915_private *dev_priv, > > unsigned int frontbuffer_bits, enum > > fb_op_origin origin); > > void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv); > > +void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private > > *dev_priv); > > > > /* intel_hdmi.c */ > > void intel_hdmi_init(struct drm_device *dev, i915_reg_t hdmi_reg, > > enum port port); > > diff --git a/drivers/gpu/drm/i915/intel_fbc.c > > b/drivers/gpu/drm/i915/intel_fbc.c > > index e122052..950aed5 100644 > > --- a/drivers/gpu/drm/i915/intel_fbc.c > > +++ b/drivers/gpu/drm/i915/intel_fbc.c > > @@ -750,6 +750,13 @@ static bool intel_fbc_can_activate(struct > > intel_crtc *crtc) > > struct intel_fbc *fbc = &dev_priv->fbc; > > struct intel_fbc_state_cache *cache = &fbc->state_cache; > > > > + /* We don't need to use a state cache here since this > > information is > > + * global for every CRTC. */ > > + if (fbc->underrun_detected) { > > + fbc->no_fbc_reason = "underrun detected"; > > + return false; > > + } > > + > > if (!cache->plane.visible) { > > fbc->no_fbc_reason = "primary plane not visible"; > > return false; > > @@ -1193,6 +1200,59 @@ void intel_fbc_global_disable(struct > > drm_i915_private *dev_priv) > > cancel_work_sync(&fbc->work.work); > > } > > > > +static void intel_fbc_underrun_work_fn(struct work_struct *work) > > +{ > > + struct drm_i915_private *dev_priv = > > + container_of(work, struct drm_i915_private, > > fbc.underrun_work); > > + struct intel_fbc *fbc = &dev_priv->fbc; > > + > > + mutex_lock(&fbc->lock); > > + > > + /* Maybe we were scheduled twice. */ > > + if (fbc->underrun_detected) > > + goto out; > > If the underrun happens very early after boot (likely during > modeset), > the user might end up seeing just FBC enable and disable messages in > dmesg without knowing it (or why it) was deactivated. > > For e.g., > > [ 1855.552489] [drm:intel_fbc_enable] Enabling FBC on pipe A > [ 1856.854239] [drm:__intel_fbc_disable] Disabling FBC on pipe A > [ 1857.753383] [drm:intel_fbc_alloc_cfb] Reducing the compressed > framebuffer size. This may lead to less power savings than a > non-reduced-size. Try to increase stolen memory size if available in > BIOS. > [ 1857.753384] [drm:intel_fbc_alloc_cfb] reserved 10368000 bytes of > contiguous stolen space for FBC, threshold: 4 > [ 1857.753384] [drm:intel_fbc_enable] Enabling FBC on pipe A > > I find this noise misleading because FBC is not going to be activated > at > all. You have a point here, good catch. I was trying to avoid having two checks (one for enabling, one for activation) and just kept the activation check since it was the one we needed for sure. But it seems having the extra check during enabling will prevent these extra messages, so I'll update the patch. > > > > > > + > > + DRM_DEBUG_KMS("Disabling FBC due to FIFO underrun.\n"); > > Print out the pipe here? Both the enabling and disabling debug > messages > clarify the pipe. It may be useful to provide this information here > too. I think this could be misleading because we're disabling FBC forever on all pipes whenever a FIFO underrun happens on any pipe. Maybe FBC is not enabled so there's no FBC pipe to print. Also, the DRM_ERRORs provided by the FIFO underrun detection code actually print the pipe that caused the underrun just a few lines above on dmesg, so we know that too. > > > > > + fbc->underrun_detected = true; > > + > > + intel_fbc_deactivate(dev_priv); > > +out: > > + mutex_unlock(&fbc->lock); > > +} > > + > > +/** > > + * intel_fbc_handle_fifo_underrun_irq - disable FBC when we get a > > FIFO underrun > > + * @dev_priv: i915 device instance > > + * > > + * Without FBC, most underruns are harmless and don't really cause > > too many > > + * problems, except for an annoying message on dmesg. With FBC, > > underruns can > > + * become black screens or even worse, especially when paired with > > bad > > + * watermarks. So in order for us to be on the safe side, > > completely disable FBC > > + * in case we ever detect a FIFO underrun on any pipe. An underrun > > on any pipe > > + * already suggests that watermarks may be bad, so try to be as > > safe as > > + * possible. > > + * > > + * This function is called from the IRQ handler. > > + */ > > +void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private > > *dev_priv) > > +{ > > + struct intel_fbc *fbc = &dev_priv->fbc; > > + > > + if (!fbc_supported(dev_priv)) > > + return; > > + > > + /* There's no guarantee that underrun_detected won't be > > set to true > > + * right after this check and before the work is > > scheduled, but that's > > + * not a problem since we'll check it again under the work > > function > > + * while FBC is locked. This check here is just to prevent > > us from > > + * unnecessarily scheduling the work, and it relies on the > > fact that we > > + * never switch underrun_detect back to false after it's > > true. */ > > + if (fbc->underrun_detected) > > + return; > > + > > + schedule_work(&fbc->underrun_work); > > +} > > + > > /** > > * intel_fbc_init_pipe_state - initialize FBC's CRTC visibility > > tracking > > * @dev_priv: i915 device instance > > @@ -1264,6 +1324,7 @@ void intel_fbc_init(struct drm_i915_private > > *dev_priv) > > enum pipe pipe; > > > > INIT_WORK(&fbc->work.work, intel_fbc_work_fn); > > + INIT_WORK(&fbc->underrun_work, > > intel_fbc_underrun_work_fn); > > mutex_init(&fbc->lock); > > fbc->enabled = false; > > fbc->active = false; > > diff --git a/drivers/gpu/drm/i915/intel_fifo_underrun.c > > b/drivers/gpu/drm/i915/intel_fifo_underrun.c > > index 2aa7440..ebb4fed 100644 > > --- a/drivers/gpu/drm/i915/intel_fifo_underrun.c > > +++ b/drivers/gpu/drm/i915/intel_fifo_underrun.c > > @@ -372,6 +372,8 @@ void intel_cpu_fifo_underrun_irq_handler(struct > > drm_i915_private *dev_priv, > > if (intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, > > false)) > > DRM_ERROR("CPU pipe %c FIFO underrun\n", > > pipe_name(pipe)); > > + > > + intel_fbc_handle_fifo_underrun_irq(dev_priv); > > } > > > > /** > >
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 35caa9b..baa9c66 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -941,6 +941,9 @@ struct intel_fbc { bool enabled; bool active; + bool underrun_detected; + struct work_struct underrun_work; + struct intel_fbc_state_cache { struct { unsigned int mode_flags; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1c700b0..50c1eb0 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -1494,6 +1494,7 @@ void intel_fbc_invalidate(struct drm_i915_private *dev_priv, void intel_fbc_flush(struct drm_i915_private *dev_priv, unsigned int frontbuffer_bits, enum fb_op_origin origin); void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv); +void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private *dev_priv); /* intel_hdmi.c */ void intel_hdmi_init(struct drm_device *dev, i915_reg_t hdmi_reg, enum port port); diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c index e122052..950aed5 100644 --- a/drivers/gpu/drm/i915/intel_fbc.c +++ b/drivers/gpu/drm/i915/intel_fbc.c @@ -750,6 +750,13 @@ static bool intel_fbc_can_activate(struct intel_crtc *crtc) struct intel_fbc *fbc = &dev_priv->fbc; struct intel_fbc_state_cache *cache = &fbc->state_cache; + /* We don't need to use a state cache here since this information is + * global for every CRTC. */ + if (fbc->underrun_detected) { + fbc->no_fbc_reason = "underrun detected"; + return false; + } + if (!cache->plane.visible) { fbc->no_fbc_reason = "primary plane not visible"; return false; @@ -1193,6 +1200,59 @@ void intel_fbc_global_disable(struct drm_i915_private *dev_priv) cancel_work_sync(&fbc->work.work); } +static void intel_fbc_underrun_work_fn(struct work_struct *work) +{ + struct drm_i915_private *dev_priv = + container_of(work, struct drm_i915_private, fbc.underrun_work); + struct intel_fbc *fbc = &dev_priv->fbc; + + mutex_lock(&fbc->lock); + + /* Maybe we were scheduled twice. */ + if (fbc->underrun_detected) + goto out; + + DRM_DEBUG_KMS("Disabling FBC due to FIFO underrun.\n"); + fbc->underrun_detected = true; + + intel_fbc_deactivate(dev_priv); +out: + mutex_unlock(&fbc->lock); +} + +/** + * intel_fbc_handle_fifo_underrun_irq - disable FBC when we get a FIFO underrun + * @dev_priv: i915 device instance + * + * Without FBC, most underruns are harmless and don't really cause too many + * problems, except for an annoying message on dmesg. With FBC, underruns can + * become black screens or even worse, especially when paired with bad + * watermarks. So in order for us to be on the safe side, completely disable FBC + * in case we ever detect a FIFO underrun on any pipe. An underrun on any pipe + * already suggests that watermarks may be bad, so try to be as safe as + * possible. + * + * This function is called from the IRQ handler. + */ +void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private *dev_priv) +{ + struct intel_fbc *fbc = &dev_priv->fbc; + + if (!fbc_supported(dev_priv)) + return; + + /* There's no guarantee that underrun_detected won't be set to true + * right after this check and before the work is scheduled, but that's + * not a problem since we'll check it again under the work function + * while FBC is locked. This check here is just to prevent us from + * unnecessarily scheduling the work, and it relies on the fact that we + * never switch underrun_detect back to false after it's true. */ + if (fbc->underrun_detected) + return; + + schedule_work(&fbc->underrun_work); +} + /** * intel_fbc_init_pipe_state - initialize FBC's CRTC visibility tracking * @dev_priv: i915 device instance @@ -1264,6 +1324,7 @@ void intel_fbc_init(struct drm_i915_private *dev_priv) enum pipe pipe; INIT_WORK(&fbc->work.work, intel_fbc_work_fn); + INIT_WORK(&fbc->underrun_work, intel_fbc_underrun_work_fn); mutex_init(&fbc->lock); fbc->enabled = false; fbc->active = false; diff --git a/drivers/gpu/drm/i915/intel_fifo_underrun.c b/drivers/gpu/drm/i915/intel_fifo_underrun.c index 2aa7440..ebb4fed 100644 --- a/drivers/gpu/drm/i915/intel_fifo_underrun.c +++ b/drivers/gpu/drm/i915/intel_fifo_underrun.c @@ -372,6 +372,8 @@ void intel_cpu_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv, if (intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false)) DRM_ERROR("CPU pipe %c FIFO underrun\n", pipe_name(pipe)); + + intel_fbc_handle_fifo_underrun_irq(dev_priv); } /**
Ever since I started working on FBC I was already aware that FBC can really amplify the FIFO underrun symptoms. On systems where FIFO underruns were harmless error messages, enabling FBC would cause the underruns to give black screens. We recently tried to enable FBC on Haswell and got reports of a system that would hang after some hours of uptime, and the first bad commit was the one that enabled FBC. We also observed that this system had FIFO underrun error messages on its dmesg. Although we don't have any evidence that fixing the underruns would solve the bug and make FBC work properly on this machine, IMHO it's better if we minimize the amount of possible problems by just giving up FBC whenever we detect an underrun. v2: New version, different implementation and commit message. v3: Clarify the fact that we run from an IRQ handler (Chris). Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Lyude <cpaul@redhat.com> Cc: stevenhoneyman@gmail.com <stevenhoneyman@gmail.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> --- drivers/gpu/drm/i915/i915_drv.h | 3 ++ drivers/gpu/drm/i915/intel_drv.h | 1 + drivers/gpu/drm/i915/intel_fbc.c | 61 ++++++++++++++++++++++++++++++ drivers/gpu/drm/i915/intel_fifo_underrun.c | 2 + 4 files changed, 67 insertions(+)