[v5] drm/i915: Avoid GPU Hang when coming out of s3 or s4

Message ID 1431347701-8997-1-git-send-email-peter.antoine@intel.com (mailing list archive)
State New, archived

Commit Message

Peter Antoine May 11, 2015, 12:35 p.m. UTC
This patch fixes a timing issue that causes a GPU hang when the system
comes out of power saving.

During pm_resume, we are submitting batchbuffers before enabling
interrupts. This causes us to miss the context switch interrupt,
and in consequence intel_execlists_handle_ctx_events is not triggered.

This patch is based on a patch from Deepak S <deepak.s@intel.com>
from another platform.

The patch fixes an issue introduced by:
  commit e7778be1eab918274f79603d7c17b3ec8be77386
  drm/i915: Fix startup failure in LRC mode after recent init changes

The above patch added a call to init_context() to fix an issue introduced
by a previous patch. However, it opened up a small timing window in which
the batches added by init_context() (basically setting up the context)
can complete before the interrupts have been turned on, thus hanging the
GPU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89600
Cc: stable@vger.kernel.org
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
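
The ordering problem described in the commit message can be illustrated with a
small userspace toy model. This is only a sketch under stated assumptions, not
i915 code: every name in it (toy_gpu, toy_enable_interrupts, toy_submit_batch,
toy_resume) is hypothetical, and it assumes that a batch completes immediately
and that an event raised while interrupts are off is simply lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the i915 driver. */
struct toy_gpu {
	bool irq_enabled;     /* are interrupts delivered to the driver? */
	int ctx_events_seen;  /* context-switch events the handler processed */
	int ctx_events_lost;  /* events raised while interrupts were off */
};

static void toy_enable_interrupts(struct toy_gpu *gpu)
{
	gpu->irq_enabled = true;
}

/*
 * Assumption: a submitted batch completes at once and raises exactly one
 * context-switch event. The event is only handled if interrupts are on.
 */
static void toy_submit_batch(struct toy_gpu *gpu)
{
	if (gpu->irq_enabled)
		gpu->ctx_events_seen++;
	else
		gpu->ctx_events_lost++;
}

/*
 * Model one resume. With irq_first == false the batch is submitted before
 * interrupts are enabled (the old order); with irq_first == true the
 * interrupts are enabled first (the order this patch establishes).
 * Returns true if an event was missed, which stands in for the hang.
 */
static bool toy_resume(struct toy_gpu *gpu, bool irq_first)
{
	assert(gpu != NULL);
	*gpu = (struct toy_gpu){0};

	if (irq_first)
		toy_enable_interrupts(gpu);

	toy_submit_batch(gpu); /* stands in for the batches issued by init_hw */

	if (!irq_first)
		toy_enable_interrupts(gpu);

	return gpu->ctx_events_lost > 0;
}
```

In this model, toy_resume(&gpu, false) reports a missed event while
toy_resume(&gpu, true) does not. The real hardware window is a race rather than
a certainty, so the model only shows why the order of the two calls matters.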

Comments

Jani Nikula May 12, 2015, 8:34 a.m. UTC | #1
On Mon, 11 May 2015, Peter Antoine <peter.antoine@intel.com> wrote:
> This patch fixes a timing issue that causes a GPU hang when the system
> comes out of power saving.
>
> During pm_resume, we are submitting batchbuffers before enabling
> interrupts. This causes us to miss the context switch interrupt,
> and in consequence intel_execlists_handle_ctx_events is not triggered.
>
> This patch is based on a patch from Deepak S <deepak.s@intel.com>
> from another platform.
>
> The patch fixes an issue introduced by:
>   commit e7778be1eab918274f79603d7c17b3ec8be77386
>   drm/i915: Fix startup failure in LRC mode after recent init changes
>
> The above patch added a call to init_context() to fix an issue introduced
> by a previous patch. However, it opened up a small timing window in which
> the batches added by init_context() (basically setting up the context)
> can complete before the interrupts have been turned on, thus hanging the
> GPU.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89600
> Cc: stable@vger.kernel.org
> Signed-off-by: Peter Antoine <peter.antoine@intel.com>

I pushed some version of this patch to drm-intel-fixes yesterday, with
some comment modifications. Thanks for the patch and review.

For future reference, please add some changelog to your patches so it's
easier to know what's changed between versions.

BR,
Jani.

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6bb6c47..748ab13 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -734,6 +734,13 @@ static int i915_drm_resume(struct drm_device *dev)
>  	intel_init_pch_refclk(dev);
>  	drm_mode_config_reset(dev);
>  
> +	/* 
> +	 * Interrupts have to be enabled before any batches are run. If not the
> +	 * GPU will hang. The init_hw will initiate batches to update/restore
> +	 * the context.
> +	 */
> +	intel_runtime_pm_enable_interrupts(dev_priv);
> +
>  	mutex_lock(&dev->struct_mutex);
>  	if (i915_gem_init_hw(dev)) {
>  		DRM_ERROR("failed to re-initialize GPU, declaring wedged!\n");
> @@ -741,9 +748,7 @@ static int i915_drm_resume(struct drm_device *dev)
>  	}
>  	mutex_unlock(&dev->struct_mutex);
>  
> -	/* We need working interrupts for modeset enabling ... */
> -	intel_runtime_pm_enable_interrupts(dev_priv);
> -
> +	/* This must follow the pm enable interrupts */
>  	intel_modeset_init_hw(dev);
>  
>  	spin_lock_irq(&dev_priv->irq_lock);
> -- 
> 1.9.1
>
Shuang He May 15, 2015, 2:32 a.m. UTC | #2
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6380
-------------------------------------Summary-------------------------------------
Platform    Delta    drm-intel-nightly    Series Applied
PNV                  276/276              276/276
ILK         -1       302/302              301/302
SNB         -1       314/314              313/314
IVB                  338/338              338/338
BYT                  286/286              286/286
BDW         -1       320/320              319/320
-------------------------------------Detailed-------------------------------------
Platform  Test                                       drm-intel-nightly        Series Applied
*ILK      igt@kms_flip@flip-vs-dpms-interruptible    PASS(2)                  DMESG_WARN(1)
(dmesg patch applied)drm:intel_pch_fifo_underrun_irq_handler[i915]]*ERROR*PCH_transcoder_A_FIFO_underrun@PCH transcoder A FIFO underrun
 SNB      igt@pm_rpm@dpms-mode-unset-non-lpsp        DMESG_WARN(13) PASS(1)   DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_uncore.c:#assert_device_not_suspended[i915]()@WARNING:.* at .* assert_device_not_suspended+0x
*BDW      igt@gem_fence_thrash@bo-write-verify-y     PASS(2)                  DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_display.c:#assert_plane[i915]()@WARNING:.* at .* assert_plane
assertion_failure@assertion failure
WARNING:at_drivers/gpu/drm/drm_irq.c:#drm_wait_one_vblank[drm]()@WARNING:.* at .* drm_wait_one_vblank+0x
Note: Pay particular attention to lines that start with '*'.

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6bb6c47..748ab13 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -734,6 +734,13 @@  static int i915_drm_resume(struct drm_device *dev)
 	intel_init_pch_refclk(dev);
 	drm_mode_config_reset(dev);
 
+	/* 
+	 * Interrupts have to be enabled before any batches are run. If not the
+	 * GPU will hang. The init_hw will initiate batches to update/restore
+	 * the context.
+	 */
+	intel_runtime_pm_enable_interrupts(dev_priv);
+
 	mutex_lock(&dev->struct_mutex);
 	if (i915_gem_init_hw(dev)) {
 		DRM_ERROR("failed to re-initialize GPU, declaring wedged!\n");
@@ -741,9 +748,7 @@  static int i915_drm_resume(struct drm_device *dev)
 	}
 	mutex_unlock(&dev->struct_mutex);
 
-	/* We need working interrupts for modeset enabling ... */
-	intel_runtime_pm_enable_interrupts(dev_priv);
-
+	/* This must follow the pm enable interrupts */
 	intel_modeset_init_hw(dev);
 
 	spin_lock_irq(&dev_priv->irq_lock);