From patchwork Tue Feb 26 11:05:09 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mika Kuoppala
X-Patchwork-Id: 2184951
Return-Path:
X-Original-To: patchwork-intel-gfx@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork2.kernel.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by patchwork2.kernel.org (Postfix) with ESMTP id 6FC67DF215
	for ; Tue, 26 Feb 2013 11:05:19 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 49935E64CB
	for ; Tue, 26 Feb 2013 03:05:19 -0800 (PST)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from mga03.intel.com (mga03.intel.com [143.182.124.21])
	by gabe.freedesktop.org (Postfix) with ESMTP id 0B296E64CB
	for ; Tue, 26 Feb 2013 03:02:24 -0800 (PST)
Received: from azsmga001.ch.intel.com ([10.2.17.19])
	by azsmga101.ch.intel.com with ESMTP; 26 Feb 2013 03:02:24 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.84,739,1355126400"; d="scan'208";a="261531621"
Received: from rosetta.fi.intel.com (HELO rosetta) ([10.237.72.51])
	by azsmga001.ch.intel.com with ESMTP; 26 Feb 2013 03:02:23 -0800
Received: by rosetta (Postfix, from userid 1000)
	id 0D035800E0; Tue, 26 Feb 2013 13:05:20 +0200 (EET)
From: Mika Kuoppala
To: intel-gfx@lists.freedesktop.org
Date: Tue, 26 Feb 2013 13:05:09 +0200
Message-Id: <1361876716-8625-7-git-send-email-mika.kuoppala@intel.com>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1361876716-8625-1-git-send-email-mika.kuoppala@intel.com>
References: <1361876716-8625-1-git-send-email-mika.kuoppala@intel.com>
Subject: [Intel-gfx] [PATCH 06/13] drm/i915: detect hang using per ring hangcheck_score
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Intel graphics driver community testing & development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org

Add per ring score of possible culprit for gpu hang. If
ring is busy and not waiting, it will get the highest score
across calls to i915_hangcheck_elapsed. This way we are
most likely to find the ring that caused the hang among
the waiting ones.

Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/i915_irq.c         |   65 +++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
 2 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b828807..4da8691 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -356,7 +356,6 @@ static void notify_ring(struct drm_device *dev,
 
 	wake_up_all(&ring->irq_queue);
 	if (i915_enable_hangcheck) {
-		dev_priv->gpu_error.hangcheck_count = 0;
 		mod_timer(&dev_priv->gpu_error.hangcheck_timer,
 			  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
 	}
@@ -1818,52 +1817,58 @@ void i915_hangcheck_elapsed(unsigned long data)
 	struct drm_device *dev = (struct drm_device *)data;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring;
-	bool err = false, idle;
 	int i;
-	u32 seqno[I915_NUM_RINGS];
-	bool work_done;
+	int busy_count = 0, rings_hung = 0;
 
 	if (!i915_enable_hangcheck)
 		return;
 
-	idle = true;
 	for_each_ring(ring, dev_priv, i) {
-		seqno[i] = ring->get_seqno(ring, false);
-		idle &= i915_hangcheck_ring_idle(ring, seqno[i], &err);
-	}
+		u32 seqno;
+		bool idle, err = false;
+
+		seqno = ring->get_seqno(ring, false);
+		idle = i915_hangcheck_ring_idle(ring, seqno, &err);
 
-	/* If all work is done then ACTHD clearly hasn't advanced. */
-	if (idle) {
-		if (err) {
-			if (i915_hangcheck_hung(dev))
-				return;
+		if (idle) {
+			if (err)
+				ring->hangcheck_score++;
+			else
+				ring->hangcheck_score = 0;
+		} else {
+			busy_count++;
 
-			goto repeat;
+			if (ring->hangcheck_seqno == seqno) {
+				ring->hangcheck_score++;
+
+				/* If the ring is not waiting, raise
+				   the score further */
+				if (i915_hangcheck_ring_hung(dev, ring))
+					ring->hangcheck_score++;
+			} else {
+				ring->hangcheck_score = 0;
+			}
 		}
 
-		dev_priv->gpu_error.hangcheck_count = 0;
-		return;
+		ring->hangcheck_seqno = seqno;
 	}
 
-	work_done = false;
 	for_each_ring(ring, dev_priv, i) {
-		if (ring->hangcheck_seqno != seqno[i]) {
-			work_done = true;
-			ring->hangcheck_seqno = seqno[i];
+		if (ring->hangcheck_score > 2) {
+			rings_hung++;
+			DRM_ERROR("%s seems hung\n", ring->name);
 		}
 	}
 
-	if (!work_done) {
-		if (i915_hangcheck_hung(dev))
-			return;
-	} else {
-		dev_priv->gpu_error.hangcheck_count = 0;
-	}
+	if (rings_hung)
+		return i915_handle_error(dev, true);
 
-repeat:
-	/* Reset timer case chip hangs without another request being added */
-	mod_timer(&dev_priv->gpu_error.hangcheck_timer,
-		  round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES));
+	if (busy_count)
+		/* Reset timer case chip hangs without another request
+		 * being added */
+		mod_timer(&dev_priv->gpu_error.hangcheck_timer,
+			  round_jiffies_up(jiffies +
+					   DRM_I915_HANGCHECK_JIFFIES));
 }
 
 /* drm_dma.h hooks
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9599c56..97b8f37 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -138,6 +138,7 @@ struct intel_ring_buffer {
 	struct drm_i915_gem_object *last_context_obj;
 
 	u32 hangcheck_seqno;
+	int hangcheck_score;
 
 	void *private;
 };
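
For readers who want to follow the scoring rule outside the driver, here is a minimal userspace sketch of the per-ring update in i915_hangcheck_elapsed. It is not driver code. The names model_ring, model_sample, model_update_score and model_ring_seems_hung are hypothetical, and the values the driver gets from ring->get_seqno(), i915_hangcheck_ring_idle() and i915_hangcheck_ring_hung() are assumed here to be plain inputs supplied by the caller. Only the two hangcheck fields and the "score > 2" threshold come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct intel_ring_buffer; only the two
 * hangcheck fields mirror the patch. */
struct model_ring {
	uint32_t hangcheck_seqno;
	int hangcheck_score;
};

/* One observation of a ring at a timer tick. In the driver these values
 * come from ring->get_seqno(), i915_hangcheck_ring_idle() and
 * i915_hangcheck_ring_hung(); in this model the caller supplies them. */
struct model_sample {
	uint32_t seqno;
	bool idle;             /* result of the idle check */
	bool err;              /* error flag set by the idle check */
	bool hung_not_waiting; /* assumed meaning of the ring_hung check */
};

/* The patch reports a ring as hung once its score exceeds 2. */
#define MODEL_HUNG_THRESHOLD 2

/* Apply the per-ring scoring rule from the patch to one sample. */
static void model_update_score(struct model_ring *ring,
			       const struct model_sample *s)
{
	assert(ring != NULL && s != NULL);

	if (s->idle) {
		/* Idle ring: only an error from the idle check counts. */
		if (s->err)
			ring->hangcheck_score++;
		else
			ring->hangcheck_score = 0;
	} else if (ring->hangcheck_seqno == s->seqno) {
		/* Busy and no progress since the previous tick. */
		ring->hangcheck_score++;
		/* Not waiting on anything: raise the score further. */
		if (s->hung_not_waiting)
			ring->hangcheck_score++;
	} else {
		/* Busy but the seqno advanced: progress, so reset. */
		ring->hangcheck_score = 0;
	}

	ring->hangcheck_seqno = s->seqno;
}

/* True once the accumulated score crosses the threshold. */
static bool model_ring_seems_hung(const struct model_ring *ring)
{
	return ring->hangcheck_score > MODEL_HUNG_THRESHOLD;
}
```

Under these assumptions a busy ring with a stuck seqno that is not waiting gains 2 per tick and crosses the threshold on the second tick. A ring that is stuck only because it is waiting gains 1 per tick and needs a third tick, so the likely culprit is reported first.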