From patchwork Tue Apr 12 16:59:37 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: arun.siluvery@linux.intel.com
X-Patchwork-Id: 8812941
From: Arun Siluvery
To: intel-gfx@lists.freedesktop.org
Cc: Ian Lister, Tomas Elf
Date: Tue, 12 Apr 2016 17:59:37 +0100
Message-Id: <1460480381-8777-11-git-send-email-arun.siluvery@linux.intel.com>
In-Reply-To: <1460480381-8777-1-git-send-email-arun.siluvery@linux.intel.com>
References: <1460480381-8777-1-git-send-email-arun.siluvery@linux.intel.com>
X-Mailer: git-send-email 1.9.1
List-Id: Intel graphics driver community testing & development
Subject: [Intel-gfx] [PATCH 10/14] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress

i915_gem_check_wedge now returns a non-zero result in three different cases:

1. Legacy: A hang has been detected and full GPU reset is in progress.

2. Per-engine recovery:
	a. A single engine reference can be passed to the function, in which
	case only that engine will be checked. If that particular engine is
	detected to be hung and is to be reset this will yield a non-zero
	result but not if reset is in progress for any other engine.

	b. No engine reference is passed to the function, in which case all
	engines are checked for ongoing per-engine hang recovery.

Also, i915_wait_request was updated to take advantage of this new
functionality. This is important since the TDR hang recovery mechanism needs a
way to force waiting threads that hold the struct_mutex to give up the
struct_mutex and try again after the hang recovery has completed. If
i915_wait_request does not take per-engine hang recovery into account there is
no way for a waiting thread to know that a per-engine recovery is about to
happen and that it needs to back off.

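For illustration only, the three cases above can be modelled in user space
roughly as follows. This is a sketch, not the driver code: struct gpu_error,
struct engine, NUM_ENGINES and the plain bool flags are hypothetical stand-ins
for the real i915 state, and the final -EAGAIN path is simplified because the
real function performs further checks that this patch does not touch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ENGINES 5	/* assumption: stands in for I915_NUM_ENGINES */

/* Hypothetical stand-ins for the driver's structures. */
struct engine {
	int id;
};

struct gpu_error {
	bool full_reset_in_progress;	/* case 1: legacy full GPU reset */
	bool engine_reset[NUM_ENGINES];	/* case 2: per-engine recovery */
};

/* A NULL engine means "check every engine" (case 2b). */
static bool engine_reset_pending(const struct gpu_error *error,
				 const struct engine *engine)
{
	if (engine)
		return error->engine_reset[engine->id];	/* case 2a */

	for (int i = 0; i < NUM_ENGINES; i++) {
		if (error->engine_reset[i])
			return true;
	}

	return false;
}

/* Simplified model of the return value seen by callers. */
static int check_wedge(const struct gpu_error *error,
		       const struct engine *engine, bool interruptible)
{
	if (error->full_reset_in_progress ||
	    engine_reset_pending(error, engine)) {
		/* Non-interruptible callers cannot handle -EAGAIN. */
		if (!interruptible)
			return -EIO;

		/* Simplification: the real function checks more here. */
		return -EAGAIN;
	}

	return 0;
}
```

In this model a caller that passes its own engine only backs off when that
engine is being recovered, while a waiter that passes NULL backs off for a
recovery on any engine, which is the behaviour i915_wait_request relies on.
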
Signed-off-by: Tomas Elf
Signed-off-by: Ian Lister
Cc: Chris Wilson
Cc: Mika Kuoppala
Signed-off-by: Arun Siluvery
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem.c         | 39 +++++++++++++++++++++++++++------
 drivers/gpu/drm/i915/intel_lrc.c        |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
 4 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eda531f..682bf207 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3052,6 +3052,7 @@ i915_gem_find_active_request(struct intel_engine_cs *engine);
 bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *engine);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
+				      struct intel_engine_cs *engine,
 				      bool interruptible);
 
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4c62583..5ca8bd5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -80,12 +80,29 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
 	spin_unlock(&dev_priv->mm.object_stat_lock);
 }
 
+static bool i915_engine_reset_pending(struct i915_gpu_error *error,
+				      struct intel_engine_cs *engine)
+{
+	int i;
+
+	if (engine)
+		return i915_engine_reset_in_progress(error, engine->id);
+
+	for (i = 0; i < I915_NUM_ENGINES; ++i) {
+		if (i915_engine_reset_in_progress(error, i))
+			return true;
+	}
+
+	return false;
+}
+
 static int
 i915_gem_wait_for_error(struct i915_gpu_error *error)
 {
 	int ret;
 
 #define EXIT_COND (!i915_reset_in_progress(error) || \
+		   !i915_engine_reset_pending(error, NULL) || \
 		   i915_terminally_wedged(error))
 	if (EXIT_COND)
 		return 0;
@@ -1112,9 +1129,11 @@ put_rpm:
 
 int
 i915_gem_check_wedge(struct i915_gpu_error *error,
+		     struct intel_engine_cs *engine,
 		     bool interruptible)
 {
-	if (i915_reset_in_progress(error)) {
+	if (i915_reset_in_progress(error) ||
+	    i915_engine_reset_pending(error, engine)) {
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
@@ -1296,16 +1315,22 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 	for (;;) {
 		struct timer_list timer;
+		int reset_in_progress;
 
 		prepare_to_wait(&engine->irq_queue, &wait, state);
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
+		reset_in_progress = i915_gem_check_wedge(&dev_priv->gpu_error,
+							 NULL,
+							 interruptible);
+		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter) ||
+		    reset_in_progress) {
 			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
 			 * is truely gone. */
-			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-			if (ret == 0)
+			if (reset_in_progress)
+				ret = reset_in_progress;
+			else
 				ret = -EAGAIN;
 			break;
 		}
@@ -1471,7 +1496,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
+	ret = i915_gem_check_wedge(&dev_priv->gpu_error, NULL, interruptible);
 	if (ret)
 		return ret;
 
@@ -1561,7 +1586,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (!obj->active)
 		return 0;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
+	ret = i915_gem_check_wedge(&dev_priv->gpu_error, NULL, true);
 	if (ret)
 		return ret;
 
@@ -4138,7 +4163,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (ret)
 		return ret;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
+	ret = i915_gem_check_wedge(&dev_priv->gpu_error, NULL, false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 86d5e18..72e5b70 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -941,6 +941,7 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 	dev_priv = req->i915;
 
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+				   req->engine,
 				   dev_priv->mm.interruptible);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e144f4f..3b087f0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2518,6 +2518,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req,
 	dev_priv = req->i915;
 
 	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+				   engine,
 				   dev_priv->mm.interruptible);
 	if (ret)
 		return ret;