From patchwork Fri Dec 16 20:20:07 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michel Thierry X-Patchwork-Id: 9478491 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B7603601C2 for ; Fri, 16 Dec 2016 20:20:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A97B527D13 for ; Fri, 16 Dec 2016 20:20:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9E60728723; Fri, 16 Dec 2016 20:20:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 50DB427D13 for ; Fri, 16 Dec 2016 20:20:36 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 673ED6ED34; Fri, 16 Dec 2016 20:20:32 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id E68396ED32 for ; Fri, 16 Dec 2016 20:20:11 +0000 (UTC) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP; 16 Dec 2016 12:20:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos; i="5.33,359,1477983600"; d="scan'208"; a="1082813296" Received: from relo-linux-11.sc.intel.com ([10.3.160.214]) by fmsmga001.fm.intel.com with ESMTP; 16 Dec 2016 12:20:10 -0800 From: Michel Thierry To: intel-gfx@lists.freedesktop.org Date: Fri, 16 Dec 2016 12:20:07 -0800 Message-Id: <20161216202010.7983-7-michel.thierry@intel.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20161216202010.7983-1-michel.thierry@intel.com> References: <20161216202010.7983-1-michel.thierry@intel.com> Subject: [Intel-gfx] [PATCH 6/9] drm/i915/tdr: Add engine reset count to error state X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Arun Siluvery Driver maintains count of how many times a given engine is reset, useful to capture this in error state also. It gives an idea of how engine is coping up with the workloads it is executing before this error state. Cc: Chris Wilson Cc: Mika Kuoppala Signed-off-by: Arun Siluvery Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_drv.c | 1 + drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++ drivers/gpu/drm/i915/i915_gpu_error.c | 3 +++ 3 files changed, 13 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index a034793bc246..1c706a082d60 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1877,6 +1877,7 @@ int i915_reset_engine(struct intel_engine_cs *engine) intel_engine_reset_cancel(engine); intel_execlists_restart_submission(engine); + dev_priv->gpu_error.engine_reset_count[engine->id]++; intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); return 0; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a97cc8f50ade..25183762ed94 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -937,6 +937,7 @@ struct drm_i915_error_state { enum intel_engine_hangcheck_action hangcheck_action; struct i915_address_space *vm; int num_requests; + u32 reset_count; /* position of active request inside the ring */ u32 rq_head, rq_post, rq_tail; @@ -1629,6 +1630,8 @@ struct i915_gpu_error { #define I915_RESET_IN_PROGRESS 0 #define I915_WEDGED (BITS_PER_LONG - 1) + unsigned long engine_reset_count[I915_NUM_ENGINES]; + /** * Waitqueue to signal when a hang is detected. Used to for waiters * to release the struct_mutex for the reset to procede. @@ -3397,6 +3400,12 @@ static inline u32 i915_reset_count(struct i915_gpu_error *error) return READ_ONCE(error->reset_count); } +static inline u32 i915_engine_reset_count(struct i915_gpu_error *error, + struct intel_engine_cs *engine) +{ + return READ_ONCE(error->engine_reset_count[engine->id]); +} + void i915_gem_reset(struct drm_i915_private *dev_priv); void i915_gem_set_wedged(struct drm_i915_private *dev_priv); void i915_gem_reset_engine(struct intel_engine_cs *engine); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index e16037d1b0ba..f168ad873521 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -453,6 +453,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, err_printf(m, " hangcheck action timestamp: %lu, %u ms ago\n", ee->hangcheck_timestamp, jiffies_to_msecs(jiffies - ee->hangcheck_timestamp)); + err_printf(m, " engine reset count: %u\n", ee->reset_count); error_print_request(m, " ELSP[0]: ", &ee->execlist[0]); error_print_request(m, " ELSP[1]: ", &ee->execlist[1]); @@ -1170,6 +1171,8 @@ static void error_record_engine_registers(struct drm_i915_error_state *error, ee->hangcheck_timestamp = engine->hangcheck.action_timestamp; ee->hangcheck_action = engine->hangcheck.action; ee->hangcheck_stalled = engine->hangcheck.stalled; + ee->reset_count = i915_engine_reset_count(&dev_priv->gpu_error, + engine); if (USES_PPGTT(dev_priv)) { int i;