From patchwork Tue Apr 12 16:59:34 2016
X-Patchwork-Submitter: arun.siluvery@linux.intel.com
X-Patchwork-Id: 8812931
From: Arun Siluvery
To: intel-gfx@lists.freedesktop.org
Cc: Tomas Elf
Date: Tue, 12 Apr 2016 17:59:34 +0100
Message-Id: <1460480381-8777-8-git-send-email-arun.siluvery@linux.intel.com>
In-Reply-To: <1460480381-8777-1-git-send-email-arun.siluvery@linux.intel.com>
References: <1460480381-8777-1-git-send-email-arun.siluvery@linux.intel.com>
Subject: [Intel-gfx] [PATCH 07/14] drm/i915/tdr: Restore engine state and start after reset

We capture the state of an engine before resetting it. Once the reset
succeeds, the engine is restored with the same state and restarted. The
state includes the head register and the active request.

We also nudge the head forward if it hasn't advanced; otherwise, when the
engine is restarted, HW executes the same instruction and may hang again.
Generally the head automatically advances to the next instruction as soon
as HW reads the current instruction, without waiting for it to complete.
However, an MBOX wait inserted directly into the VCS/BCS engines doesn't
behave this way: the head keeps pointing at the same instruction until it
completes.

If the head is modified, it is also updated in the context image so that
HW sees the up-to-date value.

A valid request is expected in the state at this point, otherwise we
wouldn't have got here. The context that submitted this request is
resubmitted to HW. The request that caused the hang will be at the start
of the execlist queue; it cannot be removed from the queue until we
resubmit and complete it.

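For illustration only, the head fix-up described above can be sketched as
a standalone helper. This is not the patch's code: the names
sketch_fixup_head, SKETCH_HEAD_ADDR and SKETCH_QWORD are hypothetical,
and the mask value is an assumption. One deliberate difference: when the
head has not moved, the sketch always steps a full QWORD, whereas
roundup(head_addr, 8), as far as I can tell, leaves an already aligned
address unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, for illustration: address bits of a ring HEAD register. */
#define SKETCH_HEAD_ADDR 0x001FFFFCu
#define SKETCH_QWORD     8u

/*
 * Hypothetical helper (not part of the patch): compute the head value to
 * restore after an engine reset.
 *
 * head:      head register value captured before the reset
 * last_head: head value recorded at the previous hang
 * tail_addr: tail offset taken from the context image
 * ring_size: ring buffer size in bytes
 */
static uint32_t sketch_fixup_head(uint32_t head, uint32_t last_head,
				  uint32_t tail_addr, uint32_t ring_size)
{
	uint32_t head_addr = head & SKETCH_HEAD_ADDR;

	/* The sketch assumes a non-empty, QWORD-sized ring. */
	assert(ring_size != 0 && ring_size % SKETCH_QWORD == 0);

	if (head == last_head) {
		/* No progress since the last hang: move to the next QWORD. */
		head_addr = (head_addr / SKETCH_QWORD + 1) * SKETCH_QWORD;
	} else if (head_addr % SKETCH_QWORD) {
		/* Progress was made: only round up to a QWORD boundary. */
		head_addr = (head_addr + SKETCH_QWORD - 1) /
			    SKETCH_QWORD * SKETCH_QWORD;
	}

	/* Same clamping order as the patch: tail first, then ring size. */
	if (head_addr > tail_addr)
		head_addr = tail_addr;
	else if (head_addr >= ring_size)
		head_addr = 0;

	/* Keep the non-address bits of the register, replace the offset. */
	return (head & ~SKETCH_HEAD_ADDR) | (head_addr & SKETCH_HEAD_ADDR);
}
```

Under these assumptions, a head stuck at offset 0x100 with the tail at
0x200 is moved to 0x108, while a head that would step past the tail is
clamped to the tail.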
Cc: Mika Kuoppala
Signed-off-by: Tomas Elf
Signed-off-by: Arun Siluvery
---
 drivers/gpu/drm/i915/intel_lrc.c        | 94 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  9 ++++
 2 files changed, 103 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5bfc93d..86d5e18 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -505,6 +505,30 @@ static void execlists_context_unqueue(struct intel_engine_cs *engine,
 	execlists_submit_requests(req0, req1, tdr_resubmission);
 }
 
+/**
+ * intel_execlists_resubmit()
+ * @engine: engine to do resubmission for
+ *
+ * In execlists mode, engine reset postprocess mainly includes resubmission of
+ * context after reset, for this we bypass the execlist queue. This is
+ * necessary since at the point of TDR hang recovery the hardware will be hung
+ * and resubmitting a fixed context (the context that the TDR has identified
+ * as hung and fixed up in order to move past the blocking batch buffer) to a
+ * hung execlist queue will lock up the TDR. Instead, opt for direct ELSP
+ * submission without depending on the rest of the driver.
+ */
+static void intel_execlists_resubmit(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (WARN_ON(list_empty(&engine->execlist_queue)))
+		return;
+
+	spin_lock_irqsave(&engine->execlist_lock, flags);
+	execlists_context_unqueue(engine, true);
+	spin_unlock_irqrestore(&engine->execlist_lock, flags);
+}
+
 static unsigned int
 execlists_check_remove_request(struct intel_engine_cs *engine, u32 request_id)
 {
@@ -1269,6 +1293,75 @@ static int gen8_engine_state_save(struct intel_engine_cs *engine,
 	return 0;
 }
 
+/**
+ * gen8_engine_start() - restore saved state and start engine
+ * @engine: engine to be started
+ * @state: state to be restored
+ *
+ * Returns:
+ * 0 if ok, otherwise propagates error codes.
+ */
+static int gen8_engine_start(struct intel_engine_cs *engine,
+			     struct intel_engine_cs_state *state)
+{
+	u32 head;
+	u32 head_addr, tail_addr;
+	u32 *reg_state;
+	struct intel_ringbuffer *ringbuf;
+	struct intel_context *ctx;
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	ctx = state->req->ctx;
+	ringbuf = ctx->engine[engine->id].ringbuf;
+	reg_state = ctx->engine[engine->id].lrc_reg_state;
+
+	head = state->head;
+	head_addr = head & HEAD_ADDR;
+
+	if (head == engine->hangcheck.last_head) {
+		/*
+		 * The engine has not advanced since the last time it hung,
+		 * force it to advance to the next QWORD. In most cases the
+		 * engine head pointer will automatically advance to the
+		 * next instruction as soon as it has read the current
+		 * instruction, without waiting for it to complete. This
+		 * seems to be the default behaviour, however an MBOX wait
+		 * inserted directly to the VCS/BCS engines does not behave
+		 * in the same way, instead the head pointer will still be
+		 * pointing at the MBOX instruction until it completes.
+		 */
+		head_addr = roundup(head_addr, 8);
+		engine->hangcheck.last_head = head;
+	} else if (head_addr & 0x7) {
+		/* Ensure head pointer is pointing to a QWORD boundary */
+		head_addr = ALIGN(head_addr, 8);
+	}
+
+	tail_addr = reg_state[CTX_RING_TAIL+1] & TAIL_ADDR;
+
+	if (head_addr > tail_addr)
+		head_addr = tail_addr;
+	else if (head_addr >= ringbuf->size)
+		head_addr = 0;
+
+	head &= ~HEAD_ADDR;
+	head |= (head_addr & HEAD_ADDR);
+
+	/* Restore head */
+	reg_state[CTX_RING_HEAD+1] = head;
+	I915_WRITE_HEAD(engine, head);
+
+	/* set head */
+	ringbuf->head = head;
+	ringbuf->last_retired_head = -1;
+	intel_ring_update_space(ringbuf);
+
+	if (state->req)
+		intel_execlists_resubmit(engine);
+
+	return 0;
+}
+
 static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
 	int ret, i;
@@ -2175,6 +2268,7 @@ logical_ring_default_vfuncs(struct drm_device *dev,
 
 	/* engine reset supporting functions */
 	engine->save = gen8_engine_state_save;
+	engine->start = gen8_engine_start;
 
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
 		engine->irq_seqno_barrier = bxt_a_seqno_barrier;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 62f9fb4..92e5d00 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -92,6 +92,13 @@ struct intel_ring_hangcheck {
 	enum intel_ring_hangcheck_action action;
 	int deadlock;
 	u32 instdone[I915_NUM_INSTDONE_REG];
+
+	/*
+	 * Last recorded ring head index.
+	 * This is only ever a ring index where as active
+	 * head may be a graphics address in a ring buffer
+	 */
+	u32 last_head;
 };
 
 struct intel_ringbuffer {
@@ -215,6 +222,8 @@ struct intel_engine_cs {
 	/* engine reset supporting functions */
 	int (*save)(struct intel_engine_cs *engine,
 		    struct intel_engine_cs_state *state);
+	int (*start)(struct intel_engine_cs *engine,
+		     struct intel_engine_cs_state *state);
 
 	/* GEN8 signal/wait table - never trust comments!
 	 * signal to signal to signal to signal to signal to