From patchwork Fri Mar 16 18:31:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: jeff.mcgee@intel.com
X-Patchwork-Id: 10289843
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	68CF3602C2 for <patchwork-intel-gfx@patchwork.kernel.org>;
	Fri, 16 Mar 2018 18:45:54 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5788329071
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Fri, 16 Mar 2018 18:45:54 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 4A58A29078; Fri, 16 Mar 2018 18:45:54 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED
	autolearn=ham version=3.3.1
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D8CC529071
	for <patchwork-intel-gfx@patchwork.kernel.org>;
	Fri, 16 Mar 2018 18:45:53 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id BF09C6EBE9;
	Fri, 16 Mar 2018 18:45:52 +0000 (UTC)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
	by gabe.freedesktop.org (Postfix) with ESMTPS id A59A86E0DB
	for <intel-gfx@lists.freedesktop.org>;
	Fri, 16 Mar 2018 18:45:41 +0000 (UTC)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga008.fm.intel.com ([10.253.24.58])
	by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
	16 Mar 2018 11:45:40 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.48,317,1517904000"; d="scan'208";a="25143764"
Received: from jeffdesk.fm.intel.com ([10.1.27.184])
	by fmsmga008.fm.intel.com with ESMTP; 16 Mar 2018 11:45:40 -0700
From: jeff.mcgee@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Fri, 16 Mar 2018 11:31:02 -0700
Message-Id: <20180316183105.16027-6-jeff.mcgee@intel.com>
X-Mailer: git-send-email 2.16.2
In-Reply-To: <20180316183105.16027-1-jeff.mcgee@intel.com>
References: <20180316183105.16027-1-jeff.mcgee@intel.com>
Subject: [Intel-gfx] [RFC 5/8] drm/i915: Consider preemption when finding
	the active request
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Intel graphics driver community testing & development
	<intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Cc: ben@bwidawsk.net, kalyan.kondapally@intel.com
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
X-Virus-Scanned: ClamAV using ClamSMTP

From: Jeff McGee <jeff.mcgee@intel.com>

The active request is found by scanning the engine timeline for the
request that follows the last completed request. That method is accurate
if there is no preemption in progress, because the engine will certainly
have started that request. If a preemption is flagged as in progress, the
preemption could already have completed, leaving the engine idle. In this
case the request identified by the above method is not active and may not
even have been started. We must check for this condition to avoid blaming
an innocent request during reset.

The timeline scan is kept as i915_gem_find_pending_request(), and
i915_gem_find_active_request() now wraps it with the preemption checks.

This patch is required to support the force preemption feature.

Test: Run IGT gem_exec_fpreempt repeatedly.
Change-Id: I63a9f64446e24d4ee36b4af32854699bda006ddd
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
---
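Note for reviewers (not part of the commit): the decision order described
above can be sketched as a small standalone C model. This is only an
illustration under stated assumptions, not driver code. struct
model_engine, its fields, and model_find_active() are hypothetical names,
and the bounded wait is collapsed into a single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the engine state; not the i915 structures. */
struct model_engine {
	bool preempt_in_progress; /* models EXECLISTS_ACTIVE_PREEMPT being set */
	bool preempt_finished;    /* models the first preempt-finished check */
	bool finished_after_wait; /* models that check passing within the wait */
	int pending;              /* id of first incomplete request, 0 if none */
};

/* Returns the id of the request to blame, or 0 if the engine is idle. */
static int model_find_active(const struct model_engine *e)
{
	assert(e != NULL);

	/* No preemption in progress: the pending request is the active one. */
	if (!e->preempt_in_progress)
		return e->pending;

	/* Preemption already finished: engine is idle. */
	if (e->preempt_finished)
		return 0;

	/* Nothing pending on the timeline: engine is idle. */
	if (e->pending == 0)
		return 0;

	/* Preemption completed during the wait: pending was only next in line. */
	if (e->finished_after_wait)
		return 0;

	/* Preemption never completed: pending really is blocking it. */
	return e->pending;
}
```

Only the last branch blames a request while a preemption is flagged, which
is the case the reset path cares about.
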
 drivers/gpu/drm/i915/i915_gem.c | 49 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e2961e3913b8..9780d9026ce6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2739,9 +2739,9 @@ static void i915_gem_context_mark_innocent(struct i915_gem_context *ctx)
 }
 
 struct drm_i915_gem_request *
-i915_gem_find_active_request(struct intel_engine_cs *engine)
+i915_gem_find_pending_request(struct intel_engine_cs *engine)
 {
-	struct drm_i915_gem_request *request, *active = NULL;
+	struct drm_i915_gem_request *request, *pending = NULL;
 	unsigned long flags;
 
 	/* We are called by the error capture and reset at a random
@@ -2762,12 +2762,53 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
 		GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
 				    &request->fence.flags));
 
-		active = request;
+		pending = request;
 		break;
 	}
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
-	return active;
+	return pending;
+}
+
+struct drm_i915_gem_request *
+i915_gem_find_active_request(struct intel_engine_cs *engine)
+{
+	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct drm_i915_gem_request *request;
+
+	/* The pending request is active if no preemption is in progress */
+	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
+		return i915_gem_find_pending_request(engine);
+
+	/* Preemption has finished. Engine idle. */
+	if (intel_engine_preempt_finished(engine))
+		return NULL;
+
+	request = i915_gem_find_pending_request(engine);
+
+	/* Preemption has flushed all requests off the engine. Engine idle. */
+	if (!request)
+		return NULL;
+
+	/* The pending request is likely active and blocking the in-progress
+	 * preemption, but the checks above are racy. The request that was
+	 * actually blocking preemption could have completed (a batch-boundary
+	 * preemption), leaving the engine idle, in which case the pending
+	 * request we identified was merely next in line. To rule this out we
+	 * must wait at least as long as the preempt-to-idle context would
+	 * take to mark the preemption done. We use 500 usecs to cover the
+	 * worst-case delay between the seqno write of the completing request
+	 * and the preempt-finished write.
+	 */
+	if (!_wait_for_exact(intel_engine_preempt_finished(engine), 500, 10, 50))
+		return NULL;
+
+	/*
+	 * We didn't see the preemption done after a sufficient wait. Thus the
+	 * pending request we sampled above was in fact active and blocking
+	 * the preemption.
+	 */
+	return request;
 }
 
 static bool engine_stalled(struct intel_engine_cs *engine)