From patchwork Wed Jul  4 20:18:39 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Vetter <daniel.vetter@ffwll.ch>
X-Patchwork-Id: 1157261
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Date: Wed,  4 Jul 2012 22:18:39 +0200
Message-Id: <1341433123-23055-2-git-send-email-daniel.vetter@ffwll.ch>
X-Mailer: git-send-email 1.7.10
In-Reply-To: <1341433123-23055-1-git-send-email-daniel.vetter@ffwll.ch>
References: <1341433123-23055-1-git-send-email-daniel.vetter@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: [Intel-gfx] [PATCH 1/5] drm/i915: don't trylock in the gpu reset
	code

Simply failing to reset the gpu because someone else might still hold
the mutex isn't a great idea - I can reliably reproduce silent reset
failures. And gpu reset simply needs to be reliable and Just Work.

"But ... the deadlocks!"

We already kick all processes waiting for the gpu before launching the
reset work item. New waiters need to check the wedging state anyway
and then bail out. If we have places that can deadlock, we simply need
to fix them.

"But ... testing!"

We have the gpu hangman, and if the current gpu load (gem_exec_nop)
isn't good enough to hit a specific case, we can add a new one.

"But ... don't we return -EAGAIN for non-interruptible calls to
wait_seqno now?"

Yep, but this problem already exists in the current code. A follow-up
patch will remedy this by returning -EIO for non-interruptible sleeps
if the gpu died and the low-level wait bails out with -EAGAIN.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
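
Not part of the patch: a rough userspace sketch of the behaviour change,
for illustration only. It assumes a pthread mutex as a stand-in for
dev->struct_mutex; struct fake_dev and the helpers fake_dev_init(),
reset_trylock(), reset_blocking() and wait_for_gpu() are hypothetical
names, not i915 code. The point is that the trylock variant silently
skips the reset whenever the lock is contended, while the blocking
variant waits and relies on waiters seeing the wedged state and bailing
out with -EAGAIN.

```c
/* Userspace analogy only; nothing here is kernel or i915 API. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct fake_dev {
	pthread_mutex_t struct_mutex;	/* stand-in for dev->struct_mutex */
	bool wedged;			/* assumed flag: the gpu is hung */
	int resets_done;		/* counts resets that actually ran */
};

void fake_dev_init(struct fake_dev *dev)
{
	pthread_mutex_init(&dev->struct_mutex, NULL);
	dev->wedged = false;
	dev->resets_done = 0;
}

/* Old behaviour: give up with -EBUSY if the lock is currently held. */
int reset_trylock(struct fake_dev *dev)
{
	if (pthread_mutex_trylock(&dev->struct_mutex) != 0)
		return -EBUSY;	/* the reset is skipped */

	dev->resets_done++;
	dev->wedged = false;
	pthread_mutex_unlock(&dev->struct_mutex);
	return 0;
}

/* New behaviour: block until the lock is free, then always reset. */
int reset_blocking(struct fake_dev *dev)
{
	int err = pthread_mutex_lock(&dev->struct_mutex);

	assert(err == 0);
	(void)err;		/* keeps NDEBUG builds warning-free */

	dev->resets_done++;
	dev->wedged = false;
	pthread_mutex_unlock(&dev->struct_mutex);
	return 0;
}

/*
 * A waiter checks the wedged state under the lock and bails out with
 * -EAGAIN instead of sleeping, so the blocking reset can make progress.
 */
int wait_for_gpu(struct fake_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->struct_mutex);
	if (dev->wedged)
		ret = -EAGAIN;
	pthread_mutex_unlock(&dev->struct_mutex);
	return ret;
}
```

With the lock held elsewhere, reset_trylock() returns -EBUSY and
resets_done stays at 0, which mirrors the silent failures described
above. reset_blocking() completes once the holder drops the lock. Build
with something like "cc -pthread" if you want to poke at it.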

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6edb2d5..e754cdf 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -730,8 +730,7 @@ int i915_reset(struct drm_device *dev)
 	if (!i915_try_reset)
 		return 0;
 
-	if (!mutex_trylock(&dev->struct_mutex))
-		return -EBUSY;
+	mutex_lock(&dev->struct_mutex);
 
 	i915_gem_reset(dev);