From patchwork Wed Feb 27 10:04:08 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 2192731
Return-Path:
X-Original-To: patchwork-dri-devel@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by patchwork1.kernel.org (Postfix) with ESMTP id 0D58F3FD4E
	for ; Wed, 27 Feb 2013 10:04:54 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id AE000E615F
	for ; Wed, 27 Feb 2013 02:04:53 -0800 (PST)
X-Original-To: dri-devel@lists.freedesktop.org
Delivered-To: dri-devel@lists.freedesktop.org
Received: from mga03.intel.com (mga03.intel.com [143.182.124.21])
	by gabe.freedesktop.org (Postfix) with ESMTP id E928EE5C3D
	for ; Wed, 27 Feb 2013 02:04:40 -0800 (PST)
Received: from azsmga002.ch.intel.com ([10.2.17.35])
	by azsmga101.ch.intel.com with ESMTP; 27 Feb 2013 02:04:20 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.84,746,1355126400";
	d="scan'208";a="206812243"
Received: from unknown (HELO cantiga.alporthouse.com) ([10.255.12.93])
	by AZSMGA002.ch.intel.com with SMTP; 27 Feb 2013 02:04:17 -0800
Received: by cantiga.alporthouse.com (sSMTP sendmail emulation);
	Wed, 27 Feb 2013 10:04:08 +0000
Date: Wed, 27 Feb 2013 10:04:08 +0000
From: Chris Wilson
To: Linus Torvalds
Subject: Re: [git pull] drm merge for 3.9-rc1
Message-ID: <20130227100408.GA1924@cantiga.alporthouse.com>
Mail-Followup-To: Chris Wilson , Linus Torvalds , Dave Airlie ,
	Daniel Vetter , Imre Deak , Linux Kernel Mailing List ,
	DRI mailing list
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: Daniel Vetter , Linux Kernel Mailing List , DRI mailing list
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: dri-devel-bounces+patchwork-dri-devel=patchwork.kernel.org@lists.freedesktop.org
Errors-To: dri-devel-bounces+patchwork-dri-devel=patchwork.kernel.org@lists.freedesktop.org

On Tue, Feb 26, 2013 at 05:39:46PM -0800, Linus Torvalds wrote:
> On Mon, Feb 25, 2013 at 4:05 PM, Dave Airlie wrote:
> >
> > Highlights:
> >
> > i915: all over the map, haswell power well enhancements, valleyview
> > macro horrors cleaned up, killing lots of legacy GTT code,
>
> Lowlight:
>
> There's something wrong with i915 DP detection or whatever. I get
> stuff like this:
>
>   [ 8.149931] [drm:intel_dp_aux_ch] *ERROR* dp_aux_ch not done status 0xa145003f
>
> and after that the screen ends up black.
>
> It's happened twice now, but is not 100% repeatable. It looks like the
> message itself is new, but the black screen is also new and does seem
> to happen when I get the message, so...

That message appears to be the canary. For whatever reason the DP
transfer is not functioning, likely the VDD is not powered up. However,
the failure to communicate there causes the modeset to abort, resulting
in the blank screen.

> The second time I touched the power button, and the machine came back.
> Apparently the suspend/resume cycle made it all magically work: the
> suspend caused the same errors, but then the resume made it all good
> again.

So it is reproducible during suspend. That should help narrow down the
sequence, thank you.

> Some kind of missed initialization at bootup? It's not reliable enough
> to bisect, but I obviously suspect commit 9ee32fea5fe8 ("drm/i915:
> irq-drive the dp aux communication") since that is where the message
> was added..
>
> Btw, looking at that commit, what do you think the semantics of the
> timeout in something like
>
>     done = wait_event_timeout(dev_priv->gmbus_wait_queue, C, 10);
>
> would be? What's that magic "10"? It's some totally random number.
The hardware is required to return a timed-out error message after 400
microseconds. The timeout here is to catch the dysfunctional driver, and
so was intended to be 10 milliseconds, cf
https://patchwork.kernel.org/patch/2160541/

As it happens, on your machine 10 jiffies is approximately 10
milliseconds, and so we should not be aborting before the hardware has
had a chance to signal failure.

One way to check whether it is a failure to set up the IRQ or a failure
to set up the DP comms would be:

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 7b8bfe8..f2486f1 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -356,9 +356,11 @@ intel_dp_aux_wait_done(struct intel_dp *intel_dp, bool has_aux_irq)
 		done = wait_event_timeout(dev_priv->gmbus_wait_queue, C, 10);
 	else
 		done = wait_for_atomic(C, 10) == 0;
-	if (!done)
-		DRM_ERROR("dp aux hw did not signal timeout (has irq: %i)!\n",
-			  has_aux_irq);
+	if (!done) {
+		status = I915_READ_NOTRACE(ch_ctl);
+		DRM_ERROR("dp aux hw did not signal timeout (has irq: %i), status=%08x!\n",
+			  has_aux_irq, status);
+	}
 #undef C
 
 	return status;
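
For reference, the jiffies arithmetic behind the "magic 10" can be
sketched in plain userspace C. wait_event_timeout() takes its timeout in
jiffies, so how long a bare 10 lasts depends on the kernel tick rate. The
helper names and the explicit hz parameter below are hypothetical and
only for illustration; HZ=1000 for the machine above is an assumption
drawn from the "10 jiffies is approximately 10 milliseconds" remark, not
something stated in the report. In-tree, msecs_to_jiffies() performs the
milliseconds-to-jiffies conversion.

```c
/*
 * Illustrative userspace sketch only, not kernel code.
 * Both helpers are hypothetical; `hz` stands in for the kernel's
 * compile-time HZ (timer ticks per second).
 */

/* Wall-clock milliseconds covered by a timeout of `j` jiffies. */
static unsigned long jiffies_to_ms(unsigned long j, unsigned long hz)
{
	return j * 1000UL / hz;
}

/* Smallest number of jiffies covering at least `ms` milliseconds (rounds up). */
static unsigned long ms_to_jiffies(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999UL) / 1000UL;
}
```

With these helpers, jiffies_to_ms(10, 1000) is 10 but
jiffies_to_ms(10, 100) is 100, so a literal 10 only means 10 ms when the
tick rate is 1000. Going the other way, ms_to_jiffies(10, 250) is 3 and
ms_to_jiffies(10, 100) is 1.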