From patchwork Tue Feb 9 00:19:37 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Anholt X-Patchwork-Id: 8256431 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 36A2DBEEE5 for ; Tue, 9 Feb 2016 00:19:46 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 39E2E20375 for ; Tue, 9 Feb 2016 00:19:45 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id C65FC2034A for ; Tue, 9 Feb 2016 00:19:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7F94A6E4CE; Mon, 8 Feb 2016 16:19:41 -0800 (PST) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from annarchy.freedesktop.org (annarchy.freedesktop.org [131.252.210.176]) by gabe.freedesktop.org (Postfix) with ESMTP id 98A446E0D0; Mon, 8 Feb 2016 16:19:40 -0800 (PST) Received: from eliezer.anholt.net (annarchy.freedesktop.org [127.0.0.1]) by annarchy.freedesktop.org (Postfix) with ESMTP id 899631816B; Mon, 8 Feb 2016 16:19:40 -0800 (PST) Received: by eliezer.anholt.net (Postfix, from userid 1000) id B1D11F1B2E4; Mon, 8 Feb 2016 16:19:39 -0800 (PST) From: Eric Anholt To: dri-devel@lists.freedesktop.org Subject: [PATCH 1/3] drm/vc4: Fix spurious GPU resets due to BO reuse. Date: Mon, 8 Feb 2016 16:19:37 -0800 Message-Id: <1454977179-16913-2-git-send-email-eric@anholt.net> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1454977179-16913-1-git-send-email-eric@anholt.net> References: <1454977179-16913-1-git-send-email-eric@anholt.net> Cc: linux-kernel@vger.kernel.org X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We were tracking the "where are the head pointers pointing" globally, so if another job reused the same BOs and execution was at the same point as last time we checked, we'd stop and trigger a reset even though the GPU had made progress. Signed-off-by: Eric Anholt --- drivers/gpu/drm/vc4/vc4_drv.h | 6 +++++- drivers/gpu/drm/vc4/vc4_gem.c | 19 ++++++++++++++----- 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h index a7fea77..b4a26fd 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.h +++ b/drivers/gpu/drm/vc4/vc4_drv.h @@ -92,7 +92,6 @@ struct vc4_dev { struct work_struct overflow_mem_work; struct { - uint32_t last_ct0ca, last_ct1ca; struct timer_list timer; struct work_struct reset_work; } hangcheck; @@ -202,6 +201,11 @@ struct vc4_exec_info { /* Sequence number for this bin/render job. */ uint64_t seqno; + /* Last current addresses the hardware was processing when the + * hangcheck timer checked on us. + */ + uint32_t last_ct0ca, last_ct1ca; + /* Kernel-space copy of the ioctl arguments */ struct drm_vc4_submit_cl *args; diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c index 48ce30a..1b7adf1 100644 --- a/drivers/gpu/drm/vc4/vc4_gem.c +++ b/drivers/gpu/drm/vc4/vc4_gem.c @@ -257,10 +257,17 @@ vc4_hangcheck_elapsed(unsigned long data) struct drm_device *dev = (struct drm_device *)data; struct vc4_dev *vc4 = to_vc4_dev(dev); uint32_t ct0ca, ct1ca; + unsigned long irqflags; + struct vc4_exec_info *exec; + + spin_lock_irqsave(&vc4->job_lock, irqflags); + exec = vc4_first_job(vc4); /* If idle, we can stop watching for hangs. */ - if (list_empty(&vc4->job_list)) + if (!exec) { + spin_unlock_irqrestore(&vc4->job_lock, irqflags); return; + } ct0ca = V3D_READ(V3D_CTNCA(0)); ct1ca = V3D_READ(V3D_CTNCA(1)); @@ -268,14 +275,16 @@ vc4_hangcheck_elapsed(unsigned long data) /* If we've made any progress in execution, rearm the timer * and wait. */ - if (ct0ca != vc4->hangcheck.last_ct0ca || - ct1ca != vc4->hangcheck.last_ct1ca) { - vc4->hangcheck.last_ct0ca = ct0ca; - vc4->hangcheck.last_ct1ca = ct1ca; + if (ct0ca != exec->last_ct0ca || ct1ca != exec->last_ct1ca) { + exec->last_ct0ca = ct0ca; + exec->last_ct1ca = ct1ca; + spin_unlock_irqrestore(&vc4->job_lock, irqflags); vc4_queue_hangcheck(dev); return; } + spin_unlock_irqrestore(&vc4->job_lock, irqflags); + /* We've gone too long with no progress, reset. This has to * be done from a work struct, since resetting can sleep and * this timer hook isn't allowed to.