From patchwork Fri Mar 16 18:31:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: jeff.mcgee@intel.com X-Patchwork-Id: 10289845 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9592C602C2 for ; Fri, 16 Mar 2018 18:45:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 85A0E29073 for ; Fri, 16 Mar 2018 18:45:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7A20F2907A; Fri, 16 Mar 2018 18:45:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id F074F29073 for ; Fri, 16 Mar 2018 18:45:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8E1046EBF0; Fri, 16 Mar 2018 18:45:53 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id DDFFC6E04A for ; Fri, 16 Mar 2018 18:45:41 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Mar 2018 11:45:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.48,317,1517904000"; d="scan'208";a="25143771" Received: from jeffdesk.fm.intel.com ([10.1.27.184]) by fmsmga008.fm.intel.com with ESMTP; 16 Mar 2018 11:45:40 -0700 From: jeff.mcgee@intel.com To: intel-gfx@lists.freedesktop.org Date: Fri, 16 Mar 2018 11:31:04 -0700 Message-Id: <20180316183105.16027-8-jeff.mcgee@intel.com> X-Mailer: git-send-email 2.16.2 In-Reply-To: <20180316183105.16027-1-jeff.mcgee@intel.com> References: <20180316183105.16027-1-jeff.mcgee@intel.com> Subject: [Intel-gfx] [RFC 7/8] drm/i915: Allow reset without error capture X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ben@bwidawsk.net, kalyan.kondapally@intel.com MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Jeff McGee Pull the reset handling out of i915_handle_error() so that it can be called by that function and directly by the upcoming force preemption handler. This allows the force preemption handler to bypass the error capture that i915_handle_error() does before getting on with the reset. We do not want error capture for force preemption because it adds significant latency (~10 msecs measured on APL). This patch is required to support the force preemption feature. Change-Id: I41b4fae1adc197f0e70cec47cb960a0d7fa55f48 Signed-off-by: Jeff McGee --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_irq.c | 75 +++++++++++++++++++++++------------------ 2 files changed, 45 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d8524357373e..ade09f97be5c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3245,6 +3245,8 @@ __printf(3, 4) void i915_handle_error(struct drm_i915_private *dev_priv, u32 engine_mask, const char *fmt, ...); +void i915_handle_reset(struct drm_i915_private *dev_priv, + u32 engine_mask); extern void intel_irq_init(struct drm_i915_private *dev_priv); extern void intel_irq_fini(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index a34f459f8ac1..ab5d4d40083d 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2673,41 +2673,17 @@ static void i915_clear_error_registers(struct drm_i915_private *dev_priv) } /** - * i915_handle_error - handle a gpu error + * i915_handle_reset - handle a gpu reset * @dev_priv: i915 device private - * @engine_mask: mask representing engines that are hung - * @fmt: Error message format string + * @engine_mask: mask representing engines that require reset * - * Do some basic checking of register state at error time and - * dump it to the syslog. Also call i915_capture_error_state() to make - * sure we get a record and make it available in debugfs. Fire a uevent - * so userspace knows something bad happened (should trigger collection - * of a ring dump etc.). + * Executes reset on the given engines. */ -void i915_handle_error(struct drm_i915_private *dev_priv, - u32 engine_mask, - const char *fmt, ...) +void i915_handle_reset(struct drm_i915_private *dev_priv, + u32 engine_mask) { struct intel_engine_cs *engine; unsigned int tmp; - va_list args; - char error_msg[80]; - - va_start(args, fmt); - vscnprintf(error_msg, sizeof(error_msg), fmt, args); - va_end(args); - - /* - * In most cases it's guaranteed that we get here with an RPM - * reference held, for example because there is a pending GPU - * request that won't finish until the reset is done. This - * isn't the case at least when we get here by doing a - * simulated reset via debugfs, so get an RPM reference. - */ - intel_runtime_pm_get(dev_priv); - - i915_capture_error_state(dev_priv, engine_mask, error_msg); - i915_clear_error_registers(dev_priv); /* * Try engine reset when available. We fall back to full reset if @@ -2731,14 +2707,14 @@ void i915_handle_error(struct drm_i915_private *dev_priv, } if (!engine_mask) - goto out; + return; /* Full reset needs the mutex, stop any other user trying to do so. */ if (test_and_set_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags)) { wait_event(dev_priv->gpu_error.reset_queue, !test_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags)); - goto out; + return; } /* Prevent any other reset-engine attempt. */ @@ -2759,8 +2735,43 @@ void i915_handle_error(struct drm_i915_private *dev_priv, clear_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags); wake_up_all(&dev_priv->gpu_error.reset_queue); +} +/** + * i915_handle_error - handle a gpu error + * @dev_priv: i915 device private + * @engine_mask: mask representing engines that are hung + * @fmt: Error message format string + * + * Do some basic checking of register state at error time and + * dump it to the syslog. Also call i915_capture_error_state() to make + * sure we get a record and make it available in debugfs. Fire a uevent + * so userspace knows something bad happened (should trigger collection + * of a ring dump etc.). + */ +void i915_handle_error(struct drm_i915_private *dev_priv, + u32 engine_mask, + const char *fmt, ...) +{ + va_list args; + char error_msg[80]; + + va_start(args, fmt); + vscnprintf(error_msg, sizeof(error_msg), fmt, args); + va_end(args); + + /* + * In most cases it's guaranteed that we get here with an RPM + * reference held, for example because there is a pending GPU + * request that won't finish until the reset is done. This + * isn't the case at least when we get here by doing a + * simulated reset via debugfs, so get an RPM reference. + */ + intel_runtime_pm_get(dev_priv); + + i915_capture_error_state(dev_priv, engine_mask, error_msg); + i915_clear_error_registers(dev_priv); + i915_handle_reset(dev_priv, engine_mask); -out: intel_runtime_pm_put(dev_priv); }