From patchwork Mon Oct 30 18:56:15 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michel Thierry X-Patchwork-Id: 10033037 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id CFE5A6039A for ; Mon, 30 Oct 2017 18:56:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C07CF27EE2 for ; Mon, 30 Oct 2017 18:56:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B4ADE28332; Mon, 30 Oct 2017 18:56:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 08A112881D for ; Mon, 30 Oct 2017 18:56:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9BA9389D61; Mon, 30 Oct 2017 18:56:29 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id BC43489D6C for ; Mon, 30 Oct 2017 18:56:17 +0000 (UTC) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP; 30 Oct 2017 11:56:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos; i="5.44,320,1505804400"; d="scan'208"; a="1031295695" Received: from relo-linux-11.sc.intel.com ([10.3.160.161]) by orsmga003.jf.intel.com with ESMTP; 30 Oct 2017 11:56:16 -0700 From: Michel Thierry To: intel-gfx@lists.freedesktop.org Date: Mon, 30 Oct 2017 11:56:15 -0700 Message-Id: <20171030185616.32836-3-michel.thierry@intel.com> X-Mailer: git-send-email 2.14.2 In-Reply-To: <20171030185616.32836-1-michel.thierry@intel.com> References: <20171030185616.32836-1-michel.thierry@intel.com> Subject: [Intel-gfx] [PATCH 2/3] drm/i915/guc: Add support for reset engine using GuC commands X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP This patch adds per engine reset and recovery (TDR) support when GuC is used to submit workloads to GPU. In the case of i915 directly submission to ELSP, driver manages hang detection, recovery and resubmission. With GuC submission these tasks are shared between driver and GuC. i915 is still responsible for detecting a hang, and when it does it only requests GuC to reset that Engine. GuC internally manages acquiring forcewake and idling the engine before actually resetting it. Once the reset is successful, i915 takes over again and handles resubmission. The scheduler in i915 knows which requests are pending so after resetting a engine, pending workloads/requests are resubmitted again. v2: s/i915_guc_request_engine_reset/i915_guc_reset_engine/ to match the non-guc funtion names. v3: Removed debug message about engine restarting from which request, since the new baseline do it regardless of submission mode. (Chris) v4: Rebase. v5: Do not pass unnecessary reporting flags to the fw (Jeff); tasklet_schedule(&execlists->irq_tasklet) handles the resubmit; rebase. Signed-off-by: Michel Thierry Cc: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.c | 9 +++++++-- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/intel_guc.c | 24 ++++++++++++++++++++++++ drivers/gpu/drm/i915/intel_guc_fwif.h | 1 + drivers/gpu/drm/i915/intel_uncore.c | 5 ----- 5 files changed, 33 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index af745749509c..02fb35744f66 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1984,10 +1984,15 @@ int i915_reset_engine(struct intel_engine_cs *engine, unsigned int flags) goto out; } - ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine)); + if (!engine->i915->guc.execbuf_client) + ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine)); + else + ret = intel_guc_reset_engine(engine); + if (ret) { /* If we fail here, we expect to fallback to a global reset */ - DRM_DEBUG_DRIVER("Failed to reset %s, ret=%d\n", + DRM_DEBUG_DRIVER("%sFailed to reset %s, ret=%d\n", + (engine->i915->guc.execbuf_client ? "GUC " : ""), engine->name, ret); goto out; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0ed06ca07568..b77fd9720ec2 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3326,6 +3326,7 @@ extern int i915_reset_engine(struct intel_engine_cs *engine, extern bool intel_has_reset_engine(struct drm_i915_private *dev_priv); extern int intel_reset_guc(struct drm_i915_private *dev_priv); +extern int intel_guc_reset_engine(struct intel_engine_cs *engine); extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine); extern void intel_hangcheck_init(struct drm_i915_private *dev_priv); extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c index f74d50fdaeb0..f7b9b6b91499 100644 --- a/drivers/gpu/drm/i915/intel_guc.c +++ b/drivers/gpu/drm/i915/intel_guc.c @@ -24,6 +24,7 @@ #include "intel_guc.h" #include "i915_drv.h" +#include "i915_guc_submission.h" static void gen8_guc_raise_irq(struct intel_guc *guc) { @@ -283,6 +284,29 @@ int intel_guc_suspend(struct drm_i915_private *dev_priv) return intel_guc_send(guc, data, ARRAY_SIZE(data)); } +/** + * intel_guc_reset_engine() - ask GuC to reset an engine + * @engine: engine to be reset + */ +int intel_guc_reset_engine(struct intel_engine_cs *engine) +{ + struct drm_i915_private *dev_priv = engine->i915; + struct intel_guc *guc = &dev_priv->guc; + u32 data[7]; + + GEM_BUG_ON(!guc->execbuf_client); + + data[0] = INTEL_GUC_ACTION_REQUEST_ENGINE_RESET; + data[1] = engine->guc_id; + data[2] = 0; + data[3] = 0; + data[4] = 0; + data[5] = guc->execbuf_client->stage_id; + data[6] = guc_ggtt_offset(guc->shared_data); + + return intel_guc_send(guc, data, ARRAY_SIZE(data)); +} + /** * intel_guc_resume() - notify GuC resuming from suspend state * @dev_priv: i915 device private diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h index e24dbec2ead8..6a10aa6f04d3 100644 --- a/drivers/gpu/drm/i915/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h @@ -574,6 +574,7 @@ struct guc_shared_ctx_data { enum intel_guc_action { INTEL_GUC_ACTION_DEFAULT = 0x0, INTEL_GUC_ACTION_REQUEST_PREEMPTION = 0x2, + INTEL_GUC_ACTION_REQUEST_ENGINE_RESET = 0x3, INTEL_GUC_ACTION_SAMPLE_FORCEWAKE = 0x6, INTEL_GUC_ACTION_ALLOCATE_DOORBELL = 0x10, INTEL_GUC_ACTION_DEALLOCATE_DOORBELL = 0x20, diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index d629b2e40013..0564d87ad416 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1792,14 +1792,9 @@ bool intel_has_gpu_reset(struct drm_i915_private *dev_priv) return intel_get_gpu_reset(dev_priv) != NULL; } -/* - * When GuC submission is enabled, GuC manages ELSP and can initiate the - * engine reset too. For now, fall back to full GPU reset if it is enabled. - */ bool intel_has_reset_engine(struct drm_i915_private *dev_priv) { return (dev_priv->info.has_reset_engine && - !dev_priv->guc.execbuf_client && i915_modparams.reset >= 2); }