From patchwork Thu Jan 12 02:53:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 13097402 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2A1B2C54EBC for ; Thu, 12 Jan 2023 02:53:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BC37D10E2E0; Thu, 12 Jan 2023 02:53:44 +0000 (UTC) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id D894410E2E0; Thu, 12 Jan 2023 02:53:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673492022; x=1705028022; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hQ/SlHjYCouocBNwLhBRu0pDc2NyWXxfNdlQu7NmCNA=; b=LQcZQPECaC/6AoSAcGEos1xEpABtY7CnP9+T4XeEblcxa64XFWXVio8x AiVd6TjFngya3j3RPJSORHnIAQATn0dENcxWV/qHMbBbg4v2pI91lYAM9 Ky31A33UtltmelQM86dMT3B4YEoPu+tEvHFpPt6LlY/6oi1gfpHUdcnNh MOIsf6SwzpWVXziIcwiDbYdiwHzh0VZt0v44xoeuFXeMm7Fc0QWBiUZc7 K94PC/lle8xAg7uEV9keJ411yEwrxbGr4XWsVg90o88SYE940vnAIoVH8 0/4/It8/D0IiV7i2/YAPkLxCGFGxbEW8BeHeYKh4CLfJjlbjhIN/hiP/a g==; X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="350823499" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="350823499" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2023 18:53:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="986359626" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="986359626" Received: from relo-linux-5.jf.intel.com ([10.165.21.152]) by fmsmga005.fm.intel.com with ESMTP; 11 Jan 2023 18:53:42 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Subject: [PATCH 1/4] drm/i915: Allow error capture without a request Date: Wed, 11 Jan 2023 18:53:08 -0800 Message-Id: <20230112025311.2577084-2-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230112025311.2577084-1-John.C.Harrison@Intel.com> References: <20230112025311.2577084-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Umesh Nerlige Ramappa , John Harrison , DRI-Devel@Lists.FreeDesktop.Org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: John Harrison There was a report of error captures occurring without any hung context being indicated despite the capture being initiated by a 'hung context notification' from GuC. The problem was not reproducible. However, it is possible to happen if the context in question has no active requests. For example, if the hang was in the context switch itself then the breadcrumb write would have occurred and the KMD would see an idle context. In the interests of attempting to provide as much information as possible about a hang, it seems wise to include the engine info regardless of whether a request was found or not. As opposed to just prentending there was no hang at all. So update the error capture code to always record engine information if an engine is given. Which means updating record_context() to take a context instead of a request (which it only ever used to find the context anyway). And split the request agnostic parts of intel_engine_coredump_add_request() out into a seaprate function. v2: Remove a duplicate 'if' statement (Umesh) and fix a put of a null pointer. Signed-off-by: John Harrison Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/i915/i915_gpu_error.c | 61 +++++++++++++++++++-------- 1 file changed, 43 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 9d5d5a397b64e..bd2cf7d235df0 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1370,14 +1370,14 @@ static void engine_record_execlists(struct intel_engine_coredump *ee) } static bool record_context(struct i915_gem_context_coredump *e, - const struct i915_request *rq) + struct intel_context *ce) { struct i915_gem_context *ctx; struct task_struct *task; bool simulated; rcu_read_lock(); - ctx = rcu_dereference(rq->context->gem_context); + ctx = rcu_dereference(ce->gem_context); if (ctx && !kref_get_unless_zero(&ctx->ref)) ctx = NULL; rcu_read_unlock(); @@ -1396,8 +1396,8 @@ static bool record_context(struct i915_gem_context_coredump *e, e->guilty = atomic_read(&ctx->guilty_count); e->active = atomic_read(&ctx->active_count); - e->total_runtime = intel_context_get_total_runtime_ns(rq->context); - e->avg_runtime = intel_context_get_avg_runtime_ns(rq->context); + e->total_runtime = intel_context_get_total_runtime_ns(ce); + e->avg_runtime = intel_context_get_avg_runtime_ns(ce); simulated = i915_gem_context_no_error_capture(ctx); @@ -1532,15 +1532,37 @@ intel_engine_coredump_alloc(struct intel_engine_cs *engine, gfp_t gfp, u32 dump_ return ee; } +static struct intel_engine_capture_vma * +engine_coredump_add_context(struct intel_engine_coredump *ee, + struct intel_context *ce, + gfp_t gfp) +{ + struct intel_engine_capture_vma *vma = NULL; + + ee->simulated |= record_context(&ee->context, ce); + if (ee->simulated) + return NULL; + + /* + * We need to copy these to an anonymous buffer + * as the simplest method to avoid being overwritten + * by userspace. + */ + vma = capture_vma(vma, ce->ring->vma, "ring", gfp); + vma = capture_vma(vma, ce->state, "HW context", gfp); + + return vma; +} + struct intel_engine_capture_vma * intel_engine_coredump_add_request(struct intel_engine_coredump *ee, struct i915_request *rq, gfp_t gfp) { - struct intel_engine_capture_vma *vma = NULL; + struct intel_engine_capture_vma *vma; - ee->simulated |= record_context(&ee->context, rq); - if (ee->simulated) + vma = engine_coredump_add_context(ee, rq->context, gfp); + if (!vma) return NULL; /* @@ -1550,8 +1572,6 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee, */ vma = capture_vma_snapshot(vma, rq->batch_res, gfp, "batch"); vma = capture_user(vma, rq, gfp); - vma = capture_vma(vma, rq->ring->vma, "ring", gfp); - vma = capture_vma(vma, rq->context->state, "HW context", gfp); ee->rq_head = rq->head; ee->rq_post = rq->postfix; @@ -1608,8 +1628,11 @@ capture_engine(struct intel_engine_cs *engine, if (ce) { intel_engine_clear_hung_context(engine); rq = intel_context_find_active_request(ce); - if (!rq || !i915_request_started(rq)) - goto no_request_capture; + if (rq && !i915_request_started(rq)) { + drm_info(&engine->gt->i915->drm, "Got hung context on %s with no active request!\n", + engine->name); + rq = NULL; + } } else { /* * Getting here with GuC enabled means it is a forced error capture @@ -1622,22 +1645,24 @@ capture_engine(struct intel_engine_cs *engine, flags); } } - if (rq) + if (rq) { rq = i915_request_get_rcu(rq); + capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL); + } else if (ce) { + capture = engine_coredump_add_context(ee, ce, ATOMIC_MAYFAIL); + } - if (!rq) - goto no_request_capture; - - capture = intel_engine_coredump_add_request(ee, rq, ATOMIC_MAYFAIL); if (!capture) { - i915_request_put(rq); + if (rq) + i915_request_put(rq); goto no_request_capture; } if (dump_flags & CORE_DUMP_FLAG_IS_GUC_CAPTURE) intel_guc_capture_get_matching_node(engine->gt, ee, ce); intel_engine_coredump_add_vma(ee, capture, compress); - i915_request_put(rq); + if (rq) + i915_request_put(rq); return ee; From patchwork Thu Jan 12 02:53:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 13097403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 270EDC54EBC for ; Thu, 12 Jan 2023 02:53:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5982210E854; Thu, 12 Jan 2023 02:53:49 +0000 (UTC) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0497110E2D8; Thu, 12 Jan 2023 02:53:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673492023; x=1705028023; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XnRmxT4Yo4EB6vMENnHBwfcxOtPanwyqtRWI9HJBVS4=; b=DrZfkvhvyC8bAb5Jj4jDDeuZ0ukzHPCqNp2/T1P6VyD4mPSZ63v9IOrw P8VpB5nfKpm58dP41XmtZJGHdvo48V5IyGfyCBKS43R9W8OyCLagGxxv7 1cC3rERs2W2XyK031N14XUT8883BTestfuQH3ArMVn4IID8+4e3f4BIGR tTkvVwv/mlywelfP8GmMJRotQPzBRaX2oLt7C1YBP6vNEGrpzTsv9gSHI 5BPtbrWwFwT7UT4t6a0/ME5r9nkj0jfZKfP8B9YR3wY0IE6DKju/0TMSY lJpyraGP61z9tFHCWWDAjv4rIbXdg7YZamdLO3uTT+D85KmzL43BTCCLQ g==; X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="350823500" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="350823500" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2023 18:53:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="986359629" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="986359629" Received: from relo-linux-5.jf.intel.com ([10.165.21.152]) by fmsmga005.fm.intel.com with ESMTP; 11 Jan 2023 18:53:42 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Subject: [PATCH 2/4] drm/i915: Allow error capture of a pending request Date: Wed, 11 Jan 2023 18:53:09 -0800 Message-Id: <20230112025311.2577084-3-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230112025311.2577084-1-John.C.Harrison@Intel.com> References: <20230112025311.2577084-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: John Harrison , DRI-Devel@Lists.FreeDesktop.Org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: John Harrison A hang situation has been observed where the only requests on the context were either completed or not yet started according to the breaadcrumbs. However, the register state claimed a batch was (maybe) in progress. So, allow capture of the pending request on the grounds that this might be better than nothing. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gpu_error.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index bd2cf7d235df0..2e338a9667a4b 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1628,11 +1628,9 @@ capture_engine(struct intel_engine_cs *engine, if (ce) { intel_engine_clear_hung_context(engine); rq = intel_context_find_active_request(ce); - if (rq && !i915_request_started(rq)) { - drm_info(&engine->gt->i915->drm, "Got hung context on %s with no active request!\n", - engine->name); - rq = NULL; - } + if (rq && !i915_request_started(rq)) + drm_info(&engine->gt->i915->drm, "Confused - active request not yet started: %lld:%lld, ce = 0x%04X/%s!\n", + rq->fence.context, rq->fence.seqno, ce->guc_id.id, engine->name); } else { /* * Getting here with GuC enabled means it is a forced error capture From patchwork Thu Jan 12 02:53:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 13097404 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2DAD6C5479D for ; Thu, 12 Jan 2023 02:53:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2D4AD10E851; Thu, 12 Jan 2023 02:53:52 +0000 (UTC) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 247A710E2E0; Thu, 12 Jan 2023 02:53:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673492023; x=1705028023; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HUiwugwT28htyRQaLp8x+1m39Kl4vFD7KWtQkl+/+ac=; b=ki7LEnk7RBMsOnz4knhNNwsRvXpireLNDtjUBIuYGAO7vJZk8fsrCxb3 uenJS/TNDT6mymYV/FFPbuyWnYyTCyRG6BNwZwSCItYPPGm0GdyjUxEHa 0wJNr/hWIrHfNgBlQGpbMxXi+eYkmQiP2CN1wXFSZoq/FBd5bJHFJHo0T aOyx7OonKiZ1TeU8//vO3czUK2NQpdiLvNl4Bj27bGXdDiZfuYI7j6fl0 SAaVV8T4jB5NKSRNG21aDr0Et3h4Z/K2VsiYWEFKD68vsSYCDuS0ItQ46 PG3RGTKILwzWJbEwiFpglOo7RACNLtwaFoo+NLYfwTt0dx00l2g/LIxsz g==; X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="350823501" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="350823501" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2023 18:53:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="986359632" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="986359632" Received: from relo-linux-5.jf.intel.com ([10.165.21.152]) by fmsmga005.fm.intel.com with ESMTP; 11 Jan 2023 18:53:42 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Subject: [PATCH 3/4] drm/i915/guc: Look for a guilty context when an engine reset fails Date: Wed, 11 Jan 2023 18:53:10 -0800 Message-Id: <20230112025311.2577084-4-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230112025311.2577084-1-John.C.Harrison@Intel.com> References: <20230112025311.2577084-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: John Harrison , DRI-Devel@Lists.FreeDesktop.Org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: John Harrison Engine resets are supposed to never fail. But in the case when one does (due to unknown reasons that normally come down to a missing w/a), it is useful to get as much information out of the system as possible. Given that the GuC effectively dies on such a situation, it is not possible to get a guilty context notification back. So do a manual search instead. Given that GuC is dead, this is safe because GuC won't be changing the engine state asynchronously. Signed-off-by: John Harrison --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index b436dd7f12e42..99d09e3394597 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -4754,11 +4754,24 @@ static void reset_fail_worker_func(struct work_struct *w) guc->submission_state.reset_fail_mask = 0; spin_unlock_irqrestore(&guc->submission_state.lock, flags); - if (likely(reset_fail_mask)) + if (likely(reset_fail_mask)) { + struct intel_engine_cs *engine; + enum intel_engine_id id; + + /* + * GuC is toast at this point - it dead loops after sending the failed + * reset notification. So need to manually determine the guilty context. + * Note that it should be safe/reliable to do this here because the GuC + * is toast and will not be scheduling behind the KMD's back. + */ + for_each_engine_masked(engine, gt, reset_fail_mask, id) + intel_guc_find_hung_context(engine); + intel_gt_handle_error(gt, reset_fail_mask, I915_ERROR_CAPTURE, - "GuC failed to reset engine mask=0x%x\n", + "GuC failed to reset engine mask=0x%x", reset_fail_mask); + } } int intel_guc_engine_failure_process_msg(struct intel_guc *guc, From patchwork Thu Jan 12 02:53:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 13097405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0A189C46467 for ; Thu, 12 Jan 2023 02:53:58 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F183A10E855; Thu, 12 Jan 2023 02:53:52 +0000 (UTC) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4D9C610E2D8; Thu, 12 Jan 2023 02:53:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673492023; x=1705028023; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oUM8+s9CaIlKBqtwPwCkB3+uqFgPbmwEogc4Rv9Zk/4=; b=Z+G1UhB5ARPb9CbUikkWiONb5RsfUd5q1BVs3H5CDUc7Ot9B3m2QHf3S dbuRJMMeX0Jb4WM0C7ybhwX+S8JYOXGjekfxM8SaqvrpuDx4I+WU9dnlu NHL+wayaPWQRDl8PC0dLcWilaSnsdju8PROjXCcutQvcve+9un41ETjgO Q8zQVkdEP4C4bvkU7J/c1okyoqjjUz6NKqTIWtKtGZlneJGYVNHmCqbt7 FMsgo+Igb1mX8X9ILCIosMZzhcaADmM0JE2J0BhLr9Tvajr02LLMM6AtR JWvtPWmzy/+vTpLMlrbMJqpNB2C8/SARw86ElC5lVis1c3CU3m90/rG3m w==; X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="350823503" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="350823503" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2023 18:53:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="986359635" X-IronPort-AV: E=Sophos;i="5.96,318,1665471600"; d="scan'208";a="986359635" Received: from relo-linux-5.jf.intel.com ([10.165.21.152]) by fmsmga005.fm.intel.com with ESMTP; 11 Jan 2023 18:53:42 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Subject: [PATCH 4/4] drm/i915/guc: Add a debug print on GuC triggered reset Date: Wed, 11 Jan 2023 18:53:11 -0800 Message-Id: <20230112025311.2577084-5-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230112025311.2577084-1-John.C.Harrison@Intel.com> References: <20230112025311.2577084-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: John Harrison , DRI-Devel@Lists.FreeDesktop.Org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: John Harrison For understanding bug reports, it can be useful to have an explicit dmesg print when a reset notification is received from GuC. As opposed to simply inferring that this happened from other messages. Signed-off-by: John Harrison Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 99d09e3394597..0be7c27a436dd 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -4665,6 +4665,10 @@ static void guc_handle_context_reset(struct intel_guc *guc, { trace_intel_context_reset(ce); + drm_dbg(&guc_to_gt(guc)->i915->drm, "Got GuC reset of 0x%04X, exiting = %d, banned = %d\n", + ce->guc_id.id, test_bit(CONTEXT_EXITING, &ce->flags), + test_bit(CONTEXT_BANNED, &ce->flags)); + if (likely(intel_context_is_schedulable(ce))) { capture_error_state(guc, ce); guc_context_replay(ce);