From patchwork Mon Oct 28 11:48:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Brezillon X-Patchwork-Id: 13853334 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1821BD13570 for ; Mon, 28 Oct 2024 11:48:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1B52810E47C; Mon, 28 Oct 2024 11:48:25 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=collabora.com header.i=@collabora.com header.b="e/xy1/e8"; dkim-atps=neutral Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3F45E10E36B for ; Mon, 28 Oct 2024 11:48:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1730116099; bh=Uc2eg3ONkKsKm5ztPmk7ozIpoVZvSZDjmB326oIDvCg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=e/xy1/e8IBkf9gJ2gNUGFVkbjGsQylZQMl+xkdwbU/CLeBUEAd6j0qNbwUw92xKRD F37f5+i/VZSCCAigBWVN8DbQrpqi1Y9i7XBFIN3V7sM5Qxdmii2pHwoRmEtNseDKd+ 8TEOkYcXMDQZfdZkEvXmVg0t8w84QL1FxBKAlPRZCLj6gZUPZSXMTs6xpS1JZBCKWH DyRYrZT4MQNrcrMbZHxawfkDwRFCBy2wbQXZXSA45NqThR0KwrbEVDw3Nmb5k///jn ptuii4T4WKG4qZeX8FOIzBj40Yshmdd2kHvhy1650Qom+fpEPdFuUPqISpcirtwkxB Y0vWcvfWoIl9A== Received: from localhost.localdomain (unknown [IPv6:2a01:e0a:2c:6930:5cf4:84a1:2763:fe0d]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id AB30817E35D3; Mon, 28 Oct 2024 12:48:18 +0100 (CET) From: Boris Brezillon To: Boris Brezillon , Steven Price , Liviu Dudau , =?utf-8?q?Adri=C3=A1n_Larumbe?= Cc: Christopher Healy , dri-devel@lists.freedesktop.org, kernel@collabora.com Subject: [PATCH v2 3/3] drm/panthor: Report innocent group kill Date: Mon, 28 Oct 2024 12:48:15 +0100 Message-ID: <20241028114815.3793855-4-boris.brezillon@collabora.com> X-Mailer: git-send-email 2.46.2 In-Reply-To: <20241028114815.3793855-1-boris.brezillon@collabora.com> References: <20241028114815.3793855-1-boris.brezillon@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Groups can be killed during a reset even though they did nothing wrong. That usually happens when the FW is put in a bad state by other groups, resulting in group suspension failures when the reset happens. If we end up in that situation, flag the group innocent and report innocence through a new DRM_PANTHOR_GROUP_STATE flag. Bump the minor driver version to reflect the uAPI change. Changes in v2: - New patch Signed-off-by: Boris Brezillon --- drivers/gpu/drm/panthor/panthor_drv.c | 2 +- drivers/gpu/drm/panthor/panthor_sched.c | 16 ++++++++++++++++ include/uapi/drm/panthor_drm.h | 9 +++++++++ 3 files changed, 26 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c index ac7e53f6e3f0..f1dff7e0173d 100644 --- a/drivers/gpu/drm/panthor/panthor_drv.c +++ b/drivers/gpu/drm/panthor/panthor_drv.c @@ -1507,7 +1507,7 @@ static const struct drm_driver panthor_drm_driver = { .desc = "Panthor DRM driver", .date = "20230801", .major = 1, - .minor = 2, + .minor = 3, .gem_create_object = panthor_gem_create_object, .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table, diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c index ef4bec7ff9c7..bb4a8c522fdc 100644 --- a/drivers/gpu/drm/panthor/panthor_sched.c +++ b/drivers/gpu/drm/panthor/panthor_sched.c @@ -610,6 +610,16 @@ struct panthor_group { */ bool timedout; + /** + * @innocent: True when the group becomes unusable because the group suspension + * failed during a reset. + * + * Sometimes the FW was put in a bad state by other groups, causing the group + * suspension happening in the reset path to fail. In that case, we consider the + * group innocent. + */ + bool innocent; + /** * @syncobjs: Pool of per-queue synchronization objects. * @@ -2690,6 +2700,12 @@ void panthor_sched_suspend(struct panthor_device *ptdev) u32 csg_id = ffs(slot_mask) - 1; struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id]; + /* If the group was still usable before that point, we consider + * it innocent. + */ + if (group_can_run(csg_slot->group)) + csg_slot->group->innocent = true; + /* We consider group suspension failures as fatal and flag the * group as unusable by setting timedout=true. */ diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h index 87c9cb555dd1..b99763cbae48 100644 --- a/include/uapi/drm/panthor_drm.h +++ b/include/uapi/drm/panthor_drm.h @@ -923,6 +923,15 @@ enum drm_panthor_group_state_flags { * When a group ends up with this flag set, no jobs can be submitted to its queues. */ DRM_PANTHOR_GROUP_STATE_FATAL_FAULT = 1 << 1, + + /** + * @DRM_PANTHOR_GROUP_STATE_INNOCENT: Group was killed during a reset caused by other + * groups. + * + * This flag can only be set if DRM_PANTHOR_GROUP_STATE_TIMEDOUT is set and + * DRM_PANTHOR_GROUP_STATE_FATAL_FAULT is not. + */ + DRM_PANTHOR_GROUP_STATE_INNOCENT = 1 << 2, }; /**