From patchwork Wed Nov 9 19:09:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Janusz Krzysztofik
X-Patchwork-Id: 13037941
From: Janusz Krzysztofik
To: Joonas Lahtinen
Date: Wed, 9 Nov 2022 20:09:35 +0100
Message-Id: <20221109190937.64155-2-janusz.krzysztofik@linux.intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221109190937.64155-1-janusz.krzysztofik@linux.intel.com>
References: <20221109190937.64155-1-janusz.krzysztofik@linux.intel.com>
Subject: [Intel-gfx] [PATCH 1/3] drm/i915: Fix timeout handling when retiring requests
List-Id: Intel graphics driver community testing & development
Cc: intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 Chris Wilson, Daniel Vetter, Rodrigo Vivi, David Airlie

I believe that intel_gt_retire_requests_timeout() should return either
-ETIME if all time designated by the timeout argument has been consumed
while waiting for fences to be signaled, or the remaining time if there
are requests still not retired, or 0 otherwise.  In that last case, the
remaining time should be passed back via the remaining_timeout argument.
The remaining time is updated with the return value of each consecutive
call to dma_fence_wait_timeout().

If an error code is returned instead of the remaining time, a few
potentially unexpected side effects occur:
- we no longer wait for the last request fences of consecutive timelines
  to be signaled before we try to retire requests from those timelines
  -- while expected in case of -ETIME, that's probably not intended in
  case of other errors that dma_fence_wait_timeout() can return,
- the error code (a negative value) is passed back as remaining time, and
  if no more requests happen to be left pending despite the error, a user
  may pass that value forward as a remaining timeout -- that can
  potentially trigger a WARN or BUG,
- a potentially unexpected error code is returned to the user when a
  non-critical error that probably shouldn't stop the user from retrying
  occurs while active requests are still pending.

Moreover, should dma_fence_wait_timeout() ever return 0 (which should
mean timeout expiration) while we are processing requests, and there are
still pending requests when we are about to return, that 0 value is
returned to the user as if all requests were successfully retired.

Ignore error codes from dma_fence_wait_timeout() other than -ETIME and
don't overwrite the remaining time with those error codes.  Also, convert
a 0 value returned by dma_fence_wait_timeout() to -ETIME.
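
For illustration only, the intended handling of a single wait result can
be sketched in plain userspace C.  This is not the driver code:
fold_wait_result() is a hypothetical helper name, and -EINTR is used
merely as a stand-in for "some other error" that a wait might return.

```c
#include <errno.h>

/*
 * Hypothetical helper (not an i915 function): fold the result of one
 * fence wait into the running timeout, following the rules described
 * in the commit message above.
 */
static long fold_wait_result(long timeout, long time_left)
{
	/* 0 or -ETIME: the timeout expired */
	if (time_left == -ETIME || time_left == 0)
		return -ETIME;

	/* positive value: time still left for the next timeline */
	if (time_left > 0)
		return time_left;

	/* any other error: ignore it, assume no time was consumed */
	return timeout;
}
```

With a budget of 100, a wait result of 40 leaves 40, a result of 0 or
-ETIME yields -ETIME, and -EINTR leaves the budget at 100.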

Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik
Cc: stable@vger.kernel.org # v5.5+
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index edb881d756309..6c3b8ac3055c3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -156,11 +156,22 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
 
 			fence = i915_active_fence_get(&tl->last_request);
 			if (fence) {
+				signed long time_left;
+
 				mutex_unlock(&tl->mutex);
 
-				timeout = dma_fence_wait_timeout(fence,
-								 true,
-								 timeout);
+				time_left = dma_fence_wait_timeout(fence,
+								   true,
+								   timeout);
+				/*
+				 * 0 or -ETIME: timeout expired
+				 * other errors: ignore, assume no time consumed
+				 */
+				if (time_left == -ETIME || time_left == 0)
+					timeout = -ETIME;
+				else if (time_left > 0)
+					timeout = time_left;
+
 				dma_fence_put(fence);
 
 				/* Retirement is best effort */

From patchwork Wed Nov 9 19:09:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Janusz Krzysztofik
X-Patchwork-Id: 13037942
From: Janusz Krzysztofik
To: Joonas Lahtinen
Date: Wed, 9 Nov 2022 20:09:36 +0100
Message-Id: <20221109190937.64155-3-janusz.krzysztofik@linux.intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221109190937.64155-1-janusz.krzysztofik@linux.intel.com>
References: <20221109190937.64155-1-janusz.krzysztofik@linux.intel.com>
Subject: [Intel-gfx] [PATCH 2/3] drm/i915: Fix unintended submission flush after retire times out
List-Id: Intel graphics driver community testing & development
Cc: intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 Chris Wilson, Daniel Vetter, Rodrigo Vivi, David Airlie

If a wait on a request DMA fence times out while we are retiring
requests, -ETIME is stored as the remaining time.  Then
flush_submission(), called thereafter, proceeds with its work instead of
returning immediately, because the timeout value passed to it is not
equal to 0.  That's probably not what was intended.  Fix it by replacing
the -ETIME value of the argument with 0.

Fixes: 09137e945437 ("drm/i915/gem: Unpin idle contexts from kswapd reclaim")
Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 6c3b8ac3055c3..309d5937d6910 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -204,7 +204,7 @@ out_active: spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &free, link)
 		__intel_timeline_free(&tl->kref);
 
-	if (flush_submission(gt, timeout)) /* Wait, there's more! */
+	if (flush_submission(gt, timeout > 0)) /* Wait, there's more! */
 		active_count++;
 
 	if (remaining_timeout)

From patchwork Wed Nov 9 19:09:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Janusz Krzysztofik
X-Patchwork-Id: 13037943
From: Janusz Krzysztofik
To: Joonas Lahtinen
Date: Wed, 9 Nov 2022 20:09:37 +0100
Message-Id: <20221109190937.64155-4-janusz.krzysztofik@linux.intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221109190937.64155-1-janusz.krzysztofik@linux.intel.com>
References: <20221109190937.64155-1-janusz.krzysztofik@linux.intel.com>
Subject: [Intel-gfx] [PATCH 3/3] drm/i915: Fix 0 return value from DMA fence wait on i915 requests
List-Id: Intel graphics driver community testing & development
Cc: intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 Chris Wilson, Daniel Vetter, Rodrigo Vivi, David Airlie

According to the docs, dma_fence_wait_timeout() returns "0 if the wait
timed out," ... "Other error values may be returned on custom
implementations."  While it is not quite clear whether a custom
implementation is allowed to return "other error" instead of 0, it is
rather clear that a 0 return value should always mean that the wait timed
out before the fence got signaled.

The i915 implementation of dma_fence_ops.wait() used with request fences,
which is a transparent wrapper around i915_request_wait_timeout(),
returns -ETIME if the wait has timed out -- that may be considered
acceptable.  However, it can return 0 in the rare case when the fence has
been found signaled right after no more wait time was left, and that's
not compatible with the expectations of dma-fence and its users.

Since other users of i915_request_wait_timeout() may interpret a 0 return
value as success, don't touch it; update the i915_fence_wait() wrapper
instead.
Return 1 instead of 0, but keep -ETIME in case of timeout since some
i915 users of dma_fence_wait_timeout() may expect it.

Signed-off-by: Janusz Krzysztofik
---
 drivers/gpu/drm/i915/i915_request.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f949a9495758a..451456ab1ddef 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -102,7 +102,7 @@ static signed long i915_fence_wait(struct dma_fence *fence,
 {
 	return i915_request_wait_timeout(to_request(fence),
 					 interruptible | I915_WAIT_PRIORITY,
-					 timeout);
+					 timeout) ?: 1;
 }
 
 struct kmem_cache *i915_request_slab_cache(void)
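
As a rough illustration of the return value mapping in the wrapper change
above, here is a userspace sketch.  Both function names are hypothetical
stand-ins, not kernel symbols, and the conditional is spelled out in full;
the "?:" form used in the diff is the GNU C shorthand for the same
expression.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for i915_request_wait_timeout(): it simply
 * hands back whatever result the caller asks it to simulate.
 */
static long fake_request_wait(long simulated_result)
{
	return simulated_result;
}

/*
 * Sketch of the wrapper behaviour: 0 ("signaled with no time left")
 * is reported as 1, everything else passes through unchanged.
 */
static long fence_wait_sketch(long simulated_result)
{
	long ret = fake_request_wait(simulated_result);

	return ret ? ret : 1;
}
```

Under these assumptions, a simulated 0 comes back as 1, while -ETIME and
positive remaining times are returned as they are.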