Message ID | 20190130105517.23977-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | drm/amdgpu: Transfer fences to dmabuf importer |
On 30.01.19 at 11:55, Chris Wilson wrote:
> amdgpu only uses shared-fences internally, but dmabuf importers rely on
> implicit write hazard tracking via the reservation_object.fence_excl.
> For example, the importer uses the write hazard for timing a page flip to
> only occur after the exporter has finished flushing its write into the
> surface. As such, on exporting a dmabuf, we must either flush all
> outstanding fences (for we do not know which are writes and should have
> been exclusive) or alternatively create a new exclusive fence that is
> the composite of all the existing shared fences, and so will only be
> signaled when all earlier fences are signaled (ensuring that we cannot
> be signaled before the completion of any earlier write).
>
> v2: reservation_object is already locked by amdgpu_bo_reserve()
> v3: Replace looping with get_fences_rcu and special case the promotion
> of a single shared fence directly to an exclusive fence, bypassing the
> fence array.
> v4: Drop the fence array ref after assigning to reservation_object
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341
> Testcase: igt/amd_prime/amd-to-i915
> References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Reviewed-by: "Christian König" <christian.koenig@amd.com>
> ---
> We may disagree on the best long term strategy for fence semantics, but
> I think this is still a nice short term solution to the blocking
> behaviour on exporting amdgpu to prime.

Yeah, I can agree on that. And just pushed the patch to
amd-staging-drm-next.

Christian.

> -Chris
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 59 ++++++++++++++++++++---
>  1 file changed, 51 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> index 71913a18d142..a38e0fb4a6fe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> @@ -38,6 +38,7 @@
>  #include "amdgpu_gem.h"
>  #include <drm/amdgpu_drm.h>
>  #include <linux/dma-buf.h>
> +#include <linux/dma-fence-array.h>
>
>  /**
>   * amdgpu_gem_prime_get_sg_table - &drm_driver.gem_prime_get_sg_table
> @@ -187,6 +188,48 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
>  	return ERR_PTR(ret);
>  }
>
> +static int
> +__reservation_object_make_exclusive(struct reservation_object *obj)
> +{
> +	struct dma_fence **fences;
> +	unsigned int count;
> +	int r;
> +
> +	if (!reservation_object_get_list(obj)) /* no shared fences to convert */
> +		return 0;
> +
> +	r = reservation_object_get_fences_rcu(obj, NULL, &count, &fences);
> +	if (r)
> +		return r;
> +
> +	if (count == 0) {
> +		/* Now that was unexpected. */
> +	} else if (count == 1) {
> +		reservation_object_add_excl_fence(obj, fences[0]);
> +		dma_fence_put(fences[0]);
> +		kfree(fences);
> +	} else {
> +		struct dma_fence_array *array;
> +
> +		array = dma_fence_array_create(count, fences,
> +					       dma_fence_context_alloc(1), 0,
> +					       false);
> +		if (!array)
> +			goto err_fences_put;
> +
> +		reservation_object_add_excl_fence(obj, &array->base);
> +		dma_fence_put(&array->base);
> +	}
> +
> +	return 0;
> +
> +err_fences_put:
> +	while (count--)
> +		dma_fence_put(fences[count]);
> +	kfree(fences);
> +	return -ENOMEM;
> +}
> +
>  /**
>   * amdgpu_gem_map_attach - &dma_buf_ops.attach implementation
>   * @dma_buf: Shared DMA buffer
> @@ -218,16 +261,16 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf,
>
>  	if (attach->dev->driver != adev->dev->driver) {
>  		/*
> -		 * Wait for all shared fences to complete before we switch to future
> -		 * use of exclusive fence on this prime shared bo.
> +		 * We only create shared fences for internal use, but importers
> +		 * of the dmabuf rely on exclusive fences for implicitly
> +		 * tracking write hazards. As any of the current fences may
> +		 * correspond to a write, we need to convert all existing
> +		 * fences on the reservation object into a single exclusive
> +		 * fence.
>  		 */
> -		r = reservation_object_wait_timeout_rcu(bo->tbo.resv,
> -							true, false,
> -							MAX_SCHEDULE_TIMEOUT);
> -		if (unlikely(r < 0)) {
> -			DRM_DEBUG_PRIME("Fence wait failed: %li\n", r);
> +		r = __reservation_object_make_exclusive(bo->tbo.resv);
> +		if (r)
>  			goto error_unreserve;
> -		}
>  	}
>
>  	/* pin buffer into GTT */
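[Editor's note] For context on the write-hazard argument in the commit message, here is a minimal, hedged sketch of the importer side, not part of the patch: it uses the pre-dma_resv reservation_object API of this era, and the helper name is hypothetical. An importer that only honours the exclusive fence, for example when timing a page flip, effectively does something like this, which is exactly why the exporter's writes must be representable by that single exclusive slot:

#include <linux/reservation.h>
#include <linux/dma-fence.h>

/* Hypothetical importer-side helper, for illustration only: block until the
 * exporter's writes, represented by the exclusive fence, have signaled.
 */
static long example_wait_for_exporter_writes(struct reservation_object *resv)
{
        struct dma_fence *excl;
        long ret = 0;

        /* takes a reference on the exclusive fence, or returns NULL */
        excl = reservation_object_get_excl_rcu(resv);
        if (excl) {
                ret = dma_fence_wait(excl, false);
                dma_fence_put(excl);
        }
        return ret;
}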
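[Editor's note] A further note on the reference counting behind the v4 changelog entry, an editorial reading of the generic dma-fence API rather than anything stated in the patch: dma_fence_array_create() takes ownership of the fences[] array and of the references it holds on success, so the helper only frees them by hand on the failure path, and reservation_object_add_excl_fence() takes its own reference on the fence it installs, so the creation reference can be dropped immediately. The sketch below is an annotated restatement of the multi-fence branch, not a separate implementation; the function name is hypothetical.

#include <linux/dma-fence-array.h>
#include <linux/reservation.h>
#include <linux/slab.h>

/* Annotated restatement of the multi-fence branch, to spell out ownership. */
static int example_promote_fences(struct reservation_object *obj,
                                  struct dma_fence **fences, unsigned int count)
{
        struct dma_fence_array *array;

        /* On success the array takes ownership of fences[] and of the
         * references held on each fence; on failure the caller still owns both.
         */
        array = dma_fence_array_create(count, fences,
                                       dma_fence_context_alloc(1), 0, false);
        if (!array) {
                while (count--)
                        dma_fence_put(fences[count]);
                kfree(fences);
                return -ENOMEM;
        }

        /* add_excl_fence() grabs its own reference to &array->base ... */
        reservation_object_add_excl_fence(obj, &array->base);
        /* ... so the creation reference is dropped here (the v4 change). */
        dma_fence_put(&array->base);
        return 0;
}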