diff mbox

[v3,3/3] drm/amdgpu: simplify reservation handling during buffer creation

Message ID 20170216113946.22046-3-nhaehnle@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Nicolai Hähnle Feb. 16, 2017, 11:39 a.m. UTC
From: Nicolai Hähnle <nicolai.haehnle@amd.com>

By using ttm_bo_init_reserved instead of the manual initialization of
the reservation object, the reservation lock will be properly unlocked
and destroyed when the TTM BO initialization fails.

Actual deadlocks caused by the missing unlock should have been fixed
by "drm/ttm: never add BO that failed to validate to the LRU list",
superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
a potential deadlock in amdgpu_bo_create_restricted()").

This change fixes remaining recursive locking errors that can be seen
with lock debugging enabled, and avoids the error of freeing a locked
mutex.

As an additional minor bonus, buffers created with resv == NULL and
the AMDGPU_GEM_CREATE_VRAM_CLEARED flag are now only added to the
global LRU list after the fill commands have been issued.

Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

Comments

Christian König Feb. 16, 2017, 12:08 p.m. UTC | #1
Am 16.02.2017 um 12:39 schrieb Nicolai Hähnle:
> From: Nicolai Hähnle <nicolai.haehnle@amd.com>
>
> By using ttm_bo_init_reserved instead of the manual initialization of
> the reservation object, the reservation lock will be properly unlocked
> and destroyed when the TTM BO initialization fails.
>
> Actual deadlocks caused by the missing unlock should have been fixed
> by "drm/ttm: never add BO that failed to validate to the LRU list",
> superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
> a potential deadlock in amdgpu_bo_create_restricted()").
>
> This change fixes remaining recursive locking errors that can be seen
> with lock debugging enabled, and avoids the error of freeing a locked
> mutex.
>
> As an additional minor bonus, buffers created with resv == NULL and
> the AMDGPU_GEM_CREATE_VRAM_CLEARED flag are now only added to the
> global LRU list after the fill commands have been issued.
>
> Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 17 ++++-------------
>   1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index c2e57f7..4d0536d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -395,19 +395,10 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
>   	amdgpu_fill_placement_to_bo(bo, placement);
>   	/* Kernel allocation are uninterruptible */
>   
> -	if (!resv) {
> -		bool locked;
> -
> -		reservation_object_init(&bo->tbo.ttm_resv);
> -		locked = ww_mutex_trylock(&bo->tbo.ttm_resv.lock);
> -		WARN_ON(!locked);
> -	}
> -
>   	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
> -	r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
> -			&bo->placement, page_align, !kernel, NULL,
> -			acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
> -			&amdgpu_ttm_bo_destroy);
> +	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
> +				 &bo->placement, page_align, !kernel, NULL,
> +				 acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
>   	amdgpu_cs_report_moved_bytes(adev,
>   		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
>   
> @@ -433,7 +424,7 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
>   		fence_put(fence);
>   	}
>   	if (!resv)
> -		ww_mutex_unlock(&bo->tbo.resv->lock);
> +		ttm_bo_unreserve(&bo->tbo);

You can just use amdgpu_bo_unreserve() here. With that fixed the whole 
set is Reviewed-by: Christian König <christian.koenig@amd.com>.

Regards,
Christian.

>   	*bo_ptr = bo;
>   
>   	trace_amdgpu_bo_create(bo);
Chunming Zhou Feb. 17, 2017, 2:48 a.m. UTC | #2
On 2017年02月16日 20:08, Christian König wrote:
> Am 16.02.2017 um 12:39 schrieb Nicolai Hähnle:
>> From: Nicolai Hähnle <nicolai.haehnle@amd.com>
>>
>> By using ttm_bo_init_reserved instead of the manual initialization of
>> the reservation object, the reservation lock will be properly unlocked
>> and destroyed when the TTM BO initialization fails.
>>
>> Actual deadlocks caused by the missing unlock should have been fixed
>> by "drm/ttm: never add BO that failed to validate to the LRU list",
>> superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
>> a potential deadlock in amdgpu_bo_create_restricted()").
>>
>> This change fixes remaining recursive locking errors that can be seen
>> with lock debugging enabled, and avoids the error of freeing a locked
>> mutex.
>>
>> As an additional minor bonus, buffers created with resv == NULL and
>> the AMDGPU_GEM_CREATE_VRAM_CLEARED flag are now only added to the
>> global LRU list after the fill commands have been issued.
>>
>> Fixes: 12a852219583 ("drm/amdgpu: improve 
>> AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
>> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> as well

>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 17 ++++-------------
>>   1 file changed, 4 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index c2e57f7..4d0536d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -395,19 +395,10 @@ int amdgpu_bo_create_restricted(struct 
>> amdgpu_device *adev,
>>       amdgpu_fill_placement_to_bo(bo, placement);
>>       /* Kernel allocation are uninterruptible */
>>   -    if (!resv) {
>> -        bool locked;
>> -
>> -        reservation_object_init(&bo->tbo.ttm_resv);
>> -        locked = ww_mutex_trylock(&bo->tbo.ttm_resv.lock);
>> -        WARN_ON(!locked);
>> -    }
>> -
>>       initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
>> -    r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
>> -            &bo->placement, page_align, !kernel, NULL,
>> -            acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
>> -            &amdgpu_ttm_bo_destroy);
>> +    r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
>> +                 &bo->placement, page_align, !kernel, NULL,
>> +                 acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
>>       amdgpu_cs_report_moved_bytes(adev,
>>           atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
>>   @@ -433,7 +424,7 @@ int amdgpu_bo_create_restricted(struct 
>> amdgpu_device *adev,
>>           fence_put(fence);
>>       }
>>       if (!resv)
>> -        ww_mutex_unlock(&bo->tbo.resv->lock);
>> +        ttm_bo_unreserve(&bo->tbo);
>
> You can just use amdgpu_bo_unreserve() here. With that fixed the whole 
> set is Reviewed-by: Christian König <christian.koenig@amd.com>.
>
> Regards,
> Christian.
>
>>       *bo_ptr = bo;
>>         trace_amdgpu_bo_create(bo);
>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
diff mbox

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index c2e57f7..4d0536d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -395,19 +395,10 @@  int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	amdgpu_fill_placement_to_bo(bo, placement);
 	/* Kernel allocation are uninterruptible */
 
-	if (!resv) {
-		bool locked;
-
-		reservation_object_init(&bo->tbo.ttm_resv);
-		locked = ww_mutex_trylock(&bo->tbo.ttm_resv.lock);
-		WARN_ON(!locked);
-	}
-
 	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
-	r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
-			&bo->placement, page_align, !kernel, NULL,
-			acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
-			&amdgpu_ttm_bo_destroy);
+	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
+				 &bo->placement, page_align, !kernel, NULL,
+				 acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
 	amdgpu_cs_report_moved_bytes(adev,
 		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
 
@@ -433,7 +424,7 @@  int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 		fence_put(fence);
 	}
 	if (!resv)
-		ww_mutex_unlock(&bo->tbo.resv->lock);
+		ttm_bo_unreserve(&bo->tbo);
 	*bo_ptr = bo;
 
 	trace_amdgpu_bo_create(bo);