diff mbox

[v2,3/3] drm/amdgpu: fix lock cleanup during buffer creation

Message ID 20170215191056.17718-3-nhaehnle@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Nicolai Hähnle Feb. 15, 2017, 7:10 p.m. UTC
From: Nicolai Hähnle <nicolai.haehnle@amd.com>

Open-code the initial ttm_bo_validate call, so that we can properly
unlock the reservation lock when it fails. Also, properly destruct
the reservation object when the first part of TTM BO initialization
fails.

Actual deadlocks caused by the missing unlock should have been fixed
by "drm/ttm: never add BO that failed to validate to the LRU list",
superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
a potential deadlock in amdgpu_bo_create_restricted()").

This change fixes remaining recursive locking errors that can be seen
with lock debugging enabled, and avoids the error of freeing a locked
mutex.

v2: use ttm_bo_unref for error handling, and rebase on latest changes

Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

Comments

Michel Dänzer Feb. 16, 2017, 1 a.m. UTC | #1
On 16/02/17 04:10 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle <nicolai.haehnle@amd.com>
> 
> Open-code the initial ttm_bo_validate call, so that we can properly
> unlock the reservation lock when it fails. Also, properly destruct
> the reservation object when the first part of TTM BO initialization
> fails.
> 
> Actual deadlocks caused by the missing unlock should have been fixed
> by "drm/ttm: never add BO that failed to validate to the LRU list",
> superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
> a potential deadlock in amdgpu_bo_create_restricted()").
> 
> This change fixes remaining recursive locking errors that can be seen
> with lock debugging enabled, and avoids the error of freeing a locked
> mutex.
> 
> v2: use ttm_bo_unref for error handling, and rebase on latest changes
> 
> Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index c2e57f7..15944ea 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -404,15 +404,31 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
>  	}
>  
>  	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
> -	r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
> -			&bo->placement, page_align, !kernel, NULL,
> -			acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
> -			&amdgpu_ttm_bo_destroy);
> +	r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type,
> +			    page_align, NULL,
> +			    acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
> +			    &amdgpu_ttm_bo_destroy);
> +
> +	if (likely(r == 0))
> +		r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
> +
> +	if (unlikely(r != 0)) {
> +		struct ttm_buffer_object *tbo = &bo->tbo;
> +
> +		if (!resv)
> +			ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
> +		ttm_bo_unref(&tbo);
> +		return r;
> +	}

I think this would be clearer as

	if (unlikely(r != 0)) {
		struct ttm_buffer_object *tbo = &bo->tbo;

		if (!resv)
			ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
		ttm_bo_unref(&tbo);
		return r;
	}

	r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
	[...]


>  	amdgpu_cs_report_moved_bytes(adev,
>  		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
>  
> -	if (unlikely(r != 0))
> -		return r;
> +	if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> +		spin_lock(&bo->tbo.glob->lru_lock);
> +		ttm_bo_add_to_lru(&bo->tbo);
> +		spin_unlock(&bo->tbo.glob->lru_lock);
> +	}

It's a bit ugly to open-code this logic in driver code. Maybe add
another TTM helper which calls ttm_bo_validate and ttm_bo_add_to_lru?
Nicolai Hähnle Feb. 16, 2017, 9:27 a.m. UTC | #2
On 16.02.2017 02:00, Michel Dänzer wrote:
> On 16/02/17 04:10 AM, Nicolai Hähnle wrote:
>> From: Nicolai Hähnle <nicolai.haehnle@amd.com>
>>
>> Open-code the initial ttm_bo_validate call, so that we can properly
>> unlock the reservation lock when it fails. Also, properly destruct
>> the reservation object when the first part of TTM BO initialization
>> fails.
>>
>> Actual deadlocks caused by the missing unlock should have been fixed
>> by "drm/ttm: never add BO that failed to validate to the LRU list",
>> superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix
>> a potential deadlock in amdgpu_bo_create_restricted()").
>>
>> This change fixes remaining recursive locking errors that can be seen
>> with lock debugging enabled, and avoids the error of freeing a locked
>> mutex.
>>
>> v2: use ttm_bo_unref for error handling, and rebase on latest changes
>>
>> Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)")
>> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++++++++++++++++++++++------
>>  1 file changed, 22 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index c2e57f7..15944ea 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -404,15 +404,31 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
>>  	}
>>
>>  	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
>> -	r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
>> -			&bo->placement, page_align, !kernel, NULL,
>> -			acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
>> -			&amdgpu_ttm_bo_destroy);
>> +	r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type,
>> +			    page_align, NULL,
>> +			    acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
>> +			    &amdgpu_ttm_bo_destroy);
>> +
>> +	if (likely(r == 0))
>> +		r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
>> +
>> +	if (unlikely(r != 0)) {
>> +		struct ttm_buffer_object *tbo = &bo->tbo;
>> +
>> +		if (!resv)
>> +			ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
>> +		ttm_bo_unref(&tbo);
>> +		return r;
>> +	}
>
> I think this would be clearer as
>
> 	if (unlikely(r != 0)) {
> 		struct ttm_buffer_object *tbo = &bo->tbo;
>
> 		if (!resv)
> 			ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
> 		ttm_bo_unref(&tbo);
> 		return r;
> 	}
>
> 	r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
> 	[...]
>
>
>>  	amdgpu_cs_report_moved_bytes(adev,
>>  		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
>>
>> -	if (unlikely(r != 0))
>> -		return r;
>> +	if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>> +		spin_lock(&bo->tbo.glob->lru_lock);
>> +		ttm_bo_add_to_lru(&bo->tbo);
>> +		spin_unlock(&bo->tbo.glob->lru_lock);
>> +	}
>
> It's a bit ugly to open-code this logic in driver code. Maybe add
> another TTM helper which calls ttm_bo_validate and ttm_bo_add_to_lru?

On further reflection, maybe a better approach would be to have a 
function ttm_bo_init_reserved, which returns with the reservation lock 
held upon success. We can then just change amdgpu_bo_create_restricted 
to ttm_bo_unreserve instead of ww_mutex_unlock, which does the add-to-LRU.

AFAICT, it doesn't make much sense to add the buffer to the LRU that 
early anyway, if amdgpu_fill_buffer is used.

Cheers,
Nicolai

>
>
diff mbox

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index c2e57f7..15944ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -404,15 +404,31 @@  int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	}
 
 	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
-	r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
-			&bo->placement, page_align, !kernel, NULL,
-			acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
-			&amdgpu_ttm_bo_destroy);
+	r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type,
+			    page_align, NULL,
+			    acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
+			    &amdgpu_ttm_bo_destroy);
+
+	if (likely(r == 0))
+		r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false);
+
+	if (unlikely(r != 0)) {
+		struct ttm_buffer_object *tbo = &bo->tbo;
+
+		if (!resv)
+			ww_mutex_unlock(&bo->tbo.ttm_resv.lock);
+		ttm_bo_unref(&tbo);
+		return r;
+	}
+
 	amdgpu_cs_report_moved_bytes(adev,
 		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
 
-	if (unlikely(r != 0))
-		return r;
+	if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) {
+		spin_lock(&bo->tbo.glob->lru_lock);
+		ttm_bo_add_to_lru(&bo->tbo);
+		spin_unlock(&bo->tbo.glob->lru_lock);
+	}
 
 	bo->tbo.priority = ilog2(bo->tbo.num_pages);
 	if (kernel)