Message ID | 20170215191056.17718-3-nhaehnle@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 16/02/17 04:10 AM, Nicolai Hähnle wrote: > From: Nicolai Hähnle <nicolai.haehnle@amd.com> > > Open-code the initial ttm_bo_validate call, so that we can properly > unlock the reservation lock when it fails. Also, properly destruct > the reservation object when the first part of TTM BO initialization > fails. > > Actual deadlocks caused by the missing unlock should have been fixed > by "drm/ttm: never add BO that failed to validate to the LRU list", > superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix > a potential deadlock in amdgpu_bo_create_restricted()"). > > This change fixes remaining recursive locking errors that can be seen > with lock debugging enabled, and avoids the error of freeing a locked > mutex. > > v2: use ttm_bo_unref for error handling, and rebase on latest changes > > Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)") > Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++++++++++++++++++++++------ > 1 file changed, 22 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index c2e57f7..15944ea 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -404,15 +404,31 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev, > } > > initial_bytes_moved = atomic64_read(&adev->num_bytes_moved); > - r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type, > - &bo->placement, page_align, !kernel, NULL, > - acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, > - &amdgpu_ttm_bo_destroy); > + r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type, > + page_align, NULL, > + acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, > + &amdgpu_ttm_bo_destroy); > + > + if (likely(r == 0)) > + r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false); > + > + if (unlikely(r != 0)) { > + struct ttm_buffer_object *tbo = &bo->tbo; > + > + if (!resv) > + ww_mutex_unlock(&bo->tbo.ttm_resv.lock); > + ttm_bo_unref(&tbo); > + return r; > + } I think this would be clearer as if (unlikely(r != 0)) { struct ttm_buffer_object *tbo = &bo->tbo; if (!resv) ww_mutex_unlock(&bo->tbo.ttm_resv.lock); ttm_bo_unref(&tbo); return r; } r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false); [...] > amdgpu_cs_report_moved_bytes(adev, > atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved); > > - if (unlikely(r != 0)) > - return r; > + if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) { > + spin_lock(&bo->tbo.glob->lru_lock); > + ttm_bo_add_to_lru(&bo->tbo); > + spin_unlock(&bo->tbo.glob->lru_lock); > + } It's a bit ugly to open-code this logic in driver code. Maybe add another TTM helper which calls ttm_bo_validate and ttm_bo_add_to_lru?
On 16.02.2017 02:00, Michel Dänzer wrote: > On 16/02/17 04:10 AM, Nicolai Hähnle wrote: >> From: Nicolai Hähnle <nicolai.haehnle@amd.com> >> >> Open-code the initial ttm_bo_validate call, so that we can properly >> unlock the reservation lock when it fails. Also, properly destruct >> the reservation object when the first part of TTM BO initialization >> fails. >> >> Actual deadlocks caused by the missing unlock should have been fixed >> by "drm/ttm: never add BO that failed to validate to the LRU list", >> superseding the flawed fix in commit 38fc4856ad98 ("drm/amdgpu: fix >> a potential deadlock in amdgpu_bo_create_restricted()"). >> >> This change fixes remaining recursive locking errors that can be seen >> with lock debugging enabled, and avoids the error of freeing a locked >> mutex. >> >> v2: use ttm_bo_unref for error handling, and rebase on latest changes >> >> Fixes: 12a852219583 ("drm/amdgpu: improve AMDGPU_GEM_CREATE_VRAM_CLEARED handling (v2)") >> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 28 ++++++++++++++++++++++------ >> 1 file changed, 22 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >> index c2e57f7..15944ea 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c >> @@ -404,15 +404,31 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev, >> } >> >> initial_bytes_moved = atomic64_read(&adev->num_bytes_moved); >> - r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type, >> - &bo->placement, page_align, !kernel, NULL, >> - acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, >> - &amdgpu_ttm_bo_destroy); >> + r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type, >> + page_align, NULL, >> + acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, >> + &amdgpu_ttm_bo_destroy); >> + >> + if (likely(r == 0)) >> + r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false); >> + >> + if (unlikely(r != 0)) { >> + struct ttm_buffer_object *tbo = &bo->tbo; >> + >> + if (!resv) >> + ww_mutex_unlock(&bo->tbo.ttm_resv.lock); >> + ttm_bo_unref(&tbo); >> + return r; >> + } > > I think this would be clearer as > > if (unlikely(r != 0)) { > struct ttm_buffer_object *tbo = &bo->tbo; > > if (!resv) > ww_mutex_unlock(&bo->tbo.ttm_resv.lock); > ttm_bo_unref(&tbo); > return r; > } > > r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false); > [...] > > >> amdgpu_cs_report_moved_bytes(adev, >> atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved); >> >> - if (unlikely(r != 0)) >> - return r; >> + if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) { >> + spin_lock(&bo->tbo.glob->lru_lock); >> + ttm_bo_add_to_lru(&bo->tbo); >> + spin_unlock(&bo->tbo.glob->lru_lock); >> + } > > It's a bit ugly to open-code this logic in driver code. Maybe add > another TTM helper which calls ttm_bo_validate and ttm_bo_add_to_lru? On further reflection, maybe a better approach would be to have a function ttm_bo_init_reserved, which returns with the reservation lock held upon success. We can then just change amdgpu_bo_create_restricted to ttm_bo_unreserve instead of ww_mutex_unlock, which does the add-to-LRU. AFAICT, it doesn't make much sense to add the buffer to the LRU that early anyway, if amdgpu_fill_buffer is used. Cheers, Nicolai > >
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index c2e57f7..15944ea 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -404,15 +404,31 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev, } initial_bytes_moved = atomic64_read(&adev->num_bytes_moved); - r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type, - &bo->placement, page_align, !kernel, NULL, - acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, - &amdgpu_ttm_bo_destroy); + r = ttm_bo_init_top(&adev->mman.bdev, &bo->tbo, size, type, + page_align, NULL, + acc_size, sg, resv ? resv : &bo->tbo.ttm_resv, + &amdgpu_ttm_bo_destroy); + + if (likely(r == 0)) + r = ttm_bo_validate(&bo->tbo, &bo->placement, !kernel, false); + + if (unlikely(r != 0)) { + struct ttm_buffer_object *tbo = &bo->tbo; + + if (!resv) + ww_mutex_unlock(&bo->tbo.ttm_resv.lock); + ttm_bo_unref(&tbo); + return r; + } + amdgpu_cs_report_moved_bytes(adev, atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved); - if (unlikely(r != 0)) - return r; + if (!(bo->tbo.mem.placement & TTM_PL_FLAG_NO_EVICT)) { + spin_lock(&bo->tbo.glob->lru_lock); + ttm_bo_add_to_lru(&bo->tbo); + spin_unlock(&bo->tbo.glob->lru_lock); + } bo->tbo.priority = ilog2(bo->tbo.num_pages); if (kernel)