Message ID | 20190829142917.13058-3-christian.koenig@amd.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/4] dma-buf: change DMA-buf locking convention |
On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
> This way we can even pipeline imported BO evictions.
>
> v2: Limit this to only cases when the parent object uses a separate
>     reservation object as well. This fixes another OOM problem.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

Since I read quite a bit of ttm I figured I'll review this too, but I'm
totally lost. And git blame gives me at best commits with one-liner commit
messages, and the docs aren't explaining much at all either (and generally
they didn't get updated at all with all the changes in the past years).

I have a vague idea of what you're doing here, but not enough to do review
with any confidence. And from other ttm patches from amd it feels a lot
like we have essentially a bus factor of 1 for all things ttm :-/
-Daniel

> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index fe81c565e7ef..2ebe9fe7f6c8 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  	kref_init(&fbo->base.kref);
>  	fbo->base.destroy = &ttm_transfered_destroy;
>  	fbo->base.acc_size = 0;
> -	fbo->base.base.resv = &fbo->base.base._resv;
> +	if (bo->base.resv == &bo->base._resv)
> +		fbo->base.base.resv = &fbo->base.base._resv;
> +
>  	dma_resv_init(fbo->base.base.resv);
>  	ret = dma_resv_trylock(fbo->base.base.resv);
>  	WARN_ON(!ret);
> @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>  		if (ret)
>  			return ret;
>
> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>
>  		/**
>  		 * If we're not moving to fixed memory, the TTM object
> @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>  		else
>  			bo->ttm = NULL;
>
> -		ttm_bo_unreserve(ghost_obj);
> +		dma_resv_unlock(&ghost_obj->base._resv);
>  		ttm_bo_put(ghost_obj);
>  	}
>
> @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>  		if (ret)
>  			return ret;
>
> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>
>  		/**
>  		 * If we're not moving to fixed memory, the TTM object
> @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>  		else
>  			bo->ttm = NULL;
>
> -		ttm_bo_unreserve(ghost_obj);
> +		dma_resv_unlock(&ghost_obj->base._resv);
>  		ttm_bo_put(ghost_obj);
>
>  	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
> @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>  	if (ret)
>  		return ret;
>
> -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
> +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
>  	/* Last resort, wait for the BO to be idle when we are OOM */
>  	if (ret)
>  		ttm_bo_wait(bo, false, false);
> @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>  	bo->mem.mem_type = TTM_PL_SYSTEM;
>  	bo->ttm = NULL;
>
> -	ttm_bo_unreserve(ghost);
> +	dma_resv_unlock(&ghost->base._resv);
>  	ttm_bo_put(ghost);
>
>  	return 0;
> --
> 2.17.1
>
On 08.10.19 at 11:25, Daniel Vetter wrote:
> On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
>> This way we can even pipeline imported BO evictions.
>>
>> v2: Limit this to only cases when the parent object uses a separate
>>     reservation object as well. This fixes another OOM problem.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
> Since I read quite a bit of ttm I figured I'll review this too, but I'm
> totally lost. And git blame gives me at best commits with one-liner commit
> messages, and the docs aren't explaining much at all either (and generally
> they didn't get updated at all with all the changes in the past years).
>
> I have a vague idea of what you're doing here, but not enough to do review
> with any confidence. And from other ttm patches from amd it feels a lot
> like we have essentially a bus factor of 1 for all things ttm :-/

Yeah, that's one of a couple of reasons why I want to get rid of TTM in the
long term.

Basically this is a bug fix for the delayed freeing of ttm objects. When we
hang the ttm object on a ghost object to be freed and the ttm object is an
imported DMA-buf, we run into the problem that we want to drop the mapping
but have the wrong lock taken (the lock of the ghost and not of the parent).

Regards,
Christian.

> -Daniel
>
>> ---
>>  drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index fe81c565e7ef..2ebe9fe7f6c8 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>>  	kref_init(&fbo->base.kref);
>>  	fbo->base.destroy = &ttm_transfered_destroy;
>>  	fbo->base.acc_size = 0;
>> -	fbo->base.base.resv = &fbo->base.base._resv;
>> +	if (bo->base.resv == &bo->base._resv)
>> +		fbo->base.base.resv = &fbo->base.base._resv;
>> +
>>  	dma_resv_init(fbo->base.base.resv);
>>  	ret = dma_resv_trylock(fbo->base.base.resv);
>>  	WARN_ON(!ret);
>> @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>>  		if (ret)
>>  			return ret;
>>
>> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>
>>  		/**
>>  		 * If we're not moving to fixed memory, the TTM object
>> @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>>  		else
>>  			bo->ttm = NULL;
>>
>> -		ttm_bo_unreserve(ghost_obj);
>> +		dma_resv_unlock(&ghost_obj->base._resv);
>>  		ttm_bo_put(ghost_obj);
>>  	}
>>
>> @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>>  		if (ret)
>>  			return ret;
>>
>> -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
>> +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
>>
>>  		/**
>>  		 * If we're not moving to fixed memory, the TTM object
>> @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
>>  		else
>>  			bo->ttm = NULL;
>>
>> -		ttm_bo_unreserve(ghost_obj);
>> +		dma_resv_unlock(&ghost_obj->base._resv);
>>  		ttm_bo_put(ghost_obj);
>>
>>  	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
>> @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>>  	if (ret)
>>  		return ret;
>>
>> -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
>> +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
>>  	/* Last resort, wait for the BO to be idle when we are OOM */
>>  	if (ret)
>>  		ttm_bo_wait(bo, false, false);
>> @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
>>  	bo->mem.mem_type = TTM_PL_SYSTEM;
>>  	bo->ttm = NULL;
>>
>> -	ttm_bo_unreserve(ghost);
>> +	dma_resv_unlock(&ghost->base._resv);
>>  	ttm_bo_put(ghost);
>>
>>  	return 0;
>> --
>> 2.17.1
>>
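To make the "wrong lock" problem from the reply above concrete, here is a minimal user-space sketch in C. It is not TTM code: `toy_resv`, `toy_bo` and the `toy_*` helpers are hypothetical stand-ins for `struct dma_resv` and `struct ttm_buffer_object`, and the only thing modelled is which lock the ghost's `resv` pointer names before and after the change to `ttm_buffer_object_transfer()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dma_resv: only a "held" flag. */
struct toy_resv {
	bool locked;
};

/* Hypothetical stand-in for a buffer object. */
struct toy_bo {
	struct toy_resv *resv;  /* lock in effect; may belong to an exporter */
	struct toy_resv _resv;  /* lock embedded in this object */
};

/* Old behaviour: the ghost always switches to its own embedded lock. */
static void toy_ghost_old(struct toy_bo *ghost, const struct toy_bo *parent)
{
	assert(ghost != parent);
	*ghost = *parent;             /* copies the resv pointer as well */
	ghost->_resv.locked = false;  /* fresh embedded lock */
	ghost->resv = &ghost->_resv;
}

/* Patched behaviour: only switch when the parent used its embedded lock. */
static void toy_ghost_new(struct toy_bo *ghost, const struct toy_bo *parent)
{
	assert(ghost != parent);
	*ghost = *parent;
	ghost->_resv.locked = false;
	if (parent->resv == &parent->_resv)
		ghost->resv = &ghost->_resv;
}

/* True when holding ghost->resv also serialises against the parent. */
static bool toy_shares_parent_lock(const struct toy_bo *ghost,
				   const struct toy_bo *parent)
{
	return ghost->resv == parent->resv;
}
```

In this model, an imported object (one whose `resv` points at a lock owned by someone else) produces a ghost with an unrelated lock under `toy_ghost_old()`, while `toy_ghost_new()` keeps the ghost on the parent's lock. An object that uses its own embedded lock still gets an independent ghost lock in both variants.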
On Wed, Oct 09, 2019 at 03:10:09PM +0200, Christian König wrote:
> On 08.10.19 at 11:25, Daniel Vetter wrote:
> > On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
> > > This way we can even pipeline imported BO evictions.
> > >
> > > v2: Limit this to only cases when the parent object uses a separate
> > >     reservation object as well. This fixes another OOM problem.
> > >
> > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > Since I read quite a bit of ttm I figured I'll review this too, but I'm
> > totally lost. And git blame gives me at best commits with one-liner commit
> > messages, and the docs aren't explaining much at all either (and generally
> > they didn't get updated at all with all the changes in the past years).
> >
> > I have a vague idea of what you're doing here, but not enough to do review
> > with any confidence. And from other ttm patches from amd it feels a lot
> > like we have essentially a bus factor of 1 for all things ttm :-/
>
> Yeah, that's one of a couple of reasons why I want to get rid of TTM in the
> long term.
>
> Basically this is a bug fix for the delayed freeing of ttm objects. When we
> hang the ttm object on a ghost object to be freed and the ttm object is an
> imported DMA-buf, we run into the problem that we want to drop the mapping
> but have the wrong lock taken (the lock of the ghost and not of the parent).

Got intrigued, did some more digging, I guess the bugfix part is related to:

commit 841e763b40764a7699ae07f4cb1921af62d6316d
Author: Christian König <christian.koenig@amd.com>
Date:   Thu Jul 20 20:55:06 2017 +0200

    drm/ttm: individualize BO reservation obj when they are freed

and that's why you switch everything over to using _resv instead of the
pointer. But then I still don't follow the details ...

>
> Regards,
> Christian.
>
> > -Daniel
> >
> > > ---
> > >  drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
> > >  1 file changed, 9 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > index fe81c565e7ef..2ebe9fe7f6c8 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
> > >  	kref_init(&fbo->base.kref);
> > >  	fbo->base.destroy = &ttm_transfered_destroy;
> > >  	fbo->base.acc_size = 0;
> > > -	fbo->base.base.resv = &fbo->base.base._resv;
> > > +	if (bo->base.resv == &bo->base._resv)
> > > +		fbo->base.base.resv = &fbo->base.base._resv;

I got confused a bit at first, until I spotted the

	fbo->base = *bo;

somewhere above. So I think that part makes sense, together with the above
cited patch. I think at least, confidence on this is very low ...

> > > +
> > >  	dma_resv_init(fbo->base.base.resv);
> > >  	ret = dma_resv_trylock(fbo->base.base.resv);

Shouldn't this be switched over to _resv too? Otherwise it feels like
unbalanced locking.

> > >  	WARN_ON(!ret);
> > > @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > >  		if (ret)
> > >  			return ret;
> > > -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > >  		/**
> > >  		 * If we're not moving to fixed memory, the TTM object
> > > @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > >  		else
> > >  			bo->ttm = NULL;
> > > -		ttm_bo_unreserve(ghost_obj);
> > > +		dma_resv_unlock(&ghost_obj->base._resv);
> > >  		ttm_bo_put(ghost_obj);
> > >  	}
> > > @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > >  		if (ret)
> > >  			return ret;
> > > -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > >  		/**
> > >  		 * If we're not moving to fixed memory, the TTM object
> > > @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > >  		else
> > >  			bo->ttm = NULL;
> > > -		ttm_bo_unreserve(ghost_obj);
> > > +		dma_resv_unlock(&ghost_obj->base._resv);

I guess dropping the lru part here (aside from switching from ->resv to
->_resv, which is your bugfix I think) doesn't matter since the ghost object
got all cleared up and isn't on any lists anyway? Otoh how does it work then
... Not clear to me why this is safe.

> > >  		ttm_bo_put(ghost_obj);
> > >  	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
> > > @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > >  	if (ret)
> > >  		return ret;
> > > -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
> > > +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
> > >  	/* Last resort, wait for the BO to be idle when we are OOM */
> > >  	if (ret)
> > >  		ttm_bo_wait(bo, false, false);
> > > @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > >  	bo->mem.mem_type = TTM_PL_SYSTEM;
> > >  	bo->ttm = NULL;
> > > -	ttm_bo_unreserve(ghost);
> > > +	dma_resv_unlock(&ghost->base._resv);
> > >  	ttm_bo_put(ghost);
> > >  	return 0;
> > > --
> > > 2.17.1
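The two review points above (the struct copy `fbo->base = *bo` and the question about unbalanced locking) can be illustrated with a small user-space sketch in C. Everything here is a hypothetical toy: `toy_resv`, `toy_bo` and the `toy_*` functions are not kernel APIs, and the model only tracks a boolean "held" flag. It assumes, as in the posted hunks, that the lock is taken through one lock object and released through the embedded `_resv`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dma_resv: only a "held" flag. */
struct toy_resv {
	bool locked;
};

/* Hypothetical stand-in for a buffer object with both lock fields. */
struct toy_bo {
	struct toy_resv *resv;  /* pointer to the lock in effect */
	struct toy_resv _resv;  /* lock embedded in the object */
};

/* Take the lock if it is free; report whether we got it. */
static bool toy_trylock(struct toy_resv *r)
{
	if (r->locked)
		return false;
	r->locked = true;
	return true;
}

/* Release the lock; report false if it was not held at all. */
static bool toy_unlock(struct toy_resv *r)
{
	bool was_locked = r->locked;

	r->locked = false;
	return was_locked;
}

/* Plain struct copy, like "fbo->base = *bo": the pointer is copied as-is,
 * so it still names the lock of the object it was copied from. */
static void toy_copy(struct toy_bo *ghost, const struct toy_bo *parent)
{
	assert(ghost != parent);
	*ghost = *parent;
	ghost->_resv.locked = false;  /* models a freshly initialised lock */
}

/* Lock through lock_target, then unlock the embedded lock. Returns true
 * only if the lock that was taken is the one that got released. */
static bool toy_lock_pairing_balanced(struct toy_bo *ghost,
				      struct toy_resv *lock_target)
{
	bool released;

	if (!toy_trylock(lock_target))
		return false;
	released = toy_unlock(&ghost->_resv);
	return released && !lock_target->locked;
}
```

In this toy, the pairing is balanced when the lock is taken through `&ghost->_resv`, and unbalanced when `ghost->resv` names some other lock, which is the situation the review question is about. Whether the real code can actually reach that state is exactly what the thread leaves open.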
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index fe81c565e7ef..2ebe9fe7f6c8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	kref_init(&fbo->base.kref);
 	fbo->base.destroy = &ttm_transfered_destroy;
 	fbo->base.acc_size = 0;
-	fbo->base.base.resv = &fbo->base.base._resv;
+	if (bo->base.resv == &bo->base._resv)
+		fbo->base.base.resv = &fbo->base.base._resv;
+
 	dma_resv_init(fbo->base.base.resv);
 	ret = dma_resv_trylock(fbo->base.base.resv);
 	WARN_ON(!ret);
@@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
 		if (ret)
 			return ret;

-		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
+		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);

 		/**
 		 * If we're not moving to fixed memory, the TTM object
@@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
 		else
 			bo->ttm = NULL;

-		ttm_bo_unreserve(ghost_obj);
+		dma_resv_unlock(&ghost_obj->base._resv);
 		ttm_bo_put(ghost_obj);
 	}

@@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
 		if (ret)
 			return ret;

-		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
+		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);

 		/**
 		 * If we're not moving to fixed memory, the TTM object
@@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
 		else
 			bo->ttm = NULL;

-		ttm_bo_unreserve(ghost_obj);
+		dma_resv_unlock(&ghost_obj->base._resv);
 		ttm_bo_put(ghost_obj);

 	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
@@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 	if (ret)
 		return ret;

-	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
+	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
 	/* Last resort, wait for the BO to be idle when we are OOM */
 	if (ret)
 		ttm_bo_wait(bo, false, false);
@@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 	bo->mem.mem_type = TTM_PL_SYSTEM;
 	bo->ttm = NULL;

-	ttm_bo_unreserve(ghost);
+	dma_resv_unlock(&ghost->base._resv);
 	ttm_bo_put(ghost);

 	return 0;
This way we can even pipeline imported BO evictions.

v2: Limit this to only cases when the parent object uses a separate
    reservation object as well. This fixes another OOM problem.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)