Message ID | 20211001100610.2899-18-christian.koenig@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/28] dma-buf: add dma_resv_for_each_fence_unlocked v7 | expand |
On 01/10/2021 11:05, Christian König wrote: > This makes the function much simpler since the complex > retry logic is now handled else where. > > Signed-off-by: Christian König <christian.koenig@amd.com> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Sorry I retract until you add the text about the increased cost of the added atomics. I think the point is important to discuss given proposal goes from zero atomics to num_fences * 2 (fence get/put unless I am mistaken) atomics per busy ioctl. That makes me lean towards just leaving this as is since it is not that complex. Regards, Tvrtko > --- > drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++++++++++-------------- > 1 file changed, 14 insertions(+), 21 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c > index 6234e17259c1..dc72b36dae54 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c > @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, > { > struct drm_i915_gem_busy *args = data; > struct drm_i915_gem_object *obj; > - struct dma_resv_list *list; > - unsigned int seq; > + struct dma_resv_iter cursor; > + struct dma_fence *fence; > int err; > > err = -ENOENT; > @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, > * to report the overall busyness. This is what the wait-ioctl does. > * > */ > -retry: > - seq = raw_read_seqcount(&obj->base.resv->seq); > - > - /* Translate the exclusive fence to the READ *and* WRITE engine */ > - args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv)); > - > - /* Translate shared fences to READ set of engines */ > - list = dma_resv_shared_list(obj->base.resv); > - if (list) { > - unsigned int shared_count = list->shared_count, i; > - > - for (i = 0; i < shared_count; ++i) { > - struct dma_fence *fence = > - rcu_dereference(list->shared[i]); > - > + args->busy = 0; > + dma_resv_iter_begin(&cursor, obj->base.resv, true); > + dma_resv_for_each_fence_unlocked(&cursor, fence) { > + if (dma_resv_iter_is_restarted(&cursor)) > + args->busy = 0; > + > + if (dma_resv_iter_is_exclusive(&cursor)) > + /* Translate the exclusive fence to the READ *and* WRITE engine */ > + args->busy |= busy_check_writer(fence); > + else > + /* Translate shared fences to READ set of engines */ > args->busy |= busy_check_reader(fence); > - } > } > - > - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) > - goto retry; > + dma_resv_iter_end(&cursor); > > err = 0; > out: >
Am 01.10.21 um 12:37 schrieb Tvrtko Ursulin: > > On 01/10/2021 11:05, Christian König wrote: >> This makes the function much simpler since the complex >> retry logic is now handled else where. >> >> Signed-off-by: Christian König <christian.koenig@amd.com> >> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Sorry I retract until you add the text about the increased cost of the > added atomics. I think the point is important to discuss given > proposal goes from zero atomics to num_fences * 2 (fence get/put > unless I am mistaken) atomics per busy ioctl. That makes me lean > towards just leaving this as is since it is not that complex. I'm certainly pushing hard to remove all manual RCU dance from the drivers, including this one. The only option is to either have the atomics overhead (which is indeed num_fences*2) or the locking overhead. Regards, Christian. > > Regards, > > Tvrtko > >> --- >> drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++++++++++-------------- >> 1 file changed, 14 insertions(+), 21 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> index 6234e17259c1..dc72b36dae54 100644 >> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void >> *data, >> { >> struct drm_i915_gem_busy *args = data; >> struct drm_i915_gem_object *obj; >> - struct dma_resv_list *list; >> - unsigned int seq; >> + struct dma_resv_iter cursor; >> + struct dma_fence *fence; >> int err; >> err = -ENOENT; >> @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, >> void *data, >> * to report the overall busyness. This is what the wait-ioctl >> does. >> * >> */ >> -retry: >> - seq = raw_read_seqcount(&obj->base.resv->seq); >> - >> - /* Translate the exclusive fence to the READ *and* WRITE engine */ >> - args->busy = >> busy_check_writer(dma_resv_excl_fence(obj->base.resv)); >> - >> - /* Translate shared fences to READ set of engines */ >> - list = dma_resv_shared_list(obj->base.resv); >> - if (list) { >> - unsigned int shared_count = list->shared_count, i; >> - >> - for (i = 0; i < shared_count; ++i) { >> - struct dma_fence *fence = >> - rcu_dereference(list->shared[i]); >> - >> + args->busy = 0; >> + dma_resv_iter_begin(&cursor, obj->base.resv, true); >> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >> + if (dma_resv_iter_is_restarted(&cursor)) >> + args->busy = 0; >> + >> + if (dma_resv_iter_is_exclusive(&cursor)) >> + /* Translate the exclusive fence to the READ *and* WRITE >> engine */ >> + args->busy |= busy_check_writer(fence); >> + else >> + /* Translate shared fences to READ set of engines */ >> args->busy |= busy_check_reader(fence); >> - } >> } >> - >> - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) >> - goto retry; >> + dma_resv_iter_end(&cursor); >> err = 0; >> out: >>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c index 6234e17259c1..dc72b36dae54 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, { struct drm_i915_gem_busy *args = data; struct drm_i915_gem_object *obj; - struct dma_resv_list *list; - unsigned int seq; + struct dma_resv_iter cursor; + struct dma_fence *fence; int err; err = -ENOENT; @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, * to report the overall busyness. This is what the wait-ioctl does. * */ -retry: - seq = raw_read_seqcount(&obj->base.resv->seq); - - /* Translate the exclusive fence to the READ *and* WRITE engine */ - args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv)); - - /* Translate shared fences to READ set of engines */ - list = dma_resv_shared_list(obj->base.resv); - if (list) { - unsigned int shared_count = list->shared_count, i; - - for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence = - rcu_dereference(list->shared[i]); - + args->busy = 0; + dma_resv_iter_begin(&cursor, obj->base.resv, true); + dma_resv_for_each_fence_unlocked(&cursor, fence) { + if (dma_resv_iter_is_restarted(&cursor)) + args->busy = 0; + + if (dma_resv_iter_is_exclusive(&cursor)) + /* Translate the exclusive fence to the READ *and* WRITE engine */ + args->busy |= busy_check_writer(fence); + else + /* Translate shared fences to READ set of engines */ args->busy |= busy_check_reader(fence); - } } - - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) - goto retry; + dma_resv_iter_end(&cursor); err = 0; out: