Message ID | 20210922091044.2612-14-christian.koenig@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/26] dma-buf: add dma_resv_for_each_fence_unlocked v4 | expand |
Am 22.09.21 um 14:20 schrieb Tvrtko Ursulin: > > On 22/09/2021 13:15, Christian König wrote: >> Am 22.09.21 um 13:46 schrieb Tvrtko Ursulin: >>> >>> On 22/09/2021 11:21, Tvrtko Ursulin wrote: >>>> >>>> On 22/09/2021 10:10, Christian König wrote: >>>>> This makes the function much simpler since the complex >>>>> retry logic is now handled else where. >>>>> >>>>> Signed-off-by: Christian König <christian.koenig@amd.com> >>>>> --- >>>>> drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 >>>>> ++++++++++-------------- >>>>> 1 file changed, 14 insertions(+), 21 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>>>> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>>>> index 6234e17259c1..313afb4a11c7 100644 >>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>>>> @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void >>>>> *data, >>>>> { >>>>> struct drm_i915_gem_busy *args = data; >>>>> struct drm_i915_gem_object *obj; >>>>> - struct dma_resv_list *list; >>>>> - unsigned int seq; >>>>> + struct dma_resv_iter cursor; >>>>> + struct dma_fence *fence; >>>>> int err; >>>>> err = -ENOENT; >>>>> @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, >>>>> void *data, >>>>> * to report the overall busyness. This is what the >>>>> wait-ioctl does. >>>>> * >>>>> */ >>>>> -retry: >>>>> - seq = raw_read_seqcount(&obj->base.resv->seq); >>>>> - >>>>> - /* Translate the exclusive fence to the READ *and* WRITE >>>>> engine */ >>>>> - args->busy = >>>>> busy_check_writer(dma_resv_excl_fence(obj->base.resv)); >>>>> - >>>>> - /* Translate shared fences to READ set of engines */ >>>>> - list = dma_resv_shared_list(obj->base.resv); >>>>> - if (list) { >>>>> - unsigned int shared_count = list->shared_count, i; >>>>> - >>>>> - for (i = 0; i < shared_count; ++i) { >>>>> - struct dma_fence *fence = >>>>> - rcu_dereference(list->shared[i]); >>>>> - >>>>> + args->busy = false; >>>> >>>> You can drop this line, especially since it is not a boolean. With >>>> that: >>>> >>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >>> >>> Having said this, one thing to add in the commit message is some >>> commentary that although simpler in code, the new implementation has >>> a lot more atomic instructions due all the extra fence get/put. >>> >>> Saying this because I remembered busy ioctl is quite an over-popular >>> one. Thinking about traces from some real userspaces I looked at in >>> the past. >>> >>> So I think ack from maintainers will be required here. Because I >>> just don't know if any performance impact will be visible or not. So >>> view my r-b as "code looks fine" but I am on the fence if it should >>> actually be merged. Probably leaning towards no actually - given how >>> the code is localised here and I dislike burdening old platforms >>> with more CPU time it could be cheaply left as is. >> >> Well previously we would have allocated memory, which as far as I >> know has more overhead than a few extra atomic operations. > > It doesn't, it only uses dma_resv_excl_fence and dma_resv_shared_list. Yeah, ok then that's not really an option any more. I think Daniel and I are totally on the same page that we won't allow this RCU dance in the drivers any more. Regards, Christian. > > Regards, > > Tvrtko > >> On the other hand I could as well stick with dma_resv_get_fences() here. >> >> Regards, >> Christian. >> >>> >>> Regards, >>> >>> Tvrtko >>> >>> >>>> >>>> Regards, >>>> >>>> Tvrtko >>>> >>>>> + dma_resv_iter_begin(&cursor, obj->base.resv, true); >>>>> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >>>>> + if (dma_resv_iter_is_restarted(&cursor)) >>>>> + args->busy = 0; >>>>> + >>>>> + if (dma_resv_iter_is_exclusive(&cursor)) >>>>> + /* Translate the exclusive fence to the READ *and* >>>>> WRITE engine */ >>>>> + args->busy |= busy_check_writer(fence); >>>>> + else >>>>> + /* Translate shared fences to READ set of engines */ >>>>> args->busy |= busy_check_reader(fence); >>>>> - } >>>>> } >>>>> - >>>>> - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, >>>>> seq)) >>>>> - goto retry; >>>>> + dma_resv_iter_end(&cursor); >>>>> err = 0; >>>>> out: >>>>> >>
Am 22.09.21 um 12:21 schrieb Tvrtko Ursulin: > > On 22/09/2021 10:10, Christian König wrote: >> This makes the function much simpler since the complex >> retry logic is now handled else where. >> >> Signed-off-by: Christian König <christian.koenig@amd.com> >> --- >> drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++++++++++-------------- >> 1 file changed, 14 insertions(+), 21 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> index 6234e17259c1..313afb4a11c7 100644 >> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >> @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void >> *data, >> { >> struct drm_i915_gem_busy *args = data; >> struct drm_i915_gem_object *obj; >> - struct dma_resv_list *list; >> - unsigned int seq; >> + struct dma_resv_iter cursor; >> + struct dma_fence *fence; >> int err; >> err = -ENOENT; >> @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, >> void *data, >> * to report the overall busyness. This is what the wait-ioctl >> does. >> * >> */ >> -retry: >> - seq = raw_read_seqcount(&obj->base.resv->seq); >> - >> - /* Translate the exclusive fence to the READ *and* WRITE engine */ >> - args->busy = >> busy_check_writer(dma_resv_excl_fence(obj->base.resv)); >> - >> - /* Translate shared fences to READ set of engines */ >> - list = dma_resv_shared_list(obj->base.resv); >> - if (list) { >> - unsigned int shared_count = list->shared_count, i; >> - >> - for (i = 0; i < shared_count; ++i) { >> - struct dma_fence *fence = >> - rcu_dereference(list->shared[i]); >> - >> + args->busy = false; > > You can drop this line, especially since it is not a boolean. With that: I just realized that this won't work. We still need to initialize the return value when there is no fence at all in the resv object. > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Does that still counts if I set args->busy to zero? Thanks, Christian. > > Regards, > > Tvrtko > >> + dma_resv_iter_begin(&cursor, obj->base.resv, true); >> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >> + if (dma_resv_iter_is_restarted(&cursor)) >> + args->busy = 0; >> + >> + if (dma_resv_iter_is_exclusive(&cursor)) >> + /* Translate the exclusive fence to the READ *and* WRITE >> engine */ >> + args->busy |= busy_check_writer(fence); >> + else >> + /* Translate shared fences to READ set of engines */ >> args->busy |= busy_check_reader(fence); >> - } >> } >> - >> - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) >> - goto retry; >> + dma_resv_iter_end(&cursor); >> err = 0; >> out: >>
On 22/09/2021 15:31, Christian König wrote: > Am 22.09.21 um 12:21 schrieb Tvrtko Ursulin: >> >> On 22/09/2021 10:10, Christian König wrote: >>> This makes the function much simpler since the complex >>> retry logic is now handled else where. >>> >>> Signed-off-by: Christian König <christian.koenig@amd.com> >>> --- >>> drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++++++++++-------------- >>> 1 file changed, 14 insertions(+), 21 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>> index 6234e17259c1..313afb4a11c7 100644 >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c >>> @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void >>> *data, >>> { >>> struct drm_i915_gem_busy *args = data; >>> struct drm_i915_gem_object *obj; >>> - struct dma_resv_list *list; >>> - unsigned int seq; >>> + struct dma_resv_iter cursor; >>> + struct dma_fence *fence; >>> int err; >>> err = -ENOENT; >>> @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, >>> void *data, >>> * to report the overall busyness. This is what the wait-ioctl >>> does. >>> * >>> */ >>> -retry: >>> - seq = raw_read_seqcount(&obj->base.resv->seq); >>> - >>> - /* Translate the exclusive fence to the READ *and* WRITE engine */ >>> - args->busy = >>> busy_check_writer(dma_resv_excl_fence(obj->base.resv)); >>> - >>> - /* Translate shared fences to READ set of engines */ >>> - list = dma_resv_shared_list(obj->base.resv); >>> - if (list) { >>> - unsigned int shared_count = list->shared_count, i; >>> - >>> - for (i = 0; i < shared_count; ++i) { >>> - struct dma_fence *fence = >>> - rcu_dereference(list->shared[i]); >>> - >>> + args->busy = false; >> >> You can drop this line, especially since it is not a boolean. With that: > > I just realized that this won't work. We still need to initialize the > return value when there is no fence at all in the resv object. > >> >> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Does that still counts if I set args->busy to zero? Ah yes, my bad, apologies. You can keep the r-b. Regards, Tvrtko > > Thanks, > Christian. > >> >> Regards, >> >> Tvrtko >> >>> + dma_resv_iter_begin(&cursor, obj->base.resv, true); >>> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >>> + if (dma_resv_iter_is_restarted(&cursor)) >>> + args->busy = 0; >>> + >>> + if (dma_resv_iter_is_exclusive(&cursor)) >>> + /* Translate the exclusive fence to the READ *and* WRITE >>> engine */ >>> + args->busy |= busy_check_writer(fence); >>> + else >>> + /* Translate shared fences to READ set of engines */ >>> args->busy |= busy_check_reader(fence); >>> - } >>> } >>> - >>> - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) >>> - goto retry; >>> + dma_resv_iter_end(&cursor); >>> err = 0; >>> out: >>> >
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c index 6234e17259c1..313afb4a11c7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, { struct drm_i915_gem_busy *args = data; struct drm_i915_gem_object *obj; - struct dma_resv_list *list; - unsigned int seq; + struct dma_resv_iter cursor; + struct dma_fence *fence; int err; err = -ENOENT; @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, * to report the overall busyness. This is what the wait-ioctl does. * */ -retry: - seq = raw_read_seqcount(&obj->base.resv->seq); - - /* Translate the exclusive fence to the READ *and* WRITE engine */ - args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv)); - - /* Translate shared fences to READ set of engines */ - list = dma_resv_shared_list(obj->base.resv); - if (list) { - unsigned int shared_count = list->shared_count, i; - - for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence = - rcu_dereference(list->shared[i]); - + args->busy = false; + dma_resv_iter_begin(&cursor, obj->base.resv, true); + dma_resv_for_each_fence_unlocked(&cursor, fence) { + if (dma_resv_iter_is_restarted(&cursor)) + args->busy = 0; + + if (dma_resv_iter_is_exclusive(&cursor)) + /* Translate the exclusive fence to the READ *and* WRITE engine */ + args->busy |= busy_check_writer(fence); + else + /* Translate shared fences to READ set of engines */ args->busy |= busy_check_reader(fence); - } } - - if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq)) - goto retry; + dma_resv_iter_end(&cursor); err = 0; out:
This makes the function much simpler since the complex retry logic is now handled else where. Signed-off-by: Christian König <christian.koenig@amd.com> --- drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++++++++++-------------- 1 file changed, 14 insertions(+), 21 deletions(-)