| Message ID | 1302640318-23165-27-git-send-email-chris@chris-wilson.co.uk (mailing list archive) |
|---|---|
| State | New, archived |
On Tue, Apr 12, 2011 at 09:31:54PM +0100, Chris Wilson wrote:
> We only want to mark the transition from unfenced GPU access by an
> execbuffer, so that we are forced to flush any pending writes through
> the fence before updating the register.

The idea behind this change sounds good. But it completely kills the
optimization of not stalling unnecessarily for fences that are no longer
in use, because we reset fenced_gpu_access = false only when moving the
object to the inactive list, or when flushing the fence, which is equally
late.

What about moving

	fenced_gpu_access = false

from flush_fence to process_flushing_list (and replacing the one in
flush_fence with a WARN_ON(fenced_gpu_access) after the flush_ring)?
-Daniel
On Wed, 13 Apr 2011 21:37:03 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Apr 12, 2011 at 09:31:54PM +0100, Chris Wilson wrote:
> > We only want to mark the transition from unfenced GPU access by an
> > execbuffer, so that we are forced to flush any pending writes through
> > the fence before updating the register.
>
> The idea behind this change sounds good.

Whilst I have you in agreement, what do I need to do to get your r-b on
the simple bug fix first? ;-)

> But it completely kills the optimization to not unnecessarily stall for
> fences when the fence isn't in use anymore because we reset
> fenced_gpu_access = false only when moving to the inactive list. And
> when flushing the fence, which is equally late.

I'm following you so far...

> What about moving
>
> 	fenced_gpu_access = false
>
> from flush_fence to process_flushing_list (and replace the one in
> flush_fence with a WARN_ON(fenced_gpu_access) after the flush_ring)?

And I'm still with you. Sounds good.
-Chris
On Wed, Apr 13, 2011 at 09:15:19PM +0100, Chris Wilson wrote:
> On Wed, 13 Apr 2011 21:37:03 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Tue, Apr 12, 2011 at 09:31:54PM +0100, Chris Wilson wrote:
> > > We only want to mark the transition from unfenced GPU access by an
> > > execbuffer, so that we are forced to flush any pending writes through
> > > the fence before updating the register.
> >
> > The idea behind this change sounds good.
>
> Whilst I have you in agreement, what do I need to do to get your r-b on
> the simple bug fix first? ;-)

Oops, that went MIA. So if you want to roll the bugfix independently:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Just add a small comment in the commit message that it essentially
disables that optimization, in case somebody bisects a performance
regression to this.
On Wed, 13 Apr 2011 22:58:26 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> Just add a small comment in the commit msg that it essentially disables
> that optimization, in case somebody bisects a performance regression to
> this.

Noted. But a bisect will never land here, because I've left something out
of this patch series until gem_stress is my friend...

In applying this fix for a corruption bug, we do lose the ability to
detect the earliest end of GPU fenced access, thus disabling the inherent
optimization.
-Chris
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 20a4cc5..a07911f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -911,7 +911,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->base.write_domain = obj->base.pending_write_domain;
-		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
+		obj->fenced_gpu_access |= obj->pending_fenced_gpu_access;

 		i915_gem_object_move_to_active(obj, ring, seqno);
 		if (obj->base.write_domain) {
We only want to mark the transition from unfenced GPU access by an
execbuffer, so that we are forced to flush any pending writes through
the fence before updating the register.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)