| Message ID | 1414576373-25121-1-git-send-email-thomas.daniel@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> No longer create a work item to clean each execlist queue item.
> Instead, move retired execlist requests to a queue and clean up the
> items during retire_requests.
>
> v2: Fix legacy ring path broken during overzealous cleanup
>
> v3: Update idle detection to take execlists queue into account
>
> Issue: VIZ-4274
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c         |  4 +++
>  drivers/gpu/drm/i915/intel_lrc.c        | 52 ++++++++++++++++++-------------
>  drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 36 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 827edb5..df28202 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
>  	for_each_ring(ring, dev_priv, i) {
>  		i915_gem_retire_requests_ring(ring);
>  		idle &= list_empty(&ring->request_list);
> +		if (i915.enable_execlists) {
> +			idle &= list_empty(&ring->execlist_queue);
> +			intel_execlists_retire_requests(ring);

This needs to be the other way round I think - we care about idleness
after all the currently processed stuff is retired, not before.
Otherwise we might notice the busy->idle transition one invocation too
late.
-Daniel
> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Monday, November 03, 2014 3:33 PM
> To: Daniel, Thomas
> Cc: intel-gfx@lists.freedesktop.org; shuang.he@linux.intel.com
> Subject: Re: [Intel-gfx] [PATCH 1/4] drm/i915/bdw: Clean up execlist queue
> items in retire_work
>
> On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> > No longer create a work item to clean each execlist queue item.
> > Instead, move retired execlist requests to a queue and clean up the
> > items during retire_requests.
> >
> > v2: Fix legacy ring path broken during overzealous cleanup
> >
> > v3: Update idle detection to take execlists queue into account
> >
> > Issue: VIZ-4274
> > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c         |  4 +++
> >  drivers/gpu/drm/i915/intel_lrc.c        | 52 ++++++++++++++++++-------------
> >  drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
> >  4 files changed, 36 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > b/drivers/gpu/drm/i915/i915_gem.c index 827edb5..df28202 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
> >  	for_each_ring(ring, dev_priv, i) {
> >  		i915_gem_retire_requests_ring(ring);
> >  		idle &= list_empty(&ring->request_list);
> > +		if (i915.enable_execlists) {
> > +			idle &= list_empty(&ring->execlist_queue);
> > +			intel_execlists_retire_requests(ring);
>
> This needs to be the other way round I think - we care about idleness
> after all the currently processed stuff is retired, not before.
> Otherwise we might notice the busy->idle transition one invocation too
> late.

I thought about this for a while. The GPU will be idle when the
execlist_queues are empty.
intel_execlists_retire_requests() only cleans up requests which have
already finished, so it is more conservative (in terms of CPU idleness)
to check the queue beforehand. I thought this would be more desirable
than potentially reporting idleness early.
intel_execlists_retire_requests() cannot affect the state of the queue,
and there is no point checking the execlist_retired_req_list because
intel_execlists_retire_requests() always empties it.

Thomas.

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Mon, Nov 03, 2014 at 04:05:03PM +0000, Daniel, Thomas wrote:
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
> > Sent: Monday, November 03, 2014 3:33 PM
> > To: Daniel, Thomas
> > Cc: intel-gfx@lists.freedesktop.org; shuang.he@linux.intel.com
> > Subject: Re: [Intel-gfx] [PATCH 1/4] drm/i915/bdw: Clean up execlist queue
> > items in retire_work
> >
> > On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> > > No longer create a work item to clean each execlist queue item.
> > > Instead, move retired execlist requests to a queue and clean up the
> > > items during retire_requests.
> > >
> > > v2: Fix legacy ring path broken during overzealous cleanup
> > >
> > > v3: Update idle detection to take execlists queue into account
> > >
> > > Issue: VIZ-4274
> > > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c         |  4 +++
> > >  drivers/gpu/drm/i915/intel_lrc.c        | 52 ++++++++++++++++++-------------
> > >  drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
> > >  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
> > >  4 files changed, 36 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > > b/drivers/gpu/drm/i915/i915_gem.c index 827edb5..df28202 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
> > >  	for_each_ring(ring, dev_priv, i) {
> > >  		i915_gem_retire_requests_ring(ring);
> > >  		idle &= list_empty(&ring->request_list);
> > > +		if (i915.enable_execlists) {
> > > +			idle &= list_empty(&ring->execlist_queue);
> > > +			intel_execlists_retire_requests(ring);
> >
> > This needs to be the other way round I think - we care about idleness
> > after all the currently processed stuff is retired, not before.
> > Otherwise we might notice the busy->idle transition one invocation too
> > late.
>
> I thought about this for a while. The GPU will be idle when the
> execlist_queues are empty.
> intel_execlists_retire_requests() only cleans up requests which have
> already finished, so it is more conservative (in terms of CPU idleness)
> to check the queue beforehand. I thought this would be more desirable
> than potentially reporting idleness early.
> intel_execlists_retire_requests() cannot affect the state of the queue,
> and there is no point checking the execlist_retired_req_list because
> intel_execlists_retire_requests() always empties it.

Ok, I mixed things up without looking ;-) But that means you access the
execlist_queue, which is also accessed from irq code, without holding
the required locks? This is all a bit confusing to poor me ...
-Daniel
On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> No longer create a work item to clean each execlist queue item.
> Instead, move retired execlist requests to a queue and clean up the
> items during retire_requests.
>
> v2: Fix legacy ring path broken during overzealous cleanup
>
> v3: Update idle detection to take execlists queue into account
>
> Issue: VIZ-4274
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c         |  4 +++
>  drivers/gpu/drm/i915/intel_lrc.c        | 52 ++++++++++++++++++-------------
>  drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  4 files changed, 36 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 827edb5..df28202 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
>  	for_each_ring(ring, dev_priv, i) {
>  		i915_gem_retire_requests_ring(ring);
>  		idle &= list_empty(&ring->request_list);
> +		if (i915.enable_execlists) {

Every time you do this, a kitten dies. If only we had an
intel_engine_cs that could abstract the differences between retirement
on the various submission ports and encapsulate that away from the core
GEM buffer/request handling. If only I hadn't already sent a patch
showing exactly how to do that.
-Chris
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 827edb5..df28202 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
 	for_each_ring(ring, dev_priv, i) {
 		i915_gem_retire_requests_ring(ring);
 		idle &= list_empty(&ring->request_list);
+		if (i915.enable_execlists) {
+			idle &= list_empty(&ring->execlist_queue);
+			intel_execlists_retire_requests(ring);
+		}
 	}
 
 	if (idle)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cd74e5c..87ce445 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -386,7 +386,6 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
 {
 	struct intel_ctx_submit_request *req0 = NULL, *req1 = NULL;
 	struct intel_ctx_submit_request *cursor = NULL, *tmp = NULL;
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
 	assert_spin_locked(&ring->execlist_lock);
 
@@ -403,7 +402,8 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
 			 * will update tail past first request's workload */
 			cursor->elsp_submitted = req0->elsp_submitted;
 			list_del(&req0->execlist_link);
-			queue_work(dev_priv->wq, &req0->work);
+			list_add_tail(&req0->execlist_link,
+				&ring->execlist_retired_req_list);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -425,7 +425,6 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
 static bool execlists_check_remove_request(struct intel_engine_cs *ring,
 					   u32 request_id)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ctx_submit_request *head_req;
 
 	assert_spin_locked(&ring->execlist_lock);
@@ -443,7 +442,8 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
 
 		if (--head_req->elsp_submitted <= 0) {
 			list_del(&head_req->execlist_link);
-			queue_work(dev_priv->wq, &head_req->work);
+			list_add_tail(&head_req->execlist_link,
+				&ring->execlist_retired_req_list);
 			return true;
 		}
 	}
@@ -512,22 +512,6 @@ void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring)
 		   ((u32)ring->next_context_status_buffer & 0x07) << 8);
 }
 
-static void execlists_free_request_task(struct work_struct *work)
-{
-	struct intel_ctx_submit_request *req =
-		container_of(work, struct intel_ctx_submit_request, work);
-	struct drm_device *dev = req->ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-
-	intel_runtime_pm_put(dev_priv);
-
-	mutex_lock(&dev->struct_mutex);
-	i915_gem_context_unreference(req->ctx);
-	mutex_unlock(&dev->struct_mutex);
-
-	kfree(req);
-}
-
 static int execlists_context_queue(struct intel_engine_cs *ring,
 				   struct intel_context *to,
 				   u32 tail)
@@ -544,7 +528,6 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 	i915_gem_context_reference(req->ctx);
 	req->ring = ring;
 	req->tail = tail;
-	INIT_WORK(&req->work, execlists_free_request_task);
 
 	intel_runtime_pm_get(dev_priv);
 
@@ -565,7 +548,8 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 		WARN(tail_req->elsp_submitted != 0,
 		     "More than 2 already-submitted reqs queued\n");
 		list_del(&tail_req->execlist_link);
-		queue_work(dev_priv->wq, &tail_req->work);
+		list_add_tail(&tail_req->execlist_link,
+			&ring->execlist_retired_req_list);
 	}
 
@@ -733,6 +717,29 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 	return 0;
 }
 
+void intel_execlists_retire_requests(struct intel_engine_cs *ring)
+{
+	struct intel_ctx_submit_request *req, *tmp;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	unsigned long flags;
+	struct list_head retired_list;
+
+	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
+	if (list_empty(&ring->execlist_retired_req_list))
+		return;
+
+	INIT_LIST_HEAD(&retired_list);
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+	list_replace_init(&ring->execlist_retired_req_list, &retired_list);
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
+	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
+		intel_runtime_pm_put(dev_priv);
+		i915_gem_context_unreference(req->ctx);
+		list_del(&req->execlist_link);
+	}
+}
+
 void intel_logical_ring_stop(struct intel_engine_cs *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -1248,6 +1255,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	init_waitqueue_head(&ring->irq_queue);
 
 	INIT_LIST_HEAD(&ring->execlist_queue);
+	INIT_LIST_HEAD(&ring->execlist_retired_req_list);
 	spin_lock_init(&ring->execlist_lock);
 	ring->next_context_status_buffer = 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 33c3b4b..84bbf19 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -104,11 +104,11 @@ struct intel_ctx_submit_request {
 	u32 tail;
 
 	struct list_head execlist_link;
-	struct work_struct work;
 
 	int elsp_submitted;
 };
 
 void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring);
+void intel_execlists_retire_requests(struct intel_engine_cs *ring);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 96479c8..8c002d2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -235,6 +235,7 @@ struct intel_engine_cs {
 	/* Execlists */
 	spinlock_t execlist_lock;
 	struct list_head execlist_queue;
+	struct list_head execlist_retired_req_list;
 	u8 next_context_status_buffer;
 	u32 irq_keep_mask; /* bitmask for interrupts that should not be masked */
 	int		(*emit_request)(struct intel_ringbuffer *ringbuf);
No longer create a work item to clean each execlist queue item.
Instead, move retired execlist requests to a queue and clean up the
items during retire_requests.

v2: Fix legacy ring path broken during overzealous cleanup

v3: Update idle detection to take execlists queue into account

Issue: VIZ-4274
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |  4 +++
 drivers/gpu/drm/i915/intel_lrc.c        | 52 ++++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_lrc.h        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 4 files changed, 36 insertions(+), 23 deletions(-)