From patchwork Tue Dec 9 12:59:09 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 5462561 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id C23A39F54F for ; Tue, 9 Dec 2014 12:59:28 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 7D1EF200E5 for ; Tue, 9 Dec 2014 12:59:27 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 0C5D420107 for ; Tue, 9 Dec 2014 12:59:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 65C4672752; Tue, 9 Dec 2014 04:59:25 -0800 (PST) X-Original-To: Intel-GFX@lists.freedesktop.org Delivered-To: Intel-GFX@lists.freedesktop.org Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTP id 17AF1727F1 for ; Tue, 9 Dec 2014 04:59:21 -0800 (PST) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP; 09 Dec 2014 04:59:20 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.07,544,1413270000"; d="scan'208";a="634963859" Received: from johnharr-linux.isw.intel.com ([10.102.226.51]) by fmsmga001.fm.intel.com with ESMTP; 09 Dec 2014 04:59:20 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Tue, 9 Dec 2014 12:59:09 +0000 Message-Id: <1418129953-1505-7-git-send-email-John.C.Harrison@Intel.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1418129953-1505-1-git-send-email-John.C.Harrison@Intel.com> References: <1418129953-1505-1-git-send-email-John.C.Harrison@Intel.com> Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 06/10] drm/i915: Add extra add_request calls X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: John Harrison The scheduler needs to track batch buffers by request without extra, non-batch buffer work being attached to the same request. This means that anywhere which adds work to the ring should explicitly call i915_add_request() when it has finished writing to the ring. The add_request() function does extra work, such as flushing caches, that does not necessarily want to be done everywhere. Instead, a new i915_add_request_wo_flush() function has been added which skips the cache flush and just tidies up the request structure. Note, much of this patch was implemented by Naresh Kumar Kachhi for pending power management improvements. However, it is also directly applicable to the scheduler work as noted above. Change-Id: I66a6861118ee8e7ad7ca6c80c71a3b256db92e18 For: VIZ-1587 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h | 9 ++++-- drivers/gpu/drm/i915/i915_gem.c | 45 ++++++++++++++++---------- drivers/gpu/drm/i915/i915_gem_context.c | 9 ++++++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +-- drivers/gpu/drm/i915/i915_gem_gtt.c | 9 ++++++ drivers/gpu/drm/i915/i915_gem_render_state.c | 2 +- drivers/gpu/drm/i915/intel_display.c | 23 +++++++++---- drivers/gpu/drm/i915/intel_lrc.c | 4 +-- 8 files changed, 73 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 11e85cb..0e280c4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2553,7 +2553,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int i915_gem_object_sync(struct drm_i915_gem_object *obj, - struct intel_engine_cs *to); + struct intel_engine_cs *to, bool add_request); void i915_vma_move_to_active(struct i915_vma *vma, struct intel_engine_cs *ring); int i915_gem_dumb_create(struct drm_file *file_priv, @@ -2641,9 +2641,12 @@ int __must_check i915_gpu_idle(struct drm_device *dev); int __must_check i915_gem_suspend(struct drm_device *dev); int __i915_add_request(struct intel_engine_cs *ring, struct drm_file *file, - struct drm_i915_gem_object *batch_obj); + struct drm_i915_gem_object *batch_obj, + bool flush_caches); #define i915_add_request(ring) \ - __i915_add_request(ring, NULL, NULL) + __i915_add_request(ring, NULL, NULL, true) +#define i915_add_request_no_flush(ring) \ + __i915_add_request(ring, NULL, NULL, false) int __i915_wait_request(struct drm_i915_gem_request *req, unsigned reset_counter, bool interruptible, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index de241eb..b022a2d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2419,7 +2419,8 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) int __i915_add_request(struct intel_engine_cs *ring, struct drm_file *file, - struct drm_i915_gem_object *obj) + struct drm_i915_gem_object *obj, + bool flush_caches) { struct drm_i915_private *dev_priv = ring->dev->dev_private; struct drm_i915_gem_request *request; @@ -2445,12 +2446,11 @@ int __i915_add_request(struct intel_engine_cs *ring, * is that the flush _must_ happen before the next request, no matter * what. */ - if (i915.enable_execlists) { - ret = logical_ring_flush_all_caches(ringbuf); - if (ret) - return ret; - } else { - ret = intel_ring_flush_all_caches(ring); + if (flush_caches) { + if (i915.enable_execlists) + ret = logical_ring_flush_all_caches(ringbuf); + else + ret = intel_ring_flush_all_caches(ring); if (ret) return ret; } @@ -2462,15 +2462,12 @@ int __i915_add_request(struct intel_engine_cs *ring, */ request_ring_position = intel_ring_get_tail(ringbuf); - if (i915.enable_execlists) { + if (i915.enable_execlists) ret = ring->emit_request(ringbuf); - if (ret) - return ret; - } else { + else ret = ring->add_request(ring); - if (ret) - return ret; - } + if (ret) + return ret; request->head = request_start; request->tail = request_ring_position; @@ -2974,6 +2971,8 @@ out: * * @obj: object which may be in use on another ring. * @to: ring we wish to use the object on. May be NULL. + * @add_request: do we need to add a request to track operations + * submitted on ring with sync_to function * * This code is meant to abstract object synchronization with the GPU. * Calling with NULL implies synchronizing the object with the CPU @@ -2983,7 +2982,7 @@ out: */ int i915_gem_object_sync(struct drm_i915_gem_object *obj, - struct intel_engine_cs *to) + struct intel_engine_cs *to, bool add_request) { struct intel_engine_cs *from; u32 seqno; @@ -3011,13 +3010,16 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj, trace_i915_gem_ring_sync_to(from, to, obj->last_read_req); ret = to->semaphore.sync_to(to, from, seqno); - if (!ret) + if (!ret) { /* We use last_read_req because sync_to() * might have just caused seqno wrap under * the radar. */ from->semaphore.sync_seqno[idx] = i915_gem_request_get_seqno(obj->last_read_req); + if (add_request) + i915_add_request_no_flush(to); + } return ret; } @@ -3126,6 +3128,15 @@ int i915_gpu_idle(struct drm_device *dev) return ret; } + /* Make sure the context switch (if one actually happened) + * gets wrapped up and finished rather than hanging around + * and confusing things later. */ + if (ring->outstanding_lazy_request) { + ret = i915_add_request(ring); + if (ret) + return ret; + } + ret = intel_ring_idle(ring); if (ret) return ret; @@ -3946,7 +3957,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, int ret; if (pipelined != i915_gem_request_get_ring(obj->last_read_req)) { - ret = i915_gem_object_sync(obj, pipelined); + ret = i915_gem_object_sync(obj, pipelined, true); if (ret) return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 5cd2b97..b2f1296 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -426,6 +426,15 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv) ret = i915_switch_context(ring, ring->default_context); if (ret) return ret; + + /* Make sure the context switch (if one actually happened) + * gets wrapped up and finished rather than hanging around + * and confusing things later. */ + if (ring->outstanding_lazy_request) { + ret = i915_add_request_no_flush(ring); + if (ret) + return ret; + } } return 0; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index aa4566e..1268e89 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -832,7 +832,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; - ret = i915_gem_object_sync(obj, ring); + ret = i915_gem_object_sync(obj, ring, false); if (ret) return ret; @@ -993,7 +993,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev, ring->gpu_caches_dirty = true; /* Add a breadcrumb for the completion of the batch buffer */ - (void)__i915_add_request(ring, file, obj); + (void)__i915_add_request(ring, file, obj, true); } static int diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index ac03a38..7eead93 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1153,6 +1153,15 @@ int i915_ppgtt_init_hw(struct drm_device *dev) ret = ppgtt->switch_mm(ppgtt, ring); if (ret != 0) return ret; + + /* Make sure the context switch (if one actually happened) + * gets wrapped up and finished rather than hanging around + * and confusing things later. */ + if (ring->outstanding_lazy_request) { + ret = i915_add_request_no_flush(ring); + if (ret) + return ret; + } } } diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 521548a..aba39c3 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring) i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring); - ret = __i915_add_request(ring, NULL, so.obj); + ret = __i915_add_request(ring, NULL, so.obj, true); /* __i915_add_request moves object to inactive if it fails */ out: i915_gem_render_state_fini(&so); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index a9bdc12..5b8d4f9 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9116,7 +9116,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev, intel_ring_emit(ring, 0); /* aux display base address, unused */ intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); + i915_add_request_no_flush(ring); return 0; } @@ -9148,7 +9148,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_NOOP); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); + i915_add_request_no_flush(ring); return 0; } @@ -9187,7 +9187,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev, intel_ring_emit(ring, pf | pipesrc); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); + i915_add_request_no_flush(ring); return 0; } @@ -9223,7 +9223,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev, intel_ring_emit(ring, pf | pipesrc); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); + i915_add_request_no_flush(ring); return 0; } @@ -9318,7 +9318,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, intel_ring_emit(ring, (MI_NOOP)); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); + i915_add_request_no_flush(ring); return 0; } @@ -9528,7 +9528,7 @@ static int intel_gen9_queue_flip(struct drm_device *dev, intel_ring_emit(ring, intel_crtc->unpin_work->gtt_offset); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); + i915_add_request_no_flush(ring); return 0; } @@ -9734,8 +9734,17 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, if (ret) goto cleanup_unpin; + /* Borked: need to get the seqno for the request submitted in + * 'queue_flip()' above. However, either the request has been + * posted already and the seqno is gone (q_f calls add_request), + * or the request never gets posted and is merged into whatever + * render comes along next (q_f calls ring_advance). + * + * On the other hand, seqnos are going away soon anyway! So + * hopefully the problem will disappear... + */ i915_gem_request_assign(&work->flip_queued_req, - intel_ring_get_request(ring)); + ring->outstanding_lazy_request ? intel_ring_get_request(ring) : NULL); } work->flip_queued_vblank = drm_vblank_count(dev, intel_crtc->pipe); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 82a47e3..643a56a 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -617,7 +617,7 @@ static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; - ret = i915_gem_object_sync(obj, ring); + ret = i915_gem_object_sync(obj, ring, true); if (ret) return ret; @@ -1630,7 +1630,7 @@ int intel_lr_context_render_state_init(struct intel_engine_cs *ring, i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring); - ret = __i915_add_request(ring, file, so.obj); + ret = __i915_add_request(ring, file, so.obj, true); /* intel_logical_ring_add_request moves object to inactive if it * fails */ out: