From patchwork Wed Oct 29 09:52:53 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Daniel X-Patchwork-Id: 5186141 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 4C755C11AC for ; Wed, 29 Oct 2014 09:53:08 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id D1DEB201F2 for ; Wed, 29 Oct 2014 09:53:06 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 58B722015A for ; Wed, 29 Oct 2014 09:53:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C2FD66E4F8; Wed, 29 Oct 2014 02:53:04 -0700 (PDT) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTP id 854136E4EB for ; Wed, 29 Oct 2014 02:53:02 -0700 (PDT) Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga101.jf.intel.com with ESMTP; 29 Oct 2014 02:53:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,809,1406617200"; d="scan'208";a="598218994" Received: from thomasda-linux2.isw.intel.com ([10.102.226.52]) by orsmga001.jf.intel.com with ESMTP; 29 Oct 2014 02:53:01 -0700 From: Thomas Daniel To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Oct 2014 09:52:53 +0000 Message-Id: <1414576373-25121-4-git-send-email-thomas.daniel@intel.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1414576373-25121-1-git-send-email-thomas.daniel@intel.com> References: <1414576373-25121-1-git-send-email-thomas.daniel@intel.com> Cc: shuang.he@linux.intel.com Subject: [Intel-gfx] [PATCH 4/4] drm/i915/bdw: Pin the ringbuffer backing object to GGTT on-demand X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Same as with the context, pinning to GGTT regardless is harmful (it badly fragments the GGTT and can even exhaust it). Unfortunately, this case is also more complex than the previous one because we need to map and access the ringbuffer in several places along the execbuffer path (and we cannot make do by leaving the default ringbuffer pinned, as before). Also, the context object itself contains a pointer to the ringbuffer address that we have to keep updated if we are going to allow the ringbuffer to move around. v2: Same as with the context pinning, we cannot really do it during an interrupt. Also, pin the default ringbuffers objects regardless (makes error capture a lot easier). v3: Rebased. Take a pin reference of the ringbuffer for each item in the execlist request queue because the hardware may still be using the ringbuffer after the MI_USER_INTERRUPT to notify the seqno update is executed. The ringbuffer must remain pinned until the context save is complete. No longer pin and unpin ringbuffer in populate_lr_context() - this transient address is meaningless and the pinning can cause a sleep while atomic. v4: Moved ringbuffer pin and unpin into the lr_context_pin functions. Downgraded pinning check BUG_ONs to WARN_ONs. Issue: VIZ-4277 Signed-off-by: Oscar Mateo Signed-off-by: Thomas Daniel Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com) --- drivers/gpu/drm/i915/intel_lrc.c | 110 ++++++++++++++++++++++--------- drivers/gpu/drm/i915/intel_lrc.h | 1 + drivers/gpu/drm/i915/intel_ringbuffer.c | 85 ++++++++++++++---------- drivers/gpu/drm/i915/intel_ringbuffer.h | 3 + 4 files changed, 133 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 7950357..b5ae4fa 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -202,6 +202,9 @@ enum { }; #define GEN8_CTX_ID_SHIFT 32 +static int intel_lr_context_pin(struct intel_engine_cs *ring, + struct intel_context *ctx); + /** * intel_sanitize_enable_execlists() - sanitize i915.enable_execlists * @dev: DRM device. @@ -339,7 +342,9 @@ static void execlists_elsp_write(struct intel_engine_cs *ring, spin_unlock_irqrestore(&dev_priv->uncore.lock, flags); } -static int execlists_ctx_write_tail(struct drm_i915_gem_object *ctx_obj, u32 tail) +static int execlists_update_context(struct drm_i915_gem_object *ctx_obj, + struct drm_i915_gem_object *ring_obj, + u32 tail) { struct page *page; uint32_t *reg_state; @@ -348,6 +353,7 @@ static int execlists_ctx_write_tail(struct drm_i915_gem_object *ctx_obj, u32 tai reg_state = kmap_atomic(page); reg_state[CTX_RING_TAIL+1] = tail; + reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj); kunmap_atomic(reg_state); @@ -358,21 +364,25 @@ static int execlists_submit_context(struct intel_engine_cs *ring, struct intel_context *to0, u32 tail0, struct intel_context *to1, u32 tail1) { - struct drm_i915_gem_object *ctx_obj0; + struct drm_i915_gem_object *ctx_obj0 = to0->engine[ring->id].state; + struct intel_ringbuffer *ringbuf0 = to0->engine[ring->id].ringbuf; struct drm_i915_gem_object *ctx_obj1 = NULL; + struct intel_ringbuffer *ringbuf1 = NULL; - ctx_obj0 = to0->engine[ring->id].state; BUG_ON(!ctx_obj0); WARN_ON(!i915_gem_obj_is_pinned(ctx_obj0)); + WARN_ON(!i915_gem_obj_is_pinned(ringbuf0->obj)); - execlists_ctx_write_tail(ctx_obj0, tail0); + execlists_update_context(ctx_obj0, ringbuf0->obj, tail0); if (to1) { + ringbuf1 = to1->engine[ring->id].ringbuf; ctx_obj1 = to1->engine[ring->id].state; BUG_ON(!ctx_obj1); WARN_ON(!i915_gem_obj_is_pinned(ctx_obj1)); + WARN_ON(!i915_gem_obj_is_pinned(ringbuf1->obj)); - execlists_ctx_write_tail(ctx_obj1, tail1); + execlists_update_context(ctx_obj1, ringbuf1->obj, tail1); } execlists_elsp_write(ring, ctx_obj0, ctx_obj1); @@ -435,9 +445,9 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring, struct drm_i915_gem_object *ctx_obj = head_req->ctx->engine[ring->id].state; if (intel_execlists_ctx_id(ctx_obj) == request_id) { - WARN(head_req->elsp_submitted == 0, - "Never submitted head request\n"); + /* If the request has been merged, it is possible to get + * here with an unsubmitted request. */ if (--head_req->elsp_submitted <= 0) { list_del(&head_req->execlist_link); list_add_tail(&head_req->execlist_link, @@ -485,8 +495,7 @@ void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring) if (status & GEN8_CTX_STATUS_PREEMPTED) { if (status & GEN8_CTX_STATUS_LITE_RESTORE) { - if (execlists_check_remove_request(ring, status_id)) - WARN(1, "Lite Restored request removed from queue\n"); + execlists_check_remove_request(ring, status_id); } else WARN(1, "Preemption without Lite Restore\n"); } @@ -524,6 +533,10 @@ static int execlists_context_queue(struct intel_engine_cs *ring, return -ENOMEM; req->ctx = to; i915_gem_context_reference(req->ctx); + + if (to != ring->default_context) + intel_lr_context_pin(ring, to); + req->ring = ring; req->tail = tail; @@ -544,7 +557,7 @@ static int execlists_context_queue(struct intel_engine_cs *ring, if (to == tail_req->ctx) { WARN(tail_req->elsp_submitted != 0, - "More than 2 already-submitted reqs queued\n"); + "More than 2 already-submitted reqs queued\n"); list_del(&tail_req->execlist_link); list_add_tail(&tail_req->execlist_link, &ring->execlist_retired_req_list); @@ -732,6 +745,12 @@ void intel_execlists_retire_requests(struct intel_engine_cs *ring) spin_unlock_irqrestore(&ring->execlist_lock, flags); list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) { + struct intel_context *ctx = req->ctx; + struct drm_i915_gem_object *ctx_obj = + ctx->engine[ring->id].state; + + if (ctx_obj && (ctx != ring->default_context)) + intel_lr_context_unpin(ring, ctx); intel_runtime_pm_put(dev_priv); i915_gem_context_unreference(req->ctx); list_del(&req->execlist_link); @@ -802,6 +821,7 @@ static int intel_lr_context_pin(struct intel_engine_cs *ring, struct intel_context *ctx) { struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state; + struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf; int ret = 0; mutex_lock(&ctx->engine[ring->id].unpin_lock); @@ -809,22 +829,37 @@ static int intel_lr_context_pin(struct intel_engine_cs *ring, ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0); if (ret) - ctx->engine[ring->id].unpin_count = 0; + goto reset_unpin_count; + + ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf); + if (ret) + goto unpin_ctx_obj; } mutex_unlock(&ctx->engine[ring->id].unpin_lock); return ret; + +unpin_ctx_obj: + i915_gem_object_ggtt_unpin(ctx_obj); +reset_unpin_count: + ctx->engine[ring->id].unpin_count = 0; + mutex_unlock(&ctx->engine[ring->id].unpin_lock); + + return ret; } void intel_lr_context_unpin(struct intel_engine_cs *ring, struct intel_context *ctx) { struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state; + struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf; if (ctx_obj) { mutex_lock(&ctx->engine[ring->id].unpin_lock); - if (--ctx->engine[ring->id].unpin_count == 0) + if (--ctx->engine[ring->id].unpin_count == 0) { + intel_unpin_ringbuffer_obj(ringbuf); i915_gem_object_ggtt_unpin(ctx_obj); + } mutex_unlock(&ctx->engine[ring->id].unpin_lock); } } @@ -1542,7 +1577,6 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o { struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; - struct drm_i915_gem_object *ring_obj = ringbuf->obj; struct i915_hw_ppgtt *ppgtt = ctx->ppgtt; struct page *page; uint32_t *reg_state; @@ -1588,7 +1622,9 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base); reg_state[CTX_RING_TAIL+1] = 0; reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base); - reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj); + /* Ring buffer start address is not known until the buffer is pinned. + * It is written to the context image in execlists_update_context() + */ reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base); reg_state[CTX_RING_BUFFER_CONTROL+1] = ((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES) | RING_VALID; @@ -1670,10 +1706,12 @@ void intel_lr_context_free(struct intel_context *ctx) ctx->engine[i].ringbuf; struct intel_engine_cs *ring = ringbuf->ring; + if (ctx == ring->default_context) { + intel_unpin_ringbuffer_obj(ringbuf); + i915_gem_object_ggtt_unpin(ctx_obj); + } intel_destroy_ringbuffer_obj(ringbuf); kfree(ringbuf); - if (ctx == ring->default_context) - i915_gem_object_ggtt_unpin(ctx_obj); drm_gem_object_unreference(&ctx_obj->base); mutex_destroy(&ctx->engine[i].unpin_lock); } @@ -1772,11 +1810,8 @@ int intel_lr_context_deferred_create(struct intel_context *ctx, if (!ringbuf) { DRM_DEBUG_DRIVER("Failed to allocate ringbuffer %s\n", ring->name); - if (is_global_default_ctx) - i915_gem_object_ggtt_unpin(ctx_obj); - drm_gem_object_unreference(&ctx_obj->base); ret = -ENOMEM; - return ret; + goto error_unpin_ctx; } ringbuf->ring = ring; @@ -1789,22 +1824,30 @@ int intel_lr_context_deferred_create(struct intel_context *ctx, ringbuf->space = ringbuf->size; ringbuf->last_retired_head = -1; - /* TODO: For now we put this in the mappable region so that we can reuse - * the existing ringbuffer code which ioremaps it. When we start - * creating many contexts, this will no longer work and we must switch - * to a kmapish interface. - */ - ret = intel_alloc_ringbuffer_obj(dev, ringbuf); - if (ret) { - DRM_DEBUG_DRIVER("Failed to allocate ringbuffer obj %s: %d\n", + if (ringbuf->obj == NULL) { + ret = intel_alloc_ringbuffer_obj(dev, ringbuf); + if (ret) { + DRM_DEBUG_DRIVER( + "Failed to allocate ringbuffer obj %s: %d\n", ring->name, ret); - goto error; + goto error_free_rbuf; + } + + if (is_global_default_ctx) { + ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf); + if (ret) { + DRM_ERROR( + "Failed to pin and map ringbuffer %s: %d\n", + ring->name, ret); + goto error_destroy_rbuf; + } + } + } ret = populate_lr_context(ctx, ctx_obj, ring, ringbuf); if (ret) { DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret); - intel_destroy_ringbuffer_obj(ringbuf); goto error; } @@ -1826,7 +1869,6 @@ int intel_lr_context_deferred_create(struct intel_context *ctx, DRM_ERROR("Init render state failed: %d\n", ret); ctx->engine[ring->id].ringbuf = NULL; ctx->engine[ring->id].state = NULL; - intel_destroy_ringbuffer_obj(ringbuf); goto error; } ctx->rcs_initialized = true; @@ -1835,7 +1877,13 @@ int intel_lr_context_deferred_create(struct intel_context *ctx, return 0; error: + if (is_global_default_ctx) + intel_unpin_ringbuffer_obj(ringbuf); +error_destroy_rbuf: + intel_destroy_ringbuffer_obj(ringbuf); +error_free_rbuf: kfree(ringbuf); +error_unpin_ctx: if (is_global_default_ctx) i915_gem_object_ggtt_unpin(ctx_obj); drm_gem_object_unreference(&ctx_obj->base); diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 14b216b..21233a0 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -110,6 +110,7 @@ struct intel_ctx_submit_request { struct list_head execlist_link; int elsp_submitted; + bool need_unpin; }; void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index a8f72e8..0c4aab1 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1721,13 +1721,42 @@ static int init_phys_status_page(struct intel_engine_cs *ring) return 0; } -void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf) +void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf) { - if (!ringbuf->obj) - return; - iounmap(ringbuf->virtual_start); + ringbuf->virtual_start = NULL; i915_gem_object_ggtt_unpin(ringbuf->obj); +} + +int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev, + struct intel_ringbuffer *ringbuf) +{ + struct drm_i915_private *dev_priv = to_i915(dev); + struct drm_i915_gem_object *obj = ringbuf->obj; + int ret; + + ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE); + if (ret) + return ret; + + ret = i915_gem_object_set_to_gtt_domain(obj, true); + if (ret) { + i915_gem_object_ggtt_unpin(obj); + return ret; + } + + ringbuf->virtual_start = ioremap_wc(dev_priv->gtt.mappable_base + + i915_gem_obj_ggtt_offset(obj), ringbuf->size); + if (ringbuf->virtual_start == NULL) { + i915_gem_object_ggtt_unpin(obj); + return -EINVAL; + } + + return 0; +} + +void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf) +{ drm_gem_object_unreference(&ringbuf->obj->base); ringbuf->obj = NULL; } @@ -1735,12 +1764,7 @@ void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf) int intel_alloc_ringbuffer_obj(struct drm_device *dev, struct intel_ringbuffer *ringbuf) { - struct drm_i915_private *dev_priv = to_i915(dev); struct drm_i915_gem_object *obj; - int ret; - - if (ringbuf->obj) - return 0; obj = NULL; if (!HAS_LLC(dev)) @@ -1753,30 +1777,9 @@ int intel_alloc_ringbuffer_obj(struct drm_device *dev, /* mark ring buffers as read-only from GPU side by default */ obj->gt_ro = 1; - ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE); - if (ret) - goto err_unref; - - ret = i915_gem_object_set_to_gtt_domain(obj, true); - if (ret) - goto err_unpin; - - ringbuf->virtual_start = - ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj), - ringbuf->size); - if (ringbuf->virtual_start == NULL) { - ret = -EINVAL; - goto err_unpin; - } - ringbuf->obj = obj; - return 0; -err_unpin: - i915_gem_object_ggtt_unpin(obj); -err_unref: - drm_gem_object_unreference(&obj->base); - return ret; + return 0; } static int intel_init_ring_buffer(struct drm_device *dev, @@ -1813,10 +1816,21 @@ static int intel_init_ring_buffer(struct drm_device *dev, goto error; } - ret = intel_alloc_ringbuffer_obj(dev, ringbuf); - if (ret) { - DRM_ERROR("Failed to allocate ringbuffer %s: %d\n", ring->name, ret); - goto error; + if (ringbuf->obj == NULL) { + ret = intel_alloc_ringbuffer_obj(dev, ringbuf); + if (ret) { + DRM_ERROR("Failed to allocate ringbuffer %s: %d\n", + ring->name, ret); + goto error; + } + + ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf); + if (ret) { + DRM_ERROR("Failed to pin and map ringbuffer %s: %d\n", + ring->name, ret); + intel_destroy_ringbuffer_obj(ringbuf); + goto error; + } } /* Workaround an erratum on the i830 which causes a hang if @@ -1854,6 +1868,7 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring) intel_stop_ring_buffer(ring); WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0); + intel_unpin_ringbuffer_obj(ringbuf); intel_destroy_ringbuffer_obj(ringbuf); ring->preallocated_lazy_request = NULL; ring->outstanding_lazy_seqno = 0; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 8c002d2..365854ad 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -382,6 +382,9 @@ intel_write_status_page(struct intel_engine_cs *ring, #define I915_GEM_HWS_SCRATCH_INDEX 0x30 #define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT) +void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf); +int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev, + struct intel_ringbuffer *ringbuf); void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf); int intel_alloc_ringbuffer_obj(struct drm_device *dev, struct intel_ringbuffer *ringbuf);