From patchwork Thu Dec 18 17:10:09 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michel Thierry X-Patchwork-Id: 5515341 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 6E624BEEA8 for ; Thu, 18 Dec 2014 17:17:54 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4983E2020E for ; Thu, 18 Dec 2014 17:17:53 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id F1D54207B8 for ; Thu, 18 Dec 2014 17:17:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3FBEA6EA4C; Thu, 18 Dec 2014 09:17:51 -0800 (PST) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTP id 4BC8A6EA4A for ; Thu, 18 Dec 2014 09:17:49 -0800 (PST) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga102.jf.intel.com with ESMTP; 18 Dec 2014 09:08:37 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,691,1406617200"; d="scan'208";a="501038540" Received: from michelth-linux.isw.intel.com ([10.102.226.150]) by orsmga003.jf.intel.com with ESMTP; 18 Dec 2014 09:05:57 -0800 From: Michel Thierry To: intel-gfx@lists.freedesktop.org Date: Thu, 18 Dec 2014 17:10:09 +0000 Message-Id: <1418922621-25818-13-git-send-email-michel.thierry@intel.com> X-Mailer: git-send-email 2.1.1 In-Reply-To: <1418922621-25818-1-git-send-email-michel.thierry@intel.com> References: <1418922621-25818-1-git-send-email-michel.thierry@intel.com> Subject: [Intel-gfx] [PATCH 12/24] drm/i915: Track page table reload need X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ben Widawsky This patch was formerly known as, "Force pd restore when PDEs change, gen6-7." I had to change the name because it is needed for GEN8 too. The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping so we must do the gen specific action to reload the page tables. GEN8 and GEN7 do differ in the way they load page tables for the RCS. GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both. Caveat for GEN7 The docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context, and expect them to be present - so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that even if the context is the same, force a reload. It's unclear exactly what this does, but I have a hunch it's the right thing to do. The logic assumes that we always emit a context switch after mapping new PDEs, and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment let's the user know. Signed-off-by: Ben Widawsky squash! drm/i915: Force pd restore when PDEs change, gen6-7 It's not just for gen8. If the current context has mappings change, we need a context reload to switch v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null. Signed-off-by: Michel Thierry (v2) --- drivers/gpu/drm/i915/i915_gem_context.c | 67 +++++++++++++++++++++--------- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++ drivers/gpu/drm/i915/i915_gem_gtt.c | 15 ++++++- drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + 4 files changed, 75 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index a8ff03d..fa9d4a1 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -563,6 +563,42 @@ mi_set_context(struct intel_engine_cs *ring, return ret; } +static inline bool should_skip_switch(struct intel_engine_cs *ring, + struct intel_context *from, + struct intel_context *to) +{ + struct drm_i915_private *dev_priv = ring->dev->dev_private; + + if (to->remap_slice) + return false; + + if (to->ppgtt) { + if (from == to && !test_bit(ring->id, &to->ppgtt->base.pd_reload_mask)) + return true; + } else { + if (from == to && !test_bit(ring->id, &dev_priv->mm.aliasing_ppgtt->base.pd_reload_mask)) + return true; + } + + return false; +} + +static bool +needs_pd_load_pre(struct intel_engine_cs *ring, struct intel_context *to) +{ + struct drm_i915_private *dev_priv = ring->dev->dev_private; + + return ((INTEL_INFO(ring->dev)->gen < 8) || + (ring != &dev_priv->ring[RCS])) && to->ppgtt; +} + +static bool +needs_pd_load_post(struct intel_engine_cs *ring, struct intel_context *to) +{ + return IS_GEN8(ring->dev) && + (to->ppgtt || &to->ppgtt->base.pd_reload_mask); +} + static int do_switch(struct intel_engine_cs *ring, struct intel_context *to) { @@ -571,9 +607,6 @@ static int do_switch(struct intel_engine_cs *ring, u32 hw_flags = 0; bool uninitialized = false; struct i915_vma *vma; - bool needs_pd_load_pre = ((INTEL_INFO(ring->dev)->gen < 8) || - (ring != &dev_priv->ring[RCS])) && to->ppgtt; - bool needs_pd_load_post = false; int ret, i; if (from != NULL && ring == &dev_priv->ring[RCS]) { @@ -581,7 +614,7 @@ static int do_switch(struct intel_engine_cs *ring, BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state)); } - if (from == to && !to->remap_slice) + if (should_skip_switch(ring, from, to)) return 0; /* Trying to pin first makes error handling easier. */ @@ -599,7 +632,7 @@ static int do_switch(struct intel_engine_cs *ring, */ from = ring->last_context; - if (needs_pd_load_pre) { + if (needs_pd_load_pre(ring, to)) { /* Older GENs and non render rings still want the load first, * "PP_DCLV followed by PP_DIR_BASE register through Load * Register Immediate commands in Ring Buffer before submitting @@ -608,6 +641,12 @@ static int do_switch(struct intel_engine_cs *ring, ret = to->ppgtt->switch_mm(to->ppgtt, ring); if (ret) goto unpin_out; + + /* Doing a PD load always reloads the page dirs */ + if (to->ppgtt) + clear_bit(ring->id, &to->ppgtt->base.pd_reload_mask); + else + clear_bit(ring->id, &dev_priv->mm.aliasing_ppgtt->base.pd_reload_mask); } if (ring != &dev_priv->ring[RCS]) { @@ -644,16 +683,16 @@ static int do_switch(struct intel_engine_cs *ring, * XXX: If we implemented page directory eviction code, this * optimization needs to be removed. */ - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) { + if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) hw_flags |= MI_RESTORE_INHIBIT; - needs_pd_load_post = to->ppgtt && IS_GEN8(ring->dev); - } + else if (to->ppgtt && test_and_clear_bit(ring->id, &to->ppgtt->base.pd_reload_mask)) + hw_flags |= MI_FORCE_RESTORE; ret = mi_set_context(ring, to, hw_flags); if (ret) goto unpin_out; - if (needs_pd_load_post) { + if (needs_pd_load_post(ring, to)) { ret = to->ppgtt->switch_mm(to->ppgtt, ring); /* The hardware context switch is emitted, but we haven't * actually changed the state - so it's probably safe to bail @@ -726,16 +765,6 @@ unpin_out: return ret; } -static inline bool should_skip_switch(struct intel_engine_cs *ring, - struct intel_context *from, - struct intel_context *to) -{ - if (from == to && !to->remap_slice) - return true; - - return false; -} - /** * i915_switch_context() - perform a GPU context switch. * @ring: ring for which we'll execute the context switch diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 8330660..09d864f 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1199,6 +1199,13 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file, if (ret) goto error; + if (ctx->ppgtt) + WARN(ctx->ppgtt->base.pd_reload_mask & (1<id), + "%s didn't clear reload\n", ring->name); + else + WARN(dev_priv->mm.aliasing_ppgtt->base.pd_reload_mask & + (1<id), "%s didn't clear reload\n", ring->name); + instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK; instp_mask = I915_EXEC_CONSTANTS_MASK; switch (instp_mode) { @@ -1446,6 +1453,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) goto err; + /* XXX: Reserve has possibly change PDEs which means we must do a + * context switch before we can coherently read some of the reserved + * VMAs. */ + /* The objects are in their final locations, apply the relocations. */ if (need_relocs) ret = i915_gem_execbuffer_relocate(eb); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index faa0603..c917301 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1398,6 +1398,15 @@ i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv) return ppgtt; } +/* PDE TLBs are a pain invalidate pre GEN8. It requires a context reload. If we + * are switching between contexts with the same LRCA, we also must do a force + * restore. + */ +#define ppgtt_invalidate_tlbs(vm) do {\ + /* If current vm != vm, */ \ + vm->pd_reload_mask = INTEL_INFO(vm->dev)->ring_mask; \ +} while (0) + void i915_ppgtt_release(struct kref *kref) { struct i915_hw_ppgtt *ppgtt = @@ -1433,6 +1442,8 @@ ppgtt_bind_vma(struct i915_vma *vma, vma->node.size); if (ret) return ret; + + ppgtt_invalidate_tlbs(vma->vm); } vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start, @@ -1446,9 +1457,11 @@ static void ppgtt_unbind_vma(struct i915_vma *vma) vma->node.start, vma->obj->base.size, true); - if (vma->vm->teardown_va_range) + if (vma->vm->teardown_va_range) { vma->vm->teardown_va_range(vma->vm, vma->node.start, vma->node.size); + ppgtt_invalidate_tlbs(vma->vm); + } } extern int intel_iommu_gfx_mapped; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 2eb6011..58a55bc 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -226,6 +226,8 @@ struct i915_address_space { struct page *page; } scratch; + unsigned long pd_reload_mask; + /** * List of objects currently involved in rendering. *