Message ID | 1410773387-24682-1-git-send-email-michel.thierry@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Sep 15, 2014 at 10:29:47AM +0100, Michel Thierry wrote: > From: Ben Widawsky <benjamin.widawsky@intel.com> > > The simple explanation is, the docs say to do this for GEN8. Perhaps we > want to do this for GEN7 too, I am not certain. > > PDPs are saved and restored with context. Contexts (without execlists) > only exist on the render ring. The docs say that PDPs are not power > context save/restored. I've learned that this actually means something > which SW doesn't care about. So pretend the statement doesn't exist. > For non RCS, nothing changes. > > All this patch now does is change the ordering of LRI vs MI_SET_CONTEXT > for the initialization of the context. I do this because the docs say to > do it, and frankly, I cannot reason why it is necessary. I've thought > about it a lot, and tried, without success, to get a reason from design. > The answer I got more or less says, "gen7 is different than gen8." I've > given up, and am adding this little bit of code to make it in sync with > the docs. > > v2: Completely rewritten commit message that addresses the requests > Ville made for v1 > Only load PDPs for initial context load (Ville) > > v3: Rebased after ppgtt clean-up rules, and apply only for render > ring. This is needed to boot to desktop with full ppgtt in legacy mode > (without execlists). > > v4: Clean up the pre/post load logic (Ville). > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3-4) > --- > drivers/gpu/drm/i915/i915_gem_context.c | 32 ++++++++++++++++++++++++++++++-- > 1 file changed, 30 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index a5221d8..7b73b36 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -522,6 +522,9 @@ static int do_switch(struct intel_engine_cs *ring, > struct intel_context *from = ring->last_context; > u32 hw_flags = 0; > bool uninitialized = false; > + bool needs_pd_load_pre = ((INTEL_INFO(ring->dev)->gen < 8) || > + (ring != &dev_priv->ring[RCS])) && to->ppgtt; > + bool needs_pd_load_post = false; > int ret, i; > > if (from != NULL && ring == &dev_priv->ring[RCS]) { > @@ -547,7 +550,11 @@ static int do_switch(struct intel_engine_cs *ring, > */ > from = ring->last_context; > > - if (to->ppgtt) { > + if (needs_pd_load_pre) { > + /* Older GENs and non render rings still want the load first, > + * "PP_DCLV followed by PP_DIR_BASE register through Load > + * Register Immediate commands in Ring Buffer before submitting > + * a context."*/ What's the reference for this? It works if you load the ppgtt after the set_context on ivb/byt/hsw, so do you have a specific recommendation in mind? And the set-context even provides the barrier required for the lri. > ret = to->ppgtt->switch_mm(to->ppgtt, ring); > if (ret) > goto unpin_out; > @@ -577,13 +584,34 @@ static int do_switch(struct intel_engine_cs *ring, > vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level, GLOBAL_BIND); > } > > - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) > + /* GEN8 does *not* require an explicit reload if the PDPs have been > + * setup, and we do not wish to move them. > + * > + * XXX: If we implemented page directory eviction code, this > + * optimization needs to be removed. > + */ What is the cost of the pde invalidate on top of the context restore anyway? As this will be a secondary path in future, is optimizing worthwhile at all? > + if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) { > hw_flags |= MI_RESTORE_INHIBIT; > + needs_pd_load_post = to->ppgtt && IS_GEN8(ring->dev); > + } > > ret = mi_set_context(ring, to, hw_flags); > if (ret) > goto unpin_out; > > + if (needs_pd_load_post) { > + ret = to->ppgtt->switch_mm(to->ppgtt, ring); > + /* The hardware context switch is emitted, but we haven't > + * actually changed the state - so it's probably safe to bail > + * here. Still, let the user know something dangerous has > + * happened. > + */ Imagine if we only had patches posted already to fix all of this up. -Chris
Hi Michel: I'm doing a task which require LRIs to load root pointer around RCS context switch.My BDW CPU is C stepping. And what I found is MI_SET_CONTEXT instruction will save/restore PDP0/1/2/3. It's weired that as part of EXECLIST context, the "EXECLIST PPGTT base" context also appear in the dump generated by MI_SET_CONTEXT. And I found LRI after MI_SET_CONTEXT will make ring not idle: 0x8000[0] != 1. By the way, if MI_SET_CONTEXT is issued very frequently. I also found that will make ring not idle: INSTDONE register shows VFE and TSG not done. Do you met these issues? Thanks. Thanks, Zhi. > -------- Original Message -------- > Subject: [Intel-gfx] [PATCH v4] drm/i915/ppgtt: Load address space after > mi_set_context > Date: Mon, 15 Sep 2014 10:29:47 +0100 > From: Michel Thierry <michel.thierry@intel.com> > To: <intel-gfx@lists.freedesktop.org> > > From: Ben Widawsky <benjamin.widawsky@intel.com> > > The simple explanation is, the docs say to do this for GEN8. Perhaps we > want to do this for GEN7 too, I am not certain. > > PDPs are saved and restored with context. Contexts (without execlists) > only exist on the render ring. The docs say that PDPs are not power > context save/restored. I've learned that this actually means something > which SW doesn't care about. So pretend the statement doesn't exist. > For non RCS, nothing changes. > > All this patch now does is change the ordering of LRI vs MI_SET_CONTEXT > for the initialization of the context. I do this because the docs say to > do it, and frankly, I cannot reason why it is necessary. I've thought > about it a lot, and tried, without success, to get a reason from design. > The answer I got more or less says, "gen7 is different than gen8." I've > given up, and am adding this little bit of code to make it in sync with > the docs. > > v2: Completely rewritten commit message that addresses the requests > Ville made for v1 > Only load PDPs for initial context load (Ville) > > v3: Rebased after ppgtt clean-up rules, and apply only for render > ring. This is needed to boot to desktop with full ppgtt in legacy mode > (without execlists). > > v4: Clean up the pre/post load logic (Ville). > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3-4) > --- > drivers/gpu/drm/i915/i915_gem_context.c | 32 > ++++++++++++++++++++++++++++++-- > 1 file changed, 30 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > b/drivers/gpu/drm/i915/i915_gem_context.c > index a5221d8..7b73b36 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -522,6 +522,9 @@ static int do_switch(struct intel_engine_cs *ring, > struct intel_context *from = ring->last_context; > u32 hw_flags = 0; > bool uninitialized = false; > + bool needs_pd_load_pre = ((INTEL_INFO(ring->dev)->gen < 8) || > + (ring != &dev_priv->ring[RCS])) && to->ppgtt; > + bool needs_pd_load_post = false; > int ret, i; > > if (from != NULL && ring == &dev_priv->ring[RCS]) { > @@ -547,7 +550,11 @@ static int do_switch(struct intel_engine_cs *ring, > */ > from = ring->last_context; > > - if (to->ppgtt) { > + if (needs_pd_load_pre) { > + /* Older GENs and non render rings still want the load first, > + * "PP_DCLV followed by PP_DIR_BASE register through Load > + * Register Immediate commands in Ring Buffer before submitting > + * a context."*/ > ret = to->ppgtt->switch_mm(to->ppgtt, ring); > if (ret) > goto unpin_out; > @@ -577,13 +584,34 @@ static int do_switch(struct intel_engine_cs *ring, > vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level, > GLOBAL_BIND); > } > > - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) > + /* GEN8 does *not* require an explicit reload if the PDPs have been > + * setup, and we do not wish to move them. > + * > + * XXX: If we implemented page directory eviction code, this > + * optimization needs to be removed. > + */ > + if (!to->legacy_hw_ctx.initialized || > i915_gem_context_is_default(to)) { > hw_flags |= MI_RESTORE_INHIBIT; > + needs_pd_load_post = to->ppgtt && IS_GEN8(ring->dev); > + } > > ret = mi_set_context(ring, to, hw_flags); > if (ret) > goto unpin_out; > > + if (needs_pd_load_post) { > + ret = to->ppgtt->switch_mm(to->ppgtt, ring); > + /* The hardware context switch is emitted, but we haven't > + * actually changed the state - so it's probably safe to bail > + * here. Still, let the user know something dangerous has > + * happened. > + */ > + if (ret) { > + DRM_ERROR("Failed to change address space on context > switch\n"); > + goto unpin_out; > + } > + } > + > for (i = 0; i < MAX_L3_SLICES; i++) { > if (!(to->remap_slice & (1<<i))) > continue;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index a5221d8..7b73b36 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -522,6 +522,9 @@ static int do_switch(struct intel_engine_cs *ring, struct intel_context *from = ring->last_context; u32 hw_flags = 0; bool uninitialized = false; + bool needs_pd_load_pre = ((INTEL_INFO(ring->dev)->gen < 8) || + (ring != &dev_priv->ring[RCS])) && to->ppgtt; + bool needs_pd_load_post = false; int ret, i; if (from != NULL && ring == &dev_priv->ring[RCS]) { @@ -547,7 +550,11 @@ static int do_switch(struct intel_engine_cs *ring, */ from = ring->last_context; - if (to->ppgtt) { + if (needs_pd_load_pre) { + /* Older GENs and non render rings still want the load first, + * "PP_DCLV followed by PP_DIR_BASE register through Load + * Register Immediate commands in Ring Buffer before submitting + * a context."*/ ret = to->ppgtt->switch_mm(to->ppgtt, ring); if (ret) goto unpin_out; @@ -577,13 +584,34 @@ static int do_switch(struct intel_engine_cs *ring, vma->bind_vma(vma, to->legacy_hw_ctx.rcs_state->cache_level, GLOBAL_BIND); } - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) + /* GEN8 does *not* require an explicit reload if the PDPs have been + * setup, and we do not wish to move them. + * + * XXX: If we implemented page directory eviction code, this + * optimization needs to be removed. + */ + if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) { hw_flags |= MI_RESTORE_INHIBIT; + needs_pd_load_post = to->ppgtt && IS_GEN8(ring->dev); + } ret = mi_set_context(ring, to, hw_flags); if (ret) goto unpin_out; + if (needs_pd_load_post) { + ret = to->ppgtt->switch_mm(to->ppgtt, ring); + /* The hardware context switch is emitted, but we haven't + * actually changed the state - so it's probably safe to bail + * here. Still, let the user know something dangerous has + * happened. + */ + if (ret) { + DRM_ERROR("Failed to change address space on context switch\n"); + goto unpin_out; + } + } + for (i = 0; i < MAX_L3_SLICES; i++) { if (!(to->remap_slice & (1<<i))) continue;