diff mbox

[v6] drm/i915/vlv: Added a rendering specific Hw WA 'WaTlbInvalidateStoreDataBefore'

Message ID 1395741214-26440-1-git-send-email-sourab.gupta@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

sourab.gupta@intel.com March 25, 2014, 9:53 a.m. UTC
From: Akash Goel <akash.goel@intel.com>

Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
This workaround has to be applied before doing TLB Invalidation on render ring.
In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
Store data commands.
Without this, hardware cannot guarantee the command after the PIPE_CONTROL
with TLB inv will not use the old TLB values.

v2: Modified the WA comment (Ville)

v3: Added the vlv identifier with WA name (Damien)

v4: Reworked based on Chris' comments (WA moved to gen7 ring flush func,
sending 6 dwords instead of 8) (Chris)

v5: Enhancing the scope of WA to gen6, gen7. Having a common WA func being
called from gen6, gen7 flush functions. (Ville)

v6: WA is applicable only to render ring, earlier put for all rings in v5.
(Chris)

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 39 +++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

Comments

Chris Wilson March 25, 2014, 10:59 a.m. UTC | #1
On Tue, Mar 25, 2014 at 03:23:34PM +0530, sourab.gupta@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> This workaround has to be applied before doing TLB Invalidation on render ring.
> In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> Store data commands.
> Without this, hardware cannot guarantee the command after the PIPE_CONTROL
> with TLB inv will not use the old TLB values.

Note, that our command programming sequence already has multiple dword
writes between the flush of the last batch and the invalidation of the
next.

Is this w/a still required? Why?
-Chris
sourab.gupta@intel.com March 26, 2014, 5:14 a.m. UTC | #2
On Tue, 2014-03-25 at 10:59 +0000, Chris Wilson wrote:
> On Tue, Mar 25, 2014 at 03:23:34PM +0530, sourab.gupta@intel.com wrote:
> > From: Akash Goel <akash.goel@intel.com>
> > 
> > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > This workaround has to be applied before doing TLB Invalidation on render ring.
> > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > Store data commands.
> > Without this, hardware cannot guarantee the command after the PIPE_CONTROL
> > with TLB inv will not use the old TLB values.
> 
> Note, that our command programming sequence already has multiple dword
> writes between the flush of the last batch and the invalidation of the
> next.
> 
> Is this w/a still required? Why?
> -Chris
> 
Hi Chris,
Are you referring to the MI_STORE_DWORD_INDEX commands emitted as a part
of add_request functions?
We couldn't find any other place where we are storing dwords between
flush of last batch and invalidation of next.
In that case, we agree that as a part of command programming sequence,
we'll have one set of store dwords emitted.

But, as per spec, it is required to emit 2 sets of store dwords before
the tlb invalidation.
Also, our motive for having this w/a is just being paranoid, and not
assuming that dwords would already have been emitted before doing tlb
invalidation. So, we try to explicitly ensure the same through our w/a.

Regards,
Sourab
Chris Wilson March 26, 2014, 7:54 a.m. UTC | #3
On Wed, Mar 26, 2014 at 05:14:05AM +0000, Gupta, Sourab wrote:
> On Tue, 2014-03-25 at 10:59 +0000, Chris Wilson wrote:
> > On Tue, Mar 25, 2014 at 03:23:34PM +0530, sourab.gupta@intel.com wrote:
> > > From: Akash Goel <akash.goel@intel.com>
> > > 
> > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > This workaround has to be applied before doing TLB Invalidation on render ring.
> > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > Store data commands.
> > > Without this, hardware cannot guarantee the command after the PIPE_CONTROL
> > > with TLB inv will not use the old TLB values.
> > 
> > Note, that our command programming sequence already has multiple dword
> > writes between the flush of the last batch and the invalidation of the
> > next.
> > 
> > Is this w/a still required? Why?
> > -Chris
> > 
> Hi Chris,
> Are you referring to the MI_STORE_DWORD_INDEX commands emitted as a part
> of add_request functions?
> We couldn't find any other place where we are storing dwords between
> flush of last batch and invalidation of next.
> In that case, we agree that as a part of command programming sequence,
> we'll have one set of store dwords emitted.
> 
> But, as per spec, it is required to emit 2 sets of store dwords before
> the tlb invalidation.

At this moment, I am upset how vague this recommendation is. Why don't
the LRI count? Is there a timing requirement? Do the stores have to
be different pages to flush a TLB etc?

> Also, our motive for having this w/a is just being paranoid, and not
> assuming that dwords would already have been emitted before doing tlb
> invalidation. So, we try to explicitly ensure the same through our w/a.

Which would be as easy as doubling up the STORE_DW(seqno).
-Chris
sourab.gupta@intel.com March 26, 2014, 2:13 p.m. UTC | #4
On Wed, 2014-03-26 at 07:54 +0000, Chris Wilson wrote:
> On Wed, Mar 26, 2014 at 05:14:05AM +0000, Gupta, Sourab wrote:
> > On Tue, 2014-03-25 at 10:59 +0000, Chris Wilson wrote:
> > > On Tue, Mar 25, 2014 at 03:23:34PM +0530, sourab.gupta@intel.com wrote:
> > > > From: Akash Goel <akash.goel@intel.com>
> > > > 
> > > > Added a new rendering specific Workaround 'WaTlbInvalidateStoreDataBefore'.
> > > > This workaround has to be applied before doing TLB Invalidation on render ring.
> > > > In this WA, before pipecontrol with TLB invalidate set, need to add 2 MI
> > > > Store data commands.
> > > > Without this, hardware cannot guarantee the command after the PIPE_CONTROL
> > > > with TLB inv will not use the old TLB values.
> > > 
> > > Note, that our command programming sequence already has multiple dword
> > > writes between the flush of the last batch and the invalidation of the
> > > next.
> > > 
> > > Is this w/a still required? Why?
> > > -Chris
> > > 
> > Hi Chris,
> > Are you referring to the MI_STORE_DWORD_INDEX commands emitted as a part
> > of add_request functions?
> > We couldn't find any other place where we are storing dwords between
> > flush of last batch and invalidation of next.
> > In that case, we agree that as a part of command programming sequence,
> > we'll have one set of store dwords emitted.
> > 
> > But, as per spec, it is required to emit 2 sets of store dwords before
> > the tlb invalidation.
> 
> At this moment, I am upset how vague this recommendation is. Why don't
> the LRI count? Is there a timing requirement? Do the stores have to
> be different pages to flush a TLB etc?
> 
Hi Chris,
I see your point. We do see the MI_LOAD_REGISTER_IMM calls, but we were
not sure whether they would serve the purpose (forgive our limited
visibility on the low level pipeline details). 
If the LRI + store dword combination does the job, then this w/a may not
be required and we can drop it.
Regards,
Sourab

> > Also, our motive for having this w/a is just being paranoid, and not
> > assuming that dwords would already have been emitted before doing tlb
> > invalidation. So, we try to explicitly ensure the same through our w/a.
> 
> Which would be as easy as doubling up the STORE_DW(seqno).
> -Chris
>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 87d1a2d..816137f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -208,6 +208,31 @@  intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
 }
 
 static int
+gen6_tlb_invalidate_wa(struct intel_ring_buffer *ring)
+{
+	/*
+	 * WaTlbInvalidateStoreDataBefore:gen6,gen7
+	 * This workaround has to be applied before doing TLB invalidation
+	 * on the render ring. Before pipecontrol with TLB invalidate set,
+	 * need 2 store data commands (such as MI_STORE_DATA_IMM or
+	 * MI_STORE_DATA_INDEX). Without this, hardware cannot guarantee
+	 * the command after the PIPE_CONTROL with TLB inv will not use
+	 * the old TLB values.
+	 */
+	int i, ret;
+	ret = intel_ring_begin(ring, 3 * 2);
+	if (ret)
+		return ret;
+	for (i = 0; i < 2; i++) {
+		intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
+		intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR);
+		intel_ring_emit(ring, 0);
+	}
+	intel_ring_advance(ring);
+	return 0;
+}
+
+static int
 gen6_render_ring_flush(struct intel_ring_buffer *ring,
                          u32 invalidate_domains, u32 flush_domains)
 {
@@ -215,6 +240,13 @@  gen6_render_ring_flush(struct intel_ring_buffer *ring,
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
 	int ret;
 
+	/* Apply WaTlbInvalidateStoreDataBefore workaround */
+	if (invalidate_domains) {
+		ret = gen6_tlb_invalidate_wa(ring);
+		if (ret)
+			return ret;
+	}
+
 	/* Force SNB workarounds for PIPE_CONTROL flushes */
 	ret = intel_emit_post_sync_nonzero_flush(ring);
 	if (ret)
@@ -309,6 +341,13 @@  gen7_render_ring_flush(struct intel_ring_buffer *ring,
 	u32 scratch_addr = ring->scratch.gtt_offset + 128;
 	int ret;
 
+	/* Apply WaTlbInvalidateStoreDataBefore workaround */
+	if (invalidate_domains) {
+		ret = gen6_tlb_invalidate_wa(ring);
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * Ensure that any following seqno writes only happen when the render
 	 * cache is indeed flushed.