diff mbox

drm/i915/cnl: WaPipeControlBefore3DStateSamplePattern

Message ID 20180126012634.3552-1-rafael.antognolli@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Rafael Antognolli Jan. 26, 2018, 1:26 a.m. UTC
Write a PIPE_CONTROL with CS stall followed by 14 dwords of 0 in the
indirect context wa bb.

References: HSD#1939868

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

Comments

Chris Wilson Jan. 26, 2018, 8:23 a.m. UTC | #1
Quoting Rafael Antognolli (2018-01-26 01:26:34)
> Write a PIPE_CONTROL with CS stall followed by 14 dwords of 0 in the
> indirect context wa bb.

14 MI_NOOPS following? That isn't what you wrote in the code, but the
main thing you haven't explained is why. A normal batch will already
have a flush in its setup, but more to the point would be the only
reason this is required is because of an implicit 3DSTATE inside the
context image on preemption. Right? Otherwise it seems to be a purely
userspace problem.
-Chris
Rafael Antognolli Jan. 26, 2018, 5:55 p.m. UTC | #2
On Fri, Jan 26, 2018 at 08:23:13AM +0000, Chris Wilson wrote:
> Quoting Rafael Antognolli (2018-01-26 01:26:34)
> > Write a PIPE_CONTROL with CS stall followed by 14 dwords of 0 in the
> > indirect context wa bb.
> 
> 14 MI_NOOPS following? That isn't what you wrote in the code, but the

Agreed, sorry. The workarounds says:

0x7a000004 0x00100000 + 14 dwords of 0. I counted 6 dwords from
gen8_emit_pipe_control() and figured "just 8 more to 14". I think I had
it right on my first version of this patch, but...

> main thing you haven't explained is why. A normal batch will already
> have a flush in its setup, but more to the point would be the only
> reason this is required is because of an implicit 3DSTATE inside the
> context image on preemption. Right? Otherwise it seems to be a purely
> userspace problem.

This is the text from the workaround:

"This bug can be hit on a context restore.  To avoid the issue the
following must be programmed by SW to ensure the engine is idle prior to
programming 3DSTATE_SAMPLE_PATTERN:

With RS context enabled: 0x21c8 = 0x000085c0

With RS context disabled:0x21c8 = 0x00001ac0

The above program specifes the offset to insert driver programmed
commands

0x21c4[31:6] = 0x<Addresses>

0x21c4[5:0] = 0x<N>

N=Size of CL needed to fit Workaround

The above programming is the GGTT address of the driver programmed
commands and the size(# of CL) to execute.

The address above needs to be a GGTT address and contain a pipe control
with CS stall(0x7a000004 0x00100000 0x00000000 0x00000000)followed by
12DW’s of NOOP(0x00000000)"

Since it needs to be in a GGTT address, and it's specifically talking
about the Indirect Context Pointer, we figured it should be in the
kernel. I can update the commit message with the above text if it helps.

I originally implemented this trying to fix my GPU hang, but it turns
out the issue was something else and this commit doesn't help at all.
Still, I see no reason to not have it there just in case it prevent any
future hangs...

Rafael
Rafael Antognolli Jan. 26, 2018, 5:58 p.m. UTC | #3
On Fri, Jan 26, 2018 at 09:55:58AM -0800, Rafael Antognolli wrote:
> On Fri, Jan 26, 2018 at 08:23:13AM +0000, Chris Wilson wrote:
> > Quoting Rafael Antognolli (2018-01-26 01:26:34)
> > > Write a PIPE_CONTROL with CS stall followed by 14 dwords of 0 in the
> > > indirect context wa bb.
> > 
> > 14 MI_NOOPS following? That isn't what you wrote in the code, but the
> 
> Agreed, sorry. The workarounds says:
> 
> 0x7a000004 0x00100000 + 14 dwords of 0. I counted 6 dwords from
> gen8_emit_pipe_control() and figured "just 8 more to 14". I think I had
> it right on my first version of this patch, but...
> 
> > main thing you haven't explained is why. A normal batch will already
> > have a flush in its setup, but more to the point would be the only
> > reason this is required is because of an implicit 3DSTATE inside the
> > context image on preemption. Right? Otherwise it seems to be a purely
> > userspace problem.
> 
> This is the text from the workaround:
> 
> "This bug can be hit on a context restore.  To avoid the issue the
> following must be programmed by SW to ensure the engine is idle prior to
> programming 3DSTATE_SAMPLE_PATTERN:
> 
> With RS context enabled: 0x21c8 = 0x000085c0
> 
> With RS context disabled:0x21c8 = 0x00001ac0
> 
> The above program specifes the offset to insert driver programmed
> commands
> 
> 0x21c4[31:6] = 0x<Addresses>
> 
> 0x21c4[5:0] = 0x<N>
> 
> N=Size of CL needed to fit Workaround
> 
> The above programming is the GGTT address of the driver programmed
> commands and the size(# of CL) to execute.
> 
> The address above needs to be a GGTT address and contain a pipe control
> with CS stall(0x7a000004 0x00100000 0x00000000 0x00000000)followed by
> 12DW’s of NOOP(0x00000000)"
> 
> Since it needs to be in a GGTT address, and it's specifically talking
> about the Indirect Context Pointer, we figured it should be in the
> kernel. I can update the commit message with the above text if it helps.
> 
> I originally implemented this trying to fix my GPU hang, but it turns
> out the issue was something else and this commit doesn't help at all.
> Still, I see no reason to not have it there just in case it prevent any
> future hangs...

Oh, and this workaround is actually a little longer, and there is a part
that describe a similar PIPE_CONTROL before 3DSTATE_SAMPLE_PATTERN. That
part is already implemented in userspace.
Chris Wilson Jan. 26, 2018, 8:41 p.m. UTC | #4
Quoting Rafael Antognolli (2018-01-26 17:58:29)
> On Fri, Jan 26, 2018 at 09:55:58AM -0800, Rafael Antognolli wrote:
> > On Fri, Jan 26, 2018 at 08:23:13AM +0000, Chris Wilson wrote:
> > > Quoting Rafael Antognolli (2018-01-26 01:26:34)
> > > > Write a PIPE_CONTROL with CS stall followed by 14 dwords of 0 in the
> > > > indirect context wa bb.
> > > 
> > > 14 MI_NOOPS following? That isn't what you wrote in the code, but the
> > 
> > Agreed, sorry. The workarounds says:
> > 
> > 0x7a000004 0x00100000 + 14 dwords of 0. I counted 6 dwords from
> > gen8_emit_pipe_control() and figured "just 8 more to 14". I think I had
> > it right on my first version of this patch, but...
> > 
> > > main thing you haven't explained is why. A normal batch will already
> > > have a flush in its setup, but more to the point would be the only
> > > reason this is required is because of an implicit 3DSTATE inside the
> > > context image on preemption. Right? Otherwise it seems to be a purely
> > > userspace problem.
> > 
> > This is the text from the workaround:
> > 
> > "This bug can be hit on a context restore.  To avoid the issue the
> > following must be programmed by SW to ensure the engine is idle prior to
> > programming 3DSTATE_SAMPLE_PATTERN:
> > 
> > With RS context enabled: 0x21c8 = 0x000085c0
> > 
> > With RS context disabled:0x21c8 = 0x00001ac0
> > 
> > The above program specifes the offset to insert driver programmed
> > commands
> > 
> > 0x21c4[31:6] = 0x<Addresses>
> > 
> > 0x21c4[5:0] = 0x<N>
> > 
> > N=Size of CL needed to fit Workaround
> > 
> > The above programming is the GGTT address of the driver programmed
> > commands and the size(# of CL) to execute.
> > 
> > The address above needs to be a GGTT address and contain a pipe control
> > with CS stall(0x7a000004 0x00100000 0x00000000 0x00000000)followed by
> > 12DW’s of NOOP(0x00000000)"
> > 
> > Since it needs to be in a GGTT address, and it's specifically talking
> > about the Indirect Context Pointer, we figured it should be in the
> > kernel. I can update the commit message with the above text if it helps.
> > 
> > I originally implemented this trying to fix my GPU hang, but it turns
> > out the issue was something else and this commit doesn't help at all.
> > Still, I see no reason to not have it there just in case it prevent any
> > future hangs...
> 
> Oh, and this workaround is actually a little longer, and there is a part
> that describe a similar PIPE_CONTROL before 3DSTATE_SAMPLE_PATTERN. That
> part is already implemented in userspace.

Cool, I really just wanted confirmation that it was indeed required
before the context switch. Hence justifying placement in the
indirect_ctx bb, and explaining why it's not just a regular pre-batch
flush. Please do try and distill as much as the workaround
note into the changelog and a reminder inside the code; especially the
rationale on why it's in the indirect ctx bb.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 51e61b04a555..fd7e7ffde570 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1366,6 +1366,25 @@  static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
 	return batch;
 }
 
+static u32 *gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	int i;
+
+	/* WaPipeControlBefore3DStateSamplePattern: cnl */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_CS_STALL,
+				       0);
+	for (i = 0; i < 8; i++)
+		*batch++ = MI_NOOP;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+
 #define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
 
 static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
@@ -1419,7 +1438,9 @@  static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 
 	switch (INTEL_GEN(engine->i915)) {
 	case 10:
-		return 0;
+		wa_bb_fn[0] = gen10_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
 	case 9:
 		wa_bb_fn[0] = gen9_init_indirectctx_bb;
 		wa_bb_fn[1] = NULL;