[25/27] drm/i915/guc: Handle errors in multi-lrc requests

Message ID	20210820224446.30620-26-matthew.brost@intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=BHe2=NL=lists.freedesktop.org=intel-gfx-bounces@kernel.org> DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4F3A861057 From: Matthew Brost <matthew.brost@intel.com> To: <intel-gfx@lists.freedesktop.org>, <dri-devel@lists.freedesktop.org> Cc: <daniel.vetter@ffwll.ch>, <tony.ye@intel.com>, <zhengguo.xu@intel.com> Date: Fri, 20 Aug 2021 15:44:44 -0700 Message-Id: <20210820224446.30620-26-matthew.brost@intel.com> In-Reply-To: <20210820224446.30620-1-matthew.brost@intel.com> References: <20210820224446.30620-1-matthew.brost@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Subject: [Intel-gfx] [PATCH 25/27] drm/i915/guc: Handle errors in multi-lrc requests Precedence: list Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series	Parallel submission aka multi-bb execbuf \| expand [00/27] Parallel submission aka multi-bb execbuf [01/27] drm/i915/guc: Squash Clean up GuC CI failures, simplify locking, and kernel DOC [02/27] drm/i915/guc: Allow flexible number of context ids [03/27] drm/i915/guc: Connect the number of guc_ids to debugfs [04/27] drm/i915/guc: Take GT PM ref when deregistering context [05/27] drm/i915: Add GT PM unpark worker [06/27] drm/i915/guc: Take engine PM when a context is pinned with GuC submission [07/27] drm/i915/guc: Don't call switch_to_kernel_context with GuC submission [08/27] drm/i915: Add logical engine mapping [09/27] drm/i915: Expose logical engine instance to user [10/27] drm/i915/guc: Introduce context parent-child relationship [11/27] drm/i915/guc: Implement parallel context pin / unpin functions [12/27] drm/i915/guc: Add multi-lrc context registration [13/27] drm/i915/guc: Ensure GuC schedule operations do not operate on child contexts [14/27] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids [15/27] drm/i915/guc: Implement multi-lrc submission [16/27] drm/i915/guc: Insert submit fences between requests in parent-child relationship [17/27] drm/i915/guc: Implement multi-lrc reset [18/27] drm/i915/guc: Update debugfs for GuC multi-lrc [19/27] drm/i915: Fix bug in user proto-context creation that leaked contexts [20/27] drm/i915/guc: Connect UAPI to GuC multi-lrc interface [21/27] drm/i915/doc: Update parallel submit doc to point to i915_drm.h [22/27] drm/i915/guc: Add basic GuC multi-lrc selftest [23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc [24/27] drm/i915: Multi-BB execbuf [25/27] drm/i915/guc: Handle errors in multi-lrc requests [26/27] drm/i915: Enable multi-bb execbuf [27/27] drm/i915/execlists: Weak parallel submission support for execlists

Message ID

20210820224446.30620-26-matthew.brost@intel.com (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4F3A861057
From: Matthew Brost <matthew.brost@intel.com>
To: <intel-gfx@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>
Cc: <daniel.vetter@ffwll.ch>,
	<tony.ye@intel.com>,
	<zhengguo.xu@intel.com>
Date: Fri, 20 Aug 2021 15:44:44 -0700
Message-Id: <20210820224446.30620-26-matthew.brost@intel.com>
In-Reply-To: <20210820224446.30620-1-matthew.brost@intel.com>
References: <20210820224446.30620-1-matthew.brost@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Intel-gfx] [PATCH 25/27] drm/i915/guc: Handle errors in multi-lrc
 requests
Precedence: list
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Series

Parallel submission aka multi-bb execbuf | expand

Commit Message

Matthew Brost Aug. 20, 2021, 10:44 p.m. UTC

If an error occurs in the front end when multi-lrc requests are getting
generated we need to skip these in the backend but we still need to
emit the breadcrumbs seqno. An issues arrises because with multi-lrc
breadcrumbs there is a handshake between the parent and children to make
forwad progress. If all the requests are not present this handshake
doesn't work. To work around this, if multi-lrc request has an error we
skip the handshake but still emit the breadcrumbs seqno.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 61 ++++++++++++++++++-
 1 file changed, 58 insertions(+), 3 deletions(-)

Comments

John Harrison Sept. 29, 2021, 8:44 p.m. UTC | #1

On 8/20/2021 15:44, Matthew Brost wrote:
> If an error occurs in the front end when multi-lrc requests are getting
> generated we need to skip these in the backend but we still need to
> emit the breadcrumbs seqno. An issues arrises because with multi-lrc
arrises -> arises

> breadcrumbs there is a handshake between the parent and children to make
> forwad progress. If all the requests are not present this handshake
forwad -> forward

> doesn't work. To work around this, if multi-lrc request has an error we
> skip the handshake but still emit the breadcrumbs seqno.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 61 ++++++++++++++++++-
>   1 file changed, 58 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 2ef38557b0f0..61e737fd1eee 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3546,8 +3546,8 @@ static int emit_bb_start_child_no_preempt_mid_batch(struct i915_request *rq,
>   }
>   
>   static u32 *
> -emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> -						 u32 *cs)
> +__emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> +						   u32 *cs)
>   {
>   	struct intel_context *ce = rq->context;
>   	u8 i;
> @@ -3575,6 +3575,41 @@ emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
>   				  get_children_go_addr(ce),
>   				  0);
>   
> +	return cs;
> +}
> +
> +/*
> + * If this true, a submission of multi-lrc requests had an error and the
> + * requests need to be skipped. The front end (execuf IOCTL) should've called
> + * i915_request_skip which squashes the BB but we still need to emit the fini
> + * breadrcrumbs seqno write. At this point we don't know how many of the
> + * requests in the multi-lrc submission were generated so we can't do the
> + * handshake between the parent and children (e.g. if 4 requests should be
> + * generated but 2nd hit an error only 1 would be seen by the GuC backend).
> + * Simply skip the handshake, but still emit the breadcrumbd seqno, if an error
> + * has occurred on any of the requests in submission / relationship.
> + */
> +static inline bool skip_handshake(struct i915_request *rq)
> +{
> +	return test_bit(I915_FENCE_FLAG_SKIP_PARALLEL, &rq->fence.flags);
> +}
> +
> +static u32 *
> +emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> +						 u32 *cs)
> +{
> +	struct intel_context *ce = rq->context;
> +
> +	GEM_BUG_ON(!intel_context_is_parent(ce));
> +
> +	if (unlikely(skip_handshake(rq))) {
> +		memset(cs, 0, sizeof(u32) *
> +		       (ce->engine->emit_fini_breadcrumb_dw - 6));
> +		cs += ce->engine->emit_fini_breadcrumb_dw - 6;
Why -6? There are 12 words about to be written. Indeed the value of 
emit_..._dw is '12 + 4*num_children'. This should only be skipping over 
the 4*children, right? As it stands, it will skip all but the last six 
words, then write an extra twelve words and thus overflow the 
reservation by six. Unless I am totally confused?

I assume there is some reason why the amount of data written must 
exactly match the space reserved? It's a while since I've looked at the 
ring buffer code!

Seems like it would be clearer to not split the semaphore writes out but 
have them right next to the skip code that is meant to replicate them 
but with no-ops.

> +	} else {
> +		cs = __emit_fini_breadcrumb_parent_no_preempt_mid_batch(rq, cs);
> +	}
> +
>   	/* Emit fini breadcrumb */
>   	cs = gen8_emit_ggtt_write(cs,
>   				  rq->fence.seqno,
> @@ -3591,7 +3626,8 @@ emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
>   }
>   
>   static u32 *
> -emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 *cs)
> +__emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
> +						  u32 *cs)
>   {
>   	struct intel_context *ce = rq->context;
>   
> @@ -3617,6 +3653,25 @@ emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 *cs
>   	*cs++ = get_children_go_addr(ce->parent);
>   	*cs++ = 0;
>   
> +	return cs;
> +}
> +
> +static u32 *
> +emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
> +						u32 *cs)
> +{
> +	struct intel_context *ce = rq->context;
> +
> +	GEM_BUG_ON(!intel_context_is_child(ce));
> +
> +	if (unlikely(skip_handshake(rq))) {
> +		memset(cs, 0, sizeof(u32) *
> +		       (ce->engine->emit_fini_breadcrumb_dw - 6));
> +		cs += ce->engine->emit_fini_breadcrumb_dw - 6;
> +	} else {
> +		cs = __emit_fini_breadcrumb_child_no_preempt_mid_batch(rq, cs);
> +	}
Same points as above - why -6 not -12 and would be clearer to keep the 
no-ops and the writes adjacent.

John.

> +
>   	/* Emit fini breadcrumb */
>   	cs = gen8_emit_ggtt_write(cs,
>   				  rq->fence.seqno,

Matthew Brost Sept. 29, 2021, 8:58 p.m. UTC | #2

On Wed, Sep 29, 2021 at 01:44:10PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > If an error occurs in the front end when multi-lrc requests are getting
> > generated we need to skip these in the backend but we still need to
> > emit the breadcrumbs seqno. An issues arrises because with multi-lrc
> arrises -> arises
> 

Yep.

> > breadcrumbs there is a handshake between the parent and children to make
> > forwad progress. If all the requests are not present this handshake
> forwad -> forward
> 

Yep.

> > doesn't work. To work around this, if multi-lrc request has an error we
> > skip the handshake but still emit the breadcrumbs seqno.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 61 ++++++++++++++++++-
> >   1 file changed, 58 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 2ef38557b0f0..61e737fd1eee 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -3546,8 +3546,8 @@ static int emit_bb_start_child_no_preempt_mid_batch(struct i915_request *rq,
> >   }
> >   static u32 *
> > -emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> > -						 u32 *cs)
> > +__emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> > +						   u32 *cs)
> >   {
> >   	struct intel_context *ce = rq->context;
> >   	u8 i;
> > @@ -3575,6 +3575,41 @@ emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> >   				  get_children_go_addr(ce),
> >   				  0);
> > +	return cs;
> > +}
> > +
> > +/*
> > + * If this true, a submission of multi-lrc requests had an error and the
> > + * requests need to be skipped. The front end (execuf IOCTL) should've called
> > + * i915_request_skip which squashes the BB but we still need to emit the fini
> > + * breadrcrumbs seqno write. At this point we don't know how many of the
> > + * requests in the multi-lrc submission were generated so we can't do the
> > + * handshake between the parent and children (e.g. if 4 requests should be
> > + * generated but 2nd hit an error only 1 would be seen by the GuC backend).
> > + * Simply skip the handshake, but still emit the breadcrumbd seqno, if an error
> > + * has occurred on any of the requests in submission / relationship.
> > + */
> > +static inline bool skip_handshake(struct i915_request *rq)
> > +{
> > +	return test_bit(I915_FENCE_FLAG_SKIP_PARALLEL, &rq->fence.flags);
> > +}
> > +
> > +static u32 *
> > +emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> > +						 u32 *cs)
> > +{
> > +	struct intel_context *ce = rq->context;
> > +
> > +	GEM_BUG_ON(!intel_context_is_parent(ce));
> > +
> > +	if (unlikely(skip_handshake(rq))) {
> > +		memset(cs, 0, sizeof(u32) *
> > +		       (ce->engine->emit_fini_breadcrumb_dw - 6));
> > +		cs += ce->engine->emit_fini_breadcrumb_dw - 6;
> Why -6? There are 12 words about to be written. Indeed the value of
> emit_..._dw is '12 + 4*num_children'. This should only be skipping over the
> 4*children, right? As it stands, it will skip all but the last six words,
> then write an extra twelve words and thus overflow the reservation by six.
> Unless I am totally confused?
> 

Let me decode the length:

'Wait on children' (in __emit_fini_breadcrumb_parent_no_preempt_mid_batch) = 4 * num_children
'Turn on preemption' (in __emit_fini_breadcrumb_parent_no_preempt_mid_batch) = 2
'Tell children go' (in __emit_fini_breadcrumb_parent_no_preempt_mid_batch) = 4
'Emit fini breadcrumb' (in emit_fini_breadcrumb_child_no_preempt_mid_batch) = 4
'User interrupt' (in emit_fini_breadcrumb_child_no_preempt_mid_batch) = 2

So for a total (emit_fini_breadcrumb_dw) we have '12 + 4 * num_children'

We want skip everything in __emit_fini_breadcrumb_parent_no_preempt_mid_batch, so that is
'6 + 4 * num_children' or 'emit_fini_breadcrumb_dw - 6'

Make sense?

> I assume there is some reason why the amount of data written must exactly
> match the space reserved? It's a while since I've looked at the ring buffer
> code!
> 

I think it because the ring space is reserved at request creation time
but the fini breadcrumbs are not written until submission time.

> Seems like it would be clearer to not split the semaphore writes out but
> have them right next to the skip code that is meant to replicate them but
> with no-ops.
>

I guess that works too, I personally like the way it is but if you
insist I can change it.

> > +	} else {
> > +		cs = __emit_fini_breadcrumb_parent_no_preempt_mid_batch(rq, cs);
> > +	}
> > +
> >   	/* Emit fini breadcrumb */
> >   	cs = gen8_emit_ggtt_write(cs,
> >   				  rq->fence.seqno,
> > @@ -3591,7 +3626,8 @@ emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
> >   }
> >   static u32 *
> > -emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 *cs)
> > +__emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
> > +						  u32 *cs)
> >   {
> >   	struct intel_context *ce = rq->context;
> > @@ -3617,6 +3653,25 @@ emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 *cs
> >   	*cs++ = get_children_go_addr(ce->parent);
> >   	*cs++ = 0;
> > +	return cs;
> > +}
> > +
> > +static u32 *
> > +emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
> > +						u32 *cs)
> > +{
> > +	struct intel_context *ce = rq->context;
> > +
> > +	GEM_BUG_ON(!intel_context_is_child(ce));
> > +
> > +	if (unlikely(skip_handshake(rq))) {
> > +		memset(cs, 0, sizeof(u32) *
> > +		       (ce->engine->emit_fini_breadcrumb_dw - 6));
> > +		cs += ce->engine->emit_fini_breadcrumb_dw - 6;
> > +	} else {
> > +		cs = __emit_fini_breadcrumb_child_no_preempt_mid_batch(rq, cs);
> > +	}
> Same points as above - why -6 not -12 and would be clearer to keep the
> no-ops and the writes adjacent.
>

Same as above we are NOP the length of
__emit_fini_breadcrumb_child_no_preempt_mid_batch and still want the
emit breadcrumbs below.

Matt

 
> John.
> 
> > +
> >   	/* Emit fini breadcrumb */
> >   	cs = gen8_emit_ggtt_write(cs,
> >   				  rq->fence.seqno,
>

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 2ef38557b0f0..61e737fd1eee 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3546,8 +3546,8 @@  static int emit_bb_start_child_no_preempt_mid_batch(struct i915_request *rq,
 }
 
 static u32 *
-emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
-						 u32 *cs)
+__emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
+						   u32 *cs)
 {
 	struct intel_context *ce = rq->context;
 	u8 i;
@@ -3575,6 +3575,41 @@  emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
 				  get_children_go_addr(ce),
 				  0);
 
+	return cs;
+}
+
+/*
+ * If this true, a submission of multi-lrc requests had an error and the
+ * requests need to be skipped. The front end (execuf IOCTL) should've called
+ * i915_request_skip which squashes the BB but we still need to emit the fini
+ * breadrcrumbs seqno write. At this point we don't know how many of the
+ * requests in the multi-lrc submission were generated so we can't do the
+ * handshake between the parent and children (e.g. if 4 requests should be
+ * generated but 2nd hit an error only 1 would be seen by the GuC backend).
+ * Simply skip the handshake, but still emit the breadcrumbd seqno, if an error
+ * has occurred on any of the requests in submission / relationship.
+ */
+static inline bool skip_handshake(struct i915_request *rq)
+{
+	return test_bit(I915_FENCE_FLAG_SKIP_PARALLEL, &rq->fence.flags);
+}
+
+static u32 *
+emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
+						 u32 *cs)
+{
+	struct intel_context *ce = rq->context;
+
+	GEM_BUG_ON(!intel_context_is_parent(ce));
+
+	if (unlikely(skip_handshake(rq))) {
+		memset(cs, 0, sizeof(u32) *
+		       (ce->engine->emit_fini_breadcrumb_dw - 6));
+		cs += ce->engine->emit_fini_breadcrumb_dw - 6;
+	} else {
+		cs = __emit_fini_breadcrumb_parent_no_preempt_mid_batch(rq, cs);
+	}
+
 	/* Emit fini breadcrumb */
 	cs = gen8_emit_ggtt_write(cs,
 				  rq->fence.seqno,
@@ -3591,7 +3626,8 @@  emit_fini_breadcrumb_parent_no_preempt_mid_batch(struct i915_request *rq,
 }
 
 static u32 *
-emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 *cs)
+__emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
+						  u32 *cs)
 {
 	struct intel_context *ce = rq->context;
 
@@ -3617,6 +3653,25 @@  emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq, u32 *cs
 	*cs++ = get_children_go_addr(ce->parent);
 	*cs++ = 0;
 
+	return cs;
+}
+
+static u32 *
+emit_fini_breadcrumb_child_no_preempt_mid_batch(struct i915_request *rq,
+						u32 *cs)
+{
+	struct intel_context *ce = rq->context;
+
+	GEM_BUG_ON(!intel_context_is_child(ce));
+
+	if (unlikely(skip_handshake(rq))) {
+		memset(cs, 0, sizeof(u32) *
+		       (ce->engine->emit_fini_breadcrumb_dw - 6));
+		cs += ce->engine->emit_fini_breadcrumb_dw - 6;
+	} else {
+		cs = __emit_fini_breadcrumb_child_no_preempt_mid_batch(rq, cs);
+	}
+
 	/* Emit fini breadcrumb */
 	cs = gen8_emit_ggtt_write(cs,
 				  rq->fence.seqno,

[25/27] drm/i915/guc: Handle errors in multi-lrc requests

Commit Message

Comments

Patch