Message ID | 20210816135139.10060-7-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Clean up GuC CI failures, simplify locking, and kernel DOC | expand |
On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote: > Progagating errors to dependent fences is wrong, don't do it. Selftest > in following patch exposes this bug. Please explain what "this bug" is, it's hard to read minds, especially at a distance in spacetime :-) > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold") I think it would be better to outright revert this, instead of just disabling it like this. Also please cite the dma_fence error propagation revert from Jason: commit 93a2711cddd5760e2f0f901817d71c93183c3b87 Author: Jason Ekstrand <jason@jlekstrand.net> Date: Wed Jul 14 14:34:16 2021 -0500 Revert "drm/i915: Propagate errors on awaiting already signaled fences" Maybe in full, if you need the justification. > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > Cc: <stable@vger.kernel.org> Unless "this bug" is some real world impact thing I wouldn't put cc: stable on this. -Daniel > --- > drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ---- > 1 file changed, 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > index de5f9c86b9a4..cafb0608ffb4 100644 > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq) > if (p->flags & I915_DEPENDENCY_WEAK) > continue; > > - /* Propagate any change in error status */ > - if (rq->fence.error) > - i915_request_set_error_once(w, rq->fence.error); > - > if (w->engine != rq->engine) > continue; > > -- > 2.32.0 >
On Tue, Aug 17, 2021 at 11:21:27AM +0200, Daniel Vetter wrote: > On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote: > > Progagating errors to dependent fences is wrong, don't do it. Selftest > > in following patch exposes this bug. > > Please explain what "this bug" is, it's hard to read minds, especially at > a distance in spacetime :-) > Not a very good explaination. > > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold") > > I think it would be better to outright revert this, instead of just > disabling it like this. > I tried revert and git did some really odd things that I couldn't resolve, hence the new patch. > Also please cite the dma_fence error propagation revert from Jason: > > commit 93a2711cddd5760e2f0f901817d71c93183c3b87 > Author: Jason Ekstrand <jason@jlekstrand.net> > Date: Wed Jul 14 14:34:16 2021 -0500 > > Revert "drm/i915: Propagate errors on awaiting already signaled fences" > > Maybe in full, if you need the justification. > Will site. > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > Cc: <stable@vger.kernel.org> > > Unless "this bug" is some real world impact thing I wouldn't put cc: > stable on this. Got it. Matt > -Daniel > > --- > > drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ---- > > 1 file changed, 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > index de5f9c86b9a4..cafb0608ffb4 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq) > > if (p->flags & I915_DEPENDENCY_WEAK) > > continue; > > > > - /* Propagate any change in error status */ > > - if (rq->fence.error) > > - i915_request_set_error_once(w, rq->fence.error); > > - > > if (w->engine != rq->engine) > > continue; > > > > -- > > 2.32.0 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Tue, Aug 17, 2021 at 5:13 PM Matthew Brost <matthew.brost@intel.com> wrote: > On Tue, Aug 17, 2021 at 11:21:27AM +0200, Daniel Vetter wrote: > > On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote: > > > Progagating errors to dependent fences is wrong, don't do it. Selftest > > > in following patch exposes this bug. > > > > Please explain what "this bug" is, it's hard to read minds, especially at > > a distance in spacetime :-) > > > > Not a very good explaination. > > > > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold") > > > > I think it would be better to outright revert this, instead of just > > disabling it like this. > > > > I tried revert and git did some really odd things that I couldn't > resolve, hence the new patch. If there's any conflict git just gives you your current code, and what was there with the revert applied, with the block markers. Then it's your job to manually apply that change. Occasionally (when there's been ridiculous amounts of code movement) it gets completely lost and puts these into very non-intuitive places. In that case just delete it, keep the current code, and check what change you're missing that needs to be manually reverted still. Also sometimes there's a follow-up patch that you should revert first, which makes the revert clean. In that case it's generally the right thing to revert the follow-up first, and then apply your revert. Often there's subtle functional dependencies hiding. -Daniel > > > Also please cite the dma_fence error propagation revert from Jason: > > > > commit 93a2711cddd5760e2f0f901817d71c93183c3b87 > > Author: Jason Ekstrand <jason@jlekstrand.net> > > Date: Wed Jul 14 14:34:16 2021 -0500 > > > > Revert "drm/i915: Propagate errors on awaiting already signaled fences" > > > > Maybe in full, if you need the justification. > > > > Will site. > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > > Cc: <stable@vger.kernel.org> > > > > Unless "this bug" is some real world impact thing I wouldn't put cc: > > stable on this. > > Got it. > > Matt > > > -Daniel > > > --- > > > drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ---- > > > 1 file changed, 4 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > > index de5f9c86b9a4..cafb0608ffb4 100644 > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq) > > > if (p->flags & I915_DEPENDENCY_WEAK) > > > continue; > > > > > > - /* Propagate any change in error status */ > > > - if (rq->fence.error) > > > - i915_request_set_error_once(w, rq->fence.error); > > > - > > > if (w->engine != rq->engine) > > > continue; > > > > > > -- > > > 2.32.0 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index de5f9c86b9a4..cafb0608ffb4 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq) if (p->flags & I915_DEPENDENCY_WEAK) continue; - /* Propagate any change in error status */ - if (rq->fence.error) - i915_request_set_error_once(w, rq->fence.error); - if (w->engine != rq->engine) continue;
Progagating errors to dependent fences is wrong, don't do it. Selftest in following patch exposes this bug. Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> --- drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ---- 1 file changed, 4 deletions(-)