Message ID | 1394863318-17424-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Mar 14, 2014 at 11:01:58PM -0700, Ben Widawsky wrote: > I have been seeing this for a long time, but ignored it because it's > typically not terribly important. Recently, I really needed this info, > and it was garbage. Proof that I should have fixed it sooner. Originally > wrong from: > > commit 6c7a01ec3743a5a6ce9e53a69d7a6c2d8c715eb1 > Author: Ben Widawsky <benjamin.widawsky@intel.com> > Date: Thu Jan 30 00:19:40 2014 -0800 > > drm/i915: Capture PPGTT info on error capture > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Oops. I wonder whether we should start to have some functional checks for the error states. Fumbles like this crop up fairly often, and they're a pain if a bug reporter shows up with such a busted batch. Other examples are the woes where we've dumped nothingess due to ppgtt. Most of the stuff could be easily tested I think, e.g. - Check whether the presumed offset of the batch returned by the kernel shows up anywhere in the right ring. If we first allocate a randomized large bo and then a batch the offset should be sufficiently unique to avoid false positives. This would check whether we correctly dump ring buffers. - Add some unique values after MI_BB_END and check the dump for them. This would check whether we correctly dump batches. - Special checks for the various register values we dump, e.g. we could fish information like the pp base out of debugfs. Not sure how useful this is ... Anyway just food for thought. I'll probably volunteer the next guy who breaks the error dumper as a victim for this. Queued for -next, thanks for the patch. -Daniel > --- > drivers/gpu/drm/i915/i915_gpu_error.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > index 7a9bba1..d7ac688 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -899,10 +899,12 @@ static void i915_record_ring_state(struct drm_device *dev, > } > break; > case 7: > - ering->vm_info.pp_dir_base = RING_PP_DIR_BASE(ring); > + ering->vm_info.pp_dir_base = > + I915_READ(RING_PP_DIR_BASE(ring)); > break; > case 6: > - ering->vm_info.pp_dir_base = RING_PP_DIR_BASE_READ(ring); > + ering->vm_info.pp_dir_base = > + I915_READ(RING_PP_DIR_BASE_READ(ring)); > break; > } > } > -- > 1.9.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 7a9bba1..d7ac688 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -899,10 +899,12 @@ static void i915_record_ring_state(struct drm_device *dev, } break; case 7: - ering->vm_info.pp_dir_base = RING_PP_DIR_BASE(ring); + ering->vm_info.pp_dir_base = + I915_READ(RING_PP_DIR_BASE(ring)); break; case 6: - ering->vm_info.pp_dir_base = RING_PP_DIR_BASE_READ(ring); + ering->vm_info.pp_dir_base = + I915_READ(RING_PP_DIR_BASE_READ(ring)); break; } }
I have been seeing this for a long time, but ignored it because it's typically not terribly important. Recently, I really needed this info, and it was garbage. Proof that I should have fixed it sooner. Originally wrong from: commit 6c7a01ec3743a5a6ce9e53a69d7a6c2d8c715eb1 Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Thu Jan 30 00:19:40 2014 -0800 drm/i915: Capture PPGTT info on error capture Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_gpu_error.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)