Message ID | 20170615201828.23144-11-michel.thierry@intel.com (mailing list archive) |
---|---|
State | New, archived |
Quoting Michel Thierry (2017-06-15 21:18:17)
> Users/tests relying on the total reset count will start seeing a smaller
> number since most of the hangs can be handled by engine reset.
> Note that if reset engine x, context a running on engine y will be unaware
> and unaffected.
>
> To start the discussion, include just a total engine reset count. If it
> is deemed useful, it can be extended to report each engine separately.
> Our igt's gem_reset_stats test will need changes to ignore the pad field,
> since it can now return reset_engine_count.
>
> v2: s/engine_reset/reset_engine/, use union in uapi to not break compatibility.
> v3: Keep rejecting attempts to use pad as input (Antonio)

Nope. You have now defined that pad works as an output-only parameter.
It is no longer in a state of flux and can be any value on input. The
test in gem_reset_stats becomes invalid.
-Chris
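To make the two positions concrete, here is a minimal sketch of both input-handling policies, written as ordinary userspace C rather than kernel code. The names `stats_args_sketch`, `fill_stats_strict`, and `fill_stats_output_only` are hypothetical and do not appear in the patch; the per-engine counts are passed in as a plain array purely for illustration. The second variant is one possible reading of "output-only": it stops validating the aliased field and clears it before accumulating, since the patch's handler uses `+=` on a field the caller may have filled with anything.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the ioctl argument struct. */
struct stats_args_sketch {
	uint32_t flags;
	uint32_t reset_count;
	union {
		uint32_t pad;
		uint32_t reset_engine_count;
	};
};

/* Policy kept in v3 of the patch: a non-zero pad on input is rejected. */
static int fill_stats_strict(struct stats_args_sketch *args,
			     const uint32_t *per_engine, size_t n)
{
	size_t i;

	if (args->flags || args->pad)
		return -EINVAL;

	/* pad was verified to be zero, so accumulating in place is safe. */
	for (i = 0; i < n; i++)
		args->reset_engine_count += per_engine[i];
	return 0;
}

/* Assumed "output-only" policy: the input value is ignored and overwritten. */
static int fill_stats_output_only(struct stats_args_sketch *args,
				  const uint32_t *per_engine, size_t n)
{
	size_t i;

	if (args->flags)
		return -EINVAL;

	/* Clear first: the caller may have left garbage in the aliased field. */
	args->reset_engine_count = 0;
	for (i = 0; i < n; i++)
		args->reset_engine_count += per_engine[i];
	return 0;
}
```

Under the strict policy a caller that leaves stale data in `pad` gets `-EINVAL`; under the output-only policy the same call succeeds and the stale value is discarded.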
On 15/06/17 14:14, Chris Wilson wrote:
> Quoting Michel Thierry (2017-06-15 21:18:17)
>> Users/tests relying on the total reset count will start seeing a smaller
>> number since most of the hangs can be handled by engine reset.
>> Note that if reset engine x, context a running on engine y will be unaware
>> and unaffected.
>>
>> To start the discussion, include just a total engine reset count. If it
>> is deemed useful, it can be extended to report each engine separately.
>> Our igt's gem_reset_stats test will need changes to ignore the pad field,
>> since it can now return reset_engine_count.
>>
>> v2: s/engine_reset/reset_engine/, use union in uapi to not break compatibility.
>> v3: Keep rejecting attempts to use pad as input (Antonio)
>
> Nope. You have now defined that pad works as an output-only parameter.
> It is no longer in a state of flux and can be any value on input. The
> test in gem_reset_stats becomes invalid.

Ok, I'll also have to make more changes to that test; the existing
subtests must enforce i915.reset=1 (or read both reset_count and
pad/reset_engine_count).
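The second option mentioned above (reading both counters) could look roughly like the following on the test side. This is only a sketch: `reset_stats_sketch` is a local mirror of the proposed layout using `uint32_t` instead of `__u32`, and `total_resets` and `reset_happened` are hypothetical helper names, not existing igt functions. It assumes a test only cares whether any kind of reset occurred between two samples.

```c
#include <stdint.h>

/* Hypothetical userspace mirror of the proposed struct drm_i915_reset_stats. */
struct reset_stats_sketch {
	uint32_t ctx_id;
	uint32_t flags;
	uint32_t reset_count;   /* full GPU resets */
	uint32_t batch_active;
	uint32_t batch_pending;
	union {
		uint32_t pad;
		uint32_t reset_engine_count; /* engine resets, all engines */
	};
};

/* Sum both counters so a hang handled by engine reset is still visible. */
static uint64_t total_resets(const struct reset_stats_sketch *s)
{
	return (uint64_t)s->reset_count + s->reset_engine_count;
}

/* Returns non-zero if any reset (full or per-engine) happened in between. */
static int reset_happened(const struct reset_stats_sketch *before,
			  const struct reset_stats_sketch *after)
{
	return total_resets(after) > total_resets(before);
}
```

A subtest would sample the stats before and after injecting a hang and check `reset_happened()` instead of comparing `reset_count` alone.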
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c5d1666d7071..04766fdcc4dc 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1029,6 +1029,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_reset_stats *args = data;
 	struct i915_gem_context *ctx;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 	int ret;
 
 	if (args->flags || args->pad)
@@ -1047,10 +1049,16 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 		return PTR_ERR(ctx);
 	}
 
-	if (capable(CAP_SYS_ADMIN))
+	if (capable(CAP_SYS_ADMIN)) {
 		args->reset_count = i915_reset_count(&dev_priv->gpu_error);
-	else
+		for_each_engine(engine, dev_priv, id)
+			args->reset_engine_count +=
+				i915_reset_engine_count(&dev_priv->gpu_error,
+							engine);
+	} else {
 		args->reset_count = 0;
+		args->reset_engine_count = 0;
+	}
 
 	args->batch_active = ctx->guilty_count;
 	args->batch_pending = ctx->active_count;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 15bc9f78ba4d..c599d47629ac 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1286,7 +1286,11 @@ struct drm_i915_reset_stats {
 	/* Number of batches lost pending for execution, for this context */
 	__u32 batch_pending;
 
-	__u32 pad;
+	union {
+		__u32 pad;
+		/* Engine resets since boot/module reload, for all contexts */
+		__u32 reset_engine_count;
+	};
 };
 
 struct drm_i915_gem_userptr {
Users/tests relying on the total reset count will start seeing a smaller
number since most of the hangs can be handled by engine reset.
Note that if reset engine x, context a running on engine y will be unaware
and unaffected.

To start the discussion, include just a total engine reset count. If it
is deemed useful, it can be extended to report each engine separately.
Our igt's gem_reset_stats test will need changes to ignore the pad field,
since it can now return reset_engine_count.

v2: s/engine_reset/reset_engine/, use union in uapi to not break compatibility.
v3: Keep rejecting attempts to use pad as input (Antonio)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 12 ++++++++++--
 include/uapi/drm/i915_drm.h             |  6 +++++-
 2 files changed, 15 insertions(+), 3 deletions(-)
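The v2 note about using a union to keep compatibility can be checked mechanically. The sketch below mirrors the struct before and after the change under the hypothetical names `stats_old` and `stats_new` (with `uint32_t` standing in for `__u32`) and asserts at compile time that the size and the offset of `pad` are unchanged. It assumes a C11 compiler, since it relies on anonymous unions and `static_assert`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch (hypothetical mirror). */
struct stats_old {
	uint32_t ctx_id;
	uint32_t flags;
	uint32_t reset_count;
	uint32_t batch_active;
	uint32_t batch_pending;
	uint32_t pad;
};

/* Layout after the patch: pad and reset_engine_count share storage. */
struct stats_new {
	uint32_t ctx_id;
	uint32_t flags;
	uint32_t reset_count;
	uint32_t batch_active;
	uint32_t batch_pending;
	union {
		uint32_t pad;
		uint32_t reset_engine_count;
	};
};

/* Existing binaries see the same struct size ... */
static_assert(sizeof(struct stats_old) == sizeof(struct stats_new),
	      "struct size must not change");
/* ... and find pad at the same offset as before. */
static_assert(offsetof(struct stats_old, pad) == offsetof(struct stats_new, pad),
	      "pad must stay at the same offset");
/* The new name is only an alias for the old slot. */
static_assert(offsetof(struct stats_new, pad) ==
	      offsetof(struct stats_new, reset_engine_count),
	      "reset_engine_count must alias pad");
```

Because the new field only renames an existing 32-bit slot, old userspace that still refers to `pad` keeps compiling and sees the same binary layout.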