Message ID | 20220916204823.1897089-1-ashutosh.dixit@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/gt: Bump the reset-failure timeout to 60s |
On Fri, Sep 16, 2022 at 01:48:23PM -0700, Ashutosh Dixit wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> If attempting to perform a GT reset takes longer than 5 seconds (including
> resetting the display for gen3/4), then we declare all hope lost and
> discard all user work and wedge the device to prevent further
> misbehaviour. 5 seconds is too short a time for such drastic action, as
> we may be stuck on other timeouts and watchdogs. If we allow a little
> bit longer before hitting the big red button, we should at the very
> least capture other hung task indicators pointing towards the reason why
> the reset was hanging; and allow more marginal cases the extra headroom
> to complete the reset without further collateral damage.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Seems reasonable.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b36674356986..3159df6cdd49 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
>  	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
>
>  	/* Use a watchdog to ensure that our reset completes */
> -	intel_wedge_on_timeout(&w, gt, 5 * HZ) {
> +	intel_wedge_on_timeout(&w, gt, 60 * HZ) {
>  		intel_display_prepare_reset(gt->i915);
>
>  		intel_gt_reset(gt, engine_mask, reason);
> --
> 2.34.1

On Fri, Sep 16, 2022 at 01:48:23PM -0700, Ashutosh Dixit wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> If attempting to perform a GT reset takes longer than 5 seconds (including
> resetting the display for gen3/4), then we declare all hope lost and
> discard all user work and wedge the device to prevent further
> misbehaviour. 5 seconds is too short a time for such drastic action, as
> we may be stuck on other timeouts and watchdogs. If we allow a little
> bit longer before hitting the big red button, we should at the very
> least capture other hung task indicators pointing towards the reason why
> the reset was hanging; and allow more marginal cases the extra headroom
> to complete the reset without further collateral damage.
>
> Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

When handling someone else's patch, please add your Signed-off-by here as
well.

> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b36674356986..3159df6cdd49 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
>  	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
>
>  	/* Use a watchdog to ensure that our reset completes */
> -	intel_wedge_on_timeout(&w, gt, 5 * HZ) {
> +	intel_wedge_on_timeout(&w, gt, 60 * HZ) {
>  		intel_display_prepare_reset(gt->i915);
>
>  		intel_gt_reset(gt, engine_mask, reason);
> --
> 2.34.1

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b36674356986..3159df6cdd49 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1278,7 +1278,7 @@ static void intel_gt_reset_global(struct intel_gt *gt,
 	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);

 	/* Use a watchdog to ensure that our reset completes */
-	intel_wedge_on_timeout(&w, gt, 5 * HZ) {
+	intel_wedge_on_timeout(&w, gt, 60 * HZ) {
 		intel_display_prepare_reset(gt->i915);

 		intel_gt_reset(gt, engine_mask, reason);
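
For readers less familiar with the `N * HZ` idiom in the hunk above: HZ is the number of timer ticks (jiffies) per second, so `5 * HZ` and `60 * HZ` are 5-second and 60-second timeouts expressed in ticks. The following is only a minimal user-space sketch of that arithmetic and of a wrap-safe deadline check, not the i915 watchdog itself. `SKETCH_HZ` and every `sketch_*` name are hypothetical and exist only for this illustration; the real tick rate comes from the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick rate for this sketch; the kernel's HZ is set by its config. */
#define SKETCH_HZ 250u

/* Convert seconds to ticks, mirroring the "N * HZ" idiom used in the patch. */
static inline uint32_t sketch_secs_to_ticks(uint32_t secs)
{
	return secs * SKETCH_HZ;
}

/*
 * Wrap-safe deadline check: true once at least `timeout` ticks have elapsed
 * since `start`. The unsigned subtraction stays correct across counter wrap.
 */
static inline bool sketch_timed_out(uint32_t start, uint32_t now, uint32_t timeout)
{
	return (uint32_t)(now - start) >= timeout;
}

/* Self-check: a reset that takes 10 seconds trips the old limit, not the new one. */
static inline void sketch_self_check(void)
{
	uint32_t start = 0xFFFFFF00u; /* deliberately close to wrap-around */
	uint32_t now = start + sketch_secs_to_ticks(10);

	assert(sketch_timed_out(start, now, sketch_secs_to_ticks(5)));
	assert(!sketch_timed_out(start, now, sketch_secs_to_ticks(60)));
}
```

Under these assumptions, a reset that needs 10 seconds would have wedged the device with the 5-second limit but completes within the 60-second one, which is the headroom the commit message argues for.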