diff mbox series

[v3,3/3] drm/i915: Use device wedged event

Message ID 20240902074859.2992849-4-raag.jadav@intel.com (mailing list archive)
State New, archived
Series Introduce DRM device wedged event

Commit Message

Raag Jadav Sept. 2, 2024, 7:48 a.m. UTC
Now that the DRM core supports the device wedged event, make use of it.
With this in place, userspace will be notified of a wedged device on
GT reset failure.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Aravind Iddamsetty Sept. 2, 2024, 8:52 a.m. UTC | #1
On 02/09/24 13:18, Raag Jadav wrote:
> Now that we have device wedged event supported by DRM core, make use
> of it. With this in place, userspace will be notified of wedged device
> on gt reset failure.
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 735cd23a43c6..60d09ec536c4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1409,6 +1409,8 @@ static void intel_gt_reset_global(struct intel_gt *gt,
>  
>  	if (!test_bit(I915_WEDGED, &gt->reset.flags))
>  		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
> +	else
> +		drm_dev_wedged(&gt->i915->drm);
>  }
Rather than intel_gt_reset_global, __intel_gt_set_wedged looks to be
the appropriate place, since that is where the device is actually
declared wedged, and it would cover all call sites too.

Thanks,
Aravind.
>  
>  /**
Raag Jadav Sept. 3, 2024, 7:03 a.m. UTC | #2
On Mon, Sep 02, 2024 at 02:22:21PM +0530, Aravind Iddamsetty wrote:
> 
> On 02/09/24 13:18, Raag Jadav wrote:
> > Now that we have device wedged event supported by DRM core, make use
> > of it. With this in place, userspace will be notified of wedged device
> > on gt reset failure.
> >
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_reset.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index 735cd23a43c6..60d09ec536c4 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -1409,6 +1409,8 @@ static void intel_gt_reset_global(struct intel_gt *gt,
> >  
> >  	if (!test_bit(I915_WEDGED, &gt->reset.flags))
> >  		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
> > +	else
> > +		drm_dev_wedged(&gt->i915->drm);
> >  }
> Rather than intel_gt_reset_global, __intel_gt_set_wedged looks to be
> the appropriate place, since that is where the device is actually
> declared wedged, and it would cover all call sites too.

Which is why it may not be the appropriate place IMHO.
We'd want to make sure the device is _really_ unrecoverable when we
choose to send the event.

Raag
Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 735cd23a43c6..60d09ec536c4 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1409,6 +1409,8 @@ static void intel_gt_reset_global(struct intel_gt *gt,
 
 	if (!test_bit(I915_WEDGED, &gt->reset.flags))
 		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
+	else
+		drm_dev_wedged(&gt->i915->drm);
 }
 
 /**
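
For readers wondering what "userspace will be notified" could look like on the receiving side, here is a minimal sketch of a uevent payload check. It relies only on the general kernel uevent layout (NUL-terminated "KEY=VALUE" strings packed back to back). The key name "WEDGED" and the helper names uevent_get() and uevent_is_wedged() are assumptions made for illustration; this patch does not show the string that drm_dev_wedged() emits, so check the DRM core patch in this series for the real one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a uevent payload: a buffer of NUL-terminated
 * "KEY=VALUE" strings packed back to back.
 * Returns a pointer to the NUL-terminated value inside `buf`,
 * or NULL if the key is absent.
 */
static const char *uevent_get(const char *buf, size_t len, const char *key)
{
	size_t klen = strlen(key);
	size_t off = 0;

	while (off < len) {
		const char *entry = buf + off;
		const char *end = memchr(entry, '\0', len - off);
		size_t elen;

		/* Unterminated trailing entry: treat as malformed and stop. */
		if (!end)
			break;

		elen = (size_t)(end - entry);
		if (elen > klen && entry[klen] == '=' &&
		    memcmp(entry, key, klen) == 0)
			return entry + klen + 1;

		off += elen + 1;
	}

	return NULL;
}

/*
 * ASSUMPTION: the wedged event carries a "WEDGED=..." entry.
 * The key name is hypothetical and not taken from this patch.
 */
static int uevent_is_wedged(const char *buf, size_t len)
{
	return uevent_get(buf, len, "WEDGED") != NULL;
}
```

A listener would typically fill the buffer from a NETLINK_KOBJECT_UEVENT socket (or let libudev do the parsing) and pass it to uevent_is_wedged() together with the received length.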