Message ID | 20241216150250.38242-2-andrealmeid@igalia.com (mailing list archive) |
---|---|
State | New |
Series | drm/amdgpu: Use device wedged event |
On 16.12.24 at 16:02, André Almeida wrote:
> Use DRM's device wedged event to notify userspace that a reset had
> happened. For now, only use `none` method meant for telemetry
> capture.
>
> In the future we might want to report a recovery method if the reset didn't
> succeed.
>
> Acked-by: Shashank Sharma <shashank.sharma@amd.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> v2: Only report reset if reset succeeded
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 96316111300a..b0079d66d9e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -6057,6 +6057,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>  	dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
>
>  	atomic_set(&adev->reset_domain->reset_res, r);
> +
> +	if (r)
> +		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);

That was not what I meant. The idea was more like:

drm_dev_wedged_event(adev_to_drm(adev), r ? TBD : DRM_WEDGE_RECOVERY_NONE);

Regards,
Christian.

> +
>  	return r;
>  }
>
On 16/12/2024 12:27, Christian König wrote:
> On 16.12.24 at 16:02, André Almeida wrote:
>> Use DRM's device wedged event to notify userspace that a reset had
>> happened. For now, only use `none` method meant for telemetry
>> capture.
>>
>> In the future we might want to report a recovery method if the reset
>> didn't succeed.
>>
>> Acked-by: Shashank Sharma <shashank.sharma@amd.com>
>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> ---
>> v2: Only report reset if reset succeeded
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 96316111300a..b0079d66d9e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -6057,6 +6057,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>  	dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
>>  	atomic_set(&adev->reset_domain->reset_res, r);
>> +
>> +	if (r)
>> +		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
>
> That was not what I meant. The idea was more like:
>
> drm_dev_wedged_event(adev_to_drm(adev), r ? TBD : DRM_WEDGE_RECOVERY_NONE);
>

Oops, I did it wrong indeed, I meant `if (!r)`. Sending a v3 now.

> Regards,
> Christian.
>
>> +
>>  	return r;
>>  }
>
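To make the difference between the posted `if (r)` and the intended `if (!r)` concrete, here is a minimal userspace sketch. It is not kernel code: `wedged_event_stub()`, `WEDGE_RECOVERY_NONE`, `notify_v2()`, `notify_v3()` and `events_sent` are hypothetical stand-ins for illustration, and the enum value is arbitrary. The ternary form suggested in the review is left out because the method for the failure case is still "TBD" in the thread.

```c
#include <stdio.h>

/* Stand-in for the kernel's DRM_WEDGE_RECOVERY_NONE; the value is arbitrary. */
enum wedge_method { WEDGE_RECOVERY_NONE = 1 };

/* Counts how many events the stub has emitted. */
int events_sent;

/* Hypothetical stub standing in for drm_dev_wedged_event(): logs and counts. */
void wedged_event_stub(enum wedge_method method)
{
	events_sent++;
	printf("wedged event, method=%d\n", (int)method);
}

/* v2 as posted: the event fires only when the reset failed (r != 0). */
void notify_v2(int r)
{
	if (r)
		wedged_event_stub(WEDGE_RECOVERY_NONE);
}

/* Intended behaviour per the v2 changelog: fire only when the reset succeeded (r == 0). */
void notify_v3(int r)
{
	if (!r)
		wedged_event_stub(WEDGE_RECOVERY_NONE);
}
```

Calling `notify_v2(0)` emits nothing while `notify_v3(0)` emits one event, which matches the "v2: Only report reset if reset succeeded" changelog line; with a non-zero `r` the two behave the other way around.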
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 96316111300a..b0079d66d9e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6057,6 +6057,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 	dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
 
 	atomic_set(&adev->reset_domain->reset_res, r);
+
+	if (r)
+		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
+
 	return r;
 }