[v2,1/1] drm/amdgpu: Use device wedged event

Message ID 20241216150250.38242-2-andrealmeid@igalia.com (mailing list archive)
State New
Series drm/amdgpu: Use device wedged event

Commit Message

André Almeida Dec. 16, 2024, 3:02 p.m. UTC
Use DRM's device wedged event to notify userspace that a reset has
happened. For now, only use the `none` method, which is meant for
telemetry capture.

In the future we might want to report a recovery method if the reset didn't
succeed.

Acked-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
v2: Only report reset if reset succeeded
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
 1 file changed, 4 insertions(+)
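
For readers on the receiving end of the event, here is a minimal userspace sketch of decoding the property. It assumes the uevent carries a property of the form `WEDGED=<method>[,<method>...]` with method names such as `none`, `rebind` and `bus-reset`, as described for the DRM wedged event; the enum, its values and the `parse_wedged()` helper are hypothetical names made up for this illustration and are not part of the patch or of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags for this sketch; these are NOT the kernel's definitions. */
enum wedge_method {
	WEDGE_NONE      = 1u << 0,	/* "none": telemetry capture only */
	WEDGE_REBIND    = 1u << 1,	/* "rebind" */
	WEDGE_BUS_RESET = 1u << 2,	/* "bus-reset" */
	WEDGE_UNKNOWN   = 1u << 3,	/* any name this sketch does not recognize */
};

/*
 * Parse one uevent property string, assumed to look like
 * "WEDGED=<m1>[,<m2>...]". Returns a bitmask of wedge_method flags,
 * or 0 if the string is not a WEDGED property (or has an empty value).
 */
unsigned int parse_wedged(const char *prop)
{
	static const char key[] = "WEDGED=";
	unsigned int mask = 0;
	const char *p;

	assert(prop != NULL);
	if (strncmp(prop, key, sizeof(key) - 1) != 0)
		return 0;

	p = prop + sizeof(key) - 1;
	while (*p) {
		/* Length of the current comma-separated token. */
		size_t len = strcspn(p, ",");

		if (len == 4 && strncmp(p, "none", 4) == 0)
			mask |= WEDGE_NONE;
		else if (len == 6 && strncmp(p, "rebind", 6) == 0)
			mask |= WEDGE_REBIND;
		else if (len == 9 && strncmp(p, "bus-reset", 9) == 0)
			mask |= WEDGE_BUS_RESET;
		else
			mask |= WEDGE_UNKNOWN;

		p += len;
		if (*p == ',')
			p++;	/* skip the separator */
	}
	return mask;
}
```

With this patch only `none` would be reported, so a listener built along these lines would see `WEDGE_NONE` and could collect telemetry without attempting any recovery.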

Comments

Christian König Dec. 16, 2024, 3:27 p.m. UTC | #1
On 16.12.24 at 16:02, André Almeida wrote:
> Use DRM's device wedged event to notify userspace that a reset had
> happened. For now, only use `none` method meant for telemetry
> capture.
>
> In the future we might want to report a recovery method if the reset didn't
> succeed.
>
> Acked-by: Shashank Sharma <shashank.sharma@amd.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> v2: Only report reset if reset succeeded
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 96316111300a..b0079d66d9e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -6057,6 +6057,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   		dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
>   
>   	atomic_set(&adev->reset_domain->reset_res, r);
> +
> +	if (r)
> +		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);


That was not what I meant. The idea was more like:

drm_dev_wedged_event(adev_to_drm(adev), r ? TBD : DRM_WEDGE_RECOVERY_NONE);

Regards,
Christian.


> +
>   	return r;
>   }
>
André Almeida Dec. 16, 2024, 3:42 p.m. UTC | #2
On 16/12/2024 12:27, Christian König wrote:
> On 16.12.24 at 16:02, André Almeida wrote:
>> Use DRM's device wedged event to notify userspace that a reset had
>> happened. For now, only use `none` method meant for telemetry
>> capture.
>>
>> In the future we might want to report a recovery method if the reset didn't
>> succeed.
>>
>> Acked-by: Shashank Sharma <shashank.sharma@amd.com>
>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> ---
>> v2: Only report reset if reset succeeded
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 96316111300a..b0079d66d9e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -6057,6 +6057,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>           dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
>>       atomic_set(&adev->reset_domain->reset_res, r);
>> +
>> +    if (r)
>> +        drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
> 
> 
> That was not what I meant. The idea was more like:
> 
> drm_dev_wedged_event(adev_to_drm(adev), r ? TBD : DRM_WEDGE_RECOVERY_NONE);
> 

Oops, I got that wrong indeed; I meant `if (!r)`. Sending a v3 now.

> Regards,
> Christian.
> 
> 
>> +
>>       return r;
>>   }
>
Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 96316111300a..b0079d66d9e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -6057,6 +6057,10 @@  int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 		dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
 
 	atomic_set(&adev->reset_domain->reset_res, r);
+
+	if (r)
+		drm_dev_wedged_event(adev_to_drm(adev), DRM_WEDGE_RECOVERY_NONE);
+
 	return r;
 }
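
The mismatch raised in the comments (the v2 changelog says "only report reset if reset succeeded", while the hunk tests `if (r)`) can be illustrated with a small userspace sketch. It assumes, following the thread, that `r == 0` means the reset succeeded. Everything prefixed `sketch_` or `SKETCH_` is a hypothetical stand-in for illustration only; the real code calls `drm_dev_wedged_event()` in kernel context, and this is not the v3 patch.

```c
#include <assert.h>

/* Stand-in value for this sketch; the patch itself passes DRM_WEDGE_RECOVERY_NONE. */
#define SKETCH_RECOVERY_NONE 1UL
/* In this sketch, 0 means "no event is sent". */
#define SKETCH_NO_EVENT 0UL

/* Condition as posted in v2: an event is sent only when r is non-zero (reset failed). */
unsigned long sketch_v2_as_posted(int r)
{
	return r ? SKETCH_RECOVERY_NONE : SKETCH_NO_EVENT;
}

/* Condition the changelog describes: an event is sent only when r is zero (reset succeeded). */
unsigned long sketch_v2_as_described(int r)
{
	return !r ? SKETCH_RECOVERY_NONE : SKETCH_NO_EVENT;
}

/* Usage: the two variants disagree for every value of r. */
void sketch_self_check(void)
{
	/* Successful reset (r == 0). */
	assert(sketch_v2_as_posted(0) == SKETCH_NO_EVENT);
	assert(sketch_v2_as_described(0) == SKETCH_RECOVERY_NONE);

	/* Failed reset (any non-zero r; -5 is an arbitrary example). */
	assert(sketch_v2_as_posted(-5) == SKETCH_RECOVERY_NONE);
	assert(sketch_v2_as_described(-5) == SKETCH_NO_EVENT);
}
```

The ternary form suggested in comment #1 goes one step further and sends an event in both cases, with the method for the failure case left as TBD in the thread.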