[2/3] drm/xe: Accumulate exec queue timestamp on destroy

Message ID 20241026062658.28060-3-lucas.demarchi@intel.com (mailing list archive)
State New
Series drm/xe: Fix races on fdinfo

Commit Message

Lucas De Marchi Oct. 26, 2024, 6:26 a.m. UTC
When the exec queue is destroyed, there's a race between a query to the
fdinfo and the exec queue value being updated: after the destroy ioctl,
if the fdinfo is queried before a call to guc_exec_queue_free_job(),
the wrong utilization is reported: it's not accumulated on the query
since the queue was removed from the array, and the value wasn't updated
yet by the free_job().

Explicitly accumulate the engine utilization so the right value is
visible once the ioctl returns.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2667
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_exec_queue.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Nirmoy Das Oct. 28, 2024, 12:46 p.m. UTC | #1
On 10/26/2024 8:26 AM, Lucas De Marchi wrote:
> When the exec queue is destroyed, there's a race between a query to the
> fdinfo and the exec queue value being updated: after the destroy ioctl,
> if the fdinfo is queried before a call to guc_exec_queue_free_job(),
> the wrong utilization is reported: it's not accumulated on the query
> since the queue was removed from the array, and the value wasn't updated
> yet by the free_job().
>
> Explicitly accumulate the engine utilization so the right value is
> visible once the ioctl returns.
>
> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2667
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

LGTM

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_exec_queue.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index d098d2dd1b2d..b15ca84b2422 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -829,6 +829,14 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
>  
>  	xe_exec_queue_kill(q);
>  
> +	/*
> +	 * After killing and destroying the exec queue, make sure userspace has
> +	 * an updated view of the run ticks, regardless if this was the last
> +	 * ref: since the exec queue is removed from xef->exec_queue.xa, a
> +	 * query to fdinfo after this returns could not account for this load.
> +	 */
> +	xe_exec_queue_update_run_ticks(q);
> +
>  	trace_xe_exec_queue_close(q);
>  	xe_exec_queue_put(q);
>
Patch

diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index d098d2dd1b2d..b15ca84b2422 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -829,6 +829,14 @@  int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
 
 	xe_exec_queue_kill(q);
 
+	/*
+	 * After killing and destroying the exec queue, make sure userspace has
+	 * an updated view of the run ticks, regardless if this was the last
+	 * ref: since the exec queue is removed from xef->exec_queue.xa, a
+	 * query to fdinfo after this returns could not account for this load.
+	 */
+	xe_exec_queue_update_run_ticks(q);
+
 	trace_xe_exec_queue_close(q);
 	xe_exec_queue_put(q);
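
The race described in the commit message can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not the driver code: every name below (model_file, model_queue, model_update_run_ticks, model_query, model_destroy, model_query_after_destroy) is hypothetical, a fixed array stands in for xef->exec_queue.xa, and the deferred cleanup done by free_job() is represented only by a comment. The model assumes a query refreshes just the queues still present in the lookup table, so a queue removed by destroy contributes nothing unless its ticks are accumulated before destroy returns.

```c
/* Userspace model of the accounting race; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 4

struct model_queue {
	uint64_t hw_ticks;      /* ticks the queue has run so far */
	uint64_t sampled_ticks; /* portion already folded into the client total */
};

struct model_file {
	uint64_t run_ticks;                     /* per-client accumulated total */
	struct model_queue *queues[MAX_QUEUES]; /* stands in for the xarray of live queues */
};

/* Fold any not-yet-accounted ticks of q into the per-client total. */
static void model_update_run_ticks(struct model_file *f, struct model_queue *q)
{
	f->run_ticks += q->hw_ticks - q->sampled_ticks;
	q->sampled_ticks = q->hw_ticks;
}

/* fdinfo-style query: refresh every queue still in the table, then report. */
static uint64_t model_query(struct model_file *f)
{
	for (size_t i = 0; i < MAX_QUEUES; i++)
		if (f->queues[i])
			model_update_run_ticks(f, f->queues[i]);
	return f->run_ticks;
}

/* Destroy: remove the queue from the table, optionally accumulating first. */
static void model_destroy(struct model_file *f, size_t id, bool accumulate)
{
	assert(id < MAX_QUEUES && f->queues[id]);

	struct model_queue *q = f->queues[id];

	f->queues[id] = NULL;
	if (accumulate)
		model_update_run_ticks(f, q);
	/*
	 * In this model, deferred cleanup (the free_job() analogue) would
	 * accumulate later; a query issued before that point sees only
	 * what has been folded into f->run_ticks so far.
	 */
}

/* One scenario: a queue runs 100 ticks, is destroyed, then queried at once. */
static uint64_t model_query_after_destroy(bool accumulate)
{
	struct model_queue q = { .hw_ticks = 100, .sampled_ticks = 0 };
	struct model_file f = { .run_ticks = 0, .queues = { &q } };

	model_destroy(&f, 0, accumulate);
	return model_query(&f);
}
```

In this model, model_query_after_destroy(false) reports 0 ticks because the queue is already gone from the table when the query runs, while model_query_after_destroy(true) reports the full 100, which mirrors the effect of calling the update helper in the destroy path.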