Message ID | 20241026062658.28060-3-lucas.demarchi@intel.com (mailing list archive) |
---|---|
State | New |
Series | drm/xe: Fix races on fdinfo |
On 10/26/2024 8:26 AM, Lucas De Marchi wrote:
> When the exec queue is destroyed, there's a race between a query to the
> fdinfo and the exec queue value being updated: after the destroy ioctl,
> if the fdinfo is queried before a call to guc_exec_queue_free_job(),
> the wrong utilization is reported: it's not accumulated on the query
> since the queue was removed from the array, and the value wasn't updated
> yet by the free_job().
>
> Explicitly accumulate the engine utilization so the right value is
> visible after the ioctl return.
>
> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2667
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

LGTM
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>

> ---
>  drivers/gpu/drm/xe/xe_exec_queue.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index d098d2dd1b2d..b15ca84b2422 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -829,6 +829,14 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
>
>  	xe_exec_queue_kill(q);
>
> +	/*
> +	 * After killing and destroying the exec queue, make sure userspace has
> +	 * an updated view of the run ticks, regardless if this was the last
> +	 * ref: since the exec queue is removed from xef->exec_queue.xa, a
> +	 * query to fdinfo after this returns could not account for this load.
> +	 */
> +	xe_exec_queue_update_run_ticks(q);
> +
>  	trace_xe_exec_queue_close(q);
>  	xe_exec_queue_put(q);
>
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index d098d2dd1b2d..b15ca84b2422 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -829,6 +829,14 @@ int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,

 	xe_exec_queue_kill(q);

+	/*
+	 * After killing and destroying the exec queue, make sure userspace has
+	 * an updated view of the run ticks, regardless if this was the last
+	 * ref: since the exec queue is removed from xef->exec_queue.xa, a
+	 * query to fdinfo after this returns could not account for this load.
+	 */
+	xe_exec_queue_update_run_ticks(q);
+
 	trace_xe_exec_queue_close(q);
 	xe_exec_queue_put(q);
When the exec queue is destroyed, there's a race between a query to the
fdinfo and the exec queue value being updated: after the destroy ioctl,
if the fdinfo is queried before a call to guc_exec_queue_free_job(),
the wrong utilization is reported: it's not accumulated on the query
since the queue was removed from the array, and the value wasn't updated
yet by the free_job().

Explicitly accumulate the engine utilization so the right value is
visible after the ioctl return.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2667
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_exec_queue.c | 8 ++++++++
 1 file changed, 8 insertions(+)
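
The race described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: every name below (`model_queue`, `model_client`, `model_update_run_ticks`, `model_fdinfo_query`, `model_destroy`, `fdinfo_after_destroy`) is hypothetical, the fixed-size array stands in for `xef->exec_queue.xa`, and the deferred `free_job()` path is modeled simply by not running it before the query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 4

/* Hypothetical stand-in for an exec queue: ticks not yet folded into the client total. */
struct model_queue {
	uint64_t pending_ticks;
};

/* Hypothetical stand-in for the per-file client state. */
struct model_client {
	uint64_t run_ticks;                     /* accumulated utilization */
	struct model_queue *queues[MAX_QUEUES]; /* live queues (models the xarray) */
};

/* Fold a queue's pending ticks into the client total. */
static void model_update_run_ticks(struct model_client *c, struct model_queue *q)
{
	c->run_ticks += q->pending_ticks;
	q->pending_ticks = 0;
}

/* fdinfo query: only queues still present in the array get refreshed. */
static uint64_t model_fdinfo_query(struct model_client *c)
{
	for (size_t i = 0; i < MAX_QUEUES; i++)
		if (c->queues[i])
			model_update_run_ticks(c, c->queues[i]);
	return c->run_ticks;
}

/*
 * Destroy path: the queue leaves the array immediately. Without the explicit
 * accumulation, its pending ticks stay invisible until a deferred job runs.
 */
static void model_destroy(struct model_client *c, size_t id, bool accumulate)
{
	assert(id < MAX_QUEUES && c->queues[id] != NULL);

	struct model_queue *q = c->queues[id];

	c->queues[id] = NULL;
	if (accumulate)
		model_update_run_ticks(c, q);
}

/* Value an fdinfo query sees right after destroy returns, before any deferred job. */
static uint64_t fdinfo_after_destroy(bool accumulate)
{
	struct model_queue q = { .pending_ticks = 50 };
	struct model_client c = { .run_ticks = 100 };

	c.queues[0] = &q;
	model_destroy(&c, 0, accumulate);
	return model_fdinfo_query(&c);
}
```

In this model, `fdinfo_after_destroy(false)` misses the 50 pending ticks of the destroyed queue, while `fdinfo_after_destroy(true)` reports them, which mirrors the effect the patch aims for by calling `xe_exec_queue_update_run_ticks(q)` in the destroy ioctl.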