Message ID | 20250326070441.1515428-1-liuqianyi125@gmail.com (mailing list archive)
---|---
State | New
Series | drm/sched: Add return value for drm_sched_entity_push_job
Hello,

thanks for your patch

On Wed, 2025-03-26 at 15:04 +0800, Qianyi Liu wrote:
> Currently drm_sched_entity_push_job() has no return value to indicate
> operation status. This makes it difficult for callers to handle error
> conditions properly.
>
> Add an int return value to drm_sched_entity_push_job() that returns 0
> on success or a negative error code (e.g., -EINVAL) on failure. This
> allows callers to:
>
> 1. Detect when job submission fails
> 2. Perform proper cleanup (e.g., release job and fence allocations)
> 3. Avoid potential memory leaks in error paths
>
> Signed-off-by: Qianyi Liu <liuqianyi125@gmail.com>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 8 ++++++--
>  include/drm/gpu_scheduler.h              | 2 +-
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index bd39db7bb240..f31964e76062 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -579,8 +579,10 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>   * fence sequence number this function should be called with drm_sched_job_arm()
>   * under common lock for the struct drm_sched_entity that was set up for
>   * @sched_job in drm_sched_job_init().
> + *
> + * Returns 0 on success or a negative error code on failure.
>   */
> -void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> +int drm_sched_entity_push_job(struct drm_sched_job *sched_job)

I am not convinced this is a justified change, because:

1. The entity being dead is the only error that might occur
2. That error is extremely unlikely
3. It does not occur in all existing upstream drivers
4. Drivers typically first prevent userspace from pushing stuff to an
   entity, then tear the entity down, then (in case of firmware
   schedulers) tear the entire scheduler down.
5. Your patch doesn't add an upstream user for the error code, showing
   how it's better / different from how existing drivers handle their
   entity-pushing.

Thus, pushing to a killed entity is a gross misdesign of the driver,
which should become visible during development and can't be handled
during runtime in a reasonable manner.

P.

>  {
>  	struct drm_sched_entity *entity = sched_job->entity;
>  	bool first;
> @@ -609,7 +611,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
>  		spin_unlock(&entity->lock);
>
>  		DRM_ERROR("Trying to push to a killed entity\n");
> -		return;
> +		return -EINVAL;
>  	}
>
>  	rq = entity->rq;
> @@ -626,5 +628,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
>
>  		drm_sched_wakeup(sched);
>  	}
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL(drm_sched_entity_push_job);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 50928a7ae98e..48a263571bab 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -589,7 +589,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  			struct drm_sched_entity *entity,
>  			u32 credits, void *owner);
>  void drm_sched_job_arm(struct drm_sched_job *job);
> -void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> +int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
>  int drm_sched_job_add_dependency(struct drm_sched_job *job,
>  				struct dma_fence *fence);
>  int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,