Message ID | 20201001141253.1066836-1-boris.brezillon@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/sched: Avoid infinite waits in the drm_sched_entity_destroy() path |
On 01/10/2020 15:12, Boris Brezillon wrote:
> If we don't initialize the entity to idle and the entity is never
> scheduled before being destroyed we end up with an infinite wait in the
> destroy path.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

This seems reasonable to me - it looks like, in theory, if you very
quickly open, submit a job, and close you could trigger this (i.e. if
drm_sched_main() never actually enters the while loop).

You should CC some other folks, as this doesn't just affect Panfrost.

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
> This is something I noticed while debugging another issue on panfrost
> causing the scheduler to be in a weird state where new entities were no
> longer scheduled. This was causing all userspace threads trying to close
> their DRM fd to be blocked in kernel space waiting for this "entity is
> idle" event. I don't know if that fix is legitimate (now that we fixed
> the other bug we don't seem to end up in that state anymore), but I
> thought I'd share it anyway.
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 146380118962..f8ec277a6aa8 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -73,6 +73,9 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>
>  	init_completion(&entity->entity_idle);
>
> +	/* We start in an idle state. */
> +	complete(&entity->entity_idle);
> +
>  	spin_lock_init(&entity->rq_lock);
>  	spsc_queue_init(&entity->job_queue);
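To make the failure mode concrete, here is a minimal, hypothetical sketch of the `entity_idle` handshake (`toy_entity` and the helper names are invented for illustration; the real field and wait live in drivers/gpu/drm/scheduler/sched_entity.c):

```c
#include <linux/completion.h>

/* Invented, cut-down stand-in for drm_sched_entity: only the field
 * relevant to this patch. */
struct toy_entity {
	struct completion entity_idle;
};

/* Init path, mirroring drm_sched_entity_init(). Before this patch the
 * completion starts out "not done". */
static void toy_entity_init(struct toy_entity *entity)
{
	init_completion(&entity->entity_idle);
	/* The fix: mark the entity idle from the start. */
	complete(&entity->entity_idle);
}

/* Destroy path, mirroring the wait in the entity teardown code.
 * Without the complete() above, this sleeps forever when the
 * scheduler thread never selected the entity and therefore never
 * signalled entity_idle. */
static void toy_entity_destroy(struct toy_entity *entity)
{
	wait_for_completion(&entity->entity_idle);
}
```

With the patch applied, a destroy that races ahead of the first scheduling pass finds the completion already signalled and returns immediately instead of sleeping forever.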
If we don't initialize the entity to idle and the entity is never
scheduled before being destroyed we end up with an infinite wait in the
destroy path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
This is something I noticed while debugging another issue on panfrost
causing the scheduler to be in a weird state where new entities were no
longer scheduled. This was causing all userspace threads trying to close
their DRM fd to be blocked in kernel space waiting for this "entity is
idle" event. I don't know if that fix is legitimate (now that we fixed
the other bug we don't seem to end up in that state anymore), but I
thought I'd share it anyway.
---
 drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 146380118962..f8ec277a6aa8 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -73,6 +73,9 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,

 	init_completion(&entity->entity_idle);

+	/* We start in an idle state. */
+	complete(&entity->entity_idle);
+
 	spin_lock_init(&entity->rq_lock);
 	spsc_queue_init(&entity->job_queue);
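As an aside on why completing at init time is safe: kernel completions are counting, so each complete() banks one wakeup that a later wait_for_completion() consumes. A minimal illustration of the API semantics (not scheduler code):

```c
#include <linux/completion.h>

static void completion_semantics_demo(void)
{
	struct completion c;

	init_completion(&c);     /* done = 0: a waiter would block */
	complete(&c);            /* done = 1: one wakeup banked */
	wait_for_completion(&c); /* consumes it, returns immediately */

	/* A second wait_for_completion(&c) here would block until the
	 * next complete(&c), e.g. one issued by the scheduler thread
	 * once it has picked up the entity. */
}
```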