| Message ID | 20240924101914.2713-3-tursulin@igalia.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | DRM scheduler fixes and improvements |
I've pushed the first to drm-misc-next, but that one here fails to apply cleanly.

Christian.

On 24.09.24 12:19, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> Since drm_sched_entity_modify_sched() can modify the entity's run queue,
> let's make sure to only dereference the pointer once so both adding and
> waking up are guaranteed to be consistent.
>
> The alternative of moving the spin_unlock to after the wake-up would for
> now be more problematic since the same lock is taken inside
> drm_sched_rq_update_fifo().
>
> v2:
>  * Improve commit message. (Philipp)
>  * Cache the scheduler pointer directly. (Christian)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Luben Tuikov <ltuikov89@gmail.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: David Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Philipp Stanner <pstanner@redhat.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: <stable@vger.kernel.org> # v5.7+
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> [diff snipped -- reproduced in full at the bottom of this page]
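The race being fixed is easier to see in isolation. Below is a minimal user-space sketch of the pattern -- illustrative only, not kernel code: the types and the helpers add_entity() and wake_scheduler() are hypothetical stand-ins for their drm_sched counterparts, and the buggy variant mirrors the pre-patch shape in which entity->rq is dereferenced a second time after the lock is dropped.

```c
/*
 * Minimal user-space sketch of the race, not kernel code. All names
 * here are hypothetical stand-ins for the drm_sched equivalents.
 */
#include <pthread.h>
#include <stdio.h>

struct rq_like { int sched_id; };

struct entity_like {
	pthread_mutex_t rq_lock;
	struct rq_like *rq;	/* may be retargeted concurrently */
};

static void add_entity(struct rq_like *rq, struct entity_like *e)
{
	(void)e;
	printf("added to run queue of scheduler %d\n", rq->sched_id);
}

static void wake_scheduler(int sched_id)
{
	printf("woke scheduler %d\n", sched_id);
}

/* Old shape: e->rq is dereferenced a second time after the unlock. */
static void push_job_buggy(struct entity_like *e)
{
	pthread_mutex_lock(&e->rq_lock);
	add_entity(e->rq, e);		/* queued on e->rq as it is right now */
	pthread_mutex_unlock(&e->rq_lock);

	/* a concurrent retarget of e->rq can land in this window... */
	wake_scheduler(e->rq->sched_id);	/* ...waking a scheduler the
						 * entity was never added to */
}

/* Fixed shape, mirroring the patch: dereference once, under the lock. */
static void push_job_fixed(struct entity_like *e)
{
	struct rq_like *rq;
	int sched_id;

	pthread_mutex_lock(&e->rq_lock);
	rq = e->rq;			/* single dereference */
	sched_id = rq->sched_id;	/* wake-up target cached under the lock */
	add_entity(rq, e);
	pthread_mutex_unlock(&e->rq_lock);

	wake_scheduler(sched_id);	/* guaranteed consistent with the add */
}

int main(void)
{
	struct rq_like rq = { .sched_id = 0 };
	struct entity_like e = { PTHREAD_MUTEX_INITIALIZER, &rq };

	push_job_buggy(&e);
	push_job_fixed(&e);
	return 0;
}
```

If another thread retargets e->rq in the unlocked window, the buggy variant can wake a scheduler the entity was never queued on; the fixed variant cannot, because the wake-up target was captured while the lock was held.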
On 24/09/2024 14:55, Christian König wrote:
> I've pushed the first to drm-misc-next, but that one here fails to apply
> cleanly.

This appears to be due to 440d52b370b0 ("drm/sched: Fix dynamic job-flow
control race") in drm-misc-fixes.

In theory 1-3 from my series are fixes. Should they also go to
drm-misc-fixes? I am not too familiar with the drm-misc flow. Or does the
series now need to wait for some backmerge?

Regards,

Tvrtko

> [snip]
On 24.09.24 16:12, Tvrtko Ursulin wrote:
>
> On 24/09/2024 14:55, Christian König wrote:
>> I've pushed the first to drm-misc-next, but that one here fails to
>> apply cleanly.
>
> This appears to be due to 440d52b370b0 ("drm/sched: Fix dynamic job-flow
> control race") in drm-misc-fixes.
>
> In theory 1-3 from my series are fixes. Should they also go to
> drm-misc-fixes? I am not too familiar with the drm-misc flow.

Ah shit, in that case you should have split the patches up into fixes
and next. Going to push the first 3 to fixes.

> Or does the series now need to wait for some backmerge?

Are the remaining 3 patches independent? If not then we need to wait for
a backmerge.

Regards,
Christian.

> [snip]
On 24/09/2024 15:20, Christian König wrote:
> On 24.09.24 16:12, Tvrtko Ursulin wrote:
>>
>> On 24/09/2024 14:55, Christian König wrote:
>>> I've pushed the first to drm-misc-next, but that one here fails to
>>> apply cleanly.
>>
>> This appears to be due to 440d52b370b0 ("drm/sched: Fix dynamic job-flow
>> control race") in drm-misc-fixes.
>>
>> In theory 1-3 from my series are fixes. Should they also go to
>> drm-misc-fixes? I am not too familiar with the drm-misc flow.
>
> Ah shit, in that case you should have split the patches up into fixes
> and next. Going to push the first 3 to fixes.

Sorry, my drm-intel ways of thinking (cherry-picked fixes) are hard to
get rid of. Hence the series was structured as 1-3 fixes, 4-8 refactors
etc. Now it appears it is too late to pull the first one out of
drm-misc-next.

>> Or does the series now need to wait for some backmerge?
>
> Are the remaining 3 patches independent? If not then we need to wait
> for a backmerge.

These are independent:

Fixes:
 1/8 "drm/sched: Add locking to drm_sched_entity_modify_sched"

Not fixes:
 5/8 "drm/sched: Stop setting current entity in FIFO mode"
 6/8 "drm/sched: Re-order struct drm_sched_rq members for clarity"

While the rest touch at least some common areas. 2/8 and 3/8 are also
fixes. 4/8, 7/8 and 8/8 are not fixes but depend on 2/8 and 3/8.

Regards,

Tvrtko

> [snip]
```diff
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 0e002c17fcb6..a75eede8bf8d 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -599,6 +599,9 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 
 	/* first job wakes up scheduler */
 	if (first) {
+		struct drm_gpu_scheduler *sched;
+		struct drm_sched_rq *rq;
+
 		/* Add the entity to the run queue */
 		spin_lock(&entity->rq_lock);
 		if (entity->stopped) {
@@ -608,13 +611,16 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 			return;
 		}
 
-		drm_sched_rq_add_entity(entity->rq, entity);
+		rq = entity->rq;
+		sched = rq->sched;
+
+		drm_sched_rq_add_entity(rq, entity);
 		spin_unlock(&entity->rq_lock);
 
 		if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
 			drm_sched_rq_update_fifo(entity, submit_ts);
 
-		drm_sched_wakeup(entity->rq->sched);
+		drm_sched_wakeup(sched);
 	}
 }
 EXPORT_SYMBOL(drm_sched_entity_push_job);
```
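As for why the patch caches the pointers instead of simply holding entity->rq_lock across the wake-up: the commit message notes that drm_sched_rq_update_fifo() takes the same lock internally. Reusing the hypothetical types and helpers from the user-space sketch earlier on this page, the would-be self-deadlock looks like this (illustrative only, not a proposed change):

```c
/*
 * Illustrative only, reusing entity_like, add_entity() and
 * wake_scheduler() from the sketch above. update_fifo_like() models
 * drm_sched_rq_update_fifo(), which acquires entity->rq_lock internally.
 */
static void update_fifo_like(struct entity_like *e)
{
	pthread_mutex_lock(&e->rq_lock);	/* same lock, taken again */
	/* ... reposition the entity in the FIFO ... */
	pthread_mutex_unlock(&e->rq_lock);
}

static void push_job_extended_critical_section(struct entity_like *e)
{
	pthread_mutex_lock(&e->rq_lock);
	add_entity(e->rq, e);
	update_fifo_like(e);	/* blocks forever: rq_lock is already held,
				 * and default pthread mutexes (like kernel
				 * spinlocks) are not recursive */
	wake_scheduler(e->rq->sched_id);
	pthread_mutex_unlock(&e->rq_lock);
}
```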