
[RFC,06/10] drm/sched: Submit job before starting TDR

Message ID 20230404002211.3611376-7-matthew.brost@intel.com (mailing list archive)
State New, archived
Series Xe DRM scheduler and long running workload plans

Commit Message

Matthew Brost April 4, 2023, 12:22 a.m. UTC
If the TDR is set to a small enough value, it can fire before a job is
submitted in drm_sched_main. The job should always be submitted before
the TDR fires; fix this ordering.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Luben Tuikov May 4, 2023, 5:23 a.m. UTC | #1
On 2023-04-03 20:22, Matthew Brost wrote:
> If the TDR is set to a small enough value, it can fire before a job is
> submitted in drm_sched_main. The job should always be submitted before
> the TDR fires; fix this ordering.
> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 6ae710017024..4eac02d212c1 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1150,10 +1150,10 @@ static void drm_sched_main(struct work_struct *w)
>  		s_fence = sched_job->s_fence;
>  
>  		atomic_inc(&sched->hw_rq_count);
> -		drm_sched_job_begin(sched_job);
>  
>  		trace_drm_run_job(sched_job, entity);
>  		fence = sched->ops->run_job(sched_job);
> +		drm_sched_job_begin(sched_job);
>  		complete_all(&entity->entity_idle);
>  		drm_sched_fence_scheduled(s_fence);
>  

Not sure if this is correct. In drm_sched_job_begin() we add the job to the "pending_list"
(meaning it is pending execution in the hardware) and we also start a timeout timer. Both
of those should be started before the job is given to the hardware.

If the timeout is set to too small a value, then that should probably be fixed instead.

Regards,
Luben
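
For context, drm_sched_job_begin() at this point in the tree does roughly
the following (a paraphrased sketch; see
drivers/gpu/drm/scheduler/sched_main.c for the actual code). It both makes
the job visible on the scheduler's pending list and arms the timeout, which
is why reordering it against run_job() changes when the TDR can see the job:

static void drm_sched_job_begin(struct drm_sched_job *s_job)
{
	struct drm_gpu_scheduler *sched = s_job->sched;

	spin_lock(&sched->job_list_lock);
	/* Job becomes visible to the timeout/free paths... */
	list_add_tail(&s_job->list, &sched->pending_list);
	/* ...and the TDR work is (re)armed if it is not already pending. */
	drm_sched_start_timeout(sched);
	spin_unlock(&sched->job_list_lock);
}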
Matthew Brost July 31, 2023, 1 a.m. UTC | #2
On Thu, May 04, 2023 at 01:23:05AM -0400, Luben Tuikov wrote:
> On 2023-04-03 20:22, Matthew Brost wrote:
> > If the TDR is set to a small enough value, it can fire before a job is
> > submitted in drm_sched_main. The job should always be submitted before
> > the TDR fires; fix this ordering.
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 6ae710017024..4eac02d212c1 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1150,10 +1150,10 @@ static void drm_sched_main(struct work_struct *w)
> >  		s_fence = sched_job->s_fence;
> >  
> >  		atomic_inc(&sched->hw_rq_count);
> > -		drm_sched_job_begin(sched_job);
> >  
> >  		trace_drm_run_job(sched_job, entity);
> >  		fence = sched->ops->run_job(sched_job);
> > +		drm_sched_job_begin(sched_job);
> >  		complete_all(&entity->entity_idle);
> >  		drm_sched_fence_scheduled(s_fence);
> >  
> 
> Not sure if this is correct. In drm_sched_job_begin() we add the job to the "pending_list"
> (meaning it is pending execution in the hardware) and we also start a timeout timer. Both
> of those should be started before the job is given to the hardware.
> 

The correct solution is probably to add the job to the pending list
before run_job() and kick the TDR after run_job().

> If the timeout is set to too small a value, then that should probably be fixed instead.
>

Disagree, a user should be able to set the TDR value to anything they
want without breaking the DRM scheduler.

Matt

> Regards,
> Luben
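
A minimal sketch of the ordering proposed above, assuming
drm_sched_job_begin() were split in two; the helper names
drm_sched_job_add_pending() and drm_sched_job_start_timeout() are
hypothetical, purely for illustration:

		atomic_inc(&sched->hw_rq_count);

		/* Put the job on pending_list before the hardware can see
		 * it, so the completion/free paths never miss it.
		 */
		drm_sched_job_add_pending(sched_job);	/* hypothetical helper */

		trace_drm_run_job(sched_job, entity);
		fence = sched->ops->run_job(sched_job);

		/* Only start counting the timeout once the job has actually
		 * been handed to the hardware.
		 */
		drm_sched_job_start_timeout(sched_job);	/* hypothetical helper */

		complete_all(&entity->entity_idle);
		drm_sched_fence_scheduled(s_fence);

This would keep the requirement that the job is on the pending list before
the hardware can complete it, while still guaranteeing the TDR cannot fire
before run_job() has been called.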
Boris Brezillon July 31, 2023, 7:26 a.m. UTC | #3
+the PVR devs

On Mon, 31 Jul 2023 01:00:59 +0000
Matthew Brost <matthew.brost@intel.com> wrote:

> On Thu, May 04, 2023 at 01:23:05AM -0400, Luben Tuikov wrote:
> > On 2023-04-03 20:22, Matthew Brost wrote:  
> > > If the TDR is set to a small enough value, it can fire before a job is
> > > submitted in drm_sched_main. The job should always be submitted before
> > > the TDR fires; fix this ordering.
> > > 
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 6ae710017024..4eac02d212c1 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -1150,10 +1150,10 @@ static void drm_sched_main(struct work_struct *w)
> > >  		s_fence = sched_job->s_fence;
> > >  
> > >  		atomic_inc(&sched->hw_rq_count);
> > > -		drm_sched_job_begin(sched_job);
> > >  
> > >  		trace_drm_run_job(sched_job, entity);
> > >  		fence = sched->ops->run_job(sched_job);
> > > +		drm_sched_job_begin(sched_job);
> > >  		complete_all(&entity->entity_idle);
> > >  		drm_sched_fence_scheduled(s_fence);
> > >    
> > 
> > Not sure if this is correct. In drm_sched_job_begin() we add the job to the "pending_list"
> > (meaning it is pending execution in the hardware) and we also start a timeout timer. Both
> > of those should be started before the job is given to the hardware.
> >   
> 
> The correct solution is probably to add the job to the pending list
> before run_job() and kick the TDR after run_job().

This would make the PVR driver simpler too. Right now, the driver
iterates over the pending job list to signal jobs' done_fences, but
there's a race between the interrupt handler (which iterates over this
list to signal fences) and the drm_sched logic (which inserts the job
into the pending_list after run_job() returns). The race is taken care
of with an additional field pointing to the last submitted job [1], but
if we can get rid of that logic, that's for the best.

[1]https://gitlab.freedesktop.org/frankbinns/powervr/-/blob/powervr-next/drivers/gpu/drm/imagination/pvr_queue.h#L119
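
The race can be illustrated with a hypothetical (non-PVR) completion
handler that walks the scheduler's pending list to signal done fences;
my_queue_signal_done_fences(), my_job_seqno() and my_job_done_fence() are
made up for the sketch:

/* Hypothetical driver code, not taken from the PVR driver. */
static void my_queue_signal_done_fences(struct drm_gpu_scheduler *sched,
					u64 completed_seqno)
{
	struct drm_sched_job *job;

	spin_lock(&sched->job_list_lock);
	list_for_each_entry(job, &sched->pending_list, list) {
		if (my_job_seqno(job) > completed_seqno)
			break;
		dma_fence_signal(my_job_done_fence(job));
	}
	spin_unlock(&sched->job_list_lock);
	/*
	 * With jobs added to pending_list only after run_job() returns, a
	 * quickly-completing job may not be on the list yet when this runs,
	 * which is why the driver currently tracks the last submitted job
	 * separately. Adding the job to pending_list before run_job() would
	 * make this walk sufficient on its own.
	 */
}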
Luben Tuikov Aug. 31, 2023, 7:48 p.m. UTC | #4
On 2023-07-31 03:26, Boris Brezillon wrote:
> +the PVR devs
> 
> On Mon, 31 Jul 2023 01:00:59 +0000
> Matthew Brost <matthew.brost@intel.com> wrote:
> 
>> On Thu, May 04, 2023 at 01:23:05AM -0400, Luben Tuikov wrote:
>>> On 2023-04-03 20:22, Matthew Brost wrote:  
>>>> If the TDR is set to a small enough value, it can fire before a job is
>>>> submitted in drm_sched_main. The job should always be submitted before
>>>> the TDR fires; fix this ordering.
>>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>>  drivers/gpu/drm/scheduler/sched_main.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6ae710017024..4eac02d212c1 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -1150,10 +1150,10 @@ static void drm_sched_main(struct work_struct *w)
>>>>  		s_fence = sched_job->s_fence;
>>>>  
>>>>  		atomic_inc(&sched->hw_rq_count);
>>>> -		drm_sched_job_begin(sched_job);
>>>>  
>>>>  		trace_drm_run_job(sched_job, entity);
>>>>  		fence = sched->ops->run_job(sched_job);
>>>> +		drm_sched_job_begin(sched_job);
>>>>  		complete_all(&entity->entity_idle);
>>>>  		drm_sched_fence_scheduled(s_fence);
>>>>    
>>>
>>> Not sure if this is correct. In drm_sched_job_begin() we add the job to the "pending_list"
>>> (meaning it is pending execution in the hardware) and we also start a timeout timer. Both
>>> of those should be started before the job is given to the hardware.
>>>   
>>
>> The correct solution is probably to add the job to the pending list
>> before run_job() and kick the TDR after run_job().
> 
> This would make the PVR driver simpler too. Right now, the driver
> iterates over the pending job list to signal jobs' done_fences, but
> there's a race between the interrupt handler (which iterates over this
> list to signal fences) and the drm_sched logic (which inserts the job
> into the pending_list after run_job() returns). The race is taken care
> of with an additional field pointing to the last submitted job [1], but
> if we can get rid of that logic, that's for the best.
> 
> [1]https://gitlab.freedesktop.org/frankbinns/powervr/-/blob/powervr-next/drivers/gpu/drm/imagination/pvr_queue.h#L119

(Catching up, chronologically, after vacation...)

I agree with both emails above. I'm aware of this race in the DRM scheduler,
but have been wary of opening a can of worms by fixing it.

But, yes, the classic way (which would avoid races) is indeed to add the job
to the "pending list" before run_job(), as we cannot guarantee the state
of the job after run_job(). Also, ideally we want to stop all submissions,
then call the TDR, recover/reset/etc., and then resume incoming submissions.
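
The "stop all submissions, recover, resume" flow is roughly what drivers
already implement in their drm_sched_backend_ops.timedout_job callback; a
simplified sketch follows (my_reset_hw() is hypothetical, and the exact
helpers and return values vary by driver and scheduler version):

static enum drm_gpu_sched_stat my_timedout_job(struct drm_sched_job *bad_job)
{
	struct drm_gpu_scheduler *sched = bad_job->sched;

	/* Park the scheduler so no further jobs reach the hardware. */
	drm_sched_stop(sched, bad_job);

	/* Driver-specific engine reset/recovery goes here. */
	my_reset_hw(sched);	/* hypothetical */

	/* Re-queue the jobs that were pending at the time of the hang... */
	drm_sched_resubmit_jobs(sched);

	/* ...and let the scheduler accept and submit work again. */
	drm_sched_start(sched, true);

	return DRM_GPU_SCHED_STAT_NOMINAL;
}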

Patch

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 6ae710017024..4eac02d212c1 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1150,10 +1150,10 @@  static void drm_sched_main(struct work_struct *w)
 		s_fence = sched_job->s_fence;
 
 		atomic_inc(&sched->hw_rq_count);
-		drm_sched_job_begin(sched_job);
 
 		trace_drm_run_job(sched_job, entity);
 		fence = sched->ops->run_job(sched_job);
+		drm_sched_job_begin(sched_job);
 		complete_all(&entity->entity_idle);
 		drm_sched_fence_scheduled(s_fence);