| Message ID | 20190926141630.14258-1-steven.price@arm.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | [v4] drm: Don't free jobs in wait_event_interruptible() |
On 9/26/19 10:16 AM, Steven Price wrote: > drm_sched_cleanup_jobs() attempts to free finished jobs, however because > it is called as the condition of wait_event_interruptible() it must not > sleep. Unfortuantly some free callbacks (notibly for Panfrost) do sleep. > > Instead let's rename drm_sched_cleanup_jobs() to > drm_sched_get_cleanup_job() and simply return a job for processing if > there is one. The caller can then call the free_job() callback outside > the wait_event_interruptible() where sleeping is possible before > re-checking and returning to sleep if necessary. > > Signed-off-by: Steven Price <steven.price@arm.com> > --- > Changes from v3: > * drm_sched_main() re-arms the timeout for the next job after calling > free_job() > > drivers/gpu/drm/scheduler/sched_main.c | 45 +++++++++++++++----------- > 1 file changed, 26 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index 9a0ee74d82dc..148468447ba9 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb) > } > > /** > - * drm_sched_cleanup_jobs - destroy finished jobs > + * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed > * > * @sched: scheduler instance > * > - * Remove all finished jobs from the mirror list and destroy them. > + * Returns the next finished job from the mirror list (if there is one) > + * ready for it to be destroyed. > */ > -static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) > +static struct drm_sched_job * > +drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) > { > + struct drm_sched_job *job = NULL; > unsigned long flags; > > /* Don't destroy jobs while the timeout worker is running */ > if (sched->timeout != MAX_SCHEDULE_TIMEOUT && > !cancel_delayed_work(&sched->work_tdr)) > - return; > - > + return NULL; > > - while (!list_empty(&sched->ring_mirror_list)) { > - struct drm_sched_job *job; > + spin_lock_irqsave(&sched->job_list_lock, flags); > > - job = list_first_entry(&sched->ring_mirror_list, > + job = list_first_entry_or_null(&sched->ring_mirror_list, > struct drm_sched_job, node); > - if (!dma_fence_is_signaled(&job->s_fence->finished)) > - break; > > - spin_lock_irqsave(&sched->job_list_lock, flags); > + if (job && dma_fence_is_signaled(&job->s_fence->finished)) { > /* remove job from ring_mirror_list */ > list_del_init(&job->node); > - spin_unlock_irqrestore(&sched->job_list_lock, flags); > - > - sched->ops->free_job(job); > + } else { > + job = NULL; > + /* queue timeout for next job */ > + drm_sched_start_timeout(sched); > } > > - /* queue timeout for next job */ > - spin_lock_irqsave(&sched->job_list_lock, flags); > - drm_sched_start_timeout(sched); > spin_unlock_irqrestore(&sched->job_list_lock, flags); > > + return job; > } > > /** > @@ -698,12 +696,21 @@ static int drm_sched_main(void *param) > struct drm_sched_fence *s_fence; > struct drm_sched_job *sched_job; > struct dma_fence *fence; > + struct drm_sched_job *cleanup_job = NULL; > > wait_event_interruptible(sched->wake_up_worker, > - (drm_sched_cleanup_jobs(sched), > + (cleanup_job = drm_sched_get_cleanup_job(sched)) || > (!drm_sched_blocked(sched) && > (entity = drm_sched_select_entity(sched))) || > - kthread_should_stop())); > + kthread_should_stop()); > + > + while (cleanup_job) { > + sched->ops->free_job(cleanup_job); > + /* queue timeout for next job */ > + 
drm_sched_start_timeout(sched); > + > + cleanup_job = drm_sched_get_cleanup_job(sched); > + } Why is drm_sched_start_timeout() called both here and inside drm_sched_get_cleanup_job()? And also, why call it multiple times in the loop instead of only once after the loop is done? Andrey > > if (!entity) > continue;
On 26/09/2019 16:14, Grodzovsky, Andrey wrote: > > On 9/26/19 10:16 AM, Steven Price wrote: >> drm_sched_cleanup_jobs() attempts to free finished jobs, however because >> it is called as the condition of wait_event_interruptible() it must not >> sleep. Unfortuantly some free callbacks (notibly for Panfrost) do sleep. >> >> Instead let's rename drm_sched_cleanup_jobs() to >> drm_sched_get_cleanup_job() and simply return a job for processing if >> there is one. The caller can then call the free_job() callback outside >> the wait_event_interruptible() where sleeping is possible before >> re-checking and returning to sleep if necessary. >> >> Signed-off-by: Steven Price <steven.price@arm.com> >> --- >> Changes from v3: >> * drm_sched_main() re-arms the timeout for the next job after calling >> free_job() >> >> drivers/gpu/drm/scheduler/sched_main.c | 45 +++++++++++++++----------- >> 1 file changed, 26 insertions(+), 19 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >> index 9a0ee74d82dc..148468447ba9 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb) >> } >> >> /** >> - * drm_sched_cleanup_jobs - destroy finished jobs >> + * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed >> * >> * @sched: scheduler instance >> * >> - * Remove all finished jobs from the mirror list and destroy them. >> + * Returns the next finished job from the mirror list (if there is one) >> + * ready for it to be destroyed. >> */ >> -static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) >> +static struct drm_sched_job * >> +drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) >> { >> + struct drm_sched_job *job = NULL; >> unsigned long flags; >> >> /* Don't destroy jobs while the timeout worker is running */ >> if (sched->timeout != MAX_SCHEDULE_TIMEOUT && >> !cancel_delayed_work(&sched->work_tdr)) >> - return; >> - >> + return NULL; >> >> - while (!list_empty(&sched->ring_mirror_list)) { >> - struct drm_sched_job *job; >> + spin_lock_irqsave(&sched->job_list_lock, flags); >> >> - job = list_first_entry(&sched->ring_mirror_list, >> + job = list_first_entry_or_null(&sched->ring_mirror_list, >> struct drm_sched_job, node); >> - if (!dma_fence_is_signaled(&job->s_fence->finished)) >> - break; >> >> - spin_lock_irqsave(&sched->job_list_lock, flags); >> + if (job && dma_fence_is_signaled(&job->s_fence->finished)) { >> /* remove job from ring_mirror_list */ >> list_del_init(&job->node); >> - spin_unlock_irqrestore(&sched->job_list_lock, flags); >> - >> - sched->ops->free_job(job); >> + } else { >> + job = NULL; >> + /* queue timeout for next job */ >> + drm_sched_start_timeout(sched); >> } >> >> - /* queue timeout for next job */ >> - spin_lock_irqsave(&sched->job_list_lock, flags); >> - drm_sched_start_timeout(sched); >> spin_unlock_irqrestore(&sched->job_list_lock, flags); >> >> + return job; >> } >> >> /** >> @@ -698,12 +696,21 @@ static int drm_sched_main(void *param) >> struct drm_sched_fence *s_fence; >> struct drm_sched_job *sched_job; >> struct dma_fence *fence; >> + struct drm_sched_job *cleanup_job = NULL; >> >> wait_event_interruptible(sched->wake_up_worker, >> - (drm_sched_cleanup_jobs(sched), >> + (cleanup_job = drm_sched_get_cleanup_job(sched)) || >> (!drm_sched_blocked(sched) && >> (entity = drm_sched_select_entity(sched))) || >> - 
kthread_should_stop())); >> + kthread_should_stop()); >> + >> + while (cleanup_job) { >> + sched->ops->free_job(cleanup_job); >> + /* queue timeout for next job */ >> + drm_sched_start_timeout(sched); >> + >> + cleanup_job = drm_sched_get_cleanup_job(sched); >> + } > > > Why is drm_sched_start_timeout() called both here and inside > drm_sched_get_cleanup_job()? And also, why call it multiple times in the > loop instead of only once after the loop is done? Christian pointed out to me that the first thing drm_sched_get_cleanup_job() does is call cancel_delayed_work(), and if that returns false then it bails out with a NULL return. So to actually get another job (if one exists) the timeout has to be restarted. It's also necessary to restart the timeout in the case where the return is NULL, which is handled in the function itself. TBH I'm not sure whether this while loop is worth it - it may be better to replace it with simply: if (cleanup_job) { sched->ops->free_job(cleanup_job); /* queue timeout for next job */ drm_sched_start_timeout(sched); } The outer loop would then handle the next call to drm_sched_get_cleanup_job() as necessary. Steve
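To make that interplay concrete, here is a minimal annotated sketch of the v4 flow in drm_sched_main(); the code mirrors the hunk above, while the comments are one reading of the explanation in this thread rather than text from the patch:

	while (cleanup_job) {
		/* free_job() may sleep, which is why it now runs outside the
		 * wait_event_interruptible() condition */
		sched->ops->free_job(cleanup_job);

		/* with a finite sched->timeout, drm_sched_get_cleanup_job()
		 * bails out with NULL unless it can cancel_delayed_work() the
		 * timeout worker, so re-arm it here or the next fetch below
		 * could never return a job */
		drm_sched_start_timeout(sched);

		cleanup_job = drm_sched_get_cleanup_job(sched);
	}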
On 9/26/19 11:23 AM, Steven Price wrote: > On 26/09/2019 16:14, Grodzovsky, Andrey wrote: >> On 9/26/19 10:16 AM, Steven Price wrote: >>> drm_sched_cleanup_jobs() attempts to free finished jobs, however because >>> it is called as the condition of wait_event_interruptible() it must not >>> sleep. Unfortuantly some free callbacks (notibly for Panfrost) do sleep. >>> >>> Instead let's rename drm_sched_cleanup_jobs() to >>> drm_sched_get_cleanup_job() and simply return a job for processing if >>> there is one. The caller can then call the free_job() callback outside >>> the wait_event_interruptible() where sleeping is possible before >>> re-checking and returning to sleep if necessary. >>> >>> Signed-off-by: Steven Price <steven.price@arm.com> >>> --- >>> Changes from v3: >>> * drm_sched_main() re-arms the timeout for the next job after calling >>> free_job() >>> >>> drivers/gpu/drm/scheduler/sched_main.c | 45 +++++++++++++++----------- >>> 1 file changed, 26 insertions(+), 19 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>> index 9a0ee74d82dc..148468447ba9 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>> @@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb) >>> } >>> >>> /** >>> - * drm_sched_cleanup_jobs - destroy finished jobs >>> + * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed >>> * >>> * @sched: scheduler instance >>> * >>> - * Remove all finished jobs from the mirror list and destroy them. >>> + * Returns the next finished job from the mirror list (if there is one) >>> + * ready for it to be destroyed. >>> */ >>> -static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) >>> +static struct drm_sched_job * >>> +drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) >>> { >>> + struct drm_sched_job *job = NULL; >>> unsigned long flags; >>> >>> /* Don't destroy jobs while the timeout worker is running */ >>> if (sched->timeout != MAX_SCHEDULE_TIMEOUT && >>> !cancel_delayed_work(&sched->work_tdr)) >>> - return; >>> - >>> + return NULL; >>> >>> - while (!list_empty(&sched->ring_mirror_list)) { >>> - struct drm_sched_job *job; >>> + spin_lock_irqsave(&sched->job_list_lock, flags); >>> >>> - job = list_first_entry(&sched->ring_mirror_list, >>> + job = list_first_entry_or_null(&sched->ring_mirror_list, >>> struct drm_sched_job, node); >>> - if (!dma_fence_is_signaled(&job->s_fence->finished)) >>> - break; >>> >>> - spin_lock_irqsave(&sched->job_list_lock, flags); >>> + if (job && dma_fence_is_signaled(&job->s_fence->finished)) { >>> /* remove job from ring_mirror_list */ >>> list_del_init(&job->node); >>> - spin_unlock_irqrestore(&sched->job_list_lock, flags); >>> - >>> - sched->ops->free_job(job); >>> + } else { >>> + job = NULL; >>> + /* queue timeout for next job */ >>> + drm_sched_start_timeout(sched); >>> } >>> >>> - /* queue timeout for next job */ >>> - spin_lock_irqsave(&sched->job_list_lock, flags); >>> - drm_sched_start_timeout(sched); >>> spin_unlock_irqrestore(&sched->job_list_lock, flags); >>> >>> + return job; >>> } >>> >>> /** >>> @@ -698,12 +696,21 @@ static int drm_sched_main(void *param) >>> struct drm_sched_fence *s_fence; >>> struct drm_sched_job *sched_job; >>> struct dma_fence *fence; >>> + struct drm_sched_job *cleanup_job = NULL; >>> >>> wait_event_interruptible(sched->wake_up_worker, >>> - (drm_sched_cleanup_jobs(sched), >>> + (cleanup_job = 
drm_sched_get_cleanup_job(sched)) || >>> (!drm_sched_blocked(sched) && >>> (entity = drm_sched_select_entity(sched))) || >>> - kthread_should_stop())); >>> + kthread_should_stop()); >>> + >>> + while (cleanup_job) { >>> + sched->ops->free_job(cleanup_job); >>> + /* queue timeout for next job */ >>> + drm_sched_start_timeout(sched); >>> + >>> + cleanup_job = drm_sched_get_cleanup_job(sched); >>> + } >> >> Why is drm_sched_start_timeout() called both here and inside >> drm_sched_get_cleanup_job()? And also, why call it multiple times in the >> loop instead of only once after the loop is done? > Christian pointed out to me that the first thing > drm_sched_get_cleanup_job() does is call cancel_delayed_work(), and if > that returns false then it bails out with a NULL return. So to actually > get another job (if one exists) the timeout has to be restarted. For this case, where the timeout work is already in progress, note that drm_sched_job_timedout restarts the timeout at its end, so it should be OK to restart the timeout unconditionally inside drm_sched_get_cleanup_job as was done before. > > It's also necessary to restart the timeout in the case where the return > is NULL which is handled in the function itself. > > TBH I'm not sure whether this while loop is worth it - it may be better > to replace it with simply: > > if (cleanup_job) { > sched->ops->free_job(cleanup_job); > /* queue timeout for next job */ > drm_sched_start_timeout(sched); > } > > The outer loop would then handle the next call to > drm_sched_get_cleanup_job() as necessary. What outer loop? Andrey > > Steve
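For context, the point about drm_sched_job_timedout() is that the timeout handler of that era finished by re-queuing the timeout itself, roughly as in this paraphrased fragment; the locking detail is an assumption, the thread only states that the timeout is restarted at the end of the handler:

	/* tail of drm_sched_job_timedout() */
	spin_lock_irqsave(&sched->job_list_lock, flags);
	drm_sched_start_timeout(sched);
	spin_unlock_irqrestore(&sched->job_list_lock, flags);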
On 26/09/2019 16:48, Grodzovsky, Andrey wrote: > > On 9/26/19 11:23 AM, Steven Price wrote: >> On 26/09/2019 16:14, Grodzovsky, Andrey wrote: >>> On 9/26/19 10:16 AM, Steven Price wrote: >>>> drm_sched_cleanup_jobs() attempts to free finished jobs, however because >>>> it is called as the condition of wait_event_interruptible() it must not >>>> sleep. Unfortuantly some free callbacks (notibly for Panfrost) do sleep. >>>> >>>> Instead let's rename drm_sched_cleanup_jobs() to >>>> drm_sched_get_cleanup_job() and simply return a job for processing if >>>> there is one. The caller can then call the free_job() callback outside >>>> the wait_event_interruptible() where sleeping is possible before >>>> re-checking and returning to sleep if necessary. >>>> >>>> Signed-off-by: Steven Price <steven.price@arm.com> >>>> --- >>>> Changes from v3: >>>> * drm_sched_main() re-arms the timeout for the next job after calling >>>> free_job() >>>> >>>> drivers/gpu/drm/scheduler/sched_main.c | 45 +++++++++++++++----------- >>>> 1 file changed, 26 insertions(+), 19 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>>> index 9a0ee74d82dc..148468447ba9 100644 >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>> @@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb) >>>> } >>>> >>>> /** >>>> - * drm_sched_cleanup_jobs - destroy finished jobs >>>> + * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed >>>> * >>>> * @sched: scheduler instance >>>> * >>>> - * Remove all finished jobs from the mirror list and destroy them. >>>> + * Returns the next finished job from the mirror list (if there is one) >>>> + * ready for it to be destroyed. 
>>>> */ >>>> -static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) >>>> +static struct drm_sched_job * >>>> +drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) >>>> { >>>> + struct drm_sched_job *job = NULL; >>>> unsigned long flags; >>>> >>>> /* Don't destroy jobs while the timeout worker is running */ >>>> if (sched->timeout != MAX_SCHEDULE_TIMEOUT && >>>> !cancel_delayed_work(&sched->work_tdr)) >>>> - return; >>>> - >>>> + return NULL; >>>> >>>> - while (!list_empty(&sched->ring_mirror_list)) { >>>> - struct drm_sched_job *job; >>>> + spin_lock_irqsave(&sched->job_list_lock, flags); >>>> >>>> - job = list_first_entry(&sched->ring_mirror_list, >>>> + job = list_first_entry_or_null(&sched->ring_mirror_list, >>>> struct drm_sched_job, node); >>>> - if (!dma_fence_is_signaled(&job->s_fence->finished)) >>>> - break; >>>> >>>> - spin_lock_irqsave(&sched->job_list_lock, flags); >>>> + if (job && dma_fence_is_signaled(&job->s_fence->finished)) { >>>> /* remove job from ring_mirror_list */ >>>> list_del_init(&job->node); >>>> - spin_unlock_irqrestore(&sched->job_list_lock, flags); >>>> - >>>> - sched->ops->free_job(job); >>>> + } else { >>>> + job = NULL; >>>> + /* queue timeout for next job */ >>>> + drm_sched_start_timeout(sched); >>>> } >>>> >>>> - /* queue timeout for next job */ >>>> - spin_lock_irqsave(&sched->job_list_lock, flags); >>>> - drm_sched_start_timeout(sched); >>>> spin_unlock_irqrestore(&sched->job_list_lock, flags); >>>> >>>> + return job; >>>> } >>>> >>>> /** >>>> @@ -698,12 +696,21 @@ static int drm_sched_main(void *param) >>>> struct drm_sched_fence *s_fence; >>>> struct drm_sched_job *sched_job; >>>> struct dma_fence *fence; >>>> + struct drm_sched_job *cleanup_job = NULL; >>>> >>>> wait_event_interruptible(sched->wake_up_worker, >>>> - (drm_sched_cleanup_jobs(sched), >>>> + (cleanup_job = drm_sched_get_cleanup_job(sched)) || >>>> (!drm_sched_blocked(sched) && >>>> (entity = drm_sched_select_entity(sched))) || >>>> - kthread_should_stop())); >>>> + kthread_should_stop()); >>>> + >>>> + while (cleanup_job) { >>>> + sched->ops->free_job(cleanup_job); >>>> + /* queue timeout for next job */ >>>> + drm_sched_start_timeout(sched); >>>> + >>>> + cleanup_job = drm_sched_get_cleanup_job(sched); >>>> + } >>> >>> Why drm_sched_start_timeout is called both here and inside >>> drm_sched_get_cleanup_job ? And also why call it multiple times in the >>> loop instead of only once after the loop is done ? >> Christian pointed out to be that the first thing >> drm_sched_get_cleanup_job does is call cancel_delayed_work(), and if >> that returns false then it bails out with a NULL return. So to actually >> get another job (if one exists) the timeout has to be restarted. > > > For this case where timeout work already in progress note that > drm_sched_job_timedout restarts the timeout in it's end so it should be > ok to restart the timeout unconditionally inside > drm_sched_get_cleanup_job as it was done before. I may have misinterpreted Christian[1], but I interpreted the below as meaning that he'd prefer the caller (drm_sched_main()) to handle the drm_sched_start_timeout() in this case: On 26/09/2019 14:50, Koenig, Christian wrote: >> Alternatively the caller could manually re-arm the timeout after >> handling the job free. > > I don't see anything that could go wrong immediately, but that is > probably the cleaner approach. 
[1] https://lore.kernel.org/lkml/d9b23ef8-f784-0bce-c34a-fa02002db1ea@amd.com/ I wasn't entirely sure if it was safe to leave the timeout running while the job free was going on which would be the case if drm_sched_get_cleanup_job() re-armed unconditionally. >> >> It's also necessary to restart the timeout in the case where the return >> is NULL which is handled in the function itself. >> >> TBH I'm not sure whether this while loop is worth it - it may be better >> to replace it with simply: >> >> if (cleanup_job) { >> sched->ops->free_job(cleanup_job); >> /* queue timeout for next job */ >> drm_sched_start_timeout(sched); >> } >> >> The outer loop would then handle the next call to >> drm_sched_get_cleanup_job() as necessary. > > > What outer loop ? There's a loop round the code in drm_sched_main(): while (!kthread_should_stop()) { [...] wait_event_interruptible(...) while (cleanup_job) {...} if (!entity) continue; [...] } So after handling the cleanup_job case, the outer loop will loop back round to the wait_event_interruptible() call. One other concern is whether it's possible for the thread to be signalled to stop before the clean up has been completed. The while loop I added ensures that all jobs are actually cleaned up before kthread_should_stop() is checked. Steve
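Spelling out that pseudocode a little, the structure after this patch looks roughly like the sketch below; it is simplified from drm_sched_main(), and the trailing comment stands in for the job-running half of the loop:

	while (!kthread_should_stop()) {
		struct drm_sched_entity *entity = NULL;
		struct drm_sched_job *cleanup_job = NULL;

		wait_event_interruptible(sched->wake_up_worker,
					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
					 (!drm_sched_blocked(sched) &&
					  (entity = drm_sched_select_entity(sched))) ||
					 kthread_should_stop());

		/* drain every already-signaled job before the outer loop gets
		 * another chance to observe kthread_should_stop() */
		while (cleanup_job) {
			sched->ops->free_job(cleanup_job);
			drm_sched_start_timeout(sched);
			cleanup_job = drm_sched_get_cleanup_job(sched);
		}

		if (!entity)
			continue;

		/* ... fetch the next job from the entity, run it and install
		 * its fence callback as in the existing code ... */
	}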
On 9/26/19 11:59 AM, Steven Price wrote: > On 26/09/2019 16:48, Grodzovsky, Andrey wrote: >> On 9/26/19 11:23 AM, Steven Price wrote: >>> On 26/09/2019 16:14, Grodzovsky, Andrey wrote: >>>> On 9/26/19 10:16 AM, Steven Price wrote: >>>>> drm_sched_cleanup_jobs() attempts to free finished jobs, however because >>>>> it is called as the condition of wait_event_interruptible() it must not >>>>> sleep. Unfortuantly some free callbacks (notibly for Panfrost) do sleep. >>>>> >>>>> Instead let's rename drm_sched_cleanup_jobs() to >>>>> drm_sched_get_cleanup_job() and simply return a job for processing if >>>>> there is one. The caller can then call the free_job() callback outside >>>>> the wait_event_interruptible() where sleeping is possible before >>>>> re-checking and returning to sleep if necessary. >>>>> >>>>> Signed-off-by: Steven Price <steven.price@arm.com> >>>>> --- >>>>> Changes from v3: >>>>> * drm_sched_main() re-arms the timeout for the next job after calling >>>>> free_job() >>>>> >>>>> drivers/gpu/drm/scheduler/sched_main.c | 45 +++++++++++++++----------- >>>>> 1 file changed, 26 insertions(+), 19 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>>>> index 9a0ee74d82dc..148468447ba9 100644 >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>> @@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb) >>>>> } >>>>> >>>>> /** >>>>> - * drm_sched_cleanup_jobs - destroy finished jobs >>>>> + * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed >>>>> * >>>>> * @sched: scheduler instance >>>>> * >>>>> - * Remove all finished jobs from the mirror list and destroy them. >>>>> + * Returns the next finished job from the mirror list (if there is one) >>>>> + * ready for it to be destroyed. 
>>>>> */ >>>>> -static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) >>>>> +static struct drm_sched_job * >>>>> +drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) >>>>> { >>>>> + struct drm_sched_job *job = NULL; >>>>> unsigned long flags; >>>>> >>>>> /* Don't destroy jobs while the timeout worker is running */ >>>>> if (sched->timeout != MAX_SCHEDULE_TIMEOUT && >>>>> !cancel_delayed_work(&sched->work_tdr)) >>>>> - return; >>>>> - >>>>> + return NULL; >>>>> >>>>> - while (!list_empty(&sched->ring_mirror_list)) { >>>>> - struct drm_sched_job *job; >>>>> + spin_lock_irqsave(&sched->job_list_lock, flags); >>>>> >>>>> - job = list_first_entry(&sched->ring_mirror_list, >>>>> + job = list_first_entry_or_null(&sched->ring_mirror_list, >>>>> struct drm_sched_job, node); >>>>> - if (!dma_fence_is_signaled(&job->s_fence->finished)) >>>>> - break; >>>>> >>>>> - spin_lock_irqsave(&sched->job_list_lock, flags); >>>>> + if (job && dma_fence_is_signaled(&job->s_fence->finished)) { >>>>> /* remove job from ring_mirror_list */ >>>>> list_del_init(&job->node); >>>>> - spin_unlock_irqrestore(&sched->job_list_lock, flags); >>>>> - >>>>> - sched->ops->free_job(job); >>>>> + } else { >>>>> + job = NULL; >>>>> + /* queue timeout for next job */ >>>>> + drm_sched_start_timeout(sched); >>>>> } >>>>> >>>>> - /* queue timeout for next job */ >>>>> - spin_lock_irqsave(&sched->job_list_lock, flags); >>>>> - drm_sched_start_timeout(sched); >>>>> spin_unlock_irqrestore(&sched->job_list_lock, flags); >>>>> >>>>> + return job; >>>>> } >>>>> >>>>> /** >>>>> @@ -698,12 +696,21 @@ static int drm_sched_main(void *param) >>>>> struct drm_sched_fence *s_fence; >>>>> struct drm_sched_job *sched_job; >>>>> struct dma_fence *fence; >>>>> + struct drm_sched_job *cleanup_job = NULL; >>>>> >>>>> wait_event_interruptible(sched->wake_up_worker, >>>>> - (drm_sched_cleanup_jobs(sched), >>>>> + (cleanup_job = drm_sched_get_cleanup_job(sched)) || >>>>> (!drm_sched_blocked(sched) && >>>>> (entity = drm_sched_select_entity(sched))) || >>>>> - kthread_should_stop())); >>>>> + kthread_should_stop()); >>>>> + >>>>> + while (cleanup_job) { >>>>> + sched->ops->free_job(cleanup_job); >>>>> + /* queue timeout for next job */ >>>>> + drm_sched_start_timeout(sched); >>>>> + >>>>> + cleanup_job = drm_sched_get_cleanup_job(sched); >>>>> + } >>>> Why drm_sched_start_timeout is called both here and inside >>>> drm_sched_get_cleanup_job ? And also why call it multiple times in the >>>> loop instead of only once after the loop is done ? >>> Christian pointed out to be that the first thing >>> drm_sched_get_cleanup_job does is call cancel_delayed_work(), and if >>> that returns false then it bails out with a NULL return. So to actually >>> get another job (if one exists) the timeout has to be restarted. >> >> For this case where timeout work already in progress note that >> drm_sched_job_timedout restarts the timeout in it's end so it should be >> ok to restart the timeout unconditionally inside >> drm_sched_get_cleanup_job as it was done before. > I may have misinterpreted Christian[1], but I interpreted the below as > meaning that he'd prefer the caller (drm_sched_main()) to handle the > drm_sched_start_timeout() in this case: > > On 26/09/2019 14:50, Koenig, Christian wrote: >>> Alternatively the caller could manually re-arm the timeout after >>> handling the job free. >> I don't see anything that could go wrong immediately, but that is >> probably the cleaner approach. 
> [1] > https://lore.kernel.org/lkml/d9b23ef8-f784-0bce-c34a-fa02002db1ea@amd.com/ > > I wasn't entirely sure if it was safe to leave the timeout running while > the job free was going on which would be the case if > drm_sched_get_cleanup_job() re-armed unconditionally. > >>> It's also necessary to restart the timeout in the case where the return >>> is NULL which is handled in the function itself. >>> >>> TBH I'm not sure whether this while loop is worth it - it may be better >>> to replace it with simply: >>> >>> if (cleanup_job) { >>> sched->ops->free_job(cleanup_job); >>> /* queue timeout for next job */ >>> drm_sched_start_timeout(sched); >>> } >>> >>> The outer loop would then handle the next call to >>> drm_sched_get_cleanup_job() as necessary. >> >> What outer loop? > There's a loop round the code in drm_sched_main(): > > while (!kthread_should_stop()) { > [...] > wait_event_interruptible(...) > > while (cleanup_job) {...} > > if (!entity) > continue; > > [...] > } > > So after handling the cleanup_job case, the outer loop will loop back > round to the wait_event_interruptible() call. One other concern is > whether it's possible for the thread to be signalled to stop before the > clean up has been completed. The while loop I added ensures that all > jobs are actually cleaned up before kthread_should_stop() is checked. > > Steve Yes, it seems to me that only the inner while loop will guarantee that all signaled jobs in the mirror list are properly freed if the thread is stopped. Andrey
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 9a0ee74d82dc..148468447ba9 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -622,43 +622,41 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 }
 
 /**
- * drm_sched_cleanup_jobs - destroy finished jobs
+ * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
  *
  * @sched: scheduler instance
  *
- * Remove all finished jobs from the mirror list and destroy them.
+ * Returns the next finished job from the mirror list (if there is one)
+ * ready for it to be destroyed.
  */
-static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched)
+static struct drm_sched_job *
+drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 {
+	struct drm_sched_job *job = NULL;
 	unsigned long flags;
 
 	/* Don't destroy jobs while the timeout worker is running */
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
 	    !cancel_delayed_work(&sched->work_tdr))
-		return;
-
+		return NULL;
 
-	while (!list_empty(&sched->ring_mirror_list)) {
-		struct drm_sched_job *job;
+	spin_lock_irqsave(&sched->job_list_lock, flags);
 
-		job = list_first_entry(&sched->ring_mirror_list,
+	job = list_first_entry_or_null(&sched->ring_mirror_list,
 				       struct drm_sched_job, node);
-		if (!dma_fence_is_signaled(&job->s_fence->finished))
-			break;
 
-		spin_lock_irqsave(&sched->job_list_lock, flags);
+	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
 		/* remove job from ring_mirror_list */
 		list_del_init(&job->node);
-		spin_unlock_irqrestore(&sched->job_list_lock, flags);
-
-		sched->ops->free_job(job);
+	} else {
+		job = NULL;
+		/* queue timeout for next job */
+		drm_sched_start_timeout(sched);
 	}
 
-	/* queue timeout for next job */
-	spin_lock_irqsave(&sched->job_list_lock, flags);
-	drm_sched_start_timeout(sched);
 	spin_unlock_irqrestore(&sched->job_list_lock, flags);
 
+	return job;
 }
 
 /**
@@ -698,12 +696,21 @@ static int drm_sched_main(void *param)
 		struct drm_sched_fence *s_fence;
 		struct drm_sched_job *sched_job;
 		struct dma_fence *fence;
+		struct drm_sched_job *cleanup_job = NULL;
 
 		wait_event_interruptible(sched->wake_up_worker,
-					 (drm_sched_cleanup_jobs(sched),
+					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
 					 (!drm_sched_blocked(sched) &&
 					  (entity = drm_sched_select_entity(sched))) ||
-					 kthread_should_stop()));
+					 kthread_should_stop());
+
+		while (cleanup_job) {
+			sched->ops->free_job(cleanup_job);
+			/* queue timeout for next job */
+			drm_sched_start_timeout(sched);
+
+			cleanup_job = drm_sched_get_cleanup_job(sched);
+		}
 
 		if (!entity)
 			continue;
drm_sched_cleanup_jobs() attempts to free finished jobs; however, because it is
called as the condition of wait_event_interruptible() it must not sleep.
Unfortunately some free callbacks (notably for Panfrost) do sleep.

Instead, let's rename drm_sched_cleanup_jobs() to drm_sched_get_cleanup_job()
and simply return a job for processing if there is one. The caller can then
call the free_job() callback outside the wait_event_interruptible(), where
sleeping is possible, before re-checking and returning to sleep if necessary.

Signed-off-by: Steven Price <steven.price@arm.com>
---
Changes from v3:
 * drm_sched_main() re-arms the timeout for the next job after calling
   free_job()

 drivers/gpu/drm/scheduler/sched_main.c | 45 +++++++++++++++-----------
 1 file changed, 26 insertions(+), 19 deletions(-)
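Stripped of the scheduler specifics, the pattern the patch moves to can be illustrated as below. The helper names are generic stand-ins, not functions from the driver; the point is that the condition passed to wait_event_interruptible() is re-evaluated after the task has been put into TASK_INTERRUPTIBLE, so it must not sleep, while anything that can sleep belongs after the macro returns:

	/* before: the condition itself frees jobs and may therefore sleep */
	wait_event_interruptible(wq,
				 (cleanup_all_jobs(sched), other_wake_condition(sched)));

	/* after: the condition only fetches work; the potentially sleeping
	 * free happens afterwards, in normal running context */
	wait_event_interruptible(wq,
				 (job = get_finished_job(sched)) || other_wake_condition(sched));

	while (job) {
		free_finished_job(sched, job);	/* safe to sleep here */
		job = get_finished_job(sched);
	}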