Message ID | 20201210021438.9190-2-luben.tuikov@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Timeout handler now returns a value | expand |
Hi Luben, Am Mittwoch, den 09.12.2020, 21:14 -0500 schrieb Luben Tuikov: > This patch does not change current behaviour. > > The driver's job timeout handler now returns > status indicating back to the DRM layer whether > the task (job) was successfully aborted or whether > more time should be given to the task to complete. > > Default behaviour as of this patch, is preserved, > except in obvious-by-comment case in the Panfrost > driver, as documented below. > > All drivers which make use of the > drm_sched_backend_ops' .timedout_job() callback > have been accordingly renamed and return the > would've-been default value of > DRM_TASK_STATUS_ALIVE to restart the task's > timeout timer--this is the old behaviour, and > is preserved by this patch. > > In the case of the Panfrost driver, its timedout > callback correctly first checks if the job had > completed in due time and if so, it now returns > DRM_TASK_STATUS_COMPLETE to notify the DRM layer > that the task can be moved to the done list, to be > freed later. In the other two subsequent checks, > the value of DRM_TASK_STATUS_ALIVE is returned, as > per the default behaviour. > > A more involved driver's solutions can be had > in subequent patches. > > v2: Use enum as the status of a driver's job > timeout callback method. > > Cc: Alexander Deucher <Alexander.Deucher@amd.com> > Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> > Cc: Christian König <christian.koenig@amd.com> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Lucas Stach <l.stach@pengutronix.de> > Cc: Russell King <linux+etnaviv@armlinux.org.uk> > Cc: Christian Gmeiner <christian.gmeiner@gmail.com> > Cc: Qiang Yu <yuq825@gmail.com> > Cc: Rob Herring <robh@kernel.org> > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> > Cc: Steven Price <steven.price@arm.com> > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> > Cc: Eric Anholt <eric@anholt.net> > Reported-by: kernel test robot <lkp@intel.com> > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- > drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- > drivers/gpu/drm/lima/lima_sched.c | 4 +++- > drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- > drivers/gpu/drm/scheduler/sched_main.c | 4 +--- > drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ > include/drm/gpu_scheduler.h | 20 +++++++++++++--- > 7 files changed, 57 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > index ff48101bab55..a111326cbdde 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > @@ -28,7 +28,7 @@ > #include "amdgpu.h" > #include "amdgpu_trace.h" > > > > > -static void amdgpu_job_timedout(struct drm_sched_job *s_job) > +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job) > { > struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); > struct amdgpu_job *job = to_amdgpu_job(s_job); > @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) > amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) { > DRM_ERROR("ring %s timeout, but soft recovered\n", > s_job->sched->name); > - return; > + return DRM_TASK_STATUS_ALIVE; > } > > > > > amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); > @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) > > > > > if (amdgpu_device_should_recover_gpu(ring->adev)) { > amdgpu_device_gpu_recover(ring->adev, job); > + return DRM_TASK_STATUS_ALIVE; > } else { > drm_sched_suspend_timeout(&ring->sched); > if (amdgpu_sriov_vf(adev)) > adev->virt.tdr_debug = true; > + return DRM_TASK_STATUS_ALIVE; > } > } > > > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c > index cd46c882269c..c49516942328 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c > @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job) > return fence; > } > > > > > -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) > +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job > + *sched_job) > { > struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); > struct etnaviv_gpu *gpu = submit->gpu; > @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) > > drm_sched_resubmit_jobs(&gpu->sched); > > + /* Tell the DRM scheduler that this task needs > + * more time. > + */ This comment doesn't match the kernel coding style, but it's also moot as the whole added code block isn't needed. The code just below is identical, so letting execution continue here instead of returning would be the right thing to do, but maybe you mean to return DRM_TASK_STATUS_COMPLETE? It's a bit confusing that aborted and job successfully finished should be signaled with the same return code. > + drm_sched_start(&gpu->sched, true); > + return DRM_TASK_STATUS_ALIVE; > + > out_no_timeout: > /* restart scheduler after GPU is usable again */ > drm_sched_start(&gpu->sched, true); > + return DRM_TASK_STATUS_ALIVE; > } Regards, Lucas
Am 10.12.20 um 10:31 schrieb Lucas Stach: > Hi Luben, > > Am Mittwoch, den 09.12.2020, 21:14 -0500 schrieb Luben Tuikov: >> [SNIP] >> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job >> + *sched_job) >> { >> struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); >> struct etnaviv_gpu *gpu = submit->gpu; >> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >> >> drm_sched_resubmit_jobs(&gpu->sched); >> >> + /* Tell the DRM scheduler that this task needs >> + * more time. >> + */ > This comment doesn't match the kernel coding style, but it's also moot > as the whole added code block isn't needed. The code just below is > identical, so letting execution continue here instead of returning > would be the right thing to do, but maybe you mean to return > DRM_TASK_STATUS_COMPLETE? It's a bit confusing that aborted and job > successfully finished should be signaled with the same return code. Yes and no. As I tried to describe in my previous mail the naming of the enum values is also not very good. See even when the job has completed we need to restart the timer for the potential next job. Only when the device is completely gone and unrecoverable we should not restart the timer. I suggest to either make this an int and return -ENODEV when that happens or rename the enum to something like DRM_SCHED_NODEV. Regards, Christian. > >> + drm_sched_start(&gpu->sched, true); >> + return DRM_TASK_STATUS_ALIVE; >> + >> out_no_timeout: >> /* restart scheduler after GPU is usable again */ >> drm_sched_start(&gpu->sched, true); >> + return DRM_TASK_STATUS_ALIVE; >> } > Regards, > Lucas >
On 10/12/2020 02:14, Luben Tuikov wrote: > This patch does not change current behaviour. > > The driver's job timeout handler now returns > status indicating back to the DRM layer whether > the task (job) was successfully aborted or whether > more time should be given to the task to complete. I find the definitions given a little confusing, see below. > Default behaviour as of this patch, is preserved, > except in obvious-by-comment case in the Panfrost > driver, as documented below. > > All drivers which make use of the > drm_sched_backend_ops' .timedout_job() callback > have been accordingly renamed and return the > would've-been default value of > DRM_TASK_STATUS_ALIVE to restart the task's > timeout timer--this is the old behaviour, and > is preserved by this patch. > > In the case of the Panfrost driver, its timedout > callback correctly first checks if the job had > completed in due time and if so, it now returns > DRM_TASK_STATUS_COMPLETE to notify the DRM layer > that the task can be moved to the done list, to be > freed later. In the other two subsequent checks, > the value of DRM_TASK_STATUS_ALIVE is returned, as > per the default behaviour. > > A more involved driver's solutions can be had > in subequent patches. NIT: ^^^^^^^^^ subsequent > > v2: Use enum as the status of a driver's job > timeout callback method. > > Cc: Alexander Deucher <Alexander.Deucher@amd.com> > Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> > Cc: Christian König <christian.koenig@amd.com> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Lucas Stach <l.stach@pengutronix.de> > Cc: Russell King <linux+etnaviv@armlinux.org.uk> > Cc: Christian Gmeiner <christian.gmeiner@gmail.com> > Cc: Qiang Yu <yuq825@gmail.com> > Cc: Rob Herring <robh@kernel.org> > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> > Cc: Steven Price <steven.price@arm.com> > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> > Cc: Eric Anholt <eric@anholt.net> > Reported-by: kernel test robot <lkp@intel.com> This reported-by seems a little odd for this patch. > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- > drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- > drivers/gpu/drm/lima/lima_sched.c | 4 +++- > drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- > drivers/gpu/drm/scheduler/sched_main.c | 4 +--- > drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ > include/drm/gpu_scheduler.h | 20 +++++++++++++--- > 7 files changed, 57 insertions(+), 28 deletions(-) > [....] > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index 2e0c368e19f6..cedfc5394e52 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job, > return s_job && atomic_inc_return(&s_job->karma) > threshold; > } > > +enum drm_task_status { > + DRM_TASK_STATUS_COMPLETE, > + DRM_TASK_STATUS_ALIVE > +}; > + > /** > * struct drm_sched_backend_ops > * > @@ -230,10 +235,19 @@ struct drm_sched_backend_ops { > struct dma_fence *(*run_job)(struct drm_sched_job *sched_job); > > /** > - * @timedout_job: Called when a job has taken too long to execute, > - * to trigger GPU recovery. > + * @timedout_job: Called when a job has taken too long to execute, > + * to trigger GPU recovery. > + * > + * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy > + * and executing in the hardware, i.e. it needs more time. So 'alive' means the job (was) alive, and GPU recovery is happening. I.e. it's the job just takes too long. Panfrost will trigger a GPU reset (killing the job) in this case while returning DRM_TASK_STATUS_ALIVE. > + * > + * Return DRM_TASK_STATUS_COMPLETE, if the task (job) has > + * been aborted or is unknown to the hardware, i.e. if > + * the task is out of the hardware, and maybe it is now > + * in the done list, or it was completed long ago, or > + * if it is unknown to the hardware. Where 'complete' seems to mean a variety of things: * The job completed successfully (i.e. the timeout raced), this is the situation that Panfrost detects. In this case (and only this case) the GPU reset will *not* happen. * The job failed (aborted) and is no longer on the hardware. Panfrost currently handles a job failure by triggering drm_sched_fault() to trigger the timeout handler. But the timeout handler doesn't handle this differently so will return DRM_TASK_STATUS_ALIVE. * The job is "unknown to hardware". There are some corner cases in Panfrost (specifically two early returns from panfrost_job_hw_submit()) where the job never actually lands on the hardware, but the scheduler isn't informed. We currently rely on the timeout handling to recover from that. However, again, the timeout handler doesn't know about this soo will return DRM_TASK_STATUS_ALIVE. So of the four cases listed in these comments, Panfrost is only getting 2 'correct' after this change. But what I really want to know is what the scheduler is planning to do in these situations? The Panfrost return value in this patch is really a "did we trigger a GPU reset" - and doesn't seem to match the descriptions above. Steve > */ > - void (*timedout_job)(struct drm_sched_job *sched_job); > + enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job); > > /** > * @free_job: Called once the job's finished fence has been signaled >
On 2020-12-10 4:31 a.m., Lucas Stach wrote: > Hi Luben, > > Am Mittwoch, den 09.12.2020, 21:14 -0500 schrieb Luben Tuikov: >> This patch does not change current behaviour. >> >> The driver's job timeout handler now returns >> status indicating back to the DRM layer whether >> the task (job) was successfully aborted or whether >> more time should be given to the task to complete. >> >> Default behaviour as of this patch, is preserved, >> except in obvious-by-comment case in the Panfrost >> driver, as documented below. >> >> All drivers which make use of the >> drm_sched_backend_ops' .timedout_job() callback >> have been accordingly renamed and return the >> would've-been default value of >> DRM_TASK_STATUS_ALIVE to restart the task's >> timeout timer--this is the old behaviour, and >> is preserved by this patch. >> >> In the case of the Panfrost driver, its timedout >> callback correctly first checks if the job had >> completed in due time and if so, it now returns >> DRM_TASK_STATUS_COMPLETE to notify the DRM layer >> that the task can be moved to the done list, to be >> freed later. In the other two subsequent checks, >> the value of DRM_TASK_STATUS_ALIVE is returned, as >> per the default behaviour. >> >> A more involved driver's solutions can be had >> in subequent patches. >> >> v2: Use enum as the status of a driver's job >> timeout callback method. >> >> Cc: Alexander Deucher <Alexander.Deucher@amd.com> >> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> >> Cc: Christian König <christian.koenig@amd.com> >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> >> Cc: Lucas Stach <l.stach@pengutronix.de> >> Cc: Russell King <linux+etnaviv@armlinux.org.uk> >> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> >> Cc: Qiang Yu <yuq825@gmail.com> >> Cc: Rob Herring <robh@kernel.org> >> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> >> Cc: Steven Price <steven.price@arm.com> >> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> >> Cc: Eric Anholt <eric@anholt.net> >> Reported-by: kernel test robot <lkp@intel.com> >> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- >> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- >> drivers/gpu/drm/lima/lima_sched.c | 4 +++- >> drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- >> drivers/gpu/drm/scheduler/sched_main.c | 4 +--- >> drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ >> include/drm/gpu_scheduler.h | 20 +++++++++++++--- >> 7 files changed, 57 insertions(+), 28 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> index ff48101bab55..a111326cbdde 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> @@ -28,7 +28,7 @@ >> #include "amdgpu.h" >> #include "amdgpu_trace.h" >> >> >> >> >> -static void amdgpu_job_timedout(struct drm_sched_job *s_job) >> +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job) >> { >> struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); >> struct amdgpu_job *job = to_amdgpu_job(s_job); >> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) >> amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) { >> DRM_ERROR("ring %s timeout, but soft recovered\n", >> s_job->sched->name); >> - return; >> + return DRM_TASK_STATUS_ALIVE; >> } >> >> >> >> >> amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); >> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) >> >> >> >> >> if (amdgpu_device_should_recover_gpu(ring->adev)) { >> amdgpu_device_gpu_recover(ring->adev, job); >> + return DRM_TASK_STATUS_ALIVE; >> } else { >> drm_sched_suspend_timeout(&ring->sched); >> if (amdgpu_sriov_vf(adev)) >> adev->virt.tdr_debug = true; >> + return DRM_TASK_STATUS_ALIVE; >> } >> } >> >> >> >> >> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c >> index cd46c882269c..c49516942328 100644 >> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c >> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c >> @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job) >> return fence; >> } >> >> >> >> >> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job >> + *sched_job) >> { >> struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); >> struct etnaviv_gpu *gpu = submit->gpu; >> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >> >> drm_sched_resubmit_jobs(&gpu->sched); >> >> + /* Tell the DRM scheduler that this task needs >> + * more time. >> + */ > > This comment doesn't match the kernel coding style, but it's also moot > as the whole added code block isn't needed. The code just below is > identical, so letting execution continue here instead of returning > would be the right thing to do, but maybe you mean to return > DRM_TASK_STATUS_COMPLETE? It's a bit confusing that aborted and job > successfully finished should be signaled with the same return code. The kernel coding style takes the net/ style of comments, as well as the non-net/ style of comments--with a leading /* on an empty line. I'm using the net/ style. checkpatch.pl said everything is okay, which I've integrated in my git-hooks to check every patch and every commit. I'm not familiar with the etnaviv's internals and what you see here is my best guesstimate. I understand your confusion that an aborted job and successfully completed job BOTH return DRM_TASK_STATUS_COMPLETE, right? That's insanity, right? Perhaps we should return ABORTED for one and FINISHED for another, no? So, this is intentional. DRM_TASK_STATUS_COMPLETE doesn't indicate the execution status of the task--this is for the application client to determine, e.g. Mesa. For DRM and the driver, as a transport, the driver wants to tell the DRM layer that the DRM layer will *not hear from this task*, that is this task is out of the hardware and as such relevant memory can be released. Task was aborted successfully: out of the hardware, free relevant memories. Task has completed successfully: out of the hardware, free relevant memories. As a transport, the driver and DRM don't want to know the internals of the task--only if it is, or not, in the hardware, so that krefs can be kept or decremented, and relevant memories freed. By returning "ALIVE", the driver says to DRM, that the task is now in the hardware. Maybe it was aborted and reissued, maybe it is still executing--we don't care. The DRM layer can decide what to do next, but the driver doesn't control this. For instance, a sensible thing to do would be to extend the timeout timer for this task, but this something which DRM does, and the driver shouldn't necessarily control this--i.e. a simple code is returned, and a clear separation between the layers is set. "ALIVE" is essentially what we did before this patch, so here I return this to mimic past behaviour. Should COMPLETE be returned? Is the task out of the hardware, never to be heard of again? Regards, Luben > >> + drm_sched_start(&gpu->sched, true); >> + return DRM_TASK_STATUS_ALIVE; >> + >> out_no_timeout: >> /* restart scheduler after GPU is usable again */ >> drm_sched_start(&gpu->sched, true); >> + return DRM_TASK_STATUS_ALIVE; >> } > > Regards, > Lucas >
On 2020-12-10 4:41 a.m., Christian König wrote: > Am 10.12.20 um 10:31 schrieb Lucas Stach: >> Hi Luben, >> >> Am Mittwoch, den 09.12.2020, 21:14 -0500 schrieb Luben Tuikov: >>> [SNIP] >>> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >>> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job >>> + *sched_job) >>> { >>> struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); >>> struct etnaviv_gpu *gpu = submit->gpu; >>> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >>> >>> drm_sched_resubmit_jobs(&gpu->sched); >>> >>> + /* Tell the DRM scheduler that this task needs >>> + * more time. >>> + */ >> This comment doesn't match the kernel coding style, but it's also moot >> as the whole added code block isn't needed. The code just below is >> identical, so letting execution continue here instead of returning >> would be the right thing to do, but maybe you mean to return >> DRM_TASK_STATUS_COMPLETE? It's a bit confusing that aborted and job >> successfully finished should be signaled with the same return code. > > Yes and no. As I tried to describe in my previous mail the naming of the > enum values is also not very good. I tried to make the naming as minimal as possible: COMPLETE: the task is out of the hardware. ALIVE: the task is in the hardware. > > See even when the job has completed we need to restart the timer for the > potential next job. Sure, yes. But this is something which the DRM decides--why should drivers know of the internals of the DRM? (i.e. that it restarts the timer or that there is a timer, etc.) Return minimal value and let the DRM decide what to do next. > > Only when the device is completely gone and unrecoverable we should not > restart the timer. Yes, agreed. > > I suggest to either make this an int and return -ENODEV when that > happens or rename the enum to something like DRM_SCHED_NODEV. It was an int, but you suggested that it'd be a macro, so I made it an enum so that the complier can check the values against the macros returned. I suggested, DRM_TASK_SCHED_ENODEV, but let Andrey add it when he adds his patches on top of my patch here, because his work adds hotplug/unplug support. Also, note that if the pending list is freed, while the DRM had been blocked, i.e. notified that the device is gone, then returning DRM_TASK_SCHED_ENODEV would be moot, as there are no tasks in the pending list. This patch here is good as it is, since it is minimal and doesn't make assumptions on DRM behaviour. Regards, Luben > > Regards, > Christian. > >> >>> + drm_sched_start(&gpu->sched, true); >>> + return DRM_TASK_STATUS_ALIVE; >>> + >>> out_no_timeout: >>> /* restart scheduler after GPU is usable again */ >>> drm_sched_start(&gpu->sched, true); >>> + return DRM_TASK_STATUS_ALIVE; >>> } >> Regards, >> Lucas >> >
On 2020-12-10 4:46 a.m., Steven Price wrote: > On 10/12/2020 02:14, Luben Tuikov wrote: >> This patch does not change current behaviour. >> >> The driver's job timeout handler now returns >> status indicating back to the DRM layer whether >> the task (job) was successfully aborted or whether >> more time should be given to the task to complete. > > I find the definitions given a little confusing, see below. > >> Default behaviour as of this patch, is preserved, >> except in obvious-by-comment case in the Panfrost >> driver, as documented below. >> >> All drivers which make use of the >> drm_sched_backend_ops' .timedout_job() callback >> have been accordingly renamed and return the >> would've-been default value of >> DRM_TASK_STATUS_ALIVE to restart the task's >> timeout timer--this is the old behaviour, and >> is preserved by this patch. >> >> In the case of the Panfrost driver, its timedout >> callback correctly first checks if the job had >> completed in due time and if so, it now returns >> DRM_TASK_STATUS_COMPLETE to notify the DRM layer >> that the task can be moved to the done list, to be >> freed later. In the other two subsequent checks, >> the value of DRM_TASK_STATUS_ALIVE is returned, as >> per the default behaviour. >> >> A more involved driver's solutions can be had >> in subequent patches. > > NIT: ^^^^^^^^^ subsequent Thank you--no idea how my spellcheck and checkpatch.pl missed that. > >> >> v2: Use enum as the status of a driver's job >> timeout callback method. >> >> Cc: Alexander Deucher <Alexander.Deucher@amd.com> >> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> >> Cc: Christian König <christian.koenig@amd.com> >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> >> Cc: Lucas Stach <l.stach@pengutronix.de> >> Cc: Russell King <linux+etnaviv@armlinux.org.uk> >> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> >> Cc: Qiang Yu <yuq825@gmail.com> >> Cc: Rob Herring <robh@kernel.org> >> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> >> Cc: Steven Price <steven.price@arm.com> >> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> >> Cc: Eric Anholt <eric@anholt.net> >> Reported-by: kernel test robot <lkp@intel.com> > > This reported-by seems a little odd for this patch. I got this because I wasn't (originally) changing all drivers which were using the task timeout callback. Should I remove it? > >> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- >> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- >> drivers/gpu/drm/lima/lima_sched.c | 4 +++- >> drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- >> drivers/gpu/drm/scheduler/sched_main.c | 4 +--- >> drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ >> include/drm/gpu_scheduler.h | 20 +++++++++++++--- >> 7 files changed, 57 insertions(+), 28 deletions(-) >> > > [....] > >> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h >> index 2e0c368e19f6..cedfc5394e52 100644 >> --- a/include/drm/gpu_scheduler.h >> +++ b/include/drm/gpu_scheduler.h >> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job, >> return s_job && atomic_inc_return(&s_job->karma) > threshold; >> } >> >> +enum drm_task_status { >> + DRM_TASK_STATUS_COMPLETE, >> + DRM_TASK_STATUS_ALIVE >> +}; >> + >> /** >> * struct drm_sched_backend_ops >> * >> @@ -230,10 +235,19 @@ struct drm_sched_backend_ops { >> struct dma_fence *(*run_job)(struct drm_sched_job *sched_job); >> >> /** >> - * @timedout_job: Called when a job has taken too long to execute, >> - * to trigger GPU recovery. >> + * @timedout_job: Called when a job has taken too long to execute, >> + * to trigger GPU recovery. >> + * >> + * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy >> + * and executing in the hardware, i.e. it needs more time. > > So 'alive' means the job (was) alive, and GPU recovery is happening. > I.e. it's the job just takes too long. Panfrost will trigger a GPU reset > (killing the job) in this case while returning DRM_TASK_STATUS_ALIVE. "ALIVE" means "at this moment the task is in the hardware executing, using memories, DMA engines, compute units, the whole thing." You, can also call it active, executing, busy, etc. "ALIVE" makes no assumptions about how the task got there. Maybe this was original submission, and the task is taking its sweet time. Maybe the driver aborted it and reissued it, all in the timer timeout callback, etc. It merely tells the DRM layer that the task is actively executing in the hardware, so no assumptions can be made about freeing memories, decrementing krefs, etc. IOW, it's executing, check back later. > >> + * >> + * Return DRM_TASK_STATUS_COMPLETE, if the task (job) has >> + * been aborted or is unknown to the hardware, i.e. if >> + * the task is out of the hardware, and maybe it is now >> + * in the done list, or it was completed long ago, or >> + * if it is unknown to the hardware. > > Where 'complete' seems to mean a variety of things: > > * The job completed successfully (i.e. the timeout raced), this is the > situation that Panfrost detects. In this case (and only this case) the > GPU reset will *not* happen. Sounds good! > > * The job failed (aborted) and is no longer on the hardware. Panfrost > currently handles a job failure by triggering drm_sched_fault() to > trigger the timeout handler. But the timeout handler doesn't handle this > differently so will return DRM_TASK_STATUS_ALIVE. Panfrost seems to do the right thing here by triggering. But I wonder, could the Panfrost driver simply return the task back to the DRM layer? If the task is out of the hardware, it should return COMPLETE, a la, Driver: COMPLETE! DRM: Fine, I can free memories and decrement krefs and toss it over to the application client. (The application client would then find out if the execution status was success of failure. For instance, division by 0, or tan(pi/2), etc.) Driver: ALIVE! DRM: Fine, I cannot make any assumptions of any kind. I'll back off for a bit, and if I don't hear from you in some time, I'll check back later. > * The job is "unknown to hardware". There are some corner cases in > Panfrost (specifically two early returns from panfrost_job_hw_submit()) > where the job never actually lands on the hardware, but the scheduler > isn't informed. We currently rely on the timeout handling to recover > from that. However, again, the timeout handler doesn't know about this > soo will return DRM_TASK_STATUS_ALIVE. "We currently rely on the timeout handling to recover from that." Do you mean that when the timeout handler kicks in, you find that the task never landed in the hardware and then you send it to the hardware? Yes? Then return "ALIVE". If however, you never decide to send it to the hardware and simply abandon it (in hopes that the DRM or the Application Client will reissue it), then you should send it back to DRM with status "COMPLETE". > So of the four cases listed in these comments, Panfrost is only getting > 2 'correct' after this change. The Panfrost driver knows the hardware intimately, and maybe the hardware, or a newer version of it, can give a finer control over task execution on a GPU. If you can abort the task and reissue it, all in the timer callback, return ALIVE. If you abort it and kick it out of the hardware, return COMPLETE. I'd say, this is about information, and passing information back to the DRM scheduler. It could be any information you'd like. I thought that the minimal case of information described above is fine. > > But what I really want to know is what the scheduler is planning to do > in these situations? The Panfrost return value in this patch is really a > "did we trigger a GPU reset" - and doesn't seem to match the > descriptions above. The scheduler would like to know if the task is used by the GPU, i.e. it is in the hardware and its memories are used and krefs are upped, and DMA is taking place, etc. Or, if the task is out of the hardware, and the DRM can free relevant memories, decrement relevant krefs, free ancillary memories and return the task back to the application client. It's just about information. Ultimately driver maintainers decide the most appropriate action in the timer timeout callback and then the most appropriate return code. This patch really mimics (tries to) default (current) behaviour, so as to have as minimal impact as possible, yet show that finer grained status is attainable from the timer timeout callback. No differing action is taken by the scheduler at the moment, as shown in this patch. Regards, Luben > > Steve > >> */ >> - void (*timedout_job)(struct drm_sched_job *sched_job); >> + enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job); >> >> /** >> * @free_job: Called once the job's finished fence has been signaled >> >
On 2020-12-11 4:44 p.m., Luben Tuikov wrote: > If however, you never decide to send it to the hardware and simply > abandon it (in hopes that the DRM or the Application Client will > reissue it), then you should send it back to DRM with status "COMPLETE". Correction: "... decide to *not* send it to ..." Regards, Luben
On 11/12/2020 21:44, Luben Tuikov wrote: > On 2020-12-10 4:46 a.m., Steven Price wrote: >> On 10/12/2020 02:14, Luben Tuikov wrote: >>> This patch does not change current behaviour. >>> >>> The driver's job timeout handler now returns >>> status indicating back to the DRM layer whether >>> the task (job) was successfully aborted or whether >>> more time should be given to the task to complete. >> >> I find the definitions given a little confusing, see below. >> >>> Default behaviour as of this patch, is preserved, >>> except in obvious-by-comment case in the Panfrost >>> driver, as documented below. >>> >>> All drivers which make use of the >>> drm_sched_backend_ops' .timedout_job() callback >>> have been accordingly renamed and return the >>> would've-been default value of >>> DRM_TASK_STATUS_ALIVE to restart the task's >>> timeout timer--this is the old behaviour, and >>> is preserved by this patch. >>> >>> In the case of the Panfrost driver, its timedout >>> callback correctly first checks if the job had >>> completed in due time and if so, it now returns >>> DRM_TASK_STATUS_COMPLETE to notify the DRM layer >>> that the task can be moved to the done list, to be >>> freed later. In the other two subsequent checks, >>> the value of DRM_TASK_STATUS_ALIVE is returned, as >>> per the default behaviour. >>> >>> A more involved driver's solutions can be had >>> in subequent patches. >> >> NIT: ^^^^^^^^^ subsequent > > Thank you--no idea how my spellcheck and checkpatch.pl missed that. > >> >>> >>> v2: Use enum as the status of a driver's job >>> timeout callback method. >>> >>> Cc: Alexander Deucher <Alexander.Deucher@amd.com> >>> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> >>> Cc: Christian König <christian.koenig@amd.com> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> >>> Cc: Lucas Stach <l.stach@pengutronix.de> >>> Cc: Russell King <linux+etnaviv@armlinux.org.uk> >>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> >>> Cc: Qiang Yu <yuq825@gmail.com> >>> Cc: Rob Herring <robh@kernel.org> >>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> >>> Cc: Steven Price <steven.price@arm.com> >>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> >>> Cc: Eric Anholt <eric@anholt.net> >>> Reported-by: kernel test robot <lkp@intel.com> >> >> This reported-by seems a little odd for this patch. > > I got this because I wasn't (originally) changing all drivers > which were using the task timeout callback. > Should I remove it? Well the kernel test robot would only have "reported" things like build failures and this patch isn't really 'fixing' anything (as you state it does not change current behaviour). So personally I'm not sure how the kernel test robot triggered you to write this patch. But equally if you were inspired by it and want to credit it, that's fine by me! I'd assumed that this was just a copy-paste error and you'd taken the list of CC's from another patch and accidentally included the Reported-by too. >> >>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- >>> drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- >>> drivers/gpu/drm/lima/lima_sched.c | 4 +++- >>> drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- >>> drivers/gpu/drm/scheduler/sched_main.c | 4 +--- >>> drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ >>> include/drm/gpu_scheduler.h | 20 +++++++++++++--- >>> 7 files changed, 57 insertions(+), 28 deletions(-) >>> >> >> [....] >> >>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h >>> index 2e0c368e19f6..cedfc5394e52 100644 >>> --- a/include/drm/gpu_scheduler.h >>> +++ b/include/drm/gpu_scheduler.h >>> @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job, >>> return s_job && atomic_inc_return(&s_job->karma) > threshold; >>> } >>> >>> +enum drm_task_status { >>> + DRM_TASK_STATUS_COMPLETE, >>> + DRM_TASK_STATUS_ALIVE >>> +}; >>> + >>> /** >>> * struct drm_sched_backend_ops >>> * >>> @@ -230,10 +235,19 @@ struct drm_sched_backend_ops { >>> struct dma_fence *(*run_job)(struct drm_sched_job *sched_job); >>> >>> /** >>> - * @timedout_job: Called when a job has taken too long to execute, >>> - * to trigger GPU recovery. >>> + * @timedout_job: Called when a job has taken too long to execute, >>> + * to trigger GPU recovery. >>> + * >>> + * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy >>> + * and executing in the hardware, i.e. it needs more time. >> >> So 'alive' means the job (was) alive, and GPU recovery is happening. >> I.e. it's the job just takes too long. Panfrost will trigger a GPU reset >> (killing the job) in this case while returning DRM_TASK_STATUS_ALIVE. > > "ALIVE" means "at this moment the task is in the hardware executing, > using memories, DMA engines, compute units, the whole thing." You, > can also call it active, executing, busy, etc. > > "ALIVE" makes no assumptions about how the task got there. Maybe > this was original submission, and the task is taking its sweet time. > Maybe the driver aborted it and reissued it, all in the timer timeout > callback, etc. > > It merely tells the DRM layer that the task is actively executing > in the hardware, so no assumptions can be made about freeing memories, > decrementing krefs, etc. IOW, it's executing, check back later. Ok, makes sense. Although I think the comment about "it needs more time" could do with a clarification that it's up to the timeout handler whether more time is granted (i.e. it's not the DRM scheduler core that is responsible for deciding to kill the job). >> >>> + * >>> + * Return DRM_TASK_STATUS_COMPLETE, if the task (job) has >>> + * been aborted or is unknown to the hardware, i.e. if >>> + * the task is out of the hardware, and maybe it is now >>> + * in the done list, or it was completed long ago, or >>> + * if it is unknown to the hardware. >> >> Where 'complete' seems to mean a variety of things: >> >> * The job completed successfully (i.e. the timeout raced), this is the >> situation that Panfrost detects. In this case (and only this case) the >> GPU reset will *not* happen. > > Sounds good! > >> >> * The job failed (aborted) and is no longer on the hardware. Panfrost >> currently handles a job failure by triggering drm_sched_fault() to >> trigger the timeout handler. But the timeout handler doesn't handle this >> differently so will return DRM_TASK_STATUS_ALIVE. > > Panfrost seems to do the right thing here by triggering. But I wonder, > could the Panfrost driver simply return the task back to the DRM > layer? > > If the task is out of the hardware, it should return COMPLETE, a la, > > Driver: COMPLETE! > DRM: Fine, I can free memories and decrement krefs and toss it over > to the application client. (The application client would > then find out if the execution status was success of failure. > For instance, division by 0, or tan(pi/2), etc.) > > Driver: ALIVE! > DRM: Fine, I cannot make any assumptions of any kind. I'll back off > for a bit, and if I don't hear from you in some time, I'll > check back later. Yeah this is definitely an area of improvement - there are a number of shortcuts taken in the current driver. Handling of job failure (and job timeout) are areas that could be improved a lot. But equally they are things that don't happen during normal operation so haven't been a focus of attention. Ideally resetting the GPU should only be attempted if the GPU has actually locked up, rather than just a job failing or taking too long. >> * The job is "unknown to hardware". There are some corner cases in >> Panfrost (specifically two early returns from panfrost_job_hw_submit()) >> where the job never actually lands on the hardware, but the scheduler >> isn't informed. We currently rely on the timeout handling to recover >> from that. However, again, the timeout handler doesn't know about this >> soo will return DRM_TASK_STATUS_ALIVE. > > "We currently rely on the timeout handling to recover from that." > Do you mean that when the timeout handler kicks in, you find > that the task never landed in the hardware and then you send it > to the hardware? Yes? Then return "ALIVE". > If however, you never decide to send it to the hardware and simply > abandon it (in hopes that the DRM or the Application Client will > reissue it), then you should send it back to DRM with status "COMPLETE". It's a corner case - the job never landed on the hardware, but therefore never 'completed' either. Since it's not expected to happen in normal operation (it requires the PM code to fail to power on the GPU or an internal driver error causing the GPU to be unexpectedly busy when submitting the job) the handling of this is to pretend the job is stuck on the GPU and recover in the usual way with a GPU reset. So I guess in this case since we're lying to ourselves we should also lie to the DRM scheduler with the ALIVE return. > >> So of the four cases listed in these comments, Panfrost is only getting >> 2 'correct' after this change. > > The Panfrost driver knows the hardware intimately, and maybe the hardware, > or a newer version of it, can give a finer control over task execution > on a GPU. If you can abort the task and reissue it, all in the timer > callback, return ALIVE. If you abort it and kick it out of the hardware, > return COMPLETE. > > I'd say, this is about information, and passing information back to > the DRM scheduler. It could be any information you'd like. I thought > that the minimal case of information described above is fine. Ok, I think all that's really needed it to improve on the wording in the comment. Specifically: * ALIVE: The DRM scheduler must maintain knowledge of the job (not free the data). * COMPLETE: The DRM scheduler can free the job. >> >> But what I really want to know is what the scheduler is planning to do >> in these situations? The Panfrost return value in this patch is really a >> "did we trigger a GPU reset" - and doesn't seem to match the >> descriptions above. > > The scheduler would like to know if the task is used by the GPU, i.e. > it is in the hardware and its memories are used and krefs are upped, > and DMA is taking place, etc. Or, if the task is out of the hardware, > and the DRM can free relevant memories, decrement relevant krefs, > free ancillary memories and return the task back to the application > client. > > It's just about information. Ultimately driver maintainers decide > the most appropriate action in the timer timeout callback and then > the most appropriate return code. This patch really mimics (tries to) > default (current) behaviour, so as to have as minimal impact as possible, > yet show that finer grained status is attainable from the timer timeout > callback. > > No differing action is taken by the scheduler at the moment, > as shown in this patch. Thanks for the clarification. I think mostly my confusion is because Panfrost currently handles various failures in a rather brutal manner (resetting the GPU even if not necessary) which means the terms don't match up as neatly as they should. Ideally I'll find some free time too tidy some of this up, but sadly this isn't my main task. Thanks, Steve
Am 11.12.20 um 21:44 schrieb Luben Tuikov: > On 2020-12-10 4:41 a.m., Christian König wrote: >> Am 10.12.20 um 10:31 schrieb Lucas Stach: >>> Hi Luben, >>> >>> Am Mittwoch, den 09.12.2020, 21:14 -0500 schrieb Luben Tuikov: >>>> [SNIP] >>>> -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >>>> +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job >>>> + *sched_job) >>>> { >>>> struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); >>>> struct etnaviv_gpu *gpu = submit->gpu; >>>> @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) >>>> >>>> drm_sched_resubmit_jobs(&gpu->sched); >>>> >>>> + /* Tell the DRM scheduler that this task needs >>>> + * more time. >>>> + */ >>> This comment doesn't match the kernel coding style, but it's also moot >>> as the whole added code block isn't needed. The code just below is >>> identical, so letting execution continue here instead of returning >>> would be the right thing to do, but maybe you mean to return >>> DRM_TASK_STATUS_COMPLETE? It's a bit confusing that aborted and job >>> successfully finished should be signaled with the same return code. >> Yes and no. As I tried to describe in my previous mail the naming of the >> enum values is also not very good. > I tried to make the naming as minimal as possible: > COMPLETE: the task is out of the hardware. > ALIVE: the task is in the hardware. Yeah in general that is a good idea, but in this special case that information is not useful to the scheduler since it needs to restart the timer anyway. >> See even when the job has completed we need to restart the timer for the >> potential next job. > Sure, yes. But this is something which the DRM decides--why should > drivers know of the internals of the DRM? (i.e. that it restarts > the timer or that there is a timer, etc.) Return minimal > value and let the DRM decide what to do next. That's correct, but the driver should also not provide useless information to the scheduler. > >> Only when the device is completely gone and unrecoverable we should not >> restart the timer. > Yes, agreed. > >> I suggest to either make this an int and return -ENODEV when that >> happens or rename the enum to something like DRM_SCHED_NODEV. > It was an int, but you suggested that it'd be a macro, so I made > it an enum so that the complier can check the values against the macros > returned. Well what I suggested was to use something better than 0/1 as return value. And that was a good idea since I didn't noticed before that you want to return the job status here, which in turn is not necessary a good idea. > I suggested, DRM_TASK_SCHED_ENODEV, but let Andrey add it > when he adds his patches on top of my patch here, because his > work adds hotplug/unplug support. > > Also, note that if the pending list is freed, while the DRM > had been blocked, i.e. notified that the device is gone, > then returning DRM_TASK_SCHED_ENODEV would be moot, as there > are no tasks in the pending list. > > This patch here is good as it is, since it is minimal and doesn't make > assumptions on DRM behaviour. The problem with this patch is that it's adding a functionality which we don't immediately use and as far as I can see won't use in the future either. Regards, Christian. > > Regards, > Luben > >> Regards, >> Christian. >> >>>> + drm_sched_start(&gpu->sched, true); >>>> + return DRM_TASK_STATUS_ALIVE; >>>> + >>>> out_no_timeout: >>>> /* restart scheduler after GPU is usable again */ >>>> drm_sched_start(&gpu->sched, true); >>>> + return DRM_TASK_STATUS_ALIVE; >>>> } >>> Regards, >>> Lucas >>> > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Hi Luben, Am Freitag, dem 11.12.2020 um 15:36 -0500 schrieb Luben Tuikov: > On 2020-12-10 4:31 a.m., Lucas Stach wrote: > > Hi Luben, > > > > Am Mittwoch, den 09.12.2020, 21:14 -0500 schrieb Luben Tuikov: > > > This patch does not change current behaviour. > > > > > > The driver's job timeout handler now returns > > > status indicating back to the DRM layer whether > > > the task (job) was successfully aborted or whether > > > more time should be given to the task to complete. > > > > > > Default behaviour as of this patch, is preserved, > > > except in obvious-by-comment case in the Panfrost > > > driver, as documented below. > > > > > > All drivers which make use of the > > > drm_sched_backend_ops' .timedout_job() callback > > > have been accordingly renamed and return the > > > would've-been default value of > > > DRM_TASK_STATUS_ALIVE to restart the task's > > > timeout timer--this is the old behaviour, and > > > is preserved by this patch. > > > > > > In the case of the Panfrost driver, its timedout > > > callback correctly first checks if the job had > > > completed in due time and if so, it now returns > > > DRM_TASK_STATUS_COMPLETE to notify the DRM layer > > > that the task can be moved to the done list, to be > > > freed later. In the other two subsequent checks, > > > the value of DRM_TASK_STATUS_ALIVE is returned, as > > > per the default behaviour. > > > > > > A more involved driver's solutions can be had > > > in subequent patches. > > > > > > v2: Use enum as the status of a driver's job > > > timeout callback method. > > > > > > Cc: Alexander Deucher <Alexander.Deucher@amd.com> > > > Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> > > > Cc: Christian König <christian.koenig@amd.com> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > > > Cc: Lucas Stach <l.stach@pengutronix.de> > > > Cc: Russell King <linux+etnaviv@armlinux.org.uk> > > > Cc: Christian Gmeiner <christian.gmeiner@gmail.com> > > > Cc: Qiang Yu <yuq825@gmail.com> > > > Cc: Rob Herring <robh@kernel.org> > > > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> > > > Cc: Steven Price <steven.price@arm.com> > > > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> > > > Cc: Eric Anholt <eric@anholt.net> > > > Reported-by: kernel test robot <lkp@intel.com> > > > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- > > > drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- > > > drivers/gpu/drm/lima/lima_sched.c | 4 +++- > > > drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- > > > drivers/gpu/drm/scheduler/sched_main.c | 4 +--- > > > drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ > > > include/drm/gpu_scheduler.h | 20 +++++++++++++--- > > > 7 files changed, 57 insertions(+), 28 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > > index ff48101bab55..a111326cbdde 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > > > @@ -28,7 +28,7 @@ > > > #include "amdgpu.h" > > > #include "amdgpu_trace.h" > > > > > > > > > > > > > > > -static void amdgpu_job_timedout(struct drm_sched_job *s_job) > > > +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job) > > > { > > > struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); > > > struct amdgpu_job *job = to_amdgpu_job(s_job); > > > @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) > > > amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) { > > > DRM_ERROR("ring %s timeout, but soft recovered\n", > > > s_job->sched->name); > > > - return; > > > + return DRM_TASK_STATUS_ALIVE; > > > } > > > > > > > > > > > > > > > amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); > > > @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) > > > > > > > > > > > > > > > if (amdgpu_device_should_recover_gpu(ring->adev)) { > > > amdgpu_device_gpu_recover(ring->adev, job); > > > + return DRM_TASK_STATUS_ALIVE; > > > } else { > > > drm_sched_suspend_timeout(&ring->sched); > > > if (amdgpu_sriov_vf(adev)) > > > adev->virt.tdr_debug = true; > > > + return DRM_TASK_STATUS_ALIVE; > > > } > > > } > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c > > > index cd46c882269c..c49516942328 100644 > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c > > > @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job) > > > return fence; > > > } > > > > > > > > > > > > > > > -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) > > > +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job > > > + *sched_job) > > > { > > > struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); > > > struct etnaviv_gpu *gpu = submit->gpu; > > > @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) > > > > > > drm_sched_resubmit_jobs(&gpu->sched); > > > > > > + /* Tell the DRM scheduler that this task needs > > > + * more time. > > > + */ > > > > This comment doesn't match the kernel coding style, but it's also moot > > as the whole added code block isn't needed. The code just below is > > identical, so letting execution continue here instead of returning > > would be the right thing to do, but maybe you mean to return > > DRM_TASK_STATUS_COMPLETE? It's a bit confusing that aborted and job > > successfully finished should be signaled with the same return code. > > The kernel coding style takes the net/ style of comments, as well > as the non-net/ style of comments--with a leading /* on an empty line. > I'm using the net/ style. checkpatch.pl said everything is okay, > which I've integrated in my git-hooks to check every patch and > every commit. As a general rule, use the coding style of the surrounding code. I'm certainly unwilling to accept net style comments in etnaviv. > I'm not familiar with the etnaviv's internals and what you see here > is my best guesstimate. > > I understand your confusion that an aborted job and > successfully completed job BOTH return DRM_TASK_STATUS_COMPLETE, > right? That's insanity, right? Perhaps we should return ABORTED > for one and FINISHED for another, no? > > So, this is intentional. DRM_TASK_STATUS_COMPLETE doesn't > indicate the execution status of the task--this is for > the application client to determine, e.g. Mesa. For DRM and the driver, > as a transport, the driver wants to tell the DRM layer > that the DRM layer will *not hear from this task*, that is > this task is out of the hardware and as such relevant memory can be > released. > > Task was aborted successfully: out of the hardware, free relevant memories. > Task has completed successfully: out of the hardware, free relevant memories. > > As a transport, the driver and DRM don't want to know the internals > of the task--only if it is, or not, in the hardware, so that krefs > can be kept or decremented, and relevant memories freed. > > By returning "ALIVE", the driver says to DRM, that the task > is now in the hardware. Maybe it was aborted and reissued, > maybe it is still executing--we don't care. > > The DRM layer can decide what to do next, but the driver > doesn't control this. For instance, a sensible thing to do > would be to extend the timeout timer for this task, but this > something which DRM does, and the driver shouldn't necessarily > control this--i.e. a simple code is returned, and a clear > separation between the layers is set. > > "ALIVE" is essentially what we did before this patch, > so here I return this to mimic past behaviour. > Should COMPLETE be returned? Is the task out of > the hardware, never to be heard of again? If you just want a indication of weather the job is still pending in the hardware or has been completed/thrown out I'm not sure why you would want to funnel this information through the return value of the timeout handler. There is already a canonical place for this information: the jobs hardware fence. If the fence is signaled the job has left the hardware and you can free all associated resources. If the fence isn't signaled the job is still pending and you need to keep the resources around. Regards, Lucas
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index ff48101bab55..a111326cbdde 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -28,7 +28,7 @@ #include "amdgpu.h" #include "amdgpu_trace.h" -static void amdgpu_job_timedout(struct drm_sched_job *s_job) +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job) { struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); struct amdgpu_job *job = to_amdgpu_job(s_job); @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) { DRM_ERROR("ring %s timeout, but soft recovered\n", s_job->sched->name); - return; + return DRM_TASK_STATUS_ALIVE; } amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) if (amdgpu_device_should_recover_gpu(ring->adev)) { amdgpu_device_gpu_recover(ring->adev, job); + return DRM_TASK_STATUS_ALIVE; } else { drm_sched_suspend_timeout(&ring->sched); if (amdgpu_sriov_vf(adev)) adev->virt.tdr_debug = true; + return DRM_TASK_STATUS_ALIVE; } } diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index cd46c882269c..c49516942328 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job) return fence; } -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job + *sched_job) { struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); struct etnaviv_gpu *gpu = submit->gpu; @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) drm_sched_resubmit_jobs(&gpu->sched); + /* Tell the DRM scheduler that this task needs + * more time. + */ + drm_sched_start(&gpu->sched, true); + return DRM_TASK_STATUS_ALIVE; + out_no_timeout: /* restart scheduler after GPU is usable again */ drm_sched_start(&gpu->sched, true); + return DRM_TASK_STATUS_ALIVE; } static void etnaviv_sched_free_job(struct drm_sched_job *sched_job) diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index 63b4c5643f9c..66d9236b8760 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task) mutex_unlock(&dev->error_task_list_lock); } -static void lima_sched_timedout_job(struct drm_sched_job *job) +static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job) { struct lima_sched_pipe *pipe = to_lima_pipe(job->sched); struct lima_sched_task *task = to_lima_task(job); @@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job) drm_sched_resubmit_jobs(&pipe->base); drm_sched_start(&pipe->base, true); + + return DRM_TASK_STATUS_ALIVE; } static void lima_sched_free_job(struct drm_sched_job *job) diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 04e6f6f9b742..845148a722e4 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue) mutex_unlock(&queue->lock); } -static void panfrost_job_timedout(struct drm_sched_job *sched_job) +static enum drm_task_status panfrost_job_timedout(struct drm_sched_job + *sched_job) { struct panfrost_job *job = to_panfrost_job(sched_job); struct panfrost_device *pfdev = job->pfdev; @@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) * spurious. Bail out. */ if (dma_fence_is_signaled(job->done_fence)) - return; + return DRM_TASK_STATUS_COMPLETE; dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p", js, @@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) /* Scheduler is already stopped, nothing to do. */ if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job)) - return; + return DRM_TASK_STATUS_ALIVE; /* Schedule a reset if there's no reset in progress. */ if (!atomic_xchg(&pfdev->reset.pending, 1)) schedule_work(&pfdev->reset.work); + + return DRM_TASK_STATUS_ALIVE; } static const struct drm_sched_backend_ops panfrost_sched_ops = { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 3eb7618a627d..b9876cad94f2 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -526,7 +526,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery) EXPORT_SYMBOL(drm_sched_start); /** - * drm_sched_resubmit_jobs - helper to relunch job from pending ring list + * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list * * @sched: scheduler instance * @@ -560,8 +560,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched) } else { s_job->s_fence->parent = fence; } - - } } EXPORT_SYMBOL(drm_sched_resubmit_jobs); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 452682e2209f..3740665ec479 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job) return NULL; } -static void +static enum drm_task_status v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job) { enum v3d_queue q; @@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job) } mutex_unlock(&v3d->reset_lock); + + return DRM_TASK_STATUS_ALIVE; } /* If the current address or return address have changed, then the GPU @@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job) * could fail if the GPU got in an infinite loop in the CL, but that * is pretty unlikely outside of an i-g-t testcase. */ -static void +static enum drm_task_status v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q, u32 *timedout_ctca, u32 *timedout_ctra) { @@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q, if (*timedout_ctca != ctca || *timedout_ctra != ctra) { *timedout_ctca = ctca; *timedout_ctra = ctra; - return; + return DRM_TASK_STATUS_ALIVE; } - v3d_gpu_reset_for_timeout(v3d, sched_job); + return v3d_gpu_reset_for_timeout(v3d, sched_job); } -static void +static enum drm_task_status v3d_bin_job_timedout(struct drm_sched_job *sched_job) { struct v3d_bin_job *job = to_bin_job(sched_job); - v3d_cl_job_timedout(sched_job, V3D_BIN, - &job->timedout_ctca, &job->timedout_ctra); + return v3d_cl_job_timedout(sched_job, V3D_BIN, + &job->timedout_ctca, &job->timedout_ctra); } -static void +static enum drm_task_status v3d_render_job_timedout(struct drm_sched_job *sched_job) { struct v3d_render_job *job = to_render_job(sched_job); - v3d_cl_job_timedout(sched_job, V3D_RENDER, - &job->timedout_ctca, &job->timedout_ctra); + return v3d_cl_job_timedout(sched_job, V3D_RENDER, + &job->timedout_ctca, &job->timedout_ctra); } -static void +static enum drm_task_status v3d_generic_job_timedout(struct drm_sched_job *sched_job) { struct v3d_job *job = to_v3d_job(sched_job); - v3d_gpu_reset_for_timeout(job->v3d, sched_job); + return v3d_gpu_reset_for_timeout(job->v3d, sched_job); } -static void +static enum drm_task_status v3d_csd_job_timedout(struct drm_sched_job *sched_job) { struct v3d_csd_job *job = to_csd_job(sched_job); @@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job) */ if (job->timedout_batches != batches) { job->timedout_batches = batches; - return; + return DRM_TASK_STATUS_ALIVE; } - v3d_gpu_reset_for_timeout(v3d, sched_job); + return v3d_gpu_reset_for_timeout(v3d, sched_job); } static const struct drm_sched_backend_ops v3d_bin_sched_ops = { diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 2e0c368e19f6..cedfc5394e52 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job, return s_job && atomic_inc_return(&s_job->karma) > threshold; } +enum drm_task_status { + DRM_TASK_STATUS_COMPLETE, + DRM_TASK_STATUS_ALIVE +}; + /** * struct drm_sched_backend_ops * @@ -230,10 +235,19 @@ struct drm_sched_backend_ops { struct dma_fence *(*run_job)(struct drm_sched_job *sched_job); /** - * @timedout_job: Called when a job has taken too long to execute, - * to trigger GPU recovery. + * @timedout_job: Called when a job has taken too long to execute, + * to trigger GPU recovery. + * + * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy + * and executing in the hardware, i.e. it needs more time. + * + * Return DRM_TASK_STATUS_COMPLETE, if the task (job) has + * been aborted or is unknown to the hardware, i.e. if + * the task is out of the hardware, and maybe it is now + * in the done list, or it was completed long ago, or + * if it is unknown to the hardware. */ - void (*timedout_job)(struct drm_sched_job *sched_job); + enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job); /** * @free_job: Called once the job's finished fence has been signaled
This patch does not change current behaviour. The driver's job timeout handler now returns status indicating back to the DRM layer whether the task (job) was successfully aborted or whether more time should be given to the task to complete. Default behaviour as of this patch, is preserved, except in obvious-by-comment case in the Panfrost driver, as documented below. All drivers which make use of the drm_sched_backend_ops' .timedout_job() callback have been accordingly renamed and return the would've-been default value of DRM_TASK_STATUS_ALIVE to restart the task's timeout timer--this is the old behaviour, and is preserved by this patch. In the case of the Panfrost driver, its timedout callback correctly first checks if the job had completed in due time and if so, it now returns DRM_TASK_STATUS_COMPLETE to notify the DRM layer that the task can be moved to the done list, to be freed later. In the other two subsequent checks, the value of DRM_TASK_STATUS_ALIVE is returned, as per the default behaviour. A more involved driver's solutions can be had in subequent patches. v2: Use enum as the status of a driver's job timeout callback method. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Eric Anholt <eric@anholt.net> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 +++-- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++- drivers/gpu/drm/lima/lima_sched.c | 4 +++- drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++--- drivers/gpu/drm/scheduler/sched_main.c | 4 +--- drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++------------ include/drm/gpu_scheduler.h | 20 +++++++++++++--- 7 files changed, 57 insertions(+), 28 deletions(-)