Message ID | 20230817050435.3109-1-xinhui.pan@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/scheduler: Partially revert "drm/scheduler: track GPU active time per entity" | expand |
[AMD Official Use Only - General] Can we just add kref for entity? Or just collect such job time usage somewhere else? -----Original Message----- From: Pan, Xinhui <Xinhui.Pan@amd.com> Sent: Thursday, August 17, 2023 1:05 PM To: amd-gfx@lists.freedesktop.org Cc: Tuikov, Luben <Luben.Tuikov@amd.com>; airlied@gmail.com; dri-devel@lists.freedesktop.org; l.stach@pengutronix.de; Koenig, Christian <Christian.Koenig@amd.com>; Pan, Xinhui <Xinhui.Pan@amd.com> Subject: [PATCH] drm/scheduler: Partially revert "drm/scheduler: track GPU active time per entity" This patch partially revert commit df622729ddbf ("drm/scheduler: track GPU active time per entity") which touchs entity without any reference. I notice there is one memory overwritten from gpu scheduler side. The case is like below. A(drm_sched_main) B(vm fini) drm_sched_job_begin drm_sched_entity_kill //job in pending_list wait_for_completion complete_all ... ... kfree entity drm_sched_get_cleanup_job //fetch job from pending_list access job->entity //memory overwitten As long as we can NOT guarantee entity is alive in this case, lets revert it for now. Signed-off-by: xinhui pan <xinhui.pan@amd.com> --- drivers/gpu/drm/scheduler/sched_main.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 602361c690c9..1b3f1a6a8514 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -907,12 +907,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) spin_unlock(&sched->job_list_lock); - if (job) { - job->entity->elapsed_ns += ktime_to_ns( - ktime_sub(job->s_fence->finished.timestamp, - job->s_fence->scheduled.timestamp)); - } - return job; } -- 2.34.1
[Public] Hi Xinhui, That patch has been reverted on Linux mainline. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/scheduler/sched_main.c?h=v6.5-rc6&id=baad10973fdb442912af676de3348e80bd8fe602 Regards, Guchun > -----Original Message----- > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of > xinhui pan > Sent: Thursday, August 17, 2023 1:05 PM > To: amd-gfx@lists.freedesktop.org > Cc: Pan, Xinhui <Xinhui.Pan@amd.com>; dri-devel@lists.freedesktop.org; > Tuikov, Luben <Luben.Tuikov@amd.com>; airlied@gmail.com; Koenig, > Christian <Christian.Koenig@amd.com>; l.stach@pengutronix.de > Subject: [PATCH] drm/scheduler: Partially revert "drm/scheduler: track GPU > active time per entity" > > This patch partially revert commit df622729ddbf ("drm/scheduler: track GPU > active time per entity") which touchs entity without any reference. > > I notice there is one memory overwritten from gpu scheduler side. > The case is like below. > A(drm_sched_main) B(vm fini) > drm_sched_job_begin drm_sched_entity_kill > //job in pending_list wait_for_completion > complete_all ... > ... kfree entity > drm_sched_get_cleanup_job > //fetch job from pending_list > access job->entity //memory overwitten > > As long as we can NOT guarantee entity is alive in this case, lets revert it for > now. > > Signed-off-by: xinhui pan <xinhui.pan@amd.com> > --- > drivers/gpu/drm/scheduler/sched_main.c | 6 ------ > 1 file changed, 6 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 602361c690c9..1b3f1a6a8514 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -907,12 +907,6 @@ drm_sched_get_cleanup_job(struct > drm_gpu_scheduler *sched) > > spin_unlock(&sched->job_list_lock); > > - if (job) { > - job->entity->elapsed_ns += ktime_to_ns( > - ktime_sub(job->s_fence->finished.timestamp, > - job->s_fence->scheduled.timestamp)); > - } > - > return job; > } > > -- > 2.34.1
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 602361c690c9..1b3f1a6a8514 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -907,12 +907,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) spin_unlock(&sched->job_list_lock); - if (job) { - job->entity->elapsed_ns += ktime_to_ns( - ktime_sub(job->s_fence->finished.timestamp, - job->s_fence->scheduled.timestamp)); - } - return job; }
This patch partially revert commit df622729ddbf ("drm/scheduler: track GPU active time per entity") which touchs entity without any reference. I notice there is one memory overwritten from gpu scheduler side. The case is like below. A(drm_sched_main) B(vm fini) drm_sched_job_begin drm_sched_entity_kill //job in pending_list wait_for_completion complete_all ... ... kfree entity drm_sched_get_cleanup_job //fetch job from pending_list access job->entity //memory overwitten As long as we can NOT guarantee entity is alive in this case, lets revert it for now. Signed-off-by: xinhui pan <xinhui.pan@amd.com> --- drivers/gpu/drm/scheduler/sched_main.c | 6 ------ 1 file changed, 6 deletions(-)