Message ID | 20190520231649.24595-1-nunes.erico@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] drm/scheduler: Fix job cleanup without timeout handler | expand |
Am 21.05.19 um 01:16 schrieb Erico Nunes: > [CAUTION: External Email] > > After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are > only deleted when the timeout handler is able to be cancelled > successfully. > > In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), > job cleanup would be skipped which may result in memory leaks. > > Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in > drm_sched_cleanup_jobs. > > Signed-off-by: Erico Nunes <nunes.erico@gmail.com> > Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Going to pick that up later today into our internal branch. Thanks for the help, Christian. > --- > drivers/gpu/drm/scheduler/sched_main.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index f8f0e1c19002..10d1d37e644a 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -630,7 +630,8 @@ static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) > unsigned long flags; > > /* Don't destroy jobs while the timeout worker is running */ > - if (!cancel_delayed_work(&sched->work_tdr)) > + if (sched->timeout != MAX_SCHEDULE_TIMEOUT && > + !cancel_delayed_work(&sched->work_tdr)) > return; > > > -- > 2.20.1 >
On Tue, May 21, 2019 at 8:47 AM Koenig, Christian <Christian.Koenig@amd.com> wrote: > > Am 21.05.19 um 01:16 schrieb Erico Nunes: > > [CAUTION: External Email] > > > > After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are > > only deleted when the timeout handler is able to be cancelled > > successfully. > > > > In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), > > job cleanup would be skipped which may result in memory leaks. > > > > Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in > > drm_sched_cleanup_jobs. > > > > Signed-off-by: Erico Nunes <nunes.erico@gmail.com> > > Cc: Christian König <christian.koenig@amd.com> > > Reviewed-by: Christian König <christian.koenig@amd.com> > > Going to pick that up later today into our internal branch. Thanks. I also posted one to set lima to use a real default timeout. Is it possible for us to still get at least one of these into a tree that goes to a linux 5.2-rc? This issue may make lima unusable as the system runs out of memory quickly if many opengl programs are launched. Erico
Am 21.05.19 um 14:16 schrieb Erico Nunes: > [CAUTION: External Email] > > On Tue, May 21, 2019 at 8:47 AM Koenig, Christian > <Christian.Koenig@amd.com> wrote: >> Am 21.05.19 um 01:16 schrieb Erico Nunes: >>> [CAUTION: External Email] >>> >>> After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are >>> only deleted when the timeout handler is able to be cancelled >>> successfully. >>> >>> In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), >>> job cleanup would be skipped which may result in memory leaks. >>> >>> Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in >>> drm_sched_cleanup_jobs. >>> >>> Signed-off-by: Erico Nunes <nunes.erico@gmail.com> >>> Cc: Christian König <christian.koenig@amd.com> >> Reviewed-by: Christian König <christian.koenig@amd.com> >> >> Going to pick that up later today into our internal branch. > Thanks. I also posted one to set lima to use a real default timeout. > > Is it possible for us to still get at least one of these into a tree > that goes to a linux 5.2-rc? > This issue may make lima unusable as the system runs out of memory > quickly if many opengl programs are launched. Alex should pick it up for his next -fixes pull this week. Christian. > > Erico
On Tue, May 21, 2019 at 8:17 PM Erico Nunes <nunes.erico@gmail.com> wrote: > > On Tue, May 21, 2019 at 8:47 AM Koenig, Christian > <Christian.Koenig@amd.com> wrote: > > > > Am 21.05.19 um 01:16 schrieb Erico Nunes: > > > [CAUTION: External Email] > > > > > > After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are > > > only deleted when the timeout handler is able to be cancelled > > > successfully. > > > > > > In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), > > > job cleanup would be skipped which may result in memory leaks. > > > > > > Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in > > > drm_sched_cleanup_jobs. > > > > > > Signed-off-by: Erico Nunes <nunes.erico@gmail.com> > > > Cc: Christian König <christian.koenig@amd.com> > > > > Reviewed-by: Christian König <christian.koenig@amd.com> > > > > Going to pick that up later today into our internal branch. > > Thanks. I also posted one to set lima to use a real default timeout. > > Is it possible for us to still get at least one of these into a tree > that goes to a linux 5.2-rc? > This issue may make lima unusable as the system runs out of memory > quickly if many opengl programs are launched. > The "drm/scheduler: rework job destruction" commit is not in 5.2-rc. So no need to pick any fix to that place. It would be better that run some tests on 5.2-rc for other possible fixes indeed. Regards, Qiang
On Tue, May 21, 2019 at 2:48 AM Koenig, Christian <Christian.Koenig@amd.com> wrote: > > Am 21.05.19 um 01:16 schrieb Erico Nunes: > > [CAUTION: External Email] > > > > After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are > > only deleted when the timeout handler is able to be cancelled > > successfully. > > > > In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), > > job cleanup would be skipped which may result in memory leaks. > > > > Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in > > drm_sched_cleanup_jobs. > > > > Signed-off-by: Erico Nunes <nunes.erico@gmail.com> > > Cc: Christian König <christian.koenig@amd.com> > > Reviewed-by: Christian König <christian.koenig@amd.com> > > Going to pick that up later today into our internal branch. Please apply it to drm-misc-next. that is where the other gpu scheduler changes are. They are not in 5.2. Alex > > Thanks for the help, > Christian. > > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > index f8f0e1c19002..10d1d37e644a 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -630,7 +630,8 @@ static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) > > unsigned long flags; > > > > /* Don't destroy jobs while the timeout worker is running */ > > - if (!cancel_delayed_work(&sched->work_tdr)) > > + if (sched->timeout != MAX_SCHEDULE_TIMEOUT && > > + !cancel_delayed_work(&sched->work_tdr)) > > return; > > > > > > -- > > 2.20.1 > > > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
Am 21.05.19 um 16:13 schrieb Alex Deucher: > [CAUTION: External Email] > > On Tue, May 21, 2019 at 2:48 AM Koenig, Christian > <Christian.Koenig@amd.com> wrote: >> Am 21.05.19 um 01:16 schrieb Erico Nunes: >>> [CAUTION: External Email] >>> >>> After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are >>> only deleted when the timeout handler is able to be cancelled >>> successfully. >>> >>> In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), >>> job cleanup would be skipped which may result in memory leaks. >>> >>> Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in >>> drm_sched_cleanup_jobs. >>> >>> Signed-off-by: Erico Nunes <nunes.erico@gmail.com> >>> Cc: Christian König <christian.koenig@amd.com> >> Reviewed-by: Christian König <christian.koenig@amd.com> >> >> Going to pick that up later today into our internal branch. > Please apply it to drm-misc-next. that is where the other gpu > scheduler changes are. They are not in 5.2. Ah! Now that makes sense again, thanks for the reminder. Christian. > > Alex > >> Thanks for the help, >> Christian. >> >>> --- >>> drivers/gpu/drm/scheduler/sched_main.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>> index f8f0e1c19002..10d1d37e644a 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>> @@ -630,7 +630,8 @@ static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) >>> unsigned long flags; >>> >>> /* Don't destroy jobs while the timeout worker is running */ >>> - if (!cancel_delayed_work(&sched->work_tdr)) >>> + if (sched->timeout != MAX_SCHEDULE_TIMEOUT && >>> + !cancel_delayed_work(&sched->work_tdr)) >>> return; >>> >>> >>> -- >>> 2.20.1 >>> >> _______________________________________________ >> dri-devel mailing list >> dri-devel@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index f8f0e1c19002..10d1d37e644a 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -630,7 +630,8 @@ static void drm_sched_cleanup_jobs(struct drm_gpu_scheduler *sched) unsigned long flags; /* Don't destroy jobs while the timeout worker is running */ - if (!cancel_delayed_work(&sched->work_tdr)) + if (sched->timeout != MAX_SCHEDULE_TIMEOUT && + !cancel_delayed_work(&sched->work_tdr)) return;
After "5918045c4ed4 drm/scheduler: rework job destruction", jobs are only deleted when the timeout handler is able to be cancelled successfully. In case no timeout handler is running (timeout == MAX_SCHEDULE_TIMEOUT), job cleanup would be skipped which may result in memory leaks. Add the handling for the (timeout == MAX_SCHEDULE_TIMEOUT) case in drm_sched_cleanup_jobs. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Cc: Christian König <christian.koenig@amd.com> --- drivers/gpu/drm/scheduler/sched_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)