| Message ID | 20231102224653.5785-2-ltuikov89@gmail.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | drm/sched: Eliminate drm_sched_run_job_queue_if_ready() |
On 02/11/2023 22:46, Luben Tuikov wrote:
> Eliminate drm_sched_run_job_queue_if_ready() and instead just call
> drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
> the former function uses drm_sched_select_entity() to determine if the
> scheduler had an entity ready in one of its run-queues, and in the case of
> the Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr()
> does just that, selects the _next_ entity which is ready, sets up the
> run-queue and completion and returns that entity. The FIFO scheduling
> algorithm is unaffected.
>
> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(),
> then in the case of RR scheduling, that would result in calling
> select_entity() twice, which may result in skipping a ready entity if more
> than one entity is ready. This commit fixes this by eliminating the
> if_ready() variant.

A Fixes: tag is missing, since the regression has already landed.

>
> Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 14 ++------------
>  1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 98b2ad54fc7071..05816e7cae8c8b 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>  }
>  EXPORT_SYMBOL(drm_sched_pick_best);
>
> -/**
> - * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
> - * @sched: scheduler instance
> - */
> -static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> -{
> -	if (drm_sched_select_entity(sched))
> -		drm_sched_run_job_queue(sched);
> -}
> -
>  /**
>   * drm_sched_free_job_work - worker to call free_job
>   *
> @@ -1069,7 +1059,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
>  		sched->ops->free_job(cleanup_job);
>
>  		drm_sched_free_job_queue_if_done(sched);
> -		drm_sched_run_job_queue_if_ready(sched);
> +		drm_sched_run_job_queue(sched);

It works, but it is a bit wasteful, causing needless CPU wake-ups with a
potentially empty queue, both here and in drm_sched_run_job_work() below.
What would be the problem in having a "peek" type helper? It would be easy
to do it in a single spinlock section instead of a drop and re-acquire.

What is even the point of having the re-queue here _inside_ the
if (cleanup_job) block? See
https://lists.freedesktop.org/archives/dri-devel/2023-November/429037.html.
Because of the lock drop and re-acquire, I don't see that it makes sense to
make the potential re-queue depend on the existence of the current finished
job.

Also, what is the point of doing the re-queue of the run-job queue from the
free worker? (I suppose re-queuing the _free_ worker itself is needed in
the current design, albeit inefficient.)

Regards,

Tvrtko

> 	}
>  }
>
> @@ -1127,7 +1117,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
>  	}
>
>  	wake_up(&sched->job_scheduled);
> -	drm_sched_run_job_queue_if_ready(sched);
> +	drm_sched_run_job_queue(sched);
>  }
>
>  /**
>
> base-commit: 6fd9487147c4f18ad77eea00bd8c9189eec74a3e
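For concreteness, below is a rough userspace sketch of the kind of "peek"
helper being suggested here: the pop of the current job and the check for
further pending work happen in one critical section, so the caller can
decide whether to re-queue without dropping and re-acquiring the lock. This
is illustrative only; the names (job_queue, pop_job_peek_more, and so on)
are made up and this is not the DRM scheduler API.

```c
/*
 * Illustrative-only sketch (hypothetical names, not the DRM scheduler
 * API): pop one job and "peek" whether more work remains, all under a
 * single lock acquisition, so the caller can re-queue only when needed.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct job {
	struct job *next;
	int id;
};

struct job_queue {
	pthread_mutex_t lock;
	struct job *head;
};

/* Pop one job; report, under the same lock, whether more remain. */
static struct job *pop_job_peek_more(struct job_queue *q, bool *more_pending)
{
	struct job *j;

	pthread_mutex_lock(&q->lock);
	j = q->head;
	if (j)
		q->head = j->next;
	*more_pending = q->head != NULL;	/* the "peek" */
	pthread_mutex_unlock(&q->lock);

	return j;
}

int main(void)
{
	struct job second = { .next = NULL, .id = 2 };
	struct job first = { .next = &second, .id = 1 };
	struct job_queue q = { .lock = PTHREAD_MUTEX_INITIALIZER, .head = &first };
	struct job *j;
	bool more;

	while ((j = pop_job_peek_more(&q, &more)) != NULL)
		printf("freed job %d, more pending: %s\n",
		       j->id, more ? "yes" : "no");

	return 0;
}
```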
On Thu, Nov 02, 2023 at 06:46:54PM -0400, Luben Tuikov wrote:
> Eliminate drm_sched_run_job_queue_if_ready() and instead just call
> drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
> the former function uses drm_sched_select_entity() to determine if the
> scheduler had an entity ready in one of its run-queues, and in the case of
> the Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr()
> does just that, selects the _next_ entity which is ready, sets up the
> run-queue and completion and returns that entity. The FIFO scheduling
> algorithm is unaffected.
>
> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(),
> then in the case of RR scheduling, that would result in calling
> select_entity() twice, which may result in skipping a ready entity if more
> than one entity is ready. This commit fixes this by eliminating the
> if_ready() variant.
>

Ah, yes, I guess we both missed this. What about reviving the peek
argument [1]? This would avoid unnecessary re-queues.

Matt

[1] https://patchwork.freedesktop.org/patch/562222/?series=121744&rev=7

> Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 14 ++------------
>  1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 98b2ad54fc7071..05816e7cae8c8b 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>  }
>  EXPORT_SYMBOL(drm_sched_pick_best);
>
> -/**
> - * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
> - * @sched: scheduler instance
> - */
> -static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
> -{
> -	if (drm_sched_select_entity(sched))
> -		drm_sched_run_job_queue(sched);
> -}
> -
>  /**
>   * drm_sched_free_job_work - worker to call free_job
>   *
> @@ -1069,7 +1059,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
>  		sched->ops->free_job(cleanup_job);
>
>  		drm_sched_free_job_queue_if_done(sched);
> -		drm_sched_run_job_queue_if_ready(sched);
> +		drm_sched_run_job_queue(sched);
> 	}
>  }
>
> @@ -1127,7 +1117,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
>  	}
>
>  	wake_up(&sched->job_scheduled);
> -	drm_sched_run_job_queue_if_ready(sched);
> +	drm_sched_run_job_queue(sched);
>  }
>
>  /**
>
> base-commit: 6fd9487147c4f18ad77eea00bd8c9189eec74a3e
> --
> 2.42.1
>
Hi Matt, :-)

On 2023-11-03 11:13, Matthew Brost wrote:
> On Thu, Nov 02, 2023 at 06:46:54PM -0400, Luben Tuikov wrote:
>> Eliminate drm_sched_run_job_queue_if_ready() and instead just call
>> drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
>> the former function uses drm_sched_select_entity() to determine if the
>> scheduler had an entity ready in one of its run-queues, and in the case of
>> the Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr()
>> does just that, selects the _next_ entity which is ready, sets up the
>> run-queue and completion and returns that entity. The FIFO scheduling
>> algorithm is unaffected.
>>
>> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(),
>> then in the case of RR scheduling, that would result in calling
>> select_entity() twice, which may result in skipping a ready entity if more
>> than one entity is ready. This commit fixes this by eliminating the
>> if_ready() variant.
>>
>
> Ah, yes, I guess we both missed this. What about reviving the peek
> argument [1]? This would avoid unnecessary re-queues.

So, I really am not too fond of "peek-then-get-and-do" (scheduling)
organizations, because they don't scale. As we've seen in our case, the RR
path has a side effect, as Tvrtko pointed out (thanks!), and in the future
this "peek first, then go again" type of organization would only prevent us
from doing more interesting things. Also, with the current GPU scheduler
organization, mixing in the "peek" we'd just get to carry it around through
many a function, only to be used in a leaf function, and exported way back
up (because we don't know the rq at that level).

I'd much rather we just did "consume until empty", and if we end up with
one last empty check (or a first one), that's not a breaker. (I mean, we
have drm_sched_pick_best(), which has time complexity O(n), and we execute
it every time we arm a job, so it's not that big of a deal.) Plus, it makes
the code concise and compact.

Let me reconstitute the patch and I'll send it for your review.
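As a counterpoint to the peek helper above, here is a minimal sketch of the
"consume until empty" shape described in this reply (again userspace C with
hypothetical names, not the scheduler code): the worker handles at most one
item per invocation and re-queues itself unconditionally, so the only
overhead is one final invocation that finds the queue empty.

```c
/*
 * Sketch of the "consume until empty" worker shape -- hypothetical
 * names, not the DRM scheduler code. Each invocation handles at most
 * one item and re-queues itself unconditionally; the last invocation
 * simply finds the queue empty and stops.
 */
#include <stdbool.h>
#include <stdio.h>

static int queue[4] = { 10, 20, 30 };
static int queue_len = 3;

/* Pop one item; returns false when the queue is empty. */
static bool pop_one(int *item)
{
	if (queue_len == 0)
		return false;
	*item = queue[0];
	for (int i = 1; i < queue_len; i++)
		queue[i - 1] = queue[i];
	queue_len--;
	return true;
}

static void run_job_work(void);

/* Stand-in for queue_work(): here we just invoke the worker again. */
static void requeue_worker(void)
{
	run_job_work();
}

static void run_job_work(void)
{
	int item;

	if (!pop_one(&item)) {
		/* The single "wasted" invocation: queue already empty. */
		printf("queue empty, going idle\n");
		return;
	}
	printf("ran job %d\n", item);
	requeue_worker();	/* re-queue unconditionally */
}

int main(void)
{
	run_job_work();
	return 0;
}
```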
Hi Tvrtko,

On 2023-11-03 06:39, Tvrtko Ursulin wrote:
>
> On 02/11/2023 22:46, Luben Tuikov wrote:
>> Eliminate drm_sched_run_job_queue_if_ready() and instead just call
>> drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
>> the former function uses drm_sched_select_entity() to determine if the
>> scheduler had an entity ready in one of its run-queues, and in the case of
>> the Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr()
>> does just that, selects the _next_ entity which is ready, sets up the
>> run-queue and completion and returns that entity. The FIFO scheduling
>> algorithm is unaffected.
>>
>> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(),
>> then in the case of RR scheduling, that would result in calling
>> select_entity() twice, which may result in skipping a ready entity if more
>> than one entity is ready. This commit fixes this by eliminating the
>> if_ready() variant.
>
> A Fixes: tag is missing, since the regression has already landed.

Ah, yes, thank you for pointing that out. :-)
I'll add one.

>
>>
>> Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
>> ---
>>  drivers/gpu/drm/scheduler/sched_main.c | 14 ++------------
>>  1 file changed, 2 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 98b2ad54fc7071..05816e7cae8c8b 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>>  }
>>  EXPORT_SYMBOL(drm_sched_pick_best);
>>
>> -/**
>> - * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
>> - * @sched: scheduler instance
>> - */
>> -static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
>> -{
>> -	if (drm_sched_select_entity(sched))
>> -		drm_sched_run_job_queue(sched);
>> -}
>> -
>>  /**
>>   * drm_sched_free_job_work - worker to call free_job
>>   *
>> @@ -1069,7 +1059,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
>>  		sched->ops->free_job(cleanup_job);
>>
>>  		drm_sched_free_job_queue_if_done(sched);
>> -		drm_sched_run_job_queue_if_ready(sched);
>> +		drm_sched_run_job_queue(sched);
>
> It works, but it is a bit wasteful, causing needless CPU wake-ups with a

I'd not worry about "waking up the CPU", as the CPU scheduler would most
likely place the work item on the same CPU due to instruction-cache
locality.

> potentially empty queue, both here and in drm_sched_run_job_work() below.

That's true, but if you were to look at the typical execution of this code,
you'd see a string of entries into this function while the incoming queue
is non-empty, followed by a single entry with an empty queue, after which
the worker is taken off the CPU. So it really isn't a breaker.

That said, there is a way to mitigate this in drm_sched_run_job_work().
I'll see that it makes it into the next version of the patch.

Thanks!
On 04/11/2023 00:25, Luben Tuikov wrote:
> Hi Tvrtko,
>
> On 2023-11-03 06:39, Tvrtko Ursulin wrote:
>>
>> On 02/11/2023 22:46, Luben Tuikov wrote:
>>> Eliminate drm_sched_run_job_queue_if_ready() and instead just call
>>> drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
>>> the former function uses drm_sched_select_entity() to determine if the
>>> scheduler had an entity ready in one of its run-queues, and in the case of
>>> the Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr()
>>> does just that, selects the _next_ entity which is ready, sets up the
>>> run-queue and completion and returns that entity. The FIFO scheduling
>>> algorithm is unaffected.
>>>
>>> Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(),
>>> then in the case of RR scheduling, that would result in calling
>>> select_entity() twice, which may result in skipping a ready entity if more
>>> than one entity is ready. This commit fixes this by eliminating the
>>> if_ready() variant.
>>
>> A Fixes: tag is missing, since the regression has already landed.
>
> Ah, yes, thank you for pointing that out. :-)
> I'll add one.
>
>>
>>>
>>> Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
>>> ---
>>>  drivers/gpu/drm/scheduler/sched_main.c | 14 ++------------
>>>  1 file changed, 2 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 98b2ad54fc7071..05816e7cae8c8b 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
>>>  }
>>>  EXPORT_SYMBOL(drm_sched_pick_best);
>>>
>>> -/**
>>> - * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
>>> - * @sched: scheduler instance
>>> - */
>>> -static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
>>> -{
>>> -	if (drm_sched_select_entity(sched))
>>> -		drm_sched_run_job_queue(sched);
>>> -}
>>> -
>>>  /**
>>>   * drm_sched_free_job_work - worker to call free_job
>>>   *
>>> @@ -1069,7 +1059,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
>>>  		sched->ops->free_job(cleanup_job);
>>>
>>>  		drm_sched_free_job_queue_if_done(sched);
>>> -		drm_sched_run_job_queue_if_ready(sched);
>>> +		drm_sched_run_job_queue(sched);
>>
>> It works, but it is a bit wasteful, causing needless CPU wake-ups with a
>
> I'd not worry about "waking up the CPU", as the CPU scheduler would most
> likely place the work item on the same CPU due to instruction-cache
> locality.
>
>> potentially empty queue, both here and in drm_sched_run_job_work() below.
>
> That's true, but if you were to look at the typical execution of this code,
> you'd see a string of entries into this function while the incoming queue
> is non-empty, followed by a single entry with an empty queue, after which
> the worker is taken off the CPU. So it really isn't a breaker.
>
> That said, there is a way to mitigate this in drm_sched_run_job_work().
> I'll see that it makes it into the next version of the patch.

Okay, I will be keeping an eye on that.

Separately, I might send a patch to do the "re-queue if more pending" check
in one atomic section, instead of re-acquiring the lock.

Also, as a heads-up: at some point in the next few months I will start
looking at the latency and power effects of the "do just one and re-queue"
conversion. In ChromeOS, milli-Watts matter, and some things like media
playback have a lot of inter-engine dependencies, so letting the CPU stay
in its C-states (by avoiding needless wake-ups) might matter.

Well, it might matter for server media transcode stream-density workloads
too, on both power and stream-capacity-per-socket metrics.

Regards,

Tvrtko
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 98b2ad54fc7071..05816e7cae8c8b 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1040,16 +1040,6 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
 }
 EXPORT_SYMBOL(drm_sched_pick_best);
 
-/**
- * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
- * @sched: scheduler instance
- */
-static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
-{
-	if (drm_sched_select_entity(sched))
-		drm_sched_run_job_queue(sched);
-}
-
 /**
  * drm_sched_free_job_work - worker to call free_job
  *
@@ -1069,7 +1059,7 @@ static void drm_sched_free_job_work(struct work_struct *w)
 		sched->ops->free_job(cleanup_job);
 
 		drm_sched_free_job_queue_if_done(sched);
-		drm_sched_run_job_queue_if_ready(sched);
+		drm_sched_run_job_queue(sched);
 	}
 }
 
@@ -1127,7 +1117,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
 	}
 
 	wake_up(&sched->job_scheduled);
-	drm_sched_run_job_queue_if_ready(sched);
+	drm_sched_run_job_queue(sched);
 }
 
 /**
Eliminate drm_sched_run_job_queue_if_ready() and instead just call
drm_sched_run_job_queue() in drm_sched_free_job_work(). The problem is that
the former function uses drm_sched_select_entity() to determine if the
scheduler had an entity ready in one of its run-queues, and in the case of
the Round-Robin (RR) scheduling, the function drm_sched_rq_select_entity_rr()
does just that, selects the _next_ entity which is ready, sets up the
run-queue and completion and returns that entity. The FIFO scheduling
algorithm is unaffected.

Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(),
then in the case of RR scheduling, that would result in calling
select_entity() twice, which may result in skipping a ready entity if more
than one entity is ready. This commit fixes this by eliminating the
if_ready() variant.

Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

base-commit: 6fd9487147c4f18ad77eea00bd8c9189eec74a3e
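To see why the double call matters under RR, here is a self-contained toy
model (every name is made up; this is not the kernel code): a stateful
round-robin selector advances its cursor on every call, so using one call
as a readiness check and a second call to actually pick will skip a ready
entity whenever more than one is ready.

```c
/*
 * Toy model of the round-robin hazard described above -- hypothetical
 * names, not the DRM scheduler code. A stateful RR selector advances
 * its cursor on each call, so calling it once as a readiness check and
 * once more to actually pick skips an entity.
 */
#include <stdio.h>

#define NUM_ENTITIES 3

static int rr_cursor;

/* Stateful RR pick: returns the next ready entity and advances. */
static int select_entity_rr(void)
{
	int picked = rr_cursor;

	rr_cursor = (rr_cursor + 1) % NUM_ENTITIES;
	return picked;
}

int main(void)
{
	/* Buggy pattern: "check" by selecting, then select again to run. */
	if (select_entity_rr() >= 0) {		/* consumes entity 0 */
		int e = select_entity_rr();	/* runs entity 1; 0 was skipped */
		printf("buggy pattern ran entity %d (entity 0 skipped)\n", e);
	}

	rr_cursor = 0;

	/* Fixed pattern: enqueue unconditionally, select exactly once. */
	int e = select_entity_rr();
	printf("fixed pattern ran entity %d\n", e);

	return 0;
}
```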