Message ID | 20220414090229.342-1-kuyo.chang@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1,v3] sched/pelt: Fix the attach_entity_load_avg calculate method |
I've taken the liberty of carrying over the tags from v2 and reworked
the Changelog a little.

---
Subject: sched/pelt: Fix attach_entity_load_avg() corner case
From: kuyo chang <kuyo.chang@mediatek.com>
Date: Thu, 14 Apr 2022 17:02:20 +0800

From: kuyo chang <kuyo.chang@mediatek.com>

The warning in cfs_rq_is_decayed() triggered:

    SCHED_WARN_ON(cfs_rq->avg.load_avg ||
                  cfs_rq->avg.util_avg ||
                  cfs_rq->avg.runnable_avg)

There exists a corner case in attach_entity_load_avg() which will cause
load_sum to be zero while load_avg will not be.

Consider se_weight is 88761, as per the sched_prio_to_weight[] table.
Further assume get_pelt_divider() returns 47742, which gives
se->avg.load_avg of 1.

However, calculating load_sum truncates to 0:

    se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
    se->avg.load_sum = 1 * 47742 / 88761 = 0.

Then enqueue_load_avg() adds this to the cfs_rq totals:

    cfs_rq->avg.load_avg += se->avg.load_avg;
    cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

Resulting in load_avg being 1 while load_sum is 0, which will trigger
the WARN.
Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se")
Signed-off-by: kuyo chang <kuyo.chang@mediatek.com>
[peterz: massage changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20220414090229.342-1-kuyo.chang@mediatek.com
---
 kernel/sched/fair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,11 +3829,11 @@ static void attach_entity_load_avg(struc
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
-		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
-	}
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum)
+		se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
+	else
+		se->avg.load_sum = 1;
 
 	enqueue_load_avg(cfs_rq, se);
 	cfs_rq->avg.util_avg += se->avg.util_avg;
On Thu, 2022-04-14 at 16:44 +0200, Peter Zijlstra wrote:
> I've taken the liberty of carrying over the tags from v2 and reworked
> the Changelog a little.

Thank you for all your assistance.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bd299d67ab..159274482c4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 
 	se->avg.runnable_sum = se->avg.runnable_avg * divider;
 
-	se->avg.load_sum = divider;
-	if (se_weight(se)) {
+	se->avg.load_sum = se->avg.load_avg * divider;
+	if (se_weight(se) < se->avg.load_sum) {
 		se->avg.load_sum =
-			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
+			div_u64(se->avg.load_sum, se_weight(se));
+	} else {
+		se->avg.load_sum = 1;
 	}
 
 	enqueue_load_avg(cfs_rq, se);