Message ID | 20220414015940.9537-1-kuyo.chang@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/1,v2] sched/pelt: Refine the enqueue_load_avg calculate method | expand |
On Thu, 14 Apr 2022 at 03:59, Kuyo Chang <kuyo.chang@mediatek.com> wrote: > > From: kuyo chang <kuyo.chang@mediatek.com> > > I meet the warning message at cfs_rq_is_decayed at below code. > > SCHED_WARN_ON(cfs_rq->avg.load_avg || > cfs_rq->avg.util_avg || > cfs_rq->avg.runnable_avg) > > Following is the calltrace. > > Call trace: > __update_blocked_fair > update_blocked_averages > newidle_balance > pick_next_task_fair > __schedule > schedule > pipe_read > vfs_read > ksys_read > > After code analyzing and some debug messages, I found it exits a corner > case at attach_entity_load_avg which will cause load_sum is null but > load_avg is not. > Consider se_weight is 88761 according by sched_prio_to_weight table. > And assume the get_pelt_divider() is 47742, se->avg.load_avg is 1. > By the calculating for se->avg.load_sum as following will become zero > as following. > se->avg.load_sum = > div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se)); > se->avg.load_sum = 1*47742/88761 = 0. > > After enqueue_load_avg code as below. > cfs_rq->avg.load_avg += se->avg.load_avg; > cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum; > > Then the load_sum for cfs_rq will be 1 while the load_sum for cfs_rq is 0. > So it will hit the warning message. > > In order to fix the corner case, make sure the se->load_avg|sum is correct > before enqueue_load_avg. > > After long time testing, the kernel warning was gone and the system runs > as well as before. This needs a fix tag: Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq and its se") > > Signed-off-by: kuyo chang <kuyo.chang@mediatek.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> > --- > > v1->v2: > > (1)Thanks for suggestion from Peter Zijlstra & Vincent Guittot. > (2)By suggestion from Vincent Guittot, > rework the se->load_sum calculation method for fix the corner case, > make sure the se->load_avg|sum is correct before enqueue_load_avg. > (3)Rework changlog. > > kernel/sched/fair.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index d4bd299d67ab..159274482c4e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s > > se->avg.runnable_sum = se->avg.runnable_avg * divider; > > - se->avg.load_sum = divider; > - if (se_weight(se)) { > + se->avg.load_sum = se->avg.load_avg * divider; > + if (se_weight(se) < se->avg.load_sum) { > se->avg.load_sum = > - div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se)); > + div_u64(se->avg.load_sum, se_weight(se)); > + } else { > + se->avg.load_sum = 1; > } > > enqueue_load_avg(cfs_rq, se); > -- > 2.18.0 >
On Thu, 2022-04-14 at 09:37 +0200, Vincent Guittot wrote: > On Thu, 14 Apr 2022 at 03:59, Kuyo Chang <kuyo.chang@mediatek.com> > wrote: > > > > From: kuyo chang <kuyo.chang@mediatek.com> > > > > I meet the warning message at cfs_rq_is_decayed at below code. > > > > SCHED_WARN_ON(cfs_rq->avg.load_avg || > > cfs_rq->avg.util_avg || > > cfs_rq->avg.runnable_avg) > > > > Following is the calltrace. > > > > Call trace: > > __update_blocked_fair > > update_blocked_averages > > newidle_balance > > pick_next_task_fair > > __schedule > > schedule > > pipe_read > > vfs_read > > ksys_read > > > > After code analyzing and some debug messages, I found it exits a > > corner > > case at attach_entity_load_avg which will cause load_sum is null > > but > > load_avg is not. > > Consider se_weight is 88761 according by sched_prio_to_weight > > table. > > And assume the get_pelt_divider() is 47742, se->avg.load_avg is 1. > > By the calculating for se->avg.load_sum as following will become > > zero > > as following. > > se->avg.load_sum = > > div_u64(se->avg.load_avg * se->avg.load_sum, > > se_weight(se)); > > se->avg.load_sum = 1*47742/88761 = 0. > > > > After enqueue_load_avg code as below. > > cfs_rq->avg.load_avg += se->avg.load_avg; > > cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum; > > > > Then the load_sum for cfs_rq will be 1 while the load_sum for > > cfs_rq is 0. > > So it will hit the warning message. > > > > In order to fix the corner case, make sure the se->load_avg|sum is > > correct > > before enqueue_load_avg. > > > > After long time testing, the kernel warning was gone and the system > > runs > > as well as before. > > This needs a fix tag: > Fixes: f207934fb79d ("sched/fair: Align PELT windows between cfs_rq > and its se") Thanks for your friendly reminder. > > > > Signed-off-by: kuyo chang <kuyo.chang@mediatek.com> > > Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> > > > --- > > > > v1->v2: > > > > (1)Thanks for suggestion from Peter Zijlstra & Vincent Guittot. > > (2)By suggestion from Vincent Guittot, > > rework the se->load_sum calculation method for fix the corner case, > > make sure the se->load_avg|sum is correct before enqueue_load_avg. > > (3)Rework changlog. > > > > kernel/sched/fair.c | 8 +++++--- > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index d4bd299d67ab..159274482c4e 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct > > cfs_rq *cfs_rq, struct sched_entity *s > > > > se->avg.runnable_sum = se->avg.runnable_avg * divider; > > > > - se->avg.load_sum = divider; > > - if (se_weight(se)) { > > + se->avg.load_sum = se->avg.load_avg * divider; > > + if (se_weight(se) < se->avg.load_sum) { > > se->avg.load_sum = > > - div_u64(se->avg.load_avg * se- > > >avg.load_sum, se_weight(se)); > > + div_u64(se->avg.load_sum, se_weight(se)); > > + } else { > > + se->avg.load_sum = 1; > > } > > > > enqueue_load_avg(cfs_rq, se); > > -- > > 2.18.0 > >
On 14/04/2022 03:59, Kuyo Chang wrote: > From: kuyo chang <kuyo.chang@mediatek.com> [...] > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index d4bd299d67ab..159274482c4e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s > > se->avg.runnable_sum = se->avg.runnable_avg * divider; > > - se->avg.load_sum = divider; > - if (se_weight(se)) { > + se->avg.load_sum = se->avg.load_avg * divider; > + if (se_weight(se) < se->avg.load_sum) { > se->avg.load_sum = > - div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se)); > + div_u64(se->avg.load_sum, se_weight(se)); Seems that this will fit on one line now. No braces needed then. > + } else { > + se->avg.load_sum = 1; > } > > enqueue_load_avg(cfs_rq, se); Looks like taskgroups are not affected since they get always online with cpu.shares/weight = 1024 (cgroup v1): cpu_cgroup_css_online() -> online_fair_sched_group() -> attach_entity_cfs_rq() -> attach_entity_load_avg() And reweight_entity() does not have this issue. Tested with `qemu-system-x86_64 ... cores=64 ... -enable-kvm` and weight=88761 for nice=0 tasks plus forcing se->avg.load_avg = 1 before the div_u64() in attach_entity_load_avg(). Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
On Thu, 2022-04-14 at 11:02 +0200, Dietmar Eggemann wrote: > On 14/04/2022 03:59, Kuyo Chang wrote: > > From: kuyo chang <kuyo.chang@mediatek.com> > > [...] > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index d4bd299d67ab..159274482c4e 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct > > cfs_rq *cfs_rq, struct sched_entity *s > > > > se->avg.runnable_sum = se->avg.runnable_avg * divider; > > > > - se->avg.load_sum = divider; > > - if (se_weight(se)) { > > + se->avg.load_sum = se->avg.load_avg * divider; > > + if (se_weight(se) < se->avg.load_sum) { > > se->avg.load_sum = > > - div_u64(se->avg.load_avg * se->avg.load_sum, > > se_weight(se)); > > + div_u64(se->avg.load_sum, se_weight(se)); > > Seems that this will fit on one line now. No braces needed then. Thanks for your friendly reminder. > > > + } else { > > + se->avg.load_sum = 1; > > } > > > > enqueue_load_avg(cfs_rq, se); > > Looks like taskgroups are not affected since they get always online > with cpu.shares/weight = 1024 (cgroup v1): > > cpu_cgroup_css_online() -> online_fair_sched_group() -> > attach_entity_cfs_rq() -> attach_entity_load_avg() > > And reweight_entity() does not have this issue. > > Tested with `qemu-system-x86_64 ... cores=64 ... -enable-kvm` and > weight=88761 for nice=0 tasks plus forcing se->avg.load_avg = 1 > before > the div_u64() in attach_entity_load_avg(). > > Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d4bd299d67ab..159274482c4e 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3829,10 +3829,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s se->avg.runnable_sum = se->avg.runnable_avg * divider; - se->avg.load_sum = divider; - if (se_weight(se)) { + se->avg.load_sum = se->avg.load_avg * divider; + if (se_weight(se) < se->avg.load_sum) { se->avg.load_sum = - div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se)); + div_u64(se->avg.load_sum, se_weight(se)); + } else { + se->avg.load_sum = 1; } enqueue_load_avg(cfs_rq, se);