From patchwork Tue Dec 4 04:46:32 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: preeti
X-Patchwork-Id: 1836011
Message-ID: <50BD8028.1040909@linux.vnet.ibm.com>
Date: Tue, 04 Dec 2012 10:16:32 +0530
From: Preeti U Murthy
To: paulmck@linux.vnet.ibm.com, vincent.guittot@linaro.org, a.p.zijlstra@chello.nl, viresh.kumar@linaro.org, linux-kernel@vger.kernel.org, amit.kucheria@linaro.org, Morten.Rasmussen@arm.com, paul.mckenney@linaro.org, akpm@linux-foundation.org, svaidy@linux.vnet.ibm.com, arjan@linux.intel.com, mingo@kernel.org, pjt@google.com
Cc: venki@google.com, robin.randhawa@arm.com, linaro-dev@lists.linaro.org, suresh.b.siddha@intel.com, deepthi@linux.vnet.ibm.com, mjg59@srcf.ucam.org, srivatsa.bhat@linux.vnet.ibm.com, Arvind.Chauhan@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC v2 PATCH 2.1] sched: Use Per-Entity-Load-Tracking metric for load balancing
References: <20121115164730.17426.36051.stgit@preeti.in.ibm.com> <20121115165422.17426.3978.stgit@preeti.in.ibm.com>
In-Reply-To: <20121115165422.17426.3978.stgit@preeti.in.ibm.com>

sched: Use Per-Entity-Load-Tracking metric for load balancing

From: Preeti U Murthy

Currently the load balancer weighs a task based on its priority, and this weight is added to the weight of the run queue that the task is on. The run queue weights in turn sum to a sched group's load, which is used to decide the busiest or the idlest group and the run queue within it.

The Per-Entity-Load-Tracking metric, however, measures how long a task has been runnable over the duration of its lifetime. This gives us a hint of the amount of CPU time that the task can demand. The metric takes the task priority into account as well, so apart from the priority of a task we also have an idea of its live behavior. This seems to be a more realistic metric for computing the task weight, which adds up to the run queue weight and the weight of the sched group. Consequently these weights can be used for load balancing.
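
As a rough illustration of the difference between the two weights, the sketch below models a task's tracked load as its priority weight scaled by the fraction of its lifetime it was runnable. It is a simplified user-space model, not the kernel code: the struct and function names (task_model, static_weight, tracked_load) are hypothetical, and the decay of older history that the real metric applies is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of a task; not a kernel structure. */
struct task_model {
	uint64_t weight;        /* priority-derived weight, e.g. 1024 for nice 0 */
	uint64_t runnable_time; /* time spent runnable, arbitrary units */
	uint64_t alive_time;    /* time since the task was created, same units */
};

/* Priority-only view: the task always counts with its full weight. */
uint64_t static_weight(const struct task_model *t)
{
	return t->weight;
}

/*
 * Tracked view: the weight is scaled by the runnable fraction.
 * The +1 in the divisor only guards against division by zero.
 */
uint64_t tracked_load(const struct task_model *t)
{
	return (t->weight * t->runnable_time) / (t->alive_time + 1);
}
```

With this model, a nice-0 task that was runnable for about half of its lifetime contributes about half of its static weight, whereas the priority-only view charges the full weight regardless of behavior.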
The semantics of load balancing are left untouched. The two functions load_balance() and select_task_rq_fair() perform the task of load balancing. Both paths have been walked through in this patch to make the necessary changes.

weighted_cpuload() and task_h_load() provide the run queue weight and the weight of the task respectively. They have been modified to provide the Per-Entity-Load-Tracking metric as relevant for each. The rest of the modifications had to be made to suit these two changes.

The Completely Fair Scheduler class is the only sched_class which contributes to the run queue load. Therefore rq->load.weight == cfs_rq->load.weight when the cfs_rq is the root cfs_rq (rq->cfs) of the hierarchy. When replacing this with the Per-Entity-Load-Tracking metric, cfs_rq->runnable_load_avg needs to be used, as this is the right reflection of the run queue load when the cfs_rq is the root cfs_rq (rq->cfs) of the hierarchy. This metric reflects the percentage uptime of the tasks that are queued on it and hence contribute to the load. Thus cfs_rq->runnable_load_avg replaces the metric earlier used in weighted_cpuload().

The task load is aptly captured by se.avg.load_avg_contrib, which captures the runnable time vs the alive time of the task against its priority. This metric replaces the earlier metric used in task_h_load().

The consequent changes appear as data type changes for the helper variables; they abound in number, because cfs_rq->runnable_load_avg needs to be big enough to capture the tasks' load often and accurately.

The following patch does not consider CONFIG_FAIR_GROUP_SCHED and CONFIG_SCHED_NUMA. This is done so as to evaluate this approach starting from the simplest scenario. Earlier discussions can be found in the link below.
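
The replacement of the two accessors described above can be sketched as follows. This is a minimal user-space model under stated assumptions, not the patch itself: the *_model structs are hypothetical stand-ins that carry only the fields named in the text, the _old/_new suffixes are added here for comparison, and the 64-bit return types are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures named in the text. */
struct load_weight_model {
	unsigned long weight;           /* priority-derived weight */
};

struct cfs_rq_model {
	struct load_weight_model load;
	uint64_t runnable_load_avg;     /* tracked load of the queued tasks */
};

struct rq_model {
	struct load_weight_model load;
	struct cfs_rq_model cfs;        /* root cfs_rq of the hierarchy */
};

struct sched_entity_model {
	struct load_weight_model load;
	uint64_t load_avg_contrib;      /* models se.avg.load_avg_contrib */
};

/* Before: run queue load is the sum of priority weights. */
unsigned long weighted_cpuload_old(const struct rq_model *rq)
{
	return rq->load.weight;
}

/* After: run queue load is the tracked load of the root cfs_rq. */
uint64_t weighted_cpuload_new(const struct rq_model *rq)
{
	return rq->cfs.runnable_load_avg;
}

/* Before: task load is its priority weight. */
unsigned long task_h_load_old(const struct sched_entity_model *se)
{
	return se->load.weight;
}

/* After: task load is its tracked contribution. */
uint64_t task_h_load_new(const struct sched_entity_model *se)
{
	return se->load_avg_contrib;
}
```

Callers keep the same shape; only the quantity behind each accessor changes, which is why the remaining modifications show up mostly as data type changes.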
Link: https://lkml.org/lkml/2012/10/25/162

Signed-off-by: Preeti U Murthy
---
I apologise for having overlooked this one change in the patchset. This needs to be applied on top of patch 2 of this patchset. The experiment results that have been posted in reply to this thread were obtained after applying this patch.

 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f8f3a29..19094eb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4362,7 +4362,7 @@ struct sd_lb_stats {
  * sg_lb_stats - stats of a sched_group required for load_balancing
  */
 struct sg_lb_stats {
-	unsigned long avg_load; /*Avg load across the CPUs of the group */
+	u64 avg_load; /*Avg load across the CPUs of the group */
 	u64 group_load; /* Total load over the CPUs of the group */
 	unsigned long sum_nr_running; /* Nr tasks running in the group */
 	u64 sum_weighted_load; /* Weighted load of group's tasks */
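
To see why the width of avg_load can matter, consider a target where unsigned long is 32 bits wide. The sketch below is a user-space illustration rather than kernel code: uint32_t stands in for such an unsigned long, and the helper names are hypothetical. It shows a 64-bit load value being silently truncated when stored in the narrower type. Whether loads this large occur in practice is not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Models storing a 64-bit load in a 32-bit "unsigned long" field. */
uint32_t store_narrow(uint64_t load)
{
	return (uint32_t)load; /* upper 32 bits are discarded */
}

/* Models storing the same load in a u64 field, as after the patch. */
uint64_t store_wide(uint64_t load)
{
	return load; /* value is preserved */
}
```

Keeping avg_load the same width as group_load avoids this kind of mismatch between a derived value and the sum it is computed from.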