From patchwork Thu Jul 27 12:05:53 2017
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9866697
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Cc: "Justin T. Weaver", George Dunlap, Anshul Makkar
Date: Thu, 27 Jul 2017 14:05:53 +0200
Message-ID: <150115715350.6767.2140393293186342043.stgit@Solace>
In-Reply-To: <150115657192.6767.15778617807307106582.stgit@Solace>
References: <150115657192.6767.15778617807307106582.stgit@Solace>
User-Agent: StGit/0.17.1-dirty
Subject: [Xen-devel] [PATCH v2 3/6] xen: credit2: soft-affinity awareness in csched2_cpu_pick()

We want to find the runqueue with the least average load and, to do
that, we scan through all the runqueues.

It is, therefore, enough that, during such a scan:
- we identify the runqueue with the least load among the ones that
  have pcpus that are part of the soft affinity of the vcpu we're
  calling pick on;
- we identify the same, but for hard affinity.

At this point, we can decide whether to go for the runqueue with the
least load among the ones with some soft affinity, or overall.

Therefore, at the price of some code reshuffling, we can avoid
scanning the runqueues twice (once per affinity balancing step), and
find both candidates in a single pass.

(Also, kill a spurious ';' in the definition of MAX_LOAD.)
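For illustration, here is a minimal, self-contained sketch of the
one-pass, two-minima scan. This is not the hypervisor code: runqueues,
loads and affinities are modelled with plain arrays and 64-bit masks,
and every name in it (pick_runq, NR_RQS, the example loads) is invented
for the example; the real code uses Xen's cpumask API, per-runqueue
locks and the per-cpu scratch mask set up by the caller.

/*
 * Sketch only: one scan, two minima -- the best runqueue overall
 * (within the hard affinity) and the best among the runqueues that
 * also contain soft-affinity cpus.
 */
#include <stdint.h>
#include <stdio.h>

#define NR_RQS 4

typedef uint64_t cpumask_t;         /* one bit per cpu is enough here */

static int pick_runq(const cpumask_t rq_active[NR_RQS],
                     const int64_t rq_avgload[NR_RQS],
                     cpumask_t hard, cpumask_t soft)
{
    int64_t min_avgload = INT64_MAX, min_s_avgload = INT64_MAX;
    int i, min_rqi = -1, min_s_rqi = -1;
    /* Soft affinity only matters where it overlaps the hard one. */
    int has_soft = (hard & soft) != 0;

    for ( i = 0; i < NR_RQS; i++ )
    {
        /* Skip runqueues with no cpu at all in our hard affinity. */
        if ( !(rq_active[i] & hard) )
            continue;

        /* Track the "soft-affinity minimum"... */
        if ( has_soft && rq_avgload[i] < min_s_avgload &&
             (rq_active[i] & hard & soft) )
        {
            min_s_avgload = rq_avgload[i];
            min_s_rqi = i;
        }
        /* ...and, in any case, the "hard-affinity minimum" too. */
        if ( rq_avgload[i] < min_avgload )
        {
            min_avgload = rq_avgload[i];
            min_rqi = i;
        }
    }

    /* Prefer the best runqueue with soft-affinity cpus, if any. */
    return (has_soft && min_s_rqi != -1) ? min_s_rqi : min_rqi;
}

int main(void)
{
    cpumask_t active[NR_RQS]  = { 0x03, 0x0c, 0x30, 0xc0 };
    int64_t   avgload[NR_RQS] = { 10, 5, 7, 2 };

    /* hard affinity: cpus 0-5; soft affinity: cpus 4-5 (runq 2) */
    printf("picked runq %d\n", pick_runq(active, avgload, 0x3f, 0x30));
    return 0;
}

This prints "picked runq 2": runq 3 is skipped despite being the least
loaded (none of its cpus is in the hard affinity), and the
soft-affinity candidate (runq 2, load 7) is preferred to the overall
minimum (runq 1, load 5), which is exactly the trade-off the patch
makes.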
Signed-off-by: Dario Faggioli
Signed-off-by: Justin T. Weaver
Reviewed-by: George Dunlap
---
Cc: Anshul Makkar
---
 xen/common/sched_credit2.c | 117 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 97 insertions(+), 20 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index aa8f169..8237a0a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1761,14 +1761,16 @@ csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
     vcpu_schedule_unlock_irq(lock, vc);
 }
 
-#define MAX_LOAD (STIME_MAX);
+#define MAX_LOAD (STIME_MAX)
 static int
 csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
 {
     struct csched2_private *prv = csched2_priv(ops);
-    int i, min_rqi = -1, new_cpu, cpu = vc->processor;
+    int i, min_rqi = -1, min_s_rqi = -1;
+    unsigned int new_cpu, cpu = vc->processor;
     struct csched2_vcpu *svc = csched2_vcpu(vc);
-    s_time_t min_avgload = MAX_LOAD;
+    s_time_t min_avgload = MAX_LOAD, min_s_avgload = MAX_LOAD;
+    bool has_soft;
 
     ASSERT(!cpumask_empty(&prv->active_queues));
 
@@ -1819,17 +1821,35 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         else if ( cpumask_intersects(cpumask_scratch_cpu(cpu),
                                      &svc->migrate_rqd->active) )
         {
+            /*
+             * If we've been asked to move to migrate_rqd, we should just do
+             * that, which we actually do by returning one cpu from that runq.
+             * There is no need to take care of soft affinity, as that will
+             * happen in runq_tickle().
+             */
             cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
                         &svc->migrate_rqd->active);
             new_cpu = cpumask_cycle(svc->migrate_rqd->pick_bias,
                                     cpumask_scratch_cpu(cpu));
+
             svc->migrate_rqd->pick_bias = new_cpu;
             goto out_up;
         }
         /* Fall-through to normal cpu pick */
     }
 
-    /* Find the runqueue with the lowest average load. */
+    /*
+     * What we want is:
+     *  - if we have soft affinity, the runqueue with the lowest average
+     *    load, among the ones that contain cpus in our soft affinity; this
+     *    represents the best runq on which we would want to run.
+     *  - the runqueue with the lowest average load, among the ones that
+     *    contain cpus in our hard affinity; this represents the best runq
+     *    on which we can run.
+     *
+     * Find both runqueues in one pass.
+     */
+    has_soft = has_soft_affinity(vc, vc->cpu_hard_affinity);
     for_each_cpu(i, &prv->active_queues)
     {
         struct csched2_runqueue_data *rqd;
@@ -1838,31 +1858,51 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         rqd = prv->rqd + i;
 
         /*
-         * If checking a different runqueue, grab the lock, check hard
-         * affinity, read the avg, and then release the lock.
+         * If none of the cpus of this runqueue is in svc's hard-affinity,
+         * skip the runqueue.
+         *
+         * Note that, in case svc's hard-affinity has changed, this is the
+         * first time we see such change, so it is indeed possible that we
+         * end up skipping svc's current runqueue.
+         */
+        if ( !cpumask_intersects(cpumask_scratch_cpu(cpu), &rqd->active) )
+            continue;
+
+        /*
+         * If checking a different runqueue, grab the lock, read the avg,
+         * and then release the lock.
          *
          * If on our own runqueue, don't grab or release the lock;
         * but subtract our own load from the runqueue load to simulate
         * impartiality.
-         *
-         * Note that, if svc's hard affinity has changed, this is the
-         * first time when we see such change, so it is indeed possible
-         * that none of the cpus in svc's current runqueue is in our
-         * (new) hard affinity!
         */
         if ( rqd == svc->rqd )
         {
-            if ( cpumask_intersects(cpumask_scratch_cpu(cpu), &rqd->active) )
-                rqd_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 0);
+            rqd_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 0);
         }
         else if ( spin_trylock(&rqd->lock) )
         {
-            if ( cpumask_intersects(cpumask_scratch_cpu(cpu), &rqd->active) )
-                rqd_avgload = rqd->b_avgload;
-
+            rqd_avgload = rqd->b_avgload;
             spin_unlock(&rqd->lock);
         }
 
+        /*
+         * If svc has a soft-affinity, and some cpus of rqd are part of it,
+         * see if we need to update the "soft-affinity minimum".
+         */
+        if ( has_soft &&
+             rqd_avgload < min_s_avgload )
+        {
+            cpumask_t mask;
+
+            cpumask_and(&mask, cpumask_scratch_cpu(cpu), &rqd->active);
+            if ( cpumask_intersects(&mask, svc->vcpu->cpu_soft_affinity) )
+            {
+                min_s_avgload = rqd_avgload;
+                min_s_rqi = i;
+            }
+        }
+        /* In any case, keep the "hard-affinity minimum" updated too. */
         if ( rqd_avgload < min_avgload )
         {
             min_avgload = rqd_avgload;
@@ -1870,17 +1910,54 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         }
     }
 
-    /* We didn't find anyone (most likely because of spinlock contention). */
-    if ( min_rqi == -1 )
+    if ( has_soft && min_s_rqi != -1 )
+    {
+        /*
+         * We have soft affinity, and we have a candidate runq, so go for it.
+         *
+         * Note that, to obtain the soft-affinity mask, we "just" put what we
+         * have in cpumask_scratch in && with vc->cpu_soft_affinity. This is
+         * ok because:
+         * - we know that vc->cpu_hard_affinity and vc->cpu_soft_affinity have
+         *   a non-empty intersection (because has_soft is true);
+         * - we have vc->cpu_hard_affinity & cpupool_domain_cpumask() already
+         *   in cpumask_scratch, so doing it like this saves a lot of work.
+         *
+         * It's kind of like open coding affinity_balance_cpumask() but, in
+         * this specific case, calling that would mean a lot of (unnecessary)
+         * cpumask operations.
+         */
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    vc->cpu_soft_affinity);
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    &prv->rqd[min_s_rqi].active);
+    }
+    else if ( min_rqi != -1 )
     {
+        /*
+         * Either we don't have soft-affinity, or we do, but we did not find
+         * any suitable runq. But we did find one when considering hard
+         * affinity, so go for it.
+         *
+         * cpumask_scratch already has vc->cpu_hard_affinity &
+         * cpupool_domain_cpumask() in it, so it's enough that we filter
+         * with the cpus of the runq.
+         */
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    &prv->rqd[min_rqi].active);
+    }
+    else
+    {
+        /*
+         * We didn't find anyone at all (most likely because of spinlock
+         * contention).
+         */
         new_cpu = get_fallback_cpu(svc);
         min_rqi = c2r(new_cpu);
         min_avgload = prv->rqd[min_rqi].b_avgload;
         goto out_up;
     }
 
-    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
-                &prv->rqd[min_rqi].active);
     new_cpu = cpumask_cycle(prv->rqd[min_rqi].pick_bias,
                             cpumask_scratch_cpu(cpu));
     prv->rqd[min_rqi].pick_bias = new_cpu;
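One more illustrative sketch (invented names again; a 64-bit integer
stands in for a cpumask), showing why the soft-affinity branch above
can get away with a single extra AND instead of calling
affinity_balance_cpumask(): cpumask_scratch already holds
hard affinity & cpupool cpus at that point, so one more AND with the
soft affinity builds the soft-balancing mask, and a last AND restricts
it to the cpus of the chosen runqueue.

#include <stdio.h>

typedef unsigned long long mask_t;  /* stand-in for a Xen cpumask */

int main(void)
{
    mask_t cpupool_cpus  = 0xff;    /* cpus 0-7 are in the cpupool  */
    mask_t hard_affinity = 0x3f;    /* hard affinity: cpus 0-5      */
    mask_t soft_affinity = 0x30;    /* soft affinity: cpus 4-5      */
    mask_t runq_active   = 0xf0;    /* chosen runqueue: cpus 4-7    */

    /* What the code has in cpumask_scratch before the branch. */
    mask_t scratch = hard_affinity & cpupool_cpus;

    scratch &= soft_affinity;  /* first AND: soft-affinity balancing  */
    scratch &= runq_active;    /* second AND: cpus of the chosen runq */

    printf("pick new_cpu from 0x%llx\n", scratch);  /* 0x30: cpus 4,5 */
    return 0;
}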