From patchwork Fri Jun 16 14:14:04 2017
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9791819
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Date: Fri, 16 Jun 2017 16:14:04 +0200
Message-ID: <149762244440.11899.3927310982261940597.stgit@Solace.fritz.box>
In-Reply-To: <149762114626.11899.6393770850121347748.stgit@Solace.fritz.box>
References: <149762114626.11899.6393770850121347748.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Cc: Anshul Makkar, "Justin T. Weaver", George Dunlap
Subject: [Xen-devel] [PATCH 4/7] xen: credit2: soft-affinity awareness in csched2_cpu_pick()

We want to find the runqueue with the least average load, and to do
that, we scan through all the runqueues.

It is, therefore, enough that, during such a scan:
- we identify the runqueue with the least load, among the ones that
  have pcpus that are part of the soft affinity of the vcpu we're
  calling pick on;
- we identify the same, but for hard affinity.

At this point, we can decide whether to go for the runqueue with the
least load among the ones with some soft-affinity, or overall.

Therefore, at the price of some code reshuffling, we can avoid
scanning the runqueues twice (once per kind of affinity), and do
everything in a single pass.

(Also, kill a spurious ';' in the definition of MAX_LOAD.)

Signed-off-by: Dario Faggioli
Signed-off-by: Justin T. Weaver
Reviewed-by: George Dunlap
---
Cc: George Dunlap
Cc: Anshul Makkar
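As a side note for reviewers, the core of the change is turning the scan
into a single pass that tracks two minima at once. Below is a minimal,
self-contained sketch of the idea; it is plain C, not the actual Xen
code: pick_runq(), its parameters and the load/in_soft arrays are made
up for illustration, and runqueues outside hard affinity are assumed to
have been filtered out already, as the patch does with its 'continue'.

#include <limits.h>
#include <stdbool.h>

/*
 * Toy model of the single-pass scan: track, at the same time, the
 * least loaded runqueue overall (i.e., among the hard-affinity ones,
 * which is all of them here) and the least loaded runqueue among the
 * ones with cpus in the vcpu's soft affinity.
 */
int pick_runq(const long *load, const bool *in_soft,
              int nr_rqs, bool has_soft)
{
    long min_load = LONG_MAX, min_s_load = LONG_MAX;
    int i, min_rqi = -1, min_s_rqi = -1;

    for ( i = 0; i < nr_rqs; i++ )
    {
        /* "Soft-affinity minimum": only runqs intersecting soft affinity. */
        if ( has_soft && in_soft[i] && load[i] < min_s_load )
        {
            min_s_load = load[i];
            min_s_rqi = i;
        }
        /* "Hard-affinity minimum": kept updated in any case. */
        if ( load[i] < min_load )
        {
            min_load = load[i];
            min_rqi = i;
        }
    }

    /* Prefer the least loaded runq with some soft affinity, if any. */
    return (has_soft && min_s_rqi != -1) ? min_s_rqi : min_rqi;
}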
---
 xen/common/sched_credit2.c | 117 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 97 insertions(+), 20 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 54f6e21..fb97ff7 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1725,14 +1725,16 @@ csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
     vcpu_schedule_unlock_irq(lock, vc);
 }
 
-#define MAX_LOAD (STIME_MAX);
+#define MAX_LOAD (STIME_MAX)
 static int
 csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
 {
     struct csched2_private *prv = csched2_priv(ops);
-    int i, min_rqi = -1, new_cpu, cpu = vc->processor;
+    int i, min_rqi = -1, min_s_rqi = -1;
+    unsigned int new_cpu, cpu = vc->processor;
     struct csched2_vcpu *svc = csched2_vcpu(vc);
-    s_time_t min_avgload = MAX_LOAD;
+    s_time_t min_avgload = MAX_LOAD, min_s_avgload = MAX_LOAD;
+    bool has_soft;
 
     ASSERT(!cpumask_empty(&prv->active_queues));
 
@@ -1781,17 +1783,35 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         else if ( cpumask_intersects(cpumask_scratch_cpu(cpu),
                                      &svc->migrate_rqd->active) )
         {
+            /*
+             * If we've been asked to move to migrate_rqd, we should just do
+             * that, which we actually do by returning one cpu from that runq.
+             * There is no need to take care of soft affinity, as that will
+             * happen in runq_tickle().
+             */
             cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
                         &svc->migrate_rqd->active);
             new_cpu = cpumask_cycle(svc->migrate_rqd->pick_bias,
                                     cpumask_scratch_cpu(cpu));
+
             svc->migrate_rqd->pick_bias = new_cpu;
             goto out_up;
         }
         /* Fall-through to normal cpu pick */
     }
 
-    /* Find the runqueue with the lowest average load. */
+    /*
+     * What we want is:
+     *  - if we have soft affinity, the runqueue with the lowest average
+     *    load, among the ones that contain cpus in our soft affinity; this
+     *    represents the best runq on which we would want to run.
+     *  - the runqueue with the lowest average load among the ones that
+     *    contain cpus in our hard affinity; this represents the best runq
+     *    on which we can run.
+     *
+     * Find both runqueues in one pass.
+     */
+    has_soft = has_soft_affinity(vc, vc->cpu_hard_affinity);
     for_each_cpu(i, &prv->active_queues)
     {
         struct csched2_runqueue_data *rqd;
@@ -1800,31 +1820,51 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         rqd = prv->rqd + i;
 
         /*
-         * If checking a different runqueue, grab the lock, check hard
-         * affinity, read the avg, and then release the lock.
+         * If none of the cpus of this runqueue is in svc's hard-affinity,
+         * skip the runqueue.
+         *
+         * Note that, in case svc's hard-affinity has changed, this is the
+         * first time when we see such change, so it is indeed possible
+         * that we end up skipping svc's current runqueue.
+         */
+        if ( !cpumask_intersects(cpumask_scratch_cpu(cpu), &rqd->active) )
+            continue;
+
+        /*
+         * If checking a different runqueue, grab the lock, read the avg,
+         * and then release the lock.
          *
          * If on our own runqueue, don't grab or release the lock;
          * but subtract our own load from the runqueue load to simulate
         * impartiality.
-         *
-         * Note that, if svc's hard affinity has changed, this is the
-         * first time when we see such change, so it is indeed possible
-         * that none of the cpus in svc's current runqueue is in our
-         * (new) hard affinity!
          */
         if ( rqd == svc->rqd )
         {
-            if ( cpumask_intersects(cpumask_scratch_cpu(cpu), &rqd->active) )
-                rqd_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 0);
+            rqd_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 0);
         }
         else if ( spin_trylock(&rqd->lock) )
         {
-            if ( cpumask_intersects(cpumask_scratch_cpu(cpu), &rqd->active) )
-                rqd_avgload = rqd->b_avgload;
-
+            rqd_avgload = rqd->b_avgload;
             spin_unlock(&rqd->lock);
         }
 
+        /*
+         * If svc has a soft-affinity, and some cpus of rqd are part of it,
+         * see if we need to update the "soft-affinity minimum".
+         */
+        if ( has_soft &&
+             rqd_avgload < min_s_avgload )
+        {
+            cpumask_t mask;
+
+            cpumask_and(&mask, cpumask_scratch_cpu(cpu), &rqd->active);
+            if ( cpumask_intersects(&mask, svc->vcpu->cpu_soft_affinity) )
+            {
+                min_s_avgload = rqd_avgload;
+                min_s_rqi = i;
+            }
+        }
+        /* In any case, keep the "hard-affinity minimum" updated too. */
         if ( rqd_avgload < min_avgload )
         {
             min_avgload = rqd_avgload;
@@ -1832,17 +1872,54 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         }
     }
 
-    /* We didn't find anyone (most likely because of spinlock contention). */
-    if ( min_rqi == -1 )
+    if ( has_soft && min_s_rqi != -1 )
+    {
+        /*
+         * We have soft affinity, and we have a candidate runq, so go for it.
+         *
+         * Note that, to obtain the soft-affinity mask, we "just" put what we
+         * have in cpumask_scratch in && with vc->cpu_soft_affinity. This is
+         * ok because:
+         * - we know that vc->cpu_hard_affinity and vc->cpu_soft_affinity have
+         *   a non-empty intersection (because has_soft is true);
+         * - we have vc->cpu_hard_affinity & cpupool_domain_cpumask() already
+         *   in cpumask_scratch, so we save a lot of work doing it like this.
+         *
+         * It's kind of like open coding affinity_balance_cpumask() but, in
+         * this specific case, calling that would mean a lot of (unnecessary)
+         * cpumask operations.
+         */
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    vc->cpu_soft_affinity);
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    &prv->rqd[min_s_rqi].active);
+    }
+    else if ( min_rqi != -1 )
     {
+        /*
+         * Either we don't have soft-affinity, or we do, but we did not find
+         * any suitable runq. But we did find one when considering hard
+         * affinity, so go for it.
+         *
+         * cpumask_scratch already has vc->cpu_hard_affinity &
+         * cpupool_domain_cpumask() in it, so it's enough that we filter
+         * with the cpus of the runq.
+         */
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    &prv->rqd[min_rqi].active);
+    }
+    else
+    {
+        /*
+         * We didn't find anyone at all (most likely because of spinlock
+         * contention).
+         */
         new_cpu = get_fallback_cpu(svc);
         min_rqi = c2r(ops, new_cpu);
         min_avgload = prv->rqd[min_rqi].b_avgload;
         goto out_up;
     }
 
-    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
-                &prv->rqd[min_rqi].active);
     new_cpu = cpumask_cycle(prv->rqd[min_rqi].pick_bias,
                             cpumask_scratch_cpu(cpu));
     prv->rqd[min_rqi].pick_bias = new_cpu;
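As an aside, the cpumask narrowing in the last hunk may be easier to
follow through a toy model. What follows is plain C with made-up mask
values, not Xen code: 'scratch' plays the role of cpumask_scratch_cpu(cpu),
which enters this code already holding
vc->cpu_hard_affinity & cpupool_domain_cpumask().

#include <stdio.h>

int main(void)
{
    unsigned long hard = 0x0f;      /* stands in for vc->cpu_hard_affinity  */
    unsigned long pool = 0xff;      /* .. for cpupool_domain_cpumask()      */
    unsigned long soft = 0x05;      /* .. for vc->cpu_soft_affinity         */
    unsigned long rq_active = 0x03; /* .. for prv->rqd[min_s_rqi].active    */

    unsigned long scratch = hard & pool; /* set up before the scan          */
    scratch &= soft;                     /* keep only soft-affinity cpus    */
    scratch &= rq_active;                /* keep only cpus of the chosen runq */

    printf("pick among cpus in mask %#lx\n", scratch); /* prints 0x1 */
    return 0;
}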