From patchwork Thu Jun 29 12:56:31 2017
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9816711
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Date: Thu, 29 Jun 2017 14:56:31 +0200
Message-ID: <149874099124.524.12472246724522761078.stgit@Solace>
In-Reply-To: <149874017405.524.14075439009139766753.stgit@Solace>
References: <149874017405.524.14075439009139766753.stgit@Solace>
User-Agent: StGit/0.17.1-dirty
Cc: George Dunlap
Subject: [Xen-devel] [PATCH 3/5] xen: sched-null: support soft-affinity

The null scheduler does not really use hard-affinity for scheduling: it
uses it for 'placement', i.e., for deciding to which pCPU to statically
assign a vCPU. Let's use soft-affinity in the same way, with the
difference that, if there is no free pCPU within the vCPU's
soft-affinity, we go on to check the hard-affinity, instead of putting
the vCPU in the waitqueue.

This has no impact on the scheduling overhead, because soft-affinity is
only considered in cold paths (such as when a vCPU joins the scheduler
for the first time, or is manually moved between pCPUs by the user).
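To make the placement policy above concrete, here is a minimal, standalone
C model of it (an editorial sketch, not Xen code: the bitmask representation
and all names, e.g. pick_cpu_model(), are invented for illustration; the
real logic lives in pick_cpu() in xen/common/sched_null.c):

    /*
     * Standalone model: pCPUs are bits in a uint64_t, so the set of free
     * pCPUs and a vCPU's hard- and soft-affinity are plain bitmasks.
     */
    #include <stdint.h>
    #include <stdio.h>

    enum balance_step { BALANCE_SOFT_AFFINITY, BALANCE_HARD_AFFINITY };

    /* Return the first free pCPU, preferring soft-affinity; -1 = waitqueue. */
    static int pick_cpu_model(uint64_t free, uint64_t hard, uint64_t soft)
    {
        for ( int bs = BALANCE_SOFT_AFFINITY; bs <= BALANCE_HARD_AFFINITY; bs++ )
        {
            uint64_t mask = (bs == BALANCE_SOFT_AFFINITY) ? (soft & hard) : hard;

            /* Skip the soft step if it would not restrict anything. */
            if ( bs == BALANCE_SOFT_AFFINITY && (mask == 0 || mask == hard) )
                continue;

            uint64_t candidates = mask & free;
            if ( candidates != 0 )
                return __builtin_ctzll(candidates); /* lowest set bit = pCPU id */
        }
        return -1; /* nothing free within hard-affinity: park in the waitqueue */
    }

    int main(void)
    {
        /* pCPUs 2 and 3 free; hard-affinity {0-3}; soft-affinity {0,1}, busy. */
        printf("assigned pCPU: %d\n", pick_cpu_model(0xCu, 0xFu, 0x3u));
        return 0;
    }

Built with gcc, this prints "assigned pCPU: 2": both soft-affinity pCPUs are
busy, so placement falls back to the first free pCPU in the hard-affinity
mask rather than parking the vCPU in the waitqueue, which is exactly the
behaviour described above.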
Signed-off-by: Dario Faggioli
Reviewed-by: George Dunlap
---
Cc: George Dunlap
---
 xen/common/sched_null.c | 110 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 77 insertions(+), 33 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 610a150..19c7f0f 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -115,9 +115,11 @@ static inline struct null_dom *null_dom(const struct domain *d)
     return d->sched_priv;
 }
 
-static inline bool vcpu_check_affinity(struct vcpu *v, unsigned int cpu)
+static inline bool vcpu_check_affinity(struct vcpu *v, unsigned int cpu,
+                                       unsigned int balance_step)
 {
-    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+    affinity_balance_cpumask(v, balance_step, cpumask_scratch_cpu(cpu));
+    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
                 cpupool_domain_cpumask(v->domain));
 
     return cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu));
@@ -279,31 +281,40 @@ static void null_dom_destroy(const struct scheduler *ops, struct domain *d)
  */
 static unsigned int pick_cpu(struct null_private *prv, struct vcpu *v)
 {
+    unsigned int bs;
     unsigned int cpu = v->processor, new_cpu;
     cpumask_t *cpus = cpupool_domain_cpumask(v->domain);
 
     ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
 
-    cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity, cpus);
+    for_each_affinity_balance_step( bs )
+    {
+        if ( bs == BALANCE_SOFT_AFFINITY &&
+             !has_soft_affinity(v, v->cpu_hard_affinity) )
+            continue;
 
-    /*
-     * If our processor is free, or we are assigned to it, and it is also
-     * still valid and part of our affinity, just go for it.
-     * (Note that we may call vcpu_check_affinity(), but we deliberately
-     * don't, so we get to keep in the scratch cpumask what we have just
-     * put in it.)
-     */
-    if ( likely((per_cpu(npc, cpu).vcpu == NULL || per_cpu(npc, cpu).vcpu == v)
-                && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu))) )
-        return cpu;
+        affinity_balance_cpumask(v, bs, cpumask_scratch_cpu(cpu));
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu), cpus);
 
-    /* If not, just go for a free pCPU, within our affinity, if any */
-    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
-                &prv->cpus_free);
-    new_cpu = cpumask_first(cpumask_scratch_cpu(cpu));
+        /*
+         * If our processor is free, or we are assigned to it, and it is also
+         * still valid and part of our affinity, just go for it.
+         * (Note that we may call vcpu_check_affinity(), but we deliberately
+         * don't, so we get to keep in the scratch cpumask what we have just
+         * put in it.)
+         */
+        if ( likely((per_cpu(npc, cpu).vcpu == NULL || per_cpu(npc, cpu).vcpu == v)
+                    && cpumask_test_cpu(cpu, cpumask_scratch_cpu(cpu))) )
+            return cpu;
 
-    if ( likely(new_cpu != nr_cpu_ids) )
-        return new_cpu;
+        /* If not, just go for a free pCPU, within our affinity, if any */
+        cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                    &prv->cpus_free);
+        new_cpu = cpumask_first(cpumask_scratch_cpu(cpu));
+
+        if ( likely(new_cpu != nr_cpu_ids) )
+            return new_cpu;
+    }
 
     /*
      * If we didn't find any free pCPU, just pick any valid pcpu, even if
@@ -430,6 +441,7 @@ static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
 
 static void _vcpu_remove(struct null_private *prv, struct vcpu *v)
 {
+    unsigned int bs;
     unsigned int cpu = v->processor;
     struct null_vcpu *wvc;
 
@@ -441,19 +453,27 @@ static void _vcpu_remove(struct null_private *prv, struct vcpu *v)
 
     /*
      * If v is assigned to a pCPU, let's see if there is someone waiting,
-     * suitable to be assigned to it.
+     * suitable to be assigned to it (prioritizing vcpus that have
+     * soft-affinity with cpu).
      */
-    list_for_each_entry( wvc, &prv->waitq, waitq_elem )
+    for_each_affinity_balance_step( bs )
     {
-        if ( vcpu_check_affinity(wvc->vcpu, cpu) )
+        list_for_each_entry( wvc, &prv->waitq, waitq_elem )
         {
-            list_del_init(&wvc->waitq_elem);
-            vcpu_assign(prv, wvc->vcpu, cpu);
-            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
-            break;
+            if ( bs == BALANCE_SOFT_AFFINITY &&
+                 !has_soft_affinity(wvc->vcpu, wvc->vcpu->cpu_hard_affinity) )
+                continue;
+
+            if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
+            {
+                list_del_init(&wvc->waitq_elem);
+                vcpu_assign(prv, wvc->vcpu, cpu);
+                cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+                spin_unlock(&prv->waitq_lock);
+                return;
+            }
         }
    }
-
     spin_unlock(&prv->waitq_lock);
 }
 
@@ -570,7 +590,8 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
      *
      * In latter, all we can do is to park v in the waitqueue.
      */
-    if ( per_cpu(npc, new_cpu).vcpu == NULL && vcpu_check_affinity(v, new_cpu) )
+    if ( per_cpu(npc, new_cpu).vcpu == NULL &&
+         vcpu_check_affinity(v, new_cpu, BALANCE_HARD_AFFINITY) )
     {
         /* v might have been in the waitqueue, so remove it */
         spin_lock(&prv->waitq_lock);
@@ -633,6 +654,7 @@ static struct task_slice null_schedule(const struct scheduler *ops,
                                        s_time_t now,
                                        bool_t tasklet_work_scheduled)
 {
+    unsigned int bs;
     const unsigned int cpu = smp_processor_id();
     struct null_private *prv = null_priv(ops);
     struct null_vcpu *wvc;
@@ -656,13 +678,35 @@ static struct task_slice null_schedule(const struct scheduler *ops,
     if ( unlikely(ret.task == NULL) )
     {
         spin_lock(&prv->waitq_lock);
-        wvc = list_first_entry_or_null(&prv->waitq, struct null_vcpu, waitq_elem);
-        if ( wvc && vcpu_check_affinity(wvc->vcpu, cpu) )
+
+        if ( list_empty(&prv->waitq) )
+            goto unlock;
+
+        /*
+         * We scan the waitqueue twice, for prioritizing vcpus that have
+         * soft-affinity with cpu. This may look like something expensive to
+         * do here in null_schedule(), but it's actually fine, because we do
+         * it only in cases where a pcpu has no vcpu associated (e.g., as
+         * said above, the cpu has just joined a cpupool).
+         */
+        for_each_affinity_balance_step( bs )
         {
-            vcpu_assign(prv, wvc->vcpu, cpu);
-            list_del_init(&wvc->waitq_elem);
-            ret.task = wvc->vcpu;
+            list_for_each_entry( wvc, &prv->waitq, waitq_elem )
+            {
+                if ( bs == BALANCE_SOFT_AFFINITY &&
+                     !has_soft_affinity(wvc->vcpu, wvc->vcpu->cpu_hard_affinity) )
+                    continue;
+
+                if ( vcpu_check_affinity(wvc->vcpu, cpu, bs) )
+                {
+                    vcpu_assign(prv, wvc->vcpu, cpu);
+                    list_del_init(&wvc->waitq_elem);
+                    ret.task = wvc->vcpu;
+                    goto unlock;
+                }
+            }
         }
+ unlock:
         spin_unlock(&prv->waitq_lock);
     }
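As a usage note (an editorial addition, not part of the patch): a vCPU's
soft-affinity can be set at runtime with xl vcpu-pin, whose optional fourth
argument is the soft-affinity mask, and this is the easiest way to exercise
the behaviour added here. The domain name below is a placeholder:

    # vCPU 0 of domain "guest": hard-affinity pCPUs 0-3, soft-affinity pCPUs 0-1.
    # Under the null scheduler, placement now tries a free pCPU among 0-1 first,
    # then falls back to any free pCPU among 0-3, before using the waitqueue.
    xl vcpu-pin guest 0 0-3 0-1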