From patchwork Fri Sep 27 07:00:43 2019
X-Patchwork-Submitter: Jürgen Groß
X-Patchwork-Id: 11163979
From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
    Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
    Dario Faggioli, Julien Grall, Jan Beulich
Date: Fri, 27 Sep 2019 09:00:43 +0200
Message-Id: <20190927070050.12405-40-jgross@suse.com>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20190927070050.12405-1-jgross@suse.com>
References: <20190927070050.12405-1-jgross@suse.com>
Subject: [Xen-devel] [PATCH v4 39/46] xen/sched: prepare per-cpupool scheduling granularity

On- and offlining cpus with core scheduling is rather complicated, as the
cpus are taken on- or offline one by one, but scheduling wants to handle
them per core.

As the future plan is to be able to select the scheduling granularity per
cpupool, prepare for that by storing the granularity in struct cpupool and
struct sched_resource (the latter is needed for free cpus, which are not
associated with any cpupool). Free cpus will always use granularity 1.

Store the selected granularity option (cpu, core or socket) in the cpupool
as well, as we will need it to select the appropriate cpu mask when
populating the cpupool with cpus.

This will make on- and offlining of cpus much easier and avoids writing
code which would need to be thrown away later.

Move the granularity related variables to cpupool.c, as they are now used
from there only.
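As an illustration of the intended lookup semantics, here is a minimal,
self-contained C sketch distilled from the cpupool.c and sched-if.h hunks
below (simplified: the __read_mostly annotation and the other cpupool
members are omitted; it is not an additional change to the patch):

```c
/* Granularity options selectable per cpupool (from the sched-if.h hunk). */
enum sched_gran {
    SCHED_GRAN_cpu,
    SCHED_GRAN_core,
    SCHED_GRAN_socket
};

struct cpupool {
    enum sched_gran gran;   /* granularity option selected for this pool */
    /* ... remaining members omitted in this sketch ... */
};

/* Number of vcpus per struct sched_unit, still a single boot-time value. */
static unsigned int sched_granularity = 1;

/*
 * Free cpus (c == NULL) are not associated with any cpupool and therefore
 * always use granularity 1; cpus in a pool use the boot-selected value.
 */
unsigned int cpupool_get_granularity(const struct cpupool *c)
{
    return c ? sched_granularity : 1;
}
```

Note that at this stage the numeric granularity is still a single global
value; c->gran only records the selected option (cpu, core or socket), so
that later patches can derive the per-pool cpu mask and unit size from it.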
Signed-off-by: Juergen Gross
Reviewed-by: Dario Faggioli
---
V1: new patch
V4:
- move opt_sched_granularity and sched_granularity to cpupool.c
  (Jan Beulich)
- rename c->opt_sched_granularity, drop c->granularity (Jan Beulich)
---
 xen/common/cpupool.c       |  9 +++++++++
 xen/common/schedule.c      | 27 ++++++++++++++++-----------
 xen/include/xen/sched-if.h | 11 +++++++++++
 3 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index 60a85f50e1..51f0ff0d88 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -34,6 +34,14 @@ static cpumask_t cpupool_locked_cpus;
 
 static DEFINE_SPINLOCK(cpupool_lock);
 
+static enum sched_gran __read_mostly opt_sched_granularity = SCHED_GRAN_cpu;
+static unsigned int __read_mostly sched_granularity = 1;
+
+unsigned int cpupool_get_granularity(const struct cpupool *c)
+{
+    return c ? sched_granularity : 1;
+}
+
 static void free_cpupool_struct(struct cpupool *c)
 {
     if ( c )
@@ -173,6 +181,7 @@ static struct cpupool *cpupool_create(
             return NULL;
         }
     }
+    c->gran = opt_sched_granularity;
 
     *q = c;
 
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 7017c45a61..63ffe1a824 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -60,7 +60,6 @@ int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
 integer_param("sched_ratelimit_us", sched_ratelimit_us);
 
 /* Number of vcpus per struct sched_unit. */
-static unsigned int __read_mostly sched_granularity = 1;
 bool __read_mostly sched_disable_smt_switching;
 
 const cpumask_t *sched_res_mask = &cpumask_all;
@@ -426,10 +425,10 @@ static struct sched_unit *sched_alloc_unit(struct vcpu *v)
 {
     struct sched_unit *unit, **prev_unit;
     struct domain *d = v->domain;
+    unsigned int gran = cpupool_get_granularity(d->cpupool);
 
     for_each_sched_unit ( d, unit )
-        if ( unit->unit_id / sched_granularity ==
-             v->vcpu_id / sched_granularity )
+        if ( unit->unit_id / gran == v->vcpu_id / gran )
             break;
 
     if ( unit )
@@ -584,6 +583,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     void *unitdata;
     struct scheduler *old_ops;
     void *old_domdata;
+    unsigned int gran = cpupool_get_granularity(c);
 
     for_each_vcpu ( d, v )
     {
@@ -595,8 +595,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     if ( IS_ERR(domdata) )
         return PTR_ERR(domdata);
 
-    unit_priv = xzalloc_array(void *,
-                              DIV_ROUND_UP(d->max_vcpus, sched_granularity));
+    unit_priv = xzalloc_array(void *, DIV_ROUND_UP(d->max_vcpus, gran));
     if ( unit_priv == NULL )
     {
         sched_free_domdata(c->sched, domdata);
@@ -1835,11 +1834,11 @@ static void sched_switch_units(struct sched_resource *sr,
         if ( is_idle_unit(prev) )
         {
             prev->runstate_cnt[RUNSTATE_running] = 0;
-            prev->runstate_cnt[RUNSTATE_runnable] = sched_granularity;
+            prev->runstate_cnt[RUNSTATE_runnable] = sr->granularity;
         }
         if ( is_idle_unit(next) )
        {
-            next->runstate_cnt[RUNSTATE_running] = sched_granularity;
+            next->runstate_cnt[RUNSTATE_running] = sr->granularity;
             next->runstate_cnt[RUNSTATE_runnable] = 0;
         }
     }
@@ -1988,7 +1987,7 @@ void sched_context_switched(struct vcpu *vprev, struct vcpu *vnext)
     else
     {
         vcpu_context_saved(vprev, vnext);
-        if ( sched_granularity == 1 )
+        if ( sr->granularity == 1 )
             unit_context_saved(sr);
     }
 
@@ -2108,11 +2107,12 @@ static struct sched_unit *sched_wait_rendezvous_in(struct sched_unit *prev,
 {
     struct sched_unit *next;
     struct vcpu *v;
+    unsigned int gran = get_sched_res(cpu)->granularity;
 
     if ( !--prev->rendezvous_in_cnt )
     {
         next = do_schedule(prev, now, cpu);
-        atomic_set(&next->rendezvous_out_cnt, sched_granularity + 1);
+        atomic_set(&next->rendezvous_out_cnt, gran + 1);
         return next;
     }
 
@@ -2236,6 +2236,7 @@ static void schedule(void)
     struct sched_resource *sr;
     spinlock_t *lock;
     int cpu = smp_processor_id();
+    unsigned int gran = get_sched_res(cpu)->granularity;
 
     ASSERT_NOT_IN_ATOMIC();
 
@@ -2261,11 +2262,11 @@ static void schedule(void)
 
     now = NOW();
 
-    if ( sched_granularity > 1 )
+    if ( gran > 1 )
     {
         cpumask_t mask;
 
-        prev->rendezvous_in_cnt = sched_granularity;
+        prev->rendezvous_in_cnt = gran;
         cpumask_andnot(&mask, sr->cpus, cpumask_of(cpu));
         cpumask_raise_softirq(&mask, SCHED_SLAVE_SOFTIRQ);
         next = sched_wait_rendezvous_in(prev, &lock, cpu, now);
@@ -2333,6 +2334,9 @@ static int cpu_schedule_up(unsigned int cpu)
     init_timer(&sr->s_timer, s_timer_fn, NULL, cpu);
     atomic_set(&per_cpu(sched_urgent_count, cpu), 0);
 
+    /* We start with cpu granularity. */
+    sr->granularity = 1;
+
     /* Boot CPU is dealt with later in scheduler_init(). */
     if ( cpu == 0 )
         return 0;
@@ -2623,6 +2627,7 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
     sched_free_udata(old_ops, vpriv_old);
     sched_free_pdata(old_ops, ppriv_old, cpu);
 
+    get_sched_res(cpu)->granularity = cpupool_get_granularity(c);
     get_sched_res(cpu)->cpupool = c;
     /* When a cpu is added to a pool, trigger it to go pick up some work */
     if ( c != NULL )
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index e675061290..f8f0f484cb 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -25,6 +25,13 @@ extern int sched_ratelimit_us;
 /* Scheduling resource mask. */
 extern const cpumask_t *sched_res_mask;
 
+/* Number of vcpus per struct sched_unit. */
+enum sched_gran {
+    SCHED_GRAN_cpu,
+    SCHED_GRAN_core,
+    SCHED_GRAN_socket
+};
+
 /*
  * In order to allow a scheduler to remap the lock->cpu mapping,
  * we have a per-cpu pointer, along with a pre-allocated set of
@@ -48,6 +55,7 @@ struct sched_resource {
 
     /* Cpu with lowest id in scheduling resource. */
     unsigned int master_cpu;
+    unsigned int granularity;
     const cpumask_t *cpus; /* cpus covered by this struct */
 };
 
@@ -546,6 +554,7 @@ struct cpupool
     struct cpupool *next;
     struct scheduler *sched;
     atomic_t refcnt;
+    enum sched_gran gran;
 };
 
 #define cpupool_online_cpumask(_pool) \
@@ -561,6 +570,8 @@ static inline cpumask_t *cpupool_domain_master_cpumask(const struct domain *d)
     return d->cpupool->res_valid;
 }
 
+unsigned int cpupool_get_granularity(const struct cpupool *c);
+
 /*
  * Hard and soft affinity load balancing.
  *