@@ -2137,7 +2137,7 @@ void vcpu_kick(struct vcpu *v)
* NB2. We save the running flag across the unblock to avoid a needless
* IPI for domains that we IPI'd to unblock.
*/
- bool running = v->is_running;
+ bool running = vcpu_running(v);
vcpu_unblock(v);
if ( running && (in_irq() || (v != current)) )
@@ -23,6 +23,7 @@
#include <xen/lib.h>
#include <xen/trace.h>
#include <xen/sched.h>
+#include <xen/sched-if.h>
#include <xen/irq.h>
#include <xen/softirq.h>
#include <xen/domain.h>
@@ -3984,7 +3985,7 @@ bool hvm_flush_vcpu_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
/* Now that all VCPUs are signalled to deschedule, we wait... */
for_each_vcpu ( d, v )
if ( v != current && flush_vcpu(ctxt, v) )
- while ( !vcpu_runnable(v) && v->is_running )
+ while ( !vcpu_runnable(v) && vcpu_running(v) )
cpu_relax();
/* All other vcpus are paused, safe to unlock now. */
@@ -6,6 +6,7 @@
*/
#include <xen/sched.h>
+#include <xen/sched-if.h>
#include <xen/version.h>
#include <xen/hypercall.h>
#include <xen/domain_page.h>
@@ -23,6 +23,8 @@
#include <xen/event.h>
#include <xen/kernel.h>
#include <xen/keyhandler.h>
+#include <xen/sched.h>
+#include <xen/sched-if.h>
#include <xen/vm_event.h>
#include <asm/current.h>
#include <asm/cpufeature.h>
@@ -562,7 +564,7 @@ void vmx_vmcs_reload(struct vcpu *v)
* v->arch.hvm.vmx.vmcs_lock here. However, with interrupts disabled
* the VMCS can't be taken away from us anymore if we still own it.
*/
- ASSERT(v->is_running || !local_irq_is_enabled());
+ ASSERT(vcpu_running(v) || !local_irq_is_enabled());
if ( v->arch.hvm.vmx.vmcs_pa == this_cpu(current_vmcs) )
return;
@@ -1576,7 +1578,7 @@ void vmx_vcpu_flush_pml_buffer(struct vcpu *v)
uint64_t *pml_buf;
unsigned long pml_idx;
- ASSERT((v == current) || (!vcpu_runnable(v) && !v->is_running));
+ ASSERT((v == current) || (!vcpu_runnable(v) && !vcpu_running(v)));
ASSERT(vmx_vcpu_pml_enabled(v));
vmx_vmcs_enter(v);
@@ -19,6 +19,7 @@
#include <xen/lib.h>
#include <xen/trace.h>
#include <xen/sched.h>
+#include <xen/sched-if.h>
#include <xen/irq.h>
#include <xen/softirq.h>
#include <xen/domain_page.h>
@@ -907,7 +908,7 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
if ( unlikely(!this_cpu(vmxon)) )
return;
- if ( !v->is_running )
+ if ( !vcpu_running(v) )
{
/*
* When this vCPU isn't marked as running anymore, a remote pCPU's
@@ -2004,7 +2005,7 @@ static void vmx_process_isr(int isr, struct vcpu *v)
static void __vmx_deliver_posted_interrupt(struct vcpu *v)
{
- bool_t running = v->is_running;
+ bool_t running = vcpu_running(v);
vcpu_unblock(v);
/*
@@ -173,7 +173,7 @@ void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
{
if ( !(v->pause_flags & VPF_blocked) )
flags &= ~XEN_DOMINF_blocked;
- if ( v->is_running )
+ if ( vcpu_running(v) )
flags |= XEN_DOMINF_running;
info->nr_online_vcpus++;
}
@@ -841,7 +841,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
op->u.getvcpuinfo.online = !(v->pause_flags & VPF_down);
op->u.getvcpuinfo.blocked = !!(v->pause_flags & VPF_blocked);
- op->u.getvcpuinfo.running = v->is_running;
+ op->u.getvcpuinfo.running = vcpu_running(v);
op->u.getvcpuinfo.cpu_time = runstate.time[RUNSTATE_running];
op->u.getvcpuinfo.cpu = v->processor;
ret = 0;
@@ -306,7 +306,7 @@ static void dump_domains(unsigned char key)
printk(" VCPU%d: CPU%d [has=%c] poll=%d "
"upcall_pend=%02x upcall_mask=%02x ",
v->vcpu_id, v->processor,
- v->is_running ? 'T':'F', v->poll_evtchn,
+ vcpu_running(v) ? 'T':'F', v->poll_evtchn,
vcpu_info(v, evtchn_upcall_pending),
!vcpu_event_delivery_is_enabled(v));
if ( vcpu_cpu_dirty(v) )
@@ -723,7 +723,7 @@ __csched_vcpu_is_migrateable(const struct csched_private *prv, struct vcpu *vc,
* The caller is supposed to have already checked that vc is also
* not running.
*/
- ASSERT(!vc->is_running);
+ ASSERT(!vcpu_running(vc));
return !__csched_vcpu_is_cache_hot(prv, vc) &&
cpumask_test_cpu(dest_cpu, mask);
@@ -1047,7 +1047,7 @@ csched_item_insert(const struct scheduler *ops, struct sched_item *item)
lock = item_schedule_lock_irq(item);
- if ( !__vcpu_on_runq(svc) && vcpu_runnable(vc) && !vc->is_running )
+ if ( !__vcpu_on_runq(svc) && vcpu_runnable(vc) && !vcpu_running(vc) )
runq_insert(svc);
item_schedule_unlock_irq(lock, item);
@@ -1659,8 +1659,8 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step)
* vCPUs with useful soft affinities in some sort of bitmap
* or counter.
*/
- if ( vc->is_running || (balance_step == BALANCE_SOFT_AFFINITY &&
- !has_soft_affinity(vc->sched_item)) )
+ if ( vcpu_running(vc) || (balance_step == BALANCE_SOFT_AFFINITY &&
+ !has_soft_affinity(vc->sched_item)) )
continue;
affinity_balance_cpumask(vc->sched_item, balance_step, cpumask_scratch);
@@ -1868,7 +1868,7 @@ csched_schedule(
(unsigned char *)&d);
}
- runtime = now - current->runstate.state_entry_time;
+ runtime = now - current->sched_item->state_entry_time;
if ( runtime < 0 ) /* Does this ever happen? */
runtime = 0;
@@ -1283,7 +1283,7 @@ runq_insert(const struct scheduler *ops, struct csched2_item *svc)
ASSERT(&svc->rqd->runq == runq);
ASSERT(!is_idle_vcpu(svc->vcpu));
- ASSERT(!svc->vcpu->is_running);
+ ASSERT(!vcpu_running(svc->vcpu));
ASSERT(!(svc->flags & CSFLAG_scheduled));
list_for_each( iter, runq )
@@ -1340,8 +1340,8 @@ static inline bool is_preemptable(const struct csched2_item *svc,
if ( ratelimit <= CSCHED2_RATELIMIT_TICKLE_TOLERANCE )
return true;
- ASSERT(svc->vcpu->is_running);
- return now - svc->vcpu->runstate.state_entry_time >
+ ASSERT(vcpu_running(svc->vcpu));
+ return now - svc->vcpu->sched_item->state_entry_time >
ratelimit - CSCHED2_RATELIMIT_TICKLE_TOLERANCE;
}
@@ -2931,7 +2931,7 @@ csched2_dom_cntl(
{
svc = csched2_item(v->sched_item);
lock = item_schedule_lock(svc->vcpu->sched_item);
- if ( v->is_running )
+ if ( vcpu_running(v) )
{
unsigned int cpu = v->processor;
struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
@@ -3204,8 +3204,8 @@ csched2_runtime(const struct scheduler *ops, int cpu,
if ( prv->ratelimit_us )
{
s_time_t ratelimit_min = MICROSECS(prv->ratelimit_us);
- if ( snext->vcpu->is_running )
- ratelimit_min = snext->vcpu->runstate.state_entry_time +
+ if ( vcpu_running(snext->vcpu) )
+ ratelimit_min = snext->vcpu->sched_item->state_entry_time +
MICROSECS(prv->ratelimit_us) - now;
if ( ratelimit_min > min_time )
min_time = ratelimit_min;
@@ -3302,7 +3302,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
* no point forcing it to do so until rate limiting expires.
*/
if ( !yield && prv->ratelimit_us && vcpu_runnable(scurr->vcpu) &&
- (now - scurr->vcpu->runstate.state_entry_time) <
+ (now - scurr->vcpu->sched_item->state_entry_time) <
MICROSECS(prv->ratelimit_us) )
{
if ( unlikely(tb_init_done) )
@@ -3313,7 +3313,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
} d;
d.dom = scurr->vcpu->domain->domain_id;
d.vcpu = scurr->vcpu->vcpu_id;
- d.runtime = now - scurr->vcpu->runstate.state_entry_time;
+ d.runtime = now - scurr->vcpu->sched_item->state_entry_time;
__trace_var(TRC_CSCHED2_RATELIMIT, 1,
sizeof(d),
(unsigned char *)&d);
@@ -3561,7 +3561,7 @@ csched2_schedule(
if ( snext != scurr )
{
ASSERT(snext->rqd == rqd);
- ASSERT(!snext->vcpu->is_running);
+ ASSERT(!vcpu_running(snext->vcpu));
runq_remove(snext);
__set_bit(__CSFLAG_scheduled, &snext->flags);
@@ -914,7 +914,7 @@ rt_item_insert(const struct scheduler *ops, struct sched_item *item)
{
replq_insert(ops, svc);
- if ( !vc->is_running )
+ if ( !vcpu_running(vc) )
runq_insert(ops, svc);
}
item_schedule_unlock_irq(lock, item);
@@ -356,7 +356,8 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
if ( is_idle_domain(d) )
{
per_cpu(sched_res, v->processor)->curr = item;
- v->is_running = 1;
+ item->is_running = 1;
+ item->state_entry_time = NOW();
}
else
{
@@ -555,7 +556,7 @@ void vcpu_sleep_sync(struct vcpu *v)
{
vcpu_sleep_nosync(v);
- while ( !vcpu_runnable(v) && v->is_running )
+ while ( !vcpu_runnable(v) && vcpu_running(v) )
cpu_relax();
sync_vcpu_execstate(v);
@@ -680,7 +681,7 @@ static void vcpu_migrate_finish(struct vcpu *v)
* context_saved(); and in any case, if the bit is cleared, then
* someone else has already done the work so we don't need to.
*/
- if ( v->is_running || !test_bit(_VPF_migrating, &v->pause_flags) )
+ if ( vcpu_running(v) || !test_bit(_VPF_migrating, &v->pause_flags) )
return;
old_cpu = new_cpu = v->processor;
@@ -734,7 +735,7 @@ static void vcpu_migrate_finish(struct vcpu *v)
* because they both happen in (different) spinlock regions, and those
* regions are strictly serialised.
*/
- if ( v->is_running ||
+ if ( vcpu_running(v) ||
!test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
{
sched_spin_unlock_double(old_lock, new_lock, flags);
@@ -762,7 +763,7 @@ void vcpu_force_reschedule(struct vcpu *v)
{
spinlock_t *lock = item_schedule_lock_irq(v->sched_item);
- if ( v->is_running )
+ if ( vcpu_running(v) )
vcpu_migrate_start(v);
item_schedule_unlock_irq(lock, v->sched_item);
@@ -1597,8 +1598,9 @@ static void schedule(void)
* switch, else lost_records resume will not work properly.
*/
- ASSERT(!next->is_running);
- next->is_running = 1;
+ ASSERT(!vcpu_running(next));
+ next->sched_item->is_running = 1;
+ next->sched_item->state_entry_time = now;
pcpu_schedule_unlock_irq(lock, cpu);
@@ -1619,7 +1621,8 @@ void context_saved(struct vcpu *prev)
/* Clear running flag /after/ writing context to memory. */
smp_wmb();
- prev->is_running = 0;
+ prev->sched_item->is_running = 0;
+ prev->sched_item->state_entry_time = NOW();
/* Check for migration request /after/ clearing running flag. */
smp_mb();
@@ -59,8 +59,12 @@ struct sched_item {
/* Last time when item has been scheduled out. */
uint64_t last_run_time;
+ /* Last time item got (de-)scheduled. */
+ uint64_t state_entry_time;
- /* Item needs affinity restored. */
+ /* Currently running on a CPU? */
+ bool is_running;
+ /* Item needs affinity restored. */
bool affinity_broken;
/* Does soft affinity actually play a role (given hard affinity)? */
bool soft_aff_effective;
@@ -132,6 +136,11 @@ static inline struct sched_item *sched_idle_item(unsigned int cpu)
return idle_vcpu[cpu]->sched_item;
}
+static inline bool vcpu_running(struct vcpu *v)
+{
+ return v->sched_item->is_running;
+}
+
/*
* Scratch space, for avoiding having too many cpumask_t on the stack.
* Within each scheduler, when using the scratch mask of one pCPU:
@@ -181,8 +181,6 @@ struct vcpu
bool fpu_dirtied;
/* Initialization completed for this VCPU? */
bool is_initialised;
- /* Currently running on a CPU? */
- bool is_running;
/* VCPU should wake fast (do not deep sleep the CPU). */
bool is_urgent;
Move the is_running indicator from struct vcpu to struct sched_item.
For code outside the scheduler, introduce a vcpu_running() access
function for obtaining the related value.

At the same time introduce a state_entry_time field in struct
sched_item, which is updated whenever the is_running indicator
changes. Use that new field in the schedulers instead of the similar
field in the vcpu's runstate.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/domain.c                |  2 +-
 xen/arch/x86/hvm/hvm.c               |  3 ++-
 xen/arch/x86/hvm/viridian/viridian.c |  1 +
 xen/arch/x86/hvm/vmx/vmcs.c          |  6 ++++--
 xen/arch/x86/hvm/vmx/vmx.c           |  5 +++--
 xen/common/domctl.c                  |  4 ++--
 xen/common/keyhandler.c              |  2 +-
 xen/common/sched_credit.c            | 10 +++++-----
 xen/common/sched_credit2.c           | 18 +++++++++---------
 xen/common/sched_rt.c                |  2 +-
 xen/common/schedule.c                | 19 +++++++++++--------
 xen/include/xen/sched-if.h           | 11 ++++++++++-
 xen/include/xen/sched.h              |  2 --
 13 files changed, 50 insertions(+), 35 deletions(-)
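
For reference, below is a minimal, self-contained C sketch (not Xen code
and not part of the patch) of the pattern the change introduces: the
running flag lives in the scheduling unit, non-scheduler code reads it
through vcpu_running(), and state_entry_time is stamped on every
transition. The names sched_item, is_running, state_entry_time and
vcpu_running() mirror the patch; fake_now() and sched_item_set_running()
are invented here purely for illustration and do not exist in Xen.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct sched_item {
    bool is_running;            /* currently running on a CPU? */
    uint64_t state_entry_time;  /* last time is_running changed */
};

struct vcpu {
    struct sched_item *sched_item;
};

/* Stand-in for Xen's NOW(): a monotonically increasing fake clock. */
static uint64_t fake_now(void)
{
    static uint64_t t;
    return ++t;
}

/* Accessor for code outside the scheduler, as added to sched-if.h. */
static bool vcpu_running(struct vcpu *v)
{
    return v->sched_item->is_running;
}

/*
 * Illustrative helper: flip the flag and stamp the transition time,
 * roughly what schedule() and context_saved() do in the patch.
 */
static void sched_item_set_running(struct sched_item *item, bool run)
{
    item->is_running = run;
    item->state_entry_time = fake_now();
}

int main(void)
{
    struct sched_item item = { 0 };
    struct vcpu v = { .sched_item = &item };

    sched_item_set_running(&item, true);    /* vCPU picked to run */
    printf("running=%d, entered state at %llu\n",
           vcpu_running(&v), (unsigned long long)item.state_entry_time);

    sched_item_set_running(&item, false);   /* context saved */
    printf("running=%d, entered state at %llu\n",
           vcpu_running(&v), (unsigned long long)item.state_entry_time);
    return 0;
}

Keeping the timestamp next to the flag is what lets the ratelimit and
runtime checks in credit/credit2 switch from the vcpu runstate to
sched_item->state_entry_time, as the hunks above do.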