Message ID | 148470181453.5815.10454358470087815699.stgit@Solace.fritz.box (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 18/01/17 01:10, Dario Faggioli wrote: > Scheduling information debug dump for Credit2 is hard > to read as it contains the same information repeated > multiple time in different ways. > > In fact, in Credit2, CPUs are grouped in runqueus. Before > this change, for each CPU, we were printing the while > content of the runqueue, as shown below: > > CPU[00] sibling=000003, core=0000ff > run: [32767.0] flags=0 cpu=0 credit=-1073741824 [w=0] load=0 (~0%) > 1: [0.0] flags=0 cpu=2 credit=3860932 [w=256] load=262144 (~100%) > 2: [0.1] flags=0 cpu=2 credit=3859906 [w=256] load=262144 (~100%) > CPU[01] sibling=000003, core=0000ff > run: [32767.1] flags=0 cpu=1 credit=-1073741824 [w=0] load=0 (~0%) > 1: [0.0] flags=0 cpu=2 credit=2859840 [w=256] load=262144 (~100%) > 2: [0.3] flags=0 cpu=2 credit=-17466062 [w=256] load=262144 (~100%) > CPU[02] sibling=00000c, core=0000ff > run: [0.0] flags=2 cpu=2 credit=1858628 [w=256] load=262144 (~100%) > 1: [0.3] flags=0 cpu=2 credit=-17466062 [w=256] load=262144 (~100%) > 2: [0.1] flags=0 cpu=2 credit=-23957055 [w=256] load=262144 (~100%) > CPU[03] sibling=00000c, core=0000ff > run: [32767.3] flags=0 cpu=3 credit=-1073741824 [w=0] load=0 (~0%) > 1: [0.1] flags=0 cpu=2 credit=-3957055 [w=256] load=262144 (~100%) > 2: [0.0] flags=0 cpu=2 credit=-6216254 [w=256] load=262144 (~100%) > CPU[04] sibling=000030, core=0000ff > run: [32767.4] flags=0 cpu=4 credit=-1073741824 [w=0] load=0 (~0%) > 1: [0.1] flags=0 cpu=2 credit=3782667 [w=256] load=262144 (~100%) > 2: [0.3] flags=0 cpu=2 credit=-16287483 [w=256] load=262144 (~100%) > > As it can be seen, all the CPUs print the whole content > of the runqueue they belong to, at the time of their > sampling, and this is cumbersome and hard to interpret! > > In new output format we print, for each CPU, only the vCPU > that is running there (if that's not the idle vCPU, in which > case, nothing is printed), while the runqueues content > is printed only once, in a dedicated section. > > An example: > > CPUs info: > CPU[02] runq=0, sibling=00000c, core=0000ff > run: [0.3] flags=2 cpu=2 credit=8054391 [w=256] load=262144 (~100%) > CPU[14] runq=1, sibling=00c000, core=00ff00 > run: [0.4] flags=2 cpu=14 credit=8771420 [w=256] load=262144 (~100%) > ... ... ... ... ... ... ... ... ... > Runqueue info: > runqueue 0: > 0: [0.1] flags=0 cpu=2 credit=7869771 [w=256] load=262144 (~100%) > 1: [0.0] flags=0 cpu=2 credit=7709649 [w=256] load=262144 (~100%) > runqueue 1: > 0: [0.5] flags=0 cpu=14 credit=-1188 [w=256] load=262144 (~100%) > > Note that there still is risk of inconsistency between > what is printed in the 'Runqueue info:' and in 'CPUs info:' > sections. That is unavoidable, as the relevant locks are > released and re-acquired, around each single operation. > > At least, the inconsistency is less severe than before. > > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index ef8e0d8..90fe591 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -2581,50 +2581,35 @@ static void csched2_dump_pcpu(const struct scheduler *ops, int cpu) { struct csched2_private *prv = CSCHED2_PRIV(ops); - struct list_head *runq, *iter; struct csched2_vcpu *svc; unsigned long flags; spinlock_t *lock; - int loop; #define cpustr keyhandler_scratch /* * We need both locks: + * - we print current, so we need the runqueue lock for this + * cpu (the one of the runqueue this cpu is associated to); * - csched2_dump_vcpu() wants to access domains' weights, - * which are protected by the private scheduler lock; - * - we scan through the runqueue, so we need the proper runqueue - * lock (the one of the runqueue this cpu is associated to). + * which are protected by the private scheduler lock. */ read_lock_irqsave(&prv->lock, flags); lock = per_cpu(schedule_data, cpu).schedule_lock; spin_lock(lock); - runq = &RQD(ops, cpu)->runq; - cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, cpu)); - printk(" sibling=%s, ", cpustr); + printk(" runq=%d, sibling=%s, ", c2r(ops, cpu), cpustr); cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu)); printk("core=%s\n", cpustr); - /* current VCPU */ + /* current VCPU (nothing to say if that's the idle vcpu) */ svc = CSCHED2_VCPU(curr_on_cpu(cpu)); - if ( svc ) + if ( svc && !is_idle_vcpu(svc->vcpu) ) { printk("\trun: "); csched2_dump_vcpu(prv, svc); } - loop = 0; - list_for_each( iter, runq ) - { - svc = __runq_elem(iter); - if ( svc ) - { - printk("\t%3d: ", ++loop); - csched2_dump_vcpu(prv, svc); - } - } - spin_unlock(lock); read_unlock_irqrestore(&prv->lock, flags); #undef cpustr @@ -2706,6 +2691,29 @@ csched2_dump(const struct scheduler *ops) } } + printk("Runqueue info:\n"); + for_each_cpu(i, &prv->active_queues) + { + struct csched2_runqueue_data *rqd = prv->rqd + i; + struct list_head *iter, *runq = &rqd->runq; + int loop = 0; + + /* We need the lock to scan the runqueue. */ + spin_lock(&rqd->lock); + printk("runqueue %d:\n", i); + list_for_each( iter, runq ) + { + struct csched2_vcpu *svc = __runq_elem(iter); + + if ( svc ) + { + printk("\t%3d: ", loop++); + csched2_dump_vcpu(prv, svc); + } + } + spin_unlock(&rqd->lock); + } + read_unlock_irqrestore(&prv->lock, flags); #undef cpustr } diff --git a/xen/common/schedule.c b/xen/common/schedule.c index 5b444c4..e551e06 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -1832,6 +1832,7 @@ void schedule_dump(struct cpupool *c) cpus = &cpupool_free_cpus; } + printk("CPUs info:\n"); for_each_cpu (i, cpus) { printk("CPU[%02d] ", i);
Scheduling information debug dump for Credit2 is hard to read as it contains the same information repeated multiple time in different ways. In fact, in Credit2, CPUs are grouped in runqueus. Before this change, for each CPU, we were printing the while content of the runqueue, as shown below: CPU[00] sibling=000003, core=0000ff run: [32767.0] flags=0 cpu=0 credit=-1073741824 [w=0] load=0 (~0%) 1: [0.0] flags=0 cpu=2 credit=3860932 [w=256] load=262144 (~100%) 2: [0.1] flags=0 cpu=2 credit=3859906 [w=256] load=262144 (~100%) CPU[01] sibling=000003, core=0000ff run: [32767.1] flags=0 cpu=1 credit=-1073741824 [w=0] load=0 (~0%) 1: [0.0] flags=0 cpu=2 credit=2859840 [w=256] load=262144 (~100%) 2: [0.3] flags=0 cpu=2 credit=-17466062 [w=256] load=262144 (~100%) CPU[02] sibling=00000c, core=0000ff run: [0.0] flags=2 cpu=2 credit=1858628 [w=256] load=262144 (~100%) 1: [0.3] flags=0 cpu=2 credit=-17466062 [w=256] load=262144 (~100%) 2: [0.1] flags=0 cpu=2 credit=-23957055 [w=256] load=262144 (~100%) CPU[03] sibling=00000c, core=0000ff run: [32767.3] flags=0 cpu=3 credit=-1073741824 [w=0] load=0 (~0%) 1: [0.1] flags=0 cpu=2 credit=-3957055 [w=256] load=262144 (~100%) 2: [0.0] flags=0 cpu=2 credit=-6216254 [w=256] load=262144 (~100%) CPU[04] sibling=000030, core=0000ff run: [32767.4] flags=0 cpu=4 credit=-1073741824 [w=0] load=0 (~0%) 1: [0.1] flags=0 cpu=2 credit=3782667 [w=256] load=262144 (~100%) 2: [0.3] flags=0 cpu=2 credit=-16287483 [w=256] load=262144 (~100%) As it can be seen, all the CPUs print the whole content of the runqueue they belong to, at the time of their sampling, and this is cumbersome and hard to interpret! In new output format we print, for each CPU, only the vCPU that is running there (if that's not the idle vCPU, in which case, nothing is printed), while the runqueues content is printed only once, in a dedicated section. An example: CPUs info: CPU[02] runq=0, sibling=00000c, core=0000ff run: [0.3] flags=2 cpu=2 credit=8054391 [w=256] load=262144 (~100%) CPU[14] runq=1, sibling=00c000, core=00ff00 run: [0.4] flags=2 cpu=14 credit=8771420 [w=256] load=262144 (~100%) ... ... ... ... ... ... ... ... ... Runqueue info: runqueue 0: 0: [0.1] flags=0 cpu=2 credit=7869771 [w=256] load=262144 (~100%) 1: [0.0] flags=0 cpu=2 credit=7709649 [w=256] load=262144 (~100%) runqueue 1: 0: [0.5] flags=0 cpu=14 credit=-1188 [w=256] load=262144 (~100%) Note that there still is risk of inconsistency between what is printed in the 'Runqueue info:' and in 'CPUs info:' sections. That is unavoidable, as the relevant locks are released and re-acquired, around each single operation. At least, the inconsistency is less severe than before. Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> --- Cc: George Dunlap <george.dunlap@eu.citrix.com> Cc: Anshul Makkar <anshul.makkar@citrix.com> --- xen/common/sched_credit2.c | 50 ++++++++++++++++++++++++++------------------ xen/common/schedule.c | 1 + 2 files changed, 30 insertions(+), 21 deletions(-)