Message ID | 148467401548.27920.15109928323593413208.stgit@Solace.fritz.box (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jan 17, 2017 at 5:26 PM, Dario Faggioli <dario.faggioli@citrix.com> wrote: > In fact, during shutdown/suspend, we temporarily move all > the vCPUs to the BSP (i.e., pCPU 0, as of now). For Credit2 > domains, we call csched2_vcpu_migrate(), expects to find the > target pCPU in the domain's pool > > Therefore, if Credit2 is the default scheduler and we have > removed pCPU 0 from cpupool0, shutdown/suspend fails like > this: > > RIP: e008:[<ffff82d08012906d>] sched_credit2.c#migrate+0x274/0x2d1 > Xen call trace: > [<ffff82d08012906d>] sched_credit2.c#migrate+0x274/0x2d1 > [<ffff82d080129138>] sched_credit2.c#csched2_vcpu_migrate+0x6e/0x86 > [<ffff82d08012c468>] schedule.c#vcpu_move_locked+0x69/0x6f > [<ffff82d08012ec14>] cpu_disable_scheduler+0x3d7/0x430 > [<ffff82d08019669b>] __cpu_disable+0x299/0x2b0 > [<ffff82d0801012f8>] cpu.c#take_cpu_down+0x2f/0x38 > [<ffff82d0801312d8>] stop_machine.c#stopmachine_action+0x7f/0x8d > [<ffff82d0801330b8>] tasklet.c#do_tasklet_work+0x74/0xab > [<ffff82d0801333ed>] do_tasklet+0x66/0x8b > [<ffff82d080166a73>] domain.c#idle_loop+0x3b/0x5e > > **************************************** > Panic on CPU 8: > Assertion 'svc->vcpu->processor < nr_cpu_ids' failed at sched_credit2.c:1729 > **************************************** > > On the other hand, if Credit2 is the scheduler of another > pool, when trying (still during shutdown/suspend) to move > the vCPUs of the Credit2 domains to pCPU 0, it figures > out that pCPU 0 is not a Credit2 pCPU, and fails like this: > > RIP: e008:[<ffff82d08012916b>] sched_credit2.c#csched2_vcpu_migrate+0xa1/0x107 > Xen call trace: > [<ffff82d08012916b>] sched_credit2.c#csched2_vcpu_migrate+0xa1/0x107 > [<ffff82d08012c4e9>] schedule.c#vcpu_move_locked+0x69/0x6f > [<ffff82d08012edfc>] cpu_disable_scheduler+0x3d7/0x430 > [<ffff82d08019687b>] __cpu_disable+0x299/0x2b0 > [<ffff82d0801012f8>] cpu.c#take_cpu_down+0x2f/0x38 > [<ffff82d0801314c0>] stop_machine.c#stopmachine_action+0x7f/0x8d > [<ffff82d0801332a0>] tasklet.c#do_tasklet_work+0x74/0xab > [<ffff82d0801335d5>] do_tasklet+0x66/0x8b > [<ffff82d080166c53>] domain.c#idle_loop+0x3b/0x5e > > The solution is to recognise the specific situation, inside > csched2_vcpu_migrate() and, considering it is something temporary, > which only happens during shutdown/suspend, quickly deal with it. > > Then, in the resume path, in restore_vcpu_affinity(), things > are set back to normal, and a new v->processor is chosen, for > each vCPU, from the proper set of pCPUs (i.e., the ones of > the proper cpupool). > > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index ce0e146..2ce738d 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -1946,13 +1946,43 @@ static void csched2_vcpu_migrate( const struct scheduler *ops, struct vcpu *vc, unsigned int new_cpu) { + struct domain *d = vc->domain; struct csched2_vcpu * const svc = CSCHED2_VCPU(vc); struct csched2_runqueue_data *trqd; + s_time_t now = NOW(); /* Check if new_cpu is valid */ ASSERT(cpumask_test_cpu(new_cpu, &CSCHED2_PRIV(ops)->initialized)); ASSERT(cpumask_test_cpu(new_cpu, vc->cpu_hard_affinity)); + /* + * Being passed a target pCPU which is outside of our cpupool is only + * valid if we are shutting down (or doing ACPI suspend), and we are + * moving everyone to BSP, no matter whether or not BSP is inside our + * cpupool. + * + * And since there indeed is the chance that it is not part of it, all + * we must do is remove _and_ unassign the vCPU from any runqueue, as + * well as updating v->processor with the target, so that the suspend + * process can continue. + * + * It will then be during resume that a new, meaningful, value for + * v->processor will be chosen, and during actual domain unpause that + * the vCPU will be assigned to and added to the proper runqueue. + */ + if ( unlikely(!cpumask_test_cpu(new_cpu, cpupool_domain_cpumask(d))) ) + { + ASSERT(system_state == SYS_STATE_suspend); + if ( __vcpu_on_runq(svc) ) + { + __runq_remove(svc); + update_load(ops, svc->rqd, NULL, -1, now); + } + __runq_deassign(svc); + vc->processor = new_cpu; + return; + } + trqd = RQD(ops, new_cpu); /* @@ -1964,7 +1994,7 @@ csched2_vcpu_migrate( * pointing to a pcpu where we can't run any longer. */ if ( trqd != svc->rqd ) - migrate(ops, svc, trqd, NOW()); + migrate(ops, svc, trqd, now); else vc->processor = new_cpu; } diff --git a/xen/common/schedule.c b/xen/common/schedule.c index 5b444c4..36ff2e9 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -633,8 +633,11 @@ void vcpu_force_reschedule(struct vcpu *v) void restore_vcpu_affinity(struct domain *d) { + unsigned int cpu = smp_processor_id(); struct vcpu *v; + ASSERT(system_state == SYS_STATE_resume); + for_each_vcpu ( d, v ) { spinlock_t *lock = vcpu_schedule_lock_irq(v); @@ -643,18 +646,34 @@ void restore_vcpu_affinity(struct domain *d) { cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved); v->affinity_broken = 0; + } - if ( v->processor == smp_processor_id() ) + /* + * During suspend (in cpu_disable_scheduler()), we moved every vCPU + * to BSP (which, as of now, is pCPU 0), as a temporary measure to + * allow the nonboot processors to have their data structure freed + * and go to sleep. But nothing guardantees that the BSP is a valid + * pCPU for a particular domain. + * + * Therefore, here, before actually unpausing the domains, we should + * set v->processor of each of their vCPUs to something that will + * make sense for the scheduler of the cpupool in which they are in. + */ + cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity, + cpupool_domain_cpumask(v->domain)); + v->processor = cpumask_any(cpumask_scratch_cpu(cpu)); + + if ( v->processor == cpu ) { set_bit(_VPF_migrating, &v->pause_flags); - vcpu_schedule_unlock_irq(lock, v); + spin_unlock_irq(lock);; vcpu_sleep_nosync(v); vcpu_migrate(v); } else { - vcpu_schedule_unlock_irq(lock, v); + spin_unlock_irq(lock); } }
In fact, during shutdown/suspend, we temporarily move all the vCPUs to the BSP (i.e., pCPU 0, as of now). For Credit2 domains, we call csched2_vcpu_migrate(), expects to find the target pCPU in the domain's pool Therefore, if Credit2 is the default scheduler and we have removed pCPU 0 from cpupool0, shutdown/suspend fails like this: RIP: e008:[<ffff82d08012906d>] sched_credit2.c#migrate+0x274/0x2d1 Xen call trace: [<ffff82d08012906d>] sched_credit2.c#migrate+0x274/0x2d1 [<ffff82d080129138>] sched_credit2.c#csched2_vcpu_migrate+0x6e/0x86 [<ffff82d08012c468>] schedule.c#vcpu_move_locked+0x69/0x6f [<ffff82d08012ec14>] cpu_disable_scheduler+0x3d7/0x430 [<ffff82d08019669b>] __cpu_disable+0x299/0x2b0 [<ffff82d0801012f8>] cpu.c#take_cpu_down+0x2f/0x38 [<ffff82d0801312d8>] stop_machine.c#stopmachine_action+0x7f/0x8d [<ffff82d0801330b8>] tasklet.c#do_tasklet_work+0x74/0xab [<ffff82d0801333ed>] do_tasklet+0x66/0x8b [<ffff82d080166a73>] domain.c#idle_loop+0x3b/0x5e **************************************** Panic on CPU 8: Assertion 'svc->vcpu->processor < nr_cpu_ids' failed at sched_credit2.c:1729 **************************************** On the other hand, if Credit2 is the scheduler of another pool, when trying (still during shutdown/suspend) to move the vCPUs of the Credit2 domains to pCPU 0, it figures out that pCPU 0 is not a Credit2 pCPU, and fails like this: RIP: e008:[<ffff82d08012916b>] sched_credit2.c#csched2_vcpu_migrate+0xa1/0x107 Xen call trace: [<ffff82d08012916b>] sched_credit2.c#csched2_vcpu_migrate+0xa1/0x107 [<ffff82d08012c4e9>] schedule.c#vcpu_move_locked+0x69/0x6f [<ffff82d08012edfc>] cpu_disable_scheduler+0x3d7/0x430 [<ffff82d08019687b>] __cpu_disable+0x299/0x2b0 [<ffff82d0801012f8>] cpu.c#take_cpu_down+0x2f/0x38 [<ffff82d0801314c0>] stop_machine.c#stopmachine_action+0x7f/0x8d [<ffff82d0801332a0>] tasklet.c#do_tasklet_work+0x74/0xab [<ffff82d0801335d5>] do_tasklet+0x66/0x8b [<ffff82d080166c53>] domain.c#idle_loop+0x3b/0x5e The solution is to recognise the specific situation, inside csched2_vcpu_migrate() and, considering it is something temporary, which only happens during shutdown/suspend, quickly deal with it. Then, in the resume path, in restore_vcpu_affinity(), things are set back to normal, and a new v->processor is chosen, for each vCPU, from the proper set of pCPUs (i.e., the ones of the proper cpupool). Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> --- Cc: George Dunlap <george.dunlap@eu.citrix.com> --- This is a bugfix, and should be backported to 4.8. Note that Credit2 being used, either as default scheduler or in a cpupool is what triggers the bug, but it's actually more a general thing, which would affect any scheduler that remaps the runqueue locks. --- xen/common/sched_credit2.c | 32 +++++++++++++++++++++++++++++++- xen/common/schedule.c | 25 ++++++++++++++++++++++--- 2 files changed, 53 insertions(+), 4 deletions(-)