From patchwork Tue Jan 17 17:26:55 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9521591
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Cc: George Dunlap
Date: Tue, 17 Jan 2017 18:26:55 +0100
Message-ID: <148467401548.27920.15109928323593413208.stgit@Solace.fritz.box>
In-Reply-To: <148467379229.27920.2367500429219327194.stgit@Solace.fritz.box>
References: <148467379229.27920.2367500429219327194.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Subject: [Xen-devel] [PATCH 3/5] xen: credit2: fix shutdown/suspend when playing with cpupools.
List-Id: Xen developer discussion

In fact, during shutdown/suspend, we temporarily move all the vCPUs to
the BSP (i.e., pCPU 0, as of now).
For Credit2 domains, that means calling csched2_vcpu_migrate(), which
expects to find the target pCPU in the domain's pool. Therefore, if
Credit2 is the default scheduler and we have removed pCPU 0 from
cpupool0, shutdown/suspend fails like this:

 RIP:    e008:[] sched_credit2.c#migrate+0x274/0x2d1
 Xen call trace:
    [] sched_credit2.c#migrate+0x274/0x2d1
    [] sched_credit2.c#csched2_vcpu_migrate+0x6e/0x86
    [] schedule.c#vcpu_move_locked+0x69/0x6f
    [] cpu_disable_scheduler+0x3d7/0x430
    [] __cpu_disable+0x299/0x2b0
    [] cpu.c#take_cpu_down+0x2f/0x38
    [] stop_machine.c#stopmachine_action+0x7f/0x8d
    [] tasklet.c#do_tasklet_work+0x74/0xab
    [] do_tasklet+0x66/0x8b
    [] domain.c#idle_loop+0x3b/0x5e

 ****************************************
 Panic on CPU 8:
 Assertion 'svc->vcpu->processor < nr_cpu_ids' failed at sched_credit2.c:1729
 ****************************************

On the other hand, if Credit2 is the scheduler of another pool, when
trying (still during shutdown/suspend) to move the vCPUs of the Credit2
domains to pCPU 0, it figures out that pCPU 0 is not a Credit2 pCPU,
and fails like this:

 RIP:    e008:[] sched_credit2.c#csched2_vcpu_migrate+0xa1/0x107
 Xen call trace:
    [] sched_credit2.c#csched2_vcpu_migrate+0xa1/0x107
    [] schedule.c#vcpu_move_locked+0x69/0x6f
    [] cpu_disable_scheduler+0x3d7/0x430
    [] __cpu_disable+0x299/0x2b0
    [] cpu.c#take_cpu_down+0x2f/0x38
    [] stop_machine.c#stopmachine_action+0x7f/0x8d
    [] tasklet.c#do_tasklet_work+0x74/0xab
    [] do_tasklet+0x66/0x8b
    [] domain.c#idle_loop+0x3b/0x5e

The solution is to recognise this specific situation inside
csched2_vcpu_migrate() and, since it is only a temporary state that
occurs during shutdown/suspend, deal with it quickly. Then, in the
resume path, in restore_vcpu_affinity(), things are set back to normal,
and a new v->processor is chosen, for each vCPU, from the proper set of
pCPUs (i.e., the ones of the proper cpupool).

Signed-off-by: Dario Faggioli
Acked-by: George Dunlap
---
Cc: George Dunlap
---
This is a bugfix, and should be backported to 4.8.

Note that Credit2 being used, either as the default scheduler or in a
cpupool, is what triggers the bug, but the issue is actually more
general, and would affect any scheduler that remaps the runqueue locks.
---
 xen/common/sched_credit2.c |   32 +++++++++++++++++++++++++++++++-
 xen/common/schedule.c      |   25 ++++++++++++++++++++++---
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index ce0e146..2ce738d 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1946,13 +1946,43 @@ static void
 csched2_vcpu_migrate(
     const struct scheduler *ops, struct vcpu *vc, unsigned int new_cpu)
 {
+    struct domain *d = vc->domain;
     struct csched2_vcpu * const svc = CSCHED2_VCPU(vc);
     struct csched2_runqueue_data *trqd;
+    s_time_t now = NOW();
 
     /* Check if new_cpu is valid */
     ASSERT(cpumask_test_cpu(new_cpu, &CSCHED2_PRIV(ops)->initialized));
     ASSERT(cpumask_test_cpu(new_cpu, vc->cpu_hard_affinity));
 
+    /*
+     * Being passed a target pCPU which is outside of our cpupool is only
+     * valid if we are shutting down (or doing ACPI suspend), and we are
+     * moving everyone to BSP, no matter whether or not BSP is inside our
+     * cpupool.
+     *
+     * And since there indeed is the chance that it is not part of it, all
+     * we must do is remove _and_ unassign the vCPU from any runqueue, as
+     * well as update v->processor with the target, so that the suspend
+     * process can continue.
+     *
+     * It will then be during resume that a new, meaningful, value for
+     * v->processor will be chosen, and during actual domain unpause that
+     * the vCPU will be assigned to and added to the proper runqueue.
+     */
+    if ( unlikely(!cpumask_test_cpu(new_cpu, cpupool_domain_cpumask(d))) )
+    {
+        ASSERT(system_state == SYS_STATE_suspend);
+        if ( __vcpu_on_runq(svc) )
+        {
+            __runq_remove(svc);
+            update_load(ops, svc->rqd, NULL, -1, now);
+        }
+        __runq_deassign(svc);
+        vc->processor = new_cpu;
+        return;
+    }
+
     trqd = RQD(ops, new_cpu);
 
     /*
@@ -1964,7 +1994,7 @@ csched2_vcpu_migrate(
      * pointing to a pcpu where we can't run any longer.
      */
     if ( trqd != svc->rqd )
-        migrate(ops, svc, trqd, NOW());
+        migrate(ops, svc, trqd, now);
     else
         vc->processor = new_cpu;
 }
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 5b444c4..36ff2e9 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -633,8 +633,11 @@ void vcpu_force_reschedule(struct vcpu *v)
 
 void restore_vcpu_affinity(struct domain *d)
 {
+    unsigned int cpu = smp_processor_id();
     struct vcpu *v;
 
+    ASSERT(system_state == SYS_STATE_resume);
+
     for_each_vcpu ( d, v )
     {
         spinlock_t *lock = vcpu_schedule_lock_irq(v);
@@ -643,18 +646,34 @@ void restore_vcpu_affinity(struct domain *d)
         {
             cpumask_copy(v->cpu_hard_affinity, v->cpu_hard_affinity_saved);
             v->affinity_broken = 0;
+        }
 
-        if ( v->processor == smp_processor_id() )
+        /*
+         * During suspend (in cpu_disable_scheduler()), we moved every vCPU
+         * to BSP (which, as of now, is pCPU 0), as a temporary measure to
+         * allow the nonboot processors to have their data structure freed
+         * and go to sleep. But nothing guarantees that the BSP is a valid
+         * pCPU for a particular domain.
+         *
+         * Therefore, here, before actually unpausing the domains, we should
+         * set v->processor of each of their vCPUs to something that will
+         * make sense for the scheduler of the cpupool they are in.
+         */
+        cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+                    cpupool_domain_cpumask(v->domain));
+        v->processor = cpumask_any(cpumask_scratch_cpu(cpu));
+
+        if ( v->processor == cpu )
         {
             set_bit(_VPF_migrating, &v->pause_flags);
-            vcpu_schedule_unlock_irq(lock, v);
+            spin_unlock_irq(lock);
             vcpu_sleep_nosync(v);
             vcpu_migrate(v);
         }
         else
         {
-            vcpu_schedule_unlock_irq(lock, v);
+            spin_unlock_irq(lock);
         }
     }
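
For readers who prefer to see the logic outside of the diff context, below is a
small, self-contained toy program that condenses the two decisions the patch
makes. It is not Xen code: every type, bitmask and helper in it is a simplified
stand-in invented for illustration; only the idea comes from the hunks above
(a target pCPU outside the domain's cpupool is tolerated solely while the host
is suspending, and on resume v->processor is re-picked from the intersection of
hard affinity and the cpupool's pCPUs).

/*
 * Toy model of the two decisions introduced by the patch. Not Xen code;
 * all types and helpers are made-up stand-ins.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum toy_system_state { STATE_ACTIVE, STATE_SUSPEND, STATE_RESUME };

static enum toy_system_state state = STATE_ACTIVE;

struct toy_vcpu {
    unsigned int processor;   /* pCPU the vCPU currently sits on */
    bool on_runq;             /* queued on a (Credit2-like) runqueue? */
    uint64_t hard_affinity;   /* bitmask of pCPUs it may run on */
    uint64_t pool_mask;       /* bitmask of pCPUs in its domain's cpupool */
};

static bool cpu_in_mask(uint64_t mask, unsigned int cpu)
{
    return (mask >> cpu) & 1;
}

/* Shape of the early-return path added to csched2_vcpu_migrate(). */
static void toy_migrate(struct toy_vcpu *v, unsigned int new_cpu)
{
    if ( !cpu_in_mask(v->pool_mask, new_cpu) )
    {
        /* Only legitimate while everyone is being moved to the BSP. */
        assert(state == STATE_SUSPEND);
        v->on_runq = false;      /* stands in for __runq_remove()/__runq_deassign() */
        v->processor = new_cpu;  /* park it; resume picks a proper pCPU later */
        return;
    }
    v->processor = new_cpu;      /* normal, in-pool migration */
}

/* Shape of the v->processor re-selection added to restore_vcpu_affinity(). */
static void toy_restore_affinity(struct toy_vcpu *v)
{
    uint64_t ok = v->hard_affinity & v->pool_mask;  /* like cpumask_and() */

    assert(state == STATE_RESUME && ok != 0);
    for ( unsigned int cpu = 0; cpu < 64; cpu++ )
        if ( cpu_in_mask(ok, cpu) )                 /* like cpumask_any() */
        {
            v->processor = cpu;
            break;
        }
}

int main(void)
{
    /* The domain's cpupool holds pCPUs 2 and 3; the BSP (pCPU 0) is outside it. */
    struct toy_vcpu v = { .processor = 2, .on_runq = true,
                          .hard_affinity = 0xF, .pool_mask = 0xC };

    state = STATE_SUSPEND;
    toy_migrate(&v, 0);                     /* moved to the BSP: tolerated */
    printf("suspended: parked on pCPU %u\n", v.processor);

    state = STATE_RESUME;
    toy_restore_affinity(&v);               /* back onto a pCPU of its pool */
    printf("resumed: now on pCPU %u\n", v.processor);
    return 0;
}

The split mirrors the commit message: the suspend side deliberately does not
look for a "better" pCPU, because the state is transient, while the resume
side is the one place where a meaningful v->processor is computed again.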