From patchwork Fri Aug 19 16:26:11 2016
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9290547
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Cc: George Dunlap, Andrew Cooper, Jan Beulich
Date: Fri, 19 Aug 2016 18:26:11 +0200
Message-ID: <147162397101.11630.6555683723098718425.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Subject: [Xen-devel] [PATCH v3] xen: credit1: fix a race when picking initial pCPU for a vCPU

In the Credit1 hunk of 9f358ddd69463 ("xen: Have schedulers revise
initial placement"), csched_cpu_pick() is called without taking the
runqueue lock of the (temporary) pCPU that the vCPU has been assigned
to (e.g., in XEN_DOMCTL_max_vcpus).

However, although 'hidden' in the IS_RUNQ_IDLE() macro, that function
does access the runq (for doing load balancing calculations). Two
scenarios are possible:
 1) we are on cpu X, and IS_RUNQ_IDLE() peeks at cpu X's own runq;
 2) we are on cpu X, but IS_RUNQ_IDLE() peeks at some other cpu's runq.

Scenario 2) absolutely requires that the appropriate runq lock is
taken. Scenario 1) works even without taking the cpu's own runq lock.
That is actually what happens when _csched_cpu_pick() is called from
csched_vcpu_acct() (in turn, called by csched_tick()).

Races have been observed and reported (by both XenServer's own testing
and OSSTest [1]), in the form of IS_RUNQ_IDLE() falling over
LIST_POISON, because we are not currently holding the proper lock in
csched_vcpu_insert(), when scenario 1) occurs.

However, for better robustness, from now on we always ask for the
proper runq lock to be held when calling IS_RUNQ_IDLE() (which is also
becoming a static inline function instead of a macro). In order to
comply with that, we take the lock around the call to
_csched_cpu_pick() in csched_vcpu_acct().
[1] https://lists.xen.org/archives/html/xen-devel/2016-08/msg02144.html

Reported-by: Andrew Cooper
Signed-off-by: Dario Faggioli
Reviewed-by: George Dunlap
---
Cc: George Dunlap
Cc: Andrew Cooper
Cc: Jan Beulich
---
Changes from v2:
 - actually, take the runq lock around the call to _csched_cpu_pick()
   in csched_vcpu_acct(), as considered better by George during review.

Changes from v1:
 - macro IS_RUNQ_IDLE() to static inline is_runq_idle(), as suggested
   during review;
 - add an ASSERT() and a comment, as suggested during review;
 - take into account what's described in the changelog as "scenario 1)",
   which wasn't being considered in v1.
---
 xen/common/sched_credit.c | 53 +++++++++++++++++++++++++++++++++------------
 1 file changed, 39 insertions(+), 14 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 220ff0d..c2b4b24 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -84,9 +84,6 @@
 #define CSCHED_VCPU(_vcpu)  ((struct csched_vcpu *) (_vcpu)->sched_priv)
 #define CSCHED_DOM(_dom)    ((struct csched_dom *) (_dom)->sched_priv)
 #define RUNQ(_cpu)          (&(CSCHED_PCPU(_cpu)->runq))
-/* Is the first element of _cpu's runq its idle vcpu? */
-#define IS_RUNQ_IDLE(_cpu)  (list_empty(RUNQ(_cpu)) || \
-                             is_idle_vcpu(__runq_elem(RUNQ(_cpu)->next)->vcpu))
 
 /*
@@ -248,6 +245,18 @@ __runq_elem(struct list_head *elem)
     return list_entry(elem, struct csched_vcpu, runq_elem);
 }
 
+/* Is the first element of cpu's runq (if any) cpu's idle vcpu? */
+static inline bool_t is_runq_idle(unsigned int cpu)
+{
+    /*
+     * We're peeking at cpu's runq, we must hold the proper lock.
+     */
+    ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
+
+    return list_empty(RUNQ(cpu)) ||
+           is_idle_vcpu(__runq_elem(RUNQ(cpu)->next)->vcpu);
+}
+
 static inline void
 __runq_insert(struct csched_vcpu *svc)
 {
@@ -771,7 +780,7 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
          * runnable vcpu on cpu, we add cpu to the idlers.
          */
         cpumask_and(&idlers, &cpu_online_map, CSCHED_PRIV(ops)->idlers);
-        if ( vc->processor == cpu && IS_RUNQ_IDLE(cpu) )
+        if ( vc->processor == cpu && is_runq_idle(cpu) )
             __cpumask_set_cpu(cpu, &idlers);
         cpumask_and(&cpus, &cpus, &idlers);
 
@@ -951,21 +960,33 @@ csched_vcpu_acct(struct csched_private *prv, unsigned int cpu)
     /*
      * Put this VCPU and domain back on the active list if it was
      * idling.
-     *
-     * If it's been active a while, check if we'd be better off
-     * migrating it to run elsewhere (see multi-core and multi-thread
-     * support in csched_cpu_pick()).
      */
     if ( list_empty(&svc->active_vcpu_elem) )
     {
         __csched_vcpu_acct_start(prv, svc);
     }
-    else if ( _csched_cpu_pick(ops, current, 0) != cpu )
+    else
     {
-        SCHED_VCPU_STAT_CRANK(svc, migrate_r);
-        SCHED_STAT_CRANK(migrate_running);
-        set_bit(_VPF_migrating, &current->pause_flags);
-        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+        unsigned int new_cpu;
+        unsigned long flags;
+        spinlock_t *lock = vcpu_schedule_lock_irqsave(current, &flags);
+
+        /*
+         * If it's been active a while, check if we'd be better off
+         * migrating it to run elsewhere (see multi-core and multi-thread
+         * support in csched_cpu_pick()).
+         */
+        new_cpu = _csched_cpu_pick(ops, current, 0);
+
+        vcpu_schedule_unlock_irqrestore(lock, flags, current);
+
+        if ( new_cpu != cpu )
+        {
+            SCHED_VCPU_STAT_CRANK(svc, migrate_r);
+            SCHED_STAT_CRANK(migrate_running);
+            set_bit(_VPF_migrating, &current->pause_flags);
+            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+        }
     }
 }
 
@@ -998,9 +1019,13 @@ csched_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
 
     BUG_ON( is_idle_vcpu(vc) );
 
-    /* This is safe because vc isn't yet being scheduled */
+    /* csched_cpu_pick() looks in vc->processor's runq, so we need the lock. */
+    lock = vcpu_schedule_lock_irq(vc);
+
     vc->processor = csched_cpu_pick(ops, vc);
+
+    spin_unlock_irq(lock);
 
     lock = vcpu_schedule_lock_irq(vc);
 
     if ( !__vcpu_on_runq(svc) && vcpu_runnable(vc) && !vc->is_running )
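
For readers unfamiliar with how "falling over LIST_POISON" manifests, here is a
minimal, self-contained sketch of the failure mode the changelog describes. It
is illustrative only, not Xen code: the poison values and intrusive-list layout
mirror the Linux/Xen list convention, while runq, vcpu and peek are made-up
names re-enacting, in one thread, the interleaving that two CPUs can produce.

/*
 * Stand-alone sketch (assumed names, not Xen code) of why peeking at a
 * runqueue without holding its lock can dereference a poisoned pointer.
 */
#include <stdio.h>

struct list_head {
    struct list_head *next, *prev;
};

#define LIST_POISON1 ((struct list_head *)0x100)
#define LIST_POISON2 ((struct list_head *)0x200)

static void list_del(struct list_head *entry)
{
    /* Unlink the entry... */
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    /* ...and poison its pointers, so stale users fault loudly. */
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

int main(void)
{
    struct list_head runq = { &runq, &runq };  /* empty runqueue */
    struct list_head vcpu;

    /* Enqueue vcpu at the head of the runq. */
    vcpu.next = runq.next;
    vcpu.prev = &runq;
    runq.next->prev = &vcpu;
    runq.next = &vcpu;

    /*
     * CPU X, without the runq lock, grabs the first element, much as
     * IS_RUNQ_IDLE() did via RUNQ(_cpu)->next ...
     */
    struct list_head *peek = runq.next;

    /* ... while another CPU, lock held, dequeues that same element. */
    list_del(&vcpu);

    /*
     * The unlocked reader is now holding poisoned pointers; following
     * peek->next in real code is the reported crash.
     */
    printf("peek->next = %p (LIST_POISON1 = %p)\n",
           (void *)peek->next, (void *)LIST_POISON1);
    return 0;
}

This is the hazard the patch closes: is_runq_idle() now turns the implicit
locking requirement into an explicit ASSERT(spin_is_locked(...)), and both
call paths (csched_vcpu_insert() and csched_vcpu_acct()) take the runq lock
before the peek happens.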