From patchwork Fri Aug 18 15:51:09 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9909527
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Date: Fri, 18 Aug 2017 17:51:09 +0200
Message-ID: <150307146889.6642.4055929659011871848.stgit@Solace.fritz.box>
In-Reply-To: <150307081385.6642.6516202758428761422.stgit@Solace.fritz.box>
References: <150307081385.6642.6516202758428761422.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Cc: George Dunlap, Anshul Makkar
Subject: [Xen-devel] [PATCH v2 3/4] xen: credit2: improve distribution of
 budget (for domains with caps)
List-Id: Xen developer discussion

Instead of letting the vCPU that first tries to get some budget take it
all (albeit temporarily), allow each vCPU to get only a specific quota of
the total budget.

This improves fairness, allows for more parallelism, and prevents vCPUs
from being unable to get any budget for one or more periods (e.g., because
some other vCPU always comes first and takes it all), and hence from
starving (which causes trouble in guest kernels, such as livelocks,
triggering of watchdogs, etc.).

Signed-off-by: Dario Faggioli
Reviewed-by: George Dunlap
---
Cc: Anshul Makkar
---
Changes from v1:
- typos;
- spurious hunk moved to previous patch.
---
 xen/common/sched_credit2.c | 56 ++++++++++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index ce70224..211e2d6 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -522,6 +522,8 @@ struct csched2_vcpu {
     unsigned flags;                    /* Status flags (16 bits would be ok, */
     s_time_t budget;                   /* Current budget (if domains has cap) */
                                        /* but clear_bit() does not like that) */
+    s_time_t budget_quota;             /* Budget to which vCPU is entitled    */
+
     s_time_t start_time;               /* Time we were scheduled (for credit) */
 
     /* Individual contribution to load                                        */
@@ -1791,17 +1793,16 @@ static bool vcpu_grab_budget(struct csched2_vcpu *svc)
 
     if ( sdom->budget > 0 )
     {
-        /*
-         * NB: we give the whole remaining budget a domain has, to the first
-         * vCPU that comes here and asks for it. This means that, in a domain
-         * with a cap, only 1 vCPU is able to run, at any given time.
-         * /THIS IS GOING TO CHANGE/ in subsequent patches, toward something
-         * that allows much better fairness and parallelism. Proceeding in
-         * two steps, is for making things easy to understand, when looking
-         * at the signle commits.
-         */
-        svc->budget = sdom->budget;
-        sdom->budget = 0;
+        s_time_t budget;
+
+        /* Get our quota, if there's at least as much budget */
+        if ( likely(sdom->budget >= svc->budget_quota) )
+            budget = svc->budget_quota;
+        else
+            budget = sdom->budget;
+
+        svc->budget = budget;
+        sdom->budget -= budget;
     }
     else
     {
@@ -2036,6 +2037,7 @@ csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
     svc->tickled_cpu = -1;
 
     svc->budget = STIME_MAX;
+    svc->budget_quota = 0;
     INIT_LIST_HEAD(&svc->parked_elem);
 
     SCHED_STAT_CRANK(vcpu_alloc);
@@ -2822,6 +2824,9 @@ csched2_dom_cntl(
         /* Cap */
         if ( op->u.credit2.cap != 0 )
         {
+            struct csched2_vcpu *svc;
+            spinlock_t *lock;
+
             /* Cap is only valid if it's below 100 * nr_of_vCPUS */
             if ( op->u.credit2.cap > 100 * sdom->nr_vcpus )
             {
@@ -2834,6 +2839,26 @@ csched2_dom_cntl(
             sdom->tot_budget /= 100;
             spin_unlock(&sdom->budget_lock);
 
+            /*
+             * When trying to get some budget and run, each vCPU will grab
+             * from the pool 1/N (with N = nr of vCPUs of the domain) of
+             * the total budget. Roughly speaking, this means each vCPU will
+             * have at least one chance to run during every period.
+             */
+            for_each_vcpu ( d, v )
+            {
+                svc = csched2_vcpu(v);
+                lock = vcpu_schedule_lock(svc->vcpu);
+                /*
+                 * Too small quotas would in theory cause a lot of overhead,
+                 * which then won't happen because, in csched2_runtime(),
+                 * CSCHED2_MIN_TIMER is what would be used anyway.
+                 */
+                svc->budget_quota = max(sdom->tot_budget / sdom->nr_vcpus,
+                                        CSCHED2_MIN_TIMER);
+                vcpu_schedule_unlock(lock, svc->vcpu);
+            }
+
             if ( sdom->cap == 0 )
             {
                 /*
@@ -2865,9 +2890,8 @@ csched2_dom_cntl(
                  */
                 for_each_vcpu ( d, v )
                 {
-                    struct csched2_vcpu *svc = csched2_vcpu(v);
-                    spinlock_t *lock = vcpu_schedule_lock(svc->vcpu);
-
+                    svc = csched2_vcpu(v);
+                    lock = vcpu_schedule_lock(svc->vcpu);
                     if ( v->is_running )
                     {
                         unsigned int cpu = v->processor;
@@ -2917,6 +2941,7 @@ csched2_dom_cntl(
                 spinlock_t *lock = vcpu_schedule_lock(svc->vcpu);
 
                 svc->budget = STIME_MAX;
+                svc->budget_quota = 0;
 
                 vcpu_schedule_unlock(lock, svc->vcpu);
             }
@@ -3601,7 +3626,8 @@ csched2_dump_vcpu(struct csched2_private *prv, struct csched2_vcpu *svc)
     printk(" credit=%" PRIi32" [w=%u]", svc->credit, svc->weight);
 
     if ( has_cap(svc) )
-        printk(" budget=%"PRI_stime, svc->budget);
+        printk(" budget=%"PRI_stime"(%"PRI_stime")",
+               svc->budget, svc->budget_quota);
 
     printk(" load=%"PRI_stime" (~%"PRI_stime"%%)", svc->avgload,
            (svc->avgload * 100) >> prv->load_precision_shift);
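
For readers who want to reason about the new behaviour outside of the
hypervisor, the quota computation and the capped grab can be sketched in
isolation as below. This is only an illustration under stated assumptions,
not code from the patch: quota_for(), grab_budget() and MIN_QUOTA_NS are
hypothetical names, MIN_QUOTA_NS merely plays the role of CSCHED2_MIN_TIMER
(its value here is made up), there is no locking, and the step where a vCPU
returns its leftover budget to the domain's pool is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s_time_t;          /* stand-in for Xen's signed time type */

/* Hypothetical floor, playing the role of CSCHED2_MIN_TIMER (value assumed). */
#define MIN_QUOTA_NS 500000LL

/* Per-vCPU quota: an equal share of the domain's total budget, with a floor. */
static s_time_t quota_for(s_time_t tot_budget, unsigned int nr_vcpus)
{
    s_time_t share;

    assert(nr_vcpus > 0);
    share = tot_budget / nr_vcpus;

    return share > MIN_QUOTA_NS ? share : MIN_QUOTA_NS;
}

/*
 * Take at most `quota` from the pool. Returns the amount granted, which is
 * 0 when the pool is empty and less than `quota` when the pool is short.
 */
static s_time_t grab_budget(s_time_t *pool, s_time_t quota)
{
    s_time_t granted;

    if ( *pool <= 0 )
        return 0;

    granted = (*pool >= quota) ? quota : *pool;
    *pool -= granted;

    return granted;
}
```

With a total budget of 10ms and 4 vCPUs, quota_for() yields 2.5ms each, so
up to four vCPUs can hold budget within the same period, instead of the
first one draining the pool. When the pool holds less than one quota, the
caller gets whatever is left, which mirrors the else branch added to
vcpu_grab_budget().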