From patchwork Thu Jun 8 12:09:01 2017
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9774525
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Cc: George Dunlap, Anshul Makkar
Date: Thu, 08 Jun 2017 14:09:01 +0200
Message-ID: <149692374138.9605.16027485611403582297.stgit@Solace.fritz.box>
In-Reply-To: <149692186557.9605.11625777539060264052.stgit@Solace.fritz.box>
References: <149692186557.9605.11625777539060264052.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Subject: [Xen-devel] [PATCH 3/4] xen: credit2: improve distribution of budget (for domains with caps)

Instead of letting the vCPU that first tries to get some budget take it
all (although temporarily), allow each vCPU to only get a specific quota
of the total budget.

This improves fairness, allows for more parallelism, and prevents vCPUs
from being unable to get any budget (e.g., because some other vCPU always
comes first and takes it all) for one or more periods, and hence from
starving (and causing trouble in guest kernels, such as livelocks,
triggering of watchdogs, etc.).

Signed-off-by: Dario Faggioli
Reviewed-by: George Dunlap
---
Cc: George Dunlap
Cc: Anshul Makkar
---
 xen/common/sched_credit2.c |   48 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 3f7b8f0..97efde8 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -506,7 +506,7 @@ struct csched2_vcpu {
 
     int credit;
 
-    s_time_t budget;
+    s_time_t budget, budget_quota;
     struct list_head parked_elem;      /* On the parked_vcpus list   */
 
     s_time_t start_time; /* When we were scheduled (used for credit) */
@@ -1627,8 +1627,16 @@ static bool vcpu_try_to_get_budget(struct csched2_vcpu *svc)
 
     if ( sdom->budget > 0 )
     {
-        svc->budget = sdom->budget;
-        sdom->budget = 0;
+        s_time_t budget;
+
+        /* Get our quota, if there's at least as much budget */
+        if ( likely(sdom->budget >= svc->budget_quota) )
+            budget = svc->budget_quota;
+        else
+            budget = sdom->budget;
+
+        svc->budget = budget;
+        sdom->budget -= budget;
     }
     else
     {
@@ -1841,6 +1849,7 @@ csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
     svc->tickled_cpu = -1;
 
     svc->budget = STIME_MAX;
+    svc->budget_quota = 0;
     INIT_LIST_HEAD(&svc->parked_elem);
 
     SCHED_STAT_CRANK(vcpu_alloc);
@@ -2548,10 +2557,33 @@ csched2_dom_cntl(
         /* Cap */
         if ( op->u.credit2.cap != 0 )
         {
+            struct csched2_vcpu *svc;
+            spinlock_t *lock;
+
             spin_lock(&sdom->budget_lock);
             sdom->tot_budget = (CSCHED2_BDGT_REPL_PERIOD / 100) * op->u.credit2.cap;
             spin_unlock(&sdom->budget_lock);
 
+            /*
+             * When trying to get some budget and run, each vCPU will grab
+             * from the pool 1/N (with N = nr of vCPUs of the domain) of
+             * the total budget. Roughly speaking, this means each vCPU will
+             * have at least one chance to run during every period.
+             */
+            for_each_vcpu ( d, v )
+            {
+                svc = csched2_vcpu(v);
+                lock = vcpu_schedule_lock(svc->vcpu);
+                /*
+                 * Too small quotas would in theory cause a lot of overhead,
+                 * which then won't happen because, in csched2_runtime(),
+                 * CSCHED2_MIN_TIMER is what would be used anyway.
+                 */
+                svc->budget_quota = max(sdom->tot_budget / sdom->nr_vcpus,
+                                        CSCHED2_MIN_TIMER);
+                vcpu_schedule_unlock(lock, svc->vcpu);
+            }
+
             if ( sdom->cap == 0 )
             {
                 /*
@@ -2583,9 +2615,8 @@ csched2_dom_cntl(
                  */
                 for_each_vcpu ( d, v )
                 {
-                    struct csched2_vcpu *svc = csched2_vcpu(v);
-                    spinlock_t *lock = vcpu_schedule_lock(svc->vcpu);
-
+                    svc = csched2_vcpu(v);
+                    lock = vcpu_schedule_lock(svc->vcpu);
                     if ( v->is_running )
                     {
                         unsigned int cpu = v->processor;
@@ -2619,6 +2650,7 @@ csched2_dom_cntl(
                     vcpu_schedule_unlock(lock, svc->vcpu);
                 }
             }
+
             sdom->cap = op->u.credit2.cap;
         }
         else if ( sdom->cap != 0 )
@@ -2632,6 +2664,7 @@ csched2_dom_cntl(
                 spinlock_t *lock = vcpu_schedule_lock(svc->vcpu);
 
                 svc->budget = STIME_MAX;
+                svc->budget_quota = 0;
 
                 vcpu_schedule_unlock(lock, svc->vcpu);
             }
@@ -3266,7 +3299,8 @@ csched2_dump_vcpu(struct csched2_private *prv, struct csched2_vcpu *svc)
     printk(" credit=%" PRIi32" [w=%u]", svc->credit, svc->weight);
 
     if ( has_cap(svc) )
-        printk(" budget=%"PRI_stime, svc->budget);
+        printk(" budget=%"PRI_stime"(%"PRI_stime")",
+               svc->budget, svc->budget_quota);
 
     printk(" load=%"PRI_stime" (~%"PRI_stime"%%)", svc->avgload,
            (svc->avgload * 100) >> prv->load_precision_shift);
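
To see the quota arithmetic in isolation, here is a minimal standalone
sketch of the two steps the patch introduces: computing a per-vCPU quota
when a cap is set, and taking at most one quota from the domain's pool.
It is not Xen code. The toy_* names, the constant values, and the choice
to start with a full pool are assumptions made for illustration, and all
locking, parking and replenishment logic is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t s_time_t;               /* stand-in for Xen's s_time_t (ns) */

/* Assumed values, for illustration only; not taken from the patch. */
#define BDGT_REPL_PERIOD 10000000LL     /* replenishment period: 10 ms */
#define MIN_TIMER          500000LL     /* floor for a quota: 500 us   */

/* Hypothetical, stripped-down counterparts of the scheduler structs. */
struct toy_dom  { s_time_t tot_budget, budget; unsigned int nr_vcpus; };
struct toy_vcpu { s_time_t budget, budget_quota; };

/* Set a cap (a percentage) and give every vCPU a 1/N quota of the total. */
static void toy_set_cap(struct toy_dom *d, struct toy_vcpu *v, unsigned int cap)
{
    unsigned int i;

    assert(d->nr_vcpus > 0);
    d->tot_budget = (BDGT_REPL_PERIOD / 100) * cap;
    d->budget = d->tot_budget;          /* assumption: the pool starts full */

    for ( i = 0; i < d->nr_vcpus; i++ )
    {
        s_time_t quota = d->tot_budget / d->nr_vcpus;

        /* Never use a quota smaller than MIN_TIMER, as in the patch. */
        v[i].budget_quota = quota > MIN_TIMER ? quota : MIN_TIMER;
    }
}

/* Take at most one quota from the pool; false means the pool is empty. */
static bool toy_try_to_get_budget(struct toy_dom *d, struct toy_vcpu *v)
{
    s_time_t budget;

    if ( d->budget <= 0 )
        return false;                   /* the real code parks the vCPU here */

    /* Get the full quota if the pool can cover it, else what is left. */
    budget = d->budget >= v->budget_quota ? v->budget_quota : d->budget;
    v->budget = budget;
    d->budget -= budget;
    return true;
}
```

Under these assumed constants, a domain with 4 vCPUs and a cap of 50 has a
total budget of 5 ms per period and a quota of 1.25 ms per vCPU, so four
successive calls to toy_try_to_get_budget() each succeed and together drain
the pool, instead of the first caller taking all of it.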