From patchwork Fri Oct 21 09:56:14 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dario Faggioli
X-Patchwork-Id: 9388623
From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Date: Fri, 21 Oct 2016 11:56:14 +0200
Message-ID: <147704377421.10420.14327289650457148893.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
Cc: Juergen Gross, Wei Liu, Anshul Makkar, Ian Jackson, George Dunlap
Subject: [Xen-devel] [PATCH] libxl: avoid considering pCPUs outside of the cpupool during NUMA placement
List-Id: Xen developer discussion

During NUMA automatic placement, the information of how many vCPUs
can run on what NUMA nodes is used, in order to spread the load as
evenly as possible.

Such information is derived from vCPU hard and soft affinity, but that
is not enough.
In fact, affinity can be set to be a superset of the pCPUs that belong
to the domain's cpupool but, of course, the domain will never run on
pCPUs outside of that cpupool.

Take this into account in the placement algorithm.

Signed-off-by: Dario Faggioli
Reported-by: George Dunlap
Reviewed-by: Juergen Gross
---
Cc: Ian Jackson
Cc: Wei Liu
Cc: George Dunlap
Cc: Juergen Gross
Cc: Anshul Makkar
---
Wei, this is a bugfix, so I think it should go in 4.8.

Ian, this is a bugfix, so I think it is a backporting candidate.

Also, note that this function does not respect the libxl coding style,
as far as error handling is concerned. However, given that I'm asking
for it to go in now and to be backported, I've tried to keep the
changes to the minimum.

I'm up for a follow up patch for 4.9 to make the style compliant.

Thanks, Dario
---
 tools/libxl/libxl_numa.c |   25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_numa.c b/tools/libxl/libxl_numa.c
index 33289d5..f2a719d 100644
--- a/tools/libxl/libxl_numa.c
+++ b/tools/libxl/libxl_numa.c
@@ -186,9 +186,12 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
 {
     libxl_dominfo *dinfo = NULL;
     libxl_bitmap dom_nodemap, nodes_counted;
+    libxl_cpupoolinfo cpupool_info;
     int nr_doms, nr_cpus;
     int i, j, k;
 
+    libxl_cpupoolinfo_init(&cpupool_info);
+
     dinfo = libxl_list_domain(CTX, &nr_doms);
     if (dinfo == NULL)
         return ERROR_FAIL;
@@ -205,12 +208,18 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
     }
 
     for (i = 0; i < nr_doms; i++) {
-        libxl_vcpuinfo *vinfo;
-        int nr_dom_vcpus;
+        libxl_vcpuinfo *vinfo = NULL;
+        int cpupool, nr_dom_vcpus;
+
+        cpupool = libxl__domain_cpupool(gc, dinfo[i].domid);
+        if (cpupool < 0)
+            goto next;
+        if (libxl_cpupool_info(CTX, &cpupool_info, cpupool))
+            goto next;
 
         vinfo = libxl_list_vcpu(CTX, dinfo[i].domid, &nr_dom_vcpus, &nr_cpus);
         if (vinfo == NULL)
-            continue;
+            goto next;
 
         /* Retrieve the domain's node-affinity map */
         libxl_domain_get_nodeaffinity(CTX, dinfo[i].domid, &dom_nodemap);
@@ -220,6 +229,12 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
              * For each vcpu of each domain, it must have both vcpu-affinity
              * and node-affinity to (a pcpu belonging to) a certain node to
              * cause an increment in the corresponding element of the array.
+             *
+             * Note that we also need to check whether the cpu actually
+             * belongs to the domain's cpupool (the cpupool of the domain
+             * being checked). In fact, it could be that the vcpu has affinity
+             * with cpus in suitable_cpumask, but that are not in its own
+             * cpupool, and we don't want to consider those!
              */
             libxl_bitmap_set_none(&nodes_counted);
             libxl_for_each_set_bit(k, vinfo[j].cpumap) {
@@ -228,6 +243,7 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
                 int node = tinfo[k].node;
 
                 if (libxl_bitmap_test(suitable_cpumap, k) &&
+                    libxl_bitmap_test(&cpupool_info.cpumap, k) &&
                     libxl_bitmap_test(&dom_nodemap, node) &&
                     !libxl_bitmap_test(&nodes_counted, node)) {
                     libxl_bitmap_set(&nodes_counted, node);
@@ -236,7 +252,10 @@ static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
             }
         }
 
+ next:
+        libxl_cpupoolinfo_dispose(&cpupool_info);
         libxl_vcpuinfo_list_free(vinfo, nr_dom_vcpus);
+        vinfo = NULL;
     }
 
     libxl_bitmap_dispose(&dom_nodemap);
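
For readers who want to see the counting rule in isolation, the following is a minimal standalone sketch, not libxl code. It assumes at most 64 pCPUs and a small fixed number of nodes so that plain 64-bit masks can stand in for libxl_bitmap. The names struct vcpu_masks, count_vcpu_on_nodes, cpu_to_node and MAX_NODES are hypothetical and exist only for this illustration. It shows the check the patch adds: a pCPU only counts towards a node if it is also in the domain's cpupool.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8 /* illustrative limit, not a libxl constant */

/*
 * Hypothetical stand-in for what the placement code knows about one vCPU.
 * Plain 64-bit masks are used here instead of libxl_bitmap.
 */
struct vcpu_masks {
    uint64_t affinity;      /* pCPUs the vCPU has affinity with */
    uint64_t cpupool;       /* pCPUs in the cpupool of the vCPU's domain */
    uint64_t node_affinity; /* NUMA nodes the domain has affinity with */
};

/*
 * Count the vCPU once on every node it can actually run on. A pCPU is
 * usable only if it is in the candidate set, in the vCPU's affinity and
 * in the domain's cpupool; the last condition is the one the patch adds.
 */
static void count_vcpu_on_nodes(const struct vcpu_masks *v,
                                uint64_t suitable_cpus,
                                const int *cpu_to_node, int nr_cpus,
                                int *vcpus_on_node)
{
    uint64_t usable = v->affinity & suitable_cpus & v->cpupool;
    uint64_t nodes_counted = 0;
    int k;

    assert(nr_cpus >= 0 && nr_cpus <= 64);

    for (k = 0; k < nr_cpus; k++) {
        int node = cpu_to_node[k];

        if (!(usable & (UINT64_C(1) << k)))
            continue;

        assert(node >= 0 && node < MAX_NODES);

        /* Each node is counted at most once per vCPU. */
        if ((v->node_affinity & (UINT64_C(1) << node)) &&
            !(nodes_counted & (UINT64_C(1) << node))) {
            nodes_counted |= UINT64_C(1) << node;
            vcpus_on_node[node]++;
        }
    }
}
```

With four pCPUs split over two nodes, a vCPU whose affinity covers all four pCPUs but whose cpupool holds only the two pCPUs of node 0 is counted on node 0 alone. If the cpupool mask is left out of the intersection, the same vCPU is also counted on node 1, which is the behaviour the patch is meant to avoid.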