[v2] libxl: avoid considering pCPUs outside of the cpupool during NUMA placement

Message ID 147705777085.26968.8266354656960221924.stgit@Solace.fritz.box (mailing list archive)
State New, archived

Commit Message

Dario Faggioli Oct. 21, 2016, 1:49 p.m. UTC
During NUMA automatic placement, the information
of how many vCPUs can run on what NUMA nodes is used,
in order to spread the load as evenly as possible.

Such information is derived from vCPU hard and soft
affinity, but that is not enough. In fact, affinity
can be set to a superset of the pCPUs that belong
to the domain's cpupool but, of course, the domain
will never run on pCPUs outside of its cpupool.

Take this into account in the placement algorithm.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changes from v1:
 * moved libxl_cpupoolinfo_init() inside the loop, as requested;
 * removed the pointless vinfo=NULL assignment _at_the_end_ of
   the loop (and kept the one at the beginning, which is necessary).
---
Wei, this is a bugfix, so I think it should go in 4.8.

Ian, this is a bugfix, so I think it is a backporting candidate.

Also, note that this function does not respect the libxl coding style, as far
as error handling is concerned. However, given that I'm asking for it to go in
now and to be backported, I've tried to keep the changes to the minimum.

I'm up for a follow-up patch for 4.9 to make the style compliant.

Thanks, Dario
---
 tools/libxl/libxl_numa.c |   24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)
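
The rule described in the commit message can be sketched in isolation. This is a minimal illustration, not the libxl implementation: plain 64-bit masks stand in for libxl bitmaps, the node-affinity check is left out for brevity, and vcpu_sketch, count_vcpus_on_nodes and MAX_NODES are hypothetical names introduced only for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical upper bound, for the sketch only */

/* Hypothetical stand-in for one vCPU: bit k set means affinity with pCPU k. */
typedef struct {
    uint64_t affinity;
} vcpu_sketch;

/*
 * For each node, count how many vCPUs could run on it. A pCPU only
 * contributes if it is in the vCPU's affinity, in the set of suitable
 * pCPUs, and in the domain's cpupool. Each vCPU is counted at most once
 * per node.
 */
void count_vcpus_on_nodes(const vcpu_sketch *vcpus, size_t nr_vcpus,
                          const int *cpu_to_node, int nr_cpus,
                          uint64_t suitable_cpus, uint64_t pool_cpus,
                          int *vcpus_on_node)
{
    for (size_t j = 0; j < nr_vcpus; j++) {
        /* The cpupool mask is the extra filter this patch is about. */
        uint64_t usable = vcpus[j].affinity & suitable_cpus & pool_cpus;
        uint32_t nodes_counted = 0;

        for (int k = 0; k < nr_cpus && k < 64; k++) {
            int node = cpu_to_node[k];

            if (!(usable & (UINT64_C(1) << k)))
                continue; /* pCPU k is not usable by this vCPU */
            if (node < 0 || node >= MAX_NODES)
                continue; /* outside the range the sketch tracks */
            if (nodes_counted & (UINT32_C(1) << node))
                continue; /* this vCPU was already counted on this node */

            nodes_counted |= UINT32_C(1) << node;
            vcpus_on_node[node]++;
        }
    }
}
```

With four pCPUs split evenly across two nodes, a vCPU whose affinity covers all four but whose cpupool holds only the first two is counted on node 0 only. Without the cpupool mask it would be counted on both nodes, which is the skew the patch removes.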

Comments

Wei Liu Oct. 21, 2016, 1:51 p.m. UTC | #1
On Fri, Oct 21, 2016 at 03:49:30PM +0200, Dario Faggioli wrote:
> During NUMA automatic placement, the information
> of how many vCPUs can run on what NUMA nodes is used,
> in order to spread the load as evenly as possible.
> 
> Such information is derived from vCPU hard and soft
> affinity, but that is not enough. In fact, affinity
> can be set to a superset of the pCPUs that belong
> to the domain's cpupool but, of course, the domain
> will never run on pCPUs outside of its cpupool.
> 
> Take this into account in the placement algorithm.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: George Dunlap <george.dunlap@citrix.com>
> Reviewed-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>

Patch

diff --git a/tools/libxl/libxl_numa.c b/tools/libxl/libxl_numa.c
index 33289d5..fd64c22 100644
--- a/tools/libxl/libxl_numa.c
+++ b/tools/libxl/libxl_numa.c
@@ -205,12 +205,21 @@  static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
     }
 
     for (i = 0; i < nr_doms; i++) {
-        libxl_vcpuinfo *vinfo;
-        int nr_dom_vcpus;
+        libxl_vcpuinfo *vinfo = NULL;
+        libxl_cpupoolinfo cpupool_info;
+        int cpupool, nr_dom_vcpus;
+
+        libxl_cpupoolinfo_init(&cpupool_info);
+
+        cpupool = libxl__domain_cpupool(gc, dinfo[i].domid);
+        if (cpupool < 0)
+            goto next;
+        if (libxl_cpupool_info(CTX, &cpupool_info, cpupool))
+            goto next;
 
         vinfo = libxl_list_vcpu(CTX, dinfo[i].domid, &nr_dom_vcpus, &nr_cpus);
         if (vinfo == NULL)
-            continue;
+            goto next;
 
         /* Retrieve the domain's node-affinity map */
         libxl_domain_get_nodeaffinity(CTX, dinfo[i].domid, &dom_nodemap);
@@ -220,6 +229,12 @@  static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
              * For each vcpu of each domain, it must have both vcpu-affinity
              * and node-affinity to (a pcpu belonging to) a certain node to
              * cause an increment in the corresponding element of the array.
+             *
+             * Note that we also need to check whether the cpu actually
+             * belongs to the domain's cpupool (the cpupool of the domain
+             * being checked). In fact, it could be that the vcpu has affinity
+             * with cpus in suitable_cpumask, but that are not in its own
+             * cpupool, and we don't want to consider those!
              */
             libxl_bitmap_set_none(&nodes_counted);
             libxl_for_each_set_bit(k, vinfo[j].cpumap) {
@@ -228,6 +243,7 @@  static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
                 int node = tinfo[k].node;
 
                 if (libxl_bitmap_test(suitable_cpumap, k) &&
+                    libxl_bitmap_test(&cpupool_info.cpumap, k) &&
                     libxl_bitmap_test(&dom_nodemap, node) &&
                     !libxl_bitmap_test(&nodes_counted, node)) {
                     libxl_bitmap_set(&nodes_counted, node);
@@ -236,6 +252,8 @@  static int nr_vcpus_on_nodes(libxl__gc *gc, libxl_cputopology *tinfo,
             }
         }
 
+ next:
+        libxl_cpupoolinfo_dispose(&cpupool_info);
         libxl_vcpuinfo_list_free(vinfo, nr_dom_vcpus);
     }
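
The patch replaces `continue` with a jump to a label at the end of the loop body, so that every early exit passes through the same cleanup calls. A minimal sketch of that shape follows, using hypothetical names (item_sketch, item_list_get, item_list_free, process_all) rather than libxl ones. The sketch also initialises the element count to 0 before the first jump; that is an assumption of the sketch, made because its free helper iterates over the count, and it is not something the patch itself does.

```c
#include <stdlib.h>

/* Hypothetical per-item resource. */
typedef struct {
    int *data;
} item_sketch;

/* Free nr entries and then the list itself; safe for (NULL, 0). */
void item_list_free(item_sketch *list, int nr)
{
    for (int i = 0; i < nr; i++)
        free(list[i].data);
    free(list);
}

/* Hypothetical lookup: fails for negative ids, else returns two items. */
item_sketch *item_list_get(int id, int *nr_out)
{
    item_sketch *list;

    if (id < 0)
        return NULL;
    list = calloc(2, sizeof(*list));
    if (list == NULL)
        return NULL;
    for (int i = 0; i < 2; i++)
        list[i].data = malloc(sizeof(int)); /* free(NULL) is fine on failure */
    *nr_out = 2;
    return list;
}

int process_all(const int *ids, int nr_ids)
{
    int processed = 0;

    for (int i = 0; i < nr_ids; i++) {
        item_sketch *list = NULL; /* valid state for the cleanup path */
        int nr = 0;               /* likewise: no entries to walk yet */

        if (ids[i] == 0)
            goto next;            /* early exit before the list exists */

        list = item_list_get(ids[i], &nr);
        if (list == NULL)
            goto next;

        processed += nr;

 next:
        item_list_free(list, nr); /* single cleanup point per iteration */
    }
    return processed;
}
```

The design point is that everything the cleanup code touches must be in a defined state before the first `goto`, which is why the pointer starts as NULL at the top of the loop body.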