Message ID | 20230223081401.248835-3-gshan@redhat.com (mailing list archive)
---|---
State | New, archived
Series | NUMA: Apply socket-NUMA-node boundary for aarch64 and RiscV machines
On 23/2/23 09:13, Gavin Shan wrote:
> For some architectures like ARM64, multiple CPUs in one socket can't be
> associated with different NUMA nodes. Otherwise, the guest kernel is confused
> about the CPU topology. For example, the following warning message is observed
> from linux guest with the below command lines.
>
>   -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>   -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>   -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>   -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
>
>   ------------[ cut here ]------------
>   WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>   Modules linked in:
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>   pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>   pc : build_sched_domains+0x284/0x910
>   lr : build_sched_domains+0x184/0x910
>   sp : ffff80000804bd50
>   x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>   x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>   x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>   x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>   x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>   x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>   x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>   x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>   x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>   x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>   Call trace:
>    build_sched_domains+0x284/0x910
>    sched_init_domains+0xac/0xe0
>    sched_init_smp+0x48/0xc8
>    kernel_init_freeable+0x140/0x1ac
>    kernel_init+0x28/0x140
>    ret_from_fork+0x10/0x20
>
> Improve the sitation to reject the configuration where multiple CPUs

Typo "situation".

> in one socket have been associated with different NUMA nodes. The
> newly introduced helper set_numa_socket_boundary() is expected to be
> called by specific machines (boards) where the boundary is required.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>   hw/core/machine.c     | 34 ++++++++++++++++++++++++++++++++++
>   hw/core/numa.c        |  7 +++++++
>   include/sysemu/numa.h |  4 ++++
>   3 files changed, 45 insertions(+)
>
> diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> index 4173ef2afa..160008fff4 100644
> --- a/include/sysemu/numa.h
> +++ b/include/sysemu/numa.h
> @@ -86,6 +86,9 @@ struct NumaState {
>       /* Detect if HMAT support is enabled. */
>       bool hmat_enabled;
>
> +    /* CPUs in one socket can't break socket boundary */
> +    bool have_socket_boundary;

This field belongs to MachineClass; please add it as
numa_have_socket_boundary, just after numa_mem_supported.

Next patches become:

---
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index f778cb6d09..a48f1b2329 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -864,6 +864,7 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
     mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
     mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
     mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
+    mc->numa_have_socket_boundary = true;
 }
---

Otherwise LGTM :)
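For illustration, the MachineClass variant suggested above might look like the fragment below. This is only a sketch: the neighbouring fields and the comment wording are assumptions, not part of any posted patch. Boards that want the restriction would then set the flag in their class_init, as in the sbsa-ref hunk quoted above.

struct MachineClass {
    /* ... existing fields ... */
    bool numa_mem_supported;
    /*
     * When true, all CPUs that share a socket must be assigned to the
     * same NUMA node; mixed assignments are rejected at startup.
     */
    bool numa_have_socket_boundary;
    /* ... */
};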
On 2/23/23 8:05 PM, Philippe Mathieu-Daudé wrote:
> On 23/2/23 09:13, Gavin Shan wrote:
>> For some architectures like ARM64, multiple CPUs in one socket can't be
>> associated with different NUMA nodes. Otherwise, the guest kernel is confused
>> about the CPU topology. For example, the following warning message is observed
>> from linux guest with the below command lines.
>>
>>   -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
>>   -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
>>   -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
>>   -numa node,nodeid=2,cpus=4-5,memdev=ram2                \
>>
>>   ------------[ cut here ]------------
>>   WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
>>   Modules linked in:
>>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
>>   pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>   pc : build_sched_domains+0x284/0x910
>>   lr : build_sched_domains+0x184/0x910
>>   sp : ffff80000804bd50
>>   x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
>>   x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
>>   x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
>>   x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
>>   x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
>>   x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
>>   x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
>>   x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
>>   x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
>>   x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
>>   Call trace:
>>    build_sched_domains+0x284/0x910
>>    sched_init_domains+0xac/0xe0
>>    sched_init_smp+0x48/0xc8
>>    kernel_init_freeable+0x140/0x1ac
>>    kernel_init+0x28/0x140
>>    ret_from_fork+0x10/0x20
>>
>> Improve the sitation to reject the configuration where multiple CPUs
>
> Typo "situation".
>

Yes, I will fix it up in the next revision.

>> in one socket have been associated with different NUMA nodes. The
>> newly introduced helper set_numa_socket_boundary() is expected to be
>> called by specific machines (boards) where the boundary is required.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/core/machine.c     | 34 ++++++++++++++++++++++++++++++++++
>>   hw/core/numa.c        |  7 +++++++
>>   include/sysemu/numa.h |  4 ++++
>>   3 files changed, 45 insertions(+)
>
>
>> diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
>> index 4173ef2afa..160008fff4 100644
>> --- a/include/sysemu/numa.h
>> +++ b/include/sysemu/numa.h
>> @@ -86,6 +86,9 @@ struct NumaState {
>>       /* Detect if HMAT support is enabled. */
>>       bool hmat_enabled;
>> +    /* CPUs in one socket can't break socket boundary */
>> +    bool have_socket_boundary;
>
> This field belongs to MachineClass; please add it as
> numa_have_socket_boundary, just after numa_mem_supported.
>
> Next patches become:
>
> ---
> diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
> index f778cb6d09..a48f1b2329 100644
> --- a/hw/arm/sbsa-ref.c
> +++ b/hw/arm/sbsa-ref.c
> @@ -864,6 +864,7 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
>      mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
>      mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
>      mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
> +    mc->numa_have_socket_boundary = true;
>  }
> ---
>
> Otherwise LGTM :)
>

Thanks, Philippe. It makes sense to have the field in MachineClass and
the changes will be included in the next revision.

Thanks,
Gavin
diff --git a/hw/core/machine.c b/hw/core/machine.c
index f73fc4c45c..875a3fe6c4 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1174,6 +1174,36 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
     return g_string_free(s, false);
 }
 
+static void numa_validate_socket_boundary(MachineState *ms)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    NumaState *state = ms->numa_state;
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+    const CPUArchId *cpus = possible_cpus->cpus;
+    int len = possible_cpus->len, i, j;
+
+    if (state->num_nodes <= 1 || len <= 1) {
+        return;
+    }
+
+    for (i = 0; i < len; i++) {
+        for (j = i + 1; j < len; j++) {
+            if (cpus[i].props.has_socket_id &&
+                cpus[i].props.has_node_id &&
+                cpus[j].props.has_socket_id &&
+                cpus[j].props.has_node_id &&
+                cpus[i].props.socket_id == cpus[j].props.socket_id &&
+                cpus[i].props.node_id != cpus[j].props.node_id) {
+                error_report("CPU-%d and CPU-%d in socket-%ld have been "
+                             "associated with node-%ld and node-%ld "
+                             "respectively", i, j, cpus[i].props.socket_id,
+                             cpus[i].props.node_id, cpus[j].props.node_id);
+                exit(1);
+            }
+        }
+    }
+}
+
 static void numa_validate_initiator(NumaState *numa_state)
 {
     int i;
@@ -1239,6 +1269,10 @@ static void machine_numa_finish_cpu_init(MachineState *machine)
         }
     }
 
+    if (machine->numa_state->have_socket_boundary) {
+        numa_validate_socket_boundary(machine);
+    }
+
     if (machine->numa_state->hmat_enabled) {
         numa_validate_initiator(machine->numa_state);
     }
diff --git a/hw/core/numa.c b/hw/core/numa.c
index d8d36b16d8..ebdd964ec8 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -460,6 +460,13 @@ void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
     ms->numa_state->hmat_cache[node->node_id][node->level] = hmat_cache;
 }
 
+void set_numa_socket_boundary(MachineState *ms)
+{
+    if (ms->numa_state) {
+        ms->numa_state->have_socket_boundary = true;
+    }
+}
+
 void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
 {
     if (!ms->numa_state) {
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 4173ef2afa..160008fff4 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -86,6 +86,9 @@ struct NumaState {
     /* Detect if HMAT support is enabled. */
     bool hmat_enabled;
 
+    /* CPUs in one socket can't break socket boundary */
+    bool have_socket_boundary;
+
     /* NUMA nodes information */
     NodeInfo nodes[MAX_NODES];
 
@@ -97,6 +100,7 @@ struct NumaState {
 };
 typedef struct NumaState NumaState;
 
+void set_numa_socket_boundary(MachineState *ms);
 void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp);
 void parse_numa_opts(MachineState *ms);
 void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
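To see the check in isolation, the following standalone program mirrors the pairwise comparison done by numa_validate_socket_boundary() above, using a simplified stand-in for QEMU's CPU instance properties (illustrative only, not QEMU code). Fed the failing topology from the commit message (2 sockets of 3 cores spread over 3 nodes), it reports the first conflicting pair and exits non-zero, much as the patch makes QEMU do at startup.

#include <stdio.h>
#include <stdlib.h>

typedef struct {
    long socket_id;  /* socket the CPU belongs to */
    long node_id;    /* NUMA node the CPU was assigned to */
} CpuProps;

/* Reject any pair of CPUs that share a socket but sit in different nodes. */
static void validate_socket_boundary(const CpuProps *cpus, int len)
{
    for (int i = 0; i < len; i++) {
        for (int j = i + 1; j < len; j++) {
            if (cpus[i].socket_id == cpus[j].socket_id &&
                cpus[i].node_id != cpus[j].node_id) {
                fprintf(stderr,
                        "CPU-%d and CPU-%d in socket-%ld have been associated "
                        "with node-%ld and node-%ld respectively\n",
                        i, j, cpus[i].socket_id,
                        cpus[i].node_id, cpus[j].node_id);
                exit(1);
            }
        }
    }
}

int main(void)
{
    /*
     * -smp 6,sockets=2,clusters=1,cores=3,threads=1 puts CPUs 0-2 in
     * socket 0 and CPUs 3-5 in socket 1; the -numa options then split
     * socket 0 across nodes 0 and 1, which is the rejected case.
     */
    const CpuProps cpus[] = {
        { 0, 0 }, { 0, 0 }, { 0, 1 },
        { 1, 1 }, { 1, 2 }, { 1, 2 },
    };

    validate_socket_boundary(cpus, 6);
    printf("configuration accepted\n");
    return 0;
}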
For some architectures like ARM64, multiple CPUs in one socket can't be
associated with different NUMA nodes. Otherwise, the guest kernel is confused
about the CPU topology. For example, the following warning message is observed
from linux guest with the below command lines.

  -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
  -numa node,nodeid=0,cpus=0-1,memdev=ram0                \
  -numa node,nodeid=1,cpus=2-3,memdev=ram1                \
  -numa node,nodeid=2,cpus=4-5,memdev=ram2                \

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2271 build_sched_domains+0x284/0x910
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-268.el9.aarch64 #1
  pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : build_sched_domains+0x284/0x910
  lr : build_sched_domains+0x184/0x910
  sp : ffff80000804bd50
  x29: ffff80000804bd50 x28: 0000000000000002 x27: 0000000000000000
  x26: ffff800009cf9a80 x25: 0000000000000000 x24: ffff800009cbf840
  x23: ffff000080325000 x22: ffff0000005df800 x21: ffff80000a4ce508
  x20: 0000000000000000 x19: ffff000080324440 x18: 0000000000000014
  x17: 00000000388925c0 x16: 000000005386a066 x15: 000000009c10cc2e
  x14: 00000000000001c0 x13: 0000000000000001 x12: ffff00007fffb1a0
  x11: ffff00007fffb180 x10: ffff80000a4ce508 x9 : 0000000000000041
  x8 : ffff80000a4ce500 x7 : ffff80000a4cf920 x6 : 0000000000000001
  x5 : 0000000000000001 x4 : 0000000000000007 x3 : 0000000000000002
  x2 : 0000000000001000 x1 : ffff80000a4cf928 x0 : 0000000000000001
  Call trace:
   build_sched_domains+0x284/0x910
   sched_init_domains+0xac/0xe0
   sched_init_smp+0x48/0xc8
   kernel_init_freeable+0x140/0x1ac
   kernel_init+0x28/0x140
   ret_from_fork+0x10/0x20

Improve the sitation to reject the configuration where multiple CPUs
in one socket have been associated with different NUMA nodes. The
newly introduced helper set_numa_socket_boundary() is expected to be
called by specific machines (boards) where the boundary is required.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/core/machine.c     | 34 ++++++++++++++++++++++++++++++++++
 hw/core/numa.c        |  7 +++++++
 include/sysemu/numa.h |  4 ++++
 3 files changed, 45 insertions(+)
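For comparison, a CPU-to-node assignment that respects the socket boundary, and which the new check should accept, keeps each socket's three CPUs inside a single node. One possible command line (illustrative only; the ram0/ram1 memory backends are assumed to be defined elsewhere on the command line):

  -smp 6,maxcpus=6,sockets=2,clusters=1,cores=3,threads=1 \
  -numa node,nodeid=0,cpus=0-2,memdev=ram0                \
  -numa node,nodeid=1,cpus=3-5,memdev=ram1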