diff mbox

numa: Add a check whether the node0 has memory or not

Message ID 1502705471-28407-1-git-send-email-douly.fnst@cn.fujitsu.com (mailing list archive)
State New, archived

Commit Message

Dou Liyang Aug. 14, 2017, 10:11 a.m. UTC
Currently, a machine whose first node has no memory makes
QEMU unhappy. With this example command line:
  ... \
  -m 1024M,slots=4,maxmem=32G \
  -numa node,nodeid=0 \
  -numa node,mem=1024M,nodeid=1 \
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \
the guest reports "No NUMA configuration found" and the NUMA topology is
wrong.

This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
default node for handling the memory hole (640K-1M). This means that
node0 must have some memory (>1M) to begin with.

Add a check in parse_numa_opts() to avoid this situation.

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
---
 numa.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Dou Liyang Aug. 14, 2017, 10:20 a.m. UTC | #1
Sorry, I forgot to Cc Michael S. Tsirkin.

At 08/14/2017 06:11 PM, Dou Liyang wrote:
> Currently, a machine whose first node has no memory makes
> QEMU unhappy. With this example command line:
>   ... \
>   -m 1024M,slots=4,maxmem=32G \
>   -numa node,nodeid=0 \
>   -numa node,mem=1024M,nodeid=1 \
>   -numa node,nodeid=2 \
>   -numa node,nodeid=3 \
> the guest reports "No NUMA configuration found" and the NUMA topology is
> wrong.
>
> This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
> default node for handling the memory hole (640K-1M). This means that
> node0 must have some memory (>1M) to begin with.
>
> Add a check in parse_numa_opts() to avoid this situation.
>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> ---
>  numa.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/numa.c b/numa.c
> index e32af04..1d6f73f 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -464,6 +464,10 @@ void parse_numa_opts(MachineState *ms)
>          if (i == nb_numa_nodes) {
>              assert(mc->numa_auto_assign_ram);
>              mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
> +        } else if (i != 0) {
> +            error_report("The first NUMA node must have some memory"
> +                          " for building ACPI SRAT");
> +            exit(1);
>          }
>
>          numa_total = 0;
>
Eduardo Habkost Aug. 14, 2017, 12:44 p.m. UTC | #2
On Mon, Aug 14, 2017 at 06:11:11PM +0800, Dou Liyang wrote:
> Currently, a machine whose first node has no memory makes
> QEMU unhappy. With this example command line:
>   ... \
>   -m 1024M,slots=4,maxmem=32G \
>   -numa node,nodeid=0 \
>   -numa node,mem=1024M,nodeid=1 \
>   -numa node,nodeid=2 \
>   -numa node,nodeid=3 \
> the guest reports "No NUMA configuration found" and the NUMA topology is
> wrong.
> 
> This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
> default node for handling the memory hole (640K-1M). This means that
> node0 must have some memory (>1M) to begin with.
> 
> Add a check in parse_numa_opts() to avoid this situation.
> 
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> ---
>  numa.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/numa.c b/numa.c
> index e32af04..1d6f73f 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -464,6 +464,10 @@ void parse_numa_opts(MachineState *ms)
>          if (i == nb_numa_nodes) {
>              assert(mc->numa_auto_assign_ram);
>              mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
> +        } else if (i != 0) {
> +            error_report("The first NUMA node must have some memory"
> +                          " for building ACPI SRAT");
> +            exit(1);

This doesn't belong to numa.c.  numa.c is generic code, and the
requirement you described is specific for PC.

Anyway, adding this check would make existing VM configurations
refuse to run after a QEMU upgrade.  I suggest fixing the bug in
the ACPI code instead.
Dou Liyang Aug. 15, 2017, 1:26 a.m. UTC | #3
Hi Eduardo,

At 08/14/2017 08:44 PM, Eduardo Habkost wrote:
> On Mon, Aug 14, 2017 at 06:11:11PM +0800, Dou Liyang wrote:
>> Currently, a machine whose first node has no memory makes
>> QEMU unhappy. With this example command line:
>>   ... \
>>   -m 1024M,slots=4,maxmem=32G \
>>   -numa node,nodeid=0 \
>>   -numa node,mem=1024M,nodeid=1 \
>>   -numa node,nodeid=2 \
>>   -numa node,nodeid=3 \
>> the guest reports "No NUMA configuration found" and the NUMA topology is
>> wrong.
>>
>> This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
>> default node for handling the memory hole (640K-1M). This means that
>> node0 must have some memory (>1M) to begin with.
>>
>> Add a check in parse_numa_opts() to avoid this situation.
>>
>> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
>> ---
>>  numa.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/numa.c b/numa.c
>> index e32af04..1d6f73f 100644
>> --- a/numa.c
>> +++ b/numa.c
>> @@ -464,6 +464,10 @@ void parse_numa_opts(MachineState *ms)
>>          if (i == nb_numa_nodes) {
>>              assert(mc->numa_auto_assign_ram);
>>              mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
>> +        } else if (i != 0) {
>> +            error_report("The first NUMA node must have some memory"
>> +                          " for building ACPI SRAT");
>> +            exit(1);
>
> This doesn't belong to numa.c.  numa.c is generic code, and the
> requirement you described is specific for PC.
>
> Anyway, adding this check would make existing VM configurations
> refuse to run after a QEMU upgrade.  I suggest fixing the bug in
> the ACPI code instead.
>

I see.

To fix the bug in the ACPI code, I see two possible solutions:

1). Add a check in build_srat(). If the first node has no memory, QEMU
will exit.

2). Fix the initialization of the memory affinity structures to cover this
situation, using the first node that has memory to deal with the memory
hole.

I prefer solution 2. What about you?

Thanks,
	dou.
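
The second option above can be sketched as follows. This is a minimal illustration, not the follow-up patch: the `MemAffinity` type and the `low_mem_owner` and `low_mem_entry` helpers are hypothetical names, and modelling low memory as a single 0-640K range is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_MEM_END (640 * 1024ULL) /* where the 640K-1M hole begins */

/* Hypothetical, simplified memory affinity entry: a range and its node. */
typedef struct {
    uint64_t base;
    uint64_t len;
    int node;
} MemAffinity;

/* First node that has memory; falls back to node 0 if no node has any. */
int low_mem_owner(const uint64_t *node_mem, int nb_nodes)
{
    assert(nb_nodes > 0);
    for (int i = 0; i < nb_nodes; i++) {
        if (node_mem[i] > 0) {
            return i;
        }
    }
    return 0;
}

/* Describe the range below the hole, owned by that node instead of node 0. */
MemAffinity low_mem_entry(const uint64_t *node_mem, int nb_nodes)
{
    MemAffinity entry = { 0, LOW_MEM_END, low_mem_owner(node_mem, nb_nodes) };
    return entry;
}
```

For the command line in the commit message, where only node 1 has memory, the low range would be attributed to node 1 rather than to the memoryless node 0.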
Igor Mammedov Aug. 15, 2017, 8:04 a.m. UTC | #4
On Tue, 15 Aug 2017 09:26:46 +0800
Dou Liyang <douly.fnst@cn.fujitsu.com> wrote:

> Hi Eduardo,
> 
> At 08/14/2017 08:44 PM, Eduardo Habkost wrote:
> > On Mon, Aug 14, 2017 at 06:11:11PM +0800, Dou Liyang wrote:  
> >> Currently, a machine whose first node has no memory makes
> >> QEMU unhappy. With this example command line:
> >>   ... \
> >>   -m 1024M,slots=4,maxmem=32G \
> >>   -numa node,nodeid=0 \
> >>   -numa node,mem=1024M,nodeid=1 \
> >>   -numa node,nodeid=2 \
> >>   -numa node,nodeid=3 \
> >> the guest reports "No NUMA configuration found" and the NUMA topology is
> >> wrong.
> >>
> >> This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
> >> default node for handling the memory hole (640K-1M). This means that
> >> node0 must have some memory (>1M) to begin with.
> >>
> >> Add a check in parse_numa_opts() to avoid this situation.
> >>
> >> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> >> ---
> >>  numa.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/numa.c b/numa.c
> >> index e32af04..1d6f73f 100644
> >> --- a/numa.c
> >> +++ b/numa.c
> >> @@ -464,6 +464,10 @@ void parse_numa_opts(MachineState *ms)
> >>          if (i == nb_numa_nodes) {
> >>              assert(mc->numa_auto_assign_ram);
> >>              mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
> >> +        } else if (i != 0) {
> >> +            error_report("The first NUMA node must have some memory"
> >> +                          " for building ACPI SRAT");
> >> +            exit(1);  
> >
> > This doesn't belong to numa.c.  numa.c is generic code, and the
> > requirement you described is specific for PC.
> >
> > Anyway, adding this check would make existing VM configurations
> > refuse to run after a QEMU upgrade.  I suggest fixing the bug in
> > the ACPI code instead.
> >  
> 
> I see.
> 
> To fix the bug in the ACPI code, I see two possible solutions:
> 
> 1). Add a check in build_srat(). If the first node has no memory, QEMU
> will exit.
> 
> 2). Fix the initialization of the memory affinity structures to cover this
> situation, using the first node that has memory to deal with the memory
> hole.
> 
> I prefer solution 2. What about you?
I'd go for the 2nd solution.

> 
> Thanks,
> 	dou.
> 
>
Dou Liyang Aug. 15, 2017, 9:04 a.m. UTC | #5
Hi Igor,

At 08/15/2017 04:04 PM, Igor Mammedov wrote:
> On Tue, 15 Aug 2017 09:26:46 +0800
> Dou Liyang <douly.fnst@cn.fujitsu.com> wrote:
>
>> Hi Eduardo,
>>
>> At 08/14/2017 08:44 PM, Eduardo Habkost wrote:
>>> On Mon, Aug 14, 2017 at 06:11:11PM +0800, Dou Liyang wrote:
>>>> Currently, a machine whose first node has no memory makes
>>>> QEMU unhappy. With this example command line:
>>>>   ... \
>>>>   -m 1024M,slots=4,maxmem=32G \
>>>>   -numa node,nodeid=0 \
>>>>   -numa node,mem=1024M,nodeid=1 \
>>>>   -numa node,nodeid=2 \
>>>>   -numa node,nodeid=3 \
>>>> the guest reports "No NUMA configuration found" and the NUMA topology is
>>>> wrong.
>>>>
>>>> This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
>>>> default node for handling the memory hole (640K-1M). This means that
>>>> node0 must have some memory (>1M) to begin with.
>>>>
>>>> Add a check in parse_numa_opts() to avoid this situation.
>>>>
>>>> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
>>>> ---
>>>>  numa.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/numa.c b/numa.c
>>>> index e32af04..1d6f73f 100644
>>>> --- a/numa.c
>>>> +++ b/numa.c
>>>> @@ -464,6 +464,10 @@ void parse_numa_opts(MachineState *ms)
>>>>          if (i == nb_numa_nodes) {
>>>>              assert(mc->numa_auto_assign_ram);
>>>>              mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
>>>> +        } else if (i != 0) {
>>>> +            error_report("The first NUMA node must have some memory"
>>>> +                          " for building ACPI SRAT");
>>>> +            exit(1);
>>>
>>> This doesn't belong to numa.c.  numa.c is generic code, and the
>>> requirement you described is specific for PC.
>>>
>>> Anyway, adding this check would make existing VM configurations
>>> refuse to run after a QEMU upgrade.  I suggest fixing the bug in
>>> the ACPI code instead.
>>>
>>
>> I see.
>>
>> To fix the bug in the ACPI code, I see two possible solutions:
>>
>> 1). Add a check in build_srat(). If the first node has no memory, QEMU
>> will exit.
>>
>> 2). Fix the initialization of the memory affinity structures to cover this
>> situation, using the first node that has memory to deal with the memory
>> hole.
>>
>> I prefer solution 2. What about you?
> I'd go for the 2nd solution.
>

Yeah, I see. I have just sent a patch for the 2nd solution and am waiting for your reply.

Thanks,
	dou.

>>
>> Thanks,
>> 	dou.
>>
>>
>
>
>
>

Patch

diff --git a/numa.c b/numa.c
index e32af04..1d6f73f 100644
--- a/numa.c
+++ b/numa.c
@@ -464,6 +464,10 @@  void parse_numa_opts(MachineState *ms)
         if (i == nb_numa_nodes) {
             assert(mc->numa_auto_assign_ram);
             mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size);
+        } else if (i != 0) {
+            error_report("The first NUMA node must have some memory"
+                          " for building ACPI SRAT");
+            exit(1);
         }
 
         numa_total = 0;
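
The condition added by the hunk can be modelled outside QEMU. The sketch below is a standalone illustration under stated assumptions: the `NodeInfo` type and the `first_node_with_mem` and `check_first_node` helpers are hypothetical names that do not come from numa.c, and `i` is assumed to be the index of the first node given explicit memory. It mirrors only the branch structure visible in the diff (`i == nb_numa_nodes` versus `i != 0`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's per-node state; only the size matters. */
typedef struct {
    uint64_t node_mem; /* bytes given with mem=..., 0 if omitted */
} NodeInfo;

/* Index of the first node with explicit memory, or nb_nodes if none. */
int first_node_with_mem(const NodeInfo *nodes, int nb_nodes)
{
    int i;

    assert(nb_nodes > 0);
    for (i = 0; i < nb_nodes; i++) {
        if (nodes[i].node_mem != 0) {
            break;
        }
    }
    return i;
}

/*
 * Mirrors the branches in the hunk: returns 0 when the configuration
 * passes (no node has explicit memory, or node 0 has some) and -1 where
 * the patch would report an error and exit.
 */
int check_first_node(const NodeInfo *nodes, int nb_nodes)
{
    int i = first_node_with_mem(nodes, nb_nodes);

    if (i == nb_nodes) {
        return 0;  /* auto-assign path: no node was given explicit memory */
    } else if (i != 0) {
        return -1; /* node 0 is memoryless but a later node is not */
    }
    return 0;
}
```

With the command line from the commit message (nodes 0, 2 and 3 without `mem=`, node 1 with 1024M), `check_first_node` returns -1, which corresponds to the case the patch rejects.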