memblock: make memblock_set_node() also warn about use of MAX_NUMNODES

Message ID 1c8a058c-5365-4f27-a9f1-3aeb7fb3e7b2@suse.com (mailing list archive)
State New

Commit Message

Jan Beulich May 29, 2024, 7:39 a.m. UTC
On an (old) x86 system whose SRAT covers only the space above 4GB:

    ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfffffffff] hotplug

the commit referenced below causes a CONFIG_NUMA=y kernel to no longer
refuse this NUMA configuration. Previously, the log showed

    NUMA: nodes only cover 6144MB of your 8185MB e820 RAM. Not used.
    No NUMA configuration found
    Faking a node at [mem 0x0000000000000000-0x000000027fffffff]

directly after the message quoted above. The configuration is now
accepted because memblock_validate_numa_coverage() checks for
NUMA_NO_NODE only. This in turn leads to memblock_alloc_range_nid()'s
warning about MAX_NUMNODES triggering, followed by a NULL deref in
memmap_init() when trying to access the node data of node 64
(MAX_NUMNODES with NODE_SHIFT=6).

To compensate for said change, make memblock_set_node() warn on and
adjust a passed-in value of MAX_NUMNODES, just like various other
functions already do.

Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
---
This still leaves MAX_NUMNODES uses in various other places.
Interestingly
https://lore.kernel.org/lkml/20170309034415.GA16588@WeideMacBook-Pro.local/T/#t
was a more complete patch which, for an unclear reason, appears never to
have made it anywhere. In other words, the two memblock_set_node()
invocations from x86's numa_init() likely also want adjusting, among others.
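
The sentinel mismatch described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct region` and `uncovered_bytes()` are hypothetical stand-ins, not the kernel's memblock types or the real memblock_validate_numa_coverage(), and the region sizes are illustrative. It shows that a check looking only for NUMA_NO_NODE counts nothing as uncovered when the unassigned region is tagged MAX_NUMNODES instead.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the report: NODE_SHIFT=6 gives MAX_NUMNODES = 64. */
#define NODE_SHIFT    6
#define MAX_NUMNODES  (1 << NODE_SHIFT)
#define NUMA_NO_NODE  (-1)

/* Hypothetical stand-in for a memory region; not the kernel's struct. */
struct region {
	unsigned long long base;
	unsigned long long size;
	int nid;
};

/*
 * Sum the bytes of all regions whose nid is NUMA_NO_NODE. Regions tagged
 * with any other value, including MAX_NUMNODES, count as "covered".
 */
static unsigned long long uncovered_bytes(const struct region *r, size_t n)
{
	unsigned long long total = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (r[i].nid == NUMA_NO_NODE)
			total += r[i].size;
	return total;
}
```

With a low region tagged MAX_NUMNODES and a high region on node 0, `uncovered_bytes()` returns 0, so a threshold comparison would accept the layout. Once the low region is tagged NUMA_NO_NODE, its size is counted and the comparison can reject the configuration again.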

Comments

Mike Rapoport May 30, 2024, 7:48 a.m. UTC | #1
On Wed, May 29, 2024 at 09:39:10AM +0200, Jan Beulich wrote:
> [...]
> This still leaves MAX_NUMNODES uses in various other places.
> Interestingly
> https://lore.kernel.org/lkml/20170309034415.GA16588@WeideMacBook-Pro.local/T/#t
> was a more complete patch which, for an unclear reason, looks to never
> have made it anywhere. IOW the two memblock_set_node() invocations from x86'es
> numa_init() likely also want adjusting, among others.

They do. And I think that actually would be the right fix.
The warning and nid adjustment in memblock can be added for robustness, but
the calls to memblock_set_node() in x86 should be fixed regardless.
 
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1339,6 +1339,10 @@ int __init_memblock memblock_set_node(ph
>  	int start_rgn, end_rgn;
>  	int i, ret;
>  
> +	if (WARN_ONCE(nid == MAX_NUMNODES,
> +		      "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
> +		nid = NUMA_NO_NODE;
> +
>  	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
>  	if (ret)
>  		return ret;
Jan Beulich May 30, 2024, 3:21 p.m. UTC | #2
On 30.05.2024 09:48, Mike Rapoport wrote:
> On Wed, May 29, 2024 at 09:39:10AM +0200, Jan Beulich wrote:
>> [...]
>> This still leaves MAX_NUMNODES uses in various other places.
>> Interestingly
>> https://lore.kernel.org/lkml/20170309034415.GA16588@WeideMacBook-Pro.local/T/#t
>> was a more complete patch which, for an unclear reason, looks to never
>> have made it anywhere. IOW the two memblock_set_node() invocations from x86'es
>> numa_init() likely also want adjusting, among others.
> 
> They do. And I think that actually would be the right fix.
> The warning and nid adjustment in memblock can be added for robustness, but
> the calls to memblock_set_node() in x86 should be fixed regardless.

And indeed I sent one already:
https://lkml.org/lkml/2024/5/29/354

For addressing the regression either is sufficient.

Jan
Mike Rapoport May 31, 2024, 9:40 a.m. UTC | #3
From: Mike Rapoport (IBM) <rppt@kernel.org>

On Wed, 29 May 2024 09:39:10 +0200, Jan Beulich wrote:
> On an (old) x86 system with SRAT just covering space above 4Gb:
> 
>     ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfffffffff] hotplug
> 
> the commit referenced below leads to this NUMA configuration no longer
> being refused by a CONFIG_NUMA=y kernel (previously
> 
> [...]

Applied to fixes branch of memblock.git tree, thanks!

[1/1] memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
      commit: e0eec24e2e199873f43df99ec39773ad3af2bff7

tree: https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
branch: fixes

--
Sincerely yours,
Mike.
Patch

--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1339,6 +1339,10 @@  int __init_memblock memblock_set_node(ph
 	int start_rgn, end_rgn;
 	int i, ret;
 
+	if (WARN_ONCE(nid == MAX_NUMNODES,
+		      "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
+		nid = NUMA_NO_NODE;
+
 	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
 	if (ret)
 		return ret;
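
The warn-and-adjust pattern in the hunk above can be approximated outside the kernel as follows. This is a minimal sketch, not the kernel implementation: `normalize_nid()` is a hypothetical helper, and a static flag plus `fprintf()` stands in for WARN_ONCE(), which in the kernel also dumps a stack trace.

```c
#include <stdbool.h>
#include <stdio.h>

#define NODE_SHIFT    6
#define MAX_NUMNODES  (1 << NODE_SHIFT)
#define NUMA_NO_NODE  (-1)

/*
 * Hypothetical helper: map the legacy MAX_NUMNODES sentinel to
 * NUMA_NO_NODE, printing the deprecation message only on the first hit.
 * All other node ids are returned unchanged.
 */
static int normalize_nid(int nid)
{
	static bool warned;

	if (nid == MAX_NUMNODES) {
		if (!warned) {
			warned = true;
			fprintf(stderr,
				"Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n");
		}
		return NUMA_NO_NODE;
	}
	return nid;
}
```

Calling `normalize_nid(MAX_NUMNODES)` twice yields NUMA_NO_NODE both times but prints the message once, which matches the intent of warning without flooding the log.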