diff mbox

[v3] mm: make expand_downwards symmetrical to expand_upwards

Message ID alpine.DEB.2.00.1104201410350.31768@chino.kir.corp.google.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

David Rientjes April 20, 2011, 9:18 p.m. UTC
On Wed, 20 Apr 2011, James Bottomley wrote:

> [    0.200000] Backtrace:
> [    0.200000]  [<000000004021c938>] add_partial+0x28/0x98
> [    0.200000]  [<000000004021faa0>] __slab_free+0x1d0/0x1d8
> [    0.200000]  [<000000004021fd04>] kmem_cache_free+0xc4/0x128
> [    0.200000]  [<000000004033bf9c>] ida_get_new_above+0x21c/0x2c0
> [    0.200000]  [<00000000402a8980>] sysfs_new_dirent+0xd0/0x238
> [    0.200000]  [<00000000402a974c>] create_dir+0x5c/0x168
> [    0.200000]  [<00000000402a9ab0>] sysfs_create_dir+0x98/0x128
> [    0.200000]  [<000000004033d6c4>] kobject_add_internal+0x114/0x258
> [    0.200000]  [<000000004033d9ac>] kobject_add_varg+0x7c/0xa0
> [    0.200000]  [<000000004033df20>] kobject_add+0x50/0x90
> [    0.200000]  [<000000004033dfb4>] kobject_create_and_add+0x54/0xc8
> [    0.200000]  [<00000000407862a0>] cgroup_init+0x138/0x1f0
> [    0.200000]  [<000000004077ce50>] start_kernel+0x5a0/0x840
> [    0.200000]  [<000000004011fa3c>] start_parisc+0xa4/0xb8
> [    0.200000]  [<00000000404bb034>] packet_ioctl+0x16c/0x208
> [    0.200000]  [<000000004049ac30>] ip_mroute_setsockopt+0x260/0xf20
> [    0.200000] 

This is probably because the parisc's DISCONTIGMEM memory ranges don't 
have bits set in N_NORMAL_MEMORY.

--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

James Bottomley April 20, 2011, 10:15 p.m. UTC | #1
On Wed, 2011-04-20 at 14:18 -0700, David Rientjes wrote:
> This is probably because the parisc's DISCONTIGMEM memory ranges don't 
> have bits set in N_NORMAL_MEMORY.
> 
> diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
> --- a/arch/parisc/mm/init.c
> +++ b/arch/parisc/mm/init.c
> @@ -266,8 +266,10 @@ static void __init setup_bootmem(void)
>  	}
>  	memset(pfnnid_map, 0xff, sizeof(pfnnid_map));
>  
> -	for (i = 0; i < npmem_ranges; i++)
> +	for (i = 0; i < npmem_ranges; i++) {
> +		node_set_state(i, N_NORMAL_MEMORY);
>  		node_set_online(i);
> +	}
>  #endif

Yes, this seems to be the missing piece that gets it to boot.  We really
need this in generic code, unless someone wants to run through all the
other arch's doing it ...

James


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Motohiro KOSAKI April 21, 2011, 1:03 p.m. UTC | #2
> diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
> --- a/arch/parisc/mm/init.c
> +++ b/arch/parisc/mm/init.c
> @@ -266,8 +266,10 @@ static void __init setup_bootmem(void)
>  	}
>  	memset(pfnnid_map, 0xff, sizeof(pfnnid_map));
>  
> -	for (i = 0; i < npmem_ranges; i++)
> +	for (i = 0; i < npmem_ranges; i++) {
> +		node_set_state(i, N_NORMAL_MEMORY);
>  		node_set_online(i);
> +	}
>  #endif


I'm surprised this one. If arch code doesn't initialized N_NORMAL_MEMORY,
(or N_HIGH_MEMORY. N_HIGH_MEMORY == N_NORMAL_MEMORY if CONFIG_HIGHMEM=n)
kswapd is created only at node0. wow.

The initialization must be necessary even if !NUMA, I think. ;-)
Probably we should have revisit all arch code when commit 9422ffba4a 
(Memoryless nodes: No need for kswapd) was introduced, at least.

Thank you David. and I'm sad this multi level unforunate mismatch....


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Rientjes April 21, 2011, 7:38 p.m. UTC | #3
On Thu, 21 Apr 2011, KOSAKI Motohiro wrote:

> > diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
> > --- a/arch/parisc/mm/init.c
> > +++ b/arch/parisc/mm/init.c
> > @@ -266,8 +266,10 @@ static void __init setup_bootmem(void)
> >  	}
> >  	memset(pfnnid_map, 0xff, sizeof(pfnnid_map));
> >  
> > -	for (i = 0; i < npmem_ranges; i++)
> > +	for (i = 0; i < npmem_ranges; i++) {
> > +		node_set_state(i, N_NORMAL_MEMORY);
> >  		node_set_online(i);
> > +	}
> >  #endif
> 
> 
> I'm surprised this one. If arch code doesn't initialized N_NORMAL_MEMORY,
> (or N_HIGH_MEMORY. N_HIGH_MEMORY == N_NORMAL_MEMORY if CONFIG_HIGHMEM=n)
> kswapd is created only at node0. wow.
> 
> The initialization must be necessary even if !NUMA, I think. ;-)
> Probably we should have revisit all arch code when commit 9422ffba4a 
> (Memoryless nodes: No need for kswapd) was introduced, at least.
> 

I think we may want to just convert slub (and the memory controller) to 
use N_HIGH_MEMORY rather than N_NORMAL_MEMORY since nothing else uses it 
and the generic code seems to handle N_HIGH_MEMORY for all configs 
appropriately.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Lameter (Ampere) April 21, 2011, 8:02 p.m. UTC | #4
On Thu, 21 Apr 2011, David Rientjes wrote:

> I think we may want to just convert slub (and the memory controller) to
> use N_HIGH_MEMORY rather than N_NORMAL_MEMORY since nothing else uses it
> and the generic code seems to handle N_HIGH_MEMORY for all configs
> appropriately.

In 32 bit configurations some architectures (like x86) provide nodes
that have only high memory. Slab allocators only handle normal memory.
SLAB operates in a kind of degraded mode in that case by falling back for
each allocation to the nodes that have normal memory.

--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Rientjes April 21, 2011, 9:19 p.m. UTC | #5
On Thu, 21 Apr 2011, Christoph Lameter wrote:

> In 32 bit configurations some architectures (like x86) provide nodes
> that have only high memory. Slab allocators only handle normal memory.
> SLAB operates in a kind of degraded mode in that case by falling back for
> each allocation to the nodes that have normal memory.
> 

Let's do this:

 - parisc: James has already queued "parisc: set memory ranges in 
   N_NORMAL_MEMORY when onlined" for 2.6.39, so all he needs now is 
   to merge a hybrid of the Kconfig changes requiring CONFIG_NUMA for 
   CONFIG_DISCONTIGMEM from KOSAKI-san and myself which also fix the 
   compile issues,

 - generic code: we pull check_for_regular_memory() out from under
   CONFIG_HIGHMEM so that N_NORMAL_MEMORY gets set appropriately for 
   all callers of free_area_init_nodes() from paging_init(); this fixes 
   ia64 and mips,

 - alpha, m32r, m68k: push the changes to those individual architectures 
   that I proposed earlier that set N_NORMAL_MEMORY for DISCONTINGMEM
   when memory regions have memory; KOSAKI-san says a couple of these
   architectures may be orphaned so hopefully Andrew can pick them up
   in -mm.

I'll reply to this email with the parisc Kconfig changes for James, the 
generic change to check_for_regular_memory() for Andrew, and the 
arch-specific changes to the appropriate maintainers and email lists (but 
may need to go through -mm if they aren't picked up).
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley April 21, 2011, 9:24 p.m. UTC | #6
On Thu, 2011-04-21 at 14:19 -0700, David Rientjes wrote:
> On Thu, 21 Apr 2011, Christoph Lameter wrote:
> 
> > In 32 bit configurations some architectures (like x86) provide nodes
> > that have only high memory. Slab allocators only handle normal memory.
> > SLAB operates in a kind of degraded mode in that case by falling back for
> > each allocation to the nodes that have normal memory.
> > 
> 
> Let's do this:
> 
>  - parisc: James has already queued "parisc: set memory ranges in 
>    N_NORMAL_MEMORY when onlined" for 2.6.39, so all he needs now is 
>    to merge a hybrid of the Kconfig changes requiring CONFIG_NUMA for 
>    CONFIG_DISCONTIGMEM from KOSAKI-san and myself which also fix the 
>    compile issues,

Not quite: if we go this route, we need to sort out our CPU scheduling
problem as well ... as I said, I don't think we've got all the necessary
numa machinery in place yet.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Rientjes April 21, 2011, 9:34 p.m. UTC | #7
On Thu, 21 Apr 2011, James Bottomley wrote:

> >  - parisc: James has already queued "parisc: set memory ranges in 
> >    N_NORMAL_MEMORY when onlined" for 2.6.39, so all he needs now is 
> >    to merge a hybrid of the Kconfig changes requiring CONFIG_NUMA for 
> >    CONFIG_DISCONTIGMEM from KOSAKI-san and myself which also fix the 
> >    compile issues,
> 
> Not quite: if we go this route, we need to sort out our CPU scheduling
> problem as well ... as I said, I don't think we've got all the necessary
> numa machinery in place yet.
> 

Ok, it seems like there're two options for this release cycle:

 (1) merge the patch that enables CONFIG_NUMA for DISCONTIGMEM but only 
     do so if CONFIG_SLUB is enabled to avoid the build error, or

 (2) disallow CONFIG_SLUB for parisc with DISCONTIGMEM.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley April 21, 2011, 9:49 p.m. UTC | #8
On Thu, 2011-04-21 at 14:34 -0700, David Rientjes wrote:
> On Thu, 21 Apr 2011, James Bottomley wrote:
> 
> > >  - parisc: James has already queued "parisc: set memory ranges in 
> > >    N_NORMAL_MEMORY when onlined" for 2.6.39, so all he needs now is 
> > >    to merge a hybrid of the Kconfig changes requiring CONFIG_NUMA for 
> > >    CONFIG_DISCONTIGMEM from KOSAKI-san and myself which also fix the 
> > >    compile issues,
> > 
> > Not quite: if we go this route, we need to sort out our CPU scheduling
> > problem as well ... as I said, I don't think we've got all the necessary
> > numa machinery in place yet.
> > 
> 
> Ok, it seems like there're two options for this release cycle:
> 
>  (1) merge the patch that enables CONFIG_NUMA for DISCONTIGMEM but only 
>      do so if CONFIG_SLUB is enabled to avoid the build error, or

That's not an option without coming up with the rest of the numa
fixes ... we can't basically force all SMP systems to become UP.

What build error, by the way?  There's only a runtime panic caused by
slub.

>  (2) disallow CONFIG_SLUB for parisc with DISCONTIGMEM.

Well, that's this patch ... it will actually fix every architecture, not
just parisc.


> diff --git a/init/Kconfig b/init/Kconfig
> index 56240e7..a7ad8fb 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1226,6 +1226,7 @@ config SLAB
>           per cpu and per node queues.
>  
>  config SLUB
> +       depends on BROKEN || NUMA || !DISCONTIGMEM
>         bool "SLUB (Unqueued Allocator)"
>         help
>            SLUB is a slab allocator that minimizes cache line usage


I already sent it to linux-arch and there's been no dissent; there have
been a few "will that fix my slub bug?" type of responses.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Rientjes April 21, 2011, 10:12 p.m. UTC | #9
On Thu, 21 Apr 2011, James Bottomley wrote:

> > Ok, it seems like there're two options for this release cycle:
> > 
> >  (1) merge the patch that enables CONFIG_NUMA for DISCONTIGMEM but only 
> >      do so if CONFIG_SLUB is enabled to avoid the build error, or
> 
> That's not an option without coming up with the rest of the numa
> fixes ... we can't basically force all SMP systems to become UP.
> 
> What build error, by the way?  There's only a runtime panic caused by
> slub.
> 

If you enable CONFIG_NUMA for ARCH_DISCONTIGMEM_ENABLE on parisc, it 
results in the same build error that you identified in

	http://marc.info/?l=linux-parisc&m=130326773918005

at least on my hppa64 compiler.

> >  (2) disallow CONFIG_SLUB for parisc with DISCONTIGMEM.
> 
> Well, that's this patch ... it will actually fix every architecture, not
> just parisc.
> 
> 
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 56240e7..a7ad8fb 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1226,6 +1226,7 @@ config SLAB
> >           per cpu and per node queues.
> >  
> >  config SLUB
> > +       depends on BROKEN || NUMA || !DISCONTIGMEM
> >         bool "SLUB (Unqueued Allocator)"
> >         help
> >            SLUB is a slab allocator that minimizes cache line usage
> 
> 
> I already sent it to linux-arch and there's been no dissent; there have
> been a few "will that fix my slub bug?" type of responses.
> 

I was concerned about tile because it actually got all this right by using 
N_NORMAL_MEMORY appropriately and it uses slub by default, but it always 
enables NUMA at the moment so this won't impact it.

Acked-by: David Rientjes <rientjes@google.com>

I agree we can now defer "parisc: enable CONFIG_NUMA for DISCONTIGMEM and 
fix build errors" until parisc moves away from DISCONTIGMEM, its extracted 
away from CONFIG_NUMA, or the scheduler issues are debugged.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pekka Enberg April 22, 2011, 8:02 a.m. UTC | #10
On Fri, Apr 22, 2011 at 1:12 AM, David Rientjes <rientjes@google.com> wrote:
>> > diff --git a/init/Kconfig b/init/Kconfig
>> > index 56240e7..a7ad8fb 100644
>> > --- a/init/Kconfig
>> > +++ b/init/Kconfig
>> > @@ -1226,6 +1226,7 @@ config SLAB
>> >           per cpu and per node queues.
>> >
>> >  config SLUB
>> > +       depends on BROKEN || NUMA || !DISCONTIGMEM
>> >         bool "SLUB (Unqueued Allocator)"
>> >         help
>> >            SLUB is a slab allocator that minimizes cache line usage
>>
>>
>> I already sent it to linux-arch and there's been no dissent; there have
>> been a few "will that fix my slub bug?" type of responses.
>
> I was concerned about tile because it actually got all this right by using
> N_NORMAL_MEMORY appropriately and it uses slub by default, but it always
> enables NUMA at the moment so this won't impact it.
>
> Acked-by: David Rientjes <rientjes@google.com>

I'm OK with this Kconfig patch. Can someone send a proper patch with
signoffs and such? Do we want to tag this for -stable too?

                         Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley April 22, 2011, 1:49 p.m. UTC | #11
On Fri, 2011-04-22 at 11:02 +0300, Pekka Enberg wrote:
> On Fri, Apr 22, 2011 at 1:12 AM, David Rientjes <rientjes@google.com> wrote:
> >> > diff --git a/init/Kconfig b/init/Kconfig
> >> > index 56240e7..a7ad8fb 100644
> >> > --- a/init/Kconfig
> >> > +++ b/init/Kconfig
> >> > @@ -1226,6 +1226,7 @@ config SLAB
> >> >           per cpu and per node queues.
> >> >
> >> >  config SLUB
> >> > +       depends on BROKEN || NUMA || !DISCONTIGMEM
> >> >         bool "SLUB (Unqueued Allocator)"
> >> >         help
> >> >            SLUB is a slab allocator that minimizes cache line usage
> >>
> >>
> >> I already sent it to linux-arch and there's been no dissent; there have
> >> been a few "will that fix my slub bug?" type of responses.
> >
> > I was concerned about tile because it actually got all this right by using
> > N_NORMAL_MEMORY appropriately and it uses slub by default, but it always
> > enables NUMA at the moment so this won't impact it.
> >
> > Acked-by: David Rientjes <rientjes@google.com>
> 
> I'm OK with this Kconfig patch. Can someone send a proper patch with
> signoffs and such? Do we want to tag this for -stable too?

I already did here:

http://marc.info/?l=linux-arch&m=130324857801976

I got the wrong linux-mm email address, though (I thought you were on
vger).

I've got a parisc specific patch already for this (also for stable), so
I can just queue this alongside if everyone's OK with that?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pekka Enberg April 22, 2011, 5 p.m. UTC | #12
Hi James,

On Fri, 2011-04-22 at 08:49 -0500, James Bottomley wrote:
> > I'm OK with this Kconfig patch. Can someone send a proper patch with
> > signoffs and such? Do we want to tag this for -stable too?
> 
> I already did here:
> 
> http://marc.info/?l=linux-arch&m=130324857801976
> 
> I got the wrong linux-mm email address, though (I thought you were on
> vger).

Grr, it's a SLUB patch and you didn't CC any of the maintainers! If that
was an attempt to sneak it past me, that's not cool. And if you left it
out by mistake, that's not cool either!

> I've got a parisc specific patch already for this (also for stable), so
> I can just queue this alongside if everyone's OK with that?

Feel free, I'm not subscribed to linux-arch so I don't have the patch in
my inbox at all:

Acked-by: Pekka Enberg <penberg@kernel.org>

			Pekka

--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley April 22, 2011, 5:03 p.m. UTC | #13
On Fri, 2011-04-22 at 20:00 +0300, Pekka Enberg wrote:
> Hi James,
> 
> On Fri, 2011-04-22 at 08:49 -0500, James Bottomley wrote:
> > > I'm OK with this Kconfig patch. Can someone send a proper patch with
> > > signoffs and such? Do we want to tag this for -stable too?
> > 
> > I already did here:
> > 
> > http://marc.info/?l=linux-arch&m=130324857801976
> > 
> > I got the wrong linux-mm email address, though (I thought you were on
> > vger).
> 
> Grr, it's a SLUB patch and you didn't CC any of the maintainers! If that
> was an attempt to sneak it past me, that's not cool. And if you left it
> out by mistake, that's not cool either!

No, christoph was directly on the cc ... although I didn't realise there
were more slub maintainers.

> > I've got a parisc specific patch already for this (also for stable), so
> > I can just queue this alongside if everyone's OK with that?
> 
> Feel free, I'm not subscribed to linux-arch so I don't have the patch in
> my inbox at all:
> 
> Acked-by: Pekka Enberg <penberg@kernel.org>

Will do, thanks.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -266,8 +266,10 @@  static void __init setup_bootmem(void)
 	}
 	memset(pfnnid_map, 0xff, sizeof(pfnnid_map));
 
-	for (i = 0; i < npmem_ranges; i++)
+	for (i = 0; i < npmem_ranges; i++) {
+		node_set_state(i, N_NORMAL_MEMORY);
 		node_set_online(i);
+	}
 #endif
 
 	/*