Message ID | 20180910234341.4068.26882.stgit@localhost.localdomain (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Address issues slowing persistent memory initialization | expand |
On Mon, Sep 10, 2018 at 4:43 PM Alexander Duyck <alexander.duyck@gmail.com> wrote: > > From: Alexander Duyck <alexander.h.duyck@intel.com> > > On systems with a large amount of memory it can take a significant amount > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > value. I have seen it take over 2 minutes to initialize a system with > over 12GB of RAM. Minor typo. I meant 12TB here, not 12GB. - Alex
On Mon, Sep 10, 2018 at 4:43 PM, Alexander Duyck <alexander.duyck@gmail.com> wrote: > From: Alexander Duyck <alexander.h.duyck@intel.com> > > On systems with a large amount of memory it can take a significant amount > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > value. I have seen it take over 2 minutes to initialize a system with > over 12GB of RAM. > > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then > the boot time returned to something much more reasonable as the > arch_add_memory call completed in milliseconds versus seconds. However in > doing that I had to disable all of the other VM debugging on the system. > > In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on > a system that has a large amount of memory I have added a new kernel > parameter named "page_init_poison" that can be set to "off" in order to > disable it. In anticipation of potentially more DEBUG_VM options wanting runtime control I'd propose creating a new "vm_debug=" option for this modeled after "slub_debug=" along with a CONFIG_DEBUG_VM_ON to turn on all options. That way there is more differentiation for debug cases like this that have significant performance impact when enabled. CONFIG_DEBUG_VM leaves optional debug capabilities disabled by default unless CONFIG_DEBUG_VM_ON is also set.
On Tue, Sep 11, 2018 at 9:50 AM Dan Williams <dan.j.williams@intel.com> wrote: > > On Mon, Sep 10, 2018 at 4:43 PM, Alexander Duyck > <alexander.duyck@gmail.com> wrote: > > From: Alexander Duyck <alexander.h.duyck@intel.com> > > > > On systems with a large amount of memory it can take a significant amount > > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > > value. I have seen it take over 2 minutes to initialize a system with > > over 12GB of RAM. > > > > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then > > the boot time returned to something much more reasonable as the > > arch_add_memory call completed in milliseconds versus seconds. However in > > doing that I had to disable all of the other VM debugging on the system. > > > > In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on > > a system that has a large amount of memory I have added a new kernel > > parameter named "page_init_poison" that can be set to "off" in order to > > disable it. > > In anticipation of potentially more DEBUG_VM options wanting runtime > control I'd propose creating a new "vm_debug=" option for this modeled > after "slub_debug=" along with a CONFIG_DEBUG_VM_ON to turn on all > options. > > That way there is more differentiation for debug cases like this that > have significant performance impact when enabled. > > CONFIG_DEBUG_VM leaves optional debug capabilities disabled by default > unless CONFIG_DEBUG_VM_ON is also set. Based on earlier discussions I would assume that CONFIG_DEBUG_VM would imply CONFIG_DEBUG_VM_ON anyway since we don't want most of these disabled by default. In my mind we should be looking at a selective "vm_debug_disable=" instead of something that would be turning on features. - Alex
On Tue, Sep 11, 2018 at 1:01 PM, Alexander Duyck <alexander.duyck@gmail.com> wrote: > On Tue, Sep 11, 2018 at 9:50 AM Dan Williams <dan.j.williams@intel.com> wrote: >> >> On Mon, Sep 10, 2018 at 4:43 PM, Alexander Duyck >> <alexander.duyck@gmail.com> wrote: >> > From: Alexander Duyck <alexander.h.duyck@intel.com> >> > >> > On systems with a large amount of memory it can take a significant amount >> > of time to initialize all of the page structs with the PAGE_POISON_PATTERN >> > value. I have seen it take over 2 minutes to initialize a system with >> > over 12GB of RAM. >> > >> > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then >> > the boot time returned to something much more reasonable as the >> > arch_add_memory call completed in milliseconds versus seconds. However in >> > doing that I had to disable all of the other VM debugging on the system. >> > >> > In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on >> > a system that has a large amount of memory I have added a new kernel >> > parameter named "page_init_poison" that can be set to "off" in order to >> > disable it. >> >> In anticipation of potentially more DEBUG_VM options wanting runtime >> control I'd propose creating a new "vm_debug=" option for this modeled >> after "slub_debug=" along with a CONFIG_DEBUG_VM_ON to turn on all >> options. >> >> That way there is more differentiation for debug cases like this that >> have significant performance impact when enabled. >> >> CONFIG_DEBUG_VM leaves optional debug capabilities disabled by default >> unless CONFIG_DEBUG_VM_ON is also set. > > Based on earlier discussions I would assume that CONFIG_DEBUG_VM would > imply CONFIG_DEBUG_VM_ON anyway since we don't want most of these > disabled by default. > > In my mind we should be looking at a selective "vm_debug_disable=" > instead of something that would be turning on features. Sorry, I missed those earlier discussions, so I won't push too hard if this has been hashed before. My proposal for opt-in is the fact that at least one known distribution kernel, Fedora, is shipping with CONFIG_DEBUG_VM=y. They also ship with CONFIG_SLUB, but not SLUB_DEBUG_ON. If we are going to picemeal enable some debug options to be runtime controlled I think we should go further to start clarifying the cheap vs the expensive checks and making the expensive checks opt-in in the same spirit of SLUB_DEBUG.
On 9/10/18 7:43 PM, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@intel.com> > > On systems with a large amount of memory it can take a significant amount > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > value. I have seen it take over 2 minutes to initialize a system with > over 12GB of RAM. > > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then > the boot time returned to something much more reasonable as the > arch_add_memory call completed in milliseconds versus seconds. However in > doing that I had to disable all of the other VM debugging on the system. > > In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on > a system that has a large amount of memory I have added a new kernel > parameter named "page_init_poison" that can be set to "off" in order to > disable it. > > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Thank you, Pavel > --- > Documentation/admin-guide/kernel-parameters.txt | 8 ++++++++ > include/linux/page-flags.h | 8 ++++++++ > mm/debug.c | 16 ++++++++++++++++ > mm/memblock.c | 5 ++--- > mm/sparse.c | 4 +--- > 5 files changed, 35 insertions(+), 6 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index 64a3bf54b974..7b21e0b9c394 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -3047,6 +3047,14 @@ > off: turn off poisoning (default) > on: turn on poisoning > > + page_init_poison= [KNL] Boot-time parameter changing the > + state of poisoning of page structures during early > + boot. Used to verify page metadata is not accessed > + prior to initialization. Available with > + CONFIG_DEBUG_VM=y. > + off: turn off poisoning > + on: turn on poisoning (default) > + > panic= [KNL] Kernel behaviour on panic: delay <timeout> > timeout > 0: seconds before rebooting > timeout = 0: wait forever > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h > index 74bee8cecf4c..d00216cf00f8 100644 > --- a/include/linux/page-flags.h > +++ b/include/linux/page-flags.h > @@ -162,6 +162,14 @@ static inline int PagePoisoned(const struct page *page) > return page->flags == PAGE_POISON_PATTERN; > } > > +#ifdef CONFIG_DEBUG_VM > +void page_init_poison(struct page *page, size_t size); > +#else > +static inline void page_init_poison(struct page *page, size_t size) > +{ > +} > +#endif > + > /* > * Page flags policies wrt compound pages > * > diff --git a/mm/debug.c b/mm/debug.c > index 38c926520c97..c5420422c0b5 100644 > --- a/mm/debug.c > +++ b/mm/debug.c > @@ -175,4 +175,20 @@ void dump_mm(const struct mm_struct *mm) > ); > } > > +static bool page_init_poisoning __read_mostly = true; > + > +static int __init page_init_poison_param(char *buf) > +{ > + if (!buf) > + return -EINVAL; > + return strtobool(buf, &page_init_poisoning); > +} > +early_param("page_init_poison", page_init_poison_param); > + > +void page_init_poison(struct page *page, size_t size) > +{ > + if (page_init_poisoning) > + memset(page, PAGE_POISON_PATTERN, size); > +} > +EXPORT_SYMBOL_GPL(page_init_poison); > #endif /* CONFIG_DEBUG_VM */ > diff --git a/mm/memblock.c b/mm/memblock.c > index 237944479d25..a85315083b5a 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -1444,10 +1444,9 @@ void * __init memblock_virt_alloc_try_nid_raw( > > ptr = memblock_virt_alloc_internal(size, align, > min_addr, max_addr, nid); > -#ifdef CONFIG_DEBUG_VM > if (ptr && size > 0) > - memset(ptr, PAGE_POISON_PATTERN, size); > -#endif > + page_init_poison(ptr, size); > + > return ptr; > } > > diff --git a/mm/sparse.c b/mm/sparse.c > index 10b07eea9a6e..67ad061f7fb8 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -696,13 +696,11 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat, > goto out; > } > > -#ifdef CONFIG_DEBUG_VM > /* > * Poison uninitialized struct pages in order to catch invalid flags > * combinations. > */ > - memset(memmap, PAGE_POISON_PATTERN, sizeof(struct page) * PAGES_PER_SECTION); > -#endif > + page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION); > > section_mark_present(ms); > sparse_init_one_section(ms, section_nr, memmap, usemap); >
On Mon 10-09-18 16:43:41, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@intel.com> > > On systems with a large amount of memory it can take a significant amount > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > value. I have seen it take over 2 minutes to initialize a system with > over 12GB of RAM. > > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then > the boot time returned to something much more reasonable as the > arch_add_memory call completed in milliseconds versus seconds. However in > doing that I had to disable all of the other VM debugging on the system. > > In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on > a system that has a large amount of memory I have added a new kernel > parameter named "page_init_poison" that can be set to "off" in order to > disable it. I am still not convinced that this all is worth the additional code. It is much better than a new config option for sure. If we really want this though then I suggest that the parameter handler should note the disabled state (when CONFIG_DEBUG_VM is on) to the kernel log. I would also make it explicit who might want to do that in the parameter description. > + page_init_poison= [KNL] Boot-time parameter changing the > + state of poisoning of page structures during early > + boot. Used to verify page metadata is not accessed > + prior to initialization. Available with > + CONFIG_DEBUG_VM=y. > + off: turn off poisoning > + on: turn on poisoning (default) > + what about the following wording or something along those lines Boot-time parameter to control struct page poisoning which is a debugging feature to catch unitialized struct page access. This option is available only for CONFIG_DEBUG_VM=y and it affects boot time (especially on large systems). If there are no poisoning bugs reported on the particular system and workload it should be safe to disable it to speed up the boot time.
On Wed, Sep 12, 2018 at 7:10 AM Michal Hocko <mhocko@kernel.org> wrote: > > On Mon 10-09-18 16:43:41, Alexander Duyck wrote: > > From: Alexander Duyck <alexander.h.duyck@intel.com> > > > > On systems with a large amount of memory it can take a significant amount > > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > > value. I have seen it take over 2 minutes to initialize a system with > > over 12GB of RAM. > > > > In order to work around the issue I had to disable CONFIG_DEBUG_VM and then > > the boot time returned to something much more reasonable as the > > arch_add_memory call completed in milliseconds versus seconds. However in > > doing that I had to disable all of the other VM debugging on the system. > > > > In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on > > a system that has a large amount of memory I have added a new kernel > > parameter named "page_init_poison" that can be set to "off" in order to > > disable it. > > I am still not convinced that this all is worth the additional code. It > is much better than a new config option for sure. If we really want this > though then I suggest that the parameter handler should note the > disabled state (when CONFIG_DEBUG_VM is on) to the kernel log. I would > also make it explicit who might want to do that in the parameter > description. Anything specific in terms of the kernel log message we are looking for? I'll probably just go with "Page struct poisoning disabled by kernel command line option 'page_init_poison'" or something along those lines. > > + page_init_poison= [KNL] Boot-time parameter changing the > > + state of poisoning of page structures during early > > + boot. Used to verify page metadata is not accessed > > + prior to initialization. Available with > > + CONFIG_DEBUG_VM=y. > > + off: turn off poisoning > > + on: turn on poisoning (default) > > + > > what about the following wording or something along those lines > > Boot-time parameter to control struct page poisoning which is a > debugging feature to catch unitialized struct page access. This option > is available only for CONFIG_DEBUG_VM=y and it affects boot time > (especially on large systems). If there are no poisoning bugs reported > on the particular system and workload it should be safe to disable it to > speed up the boot time. That works for me. I will update it for the next release. Thanks. - Alex
On 09/12/2018 07:49 AM, Alexander Duyck wrote: >>> + page_init_poison= [KNL] Boot-time parameter changing the >>> + state of poisoning of page structures during early >>> + boot. Used to verify page metadata is not accessed >>> + prior to initialization. Available with >>> + CONFIG_DEBUG_VM=y. >>> + off: turn off poisoning >>> + on: turn on poisoning (default) >>> + >> what about the following wording or something along those lines >> >> Boot-time parameter to control struct page poisoning which is a >> debugging feature to catch unitialized struct page access. This option >> is available only for CONFIG_DEBUG_VM=y and it affects boot time >> (especially on large systems). If there are no poisoning bugs reported >> on the particular system and workload it should be safe to disable it to >> speed up the boot time. > That works for me. I will update it for the next release. FWIW, I rather liked Dan's idea of wrapping this under vm_debug=<something>. We've got a zoo of boot options and it's really hard to _remember_ what does what. For this case, we're creating one that's only available under a specific debug option and I think it makes total sense to name the boot option accordingly. For now, I think it makes total sense to do vm_debug=all/off. If, in the future, we get more options, we can do things like slab does and do vm_debug=P (for Page poison) for this feature specifically. vm_debug = [KNL] Available with CONFIG_DEBUG_VM=y. May slow down boot speed, especially on larger- memory systems when enabled. off: turn off all runtime VM debug features all: turn on all debug features (default)
On Wed, Sep 12, 2018 at 8:25 AM Dave Hansen <dave.hansen@intel.com> wrote: > > On 09/12/2018 07:49 AM, Alexander Duyck wrote: > >>> + page_init_poison= [KNL] Boot-time parameter changing the > >>> + state of poisoning of page structures during early > >>> + boot. Used to verify page metadata is not accessed > >>> + prior to initialization. Available with > >>> + CONFIG_DEBUG_VM=y. > >>> + off: turn off poisoning > >>> + on: turn on poisoning (default) > >>> + > >> what about the following wording or something along those lines > >> > >> Boot-time parameter to control struct page poisoning which is a > >> debugging feature to catch unitialized struct page access. This option > >> is available only for CONFIG_DEBUG_VM=y and it affects boot time > >> (especially on large systems). If there are no poisoning bugs reported > >> on the particular system and workload it should be safe to disable it to > >> speed up the boot time. > > That works for me. I will update it for the next release. > > FWIW, I rather liked Dan's idea of wrapping this under > vm_debug=<something>. We've got a zoo of boot options and it's really > hard to _remember_ what does what. For this case, we're creating one > that's only available under a specific debug option and I think it makes > total sense to name the boot option accordingly. > > For now, I think it makes total sense to do vm_debug=all/off. If, in > the future, we get more options, we can do things like slab does and do > vm_debug=P (for Page poison) for this feature specifically. > > vm_debug = [KNL] Available with CONFIG_DEBUG_VM=y. > May slow down boot speed, especially on larger- > memory systems when enabled. > off: turn off all runtime VM debug features > all: turn on all debug features (default) This would introduce a significant amount of code change if we do it as a parameter that has control over everything. I would be open to something like "vm_debug_disables=" where we could then pass individual values like 'P' for disabling page poisoning. However doing this as a generic interface that could disable everything now would be messy. I could then also update the print message so that it lists what is disabled, and what was left enabled. Then as we need to disable things in the future we could add additional letters for individual features. I just don't want us preemptively adding control flags for features that may never need to be toggled. I would want to hear from Michal on this before I get too deep into it as he seemed to be of the opinion that we were already doing too much code for this and it seems like this is starting to veer off in that direction. - Alex
On 09/12/2018 09:36 AM, Alexander Duyck wrote: >> vm_debug = [KNL] Available with CONFIG_DEBUG_VM=y. >> May slow down boot speed, especially on larger- >> memory systems when enabled. >> off: turn off all runtime VM debug features >> all: turn on all debug features (default) > This would introduce a significant amount of code change if we do it > as a parameter that has control over everything. Sure, but don't do that now. Just put page poisoning under it now.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 64a3bf54b974..7b21e0b9c394 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3047,6 +3047,14 @@ off: turn off poisoning (default) on: turn on poisoning + page_init_poison= [KNL] Boot-time parameter changing the + state of poisoning of page structures during early + boot. Used to verify page metadata is not accessed + prior to initialization. Available with + CONFIG_DEBUG_VM=y. + off: turn off poisoning + on: turn on poisoning (default) + panic= [KNL] Kernel behaviour on panic: delay <timeout> timeout > 0: seconds before rebooting timeout = 0: wait forever diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 74bee8cecf4c..d00216cf00f8 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -162,6 +162,14 @@ static inline int PagePoisoned(const struct page *page) return page->flags == PAGE_POISON_PATTERN; } +#ifdef CONFIG_DEBUG_VM +void page_init_poison(struct page *page, size_t size); +#else +static inline void page_init_poison(struct page *page, size_t size) +{ +} +#endif + /* * Page flags policies wrt compound pages * diff --git a/mm/debug.c b/mm/debug.c index 38c926520c97..c5420422c0b5 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -175,4 +175,20 @@ void dump_mm(const struct mm_struct *mm) ); } +static bool page_init_poisoning __read_mostly = true; + +static int __init page_init_poison_param(char *buf) +{ + if (!buf) + return -EINVAL; + return strtobool(buf, &page_init_poisoning); +} +early_param("page_init_poison", page_init_poison_param); + +void page_init_poison(struct page *page, size_t size) +{ + if (page_init_poisoning) + memset(page, PAGE_POISON_PATTERN, size); +} +EXPORT_SYMBOL_GPL(page_init_poison); #endif /* CONFIG_DEBUG_VM */ diff --git a/mm/memblock.c b/mm/memblock.c index 237944479d25..a85315083b5a 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1444,10 +1444,9 @@ void * __init memblock_virt_alloc_try_nid_raw( ptr = memblock_virt_alloc_internal(size, align, min_addr, max_addr, nid); -#ifdef CONFIG_DEBUG_VM if (ptr && size > 0) - memset(ptr, PAGE_POISON_PATTERN, size); -#endif + page_init_poison(ptr, size); + return ptr; } diff --git a/mm/sparse.c b/mm/sparse.c index 10b07eea9a6e..67ad061f7fb8 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -696,13 +696,11 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat, goto out; } -#ifdef CONFIG_DEBUG_VM /* * Poison uninitialized struct pages in order to catch invalid flags * combinations. */ - memset(memmap, PAGE_POISON_PATTERN, sizeof(struct page) * PAGES_PER_SECTION); -#endif + page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION); section_mark_present(ms); sparse_init_one_section(ms, section_nr, memmap, usemap);