Message ID | 20170920201714.19817-2-pasha.tatashin@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Wed 20-09-17 16:17:03, Pavel Tatashin wrote:
> Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
> flags and other fields in "struct page"es are never changed prior to first
> initializing struct pages by going through __init_single_page().
>
> With deferred struct page feature enabled, however, we set fields in
> register_page_bootmem_info that are subsequently clobbered right after in
> free_all_bootmem:
>
>         mem_init() {
>                 register_page_bootmem_info();
>                 free_all_bootmem();
>                 ...
>         }
>
> When register_page_bootmem_info() is called only non-deferred struct pages
> are initialized. But, this function goes through some reserved pages which
> might be part of the deferred, and thus are not yet initialized.
>
>   mem_init
>    register_page_bootmem_info
>     register_page_bootmem_info_node
>      get_page_bootmem
>       .. setting fields here ..
>       such as: page->freelist = (void *)type;
>
>   free_all_bootmem()
>    free_low_memory_core_early()
>     for_each_reserved_mem_region()
>      reserve_bootmem_region()
>       init_reserved_page() <- Only if this is deferred reserved page
>        __init_single_pfn()
>         __init_single_page()
>          memset(0) <-- Loose the set fields here
>
> We end-up with issue where, currently we do not observe problem as memory
> is explicitly zeroed. But, if flag asserts are changed we can start hitting
> issues.
>
> Also, because in this patch series we will stop zeroing struct page memory
> during allocation, we must make sure that struct pages are properly
> initialized prior to using them.
>
> The deferred-reserved pages are initialized in free_all_bootmem().
> Therefore, the fix is to switch the above calls.

Thanks for extending the changelog. This is more informative now.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
> Reviewed-by: Bob Picco <bob.picco@oracle.com>

I hope I haven't missed anything but it looks good to me.

Acked-by: Michal Hocko <mhocko@suse.com>

one nit below
> ---
>  arch/x86/mm/init_64.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 5ea1c3c2636e..30fe22558720 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1182,12 +1182,17 @@ void __init mem_init(void)
>
>  	/* clear_bss() already clear the empty_zero_page */
>
> -	register_page_bootmem_info();
> -
>  	/* this will put all memory onto the freelists */
>  	free_all_bootmem();
>  	after_bootmem = 1;
>
> +	/* Must be done after boot memory is put on freelist, because here we

standard code style is to do
	/*
	 * text starts here

> +	 * might set fields in deferred struct pages that have not yet been
> +	 * initialized, and free_all_bootmem() initializes all the reserved
> +	 * deferred pages for us.
> +	 */
> +	register_page_bootmem_info();
> +
>  	/* Register memory areas for /proc/kcore */
>  	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
>  			 PAGE_SIZE, KCORE_OTHER);
> --
> 2.14.1
Hi Michal,

> I hope I haven't missed anything but it looks good to me.
>
> Acked-by: Michal Hocko <mhocko@suse.com>

Thank you for your review.

> one nit below
>> ---
>>  arch/x86/mm/init_64.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> index 5ea1c3c2636e..30fe22558720 100644
>> --- a/arch/x86/mm/init_64.c
>> +++ b/arch/x86/mm/init_64.c
>> @@ -1182,12 +1182,17 @@ void __init mem_init(void)
>>
>>  	/* clear_bss() already clear the empty_zero_page */
>>
>> -	register_page_bootmem_info();
>> -
>>  	/* this will put all memory onto the freelists */
>>  	free_all_bootmem();
>>  	after_bootmem = 1;
>>
>> +	/* Must be done after boot memory is put on freelist, because here we
>
> standard code style is to do
> 	/*
> 	 * text starts here

OK, will change for both patch 1 and 2.

Pasha
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ea1c3c2636e..30fe22558720 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,12 +1182,17 @@ void __init mem_init(void)
 
 	/* clear_bss() already clear the empty_zero_page */
 
-	register_page_bootmem_info();
-
 	/* this will put all memory onto the freelists */
 	free_all_bootmem();
 	after_bootmem = 1;
 
+	/* Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
 	/* Register memory areas for /proc/kcore */
 	kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR,
 			 PAGE_SIZE, KCORE_OTHER);
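The ordering problem described in the changelog can be modelled in user space. The following is only a rough sketch, not kernel code: the two-field `struct page`, the helpers `init_single_page()` and `get_page_bootmem()`, the function `bootmem_type_survives()` and the type value passed to it are all hypothetical stand-ins chosen for illustration. It assumes a single deferred reserved page and shows that a field written before the zeroing initializer runs is lost, while the same write performed afterwards is kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
	void *freelist;
};

/* Models __init_single_page(): zeroes the whole structure. */
static void init_single_page(struct page *page)
{
	memset(page, 0, sizeof(*page));
}

/* Models get_page_bootmem(): stores the bootmem type in the page. */
static void get_page_bootmem(struct page *page, unsigned long type)
{
	assert(page != NULL);
	page->freelist = (void *)type;
}

/*
 * Models mem_init() for one deferred reserved page.
 * register_first == true  : old order (register, then free_all_bootmem)
 * register_first == false : patched order (free_all_bootmem, then register)
 * Returns true if the stored type is still present at the end.
 */
bool bootmem_type_survives(bool register_first, unsigned long type)
{
	struct page page;

	/* The deferred page starts out uninitialized; fill it with a pattern. */
	memset(&page, 0xaa, sizeof(page));

	if (register_first) {
		get_page_bootmem(&page, type);	/* register_page_bootmem_info() */
		init_single_page(&page);	/* free_all_bootmem() clobbers it */
	} else {
		init_single_page(&page);	/* free_all_bootmem() */
		get_page_bootmem(&page, type);	/* register_page_bootmem_info() */
	}

	return page.freelist == (void *)type;
}
```

Calling `bootmem_type_survives(true, 5UL)` models the code before the patch and reports that the field was lost; `bootmem_type_survives(false, 5UL)` models the patched order and reports that it was kept. The value 5 is arbitrary; any non-zero type behaves the same way in this model.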